[Paper Review] NaturalSpeech: End-to-End Text-to-Speech Synthesis with Human-Level Quality
Judging whether a Text-to-Speech system has reached human-level quality is difficult.
NaturalSpeech: an end-to-end text-to-speech model that uses a variational auto-encoder to achieve human-level quality.
It includes phoneme pre-training, differentiable duration modeling, bidirectional prior/posterior modeling, and a VAE memory mechanism.
Paper (PAMI 2024): Paper Link
1. Introduction
Text-to-Spee..
Paper/TTS
2024. 6. 29. 14:48