[Paper Review] StyleSpeech: Self-Supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech Synthesis
Speech synthesis for audiobooks is limited by generalized architectures and by the unbalanced style distribution of the training data. StyleSpeech applies a self-supervised style enhancing method with VQ-VAE-based pre-training for expressive audiobook synthesis. Text style encoder: with large-scale unlabeled text-only data, p..
Paper/TTS
2024. 5. 15. 11:30