반응형
![](http://i1.daumcdn.net/thumb/C148x148/?fname=https://blog.kakaocdn.net/dn/caP09C/btsHo322PkO/n4YvUxJeCjhzYGnVbzLWYK/img.png)
StyleSpeech: Self-Supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech SynthesisAudiobook을 위한 음성 합성은 generalized architecture와 training data의 unbalanced style distribution으로 인해 한계가 있음StyleSpeechExpressive audiobook synthesis를 위해 VQ-VAE-based pre-training을 통한 self-supervised style enhancing method를 적용Text style encoder는 large-scale unlabeled text-only data로 p..
Paper/TTS
2024. 5. 15. 11:30
반응형