DecoupledSynth: Enhancing Zero-Shot Text-to-Speech via Factors Decoupling
Existing zero-shot text-to-speech models have difficulty balancing the linguistic, para-linguistic, and non-linguistic information in their intermediate representations.
DecoupledSynth
Combines various self-supervised models to extract a comprehensive, decoupled representation.
Uses a decoupled processing stage to support nuanced synthesis.
Paper (ICASSP 2025): Paper Link
1. I..
Paper/TTS
2025. 6. 17. 17:20
