[Paper Review] ClariTTS: Feature-ratio Normalization and Duration Stabilization for Code-Mixed Multi-Speaker Speech Synthesis
In text-to-speech models, code-mixed text can produce unnatural accents, because speaker-related features may also carry linguistic features of the source language. ClariTTS applies a Feature-ratio Normalized Affine Coupling Layer to a flow-based text-to-speech model. - By disentangling speaker and linguistic features, target sp..
Paper/TTS
2024. 10. 9. 10:30