반응형
[Paper 리뷰] MQTTS: A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech
MQTTS: A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous SpeechText-to-Speech에서 human-level diversity를 반영할 필요가 있음MQTTSMel-spectrogram based autoregressive model의 alignment mismatch 문제를 해결하기 위해 multiple code group으로 학습된 discrete code를 활용합성 품질 향상을 위해 clean silence prompt를 활용하고 multiple code generation과 monotonic alignment architecture를 도입논문 (AAAI 2023) : Paper Link..
Paper/TTS
2024. 5. 7. 09:51
반응형