SimpleSpeech2: Towards Simple and Efficient Text-to-Speech with Flow-based Scalar Latent Transformer Diffusion Models
Non-autoregressive text-to-speech models carry added complexity from duration alignment.
SimpleSpeech2
Combines autoregressive and non-autoregressive approaches into a straightforward model.
Supports simplified data preparation, fast inference, and stable generation.
Paper (TASLP 2025): Paper Link
1. Introduction..
Paper/TTS
2025. 11. 25. 14:49
