![](http://i1.daumcdn.net/thumb/C148x148/?fname=https://blog.kakaocdn.net/dn/oVkyA/btsIxKMOdkm/o49IADoptdKjJLYu5tm5U0/img.png)
SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models
Diffusion-based non-autoregressive text-to-speech models need to be highly efficient.
SimpleSpeech
- Uses SQ-Codec, a speech codec that performs scalar quantization
- SQ-Codec maps the complex speech signal into a finite, compact scalar latent space
- A transformer diffusion model is then applied to SQ-Codec's scalar latent space
Paper (INTERSPEECH 2024) : Pa..
Paper/TTS
2024. 7. 12. 09:35