Posts from 2025/05/04

[Paper Review] NaturalSpeech3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Large-scale text-to-speech systems still have limitations in prosody and similarity. NaturalSpeech3 uses a neural codec based on Factorized Vector Quantization, which disentangles the speech waveform into content, prosody, timbre, and acoustic-detail subspaces. It also introduces a factorized diffusion model that generates the attributes in each subspace according to its prompt.
Paper (ICML 2024): Paper..

Paper/TTS 2025. 5. 4. 09:33

Let IT Begin
