Posts from 2026/04/14

[Paper Review] VibeVoice: Expressive Podcast Generation with Next-Token Diffusion

Generating long-form, multi-speaker conversational audio such as podcasts requires a Text-to-Speech system that can guarantee scalability, speaker consistency, and natural turn-taking. VibeVoice improves long-sequence efficiency by using a continuous speech tokenizer with an ultra-low frame rate of 7.5 Hz, and additionally supports expressive podcast generation through a next-token diffusion framework. Paper ..

Paper/Language Model 2026. 4. 14. 12:59


Let IT Begin
