EmoVoice: LLM-based Emotional Text-to-Speech Model with Freestyle Text Prompting
Text-to-Speech models still have limitations in emotional expression.
EmoVoice
Uses a Large Language Model to support fine-grained, freestyle natural-language emotion control.
Outputs phoneme tokens and audio tokens in parallel to improve content consistency.
Paper (MM 2025): Paper Link
1. Introduction
For emotion-controllable Text-to-Speech (TTS) models, emotion..
Paper/Language Model
2025. 10. 29. 12:45
