[Paper Review] VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild
A token-infilling neural codec language model can be built for speech editing and zero-shot text-to-speech.
VoiceCraft: combines a Transformer decoder architecture with causal masking and delayed stacking, introducing a token rearrangement procedure that enables generation within an existing sequence.
Additionally, the RealEdit dataset is provided for speech editing evaluation.
Paper (ACL 2024): Paper Link
1. Int..
Paper/Language Model
2024. 7. 20. 11:12