PAST: Phonetic-Acoustic Speech Tokenizer
Signal reconstruction and phonetic information can be modeled jointly.
PAST integrates domain knowledge into the tokenization process through auxiliary tasks that use supervised phonetic data, without relying on a pre-trained self-supervised model.
It additionally provides a streamable architecture for real-time applications.
Paper (INTERSPEECH 2025): Paper Link
1. Introduction
Speech language models generally use acoustic toke..
Paper/Neural Codec
2025. 9. 24. 17:02
