
Multi-Resolution HuBERT: Multi-Resolution Speech Self-Supervised Learning with Masked Unit Prediction
Existing self-supervised learning models process speech signals at a fixed 20 ms resolution, so they overlook the informational content available at other resolutions.
Multi-Resolution HuBERT
Incorporates multi-resolution information into speech self-supervised learning
Improving on the HuBERT-style masked prediction objective, a hierarchical Transfor..
Paper/Representation
2025. 5. 17. 08:20