VoxCPM: Hierarchical Semantic-Acoustic Modeling via Semi-Discrete Residual Representations for Expressive End-to-End Speech Synthesis
Multi-stage speech synthesis built on speech tokenizers suffers from a trade-off caused by the semantic-acoustic divide.
VoxCPM applies hierarchical semantic-acoustic modeling based on semi-discrete residual representations.
It additionally introduces a differentiable quantization bottleneck to encourage natural specialization.
Paper (I..
Paper/Language Model
2026. 4. 6. 13:00
