
BridgeVoC: Neural Vocoder with Schrodinger Bridge
Diffusion-based neural vocoders neglect the linear degradation of the mel-spectrogram.
BridgeVoC
Connects a time-frequency domain-based neural vocoder with the Schrodinger Bridge.
Projects the mel-spectrogram onto the target linear-scale domain and treats it as a degraded spectral representation.
Paper (IJCAI 2025): Paper Link
1. Introduction
From acoustic features, a neural vocoder generates high-quality waveforms..

PALLE: Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis
In zero-shot text-to-speech, autoregressive models are limited in generation speed, and non-autoregressive models are limited in temporal modeling.
PALLE
Introduces a pseudo-autoregressive approach that combines the explicit temporal modeling of autoregressive models with the parallel generation of non-autoregressive models.
Based on a two-stage framework, in the first stage ..

RNDVoC: Learning Neural Vocoder from Range-Null Space Decomposition
Neural vocoders face a parameter-performance trade-off.
RNDVoC
Bridges Range-Null Decomposition and the vocoder task, decomposing target spectrogram reconstruction into a superimposition of range-space and null-space components.
Additionally builds a dual-path framework that uses cross-/narrow-band modules for sub-band and sequential modeling.
Paper (IJCAI 2025): Paper Link
1. Introduct..