ComVo: Toward Complex-Valued Neural Networks for Waveform Generation
  iSTFT-based vocoders have difficulty capturing the inherent structure of the complex spectrogram.
  ComVo
  Uses native complex arithmetic in the generator and discriminator to provide structured feedback on complex-valued representations.
  Introduces phase quantization to discretize phase values and regularize the training process.
  Additionally, uses block-matrix computation for training efficienc..
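To make the idea of discretizing phase values concrete, here is a minimal sketch of uniform phase quantization. It is an assumption for illustration only: the excerpt does not say how ComVo quantizes phase, and the function name `quantize_phase` and the `levels` parameter are hypothetical.

```python
import cmath
import math


def quantize_phase(z: complex, levels: int = 8) -> complex:
    """Snap the phase of z to the nearest of `levels` uniformly spaced angles.

    The magnitude is left unchanged. Uniform spacing is an assumption made
    for this sketch, not a detail taken from the ComVo paper.
    """
    step = 2 * math.pi / levels
    magnitude, phase = cmath.polar(z)
    snapped = round(phase / step) * step
    return cmath.rect(magnitude, snapped)


# One complex spectrogram bin with magnitude 2.0 and phase 0.8 rad.
z = cmath.rect(2.0, 0.8)
q = quantize_phase(z, levels=8)
```

With 8 levels the grid spacing is pi/4, so the phase 0.8 rad moves to the nearest grid angle while the magnitude stays at 2.0.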
DegVoC: Revisiting Neural Vocoder from a Degradation Perspective
  Existing neural vocoders face a performance-cost trade-off.
  DegVoC
  Treats the mel-spectrogram as the result of a signal degradation process applied to the target spectrum.
  Uses the degradation prior to retrieve the initial spectral structure through a simple linear transformation, and introduces a deep prior solver that accounts for the heterogeneous distribution in the time-frequency domain.
  Paper (AAAI 2026): Paper Link
  1. Intro..
WaveNeXt2: ConvNeXt-based Fast Neural Vocoders with Residual Denoising and Sub-Modeling for GAN and Diffusion Models
  Most ConvNeXt-based vocoders use only the Generative Adversarial Network framework.
  WaveNeXt2
  Introduces residual denoising and sub-modeling to progressively refine the waveform.
  Builds a ConvNeXt-based architecture compatible with both Generative Adversarial Network and diffusion frameworks.
  Paper (ICASSP 2026): Paper Link
  1. Introdu..
Wave-Trainer-Fit: Neural Vocoder with Trainable Prior and Fixed-Point Iteration Towards High-Quality Speech Generation from SSL Features
  Enables high-quality waveform generation from data-driven features such as Self-Supervised Learning features.
  WaveTrainerFit
  Introduces a trainable prior so that the inference process starts from noise close to the target speech.
  Imposes a constraint on the trainable prior through reference-aware gain adjustment.
  Paper (ICAS..
BridgeVoC: Neural Vocoder with Schrodinger Bridge
  Diffusion-based neural vocoders neglect the linear degradation of the mel-spectrogram.
  BridgeVoC
  Connects time-frequency domain-based neural vocoders with the Schrodinger Bridge.
  Projects the mel-spectrogram into the target linear-scale domain and treats it as a degraded spectral representation.
  Paper (IJCAI 2025): Paper Link
  1. Introduction
  Neural vocoders generate a high-quality waveform from acoustic features..
RNDVoC: Learning Neural Vocoder from Range-Null Space Decomposition
  Neural vocoders face a parameter-performance trade-off.
  RNDVoC
  Bridges Range-Null Decomposition and the vocoder task, decomposing target spectrogram reconstruction into a superimposition of a range-space component and a null-space component.
  Additionally, builds a dual-path framework that uses cross-/narrow-band modules for sub-band and sequential modeling.
  Paper (IJCAI 2025): Paper Link
  1. Introduct..
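The range-null decomposition named in the RNDVoC summary can be illustrated with a toy example. This is a sketch under stated assumptions, not the method from the paper: the matrix `A` is a hypothetical 2-band compression with orthonormal rows (a real mel filterbank does not have this property), chosen so that the pseudo-inverse is simply the transpose. It shows the identity x = A⁺A x + (I − A⁺A) x, where the first term is recoverable from the observation y = A x and the second term is invisible to A.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of M as a list of rows."""
    return [list(col) for col in zip(*M)]


s = 1 / math.sqrt(2)
# Toy "degradation": 4 linear-frequency bins compressed into 2 bands.
# Rows are orthonormal (A A^T = I), an assumption made for this sketch.
A = [[s, s, 0.0, 0.0],
     [0.0, 0.0, s, s]]
A_pinv = transpose(A)  # equals the pseudo-inverse only because A A^T = I

x = [1.0, 3.0, 2.0, 6.0]   # one frame of the target spectrum
y = matvec(A, x)           # compressed (mel-like) observation

x_range = matvec(A_pinv, y)                        # A^+ A x, from y alone
x_null = [xi - ri for xi, ri in zip(x, x_range)]   # (I - A^+ A) x
```

Here `x_range` is what a fixed linear map recovers from the observation, and `x_null` is the detail that the compression discards. In this sketch `x_null` is computed from the known target; at inference time that part is not observable and would have to be predicted by a learned model.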
