WaveFM: A High-Fidelity and Efficient Vocoder based on Flow Matching
Flow matching provides robust training for diffusion models, but applying it directly to a neural vocoder degrades audio quality.
WaveFM:
Adopts a mel-conditioned prior distribution instead of a standard Gaussian prior to minimize the transportation cost.
Incorporates a refined multi-resolution STFT loss to improve audio quality.
Additionally, to improve inference speed, consistency distillation me..
Paper/Vocoder
2025. 3. 30. 12:44