[Paper Review] StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization
Lightweight neural vocoders still fall short in perceptual quality. StyleMelGAN is a lightweight neural vocoder that synthesizes high-fidelity speech at low complexity. It uses Temporal Adaptive Normalization to style a low-dimensional noise vector with the acoustic features of the target speech. Random Window Discriminator: multi-scale sp..
Paper/Vocoder
2024. 5. 1. 10:21