Posts in the 'Paper/Vocoder' category (Page 7)

[Paper Review] BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network

Generative Adversarial Network (GAN)-based vocoders have the advantage of synthesizing high-quality waveforms quickly. However, most GANs have been shown to fail to obtain the optimal projection for discriminating real from fake data in the feature space. BigVSAN is a model that applies the Slicing Adversarial Network (SAN), which can obtain the optimal projection, to the vocoding task. Adopted in GAN-based vocoders, the least-squar..

Paper/Vocoder 2024. 4. 4. 11:24

[Paper Review] Parallel WaveGAN: A Fast Waveform Generation Model Based on Generative Adversarial Networks with Multi-Resolution Spectrogram

A Generative Adversarial Network can be used to build a vocoder that needs no distillation process. Parallel WaveGAN trains a non-autoregressive WaveNet by jointly optimizing an adversarial loss and a multi-resolution spectrogram loss that effectively captures the time-frequency distribution of the waveform. The existing teacher-student..

Paper/Vocoder 2024. 4. 1. 09:32
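
The multi-resolution spectrogram loss mentioned in the Parallel WaveGAN preview above can be sketched in a few lines. This is only a minimal illustration, not the post's or the paper's code: it assumes each resolution combines a spectral-convergence term with a log-magnitude L1 term, which is how this loss is commonly formulated, and the `(n_fft, hop)` pairs and toy signals are made-up values. A naive DFT from the standard library stands in for a real STFT routine.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, n_fft, hop):
    """Magnitude spectrogram via a naive DFT (no padding; illustrative only)."""
    window = hann(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(n_fft)]
        bins = []
        for k in range(n_fft // 2 + 1):
            acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n_fft)
                      for t in range(n_fft))
            bins.append(abs(acc))
        frames.append(bins)
    return frames


def single_resolution_loss(real, fake, n_fft, hop, eps=1e-7):
    """Spectral convergence plus log-magnitude L1 at one STFT resolution."""
    flat_r = [v for fr in stft_magnitude(real, n_fft, hop) for v in fr]
    flat_f = [v for fr in stft_magnitude(fake, n_fft, hop) for v in fr]
    diff_norm = math.sqrt(sum((r - f) ** 2 for r, f in zip(flat_r, flat_f)))
    real_norm = math.sqrt(sum(r ** 2 for r in flat_r))
    spectral_convergence = diff_norm / (real_norm + eps)
    log_magnitude = sum(abs(math.log(r + eps) - math.log(f + eps))
                        for r, f in zip(flat_r, flat_f)) / len(flat_r)
    return spectral_convergence + log_magnitude


def multi_resolution_stft_loss(real, fake, resolutions):
    """Average the single-resolution loss over several (n_fft, hop) pairs."""
    losses = [single_resolution_loss(real, fake, n, h) for n, h in resolutions]
    return sum(losses) / len(losses)


# Toy signals: a sine and a detuned copy (hypothetical test data).
real = [math.sin(2 * math.pi * 5 * t / 256) for t in range(256)]
fake = [math.sin(2 * math.pi * 6 * t / 256) for t in range(256)]
resolutions = [(32, 8), (64, 16), (128, 32)]  # assumed, not the paper's settings
loss_same = multi_resolution_stft_loss(real, real, resolutions)
loss_diff = multi_resolution_stft_loss(real, fake, resolutions)
```

Averaging over several window sizes is the point of the design: short windows resolve timing, long windows resolve frequency, so the generator cannot satisfy the loss by fitting only one trade-off.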

[Paper Review] BigVGAN: A Universal Neural Vocoder with Large-Scale Training

Generative Adversarial Network (GAN)-based vocoders deliver excellent quality but struggle to synthesize audio across diverse recording environments and speakers. BigVGAN is a universal vocoder that generalizes to a variety of out-of-distribution scenarios without fine-tuning. It introduces a periodic activation function and an anti-aliased representation into the GAN generator, which provides an inductive bias and improves synthesis performance. As a result, over-regula..

Paper/Vocoder 2024. 3. 30. 11:14
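
As a rough illustration of the periodic activation function mentioned in the BigVGAN preview above, the sketch below uses the Snake function, x + sin²(αx)/α. The preview itself does not name the function, so treat that choice as an assumption; the per-channel `alphas` and the sample `features` are hypothetical values, and in a trained model α would be a learnable parameter.

```python
import math


def snake(x, alpha=1.0):
    """Snake activation: x + sin^2(alpha * x) / alpha, for alpha > 0."""
    return x + (math.sin(alpha * x) ** 2) / alpha


def snake_layer(values, alphas):
    """Apply Snake element-wise with one alpha per channel.

    `values` is a list of channels, each a list of samples. Here the alphas
    are fixed floats; a real model would train them.
    """
    return [[snake(v, a) for v in channel]
            for channel, a in zip(values, alphas)]


features = [[-1.0, 0.0, 1.0], [0.5, 1.5, 2.5]]  # hypothetical activations
activated = snake_layer(features, alphas=[1.0, 2.0])
```

The identity term keeps the function monotonic-friendly for optimization, while the sin² term adds a periodic component whose frequency is set by α, which suits signals such as speech that are themselves periodic.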

[Paper Review] AutoVocoder: Fast Waveform Generation from a Learned Speech Representation Using Differentiable Digital Signal Processing

A mel-spectrogram is easy to extract from a waveform, but a vocoder that generates a waveform from a mel-spectrogram is computationally expensive. AutoVocoder moves away from the conventional mel-spectrogram approach and generates the waveform with a differentiable implementation of the inverse STFT. As a result, it achieves a speedup of more than 14x over existing neural vocoders. Paper (ICASSP 2023): Paper..

Paper/Vocoder 2024. 3. 27. 09:51
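
To make the inverse STFT in the AutoVocoder preview above concrete, here is a minimal overlap-add sketch using only the standard library. It is not AutoVocoder's implementation: the window, `n_fft`, `hop`, and test signal are assumptions, and this plain-Python version carries no gradients. In a vocoder the same arithmetic would be written in an autodiff framework so that the operation is differentiable.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def stft(signal, n_fft, hop):
    """Complex STFT via a naive DFT, one list of n_fft bins per frame."""
    window = hann(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(n_fft)]
        frames.append([
            sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n_fft)
                for t in range(n_fft))
            for k in range(n_fft)
        ])
    return frames


def istft(frames, n_fft, hop):
    """Inverse STFT: inverse DFT per frame, then windowed overlap-add."""
    window = hann(n_fft)
    length = n_fft + hop * (len(frames) - 1)
    out = [0.0] * length
    norm = [0.0] * length  # running sum of squared window values
    for idx, spec in enumerate(frames):
        start = idx * hop
        for t in range(n_fft):
            sample = sum(spec[k] * cmath.exp(2j * math.pi * k * t / n_fft)
                         for k in range(n_fft)).real / n_fft
            out[start + t] += sample * window[t]
            norm[start + t] += window[t] ** 2
    # Samples where the window sum is (near) zero cannot be recovered.
    return [o / n if n > 1e-8 else 0.0 for o, n in zip(out, norm)]


n_fft, hop = 16, 4  # assumed toy settings
signal = [math.sin(2 * math.pi * 3 * t / 64) for t in range(64)]
reconstructed = istft(stft(signal, n_fft, hop), n_fft, hop)
```

Because every step is a fixed linear operation, the inverse STFT adds no learned parameters and very little compute, which is consistent with the speedup the preview attributes to this approach.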

[Paper Review] UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation

Full-band spectral features give a vocoder more acoustic information. However, using a full-band mel-spectrogram can cause over-smoothing. UnivNet is a high-quality neural vocoder that addresses this full-band over-smoothing problem. Using multiple linear spectrogram magnitudes, a multi-resolution spectrogram discrimin..

Paper/Vocoder 2024. 3. 22. 10:12

[Paper Review] FastFit: Towards Real-Time Iterative Neural Vocoder by Replacing U-Net Encoder with Multiple STFTs

Replacing the U-Net encoder with multiple Short-Time Fourier Transforms (STFTs) yields faster synthesis while preserving sample quality. FastFit replaces each encoder block with an STFT whose parameters match the temporal resolution of the corresponding decoder block, and connects the two through a skip connection. This roughly halves the parameter count and the generation time while maintaining high-fidelity samples. Paper (INTERSP..

Paper/Vocoder 2024. 3. 21. 11:17


Let IT Begin
