Posts in the 'Paper/Vocoder' category (Page 6)

[Paper Review] Universal MelGAN: A Robust Neural Vocoder for High-Fidelity Waveform Generation in Multiple Domains

A vocoder that can synthesize high-fidelity speech across multiple domains is needed. Universal MelGAN adds a multi-resolution spectrogram discriminator to a MelGAN-based structure to improve the spectral resolution of the generated waveform. This prevents the over-smoothing problem in the high-frequency band of large-footprint models. Paper (ICASSP 2021): Paper Link. 1. Introduction ..

Paper/Vocoder 2024. 4. 17. 10:05

[Paper Review] BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

Diffusion models show excellent synthesis quality, but efficient sampling is difficult. The Bilateral Denoising Diffusion Model (BDDM) parameterizes the forward/reverse processes with a schedule network and a score network, both of which can be trained with a bilateral modeling objective. Compared with the existing surrogate, the proposed surrogate objective is a tighter log ma..

Paper/Vocoder 2024. 4. 14. 12:12

[Paper Review] FeatherWave: An Efficient High-Fidelity Neural Vocoder with Multi-Band Linear Prediction

Multi-band signal processing and linear predictive coding can be combined to build a neural vocoder. FeatherWave is a model that combines LPCNet with multi-band linear predictive coding. The multi-band method lets it synthesize multiple samples quickly in parallel. Paper (INTERSPEECH 2020): Paper Link. 1. Introduction: In Text-to-Speech (TTS), the vocoder synthesizing human-like speech..
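As a rough illustration of the linear prediction idea mentioned in the FeatherWave summary, the sketch below predicts each sample as a weighted sum of past samples and keeps only the residual (excitation), which is the part an LPCNet-style network would have to model. It covers a single band only; how FeatherWave splits and recombines sub-bands is not described in the excerpt, and the function names and coefficients here are hypothetical.

```python
def lpc_predict(history, coeffs):
    """Predict the next sample from past samples.

    history[-1] is the most recent sample and is weighted by coeffs[0].
    """
    return sum(a * s for a, s in zip(coeffs, reversed(history)))


def lpc_residual(signal, coeffs):
    """Return the excitation e_t = s_t - prediction_t for every sample."""
    order = len(coeffs)
    padded = [0.0] * order + list(signal)  # zero history before the start
    residual = []
    for t in range(len(signal)):
        history = padded[t:t + order]
        residual.append(signal[t] - lpc_predict(history, coeffs))
    return residual


def lpc_reconstruct(residual, coeffs):
    """Rebuild the signal sample by sample from the excitation."""
    order = len(coeffs)
    out = [0.0] * order
    for e in residual:
        out.append(lpc_predict(out[-order:], coeffs) + e)
    return out[order:]
```

Running `lpc_reconstruct(lpc_residual(signal, coeffs), coeffs)` should return the original signal up to floating-point error, since the two functions are exact inverses for the same coefficients.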

Paper/Vocoder 2024. 4. 10. 10:31

[Paper Review] AdaVocoder: Adaptive Vocoder for Custom Voice

Custom voice aims to build personalized speech synthesis using only a few target recordings. A multi-speaker dataset for vocoder training is hard to obtain, and the target speaker's distribution always mismatches that of the training dataset. AdaVocoder introduces a cross-domain consistency loss for an adaptive vocoder. It resolves the overfitting problem of GAN-based vocoders in few-shot transfer learning and obtains a high-quality custom voice ..

Paper/Vocoder 2024. 4. 5. 09:45

[Paper Review] BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network

Generative Adversarial Network (GAN)-based vocoders have the advantage of synthesizing high-quality waveforms quickly. However, most GANs have been shown to fail to obtain the optimal projection for discriminating real/fake data in the feature space. BigVSAN is a model that applies the Slicing Adversarial Network (SAN), which can obtain the optimal projection, to the vocoding task. For GAN-based vocoders, the adopted least-squar..

Paper/Vocoder 2024. 4. 4. 11:24

[Paper Review] Parallel WaveGAN: A Fast Waveform Generation Model Based on Generative Adversarial Networks with Multi-Resolution Spectrogram

A Generative Adversarial Network can be used to build a vocoder that needs no distillation process. Parallel WaveGAN trains a non-autoregressive WaveNet by jointly optimizing an adversarial loss and a multi-resolution spectrogram loss that effectively captures the time-frequency distribution of the waveform. The existing teacher-student..
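To make the multi-resolution spectrogram loss in the Parallel WaveGAN summary more concrete, here is a minimal pure-Python sketch. It assumes the usual formulation of a spectral convergence term plus a log-magnitude L1 term, averaged over several STFT settings. The FFT sizes and hop lengths are illustrative assumptions for short test signals, not the paper's settings, and the naive DFT is for readability only; a real training setup would use a framework STFT.

```python
import cmath
import math


def stft_magnitude(signal, fft_size, hop):
    """Magnitude spectrogram using a Hann window and a naive DFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / fft_size)
              for n in range(fft_size)]
    frames = []
    for start in range(0, len(signal) - fft_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(fft_size)]
        mags = []
        for k in range(fft_size // 2 + 1):  # keep non-negative frequencies
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / fft_size)
                      for n in range(fft_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def stft_loss(pred, target, fft_size, hop, eps=1e-7):
    """Spectral convergence + mean log-magnitude L1 at one resolution."""
    p = stft_magnitude(pred, fft_size, hop)
    t = stft_magnitude(target, fft_size, hop)
    pairs = [(pv, tv) for pf, tf in zip(p, t) for pv, tv in zip(pf, tf)]
    diff_sq = sum((tv - pv) ** 2 for pv, tv in pairs)
    target_sq = sum(tv ** 2 for _, tv in pairs)
    # eps guards against division by zero and log(0); it is an assumption here
    spectral_convergence = math.sqrt(diff_sq) / (math.sqrt(target_sq) + eps)
    log_magnitude = sum(abs(math.log(tv + eps) - math.log(pv + eps))
                        for pv, tv in pairs) / len(pairs)
    return spectral_convergence + log_magnitude


def multi_resolution_stft_loss(pred, target,
                               resolutions=((32, 8), (64, 16), (128, 32))):
    """Average the single-resolution loss over (fft_size, hop) pairs."""
    total = sum(stft_loss(pred, target, f, h) for f, h in resolutions)
    return total / len(resolutions)
```

Using several resolutions is meant to keep the generator from fitting one particular time-frequency trade-off: short windows emphasize timing, long windows emphasize frequency detail. In training, this auxiliary term would be added to the adversarial loss with some weighting.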

Paper/Vocoder 2024. 4. 1. 09:32
