Posts in the 'Paper/Vocoder' category (Page 2)

[Paper Review] QHM-GAN: Neural Vocoder based on Quasi-Harmonic Modeling

Existing end-to-end neural vocoders are black boxes that do not reveal the intrinsic structure of speech, which limits their synthesis quality.
QHM-GAN: improves the network architecture on the basis of the Quasi-Harmonic Model. It parameterizes the speech signal into quasi-harmonic components to support high-quality synthesis while reducing time consumption and network size.
Paper (INTERSPEECH 2024): Paper Link
1. Introduction: A vocoder acoustic ..

Paper/Vocoder 2024. 10. 27. 12:19

[Paper Review] RefineGAN: Universally Generating Waveform Better than Ground Truth with Highly Accurate Pitch and Intensity Responses

Generative Adversarial Network-based waveform generation relies heavily on the discriminator, so the generation process is uncertain and pitch/intensity mismatches occur.
RefineGAN: builds a pitch-guided refine architecture to maintain robustness and pitch/intensity accuracy. Additionally, to stabilize training, multi..

Paper/Vocoder 2024. 7. 23. 09:34

[Paper Review] Bunched LPCNet: Vocoder for Low-cost Neural Text-to-Speech Systems

LPCNet combines linear prediction with a neural network, which greatly reduces computational complexity.
Bunched LPCNet: introduces sample-bunching, which lets LPCNet generate two or more audio samples per inference step, and bit-bunching, which reduces computation in the final layer of LPCNet.
Paper (INTERSPEECH 2020): Paper Link
1. Introduction: LPCNet achieves excellent inference speed and synthesis quality. In particular, based on the source-filter model, low-cost..

Paper/Vocoder 2024. 7. 14. 10:29
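
The linear prediction that the LPCNet-based summaries refer to can be illustrated with a minimal sketch. It assumes the textbook formulation, in which the next sample is predicted as a weighted sum of the previous samples and only the residual (excitation) is left to be modeled. The function names `linear_predict` and `residual` and the coefficient values are hypothetical, chosen for illustration, and are not taken from the LPCNet papers or code.

```python
def linear_predict(history, coeffs):
    """Predict the next sample as a weighted sum of the most recent samples.

    history: past samples, oldest first.
    coeffs:  a_1..a_M, where a_1 weights the most recent sample.
    """
    order = len(coeffs)
    recent = history[-order:][::-1]  # most recent sample first
    return sum(a * s for a, s in zip(coeffs, recent))


def residual(signal, coeffs):
    """Return e_t = s_t - p_t for every sample that has a full history."""
    order = len(coeffs)
    return [
        signal[t] - linear_predict(signal[:t], coeffs)
        for t in range(order, len(signal))
    ]


# Illustrative second-order coefficients (an assumption, not from the papers).
coeffs = [1.5, -0.7]

# Build a signal that follows the predictor exactly, so its residual vanishes.
signal = [1.0, 0.5]
for _ in range(20):
    signal.append(linear_predict(signal, coeffs))

print(max(abs(e) for e in residual(signal, coeffs)))
```

When the predictor matches the signal, the residual is close to zero, which is the intuition behind the reduced complexity: the neural network only has to model what linear prediction cannot explain.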

[Paper Review] End-to-End LPCNet: A Neural Vocoder with Fully-Differentiable LPC Estimation

Neural vocoders deliver excellent synthesis quality but still demand high computational complexity.
End-to-End LPCNet: uses an autoregressive model based on linear prediction to ease the complexity of neural vocoding. Additionally, it learns to predict the linear prediction coefficients from the input features of the frame rate network, forming an end-to-end version of the existing LPCNet.
Paper (INTERSPEECH 2022): Paper Link
1. Introducti..

Paper/Vocoder 2024. 7. 13. 11:00

[Paper Review] DFlow: A Generative Model Combining Denoising AutoEncoder and Normalizing Flow for High Fidelity Waveform Generation

A vocoder capable of high-fidelity waveform generation is needed.
DFlow: combines a Normalizing Flow with a Denoising AutoEncoder for high-quality generation. Additionally, the model size and training set are expanded to scale DFlow up into a large-scale universal vocoder.
Paper (ICML 2024): Paper Link
1. Introduction: Deep Generative Models (DGMs) waveform generat..

Paper/Vocoder 2024. 7. 7. 13:27

[Paper Review] JenGAN: Stacked Shifted Filters in GAN-based Speech Synthesis

Non-autoregressive GAN-based vocoders offer fast inference and good quality, but tend to produce audible artifacts.
JenGAN: a training strategy that stacks shifted low-pass filters to guarantee the shift-equivariant property. It prevents aliasing and reduces artifacts while keeping the model structure used at inference unchanged.
Paper (INTERSPEECH 2024): Paper Link
1. Introduction: A neural vocoder, mel-spectrogram-like audio ..

Paper/Vocoder 2024. 7. 3. 09:48

Let IT Begin