Posts from 2024/08

[Paper Review] DreamVoice: Text-Guided Voice Conversion

Text-guided generation makes it possible to synthesize speech that matches a user's needs. DreamVoice provides DreamVC, an end-to-end diffusion-based model for text-guided voice conversion, and DreamVG, a model for text-to-voice generation. In addition, the authors build the DreamVoiceDB dataset, which contains voice timbre annotations for VCTK and LibriTTS. Paper (INTERSPEECH 2024): Paper Link. 1. Introduction: In Voice Conversion (VC), during training/inference the target voice's..

Paper/Conversion 2024. 8. 31. 08:48

[Paper Review] FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

Existing voice conversion methods either leak speaker information or require a large amount of annotated data. FreeVC adopts the end-to-end framework of VITS and extracts clean content information without text annotation. In particular, it imposes an information bottleneck on WavLM features to disentangle the content information. To improve the purity of the extracted content information, spectrogram-resize based data augmentatio..

Paper/Conversion 2024. 8. 28. 09:18

[Paper Review] StreamVC: Real-Time Low-Latency Voice Conversion

A streaming voice conversion model that supports lightweight, high-quality conversion is needed. StreamVC builds on the neural audio codec architecture of SoundStream. It learns soft speech units causally and supplies whitened fundamental frequency information to improve pitch stability. Paper (ICASSP 2024): Paper Link. 1. Introduction: Voice Conversion (VC) preserves the linguistic content while the speech signal..

Paper/Conversion 2024. 8. 27. 09:11
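
The "whitened fundamental frequency" mentioned in the StreamVC entry above can be illustrated with a small sketch. It assumes that whitening means zero-mean, unit-variance normalization of F0 over the voiced frames of one utterance, and that unvoiced frames are marked with 0; both conventions, as well as the name `whiten_f0`, are assumptions made for illustration and are not taken from the post.

```python
import math


def whiten_f0(f0, eps=1e-8):
    """Normalize voiced F0 values to zero mean and unit variance.

    f0: list of per-frame F0 values in Hz, where 0 marks an unvoiced
        frame (this marker is an assumption of the sketch).
    Returns a list of the same length; unvoiced frames stay at 0.0.
    """
    voiced = [v for v in f0 if v > 0]
    if not voiced:
        # No voiced frames: nothing to normalize.
        return [0.0] * len(f0)
    mean = sum(voiced) / len(voiced)
    var = sum((v - mean) ** 2 for v in voiced) / len(voiced)
    std = math.sqrt(var)
    # eps guards against division by zero for a constant pitch track.
    return [(v - mean) / (std + eps) if v > 0 else 0.0 for v in f0]


whitened = whiten_f0([0.0, 100.0, 200.0, 0.0])
```

This version uses statistics of the whole utterance, so it is not causal. A streaming system would presumably need running estimates of the mean and variance instead.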

[Paper Review] S2VC: A Framework for Any-to-Any Voice Conversion with Self-Supervised Pretrained Representations

Any-to-any voice conversion must be able to convert to any utterance of seen or unseen speakers. S2VC uses self-supervised features as the source/target features. The supervised phoneme posteriorgram, which is speaker-independent and can extract content information, is chosen as the baseline feature. Paper (INTERSPEECH 2021): Paper Link. 1. Introduction: Self-Supervi..

Paper/Conversion 2024. 8. 25. 09:26

[Paper Review] MaskCycleGAN-VC: Learning Non-Parallel Voice Conversion with Filling in Frames

Cycle-consistent adversarial network-based methods for non-parallel voice conversion lack the ability to capture time-frequency structure. MaskCycleGAN-VC is an extension of CycleGAN-VC2 obtained by training with Filling in Frames. Filling in Frames applies a temporal mask to the input mel-spectrogram, and the converter uses the surrounding frames to fill in the missing frames..

Paper/Conversion 2024. 8. 22. 10:27
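
The temporal masking step described in the MaskCycleGAN-VC entry above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the time-major list-of-lists layout, zero as the fill value, and the helper name `apply_temporal_mask` are all assumptions introduced here.

```python
import random


def apply_temporal_mask(mel, mask_len, start=None, rng=None):
    """Zero out `mask_len` consecutive frames of a mel-spectrogram.

    mel: list of frames, each a list of mel-bin values (time-major
         layout is an assumption of this sketch).
    start: first masked frame; chosen at random when None.
    Returns (masked_mel, mask) where mask[t] is 1 for a kept frame
    and 0 for a masked frame.
    """
    n_frames = len(mel)
    mask_len = max(0, min(mask_len, n_frames))
    if start is None:
        rng = rng or random.Random()
        start = rng.randint(0, n_frames - mask_len)
    mask = [0 if start <= t < start + mask_len else 1 for t in range(n_frames)]
    # Element-wise product of each frame with its mask value.
    masked = [[v * m for v in frame] for frame, m in zip(mel, mask)]
    return masked, mask


mel = [[1.0, 2.0] for _ in range(6)]
masked, mask = apply_temporal_mask(mel, mask_len=2, start=1)
# mask is [1, 0, 0, 1, 1, 1]; frames 1 and 2 of `masked` are all zeros.
```

A converter trained on such input would have to reconstruct the zeroed frames from their neighbors, which is the intuition behind the "filling in" described above.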

[Paper Review] CycleGAN-VC3: Examining and Improving CycleGAN-VCs for Mel-Spectrogram Conversion

CycleGAN-VC performs well in non-parallel voice conversion. However, ambiguity in mel-spectrogram conversion damages the time-frequency structure. CycleGAN-VC3 introduces Time-Frequency Adaptive Normalization to reflect the time-frequency structure and improves the mel-spectrogram conversion performance of the existing CycleGAN. Paper (INTERSPEECH 2020): Paper Link. 1. Introduction: Vo..

Paper/Conversion 2024. 8. 21. 09:15


Let IT Begin
