반응형
[Paper 리뷰] SALTTS: Leveraging Self-Supervised Speech Representations for Improved Text-to-Speech Synthesis
SALTTS: Leveraging Self-Supervised Speech Representations for Improved Text-to-Speech SynthesisText-to-Speech에서 richer representation을 반영하기 위해 Self-Supervised Learning model을 활용할 수 있음SALTTSSelf-Supervised Learning representation을 reconstruct 하기 위해 encoder layer를 통해 FastSpeech2 encoder의 length-regulated output을 전달함SALTTS-parallel에서 해당 encoder representation은 auxiliary reconstruction loss로 사용되고, S..
Paper/TTS
2024. 6. 25. 09:49
반응형