EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast
Contrastive Language Audio Pre-training (CLAP) fails to capture the ordinal nature of emotion, and its audio and text embeddings are insufficiently aligned.
EmotionRankCLAP
Uses the dimensional attributes of emotional speech and natural language prompts to jointly capture fine-grained emotion variations.
Rank-N-Contrast objective ..
Paper/Representation
2025. 9. 15. 17:02
