SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
Transformer architectures for audio representation learning have quadratic complexity in memory and inference time
SSAMBA
- Introduces Mamba, a state space model, to self-supervised audio representation learning
- Uses bidirectional Mamba to capture complex audio patterns and learn robust audio representations from unlabeled datasets
Paper (SLT 20..
SSAST: Self-Supervised Audio Spectrogram Transformer
Transformers can be applied to audio tasks
SSAST
- Improves the Audio Spectrogram Transformer through self-supervised learning
- Applies pre-training based on joint discriminative and generative masked spectrogram patch modeling
Paper (AAAI 2022) : Paper Link
1. Introduction
Pure self-attention-based models such as the Audio Spectrogram Transformer (AST) require, compared to conventional CNN-based models, large amounts of training data..
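The joint discriminative and generative objective named above can be illustrated on toy data. A minimal NumPy sketch, assuming a 16x16 patch size, a 75% mask ratio, and a random linear projection standing in for the Transformer encoder (all illustrative choices, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy log-mel spectrogram: 128 mel bins x 64 frames, cut into 16x16 patches.
spec = rng.standard_normal((128, 64))
P = 16
patches = (spec.reshape(128 // P, P, 64 // P, P)
               .transpose(0, 2, 1, 3)
               .reshape(-1, P * P))          # (32, 256) flattened patches

# Randomly mask a subset of patches (75% here is illustrative).
n = patches.shape[0]
mask = rng.permutation(n) < int(0.75 * n)    # boolean mask over patch indices

# Stand-in encoder outputs for masked positions (in SSAST these come from
# the Transformer; a random linear map is used here for illustration).
W = rng.standard_normal((P * P, P * P)) * 0.01
recon = patches[mask] @ W                    # generative head output
embed = patches[mask] @ W                    # discriminative head output

# Generative loss: MSE between reconstructed and true masked patches.
mse = np.mean((recon - patches[mask]) ** 2)

# Discriminative loss: InfoNCE over masked patches -- each prediction
# should match its own ground-truth patch against the other masked ones.
logits = embed @ patches[mask].T             # similarity matrix
logits -= logits.max(axis=1, keepdims=True)  # stabilize softmax
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
nce = -np.mean(np.diag(log_probs))

loss = mse + nce                             # joint objective (equal weights assumed)
```

Minimizing `loss` jointly asks the model to reconstruct masked patches (generative) and to pick out the right patch among the masked set (discriminative).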
AxLSTMs: Learning Self-Supervised Audio Representations with xLSTMs
xLSTM achieves performance comparable to the Transformer
AxLSTM
- Learns general-purpose audio representations from masked spectrogram patches using xLSTM in a self-supervised setting
- Pre-trains on the AudioSet dataset to support a variety of downstream tasks
Paper (INTERSPEECH 2025) : Paper Link
1. Introduction
Transformers have excellent generalization ability and a data-agnostic nature, but scaled dot-pr..
EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast
Contrastive Language-Audio Pre-training fails to capture the ordinal nature of emotion and shows insufficient alignment between audio and text embeddings
EmotionRankCLAP
- Jointly captures fine-grained emotion variations by leveraging dimensional attributes of emotional speech and natural language prompts
- Uses a Rank-N-Contrast objective..
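The Rank-N-Contrast idea behind this preview can be sketched directly: for each anchor-positive pair, the contrastive denominator keeps only samples that are at least as far from the anchor in label space, so embedding similarity ends up ordered by the ordinal attribute. A minimal NumPy sketch on random data (the batch size, temperature, and scalar "valence" labels are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy batch: 8 L2-normalized embeddings with scalar ordinal labels.
z = rng.standard_normal((8, 4))
z /= np.linalg.norm(z, axis=1, keepdims=True)
y = rng.uniform(1, 7, size=8)                 # e.g. valence scores

t = 0.1                                       # temperature (assumed)
sim = z @ z.T / t
label_dist = np.abs(y[:, None] - y[None, :])  # distances in label space

loss, count = 0.0, 0
n = len(y)
for i in range(n):
    for j in range(n):
        if i == j:
            continue
        # Denominator set for pair (i, j): samples at least as far from
        # anchor i in label space as j is (j itself included).
        keep = (label_dist[i] >= label_dist[i, j]) & (np.arange(n) != i)
        denom = np.exp(sim[i][keep]).sum()
        loss += -np.log(np.exp(sim[i, j]) / denom)
        count += 1
loss /= count
```

Because closer-in-label samples are excluded from the denominator, a pair is never penalized for being less similar than a pair with a smaller label gap, which is what enforces the ordinal structure.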
Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations
Selective state space models have recently been gaining attention
Audio Mamba
- Applies self-supervised learning to a selective state space model for audio representation learning
- Learns general-purpose audio representations from randomly masked spectrogram patches
Paper (INTERSPEECH 2024) : Paper Link
1. Introduction
Across multiple domains and data modalities, Transformers provide repr..
HuBERT-VIC: Improving Noise-Robust Automatic Speech Recognition of Speech Foundation Model via Variance-Invariance-Covariance Regularization
Speech foundation models are limited in terms of noise-robustness
HuBERT-VIC
- Trains the model with variance, invariance, and covariance regularization objectives
- Adjusts the statistics of noisy speech representations to improve generalization across diverse noise types
Paper (INTERSPEECH 2025..
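The variance, invariance, and covariance objectives named here follow the VICReg-style recipe: an invariance (MSE) term between two views, a hinge that keeps each feature dimension's standard deviation above a threshold, and a penalty on off-diagonal covariance that decorrelates dimensions. A minimal NumPy sketch, with random vectors standing in for clean/noisy speech representations and equal term weights assumed:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-ins for encoder outputs of clean speech and its noisy counterpart
# (in HuBERT-VIC these come from the speech foundation model).
clean = rng.standard_normal((16, 8))
noisy = clean + 0.1 * rng.standard_normal((16, 8))

def variance_term(x, gamma=1.0, eps=1e-4):
    # Hinge on per-dimension std: push each dimension's std above gamma.
    std = np.sqrt(x.var(axis=0) + eps)
    return np.mean(np.maximum(0.0, gamma - std))

def covariance_term(x):
    # Penalize off-diagonal covariance to decorrelate feature dimensions.
    xc = x - x.mean(axis=0)
    cov = (xc.T @ xc) / (len(x) - 1)
    off = cov - np.diag(np.diag(cov))
    return np.sum(off ** 2) / x.shape[1]

invariance = np.mean((clean - noisy) ** 2)   # MSE between the two views
reg = variance_term(noisy) + covariance_term(noisy) + invariance
```

The variance and covariance terms act on the statistics of the noisy representations, which is the "adjusting statistics" step the preview describes; the invariance term ties them back to the clean view.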
