
Robust Data2Vec: Noise-Robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning
Automatic Speech Recognition performance can be improved through a self-supervised pre-training method based on contrastive learning and a regression task
Robust Data2Vec
- Jointly optimizes contrastive learning and the regression task in the pre-training stage
- Additionally, patch-based non-semantic negative samples and positiv..

Data2Vec-AQC: Search for the Right Teaching Assistant in the Teacher-Student Training Setup
Self-Supervised Learning can be leveraged to obtain speech representations from unlabeled speech data
Data2Vec-AQC
- Introduces data augmentation, quantized representations, and clustering on top of Data2Vec
- Through the interaction of these modules, solves an additional self-supervised objective, the cross-contrastive loss
Paper (ICASSP 2023): Paper Link
1. Introduct..

Data2Vec 2.0: Efficient Self-Supervised Learning with Contextualized Target Representations for Vision, Speech and Language
Self-supervised learning requires substantial computational resources
Data2Vec 2.0
- Builds on Data2Vec to obtain rich contextualized target representations
- Amortizes the effort required to build teacher representations through a fast convolutional decoder
Paper (ICML 2023): Paper Link
1. Introduction
Self-superv..

Data2Vec: A General Framework for Self-Supervised Learning in Speech, Vision and Language
Self-supervised learning has so far focused on a single modality
Data2Vec
- A self-supervised framework that applies the same learning method to speech, NLP, and vision
- Uses a standard Transformer architecture and, in a self-distillation setup, predicts latent representations of the full input data based on a masked view of the input
- Instead of modality-specific targets, entire i..
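The self-distillation setup above can be sketched in a few lines: a teacher whose weights are an exponential moving average (EMA) of the student sees the full input and produces regression targets, while the student predicts those targets from a masked view. This is a minimal toy sketch assuming a single linear map in place of the Transformer and plain MSE in place of the paper's Smooth L1; all shapes and the masking scheme are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "encoder": one linear map standing in for the Transformer.
student_w = rng.normal(size=(16, 16))
teacher_w = student_w.copy()          # teacher starts as a copy of the student

def ema_update(teacher, student, tau=0.999):
    """Teacher weights track the student as an exponential moving average."""
    return tau * teacher + (1 - tau) * student

x = rng.normal(size=(50, 16))         # 50 frames of 16-dim features (toy data)
mask = rng.random(50) < 0.5           # mask roughly half of the time steps

# Teacher sees the FULL input; its outputs are the regression targets.
targets = x @ teacher_w
# Student sees the masked view (masked frames simply zeroed here).
x_masked = np.where(mask[:, None], 0.0, x)
preds = x_masked @ student_w

# Regression loss computed only at masked positions.
loss = np.mean((preds[mask] - targets[mask]) ** 2)
teacher_w = ema_update(teacher_w, student_w)
```

The key design point this illustrates is that the targets are the teacher's contextualized latents rather than a modality-specific token or spectrogram target, which is what lets the same recipe apply across speech, NLP, and vision.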

XLSR: Unsupervised Cross-Lingual Representation Learning for Speech Recognition
Cross-lingual speech representations can be obtained by pre-training a single model on multiple languages
XLSR
- Builds on Wav2Vec 2.0, jointly learning a quantization of the latents shared across languages
- Additionally performs fine-tuning on labeled data
Paper (INTERSPEECH 2021): Paper Link
1. Introduction
Cross-lingual learning leverages other languages so that model perfor..

Wav2Vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
Powerful representations can be learned from speech audio alone, and fine-tuning on transcribed speech can improve speech recognition performance
Wav2Vec 2.0
- Masks the speech input in the latent space
- Solves a contrastive task over a quantization of jointly learned latent representations
Paper (NeurIPS 2020): Paper Link
1. Introduction
In speech recognition, labeled data..
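The contrastive task above asks the model to pick the true quantized latent for a masked time step out of a set of distractors sampled from other masked steps. A minimal numpy sketch of that InfoNCE-style objective, with toy random vectors standing in for the context-network prediction and quantized latents (dimensions, temperature, and distractor count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def contrastive_loss(pred, positive, distractors, temperature=0.1):
    """InfoNCE-style loss: identify the true quantized latent among distractors."""
    candidates = [positive] + list(distractors)
    sims = np.array([cosine(pred, c) for c in candidates]) / temperature
    log_probs = sims - np.log(np.sum(np.exp(sims)))   # log-softmax over candidates
    return -log_probs[0]                              # positive sits at index 0

dim = 8
positive = rng.normal(size=dim)                       # true quantized latent
pred_good = positive + 0.01 * rng.normal(size=dim)    # context prediction near the target
pred_bad = rng.normal(size=dim)                       # unrelated prediction
distractors = rng.normal(size=(10, dim))              # sampled from other masked steps

loss_good = contrastive_loss(pred_good, positive, distractors)
loss_bad = contrastive_loss(pred_bad, positive, distractors)
```

A prediction close to the true quantized latent yields a much lower loss than an unrelated one, which is what pushes the context network to infer the masked speech content.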