반응형

Data2Vec: A General Framework for Self-Supervised Learning in Speech, Vision and LanguageSelf-supervised learning은 single modality에 초점을 두고 있음Data2VecSpeech, NLP, vision에 동일한 learning method를 적용하는 self-supervised frameworkStandard transformer architecture를 사용하고, self-distillation setup에서 input의 masked view를 기반으로 full input data의 latent representation을 predict- Modality-specific target 대신 entire i..
Paper/Representation
2025. 4. 5. 11:11
반응형