
ParaNoise-SV: Integrated Approach for Noise-Robust Speaker Verification with Parallel Joint Learning of Speech Enhancement and Noise Extraction
Existing speaker verification models are limited in noise robustness. ParaNoise-SV uses a dual U-Net that combines a Noise Extraction network with a Speech Enhancement network. The Noise Extraction U-Net explicitly models the noise, and the Speech Enhancement U-Net, via a parallel connection, ..

CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking
ECAPA-TDNN suffers from high complexity and slow inference speed. CAM++ applies Context-Aware Masking to a densely connected Time Delay Neural Network backbone and uses multi-granularity pooling to capture contextual information at different levels.
Paper (INTERSPEECH 2023): Paper Link
1. Introduction
In Speaker Verification (SV), voice characteristic..

CAM: Context-Aware Masking for Robust Speaker Verification
Speaker verification performance degrades in the presence of noise. CAM builds a speaker embedding network that focuses on the speaker of interest and blurs unrelated noise. An auxiliary context embedding that captures speaker and noise characteristics dynamically controls the masking threshold.
Paper (ICASSP 2021): Paper Link
1. Introduction
Speaker verification takes a test utterance and the enrollment and compar..

SSPS: Self-Supervised Positive Sampling for Robust Self-Supervised Speaker Verification
In speaker verification, self-supervised learning uses only anchor-positive pairs from the same speaker. For a given anchor, SSPS applies clustering assignments and a memory queue in the latent space to find appropriate positives that come from the same speaker but from different recording conditions.
Paper (INTERSPEECH 2025): Paper Link
1. Introduction
In Speaker Verification (SV), a given..

ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN based Speaker Verification
Speaker verification relies on neural networks that extract speaker representations. ECAPA-TDNN restructures the initial frame layers into 1-dimensional Res2Net modules and introduces Squeeze-and-Excitation blocks to explicitly model channel interdependencies. It aggregates and propagates features from different hierarchical levels and channe..
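
The Squeeze-and-Excitation idea named in the ECAPA-TDNN entry can be illustrated with a minimal pure-Python sketch. This is not the ECAPA-TDNN implementation: the function name `squeeze_excite`, the weight arguments `w1`, `b1`, `w2`, `b2`, and the list-of-lists feature layout are all assumptions made for illustration. It only shows the generic pattern of averaging each channel over time, passing the result through a small bottleneck, and rescaling the channels with the resulting gates.

```python
import math


def squeeze_excite(x, w1, b1, w2, b2):
    """Generic 1-D Squeeze-and-Excitation sketch (illustrative, hypothetical API).

    x  : list of C channels, each a list of T frame values
    w1 : bottleneck weights, shape [H][C];  b1 : bias, length H
    w2 : expansion weights,  shape [C][H];  b2 : bias, length C
    """
    # Squeeze: one descriptor per channel, the mean over all frames.
    z = [sum(channel) / len(channel) for channel in x]

    # Excitation, step 1: bottleneck linear layer followed by ReLU.
    h = [max(0.0, sum(w * v for w, v in zip(row, z)) + b)
         for row, b in zip(w1, b1)]

    # Excitation, step 2: linear layer followed by a sigmoid,
    # giving one gate in (0, 1) per channel.
    gates = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, h)) + b)))
             for row, b in zip(w2, b2)]

    # Scale: multiply every frame of a channel by that channel's gate.
    return [[g * v for v in channel] for g, channel in zip(gates, x)]


# Toy input: 2 channels, 2 frames, all-zero weights (assumed values).
features = [[1.0, 3.0], [2.0, 2.0]]
out = squeeze_excite(features,
                     w1=[[0.0, 0.0]], b1=[0.0],
                     w2=[[0.0], [0.0]], b2=[0.0, 0.0])
```

With all-zero weights every gate is sigmoid(0) = 0.5, so each channel is simply halved; trained weights would instead emphasize some channels over others.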

NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification
ConvNet structures can be used to improve ECAPA-TDNN for speaker verification. NeXt-TDNN replaces the SE-Res2Net block of ECAPA-TDNN with a TS-ConvNeXt block, which consists of temporal multi-scale convolution and a frame-wise feed-forward network. Global response normalization is introduced into the frame-wise feed-forward network for selective feature p..
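
As a rough illustration of the global response normalization mentioned in the NeXt-TDNN entry, the sketch below follows the commonly used ConvNeXt V2 style formulation, adapted to a channels-by-frames layout. Whether the post applies it in exactly this form is an assumption, since the excerpt is truncated. The names `global_response_norm`, `gamma`, `beta`, and `eps` are hypothetical.

```python
import math


def global_response_norm(x, gamma, beta, eps=1e-6):
    """Global response normalization sketch (assumed ConvNeXt V2 style form).

    x     : list of C channels, each a list of T frame values
    gamma : per-channel scale, length C
    beta  : per-channel shift, length C
    """
    # Global aggregation: L2 norm of each channel over all frames.
    g = [math.sqrt(sum(v * v for v in channel)) for channel in x]

    # Divisive normalization: each channel norm relative to the mean norm.
    mean_g = sum(g) / len(g)
    n = [gi / (mean_g + eps) for gi in g]

    # Calibration with a residual connection: gamma * (x * n) + beta + x.
    return [[gam * (v * ni) + bet + v for v in channel]
            for channel, ni, gam, bet in zip(x, n, gamma, beta)]


# Toy input: one active channel and one silent channel (assumed values).
frames = [[3.0, 4.0], [0.0, 0.0]]
identity = global_response_norm(frames, gamma=[0.0, 0.0], beta=[0.0, 0.0])
boosted = global_response_norm(frames, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

With `gamma` and `beta` set to zero the residual path makes the function an identity; with nonzero `gamma`, channels whose norm exceeds the average are amplified relative to the others, which is the channel-competition effect the technique is meant to provide.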