Posts in the 'Paper/Neural Codec' category

[Paper Review] Say More with Less: Variable-Frame-Rate Speech Tokenization via Adaptive Clustering and Implicit Duration Coding

Existing speech tokenizers assign a fixed number of tokens per second regardless of information density or temporal fluctuation, which creates a mismatch with the intrinsic structure of speech. VARSTok introduces Temporal-Aware Density Peak Clustering, which adaptively segments speech into variable-length units. Content, temporal span: single token in..

Paper/Neural Codec 2026. 2. 11. 13:29

[Paper Review] Scaling Transformers for Low-Bitrate High-Quality Speech Coding

Existing speech tokenization models mostly focus on low parameter-count architectures built from components with strong inductive biases. TAAE scales the tokenization model by using a Transformer architecture with a large parameter count, and introduces a Finite Scalar Quantization-based bottleneck to improve speech quality at low bit-rates. Paper (ICLR 2025): Paper Link. 1. Introduction SoundStre..

Paper/Neural Codec 2026. 1. 29. 13:20
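
The TAAE entry above mentions a Finite Scalar Quantization (FSQ) based bottleneck. As a rough illustration of the general FSQ idea, and not the paper's implementation, the sketch below bounds each latent dimension with tanh, rounds it to one of a few integer levels, and packs the per-dimension codes into a single token index. The function names `fsq_quantize` and `fsq_index`, the level counts, and the restriction to odd level counts are assumptions made for this example.

```python
import math

def fsq_quantize(z, levels):
    """Round each dimension of z to one of levels[i] integer codes.

    Assumption: every entry of `levels` is odd, so the codes are
    symmetric around zero (e.g. 5 levels -> codes in {-2, ..., 2}).
    """
    codes = []
    for value, n_levels in zip(z, levels):
        half = (n_levels - 1) / 2
        bounded = math.tanh(value) * half  # squash into [-half, half]
        codes.append(round(bounded))       # nearest integer level
    return codes

def fsq_index(codes, levels):
    """Pack per-dimension codes into one token index (mixed radix)."""
    index = 0
    for code, n_levels in zip(codes, levels):
        half = (n_levels - 1) // 2
        index = index * n_levels + (code + half)
    return index

levels = [5, 5, 5]                 # hypothetical: 5 * 5 * 5 = 125 tokens
codes = fsq_quantize([0.0, 10.0, -10.0], levels)
print(codes, fsq_index(codes, levels))
```

With these hypothetical settings the implicit codebook has 125 entries and needs no learned codebook vectors. In training, the rounding step would additionally need a gradient workaround such as a straight-through estimator, which this sketch leaves out.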

[Paper Review] Variable Bitrate Residual Vector Quantization for Audio Coding

Neural audio codecs are suboptimal in terms of the rate-distortion trade-off. VRVQ supports efficient coding by adapting the number of codebooks used per frame, and introduces a gradient estimation method for the non-differentiable masking operation that transforms an importance map into a binary importance mask. Paper (ICASSP 2025): Paper Link. 1. Introduction Recently, SoundStream, EnCodec, DAC, and similar Residual Ve..

Paper/Neural Codec 2026. 1. 8. 12:50
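
The VRVQ entry above describes turning a per-frame importance map into a binary mask over codebooks. A minimal sketch of that idea follows, assuming the importance value lies in [0, 1] and is scaled by the number of codebooks before thresholding. The names `hard_mask`, `soft_mask`, and `bits_per_frame`, the sigmoid surrogate, and the codebook size of 1024 are illustrative assumptions and not the paper's gradient estimator.

```python
import math

def hard_mask(importance, n_codebooks):
    """Binary mask: codebook k is kept when importance * n_codebooks > k."""
    scaled = importance * n_codebooks
    return [1 if scaled > k else 0 for k in range(n_codebooks)]

def soft_mask(importance, n_codebooks, sharpness=4.0):
    """Smooth stand-in for hard_mask (assumption: a shifted sigmoid).

    The hard threshold has zero gradient almost everywhere, so a smooth
    surrogate like this is one way to obtain a usable gradient.
    """
    scaled = importance * n_codebooks
    return [1.0 / (1.0 + math.exp(-sharpness * (scaled - k - 0.5)))
            for k in range(n_codebooks)]

def bits_per_frame(mask, codebook_size=1024):
    """Bits spent on one frame: active codebooks * log2(codebook size)."""
    return sum(mask) * math.log2(codebook_size)

for p in [0.1, 0.5, 1.0]:          # hypothetical importance values
    mask = hard_mask(p, n_codebooks=8)
    print(p, mask, bits_per_frame(mask))
```

Frames with low importance keep only the first codebook or two, while high-importance frames keep all of them, which is what makes the bitrate variable from frame to frame.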

[Paper Review] PURE Codec: Progressive Unfolding of Residual Entropy for Speech Codec Learning

Neural speech codecs face reconstruction limits caused by Residual Vector Quantization. PURE Codec uses a pre-trained speech enhancement model to guide multi-stage quantization. The first stage reconstructs low-entropy, denoised speech embeddings, and the second stage encodes the residual high-entropy components. Paper (ASRU 2025): Paper Link. 1. I..

Paper/Neural Codec 2025. 12. 9. 13:04

[Paper Review] Language-Codec: Bridging Discrete Codec Representations and Speech Language Models

Discrete acoustic codecs are used as intermediate representations in speech language models. Language-Codec introduces Masked Channel Residual Vector Quantization to address the problem of excessive information in the initial codebooks, and additionally applies a Fourier transform structure, attention blocks, and a refined discriminator. Paper (ACL 2025): Paper Link. 1. Introduction VALL-E..

Paper/Neural Codec 2025. 11. 27. 14:26

[Paper Review] SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound

Most neural codecs operate at high bitrates and cover a narrow domain. SemantiCodec compresses diverse domains such as speech, general sound, and music into at most 100 tokens per second. It uses a dual-encoder architecture consisting of an acoustic encoder and a Self-Supervised Pre-Trained Audio Masked AutoEncoder discretized via $k$-means clustering. Paper (JSTSP 2024): Paper Link. 1. Intro..

Paper/Neural Codec 2025. 11. 18. 13:07
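
The SemantiCodec entry above mentions discretizing encoder features with $k$-means clustering. The sketch below shows only the assignment step of that general technique: each feature frame is mapped to the index of its nearest centroid, and that index serves as the token. The centroids are assumed to be already fitted, and the names `nearest_centroid` and `discretize` as well as the toy 2-D data are hypothetical.

```python
def nearest_centroid(vector, centroids):
    """Return the index of the centroid closest to vector (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)),
               key=lambda i: sq_dist(vector, centroids[i]))

def discretize(frames, centroids):
    """Map a sequence of feature frames to a sequence of token indices."""
    return [nearest_centroid(frame, centroids) for frame in frames]

# Toy data: three pre-fitted centroids and three feature frames.
centroids = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
frames = [[0.1, -0.1], [0.9, 1.2], [-0.8, 0.7]]
print(discretize(frames, centroids))  # prints [0, 1, 2]
```

In this scheme the token rate equals the frame rate of the features being discretized, so a lower frame rate directly lowers the number of tokens per second.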


Let IT Begin
