High Fidelity Neural Audio Compression - Paper & Code Explained
Offered By: Aleksa Gordić - The AI Epiphany via YouTube
Course Description
Overview
Dive into a comprehensive video explanation of the "High Fidelity Neural Audio Compression" paper and its accompanying code. Explore techniques reported to achieve roughly 10x better compression than MP3 while maintaining audio quality at just 6 kbps. Learn how ideas from related work such as VQ-VAE, VQ-GAN, and AudioGen apply to audio compression. Follow along with a detailed paper walk-through and code analysis, with in-depth explanations of key components such as Residual Vector Quantization, the EnCodec architecture, and efficient bit packing. Gain insights into the potential impact on internet traffic reduction and the future of audio streaming technology.
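For readers who want a feel for Residual Vector Quantization before watching, here is a minimal sketch in plain Python. It is not the EnCodec code: the function names (`rvq_encode`, `rvq_decode`, `bitrate_bps`) and the toy codebooks are hypothetical, chosen for illustration only. The bitrate helper uses assumed example values (75 frames per second, 8 codebooks of 1024 entries each) that do not come from this page.

```python
def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2 distance)."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(vec, codebooks):
    """Quantize vec in stages; each stage quantizes what the previous stage left over."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        # Subtract the chosen entry so the next stage only sees the remaining error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(indices, codebooks):
    """Reconstruct the vector by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def bitrate_bps(frames_per_second, num_codebooks, codebook_size):
    """Bits per second = frames/s * indices per frame * bits per index."""
    bits_per_index = (codebook_size - 1).bit_length()
    return frames_per_second * num_codebooks * bits_per_index


# Toy example: two stages, two entries each, 2-dimensional vectors.
toy_codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # coarse stage
    [[0.0, 0.0], [0.5, 0.0]],  # refinement stage
]
codes = rvq_encode([1.4, 1.1], toy_codebooks)
print(codes)                             # [1, 1]
print(rvq_decode(codes, toy_codebooks))  # [1.5, 1.0]

# Assumed example values: 75 frames/s, 8 codebooks, 1024 entries (10 bits) each.
print(bitrate_bps(75, 8, 1024))          # 6000
```

Only the integer indices need to be stored or transmitted, so under these assumptions the bitrate scales linearly with the number of codebooks used per frame. Dropping later stages lowers the bitrate at the cost of a coarser reconstruction.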
Syllabus
Intro
Paper walk-through: high level overview
Residual Vector Quantization
Reducing the bandwidth using arithmetic coding and transformers
Loss formulations and results
Code walk-through
EnCodec architecture
Residual Vector Quantizer module
Loading the audio signal
Compression - a forward pass through the encoder
Quantization forward pass
Efficiently packing the bits
Using a language model (LM) to further compress audio
Outro
Taught by
Aleksa Gordić - The AI Epiphany
Related Courses
Introduction to Digital Sound Design - Emory University via Coursera
Foundations of Wavelets and Multirate Digital Signal Processing - Indian Institute of Technology Bombay via Swayam
iOS Development for Creative Entrepreneurs - University of California, Irvine via Coursera
Deploying TinyML - Harvard University via edX
Digital Signal Processing - École Polytechnique Fédérale de Lausanne via Coursera