High Fidelity Neural Audio Compression - Paper & Code Explained
Offered By: Aleksa Gordić - The AI Epiphany via YouTube
Course Description
Overview
Dive into a comprehensive video explanation of the "High Fidelity Neural Audio Compression" paper and its accompanying code. Explore cutting-edge techniques that achieve roughly 10x higher compression than MP3 while maintaining comparable audio quality at just 6 kbps. Learn how concepts such as VQ-VAE, VQ-GAN, and AudioGen apply to audio compression. Follow along with a detailed paper walk-through, code analysis, and in-depth explanations of key components such as Residual Vector Quantization, the EnCodec architecture, and efficient bit packing. Gain insights into the potential impact on internet traffic reduction and the future of audio streaming technology.
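As a rough illustration of where a figure like 6 kbps can come from, the sketch below computes the bitrate of a residual vector quantizer from three numbers. The values used (75 latent frames per second, 8 codebooks of 1024 entries each) are assumptions based on the commonly cited 24 kHz EnCodec configuration, not details stated in this listing, and the 64 kbps MP3 reference point is likewise an assumption.

```python
import math

def rvq_bitrate_bps(frame_rate_hz, num_codebooks, codebook_size):
    """Bits per second needed to send one index per codebook per frame."""
    bits_per_index = math.log2(codebook_size)  # e.g. 1024 entries -> 10 bits
    return frame_rate_hz * num_codebooks * bits_per_index

# Assumed configuration (not taken from this page).
bitrate = rvq_bitrate_bps(frame_rate_hz=75, num_codebooks=8, codebook_size=1024)

# Assumed MP3 reference bitrate, used only to illustrate the ratio.
mp3_reference_bps = 64_000
ratio = mp3_reference_bps / bitrate

print(f"{bitrate / 1000:.1f} kbps, about {ratio:.1f}x smaller than 64 kbps")
```

Under these assumptions, dropping codebooks lowers the bitrate proportionally, which is one way a single model can serve several target bandwidths.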
Syllabus
Intro
Paper walk-through: high level overview
Residual Vector Quantization
Reducing the bandwidth using arithmetic coding and transformers
Loss formulations and results
Code walk-through
EnCodec architecture
Residual Vector Quantizer module
Loading the audio signal
Compression - a forward pass through the encoder
Quantization forward pass
Efficiently packing the bits
Using LM to further compress audio
Outro
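For readers new to the Residual Vector Quantization items in the syllabus, here is a minimal pure-Python sketch of the general idea: each codebook quantizes whatever the previous codebooks left unexplained, and decoding sums the chosen codewords. The function names and the tiny hand-written codebooks are hypothetical and chosen for illustration; this is not the EnCodec implementation, which learns its codebooks and operates on batches of latent frames.

```python
def nearest(codebook, vec):
    """Index of the codeword with the smallest squared distance to vec."""
    def sq_dist(codeword):
        return sum((a - b) ** 2 for a, b in zip(codeword, vec))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))

def rvq_encode(vec, codebooks):
    """Return one index per codebook, each quantizing the running residual."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        i = nearest(codebook, residual)
        codes.append(i)
        # Subtract the chosen codeword; the next stage refines what is left.
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return codes

def rvq_decode(codes, codebooks):
    """Reconstruct the vector by summing the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for i, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[i])]
    return out

# Toy codebooks (hypothetical): a coarse stage followed by a finer stage.
coarse = [[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0]]
fine = [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, 0.0]]

codes = rvq_encode([1.2, 1.0], [coarse, fine])
approx = rvq_decode(codes, [coarse, fine])
print(codes, approx)
```

Only the list of indices needs to be stored or transmitted, which is why the number of codebooks and their sizes determine the bitrate.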
Taught by
Aleksa Gordić - The AI Epiphany
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent