High Fidelity Neural Audio Compression - Paper & Code Explained

Offered By: Aleksa Gordić - The AI Epiphany via YouTube

Tags

Neural Networks Courses, Machine Learning Courses, Quantization Courses, Audio Processing Courses, Transformers Courses

Course Description

Overview

Dive into a comprehensive video explanation of the "High Fidelity Neural Audio Compression" paper and its accompanying code. Explore techniques that achieve roughly 10x higher compression than MP3 at comparable audio quality, operating at a bitrate of just 6 kbps. Learn how concepts such as VQ-VAE, VQ-GAN, and AudioGen relate to neural audio compression. Follow along with a detailed paper walk-through, a code analysis, and in-depth explanations of key components such as Residual Vector Quantization, the EnCodec architecture, and efficient bit packing. Gain insights into the potential impact on internet traffic reduction and the future of audio streaming technology.
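
As a rough illustration of where a figure like 6 kbps can come from, the sketch below computes the bitrate of a codec that sends one codebook index per quantizer stage for every latent frame. The function name `bitrate_bps` is hypothetical, and the configuration values (75 frames per second, 1024-entry codebooks, 8 codebooks) are assumptions based on the paper's 24 kHz model; they do not appear on this page.

```python
import math


def bitrate_bps(num_codebooks: int, codebook_size: int, frames_per_second: float) -> float:
    """Bits per second needed to send one index per codebook for every frame."""
    # An index into a codebook with N entries costs log2(N) bits
    # when no entropy coding is applied.
    bits_per_index = math.log2(codebook_size)
    return num_codebooks * bits_per_index * frames_per_second


# Assumed configuration: 8 codebooks of 1024 entries at 75 latent frames per second.
rate = bitrate_bps(num_codebooks=8, codebook_size=1024, frames_per_second=75)
print(rate)  # 6000.0 bits per second, i.e. 6 kbps
```

Under the same assumptions, using fewer codebooks lowers the bitrate proportionally (for example, 2 codebooks give 1.5 kbps), which is how a single model can serve several target bandwidths.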

Syllabus

Intro
Paper walk-through: high level overview
Residual Vector Quantization
Reducing the bandwidth using arithmetic coding and transformers
Loss formulations and results
Code walk-through
EnCodec architecture
Residual Vector Quantizer module
Loading the audio signal
Compression - a forward pass through the encoder
Quantization forward pass
Efficiently packing the bits
Using a language model (LM) to further compress audio
Outro
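
The Residual Vector Quantization topic in the syllabus can be sketched in a few lines of plain Python. This is a minimal toy, not the EnCodec implementation: the function names (`nearest`, `rvq_encode`, `rvq_decode`) and the hand-written two-dimensional codebooks are hypothetical, and real codebooks are learned during training. Each stage quantizes whatever error the previous stage left behind, so the decoder only has to sum the selected entries.

```python
def nearest(codebook, vector):
    """Return the index of the codebook entry closest to vector (squared L2)."""
    def dist(entry):
        return sum((e - v) ** 2 for e, v in zip(entry, vector))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(codebooks, vector):
    """Quantize vector with a cascade of codebooks; return one index per stage."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        # The next stage only sees what this stage failed to represent.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(codebooks, indices):
    """Reconstruct the vector by summing the selected entry of each stage."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy codebooks with made-up values: a coarse stage followed by a finer one.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0], [1.0, -1.0]],
    [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [-0.25, -0.25]],
]
x = [1.2, 0.9]
codes = rvq_encode(codebooks, x)
x_hat = rvq_decode(codebooks, codes)
print(codes)  # [1, 1]
print(x_hat)  # [1.25, 1.0]
```

Only the integer indices in `codes` need to be stored or transmitted; dropping later stages gives a coarser reconstruction at a lower bitrate.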


Taught by

Aleksa Gordić - The AI Epiphany

Related Courses

Introduction to Digital Sound Design
Emory University via Coursera
Foundations of Wavelets and Multirate Digital Signal Processing
Indian Institute of Technology Bombay via Swayam
iOS Development for Creative Entrepreneurs
University of California, Irvine via Coursera
Deploying TinyML
Harvard University via edX
Digital Signal Processing
École Polytechnique Fédérale de Lausanne via Coursera