8-bit Methods for Efficient Deep Learning
Offered By: Center for Language & Speech Processing(CLSP), JHU via YouTube
Course Description
Overview
Syllabus
Intro
How does quantization work?
Quantization as a mapping
Quantization Example: A non-standard 2-bit data ty
Floating point data types (FP8)
Dynamic exponent quantization
Motivation: Optimizers take up a lot of memory!
What do outliers in quantization look like?
Block-wise quantization
Putting it together: 8-bit optimizers
Using OPT-175B on a single machine via 8-bit weig
The problem with quantizing outliers with large valu
Emergent features: sudden vs. smooth emergence
Mixed precision decomposition
Bit-level scaling laws experimental setup overview
What does help to improve scaling? Data types
Nested Quantization
Instruction Tuning with 4-bit + Adapters
4-bit Normal Float (NF4)
Taught by
Center for Language & Speech Processing(CLSP), JHU
Related Courses
Digital Signal ProcessingÉcole Polytechnique Fédérale de Lausanne via Coursera Principles of Communication Systems - I
Indian Institute of Technology Kanpur via Swayam Digital Signal Processing 2: Filtering
École Polytechnique Fédérale de Lausanne via Coursera Digital Signal Processing 3: Analog vs Digital
École Polytechnique Fédérale de Lausanne via Coursera Digital Signal Processing 4: Applications
École Polytechnique Fédérale de Lausanne via Coursera