YoVDO

Compressing Large Language Models (LLMs) with Python Code - 3 Techniques

Offered By: Shaw Talebi via YouTube

Tags

Model Compression Courses, Machine Learning Courses, Python Courses, BERT Courses, Quantization Courses, Hugging Face Courses

Course Description

Overview

Explore three methods for compressing Large Language Models (LLMs): quantization, pruning, and knowledge distillation (also called model distillation), each with accompanying Python code examples. Learn why model size is a problem and how compression techniques address it. Follow along with a practical demonstration that combines knowledge distillation and quantization to compress a BERT-based phishing classifier. Additional resources include a blog post, a GitHub repository, pre-trained models, and a dataset for further exploration of LLM compression techniques.
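As a rough illustration of the two weight-level techniques named above, the sketch below applies symmetric 8-bit quantization and magnitude pruning to a small list of weights. It is not the course's code: the function names, the toy weight values, and the 50% sparsity level are assumptions chosen for illustration, and real models would apply these steps to tensors through a library rather than plain Python lists.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole list (assumption: per-tensor, not per-channel).
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in q]


def magnitude_prune(weights, sparsity):
    """Set the `sparsity` fraction of weights with the smallest magnitude to zero."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Hypothetical toy weights, for illustration only.
weights = [0.82, -1.27, 0.05, 0.4, -0.01, 0.63]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

pruned = magnitude_prune(weights, 0.5)

print("quantized:", q)
print("max round-trip error:", max_error)
print("pruned:", pruned)
```

Quantization keeps every weight but stores it at lower precision, so the round-trip error is bounded by about half the scale. Pruning keeps full precision but discards the weights assumed to matter least.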

Syllabus

Intro
"Bigger is Better"
The Problem
Model Compression
1. Quantization
2. Pruning
3. Knowledge Distillation
Example: Compressing a model with KD + Quantization
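
For readers unfamiliar with the knowledge distillation step in the syllabus above, here is a minimal sketch of one common form of the distillation loss: a weighted sum of a soft-target term (KL divergence between temperature-softened teacher and student distributions) and a hard-label cross-entropy term. This is an assumed, generic formulation and not necessarily the one used in the video; the `temperature` and `alpha` values and the toy logits are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """alpha * soft-target KL term + (1 - alpha) * hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on temperature-softened distributions.
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Scaling by T^2 keeps the soft term's gradient magnitude comparable across temperatures.
    soft *= temperature ** 2
    # Standard cross-entropy against the true label at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    return alpha * soft + (1 - alpha) * hard


# Hypothetical two-class logits (e.g. "safe" vs "not safe"), true label = class 0.
teacher = [2.0, -1.0]
student_close = [1.8, -0.9]
student_far = [-1.0, 2.0]

loss_close = distillation_loss(student_close, teacher, label=0)
loss_far = distillation_loss(student_far, teacher, label=0)
loss_soft_only = distillation_loss(teacher, teacher, label=0, alpha=1.0)

print("student close to teacher:", loss_close)
print("student far from teacher:", loss_far)
```

A student whose logits track the teacher's gets a lower loss than one that disagrees, and with `alpha=1.0` the loss vanishes when the two sets of logits are identical.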


Taught by

Shaw Talebi

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent