Compressing Large Language Models (LLMs) with Python Code - 3 Techniques

Offered By: Shaw Talebi via YouTube

Tags

Model Compression Courses, Machine Learning Courses, Python Courses, BERT Courses, Quantization Courses, Hugging Face Courses

Course Description

Overview

Explore three methods for compressing Large Language Models (LLMs): Quantization, Pruning, and Knowledge Distillation (also called Model Distillation), each with accompanying Python code examples. Learn why model size is a problem and what compression techniques offer in return. Follow a practical demonstration that combines Knowledge Distillation and Quantization to compress a BERT-based phishing classifier. Access additional resources, including a blog post, a GitHub repository, pre-trained models, and a dataset, to explore LLM compression further.
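
For readers who want a feel for the first two techniques before watching, here is a minimal sketch in plain Python. It is not the course's code: the function names (quantize_int8, dequantize, magnitude_prune) are hypothetical, the weights are a flat list rather than a real model's tensors, and symmetric per-tensor int8 quantization and magnitude-based pruning are assumed as representative variants.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integer codes in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 if every weight is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.01, 0.9]  # toy values, not from any real model
    codes, scale = quantize_int8(weights)
    print(codes, dequantize(codes, scale))
    print(magnitude_prune(weights, sparsity=0.5))
```

Quantization stores each weight in fewer bits at the cost of a small rounding error, while pruning removes weights outright. In practice both are usually applied through library tooling rather than hand-written loops.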

Syllabus

Intro
"Bigger is Better"
The Problem
Model Compression
1. Quantization
2. Pruning
3. Knowledge Distillation
Example: Compressing a model with KD + Quantization
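
The knowledge distillation item in the syllabus can be illustrated with a common form of the training loss, sketched below in plain Python. This is an assumption about the general technique, not the formulation used in the video: the names softmax and distillation_loss, the temperature of 2.0, and the weighting alpha of 0.5 are all hypothetical choices for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target term (match the teacher) with a hard-label term."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions.
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    soft = kl * temperature ** 2  # conventional rescaling by T^2
    # Standard cross-entropy of the student against the true label.
    hard = -math.log(softmax(student_logits)[label])
    return alpha * soft + (1 - alpha) * hard


if __name__ == "__main__":
    teacher = [2.0, -1.0]  # toy logits for a two-class (e.g. phishing / safe) task
    student = [0.5, 0.1]
    print(distillation_loss(student, teacher, label=0))
```

The soft term pushes the smaller student model toward the teacher's full output distribution, and the hard term keeps it anchored to the ground-truth labels.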


Taught by

Shaw Talebi

Related Courses

Sentiment Analysis with Deep Learning using BERT
Coursera Project Network via Coursera
Natural Language Processing with Attention Models
DeepLearning.AI via Coursera
Fine Tune BERT for Text Classification with TensorFlow
Coursera Project Network via Coursera
Deploy a BERT question answering bot on Django
Coursera Project Network via Coursera
Generating discrete sequences: language and music
Ural Federal University via edX