YoVDO

How to Quantize a Large Language Model with GGUF or AWQ

Offered By: Trelis Research via YouTube

Tags

Quantization Courses, Machine Learning Courses, Model Compression Courses, Hugging Face Courses, llama.cpp Courses, GGUF Courses

Course Description

Overview

Learn how to quantize large language models using GGUF or AWQ in this 26-minute video tutorial. Explore the reasons for quantization, understand different quantization methods, and compare GGUF, BNB, AWQ, and GPTQ techniques. Follow step-by-step instructions for quantizing models with AWQ and GGUF (GGML), and gain access to advanced fine-tuning resources, including scripts for unsupervised and supervised fine-tuning, dataset preparation, and embedding creation. Discover valuable resources such as presentation slides, GitHub repositories, and related research papers to enhance your understanding of LLM quantization techniques.
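The description covers quantization at a conceptual level. As a rough illustration of the core idea only (this is not code from the course), a minimal symmetric "absmax" int8 quantizer in numpy shows what quantizing weights actually does: floats are rescaled so the largest magnitude maps to 127, stored as 8-bit integers, and approximately recovered by multiplying back by the scale.

```python
import numpy as np

def quantize_absmax_int8(weights: np.ndarray):
    """Symmetric 'absmax' quantization: scale floats so the largest
    magnitude lands at 127, then round to int8."""
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Map int8 codes back to approximate float values."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 1.27], dtype=np.float32)
q, s = quantize_absmax_int8(w)
w_hat = dequantize(q, s)
# reconstruction error is bounded by half a quantization step
assert np.max(np.abs(w - w_hat)) <= s / 2 + 1e-6
```

Real schemes such as GGUF's k-quants, AWQ, and GPTQ refine this basic recipe with per-group scales, zero points, and activation- or error-aware weight selection, but the scale-round-store loop above is the common core.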

Syllabus

How to quantize a large language model
Why quantize a language model
What is quantization
Which quantization to use?
GGUF vs BNB vs AWQ vs GPTQ
How to quantize with AWQ
How to quantize with GGUF (GGML)
Recap
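
The AWQ and GGUF steps in the syllabus both rest on group-wise low-bit quantization: each small group of weights gets its own scale (and, in asymmetric schemes, a zero point), which keeps outliers in one group from wrecking precision elsewhere. A simplified numpy sketch of asymmetric 4-bit group quantization (function names here are illustrative, not the AutoAWQ or llama.cpp API):

```python
import numpy as np

def quantize_group_4bit(w: np.ndarray, group_size: int = 4):
    """Asymmetric 4-bit quantization: each group of weights gets its
    own scale and zero point, so integer codes span 0..15."""
    w = w.reshape(-1, group_size)
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0            # 2**4 - 1 levels
    scale = np.where(scale == 0, 1.0, scale)  # guard constant groups
    zero = np.round(-w_min / scale)           # per-group zero point
    q = np.clip(np.round(w / scale) + zero, 0, 15).astype(np.uint8)
    return q, scale, zero

def dequantize_group(q, scale, zero):
    """Recover approximate floats from 4-bit codes."""
    return (q.astype(np.float32) - zero) * scale

w = np.linspace(-1.0, 1.0, 16).astype(np.float32)
q, scale, zero = quantize_group_4bit(w)
w_hat = dequantize_group(q, scale, zero).reshape(-1)
# per-group error stays within about half a quantization step
assert np.max(np.abs(w - w_hat)) <= scale.max() / 2 + 1e-6
```

Production quantizers add further machinery on top of this, for example AWQ's activation-aware per-channel scaling and GGUF's nested "super-block" scales, and they pack two 4-bit codes per byte rather than storing one per uint8 as above.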


Taught by

Trelis Research

Related Courses

TensorFlow Lite for Edge Devices - Tutorial
freeCodeCamp
Few-Shot Learning in Production
HuggingFace via YouTube
TinyML Talks Germany - Neural Network Framework Using Emerging Technologies for Screening Diabetic
tinyML via YouTube
TinyML for All: Full-stack Optimization for Diverse Edge AI Platforms
tinyML via YouTube
TinyML Talks - Software-Hardware Co-design for Tiny AI Systems
tinyML via YouTube