YoVDO

Full Fine-tuning LLMs with Lower VRAM: Optimizers, GaLore, and Advanced Techniques

Offered By: Trelis Research via YouTube

Tags

Fine-Tuning Courses, Gradient Descent Courses, LoRA (Low-Rank Adaptation) Courses

Course Description

Overview

This video tutorial covers techniques for full fine-tuning of large language models with limited GPU memory. It compares optimizers, including Stochastic Gradient Descent (SGD), AdamW, 8-bit AdamW, and Adafactor, by their VRAM requirements and training performance. It introduces GaLore, a method for reducing the VRAM used by gradients and optimizer states, and contrasts it with LoRA. It also explains layerwise gradient updates and how gradient checkpointing reduces memory. Practical demonstrations include a notebook demo of layerwise gradient updates and a run with LoRA. The tutorial closes with running inference, pushing models to the Hub, and recommendations for single-GPU and multi-GPU setups. Supporting resources include slides, GitHub repositories, and additional support channels.
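As a rough illustration of why the optimizer choice matters for VRAM, the sketch below estimates optimizer-state size only. It is not taken from the video. The function names are hypothetical, and the byte counts assume fp32 moments for AdamW, 1-byte moments for 8-bit AdamW (ignoring quantization metadata), plain SGD without momentum, Adafactor's factored second moment without a first moment, and a GaLore projection applied on the row side of each weight matrix. Weights, gradients, and activations are not counted.

```python
def optimizer_state_bytes(num_params: int, optimizer: str) -> int:
    """Approximate optimizer-state size in bytes for a whole model."""
    bytes_per_param = {
        "sgd": 0,         # assumption: plain SGD, no momentum buffer
        "adamw": 8,       # two fp32 moments, 4 bytes each
        "adamw_8bit": 2,  # two 8-bit moments, quantization metadata ignored
    }
    return num_params * bytes_per_param[optimizer]


def adafactor_second_moment_elems(rows: int, cols: int) -> int:
    """Factored second moment for one matrix: a row vector plus a column vector."""
    return rows + cols


def galore_adam_state_elems(rows: int, cols: int, rank: int) -> int:
    """Adam state for one matrix under a rank-`rank` projection.

    Assumption: a (rows x rank) projection matrix plus two moments
    stored in the projected (rank x cols) space.
    """
    return rows * rank + 2 * rank * cols


if __name__ == "__main__":
    # A hypothetical 7-billion-parameter model with standard fp32 AdamW.
    print(optimizer_state_bytes(7_000_000_000, "adamw") / 1e9)  # 56.0 (GB)
    # One hypothetical 4096 x 4096 weight matrix.
    print(2 * 4096 * 4096)                           # full Adam moments
    print(galore_adam_state_elems(4096, 4096, 128))  # projected, rank 128
    print(adafactor_second_moment_elems(4096, 4096))
```

Under these assumptions the AdamW state alone is several times larger than what a single consumer GPU offers for a 7B model, which is the gap that 8-bit optimizers, Adafactor, and GaLore each try to close in different ways.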

Syllabus

LLM Full fine-tuning with lower VRAM
Video Overview
Understanding Optimizers
Stochastic Gradient Descent (SGD)
AdamW Optimizer and VRAM requirements
AdamW 8-bit optimizer
Adafactor optimizer and memory requirements
GaLore - reducing gradient and optimizer VRAM
LoRA versus GaLore
Better and Faster GaLore via Subspace Descent
Layerwise gradient updates
Training Scripts
How gradient checkpointing works to reduce memory
AdamW Performance
AdamW 8-bit Performance
Adafactor with manual learning rate and schedule
Adafactor with default/auto learning rate
GaLore AdamW
GaLore AdamW with Subspace Descent
Using AdamW 8-bit and Adafactor with GaLore
Notebook demo of layerwise gradient updates
Running with LoRA
Running Inference and Pushing Models to the Hub
Single GPU Recommendations
Multi-GPU Recommendations
Resources


Taught by

Trelis Research

Related Courses

TensorFlow: Working with NLP
LinkedIn Learning
Introduction to Video Editing - Video Editing Tutorials
Great Learning via YouTube
HuggingFace Crash Course - Sentiment Analysis, Model Hub, Fine Tuning
Python Engineer via YouTube
GPT3 and Finetuning the Core Objective Functions - A Deep Dive
David Shapiro ~ AI via YouTube
How to Build a Q&A AI in Python - Open-Domain Question-Answering
James Briggs via YouTube