LLMOps: LLMs Memory and Compute Optimizations
Offered By: The Machine Learning Engineer via YouTube
Course Description
Overview
Explore FlashAttention and Grouped-Query Attention (GQA), two techniques for making self-attention layers more efficient, and learn how Fully Sharded Data Parallel (FSDP) and Distributed Data Parallel (DDP) are used to train and fine-tune Large Language Models (LLMs) in this 24-minute tutorial. Gain practical insight into memory and compute optimizations for LLMs, with access to the accompanying PowerPoint presentation and a hands-on Jupyter notebook for implementation.
Syllabus
LLMOps: LLMs Memory and Compute Optimizations
Taught by
The Machine Learning Engineer
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent