ZeRO-Offload - Democratizing Billion-Scale Model Training
Offered By: USENIX via YouTube
Course Description
Overview
Explore ZeRO-Offload, a technology democratizing billion-scale model training, in this 14-minute conference talk from USENIX ATC '21. Learn how this approach enables training models with over 13 billion parameters on a single GPU, a tenfold increase in model size compared to popular frameworks like PyTorch. Discover the techniques used to offload data and compute to CPU while minimizing data movement and maximizing GPU memory savings. Understand how ZeRO-Offload maintains computational efficiency, scaling near-linearly on up to 128 GPUs, and how it can work with model parallelism to train even larger models. Gain insights into the unique optimal offload strategy, scheduling for single and multi-GPU setups, and optimized CPU execution. Examine evaluation results covering model scale, training throughput, and scalability. By the end of this talk, grasp how ZeRO-Offload makes large-scale model training accessible to data scientists with limited GPU resources.
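To make the memory-saving claim concrete, here is a rough back-of-the-envelope sketch in Python. It is not taken from the talk. It assumes the commonly cited accounting of 16 bytes per parameter for mixed-precision Adam training, and it assumes a placement in which only the fp16 parameters stay on the GPU while gradients and optimizer states are held on the CPU. The function name and the placement table are hypothetical, and activations, temporary buffers, and gradients in flight on the GPU are ignored.

```python
# Rough model-state memory estimate for mixed-precision Adam training.
# Assumption: 16 bytes per parameter in total, split as below.
BYTES_PER_PARAM = {
    "fp16_params": 2,
    "fp16_grads": 2,
    "fp32_params": 4,
    "fp32_momentum": 4,
    "fp32_variance": 4,
}

# Assumed offload placement (hypothetical table): only fp16 parameters
# remain on the GPU; everything else is stored in CPU memory.
GPU_RESIDENT = {"fp16_params"}


def model_state_gb(num_params, offload):
    """Return (gpu_gb, cpu_gb) of model states, in decimal gigabytes."""
    gpu_bytes = 0
    cpu_bytes = 0
    for name, nbytes in BYTES_PER_PARAM.items():
        if offload and name not in GPU_RESIDENT:
            cpu_bytes += nbytes * num_params
        else:
            gpu_bytes += nbytes * num_params
    return gpu_bytes / 1e9, cpu_bytes / 1e9


if __name__ == "__main__":
    n = 13_000_000_000  # 13 billion parameters, as in the overview
    print("no offload  :", model_state_gb(n, offload=False))
    print("with offload:", model_state_gb(n, offload=True))
```

Under these assumptions, a 13-billion-parameter model needs about 208 GB of model states without offloading, but only about 26 GB on the GPU with offloading, with the remaining 182 GB or so held in CPU memory.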
Syllabus
Intro
The Size of Deep Learning Models Is Increasing Quickly
Billion-Scale Model Training - Scale Out Large
Mixed-precision training
Limiting CPU Computation
Minimizing Communication Volume
ZeRO-Offload enables large model training, offloading data and compute to CPU
Unique Optimal Offload Strategy
ZeRO-Offload Single GPU Schedule
ZeRO-Offload Multi-GPU Schedule
Optimized CPU Execution
Evaluation
Model Scale
Training Throughput - Single GPU
Training Throughput - Multiple GPUs
Throughput Scalability
One-step Delayed Parameter Update (DPU)
Conclusions
Taught by
USENIX
Related Courses
Amazon DynamoDB - A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service
USENIX via YouTube
Faasm - Lightweight Isolation for Efficient Stateful Serverless Computing
USENIX via YouTube
AC-Key - Adaptive Caching for LSM-based Key-Value Stores
USENIX via YouTube
The Future of the Past - Challenges in Archival Storage
USENIX via YouTube
A Decentralized Blockchain with High Throughput and Fast Confirmation
USENIX via YouTube