BigScience BLOOM - 3D Parallelism Explained - Large Language Models - ML Coding Series
Offered By: Aleksa Gordić - The AI Epiphany via YouTube
Course Description
Overview
Dive into the fourth video of the Large Language Model series, exploring the BigScience BLOOM model codebase with a focus on understanding 3D parallelism. Learn about pipeline parallelism, model parallelism, and data parallelism - the engineering concepts behind recent scaling efforts and machine learning successes. Follow along as the video walks through the eval script, model construction, sharding techniques, and the forward pass. Gain insights into embedding table sharding, transformer layer sharding, attention layer sharding, and the intricacies of ColumnParallel and RowParallel sharding. Understand how dataset building relates to data parallelism and explore pipeline parallelism communication. Conclude with a comprehensive recap of the 3D parallelism concepts covered in this in-depth, 72-minute tutorial.
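As an illustration of the ColumnParallel and RowParallel sharding mentioned above, here is a minimal pure-Python sketch of a two-layer MLP split across simulated devices. It is not taken from the BLOOM codebase: all function names, the matrix sizes, and the `world_size` parameter are assumptions chosen for illustration. The first weight matrix is split by columns and the second by rows, so each simulated device computes a partial output, and summing the partials stands in for the all-reduce a real multi-GPU setup would perform.

```python
import math
import random


def matmul(x, w):
    """Multiply matrix x (n x k) by matrix w (k x m), both lists of rows."""
    return [[sum(xi * wi for xi, wi in zip(row, col)) for col in zip(*w)] for row in x]


def gelu(m):
    """Element-wise exact GeLU."""
    return [[0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0))) for v in row] for row in m]


def add(m1, m2):
    """Element-wise sum of two equally shaped matrices."""
    return [[u + v for u, v in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


def split_columns(w, parts):
    """Column-parallel split: each shard keeps a contiguous block of columns."""
    step = len(w[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(parts)]


def split_rows(w, parts):
    """Row-parallel split: each shard keeps a contiguous block of rows."""
    step = len(w) // parts
    return [w[i * step:(i + 1) * step] for i in range(parts)]


def mlp_reference(x, a, b):
    """Unsharded MLP: GeLU(x @ a) @ b."""
    return matmul(gelu(matmul(x, a)), b)


def mlp_sharded(x, a, b, world_size):
    """Same MLP with a split by columns and b split by rows (hypothetical helper)."""
    out = None
    for a_i, b_i in zip(split_columns(a, world_size), split_rows(b, world_size)):
        # Each simulated device needs only its own shards; GeLU is element-wise,
        # so it can be applied to the local column block without communication.
        partial = matmul(gelu(matmul(x, a_i)), b_i)
        # Summing the partial outputs stands in for an all-reduce.
        out = partial if out is None else add(out, partial)
    return out


def max_abs_diff(m1, m2):
    """Largest absolute element-wise difference between two matrices."""
    return max(abs(u - v) for r1, r2 in zip(m1, m2) for u, v in zip(r1, r2))


random.seed(0)
x = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]    # batch x hidden
a = [[random.gauss(0, 1) for _ in range(16)] for _ in range(8)]   # hidden x 4*hidden-like
b = [[random.gauss(0, 1) for _ in range(8)] for _ in range(16)]   # back to hidden

full = mlp_reference(x, a, b)
sharded = mlp_sharded(x, a, b, world_size=4)
print(max_abs_diff(full, sharded) < 1e-9)
```

Splitting the first matrix by columns and the second by rows means the intermediate activation never has to be gathered; in this sketch a single sum at the end is the only cross-device step.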
Syllabus
Intro - focusing on the 3D parallelism!
Quick setup
Stepping through the eval script
3D parallelism - model construction
Sharding the embedding table (model parallelism)
Sharding the transformer layer
LayerNorm fused kernels
Sharding the attention layer
ColumnParallel and RowParallel sharding
Synchronizing input and output embedding tables
Building the dataset (data parallelism)
3D parallelism - forward pass
Pipeline parallelism communication
Pass through the sharded embedding table
Pass through the sharded transformer layer
Sharded logit and cross-entropy computation
Recap
Outro
Taught by
Aleksa Gordić - The AI Epiphany
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent