Generalized Pipeline Parallelism for DNN Training - PipeDream System Overview
Offered By: Databricks via YouTube
Course Description
Overview
Explore Generalized Pipeline Parallelism for DNN training in this 21-minute conference talk from Databricks. Learn about PipeDream, a system that combines inter-batch pipelining with intra-batch parallelism to improve parallel training throughput for deep neural networks. See how PipeDream addresses the weight version mismatches and pipeline flushes that pipelining introduces, using weight versioning and efficient scheduling, and how it automatically partitions DNN layers among workers to balance the workload and minimize communication.
The talk reports that PipeDream trains up to 5.3x faster than traditional intra-batch parallelism techniques while maintaining high accuracy. Topics include model parallelism, weight stashing, operator assignment to pipeline stages, and double-buffered weight updates. The talk is aimed at those interested in optimizing DNN training and overcoming memory constraints in large-scale machine learning models.
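To make the weight stashing idea concrete, here is a minimal sketch in plain Python. It is not PipeDream's code: the `Stage` class, its scalar weight, and the learning rate are hypothetical, chosen only to show a stage remembering the weight version each microbatch saw in its forward pass and reusing that same version in the backward pass, even if the weight has been updated in between.

```python
class Stage:
    """Toy pipeline stage computing y = w * x with a single scalar weight (illustrative only)."""

    def __init__(self, weight, lr):
        self.weight = weight
        self.lr = lr
        self.stash = {}  # microbatch id -> weight version used in its forward pass

    def forward(self, mb_id, x):
        # Stash the weight version this microbatch sees.
        self.stash[mb_id] = self.weight
        return self.weight * x

    def backward(self, mb_id, x, grad_out):
        # Use the stashed version, not the latest weight, so forward and
        # backward for the same microbatch agree on the weights.
        w = self.stash.pop(mb_id)
        grad_w = grad_out * x
        grad_x = grad_out * w
        self.weight -= self.lr * grad_w  # the update goes to the latest weight
        return grad_x


stage = Stage(weight=1.0, lr=0.5)
stage.forward(0, 2.0)                       # microbatch 0 sees w = 1.0
stage.forward(1, 3.0)                       # microbatch 1 also sees w = 1.0
stage.backward(0, 2.0, 1.0)                 # the update changes the live weight
weight_after_first_update = stage.weight
grad_x_mb1 = stage.backward(1, 3.0, 1.0)    # still uses the stashed w = 1.0
```

Without the stash, the backward pass for microbatch 1 would compute its input gradient with the already-updated weight, which is the version mismatch the talk describes.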
Syllabus
Intro
Model Parallelism: An alternative to data parallelism
Pipelining in DNN training != Traditional pipelining
Challenge 1: Pipelining leads to weight version mismatches
Weight stashing: A solution to version mismatches
Challenge 2: How do we assign operators to pipeline stages?
PipeDream vs. Data Parallelism on Time-to-Accuracy
But modern Deep Neural Networks are becoming extremely large!
Double-buffered weight updates: weight semantics
2BW has weight update semantics similar to data parallelism
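The last two syllabus items can be illustrated with a small sketch. It rests on an assumption not spelled out in this listing: that double-buffered weight updates (2BW) keep two weight versions and apply each update using a gradient computed on the previous version, so the rule resembles data-parallel SGD with a one-step delay. The function name `train_2bw`, the toy objective f(w) = w²/2, and the learning rate are all hypothetical.

```python
def train_2bw(grad_fn, w0, lr, steps):
    """Toy update rule w(t+1) = w(t) - lr * grad(w(t-1)), holding two weight versions."""
    prev, curr = w0, w0  # the two buffered weight versions
    history = [w0]
    for _ in range(steps):
        g = grad_fn(prev)                  # gradient computed on the older version
        prev, curr = curr, curr - lr * g   # newest version is built from the current one
        history.append(curr)
    return history


# Assumed toy objective f(w) = w**2 / 2, whose gradient is simply w.
history = train_2bw(lambda w: w, w0=1.0, lr=0.5, steps=5)
```

Only two versions are ever held (`prev` and `curr`), which is the memory-saving point of double buffering compared with stashing one version per in-flight microbatch.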
Taught by
Databricks
Related Courses
Building Language Models on AWS (Amazon Web Services via AWS Skill Builder)
Building Language Models on AWS (Japanese) (Amazon Web Services via AWS Skill Builder)
Building Language Models on AWS (Japanese) 日本語字幕版 (Amazon Web Services via AWS Skill Builder)
Building Language Models on AWS (Japanese) (Sub) 日本語字幕版 (Amazon Web Services via AWS Skill Builder)
Intel® Solutions Pro – AI in the Cloud (Intel via Coursera)