
Generalized Pipeline Parallelism for DNN Training - PipeDream System Overview

Offered By: Databricks via YouTube

Tags

Deep Neural Networks Courses
Distributed Training Courses

Course Description

Overview

This 21-minute conference talk from Databricks introduces generalized pipeline parallelism for DNN training. It presents PipeDream, a system that combines inter-batch pipelining with intra-batch parallelism to improve the throughput of parallel DNN training. The talk explains how PipeDream addresses challenges such as weight version mismatches and pipeline flushes through weight versioning and efficient scheduling, and how it automatically partitions a DNN's layers among workers to balance the workload and minimize communication. PipeDream is reported to train models up to 5.3x faster than traditional intra-batch parallelism techniques while maintaining high accuracy. Topics include model parallelism, weight stashing, operator assignment to pipeline stages, and double-buffered weight updates. The talk is aimed at viewers interested in optimizing DNN training and overcoming memory constraints in large-scale machine learning models.
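To make the weight stashing idea mentioned above more concrete, here is a minimal sketch in plain Python. It is not PipeDream's code: the `Stage` class, its one-parameter "layer" `y = w * x`, and the learning rate are hypothetical names and values chosen only for illustration. The sketch assumes that a pipeline stage runs forward passes for several minibatches before their backward passes arrive, so the stage saves the weight version each forward pass used and reuses it in the matching backward pass.

```python
class Stage:
    """Hypothetical one-parameter pipeline stage computing y = w * x."""

    def __init__(self, weight, lr=0.5):
        self.weight = weight  # latest weight version
        self.lr = lr
        self.stash = {}       # minibatch id -> (weight used in forward, input)

    def forward(self, mb_id, x):
        # Run with the latest weights and stash them for this minibatch.
        self.stash[mb_id] = (self.weight, x)
        return self.weight * x

    def backward(self, mb_id, grad_out):
        # Reuse the weight version the forward pass saw, not the latest one.
        w_used, x = self.stash.pop(mb_id)
        grad_w = grad_out * x       # d(w * x) / dw
        grad_in = grad_out * w_used  # d(w * x) / dx, via the stashed weight
        self.weight -= self.lr * grad_w  # update the latest version
        return grad_in


stage = Stage(weight=1.0)

# Two minibatches are in flight before any backward pass runs.
stage.forward(0, 2.0)
stage.forward(1, 3.0)

g0 = stage.backward(0, 1.0)  # updates stage.weight
g1 = stage.backward(1, 1.0)  # still uses the weight stashed for minibatch 1
```

In this sketch the backward pass for minibatch 1 runs after the weight has already been updated by minibatch 0, yet its input gradient is computed with the weight its own forward pass used. Without the stash, the forward and backward passes of the same minibatch would see different weights, which is the version mismatch the talk describes. The cost is one stored weight copy per in-flight minibatch.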

Syllabus

Intro
Model Parallelism: An alternative to data parallelism
Pipelining in DNN training != Traditional pipelining
Challenge 1: Pipelining leads to weight version mismatches
Weight stashing: A solution to version mismatches
Challenge 2: How do we assign operators to pipeline stages?
PipeDream vs. Data Parallelism on Time-to-Accuracy
But modern deep neural networks are becoming extremely large!
Double-buffered weight updates (2BW): weight semantics
2BW has weight update semantics similar to data parallelism


Taught by

Databricks
