Unity - Accelerating DNN Training Through Joint Optimization of Algebraic Transformations and Parallelization

Offered By: USENIX via YouTube

Tags

OSDI (Operating Systems Design and Implementation) Courses
Formal Verification Courses
Parallelization Courses

Course Description

Overview

Explore a conference talk from OSDI '22 that introduces Unity, a system for optimizing distributed Deep Neural Network (DNN) training. See how Unity jointly optimizes algebraic transformations and parallelization by representing both in a unified parallel computation graph (PCG). Learn how the system automatically generates and verifies optimizations, and how its hierarchical search algorithm keeps the optimization process scalable. Review Unity's performance improvements over existing DNN training frameworks, evaluated on seven real-world DNNs using up to 192 GPUs across 32 nodes. Unity is available as part of the open-source FlexFlow framework.

Syllabus

Introduction
Unity's Goal
Parallelization
Parallel Computation Graph
Data Parallelization
PCG Advantages
Techniques
Results
Conclusion


Taught by

USENIX

Related Courses

GraphX - Graph Processing in a Distributed Dataflow Framework
USENIX via YouTube
Theseus - An Experiment in Operating System Structure and State Management
USENIX via YouTube
RedLeaf - Isolation and Communication in a Safe Operating System
USENIX via YouTube
Microsecond Consensus for Microsecond Applications
USENIX via YouTube
KungFu - Making Training in Distributed Machine Learning Adaptive
USENIX via YouTube