Dorylus - Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads

Offered By: USENIX via YouTube

Course Description

Overview

Explore a distributed system for training Graph Neural Networks (GNNs) in this 15-minute conference talk from OSDI '21. Learn how Dorylus uses serverless computing to address two obstacles to training on billion-edge graphs: the high cost of GPU servers and their limited memory. Discover how computation separation enables a deep, bounded-asynchronous pipeline that hides network latency. Understand why CPU servers offer the best performance-per-dollar for large graphs and how adding Lambda threads can significantly boost efficiency. Gain insights into Dorylus' architecture, how it scales GNN training, and how its performance compares with existing systems. Examine the challenges of using serverless computing, namely limited resources and limited network bandwidth, and the solutions Dorylus implements to address them.

Syllabus

Intro
Machine Learning
Graph Neural Networks
Stages of a Graph Neural Network
GPUs Are Not a Good Fit for Graph Operations
Combining CPUs and GPUs is Cost-Ineffective
Using Many CPU Servers Can Still Be Expensive
Key Insight: Serverless Fits Our Goals
Serverless Achieves Low-Cost, Scalable Efficiency
Challenges with Using Serverless
Challenge 1: Limited Resources
Solution: Computation Separation
Dorylus Architecture
Flow of Decomposed Tasks
Challenge 2: Limited Network
Solution: Create Pipeline of Decomposed Tasks
Data Chunks Moving Through Layer of Pipeline
Synchronize after Scatter Hinders Pipeline
Two Sync Points Makes Asynchrony Difficult
Minimizing Effects of Asynchrony on Convergence
Serverless Optimizations
Data Graphs
We Evaluated Several Aspects of Dorylus
High Value on Large-Sparse Graphs
Dorylus Outperforms Existing Systems
Dorylus Scales Full Graph Training
Conclusion: Dorylus Provides Value

Taught by

USENIX
