YoVDO

Dorylus - Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads

Offered By: USENIX via YouTube

Tags

OSDI (Operating Systems Design and Implementation) Courses Deep Learning Courses Scalability Courses Distributed Computing Courses Serverless Computing Courses

Course Description

Overview

Explore a cutting-edge distributed system for training Graph Neural Networks (GNNs) in this 15-minute conference talk from OSDI '21. Learn about Dorylus, an innovative approach that leverages serverless computing to overcome the challenges of expensive GPU servers and limited memory when working with billion-edge graphs. Discover how computation separation enables a deep, bounded-asynchronous pipeline that effectively hides network latency. Understand why CPU servers offer the best performance-per-dollar for large graphs and how integrating Lambda threads can significantly boost efficiency. Gain insights into Dorylus' architecture, its ability to scale GNN training, and its impressive performance compared to existing systems. Delve into the challenges of using serverless computing and the solutions implemented to address limited resources and network constraints.

Syllabus

Intro
Machine Learning
Graph Neural Networks
Stages of a Graph Neural Network
GPUs Are Not a Good Fit for Graph Operations
Combining CPUs and GPUs is Cost-Ineffective
Using Many CPU Servers Can Still Be Expensive
Key Insight: Serverless Fits Our Goals
Serverless Achieves Low-Cost, Scalable Efficiency
Challenges with Using Serverless
Challenge 1: Limited Resources
Solution: Computation Separation
Dorylus Architecture
Flow of Decomposed Tasks
Challenge 2: Limited Network
Solution: Create Pipeline of Decomposed Tasks
Data Chunks Moving Through Layer of Pipeline
Synchronize after Scatter Hinders Pipeline
Two Sync Points Makes Asynchrony Difficult
Minimizing Effects of Asynchrony on Convergence
Serverless Optimizations
Data Graphs
We Evaluated Several Aspects of Dorylus
High Value on Large-Sparse Graphs
Dorylus Outperforms Existing Systems
Dorylus Scales Full Graph Training
Conclusion: Dorylus Provides Value


Taught by

USENIX

Related Courses

Introduction to Cloud Infrastructure Technologies
Linux Foundation via edX
Cloud Computing
Indian Institute of Technology, Kharagpur via Swayam
Elastic Cloud Infrastructure: Containers and Services en Español
Google Cloud via Coursera
Kyma – A Flexible Way to Connect and Extend Applications
SAP Learning
Modernize Infrastructure and Applications with Google Cloud
Google Cloud via Coursera