
GraphX - Graph Processing in a Distributed Dataflow Framework

Offered By: USENIX via YouTube

Tags

OSDI (Operating Systems Design and Implementation) Courses, Scala Courses, Apache Spark Courses

Course Description

Overview

Explore GraphX, a graph processing framework embedded within Apache Spark, in this conference talk from OSDI '14. The talk makes the case for using general-purpose distributed dataflow systems for graph processing, challenging the notion that specialized graph systems are necessary. Learn how GraphX implements graph-specific optimizations using basic dataflow operators and achieves performance parity with specialized systems. Discover how this approach enables low-cost fault tolerance and supports a wider range of computations. Examine evaluations on real-world workloads, benchmarks for PageRank and Connected Components, and a demonstration of a small pipeline in GraphX. The talk also covers modern analytics, the graph-parallel pattern, graph representation, and join site selection using routing tables.
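
To make the idea of expressing graph computation with basic dataflow operators more concrete, here is a minimal sketch, not GraphX code and not its API. It assumes plain Python in place of Scala on Spark, and it writes PageRank as a join (edges with source ranks) followed by a group-by (summing messages per destination). The function name `pagerank_dataflow`, its parameters, and the unnormalized rank formula `(1 - damping) + damping * sum` are illustrative assumptions.

```python
from collections import defaultdict


def pagerank_dataflow(edges, num_iters=20, damping=0.85):
    """PageRank written as repeated join + group-by steps over two tables.

    `edges` is a list of (src, dst) pairs. This is a hypothetical helper for
    illustration only; it is not part of GraphX or Spark.
    """
    # Vertex table: one row per vertex id, holding its current rank.
    vertices = {v for edge in edges for v in edge}
    ranks = {v: 1.0 for v in vertices}

    # Out-degree per source vertex: a group-by over the edge table.
    out_degree = defaultdict(int)
    for src, _dst in edges:
        out_degree[src] += 1

    for _ in range(num_iters):
        # Join: pair each edge with its source vertex's rank to form a message.
        messages = [(dst, ranks[src] / out_degree[src]) for src, dst in edges]

        # Group-by: sum the messages addressed to each destination vertex.
        incoming = defaultdict(float)
        for dst, contribution in messages:
            incoming[dst] += contribution

        # Apply: recompute every vertex's rank from its aggregated messages.
        # Vertices with no incoming edges get a sum of 0.0.
        ranks = {v: (1.0 - damping) + damping * incoming[v] for v in vertices}

    return ranks


if __name__ == "__main__":
    # Two vertices pointing at a third; the third accumulates the most rank.
    print(pagerank_dataflow([(1, 3), (2, 3)]))
```

In a distributed setting the join and the group-by would each run as partitioned dataflow operations; the sketch keeps everything in local dictionaries so that only the shape of the computation is visible.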

Syllabus

Intro
Modern Analytics
Separate Systems
Key Question
Graph-Parallel Pattern
Graph System Optimizations
Representation
Graph Operators (Scala)
Join Site Selection using Routing Tables
Additional Optimizations
PageRank Benchmark
Connected Components Benchmark
A Small Pipeline in GraphX


Taught by

USENIX

Related Courses

Theseus - An Experiment in Operating System Structure and State Management
USENIX via YouTube
RedLeaf - Isolation and Communication in a Safe Operating System
USENIX via YouTube
Microsecond Consensus for Microsecond Applications
USENIX via YouTube
KungFu - Making Training in Distributed Machine Learning Adaptive
USENIX via YouTube
Caladan - Mitigating Interference at Microsecond Timescales
USENIX via YouTube