GraphX - Graph Processing in a Distributed Dataflow Framework

Offered By: USENIX via YouTube

Tags

OSDI (Operating Systems Design and Implementation) Courses
Scala Courses
Apache Spark Courses

Course Description

Overview

Explore GraphX, a graph processing framework embedded within Apache Spark, in this conference talk from OSDI '14. The talk makes the case for graph processing on a general-purpose distributed dataflow system and challenges the notion that specialized graph systems are necessary. Learn how GraphX implements graph-specific optimizations using basic dataflow operators and achieves performance parity with specialized systems. Discover how this approach enables low-cost fault tolerance and supports a wider range of computations. Examine evaluations on real-world workloads, benchmarks for PageRank and Connected Components, and a demonstration of a small pipeline in GraphX. Topics also include modern analytics, the graph-parallel pattern, graph representation, and join site selection using routing tables.
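To make the idea of building graph computation from basic dataflow operators concrete, here is a minimal sketch in plain Python rather than Spark or the GraphX API. It treats the graph as a vertex table and an edge table and runs PageRank as repeated join, map, and group-by steps. The function name `pagerank`, the damping factor of 0.85, and the unnormalized rank formula are assumptions chosen for illustration and are not taken from the talk.

```python
from collections import defaultdict


def pagerank(edges, num_iters=20, damping=0.85):
    """Illustrative PageRank as join -> map -> group-by over two tables.

    `edges` is a list of (src, dst) pairs. The name, the damping value,
    and the unnormalized formula are assumptions for this sketch.
    """
    # Vertex table: vertex id -> rank. Edge table: the (src, dst) pairs.
    vertices = {v for edge in edges for v in edge}
    out_degree = defaultdict(int)
    for src, _ in edges:
        out_degree[src] += 1
    ranks = {v: 1.0 for v in vertices}

    for _ in range(num_iters):
        # Join: look up the source vertex's rank for each edge.
        # Map: turn each joined row into a (dst, contribution) message.
        messages = [(dst, ranks[src] / out_degree[src]) for src, dst in edges]

        # Group-by: sum the messages arriving at each destination vertex.
        incoming = defaultdict(float)
        for dst, contribution in messages:
            incoming[dst] += contribution

        # Update the vertex table from the aggregated messages.
        ranks = {v: (1 - damping) + damping * incoming[v] for v in vertices}
    return ranks


if __name__ == "__main__":
    # Tiny example graph: a and b both link to c, and c links back to a.
    for vertex, rank in sorted(pagerank([("a", "c"), ("b", "c"), ("c", "a")]).items()):
        print(vertex, round(rank, 3))
```

Each loop iteration touches only the two tables through a lookup, a transformation, and an aggregation, which is the shape a dataflow engine can distribute. A real system would partition both tables across machines, which this single-process sketch does not attempt.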

Syllabus

Intro
Modern Analytics
Separate Systems
Key Question
Graph-Parallel Pattern
Graph System Optimizations
Representation
Graph Operators (Scala)
Join Site Selection using Routing Tables
Additional Optimizations
PageRank Benchmark
Connected Components Benchmark
A Small Pipeline in GraphX


Taught by

USENIX

Related Courses

Functional Programming Principles in Scala
École Polytechnique Fédérale de Lausanne via Coursera
Functional Program Design in Scala
École Polytechnique Fédérale de Lausanne via Coursera
Parallel programming
École Polytechnique Fédérale de Lausanne via Coursera
Big Data Analysis with Scala and Spark
École Polytechnique Fédérale de Lausanne via Coursera
Functional Programming in Scala Capstone
École Polytechnique Fédérale de Lausanne via Coursera