Large-Scale Data Shuffle in Ray with Exoshuffle

Offered By: Anyscale via YouTube

Tags

Machine Learning Courses, Data Sorting Courses, Distributed Computing Courses

Course Description

Overview

This 26-minute conference talk from Anyscale presents Exoshuffle, a system for large-scale data shuffle. Shuffle is a crucial primitive in data processing applications, and Exoshuffle challenges conventional wisdom by implementing high-performance, reliable shuffle on Ray, a general-purpose distributed computing system. The talk reports that Exoshuffle outperforms Spark and achieves 82% of theoretical performance on a 100 TB sort using 100 nodes. It also covers the integration of Exoshuffle with the Datasets library in Ray 2.0, which gives machine learning users better large-scale shuffle capabilities. The talk is aimed at data scientists, engineers, and anyone interested in large-scale data processing techniques.
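
For readers new to the term, the sketch below illustrates what a sort shuffle does in general. It is a single-process, pure-Python illustration and not Exoshuffle or Ray code; the function names (`map_partition`, `reduce_merge`, `shuffle_sort`) and the hand-picked range boundaries are assumptions made for this example. In a distributed system, each map and reduce call would typically run as a separate task on a different node.

```python
import heapq
from bisect import bisect_right


def map_partition(block, boundaries):
    """Map stage: sort one input block and split it into one sorted run per reducer."""
    runs = [[] for _ in range(len(boundaries) + 1)]
    for value in sorted(block):
        # bisect_right picks the range partition this value belongs to.
        runs[bisect_right(boundaries, value)].append(value)
    return runs


def reduce_merge(runs):
    """Reduce stage: merge the already-sorted runs destined for one reducer."""
    return list(heapq.merge(*runs))


def shuffle_sort(blocks, boundaries):
    """Run the map stage, then the all-to-all exchange, then the reduce stage."""
    map_outputs = [map_partition(block, boundaries) for block in blocks]
    num_reducers = len(boundaries) + 1
    # Reducer r collects run r from every map output (the "shuffle" itself).
    return [
        reduce_merge([output[r] for output in map_outputs])
        for r in range(num_reducers)
    ]


if __name__ == "__main__":
    blocks = [[5, 1, 9], [3, 8, 2], [7, 4, 6]]
    print(shuffle_sort(blocks, boundaries=[3, 6]))
```

Concatenating the reducer outputs in order gives the globally sorted data. The all-to-all step, where every reducer reads a piece of every map output, is the part that becomes expensive at large scale.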

Syllabus

Large-scale data shuffle in Ray with Exoshuffle


Taught by

Anyscale

Related Courses

Introduction to Databases
Meta via Coursera
Analyzing Big Data with SQL
Cloudera via Coursera
Query Client Data with LibreOffice Base
Coursera Project Network via Coursera
Unix Command Course for Beginners
Udemy
Excel For Beginners! Top 30 Hottest Tutorials,Tips & Tricks!
Udemy