Running Remote Shuffle Service to Solve Apache Spark's Dynamic Resource Allocation Challenge on Kubernetes

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

Apache Spark Courses, Machine Learning Courses, Cloud Computing Courses, Kubernetes Courses, Data Storage Courses, Scalability Courses, ETL Courses

Course Description

Overview

Explore a novel solution to Apache Spark's dynamic resource allocation (DRA) challenge on Kubernetes using an open-source remote shuffle service (RSS). Learn how Spark's DRA works on Kubernetes, how the RSS alleviates resource contention, and how it provides a more reliable and scalable solution for big data processing. Understand how offloading shuffle data to remote storage outside Spark's executor pods decouples storage from compute and supports dynamic scaling. The talk covers the implementation details, benefits, and potential impact of this approach on large-scale data processing for machine learning and ETL use cases.

Syllabus

Running Remote Shuffle Service to Solve a Well-Known Challenge for Apa... Melody Yang & Keyong Zhou


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Big Data Analytics
University of Adelaide via edX
Big Data Essentials: HDFS, MapReduce and Spark RDD
Yandex via Coursera
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames
Yandex via Coursera
Introduction to Apache Spark and AWS
University of London International Programmes via Coursera