Spark SQL Shuffle Join Optimization at eBay
Offered By: The ASF via YouTube
Course Description
Overview
Explore a 15-minute conference talk on Spark SQL Shuffle Join improvements implemented at eBay. Discover how Wang Yuming, an eBay software engineer and Apache Spark PMC Member, presents a series of optimizations for one of the most expensive and widely used operations in data warehouses. Learn about three key enhancements: unwrapping join conditions to utilize bucket joins, enhancing shuffle exchange reuse to minimize table scans, and pushing down partial aggregation through joins. Gain insights into SQL query performance optimization techniques from an expert in Apache Spark development and a 2022 SIGMOD Systems Award winner.
Syllabus
Spark Sql Shuffle Join Improvement At Ebay
Taught by
The ASF
Related Courses
CS115x: Advanced Apache Spark for Data Science and Data EngineeringUniversity of California, Berkeley via edX Big Data Analytics
University of Adelaide via edX Big Data Essentials: HDFS, MapReduce and Spark RDD
Yandex via Coursera Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames
Yandex via Coursera Introduction to Apache Spark and AWS
University of London International Programmes via Coursera