YoVDO

Generalized Sub-Query Fusion for Eliminating Redundant I/O from Big-Data Queries

Offered By: USENIX via YouTube

Tags

OSDI (Operating Systems Design and Implementation) Courses

Course Description

Overview

Explore a 19-minute conference talk from USENIX OSDI '20 that introduces RESIN, an optimizer extension designed to eliminate redundant I/O in big-data SQL queries. Learn about Generalized Sub-Query Fusion, a technique that identifies sub-queries computing on overlapping data and fuses them into the same map-reduce stages, so the shared input is read and shuffled once rather than once per sub-query. Discover how this fusion reduces disk and network I/O and, in some cases, removes expensive binary operations such as joins and unions from the query plan altogether. Gain insights into the implementation of RESIN in Spark SQL and its results on the TPC-DS benchmark suite: speed-ups of 1.1x to 6x on 40% of the queries and a 12% reduction in overall execution time.
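
To make the idea concrete, here is a minimal Python sketch of what fusing two sub-queries over the same input can look like. It is only an illustration under stated assumptions, not RESIN's implementation or Spark SQL code: the `SALES` table, the `scan` counter, and the `unfused`/`fused` functions are all hypothetical names invented for this example. The unfused version runs two aggregations, each with its own scan, and joins their results; the fused version computes both aggregates in a single pass, which also makes the join unnecessary.

```python
# Hypothetical in-memory table; each call to scan() stands in for one full read from disk.
SALES = [
    {"store": "A", "year": 2019, "amount": 10},
    {"store": "A", "year": 2020, "amount": 15},
    {"store": "B", "year": 2019, "amount": 7},
    {"store": "B", "year": 2020, "amount": 3},
]

scans = {"n": 0}  # counts how many times the table is read


def scan(table):
    """Yield every row of the table and record that one full scan happened."""
    scans["n"] += 1
    yield from table


def total_by_store(year):
    """One sub-query: SUM(amount) GROUP BY store WHERE year = <year>."""
    totals = {}
    for row in scan(SALES):
        if row["year"] == year:
            totals[row["store"]] = totals.get(row["store"], 0) + row["amount"]
    return totals


def unfused():
    """Two independent sub-queries (two scans), then an inner join on store."""
    t19 = total_by_store(2019)
    t20 = total_by_store(2020)
    return {s: (t19[s], t20[s]) for s in t19.keys() & t20.keys()}


def fused():
    """Both aggregates computed in one scan; the join is no longer needed."""
    acc = {}
    for row in scan(SALES):
        if row["year"] not in (2019, 2020):
            continue
        per_year = acc.setdefault(row["store"], {})
        per_year[row["year"]] = per_year.get(row["year"], 0) + row["amount"]
    # Keep only stores present in both years, matching inner-join semantics.
    return {
        s: (v[2019], v[2020])
        for s, v in acc.items()
        if 2019 in v and 2020 in v
    }


if __name__ == "__main__":
    scans["n"] = 0
    print(unfused(), "scans:", scans["n"])
    scans["n"] = 0
    print(fused(), "scans:", scans["n"])
```

Both functions return the same per-store totals, but the fused version reads the table once instead of twice. In a distributed engine the saved work would be a table scan and a shuffle rather than a Python loop, so the sketch only conveys the shape of the rewrite, not its cost model.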

Syllabus

OSDI '20 - Generalized Sub-Query Fusion for Eliminating Redundant I/O from Big-Data Queries


Taught by

USENIX

Related Courses

GraphX - Graph Processing in a Distributed Dataflow Framework
USENIX via YouTube
Theseus - An Experiment in Operating System Structure and State Management
USENIX via YouTube
RedLeaf - Isolation and Communication in a Safe Operating System
USENIX via YouTube
Microsecond Consensus for Microsecond Applications
USENIX via YouTube
KungFu - Making Training in Distributed Machine Learning Adaptive
USENIX via YouTube