Compute Engine Testing with Synthetic Data Generation
Offered By: USENIX via YouTube
Course Description
Overview
Explore Meta's testing framework for compute engines such as Presto in this 16-minute conference talk from PEPR '24. The talk describes how privacy-safe, production-like synthetic data is used to detect regressions within the Meta Data Warehouse, and covers the challenges of operating the framework at scale and the solutions adopted. Key features of the synthetic data generation process include differential privacy, expanded column schema support, and improved scalability. Meta uses the framework to increase test coverage, shorten the Presto release cycle, and prevent production regressions. Presented by Jiangnan Cheng and Eric Liu from Meta, the talk is aimed at professionals interested in testing methodologies for large-scale data systems.
Syllabus
PEPR '24 - Compute Engine Testing with Synthetic Data Generation
Taught by
USENIX
Related Courses
Amazon EMR Getting Started (Indonesian) - Amazon Web Services via AWS Skill Builder
Hadoop Ecosystem Essentials - Packt via FutureLearn
Master SQL for Data Science - LinkedIn Learning
Presto Essentials: Data Science - LinkedIn Learning
Building an Open Data Lakehouse on AWS with Presto and Apache Hudi - Linux Foundation via YouTube