Testing Data Pipelines - Techniques for Ensuring Data Flow Integrity
Offered By: PyCon US via YouTube
Course Description
Overview
Explore effective strategies for testing data pipelines in this informative PyCon US talk. Learn how to ensure smooth data flow and quickly identify and resolve issues in your pipelines. Discover toolkit-agnostic techniques applicable beyond Airflow, including unit testing for individual components, integration testing for the entire pipeline, and end-to-end testing for accurate data output. Gain insights into unique methods such as data snapshot testing and online/offline data quality checks. Access the presentation slides for a comprehensive overview of the testing approaches discussed in this 25-minute talk, aimed at enhancing the reliability and efficiency of your data pipeline processes.
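As a rough illustration of three of the techniques named above (unit testing a single component, data snapshot testing, and a data quality check), here is a minimal toolkit-agnostic sketch in plain Python. It is not taken from the talk or its slides; the function names normalize, snapshot and quality_issues and the sample rows are hypothetical and chosen only for the example.

```python
import json


def normalize(rows):
    """Hypothetical transform step: trim/lowercase names, drop rows without an id."""
    return [
        {"id": r["id"], "name": r["name"].strip().lower()}
        for r in rows
        if r.get("id") is not None
    ]


def snapshot(rows):
    """Serialize output deterministically so it can be compared to a stored snapshot."""
    return json.dumps(sorted(rows, key=lambda r: r["id"]), sort_keys=True)


def quality_issues(rows):
    """Simple data quality check: report duplicate ids and empty names."""
    issues = []
    ids = [r["id"] for r in rows]
    if len(ids) != len(set(ids)):
        issues.append("duplicate ids")
    if any(not r["name"] for r in rows):
        issues.append("empty name")
    return issues


raw = [
    {"id": 2, "name": " Bob "},
    {"id": None, "name": "x"},
    {"id": 1, "name": "ALICE"},
]
out = normalize(raw)

# Unit test: one component, checked in isolation.
assert out == [{"id": 2, "name": "bob"}, {"id": 1, "name": "alice"}]

# Snapshot test: compare against a previously approved output (inlined here;
# in practice it would usually be read from a file kept in version control).
approved = '[{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]'
assert snapshot(out) == approved

# Data quality check: no issues expected on this sample.
assert quality_issues(out) == []
```

The same checks could be wrapped in a test runner such as pytest or run against pipeline output on a schedule; which of these counts as an "online" or "offline" check in the talk's terms is not specified in this description.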
Syllabus
Talks - Amitosh Swain: Testing Data Pipelines
Taught by
PyCon US
Related Courses
In-Memory Database Management (openHPI)
CS115x: Advanced Apache Spark for Data Science and Data Engineering (University of California, Berkeley via edX)
Processing Big Data with Azure Data Lake Analytics (Microsoft via edX)
Google Cloud Big Data and Machine Learning Fundamentals en Español (Google Cloud via Coursera)
Google Cloud Big Data and Machine Learning Fundamentals 日本語版 (Google Cloud via Coursera)