Observability for Data Pipelines with OpenLineage
Offered By: Databricks via YouTube
Course Description
Overview
Explore why observability matters in data pipelines in this 24-minute video presentation from Databricks. Dive into OpenLineage, an API that standardizes metadata collection across ecosystems, and discover how it supports data operations, governance, and security. Learn about Marquez, an open-source project that implements the OpenLineage API to provide visibility into dependencies across organizations and technologies. Gain insights into how collecting lineage metadata while a pipeline runs can improve the auditability, reliability, and timeliness of data operations in fast-paced environments. Understand the role of metadata in creating reliable data products and how OpenLineage reduces the complexity of collecting lineage information. Examine the Spark integration and its purpose in improving data pipeline observability.
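To make the idea of collecting lineage metadata during pipeline execution more concrete, here is a minimal sketch of what a lineage event might look like. It is not taken from the presentation. The top-level field names (eventType, eventTime, run, job, inputs, outputs, producer, schemaURL) follow the OpenLineage run event format as commonly documented; the helper functions, the namespaces, the job and dataset names, the producer URL, and the spec version in the schema URL are all assumptions chosen for illustration.

```python
import json
import uuid
from datetime import datetime, timezone

# Assumed values for illustration only; a real producer would use its own
# identifier and the spec version it actually targets.
PRODUCER = "https://example.com/hypothetical-pipeline"
SCHEMA_URL = "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent"


def dataset(namespace, name):
    """Describe a dataset by the namespace it lives in and its name."""
    return {"namespace": namespace, "name": name}


def build_run_event(event_type, run_id, job_name, inputs=(), outputs=(),
                    job_namespace="example-namespace"):
    """Build a run event as a plain dict (hypothetical helper, not a library call)."""
    return {
        "eventType": event_type,  # e.g. "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},  # the same runId ties START and COMPLETE together
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": list(inputs),
        "outputs": list(outputs),
        "producer": PRODUCER,
        "schemaURL": SCHEMA_URL,
    }


if __name__ == "__main__":
    run_id = str(uuid.uuid4())
    source = dataset("example-warehouse", "raw.orders")
    target = dataset("example-warehouse", "analytics.daily_orders")

    # One event when the job starts, one when it finishes.
    start = build_run_event("START", run_id, "daily_orders_rollup", inputs=[source])
    complete = build_run_event("COMPLETE", run_id, "daily_orders_rollup",
                               inputs=[source], outputs=[target])

    print(json.dumps(start, indent=2))
    print(json.dumps(complete, indent=2))
```

In a real deployment, events like these would be sent to a backend that implements the OpenLineage API, such as Marquez, which can then relate jobs and datasets across tools. An integration such as the Spark one is meant to emit them automatically rather than by hand.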
Syllabus
Intro
Metadata
OpenLineage
Purpose
Spark Integration
Taught by
Databricks
Related Courses
Metadata: Organizing and Discovering Information - The University of North Carolina at Chapel Hill via Coursera
Gérer les documents numériques : maîtriser les risques - CNAM via France Université Numérique
Research Data Management and Sharing - The University of North Carolina at Chapel Hill via Coursera
SharePoint Enterprise Content Management - Microsoft via edX
Configuration Management on Google Cloud Platform - Google via Coursera