Observability for Data Pipelines with OpenLineage

Offered By: Databricks via YouTube

Tags

Data Pipelines Courses, Data Governance Courses, Metadata Courses, Data Security Courses, Observability Courses

Course Description

Overview

Explore why observability matters in data pipelines in this 24-minute video presentation from Databricks. Dive into OpenLineage, an API that standardizes metadata collection across ecosystems, and discover how it supports data operations, governance, and security. Learn about Marquez, an open-source project that implements the OpenLineage API to provide visibility into dependencies across organizations and technologies. Gain insights into how collecting lineage metadata while a pipeline runs can improve the auditability, reliability, and timeliness of data operations in fast-paced environments. Understand the role of metadata in creating reliable data products and how OpenLineage reduces the complexity of collecting lineage information. Examine the Spark integration and its purpose in enhancing data pipeline observability.
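
To make the idea of collecting lineage metadata during pipeline execution more concrete, here is a minimal sketch of what a lineage event might look like. It is not taken from the video. The helper `build_run_event`, the job and dataset names, and the producer URL are all hypothetical, and the field names follow a simplified reading of the OpenLineage run event model; the published spec lists further required fields (for example `schemaURL`) that this sketch leaves out.

```python
import json
import uuid
from datetime import datetime, timezone


def build_run_event(event_type, run_id, job_namespace, job_name, inputs=(), outputs=()):
    """Build a dict shaped like a simplified OpenLineage run event.

    Hypothetical helper for illustration only; not part of any library.
    `inputs` and `outputs` are iterables of (namespace, name) pairs.
    """
    def dataset(namespace, name):
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,  # e.g. "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},  # the same runId ties START and COMPLETE together
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [dataset(ns, n) for ns, n in inputs],
        "outputs": [dataset(ns, n) for ns, n in outputs],
        # Placeholder URL; a real producer would identify the emitting integration.
        "producer": "https://example.com/hypothetical-producer",
    }


if __name__ == "__main__":
    run_id = str(uuid.uuid4())
    # Emit one event when the job starts and one when it finishes.
    for event_type in ("START", "COMPLETE"):
        event = build_run_event(
            event_type,
            run_id,
            "demo",
            "daily_orders",
            inputs=[("warehouse", "raw.orders")],
            outputs=[("warehouse", "clean.orders")],
        )
        print(json.dumps(event, indent=2))
```

In a real deployment, events like these would be sent to a lineage backend such as Marquez rather than printed; the sketch stops short of any network call so that it stays self-contained.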

Syllabus

Intro
Metadata
OpenLineage
Purpose
Spark Integration


Taught by

Databricks

Related Courses

Relational Database Support for Data Warehouses
University of Colorado System via Coursera
Compliance in Office 365: Data Governance
Microsoft via edX
Introduction to Data Analytics for Business
University of Colorado Boulder via Coursera
Microsoft Azure Security Services
Microsoft via edX