
OpenLineage: An Open Standard for Data Lineage

Offered By: The ASF via YouTube

Tags

Data Lineage Courses, Apache Spark Courses, Apache Airflow Courses, Data Transformation Courses, Metadata Courses, Data Pipelines Courses

Course Description

Overview

Explore data lineage in this 40-minute conference talk from ApacheCon 2022. The talk covers the challenges of managing complex data pipelines and shows how OpenLineage, an open framework for collecting lineage metadata, can address them. Learn why data lineage matters in modern data stacks, understand the basics of OpenLineage's data model and metadata collection process, and get an introduction to Marquez, an OpenLineage metadata server. Gain insights on tracking job failures, ensuring data freshness and quality, predicting the impact of changes, and visualizing your entire pipeline through lineage graphs. The talk is aimed at data professionals who want to better understand data relationships and improve pipeline management.
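
To make the data model and the "impact of changes" idea more concrete, here is a minimal sketch in Python using only the standard library. It is not taken from the talk. It assumes a simplified event shape with a run, a job, and lists of input and output datasets; the real OpenLineage specification defines further fields that are omitted here. The helper names (make_run_event, build_lineage, impacted_by), the dataset names, and the producer URL are all hypothetical.

```python
import json
import uuid
from collections import defaultdict, deque
from datetime import datetime, timezone


def make_run_event(event_type, job_name, inputs, outputs, namespace="example"):
    """Build a dict shaped like a simplified OpenLineage run event."""
    return {
        "eventType": event_type,  # e.g. "START", "COMPLETE", "FAIL"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [{"namespace": namespace, "name": n} for n in inputs],
        "outputs": [{"namespace": namespace, "name": n} for n in outputs],
        # Hypothetical producer URI, for illustration only.
        "producer": "https://example.com/hypothetical-producer",
    }


def build_lineage(events):
    """Map each dataset name to the datasets derived directly from it."""
    downstream = defaultdict(set)
    for event in events:
        for src in event["inputs"]:
            for dst in event["outputs"]:
                downstream[src["name"]].add(dst["name"])
    return downstream


def impacted_by(dataset, downstream):
    """Breadth-first walk over the graph: all datasets reachable from `dataset`."""
    seen, queue = set(), deque([dataset])
    while queue:
        current = queue.popleft()
        for nxt in downstream.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Two hypothetical jobs: one cleans raw orders, one computes revenue.
events = [
    make_run_event("COMPLETE", "clean_orders", ["raw_orders"], ["orders"]),
    make_run_event("COMPLETE", "daily_revenue", ["orders", "prices"], ["revenue"]),
]
lineage = build_lineage(events)

print(json.dumps(events[0], indent=2))
print(sorted(impacted_by("raw_orders", lineage)))  # ['orders', 'revenue']
```

In this sketch, a change to raw_orders is reported as affecting both orders and revenue, which is the kind of question a lineage graph is meant to answer. A metadata server such as Marquez would collect and store events like these rather than holding them in an in-memory list.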

Syllabus

OpenLineage: An Open Standard for Data Lineage (Ross Turk)


Taught by

The ASF

Related Courses

CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Big Data Analytics
University of Adelaide via edX
Big Data Essentials: HDFS, MapReduce and Spark RDD
Yandex via Coursera
Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames
Yandex via Coursera
Introduction to Apache Spark and AWS
University of London International Programmes via Coursera