
Building a Robust Data Pipeline with the DAG Stack - dbt, Airflow, and Great Expectations

Offered By: Open Data Science via YouTube

Tags

Big Data Courses
Apache Airflow Courses
dbt (Data build tool) Courses

Course Description

Overview

Explore the "DAG Stack", a data pipeline stack that combines dbt, Airflow, and Great Expectations. Learn how to build a transformation layer with dbt, validate source data and add complex tests with Great Expectations, and orchestrate the entire pipeline with Apache Airflow. See practical examples of how these tools complement each other to ensure data quality, prevent "garbage in, garbage out" scenarios, and generate comprehensive data documentation. The talk also covers automatic profiling, data testing, and validation techniques, with sample code demonstrations and technical pointers for implementing the stack in your own data engineering projects.
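
As a rough illustration of the ordering described above (validate the source data first, then transform, then test, then document), here is a minimal plain-Python sketch. It is not taken from the talk and does not use the Airflow, dbt, or Great Expectations APIs: the task names, the `validate_rows` check, and the `run_pipeline` helper are all hypothetical stand-ins, and the standard-library `graphlib` module (Python 3.9+) plays the role of the orchestrator.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each maps to the set of tasks it depends on.
PIPELINE = {
    "validate_source_data": set(),        # stand-in for a source data validation step
    "dbt_run": {"validate_source_data"},  # stand-in for building the transformation layer
    "dbt_test": {"dbt_run"},              # stand-in for testing the transformed models
    "build_docs": {"dbt_test"},           # stand-in for generating data documentation
}


def validate_rows(rows):
    """Stand-in for an expectation suite: ids must be non-null and unique."""
    ids = [row.get("id") for row in rows]
    return None not in ids and len(ids) == len(set(ids))


def run_pipeline(rows):
    """Walk tasks in dependency order; stop if source validation fails."""
    completed = []
    for task in TopologicalSorter(PIPELINE).static_order():
        if task == "validate_source_data" and not validate_rows(rows):
            break  # garbage in: do not run any downstream task
        completed.append(task)
    return completed
```

With clean input such as `run_pipeline([{"id": 1}, {"id": 2}])`, every task is reached in dependency order. With a duplicate or missing id, the walk stops before any transformation runs, which is the "garbage in, garbage out" guard the description refers to.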

Syllabus

Intro
Who am I
Overview
dbt
Sample code
dbt run
What dbt doesn't have
Apache Airflow
dbt in Airflow
Airflow DAG file
What is Great Expectations
What is Great Expectations Statement
Typical Great Expectations Workflow
Automatic Profiling
Databox
Great Expectations Operator
Recap
Test your data
Where do we start
Technical pointers
Data testing
Data validation
Putting it all together
Airflow DAG
Source data load validation
Running tests during development
Test integrity
Wrap up
Q&A


Taught by

Open Data Science
