Modern Data Orchestration: Best Practices and Real-World Use Cases
Offered By: The ASF via YouTube
Course Description
Overview
Explore advanced techniques and best practices for building better data pipelines in this practical talk. Dive into real-world use cases, examining patterns for data pipelines using Airflow with Spark, dbt, and Polars. Learn strategies to avoid dependency management problems in Airflow and to reuse DAG templates across your organization. Delve into fundamental concepts of data pipelines, including data lineage, observability, metadata, quality, and auditing, and discover how to integrate these elements effectively. Learn to write clean code for data pipelines using the Factory Design Pattern with spark-submit, Airflow, and KubernetesPodOperator. Gain insights into Airflow alternatives like Dagster and Mage for your data architecture. Led by Riccardo Amadio, a Senior Data Engineer at Agile Lab, this 26-minute presentation offers a no-nonsense approach to modern data orchestration.
Syllabus
Modern Data Orchestrators
Taught by
The ASF
Related Courses
Learn DBT from Scratch
Udemy
The Complete dbt (Data Build Tool) Bootcamp: Zero to Hero
Udemy
Analytics Engineering Bootcamp
Udemy
Building a Robust Data Pipeline with the DAG Stack - dbt, Airflow, and Great Expectations
Open Data Science via YouTube
Orchestration Made Easy with Databricks Workflows
Databricks via YouTube