YoVDO

Lessons Learned from the Migration to Apache Airflow

Offered By: Linux Foundation via YouTube

Tags

Apache Airflow Courses
Python Courses
Docker Courses
Kubernetes Courses
ETL Pipelines Courses

Course Description

Overview

Discover key insights from migrating machine learning and big data processing pipelines to Apache Airflow in this 38-minute conference talk. Explore how Skimlinks uses Airflow to power its big data infrastructure, analyzing hundreds of terabytes of data. Learn about building ETL pipelines and managing machine learning Spark pipeline workflows with Airflow. Gain an understanding of basic Airflow concepts and see real-life examples of defining workflows in Python code. Delve into advanced topics such as custom task operators, sensors, and plugins. Examine best practices, the pros and cons of the tool, and its implementation in Docker and Kubernetes environments. Understand the concept of Directed Acyclic Graphs (DAGs) and their importance in creating idempotent workflows.
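
As a rough illustration of the two ideas the overview closes on, the sketch below models a workflow as a DAG and shows what an idempotent task looks like. It uses only the Python standard library and is not Airflow's API or code from the talk; the task names, the `run_order` and `load_partition` helpers, and the run date are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "train_model": {"transform"},
    "load": {"transform"},
}


def run_order(deps):
    """Return one valid execution order for the graph.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    which is why a workflow graph has to be acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


def load_partition(store, run_date, rows):
    """Idempotent load: overwrite the partition for run_date instead of appending."""
    store[run_date] = list(rows)
    return store


order = run_order(dependencies)

store = {}
load_partition(store, "2024-01-01", [1, 2, 3])
load_partition(store, "2024-01-01", [1, 2, 3])  # a re-run leaves the same state
```

Keying the output by run date and overwriting it means a retried or backfilled run produces the same result as a single run, which is the property an idempotent workflow depends on.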

Syllabus

Intro
Lessons learned from the migration to Apache Airflow
Agenda
Skimlinks: What we do
Why Airflow?
Data Architecture Overview
Airflow and Spark
DAG: Directed Acyclic Graph
Operator
Advanced Features
Sample code
Idempotent DAGs
Best practices: Docker and Kubernetes environments
Airflow: The Good, the Bad and the Ugly


Taught by

Linux Foundation

Related Courses

Introduction to Airflow in Python
DataCamp
Building Data Engineering Pipelines in Python
DataCamp
The Complete Hands-On Introduction to Apache Airflow
Udemy
Apache Airflow: The Hands-On Guide
Udemy
ETL and Data Pipelines with Shell, Airflow and Kafka
IBM via Coursera