YoVDO

Feeding Data to AWS Redshift with Airflow

Offered By: EuroPython Conference via YouTube

Tags

EuroPython Courses, Data Warehousing Courses, Apache Airflow Courses, Data Transformation Courses, Data Engineering Courses, Data Pipelines Courses, SQLAlchemy Courses

Course Description

Overview

A talk from EuroPython 2017 on using Airflow to build data pipelines that feed AWS Redshift. It covers the fundamentals of Airflow, including scheduling and workflow management, and pipeline-specific concepts such as backfills and retries, with practical integration examples. It also covers structuring data in Redshift, basic pre-loading transformations, and managing schemas with SQLAlchemy and Alembic. The speaker shares lessons learned and addresses common Redshift challenges. Suited to data engineers and analysts who want to improve their ETL processes with Airflow and AWS Redshift.

Syllabus

Introduction
Federico's background
Product problem
Is it simple?
Data pipelines
Scale
Stages
Archive
Airflow
Python
Database
UI
Workflow
Operators
Airflow UI
Airflow scheduling
Tracking state
Downtime
Scrapers
Batch IDs
Timestamp
Formats
Redshift COPY command
JSON path flattening
Schema conversion
Migration framework
Redshift annoyances
Future-proofing
Thank you


Taught by

EuroPython Conference

Related Courses

Introduction to Airflow in Python
DataCamp
Building Data Engineering Pipelines in Python
DataCamp
The Complete Hands-On Introduction to Apache Airflow
Udemy
Apache Airflow: The Hands-On Guide
Udemy
ETL and Data Pipelines with Shell, Airflow and Kafka
IBM via Coursera