Feeding Data to AWS Redshift with Airflow
Offered By: EuroPython Conference via YouTube
Course Description
Overview
Explore a comprehensive talk from EuroPython 2017 on leveraging Airflow for efficient data pipelines to AWS Redshift. Dive into the fundamentals of Airflow, including its scheduling capabilities and workflow management features. Learn about data pipeline-specific concepts such as backfills and retries, and discover practical examples of integration. Gain insights into structuring data in Redshift, performing basic pre-loading transformations, and managing schemas using SQLAlchemy and Alembic. Follow along as the speaker shares valuable lessons learned and addresses common Redshift challenges. Perfect for data engineers and analysts looking to optimize their ETL processes and harness the power of Airflow in conjunction with AWS Redshift.
Syllabus
Introduction
Federico's background
Product problem
Is it simple?
Data pipelines
Scale
Stages
Archive
Airflow
Python
Database
UI
Workflow
Operators
Airflow UI
Airflow scheduling
Tracking state
Downtime
Scrapers
Batch IDs
Timestamp
Formats
Redshift COPY command
JSON path flattening
Schema conversion
Migration framework
Redshift annoyances
Futureproof
Thank you
Taught by
EuroPython Conference
Related Courses
Create Your First Web App with Python and Flask (Coursera Project Network via Coursera)
Python and Flask Bootcamp: Create Websites using Flask! (Udemy)
Introduction to Databases in Python (DataCamp)
Advanced SQL for Application Development (LinkedIn Learning)
Crea tu primera aplicación web con Python y Flask (Coursera Project Network via Coursera)