Feeding Data to AWS Redshift with Airflow

Offered By: EuroPython Conference via YouTube

Tags

EuroPython Courses
Data Warehousing Courses
Apache Airflow Courses
Data Transformation Courses
Data Engineering Courses
Data Pipelines Courses
SQLAlchemy Courses

Course Description

Overview

Explore a talk from EuroPython 2017 on using Airflow to build data pipelines that feed AWS Redshift. The talk covers the fundamentals of Airflow, including scheduling and workflow management, and data-pipeline concepts such as backfills and retries, with practical integration examples. It also covers structuring data in Redshift, basic pre-loading transformations, and managing schemas with SQLAlchemy and Alembic. The speaker shares lessons learned and addresses common Redshift challenges. The talk is suited to data engineers and analysts who want to improve their ETL processes with Airflow and AWS Redshift.
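For readers new to the terms, backfills and retries can be illustrated in plain Python. This is a minimal sketch and not Airflow's API or the speaker's code: the names `backfill_dates`, `run_with_retries`, and `flaky_load` are hypothetical, and the dates are arbitrary. A backfill here means running the same task once for each past day in a range, and a retry means calling a failed task again a limited number of times.

```python
from datetime import date, timedelta


def backfill_dates(start, end):
    """Return every daily run date from start up to, but not including, end."""
    days = (end - start).days
    return [start + timedelta(days=i) for i in range(days)]


def run_with_retries(task, run_date, retries=2):
    """Call task(run_date), retrying up to `retries` extra times on failure."""
    for attempt in range(retries + 1):
        try:
            return task(run_date)
        except Exception:
            if attempt == retries:
                raise  # out of attempts: surface the failure


# Hypothetical load step that fails on its very first call, then succeeds.
calls = []


def flaky_load(run_date):
    calls.append(run_date)
    if len(calls) == 1:
        raise RuntimeError("transient failure")
    return f"loaded {run_date.isoformat()}"


# Backfill three days; the first day needs one retry.
results = [
    run_with_retries(flaky_load, d)
    for d in backfill_dates(date(2017, 7, 1), date(2017, 7, 4))
]
```

Because each run is keyed by its date, a failed or missed day can be rerun on its own without touching the others. A scheduler such as Airflow tracks this per-run state and applies the retry policy for you.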

Syllabus

Introduction
Federico's background
Product problem
Is it simple
Data pipelines
Scale
Stages
Archive
Airflow
Python
Database
UI
Workflow
Operators
Airflow UI
Airflow scheduling
Tracking state
Downtime
Scrapers
Batch IDs
Timestamp
Formats
Redshift COPY command
JSON path flattening
Schema conversion
Migration framework
Redshift annoyances
Futureproof
Thank you


Taught by

EuroPython Conference

Related Courses

A Brief History of Data Storage
EuroPython Conference via YouTube
Breaking the Stereotype - Evolution & Persistence of Gender Bias in Tech
EuroPython Conference via YouTube
We Can Get More from Spatial, GIS, and Public Domain Datasets
EuroPython Conference via YouTube
Using NLP to Detect Knots in Protein Structures
EuroPython Conference via YouTube
The Challenges of Doing Infra-As-Code Without "The Cloud"
EuroPython Conference via YouTube