YoVDO

Feeding Data to AWS Redshift with Airflow

Offered By: EuroPython Conference via YouTube

Tags

EuroPython Courses, Data Warehousing Courses, Apache Airflow Courses, Data Transformation Courses, Data Engineering Courses, Data Pipelines Courses, SQLAlchemy Courses

Course Description

Overview

A EuroPython 2017 talk on using Airflow to build data pipelines that feed AWS Redshift. It introduces the fundamentals of Airflow, including its scheduling and workflow management features, and covers pipeline-specific concepts such as backfills and retries, with practical integration examples. It then turns to structuring data in Redshift, performing basic pre-loading transformations, and managing schemas with SQLAlchemy and Alembic. The speaker closes with lessons learned and common Redshift challenges. The talk is aimed at data engineers and analysts who want to improve their ETL processes by combining Airflow with AWS Redshift.

Syllabus

Introduction
Federico's background
Product problem
Is it simple?
Data pipelines
Scale
Stages
Archive
Airflow
Python
Database
UI
Workflow
Operators
Airflow UI
Airflow scheduling
Tracking state
Downtime
Scrapers
Batch IDs
Timestamp
Formats
Redshift copy command
JSON path flattening
Schema conversion
Migration framework
Redshift annoyances
Future-proofing
Thank you
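
The syllabus items "Redshift copy command" and "JSON path flattening" most likely refer to loading nested JSON from S3 into flat table columns with Redshift's COPY command and a JSONPaths file. The sketch below is not taken from the talk; it only illustrates the general technique. The table name, bucket, IAM role, and both helper functions are hypothetical.

```python
import json


def build_jsonpaths(paths):
    """Return the text of a Redshift JSONPaths file.

    Each expression selects one value from the source JSON object; the
    values are loaded into the target table's columns in the same order.
    """
    return json.dumps({"jsonpaths": list(paths)}, indent=2)


def build_copy_statement(table, data_uri, jsonpaths_uri, iam_role):
    """Compose a COPY statement that loads JSON from S3 via a JSONPaths file.

    All arguments are illustrative placeholders, not values from the talk.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{data_uri}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"FORMAT AS JSON '{jsonpaths_uri}'\n"
        f"TIMEFORMAT 'auto';"
    )


if __name__ == "__main__":
    # Flatten {"id": ..., "user": {"name": ...}, "created_at": ...}
    # into three columns: id, user name, created_at.
    print(build_jsonpaths(["$.id", "$.user.name", "$.created_at"]))
    print(
        build_copy_statement(
            table="events",  # hypothetical table
            data_uri="s3://example-bucket/events/",  # hypothetical bucket
            jsonpaths_uri="s3://example-bucket/jsonpaths/events.json",
            iam_role="arn:aws:iam::123456789012:role/ExampleRole",
        )
    )
```

In an Airflow pipeline, a statement like this would typically be run by a task once the batch's files have landed in S3; the string formatting here is only for illustration and does not escape its inputs.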


Taught by

EuroPython Conference
