Code Once, Use Often - Declarative Data Pipelines
Offered By: Databricks via YouTube
Course Description
Overview
Explore declarative data pipelines in this 28-minute conference talk featuring Anthony Awuley and Carter Kilgour from Flashfood. Learn about food waste reduction efforts, Flashfood's data challenges, and the evolution of their data pipeline solutions. Discover the benefits of declarative approaches, including code reusability and automation. Gain insights into tools like airflow-declarative, SyncTable, and custom operators for extract and transform processes. Understand key lessons learned, future challenges, and the potential of Spark YAML. Conclude with a discussion on Keillor's Principles and the importance of feedback in data engineering.
Syllabus
Intro
Agenda
Food Waste
Flashfood Data
Problem Definition
Attempt 1
The Declarative Data Pipeline
Attempt 3
The right amount of automation
Why configs?
airflow-declarative
SyncTableJob
Custom Operator
Extract
Transform
Summary
Lessons Learned
Challenges ahead
Spark YAML
Keillor's Principles
Feedback
Taught by
Databricks
Related Courses
内存数据库管理
openHPI
CS115x: Advanced Apache Spark for Data Science and Data Engineering
University of California, Berkeley via edX
Processing Big Data with Azure Data Lake Analytics
Microsoft via edX
Google Cloud Big Data and Machine Learning Fundamentals en Español
Google Cloud via Coursera
Google Cloud Big Data and Machine Learning Fundamentals 日本語版
Google Cloud via Coursera