Orchestrating Data Assets Instead of Tasks, With Dagster - Sandy Ryza

Offered By: Open Data Science via YouTube

Tags

- Data Pipelines Courses
- Machine Learning Courses
- Apache Airflow Courses
- Data Engineering Courses
- Dagster Courses

Course Description

Overview

In this talk, Sandy Ryza, lead of the Dagster project at Elementl, explains how orchestrators keep data assets, from datasets to ML models, up to date and in sync with one another. The talk covers what a data pipeline is, looks at Apache Airflow, and walks through building and deploying pipelines. It also follows the development lifecycle: local development, unit and regression testing, review and staging, and debugging and monitoring. It is aimed at data engineers, machine learning enthusiasts, and professionals interested in data synchronization and advanced analytics.

Syllabus

- Introductions
- What is a data pipeline?
- Apache Airflow
- Building a pipeline
- The development lifecycle
- Local development
- Unit/regression testing
- Review and staging
- Deploying
- Debugging/monitoring
- To sum up
- Q&A


Taught by

Open Data Science

Related Courses

- Introduction to Artificial Intelligence (Stanford University via Udacity)
- Natural Language Processing (Columbia University via Coursera)
- Probabilistic Graphical Models 1: Representation (Stanford University via Coursera)
- Computer Vision: The Fundamentals (University of California, Berkeley via Coursera)
- Learning from Data (Introductory Machine Learning course) (California Institute of Technology via Independent)