Move Fast and Don't Break Things - How to Build a Scalable Data Platform
Offered By: Data Council via YouTube
Course Description
Overview
Explore the intricacies of building a scalable data platform in this 35-minute conference talk from Data Council. Dive into Machine Learning Operations as Elijah Ben Izzy, Co-founder & CTO of DAGWorks Inc., shares insights from his experience at Stitch Fix. Learn about the two-phase process of deploying ML models, focusing on the open-source library Hamilton. Gain valuable knowledge on overcoming challenges faced by large data science teams, discover practical solutions using Hamilton, and compare it with other industry tools. Understand strategic goals for platform development, explore concepts like cognitive dissonance in data workflows, and examine real-world applications in quantitative finance and machine learning pipelines. Delve into topics such as getting started with Hamilton, scaling operations, ensuring data quality, and customizing workflows. Acquire insights on production use cases, tokenization, summarization, and a new approach to data platform abstraction.
Syllabus
Intro
About DAGWorks
Today's Agenda
Agenda
Drag Race
Cognitive Dissonance
Time Energy
Quantitative Finance
Building a Platform
Strategic Goals of a Platform
My Favorite Platform
Getting Started
Users
Hamilton
DataFrames
Driver
TLDR
Decorators
Map to Your Data Workflow
Testing
Unit Test
Pathway Testing
Scaling
Parallelization
Data Quality
CheckOut
Customization
Life Cycle Method
Materializers
Production Use Case
Machine Learning Pipeline
Machine Learning Pipeline in Practice
ML and RAG Case
DAG Case
Tokenizer
Summarization
Drive it Home
Recap
Platform abstraction
New approach to data
Taught by
Data Council
Related Courses
Data Analysis
Johns Hopkins University via Coursera
Computing for Data Analysis
Johns Hopkins University via Coursera
Scientific Computing
University of Washington via Coursera
Introduction to Data Science
University of Washington via Coursera
Web Intelligence and Big Data
Indian Institute of Technology Delhi via Coursera