Fabricator - Streamlining Declarative Feature Engineering at DoorDash
Offered By: Databricks via YouTube
Course Description
Overview
Explore a 26-minute conference talk on Fabricator, a framework developed at DoorDash to streamline declarative data pipelines for machine learning. Learn how the system orchestrates 1,400 daily batch jobs and manages 2.2 trillion feature values across all business verticals. Discover the components of Fabricator, including its job registry, its library for large-scale data ELT jobs, and its orchestration and execution service. Understand the advantages Fabricator offers: streamlining feature development with a declarative feature DSL and a centralized repository, accelerating data fabrication with a high-level SDK, mitigating latency and consistency discrepancies between offline and online feature data, and automating operational tasks. Gain insights into how Databricks Jobs and Delta Lake were used to build Fabricator and the lessons learned during its development. Presented by Hebo Yang, ML Infra Engineer, and Kunal Shah, Software Engineer from DoorDash, the talk is relevant to professionals interested in feature engineering techniques and machine learning infrastructure.
Syllabus
Fabricator: Streamlining Declarative Feature Engineering at DoorDash
Taught by
Databricks
Related Courses
Data Processing with Azure
LearnQuest via Coursera
Mejores prácticas para el procesamiento de datos en Big Data
Coursera Project Network via Coursera
Data Science with Databricks for Data Analysts
Databricks via Coursera
Azure Data Engineer con Databricks y Azure Data Factory
Coursera Project Network via Coursera
Curso Completo de Spark con Databricks (Big Data)
Coursera Project Network via Coursera