Fabricator - Streamlining Declarative Feature Engineering at DoorDash
Offered By: Databricks via YouTube
Course Description
Overview
Explore a 26-minute conference talk on Fabricator, a framework built at DoorDash to streamline declarative data pipelines for machine learning. The system orchestrates 1,400 daily batch jobs and manages 2.2 trillion feature values across all business verticals. Discover Fabricator's three components: a job registry, a library for large-scale data ELT jobs, and an orchestration and execution service. Understand the advantages it offers: streamlining feature development with a declarative feature DSL and a centralized repository, accelerating data fabrication with a high-level SDK, mitigating latency and consistency discrepancies between offline and online feature data, and automating operational tasks. Gain insights into how Databricks Jobs and Delta Lake were used to build Fabricator, and into the lessons learned during its development. Presented by Hebo Yang (ML Infra Engineer) and Kunal Shah (Software Engineer) of DoorDash, this talk is aimed at professionals interested in feature engineering and machine learning infrastructure.
Syllabus
Fabricator: Streamlining Declarative Feature Engineering at DoorDash
Taught by
Databricks
Related Courses
Google Cloud Big Data and Machine Learning Fundamentals en Español
Google Cloud via Coursera
Data Analysis with Python
IBM via Coursera
Intro to TensorFlow 日本語版
Google Cloud via Coursera
TensorFlow on Google Cloud - Français
Google Cloud via Coursera
Freedom of Data with SAP Data Hub
SAP Learning