The Daft Distributed Python Data Engine: Multimodal Data Curation at Any Scale
Offered By: MLOps.community via YouTube
Course Description
Overview
Explore the Daft distributed Python data engine for multimodal data curation at any scale in this 27-minute talk by Jay Chia. Discover how Daft addresses the fundamental needs of ML/AI data platforms, including terabyte-scale ETL with complex model batch inference, analytics for multimodal datatypes using SQL, and performant data loading for model training and inference. Learn why other tools fall short of these requirements and see a full example of building a highly performant data platform using the Daft DataFrame and open file formats like JSON and Parquet. Gain insights from Jay's experience in ML infrastructure across the biotech and autonomous driving industries, and understand how Daft can change your approach to data curation for ML/AI projects in 2024 and beyond.
Syllabus
The Daft distributed Python data engine: multimodal data curation at any scale // Jay Chia // DE4AI
Taught by
MLOps.community
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Artificial Intelligence for Robotics
Stanford University via Udacity
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent