YoVDO

The Daft Distributed Python Data Engine: Multimodal Data Curation at Any Scale

Offered By: MLOps.community via YouTube

Tags

Python Courses, Artificial Intelligence Courses, Machine Learning Courses, Data Engineering Courses, Distributed Computing Courses, ETL Courses

Course Description

Overview

Explore the Daft distributed Python data engine for multimodal data curation at any scale in this 27-minute talk by Jay Chia. Discover how Daft addresses the fundamental needs of ML/AI data platforms, including terabyte-scale ETL with complex model batch inference, SQL analytics over multimodal datatypes, and performant data loading for model training and inference. Learn why other tools fall short of these requirements, and follow a full example of building a highly performant data platform with the Daft DataFrame and open file formats such as JSON and Parquet. Jay draws on his experience in ML infrastructure across the biotech and autonomous driving industries to show how Daft can change your approach to data curation for ML/AI projects in 2024 and beyond.

Syllabus

The Daft distributed Python data engine: multimodal data curation at any scale // Jay Chia // DE4AI


Taught by

MLOps.community

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Artificial Intelligence for Robotics
Stanford University via Udacity
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent