YoVDO

Supercharging AI Training with Mosaic Streaming and Lance Columnar Format

Offered By: Databricks via YouTube

Tags

Machine Learning Courses, PyTorch Courses, Data Management Courses, Distributed Training Courses

Course Description

Overview

Explore a 28-minute conference talk on managing large-scale AI training datasets. Learn about Mosaic StreamingDataset, which is designed to simplify multi-node, distributed training of large models and integrates with PyTorch. Discover the advantages of the Lance columnar format over Parquet for ML workloads, in particular its significantly faster random access, which is crucial for various training operations. See how combining StreamingDataset with Lance allows data to be streamed directly from object storage, improving performance and cost-effectiveness. Examine the inner workings of these technologies and learn how to build a basic training pipeline that uses them for more efficient, higher-quality AI model training.
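
To illustrate why random access matters for multi-node training, here is a minimal conceptual sketch in plain Python. It does not use the StreamingDataset or Lance APIs; the function names (epoch_indices_for_rank, fetch) and the in-memory list standing in for object storage are hypothetical, chosen only for illustration. Each rank derives the same shuffled order from a shared seed, takes its own slice, and then reads samples by index in a non-sequential order.

```python
import random


def epoch_indices_for_rank(num_samples, world_size, rank, seed, epoch):
    """Return the sample indices that one rank reads during one epoch.

    Hypothetical helper for illustration only, not a library API.
    """
    order = list(range(num_samples))
    # Every rank seeds the shuffle identically, so all ranks agree on
    # the global sample order without communicating.
    random.Random(seed + epoch).shuffle(order)
    # Strided split: rank r takes positions r, r + world_size, ...
    return order[rank::world_size]


def fetch(store, indices):
    """Stand-in for random-access reads against a dataset.

    `store` is any indexable container; a real pipeline would read
    rows from files in object storage instead.
    """
    return [store[i] for i in indices]


if __name__ == "__main__":
    samples = [f"sample-{i}" for i in range(10)]
    for rank in range(2):
        idx = epoch_indices_for_rank(len(samples), world_size=2,
                                     rank=rank, seed=0, epoch=0)
        print(rank, fetch(samples, idx))
```

Because the indices each rank requests are scattered across the dataset, the cost of reading a single row by position dominates, which is the access pattern the talk says Lance handles better than Parquet.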

Syllabus

Supercharging AI Training with Mosaic Streaming and Lance Columnar Format


Taught by

Databricks

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent