YoVDO

Supercharging AI Training with Mosaic Streaming and Lance Columnar Format

Offered By: Databricks via YouTube

Tags

Machine Learning Courses
PyTorch Courses
Data Management Courses
Distributed Training Courses

Course Description

Overview

This 28-minute conference talk covers tools for managing large-scale AI training datasets. It introduces Mosaic StreamingDataset, which is designed to simplify multi-node, distributed training of large models and integrates with PyTorch. It then explains the advantages of the Lance columnar format over Parquet for ML workloads, chiefly its significantly better random access performance, which many training operations depend on. Combining StreamingDataset with Lance lets training jobs stream data directly from object storage, improving both performance and cost-effectiveness. The talk also examines how these technologies work internally and shows how to construct a basic training pipeline that uses them for more efficient, higher-quality model training.

Syllabus

Supercharging AI Training with Mosaic Streaming and Lance Columnar Format


Taught by

Databricks

Related Courses

Custom and Distributed Training with TensorFlow
DeepLearning.AI via Coursera
Architecting Production-ready ML Models Using Google Cloud ML Engine
Pluralsight
Building End-to-end Machine Learning Workflows with Kubeflow
Pluralsight
Deploying PyTorch Models in Production: PyTorch Playbook
Pluralsight
Inside TensorFlow
TensorFlow via YouTube