YoVDO

Supercharging AI Training with Mosaic Streaming and Lance Columnar Format

Offered By: Databricks via YouTube

Tags

Machine Learning Courses
PyTorch Courses
Data Management Courses
Distributed Training Courses

Course Description

Overview

This 28-minute conference talk covers tooling for managing large-scale AI training datasets. It introduces Mosaic StreamingDataset, which is designed to simplify multi-node, distributed training of large models and integrates with PyTorch. It then compares the Lance columnar format with Parquet for ML workloads, presenting Lance as offering significantly better random access performance, which many training operations depend on. The talk shows how combining StreamingDataset with Lance lets training jobs stream data directly from object storage, improving performance and reducing cost. It also examines how both technologies work internally and walks through building a basic training pipeline that uses them.

Syllabus

Supercharging AI Training with Mosaic Streaming and Lance Columnar Format


Taught by

Databricks

Related Courses

Données et services numériques, dans le nuage et ailleurs
Certificat informatique et internet via France Université Numerique
Introduction to Digital Curation
University College London via Independent
Excel Avanzado
Miríadax
SAP Business Warehouse powered by SAP HANA
SAP Learning
Programming Mobile Applications for Android Handheld Systems: Part 2
University of Maryland, College Park via Coursera