Fast, Flexible, and Scalable Data Loading for ML Training with Ray Data
Offered By: Anyscale via YouTube
Course Description
Overview
This 31-minute conference talk explores Ray Data for fast, flexible, and scalable data loading in machine learning training pipelines. It compares the performance of several open-source data loaders and shows how Ray Data matches PyTorch DataLoader and tf.data on a single node while adding features for scale: in-memory streaming, automatic recovery from out-of-memory failures, and support for heterogeneous clusters. The talk argues that this combination of speed, scale, and flexibility sets Ray Data apart from other open-source data loaders as preprocessing requirements grow more complex across diverse data types. An accompanying slide deck summarizes the concepts and techniques presented.
Syllabus
Fast, Flexible, and Scalable Data Loading for ML Training with Ray Data
Taught by
Anyscale
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent