YoVDO

Building Large-Scale Data Processing Pipelines for Multimodal Models with Ray - ByteDance Case Study

Offered By: Anyscale via YouTube

Tags

Data Processing Courses
Distributed Computing Courses
Ray Serve Courses

Course Description

Overview

This 35-minute conference talk covers how ByteDance builds large-scale data processing pipelines for multimodal models using Ray. Xiaohong Dong, Wanxing Wang, and Liguang Xie from ByteDance describe the challenges of processing large volumes of high-quality video data for video generation models. They explain how they use the Ray ecosystem, including Ray Core, Ray Data, and Ray Serve, to build a robust and scalable data pipeline. The talk also covers managing Ray infrastructure, best practices for large-scale multimodal AI projects, and solutions for dynamic scaling and orchestration of heterogeneous resources. Taken together, these topics offer a blueprint for using Ray in other large AI projects that handle massive video datasets.

Syllabus

How ByteDance Builds Large-Scale Data Processing Pipelines for Multimodal Models with Ray | RS 24


Taught by

Anyscale

Related Courses

Cloud Computing Concepts, Part 1
University of Illinois at Urbana-Champaign via Coursera
Cloud Computing Concepts: Part 2
University of Illinois at Urbana-Champaign via Coursera
Reliable Distributed Algorithms - Part 1
KTH Royal Institute of Technology via edX
Introduction to Apache Spark and AWS
University of London International Programmes via Coursera
Réalisez des calculs distribués sur des données massives
CentraleSupélec via OpenClassrooms