YoVDO

Uber's GenAI Leap: Batch Predictions Using Ray and vLLM - Ray Summit 2024

Offered By: Anyscale via YouTube

Tags

Machine Learning Courses
Kubernetes Courses
Generative AI Courses
GPU Computing Courses
vLLM Courses

Course Description

Overview

Explore Uber's approach to large-scale Generative AI batch prediction in this Ray Summit 2024 presentation. Learn how Uber integrates Ray and vLLM into its Michelangelo machine learning platform to support GenAI application development, and how this method addresses the limitations of traditional Spark-based approaches, particularly for GPU-intensive tasks. Gain insight into the architecture of the new system, its integration with Kubernetes and Michelangelo's LLM evaluation workflow, and its application to various Uber services. Review the benchmarking results and the lessons learned from developing and implementing the solution, and take away practical guidance on using Ray and vLLM to scale GenAI prediction tasks and reduce latency.

Syllabus

Uber's GenAI Leap: Batch Predictions Using Ray and vLLM | Ray Summit 2024


Taught by

Anyscale

Related Courses

Finetuning, Serving, and Evaluating Large Language Models in the Wild
Open Data Science via YouTube
Cloud Native Sustainable LLM Inference in Action
CNCF [Cloud Native Computing Foundation] via YouTube
Optimizing Kubernetes Cluster Scaling for Advanced Generative Models
Linux Foundation via YouTube
LLaMa for Developers
LinkedIn Learning
Scaling Video Ad Classification Across Millions of Classes with GenAI
Databricks via YouTube