Deploying Many Models Efficiently with Ray Serve
Offered By: Anyscale via YouTube
Course Description
Overview
Explore how to deploy and manage many models efficiently with Ray Serve in this 26-minute conference talk. Learn how to serve numerous models while optimizing resource utilization and keeping the system easy to use. The talk covers three key Ray Serve features: model composition, multi-application, and model multiplexing. It also surveys common industry patterns for serving many models, shows how Ray Serve simplifies management and improves performance, and presents case studies of Ray Serve users running many-model applications in production. A slide deck is available for additional detail. Ray is an open-source framework that powers AI workloads including generative AI, LLMs, and computer vision; Anyscale offers a managed Ray service for developing, running, and scaling AI applications.
Syllabus
Deploying Many Models Efficiently with Ray Serve
Taught by
Anyscale
Related Courses
Patterns of ML Models in Production
PyCon US via YouTube
Modernizing DoorDash Model Serving Platform with Ray Serve
Anyscale via YouTube
Ray for Large-Scale Time-Series Energy Forecasting to Plan a More Resilient Power Grid
Anyscale via YouTube
Enabling Cost-Efficient LLM Serving with Ray Serve
Anyscale via YouTube
Inference Graphs at LinkedIn Using Ray-Serve
Anyscale via YouTube