Accelerate Your GenAI Model Inference with Ray and Kubernetes
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore how to accelerate Generative AI model inference using Ray and Kubernetes in this conference talk. Delve into the challenges of serving massive GenAI models with hundreds of billions of parameters, and learn practical approaches for deploying them in production. Discover how KubeRay runs Ray clusters on Kubernetes and leverages hardware accelerators such as GPUs and TPUs for better performance. Gain insights into Ray, an open-source framework for distributed machine learning, and its Serve library for scalable online inference. Understand how integrating Ray with accelerators creates a robust platform for serving GenAI models efficiently and cost-effectively, and learn how to scale workloads across large clusters of machines while optimizing your Kubernetes platform for cutting-edge AI applications.
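To make the KubeRay-plus-Ray-Serve deployment pattern discussed in the talk concrete, below is a minimal sketch of a RayService manifest, which is the KubeRay custom resource that manages a Ray cluster and a Serve application together. The names, image tag, replica counts, and the `serve_app:deployment` import path are illustrative assumptions, not details from the talk; consult the KubeRay documentation for the fields your cluster version supports.

```yaml
# Hypothetical RayService sketch: a Serve app backed by GPU worker pods.
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: genai-inference          # illustrative name
spec:
  serveConfigV2: |
    applications:
      - name: app
        import_path: serve_app:deployment   # hypothetical module:object
        deployments:
          - name: Model
            num_replicas: 2
  rayClusterConfig:
    headGroupSpec:
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-head
              image: rayproject/ray:2.9.0   # example version
    workerGroupSpecs:
      - groupName: gpu-workers
        replicas: 2
        minReplicas: 1
        maxReplicas: 4
        rayStartParams: {}
        template:
          spec:
            containers:
              - name: ray-worker
                image: rayproject/ray:2.9.0
                resources:
                  limits:
                    nvidia.com/gpu: 1       # one GPU per worker pod
```

Applied with `kubectl apply -f`, a manifest like this has the KubeRay operator provision the head and worker pods and deploy the Serve application onto them, so scaling the model service becomes a matter of adjusting replica counts.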
Syllabus
Accelerate Your GenAI Model Inference with Ray and Kubernetes - Richard Liu, Google Cloud
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
Scalable Data Science - Indian Institute of Technology, Kharagpur via Swayam
Data Science and Engineering with Spark - Berkeley University of California via edX
Data Science on Google Cloud: Machine Learning - Google via Qwiklabs
Modern Distributed Systems - Delft University of Technology via edX
KungFu - Making Training in Distributed Machine Learning Adaptive - USENIX via YouTube