Sailing Ray Workloads with KubeRay and Kueue in Kubernetes
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore how to manage Ray workloads in Kubernetes using KubeRay and Kueue in this conference talk. Learn about the growing compute demands in machine learning and how Ray, a unified computing framework, lets ML engineers scale workloads without building complex infrastructure. Discover the benefits of using Kubernetes with KubeRay for managing diverse workloads, and gain insights from ByteDance's experience of submitting thousands of jobs daily to Ray clusters. Understand the challenges of managing concurrent Ray jobs, including job starvation and resource allocation, and how Kueue, a Kubernetes-native job queueing system, addresses these issues with features such as resource management, multi-tenant support, and fair sharing.
Syllabus
Sailing Ray Workloads with KubeRay and Kueue in Kubernetes - Jason Hu, Volcano Engine & Kante Yin
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent