YoVDO

Batch Systems in Production with Kueue - Multi-Tenancy and Fungibility

Offered By: CNCF [Cloud Native Computing Foundation] via YouTube

Tags

Kubernetes Courses, MLOps Courses, Cluster Management Courses, Cloud Native Computing Courses, Multi-Tenancy Courses, Kueue Courses

Course Description

Overview

Explore the capabilities of Kueue, a cloud-native job scheduler for building multi-tenant batch systems on Kubernetes clusters, in this conference talk. Learn about Kueue's architecture, its extensibility, and how it supports various workloads while queueing jobs based on quotas, priority, and resource-sharing hierarchies. Discover how Kueue operates in both on-premises and autoscaled cloud environments, maximizing resource utilization through borrowing and preemption. Gain insights into Kueue's use in production self-managed clusters serving machine-learning researchers, MLOps engineers, and data scientists. Understand how Kueue integrates with workload types and tools such as DeepSpeed, PyTorch, Kubernetes Job, RayJob, and Jupyter to provide fair resource use and efficient management of accelerators and other resources.
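
To make the idea of quota borrowing concrete, here is a minimal sketch in plain Python. It is a toy model, not Kueue's implementation or API: the `Queue` class, the `try_admit` function, and the tenant names are all hypothetical, and preemption (which would let a tenant reclaim its lent quota) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    """Hypothetical stand-in for a tenant queue; not a Kueue API type."""
    name: str
    nominal: int   # quota guaranteed to this tenant (e.g. number of GPUs)
    used: int = 0  # quota currently consumed by admitted jobs


def try_admit(queue: Queue, cohort: list[Queue], request: int) -> bool:
    """Admit a job if the cohort as a whole still has enough unused quota.

    A tenant may exceed its own nominal quota by borrowing whatever the
    other tenants in the cohort are not using at the moment.
    """
    total_nominal = sum(q.nominal for q in cohort)
    total_used = sum(q.used for q in cohort)
    if total_used + request > total_nominal:
        return False  # nothing left to borrow; the job stays queued
    queue.used += request
    return True


team_a = Queue("team-a", nominal=4)
team_b = Queue("team-b", nominal=4)
cohort = [team_a, team_b]

first = try_admit(team_a, cohort, 6)   # team-a borrows 2 idle units from team-b
second = try_admit(team_b, cohort, 4)  # rejected: only 2 units remain in the cohort
print(first, second, team_a.used, team_b.used)
```

In this simplified model the second request simply waits. A scheduler with preemption could instead evict part of team-a's borrowed workload so that team-b gets back the quota it is guaranteed.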

Syllabus

Batch Systems in Production with Kueue: Multi-Tenancy and Fungibility - Yuki Iwai & Aldo Culquicondor


Taught by

CNCF [Cloud Native Computing Foundation]

Related Courses

SIG Scheduling Deep Dive in Kubernetes - Latest Enhancements and Opportunities
CNCF [Cloud Native Computing Foundation] via YouTube
Kubernetes WG Batch: Recent Improvements and Future Roadmap
CNCF [Cloud Native Computing Foundation] via YouTube
Building a Batch System for the Cloud with Kueue
CNCF [Cloud Native Computing Foundation] via YouTube
Kueue: Kubernetes-Native Job Queueing for Batch Workloads
CNCF [Cloud Native Computing Foundation] via YouTube
Sailing Ray Workloads with KubeRay and Kueue in Kubernetes
CNCF [Cloud Native Computing Foundation] via YouTube