Building a Batch System for the Cloud with Kueue
Offered By: CNCF [Cloud Native Computing Foundation] via YouTube
Course Description
Overview
Explore the key concepts of Kueue, a cloud-native job scheduler, in this 29-minute conference talk. Learn how to address resource constraints in batch, HPC, and AI/ML clusters serving multiple teams and researchers. Discover how Kueue works in combination with the default Kubernetes scheduler, job controller, and cluster-autoscaler to provide a comprehensive batch system. Understand how Kueue implements job queueing, making decisions on when jobs should wait or start based on quotas and a hierarchy for fair resource sharing among teams. Gain insights into Kueue's effectiveness in cloud environments with heterogeneous, fungible resources that can be scaled for cost optimization. Learn how to model your teams and resources to transform your Kubernetes cluster into an efficient batch system using Kueue.
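The queueing idea in the overview (jobs wait or start depending on quotas, and teams share what others leave unused) can be illustrated with a toy model. The sketch below is not Kueue's API or its actual admission algorithm; the names `Job`, `TeamQueue`, and `admit_pending` are hypothetical, and it assumes a single resource (CPUs) and a single group of teams that may borrow from each other.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    cpus: int


@dataclass
class TeamQueue:
    name: str
    quota: int                      # nominal CPU quota owned by the team
    used: int = 0                   # CPUs held by admitted jobs
    pending: deque = field(default_factory=deque)


def admit_pending(queues):
    """Admit waiting jobs in FIFO order; return the names admitted."""
    admitted = []
    total_quota = sum(q.quota for q in queues)

    # Pass 1: each team admits jobs that fit inside its own quota.
    for q in queues:
        while q.pending and q.used + q.pending[0].cpus <= q.quota:
            job = q.pending.popleft()
            q.used += job.cpus
            admitted.append(job.name)

    # Pass 2: a team may borrow quota that other teams leave unused.
    for q in queues:
        while q.pending:
            total_used = sum(x.used for x in queues)
            if total_used + q.pending[0].cpus > total_quota:
                break  # head job must wait; keep strict FIFO order
            job = q.pending.popleft()
            q.used += job.cpus
            admitted.append(job.name)
    return admitted


# Hypothetical teams and jobs, chosen only for illustration.
team_a = TeamQueue("team-a", quota=8)
team_b = TeamQueue("team-b", quota=4)
team_a.pending.extend([Job("a1", 6), Job("a2", 4)])
team_b.pending.append(Job("b1", 4))

first_round = admit_pending([team_a, team_b])

# b1 finishes and returns its CPUs, so team-a can now borrow them.
team_b.used -= 4
second_round = admit_pending([team_a, team_b])
```

In the first round, a2 waits because it exceeds team-a's own quota and no spare quota is left once b1 is running; after b1 finishes, a2 starts on borrowed quota. A real system would also need to handle reclaiming borrowed quota, priorities, and multiple resource types, which this sketch leaves out.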
Syllabus
Building a Batch System for the Cloud with Kueue - Aldo Culquicondor, Google & Kante Yin, DaoCloud
Taught by
CNCF [Cloud Native Computing Foundation]
Related Courses
Introduction to Artificial Intelligence - Stanford University via Udacity
Natural Language Processing - Columbia University via Coursera
Probabilistic Graphical Models 1: Representation - Stanford University via Coursera
Computer Vision: The Fundamentals - University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course) - California Institute of Technology via Independent