Co-Location of CPU and GPU Workloads for High Resource Efficiency in Kubernetes
Offered By: Linux Foundation via YouTube
Course Description
Overview
Explore strategies for optimizing resource utilization in Kubernetes clusters by co-locating CPU and GPU workloads. Learn how Ant Financial and Alibaba increased utilization by 10% by creating a new QoS class, applying node-level cgroups to batch jobs, and using a PodGroup CRD for gang scheduling. Gain insights into building and managing a co-location cluster of over 100 GPU nodes and 500 CPU nodes that runs long-running services alongside AI batch jobs. This 37-minute conference talk from the Linux Foundation shares practical experience for maximizing resource efficiency in Kubernetes environments.
Syllabus
Co-Location of CPU and GPU Workloads with High Resource Efficiency - Penghao Cen & Jian He
Taught by
Linux Foundation
Related Courses
Моделирование биологических молекул на GPU (Biomolecular modeling on GPU)
Moscow Institute of Physics and Technology via Coursera
LLM Server
Pragmatic AI Labs via edX
AI Infrastructure and Operations Fundamentals
Nvidia via Coursera
Open Source LLMOps Solutions
Duke University via Coursera
Deep Learning - Computer Vision for Beginners Using PyTorch
Packt via Coursera