Co-Location of CPU and GPU Workloads for High Resource Efficiency in Kubernetes
Offered By: Linux Foundation via YouTube
Course Description
Overview
Explore strategies for improving resource utilization in Kubernetes clusters by co-locating CPU and GPU workloads. Learn how Ant Financial and Alibaba achieved a 10% increase in utilization. Topics include the creation of a new QoS class, the use of node-level cgroups for batch jobs, and the use of a PodGroup CRD for gang scheduling. The speakers also describe building and managing a co-location cluster with more than 100 GPU and 500 CPU nodes that combines long-running services with AI batch jobs. This 37-minute conference talk from the Linux Foundation shares their experience and practices for improving resource efficiency in Kubernetes environments.
Syllabus
Co-Location of CPU and GPU Workloads with High Resource Efficiency - Penghao Cen & Jian He
Taught by
Linux Foundation
Related Courses
Implementando un motor con Alibaba Cloud y ElasticSearch (Coursera Project Network via Coursera)
Design a Cloud Migration Strategy (LinkedIn Learning)
Operate Alibaba Cloud Systems and Services (Alibaba via Coursera)
DevOps on Alibaba Cloud (Alibaba via Coursera)
Deploy and Manage Your Application on Alibaba Cloud (Alibaba via Coursera)