Bytedance Spark Support for Wanka Model Inference - GPU Optimization Practices
Offered By: The ASF via YouTube
Course Description
Overview
Explore how Bytedance's infrastructure team enhanced Spark to support large-scale GPU-based model inference on Kubernetes. Learn about the challenges faced in migrating from Hadoop to Kubernetes, including shortages of GPU compute supply, resource-pool scaling limits, and wasted online resources. Discover the solutions implemented through GPU sharing, mixed GPU scheduling, Spark engine improvements, and platform enhancements. Gain insight into how these advancements enabled inference over 8 billion multi-modal training data points on 7,000 mixed GPUs in just 7.5 hours, significantly improving resource efficiency and stability for Wanka model inference practices.
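For context on the GPU-scheduling topics the talk covers, the sketch below shows how Spark 3.x's built-in GPU resource scheduling is typically configured for a Kubernetes deployment. This is generic upstream Spark configuration, not Bytedance's internal setup; the API server address, container image, executor count, and discovery-script path are placeholders.

```shell
# Illustrative Spark 3.x GPU scheduling on Kubernetes (generic, not Bytedance's setup).
# Placeholders: API server URL, container image, and discovery-script path.
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<spark-gpu-image> \
  --conf spark.executor.instances=4 \
  --conf spark.executor.resource.gpu.amount=1 \
  --conf spark.task.resource.gpu.amount=1 \
  --conf spark.executor.resource.gpu.vendor=nvidia.com \
  --conf spark.executor.resource.gpu.discoveryScript=/opt/spark/getGpusResources.sh \
  local:///opt/spark/app/inference_job.py
```

The `discoveryScript` reports which GPU addresses each executor owns, and the per-task GPU amount lets Spark pack or isolate inference tasks per device; fractional values (e.g. `0.5`) allow task-level GPU sharing within an executor.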
Syllabus
Bytedance Spark Supports Wanka Model Inference Practices
Taught by
The ASF
Related Courses
Introduction to Cloud Infrastructure Technologies — Linux Foundation via edX
Scalable Microservices with Kubernetes — Google via Udacity
Google Cloud Fundamentals: Core Infrastructure — Google via Coursera
Introduction to Kubernetes — Linux Foundation via edX
Fundamentals of Containers, Kubernetes, and Red Hat OpenShift — Red Hat via edX