Power-aware Deep Learning Model Serving with μ-Serve
Offered By: USENIX via YouTube
Course Description
Overview
Explore power-aware deep learning model serving with μ-Serve in this 21-minute conference talk from USENIX ATC '24. Discover how researchers from the University of Illinois Urbana-Champaign and IBM Research address the challenge of reducing energy consumption in model-serving clusters while maintaining performance requirements. Learn about the benefits of GPU frequency scaling for power saving in model serving and the importance of co-designing fine-grained model multiplexing with GPU frequency scaling. Examine μ-Serve, a novel power-aware model-serving system that optimizes power consumption and performance for serving multiple ML models in a homogeneous GPU cluster. Gain insights into evaluation results showing significant power savings through dynamic GPU frequency scaling without compromising service level objectives.
Syllabus
USENIX ATC '24 - Power-aware Deep Learning Model Serving with μ-Serve
Taught by
USENIX
Related Courses
Managing Big Data in Clusters and Cloud Storage (Cloudera via Coursera)
The Complete Apache Kafka Practical Guide (Udemy)
Dynamical Systems in Neuroscience (MITCBMM via YouTube)
Dimensionality Reduction II (MITCBMM via YouTube)
Optimizing Spark SQL Jobs with Parallel and Asynchronous IO (Databricks via YouTube)