Power-aware Deep Learning Model Serving with μ-Serve
Offered By: USENIX via YouTube
Course Description
Overview
Explore power-aware deep learning model serving with μ-Serve in this 21-minute conference talk from USENIX ATC '24. Discover how researchers from the University of Illinois Urbana-Champaign and IBM Research address the challenge of reducing energy consumption in model-serving clusters while maintaining performance requirements. Learn about the benefits of GPU frequency scaling for power saving in model serving and the importance of co-designing fine-grained model multiplexing with GPU frequency scaling. Examine μ-Serve, a novel power-aware model-serving system that optimizes power consumption and performance for serving multiple ML models in a homogeneous GPU cluster. Gain insights into evaluation results showing significant power savings through dynamic GPU frequency scaling without compromising service level objectives.
Syllabus
USENIX ATC '24 - Power-aware Deep Learning Model Serving with μ-Serve
Taught by
USENIX
Related Courses
Energy and the Earth (University of Wisconsin–Madison via Coursera)
Politics and Economics of International Energy (Sciences Pro via France Université Numerique)
Energy: Thermodynamics in Everyday Life (University of Liverpool via FutureLearn)
Energía: pasado, presente y futuro (Tecnológico de Monterrey via edX)
Smart Grids for Smart Cities: Towards Zero Emissions (Homuork via FutureLearn)