YoVDO

AlpaServe - Statistical Multiplexing with Model Parallelism for Deep Learning Serving

Offered By: USENIX via YouTube

Tags

USENIX Symposium on Operating Systems Design and Implementation (OSDI) Courses
Deep Learning Courses
Distributed Systems Courses
Cluster Computing Courses

Course Description

Overview

This 15-minute conference talk from OSDI '23 presents AlpaServe, a system for serving deep learning models. AlpaServe uses model parallelism to statistically multiplex multiple devices across the models being served, even when each individual model fits on a single device. The talk covers the trade-off between the overhead that model parallelism introduces and the reduction in serving latency that statistical multiplexing brings under bursty workloads. It also describes AlpaServe's strategy for placing and parallelizing large deep learning models across a distributed cluster. Evaluation on production workloads shows that AlpaServe can process requests at significantly higher rates and tolerate more burstiness while staying within latency constraints for more than 99% of requests.
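The trade-off described above can be illustrated with a toy simulation. This is a minimal sketch under assumed numbers, not AlpaServe's implementation: it models two devices and two models, a hypothetical service time of 1.0 time units per request, and an assumed 10% pipeline overhead. The function names `dedicated_latencies` and `pipelined_latencies` are invented for illustration.

```python
from statistics import mean
from typing import List, Tuple

Request = Tuple[float, str]  # (arrival time, model name)


def dedicated_latencies(requests: List[Request],
                        service_time: float = 1.0) -> List[float]:
    """Each model owns one device; requests queue FIFO per model."""
    free_at = {}  # model name -> time at which its device becomes free
    latencies = []
    for arrival, model in sorted(requests):
        start = max(arrival, free_at.get(model, 0.0))
        free_at[model] = start + service_time
        latencies.append(free_at[model] - arrival)
    return latencies


def pipelined_latencies(requests: List[Request],
                        service_time: float = 1.0,
                        stages: int = 2,
                        overhead: float = 0.1) -> List[float]:
    """Every model is split into `stages` pipeline stages, one per device.

    All requests, regardless of model, share the same pipeline. The
    `overhead` factor is an assumed cost of model parallelism.
    """
    stage_time = service_time * (1.0 + overhead) / stages
    stage_free_at = [0.0] * stages
    latencies = []
    for arrival, _model in sorted(requests):
        t = arrival
        for s in range(stages):
            start = max(t, stage_free_at[s])  # wait for the stage to free up
            t = start + stage_time
            stage_free_at[s] = t
        latencies.append(t - arrival)
    return latencies


if __name__ == "__main__":
    single = [(0.0, "A")]      # one request, no burst
    burst = [(0.0, "A")] * 4   # four requests hit model A at once
    for name, reqs in (("single", single), ("burst", burst)):
        print(name,
              "dedicated:", round(mean(dedicated_latencies(reqs)), 3),
              "pipelined:", round(mean(pipelined_latencies(reqs)), 3))
```

With these assumed numbers, a lone request is slightly slower through the pipeline, which is the parallelism overhead. A burst of four requests to one model finishes sooner on average under the pipelined placement, because both devices absorb the burst instead of one device queuing all of it while the other sits idle.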

Syllabus

OSDI '23 - AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving


Taught by

USENIX

Related Courses

Neural Networks for Machine Learning
University of Toronto via Coursera
機器學習技法 (Machine Learning Techniques)
National Taiwan University via Coursera
Machine Learning Capstone: An Intelligent Application with Deep Learning
University of Washington via Coursera
Прикладные задачи анализа данных (Applied Problems in Data Analysis)
Moscow Institute of Physics and Technology via Coursera
Leading Ambitious Teaching and Learning
Microsoft via edX