INFaaS - Automated Model-less Inference Serving
Offered By: USENIX via YouTube
Course Description
Overview
Explore INFaaS, an automated model-less system for distributed inference serving, presented at USENIX ATC '21. Dive into the challenges of machine learning inference serving at scale and discover how INFaaS addresses ease-of-use and cost-efficiency issues. Learn how the system generates model variants from trained models and efficiently navigates the complex trade-off space to meet application-specific objectives. Understand how INFaaS combines VM-level horizontal autoscaling with model-level autoscaling to achieve higher throughput, better compliance with latency objectives, and significant cost savings compared to existing inference serving systems. Gain insights into the model-less abstraction, autoscaling techniques, and evaluation results that demonstrate INFaaS's effectiveness in handling diverse application requirements and evolving query loads.
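To make the idea of choosing among model variants concrete, here is a minimal Python sketch. It is not INFaaS's actual selection algorithm or API. The ModelVariant fields, the select_variant function, the variant names, and every number are assumptions made up for illustration. The sketch only shows the general shape of the problem: the application states accuracy and latency requirements, and the system picks a variant that satisfies them at the lowest cost.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelVariant:
    # All fields are hypothetical, chosen for illustration only.
    name: str
    accuracy: float       # e.g. top-1 accuracy in [0, 1]
    latency_ms: float     # profiled inference latency
    cost_per_hour: float  # cost of the hardware the variant runs on


def select_variant(
    variants: Sequence[ModelVariant],
    min_accuracy: float,
    max_latency_ms: float,
) -> Optional[ModelVariant]:
    """Return the cheapest variant meeting both requirements, or None."""
    feasible = [
        v for v in variants
        if v.accuracy >= min_accuracy and v.latency_ms <= max_latency_ms
    ]
    if not feasible:
        return None
    # Break cost ties in favor of the lower-latency variant.
    return min(feasible, key=lambda v: (v.cost_per_hour, v.latency_ms))


# Made-up variants of the kind a system might derive from trained models.
VARIANTS = [
    ModelVariant("resnet50-cpu-fp32", 0.76, 120.0, 0.10),
    ModelVariant("resnet50-gpu-fp16", 0.76, 8.0, 0.90),
    ModelVariant("mobilenet-cpu-int8", 0.70, 15.0, 0.10),
]

if __name__ == "__main__":
    choice = select_variant(VARIANTS, min_accuracy=0.75, max_latency_ms=50.0)
    print(choice.name if choice else "no variant satisfies the requirements")
```

In this sketch the caller never names a specific model, only its requirements, which is the spirit of a model-less interface. A real system would also have to account for load, batching, and which variants are already running.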
Syllabus
Intro
Model Registration
Diverse application requirements
Today's Inference serving
Goals & Requirements
INFaaS overview
The model-less abstraction
Existing systems
Autoscaling
Model-Autoscaler at each worker
Evaluation
How well does INFaaS scale with load?
Putting it all together
Conclusion
Taught by
USENIX
Related Courses
Advanced Operating Systems (Georgia Institute of Technology via Udacity)
High Performance Computing (Georgia Institute of Technology via Udacity)
GT - Refresher - Advanced OS (Georgia Institute of Technology via Udacity)
Distributed Machine Learning with Apache Spark (University of California, Berkeley via edX)
CS125x: Advanced Distributed Machine Learning with Apache Spark (University of California, Berkeley via edX)