
INFaaS - Automated Model-less Inference Serving

Offered By: USENIX via YouTube

Tags

USENIX Annual Technical Conference Courses
Distributed Systems Courses
Autoscaling Courses

Course Description

Overview

Explore INFaaS, an automated model-less system for distributed inference serving, presented at USENIX ATC '21. Dive into the challenges of serving machine learning inference at scale and discover how INFaaS addresses ease of use and cost efficiency. Learn how the system generates model variants from trained models and efficiently navigates the complex trade-off space to meet application-specific objectives. Understand how INFaaS combines VM-level horizontal autoscaling with model-level autoscaling to achieve higher throughput, better compliance with latency objectives, and significant cost savings compared to existing inference serving systems. Gain insights into the model-less abstraction, the autoscaling techniques, and the evaluation results that demonstrate INFaaS's effectiveness in handling diverse application requirements and changing query loads.
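
To make the idea of choosing among model variants concrete, here is a minimal Python sketch. It is not INFaaS's API or its actual selection algorithm; the `ModelVariant` fields, the `select_variant` helper, the variant names, and all numbers are hypothetical and chosen only for illustration. The sketch assumes each variant has already been profiled for accuracy, latency, and hourly cost, and it picks the cheapest variant that satisfies the caller's accuracy and latency requirements.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelVariant:
    """One profiled variant of a trained model (all fields are illustrative)."""
    name: str
    accuracy: float       # accuracy in percent
    latency_ms: float     # profiled inference latency in milliseconds
    cost_per_hour: float  # hourly cost of the hardware the variant runs on


def select_variant(
    variants: Sequence[ModelVariant],
    min_accuracy: float,
    max_latency_ms: float,
) -> Optional[ModelVariant]:
    """Return the cheapest variant meeting both requirements, or None."""
    feasible = [
        v for v in variants
        if v.accuracy >= min_accuracy and v.latency_ms <= max_latency_ms
    ]
    if not feasible:
        return None
    # Prefer lower cost; break ties by lower latency.
    return min(feasible, key=lambda v: (v.cost_per_hour, v.latency_ms))


# Made-up variants, not measurements from the talk or paper.
VARIANTS = [
    ModelVariant("classifier-large-cpu", accuracy=76.0, latency_ms=120.0, cost_per_hour=0.10),
    ModelVariant("classifier-large-gpu", accuracy=75.8, latency_ms=8.0, cost_per_hour=0.90),
    ModelVariant("classifier-small-cpu", accuracy=69.0, latency_ms=25.0, cost_per_hour=0.10),
]

strict = select_variant(VARIANTS, min_accuracy=70.0, max_latency_ms=50.0)
relaxed = select_variant(VARIANTS, min_accuracy=65.0, max_latency_ms=50.0)
```

With these made-up numbers, the stricter accuracy requirement leaves only the GPU variant, while relaxing it lets the cheaper small CPU variant qualify. The caller states requirements rather than naming a model, which is the spirit of the model-less abstraction the talk describes.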

Syllabus

Intro
Model Registration
Diverse application requirements
Today's inference serving
Goals & Requirements
INFaaS overview
The model-less abstraction
Existing systems
Autoscaling
Model-Autoscaler at each worker
Evaluation
How well does INFaaS scale with load?
Putting it all together
Conclusion


Taught by

USENIX

Related Courses

Advanced Operating Systems
Georgia Institute of Technology via Udacity
High Performance Computing
Georgia Institute of Technology via Udacity
GT - Refresher - Advanced OS
Georgia Institute of Technology via Udacity
Distributed Machine Learning with Apache Spark
University of California, Berkeley via edX
CS125x: Advanced Distributed Machine Learning with Apache Spark
University of California, Berkeley via edX