INFaaS - Automated Model-less Inference Serving

Offered By: USENIX via YouTube

Course Description

Overview

Explore INFaaS, an automated model-less system for distributed inference serving, presented at USENIX ATC '21. The talk covers the challenges of serving machine learning inference at scale and how INFaaS addresses ease of use and cost efficiency. Learn how the system generates model variants from trained models and navigates the resulting trade-off space to meet application-specific objectives. See how INFaaS combines VM-level horizontal autoscaling with model-level autoscaling to achieve higher throughput, better compliance with latency objectives, and lower cost than existing inference serving systems. The talk also covers the model-less abstraction, the autoscaling techniques, and evaluation results showing how INFaaS handles diverse application requirements and changing query loads.
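
To make the idea of choosing among model variants concrete, here is a minimal Python sketch. It is not INFaaS's actual selection algorithm: the ModelVariant type, the select_variant function, and all variant names and numbers are made up for illustration. It assumes each variant is described by an accuracy, a latency, and a relative cost, and it picks the cheapest variant that satisfies the application's accuracy and latency requirements.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelVariant:
    """Hypothetical description of one variant of a trained model."""
    name: str          # illustrative label, not a real INFaaS identifier
    accuracy: float    # accuracy in percent (made-up value)
    latency_ms: float  # per-query latency in milliseconds (made-up value)
    cost: float        # relative hosting cost (made-up value)


def select_variant(
    variants: List[ModelVariant],
    min_accuracy: float,
    max_latency_ms: float,
) -> Optional[ModelVariant]:
    """Return the cheapest variant meeting both requirements, or None."""
    feasible = [
        v for v in variants
        if v.accuracy >= min_accuracy and v.latency_ms <= max_latency_ms
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda v: v.cost)


# Illustrative variants only; the numbers are assumptions, not measurements.
VARIANTS = [
    ModelVariant("resnet50-cpu-fp32", accuracy=76.0, latency_ms=120.0, cost=1.0),
    ModelVariant("resnet50-gpu-fp16", accuracy=75.8, latency_ms=8.0, cost=4.0),
    ModelVariant("resnet18-cpu-int8", accuracy=69.0, latency_ms=30.0, cost=0.8),
]

if __name__ == "__main__":
    choice = select_variant(VARIANTS, min_accuracy=70.0, max_latency_ms=50.0)
    print(choice.name if choice else "no feasible variant")
```

In this sketch the application states only its requirements, and the variant is chosen on its behalf. A real system would also have to account for current load and for which variants are already running.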

Syllabus

Intro
Model Registration
Diverse application requirements
Today's inference serving
Goals & Requirements
INFaaS overview
The model-less abstraction
Existing systems
Autoscaling
Model-Autoscaler at each worker
Evaluation
How well does INFaaS scale with load?
Putting it all together
Conclusion

Taught by

USENIX
