
INFaaS - Automated Model-less Inference Serving

Offered By: USENIX via YouTube

Tags

USENIX Annual Technical Conference Courses
Distributed Systems Courses
Autoscaling Courses

Course Description

Overview

Explore INFaaS, an automated model-less system for distributed inference serving, presented at USENIX ATC '21. The talk covers the challenges of serving machine learning inference at scale and shows how INFaaS addresses both ease of use and cost efficiency. Learn how the system generates model variants from trained models and efficiently navigates the resulting trade-off space to meet application-specific objectives. See how INFaaS combines VM-level horizontal autoscaling with model-level autoscaling to achieve higher throughput, fewer latency objective violations, and significant cost savings compared to existing inference serving systems. The session walks through the model-less abstraction, the autoscaling techniques, and evaluation results that demonstrate how INFaaS handles diverse application requirements and changing query loads.
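
To make the idea of choosing among model variants concrete, here is a minimal Python sketch. It is not INFaaS's actual selection algorithm or API: the `ModelVariant` class, the `select_variant` function, the variant names, and all numbers are hypothetical, chosen only to show how a serving system might pick the cheapest variant that satisfies an application's accuracy and latency requirements.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelVariant:
    # All fields hold made-up illustration values, not INFaaS measurements.
    name: str
    accuracy: float       # accuracy in percent
    latency_ms: float     # profiled inference latency in milliseconds
    cost_per_hour: float  # cost of the hardware the variant runs on


def select_variant(
    variants: Sequence[ModelVariant],
    min_accuracy: float,
    max_latency_ms: float,
) -> Optional[ModelVariant]:
    """Return the cheapest variant meeting both requirements, or None."""
    feasible = [
        v for v in variants
        if v.accuracy >= min_accuracy and v.latency_ms <= max_latency_ms
    ]
    if not feasible:
        return None
    # Prefer lower cost; break ties by lower latency.
    return min(feasible, key=lambda v: (v.cost_per_hour, v.latency_ms))


# Hypothetical variants of one registered model family.
VARIANTS = [
    ModelVariant("resnet50-cpu-fp32", 76.0, 120.0, 0.10),
    ModelVariant("resnet50-gpu-fp16", 75.8, 8.0, 0.90),
    ModelVariant("resnet18-cpu-int8", 69.0, 25.0, 0.10),
]

if __name__ == "__main__":
    choice = select_variant(VARIANTS, min_accuracy=70.0, max_latency_ms=50.0)
    print(choice.name if choice else "no variant satisfies the requirements")
```

In this sketch the application states only its requirements and never names a model, which is the spirit of the model-less abstraction described above. A real system would also need to account for load, batching, and which variants are already running.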

Syllabus

Intro
Model Registration
Diverse application requirements
Today's inference serving
Goals & Requirements
INFaaS overview
The model-less abstraction
Existing systems
Autoscaling
Model-Autoscaler at each worker
Evaluation
How well does INFaaS scale with load?
Putting it all together
Conclusion


Taught by

USENIX

Related Courses

Designing Highly Scalable Web Apps on Google Cloud Platform
Google via Coursera
Elastic Google Cloud Infrastructure: Scaling and Automation
Google Cloud via Coursera
Elastic Cloud Infrastructure: Scaling and Automation auf Deutsch
Google Cloud via Coursera
Elastic Cloud Infrastructure: Scaling and Automation en Français
Google Cloud via Coursera
Alibaba Cloud Native Solutions and Container Service
Alibaba via Coursera