
INFaaS - Automated Model-less Inference Serving

Offered By: USENIX via YouTube

Tags

USENIX Annual Technical Conference Courses
Distributed Systems Courses
Autoscaling Courses

Course Description

Overview

Explore INFaaS, an automated model-less system for distributed inference serving, presented at USENIX ATC '21. Dive into the challenges of serving machine learning inference at scale and discover how INFaaS addresses ease of use and cost efficiency. Learn how the system generates model-variants from trained models and navigates the resulting trade-off space to meet application-specific objectives. Understand how INFaaS combines VM-level horizontal autoscaling with model-level autoscaling to achieve higher throughput, fewer latency objective violations, and significant cost savings compared to existing inference serving systems. Gain insights into the model-less abstraction, the autoscaling techniques, and the evaluation results that demonstrate INFaaS's effectiveness in handling diverse application requirements and evolving query loads.
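
To make the idea of choosing among model-variants concrete, here is a minimal Python sketch. It is not the INFaaS implementation or API: the `ModelVariant` fields, the `select_variant` helper, the variant names, and all numbers are hypothetical and chosen only for illustration. It assumes each variant has a profiled accuracy, latency, and hourly cost, and it picks the cheapest variant that satisfies an application's accuracy and latency requirements.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelVariant:
    # Hypothetical profile of one model-variant (not an INFaaS data structure).
    name: str
    accuracy: float       # accuracy in percent
    latency_ms: float     # profiled inference latency per query
    cost_per_hour: float  # cost of the hardware the variant runs on


def select_variant(
    variants: Sequence[ModelVariant],
    min_accuracy: float,
    max_latency_ms: float,
) -> Optional[ModelVariant]:
    """Return the cheapest variant meeting both requirements, or None."""
    feasible = [
        v for v in variants
        if v.accuracy >= min_accuracy and v.latency_ms <= max_latency_ms
    ]
    # min() with default=None handles the case where nothing is feasible.
    return min(feasible, key=lambda v: v.cost_per_hour, default=None)


# Illustrative, made-up profiles for three variants of one classifier.
variants = [
    ModelVariant("large-cpu", accuracy=76.0, latency_ms=120.0, cost_per_hour=0.10),
    ModelVariant("large-gpu", accuracy=76.0, latency_ms=15.0, cost_per_hour=0.90),
    ModelVariant("small-cpu", accuracy=70.0, latency_ms=40.0, cost_per_hour=0.10),
]

choice = select_variant(variants, min_accuracy=75.0, max_latency_ms=50.0)
print(choice.name if choice else "no variant satisfies the requirements")
```

In this sketch the application states only its requirements and never names a specific model, which is the intuition behind a model-less interface. A real system would also have to account for current load and for which variants are already running.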

Syllabus

Intro
Model Registration
Diverse application requirements
Today's inference serving
Goals & Requirements
INFaaS overview
The model-less abstraction
Existing systems
Autoscaling
Model-Autoscaler at each worker
Evaluation
How well does INFaaS scale with load?
Putting it all together
Conclusion


Taught by

USENIX

Related Courses

Amazon DynamoDB - A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service
USENIX via YouTube
Faasm - Lightweight Isolation for Efficient Stateful Serverless Computing
USENIX via YouTube
AC-Key - Adaptive Caching for LSM-based Key-Value Stores
USENIX via YouTube
The Future of the Past - Challenges in Archival Storage
USENIX via YouTube
A Decentralized Blockchain with High Throughput and Fast Confirmation
USENIX via YouTube