Clipper - A Low-Latency Online Prediction Serving System
Offered By: USENIX via YouTube
Course Description
Overview
Explore a conference talk on Clipper, a low-latency online prediction serving system designed to address the challenges of deploying machine learning models in real-time applications. Learn about Clipper's modular architecture that simplifies model deployment across various frameworks and applications. Discover how the system improves prediction latency, throughput, accuracy, and robustness through techniques like caching, batching, and adaptive model selection. Examine Clipper's performance on four machine learning benchmark datasets and its comparison to TensorFlow Serving. Gain insights into how Clipper enables model composition and online learning to enhance prediction accuracy and robustness without modifying underlying machine learning frameworks.
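To make the caching and batching ideas mentioned above concrete, here is a minimal Python sketch. It is not Clipper's actual API or implementation: the class name CachedBatchPredictor, the max_batch_size parameter, the model_calls counter, and the toy_model function are all hypothetical names invented for illustration. The sketch assumes a model is just a function that maps a list of inputs to a list of outputs.

```python
from typing import Callable, Dict, Hashable, List, Sequence


class CachedBatchPredictor:
    """Hypothetical serving wrapper (not Clipper's API).

    Caches past predictions and sends only uncached inputs to the
    model, grouped into batches of at most max_batch_size.
    """

    def __init__(
        self,
        predict_batch: Callable[[List[Hashable]], Sequence[object]],
        max_batch_size: int = 4,
    ) -> None:
        self.predict_batch = predict_batch
        self.max_batch_size = max_batch_size
        self.cache: Dict[Hashable, object] = {}
        self.model_calls = 0  # number of batched calls made to the model

    def predict(self, inputs: Sequence[Hashable]) -> List[object]:
        # Collect inputs not yet cached, de-duplicated, in first-seen order.
        missing = list(dict.fromkeys(x for x in inputs if x not in self.cache))

        # Evaluate the uncached inputs one batch at a time.
        for start in range(0, len(missing), self.max_batch_size):
            batch = missing[start:start + self.max_batch_size]
            outputs = self.predict_batch(batch)
            self.model_calls += 1
            self.cache.update(zip(batch, outputs))

        # Every requested input is now in the cache.
        return [self.cache[x] for x in inputs]


def toy_model(batch: List[int]) -> List[int]:
    """Stand-in for a real model: doubles each input."""
    return [x * 2 for x in batch]


if __name__ == "__main__":
    serving = CachedBatchPredictor(toy_model, max_batch_size=2)
    print(serving.predict([1, 2, 3, 1]))  # three unique inputs, two batches
    print(serving.predict([3, 2]))        # answered entirely from the cache
    print(serving.model_calls)
```

In this sketch, repeated queries never reach the model a second time, and new queries are grouped so the model is invoked once per batch rather than once per input. A real serving system would also need cache eviction, latency-aware batch sizing, and concurrency control, none of which is shown here.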
Syllabus
Introduction
Machine Learning
Prediction Serving
One-off Systems
Architecture
Experiments
Comparison to TensorFlow Serving
Comparison to SIF
Model abstraction layer
Multiple models per application
Wrap-up
Taught by
USENIX
Related Courses
Scaling Memcache at Facebook (USENIX via YouTube)
Multi-Person Localization via RF Body Reflections (USENIX via YouTube)
Opaque - An Oblivious and Encrypted Distributed Analytics Platform (USENIX via YouTube)
Live Video Analytics at Scale with Approximation and Delay-Tolerance (USENIX via YouTube)
VFP - A Virtual Switch Platform for Host SDN in the Public Cloud (USENIX via YouTube)