StreamBox - A Lightweight GPU Sandbox for Serverless Inference Workflow
Offered By: USENIX via YouTube
Course Description
Overview
Explore a groundbreaking conference talk on StreamBox, a lightweight GPU sandbox designed for serverless inference workflows. Delve into the challenges of dynamic workloads and latency-sensitive DNN inference in serverless computing environments. Discover how StreamBox addresses the limitations of existing serverless inference systems by implementing fine-grained and auto-scaling memory management, enabling transparent and efficient intra-GPU communication across functions, and facilitating PCIe bandwidth sharing among concurrent streams. Learn about the significant improvements StreamBox offers, including up to 82% reduction in GPU memory footprint and a 6.7X increase in throughput compared to state-of-the-art systems. Gain insights into the potential impact of this innovative approach on scalable DNN inference serving and the future of serverless computing for GPU-intensive tasks.
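As background for the PCIe bandwidth sharing idea mentioned above, the minimal CUDA sketch below shows the generic mechanism involved: independent CUDA streams let host-to-device copies and kernels from different requests overlap on the PCIe link and the GPU instead of running one after another. This is an illustration under simple assumptions, not StreamBox's actual implementation; the kernel scale and the constants kStreams and kChunk are hypothetical placeholders.

#include <cuda_runtime.h>
#include <cstdio>

// Trivial kernel standing in for one stage of a DNN inference function.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int kStreams = 4;          // hypothetical number of concurrent functions
    const int kChunk   = 1 << 20;    // elements handled per stream
    const size_t bytes = kChunk * sizeof(float);

    // Pinned host memory is needed for truly asynchronous PCIe transfers.
    float* host;
    cudaMallocHost(&host, kStreams * bytes);
    for (int i = 0; i < kStreams * kChunk; ++i) host[i] = 1.0f;

    float* dev;
    cudaMalloc(&dev, kStreams * bytes);

    cudaStream_t streams[kStreams];
    for (int s = 0; s < kStreams; ++s) cudaStreamCreate(&streams[s]);

    // Each stream copies its chunk over PCIe, runs its kernel, and copies the
    // result back. Because the work is issued on separate streams, the copies
    // and kernels from different "functions" can be interleaved by the hardware
    // rather than one stream monopolizing the link while the others sit idle.
    for (int s = 0; s < kStreams; ++s) {
        float* h = host + s * kChunk;
        float* d = dev  + s * kChunk;
        cudaMemcpyAsync(d, h, bytes, cudaMemcpyHostToDevice, streams[s]);
        scale<<<(kChunk + 255) / 256, 256, 0, streams[s]>>>(d, kChunk, 2.0f);
        cudaMemcpyAsync(h, d, bytes, cudaMemcpyDeviceToHost, streams[s]);
    }

    cudaDeviceSynchronize();
    printf("first element after scaling: %f\n", host[0]);

    for (int s = 0; s < kStreams; ++s) cudaStreamDestroy(streams[s]);
    cudaFree(dev);
    cudaFreeHost(host);
    return 0;
}

Compiling this with nvcc and inspecting it in a profiler such as Nsight Systems would show the per-stream transfers interleaving on the PCIe link, which is the baseline behavior the talk's bandwidth-sharing discussion builds on.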
Syllabus
USENIX ATC '24 - StreamBox: A Lightweight GPU SandBox for Serverless Inference Workflow
Taught by
USENIX
Related Courses
Heterogeneous Parallel Programming - University of Illinois at Urbana-Champaign via Coursera
Advanced Operating Systems - Georgia Institute of Technology via Udacity
Computer Programming (計算機程式設計) - National Taiwan University via Coursera
Introduction to Operating Systems - Georgia Institute of Technology via Udacity
Android Performance - Google via Udacity