Transparent GPU Sharing in Container Clouds for Deep Learning Workloads

Offered By: USENIX via YouTube

Tags

USENIX Symposium on Networked Systems Design and Implementation (NSDI) Courses

Course Description

Overview

This 15-minute conference talk introduces TGS (Transparent GPU Sharing), a system for sharing GPUs among deep learning workloads in container clouds. TGS operates at the OS layer and targets GPU underutilization in datacenters. It combines adaptive rate control with transparent unified memory to achieve high GPU utilization and performance isolation, so that production jobs see minimal impact while opportunistic jobs gain significantly higher throughput. The talk compares TGS with existing application-layer and OS-layer solutions and covers its integration with Docker and Kubernetes.

Syllabus

NSDI '23 - Transparent GPU Sharing in Container Clouds for Deep Learning Workloads
Taught by

USENIX

Related Courses

Scaling Memcache at Facebook
USENIX via YouTube
Multi-Person Localization via RF Body Reflections
USENIX via YouTube
Opaque - An Oblivious and Encrypted Distributed Analytics Platform
USENIX via YouTube
Live Video Analytics at Scale with Approximation and Delay-Tolerance
USENIX via YouTube
Clipper - A Low-Latency Online Prediction Serving System
USENIX via YouTube