
Investigating Checkpoint and Restore for GPU-Accelerated Containers

Offered By: Linux Foundation via YouTube

Tags

High Performance Computing Courses
Linux Containers Courses

Course Description

Overview

This 39-minute conference talk by Nan Lu (Microsoft) and Adrian Reber (Red Hat) examines checkpoint and restore for GPU-accelerated containers. The technology is still nascent, and the talk covers early investigations and proofs of concept aimed at optimizing the use of costly GPUs and time-intensive model training processes. It reviews the functionality that exists today and identifies the gaps in the ecosystem that must be closed to enable this solution. It also discusses the challenges and opportunities of applying checkpoint and restore to GPU-powered containers, and how the approach could change resource management in high-performance computing environments.

Syllabus

Investigating Checkpoint and Restore for GPU-Accelerated Containers - Nan Lu & Adrian Reber


Taught by

Linux Foundation

Related Courses

High Performance Computing
Georgia Institute of Technology via Udacity
Introduction to Parallel Programming Using OpenMP and MPI
Tomsk State University via Coursera
High Performance Computing in the Cloud
Dublin City University via FutureLearn
Production Machine Learning Systems
Google Cloud via Coursera
LAFF-On Programming for High Performance
The University of Texas at Austin via edX