Habitat - A Runtime-Based Computational Performance Predictor for Deep Neural Network Training
Offered By: USENIX via YouTube
Course Description
Overview
Explore a runtime-based computational performance predictor for deep neural network (DNN) training in this USENIX ATC '21 conference talk. Dive into the challenge of selecting a GPU for DNN training, where both performance and cost matter. Learn about Habitat, a new technique that uses a GPU the user already has to predict training performance on other GPU options. Understand the concept of wave scaling and how it extrapolates execution times across GPU architectures. Discover how Habitat achieves accurate iteration execution time predictions for a range of DNN models and GPU architectures. Gain insights into the implementation of Habitat as a Python library supporting PyTorch, and explore how it can help researchers and practitioners make cost-efficient GPU selections for their deep learning projects.
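The overview mentions wave scaling without defining it. As a rough intuition, it scales a kernel's execution time measured on the GPU in hand by the relative compute and memory-bandwidth capabilities of the target GPU. The sketch below is an illustrative simplification of that idea, not Habitat's actual formula; the function name, the spec-dict fields, and the `mem_frac` parameter are assumptions made for this example (the hardware numbers are published P4000 and V100 specs).

```python
def wave_scale_estimate(origin_time_ms, origin_gpu, target_gpu, mem_frac):
    """Illustrative simplification of wave-scaling-style extrapolation.

    origin_time_ms: kernel time measured on the GPU we already own.
    origin_gpu / target_gpu: dicts of coarse hardware specs (assumed fields).
    mem_frac: fraction of the kernel's time bound by memory bandwidth.
    """
    # The compute-bound portion scales with raw compute throughput
    # (SM count x clock); the memory-bound portion scales with bandwidth.
    compute_ratio = (origin_gpu["sms"] * origin_gpu["clock_mhz"]) / (
        target_gpu["sms"] * target_gpu["clock_mhz"])
    memory_ratio = origin_gpu["mem_bw_gbs"] / target_gpu["mem_bw_gbs"]
    return origin_time_ms * ((1 - mem_frac) * compute_ratio
                             + mem_frac * memory_ratio)

# Published specs: Quadro P4000 (14 SMs, ~1480 MHz boost, 243 GB/s)
# and Tesla V100 (80 SMs, ~1530 MHz boost, 900 GB/s).
p4000 = {"sms": 14, "clock_mhz": 1480, "mem_bw_gbs": 243}
v100 = {"sms": 80, "clock_mhz": 1530, "mem_bw_gbs": 900}

# Estimate how a 10 ms kernel on the P4000 might run on a V100.
est = wave_scale_estimate(10.0, p4000, v100, mem_frac=0.4)
```

The estimate comes out well under the original 10 ms, matching the intuition that a V100 is substantially faster than a P4000 on both axes.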
Syllabus
Intro
What this talk is about. The problem: many GPUs are available for deep neural network (DNN) training, and each has a different cost and performance.
A Cambrian explosion in hardware for training
Choosing a GPU: The paradox of choice
Key observations: deep learning users may already have an existing GPU
Habitat: A runtime-based performance predictor
One last wrinkle: kernel-varying operations. Wave scaling assumes the same kernel is used across GPUs.
Evaluation
How accurate is Habitat?
Rent a GPU in the cloud? Scenario: you want to train GNMT and have access to a P4000. Which cloud GPU should you use, if any?
Key takeaways: DNN computation is special (repetitive), enabling new analysis opportunities
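The "repetitive" takeaway is what makes runtime-based prediction practical: because every training iteration runs essentially the same computation, a handful of measured iterations on the GPU in hand characterize the steady-state cost. A minimal timing harness in that spirit is sketched below; the function name and parameters are illustrative, not Habitat's API.

```python
import time

def measure_iteration_ms(step_fn, iters=30, warmup=5):
    """Average wall-clock time per training iteration, in milliseconds.

    step_fn: a zero-argument callable that runs one training step.
    Warmup iterations are discarded so one-time costs (memory
    allocation, kernel autotuning) do not skew the estimate.
    """
    for _ in range(warmup):
        step_fn()
    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
    return (time.perf_counter() - start) / iters * 1000.0
```

For real GPU workloads the step function would also need to synchronize the device (e.g. `torch.cuda.synchronize()` in PyTorch) before the clock is read, since CUDA kernel launches are asynchronous.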
Taught by
USENIX
Related Courses
Deep Learning with Python and PyTorch (IBM via edX)
Introduction to Machine Learning (Duke University via Coursera)
How Google does Machine Learning (in Brazilian Portuguese) (Google Cloud via Coursera)
Intro to Deep Learning with PyTorch (Facebook via Udacity)
Secure and Private AI (Facebook via Udacity)