
Refurbish Your Training Data - Reusing Partially Augmented Samples for Faster Deep Neural Network Training

Offered By: USENIX via YouTube

Tags

USENIX Annual Technical Conference Courses
PyTorch Courses
Data Augmentation Courses

Course Description

Overview

Explore a 15-minute conference talk from USENIX ATC '21 that introduces data refurbishing, a novel sample reuse mechanism that accelerates deep neural network training while preserving model generalization. Learn how the technique splits data augmentation into a partial and a final stage, reusing partially augmented samples to reduce CPU computation while maintaining sample diversity. Discover the design and implementation of Revamper, a new data loading system that maximizes the overlap between the CPU and deep learning accelerators. Examine evaluation results showing that Revamper accelerates the training of computer vision models by 1.03×–2.04× while maintaining comparable accuracy. Gain insights into the DNN training pipeline, the overhead of data augmentation, and practical techniques for improving training efficiency.
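
To make the partial/final split concrete, below is a minimal PyTorch-flavored sketch of the refurbishing idea, not the authors' Revamper implementation. The class name RefurbishDataset, the reuse_factor parameter, and the particular choice of which transforms go in each stage are illustrative assumptions; only the overall mechanism (cache the expensive partial augmentation, always rerun the cheap final augmentation) follows the talk's description.

```python
from torch.utils.data import Dataset


class RefurbishDataset(Dataset):
    """Illustrative sketch of data refurbishing (not the authors' Revamper code).

    Augmentation is split into an expensive `partial_transform`, whose output is
    cached and reused for `reuse_factor` accesses, and a cheap `final_transform`
    that is applied fresh on every access to keep the samples diverse.
    """

    def __init__(self, base_dataset, partial_transform, final_transform, reuse_factor=3):
        self.base = base_dataset
        self.partial = partial_transform
        self.final = final_transform
        self.reuse_factor = reuse_factor
        self.cache = {}       # idx -> (partially augmented sample, label)
        self.uses_left = {}   # idx -> remaining reuses before recomputation

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        if self.uses_left.get(idx, 0) <= 0:
            # Cache miss or expired entry: rerun the expensive partial stage on the CPU.
            sample, label = self.base[idx]
            self.cache[idx] = (self.partial(sample), label)
            self.uses_left[idx] = self.reuse_factor
        partially_augmented, label = self.cache[idx]
        self.uses_left[idx] -= 1
        # The cheap final stage runs on every access, so repeated epochs
        # still see distinct fully augmented samples.
        return self.final(partially_augmented), label


# Hypothetical usage with torchvision (names and transform split are assumptions):
#   partial = transforms.Compose([transforms.ColorJitter(0.4, 0.4, 0.4), transforms.ToTensor()])
#   final = transforms.RandomErasing()
#   train_set = RefurbishDataset(datasets.CIFAR10("data", train=True), partial, final)
#   loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)
```

A naive cache like this makes per-batch preparation time uneven (cache hits are much cheaper than misses); the talk's Revamper system addresses this with balanced eviction and cache-aware shuffling, covered in the syllabus below.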

Syllabus

Intro
DNN Training Pipeline
Overhead of Data Augmentation
Existing Approach: Data Echoing
Our Approach: Data Refurbishing
Analysis of Sample Diversity
Standard Training
Challenge: Inconsistent Batch Time
PyTorch Dataloader
Revamper
Balanced Eviction
Cache-Aware Shuffle
Implementation
Evaluation: Environments
Evaluation: Baselines
Evaluation: Accuracy & Throughput
Conclusion


Taught by

USENIX

Related Courses

Convolutional Neural Networks in TensorFlow
DeepLearning.AI via Coursera
Emotion AI: Facial Key-points Detection
Coursera Project Network via Coursera
Transfer Learning for Food Classification
Coursera Project Network via Coursera
Facial Expression Classification Using Residual Neural Nets
Coursera Project Network via Coursera
Apply Generative Adversarial Networks (GANs)
DeepLearning.AI via Coursera