Octo - INT8 Training with Loss-aware Compensation and Backward Quantization for Tiny On-device Learning
Offered By: USENIX via YouTube
Course Description
Overview
Explore a conference talk on INT8 training for tiny on-device learning, presented at USENIX ATC '21. Dive into the innovative Octo system, which employs 8-bit fixed-point quantization in both forward and backward passes of deep models. Learn about the challenges of on-device learning and how the proposed Loss-aware Compensation (LAC) and Parameterized Range Clipping (PRC) techniques optimize computation while preserving training quality. Discover how Octo achieves higher training efficiency, processing speedup, and memory reduction compared to full-precision training and state-of-the-art quantization methods. Gain insights into the system's performance on commercial AI chips and its potential impact on edge intelligence.
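To make the core idea concrete, here is a minimal sketch of symmetric 8-bit fixed-point quantization with a clipping bound, loosely analogous to the Parameterized Range Clipping the talk describes. The function names and the fixed clip value are illustrative assumptions for this sketch, not Octo's actual implementation.

```python
import numpy as np

def quantize_int8(x, clip_range):
    """Quantize a float tensor to int8 within [-clip_range, clip_range].

    clip_range plays the role of a tunable clipping bound (loosely
    analogous to Octo's Parameterized Range Clipping; illustrative only).
    """
    scale = clip_range / 127.0                      # map clipped range onto int8
    x_clipped = np.clip(x, -clip_range, clip_range)
    q = np.round(x_clipped / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from int8 values."""
    return q.astype(np.float32) * scale

# Example: a small activation tensor, all values inside the clip range
x = np.linspace(-2.0, 2.0, 16, dtype=np.float32).reshape(4, 4)
q, scale = quantize_int8(x, clip_range=3.0)
x_hat = dequantize(q, scale)
err = np.abs(x - x_hat).max()   # rounding error is bounded by scale / 2
```

Because the clipping bound controls the quantization scale, choosing it well trades off clipping error on outliers against rounding error on the bulk of the distribution, which is why the talk treats it as a parameter to optimize rather than a fixed constant.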
Syllabus
Intro
Rise of On-device Learning
Common Compression Methods
The Workflow of DNN Training
Bridge the Gap: Data Quantization
Why Do We Need Quantization?
Potential Gains
Co-design of Network and Training Engine
Our System: Octo
Loss-aware Compensation
Backward Quantization
Evaluation Setup
Convergence Results
Ablation Study: Impact of LAC and PRC
Image Processing Throughput
Deep Insight: Visualization of Intermediate Feature Distribution
System Overhead
Conclusion
Taught by
USENIX
Related Courses
Amazon DynamoDB - A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service
USENIX via YouTube
Faasm - Lightweight Isolation for Efficient Stateful Serverless Computing
USENIX via YouTube
AC-Key - Adaptive Caching for LSM-based Key-Value Stores
USENIX via YouTube
The Future of the Past - Challenges in Archival Storage
USENIX via YouTube
A Decentralized Blockchain with High Throughput and Fast Confirmation
USENIX via YouTube