YoVDO

Inference and Quantization for AI - Session 3

Offered By: Nvidia via YouTube

Tags

Quantization Courses Deep Learning Courses Neural Networks Courses TensorFlow Courses Object Detection Courses Model Optimization Courses TensorRT Courses Hyperscale Computing Courses

Course Description

Overview

Explore advanced techniques for AI inference and quantization in this session from the NVIDIA AI Tech Workshop at NeurIPS Expo 2018. Dive into quantized inference, NVIDIA TensorRT™ 5 and TensorFlow integration, and the TensorRT Inference Server. Learn about 4-bit quantization, binary neural networks, tensor cores, and strategies for maintaining speed while optimizing accuracy. Discover post-training calibration techniques, mixed precision networks, and the benefits of per-channel scaling. Gain insights into object detection with NMS, the TensorRT hyperscale inference platform, and the NVIDIA TensorRT Inference Server's features including dynamic batching and concurrent model execution. Access additional resources and tools to enhance your AI inference capabilities.

Syllabus

Intro
OUTLINE
4-BIT QUANTIZATION
QUANTIZATION FOR INFERENCE
BINARY NEURAL NETWORKS
USING TENSOR CORES
QUANTIZED NETWORK ACCURACY
MAINTAINING SPEED AT BEST ACCURACY
SCALE-ONLY QUANTIZATION
PER-CHANNEL SCALING
TRAINING FOR QUANTIZATION
CONCLUSION
POST-TRAINING CALIBRATION
MIXED PRECISION NETWORKS
THE ROOT CAUSE
BRING YOUR OWN CALIBRATION
SUMMARY
INT PERFORMANCE
ALSO IN TensorRT
TF-TRT RELATIVE PERFORMANCE
OBJECT DETECTION - NMS
USING THE NEW NMS OP
NOW AVAILABLE ON GITHUB
TENSORRT HYPERSCALE INFERENCE PLATFORM
INEFFICIENCY LIMITS INNOVATION
NVIDIA TENSORRT INFERENCE SERVER
CURRENT FEATURES
AVAILABLE METRICS
DYNAMIC BATCHING
CONCURRENT MODEL EXECUTION-RESNET 50
NVIDIA RESEARCH AI PLAYGROUND
NV LEARN MORE AND DOWNLOAD TO USE
ADDITIONAL RESOURCES


Taught by

NVIDIA Developer

Tags

Related Courses

3D-печать для всех и каждого
Tomsk State University via Coursera
Developing a Multidimensional Data Model
Microsoft via edX
Launching into Machine Learning 日本語版
Google Cloud via Coursera
Art and Science of Machine Learning 日本語版
Google Cloud via Coursera
Launching into Machine Learning auf Deutsch
Google Cloud via Coursera