YoVDO

Inference and Quantization for AI - Session 3

Offered By: Nvidia via YouTube

Tags

Quantization Courses
Deep Learning Courses
Neural Networks Courses
TensorFlow Courses
Object Detection Courses
Model Optimization Courses
TensorRT Courses
Hyperscale Computing Courses

Course Description

Overview

Explore advanced techniques for AI inference and quantization in this session from the NVIDIA AI Tech Workshop at NeurIPS Expo 2018. Dive into quantized inference, NVIDIA TensorRT™ 5 and TensorFlow integration, and the TensorRT Inference Server. Learn about 4-bit quantization, binary neural networks, tensor cores, and strategies for maintaining speed while optimizing accuracy. Discover post-training calibration techniques, mixed precision networks, and the benefits of per-channel scaling. Gain insights into object detection with NMS, the TensorRT hyperscale inference platform, and the NVIDIA TensorRT Inference Server's features including dynamic batching and concurrent model execution. Access additional resources and tools to enhance your AI inference capabilities.
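
As a rough illustration of why per-channel scaling can help, the sketch below compares symmetric (scale-only) INT8 quantization with one scale for a whole weight tensor against one scale per output channel. It is not taken from the session: the toy weights, the max-abs choice of scale, and every function name are assumptions made for this example, and the code uses plain Python rather than TensorRT.

```python
# Sketch: symmetric (scale-only) INT8 quantization, per-tensor vs per-channel.
# All names and the toy weights are illustrative assumptions, not TensorRT APIs.

def quantize(values, scale):
    """Map floats to INT8 codes in [-127, 127] using a single scale (no zero point)."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def dequantize(codes, scale):
    """Map INT8 codes back to approximate float values."""
    return [c * scale for c in codes]

def max_abs(values):
    return max(abs(v) for v in values)

def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Two output channels with very different dynamic ranges (made-up numbers).
weights = [
    [0.010, -0.020, 0.015, -0.005],   # small-range channel
    [2.000, -1.500, 0.750, -2.540],   # large-range channel
]

# Per-tensor: one scale shared by every channel, set by the largest weight overall.
tensor_scale = max_abs([w for ch in weights for w in ch]) / 127
per_tensor_err = [
    mean_abs_error(ch, dequantize(quantize(ch, tensor_scale), tensor_scale))
    for ch in weights
]

# Per-channel: each channel gets a scale set by its own largest weight.
per_channel_err = []
for ch in weights:
    s = max_abs(ch) / 127
    per_channel_err.append(mean_abs_error(ch, dequantize(quantize(ch, s), s)))

print("per-tensor error: ", per_tensor_err)
print("per-channel error:", per_channel_err)
```

In this toy setup the shared scale is dictated by the large-range channel, so the small-range channel is squeezed into only a few integer codes and loses most of its precision. With its own scale it uses the full INT8 range, while the large-range channel is unaffected.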

Syllabus

Intro
OUTLINE
4-BIT QUANTIZATION
QUANTIZATION FOR INFERENCE
BINARY NEURAL NETWORKS
USING TENSOR CORES
QUANTIZED NETWORK ACCURACY
MAINTAINING SPEED AT BEST ACCURACY
SCALE-ONLY QUANTIZATION
PER-CHANNEL SCALING
TRAINING FOR QUANTIZATION
CONCLUSION
POST-TRAINING CALIBRATION
MIXED PRECISION NETWORKS
THE ROOT CAUSE
BRING YOUR OWN CALIBRATION
SUMMARY
INT PERFORMANCE
ALSO IN TensorRT
TF-TRT RELATIVE PERFORMANCE
OBJECT DETECTION - NMS
USING THE NEW NMS OP
NOW AVAILABLE ON GITHUB
TENSORRT HYPERSCALE INFERENCE PLATFORM
INEFFICIENCY LIMITS INNOVATION
NVIDIA TENSORRT INFERENCE SERVER
CURRENT FEATURES
AVAILABLE METRICS
DYNAMIC BATCHING
CONCURRENT MODEL EXECUTION - RESNET 50
NVIDIA RESEARCH AI PLAYGROUND
LEARN MORE AND DOWNLOAD TO USE
ADDITIONAL RESOURCES


Taught by

NVIDIA Developer

Related Courses

Optimize TensorFlow Models For Deployment with TensorRT
Coursera Project Network via Coursera
Jetson Xavier NX Developer Kit - Edge AI Supercomputer Features and Applications
Nvidia via YouTube
NVIDIA Jetson: Enabling AI-Powered Autonomous Machines at Scale
Nvidia via YouTube
Jetson AGX Xavier: Architecture and Applications for Autonomous Machines
Nvidia via YouTube
Streamline Deep Learning for Video Analytics with DeepStream SDK 2.0
Nvidia via YouTube