Leaner, Greener and Faster PyTorch Inference with Quantization
Offered By: MLOps World: Machine Learning in Production via YouTube
Course Description
Overview
This conference talk covers quantization in PyTorch as a way to optimize neural networks. Learn how FP32 parameters can be converted to integers with little loss of accuracy, yielding leaner, greener, and faster models. The talk explains the fundamentals of quantization, how it is implemented in PyTorch, and the approaches available, along with the benefits and potential pitfalls of each, so you can choose the right one for a specific use case. The speaker then applies quantization techniques to a large non-academic model to show how they hold up in practice. Presented by Suraj Subramanian, a developer advocate and ML engineer at Meta AI.
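To give a rough sense of what converting FP32 parameters to integers involves, the following is a minimal sketch of 8-bit affine quantization in plain Python. It is an illustration under stated assumptions, not PyTorch's implementation and not code from the talk: the function names (`choose_qparams`, `quantize`, `dequantize`) and the sample weights are hypothetical.

```python
# Illustrative only: plain-Python affine quantization, not PyTorch's internal code.
# All names and the sample weights below are hypothetical.

def choose_qparams(values, qmin=-128, qmax=127):
    """Pick a scale and zero point that map the value range onto [qmin, qmax]."""
    # Extend the range to include 0.0 so that zero is represented exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to 1.0 if every value is zero (range of width 0).
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point

def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers: q = clamp(round(x / scale) + zero_point)."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]

def dequantize(qvalues, scale, zero_point):
    """Map integers back to approximate floats: x ~ (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in qvalues]

weights = [-1.2, -0.3, 0.0, 0.45, 2.1]  # hypothetical FP32 weights
scale, zp = choose_qparams(weights)
q = quantize(weights, scale, zp)
restored = dequantize(q, scale, zp)

# Worst-case rounding error introduced by the round trip.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)
```

Each quantized value fits in one byte instead of the four bytes FP32 uses, which is where the size saving comes from; the price is the rounding error measured by `max_error`. In practice PyTorch's own quantization tooling handles these steps, and the talk covers which approach suits which use case.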
Syllabus
Leaner, Greener and Faster PyTorch Inference with Quantization
Taught by
MLOps World: Machine Learning in Production
Related Courses
Deep Learning with Python and PyTorch
IBM via edX
Introduction to Machine Learning
Duke University via Coursera
How Google does Machine Learning em Português Brasileiro
Google Cloud via Coursera
Intro to Deep Learning with PyTorch
Facebook via Udacity
Secure and Private AI
Facebook via Udacity