LLMOps: OpenVino Toolkit Quantization 4int LLama 3.2 3B and Inference on CPU
Offered By: The Machine Learning Engineer via YouTube
Course Description
Overview
Learn how to convert the Llama 3.2 3B (3-billion-parameter) model to the OpenVINO IR (Intermediate Representation) format and quantize it to 4-bit integer (INT4) precision. The tutorial demonstrates model conversion and quantization step by step, then shows how to run inference on a CPU with the optimized model using Chain of Thought (CoT) prompts. An accompanying Jupyter notebook is available for hands-on practice and a deeper understanding of the LLMOps techniques covered in this 26-minute data science and machine learning tutorial.
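The workflow described above (export to OpenVINO IR, INT4 quantization, CPU inference with a CoT prompt) could look roughly like the following. This is a minimal sketch, not the notebook's actual code: it assumes the Hugging Face `optimum-intel` package (installed with its OpenVINO extras) as the conversion route, and the model ID, output directory, quantization settings (`group_size`, `ratio`), and helper names `build_cot_prompt`, `export_int4`, and `run_cot` are illustrative assumptions. The tutorial itself may use a different path, such as the `optimum-cli` command or NNCF directly.

```python
# Sketch only: assumes `pip install "optimum[openvino]"` and access to the
# (gated) Llama 3.2 weights on the Hugging Face Hub.

MODEL_ID = "meta-llama/Llama-3.2-3B-Instruct"  # assumed model ID
OUT_DIR = "llama-3.2-3b-ov-int4"               # hypothetical output folder


def build_cot_prompt(question: str) -> str:
    """Wrap a question in a simple zero-shot Chain of Thought prompt."""
    return f"Question: {question.strip()}\nAnswer: Let's think step by step."


def export_int4(model_id: str = MODEL_ID, out_dir: str = OUT_DIR) -> None:
    """Convert the model to OpenVINO IR with 4-bit weight quantization."""
    # Imported lazily so the prompt helper works without these packages.
    from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig
    from transformers import AutoTokenizer

    # Assumed settings: asymmetric INT4, groups of 128 weights, 80% of the
    # layers in 4-bit and the remainder in 8-bit.
    q_config = OVWeightQuantizationConfig(bits=4, sym=False, group_size=128, ratio=0.8)
    model = OVModelForCausalLM.from_pretrained(
        model_id, export=True, quantization_config=q_config
    )
    model.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(model_id).save_pretrained(out_dir)


def run_cot(question: str, model_dir: str = OUT_DIR, max_new_tokens: int = 256) -> str:
    """Load the quantized IR model on CPU and answer a CoT-style prompt."""
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = OVModelForCausalLM.from_pretrained(model_dir, device="CPU")
    inputs = tokenizer(build_cot_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Under these assumptions, `export_int4()` would be called once to produce the quantized IR files, after which `run_cot("...")` can be called repeatedly without re-exporting. An instruction-tuned checkpoint would normally be prompted through the tokenizer's chat template; the plain-string prompt here just keeps the sketch short.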
Syllabus
LLMOps: OpenVino Toolkit quantization 4int LLama3.2 3B, Inference CPU #datascience #machinelearning
Taught by
The Machine Learning Engineer
Related Courses
Aerial Image Segmentation with PyTorch
Coursera Project Network via Coursera
Discrete Inference and Learning in Artificial Vision
École Centrale Paris via Coursera
Building Language Models on AWS (Japanese) 日本語字幕版
Amazon Web Services via AWS Skill Builder
ChatGPT Prompt Engineering for Developers
DeepLearning.AI via Independent
Introduction to Bayesian Statistics
Databricks via Coursera