LLMOps: OpenVINO Toolkit to Quantize Llama 3.2 3B to INT4 and Run Inference on CPU
Offered By: The Machine Learning Engineer via YouTube
Course Description
Overview
Explore a 32-minute video tutorial on LLMOps that uses the OpenVINO Toolkit to quantize the Llama 3.2 3B model to 4-bit integer (INT4) format and run inference on CPU. Learn how to convert the 3-billion-parameter Llama 3.2 model to the OpenVINO IR format, apply 4-bit integer quantization, and run CPU inference with Chain of Thought (CoT) prompts. An accompanying Jupyter notebook is available for hands-on practice. Ideal for data scientists and machine learning enthusiasts who want to optimize large language models for efficient deployment.
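The three steps named above (conversion to OpenVINO IR, 4-bit weight quantization, CPU inference with a CoT prompt) could look roughly like the sketch below. It is not taken from the video's notebook: it assumes the optimum-intel and transformers packages, and the checkpoint id, output directory, prompt wording, and function names are all illustrative choices.

```python
"""Sketch: export Llama 3.2 3B to OpenVINO IR with 4-bit weights, then run it on CPU."""

# Assumption: this Hugging Face checkpoint id; the video may use a different one.
MODEL_ID = "meta-llama/Llama-3.2-3B-Instruct"
# Hypothetical output directory for the exported IR files.
OUTPUT_DIR = "llama-3.2-3b-int4-ov"


def build_cot_prompt(question: str) -> str:
    """Wrap a question in a simple Chain of Thought instruction (illustrative wording)."""
    return (
        "Answer the question below. Think step by step and explain your "
        "reasoning before giving the final answer.\n\n"
        f"Question: {question}\nAnswer:"
    )


def export_int4(model_id: str = MODEL_ID, output_dir: str = OUTPUT_DIR) -> None:
    """Convert the checkpoint to OpenVINO IR and compress its weights to 4 bits."""
    # Imported lazily so the prompt helper works without optimum-intel installed.
    from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig
    from transformers import AutoTokenizer

    quant_config = OVWeightQuantizationConfig(bits=4)
    model = OVModelForCausalLM.from_pretrained(
        model_id, export=True, quantization_config=quant_config
    )
    model.save_pretrained(output_dir)
    # Save the tokenizer next to the IR files so the directory is self-contained.
    AutoTokenizer.from_pretrained(model_id).save_pretrained(output_dir)


def generate_on_cpu(question: str, model_dir: str = OUTPUT_DIR,
                    max_new_tokens: int = 256) -> str:
    """Load the quantized IR model on CPU and answer a question with a CoT prompt."""
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = OVModelForCausalLM.from_pretrained(model_dir, device="CPU")
    inputs = tokenizer(build_cot_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Under these assumptions, one would call export_int4() once to produce the quantized model directory and then call generate_on_cpu() with any question. The Llama 3.2 checkpoints are gated on Hugging Face, so the export step requires an account with access granted.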
Syllabus
LLMOps: OpenVINO Toolkit to Quantize Llama 3.2 3B to INT4 and CPU Inference #datascience #machinelearning
Taught by
The Machine Learning Engineer
Related Courses
LLaMA: Open and Efficient Foundation Language Models - Paper Explained
Yannic Kilcher via YouTube
Alpaca & LLaMA - Can it Compete with ChatGPT?
Venelin Valkov via YouTube
Experimenting with Alpaca & LLaMA
Aladdin Persson via YouTube
What's LLaMA? ChatLLaMA? - And Some ChatGPT/InstructGPT
Aladdin Persson via YouTube
Llama Index - Step by Step Introduction
echohive via YouTube