YoVDO

Fine-tuning Pixtral - Multi-modal Vision and Text Model

Offered By: Trelis Research via YouTube

Tags

Machine Learning Courses, Deep Learning Courses, Computer Vision Courses, Jupyter Notebooks Courses, GPU Computing Courses, Fine-Tuning Courses

Course Description

Overview

Explore the process of fine-tuning Pixtral, a multi-modal vision and text model, in this comprehensive tutorial video. Learn about Pixtral's architecture, including its custom image encoder trained from scratch, and follow step-by-step instructions for fine-tuning in a Jupyter notebook. Discover GPU setup requirements, dataset preparation techniques, and advanced chat templating. Gain insights into evaluating baseline performance, setting up LoRA fine-tuning, and optimizing training arguments. Explore methods for merging LoRA adapters, measuring OCR performance, and setting up an API endpoint using vLLM for inference. Access additional resources, including slides, datasets, and code repositories, to enhance your understanding of Pixtral fine-tuning techniques.
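As a rough illustration of what merging LoRA adapters means, the sketch below folds a low-rank update into a base weight matrix using the standard LoRA formulation, W' = W + (alpha / r) * B A. It is a toy, pure-Python example and not the code from the video; the function names `matmul` and `merge_lora` and the small matrices are assumptions made up for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a).

    Hypothetical helper for illustration only: weight is (out, in),
    lora_b is (out, rank), lora_a is (rank, in).
    """
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: a 2x2 base weight with a rank-1 adapter.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]  # shape (out_features, rank)
lora_a = [[0.5, 0.5]]    # shape (rank, in_features)

merged = merge_lora(weight, lora_a, lora_b, alpha=2, rank=1)
print(merged)
```

After merging, the adapter matrices are no longer needed at inference time, which is why a merged model can be served like an ordinary checkpoint.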

Syllabus

How to fine-tune Pixtral
Video Overview
Pixtral architecture and design choices
Mistral’s custom image encoder - trained from scratch
Fine-tuning Pixtral in a Jupyter notebook
GPU setup for notebook fine-tuning and VRAM requirements
Getting a “transformers” version of Pixtral for fine-tuning
Loading Pixtral
Dataset loading and preparation
Chat templating (somewhat advanced, but recommended)
Inspecting and evaluating baseline performance on the custom data
Setting up data collation, including for multi-turn training
Training on completions only (tricky, but improves performance)
Setting up LoRA fine-tuning
Setting up training arguments (batch size, learning rate, gradient checkpointing)
Setting up TensorBoard
Evaluating the trained model
Merging LoRA adapters and pushing the model to hub
Measuring performance on OCR (optical character recognition)
Inferencing Pixtral with vLLM, setting up an API endpoint
Video resources
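The "training on completions only" step in the syllabus is commonly implemented by masking the labels of every token outside the assistant's reply so that those tokens do not contribute to the loss. The sketch below shows that idea on a toy token sequence. It is not the collator used in the video; `mask_non_assistant`, the token ids, and the span positions are hypothetical, and the value -100 is assumed because it is the default `ignore_index` of PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def mask_non_assistant(input_ids, assistant_spans):
    """Build labels that keep only the assistant tokens.

    input_ids: list of token ids for the whole conversation.
    assistant_spans: list of (start, end) index pairs, end exclusive,
        marking where assistant replies sit in input_ids (hypothetical
        positions; a real collator would locate them from the chat template).
    """
    labels = [IGNORE_INDEX] * len(input_ids)
    for start, end in assistant_spans:
        labels[start:end] = input_ids[start:end]
    return labels


# Toy conversation: positions 0-3 are the prompt, positions 4-6 the reply.
input_ids = [1, 10, 11, 12, 20, 21, 2]
labels = mask_non_assistant(input_ids, [(4, 7)])
print(labels)
```

For multi-turn data, passing several spans keeps every assistant turn in the loss while masking all user and system turns.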


Taught by

Trelis Research

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent