YoVDO

Training and Serving Custom Multi-modal Models - IDEFICS 2 and LLaVA Llama 3

Offered By: Trelis Research via YouTube

Tags

LoRA (Low-Rank Adaptation) Courses, Fine-Tuning Courses, Hugging Face Courses, LLaVA Courses

Course Description

Overview

Learn to train and serve custom multi-modal models using IDEFICS 2 and LLaVA Llama 3 in this comprehensive tutorial video. Explore the IDEFICS 2 model overview, model loading techniques, and LoRA setup. Evaluate OCR performance and handle multiple image inputs. Dive into the training and fine-tuning process, and review the LLaVA Llama 3 model. Set up a multi-modal inference endpoint and understand VRAM requirements for these advanced models. Discover why IDEFICS 2 is recommended as a foundation for building custom multi-modal applications. Access additional resources, including complete scripts, one-click fine-tuning templates, and community support to enhance your learning experience.
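The description mentions understanding VRAM requirements for fine-tuning these models. As a rough guide, the dominant costs are the frozen base weights plus the (much smaller) LoRA weights, their gradients, and their Adam optimizer states. A back-of-envelope sketch, assuming an 8B-parameter model held in bf16 and an illustrative 1% trainable LoRA fraction (these numbers are assumptions, not measurements from the video):

```python
# Back-of-envelope VRAM estimate for LoRA fine-tuning.
# Illustrative only: the model size, dtype, and LoRA fraction are assumptions.

def vram_gb(params_b: float, bytes_per_param: int = 2, lora_fraction: float = 0.01) -> dict:
    """Rough VRAM breakdown in GB for LoRA fine-tuning.

    params_b: model size in billions of parameters.
    bytes_per_param: 2 for fp16/bf16 base weights, 4 for fp32.
    lora_fraction: trainable LoRA params as a fraction of base params (assumed).
    """
    base = params_b * 1e9 * bytes_per_param / 1e9    # frozen base weights
    lora = params_b * 1e9 * lora_fraction * 4 / 1e9  # LoRA weights kept in fp32
    optimizer = lora * 2                             # two Adam moment buffers, LoRA only
    gradients = lora                                 # gradients for LoRA params only
    return {"base": base, "lora": lora, "optimizer": optimizer,
            "gradients": gradients,
            "total": base + lora + optimizer + gradients}

est = vram_gb(8)  # ~8B parameters in bf16
print(f"~{est['total']:.1f} GB before activations and KV cache")
```

Activation memory and the KV cache come on top of this and depend on batch size, image resolution, and sequence length, which is why real-world requirements exceed the weight-only figure.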

Syllabus

Fine-tuning and server setup for multi-modal models
Prerequisites / pre-watching
IDEFICS 2 Model Overview
Model loading, evaluation and LoRA setup
Evaluating OCR performance
Evaluating multiple image inputs
Training / Fine-tuning
LLaVA Llama 3 Model Review
Multi-modal inference endpoint
VRAM Requirements for multi-modal models
IDEFICS 2 - my recommended model to build on
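Several syllabus items revolve around LoRA setup. The core idea behind LoRA is that the frozen weight matrix W is adapted through two small trainable matrices A and B, giving an effective weight of W + (alpha / r) * B @ A. A pure-Python toy sketch of that arithmetic (this is the concept only, not the peft library API used in practice):

```python
# Conceptual sketch of LoRA (Low-Rank Adaptation): instead of updating the full
# weight matrix W, train two small matrices A (r x d_in) and B (d_out x r) so
# the effective weight is W + (alpha / r) * B @ A. Toy illustration only.

def matmul(X, Y):
    """Naive matrix product of nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_effective_weight(W, A, B, alpha, r):
    """Frozen W plus the scaled low-rank update (alpha / r) * B @ A."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# Toy example: 2x2 identity weight with a rank-1 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]      # shape (r=1, d_in=2)
B = [[0.5], [0.25]]   # shape (d_out=2, r=1)
W_eff = lora_effective_weight(W, A, B, alpha=1.0, r=1)
print(W_eff)
```

Because only A and B are trained, the number of trainable parameters scales with the rank r rather than the full weight dimensions, which is what makes fine-tuning large multi-modal models feasible on modest VRAM.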


Taught by

Trelis Research

Related Courses

How to Do Stable Diffusion LORA Training by Using Web UI on Different Models
Software Engineering Courses - SE Courses via YouTube
MicroPython & WiFi
Kevin McAleer via YouTube
Building a Wireless Community Sensor Network with LoRa
Hackaday via YouTube
ComfyUI - Node Based Stable Diffusion UI
Olivio Sarikas via YouTube
AI Masterclass for Everyone - Stable Diffusion, ControlNet, Depth Map, LORA, and VR
Hugh Hou via YouTube