
Training and Serving Custom Multi-modal Models - IDEFICS 2 and LLaVA Llama 3

Offered By: Trelis Research via YouTube

Tags

LoRA (Low-Rank Adaptation) Courses
Fine-Tuning Courses
Hugging Face Courses
LLaVA Courses

Course Description

Overview

Learn to train and serve custom multi-modal models using IDEFICS 2 and LLaVA Llama 3 in this comprehensive tutorial video. Explore the IDEFICS 2 model overview, model loading techniques, and LoRA setup. Evaluate OCR performance and handle multiple image inputs. Dive into the training and fine-tuning process, and review the LLaVA Llama 3 model. Set up a multi-modal inference endpoint and understand VRAM requirements for these advanced models. Discover why IDEFICS 2 is recommended as a foundation for building custom multi-modal applications. Access additional resources, including complete scripts, one-click fine-tuning templates, and community support to enhance your learning experience.
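
As a concrete reference point for the loading and LoRA steps the video covers, here is a minimal sketch of loading IDEFICS 2 and attaching a LoRA adapter with the Hugging Face transformers and peft libraries. This is an illustration, not the course's own script: the checkpoint "HuggingFaceM4/idefics2-8b" is the public IDEFICS 2 release, and the LoRA hyperparameters below are generic defaults, not the settings used in the tutorial.

import torch
from transformers import AutoProcessor, AutoModelForVision2Seq
from peft import LoraConfig, get_peft_model

model_id = "HuggingFaceM4/idefics2-8b"  # public IDEFICS 2 checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to reduce VRAM use
    device_map="auto",
)

# Attach low-rank adapters to the attention projections so only a small
# fraction of the weights is updated during fine-tuning (illustrative values).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the small trainable fraction

Training only the low-rank adapters is what makes fine-tuning an 8B-parameter multi-modal model feasible on a single GPU, which is also why VRAM requirements are a dedicated chapter below.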

Syllabus

Fine-tuning and server setup for multi-modal models
Prerequisites / recommended pre-watching
IDEFICS 2 Model Overview
Model loading, evaluation and LoRA setup
Evaluating OCR performance (see the inference sketch after this syllabus)
Evaluating multiple image inputs
Training / Fine-tuning
LLaVA Llama 3 Model Review
Multi-modal inference endpoint
VRAM Requirements for multi-modal models
IDEFICS 2 - my recommended model to build on
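
For the OCR-evaluation and inference chapters above, the following is a minimal sketch of prompting IDEFICS 2 with an image via the processor's chat template. It reuses the processor and model from the loading sketch earlier; the image file and prompt text are hypothetical stand-ins, not the course's test data.

from PIL import Image

# Reuses `processor` and `model` from the LoRA loading sketch above.
image = Image.open("receipt.png")  # hypothetical test image for an OCR check
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Transcribe all text in this image."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)
generated_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])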


Taught by

Trelis Research

Related Courses

LLaVA: The New Open Access Multimodal AI Model
1littlecoder via YouTube
Autogen and Local LLMs Create Realistic Stable Diffusion Model Autonomously
kasukanra via YouTube
Image Annotation with LLaVA and Ollama
Sam Witteveen via YouTube
Unraveling Multimodality with Large Language Models
Linux Foundation via YouTube
Efficient and Portable AI/LLM Inference on the Edge Cloud - Workshop
Linux Foundation via YouTube