Training and Serving Custom Multi-modal Models - IDEFICS 2 and LLaVA Llama 3
Offered By: Trelis Research via YouTube
Course Description
Overview
Learn to train and serve custom multi-modal models using IDEFICS 2 and LLaVA Llama 3 in this comprehensive tutorial video. Explore the IDEFICS 2 model overview, model loading techniques, and LoRA setup. Evaluate OCR performance and handle multiple image inputs. Dive into the training and fine-tuning process, and review the LLaVA Llama 3 model. Set up a multi-modal inference endpoint and understand VRAM requirements for these advanced models. Discover why IDEFICS 2 is recommended as a foundation for building custom multi-modal applications. Access additional resources, including complete scripts, one-click fine-tuning templates, and community support to enhance your learning experience.
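The LoRA setup mentioned above rests on a simple idea: freeze the base weight matrix W and train only a low-rank update B·A on top of it. A minimal NumPy sketch of that idea (dimensions and rank chosen purely for illustration, not taken from the video):

```python
import numpy as np

# Minimal LoRA illustration: a frozen weight W plus a trainable
# low-rank update B @ A of rank r. Fine-tuning frameworks such as
# PEFT apply this per attention/projection layer; here it is one layer.
d, k, r = 16, 16, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(d, k))   # frozen base weight
A = rng.normal(size=(r, k))   # trainable, rank r
B = np.zeros((d, r))          # zero-initialized, so the update starts at 0

def lora_forward(x):
    # Adapted layer: x @ (W + B A)^T
    return x @ (W + B @ A).T

x = rng.normal(size=(2, k))
# With B still zero, the adapted layer matches the frozen base layer.
assert np.allclose(lora_forward(x), x @ W.T)
# Far fewer trainable parameters than full fine-tuning of W.
assert A.size + B.size < W.size
```

Only A and B receive gradients during fine-tuning, which is why LoRA fits large multi-modal models like IDEFICS 2 into modest GPU memory.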
Syllabus
Fine-tuning and server setup for multi-modal models
Prerequisites (pre-watching)
IDEFICS 2 Model Overview
Model loading, evaluation and LoRA setup
Evaluating OCR performance
Evaluating multiple image inputs
Training / Fine-tuning
LLaVA Llama 3 Model Review
Multi-modal inference endpoint
VRAM Requirements for multi-modal models
IDEFICS 2 - my recommended model to build on
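The VRAM requirements item above can be roughed out from parameter count alone. A back-of-envelope sketch for a model in the 8B-parameter class (weights only, ignoring activations and KV cache; the numbers are estimates, not measurements from the video):

```python
# Back-of-envelope VRAM needed just to hold model weights
# at common precisions. Real serving needs headroom beyond this
# for activations, the KV cache, and the vision encoder inputs.
def weight_vram_gb(n_params: float, bytes_per_param: float) -> float:
    return n_params * bytes_per_param / 1024**3

n = 8e9  # assumed ~8B parameters
for name, b in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name}: ~{weight_vram_gb(n, b):.1f} GB for weights alone")
```

Halving the bytes per parameter halves the weight footprint, which is why quantized serving is the usual route to fitting such models on a single consumer GPU.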
Taught by
Trelis Research
Related Courses
How to Do Stable Diffusion LORA Training by Using Web UI on Different Models
Software Engineering Courses - SE Courses via YouTube
MicroPython & WiFi
Kevin McAleer via YouTube
Building a Wireless Community Sensor Network with LoRa
Hackaday via YouTube
ComfyUI - Node Based Stable Diffusion UI
Olivio Sarikas via YouTube
AI Masterclass for Everyone - Stable Diffusion, ControlNet, Depth Map, LORA, and VR
Hugh Hou via YouTube