YoVDO

Tiny Text and Vision Models - Fine-Tuning and API Setup

Offered By: Trelis Research via YouTube

Tags

Fine-Tuning Courses · API Deployment Courses · vLLM Courses

Course Description

Overview

Explore the intricacies of fine-tuning and deploying tiny text and vision models in this 44-minute tutorial. Dive into the architecture of multi-modal models, focusing on the Moondream model's components: its vision encoder (SigLIP), MLP vision projection, and language model (Phi). Learn how to apply LoRA adapters to multi-modal models and follow along with a hands-on fine-tuning notebook demo. Discover techniques for deploying custom APIs for multi-modal models, utilizing vLLM, and training models from scratch. Gain insights into multi-modal datasets and access a wealth of video resources to further your understanding of advanced vision and language processing techniques.
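The three components described above fit together in a simple pipeline: the vision encoder turns image patches into embeddings, the MLP vision projection maps those embeddings into the language model's embedding space, and the projected "image tokens" are fed to the language model alongside the text tokens. A minimal schematic in PyTorch, with stand-in modules and illustrative dimensions (the real SigLIP encoder is a ViT and the real Phi backbone is a full transformer LM; none of the layer sizes below are Moondream's actual values):

```python
import torch
import torch.nn as nn

class TinyMultiModalSketch(nn.Module):
    """Schematic of a Moondream-style model (illustrative, not the real code):
    vision encoder -> MLP vision projection -> language model."""

    def __init__(self, vision_dim=768, lm_dim=2048):
        super().__init__()
        # Stand-in for the SigLIP vision encoder (the real one is a ViT).
        self.vision_encoder = nn.Linear(3 * 14 * 14, vision_dim)
        # MLP "vision projection" bridging the two embedding spaces.
        self.vision_projection = nn.Sequential(
            nn.Linear(vision_dim, lm_dim),
            nn.GELU(),
            nn.Linear(lm_dim, lm_dim),
        )
        # Stand-in for the Phi language-model backbone.
        self.language_model = nn.TransformerEncoderLayer(
            d_model=lm_dim, nhead=8, batch_first=True
        )

    def forward(self, image_patches, text_embeds):
        # Encode image patches, project them into the LM embedding space,
        # then prepend them to the text token embeddings.
        vis = self.vision_projection(self.vision_encoder(image_patches))
        tokens = torch.cat([vis, text_embeds], dim=1)
        return self.language_model(tokens)

model = TinyMultiModalSketch()
patches = torch.randn(1, 16, 3 * 14 * 14)  # 16 flattened image patches
text = torch.randn(1, 8, 2048)             # 8 text token embeddings
out = model(patches, text)
print(out.shape)  # torch.Size([1, 24, 2048]) -- 16 image + 8 text tokens
```

The key design point the tutorial highlights is that only the projection MLP needs to be purpose-built; the encoder and language model can each be pretrained unimodal models.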

Syllabus

Fine-tuning tiny multi-modal models
Moondream server demo
Video Overview
Multi-modal model architecture
Moondream architecture
Moondream vision encoder SigLIP
Moondream MLP vision projection
Moondream Language Model Phi
Applying LoRA adapters to a multi-modal model
Fine-tuning notebook demo
Deploying a custom API for multi-modal models
vLLM
Training a multi-modal model from scratch
Multi-modal datasets
Video resources
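To make the "Applying LoRA adapters to a multi-modal model" step concrete, here is a from-scratch sketch of the LoRA technique itself (not the peft library's implementation): the pretrained weight W is frozen, and only two low-rank factors A and B are trained, so the effective weight becomes W + (alpha / r) · BA. All layer sizes are illustrative:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA adapter wrapping a frozen linear layer (a sketch of the
    technique; real fine-tuning would typically use the peft library)."""

    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weight
        self.scale = alpha / r
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x):
        # B is initialized to zero, so BA starts at zero and the adapter
        # initially leaves the base model's behaviour unchanged.
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

# Wrap a projection layer, as one might for the LM blocks of a
# multi-modal model during fine-tuning (sizes are hypothetical).
layer = LoRALinear(nn.Linear(64, 64))
x = torch.randn(2, 64)
out = layer(x)

trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(trainable, total)  # 1024 trainable of 5184 total parameters
```

Because only the low-rank factors are trained, the trainable parameter count stays a small fraction of the total, which is what makes LoRA practical for fine-tuning even tiny multi-modal models on modest hardware.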


Taught by

Trelis Research

Related Courses

TensorFlow: Working with NLP
LinkedIn Learning
Introduction to Video Editing - Video Editing Tutorials
Great Learning via YouTube
HuggingFace Crash Course - Sentiment Analysis, Model Hub, Fine Tuning
Python Engineer via YouTube
GPT3 and Finetuning the Core Objective Functions - A Deep Dive
David Shapiro ~ AI via YouTube
How to Build a Q&A AI in Python - Open-Domain Question-Answering
James Briggs via YouTube