Tiny Text and Vision Models - Fine-Tuning and API Setup

Offered By: Trelis Research via YouTube

Tags

Fine-Tuning Courses
API Deployment Courses
vLLM Courses

Course Description

Overview

Explore the intricacies of fine-tuning and deploying tiny text and vision models in this 44-minute tutorial. Dive into the architecture of multi-modal models, focusing on the Moondream model's components: its vision encoder (SigLIP), MLP (vision projection), and language model (Phi). Learn how to apply LoRA adapters to multi-modal models and follow along with a hands-on fine-tuning notebook demo. Discover techniques for deploying custom APIs for multi-modal models, utilizing vLLM, and training models from scratch. Gain insights into multi-modal datasets and access a wealth of video resources to further your understanding of advanced vision and language processing techniques.
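As background for the LoRA topic covered in the tutorial, the core idea can be sketched in a few lines: a frozen base weight matrix W is augmented with a trainable low-rank update B·A, scaled by alpha/r. This is a minimal stdlib-only illustration, not Moondream's actual implementation; all dimensions and names here are placeholders.

```python
# Minimal sketch of a LoRA-adapted linear layer: y = W x + (alpha/r) * B (A x).
# W is frozen; only the small matrices A (r x d_in) and B (d_out x r) train.
# Dimensions and values are illustrative, not the real Moondream/Phi config.

def matvec(M, x):
    """Plain matrix-vector product over nested lists."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_linear(W, A, B, x, alpha=16, r=2):
    base = matvec(W, x)                  # frozen base projection W @ x
    low_rank = matvec(B, matvec(A, x))   # trainable update B @ (A @ x)
    scale = alpha / r                    # standard LoRA scaling factor
    return [b + scale * u for b, u in zip(base, low_rank)]

# With A initialized to zeros (as in LoRA), the adapter starts as a no-op:
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.0, 0.0], [0.0, 0.0]]
B = [[0.5, 0.5], [0.5, 0.5]]
print(lora_linear(W, A, B, [2.0, 3.0]))
```

Because B·A starts at zero, fine-tuning begins exactly at the pretrained model's behavior, which is why LoRA can be bolted onto an existing multi-modal checkpoint without disturbing it at initialization.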

Syllabus

Fine-tuning tiny multi-modal models
Moondream server demo
Video Overview
Multi-modal model architecture
Moondream architecture
Moondream vision encoder SigLIP
Moondream MLP vision projection
Moondream Language Model Phi
Applying LoRA adapters to a multi-modal model
Fine-tuning notebook demo
Deploying a custom API for multi-modal models
vLLM
Training a multi-modal model from scratch
Multi-modal datasets
Video resources
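The syllabus above traces Moondream's pipeline: the SigLIP vision encoder produces patch features, an MLP projects them into the language model's embedding space, and Phi consumes the combined sequence. A hedged stdlib-only sketch of that data flow (all functions, shapes, and the ReLU choice are illustrative assumptions, not the actual architecture):

```python
# Illustrative data flow: vision encoder features -> MLP projection -> LM input.
# Dimensions and the activation are placeholders, not Moondream's real config.

def linear(W, x):
    """Plain matrix-vector product over nested lists."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mlp_projection(patch_features, W1, W2):
    """Two-layer MLP mapping vision-encoder features into the LM embedding space."""
    hidden = [max(0.0, h) for h in linear(W1, patch_features)]  # ReLU, illustrative
    return linear(W2, hidden)

def build_lm_input(image_embeds, text_embeds):
    """Prepend projected image embeddings to the text token embeddings,
    so the language model attends over both modalities as one sequence."""
    return image_embeds + text_embeds

# One projected "patch" embedding followed by two text token embeddings:
I = [[1.0, 0.0], [0.0, 1.0]]
img = [mlp_projection([1.0, -1.0], I, I)]
txt = [[0.2, 0.2], [0.3, 0.3]]
print(build_lm_input(img, txt))
```

The key design point the course highlights is that only the small projection MLP has to be learned to glue a pretrained vision encoder to a pretrained language model.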


Taught by

Trelis Research
