Tiny Text and Vision Models - Fine-Tuning and API Setup
Offered By: Trelis Research via YouTube
Course Description
Overview
Explore the intricacies of fine-tuning and deploying tiny text and vision models in this 44-minute tutorial. Dive into the architecture of multi-modal models, focusing on the Moondream model's components, including its vision encoder (SigLIP), MLP (vision projection), and language model (Phi). Learn how to apply LoRA adapters to multi-modal models and follow along with a hands-on fine-tuning notebook demo. Discover techniques for deploying custom APIs for multi-modal models, using vLLM, and training models from scratch. Gain insights into multi-modal datasets and access a wealth of video resources to further your understanding of advanced vision and language processing techniques.
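To make the architecture concrete, here is a minimal inference sketch, assuming the public vikhyatk/moondream2 checkpoint on Hugging Face; the method names (encode_image, answer_question) come from that repository's remote code and may differ between revisions.

```python
# Minimal Moondream inference sketch (assumptions: model ID and remote-code
# methods are taken from the public vikhyatk/moondream2 checkpoint).
from PIL import Image
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vikhyatk/moondream2"  # assumed checkpoint name
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

image = Image.open("example.jpg")  # any local test image

# The vision encoder (SigLIP) turns the image into patch embeddings, and the
# MLP vision projection maps them into the language model's embedding space.
image_embeds = model.encode_image(image)

# The language model (Phi) answers conditioned on the projected image tokens.
print(model.answer_question(image_embeds, "What is in this image?", tokenizer))
```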
Syllabus
 Fine-tuning tiny multi-modal models
 Moondream server demo
 Video Overview
 Multi-modal model architecture
 Moondream architecture
 Moondream vision encoder (SigLIP)
 Moondream MLP (vision projection)
 Moondream language model (Phi)
 Applying LoRA adapters to a multi-modal model (see the first sketch after this syllabus)
 Fine-tuning notebook demo
 Deploying a custom API for multi-modal models (see the vLLM client sketch after this syllabus)
 vLLM
 Training a multi-modal model from scratch
 Multi-modal datasets
 Video resources
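As referenced in the syllabus item above, the following is a hedged sketch of applying LoRA adapters to a multi-modal model with the peft library; the target_modules names are hypothetical and must be matched to the actual linear-layer names inside whichever checkpoint you load.

```python
# Hedged LoRA sketch using peft; checkpoint name and target module names
# are assumptions, not confirmed details from the video.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "vikhyatk/moondream2", trust_remote_code=True  # assumed checkpoint
)

lora_config = LoraConfig(
    r=16,            # adapter rank: lower means fewer trainable parameters
    lora_alpha=32,   # scaling applied to the adapter output
    lora_dropout=0.05,
    # Hypothetical names; inspect model.named_modules() to find the real ones.
    target_modules=["q_proj", "k_proj", "v_proj"],
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights train
```

Because the base weights stay frozen and only the small adapter matrices are trained, this approach keeps the memory footprint of fine-tuning modest, which is part of the appeal for tiny models like Moondream.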
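For the deployment and vLLM items, here is a hedged sketch of querying an OpenAI-compatible vLLM server with an image. It assumes the server was started with something like `vllm serve <model-id>` and that the served model has multi-modal support in vLLM (coverage is architecture-specific); the model name below is a placeholder, not a confirmed deployment from the video.

```python
# Hedged sketch: sending an image to a vLLM OpenAI-compatible endpoint.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("example.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="placeholder-multi-modal-model",  # whatever `vllm serve` was given
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```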
Taught by
Trelis Research
Related Courses
API Design and Fundamentals of Google Cloud's Apigee API Platform (Google Cloud via Coursera)
API Development on Google Cloud's Apigee API Platform (Google Cloud via Coursera)
On Premises Management, Security, and Upgrade with Google Cloud's Apigee API Platform (Google Cloud via Coursera)
Create a REST API With Node JS and Mongo DB (Udemy)
AWS Networking and the API Gateway (Pluralsight)