YoVDO

Fine-tuning PaliGemma for Custom Object Detection

Offered By: Roboflow via YouTube

Tags

Computer Vision Courses, Machine Learning Courses, Object Detection Courses, Model Deployment Courses, Image Captioning Courses, Fine-Tuning Courses, Vision-Language Models Courses

Course Description

Overview

A step-by-step tutorial on fine-tuning PaliGemma, Google's open-source vision-language model, for custom object detection. The video walks through modifying Google's notebook to train PaliGemma on the handwritten digits and math operations dataset from RF100, explains the JSONL dataset format, and shows how to deploy the fine-tuned model for real-world inference. It also covers PaliGemma's capabilities in image captioning, visual question answering, and object detection, along with the model's limitations and important considerations when working with it. Additional resources include GitHub repositories, research papers, and community sessions.
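As a rough illustration of the JSONL dataset format mentioned above, the sketch below builds one training record and decodes it back into pixel boxes. It assumes the commonly documented PaliGemma detection convention: each record has `image`, `prefix`, and `suffix` keys, and each box is written as four `<locNNNN>` tokens in `y_min, x_min, y_max, x_max` order, scaled to 1024 bins. The file name, class name, image size, helper names, and the exact rounding used for binning are hypothetical choices made for this example and are not taken from the video.

```python
import json
import re

NUM_BINS = 1024  # assumed: location tokens span <loc0000> .. <loc1023>


def box_to_loc_tokens(x_min, y_min, x_max, y_max, width, height):
    """Encode a pixel-space box as four <locNNNN> tokens (y_min, x_min, y_max, x_max)."""
    def to_bin(value, size):
        # Truncating to an integer bin is an assumption; other tools may round.
        return min(NUM_BINS - 1, max(0, int(value / size * NUM_BINS)))

    coords = [
        to_bin(y_min, height),
        to_bin(x_min, width),
        to_bin(y_max, height),
        to_bin(x_max, width),
    ]
    return "".join(f"<loc{c:04d}>" for c in coords)


def make_record(image_name, boxes, width, height, class_names):
    """Build one JSONL record; `boxes` is a list of ((x0, y0, x1, y1), label) pairs."""
    suffix = " ; ".join(
        f"{box_to_loc_tokens(*xyxy, width, height)} {label}" for xyxy, label in boxes
    )
    prefix = "detect " + " ; ".join(class_names)
    return {"image": image_name, "prefix": prefix, "suffix": suffix}


LOC_PATTERN = re.compile(
    r"<loc(\d{4})><loc(\d{4})><loc(\d{4})><loc(\d{4})>\s*([^;<]+)"
)


def parse_suffix(suffix, width, height):
    """Decode a detection string back into ((x0, y0, x1, y1), label) pairs in pixels."""
    results = []
    for match in LOC_PATTERN.finditer(suffix):
        y0, x0, y1, x1 = (int(g) / NUM_BINS for g in match.groups()[:4])
        box = (x0 * width, y0 * height, x1 * width, y1 * height)
        results.append((box, match.group(5).strip()))
    return results


# Hypothetical example: one "plus" symbol in a 400x500 image.
record = make_record(
    "example.jpg", [((100, 50, 300, 250), "plus")], 400, 500, ["plus"]
)
line = json.dumps(record)  # one line of the .jsonl file
decoded = parse_suffix(record["suffix"], 400, 500)
```

Each record is written as a single line of the `.jsonl` file. Because coordinates are quantized into bins, decoded boxes can differ from the originals by up to roughly one bin width.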

Syllabus

- PaliGemma Capabilities
- Environment Setup
- Dataset Format
- Downloading Pre-trained Model
- Loading Dataset
- Training and Evaluating the Model
- Deploying the Model
- Important Considerations
- Outro
- Community Session June 6th, 2024 at 08:00 AM PST / 11:00 AM EST / PM CET: https://roboflow.stream


Taught by

Roboflow

Related Courses

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Launchpad via YouTube
Fine-tuning Florence-2: Microsoft's Multimodal Model for Custom Object Detection
Roboflow via YouTube
Florence-2: The Best Small Vision Language Model - Capabilities and Demo
Sam Witteveen via YouTube
Vision Language Models and PDFs: What You See Is What You Search - Haystack EU 2024
OpenSource Connections via YouTube
Mastering Google's PaliGemma VLM: Tips and Tricks for Success and Fine-Tuning
Sam Witteveen via YouTube