Mastering Google's PaliGemma VLM: Tips and Tricks for Success and Fine-Tuning
Offered By: Sam Witteveen via YouTube
Course Description
Overview
Explore PaliGemma, Google's vision language model, in this video tutorial. Learn about the model's architecture, capabilities, and applications through an overview of the PaLI-3 and SigLIP papers. Discover the three pre-trained checkpoints and the different sizes and releases of PaliGemma. Try the model in a Hugging Face Spaces demo and explore the ScreenAI datasets. Then work through practical coding sessions on using PaliGemma with Transformers and on fine-tuning it. Colab notebooks for inference and fine-tuning are provided so you can follow along and apply the model yourself.
Syllabus
Intro
What is PaliGemma?
PaLI-3 Paper
SigLIP Paper
Hugging Face Blog: PaliGemma
PaliGemma: Three Pre-trained Checkpoints
PaliGemma: Different Sizes and Releases
PaliGemma Hugging Face Spaces Demo
ScreenAI Datasets
Code Time
Using PaliGemma with Transformers
PaliGemma Finetuning
Taught by
Sam Witteveen
Related Courses
Hugging Face on Azure - Partnership and Solutions Announcement
Microsoft via YouTube
Question Answering in Azure AI - Custom and Prebuilt Solutions - Episode 49
Microsoft via YouTube
Open Source Platforms for MLOps
Duke University via Coursera
Masked Language Modelling - Retraining BERT with Hugging Face Trainer - Coding Tutorial
rupert ai via YouTube
Masked Language Modelling with Hugging Face - Microsoft Sentence Completion - Coding Tutorial
rupert ai via YouTube