Mastering Google's PaliGemma VLM: Tips and Tricks for Success and Fine-Tuning
Offered By: Sam Witteveen via YouTube
Course Description
Overview
Explore Google's vision language model PaliGemma in this video tutorial. Learn about the model's architecture, capabilities, and applications through an overview of the PaLI-3 and SigLIP papers. Discover the three pre-trained checkpoints and the various sizes and releases of PaliGemma. Try the model in a Hugging Face Spaces demo and explore the ScreenAI datasets. Then work through practical coding sessions on using PaliGemma with Transformers and on fine-tuning. The provided resources include Colab notebooks for inference and fine-tuning.
Syllabus
Intro
What is PaliGemma?
PaLI-3 Paper
SigLIP Paper
Hugging Face Blog: PaliGemma
PaliGemma: Three Pre-trained Checkpoints
PaliGemma: Different Sizes and Releases
PaliGemma Hugging Face Spaces Demo
ScreenAI Datasets
Code Time
Using PaliGemma with Transformers
PaliGemma Finetuning
Taught by
Sam Witteveen
Related Courses
The AI Engineer Path (Scrimba)
Developing Generative AI Applications with Python (IBM via edX)
Models and Platforms for Generative AI (IBM via edX)
Intro to Hugging Face (Codecademy)
Large Language Models: Application through Production (Databricks via edX)