YoVDO

Mastering Google's PaliGemma VLM: Tips and Tricks for Success and Fine-Tuning

Offered By: Sam Witteveen via YouTube

Tags

Machine Learning Courses, Computer Vision Courses, Transformers Courses, Fine-Tuning Courses, Hugging Face Courses, Vision-Language Models Courses

Course Description

Overview

Explore PaliGemma, Google's vision-language model (VLM), in this video tutorial. Learn about the model's architecture, capabilities, and applications through an overview of the PaLI-3 and SigLIP papers. Review the three pre-trained checkpoints and the different sizes and releases of PaliGemma. Try the model in a Hugging Face Spaces demo and look at the ScreenAI datasets. The coding sessions cover using PaliGemma with Transformers and fine-tuning it. Colab notebooks for inference and fine-tuning are provided so you can follow along and apply the model yourself.

Syllabus

Intro
What is PaliGemma?
PaLI-3 Paper
SigLIP Paper
Hugging Face Blog: PaliGemma
PaliGemma: Three Pre-trained Checkpoints
PaliGemma: Different Sizes and Releases
PaliGemma Hugging Face Spaces Demo
ScreenAI Datasets
Code Time
Using PaliGemma with Transformers
PaliGemma Fine-Tuning


Taught by

Sam Witteveen

Related Courses

The AI Engineer Path
Scrimba
Developing Generative AI Applications with Python
IBM via edX
Models and Platforms for Generative AI
IBM via edX
Intro to Hugging Face
Codecademy
Large Language Models: Application through Production
Databricks via edX