Image Captioning Python App with ViT and GPT2 Using Hugging Face Models - Applied Deep Learning
Offered By: 1littlecoder via YouTube
Course Description
Overview
Learn to create an image captioning Python application using Vision Transformer (ViT) and GPT-2 models from Hugging Face. Follow along as the tutorial guides you through building a Gradio app that generates descriptive captions for images. Explore the integration of Sachin's pre-trained model from the Hugging Face Model Hub, which pairs ViT for image encoding with GPT-2 for caption generation. By the end of this 25-minute tutorial, you will have deployed your own image captioning app to Hugging Face Spaces, gaining practical experience in applied deep learning and natural language processing. A minimal code sketch of the pipeline is shown below.
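The sketch below illustrates the general ViT + GPT-2 captioning pipeline wrapped in a Gradio interface. It is not the tutorial's exact code: the checkpoint name is an assumption (a public ViT/GPT-2 captioning model is used as a stand-in for the checkpoint shown in the video), and generation settings are illustrative defaults.

```python
# Minimal sketch of a ViT + GPT-2 image captioning Gradio app.
# Assumption: "nlpconnect/vit-gpt2-image-captioning" stands in for the
# checkpoint used in the tutorial (Sachin's model on the Hugging Face Hub).
import gradio as gr
from PIL import Image
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

model_id = "nlpconnect/vit-gpt2-image-captioning"  # assumed checkpoint
model = VisionEncoderDecoderModel.from_pretrained(model_id)
processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

def caption(image: Image.Image) -> str:
    # ViT encodes the image into patch embeddings; GPT-2 decodes them into text.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    output_ids = model.generate(pixel_values, max_length=32, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

demo = gr.Interface(
    fn=caption,
    inputs=gr.Image(type="pil"),
    outputs="text",
    title="Image Captioning with ViT + GPT-2",
)

if __name__ == "__main__":
    demo.launch()
```

Pushing this script (with a requirements.txt listing transformers, torch, and gradio) to a Hugging Face Space is enough to host the app publicly.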
Syllabus
Build Image Captioning Python App with ViT & GPT2 using Hugging Face Models | Applied Deep Learning
Taught by
1littlecoder
Related Courses
Deep Learning For Visual Computing (Indian Institute of Technology, Kharagpur via Swayam)
Literacy Essentials: Core Concepts Generative Adversarial Network (Pluralsight)
Machine Learning & Deep Learning Projects (The AI University via YouTube)
Implement Image Captioning with Recurrent Neural Networks (Pluralsight)
VirTex - Learning Visual Representations from Textual Annotations (Yannic Kilcher via YouTube)