YoVDO

Image Captioning Python App with ViT and GPT2 Using Hugging Face Models - Applied Deep Learning

Offered By: 1littlecoder via YouTube

Tags

Image Captioning, Deep Learning, Python, GPT-2, Model Deployment, Gradio, Hugging Face, Vision Transformers

Course Description

Overview

Learn to build an image captioning application in Python using a Vision Transformer (ViT) encoder and a GPT-2 decoder from Hugging Face. The tutorial walks through building a Gradio app that generates descriptive captions for images. It uses Sachin's pre-trained model from the Hugging Face Model Hub, which combines ViT for image encoding with GPT-2 for text generation. By the end of this 25-minute tutorial, you will have deployed your own image captioning app on Hugging Face Spaces (the Model Hub hosts the model, while Spaces hosts the Gradio app) and gained practical experience in applied deep learning and natural language processing.
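The workflow described in this overview can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tutorial's actual code: the checkpoint name `nlpconnect/vit-gpt2-image-captioning` is a publicly available ViT + GPT-2 captioning model used here as a stand-in for the model featured in the video, and the helper names `tidy_caption`, `load_captioner`, `build_demo`, and `main` are hypothetical. It assumes `transformers`, `torch`, `gradio`, and `Pillow` are installed.

```python
from typing import Callable

# Assumption: a stand-in ViT + GPT-2 checkpoint, not necessarily the one used in the video.
DEFAULT_CHECKPOINT = "nlpconnect/vit-gpt2-image-captioning"


def tidy_caption(text: str) -> str:
    """Collapse repeated whitespace and capitalize the first letter."""
    text = " ".join(text.split())
    return text[:1].upper() + text[1:] if text else text


def load_captioner(checkpoint: str = DEFAULT_CHECKPOINT) -> Callable:
    """Load the model once and return a function mapping a PIL image to a caption."""
    # Heavy dependencies are imported lazily so the helpers above stay lightweight.
    import torch
    from transformers import (
        AutoTokenizer,
        ViTImageProcessor,
        VisionEncoderDecoderModel,
    )

    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    processor = ViTImageProcessor.from_pretrained(checkpoint)
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model.eval()

    def caption(image) -> str:
        # Gradio passes None when no image has been provided.
        if image is None:
            return ""
        # ViT encodes the image into pixel features.
        inputs = processor(images=image.convert("RGB"), return_tensors="pt")
        # GPT-2 decodes those features into caption tokens.
        with torch.no_grad():
            output_ids = model.generate(
                inputs.pixel_values, max_length=32, num_beams=4
            )
        text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
        return tidy_caption(text)

    return caption


def build_demo(caption_fn: Callable):
    """Wrap a captioning function in a simple Gradio interface."""
    import gradio as gr

    return gr.Interface(
        fn=caption_fn,
        inputs=gr.Image(type="pil"),
        outputs=gr.Textbox(label="Caption"),
        title="Image Captioning with ViT and GPT-2",
    )


def main() -> None:
    """Download the model (on first run) and start the local web app."""
    build_demo(load_captioner()).launch()
```

Calling `main()` from a script starts the app locally. To host it on Hugging Face Spaces, the usual approach is to place this code in a file named `app.py`, call `main()` at the bottom, and list the dependencies in a `requirements.txt` file.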

Syllabus

Build Image Captioning Python App with ViT & GPT2 using Hugging Face Models | Applied Deep Learning


Taught by

1littlecoder

Related Courses

Neural Networks for Machine Learning
University of Toronto via Coursera
Machine Learning Techniques (機器學習技法)
National Taiwan University via Coursera
Machine Learning Capstone: An Intelligent Application with Deep Learning
University of Washington via Coursera
Applied Problems in Data Analysis (Прикладные задачи анализа данных)
Moscow Institute of Physics and Technology via Coursera
Leading Ambitious Teaching and Learning
Microsoft via edX