YoVDO

Automatic Image Captioning with Vision Transformer and GPT-2

Offered By: Eran Feit via YouTube

Tags

Image Captioning Courses, Deep Learning Courses, Computer Vision Courses, Neural Networks Courses, PyTorch Courses, GPT-2 Courses, Transfer Learning Courses, Hugging Face Courses, Vision Transformers Courses

Course Description

Overview

Learn how to generate descriptive captions for images with Python and PyTorch in this 16-minute tutorial. The tutorial uses the pre-trained 'nlpconnect/vit-gpt2-image-captioning' model from Hugging Face, which pairs a Vision Transformer (ViT) for image processing with GPT-2 for text generation. It covers installing the environment and required Python libraries, loading the pre-trained model, processing images with the Vision Transformer, generating caption text with GPT-2 in PyTorch, and displaying each caption alongside its image. Links to the tutorial code and additional computer vision resources are provided. The result is hands-on practice implementing image captioning with widely used deep learning frameworks.
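
The workflow described above (load the pre-trained model, process an image with ViT, generate text with GPT-2, display the result) can be sketched roughly as follows. This is a minimal sketch and not the tutorial's own code: the package list, the helper names (clean_caption, caption_image, show_with_caption), the generation settings (max_length=16, num_beams=4), and the use of matplotlib for display are all assumptions made for illustration.

```python
"""Sketch: caption an image with nlpconnect/vit-gpt2-image-captioning.

Assumed setup (not taken from the tutorial):
    pip install torch transformers pillow matplotlib
"""
import sys

# Model name as given in the course description.
MODEL_ID = "nlpconnect/vit-gpt2-image-captioning"


def clean_caption(text: str) -> str:
    """Collapse stray whitespace left over after decoding (hypothetical helper)."""
    return " ".join(text.split())


def caption_image(image_path: str, max_length: int = 16, num_beams: int = 4) -> str:
    """Return a caption for one image. Generation settings are assumptions."""
    # Heavy imports are deferred so the module can be imported without them.
    import torch
    from PIL import Image
    from transformers import (
        AutoTokenizer,
        ViTImageProcessor,
        VisionEncoderDecoderModel,
    )

    # Load the ViT encoder + GPT-2 decoder, the image processor, and the tokenizer.
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_ID)
    processor = ViTImageProcessor.from_pretrained(MODEL_ID)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    model.eval()

    # ViT side: convert the image into a batch of normalized pixel tensors.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    pixel_values = pixel_values.to(device)

    # GPT-2 side: beam-search decoding conditioned on the image features.
    with torch.no_grad():
        output_ids = model.generate(
            pixel_values, max_length=max_length, num_beams=num_beams
        )

    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return clean_caption(text)


def show_with_caption(image_path: str, caption: str) -> None:
    """Display the image with its caption as the title (matplotlib is an assumption)."""
    import matplotlib.pyplot as plt
    from PIL import Image

    plt.imshow(Image.open(image_path).convert("RGB"))
    plt.title(caption)
    plt.axis("off")
    plt.show()


if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]
    result = caption_image(path)
    print(result)
    show_with_caption(path, result)
```

Saved under a hypothetical filename such as caption.py, the sketch would be run as `python caption.py path/to/image.jpg`. The first call downloads the model weights from Hugging Face, so it needs network access.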

Syllabus

Automatic Image Captioning with ViT-GPT2


Taught by

Eran Feit

Related Courses

Deep Learning For Visual Computing
Indian Institute of Technology, Kharagpur via Swayam
Literacy Essentials: Core Concepts Generative Adversarial Network
Pluralsight
Machine Learning & Deep Learning Projects
The AI University via YouTube
Implement Image Captioning with Recurrent Neural Networks
Pluralsight
VirTex: Learning Visual Representations from Textual Annotations
Yannic Kilcher via YouTube