YoVDO

Convergence of Vision and Language in AI - Recent Developments and Projects

Offered By: Aleksa Gordić - The AI Epiphany via YouTube

Tags

Computer Vision Courses, Machine Learning Courses, Deep Learning Courses, Multimodal AI Courses, Vision Transformers Courses

Course Description

Overview

Explore the convergence of vision and language in artificial intelligence in this 55-minute talk by Lucas Beyer of Google DeepMind. The talk covers Beyer's personal journey, the motivation for integrating vision and language, and how language can serve as an API for vision. It then introduces LiT tuning, examines the convergence of architectures in AI, and presents PaLI, a vision-language model, touching on related research such as Vision Transformers (ViT) and other approaches to combining visual and linguistic information processing.

Syllabus

Lucas's story
Motivation
Language as API for vision
LiT tuning
Convergence of architectures
PaLI vision language model


Taught by

Aleksa Gordić - The AI Epiphany

Related Courses

2D image processing
Higher School of Economics via Coursera
3D Reconstruction - Multiple Viewpoints
Columbia University via Coursera
3D Reconstruction - Single Viewpoint
Columbia University via Coursera
AI-900: Microsoft Certified Azure AI Fundamentals
A Cloud Guru
TensorFlow Developer Certificate Exam Prep
A Cloud Guru