YoVDO

Towards General Purpose Vision: Insights from Multimodal AI Systems

Offered By: SAIConference via YouTube

Tags

Artificial Intelligence Courses, Machine Learning Courses, Deep Learning Courses, Computer Vision Courses, Neural Networks Courses, Visual Programming Courses, Multimodal AI Courses

Course Description

Overview

Explore the evolution of multimodal AI systems in this 20-minute conference talk from the Future Technologies Conference 2023. Follow Derek Hoiem of the University of Illinois at Urbana-Champaign as he traces the development of models that can process text, images, audio, and more. Discover the impact of general-purpose models like GPT and their role in shaping AI's future. Learn about approaches such as Unified-IO and Visual Programming, which integrate multiple modalities to solve complex tasks. Examine the challenges and breakthroughs in creating these systems, from specialized architectures to comprehensive memory systems. Through real-world examples, understand the potential of multimodal AI and what it may take to reach human-like intelligence in AI systems. Topics covered include custom NLP tasks, the differences between NLP and vision data, vision model evolution, AI limitations, and the importance of memory in AI systems.

Syllabus

Introduction
How to make a custom NLP task
NLP vs Vision data
Vision model evolution
AI limitations
Memory
Conclusion


Taught by

SAIConference

Related Courses

Generative AI, from GANs to CLIP, with Python and Pytorch
Udemy
ODSC East 2022 Keynote by Luis Vargas, Ph.D. - The Big Wave of AI at Scale
Open Data Science via YouTube
Comparing AI Image Caption Models: GIT, BLIP, and ViT+GPT2
1littlecoder via YouTube
In Conversation with the Godfather of AI
Collision Conference via YouTube
LLaVA: The New Open Access Multimodal AI Model
1littlecoder via YouTube