
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks

Offered By: USC Information Sciences Institute via YouTube

Tags

Computer Vision, Image Synthesis, Image Segmentation, Vision-Language Models

Course Description

Overview

In this 49-minute talk, Jiasen Lu of AI2 presents Unified-IO, described as the first neural model capable of performing a wide range of tasks across computer vision, image synthesis, vision-and-language, and natural language processing. The model homogenizes each task's inputs and outputs into token sequences, which lets a single architecture cover all of these tasks. The talk walks through the model's architecture, training objective, datasets and implementation, and pre-training distribution. It then turns to evaluation, including the GRIT benchmark, and presents results on semantic segmentation, depth estimation, object detection, image inpainting, and segmentation-based image generation, closing with a summary and a look at the task distribution.
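
To make the idea of turning task outputs into token sequences concrete, here is a minimal sketch of how a bounding box might be serialized as discrete tokens. It is an illustration only, not the talk's implementation: the bin count, the `<loc_N>` token naming, the coordinate ordering, and the helper names `coord_to_token`, `box_to_tokens`, and `token_to_coord` are all assumptions made for this example.

```python
NUM_BINS = 1000  # assumed number of discrete coordinate bins


def coord_to_token(value: float, size: float, num_bins: int = NUM_BINS) -> str:
    """Map a pixel coordinate to a discrete location token (hypothetical naming)."""
    frac = min(max(value / size, 0.0), 1.0)           # normalize and clamp to [0, 1]
    index = min(int(frac * num_bins), num_bins - 1)   # pick a bin, keeping 1.0 in range
    return f"<loc_{index}>"


def box_to_tokens(box, width: float, height: float, label: str) -> list:
    """Serialize an (x0, y0, x1, y1) box plus its class label as a token list."""
    x0, y0, x1, y1 = box
    return [
        coord_to_token(y0, height),
        coord_to_token(x0, width),
        coord_to_token(y1, height),
        coord_to_token(x1, width),
        label,
    ]


def token_to_coord(token: str, size: float, num_bins: int = NUM_BINS) -> float:
    """Invert coord_to_token, returning the center of the token's bin in pixels."""
    index = int(token[len("<loc_"):-1])
    return (index + 0.5) / num_bins * size


tokens = box_to_tokens((32, 64, 160, 192), width=256, height=256, label="cat")
print(tokens)
```

Once every output is a flat list like this, a detection target looks to the model like any other text sequence, which is the sense in which diverse tasks can share one sequence-to-sequence interface. Decoding loses at most one bin width of precision per coordinate.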

Syllabus

Intro
Single-Task Model vs. Unified Model
Single-Task Model for Vision
Image Output Quantization
Text Input for Different Tasks
Model Details
Objective
Dataset and Implementations
Pre-training Distribution
Evaluation
GRIT requires diverse skills
Results
Semantic Segmentation
Depth Estimation
Object Detection
Image Inpainting
Segmentation-based image generation
Summary
Tasks Distribution


Taught by

USC Information Sciences Institute

Related Courses

Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Image and Video Processing: From Mars to Hollywood with a Stop at the Hospital
Duke University via Coursera
Fundamentals of Digital Image and Video Processing
Northwestern University via Coursera
医学图像处理技术 Medical Image Analysis
Shanghai Jiao Tong University via Coursera
Image Processing and Analysis for Life Scientists
École Polytechnique Fédérale de Lausanne via edX