YoVDO

Emu Video Generation - From MAE Pre-training to Multimodal Embeddings

Offered By: Aleksa Gordić - The AI Epiphany via YouTube

Tags

Computer Vision Courses
Self-supervised Learning Courses
Multimodal AI Courses

Course Description

Overview

A 55-minute talk featuring Ishan Misra from Meta on self-supervised learning and multimodal data, with a focus on the recent Emu Video project. The talk covers the effectiveness of MAE pre-training for billion-scale pretraining, ImageBind's unified embedding approach, and the Emu Video generation model. It closes with qualitative comparisons and human evaluations of the generated videos, followed by a Q&A session.

Syllabus

00:00 - Intro
00:42 - Sponsor: Hyperstack GPUs
02:23 - Talk intro
04:42 - The effectiveness of MAE pre-training for billion-scale pretraining
12:58 - ImageBind: One Embedding to Rule them All
29:26 - Emu Video
50:39 - Qualitative Comparisons, human eval
54:30 - Q&A / outro


Taught by

Aleksa Gordić - The AI Epiphany

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Computational Photography
Georgia Institute of Technology via Coursera
Einführung in Computer Vision
Technische Universität München (Technical University of Munich) via Coursera
Introduction to Computer Vision
Georgia Institute of Technology via Udacity