Lumiere: Space-Time Diffusion Model for Video Generation

Offered By: Yannic Kilcher via YouTube

Tags

Diffusion Models Courses, Artificial Intelligence Courses, Machine Learning Courses, Deep Learning Courses, Computer Vision Courses, Image Processing Courses

Course Description

Overview

Explore a detailed explanation of Google Research's Lumiere, a groundbreaking text-to-video diffusion model designed to generate realistic and coherent motion in synthesized videos. Dive into the innovative Space-Time U-Net architecture that enables the creation of entire video durations in a single pass, overcoming limitations of existing keyframe-based approaches. Learn about the model's ability to process videos at multiple space-time scales, its state-of-the-art performance in text-to-video generation, and its versatility in various content creation tasks. Examine the technical aspects, including temporal down- and up-sampling, leveraging pre-trained text-to-image models, and applications such as image-to-video conversion, video inpainting, and stylized generation. Gain insights into the training, evaluation, and potential societal impacts of this cutting-edge technology in the field of AI-driven video synthesis.
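
To make the architectural idea concrete, here is a minimal, hypothetical PyTorch sketch (not Lumiere's actual code) of a factorized space-time block: a spatial convolution applied to each frame, followed by a temporal convolution that down-samples along the time axis. This is the kind of mechanism that lets a Space-Time U-Net process the video at progressively coarser space-time scales. All names and shapes below are illustrative assumptions.

```python
# A minimal sketch of factorized (2+1)D space-time processing,
# illustrating the temporal down-sampling idea behind STUNet.
# Not Lumiere's implementation; names and shapes are assumptions.
import torch
import torch.nn as nn

class SpaceTimeBlock(nn.Module):
    """Spatial conv per frame, then a temporal conv that down-samples time."""

    def __init__(self, channels: int, temporal_stride: int = 2):
        super().__init__()
        # Spatial conv: kernel size 1 in time, so each frame is processed
        # independently (this part could come from a pre-trained 2D model).
        self.spatial = nn.Conv3d(channels, channels,
                                 kernel_size=(1, 3, 3), padding=(0, 1, 1))
        # Temporal conv: mixes information across frames and halves the
        # temporal resolution when temporal_stride=2.
        self.temporal = nn.Conv3d(channels, channels,
                                  kernel_size=(3, 1, 1),
                                  stride=(temporal_stride, 1, 1),
                                  padding=(1, 0, 0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, time, height, width)
        return self.temporal(torch.relu(self.spatial(x)))

# Example: halving the temporal resolution of a 16-frame clip.
video = torch.randn(1, 64, 16, 32, 32)              # (B, C, T, H, W)
out = SpaceTimeBlock(64, temporal_stride=2)(video)
print(out.shape)                                     # torch.Size([1, 64, 8, 32, 32])
```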

Syllabus

- Introduction
- Problems with keyframes
- Space-Time U-Net (STUNet)
- Extending U-Nets to video
- MultiDiffusion for SSR prediction fusion (see the sketch after this syllabus)
- Stylized generation by swapping weights
- Training & Evaluation
- Societal Impact & Conclusion
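
The MultiDiffusion chapter covers how the spatial super-resolution (SSR) model, which runs on short overlapping temporal windows, has its per-window predictions blended into one coherent video. Below is a minimal sketch of that fusion idea, assuming simple uniform averaging over the overlaps; the function, window sizes, and shapes are illustrative assumptions, not taken from the paper's code.

```python
# A minimal sketch of MultiDiffusion-style fusion: average the SSR
# predictions from overlapping temporal windows, frame by frame, to
# smooth boundary artifacts between segments. Illustrative only.
import numpy as np

def fuse_windows(predictions, starts, num_frames):
    """Average overlapping per-window predictions into one video.

    predictions: list of arrays shaped (window_len, H, W, C)
    starts:      start frame index of each window
    num_frames:  total number of frames in the output video
    """
    fused = np.zeros((num_frames,) + predictions[0].shape[1:])
    counts = np.zeros(num_frames)
    for pred, s in zip(predictions, starts):
        t = len(pred)
        fused[s:s + t] += pred     # accumulate each window's prediction
        counts[s:s + t] += 1       # track how many windows cover each frame
    return fused / counts[:, None, None, None]

# Example: three 8-frame windows with stride 4 covering a 16-frame video.
windows = [np.random.rand(8, 64, 64, 3) for _ in range(3)]
video = fuse_windows(windows, starts=[0, 4, 8], num_frames=16)
print(video.shape)  # (16, 64, 64, 3)
```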


Taught by

Yannic Kilcher

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Artificial Intelligence for Robotics
Stanford University via Udacity
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent