YoVDO

Learning the Depths of Moving People by Watching Frozen People

Offered By: Launchpad via YouTube

Tags

Object Detection Courses, Machine Learning Courses, Computer Vision Courses

Course Description

Overview

Explore a 22-minute Launchpad video on predicting depth for moving people by learning from scenes in which people hold still. Learn why traditional stereo triangulation, which assumes a static scene, struggles when the subject moves, and how single-view depth prediction with multi-view supervision works around that limitation. See how internet images are turned into training data and how the Mannequin Challenge is used to build a dataset for human depth prediction. Follow the progression from statues to people in depth estimation, and understand how the model is trained using RGB-only input. See how performance improves as the model is given more input, and how pseudo-depth maps are generated. Finally, look at practical applications of the technique.

Syllabus

Intro
Goal
Where Could We Use This?
Existing Technologies
SLAM/MVS
Traditional Stereo Triangulation
Triangulation Here... Not So Good!
Single View Depth Prediction Using Multi-View Supervision
Internet Images Into Data
After Applying MV...
Depth Prediction on MegaDepth... A Lot Less Noisy!
Statues vs People
Mannequin Challenge
Training Data... Now On People
Get The Training Data
Train The Model (Using RGB-Only)
Prediction on Single RGB Image
The More Input, The Better The Performance
Getting Pseudo-Depth Map
Final Model Output
Application


Taught by

Launchpad

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Computational Photography
Georgia Institute of Technology via Coursera
Einführung in Computer Vision
Technische Universität München (Technical University of Munich) via Coursera
Introduction to Computer Vision
Georgia Institute of Technology via Udacity