
Learning the Depths of Moving People by Watching Frozen People

Offered By: Launchpad via YouTube

Tags

Object Detection Courses, Machine Learning Courses, Computer Vision Courses

Course Description

Overview

Explore a 22-minute Launchpad video on predicting depth for moving people from ordinary RGB footage, using videos of people holding still as training data. Learn why traditional stereo triangulation breaks down when the subject moves, and how single-view depth prediction trained with multi-view supervision gets around that limitation. See how internet images are turned into training data, first for static subjects such as statues and then for people, using Mannequin Challenge videos to build a dataset for human depth prediction. Understand how the model is trained with RGB-only input, how giving it more input improves performance, and how pseudo-depth maps are generated. The video closes with the final model output and practical applications.
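
As a rough illustration of the triangulation problem mentioned in the overview, the sketch below uses the standard two-view relation Z = f * B / d (depth from focal length, baseline, and disparity). It is not taken from the video; the function name and all numbers are hypothetical, chosen only to show that a subject's own motion between frames corrupts the recovered depth.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres of a point seen from two parallel views: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Hypothetical camera setup and scene, for illustration only.
FOCAL_PX = 1000.0     # focal length in pixels
BASELINE_M = 0.1      # camera translation between the two frames, in metres
TRUE_DEPTH_M = 4.0    # actual distance to the subject

# Static subject: the image shift is caused by camera motion alone.
static_disparity = FOCAL_PX * BASELINE_M / TRUE_DEPTH_M
static_depth = depth_from_disparity(FOCAL_PX, BASELINE_M, static_disparity)

# Moving subject: the person also shifts 5 px in the image between frames,
# so the measured disparity no longer reflects camera motion alone.
moving_disparity = static_disparity + 5.0
moving_depth = depth_from_disparity(FOCAL_PX, BASELINE_M, moving_disparity)

print(f"static subject: {static_depth:.2f} m")
print(f"moving subject: {moving_depth:.2f} m (true depth is {TRUE_DEPTH_M:.2f} m)")
```

With a frozen subject the relation recovers the true depth; once the subject moves, the same formula returns a wrong value, which is the gap that learned single-view prediction aims to fill.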

Syllabus

Intro
Goal
Where Could We Use This?
Existing Technologies
SLAM/MVS
Traditional Stereo Triangulation
Triangulation Here... Not So Good!
Single View Depth Prediction Using Multi-View Supervision
Internet Images Into Data
After Applying MV...
Depth Prediction on MegaDepth... A Lot Less Noisy!
Statues vs People
Mannequin Challenge
Training Data... Now On People
Get The Training Data
Train The Model (Using RGB-Only)
Prediction on Single RGB Image
More The Input Better The Performance
Getting Pseudo-Depth Map
Final Model Output
Application


Taught by

Launchpad

Related Courses

Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Detección de objetos
Universitat Autònoma de Barcelona (Autonomous University of Barcelona) via Coursera
Deep Learning Summer School
Independent
Deep Learning in Computer Vision
Higher School of Economics via Coursera
Computer Vision and Image Analysis
Microsoft via edX