Learning the Depths of Moving People by Watching Frozen People
Offered By: Launchpad via YouTube
Course Description
Overview
Explore a 22-minute Launchpad video on predicting depth for moving people by learning from footage of people who are standing still. Learn why traditional stereo triangulation breaks down when the subject moves, and how single-view depth prediction trained with multi-view supervision works around that limitation. See how internet images are turned into training data, and how Mannequin Challenge videos, in which people hold a pose while the camera moves, provide a dataset for human depth prediction. Follow the progression from statues to people in depth estimation, and see how the model is trained using RGB-only input. The video also shows how performance improves as the model is given more input, how pseudo-depth maps are generated, and where the technique can be applied.
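To make the triangulation limitation concrete, here is a minimal sketch in Python. It is not taken from the video: it assumes an idealized pair of rectified pinhole views offset horizontally, and the focal length, baseline, and point positions are made-up illustrative values. It shows that depth recovered from disparity is correct for a static point but wrong when the point moves between the two captures, as happens with a single moving camera.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point seen in two rectified, horizontally offset views: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def project_x(focal_px: float, point_x: float, point_z: float, camera_x: float) -> float:
    """Horizontal image coordinate of a 3D point for a pinhole camera located at camera_x."""
    return focal_px * (point_x - camera_x) / point_z


# Illustrative (assumed) values, not taken from the video.
FOCAL_PX = 500.0    # focal length in pixels
BASELINE_M = 0.1    # distance the camera moves between the two views
TRUE_DEPTH_M = 4.0  # actual distance of the point from the camera

# Static point: it sits at x = 1.0 m in both views.
x_first = project_x(FOCAL_PX, 1.0, TRUE_DEPTH_M, camera_x=0.0)
x_second = project_x(FOCAL_PX, 1.0, TRUE_DEPTH_M, camera_x=BASELINE_M)
static_depth = depth_from_disparity(FOCAL_PX, BASELINE_M, x_first - x_second)

# Moving point: it shifts 5 cm sideways before the second view is captured,
# so the measured disparity mixes camera motion with object motion.
x_second_moved = project_x(FOCAL_PX, 1.05, TRUE_DEPTH_M, camera_x=BASELINE_M)
moving_depth = depth_from_disparity(FOCAL_PX, BASELINE_M, x_first - x_second_moved)

print(f"static point: estimated depth {static_depth:.2f} m")
print(f"moving point: estimated depth {moving_depth:.2f} m")
```

In this toy setup a 5 cm sideways step is enough to roughly double the estimated depth, which is why footage of people who stay frozen while the camera moves is useful as a source of reliable depth supervision.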
Syllabus
Intro
Goal
Where Could We Use This?
Existing Technologies
SLAM/MVS
Traditional Stereo Triangulation
Triangulation Here... Not So Good!
Single View Depth Prediction Using Multi-View Supervision
Internet Images Into Data
After Applying MV...
Depth Prediction on MegaDepth... A Lot Less Noisy!
Statues vs People
Mannequin Challenge
Training Data... Now On People
Get The Training Data
Train The Model (Using RGB-Only)
Prediction on Single RGB Image
The More Input, The Better The Performance
Getting Pseudo-Depth Map
Final Model Output
Application
Taught by
Launchpad
Related Courses
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Detección de objetos
Universitat Autònoma de Barcelona (Autonomous University of Barcelona) via Coursera
Deep Learning Summer School
Independent
Deep Learning in Computer Vision
Higher School of Economics via Coursera
Computer Vision and Image Analysis
Microsoft via edX