YoVDO

Towards High-Fidelity Open-Vocabulary 3D Scene Understanding

Offered By: Montreal Robotics via YouTube

Tags

Computer Vision Courses
Foundation Models Courses

Course Description

Overview

This conference talk surveys recent research on open-vocabulary 3D scene understanding. It covers Transformer-based networks and their application to 3D scene understanding tasks, including object segmentation, human body part segmentation, and vectorized floorplan reconstruction. It also discusses the limitations of fully supervised models in real-world scenarios and presents open-vocabulary approaches that leverage foundation models such as CLIP and SAM, along with current challenges and future directions for the field. The talk is presented by Francis Engelmann, a postdoctoral researcher at ETH Zurich and visiting researcher at Google.

Syllabus

Francis Engelmann: Towards High-Fidelity Open-Vocabulary 3D Scene Understanding


Taught by

Montreal Robotics

Related Courses

Soil Structure Interaction
Indian Institute of Technology, Kharagpur via Swayam
Fundamentals of Machine Learning for Healthcare
Stanford University via Coursera
Artificial Intelligence Foundations: Thinking Machines
LinkedIn Learning
Could a Purely Self-Supervised Foundation Model Achieve Grounded Language Understanding?
Santa Fe Institute via YouTube
Foundation Models - FSDL 2022
The Full Stack via YouTube