Towards High-Fidelity Open-Vocabulary 3D Scene Understanding
Offered By: Montreal Robotics via YouTube
Course Description
Overview
This conference talk surveys recent research on open-vocabulary 3D scene understanding. It covers Transformer-based networks and their application to several 3D scene understanding tasks: object segmentation, human body part segmentation, and vectorized floorplan reconstruction. It also examines the limitations of fully supervised models in real-world scenarios and presents open-vocabulary approaches that build on foundation models such as CLIP and SAM, along with the field's current challenges and future directions. The speaker is Francis Engelmann, a postdoctoral researcher at ETH Zurich and visiting researcher at Google.
Syllabus
Francis Engelmann: Towards High-Fidelity Open-Vocabulary 3D Scene Understanding
Taught by
Montreal Robotics
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Computational Photography
Georgia Institute of Technology via Coursera
Einführung in Computer Vision
Technische Universität München (Technical University of Munich) via Coursera
Introduction to Computer Vision
Georgia Institute of Technology via Udacity