YoVDO

Emerging Properties in Self-Supervised Vision Transformers - Paper Explained

Offered By: Aleksa Gordić - The AI Epiphany via YouTube

Tags

Computer Vision Courses, Self-supervised Learning Courses, Transformer Models Courses

Course Description

Overview

Watch a detailed video analysis of the paper "Emerging Properties in Self-Supervised Vision Transformers," which introduces DINO (self-DIstillation with NO labels) from Facebook AI. Learn how self-supervised training is applied to vision transformers and which properties emerge from it: attention maps that carry segmentation information without any segmentation labels, and features that perform well with a simple k-NN classifier. The walkthrough covers DINO's main ideas, attention maps, pseudocode, the multi-crop technique, teacher network details, results, ablations, collapse analysis, and feature visualizations. Gain insight into how self-supervised learning in computer vision could come to match the success it has had in natural language processing.

Syllabus

DINO main ideas, attention maps explained
DINO explained in depth
Pseudocode walk-through
Multi-crop and local-to-global correspondence
More details on the teacher network
Results
Ablations
Collapse analysis
Features visualized and outro
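
As a rough orientation for the pseudocode, teacher network, and collapse analysis items above, here is a minimal sketch of the core DINO update for a single pair of views. It is an illustration, not the authors' implementation. The function names (log_softmax, dino_loss, ema_update) are hypothetical, plain Python lists stand in for tensors, and the default temperatures are assumptions based on values commonly quoted from the paper. A full training loop would sum this loss over pairs of multi-crop views and backpropagate through the student only.

```python
import math

def log_softmax(logits, temperature):
    # Numerically stable log-softmax over a list of floats.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_total for s in scaled]

def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    # Teacher side: subtract the running center, then sharpen with a
    # low temperature. Centering and sharpening together are what the
    # collapse analysis is about. Temperatures here are assumed defaults.
    centered = [t - c for t, c in zip(teacher_logits, center)]
    p_teacher = [math.exp(v) for v in log_softmax(centered, teacher_temp)]
    # Student side: ordinary temperature-scaled log-probabilities.
    log_p_student = log_softmax(student_logits, student_temp)
    # Cross-entropy with the teacher distribution as a fixed target.
    return -sum(pt * lps for pt, lps in zip(p_teacher, log_p_student))

def ema_update(old, new, momentum):
    # Exponential moving average; the same rule is applied to the
    # teacher weights (from student weights) and to the center
    # (from the mean teacher output over a batch).
    return [momentum * o + (1.0 - momentum) * n for o, n in zip(old, new)]
```

For example, dino_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0], [0.0, 0.0, 0.0]) is close to zero because the student agrees with the teacher, while reversing the student logits gives a much larger loss.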


Taught by

Aleksa Gordić - The AI Epiphany

Related Courses

Introduction to Artificial Intelligence - Stanford University via Udacity
Computer Vision: The Fundamentals - University of California, Berkeley via Coursera
Computational Photography - Georgia Institute of Technology via Coursera
Einführung in Computer Vision (Introduction to Computer Vision) - Technische Universität München (Technical University of Munich) via Coursera
Introduction to Computer Vision - Georgia Institute of Technology via Udacity