Emerging Properties in Self-Supervised Vision Transformers - Paper Explained
Offered By: Aleksa Gordić - The AI Epiphany via YouTube
Course Description
Overview
Explore a detailed video analysis of the paper "Emerging Properties in Self-Supervised Vision Transformers", which introduces DINO (self-DIstillation with NO labels) from Facebook AI. Learn how self-supervised learning applies to vision transformers, and see the properties that emerge: attention maps that capture object segmentation, and features that perform well with a simple k-NN classifier. Follow a walkthrough of DINO's main ideas, attention maps, pseudocode, multi-crop technique, teacher network details, results, ablations, collapse analysis, and feature visualizations. Gain insight into how self-supervised learning could bring to computer vision the kind of success it has had in natural language processing.
Syllabus
DINO main ideas, attention maps explained
DINO explained in depth
Pseudocode walk-through
Multi-crop and local-to-global correspondence
More details on the teacher network
Results
Ablations
Collapse analysis
Features visualized and outro
Taught by
Aleksa Gordić - The AI Epiphany
Related Courses
Sequence Models (DeepLearning.AI via Coursera)
Modern Natural Language Processing in Python (Udemy)
Stanford Seminar - Transformers in Language: The Development of GPT Models Including GPT-3 (Stanford University via YouTube)
Long Form Question Answering in Haystack (James Briggs via YouTube)
Spotify's Podcast Search Explained (James Briggs via YouTube)