YoVDO

Emerging Properties in Self-Supervised Vision Transformers - Facebook AI Research Explained

Offered By: Yannic Kilcher via YouTube

Tags

Computer Vision Courses, Self-supervised Learning Courses, Attention Mechanisms Courses, Representation Learning Courses

Course Description

Overview

This video lecture explains DINO, a system from Facebook AI Research that combines self-supervised learning for computer vision with the Vision Transformer (ViT) architecture. DINO is trained without labels, yet its attention maps can be read directly as segmentation maps, and its learned representations can be used for image retrieval and zero-shot k-nearest neighbor classifiers. The lecture covers Vision Transformers, self-supervised learning for images, self-distillation, and how the teacher is built from the student by taking a moving average. It then walks through the DINO pseudocode, explains why a cross-entropy loss is used, and reviews the experimental results. It closes with the lecturer's hypothesis about why DINO works and a discussion of what the research implies for computer vision and artificial intelligence.

Syllabus

- Intro & Overview
- Vision Transformers
- Self-Supervised Learning for Images
- Self-Distillation
- Building the teacher from the student by moving average
- DINO Pseudocode
- Why Cross-Entropy Loss?
- Experimental Results
- My Hypothesis why this works
- Conclusion & Comments
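
As a rough illustration of two syllabus items (building the teacher from the student by moving average, and the cross-entropy loss), here is a minimal pure-Python sketch. It is not the lecture's or the paper's pseudocode. The function names (`log_softmax`, `dino_style_loss`, `ema_update`), the toy lists standing in for network outputs and weights, and the temperature and momentum values are all assumptions chosen for illustration.

```python
import math

def log_softmax(logits, temperature):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_total for x in scaled]

def softmax(logits, temperature):
    """Temperature-scaled softmax; a lower temperature gives a sharper output."""
    return [math.exp(v) for v in log_softmax(logits, temperature)]

def dino_style_loss(teacher_logits, student_logits, center,
                    teacher_temp=0.04, student_temp=0.1):
    """Cross-entropy between a centered, sharpened teacher output and the
    student output. Temperatures here are illustrative assumptions."""
    centered = [t - c for t, c in zip(teacher_logits, center)]
    p_teacher = softmax(centered, teacher_temp)  # used as a fixed target
    log_p_student = log_softmax(student_logits, student_temp)
    return -sum(pt * lps for pt, lps in zip(p_teacher, log_p_student))

def ema_update(old, new, momentum):
    """Exponential moving average: momentum * old + (1 - momentum) * new."""
    return [momentum * o + (1.0 - momentum) * n for o, n in zip(old, new)]

# Toy example: lists stand in for flattened network weights and outputs.
student_weights = [0.5, -1.0]
teacher_weights = [0.0, 0.0]

# The teacher is not trained by gradients; it slowly follows the student.
teacher_weights = ema_update(teacher_weights, student_weights, momentum=0.9)

# The loss is small when student and teacher agree, large when they do not.
center = [0.0, 0.0]
loss_agree = dino_style_loss([2.0, 0.0], [2.0, 0.0], center)
loss_disagree = dino_style_loss([2.0, 0.0], [0.0, 2.0], center)
print(teacher_weights, loss_agree, loss_disagree)
```

In this sketch the same `ema_update` helper could also maintain the running `center` from batches of teacher outputs. In a real training loop only the student would receive gradients from the loss.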


Taught by

Yannic Kilcher

Related Courses

- Deep Learning for Natural Language Processing (University of Oxford via Independent)
- Sequence Models (DeepLearning.AI via Coursera)
- Deep Learning Part 1 (IITM) (Indian Institute of Technology Madras via Swayam)
- Deep Learning - Part 1 (Indian Institute of Technology, Ropar via Swayam)
- Deep Learning - IIT Ropar (Indian Institute of Technology, Ropar via Swayam)