Perceiver - General Perception with Iterative Attention

Offered By: Yannic Kilcher via YouTube

Tags

Computer Vision, Deep Learning, Neural Networks, Transformer Architecture, Positional Encoding

Course Description

Overview

Explore a comprehensive analysis of Google DeepMind's Perceiver model in this 30-minute video lecture. Delve into an architecture that sidesteps the quadratic bottleneck of Transformer self-attention and handles multiple input modalities with a single design. Learn how cross-attention reads a large input array into a small set of latents, how a low-dimensional latent Transformer then processes them cheaply, and how weights can be shared across layers. Discover how the Perceiver achieves performance competitive with strong baselines on ImageNet and strong results on audio, video, and point clouds, all without architectural changes tailored to each input domain. Gain insights into positional encodings via Fourier features, experimental results, and attention maps. Understand the potential implications of this research for the future of machine learning and artificial intelligence.
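
To make the core idea concrete, here is a minimal sketch of one Perceiver-style step in PyTorch: a small set of learned latent queries cross-attends to a large input array, then a cheap self-attention layer runs in the latent space. The layer sizes, the feed-forward block, and the omission of layer normalization are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class PerceiverBlock(nn.Module):
    """One Perceiver-style step: cross-attention from a large byte array
    into a small learned latent array, then latent self-attention.
    Hyperparameters are illustrative, not the paper's exact values."""

    def __init__(self, input_dim=64, latent_dim=512, num_latents=256, heads=8):
        super().__init__()
        # N learned latents with N << M inputs: cross-attention costs
        # O(M * N) instead of the O(M^2) of full self-attention.
        self.latents = nn.Parameter(torch.randn(num_latents, latent_dim))
        self.input_proj = nn.Linear(input_dim, latent_dim)
        self.cross_attn = nn.MultiheadAttention(latent_dim, heads, batch_first=True)
        self.latent_attn = nn.MultiheadAttention(latent_dim, heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(latent_dim, latent_dim),
            nn.GELU(),
            nn.Linear(latent_dim, latent_dim),
        )

    def forward(self, byte_array):  # byte_array: (batch, M, input_dim)
        batch = byte_array.shape[0]
        kv = self.input_proj(byte_array)                     # keys/values from the inputs
        q = self.latents.unsqueeze(0).expand(batch, -1, -1)  # shared learned queries
        z, _ = self.cross_attn(q, kv, kv)                    # latents read the inputs
        z = z + self.latent_attn(z, z, z)[0]                 # self-attention in latent space
        return z + self.ff(z)                                # position-wise feed-forward


# M = 4096 input elements (a real image would be ~50k pixels);
# M only enters the cost linearly, through the cross-attention read.
x = torch.randn(2, 4096, 64)
print(PerceiverBlock()(x).shape)  # torch.Size([2, 256, 512])
```

In the full model this block is stacked, with repeated cross-attention reads of the same input; sharing weights across the stacked blocks is what the description above refers to, and it makes the latent Transformer behave like an unrolled recurrent network.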

Syllabus

- Intro & Overview
- Built-In Assumptions of Computer Vision Models
- The Quadratic Bottleneck of Transformers
- Cross-Attention in Transformers
- The Perceiver Model Architecture & Learned Queries
- Positional Encodings via Fourier Features (see the sketch after this list)
- Experimental Results & Attention Maps
- Comments & Conclusion
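
Because the attention in the sketch above is permutation-invariant, position information must be injected into the input features. Here is a minimal sketch of Fourier-feature positional encodings in the same spirit; the band count, maximum frequency, and the choice to concatenate the raw coordinates are assumptions for illustration.

```python
import math
import torch

def fourier_encode(positions, num_bands=16, max_freq=32.0):
    """Map coordinates in [-1, 1] to sin/cos features at num_bands
    frequencies up to max_freq / 2, concatenated with the raw coordinates.
    positions: (..., d) tensor; returns (..., d * (2 * num_bands + 1))."""
    freqs = torch.linspace(1.0, max_freq / 2.0, num_bands)  # frequency bands
    angles = positions.unsqueeze(-1) * freqs * math.pi      # (..., d, num_bands)
    features = torch.cat(
        [angles.sin(), angles.cos(), positions.unsqueeze(-1)], dim=-1
    )
    return features.flatten(-2)                             # merge coord/band dims


# Encode (row, col) coordinates of an 8x8 grid, scaled to [-1, 1].
ys, xs = torch.meshgrid(
    torch.linspace(-1, 1, 8), torch.linspace(-1, 1, 8), indexing="ij"
)
coords = torch.stack([ys, xs], dim=-1).reshape(-1, 2)       # (64, 2)
print(fourier_encode(coords).shape)                         # torch.Size([64, 66])
```

These position features are concatenated with the raw input values (e.g., RGB pixels) to form the byte array that the latents cross-attend to.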


Taught by

Yannic Kilcher

Related Courses

NeRF - Representing Scenes as Neural Radiance Fields for View Synthesis
Yannic Kilcher via YouTube
LambdaNetworks - Modeling Long-Range Interactions Without Attention
Yannic Kilcher via YouTube
Attention Is All You Need - Transformer Paper Explained
Aleksa Gordić - The AI Epiphany via YouTube
NeRFs - Neural Radiance Fields - Paper Explained
Aladdin Persson via YouTube
Deep Dive into the Transformer Encoder Architecture
CodeEmporium via YouTube