Infinite Memory Transformer - Research Paper Explained

Offered By: Yannic Kilcher via YouTube

Tags

Transformers, Sequence Modeling, Importance Sampling

Course Description

Overview

Explore the groundbreaking ∞-former (Infinity-Former) model in this comprehensive video explanation of a research paper. Dive into how this innovative approach extends vanilla Transformers with an unbounded long-term memory, allowing for processing of arbitrarily long sequences. Learn about the continuous attention mechanism that enables attention complexity independent of context length, and discover the concept of "sticky memories" for highlighting important past events. Follow along as the video breaks down the problem statement, architecture, and experimental results, including applications in language modeling. Gain insights into the pros and cons of using heuristics and understand how this model addresses long-range dependencies in sequence tasks.
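
To make the "unbounded memory via contraction" idea from the description concrete, the sketch below is a minimal NumPy illustration (not the paper's or the video's code): a long sequence of hidden states is contracted into a fixed number of radial-basis-function coefficients, and a query then reads from that continuous memory through a Gaussian attention density, so both memory size and read cost are independent of the original sequence length. The function names, basis width, grid resolution, and the ridge-regression fit are illustrative assumptions.

```python
# Minimal sketch of a continuous long-term memory (illustrative, not the paper's code).
import numpy as np

def rbf_design_matrix(positions, centers, width=0.05):
    """Evaluate N Gaussian radial basis functions at each position in [0, 1]."""
    diffs = positions[:, None] - centers[None, :]            # (L, N)
    return np.exp(-0.5 * (diffs / width) ** 2)

def contract_to_continuous_memory(X, num_basis=64, ridge=1e-3):
    """Fit coefficients B so that Phi @ B approximates X (ridge regression).
    B has shape (num_basis, d) regardless of the sequence length L."""
    L, d = X.shape
    positions = np.linspace(0.0, 1.0, L)
    centers = np.linspace(0.0, 1.0, num_basis)
    Phi = rbf_design_matrix(positions, centers)              # (L, N)
    A = Phi.T @ Phi + ridge * np.eye(num_basis)
    B = np.linalg.solve(A, Phi.T @ X)                        # (N, d)
    return B, centers

def continuous_attention_read(B, centers, mu, sigma, width=0.05):
    """Read from the continuous memory with a Gaussian attention density N(mu, sigma^2),
    approximating the expectation of the reconstructed signal on a dense grid."""
    grid = np.linspace(0.0, 1.0, 512)
    Phi = rbf_design_matrix(grid, centers, width)            # (G, N)
    signal = Phi @ B                                         # reconstructed x(t), (G, d)
    density = np.exp(-0.5 * ((grid - mu) / sigma) ** 2)
    density /= density.sum()                                 # normalize over the grid
    return density @ signal                                  # (d,)

# Usage: contract 10,000 steps of 16-dim states into 64 coefficients, then attend.
X = np.random.randn(10_000, 16)
B, centers = contract_to_continuous_memory(X)
context = continuous_attention_read(B, centers, mu=0.8, sigma=0.05)
print(B.shape, context.shape)   # (64, 16) (16,)
```

In this toy setup the query would normally predict mu and sigma; here they are fixed constants, and the key point is only that the memory (B) and the read never grow with the length of the stored sequence.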

Syllabus

- Intro & Overview
- Sponsor Spot: Weights & Biases
- Problem Statement
- Continuous Attention Mechanism
- Unbounded Memory via concatenation & contraction
- Does this make sense?
- How the Long-Term Memory is used in an attention layer
- Entire Architecture Recap
- Sticky Memories by Importance Sampling
- Commentary: Pros and cons of using heuristics
- Experiments & Results


Taught by

Yannic Kilcher
