
Infinite Memory Transformer - Research Paper Explained

Offered By: Yannic Kilcher via YouTube

Tags

Transformers Courses, Sequence Modeling Courses, Importance Sampling Courses

Course Description

Overview

Explore the ∞-former (Infinity-Former) model in this comprehensive video explanation of a research paper. Dive into how this approach extends vanilla Transformers with an unbounded long-term memory, allowing them to process arbitrarily long sequences. Learn about the continuous attention mechanism that makes attention complexity independent of context length, and discover the concept of "sticky memories" for preserving important past events. Follow along as the video breaks down the problem statement, the architecture, and the experimental results, including applications in language modeling. Gain insight into the pros and cons of using heuristics such as sticky memories, and understand how the model addresses long-range dependencies in sequence tasks.
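The mechanisms mentioned above can be made concrete with a small sketch. The following is a minimal, self-contained illustration, not the authors' implementation: the function names, the Gaussian radial basis functions, the ridge-regression fit, and the grid-based approximation of the attention expectation are all assumptions made here for clarity.

```python
# A hedged sketch of the continuous long-term memory idea described above:
# an arbitrarily long sequence of past embeddings is compressed into the
# coefficients of a fixed number of basis functions, and attention over that
# memory uses a Gaussian density instead of one weight per token.
import numpy as np

def fit_continuous_memory(X, num_basis=32, ridge=1e-3):
    """Fit coefficients B so that X(t) ~= B.T @ psi(t) for positions t in [0, 1].

    X: (L, d) array of past token embeddings; L can grow without bound.
    Returns B of shape (num_basis, d), a fixed-size summary of the sequence.
    """
    L = len(X)
    t = np.linspace(0.0, 1.0, L)                 # token positions rescaled to [0, 1]
    centers = np.linspace(0.0, 1.0, num_basis)   # Gaussian RBF centers (an assumption of this sketch)
    width = 1.0 / num_basis
    Psi = np.exp(-0.5 * ((t[:, None] - centers[None, :]) / width) ** 2)  # (L, num_basis)
    # Ridge regression: B = (Psi^T Psi + ridge * I)^{-1} Psi^T X
    B = np.linalg.solve(Psi.T @ Psi + ridge * np.eye(num_basis), Psi.T @ X)
    return B, centers, width

def continuous_attention(B, centers, width, mu, sigma, grid_size=512):
    """Context vector E[X(t)] under a Gaussian attention density N(t; mu, sigma^2),
    approximated numerically on a grid here for simplicity (also an assumption)."""
    t = np.linspace(0.0, 1.0, grid_size)
    Psi = np.exp(-0.5 * ((t[:, None] - centers[None, :]) / width) ** 2)  # (grid_size, num_basis)
    X_t = Psi @ B                                # reconstructed signal, (grid_size, d)
    density = np.exp(-0.5 * ((t - mu) / sigma) ** 2)
    density /= density.sum()                     # normalize on the grid
    return density @ X_t                         # (d,) context vector

# Example: 10,000 past embeddings are summarized by 32 coefficients,
# then a query attends around relative position mu = 0.9.
X_past = np.random.randn(10_000, 64)
B, centers, width = fit_continuous_memory(X_past)
context = continuous_attention(B, centers, width, mu=0.9, sigma=0.05)
print(context.shape)  # (64,)
```

The point of the sketch is that the memory B has a fixed size no matter how long X_past grows, which is what makes the attention cost independent of context length.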

Syllabus

- Intro & Overview
- Sponsor Spot: Weights & Biases
- Problem Statement
- Continuous Attention Mechanism
- Unbounded Memory via concatenation & contraction
- Does this make sense?
- How the Long-Term Memory is used in an attention layer
- Entire Architecture Recap
- Sticky Memories by Importance Sampling (see the sketch after this list)
- Commentary: Pros and cons of using heuristics
- Experiments & Results


Taught by

Yannic Kilcher
