Long-context Attention in Near-Linear Time

Offered By: Simons Institute via YouTube

Tags

Attention Mechanisms, Computational Complexity, Locality-Sensitive Hashing, Sublinear Algorithms

Course Description

Overview

Explore a lecture on "Long-context Attention in Near-Linear Time" presented by David Woodruff of Carnegie Mellon University at the Simons Institute. The talk introduces "HyperAttention," an approximate attention mechanism designed to tackle the computational cost of Large Language Models (LLMs) with long contexts. Discover how the approach achieves near-linear running time, even when the attention matrix has unbounded entries or high stable rank, by introducing two fine-grained parameters that capture the hardness of the problem. Learn about HyperAttention's modular design and its compatibility with other fast implementations such as FlashAttention. Examine the technique's empirical performance across various long-context datasets, including its impact on ChatGLM2's inference time and the speedup it delivers at larger context lengths. Gain insights into the collaborative research behind this advance in sublinear algorithms.
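
Since the talk centers on an LSH-based attention algorithm, the toy sketch below may help fix the core idea. It is not the authors' HyperAttention implementation: it only illustrates the bucketing step, hashing queries and keys with random hyperplanes (SimHash), sorting tokens by hash code, and computing exact softmax attention inside fixed-size blocks of the sorted order. The function names (`simhash_codes`, `lsh_block_attention`) and all parameter choices are illustrative assumptions; the actual algorithm additionally estimates the softmax normalization and samples residual entries to obtain its guarantees.

```python
import numpy as np

def simhash_codes(x, n_planes, rng):
    """Pack the sign pattern of x against random hyperplanes into integer codes."""
    planes = rng.standard_normal((x.shape[1], n_planes))
    bits = (x @ planes > 0).astype(np.int64)          # (n, n_planes) sign bits
    return bits @ (1 << np.arange(n_planes))          # one bucket code per row

def lsh_block_attention(q, k, v, n_planes=4, block=64, seed=0):
    """Approximate softmax attention: sort tokens by SimHash code and
    attend only within fixed-size blocks of the sorted order."""
    rng = np.random.default_rng(seed)
    n, d = q.shape
    # Hash queries and keys with the *same* hyperplanes so that similar
    # vectors receive similar codes and land near each other after sorting.
    codes = simhash_codes(np.concatenate([q, k]), n_planes, rng)
    q_order = np.argsort(codes[:n], kind="stable")
    k_order = np.argsort(codes[n:], kind="stable")
    out = np.zeros_like(v)
    for start in range(0, n, block):
        qi = q_order[start:start + block]             # queries in this block
        ki = k_order[start:start + block]             # keys in this block
        scores = q[qi] @ k[ki].T / np.sqrt(d)
        scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
        w = np.exp(scores)
        out[qi] = (w / w.sum(axis=1, keepdims=True)) @ v[ki]
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((256, 32))
    v = rng.standard_normal((256, 32))
    approx = lsh_block_attention(x, x, v)             # self-attention-style call
    s = x @ x.T / np.sqrt(32)
    p = np.exp(s - s.max(axis=1, keepdims=True))
    exact = (p / p.sum(axis=1, keepdims=True)) @ v
    print("mean abs error vs. exact attention:", np.abs(approx - exact).mean())
```

Sorting by hash code is what keeps the cost near-linear in this sketch: each token attends to at most `block` others, so the work is O(n · block · d) rather than O(n² · d), at the price of missing attention mass that falls outside a token's block.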

Syllabus

Long-context Attention in Near-Linear Time


Taught by

Simons Institute

Related Courses

Automata Theory
Stanford University via edX
Introduction to Computational Thinking and Data Science
Massachusetts Institute of Technology via edX
算法设计与分析 Design and Analysis of Algorithms
Peking University via Coursera
How to Win Coding Competitions: Secrets of Champions
ITMO University via edX
Introdução à Ciência da Computação com Python Parte 2 (Introduction to Computer Science with Python, Part 2)
Universidade de São Paulo via Coursera