Accelerating Transformers via Kernel Density Estimation - Google TechTalk

Offered By: Google TechTalks via YouTube

Tags

Transformers, Machine Learning, Computational Complexity, Attention Mechanisms, Matrix Multiplication, Algorithm Optimization, Sequence Modeling

Course Description

Overview

Explore efficient Transformer acceleration techniques in this Google TechTalk presented by Insu Han. Dive into the challenges of processing long sequences with dot-product attention mechanisms and discover innovative solutions using kernel density estimation (KDE). Learn about the KDEformer approach, which approximates attention in sub-quadratic time with provable spectral norm bounds. Examine experimental results comparing KDEformer's performance to other attention approximations in terms of accuracy, memory usage, and runtime on various pre-trained models. Gain insights into the potential applications and future directions of this research in accelerating large language models and sequence modeling tasks.
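
To make the bottleneck concrete, here is a minimal NumPy sketch of exact dot-product attention (the function and variable names are illustrative, not taken from the talk). Both the score matrix and the softmax normalizer cost O(n^2) work in the sequence length n, and each row's normalizer is exactly a weighted exponential kernel sum: the kernel-density quantity that the KDE-based approach approximates in sub-quadratic time.

```python
import numpy as np

def exact_attention(Q, K, V):
    """Exact dot-product attention: O(n^2 * d) time, O(n^2) memory.

    Q, K, V: (n, d) arrays. Returns the (n, d) attention output.
    """
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)   # (n, n) score matrix: the quadratic bottleneck
    A = np.exp(scores)              # unnormalized attention weights
    # Each row's softmax normalizer is an exponential kernel sum,
    #   D_i = sum_j exp(q_i . k_j / sqrt(d)),
    # i.e. an (unnormalized) kernel density estimate queried at q_i.
    D = A.sum(axis=1, keepdims=True)
    return (A / D) @ V

rng = np.random.default_rng(0)
n, d = 512, 64
Q, K, V = (rng.standard_normal((n, d)) * 0.1 for _ in range(3))
out = exact_attention(Q, K, V)
print(out.shape)  # (512, 64)
```

This is the exact computation that sub-quadratic approximations replace; the spectral norm bounds mentioned above quantify how far the approximate attention matrix may drift from A / D.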

Syllabus

Intro
Outline for Efficient Transformer
Introduction
Transformer for Sequential Modeling
Transformer with Long Sequence
Contributions
High-level Approach
Weighted Exponential KDE (see the sketch after this syllabus)
Adaptive KDE Algorithm
Algorithm Summary
Experiments
Conclusion
Future Work
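
To give a feel for the "Weighted Exponential KDE" and "Adaptive KDE Algorithm" steps above: the core primitive is estimating sums of the form mu(q) = sum_j w_j * exp(q . k_j) without touching all n keys. The sketch below uses plain uniform subsampling as a stand-in estimator; it shows only the interface such a KDE subroutine exposes, not the talk's adaptive algorithm, which selects samples far more carefully in order to obtain provable accuracy guarantees.

```python
import numpy as np

def kde_query_exact(q, keys, weights):
    """Exact weighted exponential KDE: mu(q) = sum_j w_j * exp(q . k_j). O(n * d)."""
    return float(weights @ np.exp(keys @ q))

def kde_query_sampled(q, keys, weights, m, rng):
    """Unbiased uniform-subsampling estimate of mu(q) using m of the n keys.

    A stand-in for an adaptive KDE algorithm; uniform sampling is the
    simplest unbiased estimator, not the method presented in the talk.
    """
    n = keys.shape[0]
    idx = rng.choice(n, size=m, replace=True)
    return float((n / m) * (weights[idx] @ np.exp(keys[idx] @ q)))

rng = np.random.default_rng(1)
n, d = 4096, 32
keys = rng.standard_normal((n, d)) * 0.1
weights = np.ones(n)
q = rng.standard_normal(d) * 0.1

exact = kde_query_exact(q, keys, weights)
approx = kde_query_sampled(q, keys, weights, m=256, rng=rng)
print(f"exact={exact:.1f}  sampled={approx:.1f}")
```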


Taught by

Google TechTalks

Related Courses

Automata Theory
Stanford University via edX
Introduction to Computational Thinking and Data Science
Massachusetts Institute of Technology via edX
算法设计与分析 (Design and Analysis of Algorithms)
Peking University via Coursera
How to Win Coding Competitions: Secrets of Champions
ITMO University via edX
Introdução à Ciência da Computação com Python Parte 2 (Introduction to Computer Science with Python, Part 2)
Universidade de São Paulo via Coursera