Representational Strengths and Limitations of Transformers

Offered By: Google TechTalks via YouTube

Tags

Transformers Courses, Deep Learning Courses, Neural Networks Courses

Course Description

Overview

Explore the mathematical foundations of attention layers in transformers in this Google TechTalk presented by Clayton Sanford. Delve into both positive and negative results on the representational power of attention layers, focusing on intrinsic complexity parameters such as width, depth, and embedding dimension. Discover how transformers outperform recurrent and feedforward networks on a sparse averaging task, where the required transformer size scales only logarithmically with input size rather than polynomially. Examine the limitations of attention layers on a triple detection task, where the required size scales linearly with input size. Learn how communication complexity is applied to the analysis of transformers, and gain insights into the representational properties and inductive biases of neural networks. Sanford, a PhD student at Columbia studying machine learning theory, also touches on his work on learning combinatorial algorithms with transformers and on climate modeling with machine learning.
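To make the sparse averaging task concrete, below is a minimal Python sketch of one instance, assuming the q-sparse averaging formulation from the underlying paper (each position carries a value and a set of q indices, and the target at that position is the average of the selected values). The function name sparse_average_targets and the specific parameters are illustrative, not taken from the talk.

```python
# Minimal sketch of a q-sparse averaging instance (illustrative names/values).
import numpy as np

def sparse_average_targets(values, index_sets):
    """Reference (non-transformer) solution: for each position, average the q
    selected values. The talk's positive result is that a transformer can
    represent this map with size growing only logarithmically in the sequence
    length, whereas recurrent and feedforward networks need polynomial size."""
    return np.array([values[q].mean() for q in index_sets])

# Tiny example: n = 6 positions, q = 2 selected indices per position.
rng = np.random.default_rng(0)
n, q = 6, 2
values = rng.standard_normal(n)
index_sets = [rng.choice(n, size=q, replace=False) for _ in range(n)]
print(sparse_average_targets(values, index_sets))
```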

Syllabus

Representational Strengths and Limitations of Transformers


Taught by

Google TechTalks

Related Courses

Neural Networks for Machine Learning
University of Toronto via Coursera
Good Brain, Bad Brain: Basics
University of Birmingham via FutureLearn
Statistical Learning with R
Stanford University via edX
Machine Learning 1—Supervised Learning
Brown University via Udacity
Fundamentals of Neuroscience, Part 2: Neurons and Networks
Harvard University via edX