Representational Strengths and Limitations of Transformers

Offered By: Google TechTalks via YouTube

Tags

Transformers Courses, Deep Learning Courses, Neural Networks Courses

Course Description

Overview

Explore the mathematical foundations of attention layers in transformers in this Google TechTalk presented by Clayton Sanford. Delve into both positive and negative results on the representational power of attention layers, focusing on intrinsic complexity parameters such as width, depth, and embedding dimension. Discover how transformers outperform recurrent and feedforward networks on a sparse averaging task, where the required transformer size scales only logarithmically with input size rather than polynomially. Examine the limitations of attention layers on a triple detection task, where the required size scales linearly with input size. Learn how communication complexity is applied to the analysis of transformers, and gain insights into the representational properties and inductive biases of neural networks. Sanford, a PhD student at Columbia studying machine learning theory, also touches on his work on learning combinatorial algorithms with transformers and on climate modeling with machine learning.
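To make the sparse averaging task concrete, below is a minimal Python sketch of one instance, assuming the q-sparse averaging formulation from the underlying paper (each position carries a value and a set of q indices, and the target at that position is the average of the selected values). The function name sparse_average_targets and the specific parameters are illustrative, not taken from the talk.

```python
# Minimal sketch of a q-sparse averaging instance (illustrative names/values).
import numpy as np

def sparse_average_targets(values, index_sets):
    """Reference (non-transformer) solution: for each position, average the q
    selected values. The talk's positive result is that a transformer can
    represent this map with size growing only logarithmically in the sequence
    length, whereas recurrent and feedforward networks need polynomial size."""
    return np.array([values[q].mean() for q in index_sets])

# Tiny example: n = 6 positions, q = 2 selected indices per position.
rng = np.random.default_rng(0)
n, q = 6, 2
values = rng.standard_normal(n)
index_sets = [rng.choice(n, size=q, replace=False) for _ in range(n)]
print(sparse_average_targets(values, index_sets))
```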

Syllabus

Representational Strengths and Limitations of Transformers


Taught by

Google TechTalks

Related Courses

Neural Networks for Machine Learning
University of Toronto via Coursera
Good Brain, Bad Brain: Basics
University of Birmingham via FutureLearn
Statistical Learning with R
Stanford University via edX
Machine Learning 1—Supervised Learning
Brown University via Udacity
Fundamentals of Neuroscience, Part 2: Neurons and Networks
Harvard University via edX