
Transformers, Parallel Computation, and Logarithmic Depth

Offered By: Simons Institute via YouTube

Tags

Transformers Courses
Artificial Intelligence Courses
Machine Learning Courses
Computational Complexity Courses
Self-Attention Courses

Course Description

Overview

Explore the computational power of transformers in this 57-minute lecture by Daniel Hsu of Columbia University. Delve into the correspondence between self-attention layers and communication rounds in the Massively Parallel Computation (MPC) model. Discover how logarithmic depth lets transformers efficiently solve computational tasks that remain challenging for other neural sequence models and for sub-quadratic approximations of transformers. Gain insight into parallelism as a key feature distinguishing transformers from other architectures. Learn about collaborative research with Clayton Sanford of Google and Matus Telgarsky of NYU, which shows that a constant number of self-attention layers can simulate, and be simulated by, a constant number of MPC communication rounds.
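To get intuition for why logarithmic depth is the natural budget here, consider pointer doubling, a classic parallel-computing trick. The Python sketch below is an illustrative example, not code from the lecture or the paper: it computes the k-th successor of every element of a pointer array in O(log k) synchronized rounds, where a naive walk would need k sequential steps. Composing long chains of hops in logarithmically many rounds is the same flavor of speedup that a logarithmic-depth stack of self-attention layers can achieve.

```python
# Pointer doubling: compute the k-th successor of every node in O(log k)
# parallel rounds. Each loop iteration plays the role of one round, in which
# all positions update simultaneously from the previous round's tables.
# Illustrative sketch only; the names and the task are our own, not the
# lecture's construction.

def kth_successor_all(ptr: list[int], k: int) -> list[int]:
    """For each index i, return the node reached after k pointer hops."""
    hop = list(range(len(ptr)))   # identity map: 0 hops taken so far
    step = ptr[:]                 # table of 2^j hops, starting at j = 0
    while k > 0:
        if k & 1:                 # fold this power of two into the answer
            hop = [step[h] for h in hop]
        step = [step[s] for s in step]  # double: 2^j hops -> 2^(j+1) hops
        k >>= 1
    return hop

if __name__ == "__main__":
    succ = [1, 2, 3, 4, 0]             # a 5-cycle: 0 -> 1 -> 2 -> 3 -> 4 -> 0
    print(kth_successor_all(succ, 3))  # [3, 4, 0, 1, 2]
```

Each round squares the reachable distance, so log2(k) rounds suffice; this mirrors how constant-round MPC protocols, and hence shallow transformers under the lecture's simulation result, can resolve long-range dependencies.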

Syllabus

Transformers, parallel computation, and logarithmic depth


Taught by

Simons Institute

Related Courses

Linear Circuits
Georgia Institute of Technology via Coursera
Introduction to Energy and Power Engineering
King Abdulaziz University via Rwaq (رواق)
Magnetic Materials and Devices
Massachusetts Institute of Technology via edX
Linear Circuits 2: AC Analysis
Georgia Institute of Technology via Coursera
Electrical Power Transmission
Tecnológico de Monterrey via edX