YoVDO

Attention with Linear Biases Explained

Offered By: Unify via YouTube

Tags

Attention Mechanisms Courses Machine Learning Courses Deep Learning Courses Neural Networks Courses Transformers Courses Positional Encoding Courses

Course Description

Overview

This 39-minute video explains Attention with Linear Biases (ALiBi), a position representation method for transformer models. Instead of adding explicit positional embeddings, ALiBi penalizes each query-key attention score in proportion to the distance between the query and the key, which lets a model extrapolate to sequences longer than those seen during training. The video covers the research paper "Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation" by Ofir Press, Noah A. Smith, and Mike Lewis. For more AI research and industry trends, the recommended resources are The Deep Dive newsletter and Unify's blog. Unify's website, GitHub, Discord, and Twitter are the places to follow discussions and developments around its AI deployment stack.
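
The distance penalty described above can be illustrated with a minimal sketch in plain Python. It is not taken from the video: the function names are hypothetical, the slope schedule (a geometric sequence starting at 2^(-8/n) for n heads) follows the paper's description and is assumed here to apply only when the head count is a power of two, and the attention is assumed to be causal.

```python
import math


def alibi_slopes(num_heads):
    """Head-specific slopes: geometric sequence starting at 2^(-8/num_heads).

    Assumption: num_heads is a power of two (e.g. 8 heads -> 1/2, 1/4, ..., 1/256).
    """
    ratio = 2.0 ** (-8.0 / num_heads)
    return [ratio ** (h + 1) for h in range(num_heads)]


def alibi_bias(seq_len, slope):
    """Causal bias for one head.

    bias[i][j] = -slope * (i - j) for keys at or before the query (j <= i),
    and -inf for future keys (j > i), which masks them out.
    """
    return [
        [-slope * (i - j) if j <= i else -math.inf for j in range(seq_len)]
        for i in range(seq_len)
    ]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def biased_attention_weights(scores, slope):
    """Add the linear bias to raw query-key scores, then normalise each row.

    `scores` is a square list of lists (queries x keys) for a single head.
    No positional embeddings are involved; position enters only via the bias.
    """
    bias = alibi_bias(len(scores), slope)
    return [
        softmax([s + b for s, b in zip(score_row, bias_row)])
        for score_row, bias_row in zip(scores, bias)
    ]


if __name__ == "__main__":
    slopes = alibi_slopes(8)
    # With identical raw scores, nearer keys receive more attention weight.
    uniform_scores = [[0.0] * 4 for _ in range(4)]
    for row in biased_attention_weights(uniform_scores, slopes[0]):
        print([round(w, 3) for w in row])
```

Because the bias depends only on the query-key distance and not on any learned per-position parameter, the same function can be called with a larger `seq_len` at inference time than was used in training, which is the property the extrapolation claim relies on.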

Syllabus

Attention with Linear Biases Explained


Taught by

Unify

Related Courses

Deep Learning for Natural Language Processing
University of Oxford via Independent
Sequence Models
DeepLearning.AI via Coursera
Deep Learning Part 1 (IITM)
Indian Institute of Technology Madras via Swayam
Deep Learning - Part 1
Indian Institute of Technology, Ropar via Swayam
Deep Learning - IIT Ropar
Indian Institute of Technology, Ropar via Swayam