Attention with Linear Biases Explained
Offered By: Unify via YouTube
Course Description
Overview
Dive into an in-depth exploration of attention mechanisms with linear biases in this 39-minute video. Uncover the innovative Attention with Linear Biases (ALiBi) method, which introduces a novel position representation technique in transformer models. Learn how ALiBi penalizes query-key attention scores proportionally to their distance without adding explicit positional embeddings, enabling efficient extrapolation to longer sequence lengths beyond those seen during training. Explore the research paper "Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation" by Ofir Press, Noah A. Smith, and Mike Lewis. Gain insights into the latest AI research and industry trends through recommended resources such as The Deep Dive newsletter and Unify's blog. Connect with the Unify community through their website, GitHub, Discord, and Twitter to further engage with AI deployment stack discussions and developments.
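The distance-proportional penalty described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the paper's head-specific geometric slope schedule (slopes of 2^(-8h/n) for head h of n); the helper name `alibi_bias` is our own, not from the video or paper code:

```python
import numpy as np

def alibi_bias(num_heads: int, seq_len: int) -> np.ndarray:
    """Per-head linear biases added to query-key attention scores.

    Slopes follow the geometric sequence from the ALiBi paper:
    2^(-8/n), 2^(-16/n), ..., for n attention heads.
    """
    slopes = np.array([2.0 ** (-8.0 * (h + 1) / num_heads)
                       for h in range(num_heads)])
    pos = np.arange(seq_len)
    # distance[i, j] = j - i: zero on the diagonal, negative for past keys
    distance = pos[None, :] - pos[:, None]
    # bias[h, i, j] = -slope_h * (i - j): more negative as the key is
    # farther from the query; no learned positional embeddings needed.
    return slopes[:, None, None] * distance[None, :, :]
```

In a causal transformer this tensor is simply added to the raw attention scores before the softmax (future positions are masked as usual), which is what lets the model extrapolate to sequence lengths longer than those seen in training.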
Syllabus
Attention with Linear Biases Explained
Taught by
Unify
Related Courses
Introduction to Artificial Intelligence (Stanford University via Udacity)
Natural Language Processing (Columbia University via Coursera)
Probabilistic Graphical Models 1: Representation (Stanford University via Coursera)
Computer Vision: The Fundamentals (University of California, Berkeley via Coursera)
Learning from Data (Introductory Machine Learning course) (California Institute of Technology via Independent)