xLSTM: Extended Long Short-Term Memory
Offered By: Yannic Kilcher via YouTube
Course Description
Overview
Explore xLSTM, an architecture that combines the recurrence and constant memory requirements of LSTMs with the large-scale training capabilities of Transformers, in this comprehensive video lecture. Delve into the details of exponential gating, modified memory structures, and the integration of these LSTM extensions into residual block backbones. Learn how xLSTM blocks are residually stacked to create powerful architectures that perform favorably in comparison with state-of-the-art Transformers and State Space Models. Discover the potential of this extended LSTM approach to language modeling and its impressive results when scaled to billions of parameters. Gain insights from the research paper's abstract, which outlines the key components and advantages of xLSTM. Presented by Yannic Kilcher, this 57-minute lecture offers a deep dive into the evolution of LSTM technology and its place in the future of large language models.
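As a rough illustration of the exponential gating and normalizer state discussed in the lecture, below is a minimal NumPy sketch of a single stabilized sLSTM step in the spirit of the xLSTM paper. The weight names, the elementwise recurrence, and the toy shapes are simplifying assumptions made for this sketch; they are not code from the lecture or the paper.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def slstm_step(x, h, c, n, m, p):
    # Gate pre-activations; the recurrence is elementwise here for brevity,
    # where the paper uses (block-diagonal) recurrent weight matrices.
    i_pre = p["Wi"] @ x + p["ri"] * h + p["bi"]
    f_pre = p["Wf"] @ x + p["rf"] * h + p["bf"]
    z = np.tanh(p["Wz"] @ x + p["rz"] * h + p["bz"])   # cell input
    o = sigmoid(p["Wo"] @ x + p["ro"] * h + p["bo"])   # output gate
    # Stabilizer state m keeps the exponential gates from overflowing.
    m_new = np.maximum(f_pre + m, i_pre)
    i_gate = np.exp(i_pre - m_new)        # exponential input gate
    f_gate = np.exp(f_pre + m - m_new)    # exponential forget gate
    c_new = f_gate * c + i_gate * z       # cell state
    n_new = f_gate * n + i_gate           # normalizer state
    h_new = o * (c_new / n_new)           # normalized hidden state
    return h_new, c_new, n_new, m_new

# Toy usage: random weights, short input sequence.
d_in, d_h = 8, 16
rng = np.random.default_rng(0)
p = {k: rng.normal(scale=0.1, size=(d_h, d_in)) for k in ("Wi", "Wf", "Wz", "Wo")}
p.update({k: rng.normal(scale=0.1, size=d_h)
          for k in ("ri", "rf", "rz", "ro", "bi", "bf", "bz", "bo")})
h = c = n = m = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):
    h, c, n, m = slstm_step(x, h, c, n, m, p)

In the full architecture, cells like this sit inside residual blocks, and those blocks are stacked to form the xLSTM backbone described in the lecture.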
Syllabus
xLSTM: Extended Long Short-Term Memory
Taught by
Yannic Kilcher
Related Courses
Introduction to Artificial Intelligence - Stanford University via Udacity
Probabilistic Graphical Models 1: Representation - Stanford University via Coursera
Artificial Intelligence for Robotics - Stanford University via Udacity
Computer Vision: The Fundamentals - University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course) - California Institute of Technology via Independent