YoVDO

xLSTM - Extended Long Short-Term Memory

Offered By: Yannic Kilcher via YouTube

Tags

Deep Learning Courses, Artificial Intelligence Courses, Machine Learning Courses, Neural Networks Courses, Transformers Courses

Course Description

Overview

Explore the xLSTM architecture in this comprehensive video lecture, which combines the recurrence and constant memory requirements of LSTMs with the large-scale training capabilities of Transformers. Delve into the details of exponential gating, modified memory structures, and the integration of these LSTM extensions into residual block backbones. Learn how xLSTM blocks are residually stacked to build architectures that perform favorably compared with state-of-the-art Transformers and State Space Models. Discover the potential of this extended LSTM approach in language modeling and its strong results when scaled to billions of parameters. Gain insights from the research paper's abstract, which outlines the key components and advantages of xLSTM. Presented by Yannic Kilcher, this 57-minute lecture offers a deep dive into the future of large language models and the evolution of LSTM technology.
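To make the "exponential gating with a modified memory structure" concrete, below is a minimal NumPy sketch of a single sLSTM-style recurrent step in the spirit of the xLSTM paper: the input and forget gates use exp() instead of sigmoid, a normalizer state n rescales the cell output, and a stabilizer state m keeps the exponentials numerically safe. The function name, weight shapes, and initialization here are illustrative assumptions for exposition, not the authors' reference implementation.

```python
import numpy as np

def slstm_step(x, h_prev, c_prev, n_prev, m_prev, W, R, b):
    # Hypothetical sketch: W, R, b hold the stacked parameters for the
    # cell input z and the gates i, f, o.
    pre = W @ x + R @ h_prev + b
    z_hat, i_hat, f_hat, o_hat = np.split(pre, 4)

    # Stabilizer state: shift the gate pre-activations so exp() stays bounded.
    m = np.maximum(f_hat + m_prev, i_hat)
    i_gate = np.exp(i_hat - m)           # exponential input gate (stabilized)
    f_gate = np.exp(f_hat + m_prev - m)  # exponential forget gate (stabilized)

    c = f_gate * c_prev + i_gate * np.tanh(z_hat)  # cell state
    n = f_gate * n_prev + i_gate                   # normalizer state
    h = (1.0 / (1.0 + np.exp(-o_hat))) * (c / n)   # sigmoid output gate
    return h, c, n, m

# Tiny usage example with random weights (illustrative sizes only).
d_in, d_h = 8, 16
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4 * d_h, d_in))
R = rng.normal(scale=0.1, size=(4 * d_h, d_h))
b = np.zeros(4 * d_h)
h = c = n = m = np.zeros(d_h)
for x in rng.normal(size=(5, d_in)):
    h, c, n, m = slstm_step(x, h, c, n, m, W, R, b)
```

In the full architecture described in the lecture, steps like this are wrapped in residual blocks and stacked, which is what lets the design train at scales comparable to Transformer language models.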

Syllabus

xLSTM: Extended Long Short-Term Memory


Taught by

Yannic Kilcher

Related Courses

Neural Networks for Machine Learning
University of Toronto via Coursera
機器學習技法 (Machine Learning Techniques)
National Taiwan University via Coursera
Machine Learning Capstone: An Intelligent Application with Deep Learning
University of Washington via Coursera
Прикладные задачи анализа данных (Applied Data Analysis Problems)
Moscow Institute of Physics and Technology via Coursera
Leading Ambitious Teaching and Learning
Microsoft via edX