Differentiable Associative Memories, Attention, and Transformers
Offered By: Alfredo Canziani via YouTube
Course Description
Overview
A lecture by Yann LeCun on differentiable associative memories, attention mechanisms, and transformers. It opens with the motivation for reasoning and planning, then covers inference and planning through energy minimization. Next come differentiable associative memory and attention, followed by the transformer architecture and four applications: multilingual transformers, supervised symbol manipulation, natural language understanding and generation, and DETR (DEtection TRansformer). The lecture closes with planning through optimal control.
Syllabus
 – Motivation for reasoning & planning
 – Inference through energy minimization
 – Disclaimer
 – Planning through energy minimization
 – Q&A: Optimal control diagram
 – Differentiable associative memory and attention
 – Transformers
 – Q&A: Other differentiable attention architectures
 – Transformer architecture
 – Transformer applications: 1. Multilingual transformer architecture XLM-R
 – 2. Supervised symbol manipulation
 – 3. NL understanding & generation
 – 4. DETR
 – Planning through optimal control
 – Conclusion
Taught by
Alfredo Canziani
Related Courses
 – Discrete Inference and Learning in Artificial Vision (École Centrale Paris via Coursera)
 – Teaching Literacy Through Film (The British Film Institute via FutureLearn)
 – Linear Regression and Modeling (Duke University via Coursera)
 – Probability and Statistics (Stanford University via Stanford OpenEdx)
 – Statistical Reasoning (Stanford University via Stanford OpenEdx)
