Neural Nets for NLP 2017 - Attention
Offered By: Graham Neubig via YouTube
Syllabus
Intro
Sentence Representations
Basic Idea (Bahdanau et al. 2015)
Calculating Attention (1)
A Graphical Example
Attention Score Functions (2)
Input Sentence
Previously Generated Things
Various Modalities
Hierarchical Structures (Yang et al. 2016)
Multiple Sources
Intra-Attention / Self Attention (Cheng et al. 2016): each element in the sentence attends to other elements, giving context-sensitive encodings
Coverage
Incorporating Markov Properties (Cohn et al. 2015)
Bidirectional Training (Cohn et al. 2015)
Supervised Training (Mi et al. 2016)
Attention is not Alignment! (Koehn and Knowles 2017): attention is often blurred
Monotonic Attention (e.g. Yu et al. 2016)
Convolutional Attention (Allamanis et al. 2016)
Multi-headed Attention
Summary of the "Transformer" (Vaswani et al. 2017)
Attention Tricks
Taught by
Graham Neubig
Related Courses
Deep Learning for Natural Language Processing
University of Oxford via Independent
Sequence Models
DeepLearning.AI via Coursera
Deep Learning Part 1 (IITM)
Indian Institute of Technology Madras via Swayam
Deep Learning - Part 1
Indian Institute of Technology, Ropar via Swayam
Deep Learning - IIT Ropar
Indian Institute of Technology, Ropar via Swayam