Neural Nets for NLP 2021 - Language Modeling, Efficiency/Training Tricks
Offered By: Graham Neubig via YouTube
Course Description
Overview
Syllabus
Intro
Language Modeling: Calculating the Probability of a Sentence
Count-based Language Models
A Refresher on Evaluation
Problems and Solutions?
An Alternative: Featurized Models
A Computation Graph View
A Note: "Lookup"
Training a Model
Parameter Update
Unknown Words
Evaluation and Vocabulary
Linear Models can't Learn Feature Combinations
Neural Language Models (See Bengio et al. 2004)
Tying Input/Output Embeddings
Standard SGD
SGD With Momentum
Adagrad
Adam
Shuffling the Training Data
Neural nets have lots of parameters, and are prone to overfitting
Efficiency Tricks: Mini-batching
Minibatching
Manual Mini-batching
Mini-batched Code Example
Automatic Mini-batching!
Code-level Optimization, e.g., TorchScript provides a restricted representation of a PyTorch module that can be run efficiently in C++ (see the sketch after this syllabus)
Regularizing and Optimizing LSTM Language Models (Merity et al. 2017)
In-class Discussion
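The "Language Modeling: Calculating the Probability of a Sentence" and "Count-based Language Models" items above score a sentence word by word with the chain rule. A minimal sketch of that idea, assuming a bigram model with add-one smoothing over a toy corpus (both illustrative choices, not taken from the lecture):

    import math
    from collections import Counter

    # Toy corpus; <s> and </s> mark sentence boundaries.
    corpus = [["<s>", "the", "cat", "sat", "</s>"],
              ["<s>", "the", "dog", "sat", "</s>"]]

    vocab = {w for sent in corpus for w in sent}
    unigrams = Counter(w for sent in corpus for w in sent)
    bigrams = Counter((sent[i], sent[i + 1])
                      for sent in corpus for i in range(len(sent) - 1))

    def bigram_logprob(prev, word):
        # Add-one (Laplace) smoothed log P(word | prev).
        return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab)))

    def sentence_logprob(sent):
        # Chain rule: log P(w_1..w_n) = sum_i log P(w_i | w_{i-1}).
        return sum(bigram_logprob(sent[i - 1], sent[i]) for i in range(1, len(sent)))

    test = ["<s>", "the", "cat", "sat", "</s>"]
    lp = sentence_logprob(test)
    ppl = math.exp(-lp / (len(test) - 1))  # perplexity over the predicted tokens
    print(lp, ppl)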
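For "Neural Language Models" and "Tying Input/Output Embeddings", a PyTorch sketch of a Bengio-style feed-forward language model whose output layer reuses the input embedding matrix; the layer sizes and class name are illustrative assumptions, not the lecture's exact model:

    import torch
    import torch.nn as nn

    class FeedForwardLM(nn.Module):
        # Predicts the next word from the previous `context` words.
        def __init__(self, vocab_size, emb_dim=64, hidden_dim=128, context=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.hidden = nn.Linear(context * emb_dim, hidden_dim)
            self.out = nn.Linear(hidden_dim, emb_dim)
            # Weight tying: the softmax scores are computed against embed.weight,
            # so input and output word representations share parameters.

        def forward(self, context_ids):               # (batch, context)
            emb = self.embed(context_ids)              # (batch, context, emb_dim)
            h = torch.tanh(self.hidden(emb.flatten(1)))
            return self.out(h) @ self.embed.weight.t()  # (batch, vocab_size) logits

    model = FeedForwardLM(vocab_size=1000)
    logits = model(torch.randint(0, 1000, (8, 2)))  # batch of 8 two-word contexts
    print(logits.shape)                             # torch.Size([8, 1000])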
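The optimizer items ("Standard SGD", "SGD With Momentum", "Adagrad", "Adam") each correspond to a different parameter update rule. A minimal NumPy sketch of the four updates; the hyperparameter values are common defaults rather than values from the lecture:

    import numpy as np

    def sgd(theta, grad, lr=0.1):
        return theta - lr * grad

    def sgd_momentum(theta, grad, velocity, lr=0.1, mu=0.9):
        velocity = mu * velocity - lr * grad          # decaying sum of past gradients
        return theta + velocity, velocity

    def adagrad(theta, grad, accum, lr=0.1, eps=1e-8):
        accum = accum + grad ** 2                     # per-parameter sum of squared gradients
        return theta - lr * grad / (np.sqrt(accum) + eps), accum

    def adam(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        m = beta1 * m + (1 - beta1) * grad            # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * grad ** 2       # second-moment estimate
        m_hat = m / (1 - beta1 ** t)                  # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

    # One Adam step on a toy loss L(theta) = sum(theta^2), so grad = 2 * theta.
    theta = np.array([1.0, -2.0])
    m, v = np.zeros_like(theta), np.zeros_like(theta)
    theta, m, v = adam(theta, 2 * theta, m, v, t=1)
    print(theta)

In practice, PyTorch's torch.optim.SGD (with its momentum argument), torch.optim.Adagrad, and torch.optim.Adam implement these same rules.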
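The mini-batching items describe padding sentences of different lengths so that one tensor operation covers many training examples at once. A sketch of manual mini-batching, assuming PyTorch and a hypothetical pad ID of 0:

    import torch

    def make_minibatch(sentences, pad_id=0):
        # Sentences are lists of word IDs of different lengths; pad them to the
        # same length so they stack into one (batch, max_len) tensor.
        max_len = max(len(s) for s in sentences)
        batch = torch.full((len(sentences), max_len), pad_id, dtype=torch.long)
        for i, sent in enumerate(sentences):
            batch[i, :len(sent)] = torch.tensor(sent)
        mask = (batch != pad_id)   # True for real tokens, False for padding
        return batch, mask

    batch, mask = make_minibatch([[5, 8, 2], [7, 3], [9, 4, 6, 1]])
    print(batch)
    # One embedding lookup now covers the whole batch instead of one sentence at a time.
    emb = torch.nn.Embedding(10, 4)(batch)   # (3, 4, 4) in this toy example
    print(emb.shape)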
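For "Code-level Optimization", a small sketch of compiling a module with torch.jit.script, which produces a TorchScript program that can be saved and then loaded from C++ (via torch::jit::load) without the Python interpreter; the module itself is a toy example:

    import torch

    class TinyClassifier(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.linear = torch.nn.Linear(4, 2)

        def forward(self, x):
            return torch.relu(self.linear(x))

    scripted = torch.jit.script(TinyClassifier())   # compile to TorchScript
    scripted.save("tiny_classifier.pt")             # loadable from C++ without Python
    print(scripted(torch.randn(3, 4)))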
Taught by
Graham Neubig
Related Courses
Natural Language Processing - Columbia University via Coursera
Natural Language Processing - Stanford University via Coursera
Introduction to Natural Language Processing - University of Michigan via Coursera
moocTLH: Nuevos retos en las tecnologías del lenguaje humano - Universidad de Alicante via Miríadax
Natural Language Processing - Indian Institute of Technology, Kharagpur via Swayam