Debugging Neural Nets for NLP

Offered By: Graham Neubig via YouTube

Tags

Neural Networks Courses, Natural Language Processing (NLP) Courses, Overfitting Courses, Model Optimization Courses

Course Description

Overview

Explore debugging techniques for neural networks in natural language processing during this lecture from CMU's Neural Networks for NLP course. Learn to diagnose problems, address training and decoding time issues, combat overfitting, and handle disconnects between loss and evaluation. Gain insights into model sizing, optimization challenges, initialization strategies, and the impact of data sorting on performance. Discover effective approaches for beam search debugging and implementing dev-driven learning rate decay to enhance your NLP models.

Syllabus

Intro
In Neural Networks, Tuning is Paramount!
A Typical Situation
Possible Causes
Identifying Training Time Problems
Is My Model Too Weak? • Your model needs to be big enough to learn • Model size depends on the task • For language modeling, at least 512 nodes • For natural language analysis, 128 or so may do • Multiple layers are often better
Be Careful of Deep Models
Trouble w/ Optimization
Reminder: Optimizers
Initialization
Bucketing/Sorting • If we use sentences of different lengths, too much padding can result in slow training • To remedy this, sort sentences so that sentences of similar length are in the same batch • But this can affect performance! (Morishita et al. 2017)
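
The bucketing idea above can be illustrated with a minimal sketch. It is not taken from the lecture: the function names, the fixed batch size, and the choice to shuffle batch order after sorting are all assumptions made for illustration. It groups sentences of similar length into the same batch and counts how many padding positions each batching scheme would need.

```python
import random


def make_length_sorted_batches(sentences, batch_size, seed=0):
    """Group sentences of similar length into the same batch (hypothetical helper)."""
    # Sort by length so that neighbours in the list have similar lengths.
    ordered = sorted(sentences, key=len)
    batches = [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
    # Assumption: shuffle the order of batches so training does not always
    # see the shortest sentences first.
    random.Random(seed).shuffle(batches)
    return batches


def padding_tokens(batches):
    """Count the pad positions needed to make every batch rectangular."""
    total = 0
    for batch in batches:
        longest = max(len(sentence) for sentence in batch)
        total += sum(longest - len(sentence) for sentence in batch)
    return total


# Toy data: six "sentences" with lengths 2, 9, 3, 8, 2, 9.
sentences = [["tok"] * n for n in [2, 9, 3, 8, 2, 9]]
unsorted_batches = [sentences[i:i + 2] for i in range(0, len(sentences), 2)]
sorted_batches = make_length_sorted_batches(sentences, batch_size=2)
print(padding_tokens(unsorted_batches), padding_tokens(sorted_batches))
```

On this toy data the sorted batches need far fewer padding positions. As the syllabus entry notes, sorting changes the composition of batches and can affect model performance, so it is worth comparing results with and without it.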
Debugging Decoding
Beam Search
Debugging Search
Look At Your Data!
Symptoms of Overfitting
Reminder: Dev-driven Learning Rate Decay • Start with a high learning rate, then decay the learning rate when the model starts overfitting the development set (the "newbob" learning rate schedule)
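
Dev-driven learning rate decay, as described in the last syllabus entry, can be sketched roughly as follows. This is a simplified illustration rather than the lecture's implementation: the class name `DevDrivenDecay`, the decay factor of 0.5, and the rule "decay whenever dev loss fails to improve on the best value so far" are assumptions, and published descriptions of the newbob schedule differ in their details.

```python
class DevDrivenDecay:
    """Decay the learning rate when development-set loss stops improving.

    Hypothetical helper for illustration only.
    """

    def __init__(self, lr, decay=0.5):
        self.lr = lr                      # current learning rate (start high)
        self.decay = decay                # assumed multiplicative decay factor
        self.best_dev_loss = float("inf")

    def step(self, dev_loss):
        """Call after each dev evaluation; returns the learning rate to use next."""
        if dev_loss < self.best_dev_loss:
            # Dev loss improved: remember it and keep the current rate.
            self.best_dev_loss = dev_loss
        else:
            # Dev loss got worse or stalled, a sign of overfitting: decay.
            self.lr *= self.decay
        return self.lr


scheduler = DevDrivenDecay(lr=1.0, decay=0.5)
dev_losses = [2.0, 1.5, 1.6, 1.4, 1.45]   # made-up dev losses, one per epoch
rates = [scheduler.step(loss) for loss in dev_losses]
print(rates)
```

In a real training loop, the returned rate would be written back into the optimizer after each development-set evaluation. Many toolkits ship a scheduler with similar behavior, which is usually preferable to a hand-written one.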


Taught by

Graham Neubig
