YoVDO

Neural Nets for NLP - Structured Prediction with Local Independence Assumptions

Offered By: Graham Neubig via YouTube

Tags

Neural Networks Courses
Natural Language Processing (NLP) Courses
Convolutional Neural Networks (CNN) Courses
Sequence Labeling Courses

Course Description

Overview

Explore structured prediction with local independence assumptions in this lecture from CMU's Neural Networks for NLP course. Learn the rationale behind local independence assumptions and get an introduction to Conditional Random Fields. Cover sequence labeling techniques, including BiLSTM-CRF models, along with their training and decoding procedures. See how CNNs are used for character-level representations, and survey reward functions in structured prediction. Examine methods for minimizing risk through enumeration and sampling, and the concept of token-wise minimum risk, with a focus on practical applications in sequence labeling tasks.

Syllabus

Intro
Sequence Labeling: one tag for one word, e.g. part-of-speech tagging
Sequence Labeling as Independent Classification
Problems
Exposure Bias: Teacher Forcing
Label Bias
Models w/ Local Dependencies
Reminder: Globally Normalized Models
Conditional Random Fields: general form of a globally normalized model
Potential Functions
BiLSTM-CRF for Sequence Labeling
CRF Training & Decoding
Interactions
Forward Step: Initial Part. First, calculate the transition from the sentence-initial symbol and the emission of the first word for every POS
Forward Step: Middle Parts
Forward Step: Final Part. Finish up the sentence with the sentence-final symbol
Computing the Partition Function: alpha_t(y|X) is the partition over sequences of length t that end with label y
Decoding and Gradient Calculation
CNN for Character-level Representation: we used a CNN to extract morphological information such as the prefix or suffix of a word
Training Details
Experiments
Reward Functions in Structured Prediction
Previous Methods to Consider Reward
Minimizing Risk by Enumeration: directly calculate the risk of all hypotheses in the space
Enumeration + Sampling (Shen+ 2016): enumerating all hypotheses is intractable! Instead of enumerating over everything, only enumerate over a sample, and re-normalize
Token-wise Minimum Risk: if we can come up with a decomposable error function, we can calculate risk for each word
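
The forward steps listed in the syllabus (initial part, middle parts, final part) can be illustrated with a small sketch of how a linear-chain CRF partition function might be computed. This is not code from the lecture: the function names, the score layout (`emit`, `trans`, `start`, `end`) and the toy numbers are all assumptions made for illustration, and everything is kept in log space using only the Python standard library. A brute-force enumeration is included only to check the result on a tiny input.

```python
import itertools
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(emit, trans, start, end):
    """Forward algorithm for a linear-chain CRF, in log space.

    emit[t][y]  : emission score of tag y at position t
    trans[p][y] : transition score from tag p to tag y
    start[y]    : transition score from the sentence-initial symbol to y
    end[y]      : transition score from y to the sentence-final symbol
    (This score layout is an assumption made for this sketch.)
    """
    num_tags = len(start)
    # Initial part: transition from the start symbol plus the first emission.
    alpha = [start[y] + emit[0][y] for y in range(num_tags)]
    # Middle parts: for each current tag, sum over every previous tag.
    for t in range(1, len(emit)):
        alpha = [
            logsumexp([alpha[p] + trans[p][y] for p in range(num_tags)])
            + emit[t][y]
            for y in range(num_tags)
        ]
    # Final part: finish the sentence with the sentence-final symbol.
    return logsumexp([alpha[y] + end[y] for y in range(num_tags)])


def log_partition_brute_force(emit, trans, start, end):
    """Enumerate every tag sequence; only feasible for tiny inputs."""
    num_tags = len(start)
    scores = []
    for tags in itertools.product(range(num_tags), repeat=len(emit)):
        s = start[tags[0]] + emit[0][tags[0]]
        for t in range(1, len(tags)):
            s += trans[tags[t - 1]][tags[t]] + emit[t][tags[t]]
        scores.append(s + end[tags[-1]])
    return logsumexp(scores)


# Hypothetical scores for a 3-word sentence and 2 tags.
EMIT = [[1.0, 0.2], [0.3, 1.5], [0.8, 0.1]]
TRANS = [[0.5, -0.2], [0.1, 0.9]]
START = [0.0, -0.5]
END = [0.2, 0.0]

if __name__ == "__main__":
    print(log_partition(EMIT, TRANS, START, END))
    print(log_partition_brute_force(EMIT, TRANS, START, END))
```

The forward pass costs time proportional to sentence length times the square of the number of tags, while the brute-force check grows exponentially with sentence length, which is the practical reason for keeping dependencies local.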


Taught by

Graham Neubig

Related Courses

Automating Data Extraction from Documents Using NLP
Pluralsight
CMU Multilingual NLP 2020 - Advanced Text Classification-Labeling
Graham Neubig via YouTube
CMU Multilingual NLP 2020 - Text Classification and Sequence Labeling
Graham Neubig via YouTube
CMU Neural Nets for NLP - Structured Prediction Basics
Graham Neubig via YouTube
Neural Nets for NLP - Structured Prediction Basics
Graham Neubig via YouTube