CMU Advanced NLP: Word Segmentation and Morphology
Offered By: Graham Neubig via YouTube
Course Description
Overview
Explore word segmentation and morphology in this advanced Natural Language Processing lecture from Carnegie Mellon University. Delve into the complexities of defining a "word," learn about tokenization techniques, and understand morphological analysis across various languages. Discover unsupervised subword segmentation methods and examine language typology, including isolating and agglutinative languages. Investigate historical linguistics, patterns of languages, and the application of finite state automata and finite state transducers in morphological analysis. Gain insights into spelling rules, type-token curves, and the intricacies of morphology in English and other European languages.
Syllabus
Introduction
Overview
Word Segmentation
The apostrophe
What is a word
Tokenization
European Languages
Slides
Problem with tokenization
Rule-based tokenization
Sentence boundary
Subword analysis
What is morphology
Rule-based systems
Language typology
Isolating languages
Agglutinative languages
Turkish
English
Other European Languages
Indo-European Languages
Germanic Languages
Chinese
Historical Linguistics
Patterns of Languages
Reduplication
Type-token curves
Recognizing words of a language
Spelling rules
Finite State Automata
Adjectives
Morphology in English
Finite State Transducer
E-insertion
FST
Taught by
Graham Neubig
Related Courses
Miracles of Human Language: An Introduction to Linguistics
Leiden University via Coursera
Introduction to Catalan Sign Language: Speaking with Your Hands and Hearing with Your Eyes
Universitat Pompeu Fabra via FutureLearn
Zoologia
University of Naples Federico II via Federica
Linguaggio, identità di genere e lingua italiana
Ca' Foscari University of Venice via EduOpen
Sign Language Structure, Learning, and Change
Georgetown University via edX