CMU Advanced NLP: Word Segmentation and Morphology
Offered By: Graham Neubig via YouTube
Course Description
Overview
Explore word segmentation and morphology in this advanced Natural Language Processing lecture from Carnegie Mellon University. Delve into the complexities of defining a "word," learn about tokenization techniques, and understand morphological analysis across various languages. Discover unsupervised subword segmentation methods and examine language typology, including isolating and agglutinative languages. Investigate historical linguistics, patterns of languages, and the application of finite state automata in morphological analysis. Gain insights into spelling rules, type-token curves, and the intricacies of morphology in English and other European languages.
Syllabus
Introduction
Overview
Word Segmentation
The apostrophe
What is a word
Tokenization
European Languages
Slides
Problem with tokenization
Rule-based tokenization
Sentence boundary
Subword analysis
What is morphology
Rule-based systems
Language typology
Isolating languages
Agglutinative languages
Turkish
English
Other European Languages
Indo-European Languages
Germanic Languages
Chinese
Historical Linguistics
Patterns of Languages
Reduplication
Type-token curves
Recognizing words of a language
Spelling rules
Finite State Automata
Adjectives
Morphology in English
Finite State Transducer
E-insertion
FST
Taught by
Graham Neubig
Related Courses
Natural Language Processing (Columbia University via Coursera)
Natural Language Processing (Stanford University via Coursera)
Introduction to Natural Language Processing (University of Michigan via Coursera)
moocTLH: Nuevos retos en las tecnologías del lenguaje humano (Universidad de Alicante via Miríadax)
Natural Language Processing (Indian Institute of Technology, Kharagpur via Swayam)