
CMU Multilingual NLP 2020 - Data Augmentation for Machine Translation

Offered By: Graham Neubig via YouTube

Tags

Natural Language Processing (NLP) Courses
Data Augmentation Courses
Multilingual Natural Language Processing Courses

Course Description

Overview

This 25-minute lecture from CMU's Multilingual Natural Language Processing course covers data augmentation techniques for machine translation. It opens with the data challenges of low-resource machine translation and with multilingual training approaches, then turns to methods that draw on monolingual data and on high-resource languages (HRLs): back translation and iterative back-translation, English-HRL augmentation, augmentation via pivoting, monolingual data copying, and dictionary-based augmentation. An aside on word alignment leads into word-by-word data augmentation, with and without reordering.

Syllabus

Intro
Data Challenges in Low-resource MT
Multilingual Training Approaches
Data Augmentation 101: Back Translation
Back Translation Idea
How to Generate Translations
Iterative Back-translation
Back Translation Issues
English - HRL Augmentation
Augmentation via Pivoting
Data w/ Various Types of Pivoting
Monolingual Data Copying
Dictionary-based Augmentation
An Aside: Word Alignment
Word-by-word Data Augmentation
Word-by-word Augmentation w/ Reordering


Taught by

Graham Neubig

Related Courses

CMU Multilingual NLP - The LORELEI Project
Graham Neubig via YouTube
CMU Multilingual NLP 2022 - Speech
Graham Neubig via YouTube
Multilingual NLP 2022 - Language Contact and Change
Graham Neubig via YouTube
CMU Multilingual NLP 2022 - Data-Driven Strategies for NMT
Graham Neubig via YouTube
CMU Multilingual NLP 2022 - Typology
Graham Neubig via YouTube