KDD2020 - Transfer Learning Joshi
Offered By: Association for Computing Machinery (ACM) via YouTube
Course Description
Overview
Explore transfer learning and pre-trained contextualized representations in this 20-minute conference talk from KDD2020. Dive into BERT and its improvements, including span-based efficient pre-training and ROBERTA. Learn about extractive QA, GLUE, and the challenges that remain in the field. Discover potential future directions such as few-shot learning and non-parametric memories. Gain insights from Mandar Joshi on advancing natural language processing techniques through innovative pre-training approaches and model architectures.
Syllabus
Transfer Learning via Pre-training
Pre-trained Contextualized Representations
BERT [Devlin et al. (2018)]
How can we do better?
Span-based Efficient Pre-training
Pre-training Span Representations
Why is this more efficient?
Random subword masks can be too easy
Which spans to mask?
Why SBO?
Single-sequence Inputs
Evaluation
Baselines
Extractive QA: SQUAD
GLUE
ROBERTA: Scaling BERT
The ROBERTA Recipe
What is still hard?
Next Big Thing: Few Shot Learning?
Next Big Thing: Non-parametric Memories?
Taught by
Association for Computing Machinery (ACM)
Related Courses
Sentiment Analysis with Deep Learning using BERTCoursera Project Network via Coursera Natural Language Processing with Attention Models
DeepLearning.AI via Coursera Fine Tune BERT for Text Classification with TensorFlow
Coursera Project Network via Coursera Deploy a BERT question answering bot on Django
Coursera Project Network via Coursera Generating discrete sequences: language and music
Ural Federal University via edX