KDD2020 - Transfer Learning Joshi

Offered By: Association for Computing Machinery (ACM) via YouTube

Tags

Transfer Learning Courses
BERT Courses
Scalability Courses

Course Description

Overview

This 20-minute conference talk from KDD2020 covers transfer learning and pre-trained contextualized representations. Mandar Joshi reviews BERT and two ways of improving on it: span-based efficient pre-training and RoBERTa, which scales BERT. The talk reports evaluation on extractive QA (SQuAD) and GLUE, discusses what is still hard, and considers possible future directions such as few-shot learning and non-parametric memories.

Syllabus

Transfer Learning via Pre-training
Pre-trained Contextualized Representations
BERT [Devlin et al. (2018)]
How can we do better?
Span-based Efficient Pre-training
Pre-training Span Representations
Why is this more efficient?
Random subword masks can be too easy
Which spans to mask?
Why SBO?
Single-sequence Inputs
Evaluation
Baselines
Extractive QA: SQuAD
GLUE
RoBERTa: Scaling BERT
The RoBERTa Recipe
What is still hard?
Next Big Thing: Few Shot Learning?
Next Big Thing: Non-parametric Memories?


Taught by

Association for Computing Machinery (ACM)

Related Courses

Sentiment Analysis with Deep Learning using BERT
Coursera Project Network via Coursera
Natural Language Processing with Attention Models
DeepLearning.AI via Coursera
Fine Tune BERT for Text Classification with TensorFlow
Coursera Project Network via Coursera
Deploy a BERT question answering bot on Django
Coursera Project Network via Coursera
Generating discrete sequences: language and music
Ural Federal University via edX