
KDD2020 - Transfer Learning Joshi

Offered By: Association for Computing Machinery (ACM) via YouTube

Tags

Transfer Learning, BERT, Scalability

Course Description

Overview

Explore transfer learning and pre-trained contextualized representations in this 20-minute conference talk from KDD 2020. Dive into BERT and its improvements, including span-based efficient pre-training and RoBERTa. Learn about extractive QA on SQuAD, GLUE, and the challenges that remain in the field. Discover potential future directions such as few-shot learning and non-parametric memories. Gain insights from Mandar Joshi on advancing natural language processing through innovative pre-training approaches and model architectures.
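
A minimal sketch of what "transfer learning via pre-training" looks like in practice appears below: a pre-trained contextualized encoder is loaded and its representations feed a small task-specific head. The sketch assumes the Hugging Face transformers library, PyTorch, and the public "roberta-base" checkpoint; none of these are prescribed by the talk itself.

    # Transfer learning from a pre-trained contextualized encoder (illustrative).
    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    encoder = AutoModel.from_pretrained("roberta-base")

    # Encode a sentence and use the contextualized representation of the
    # first token as a sentence-level feature for a downstream task.
    inputs = tokenizer("Pre-training transfers to downstream tasks.",
                       return_tensors="pt")
    with torch.no_grad():
        outputs = encoder(**inputs)
    sentence_repr = outputs.last_hidden_state[:, 0]   # shape: (1, hidden_size)

    # A downstream classifier (e.g. for a GLUE-style task) is a small head
    # trained on top of these frozen or fine-tuned representations.
    classifier = torch.nn.Linear(sentence_repr.size(-1), 2)
    logits = classifier(sentence_repr)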

Syllabus

Transfer Learning via Pre-training
Pre-trained Contextualized Representations
BERT [Devlin et al. (2018)]
How can we do better?
Span-based Efficient Pre-training (see the masking sketch after this syllabus)
Pre-training Span Representations
Why is this more efficient?
Random subword masks can be too easy
Which spans to mask?
Why SBO?
Single-sequence Inputs
Evaluation
Baselines
Extractive QA: SQuAD
GLUE
RoBERTa: Scaling BERT
The RoBERTa Recipe
What is still hard?
Next Big Thing: Few-Shot Learning?
Next Big Thing: Non-parametric Memories?
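
The span-based masking idea referenced above ("Random subword masks can be too easy", "Which spans to mask?") can be illustrated with a short sketch: instead of masking isolated subwords, contiguous spans are masked so the model must reconstruct multi-token units from context. The geometric length distribution and 15% masking budget below are illustrative defaults, not the talk's exact recipe.

    import random

    MASK = "[MASK]"

    def sample_span_length(rng, p=0.2, max_len=10):
        # Draw a span length from a clipped geometric distribution,
        # so short spans are common but longer ones still occur.
        length = 1
        while rng.random() > p and length < max_len:
            length += 1
        return length

    def mask_spans(tokens, mask_budget=0.15, seed=0):
        # Replace contiguous spans with [MASK] instead of isolated subwords.
        rng = random.Random(seed)
        tokens = list(tokens)
        target = max(1, int(len(tokens) * mask_budget))
        masked = set()
        while len(masked) < target:
            length = sample_span_length(rng)
            start = rng.randrange(0, max(1, len(tokens) - length + 1))
            masked.update(range(start, min(len(tokens), start + length)))
        return [MASK if i in masked else tok for i, tok in enumerate(tokens)]

    print(" ".join(mask_spans("an American football game was played in Denver "
                              "to determine the champion".split())))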


Taught by

Association for Computing Machinery (ACM)

Related Courses

Structuring Machine Learning Projects
DeepLearning.AI via Coursera
Natural Language Processing on Google Cloud
Google Cloud via Coursera
Introduction to Learning Transfer and Life Long Learning (3L)
University of California, Irvine via Coursera
Advanced Deployment Scenarios with TensorFlow
DeepLearning.AI via Coursera
Neural Style Transfer with TensorFlow
Coursera Project Network via Coursera