Stanford Seminar - Transformers in Language: The Development of GPT Models Including GPT-3
Offered By: Stanford University via YouTube
Course Description
Overview
Syllabus
Introduction
3-Gram Model (Shannon 1951)
Recurrent Neural Nets (Sutskever et al 2011)
Big LSTM (Jozefowicz et al 2016)
Transformer (Liu and Saleh et al 2018)
GPT-2: Big Transformer (Radford et al 2019)
GPT-3: Very Big Transformer (Brown et al 2020)
GPT-3: Can Humans Detect Generated News Articles?
Why Unsupervised Learning?
Is There a Big Trove of Unlabeled Data?
Why Use Autoregressive Generative Models for Unsupervised Learning?
Unsupervised Sentiment Neuron (Radford et al 2017)
GPT-1 (Radford et al 2018)
Zero-Shot Reading Comprehension
GPT-2: Zero-Shot Translation
Language Model Metalearning
GPT-3: Few-Shot Arithmetic
GPT-3: Few-Shot Word Unscrambling
GPT-3: General Few-Shot Learning
iGPT (Chen et al 2020): Can We Apply GPT to Images?
iGPT: Completions
iGPT: Feature Learning
Isn't Code Just Another Modality?
The HumanEval Dataset
The Pass@k Metric (see sketch below)
Codex: Training Details
An Easy HumanEval Problem (pass@1 ≈ 0.9)
A Medium HumanEval Problem (pass@1 ≈ 0.17)
A Hard HumanEval Problem (pass@1 ≈ 0.005)
Calibrating Sampling Temperature for Pass@k (see sketch below)
The Unreasonable Effectiveness of Sampling
Can We Approximate Sampling Against an Oracle?
Main Figure
Limitations
Conclusion
Acknowledgements
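For reference, the pass@k metric named in the syllabus comes from the Codex paper (Chen et al 2021) and is usually computed with an unbiased estimator rather than by literally drawing k samples per problem. A minimal sketch in Python, assuming n generations per problem of which c pass the unit tests (the function name pass_at_k is illustrative):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k: the probability that at least one of k
    samples, drawn without replacement from n generations, passes the unit
    tests, given that c of the n generations pass.
    Equals 1 - C(n-c, k) / C(n, k), computed as a numerically stable product."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a passing sample
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: if 180 of 200 samples pass, pass@1 is 0.9, matching the
# "easy problem" bucket in the syllabus above.
print(pass_at_k(n=200, c=180, k=1))   # 0.9
print(pass_at_k(n=200, c=180, k=10))  # close to 1.0
```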
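Likewise, the temperature-calibration item refers to scaling logits before softmax sampling; the Codex paper finds that the best temperature grows with k (near-greedy sampling for pass@1, hotter sampling for pass@100, where diversity pays off). A minimal illustrative sketch, not the talk's own code:

```python
import numpy as np

def sample_token(logits: np.ndarray, temperature: float,
                 rng: np.random.Generator) -> int:
    """Sample one token index from a softmax over temperature-scaled logits.
    temperature < 1 sharpens the distribution (favors pass@1);
    temperature > 1 flattens it, buying diversity for large-k pass@k."""
    scaled = logits / temperature
    scaled -= scaled.max()          # subtract max for numerical stability
    probs = np.exp(scaled)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.5, -1.0])
print(sample_token(logits, temperature=0.2, rng=rng))  # near-greedy
print(sample_token(logits, temperature=1.0, rng=rng))  # more diverse
```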
Taught by
Stanford Online
Related Courses
Text Mining and Analytics - University of Illinois at Urbana-Champaign via Coursera
Introduction to Natural Language Processing - University of Michigan via Coursera
Enabling Technologies for Data Science and Analytics: The Internet of Things - Columbia University via edX
Machine Learning Capstone: An Intelligent Application with Deep Learning - University of Washington via Coursera
moocTLH: Nuevos retos en las tecnologías del lenguaje humano (New Challenges in Human Language Technologies) - Universidad de Alicante via Miríadax