Stanford Seminar - Transformers in Language: The Development of GPT Models Including GPT-3
Offered By: Stanford University via YouTube
Course Description
Overview
Syllabus
Introduction.
3-Gram Model (Shannon 1951).
Recurrent Neural Nets (Sutskever et al 2011).
Big LSTM (Jozefowicz et al 2016).
Transformer (Liu and Saleh et al 2018).
GPT-2: Big Transformer (Radford et al 2019).
GPT-3: Very Big Transformer (Brown et al 2020).
GPT-3: Can Humans Detect Generated News Articles?
Why Unsupervised Learning?
Is There a Big Trove of Unlabeled Data?
Why Use Autoregressive Generative Models for Unsupervised Learning?
Unsupervised Sentiment Neuron (Radford et al 2017).
GPT-1 (Radford et al 2018).
Zero-Shot Reading Comprehension.
GPT-2: Zero-Shot Translation.
Language Model Metalearning.
GPT-3: Few-Shot Arithmetic.
GPT-3: Few-Shot Word Unscrambling.
GPT-3: General Few-Shot Learning.
iGPT (Chen et al 2020): Can We Apply GPT to Images?
iGPT: Completions.
iGPT: Feature Learning.
Isn't Code Just Another Modality?
The HumanEval Dataset.
The Pass@k Metric.
Codex: Training Details.
An Easy HumanEval Problem (pass@1 ≈ 0.9).
A Medium HumanEval Problem (pass@1 ≈ 0.17).
A Hard HumanEval Problem (pass@1 ≈ 0.005).
Calibrating Sampling Temperature for Pass@k.
The Unreasonable Effectiveness of Sampling.
Can We Approximate Sampling Against an Oracle?
Main Figure.
Limitations.
Conclusion.
Acknowledgements.
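Several chapters above name concrete techniques; a few minimal, illustrative sketches follow. First, the 3-gram model chapter (Shannon 1951) covers predicting each token from the two before it using raw counts. A count-based sketch in Python (the toy corpus and function names are our own illustration, not material from the talk):

```python
from collections import Counter, defaultdict

def train_trigram(tokens):
    """Count-based trigram model: P(w | u, v) = count(u, v, w) / count(u, v)."""
    counts = defaultdict(Counter)
    for u, v, w in zip(tokens, tokens[1:], tokens[2:]):
        counts[(u, v)][w] += 1
    return counts

def most_likely_next(counts, u, v):
    """Most frequent continuation of the bigram (u, v), or None if unseen."""
    following = counts.get((u, v))
    return following.most_common(1)[0][0] if following else None

tokens = "the cat sat on the mat and the cat slept on the mat".split()
model = train_trigram(tokens)
print(most_likely_next(model, "on", "the"))  # -> "mat"
```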
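The HumanEval chapters report scores with the pass@k metric: the probability that at least one of k sampled programs passes the unit tests. Below is a sketch of the unbiased estimator given in the Codex paper (Chen et al 2021), 1 - C(n-c, k) / C(n, k), computed in its numerically stable product form; the function name and NumPy usage are ours:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n generated samples, of which c
    pass the unit tests, for a budget of k samples per problem."""
    if n - c < k:
        return 1.0  # too few failures: every size-k draw contains a correct sample
    # 1 - C(n-c, k) / C(n, k), expanded as a telescoping product
    return float(1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))
```

For example, pass_at_k(200, 180, 1) returns 0.9, matching the c/n intuition for pass@1 values like those quoted for the easy, medium, and hard problems above.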
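The temperature-calibration chapter concerns the diversity/accuracy trade-off when sampling many programs: higher temperatures flatten the distribution and tend to help pass@k at large k, while low temperatures favor pass@1. A generic sketch of temperature-scaled sampling from logits (plain NumPy, not code from the talk):

```python
import numpy as np

def sample_with_temperature(logits, temperature=1.0, rng=None):
    """Draw one token index from softmax(logits / temperature).
    temperature < 1 sharpens the distribution (favoring pass@1);
    temperature > 1 flattens it (more diverse samples, favoring large k)."""
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                      # for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return int(rng.choice(len(probs), p=probs))
```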
Taught by
Stanford Online
Related Courses
How to Build Codex Solutions (Microsoft via YouTube)
Unlocking the Power of OpenAI for Startups - Microsoft for Startups (Microsoft via YouTube)
Building Intelligent Applications with World-Class AI (Microsoft via YouTube)
ChatGPT: GPT-3, GPT-4 Turbo: Unleash the Power of LLM's (Udemy)
Fine-Tuning OpenAI's GPT-3 (Weights & Biases via YouTube)