YoVDO

Stanford Seminar - Transformers in Language: The Development of GPT Models Including GPT-3

Offered By: Stanford University via YouTube

Tags

Chatbot Courses
Unsupervised Learning Courses
Sentiment Analysis Courses
GPT-3 Courses
Long short-term memory (LSTM) Courses
Image Processing Courses
Transformer Models Courses
Generative Models Courses

Course Description

Overview

Explore the evolution of language models in this Stanford seminar, tracing the development from early 3-gram models to GPT-3. Delve into the architecture of Transformers, their applications in unsupervised learning, and their impact on various natural language processing tasks. Examine zero-shot and few-shot learning capabilities, and investigate the extension of GPT models to image processing and code generation. Gain insights into the HumanEval dataset, the pass@k metric, and the challenges faced in developing these language models. Conclude with a discussion of the limitations and future directions of GPT technology.

Syllabus

Introduction.
3-Gram Model (Shannon 1951).
Recurrent Neural Nets (Sutskever et al 2011).
Big LSTM (Jozefowicz et al 2016).
Transformer (Liu and Saleh et al 2018).
GPT-2: Big Transformer (Radford et al 2019).
GPT-3: Very Big Transformer (Brown et al 2020).
GPT-3: Can Humans Detect Generated News Articles?
Why Unsupervised Learning?
Is there a Big Trove of Unlabeled Data?
Why Use Autoregressive Generative Models for Unsupervised Learning?
Unsupervised Sentiment Neuron (Radford et al 2017).
(Radford et al 2018).
Zero-Shot Reading Comprehension.
GPT-2: Zero-Shot Translation.
Language Model Metalearning.
GPT-3: Few Shot Arithmetic.
GPT-3: Few Shot Word Unscrambling.
GPT-3: General Few Shot Learning.
iGPT (Chen et al 2020): Can we apply GPT to images?
iGPT: Completions.
iGPT: Feature Learning.
Isn't Code Just Another Modality?
The HumanEval Dataset.
The pass@k Metric.
Codex: Training Details.
An Easy HumanEval Problem (pass@1 ~0.9).
A Medium HumanEval Problem (pass@1 ~0.17).
A Hard HumanEval Problem (pass@1 ~0.005).
Calibrating Sampling Temperature for Pass@k.
The Unreasonable Effectiveness of Sampling.
Can We Approximate Sampling Against an Oracle?
Main Figure.
Limitations.
Conclusion.
Acknowledgements.
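
The pass@k figures quoted in the syllabus can be made concrete with a small sketch. It assumes the commonly used unbiased estimator, 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them pass the unit tests. The function name pass_at_k and the sample counts in the usage lines are illustrative assumptions, not values taken from the seminar.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated (assumed value, chosen by the evaluator)
    c: number of those samples that passed the tests
    k: sample budget being scored
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # With fewer than k failing samples, every size-k draw contains a pass.
    if n - c < k:
        return 1.0
    # Probability that a random size-k subset contains no passing sample,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical counts chosen to reproduce rates of the same size as the
# easy, medium, and hard examples listed in the syllabus.
easy = pass_at_k(n=200, c=180, k=1)
medium = pass_at_k(n=200, c=34, k=1)
hard = pass_at_k(n=200, c=1, k=1)
hard_at_100 = pass_at_k(n=200, c=1, k=100)
```

For k = 1 the estimate reduces to c / n, so the hypothetical hard problem scores about 0.005, while raising the budget to k = 100 lifts the same problem to 0.5. This is one way to read the syllabus item on the effectiveness of sampling.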


Taught by

Stanford Online

Related Courses

How to Build Codex Solutions
Microsoft via YouTube
Unlocking the Power of OpenAI for Startups - Microsoft for Startups
Microsoft via YouTube
Building Intelligent Applications with World-Class AI
Microsoft via YouTube
ChatGPT: GPT-3, GPT-4 Turbo: Unleash the Power of LLM's
Udemy
Fine-Tuning OpenAI's GPT-3
Weights & Biases via YouTube