Typical Decoding for Natural Language Generation - Get More Human-Like Outputs From Language Models
Offered By: Yannic Kilcher via YouTube
Course Description
Overview
Explore the concept of typical decoding for natural language generation in this 49-minute video lecture. Learn about the challenges of generating human-like text from language models and discover a new decoding method called typical sampling. Understand the trade-off between high-probability and high-information samples, and how this approach connects to psycholinguistic theories of human speech generation. Examine the limitations of current sampling methods like top-k and nucleus sampling, and see how typical sampling offers a more principled and effective alternative. Follow along as the video breaks down the paper's key ideas, experimental results, and potential implications for improving text generation from AI language models.
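The video itself contains no code, but the idea sketched in the overview can be illustrated with a short example. The NumPy snippet below is a hedged sketch of locally typical sampling as the overview describes it, not the paper's or the video's reference implementation: the function name `typical_sampling_step`, the threshold `tau`, and the toy distribution are illustrative assumptions. It ranks candidate tokens by how far their information content deviates from the expected information content of the next token (the entropy of the model's distribution), keeps the smallest set of tokens whose cumulative probability reaches `tau`, and samples from that set.

```python
import numpy as np

def typical_sampling_step(probs, tau=0.95, rng=None):
    """One decoding step of (locally) typical sampling, sketched with NumPy.

    probs: 1-D array of next-token probabilities from the language model.
    tau:   cumulative probability mass kept in the "typical" set (assumed value).
    """
    rng = rng or np.random.default_rng()
    probs = np.asarray(probs, dtype=np.float64)
    probs = probs / probs.sum()

    # Information content of each candidate token: I(x) = -log p(x).
    eps = 1e-12
    info = -np.log(probs + eps)

    # Expected information content of the next token = entropy of the distribution.
    entropy = np.sum(probs * info)

    # Rank tokens by how close their information content is to that expectation.
    deviation = np.abs(info - entropy)
    order = np.argsort(deviation)

    # Keep the smallest prefix of that ranking whose probability mass reaches tau.
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, tau) + 1
    kept = order[:cutoff]

    # Renormalize over the typical set and sample a token index from it.
    kept_probs = probs[kept] / probs[kept].sum()
    return int(rng.choice(kept, p=kept_probs))

# Example: a toy next-token distribution over a 6-word vocabulary.
p = [0.40, 0.25, 0.15, 0.10, 0.06, 0.04]
print(typical_sampling_step(p, tau=0.9))
```

Smaller values of `tau` keep only tokens whose surprise is closest to the expected level, trading raw likelihood for more human-like information content; larger values behave more like unconstrained sampling. This contrasts with top-k and nucleus sampling, which rank tokens by probability alone.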
Syllabus
- Intro
- Sponsor: Fully Connected by Weights & Biases
- Paper Overview
- What's the problem with sampling?
- Beam Search: The good and the bad
- Top-k and Nucleus Sampling
- Why the most likely things might not be the best
- The expected information content of the next word
- How to trade off information and likelihood
- Connections to information theory and psycholinguistics
- Introducing Typical Sampling
- Experimental Evaluation
- My thoughts on this paper
Taught by
Yannic Kilcher
Related Courses
- Neural Networks for Machine Learning (University of Toronto via Coursera)
- 機器學習技法 (Machine Learning Techniques) (National Taiwan University via Coursera)
- Machine Learning Capstone: An Intelligent Application with Deep Learning (University of Washington via Coursera)
- Прикладные задачи анализа данных (Applied Problems of Data Analysis) (Moscow Institute of Physics and Technology via Coursera)
- Leading Ambitious Teaching and Learning (Microsoft via edX)