Author Interview - Typical Decoding for Natural Language Generation
Offered By: Yannic Kilcher via YouTube
Course Description
Overview
Explore an in-depth interview with Clara Meister, the first author of a paper introducing "typical sampling" - a new decoding method for natural language generation. Learn about the challenges of generating interesting text from modern language models and how typical sampling offers a principled solution. Discover the connections between this approach and psycholinguistic theories of human speech generation. Gain insights into why high-probability text can often seem dull, and how typical sampling aims to balance generating high-probability and high-information samples. Examine experimental results comparing typical sampling to other methods like top-k and nucleus sampling. Delve into discussions on training objectives, arbitrary engineering choices, and how to get started implementing this technique.
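To make the "balance high-probability and high-information samples" idea concrete, here is a minimal NumPy sketch of the selection rule the interview discusses: keep the tokens whose surprisal is closest to the entropy of the next-token distribution until a target probability mass is covered, then renormalize and sample. The function name, the `mass` parameter, and its 0.9 default are illustrative choices for this sketch, not values taken from the paper.

```python
import numpy as np

def typical_sampling(probs, mass=0.9, rng=None):
    """Sample one token index via (locally) typical sampling.

    Keeps the tokens whose surprisal (-log p) lies closest to the
    distribution's entropy until `mass` probability is covered, then
    renormalizes over that set and samples. `mass` plays the role that
    p plays in nucleus sampling; 0.9 is an illustrative default.
    """
    rng = rng or np.random.default_rng()
    probs = np.asarray(probs, dtype=np.float64)
    logp = np.log(probs + 1e-12)               # guard against log(0)
    entropy = -(probs * logp).sum()            # H(p)
    dist = np.abs(-logp - entropy)             # surprisal's gap from H(p)
    order = np.argsort(dist)                   # most "typical" tokens first
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, mass) + 1    # smallest set covering `mass`
    keep = order[:cutoff]
    kept = probs[keep] / probs[keep].sum()     # renormalize the kept set
    return rng.choice(keep, p=kept)

# Example: sample from a toy 4-token distribution.
print(typical_sampling([0.5, 0.3, 0.1, 0.1], mass=0.9))
```

Unlike top-k (which keeps the k most probable tokens) or nucleus sampling (which keeps the most probable tokens up to a mass), this rule can drop the single highest-probability token when its surprisal sits far below the entropy, which is the paper's explanation for why greedy, high-probability text often reads as dull.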
Syllabus
- Intro
- Sponsor: Introduction to GNNs Course (link in description)
- Why does sampling matter?
- What is a "typical" message?
- How do humans communicate?
- Why don't we just sample from the model's distribution?
- What happens if we condition on the information to transmit?
- Does typical sampling really represent human outputs?
- What do the plots mean?
- Diving into the experimental results
- Are our training objectives wrong?
- Comparing typical sampling to top-k and nucleus sampling
- Explaining arbitrary engineering choices
- How can people get started with this?
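Picking up the last syllabus item, one practical starting point is the Hugging Face `transformers` library, whose recent releases expose a `typical_p` argument on `generate()`; the model choice and parameter values below are illustrative, and if your version predates the argument, the NumPy sketch above can be adapted instead.

```python
# A hedged getting-started sketch using Hugging Face transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The interview covered", return_tensors="pt")
out = model.generate(
    **inputs,
    do_sample=True,       # typical sampling is a sampling-based decoder
    typical_p=0.95,       # probability mass kept around the entropy
    max_new_tokens=40,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```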
Taught by
Yannic Kilcher
Related Courses
- Neural Networks for Machine Learning - University of Toronto via Coursera
- 機器學習技法 (Machine Learning Techniques) - National Taiwan University via Coursera
- Machine Learning Capstone: An Intelligent Application with Deep Learning - University of Washington via Coursera
- Прикладные задачи анализа данных (Applied Problems in Data Analysis) - Moscow Institute of Physics and Technology via Coursera
- Leading Ambitious Teaching and Learning - Microsoft via edX