Author Interview - Typical Decoding for Natural Language Generation
Offered By: Yannic Kilcher via YouTube
Course Description
Overview
Explore an in-depth interview with Clara Meister, first author of a paper introducing "typical sampling," a new decoding method for natural language generation. Learn about the challenges of generating interesting text from modern language models and how typical sampling offers a principled solution. Discover the connections between this approach and psycholinguistic theories of human speech generation. Gain insight into why high-probability text can often seem dull, and how typical sampling aims to balance generating high-probability and high-information samples. Examine experimental results comparing typical sampling to other methods such as top-k and nucleus sampling. Delve into discussions of training objectives, arbitrary engineering choices, and how to get started implementing this technique.
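The selection rule the interview centers on can be stated compactly: at each decoding step, keep the tokens whose information content (surprisal) lies closest to the conditional entropy of the model's next-token distribution, rather than the most probable tokens, then sample from that set. Below is a minimal numpy sketch of one such step; the function name `typical_sampling_step` and the mass parameter `tau` are illustrative choices for this listing, not the paper's reference implementation.

```python
import numpy as np

def typical_sampling_step(probs, tau=0.95, rng=None):
    """Sample one token with locally typical sampling (a sketch).

    probs: 1-D array of next-token probabilities from the model.
    tau:   cumulative probability mass to keep (plays a role
           analogous to p in nucleus sampling).
    """
    if rng is None:
        rng = np.random.default_rng()
    logp = np.log(probs + 1e-12)

    # Conditional entropy: the expected information content of the
    # next token under the model's distribution.
    entropy = -np.sum(probs * logp)

    # Rank tokens by how far their surprisal (-log p) deviates from
    # that expectation; "typical" tokens deviate least.
    order = np.argsort(np.abs(-logp - entropy))

    # Keep the smallest prefix of the ranking whose mass reaches tau.
    cutoff = np.searchsorted(np.cumsum(probs[order]), tau) + 1
    kept = order[:cutoff]

    # Renormalize over the kept tokens and sample one of them.
    return rng.choice(kept, p=probs[kept] / probs[kept].sum())
```

Note that, unlike probability-ranked truncation, this rule can exclude even the single most probable token when its surprisal falls far below the entropy, which connects to the interview's discussion of why maximally probable text often reads as dull.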
Syllabus
- Intro
- Sponsor: Introduction to GNNs Course (link in description)
- Why does sampling matter?
- What is a "typical" message?
- How do humans communicate?
- Why don't we just sample from the model's distribution?
- What happens if we condition on the information to transmit?
- Does typical sampling really represent human outputs?
- What do the plots mean?
- Diving into the experimental results
- Are our training objectives wrong?
- Comparing typical sampling to top-k and nucleus sampling (see the sketch after this syllabus)
- Explaining arbitrary engineering choices
- How can people get started with this?
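For the syllabus segment comparing typical sampling to top-k and nucleus sampling, here is a hedged sketch of those two baselines under the same `probs` convention as above; the defaults `k=50` and `p=0.95` are illustrative, not values from the paper.

```python
import numpy as np

def top_k_step(probs, k=50, rng=None):
    """Keep the k most probable tokens, renormalize, and sample."""
    if rng is None:
        rng = np.random.default_rng()
    kept = np.argsort(probs)[-k:]
    return rng.choice(kept, p=probs[kept] / probs[kept].sum())

def nucleus_step(probs, p=0.95, rng=None):
    """Keep the smallest most-probable set with mass >= p, then sample."""
    if rng is None:
        rng = np.random.default_rng()
    order = np.argsort(probs)[::-1]  # most probable first
    cutoff = np.searchsorted(np.cumsum(probs[order]), p) + 1
    kept = order[:cutoff]
    return rng.choice(kept, p=probs[kept] / probs[kept].sum())
```

Both baselines rank candidates purely by probability, whereas typical sampling ranks by closeness to the entropy, so the two families can keep quite different token sets on low-entropy steps.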
Taught by
Yannic Kilcher
Related Courses
Building Language Models on AWS (Japanese)
Amazon Web Services via AWS Skill Builder
Building Language Models on AWS (Korean)
Amazon Web Services via AWS Skill Builder
Building Language Models on AWS (Simplified Chinese)
Amazon Web Services via AWS Skill Builder
Building Language Models on AWS (Traditional Chinese)
Amazon Web Services via AWS Skill Builder
Introduction to ChatGPT
edX