The Pitfalls of Next-token Prediction in Language Models
Offered By: Simons Institute via YouTube
Course Description
Overview
Explore a thought-provoking lecture on the limitations of next-token prediction as a model of human intelligence. Examine the critical distinction between autoregressive inference and teacher-forced training in language models. Discover why the popular criticism, that errors compound during autoregressive inference, may overlook a more fundamental issue: teacher-forcing can fail to learn an accurate next-token predictor on certain classes of tasks in the first place. Investigate a general mechanism behind this failure and review empirical evidence from a minimal planning task on which both the Transformer and Mamba architectures struggle. Consider training models to predict multiple tokens in advance as a possible remedy. Gain insights that can inform future debates and inspire research beyond the current next-token prediction paradigm in artificial intelligence.
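To make the distinction the lecture draws concrete, here is a minimal PyTorch sketch (not from the lecture itself): it contrasts teacher-forced training, autoregressive decoding, and a multi-token-prediction loss. The tiny models `TinyCausalLM` and `TinyMultiHeadLM` and the horizon `k=4` are placeholder assumptions for illustration, not the architectures or settings studied in the talk.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCausalLM(nn.Module):
    """Stand-in causal model (hypothetical, not the lecture's setup):
    logits at each position depend only on tokens at or before it."""
    def __init__(self, vocab_size: int = 32, dim: int = 16):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, T) -> logits: (batch, T, vocab)
        return self.out(self.emb(tokens))

def teacher_forced_loss(model: nn.Module, tokens: torch.Tensor) -> torch.Tensor:
    # Training: the model always conditions on the GROUND-TRUTH prefix
    # and is graded one next token at a time.
    logits = model(tokens[:, :-1])                 # predict positions 1..T-1
    targets = tokens[:, 1:]                        # shifted by one
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1))

@torch.no_grad()
def autoregressive_generate(model: nn.Module, prefix: torch.Tensor,
                            n_steps: int) -> torch.Tensor:
    # Inference: the model conditions on its OWN earlier outputs, so an
    # early mistake is fed back in (the familiar "error compounding"
    # criticism). The lecture's point is that the mismatch can run
    # deeper: on planning-style tasks, teacher-forced training may never
    # learn the right predictor for the hard early tokens at all.
    tokens = prefix
    for _ in range(n_steps):
        next_logits = model(tokens)[:, -1, :]
        next_token = next_logits.argmax(dim=-1, keepdim=True)  # greedy
        tokens = torch.cat([tokens, next_token], dim=-1)
    return tokens

class TinyMultiHeadLM(nn.Module):
    """Hypothetical k-head variant for multi-token prediction."""
    def __init__(self, vocab_size: int = 32, dim: int = 16, k: int = 4):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.heads = nn.ModuleList(nn.Linear(dim, vocab_size) for _ in range(k))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        h = self.emb(tokens)
        # (batch, T, k, vocab): head i predicts the token i+1 steps ahead
        return torch.stack([head(h) for head in self.heads], dim=2)

def multi_token_loss(model_k: nn.Module, tokens: torch.Tensor,
                     k: int = 4) -> torch.Tensor:
    # The remedy the lecture considers, sketched loosely: from each
    # prefix, predict the next k tokens jointly, so the model cannot
    # lean on ground-truth hints for the hard early tokens.
    T = tokens.size(1)
    logits = model_k(tokens[:, :-k])               # (batch, T-k, k, vocab)
    targets = torch.stack(
        [tokens[:, i + 1 : T - k + i + 1] for i in range(k)], dim=2)
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1))

# Toy run on random tokens: checks shapes only, no claim about the
# lecture's actual experiments.
batch = torch.randint(0, 32, (2, 12))
print(teacher_forced_loss(TinyCausalLM(), batch))                      # scalar loss
print(autoregressive_generate(TinyCausalLM(), batch[:, :4], 8).shape)  # (2, 12)
print(multi_token_loss(TinyMultiHeadLM(), batch))                      # scalar loss
```

The asymmetry the sketch makes visible: `teacher_forced_loss` always conditions on the ground-truth prefix, while `autoregressive_generate` feeds the model's own outputs back in. The lecture argues the trouble can already arise in the first function, before any compounding happens in the second, which is what motivates the multi-token variant.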
Syllabus
The Pitfalls of Next-token Prediction
Taught by
Simons Institute
Related Courses
Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent