Scaling LLM Test-Time Compute Optimally for Improved Performance
Offered By: Yannic Kilcher via YouTube
Course Description
Overview
Explore a comprehensive analysis of scaling inference-time computation in Large Language Models (LLMs) through this in-depth video presentation. Delve into the research paper investigating how LLMs can improve their performance by using additional test-time computation. Examine two primary mechanisms for scaling test-time compute: searching against dense, process-based verifier reward models, and adaptively updating the model's distribution over a response given the prompt. Discover how the effectiveness of these approaches varies with prompt difficulty, motivating a "compute-optimal" strategy that allocates test-time compute adaptively per prompt. Learn how this strategy can improve test-time compute efficiency by more than 4x compared to a best-of-N baseline. Gain insights into the implications of these findings for LLM pretraining and the trade-offs between inference-time and pretraining compute, and understand how, in certain scenarios, test-time compute can be leveraged to outperform significantly larger models in a FLOPs-matched evaluation.
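To make the best-of-N baseline concrete, here is a minimal Python sketch: sample N candidate responses and keep the one a verifier reward model scores highest. The `generate` and `score` callables are hypothetical stand-ins for an LLM sampler and a trained verifier, not code from the paper or the video.

```python
import random
from typing import Callable, List

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 8) -> str:
    """Sample n candidate responses, then return the one the verifier scores highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))

# Toy stand-ins so the sketch runs without a real model or verifier.
def toy_generate(prompt: str) -> str:
    # A real system would sample a completion from an LLM here.
    return f"candidate answer #{random.randint(0, 999)}"

def toy_score(prompt: str, response: str) -> float:
    # A real process-based verifier would rate the response's reasoning steps;
    # a random score just exercises the selection logic.
    return random.random()

if __name__ == "__main__":
    print(best_of_n("What is 17 * 24?", toy_generate, toy_score, n=4))
```

The compute-optimal strategy discussed in the video goes further than this fixed-N baseline: it chooses how much test-time compute to spend, and which mechanism (verifier-guided search versus adaptive response revision) to use, based on the estimated difficulty of each prompt.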
Syllabus
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper)
Taught by
Yannic Kilcher
Related Courses
Introduction To Mechanical Micro Machining (Indian Institute of Technology, Kharagpur via Swayam)
Biomaterials - Intro to Biomedical Engineering (Udemy)
OpenAI Whisper - Robust Speech Recognition via Large-Scale Weak Supervision (Aleksa Gordić - The AI Epiphany via YouTube)
Turbulence as Gibbs Statistics of Vortex Sheets - Alexander Migdal (Institute for Advanced Study via YouTube)
City Analytics - Professor Peter Grindrod CBE (Alan Turing Institute via YouTube)