Scaling LLM Test-Time Compute Optimally for Improved Performance

Offered By: Yannic Kilcher via YouTube

Tags

Scaling Laws Courses Prompt Engineering Courses

Course Description

Overview

Explore a comprehensive analysis of scaling inference-time computation in Large Language Models (LLMs) through this in-depth video presentation. Delve into the research paper that investigates how LLMs can improve their performance by utilizing additional test-time computation. Examine two primary mechanisms for scaling test-time computation: searching against dense, process-based verifier reward models and adaptively updating the model's distribution over a response.

Discover how the effectiveness of different approaches varies depending on prompt difficulty, leading to the development of a "compute-optimal" scaling strategy. Learn how this strategy can improve test-time compute efficiency by more than 4x compared to a best-of-N baseline. Gain insights into the implications of these findings for LLM pretraining and the trade-offs between inference-time and pre-training compute, and understand how, in certain scenarios, test-time compute can be leveraged to outperform significantly larger models in a FLOPs-matched evaluation.
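To make the best-of-N baseline mentioned above concrete, here is a minimal sketch: sample N candidate responses and return the one a verifier reward model scores highest. The `generate` and `verifier_score` callables are hypothetical stand-ins for illustration, not the paper's actual implementation.

```python
def best_of_n(prompt, generate, verifier_score, n=4):
    """Best-of-N baseline (sketch): sample n candidates for the prompt,
    score each with a verifier reward model, keep the top-scoring one."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=verifier_score)


if __name__ == "__main__":
    # Toy usage with a deterministic "generator" and a length-based
    # "verifier" standing in for a real LLM and reward model.
    samples = iter(["short", "a much longer answer", "mid one"])
    pick = best_of_n("q", lambda p: next(samples), verifier_score=len, n=3)
    print(pick)  # the candidate with the highest verifier score
```

The paper's compute-optimal strategy goes beyond this baseline by adapting how the sampling budget is spent per prompt (e.g., search versus sequential revision) based on estimated difficulty.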

Syllabus

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters (Paper)


Taught by

Yannic Kilcher

Related Courses

Discover, Validate & Launch New Business Ideas with ChatGPT
Udemy
150 Digital Marketing Growth Hacks for Businesses
Udemy
AI: Executive Briefing
Pluralsight
The Complete Digital Marketing Guide - 25 Courses in 1
Udemy
Learn to build a voice assistant with Alexa
Udemy