
Enhancing Reasoning of Large Language Models through Reward-Guided Search and Self-Training

Offered By: Association for Computing Machinery (ACM) via YouTube

Tags

Reasoning Courses

Course Description

Overview

Explore cutting-edge techniques for enhancing the reasoning capabilities of Large Language Models (LLMs) in this keynote presentation from the Large Language Model Day at KDD2024. Delve into innovative approaches that leverage inference-time compute for self-improvement, including thought space search and self-training methodologies. Discover the development of generalizable, fine-grained reward models using tree search to automatically collect per-step values of reasoning trace correctness. Learn about ReST-MCTS, a process reward-guided tree search algorithm that enables continuous training of policy and reward models without manual annotation. Examine the application of these techniques in strategic game-playing, vision-language modeling, and 3D scene generation. Gain insights into how these advancements contribute to improving the capabilities of state-of-the-art language models like grok-2. Explore future directions for scaling up self-training and applying online reinforcement learning to unlock even greater intrinsic improvements in LLM capabilities.
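
The per-step value collection described above can be illustrated with a minimal sketch. The Python snippet below is an illustrative assumption, not the ReST-MCTS implementation presented in the keynote: propose_steps, step_reward, and is_terminal are hypothetical stand-ins for the policy model, the process reward model, and a stopping criterion, and the search is a simple greedy reward-guided expansion rather than a full Monte Carlo tree search.

# Minimal sketch (assumed, not the actual ReST-MCTS code):
# a placeholder policy proposes candidate reasoning steps, a placeholder
# process reward model scores each partial trace, and the per-step
# values are collected as self-training signal without manual annotation.

import random


def propose_steps(trace, k=3):
    """Stand-in for the policy model: propose k candidate next steps."""
    return [f"step_{len(trace)}_{i}" for i in range(k)]


def step_reward(trace):
    """Stand-in for the process reward model: score a partial trace."""
    return random.random()


def is_terminal(trace, max_depth=4):
    """Stand-in stopping criterion: end after a fixed number of steps."""
    return len(trace) >= max_depth


def reward_guided_search(max_depth=4):
    """Greedily extend the trace with the highest-scoring candidate step,
    recording (trace_prefix, value) pairs as per-step training data."""
    trace, per_step_values = [], []
    while not is_terminal(trace, max_depth):
        candidates = [trace + [s] for s in propose_steps(trace)]
        scored = [(step_reward(c), c) for c in candidates]
        best_value, best_trace = max(scored, key=lambda x: x[0])
        trace = best_trace
        per_step_values.append((list(trace), best_value))
    return trace, per_step_values


if __name__ == "__main__":
    final_trace, values = reward_guided_search()
    for prefix, value in values:
        print(f"value={value:.3f}  trace={prefix}")

In a self-training loop of the kind the talk describes, the collected (trace prefix, value) pairs would serve as automatically labeled data for refining the process reward model, which in turn guides the next round of search, without requiring human step-level annotation.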

Syllabus

KDD2024 - Enhancing Reasoning of Large Language Models through Reward-Guided Search and Self-Training


Taught by

Association for Computing Machinery (ACM)

Related Courses

Introduction to Logic
Stanford University via Coursera
Think Again: How to Reason and Argue
Duke University via Coursera
Public Speaking
University of Washington via edX
Artificial Intelligence: Knowledge Representation And Reasoning
Indian Institute of Technology Madras via Swayam
AP® Psychology - Course 3: How the Mind Works
The University of British Columbia via edX