Reinforced Self-Training for Language Modeling - Paper Explained
Offered By: Yannic Kilcher via YouTube
Course Description
Overview
Explore a video explanation of the Reinforced Self-Training (ReST) method for language modeling. See how ReST uses a bootstrap-like approach to generate its own extended dataset and then trains on increasingly high-quality subsets of it to raise the reward its outputs achieve. Understand why ReST is more efficient than online reinforcement learning techniques such as PPO: the data is generated offline and can be reused multiple times. Examine the paper's abstract, which describes ReST's application to machine translation, where it can substantially improve translation quality. Learn about the paper's authors and their findings on ReST's compute and sample efficiency in aligning large language models with human preferences.
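The generate-then-filter loop described above can be illustrated with a small toy sketch. This is not the paper's implementation: the "language model" is assumed to be a categorical distribution over the integers 0 to 9, the reward function `toy_reward` is made up for illustration, and fine-tuning is replaced by a simple mixture update toward the filtered samples. All function names and parameters here are hypothetical.

```python
import random


def expected_reward(policy, reward):
    """Average reward of a categorical policy {output: probability}."""
    return sum(p * reward(y) for y, p in policy.items())


def grow(policy, n_samples, rng):
    """Grow step: sample a dataset from the current policy."""
    outputs = list(policy)
    weights = [policy[y] for y in outputs]
    return rng.choices(outputs, weights=weights, k=n_samples)


def improve(policy, dataset, reward, thresholds, mix=0.5):
    """Improve step: reuse the same dataset with rising reward thresholds."""
    for tau in thresholds:
        kept = [y for y in dataset if reward(y) >= tau]
        if not kept:
            break  # nothing passes the filter at this threshold
        counts = {y: 0 for y in policy}
        for y in kept:
            counts[y] += 1
        # Stand-in for fine-tuning (an assumption of this sketch):
        # move the policy toward the empirical distribution of kept samples.
        policy = {
            y: (1 - mix) * policy[y] + mix * counts[y] / len(kept)
            for y in policy
        }
    return policy


def rest(policy, reward, grow_steps=3, n_samples=200,
         thresholds=(0.3, 0.6, 0.9), seed=0):
    """Alternate Grow and Improve steps for a fixed number of rounds."""
    rng = random.Random(seed)
    for _ in range(grow_steps):
        dataset = grow(policy, n_samples, rng)
        policy = improve(policy, dataset, reward, thresholds)
    return policy


def toy_reward(y):
    """Hypothetical reward: larger outputs are better, scaled to [0, 1]."""
    return y / 9


uniform = {y: 0.1 for y in range(10)}
trained = rest(uniform, toy_reward)
print(expected_reward(uniform, toy_reward), expected_reward(trained, toy_reward))
```

Each sampled dataset is filtered several times with a higher threshold before new samples are drawn, which is the data reuse that distinguishes this style of training from an online method that needs fresh samples for every update.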
Syllabus
Reinforced Self-Training (ReST) for Language Modeling (Paper Explained)
Taught by
Yannic Kilcher
Related Courses
Computational Neuroscience (University of Washington via Coursera)
Reinforcement Learning (Brown University via Udacity)
Reinforcement Learning (Indian Institute of Technology Madras via Swayam)
FA17: Machine Learning (Georgia Institute of Technology via edX)
Introduction to Reinforcement Learning (Higher School of Economics via Coursera)