MSRL - Distributed Reinforcement Learning with Dataflow Fragments
Offered By: USENIX via YouTube
Course Description
Overview
This conference talk from USENIX ATC '23 introduces MSRL, a distributed reinforcement learning (RL) training system. MSRL uses fragmented dataflow graphs (FDGs) to execute RL algorithms flexibly on GPU clusters. The talk outlines the challenges in current RL systems and shows how MSRL addresses them by decoupling the definition of an algorithm from its distributed execution strategy. An FDG splits an RL algorithm into fragments, and each fragment can run on a different device using a different low-level dataflow implementation, which lets the same abstraction cover diverse RL algorithms. A separate distribution policy maps fragments to devices, so the mapping can change without altering the RL algorithm implementation. The talk closes with experimental results showing that MSRL exposes trade-offs between execution strategies while outperforming existing RL systems that use fixed strategies.
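The separation between algorithm definition and distribution policy described above can be sketched in a few lines of Python. This is an illustrative toy only, not MSRL's API: the `Fragment` class, the `collect` and `learn` functions, the policy dictionaries, and the device labels are all hypothetical names chosen for this example, and the "devices" are just strings that get recorded rather than real hardware.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Fragment:
    """Hypothetical unit of an RL training loop (not MSRL's real API)."""
    name: str
    run: Callable[[dict], dict]


def collect(state: dict) -> dict:
    # Stand-in for an actor gathering experience from environments.
    return {**state, "trajectories": [1.0, 2.0, 3.0]}


def learn(state: dict) -> dict:
    # Stand-in for a learner computing an update from that experience.
    data = state["trajectories"]
    return {**state, "loss": sum(data) / len(data)}


# Algorithm definition: an ordered list of fragments with no device information.
ALGORITHM: List[Fragment] = [Fragment("actor", collect), Fragment("learner", learn)]

# Two hypothetical distribution policies: fragment name -> device label.
SINGLE_GPU: Dict[str, str] = {"actor": "gpu:0", "learner": "gpu:0"}
ACTORS_ON_CPU: Dict[str, str] = {"actor": "cpu:0", "learner": "gpu:0"}


def execute(algorithm: List[Fragment], policy: Dict[str, str]) -> dict:
    """Run each fragment in order and record where the policy placed it.

    Placement is only recorded here; a real system would dispatch the
    fragment to the chosen device.
    """
    state: dict = {"placement": []}
    for fragment in algorithm:
        device = policy[fragment.name]
        state = fragment.run(state)
        state["placement"] = state["placement"] + [(fragment.name, device)]
    return state


if __name__ == "__main__":
    for policy in (SINGLE_GPU, ACTORS_ON_CPU):
        result = execute(ALGORITHM, policy)
        print(result["placement"], result["loss"])
```

Swapping `SINGLE_GPU` for `ACTORS_ON_CPU` changes only the placement, while `ALGORITHM` and its computed result stay the same, which is the property the overview attributes to MSRL's distribution policy.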
Syllabus
USENIX ATC '23 - MSRL: Distributed Reinforcement Learning with Dataflow Fragments
Taught by
USENIX
Related Courses
Intro to Parallel Programming - Nvidia via Udacity
Introduction to Linear Models and Matrix Algebra - Harvard University via edX
Introduction to Parallel Programming Using OpenMP and MPI - Tomsk State University via Coursera
Supercomputing - Partnership for Advanced Computing in Europe via FutureLearn
Fundamentals of Parallelism on Intel Architecture - Intel via Coursera