Swing - Short-cutting Rings for Higher Bandwidth Allreduce
Offered By: USENIX via YouTube
Course Description
Overview
Explore a 19-minute conference talk from NSDI '24 that introduces Swing, an algorithm designed to improve allreduce performance on torus networks. Learn how Swing reduces the number of hops between communicating nodes by swinging between torus directions, outperforming existing allreduce algorithms by up to 3x. See how the algorithm performs across a range of vector sizes and on torus-like topologies of different shapes and sizes. Gain insight into why allreduce operations matter in distributed systems and how they affect workload runtime, particularly on machine learning-optimized systems such as Google TPUs and Amazon Trainium devices, as well as on Top500 supercomputers. Understand the challenges that torus networks pose for allreduce and how Swing addresses them to achieve higher bandwidth.
Syllabus
NSDI '24 - Swing: Short-cutting Rings for Higher Bandwidth Allreduce
Taught by
USENIX
Related Courses
High Performance Computing (Georgia Institute of Technology via Udacity)
Introduction to Parallel Programming Using OpenMP and MPI (Tomsk State University via Coursera)
High Performance Computing in the Cloud (Dublin City University via FutureLearn)
Production Machine Learning Systems (Google Cloud via Coursera)
LAFF-On Programming for High Performance (The University of Texas at Austin via edX)