OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Models

Offered By: USENIX via YouTube

Tags

Distributed Computing Courses

Course Description

Overview

Explore an innovative approach to parallelizing embedding tables (EMTs) for large-scale recommendation models in this 20-minute conference talk from USENIX ATC '24. Dive into OPER, an algorithm-system co-design that addresses the challenges of deploying Deep Learning Recommendation Models (DLRMs) across multiple GPUs. Learn how OPER's optimality-guided EMT parallelization technique improves upon existing methods by accounting for input-dependent behavior, yielding a more balanced workload distribution and reduced inter-GPU communication. Discover the heuristic search algorithm used to approximate near-optimal EMT parallelization and the implementation of a distributed shared-memory-based system that supports fine-grained EMT parallelization. Gain insights into the significant performance improvements achieved by OPER, with reported average speedups of 2.3× in training and 4.0× in inference compared to state-of-the-art DLRM frameworks.
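To make the idea of workload-balanced EMT placement concrete, here is a minimal greedy sketch in the spirit of the heuristic search described above. This is not OPER's actual algorithm; the per-table cost estimates, table names, and GPU count are all hypothetical, and the sketch uses a simple longest-processing-time assignment to the least-loaded GPU:

```python
# Illustrative sketch: greedy workload-balanced placement of embedding
# tables (EMTs) across GPUs. NOT OPER's actual algorithm -- a minimal
# longest-processing-time heuristic under assumed per-table costs.
import heapq

def greedy_emt_placement(table_costs, num_gpus):
    """Assign each embedding table to the currently least-loaded GPU.

    table_costs: dict mapping table name -> estimated (input-dependent)
                 lookup/communication cost for that table.
    Returns: dict mapping GPU index -> list of assigned table names.
    """
    # Min-heap of (current load, gpu index) so the lightest GPU pops first.
    heap = [(0.0, g) for g in range(num_gpus)]
    heapq.heapify(heap)
    placement = {g: [] for g in range(num_gpus)}
    # Place the heaviest tables first (longest-processing-time order).
    for name, cost in sorted(table_costs.items(), key=lambda kv: -kv[1]):
        load, gpu = heapq.heappop(heap)
        placement[gpu].append(name)
        heapq.heappush(heap, (load + cost, gpu))
    return placement

# Hypothetical per-table cost estimates for a 2-GPU placement.
costs = {"user_id": 8.0, "item_id": 7.0, "category": 3.0, "region": 2.0}
print(greedy_emt_placement(costs, 2))
```

A real system like the one in the talk would additionally model input-dependent access patterns and inter-GPU communication when scoring candidate placements, rather than a single static cost per table.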

Syllabus

USENIX ATC '24 - OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Models


Taught by

USENIX

Related Courses

Cloud Computing Concepts, Part 1
University of Illinois at Urbana-Champaign via Coursera
Cloud Computing Concepts: Part 2
University of Illinois at Urbana-Champaign via Coursera
Reliable Distributed Algorithms - Part 1
KTH Royal Institute of Technology via edX
Introduction to Apache Spark and AWS
University of London International Programmes via Coursera
Run Distributed Computations on Massive Data (Réalisez des calculs distribués sur des données massives)
CentraleSupélec via OpenClassrooms