OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Models

Offered By: USENIX via YouTube

Tags

Distributed Computing Courses

Course Description

Overview

Explore an innovative approach to parallelizing embedding tables (EMTs) for large-scale recommendation models in this 20-minute conference talk from USENIX ATC '24. Dive into OPER, an algorithm-system co-design that addresses the challenges of deploying Deep Learning Recommendation Models (DLRMs) across multiple GPUs. Learn how OPER's optimality-guided EMT parallelization technique improves upon existing methods by accounting for input-dependent behavior, yielding a more balanced workload distribution and reduced inter-GPU communication. Discover the heuristic search algorithm used to approximate near-optimal EMT parallelization and the implementation of a distributed shared-memory-based system that supports fine-grained EMT parallelization. Gain insights into the significant performance improvements achieved by OPER, with reported average speedups of 2.3× in training and 4.0× in inference compared to state-of-the-art DLRM frameworks.
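To make the idea of workload-balanced EMT placement concrete, here is a minimal greedy sketch in the spirit of the heuristic search described above. This is not OPER's actual algorithm; the per-table cost estimates, table names, and GPU count are all hypothetical, and the sketch uses a simple longest-processing-time assignment to the least-loaded GPU:

```python
# Illustrative sketch: greedy workload-balanced placement of embedding
# tables (EMTs) across GPUs. NOT OPER's actual algorithm -- a minimal
# longest-processing-time heuristic under assumed per-table costs.
import heapq

def greedy_emt_placement(table_costs, num_gpus):
    """Assign each embedding table to the currently least-loaded GPU.

    table_costs: dict mapping table name -> estimated (input-dependent)
                 lookup/communication cost for that table.
    Returns: dict mapping GPU index -> list of assigned table names.
    """
    # Min-heap of (current load, gpu index) so the lightest GPU pops first.
    heap = [(0.0, g) for g in range(num_gpus)]
    heapq.heapify(heap)
    placement = {g: [] for g in range(num_gpus)}
    # Place the heaviest tables first (longest-processing-time order).
    for name, cost in sorted(table_costs.items(), key=lambda kv: -kv[1]):
        load, gpu = heapq.heappop(heap)
        placement[gpu].append(name)
        heapq.heappush(heap, (load + cost, gpu))
    return placement

# Hypothetical per-table cost estimates for a 2-GPU placement.
costs = {"user_id": 8.0, "item_id": 7.0, "category": 3.0, "region": 2.0}
print(greedy_emt_placement(costs, 2))
```

A real system like the one in the talk would additionally model input-dependent access patterns and inter-GPU communication when scoring candidate placements, rather than a single static cost per table.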

Syllabus

USENIX ATC '24 - OPER: Optimality-Guided Embedding Table Parallelization for Large-scale Recommendation Models


Taught by

USENIX

Related Courses

Cloud Computing Concepts, Part 1
University of Illinois at Urbana-Champaign via Coursera
Cloud Computing Concepts: Part 2
University of Illinois at Urbana-Champaign via Coursera
Reliable Distributed Algorithms - Part 1
KTH Royal Institute of Technology via edX
Introduction to Apache Spark and AWS
University of London International Programmes via Coursera
Run Distributed Computations on Massive Data (Réalisez des calculs distribués sur des données massives)
CentraleSupélec via OpenClassrooms