
Fast RDMA-based Ordered Key-Value Store Using Remote Learned Cache

Offered By: USENIX via YouTube

Tags

OSDI (Operating Systems Design and Implementation) Courses
Machine Learning Courses
Distributed Systems Courses

Course Description

Overview

Explore an RDMA-based ordered key-value store that uses a remote learned cache in this USENIX OSDI '20 conference talk. Dive into the design and implementation of XSTORE, which combines a tree-based index at the server for dynamic workloads with a learned cache at the client for static workloads. Learn how this hybrid architecture uses machine learning models as the cache structure for the tree-based index, decoupling model retraining from index updating. Discover how XSTORE outperforms state-of-the-art RDMA-based ordered key-value stores by up to 5.9 times. Understand the challenges of implementing learned caches and the solutions employed, including handling stale models and ensuring correctness through validation mechanisms. Gain insights into the memory-performance tradeoffs and the potential for significant client-side memory savings.
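The client-side fast path and the validation fallback described above can be sketched roughly as follows. This is a minimal illustration, not XSTORE's implementation: the names `Server`, `LearnedCache`, and `client_get` are hypothetical, a single least-squares line stands in for whatever models the real system trains, a sorted Python list stands in for the server's tree-based index, and plain list slices stand in for one-sided RDMA reads.

```python
import bisect


class Server:
    """Stand-in for the server-side ordered index (a sorted array, not a real tree)."""

    def __init__(self, items):
        pairs = sorted(items)
        self.keys = [k for k, _ in pairs]
        self.values = [v for _, v in pairs]

    def insert(self, key, value):
        # Index update only; the client's model is deliberately left untouched.
        i = bisect.bisect_left(self.keys, key)
        self.keys.insert(i, key)
        self.values.insert(i, value)

    def read_range(self, lo, hi):
        # Stand-in for a one-sided remote read of slots lo..hi (inclusive).
        lo, hi = max(lo, 0), min(hi, len(self.keys) - 1)
        return list(zip(self.keys[lo:hi + 1], self.values[lo:hi + 1]))

    def lookup(self, key):
        # Stand-in for the slow path: a server-side index traversal.
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None


class LearnedCache:
    """Client-side cache: a linear model mapping a key to its predicted slot."""

    def __init__(self, keys):
        # Assumes a non-empty, sorted list of numeric keys.
        n = len(keys)
        mean_k = sum(keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # Worst-case prediction error over the training keys bounds the read range.
        self.max_err = max(abs(self.predict(k) - i) for i, k in enumerate(keys))

    def predict(self, key):
        return round(self.slope * key + self.intercept)

    def predict_range(self, key):
        p = self.predict(key)
        return p - self.max_err, p + self.max_err


def client_get(server, cache, key):
    """Return (value, used_fallback)."""
    lo, hi = cache.predict_range(key)
    for k, v in server.read_range(lo, hi):
        if k == key:  # validation: trust the model only if the key really is there
            return v, False
    # Model is stale (or the key is absent): fall back to the server-side index.
    return server.lookup(key), True
```

In this sketch, inserts at the server never touch the client's model. A stale model only costs an extra fallback lookup, because the key check on the fetched range catches wrong predictions; this is one simple way to picture how retraining can be decoupled from index updates.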

Syllabus

Intro
KVS: key pillar for distributed systems
Traditional KVS uses RPC (Server-centric)
Challenge: limited NIC abstraction
Existing systems adopt caching
High cache miss cost for caching tree: tree node size can be much larger than the KV
Trade-off of existing KVS
Overview of XSTORE: hybrid architecture
Our approach: learned cache. Using ML as the cache structure for tree-based index, motivated by the learned index [1]
Client-direct Get() using learned cache
Benefits of the learned cache
Challenges of learned cache
Outline of the remaining content: server-side data structure for dynamic workloads
Models cannot learn dynamic B+Tree address: can only learn when the addresses are sorted
Solution: another layer of indirection. Observation: leaf nodes are logically sorted
Client-direct Get() using model & TT
Model retraining: model is retrained at server in background threads. Small cost & extra CPU usage at the server
Stale model handling: background update causes stale learned models
Performance of XSTORE on YCSB: 100M KVs, uniform workloads
Sensitive to the dataset


Taught by

USENIX

Related Courses

Introduction to Artificial Intelligence
Stanford University via Udacity
Natural Language Processing
Columbia University via Coursera
Probabilistic Graphical Models 1: Representation
Stanford University via Coursera
Computer Vision: The Fundamentals
University of California, Berkeley via Coursera
Learning from Data (Introductory Machine Learning course)
California Institute of Technology via Independent