PARALLEL DATA LAB 

PDL Abstract

Using RDMA Efficiently for Key-Value Services

ACM SIGCOMM 2014. Chicago, Illinois, August 17-22, 2014. Supersedes CMU-PDL-14-106, June 2014.

Anuj Kalia, Michael Kaminsky†, David G. Andersen

Carnegie Mellon University
Pittsburgh, PA 15213

† Intel Labs

This paper describes the design and implementation of HERD, a key-value system designed to make the best use of an RDMA network. Unlike prior RDMA-based key-value systems, HERD focuses its design on reducing network round trips while using efficient RDMA primitives; the result is substantially lower latency, and throughput that saturates modern, commodity RDMA hardware.

HERD makes two unconventional design decisions. First, it does not use RDMA reads, despite the allure of operations that bypass the remote CPU entirely. Second, it uses a mix of RDMA and messaging verbs, despite the conventional wisdom that the messaging primitives are slow. A HERD client writes its request into the server’s memory; the server computes the reply. This design uses a single round trip for all requests and supports up to 26 million key-value operations per second with 5 μs average latency. Notably, for small key-value items, our full system throughput is similar to native RDMA read throughput and is over 2X higher than recent RDMA-based key-value systems. We believe that HERD further serves as an effective template for the construction of RDMA-based datacenter services.
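The request path described above (the client writes a request into server memory, the server computes the reply) can be illustrated with a toy, in-process simulation. This is only a sketch under stated assumptions, not HERD's implementation: it uses no RDMA hardware or verbs library, and the names ToyServer, ToyClient, SLOT_SIZE, the one-byte length header, and the text request format are all hypothetical choices made for illustration. A bytearray stands in for the server's registered request region, and a per-client queue stands in for the messaging-verb reply path.

```python
from collections import deque

SLOT_SIZE = 64  # hypothetical fixed size of each client's request slot, in bytes


class ToyServer:
    """Stand-in for a server that polls a request region in its own memory."""

    def __init__(self, num_clients):
        # One slot per client; byte 0 of a slot holds the request length (0 = empty).
        self.request_region = bytearray(num_clients * SLOT_SIZE)
        self.reply_queues = [deque() for _ in range(num_clients)]
        self.store = {}

    def poll_once(self):
        """Scan every slot, execute any pending request, and queue its reply."""
        for cid, replies in enumerate(self.reply_queues):
            start = cid * SLOT_SIZE
            length = self.request_region[start]
            if length == 0:
                continue  # no request pending in this slot
            raw = self.request_region[start + 1:start + 1 + length]
            replies.append(self._execute(bytes(raw).decode()))
            self.request_region[start] = 0  # mark the slot free again

    def _execute(self, request):
        op, _, rest = request.partition(" ")
        if op == "PUT":
            key, _, value = rest.partition(" ")
            self.store[key] = value
            return "OK"
        if op == "GET":
            return self.store.get(rest, "NOT_FOUND")
        return "ERROR"


class ToyClient:
    def __init__(self, server, cid):
        self.server = server
        self.cid = cid
        self.messages = 0  # simulated network messages, counted in both directions

    def call(self, request):
        payload = request.encode()
        if not 0 < len(payload) < SLOT_SIZE:
            raise ValueError("request must be 1 to SLOT_SIZE - 1 bytes")
        start = self.cid * SLOT_SIZE
        region = self.server.request_region
        # Simulated remote write: payload first, then the length byte that
        # makes the request visible to the polling server.
        region[start + 1:start + 1 + len(payload)] = payload
        region[start] = len(payload)
        self.messages += 1
        self.server.poll_once()  # in a real system the server polls on its own CPU
        reply = self.server.reply_queues[self.cid].popleft()
        self.messages += 1  # the reply is the second half of the single round trip
        return reply


server = ToyServer(num_clients=2)
client = ToyClient(server, cid=0)
client.call("PUT color blue")
print(client.call("GET color"), client.messages)
```

In this model every operation costs exactly two messages, one request and one reply, regardless of whether it is a GET or a PUT. The point of the sketch is the shape of the exchange: because the server's CPU handles the lookup, the client never needs a follow-up access to locate or validate the value.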

KEYWORDS: RDMA; InfiniBand; RoCE; Key-Value Stores
