16th USENIX Symposium on Networked Systems Design and Implementation (NSDI) Feb. 26–28, 2019, Boston, MA. BEST PAPER AWARD!
Anuj Kalia, Michael Kaminsky†, David G. Andersen
Carnegie Mellon University
†Intel Labs
It is commonly believed that datacenter networking software must sacrifice generality to attain high performance. The popularity of specialized distributed systems designed specifically for niche technologies such as RDMA, lossless networks, FPGAs, and programmable switches testifies to this belief. In this paper, we show that such specialization is not necessary. eRPC is a new general-purpose remote procedure call (RPC) library that offers performance comparable to specialized systems, while running on commodity CPUs in traditional datacenter networks based on either lossy Ethernet or lossless fabrics. eRPC performs well in three key metrics: message rate for small messages; bandwidth for large messages; and scalability to a large number of nodes and CPU cores. It handles packet loss, congestion, and background request execution. In microbenchmarks, one CPU core can handle up to 10 million small RPCs per second, or send large messages at 75 Gbps. We port a production-grade implementation of Raft state machine replication to eRPC without modifying the core Raft source code. We achieve 5.5 µs of replication latency on lossy Ethernet, which is faster than or comparable to specialized replication systems that use programmable switches, FPGAs, or RDMA.
FULL PAPER: pdf