PARALLEL DATA LAB 

PDL Abstract

FAWN: A Fast Array of Wimpy Nodes

22nd ACM Symposium on Operating Systems Principles (SOSP’09), October 11-14, 2009, Big Sky, MT, USA. BEST PAPER!

David G. Andersen, Jason Franklin, Michael Kaminsky*, Amar Phanishayee,
Lawrence Tan, Vijay Vasudevan

Carnegie Mellon University
Pittsburgh, PA 15213

*Intel Research Pittsburgh

http://www.pdl.cmu.edu/

This paper presents a new cluster architecture for low-power data-intensive computing. FAWN couples low-power embedded CPUs to small amounts of local flash storage, and balances computation and I/O capabilities to enable efficient, massively parallel access to data. The key contributions of this paper are the principles of the FAWN architecture and the design and implementation of FAWN-KV, a consistent, replicated, highly available, and high-performance key-value storage system built on a FAWN prototype. Our design centers on purely log-structured datastores, which provide the basis for high performance on flash storage and for replication and consistency, the latter obtained using chain replication on a consistent hashing ring. Our evaluation demonstrates that FAWN clusters can handle roughly 350 key-value queries per joule of energy, two orders of magnitude more than a disk-based system.
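
The three mechanisms named in the abstract (a log-structured datastore, a consistent hashing ring, and chain replication) can be illustrated with a small in-memory sketch. This is not the FAWN-KV implementation. The names LogStore, Ring, and ring_position are hypothetical, SHA-1 is an arbitrary choice of hash, and the replication factor of 3 is an assumption made for the example. The sketch follows the general chain replication technique, in which writes enter at the head of a chain and reads are answered by the tail.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string to a point on the ring (SHA-1 is an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class LogStore:
    """Append-only log plus an in-memory index from key to newest entry."""

    def __init__(self):
        self.log = []    # (key, value) entries; existing entries are never rewritten
        self.index = {}  # key -> position in the log of the newest entry

    def put(self, key, value):
        # An update is a new append; only the index entry is redirected.
        self.index[key] = len(self.log)
        self.log.append((key, value))

    def get(self, key):
        pos = self.index.get(key)
        return None if pos is None else self.log[pos][1]


class Ring:
    """Consistent hashing ring whose keys are replicated along a chain of nodes."""

    def __init__(self, node_names, replicas=3):
        # replicas=3 is an assumption for this sketch, not a value from the paper.
        self.replicas = min(replicas, len(node_names))
        self.points = sorted((ring_position(n), n) for n in node_names)
        self.stores = {n: LogStore() for n in node_names}

    def chain(self, key):
        """Return the key's successor on the ring followed by the next nodes."""
        start = bisect.bisect_left(self.points, (ring_position(key), ""))
        count = len(self.points)
        return [self.points[(start + i) % count][1] for i in range(self.replicas)]

    def put(self, key, value):
        # The write is applied at the head and passed down the chain to the tail.
        for node in self.chain(key):
            self.stores[node].put(key, value)

    def get(self, key):
        # Reads go to the tail, which only holds fully propagated writes.
        tail = self.chain(key)[-1]
        return self.stores[tail].get(key)
```

For example, after ring = Ring(["n1", "n2", "n3", "n4"]), a call to ring.put("k", "v") appends one entry to the log of each of the three nodes in ring.chain("k"), and ring.get("k") reads the value back from the last node of that chain. A real system would also need persistence, failure handling, and log compaction, all of which are omitted here.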
