PARALLEL DATA LAB

PDL Abstract

Living with Nondeterminism in Replicated Middleware Applications

Proceedings of Middleware 2006, ACM/IFIP/USENIX 7th International Middleware Conference, Melbourne, Australia, November 27 - December 1, 2006. Lecture Notes in Computer Science 4290, Springer, 2006.

Joseph Slember, Priya Narasimhan

Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213

http://www.pdl.cmu.edu/

Application-level nondeterminism can lead to inconsistent state that defeats the purpose of replication as a fault-tolerance strategy. We present Midas, a new approach for living with nondeterminism in distributed, replicated, middleware applications. Midas exploits (i) the static program analysis of the application's source code prior to replica deployment and (ii) the online compensation of replica divergence even as replicas execute. We identify the sources of nondeterminism within the application, discriminate between actual and superficial nondeterminism, and track the propagation of actual nondeterminism. We evaluate our techniques for the active replication of servers using micro-benchmarks that contain various sources of nondeterminism (multi-threading, system calls, and propagation).

FULL PAPER: pdf