Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-19-105, Aug 2019.
Rajat Kateja, Nathan Beckmann, Gregory R. Ganger
Carnegie Mellon University
TVARAK efficiently implements system-level redundancy for direct-access (DAX) NVM storage. Production storage systems complement device-level ECC (which covers media errors) with system-level checksums and cross-device parity. This system-level redundancy enables detection of and recovery from data corruption caused by device firmware bugs (e.g., reading data from the wrong physical location). Direct access to NVM penalizes software-only implementations of system-level redundancy, forcing a choice between no data protection and significant performance penalties. TVARAK is a hardware controller co-located with the last-level cache; offloading the update and verification of system-level redundancy to it enables efficient protection of data from such bugs in memory controller and NVM DIMM firmware. Simulation-based evaluation with seven data-intensive applications demonstrates TVARAK’s performance and energy efficiency. For example, TVARAK reduces Redis set-only performance by only 3%, compared to a 50% reduction for a state-of-the-art software-only approach.
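To make the idea of system-level redundancy concrete, the following is a minimal software sketch of per-page checksums combined with XOR parity across pages. It illustrates the general technique only, not TVARAK's hardware design. The names `Stripe`, `PAGE_SIZE`, `checksum`, and `xor_pages` are hypothetical, the choice of CRC32 is an assumption, and the `raw` argument simply simulates a device returning the wrong bytes (for example, data read from the wrong physical location).

```python
import zlib

PAGE_SIZE = 64  # hypothetical page size, kept small for illustration


def checksum(page: bytes) -> int:
    """System-level checksum of one page (CRC32 is an assumed choice)."""
    return zlib.crc32(page)


def xor_pages(pages):
    """Byte-wise XOR of equally sized pages."""
    out = bytearray(len(pages[0]))
    for page in pages:
        for i, byte in enumerate(page):
            out[i] ^= byte
    return bytes(out)


class Stripe:
    """Data pages plus one parity page; checksums are kept apart from the data."""

    def __init__(self, n_data: int):
        self.pages = [bytes(PAGE_SIZE) for _ in range(n_data)]
        self.checksums = [checksum(p) for p in self.pages]
        self.parity = xor_pages(self.pages)

    def write(self, idx: int, data: bytes) -> None:
        """Update the page, its checksum, and the parity together."""
        assert len(data) == PAGE_SIZE
        old = self.pages[idx]
        self.pages[idx] = data
        self.checksums[idx] = checksum(data)
        # Incremental parity update: new_parity = parity ^ old ^ new
        self.parity = xor_pages([self.parity, old, data])

    def read(self, idx: int, raw: bytes = None) -> bytes:
        """Verify the checksum on read; rebuild from parity on a mismatch.

        `raw` simulates what a buggy device returned instead of the real page.
        """
        data = self.pages[idx] if raw is None else raw
        if checksum(data) == self.checksums[idx]:
            return data
        # Corruption detected: reconstruct from the other pages and the parity.
        others = [p for i, p in enumerate(self.pages) if i != idx]
        recovered = xor_pages(others + [self.parity])
        if checksum(recovered) != self.checksums[idx]:
            raise IOError("unrecoverable corruption in page %d" % idx)
        return recovered
```

In this sketch every write touches three things (data, checksum, parity) and every read recomputes a checksum. That extra work on each access is the kind of overhead a software-only approach adds on the DAX path, and it is the work the abstract describes offloading to a hardware controller.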
KEYWORDS: NVM, DAX, redundancy, hardware offload
FULL TR: pdf