Parallel Data Laboratory

Other PDL Research Areas

Parallel Access Technologies: Prefetching

Transparent Informed Prefetching and Caching (TIP)
Automatic Prefetching Hints

Parallel Access Technologies: File Systems

Scotch Parallel File System
Network Support for Parallel File Systems

Parallel Access Technologies: Disk Arrays

Error-Handling in Redundant Disk Arrays
RAIDframe: An Extensible RAID Controller Framework
Parity Declustering
Parity Logging
Disk Arrays for Mobile Computers

Scotch High Performance Storage Testbeds

Parallel Access Technologies: Prefetching

We propose that applications issue hints that disclose their future I/O accesses. These hints help applications exploit disk array parallelism and take full advantage of available network bandwidth to minimize access latency.

Transparent Informed Prefetching and Caching (TIP)

For this project, we have implemented an aggressive prefetching strategy based on application access-pattern disclosure (hints) that allows high storage-system throughput to be converted to low application latency. We have also built a mechanism that dynamically allocates file buffers where they will have the greatest impact on application execution time. To do this, we estimate the impact of various allocations independently, weigh the costs against the benefits, and then allocate the buffers where they yield the most benefit.
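
As a rough illustration of the cost-benefit allocation described above, the following Python sketch hands out a fixed pool of buffers one at a time to whichever consumer reports the largest estimated net time saving for its next buffer. The consumer names, the benefit figures, and the function `allocate_buffers` are hypothetical and are not taken from the TIP implementation, whose estimators are not described here. The greedy choice is only guaranteed to be a good one if each consumer's benefit diminishes with every additional buffer, which this sketch assumes.

```python
import heapq


def allocate_buffers(total_buffers, marginal_benefit):
    """Greedily assign buffers to the consumer with the highest marginal benefit.

    marginal_benefit maps a (hypothetical) consumer name to a list in which
    entry i is the estimated net time saved by that consumer's (i+1)-th buffer.
    """
    allocation = {name: 0 for name in marginal_benefit}
    # heapq is a min-heap, so benefits are negated to pop the largest first.
    heap = [(-gains[0], name) for name, gains in marginal_benefit.items() if gains]
    heapq.heapify(heap)

    for _ in range(total_buffers):
        if not heap:
            break
        neg_gain, name = heapq.heappop(heap)
        if -neg_gain <= 0:
            break  # no remaining allocation is worth its cost
        allocation[name] += 1
        gains = marginal_benefit[name]
        if allocation[name] < len(gains):
            # Offer this consumer's next buffer for consideration.
            heapq.heappush(heap, (-gains[allocation[name]], name))
    return allocation


# Hypothetical estimates: hinted prefetching versus an ordinary LRU cache.
estimates = {
    "hinted_prefetch": [9.0, 7.0, 2.0],
    "lru_cache": [5.0, 4.0, 3.0, 1.0],
}
result = allocate_buffers(4, estimates)
```

With these made-up figures, the four buffers are split evenly, because the third prefetch buffer is estimated to save less time than either of the first two cache buffers.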

Related Links:

PDL TIP web pages

Our publications on TIP research

Automatic Prefetching Hints

Manually modifying applications to issue prefetching hints can require substantial programming and debugging effort. To address this concern, we have built a binary modification tool that transforms application binaries so that they will generate prefetching hints automatically. In particular, the transformation causes the applications to discover their future data needs by performing speculative execution while they would ordinarily be stalled on I/O.

Related Links:

For more details on hint generation, visit Automatic I/O Hint Generation through Speculative Execution
Within the list of TIP Publications, see the conference paper "Automatic I/O Hint Generation through Speculative Execution".

Parallel Access Technologies: File Systems

We are employing broad strategies that expose parallelism in high levels of system and application software, allowing storage devices to be used more efficiently and application programs' actual throughput and latency needs to be better met.

Scotch Parallel File System

The Scotch Parallel File System (SPFS) is an advanced multicomputer file system, developed concurrently with the first-generation Scotch High-Performance Storage Testbed (Scotch-1). SPFS has a number of interesting features, including:

High scalability by striping over independent storage servers,
Fault tolerance and availability selectable on a file-by-file basis,
Disclosure-based prefetching, write-behind, and resource management, and
Application-controlled cache consistency on read-write files.
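
The first feature, striping over independent storage servers, can be illustrated with a short Python sketch that maps a byte offset in a file to a server and to an offset within that server's share of the file. The round-robin layout, the stripe unit size, and the function name `locate` are assumptions chosen for illustration; the actual SPFS layout is not described here.

```python
def locate(offset, stripe_unit, num_servers):
    """Map a file byte offset to (server index, byte offset on that server).

    Assumes a simple round-robin layout: stripe unit k of the file lives on
    server k % num_servers.
    """
    if offset < 0 or stripe_unit <= 0 or num_servers <= 0:
        raise ValueError("offset must be >= 0; stripe_unit and num_servers must be > 0")
    unit = offset // stripe_unit        # index of the stripe unit holding the byte
    server = unit % num_servers         # server that stores this stripe unit
    local_unit = unit // num_servers    # how many units precede it on that server
    return server, local_unit * stripe_unit + offset % stripe_unit


# With 64-byte stripe units over 4 servers, consecutive units land on
# consecutive servers, so a large sequential read touches all of them.
placements = [locate(off, 64, 4) for off in (0, 64, 128, 192, 266)]
```

Because neighboring stripe units fall on different servers, adding servers raises the bandwidth available to a single large transfer, which is the scalability argument made in the list above.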

Related Links:

Scotch Parallel Storage System Publications
A section of the paper "The Scotch Parallel Storage Systems" discusses the design of SPFS.

Network Support for Parallel File Systems

We have built a new network parallel flow service that supports high-bandwidth data movement from parallel storage servers to parallel clients over switched networks (e.g., ATM, HiPPI, switched Ethernet, and switched FDDI). The two main components of the parallel flow service are: 1) a new network abstraction that supports parallel communication and bandwidth planning; and 2) a coordinated routing mechanism that dictates the flow of data, leading to better use of network resources and high application throughput.

Parallel Access Technologies: Disk Arrays

Redundant disk arrays are emerging as an important architecture for high-performance, high-reliability, cost-effective secondary storage. The broad goal of this project is to advance the state of the art in redundant disk arrays. Each of the following projects deals with reducing the complexity of RAID designs.

Error-Handling in Redundant Disk Arrays

Correctness verification of RAID implementations is difficult, and over 50 percent of the code is often devoted to error handling. The overall goal of this project is to enable correctness verification, decrease design-cycle time, and ensure that our error-handling method is extensible by decoupling implementation from design. We use antecedence graphs to achieve these goals.

RAIDframe: An Extensible RAID Controller Framework

To evaluate new RAID architectures and algorithms, we have developed an extensible RAID driver that runs as a simulator, a user-level software array controller, and a device driver in the kernel. This controller, RAIDframe, is based upon representing RAID read and write operations as directed acyclic graphs (DAGs) of primitive operations.
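
To make the graph representation concrete, here is a minimal Python sketch of a RAID level 5 small write expressed as a DAG of primitive operations and executed in dependency order. The node names, the dictionary standing in for disks, and the helpers `small_write_dag` and `run_dag` are hypothetical and do not reflect RAIDframe's actual data structures or interfaces; only the idea of a DAG of primitives comes from the description above. The sketch relies on `graphlib`, which is in the Python standard library from version 3.9.

```python
from graphlib import TopologicalSorter


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_dag(disks, data_disk, parity_disk, new_data):
    """Return {node name: (predecessor names, action)} for a small write."""
    state = {}  # values passed between primitive operations

    def read_old_data():
        state["old_data"] = disks[data_disk]

    def read_old_parity():
        state["old_parity"] = disks[parity_disk]

    def compute_parity():
        # new parity = old parity XOR old data XOR new data
        delta = xor_bytes(state["old_data"], new_data)
        state["new_parity"] = xor_bytes(state["old_parity"], delta)

    def write_data():
        disks[data_disk] = new_data

    def write_parity():
        disks[parity_disk] = state["new_parity"]

    return {
        "read_old_data": ((), read_old_data),
        "read_old_parity": ((), read_old_parity),
        "xor": (("read_old_data", "read_old_parity"), compute_parity),
        # The old data must be read before it is overwritten.
        "write_data": (("read_old_data",), write_data),
        "write_parity": (("xor",), write_parity),
    }


def run_dag(dag):
    """Execute every node after all of its predecessors; return the order used."""
    sorter = TopologicalSorter({name: set(preds) for name, (preds, _) in dag.items()})
    executed = []
    for name in sorter.static_order():
        dag[name][1]()
        executed.append(name)
    return executed


# Two data disks (0 and 1) and one parity disk (2), two bytes per block.
disks = {0: b"\x01\x02", 1: b"\x10\x20", 2: b"\x11\x22"}
order = run_dag(small_write_dag(disks, data_disk=0, parity_disk=2, new_data=b"\xff\x00"))
```

Expressing the operation as data rather than as hand-written control flow means a new architecture can be tried by supplying a different graph, while the executor stays the same.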

Related Links:

RAIDframe code and documentation
The PDL RAID web pages
A port of the RAIDframe kernel driver to NetBSD.

Parity Declustering

This project, now finished, resulted in a RAID organization for improving performance during the recovery of a failed disk.

Related Links:

Mark Holland's thesis and other papers on declustering research.

Parity Logging

This project, now finished, resulted in a RAID organization for improving throughput for workloads that emphasize small, random writes.
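
The general idea of logging parity updates instead of applying each one immediately can be sketched in Python as follows. This is a simplified in-memory illustration: the class name `ParityLogArray`, the single-stripe layout, and the explicit `flush` call are assumptions made for the example and are not details of the project's design, which is covered in the papers linked below.

```python
class ParityLogArray:
    """One stripe of data blocks plus a parity block with deferred updates."""

    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = self._xor_all(self.data)
        self.log = []  # pending parity update images, in arrival order

    @staticmethod
    def _xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def _xor_all(self, blocks):
        result = bytes(len(blocks[0]))  # all-zero block of the same size
        for block in blocks:
            result = self._xor(result, block)
        return result

    def write(self, index, new_block):
        # Log the difference between old and new data; parity is not touched yet.
        self.log.append(self._xor(self.data[index], new_block))
        self.data[index] = new_block

    def flush(self):
        # Apply all logged update images to the parity in one batch.
        for image in self.log:
            self.parity = self._xor(self.parity, image)
        self.log.clear()


array = ParityLogArray([b"\x01", b"\x02", b"\x04"])
array.write(0, b"\x08")
array.write(2, b"\x10")
stale_parity = array.parity   # still reflects the original data
array.flush()                 # parity now matches the current data
```

In this sketch each small write appends to a log rather than updating the parity block in place, and the parity is brought up to date in one batch when `flush` is called.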

Related Links:

Our papers on logging research.

Disk Arrays for Mobile Computers

There is great demand for small disks that combine low maintenance with high storage capacity. Our goal is to replace the single disk in portable systems with an array of four to six 1" disks that would provide both the needed capacity and lower power use. Adding a redundant disk will also increase the reliability and availability of storage on mobile computers.

Related Links:

Rachad Youssef's Master's thesis "RAID for Mobile Computers"

Scotch High Performance Storage Testbeds

The first Scotch testbed, Scotch-1, no longer in use, was primarily used for the early Transparent Informed Prefetching research. Scotch-1 was composed of a 25 MHz DECstation 5000/200 with a TURBOchannel system bus (100 MB/s) running the Mach 3.0 operating system. It was equipped with two SCSI buses and four 300 MB IBM 0661 "Lightning" drives.

The second Scotch testbed, Scotch-2, was a larger and faster version of Scotch-1 used for the RAID architecture and implementation research in the Error Recovery and RAIDframe projects and for second-generation TIP experiments. Scotch-2 was composed of a 150 MHz DEC 3000/500 (Alpha) workstation running the OSF/1 operating system and equipped with six fast SCSI bus controllers. Each bus had five HP 2247 drives, giving the total system a capacity of 30 GB.

The third testbed, Scotch-3, was the storage component in a heterogeneous multicomputer composed of 38 workstations, 30 DEC 3000 (Alpha) and 8 IBM RS6000 (PowerPC), distributed over switched-HIPPI and OC3 ATM networks. This multicomputer was used for parallel application, parallel programming tool, and multicomputer operating system experiments in addition to TIP research. Scotch-3 was composed of ten DEC 3000 (Alpha) workstations with TURBOchannel system buses. Each workstation contained one fast, wide, differential SCSI adapter connected to both controllers of an AT&T (NCR) 6299 disk array. All workstations were interconnected by OC3 (155 Mbit/s) links to a FORE ASX-200 ATM switch complex, and five of the workstations were also connected by HIPPI (800 Mbit/s) links to an NSC PS-32 HIPPI switch complex. All storage was available to any node through the Scotch parallel file system and the appropriate routing.

Related Links:

An HTML version of "The Scotch Parallel Storage Systems"

Acknowledgements

We thank the members and companies of the PDL Consortium: Amazon, Bloomberg LP, Datadog, Google, Honda, Intel Corporation, Jane Street, LayerZero Research, Meta, Microsoft Research, Oracle Corporation, Oracle Cloud Infrastructure, Pure Storage, Salesforce, Samsung Semiconductor Inc., and Western Digital for their interest, insights, feedback, and support.