PARALLEL DATA LAB

Other PDL Research Areas

  • Parallel Access Technologies: Prefetching
    • Transparent Informed Prefetching and Caching (TIP)
    • Automatic Prefetching Hints

  • Parallel Access Technologies: File Systems
  • Parallel Access Technologies: Disk Arrays
  • Scotch High Performance Storage Testbeds

    Parallel Access Technologies: Prefetching

    We propose that applications issue hints that disclose their future I/O accesses, helping them exploit disk array parallelism and take full advantage of available network bandwidth to minimize access latency.

    Transparent Informed Prefetching and Caching (TIP)

    For this project, we have implemented an aggressive prefetching strategy based on application access-pattern disclosure (hints) that allows high storage-system throughput to be converted to low application latency. We have also built a mechanism which allocates file buffers dynamically where they will have the best impact on application execution time. To do this, we estimate the impact of various allocations independently, weigh the costs against the benefits, and then allocate the buffers for the most benefit.
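
    The cost-benefit step described above can be illustrated with a minimal sketch. It is not the TIP implementation: the Estimate class, the pick_transfer function, and all millisecond figures are hypothetical, and it assumes each buffer consumer can report an estimated benefit of gaining one buffer and an estimated cost of losing one.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Hypothetical per-consumer estimate, expressed in application time."""
    name: str
    benefit_ms: float  # estimated time saved by gaining one more buffer
    cost_ms: float     # estimated time lost by giving up one buffer


def pick_transfer(estimates):
    """Return (taker, giver) if moving one buffer is a net win, else None."""
    # The consumer that would gain the most from one extra buffer.
    taker = max(estimates, key=lambda e: e.benefit_ms)
    givers = [e for e in estimates if e is not taker]
    if not givers:
        return None
    # The consumer that would lose the least by giving one buffer up.
    giver = min(givers, key=lambda e: e.cost_ms)
    # Reallocate only when the estimated benefit outweighs the estimated cost.
    if taker.benefit_ms > giver.cost_ms:
        return taker.name, giver.name
    return None


# Illustrative numbers only (assumed, not measured).
estimates = [
    Estimate("hinted_prefetch", benefit_ms=4.0, cost_ms=6.0),
    Estimate("lru_cache", benefit_ms=0.5, cost_ms=1.0),
]
transfer = pick_transfer(estimates)
```

    Each consumer's estimate is computed independently of the others, so the comparison needs only a common unit (here, milliseconds of execution time).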

    Automatic Prefetching Hints

    Manually modifying applications to issue prefetching hints can require substantial programming and debugging effort. To address this concern, we have built a binary modification tool that transforms application binaries so that they will generate prefetching hints automatically. In particular, the transformation causes the applications to discover their future data needs by performing speculative execution while they would ordinarily be stalled on I/O.

    Parallel Access Technologies: File Systems

    We are employing broad strategies that expose parallelism in high levels of system and application software, allowing storage devices to be used more efficiently and application programs' actual throughput and latency needs to be better met.

    Scotch Parallel File System

    The Scotch Parallel File System (SPFS) is an advanced multicomputer file system, developed concurrently with the first-generation Scotch High-Performance Storage Testbed (Scotch-1). SPFS has a number of interesting features, including:

    • High scalability by striping over independent storage servers,
    • Fault tolerance and availability selectable on a file-by-file basis,
    • Disclosure-based prefetching, write-behind, and resource management, and
    • Application-controlled cache consistency on read-write files.
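
    As an illustration of the first feature, striping can be sketched as a mapping from a file offset to a storage server. The round-robin layout, the locate function, and the 64 KB stripe unit below are assumptions for illustration; the text above does not describe the layout SPFS actually uses.

```python
def locate(offset, stripe_unit, num_servers):
    """Map a file byte offset to (server index, byte offset on that server).

    Assumes simple round-robin striping: consecutive stripe units of the
    file are placed on consecutive servers.
    """
    unit = offset // stripe_unit        # which stripe unit of the file
    server = unit % num_servers         # round-robin server choice
    local_unit = unit // num_servers    # stripe units already on that server
    return server, local_unit * stripe_unit + offset % stripe_unit


STRIPE_UNIT = 64 * 1024  # hypothetical stripe unit size in bytes
first = locate(0, STRIPE_UNIT, 4)
second = locate(STRIPE_UNIT, STRIPE_UNIT, 4)
wrapped = locate(4 * STRIPE_UNIT + 10, STRIPE_UNIT, 4)
```

    Because adjacent stripe units land on different servers, a large sequential transfer can draw on all servers at once.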

    Network Support for Parallel File Systems

    We have built a new network parallel flow service that supports high-bandwidth data movement from parallel storage servers to parallel clients over switched networks (e.g., ATM, HiPPI, switched Ethernet, and switched FDDI). The two main components of the parallel flow service are: 1) a new network abstraction that supports parallel communication and bandwidth planning; and 2) a coordinated routing mechanism that dictates the flow of data, leading to better use of network resources and high application throughput.

    Parallel Storage Technology: Disk Arrays

    Redundant disk arrays are emerging as an important architecture for high-performance, high-reliability, cost-effective secondary storage. The broad goal of this project is to advance the state of the art of redundant disk arrays. Each of the following projects deals with reducing the complexity of RAID designs.

    Error-Handling in Redundant Disk Arrays

    Correctness verification of RAID implementations is difficult and over 50 percent of code is often devoted to error-handling. The overall goal of this project is to enable correctness verification, decrease design-cycle time, and ensure that our error-handling method is extensible by decoupling implementation from design. We use antecedence graphs to achieve these goals.

    RAIDframe: An Extensible RAID Controller Framework

    To evaluate new RAID architectures and algorithms, we have developed an extensible RAID driver that runs as a simulator, a user-level software array controller, and a device driver in the kernel. This controller, RAIDframe, is based upon representing RAID read and write operations as directed acyclic graphs (DAGs) of primitive operations.
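
    To make the DAG representation concrete, here is a minimal sketch of a RAID level 5 small write expressed as a graph of primitive operations and run in dependency order. It is not RAIDframe code: the small_write function, the node names, and the in-memory "disks" list are hypothetical, and the standard-library graphlib module (Python 3.9+) stands in for a real graph execution engine.

```python
from graphlib import TopologicalSorter


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(disks, data_disk, parity_disk, new_data):
    """Run a RAID level 5 small write as a DAG of primitive operations."""
    ctx = {}  # intermediate results passed between nodes

    def read_old_data():
        ctx["old_data"] = disks[data_disk]

    def read_old_parity():
        ctx["old_parity"] = disks[parity_disk]

    def xor():
        # new parity = old parity XOR old data XOR new data
        delta = xor_bytes(ctx["old_data"], new_data)
        ctx["new_parity"] = xor_bytes(ctx["old_parity"], delta)

    def write_data():
        disks[data_disk] = new_data

    def write_parity():
        disks[parity_disk] = ctx["new_parity"]

    ops = {f.__name__: f for f in
           (read_old_data, read_old_parity, xor, write_data, write_parity)}
    # Each node maps to the set of nodes that must complete before it.
    deps = {
        "read_old_data": set(),
        "read_old_parity": set(),
        "xor": {"read_old_data", "read_old_parity"},
        "write_data": {"read_old_data"},
        "write_parity": {"xor"},
    }
    order = list(TopologicalSorter(deps).static_order())
    for name in order:
        ops[name]()
    return order


# Two data blocks and one parity block held in memory (illustrative only).
d0, d1 = b"\x01\x02", b"\x0f\x0f"
disks = [d0, d1, xor_bytes(d0, d1)]
order = small_write(disks, data_disk=0, parity_disk=2, new_data=b"\xff\x00")
```

    Expressing the operation as a graph keeps the ordering constraints in data rather than in control flow, so a different RAID architecture would need only a different graph.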

    Parity Declustering

    This project, now finished, resulted in a RAID organization for improving performance during the recovery of a failed disk.
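
    The text above does not describe the layout, so the following is only a sketch of the general declustering idea under stated assumptions: parity stripes are narrower than the array, and stripes are spread over every combination of disks (a complete block design). The function names and the five-disk, width-three configuration are hypothetical, and parity placement within each stripe is omitted.

```python
from collections import Counter
from itertools import combinations


def declustered_stripes(num_disks, stripe_width):
    """Assign one parity stripe to every stripe_width-sized subset of disks."""
    return list(combinations(range(num_disks), stripe_width))


def rebuild_load(stripes, failed):
    """Count, per surviving disk, the stripe units read to rebuild `failed`."""
    load = Counter()
    for stripe in stripes:
        if failed in stripe:
            # Rebuilding a lost unit reads every other unit in its stripe.
            load.update(d for d in stripe if d != failed)
    return load


stripes = declustered_stripes(num_disks=5, stripe_width=3)
load = rebuild_load(stripes, failed=0)
# Each surviving disk holds units from 6 stripes in this layout.
units_per_disk = sum(1 for s in stripes if 1 in s)
```

    In this configuration the rebuild reads are spread evenly over all four survivors, and each survivor reads only some of its units rather than all of them, which is the effect that reduces the performance impact of recovery.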

    Parity Logging

    This project, now finished, resulted in a RAID organization for improving throughput for workloads that emphasize small, random writes.
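
    The mechanism is not described above, so this sketch rests on an assumption about the general idea: each small write records a parity update image (old data XOR new data) in a log instead of updating parity immediately, and the log is applied to the parity later in a batch. The ParityLoggingStripe class models a single in-memory stripe and is hypothetical throughout.

```python
from functools import reduce


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


class ParityLoggingStripe:
    """One stripe whose parity updates are deferred through a log."""

    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = reduce(xor_bytes, self.data)
        self.log = []  # pending parity update images

    def write(self, index, new_block):
        # Log the change to parity rather than rewriting parity now.
        self.log.append(xor_bytes(self.data[index], new_block))
        self.data[index] = new_block

    def apply_log(self):
        # Fold every logged update into the parity in one pass.
        for delta in self.log:
            self.parity = xor_bytes(self.parity, delta)
        self.log.clear()


stripe = ParityLoggingStripe([b"\x01\x01", b"\x02\x02", b"\x04\x04"])
stripe.write(0, b"\xff\x00")
stripe.write(2, b"\x10\x20")
pending = len(stripe.log)
stripe.apply_log()
```

    Until apply_log runs, the stored parity is stale and the log is needed to reconstruct data, so a real design would have to keep the log on reliable storage.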

    Disk Arrays for Mobile Computers

    There is a great demand for low maintenance coupled with high storage capacity from small disks. Our goal is to replace the single disk in portable systems with an array of four to six 1" disks that would provide both the needed capacity and lower power use. Adding a redundant disk will also increase the reliability and availability of storage on mobile computers.

    Scotch High Performance Storage Testbeds

    The first Scotch testbed, Scotch-1, no longer in use, was primarily used for the early Transparent Informed Prefetching research. Scotch-1 was composed of a 25 MHz DECstation 5000/200 with a TURBOchannel system bus (100 MB/s) running the Mach 3.0 operating system. It was equipped with two SCSI buses and four 300 MB IBM 0661 "Lightning" drives.

    The second Scotch testbed, Scotch-2, was a larger and faster version of Scotch-1 used for the RAID architecture and implementation research in the Error Recovery and RAIDframe projects and for second-generation TIP experiments. Scotch-2 was composed of a 150 MHz DEC 3000/500 (Alpha) workstation running the OSF/1 operating system and equipped with six fast SCSI bus controllers. Each bus had five HP 2247 drives, giving the total system a capacity of 30 GB.

    The third testbed, Scotch-3, was the storage component in a heterogeneous multicomputer composed of 38 workstations, 30 DEC 3000 (Alpha) and 8 IBM RS/6000 (PowerPC), distributed over switched HIPPI and OC3 ATM networks. This multicomputer was used for parallel application, parallel programming tool, and multicomputer operating system experiments in addition to TIP research. Scotch-3 was composed of ten DEC 3000 (Alpha) workstations with TURBOchannel system buses. Each workstation contained one fast, wide, differential SCSI adapter connected to both controllers of an AT&T (NCR) 6299 disk array. All workstations were interconnected by OC3 (155 Mbit/s) links to a FORE ASX-200 ATM switch complex, and five of the workstations were also connected by HIPPI (800 Mbit/s) links to an NSC PS-32 HIPPI switch complex. All storage was available to any node through the Scotch parallel file system and the appropriate routing.

    Acknowledgements

    We thank the members and companies of the PDL Consortium: Amazon, Bloomberg, Datadog, Google, Honda, Intel Corporation, IBM, Jane Street, Meta, Microsoft Research, Oracle Corporation, Pure Storage, Salesforce, Samsung Semiconductor Inc., Two Sigma, and Western Digital for their interest, insights, feedback, and support.