PARALLEL DATA LAB 

PDL Abstract

A Performance Study of Sequential I/O on Windows NT

Appears in the Proceedings of the Second Usenix Windows/NT Symposium, Seattle, Washington, August 1998. Best Student Paper Award.

Erik Riedel*, Catharine van Ingen, Jim Gray

Microsoft Research
Bay Area Research Center
San Francisco, CA 94105

*Department of Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213

http://www.pdl.cmu.edu/

Large-scale database, data mining, and multimedia applications require large sequential transfers, and bandwidth is a key requirement for them. This paper investigates the performance of reading and writing large sequential files using the Windows NT(tm) 4.0 File System. The study explores the performance of Intel Pentium Pro(tm) based memory and I/O subsystems, including the processor bus, the PCI bus, the SCSI bus, the disk controllers, and the disk media in a typical server or high-end desktop system. We provide details of the overhead costs at each level of the system and examine a variety of the available tuning knobs. We show that NTFS out-of-the-box performance is quite good, but that overheads for small requests can be quite high. The best performance is achieved by using large requests, bypassing the file system cache, spreading the data across many disks and controllers, and using deep asynchronous requests. This combination allows us to reach or exceed the half-power point of all the individual hardware components.
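The link between request size, per-request overhead, and the half-power point can be illustrated with a minimal sketch. It assumes a simple fixed-overhead cost model (each request pays a constant overhead plus transfer time at peak bandwidth) and takes "half-power point" to mean the operating point at which a component delivers half of its peak bandwidth. The function names and the numbers in the usage lines are hypothetical and are not taken from the paper.

```python
def throughput(request_bytes, overhead_s, peak_bytes_per_s):
    """Effective bytes/second for one request under a fixed-overhead model.

    Assumed model (not from the paper): time = overhead + size / peak bandwidth.
    """
    transfer_s = request_bytes / peak_bytes_per_s
    return request_bytes / (overhead_s + transfer_s)


def half_power_request_size(overhead_s, peak_bytes_per_s):
    """Request size at which the model above reaches half of peak bandwidth.

    Setting size / (overhead + size / peak) = peak / 2 and solving for size
    gives size = overhead * peak.
    """
    return overhead_s * peak_bytes_per_s


if __name__ == "__main__":
    # Illustrative values only: 1 ms of per-request overhead, 10 MB/s peak.
    overhead_s = 0.001
    peak = 10_000_000.0
    n_half = half_power_request_size(overhead_s, peak)
    for size in (512, 4096, int(n_half), 65536, 1_048_576):
        fraction = throughput(size, overhead_s, peak) / peak
        print(f"{size:>9} bytes: {fraction:.2f} of peak")
```

Under this model, small requests are dominated by the fixed overhead, which is consistent with the abstract's observation that small requests are costly and large requests perform best. Real devices also have seek, rotation, and queuing effects that this sketch ignores.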
