Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-05-105, April 2005. Superseded by SOSP'05, October 23-26, 2005, Brighton, United Kingdom.
Craig A.N. Soules, Gregory R. Ganger
Parallel Data Laboratory, Carnegie Mellon University.
Pittsburgh, PA 15213
The continued growth of personal file systems demands a shift from manual file organization to effective on-demand search tools. Today’s best search tools use content analysis techniques to provide targeted, ranked results for user queries. However, these tools are missing a key way that users remember and search for their data: context. Context is the set of external events that a user associates with a file’s use: the user’s current task, other files being accessed, the time of day, etc. This paper presents Connections, a search system that combines content analysis with context information using temporal locality of file accesses. Through this combination, Connections improves both the false-negative rate (recall) and false-positive rate (precision) over content analysis alone. That is, by adding context information, our system finds more of the desired files and ranks them more accurately.
KEYWORDS: file search, contextual search, successor models
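ILLUSTRATIVE SKETCH: The abstract's idea of combining content analysis with context drawn from the temporal locality of file accesses can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the algorithm evaluated in the paper: the function names `build_relation_graph` and `expand_results`, the fixed 30-second window, and the `weight` parameter are all hypothetical choices made for this example. The sketch links files accessed shortly after one another, then uses those links to extend and re-rank the results of a content-only search.

```python
from collections import defaultdict


def build_relation_graph(trace, window=30.0):
    """Link each file to the files accessed within `window` seconds after it.

    trace: list of (timestamp_seconds, path) tuples sorted by timestamp.
    Returns graph[a][b] = number of times b followed a inside the window.
    The window size is an assumption chosen for illustration.
    """
    graph = defaultdict(lambda: defaultdict(int))
    start = 0
    for i, (t, path) in enumerate(trace):
        # Drop accesses that are too old to be related to the current one.
        while trace[start][0] < t - window:
            start += 1
        for j in range(start, i):
            earlier = trace[j][1]
            if earlier != path:
                graph[earlier][path] += 1
    return graph


def expand_results(content_scores, graph, weight=0.5):
    """Add context-related files to a content-only result set.

    content_scores: dict mapping path -> score from content analysis.
    Each hit passes a share of its score (scaled by the hypothetical
    `weight`) to its successors, in proportion to their link counts.
    Returns (path, score) pairs sorted by descending score.
    """
    scores = dict(content_scores)
    for src, score in content_scores.items():
        successors = graph.get(src, {})
        total = sum(successors.values())
        for dst, count in successors.items():
            scores[dst] = scores.get(dst, 0.0) + weight * score * count / total
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Hypothetical access trace: three files used together, one much later.
    trace = [(0, "report.doc"), (5, "figures.xls"),
             (10, "notes.txt"), (100, "music.mp3")]
    graph = build_relation_graph(trace)
    # Content analysis alone matched only report.doc.
    for path, score in expand_results({"report.doc": 1.0}, graph):
        print(path, score)
```

In this toy trace, a content-only search that matches just report.doc also surfaces figures.xls and notes.txt, because they were accessed within the window after it, while music.mp3 stays out of the results. This mirrors the abstract's claim that context can recover desired files that content analysis alone misses.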
FULL PAPER (TR VERSION): pdf
FULL PAPER (CONFERENCE VERSION): pdf