PARALLEL DATA LAB 

PDL Abstract

Toward Automatic Context-based Attribute Assignment for Semantic File Systems

Carnegie Mellon University Parallel Data Lab Technical Report CMU-PDL-04-105, June 2004.

Craig A. N. Soules, Gregory R. Ganger

Dept. Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213

http://www.pdl.cmu.edu/

Semantic file systems enable users to search for files based on attributes rather than just pre-assigned names. This paper develops and evaluates several new approaches to automatically generating file attributes based on context, complementing existing approaches based on content analysis. Context captures broader system state that can be used to provide new attributes for files and to propagate attributes among related files. Context is also how humans often remember previous items [2], and so should fit the primary role of semantic file systems well. Based on our study of ten systems over four months, adding context-based mechanisms reduces the number of files with zero attributes by 73% on average. This increases the total number of classifiable files by over 25% in most cases, as shown in Figure 1. On average, 71% of the content-analyzable files also gain additional valuable attributes.

KEYWORDS: semantic file systems, context, attributes, classification
