Signature files

Abstract
The signature-file access method for text retrieval is studied. According to this method, documents are stored sequentially in the "text file." Abstractions ("signatures") of the documents are stored in the "signature file." The latter serves as a filter on retrieval: It helps in discarding a large number of nonqualifying documents. In this paper two methods for creating signatures are studied analytically, one based on word signatures and the other on superimposed coding. Closed-form formulas are derived for the false-drop probability of the two methods, factors that affect it are studied, and performance comparisons of the two methods based on these formulas are provided.
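The filtering role of the signature file can be illustrated with a minimal sketch of superimposed coding. It is not the paper's construction: the signature width `F`, the bits-per-word count `M`, the SHA-256-based bit selection, and all function names are assumptions chosen for illustration only. Each word sets a few bit positions, a document signature is the bitwise OR of its word signatures, and a query word matches a document only if all of its bits are present. Candidates that pass the filter are then checked against the text, which discards false drops.

```python
import hashlib

F = 64  # signature width in bits (illustrative assumption)
M = 3   # bit positions chosen per word (illustrative assumption)


def word_signature(word: str, f: int = F, m: int = M) -> int:
    """Return an f-bit signature with up to m bits set for the word."""
    sig = 0
    for i in range(m):
        # Derive the i-th bit position from a hash of the word (hypothetical scheme).
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode()).digest()
        pos = int.from_bytes(digest[:8], "big") % f
        sig |= 1 << pos
    return sig


def document_signature(text: str) -> int:
    """Superimpose (bitwise OR) the signatures of all words in the document."""
    sig = 0
    for word in text.split():
        sig |= word_signature(word)
    return sig


def search(word: str, text_file: list[str], signature_file: list[int]) -> list[str]:
    """Scan the signature file first, then verify candidates against the text."""
    query = word_signature(word)
    target = word.lower()
    hits = []
    for doc, sig in zip(text_file, signature_file):
        if sig & query != query:
            continue  # at least one query bit is missing: document cannot qualify
        # The signature matched; check the text itself to rule out a false drop.
        if target in (w.lower() for w in doc.split()):
            hits.append(doc)
    return hits
```

A document that contains the query word always passes the filter, so no qualifying document is lost; the only cost of a false drop is one unnecessary text check.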
