Signature files
- 1 October 1984
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Information Systems
- Vol. 2 (4), 267-288
- https://doi.org/10.1145/2275.357411
Abstract
The signature-file access method for text retrieval is studied. According to this method, documents are stored sequentially in the "text file." Abstractions ("signatures") of the documents are stored in the "signature file." The latter serves as a filter on retrieval: It helps in discarding a large number of nonqualifying documents. In this paper two methods for creating signatures are studied analytically, one based on word signatures and the other on superimposed coding. Closed-form formulas are derived for the false-drop probability of the two methods, factors that affect it are studied, and performance comparisons of the two methods based on these formulas are provided.
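The filtering idea in the abstract can be illustrated with a minimal sketch of superimposed coding. This is not the paper's implementation: the signature width `F`, the number of bits per word `M`, the use of SHA-256 to pick bit positions, whitespace tokenization, and all function names are assumptions chosen for illustration. Each word sets a few bits, a document signature is the bitwise OR of its word signatures, and a query is checked against the signatures before the text itself is read. A document whose signature passes the filter but does not contain the word is a false drop.

```python
import hashlib

F = 64  # signature width in bits (assumed value, for illustration only)
M = 3   # bit positions hashed per word (assumed value, for illustration only)


def word_signature(word, f=F, m=M):
    """Return an f-bit integer with at most m bits set, derived from the word."""
    sig = 0
    for i in range(m):
        # Hypothetical hashing scheme: salt the word with i to get m positions.
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode("utf-8")).digest()
        pos = int.from_bytes(digest[:8], "big") % f
        sig |= 1 << pos
    return sig


def document_signature(text, f=F, m=M):
    """Superimpose (bitwise OR) the signatures of all words in the text."""
    sig = 0
    for word in text.split():
        sig |= word_signature(word, f, m)
    return sig


def candidates(query_word, signatures):
    """Indices of documents whose signature covers every bit of the query."""
    q = word_signature(query_word)
    return [i for i, sig in enumerate(signatures) if sig & q == q]


def search(query_word, documents, signatures):
    """Filter on signatures, then verify against the text to drop false drops."""
    return [
        documents[i]
        for i in candidates(query_word, signatures)
        if query_word.lower() in documents[i].lower().split()
    ]


documents = [
    "signature files for text retrieval",
    "inverted data base structures",
]
signatures = [document_signature(d) for d in documents]
print(search("signature", documents, signatures))
```

The signature test can never reject a document that actually contains the query word, because that word's bits were ORed into the document signature. It can only admit extra documents, which the verification step against the text then discards.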
This publication has 15 references indexed in Scilit:
- Design Considerations for a Message File Server. IEEE Transactions on Software Engineering, 1984
- Estimating block transfers and join sizes. Published by Association for Computing Machinery (ACM), 1983
- Text Retrieval Computers. Computer, 1979
- FIRST: Flexible Information Retrieval System for Text. Journal of the American Society for Information Science, 1979
- A fast string searching algorithm. Communications of the ACM, 1977
- Fast Pattern Matching in Strings. SIAM Journal on Computing, 1977
- Associative/parallel processors for searching very large textual data bases. Published by Association for Computing Machinery (ACM), 1977
- Efficient string matching. Communications of the ACM, 1975
- Analysis and performance of inverted data base structures. Communications of the ACM, 1975
- Implementation of the substring test by hashing. Communications of the ACM, 1971