A vector space model for automatic indexing

Open Access

1 November 1975

journal article
research article
Published by Association for Computing Machinery (ACM) in Communications of the ACM

Vol. 18 (11), 613-620
https://doi.org/10.1145/361219.361220

Abstract

In a document retrieval, or other pattern matching environment where stored entities (documents) are compared with each other or with incoming patterns (search requests), it appears that the best indexing (property) space is one where each entity lies as far away from the others as possible; in these circumstances the value of an indexing system may be expressible as a function of the density of the object space; in particular, retrieval performance may correlate inversely with space density. An approach based on space density computations is used to choose an optimum indexing vocabulary for a collection of documents. Typical evaluation results are shown, demonstrating the usefulness of the model.
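
As an illustration of the space density idea, the sketch below treats each document as a term-weight vector and measures density as the mean pairwise cosine similarity. This is one plausible reading and an assumption: the abstract does not give a formula, and the names cosine, space_density, crowded and spread are hypothetical. A lower value means the documents lie farther apart, which is the situation the abstract associates with better retrieval.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention: an empty vector is similar to nothing
    return dot / (norm_u * norm_v)


def space_density(doc_vectors):
    """Mean pairwise cosine similarity (an assumed density measure)."""
    pairs = list(combinations(doc_vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy collections over a three-term vocabulary (illustrative data only).
crowded = [[1, 1, 0], [1, 1, 0], [1, 1, 1]]  # documents share most terms
spread = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # documents share no terms

print(space_density(crowded) > space_density(spread))
```

Under this assumed measure, an indexing vocabulary could be compared against alternatives by recomputing the density after adding or removing a term and preferring the change that lowers it.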

This publication has 2 references indexed in Scilit:

ON THE SPECIFICATION OF TERM VALUES IN AUTOMATIC INDEXING
Journal of Documentation, 1973
A STATISTICAL INTERPRETATION OF TERM SPECIFICITY AND ITS APPLICATION IN RETRIEVAL
Journal of Documentation, 1972

Cited by 4392 articles