Cluster-based retrieval using language models

25 July 2004

conference paper
Published by Association for Computing Machinery (ACM)

p. 186-193
https://doi.org/10.1145/1008992.1009026

Abstract

Previous research on cluster-based retrieval has been inconclusive as to whether it improves retrieval effectiveness over document-based retrieval. Recent developments in the language modeling approach to IR have motivated us to re-examine this problem within this new retrieval framework. We propose two new models for cluster-based retrieval and evaluate them on several TREC collections. We show that cluster-based retrieval can perform consistently across collections of realistic size, and that significant improvements over document-based retrieval can be obtained in a fully automatic manner, without relevance information provided by humans.

This publication has 20 references indexed in Scilit:

Passage retrieval based on language models
Published by Association for Computing Machinery (ACM), 2002
The effectiveness of query-specific hierarchic clustering in information retrieval
Information Processing & Management, 2002
A probabilistic model of information retrieval: development and comparative experiments
Information Processing & Management, 2000
Comparison of Hierarchic Agglomerative Clustering Methods for Document Retrieval
The Computer Journal, 1989
Recent trends in hierarchic document clustering: A critical review
Information Processing & Management, 1988
Using interdocument similarity information in document retrieval systems
Journal of the American Society for Information Science, 1986
A model of cluster searching based on classification
Information Systems, 1980
Document clustering: An evaluation of some experiments with the cranfield 1400 collection
Information Processing & Management, 1975
The use of hierarchic clustering in information retrieval
Information Storage and Retrieval, 1971

Cited by 239 articles