Context-sensitive learning methods for text categorization
- 1 April 1999
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Information Systems
- Vol. 17 (2), 141-173
- https://doi.org/10.1145/306686.306688
Abstract
Two recently implemented machine-learning algorithms, RIPPER and sleeping-experts for phrases, are evaluated on a number of large text categorization problems. These algorithms both construct classifiers that allow the “context” of a word w to affect how (or even whether) the presence or absence of w will contribute to a classification. However, RIPPER and sleeping-experts differ radically in many other respects: differences include different notions as to what constitutes a context, different ways of combining contexts to construct a classifier, different methods to search for a combination of contexts, and different criteria as to what contexts should be included in such a combination. In spite of these differences, both RIPPER and sleeping-experts perform extremely well across a wide variety of categorization problems, generally outperforming previously applied learning methods. We view this result as a confirmation of the usefulness of classifiers that represent contextual information.
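The notion that a word's context decides whether its presence counts toward a classification can be illustrated with a minimal sketch. It assumes a hand-written rule set shaped like a rule learner's output, that is, a disjunction of word conjunctions. The rules, the example words, and the function names `tokenize` and `classify` are hypothetical; nothing here is learned from data or taken from the article's implementation.

```python
# Hypothetical rule set: each rule is a set of words that must all be present.
# A document is labeled positive if at least one rule fires.
RULES = [
    {"wheat", "farm"},
    {"wheat", "tonnes"},
]


def tokenize(text):
    """Lowercase the text and split on whitespace into a set of words."""
    return set(text.lower().split())


def classify(text, rules=RULES):
    """Return True if every word of some rule occurs in the text."""
    words = tokenize(text)
    return any(rule <= words for rule in rules)


print(classify("wheat farm output rose"))  # True: "wheat" appears with "farm"
print(classify("wheat beer festival"))     # False: "wheat" lacks a required context
```

In this sketch the word "wheat" contributes to a positive label only when it co-occurs with "farm" or "tonnes", which is one simple way for context to affect how, or whether, a word matters.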
This publication has 12 references indexed in Scilit:
- Stemming algorithms: A case study for detailed evaluation. Journal of the American Society for Information Science, 1996
- Poisson mixtures. Natural Language Engineering, 1995
- A desicion-theoretic generalization of on-line learning and an application to boosting. Lecture Notes in Computer Science, 1995
- Automated learning of decision rules for text categorization. ACM Transactions on Information Systems, 1994
- The Weighted Majority Algorithm. Information and Computation, 1994
- The Effect of Adding Relevance Information in a Relevance Feedback Environment. Published by Springer Nature, 1994
- Developments in Automatic Text Retrieval. Science, 1991
- Boolean Feature Discovery in Empirical Learning. Machine Learning, 1990
- Editorial: Advice to Machine Learning Authors. Machine Learning, 1990
- Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm. Machine Learning, 1988