Flexible and efficient XML search with complex full-text predicates
- 27 June 2006
- proceedings article
- Published by Association for Computing Machinery (ACM)
- p. 575-586
- https://doi.org/10.1145/1142473.1142537
Abstract
Recently, there has been extensive research that generated a wealth of new XML full-text query languages, ranging from simple Boolean search to combining sophisticated proximity and order predicates on keywords. While computing least common ancestors of query terms was proposed for efficient evaluation of conjunctive keyword queries by exploiting the document structure, no such solution was developed to evaluate complex full-text queries. We present efficient evaluation algorithms based on a formalization of XML queries in terms of keyword patterns and an algebra which manipulates pattern matches. Our algebra captures most existing languages and their varying semantics, and our algorithms combine relational query evaluation techniques with the exploitation of document structure to process queries with complex full-text predicates. We show how scoring can be incorporated into our framework without compromising the algorithms' complexity. Our experiments show that considering element nesting dramatically improves the performance of queries with complex full-text predicates.
This publication has 13 references indexed in Scilit:
- Efficient keyword search for smallest LCAs in XML databases. Published by Association for Computing Machinery (ACM), 2005
- An Algebra for Structured Queries in Bayesian Networks. Lecture Notes in Computer Science, 2005
- TOSS. Published by Association for Computing Machinery (ACM), 2004
- Searching XML documents via XML fragments. Published by Association for Computing Machinery (ACM), 2003
- XRANK. Published by Association for Computing Machinery (ACM), 2003
- A System for Keyword Proximity Search on XML Databases. Published by Elsevier, 2003
- Algebras for Querying Text Regions: Expressive Power and Optimization. Journal of Computer and System Sciences, 1998
- A probabilistic relational algebra for the integration of information retrieval and database systems. ACM Transactions on Information Systems, 1997
- Fast evaluation of structured queries for information retrieval. Published by Association for Computing Machinery (ACM), 1995
- An Algebra for Structured Text Search and a Framework for its Implementation. The Computer Journal, 1995