On supporting containment queries in relational database management systems
- 1 May 2001
- proceedings article
- Published by Association for Computing Machinery (ACM)
- Vol. 30 (2), 425-436
- https://doi.org/10.1145/375663.375722
Abstract
Virtually all proposals for querying XML include a class of query we term “containment queries”. It is also clear that in the foreseeable future, a substantial amount of XML data will be stored in relational database systems. This raises the question of how to support these containment queries. The inverted list technology that underlies much of Information Retrieval is well-suited to these queries, but should we implement this technology (a) in a separate loosely-coupled IR engine, or (b) using the native tables and query execution machinery of the RDBMS? With option (b), more than twenty years of work on RDBMS query optimization, query execution, scalability, and concurrency control and recovery immediately extend to the queries and structures that implement these new operations. But all this will be irrelevant if the performance of option (b) lags that of (a) by too much. In this paper, we explore some performance implications of both options using native implementations in two commercial relational database systems and in a special purpose inverted list engine. Our performance study shows that while RDBMSs are generally poorly suited for such queries, under certain conditions they can outperform an inverted list engine. Our analysis further identifies two significant causes that differentiate the performance of the IR and RDBMS implementations: the join algorithms employed and the hardware cache utilization. Our results suggest that contrary to most expectations, with some modifications, a native implementation in an RDBMS can support this class of query much more efficiently.
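To make option (b) concrete, the following is a minimal sketch of how an inverted list might be stored in ordinary relational tables and how a containment query then becomes a join on position ranges. It is an illustration under stated assumptions, not the schema or SQL used in the paper: the table names `elements` and `texts`, their columns, the helper `elements_containing`, and the toy documents are all hypothetical, and SQLite stands in for the commercial systems the abstract mentions.

```python
import sqlite3

# Hypothetical schema (an assumption, not taken from the paper): one row per
# occurrence. Positions are token offsets within a document; an element
# spans the range from its start tag (begin_pos) to its end tag (end_pos).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE elements (term TEXT, docno INTEGER,
                           begin_pos INTEGER, end_pos INTEGER);
    CREATE TABLE texts (term TEXT, docno INTEGER, wordno INTEGER);
    CREATE INDEX elements_idx ON elements (term, docno, begin_pos);
    CREATE INDEX texts_idx ON texts (term, docno, wordno);
""")

# Toy document 1: <section> <title> xml databases </title> joins </section>
# Toy document 2: <section> joins </section> xml
conn.executemany("INSERT INTO elements VALUES (?, ?, ?, ?)", [
    ("section", 1, 0, 6),
    ("title", 1, 1, 4),
    ("section", 2, 0, 2),
])
conn.executemany("INSERT INTO texts VALUES (?, ?, ?)", [
    ("xml", 1, 2), ("databases", 1, 3), ("joins", 1, 5),
    ("joins", 2, 1), ("xml", 2, 3),
])


def elements_containing(conn, element, word):
    """Return (docno, begin_pos, end_pos) of each `element` that contains `word`.

    Containment is expressed as a range condition: the word's position must
    fall strictly between the element's begin and end positions in the same
    document.
    """
    rows = conn.execute("""
        SELECT DISTINCT e.docno, e.begin_pos, e.end_pos
        FROM elements AS e
        JOIN texts AS t
          ON t.docno = e.docno
         AND t.wordno > e.begin_pos
         AND t.wordno < e.end_pos
        WHERE e.term = ? AND t.term = ?
        ORDER BY e.docno, e.begin_pos
    """, (element, word))
    return rows.fetchall()


if __name__ == "__main__":
    print(elements_containing(conn, "section", "xml"))
    print(elements_containing(conn, "section", "joins"))
    print(elements_containing(conn, "title", "joins"))
```

In this toy data, only the section in document 1 contains "xml", because in document 2 the word falls after the section closes. The join condition mixes an equality on `docno` with inequalities on positions, which is the kind of predicate where the choice of join algorithm, one of the two factors the abstract identifies, is likely to matter.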