Cost-based query scrambling for initial delays

1 June 1998

proceedings article
Published by Association for Computing Machinery (ACM)

Vol. 27 (2), 130-141
https://doi.org/10.1145/276304.276317

Abstract

Remote data access from disparate sources across a wide-area network such as the Internet is problematic due to the unpredictable nature of the communications medium and the lack of knowledge about the load and potential delays at remote sites. Traditional static query processing approaches break down in this environment because they are unable to adapt in response to unexpected delays. Query scrambling has been proposed to address this problem. Scrambling modifies query execution plans on the fly when delays are encountered at runtime. In its original formulation, scrambling was based on simple heuristics, which, although they provided good performance in many cases, were also shown to be susceptible to problems resulting from bad scrambling decisions. In this paper, we address these shortcomings by investigating ways to exploit query optimization technology to aid in making intelligent scrambling choices. We propose three different approaches to using query optimization for scrambling. These approaches vary, for example, in whether they optimize for total work or response time, and whether they construct partial or complete alternative plans. Using a two-phase randomized query optimizer, a distributed query processing simulator, and a workload derived from queries of the TPC-D benchmark, we evaluate these different approaches and compare their ability to cope with initial delays in accessing remote sources. The results show that cost-based scrambling can effectively hide initial delays, but that in the absence of good predictions of expected delay durations, there are fundamental tradeoffs between risk aversion and effectiveness.
