Dataflow query execution in a parallel main-memory environment

Abstract
The performance and characteristics of the execution of various join-trees on a parallel DBMS are studied. The results are a step in the direction of the design of a query optimization strategy that is fit for parallel execution of complex queries. Among others, synchronization issues are identified to limit the performance gain from parallelism. A new hash-join algorithm, called pipelining hash-join is introduced that has fewer synchronization constraints than the known hash-join algorithms. Also, the behavior of individual join operations in a join-tree is studied in a simulation experiment. The results show that the pipelining hash-join algorithm yields a better performance for multi-join queries. Also, the format of the optimal join-tree appears to depend on the size of the operands of the join. The results from the simulation study are confirmed with an analytic model for dataflow query execution.

This publication has 11 references indexed in Scilit: