Waiting algorithms for synchronization in large-scale multiprocessors

1 August 1993

journal article
Published by Association for Computing Machinery (ACM) in ACM Transactions on Computer Systems

Vol. 11 (3), 253-294
https://doi.org/10.1145/152864.152869

Abstract

Through analysis and experiments, this paper investigates two-phase waiting algorithms to minimize the cost of waiting for synchronization in large-scale multiprocessors. In a two-phase algorithm, a thread first waits by polling a synchronization variable. If the cost of polling reaches a limit L _poll and further waiting is necessary, the thread is blocked, incurring an additional fixed cost, B . The choice of L _poll is a critical determinant of the performance of two-phase algorithms. We focus on methods for statically determining L _poll because the run-time overhead of dynamically determining L _poll can be comparable to the cost of blocking in large-scale multiprocessor systems with lightweight threads. Our experiments show that always-block ( L _poll = 0) is a good waiting algorithm with performance that is usually close to the best of the algorithms compared. We show that even better performance can be achieved with a static choice of L _poll based on knowledge of likely wait-time distributions. Motivated by the observation that different synchronization types exhibit different wait-time distributions, we prove that a static choice of L _poll can yield close to optimal on-line performance against an adversary that is restricted to choosing wait times from a fixed family of probability distributions. This result allows us to make an optimal static choice of L _poll based on synchronization type. For exponentially distributed wait times, we prove that setting L _poll = 1n(e-1) B results in a waiting cost that is no more than e/(e-1) times the cost of an optimal off-line algorithm. For uniformly distributed wait times, we prove that setting L _poll =1/2(square root of 5 -1) B results in a waiting cost that is no more than (square root of 5 + 1)/2 (the golden ratio) times the cost of an optimal off-line algorithm. Experimental measurements of several parallel applications on the Alewife multiprocessor simulator corroborate our theoretical findings.

Keywords

This publication has 9 references indexed in Scilit:

Adaptive Backoff Synchronization Techniques
Published by Institute of Electrical and Electronics Engineers (IEEE) ,2005
Lazy task creation: a technique for increasing the granularity of parallel programs
IEEE Transactions on Parallel and Distributed Systems, 1991
Algorithms for scalable synchronization on shared-memory multiprocessors
ACM Transactions on Computer Systems, 1991
M-structures: Extending a parallel, non-strict, functional language with state
Lecture Notes in Computer Science, 1991
Synchronization algorithms for shared-memory multiprocessors
Computer, 1990
The performance of spin lock alternatives for shared-money multiprocessors
IEEE Transactions on Parallel and Distributed Systems, 1990
Characterizing the synchronization behavior of parallel programs
ACM SIGPLAN Notices, 1988
Distributing Hot-Spot Addressing in Large-Scale Multiprocessors
IEEE Transactions on Computers, 1987
MULTILISP: a language for concurrent symbolic computation
ACM Transactions on Programming Languages and Systems, 1985

Cited by 26 articles