Abstract
The potential speedup of a standard cell global router using a general-purpose multiprocessor is investigated. LocusRoute, a global routing algorithm for standard cells, and its parallel implementation are presented. The uniprocessor speed and quality of LocusRoute are comparable to those of modern global routers: LocusRoute compares favorably with the TimberWolf 5.0 global router and with a maze router that searches the same space more completely. Two successful methods of parallel decomposition of the router are presented. The first, in which multiple wires are routed in parallel, uses the notion of chaotic parallelism to achieve significant performance gains by relaxing data dependencies, at the cost of a minor loss in quality. Iteration and careful assignment of wires to processors reduce this degradation. The approach achieves measured speedups from 5 to 14 using 15 processors. The second parallel decomposition technique evaluates different routes for each wire on separate processors. It achieves speedups of up to 6 using 10 processors. It is demonstrated that when the two approaches are combined, the aggregate speedup is the product of the individual speedups and, with an improved scheduling approach, can be even greater. With a simple model based on these results, speedups of more than 75 using 150 processors are predicted.
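
The product rule stated above can be illustrated with a short arithmetic sketch. This is not the model used in the paper, which the abstract does not describe. It assumes, purely for illustration, that the 150 processors are arranged as 15 wire-level groups of 10 route-level processors each, and it plugs in the measured figures quoted above (with 6 being the best reported route-level speedup). The function names are hypothetical.

```python
def combined_speedup(wire_speedup: float, route_speedup: float) -> float:
    """Product rule from the abstract: the two decompositions multiply."""
    return wire_speedup * route_speedup


def total_processors(wire_procs: int, route_procs: int) -> int:
    # Assumption (not from the paper): every wire-level worker is itself
    # a group of route_procs processors evaluating routes in parallel.
    return wire_procs * route_procs


# Figures quoted in the abstract.
WIRE_PROCS, WIRE_SPEEDUPS = 15, (5.0, 14.0)   # measured range on 15 processors
ROUTE_PROCS, ROUTE_SPEEDUP = 10, 6.0          # "up to 6" on 10 processors

low = combined_speedup(WIRE_SPEEDUPS[0], ROUTE_SPEEDUP)
high = combined_speedup(WIRE_SPEEDUPS[1], ROUTE_SPEEDUP)
procs = total_processors(WIRE_PROCS, ROUTE_PROCS)

print(f"{procs} processors: combined speedup between {low:g} and {high:g}")
```

Under these assumptions, the upper end of the range is consistent with a predicted speedup of more than 75 on 150 processors. The lower end shows how strongly the result depends on the wire-level speedup achieved for a given circuit.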
