Parallelizable restarted iterative methods for nonsymmetric linear systems. II: parallel implementation

Abstract
In Part I we defined several new algorithms that give the same iterates as the standard GMRES algorithm but require less computational expense when implemented on scalar and vector computers. In this paper, we demonstrate that the new algorithms are even more economical than the standard algorithms when applied to several other computer architectures: machines with distributed memory, machines that use local cache memory, and machines on which disk memory or solid-state disk memory is used to solve the given problem. The increased efficiency is due to a reduction in the number of memory transfers required by the algorithms. We describe how the algorithms are implemented to obtain the improved performance, and we give numerical comparisons of the algorithms on several representative computers.
