Impact of Hierarchical Memory Systems On Linear Algebra Algorithm Design
- 1 March 1988
- journal article
- research article
- Published by SAGE Publications in The International Journal of Supercomputing Applications
- Vol. 2 (1), 12-48
- https://doi.org/10.1177/109434208800200103
Abstract
Linear algebra algorithms based on the BLAS or extended BLAS do not achieve high performance on multivector processors with a hierarchical memory system because of a lack of data locality. For such machines, block linear algebra algorithms must be implemented in terms of matrix-matrix primitives (BLAS3). Designing efficient linear algebra algorithms for these architectures requires analysis of the behavior of the matrix-matrix primitives and the resulting block algorithms as a function of certain system parameters. The analysis must identify the limits of performance improvement possible via blocking and any contradictory trends that require trade-off consideration. We propose a methodology that facilitates such an analysis and use it to analyze the performance of the BLAS3 primitives used in block methods. A similar analysis of the block size-performance relationship is also performed at the algorithm level for block versions of the LU decomposition and the Gram-Schmidt orthogonalization procedures.