Design and Evaluation of an HPVM-Based Windows NT Supercomputer
- 1 August 1999
- journal article
- research article
- Published by SAGE Publications in The International Journal of High Performance Computing Applications
- Vol. 13 (3), 201-219
- https://doi.org/10.1177/109434209901300304
Abstract
We describe the design and evaluation of a 192-processor Windows NT cluster for high performance computing based on the High Performance Virtual Machine (HPVM) communication suite. While other clusters have been described in the literature, building a 58 GFlop/s NT cluster to be used as a general-purpose production machine for NCSA required solving new problems. The HPVM software meets the challenges represented by the large number of processors, the peculiarities of the NT operating system, the need for a production-strength job submission facility, and the requirement for mainstream programming interfaces. First, HPVM provides users with a collection of standard APIs such as MPI, Shmem, and Global Arrays with supercomputer-class performance (13 μs minimum latency, 84 MB/s peak bandwidth for MPI), efficiently delivering Myrinet’s hardware performance to application programs. Second, HPVM provides cluster management and scheduling (through integration with Platform Computing’s LSF). Finally, HPVM addresses Windows NT’s remote access problem, providing convenient remote access and job control (through a graphical Java-applet front-end). Given the production nature of the cluster, the performance characterization is largely based on a sample of the NCSA scientific applications the machine will be running. The side-by-side comparison with other present-generation NCSA supercomputers shows the cluster to be within a factor of 2 to 4 of the SGI Origin 2000 and Cray T3E performance at a fraction of the cost. The inherent scalability of the cluster design produces a comparable or better speedup than the Origin 2000 despite a limitation in the HPVM flow control mechanism.