STORM: Lightning-Fast Resource Management
- 1 January 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 1225 (10639535), 46
- https://doi.org/10.1109/sc.2002.10057
Abstract
Although workstation clusters are a common platform for high-performance computing (HPC), they remain more difficult to manage than sequential systems or even symmetric multiprocessors. Furthermore, as cluster sizes increase, the quality of the resource-management subsystem (essentially, all of the code that runs on a cluster other than the applications) increasingly impacts application efficiency. In this paper, we present STORM, a resource-management framework designed for scalability and performance. The key innovation behind STORM is a software architecture that enables resource management to exploit low-level network features. As a result of this HPC-application-like design, STORM is orders of magnitude faster than the best reported results in the literature on two sample resource-management functions: job launching and process scheduling.
This publication has 18 references indexed in Scilit:
- Gang scheduling with lightweight user-level communication. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Buffered coscheduling: a new methodology for multitasking parallel jobs on distributed systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- The Quadrics network: high-performance clustering technology. IEEE Micro, 2002
- InfiniBridge: an InfiniBand channel adapter with integrated switch. IEEE Micro, 2002
- BProc. Published by Association for Computing Machinery (ACM), 2002
- Improved resource utilization with buffered coscheduling. Parallel Algorithms and Applications, 2001
- GLUnix: a global layer Unix for a network of workstations. Software: Practice and Experience, 1998
- PM: An operating system coordinated high performance communication library. Published by Springer Nature, 1997
- Packing schemes for gang scheduling. Published by Springer Nature, 1996
- Myrinet: a gigabit-per-second local area network. IEEE Micro, 1995