Base Operating System Provisioning and Bringup for a Commercial Supercomputer
- 1 January 2007
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 15302075, p. 1-7
- https://doi.org/10.1109/ipdps.2007.370630
Abstract
Commercial scale-out is a new research project at IBM Research. Its main goal is to investigate and develop technologies for the use of large-scale parallelism in commercial applications, eventually leading to a commercial supercomputer. The project leverages and explores the features of IBM's BladeCenter family of products. A significant challenge in using a large cluster of servers is the installation and provisioning of the base operating system in those servers. Compounding this problem is the issue of maintenance of the software image in each server after its provisioning. This paper describes the system we developed to manage the installation, provisioning, and maintenance process for a cluster of blades, providing a base level of functionality to be used by higher-level management tools. The system leverages the management facilitation features of BladeCenter, and exploits the network and storage architecture of the commercial scale-out prototype cluster. It uses a single shared root filesystem image to reduce management complexity, and completely automates the process of bringing a new blade into the cluster upon its insertion into a BladeCenter chassis.
This publication has 5 references indexed in Scilit:
- How to build a fast and reliable 1024 node cluster with only one disk. The Journal of Supercomputing, 2006
- BladeCenter system overview. IBM Journal of Research and Development, 2005
- BladeCenter chassis management. IBM Journal of Research and Development, 2005
- System management in the BlueGene/L supercomputer. Published by Institute of Electrical and Electronics Engineers (IEEE), 2004
- Xen and the art of virtualization. Published by Association for Computing Machinery (ACM), 2003