Dynamic program instrumentation for scalable performance tools
- 17 December 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
In this paper, we present a new technique called dynamic instrumentation that provides efficient, scalable, yet detailed data collection for large-scale parallel applications. Our approach is unique because it defers inserting any instrumentation until the application is in execution. We can insert or change instrumentation at any time during execution by modifying the application's binary image. Only the instrumentation required for the currently selected analysis or visualization is inserted. As a result, our technique collects several orders of magnitude less data than traditional data collection approaches. We have implemented a prototype of our dynamic instrumentation on the CM-5, and present results for several real applications. In addition, we include recommendations to operating system designers, compiler writers, and computer architects about the features necessary to permit efficient monitoring of large-scale parallel systems.

1. Introduction

Efficient data collection is a critical problem for any system that monitors the performance of a parallel or distributed application. We have estimated that monitoring programs at a reasonable level of detail on current RISC processors can easily generate two megabytes per second per processor of performance data. For a massively parallel computer (say 1000 nodes), this amount of data is impractical to collect for all but the shortest programs. However, to understand the performance of parallel programs, it is necessary to collect data for full-sized data sets running on large numbers of processors. In this paper, we present a new approach to performance instrumentation that defers instrumenting the program until it is in execution, permitting dynamic insertion and alteration of the instrumentation during program execution.

Monitoring the performance of massively parallel programs requires an instrumentation system that is detailed, frugal, and scalable.
It must collect information that is detailed enough to permit the programmer to
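
The idea of inserting and removing instrumentation while a program runs can be illustrated with a small sketch. This is only an analogy, not the paper's mechanism: the paper patches the application's binary image on the CM-5, whereas the sketch below swaps a Python function for a counting wrapper at run time. The names `DynamicProbe`, `App`, `send_message`, and `run_demo` are hypothetical and chosen for illustration.

```python
import functools


class DynamicProbe:
    """Toy stand-in for dynamic instrumentation (an assumption, not the
    paper's implementation): patches a function while the program runs
    and can remove the patch again later."""

    def __init__(self, owner, name):
        self.owner = owner        # class or module that holds the target function
        self.name = name          # attribute name of the target function
        self.calls = 0            # data is collected only while the probe is inserted
        self._original = None     # saved uninstrumented function

    def insert(self):
        """Replace the target with a wrapper that counts calls."""
        if self._original is not None:
            return                # already inserted
        original = getattr(self.owner, self.name)

        @functools.wraps(original)
        def instrumented(*args, **kwargs):
            self.calls += 1
            return original(*args, **kwargs)

        self._original = original
        setattr(self.owner, self.name, instrumented)

    def remove(self):
        """Restore the original, uninstrumented function."""
        if self._original is None:
            return                # nothing to remove
        setattr(self.owner, self.name, self._original)
        self._original = None


class App:
    """Hypothetical application with one routine worth measuring."""

    def send_message(self, payload):
        return len(payload)


def run_demo():
    probe = DynamicProbe(App, "send_message")
    app = App()
    for _ in range(1000):
        app.send_message("x")     # runs uninstrumented: no data collected
    probe.insert()                # instrumentation added mid-execution
    for _ in range(10):
        app.send_message("x")     # only these calls are counted
    probe.remove()                # instrumentation removed again
    for _ in range(1000):
        app.send_message("x")     # back to the uninstrumented routine
    return probe.calls
```

In this sketch the probe records data only during the window in which it is inserted, which mirrors the abstract's point that only the instrumentation needed for the currently selected analysis is present at any time.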