Dynamic program instrumentation for scalable performance tools

Abstract
In this paper, we present a new technique called dynamic instrumentation that provides efficient, scalable, yet detailed data collection for large-scale parallel applications. Our approach is unique because it defers inserting any instrumentation until the application is in execution. We can insert or change instrumentation at any time during execution by modifying the application's binary image. Only the instrumentation required for the currently selected analysis or visualization is inserted. As a result, our technique collects several orders of magnitude less data than traditional data collection approaches. We have implemented a prototype of our dynamic instrumentation on the CM-5, and present results for several real applications. In addition, we include recommendations to operating system designers, compiler writers, and computer architects about the features necessary to permit efficient monitoring of large-scale parallel systems.

1. Introduction

Efficient data collection is a critical problem for any system that monitors the performance of a parallel or distributed application. We have estimated that monitoring programs at a reasonable level of detail on current RISC processors can easily generate two megabytes per second per processor of performance data. For a massively parallel computer (say 1000 nodes), this amount of data is impractical to collect for all but the shortest programs. However, to understand the performance of parallel programs, it is necessary to collect data for full-sized data sets running on large numbers of processors. In this paper, we present a new approach to performance instrumentation that defers instrumenting the program until it is in execution, permitting dynamic insertion and alteration of the instrumentation during program execution.

Monitoring the performance of massively parallel programs requires an instrumentation system that is detailed, frugal, and scalable.
It must collect information that is detailed enough to permit the programmer to
