Monitoring distributed systems
- 1 March 1987
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Computer Systems
- Vol. 5 (2), 121-150
- https://doi.org/10.1145/13677.22723
Abstract
The monitoring of distributed systems involves the collection, interpretation, and display of information concerning the interactions among concurrently executing processes. This information and its display can support the debugging, testing, performance evaluation, and dynamic documentation of distributed systems. General problems associated with monitoring are outlined in this paper, and the architecture of a general purpose, extensible, distributed monitoring system is presented. Three approaches to the display of process interactions are described: textual traces, animated graphical traces, and a combination of aspects of the textual and graphical approaches. The roles that each of these approaches fulfill in monitoring and debugging distributed systems are identified and compared. Monitoring tools for collecting communication statistics, detecting deadlock, controlling the non-deterministic execution of distributed systems, and for using protocol specifications in monitoring are also described. Our discussion is based on experience in the development and use of a monitoring system within a distributed programming environment called Jade. Jade was developed within the Computer Science Department of the University of Calgary and is now being used to support teaching and research at a number of university and research organizations.
This publication has 9 references indexed in Scilit:
- Graphical views of parallel programs. ACM SIGSOFT Software Engineering Notes, 1986
- Multibug: Interactive Debugging in Distributed Systems. IEEE Micro, 1986
- Distributed process groups in the V Kernel. ACM Transactions on Computer Systems, 1985
- Techniques for Algorithm Animation. IEEE Software, 1985
- INCENSE. ACM SIGGRAPH Computer Graphics, 1983
- Development of a debugger for a concurrent language. ACM SIGSOFT Software Engineering Notes, 1983
- The Interlisp Programming Environment. Computer, 1981
- Thoth, a portable real-time operating system. Communications of the ACM, 1979
- Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 1978