TL;DR: The purpose of the PAPI project is to specify a standard application programming interface for accessing hardware performance counters available on most modern microprocessors, which exist as a small set of registers that count events.
Abstract: The purpose of the PAPI project is to specify a standard application programming interface (API) for accessing hardware performance counters available on most modern microprocessors. These counters exist as a small set of registers that count events, which are occurrences of specific signals and states related to the processor's function. Monitoring these events facilitates correlation between the structure of source/object code and the efficiency of the mapping of that code to the underlying architecture. This correlation has a variety of uses in performance analysis, including hand tuning, compiler optimization, debugging, benchmarking, monitoring, and performance modeling. In addition, it is hoped that this information will prove useful in the development of new compilation technology as well as in steering architectural development toward alleviating commonly occurring bottlenecks in high performance computing.
TL;DR: In this paper, a variety of techniques are described to obtain program execution information in connection with an executing application including instrumentation techniques and use of a debugger interface to obtain profiling and other execution information.
Abstract: Techniques for gathering execution information about an application, such as a distributed application, are described. Key communication points in cross execution context calls, such as remote procedure calls, are determined and control is transferred to instrumentation routines to insert and extract execution information. Outgoing remote procedure calls are intercepted on a client that inserts call origin information into the request sent to a server system. Messages received by a server are intercepted. The server system extracts the call origin information and additionally inserts other information in a response sent to the client system upon completion of a remote procedure call. In turn, the client system intercepts the response and extracts other performance information. On each client and server system, information is gathered by a reader and forwarded to a local collector. This information may be further forwarded to and correlated by a client collector from one or more remote server collectors in accordance with processes of each distributed application. Various statistics for a distributed application may be determined in addition to per process statistics. These include wire time, code coverage as related to the distributed application, remote procedure call tracing, and performance profiling. A variety of techniques are described to obtain program execution information in connection with an executing application including instrumentation techniques and use of a debugger interface to obtain profiling and other execution information. All of the program execution data may be collected and correlated at one or more particular points using other techniques described to represent coordinated application monitoring.
TL;DR: In this paper, techniques and systems for analysis, diagnosis and debugging fabricated hardware designs at a Hardware Description Language (HDL) level are described, where the hardware designs have been designed in HDL and have been fabricated in integrated circuit products with limited input/output pins.
Abstract: Techniques and systems for analysis, diagnosis and debugging fabricated hardware designs at a Hardware Description Language (HDL) level are described. Although the hardware designs (which were designed in HDL) have been fabricated in integrated circuit products with limited input/output pins, the techniques and systems enable the hardware designs within the integrated circuit products to be comprehensively analyzed, diagnosed, and debugged at the HDL level at speed. The ability to debug hardware designs at the HDL level facilitates correction or adjustment of the HDL description of the hardware designs.
TL;DR: In this paper, a shape-analysis algorithm statically analyzes a program to determine information about the heap-allocated data structures that the program manipulates, which can be used to understand or verify programs.
Abstract: A shape-analysis algorithm statically analyzes a program to determine information about the heap-allocated data structures that the program manipulates. The results can be used to understand or verify programs. They also contain information valuable for debugging, compile-time garbage collection, instruction scheduling, and parallelization.
TL;DR: Metaglue is described, an extension to the Java programming language for building software agent systems for controlling Intelligent Environments that has been specifically designed to address these needs.
Abstract: Intelligent Environments (IEs) have specific computational properties that generally distinguish them from other computational systems. They have large numbers of hardware and software components that need to be interconnected. Their infrastructures tend to be highly distributed, reflecting both the distributed nature of the real world and the IEs’ need for large amounts of computational power. They also tend to be highly dynamic and require reconfiguration and resource management on the fly as their components and inhabitants change, and as they adjust their operation to suit the learned preferences of their users. Because IEs generally have multimodal interfaces, they also usually have high degrees of parallelism for resolving multiple, simultaneous events. Finally, debugging IEs presents unique challenges to their creators, not only because of their distributed parallelism, but also because of the difficulty of pinning down their “state” in a formal computational sense. This paper describes Metaglue, an extension to the Java programming language for building software agent systems for controlling Intelligent Environments that has been specifically designed to address these needs. Metaglue has been developed as part of the MIT Artificial Intelligence Lab’s Intelligent Room Project, which has spent the past four years designing Intelligent Environments for research in Human-Computer Interaction.
TL;DR: This work develops a framework suitable for diagnosing configuration knowledge bases by developing an algorithm based on conflicts that is general enough for its adaptation to diagnosing customer requirements to identify unachievable conditions during configuration sessions.
Abstract: Configuration problems are a thriving application area for declarative knowledge representation that currently experiences a constant increase in size and complexity of knowledge bases. Automated support of the debugging of such knowledge bases is a necessary prerequisite for effective development of configurators. We show that this task can be achieved by consistency-based diagnosis techniques. Based on the formal definition of consistency-based configuration we develop a framework suitable for diagnosing configuration knowledge bases. During the test phase of configurators, valid and invalid examples are used to test the correctness of the system. In case such examples lead to unintended results, debugging of the knowledge base is initiated. Starting from a clear definition of diagnosis in the configuration domain we develop an algorithm based on conflicts. Our framework is general enough for its adaptation to diagnosing customer requirements to identify unachievable conditions during configuration sessions.
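The conflict-based idea in this abstract can be illustrated with minimal hitting sets: if each conflict is a set of knowledge-base constraints that cannot all be correct together, a diagnosis is a minimal set of constraints that intersects every conflict. The following is a brute-force sketch under that assumption, not the authors' algorithm; the function name and the constraint labels are hypothetical.

```python
from itertools import combinations


def minimal_diagnoses(conflicts):
    """Return all minimal hitting sets of the given conflict sets.

    Each conflict is a set of constraint names (hypothetical labels).
    Candidates are enumerated by increasing size, so any candidate that
    contains an already-found diagnosis is not minimal and is skipped.
    """
    components = sorted(set().union(*conflicts)) if conflicts else []
    diagnoses = []
    for size in range(len(components) + 1):
        for candidate in combinations(components, size):
            cand = set(candidate)
            if any(found <= cand for found in diagnoses):
                continue  # a smaller diagnosis already covers this one
            if all(cand & conflict for conflict in conflicts):
                diagnoses.append(cand)
    return diagnoses


if __name__ == "__main__":
    # Two conflicts that share the constraint "c2".
    print(minimal_diagnoses([{"c1", "c2"}, {"c2", "c3"}]))
```

Exhaustive enumeration is exponential in the number of constraints; it is used here only because it makes the definition of a diagnosis easy to see.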
TL;DR: In this paper, a system, method and database development tools are disclosed for automatically generating the complete dependency graph for use in debugging stored code objects in a database, by using a recursive dependency tracking algorithm, which takes into consideration the indirect dependencies on triggers as well as the dependencies on implementations of object oriented code objects which are represented as separate objects in the database catalog.
Abstract: A system, method and database development tools are disclosed for automatically generating the complete dependency graph (12) for use in debugging stored code objects (11) in a database, by using a recursive dependency tracking algorithm (14) which takes into consideration the indirect dependencies (15) on triggers as well as the dependencies on implementations of object oriented code objects which are represented as separate objects in the database catalog.
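A recursive dependency tracker of the kind described can be sketched as a depth-first traversal that follows both direct references and indirect edges from tables to the triggers defined on them. The data layout below (two dictionaries standing in for catalog queries) and all object names are assumptions made for illustration, not the patented method.

```python
def dependency_graph(root, direct_deps, triggers_on=None):
    """Collect every object reachable from `root`.

    direct_deps: maps an object to the objects it references directly
                 (including, e.g., a specification to its implementation).
    triggers_on: maps a table to the triggers that fire on it; these are
                 the indirect dependencies a plain reference scan misses.
    Returns a dict mapping each visited object to its set of dependencies.
    """
    triggers_on = triggers_on or {}
    graph = {}

    def visit(obj):
        if obj in graph:  # already expanded; also guards against cycles
            return
        deps = set(direct_deps.get(obj, ()))
        deps |= set(triggers_on.get(obj, ()))
        graph[obj] = deps
        for dep in sorted(deps):
            visit(dep)

    visit(root)
    return graph


if __name__ == "__main__":
    direct = {
        "proc_a": ["table_t", "pkg_spec"],
        "pkg_spec": ["pkg_body"],       # implementation stored separately
        "trg_audit": ["audit_log"],
    }
    triggers = {"table_t": ["trg_audit"]}
    for name, deps in dependency_graph("proc_a", direct, triggers).items():
        print(name, "->", sorted(deps))
```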
TL;DR: In this paper, a system and method for graphically debugging a computer program is presented, which is capable of displaying a graphical representation of an application program to be debugged, and allows a user to insert debugging tools such as breakpoints directly into the graphical representation.
Abstract: A system and method for graphically debugging a computer program is disclosed (300). In a preferred embodiment, a graphical debugging environment is provided, which is capable of displaying a graphical representation of an application program to be debugged (440). Thereafter, the graphical debugging environment allows a user to insert debugging tools, such as breakpoints (430) directly into the graphical representation of the application program. Thus a user is not required to interact with the textual source code of an application program when debugging it. The graphical debugging environment (300) may display indicators illustrating where debug tools have been inserted within the application program. In a preferred embodiment, the graphical debugging environment (300) allows a user to perform debugging during an application program's runtime. Thus, a user is not required to halt an application program prior to debugging it. Also in a preferred embodiment the graphical debugging environment (300) executing on a local computer may be used to debug an application program residing on a remote computer.
TL;DR: This work describes a methodology, called NetLogger, that enables real-time diagnosis of performance problems in complex distributed systems and combines network, host, and application-level monitoring, providing a complete view of the entire system.
Abstract: Diagnosis and debugging of performance problems on complex distributed systems requires end-to-end performance information at both the application and system level. We describe a methodology, called NetLogger, that enables real-time diagnosis of performance problems in such systems. The methodology includes tools for generating precision event logs, an interface to a system event-monitoring framework, and tools for visualizing the log data and real-time state of the distributed system. Low overhead is an important requirement for such tools; we therefore evaluate the efficiency of the monitoring itself. The approach is novel in that it combines network, host, and application-level monitoring, providing a complete view of the entire system.
TL;DR: This paper discusses the research into algorithms for creating an efficient bidirectional debugger in which all traditional forward movement commands can be performed with equal ease in the reverse direction and expects that adding these backwards movement capabilities to a debugger will greatly increase its efficacy as a programming tool.
Abstract: This paper discusses our research into algorithms for creating an efficient bidirectional debugger in which all traditional forward movement commands can be performed with equal ease in the reverse direction. We expect that adding these backwards movement capabilities to a debugger will greatly increase its efficacy as a programming tool. The efficiency of our methods arises from our use of event counters that are embedded into the program being debugged. These counters are used to precisely identify the desired target event on the fly as the target program executes. This is in contrast to traditional debuggers that may trap back to the debugger many times for some movements. For reverse movements we re-execute the program (possibly using two passes) to identify and stop at the desired earlier point. Our counter based techniques are essential for these reverse movements because they allow us to efficiently execute through the millions of events encountered during re-execution. Two other important components of this debugger are its I/O logging and checkpointing. We log and later replay the results of system calls to ensure deterministic re-execution, and we use checkpointing to bound the amount of re-execution used for reverse movements. Short movements generally appear instantaneous, and the time for longer movements is usually bounded within a small constant factor of the temporal distance moved back.
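The combination of event counters, checkpoints, and re-execution can be sketched in a few lines. The toy below assumes a deterministic program modeled as a pure step function; the class and method names are hypothetical and it is only an illustration of the idea, not the debugger described in the paper.

```python
import copy


class ReplayDebugger:
    """Toy reverse stepping: restore a checkpoint, then re-execute forward."""

    def __init__(self, step_fn, initial_state, checkpoint_every=4):
        self.step_fn = step_fn            # deterministic: state -> new state
        self.state = copy.deepcopy(initial_state)
        self.count = 0                    # event counter
        self.checkpoint_every = checkpoint_every
        self.checkpoints = {0: copy.deepcopy(initial_state)}

    def step(self):
        self.state = self.step_fn(self.state)
        self.count += 1
        if self.count % self.checkpoint_every == 0:
            # Periodic checkpoints bound how much must be re-executed later.
            self.checkpoints.setdefault(self.count, copy.deepcopy(self.state))

    def run_to(self, target):
        while self.count < target:
            self.step()

    def reverse_step(self, n=1):
        # The counter value identifies the target event exactly.
        target = max(0, self.count - n)
        base = max(c for c in self.checkpoints if c <= target)
        self.state = copy.deepcopy(self.checkpoints[base])
        self.count = base
        self.run_to(target)


if __name__ == "__main__":
    dbg = ReplayDebugger(lambda s: {"x": s["x"] * 2 + 1}, {"x": 0})
    dbg.run_to(10)
    dbg.reverse_step(3)
    print(dbg.count, dbg.state)
```

A real program is not a pure function, which is why the abstract pairs this scheme with logging and replaying system call results.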
TL;DR: In this paper, an execution monitoring tool, a method and a computer program product for monitoring execution of an hierarchical visual program are presented. Execution progress reports are sent to an execution monitoring controller, which maps the report data to its own representation of the hierarchical program to determine the current position within an execution program.
Abstract: Provided are an execution monitoring tool, a method and a computer program product for monitoring execution of an hierarchical visual program. Execution progress reports are sent to an execution monitoring controller which maps the report data to its own representation of the hierarchical program to determine the current position within an execution program. The hierarchical structure of the program and the current execution position are displayed during execution on a test system. The execution monitoring controller maintains an hierarchical representation of the visual program's structure, builds an execution stack from the execution progress reports, and compares the received reports with the hierarchical representation to determine a current execution position. The execution reports include the current execution status as well as the position within the execution flow. In an implementation of the invention for debugging a visual message flow, which represents a sequence of message processing operations as a set of nodes and connections between the nodes, the execution progress reports include the content and structure of the message during execution and this is also displayed to the user. A set of debug nodes for generating the execution progress reports are preferably automatically inserted in the message flow before executing it on a test and debugging system, and these debug nodes send execution progress reports to a debug controller.
TL;DR: In this paper, the authors present techniques and systems for debugging an electronic system having instrumentation circuitry included therein, which facilitate analysis, diagnosis and debugging fabricated hardware designs at a HDL level.
Abstract: Techniques and systems for debugging an electronic system having instrumentation circuitry included therein are disclosed. The techniques and systems facilitate analysis, diagnosis and debugging fabricated hardware designs at a Hardware Description Language (HDL) level. Although the hardware designs (which were designed in HDL) have been fabricated in integrated circuit products with limited input/output pins, the invention enables the hardware designs within the integrated circuit products to be comprehensively analyzed, diagnosed, and debugged at the HDL level at speed. The ability to debug hardware designs at the HDL level facilitates correction or adjustment of the HDL description of the hardware designs.
TL;DR: In this article, a debugging environment for a multi-processor simulator or emulator is described, which is ideally suited for the development of embedded software and can contain multiple processor models, with each processor model representing a processor.
Abstract: A debugging environment for a multi-processor simulator or emulator is disclosed. The simulator or emulator is ideally suited for the development of embedded software. The simulator can contain multiple processor models, with each processor model representing a processor. The simulator or emulator also includes a scheduler which controls the execution of the processor models. Each processor also communicates with a debugger via a debug adapter. The debug adapter acts as a pass-through filter for non-control commands which are communicated between a processor and its attached debugger. However, the debug adapter routes control commands to the scheduler. The scheduler ensures that all of the processors and debuggers maintain synchronization. Other modules can also be included in the multi-processor simulation environment, for example, clock gate modules.
TL;DR: In this paper, a method and apparatus for tracing hardware states using dynamically reconfigurable test circuits provides improved debug and troubleshooting capability for functional logic implemented within field programmable logic arrays (FPGAs).
Abstract: A method and apparatus for tracing hardware states using dynamically reconfigurable test circuits provides improved debug and troubleshooting capability for functional logic implemented within field programmable logic arrays (FPGAs). Special test logic configurations may be loaded to enhance the debugging of a system using FPGAs. Registers are used to capture snapshots of internal signals for access by a trace program and a test multiplexer is used to provide real-time output to test pins for use with external test equipment. By retrieving the hardware snapshot information with a trace program running on a system in which the FPGA is used, software and hardware debugging are coordinated, providing a sophisticated model of overall system behavior. Special test circuits are implemented within the test logic configurations to enable detection of various events and errors. Counters are used to capture count values when system processor execution reaches a hardware trace point or when events occur. Comparators are used to detect specific data or address values and event detectors are used to detect particular logic value combinations that occur within the functional logic.
TL;DR: In this paper, a computer system including a microprocessor on an integrated circuit chip comprising an on-chip CPU and a debugging port connected to a communication bus on the integrated circuit and to an external debugging computer device is described.
Abstract: There is disclosed a computer system including a microprocessor on an integrated circuit chip comprising an on-chip CPU and a debugging port connected to a communication bus on the integrated circuit and to an external debugging computer device. The external debugging device is operable to transmit control signals through the debugging port: a) to stop execution by the CPU of instructions obtained from a first on-chip memory; b) to provide from a second memory associated with the external debugging computer device a debugging routine to be executed by the CPU; and c) to restart operation of the CPU after the routine with execution of instructions from an address determined by the external debugging device. The on-chip CPU is operable with code in the first memory which is independent of the debugging routine. A method of operating such a computer system with an external debugging device is also disclosed.
TL;DR: In this article, an execution monitoring tool identifies locations within the message processing program corresponding to a predefined set of execution progress stages, and inserts execution progress report generator components at these locations.
Abstract: Provided are an execution monitoring tool, a method and a computer program product for monitoring a message processing program or system. The execution monitoring tool identifies locations within the message processing program corresponding to a predefined set of execution progress stages, and inserts execution progress report generator components at these locations. Execution progress reports (including a representation of the message contents and structure) are then sent to the execution monitoring controller which maps the report data to its own representation of the program to determine the current position within an execution program. The message contents and structure, as well as the structure of the program and the current execution position, are displayed during execution on a test system. The execution reports include the current execution status as well as the position within the execution flow. The invention is advantageous for debugging a visual message flow, which represents a sequence of message processing operations as a set of nodes and connections between the nodes. A set of debug nodes for generating the execution progress reports are automatically inserted in the message flow before executing it on a test and debugging system, and these debug nodes send execution progress reports to a debug controller.
TL;DR: In this article, a processor core for transitioning a debugging unit between a plurality of operating states in response to an instruction stream is disclosed, where the processor core generates trace data as it processes operating signals of the instruction stream.
Abstract: A processor core for transitioning a debugging unit between a plurality of operating states in response to an instruction stream is disclosed. The processor core generates trace data as it processes operating signals of the instruction stream. The processor core provides a first trigger event signal to the debugging unit in response to a first trigger instruction signal within the instruction stream that is representative of a triggering instruction to transition the debugging unit to a base operating state. The processor core provides a second trigger event signal to the debugging unit in response to a second trigger instruction signal within the instruction stream that is representative of a triggering instruction to dynamically store trace data within the memory component of the debugging unit. The processor core provides a third trigger event signal to the debugging unit in response to a third trigger instruction signal within the instruction stream that is representative of a triggering instruction to statically store trace data within the memory component of the debugging unit. Concurrently or alternatively, the processor core can provide one or more of the trigger event signals to the debugging unit as a function of a generated trigger data in response to additional operational instructions within the instruction stream.
TL;DR: By the online recording of significant system events, and then deterministically replaying them off-line, a novel software-based approach is presented for the cyclic debugging of distributed real-time systems that can inspect a real-time system in great detail, while still preserving its real-time behaviour.
Abstract: Cyclic debugging is one of the most important and most commonly used activities in program development. During cyclic debugging, a program is repeatedly re-executed to track down errors when a failure has been observed. This process necessitates reproducible program executions. Applying classical debugging techniques, such as using breakpoints or single stepping, in real-time systems changes the temporal behaviour and makes reproduction of the observed failure during debugging less likely, if not impossible. Consequently, these techniques are not directly applicable to the cyclic debugging of real-time systems. In this paper, we present a novel software-based approach for the cyclic debugging of distributed real-time systems. By the online recording of significant system events, and then deterministically replaying them off-line, we can inspect a real-time system in great detail, while still preserving its real-time behaviour.
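The record-then-replay principle can be shown with a minimal sketch: nondeterministic inputs are logged during the live run, and an off-line run consumes the log instead of the real source, so the same execution can be repeated as often as needed. This example covers input values only and does not model timing or task scheduling, which a real-time system would also have to record; all names are hypothetical.

```python
import random


class Recorder:
    """Wraps a live input source and logs every value it delivers."""

    def __init__(self, source):
        self.source = source
        self.log = []

    def read(self):
        value = self.source()
        self.log.append(value)
        return value


class Replayer:
    """Delivers previously logged values in their original order."""

    def __init__(self, log):
        self._events = iter(log)

    def read(self):
        return next(self._events)


def control_task(io, cycles):
    """Stand-in for application code whose behaviour depends on its inputs."""
    total = 0
    trace = []
    for _ in range(cycles):
        sample = io.read()
        total += sample if sample % 2 else -sample
        trace.append(total)
    return trace


if __name__ == "__main__":
    recorder = Recorder(lambda: random.randint(0, 100))
    live = control_task(recorder, 20)
    replayed = control_task(Replayer(recorder.log), 20)
    print(live == replayed)
```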
TL;DR: In this paper, a receiver/decoder for testing an application, for example, for an analogue or a digital television system, is disclosed, the receiver and decoder comprising means for exchanging messages with a network, and means for running the application in dependence on a message received from the network.
Abstract: A receiver/decoder for testing an application, for example, for an analogue or a digital television system, is disclosed, the receiver/decoder comprising means for exchanging messages with a network, and means for running the application in dependence on a message received from the network. The receiver/decoder may be used for debugging the application. An associated workstation, and an application development tool for editing and testing applications, are also disclosed. Also disclosed is a method of transferring an application from a workstation to a receiver/decoder, and methods of running an application on a receiver/decoder.
TL;DR: The newly defined standard for embedded system debugging, the IEEE-ISTO Nexus 5001 Forum Standard for a Global Embedded Debug Interface, is introduced and is related to the test and debugging requirements of development engineers.
Abstract: The increased clock frequencies and higher integration levels of today's high-performance embedded microcontrollers have led to the widespread incorporation of on-chip debugging logic into new microcontroller chip designs. The newly defined standard for embedded system debugging, the IEEE-ISTO Nexus 5001 Forum Standard for a Global Embedded Debug Interface, is introduced and is related to the test and debugging requirements of development engineers.
TL;DR: The primary purpose of the tool is to provide an environment that mimics some of the failure modes of a real lab, which aids the student in learning debugging techniques.
Abstract: We present the rationale, implementation and performance features of a virtual lab environment for an electronic circuits course. The primary purpose of the tool is to provide an environment that mimics some of the failure modes of a real lab, which aids the student in learning debugging techniques. The tool is implemented as a Java application.
TL;DR: For a distributed and transparent debugging environment, a controller and an executor, each of which accepts operations from a user, are disposed on each of the computers in the system.
Abstract: For a distributed and transparent debugging environment, a controller and an executor, each of which accepts operations from a user, are disposed on each of plural computers. The controller and the executor each has a setting-status manager for managing the setting of a debugger and an execution-status manager for managing the execution status of a debugger. In response to a status change, the controller and the executor each notifies other computers of the content of the change via the network. A process manager manages a debug object program according to the content of setting and sets an operation status change due to detection of a break point to an execution-status manager. Meanwhile, the process manager changes its operation in response to a change in execution status sent from another computer.
TL;DR: A system and method for accelerated reliability testing of computer system software components over prolonged periods of time provides a uniform, extensible, reporting framework that includes a plurality of reporting clients and at least one controller.
Abstract: A system and method for accelerated reliability testing of computer system software components over prolonged periods of time provides a uniform, extensible, reporting framework that includes a plurality of reporting clients and at least one controller. The system and method are adaptable for operation over a dedicated intranet as well as the Internet. It provides for tracking the reliability of system components and logs failures of varying severity that may be expected to occur over time. This data is useful, among other things, for estimating mean time between failures for software being tested and expected support costs. This information is particularly useful in providing a reliability measure where multiple independently developed software modules are expected to function together. The testing includes random scheduling of tasks and sleep intervals reflecting expected usage patterns, but at a faster pace to efficiently sample the state space to detect sequences of operations that are likely to result in failures in actual use. The method and system include using pseudo-random numbers to schedule the tasks and provide for storage of random numbers to facilitate reproducing failures, for instance for debugging efforts.
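Storing the pseudo-random seed is what makes a randomly scheduled failure reproducible. The sketch below assumes tasks are plain callables and that sleep intervals are recorded as abstract ticks rather than actually slept; the function name, its parameters, and the report layout are hypothetical and not taken from the patent.

```python
import random


def run_stress(tasks, seed, steps, max_sleep=5):
    """Run randomly chosen tasks; return a report that includes the seed.

    tasks: dict mapping a task name to a zero-argument callable.
    Re-running with the same seed and fresh tasks replays the same schedule.
    """
    rng = random.Random(seed)  # private generator, independent of global state
    names = sorted(tasks)      # fixed order so choices depend only on the seed
    schedule = []
    for _ in range(steps):
        name = rng.choice(names)
        sleep_ticks = rng.randint(0, max_sleep)  # recorded, not slept
        schedule.append((name, sleep_ticks))
        try:
            tasks[name]()
        except Exception as exc:
            return {"seed": seed, "failed_at": len(schedule) - 1,
                    "error": repr(exc), "schedule": schedule}
    return {"seed": seed, "failed_at": None, "error": None,
            "schedule": schedule}


def make_tasks():
    """Hypothetical component that fails on its fourth write."""
    writes = {"n": 0}

    def write():
        writes["n"] += 1
        if writes["n"] > 3:
            raise RuntimeError("buffer overflow")

    return {"open": lambda: None, "write": write, "close": lambda: None}


if __name__ == "__main__":
    first = run_stress(make_tasks(), seed=42, steps=50)
    second = run_stress(make_tasks(), seed=first["seed"], steps=50)
    print(first["failed_at"], second["failed_at"])
```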
TL;DR: An emulator environment based on an FPGA prototyping board is described, which is used to verify the functionality of a multimedia processor and to implement its cycle level simulator.
Abstract: We describe an emulator environment based on an FPGA prototyping board. This emulator environment is for functional verification of a multimedia processor we are developing and for software development and debugging of its application programs. For these purposes, the emulator environment includes a debugging network and provides virtual wires and some utilities, board control functions, and a virtual FPGA board. With this environment we verify the functionality of a multimedia processor and implement its cycle level simulator.
TL;DR: In this article, the authors present a step region, which is delimited by an entry point and an exit point and may include any number of step elements (e.g., instructions, statement, line numbers, etc.).
Abstract: The preferred embodiment of the present invention generally provides a method, apparatus and article of manufacture for debugging computer programs. Debugging computer programs is aided by establishing a step region. The step region is delimited by an entry point and an exit point and may include any number of step elements (e.g., instructions, statement, line numbers, etc.). When a step region is entered in response to a command, the code contained in the region is executed until the end of the region. Execution is halted for inspection of the region by a user. The executed step region can be formatted (e.g., by highlighting, bolding, italicizing, shading and the like) to identify executed instructions.
TL;DR: It is described how MPD enables much faster startup and better runtime management of MPICH jobs, and how close control of stdio can support the easy implementation of a number of convenient system utilities, even a parallel debugger.
Abstract: We present a process management system for parallel programs such as those written using MPI. A primary goal of the system, which we call MPD (for multipurpose daemon), is to be scalable. By this we mean that startup of interactive parallel jobs comprising a thousand processes is quick, that signals can be quickly delivered to processes, and that stdin, stdout, and stderr are managed intuitively. Our primary target is parallel machines made up of clusters of SMPs, but the system is also useful in more tightly integrated environments. We describe how MPD enables much faster startup and better runtime management of MPICH jobs. We show how close control of stdio can support the easy implementation of a number of convenient system utilities, even a parallel debugger. MPD is implemented and freely distributed with MPICH.
TL;DR: Custom computing machines offer unique opportunities for verification; this article reviews these techniques and introduces JHDL, a design tool that exploits these and other verification aids.
Abstract: Custom computing machines offer unique opportunities for verification. This article reviews these techniques and introduces JHDL, a design tool that exploits these and other verification aids.
TL;DR: This paper presents an automated control synthesis technique that relies on analysis of a system model, defined as a set of interacting components, each represented as a form of condition system Petri net; control logic modules synthesized from these individual models interact to drive the system through specified control goals.
Abstract: Automated control synthesis methods for discrete-event systems promise to reduce the time required to develop, debug, and modify control software. Such methods must be able to translate high-level control goals into detailed sequences of actuation and sensing signals. In this paper, we present such a technique. It relies on analysis of a system model, defined as a set of interacting components, each represented as a form of condition system Petri net. Control logic modules, called taskblocks, are synthesized from these individual models. These then interact hierarchically and sequentially to drive the system through specified control goals. The resulting controller is automatically converted to executable control code. The paper concludes with a discussion of a set of software tools developed to demonstrate the techniques on a small manufacturing system.
TL;DR: This paper presents the design of the Memory Instrumentation and Emulation System (MemorIES), a hardware-based emulation tool that can be used to aid memory system designers and observes that previous studies of SPLASH2 applications using scaled application sizes can result in optimistic miss rates relative to real sizes on real machines, providing potentially misleading data when used for design evaluation.
Abstract: Modern system design often requires multiple levels of simulation for design validation and performance debugging. However, while machines have gotten faster, and simulators have become more detailed, simulation speeds have not tracked machine speeds. As a result, it is difficult to simulate realistic problem sizes and hardware configurations for a target machine. Instead, researchers have focussed on developing scaling methodologies and running smaller problem sizes and configurations that attempt to represent the behavior of the real problem. Given the increasing size of problems today, it is unclear whether such an approach yields accurate results. Moreover, although commercial workloads are prevalent and important in today's marketplace, many simulation tools are unable to adequately profile such applications, let alone for realistic sizes. In this paper we present a hardware-based emulation tool that can be used to aid memory system designers. Our focus is on the memory system because the ever-widening gap between processor and memory speeds means that optimizing the memory subsystem is critical for performance. We present the design of the Memory Instrumentation and Emulation System (MemorIES). MemorIES is a programmable tool designed using FPGAs and SDRAMs. It plugs into an SMP bus to perform on-line emulation of several cache configurations, structures and protocols while the system is running real-life workloads in real-time, without any slowdown in application execution speed. We demonstrate its usefulness in several case studies, and find several important results. First, using traces to perform system evaluation can lead to incorrect results (off by 100% or more in some cases) if the trace size is not sufficiently large. Second, MemorIES is able to detect performance problems by profiling miss behavior over the entire course of a run, rather than relying on a small interval of time. Finally, we observe that previous studies of SPLASH2 applications using scaled application sizes can result in optimistic miss rates relative to real sizes on real machines, providing potentially misleading data when used for design evaluation.