TL;DR: The goal is to design tools that enable modestly-skilled programmers to isolate performance bottlenecks in distributed systems composed of black-box nodes, using two very different algorithms that infer the dominant causal paths through the system from passively collected message-level traces.
Abstract: Many interesting large-scale systems are distributed systems of multiple communicating components. Such systems can be very hard to debug, especially when they exhibit poor performance. The problem becomes much harder when systems are composed of "black-box" components: software from many different (perhaps competing) vendors, usually without source code available. Typical solutions-provider employees are not always skilled or experienced enough to debug these systems efficiently. Our goal is to design tools that enable modestly-skilled programmers (and experts, too) to isolate performance bottlenecks in distributed systems composed of black-box nodes. We approach this problem by obtaining message-level traces of system activity, as passively as possible and without any knowledge of node internals or message semantics. We have developed two very different algorithms for inferring the dominant causal paths through a distributed system from these traces. One uses timing information from RPC messages to infer inter-call causality; the other uses signal-processing techniques. Our algorithms can ascribe delay to specific nodes on specific causal paths. Unlike previous approaches to similar problems, our approach requires no modifications to applications, middleware, or messages.
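As an illustration of the timing-based idea in this abstract, the minimal sketch below assumes each RPC is logged as a (caller, callee, call time, return time) record. It treats one call as a candidate child of another when it is issued by the parent's callee inside the parent's call/return window. The `Rpc` record, the `infer_children` and `self_delay` helpers, and the sample trace are hypothetical, and this simple containment rule is not the paper's actual algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rpc:
    caller: str
    callee: str
    t_call: float    # time the request was observed
    t_return: float  # time the reply was observed


def infer_children(trace):
    """Map each RPC index to the indices of calls that could be nested in it."""
    children = {i: [] for i in range(len(trace))}
    for i, parent in enumerate(trace):
        for j, child in enumerate(trace):
            if i == j:
                continue
            issued_by_callee = child.caller == parent.callee
            inside_window = (parent.t_call <= child.t_call
                             and child.t_return <= parent.t_return)
            if issued_by_callee and inside_window:
                children[i].append(j)
    return children


def self_delay(trace, children, i):
    """Delay ascribed to the callee of RPC i, excluding its candidate children.

    Assumes the children run one after another (no overlap).
    """
    total = trace[i].t_return - trace[i].t_call
    nested = sum(trace[j].t_return - trace[j].t_call for j in children[i])
    return total - nested


# Hypothetical trace: client -> web, and web -> auth, web -> db while serving it.
trace = [
    Rpc("client", "web", 0.0, 10.0),
    Rpc("web", "auth", 1.0, 3.0),
    Rpc("web", "db", 4.0, 9.0),
]
children = infer_children(trace)
web_delay = self_delay(trace, children, 0)
```

With real traces, several parents may be plausible for one child, so a practical tool would have to rank the candidates rather than accept every containment.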
TL;DR: New non-standard reasoning services are designed and implemented to pinpoint logical contradictions arising in the development of the medical terminology DICE, with complete algorithms for unfoldable ALC-TBoxes based on minimisation of axioms using Boolean methods for minimal unsatisfiability-preserving sub-TBoxes.
TL;DR: A practical low-overhead hardware recorder for cache-coherent multiprocessors, called Flight Data Recorder (FDR), which, like an aircraft flight data recorder, continuously records the execution, even on deployed systems, logging it for post-mortem analysis.
Abstract: Debuggers have been proven indispensable in improving software reliability. Unfortunately, on most real-life software, debuggers fail to deliver their most essential feature: a faithful replay of the execution. The reason is non-determinism caused by multithreading and non-repeatable inputs. A common solution to faithful replay has been to record the non-deterministic execution. Existing recorders, however, either work only for datarace-free programs or have prohibitive overhead. As a step towards powerful debugging, we develop a practical low-overhead hardware recorder for cache-coherent multiprocessors, called Flight Data Recorder (FDR). Like an aircraft flight data recorder, FDR continuously records the execution, even on deployed systems, logging the execution for post-mortem analysis. FDR is practical because it piggybacks on the cache coherence hardware and logs nearly the minimal thread-ordering information necessary to faithfully replay the multiprocessor execution. Our studies, based on simulating a four-processor server with commercial workloads, show that when allocated less than 7% of the system's physical memory, our FDR design can capture the last one second of the execution at modest (less than 2%) slowdown.
TL;DR: A runtime assertion checker for the Java Modeling Language (JML) that helps in assigning blame during debugging and in automatic generation of test oracles, which represents a significant advance over the current state of the art.
Abstract: Debugging is made difficult by the need to precisely describe what each piece of the software is supposed to do, and to write code to defend modules against the errors of other modules; if this is not done it is difficult to assign blame to a small part of the program when things go wrong. Similarly, unit testing also needs precise descriptions of behavior, and is made difficult by the need to write test oracles. However, debugging and testing consume a significant fraction of the cost of software development and maintenance efforts. Inadequate debugging and testing also contribute to quality problems. We describe a runtime assertion checker for the Java Modeling Language (JML) that helps in assigning blame during debugging and in automatic generation of test oracles. It represents a significant advance over the current state of the art, because it can deal with very abstract specifications which hide representation details, and other features such as quantifiers, and inheritance of specifications. Yet JML specifications have a syntax that is easily understood by programmers. Thus, JML’s runtime assertion checker has the potential for decreasing the cost of debugging and testing.
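To illustrate how runtime assertion checking can assign blame and double as a test oracle, here is a minimal Python analogue. It is not JML syntax and not the JML checker itself. The `contract` decorator, the two exception classes, and the `isqrt` example are hypothetical names chosen for this sketch. A failed precondition blames the caller, and a failed postcondition blames the implementation.

```python
import functools


class PreconditionError(AssertionError):
    """The caller violated the contract."""


class PostconditionError(AssertionError):
    """The implementation violated the contract."""


def contract(pre, post):
    """Wrap a function with a precondition and a postcondition check."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            if not pre(*args):
                raise PreconditionError(
                    f"caller of {func.__name__} passed {args}")
            result = func(*args)
            # The postcondition acts as an automatically checked test oracle.
            if not post(result, *args):
                raise PostconditionError(
                    f"{func.__name__} returned {result} for {args}")
            return result
        return wrapper
    return decorate


@contract(pre=lambda x: x >= 0,
          post=lambda r, x: r >= 0 and r * r <= x < (r + 1) * (r + 1))
def isqrt(x):
    """Integer square root by linear search (deliberately simple)."""
    r = 0
    while (r + 1) * (r + 1) <= x:
        r += 1
    return r
```

Calling `isqrt(-1)` raises `PreconditionError`, pointing at the call site rather than at the body of `isqrt`.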
TL;DR: An interactive system is a computing system that allows the user to interact with a running program by giving it data or control directions through an input device such as a keyboard or a mouse.
Abstract: An interactive system is a computing system that allows the user to interact with a running program by giving it data or control directions through an input device such as a keyboard or a mouse (q.v.). This mode of operation is in contrast to a batch processing system, which requires that all input be placed in a file that is readied for reading before beginning execution of the program that will process it. The obvious advantage of interactive use is that the user can choose input and control directions based on partial results received from an early phase of program execution, whereas batch processing requires that data be prepared with all eventualities in mind. The difference is most acute when debugging a new program.
TL;DR: In this paper, imperfect debugging is considered in the sense that new faults can be introduced into the software during debugging and the detected faults may not be removed completely.
Abstract: Software reliability growth models (SRGMs) have been developed to estimate software reliability measures such as the number of remaining faults, software failure rate, and software reliability. Issues such as imperfect debugging and the learning phenomenon of developers have been considered in these models. However, most SRGMs assume that faults detected during tests will eventually be removed. Consideration of fault removal efficiency in the existing models is limited. In practice, fault removal efficiency is usually imperfect. This paper aims to incorporate fault removal efficiency into software reliability assessment. Fault removal efficiency is a useful metric in software development practice and it helps developers to evaluate the debugging effectiveness and estimate the additional workload. In this paper, imperfect debugging is considered in the sense that new faults can be introduced into the software during debugging and the detected faults may not be removed completely. A model is proposed to integrate fault removal efficiency, failure rate, and fault introduction rate into software reliability assessment. In addition to traditional reliability measures, the proposed model can provide some useful metrics to help the development team make better decisions. Software testing data collected from real applications are utilized to illustrate the proposed model for both the descriptive and predictive power. The expected number of residual faults and software failure rate are also presented.
TL;DR: Mechanisms for Thread-Level Speculation (TLS) can be reused to boost debugging productivity; ReEnact extends TLS's rollback capabilities to roll back and deterministically re-execute code with races to obtain the race signature.
Abstract: While removing software bugs consumes vast amounts of human time, hardware support for debugging in modern computers remains rudimentary. Fortunately, we show that mechanisms for Thread-Level Speculation (TLS) can be reused to boost debugging productivity. Most notably, TLS's rollback capabilities can be extended to support rolling back recent buggy execution and repeating it as many times as necessary until the bug is fully characterized. These incremental re-executions are deterministic even in multithreaded codes. Importantly, this operation can be done automatically on the fly, and is compatible with production runs. As a specific implementation of a TLS-based debugging framework, we introduce ReEnact. ReEnact targets a particularly hairy class of bugs: data races in multithreaded programs. ReEnact extends the communication monitoring mechanisms in TLS to also detect data races. It extends TLS's rollback capabilities to be able to roll back and deterministically re-execute the code with races to obtain the race signature. Finally, the signature is compared to a library of race patterns and, if a match occurs, the execution may be repaired. Overall, ReEnact successfully detects, characterizes, and often repairs races automatically on the fly. Moreover, it is fully compatible with always-on use in production runs: the slowdown of race-free execution with ReEnact is on average only 5.8%.
TL;DR: This work proposes a model-free satisfiability-based solution to fault diagnosis and logic debugging in digital VLSI design, and shows that satisfiability captures significant problem characteristics and offers different trade-offs.
Abstract: Recent advances in Boolean satisfiability have made it attractive to solve many digital VLSI design problems such as verification and test generation. Fault diagnosis and logic debugging have not been addressed by existing satisfiability-based solutions. We attempt to bridge this gap by proposing a model-free satisfiability-based solution to these problems. The proposed formulation is intuitive and easy to implement. It shows that satisfiability captures significant problem characteristics and it offers different trade-offs. It also provides new opportunities for satisfiability-based diagnosis tools and diagnosis-specific satisfiability algorithms. Theory and experiments validate the claims and demonstrate its potential.
TL;DR: In this paper, a system and method for facilitating the reporting of information regarding a computer software product by way of a dynamically configurable general report client is presented, where a set of report user interface definition files customizes the report interface for reporting information relating to the software product with which it is associated.
Abstract: A system and method is disclosed for facilitating the reporting of information regarding a computer software product by way of a dynamically-configurable general report client. The general report client is used along with a set of report user interface definition files that is specific to each software product for which a report can be prepared. A set of report user interface definition files customizes the report user interface for reporting information relating to the software product with which it is associated. The invention provides for dynamic configurability in that, by entering certain values by way of the report user interface, the user may cause the client to load additional report user interface definition files and present additional user interface child screens accordingly. The invention provides methods by which software developers, software providers and others obtain user feedback for such purposes as beta-testing and debugging.
TL;DR: In this paper, a visual debugger for stylesheets is presented, which allows the user to set breakpoints on the stylesheet, run to and stop at the breakpoints, and single-step through each template rule as the rule is fired.
Abstract: A visual debugger for stylesheets helps a user debug stylesheets by: allowing the user to set breakpoints on the stylesheet; running to, and stopping at, the breakpoints; single stepping through each template rule as the rule is fired (both forward and backward); evaluating the template rule on the fly; showing the relationship between each template rule and the source document that provides the data; supporting stylesheets that call external programs written in Java or JavaScript; supporting stylesheets that include or import other stylesheets; supporting XML documents that use the “?xml-stylesheet” processing instruction to include stylesheets; supporting multiple debug sessions; and allowing the user to edit the stylesheet or source document and then relaunch the debugger.
TL;DR: This paper describes a novel method for debugging formal, temporal specifications that exploits the short program execution traces that program verification tools generate from specification violations and that specification miners extract from programs.
Abstract: Program verification tools (such as model checkers and static analyzers) can find many errors in programs. These tools need formal specifications of correct program behavior, but writing a correct specification is difficult, just as writing a correct program is difficult. Thus, just as we need methods for debugging programs, we need methods for debugging specifications. This paper describes a novel method for debugging formal, temporal specifications. Our method exploits the short program execution traces that program verification tools generate from specification violations and that specification miners extract from programs. Manually examining these traces is a straightforward way to debug a specification, but this method is tedious and error-prone because there may be hundreds or thousands of traces to inspect. Our method uses concept analysis to automatically group the traces into highly similar clusters. By examining clusters instead of individual traces, a person can debug a specification with less work. To test our method, we implemented a tool, Cable, for debugging specifications. We have used Cable to debug specifications produced by Strauss, our specification miner. We found that using Cable to debug these specifications requires, on average, less than one third as many user decisions as debugging by examining all traces requires. In one case, using Cable required only 28 decisions, while debugging by examining all traces required 224.
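The grouping step can be illustrated with a small formal concept analysis sketch. It assumes, purely for illustration, that each trace is described by the set of event names it contains; the attributes Cable actually uses are not given in the abstract. The `concepts` function and the three sample traces are hypothetical. The brute-force enumeration is exponential and only suitable for tiny inputs.

```python
from itertools import combinations


def concepts(context):
    """Enumerate all formal concepts of a small context.

    context maps a trace name to a frozenset of attributes. Each concept is
    a pair (extent, intent): a maximal set of traces together with the full
    set of attributes they all share.
    """
    names = sorted(context)
    all_attrs = frozenset().union(*context.values())
    found = set()
    for k in range(len(names) + 1):
        for group in combinations(names, k):
            # Intent: attributes common to every trace in the group.
            intent = all_attrs
            for name in group:
                intent = intent & context[name]
            # Extent: every trace that has all attributes of the intent.
            extent = frozenset(n for n in names if intent <= context[n])
            found.add((extent, intent))
    return found


# Hypothetical traces, described by the events they contain.
traces = {
    "t1": frozenset({"open", "read", "close"}),
    "t2": frozenset({"open", "write", "close"}),
    "t3": frozenset({"open", "read"}),
}
lattice = concepts(traces)
```

Each concept is one cluster: a reviewer can judge the traces in an extent together because they share every attribute in the intent.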
TL;DR: In this article, a system and method of exposing debugging information in a graphical modeling and execution environment is described. The system lets a user view debugging information in the same window as the graphical view of the model being executed, with debugging data associated with the relevant model components.
Abstract: A system and method of exposing debugging information in a graphical modeling and execution environment is disclosed. The present invention allows a user to view debugging information in the same window as the graphical view of the model being executed. Debugging data is associated with relevant components of the model displayed in the graphical view. A separate execution list view shows the methods called during the execution of the block diagram in the current time step up until the current point in execution. User-set breakpoints and conditional breakpoints may be set in both the model view and the execution list view. Values may be obtained for all of the displayed methods. The debugging tool may be implemented by using it in conjunction with a graphical modeling and execution environment, such as a block diagram environment or state diagram environment.
TL;DR: A methodology that enables the real-time diagnosis of performance problems in complex high-performance distributed systems, called NetLogger, which is designed to be extremely lightweight, and includes a mechanism for reliably collecting monitoring events from multiple distributed locations.
Abstract: Developers and users of high-performance distributed systems often observe performance problems such as unexpectedly low throughput or high latency. Determining the source of the performance problems requires detailed end-to-end instrumentation of all components, including the applications, operating systems, hosts, and networks. In this paper we describe a methodology that enables the real-time diagnosis of performance problems in complex high-performance distributed systems. The methodology includes tools for generating timestamped event logs that can be used to provide detailed end-to-end application and system level monitoring; and tools for visualizing the log data and real-time state of the distributed system. This methodology, called NetLogger, has proven invaluable for diagnosing problems in networks and in distributed systems code. This approach is novel in that it combines network, host, and application-level monitoring, providing a complete view of the entire system. NetLogger is designed to be extremely lightweight, and includes a mechanism for reliably collecting monitoring events from multiple distributed locations.
TL;DR: This paper illustrates the facilities for type debugging of Haskell programs in the Chameleon programming environment with reasoning about constraint satisfiability and implication to find minimal justifications of type errors, and to explain unexpected types that arise.
Abstract: In this paper we illustrate the facilities for type debugging of Haskell programs in the Chameleon programming environment. Chameleon provides an extension to Haskell supporting advanced and programmable type extensions. Chameleon maps the typing problem for a program to a system of constraints each attached to program code that generates the constraints. We use reasoning about constraint satisfiability and implication to find minimal justifications of type errors, and to explain unexpected types that arise. Through an interactive process akin to declarative debugging, a user can track down exactly where a type error occurs. The approach handles Hindley/Milner types with Haskell-style overloading. The Chameleon system provides a full implementation of our flexible type debugging scheme which can be used as a front-end to any existing Haskell system.
TL;DR: In the 1940s, when modern computing began, engineers tended to view computers and the programs running on them as unified entities, but now after decades in which software and hardware developed along separate paths, it seems to have come full circle.
Abstract: In the 1940s, when modern computing began, engineers tended to view computers and the programs running on them as unified entities. Now, after decades in which software and hardware developed along separate paths, we seem to have come full circle. The hardware on which our programs run is thanks to embedded systems. These systems force designers to work under incredibly tight constraints. To understand the technologies developed to satisfy these constraints, we must first distinguish the underlying embedded systems elements.
TL;DR: This paper proposes and discusses different methods for deterministic monitoring, and provides benchmarking results from an industrial strength case study demonstrating the feasibility of the method based on a number of new techniques.
Abstract: In this paper we present a new approach to deterministic replay using standard components. Our method facilitates cyclic debugging of real-time systems with industry standard real-time operating systems using industry standard debuggers. The method is based on a number of new techniques: A new marker for deterministic differentiation between e.g., loop iterations for deterministic reproduction of interrupts and task preemptions, an algorithm for finding well-defined starting points of replay sessions, as well as a technique for using conditional breakpoints in standard debuggers to replay the target system. We also propose and discuss different methods for deterministic monitoring, and provide benchmarking results from an industrial strength case study demonstrating the feasibility of our method. Previously published solutions to the problem of debugging real-time systems have been based on the concept of deterministic replay: where significant system events like task-switches of multitasking software and external inputs are recorded during run-time, and later replayed (re-executed) off-line. Previous works have been based on either non-standard hardware, specially designed compilers or modified real-time operating systems. The reliance on non-standard components has limited the success of the approach. Even though this idea has been around for 20 years, no industrial application for debugging of real-time systems of the method has been presented.
TL;DR: In this article, the authors present a simulation/debugging method for SOC designs in which initial memory values are loaded into a simulation model, a test program is executed, and incremental transaction records are generated for each incremental memory access (e.g., data write operations).
Abstract: A simulation/debugging method for SOC designs that utilizes initial memory values loaded into a simulation model. A test program is then executed, and incremental transaction records are generated for each incremental memory access (e.g., data write operations). Each transaction record includes a timestamp, address and data values. The transaction record information is stored/captured on a high level-based (i.e., system address-based) domain that takes into account all the tiling, interleaving, scrambling, and unaligned accessing used in the simulated SOC design, rather than on a low level-based (i.e., physical memory address-based) domain. Upon completing the simulation, the instantaneous memory contents at any selected point in time during the simulated execution are calculated by combining the initial data and intermediate transaction record information. Automatic memory dump and sanity check tests verify the integrity of the final data value and incremental transactions. Cache memory information is collected and displayed using a system-level format.
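The reconstruction step described in this abstract, combining the initial data with the transaction records up to a chosen time, can be sketched as follows. The `WriteRecord` type, the `memory_at` helper, and the sample addresses are hypothetical. Memory is modeled as a plain dictionary from system address to value, which ignores the tiling and interleaving concerns the abstract mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteRecord:
    timestamp: int
    address: int  # system-level address, not a physical memory address
    data: int


def memory_at(initial, records, t):
    """Return the memory contents at time t.

    Starts from the initial image and replays, in timestamp order, every
    write whose timestamp is <= t. Later writes to an address win.
    """
    memory = dict(initial)
    # sorted() is stable, so writes with equal timestamps keep their log order.
    for rec in sorted(records, key=lambda r: r.timestamp):
        if rec.timestamp > t:
            break
        memory[rec.address] = rec.data
    return memory


# Hypothetical initial image and write log.
initial = {0x100: 0, 0x104: 7}
log = [
    WriteRecord(5, 0x100, 1),
    WriteRecord(9, 0x104, 8),
    WriteRecord(12, 0x100, 2),
]
snapshot = memory_at(initial, log, 10)
```

Replaying the whole log and comparing the result with a final memory dump is one way to run the kind of sanity check the abstract describes.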
TL;DR: A novel strategy for automatically debugging programs given sampled data from thousands of actual user runs that has analogies with intuitive debugging heuristics, and is able to deal with various types of bugs that occur in real programs.
Abstract: We present a novel strategy for automatically debugging programs given sampled data from thousands of actual user runs. Our goal is to pinpoint those features that are most correlated with crashes. This is accomplished by maximizing an appropriately defined utility function. It has analogies with intuitive debugging heuristics, and, as we demonstrate, is able to deal with various types of bugs that occur in real programs.
TL;DR: In this article, techniques and systems are described for analyzing, diagnosing, and debugging fabricated hardware designs at the hardware description language (HDL) level, even though the designs have been fabricated in integrated circuit products with limited input/output pins.
Abstract: Techniques and systems for analysis, diagnosis and debugging fabricated hardware designs at a Hardware Description Language (HDL) level are described. Although the hardware designs (which were designed in HDL) have been fabricated in integrated circuit products with limited input/output pins, the techniques and systems enable the hardware designs within the integrated circuit products to be comprehensively analyzed, diagnosed, and debugged at the HDL level at speed. The ability to debug hardware designs at the HDL level facilitates correction or adjustment of the HDL description of the hardware designs.
TL;DR: In this paper, a debugging tool for computer program development is described that adds output statements at strategic locations throughout the program. The output statements may include the filename and line number of the original source code, and may further include a listing of the executed command as well as values of certain expressions and/or variables, as defined by the requested verbosity.
Abstract: A debugging tool for computer program development that analyzes the computer program and adds output statements at strategic locations throughout the program. The output statements may include the filename and line number of the original source code and may further include a listing of the executed command as well as values of certain expressions and/or variables as defined by the requested verbosity. The verbosity may be set at different levels throughout the source code as required.
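A minimal sketch of this kind of instrumentation, applied to Python source with the standard `ast` module (Python 3.9 or later for `ast.unparse`). The `instrument` function and its two verbosity levels are hypothetical: level 1 prints the filename and line number before each statement, and level 2 also prints the statement's first source line. Printing variable values and changing verbosity per region are left out, and a statement inserted before a docstring or a `from __future__` import would change the program's meaning.

```python
import ast


def instrument(source, filename="<example>", verbosity=1):
    """Return source with a print() inserted before every statement."""
    lines = source.splitlines()

    class AddPrints(ast.NodeTransformer):
        def generic_visit(self, node):
            # Instrument nested blocks first, then this node's own blocks.
            super().generic_visit(node)
            for field in ("body", "orelse", "finalbody"):
                stmts = getattr(node, field, None)
                is_block = (isinstance(stmts, list) and stmts
                            and isinstance(stmts[0], ast.stmt))
                if not is_block:
                    continue
                new_block = []
                for stmt in stmts:
                    msg = f"{filename}:{stmt.lineno}"
                    if verbosity >= 2:
                        msg += f": {lines[stmt.lineno - 1].strip()}"
                    # Build the output statement from source text.
                    new_block.append(ast.parse(f"print({msg!r})").body[0])
                    new_block.append(stmt)
                setattr(node, field, new_block)
            return node

    tree = AddPrints().visit(ast.parse(source, filename))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


# Hypothetical input program.
program = "x = 1\nif x:\n    y = x + 1\n"
instrumented = instrument(program, "demo.py", verbosity=2)
```

Running the instrumented text with `exec` prints one location line per executed statement while leaving the program's results unchanged.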
TL;DR: A JVM architecture designed for very small devices that supports all the CLDC Java platform semantics, including exact garbage collection, dynamic class loading, and verification, and has performance comparable to the reference CLDC implementation available from Sun™.
Abstract: The smallest complete Java™ virtual machine implementations in use today are based on the CLDC standard and are deployed in mobile phones and PDAs. These implementations require several tens of kilobytes. Smaller Java-like implementations also exist, but these involve compromises in Java semantics. This paper describes a JVM™ architecture designed for very small devices. It supports all the CLDC Java platform semantics, including exact garbage collection, dynamic class loading, and verification. For portability and ease of debugging, the entire system is written in the Java language, with key components automatically translated into C and compiled for the target device. The resulting system will run on the next generation of smart cards, and has performance comparable to the reference CLDC implementation available from Sun™.
TL;DR: Experimental results on formulae extracted from the debugging of C functions manipulating pointers show that an implementation of the techniques can discharge proof obligations which cannot be handled by Simplify (the theorem prover used in the ESC/Java tool) and perform much better on others.
Abstract: Software bugs are very difficult to detect even in small units of code. Several techniques to debug or prove correct such units are based on the generation of a set of formulae whose unsatisfiability reveals the presence of an error. These techniques assume the availability of a theorem prover capable of automatically discharging the resulting proof obligations. Building such a tool is a difficult, long, and error-prone activity. In this paper, we describe techniques to build provers which are highly automatic and flexible by combining state-of-the-art superposition theorem provers and BDDs. We report experimental results on formulae extracted from the debugging of C functions manipulating pointers showing that an implementation of our techniques can discharge proof obligations which cannot be handled by Simplify (the theorem prover used in the ESC/Java tool) and perform much better on others.
TL;DR: In this article, the authors present a debugging tool that allows a user to view debugging information in the same window as the graphical view of the model being executed, where debugging data is associated with relevant components of a model displayed in a graphical view.
Abstract: A system and method of exposing debugging information in a graphical modeling and execution environment is disclosed. The present invention allows a user to view debugging information in the same window as the graphical view of the model being executed. Debugging data is associated with relevant components of the model displayed in the graphical view. A separate execution list view shows the methods called during the execution of the block diagram in the current time step up until the current point in execution. User-set breakpoints and conditional breakpoints may be set in both the model view and the execution list view. Values may be obtained for all of the displayed methods. The debugging tool may be implemented by using it in conjunction with a graphical modeling and execution environment, such as a block diagram environment or state diagram environment.
TL;DR: The results indicate that while open-source development is subject to positive learning effects, these effects are not universal, with some projects deriving more benefit than others.
Abstract: This paper studies organizational learning effects in open-source programming projects. Working with data from the Apache and Mozilla projects, the study focuses on three aspects of open-source development. The first is the use of the open-source approach as a hedge against system complexity. The second is the adaptive learning mechanisms realized by the debugging process. The last is the learning curve effects of project-specific experience on bug cycle times. The results indicate that while open-source development is subject to positive learning effects, these effects are not universal, with some projects deriving more benefit than others.
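TL;DR: An introductory LabVIEW text covering LabVIEW basics, virtual instruments and their editing and debugging, subVIs, structures, arrays and clusters, graphs and charts, data acquisition, instrument control, data analysis, file I/O, and other LabVIEW applications.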
Abstract: 1. LabVIEW Basics. 2. Virtual Instruments. 3. Editing and Debugging Virtual Instruments. 4. SubVIs. 5. Structures. 6. Arrays and Clusters. 7. Graphs and Charts. 8. Data Acquisitions. 9. Instrument Control. 10. Data Analysis. 11. File I/O. 12. Other LabVIEW Applications.
TL;DR: The creation of a debugger for the Sea Cucumber synthesizing compiler is discussed, used to explore the issues associated with providing information about a circuit in the context of the original source code, thus making the debugging process more intuitive.
Abstract: With the growing popularity of using high-level synthesis tools to map programs written in general-purpose programming languages to FPGA (field programmable gate array) hardware, it has become necessary to provide comprehensive, intuitive debugging tools in order to verify the correctness of the synthesized hardware. The difficulty in creating these tools lies in the fact that typical synthesizing compilers provide no information about how the source code is mapped to hardware. This paper discusses the creation of a debugger for the Sea Cucumber synthesizing compiler used to explore the issues associated with providing information about a circuit in the context of the original source code, thus making the debugging process more intuitive.
TL;DR: In this paper, the authors present a method and system for performing very high speed software downloads concurrent with system testing in an automated production environment and for test-sequencing in multi-tasking environments with consolidated automation and interactive operations.
Abstract: Method and system for performing very high speed software downloads concurrent with system testing in an automated production environment and for test-sequencing in multi-tasking environments with consolidated automation and interactive operations is described. In one embodiment, during diagnostics and software download, a multi-tasking OS is booted on a target computer system, thereby enabling diagnostics to be run at the same time the software download is performed. A visual step-sequencing engine provides the ability to sequentially execute steps, as well as to execute steps in parallel and to combine parallel and sequential steps into loops. The sequencing engine provides a visual representation of the current run status of the target system in a Main window. The sequencing engine also integrates EMR debug tools into the same application so that EMR debug technicians can run failing steps directly from the application via an EMR Control window thereof and consolidates logs into a single location within the application viewable via a Logs window thereof.
TL;DR: In this paper, a method for modifying a method's byte code instructions for purposes of testing, debugging and/or monitoring is described, and the application of the method to distributed statistical record (DSR) keeping is also disclosed.
Abstract: A method is disclosed that comprises modifying a method's byte code instructions for purposes of testing, debugging and/or monitoring. Additional byte code instructions are inserted into the method's byte code instructions at an entry point of the method and at an exit point of the method. The first additional byte code instruction causes a first output function to be executed for the method as a consequence of the entry point being reached during runtime. The second additional byte code instruction causes a second output function to be executed for the method as a consequence of the exit point being reached during runtime. The application of the method to Distributed Statistical Record (DSR) keeping is also disclosed.
TL;DR: In this paper, a front-end application receives a request for a debugger tool from an end-user system and provides to the back-end system an identification of the end user system.
Abstract: Methods and apparatus, including computer program products, for allowing an end user at an end user system to remotely debug a back-end application program executing on a back-end system. To access the services of the back-end application program, the end user system interacts with a front-end application program executing on a front-end system, the front-end application program acting as a proxy to the back-end application program. The front-end application program receives a request for a debugger tool from the end user system and provides to the back-end system an identification of the end user system. Based on the identification, the back-end system sends a request to start a debugger tool to the end user system, and in response, the end user system establishes a communication channel with the front-end system to use the debugger tool to receive debugging information.
TL;DR: The paper uses three adaptive application examples to illustrate the capabilities and benefits of PCL, and to show experimentally that the performance overheads of using PCL for implementing an adaptive application are negligible.