TL;DR: This book provides an extensive discussion of techniques for building Bayesian networks that model real-world situations, including techniques for synthesizing models from design, learning models from data, and debugging models using sensitivity analysis.
Abstract: This book provides a thorough introduction to the formal foundations and practical applications of Bayesian networks. It provides an extensive discussion of techniques for building Bayesian networks that model real-world situations, including techniques for synthesizing models from design, learning models from data, and debugging models using sensitivity analysis. It also treats exact and approximate inference algorithms at both theoretical and practical levels. The author assumes very little background on the covered subjects, supplying in-depth discussions for theoretically inclined readers and enough practical details to provide an algorithmic cookbook for the system developer.
TL;DR: Defects4J is a database and extensible framework that provides real bugs to enable reproducible studies in software testing research; it also offers a high-level interface to common tasks in software testing research, making it easy to conduct and reproduce empirical studies.
Abstract: Empirical studies in software testing research may not be comparable, reproducible, or characteristic of practice. One reason is that real bugs are too infrequently used in software testing research. Extracting and reproducing real bugs is challenging and as a result hand-seeded faults or mutants are commonly used as a substitute. This paper presents Defects4J, a database and extensible framework providing real bugs to enable reproducible studies in software testing research. The initial version of Defects4J contains 357 real bugs from 5 real-world open source programs. Each real bug is accompanied by a comprehensive test suite that can expose (demonstrate) that bug. Defects4J is extensible and builds on top of each program’s version control system. Once a program is configured in Defects4J, new bugs can be added to the database with little or no effort. Defects4J features a framework to easily access faulty and fixed program versions and corresponding test suites. This framework also provides a high-level interface to common tasks in software testing research, making it easy to conduct and reproduce empirical studies. Defects4J is publicly available at http://defects4j.org.
TL;DR: A technique named DStar (D*) is proposed which can suggest suspicious locations for fault localization automatically without requiring any prior information on program structure or semantics and is found to be more effective at locating faults than all the other techniques it is compared to.
Abstract: Effective debugging is crucial to producing reliable software. Manual debugging is becoming prohibitively expensive, especially due to the growing size and complexity of programs. Given that fault localization is one of the most expensive activities in program debugging, there has been a great demand for fault localization techniques that can help guide programmers to the locations of faults. In this paper, a technique named DStar (D*) is proposed which can suggest suspicious locations for fault localization automatically without requiring any prior information on program structure or semantics. D* is evaluated across 24 programs, and is compared to 38 different fault localization techniques. Both single-fault and multi-fault programs are used. Results indicate that D* is more effective at locating faults than all the other techniques it is compared to. An empirical evaluation is also conducted to illustrate how the effectiveness of D* increases as the exponent * grows, and then levels off when the exponent * exceeds a critical value. Discussions are presented to support such observations.
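The abstract above does not state the D* formula. As a minimal sketch, assuming the commonly cited form suspiciousness(s) = N_CF^* / (N_UF + N_CS), where N_CF and N_CS count failed and passed tests that cover statement s and N_UF counts failed tests that do not cover it, the ranking could be computed as follows. The function names, the input layout, and the handling of a zero denominator are illustrative assumptions, not the authors' implementation.

```python
def dstar(n_cf, n_uf, n_cs, star=2):
    """Assumed D* score: n_cf**star / (n_uf + n_cs)."""
    denom = n_uf + n_cs
    if denom == 0:
        # Assumption: a statement covered only by failing tests, and by all
        # of them, is treated as maximally suspicious.
        return float("inf") if n_cf > 0 else 0.0
    return (n_cf ** star) / denom


def rank_statements(coverage, outcomes, star=2):
    """Rank statement ids from most to least suspicious.

    coverage: list of sets, the statement ids each test executed
    outcomes: list of bools, True where the corresponding test failed
    """
    total_failed = sum(outcomes)
    statements = set().union(*coverage)
    scores = {}
    for s in statements:
        n_cf = sum(1 for cov, failed in zip(coverage, outcomes) if failed and s in cov)
        n_cs = sum(1 for cov, failed in zip(coverage, outcomes) if not failed and s in cov)
        n_uf = total_failed - n_cf
        scores[s] = dstar(n_cf, n_uf, n_cs, star)
    # Highest score first; ties broken by statement id for determinism.
    return sorted(scores, key=lambda s: (-scores[s], s))


# Hypothetical data: three tests over statements 1..3; tests 0 and 2 fail.
coverage = [{1, 2, 3}, {1, 2}, {1, 3}]
outcomes = [True, False, True]
ranking = rank_statements(coverage, outcomes)
```

Only coverage and pass/fail outcomes are needed here, which matches the abstract's point that no prior information on program structure or semantics is required.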
TL;DR: A study of 70 real-world performance bugs collected from eight large-scale and popular Android applications, which identified their common patterns and can support follow-up research on performance bug avoidance, testing, debugging and analysis for smartphone applications.
Abstract: Smartphone applications’ performance has a vital impact on user experience. However, many smartphone applications suffer from bugs that cause significant performance degradation, thereby losing their competitive edge. Unfortunately, people have little understanding of these performance bugs. They also lack effective techniques to fight with such bugs. To bridge this gap, we conducted a study of 70 real-world performance bugs collected from eight large-scale and popular Android applications. We studied the characteristics (e.g., bug types and how they manifested) of these bugs and identified their common patterns. These findings can support follow-up research on performance bug avoidance, testing, debugging and analysis for smartphone applications. To demonstrate the usefulness of our findings, we implemented a static code analyzer, PerfChecker, to detect our identified performance bug patterns. We experimentally evaluated PerfChecker by applying it to 29 popular Android applications, which comprise 1.1 million lines of Java code. PerfChecker successfully detected 126 matching instances of our performance bug patterns. Among them, 68 were quickly confirmed by developers as previously-unknown issues that affect application performance, and 20 were fixed soon afterwards by following our optimization suggestions.
TL;DR: This work studies software bug characteristics by sampling 2,060 real world bugs in three large, representative open-source projects and uses machine learning techniques to classify 109,014 bugs automatically, suggesting semantic bugs are the dominant root cause.
Abstract: To design effective tools for detecting and recovering from software failures requires a deep understanding of software bug characteristics. We study software bug characteristics by sampling 2,060 real world bugs in three large, representative open-source projects--the Linux kernel, Mozilla, and Apache. We manually study these bugs in three dimensions--root causes, impacts, and components. We further study the correlation between categories in different dimensions, and the trend of different types of bugs. The findings include: (1) semantic bugs are the dominant root cause. As software evolves, semantic bugs increase, while memory-related bugs decrease, calling for more research effort to address semantic bugs; (2) the Linux kernel operating system (OS) has more concurrency bugs than its non-OS counterparts, suggesting more effort into detecting concurrency bugs in operating system code; and (3) reported security bugs are increasing, and the majority of them are caused by semantic bugs, suggesting more support to help developers diagnose and fix security bugs, especially semantic security bugs. In addition, to reduce the manual effort in building bug benchmarks for evaluating bug detection and diagnosis tools, we use machine learning techniques to classify 109,014 bugs automatically.
TL;DR: This paper shows how one can automatically construct a model of request execution from pre-existing component logs by generating a large number of potential hypotheses about program behavior and rejecting hypotheses contradicted by the empirical observations.
Abstract: Current debugging and optimization methods scale poorly to deal with the complexity of modern Internet services, in which a single request triggers parallel execution of numerous heterogeneous software components over a distributed set of computers. The Achilles' heel of current methods is the need for a complete and accurate model of the system under observation: producing such a model is challenging because it requires either assimilating the collective knowledge of hundreds of programmers responsible for the individual components or restricting the ways in which components interact. Fortunately, the scale of modern Internet services offers a compensating benefit: the sheer volume of requests serviced means that, even at low sampling rates, one can gather a tremendous amount of empirical performance observations and apply "big data" techniques to analyze those observations. In this paper, we show how one can automatically construct a model of request execution from pre-existing component logs by generating a large number of potential hypotheses about program behavior and rejecting hypotheses contradicted by the empirical observations. We also show how one can validate potential performance improvements without costly implementation effort by leveraging the variation in component behavior that arises naturally over large numbers of requests to measure the impact of optimizing individual components or changing scheduling behavior. We validate our methodology by analyzing performance traces of over 13 million requests to Facebook servers. We present a detailed study of the factors that affect the end-to-end latency of such requests. We also use our methodology to suggest and validate a scheduling optimization for improving Facebook request latency.
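The generate-and-reject idea in this abstract can be illustrated with a deliberately simplified sketch. It assumes that each trace is an ordered list of event names with no repeats and that the only hypothesis type is "A always happens before B"; the paper's actual hypothesis classes and log format are not given in the abstract, so everything named below is hypothetical.

```python
from itertools import permutations


def infer_happens_before(traces):
    """Return the ordering hypotheses (a, b) that no trace contradicts.

    traces: list of lists of event names, each in observed order.
    Assumption: an event appears at most once per trace.
    """
    events = set().union(*(set(t) for t in traces))
    # Start from every possible ordered pair, then reject on evidence.
    hypotheses = set(permutations(sorted(events), 2))
    for trace in traces:
        position = {event: i for i, event in enumerate(trace)}
        for a, b in list(hypotheses):
            # A trace where b precedes a contradicts "a before b".
            if a in position and b in position and position[a] > position[b]:
                hypotheses.discard((a, b))
    return hypotheses


# Hypothetical request traces: "db" and "render" run in either order.
traces = [
    ["recv", "db", "render", "send"],
    ["recv", "render", "db", "send"],
]
model = infer_happens_before(traces)
```

With these two traces, both orderings of `db` and `render` are rejected, so the surviving model treats them as potentially concurrent. Pairs that never co-occur in any trace are kept, because nothing contradicts them; a larger trace volume is what makes the surviving hypotheses credible.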
TL;DR: This paper proposes CrashLocator, a method to locate faulty functions using the crash stack information in crash reports, and shows that it outperforms the conventional stack-only methods significantly.
Abstract: Software crash is common. When a crash occurs, software developers can receive a report upon user permission. A crash report typically includes a call stack at the time of crash. An important step of debugging a crash is to identify faulty functions, which is often a tedious and labor-intensive task. In this paper, we propose CrashLocator, a method to locate faulty functions using the crash stack information in crash reports. It deduces possible crash traces (the failing execution traces that lead to crash) by expanding the crash stack with functions in static call graph. It then calculates the suspiciousness of each function in the approximate crash traces. The functions are then ranked by their suspiciousness scores and are recommended to developers for further investigation. We evaluate our approach using real-world Mozilla crash data. The results show that our approach is effective: we can locate 50.6%, 63.7% and 67.5% of crashing faults by examining top 1, 5 and 10 functions recommended by CrashLocator, respectively. Our approach outperforms the conventional stack-only methods significantly.
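The stack-expansion step described above can be sketched as a bounded breadth-first walk over a static call graph. This is an illustrative sketch only: the call-graph representation, the `depth` parameter, and the function names are assumptions, and the suspiciousness scoring is omitted because the abstract does not specify it.

```python
def expand_crash_stack(stack, call_graph, depth):
    """Expand a crash stack with callees reachable within `depth` calls.

    stack: list of function names in the reported crash stack
    call_graph: dict mapping a function name to the functions it may call
    Returns a dict from function name to the smallest expansion depth at
    which it was reached (0 for functions already on the stack).
    """
    reached = {f: 0 for f in stack}
    frontier = list(stack)
    for d in range(1, depth + 1):
        next_frontier = []
        for f in frontier:
            for callee in call_graph.get(f, ()):
                if callee not in reached:
                    reached[callee] = d
                    next_frontier.append(callee)
        frontier = next_frontier
    return reached


# Hypothetical call graph and crash stack.
call_graph = {
    "main": ["init", "parse"],
    "parse": ["lex"],
    "init": ["alloc"],
    "lex": ["read"],
}
stack = ["main", "parse"]
depth1 = expand_crash_stack(stack, call_graph, 1)
depth2 = expand_crash_stack(stack, call_graph, 2)
```

Functions such as `init` are not on the stack at crash time, yet they ran earlier and may hold the fault; a stack-only method cannot reach them, which is the reason for expanding at all.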
TL;DR: A SysML-based approach to support the model-driven engineering (MDE) of Manufacturing Automation Software Projects (MASP) and the suitability of the MDE approach for future users was proven.
TL;DR: This paper conducts an empirical study to understand how performance problems are observed and reported by real-world users and shows that statistical debugging is a natural fit for diagnosing performance problems, which are often observed through comparison-based approaches and reported together with both good and bad inputs.
Abstract: Design and implementation defects that lead to inefficient computation widely exist in software. These defects are difficult to avoid and discover. They lead to severe performance degradation and energy waste during production runs, and are becoming increasingly critical with the meager increase of single-core hardware performance and the increasing concerns about energy constraints. Effective tools that diagnose performance problems and point out the inefficiency root cause are sorely needed. The state of the art of performance diagnosis is preliminary. Profiling can identify the functions that consume the most computation resources, but can neither identify the ones that waste the most resources nor explain why. Performance-bug detectors can identify specific type of inefficient computation, but are not suited for diagnosing general performance problems. Effective failure diagnosis techniques, such as statistical debugging, have been proposed for functional bugs. However, whether they work for performance problems is still an open question. In this paper, we first conduct an empirical study to understand how performance problems are observed and reported by real-world users. Our study shows that statistical debugging is a natural fit for diagnosing performance problems, which are often observed through comparison-based approaches and reported together with both good and bad inputs. We then thoroughly investigate different design points in statistical debugging, including three different predicates and two different types of statistical models, to understand which design point works the best for performance diagnosis. Finally, we study how some unique nature of performance bugs allows sampling techniques to lower the overhead of run-time performance diagnosis without extending the diagnosis latency.
TL;DR: This paper presents a technique (and its tool implementation) to troubleshoot configuration errors caused by software evolution, called ConfSuggester, which uses dynamic profiling, execution trace comparison, and static analysis to link the undesired behavior to its root cause.
Abstract: Modern software often exposes configuration options that enable users to customize its behavior. During software evolution, developers may change how the configuration options behave. When upgrading to a new software version, users may need to re-configure the software by changing the values of certain configuration options. This paper addresses the following question during the evolution of a configurable software system: which configuration options should a user change to maintain the software's desired behavior? This paper presents a technique (and its tool implementation, called ConfSuggester) to troubleshoot configuration errors caused by software evolution. ConfSuggester uses dynamic profiling, execution trace comparison, and static analysis to link the undesired behavior to its root cause: a configuration option whose value can be changed to produce desired behavior from the new software version. We evaluated ConfSuggester on 8 configuration errors from 6 configurable software systems written in Java. For 6 errors, the root-cause configuration option was ConfSuggester's first suggestion. For 1 error, the root cause was ConfSuggester's third suggestion. The root cause of the remaining error was ConfSuggester's sixth suggestion. Overall, ConfSuggester produced significantly better results than two existing techniques. ConfSuggester runs in just a few minutes, making it an attractive alternative to manual debugging.
TL;DR: A source-level debugging framework for FPGA high-level synthesis (HLS) that offers gdb-like step, break, and data inspection functionality for an HLS-generated hardware circuit and permits concurrent hardware and software debugging to discover the first point at which any logic signal in the hardware mismatches with its corresponding variable in software.
Abstract: We describe a source-level debugging framework for FPGA high-level synthesis (HLS) that offers gdb-like step, break, and data inspection functionality for an HLS-generated hardware circuit. With the proposed framework, the user can inspect the values of logic signals in the hardware from the C source code perspective. The logic signal values come from one of two sources: 1) a logic simulation of the RTL, or 2) an actual execution of the hardware on an FPGA. In addition to the software-like ecosystem for FPGA HLS debugging, the framework provides the user with insight on the RTL produced by the HLS tool for each C statement, and permits concurrent hardware and software debugging to discover the first point at which any logic signal in the hardware mismatches with its corresponding variable in software.
TL;DR: It is shown that deductive technology that has been developed for full functional verification can be used as a basis and framework for other purposes than pure functional verification.
Abstract: The KeY system offers a platform of software analysis tools for sequential Java. Foremost, this includes full functional verification against contracts written in the Java Modeling Language. But the approach is general enough to provide a basis for other methods and purposes: (i) complementary validation techniques to formal verification such as testing and debugging, (ii) methods that reduce the complexity of verification such as modularization and abstract interpretation, (iii) analyses of non-functional properties such as information flow security, and (iv) sound program transformation and code generation. We show that deductive technology that has been developed for full functional verification can be used as a basis and framework for other purposes than pure functional verification. We use the current release of the KeY system as an example to explain and prove this claim.
TL;DR: The first static analysis is proposed to model GUI-related Android objects, their flow through the application, and their interactions with each other via the abstractions defined by the Android platform, which enables static modeling of control/data flow that is foundational for compiler analyses, instrumentation for event/interaction profiling, static error checking, security analysis, test generation, and automated debugging.
Abstract: The popularity of Android software has grown dramatically in the last few years. It is essential for researchers in programming languages and compilers to contribute new techniques in this increasingly important area. Such techniques require a foundation of program analyses for Android. The target of our work is static object reference analysis, which models the flow of object references. Existing reference analyses cannot be applied directly to Android because the software is component-based and event-driven. An Android application is driven by a graphical user interface (GUI), with GUI objects responding to user actions. These objects and the event handlers associated with them ultimately determine the possible flow of control and data. We propose the first static analysis to model GUI-related Android objects, their flow through the application, and their interactions with each other via the abstractions defined by the Android platform. A formal semantics for the relevant Android constructs is developed to provide a solid foundation for this and other analyses. Next, we propose a constraint-based reference analysis based on the semantics. The analysis employs a constraint graph to model the flow of GUI objects, the hierarchical structure of these objects, and the effects of relevant Android operations. Experimental evaluation on real-world Android applications strongly suggests that the analysis achieves high precision with low cost. The analysis enables static modeling of control/data flow that is foundational for compiler analyses, instrumentation for event/interaction profiling, static error checking, security analysis, test generation, and automated debugging. It provides a key component to be used by compile-time analysis researchers in the growing area of Android software.
TL;DR: PerfScope achieves online bug inference to obviate the need for offline bug reproduction and is application-agnostic, which can support both interpreted and compiled programs running inside a cloud infrastructure.
Abstract: Performance bugs which manifest in a production cloud computing infrastructure are notoriously difficult to diagnose because of both the difficulty of reproducing those bugs and the lack of debugging information. In this paper, we present PerfScope, a practical online performance bug inference tool to help the developer understand how a performance bug happened during the production run. PerfScope achieves online bug inference to obviate the need for offline bug reproduction. PerfScope does not require application source code or any runtime instrumentation to the production system. PerfScope is application-agnostic, which can support both interpreted and compiled programs running inside a cloud infrastructure. We have implemented PerfScope and tested it using real performance bugs on seven popular open source server systems (Hadoop, HDFS, Cassandra, Tomcat, Apache, Lighttpd, MySQL). The results show that PerfScope can narrow down the search scope of the bug-related functions to a small percentage (0.03-2.3%) and rank the real bug-related functions within top five candidates in the majority of cases. PerfScope only imposes on average 1.8% runtime overhead to the tested server applications.
TL;DR: An open-source framework that supports the full feature set of OpenACC V1.0 and performs source-to-source transformations, targeting heterogeneous devices, such as NVIDIA GPUs is presented.
Abstract: This paper presents Open Accelerator Research Compiler (OpenARC): an open-source framework that supports the full feature set of OpenACC V1.0 and performs source-to-source transformations, targeting heterogeneous devices, such as NVIDIA GPUs. Combined with its high-level, extensible Intermediate Representation (IR) and rich semantic annotations, OpenARC serves as a powerful research vehicle for prototyping optimization, source-to-source transformations, and instrumentation for debugging, performance analysis, and autotuning. In fact, OpenARC is equipped with various capabilities for advanced analyses and transformations, as well as built-in performance and debugging tools. We explain the overall design and implementation of OpenARC, and we present key analysis techniques necessary to efficiently port OpenACC applications. Porting various OpenACC applications to CUDA GPUs using OpenARC demonstrates that OpenARC performs similarly to a commercial compiler, while serving as a general research framework.
TL;DR: Perfume is described, an automated approach for inferring behavioral, resource-aware models of software systems from logs of their executions that improves on the state of the art in model inference by differentiating behaviorally similar executions that differ in resource consumption.
Abstract: Software bugs often arise because of differences between what developers think their system does and what the system actually does. These differences frustrate debugging and comprehension efforts. We describe Perfume, an automated approach for inferring behavioral, resource-aware models of software systems from logs of their executions. These finite state machine models ease understanding of system behavior and resource use. Perfume improves on the state of the art in model inference by differentiating behaviorally similar executions that differ in resource consumption. For example, Perfume separates otherwise identical requests that hit a cache from those that miss it, which can aid understanding how the cache affects system behavior and removing cache-related bugs. A small user study demonstrates that using Perfume is more effective than using logs and another model inference tool for system comprehension. A case study on the TCP protocol demonstrates that Perfume models can help understand non-trivial protocol behavior. Perfume models capture key system properties and improve system comprehension, while being reasonably robust to noise likely to occur in real-world executions.
TL;DR: This work presents an approach to the problem of type debugging that is based on generating and filtering a comprehensive set of type-change suggestions and finds that it outperforms other approaches and provides a viable alternative.
Abstract: Changing a program in response to a type error plays an important part in modern software development. However, the generation of good type error messages remains a problem for highly expressive type systems. Existing approaches often suffer from a lack of precision in locating errors and proposing remedies. Specifically, they either fail to locate the source of the type error consistently, or they report too many potential error locations. Moreover, the change suggestions offered are often incorrect. This makes the debugging process tedious and ineffective. We present an approach to the problem of type debugging that is based on generating and filtering a comprehensive set of type-change suggestions. Specifically, we generate all (program-structure-preserving) type changes that can possibly fix the type error. These suggestions will be ranked and presented to the programmer in an iterative fashion. In some cases we also produce suggestions to change the program. In most situations, this strategy delivers the correct change suggestions quickly, and at the same time never misses any rare suggestions. The computation of the potentially huge set of type-change suggestions is efficient since it is based on a variational type inference algorithm that type checks a program with variations only once, efficiently reusing type information for shared parts. We have evaluated our method and compared it with previous approaches. Based on a large set of examples drawn from the literature, we have found that our method outperforms other approaches and provides a viable alternative.
TL;DR: Tardis provides affordable time-travel with an average overhead of only 7% during normal execution, a rate of 0.6MB/s of history logging, and a worst-case 0.68s time-travel latency on the authors' benchmark applications, making Tardis suitable for use as the default debugger for managed languages.
Abstract: Developers who set a breakpoint a few statements too late or who are trying to diagnose a subtle bug from a single core dump often wish for a time-traveling debugger. The ability to rewind time to see the exact sequence of statements and program values leading to an error has great intuitive appeal but, due to large time and space overheads, time traveling debuggers have seen limited adoption. A managed runtime, such as the Java JVM or a JavaScript engine, has already paid much of the cost of providing core features - type safety, memory management, and virtual IO - that can be reused to implement a low overhead time-traveling debugger. We leverage this insight to design and build affordable time-traveling debuggers for managed languages. Tardis realizes our design: it provides affordable time-travel with an average overhead of only 7% during normal execution, a rate of 0.6MB/s of history logging, and a worst-case 0.68s time-travel latency on our benchmark applications. Tardis can also debug optimized code using time-travel to reconstruct state. This capability, coupled with its low overhead, makes Tardis suitable for use as the default debugger for managed languages, promising to bring time-traveling debugging into the mainstream and transform the practice of debugging.
TL;DR: This paper implemented a DMT system based on an execution model called deterministic lazy release consistency (DLRC), which guarantees that programs execute deterministically even when they contain data races, and evaluated it using 16 parallel applications.
Abstract: Multithreaded programs execute nondeterministically on conventional architectures and operating systems. This complicates many tasks, including debugging and testing. Deterministic multithreading (DMT) makes the output of a multithreaded program depend on its inputs only, which can totally solve the above problem. However, current DMT implementations suffer from a common inefficiency: they use frequent global barriers to enforce a deterministic ordering on memory accesses. In this paper, we eliminate that inefficiency using an execution model we call deterministic lazy release consistency (DLRC). Our execution model uses the Kendo algorithm to enforce a deterministic ordering on synchronization, and it uses a deterministic version of the lazy release consistency memory model to propagate memory updates across threads. Our approach guarantees that programs execute deterministically even when they contain data races. We implemented a DMT system based on these ideas (RFDet) and evaluated it using 16 parallel applications. Our implementation targets C/C++ programs that use POSIX threads. Results show that RFDet gains nearly 2x speedup compared with DThreads, a state-of-the-art DMT system.
TL;DR: Termite-2 is the first tool to combine the power of automation with the flexibility of conventional development, the first practical synthesis tool based on abstraction refinement, and the first synthesis tool to support automated debugging of input specifications.
Abstract: Automatic device driver synthesis is a radical approach to creating drivers faster and with fewer defects by generating them automatically based on hardware device specifications. We present the design and implementation of a new driver synthesis toolkit, called Termite-2. Termite-2 is the first tool to combine the power of automation with the flexibility of conventional development. It is also the first practical synthesis tool based on abstraction refinement. Finally, it is the first synthesis tool to support automated debugging of input specifications. We demonstrate the practicality of Termite-2 by synthesizing drivers for a number of I/O devices representative of a typical embedded platform.
TL;DR: A new automated framework that applies text mining technologies on the natural-language description of bug reports to train a statistical model on historical bug reports with known labels, and it is shown that naive Bayes multinomial with information gain achieves the best performance.
Abstract: Configuration bugs are one of the dominant causes of software failures. Previous studies show that a configuration bug could cause huge financial losses in a software system. The importance of configuration bugs has attracted various research studies, e.g., to detect, diagnose, and fix configuration bugs. Given a bug report, an approach that can identify whether the bug is a configuration bug could help developers reduce debugging effort. We refer to this problem as configuration bug reports prediction. To address this problem, we develop a new automated framework that applies text mining technologies on the natural-language description of bug reports to train a statistical model on historical bug reports with known labels (i.e., configuration or non-configuration), and the statistical model is then used to predict a label for a new bug report. Developers could apply our model to automatically predict labels of bug reports to improve their productivity. Our tool first applies feature selection techniques (e.g., information gain and chi-square) to pre-process the textual information in bug reports, and then applies various text mining techniques (e.g., naive Bayes, SVM, naive Bayes multinomial) to build statistical models. We evaluate our solution on 5 bug report datasets including accumulo, activemq, camel, flume, and wicket. We show that naive Bayes multinomial with information gain achieves the best performance. On average across the 5 projects, its accuracy, configuration F-measure and non-configuration F-measure are 0.811, 0.450, and 0.880, respectively. We also compare our solution with the method proposed by Arshad et al. The results show that our proposed approach that uses naive Bayes multinomial with information gain on average improves accuracy, configuration F-measure and non-configuration F-measure scores of Arshad et al.'s method by 8.34%, 103.7%, and 4.24%, respectively.
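The three reported metrics can diverge when one label is rare, which is why accuracy and the per-class F-measures are listed separately. A minimal sketch of how they might be computed from predicted labels follows; it uses the standard definition F = 2PR / (P + R), and the label strings and sample data are hypothetical, not taken from the paper's datasets.

```python
def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f_measure(y_true, y_pred, positive):
    """F-measure for one class, treating `positive` as the target label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five bug reports.
y_true = ["config", "config", "other", "other", "other"]
y_pred = ["config", "other", "other", "other", "config"]

acc = accuracy(y_true, y_pred)
f_config = f_measure(y_true, y_pred, "config")
f_other = f_measure(y_true, y_pred, "other")
```

In this toy data the minority `config` class scores a lower F-measure than the majority class, the same pattern as the 0.450 versus 0.880 figures in the abstract.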
TL;DR: Symcretic execution as discussed by the authors combines symbolic backward execution and concrete forward execution to find inputs that cover a specific branch or statement in a program, which is useful for debugging and regression testing.
Abstract: Knowing inputs that cover a specific branch or statement in a program is useful for debugging and regression testing. Symbolic backward execution (SBE) is a natural approach to find such targeted inputs. However, SBE struggles with complicated arithmetic, external method calls, and data-dependent loops that occur in many real-world programs. We propose symcretic execution, a novel combination of SBE and concrete forward execution that can efficiently find targeted inputs despite these challenges. An evaluation of our approach on a range of test cases shows that symcretic execution finds inputs in more cases than concolic testing tools while exploring fewer path segments. Integration of our approach will allow test generation tools to fill coverage gaps and static bug detectors to verify candidate bugs with concrete test cases.
TL;DR: This paper gives a review of the previous studies that are related to software fault-localization methods and various methods for localization of faults proposed in the literature.
Abstract: Software is a major component of any computer system. To maintain the quality of software, early fault localization is necessary. Many different fault-localization methods have been used by researchers. Ideally, methods for fault-localization are used in such a way that one is able to detect as many faults as possible using the least resources. But, in general, it is hard to predict a test suite's fault-localization capability. This paper gives a review of the previous studies that are related to software fault-localization methods. It reviews various journal and conference papers on localization of faults and various methods for localization of faults proposed in the literature.
TL;DR: This paper analyzes a highly-configurable industrial application and two open source applications in order to quantify the true challenges that configurability creates for software testing and debugging, finding that all three applications consist of multiple programming languages.
Abstract: Many industrial systems are highly-configurable, complicating the testing and debugging process. While researchers have developed techniques to statically extract, quantify and manipulate the valid system configurations, we conjecture that many of these techniques will fail in practice. In this paper we analyze a highly-configurable industrial application and two open source applications in order to quantify the true challenges that configurability creates for software testing and debugging. We find that (1) all three applications consist of multiple programming languages, hence static analyses need to cross programming language barriers to work, (2) there are many access points and methods to modify configurations, implying that practitioners need configuration traceability and should gather and merge metadata from more than one source and (3) the configuration state of an application on failure cannot be reliably determined by reading persistent data; a runtime memory dump or other heuristics must be used for accurate debugging. We conclude with a roadmap and lessons learned to help practitioners better handle configurability now, and that may lead to new configuration-aware testing and debugging techniques in the future.
TL;DR: High-level synthesis promises to increase designer productivity in the face of steadily increasing FPGA sizes, and broaden the market of use, allowing software designers to reap the benefits of hardware implementation.
Abstract: High-level synthesis (HLS) promises to increase designer productivity in the face of steadily increasing FPGA sizes, and broaden the market of use, allowing software designers to reap the benefits of hardware implementation. One roadblock to HLS adoption is the lack of a debugging infrastructure. To debug, designers can run their source code on a processor; however, this does not capture interactions with other system components. The alternative is to debug using the RTL, which is beyond the expertise of software designers, and impractical for hardware designers as the RTL may not resemble the original source code.
TL;DR: This experience report illuminates the application of formal methods in real safety-critical system development by detailing a complete end-to-end design-time verification process including all models and specifications.
TL;DR: Time travel debugging in a managed runtime system as mentioned in this paper can be used to replay at least a portion of the execution of the managed program component based upon the live-object snapshots of program states.
Abstract: Various technologies described herein pertain to time travel debugging in a managed runtime system. The managed runtime system can include an execution component that executes a managed program component. Moreover, the managed runtime system can include a time travel debugger component. The time travel debugger component can be configured to record a sequence of live-object snapshots of program states during execution of the managed program component. A live-object snapshot can include live objects from a heap in memory at a given time during the execution. Moreover, the time travel debugger component can be configured to replay at least a portion of the execution of the managed program component based upon the live-object snapshots.
TL;DR: In this article, the performance information for a software program which is being debugged in a debugger is adjusted by removing from it a measured debug overhead or other diagnostic overhead, such as pauses, context switches, debug versus release build presence, bounds checking, funceval, and call stack analyses.
Abstract: Assistance is given to aid in optimizing a program's performance during initial development while the program's features are still being implemented and/or debugged, without interfering with that development, by providing easy-to-ignore yet accurate tips about a program's performance inside a debugger. Raw performance information for a software program which is being debugged in a debugger is adjusted by removing from it a measured debug overhead or other diagnostic overhead. Some factors considered when measuring overhead include pauses, context switches, debug versus release build presence, bounds checking, funceval, and call stack analyses. The debugger is enhanced to display the adjusted program performance measure in a graphical user interface, next to the corresponding source code. The enhanced debugger updates the adjusted program performance measure value and keeps its screen location current as the developer moves through the source code, providing more detailed performance information upon request.
TL;DR: This paper proposes a new multi-dimensional root-cause algorithm for fundamental and derived measures of ad systems to identify the dimension most likely to blame and implements the attribution algorithm and a visualization interface in a tool called the Adtributor to help troubleshooters quickly identify potential causes.
Abstract: Advertising (ad) revenue plays a vital role in supporting free websites. When the revenue dips or increases sharply, ad system operators must find and fix the root-cause if actionable, for example, by optimizing infrastructure performance. Such revenue debugging is analogous to diagnosis and root-cause analysis in the systems literature but is more general. Failure of infrastructure elements is only one potential cause; a host of other dimensions (e.g., advertiser, device type) can be sources of potential causes. Further, the problem is complicated by derived measures such as costs-per-click that are also tracked along with revenue. Our paper takes the first systematic look at revenue debugging. Using the concepts of explanatory power, succinctness, and surprise, we propose a new multi-dimensional root-cause algorithm for fundamental and derived measures of ad systems to identify the dimension most likely to blame. Further, we implement the attribution algorithm and a visualization interface in a tool called the Adtributor to help troubleshooters quickly identify potential causes. Based on several case studies on a very large ad system and extensive evaluation, we show that the Adtributor has an accuracy of over 95% and helps cut down troubleshooting time by an order of magnitude.
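The abstract names explanatory power and surprise without defining them. The sketch below shows one plausible formalization, stated here as an assumption rather than as the paper's definition: explanatory power is the share of the total change (actual minus forecast) attributable to one element of a dimension, and surprise is the Jensen-Shannon divergence between an element's forecast share and its actual share. Succinctness and derived measures are not modelled, the ranking rule is a simplification, and the revenue figures and all names are hypothetical.

```python
import math

def explanatory_power(f, a, total_f, total_a):
    # Share of the overall change explained by one element.
    # Assumes the totals differ (there is an anomaly to explain).
    return (a - f) / (total_a - total_f)

def surprise(f, a, total_f, total_a):
    # Jensen-Shannon divergence term between forecast share p and actual share q.
    p, q = f / total_f, a / total_a

    def part(x):
        return x * math.log(2 * x / (p + q)) if x > 0 else 0.0

    return 0.5 * (part(p) + part(q))

def rank_dimensions(forecast, actual):
    # Simplified ranking: order dimensions by the summed surprise of their elements.
    scores = {}
    for dim in forecast:
        total_f = sum(forecast[dim].values())
        total_a = sum(actual[dim].values())
        scores[dim] = sum(
            surprise(forecast[dim][e], actual[dim][e], total_f, total_a)
            for e in forecast[dim]
        )
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Hypothetical revenue: forecast 100 in total, observed 70 in total.
forecast = {
    "device": {"pc": 50, "mobile": 50},
    "advertiser": {"a": 40, "b": 60},
}
actual = {
    "device": {"pc": 50, "mobile": 20},      # the whole drop is on mobile
    "advertiser": {"a": 28, "b": 42},        # proportional drop, shares unchanged
}

ranking = rank_dimensions(forecast, actual)
mobile_ep = explanatory_power(50, 20, 100, 70)
print(ranking[0][0], mobile_ep)
```

In this toy case the advertiser dimension loses revenue in exact proportion to its forecast shares, so its surprise is zero, while the device dimension concentrates the entire drop in one element and ranks first.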
TL;DR: This paper proposes that the original circuit mapping is fully preserved and incremental techniques are used to eliminate the need for a full recompilation, thereby accelerating the debugging process and exploiting two opportunities available during trace-insertion.
Abstract: As integrated circuits encapsulate more functionality and complexity, verifying that these devices operate correctly under all scenarios is an increasingly difficult task. Rather than using traditional verification techniques such as software simulation, more and more designers are taking advantage of the significantly higher clock speeds that can be achieved by using field-programmable gate-array (FPGA)-based prototypes. A key challenge to these prototypes is the lack of on-chip observability during debugging; one popular solution is to insert trace-buffers into the design to record a limited set of internal signals, but modifying this trace configuration often requires the entire circuit to be recompiled. In this paper, we propose that the original circuit mapping is fully preserved and incremental techniques are used to eliminate the need for a full recompilation, thereby accelerating the debugging process. By exploiting two opportunities available during trace-insertion: the ability to connect from any point of a signal to any trace-pin, and the internal symmetry of the FPGA architecture, we find that incremental trace-insertion can be 98 times faster than a full recompilation, return a routing solution with a shorter wirelength, and have a negligible effect on the critical-path delay of the original circuit when reclaiming 75% of the leftover memory capacity for tracing.