TL;DR: Even if software developers don't fully understand the faults or know their location in the code, software rejuvenation can help avoid failures in the presence of aging-related bugs.
Abstract: Even if software developers don't fully understand the faults or know their location in the code, software rejuvenation can help avoid failures in the presence of aging-related bugs. This is good news because reproducing and isolating an aging-related bug can be quite involved, similar to other Mandelbugs. Moreover, monitoring for signs of software aging can even help detect software faults that were missed during the development and testing phases. If, on the other hand, a developer can detect a specific aging-related bug in the code, fixing it and distributing a software update might be worthwhile. In the case of the Patriot missile-defense system, a modified version of the software was indeed prepared and deployed to users. It arrived at Dhahran on 26 February 1991, a day after the fatal incident.
TL;DR: An empirical study is presented that demonstrates that the parallel-debugging technique and methodology can yield a dramatic decrease in total debugging time compared to a one-fault-at-a-time, or conventionally sequential, approach.
Abstract: The presence of multiple faults in a program can inhibit the ability of fault-localization techniques to locate the faults. This problem occurs for two reasons: when a program fails, the number of faults is, in general, unknown; and certain faults may mask or obfuscate other faults. This paper presents our approach to solving this problem that leverages the well-known advantages of parallel work flows to reduce the time-to-release of a program. Our approach consists of a technique that enables more effective debugging in the presence of multiple faults and a methodology that enables multiple developers to simultaneously debug multiple faults. The paper also presents an empirical study that demonstrates that our parallel-debugging technique and methodology can yield a dramatic decrease in total debugging time compared to a one-fault-at-a-time, or conventionally sequential, approach.
TL;DR: For computing minimal unsatisfiability-preserving subterminologies of incoherent terminologies, this paper develops two algorithms: one implementing a bottom-up approach supported by an external description logic reasoner, the other implementing a specialized tableau-based calculus.
Abstract: In this paper we study the diagnosis and repair of incoherent terminologies. We define a number of new nonstandard reasoning services to explain incoherence through pinpointing, and we present algorithms for all of these services. For one of the core tasks of debugging, the calculation of minimal unsatisfiability preserving subterminologies, we developed two different algorithms, one implementing a bottom-up approach using support of an external description logic reasoner, the other implementing a specialized tableau-based calculus. Both algorithms have been prototypically implemented. We study the effectiveness of our algorithms in two ways: we present a realistic case study where we diagnose a terminology used in a practical application, and we perform controlled benchmark experiments to get a better understanding of the computational properties of our algorithms in particular and the debugging problem in general.
TL;DR: The Stack Trace Analysis Tool (STAT) is presented to aid in debugging extreme-scale applications and leverages MRNet, an infrastructure for tool control and data analyses, to overcome scalability barriers faced by heavy-weight debuggers.
Abstract: We present the Stack Trace Analysis Tool (STAT) to aid in debugging extreme-scale applications. STAT can reduce problem exploration spaces from thousands of processes to a few by sampling stack traces to form process equivalence classes, groups of processes exhibiting similar behavior. We can then use full-featured debuggers on representatives from these behavior classes for root cause analysis. STAT scalably collects stack traces over a sampling period to assemble a profile of the application's behavior. STAT routines process the samples to form a call graph prefix tree that encodes common behavior classes over the program's process space and time. STAT leverages MRNet, an infrastructure for tool control and data analyses, to overcome scalability barriers faced by heavy-weight debuggers. We present STAT's design and an evaluation that shows STAT gathers informative process traces from thousands of processes with sub-second latencies, a significant improvement over existing tools. Our case studies of production codes verify that STAT supports the quick identification of errors that were previously difficult to locate.
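The grouping idea in this abstract can be illustrated with a minimal sketch. It is not STAT's implementation: the trace data, the function names, and the dictionary-based tree are hypothetical, and MRNet-style scalable aggregation is not modeled. Processes with identical sampled stack traces form one equivalence class, and merging the traces yields a call-graph prefix tree.

```python
from collections import defaultdict


def equivalence_classes(traces):
    """Group process ranks whose sampled stack traces are identical."""
    classes = defaultdict(list)
    for rank, frames in sorted(traces.items()):
        classes[tuple(frames)].append(rank)
    return dict(classes)


def prefix_tree(traces):
    """Merge traces into a prefix tree; each node records the ranks that reached it."""
    root = {"ranks": set(), "children": {}}
    for rank, frames in traces.items():
        node = root
        node["ranks"].add(rank)
        for frame in frames:
            node = node["children"].setdefault(
                frame, {"ranks": set(), "children": {}}
            )
            node["ranks"].add(rank)
    return root


# Hypothetical sample: rank -> call stack, outermost frame first.
traces = {
    0: ["main", "solve", "MPI_Barrier"],
    1: ["main", "solve", "MPI_Barrier"],
    2: ["main", "solve", "compute"],
    3: ["main", "solve", "MPI_Barrier"],
}

classes = equivalence_classes(traces)
# One representative per class is enough to attach a full-featured debugger to.
representatives = sorted(ranks[0] for ranks in classes.values())
tree = prefix_tree(traces)
```

Here four processes collapse into two behavior classes, so only two representatives need closer inspection.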
TL;DR: It is demonstrated that, by changing only 60 source code lines, all of the C benchmarks in the SPEC CINT2000 suite were parallelizable by automatic thread extraction, and this process, constrained by the limits of modern optimizing compilers, yielded a speedup of 454% on these applications.
Abstract: Single-threaded programming is already considered a complicated task. The move to multi-threaded programming only increases the complexity and cost involved in software development due to rewriting legacy code, training of the programmer, increased debugging of the program, and efforts to avoid race conditions, deadlocks, and other problems associated with parallel programming. To address these costs, other approaches, such as automatic thread extraction, have been explored. Unfortunately, the amount of parallelism that has been automatically extracted is generally insufficient to keep many cores busy. This paper argues that this lack of parallelism is not an intrinsic limitation of the sequential programming model, but rather occurs for two reasons. First, there exists no framework for automatic thread extraction that brings together key existing state-of-the-art compiler and hardware techniques. This paper shows that such a framework can yield scalable parallelization on several SPEC CINT2000 benchmarks. Second, existing sequential programming languages force programmers to define a single legal program outcome, rather than allowing for a range of legal outcomes. This paper shows that natural extensions to the sequential programming model enable parallelization for the remainder of the SPEC CINT2000 suite. Our experience demonstrates that, by changing only 60 source code lines, all of the C benchmarks in the SPEC CINT2000 suite were parallelizable by automatic thread extraction. This process, constrained by the limits of modern optimizing compilers, yielded a speedup of 454% on these applications.
TL;DR: MemTracker is described, a new hardware support mechanism that can be configured to perform different kinds of memory access monitoring tasks and which can be used to implement different monitoring and debugging checkers with minimal performance overheads.
Abstract: Memory bugs are a broad class of bugs that is becoming increasingly common with increasing software complexity, and many of these bugs are also security vulnerabilities. Unfortunately, existing software and even hardware approaches for finding and identifying memory bugs have considerable performance overheads, target only a narrow class of bugs, are costly to implement, or use computational resources inefficiently. This paper describes MemTracker, a new hardware support mechanism that can be configured to perform different kinds of memory access monitoring tasks. MemTracker associates each word of data in memory with a few bits of state, and uses a programmable state transition table to react to different events that can affect this state. The number of state bits per word, the events to which MemTracker reacts, and the transition table are all fully programmable. MemTracker's rich set of states, events, and transitions can be used to implement different monitoring and debugging checkers with minimal performance overheads, even when frequent state updates are needed. To evaluate MemTracker, we map three different checkers onto it, as well as a checker that combines all three. For the most demanding (combined) checker, we observe performance overheads of only 2.7% on average and 4.8% worst-case on SPEC 2000 applications. Such low overheads allow continuous (always-on) use of MemTracker-enabled checkers even in production runs.
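The per-word state plus programmable transition table described above can be modeled in software with a small sketch. Everything concrete here is an assumption made for illustration: the three states, the four event names, the table contents, and the `Tracker` class are hypothetical and are not taken from the paper, which implements the mechanism in hardware.

```python
# Hypothetical per-word states for a simple heap checker.
UNALLOC, UNINIT, INIT = 0, 1, 2

# (current state, event) -> (next state, whether the event is a violation).
TABLE = {
    (UNALLOC, "alloc"): (UNINIT, False),
    (UNALLOC, "load"): (UNALLOC, True),    # read of unallocated memory
    (UNALLOC, "store"): (UNALLOC, True),   # write to unallocated memory
    (UNALLOC, "free"): (UNALLOC, True),    # free of unallocated memory
    (UNINIT, "alloc"): (UNINIT, True),     # allocation over live memory
    (UNINIT, "load"): (UNINIT, True),      # read before initialization
    (UNINIT, "store"): (INIT, False),
    (UNINIT, "free"): (UNALLOC, False),
    (INIT, "alloc"): (INIT, True),         # allocation over live memory
    (INIT, "load"): (INIT, False),
    (INIT, "store"): (INIT, False),
    (INIT, "free"): (UNALLOC, False),
}


class Tracker:
    """Software model: one state per word address, driven by a lookup table."""

    def __init__(self, table, default_state=UNALLOC):
        self.table = table
        self.default_state = default_state
        self.state = {}
        self.violations = []

    def event(self, kind, addr):
        current = self.state.get(addr, self.default_state)
        next_state, is_violation = self.table[(current, kind)]
        self.state[addr] = next_state
        if is_violation:
            self.violations.append((kind, addr))


tracker = Tracker(TABLE)
for kind, addr in [
    ("alloc", 0x10), ("store", 0x10), ("load", 0x10),
    ("free", 0x10), ("load", 0x10),           # use after free
    ("alloc", 0x20), ("load", 0x20),          # uninitialized read
]:
    tracker.event(kind, addr)
```

Swapping in a different table changes the checker without touching the tracking logic, which is the property the abstract attributes to programmability.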
TL;DR: Friday, a system for debugging distributed applications that combines deterministic replay of components with the power of symbolic, low-level debugging and a simple language for expressing higher-level distributed conditions and actions, is presented.
Abstract: Debugging and profiling large-scale distributed applications is a daunting task. We present Friday, a system for debugging distributed applications that combines deterministic replay of components with the power of symbolic, low-level debugging and a simple language for expressing higher-level distributed conditions and actions. Friday allows the programmer to understand the collective state and dynamics of a distributed collection of coordinated application components.
To evaluate Friday, we consider several distributed problems, including routing consistency in overlay networks, and temporal state abnormalities caused by route flaps. We show via micro-benchmarks and larger-scale application measurement that Friday can be used interactively to debug large distributed applications under replay on common hardware.
TL;DR: A novel architecture for embedded logic analysis based on lossless compression is proposed, particularly useful for in-field debugging of custom circuits that have sources of nondeterministic behavior such as asynchronous interfaces, and a new compression ratio metric is introduced.
Abstract: The capacity of on-chip trace buffers employed for embedded logic analysis limits the observation window of a debug experiment. To increase the debug observation window, we propose a novel architecture for embedded logic analysis based on lossless compression. The proposed architecture is particularly useful for in-field debugging of custom circuits that have sources of nondeterministic behavior such as asynchronous interfaces. In order to measure the tradeoff between the area overhead and the increase in the observation window, we also introduce a new compression ratio metric. We use this metric to quantify the performance gain of three lossless compression algorithms suitable for embedded logic analysis.
TL;DR: A combination of static slicing and delta debugging that automatically minimizes the sequence of failure-inducing method calls is presented, which improves on the state of the art by being far more efficient.
Abstract: Randomized unit test cases can be very effective in detecting defects. In practice, however, failing test cases often comprise long sequences of method calls that are tiresome to reproduce and debug. We present a combination of static slicing and delta debugging that automatically minimizes the sequence of failure-inducing method calls. In a case study on the EiffelBase library, the strategy minimizes failing unit test cases on average by 96%. This approach improves on the state of the art by being far more efficient: in contrast to the approach of Lei and Andrews, who use delta debugging alone, our case study found slicing to be 50 times faster, while providing comparable results. The combination of slicing and delta debugging gives the best results and is 11 times faster.
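For readers unfamiliar with the delta debugging half of this combination, here is a simplified sketch of a ddmin-style minimizer over a call sequence. It covers delta debugging only, not the static slicing step or the paper's tooling; the call names and the `fails` oracle are hypothetical stand-ins for rerunning a real test.

```python
def ddmin(calls, fails):
    """Shrink `calls` to a smaller sequence for which `fails` still returns True.

    Simplified ddmin: try each chunk, then each complement, and refine the
    granularity when neither reproduces the failure.
    """
    assert fails(calls), "the initial sequence must reproduce the failure"
    n = 2
    while len(calls) >= 2:
        chunk = max(len(calls) // n, 1)
        subsets = [calls[i:i + chunk] for i in range(0, len(calls), chunk)]
        reduced = False
        for i, subset in enumerate(subsets):
            complement = [c for j, s in enumerate(subsets) if j != i for c in s]
            if fails(subset):
                calls, n, reduced = subset, 2, True
                break
            if fails(complement):
                calls, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(calls):
                break  # already at single-call granularity
            n = min(n * 2, len(calls))
    return calls


# Hypothetical oracle: the failure needs "clear" followed later by "get(9)".
def fails(seq):
    return (
        "clear" in seq
        and "get(9)" in seq
        and seq.index("clear") < seq.index("get(9)")
    )


sequence = ["new", "put(1)", "clear", "put(2)", "remove(3)", "size", "get(9)"]
minimal = ddmin(sequence, fails)
```

Each call to `fails` stands for one test execution, which is why reducing the candidate sequence before minimization (as slicing does in the abstract) can cut the cost substantially.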
TL;DR: It is shown that failure conditions as modeled by a C4.5 decision tree accurately predict failures and can therefore be used as well to help debugging.
Abstract: Using a specific machine learning technique, this paper proposes a way to identify suspicious statements during debugging. The technique is based on principles similar to Tarantula but addresses its main flaw: its difficulty to deal with the presence of multiple faults as it assumes that failing test cases execute the same fault(s). The improvement we present in this paper results from the use of C4.5 decision trees to identify various failure conditions based on information regarding the test cases' inputs and outputs. Failing test cases executing under similar conditions are then assumed to fail due to the same fault(s). Statements are then considered suspicious if they are covered by a large proportion of failing test cases that execute under similar conditions. We report on a case study that demonstrates improvement over the original Tarantula technique in terms of statement ranking. Another contribution of this paper is to show that failure conditions as modeled by a C4.5 decision tree accurately predict failures and can therefore be used as well to help debugging.
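As background, the baseline Tarantula suspiciousness score that this work builds on can be sketched as follows. This shows only the original ranking formula, not the C4.5-based grouping of failing tests proposed in the paper; the coverage data and test names are made up for illustration.

```python
def tarantula(coverage, outcomes):
    """Return {statement: suspiciousness} using the Tarantula formula.

    coverage: test name -> set of statements the test executed
    outcomes: test name -> True if the test passed, False if it failed
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    total_passed = sum(1 for passed in outcomes.values() if passed)
    statements = set().union(*coverage.values())
    scores = {}
    for stmt in statements:
        failed = sum(
            1 for t, cov in coverage.items() if stmt in cov and not outcomes[t]
        )
        passed = sum(
            1 for t, cov in coverage.items() if stmt in cov and outcomes[t]
        )
        fail_ratio = failed / total_failed if total_failed else 0.0
        pass_ratio = passed / total_passed if total_passed else 0.0
        denom = fail_ratio + pass_ratio
        scores[stmt] = fail_ratio / denom if denom else 0.0
    return scores


# Hypothetical coverage: statements are numbered 1..3.
coverage = {"t1": {1, 2, 3}, "t2": {1, 2}, "t3": {1, 3}}
outcomes = {"t1": False, "t2": True, "t3": False}
scores = tarantula(coverage, outcomes)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Statement 3 is executed only by failing tests and so ranks first. When several faults are present, the pooled `total_failed` mixes unrelated failures, which is the weakness the abstract addresses by grouping failing tests by failure condition.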
TL;DR: A novel design debugging formulation based on maximum satisfiability (max-sat) and approximate max-sat is proposed, which can quickly discard many potential error sources in designs, thus drastically reducing the size of the problem passed to an existing debugger.
Abstract: In today's SoC design cycles, debugging is one of the most time consuming manual tasks. CAD solutions strive to reduce the inefficiency of debugging by identifying error sources in designs automatically. Unfortunately, the capacity and performance of such automated techniques must be considerably extended for industrial applicability. This work aims to improve the performance of current state-of-the-art debugging techniques, thus making them more practical. More specifically, this work proposes a novel design debugging formulation based on maximum satisfiability (max-sat) and approximate max-sat. The developed technique can quickly discard many potential error sources in designs, thus drastically reducing the size of the problem passed to an existing debugger. The max-sat formulation is used as a pre-processing step to construct a highly optimized debugging framework. Empirical results demonstrate the effectiveness of the proposed framework as run-time improvements of orders of magnitude are consistently realized over a state-of-the-art debugger.
TL;DR: This paper presents WiDS Checker, a unified framework that can check distributed systems through both simulation and reproduced runs from real deployment, and found non-trivial bugs, including one in a previously proven Paxos specification.
Abstract: Despite many efforts, the predominant practice of debugging a distributed system is still printf-based log mining, which is both tedious and error-prone. In this paper, we present WiDS Checker, a unified framework that can check distributed systems through both simulation and reproduced runs from real deployment. All instances of a distributed system can be executed within one simulation process, multiplexed properly to observe the "happens-before" relationship, thus accurately revealing the full system state. A versatile script language allows a developer to refine system properties into straightforward assertions, which the checker inspects for violations. Combining these two components, we are able to check distributed properties that are otherwise impossible to check. We applied WiDS Checker over a suite of complex and real systems and found non-trivial bugs, including one in a previously proven Paxos specification. Our experience demonstrates the usefulness of the checker and allows us to gain insights beneficial to future research in this area.
TL;DR: The design and implementation of RAMP Blue, the first of several prototype systems for emulation of multi-core architectures in FPGAs, was designed to emulate a distributed-memory message-passing architecture and performed well for emulation purposes.
Abstract: We are developing a set of reusable design blocks and several prototype systems for emulation of multi-core architectures in FPGAs. RAMP Blue is the first of these prototypes and was designed to emulate a distributed-memory message-passing architecture. The system consists of 768-1008 MicroBlaze cores in 64-84 Virtex-II Pro 70 FPGAs on 16-21 BEE2 boards, surpassing the milestone of 1000 cores in a standard 42U rack. An architecture based on point-to-point channels and switches using a combination of custom and generic hardware provides the functionality. Virtual-cut-through dimensional routing on one of two hybrid topologies with virtual channels provides the connectivity. A control network with a tree topology provides management and debugging capabilities. A software infrastructure consisting of GCC, uClinux and UPC allows running off-the-shelf applications and scientific benchmarks. Initial performance is encouraging for emulation purposes. In this paper we report on the design and implementation of RAMP Blue and discuss our experiences and lessons learned.
TL;DR: Qualitative evaluation by domain experts suggests that the novel Delta-Latent-Dirichlet-Allocation model outperforms existing statistical methods for bug cause identification, and may help support other software tasks not addressed by earlier models.
Abstract: Statistical debugging uses machine learning to model program failures and help identify root causes of bugs. We approach this task using a novel Delta-Latent-Dirichlet-Allocation model. We model execution traces attributed to failed runs of a program as being generated by two types of latent topics: normal usage topics and bug topics. Execution traces attributed to successful runs of the same program, however, are modeled by usage topics only. Joint modeling of both kinds of traces allows us to identify weak bug topics that would otherwise remain undetected. We perform model inference with collapsed Gibbs sampling. In quantitative evaluations on four real programs, our model produces bug topics highly correlated to the true bugs, as measured by the Rand index. Qualitative evaluation by domain experts suggests that our model outperforms existing statistical methods for bug cause identification, and may help support other software tasks not addressed by earlier models.
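The Rand index used in the quantitative evaluation measures how often two labelings agree on whether a pair of items belongs together. A minimal sketch of the standard definition follows; the label vectors are invented for illustration and are unrelated to the paper's data.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree.

    A pair agrees when both clusterings put the two items together,
    or both put them apart.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    if len(labels_a) < 2:
        raise ValueError("need at least two items to form a pair")
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical example: true bug per failing run vs. inferred bug topic.
true_bugs = [0, 0, 1, 1]
inferred_topics = [0, 0, 1, 2]
score = rand_index(true_bugs, inferred_topics)
```

The two labelings disagree only on the last pair of runs, so five of the six pairs agree.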
TL;DR: The authors describe state-of-the-art attacks on debuggers that prevent reverse engineering; the information can be used as part of a strategy to protect software or to help overcome the anti-debugging tricks present in malicious software.
Abstract: This article focuses on describing state-of-the-art attacks on debuggers to prevent reverse engineering. You can use the information we present as part of your strategy to protect your software or to assist you in overcoming the anti-debugging tricks present in malicious software. Currently, there are enough anti-debugging techniques available to software engineers to sufficiently protect software against most threats; likewise, most state-of-the-art malware can be sufficiently reverse-engineered with patience and skill to enable security researchers to continue to defend their networks. However, advances in software protection techniques and reverse engineering might alter the balance.
TL;DR: It is argued that implicitly parallel programming models are critical for addressing the software development crises and software scalability challenges for many-core microprocessors.
Abstract: This paper argues for an implicitly parallel programming model for many-core microprocessors, and provides initial technical approaches towards this goal. In an implicitly parallel programming model, programmers maximize algorithm-level parallelism, express their parallel algorithms by asserting high-level properties on top of a traditional sequential programming language, and rely on parallelizing compilers and hardware support to perform parallel execution under the hood. In such a model, compilers and related tools require much more advanced program analysis capabilities and programmer assertions than what are currently available so that a comprehensive understanding of the input program's concurrency can be derived. Such an understanding is then used to drive automatic or interactive parallel code generation tools for a diverse set of parallel hardware organizations. The chip-level architecture and hardware should maintain parallel execution state in such a way that a strictly sequential execution state can always be derived for the purpose of verifying and debugging the program. We argue that implicitly parallel programming models are critical for addressing the software development crises and software scalability challenges for many-core microprocessors.
TL;DR: An automatic approach for fault localization in C programs is presented, based on model checking and reports only components that can be changed such that the difference between actual and intended behavior of the example is removed.
TL;DR: Indus—a robust framework for analyzing and slicing concurrent Java programs, and Kaveri—a feature-rich Eclipse-based GUI front end for Indus slicing are presented.
Abstract: Program slicing is a program analysis and transformation technique that has been successfully used in a wide range of applications including program comprehension, debugging, maintenance, testing, and verification. However, there are only few fully featured implementations of program slicing that are available for industrial applications or academic research. In particular, very little tool support exists for slicing programs written in modern object-oriented languages such as Java, C#, or C++. In this paper, we present Indus—a robust framework for analyzing and slicing concurrent Java programs, and Kaveri—a feature-rich Eclipse-based GUI front end for Indus slicing. For Indus, we describe the underlying tool architecture, analysis components, and program dependence capabilities required for slicing. In addition, we present a collection of advanced features useful for effective slicing of Java programs including calling-context sensitive slicing, scoped slicing, control slicing, and chopping. For Kaveri, we discuss the design goals and basic capabilities of the graphical facilities integrated into a Java development environment to present the slicing information. This paper is an extended version of a tool demonstration paper presented at the International Conference on Fundamental Aspects of Software Engineering (FASE 2005). Thus, the paper highlights tool capabilities and engineering issues and refers the reader to other papers for technical details.
TL;DR: This paper studies the effectiveness of fault location using dynamic slicing for a set of real bugs reported in some widely used software programs and observes that dynamic slicing considerably reduced the subset of program statements that needed to be examined to locate faulty statements.
Abstract: Dynamic slicing algorithms have been considered to aid in debugging for many years. However, as far as we know, no detailed studies on evaluating the benefits of using dynamic slicing for locating real faults present in programs have been carried out. In this paper we study the effectiveness of fault location using dynamic slicing for a set of real bugs reported in some widely used software programs. Our results show that of the 19 faults studied, 12 faults were captured by data slices, 7 required the use of full slices, and none of them required the use of relevant slices. Moreover, it was observed that dynamic slicing considerably reduced the subset of program statements that needed to be examined to locate faulty statements. Interestingly, we observed that all of the memory bugs in the faulty versions were captured by data slices. The dynamic slices that captured faulty code included 0.45 to 63.18% of statements that were executed at least once.
TL;DR: The complexity of today's hundreds-of-million-transistor microprocessors all but guarantees imperfect first silicon, but leaves unanswered the question of what exactly will go wrong.
Abstract: The complexity of today's hundreds-of-million-transistor microprocessors all but guarantees imperfect first silicon, but leaves unanswered the question of what exactly will go wrong. This article describes features added to the Cell Broadband Engine processor to enable debugging in the presence of such unknown events.
TL;DR: A methodology and new algorithms are developed to automate the traditionally manual process of post-silicon debugging; they can automatically repair more than 70% of the benchmark designs.
Abstract: Modern IC designs have reached unparalleled levels of complexity, resulting in more and more bugs discovered after design tape-out. However, so far only very few EDA tools for post-silicon debugging have been reported in the literature. In this work we develop a methodology and new algorithms to automate this debugging process. Key innovations in our technique include support for the physical constraints specific to post-silicon debugging and the ability to repair functional errors through subtle modifications of an existing layout. In addition, our proposed post-silicon debugging methodology (FogClear) can repair some electrical errors while preserving functional correctness. Thus, by automating this traditionally manual debugging process, our contributions promise to reduce engineers' debugging effort. As our empirical results show, we can automatically repair more than 70% of our benchmark designs.
TL;DR: The authors describe an automated software support system with an automated bug-filing and test-case-creation component that checkpoints a client process's initial state and records subsequent state changes while the process passes through a sequence of states that needs to be analyzed, such as one exhibiting a software bug; the recordings are delivered to a development node, where the problem can be debugged without reproducing the client environment by using the recorded state to recreate the initial state of the client program.
Abstract: The invention describes an automated software support system comprising an automated bug-filing and test-case-creation component that checkpoints a client process's initial state and records changes to that state while the client process undergoes a sequence of states that needs to be analyzed, such as one exhibiting a software bug. The recordings are delivered to a development node, where the problem can be debugged without reproducing the client process environment by using the recorded state to recreate the initial state of the client program and by using the recorded log to simulate the client program's execution forwards and backwards.
TL;DR: PROSE performs reversible and systematic changes to running Java applications without requiring them to be shut down; it is motivated by scenarios such as hotfixes, online program instrumentation and debugging, and evolution of critical legacy applications.
Abstract: We present a system that performs reversible and systematic changes to running Java applications without requiring them to be shut down. PROSE is motivated by scenarios such as hotfixes, online program instrumentation and debugging, and evolution of critical legacy applications. Modifications take the form of replacement method bodies, and can use both type-based and regular expression patterns to select code for replacement. New code can make use of replaced method implementations cleanly, facilitating code evolution. Changes are composable, and may be reordered or selectively withdrawn at any time. Furthermore, the modifications are expressed as Java classes, providing additional development benefits. We describe the architecture of PROSE, the challenges of using aggressive inlining to achieve performance, and use standard benchmarks to demonstrate code performance comparable to (or better than) that of compile-time systems from the Aspect-Oriented Programming community.
TL;DR: In this article, a method and system for automatically generating unit test cases for a computer program that can reproduce runtime problems is presented, which includes modifying the computer program according to one or more interested target program units in the program and possibly occurring run time problems.
Abstract: A method and system for automatically generating unit test cases for a computer program that can reproduce runtime problems. The method comprises: modifying the computer program according to one or more interested target program units in the program and possibly occurring run time problems; test executing the modified program; and automatically generating unit test cases according to the interested runtime problems occurring during the execution of the interested target program units. Wherein the modifying step adds captor code and problem detective code into the program, the captor code being configured to record the execution paths and execution contexts of the interested target program units in the program; and the problem detective code being configured to detect the interested unexpected exceptions possibly raised and the interested violations of predefined behavior rules possibly produced by the execution of the program units. The present invention further provides methods and systems for debugging and for regression testing using the above method, and a computer program testing method and system.
TL;DR: DMTracker automatically finds hard-to-detect software bugs that can cause severe problems such as data corruption and deadlocks in parallel programs by detecting abnormal behavior in data movements, and its low runtime overhead indicates that it can be deployed in production runs.
Abstract: While software reliability in large-scale systems becomes increasingly important, debugging in large-scale parallel systems remains a daunting task. This paper proposes an innovative technique to find hard-to-detect software bugs that can cause severe problems such as data corruptions and deadlocks in parallel programs automatically via detecting their abnormal behaviors in data movements. Based on the observation that data movements in parallel programs typically follow certain patterns, our idea is to extract data movement (DM)-based invariants at program runtime and check the violations of these invariants. These violations indicate potential bugs such as data races and memory corruption bugs that manifest themselves in data movements. We have built a tool, called DMTracker, based on the above idea: automatically extract DM-based invariants and detect the violations of them. Our experiments with two real-world bug cases in MVAPICH/MVAPICH2, a popular MPI library, have shown that DMTracker can effectively detect them and report abnormal data movements to help programmers quickly diagnose the root causes of bugs. In addition, DMTracker incurs very low runtime overhead, from 0.9% to 6.0%, in our experiments with High Performance Linpack (HPL) and NAS Parallel Benchmarks (NPB), which indicates that DMTracker can be deployed in production runs.
TL;DR: A data processing apparatus decides whether to allow a debugger to debug a program, based on a verification value used to judge whether to permit the debugging and an access control list that shows whether to permit access to each of the parts constituting the program.
Abstract: A data processing apparatus controls execution of debugging of a program performed by a debugger. The program includes a verification value used for judgment on whether to permit the debugging, and an access control list that shows whether to permit an access to each of the parts constituting the program. The data processing apparatus acquires a debugger ID of the debugger from the debugger, and the verification value and the access control list included in the program. The data processing apparatus judges whether to permit the debugging, according to the result of comparison between the debugger ID and the verification value. The data processing apparatus permits an access to a part of the program to be debugged when the access control list shows that the access is permitted. The data processing apparatus does not permit the access to the part when the access control list shows that the access is not permitted.
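The two-stage decision described in this abstract can be sketched as below. Several details are assumptions made for illustration only: the comparison is modeled as simple equality, the program is represented as a dictionary with hypothetical keys `verification_value` and `acl`, and parts missing from the access control list are denied by default. The source does not specify any of these.

```python
def may_debug(debugger_id, program):
    """Stage 1: permit debugging only if the debugger ID matches.

    Assumption: the comparison is plain equality; the abstract only says
    the judgment depends on the result of a comparison.
    """
    return debugger_id == program["verification_value"]


def may_access(debugger_id, program, part):
    """Stage 2: consult the per-part access control list."""
    if not may_debug(debugger_id, program):
        return False
    # Assumption: parts absent from the list are not accessible.
    return program["acl"].get(part, False)


# Hypothetical program image carrying its own verification value and ACL.
program = {
    "verification_value": "dbg-1234",
    "acl": {"ui_module": True, "license_check": False},
}

allowed_ui = may_access("dbg-1234", program, "ui_module")
allowed_license = may_access("dbg-1234", program, "license_check")
allowed_other_debugger = may_access("dbg-9999", program, "ui_module")
```

Keeping the verification value and the list inside the program, as the abstract describes, lets each program carry its own debugging policy.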
TL;DR: In this paper, the S-PLUS Visual Debugging System (SPVDS) provides a profiler which tracks the number and duration of calls to functions and the amount of memory allocated to variables.
Abstract: Methods and systems for visual debugging of an interpreted language in, for example, an Interactive Development Environment (100) are provided. Example embodiments provide an S-PLUS Visual Debugging System ('SPVDS') (1024), which includes an S-PLUS Workbench Debugger ('SPWD') (1021) that provides 'step-based' visual debugging, enabling programmers (1240) to step through execution of expressions by setting and otherwise managing breakpoints, examining variables and expressions, and controlling execution with commands such as step, step-in, step-out, step-over, continue, and stop. In addition, the SPWD provides a profiler that tracks the number and duration of calls to functions and the amount of memory allocated to variables.
TL;DR: In this paper, techniques and systems for analysis, diagnosis and debugging fabricated hardware designs at a Hardware Description Language (HDL) level are described, which enable the hardware designs within the integrated circuit products to be analyzed and diagnosed at the HDL level at speed.
Abstract: Techniques and systems for analysis, diagnosis and debugging fabricated hardware designs at a Hardware Description Language (HDL) level are described. Although the hardware designs (which were designed in HDL) have been fabricated in integrated circuit products with limited input/output pins, the techniques and systems enable the hardware designs within the integrated circuit products to be analyzed, diagnosed, and debugged at the HDL level at speed. The ability to debug hardware designs at the HDL level facilitates correction or adjustment of the HDL description of the hardware designs. Moreover, various embodiments related to HDL code coverage are described.
TL;DR: A method for performing evolutionary testing (ET) that does not require source code is proposed, useful for third-party testing, verification, and security audits when the source code of the test target will not be provided.
Abstract: Runtime code coverage analysis is feasible and useful when application source code is not available. An evolutionary test tool receiving such statistics can use that information as fitness for pools of sessions to actively learn the interface protocol. We call this activity grey-box fuzzing. We intend to show that, when applicable, grey-box fuzzing is more effective at finding bugs than RFC-compliant or capture-replay mutation black-box tools. This research is focused on building a better, new breed of fuzzer, the impact of which is the discovery of difficult-to-find bugs in real-world applications that are accessible (not theoretical). We have successfully combined an evolutionary approach with a debugged target to get real-time grey-box code coverage (CC) fitness data. We build upon the existing test tool General Purpose Fuzzer (GPF) [8] and the existing reverse engineering and debugging framework PaiMei [10] to accomplish this. We call our new tool the Evolutionary Fuzzing System (EFS), which is the initial realization of my PhD thesis. We have shown that it is possible for our system to learn the target's language (protocol) as target communication sessions become more fit over time. We have also shown that this technique works to find bugs in a real-world application. Initial results are promising, though further testing is still underway. This paper will explain EFS, describing its unique features, and present preliminary results for one test case. We will also discuss ongoing research efforts. First we begin with some background and related works. Previous Evolutionary Testing Work: “Evolutionary Testing uses evolutionary algorithms to search for software test data. For white-box testing criteria, each uncovered structure, for example a program statement or branch, is taken as the individual target of a test data search. With certain types of programs, however, the approach degenerates into a random search, due to a lack of guidance to the required test data. Often this is because the fitness function does not take into account data dependencies within the program under test, and the fact that certain program statements need to have been executed prior to the target structure in order for it to be feasible. For instance, the outcome of a target branching condition may be dependent on a variable having a special value that is only set in a special circumstance, for example a special flag or enumeration value denoting an unusual condition; a unique return value from a function call indicating that an error has occurred; or a counter variable only incremented under certain conditions. Without specific knowledge of such dependencies, the fitness landscape may contain coarse, flat, or even deceptive areas, causing the evolutionary search to stagnate and fail. The problem of flag variables in particular has received much interest from researchers (Baresel et al., 2004; Baresel and Sthamer, 2003; Bottaci, 2002; Harman et al., 2002), but there has been little attention with regards to the broader problem as described [1].” The above quote is from a McMinn paper that is pushing forward the field of traditional evolutionary testing. However, in this paper we propose a method for performing evolutionary testing (ET) that does not require source code. This is useful for third-party testing, verification, and security audits when the source code of the test target will not be provided. Our approach is to track the portions of code executed (“hits”) during runtime via a debugger. Prior static analysis of the compiled code allows the debugger to set breakpoints on functions (funcs) or basic blocks (BBs). We partially overcome the traditional problems of evolutionary testing by the use of a seed file, which gives the evolutionary algorithm hints about the nature of the protocol to learn. Our approach works differently from traditional ET in two important ways: (1) We use a grey-box style of testing that allows us to proceed without source code. (2) We search for sequences of test data, known as sessions, which fully define the documented and undocumented features of the interface under test (protocol discovery). This is very similar to finding test data to cover every source code branch via ET. However, the administration of discovered test data happens during the search. Thus, test results are discovered as our algorithm runs. Robustness issues are recorded in the form of crash files and MySQL data, and can be further explored for exploitable conditions while the algorithm continues to run.
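The core loop described above, in which code-coverage hits serve as the fitness that guides mutation of sessions starting from a seed, can be illustrated with a toy sketch. This is not EFS, GPF, or PaiMei code. The in-process `toy_target`, its block names, the single-byte mutation operator, and the elitist selection scheme are all assumptions chosen to keep the example self-contained; a real setup would gather hits from debugger breakpoints on functions or basic blocks.

```python
import random


def toy_target(session: bytes) -> set:
    """Hypothetical target: returns the set of 'basic blocks' a session hits."""
    hits = {"entry"}
    if session.startswith(b"USER"):
        hits.add("user_cmd")
        if b" " in session:
            hits.add("user_arg")
            if len(session) > 16:
                hits.add("long_arg")
    return hits


def fitness(session: bytes) -> int:
    """Code-coverage fitness: the number of distinct blocks hit."""
    return len(toy_target(session))


def mutate(session: bytes, rng: random.Random) -> bytes:
    """Insert or overwrite one random printable byte."""
    data = bytearray(session)
    pos = rng.randrange(len(data) + 1)
    byte = rng.randrange(32, 127)
    if pos == len(data) or rng.random() < 0.5:
        data.insert(pos, byte)
    else:
        data[pos] = byte
    return bytes(data)


def evolve(seed: bytes, generations: int = 50, pool_size: int = 20, rng_seed: int = 0):
    """Evolve sessions from a seed; return the fittest session and the
    best fitness recorded after each generation."""
    rng = random.Random(rng_seed)
    pool = [seed]
    history = []
    for _ in range(generations):
        children = [mutate(rng.choice(pool), rng) for _ in range(pool_size)]
        # Elitism: keep the fittest sessions from parents and children combined.
        pool = sorted(pool + children, key=fitness, reverse=True)[:pool_size]
        history.append(fitness(pool[0]))
    return pool[0], history


best, history = evolve(b"USER")
print(best, history[-1])
```

Because parents compete with their children for a place in the pool, the best fitness never decreases from one generation to the next. The seed `b"USER"` plays the role of the seed file: it already reaches the first protocol branch, so the search does not have to discover it by chance.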
TL;DR: This work proposes a performance-driven, succinct and parametrizable quantified Boolean formula (QBF) satisfiability encoding and its hardware implementation for modeling sequential circuit behavior.
Abstract: Many CAD for VLSI techniques use time-frame expansion, also known as the iterative logic array representation, to model the sequential behavior of a system. Replicating industrial-size designs for many time-frames may impose impractically excessive memory requirements. This work proposes a performance-driven, succinct and parametrizable quantified Boolean formula (QBF) satisfiability encoding and its hardware implementation for modeling sequential circuit behavior. This encoding is then applied to three notable CAD problems, namely bounded model checking (BMC), sequential test generation and design debugging. Extensive experiments on industrial circuits confirm outstanding run-time and memory gains compared to state-of-the-art techniques, promoting the use of QBF in CAD for VLSI.