Runtime error detection

Papers

Proceedings Article•10.5555/2388996.2389037•

MPI runtime error detection with MUST: advances in deadlock detection

Tobias Hilbrich¹, Joachim Protze¹, Martin Schulz², Bronis R. de Supinski², Matthias S. Müller¹

Dresden University of Technology¹, Lawrence Livermore National Laboratory²

10 Nov 2012

TL;DR: Presents the Marmot Umpire Scalable Tool (MUST), which detects MPI programming errors with significantly increased scalability, together with improvements to the graph-based deadlock detection approach for MPI that cover future MPI extensions.

Abstract: The widely used Message Passing Interface (MPI) is complex and rich. As a result, application developers require automated tools to avoid and to detect MPI programming errors. We present the Marmot Umpire Scalable Tool (MUST) that detects such errors with significantly increased scalability. We present improvements to our graph-based deadlock detection approach for MPI, which cover future MPI extensions. Our enhancements also check complex MPI constructs that no previous graph-based detection approach handled correctly. Finally, we present optimizations for the processing of MPI operations that reduce runtime deadlock detection overheads. Existing approaches often require O(p) analysis time per MPI operation, for p processes. We empirically observe that our improvements lead to sub-linear or better analysis time per operation for a wide range of real world applications.
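The idea behind graph-based deadlock detection can be illustrated with a minimal sketch. This is not MUST's algorithm: it assumes a plain wait-for graph in which every blocked rank needs all of the ranks it waits on (AND semantics), and the function name `find_deadlock` and the input format are hypothetical. Under that assumption, a cycle in the graph corresponds to a deadlock.

```python
def find_deadlock(wait_for):
    """Return a list of ranks that form a wait-for cycle, or None.

    wait_for maps a rank to the set of ranks it is blocked on.
    Assumption: AND semantics, so a blocked rank needs every target.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {rank: WHITE for rank in wait_for}
    path = []

    def visit(rank):
        color[rank] = GREY
        path.append(rank)
        for target in sorted(wait_for.get(rank, ())):
            state = color.get(target, WHITE)
            if state == GREY:
                # target is already on the current path: cycle found
                return path[path.index(target):]
            if state == WHITE:
                cycle = visit(target)
                if cycle:
                    return cycle
        path.pop()
        color[rank] = BLACK
        return None

    for rank in sorted(wait_for):
        if color[rank] == WHITE:
            cycle = visit(rank)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    # Ranks 0, 1 and 2 each wait on the next one in a ring.
    print(find_deadlock({0: {1}, 1: {2}, 2: {0}}))
    # Rank 0 waits on rank 1, which is not blocked.
    print(find_deadlock({0: {1}, 1: set()}))
```

A full search like this can visit every rank and edge, so running it naively after each MPI operation would cost at least linear time in the number of processes, which is the kind of overhead the abstract sets out to reduce.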

73 citations

Proceedings Article•10.1145/1273463.1273478•

Variably interprocedural program analysis for runtime error detection

Aaron Tomb¹, Guillaume Brat, Willem Visser

University of California, Santa Cruz¹

9 Jul 2007

TL;DR: An analysis approach based on a combination of static and dynamic techniques to find run-time errors in Java code that uses symbolic execution to find constraints under which an error may occur and then solves these constraints to find test inputs that may expose the error.

Abstract: This paper describes an analysis approach based on a combination of static and dynamic techniques to find run-time errors in Java code. It uses symbolic execution to find constraints under which an error (e.g. a null pointer dereference, array out of bounds access, or assertion violation) may occur and then solves these constraints to find test inputs that may expose the error. It only alerts the user to the possibility of a real error when it detects the expected exception during a program run. The analysis is customizable in two important ways. First, we can adjust how deeply to follow calls from each top-level method. Second, we can adjust the path termination condition for the symbolic execution engine to be either a bound on the path condition length or a bound on the number of times each instruction can be revisited. We evaluated the tool on a set of benchmarks from the literature as well as a number of real-world systems that range in size from a few thousand to 50,000 lines of code. The tool discovered all known errors in the benchmarks (as well as some not previously known) and reported on average 8 errors per 1000 lines of code for the industrial examples. In both cases the interprocedural call depth played little role in the error detection. That is, an intraprocedural analysis seems adequate for the class of errors we detect.
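The workflow in this abstract (derive a constraint under which an error may occur, solve it for a test input, then report only if a run raises the expected exception) can be sketched as a toy example. This is not the paper's tool: it is written in Python rather than Java, the constraint is derived by hand instead of by symbolic execution, and a brute-force search over a small integer range stands in for a constraint solver. All names (`lookup`, `error_constraint`, `solve`, `confirm`) are hypothetical.

```python
from itertools import product


def lookup(table, i, j):
    # Program under test: indexes out of range when i + j >= len(table).
    if i >= 0 and j >= 0:
        return table[i + j]
    return None


def error_constraint(i, j, size):
    # Hand-derived path condition for reaching the indexing statement
    # with an index past the end of a table of the given size.
    return i >= 0 and j >= 0 and i + j >= size


def solve(constraint, size, bound=4):
    # Stand-in for a constraint solver: exhaustive search over
    # the integers in [-bound, bound] for both inputs.
    for i, j in product(range(-bound, bound + 1), repeat=2):
        if constraint(i, j, size):
            return i, j
    return None


def confirm(table, candidate):
    # Dynamic step: report an error only if the run really raises it.
    try:
        lookup(table, *candidate)
    except IndexError:
        return True
    return False


if __name__ == "__main__":
    table = [10, 20, 30]
    candidate = solve(error_constraint, len(table))
    if candidate is not None and confirm(table, candidate):
        print("confirmed out-of-bounds access for inputs", candidate)
```

The confirmation step is what keeps unconfirmed warnings away from the user: a candidate input that does not trigger the exception at run time is simply not reported.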

67 citations

Book Chapter•10.1007/978-3-642-11261-4_5•

MUST: A Scalable Approach to Runtime Error Detection in MPI Programs

Tobias Hilbrich, Martin Schulz¹, Bronis R. de Supinski¹, Matthias S. Müller²

Lawrence Livermore National Laboratory¹, Dresden University of Technology²

24 Mar 2010

TL;DR: A novel framework for scalable MPI correctness tools that uses PnMPI to instantiate a tool from a set of individual modules and allows correctness tools built upon it to adapt to different architectures and use cases.

Abstract: The Message-Passing Interface (MPI) is large and complex. Therefore, programming MPI is error prone. Several MPI runtime correctness tools address classes of usage errors, such as deadlocks or non-portable constructs. To our knowledge, none of these tools scales to more than about 100 processes. However, some of the current HPC systems use more than 100,000 cores and future systems are expected to use far more. Since errors often depend on the task count used, we need correctness tools that scale to the full system size. We present a novel framework for scalable MPI correctness tools to address this need. Our fine-grained, module-based approach supports rapid prototyping and allows correctness tools built upon it to adapt to different architectures and use cases. The design uses PnMPI to instantiate a tool from a set of individual modules. We present an overview of our design, along with first performance results for a proof of concept implementation.

65 citations

Proceedings Article•10.5555/1129601.1129750•

Complementary use of runtime validation and model checking

A. A. Bayazit¹, Sharad Malik¹

Princeton University¹

31 May 2005

TL;DR: This paper considers the use of on-chip hardware for detecting bugs using hardware assertions and examines the strengths and weaknesses of runtime validation and how it may be used to complement model checking in a hybrid methodology.

Abstract: The increasing gap between design complexity and compute power for verification necessitates radically new solutions to meet the verification challenges for future generations of hardware designs. Increasingly it will not be possible to completely validate hardware prior to fabrication. We will need to reconcile ourselves to the fact that hardware, like software, will be shipped with bugs. However, this can be acceptable with appropriate mechanisms for runtime validation that detect bugs and recover from them when needed. This paper takes a significant step in examining runtime validation as part of the verification methodology. It examines the strengths and weaknesses of runtime validation and how it may be used to complement model checking in a hybrid methodology. We consider the use of on-chip hardware for detecting bugs using hardware assertions. These assertions may be used for validating abstractions and assumptions for use in offline model checking. Hardware based assertions monitor properties at runtime and do not suffer from the state explosion problem. Offline model checking is used to validate globally distributed properties where runtime error detection has limitations in monitoring and responding to signals separated by many clock cycles. In this case the hardware based runtime validated abstractions and assumptions help in reducing the state space for model checking. Our ideas are demonstrated on a highly concurrent, yet simple to understand token sharing protocol, as well as a fairly complex cache coherence system.

49 citations

Proceedings Article•10.1109/IPDPS.2012.123•

GTI: A Generic Tools Infrastructure for Event-Based Tools in Parallel Systems

Tobias Hilbrich, Matthias S. Müller, Bronis R. de Supinski¹, Martin Schulz¹, Wolfgang E. Nagel

Lawrence Livermore National Laboratory¹

21 May 2012

TL;DR: Presents the Generic Tool Infrastructure (GTI), whose abstractions and code generation facilities ease many hurdles in tool development, including wrapper generation, tool communication, trace reductions, and filters, allowing tool developers to focus on implementing tool functionality instead of the surrounding infrastructure.

Abstract: Runtime detection of semantic errors in MPI applications supports efficient and correct large-scale application development. However, current approaches scale to at most one thousand processes and design limitations prevent increased scalability. The need for global knowledge for analyses such as type matching and deadlock detection presents a major challenge. We present a scalable tool infrastructure -- the Generic Tool Infrastructure (GTI) -- that we will use to implement MPI runtime error detection tools and that applies to other use cases. GTI supports simple offloading of tool processing onto extra processes or threads and provides a tree-based overlay network (TBON) for creating scalable tools that analyze global knowledge. We present its abstractions and code generation facilities that ease many hurdles in tool development, including wrapper generation, tool communication, trace reductions, and filters. GTI ultimately allows tool developers to focus on implementing tool functionality instead of the surrounding infrastructure. Further, we demonstrate that GTI supports scalable tool development through a lost message detector and a phase profiler. The former provides a more scalable implementation of important base functionality for MPI correctness checking, while the latter tool demonstrates that GTI can serve as the basis of further types of tools. Experiments with up to 2048 cores show that GTI's scalability features apply to both tools.
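A tree-based overlay network gathers global knowledge by combining values level by level instead of sending everything to one node. The sketch below simulates that reduction in a single Python process. It is not GTI's API: the names `tbon_reduce` and `add_counts`, the fan-in parameter, and the use of per-process (sent, received) message counts as a crude lost-message check are all assumptions made for illustration.

```python
from functools import reduce


def add_counts(a, b):
    # Combine two (sent, received) message-count pairs.
    return (a[0] + b[0], a[1] + b[1])


def tbon_reduce(leaf_values, fan_in, combine):
    """Combine leaf values level by level, as a tree of tool nodes might.

    Returns the value at the root and the number of tree levels used.
    """
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = list(leaf_values)
    if not level:
        raise ValueError("need at least one leaf value")
    depth = 0
    while len(level) > 1:
        # Each parent node combines the values of up to fan_in children.
        level = [
            reduce(combine, level[i:i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
        depth += 1
    return level[0], depth


if __name__ == "__main__":
    # 2048 simulated processes, each reporting one send and one receive.
    counts = [(1, 1)] * 2048
    (sent, received), depth = tbon_reduce(counts, 16, add_counts)
    print("levels:", depth, "unmatched messages:", sent - received)
```

With a fan-in of 16, the 2048 leaves are reduced in three levels, and no single node combines more than 16 values at a time.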

23 citations

Year	Papers
2021	1
2020	1
2018	3
2015	1
2014	1
2013	4
