Runtime Predictive Analysis
Runtime predictive analysis (or predictive analysis) is a runtime verification technique in computer science for detecting property violations in program executions inferred from an observed execution. An important class of predictive analysis methods has been developed for detecting concurrency errors (such as data races) in concurrent programs, where a runtime monitor is used to predict errors which did not occur in the observed run, but can occur in an alternative execution of the same program. The predictive capability comes from the fact that the analysis is performed on an abstract model extracted online from the observed execution, which admits a class of executions beyond the observed one.
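For example, consider the following hypothetical two-thread Java program (the class, field, and lock names are illustrative, not taken from any particular source). In a run where the first thread finishes before the second starts, the two unsynchronized accesses to x are separated by the lock operations and no race is observed; a predictive analysis can nevertheless infer a feasible reordering in which they race.

public class PredictedRace {
    static int x = 0;                       // shared, unsynchronized field
    static final Object lock = new Object();

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> {
            x = 1;                          // e1: unprotected write to x
            synchronized (lock) {           // e2/e3: acquire and release lock
                // unrelated critical section
            }
        });
        Thread t2 = new Thread(() -> {
            synchronized (lock) {           // e4/e5: acquire and release lock
                // unrelated critical section
            }
            int local = x;                  // e6: unprotected read of x
        });
        t1.start(); t2.start();
        t1.join(); t2.join();
    }
}

In the observed trace e1 e2 e3 e4 e5 e6, the conflicting accesses e1 and e6 are ordered through the lock. The reordering e4 e5 e6 e1 (dropping the rest of the first thread) still respects program order and lock semantics, and in it e1 and e6 are adjacent, so the race on x can be correctly predicted.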


Overview

Informally, given an execution t, predictive analysis checks for errors in reordered traces t' of t. A trace t' is called ''feasible'' from t (alternatively, a ''correct reordering'' of t) if any program that can generate t can also generate t'. In the context of concurrent programs, a predictive technique is ''sound'' if it only predicts concurrency errors in ''feasible'' executions of the causal model of the observed trace. Assuming the analysis has no knowledge of the program's source code, the analysis is ''complete'' (also called ''maximal'') if the inferred class of executions contains all executions that have the same program order and communication order prefix as the observed trace.
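As a small illustration (again hypothetical, not drawn from a specific source), consider a trace t = e1 e2 e3 produced by the following program, where e1 is the write to x, e2 is the read of x, and e3 is the write to z.

public class Reordering {
    static int x = 0, z = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> {
            x = 1;                  // e1
        });
        Thread t2 = new Thread(() -> {
            int v = x;              // e2: reads 1 (from e1) in the observed run
            if (v == 1) {
                z = 1;              // e3: executed only because e2 read 1
            }
        });
        t1.start(); t2.start();
        t1.join(); t2.join();
    }
}

The reordering t' = e2 e1 e3 is not a correct reordering of t: a program generating it would have e2 read 0, and this particular program would then never execute e3. The shorter reordering e2 e1, which drops e3, is feasible, because e2 is the last retained event of its thread and its value is free to change. A sound analysis may therefore report the race between e1 and e2, but must not justify a report with the infeasible trace e2 e1 e3.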


Applications

Predictive analysis has been applied to detect a wide class of concurrency errors, including:
* Data races
* Deadlocks
* Atomicity violations (a small example is sketched after this list)
* Order violations, e.g., use-after-free errors
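For instance, the following hypothetical Java sketch (class and method names are illustrative) contains an atomicity violation but no data race; a predictive detector may report it even when the observed run happens to serialize the two withdrawals.

public class Account {
    private int balance = 100;
    private final Object lock = new Object();

    int read()      { synchronized (lock) { return balance; } }
    void set(int v) { synchronized (lock) { balance = v; } }

    // Intended to be atomic, but the lock is released between the read
    // and the write, so two concurrent withdrawals can both pass the
    // check before either one updates the balance.
    void withdraw(int amount) {
        int b = read();
        if (b >= amount) {
            set(b - amount);
        }
    }
}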


Implementation

As is typical with dynamic program analysis, predictive analysis first instruments the source program. At runtime, the analysis can be performed online, in order to detect errors on the fly. Alternatively, the instrumentation can simply dump the execution trace for offline analysis. The latter approach is preferred for expensive, refined predictive analyses that require random access to the execution trace or take more than linear time.
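As a sketch of the offline approach (the event format and class name are assumptions, not a standard interface), the instrumentation might simply append one line per event to a log that an analyzer later replays.

import java.io.IOException;
import java.io.PrintWriter;

public final class TraceLogger {
    private static PrintWriter out;

    public static synchronized void open(String path) throws IOException {
        out = new PrintWriter(path);
    }

    // Instrumentation inserted around program operations calls these hooks.
    public static synchronized void read(long tid, String var)  { emit(tid, "rd", var); }
    public static synchronized void write(long tid, String var) { emit(tid, "wr", var); }
    public static synchronized void acquire(long tid, String l) { emit(tid, "acq", l); }
    public static synchronized void release(long tid, String l) { emit(tid, "rel", l); }

    private static void emit(long tid, String op, String target) {
        out.printf("%d %s %s%n", tid, op, target);   // one event per line
    }

    public static synchronized void close() { out.close(); }
}

Instrumented code would then call, for example, TraceLogger.write(Thread.currentThread().getId(), "x") immediately before each write to the field x, and the resulting file can be fed to an offline analyzer.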


Incorporating data and control-flow analysis

Static analysis can first be conducted to gather data and control-flow dependence information about the source program, which helps construct the causal model during the online execution. This allows predictive analysis to infer a larger class of executions based on the observed execution. Intuitively, a feasible reordering can change the last writer of a memory read (data dependence) if the read, in turn, cannot affect whether any accesses execute (control dependence).
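For example (a hypothetical sketch, with illustrative names), whether a read may be given a different writer in a reordering depends on what its value flows into.

class FlowExample {
    static int x = 0, y = 0, z = 0;

    static void readerA() {
        y = x;            // the value read is stored but never branched on:
                          // with data/control-flow information, a reordering
                          // may change which write this read observes
    }

    static void readerB() {
        if (x == 1) {     // the value read decides whether z is written:
            z = 1;        // a sound analysis without source knowledge must keep
        }                 // this read's writer fixed if the write to z is retained
    }
}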


Approaches


Partial order based techniques

Partial order based techniques are most often employed for online race detection. At runtime, a partial order over the events in the trace is constructed, and any unordered pair of conflicting events (accesses to the same memory location, at least one of which is a write) is reported as a race. Many predictive techniques for race detection are based on the happens-before relation or a weakened version of it. Such techniques can typically be implemented efficiently with vector clock algorithms, requiring only a single pass over the input trace as it is being generated, and are thus suitable for online deployment.
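The following is a minimal sketch of the vector-clock machinery, in the spirit of the classic Djit+/FastTrack happens-before detectors but without their epoch optimizations (the class name, the two-thread limit, and the event hooks are assumptions made for brevity); predictive detectors based on weaker partial orders reuse the same single-pass structure with additional bookkeeping.

import java.util.HashMap;
import java.util.Map;

public class VectorClockDetector {
    static final int THREADS = 2;

    static int[] copy(int[] v) { return v.clone(); }
    static void joinInto(int[] dst, int[] src) {
        for (int i = 0; i < dst.length; i++) dst[i] = Math.max(dst[i], src[i]);
    }
    static boolean leq(int[] a, int[] b) {
        for (int i = 0; i < a.length; i++) if (a[i] > b[i]) return false;
        return true;
    }

    final int[][] C = new int[THREADS][THREADS];          // clock of each thread
    final Map<String, int[]> L = new HashMap<>();          // clock of each lock
    final Map<String, int[]> lastWrite = new HashMap<>();  // write clock per variable
    final Map<String, int[]> lastRead  = new HashMap<>();  // read clock per variable

    public VectorClockDetector() {
        for (int t = 0; t < THREADS; t++) C[t][t] = 1;
    }

    void acquire(int t, String lock) {
        joinInto(C[t], L.computeIfAbsent(lock, k -> new int[THREADS]));
    }
    void release(int t, String lock) {
        L.put(lock, copy(C[t]));
        C[t][t]++;                                          // advance after release
    }
    void write(int t, String var) {
        int[] w = lastWrite.get(var), r = lastRead.get(var);
        if ((w != null && !leq(w, C[t])) || (r != null && !leq(r, C[t])))
            System.out.println("race on " + var + " at write by thread " + t);
        lastWrite.put(var, copy(C[t]));
    }
    void read(int t, String var) {
        int[] w = lastWrite.get(var);
        if (w != null && !leq(w, C[t]))
            System.out.println("race on " + var + " at read by thread " + t);
        joinInto(lastRead.computeIfAbsent(var, k -> new int[THREADS]), C[t]);
    }
}

On the example trace from the lead section (write x, lock, unlock by thread 0; lock, unlock, read x by thread 1) this detector reports nothing, since the lock orders the two accesses in the observed run; predictive detectors replace happens-before with a weaker partial order so that such lock-ordered but reorderable accesses remain unordered and the race is still reported.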


SMT-based techniques

SMT encodings allow the analysis to extract a refined causal model from an execution trace, as a (possibly very large) mathematical formula. Furthermore, control-flow information can be incorporated into the model. SMT-based techniques can achieve soundness and completeness (also called ''maximal causality''), but have exponential-time complexity with respect to the trace size. In practice, the analysis is typically applied to bounded segments of an execution trace, thus trading completeness for scalability.
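A typical encoding (the notation below is illustrative and not tied to any particular tool) introduces an integer order variable O_e for each event e in the analyzed window and conjoins constraints of roughly the following shape:
* Program order: O_e < O_{e'} for consecutive events e, e' of the same thread.
* Lock semantics: O_{r_1} < O_{a_2} \lor O_{r_2} < O_{a_1} for any two critical sections (a_1, r_1) and (a_2, r_2) on the same lock in different threads.
* Read consistency: O_w < O_r \land \bigwedge_{w' \neq w} (O_{w'} < O_w \lor O_r < O_{w'}) whenever a read r must keep its observed writer w; this is where data and control-flow information can relax the constraint.
* Error condition: for a candidate pair of conflicting accesses e_1, e_2, assert that they are adjacent, e.g. |O_{e_1} - O_{e_2}| = 1.
A satisfying assignment orders the events into a concrete witness reordering exhibiting the error; unsatisfiability means no error of that shape exists within the analyzed window.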


Lockset-based approaches

In the context of data race detection for programs using lock-based synchronization, lockset-based techniques provide an unsound, yet lightweight, mechanism for detecting data races. These techniques primarily detect violations of the lockset principle, which says that all accesses to a given memory location must be protected by a common lock. Such techniques are also used to filter out candidate race reports in more expensive analyses.
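A minimal sketch of an Eraser-style lockset check is shown below (class and method names are illustrative; the state-machine refinements of the original algorithm, such as handling of initialization and read-shared data, are omitted).

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LocksetDetector {
    // Locks currently held by each thread.
    private final Map<Long, Set<String>> held = new HashMap<>();
    // Candidate lockset per variable: locks that protected every access so far.
    private final Map<String, Set<String>> candidates = new HashMap<>();

    public void acquire(long tid, String lock) {
        held.computeIfAbsent(tid, t -> new HashSet<>()).add(lock);
    }

    public void release(long tid, String lock) {
        held.getOrDefault(tid, new HashSet<>()).remove(lock);
    }

    // Called on every read or write of a shared variable.
    public void access(long tid, String var) {
        Set<String> heldNow = held.getOrDefault(tid, new HashSet<>());
        Set<String> cand = candidates.get(var);
        if (cand == null) {
            candidates.put(var, new HashSet<>(heldNow));   // first access seen
            return;
        }
        cand.retainAll(heldNow);                           // intersect with held locks
        if (cand.isEmpty()) {
            System.out.println("lockset violation: no common lock protects " + var);
        }
    }
}

Because the check ignores ordering established by other synchronization (e.g. thread start/join), an empty candidate set does not always correspond to a feasible racy execution, which is why such reports are often treated only as candidates for a more precise analysis.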


Graph-based techniques

In the context of data race detection, sound polynomial-time predictive analyses have been developed with good, close-to-maximal predictive capability, based on graphs.


Computational complexity

Given an input trace of size n executed by k threads, general race prediction is NP-complete and even W[1]-hard when parameterized by k, but admits a polynomial-time algorithm when the communication topology is acyclic. Happens-before races can be detected in O(n\cdot k) time, and this bound is optimal. Lockset races over d variables can be detected in O(n\cdot d) time, and this bound is also optimal.


Tools

Here is a partial list of tools that use predictive analyses to detect concurrency errors, sorted alphabetically.
* : a lightweight framework for implementing dynamic race detection engines.
* : a dynamic analysis framework designed to facilitate rapid prototyping and experimentation with dynamic analyses for concurrent Java programs.
* : SMT-based predictive race detection.
* : SMT-based predictive use-after-free detection.


See also

* Model checking
* Dynamic program analysis
* Runtime verification

