SMT-LIB
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
and
mathematical logic Mathematical logic is the study of Logic#Formal logic, formal logic within mathematics. Major subareas include model theory, proof theory, set theory, and recursion theory (also known as computability theory). Research in mathematical logic com ...
, satisfiability modulo theories (SMT) is the
problem Problem solving is the process of achieving a goal by overcoming obstacles, a frequent part of most activities. Problems in need of solutions range from simple personal tasks (e.g. how to turn on an appliance) to complex issues in business an ...
of determining whether a mathematical formula is satisfiable. It generalizes the
Boolean satisfiability problem In logic and computer science, the Boolean satisfiability problem (sometimes called propositional satisfiability problem and abbreviated SATISFIABILITY, SAT or B-SAT) asks whether there exists an Interpretation (logic), interpretation that Satisf ...
(SAT) to more complex formulas involving
real numbers In mathematics, a real number is a number that can be used to measurement, measure a continuous variable, continuous one-dimensional quantity such as a time, duration or temperature. Here, ''continuous'' means that pairs of values can have arbi ...
,
integers An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
, and/or various
data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
s such as
list A list is a Set (mathematics), set of discrete items of information collected and set forth in some format for utility, entertainment, or other purposes. A list may be memorialized in any number of ways, including existing only in the mind of t ...
s,
arrays An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
, bit vectors, and strings. The name is derived from the fact that these expressions are interpreted within ("modulo") a certain formal theory in first-order logic with equality (often disallowing quantifiers). SMT solvers are tools that aim to solve the SMT problem for a practical subset of inputs. SMT solvers such as Z3 and cvc5 have been used as a building block for a wide range of applications across computer science, including in
automated theorem proving Automated theorem proving (also known as ATP or automated deduction) is a subfield of automated reasoning and mathematical logic dealing with proving mathematical theorems by computer programs. Automated reasoning over mathematical proof was a majo ...
,
program analysis In computer science, program analysis is the process of analyzing the behavior of computer programs regarding a property such as correctness, robustness, safety and liveness. Program analysis focuses on two major areas: program optimization an ...
,
program verification In the context of hardware and software systems, formal verification is the act of proving or disproving the correctness of a system with respect to a certain formal specification or property, using formal methods of mathematics. Formal ver ...
, and
software testing Software testing is the act of checking whether software satisfies expectations. Software testing can provide objective, independent information about the Quality (business), quality of software and the risk of its failure to a User (computin ...
. Since Boolean satisfiability is already
NP-complete In computational complexity theory, NP-complete problems are the hardest of the problems to which ''solutions'' can be verified ''quickly''. Somewhat more precisely, a problem is NP-complete when: # It is a decision problem, meaning that for any ...
, the SMT problem is typically
NP-hard In computational complexity theory, a computational problem ''H'' is called NP-hard if, for every problem ''L'' which can be solved in non-deterministic polynomial-time, there is a polynomial-time reduction from ''L'' to ''H''. That is, assumi ...
, and for many theories it is undecidable. Researchers study which theories or subsets of theories lead to a decidable SMT problem and the
computational complexity In computer science, the computational complexity or simply complexity of an algorithm is the amount of resources required to run it. Particular focus is given to computation time (generally measured by the number of needed elementary operations ...
of decidable cases. The resulting decision procedures are often implemented directly in SMT solvers; see, for instance, the decidability of Presburger arithmetic. SMT can be thought of as a constraint satisfaction problem and thus a certain formalized approach to
constraint programming Constraint programming (CP) is a paradigm for solving combinatorial problems that draws on a wide range of techniques from artificial intelligence, computer science, and operations research. In constraint programming, users declaratively state t ...
.


Terminology and examples

Formally speaking, an SMT instance is a
formula In science, a formula is a concise way of expressing information symbolically, as in a mathematical formula or a ''chemical formula''. The informal use of the term ''formula'' in science refers to the general construct of a relationship betwe ...
in
first-order logic First-order logic, also called predicate logic, predicate calculus, or quantificational logic, is a collection of formal systems used in mathematics, philosophy, linguistics, and computer science. First-order logic uses quantified variables over ...
, where some function and predicate symbols have additional interpretations, and SMT is the problem of determining whether such a formula is satisfiable. In other words, imagine an instance of the
Boolean satisfiability problem In logic and computer science, the Boolean satisfiability problem (sometimes called propositional satisfiability problem and abbreviated SATISFIABILITY, SAT or B-SAT) asks whether there exists an Interpretation (logic), interpretation that Satisf ...
(SAT) in which some of the binary variables are replaced by predicates over a suitable set of non-binary variables. A predicate is a binary-valued function of non-binary variables. Example predicates include linear inequalities (e.g., 3x + 2y - z \geq 4) or equalities involving uninterpreted terms and function symbols (e.g., f(f(u, v), v) = f(u, v) where f is some unspecified function of two arguments). These predicates are classified according to each respective theory assigned. For instance, linear inequalities over real variables are evaluated using the rules of the theory of linear real
arithmetic Arithmetic is an elementary branch of mathematics that deals with numerical operations like addition, subtraction, multiplication, and division. In a wider sense, it also includes exponentiation, extraction of roots, and taking logarithms. ...
, whereas predicates involving uninterpreted terms and function symbols are evaluated using the rules of the theory of uninterpreted functions with equality (sometimes referred to as the empty theory). Other theories include the theories of
arrays An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
and
list A list is a Set (mathematics), set of discrete items of information collected and set forth in some format for utility, entertainment, or other purposes. A list may be memorialized in any number of ways, including existing only in the mind of t ...
structures (useful for modeling and verifying
computer program A computer program is a sequence or set of instructions in a programming language for a computer to Execution (computing), execute. It is one component of software, which also includes software documentation, documentation and other intangibl ...
s), and the theory of bit vectors (useful in modeling and verifying hardware designs). Subtheories are also possible: for example, difference logic is a sub-theory of linear arithmetic in which each inequality is restricted to have the form x - y > c for variables x and y and constant c. The examples above show the use of Linear Integer Arithmetic over inequalities. Other examples include: Most SMT solvers support only quantifier-free fragments of their logics.


Relationship to automated theorem proving

There is substantial overlap between SMT solving and
automated theorem proving Automated theorem proving (also known as ATP or automated deduction) is a subfield of automated reasoning and mathematical logic dealing with proving mathematical theorems by computer programs. Automated reasoning over mathematical proof was a majo ...
(ATP). Generally, automated theorem provers focus on supporting full first-order logic with quantifiers, whereas SMT solvers focus more on supporting various theories (interpreted predicate symbols). ATPs excel at problems with lots of quantifiers, whereas SMT solvers do well on large problems without quantifiers. The line is blurry enough that some ATPs participate in SMT-COMP, while some SMT solvers participate in CASC.


Expressive power

An SMT instance is a generalization of a Boolean SAT instance in which various sets of variables are replaced by predicates from a variety of underlying theories. SMT formulas provide a much richer
modeling language A modeling language is any artificial language that can be used to express data, information or knowledge or systems in a structure that is defined by a consistent set of rules. The rules are used for interpretation of the meaning of components in ...
than is possible with Boolean SAT formulas. For example, an SMT formula allows one to model the datapath operations of a
microprocessor A microprocessor is a computer processor (computing), processor for which the data processing logic and control is included on a single integrated circuit (IC), or a small number of ICs. The microprocessor contains the arithmetic, logic, a ...
at the word rather than the bit level. By comparison, answer set programming is also based on predicates (more precisely, on atomic sentences created from
atomic formula In mathematical logic, an atomic formula (also known as an atom or a prime formula) is a formula with no deeper propositional structure, that is, a formula that contains no logical connectives or equivalently a formula that has no strict subformu ...
s). Unlike SMT, answer-set programs do not have quantifiers, and cannot easily express constraints such as linear arithmetic or difference logic—answer set programming is best suited to Boolean problems that reduce to the free theory of uninterpreted functions. Implementing 32-bit integers as bitvectors in answer set programming suffers from most of the same problems that early SMT solvers faced: "obvious" identities such as ''x''+''y''=''y''+''x'' are difficult to deduce. Constraint logic programming does provide support for linear arithmetic constraints, but within a completely different theoretical framework. SMT solvers have also been extended to solve formulas in
higher-order logic In mathematics and logic, a higher-order logic (abbreviated HOL) is a form of logic that is distinguished from first-order logic by additional quantifiers and, sometimes, stronger semantics. Higher-order logics with their standard semantics are m ...
.


Solver approaches

Early attempts for solving SMT instances involved translating them to Boolean SAT instances (e.g., a 32-bit integer variable would be encoded by 32 single-bit variables with appropriate weights and word-level operations such as 'plus' would be replaced by lower-level logic operations on the bits) and passing this formula to a Boolean SAT solver. This approach, which is referred to as ''the eager approach'' (or ''bitblasting''), has its merits: by pre-processing the SMT formula into an equivalent Boolean SAT formula existing Boolean SAT solvers can be used "as-is" and their performance and capacity improvements leveraged over time. On the other hand, the loss of the high-level semantics of the underlying theories means that the Boolean SAT solver has to work a lot harder than necessary to discover "obvious" facts (such as x + y = y + x for integer addition.) This observation led to the development of a number of SMT solvers that tightly integrate the Boolean reasoning of a DPLL-style search with theory-specific solvers (''T-solvers'') that handle conjunctions (ANDs) of predicates from a given theory. This approach is referred to as ''the lazy approach''. Dubbed DPLL(T), this architecture gives the responsibility of Boolean reasoning to the DPLL-based SAT solver which, in turn, interacts with a solver for theory T through a well-defined interface. The theory solver only needs to worry about checking the feasibility of conjunctions of theory predicates passed on to it from the SAT solver as it explores the Boolean search space of the formula. For this integration to work well, however, the theory solver must be able to participate in propagation and conflict analysis, i.e., it must be able to infer new facts from already established facts, as well as to supply succinct explanations of infeasibility when theory conflicts arise. In other words, the theory solver must be incremental and backtrackable.


Decidable theories

Researchers study which theories or subsets of theories lead to a decidable SMT problem and the
computational complexity In computer science, the computational complexity or simply complexity of an algorithm is the amount of resources required to run it. Particular focus is given to computation time (generally measured by the number of needed elementary operations ...
of decidable cases. Since full
first-order logic First-order logic, also called predicate logic, predicate calculus, or quantificational logic, is a collection of formal systems used in mathematics, philosophy, linguistics, and computer science. First-order logic uses quantified variables over ...
is only semidecidable, one line of research attempts to find efficient decision procedures for fragments of first-order logic such as effectively propositional logic. Another line of research involves the development of specialized decidable theories, including linear arithmetic over rationals and
integer An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
s, fixed-width bitvectors,
floating-point arithmetic In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a Sign (mathematics), signed sequence of a fixed number of digits in some Radix, base) multiplied by an integer power of that ba ...
(often implemented in SMT solvers via ''bit-blasting'', i.e., reduction to bitvectors), strings, (co)-datatypes,
sequence In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called ''elements'', or ''terms''). The number of elements (possibly infinite) is cal ...
s (used to model
dynamic array In computer science, a dynamic array, growable array, resizable array, dynamic table, mutable array, or array list is a random access, variable-size list data structure that allows elements to be added or removed. It is supplied with standard l ...
s), finite sets and relations,
separation logic In computer science, separation logic is an extension of Hoare logic, a way of reasoning about programs. It was developed by John C. Reynolds, Peter O'Hearn, Samin Ishtiaq and Hongseok Yang, drawing upon early work by Rod Burstall. The assertio ...
,
finite field In mathematics, a finite field or Galois field (so-named in honor of Évariste Galois) is a field (mathematics), field that contains a finite number of Element (mathematics), elements. As with any field, a finite field is a Set (mathematics), s ...
s, and uninterpreted functions among others. ''Boolean monotonic theories'' are a class of theory that support efficient theory propagation and conflict analysis, enabling practical use within DPLL(T) solvers. Monotonic theories support only boolean variables (boolean is the only ''sort''), and all their functions and predicates obey the axiom p(\ldots,b_,0,b_,\ldots)\implies p(\ldots,b_,1,b_,\ldots) Examples of monotonic theories include graph reachability, collision detection for
convex hull In geometry, the convex hull, convex envelope or convex closure of a shape is the smallest convex set that contains it. The convex hull may be defined either as the intersection of all convex sets containing a given subset of a Euclidean space, ...
s, minimum cuts, and computation tree logic. Every Datalog program can be interpreted as a monotonic theory.


SMT for undecidable theories

Most of the common SMT approaches support decidable theories. However, many real-world systems, such as an aircraft and its behavior, can only be modelled by means of non-linear arithmetic over the real numbers involving transcendental functions. This fact motivates an extension of the SMT problem to non-linear theories, such as determining whether the following equation is satisfiable: : \begin & (\sin(x)^3 = \cos(\log(y)\cdot x) \vee b \vee -x^2 \geq 2.3y) \wedge \left(\neg b \vee y < -34.4 \vee \exp(x) > \right) \end where : b \in , x,y \in . Such problems are, however, undecidable in general. (On the other hand, the theory of
real closed field In mathematics, a real closed field is a field F that has the same first-order properties as the field of real numbers. Some examples are the field of real numbers, the field of real algebraic numbers, and the field of hyperreal numbers. Def ...
s, and thus the full first order theory of the
real number In mathematics, a real number is a number that can be used to measure a continuous one- dimensional quantity such as a duration or temperature. Here, ''continuous'' means that pairs of values can have arbitrarily small differences. Every re ...
s, are decidable using
quantifier elimination Quantifier elimination is a concept of simplification used in mathematical logic, model theory, and theoretical computer science. Informally, a quantified statement "\exists x such that ..." can be viewed as a question "When is there an x such ...
. This is due to
Alfred Tarski Alfred Tarski (; ; born Alfred Teitelbaum;School of Mathematics and Statistics, University of St Andrews ''School of Mathematics and Statistics, University of St Andrews''. January 14, 1901 – October 26, 1983) was a Polish-American logician ...
.) The first order theory of the
natural numbers In mathematics, the natural numbers are the numbers 0, 1, 2, 3, and so on, possibly excluding 0. Some start counting with 0, defining the natural numbers as the non-negative integers , while others start with 1, defining them as the positiv ...
with addition (but not multiplication), called Presburger arithmetic, is also decidable. Since multiplication by constants can be implemented as nested additions, the arithmetic in many computer programs can be expressed using Presburger arithmetic, resulting in decidable formulas. Examples of SMT solvers addressing Boolean combinations of theory atoms from undecidable arithmetic theories over the reals are ABsolver, which employs a classical DPLL(T) architecture with a non-linear optimization packet as (necessarily incomplete) subordinate theory solver
iSAT
building on a unification of DPLL SAT-solving and interval constraint propagation called the iSAT algorithm, and cvc5.


Solvers

The table below summarizes some of the features of the many available SMT solvers. The column "SMT-LIB" indicates compatibility with the SMT-LIB language; many systems marked 'yes' may support only older versions of SMT-LIB, or offer only partial support for the language. The column "CVC" indicates support for the language. The column "DIMACS" indicates support for the DIMACSbr>format
Projects differ not only in features and performance, but also in the viability of the surrounding community, its ongoing interest in a project, and its ability to contribute documentation, fixes, tests and enhancements.


Standardization and the SMT-COMP solver competition

There are multiple attempts to describe a standardized interface to SMT solvers (and automated theorem provers, a term often used synonymously). The most prominent is the SMT-LIB standard, which provides a language based on
S-expression In computer programming, an S-expression (or symbolic expression, abbreviated as sexpr or sexp) is an expression in a like-named notation for nested List (computing), list (Tree (data structure), tree-structured) data. S-expressions were invented ...
s. Other standardized formats commonly supported are the DIMACS format supported by many Boolean SAT solvers, and the CVC format used by the CVC automated theorem prover. The SMT-LIB format also comes with a number of standardized benchmarks and has enabled a yearly competition between SMT solvers called SMT-COMP. Initially, the competition took place during the Computer Aided Verification conference (CAV), but as of 2020 the competition is hosted as part of the SMT Workshop, which is affiliated with the International Joint Conference on Automated Reasoning (IJCAR).


Applications

SMT solvers are useful both for verification, proving the correctness of programs, software testing based on
symbolic execution In computer science, symbolic execution (also symbolic evaluation or symbex) is a means of analyzing a program to determine what inputs cause each part of a program to execute. An interpreter follows the program, assuming symbolic values for i ...
, and for synthesis, generating program fragments by searching over the space of possible programs. Outside of software verification, SMT solvers have also been used for
type inference Type inference, sometimes called type reconstruction, refers to the automatic detection of the type of an expression in a formal language. These include programming languages and mathematical type systems, but also natural languages in some bran ...
and for modelling theoretic scenarios, including modelling actor beliefs in nuclear
arms control Arms control is a term for international restrictions upon the development, production, stockpiling, proliferation and usage of small arms, conventional weapons, and weapons of mass destruction. Historically, arms control may apply to melee wea ...
.


Verification

Computer-aided verification of computer programs often uses SMT solvers. A common technique is to translate preconditions, postconditions, loop conditions, and assertions into SMT formulas in order to determine if all properties can hold. There are many verifiers built on top of the Z3 SMT solver
Boogie
is an intermediate verification language that uses Z3 to automatically check simple imperative programs. Th
VCC
verifier for concurrent C uses Boogie, as well a
Dafny
for imperative object-based programs
Chalice
for concurrent programs, an
Spec#
for C#
F*
is a dependently typed language that uses Z3 to find proofs; the compiler carries these proofs through to produce proof-carrying bytecode. Th
Viper verification infrastructure
encodes verification conditions to Z3. Th
sbv
library provides SMT-based verification of Haskell programs, and lets the user choose among a number of solvers such as Z3, ABC, Boolector, cvc5, MathSAT and Yices. There are also many verifiers built on top of th
Alt-Ergo
SMT solver. Here is a list of mature applications:
Why3
a platform for deductive program verification, uses Alt-Ergo as its main prover; * CAVEAT, a C-verifier developed by CEA and used by Airbus; Alt-Ergo was included in the qualification DO-178C of one of its recent aircraft; *
Frama-C Frama-C is a set of interoperable program analyzers for C programs. The name ''Frama-C'' stands for ''Framework for Modular Analysis of C programs''. Frama-C has been developed by the French Commissariat à l'Énergie Atomique et aux Énergi ...
, a framework to analyse C-code, uses Alt-Ergo in the Jessie and WP plugins (dedicated to "deductive program verification"); * SPARK uses CVC4 and Alt-Ergo (behind GNATprove) to automate the verification of some assertions in SPARK 2014; * Atelier-B can use Alt-Ergo instead of its main prover (increasing success from 84% to 98% on th
ANR Bware project benchmarks
; *
Rodin François Auguste René Rodin (; ; 12 November 184017 November 1917) was a French sculptor generally considered the founder of modern sculpture. He was schooled traditionally and took a craftsman-like approach to his work. Rodin possessed a u ...
, a B-method framework developed by Systerel, can use Alt-Ergo as a back-end;
Cubicle
an open source model checker for verifying safety properties of array-based transition systems.
EasyCrypt
a toolset for reasoning about relational properties of probabilistic computations with adversarial code. Many SMT solvers implement a common interface format calle
SMTLIB2
(such files usually have the extension ".smt2"). Th
LiquidHaskell
tool implements a refinement type based verifier for Haskell that can use any SMTLIB2 compliant solver, e.g. cvc5, MathSat, or Z3.


Symbolic-execution based analysis and testing

An important application of SMT solvers is
symbolic execution In computer science, symbolic execution (also symbolic evaluation or symbex) is a means of analyzing a program to determine what inputs cause each part of a program to execute. An interpreter follows the program, assuming symbolic values for i ...
for analysis and testing of programs (e.g., concolic testing), aimed particularly at finding security vulnerabilities. Example tools in this category includ
SAGE
from
Microsoft Research Microsoft Research (MSR) is the research subsidiary of Microsoft. It was created in 1991 by Richard Rashid, Bill Gates and Nathan Myhrvold with the intent to advance state-of-the-art computing and solve difficult world problems through technologi ...

KLEES2E
an
Triton
SMT solvers that have been used for symbolic-execution applications includ
Z3STP
, th
Z3str family of solvers
an
Boolector


Interactive theorem proving

SMT solvers have been integrated with proof assistants, including Coq and Isabelle/HOL.


See also

* Answer set programming *
Automated theorem proving Automated theorem proving (also known as ATP or automated deduction) is a subfield of automated reasoning and mathematical logic dealing with proving mathematical theorems by computer programs. Automated reasoning over mathematical proof was a majo ...
*
SAT solver In computer science and formal methods, a SAT solver is a computer program which aims to solve the Boolean satisfiability problem (SAT). On input a formula over Boolean data type, Boolean variables, such as "(''x'' or ''y'') and (''x'' or not ''y'' ...
*
First-order logic First-order logic, also called predicate logic, predicate calculus, or quantificational logic, is a collection of formal systems used in mathematics, philosophy, linguistics, and computer science. First-order logic uses quantified variables over ...
* Theory of pure equality


Notes


References

* * * * , pp. , . * * * *
SMT-LIB: The Satisfiability Modulo Theories Library

SMT-COMP: The Satisfiability Modulo Theories Competition

Decision procedures - an algorithmic point of view
* * This article was originally adapted from a column in the AC
SIGDA
by Karem A. Sakallah. The original text i
available here
{{Mathematical logic Constraint programming Electronic design automation Formal methods Logic in computer science NP-complete problems Satisfiability problems