HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, static program analysis (also known as static analysis or static simulation) is the
analysis Analysis (: analyses) is the process of breaking a complex topic or substance into smaller parts in order to gain a better understanding of it. The technique has been applied in the study of mathematics and logic since before Aristotle (38 ...
of computer programs performed without executing them, in contrast with
dynamic program analysis Dynamics (from Greek δυναμικός ''dynamikos'' "powerful", from δύναμις ''dynamis'' " power") or dynamic may refer to: Physics and engineering * Dynamics (mechanics), the study of forces and their effect on motion Brands and en ...
, which is performed on programs during their execution in the integrated environment. The term is usually applied to analysis performed by an automated tool, with human analysis typically being called "program understanding", program comprehension, or
code review Code review (sometimes referred to as peer review) is a software quality assurance activity in which one or more people examine the source code of a computer program, either after implementation or during the development process. The persons perf ...
. In the last of these,
software inspection Inspection in software engineering, refers to peer review of any work product by trained individuals who look for defects using a well defined process. An inspection might also be referred to as a Fagan inspection after Michael Fagan, the creator ...
and
software walkthrough In software engineering, a walkthrough or walk-through is a form of software peer review "in which a designer or programmer leads members of the development team and other interested parties through a software product, and the participants ask que ...
s are also used. In most cases the analysis is performed on some version of a program's
source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...
, and, in other cases, on some form of its
object code In computing, object code or object module is the product of an assembler or compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' ...
.


Rationale

The sophistication of the analysis performed by tools varies from those that only consider the behaviour of individual statements and declarations, to those that include the complete
source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...
of a program in their analysis. The uses of the information obtained from the analysis vary from highlighting possible coding errors (e.g., the lint tool) to
formal methods In computer science, formal methods are mathematics, mathematically rigorous techniques for the formal specification, specification, development, Program analysis, analysis, and formal verification, verification of software and computer hardware, ...
that mathematically prove properties about a given program (e.g., its behaviour matches that of its specification).
Software metric In software engineering and development, a software metric is a standard of measure of a degree to which a software system or process possesses some property. Even if a metric is not a measurement (metrics are functions, while measurements are t ...
s and
reverse engineering Reverse engineering (also known as backwards engineering or back engineering) is a process or method through which one attempts to understand through deductive reasoning how a previously made device, process, system, or piece of software accompl ...
can be described as forms of static analysis. Deriving software metrics and static analysis are increasingly deployed together, especially in creation of embedded systems, by defining so-called ''software quality objectives''. A growing commercial use of static analysis is in the verification of properties of software used in
safety-critical A safety-critical system or life-critical system is a system whose failure or malfunction may result in one (or more) of the following outcomes: * death or serious injury to people * loss or severe damage to equipment/property * environmental h ...
computer systems and locating potentially vulnerable code. For example, the following industries have identified the use of static code analysis as a means of improving the quality of increasingly sophisticated and complex software: #
Medical software Medical software is any software item or system used within a medical context. This can include: * Standalone software used for Medical diagnosis, diagnostic or Therapy, therapeutic purposes. * Software used by health care providers to reduce pape ...
: The US
Food and Drug Administration The United States Food and Drug Administration (FDA or US FDA) is a List of United States federal agencies, federal agency of the United States Department of Health and Human Services, Department of Health and Human Services. The FDA is respo ...
(FDA) has identified the use of static analysis for medical devices. # Nuclear software: In the UK the Office for Nuclear Regulation (ONR) recommends the use of static analysis on
reactor protection system A reactor protection system (RPS) is a set of nuclear safety and security components in a nuclear power plant designed to safely shut down the reactor and prevent the release of radioactive materials. The system can "trip" automatically (initiat ...
s. # Aviation software (in combination with dynamic analysis). # Automotive & Machines (functional safety features form an integral part of each automotive product development phase,
ISO 26262 ISO 26262, titled "Road vehicles – Functional safety", is an international standard for functional safety of electrical and/or electronic systems that are installed in serial production road vehicles (excluding mopeds), defined by the Intern ...
, section 8). A study in 2012 by VDC Research reported that 28.7% of the embedded software engineers surveyed use static analysis tools and 39.7% expect to use them within 2 years. A study from 2010 found that 60% of the interviewed developers in European research projects made at least use of their basic IDE built-in static analyzers. However, only about 10% employed an additional other (and perhaps more advanced) analysis tool. In the application security industry the name
static application security testing Static may refer to: Places *Static Nunatak, in Antarctica *Static, Kentucky and Tennessee, U.S. *Static Peak, a mountain in Wyoming, U.S. **Static Peak Divide, a mountain pass near the peak Science and technology Physics *Static electricity, a n ...
(SAST) is also used. SAST is an important part of
Security Development Lifecycle The Microsoft Security Development Lifecycle (SDL) is the approach Microsoft uses to integrate security into DevOps processes (sometimes called a DevSecOps approach). You can use this SDL guidance and documentation to adapt this approach and pract ...
s (SDLs) such as the SDL defined by Microsoft and a common practice in software companies.


Tool types

The OMG (
Object Management Group The Object Management Group (OMG®) is a computer industry Standards Development Organization (SDO), or Voluntary Consensus Standards Body (VCSB). OMG develops enterprise integration and modeling standards for a range of technologies. Busin ...
) published a study regarding the types of software analysis required for
software quality In the context of software engineering, software quality refers to two related but distinct notions: * Software's functional quality reflects how well it complies with or conforms to a given design, based on functional requirements or specificat ...
measurement and assessment. This document on "How to Deliver Resilient, Secure, Efficient, and Easily Changed IT Systems in Line with CISQ Recommendations" describes three levels of software analysis. ; Unit Level: Analysis that takes place within a specific program or subroutine, without connecting to the context of that program. ; Technology Level: Analysis that takes into account interactions between unit programs to get a more holistic and semantic view of the overall program in order to find issues and avoid obvious false positives. ; System Level: Analysis that takes into account the interactions between unit programs, but without being limited to one specific technology or programming language. A further level of software analysis can be defined. ; Mission/Business Level: Analysis that takes into account the business/mission layer terms, rules and processes that are implemented within the software system for its operation as part of enterprise or program/mission layer activities. These elements are implemented without being limited to one specific technology or programming language and in many cases are distributed across multiple languages, but are statically extracted and analyzed for system understanding for mission assurance.


Formal methods

Formal methods is the term applied to the analysis of
software Software consists of computer programs that instruct the Execution (computing), execution of a computer. Software also includes design documents and specifications. The history of software is closely tied to the development of digital comput ...
(and
computer hardware Computer hardware includes the physical parts of a computer, such as the central processing unit (CPU), random-access memory (RAM), motherboard, computer data storage, graphics card, sound card, and computer case. It includes external devices ...
) whose results are obtained purely through the use of rigorous mathematical methods. The mathematical techniques used include
denotational semantics In computer science, denotational semantics (initially known as mathematical semantics or Scott–Strachey semantics) is an approach of formalizing the meanings of programming languages by constructing mathematical objects (called ''denotations'' ...
,
axiomatic semantics Axiomatic semantics is an approach based on mathematical logic for proving the correctness of computer programs. It is closely related to Hoare logic. Axiomatic semantics define the meaning of a command in a program by describing its effect on ...
,
operational semantics Operational semantics is a category of formal programming language semantics in which certain desired properties of a program, such as correctness, safety or security, are verified by constructing proofs from logical statements about its exec ...
, and
abstract interpretation In computer science, abstract interpretation is a theory of sound approximation of the semantics of computer programs, based on monotonic functions over ordered sets, especially lattices. It can be viewed as a partial execution of a computer pro ...
. By a straightforward reduction to the
halting problem In computability theory (computer science), computability theory, the halting problem is the problem of determining, from a description of an arbitrary computer program and an input, whether the program will finish running, or continue to run for ...
, it is possible to prove that (for any
Turing complete Alan Mathison Turing (; 23 June 1912 – 7 June 1954) was an English mathematician, computer scientist, logician, cryptanalyst, philosopher and theoretical biologist. He was highly influential in the development of theoretical comput ...
language), finding all possible run-time errors in an arbitrary program (or more generally any kind of violation of a specification on the final result of a program) is undecidable: there is no mechanical method that can always answer truthfully whether an arbitrary program may or may not exhibit runtime errors. This result dates from the works of
Church Church may refer to: Religion * Church (building), a place/building for Christian religious activities and praying * Church (congregation), a local congregation of a Christian denomination * Church service, a formalized period of Christian comm ...
, Gödel and Turing in the 1930s (see:
Halting problem In computability theory (computer science), computability theory, the halting problem is the problem of determining, from a description of an arbitrary computer program and an input, whether the program will finish running, or continue to run for ...
and
Rice's theorem In computability theory, Rice's theorem states that all non-trivial semantic properties of programs are undecidable problem, undecidable. A ''semantic'' property is one about the program's behavior (for instance, "does the program halting problem, ...
). As with many undecidable questions, one can still attempt to give useful approximate solutions. Some of the implementation techniques of formal static analysis include: *
Abstract interpretation In computer science, abstract interpretation is a theory of sound approximation of the semantics of computer programs, based on monotonic functions over ordered sets, especially lattices. It can be viewed as a partial execution of a computer pro ...
, to model the effect that every statement has on the state of an abstract machine (i.e., it 'executes' the software based on the mathematical properties of each statement and declaration). This abstract machine over-approximates the behaviours of the system: the abstract system is thus made simpler to analyze, at the expense of ''incompleteness'' (not every property true of the original system is true of the abstract system). If properly done, though, abstract interpretation is ''sound'' (every property true of the abstract system can be mapped to a true property of the original system). *
Data-flow analysis Data-flow analysis is a technique for gathering information about the possible set of values calculated at various points in a computer program. It forms the foundation for a wide variety of compiler optimizations and program verification techn ...
, a lattice-based technique for gathering information about the possible set of values; *
Hoare logic Hoare logic (also known as Floyd–Hoare logic or Hoare rules) is a formal system with a set of logical rules for reasoning rigorously about the correctness of computer programs. It was proposed in 1969 by the British computer scientist and l ...
, a
formal system A formal system is an abstract structure and formalization of an axiomatic system used for deducing, using rules of inference, theorems from axioms. In 1921, David Hilbert proposed to use formal systems as the foundation of knowledge in ma ...
with a set of logical rules for reasoning rigorously about the
correctness of computer programs In theoretical computer science, an algorithm is correct with respect to a specification if it behaves as specified. Best explored is ''functional'' correctness, which refers to the input–output behavior of the algorithm: for each input it prod ...
. There is tool support for some programming languages (e.g., the SPARK programming language (a subset of Ada) and the Java Modeling Language—JML—using ESC/Java and ESC/Java2, Frama-C WP ( weakest precondition) plugin for the C language extended with ACSL ( ANSI/ISO C Specification Language) ). *
Model checking In computer science, model checking or property checking is a method for checking whether a finite-state model of a system meets a given specification (also known as correctness). This is typically associated with hardware or software syst ...
, considers systems that have finite state or may be reduced to finite state by
abstraction Abstraction is a process where general rules and concepts are derived from the use and classifying of specific examples, literal (reality, real or Abstract and concrete, concrete) signifiers, first principles, or other methods. "An abstraction" ...
; *
Symbolic execution In computer science, symbolic execution (also symbolic evaluation or symbex) is a means of analyzing a program to determine what inputs cause each part of a program to execute. An interpreter follows the program, assuming symbolic values for i ...
, as used to derive mathematical expressions representing the value of mutated variables at particular points in the code. * Nullable reference analysis


Data-driven static analysis

Data-driven static analysis leverages extensive codebases to infer coding rules and improve the accuracy of the analysis. For instance, one can use all Java open-source packages available on
GitHub GitHub () is a Proprietary software, proprietary developer platform that allows developers to create, store, manage, and share their code. It uses Git to provide distributed version control and GitHub itself provides access control, bug trackin ...
to learn good analysis strategies. The rule inference can use machine learning techniques. It is also possible to learn from a large amount of past fixes and warnings.


Remediation

Static analyzers produce warnings. For certain types of warnings, it is possible to design and implement automated remediation techniques. For example, Logozzo and Ball have proposed automated remediations for C# ''cccheck''.


See also


References


Further reading

* * *
"Abstract interpretation and static analysis,"
International Winter School on Semantics and Applications 2003, b
David A. Schmidt
{{Program analysis Program analysis Software engineering