computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...

, the computational complexity or simply complexity of an

algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...

is the amount of resources required to run it. Particular focus is given to computation time (generally measured by the number of needed elementary operations) and memory storage requirements. The complexity of a problem is the complexity of the best algorithms that allow solving the problem. The study of the complexity of explicitly given algorithms is called analysis of algorithms, while the study of the complexity of problems is called

computational complexity theory In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and relating these classes to each other. A computational problem is a task solved ...

. Both areas are highly related, as the complexity of an algorithm is always an upper bound on the complexity of the problem solved by this algorithm. Moreover, for designing efficient algorithms, it is often fundamental to compare the complexity of a specific algorithm to the complexity of the problem to be solved. Also, in most cases, the only thing that is known about the complexity of a problem is that it is lower than the complexity of the most efficient known algorithms. Therefore, there is a large overlap between analysis of algorithms and complexity theory. As the amount of resources required to run an algorithm generally varies with the size of the input, the complexity is typically expressed as a function , where is the size of the input and is either the worst-case complexity (the maximum of the amount of resources that are needed over all inputs of size ) or the average-case complexity (the average of the amount of resources over all inputs of size ). Time complexity is generally expressed as the number of required elementary operations on an input of size , where elementary operations are assumed to take a constant amount of time on a given computer and change only by a constant factor when run on a different computer. Space complexity is generally expressed as the amount of

memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remember ...

required by an algorithm on an input of size .

Resources

Time

The resource that is most commonly considered is time. When "complexity" is used without qualification, this generally means time complexity. The usual units of time (seconds, minutes etc.) are not used in complexity theory because they are too dependent on the choice of a specific computer and on the evolution of technology. For instance, a computer today can execute an algorithm significantly faster than a computer from the 1960s; however, this is not an intrinsic feature of the algorithm but rather a consequence of technological advances in computer hardware. Complexity theory seeks to quantify the intrinsic time requirements of algorithms, that is, the basic time constraints an algorithm would place on ''any'' computer. This is achieved by counting the number of ''elementary operations'' that are executed during the computation. These operations are assumed to take constant time (that is, not affected by the size of the input) on a given machine, and are often called ''steps''.

Bit complexity

Formally, the ''bit complexity'' refers to the number of operations on bits that are needed for running an algorithm. With most models of computation, it equals the time complexity up to a constant factor. On

computer A computer is a machine that can be programmed to carry out sequences of arithmetic or logical operations ( computation) automatically. Modern digital electronic computers can perform generic sets of operations known as programs. These prog ...

s, the number of operations on machine words that are needed is also proportional to the bit complexity. So, the ''time complexity'' and the ''bit complexity'' are equivalent for realistic models of computation.

Space

Another important resource is the size of

computer memory In computing, memory is a device or system that is used to store information for immediate use in a computer or related computer hardware and digital electronic devices. The term ''memory'' is often synonymous with the term '' primary storag ...

that is needed for running algorithms.

Communication

For the class of distributed algorithms that are commonly executed by multiple, interacting parties, the resource that is of most interest is the communication complexity. It is the necessary amount of communication between the executing parties.

Others

The number of arithmetic operations is another resource that is commonly used. In this case, one talks of arithmetic complexity. If one knows an upper bound on the size of the binary representation of the numbers that occur during a computation, the time complexity is generally the product of the arithmetic complexity by a constant factor. For many algorithms the size of the integers that are used during a computation is not bounded, and it is not realistic to consider that arithmetic operations take a constant time. Therefore, the time complexity, generally called bit complexity in this context, may be much larger than the arithmetic complexity. For example, the arithmetic complexity of the computation of the

determinant In mathematics, the determinant is a scalar value that is a function of the entries of a square matrix. It characterizes some properties of the matrix and the linear map represented by the matrix. In particular, the determinant is nonzero if a ...

of a integer matrix is

O(n^3)

for the usual algorithms ( Gaussian elimination). The bit complexity of the same algorithms is exponential in , because the size of the coefficients may grow exponentially during the computation. On the other hand, if these algorithms are coupled with multi-modular arithmetic, the bit complexity may be reduced to . In sorting and searching, the resource that is generally considered is the number of entry comparisons. This is generally a good measure of the time complexity if data are suitably organized.

Complexity as a function of input size

It is impossible to count the number of steps of an algorithm on all possible inputs. As the complexity generally increases with the size of the input, the complexity is typically expressed as a function of the size (in bits) of the input, and therefore, the complexity is a function of . However, the complexity of an algorithm may vary dramatically for different inputs of the same size. Therefore, several complexity functions are commonly used. The worst-case complexity is the maximum of the complexity over all inputs of size , and the average-case complexity is the average of the complexity over all inputs of size (this makes sense, as the number of possible inputs of a given size is finite). Generally, when "complexity" is used without being further specified, this is the worst-case time complexity that is considered.

Asymptotic complexity

It is generally difficult to compute precisely the worst-case and the average-case complexity. In addition, these exact values provide little practical application, as any change of computer or of model of computation would change the complexity somewhat. Moreover, the resource use is not critical for small values of , and this makes that, for small , the ease of implementation is generally more interesting than a low complexity. For these reasons, one generally focuses on the behavior of the complexity for large , that is on its asymptotic behavior when tends to the infinity. Therefore, the complexity is generally expressed by using

big O notation Big ''O'' notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. Big O is a member of a family of notations invented by Paul Bachmann, Edmund L ...

. For example, the usual algorithm for integer multiplication has a complexity of

O(n^2),

this means that there is a constant

c_u

such that the multiplication of two integers of at most digits may be done in a time less than

c_un^2.

This bound is ''sharp'' in the sense that the worst-case complexity and the average-case complexity are

\Omega(n^2),

which means that there is a constant

c_l

such that these complexities are larger than

c_ln^2.

The radix does not appear in these complexity, as changing of radix changes only the constants

c_u

and

c_l.

Models of computation

The evaluation of the complexity relies on the choice of a model of computation, which consists in defining the basic operations that are done in a unit of time. When the model of computation is not explicitly specified, this is generally meant as being multitape Turing machine.

Deterministic models

A deterministic model of computation is a model of computation such that the successive states of the machine and the operations to be performed are completely determined by the preceding state. Historically, the first deterministic models were recursive functions,

lambda calculus Lambda calculus (also written as ''λ''-calculus) is a formal system in mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution. It is a universal model of computation th ...

, and

Turing machine A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer alg ...

s. The model of

random-access machine In computer science, random-access machine (RAM) is an abstract machine in the general class of register machines. The RAM is very similar to the counter machine but with the added capability of 'indirect addressing' of its registers. Like the c ...

s (also called RAM-machines) is also widely used, as a closer counterpart to real

s. When the model of computation is not specified, it is generally assumed to be a multitape Turing machine. For most algorithms, the time complexity is the same on multitape Turing machines as on RAM-machines, although some care may be needed in how data is stored in memory to get this equivalence.

Non-deterministic computation

In a non-deterministic model of computation, such as non-deterministic Turing machines, some choices may be done at some steps of the computation. In complexity theory, one considers all possible choices simultaneously, and the non-deterministic time complexity is the time needed, when the best choices are always done. In other words, one considers that the computation is done simultaneously on as many (identical) processors as needed, and the non-deterministic computation time is the time spent by the first processor that finishes the computation. This parallelism is partly amenable to quantum computing via superposed entangled states in running specific

quantum algorithms In quantum computing, a quantum algorithm is an algorithm which runs on a realistic model of quantum computation, the most commonly used model being the quantum circuit model of computation. A classical (or non-quantum) algorithm is a finite sequ ...

, like e.g. Shor's factorization of yet only small integers (: 21 = 3 × 7). Even when such a computation model is not realistic yet, it has theoretical importance, mostly related to the P = NP problem, which questions the identity of the complexity classes formed by taking "polynomial time" and "non-deterministic polynomial time" as least upper bounds. Simulating an NP-algorithm on a deterministic computer usually takes "exponential time". A problem is in the complexity class NP, if it may be solved in

polynomial time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...

on a non-deterministic machine. A problem is NP-complete if, roughly speaking, it is in NP and is not easier than any other NP problem. Many combinatorial problems, such as the Knapsack problem, the travelling salesman problem, and the

Boolean satisfiability problem In logic and computer science, the Boolean satisfiability problem (sometimes called propositional satisfiability problem and abbreviated SATISFIABILITY, SAT or B-SAT) is the problem of determining if there exists an interpretation that satisf ...

are NP-complete. For all these problems, the best known algorithm has exponential complexity. If any one of these problems could be solved in polynomial time on a deterministic machine, then all NP problems could also be solved in polynomial time, and one would have P = NP. it is generally conjectured that with the practical implication that the worst cases of NP problems are intrinsically difficult to solve, i.e., take longer than any reasonable time span (decades!) for interesting lengths of input.

Parallel and distributed computation

Parallel and distributed computing consist of splitting computation on several processors, which work simultaneously. The difference between the different model lies mainly in the way of transmitting information between processors. Typically, in parallel computing the data transmission between processors is very fast, while, in distributed computing, the data transmission is done through a

network Network, networking and networked may refer to: Science and technology * Network theory, the study of graphs as a representation of relations between discrete objects * Network science, an academic field that studies complex networks Mathematic ...

and is therefore much slower. The time needed for a computation on processors is at least the quotient by of the time needed by a single processor. In fact this theoretically optimal bound can never be reached, because some subtasks cannot be parallelized, and some processors may have to wait a result from another processor. The main complexity problem is thus to design algorithms such that the product of the computation time by the number of processors is as close as possible to the time needed for the same computation on a single processor.

Quantum computing

quantum computer Quantum computing is a type of computation whose operations can harness the phenomena of quantum mechanics, such as superposition, interference, and entanglement. Devices that perform quantum computations are known as quantum computers. Thoug ...

is a computer whose model of computation is based on

quantum mechanics Quantum mechanics is a fundamental theory in physics that provides a description of the physical properties of nature at the scale of atoms and subatomic particles. It is the foundation of all quantum physics including quantum chemistry, ...

. The Church–Turing thesis applies to quantum computers; that is, every problem that can be solved by a quantum computer can also be solved by a Turing machine. However, some problems may theoretically be solved with a much lower time complexity using a quantum computer rather than a classical computer. This is, for the moment, purely theoretical, as no one knows how to build an efficient quantum computer.

Quantum complexity theory Quantum complexity theory is the subfield of computational complexity theory that deals with complexity classes defined using quantum computers, a computational model based on quantum mechanics. It studies the hardness of computational proble ...

has been developed to study the

complexity class In computational complexity theory, a complexity class is a set of computational problems of related resource-based complexity. The two most commonly analyzed resources are time and memory. In general, a complexity class is defined in terms o ...

es of problems solved using quantum computers. It is used in post-quantum cryptography, which consists of designing

cryptographic protocol A security protocol (cryptographic protocol or encryption protocol) is an abstract or concrete protocol that performs a security-related function and applies cryptographic methods, often as sequences of cryptographic primitives. A protocol descri ...

s that are resistant to attacks by quantum computers.

Problem complexity (lower bounds)

The complexity of a problem is the infimum of the complexities of the algorithms that may solve the problem, including unknown algorithms. Thus the complexity of a problem is not greater than the complexity of any algorithm that solves the problems. It follows that every complexity that is expressed with

is a complexity of the algorithm as well as of the corresponding problem. On the other hand, it is generally hard to obtain nontrivial lower bounds for problem complexity, and there are few methods for obtaining such lower bounds. For solving most problems, it is required to read all input data, which, normally, needs a time proportional to the size of the data. Thus, such problems have a complexity that is at least

linear Linearity is the property of a mathematical relationship ('' function'') that can be graphically represented as a straight line. Linearity is closely related to '' proportionality''. Examples in physics include rectilinear motion, the linear ...

, that is, using

big omega notation Big ''O'' notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. Big O is a member of a family of notations invented by Paul Bachmann, Edmund Lan ...

, a complexity

\Omega(n).

The solution of some problems, typically in

computer algebra In mathematics and computer science, computer algebra, also called symbolic computation or algebraic computation, is a scientific area that refers to the study and development of algorithms and software for manipulating mathematical expressions ...

and computational algebraic geometry, may be very large. In such a case, the complexity is lower bounded by the maximal size of the output, since the output must be written. For example, a system of polynomial equations of degree in indeterminates may have up to

d^n

complex solutions, if the number of solutions is finite (this is Bézout's theorem). As these solutions must be written down, the complexity of this problem is

\Omega(d^n).

For this problem, an algorithm of complexity

d^

is known, which may thus be considered as asymptotically quasi-optimal. A nonlinear lower bound of

\Omega(n\log n)

is known for the number of comparisons needed for a sorting algorithm. Thus the best sorting algorithms are optimal, as their complexity is

O(n\log n).

This lower bound results from the fact that there are ways of ordering objects. As each comparison splits in two parts this set of orders, the number of of comparisons that are needed for distinguishing all orders must verify

2^N>n!,

which implies

N =\Omega(n\log n),

by Stirling's formula. A standard method for getting lower bounds of complexity consists of ''reducing'' a problem to another problem. More precisely, suppose that one may encode a problem of size into a subproblem of size of a problem , and that the complexity of is

\Omega(g(n)).

Without loss of generality, one may suppose that the function increases with and has an inverse function . Then the complexity of the problem is

\Omega(g(h(n))).

This is the method that is used to prove that, if P ≠ NP (an unsolved conjecture), the complexity of every NP-complete problem is

\Omega(n^k),

for every positive integer .

Use in algorithm design

Evaluating the complexity of an algorithm is an important part of algorithm design, as this gives useful information on the performance that may be expected. It is a common misconception that the evaluation of the complexity of algorithms will become less important as a result of Moore's law, which posits the

exponential growth Exponential growth is a process that increases quantity over time. It occurs when the instantaneous rate of change (that is, the derivative) of a quantity with respect to time is proportional to the quantity itself. Described as a function, a ...

of the power of modern

s. This is wrong because this power increase allows working with large input data (

big data Though used sometimes loosely partly because of a lack of formal definition, the interpretation that seems to best describe Big data is the one associated with large body of information that we could not comprehend when used only in smaller am ...

). For example, when one wants to sort alphabetically a list of a few hundreds of entries, such as the bibliography of a book, any algorithm should work well in less than a second. On the other hand, for a list of a million of entries (the phone numbers of a large town, for example), the elementary algorithms that require

O(n^2)

comparisons would have to do a trillion of comparisons, which would need around three hours at the speed of 10 million of comparisons per second. On the other hand, the quicksort and merge sort require only

n\log_2 n

comparisons (as average-case complexity for the former, as worst-case complexity for the latter). For , this gives approximately 30,000,000 comparisons, which would only take 3 seconds at 10 million comparisons per second. Thus the evaluation of the complexity may allow eliminating many inefficient algorithms before any implementation. This may also be used for tuning complex algorithms without testing all variants. By determining the most costly steps of a complex algorithm, the study of complexity allows also focusing on these steps the effort for improving the efficiency of an implementation.

References

* * * * * * * {{Citation , last=Sipser , first=Michael , author-link=Michael Sipser , title=

Introduction to the Theory of Computation ''Introduction to the Theory of Computation'' () is a textbook in theoretical computer science computer science (TCS) is a subset of general computer science and mathematics that focuses on mathematical aspects of computer science such as the ...

, edition=2nd , year=2006 , publisher= Thomson Course Technology , location=USA , isbn=0-534-95097-3 zh-yue:運算複雜度 Analysis of algorithms Computational complexity theory Computational resources