P versus NP problem

The P versus NP problem is a major unsolved problem in theoretical computer science. In informal terms, it asks whether every problem whose solution can be quickly verified can also be quickly solved. The informal term ''quickly'' means the existence of an algorithm solving the task that runs in polynomial time, such that the time to complete the task varies as a polynomial function of the size of the input to the algorithm (as opposed to, say, exponential time). The general class of questions for which some algorithm can provide an answer in polynomial time is "P" or "class P". For some questions, there is no known way to find an answer quickly, but if one is provided with information showing what the answer is, it is possible to verify the answer quickly. The class of questions for which an answer can be ''verified'' in polynomial time is NP, which stands for "nondeterministic polynomial time". A nondeterministic Turing machine can move to a state that is not determined by the previous state. Such a machine could solve an NP problem in polynomial time by falling into the correct answer state (by luck), then conventionally verifying it. Such machines are not practical for solving realistic problems but can be used as theoretical models.
An answer to the P versus NP question would determine whether problems that can be verified in polynomial time can also be solved in polynomial time. If it turns out that P ≠ NP, which is widely believed, it would mean that there are problems in NP that are harder to compute than to verify: they could not be solved in polynomial time, but the answer could be verified in polynomial time. The problem has been called the most important open problem in computer science. Aside from being an important problem in computational theory, a proof either way would have profound implications for mathematics, cryptography, algorithm research, artificial intelligence, game theory, multimedia processing, philosophy, economics and many other fields. It is one of the seven Millennium Prize Problems selected by the Clay Mathematics Institute, each of which carries a US$1,000,000 prize for the first correct solution.
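
To make the contrast between "quickly solved" and "quickly verified" concrete, the following Python sketch (function names and the example instance are only illustrative) searches for a subset of integers summing to zero by brute force, and separately checks a proposed answer; the search may take exponential time, while the check is a single polynomial-time pass.

 from itertools import combinations
 
 def solve_subset_sum(numbers):
     """Brute-force search for a non-empty subset that sums to zero.
     It may examine every subset, so the work grows exponentially
     with the number of elements."""
     for r in range(1, len(numbers) + 1):
         for subset in combinations(numbers, r):
             if sum(subset) == 0:
                 return subset              # a certificate for a "yes" answer
     return None
 
 def verify_subset_sum(numbers, certificate):
     """Checking a proposed certificate takes one pass over it,
     i.e. time polynomial in the size of the input
     (multiplicities are glossed over in this sketch)."""
     return (certificate is not None
             and len(certificate) > 0
             and all(x in numbers for x in certificate)
             and sum(certificate) == 0)
 
 instance = [3, -9, 5, 4, -2]
 cert = solve_subset_sum(instance)                 # potentially slow
 print(cert, verify_subset_sum(instance, cert))    # (-9, 5, 4) True: fast check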


Example

Consider Sudoku, a game where the player is given a partially filled-in grid of numbers and attempts to complete the grid following certain rules. Given an incomplete Sudoku grid, of any size, is there at least one legal solution? Any proposed solution is easily verified, and the time to check a solution grows slowly (polynomially) as the grid gets bigger. However, all known algorithms for finding solutions take, for difficult examples, time that grows exponentially as the grid gets bigger. So, Sudoku is in NP (quickly checkable) but does not seem to be in P (quickly solvable). Thousands of other problems seem similar, in that they are fast to check but slow to solve. Researchers have shown that many of the problems in NP have the extra property that a fast solution to any one of them could be used to build a quick solution to any other problem in NP, a property called NP-completeness. Decades of searching have not yielded a fast solution to any of these problems, so most scientists suspect that none of these problems can be solved quickly. This, however, has never been proven.
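
To make the "quickly checkable" half concrete, the following Python sketch (the function name and the 4 × 4 test grid are only illustrative) verifies a completed Sudoku grid in time polynomial in the grid size; it does not attempt the much harder task of finding a solution.

 import math
 
 def is_valid_sudoku_solution(grid):
     """Checks a completed Sudoku grid: every row, column and box must
     contain each symbol exactly once.  The checks scan the grid a constant
     number of times, so the time grows polynomially with the grid size
     (consistency with the original clues would be one more linear scan)."""
     size = len(grid)
     box = math.isqrt(size)
     symbols = set(range(1, size + 1))
     rows_ok = all(set(row) == symbols for row in grid)
     cols_ok = all({grid[r][c] for r in range(size)} == symbols
                   for c in range(size))
     boxes_ok = all({grid[br + r][bc + c] for r in range(box) for c in range(box)} == symbols
                    for br in range(0, size, box)
                    for bc in range(0, size, box))
     return rows_ok and cols_ok and boxes_ok
 
 print(is_valid_sudoku_solution([
     [1, 2, 3, 4],
     [3, 4, 1, 2],
     [2, 1, 4, 3],
     [4, 3, 2, 1],
 ]))  # True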


History

The precise statement of the P versus NP problem was introduced in 1971 by Stephen Cook in his seminal paper "The complexity of theorem proving procedures" (and independently by Leonid Levin in 1973). Although the P versus NP problem was formally defined in 1971, there were previous inklings of the problems involved, the difficulty of proof, and the potential consequences. In 1955, mathematician John Nash wrote a letter to the NSA, in which he speculated that cracking a sufficiently complex code would require time exponential in the length of the key. If proved (and Nash was suitably skeptical), this would imply what is now called P ≠ NP, since a proposed key can easily be verified in polynomial time. Another mention of the underlying problem occurred in a 1956 letter written by Kurt Gödel to John von Neumann. Gödel asked whether theorem-proving (now known to be co-NP-complete) could be solved in quadratic or linear time, and pointed out one of the most important consequences: that if so, then the discovery of mathematical proofs could be automated.


Context

The relation between the complexity classes P and NP is studied in computational complexity theory, the part of the theory of computation dealing with the resources required during computation to solve a given problem. The most common resources are time (how many steps it takes to solve a problem) and space (how much memory it takes to solve a problem). In such analysis, a model of the computer for which time must be analyzed is required. Typically such models assume that the computer is ''deterministic'' (given the computer's present state and any inputs, there is only one possible action that the computer might take) and ''sequential'' (it performs actions one after the other). In this theory, the class P consists of all those ''decision problems'' (defined below) that can be solved on a deterministic sequential machine in an amount of time that is polynomial in the size of the input; the class NP consists of all those decision problems whose positive solutions can be verified in polynomial time given the right information, or equivalently, whose solution can be found in polynomial time on a non-deterministic machine. Clearly, P ⊆ NP. Arguably, the biggest open question in theoretical computer science concerns the relationship between those two classes:
:Is P equal to NP?
Since 2002, William Gasarch has conducted three polls of researchers concerning this and related questions. Confidence that P ≠ NP has been increasing: in 2019, 88% believed P ≠ NP, as opposed to 83% in 2012 and 61% in 2002. When restricted to experts, 99% of the 2019 respondents believed P ≠ NP. These polls do not imply anything about whether P = NP is true, as stated by Gasarch himself: "This does not bring us any closer to solving P=?NP or to knowing when it will be solved, but it attempts to be an objective report on the subjective opinion of this era."
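
As a concrete illustration of a decision problem in the class P, the following Python sketch (the function name and example graph are only illustrative) decides whether one vertex of a directed graph can reach another; breadth-first search runs in time polynomial, in fact linear, in the size of the graph.

 from collections import deque
 
 def reachable(adjacency, source, target):
     """Decides s-t reachability in a directed graph given as an adjacency
     dict.  Breadth-first search visits each vertex and edge at most once,
     so the running time is polynomial in the size of the input."""
     seen = {source}
     queue = deque([source])
     while queue:
         v = queue.popleft()
         if v == target:
             return True
         for w in adjacency.get(v, []):
             if w not in seen:
                 seen.add(w)
                 queue.append(w)
     return False
 
 print(reachable({0: [1], 1: [2], 2: []}, 0, 2))  # True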


NP-completeness

To attack the P = NP question, the concept of NP-completeness is very useful. NP-complete problems are a set of problems to each of which any other NP problem can be reduced in polynomial time and whose solution may still be verified in polynomial time. That is, any NP problem can be transformed into any of the NP-complete problems. Informally, an NP-complete problem is an NP problem that is at least as "tough" as any other problem in NP. NP-hard problems are those at least as hard as NP problems; i.e., all NP problems can be reduced (in polynomial time) to them. NP-hard problems need not be in NP; i.e., they need not have solutions verifiable in polynomial time. For instance, the Boolean satisfiability problem is NP-complete by the Cook–Levin theorem, so ''any'' instance of ''any'' problem in NP can be transformed mechanically into an instance of the Boolean satisfiability problem in polynomial time. The Boolean satisfiability problem is one of many such NP-complete problems. If any NP-complete problem is in P, then it would follow that P = NP. However, many important problems have been shown to be NP-complete, and no fast algorithm for any of them is known.

Based on the definition alone it is not obvious that NP-complete problems exist; however, a trivial and contrived NP-complete problem can be formulated as follows: given a description of a Turing machine ''M'' guaranteed to halt in polynomial time, does there exist a polynomial-size input that ''M'' will accept? It is in NP because (given an input) it is simple to check whether ''M'' accepts the input by simulating ''M''; it is NP-complete because the verifier for any particular instance of a problem in NP can be encoded as a polynomial-time machine ''M'' that takes the solution to be verified as input. Then the question of whether the instance is a yes or no instance is determined by whether a valid input exists.

The first natural problem proven to be NP-complete was the Boolean satisfiability problem, also known as SAT. As noted above, this is the Cook–Levin theorem; its proof that satisfiability is NP-complete contains technical details about Turing machines as they relate to the definition of NP. However, after this problem was proved to be NP-complete, proof by reduction provided a simpler way to show that many other problems are also NP-complete, including the game Sudoku discussed earlier. In this case, the proof shows that a solution of Sudoku in polynomial time could also be used to complete Latin squares in polynomial time. This in turn gives a solution to the problem of partitioning tripartite graphs into triangles, which could then be used to find solutions for the special case of SAT known as 3-SAT, which then provides a solution for general Boolean satisfiability. So a polynomial-time solution to Sudoku leads, by a series of mechanical transformations, to a polynomial-time solution of satisfiability, which in turn can be used to solve any other NP problem in polynomial time. Using transformations like this, a vast class of seemingly unrelated problems are all reducible to one another, and are in a sense "the same problem".
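
The Sudoku chain above is involved, but the general mechanism, a polynomial-time reduction, can be illustrated with a much simpler classical pair of problems. The sketch below (in Python, with made-up instance data) maps INDEPENDENT-SET instances to VERTEX-COVER instances using the fact that a graph on ''n'' vertices has an independent set of size ''k'' exactly when it has a vertex cover of size ''n'' - ''k''; the brute-force checkers are exponential and serve only to demonstrate the equivalence on a toy graph.

 from itertools import combinations
 
 def has_independent_set(n, edges, k):
     """Brute force, exponential time: is there a set of k vertices
     with no edge between any two of them?"""
     return any(all(not (u in s and v in s) for u, v in edges)
                for s in map(set, combinations(range(n), k)))
 
 def has_vertex_cover(n, edges, k):
     """Brute force, exponential time: is there a set of k vertices
     touching every edge?"""
     return any(all(u in s or v in s for u, v in edges)
                for s in map(set, combinations(range(n), k)))
 
 def reduce_is_to_vc(n, edges, k):
     """Polynomial-time reduction: G has an independent set of size k
     iff G has a vertex cover of size n - k.  The transformation itself
     is trivial to compute."""
     return n, edges, n - k
 
 n, edges, k = 4, [(0, 1), (1, 2), (2, 3)], 2
 print(has_independent_set(n, edges, k) ==
       has_vertex_cover(*reduce_is_to_vc(n, edges, k)))  # True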


Harder problems

Although it is unknown whether P = NP, problems outside of P are known. Just as the class P is defined in terms of polynomial running time, the class EXPTIME is the set of all decision problems that have ''exponential'' running time. In other words, any problem in EXPTIME is solvable by a deterministic Turing machine in O(2^{p(n)}) time, where ''p''(''n'') is a polynomial function of ''n''. A decision problem is EXPTIME-complete if it is in EXPTIME, and every problem in EXPTIME has a polynomial-time many-one reduction to it. A number of problems are known to be EXPTIME-complete. Because it can be shown that P ≠ EXPTIME, these problems are outside P, and so require more than polynomial time. In fact, by the time hierarchy theorem, they cannot be solved in significantly less than exponential time. Examples include finding a perfect strategy for chess positions on an ''N'' × ''N'' board and similar problems for other board games.

The problem of deciding the truth of a statement in Presburger arithmetic requires even more time. Fischer and Rabin proved in 1974 that every algorithm that decides the truth of Presburger statements of length ''n'' has a runtime of at least 2^{2^{cn}} for some constant ''c''. Hence, the problem is known to need more than exponential run time. Even more difficult are the undecidable problems, such as the halting problem. They cannot be completely solved by any algorithm, in the sense that for any particular algorithm there is at least one input for which that algorithm will not produce the right answer; it will either produce the wrong answer, finish without giving a conclusive answer, or otherwise run forever without producing any answer at all.

It is also possible to consider questions other than decision problems. One such class, consisting of counting problems, is called #P: whereas an NP problem asks "Are there any solutions?", the corresponding #P problem asks "How many solutions are there?". Clearly, a #P problem must be at least as hard as the corresponding NP problem, since a count of solutions immediately tells whether at least one solution exists (namely, whenever the count is greater than zero). Surprisingly, some #P problems that are believed to be difficult correspond to easy (for example linear-time) P problems. For these problems, it is very easy to tell whether solutions exist, but thought to be very hard to tell how many. Many of these problems are #P-complete, and hence among the hardest problems in #P, since a polynomial-time solution to any of them would allow a polynomial-time solution to all other #P problems.
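
The contrast between the NP-style question and its #P counterpart can be made concrete with a deliberately naive Python sketch (exponential brute force, illustrative only): the same enumeration answers both questions, but the counting version asks for strictly more information.

 from itertools import combinations
 
 def subsets_summing_to_zero(numbers):
     """Enumerates every non-empty subset (exponential time); used only to
     contrast the decision question with the counting question."""
     return [s for r in range(1, len(numbers) + 1)
             for s in combinations(numbers, r) if sum(s) == 0]
 
 solutions = subsets_summing_to_zero([3, -9, 5, 4, -2])
 print(len(solutions) > 0)   # NP-style decision: is there at least one?  -> True
 print(len(solutions))       # #P-style counting: how many are there?     -> 1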


Problems in NP not known to be in P or NP-complete

In 1975, Richard E. Ladner showed that if P ≠ NP, then there exist problems in NP that are neither in P nor NP-complete. Such problems are called NP-intermediate problems. The graph isomorphism problem, the discrete logarithm problem, and the integer factorization problem are examples of problems believed to be NP-intermediate. They are some of the very few NP problems not known to be in P or to be NP-complete.

The graph isomorphism problem is the computational problem of determining whether two finite graphs are isomorphic. An important unsolved problem in complexity theory is whether the graph isomorphism problem is in P, NP-complete, or NP-intermediate. The answer is not known, but it is believed that the problem is at least not NP-complete. If graph isomorphism is NP-complete, the polynomial time hierarchy collapses to its second level. Since it is widely believed that the polynomial hierarchy does not collapse to any finite level, it is believed that graph isomorphism is not NP-complete. The best algorithm for this problem, due to László Babai, runs in quasi-polynomial time.

The integer factorization problem is the computational problem of determining the prime factorization of a given integer. Phrased as a decision problem, it is the problem of deciding whether the input has a factor less than ''k''. No efficient integer factorization algorithm is known, and this fact forms the basis of several modern cryptographic systems, such as the RSA algorithm. The integer factorization problem is in NP and in co-NP (and even in UP and co-UP). If the problem is NP-complete, the polynomial time hierarchy will collapse to its first level (i.e., NP = co-NP). The most efficient known algorithm for integer factorization is the general number field sieve, which takes expected time
:O\left(\exp\left(\left(\tfrac{64n}{9}\log(2)\right)^{\frac{1}{3}}\left(\log(n\log(2))\right)^{\frac{2}{3}}\right)\right)
to factor an ''n''-bit integer. However, the best known quantum algorithm for this problem, Shor's algorithm, does run in polynomial time, although this does not indicate where the problem lies with respect to non-quantum complexity classes.
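
As a small illustration of the decision-problem phrasing of factoring used above, the following Python sketch (trial division, illustrative only) answers "does ''n'' have a factor less than ''k''?"; because it may try on the order of ''k'' candidates, its running time is exponential in the bit-length of the input, so it says nothing about whether the problem is in P.

 def has_factor_below(n, k):
     """Decision version of integer factorization: does n have a
     nontrivial factor d with 1 < d < k?  Trial division examines up to
     k candidates, which is exponential in the number of bits of the
     input, so this naive method is far from a polynomial-time algorithm."""
     for d in range(2, min(k, n)):
         if n % d == 0:
             return True
     return False
 
 print(has_factor_below(91, 10))   # True: 7 divides 91
 print(has_factor_below(97, 97))   # False: 97 is prime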


Does P mean "easy"?

All of the above discussion has assumed that P means "easy" and "not in P" means "difficult", an assumption known as ''Cobham's thesis''. It is a common and reasonably accurate assumption in complexity theory; however, it has some caveats. First, it is not always true in practice. A theoretical polynomial algorithm may have extremely large constant factors or exponents, thus rendering it impractical. For example, the problem of deciding whether a graph ''G'' contains ''H'' as a minor, where ''H'' is fixed, can be solved in a running time of O(n^2), where ''n'' is the number of vertices in ''G''. However, the big O notation hides a constant that depends superexponentially on ''H''. The constant is greater than 2 \uparrow\uparrow (2 \uparrow\uparrow (2 \uparrow\uparrow (h/2))) (using Knuth's up-arrow notation), where ''h'' is the number of vertices in ''H''.

On the other hand, even if a problem is shown to be NP-complete, and even if P ≠ NP, there may still be effective approaches to tackling the problem in practice. There are algorithms for many NP-complete problems, such as the knapsack problem, the traveling salesman problem, and the Boolean satisfiability problem, that can solve to optimality many real-world instances in reasonable time. The empirical average-case complexity (time vs. problem size) of such algorithms can be surprisingly low. An example is the simplex algorithm in linear programming, which works surprisingly well in practice; despite having exponential worst-case time complexity, it runs on par with the best known polynomial-time algorithms.

Finally, there are types of computation which do not conform to the Turing machine model on which P and NP are defined, such as quantum computation and randomized algorithms.
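
The knapsack problem just mentioned is a good example of this nuance: the standard dynamic program below (a minimal Python sketch; the instance is made up) runs in time proportional to the number of items times the capacity. That is pseudo-polynomial, polynomial in the numeric capacity but exponential in its bit-length, so it does not place the problem in P, yet it handles many practical instances comfortably.

 def knapsack_max_value(weights, values, capacity):
     """Classic dynamic program for 0/1 knapsack.  Running time is
     O(n * capacity): polynomial in the numeric value of the capacity but
     exponential in the number of bits used to write it, so it does not
     show that knapsack is in P.  It is nevertheless very effective when
     capacities are moderate."""
     best = [0] * (capacity + 1)
     for w, v in zip(weights, values):
         for c in range(capacity, w - 1, -1):   # downwards: each item used at most once
             best[c] = max(best[c], best[c - w] + v)
     return best[capacity]
 
 print(knapsack_max_value([3, 4, 2], [30, 50, 15], 6))  # 65 (items of weight 4 and 2)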


Reasons to believe P ≠ NP or P = NP

Cook provides a restatement of the problem in ''The P Versus NP Problem'' as "Does P = NP?" According to polls, most computer scientists believe that P ≠ NP. A key reason for this belief is that after decades of studying these problems no one has been able to find a polynomial-time algorithm for any of more than 3000 important known NP-complete problems (see List of NP-complete problems). These algorithms were sought long before the concept of NP-completeness was even defined (Karp's 21 NP-complete problems, among the first found, were all well-known existing problems at the time they were shown to be NP-complete). Furthermore, the result P = NP would imply many other startling results that are currently believed to be false, such as NP = co-NP and P = PH. It is also intuitively argued that the existence of problems that are hard to solve but whose solutions are easy to verify matches real-world experience. On the other hand, some researchers believe that there is overconfidence in believing P ≠ NP and that researchers should explore proofs of P = NP as well; for example, several researchers made statements to this effect in 2002.


Consequences of solution

One of the reasons the problem attracts so much attention is the consequences of the possible answers. Either direction of resolution would advance theory enormously, and perhaps have huge practical consequences as well.


P = NP

A proof that P = NP could have stunning practical consequences if the proof leads to efficient methods for solving some of the important problems in NP. The potential consequences, both positive and negative, arise since various NP-complete problems are fundamental in many fields.

It is also very possible that a proof would ''not'' lead to practical algorithms for NP-complete problems. The formulation of the problem does not require that the bounding polynomial be small or even specifically known. A non-constructive proof might show a solution exists without specifying either an algorithm to obtain it or a specific bound. Even if the proof is constructive, showing an explicit bounding polynomial and algorithmic details, if the polynomial is not very low-order the algorithm might not be sufficiently efficient in practice. In this case the initial proof would be mainly of interest to theoreticians, but the knowledge that polynomial-time solutions are possible would surely spur research into better (and possibly practical) methods to achieve them.

An example of a field that could be upended by a solution showing P = NP is cryptography, which relies on certain problems being difficult. A constructive and efficient solution to an NP-complete problem such as 3-SAT would break most existing cryptosystems, including:
* Existing implementations of public-key cryptography, a foundation for many modern security applications such as secure financial transactions over the Internet.
* Symmetric ciphers such as AES or 3DES, used for the encryption of communications data.
* Cryptographic hashing, which underlies blockchain cryptocurrencies such as Bitcoin, and is used to authenticate software updates. For these applications, the problem of finding a pre-image that hashes to a given value must be difficult in order to be useful, and ideally should require exponential time. However, if P = NP, then finding a pre-image ''M'' can be done in polynomial time, through reduction to SAT.
(Exactly how efficient a solution must be to pose a threat to cryptography depends on the details: a solution of O(N^2) with a reasonable constant term would be disastrous, while a solution that is \Omega(N^4) in almost all cases would not pose an immediate practical danger.) These cryptosystems would need to be modified or replaced by information-theoretically secure solutions not inherently based on P–NP inequivalence.
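
To illustrate the cost assumption that hash-based schemes rely on, here is a toy Python sketch of an exhaustive pre-image search using the standard hashlib module; the target, alphabet and message length are only illustrative values. The loop is exponential in the message length, which is the kind of work an attacker is assumed to need; a constructive, efficient resolution of P = NP (for example a fast SAT solver combined with the reduction mentioned above) would undermine that assumption.

 import hashlib
 from itertools import product
 
 def find_preimage(target_hex, length, alphabet=b"abcdefghijklmnopqrstuvwxyz"):
     """Exhaustive pre-image search over all strings of a fixed length.
     The loop runs len(alphabet)**length times, i.e. exponential in the
     message length; this brute-force cost is what hash-based security
     assumes an attacker must pay."""
     for candidate in product(alphabet, repeat=length):
         message = bytes(candidate)
         if hashlib.sha256(message).hexdigest() == target_hex:
             return message
     return None
 
 target = hashlib.sha256(b"np").hexdigest()   # toy target: a 2-character message
 print(find_preimage(target, 2))              # b'np', found by brute force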
On the other hand, there are enormous positive consequences that would follow from rendering tractable many currently mathematically intractable problems. For instance, many problems in operations research are NP-complete, such as some types of integer programming and the travelling salesman problem. Efficient solutions to these problems would have enormous implications for logistics. Many other important problems, such as some problems in protein structure prediction, are also NP-complete; if these problems were efficiently solvable, it could spur considerable advances in life sciences and biotechnology.

But such changes may pale in significance compared to the revolution an efficient method for solving NP-complete problems would cause in mathematics itself. Gödel, in his early thoughts on computational complexity, noted that a mechanical method that could solve any problem would revolutionize mathematics, and Stephen Cook makes a similar point (assuming not only a proof, but a practically efficient algorithm). Research mathematicians spend their careers trying to prove theorems, and some proofs have taken decades or even centuries to find after problems have been stated; for instance, Fermat's Last Theorem took over three centuries to prove. A method that is guaranteed to find proofs to theorems, should one of a "reasonable" size exist, would essentially end this struggle. Donald Knuth has stated that he has come to believe that P = NP, but is reserved about the impact of a possible proof.


P ≠ NP

A proof showing that P ≠ NP would lack the practical computational benefits of a proof that P = NP, but would nevertheless represent a very significant advance in computational complexity theory and provide guidance for future research. It would allow one to show in a formal way that many common problems cannot be solved efficiently, so that the attention of researchers can be focused on partial solutions or solutions to other problems. Due to widespread belief in P ≠ NP, much of this focusing of research has already taken place.

Also, P ≠ NP still leaves open the average-case complexity of hard problems in NP. For example, it is possible that SAT requires exponential time in the worst case, but that almost all randomly selected instances of it are efficiently solvable. Russell Impagliazzo has described five hypothetical "worlds" that could result from different possible resolutions to the average-case complexity question. These range from "Algorithmica", where P = NP and problems like SAT can be solved efficiently in all instances, to "Cryptomania", where P ≠ NP and generating hard instances of problems outside P is easy, with three intermediate possibilities reflecting different possible distributions of difficulty over instances of NP-hard problems. The "world" where P ≠ NP but all problems in NP are tractable in the average case is called "Heuristica" in the paper. A Princeton University workshop in 2009 studied the status of the five worlds.


Results about difficulty of proof

Although the P = NP problem itself remains open despite a million-dollar prize and a huge amount of dedicated research, efforts to solve the problem have led to several new techniques. In particular, some of the most fruitful research related to the P = NP problem has been in showing that existing proof techniques are not powerful enough to answer the question, thus suggesting that novel technical approaches are required. As additional evidence for the difficulty of the problem, essentially all known proof techniques in computational complexity theory fall into a few broad classes (relativizing proofs, natural proofs, and algebrizing proofs), each of which is known to be insufficient to prove that P ≠ NP.

These barriers are another reason why NP-complete problems are useful: if a polynomial-time algorithm can be demonstrated for an NP-complete problem, this would solve the P = NP problem in a way not excluded by the above results. These barriers have also led some computer scientists to suggest that the P versus NP problem may be independent of standard axiom systems like ZFC (that is, it can be neither proved nor disproved within them). The interpretation of an independence result could be either that no polynomial-time algorithm exists for any NP-complete problem, and such a proof cannot be constructed in (e.g.) ZFC, or that polynomial-time algorithms for NP-complete problems may exist, but it is impossible to prove in ZFC that such algorithms are correct. However, if it can be shown, using techniques of the sort that are currently known to be applicable, that the problem cannot be decided even with much weaker assumptions extending the Peano axioms (PA) for integer arithmetic, then there would necessarily exist nearly polynomial-time algorithms for every problem in NP. Therefore, if one believes (as most complexity theorists do) that not all problems in NP have efficient algorithms, it would follow that proofs of independence using those techniques cannot be possible. Additionally, this result implies that proving independence from PA or ZFC using currently known techniques is no easier than proving the existence of efficient algorithms for all problems in NP.


Claimed solutions

While the P versus NP problem is generally considered unsolved, many amateur and some professional researchers have claimed solutions. Gerhard J. Woeginger compiled a list of purported solutions published between 1986 and 2016; it includes 62 purported proofs of P = NP, 50 proofs of P ≠ NP, 2 proofs that the problem is unprovable, and one proof that it is undecidable. Some attempts at resolving P versus NP have received brief media attention, though these attempts have since been refuted.


Logical characterizations

The P = NP problem can be restated in terms of the expressibility of certain classes of logical statements, as a result of work in descriptive complexity. Consider all languages of finite structures with a fixed signature including a linear order relation. Then, all such languages in P can be expressed in first-order logic with the addition of a suitable least fixed-point combinator. Effectively, this, in combination with the order, allows the definition of recursive functions. As long as the signature contains at least one predicate or function in addition to the distinguished order relation, so that the amount of space taken to store such finite structures is actually polynomial in the number of elements in the structure, this precisely characterizes P.

Similarly, NP is the set of languages expressible in existential second-order logic, that is, second-order logic restricted to exclude universal quantification over relations, functions, and subsets. The languages in the polynomial hierarchy, PH, correspond to all of second-order logic. Thus, the question "is P a proper subset of NP?" can be reformulated as "is existential second-order logic able to describe languages (of finite linearly ordered structures with nontrivial signature) that first-order logic with least fixed point cannot?". The word "existential" can even be dropped from the previous characterization, since P = NP if and only if P = PH (as the former would establish that NP = co-NP, which in turn implies that NP = PH).
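
As a concrete illustration of this existential second-order characterization (Fagin's theorem), 3-colorability of a graph with edge relation ''E'', an NP-complete property, can be expressed by existentially quantifying three colour classes (second-order set variables ''R'', ''G'', ''B'') and then using only first-order quantification; the formula below is one of several equivalent ways to write it:

:\exists R\,\exists G\,\exists B\;\Big[\forall x\,\big(R(x)\lor G(x)\lor B(x)\big)\;\land\;\forall x\,\forall y\,\Big(E(x,y)\rightarrow\big(\neg(R(x)\land R(y))\land\neg(G(x)\land G(y))\land\neg(B(x)\land B(y))\big)\Big)\Big]

The first conjunct says every vertex receives at least one colour, and the second says no edge joins two vertices sharing a colour; the structure (the graph) satisfies the sentence exactly when it is 3-colorable.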


Polynomial-time algorithms

No algorithm for any NP-complete problem is known to run in polynomial time. However, there are algorithms known for NP-complete problems with the property that if P = NP, then the algorithm runs in polynomial time on accepting instances (although with enormous constants, making the algorithm impractical). However, these algorithms do not qualify as polynomial time because their running time on rejecting instances is not polynomial. The following algorithm, due to Levin (without any citation), is such an example. It correctly accepts the NP-complete language SUBSET-SUM. It runs in polynomial time on inputs that are in SUBSET-SUM if and only if P = NP:

 // Algorithm that accepts the NP-complete language SUBSET-SUM.
 //
 // This is a polynomial-time algorithm if and only if P = NP.
 //
 // "Polynomial-time" means it returns "yes" in polynomial time when
 // the answer should be "yes", and runs forever when it is "no".
 //
 // Input: S = a finite set of integers
 // Output: "yes" if any subset of S adds up to 0.
 // Runs forever with no output otherwise.
 //
 // Note: "Program number M" is the program obtained by
 // writing the integer M in binary, then
 // considering that string of bits to be a
 // program. Every possible program can be
 // generated this way, though most do nothing
 // because of syntax errors.
 
 FOR K = 1...∞
   FOR M = 1...K
     Run program number M for K steps with input S
     IF the program outputs a list of distinct integers
       AND the integers are all in S
       AND the integers sum to 0
     THEN
       OUTPUT "yes" and HALT

If, and only if, P = NP, then this is a polynomial-time algorithm accepting an NP-complete language. "Accepting" means it gives "yes" answers in polynomial time, but is allowed to run forever when the answer is "no" (also known as a ''semi-algorithm''). This algorithm is enormously impractical, even if P = NP. If the shortest program that can solve SUBSET-SUM in polynomial time is ''b'' bits long, the above algorithm will try at least 2^{b} - 1 other programs first.


Formal definitions


P and NP

Conceptually speaking, a ''decision problem'' is a problem that takes as input some string ''w'' over an alphabet Σ, and outputs "yes" or "no". If there is an algorithm (say a Turing machine, or a computer program with unbounded memory) that can produce the correct answer for any input string of length ''n'' in at most cn^k steps, where ''k'' and ''c'' are constants independent of the input string, then we say that the problem can be solved in ''polynomial time'' and we place it in the class P. Formally, P is defined as the set of all languages that can be decided by a deterministic polynomial-time Turing machine. That is,
:\mathbf{P} = \{ L : L = L(M) \text{ for some deterministic polynomial-time Turing machine } M \}
where
:L(M) = \{ w \in \Sigma^{*} : M \text{ accepts } w \}
and a deterministic polynomial-time Turing machine is a deterministic Turing machine ''M'' that satisfies the following two conditions:
# ''M'' halts on all inputs ''w'' and
# there exists k \in \mathbb{N} such that T_M(n) \in O(n^k), where ''O'' refers to the big O notation and
::T_M(n) = \max\{ t_M(w) : w \in \Sigma^{*}, |w| = n \}
::t_M(w) = \text{the number of steps } M \text{ takes to halt on input } w.
NP can be defined similarly using nondeterministic Turing machines (the traditional way). However, a modern approach to define NP is to use the concept of ''certificate'' and ''verifier''. Formally, NP is defined as the set of languages over a finite alphabet that have a verifier that runs in polynomial time, where the notion of "verifier" is defined as follows.

Let ''L'' be a language over a finite alphabet, Σ. ''L'' ∈ NP if, and only if, there exists a binary relation R \subset \Sigma^{*} \times \Sigma^{*} and a positive integer ''k'' such that the following two conditions are satisfied:
# For all x \in \Sigma^{*}, x \in L \Leftrightarrow \exists y \in \Sigma^{*} such that (''x'', ''y'') ∈ ''R'' and |y| \in O(|x|^k); and
# the language L_{R} = \{ x\# y : (x, y) \in R \} (the language consisting of ''x'' followed by ''y'' with a delimiter in the middle) over \Sigma \cup \{\#\} is decidable by a deterministic Turing machine in polynomial time.
A Turing machine that decides L_{R} is called a ''verifier'' for ''L'' and a ''y'' such that (''x'', ''y'') ∈ ''R'' is called a ''certificate of membership'' of ''x'' in ''L''. In general, a verifier does not have to be polynomial-time. However, for ''L'' to be in NP, there must be a verifier that runs in polynomial time.
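
The following Python sketch mirrors this certificate-and-verifier definition for the SUBSET-SUM language used elsewhere in this article; the encoding of instances and certificates as Python lists is an illustrative assumption.

 def subset_sum_verifier(instance, certificate):
     """A polynomial-time verifier in the sense of the definition above:
     'instance' x is a list of integers and 'certificate' y is a list of
     indices into x.  The pair (x, y) is accepted exactly when y selects a
     non-empty subset of x summing to zero; the check clearly runs in time
     polynomial in the combined size of x and y."""
     if not certificate:
         return False
     if len(set(certificate)) != len(certificate):
         return False                                  # indices must be distinct
     if any(i < 0 or i >= len(instance) for i in certificate):
         return False                                  # indices must point into x
     return sum(instance[i] for i in certificate) == 0
 
 print(subset_sum_verifier([3, -9, 5, 4, -2], [1, 2, 3]))  # True: -9 + 5 + 4 = 0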


Example

Let
:\mathrm{COMPOSITE} = \left\{ x \in \mathbb{N} \mid x = pq \text{ for integers } p, q > 1 \right\}
:R = \left\{ (x, y) \in \mathbb{N} \times \mathbb{N} \mid 1 < y \leq \sqrt{x} \text{ and } y \text{ divides } x \right\}.
Clearly, the question of whether a given ''x'' is a composite is equivalent to the question of whether ''x'' is a member of COMPOSITE. It can be shown that COMPOSITE ∈ NP by verifying that it satisfies the above definition (if we identify natural numbers with their binary representations). COMPOSITE also happens to be in P, a fact demonstrated by the invention of the AKS primality test.
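
A minimal Python sketch of a verifier for the relation ''R'' above (the function name and test values are illustrative); arithmetic and comparison on ''n''-bit integers take polynomial time, so the check is a polynomial-time verification that ''x'' is composite.

 def composite_certificate_ok(x, y):
     """Checks the relation R above: y witnesses that x is composite when
     1 < y <= sqrt(x) and y divides x.  Multiplication, comparison and
     remainder on n-bit integers all take polynomial time."""
     return 1 < y and y * y <= x and x % y == 0
 
 print(composite_certificate_ok(91, 7))   # True: 91 = 7 * 13
 print(composite_certificate_ok(97, 7))   # False: 97 is prime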


NP-completeness

There are many equivalent ways of describing NP-completeness. Let ''L'' be a language over a finite alphabet Σ. ''L'' is NP-complete if, and only if, the following two conditions are satisfied:
# ''L'' ∈ NP; and
# any ''L&prime;'' in NP is polynomial-time-reducible to ''L'' (written as L' \leq_{p} L), where L' \leq_{p} L if, and only if, the following two conditions are satisfied:
## There exists ''f'' : Σ* → Σ* such that for all ''w'' in Σ* we have: (w \in L' \Leftrightarrow f(w) \in L); and
## there exists a polynomial-time Turing machine that halts with ''f''(''w'') on its tape on any input ''w''.
Alternatively, if ''L'' ∈ NP, and there is another NP-complete problem that can be polynomial-time reduced to ''L'', then ''L'' is NP-complete. This is a common way of proving some new problem is NP-complete.


Popular culture

The film ''Travelling Salesman'', by director Timothy Lanzone, is the story of four mathematicians hired by the US government to solve the P versus NP problem. In the sixth episode of ''The Simpsons''' seventh season, "Treehouse of Horror VI", the equation P = NP is seen shortly after Homer accidentally stumbles into the "third dimension". In the second episode of season 2 of ''Elementary'', "Solve for X" revolves around Sherlock and Watson investigating the murders of mathematicians who were attempting to solve P versus NP.


See also

* Game complexity
* List of unsolved problems in mathematics
* Unique games conjecture
* Unsolved problems in computer science




External links

* Aviad Rubinstein's ''Hardness of Approximation Between P and NP'', winner of the ACM's 2017 Doctoral Dissertation Award