Backtracking is a class of algorithms for finding solutions to some computational problems, notably constraint satisfaction problems, that incrementally build candidates to the solutions and abandon a candidate ("backtrack") as soon as they determine that the candidate cannot possibly be completed to a valid solution.

The classic textbook example of the use of backtracking is the eight queens puzzle, which asks for all arrangements of eight chess queens on a standard chessboard so that no queen attacks any other. In the common backtracking approach, the partial candidates are arrangements of ''k'' queens in the first ''k'' rows of the board, all in different rows and columns. Any partial solution that contains two mutually attacking queens can be abandoned.

Backtracking can be applied only to problems that admit the concept of a "partial candidate solution" and a relatively quick test of whether it can possibly be completed to a valid solution. It is useless, for example, for locating a given value in an unordered table. When it is applicable, however, backtracking is often much faster than brute-force enumeration of all complete candidates, since it can eliminate many candidates with a single test.

Backtracking is an important tool for solving constraint satisfaction problems, such as crosswords, verbal arithmetic, Sudoku, and many other puzzles. It is often the most convenient technique for parsing, for the knapsack problem and other combinatorial optimization problems. It is also the basis of the so-called logic programming languages such as Icon, Planner and Prolog.

Backtracking depends on user-given "black box procedures" that define the problem to be solved, the nature of the partial candidates, and how they are extended into complete candidates. It is therefore a metaheuristic rather than a specific algorithm, although, unlike many other metaheuristics, it is guaranteed to find all solutions to a finite problem in a bounded amount of time.

The term "backtrack" was coined by American mathematician D. H. Lehmer in the 1950s. The pioneer string-processing language SNOBOL (1962) may have been the first to provide a built-in general backtracking facility.
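As an illustration of the eight queens approach described earlier in this section, the following Python sketch places one queen per row and abandons a placement as soon as it attacks an earlier queen. The names `solve_queens`, `attacks` and `extend` are hypothetical and chosen only for this example; it is one possible rendering of the idea, not a reference implementation.

```python
def solve_queens(n=8):
    """Return all placements of n mutually non-attacking queens.

    A partial candidate is a list `cols` in which cols[r] is the column of
    the queen in row r, for the first len(cols) rows of the board.
    """
    solutions = []

    def attacks(cols, new_col):
        # Would a queen placed in the next row at new_col attack an earlier one?
        new_row = len(cols)
        for row, col in enumerate(cols):
            if col == new_col or abs(col - new_col) == new_row - row:
                return True
        return False

    def extend(cols):
        if len(cols) == n:             # every row is filled: a full solution
            solutions.append(tuple(cols))
            return
        for col in range(n):
            if attacks(cols, col):     # abandon this candidate immediately
                continue
            cols.append(col)
            extend(cols)
            cols.pop()                 # backtrack: undo the last placement

    extend([])
    return solutions
```

Calling `solve_queens(8)` returns the list of all complete placements for the standard board; each entry is a tuple of column indices, one per row.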

Description of the method

The backtracking algorithm enumerates a set of ''partial candidates'' that, in principle, could be ''completed'' in various ways to give all the possible solutions to the given problem. The completion is done incrementally, by a sequence of ''candidate extension steps.''

Conceptually, the partial candidates are represented as the nodes of a tree structure, the ''potential search tree.'' Each partial candidate is the parent of the candidates that differ from it by a single extension step; the leaves of the tree are the partial candidates that cannot be extended any further.

The backtracking algorithm traverses this search tree recursively, from the root down, in depth-first order. At each node ''c'', the algorithm checks whether ''c'' can be completed to a valid solution. If it cannot, the whole sub-tree rooted at ''c'' is skipped (''pruned''). Otherwise, the algorithm (1) checks whether ''c'' itself is a valid solution, and if so reports it to the user; and (2) recursively enumerates all sub-trees of ''c''. The two tests and the children of each node are defined by user-given procedures.

Therefore, the ''actual search tree'' that is traversed by the algorithm is only a part of the potential tree. The total cost of the algorithm is the number of nodes of the actual tree times the cost of obtaining and processing each node. This fact should be considered when choosing the potential search tree and implementing the pruning test.
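The difference between the potential and the actual search tree can be made concrete by counting visited nodes. The sketch below is an assumption-laden illustration rather than part of the method itself: it uses the row-by-row ''n'' queens formulation as the example problem, and the function name `count_nodes` and its `prune` flag are hypothetical.

```python
def count_nodes(n, prune):
    """Count the search-tree nodes visited for n queens placed row by row.

    With prune=False no candidate is rejected early, so the traversal covers
    the whole potential tree. With prune=True, a candidate that already
    contains two attacking queens is skipped together with its sub-tree.
    Returns (nodes_visited, solutions_found).
    """
    visited = 0
    solutions = 0

    def safe(cols):
        # Check every pair of queens in the candidate.
        for r1 in range(len(cols)):
            for r2 in range(r1 + 1, len(cols)):
                if cols[r1] == cols[r2] or abs(cols[r1] - cols[r2]) == r2 - r1:
                    return False
        return True

    def visit(cols):
        nonlocal visited, solutions
        visited += 1
        if prune and not safe(cols):
            return                     # prune the sub-tree rooted here
        if len(cols) == n:
            if safe(cols):
                solutions += 1
            return
        for col in range(n):
            visit(cols + [col])

    visit([])
    return visited, solutions
```

Both settings find the same solutions, but the pruned traversal visits far fewer nodes. For ''n'' = 4 the potential tree has 1 + 4 + 16 + 64 + 256 = 341 nodes, which is what `count_nodes(4, False)` visits.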

Pseudocode

In order to apply backtracking to a specific class of problems, one must provide the data ''P'' for the particular instance of the problem that is to be solved, and six procedural parameters, ''root'', ''reject'', ''accept'', ''first'', ''next'', and ''output''. These procedures should take the instance data ''P'' as a parameter and should do the following:

# ''root''(''P''): return the partial candidate at the root of the search tree.
# ''reject''(''P'',''c''): return ''true'' only if the partial candidate ''c'' is not worth completing.
# ''accept''(''P'',''c''): return ''true'' if ''c'' is a solution of ''P'', and ''false'' otherwise.
# ''first''(''P'',''c''): generate the first extension of candidate ''c''.
# ''next''(''P'',''s''): generate the next alternative extension of a candidate, after the extension ''s''.
# ''output''(''P'',''c''): use the solution ''c'' of ''P'', as appropriate to the application.

The backtracking algorithm reduces the problem to the call ''backtrack''(''P'', ''root''(''P'')), where ''backtrack'' is the following recursive procedure:

    procedure backtrack(P, c) is
        if reject(P, c) then return
        if accept(P, c) then output(P, c)
        s ← first(P, c)
        while s ≠ NULL do
            backtrack(P, s)
            s ← next(P, s)
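The pseudocode above translates almost line for line into Python. In the sketch below, `None` stands in for NULL and the procedural parameters are passed as ordinary functions; `next_` carries a trailing underscore only to avoid shadowing Python's built-in `next`. The example instance (binary strings of length ''P'' with no two adjacent 1s) is an assumption introduced purely for illustration.

```python
def backtrack(P, c, reject, accept, first, next_, output):
    """Generic backtracking procedure, following the pseudocode."""
    if reject(P, c):
        return
    if accept(P, c):
        output(P, c)
    s = first(P, c)
    while s is not None:               # None plays the role of NULL
        backtrack(P, s, reject, accept, first, next_, output)
        s = next_(P, s)


# Illustrative instance: P is the required length, candidates are tuples of bits.
def root(P):
    return ()


def reject(P, c):
    # Only the last two bits need checking: ancestors already passed the test.
    return len(c) >= 2 and c[-1] == 1 and c[-2] == 1


def accept(P, c):
    return len(c) == P


def first(P, c):
    return None if len(c) == P else c + (0,)


def next_(P, s):
    return None if s[-1] == 1 else s[:-1] + (1,)


found = []
backtrack(3, root(3), reject, accept, first, next_,
          lambda P, c: found.append(c))
```

After the call, `found` holds the five admissible strings of length 3 in depth-first order, starting with `(0, 0, 0)` and ending with `(1, 0, 1)`.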

Usage considerations

The ''reject'' procedure should be a boolean-valued function that returns ''true'' only if it is certain that no possible extension of ''c'' is a valid solution for ''P''. If the procedure cannot reach a definite conclusion, it should return ''false''. An incorrect ''true'' result may cause the ''backtrack'' procedure to miss some valid solutions. The procedure may assume that ''reject''(''P'',''t'') returned ''false'' for every ancestor ''t'' of ''c'' in the search tree.

On the other hand, the efficiency of the backtracking algorithm depends on ''reject'' returning ''true'' for candidates that are as close to the root as possible. If ''reject'' always returns ''false'', the algorithm will still find all solutions, but it will be equivalent to a brute-force search.

The ''accept'' procedure should return ''true'' if ''c'' is a complete and valid solution for the problem instance ''P'', and ''false'' otherwise. It may assume that the partial candidate ''c'' and all its ancestors in the tree have passed the ''reject'' test.

The general pseudo-code above does not assume that the valid solutions are always leaves of the potential search tree. In other words, it admits the possibility that a valid solution for ''P'' can be further extended to yield other valid solutions.

The ''first'' and ''next'' procedures are used by the backtracking algorithm to enumerate the children of a node ''c'' of the tree, that is, the candidates that differ from ''c'' by a single extension step. The call ''first''(''P'',''c'') should yield the first child of ''c'', in some order; and the call ''next''(''P'',''s'') should return the next sibling of node ''s'', in that order. Both functions should return a distinctive "NULL" candidate if the requested child does not exist.

Together, the ''root'', ''first'', and ''next'' functions define the set of partial candidates and the potential search tree. They should be chosen so that every solution of ''P'' occurs somewhere in the tree, and no partial candidate occurs more than once. Moreover, they should admit an efficient and effective ''reject'' predicate.
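The case in which a valid solution can itself be extended to further valid solutions can be seen in a small, hypothetical instance: listing every non-empty subset of some positive weights whose total does not exceed a limit. In this sketch (all names are illustrative assumptions) a candidate is a tuple of strictly increasing indices, so each subset appears exactly once in the tree, and `accept` relies on the candidate having already passed `reject`.

```python
def backtrack(P, c, reject, accept, first, next_, output):
    """Generic backtracking procedure (None stands in for NULL)."""
    if reject(P, c):
        return
    if accept(P, c):
        output(P, c)
    s = first(P, c)
    while s is not None:
        backtrack(P, s, reject, accept, first, next_, output)
        s = next_(P, s)


# P = (weights, limit); weights are assumed positive, so adding items can
# never bring an over-limit candidate back under the limit.
def reject(P, c):
    weights, limit = P
    return sum(weights[i] for i in c) > limit


def accept(P, c):
    # c has passed reject, so any non-empty candidate is a valid solution,
    # including internal nodes that can still be extended.
    return len(c) > 0


def first(P, c):
    weights, _ = P
    j = c[-1] + 1 if c else 0          # children use larger indices only
    return c + (j,) if j < len(weights) else None


def next_(P, s):
    weights, _ = P
    j = s[-1] + 1                      # next sibling: bump the last index
    return s[:-1] + (j,) if j < len(weights) else None


def subsets_within(weights, limit):
    found = []
    P = (tuple(weights), limit)
    backtrack(P, (), reject, accept, first, next_,
              lambda P, c: found.append(c))
    return found
```

For weights `(2, 3, 4)` and limit `5`, the candidate `(0,)` is reported as a solution and is then extended to `(0, 1)`, which is reported as well; `(0, 1, 2)` and `(0, 2)` are rejected.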

Early stopping variants

The pseudo-code above will call ''output'' for all candidates that are a solution to the given instance ''P''. The algorithm can be modified to stop after finding the first solution, or a specified number of solutions; or after testing a specified number of partial candidates, or after spending a given amount of CPU time.
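One way to obtain these variants in Python, offered here as a sketch rather than a standard recipe, is to write the procedure as a generator: the caller then decides how many solutions to pull, and the search does no further work once the caller stops. A budget on tested candidates is sketched by wrapping `reject` in a counter. The names `backtrack_iter`, `with_budget` and `BudgetExceeded` are hypothetical, and the ''n'' queens instance is used only as an example.

```python
from itertools import islice


class BudgetExceeded(Exception):
    """Raised when more than the allowed number of candidates was tested."""


def backtrack_iter(P, c, reject, accept, first, next_):
    """Generator form of backtrack: yields solutions instead of calling output."""
    if reject(P, c):
        return
    if accept(P, c):
        yield c
    s = first(P, c)
    while s is not None:
        yield from backtrack_iter(P, s, reject, accept, first, next_)
        s = next_(P, s)


def with_budget(reject, max_nodes):
    """Wrap reject so that testing more than max_nodes candidates aborts."""
    tested = 0

    def limited(P, c):
        nonlocal tested
        tested += 1
        if tested > max_nodes:
            raise BudgetExceeded
        return reject(P, c)

    return limited


# Illustrative instance: n queens, candidate = tuple of columns, one per row.
def reject(n, c):
    row = len(c) - 1
    return any(c[row] == c[r] or abs(c[row] - c[r]) == row - r
               for r in range(row))


def accept(n, c):
    return len(c) == n


def first(n, c):
    return None if len(c) == n else c + (0,)


def next_(n, s):
    return None if s[-1] == n - 1 else s[:-1] + (s[-1] + 1,)


def first_solution(n):
    # Stops the search as soon as one solution has been produced.
    return next(backtrack_iter(n, (), reject, accept, first, next_), None)


def some_solutions(n, limit):
    # Stops after at most `limit` solutions.
    return list(islice(backtrack_iter(n, (), reject, accept, first, next_), limit))
```

A limit on CPU time could be added in the same style, for instance by comparing `time.process_time()` against a deadline inside the wrapper instead of counting calls.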

Examples

Examples where backtracking can be used to solve puzzles or problems include:

* Puzzles such as the eight queens puzzle, crosswords, verbal arithmetic, Sudoku, and Peg Solitaire.
* Combinatorial optimization problems such as parsing and the knapsack problem.
* Logic programming languages such as Icon, Planner and Prolog, which use backtracking internally to generate answers.

The following is an example where backtracking is used for the constraint satisfaction problem:

Constraint satisfaction

The general constraint satisfaction problem consists in finding a list of integers $x = (x_1, x_2, \ldots, x_n)$, each in some range $\{1, 2, \ldots, m\}$, that satisfies some arbitrary constraint (boolean function) ''F''.

For this class of problems, the instance data ''P'' would be the integers ''m'' and ''n'', and the predicate ''F''. In a typical backtracking solution to this problem, one could define a partial candidate as a list of integers $c = (c_1, c_2, \ldots, c_k)$, for any ''k'' between 0 and ''n'', that are to be assigned to the first ''k'' variables $x_1, x_2, \ldots, x_k$. The root candidate would then be the empty list (). The ''first'' and ''next'' procedures would then be

    function first(P, c) is
        k ← length(c)
        if k = n then
            return NULL
        else
            return (c[1], c[2], …, c[k], 1)

    function next(P, s) is
        k ← length(s)
        if s[k] = m then
            return NULL
        else
            return (s[1], s[2], …, s[k − 1], 1 + s[k])

Here ''length''(''c'') is the number of elements in the list ''c''.

The call ''reject''(''P'', ''c'') should return ''true'' if the constraint ''F'' cannot be satisfied by any list of ''n'' integers that begins with the ''k'' elements of ''c''. For backtracking to be effective, there must be a way to detect this situation, at least for some candidates ''c'', without enumerating all those $m^{n-k}$ ''n''-tuples.

For example, if ''F'' is the conjunction of several boolean predicates, $F = F_1 \wedge F_2 \wedge \cdots \wedge F_p$, and each $F_i$ depends only on a small subset of the variables $x_1, \ldots, x_n$, then the ''reject'' procedure could simply check the terms $F_i$ that depend only on variables $x_1, \ldots, x_k$, and return ''true'' if any of those terms returns ''false''. In fact, ''reject'' needs only check those terms that do depend on $x_k$, since the terms that depend only on $x_1, \ldots, x_{k-1}$ will have been tested further up in the search tree.

Assuming that ''reject'' is implemented as above, then ''accept''(''P'', ''c'') needs only check whether ''c'' is complete, that is, whether it has ''n'' elements.

It is generally better to order the list of variables so that it begins with the most critical ones (i.e. the ones with fewest value options, or which have a greater impact on subsequent choices).

One could also allow the ''next'' function to choose which variable should be assigned when extending a partial candidate, based on the values of the variables already assigned by it. Further improvements can be obtained by the technique of constraint propagation.

In addition to retaining minimal recovery values used in backing up, backtracking implementations commonly keep a variable trail, to record value change history. An efficient implementation will avoid creating a variable trail entry between two successive changes when there is no choice point, as the backtracking will erase all of the changes as a single operation.

An alternative to the variable trail is to keep a timestamp of when the last change was made to the variable. The timestamp is compared to the timestamp of a choice point. If the choice point has an associated time later than that of the variable, it is unnecessary to revert the variable when the choice point is backtracked, as it was changed before the choice point occurred.
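The constraint satisfaction scheme described in this section can be sketched in Python as follows. The representation of ''F'' as a list of `(indices, predicate)` pairs, the 0-based variable positions, and the name `solve_csp` are assumptions made for this example. Each term is filed under the last variable it depends on, so that `reject` evaluates only the terms involving the most recently assigned variable.

```python
def solve_csp(m, n, constraints):
    """Enumerate tuples (x_1, ..., x_n), each value in 1..m, satisfying F.

    `constraints` is a list of (indices, predicate) pairs, where `indices`
    are 0-based variable positions (non-empty) and `predicate` receives the
    corresponding values in that order. F is the conjunction of all terms.
    """
    # File each term under the last variable it mentions, so it is tested
    # exactly once: when that variable receives a value.
    by_last = [[] for _ in range(n)]
    for indices, pred in constraints:
        by_last[max(indices)].append((indices, pred))

    def reject(c):
        if not c:
            return False
        k = len(c)
        return any(not pred(*(c[i] for i in indices))
                   for indices, pred in by_last[k - 1])

    def accept(c):
        return len(c) == n             # complete; terms were checked by reject

    def first(c):
        return None if len(c) == n else c + (1,)

    def next_(s):
        return None if s[-1] == m else s[:-1] + (s[-1] + 1,)

    solutions = []

    def backtrack(c):
        if reject(c):
            return
        if accept(c):
            solutions.append(c)
        s = first(c)
        while s is not None:
            backtrack(s)
            s = next_(s)

    backtrack(())
    return solutions
```

As a usage example, three variables in the range 1 to 3 with the terms "first differs from second", "second differs from third" and "first is less than third" can be passed as `solve_csp(3, 3, [((0, 1), lambda a, b: a != b), ((1, 2), lambda a, b: a != b), ((0, 2), lambda a, b: a < b)])`.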

External links

HBmeyer.de
Interactive animation of a backtracking algorithm
Solving Combinatorial Problems with STL and Backtracking
Article and C++ source code for a generic implementation of backtracking