Backward induction is the process of determining a

sequence In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called ''elements'', or ''terms''). The number of elements (possibly infinite) is cal ...

of optimal choices by reasoning from the endpoint of a problem or situation back to its beginning using individual events or actions. Backward induction involves examining the final point in a series of decisions and identifying the optimal process or action required to arrive at that point. This process continues backward until the best action for every possible point along the sequence is determined. Backward induction was first utilized in 1875 by

Arthur Cayley Arthur Cayley (; 16 August 1821 – 26 January 1895) was a British mathematician who worked mostly on algebra. He helped found the modern British school of pure mathematics, and was a professor at Trinity College, Cambridge for 35 years. He ...

, who discovered the method while attempting to solve the

secretary problem The secretary problem demonstrates a scenario involving optimal stopping theory For French translation, secover storyin the July issue of ''Pour la Science'' (2009). that is studied extensively in the fields of applied probability, statistics, a ...

. In dynamic programming, a method of

mathematical optimization Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfiel ...

, backward induction is used for solving the

Bellman equation A Bellman equation, named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical Optimization (mathematics), optimization method known as dynamic programming. It writes the "value" of a decision problem ...

. In the related fields of

automated planning and scheduling Automated planning and scheduling, sometimes denoted as simply AI planning, is a branch of artificial intelligence that concerns the realization of strategies or action sequences, typically for execution by intelligent agents, autonomous robots ...

and

automated theorem proving Automated theorem proving (also known as ATP or automated deduction) is a subfield of automated reasoning and mathematical logic dealing with proving mathematical theorems by computer programs. Automated reasoning over mathematical proof was a majo ...

, the method is called backward search or

backward chaining Backward chaining (or backward reasoning) is an inference method described colloquially as working backward from the goal. It is used in automated theorem provers, inference engines, proof assistants, and other artificial intelligence application ...

. In chess, it is called retrograde analysis. In

game theory Game theory is the study of mathematical models of strategic interactions. It has applications in many fields of social science, and is used extensively in economics, logic, systems science and computer science. Initially, game theory addressed ...

, a variant of backward induction is used to compute subgame perfect equilibria in

sequential game In game theory, a sequential game is defined as a game where one player selects their action before others, and subsequent players are informed of that choice before making their own decisions. This turn-based structure, governed by a time axis, d ...

s. The difference is that optimization problems involve one

decision maker In psychology, decision-making (also spelled decision making and decisionmaking) is regarded as the cognitive process resulting in the selection of a belief or a course of action among several possible alternative options. It could be either ra ...

who chooses what to do at each point of time. In contrast, game theory problems involve the interacting decision of several

player Player may refer to: Role or adjective * Player (game), a participant in a game or sport ** Gamer, a player in video and tabletop games ** Athlete, a player in sports ** Player character, a character in a video game or role playing game who i ...

s. In this situation, it may still be possible to apply a generalization of backward induction, since it may be possible to determine what the second-to-last player will do by predicting what the last player will do in each situation, and so on. This variant of backward induction has been used to solve formal games from the beginning of game theory.

John von Neumann John von Neumann ( ; ; December 28, 1903 – February 8, 1957) was a Hungarian and American mathematician, physicist, computer scientist and engineer. Von Neumann had perhaps the widest coverage of any mathematician of his time, in ...

and

Oskar Morgenstern Oskar Morgenstern (; January 24, 1902 – July 26, 1977) was a German-born economist. In collaboration with mathematician John von Neumann, he is credited with founding the field of game theory and its application to social sciences and strategic ...

suggested solving

zero-sum Zero-sum game is a mathematical representation in game theory and economic theory of a situation that involves two competing entities, where the result is an advantage for one side and an equivalent loss for the other. In other words, player on ...

, two-person formal games through this method in their '' Theory of Games and Economic Behaviour'' (1944), the book which established game theory as a field of study.

Decision-making example

Optimal-stopping problem

Consider a person evaluating potential employment opportunities for the next ten years, denoted as times

t=1,2,3,...,10

. At each

t

, they may encounter a choice between two job options: a 'good' job offering a

salary A salary is a form of periodic payment from an employer to an employee, which may be specified in an employment contract. It is contrasted with piece wages, where each job, hour or other unit is paid separately, rather than on a periodic basis. ...

$100

or a 'bad' job offering a salary of

$44

. Each job type has an equal probability of being offered. Upon accepting a job, the individual will maintain that particular job for the entire remainder of the ten-year duration. This scenario is simplified by assuming that the individual's entire concern is their total expected monetary earnings, without any variable preferences for earnings across different periods. In economic terms, this is a scenario with an implicit

interest rate An interest rate is the amount of interest due per period, as a proportion of the amount lent, deposited, or borrowed (called the principal sum). The total interest on an amount lent or borrowed depends on the principal sum, the interest rate, ...

of zero and a constant

marginal utility Marginal utility, in mainstream economics, describes the change in ''utility'' (pleasure or satisfaction resulting from the consumption) of one unit of a good or service. Marginal utility can be positive, negative, or zero. Negative marginal utilit ...

of money. Whether the person in question should accept a 'bad' job can be decided by reasoning backwards from time

t=10

. *At

t=10

, the total earnings from accepting a 'good' job is

$100

; the value of accepting a 'bad' job is

$44

; the total earnings from rejecting the available job is

$0

. Therefore, if they are still unemployed in the last period, they should accept whatever job they are offered at that time for greater income. *At

t=9

, the total earnings from accepting a 'good' job is

2\times$100=$200

because that job will last for two years. The total earnings from accepting a 'bad' job is

2 \times $44 = $88

. The total expected earnings from rejecting a job offer are

$0

now plus the value of the next job offer, which will either be

$44

with 1/2 probability or

$100

with 1/2 probability, for an average ('expected') value of

\frac=\$72

. Therefore, the job available at

t=9

should be accepted. *At

t=8

, the total earnings from accepting a 'good' job is

3 \times $100 = $300

; the total earnings from accepting a 'bad' job is

3 \times $44 = $132

. The total expected earnings from rejecting a job offer is

$0

now plus the total expected earnings from waiting for a job offer at

t=9

. As previously concluded, any offer at

t=9

should be accepted and the expected value of doing so is

\frac=\$144

. Therefore, at

t=8

, total expected earnings are higher if the person waits for the next offer rather than accepting a 'bad' job. By continuing to work backwards, it can be verified that a 'bad' offer should only be accepted if the person is still unemployed at

t=9

t=10

; a bad offer should be rejected at any time up to and including

t=8

. Generalizing this example intuitively, it corresponds to the principle that if one expects to work in a job for a long time, it is worth picking carefully. A dynamic optimization problem of this kind is called an

optimal stopping In mathematics, the theory of optimal stopping or early stopping : (For French translation, secover storyin the July issue of ''Pour la Science'' (2009).) is concerned with the problem of choosing a time to take a particular action, in order to ...

problem because the issue at hand is when to stop waiting for a better offer.

Search theory In microeconomics, search theory studies buyers or sellers who cannot instantly find a trading partner, and must therefore search for a partner prior to transacting. It involves determining the best approach to use when looking for a specific ite ...

is a field of microeconomics that applies models of this type to matters such as shopping, job searches, and marriage.

Game theory

, backward induction is a solution methodology that follows from applying sequential rationality to identify an optimal action for each information set in a given

game tree In the context of combinatorial game theory, a game tree is a graph representing all possible game states within a sequential game that has perfect information. Such games include chess, checkers, Go, and tic-tac-toe. A game tree can be us ...

. It develops the implications of rationality via individual information sets in the extensive-form representation of a game. In order to solve for a subgame perfect equilibrium with backwards induction, the game should be written out in extensive form and then divided into subgames. Starting with the subgame furthest from the initial node, or starting point, the expected payoffs listed for this subgame are weighed, and a rational player will select the option with the higher payoff for themselves. The highest payoff

vector Vector most often refers to: * Euclidean vector, a quantity with a magnitude and a direction * Disease vector, an agent that carries and transmits an infectious pathogen into another living organism Vector may also refer to: Mathematics a ...

is selected and marked. To solve for the subgame perfect equilibrium, one should continually work backwards from subgame to subgame until the starting point is reached. As this process progresses, the initial extensive form game will become shorter and shorter. The marked path of vectors is the subgame perfect equilibrium.

Multi-stage game

The application of backward induction in game theory can be demonstrated with a simple example. Consider a multi-stage game involving two players planning to go to a movie. * Player 1 wants to watch ''

The Terminator ''The Terminator'' is a 1984 American science fiction action film directed by James Cameron, written by Cameron and Gale Anne Hurd and produced by Hurd. It stars Arnold Schwarzenegger as the Terminator, a cybernetic assassin sent back in t ...

'', and Player 2 wants to watch '' Joker''. * Player 1 will buy a ticket first and tell Player 2 about her choice. * Next, Player 2 will buy his ticket. Once they both observe the choices, the second stage begins. In the second stage, players choose whether to go to the movie or stay home. * As in the first stage, Player 1 chooses whether to go to the movie first. * After observing Player 1's choice, Player 2 makes his choice. For this example, payoffs are added across different stages. The game is a

perfect information Perfect information is a concept in game theory and economics that describes a situation where all players in a game or all participants in a market have knowledge of all relevant information in the system. This is different than complete informat ...

game. The normal-form matrices for these games are: Normal form game joker terminator

The extensive form of this multi-stage game can be seen to the right. The steps for solving this game with backward induction are as follows: #Analysis starts from the final nodes. #Player 2 will observe 8 subgames from the final nodes to choose "go to movie" or "stay home". #*Player 2 would make 8 possible comparisons in total, choosing the option with the highest payoff in each. #*For example, considering the first subgame, Player 2's payoff of 11 for "go to movie" is higher than his payoff of 7 for "stay at home." Player 2 would therefore choose "go to movie." #*The method continues for every subgame. #Once Player 2's optimal decisions have been determined (bolded green lines in the extensive form diagram), analysis starts for Player 1's decisions in her 4 subgames. #*The process is similar to step 2, comparing Player 1's payoffs in order to anticipate her choices. #*Subgames that would not be chosen by Player 2 in the previous step are no longer considered because they are ruled out by the assumption of rational play. #*For example, in the first subgame, the choice "go to movie" offers a payoff of 9 since the decision tree terminates at the reward (9, 11), considering Player 2's previously established choice. Meanwhile, "stay home" offers a payoff of 1 since it ends at (1, 9), so Player 1 would choose "go to movie." #The process repeats for each player until the initial node is reached. #*For example, Player 2 would choose "Joker" for the first subgame in the next iteration because a payoff of 11 ending in (9, 11) is greater than "Terminator" with a payoff of 6 at (6, 6). #*Player 1, at the initial node, would select "Terminator" because it offers a higher payoff of 11 at (11, 9) than Joker, which has a reward of 9 at (9, 11). #To identify a subgame perfect equilibrium, one needs to identify a route that selects an optimal subgame at each information set. #*In this example, Player 1 chooses "Terminator" and Player 2 also chooses "Terminator." Then they both choose "go to movie." #*The subgame perfect equilibrium leads to a payoff of (11,9).

Limitations

Backward induction can be applied to only limited classes of games. The procedure is well-defined for any game of perfect information with no ties of utility. It is also well-defined and meaningful for games of perfect information with ties. However, in such cases it leads to more than one perfect strategy. The procedure can be applied to some games with nontrivial information sets, but it is not applicable in general. It is best suited to solve games with perfect information. If all players are not aware of the other players' actions and payoffs at each decision node, then backward induction is not so easily applied.

Ultimatum game

A second example demonstrates that even in games that formally allow for backward induction in theory, it may not accurately predict empirical game play in practice. This example of an asymmetric game consists of two players: Player 1 proposes to split a dollar with Player 2, which Player 2 then accepts or rejects. This is called the

ultimatum game The ultimatum game is a popular experimental economics game in which two players interact to decide how to divide a sum of money, first described by Nobel laureate John Harsanyi in 1961. The first player, the proposer, proposes a division of the ...

. Player 1 acts first by splitting the dollar however they see fit. Next, Player 2 either accepts the portion they have been offered by Player 1 or rejects the split. If Player 2 accepts the split, then both Player 1 and Player 2 get the payoffs matching that split. If Player 2 decides to reject Player 1's offer, then both players get nothing. In other words, Player 2 has veto power over Player 1's proposed allocation, but applying the veto eliminates any reward for both players. Considering the choice and response of Player 2 given any arbitrary proposal by Player 1, formal rationality prescribes that Player 2 would accept any payoff that is greater than or equal to $0. Accordingly, by backward induction Player 1 ought to propose giving Player 2 as little as possible in order to gain the largest portion of the split. Player 1 giving Player 2 the smallest unit of money and keeping the rest for themselves is the unique subgame-perfect equilibrium. The ultimatum game does have several other Nash Equilibria which are not subgame perfect and therefore do not arise via backward induction. The ultimatum game is a theoretical illustration of the usefulness of backward induction when considering infinite games, but the ultimatum games theoretically predicted results do not match empirical observation. Experimental evidence has shown that a proposer, Player 1, very rarely offers $0 and the responder, Player 2, sometimes rejects offers greater than $0. What is deemed acceptable by Player 2 varies with context. The pressure or presence of other players and external implications can mean that the game's formal model cannot necessarily predict what a real person will choose. According to Colin Camerer, an American behavioral economist, Player 2 "rejects offers of less than 20 percent of X about half the time, even though they end up with nothing." While backward induction assuming formal rationality would predict that a responder would accept any offer greater than zero, responders in reality are not formally rational players and therefore often seem to care more about offer 'fairness' or perhaps other anticipations of indirect or external effects rather than immediate potential monetary gains.

Economics

Entry-decision problem

A dynamic game in which the players are an incumbent firm in an industry and a potential entrant to that industry is to be considered. As it stands, the incumbent has a

monopoly A monopoly (from Greek language, Greek and ) is a market in which one person or company is the only supplier of a particular good or service. A monopoly is characterized by a lack of economic Competition (economics), competition to produce ...

over the industry and does not want to lose some of its market share to the entrant. If the entrant chooses not to enter, the payoff to the incumbent is high (it maintains its monopoly) and the entrant neither loses nor gains (its payoff is zero). If the entrant enters, the incumbent can "fight" or "accommodate" the entrant. It will fight by lowering its price, running the entrant out of business (and incurring exit costs—a negative payoff) and damaging its own profits. If it accommodates the entrant it will lose some of its sales, but a high price will be maintained and it will receive greater profits than by lowering its price (but lower than monopoly profits). If the incumbent accommodates given the case that the entrant enters, the best response for the entrant is to enter (and gain profit). Hence the strategy profile in which the entrant enters and the incumbent accommodates if the entrant enters is a

Nash equilibrium In game theory, the Nash equilibrium is the most commonly used solution concept for non-cooperative games. A Nash equilibrium is a situation where no player could gain by changing their own strategy (holding all other players' strategies fixed) ...

consistent with backward induction. However, if the incumbent is going to fight, the best response for the entrant is to not enter, and if the entrant does not enter, it does not matter what the incumbent chooses to do in the hypothetical case that the entrant does enter. Hence the strategy profile in which the incumbent fights if the entrant enters, but the entrant does not enter is also a Nash equilibrium. However, were the entrant to deviate and enter, the incumbent's best response is to accommodate—the threat of fighting is not credible. This second Nash equilibrium can therefore be eliminated by backward induction. Finding a Nash equilibrium in each decision-making process (subgame) constitutes as perfect subgame equilibria. Thus, these strategy profiles that depict subgame perfect equilibria exclude the possibility of actions like incredible threats that are used to "scare off" an entrant. If the incumbent threatens to start a

price war A price war is a form of market competition in which companies within an industry engage in aggressive pricing activity "characterized by the repeated cutting of prices below those of competitors". This leads to a cycle, where each competitor att ...

with an entrant, they are threatening to lower their prices from a monopoly price to slightly lower than the entrant's, which would be impractical, and incredible, if the entrant knew a price war would not actually happen since it would result in losses for both parties. Unlike a single-agent optimization which might include suboptimal or infeasible equilibria, a subgame perfect equilibrium accounts for the actions of another player, ensuring that no player reaches a subgame mistakenly. In this case, backwards induction yielding perfect subgame equilibria ensures that the entrant will not be convinced of the incumbent's threat knowing that it was not a best response in the strategy profile.

Unexpected hanging paradox

The

unexpected hanging paradox The unexpected hanging paradox or surprise test paradox is a paradox about a person's expectations about the timing of a future event which they are told will occur at an unexpected time. The paradox is variously applied to a prisoner's hanging or ...

is a

paradox A paradox is a logically self-contradictory statement or a statement that runs contrary to one's expectation. It is a statement that, despite apparently valid reasoning from true or apparently true premises, leads to a seemingly self-contradictor ...

related to backward induction. The prisoner described in the paradox uses backwards induction to reach a false conclusion. The description of the problem assumes it is possible to surprise someone who is performing backward induction. The mathematical theory of backward induction does not make this assumption, so the paradox does not call into question the results of this theory.

Common knowledge of rationality

Backward induction works only if both players are

rational Rationality is the quality of being guided by or based on reason. In this regard, a person acts rationally if they have a good reason for what they do, or a belief is rational if it is based on strong evidence. This quality can apply to an ...

, i.e., always select an action that maximizes their payoff. However, rationality is not enough: each player should also believe that all other players are rational. Even this is not enough: each player should believe that all other players know that all other players are rational, and so on, ad infinitum. In other words, rationality should be

common knowledge Common knowledge is knowledge that is publicly known by everyone or nearly everyone, usually with reference to the community in which the knowledge is referenced. Common knowledge can be about a broad range of subjects, such as science, litera ...

Limited backward induction

Limited backward induction is a deviation from fully rational backward induction. It involves enacting the regular process of backward induction without perfect foresight. Theoretically, this occurs when one or more players have limited foresight and cannot perform backward induction through all terminal nodes. Limited backward induction plays a much larger role in longer games as the effects of limited backward induction are more potent in later periods of games. A_four_stage_sequential_game_with_a_foresight_bound

A_four_stage_sequential_game_with_a_foresight_bound

Experiments have shown that in sequential bargaining games, such as the

Centipede game In game theory, the centipede game, first introduced by Robert W. Rosenthal, Robert Rosenthal in 1981, is an extensive form game in which two players take turns choosing either to take a slightly larger share of an increasing pot, or to pass the p ...

, subjects deviate from theoretical predictions and instead engage in limited backward induction. This deviation occurs as a result of

bounded rationality Bounded rationality is the idea that rationality is limited when individuals decision-making, make decisions, and under these limitations, rational individuals will select a decision that is satisficing, satisfactory rather than optimal. Limitat ...

, where players can only perfectly see a few stages ahead. This allows for unpredictability in decisions and inefficiency in finding and achieving subgame perfect Nash equilibria. There are three broad hypotheses for this phenomenon: # The presence of social factors (e.g. fairness) # The presence of non-social factors (e.g. limited backward induction) # Cultural difference Violations of backward induction is predominantly attributed to the presence of social factors. However, data-driven model predictions for sequential bargaining games (using the cognitive hierarchy model) have highlighted that in some games the presence of limited backward induction can play a dominant role. Within repeated public goods games, team behavior is impacted by limited backward induction; where it is evident that team members' initial contributions are higher than contributions towards the end. Limited backward induction also influences how regularly free-riding occurs within a team's public goods game. Early on, when the effects of limited backward induction are low, free riding is less frequent, whilst towards the end, when effects are high, free-riding becomes more frequent. Limited backward induction has also been tested for within a variant of the race game. In the game, players would sequentially choose integers inside a range and sum their choices until a target number is reached. Hitting the target earns that player a prize; the other loses. Partway through a series of games, a small prize was introduced. The majority of players then performed limited backward induction, as they solved for the small prize rather than for the original prize. Only a small fraction of players considered both prizes at the start. Most tests of backward induction are based on experiments, in which participants are only to a small extent incentivized to perform the task well, if at all. However, violations of backward induction also appear to be common in high-stakes environments. A large-scale analysis of the American television game show

The Price Is Right ''The Price Is Right'' is an American television game show where contestants compete by guessing the prices of merchandise to win cash and prizes. A 1972 revival by Mark Goodson and Bill Todman of their The Price Is Right (1956 American game ...

, for example, provides evidence of limited foresight. In every episode, contestants play the Showcase Showdown, a sequential game of perfect information for which the optimal strategy can be found through backward induction. The frequent and systematic deviations from optimal behavior suggest that a sizable proportion of the contestants fail to properly backward induct and myopically consider the next stage of the game only.

Notes

{{Game theory Dynamic programming Game theory Inductive reasoning

Decision-making example

Optimal-stopping problem

Game theory

Multi-stage game

Limitations

Ultimatum game

Economics

Entry-decision problem

Unexpected hanging paradox

Common knowledge of rationality

Limited backward induction

See also

Notes