The expectiminimax algorithm is a variation of the

minimax Minimax (sometimes Minmax, MM or saddle point) is a decision rule used in artificial intelligence, decision theory, combinatorial game theory, statistics, and philosophy for ''minimizing'' the possible loss function, loss for a Worst-case scenari ...

algorithm, for use in

artificial intelligence Artificial intelligence (AI) is the capability of computer, computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of re ...

systems that play two-player

zero-sum game Zero-sum game is a Mathematical model, mathematical representation in game theory and economic theory of a situation that involves two competition, competing entities, where the result is an advantage for one side and an equivalent loss for the o ...

s, such as

backgammon Backgammon is a two-player board game played with counters and dice on tables boards. It is the most widespread Western member of the large family of tables games, whose ancestors date back at least 1,600 years. The earliest record of backgammo ...

, in which the outcome depends on a combination of the player's skill and chance elements such as dice rolls. In addition to "min" and "max" nodes of the traditional minimax tree, this variant has "chance" ("

move by nature In game theory a move by nature is a ''decision'' or ''move'' in an extensive form game made by a player who has no strategic interests in the outcome. The effect is to add a player, "Nature", whose practical role is to act as a random number ge ...

") nodes, which take the

expected value In probability theory, the expected value (also called expectation, expectancy, expectation operator, mathematical expectation, mean, expectation value, or first Moment (mathematics), moment) is a generalization of the weighted average. Informa ...

of a random event occurring. In

game theory Game theory is the study of mathematical models of strategic interactions. It has applications in many fields of social science, and is used extensively in economics, logic, systems science and computer science. Initially, game theory addressed ...

terms, an expectiminimax tree is the game tree of an

extensive-form game In game theory, an extensive-form game is a specification of a game allowing for the explicit representation of a number of key aspects, like the sequencing of players' possible moves, their choices at every decision point, the (possibly imperfec ...

perfect Perfect commonly refers to: * Perfection; completeness, and excellence * Perfect (grammar), a grammatical category in some languages Perfect may also refer to: Film and television * ''Perfect'' (1985 film), a romantic drama * ''Perfect'' (20 ...

, but

incomplete information In economics and game theory, complete information is an economic situation or game in which knowledge about other market participants or players is available to all participants. The utility functions (including risk aversion), payoffs, strategies ...

. In the traditional

method, the levels of the tree alternate from max to min until the depth limit of the tree has been reached. In an expectiminimax tree, the "chance" nodes are interleaved with the max and min nodes. Instead of taking the max or min of the utility values of their children, chance nodes take a weighted average, with the weight being the probability that child is reached. The interleaving depends on the game. Each "turn" of the game is evaluated as a "max" node (representing the AI player's turn), a "min" node (representing a potentially-optimal opponent's turn), or a "chance" node (representing a random effect or player). For example, consider a game in which each round consists of a single die throw, and then decisions made by first the AI player, and then another intelligent opponent. The order of nodes in this game would alternate between "chance", "max" and then "min".

Pseudocode

The expectiminimax algorithm is a variant of the

algorithm and was firstly proposed by

Donald Michie Donald Michie (; 11 November 1923 – 7 July 2007) was a British researcher in artificial intelligence. During World War II, Michie worked for the Government Code and Cypher School at Bletchley Park, contributing to the effort to solve " Tunny ...

in 1966. Its

pseudocode In computer science, pseudocode is a description of the steps in an algorithm using a mix of conventions of programming languages (like assignment operator, conditional operator, loop) with informal, usually self-explanatory, notation of actio ...

is given below. function expectiminimax(node, depth) if node is a terminal node or depth = 0 return the heuristic value of node if the adversary is to play at node // Return value of minimum-valued child node let α := +∞ foreach child of node α := min(α, expectiminimax(child, depth-1)) else if we are to play at node // Return value of maximum-valued child node let α := -∞ foreach child of node α := max(α, expectiminimax(child, depth-1)) else if random event at node // Return weighted average of all child nodes' values let α := 0 foreach child of node α := α + (Probability

hild Hild or Hildr may refer to: * Hildr or Hild is one of the Valkyries in Norse mythology, a personification of battle * Hild or Hilda of Whitby is a Christian saint who was a British abbess and nun in the Middle Ages * Hild (Oh My Goddess!), the ult ...

× expectiminimax(child, depth-1)) return α Note that for random nodes, there must be a known probability of reaching each child. (For most games of chance, child nodes will be equally-weighted, which means the return value can simply be the average of all child values.)

Expectimax search

Expectimax search is a variant described in ''Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability'' (2005) by Tom Everitt and

Marcus Hutter Marcus Hutter (born 14 April 1967 in Munich) is a computer scientist, professor and artificial intelligence researcher. As a senior researcher at DeepMind, he studies the mathematical foundations of artificial general intelligence. Hutter stu ...

Alpha-beta pruning

Bruce Ballard was the first to develop a technique, called *-minimax, that enables

alpha-beta pruning Alphabeta or Alpha Beta may also refer to: * Alphabeta, an Israeli musical group * Alpha Beta, a former chain of Californian supermarkets * The Greek alphabet, from ''Alpha'' (Αα) and ''Beta'' (Ββ), the first two letters * Alpha and beta anome ...

in expectiminimax trees. The problem with integrating

into the expectiminimax algorithm is that the scores of a chance node's children may exceed the alpha or beta bound of its parent, even if the weighted value of each child does not. However, it is possible to bound the scores of a chance node's children, and therefore bound the score of the CHANCE node. If a standard iterative search is about to score the

i

th child of a chance node with

N

equally likely children, that search has computed scores

v_1, v_2, \ldots, v_

for child nodes 1 through

i-1

. Assuming a lowest possible score

L

and a highest possible score

U

for each unsearched child, the bounds of the chance node's score is as follows:

\text \leq \frac \left( ( v_1 + \ldots + v_) + v_i + U \times (n - i) \right)

\text \geq \frac \left( ( v_1 + \ldots + v_) + v_i + L \times (n - i) \right)

If an alpha and/or beta bound are given in scoring the chance node, these bounds can be used to cut off the search of the

i

th child. The above equations can be rearranged to find a new alpha & beta value that will cut off the search if it would cause the chance node to exceed its own alpha and beta bounds:

\alpha_i = N \times \alpha - \left( v_1 + \ldots + v_ \right) + U \times (n - i)

\beta_i = N \times \beta - \left( v_1 + \ldots + v_ \right) + L \times (n - i)

The

for extending expectiminimax with fail-hard

in this manner is as follows: function *-minimax(node, depth, α, β) if node is a terminal node or depth = 0 return the heuristic value of node if node is a max or min node return the minimax value of the node let N = numSuccessors(node) // Compute α, β for children let A = N * (α - U) + U let B = N * (β - L) + L let sum = 0 foreach child of node // Limit child α, β to a valid range let AX = max(A, L) let BX = min(B, U) // Search the child with new cutoff values let score = *-minimax(child, depth - 1, AX, BX) // Check for α, β cutoff conditions if score <= A return α if score >= B return β sum += score // Adjust α, β for the next child A += U - v B += L - v // No cutoff occurred, return score return sum / N This technique is one of a family of variants of algorithms which can bound the search of a CHANCE node and its children based on collecting lower and upper bounds of the children during search. Other techniques which can offer performance benefits include probing each child with a heuristic to establish a min or max before performing a full search on each child, etc.

References

{{DEFAULTSORT:Expectiminimax Tree Search algorithms Game artificial intelligence Trees (data structures) Combinatorial game theory

Pseudocode

Expectimax search

Alpha-beta pruning

See also

References