
The junction tree algorithm (also known as 'Clique Tree') is a method used in
machine learning to extract
marginalization in general
graphs. In essence, it entails performing
belief propagation on a modified graph called a
junction tree. The graph is called a tree because it branches into different sections of data;
nodes of variables are the branches.
The basic premise is to eliminate
cycle
Cycle, cycles, or cyclic may refer to:
Anthropology and social sciences
* Cyclic history, a theory of history
* Cyclical theory, a theory of American political history associated with Arthur Schlesinger, Sr.
* Social cycle, various cycles in soc ...
s by clustering them into single nodes. Multiple extensive classes of queries can be compiled at the same time into larger structures of data.
There are different
algorithms to meet specific needs and for what needs to be calculated.
Inference algorithms gather new developments in the data and calculate it based on the new information provided.
Junction tree algorithm
Hugin algorithm
* If the graph is directed then
moralize
Morality () is the differentiation of intentions, decisions and actions between those that are distinguished as proper (right) and those that are improper (wrong). Morality can be a body of standards or principles derived from a code of condu ...
it to make it un-directed.
*Introduce the evidence.
*
Triangulate the graph to make it chordal.
*Construct a
junction tree from the triangulated graph (we will call the vertices of the junction tree "
supernodes").
*Propagate the probabilities along the junction tree (via
belief propagation)
Note that this last step is inefficient for graphs of large
treewidth. Computing the messages to pass between supernodes involves doing exact marginalization over the variables in both supernodes. Performing this algorithm for a graph with treewidth k will thus have at least one computation which takes time exponential in k. It is a
message passing algorithm.
The Hugin algorithm takes fewer
computation
Computation is any type of arithmetic or non-arithmetic calculation that follows a well-defined model (e.g., an algorithm).
Mechanical or electronic devices (or, historically, people) that perform computations are known as ''computers''. An es ...
s to find a solution compared to Shafer-Shenoy.
Shafer-Shenoy algorithm
* Computed recursively
* Multiple recursions of the Shafer-Shenoy algorithm results in Hugin algorithm
* Found by the
message passing equation
*
Separator
Separator can refer to:
* A mechanical device to separate fluids and solids, like
** Cream separator, separates cream from milk
** Demister (vapor), removal of liquid droplets entrained in a vapor stream
** Separator (oil production), of an oil pr ...
potentials are not stored
The Shafer-Shenoy
algorithm is the
sum product of a junction tree. It is used because it runs programs and queries more efficiently than the Hugin algorithm. The algorithm makes calculations for conditionals for
belief functions
A belief is an attitude that something is the case, or that some proposition is true. In epistemology, philosophers use the term "belief" to refer to attitudes about the world which can be either true or false. To believe something is to take i ...
possible.
Joint distributions are needed to make local computations happen.
Underlying theory

The first step concerns only
Bayesian networks, and is a procedure to turn a directed graph into an
undirected one. We do this because it allows for the universal applicability of the algorithm, regardless of direction.
The second step is setting variables to their observed value. This is usually needed when we want to calculate conditional probabilities, so we fix the value of the
random variable
A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on random events. It is a mapping or a function from possible outcomes (e.g., the po ...
s we condition on. Those variables are also said to be clamped to their particular value.

The third step is to ensure that graphs are made
chordal
In the mathematical area of graph theory, a chordal graph is one in which all cycles of four or more vertices have a ''chord'', which is an edge that is not part of the cycle but connects two vertices of the cycle. Equivalently, every induced cy ...
if they aren't already chordal. This is the first essential step of the algorithm. It makes use of the following theorem:
Theorem: For an
undirected graph, G, the following properties are equivalent:
* Graph G is triangulated.
* The clique graph of G has a junction tree.
* There is an elimination ordering for G that does not lead to any added edges.
Thus, by
triangulating a graph, we make sure that the corresponding junction tree exists. A usual way to do this, is to decide an elimination order for its nodes, and then run the
Variable elimination algorithm. The
variable elimination algorithm states that the algorithm must be run each time there is a different query.
This will result to adding more edges to the initial graph, in such a way that the output will be a
chordal graph.
All chordal graphs have a junction tree.
The next step is to construct the
junction tree. To do so, we use the graph from the previous step, and form its corresponding
clique graph. Now the next theorem gives us a way to find a junction tree:
Theorem: Given a triangulated graph, weight the edges of the clique graph by their cardinality, , A∩B, , of the intersection of the adjacent cliques A and B. Then any maximum-weight spanning tree of the clique graph is a junction tree.
So, to construct a junction tree we just have to extract a maximum weight spanning tree out of the clique graph. This can be efficiently done by, for example, modifying
Kruskal's algorithm
Kruskal's algorithm finds a minimum spanning forest of an undirected edge-weighted graph. If the graph is connected, it finds a minimum spanning tree. (A minimum spanning tree of a connected graph is a subset of the edges that forms a tree that i ...
.
The last step is to apply
belief propagation to the obtained junction tree.
Usage: A junction tree graph is used to visualize the probabilities of the problem. The tree can become a binary tree to form the actual building of the tree. A specific use could be found in
auto encoders, which combine the graph and a passing network on a large scale automatically.
Inference Algorithms

Loopy belief propagation: A different method of interpreting complex graphs. The
loopy belief propagation
Loopy may refer to:
* Casio Loopy, a video game console
* ''Loopy'' (film), a 2004 film
* Loopy, a fictional character in the animated TV show ''KaBlam!''
* Loopy de Loop, a cartoon about a character of the same name
* Loopy games, in combinatori ...
is used when an approximate solution is needed instead of the
exact solution.
It is an
approximate inference Approximate inference methods make it possible to learn realistic models from big data
Though used sometimes loosely partly because of a lack of formal definition, the interpretation that seems to best describe Big data is the one associated w ...
.
Cutset conditioning: Used with smaller sets of variables.
Cutset conditioning allows for simpler graphs that are easier to read but are not
exact
Exact may refer to:
* Exaction, a concept in real property law
* ''Ex'Act'', 2016 studio album by Exo
* Schooner Exact, the ship which carried the founders of Seattle
Companies
* Exact (company), a Dutch software company
* Exact Change, an Ameri ...
.
References
*
*
*
*
*Lepar, V., Shenoy, P. (1998). "A Comparison of Lauritzen-Spiegelhalter, Hugin, and Shenoy-Shafer Architectures for Computing Marginals of Probability Distributions." https://arxiv.org/ftp/arxiv/papers/1301/1301.7394.pdf
{{DEFAULTSORT:Junction Tree Algorithm
Bayesian networks
Graph algorithms