Probabilistic Graphical Model

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. Graphical models are commonly used in probability theory, statistics (particularly Bayesian statistics), and machine learning.


Types of graphical models

Generally, probabilistic graphical models use a graph-based representation as the foundation for encoding a distribution over a multi-dimensional space; the graph is a compact, factorized representation of a set of independences that hold in the specific distribution. Two branches of graphical representations of distributions are commonly used, namely Bayesian networks and Markov random fields. Both families encompass the properties of factorization and independences, but they differ in the set of independences they can encode and the factorization of the distribution that they induce.


Undirected Graphical Model

The undirected graph shown may have one of several interpretations; the common feature is that the presence of an edge implies some sort of dependence between the corresponding random variables. From this graph, we might deduce that B, C, and D are all conditionally independent given A. This means that if the value of A is known, then the values of B, C, and D provide no further information about each other. Equivalently (in this case), the joint probability distribution can be factorized as:
:P[A,B,C,D] = f_{AB}[A,B] \cdot f_{AC}[A,C] \cdot f_{AD}[A,D]
for some non-negative functions f_{AB}, f_{AC}, f_{AD}.
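The link between this factorization and conditional independence can be checked numerically. The following is a minimal sketch, assuming binary variables and arbitrary made-up factor values; the names `f_AB`, `f_AC`, `f_AD`, `prob`, and `max_independence_gap` are hypothetical and chosen only for illustration. The normalizing constant `Z` is kept separate here, although it could equally be absorbed into any one of the factors.

```python
from itertools import product

# Hypothetical non-negative pairwise factors over binary variables.
# The numbers are arbitrary illustrations, not values from the article.
f_AB = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 0.5, (1, 1): 3.0}
f_AC = {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 2.0, (1, 1): 1.0}
f_AD = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}


def unnormalized(a, b, c, d):
    """Product of the three edge factors of the star graph centred on A."""
    return f_AB[(a, b)] * f_AC[(a, c)] * f_AD[(a, d)]


states = list(product([0, 1], repeat=4))
Z = sum(unnormalized(*s) for s in states)            # normalizing constant
joint = {s: unnormalized(*s) / Z for s in states}    # P[A,B,C,D]


def prob(**fixed):
    """Marginal probability that the named variables (a, b, c, d) take the given values."""
    names = "abcd"
    return sum(p for s, p in joint.items()
               if all(s[names.index(k)] == v for k, v in fixed.items()))


def max_independence_gap():
    """Largest |P[b,c|a] - P[b|a] * P[c|a]| over all assignments of a, b, c."""
    gap = 0.0
    for a, b, c in product([0, 1], repeat=3):
        p_a = prob(a=a)
        lhs = prob(a=a, b=b, c=c) / p_a
        rhs = (prob(a=a, b=b) / p_a) * (prob(a=a, c=c) / p_a)
        gap = max(gap, abs(lhs - rhs))
    return gap


# Given A, the gap is zero up to floating-point rounding.
print(max_independence_gap())
# Without conditioning on A, B and C are generally dependent.
print(abs(prob(b=0, c=0) - prob(b=0) * prob(c=0)))
```

With these particular values the second printed number is clearly non-zero, which illustrates that the independence holds only once A is known.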


Bayesian network

If the network structure of the model is a directed acyclic graph, the model represents a factorization of the joint probability of all random variables. More precisely, if the random variables are X_1,\ldots,X_n then the joint probability satisfies
:P[X_1,\ldots,X_n] = \prod_{i=1}^n P[X_i \mid \text{pa}(X_i)]
where \text{pa}(X_i) is the set of parents of node X_i (nodes with edges directed towards X_i). In other words, the joint distribution factors into a product of conditional distributions. For example, in the directed acyclic graph shown in the Figure this factorization would be
:P[A,B,C,D] = P[A] \cdot P[B \mid A] \cdot P[C \mid A] \cdot P[D \mid A,C].
Each node is conditionally independent of its non-descendants given the values of its parents. In general, any two sets of nodes are conditionally independent given a third set if a criterion called ''d''-separation holds in the graph. Local independences and global independences are equivalent in Bayesian networks.

This type of graphical model is known as a directed graphical model, Bayesian network, or belief network. Classic machine learning models like hidden Markov models, neural networks and newer models such as variable-order Markov models can be considered special cases of Bayesian networks. One of the simplest Bayesian networks is the Naive Bayes classifier.
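The directed factorization for the four-node example can be sketched in a few lines. This is an illustrative sketch only, assuming binary variables and made-up conditional probability tables; the names `p_A1`, `p_B1_given_A`, `p_C1_given_A`, `p_D1_given_AC`, `marginal`, and `local_markov_gap` are hypothetical. It builds the joint from the product of conditionals and checks that B is independent of C and D once its only parent A is known.

```python
from itertools import product

# Hypothetical conditional probability tables for binary variables.
# Each entry is P[child = 1 | parent values]; the numbers are made up.
p_A1 = 0.3
p_B1_given_A = {0: 0.2, 1: 0.7}
p_C1_given_A = {0: 0.5, 1: 0.1}
p_D1_given_AC = {(0, 0): 0.9, (0, 1): 0.4, (1, 0): 0.6, (1, 1): 0.2}


def bernoulli(p_one, value):
    """Probability of `value` (0 or 1) when P[value = 1] equals p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint(a, b, c, d):
    """P[A,B,C,D] = P[A] * P[B|A] * P[C|A] * P[D|A,C]."""
    return (bernoulli(p_A1, a)
            * bernoulli(p_B1_given_A[a], b)
            * bernoulli(p_C1_given_A[a], c)
            * bernoulli(p_D1_given_AC[(a, c)], d))


def marginal(**fixed):
    """Sum the joint over every assignment consistent with the fixed values."""
    names = "abcd"
    return sum(joint(*s) for s in product([0, 1], repeat=4)
               if all(s[names.index(k)] == v for k, v in fixed.items()))


def local_markov_gap():
    """Largest |P[b,c,d|a] - P[b|a] * P[c,d|a]| over all assignments."""
    gap = 0.0
    for a, b, c, d in product([0, 1], repeat=4):
        p_a = marginal(a=a)
        lhs = joint(a, b, c, d) / p_a
        rhs = (marginal(a=a, b=b) / p_a) * (marginal(a=a, c=c, d=d) / p_a)
        gap = max(gap, abs(lhs - rhs))
    return gap


print(marginal())           # the product of conditionals sums to 1
print(local_markov_gap())   # zero up to floating-point rounding
```

Because every factor is a proper conditional distribution, no normalizing constant is needed, in contrast to a product of arbitrary non-negative factors.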


Cyclic Directed Graphical Models

The next figure depicts a graphical model with a cycle. This may be interpreted in terms of each variable 'depending' on the values of its parents in some manner. The particular graph shown suggests a joint probability density that factors as
:P[A,B,C,D] = P[A] \cdot P[B] \cdot P[C,D \mid A,B],
but other interpretations are possible.
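One reading of this factorization treats C and D as a single block with a joint conditional distribution given A and B, so the cycle between them never has to be unrolled. The sketch below assumes binary variables and made-up tables; `p_A`, `p_B`, `p_CD_given_AB`, and `p_ab` are hypothetical names used only for illustration.

```python
from itertools import product

# Hypothetical distributions over binary variables; the numbers are made up.
p_A = {0: 0.6, 1: 0.4}
p_B = {0: 0.3, 1: 0.7}
# P[C,D | A,B]: for each (a, b), a distribution over the pair (c, d).
p_CD_given_AB = {
    (0, 0): {(0, 0): 0.40, (0, 1): 0.10, (1, 0): 0.10, (1, 1): 0.40},
    (0, 1): {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25},
    (1, 0): {(0, 0): 0.10, (0, 1): 0.20, (1, 0): 0.30, (1, 1): 0.40},
    (1, 1): {(0, 0): 0.70, (0, 1): 0.10, (1, 0): 0.10, (1, 1): 0.10},
}


def joint(a, b, c, d):
    """P[A,B,C,D] = P[A] * P[B] * P[C,D | A,B]."""
    return p_A[a] * p_B[b] * p_CD_given_AB[(a, b)][(c, d)]


def p_ab(a, b):
    """Marginal P[A,B], obtained by summing out C and D."""
    return sum(joint(a, b, c, d) for c, d in product([0, 1], repeat=2))


total = sum(joint(*s) for s in product([0, 1], repeat=4))
# A and B are marginally independent under this factorization.
ab_gap = max(abs(p_ab(a, b) - p_A[a] * p_B[b])
             for a, b in product([0, 1], repeat=2))

print(total, ab_gap)
```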


Other types

* Dependency network where cycles are allowed
* Tree-augmented classifier or TAN model
* Targeted Bayesian network learning (TBNL)
* A factor graph is an undirected bipartite graph connecting variables and factors. Each factor represents a function over the variables it is connected to. This is a helpful representation for understanding and implementing belief propagation.
* A clique tree or junction tree is a tree of cliques, used in the junction tree algorithm.
* A chain graph is a graph which may have both directed and undirected edges, but without any directed cycles (i.e. if we start at any vertex and move along the graph respecting the directions of any arrows, we cannot return to the vertex we started from if we have passed an arrow). Both directed acyclic graphs and undirected graphs are special cases of chain graphs, which can therefore provide a way of unifying and generalizing Bayesian and Markov networks.
* An ancestral graph is a further extension, having directed, bidirected and undirected edges.
* Random field techniques
** A Markov random field, also known as a Markov network, is a model over an undirected graph. A graphical model with many repeated subunits can be represented with plate notation.
** A conditional random field is a discriminative model specified over an undirected graph.
* A restricted Boltzmann machine is a bipartite generative model specified over an undirected graph.
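To illustrate why the factor-graph view is convenient for belief propagation, the following is a minimal sketch of sum-product message passing on a small tree-structured factor graph. The graph, the factor tables, and all names (`factors`, `msg_var_to_factor`, `msg_factor_to_var`, `bp_marginal`, `brute_force_marginal`) are assumptions made for this example. The recursion relies on the graph being a tree; on a graph with loops it would not terminate as written.

```python
from itertools import product
from math import prod

# Hypothetical factor graph over binary variables forming a chain x1 - x2 - x3.
# Each factor is (scope, table), where the table maps assignments of the
# scope variables to non-negative values. The numbers are made up.
factors = {
    "f1":  (("x1",), {(0,): 0.6, (1,): 0.4}),
    "f12": (("x1", "x2"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}),
    "f23": (("x2", "x3"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.5, (1, 1): 0.5}),
}
variables = ["x1", "x2", "x3"]
domain = [0, 1]


def msg_var_to_factor(var, factor):
    """Product of messages into `var` from every neighbouring factor except `factor`."""
    out = {v: 1.0 for v in domain}
    for name, (scope, _) in factors.items():
        if name != factor and var in scope:
            incoming = msg_factor_to_var(name, var)
            for v in domain:
                out[v] *= incoming[v]
    return out


def msg_factor_to_var(factor, var):
    """Multiply the factor by incoming messages and sum out all variables but `var`."""
    scope, table = factors[factor]
    incoming = {u: msg_var_to_factor(u, factor) for u in scope if u != var}
    out = {v: 0.0 for v in domain}
    for assignment, value in table.items():
        term = value
        for u, val in zip(scope, assignment):
            if u != var:
                term *= incoming[u][val]
        out[assignment[scope.index(var)]] += term
    return out


def bp_marginal(var):
    """Normalized belief at `var`: the product of all incoming factor messages."""
    belief = msg_var_to_factor(var, factor=None)
    z = sum(belief.values())
    return {v: belief[v] / z for v in domain}


def brute_force_marginal(var):
    """Reference answer: enumerate every joint assignment and sum."""
    out = {v: 0.0 for v in domain}
    for state in product(domain, repeat=len(variables)):
        env = dict(zip(variables, state))
        weight = prod(table[tuple(env[u] for u in scope)]
                      for scope, table in factors.values())
        out[env[var]] += weight
    z = sum(out.values())
    return {v: out[v] / z for v in domain}


for name in variables:
    print(name, bp_marginal(name), brute_force_marginal(name))
```

On a tree the two computations agree, while message passing only ever sums over one factor's scope at a time rather than over every joint assignment.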


Applications

The graphical-model framework provides algorithms for discovering and analyzing structure in complex distributions, describing them succinctly and extracting the unstructured information, which allows the models to be constructed and used effectively. Applications of graphical models include causal inference, information extraction, speech recognition, computer vision, decoding of low-density parity-check codes, modeling of gene regulatory networks, gene finding and diagnosis of diseases, and graphical models for protein structure.


See also

* Belief propagation
* Structural equation model



Further reading



Other


* Heckerman's Bayes Net Learning Tutorial
* http://www.cedar.buffalo.edu/~srihari/CSE574 Sargur Srihari's lecture slides on probabilistic graphical models


External links


* Graphical models and Conditional Random Fields
* Probabilistic Graphical Models taught by Eric Xing at CMU