Linear belief functions are an extension of the
Dempster–Shafer theory
The theory of belief functions, also referred to as evidence theory or Dempster–Shafer theory (DST), is a general framework for reasoning with uncertainty, with understood connections to other frameworks such as probability, possibility and i ...
of
belief functions to the case when variables of interest are
continuous
Continuity or continuous may refer to:
Mathematics
* Continuity (mathematics), the opposing concept to discreteness; common examples include
** Continuous probability distribution or random variable in probability and statistics
** Continuous g ...
. Examples of such variables include financial asset prices, portfolio performance, and other antecedent and consequent variables. The theory was originally proposed by
Arthur P. Dempster
Arthur Pentland Dempster (born 1929) is a Professor Emeritus in the Harvard University Department of Statistics. He was one of four faculty when the department was founded in 1957.
Biography
Dempster received his B.A. in mathematics and physics ( ...
in the context of Kalman Filters and later was elaborated, refined, and applied to knowledge representation in artificial intelligence and decision making in finance and accounting by Liping Liu.
Concept
A linear belief function intends to represent our belief regarding the location of the true value as follows: We are certain that the truth is on a so-called certainty
hyperplane
In geometry, a hyperplane is a subspace whose dimension is one less than that of its '' ambient space''. For example, if a space is 3-dimensional then its hyperplanes are the 2-dimensional planes, while if the space is 2-dimensional, its hype ...
but we do not know its exact location; along some dimensions of the certainty hyperplane, we believe the true value could be anywhere from –∞ to +∞ and the probability of being at a particular location is described by a
normal distribution
In statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is
:
f(x) = \frac e^
The parameter \mu i ...
; along other dimensions, our knowledge is
vacuous, i.e., the true value is somewhere from –∞ to +∞ but the associated probability is unknown. A
belief function
A belief is an attitude that something is the case, or that some proposition is true. In epistemology, philosophers use the term "belief" to refer to attitudes about the world which can be either true or false. To believe something is to take i ...
in general is defined by a
mass function over a class of
focal elements, which may have nonempty intersections. A linear belief function is a special type of
belief function
A belief is an attitude that something is the case, or that some proposition is true. In epistemology, philosophers use the term "belief" to refer to attitudes about the world which can be either true or false. To believe something is to take i ...
in the sense that its
focal elements are exclusive, parallel sub-hyperplanes over the certainty hyperplane and its mass function is a
normal distribution
In statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is
:
f(x) = \frac e^
The parameter \mu i ...
across the sub-hyperplanes.
Based on the above geometrical description, Shafer and Liu propose two mathematical representations of a LBF: a wide-sense inner product and a linear functional in the variable space, and as their duals over a hyperplane in the sample space. Monney proposes still another structure called Gaussian hints. Although these representations are mathematically neat, they tend to be unsuitable for knowledge representation in expert systems.
Knowledge representation
A linear belief function can represent both logical and probabilistic knowledge for three types of variables: deterministic such as an observable or controllable, random whose distribution is normal, and vacuous on which no knowledge bears. Logical knowledge is represented by linear equations, or geometrically, a certainty hyperplane. Probabilistic knowledge is represented by a normal distribution across all parallel focal elements.
In general, assume X is a vector of multiple normal variables with mean μ and covariance Σ. Then, the multivariate normal distribution can be equivalently represented as a moment matrix:
:
If the distribution is non-degenerate, i.e., Σ has a full rank and its inverse exists, the moment matrix can be fully swept:
:
Except for normalization constant, the above equation completely determines the normal density function for ''X''. Therefore,
represents the probability distribution of ''X'' in the potential form.
These two simple matrices allow us to represent three special cases of linear belief functions. First, for an ordinary normal probability distribution M(X) represents it. Second, suppose one makes a direct observation on X and obtains a value μ. In this case, since there is no uncertainty, both variance and covariance vanish, i.e., Σ = 0. Thus, a direct observation can be represented as:
:
Third, suppose one is completely ignorant about X. This is a very thorny case in Bayesian statistics since the density function does not exist. By using the fully swept moment matrix, we represent the vacuous linear belief functions as a zero matrix in the swept form follows:
:
One way to understand the representation is to imagine complete ignorance as the limiting case when the variance of X approaches to ∞, where one can show that Σ
−1 = 0 and hence
vanishes. However, the above equation is not the same as an improper prior or normal distribution with infinite variance. In fact, it does not correspond to any unique probability distribution. For this reason, a better way is to understand the vacuous linear belief functions as the neutral element for combination (see later).
To represent the remaining three special cases, we need the concept of partial sweeping. Unlike a full sweeping, a partial sweeping is a transformation on a subset of variables. Suppose X and Y are two vectors of normal variables with the joint moment matrix:
: