Dempster–Shafer theory

The theory of belief functions, also referred to as evidence theory or Dempster–Shafer theory (DST), is a general framework for reasoning with uncertainty, with understood connections to other frameworks such as probability, possibility and imprecise probability theories. First introduced by Arthur P. Dempster in the context of statistical inference, the theory was later developed by Glenn Shafer into a general framework for modeling epistemic uncertainty: a mathematical theory of evidence (Shafer, Glenn; ''A Mathematical Theory of Evidence'', Princeton University Press, 1976). The theory allows one to combine evidence from different sources and arrive at a degree of belief (represented by a mathematical object called a ''belief function'') that takes into account all the available evidence.

In a narrow sense, the term Dempster–Shafer theory refers to the original conception of the theory by Dempster and Shafer. However, it is more common to use the term in the wider sense of the same general approach, as adapted to specific kinds of situations. In particular, many authors have proposed different rules for combining evidence, often with a view to handling conflicts in evidence better (Kari Sentz and Scott Ferson, ''Combination of Evidence in Dempster–Shafer Theory'', Sandia National Laboratories SAND 2002-0835, 2002). The early contributions have also been the starting points of many important developments, including the transferable belief model and the theory of hints.


Overview

Dempster–Shafer theory is a generalization of the Bayesian theory of subjective probability. Belief functions base degrees of belief (or confidence, or trust) for one question on the subjective probabilities for a related question. The degrees of belief themselves may or may not have the mathematical properties of probabilities; how much they differ depends on how closely the two questions are related (Shafer, Glenn; ''Dempster–Shafer theory'', 2002). Put another way, it is a way of representing epistemic plausibilities, but it can yield answers that contradict those arrived at using probability theory.

Often used as a method of sensor fusion, Dempster–Shafer theory is based on two ideas: obtaining degrees of belief for one question from subjective probabilities for a related question, and Dempster's rule (Dempster, Arthur P.; ''A generalization of Bayesian inference'', Journal of the Royal Statistical Society, Series B, Vol. 30, pp. 205–247, 1968) for combining such degrees of belief when they are based on independent items of evidence. In essence, the degree of belief in a proposition depends primarily upon the number of answers (to the related questions) containing the proposition, and the subjective probability of each answer. Also contributing are the rules of combination that reflect general assumptions about the data.

In this formalism a degree of belief (also referred to as a mass) is represented as a belief function rather than a Bayesian probability distribution. Probability values are assigned to ''sets'' of possibilities rather than single events: their appeal rests on the fact that they naturally encode evidence in favor of propositions.

Dempster–Shafer theory assigns its masses to all of the subsets of the set of states of a system (in set-theoretic terms, the power set of the states). For instance, assume a situation where there are two possible states of a system. For this system, any belief function assigns mass to the first state, the second, to both, and to neither.


Belief and plausibility

Shafer's formalism starts from a set of ''possibilities'' under consideration, for instance numerical values of a variable, or pairs of linguistic variables like "date and place of origin of a relic" (asking whether it is antique or a recent fake). A hypothesis is represented by a subset of this ''frame of discernment'', like "(Ming dynasty, China)", or "(19th century, Germany)". Shafer's framework allows for belief about such propositions to be represented as intervals, bounded by two values, ''belief'' (or ''support'') and ''plausibility'':
:''belief'' ≤ ''plausibility''.

In a first step, subjective probabilities (''masses'') are assigned to all subsets of the frame; usually, only a restricted number of sets will have non-zero mass (''focal elements''). ''Belief'' in a hypothesis is constituted by the sum of the masses of all subsets of the hypothesis-set. It is the amount of belief that directly supports either the given hypothesis or a more specific one, thus forming a lower bound on its probability. Belief (usually denoted ''Bel'') measures the strength of the evidence in favor of a proposition ''p''. It ranges from 0 (indicating no evidence) to 1 (denoting certainty).

''Plausibility'' is 1 minus the sum of the masses of all sets whose intersection with the hypothesis is empty. Equivalently, it can be obtained as the sum of the masses of all sets whose intersection with the hypothesis is not empty. It is an upper bound on the possibility that the hypothesis could be true, because there is only so much evidence that contradicts that hypothesis. Plausibility (denoted by ''Pl'') is thus related to Bel by Pl(''p'') = 1 − Bel(~''p''). It also ranges from 0 to 1 and measures the extent to which evidence in favor of ~''p'' leaves room for belief in ''p''.

For example, suppose we have a belief of 0.5 for a proposition, say "the cat in the box is dead." This means that we have evidence that allows us to state strongly that the proposition is true with a confidence of 0.5. However, the evidence contrary to that hypothesis (i.e. "the cat is alive") only has a confidence of 0.2. The remaining mass of 0.3 (the gap between the 0.5 supporting evidence on the one hand, and the 0.2 contrary evidence on the other) is "indeterminate," meaning that the cat could either be dead or alive. This interval represents the level of uncertainty based on the evidence in the system.

The "neither" hypothesis is set to zero by definition (it corresponds to "no solution"). The orthogonal hypotheses "Alive" and "Dead" have probabilities of 0.2 and 0.5, respectively. This could correspond to "Live/Dead Cat Detector" signals, which have respective reliabilities of 0.2 and 0.5. Finally, the all-encompassing "Either" hypothesis (which simply acknowledges there is a cat in the box) picks up the slack so that the sum of the masses is 1. The belief for the "Alive" and "Dead" hypotheses matches their corresponding masses because they have no subsets other than the empty set; belief for "Either" consists of the sum of all three masses (Either, Alive, and Dead) because "Alive" and "Dead" are each subsets of "Either". The "Alive" plausibility is 1 − ''m''(Dead) = 0.5 and the "Dead" plausibility is 1 − ''m''(Alive) = 0.8. Put another way, the "Alive" plausibility is ''m''(Alive) + ''m''(Either) and the "Dead" plausibility is ''m''(Dead) + ''m''(Either). Finally, the "Either" plausibility sums ''m''(Alive) + ''m''(Dead) + ''m''(Either). The universal hypothesis ("Either") will always have 100% belief and plausibility; it acts as a checksum of sorts.

Here is a somewhat more elaborate example where the behavior of belief and plausibility begins to emerge. We are looking through a variety of detector systems at a single faraway signal light, which can only be coloured in one of three colours (red, yellow, or green). Composite events such as "Red or Yellow" would not be modeled as distinct entities in probability space as they are here in mass assignment space. Rather, the event "Red or Yellow" would be considered as the union of the events "Red" and "Yellow", and (see probability axioms) ''P''(Red or Yellow) ≥ ''P''(Yellow), and ''P''(Any) = 1, where ''Any'' refers to ''Red'' or ''Yellow'' or ''Green''. In DST the mass assigned to ''Any'' refers to the proportion of evidence that cannot be assigned to any of the other states, which here means evidence that says there is a light but does not say anything about what color it is. In this example, the proportion of evidence that shows the light is either ''Red'' or ''Green'' is given a mass of 0.05. Such evidence might, for example, be obtained from a red/green color-blind person. DST lets us extract the value of this sensor's evidence. Also, in DST the empty set is considered to have zero mass, meaning here that the signal light system exists and we are examining its possible states, not speculating as to whether it exists at all.
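The cat example can be checked mechanically. The following is a minimal Python sketch, not taken from the source: the names `belief` and `plausibility` and the use of `frozenset` to represent hypotheses are illustrative assumptions. Only the three masses (0.2, 0.5, 0.3) come from the text.

```python
# Hypotheses are represented as frozensets of states (an illustrative choice).
ALIVE = frozenset({"alive"})
DEAD = frozenset({"dead"})
EITHER = ALIVE | DEAD

# Masses quoted in the text; the empty set ("neither") has mass 0 and is omitted.
mass = {ALIVE: 0.2, DEAD: 0.5, EITHER: 0.3}

def belief(mass, hypothesis):
    """Sum of the masses of all non-empty subsets of the hypothesis."""
    return sum(m for focal, m in mass.items() if focal and focal <= hypothesis)

def plausibility(mass, hypothesis):
    """Sum of the masses of all sets that intersect the hypothesis."""
    return sum(m for focal, m in mass.items() if focal & hypothesis)

for name, h in [("Alive", ALIVE), ("Dead", DEAD), ("Either", EITHER)]:
    print(f"{name}: Bel = {belief(mass, h):.1f}, Pl = {plausibility(mass, h):.1f}")
```

Run as written, this prints the intervals described above: Alive [0.2, 0.5], Dead [0.5, 0.8] and Either [1.0, 1.0].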


Combining beliefs

Beliefs from different sources can be combined with various fusion operators to model specific situations of belief fusion, e.g. with Dempster's rule of combination, which combines belief constraints that are dictated by independent belief sources, such as in the case of combining hints (Kohlas, J., and Monney, P.A., 1995. ''A Mathematical Theory of Hints. An Approach to the Dempster–Shafer Theory of Evidence''. Vol. 425 in Lecture Notes in Economics and Mathematical Systems. Springer Verlag) or combining preferences (Jøsang, A., and Hankin, R., 2012. ''Interpretation and Fusion of Hyper Opinions in Subjective Logic''. 15th International Conference on Information Fusion (FUSION) 2012. IEEE. http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6289948). Note that the probability masses from propositions that contradict each other can be used to obtain a measure of conflict between the independent belief sources. Other situations can be modeled with different fusion operators, such as cumulative fusion of beliefs from independent sources, which can be modeled with the cumulative fusion operator.

Dempster's rule of combination is sometimes interpreted as an approximate generalisation of Bayes' rule. In this interpretation the priors and conditionals need not be specified, unlike traditional Bayesian methods, which often use a symmetry (minimax error) argument to assign prior probabilities to random variables (''e.g.'' assigning 0.5 to binary values for which no information is available about which is more likely). However, any information contained in the missing priors and conditionals is not used in Dempster's rule of combination unless it can be obtained indirectly, in which case it is arguably then available for calculation using Bayes equations.

Dempster–Shafer theory allows one to specify a degree of ignorance in this situation instead of being forced to supply prior probabilities that add to unity. This sort of situation, and whether there is a real distinction between ''risk'' and ''ignorance'', has been extensively discussed by statisticians and economists. See, for example, the contrasting views of Daniel Ellsberg, Howard Raiffa, Kenneth Arrow and Frank Knight.


Formal definition

Let ''X'' be the ''universe'': the set representing all possible states of a system under consideration. The power set
:2^X
is the set of all subsets of ''X'', including the empty set \emptyset. For example, if
:X = \{a, b\}
then
:2^X = \{\emptyset, \{a\}, \{b\}, X\}.
The elements of the power set can be taken to represent propositions concerning the actual state of the system, by containing all and only the states in which the proposition is true.

The theory of evidence assigns a belief mass to each element of the power set. Formally, a function
:m: 2^X \rightarrow [0,1]
is called a ''basic belief assignment'' (BBA) when it has two properties. First, the mass of the empty set is zero:
:m(\emptyset) = 0.
Second, the masses of all the members of the power set add up to a total of 1:
:\sum_{A \in 2^X} m(A) = 1.
The mass ''m''(''A'') of ''A'', a given member of the power set, expresses the proportion of all relevant and available evidence that supports the claim that the actual state belongs to ''A'' but to no particular subset of ''A''. The value of ''m''(''A'') pertains ''only'' to the set ''A'' and makes no additional claims about any subsets of ''A'', each of which has, by definition, its own mass.

From the mass assignments, the upper and lower bounds of a probability interval can be defined. This interval contains the precise probability of a set of interest (in the classical sense), and is bounded by two non-additive continuous measures called belief (or support) and plausibility:
:\operatorname{bel}(A) \le P(A) \le \operatorname{pl}(A).
The belief bel(''A'') for a set ''A'' is defined as the sum of all the masses of subsets of the set of interest:
:\operatorname{bel}(A) = \sum_{B \mid B \subseteq A} m(B).
The plausibility pl(''A'') is the sum of all the masses of the sets ''B'' that intersect the set of interest ''A'':
:\operatorname{pl}(A) = \sum_{B \mid B \cap A \ne \emptyset} m(B).
The two measures are related to each other as follows:
:\operatorname{pl}(A) = 1 - \operatorname{bel}(\overline{A}).
Conversely, for finite ''A'', given the belief measure bel(''B'') for all subsets ''B'' of ''A'', we can find the masses ''m''(''A'') with the following inverse function:
:m(A) = \sum_{B \mid B \subseteq A} (-1)^{|A - B|} \operatorname{bel}(B)
where |''A'' − ''B''| is the difference of the cardinalities of the two sets.

It follows from the last two equations that, for a finite set ''X'', one needs to know only one of the three (mass, belief, or plausibility) to deduce the other two; though one may need to know the values for many sets in order to calculate one of the other values for a particular set. In the case of an infinite ''X'', there can be well-defined belief and plausibility functions but no well-defined mass function.
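To illustrate the inverse relation between mass and belief on a finite frame, here is a small Python sketch. The helper names (`subsets`, `belief_from_mass`, `mass_from_belief`) and the sample masses are assumptions made for this illustration only. The code computes bel for every subset of a two-element frame and then recovers the masses with the alternating-sign sum over subsets.

```python
from itertools import combinations

def subsets(s):
    """All subsets of the frozenset s, including the empty set and s itself."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def belief_from_mass(mass, frame):
    """bel(A) = sum of m(B) over all B that are subsets of A."""
    return {a: sum(m for b, m in mass.items() if b <= a) for a in subsets(frame)}

def mass_from_belief(bel, frame):
    """m(A) = sum over B subset of A of (-1)^|A - B| * bel(B)."""
    return {a: sum((-1) ** len(a - b) * bel[b] for b in subsets(a))
            for a in subsets(frame)}

# Illustrative masses on the frame X = {a, b}; these numbers are assumed.
frame = frozenset({"a", "b"})
mass = {frozenset({"a"}): 0.2, frozenset({"b"}): 0.5, frame: 0.3}

bel = belief_from_mass(mass, frame)
recovered = mass_from_belief(bel, frame)
for a in subsets(frame):
    print(sorted(a), round(bel[a], 3), round(recovered[a], 3))
```

Up to floating-point rounding, the recovered values agree with the original masses, with the empty set receiving mass 0.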


Dempster's rule of combination

The problem we now face is how to combine two independent sets of probability mass assignments in specific situations. When different sources express their beliefs over the frame in terms of belief constraints, such as in the case of giving hints or expressing preferences, Dempster's rule of combination is the appropriate fusion operator. This rule derives common shared belief between multiple sources and ignores ''all'' the conflicting (non-shared) belief through a normalization factor. Use of that rule in situations other than that of combining belief constraints has come under serious criticism, such as in the case of fusing separate belief estimates from multiple sources that are to be integrated in a cumulative manner, and not as constraints. Cumulative fusion means that all probability masses from the different sources are reflected in the derived belief, so no probability mass is ignored.

Specifically, the combination (called the joint mass) is calculated from the two sets of masses ''m''1 and ''m''2 in the following manner:
:m_{1,2}(\emptyset) = 0
:m_{1,2}(A) = (m_1 \oplus m_2)(A) = \frac{1}{1 - K} \sum_{B \cap C = A \ne \emptyset} m_1(B) m_2(C)
where
:K = \sum_{B \cap C = \emptyset} m_1(B) m_2(C).
''K'' is a measure of the amount of conflict between the two mass sets.
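The rule can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not a reference one: the function name `dempster_combine` and the dictionary-of-frozensets representation are choices made here. The input masses are the ''m''1 and ''m''2 columns of the signal light table in the Bayesian approximation section of this article.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule.

    Returns (joint_mass, K), where K is the total mass of conflicting pairs.
    """
    joint = {}
    conflict = 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        a = b & c
        if a:                      # non-empty intersection: shared belief
            joint[a] = joint.get(a, 0.0) + mb * mc
        else:                      # empty intersection: conflict
            conflict += mb * mc
    if not joint:
        raise ValueError("total conflict (K = 1): the combination is undefined")
    norm = 1.0 - conflict          # normalization factor 1 - K
    return {a: v / norm for a, v in joint.items()}, conflict

R, Y, G = frozenset({"red"}), frozenset({"yellow"}), frozenset({"green"})
m1 = {R: 0.35, Y: 0.25, G: 0.15, R | Y: 0.06, R | G: 0.05, Y | G: 0.04, R | Y | G: 0.10}
m2 = {R: 0.11, Y: 0.21, G: 0.33, R | Y: 0.21, R | G: 0.01, Y | G: 0.03, R | Y | G: 0.10}

m12, K = dempster_combine(m1, m2)
for focal, value in m12.items():
    print(sorted(focal), round(value, 2))
print("K =", round(K, 4))
```

Rounded to two decimals, the joint masses reproduce the ''m''1,2 column of that table (0.32, 0.33, 0.24, 0.07, 0.01, 0.01, 0.02), with a conflict of ''K'' ≈ 0.4262.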


Effects of conflict

The normalization factor above, 1 − ''K'', has the effect of completely ignoring conflict and attributing ''any'' mass associated with conflict to the empty set. This combination rule for evidence can therefore produce counterintuitive results, as we show next.


Example producing correct results in case of high conflict

The following example shows how Dempster's rule produces intuitive results when applied in a preference fusion situation, even when there is high conflict.
:Suppose that two friends, Alice and Bob, want to see a film at the cinema one evening, and that there are only three films showing: X, Y and Z. Alice expresses her preference for film X with probability 0.99, and her preference for film Y with a probability of only 0.01. Bob expresses his preference for film Z with probability 0.99, and his preference for film Y with a probability of only 0.01. When combining the preferences with Dempster's rule of combination it turns out that their combined preference results in probability 1.0 for film Y, because it is the only film that they both agree to see.
:Dempster's rule of combination produces intuitive results even in case of totally conflicting beliefs when interpreted in this way. Assume that Alice prefers film X with probability 1.0, and that Bob prefers film Z with probability 1.0. When trying to combine their preferences with Dempster's rule it turns out that it is undefined in this case, which means that there is no solution. This would mean that they cannot agree on seeing any film together, so they do not go to the cinema together that evening. However, the semantics of interpreting preference as a probability is vague: if it refers to the probability of seeing film X tonight, then we face the fallacy of the excluded middle: the event that actually occurs, seeing none of the films tonight, has a probability mass of 0.
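The arithmetic behind the first scenario can be written out explicitly. In the sketch below, the subscripts ''A'' and ''B'' (for Alice and Bob) are labels introduced here for illustration; the numbers are those given in the example. Every pairing of focal elements except (Y, Y) has an empty intersection and therefore counts as conflict.

```latex
\begin{aligned}
K &= m_A(X)\,m_B(Z) + m_A(X)\,m_B(Y) + m_A(Y)\,m_B(Z) \\
  &= 0.99 \cdot 0.99 + 0.99 \cdot 0.01 + 0.01 \cdot 0.99 = 0.9999, \\
m_{A,B}(Y) &= \frac{m_A(Y)\,m_B(Y)}{1 - K} = \frac{0.0001}{0.0001} = 1 .
\end{aligned}
```

In the second scenario the only pairing is (X, Z), so ''K'' = 1 and the normalization factor 1 − ''K'' is zero, which is why the rule is undefined there.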


Example producing counter-intuitive results in case of high conflict

An example with exactly the same numerical values was introduced by Lotfi Zadeh in 1979 (L. Zadeh, ''On the validity of Dempster's rule of combination'', Memo M79/24, Univ. of California, Berkeley, USA, 1979; L. Zadeh, ''Book review: A mathematical theory of evidence'', The AI Magazine, Vol. 5, No. 3, pp. 81–83, 1984; L. Zadeh, ''A simple view of the Dempster–Shafer Theory of Evidence and its implication for the rule of combination'', The AI Magazine, Vol. 7, No. 2, pp. 85–90, Summer 1986) to point out counter-intuitive results generated by Dempster's rule when there is a high degree of conflict. The example goes as follows:
:Suppose that one has two equi-reliable doctors and one doctor believes a patient has either a brain tumor, with a probability (i.e. a basic belief assignment, or mass of belief) of 0.99; or meningitis, with a probability of only 0.01. A second doctor believes the patient has a concussion, with a probability of 0.99, and believes the patient suffers from meningitis, with a probability of only 0.01. Applying Dempster's rule to combine these two sets of masses of belief, one finally gets ''m''(meningitis) = 1 (meningitis is diagnosed with 100 percent confidence).
Such a result goes against common sense, since both doctors agree that there is little chance that the patient has meningitis. This example has been the starting point of many research works trying to find a solid justification for Dempster's rule and for the foundations of Dempster–Shafer theory (E. Ruspini, ''The logical foundations of evidential reasoning'', ''SRI Technical Note'' 408, December 20, 1986 (revised April 27, 1987); N. Wilson, ''The assumptions behind Dempster's rule'', in ''Proceedings of the 9th Conference on Uncertainty in Artificial Intelligence'', pages 527–534, Morgan Kaufmann Publishers, San Mateo, CA, USA, 1993) or to show the inconsistencies of this theory (F. Voorbraak, ''On the justification of Dempster's rule of combination'', ''Artificial Intelligence'', Vol. 48, pp. 171–197, 1991; Pei Wang, ''A Defect in Dempster–Shafer Theory'', in ''Proceedings of the 10th Conference on Uncertainty in Artificial Intelligence'', pages 560–566, Morgan Kaufmann Publishers, San Mateo, CA, USA, 1994; P. Walley, ''Statistical Reasoning with Imprecise Probabilities'', Chapman and Hall, London, pp. 278–281, 1991).


Example producing counter-intuitive results in case of low conflict

The following example shows where Dempster's rule produces a counter-intuitive result, even when there is low conflict.
:Suppose that one doctor believes a patient has either a brain tumor, with a probability of 0.99, or meningitis, with a probability of only 0.01. A second doctor also believes the patient has a brain tumor, with a probability of 0.99, and believes the patient suffers from concussion, with a probability of only 0.01. If we calculate ''m''(brain tumor) with Dempster's rule, we obtain
::m(\text{brain tumor}) = \operatorname{Bel}(\text{brain tumor}) = 1.
This result implies ''complete support'' for the diagnosis of a brain tumor, which both doctors believed ''very likely''. The agreement arises from the low degree of conflict between the two sets of evidence comprised by the two doctors' opinions. In either case, it would be reasonable to expect that
:m(\text{brain tumor}) < 1 \text{ and } \operatorname{Bel}(\text{brain tumor}) < 1,
since the existence of non-zero belief probabilities for other diagnoses implies ''less than complete support'' for the brain tumor diagnosis.
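Written out, the computation looks as follows. The abbreviations ''T'' (brain tumor), ''M'' (meningitis) and ''C'' (concussion) and the subscripts 1 and 2 for the two doctors are introduced here for illustration. The only pairing with a non-empty intersection is (''T'', ''T''), so all surviving mass lands on the brain tumor hypothesis after normalization.

```latex
\begin{aligned}
K &= m_1(T)\,m_2(C) + m_1(M)\,m_2(T) + m_1(M)\,m_2(C) \\
  &= 0.99 \cdot 0.01 + 0.01 \cdot 0.99 + 0.01 \cdot 0.01 = 0.0199, \\
m_{1,2}(T) &= \frac{m_1(T)\,m_2(T)}{1 - K} = \frac{0.9801}{0.9801} = 1 .
\end{aligned}
```

The conflict is small (''K'' ≈ 0.02), yet the minority diagnoses are still eliminated entirely because neither is shared by both doctors.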


Dempster–Shafer as a generalisation of Bayesian theory

As in Dempster–Shafer theory, a Bayesian belief function \operatorname{bel}: 2^X \rightarrow [0,1] has the properties \operatorname{bel}(\emptyset) = 0 and \operatorname{bel}(X) = 1. The third condition, however, is subsumed by, but relaxed in DS theory:
:\text{If } A \cap B = \emptyset, \text{ then } \operatorname{bel}(A \cup B) = \operatorname{bel}(A) + \operatorname{bel}(B).
Either of the following conditions implies the Bayesian special case of the DS theory:
* \operatorname{bel}(A) + \operatorname{bel}(\bar{A}) = 1 \text{ for all } A \subseteq X.
* For finite ''X'', all focal elements of the belief function are singletons.
As an example of how the two approaches differ, a Bayesian could model the color of a car as a probability distribution over (red, green, blue), assigning one number to each color. Dempster–Shafer would assign numbers to each of (red, green, blue, (red or green), (red or blue), (green or blue), (red or green or blue)). These numbers do not have to be coherent; for example, Bel(red) + Bel(green) does not have to equal Bel(red or green).

Thus, Bayes' conditional probability can be considered as a special case of Dempster's rule of combination. However, it lacks many (if not most) of the properties that make Bayes' rule intuitively desirable, leading some to argue that it cannot be considered a generalization in any meaningful sense. For example, DS theory violates the requirements for Cox's theorem, which implies that it cannot be considered a coherent (contradiction-free) generalization of classical logic; specifically, DS theory violates the requirement that a statement be either true or false (but not both). As a result, DS theory is subject to the Dutch Book argument, implying that any agent using DS theory would agree to a series of bets that result in a guaranteed loss.


Bayesian approximation

The Bayesian approximation reduces a given basic probability assignment (bpa) m to a (discrete) probability distribution, i.e. only singleton subsets of the frame of discernment are allowed to be focal elements of the approximated version \underline{m} of m:
: \underline{m}(A) = \left\{ \begin{aligned} & \frac{\sum\limits_{B \mid A \subseteq B} m(B) }{ \sum\limits_C m(C) \cdot |C| }, & |A| = 1 \\ & 0, & \text{otherwise} \end{aligned} \right.
It is useful for those who are only interested in single-state hypotheses. Applying it to the signal light example gives:
{| class="wikitable"
|-
! Hypothesis !! m_1 !! m_2 !! m_{1,2} !! \underline{m}_1 !! \underline{m}_2 !! \underline{m}_{1,2}
|-
| None || 0 || 0 || 0 || 0 || 0 || 0
|-
| Red || 0.35 || 0.11 || 0.32 || 0.41 || 0.30 || 0.37
|-
| Yellow || 0.25 || 0.21 || 0.33 || 0.33 || 0.38 || 0.38
|-
| Green || 0.15 || 0.33 || 0.24 || 0.25 || 0.32 || 0.25
|-
| Red or Yellow || 0.06 || 0.21 || 0.07 || 0 || 0 || 0
|-
| Red or Green || 0.05 || 0.01 || 0.01 || 0 || 0 || 0
|-
| Yellow or Green || 0.04 || 0.03 || 0.01 || 0 || 0 || 0
|-
| Any || 0.1 || 0.1 || 0.02 || 0 || 0 || 0
|}
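The approximation is simple to compute. The following Python sketch is illustrative only (the function name `bayesian_approximation` and the data layout are assumptions made here). It applies the formula to the ''m''1 masses of the signal light example: each colour collects the mass of every focal set containing it, and the result is divided by the sum of ''m''(''C'') · |''C''| over all focal sets.

```python
def bayesian_approximation(mass):
    """Reduce a mass function (dict: frozenset -> mass) to singleton masses.

    Each state x receives the total mass of every focal set containing x,
    divided by the sum over all focal sets C of m(C) * |C|.
    """
    denom = sum(m * len(c) for c, m in mass.items())
    frame = frozenset().union(*mass.keys())
    return {x: sum(m for b, m in mass.items() if x in b) / denom for x in frame}

# m_1 masses of the signal light example.
R, Y, G = frozenset({"red"}), frozenset({"yellow"}), frozenset({"green"})
m1 = {R: 0.35, Y: 0.25, G: 0.15, R | Y: 0.06, R | G: 0.05, Y | G: 0.04, R | Y | G: 0.10}

approx = bayesian_approximation(m1)
for colour in ("red", "yellow", "green"):
    print(colour, round(approx[colour], 2))
```

For these inputs the denominator is 1.35, and the rounded results (0.41, 0.33, 0.25) match the corresponding approximation column for ''m''1; the three values sum to 1, as a probability distribution must.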


Criticism

Judea Pearl (1988a, chapter 9; 1988b and 1990) has argued that it is misleading to interpret belief functions as representing either "probabilities of an event," or "the confidence one has in the probabilities assigned to various outcomes," or "degrees of belief (or confidence, or trust) in a proposition," or "degree of ignorance in a situation." (Pearl, J. (1988a), ''Probabilistic Reasoning in Intelligent Systems'' (Revised Second Printing), San Mateo, CA: Morgan Kaufmann.) Instead, belief functions represent the probability that a given proposition is ''provable'' from a set of other propositions, to which probabilities are assigned. Confusing probabilities of ''truth'' with probabilities of ''provability'' may lead to counterintuitive results in reasoning tasks such as (1) representing incomplete knowledge, (2) belief-updating and (3) evidence pooling. He further demonstrated that, if partial knowledge is encoded and updated by belief function methods, the resulting beliefs cannot serve as a basis for rational decisions.

Kłopotek and Wierzchoń (M. A. Kłopotek, S. T. Wierzchoń: "A New Qualitative Rough-Set Approach to Modeling Belief Functions", in: L. Polkowski, A. Skowron (eds.): ''Rough Sets And Current Trends In Computing. Proc. 1st International Conference RSCTC'98'', Warsaw, June 22–26, 1998, ''Lecture Notes in Artificial Intelligence 1424'', Springer-Verlag, pp. 346–353) proposed to interpret the Dempster–Shafer theory in terms of statistics of decision tables (of the rough set theory), whereby the operator of combining evidence should be seen as relational joining of decision tables. In another interpretation, M. A. Kłopotek and S. T. Wierzchoń ("Empirical Models for the Dempster–Shafer Theory", in: Srivastava, R. P., Mock, T. J. (eds.), ''Belief Functions in Business Decisions''. Series: ''Studies in Fuzziness and Soft Computing'', Vol. 88, Springer-Verlag, March 2002, pp. 62–112) propose to view this theory as describing destructive material processing (under loss of properties), e.g. as in some semiconductor production processes. Under both interpretations reasoning in DST gives correct results, contrary to the earlier probabilistic interpretations, criticized by Pearl in the cited papers and by other researchers.

Jøsang proved that Dempster's rule of combination actually is a method for fusing belief constraints. It only represents an approximate fusion operator in other situations, such as cumulative fusion of beliefs, but generally produces incorrect results in such situations. The confusion around the validity of Dempster's rule therefore originates in the failure to correctly interpret the nature of the situations to be modeled. Dempster's rule of combination always produces correct and intuitive results in situations of fusing belief constraints from different sources.


Relational measures

In considering preferences one might use the partial order of a lattice instead of the total order of the real line as found in Dempster–Shafer theory. Indeed, Gunther Schmidt has proposed this modification and outlined the method (Gunther Schmidt (2006), "Relational measures and integration", Lecture Notes in Computer Science # 4136, pages 343−57, Springer).
Given a set of criteria ''C'' and a bounded lattice ''L'' with ordering ≤, Schmidt defines a relational measure to be a function ''μ'' from the power set of ''C'' into ''L'' that respects the order ⊆ on \mathbb{P}(''C''):
:A \subseteq B \implies \mu(A) \leq \mu(B)
and such that ''μ'' takes the empty subset of ''C'' to the least element of ''L'', and takes ''C'' to the greatest element of ''L''. Schmidt compares ''μ'' with the belief function of Shafer, and he also considers a method of combining measures generalizing the approach of Dempster (when new evidence is combined with previously held evidence). He also introduces a ''relational integral'' and compares it to the Choquet integral and Sugeno integral. Any relation ''m'' between ''C'' and ''L'' may be introduced as a "direct valuation", then processed with the calculus of relations to obtain a ''possibility measure'' ''μ''.
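The defining conditions of a relational measure can be checked mechanically on a small example. The Python sketch below is only an illustration: the helper names `powerset` and `is_relational_measure`, the three criteria, and the choice of the lattice {0,1}×{0,1} with componentwise ordering are assumptions made here and are not taken from Schmidt's paper.

```python
from itertools import combinations


def powerset(criteria):
    """Return all subsets of the criteria as frozensets."""
    items = list(criteria)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_relational_measure(criteria, mu, leq, bottom, top):
    """Check the three conditions on mu: power set of criteria -> lattice.

    mu is a dict from frozenset to lattice element, leq is the lattice
    ordering, bottom and top are the least and greatest lattice elements.
    """
    subsets = powerset(criteria)
    # The empty set must map to the least element, the full set to the greatest.
    if mu[frozenset()] != bottom or mu[frozenset(criteria)] != top:
        return False
    # Monotonicity: A subset of B implies mu(A) <= mu(B) in the lattice.
    return all(leq(mu[a], mu[b]) for a in subsets for b in subsets if a <= b)


# Hypothetical lattice: pairs of bits ordered componentwise (a partial order,
# since (1, 0) and (0, 1) are incomparable).
def leq(p, q):
    return p[0] <= q[0] and p[1] <= q[1]


criteria = {"p", "q", "r"}
good = {s: (int("p" in s), int("q" in s or "r" in s)) for s in powerset(criteria)}
bad = dict(good)
bad[frozenset({"p", "q"})] = (0, 1)  # breaks monotonicity against {"p"} -> (1, 0)

print(is_relational_measure(criteria, good, leq, (0, 0), (1, 1)))
print(is_relational_measure(criteria, bad, leq, (0, 0), (1, 1)))
```

The lattice here is deliberately not a chain, so some values of ''μ'' are incomparable. That is the feature distinguishing this setting from measures valued in the real line.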


See also

* Imprecise probability
* Upper and lower probabilities
* Possibility theory
* Probabilistic logic
* Bayes' theorem
* Bayesian network
* G. L. S. Shackle
* Transferable belief model
* Info-gap decision theory
* Subjective logic
* Doxastic logic
* Linear belief function


References


Further reading

* Yang, J. B. and Xu, D. L. ''Evidential Reasoning Rule for Evidence Combination'', Artificial Intelligence, Vol. 205, pp. 1–29, 2013.
* Yager, R. R., & Liu, L. (2008). ''Classic works of the Dempster–Shafer theory of belief functions.'' Studies in fuzziness and soft computing, v. 219. Berlin: Springer.
* Joseph C. Giarratano and Gary D. Riley (2005); ''Expert Systems: principles and programming'', ed. Thomson Course Tech.
* Beynon, M., Curry, B. and Morgan, P. ''The Dempster–Shafer theory of evidence: an alternative approach to multicriteria decision modelling'', Omega, Vol. 28, pp. 37–50, 2000.


External links


BFAS: Belief Functions and Applications Society