Condorcet jury theorem

Condorcet's jury theorem is a political science theorem about the relative probability of a given group of individuals arriving at a correct decision. The theorem was first expressed by the Marquis de Condorcet in his 1785 work ''Essay on the Application of Analysis to the Probability of Majority Decisions''.

The assumptions of the theorem are that a group wishes to reach a decision by majority vote, that one of the two outcomes of the vote is ''correct'', and that each voter has an independent probability ''p'' of voting for the correct decision. The theorem asks how many voters we should include in the group. The result depends on whether ''p'' is greater than or less than 1/2:

* If ''p'' is greater than 1/2 (each voter is more likely to vote correctly), then adding more voters increases the probability that the majority decision is correct. In the limit, the probability that the majority votes correctly approaches 1 as the number of voters increases.
* On the other hand, if ''p'' is less than 1/2 (each voter is more likely to vote incorrectly), then adding more voters makes things worse: the optimal jury consists of a single voter.

Since Condorcet, many other researchers have proved various other jury theorems, relaxing some or all of Condorcet's assumptions.


Proofs


Proof 1: Calculating the probability that two additional voters change the outcome

To avoid the need for a tie-breaking rule, we assume ''n'' is odd. Essentially the same argument works for even ''n'' if ties are broken by adding a single voter.

Now suppose we start with ''n'' voters, and let ''m'' of these voters vote correctly. Consider what happens when we add two more voters (to keep the total number odd). The majority vote changes in only two cases:

* ''m'' was one vote too small to get a majority of the ''n'' votes, but both new voters voted correctly.
* ''m'' was just equal to a majority of the ''n'' votes, but both new voters voted incorrectly.

The rest of the time, the new votes either cancel out, only increase the gap, or do not make enough of a difference. So we only care what happens when a single vote (among the first ''n'') separates a correct from an incorrect majority.

Restricting our attention to this case, we can imagine that the first ''n'' − 1 votes cancel out and that the deciding vote is cast by the ''n''-th voter. In this case the probability of getting a correct majority is just ''p''. Now suppose we send in the two extra voters. The probability that they change an incorrect majority to a correct majority is (1 − ''p'')''p''^2, while the probability that they change a correct majority to an incorrect majority is ''p''(1 − ''p'')^2. The first of these probabilities is greater than the second if and only if ''p'' > 1/2, proving the theorem.
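
The two flip cases of this proof can be checked numerically. The following is a minimal sketch, not part of Condorcet's argument: it assumes an odd jury size and independent voters with a common competence ''p'', and the function names `flip_probabilities` and `net_gain` are hypothetical, chosen here for illustration.

```python
from math import comb


def flip_probabilities(n, p):
    """Return (wrong_to_right, right_to_wrong) for adding two voters.

    n is the (odd) size of the original jury and p the probability that
    any single voter is correct.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd")
    q = 1.0 - p
    short = (n - 1) // 2  # correct votes: one short of a majority
    bare = (n + 1) // 2   # correct votes: the barest possible majority
    # Incorrect majority by a single vote, then both newcomers vote correctly.
    wrong_to_right = comb(n, short) * p**short * q**(n - short) * p**2
    # Correct majority by a single vote, then both newcomers vote incorrectly.
    right_to_wrong = comb(n, bare) * p**bare * q**(n - bare) * q**2
    return wrong_to_right, right_to_wrong


def net_gain(n, p):
    """Change in the probability of a correct majority when going from n to n + 2 voters."""
    up, down = flip_probabilities(n, p)
    return up - down


if __name__ == "__main__":
    for p in (0.4, 0.5, 0.6):
        print(p, net_gain(5, p))
```

Since the two binomial coefficients are equal, the sign of `net_gain` is the sign of ''p'' − 1/2, which is the statement of the theorem for a single step from ''n'' to ''n'' + 2 voters.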


Proof 2: Calculating the probability that the decision is correct

This proof is direct; it just sums up the probabilities of the majorities. Each term of the sum multiplies the number of combinations that form a majority of a given size by the probability of each such combination. Each majority is counted using a combination, ''n'' items taken ''k'' at a time, where ''n'' is the jury size and ''k'' is the size of the majority. Probabilities range from 0 (the vote is always wrong) to 1 (always right). Each person decides independently, so the probabilities of their decisions multiply. The probability of each correct decision is ''p''. The probability of an incorrect decision, ''q'', is the complement of ''p'', i.e. 1 − ''p''. The power notation p^x is shorthand for ''x'' multiplications of ''p''.

Committee or jury accuracies can be easily estimated by using this approach in computer spreadsheets or programs.

As an example, let us take the simplest case of ''n'' = 3, ''p'' = 0.8. We need to show that 3 people have a higher than 0.8 chance of being right. Indeed:

: 0.8 × 0.8 × 0.8 + 0.8 × 0.8 × 0.2 + 0.8 × 0.2 × 0.8 + 0.2 × 0.8 × 0.8 = 0.896.
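
As a sketch of how such a program might look, the binomial sum can be written in a few lines of Python. The function names below are hypothetical. The first function sums over majority sizes ''k''; the second lists every vote pattern explicitly, as in the worked example, and is practical only for small ''n''. Both assume an odd ''n''; for even ''n'' they count a tie as an incorrect outcome, which is an assumption of this sketch rather than part of the theorem.

```python
from itertools import product
from math import comb


def majority_correct(n, p):
    """Probability that more than half of n independent voters are correct."""
    q = 1.0 - p
    # k runs over every majority size, from the smallest majority up to unanimity.
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(n // 2 + 1, n + 1))


def majority_correct_by_enumeration(n, p):
    """Same probability, computed by listing all 2**n vote patterns."""
    total = 0.0
    for votes in product((True, False), repeat=n):
        correct = sum(votes)  # number of correct votes in this pattern
        if correct > n / 2:
            total += p**correct * (1.0 - p) ** (n - correct)
    return total


if __name__ == "__main__":
    print(round(majority_correct(3, 0.8), 3))  # 0.896, the example from the text
    for n in (1, 3, 11, 101):
        print(n, majority_correct(n, 0.6))
```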


Asymptotics

Asymptotics is “the calculus of approximations”. It is used to solve hard problems that cannot be solved exactly and to provide simpler forms of complicated results, from early results like Taylor's and Stirling's formulas to the prime number theorem. An important topic in the study of asymptotics is the asymptotic distribution, which is a probability distribution that is in a sense the "limiting" distribution of a sequence of distributions.

The probability of a correct majority decision ''P''(''n'', ''p''), when the individual probability ''p'' is close to 1/2, grows linearly in terms of ''p'' − 1/2. For ''n'' voters, each one having probability ''p'' of deciding correctly, and for odd ''n'' (where there are no possible ties):

:P(n, p) = 1/2 + c_1 (p - 1/2) + c_3 (p - 1/2)^3 + O\left( (p - 1/2)^5 \right),

where

:c_1 = \binom{n}{\lfloor n/2 \rfloor} \frac{\lfloor n/2 \rfloor + 1}{4^{\lfloor n/2 \rfloor}} = \sqrt{\frac{2n+1}{\pi}} \left(1 + \frac{1}{16 n^2} + O(n^{-3})\right),

and the asymptotic approximation in terms of ''n'' is very accurate. The expansion is only in odd powers and c_3 < 0. In simple terms, this says that when the decision is difficult (''p'' close to 1/2), the gain by having ''n'' voters grows proportionally to \sqrt{n}.
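
The accuracy of the approximation for c_1 can be inspected numerically. The sketch below rests on two stated assumptions: that the exact coefficient is the slope of ''P''(''n'', ''p'') at ''p'' = 1/2, which for odd ''n'' equals C(n, ⌊n/2⌋)(⌊n/2⌋ + 1)/4^⌊n/2⌋, and that the approximation is sqrt((2n + 1)/π)(1 + 1/(16n^2)). The function names are hypothetical.

```python
from math import comb, pi, sqrt


def c1_exact(n):
    """Slope of P(n, p) at p = 1/2 for odd n, from the binomial sum."""
    m = n // 2
    return comb(n, m) * (m + 1) / 4**m


def c1_asymptotic(n):
    """Leading terms of the large-n approximation, dropping the O(n**-3) remainder."""
    return sqrt((2 * n + 1) / pi) * (1 + 1 / (16 * n**2))


if __name__ == "__main__":
    for n in (1, 3, 11, 101, 1001):
        exact, approx = c1_exact(n), c1_asymptotic(n)
        print(n, exact, approx, abs(exact - approx))
```

For ''n'' = 3 the exact value is 1.5, which agrees with expanding ''P''(3, ''p'') = 3''p''^2 − 2''p''^3 around ''p'' = 1/2.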


The theorem in other disciplines

The Condorcet jury theorem has recently been used to conceptualize score integration when several physician readers (radiologists, endoscopists, etc.) independently evaluate images for disease activity. This task arises in central reading performed during clinical trials and has similarities to voting. According to the authors, the application of the theorem can translate individual reader scores into a final score in a fashion that is mathematically sound (by avoiding averaging of ordinal data), mathematically tractable for further analysis, and consistent with the scoring task at hand (based on decisions about the presence or absence of features, a subjective classification task).

The Condorcet jury theorem is also used in ensemble learning in the field of machine learning. An ensemble method combines the predictions of many individual classifiers by majority voting. Assuming that each of the individual classifiers predicts with slightly greater than 50% accuracy and that their predictions are independent, the accuracy of the ensemble's majority vote will, for a sufficiently large ensemble, be far greater than the accuracy of any individual classifier.
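
The ensemble claim can be illustrated with a small Monte Carlo sketch. It does not train any real classifier: each "classifier" is modeled as an independent coin that is correct with probability ''p'', which is exactly the idealized assumption of the theorem and rarely holds for models trained on the same data. The function name, the ensemble sizes and the seed are hypothetical choices made for illustration.

```python
import random


def simulate_ensemble_accuracy(n_classifiers, p, n_trials, seed=0):
    """Estimate how often a majority of independent classifiers is correct.

    Each classifier is simulated as correct with probability p,
    independently of the others and of the input.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    hits = 0
    for _ in range(n_trials):
        correct_votes = sum(rng.random() < p for _ in range(n_classifiers))
        if correct_votes > n_classifiers / 2:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    for size in (1, 11, 101, 1001):
        print(size, simulate_ensemble_accuracy(size, 0.55, n_trials=2000))
```

In practice the independence assumption is the weak point: correlated errors between classifiers reduce the gain from voting, which this sketch does not model.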


Applicability to democratic processes

Many political theorists and philosophers use Condorcet's jury theorem (CJT) to defend democracy; see Brennan and references therein. Nevertheless, it is an empirical question whether the assumptions of the theorem hold in real life. Note that the CJT is a ''double-edged sword'': it can prove either that majority rule is an (almost) perfect mechanism to aggregate information, when p > 1/2, or an (almost) perfect disaster, when p < 1/2. A disaster would mean that the wrong option is chosen systematically. Some authors have argued that we are in the latter scenario. For instance, Bryan Caplan has extensively argued that voters' knowledge is systematically biased toward (probably) wrong options. In the CJT setup, this could be interpreted as evidence for p < 1/2.

Recently, another approach to studying the applicability of the CJT was taken. Instead of considering the homogeneous case, each voter is allowed to have a probability p_i \in [0,1], possibly different from other voters. This case was previously studied by Daniel Berend and Jacob Paroush and includes the classical theorem of Condorcet (when p_i = p for all i \in \mathbb{N}) and other results, like the Miracle of Aggregation (when p_i = 1/2 for most voters and p_i = 1 for a small proportion of them). Then, following a Bayesian approach, the prior probability (in this case, ''a priori'') of the thesis predicted by the theorem is estimated. That is, if we choose an arbitrary sequence of voters (i.e., a sequence (p_i)_{i \in \mathbb{N}}), will the thesis of the CJT hold? The answer is no. More precisely, if a random sequence of p_i is taken following an unbiased distribution that favors neither competence, p_i > 1/2, nor incompetence, p_i < 1/2, then the thesis predicted by the theorem will almost surely not hold. With this new approach, proponents of the CJT should present strong evidence of competence to overcome the low prior probability. That is, it is not only the case that there is evidence against competence (posterior probability), but also that we cannot expect the CJT to hold in the absence of any evidence (prior probability).
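
The heterogeneous setting can be explored with a short sketch that computes the probability of a correct majority exactly for any finite list of competences p_i. It uses a standard dynamic-programming recursion over the number of correct votes. The function names are hypothetical, and drawing p_i uniformly from [0, 1] is only one possible example of an "unbiased" distribution, assumed here for illustration; a finite computation of this kind cannot establish the almost-sure statement, it only shows how the probability behaves for particular sequences.

```python
import random


def majority_correct_heterogeneous(probs):
    """Probability that a strict majority of independent voters is correct.

    probs[i] is the competence p_i of voter i.
    """
    # dist[k] = probability that exactly k of the voters processed so far are correct
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)  # this voter is wrong
            nxt[k + 1] += mass * p      # this voter is right
        dist = nxt
    n = len(probs)
    return sum(dist[n // 2 + 1:])


def random_competences(n, seed=0):
    """Draw n competences uniformly from [0, 1] (an illustrative assumption)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


if __name__ == "__main__":
    # Homogeneous case: reduces to the classical theorem.
    print(majority_correct_heterogeneous([0.8] * 3))
    # Miracle of Aggregation: mostly coin-flippers plus a few perfectly informed voters.
    print(majority_correct_heterogeneous([0.5] * 90 + [1.0] * 11))
    # Random sequences drawn without favoring competence or incompetence.
    for seed in range(5):
        print(seed, majority_correct_heterogeneous(random_competences(501, seed)))
```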


Further reading

* Condorcet method
* Condorcet paradox
* Jury theorem
* Law of large numbers
* Wisdom of the crowd

