In probability theory, a probability space or a probability triple $(\Omega, \mathcal{F}, P)$ is a mathematical construct that provides a formal model of a random process or "experiment". For example, one can define a probability space which models the throwing of a die.

A probability space consists of three elements (Stroock, D. W. (1999). Probability theory: an analytic view. Cambridge University Press):
# A sample space, $\Omega$, which is the set of all possible outcomes.
# An event space, which is a set of events, $\mathcal{F}$, an event being a set of outcomes in the sample space.
# A probability function, $P$, which assigns each event in the event space a probability, which is a number between 0 and 1.

In order to provide a sensible model of probability, these elements must satisfy a number of axioms, detailed in this article.

In the example of the throw of a standard die, we would take the sample space to be $\{1,2,3,4,5,6\}$. For the event space, we could simply use the set of all subsets of the sample space, which would then contain simple events such as $\{5\}$ ("the die lands on 5"), as well as complex events such as $\{2,4,6\}$ ("the die lands on an even number"). Finally, for the probability function, we would map each event to the number of outcomes in that event divided by 6; so for example, $\{5\}$ would be mapped to $1/6$, and $\{2,4,6\}$ would be mapped to $3/6 = 1/2$.

When an experiment is conducted, we imagine that "nature" "selects" a single outcome, $\omega$, from the sample space $\Omega$. All the events in the event space $\mathcal{F}$ that contain the selected outcome $\omega$ are said to "have occurred". This "selection" happens in such a way that if the experiment were repeated many times, the number of occurrences of each event, as a fraction of the total number of experiments, would most likely tend towards the probability assigned to that event by the probability function $P$.

The Soviet mathematician Andrey Kolmogorov introduced the notion of probability space, together with other axioms of probability, in the 1930s. In modern probability theory there are a number of alternative approaches for axiomatization, for example the algebra of random variables.
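The die example can be written out as a short Python sketch. This is only an illustration under the assumptions stated in the text (a fair six-sided die, every subset an event); the names `power_set` and `prob` are hypothetical helpers, not part of any library.

```python
from fractions import Fraction
from itertools import combinations

# Sample space for one throw of a standard die.
omega = frozenset(range(1, 7))

def power_set(s):
    """Return all subsets of s as a set of frozensets."""
    items = list(s)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}

# Event space: every subset of the sample space (2**6 = 64 events).
events = power_set(omega)

def prob(event):
    """Probability function: number of outcomes in the event divided by 6."""
    return Fraction(len(event), len(omega))

print(len(events))       # 64
print(prob({5}))         # 1/6
print(prob({2, 4, 6}))   # 1/2
```

Exact fractions are used instead of floats so that the values match the text literally.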

Introduction

A probability space is a mathematical triplet $(\Omega, \mathcal{F}, P)$ that presents a model for a particular class of real-world situations. As with other models, its author ultimately defines which elements $\Omega$, $\mathcal{F}$, and $P$ will contain.
* The sample space $\Omega$ is the set of all possible outcomes. An outcome is the result of a single execution of the model. Outcomes may be states of nature, possibilities, experimental results and the like. Every instance of the real-world situation (or run of the experiment) must produce exactly one outcome. If outcomes of different runs of an experiment differ in any way that matters, they are distinct outcomes. Which differences matter depends on the kind of analysis we want to do. This leads to different choices of sample space.
* The σ-algebra $\mathcal{F}$ is a collection of all the events we would like to consider. This collection may or may not include each of the elementary events. Here, an "event" is a set of zero or more outcomes; that is, a subset of the sample space. An event is considered to have "happened" during an experiment when the outcome of the latter is an element of the event. Since the same outcome may be a member of many events, it is possible for many events to have happened given a single outcome. For example, when the trial consists of throwing two dice, the set of all outcomes with a sum of 7 pips may constitute an event, whereas the set of outcomes with an odd number of pips may constitute another event. If the outcome is the element of the elementary event of two pips on the first die and five on the second, then both of the events, "7 pips" and "odd number of pips", are said to have happened.
* The probability measure $P$ is a set function returning an event's probability. A probability is a real number between zero (impossible events have probability zero, though probability-zero events are not necessarily impossible) and one (the event happens almost surely, with almost total certainty). Thus $P$ is a function $P : \mathcal{F} \to [0,1]$.

The probability measure function must satisfy two simple requirements. First, the probability of a countable union of mutually exclusive events must be equal to the countable sum of the probabilities of each of these events. For example, the probability of the union of the mutually exclusive events $\text{Head}$ and $\text{Tail}$ in the random experiment of one coin toss, $P(\text{Head}\cup\text{Tail})$, is the sum of the probability for $\text{Head}$ and the probability for $\text{Tail}$, $P(\text{Head}) + P(\text{Tail})$. Second, the probability of the sample space $\Omega$ must be equal to 1 (which accounts for the fact that, given an execution of the model, some outcome must occur). In the previous example the probability of the set of outcomes, $P(\{\text{Head},\text{Tail}\})$, must be equal to one, because it is entirely certain that the outcome will be either $\text{Head}$ or $\text{Tail}$ (the model neglects any other possibility) in a single coin toss.

Not every subset of the sample space $\Omega$ must necessarily be considered an event: some of the subsets are simply not of interest, others cannot be "measured". This is not so obvious in a case like a coin toss. In a different example, one could consider javelin throw lengths, where the events typically are intervals like "between 60 and 65 meters" and unions of such intervals, but not sets like the "irrational numbers between 60 and 65 meters".
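The remark that a single outcome can make several events "happen" at once can be checked with a small Python sketch of the two-dice trial. The event named "double" is an extra, assumed example added only to show an event that does not occur; it is not in the text.

```python
from itertools import product

# Sample space: ordered pairs (first die, second die).
omega = set(product(range(1, 7), repeat=2))

# A few events, each a subset of the sample space.
events = {
    "7 pips": {w for w in omega if sum(w) == 7},
    "odd number of pips": {w for w in omega if sum(w) % 2 == 1},
    "double": {w for w in omega if w[0] == w[1]},  # assumed extra event
}

# Two pips on the first die, five on the second.
outcome = (2, 5)

# An event has happened exactly when it contains the outcome.
occurred = sorted(name for name, ev in events.items() if outcome in ev)
print(occurred)  # ['7 pips', 'odd number of pips']
```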

Definition

In short, a probability space is a measure space such that the measure of the whole space is equal to one.

The expanded definition is the following: a probability space is a triple $(\Omega,\mathcal{F},P)$ consisting of:
* the sample space $\Omega$: an arbitrary non-empty set,
* the σ-algebra $\mathcal{F} \subseteq 2^\Omega$ (also called σ-field): a set of subsets of $\Omega$, called events, such that:
** $\mathcal{F}$ contains the sample space: $\Omega \in \mathcal{F}$,
** $\mathcal{F}$ is closed under complements: if $A\in\mathcal{F}$, then also $(\Omega\setminus A)\in\mathcal{F}$,
** $\mathcal{F}$ is closed under countable unions: if $A_i\in\mathcal{F}$ for $i=1,2,\dots$, then also $(\bigcup_{i=1}^\infty A_i)\in\mathcal{F}$,
*** The corollary from the previous two properties and De Morgan's law is that $\mathcal{F}$ is also closed under countable intersections: if $A_i\in\mathcal{F}$ for $i=1,2,\dots$, then also $(\bigcap_{i=1}^\infty A_i)\in\mathcal{F}$,
* the probability measure $P:\mathcal{F}\to[0,1]$: a function on $\mathcal{F}$ such that:
** $P$ is countably additive (also called σ-additive): if $\{A_i\}_{i=1}^\infty \subseteq \mathcal{F}$ is a countable collection of pairwise disjoint sets, then $P(\bigcup_{i=1}^\infty A_i)=\sum_{i=1}^\infty P(A_i)$,
** the measure of the entire sample space is equal to one: $P(\Omega)=1$.
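For a finite sample space the conditions above can be checked mechanically. The following Python sketch assumes a finite family of events, where closure under pairwise unions already gives closure under countable unions and additivity on disjoint pairs already gives countable additivity. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def is_sigma_algebra(omega, family):
    """Check the sigma-algebra conditions for a finite family of subsets."""
    omega = frozenset(omega)
    family = {frozenset(a) for a in family}
    if omega not in family:                              # contains the sample space
        return False
    if any(omega - a not in family for a in family):     # closed under complements
        return False
    # Finite case: pairwise unions suffice for countable unions.
    return all(a | b in family for a in family for b in family)

def is_probability_measure(omega, family, P):
    """Check P(omega) = 1, values in [0, 1], and additivity on disjoint pairs."""
    omega = frozenset(omega)
    family = {frozenset(a) for a in family}
    if P(omega) != 1:
        return False
    if any(not 0 <= P(a) <= 1 for a in family):
        return False
    return all(P(a | b) == P(a) + P(b)
               for a, b in combinations(family, 2) if not (a & b))

# One fair coin toss, with every subset as an event.
omega = {"H", "T"}
family = [set(), {"H"}, {"T"}, {"H", "T"}]

def P(event):
    return Fraction(len(event), 2)

print(is_sigma_algebra(omega, family))                        # True
print(is_probability_measure(omega, family, P))               # True
print(is_sigma_algebra(omega, [set(), {"H"}, {"H", "T"}]))    # False
```

The last call fails because the complement of {"H"} is missing from the family.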

Discrete case

Discrete probability theory needs only at most countable sample spaces $\Omega$. Probabilities can be ascribed to points of $\Omega$ by the probability mass function $p:\Omega\to[0,1]$ such that $\sum_{\omega\in\Omega} p(\omega)=1$. All subsets of $\Omega$ can be treated as events (thus, $\mathcal{F}=2^\Omega$ is the power set). The probability measure takes the simple form

$P(A) = \sum_{\omega\in A} p(\omega)$ for all $A \subseteq \Omega$. (⁎)

The greatest σ-algebra $\mathcal{F}=2^\Omega$ describes the complete information. In general, a σ-algebra $\mathcal{F}\subseteq 2^\Omega$ corresponds to a finite or countable partition $\Omega=B_1\cup B_2\cup\dots$, the general form of an event $A\in\mathcal{F}$ being $A=B_{k_1}\cup B_{k_2}\cup\dots$. See also the examples.

The case $p(\omega)=0$ is permitted by the definition, but rarely used, since such $\omega$ can safely be excluded from the sample space.
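As a worked illustration of summing a probability mass function over an event, consider an assumed example that is not from the source: the countable sample space $\Omega=\{1,2,3,\dots\}$ with $p(k)=2^{-k}$ (for instance, the number of tosses of a fair coin until the first head). The mass function sums to one, and the event "the number is even" gets its probability from a geometric series:

```latex
\[
\sum_{k=1}^{\infty} p(k) = \sum_{k=1}^{\infty} 2^{-k} = 1,
\qquad
P(\{2,4,6,\dots\}) = \sum_{j=1}^{\infty} 2^{-2j}
  = \frac{1/4}{1 - 1/4} = \frac{1}{3}.
\]
```

By complementation the odd numbers then have probability $2/3$, which agrees with summing $2^{-(2j-1)}$ directly.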

General case

If $\Omega$ is uncountable, still, it may happen that $P(\omega)\neq 0$ for some $\omega$; such $\omega$ are called atoms. They are an at most countable (maybe empty) set, whose probability is the sum of probabilities of all atoms. If this sum is equal to 1 then all other points can safely be excluded from the sample space, returning us to the discrete case. Otherwise, if the sum of probabilities of all atoms is between 0 and 1, then the probability space decomposes into a discrete (atomic) part (maybe empty) and a non-atomic part.

Non-atomic case

If $P(\omega)=0$ for all $\omega\in\Omega$ (in this case, Ω must be uncountable, because otherwise $P(\Omega)=1$ could not be satisfied), then equation (⁎) of the discrete case, $P(A)=\sum_{\omega\in A}p(\omega)$, fails: the probability of a set is not necessarily the sum over the probabilities of its elements, as summation is only defined for countable numbers of elements. This makes the probability space theory much more technical. A formulation stronger than summation, measure theory, is applicable. Initially the probabilities are ascribed to some "generator" sets (see the examples). Then a limiting procedure allows assigning probabilities to sets that are limits of sequences of generator sets, or limits of limits, and so on. All these sets together form the σ-algebra $\mathcal{F}$. For technical details see Carathéodory's extension theorem. Sets belonging to $\mathcal{F}$ are called measurable. In general they are much more complicated than generator sets, but much better than non-measurable sets.

Complete probability space

A probability space $(\Omega, \mathcal{F}, P)$ is said to be a complete probability space if for all $B \in \mathcal{F}$ with $P(B) = 0$ and all $A \subset B$ one has $A \in \mathcal{F}$. Often, the study of probability spaces is restricted to complete probability spaces.

Examples

Discrete examples

Example 1

If the experiment consists of just one flip of a fair coin, then the outcome is either heads or tails: $\Omega = \{\text{H}, \text{T}\}$. The σ-algebra $\mathcal{F} = 2^\Omega$ contains $2^2 = 4$ events, namely: $\{\text{H}\}$ ("heads"), $\{\text{T}\}$ ("tails"), $\{\}$ ("neither heads nor tails"), and $\{\text{H},\text{T}\}$ ("either heads or tails"); in other words, $\mathcal{F} = \{\{\}, \{\text{H}\}, \{\text{T}\}, \{\text{H},\text{T}\}\}$. There is a fifty percent chance of tossing heads and fifty percent for tails, so the probability measure in this example is $P(\{\}) = 0$, $P(\{\text{H}\}) = 0.5$, $P(\{\text{T}\}) = 0.5$, $P(\{\text{H},\text{T}\}) = 1$.

Example 2

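The fair coin is tossed three times. There are 8 possible outcomes: $\Omega = \{\text{HHH}, \text{HHT}, \text{HTH}, \text{HTT}, \text{THH}, \text{THT}, \text{TTH}, \text{TTT}\}$ (here "HTH" for example means that the first time the coin landed heads, the second time tails, and the last time heads again). The complete information is described by the σ-algebra $\mathcal{F} = 2^\Omega$ of $2^8 = 256$ events, where each of the events is a subset of Ω.

Alice knows the outcome of the second toss only. Thus her incomplete information is described by the partition $\Omega = A_1 \sqcup A_2 = \{\text{HHH}, \text{HHT}, \text{THH}, \text{THT}\} \sqcup \{\text{HTH}, \text{HTT}, \text{TTH}, \text{TTT}\}$, where ⊔ is the disjoint union, and the corresponding σ-algebra $\mathcal{F}_\text{Alice} = \{\{\}, A_1, A_2, \Omega\}$.

Bryan knows only the total number of tails. His partition contains four parts: $\Omega = B_0 \sqcup B_1 \sqcup B_2 \sqcup B_3 = \{\text{HHH}\} \sqcup \{\text{HHT}, \text{HTH}, \text{THH}\} \sqcup \{\text{TTH}, \text{THT}, \text{HTT}\} \sqcup \{\text{TTT}\}$; accordingly, his σ-algebra $\mathcal{F}_\text{Bryan}$ contains $2^4 = 16$ events.

The two σ-algebras are incomparable: neither $\mathcal{F}_\text{Alice} \subseteq \mathcal{F}_\text{Bryan}$ nor $\mathcal{F}_\text{Bryan} \subseteq \mathcal{F}_\text{Alice}$; both are sub-σ-algebras of $2^\Omega$.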
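The correspondence between partitions and σ-algebras in this example can be reproduced with a short Python sketch. It assumes that "what an observer knows" is encoded as a function of the outcome, and that the σ-algebra consists of all unions of partition blocks. The helper names are hypothetical.

```python
from itertools import combinations, product

# All 8 outcomes of three coin tosses, e.g. "HTH".
omega = frozenset("".join(t) for t in product("HT", repeat=3))

def partition_by(key):
    """Group outcomes into blocks on which the observed value `key` is constant."""
    blocks = {}
    for w in omega:
        blocks.setdefault(key(w), set()).add(w)
    return [frozenset(b) for b in blocks.values()]

def sigma_algebra_from_partition(blocks):
    """All unions of blocks, including the empty union (the empty set)."""
    return {frozenset().union(*combo)
            for r in range(len(blocks) + 1)
            for combo in combinations(blocks, r)}

# Alice observes the second toss; Bryan observes the number of tails.
alice = sigma_algebra_from_partition(partition_by(lambda w: w[1]))
bryan = sigma_algebra_from_partition(partition_by(lambda w: w.count("T")))

print(len(alice), len(bryan))          # 4 16
print(alice <= bryan, bryan <= alice)  # False False
```

Neither family contains the other, which is the incomparability stated in the text.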

Example 3

If 100 voters are to be drawn randomly from among all voters in California and asked whom they will vote for governor, then the set of all sequences of 100 Californian voters would be the sample space Ω. We assume that sampling without replacement is used: only sequences of 100 different voters are allowed. For simplicity an ordered sample is considered, that is, a sequence (Alice, Bryan) is different from (Bryan, Alice). We also take for granted that each potential voter knows exactly their future choice, that is, they do not choose randomly.

Alice knows only whether or not Arnold Schwarzenegger has received at least 60 votes. Her incomplete information is described by the σ-algebra $\mathcal{F}_\text{Alice}$ that contains: (1) the set of all sequences in Ω where at least 60 people vote for Schwarzenegger; (2) the set of all sequences where fewer than 60 vote for Schwarzenegger; (3) the whole sample space Ω; and (4) the empty set ∅.

Bryan knows the exact number of voters who are going to vote for Schwarzenegger. His incomplete information is described by the corresponding partition $\Omega = B_0 \sqcup B_1 \sqcup \cdots \sqcup B_{100}$, and the σ-algebra $\mathcal{F}_\text{Bryan}$ consists of $2^{101}$ events.

In this case Alice's σ-algebra is a subset of Bryan's: $\mathcal{F}_\text{Alice} \subset \mathcal{F}_\text{Bryan}$. Bryan's σ-algebra is in turn a subset of the much larger "complete information" σ-algebra $2^\Omega$ consisting of $2^{n(n-1)\cdots(n-99)}$ events, where $n$ is the number of all potential voters in California.

Non-atomic examples

Example 4

A number between 0 and 1 is chosen at random, uniformly. Here Ω = [0,1], $\mathcal{F}$ is the σ-algebra of Borel sets on Ω, and $P$ is the Lebesgue measure on [0,1].

In this case the open intervals of the form $(a,b)$, where $0 < a < b < 1$, could be taken as the generator sets. Each such set can be ascribed the probability of $P((a,b)) = b - a$, which generates the Lebesgue measure on [0,1] and the Borel σ-algebra on Ω.

Example 5

A fair coin is tossed endlessly. Here one can take $\Omega = \{0,1\}^\infty$, the set of all infinite sequences of numbers 0 and 1. Cylinder sets $\{(x_1, x_2, \dots) \in \Omega : x_1 = a_1, \dots, x_n = a_n\}$ may be used as the generator sets. Each such set describes an event in which the first $n$ tosses have resulted in a fixed sequence $(a_1, \dots, a_n)$, and the rest of the sequence may be arbitrary. Each such event can be naturally given the probability of $2^{-n}$.

These two non-atomic examples are closely related: a sequence $(x_1, x_2, \dots) \in \{0,1\}^\infty$ leads to the number $2^{-1}x_1 + 2^{-2}x_2 + \cdots \in [0,1]$. This is not a one-to-one correspondence between $\{0,1\}^\infty$ and [0,1], however: it is an isomorphism modulo zero, which allows for treating the two probability spaces as two forms of the same probability space. In fact, all non-pathological non-atomic probability spaces are the same in this sense. They are so-called standard probability spaces. Basic applications of probability spaces are insensitive to standardness. However, non-discrete conditioning is easy and natural on standard probability spaces; otherwise it becomes obscure.
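The link between the two non-atomic examples can be made concrete with a small Python sketch. Under the binary-expansion map, the cylinder set fixed by the first $n$ tosses corresponds to a dyadic interval of length $2^{-n}$, which matches the probability given to that cylinder set. The function name is a hypothetical helper.

```python
from fractions import Fraction

def cylinder_to_interval(prefix):
    """Map the cylinder set with first tosses `prefix` (a tuple of 0s and 1s)
    to the endpoints of the interval of numbers 2^-1 x_1 + 2^-2 x_2 + ...
    whose expansion starts with that prefix."""
    n = len(prefix)
    left = sum(Fraction(x, 2 ** (i + 1)) for i, x in enumerate(prefix))
    return left, left + Fraction(1, 2 ** n)

left, right = cylinder_to_interval((1, 0, 1))
print(left, right, right - left)  # 5/8 3/4 1/8
```

The interval length $1/8 = 2^{-3}$ is the Lebesgue measure of the image and equals the probability of the cylinder set with three fixed tosses.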

Related concepts

Probability distribution

Any probability distribution defines a probability measure.

Random variables

A random variable $X$ is a measurable function $X\colon \Omega \to S$ from the sample space Ω to another measurable space $S$ called the state space.

If $A \subset S$, the notation $\Pr(X \in A)$ is a commonly used shorthand for $\Pr(\{\omega \in \Omega : X(\omega) \in A\})$.
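The shorthand can be unpacked in a few lines of Python. This sketch assumes a specific example not given in the text: two fair dice, with $X$ the sum of the pips. The probability that $X$ lies in a set $A$ is computed as the probability of the preimage of $A$ in the sample space.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs of die faces, all equally likely.
omega = list(product(range(1, 7), repeat=2))

def P(event):
    """Uniform probability measure on the 36 outcomes."""
    return Fraction(len(event), len(omega))

def X(w):
    """Random variable: total number of pips."""
    return w[0] + w[1]

def prob_X_in(A):
    """Pr(X in A), computed as P of the preimage {w : X(w) in A}."""
    preimage = {w for w in omega if X(w) in A}
    return P(preimage)

print(prob_X_in({7, 11}))  # 2/9
```

Six outcomes sum to 7 and two sum to 11, giving $8/36 = 2/9$.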

Defining the events in terms of the sample space

If Ω is countable, we almost always define $\mathcal{F}$ as the power set of Ω, i.e. $\mathcal{F} = 2^\Omega$, which is trivially a σ-algebra and the biggest one we can create using Ω. We can therefore omit $\mathcal{F}$ and just write $(\Omega, P)$ to define the probability space.

On the other hand, if Ω is uncountable and we use $\mathcal{F} = 2^\Omega$, we get into trouble defining our probability measure $P$ because $\mathcal{F}$ is too "large", i.e. there will often be sets to which it will be impossible to assign a unique measure. In this case, we have to use a smaller σ-algebra $\mathcal{F}$, for example the Borel algebra of Ω, which is the smallest σ-algebra that makes all open sets measurable.

Conditional probability

Kolmogorov's definition of probability spaces gives rise to the natural concept of conditional probability. Every set $A$ with non-zero probability (that is, $P(A) > 0$) defines another probability measure $P(B \mid A) = \frac{P(B \cap A)}{P(A)}$ on the space. This is usually pronounced as the "probability of $B$ given $A$".

For any event $A$ such that $P(A) > 0$, the function $Q$ defined by $Q(B) = P(B \mid A)$ for all events $B$ is itself a probability measure.
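The claim that conditioning yields a new probability measure can be spot-checked numerically. The Python sketch below assumes a fair die and conditions on the event "even"; `conditional` is a hypothetical helper that returns the conditioned measure as a function.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))

def P(event):
    """Uniform probability measure for a fair die."""
    return Fraction(len(frozenset(event) & omega), len(omega))

def conditional(A):
    """Return Q with Q(B) = P(B and A) / P(A); requires P(A) > 0."""
    A = frozenset(A)
    if P(A) == 0:
        raise ValueError("cannot condition on an event of probability zero")
    def Q(B):
        return P(frozenset(B) & A) / P(A)
    return Q

Q = conditional({2, 4, 6})
print(Q(omega))                          # 1
print(Q({1, 2}))                         # 1/3
print(Q({2}) + Q({4, 6}) == Q({2, 4, 6}))  # True
```

The checks cover total mass one and additivity on one disjoint pair; they illustrate the statement rather than prove it.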

Independence

Two events, $A$ and $B$, are said to be independent if $P(A \cap B) = P(A)P(B)$.

Two random variables, $X$ and $Y$, are said to be independent if any event defined in terms of $X$ is independent of any event defined in terms of $Y$. Formally, they generate independent σ-algebras, where two σ-algebras $\mathcal{G}$ and $\mathcal{H}$, which are subsets of $\mathcal{F}$, are said to be independent if any element of $\mathcal{G}$ is independent of any element of $\mathcal{H}$.
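The product rule for independent events can be tested directly on a finite space. This Python sketch assumes two fair dice; the particular events are illustrative choices, not taken from the text.

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))

def P(event):
    """Uniform probability measure on the 36 outcomes."""
    return Fraction(len(event), len(omega))

def independent(A, B):
    """True when P(A and B) equals P(A) * P(B)."""
    return P(A & B) == P(A) * P(B)

first_even = {w for w in omega if w[0] % 2 == 0}
sum_is_7 = {w for w in omega if sum(w) == 7}
sum_is_8 = {w for w in omega if sum(w) == 8}

print(independent(first_even, sum_is_7))  # True
print(independent(first_even, sum_is_8))  # False
```

For a sum of 7 the product rule holds ($1/12 = 1/2 \cdot 1/6$), while for a sum of 8 it fails ($1/12 \neq 1/2 \cdot 5/36$).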

Mutual exclusivity

Two events, $A$ and $B$, are said to be mutually exclusive or disjoint if the occurrence of one implies the non-occurrence of the other, i.e., their intersection is empty. This is a stronger condition than the probability of their intersection being zero.

If $A$ and $B$ are disjoint events, then $P(A \cup B) = P(A) + P(B)$. This extends to a (finite or countably infinite) sequence of events. However, the probability of the union of an uncountable set of events is not the sum of their probabilities. For example, if $Z$ is a normally distributed random variable, then $P(Z = x)$ is 0 for any $x$, but $P(Z \in \mathbb{R}) = 1$.

The event $A \cap B$ is referred to as "$A$ and $B$", and the event $A \cup B$ as "$A$ or $B$".

References

Bibliography

* Pierre Simon de Laplace (1812) ''Analytical Theory of Probability''
:: The first major treatise blending calculus with probability theory, originally in French: ''Théorie Analytique des Probabilités''.
* Andrei Nikolajevich Kolmogorov (1950) ''Foundations of the Theory of Probability''
:: The modern measure-theoretic foundation of probability theory; the original German version (''Grundbegriffe der Wahrscheinlichkeitsrechnung'') appeared in 1933.
* Harold Jeffreys (1939) ''The Theory of Probability''
:: An empiricist, Bayesian approach to the foundations of probability theory.
* Edward Nelson (1987) ''Radically Elementary Probability Theory''
:: Foundations of probability theory based on nonstandard analysis. Downloadable. http://www.math.princeton.edu/~nelson/books.html
* Patrick Billingsley: ''Probability and Measure'', John Wiley and Sons, New York, Toronto, London, 1979.
* Henk Tijms (2004) ''Understanding Probability''
:: A lively introduction to probability theory for the beginner, Cambridge Univ. Press.
* David Williams (1991) ''Probability with martingales''
:: An undergraduate introduction to measure-theoretic probability, Cambridge Univ. Press.

External links

* Animation demonstrating probability space of dice
* Virtual Laboratories in Probability and Statistics (principal author Kyle Siegrist), especially Probability Spaces
