In probability theory, the Borel–Kolmogorov paradox (sometimes known as Borel's paradox) is a paradox relating to conditional probability with respect to an event of probability zero (also known as a null set). It is named after Émile Borel and Andrey Kolmogorov.


A great circle puzzle

Suppose that a random variable has a uniform distribution on a unit sphere. What is its conditional distribution on a great circle? Because of the symmetry of the sphere, one might expect that the distribution is uniform and independent of the choice of coordinates. However, two analyses give contradictory results. First, note that choosing a point uniformly on the sphere is equivalent to choosing the longitude <math>\lambda</math> uniformly from <math>[-\pi,\pi]</math> and choosing the latitude <math>\varphi</math> from <math>\left[-\frac{\pi}{2},\frac{\pi}{2}\right]</math> with density <math>\frac{1}{2} \cos \varphi</math>. Then we can look at two different great circles:
# If the coordinates are chosen so that the great circle is an equator (latitude <math>\varphi = 0</math>), the conditional density for a longitude <math>\lambda</math> defined on the interval <math>[-\pi,\pi]</math> is <math>f(\lambda\mid\varphi=0) = \frac{1}{2\pi}.</math>
# If the great circle is a line of longitude with <math>\lambda = 0</math>, the conditional density for <math>\varphi</math> on the interval <math>\left[-\frac{\pi}{2},\frac{\pi}{2}\right]</math> is <math>f(\varphi\mid\lambda=0) = \frac{1}{2} \cos \varphi.</math>
One distribution is uniform on the circle, the other is not. Yet both seem to be referring to the same great circle in different coordinate systems.
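The two densities can be checked empirically. The following is a minimal Monte Carlo sketch, not part of the original analysis. It assumes that "conditioning on the equator" is approximated by a thin band of latitudes |φ| < ε and that "conditioning on the meridian" is approximated by a thin wedge of longitudes |λ| < ε. The function names, sample size, seed, and ε = 0.05 are illustrative choices.

```python
import math
import random


def sample_lat_lon(rng):
    """Draw (latitude, longitude) of a point uniform on the unit sphere."""
    # Normalising a 3D standard Gaussian vector gives a uniform direction.
    x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    norm = math.sqrt(x * x + y * y + z * z)
    lat = math.asin(max(-1.0, min(1.0, z / norm)))  # clamp against rounding
    lon = math.atan2(y, x)
    return lat, lon


def conditional_fractions(n=400_000, eps=0.05, seed=0):
    """Estimate the mass on the middle third of each arc under both conditionings."""
    rng = random.Random(seed)
    ring_hits = ring_total = lune_hits = lune_total = 0
    for _ in range(n):
        lat, lon = sample_lat_lon(rng)
        if abs(lat) < eps:  # thin band around the equator
            ring_total += 1
            ring_hits += abs(lon) < math.pi / 3  # middle third of [-pi, pi]
        if abs(lon) < eps:  # thin wedge around the meridian lon = 0
            lune_total += 1
            lune_hits += abs(lat) < math.pi / 6  # middle third of [-pi/2, pi/2]
    return ring_hits / ring_total, lune_hits / lune_total


ring_fraction, lune_fraction = conditional_fractions()
print(ring_fraction, lune_fraction)
```

Under these assumptions the first fraction should be close to 1/3, as a uniform density on the equator predicts, while the second should be close to sin(π/6) = 1/2, as the density (1/2) cos φ predicts, even though both intervals cover one third of their arc.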


Explanation and implications

In case (1) above, the conditional probability that the longitude ''λ'' lies in a set ''E'' given that ''φ'' = 0 can be written ''P''(''λ'' ∈ ''E'' | ''φ'' = 0). Elementary probability theory suggests this can be computed as ''P''(''λ'' ∈ ''E'' and ''φ'' = 0)/''P''(''φ'' = 0), but that expression is not well-defined since ''P''(''φ'' = 0) = 0. Measure theory provides a way to define a conditional probability, using the family of events ''R''<sub>''ab''</sub> = {''φ'' : ''a'' < ''φ'' < ''b''}, which are horizontal rings consisting of all points with latitude between ''a'' and ''b''.

The resolution of the paradox is to notice that in case (2), ''P''(''φ'' ∈ ''F'' | ''λ'' = 0) is defined using the events ''L''<sub>''ab''</sub> = {''λ'' : ''a'' < ''λ'' < ''b''}, which are lunes (vertical wedges), consisting of all points whose longitude varies between ''a'' and ''b''. A ring around the equator has the same width everywhere, whereas a lune is widest at the equator and narrows to a point at each pole, so a limit taken along lunes gives more weight to points near the equator. So although ''P''(''λ'' ∈ ''E'' | ''φ'' = 0) and ''P''(''φ'' ∈ ''F'' | ''λ'' = 0) each provide a probability distribution on a great circle, one of them is defined using rings, and the other using lunes. Thus it is not surprising after all that ''P''(''λ'' ∈ ''E'' | ''φ'' = 0) and ''P''(''φ'' ∈ ''F'' | ''λ'' = 0) are different distributions.
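The two limits can be written out explicitly. The following is a sketch under the assumption that the conditioning events are the symmetric ring <math>R_{-\varepsilon,\varepsilon}</math> and the symmetric lune <math>L_{-\varepsilon,\varepsilon}</math>, and that the point is uniform on the sphere, so that longitude has density <math>\frac{1}{2\pi}</math>, latitude has density <math>\frac{1}{2}\cos\varphi</math>, and the two are independent. Here <math>|E|</math> denotes the length of <math>E \subseteq [-\pi,\pi]</math>, a symbol introduced only for this illustration.

```latex
\begin{align}
P(\lambda \in E \mid \varphi \in R_{-\varepsilon,\varepsilon})
  &= \frac{\int_E \frac{1}{2\pi}\,d\lambda \,
           \int_{-\varepsilon}^{\varepsilon} \frac{1}{2}\cos\varphi \,d\varphi}
          {\int_{-\varepsilon}^{\varepsilon} \frac{1}{2}\cos\varphi \,d\varphi}
   = \frac{|E|}{2\pi}, \\
P(\varphi \in F \mid \lambda \in L_{-\varepsilon,\varepsilon})
  &= \frac{\int_F \frac{1}{2}\cos\varphi \,d\varphi \,
           \int_{-\varepsilon}^{\varepsilon} \frac{1}{2\pi}\,d\lambda}
          {\int_{-\varepsilon}^{\varepsilon} \frac{1}{2\pi}\,d\lambda}
   = \int_F \frac{1}{2}\cos\varphi \,d\varphi .
\end{align}
```

Neither right-hand side depends on <math>\varepsilon</math>, so letting <math>\varepsilon \to 0</math> gives the uniform density along rings and the density <math>\frac{1}{2}\cos\varphi</math> along lunes.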


Mathematical explication


Measure theoretic perspective

To understand the problem we need to recognize that a distribution on a continuous random variable is described by a density ''f'' only with respect to some measure ''μ''. Both are important for the full description of the probability distribution. Or, equivalently, we need to fully define the space on which we want to define ''f''.

Let <math>\Phi</math> and <math>\Lambda</math> denote two random variables taking values in <math>\Omega_1 = \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]</math> and <math>\Omega_2 = [-\pi, \pi]</math>, respectively. An event <math>\{\Phi = \varphi, \Lambda = \lambda\}</math> gives a point on the sphere <math>S(r)</math> with radius <math>r</math>. We define the coordinate transform

:<math>\begin{align} x &= r \cos \varphi \cos \lambda \\ y &= r \cos \varphi \sin \lambda \\ z &= r \sin \varphi \end{align}</math>

for which we obtain the volume element

:<math>\omega_r(\varphi,\lambda) = \left\| \frac{\partial (x,y,z)}{\partial \varphi} \times \frac{\partial (x,y,z)}{\partial \lambda} \right\| = r^2 \cos \varphi \ .</math>

Furthermore, if either ''φ'' or ''λ'' is fixed, we get the volume elements

:<math>\begin{align} \omega_r(\lambda) &= \left\| \frac{\partial (x,y,z)}{\partial \varphi} \right\| = r \ , \quad\text{respectively} \\[3pt] \omega_r(\varphi) &= \left\| \frac{\partial (x,y,z)}{\partial \lambda} \right\| = r \cos \varphi\ . \end{align}</math>

Let

:<math>\mu_{\Phi,\Lambda}(d\varphi, d\lambda) = f_{\Phi,\Lambda}(\varphi,\lambda) \omega_r(\varphi,\lambda) \, d\varphi \, d\lambda</math>

denote the joint measure on <math>\mathcal{B}(\Omega_1 \times \Omega_2)</math>, which has a density <math>f_{\Phi,\Lambda}</math> with respect to <math>\omega_r(\varphi,\lambda) \, d\varphi \, d\lambda</math>, and let

:<math>\begin{align} \mu_\Phi(d\varphi) &= \int_{\lambda \in \Omega_2} \mu_{\Phi,\Lambda}(d\varphi, d\lambda)\ ,\\ \mu_\Lambda (d\lambda) &= \int_{\varphi \in \Omega_1} \mu_{\Phi,\Lambda}(d\varphi, d\lambda)\ . \end{align}</math>

If we assume that the density <math>f_{\Phi,\Lambda}</math> is uniform, then

:<math>\begin{align} \mu_{\Phi \mid \Lambda}(d\varphi \mid \lambda) &= \frac{\mu_{\Phi,\Lambda}(d\varphi, d\lambda)}{\mu_\Lambda(d\lambda)} = \frac{1}{2r} \omega_r(\varphi) \, d\varphi \ , \quad\text{and} \\[3pt] \mu_{\Lambda \mid \Phi}(d\lambda \mid \varphi) &= \frac{\mu_{\Phi,\Lambda}(d\varphi, d\lambda)}{\mu_\Phi(d\varphi)} = \frac{1}{2r\pi} \omega_r(\lambda) \, d\lambda \ . \end{align}</math>

Hence, <math>\mu_{\Phi \mid \Lambda}</math> has a uniform density with respect to <math>\omega_r(\varphi) \, d\varphi</math> but not with respect to the Lebesgue measure. On the other hand, <math>\mu_{\Lambda \mid \Phi}</math> has a uniform density with respect to <math>\omega_r(\lambda) \, d\lambda</math> and the Lebesgue measure.
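The volume elements above can be checked numerically. The following is a minimal sketch that approximates the partial derivatives of the coordinate transform by central differences. The function names, the step size h, and the test point (r, φ, λ) = (2.0, 0.7, -1.3) are illustrative assumptions, not part of the original text.

```python
import math


def point(r, phi, lam):
    """Coordinate transform: (latitude phi, longitude lam) -> (x, y, z)."""
    return (r * math.cos(phi) * math.cos(lam),
            r * math.cos(phi) * math.sin(lam),
            r * math.sin(phi))


def partials(r, phi, lam, h=1e-6):
    """Central-difference approximations of d(x,y,z)/dphi and d(x,y,z)/dlam."""
    d_phi = [(a - b) / (2 * h)
             for a, b in zip(point(r, phi + h, lam), point(r, phi - h, lam))]
    d_lam = [(a - b) / (2 * h)
             for a, b in zip(point(r, phi, lam + h), point(r, phi, lam - h))]
    return d_phi, d_lam


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


r, phi, lam = 2.0, 0.7, -1.3
d_phi, d_lam = partials(r, phi, lam)

omega_joint = norm(cross(d_phi, d_lam))  # text: omega_r(phi, lam) = r^2 cos(phi)
omega_lam = norm(d_phi)                  # text: omega_r(lam) = r
omega_phi = norm(d_lam)                  # text: omega_r(phi) = r cos(phi)

# Conditional densities with respect to Lebesgue measure, using the
# constants 1/(2r) and 1/(2 r pi) from the formulas in the text.
density_phi_given_lam = omega_phi / (2 * r)            # cos(phi) / 2
density_lam_given_phi = omega_lam / (2 * r * math.pi)  # 1 / (2 pi)

print(omega_joint, omega_lam, omega_phi)
print(density_phi_given_lam, density_lam_given_phi)
```

The radius cancels in both conditional densities, which leaves (1/2) cos φ for the latitude and 1/(2π) for the longitude, the two densities of the great circle puzzle.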


Proof of contradiction

Consider a random vector <math>(X,Y,Z)</math> that is uniformly distributed on the unit sphere <math>S^2</math>. We begin by parametrizing the sphere with the usual spherical polar coordinates:

:<math>\begin{align} x &= \cos(\varphi) \cos (\theta) \\ y &= \cos(\varphi) \sin (\theta) \\ z &= \sin(\varphi) \end{align}</math>

where <math>-\frac{\pi}{2} \le \varphi \le \frac{\pi}{2}</math> and <math>-\pi \le \theta \le \pi</math>. We can define random variables <math>\Phi</math>, <math>\Theta</math> as the values of <math>(X, Y, Z)</math> under the inverse of this parametrization, or more formally using the arctan2 function:

:<math>\begin{align} \Phi &= \arcsin(Z) \\ \Theta &= \arctan_2\left(\frac{Y}{\sqrt{1 - Z^2}}, \frac{X}{\sqrt{1 - Z^2}}\right) \end{align}</math>

Using the formulas for the surface area of a spherical cap and of a spherical wedge, the surface area of a spherical cap wedge is given by

:<math>\operatorname{Area}(\Theta \le \theta, \Phi \le \varphi) = (1 + \sin(\varphi)) (\theta + \pi)</math>

Since <math>(X,Y,Z)</math> is uniformly distributed, the probability is proportional to the surface area, giving the joint cumulative distribution function

:<math>F_{\Phi, \Theta}(\varphi, \theta) = P(\Theta \le \theta, \Phi \le \varphi) = \frac{1}{4\pi}(1 + \sin(\varphi)) (\theta + \pi)</math>

The joint probability density function is then given by

:<math>f_{\Phi, \Theta}(\varphi, \theta) = \frac{\partial^2}{\partial \varphi \, \partial \theta} F_{\Phi, \Theta}(\varphi, \theta) = \frac{1}{4\pi} \cos(\varphi)</math>

Note that <math>\Phi</math> and <math>\Theta</math> are independent random variables.

For simplicity, we won't calculate the full conditional distribution on a great circle, only the probability that the random vector lies in the first octant of the great circle (longitude between 0 and <math>\frac{\pi}{4}</math>). That is to say, we will attempt to calculate the conditional probability <math>\mathbb{P}(A \mid B)</math> with

:<math>\begin{align} A &= \left\{ 0 < \Theta < \frac{\pi}{4} \right\} &&= \{ 0 < X < 1,\ 0 < Y < X \} \\ B &= \{ \Phi = 0 \} &&= \{ Z = 0 \} \end{align}</math>

We attempt to evaluate the conditional probability as a limit of conditioning on the events

:<math>B_\varepsilon = \{ | \Phi | < \varepsilon \}</math>

As <math>\Phi</math> and <math>\Theta</math> are independent, so are the events <math>A</math> and <math>B_\varepsilon</math>, therefore

:<math>P(A \mid B) \mathrel{\stackrel{?}{=}} \lim_{\varepsilon \to 0} \frac{P(A \cap B_\varepsilon)}{P(B_\varepsilon)} = \lim_{\varepsilon \to 0} P(A) = P \left(0 < \Theta < \frac{\pi}{4}\right) = \frac{1}{8}.</math>

Now we repeat the process with a different parametrization of the sphere:

:<math>\begin{align} x &= \sin(\varphi) \\ y &= \cos(\varphi) \sin(\theta) \\ z &= -\cos(\varphi) \cos(\theta) \end{align}</math>

This is equivalent to the previous parametrization rotated by 90 degrees around the ''y'' axis. Define new random variables

:<math>\begin{align} \Phi' &= \arcsin(X) \\ \Theta' &= \arctan_2\left(\frac{Y}{\sqrt{1 - X^2}}, \frac{-Z}{\sqrt{1 - X^2}}\right). \end{align}</math>

Rotation is measure preserving, so the density of <math>\Phi'</math> and <math>\Theta'</math> is the same:

:<math>f_{\Phi', \Theta'}(\varphi, \theta) = \frac{1}{4\pi} \cos(\varphi) .</math>

The expressions for <math>A</math> and <math>B</math> are:

:<math>\begin{align} A &= \left\{ 0 < \Theta < \frac{\pi}{4} \right\} &&= \{ 0 < X < 1,\ 0 < Y < X \} &&= \left\{ 0 < \Theta' < \pi,\ 0 < \Phi' < \frac{\pi}{2},\ \sin(\Theta') < \tan(\Phi') \right\} \\ B &= \{ \Phi = 0 \} &&= \{ Z = 0 \} &&= \left\{ \Theta' = -\frac{\pi}{2} \right\} \cup \left\{ \Theta' = \frac{\pi}{2} \right\}. \end{align}</math>

We again attempt to evaluate the conditional probability as a limit of conditioning on the events

:<math>B^\prime_\varepsilon = \left\{ \left| \Theta' + \frac{\pi}{2} \right| < \varepsilon \right\} \cup \left\{ \left| \Theta' - \frac{\pi}{2} \right| < \varepsilon \right\}.</math>

Only the second band meets <math>A</math>, and <math>P(B^\prime_\varepsilon) = \frac{2\varepsilon}{\pi}</math>. Using L'Hôpital's rule and differentiation under the integral sign:

:<math>\begin{align} P(A \mid B) &\mathrel{\stackrel{?}{=}} \lim_{\varepsilon \to 0} \frac{P(A \cap B^\prime_\varepsilon)}{P(B^\prime_\varepsilon)}\\ &= \lim_{\varepsilon \to 0} \frac{\pi}{2\varepsilon} P\left( \frac{\pi}{2} - \varepsilon < \Theta' < \frac{\pi}{2} + \varepsilon,\ 0 < \Phi' < \frac{\pi}{2},\ \sin(\Theta') < \tan(\Phi') \right)\\ &= \frac{\pi}{2} \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \int_{\pi/2 - \varepsilon}^{\pi/2 + \varepsilon} \int_0^{\pi/2} 1_{\sin(\theta) < \tan(\varphi)} f_{\Phi', \Theta'}(\varphi, \theta) \,\mathrm{d}\varphi \,\mathrm{d}\theta \\ &= \pi \int_0^{\pi/2} 1_{1 < \tan(\varphi)} f_{\Phi', \Theta'}\left(\varphi, \frac{\pi}{2}\right) \mathrm{d}\varphi \\ &= \pi \int_{\pi/4}^{\pi/2} \frac{1}{4\pi} \cos(\varphi) \,\mathrm{d}\varphi \\ &= \frac{1}{4} \left( 1 - \frac{1}{\sqrt{2}} \right) \neq \frac{1}{8} \end{align}</math>

The two limits disagree, so the value obtained depends on the family of events used to approach <math>B</math>. This shows that the conditional density cannot be treated as conditioning on an event of probability zero, as explained in Conditional probability § Conditioning on an event of probability zero.
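The two limits can also be approximated numerically for a small but positive ε. The following is a sketch under stated assumptions: it integrates the joint density cos(φ)/(4π) with a plain midpoint rule on a fixed grid, takes ε = 0.01, and uses hypothetical helper names (`integrate`, `first_coordinates`, `rotated_coordinates`) that do not come from the original text.

```python
import math


def density(phi, theta):
    """Joint density cos(phi) / (4 pi) of the two angles on the unit sphere."""
    return math.cos(phi) / (4.0 * math.pi)


def integrate(indicator, phi_range, theta_range, n_phi=2000, n_theta=80):
    """Midpoint-rule integral of density * indicator over a (phi, theta) rectangle."""
    (p0, p1), (t0, t1) = phi_range, theta_range
    dp, dt = (p1 - p0) / n_phi, (t1 - t0) / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = p0 + (i + 0.5) * dp
        for j in range(n_theta):
            theta = t0 + (j + 0.5) * dt
            if indicator(phi, theta):
                total += density(phi, theta)
    return total * dp * dt


def always(phi, theta):
    return True


def first_coordinates(eps):
    """P(A | B_eps) with B_eps = {|Phi| < eps} and A = {0 < Theta < pi/4}."""
    def in_a(phi, theta):
        return 0.0 < theta < math.pi / 4
    band = ((-eps, eps), (-math.pi, math.pi))
    return integrate(in_a, *band) / integrate(always, *band)


def rotated_coordinates(eps):
    """P(A | B'_eps) with B'_eps the two bands |Theta' -/+ pi/2| < eps."""
    def in_a(phi, theta):
        return 0.0 < theta < math.pi and math.sin(theta) < math.tan(phi)
    half_pi = math.pi / 2
    upper = (half_pi - eps, half_pi + eps)    # band around Theta' = +pi/2
    lower = (-half_pi - eps, -half_pi + eps)  # band around Theta' = -pi/2
    # A requires 0 < Phi' < pi/2 and 0 < Theta' < pi, so only the upper band
    # and that latitude range contribute to the numerator.
    numerator = integrate(in_a, (0.0, half_pi), upper)
    denominator = (integrate(always, (-half_pi, half_pi), upper)
                   + integrate(always, (-half_pi, half_pi), lower))
    return numerator / denominator


eps = 0.01
p_first = first_coordinates(eps)
p_rotated = rotated_coordinates(eps)
print(p_first, p_rotated)
```

With these settings the first ratio should agree with 1/8 and the second with (1/4)(1 - 1/√2), which is about 0.0732, up to discretization error. Both are approximations of the same nominal quantity P(A | B), taken along different families of shrinking events.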

