Convergence Of Measures

In mathematics, more specifically measure theory, there are various notions of the convergence of measures. For an intuitive general sense of what is meant by ''convergence of measures'', consider a sequence of measures \mu_n on a space, sharing a common collection of measurable sets. Such a sequence might represent an attempt to construct 'better and better' approximations to a desired measure \mu that is difficult to obtain directly. The meaning of 'better and better' is subject to all the usual caveats for taking limits: for any error tolerance \varepsilon > 0 we require that there be an N sufficiently large that, for all n \geq N, the 'difference' between \mu_n and \mu is smaller than \varepsilon. Various notions of convergence specify precisely what the word 'difference' should mean in that description; these notions are not equivalent to one another, and vary in strength. Three of the most common notions of convergence are described below.


Informal descriptions

This section attempts to provide a rough intuitive description of three notions of convergence, using terminology developed in calculus courses; it is necessarily imprecise, and the reader should refer to the formal definitions in subsequent sections. In particular, the descriptions here do not address the possibility that the measure of some sets could be infinite, or that the underlying space could exhibit pathological behavior, and additional technical assumptions are needed for some of the statements. The statements in this section are however all correct if \mu_n is a sequence of probability measures on a Polish space.

The various notions of convergence formalize the assertion that the 'average value' of each 'sufficiently nice' function should converge:
: \int f\, d\mu_n \to \int f\, d\mu
To formalize this requires a careful specification of the set of functions under consideration and how uniform the convergence should be.

The notion of ''weak convergence'' requires this convergence to take place for every continuous bounded function f. This notion treats convergence for different functions f independently of one another, i.e., different functions f may require different thresholds N to be approximated equally well for all n \geq N (thus, convergence is non-uniform in f).

The notion of ''setwise convergence'' formalizes the assertion that the measure of each measurable set should converge:
: \mu_n(A) \to \mu(A)
Again, no uniformity over the set A is required. Intuitively, considering integrals of 'nice' functions, this notion provides more uniformity than weak convergence. As a matter of fact, when considering sequences of measures with uniformly bounded variation on a Polish space, setwise convergence implies the convergence \int f\, d\mu_n \to \int f\, d\mu for any bounded measurable function f. As before, this convergence is non-uniform in f.

The notion of ''total variation convergence'' formalizes the assertion that the measure of all measurable sets should converge ''uniformly'', i.e. for every \varepsilon > 0 there exists N such that |\mu_n(A) - \mu(A)| < \varepsilon for every n > N and for every measurable set A. As before, this implies convergence of integrals against bounded measurable functions, but this time convergence is uniform over all functions bounded by any fixed constant.


Total variation convergence of measures

This is the strongest notion of convergence shown on this page and is defined as follows. Let (X, \mathcal{F}) be a measurable space. The total variation distance between two (positive) measures \mu and \nu is then given by
: \left\| \mu - \nu \right\|_\text{TV} = \sup_f \left\{ \int_X f \, d\mu - \int_X f \, d\nu \right\}.
Here the supremum is taken over f ranging over the set of all measurable functions from X to [-1, 1]. This is in contrast, for example, to the Wasserstein metric, where the definition is of the same form, but the supremum is taken over f ranging over the set of those measurable functions from X to [-1, 1] which have Lipschitz constant at most 1; and also in contrast to the Radon metric, where the supremum is taken over f ranging over the set of continuous functions from X to [-1, 1]. In the case where X is a Polish space, the total variation metric coincides with the Radon metric.

If \mu and \nu are both probability measures, then the total variation distance is also given by
: \left\| \mu - \nu \right\|_\text{TV} = 2 \cdot \sup_{A \in \mathcal{F}} | \mu(A) - \nu(A) |.
The equivalence between these two definitions can be seen as a particular case of the Monge–Kantorovich duality. From the two definitions above, it is clear that the total variation distance between probability measures is always between 0 and 2.

To illustrate the meaning of the total variation distance, consider the following thought experiment. Assume that we are given two probability measures \mu and \nu, as well as a random variable X. We know that X has law either \mu or \nu but we do not know which one of the two. Assume that these two measures have prior probabilities 0.5 each of being the true law of X. Assume now that we are given ''one'' single sample distributed according to the law of X and that we are then asked to guess which one of the two distributions describes that law. The quantity
: \frac{2 + \| \mu - \nu \|_\text{TV}}{4}
then provides a sharp upper bound on the prior probability that our guess will be correct.

Given the above definition of total variation distance, a sequence \mu_n of measures defined on the same measure space is said to converge to a measure \mu in total variation distance if for every \varepsilon > 0, there exists an N such that for all n > N, one has that
: \| \mu_n - \mu \|_\text{TV} < \varepsilon.
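As a small numerical illustration of the two formulas for the total variation distance, the following Python sketch works on a finite three-point space, where every subset is measurable and both suprema can be computed exactly. The function names and the example distributions are hypothetical choices made for this illustration. The sketch also checks that the success probability of the natural guessing rule under equal priors (pick whichever measure gives the observed point more mass) equals (2 + TV) / 4.

```python
from itertools import chain, combinations

def tv_via_functions(mu, nu):
    """sup over f: X -> [-1, 1] of (integral of f dmu - integral of f dnu).

    On a finite space the supremum is attained at f = sign(mu - nu),
    which gives the sum of absolute differences of the point masses.
    """
    return sum(abs(mu[x] - nu[x]) for x in mu)

def tv_via_sets(mu, nu):
    """2 * sup over all subsets A of |mu(A) - nu(A)|, by brute force."""
    points = list(mu)
    subsets = chain.from_iterable(
        combinations(points, r) for r in range(len(points) + 1)
    )
    return 2 * max(abs(sum(mu[x] - nu[x] for x in A)) for A in subsets)

def best_guess_probability(mu, nu):
    """Probability of a correct guess from one sample, with priors 0.5 each,
    when guessing the measure that assigns the observed point more mass."""
    return sum(0.5 * max(mu[x], nu[x]) for x in mu)

# Illustrative distributions on the same three-point space (assumed data).
mu = {"a": 0.5, "b": 0.3, "c": 0.2}
nu = {"a": 0.2, "b": 0.3, "c": 0.5}

tv = tv_via_functions(mu, nu)
print(tv, tv_via_sets(mu, nu))                       # the two formulas agree
print(best_guess_probability(mu, nu), (2 + tv) / 4)  # guessing bound is attained
```

The brute-force search over subsets is exponential in the number of points and is only meant to mirror the definition; the sum of absolute differences is the practical formula on a finite space.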


Setwise convergence of measures

For (X, \mathcal{F}) a measurable space, a sequence \mu_n is said to converge setwise to a limit \mu if
: \lim_{n \to \infty} \mu_n(A) = \mu(A)
for every set A \in \mathcal{F}. Typical arrow notations are \mu_n \xrightarrow{sw} \mu and \mu_n \xrightarrow{s} \mu.

For example, as a consequence of the Riemann–Lebesgue lemma, the sequence \mu_n of measures on the interval [-1, 1] given by \mu_n(dx) = (1 + \sin(nx))\,dx converges setwise to Lebesgue measure, but it does not converge in total variation.

In a measure-theoretic or probabilistic context, setwise convergence is often referred to as strong convergence (as opposed to weak convergence). This can lead to some ambiguity because in functional analysis, strong convergence usually refers to convergence with respect to a norm.
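The example of the measures with density 1 + sin(nx) on [-1, 1] can be checked numerically. The sketch below is only an illustration under stated assumptions: it tests setwise convergence on a single interval [a, b] (chosen arbitrarily) using the closed-form integral, and it estimates the total variation distance to Lebesgue measure, which equals the integral of |sin(nx)| over [-1, 1], with a simple midpoint rule. The function names are hypothetical.

```python
import math

def mu_n_interval(n, a, b):
    """mu_n([a, b]) for mu_n(dx) = (1 + sin(n x)) dx, in closed form."""
    return (b - a) + (math.cos(n * a) - math.cos(n * b)) / n

def tv_to_lebesgue(n, steps=200_000):
    """Midpoint-rule estimate of the integral of |sin(n x)| over [-1, 1].

    This is the total variation distance between mu_n and Lebesgue measure,
    using the convention sup over f with values in [-1, 1].
    """
    h = 2.0 / steps
    return h * sum(
        abs(math.sin(n * (-1.0 + (k + 0.5) * h))) for k in range(steps)
    )

a, b = -0.25, 0.5  # an arbitrary test interval inside [-1, 1]
for n in (1, 10, 100, 1000):
    gap_on_interval = mu_n_interval(n, a, b) - (b - a)  # bounded by 2 / n
    print(n, gap_on_interval, tv_to_lebesgue(n))
```

The gap on the interval shrinks like 1/n, while the total variation estimate stays close to 4/π for large n instead of tending to 0, which is consistent with setwise convergence without convergence in total variation.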


Weak convergence of measures

In mathematics and statistics, weak convergence is one of many types of convergence relating to the convergence of measures. It depends on a topology on the underlying space and thus is not a purely measure-theoretic notion.

There are several equivalent definitions of weak convergence of a sequence of measures, some of which are (apparently) more general than others. The equivalence of these conditions is sometimes known as the Portmanteau theorem.

Definition. Let S be a metric space with its Borel \sigma-algebra \Sigma. A sequence of probability measures P_n\, (n = 1, 2, \dots) on (S, \Sigma) is said to converge weakly to a probability measure P (denoted P_n \Rightarrow P) if any of the following equivalent conditions is true (here \operatorname{E}_n denotes expectation or the integral with respect to P_n, while \operatorname{E} denotes expectation or the integral with respect to P):
* \operatorname{E}_n[f] \to \operatorname{E}[f] for all bounded, continuous functions f;
* \operatorname{E}_n[f] \to \operatorname{E}[f] for all bounded and Lipschitz functions f;
* \limsup \operatorname{E}_n[f] \le \operatorname{E}[f] for every upper semi-continuous function f bounded from above;
* \liminf \operatorname{E}_n[f] \ge \operatorname{E}[f] for every lower semi-continuous function f bounded from below;
* \limsup P_n(C) \le P(C) for all closed sets C of space S;
* \liminf P_n(U) \ge P(U) for all open sets U of space S;
* \lim P_n(A) = P(A) for all continuity sets A of measure P.

In the case S = \mathbf{R} (with its usual topology), if F_n and F denote the cumulative distribution functions of the measures P_n and P, respectively, then P_n converges weakly to P if and only if \lim_{n \to \infty} F_n(x) = F(x) for all points x \in \mathbf{R} at which F is continuous.

For example, the sequence where P_n is the Dirac measure located at 1/n converges weakly to the Dirac measure located at 0 (if we view these as measures on \mathbf{R} with the usual topology), but it does not converge setwise. This is intuitively clear: we only know that 1/n is "close" to 0 because of the topology of \mathbf{R}.

This definition of weak convergence can be extended to any metrizable topological space S. It also defines a weak topology on \mathcal{P}(S), the set of all probability measures defined on (S, \Sigma). The weak topology is generated by the following subbasis of open sets:
: \left\{ U_{\varphi, x, \delta} \ \left| \ \varphi : S \to \mathbf{R} \text{ is bounded and continuous}, \ x \in \mathbf{R} \text{ and } \delta > 0 \right. \right\},
where
: U_{\varphi, x, \delta} := \left\{ \mu \in \mathcal{P}(S) \ \left| \ \left| \int_S \varphi \, d\mu - x \right| < \delta \right. \right\}.
If S is also separable, then \mathcal{P}(S) is metrizable and separable, for example by the Lévy–Prokhorov metric. If S is also compact or Polish, so is \mathcal{P}(S).

If S is separable, it naturally embeds into \mathcal{P}(S) as the (closed) set of Dirac measures, and its convex hull is dense.

There are many "arrow notations" for this kind of convergence: the most frequently used are P_n \Rightarrow P, P_n \rightharpoonup P, P_n \xrightarrow{w} P and P_n \xrightarrow{\mathcal{D}} P.
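The Dirac measure example can be made concrete with a few lines of Python. This is a minimal sketch, assuming we represent a Dirac measure only by the location of its atom; the helper names and the choice of test function (arctangent, which is bounded and continuous on the real line) are illustrative. It checks one bounded continuous test function, the set A = {0}, and the cumulative distribution functions at a continuity point and at the discontinuity point of the limit.

```python
import math

def integral_dirac(f, x0):
    """Integral of f against the Dirac measure at x0, which is f(x0)."""
    return f(x0)

def dirac_measure_of(indicator, x0):
    """Measure of a set (given by its indicator) under the Dirac measure at x0."""
    return 1.0 if indicator(x0) else 0.0

def dirac_cdf(x0, x):
    """Cumulative distribution function of the Dirac measure at x0."""
    return 1.0 if x >= x0 else 0.0

f = math.atan                  # bounded, continuous test function

def in_singleton_zero(x):
    """Indicator of A = {0}; its boundary {0} has positive mass under delta_0."""
    return x == 0.0

for n in (1, 10, 100, 1000):
    x_n = 1.0 / n
    print(
        n,
        integral_dirac(f, x_n) - integral_dirac(f, 0.0),  # tends to 0
        dirac_measure_of(in_singleton_zero, x_n),          # stays 0; the limit gives 1
        dirac_cdf(x_n, 0.5),                               # equals F(0.5) = 1 once n >= 2
        dirac_cdf(x_n, 0.0),                               # stays 0, but F(0) = 1
    )
```

The set {0} is not a continuity set of the limit, and 0 is not a continuity point of its distribution function, so the failures in the last columns do not contradict weak convergence; they show why setwise convergence fails here.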


Weak convergence of random variables

Let (\Omega, \mathcal{F}, \mathbb{P}) be a probability space and \mathbf{X} be a metric space. If X_n : \Omega \to \mathbf{X} is a sequence of random variables, then X_n is said to converge weakly (or in distribution or in law) to the random variable X : \Omega \to \mathbf{X} as n \to \infty if the sequence of pushforward measures (X_n)_*(\mathbb{P}) converges weakly to X_*(\mathbb{P}) in the sense of weak convergence of measures on \mathbf{X}, as defined above.
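For real-valued random variables, convergence in distribution can be checked through cumulative distribution functions. The following sketch uses an illustrative example that is not taken from the text above: X_n uniform on the grid {1/n, 2/n, ..., 1}, with limit X uniform on [0, 1]. Only the laws (pushforward measures) enter, so the code works with the two distribution functions directly; all names are hypothetical.

```python
import math

def cdf_discrete_uniform(n, x):
    """CDF of X_n, uniform on the grid {1/n, 2/n, ..., 1}."""
    if x < 0.0:
        return 0.0
    return min(math.floor(n * x), n) / n

def cdf_uniform(x):
    """CDF of the limit X, uniform on [0, 1]."""
    return min(max(x, 0.0), 1.0)

def max_cdf_gap(n, grid_size=10_001):
    """Largest gap between the two CDFs over an evenly spaced grid in [-0.5, 1.5]."""
    xs = (-0.5 + 2.0 * k / (grid_size - 1) for k in range(grid_size))
    return max(abs(cdf_discrete_uniform(n, x) - cdf_uniform(x)) for x in xs)

for n in (1, 10, 100, 1000):
    print(n, max_cdf_gap(n))  # never exceeds 1 / n, up to rounding
```

Since the limit distribution function is continuous everywhere, pointwise convergence of the distribution functions at every real x is exactly the criterion for weak convergence of the laws on the real line.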


Comparison with vague convergence

Let X be a metric space (for example \mathbb{R} or [0,1]). The following spaces of test functions are commonly used in the convergence of probability measures.
* C_c(X), the class of continuous functions f each vanishing outside a compact set.
* C_0(X), the class of continuous functions f such that \lim_{|x| \to \infty} f(x) = 0.
* C_B(X), the class of continuous bounded functions.
We have C_c \subset C_0 \subset C_B \subset C. Moreover, C_0 is the closure of C_c with respect to uniform convergence.


Vague Convergence

A sequence of measures \left(\mu_n\right)_{n \in \mathbb{N}} converges vaguely to a measure \mu if for all f \in C_c(X), \int_X f \, d\mu_n \rightarrow \int_X f \, d\mu.


Weak Convergence

A sequence of measures \left(\mu_n\right)_{n \in \mathbb{N}} converges weakly to a measure \mu if for all f \in C_B(X), \int_X f \, d\mu_n \rightarrow \int_X f \, d\mu.

In general, these two notions of convergence are not equivalent. In a probability setting, vague convergence and weak convergence of probability measures are equivalent assuming tightness. That is, a tight sequence of probability measures (\mu_n)_{n \in \mathbb{N}} converges vaguely to a probability measure \mu if and only if (\mu_n)_{n \in \mathbb{N}} converges weakly to \mu.

The weak limit of a sequence of probability measures, provided it exists, is a probability measure. In general, if tightness is not assumed, a sequence of probability (or sub-probability) measures may not converge ''vaguely'' to a true probability measure, but rather to a sub-probability measure (a measure such that \mu(X) \leq 1). Thus, for a sequence of probability measures (\mu_n)_{n \in \mathbb{N}} such that \mu_n \overset{v}{\to} \mu, where \mu is not specified to be a probability measure, weak convergence is not guaranteed.
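A standard illustration of the gap between the two notions, added here as an assumption-labelled sketch rather than something stated in the text, is the sequence of Dirac measures located at n on the real line. The mass "escapes to infinity": integrals of compactly supported continuous functions tend to 0, so the vague limit is the zero measure, while the integral of the constant function 1 stays equal to 1. The helper names and the tent-shaped test function are hypothetical choices.

```python
def tent(x):
    """Continuous function with compact support [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def constant_one(x):
    """Bounded continuous function that is not compactly supported."""
    return 1.0

def integral_dirac(f, x0):
    """Integral of f against the Dirac measure at x0, which is f(x0)."""
    return f(x0)

for n in (0, 1, 2, 10, 100):
    print(
        n,
        integral_dirac(tent, float(n)),          # 0 as soon as n leaves the support
        integral_dirac(constant_one, float(n)),  # always 1: total mass is preserved
    )
```

Against the zero measure every integral is 0, so the first column matches the vague limit while the second does not converge to it. The sequence is not tight, and it has no weak limit.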


Weak convergence of measures as an example of weak-* convergence

Despite having the same name as weak convergence in the context of functional analysis, weak convergence of measures is actually an example of weak-* convergence. The definitions of weak and weak-* convergence used in functional analysis are as follows.

Let V be a topological vector space or Banach space.
# A sequence x_n in V converges weakly to x if \varphi\left(x_n\right) \rightarrow \varphi(x) as n \to \infty for all \varphi \in V^*. One writes x_n \mathrel{\stackrel{w}{\rightarrow}} x as n \to \infty.
# A sequence \varphi_n \in V^* converges in the weak-* topology to \varphi provided that \varphi_n(x) \rightarrow \varphi(x) for all x \in V. That is, convergence occurs in the pointwise sense. In this case, one writes \varphi_n \mathrel{\stackrel{w^*}{\rightarrow}} \varphi as n \to \infty.

To illustrate how weak convergence of measures is an example of weak-* convergence, we give an example in terms of vague convergence (see above). Let X be a locally compact Hausdorff space. By the Riesz representation theorem, the space M(X) of Radon measures is isomorphic to a subspace of the space of continuous linear functionals on C_0(X). Therefore, for each Radon measure \mu_n \in M(X), there is a linear functional \varphi_n \in C_0(X)^* such that \varphi_n(f) = \int_X f \, d\mu_n for all f \in C_0(X). Applying the definition of weak-* convergence in terms of linear functionals, the characterization of vague convergence of measures is obtained. For compact X, C_0(X) = C_B(X), so in this case weak convergence of measures is a special case of weak-* convergence.


See also

* Convergence of random variables
* Lévy–Prokhorov metric
* Prokhorov's theorem
* Tightness of measures

