In the mathematical theory of probability, the Wiener process, named after Norbert Wiener, is a stochastic process used in modeling various phenomena, including Brownian motion and fluctuations in financial markets. A formula for the conditional probability distribution of the extremum of the Wiener process and a sketch of its proof appear in work of H. J. Kushner published in 1964 (H. J. Kushner, "A New Method of Locating the Maximum Point of an Arbitrary Multipeak Curve in the Presence of Noise", ''J. Basic Eng.'' 86(1), 97–106, March 1964; appendix 3, page 106). A detailed constructive proof appears in work of Dario Ballabio in 1978 (Dario Ballabio, "Una nuova classe di algoritmi stocastici per l'ottimizzazione globale" [A new class of stochastic algorithms for global optimization], University of Milan, Institute of Mathematics, doctoral dissertation presented on July 12th 1978, pp. 29–33). This result was developed within a research project about Bayesian optimization algorithms.

In some global optimization problems the analytical definition of the objective function is unknown and it is only possible to obtain its values at fixed points. There are objective functions for which the cost of an evaluation is very high, for example when the evaluation is the result of an experiment or of a particularly onerous measurement. In these cases, the search for the global extremum (maximum or minimum) can be carried out using a methodology named "Bayesian optimization", which tends to obtain a priori the best possible result with a predetermined number of evaluations. In summary, it is assumed that, outside the points at which it has already been evaluated, the objective function has a pattern which can be represented by a stochastic process with appropriate characteristics. The stochastic process is taken as a model of the objective function, assuming that the probability distribution of its extrema gives the best indication about the extrema of the objective function.

In the simplest case of one-dimensional optimization, given that the objective function has been evaluated at a number of points, there is the problem of choosing in which of the intervals thus identified it is more appropriate to invest a further evaluation. If a Wiener stochastic process is chosen as a model for the objective function, it is possible to calculate the probability distribution of the model's extreme points inside each interval, conditioned on the known values at the interval boundaries. The comparison of the obtained distributions provides a criterion for selecting the interval in which the process should be iterated. The probability of having identified the interval containing the global extremum point of the objective function can be used as a stopping criterion. Bayesian optimization is not an efficient method for the accurate search of local extrema, so, once the search range has been restricted, a specific local optimization method can be used, depending on the characteristics of the problem.
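As a concrete illustration of the interval-selection rule just described, the following minimal sketch (not taken from the cited works) scores each interval by the conditional probability that the Wiener-process model dips below the current best value minus a margin, using the formula for F(z) stated in the Proposition below; the function names, the margin rule and the value of the variance parameter are illustrative assumptions.

```python
import math

def p_min_below(x_left, x_right, z, sigma2, dt):
    """P(minimum of the Wiener model over an interval of length dt is <= z),
    conditioned on the endpoint values x_left and x_right
    (the formula F(z) stated in the Proposition below)."""
    if z >= min(x_left, x_right):
        return 1.0
    return math.exp(-2.0 * (x_left - z) * (x_right - z) / (sigma2 * dt))

def select_interval(ts, ys, sigma2, margin):
    """Pick the interval most likely to contain a value below the current
    best minus a margin, under the Wiener-process model of the objective.
    ts: sorted evaluation points, ys: observed objective values."""
    z = min(ys) - margin                 # improvement threshold (illustrative choice)
    best_i, best_p = 0, -1.0
    for i in range(len(ts) - 1):
        p = p_min_below(ys[i], ys[i + 1], z, sigma2, ts[i + 1] - ts[i])
        if p > best_p:
            best_i, best_p = i, p
    return best_i, best_p                # the next evaluation goes inside interval best_i

# Example: five evaluations of an unknown objective on [0, 1]
ts = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [1.0, 0.6, 0.8, 0.3, 0.9]
i, p = select_interval(ts, ys, sigma2=1.0, margin=0.1)
print(f"refine interval [{ts[i]}, {ts[i + 1]}]  (P = {p:.3f})")
```

Choosing where to place the new evaluation inside the selected interval, and stopping once the largest probability falls below a tolerance, complete the procedure sketched above; both steps are omitted here for brevity.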


Proposition

Let X(t) be a Wiener stochastic process on an interval [a,b] with initial value X(a) = X_a. By definition of Wiener process, increments have a normal distribution:

: \text{for } a \leq t_1 < t_2 \leq b, \qquad X(t_2) - X(t_1) \sim N(0, \sigma^2(t_2 - t_1)).

Let

: F(z) = \Pr\left( \min_{a \leq t \leq b} X(t) \leq z \mid X(b) = X_b \right)

be the cumulative distribution function of the minimum value of the X(t) function on the interval [a,b], conditioned by the value X(b) = X_b. It is shown that (János D. Pintér, ''Global Optimization in Action: Continuous and Lipschitz Optimization'', Springer Science & Business Media, 1996, page 57):

: F(z) = \begin{cases} 1 & \text{if } z \geq \min(X_a, X_b), \\[4pt] \exp\left( -2\,\dfrac{(X_a - z)(X_b - z)}{\sigma^2 (b-a)} \right) & \text{if } z < \min(X_a, X_b). \end{cases}

The theorem, as set out and shown for the case of the minimum of the Wiener process, also applies to the maximum.
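As an informal numerical check of the proposition (not part of the cited sources), the closed-form expression for F(z) can be compared with a Monte Carlo estimate obtained from simulated Brownian bridge paths pinned at X(a) = X_a and X(b) = X_b; the grid size, the number of paths and the parameter values below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_min_below_mc(x_a, x_b, z, sigma, a, b, n_steps=1000, n_paths=20000):
    """Monte Carlo estimate of P(min X(t) <= z | X(a) = x_a, X(b) = x_b)
    from discretized Brownian bridge paths with diffusion coefficient sigma."""
    dt = (b - a) / n_steps
    dW = rng.normal(0.0, sigma * np.sqrt(dt), size=(n_paths, n_steps))
    W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)
    t = np.linspace(0.0, b - a, n_steps + 1)
    # Pin each simulated path so that it starts at x_a and ends exactly at x_b
    X = x_a + W - (t / (b - a)) * (W[:, -1:] - (x_b - x_a))
    return np.mean(X.min(axis=1) <= z)

x_a, x_b, z, sigma, a, b = 1.0, 0.5, 0.0, 1.0, 0.0, 1.0
closed_form = np.exp(-2.0 * (x_a - z) * (x_b - z) / (sigma**2 * (b - a)))
print("closed form F(z):", closed_form)
print("Monte Carlo     :", p_min_below_mc(x_a, x_b, z, sigma, a, b))
```

Because the minimum is only sampled at grid points, the Monte Carlo value slightly underestimates the true crossing probability; refining the grid reduces the gap.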


Constructive proof

Case z \geq \min(X_a, X_b) is an immediate consequence of the definition of the minimum; in the following it will always be assumed that z < \min(X_a, X_b), and the corner case \min_{t \in [a,b]} X(t) = \min(X_a, X_b) will also be excluded.

Let us assume X(t) defined at a finite number of points t_k \in [a,b], \ 0 \leq k \leq n, \ t_0 = a. Let, by varying the integer n,

: T_n \ \overset{\text{def}}{=} \ \{ t_k \}_{0 \leq k \leq n}

be a sequence of sets such that T_n \subset T_{n+1} and \bigcup_{n=0}^{\infty} T_n is a dense set in [a,b], hence every neighbourhood of each point in [a,b] contains an element of one of the sets T_n.

Let \Delta z be a real positive number such that z + \Delta z < \min(X_a, X_b). Let the event E be defined as:

: E \ \overset{\text{def}}{=} \ \left( \min_{t \in [a,b]} X(t) < z + \Delta z \right) \Longleftrightarrow \left( \exists \, t \in [a,b] : X(t) < z + \Delta z \right).

Having excluded the corner case \min_{t \in [a,b]} X(t) = \min(X_a, X_b), it is surely P(E) > 0.

Let E_n, \ n = 0, 1, 2, \ldots be the events defined as:

: E_n \ \overset{\text{def}}{=} \ \left( \exists \, t_k \in T_n : z < X(t_k) < z + \Delta z \right)

and let \nu be the first k among the t_k \in T_n which define E_n. Since T_n \subset T_{n+1}, it is evidently E_n \subset E_{n+1}.

Now equation (2.1) will be proved:

: (2.1) \ \ \ \ E = \bigcup_{n=0}^{\infty} E_n.

By the definition of the events E_n, \ \forall n \ \ E_n \Rightarrow E, hence \bigcup_{n=0}^{\infty} E_n \subset E. It will now be verified that E \subset \bigcup_{n=0}^{\infty} E_n, from which (2.1) follows.

The definition of E, the continuity of X(t) and the hypothesis z < X_a = X(a) imply, by the intermediate value theorem, that

: \exists \, \bar{t} \in [a,b] : z < X(\bar{t}) < z + \Delta z.

By the continuity of X(t) and the hypothesis that \bigcup_{n=0}^{\infty} T_n is dense in [a,b], it is deduced that there exists \bar{n} such that for some t_\nu \in T_{\bar{n}} it must be z < X(t_\nu) < z + \Delta z, hence

: E \subset E_{\bar{n}} \subset \bigcup_{n=0}^{\infty} E_n,

which implies (2.1).

: (2.2) \ \ \ \ P(E) = \lim_{n \to \infty} P(E_n).

(2.2) is deduced from (2.1), considering that E_n \Rightarrow E_{n+1} implies that the sequence of probabilities P(E_n) is monotone non-decreasing and hence converges to its supremum. The definition of the events E_n implies \forall n \ \ P(E_n) > 0 \Rightarrow P(E_n) = P(E_\nu), and (2.2) implies P(E) = P(E_\nu). In the following it will always be assumed that n \geq \nu, so that t_\nu is well defined.
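The convergence stated in (2.1) and (2.2) can be visualised numerically; the sketch below (an illustration added here, not part of the source proof) estimates P(E_n) on nested grids T_n obtained by subsampling a fine simulation grid, and compares the values with an estimate of P(E). The parameter values, the dyadic choice of the grids T_n and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative parameters with z + dz < min(X_a, X_b), as required above
x_a, sigma, a, b = 1.0, 1.0, 0.0, 1.0
z, dz = 0.0, 0.05

n_fine, n_paths = 2048, 10000            # fine reference grid on [a, b]
dt = (b - a) / n_fine
dW = rng.normal(0.0, sigma * np.sqrt(dt), size=(n_paths, n_fine))
X = x_a + np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

# P(E): the path goes below z + dz somewhere on the fine grid
p_E = np.mean(X.min(axis=1) < z + dz)

# P(E_n) on nested coarse grids T_n (every "step"-th point of the fine grid):
# some point of T_n falls strictly inside the strip (z, z + dz)
for step in (512, 128, 32, 8, 2):
    coarse = X[:, ::step]
    hit = ((coarse > z) & (coarse < z + dz)).any(axis=1)
    print(f"|T_n| = {coarse.shape[1]:5d}   P(E_n) ~ {hit.mean():.4f}")
print(f"fine-grid estimate of P(E) ~ {p_E:.4f}")
```

Since both E and the events E_n are approximated on finite grids, the largest estimate of P(E_n) stays slightly below the estimate of P(E); refining the grids shrinks the gap, in agreement with (2.2).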
Now inequalities (2.3) to (2.8) will be established.

: (2.3) \ \ \ \ P\left( X(b) \leqslant -X_b + 2z \right) \leqslant P\left( X(b) - X(t_\nu) < -X_b + z \right).

In fact, by definition of E_n it is z < X(t_\nu), so

: \left( X(b) \leqslant -X_b + 2z \right) \Rightarrow \left( X(b) - X(t_\nu) < -X_b + z \right).

In a similar way, since by definition of E_n it is z < X(t_\nu), (2.4) is valid:

: (2.4) \ \ \ \ P\left( X(b) - X(t_\nu) > X_b - z \right) \leqslant P\left( X(b) > X_b \right).

: (2.5) \ \ \ \ P\left( X(b) - X(t_\nu) < -X_b + z \right) = P\left( X(b) - X(t_\nu) > X_b - z \right).

The above is explained by the fact that the random variable X(b) - X(t_\nu) \sim N(0, \ \sigma^2 (b - t_\nu)) has a probability density symmetric with respect to its mean, which is zero.

By applying in sequence relationships (2.3), (2.5) and (2.4) we get (2.6):

: (2.6) \ \ \ \ P\left( X(b) \leqslant -X_b + 2z \right) \leqslant P\left( X(b) > X_b \right).

With the same procedure used to obtain (2.3), (2.4) and (2.5), this time taking advantage of the relationship X(t_\nu) < z + \Delta z, we get (2.7):

: (2.7) \ \ \ \ P\left( X(b) > X_b \right) \leqslant P\left( X(b) - X(t_\nu) > X_b - z - \Delta z \right) = P\left( X(b) - X(t_\nu) < -X_b + z + \Delta z \right) \leqslant P\left( X(b) < -X_b + 2z + 2\Delta z \right).

By applying in sequence (2.6) and (2.7) we get:

: (2.8) \ \ \ \ P\left( X(b) \leqslant -X_b + 2z \right) \leqslant P\left( X(b) > X_b \right) \leqslant P\left( X(b) < -X_b + 2z + 2\Delta z \right).

From X_b > z + \Delta z > z, considering the continuity of X(t) and the intermediate value theorem, we get X(b) > X_b > z + \Delta z > z \Rightarrow E_n, which implies P\left( X(b) > X_b \right) = P\left( E_n, \ X(b) > X_b \right). Replacing the above in (2.8) and passing to the limits (for n \rightarrow \infty, \ E_n(\Delta z) \rightarrow E(\Delta z), and for \Delta z \rightarrow 0 the event E(\Delta z) converges to \min_{t \in [a,b]} X(t) \leqslant z), we get:

: (2.9) \ \ \ \ P\left( X(b) \leqslant -X_b + 2z \right) = P\left( \min_{t \in [a,b]} X(t) \leqslant z, \ \ X(b) > X_b \right).

\forall \, dX_b > 0, by substituting X_b with X_b - dX_b in (2.9) we get the equivalent relationship:

: (2.10) \ \ \ \ P\left( X(b) \leqslant -X_b + 2z + dX_b \right) = P\left( \min_{t \in [a,b]} X(t) \leqslant z, \ \ X(b) > X_b - dX_b \right).

Applying Bayes' theorem to the joint event \left( \min_{t \in [a,b]} X(t) \leqslant z, \ \ X_b - dX_b < X(b) \leqslant X_b \right):

: (2.11) \ \ \ \ P\left( \min_{t \in [a,b]} X(t) \leqslant z \mid X_b - dX_b < X(b) \leqslant X_b \right) = P\left( \min_{t \in [a,b]} X(t) \leqslant z, \ \ X_b - dX_b < X(b) \leqslant X_b \right) \diagup P\left( X_b - dX_b < X(b) \leqslant X_b \right).

Let:

: A \ \overset{\text{def}}{=} \ \left( \min_{t \in [a,b]} X(t) \leqslant z \right), \quad B \ \overset{\text{def}}{=} \ \left( X(b) > X_b \right), \quad C \ \overset{\text{def}}{=} \ \left( X_b - dX_b < X(b) \leqslant X_b \right), \quad D \ \overset{\text{def}}{=} \ \left( X(b) > X_b - dX_b \right).

From the above definitions it follows that D = B \cup C, with B and C disjoint, hence:

: D = B \cup C \Rightarrow P(A, D) = P(A, B \cup C) = P(A, B) + P(A, C) \Rightarrow

: (2.12) \ \ \ \ P(A, C) = P(A, D) - P(A, B).

Substituting (2.12) into (2.11), we get the equivalent:

: (2.13) \ \ \ \ P\left( \min_{t \in [a,b]} X(t) \leqslant z \mid X_b - dX_b < X(b) \leqslant X_b \right) = \left( P\left( \min_{t \in [a,b]} X(t) \leq z, \ \ X(b) > X_b - dX_b \right) - P\left( \min_{t \in [a,b]} X(t) \leq z, \ \ X(b) > X_b \right) \right) \diagup P\left( X_b - dX_b < X(b) \leqslant X_b \right).

Substituting (2.9) and (2.10) into (2.13):

: (2.14) \ \ \ \ P\left( \min_{t \in [a,b]} X(t) \leqslant z \mid X_b - dX_b < X(b) \leqslant X_b \right) = \left( P\left( X(b) \leqslant -X_b + 2z + dX_b \right) - P\left( X(b) \leqslant -X_b + 2z \right) \right) \diagup P\left( X_b - dX_b < X(b) \leqslant X_b \right).

It can be observed that on the right-hand side of (2.14) appears the probability distribution of the random variable X(b), normal with mean X_a and variance \sigma^2(b-a). The realizations X_b and -X_b + 2z of the random variable X(b) correspond respectively to the probability densities:

: (2.15) \ \ \ \ P(X_b) \, dX_b = \frac{1}{\sigma \sqrt{2 \pi (b-a)}} \exp\left( -\frac{1}{2} \, \frac{(X_b - X_a)^2}{\sigma^2 (b-a)} \right) dX_b,

: (2.16) \ \ \ \ P(-X_b + 2z) \, dX_b = \frac{1}{\sigma \sqrt{2 \pi (b-a)}} \exp\left( -\frac{1}{2} \, \frac{(-X_b + 2z - X_a)^2}{\sigma^2 (b-a)} \right) dX_b.

Substituting (2.15) and (2.16) into (2.14) and taking the limit for dX_b \rightarrow 0, the thesis is proved:

: F(z) = P\left( \min_{a \leq t \leq b} X(t) \leq z \mid X(b) = X_b \right) = \frac{1}{\sigma \sqrt{2 \pi (b-a)}} \exp\left( -\frac{1}{2} \, \frac{(-X_b + 2z - X_a)^2}{\sigma^2 (b-a)} \right) dX_b \ \diagup \ \frac{1}{\sigma \sqrt{2 \pi (b-a)}} \exp\left( -\frac{1}{2} \, \frac{(X_b - X_a)^2}{\sigma^2 (b-a)} \right) dX_b

: = \exp\left( -\frac{1}{2} \, \frac{(-X_b + 2z - X_a)^2 - (X_b - X_a)^2}{\sigma^2 (b-a)} \right) = \exp\left( -2 \, \frac{(X_a - z)(X_b - z)}{\sigma^2 (b-a)} \right).
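As an informal numerical cross-check of the reflection step (2.9), not taken from the cited sources, both sides of the equality can be estimated from simulated unconditioned Wiener paths started at X_a; the parameter values, the grid size and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameters with z < min(x_a, x_b)
x_a, x_b, z, sigma, a, b = 1.0, 0.5, 0.0, 1.0, 0.0, 1.0
n_steps, n_paths = 1000, 20000

dt = (b - a) / n_steps
dW = rng.normal(0.0, sigma * np.sqrt(dt), size=(n_paths, n_steps))
X = x_a + np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

lhs = np.mean(X[:, -1] <= 2.0 * z - x_b)                # P(X(b) <= -X_b + 2z)
rhs = np.mean((X.min(axis=1) <= z) & (X[:, -1] > x_b))  # P(min <= z, X(b) > X_b)
print("left-hand side of (2.9) :", lhs)
print("right-hand side of (2.9):", rhs)
```

The second estimate is slightly low because the minimum is only observed at grid points; a finer grid brings the two values closer, consistently with (2.9).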


Bibliography

* Harold J. Kushner, "A versatile stochastic model of a function of unknown and time varying form", ''Journal of Mathematical Analysis and Applications'', Volume 5, Issue 1, August 1962, pages 150–167.
* J. Mockus, J. Tiesis, A. Zilinskas, "The Application of Bayesian Methods for Seeking the Extremum", IFIP Congress 1977, August 8–12, Toronto.


See also

* Random variable
* Probability
* Event (probability theory)
* Conditional probability
* Regular conditional probability
* Stochastic process
* Wiener process

