For mathematical analysis and statistics, leave-one-out error can refer to the following:

* Leave-one-out cross-validation stability (CVloo, for ''stability of Cross Validation with leave one out''): an algorithm f has CVloo stability \beta with respect to the loss function V if the following holds:

\forall i\in\{1,...,m\},\ \mathbb{P}_S\{|V(f_S,z_i)-V(f_{S^{|i}},z_i)|\leq\beta_{CV}\}\geq 1-\delta_{CV}

* Expected-to-leave-one-out error stability (Eloo_{err}, for ''expected error from leaving one out''): an algorithm f has Eloo_{err} stability if for each m there exist a \beta_{EL}^m and a \delta_{EL}^m such that:

\forall i\in\{1,...,m\},\ \mathbb{P}_S\left\{\left|I(f_S)-\frac{1}{m}\sum_{i=1}^m V(f_{S^{|i}},z_i)\right|\leq\beta_{EL}^m\right\}\geq 1-\delta_{EL}^m,

with \beta_{EL}^m and \delta_{EL}^m going to zero as m\rightarrow\infty.

(Here S is a training set of size m, f_S is the hypothesis learned from S, and S^{|i} is S with its i-th element removed; these notations are defined below, and a numerical illustration follows.)
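As an illustration, the following sketch (not from the source; the ridge learner, the squared loss, and the single-sample proxy are assumptions) instantiates the CVloo deviation |V(f_S,z_i)-V(f_{S^{|i}},z_i)| for a simple regularized least-squares learner, so that candidate (\beta_{CV}, \delta_{CV}) pairs can be read off from the empirical distribution of the deviation over the index i.

```python
import numpy as np

def fit_ridge(X, y, lam=1e-3):
    # Learner f: maps a training set S onto a hypothesis f_S (a weight vector).
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def sq_loss(w, x, y):
    # Squared loss V(f(x), y) = (f(x) - y)^2.
    return (x @ w - y) ** 2

rng = np.random.default_rng(0)
m, d = 50, 3
X = rng.normal(size=(m, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=m)

w_S = fit_ridge(X, y)                      # f_S, trained on all of S
dev = np.empty(m)
for i in range(m):
    keep = np.arange(m) != i               # indices of S^{|i} (i-th point removed)
    w_loo = fit_ridge(X[keep], y[keep])    # f_{S^{|i}}
    dev[i] = abs(sq_loss(w_S, X[i], y[i]) - sq_loss(w_loo, X[i], y[i]))

# For a candidate beta, the fraction of indices exceeding it estimates delta.
beta_cv = np.quantile(dev, 0.95)
delta_cv = float(np.mean(dev > beta_cv))
print(f"beta_CV ~ {beta_cv:.5f}, empirical delta ~ {delta_cv:.2f}")
```

Strictly, CVloo stability quantifies the probability over random draws of S for each fixed i; reading the distribution over i within a single draw, as above, is only a rough empirical proxy.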


Preliminary notations

With X and Y being subsets of the real numbers \R (that is, X, Y \subset \R), being respectively an input space X and an output space Y, we consider a training set

S = \{z_1=(x_1,y_1),\ ...,\ z_m=(x_m,y_m)\}

of size m in Z = X \times Y, drawn independently and identically distributed (i.i.d.) from an unknown distribution D. A learning algorithm is then a function f from Z^m into F \subset Y^X which maps a training set S onto a function f_S from the input space X to the output space Y. To avoid complex notation, we consider only deterministic algorithms. It is also assumed that the algorithm f is symmetric with respect to S, i.e. it does not depend on the order of the elements in the training set. Furthermore, we assume that all functions are measurable and all sets are countable, which does not limit the interest of the results presented here.

The loss of a hypothesis f with respect to an example z = (x,y) is defined as V(f,z) = V(f(x),y). The empirical error of f can then be written as

I_S(f) = \frac{1}{m}\sum_{i=1}^m V(f,z_i),

and the true error of f is

I(f) = \mathbb{E}_z\, V(f,z).

Given a training set S of size m, we build, for all i = 1,...,m, modified training sets as follows (see the sketch after this list):
* by removing the i-th element: S^{|i} = \{z_1,\ ...,\ z_{i-1},\ z_{i+1},\ ...,\ z_m\}
* and/or by replacing the i-th element: S^i = \{z_1,\ ...,\ z_{i-1},\ z_i',\ z_{i+1},\ ...,\ z_m\}
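To make the notation concrete, here is a minimal sketch (the helper names, the toy data, and the squared loss are assumptions for illustration) that builds the modified training sets S^{|i} and S^i and evaluates the empirical error I_S(f).

```python
from typing import Callable, List, Tuple

Example = Tuple[float, float]              # z = (x, y) with x in X, y in Y

def remove_ith(S: List[Example], i: int) -> List[Example]:
    # S^{|i} = {z_1, ..., z_{i-1}, z_{i+1}, ..., z_m} (0-based index i here).
    return S[:i] + S[i + 1:]

def replace_ith(S: List[Example], i: int, z_new: Example) -> List[Example]:
    # S^i = {z_1, ..., z_{i-1}, z_i', z_{i+1}, ..., z_m}.
    return S[:i] + [z_new] + S[i + 1:]

def empirical_error(f: Callable[[float], float],
                    V: Callable[[float, float], float],
                    S: List[Example]) -> float:
    # I_S(f) = (1/m) * sum_i V(f(x_i), y_i).
    return sum(V(f(x), y) for x, y in S) / len(S)

# Usage with a toy training set, identity hypothesis, and squared loss:
S = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2)]
f = lambda x: x
V = lambda y_hat, y: (y_hat - y) ** 2
print(remove_ith(S, 0))                    # S with z_1 removed
print(replace_ith(S, 0, (0.5, 0.4)))       # S with z_1 replaced by z_1'
print(empirical_error(f, V, S))
```

Because the algorithm is assumed symmetric with respect to S, the position at which an element is removed or replaced does not matter, only which element it is.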


See also

* Constructive analysis
* History of calculus
* Hypercomplex analysis
* Jackknife resampling
* Statistical classification
* Timeline of calculus and mathematical analysis

