In mathematics, a smooth maximum of an indexed family x_1, \ldots, x_n of numbers is a smooth approximation to the maximum function \max(x_1,\ldots,x_n), meaning a parametric family of functions m_\alpha(x_1,\ldots,x_n) such that for every \alpha, the function m_\alpha is smooth, and the family converges to the maximum function as \alpha\to\infty. The concept of smooth minimum is similarly defined. In many cases, a single family approximates both: maximum as the parameter goes to positive infinity, minimum as the parameter goes to negative infinity; in symbols, m_\alpha\to\max as \alpha\to\infty and m_\alpha\to\min as \alpha\to-\infty. The term can also be used loosely for a specific smooth function that behaves similarly to a maximum, without necessarily being part of a parametrized family.


Examples

For large positive values of the parameter \alpha > 0, the following formulation is a smooth, differentiable approximation of the maximum function. For negative values of the parameter that are large in absolute value, it approximates the minimum.

: \mathcal{S}_\alpha (x_1,\ldots,x_n) = \frac{\sum_{i=1}^n x_i e^{\alpha x_i}}{\sum_{i=1}^n e^{\alpha x_i}}

\mathcal{S}_\alpha has the following properties:
# \mathcal{S}_\alpha\to \max as \alpha\to\infty
# \mathcal{S}_0 is the arithmetic mean of its inputs
# \mathcal{S}_\alpha\to \min as \alpha\to -\infty

The gradient of \mathcal{S}_\alpha is closely related to softmax and is given by

: \nabla_{x_i}\mathcal{S}_\alpha (x_1,\ldots,x_n) = \frac{e^{\alpha x_i}}{\sum_{j=1}^n e^{\alpha x_j}} \left[ 1 + \alpha \left( x_i - \mathcal{S}_\alpha (x_1,\ldots,x_n) \right) \right].

This makes the softmax function useful for optimization techniques that use gradient descent.
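As an illustration, here is a minimal Python sketch of this exponentially weighted smooth maximum and its gradient (the function names smoothmax and smoothmax_grad are mine, not from the source), using the usual shift-by-the-maximum trick so the exponentials cannot overflow:

```python
import numpy as np

def smoothmax(x, alpha):
    """Exponentially weighted smooth maximum S_alpha(x)."""
    x = np.asarray(x, dtype=float)
    z = alpha * x
    # Shifting by z.max() before exponentiating avoids overflow;
    # the common factor exp(-z.max()) cancels in the ratio.
    w = np.exp(z - z.max())
    return float(np.sum(x * w) / np.sum(w))

def smoothmax_grad(x, alpha):
    """Gradient: softmax(alpha * x)_i * (1 + alpha * (x_i - S_alpha(x)))."""
    x = np.asarray(x, dtype=float)
    z = alpha * x
    w = np.exp(z - z.max())
    p = w / np.sum(w)          # softmax weights
    s = float(np.sum(x * p))   # S_alpha(x)
    return p * (1.0 + alpha * (x - s))

x = [1.0, 2.0, 3.0]
print(smoothmax(x, 100.0))   # ~3.0: approaches max as alpha -> +infinity
print(smoothmax(x, 0.0))     # 2.0: arithmetic mean at alpha = 0
print(smoothmax(x, -100.0))  # ~1.0: approaches min as alpha -> -infinity
```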


LogSumExp

Another smooth maximum is LogSumExp:

: \mathrm{LSE}_\alpha(x_1, \ldots, x_n) = \frac{1}{\alpha}\log\left( \exp(\alpha x_1) + \cdots + \exp( \alpha x_n) \right)

This can also be normalized if the x_i are all non-negative, yielding a function with domain [0,\infty)^n and range [0, \infty):

: g(x_1, \ldots, x_n) = \log\left( \exp(x_1) + \cdots + \exp(x_n) - (n - 1) \right)

The (n - 1) term corrects for the fact that \exp(0) = 1 by canceling out all but one zero exponential, and \log 1 = 0 if all x_i are zero.
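A minimal Python sketch of \mathrm{LSE}_\alpha with the standard max-shift for numerical stability (the name lse_max is mine; for the \alpha = 1 case SciPy provides scipy.special.logsumexp):

```python
import numpy as np

def lse_max(x, alpha=1.0):
    """LSE_alpha(x) = (1/alpha) * log(exp(alpha*x_1) + ... + exp(alpha*x_n))."""
    z = alpha * np.asarray(x, dtype=float)
    m = z.max()
    # log(sum exp(z)) = m + log(sum exp(z - m)); the shift prevents overflow.
    return float((m + np.log(np.sum(np.exp(z - m)))) / alpha)

x = [1.0, 2.0, 3.0]
print(lse_max(x, 1.0))    # ~3.408: overestimates max by at most log(n)/alpha
print(lse_max(x, 100.0))  # ~3.0
```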


p-Norm

Another smooth maximum is the p-norm:

: \|(x_1, \ldots, x_n)\|_p = \left( |x_1|^p + \cdots + |x_n|^p \right)^{1/p}

which converges to \|(x_1, \ldots, x_n)\|_\infty = \max_i |x_i| as p \to \infty.

An advantage of the p-norm is that it is a norm. As such it is "scale invariant" (homogeneous): \|(\lambda x_1, \ldots, \lambda x_n)\|_p = |\lambda| \cdot \|(x_1, \ldots, x_n)\|_p, and it satisfies the triangle inequality.
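A minimal Python sketch (the name p_norm is mine) that factors out the largest absolute entry so that raising to a large p does not overflow:

```python
import numpy as np

def p_norm(x, p):
    """||x||_p = (|x_1|^p + ... + |x_n|^p)^(1/p); tends to max_i |x_i| as p grows."""
    a = np.abs(np.asarray(x, dtype=float))
    m = a.max()
    if m == 0.0:
        return 0.0
    # Factor out the largest entry: ||x||_p = m * ||x / m||_p, so (a/m)**p <= 1.
    return float(m * np.sum((a / m) ** p) ** (1.0 / p))

x = [1.0, -2.0, 3.0]
for p in (2, 10, 100):
    print(p, p_norm(x, p))  # decreases toward max_i |x_i| = 3 as p increases
```

Since the p-norm approximates \max_i |x_i|, it agrees with the maximum itself only when the inputs are non-negative.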


Other choices of smoothing function

For two values, a smooth maximum can be built from a smooth approximation to the absolute value:

: \max_\varepsilon(a, b) = \frac{a + b + |a - b|_\varepsilon}{2} = \frac{a + b + \sqrt{(a - b)^2 + \varepsilon}}{2}

where |x|_\varepsilon = \sqrt{x^2 + \varepsilon} and \varepsilon \geq 0 is a parameter. As \varepsilon \to 0, |\cdot|_\varepsilon \to |\cdot| and thus \max_\varepsilon \to \max.
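A sketch of this two-argument smooth maximum (the names smooth_abs and smooth_max2 are mine):

```python
import math

def smooth_abs(x, eps):
    """|x|_eps = sqrt(x^2 + eps), a smooth approximation of |x| for eps > 0."""
    return math.sqrt(x * x + eps)

def smooth_max2(a, b, eps):
    """max_eps(a, b) = (a + b + |a - b|_eps) / 2."""
    return 0.5 * (a + b + smooth_abs(a - b, eps))

for eps in (1.0, 1e-2, 1e-6):
    print(eps, smooth_max2(1.0, 3.0, eps))  # tends to max(1, 3) = 3 as eps -> 0
```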


See also

* LogSumExp
* Softmax function
* Generalized mean

