Global optimization is a branch of

applied mathematics Applied mathematics is the application of mathematical methods by different fields such as physics, engineering, medicine, biology, finance, business, computer science, and industry. Thus, applied mathematics is a combination of mathematical ...

and

numerical analysis Numerical analysis is the study of algorithms that use numerical approximation (as opposed to symbolic manipulations) for the problems of mathematical analysis (as distinguished from discrete mathematics). It is the study of numerical methods t ...

that attempts to find the global minima or maxima of a function or a set of functions on a given set. It is usually described as a minimization problem because the maximization of the real-valued function

g(x)

is equivalent to the minimization of the function

f(x):=(-1)\cdot g(x)

. Given a possibly nonlinear and non-convex continuous function

f:\Omega\subset\mathbb^n\to\mathbb

with the global minima

f^*

and the set of all global minimizers

X^*

\Omega

, the standard minimization problem can be given as :

\min_f(x),

that is, finding

f^*

and a global minimizer in

X^*

; where

\Omega

is a (not necessarily convex) compact set defined by inequalities

g_i(x)\geqslant0, i=1,\ldots,r

. Global optimization is distinguished from local optimization by its focus on finding the minimum or maximum over the given set, as opposed to finding ''local'' minima or maxima. Finding an arbitrary local minimum is relatively straightforward by using classical ''local optimization'' methods. Finding the global minimum of a function is far more difficult: analytical methods are frequently not applicable, and the use of numerical solution strategies often leads to very hard challenges.

General theory

A recent approach to the global optimization problem is via minima distribution . In this work, a relationship between any continuous function

f

on a compact set

\Omega\subset\mathbb^n

and its global minima

f^*

has been strictly established. As a typical case, it follows that :

\lim_\int_\Omega f(x)m^(x) \, \mathrmx=f^*,~~\textrm~~m^(x)=\frac;

meanwhile, :

\lim_m^(x)
=\left\{\begin{array}{cl}
  \frac{1}{\mu(X^*)}, & x\in X^*, \\
  0, & x\in\Omega-X^*,
\end{array}\right.

where

\mu(X^*)

is the

n

-dimensional Lebesgue measure of the set of minimizers

X^*\in\Omega

. And if

f

is not a constant on

\Omega

, the monotonic relationship :

\int_\Omega f(x)m^{(k)}(x)\,\mathrm{d}x>
\int_\Omega f(x)m^{(k+\Delta k)}(x)\,\mathrm{d}x>f^*

holds for all

k\in\mathbb{R}

and

\Delta k>0

, which implies a series of monotonous containment relationships, and one of them is, for example, :

\Omega\supset D_f^{(k)}\supset D_f^{(k+\Delta k)}\supset X^*, \text{ where } D_f^{(k)}=\left\{ x \in \Omega : f(x)\leqslant \int_\Omega f(t)m^{(k)}(t) \, \mathrm{d}t\right\}.

And we define a minima distribution to be a weak limit

m_{f,\Omega}

such that the identity :

\int_\Omega m_{f,\Omega}(x)\varphi(x) \, \mathrm{d}x = \lim_{k\to\infty} \int_\Omega m^{(k)}(x) \varphi(x) \, \mathrm{d}x

holds for every smooth function

\varphi

with compact support in

\Omega

. Here are two immediate properties of

m_{f,\Omega}

: : (1)

m_{f,\Omega}

satisfies the identity

\int_\Omega m_{f,\Omega}(x) \, \mathrm{d}x = 1

. : (2) If

f

is continuous on

\Omega

, then

f^*=\int_\Omega f(x)m_{f,\Omega}(x) \, \mathrm{d}x

. As a comparison, the well-known relationship between any differentiable convex function and its minima is strictly established by the gradient. If

f

is differentiable on a convex set

D

, then

f

is convex if and only if :

f(y)\geqslant f(x)+\nabla f(x)(y-x),~~\forall x,y\in D;

thus,

\nabla f(x^*)=0

implies that

f(y)\geqslant f(x^*)

holds for all

y\in D

, i.e.,

x^*

is a global minimizer of

f

D

Applications

Typical examples of global optimization applications include: *

Protein structure prediction Protein structure prediction is the inference of the three-dimensional structure of a protein from its amino acid sequence—that is, the prediction of its secondary and tertiary structure from primary structure. Structure prediction is differen ...

(minimize the energy/free energy function) *

Computational phylogenetics Computational phylogenetics is the application of computational algorithms, methods, and programs to phylogenetic

(e.g., minimize the number of character transformations in the tree) * Traveling salesman problem and electrical circuit design (minimize the path length) *

Chemical engineering Chemical engineering is an engineering field which deals with the study of operation and design of chemical plants as well as methods of improving production. Chemical engineers develop economical commercial processes to convert raw materials int ...

(e.g., analyzing the

Gibbs energy In thermodynamics, the Gibbs free energy (or Gibbs energy; symbol G) is a thermodynamic potential that can be used to calculate the maximum amount of work that may be performed by a thermodynamically closed system at constant temperature and p ...

) * Safety verification,

safety engineering Safety engineering is an engineering discipline which assures that engineered systems provide acceptable levels of safety. It is strongly related to industrial engineering/systems engineering, and the subset system safety engineering. Safety eng ...

(e.g., of mechanical structures, buildings) *

Worst-case analysis The worst-case analysis regulation40 CFR 1502.22 was promulgated in 1979 by the US Council on Environmental Quality (CEQ). The regulation is one of many implementing the National Environmental Policy Act of 1969 and it sets out the formal proced ...

* Mathematical problems (e.g., the

Kepler conjecture The Kepler conjecture, named after the 17th-century mathematician and astronomer Johannes Kepler, is a mathematical theorem about sphere packing in three-dimensional Euclidean space. It states that no arrangement of equally sized spheres filling ...

) * Object packing (configuration design) problems * The starting point of several

molecular dynamics Molecular dynamics (MD) is a computer simulation method for analyzing the physical movements of atoms and molecules. The atoms and molecules are allowed to interact for a fixed period of time, giving a view of the dynamic "evolution" of the s ...

simulations consists of an initial optimization of the energy of the system to be simulated. *

Spin glass In condensed matter physics, a spin glass is a magnetic state characterized by randomness, besides cooperative behavior in freezing of spins at a temperature called 'freezing temperature' ''Tf''. In ferromagnetic solids, component atoms' magne ...

es * Calibration of radio propagation models and of many other models in the sciences and engineering * Curve fitting like

non-linear least squares Non-linear least squares is the form of least squares analysis used to fit a set of ''m'' observations with a model that is non-linear in ''n'' unknown parameters (''m'' ≥ ''n''). It is used in some forms of nonlinear regression. The ...

analysis and other generalizations, used in fitting model parameters to experimental data in chemistry, physics, biology, economics, finance, medicine, astronomy, engineering. *

IMRT Radiation therapy or radiotherapy, often abbreviated RT, RTx, or XRT, is a therapy using ionizing radiation, generally provided as part of cancer treatment to control or kill malignant cells and normally delivered by a linear accelerator. Radiat ...

radiation therapy planning

Deterministic methods

The most successful general exact strategies are:

Inner and outer approximation

In both of these strategies, the set over which a function is to be optimized is approximated by polyhedra. In inner approximation, the polyhedra are contained in the set, while in outer approximation, the polyhedra contain the set.

Cutting-plane methods

The cutting-plane method is an umbrella term for optimization methods which iteratively refine a

feasible set In mathematical optimization, a feasible region, feasible set, search space, or solution space is the set of all possible points (sets of values of the choice variables) of an optimization problem that satisfy the problem's constraints, potent ...

or objective function by means of linear inequalities, termed ''cuts''. Such procedures are popularly used to find

integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign (−1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language o ...

solutions to

mixed integer linear programming Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements are represented by linear relationships. Linear programming is ...

(MILP) problems, as well as to solve general, not necessarily differentiable

convex optimization Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets (or, equivalently, maximizing concave functions over convex sets). Many classes of convex optimization probl ...

problems. The use of cutting planes to solve MILP was introduced by

Ralph E. Gomory Ralph Edward Gomory (born May 7, 1929) is an American applied mathematician and executive. Gomory worked at IBM as a researcher and later as an executive. During that time, his research led to the creation of new areas of applied mathematics. ...

and Václav Chvátal.

Branch and bound methods

Branch and bound (BB or B&B) is an

algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...

design paradigm for

discrete Discrete may refer to: *Discrete particle or quantum in physics, for example in quantum theory *Discrete device, an electronic component with just one circuit element, either passive or active, other than an integrated circuit *Discrete group, a g ...

and

combinatorial optimization Combinatorial optimization is a subfield of mathematical optimization that consists of finding an optimal object from a finite set of objects, where the set of feasible solutions is discrete or can be reduced to a discrete set. Typical combin ...

problems. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of

state space search State space search is a process used in the field of computer science, including artificial intelligence (AI), in which successive configurations or ''states'' of an instance are considered, with the intention of finding a ''goal state'' with the ...

: the set of candidate solutions is thought of as forming a

rooted tree In graph theory, a tree is an undirected graph in which any two vertices are connected by ''exactly one'' path, or equivalently a connected acyclic undirected graph. A forest is an undirected graph in which any two vertices are connected by '' ...

with the full set at the root. The algorithm explores ''branches'' of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated ''bounds'' on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm.

Interval methods

Interval arithmetic, interval mathematics, interval analysis, or interval computation, is a method developed by mathematicians since the 1950s and 1960s as an approach to putting bounds on

rounding error A roundoff error, also called rounding error, is the difference between the result produced by a given algorithm using exact arithmetic and the result produced by the same algorithm using finite-precision, rounded arithmetic. Rounding errors are d ...

s and

measurement error Observational error (or measurement error) is the difference between a measured value of a quantity and its true value.Dodge, Y. (2003) ''The Oxford Dictionary of Statistical Terms'', OUP. In statistics, an error is not necessarily a "mistake". ...

s in mathematical computation and thus developing

numerical methods Numerical analysis is the study of algorithms that use numerical approximation (as opposed to symbolic manipulations) for the problems of mathematical analysis (as distinguished from discrete mathematics). It is the study of numerical methods th ...

that yield reliable results. Interval arithmetic helps find reliable and guaranteed solutions to equations and optimization problems.

Methods based on real algebraic geometry

Real algebra is the part of algebra which is relevant to real algebraic (and semialgebraic) geometry. It is mostly concerned with the study of

ordered field In mathematics, an ordered field is a field together with a total ordering of its elements that is compatible with the field operations. The basic example of an ordered field is the field of real numbers, and every Dedekind-complete ordered field ...

s and

ordered ring In abstract algebra, an ordered ring is a (usually commutative) ring ''R'' with a total order ≤ such that for all ''a'', ''b'', and ''c'' in ''R'': * if ''a'' ≤ ''b'' then ''a'' + ''c'' ≤ ''b'' + ''c''. * if 0 ≤ ''a'' and 0 ≤ ''b'' th ...

s (in particular

real closed field In mathematics, a real closed field is a field ''F'' that has the same first-order properties as the field of real numbers. Some examples are the field of real numbers, the field of real algebraic numbers, and the field of hyperreal numbers. D ...

s) and their applications to the study of

positive polynomial In mathematics, a positive polynomial on a particular set is a polynomial whose values are positive on that set. Let ''p'' be a polynomial in ''n'' variables with real coefficients and let ''S'' be a subset of the ''n''-dimensional Euclidean s ...

s and sums-of-squares of polynomials. It can be used in

Stochastic methods

Several exact or inexact Monte-Carlo-based algorithms exist:

Direct Monte-Carlo sampling

In this method, random simulations are used to find an approximate solution. Example: The

traveling salesman problem The travelling salesman problem (also called the travelling salesperson problem or TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each cit ...

is what is called a conventional optimization problem. That is, all the facts (distances between each destination point) needed to determine the optimal path to follow are known with certainty and the goal is to run through the possible travel choices to come up with the one with the lowest total distance. However, let's assume that instead of wanting to minimize the total distance traveled to visit each desired destination, we wanted to minimize the total time needed to reach each destination. This goes beyond conventional optimization since travel time is inherently uncertain (traffic jams, time of day, etc.). As a result, to determine our optimal path we would want to use simulation - optimization to first understand the range of potential times it could take to go from one point to another (represented by a probability distribution in this case rather than a specific distance) and then optimize our travel decisions to identify the best path to follow taking that uncertainty into account.

Stochastic tunneling

Stochastic tunneling (STUN) is an approach to global optimization based on the

Monte Carlo method Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be determin ...

- sampling of the function to be objectively minimized in which the function is nonlinearly transformed to allow for easier tunneling among regions containing function minima. Easier tunneling allows for faster exploration of sample space and faster convergence to a good solution.

Parallel tempering

Parallel tempering, also known as replica exchange MCMC sampling, is a

simulation A simulation is the imitation of the operation of a real-world process or system over time. Simulations require the use of models; the model represents the key characteristics or behaviors of the selected system or process, whereas the ...

method aimed at improving the dynamic properties of

simulations of physical systems, and of

Markov chain Monte Carlo In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a ...

(MCMC) sampling methods more generally. The replica exchange method was originally devised by Swendsen, then extended by Geyer and later developed, among others, by

Giorgio Parisi Giorgio Parisi (born 4 August 1948) is an Italian theoretical physicist, whose research has focused on quantum field theory, statistical mechanics and complex systems. His best known contributions are the QCD evolution equations for parton dens ...

., Sugita and Okamoto formulated a

version of parallel tempering: this is usually known as replica-exchange molecular dynamics or REMD. Essentially, one runs ''N'' copies of the system, randomly initialized, at different temperatures. Then, based on the Metropolis criterion one exchanges configurations at different temperatures. The idea of this method is to make configurations at high temperatures available to the simulations at low temperatures and vice versa. This results in a very robust ensemble which is able to sample both low and high energy configurations. In this way, thermodynamical properties such as the specific heat, which is in general not well computed in the canonical ensemble, can be computed with great precision.

Heuristics and metaheuristics

Other approaches include heuristic strategies to search the search space in a more or less intelligent way, including: *

Ant colony optimization In computer science and operations research, the ant colony optimization algorithm (ACO) is a probabilistic technique for solving computational problems which can be reduced to finding good paths through graphs. Artificial ants stand for multi ...

(ACO) *

Simulated annealing Simulated annealing (SA) is a probabilistic technique for approximating the global optimum of a given function. Specifically, it is a metaheuristic to approximate global optimization in a large search space for an optimization problem. It ...

, a generic probabilistic metaheuristic *

Tabu search Tabu search is a metaheuristic search method employing local search methods used for mathematical optimization. It was created by Fred W. Glover in 1986 and formalized in 1989. Local (neighborhood) searches take a potential solution to a prob ...

, an extension of local search capable of escaping from local minima *

Evolutionary algorithm In computational intelligence (CI), an evolutionary algorithm (EA) is a subset of evolutionary computation, a generic population-based metaheuristic optimization algorithm. An EA uses mechanisms inspired by biological evolution, such as reprodu ...

s (e.g.,

genetic algorithms In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to gen ...

and

evolution strategies In computer science, an evolution strategy (ES) is an optimization technique based on ideas of evolution. It belongs to the general class of evolutionary computation or artificial evolution methodologies. History The 'evolution strategy' optimizat ...

) *

Differential evolution In evolutionary computation, differential evolution (DE) is a method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Such methods are commonly known as metaheuristics ...

, a method that optimizes a problem by iteratively trying to improve a

candidate solution In mathematical optimization, a feasible region, feasible set, search space, or solution space is the set of all possible points (sets of values of the choice variables) of an optimization problem that satisfy the problem's constraints, potent ...

with regard to a given measure of quality * Swarm-based optimization algorithms (e.g., particle swarm optimization, social cognitive optimization, multi-swarm optimization and

ant colony optimization In computer science and operations research, the ant colony optimization algorithm (ACO) is a probabilistic technique for solving computational problems which can be reduced to finding good paths through graphs. Artificial ants stand for multi ...

) * Memetic algorithms, combining global and local search strategies * Reactive search optimization (i.e. integration of sub-symbolic machine learning techniques into search heuristics) *

Graduated optimization Graduated optimization is a global optimization technique that attempts to solve a difficult optimization problem by initially solving a greatly simplified problem, and progressively transforming that problem (while optimizing) until it is equivalen ...

, a technique that attempts to solve a difficult optimization problem by initially solving a greatly simplified problem, and progressively transforming that problem (while optimizing) until it is equivalent to the difficult optimization problem.Hossein Mobahi, John W. Fisher III.
On the Link Between Gaussian Homotopy Continuation and Convex Envelopes
In Lecture Notes in Computer Science (EMMCVPR 2015), Springer, 2015.

Response surface methodology-based approaches

* IOSO Indirect Optimization based on Self-Organization * Bayesian optimization, a sequential design strategy for global

optimization Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfi ...

of black-box functions using

Bayesian statistics Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a ''degree of belief'' in an event. The degree of belief may be based on prior knowledge about the event, ...

Jonas Mockus (2013)
Bayesian approach to global optimization: theory and applications
Kluwer Academic.

Footnotes

References

Deterministic global optimization: * R. Horst, H. Tuy
Global Optimization: Deterministic Approaches
Springer, 1996. * R. Horst, P.M. Pardalos and N.V. Thoai
Introduction to Global Optimization
Second Edition. Kluwer Academic Publishers, 2000.
A.Neumaier, Complete Search in Continuous Global Optimization and Constraint Satisfaction, pp. 271–369 in: Acta Numerica 2004 (A. Iserles, ed.), Cambridge University Press 2004.
* M. Mongeau, H. Karsenty, V. Rouzé and J.-B. Hiriart-Urruty
Comparison of public-domain software for black box global optimization
Optimization Methods & Software 13(3), pp. 203–226, 2000. * J.D. Pintér
Global Optimization in Action - Continuous and Lipschitz Optimization: Algorithms, Implementations and Applications
Kluwer Academic Publishers, Dordrecht, 1996. Now distributed by Springer Science and Business Media, New York. This book also discusses stochastic global optimization methods. * L. Jaulin, M. Kieffer, O. Didrit, E. Walter (2001). Applied Interval Analysis. Berlin: Springer. * E.R. Hansen (1992), Global Optimization using Interval Analysis, Marcel Dekker, New York. For simulated annealing: * For reactive search optimization: *

Roberto Battiti Roberto Battiti (born 1961) is an Italian computer scientist, Professor of computer science at the University of Trento, director of the LIONlab (Learning and Intelligent Optimization), and deputy director of the DISI Department (Information Engi ...

, M. Brunato and F. Mascia, Reactive Search and Intelligent Optimization, Operations Research/Computer Science Interfaces Series, Vol. 45, Springer, November 2008. For stochastic methods: * A. Zhigljavsky. Theory of Global Random Search. Mathematics and its applications. Kluwer Academic Publishers. 1991. * * * For parallel tempering: * For continuation methods: * Zhijun Wu.
The effective energy transformation scheme as a special continuation approach to global optimization with application to molecular conformation
Technical Report, Argonne National Lab., IL (United States), November 1996. For general considerations on the dimensionality of the domain of definition of the objective function: * For strategies allowing one to compare deterministic and stochastic global optimization methods

External links

Introduction to global optimization by L. LibertiFree e-book by Thomas Weise
{{Authority control Deterministic global optimization