Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfields:

discrete optimization Discrete optimization is a branch of optimization in applied mathematics and computer science. As opposed to continuous optimization, some or all of the variables used in a discrete optimization problem are restricted to be discrete variables&mda ...

and

continuous optimization Continuous optimization is a branch of optimization in applied mathematics. As opposed to discrete optimization, the variables used in the objective function are required to be continuous variables—that is, to be chosen from a set of ...

. Optimization problems arise in all quantitative disciplines from

computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...

and

engineering Engineering is the practice of using natural science, mathematics, and the engineering design process to Problem solving#Engineering, solve problems within technology, increase efficiency and productivity, and improve Systems engineering, s ...

operations research Operations research () (U.S. Air Force Specialty Code: Operations Analysis), often shortened to the initialism OR, is a branch of applied mathematics that deals with the development and application of analytical methods to improve management and ...

and

economics Economics () is a behavioral science that studies the Production (economics), production, distribution (economics), distribution, and Consumption (economics), consumption of goods and services. Economics focuses on the behaviour and interac ...

, and the development of solution methods has been of interest in

mathematics Mathematics is a field of study that discovers and organizes methods, Mathematical theory, theories and theorems that are developed and Mathematical proof, proved for the needs of empirical sciences and mathematics itself. There are many ar ...

for centuries. In the more general approach, an

optimization problem In mathematics, engineering, computer science and economics Economics () is a behavioral science that studies the Production (economics), production, distribution (economics), distribution, and Consumption (economics), consumption of goo ...

consists of maximizing or minimizing a

real function In mathematical analysis, and applications in geometry, applied mathematics, engineering, and natural sciences, a function of a real variable is a function whose domain is the real numbers \mathbb, or a subset of \mathbb that contains an inter ...

by systematically choosing input values from within an allowed set and computing the value of the function. The generalization of optimization theory and techniques to other formulations constitutes a large area of

applied mathematics Applied mathematics is the application of mathematics, mathematical methods by different fields such as physics, engineering, medicine, biology, finance, business, computer science, and Industrial sector, industry. Thus, applied mathematics is a ...

Optimization problems

Optimization problems can be divided into two categories, depending on whether the variables are continuous or

discrete Discrete may refer to: *Discrete particle or quantum in physics, for example in quantum theory * Discrete device, an electronic component with just one circuit element, either passive or active, other than an integrated circuit * Discrete group, ...

: * An optimization problem with discrete variables is known as a ''

'', in which an

object Object may refer to: General meanings * Object (philosophy), a thing, being, or concept ** Object (abstract), an object which does not exist at any particular time or place ** Physical object, an identifiable collection of matter * Goal, an a ...

such as an

integer An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...

permutation In mathematics, a permutation of a set can mean one of two different things: * an arrangement of its members in a sequence or linear order, or * the act or process of changing the linear order of an ordered set. An example of the first mean ...

graph Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discret ...

must be found from a

countable set In mathematics, a set is countable if either it is finite or it can be made in one to one correspondence with the set of natural numbers. Equivalently, a set is ''countable'' if there exists an injective function from it into the natural numbe ...

. * A problem with continuous variables is known as a ''

'', in which optimal arguments from a continuous set must be found. They can include constrained problems and multimodal problems. An optimization problem can be represented in the following way: :''Given:'' a function from some

set Set, The Set, SET or SETS may refer to: Science, technology, and mathematics Mathematics *Set (mathematics), a collection of elements *Category of sets, the category whose objects and morphisms are sets and total functions, respectively Electro ...

to the

real number In mathematics, a real number is a number that can be used to measure a continuous one- dimensional quantity such as a duration or temperature. Here, ''continuous'' means that pairs of values can have arbitrarily small differences. Every re ...

s :''Sought:'' an element such that for all ("minimization") or such that for all ("maximization"). Such a formulation is called an

or a mathematical programming problem (a term not directly related to

computer programming Computer programming or coding is the composition of sequences of instructions, called computer program, programs, that computers can follow to perform tasks. It involves designing and implementing algorithms, step-by-step specifications of proc ...

, but still in use for example in

linear programming Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements and objective are represented by linear function#As a polynomia ...

– see

History History is the systematic study of the past, focusing primarily on the Human history, human past. As an academic discipline, it analyses and interprets evidence to construct narratives about what happened and explain why it happened. Some t ...

below). Many real-world and theoretical problems may be modeled in this general framework. Since the following is valid: :

f(\mathbf_)\geq f(\mathbf) \Leftrightarrow -f(\mathbf_)\leq -f(\mathbf),

it suffices to solve only minimization problems. However, the opposite perspective of considering only maximization problems would be valid, too. Problems formulated using this technique in the fields of

physics Physics is the scientific study of matter, its Elementary particle, fundamental constituents, its motion and behavior through space and time, and the related entities of energy and force. "Physical science is that department of knowledge whi ...

may refer to the technique as ''

energy Energy () is the physical quantity, quantitative physical property, property that is transferred to a physical body, body or to a physical system, recognizable in the performance of Work (thermodynamics), work and in the form of heat and l ...

minimization'', speaking of the value of the function as representing the energy of the

system A system is a group of interacting or interrelated elements that act according to a set of rules to form a unified whole. A system, surrounded and influenced by its open system (systems theory), environment, is described by its boundaries, str ...

being modeled. In

machine learning Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of Computational statistics, statistical algorithms that can learn from data and generalise to unseen data, and thus perform Task ( ...

, it is always necessary to continuously evaluate the quality of a data model by using a cost function where a minimum implies a set of possibly optimal parameters with an optimal (lowest) error. Typically, is some

subset In mathematics, a Set (mathematics), set ''A'' is a subset of a set ''B'' if all Element (mathematics), elements of ''A'' are also elements of ''B''; ''B'' is then a superset of ''A''. It is possible for ''A'' and ''B'' to be equal; if they a ...

of the

Euclidean space Euclidean space is the fundamental space of geometry, intended to represent physical space. Originally, in Euclid's ''Elements'', it was the three-dimensional space of Euclidean geometry, but in modern mathematics there are ''Euclidean spaces ...

\mathbb R^n

, often specified by a set of '' constraints'', equalities or inequalities that the members of have to satisfy. The domain of is called the ''search space'' or the ''choice set'', while the elements of are called ''

candidate solution In mathematical optimization and computer science, a feasible region, feasible set, or solution space is the set of all possible points (sets of values of the choice variables) of an optimization problem that satisfy the problem's constraints, ...

s'' or ''feasible solutions''. The function is variously called an ''objective function'', ''criterion function'', '' loss function'', ''cost function'' (minimization), ''utility function'' or ''fitness function'' (maximization), or, in certain fields, an ''energy function'' or ''energy functional''. A feasible solution that minimizes (or maximizes) the objective function is called an ''optimal solution''. In mathematics, conventional optimization problems are usually stated in terms of minimization. A ''local minimum'' is defined as an element for which there exists some such that :

\forall\mathbf\in A \; \text \;\left\Vert\mathbf-\mathbf^\right\Vert\leq\delta,\,

the expression holds; that is to say, on some region around all of the function values are greater than or equal to the value at that element. Local maxima are defined similarly. While a local minimum is at least as good as any nearby elements, a global minimum is at least as good as every feasible element. Generally, unless the objective function is

convex Convex or convexity may refer to: Science and technology * Convex lens, in optics Mathematics * Convex set, containing the whole line segment that joins points ** Convex polygon, a polygon which encloses a convex set of points ** Convex polytop ...

in a minimization problem, there may be several local minima. In a convex problem, if there is a local minimum that is interior (not on the edge of the set of feasible elements), it is also the global minimum, but a nonconvex problem may have more than one local minimum not all of which need be global minima. A large number of algorithms proposed for solving the nonconvex problems – including the majority of commercially available solvers – are not capable of making a distinction between locally optimal solutions and globally optimal solutions, and will treat the former as actual solutions to the original problem.

Global optimization Global optimization is a branch of operations research, applied mathematics, and numerical analysis that attempts to find the global minimum or maximum of a function or a set of functions on a given set. It is usually described as a minimization ...

is the branch of

and

numerical analysis Numerical analysis is the study of algorithms that use numerical approximation (as opposed to symbolic computation, symbolic manipulations) for the problems of mathematical analysis (as distinguished from discrete mathematics). It is the study of ...

that is concerned with the development of deterministic algorithms that are capable of guaranteeing convergence in finite time to the actual optimal solution of a nonconvex problem.

Notation

Optimization problems are often expressed with special notation. Here are some examples:

Minimum and maximum value of a function

Consider the following notation: :

\min_\; \left(x^2 + 1\right)

This denotes the minimum value of the objective function , when choosing from the set of

\mathbb

. The minimum value in this case is 1, occurring at . Similarly, the notation :

\max_\; 2x

asks for the maximum value of the objective function , where may be any real number. In this case, there is no such maximum as the objective function is unbounded, so the answer is "

infinity Infinity is something which is boundless, endless, or larger than any natural number. It is denoted by \infty, called the infinity symbol. From the time of the Ancient Greek mathematics, ancient Greeks, the Infinity (philosophy), philosophic ...

" or "

undefined Undefined may refer to: Mathematics *Undefined (mathematics), with several related meanings **Indeterminate form, in calculus Computing *Undefined behavior, computer code whose behavior is not specified under certain conditions *Undefined valu ...

Optimal input arguments

Consider the following notation: :

\underset \; x^2 + 1,

or equivalently :

\underset \; x^2 + 1, \; \text \; x\in(-\infty,-1].

This represents the value (or values) of the Argument of a function, argument in the interval that minimizes (or minimize) the objective function (the actual minimum value of that function is not what the problem asks for). In this case, the answer is , since is infeasible, that is, it does not belong to the feasible set. Similarly, :

\underset \; x\cos y,

or equivalently :

\; y\in\mathbb R,

represents the pair (or pairs) that maximizes (or maximize) the value of the objective function , with the added constraint that lie in the interval (again, the actual maximum value of the expression does not matter). In this case, the solutions are the pairs of the form and , where ranges over all

s. Operators and are sometimes also written as and , and stand for ''argument of the minimum'' and ''argument of the maximum''.

History

Fermat Pierre de Fermat (; ; 17 August 1601 – 12 January 1665) was a French mathematician who is given credit for early developments that led to infinitesimal calculus, including his technique of adequality. In particular, he is recognized for his d ...

and

Lagrange Joseph-Louis Lagrange (born Giuseppe Luigi LagrangiaNewton and

Gauss Johann Carl Friedrich Gauss (; ; ; 30 April 177723 February 1855) was a German mathematician, astronomer, Geodesy, geodesist, and physicist, who contributed to many fields in mathematics and science. He was director of the Göttingen Observat ...

proposed iterative methods for moving towards an optimum. The term "

" for certain optimization cases was due to George B. Dantzig, although much of the theory had been introduced by

Leonid Kantorovich Leonid Vitalyevich Kantorovich (, ; 19 January 19127 April 1986) was a Soviet mathematician and economist, known for his theory and development of techniques for the optimal allocation of resources. He is regarded as the founder of linear programm ...

in 1939. (''Programming'' in this context does not refer to

, but comes from the use of ''program'' by the

United States The United States of America (USA), also known as the United States (U.S.) or America, is a country primarily located in North America. It is a federal republic of 50 U.S. state, states and a federal capital district, Washington, D.C. The 48 ...

military to refer to proposed training and

logistics Logistics is the part of supply chain management that deals with the efficient forward and reverse flow of goods, services, and related information from the point of origin to the Consumption (economics), point of consumption according to the ...

schedules, which were the problems Dantzig studied at that time.) Dantzig published the

Simplex algorithm In mathematical optimization, Dantzig's simplex algorithm (or simplex method) is a popular algorithm for linear programming. The name of the algorithm is derived from the concept of a simplex and was suggested by T. S. Motzkin. Simplices are ...

in 1947, and also

John von Neumann John von Neumann ( ; ; December 28, 1903 – February 8, 1957) was a Hungarian and American mathematician, physicist, computer scientist and engineer. Von Neumann had perhaps the widest coverage of any mathematician of his time, in ...

and other researchers worked on the theoretical aspects of linear programming (like the theory of duality) around the same time. Other notable researchers in mathematical optimization include the following: *

Richard Bellman Richard Ernest Bellman (August 26, 1920 – March 19, 1984) was an American applied mathematician, who introduced dynamic programming in 1953, and made important contributions in other fields of mathematics, such as biomathematics. He foun ...

* Dimitri Bertsekas * Michel Bierlaire * Stephen P. Boyd * Roger Fletcher * Martin Grötschel * Ronald A. Howard * Fritz John *

Narendra Karmarkar Narendra Krishna Karmarkar (born 1956) is an Indian mathematician. He developed Karmarkar's algorithm. He is listed as an ISI highly cited researcher. He invented one of the first probably polynomial time algorithms for linear programming, w ...

* William Karush * Leonid Khachiyan *

Bernard Koopman Bernard Osgood Koopman (January 19, 1900 – August 18, 1981) was a French-born American mathematician, known for his work in ergodic theory, the foundations of probability, statistical theory and operations research. Education and work ...

* Harold Kuhn *

László Lovász László Lovász (; born March 9, 1948) is a Hungarian mathematician and professor emeritus at Eötvös Loránd University, best known for his work in combinatorics, for which he was awarded the 2021 Abel Prize jointly with Avi Wigderson. He ...

* David Luenberger * Arkadi Nemirovski * Yurii Nesterov * Lev Pontryagin * R. Tyrrell Rockafellar * Naum Z. Shor * Albert Tucker

Major subfields

* Convex programming studies the case when the objective function is

(minimization) or concave (maximization) and the constraint set is

. This can be viewed as a particular case of nonlinear programming or as generalization of linear or convex quadratic programming. **

Linear programming Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements and objective are represented by linear function#As a polynomia ...

(LP), a type of convex programming, studies the case in which the objective function ''f'' is linear and the constraints are specified using only linear equalities and inequalities. Such a constraint set is called a

polyhedron In geometry, a polyhedron (: polyhedra or polyhedrons; ) is a three-dimensional figure with flat polygonal Face (geometry), faces, straight Edge (geometry), edges and sharp corners or Vertex (geometry), vertices. The term "polyhedron" may refer ...

or a

polytope In elementary geometry, a polytope is a geometric object with flat sides ('' faces''). Polytopes are the generalization of three-dimensional polyhedra to any number of dimensions. Polytopes may exist in any general number of dimensions as an ...

if it is bounded. **

Second-order cone programming A second-order cone program (SOCP) is a convex optimization problem of the form :minimize \ f^T x \ :subject to ::\lVert A_i x + b_i \rVert_2 \leq c_i^T x + d_i,\quad i = 1,\dots,m ::Fx = g \ where the problem parameters are f \in \mathbb^n, ...

(SOCP) is a convex program, and includes certain types of quadratic programs. ** Semidefinite programming (SDP) is a subfield of convex optimization where the underlying variables are semidefinite

matrices Matrix (: matrices or matrixes) or MATRIX may refer to: Science and mathematics * Matrix (mathematics), a rectangular array of numbers, symbols or expressions * Matrix (logic), part of a formula in prenex normal form * Matrix (biology), the ...

. It is a generalization of linear and convex quadratic programming. ** Conic programming is a general form of convex programming. LP, SOCP and SDP can all be viewed as conic programs with the appropriate type of cone. ** Geometric programming is a technique whereby objective and inequality constraints expressed as posynomials and equality constraints as monomials can be transformed into a convex program. *

Integer programming An integer programming problem is a mathematical optimization or feasibility program in which some or all of the variables are restricted to be integers. In many settings the term refers to integer linear programming (ILP), in which the objective ...

studies linear programs in which some or all variables are constrained to take on

values. This is not convex, and in general much more difficult than regular linear programming. *

Quadratic programming Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize (minimize or maximize) a multivariate quadratic function subject to linear constr ...

allows the objective function to have quadratic terms, while the feasible set must be specified with linear equalities and inequalities. For specific forms of the quadratic term, this is a type of convex programming. * Fractional programming studies optimization of ratios of two nonlinear functions. The special class of concave fractional programs can be transformed to a convex optimization problem. * Nonlinear programming studies the general case in which the objective function or the constraints or both contain nonlinear parts. This may or may not be a convex program. In general, whether the program is convex affects the difficulty of solving it. * Stochastic programming studies the case in which some of the constraints or parameters depend on

random variable A random variable (also called random quantity, aleatory variable, or stochastic variable) is a Mathematics, mathematical formalization of a quantity or object which depends on randomness, random events. The term 'random variable' in its mathema ...

s. *

Robust optimization Robust optimization is a field of mathematical optimization theory that deals with optimization problems in which a certain measure of robustness is sought against uncertainty that can be represented as deterministic variability in the value of the ...

is, like stochastic programming, an attempt to capture uncertainty in the data underlying the optimization problem. Robust optimization aims to find solutions that are valid under all possible realizations of the uncertainties defined by an uncertainty set. *

Combinatorial optimization Combinatorial optimization is a subfield of mathematical optimization that consists of finding an optimal object from a finite set of objects, where the set of feasible solutions is discrete or can be reduced to a discrete set. Typical combina ...

is concerned with problems where the set of feasible solutions is discrete or can be reduced to a

one. *

Stochastic optimization Stochastic optimization (SO) are optimization methods that generate and use random variables. For stochastic optimization problems, the objective functions or constraints are random. Stochastic optimization also include methods with random iter ...

is used with random (noisy) function measurements or random inputs in the search process. *

Infinite-dimensional optimization In certain optimization (mathematics), optimization problems the unknown optimal solution might not be a number or a vector, but rather a continuous quantity, for example a function (mathematics), function or the shape of a body. Such a problem is a ...

studies the case when the set of feasible solutions is a subset of an infinite-

dimension In physics and mathematics, the dimension of a mathematical space (or object) is informally defined as the minimum number of coordinates needed to specify any point within it. Thus, a line has a dimension of one (1D) because only one coo ...

al space, such as a space of functions. *

Heuristics A heuristic or heuristic technique (''problem solving'', '' mental shortcut'', ''rule of thumb'') is any approach to problem solving that employs a pragmatic method that is not fully optimized, perfected, or rationalized, but is nevertheless ...

and

metaheuristic In computer science and mathematical optimization, a metaheuristic is a higher-level procedure or heuristic designed to find, generate, tune, or select a heuristic (partial search algorithm) that may provide a sufficiently good solution to an op ...

s make few or no assumptions about the problem being optimized. Usually, heuristics do not guarantee that any optimal solution need be found. On the other hand, heuristics are used to find approximate solutions for many complicated optimization problems. * Constraint satisfaction studies the case in which the objective function ''f'' is constant (this is used in

artificial intelligence Artificial intelligence (AI) is the capability of computer, computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of re ...

, particularly in

automated reasoning In computer science, in particular in knowledge representation and reasoning and metalogic, the area of automated reasoning is dedicated to understanding different aspects of reasoning. The study of automated reasoning helps produce computer progr ...

). **

Constraint programming Constraint programming (CP) is a paradigm for solving combinatorial problems that draws on a wide range of techniques from artificial intelligence, computer science, and operations research. In constraint programming, users declaratively state t ...

is a programming paradigm wherein relations between variables are stated in the form of constraints. * Disjunctive programming is used where at least one constraint must be satisfied but not all. It is of particular use in scheduling. * Space mapping is a concept for modeling and optimization of an engineering system to high-fidelity (fine) model accuracy exploiting a suitable physically meaningful coarse or surrogate model. In a number of subfields, the techniques are designed primarily for optimization in dynamic contexts (that is, decision making over time): *

Calculus of variations The calculus of variations (or variational calculus) is a field of mathematical analysis that uses variations, which are small changes in Function (mathematics), functions and functional (mathematics), functionals, to find maxima and minima of f ...

is concerned with finding the best way to achieve some goal, such as finding a surface whose boundary is a specific curve, but with the least possible area. *

Optimal control Optimal control theory is a branch of control theory that deals with finding a control for a dynamical system over a period of time such that an objective function is optimized. It has numerous applications in science, engineering and operations ...

theory is a generalization of the calculus of variations which introduces control policies. * Dynamic programming is the approach to solve the

stochastic optimization Stochastic optimization (SO) are optimization methods that generate and use random variables. For stochastic optimization problems, the objective functions or constraints are random. Stochastic optimization also include methods with random iter ...

problem with stochastic, randomness, and unknown model parameters. It studies the case in which the optimization strategy is based on splitting the problem into smaller subproblems. The equation that describes the relationship between these subproblems is called the

Bellman equation A Bellman equation, named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical Optimization (mathematics), optimization method known as dynamic programming. It writes the "value" of a decision problem ...

. * Mathematical programming with equilibrium constraints is where the constraints include

variational inequalities In mathematics, a variational inequality is an inequality (mathematics), inequality involving a Functional (mathematics), functional, which has to be Inequality (mathematics)#Solving Inequalities, solved for all possible values of a given Variable ( ...

or complementarities.

Multi-objective optimization

Adding more than one objective to an optimization problem adds complexity. For example, to optimize a structural design, one would desire a design that is both light and rigid. When two objectives conflict, a trade-off must be created. There may be one lightest design, one stiffest design, and an infinite number of designs that are some compromise of weight and rigidity. The set of trade-off designs that improve upon one criterion at the expense of another is known as the Pareto set. The curve created plotting weight against stiffness of the best designs is known as the Pareto frontier. A design is judged to be "Pareto optimal" (equivalently, "Pareto efficient" or in the Pareto set) if it is not dominated by any other design: If it is worse than another design in some respects and no better in any respect, then it is dominated and is not Pareto optimal. The choice among "Pareto optimal" solutions to determine the "favorite solution" is delegated to the decision maker. In other words, defining the problem as multi-objective optimization signals that some information is missing: desirable objectives are given but combinations of them are not rated relative to each other. In some cases, the missing information can be derived by interactive sessions with the decision maker. Multi-objective optimization problems have been generalized further into

vector optimization Vector optimization is a subarea of mathematical optimization where Optimization problem, optimization problems with a vector-valued objective functions are optimized with respect to a given partial ordering and subject to certain constraints. A m ...

problems where the (partial) ordering is no longer given by the Pareto ordering.

Multi-modal or global optimization

Optimization problems are often multi-modal; that is, they possess multiple good solutions. They could all be globally good (same cost function value) or there could be a mix of globally good and locally good solutions. Obtaining all (or at least some of) the multiple solutions is the goal of a multi-modal optimizer. Classical optimization techniques due to their iterative approach do not perform satisfactorily when they are used to obtain multiple solutions, since it is not guaranteed that different solutions will be obtained even with different starting points in multiple runs of the algorithm. Common approaches to

global optimization Global optimization is a branch of operations research, applied mathematics, and numerical analysis that attempts to find the global minimum or maximum of a function or a set of functions on a given set. It is usually described as a minimization ...

problems, where multiple local extrema may be present include

evolutionary algorithm Evolutionary algorithms (EA) reproduce essential elements of the biological evolution in a computer algorithm in order to solve "difficult" problems, at least Approximation, approximately, for which no exact or satisfactory solution methods are k ...

Bayesian optimization Bayesian optimization is a sequential design strategy for global optimization of black-box functions, that does not assume any functional forms. It is usually employed to optimize expensive-to-evaluate functions. With the rise of artificial intell ...

and

simulated annealing Simulated annealing (SA) is a probabilistic technique for approximating the global optimum of a given function. Specifically, it is a metaheuristic to approximate global optimization in a large search space for an optimization problem. ...

Classification of critical points and extrema

Feasibility problem

The '' satisfiability problem'', also called the ''feasibility problem'', is just the problem of finding any feasible solution at all without regard to objective value. This can be regarded as the special case of mathematical optimization where the objective value is the same for every solution, and thus any solution is optimal. Many optimization algorithms need to start from a feasible point. One way to obtain such a point is to

relax Relax or RELAX may refer to: Albums * ''Relax'' (album), by Das Racist, 2011 * ''Relax'', by Blank & Jones, 2003 * ''Relax'', by Los Piratas, 2003 Songs * "Relax" (Deetah song), 1998 * "Relax" (Frankie Goes to Hollywood song), 1983 * "Relax ...

the feasibility conditions using a slack variable; with enough slack, any starting point is feasible. Then, minimize that slack variable until the slack is null or negative.

Existence

The

extreme value theorem In calculus, the extreme value theorem states that if a real-valued function f is continuous on the closed and bounded interval ,b/math>, then f must attain a maximum and a minimum, each at least once. That is, there exist numbers c and ...

Karl Weierstrass Karl Theodor Wilhelm Weierstrass (; ; 31 October 1815 – 19 February 1897) was a German mathematician often cited as the " father of modern analysis". Despite leaving university without a degree, he studied mathematics and trained as a school t ...

states that a continuous real-valued function on a compact set attains its maximum and minimum value. More generally, a lower semi-continuous function on a compact set attains its minimum; an upper semi-continuous function on a compact set attains its maximum point or view.

Necessary conditions for optimality

One of Fermat's theorems states that optima of unconstrained problems are found at

stationary point In mathematics, particularly in calculus, a stationary point of a differentiable function of one variable is a point on the graph of a function, graph of the function where the function's derivative is zero. Informally, it is a point where the ...

s, where the first derivative or the gradient of the objective function is zero (see

first derivative test In calculus, a derivative test uses the derivatives of a function to locate the critical points of a function and determine whether each point is a local maximum, a local minimum, or a saddle point. Derivative tests can also give information ab ...

). More generally, they may be found at critical points, where the first derivative or gradient of the objective function is zero or is undefined, or on the boundary of the choice set. An equation (or set of equations) stating that the first derivative(s) equal(s) zero at an interior optimum is called a 'first-order condition' or a set of first-order conditions. Optima of equality-constrained problems can be found by the

Lagrange multiplier In mathematical optimization, the method of Lagrange multipliers is a strategy for finding the local maxima and minima of a function (mathematics), function subject to constraint (mathematics), equation constraints (i.e., subject to the conditio ...

method. The optima of problems with equality and/or inequality constraints can be found using the '

Karush–Kuhn–Tucker conditions In mathematical optimization, the Karush–Kuhn–Tucker (KKT) conditions, also known as the Kuhn–Tucker conditions, are first derivative tests (sometimes called first-order necessary conditions) for a solution in nonlinear programming to be ...

Sufficient conditions for optimality

While the first derivative test identifies points that might be extrema, this test does not distinguish a point that is a minimum from one that is a maximum or one that is neither. When the objective function is twice differentiable, these cases can be distinguished by checking the second derivative or the matrix of second derivatives (called the

Hessian matrix In mathematics, the Hessian matrix, Hessian or (less commonly) Hesse matrix is a square matrix of second-order partial derivatives of a scalar-valued Function (mathematics), function, or scalar field. It describes the local curvature of a functio ...

) in unconstrained problems, or the matrix of second derivatives of the objective function and the constraints called the bordered Hessian in constrained problems. The conditions that distinguish maxima, or minima, from other stationary points are called 'second-order conditions' (see ' Second derivative test'). If a candidate solution satisfies the first-order conditions, then the satisfaction of the second-order conditions as well is sufficient to establish at least local optimality.

Sensitivity and continuity of optima

The envelope theorem describes how the value of an optimal solution changes when an underlying

parameter A parameter (), generally, is any characteristic that can help in defining or classifying a particular system (meaning an event, project, object, situation, etc.). That is, a parameter is an element of a system that is useful, or critical, when ...

changes. The process of computing this change is called

comparative statics In economics, comparative statics is the comparison of two different economic outcomes, before and after a change in some underlying exogenous variable, exogenous parameter. As a type of ''static analysis'' it compares two different economic equ ...

. The maximum theorem of Claude Berge (1963) describes the continuity of an optimal solution as a function of underlying parameters.

Calculus of optimization

For unconstrained problems with twice-differentiable functions, some critical points can be found by finding the points where the

gradient In vector calculus, the gradient of a scalar-valued differentiable function f of several variables is the vector field (or vector-valued function) \nabla f whose value at a point p gives the direction and the rate of fastest increase. The g ...

of the objective function is zero (that is, the stationary points). More generally, a zero

subgradient In mathematics, the subderivative (or subgradient) generalizes the derivative to convex functions which are not necessarily differentiable. The set of subderivatives at a point is called the subdifferential at that point. Subderivatives arise in c ...

certifies that a local minimum has been found for minimization problems with convex functions and other

locally In mathematics, a mathematical object is said to satisfy a property locally, if the property is satisfied on some limited, immediate portions of the object (e.g., on some ''sufficiently small'' or ''arbitrarily small'' neighborhoods of points). P ...

Lipschitz function In mathematical analysis, Lipschitz continuity, named after German mathematician Rudolf Lipschitz, is a strong form of uniform continuity for functions. Intuitively, a Lipschitz continuous function is limited in how fast it can change: there e ...

s, which meet in loss function minimization of the neural network. The positive-negative momentum estimation lets to avoid the local minimum and converges at the objective function global minimum. Further, critical points can be classified using the

definiteness In linguistics, definiteness is a semantic feature of noun phrases that distinguishes between referents or senses that are identifiable in a given context (definite noun phrases) and those that are not (indefinite noun phrases). The prototypical ...

of the

: If the Hessian is ''positive'' definite at a critical point, then the point is a local minimum; if the Hessian matrix is negative definite, then the point is a local maximum; finally, if indefinite, then the point is some kind of

saddle point In mathematics, a saddle point or minimax point is a Point (geometry), point on the surface (mathematics), surface of the graph of a function where the slopes (derivatives) in orthogonal directions are all zero (a Critical point (mathematics), ...

. Constrained problems can often be transformed into unconstrained problems with the help of

Lagrangian relaxation In the field of mathematical optimization, Lagrangian relaxation is a relaxation method which approximates a difficult problem of constrained optimization by a simpler problem. A solution to the relaxed problem is an approximate solution to the o ...

can also provide approximate solutions to difficult constrained problems. When the objective function is a

convex function In mathematics, a real-valued function is called convex if the line segment between any two distinct points on the graph of a function, graph of the function lies above or on the graph between the two points. Equivalently, a function is conve ...

, then any local minimum will also be a global minimum. There exist efficient numerical techniques for minimizing convex functions, such as interior-point methods.

Global convergence

More generally, if the objective function is not a quadratic function, then many optimization methods use other methods to ensure that some subsequence of iterations converges to an optimal solution. The first and still popular method for ensuring convergence relies on line searches, which optimize a function along one dimension. A second and increasingly popular method for ensuring convergence uses trust regions. Both line searches and trust regions are used in modern methods of non-differentiable optimization. Usually, a global optimizer is much slower than advanced local optimizers (such as BFGS), so often an efficient global optimizer can be constructed by starting the local optimizer from different starting points.

Computational optimization techniques

To solve problems, researchers may use

algorithm In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...

s that terminate in a finite number of steps, or

iterative method In computational mathematics, an iterative method is a Algorithm, mathematical procedure that uses an initial value to generate a sequence of improving approximate solutions for a class of problems, in which the ''i''-th approximation (called an " ...

s that converge to a solution (on some specified class of problems), or

heuristics A heuristic or heuristic technique (''problem solving'', '' mental shortcut'', ''rule of thumb'') is any approach to problem solving that employs a pragmatic method that is not fully optimized, perfected, or rationalized, but is nevertheless ...

that may provide approximate solutions to some problems (although their iterates need not converge).

Optimization algorithms

of George Dantzig, designed for

* Extensions of the simplex algorithm, designed for

quadratic programming Quadratic programming (QP) is the process of solving certain mathematical optimization problems involving quadratic functions. Specifically, one seeks to optimize (minimize or maximize) a multivariate quadratic function subject to linear constr ...

and for linear-fractional programming * Variants of the simplex algorithm that are especially suited for network optimization * Combinatorial algorithms * Quantum optimization algorithms

Iterative methods

The

iterative methods In computational mathematics, an iterative method is a Algorithm, mathematical procedure that uses an initial value to generate a sequence of improving approximate solutions for a class of problems, in which the ''i''-th approximation (called an " ...

used to solve problems of nonlinear programming differ according to whether they

evaluate In common usage, evaluation is a systematic determination and assessment of a subject's merit, worth and significance, using criteria governed by a set of standards. It can assist an organization, program, design, project or any other intervention ...

Hessians, gradients, or only function values. While evaluating Hessians (H) and gradients (G) improves the rate of convergence, for functions for which these quantities exist and vary sufficiently smoothly, such evaluations increase the

computational complexity In computer science, the computational complexity or simply complexity of an algorithm is the amount of resources required to run it. Particular focus is given to computation time (generally measured by the number of needed elementary operations ...

(or computational cost) of each iteration. In some cases, the computational complexity may be excessively high. One major criterion for optimizers is just the number of required function evaluations as this often is already a large computational effort, usually much more effort than within the optimizer itself, which mainly has to operate over the N variables. The derivatives provide detailed information for such optimizers, but are even harder to calculate, e.g. approximating the gradient takes at least N+1 function evaluations. For approximations of the 2nd derivatives (collected in the Hessian matrix), the number of function evaluations is in the order of N². Newton's method requires the 2nd-order derivatives, so for each iteration, the number of function calls is in the order of N², but for a simpler pure gradient optimizer it is only N. However, gradient optimizers need usually more iterations than Newton's algorithm. Which one is best with respect to the number of function calls depends on the problem itself. * Methods that evaluate Hessians (or approximate Hessians, using

finite difference A finite difference is a mathematical expression of the form . Finite differences (or the associated difference quotients) are often used as approximations of derivatives, such as in numerical differentiation. The difference operator, commonly d ...

s): **

Newton's method In numerical analysis, the Newton–Raphson method, also known simply as Newton's method, named after Isaac Newton and Joseph Raphson, is a root-finding algorithm which produces successively better approximations to the roots (or zeroes) of a ...

Sequential quadratic programming Sequential quadratic programming (SQP) is an iterative method for constrained nonlinear optimization, also known as Lagrange-Newton method. SQP methods are used on mathematical problems for which the objective function and the constraints are twi ...

: A Newton-based method for small-medium scale ''constrained'' problems. Some versions can handle large-dimensional problems. ** Interior point methods: This is a large class of methods for constrained optimization, some of which use only (sub)gradient information and others of which require the evaluation of Hessians. * Methods that evaluate gradients, or approximate gradients in some way (or even subgradients): ** Coordinate descent methods: Algorithms which update a single coordinate in each iteration **

Conjugate gradient method In mathematics, the conjugate gradient method is an algorithm for the numerical solution of particular systems of linear equations, namely those whose matrix is positive-semidefinite. The conjugate gradient method is often implemented as an it ...

Iterative method In computational mathematics, an iterative method is a Algorithm, mathematical procedure that uses an initial value to generate a sequence of improving approximate solutions for a class of problems, in which the ''i''-th approximation (called an " ...

s for large problems. (In theory, these methods terminate in a finite number of steps with quadratic objective functions, but this finite termination is not observed in practice on finite–precision computers.) **

Gradient descent Gradient descent is a method for unconstrained mathematical optimization. It is a first-order iterative algorithm for minimizing a differentiable multivariate function. The idea is to take repeated steps in the opposite direction of the gradi ...

(alternatively, "steepest descent" or "steepest ascent"): A (slow) method of historical and theoretical interest, which has had renewed interest for finding approximate solutions of enormous problems. ** Subgradient methods: An iterative method for large

Lipschitz functions In mathematical analysis, Lipschitz continuity, named after German mathematician Rudolf Lipschitz, is a strong form of uniform continuity for functions. Intuitively, a Lipschitz continuous function is limited in how fast it can change: there e ...

using generalized gradients. Following Boris T. Polyak, subgradient–projection methods are similar to conjugate–gradient methods. ** Bundle method of descent: An iterative method for small–medium-sized problems with locally Lipschitz functions, particularly for convex minimization problems (similar to conjugate gradient methods). **

Ellipsoid method In mathematical optimization, the ellipsoid method is an iterative method for convex optimization, minimizing convex functions over convex sets. The ellipsoid method generates a sequence of ellipsoids whose volume uniformly decreases at every ste ...

: An iterative method for small problems with quasiconvex objective functions and of great theoretical interest, particularly in establishing the polynomial time complexity of some combinatorial optimization problems. It has similarities with Quasi-Newton methods. ** Conditional gradient method (Frank–Wolfe) for approximate minimization of specially structured problems with linear constraints, especially with traffic networks. For general unconstrained problems, this method reduces to the gradient method, which is regarded as obsolete (for almost all problems). ** Quasi-Newton methods: Iterative methods for medium-large problems (e.g. N<1000). ** Simultaneous perturbation stochastic approximation (SPSA) method for stochastic optimization; uses random (efficient) gradient approximation. * Methods that evaluate only function values: If a problem is continuously differentiable, then gradients can be approximated using finite differences, in which case a gradient-based method can be used. **

Interpolation In the mathematics, mathematical field of numerical analysis, interpolation is a type of estimation, a method of constructing (finding) new data points based on the range of a discrete set of known data points. In engineering and science, one ...

methods ** Pattern search methods, which have better convergence properties than the Nelder–Mead heuristic (with simplices), which is listed below. ** Mirror descent

Heuristics

Besides (finitely terminating)

s and (convergent)

s, there are

. A heuristic is any algorithm which is not guaranteed (mathematically) to find the solution, but which is nevertheless useful in certain practical situations. List of some well-known heuristics: * Differential evolution * Dynamic relaxation *

Evolutionary algorithms Evolutionary algorithms (EA) reproduce essential elements of the biological evolution in a computer algorithm in order to solve "difficult" problems, at least Approximation, approximately, for which no exact or satisfactory solution methods are k ...

Genetic algorithms In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to g ...

Hill climbing numerical analysis, hill climbing is a mathematical optimization technique which belongs to the family of local search. It is an iterative algorithm that starts with an arbitrary solution to a problem, then attempts to find a better soluti ...

with random restart * Memetic algorithm * Nelder–Mead simplicial heuristic: A popular heuristic for approximate minimization (without calling gradients) *

Particle swarm optimization In computational science, particle swarm optimization (PSO) is a computational method that Mathematical optimization, optimizes a problem by iterative method, iteratively trying to improve a candidate solution with regard to a given measure of qu ...

Simulated annealing Simulated annealing (SA) is a probabilistic technique for approximating the global optimum of a given function. Specifically, it is a metaheuristic to approximate global optimization in a large search space for an optimization problem. ...

* Stochastic tunneling * Tabu search

Applications

Mechanics

Problems in

rigid body dynamics In the physical science of dynamics, rigid-body dynamics studies the movement of systems of interconnected bodies under the action of external forces. The assumption that the bodies are '' rigid'' (i.e. they do not deform under the action ...

(in particular articulated rigid body dynamics) often require mathematical programming techniques, since you can view rigid body dynamics as attempting to solve an

ordinary differential equation In mathematics, an ordinary differential equation (ODE) is a differential equation (DE) dependent on only a single independent variable (mathematics), variable. As with any other DE, its unknown(s) consists of one (or more) Function (mathematic ...

on a constraint manifold; the constraints are various nonlinear geometric constraints such as "these two points must always coincide", "this surface must not penetrate any other", or "this point must always lie somewhere on this curve". Also, the problem of computing contact forces can be done by solving a

linear complementarity problem In mathematical optimization theory, the linear complementarity problem (LCP) arises frequently in computational mechanics and encompasses the well-known quadratic programming as a special case. It was proposed by Cottle and Dantzig in 1968 ...

, which can also be viewed as a QP (quadratic programming) problem. Many design problems can also be expressed as optimization programs. This application is called design optimization. One subset is the engineering optimization, and another recent and growing subset of this field is

multidisciplinary design optimization Multi-disciplinary design optimization (MDO) is a field of engineering that uses optimization methods to solve design problems incorporating a number of disciplines. It is also known as multidisciplinary system design optimization (MSDO), and mul ...

, which, while useful in many problems, has in particular been applied to

aerospace engineering Aerospace engineering is the primary field of engineering concerned with the development of aircraft and spacecraft. It has two major and overlapping branches: aeronautical engineering and astronautical engineering. Avionics engineering is s ...

problems. This approach may be applied in cosmology and astrophysics.

Economics and finance

Economics Economics () is a behavioral science that studies the Production (economics), production, distribution (economics), distribution, and Consumption (economics), consumption of goods and services. Economics focuses on the behaviour and interac ...

is closely enough linked to optimization of agents that an influential definition relatedly describes economics ''qua'' science as the "study of human behavior as a relationship between ends and

scarce In economics, scarcity "refers to the basic fact of life that there exists only a finite amount of human and nonhuman resources which the best technical knowledge is capable of using to produce only limited maximum amounts of each economic good. ...

means" with alternative uses. Modern optimization theory includes traditional optimization theory but also overlaps with

game theory Game theory is the study of mathematical models of strategic interactions. It has applications in many fields of social science, and is used extensively in economics, logic, systems science and computer science. Initially, game theory addressed ...

and the study of economic equilibria. The ''

Journal of Economic Literature The ''Journal of Economic Literature'' is a peer-reviewed academic journal, published by the American Economic Association, that surveys the academic literature in economics. It was established by Arthur Smithies in 1963 as the ''Journal of Econo ...

'' codes classify mathematical programming, optimization techniques, and related topics under JEL:C61-C63. In microeconomics, the

utility maximization problem Utility maximization was first developed by utilitarian philosophers Jeremy Bentham and John Stuart Mill. In microeconomics, the utility maximization problem is the problem consumers face: "How should I spend my money in order to maximize my uti ...

and its

dual problem In mathematical optimization theory, duality or the duality principle is the principle that optimization problems may be viewed from either of two perspectives, the primal problem or the dual problem. If the primal is a minimization problem then th ...

, the expenditure minimization problem, are economic optimization problems. Insofar as they behave consistently,

consumer A consumer is a person or a group who intends to order, or use purchased goods, products, or services primarily for personal, social, family, household and similar needs, who is not directly related to entrepreneurial or business activities. ...

s are assumed to maximize their

utility In economics, utility is a measure of a certain person's satisfaction from a certain state of the world. Over time, the term has been used with at least two meanings. * In a normative context, utility refers to a goal or objective that we wish ...

, while

firm A company, abbreviated as co., is a Legal personality, legal entity representing an association of legal people, whether Natural person, natural, Juridical person, juridical or a mixture of both, with a specific objective. Company members ...

s are usually assumed to maximize their

profit Profit may refer to: Business and law * Profit (accounting), the difference between the purchase price and the costs of bringing to market * Profit (economics), normal profit and economic profit * Profit (real property), a nonpossessory inter ...

. Also, agents are often modeled as being risk-averse, thereby preferring to avoid risk. Asset prices are also modeled using optimization theory, though the underlying mathematics relies on optimizing

stochastic process In probability theory and related fields, a stochastic () or random process is a mathematical object usually defined as a family of random variables in a probability space, where the index of the family often has the interpretation of time. Sto ...

es rather than on static optimization. International trade theory also uses optimization to explain trade patterns between nations. The optimization of portfolios is an example of multi-objective optimization in economics. Since the 1970s, economists have modeled dynamic decisions over time using

control theory Control theory is a field of control engineering and applied mathematics that deals with the control system, control of dynamical systems in engineered processes and machines. The objective is to develop a model or algorithm governing the applic ...

. For example, dynamic search models are used to study labor-market behavior. A crucial distinction is between deterministic and stochastic models. Macroeconomists build dynamic stochastic general equilibrium (DSGE) models that describe the dynamics of the whole economy as the result of the interdependent optimizing decisions of workers, consumers, investors, and governments.

Electrical engineering

Some common applications of optimization techniques in

electrical engineering Electrical engineering is an engineering discipline concerned with the study, design, and application of equipment, devices, and systems that use electricity, electronics, and electromagnetism. It emerged as an identifiable occupation in the l ...

include

active filter An active filter is a type of analog circuit implementing an electronic filter using active components, typically an amplifier. Amplifiers included in a filter design can be used to improve the cost, performance and predictability of a filter. ...

design, stray field reduction in superconducting magnetic energy storage systems, space mapping design of

microwave Microwave is a form of electromagnetic radiation with wavelengths shorter than other radio waves but longer than infrared waves. Its wavelength ranges from about one meter to one millimeter, corresponding to frequency, frequencies between 300&n ...

structures, handset antennas, electromagnetics-based design. Electromagnetically validated design optimization of microwave components and antennas has made extensive use of an appropriate physics-based or empirical surrogate model and space mapping methodologies since the discovery of space mapping in 1993. Optimization techniques are also used in power-flow analysis.

Civil engineering

Optimization has been widely used in civil engineering.

Construction management Construction management (CM) aims to control the quality of a construction project's scope, time, and cost (sometimes referred to as a project management triangle or "triple constraints") to maximize the project owner's satisfaction. It uses pro ...

and

transportation engineering Transportation engineering or transport engineering is the application of technology and scientific principles to the planning, functional design, operation and management of facilities for any mode of transportation to provide for the safe, ...

are among the main branches of civil engineering that heavily rely on optimization. The most common civil engineering problems that are solved by optimization are cut and fill of roads, life-cycle analysis of structures and infrastructures,

resource leveling In project management, resource leveling is defined by '' A Guide to the Project Management Body of Knowledge'' (''PMBOK Guide'') as "A technique in which start and finish dates are adjusted based on resource limitation with the goal of balancing d ...

, water resource allocation,

traffic Traffic is the movement of vehicles and pedestrians along land routes. Traffic laws govern and regulate traffic, while rules of the road include traffic laws and informal rules that may have developed over time to facilitate the orderly an ...

management and schedule optimization.

Operations research

Another field that uses optimization techniques extensively is

. Operations research also uses stochastic modeling and simulation to support improved decision-making. Increasingly, operations research uses stochastic programming to model dynamic decisions that adapt to events; such problems can be solved with large-scale optimization and

methods.

Control engineering

Mathematical optimization is used in much modern controller design. High-level controllers such as model predictive control (MPC) or real-time optimization (RTO) employ mathematical optimization. These algorithms run online and repeatedly determine values for decision variables, such as choke openings in a process plant, by iteratively solving a mathematical optimization problem including constraints and a model of the system to be controlled.

Geophysics

Optimization techniques are regularly used in

geophysical Geophysics () is a subject of natural science concerned with the physical processes and properties of Earth and its surrounding space environment, and the use of quantitative methods for their analysis. Geophysicists conduct investigations acros ...

parameter estimation problems. Given a set of geophysical measurements, e.g. seismic recordings, it is common to solve for the

physical properties A physical property is any property of a physical system that is measurable. The changes in the physical properties of a system can be used to describe its changes between momentary states. A quantifiable physical property is called ''physical ...

and geometrical shapes of the underlying rocks and fluids. The majority of problems in geophysics are nonlinear with both deterministic and stochastic methods being widely used.

Molecular modeling

Nonlinear optimization methods are widely used in conformational analysis.

Computational systems biology

Optimization techniques are used in many facets of computational systems biology such as model building, optimal experimental design, metabolic engineering, and synthetic biology.

has been applied to calculate the maximal possible yields of fermentation products, and to infer gene regulatory networks from multiple microarray datasets as well as transcriptional regulatory networks from high-throughput data. Nonlinear programming has been used to analyze energy metabolism and has been applied to metabolic engineering and parameter estimation in biochemical pathways.

Machine learning

Solvers

Notes

External links

* Links to optimization source codes * * * {{Authority control Operations research

Optimization Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfiel ...