Convex optimization
Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets (or, equivalently, maximizing concave functions over convex sets). Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard.


Definition


Abstract form

A convex optimization problem is defined by two ingredients:
* The ''objective function'', which is a real-valued convex function of ''n'' variables, f : \mathcal{D} \subseteq \mathbb{R}^n \to \mathbb{R};
* The ''feasible set'', which is a convex subset C \subseteq \mathbb{R}^n.
The goal of the problem is to find some \mathbf{x}^* \in C attaining
:\inf \{ f(\mathbf{x}) : \mathbf{x} \in C \}.
In general, there are three options regarding the existence of a solution:
* If such a point ''x''* exists, it is referred to as an ''optimal point'' or ''solution''; the set of all optimal points is called the ''optimal set''; and the problem is called ''solvable''.
* If f is unbounded below over C, or the infimum is not attained, then the optimization problem is said to be ''unbounded''.
* Otherwise, if C is the empty set, then the problem is said to be ''infeasible''.
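A simple one-dimensional illustration of the three cases (the function and sets are assumed here for illustration), with the convex function f(x) = e^x:
* Over C = [0, 1], the infimum e^0 = 1 is attained at x^* = 0, so the problem is ''solvable''.
* Over C = \mathbb{R}, the infimum is 0 but no point attains it, so the problem is ''unbounded'' in the sense above, even though f is bounded below.
* If C = \emptyset, the problem is ''infeasible''.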


Standard form

A convex optimization problem is in ''standard form'' if it is written as
:\begin{align} &\underset{\mathbf{x}}{\operatorname{minimize}}& & f(\mathbf{x}) \\ &\operatorname{subject\ to} & &g_i(\mathbf{x}) \leq 0, \quad i = 1, \dots, m \\ &&&h_i(\mathbf{x}) = 0, \quad i = 1, \dots, p, \end{align}
where:
* \mathbf{x} \in \mathbb{R}^n is the vector of optimization variables;
* The objective function f : \mathcal{D} \subseteq \mathbb{R}^n \to \mathbb{R} is a convex function;
* The inequality constraint functions g_i : \mathbb{R}^n \to \mathbb{R}, i = 1, \ldots, m, are convex functions;
* The equality constraint functions h_i : \mathbb{R}^n \to \mathbb{R}, i = 1, \ldots, p, are affine transformations, that is, of the form h_i(\mathbf{x}) = \mathbf{a}_i \cdot \mathbf{x} - b_i, where \mathbf{a}_i is a vector and b_i is a scalar.
The feasible set C of the optimization problem consists of all points \mathbf{x} \in \mathcal{D} satisfying the inequality and the equality constraints. This set is convex because \mathcal{D} is convex, the sublevel sets of convex functions are convex, affine sets are convex, and the intersection of convex sets is convex.

Many optimization problems can be equivalently formulated in this standard form. For example, the problem of maximizing a concave function f can be re-formulated equivalently as the problem of minimizing the convex function -f. The problem of maximizing a concave function over a convex set is commonly called a convex optimization problem.
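As a sketch (not part of the original formulation), a small standard-form instance can be solved with projected gradient descent; the problem data, step size, and iteration count below are assumptions chosen for illustration. Minimizing f(\mathbf{x}) = x_1^2 + x_2^2 subject to g(\mathbf{x}) = 1 - x_1 - x_2 \leq 0 has the optimal point (1/2, 1/2):

```python
import numpy as np

def project_halfspace(x, a, b):
    """Project x onto the halfspace {y : a.y >= b} (closed form)."""
    slack = a @ x - b
    if slack >= 0:
        return x
    return x - slack * a / (a @ a)

# minimize f(x) = ||x||^2  subject to  x1 + x2 >= 1, i.e. g(x) = 1 - x1 - x2 <= 0
a, b = np.array([1.0, 1.0]), 1.0
x = np.array([5.0, -3.0])   # arbitrary starting point
step = 0.1                  # below 1/L for the 2-Lipschitz gradient of f

for _ in range(500):
    x = project_halfspace(x - step * 2 * x, a, b)

print(x)   # ≈ [0.5, 0.5]
```

The gradient step alone would drive x to the unconstrained minimizer 0; the projection keeps the iterates feasible, and the fixed point of the combined map is the constrained optimum.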


Epigraph form (standard form with linear objective)

In the standard form it is possible to assume, without loss of generality, that the objective function ''f'' is a linear function. This is because any program with a general objective can be transformed into a program with a linear objective by adding a single variable t and a single constraint, as follows:
:\begin{align} &\underset{\mathbf{x},\, t}{\operatorname{minimize}}& & t \\ &\operatorname{subject\ to} & &f(\mathbf{x}) - t \leq 0 \\ && &g_i(\mathbf{x}) \leq 0, \quad i = 1, \dots, m \\ &&&h_i(\mathbf{x}) = 0, \quad i = 1, \dots, p. \end{align}
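The epigraph transformation can be sketched numerically. As an assumed example, minimizing f(x) = |x| subject to x \geq 1 becomes a linear program in the variables (x, t), solved here with SciPy's linprog (an implementation choice, not prescribed by the text):

```python
import numpy as np
from scipy.optimize import linprog

# Epigraph form of: minimize |x| subject to x >= 1.
# Variables are (x, t); minimizing t under |x| <= t makes t track f(x) = |x|.
c = np.array([0.0, 1.0])            # objective: minimize t
A_ub = np.array([[ 1.0, -1.0],      #  x - t <= 0
                 [-1.0, -1.0],      # -x - t <= 0   (together: |x| <= t)
                 [-1.0,  0.0]])     # -x <= -1, i.e. x >= 1
b_ub = np.array([0.0, 0.0, -1.0])
res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None), (None, None)])  # both variables free

print(res.x)   # ≈ [1.0, 1.0]: at the optimum t = |x| = 1
```

Note that bounds must be passed explicitly, since linprog defaults to nonnegative variables.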


Conic form

Every convex program can be presented in a ''conic form'', which means minimizing a linear objective over the intersection of an affine plane and a convex cone:
:\begin{align} &\underset{x}{\operatorname{minimize}}& & c^T x \\ &\operatorname{subject\ to} & &x \in (b+L)\cap K \end{align}
where K is a closed pointed convex cone, L is a linear subspace of \mathbb{R}^n, and b is a vector in \mathbb{R}^n. A linear program in standard form is the special case in which K is the nonnegative orthant of \mathbb{R}^n.


Eliminating linear equality constraints

It is possible to convert a convex program in standard form to a convex program with no equality constraints. Denote the equality constraints ''h_i''(''x'') = 0 as ''Ax'' = ''b'', where ''A'' has ''n'' columns. If ''Ax'' = ''b'' is infeasible, then of course the original problem is infeasible. Otherwise, it has some solution ''x''0, and the set of all solutions can be presented as ''Fz'' + ''x''0, where ''z'' is in ''R^k'', ''k'' = ''n'' - rank(''A''), and ''F'' is an ''n''-by-''k'' matrix whose columns span the null space of ''A''. Substituting ''x'' = ''Fz'' + ''x''0 in the original problem gives:
:\begin{align} &\underset{\mathbf{z}}{\operatorname{minimize}}& & f(F\mathbf{z} + \mathbf{x}_0) \\ &\operatorname{subject\ to} & &g_i(F\mathbf{z} + \mathbf{x}_0) \leq 0, \quad i = 1, \dots, m \end{align}
where the variables are \mathbf{z}. Note that there are rank(''A'') fewer variables. This means that, in principle, one can restrict attention to convex optimization problems without equality constraints. In practice, however, it is often preferred to retain the equality constraints, since they might make some algorithms more efficient, and also make the problem easier to understand and analyze.
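A minimal numerical sketch of this elimination, under assumed problem data: minimizing f(x) = x_1^2 + 2x_2^2 subject to x_1 + x_2 = 1, with F spanning the null space of A = [1, 1]:

```python
import numpy as np

# Equality-constrained problem: minimize x1^2 + 2*x2^2  subject to  x1 + x2 = 1.
# A has n = 2 columns and rank 1, so k = n - rank(A) = 1 free variable remains.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

x0 = np.linalg.lstsq(A, b, rcond=None)[0]   # one particular solution of Ax = b
F = np.array([[1.0], [-1.0]])               # columns span the null space of A

def grad_f(x):                              # gradient of f(x) = x1^2 + 2*x2^2
    return np.array([2.0 * x[0], 4.0 * x[1]])

# Substituting x = F z + x0 leaves an unconstrained convex problem in z,
# solved here by plain gradient steps (chain rule: grad_z = F^T grad_f).
z = np.zeros(1)
for _ in range(200):
    z -= 0.05 * (F.T @ grad_f(F @ z + x0))

x = F @ z + x0
print(x)   # ≈ [2/3, 1/3], matching the KKT solution of the original problem
```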


Special cases

The following problem classes are all convex optimization problems, or can be reduced to convex optimization problems via simple transformations:
* Linear programming (LP) problems are the simplest convex programs. In LP, the objective and constraint functions are all linear.
* Quadratic programming (QP) problems are the next-simplest. In QP, the constraints are all linear, but the objective may be a convex quadratic function.
* Second-order cone programming (SOCP) problems are more general.
* Semidefinite programming (SDP) problems are more general still.
* Conic optimization problems are even more general (see figure to the right).
Other special cases include:
* Least squares
* Quadratic minimization with convex quadratic constraints
* Geometric programming
* Entropy maximization with appropriate constraints.


Properties

The following are useful properties of convex optimization problems:
* every point that is a local minimum is also a global minimum;
* the optimal set is convex;
* if the objective function is ''strictly'' convex, then the problem has at most one optimal point.
These results are used by the theory of convex minimization along with geometric notions from functional analysis (in Hilbert spaces) such as the Hilbert projection theorem, the separating hyperplane theorem, and Farkas' lemma.
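The first property can be checked numerically on an assumed example: gradient descent started from several random points on a convex function always reaches the same global minimum, since there are no spurious local minima to get stuck in:

```python
import numpy as np

def grad(x):
    # gradient of the convex function f(x, y) = (x - 1)^2 + (y + 2)^2
    return 2.0 * (x - np.array([1.0, -2.0]))

rng = np.random.default_rng(0)
minima = []
for _ in range(5):
    x = rng.uniform(-10, 10, size=2)   # random starting point
    for _ in range(300):
        x -= 0.1 * grad(x)
    minima.append(x)

print(np.ptp(minima, axis=0))   # spread across runs ≈ [0, 0]: all runs agree
```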


Algorithms


Unconstrained and equality-constrained problems

The convex programs easiest to solve are the ''unconstrained'' problems, or the problems with only equality constraints. As the equality constraints are all linear, they can be eliminated with linear algebra and integrated into the objective, thus converting an equality-constrained problem into an unconstrained one.

In the class of unconstrained (or equality-constrained) problems, the simplest ones are those in which the objective is quadratic. For these problems, the KKT conditions (which are necessary for optimality) are all linear, so they can be solved analytically.

For unconstrained (or equality-constrained) problems with a general convex objective that is twice-differentiable, Newton's method can be used. It can be seen as reducing a general unconstrained convex problem to a sequence of quadratic problems. Newton's method can be combined with line search for an appropriate step size, and it can be mathematically proven to converge quickly.

Other efficient algorithms for unconstrained minimization include gradient descent (a special case of steepest descent).
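A sketch of damped Newton's method with backtracking line search, on an assumed smooth convex function f(x) = e^x + e^{-x} (unique minimum at x = 0):

```python
import math

# Newton's method with backtracking line search on the smooth convex
# function f(x) = exp(x) + exp(-x).
f   = lambda x: math.exp(x) + math.exp(-x)
df  = lambda x: math.exp(x) - math.exp(-x)
d2f = lambda x: math.exp(x) + math.exp(-x)

x = 3.0
for _ in range(50):
    step = -df(x) / d2f(x)   # Newton direction: minimize the quadratic model
    t = 1.0
    # backtracking (Armijo) line search for an acceptable step size
    while f(x + t * step) > f(x) + 0.25 * t * df(x) * step:
        t *= 0.5
    x += t * step

print(x)   # ≈ 0.0
```

Each iteration minimizes a local quadratic model of f, which is exactly the "sequence of quadratic problems" view mentioned above.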


General problems

The more challenging problems are those with inequality constraints. A common way to solve them is to reduce them to unconstrained problems by adding a barrier function, enforcing the inequality constraints, to the objective function. Such methods are called interior point methods. They have to be initialized by finding a feasible interior point using so-called ''phase I'' methods, which either find a feasible point or show that none exists. Phase I methods generally consist of reducing the search in question to a simpler convex optimization problem.

Convex optimization problems can also be solved by the following contemporary methods:
* Bundle methods (Wolfe, Lemaréchal, Kiwiel),
* Subgradient projection methods (Polyak),
* Interior-point methods, which make use of self-concordant barrier functions and self-regular barrier functions,
* Cutting-plane methods,
* Ellipsoid method,
* Subgradient method,
* Dual subgradients and the drift-plus-penalty method.
Subgradient methods can be implemented simply and so are widely used. Dual subgradient methods are subgradient methods applied to a dual problem. The drift-plus-penalty method is similar to the dual subgradient method, but takes a time average of the primal variables.
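A minimal interior-point (log-barrier) sketch on an assumed one-variable problem, minimize x subject to x \geq 1: the inequality is folded into the objective as the barrier term -\log(x - 1), and the parameter t is increased so that the barrier minimizers 1 + 1/t approach the true optimum x^* = 1:

```python
# Barrier method for: minimize x subject to x >= 1 (optimum x* = 1).
# Inner loop: Newton steps on phi_t(x) = t*x - log(x - 1).
x = 2.0                                    # strictly feasible starting point
for t in [1.0, 10.0, 100.0, 1000.0, 10000.0]:
    for _ in range(50):
        grad = t - 1.0 / (x - 1.0)
        hess = 1.0 / (x - 1.0) ** 2
        step = -grad / hess
        while x + step <= 1.0:             # damp to stay strictly feasible
            step *= 0.5
        x += step

print(x)   # ≈ 1.0001 = 1 + 1/t for the final t
```

Each barrier subproblem is unconstrained, and warm-starting from the previous minimizer traces the central path toward the constrained optimum.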


Lagrange multipliers

Consider a convex minimization problem given in standard form by a cost function f(x) and inequality constraints g_i(x) \leq 0 for 1 \leq i \leq m. Then the domain \mathcal{X} is:
:\mathcal{X} = \left\{ x \in X : g_1(x) \leq 0, \ldots, g_m(x) \leq 0 \right\}.
The Lagrangian function for the problem is
:L(x, \lambda_0, \lambda_1, \ldots, \lambda_m) = \lambda_0 f(x) + \lambda_1 g_1(x) + \cdots + \lambda_m g_m(x).
For each point x in X that minimizes f over X, there exist real numbers \lambda_0, \lambda_1, \ldots, \lambda_m, called Lagrange multipliers, that satisfy these conditions simultaneously:
# x minimizes L(y, \lambda_0, \lambda_1, \ldots, \lambda_m) over all y \in X,
# \lambda_0, \lambda_1, \ldots, \lambda_m \geq 0, with at least one \lambda_k > 0,
# \lambda_1 g_1(x) = \cdots = \lambda_m g_m(x) = 0 (complementary slackness).
If there exists a "strictly feasible point", that is, a point z satisfying
:g_1(z) < 0, \ldots, g_m(z) < 0,
then the statement above can be strengthened to require that \lambda_0 = 1. Conversely, if some x in X satisfies (1)-(3) for scalars \lambda_0, \ldots, \lambda_m with \lambda_0 = 1, then x is certain to minimize f over X.
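These conditions can be checked directly on an assumed one-variable example, minimize f(x) = (x - 2)^2 subject to g(x) = x - 1 \leq 0. A strictly feasible point exists (e.g. z = 0), so \lambda_0 = 1 may be taken; the optimum is x^* = 1 with \lambda_1 = 2:

```python
# Verify conditions (1)-(3) for: minimize (x - 2)^2 subject to x - 1 <= 0.
f = lambda x: (x - 2.0) ** 2
g = lambda x: x - 1.0
lam0, lam1 = 1.0, 2.0                       # Lagrange multipliers
L = lambda y: lam0 * f(y) + lam1 * g(y)     # Lagrangian at fixed multipliers

x_star = 1.0
# (1) x* minimizes L: L(y) = (y - 2)^2 + 2(y - 1) = (y - 1)^2 + 1, min at y = 1
ys = [i / 100.0 for i in range(-300, 300)]
assert all(L(x_star) <= L(y) + 1e-12 for y in ys)
# (2) multipliers are nonnegative, with at least one positive
assert lam0 >= 0 and lam1 >= 0 and max(lam0, lam1) > 0
# (3) complementary slackness: lambda_1 * g(x*) = 0 (constraint is active)
assert lam1 * g(x_star) == 0.0
print("KKT conditions verified at x* =", x_star)
```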


Software

There is a large software ecosystem for convex optimization. This ecosystem has two main categories: ''solvers'' on the one hand and ''modeling tools'' (or ''interfaces'') on the other hand. Solvers implement the algorithms themselves and are usually written in C. They require users to specify optimization problems in very specific formats which may not be natural from a modeling perspective. Modeling tools are separate pieces of software that let the user specify an optimization problem in higher-level syntax. They manage all transformations to and from the user's high-level model and the solver's input/output format. Below are two tables. The first shows modeling tools (such as CVXPY and JuMP.jl) and the second solvers (such as SCS and MOSEK). They are by no means exhaustive.


Applications

Convex optimization can be used to model problems in a wide range of disciplines, such as automatic control systems, estimation and signal processing, communications and networks, electronic circuit design, data analysis and modeling, finance, statistics (optimal experimental design), and structural optimization, where the approximation concept has proven to be efficient.

Convex optimization can be used to model problems in the following fields:
* Portfolio optimization.
* Worst-case risk analysis.
* Optimal advertising.
* Variations of statistical regression (including regularization and quantile regression).
* Model fitting (particularly multiclass classification).
* Electricity generation optimization.
* Combinatorial optimization.
* Non-probabilistic modeling of uncertainty.
* Localization using wireless signals. Ahmad Bazzi, Dirk TM Slock, and Lisa Meilhac. "Online angle of arrival estimation in the presence of mutual coupling." 2016 IEEE Statistical Signal Processing Workshop (SSP). IEEE, 2016.


Extensions

Extensions of convex optimization include the optimization of biconvex, pseudo-convex, and quasiconvex functions. Extensions of the theory of convex analysis and iterative methods for approximately solving non-convex minimization problems occur in the field of generalized convexity, also known as abstract convex analysis.


See also

* Duality
* Karush–Kuhn–Tucker conditions
* Optimization problem
* Proximal gradient method
* Algorithmic problems on convex sets


Notes


References

* Hiriart-Urruty, Jean-Baptiste; Lemaréchal, Claude (2004). ''Fundamentals of Convex Analysis''. Berlin: Springer.
* Nesterov, Yurii (2004). ''Introductory Lectures on Convex Optimization''. Kluwer Academic Publishers.
* Schmit, L.A.; Fleury, C. (1980). "Structural synthesis by combining approximation concepts and dual methods". J. Amer. Inst. Aeronaut. Astronaut. 18, 1252–1260.


External links


* EE364a: Convex Optimization I and EE364b: Convex Optimization II, Stanford course homepages
* 6.253: Convex Analysis and Optimization, an MIT OCW course homepage
* Brian Borchers, An overview of software for convex optimization
* ''Convex Optimization'', book by Lieven Vandenberghe and Stephen P. Boyd