Convex programming

Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets (or, equivalently, maximizing concave functions over convex sets). Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is in general NP-hard. Convex optimization has applications in a wide range of disciplines, such as automatic control systems, estimation and signal processing, communications and networks, electronic circuit design, data analysis and modeling, finance, statistics (optimal experimental design), and structural optimization, where the approximation concept has proven to be efficient. With recent advancements in computing and optimization algorithms, convex programming is nearly as straightforward as linear programming.


Definition

A convex optimization problem is an optimization problem in which the objective function is a convex function and the feasible set is a convex set. A function f mapping some subset of \mathbb{R}^n into \mathbb{R} \cup \{\pm\infty\} is convex if its domain is convex and for all \theta \in [0, 1] and all x, y in its domain, the following condition holds: f(\theta x + (1 - \theta)y) \leq \theta f(x) + (1 - \theta) f(y). A set S is convex if for all members x, y \in S and all \theta \in [0, 1], we have that \theta x + (1 - \theta) y \in S.

Concretely, a convex optimization problem is the problem of finding some \mathbf{x^\ast} \in C attaining

:\inf \{ f(\mathbf{x}) : \mathbf{x} \in C \},

where the objective function f : \mathcal{D} \subseteq \mathbb{R}^n \to \mathbb{R} is convex, as is the feasible set C. If such a point exists, it is referred to as an ''optimal point'' or ''solution''; the set of all optimal points is called the ''optimal set''. If f is unbounded below over C or the infimum is not attained, then the optimization problem is said to be ''unbounded''. Otherwise, if C is the empty set, then the problem is said to be ''infeasible''.
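For example, every norm f(x) = \|x\| on \mathbb{R}^n is convex: by the triangle inequality and absolute homogeneity,

:\|\theta x + (1 - \theta) y\| \leq \|\theta x\| + \|(1 - \theta) y\| = \theta \|x\| + (1 - \theta) \|y\|

for all x, y and \theta \in [0, 1], which is exactly the defining inequality above.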


Standard form

A convex optimization problem is in ''standard form'' if it is written as

:\begin{align} &\underset{\mathbf{x}}{\operatorname{minimize}}& & f(\mathbf{x}) \\ &\operatorname{subject\ to} & &g_i(\mathbf{x}) \leq 0, \quad i = 1, \dots, m \\ &&&h_i(\mathbf{x}) = 0, \quad i = 1, \dots, p, \end{align}

where:
* \mathbf{x} \in \mathbb{R}^n is the optimization variable;
* The objective function f: \mathcal{D} \subseteq \mathbb{R}^n \to \mathbb{R} is a convex function;
* The inequality constraint functions g_i : \mathbb{R}^n \to \mathbb{R}, i = 1, \ldots, m, are convex functions;
* The equality constraint functions h_i : \mathbb{R}^n \to \mathbb{R}, i = 1, \ldots, p, are affine transformations, that is, of the form h_i(\mathbf{x}) = \mathbf{a}_i \cdot \mathbf{x} - b_i, where \mathbf{a}_i is a vector and b_i is a scalar.

This notation describes the problem of finding \mathbf{x} \in \mathbb{R}^n that minimizes f(\mathbf{x}) among all \mathbf{x} satisfying g_i(\mathbf{x}) \leq 0, i = 1, \ldots, m, and h_i(\mathbf{x}) = 0, i = 1, \ldots, p. The function f is the objective function of the problem, and the functions g_i and h_i are the constraint functions.

The feasible set C of the optimization problem consists of all points \mathbf{x} \in \mathcal{D} satisfying the constraints. This set is convex because \mathcal{D} is convex, the sublevel sets of convex functions are convex, affine sets are convex, and the intersection of convex sets is convex.

A solution to a convex optimization problem is any point \mathbf{x} \in C attaining \inf \{ f(\mathbf{x}) : \mathbf{x} \in C \}. In general, a convex optimization problem may have zero, one, or many solutions.

Many optimization problems can be equivalently formulated in this standard form. For example, the problem of maximizing a concave function f can be re-formulated equivalently as the problem of minimizing the convex function -f. For this reason, the problem of maximizing a concave function over a convex set is also commonly called a convex optimization problem.
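For instance, the nonnegative least-squares problem

:\begin{align} &\underset{\mathbf{x}}{\operatorname{minimize}}& & \|A\mathbf{x} - \mathbf{b}\|_2^2 \\ &\operatorname{subject\ to} & & -x_i \leq 0, \quad i = 1, \dots, n \end{align}

is already in standard form: the objective f(\mathbf{x}) = \|A\mathbf{x} - \mathbf{b}\|_2^2 is convex because it is the composition of the convex function \|\cdot\|_2^2 with an affine map, the inequality constraint functions g_i(\mathbf{x}) = -x_i are linear (hence convex), and there are no equality constraints.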


Properties

The following are useful properties of convex optimization problems:
* every local minimum is a global minimum;
* the optimal set is convex;
* if the objective function is ''strictly'' convex, then the problem has at most one optimal point.

These results are used by the theory of convex minimization along with geometric notions from functional analysis (in Hilbert spaces) such as the Hilbert projection theorem, the separating hyperplane theorem, and Farkas' lemma.
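The first property follows directly from the definitions. Suppose x is a local minimum over C but f(y) < f(x) for some y \in C. For any \theta \in (0, 1], the point z = \theta y + (1 - \theta) x lies in C by convexity of C, and convexity of f gives

:f(z) \leq \theta f(y) + (1 - \theta) f(x) < f(x).

Letting \theta \to 0 places such points z in every neighborhood of x, contradicting local minimality; hence no such y exists and x is a global minimum.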


Applications

The following problem classes are all convex optimization problems, or can be reduced to convex optimization problems via simple transformations (a small modeling sketch follows the lists below):
* Least squares
* Linear programming
* Convex quadratic minimization with linear constraints
* Quadratic minimization with convex quadratic constraints
* Conic optimization
* Geometric programming
* Second-order cone programming
* Semidefinite programming
* Entropy maximization with appropriate constraints

Convex optimization has practical applications for the following:
* Portfolio optimization.
* Worst-case risk analysis.
* Optimal advertising.
* Variations of statistical regression (including regularization and quantile regression).
* Model fitting (particularly multiclass classification).
* Electricity generation optimization.
* Combinatorial optimization.
* Non-probabilistic modelling of uncertainty.
* Localization using wireless signals.
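As a minimal modeling sketch, the following poses a small box-constrained least-squares problem (one of the classes listed above) using the CVXPY library; the matrix A, vector b, and the box bounds are synthetic placeholders chosen only for illustration.

```python
# A minimal sketch using CVXPY (https://www.cvxpy.org): constrained least squares.
# The problem data A and b are synthetic placeholders for illustration.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 10
A = rng.standard_normal((m, n))  # design matrix
b = rng.standard_normal(m)       # observations

x = cp.Variable(n)
# Convex objective: squared Euclidean norm of the residual.
objective = cp.Minimize(cp.sum_squares(A @ x - b))
# Convex feasible set: the box 0 <= x_i <= 1.
constraints = [x >= 0, x <= 1]

problem = cp.Problem(objective, constraints)
problem.solve()  # CVXPY verifies convexity, then dispatches to a numerical solver

print("optimal value:", problem.value)
print("optimal point:", x.value)
```

CVXPY checks that the stated problem follows its disciplined convex programming rules before handing it to a solver, so a modeling mistake that breaks convexity is reported rather than silently accepted.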


Lagrange multipliers

Consider a convex minimization problem given in standard form by a cost function f(x) and inequality constraints g_i(x) \leq 0 for 1 \leq i \leq m. Then the domain \mathcal{X} is:

:\mathcal{X} = \left\{ x \in X : g_1(x), \ldots, g_m(x) \leq 0 \right\}.

The Lagrangian function for the problem is

:L(x, \lambda_0, \lambda_1, \ldots, \lambda_m) = \lambda_0 f(x) + \lambda_1 g_1(x) + \cdots + \lambda_m g_m(x).

For each point x in X that minimizes f over X, there exist real numbers \lambda_0, \lambda_1, \ldots, \lambda_m, called Lagrange multipliers, that satisfy these conditions simultaneously:
# x minimizes L(y, \lambda_0, \lambda_1, \ldots, \lambda_m) over all y \in X,
# \lambda_0, \lambda_1, \ldots, \lambda_m \geq 0, with at least one \lambda_k > 0,
# \lambda_1 g_1(x) = \cdots = \lambda_m g_m(x) = 0 (complementary slackness).

If there exists a "strictly feasible point", that is, a point z satisfying

:g_1(z), \ldots, g_m(z) < 0,

then the statement above can be strengthened to require that \lambda_0 = 1. Conversely, if some x in X satisfies (1)–(3) for scalars \lambda_0, \ldots, \lambda_m with \lambda_0 = 1, then x is certain to minimize f over X.
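As a small worked example, take f(x) = x^2 on X = \mathbb{R} with the single constraint g_1(x) = 1 - x \leq 0 (that is, x \geq 1). Since z = 2 is strictly feasible, we may take \lambda_0 = 1, so

:L(x, 1, \lambda_1) = x^2 + \lambda_1 (1 - x).

Condition (1) requires 2x - \lambda_1 = 0 and condition (3) requires \lambda_1 (1 - x) = 0. Taking \lambda_1 = 0 would force x = 0, which is infeasible, so 1 - x = 0, giving x = 1 with multiplier \lambda_1 = 2 \geq 0; by the converse statement above, x = 1 minimizes f over the feasible set, as expected.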


Algorithms

Unconstrained convex optimization can be easily solved with gradient descent (a special case of steepest descent) or Newton's method, combined with line search for an appropriate step size; these can be mathematically proven to converge quickly, especially the latter method. A minimal sketch of gradient descent with line search appears at the end of this section.

Convex optimization with linear equality constraints can also be solved using KKT matrix techniques if the objective function is a quadratic function (which generalizes to a variation of Newton's method, which works even if the point of initialization does not satisfy the constraints), but can also generally be solved by eliminating the equality constraints with linear algebra or by solving the dual problem.

Finally, convex optimization with both linear equality constraints and convex inequality constraints can be solved by applying an unconstrained convex optimization technique to the objective function plus logarithmic barrier terms. (When the starting point is not feasible, that is, does not satisfy the constraints, this is preceded by so-called ''phase I'' methods, which either find a feasible point or show that none exists. Phase I methods generally consist of reducing the search in question to yet another convex optimization problem.)

Convex optimization problems can also be solved by the following contemporary methods:
* Bundle methods (Wolfe, Lemaréchal, Kiwiel),
* Subgradient projection methods (Polyak),
* Interior-point methods, which make use of self-concordant barrier functions and self-regular barrier functions,
* Cutting-plane methods,
* Ellipsoid method,
* Subgradient method, and
* Dual subgradients and the drift-plus-penalty method.

Subgradient methods can be implemented simply and so are widely used (Bertsekas). Dual subgradient methods are subgradient methods applied to a dual problem. The drift-plus-penalty method is similar to the dual subgradient method, but takes a time average of the primal variables.
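The following is the promised minimal sketch of the first method above: gradient descent with backtracking (Armijo) line search, applied to a smooth convex quadratic. The objective, the parameters alpha and beta, and the tolerance are illustrative choices rather than canonical values.

```python
# Gradient descent with backtracking (Armijo) line search on a smooth convex
# function. The quadratic objective and all parameter values are illustrative.
import numpy as np

def gradient_descent(f, grad, x0, alpha=0.3, beta=0.5, tol=1e-8, max_iter=1000):
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:  # for convex f, near-stationary means near-optimal
            break
        t = 1.0
        # Backtrack until the sufficient-decrease (Armijo) condition holds.
        while f(x - t * g) > f(x) - alpha * t * (g @ g):
            t *= beta
        x = x - t * g
    return x

# Illustrative convex quadratic f(x) = 0.5 x^T Q x - c^T x with Q positive
# definite, whose unique minimizer is x* = Q^{-1} c.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
c = np.array([1.0, -1.0])
f = lambda x: 0.5 * x @ Q @ x - c @ x
grad = lambda x: Q @ x - c

x_star = gradient_descent(f, grad, x0=np.zeros(2))
print(x_star, np.linalg.solve(Q, c))  # the two vectors should agree closely
```

Because the objective is convex, the stopping criterion on the gradient norm certifies approximate global optimality, which is exactly the "local minimum is a global minimum" property listed under Properties above.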


Implementations

Convex optimization and related algorithms have been implemented in a number of software programs, including modeling systems such as CVX and CVXPY that translate problems into solver input.


Extensions

Extensions of convex optimization include the optimization of biconvex, pseudo-convex, and quasiconvex functions. Extensions of the theory of convex analysis and iterative methods for approximately solving non-convex minimization problems occur in the field of generalized convexity, also known as abstract convex analysis.
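For instance, a function f defined on a convex set is quasiconvex if f(\theta x + (1 - \theta) y) \leq \max\{f(x), f(y)\} for all x, y in its domain and \theta \in [0, 1]; every convex function satisfies this condition, but not conversely, and quasiconvex problems can often still be attacked with convex machinery, for example by bisection over the (convex) sublevel sets.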


See also

* Duality
* Karush–Kuhn–Tucker conditions
* Optimization problem
* Proximal gradient method


Notes


References

* Hiriart-Urruty, Jean-Baptiste; Lemaréchal, Claude (2004). ''Fundamentals of Convex Analysis''. Berlin: Springer.
* Nesterov, Yurii (2004). ''Introductory Lectures on Convex Optimization''. Kluwer Academic Publishers.
* Schmit, L.A.; Fleury, C. (1980). "Structural synthesis by combining approximation concepts and dual methods". ''J. Amer. Inst. Aeronaut. Astronaut.'' 18, 1252–1260.


External links


* EE364a: Convex Optimization I and EE364b: Convex Optimization II, Stanford course homepages
* 6.253: Convex Analysis and Optimization, an MIT OCW course homepage
* Brian Borchers, An overview of software for convex optimization
* ''Convex Optimization'', book by Lieven Vandenberghe and Stephen P. Boyd