Semidefinite programming (SDP) is a subfield of
convex optimization
Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets (or, equivalently, maximizing concave functions over convex sets). Many classes of convex optimization prob ...
concerned with the optimization of a linear
objective function
In mathematical optimization and decision theory, a loss function or cost function (sometimes also called an error function) is a function that maps an event or values of one or more variables onto a real number intuitively representing some "cos ...
(a user-specified function that the user wants to minimize or maximize)
over the intersection of the
cone of
positive semidefinite matrices with an
affine space
In mathematics, an affine space is a geometric structure that generalizes some of the properties of Euclidean spaces in such a way that these are independent of the concepts of distance and measure of angles, keeping only the properties relat ...
, i.e., a
spectrahedron.
Semidefinite programming is a relatively new field of optimization which is of growing interest for several reasons. Many practical problems in
operations research
Operations research ( en-GB, operational research) (U.S. Air Force Specialty Code: Operations Analysis), often shortened to the initialism OR, is a discipline that deals with the development and application of analytical methods to improve dec ...
and
combinatorial optimization
Combinatorial optimization is a subfield of mathematical optimization that consists of finding an optimal object from a finite set of objects, where the set of feasible solutions is discrete or can be reduced to a discrete set. Typical combi ...
can be modeled or approximated as semidefinite programming problems. In automatic control theory, SDPs are used in the context of
linear matrix inequalities. SDPs are in fact a special case of
cone programming and can be efficiently solved by
interior point methods
Interior-point methods (also referred to as barrier methods or IPMs) are a certain class of algorithms that solve linear and nonlinear convex optimization problems.
An interior point method was discovered by Soviet mathematician I. I. Dikin in ...
.
All
linear programs and (convex)
quadratic programs can be expressed as SDPs, and via hierarchies of SDPs the solutions of polynomial optimization problems can be approximated. Semidefinite programming has been used in the
optimization of complex systems. In recent years, some quantum query complexity problems have been formulated in terms of semidefinite programs.
Motivation and definition
Initial motivation
A
linear programming
Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements are represented by linear relationships. Linear programming is ...
problem is one in which we wish to maximize or minimize a linear objective function of real variables over a
polytope
In elementary geometry, a polytope is a geometric object with flat sides ('' faces''). Polytopes are the generalization of three-dimensional polyhedra to any number of dimensions. Polytopes may exist in any general number of dimensions as an ...
. In semidefinite programming, we instead use real-valued vectors and are allowed to take the dot product of vectors; nonnegativity constraints on real variables in LP (''linear programming'') are replaced by semidefiniteness constraints on matrix variables in SDP (''semidefinite programming''). Specifically, a general semidefinite programming problem can be defined as any mathematical programming problem of the form
:
where the
, and the
are real numbers and
is the
dot product
In mathematics, the dot product or scalar productThe term ''scalar product'' means literally "product with a scalar as a result". It is also used sometimes for other symmetric bilinear forms, for example in a pseudo-Euclidean space. is an alg ...
of
and
.
Equivalent formulations
An
matrix
is said to be
positive semidefinite if it is the
Gram matrix of some vectors (i.e. if there exist vectors
such that
for all
). If this is the case, we denote this as
. Note that there are several other equivalent definitions of being positive semidefinite, for example, positive semidefinite matrices are
self-adjoint matrices that have only non-negative
eigenvalues.
Denote by
the space of all
real symmetric matrices. The space is equipped with the
inner product
In mathematics, an inner product space (or, rarely, a Hausdorff pre-Hilbert space) is a real vector space or a complex vector space with an operation called an inner product. The inner product of two vectors in the space is a scalar, often ...
(where
denotes the
trace)
We can rewrite the mathematical program given in the previous section equivalently as
:
where entry
in
is given by
from the previous section and
is a symmetric
matrix having
th entry
from the previous section. Thus, the matrices
and
are symmetric and the above inner products are well-defined.
Note that if we add
slack variables appropriately, this SDP can be converted to one of the form
:
For convenience, an SDP may be specified in a slightly different, but equivalent form. For example, linear expressions involving nonnegative
scalar variables may be added to the program specification. This remains an SDP because each variable can be incorporated into the matrix
as a diagonal entry (
for some
). To ensure that
, constraints
can be added for all
. As another example, note that for any positive semidefinite matrix
, there exists a set of vectors
such that the
,
entry of
is
the
scalar product of
and
. Therefore, SDPs are often formulated in terms of linear expressions on scalar products of vectors. Given the solution to the SDP in the standard form, the vectors
can be recovered in
time (e.g., by using an incomplete
Cholesky decomposition of X).
Duality theory
Definitions
Analogously to linear programming, given a general SDP of the form
:
(the primal problem or P-SDP), we define the ''
dual
Dual or Duals may refer to:
Paired/two things
* Dual (mathematics), a notion of paired concepts that mirror one another
** Dual (category theory), a formalization of mathematical duality
*** see more cases in :Duality theories
* Dual (grammatical ...
'' semidefinite program (D-SDP) as
:
where for any two matrices
and
,
means
.
Weak duality
The
weak duality theorem states that the value of the primal SDP is at least the value of the dual SDP. Therefore, any feasible solution to the dual SDP lower-bounds the primal SDP value, and conversely, any feasible solution to the primal SDP upper-bounds the dual SDP value. This is because
:
where the last inequality is because both matrices are positive semidefinite, and the result of this function is sometimes referred to as duality gap.
Strong duality
When the value of the primal and dual SDPs are equal, the SDP is said to satisfy the
strong duality Strong duality is a condition in mathematical optimization in which the primal optimal objective and the dual optimal objective are equal. This is as opposed to weak duality (the primal problem has optimal value smaller than or equal to the dual p ...
property. Unlike
linear programs, where every dual linear program has optimal objective equal to the primal objective, not every SDP satisfies strong duality; in general, the value of the dual SDP may lie strictly below the value of the primal, and the P-SDP and D-SPD satisfy the following properties:
(i) Suppose the primal problem (P-SDP) is bounded below and strictly
feasible (i.e., there exists
such that
,
).
Then there is an optimal solution
to (D-SDP) and
:
(ii) Suppose the dual problem (D-SDP) is bounded above and strictly
feasible (i.e.,
for some
).
Then there is an optimal solution
to (P-SDP) and
the equality from (i) holds.
A sufficient condition for strong duality to hold for a SDP problem (and in general, for any convex optimization problem) is the
Slater's condition. It is also possible to attain strong duality for SDPs without additional regularity conditions by using an extended dual problem proposed by Ramana.
Examples
Example 1
Consider three random variables
,
, and
. By definition, their
correlation coefficients are valid if and only if
:
in which case this matrix is called the
correlation matrix. Suppose that we know from some prior knowledge (empirical results of an experiment, for example) that
and
. The problem of determining the smallest and largest values that
can take is given by:
:
We set
to obtain the answer. This can be formulated by an SDP. We handle the inequality constraints by augmenting the variable matrix and introducing
slack variables, for example
Solving this SDP gives the minimum and maximum values of
as
and
respectively.
Example 2
Consider the problem
: minimize
: subject to
where we assume that
whenever
.
Introducing an auxiliary variable
the problem can be reformulated:
: minimize
: subject to
In this formulation, the objective is a linear function of the variables
.
The first restriction can be written as
:
where the matrix
is the square matrix with values in the diagonal equal
to the elements of the vector
.
The second restriction can be written as
:
Defining
as follows
: