Semidefinite programming (SDP) is a subfield of

convex optimization Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets (or, equivalently, maximizing concave functions over convex sets). Many classes of convex optimization prob ...

concerned with the optimization of a linear

objective function In mathematical optimization and decision theory, a loss function or cost function (sometimes also called an error function) is a function that maps an event or values of one or more variables onto a real number intuitively representing some "cos ...

(a user-specified function that the user wants to minimize or maximize) over the intersection of the

cone A cone is a three-dimensional geometric shape that tapers smoothly from a flat base (frequently, though not necessarily, circular) to a point called the apex or vertex. A cone is formed by a set of line segments, half-lines, or lines co ...

of positive semidefinite

matrices Matrix most commonly refers to: * ''The Matrix'' (franchise), an American media franchise ** ''The Matrix'', a 1999 science-fiction action film ** "The Matrix", a fictional setting, a virtual reality environment, within ''The Matrix'' (franchis ...

with an

affine space In mathematics, an affine space is a geometric structure that generalizes some of the properties of Euclidean spaces in such a way that these are independent of the concepts of distance and measure of angles, keeping only the properties relat ...

, i.e., a

spectrahedron In convex geometry, a spectrahedron is a shape that can be represented as a linear matrix inequality. Alternatively, the set of positive semidefinite matrices forms a convex cone in , and a spectrahedron is a shape that can be formed by interse ...

. Semidefinite programming is a relatively new field of optimization which is of growing interest for several reasons. Many practical problems in

operations research Operations research ( en-GB, operational research) (U.S. Air Force Specialty Code: Operations Analysis), often shortened to the initialism OR, is a discipline that deals with the development and application of analytical methods to improve dec ...

and

combinatorial optimization Combinatorial optimization is a subfield of mathematical optimization that consists of finding an optimal object from a finite set of objects, where the set of feasible solutions is discrete or can be reduced to a discrete set. Typical combi ...

can be modeled or approximated as semidefinite programming problems. In automatic control theory, SDPs are used in the context of

linear matrix inequalities In convex optimization, a linear matrix inequality (LMI) is an expression of the form : \operatorname(y):=A_0+y_1A_1+y_2A_2+\cdots+y_m A_m\succeq 0\, where * y= _i\,,~i\!=\!1,\dots, m/math> is a real vector, * A_0, A_1, A_2,\dots,A_m are n\times n ...

. SDPs are in fact a special case of cone programming and can be efficiently solved by

interior point methods Interior-point methods (also referred to as barrier methods or IPMs) are a certain class of algorithms that solve linear and nonlinear convex optimization problems. An interior point method was discovered by Soviet mathematician I. I. Dikin in ...

. All

linear programs Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements are represented by linear relationships. Linear programming is ...

and (convex) quadratic programs can be expressed as SDPs, and via hierarchies of SDPs the solutions of polynomial optimization problems can be approximated. Semidefinite programming has been used in the

optimization Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criterion, from some set of available alternatives. It is generally divided into two subfi ...

of complex systems. In recent years, some quantum query complexity problems have been formulated in terms of semidefinite programs.

Motivation and definition

Initial motivation

linear programming Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements are represented by linear relationships. Linear programming is ...

problem is one in which we wish to maximize or minimize a linear objective function of real variables over a

polytope In elementary geometry, a polytope is a geometric object with flat sides ('' faces''). Polytopes are the generalization of three-dimensional polyhedra to any number of dimensions. Polytopes may exist in any general number of dimensions as an ...

. In semidefinite programming, we instead use real-valued vectors and are allowed to take the dot product of vectors; nonnegativity constraints on real variables in LP (''linear programming'') are replaced by semidefiniteness constraints on matrix variables in SDP (''semidefinite programming''). Specifically, a general semidefinite programming problem can be defined as any mathematical programming problem of the form :

\begin
 &  \\
\text &  \textk \\
\end

where the

c_, a_

, and the

b_k

are real numbers and

x^i \cdot x^j

is the

dot product In mathematics, the dot product or scalar productThe term ''scalar product'' means literally "product with a scalar as a result". It is also used sometimes for other symmetric bilinear forms, for example in a pseudo-Euclidean space. is an alg ...

x^i

and

x^j

Equivalent formulations

n \times n

matrix

M

is said to be positive semidefinite if it is the

Gram matrix In linear algebra, the Gram matrix (or Gramian matrix, Gramian) of a set of vectors v_1,\dots, v_n in an inner product space is the Hermitian matrix of inner products, whose entries are given by the inner product G_ = \left\langle v_i, v_j \right\r ...

of some vectors (i.e. if there exist vectors

x^1, \ldots, x^n

such that

m_=x^i \cdot x^j

for all

i,j

). If this is the case, we denote this as

M \succeq 0

. Note that there are several other equivalent definitions of being positive semidefinite, for example, positive semidefinite matrices are

self-adjoint In mathematics, and more specifically in abstract algebra, an element ''x'' of a *-algebra is self-adjoint if x^*=x. A self-adjoint element is also Hermitian, though the reverse doesn't necessarily hold. A collection ''C'' of elements of a sta ...

matrices that have only non-negative

eigenvalues In linear algebra, an eigenvector () or characteristic vector of a linear transformation is a nonzero vector that changes at most by a scalar factor when that linear transformation is applied to it. The corresponding eigenvalue, often denoted b ...

. Denote by

\mathbb^n

the space of all

n \times n

real symmetric matrices. The space is equipped with the

inner product In mathematics, an inner product space (or, rarely, a Hausdorff pre-Hilbert space) is a real vector space or a complex vector space with an operation called an inner product. The inner product of two vectors in the space is a scalar, often ...

(where

denotes the

trace Trace may refer to: Arts and entertainment Music * ''Trace'' (Son Volt album), 1995 * ''Trace'' (Died Pretty album), 1993 * Trace (band), a Dutch progressive rock band * ''The Trace'' (album) Other uses in arts and entertainment * ''Trace'' ...

)

\langle A,B\rangle_ = (A^T B) = \sum_^n
  A_B_.

We can rewrite the mathematical program given in the previous section equivalently as :

\begin
 & \langle C, X \rangle_ \\
\text & \langle A_k, X \rangle_ \leq b_k, \quad k = 1,\ldots,m \\
& X \succeq 0.
\end

where entry

i,j

C

is given by

\frac

from the previous section and

A_k

is a symmetric

n \times n

matrix having

i,j

th entry

\frac

from the previous section. Thus, the matrices

C

and

A_k

are symmetric and the above inner products are well-defined. Note that if we add

slack variable In an optimization problem, a slack variable is a variable that is added to an inequality constraint to transform it into an equality. Introducing a slack variable replaces an inequality constraint with an equality constraint and a non-negativity c ...

s appropriately, this SDP can be converted to one of the form :

\begin
 & \langle C, X \rangle_ \\
\text & \langle A_k, X \rangle_ = b_k, \quad k = 1,\ldots,m \\
& X \succeq 0.
\end

For convenience, an SDP may be specified in a slightly different, but equivalent form. For example, linear expressions involving nonnegative

scalar Scalar may refer to: *Scalar (mathematics), an element of a field, which is used to define a vector space, usually the field of real numbers *Scalar (physics), a physical quantity that can be described by a single element of a number field such a ...

variables may be added to the program specification. This remains an SDP because each variable can be incorporated into the matrix

X

as a diagonal entry (

X_

for some

i

). To ensure that

X_ \geq 0

, constraints

X_ = 0

can be added for all

j \neq i

. As another example, note that for any positive semidefinite matrix

X

, there exists a set of vectors

\

such that the

i

j

entry of

X

X_ = (v_i, v_j)

the

scalar product In mathematics, the dot product or scalar productThe term ''scalar product'' means literally "product with a scalar as a result". It is also used sometimes for other symmetric bilinear forms, for example in a pseudo-Euclidean space. is an alg ...

v_i

and

v_j

. Therefore, SDPs are often formulated in terms of linear expressions on scalar products of vectors. Given the solution to the SDP in the standard form, the vectors

\

can be recovered in

O(n^3)

time (e.g., by using an incomplete

Cholesky decomposition In linear algebra, the Cholesky decomposition or Cholesky factorization (pronounced ) is a decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose, which is useful for effi ...

of X).

Duality theory

Definitions

Analogously to linear programming, given a general SDP of the form :

\begin
 & \langle C, X \rangle_ \\
\text & \langle A_i, X \rangle_ = b_i, \quad i = 1,\ldots,m \\
& X \succeq 0
\end

(the primal problem or P-SDP), we define the '' dual'' semidefinite program (D-SDP) as :

\begin
 & \langle b, y \rangle_ \\
\text &  y_i A_i \preceq C
\end

where for any two matrices

P

and

Q

P \succeq Q

means

P-Q \succeq 0

Weak duality

The

weak duality In applied mathematics, weak duality is a concept in optimization which states that the duality gap is always greater than or equal to 0. That means the solution to the dual (minimization) problem is ''always'' greater than or equal to the soluti ...

theorem states that the value of the primal SDP is at least the value of the dual SDP. Therefore, any feasible solution to the dual SDP lower-bounds the primal SDP value, and conversely, any feasible solution to the primal SDP upper-bounds the dual SDP value. This is because :

\langle C, X \rangle - \langle b, y \rangle
= \langle C, X \rangle - \sum_^m y_i b_i
= \langle C, X \rangle - \sum_^m y_i \langle A_i, X \rangle
= \langle C - \sum_^m y_i A_i, X \rangle
\geq 0,

where the last inequality is because both matrices are positive semidefinite, and the result of this function is sometimes referred to as duality gap.

Strong duality

When the value of the primal and dual SDPs are equal, the SDP is said to satisfy the

strong duality Strong duality is a condition in mathematical optimization in which the primal optimal objective and the dual optimal objective are equal. This is as opposed to weak duality (the primal problem has optimal value smaller than or equal to the dual p ...

property. Unlike

, where every dual linear program has optimal objective equal to the primal objective, not every SDP satisfies strong duality; in general, the value of the dual SDP may lie strictly below the value of the primal, and the P-SDP and D-SPD satisfy the following properties: (i) Suppose the primal problem (P-SDP) is bounded below and strictly feasible (i.e., there exists

X_0\in\mathbb^n, X_0\succ 0

such that

\langle
A_i,X_0\rangle_ = b_i

i=1,\ldots,m

). Then there is an optimal solution

y^*

to (D-SDP) and :

\langle C,X^*\rangle_ = \langle b,y^*\rangle_.

(ii) Suppose the dual problem (D-SDP) is bounded above and strictly feasible (i.e.,

\sum_^m (y_0)_i A_i
\prec C

for some

y_0\in\R^m

). Then there is an optimal solution

X^*

to (P-SDP) and the equality from (i) holds. A sufficient condition for strong duality to hold for a SDP problem (and in general, for any convex optimization problem) is the

Slater's condition In mathematics, Slater's condition (or Slater condition) is a sufficient condition for strong duality to hold for a convex optimization problem, named after Morton L. Slater. Informally, Slater's condition states that the feasible region must ha ...

. It is also possible to attain strong duality for SDPs without additional regularity conditions by using an extended dual problem proposed by Ramana.

Examples

Example 1

Consider three random variables

A

B

, and

C

. By definition, their correlation coefficients

\rho_, \ \rho_, \rho_

are valid if and only if :

\begin
  1 & \rho_ & \rho_ \\
  \rho_ & 1 & \rho_ \\
  \rho_ & \rho_ & 1
\end \succeq 0,

in which case this matrix is called the

correlation matrix In statistics, correlation or dependence is any statistical relationship, whether causal or not, between two random variables or bivariate data. Although in the broadest sense, "correlation" may indicate any type of association, in statistic ...

. Suppose that we know from some prior knowledge (empirical results of an experiment, for example) that

-0.2 \leq \rho_ \leq -0.1

and

0.4 \leq \rho_ \leq 0.5

. The problem of determining the smallest and largest values that

\rho_ \

can take is given by: :

\begin
 & x_ \\
\text & -0.2 \leq x_ \leq -0.1\\
& 0.4 \leq x_ \leq 0.5\\
& \begin
  1 & x_ & x_ \\
  x_ & 1 & x_ \\
  x_ & x_ & 1
\end \succeq 0
\end

We set

\rho_ = x_, \ \rho_ = x_, \ \rho_ = x_

to obtain the answer. This can be formulated by an SDP. We handle the inequality constraints by augmenting the variable matrix and introducing

s, for example

\mathrm\left(\left(\begin
0 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 0\end\right)\cdot\left(\begin
1 & x_ & x_ & 0 & 0 & 0\\
x_ & 1 & x_ & 0 & 0 & 0\\
x_ & x_ & 1 & 0 & 0 & 0\\
0 & 0 & 0 & s_ & 0 & 0\\
0 & 0 & 0 & 0 & s_ & 0\\
0 & 0 & 0 & 0 & 0 & s_\end\right)\right)=x_ + s_=-0.1

Solving this SDP gives the minimum and maximum values of

\rho_ = x_ \

-0.978

and

0.872

respectively.

Example 2

Consider the problem : minimize

\frac

: subject to

Ax +b\geq 0

where we assume that

d^Tx>0

whenever

Ax+b\geq 0

. Introducing an auxiliary variable

t

the problem can be reformulated: : minimize

t

: subject to

Ax+b\geq 0, \, \frac\leq t

In this formulation, the objective is a linear function of the variables

x,t

. The first restriction can be written as :

\textbf(Ax+b)\geq 0

where the matrix

\textbf(Ax+b)

is the square matrix with values in the diagonal equal to the elements of the vector

Ax+b

. The second restriction can be written as :

td^Tx-(c^Tx)^2\geq 0

Defining

D

as follows :

D=\left begint&c^Tx\\c^Tx&d^Tx\end\right /math>

We can use the theory of Schur Complements to see that

: D \succeq 0 (Boyd and Vandenberghe, 1996)

The semidefinite program associated with this problem is

: minimize t : subject to \left begin\textbf(Ax+b)&0&0\\0&t&c^Tx\\0&c^Tx&d^Tx\end\right \succeq 0

Example 3 (Goemans–Williamson max cut approximation algorithm)

Semidefinite programs are important tools for developing approximation algorithms for NP-hard maximization problems. The first approximation algorithm based on an SDP is due to

Michel Goemans Michel Xavier Goemans (born December, 1964) is a Belgian- American professor of applied mathematics and the RSA Professor of Mathematics at MIT working in discrete mathematics and combinatorial optimization at CSAIL and MIT Operations Research C ...

and

David P. Williamson David Paul Williamson is a professor of operations research at Cornell University, and the editor-in-chief of the ''SIAM Journal on Discrete Mathematics''. He earned his Ph.D. in 1993 from the Massachusetts Institute of Technology under the supe ...

(JACM, 1995). They studied the max cut problem: Given a

graph Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discre ...

''G'' = (''V'', ''E''), output a partition of the vertices ''V'' so as to maximize the number of edges crossing from one side to the other. This problem can be expressed as an integer quadratic program: :Maximize

\sum_ \frac,

such that each

v_i\in\

. Unless

P = NP The P versus NP problem is a major unsolved problem in theoretical computer science. In informal terms, it asks whether every problem whose solution can be quickly verified can also be quickly solved. The informal term ''quickly'', used above ...

, we cannot solve this maximization problem efficiently. However, Goemans and Williamson observed a general three-step procedure for attacking this sort of problem: # ''Relax'' the integer quadratic program into an SDP. # Solve the SDP (to within an arbitrarily small additive error

\epsilon

). # ''Round'' the SDP solution to obtain an approximate solution to the original integer quadratic program. For max cut, the most natural relaxation is :

\max \sum_ \frac,

such that

\lVert v_i\rVert^2 = 1

, where the maximization is over vectors

\

instead of integer scalars. This is an SDP because the objective function and constraints are all linear functions of vector inner products. Solving the SDP gives a set of unit vectors in

\mathbf

; since the vectors are not required to be collinear, the value of this relaxed program can only be higher than the value of the original quadratic integer program. Finally, a rounding procedure is needed to obtain a partition. Goemans and Williamson simply choose a uniformly random hyperplane through the origin and divide the vertices according to which side of the hyperplane the corresponding vectors lie. Straightforward analysis shows that this procedure achieves an expected ''approximation ratio'' (performance guarantee) of 0.87856 - ε. (The expected value of the cut is the sum over edges of the probability that the edge is cut, which is proportional to the angle

\cos^\langle v_, v_\rangle

between the vectors at the endpoints of the edge over

\pi

. Comparing this probability to

(1-\langle v_, v_\rangle)/

, in expectation the ratio is always at least 0.87856.) Assuming the

unique games conjecture In computational complexity theory, the unique games conjecture (often referred to as UGC) is a conjecture made by Subhash Khot in 2002. The conjecture postulates that the problem of determining the approximate ''value'' of a certain type of ga ...

, it can be shown that this approximation ratio is essentially optimal. Since the original paper of Goemans and Williamson, SDPs have been applied to develop numerous approximation algorithms. Recently, Prasad Raghavendra has developed a general framework for constraint satisfaction problems based on the

Algorithms

There are several types of algorithms for solving SDPs. These algorithms output the value of the SDP up to an additive error

\epsilon

in time that is polynomial in the program description size and

\log (1/\epsilon)

. There are also facial reduction algorithms that can be used to preprocess SDPs problems by inspecting the constraints of the problem. These can be used to detect lack of strict feasibility, to delete redundant rows and columns, and also to reduce the size of the variable matrix.

Interior point methods

Most codes are based on

(CSDP,

MOSEK MOSEK is a software package for the solution of linear, mixed-integer linear, quadratic, mixed-integer quadratic, quadratically constraint, conic and convex nonlinear mathematical optimization problems. The applicability of the solver varies wide ...

, SeDuMi
SDPT3
DSDP, SDPA). Robust and efficient for general linear SDP problems. Restricted by the fact that the algorithms are second-order methods and need to store and factorize a large (and often dense) matrix. Theoretically, the state-of-the-art high-accuracy SDP algorithms are based on this approach.

First-order methods

First-order methods for

conic optimization Conic optimization is a subfield of convex optimization that studies problems consisting of minimizing a convex function over the intersection of an affine subspace and a convex cone. The class of conic optimization problems includes some of the ...

avoid computing, storing and factorizing a large Hessian matrix and scale to much larger problems than interior point methods, at some cost in accuracy. A first-order method is implemented in the Splitting Cone Solver (SCS). Another first-order method is the

alternating direction method of multipliers Augmented Lagrangian methods are a certain class of algorithms for solving constrained optimization problems. They have similarities to penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems ...

(ADMM). This method requires in every step projection on the cone of semidefinite matrices.

Bundle method

The code ConicBundle formulates the SDP problem as a nonsmooth optimization problem and solves it by the Spectral Bundle method of nonsmooth optimization. This approach is very efficient for a special class of linear SDP problems.

Other solving methods

Algorithms based on

Augmented Lagrangian method Augmented Lagrangian methods are a certain class of algorithms for solving constrained optimization problems. They have similarities to penalty methods in that they replace a constrained optimization problem by a series of unconstrained problems ...

(PENSDP) are similar in behavior to the interior point methods and can be specialized to some very large scale problems. Other algorithms use low-rank information and reformulation of the SDP as a

nonlinear programming In mathematics, nonlinear programming (NLP) is the process of solving an optimization problem where some of the constraints or the objective function are nonlinear. An optimization problem is one of calculation of the extrema (maxima, minima or st ...

problem (SDPLR).

Approximate methods

Algorithms that solve SDPs approximately have been proposed as well. The main goal of such methods is to achieve lower complexity in applications where approximate solutions are sufficient and complexity must be minimal. A prominent method that has been used for data detection in multiple-input multiple-output (MIMO) wireless systems is Triangular Approximate SEmidefinite Relaxation (TASER), which operates on the Cholesky decomposition factors of the semidefinite matrix instead of the semidefinite matrix. This method calculates approximate solutions for a max-cut-like problem that are often comparable to solutions from exact solvers but in only 10-20 algorithm iterations.

Applications

Semidefinite programming has been applied to find approximate solutions to combinatorial optimization problems, such as the solution of the

max cut For a graph, a maximum cut is a cut whose size is at least the size of any other cut. That is, it is a partition of the graph's vertices into two complementary sets and , such that the number of edges between and is as large as possible. F ...

problem with an

approximation ratio An approximation is anything that is intentionally similar but not exactly equal to something else. Etymology and usage The word ''approximation'' is derived from Latin ''approximatus'', from ''proximus'' meaning ''very near'' and the prefix '' ...

of 0.87856. SDPs are also used in geometry to determine tensegrity graphs, and arise in control theory as LMIs, and in inverse elliptic coefficient problems as convex, non-linear, semidefiniteness constraints. It is also widely used in physics to constrain conformal field theories with the

conformal bootstrap The conformal bootstrap is a non-perturbative mathematical method to constrain and solve conformal field theories, i.e. models of particle physics or statistical physics that exhibit similar properties at different levels of resolution. Overview ...

References

* Lieven Vandenberghe, Stephen Boyd, "Semidefinite Programming", SIAM Review 38, March 1996, pp. 49–95
pdf
* Monique Laurent, Franz Rendl, "Semidefinite Programming and Integer Programming", Report PNA-R0210, CWI, Amsterdam, April 2002

* E. de Klerk, "Aspects of Semidefinite Programming: Interior Point Algorithms and Selected Applications", Kluwer Academic Publishers, March 2002, . * Robert M. Freund, "Introduction to Semidefinite Programming (SDP)
SDP-Introduction

External links

Lecture notes
from

László Lovász László Lovász (; born March 9, 1948) is a Hungarian mathematician and professor emeritus at Eötvös Loránd University, best known for his work in combinatorics, for which he was awarded the 2021 Abel Prize jointly with Avi Wigderson. He wa ...

on Semidefinite Programming {{Mathematical optimization software Convex optimization P-complete problems Real algebraic geometry Linear programming