In mathematics, the matrix sign function is a matrix function on square matrices analogous to the complex sign function. It was introduced by J. D. Roberts in 1971, as a tool for model reduction and for solving Lyapunov and algebraic Riccati equations, in a technical report of Cambridge University, which was later published in a journal in 1980.


Definition

The matrix sign function is a generalization of the complex signum function

\operatorname{sign}(z) = \begin{cases} 1 & \text{if } \mathrm{Re}(z) > 0, \\ -1 & \text{if } \mathrm{Re}(z) < 0, \end{cases}

to the matrix-valued analogue \operatorname{sign}(A). Although the sign function is not analytic, the matrix function is well defined for all matrices that have no eigenvalue on the imaginary axis; see for example the Jordan-form-based definition (where the derivatives are all zero).
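For example, if A = P \begin{bmatrix} 2 & 0 \\ 0 & -3 \end{bmatrix} P^{-1} for some invertible P, then \operatorname{sign}(A) = P \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} P^{-1}: each eigenvalue is simply replaced by its scalar sign.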


Properties

Theorem: Let A\in\C^{n\times n}, then \operatorname{sign}(A)^2 = I.

Theorem: Let A\in\C^{n\times n}, then \operatorname{sign}(A) is diagonalizable and has eigenvalues that are \pm 1.

Theorem: Let A\in\C^{n\times n}, then (I+\operatorname{sign}(A))/2 is a projector onto the invariant subspace associated with the eigenvalues in the right half-plane, and analogously for (I-\operatorname{sign}(A))/2 and the left half-plane.

Theorem: Let A\in\C^{n\times n}, and let A = P\begin{bmatrix} J_+ & 0 \\ 0 & J_- \end{bmatrix}P^{-1} be a Jordan decomposition such that J_+ corresponds to eigenvalues with positive real part and J_- to eigenvalues with negative real part. Then \operatorname{sign}(A) = P\begin{bmatrix} I_+ & 0 \\ 0 & -I_- \end{bmatrix}P^{-1}, where I_+ and I_- are identity matrices of sizes corresponding to J_+ and J_-, respectively.
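The first three theorems are easy to check numerically. SciPy ships the matrix sign function as scipy.linalg.signm; a small sanity check might look as follows (the test matrix is arbitrary, chosen only so that no eigenvalue lies on the imaginary axis):

import numpy as np
from scipy.linalg import signm

A = np.array([[4.0,  1.0,  0.0],
              [0.0, -2.0,  1.0],
              [1.0,  0.0, -3.0]])   # eigenvalues well away from the imaginary axis
S = signm(A)
print(np.allclose(S @ S, np.eye(3)))        # sign(A)^2 = I
print(np.sort(np.linalg.eigvals(S).real))   # eigenvalues are all +/- 1
P_plus = (np.eye(3) + S) / 2                # projector onto the right-half-plane invariant subspace
print(np.allclose(P_plus @ P_plus, P_plus))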


Computational methods

The function can be computed with generic methods for matrix functions, but there are also specialized methods.


Newton iteration

The Newton iteration can be derived by observing that \operatorname{sign}(x) = \sqrt{x^2}/x, which in terms of matrices can be written as \operatorname{sign}(A) = A^{-1}\sqrt{A^2}, where we use the matrix square root. If we apply the Babylonian method to compute the square root of the matrix A^2, that is, the iteration X_{k+1} = \frac{1}{2}\left(X_k + A^2 X_k^{-1}\right), and define the new iterate Z_k = A^{-1}X_k, we arrive at the iteration

Z_{k+1} = \frac{1}{2}\left(Z_k + Z_k^{-1}\right),

where typically Z_0 = A. Convergence is global, and locally it is quadratic. The Newton iteration uses the explicit inverse of the iterates Z_k.
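As an illustration, the iteration is only a few lines of NumPy. The following is a minimal sketch rather than a robust implementation (no scaling is applied, and the name sign_newton is ours); it stops once the iterates stagnate:

import numpy as np

def sign_newton(A, tol=1e-12, max_iter=100):
    # Z_{k+1} = (Z_k + Z_k^{-1}) / 2 with Z_0 = A; requires that A has
    # no eigenvalues on the imaginary axis (so every iterate is invertible).
    Z = A.astype(complex)
    for _ in range(max_iter):
        Z_new = 0.5 * (Z + np.linalg.inv(Z))
        if np.linalg.norm(Z_new - Z, 1) <= tol * np.linalg.norm(Z_new, 1):
            return Z_new
        Z = Z_new
    return Z

For instance, sign_newton(np.diag([2.0, -3.0])) converges to \mathrm{diag}(1,-1) in a handful of steps.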


Newton–Schulz iteration

To avoid the need of an explicit inverse used in the Newton iteration, the inverse can be approximated with one step of the Newton iteration for the inverse, Z_k^{-1} \approx Z_k\left(2I - Z_k^2\right), derived by Schulz in 1933. Substituting this approximation into the previous method, the new method becomes

Z_{k+1} = \frac{1}{2}Z_k\left(3I - Z_k^2\right).

Convergence is (still) quadratic, but only local (guaranteed for \|I-A^2\| < 1).
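Under the stated condition \|I - A^2\| < 1, the inverse-free variant is a one-line change of the sketch above (again, the function name is illustrative):

def sign_newton_schulz(A, tol=1e-12, max_iter=100):
    # Z_{k+1} = Z_k (3I - Z_k^2) / 2; trades the explicit inverse for two
    # matrix products, but converges only locally.
    Z = A.astype(complex)
    I = np.eye(A.shape[0])
    for _ in range(max_iter):
        Z_new = 0.5 * Z @ (3 * I - Z @ Z)
        if np.linalg.norm(Z_new - Z, 1) <= tol * np.linalg.norm(Z_new, 1):
            return Z_new
        Z = Z_new
    return Z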


Applications


Solutions of Sylvester equations

Theorem: Let A,B,C\in\R^{n\times n} and assume that A and B are stable. Then the unique solution to the Sylvester equation, AX + XB = C, is given by X such that

\begin{bmatrix} -I & 2X \\ 0 & I \end{bmatrix} = \operatorname{sign}\left( \begin{bmatrix} A & -C \\ 0 & -B \end{bmatrix} \right).

''Proof sketch:'' The result follows from the similarity transform

\begin{bmatrix} A & -C \\ 0 & -B \end{bmatrix} = \begin{bmatrix} I & X \\ 0 & I \end{bmatrix} \begin{bmatrix} A & 0 \\ 0 & -B \end{bmatrix} \begin{bmatrix} I & X \\ 0 & I \end{bmatrix}^{-1},

since

\operatorname{sign}\left( \begin{bmatrix} A & -C \\ 0 & -B \end{bmatrix} \right) = \begin{bmatrix} I & X \\ 0 & I \end{bmatrix} \begin{bmatrix} -I & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} I & -X \\ 0 & I \end{bmatrix},

due to the stability of A and B (so that \operatorname{sign}(A) = -I and \operatorname{sign}(-B) = I).

The theorem is, naturally, also applicable to the Lyapunov equation. However, due to the structure, the Newton iteration simplifies to involving only inverses of A and A^T.
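A direct transcription of the theorem, reusing the sign_newton sketch from above (the helper name is ours; scipy.linalg.signm could be substituted):

def solve_sylvester_sign(A, B, C):
    # Embed the data in the block matrix [[A, -C], [0, -B]]; by the theorem,
    # the top-right block of its sign equals 2X, where AX + XB = C.
    n = A.shape[0]
    Z = np.zeros((n, n))
    S = sign_newton(np.block([[A, -C], [Z, -B]]))
    return 0.5 * S[:n, n:]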


Solutions of algebraic Riccati equations

There is a similar result applicable to the algebraic Riccati equation, A^H P + P A - P F P + Q = 0, where A^H denotes the conjugate transpose. Define V,W\in\Complex^{2n\times n} by

\begin{bmatrix} V & W \end{bmatrix} = \operatorname{sign}\left( \begin{bmatrix} A^H & Q \\ F & -A \end{bmatrix} \right) - \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}.

Under the assumption that F,Q\in\Complex^{n\times n} are Hermitian and there exists a unique stabilizing solution, in the sense that A-FP is stable, that solution is given by the over-determined, but consistent, linear system VP = -W.

''Proof sketch:'' The similarity transform

\begin{bmatrix} A^H & Q \\ F & -A \end{bmatrix} = \begin{bmatrix} P & -I \\ I & 0 \end{bmatrix} \begin{bmatrix} -(A-FP) & -F \\ 0 & (A-FP)^H \end{bmatrix} \begin{bmatrix} P & -I \\ I & 0 \end{bmatrix}^{-1},

and the stability of A-FP imply that

\left( \operatorname{sign}\left( \begin{bmatrix} A^H & Q \\ F & -A \end{bmatrix} \right) - \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} \right) \begin{bmatrix} P & -I \\ I & 0 \end{bmatrix} = \begin{bmatrix} P & -I \\ I & 0 \end{bmatrix} \begin{bmatrix} 0 & Y \\ 0 & -2I \end{bmatrix},

for some matrix Y\in\Complex^{n\times n}. Reading off the first block column of this identity gives VP + W = 0, that is, VP = -W.
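In the same spirit, a sketch for the Riccati case (assuming Hermitian F and Q and the existence of a stabilizing solution; the least-squares solve handles the over-determined but consistent system VP = -W):

def solve_riccati_sign(A, F, Q):
    # Form the 2n x 2n matrix [[A^H, Q], [F, -A]], subtract the identity
    # from its sign, and split the result into the blocks [V  W].
    n = A.shape[0]
    H = np.block([[A.conj().T, Q], [F, -A]])
    VW = sign_newton(H) - np.eye(2 * n)
    V, W = VW[:, :n], VW[:, n:]
    P, *_ = np.linalg.lstsq(V, -W, rcond=None)
    return P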


Computation of the matrix square root

The Denman–Beavers iteration for the square root of a matrix can be derived from the Newton iteration for the matrix sign function by noticing that A - PIP = 0 is a degenerate algebraic Riccati equation and that, by definition, a solution P is a square root of A.
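A minimal NumPy sketch of the coupled Denman–Beavers iteration (the name is illustrative; Y_k converges to \sqrt{A} and Z_k to \sqrt{A}^{-1} when A has no eigenvalues on the closed negative real axis):

def sqrtm_denman_beavers(A, tol=1e-12, max_iter=100):
    # Coupled iteration: Y_{k+1} = (Y_k + inv(Z_k))/2, Z_{k+1} = (Z_k + inv(Y_k))/2,
    # started from Y_0 = A and Z_0 = I.
    Y = A.astype(complex)
    Z = np.eye(A.shape[0], dtype=complex)
    for _ in range(max_iter):
        Y_new = 0.5 * (Y + np.linalg.inv(Z))
        Z_new = 0.5 * (Z + np.linalg.inv(Y))
        if np.linalg.norm(Y_new - Y, 1) <= tol * np.linalg.norm(Y_new, 1):
            return Y_new
        Y, Z = Y_new, Z_new
    return Y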

