Moore–Penrose inverse
   HOME

TheInfoList



OR:

In mathematics, and in particular
linear algebra Linear algebra is the branch of mathematics concerning linear equations such as: :a_1x_1+\cdots +a_nx_n=b, linear maps such as: :(x_1, \ldots, x_n) \mapsto a_1x_1+\cdots +a_nx_n, and their representations in vector spaces and through matrices ...
, the Moore–Penrose inverse of a matrix is the most widely known generalization of the
inverse matrix In linear algebra, an -by- square matrix is called invertible (also nonsingular or nondegenerate), if there exists an -by- square matrix such that :\mathbf = \mathbf = \mathbf_n \ where denotes the -by- identity matrix and the multiplicati ...
. It was independently described by E. H. Moore in 1920, Arne Bjerhammar in 1951, and Roger Penrose in 1955. Earlier, Erik Ivar Fredholm had introduced the concept of a pseudoinverse of
integral operator An integral operator is an operator that involves integration. Special instances are: * The operator of integration itself, denoted by the integral symbol * Integral linear operators, which are linear operators induced by bilinear forms invol ...
s in 1903. When referring to a matrix, the term
pseudoinverse In mathematics, and in particular, algebra, a generalized inverse (or, g-inverse) of an element ''x'' is an element ''y'' that has some properties of an inverse element but not necessarily all of them. The purpose of constructing a generalized inv ...
, without further specification, is often used to indicate the Moore–Penrose inverse. The term generalized inverse is sometimes used as a synonym for pseudoinverse. A common use of the pseudoinverse is to compute a "best fit" ( least squares) solution to a system of linear equations that lacks a solution (see below under § Applications). Another use is to find the minimum ( Euclidean) norm solution to a system of linear equations with multiple solutions. The pseudoinverse facilitates the statement and proof of results in linear algebra. The pseudoinverse is defined and unique for all matrices whose entries are
real Real may refer to: Currencies * Brazilian real (R$) * Central American Republic real * Mexican real * Portuguese real * Spanish real * Spanish colonial real Music Albums * ''Real'' (L'Arc-en-Ciel album) (2000) * ''Real'' (Bright album) (2010) ...
or
complex Complex commonly refers to: * Complexity, the behaviour of a system whose components interact in multiple ways so possible interactions are difficult to describe ** Complex system, a system composed of many components which may interact with each ...
numbers. It can be computed using the
singular value decomposition In linear algebra, the singular value decomposition (SVD) is a factorization of a real or complex matrix. It generalizes the eigendecomposition of a square normal matrix with an orthonormal eigenbasis to any \ m \times n\ matrix. It is re ...
. In the special case where is a
normal matrix In mathematics, a complex square matrix is normal if it commutes with its conjugate transpose : The concept of normal matrices can be extended to normal operators on infinite dimensional normed spaces and to normal elements in C*-algebras. As ...
(for example, a Hermitian matrix), the pseudoinverse annihilates the
kernel Kernel may refer to: Computing * Kernel (operating system), the central component of most operating systems * Kernel (image processing), a matrix used for image convolution * Compute kernel, in GPGPU programming * Kernel method, in machine learn ...
of and acts as a traditional inverse of on the subspace orthogonal to the kernel.


Notation

In the following discussion, the following conventions are adopted. * will denote one of the
fields Fields may refer to: Music * Fields (band), an indie rock band formed in 2006 * Fields (progressive rock band), a progressive rock band formed in 1971 * ''Fields'' (album), an LP by Swedish-based indie rock band Junip (2010) * "Fields", a song b ...
of real or complex numbers, denoted , , respectively. The vector space of matrices over is denoted by . * For , and denote the transpose and Hermitian transpose (also called
conjugate transpose In mathematics, the conjugate transpose, also known as the Hermitian transpose, of an m \times n complex matrix \boldsymbol is an n \times m matrix obtained by transposing \boldsymbol and applying complex conjugate on each entry (the complex c ...
) respectively. If \mathbb = \mathbb, then A^* = A^\textsf. * For , (standing for " range") denotes the column space (image) of (the space spanned by the column vectors of ) and denotes the
kernel Kernel may refer to: Computing * Kernel (operating system), the central component of most operating systems * Kernel (image processing), a matrix used for image convolution * Compute kernel, in GPGPU programming * Kernel method, in machine learn ...
(null space) of . * Finally, for any positive integer , denotes the identity matrix.


Definition

For , a pseudoinverse of is defined as a matrix satisfying all of the following four criteria, known as the Moore–Penrose conditions: # need not be the general identity matrix, but it maps all column vectors of to themselves: A A^+ A = \; A # acts like a weak inverse: A^+ A A^+ = \; A^+ # is
Hermitian {{Short description, none Numerous things are named after the French mathematician Charles Hermite (1822–1901): Hermite * Cubic Hermite spline, a type of third-degree spline * Gauss–Hermite quadrature, an extension of Gaussian quadrature m ...
: \left(A A^+\right)^* = \; A A^+ # is also Hermitian: \left(A^+ A\right)^* = \; A^+ A exists for any matrix , but, when the latter has full
rank Rank is the relative position, value, worth, complexity, power, importance, authority, level, etc. of a person or object within a ranking, such as: Level or position in a hierarchical organization * Academic rank * Diplomatic rank * Hierarchy * ...
(that is, the rank of is ), then can be expressed as a simple algebraic formula. In particular, when has linearly independent columns (and thus matrix is invertible), can be computed as A^+ = \left(A^* A\right)^ A^*. This particular pseudoinverse constitutes a ''left inverse'', since, in this case, A^+A = I . When has linearly independent rows (matrix is invertible), can be computed as A^+ = A^* \left(A A^*\right)^. This is a ''right inverse'', as A A^+ = I.


Properties


Existence and uniqueness

The pseudoinverse exists and is unique: for any matrix , there is precisely one matrix , that satisfies the four properties of the definition. A matrix satisfying the first condition of the definition is known as a generalized inverse. If the matrix also satisfies the second definition, it is called a generalized ''reflexive'' inverse. Generalized inverses always exist but are not in general unique. Uniqueness is a consequence of the last two conditions.


Basic properties

Proofs for the properties below can be found at b:Topics in Abstract Algebra/Linear algebra. * If has real entries, then so does . * If is
invertible In mathematics, the concept of an inverse element generalises the concepts of opposite () and reciprocal () of numbers. Given an operation denoted here , and an identity element denoted , if , one says that is a left inverse of , and that is ...
, its pseudoinverse is its inverse. That is, A^+ = A^.. * For any n\in\mathbb Z_ and any v\in\mathbb k^n, (\operatorname(v))^+=\operatorname(r\circ v), where r=(x^)_\cup\. (We use r for "reciprocal".) * More generally, for any m,n\in\mathbb Z_ and any v\in\mathbb k^, the rectangular diagonal matrix \begin &((((m\times n)\setminus\)\times\)\cup\)^+ \\ =&((((n\times m)\setminus\)\times\)\cup\), \end where r = (x^)_\cup\. * It follows that the pseudoinverse of a
zero matrix In mathematics, particularly linear algebra, a zero matrix or null matrix is a matrix all of whose entries are zero. It also serves as the additive identity of the additive group of m \times n matrices, and is denoted by the symbol O or 0 followed ...
is its transpose. * The pseudoinverse of the pseudoinverse is the original matrix: \left(A^+\right)^+ = A. * Pseudoinversion commutes with transposition, complex conjugation, and taking the conjugate transpose: \left(A^\textsf\right)^+ = \left(A^+\right)^\textsf, \quad \left(\overline\right)^+ = \overline, \quad \left(A^*\right)^+ = \left(A^+\right)^* . * The pseudoinverse of a scalar multiple of is the reciprocal multiple of :` \left(\alpha A\right)^+ = \alpha^ A^+ for .


Identities

The following identity formula can be used to cancel or expand certain subexpressions involving pseudoinverses: A = AA^*A^ = A^A^*A Equivalently, substituting A^+ for A gives A^+ =A^+A^A^* = A^*A^A^+ while substituting A^* for A gives A^* =A^*AA^+=A^+AA^*.


Reduction to Hermitian case

The computation of the pseudoinverse is reducible to its construction in the Hermitian case. This is possible through the equivalences: A^+ = \left(A^*A\right)^+ A^*, A^+ = A^* \left(A A^*\right)^+, as and are Hermitian.


Products

Suppose . Then the following are equivalent: # # \begin A^+ A BB^* A^* & = BB^* A^*, \\ BB^+ A^* A B & = A^* A B. \end # \begin \left(A^+ A BB^*\right)^* & = A^+ A BB^*, \\ \left(A^* A BB^+\right)^* & = A^* A BB^+. \end # A^+ A BB^* A^* A BB^+ = BB^* A^* A # \begin A^+ A B & = B (AB)^+ AB, \\ BB^+ A^* & = A^* A B (AB)^+. \end The following are sufficient conditions for : # has orthonormal columns (then A^*A = A^+ A = I_n),   or # has orthonormal rows (then BB^* = BB^+ = I_n),   or # has linearly independent columns (then A^+ A = I ) and has linearly independent rows (then BB^+ = I),   or # B = A^*, or # B = A^+. The following is a necessary condition for : # (A^+ A) (BB^+) = (BB^+) (A^+ A) The last sufficient condition yields the equalities \begin \left(A A^*\right)^+ &= A^ A^+, \\ \left(A^* A\right)^+ &= A^+ A^. \end NB: The equality does not hold in general. See the counterexample: \Biggl( \begin 1 & 1 \\ 0 & 0 \end \begin 0 & 0 \\ 1 & 1 \end \Biggr)^+ = \begin 1 & 1 \\ 0 & 0 \end^+ = \begin \tfrac12 & 0 \\ \tfrac12 & 0 \end \quad \neq \quad \begin \tfrac14 & 0 \\ \tfrac14 & 0 \end = \begin 0 & \tfrac12 \\ 0 & \tfrac12 \end \begin \tfrac12 & 0 \\ \tfrac12 & 0 \end = \begin 0 & 0 \\ 1 & 1 \end^+ \begin 1 & 1 \\ 0 & 0 \end^+


Projectors

P = A A^+ and Q = A^+A are orthogonal projection operators, that is, they are Hermitian (P = P^*, Q = Q^*) and idempotent (P^2 = P and Q^2 = Q). The following hold: * PA = AQ = A and A^+ P = QA^+ = A^+ * is the orthogonal projector onto the range of (which equals the
orthogonal complement In the mathematical fields of linear algebra and functional analysis, the orthogonal complement of a subspace ''W'' of a vector space ''V'' equipped with a bilinear form ''B'' is the set ''W''⊥ of all vectors in ''V'' that are orthogonal to every ...
of the kernel of ). * is the orthogonal projector onto the range of (which equals the orthogonal complement of the kernel of ). * I - Q = I - A^+A is the orthogonal projector onto the kernel of . * I - P = I - A A^+ is the orthogonal projector onto the kernel of . The last two properties imply the following identities: * A\,\ \left(I - A^+ A\right)= \left(I - A A^+\right)A\ \ = 0 * A^*\left(I - A A^+\right) = \left(I - A^+A\right)A^* = 0 Another property is the following: if is Hermitian and idempotent (true if and only if it represents an orthogonal projection), then, for any matrix the following equation holds: A(BA)^+ = (BA)^+ This can be proven by defining matrices C = BA, D = A(BA)^+, and checking that is indeed a pseudoinverse for by verifying that the defining properties of the pseudoinverse hold, when is Hermitian and idempotent. From the last property it follows that, if is Hermitian and idempotent, for any matrix (AB)^+A = (AB)^+ Finally, if is an orthogonal projection matrix, then its pseudoinverse trivially coincides with the matrix itself, that is, A^+ = A.


Geometric construction

If we view the matrix as a linear map over the field then can be decomposed as follows. We write for the direct sum, for the
orthogonal complement In the mathematical fields of linear algebra and functional analysis, the orthogonal complement of a subspace ''W'' of a vector space ''V'' equipped with a bilinear form ''B'' is the set ''W''⊥ of all vectors in ''V'' that are orthogonal to every ...
, for the
kernel Kernel may refer to: Computing * Kernel (operating system), the central component of most operating systems * Kernel (image processing), a matrix used for image convolution * Compute kernel, in GPGPU programming * Kernel method, in machine learn ...
of a map, and for the image of a map. Notice that \mathbb^n = \left(\ker A\right)^\perp \oplus \ker A and \mathbb^m = \operatorname A \oplus \left(\operatorname A\right)^\perp. The restriction A: \left(\ker A\right)^\perp \to \operatorname A is then an isomorphism. This implies that on is the inverse of this isomorphism, and is zero on \left(\operatorname A\right)^\perp . In other words: To find for given in , first project orthogonally onto the range of , finding a point in the range. Then form , that is, find those vectors in that sends to . This will be an affine subspace of parallel to the kernel of . The element of this subspace that has the smallest length (that is, is closest to the origin) is the answer we are looking for. It can be found by taking an arbitrary member of and projecting it orthogonally onto the orthogonal complement of the kernel of . This description is closely related to the Minimum norm solution to a linear system.


Subspaces

\begin \ker\left(A^+\right) &= \ker\left(A^*\right) \\ \operatorname\left(A^+\right) &= \operatorname\left(A^*\right) \end


Limit relations

The pseudoinverse are limits: A^+ = \lim_ \left(A^* A + \delta I\right)^ A^* = \lim_ A^* \left(A A^* + \delta I\right)^ (see
Tikhonov regularization Ridge regression is a method of estimating the coefficients of multiple-regression models in scenarios where the independent variables are highly correlated. It has been used in many fields including econometrics, chemistry, and engineering. Also ...
). These limits exist even if or do not exist.


Continuity

In contrast to ordinary matrix inversion, the process of taking pseudoinverses is not continuous: if the sequence converges to the matrix (in the maximum norm or Frobenius norm, say), then need not converge to . However, if all the matrices have the same rank as , will converge to .


Derivative

The derivative of a real valued pseudoinverse matrix which has constant rank at a point may be calculated in terms of the derivative of the original matrix: \frac A^+(x) = -A^+ \left( \frac A \right) A^+ ~+~ A^+ A^ \left(\frac A^\textsf\right) \left(I - A A^+\right) ~+~ \left(I - A^+ A\right) \left(\frac A^\textsf\right) A^ A^+.


Examples

Since for invertible matrices the pseudoinverse equals the usual inverse, only examples of non-invertible matrices are considered below.


Special cases


Scalars

It is also possible to define a pseudoinverse for scalars and vectors. This amounts to treating these as matrices. The pseudoinverse of a scalar is zero if is zero and the reciprocal of otherwise: x^+ = \begin 0, & \mboxx = 0; \\ x^, & \mbox. \end


Vectors

The pseudoinverse of the null (all zero) vector is the transposed null vector. The pseudoinverse of a non-null vector is the conjugate transposed vector divided by its squared magnitude: \vec^+ = \begin \vec^\textsf, & \text \vec = \vec; \\ \dfrac, & \text. \end


Linearly independent columns

If the rank of is identical to its column rank, , (for ,) there are
linearly independent In the theory of vector spaces, a set of vectors is said to be if there is a nontrivial linear combination of the vectors that equals the zero vector. If no such linear combination exists, then the vectors are said to be . These concepts are ...
columns, and is invertible. In this case, an explicit formula is: A^+ = \left(A^*A\right)^A^*. It follows that is then a left inverse of :   A^+ A = I_n.


Linearly independent rows

If the rank of is identical to its row rank, , (for ,) there are
linearly independent In the theory of vector spaces, a set of vectors is said to be if there is a nontrivial linear combination of the vectors that equals the zero vector. If no such linear combination exists, then the vectors are said to be . These concepts are ...
rows, and is invertible. In this case, an explicit formula is: A^+ = A^*\left(A A^*\right)^. It follows that is a right inverse of :   A A^+ = I_m.


Orthonormal columns or rows

This is a special case of either full column rank or full row rank (treated above). If has orthonormal columns (A^*A = I_n) or orthonormal rows (A A^* = I_m), then: A^+ = A^* .


Normal matrices

If is
normal Normal(s) or The Normal(s) may refer to: Film and television * ''Normal'' (2003 film), starring Jessica Lange and Tom Wilkinson * ''Normal'' (2007 film), starring Carrie-Anne Moss, Kevin Zegers, Callum Keith Rennie, and Andrew Airlie * ''Norma ...
; that is, it commutes with its conjugate transpose, then its pseudoinverse can be computed by diagonalizing it, mapping all nonzero eigenvalues to their inverses, and mapping zero eigenvalues to zero. A corollary is that commuting with its transpose implies that it commutes with its pseudoinverse.


Orthogonal projection matrices

This is a special case of a Normal matrix with eigenvalues 0 and 1. If is an orthogonal projection matrix, that is, A = A^* and A^2 = A, then the pseudoinverse trivially coincides with the matrix itself: A^+ = A.


Circulant matrices

For a
circulant matrix In linear algebra, a circulant matrix is a square matrix in which all row vectors are composed of the same elements and each row vector is rotated one element to the right relative to the preceding row vector. It is a particular kind of Toeplit ...
, the singular value decomposition is given by the Fourier transform, that is, the singular values are the Fourier coefficients. Let be the Discrete Fourier Transform (DFT) matrix; then \begin C &= \mathcal\cdot\Sigma\cdot\mathcal^*,\\ C^+ &= \mathcal\cdot\Sigma^+\cdot\mathcal^*. \end


Construction


Rank decomposition

Let denote the
rank Rank is the relative position, value, worth, complexity, power, importance, authority, level, etc. of a person or object within a ranking, such as: Level or position in a hierarchical organization * Academic rank * Diplomatic rank * Hierarchy * ...
of . Then can be (rank) decomposed as A = BC where and are of rank . Then A^+ = C^+B^+ = C^*\left(CC^*\right)^\left(B^*B\right)^B^*.


The QR method

For \mathbb \in \ computing the product or and their inverses explicitly is often a source of numerical rounding errors and computational cost in practice. An alternative approach using the
QR decomposition In linear algebra, a QR decomposition, also known as a QR factorization or QU factorization, is a decomposition of a matrix ''A'' into a product ''A'' = ''QR'' of an orthogonal matrix ''Q'' and an upper triangular matrix ''R''. QR decomp ...
of may be used instead. Consider the case when is of full column rank, so that A^+ = \left(A^*A\right)^A^*. Then the
Cholesky decomposition In linear algebra, the Cholesky decomposition or Cholesky factorization (pronounced ) is a decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose, which is useful for effici ...
A^*A = R^*R, where is an upper triangular matrix, may be used. Multiplication by the inverse is then done easily by solving a system with multiple right-hand sides, A^+ = \left(A^*A\right)^A^* \quad \Leftrightarrow \quad \left(A^*A\right)A^+ = A^* \quad \Leftrightarrow \quad R^*RA^+ = A^* which may be solved by
forward substitution In mathematics, a triangular matrix is a special kind of square matrix. A square matrix is called if all the entries ''above'' the main diagonal are zero. Similarly, a square matrix is called if all the entries ''below'' the main diagonal are ...
followed by back substitution. The Cholesky decomposition may be computed without forming explicitly, by alternatively using the
QR decomposition In linear algebra, a QR decomposition, also known as a QR factorization or QU factorization, is a decomposition of a matrix ''A'' into a product ''A'' = ''QR'' of an orthogonal matrix ''Q'' and an upper triangular matrix ''R''. QR decomp ...
of A = Q R, where Q has orthonormal columns, Q^*Q = I , and is upper triangular. Then A^*A \,=\, (Q R)^*(Q R) \,=\, R^*Q^*Q R \,=\, R^*R , so is the Cholesky factor of . The case of full row rank is treated similarly by using the formula A^+ = A^*\left(A A^*\right)^ and using a similar argument, swapping the roles of and .


Singular value decomposition (SVD)

A computationally simple and accurate way to compute the pseudoinverse is by using the
singular value decomposition In linear algebra, the singular value decomposition (SVD) is a factorization of a real or complex matrix. It generalizes the eigendecomposition of a square normal matrix with an orthonormal eigenbasis to any \ m \times n\ matrix. It is re ...
.Linear Systems & Pseudo-Inverse
/ref> If A = U\Sigma V^* is the singular value decomposition of , then A^+ = V\Sigma^+ U^*. For a rectangular diagonal matrix such as , we get the pseudoinverse by taking the reciprocal of each non-zero element on the diagonal, leaving the zeros in place, and then transposing the matrix. In numerical computation, only elements larger than some small tolerance are taken to be nonzero, and the others are replaced by zeros. For example, in the
MATLAB MATLAB (an abbreviation of "MATrix LABoratory") is a proprietary multi-paradigm programming language and numeric computing environment developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementa ...
or
GNU Octave GNU Octave is a high-level programming language primarily intended for scientific computing and numerical computation. Octave helps in solving linear and nonlinear problems numerically, and for performing other numerical experiments using a langu ...
function , the tolerance is taken to be , where ε is the machine epsilon. The computational cost of this method is dominated by the cost of computing the SVD, which is several times higher than matrix–matrix multiplication, even if a state-of-the art implementation (such as that of
LAPACK LAPACK ("Linear Algebra Package") is a standard software library for numerical linear algebra. It provides routines for solving systems of linear equations and linear least squares, eigenvalue problems, and singular value decomposition. It als ...
) is used. The above procedure shows why taking the pseudoinverse is not a continuous operation: if the original matrix has a singular value 0 (a diagonal entry of the matrix above), then modifying slightly may turn this zero into a tiny positive number, thereby affecting the pseudoinverse dramatically as we now have to take the reciprocal of a tiny number.


Block matrices

Optimized approaches exist for calculating the pseudoinverse of block structured matrices.


The iterative method of Ben-Israel and Cohen

Another method for computing the pseudoinverse (cf.
Drazin inverse In mathematics, the Drazin inverse, named after Michael P. Drazin, is a kind of generalized inverse of a matrix. Let ''A'' be a square matrix. The index of ''A'' is the least nonnegative integer ''k'' such that rank(''A'k''+1) = rank(''A'k'') ...
) uses the recursion A_ = 2A_i - A_i A A_i, which is sometimes referred to as hyper-power sequence. This recursion produces a sequence converging quadratically to the pseudoinverse of if it is started with an appropriate satisfying A_0 A = \left(A_0 A\right)^*. The choice A_0 = \alpha A^* (where 0 < \alpha < 2/\sigma^2_1(A), with denoting the largest singular value of ) has been argued not to be competitive to the method using the SVD mentioned above, because even for moderately ill-conditioned matrices it takes a long time before enters the region of quadratic convergence. However, if started with already close to the Moore–Penrose inverse and A_0 A = \left(A_0 A\right)^*, for example A_0 := \left(A^* A + \delta I\right)^ A^*, convergence is fast (quadratic).


Updating the pseudoinverse

For the cases where has full row or column rank, and the inverse of the correlation matrix ( for with full row rank or for full column rank) is already known, the pseudoinverse for matrices related to can be computed by applying the Sherman–Morrison–Woodbury formula to update the inverse of the correlation matrix, which may need less work. In particular, if the related matrix differs from the original one by only a changed, added or deleted row or column, incremental algorithms exist that exploit the relationship. Similarly, it is possible to update the Cholesky factor when a row or column is added, without creating the inverse of the correlation matrix explicitly. However, updating the pseudoinverse in the general rank-deficient case is much more complicated.


Software libraries

High-quality implementations of SVD, QR, and back substitution are available in standard libraries, such as
LAPACK LAPACK ("Linear Algebra Package") is a standard software library for numerical linear algebra. It provides routines for solving systems of linear equations and linear least squares, eigenvalue problems, and singular value decomposition. It als ...
. Writing one's own implementation of SVD is a major programming project that requires a significant numerical expertise. In special circumstances, such as parallel computing or embedded computing, however, alternative implementations by QR or even the use of an explicit inverse might be preferable, and custom implementations may be unavoidable. The Python package NumPy provides a pseudoinverse calculation through its functions matrix.I and linalg.pinv; its pinv uses the SVD-based algorithm.
SciPy SciPy (pronounced "sigh pie") is a free and open-source Python library used for scientific computing and technical computing. SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, ...
adds a function scipy.linalg.pinv that uses a least-squares solver. The MASS package for R provides a calculation of the Moore–Penrose inverse through the ginv function. The ginv function calculates a pseudoinverse using the singular value decomposition provided by the svd function in the base R package. An alternative is to employ the pinv function available in the pracma package. The Octave programming language provides a pseudoinverse through the standard package function pinv and the pseudo_inverse() method. In
Julia (programming language) Julia is a high-level, dynamic programming language. Its features are well suited for numerical analysis and computational science. Distinctive aspects of Julia's design include a type system with parametric polymorphism in a dynamic program ...
, the LinearAlgebra package of the standard library provides an implementation of the Moore–Penrose inverse pinv() implemented via singular-value decomposition.


Applications


Linear least-squares

The pseudoinverse provides a least squares solution to a system of linear equations. For , given a system of linear equations A x = b, in general, a vector that solves the system may not exist, or if one does exist, it may not be unique. The pseudoinverse solves the "least-squares" problem as follows: * , we have \left\, Ax - b\right\, _2 \ge \left\, Az - b\right\, _2 where z = A^+b and \, \cdot\, _2 denotes the Euclidean norm. This weak inequality holds with equality if and only if x = A^+b + \left(I - A^+A\right)w for any vector ; this provides an infinitude of minimizing solutions unless has full column rank, in which case is a zero matrix. The solution with minimum Euclidean norm is This result is easily extended to systems with multiple right-hand sides, when the Euclidean norm is replaced by the Frobenius norm. Let . * , we have \, AX - B\, _ \ge \, AZ -B\, _ where Z = A^+B and \, \cdot\, _ denotes the
Frobenius norm In mathematics, a matrix norm is a vector norm in a vector space whose elements (vectors) are matrices (of given dimensions). Preliminaries Given a field K of either real or complex numbers, let K^ be the -vector space of matrices with m ro ...
.


Obtaining all solutions of a linear system

If the linear system A x = b has any solutions, they are all given by x = A^+ b + \left - A^+ A\right for arbitrary vector . Solution(s) exist if and only if A A^+ b = b. If the latter holds, then the solution is unique if and only if has full column rank, in which case is a zero matrix. If solutions exist but does not have full column rank, then we have an
indeterminate system In mathematics, particularly in algebra, an indeterminate system is a system of simultaneous equations (e.g., linear equations) which has more than one solution (sometimes infinitely many solutions). In the case of a linear system, the system may b ...
, all of whose infinitude of solutions are given by this last equation.


Minimum norm solution to a linear system

For linear systems Ax = b, with non-unique solutions (such as under-determined systems), the pseudoinverse may be used to construct the solution of minimum Euclidean norm \, x\, _2 among all solutions. * If Ax = b is satisfiable, the vector z = A^+b is a solution, and satisfies \, z\, _2 \le \, x\, _2 for all solutions. This result is easily extended to systems with multiple right-hand sides, when the Euclidean norm is replaced by the Frobenius norm. Let . * If AX = B is satisfiable, the matrix Z = A^+B is a solution, and satisfies \, Z\, _ \le \, X\, _ for all solutions.


Condition number

Using the pseudoinverse and a
matrix norm In mathematics, a matrix norm is a vector norm in a vector space whose elements (vectors) are matrices (of given dimensions). Preliminaries Given a field K of either real or complex numbers, let K^ be the -vector space of matrices with m ro ...
, one can define a condition number for any matrix: \mbox(A) = \, A\, \left\, A^+\right\, . A large condition number implies that the problem of finding least-squares solutions to the corresponding system of linear equations is ill-conditioned in the sense that small errors in the entries of can lead to huge errors in the entries of the solution.


Generalizations

Besides for matrices over real and complex numbers, the conditions hold for matrices over
biquaternion In abstract algebra, the biquaternions are the numbers , where , and are complex numbers, or variants thereof, and the elements of multiply as in the quaternion group and commute with their coefficients. There are three types of biquaternions co ...
s, also called "complex quaternions". In order to solve more general least-squares problems, one can define Moore–Penrose inverses for all continuous linear operators between two Hilbert spaces and , using the same four conditions as in our definition above. It turns out that not every continuous linear operator has a continuous linear pseudoinverse in this sense. Those that do are precisely the ones whose range is closed in . A notion of pseudoinverse exists for matrices over an arbitrary field equipped with an arbitrary involutive automorphism. In this more general setting, a given matrix doesn't always have a pseudoinverse. The necessary and sufficient condition for a pseudoinverse to exist is that \operatorname(A) = \operatorname\left(A^* A\right) = \operatorname\left(A A^*\right), where A^* denotes the result of applying the involution operation to the transpose of A. When it does exist, it is unique. Example: Consider the field of complex numbers equipped with the identity involution (as opposed to the involution considered elsewhere in the article); do there exist matrices that fail to have pseudoinverses in this sense? Consider the matrix A = \begin1 & i\end^\textsf. Observe that \operatorname\left(A A^\textsf\right) = 1 while \operatorname\left(A^\textsf A\right) = 0. So this matrix doesn't have a pseudoinverse in this sense. In
abstract algebra In mathematics, more specifically algebra, abstract algebra or modern algebra is the study of algebraic structures. Algebraic structures include group (mathematics), groups, ring (mathematics), rings, field (mathematics), fields, module (mathe ...
, a Moore–Penrose inverse may be defined on a *-regular semigroup. This abstract definition coincides with the one in linear algebra.


See also

*
Drazin inverse In mathematics, the Drazin inverse, named after Michael P. Drazin, is a kind of generalized inverse of a matrix. Let ''A'' be a square matrix. The index of ''A'' is the least nonnegative integer ''k'' such that rank(''A'k''+1) = rank(''A'k'') ...
* Hat matrix *
Inverse element In mathematics, the concept of an inverse element generalises the concepts of opposite () and reciprocal () of numbers. Given an operation denoted here , and an identity element denoted , if , one says that is a left inverse of , and that is ...
*
Linear least squares (mathematics) Linear least squares (LLS) is the least squares approximation of linear functions to data. It is a set of formulations for solving statistical problems involved in linear regression, including variants for ordinary (unweighted), weighted, and ...
* Pseudo-determinant *
Von Neumann regular ring In mathematics, a von Neumann regular ring is a ring ''R'' (associative, with 1, not necessarily commutative) such that for every element ''a'' in ''R'' there exists an ''x'' in ''R'' with . One may think of ''x'' as a "weak inverse" of the elemen ...


Notes


References

* * * *


External links

*
Interactive program & tutorial of Moore–Penrose Pseudoinverse
* * *
The Moore–Penrose Pseudoinverse. A Tutorial Review of the Theory


{{DEFAULTSORT:Moore-Penrose inverse Matrix theory Singular value decomposition Numerical linear algebra