Inverse iteration
In numerical analysis, inverse iteration (also known as the ''inverse power method'') is an iterative eigenvalue algorithm. It allows one to find an approximate eigenvector when an approximation to a corresponding eigenvalue is already known. The method is conceptually similar to the power method. It appears to have originally been developed to compute resonance frequencies in the field of structural mechanics (Pohlhausen 1921).

The inverse power iteration algorithm starts with an approximation \mu for the eigenvalue corresponding to the desired eigenvector, and a vector b_0, either a randomly selected vector or an approximation to the eigenvector. The method is described by the iteration

: b_{k+1} = \frac{(A - \mu I)^{-1} b_k}{C_k},

where C_k are constants usually chosen as C_k = \| (A - \mu I)^{-1} b_k \|. Since eigenvectors are defined only up to multiplication by a constant, the choice of C_k is in theory arbitrary; practical aspects of the choice of C_k are discussed below.

At every iteration, the vector b_k is multiplied by the matrix (A - \mu I)^{-1} and normalized. This is exactly the formula of the power method, except that the matrix A is replaced by (A - \mu I)^{-1}. The closer the approximation \mu is to the eigenvalue, the faster the algorithm converges; however, a poor choice of \mu can lead to slow convergence or to convergence to an eigenvector other than the one desired. In practice, the method is used when a good approximation of the eigenvalue is known, so that only a few (quite often just one) iterations are needed.
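The iteration above translates directly into code. Below is a minimal sketch in Python with NumPy, assuming a dense matrix and a fixed shift; the function name, test matrix, and iteration count are illustrative choices rather than part of the method.

import numpy as np

def inverse_iteration(A, mu, b0, num_iters=5):
    """Approximate the eigenvector of A whose eigenvalue is closest to mu."""
    n = A.shape[0]
    M = A - mu * np.eye(n)            # shifted matrix (A - mu I)
    b = b0 / np.linalg.norm(b0)
    for _ in range(num_iters):
        b = np.linalg.solve(M, b)     # apply (A - mu I)^{-1} via a linear solve
        b = b / np.linalg.norm(b)     # normalize: C_k = ||(A - mu I)^{-1} b_k||
    return b

A = np.array([[2.0, 1.0], [1.0, 3.0]])   # eigenvalues (5 +/- sqrt(5))/2
b = inverse_iteration(A, mu=1.2, b0=np.array([1.0, 1.0]))
print(b, (A @ b) / b)                    # componentwise ratio ≈ 1.382

This naive version re-solves the same system at every step; cheaper variants are discussed under "Implementation options" below.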


Theory and convergence

The basic idea of the power iteration is choosing an initial vector b (either an eigenvector approximation or a random vector) and iteratively calculating Ab, A^2 b, A^3 b, \ldots. Except for a set of initial vectors of measure zero, the result converges to an eigenvector corresponding to the dominant eigenvalue.

Inverse iteration does the same for the matrix (A - \mu I)^{-1}, so it converges to the eigenvector corresponding to the dominant eigenvalue of (A - \mu I)^{-1}. The eigenvalues of this matrix are (\lambda_1 - \mu)^{-1}, \ldots, (\lambda_n - \mu)^{-1}, where \lambda_i are the eigenvalues of A. The largest of these numbers corresponds to the smallest of |\lambda_1 - \mu|, \ldots, |\lambda_n - \mu|. The eigenvectors of A and of (A - \mu I)^{-1} are the same, since

: Av = \lambda v \Leftrightarrow (A - \mu I)v = (\lambda - \mu)v \Leftrightarrow (\lambda - \mu)^{-1} v = (A - \mu I)^{-1} v.

Conclusion: the method converges to the eigenvector of the matrix A corresponding to the eigenvalue closest to \mu. In particular, taking \mu = 0, the iterates A^{-k} b_0 converge to the eigenvector corresponding to the eigenvalue of A^{-1} of largest magnitude, 1/\lambda_N, and the method can thus be used to determine the smallest-magnitude eigenvalue \lambda_N of A, since the two are inversely related.
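This spectral relationship is easy to check numerically. The snippet below, with an arbitrary test matrix and shift, verifies that the eigenvalues of (A - \mu I)^{-1} are exactly (\lambda_i - \mu)^{-1}.

import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])   # eigenvalues 5 and 2
mu = 1.5

evals_A = np.linalg.eigvals(A)
evals_inv = np.linalg.eigvals(np.linalg.inv(A - mu * np.eye(2)))

print(np.sort(evals_inv))                # eigenvalues computed directly
print(np.sort(1.0 / (evals_A - mu)))     # values predicted by the identity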


Speed of convergence

Let us analyze the rate of convergence of the method. The power method is known to converge linearly to the limit; more precisely,

: \mathrm{dist}(b^\mathrm{ideal}, b_k) = O\left( \left| \frac{\lambda_\mathrm{subdominant}}{\lambda_\mathrm{dominant}} \right|^k \right),

hence for the inverse iteration method the analogous result reads

: \mathrm{dist}(b^\mathrm{ideal}, b_k) = O\left( \left| \frac{\mu - \lambda_\mathrm{closest}}{\mu - \lambda_\mathrm{next}} \right|^k \right),

where \lambda_\mathrm{closest} is the eigenvalue closest to \mu and \lambda_\mathrm{next} the second closest. This is the key formula for understanding the method's convergence. It shows that if \mu is chosen close enough to some eigenvalue \lambda, say \mu - \lambda = \epsilon, then each iteration multiplies the error by roughly |\epsilon| / |\lambda + \epsilon - \lambda_\mathrm{next}|. (We use that, for small enough \epsilon, "closest to \mu" and "closest to \lambda" coincide.) For small enough |\epsilon| this factor is approximately |\epsilon| / |\lambda - \lambda_\mathrm{next}|. Hence, if one can find \mu such that \epsilon is small enough, very few iterations may suffice.


Complexity

The inverse iteration algorithm requires solving a linear system or calculating the inverse matrix. For non-structured matrices (not sparse, not Toeplitz, ...) this requires O(n^3) operations.


Implementation options

The method is defined by the formula

: b_{k+1} = \frac{(A - \mu I)^{-1} b_k}{C_k}.

There are, however, multiple options for its implementation.


Calculate inverse matrix or solve system of linear equations

We can rewrite the formula in the following way:

: (A - \mu I) b_{k+1} = \frac{b_k}{C_k},

emphasizing that to find the next approximation b_{k+1} we may solve a system of linear equations. There are two options: one may choose an algorithm that solves a linear system, or one may calculate the inverse (A - \mu I)^{-1} and then apply it to the vector. Both options have complexity O(n^3); the exact number of operations depends on the chosen method.

The choice also depends on the number of iterations. Naively, if at each iteration one solves a linear system, the complexity will be k \, O(n^3), where k is the number of iterations; similarly, calculating the inverse matrix and applying it at each iteration has complexity k \, O(n^3). Note, however, that if the eigenvalue estimate \mu remains constant, the complexity can be reduced to O(n^3) + k \, O(n^2) with either method. Calculating the inverse matrix once, storing it, and applying it at each iteration has complexity O(n^3) + k \, O(n^2). Storing an LU decomposition of (A - \mu I) and using forward and back substitution to solve the system of equations at each iteration also has complexity O(n^3) + k \, O(n^2).

Inverting the matrix will typically have a greater initial cost, but a lower cost at each iteration; conversely, solving the systems of linear equations will typically have a lesser initial cost, but require more operations per iteration.
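A sketch of the factor-once option, assuming SciPy is available: the LU decomposition of (A - \mu I) is computed a single time, and every iteration reuses it through forward and back substitution, giving the O(n^3) + k \, O(n^2) cost discussed above.

import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_iteration_lu(A, mu, b0, num_iters=10):
    lu, piv = lu_factor(A - mu * np.eye(A.shape[0]))  # one O(n^3) factorization
    b = b0 / np.linalg.norm(b0)
    for _ in range(num_iters):
        b = lu_solve((lu, piv), b)   # forward/back substitution, O(n^2) per step
        b /= np.linalg.norm(b)
    return b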


Tridiagonalization, Hessenberg form

If it is necessary to perform many iterations (or few iterations, but for many eigenvectors), then it might be wise to first bring the matrix to upper Hessenberg form (for a symmetric matrix this will be tridiagonal form). This costs \tfrac{10}{3} n^3 + O(n^2) arithmetic operations using a technique based on Householder reduction: a finite sequence of orthogonal similarity transforms, somewhat like a two-sided QR decomposition (Trefethen & Bau 1997). (For QR decomposition, the Householder reflections are multiplied only on the left, but for the Hessenberg case they are multiplied on both left and right.) For symmetric matrices this procedure costs \tfrac{4}{3} n^3 + O(n^2) arithmetic operations.

Solving the system of linear equations for a tridiagonal matrix costs O(n) operations, so the complexity grows like O(n^3) + k \, O(n), where k is the iteration number; this is better than for the direct inversion. However, for few iterations such a transformation may not be practical. Also, the transformation to Hessenberg form involves square roots and division operations, which are not universally supported by hardware.
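The sketch below illustrates this approach for a symmetric matrix, assuming SciPy: the matrix is reduced to tridiagonal form once, and each iteration then solves a banded system in O(n) time. The function name and starting vector are illustrative.

import numpy as np
from scipy.linalg import hessenberg, solve_banded

def inverse_iteration_tridiag(A, mu, num_iters=10):
    """Inverse iteration for symmetric A via one-time tridiagonal reduction."""
    n = A.shape[0]
    T, Q = hessenberg(A, calc_q=True)   # for symmetric A, T is tridiagonal
    # Pack T - mu*I into banded storage for O(n) solves per iteration.
    ab = np.zeros((3, n))
    ab[0, 1:] = np.diag(T, 1)           # superdiagonal
    ab[1, :] = np.diag(T) - mu          # shifted main diagonal
    ab[2, :-1] = np.diag(T, -1)         # subdiagonal
    y = np.ones(n)
    for _ in range(num_iters):
        y = solve_banded((1, 1), ab, y)
        y /= np.linalg.norm(y)
    return Q @ y                        # map back to an eigenvector of A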


Choice of the normalization constant C_k

On general-purpose processors (e.g. those produced by Intel) the execution times of addition, multiplication and division are approximately equal. But on embedded and/or low-power hardware (digital signal processors, FPGAs, ASICs) division may not be supported in hardware, and so should be avoided. Choosing C_k = 2^{n_k} allows fast division without explicit hardware support, since division by a power of 2 may be implemented as either a bit shift (for fixed-point arithmetic) or subtraction of n_k from the exponent (for floating-point arithmetic).

When implementing the algorithm using fixed-point arithmetic, the choice of the constant C_k is especially important. Small values will lead to fast growth of the norm of b_k and to overflow; large values of C_k will cause the vector b_k to tend toward zero.
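A sketch of power-of-two normalization in Python; here math.frexp stands in for the exponent manipulation that fixed-point or floating-point hardware would perform directly, so the code shows the idea rather than an actual hardware implementation.

import math
import numpy as np

def normalize_pow2(b):
    """Divide b by the power of two bracketing its norm: C_k = 2**e."""
    norm = float(np.linalg.norm(b))
    _, e = math.frexp(norm)       # norm = m * 2**e with 0.5 <= m < 1
    return b / (2.0 ** e)         # afterwards ||b|| lies in [0.5, 1)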


Usage

The main application of the method is the situation where an approximation to an eigenvalue has been found and one needs to find the corresponding approximate eigenvector. In such a situation, inverse iteration is the main, and probably the only, method to use.


Methods to find approximate eigenvalues

Typically, the method is used in combination with some other method that finds approximate eigenvalues: the standard example is the bisection eigenvalue algorithm; another example is Rayleigh quotient iteration, which is in fact the same inverse iteration with the approximate eigenvalue chosen as the Rayleigh quotient of the vector obtained on the previous step of the iteration.

There are some situations where the method can be used by itself; however, they are quite marginal.
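For illustration, a minimal sketch of Rayleigh quotient iteration: it is inverse iteration in which the shift \mu is recomputed from the current vector at every step. The function name and iteration count are illustrative.

import numpy as np

def rayleigh_quotient_iteration(A, b0, num_iters=5):
    n = A.shape[0]
    b = b0 / np.linalg.norm(b0)
    mu = b @ A @ b                       # Rayleigh quotient as the shift
    for _ in range(num_iters):
        try:
            b = np.linalg.solve(A - mu * np.eye(n), b)
        except np.linalg.LinAlgError:
            break                        # shift hit an eigenvalue exactly
        b /= np.linalg.norm(b)
        mu = b @ A @ b                   # refine the eigenvalue estimate
    return mu, b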


Norm of matrix as approximation to the ''dominant'' eigenvalue

The dominant eigenvalue can be easily estimated for any matrix. For any induced norm it is true that \left\| A \right\| \ge |\lambda| for any eigenvalue \lambda. So, taking the norm of the matrix as an approximate eigenvalue, one can see that the method will converge to the dominant eigenvector.
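A small numerical illustration of this idea, with an arbitrary symmetric test matrix: the induced infinity norm is taken as the shift, and the iteration converges to the dominant eigenvector.

import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # eigenvalues (5 +/- sqrt(5))/2
mu = np.linalg.norm(A, ord=np.inf)       # = 4, an upper bound on every |lambda|

b = np.array([1.0, 0.0])
for _ in range(10):
    b = np.linalg.solve(A - mu * np.eye(2), b)
    b /= np.linalg.norm(b)
print(b, b @ A @ b)                      # Rayleigh quotient ≈ 3.618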


Estimates based on statistics

In some real-time applications one needs to find eigenvectors for matrices at a rate of millions of matrices per second. In such applications the statistics of the matrices is typically known in advance, and one can take as an approximate eigenvalue the average eigenvalue over some large sample of matrices. Better, one may calculate the mean ratio of the eigenvalue to the trace or the norm of the matrix, and estimate the eigenvalue as the trace or norm multiplied by the average value of that ratio. Clearly such a method can be used only with discretion, and only when high precision is not critical. This approach of estimating an average eigenvalue can be combined with other methods to avoid excessively large error.
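A rough sketch of such a statistics-based shift, using an entirely artificial matrix ensemble: the mean ratio of the smallest-magnitude eigenvalue to the trace is estimated offline from a sample, after which each new matrix gets its shift from its trace alone, at O(n) cost.

import numpy as np

rng = np.random.default_rng(0)
make = lambda: np.diag([1.0, 0.0, 0.0]) + 0.1 * rng.standard_normal((3, 3))

# Offline: estimate the mean eigenvalue/trace ratio over a matrix sample.
ratios = [np.abs(np.linalg.eigvals(M)).min() / np.trace(M)
          for M in (make() for _ in range(1000))]
mean_ratio = np.mean(ratios)

# Online: a cheap eigenvalue estimate for each incoming matrix.
A = make()
mu = mean_ratio * np.trace(A)            # shift for inverse iteration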


See also

* Power iteration
* Rayleigh quotient iteration
* List of eigenvalue algorithms


References

* Ernst Pohlhausen, ''Berechnung der Eigenschwingungen statisch-bestimmter Fachwerke'', ZAMM - Zeitschrift für Angewandte Mathematik und Mechanik 1, 28-42 (1921).
* Lloyd N. Trefethen and David Bau, ''Numerical Linear Algebra'', SIAM, 1997.