Modified Richardson iteration is an iterative method for solving a system of linear equations. Richardson iteration was proposed by Lewis Fry Richardson in his work dated 1910. It is similar to the Jacobi and Gauss–Seidel methods.

We seek the solution to a set of linear equations, expressed in matrix terms as

: A x = b.

The Richardson iteration is

: x^{(k+1)} = x^{(k)} + \omega \left( b - A x^{(k)} \right),

where \omega is a scalar parameter that has to be chosen such that the sequence x^{(k)} converges.

It is easy to see that the method has the correct fixed points, because if it converges, then x^{(k+1)} \approx x^{(k)} and x^{(k)} has to approximate a solution of A x = b.
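The update rule translates directly into code. The following is a minimal NumPy sketch of the iteration, not part of the original article; the function name richardson, the stopping tolerance, and the small test matrix are illustrative choices.

import numpy as np

def richardson(A, b, omega, x0=None, tol=1e-10, max_iter=10000):
    """Modified Richardson iteration: x_{k+1} = x_k + omega * (b - A x_k)."""
    x = np.zeros_like(b, dtype=float) if x0 is None else x0.astype(float)
    for _ in range(max_iter):
        r = b - A @ x              # residual
        if np.linalg.norm(r) < tol:
            break
        x = x + omega * r          # Richardson update
    return x

# Example with a small symmetric positive definite system (illustrative data)
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
lam = np.linalg.eigvalsh(A)
omega = 2.0 / (lam.min() + lam.max())   # optimal omega, see the Convergence section
x = richardson(A, b, omega)
print(x, np.allclose(A @ x, b))

Any value of omega in the convergence range works; the optimal choice above merely gives the fastest contraction per step.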


Convergence

Subtracting the exact solution x, and introducing the notation for the error e^{(k)} = x^{(k)} - x, we get the equality for the errors

: e^{(k+1)} = e^{(k)} - \omega A e^{(k)} = (I - \omega A) e^{(k)}.

Thus,

: \| e^{(k+1)} \| = \| (I - \omega A) e^{(k)} \| \leq \| I - \omega A \| \, \| e^{(k)} \|,

for any vector norm and the corresponding induced matrix norm. Thus, if \| I - \omega A \| < 1, the method converges.

Suppose that A is symmetric positive definite and that (\lambda_j)_j are the eigenvalues of A. The error converges to 0 if | 1 - \omega \lambda_j | < 1 for all eigenvalues \lambda_j. If, e.g., all eigenvalues are positive, this can be guaranteed if \omega is chosen such that 0 < \omega < \omega_\text{max}, \ \omega_\text{max} := 2/\lambda_\text{max}(A). The optimal choice, minimizing all | 1 - \omega \lambda_j |, is \omega_\text{opt} := 2/(\lambda_\text{min}(A) + \lambda_\text{max}(A)), which gives the simplest Chebyshev iteration. This optimal choice yields a spectral radius of

: \min_\omega \rho(I - \omega A) = \rho(I - \omega_\text{opt} A) = 1 - \frac{2}{\kappa(A) + 1},

where \kappa(A) is the condition number of A.

If there are both positive and negative eigenvalues, the method will diverge for any \omega if the initial error e^{(0)} has nonzero components in the corresponding eigenvectors.
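The bounds on \omega and the spectral-radius formula can be checked numerically. The sketch below, which is an illustration rather than part of the original text, computes \omega_\text{max}, \omega_\text{opt}, and the resulting spectral radius for a small symmetric positive definite matrix; the particular matrix is an arbitrary example.

import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])       # symmetric positive definite example
lam = np.linalg.eigvalsh(A)                   # eigenvalues of A
omega_max = 2.0 / lam.max()                   # convergence requires 0 < omega < omega_max
omega_opt = 2.0 / (lam.min() + lam.max())     # minimizes max_j |1 - omega * lambda_j|

kappa = lam.max() / lam.min()                 # condition number of A (2-norm, SPD case)
rho = np.max(np.abs(1.0 - omega_opt * lam))   # spectral radius of I - omega_opt * A
print(rho, 1.0 - 2.0 / (kappa + 1.0))         # the two values agree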


Equivalence to gradient descent

Consider minimizing the function F(x) = \frac{1}{2} \| \tilde{A} x - \tilde{b} \|_2^2. Since this is a convex function, a sufficient condition for optimality is that the gradient is zero (\nabla F(x) = 0), which gives rise to the equation

: \tilde{A}^T \tilde{A} x = \tilde{A}^T \tilde{b}.

Define A = \tilde{A}^T \tilde{A} and b = \tilde{A}^T \tilde{b}. Because of the form of A, it is a positive semi-definite matrix, so it has no negative eigenvalues.

A step of gradient descent is

: x^{(k+1)} = x^{(k)} - t \nabla F(x^{(k)}) = x^{(k)} - t (A x^{(k)} - b),

which is equivalent to the Richardson iteration by making t = \omega.
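The equivalence can be verified directly by running both iterations side by side. The following sketch is an illustration, not part of the original article; the randomly generated overdetermined system, the step size, and the fixed iteration count are arbitrary choices made for the demonstration.

import numpy as np

rng = np.random.default_rng(0)
A_tilde = rng.standard_normal((20, 5))        # overdetermined least-squares problem
b_tilde = rng.standard_normal(20)

A = A_tilde.T @ A_tilde                        # normal-equations matrix (positive semi-definite)
b = A_tilde.T @ b_tilde

t = 1.0 / np.linalg.eigvalsh(A).max()          # step size chosen inside (0, 2/lambda_max)
x_gd = np.zeros(5)
x_ri = np.zeros(5)
for _ in range(500):
    x_gd = x_gd - t * (A_tilde.T @ (A_tilde @ x_gd - b_tilde))   # gradient descent on F
    x_ri = x_ri + t * (b - A @ x_ri)                             # Richardson with omega = t
print(np.allclose(x_gd, x_ri))                 # the two iterates coincide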


See also

* Richardson extrapolation

