Modified Richardson iteration is an iterative method for solving a system of linear equations. Richardson iteration was proposed by Lewis Fry Richardson in his work dated 1910. It is similar to the Jacobi and Gauss–Seidel methods.

We seek the solution to a set of linear equations, expressed in matrix terms as

: A x = b.

The Richardson iteration is

: x^{(k+1)} = x^{(k)} + \omega \left( b - A x^{(k)} \right),

where \omega is a scalar parameter that has to be chosen such that the sequence (x^{(k)}) converges. It is easy to see that the method has the correct fixed points, because if it converges, then x^{(k+1)} \approx x^{(k)} and x^{(k)} has to approximate a solution of A x = b.
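A minimal sketch of the iteration in Python with NumPy follows; the function name richardson and the residual-based stopping rule are illustrative choices, not part of the original formulation, and \omega is assumed to be supplied by the caller.

    import numpy as np

    def richardson(A, b, omega, x0=None, tol=1e-10, max_iter=10000):
        # Modified Richardson iteration: x_{k+1} = x_k + omega * (b - A x_k).
        x = np.zeros_like(b, dtype=float) if x0 is None else np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            residual = b - A @ x
            if np.linalg.norm(residual) < tol:   # stop once the residual is small
                break
            x = x + omega * residual
        return x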


Convergence

Subtracting the exact solution x, and introducing the notation e^{(k)} = x^{(k)} - x for the error, we get the equality for the errors

: e^{(k+1)} = e^{(k)} - \omega A e^{(k)} = (I - \omega A) e^{(k)}.

Thus,

: \| e^{(k+1)} \| = \| (I - \omega A) e^{(k)} \| \leq \| I - \omega A \| \, \| e^{(k)} \|,

for any vector norm and the corresponding induced matrix norm. Thus, if \| I - \omega A \| < 1, the method converges.

Suppose that A is symmetric positive definite and that (\lambda_j)_j are the eigenvalues of A. The error converges to 0 if | 1 - \omega \lambda_j | < 1 for all eigenvalues \lambda_j. If, e.g., all eigenvalues are positive, this can be guaranteed if \omega is chosen such that 0 < \omega < \omega_\text{max}, where \omega_\text{max} := 2 / \lambda_\text{max}(A). The optimal choice, minimizing all | 1 - \omega \lambda_j |, is \omega_\text{opt} := 2 / (\lambda_\text{min}(A) + \lambda_\text{max}(A)), which gives the simplest Chebyshev iteration. This optimal choice yields a spectral radius of

: \min_\omega \rho (I - \omega A) = \rho (I - \omega_\text{opt} A) = 1 - \frac{2}{\kappa(A) + 1},

where \kappa(A) is the condition number of A.

If there are both positive and negative eigenvalues, the method will diverge for any \omega if the initial error e^{(0)} has nonzero components in the corresponding eigenvectors.
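The optimal step size and the resulting spectral radius can be illustrated with a short Python sketch; the function name optimal_omega and the example matrix are assumptions for illustration, and A is taken to be symmetric positive definite.

    import numpy as np

    def optimal_omega(A):
        # For symmetric A, eigvalsh returns real eigenvalues in ascending order.
        eigvals = np.linalg.eigvalsh(A)
        lam_min, lam_max = eigvals[0], eigvals[-1]
        omega = 2.0 / (lam_min + lam_max)        # omega_opt = 2 / (lambda_min + lambda_max)
        # Spectral radius of I - omega*A, attained at the extreme eigenvalues;
        # for omega_opt it equals 1 - 2/(kappa(A) + 1).
        rho = max(abs(1.0 - omega * lam_min), abs(1.0 - omega * lam_max))
        return omega, rho

    A = np.array([[4.0, 1.0], [1.0, 3.0]])       # small symmetric positive definite example
    omega_opt, rho = optimal_omega(A)            # rho < 1, so the iteration converges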


Equivalence to gradient descent
Consider minimizing the function F(x) = \frac{1}{2} \| \tilde{A} x - \tilde{b} \|_2^2. Since this is a convex function, a sufficient condition for optimality is that the gradient is zero (\nabla F(x) = 0), which gives rise to the equation

: \tilde{A}^T \tilde{A} x = \tilde{A}^T \tilde{b}.

Define A = \tilde{A}^T \tilde{A} and b = \tilde{A}^T \tilde{b}. Because of the form of A, it is a positive semi-definite matrix, so it has no negative eigenvalues. A step of gradient descent is

: x^{(k+1)} = x^{(k)} - t \nabla F(x^{(k)}) = x^{(k)} - t ( A x^{(k)} - b ),

which is equivalent to the Richardson iteration by making t = \omega.
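A short Python sketch of this equivalence, assuming randomly generated least-squares data \tilde{A}, \tilde{b} and an illustrative step size t = 1 / \lambda_\text{max}(A): one gradient-descent step on F coincides with one Richardson step on the normal equations when \omega = t.

    import numpy as np

    rng = np.random.default_rng(0)
    A_tilde = rng.standard_normal((20, 5))   # illustrative least-squares data
    b_tilde = rng.standard_normal(20)

    A = A_tilde.T @ A_tilde                  # normal-equations matrix (positive semi-definite)
    b = A_tilde.T @ b_tilde

    x = np.zeros(5)
    t = 1.0 / np.linalg.norm(A, 2)           # step size 1/lambda_max(A); an illustrative choice

    grad_step = x - t * (A @ x - b)          # one step of gradient descent on F
    richardson_step = x + t * (b - A @ x)    # one Richardson step with omega = t
    assert np.allclose(grad_step, richardson_step)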


See also

* Richardson extrapolation

