The Kantorovich theorem, or Newton–Kantorovich theorem, is a mathematical statement on the semi-local

convergence Convergence may refer to: Arts and media Literature *''Convergence'' (book series), edited by Ruth Nanda Anshen *Convergence (comics), "Convergence" (comics), two separate story lines published by DC Comics: **A four-part crossover storyline that ...

Newton's method In numerical analysis, the Newton–Raphson method, also known simply as Newton's method, named after Isaac Newton and Joseph Raphson, is a root-finding algorithm which produces successively better approximations to the roots (or zeroes) of a ...

. It was first stated by

Leonid Kantorovich Leonid Vitalyevich Kantorovich (, ; 19 January 19127 April 1986) was a Soviet mathematician and economist, known for his theory and development of techniques for the optimal allocation of resources. He is regarded as the founder of linear programm ...

in 1948. It is similar to the form of the

Banach fixed-point theorem In mathematics, the Banach fixed-point theorem (also known as the contraction mapping theorem or contractive mapping theorem or Banach–Caccioppoli theorem) is an important tool in the theory of metric spaces; it guarantees the existence and uniqu ...

, although it states existence and uniqueness of a

zero 0 (zero) is a number representing an empty quantity. Adding (or subtracting) 0 to any number leaves that number unchanged; in mathematical terminology, 0 is the additive identity of the integers, rational numbers, real numbers, and compl ...

rather than a fixed point. Newton's method constructs a sequence of points that under certain conditions will converge to a solution

x

of an equation

f(x)=0

or a vector solution of a system of equation

F(x)=0

. The Kantorovich theorem gives conditions on the initial point of this sequence. If those conditions are satisfied then a solution exists close to the initial point and the sequence converges to that point.

Assumptions

Let

X\subset\R^n

be an open subset and

F:X \subset \R^n \to\R^n

differentiable function In mathematics, a differentiable function of one real variable is a function whose derivative exists at each point in its domain. In other words, the graph of a differentiable function has a non- vertical tangent line at each interior point in ...

with a Jacobian

F^(\mathbf x)

that is locally

Lipschitz continuous In mathematical analysis, Lipschitz continuity, named after Germany, German mathematician Rudolf Lipschitz, is a strong form of uniform continuity for function (mathematics), functions. Intuitively, a Lipschitz continuous function is limited in h ...

(for instance if

F

is twice differentiable). That is, it is assumed that for any

x \in X

there is an open subset

U\subset X

such that

x \in U

and there exists a constant

L>0

such that for any

\mathbf x,\mathbf y\in U

\, F'(\mathbf x)-F'(\mathbf y)\, \le L\;\, \mathbf x-\mathbf y\,

holds. The norm on the left is the operator norm. In other words, for any vector

\mathbf v\in\R^n

the inequality :

\, F'(\mathbf x)(\mathbf v)-F'(\mathbf y)(\mathbf v)\, \le L\;\, \mathbf x-\mathbf y\, \,\, \mathbf v\,

must hold. Now choose any initial point

\mathbf x_0\in X

. Assume that

F'(\mathbf x_0)

is invertible and construct the Newton step

\mathbf h_0=-F'(\mathbf x_0)^F(\mathbf x_0).

The next assumption is that not only the next point

\mathbf x_1=\mathbf x_0+\mathbf h_0

but the entire ball

B(\mathbf x_1,\, \mathbf h_0\, )

is contained inside the set

X

. Let

M

be the Lipschitz constant for the Jacobian over this ball (assuming it exists). As a last preparation, construct recursively, as long as it is possible, the sequences

(\mathbf x_k)_k

(\mathbf h_k)_k

(\alpha_k)_k

according to :

\mathbf x_&=\mathbf x_k+\mathbf h_k. \end

Statement

Now if

\alpha_0\le\tfrac12

then #a solution

\mathbf x^*

F(\mathbf x^*)=0

exists inside the closed ball

\bar B(\mathbf x_1,\, \mathbf h_0\, )

and #the Newton iteration starting in

\mathbf x_0

converges to

\mathbf x^*

with at least linear order of convergence. A statement that is more precise but slightly more difficult to prove uses the roots

t^\ast\le t^

of the quadratic polynomial :

p(t)
  =\left(\tfrac12L\, F'(\mathbf x_0)^\, ^\right)t^2
    -t+\, \mathbf h_0\,

, :

t^=\frac

and their ratio :

\theta
  =\frac
  =\frac.

Then #a solution

\mathbf x^*

exists inside the closed ball

\bar B(\mathbf x_1,\theta\, \mathbf h_0\, )\subset\bar B(\mathbf x_0,t^*)

#it is unique inside the bigger ball

B(\mathbf x_0,t^)

#and the convergence to the solution of

F

is dominated by the convergence of the Newton iteration of the quadratic polynomial

p(t)

towards its smallest root

t^\ast

, if

t_0=0,\,t_=t_k-\tfrac

, then #:

\, \mathbf x_-\mathbf x_k\, \le t_-t_k.

#The quadratic convergence is obtained from the error estimate #:

\, \mathbf x_-\mathbf x^*\, 
    \le \theta^\, \mathbf x_-\mathbf x_n\, 
    \le\frac\, \mathbf h_0\, .

Corollary

In 1986, Yamamoto proved that the error evaluations of the Newton method such as Doring (1969), Ostrowski (1971, 1973), Gragg-Tapia (1974), Potra-Ptak (1980), Miel (1981), Potra (1984), can be derived from the Kantorovich theorem.

Generalizations

There is a ''q''-analog for the Kantorovich theorem. For other generalizations/variations, see Ortega & Rheinboldt (1970).

Applications

Oishi and Tanabe claimed that the Kantorovich theorem can be applied to obtain reliable solutions of

linear programming Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements and objective are represented by linear function#As a polynomia ...

Assumptions

Statement

Corollary

Generalizations

Applications

References

Further reading