In numerical analysis, the Weierstrass method or Durand–Kerner method, discovered by Karl Weierstrass in 1891 and rediscovered independently by Durand in 1960 and Kerner in 1966, is a root-finding algorithm for solving polynomial equations. In other words, the method can be used to solve numerically the equation

: ''f''(''x'') = 0,

where ''f'' is a given polynomial, which can be taken to be scaled so that the leading coefficient is 1.


Explanation

This explanation considers equations of degree four. It is easily generalized to other degrees.

Let the polynomial ''f'' be defined by

: f(x) = x^4 + ax^3 + bx^2 + cx + d

for all ''x''. The known numbers ''a'', ''b'', ''c'', ''d'' are the coefficients.

Let the (complex) numbers ''P'', ''Q'', ''R'', ''S'' be the roots of this polynomial ''f''. Then

: f(x) = (x - P)(x - Q)(x - R)(x - S)

for all ''x''. One can isolate the value ''P'' from this equation:

: P = x - \frac{f(x)}{(x - Q)(x - R)(x - S)}.

So if used as a fixed-point iteration

: x_1 := x_0 - \frac{f(x_0)}{(x_0 - Q)(x_0 - R)(x_0 - S)},

it is strongly stable in that every initial point ''x''0 ≠ ''Q'', ''R'', ''S'' delivers after one iteration the root ''P'' = ''x''1.

Furthermore, if one replaces the zeros ''Q'', ''R'' and ''S'' by approximations ''q'' ≈ ''Q'', ''r'' ≈ ''R'', ''s'' ≈ ''S'', such that ''q'', ''r'', ''s'' are not equal to ''P'', then ''P'' is still a fixed point of the perturbed fixed-point iteration

: x_{k+1} := x_k - \frac{f(x_k)}{(x_k - q)(x_k - r)(x_k - s)},

since

: P - \frac{f(P)}{(P - q)(P - r)(P - s)} = P - 0 = P.

Note that the denominator is still different from zero. This fixed-point iteration is a contraction mapping for ''x'' around ''P''.

The clue to the method now is to combine the fixed-point iteration for ''P'' with similar iterations for ''Q'', ''R'', ''S'' into a simultaneous iteration for all roots.

Initialize ''p'', ''q'', ''r'', ''s'':

: p_0 := (0.4 + 0.9i)^0,
: q_0 := (0.4 + 0.9i)^1,
: r_0 := (0.4 + 0.9i)^2,
: s_0 := (0.4 + 0.9i)^3.

There is nothing special about choosing 0.4 + 0.9''i'' except that it is neither a real number nor a root of unity.

Make the substitutions for ''n'' = 1, 2, 3, ...:

: p_n = p_{n-1} - \frac{f(p_{n-1})}{(p_{n-1} - q_{n-1})(p_{n-1} - r_{n-1})(p_{n-1} - s_{n-1})},
: q_n = q_{n-1} - \frac{f(q_{n-1})}{(q_{n-1} - p_n)(q_{n-1} - r_{n-1})(q_{n-1} - s_{n-1})},
: r_n = r_{n-1} - \frac{f(r_{n-1})}{(r_{n-1} - p_n)(r_{n-1} - q_n)(r_{n-1} - s_{n-1})},
: s_n = s_{n-1} - \frac{f(s_{n-1})}{(s_{n-1} - p_n)(s_{n-1} - q_n)(s_{n-1} - r_n)}.

Re-iterate until the numbers ''p'', ''q'', ''r'', ''s'' essentially stop changing relative to the desired precision. They then have the values ''P'', ''Q'', ''R'', ''S'' in some order and in the chosen precision. So the problem is solved.

Note that complex number arithmetic must be used, and that the roots are found simultaneously rather than one at a time.
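The simultaneous iteration just described can be written down directly as a short program. The following is a minimal sketch in Python for a general monic polynomial given by its coefficient list; the function name durand_kerner, the tolerance and the iteration cap are illustrative choices, not part of the method as stated above.

def durand_kerner(coeffs, tol=1e-12, max_iter=100):
    """Approximate all roots of the monic polynomial
    x^n + coeffs[1]*x^(n-1) + ... + coeffs[n]; coeffs[0] must be 1."""
    n = len(coeffs) - 1

    def f(x):
        # Horner evaluation of the polynomial at x.
        result = 0j
        for c in coeffs:
            result = result * x + c
        return result

    # Initial values: powers of 0.4 + 0.9i, which is neither real nor a root of unity.
    roots = [(0.4 + 0.9j) ** k for k in range(n)]

    for _ in range(max_iter):
        converged = True
        for k in range(n):
            # Weierstrass update: subtract f(z_k) divided by the product of the
            # differences to all other current root approximations.
            denom = 1 + 0j
            for j in range(n):
                if j != k:
                    denom *= roots[k] - roots[j]
            delta = f(roots[k]) / denom
            roots[k] -= delta  # the new value is used immediately in later updates
            if abs(delta) > tol:
                converged = False
        if converged:
            break
    return roots

For the degree-four polynomial above, durand_kerner([1, a, b, c, d]) returns approximations of ''P'', ''Q'', ''R'', ''S'' in some order.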


Variations

This iteration procedure, like the Gauss–Seidel method for linear equations, computes one number at a time based on the already computed numbers. A variant of this procedure, like the Jacobi method, computes a whole vector of root approximations at a time. Both variants are effective root-finding algorithms.

One could also choose the initial values for ''p'', ''q'', ''r'', ''s'' by some other procedure, even randomly, but in a way that
* they are inside some not-too-large circle containing also the roots of ''f''(''x''), e.g. the circle around the origin with radius 1 + \max\big(|a|, |b|, |c|, |d|\big) (where 1, ''a'', ''b'', ''c'', ''d'' are the coefficients of ''f''(''x'')), and that
* they are not too close to each other, which may increasingly become a concern as the degree of the polynomial increases (a sketch of such an initialization is given below).

If the coefficients are real and the polynomial has odd degree, then it must have at least one real root. To find this, use a real value of ''p''0 as the initial guess and make ''q''0 and ''r''0, etc., complex conjugate pairs. Then the iteration will preserve these properties; that is, ''p''''n'' will always be real, and ''q''''n'' and ''r''''n'', etc., will always be conjugate. In this way, the ''p''''n'' will converge to a real root ''P''. Alternatively, make all of the initial guesses real; they will remain so.
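A small sketch of this initialization advice, in Python; the helper name initial_guesses, the circle of half the bound radius and the angular offset are illustrative choices, not prescribed by the method.

import cmath

def initial_guesses(coeffs, real_odd_degree=False):
    """Illustrative starting values for the simultaneous iteration.

    coeffs = [1, a, b, ...] are the coefficients of a monic polynomial.
    All guesses lie inside the circle of radius 1 + max(|a|, |b|, ...),
    which contains every root, and are spread out so that no two start
    close together.  With real_odd_degree=True (real coefficients, odd
    degree) one guess is real and the rest form conjugate pairs."""
    n = len(coeffs) - 1
    bound = 1 + max(abs(c) for c in coeffs[1:])
    r = 0.5 * bound  # a circle strictly inside the root bound
    if not real_odd_degree:
        # Equally spaced angles with an offset, so the points are neither real
        # nor scaled roots of unity.
        return [r * cmath.exp(1j * (2 * cmath.pi * k / n + 0.4)) for k in range(n)]
    guesses = [complex(r, 0)]        # one real starting value
    for k in range((n - 1) // 2):    # the remaining guesses as conjugate pairs
        z = r * cmath.exp(1j * (2 * cmath.pi * (k + 1) / n + 0.4))
        guesses.extend([z, z.conjugate()])
    return guesses

Such values could replace the fixed powers of 0.4 + 0.9''i'' used in the sketch of the previous section.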


Example

This example is from the 1992 reference by Jacoby. The equation solved is

: x^3 - 3x^2 + 3x - 5 = 0.

The first 4 iterations move ''p'', ''q'', ''r'' seemingly chaotically, but then the roots are located to 1 decimal. After iteration number 5 we have 4 correct decimals, and the subsequent iteration number 6 confirms that the computed roots are fixed. This general behaviour is characteristic for the method. Also notice that, in this example, the roots are used as soon as they are computed in each iteration: the update of each root uses the already updated values of the roots computed before it. Note that the equation has one real root and one pair of complex conjugate roots, and that the sum of the roots is 3.
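A quick way to reproduce the qualitative behaviour described here is to run the durand_kerner sketch from the Explanation section on this cubic. The snippet below assumes that function is already defined; the exact intermediate iterates depend on the chosen starting values and are not listed here.

# Roots of x^3 - 3x^2 + 3x - 5 with the sketch defined above.
roots = durand_kerner([1, -3, 3, -5])
print(roots)        # one real root and a complex-conjugate pair, in some order
print(sum(roots))   # close to 3, the negated coefficient of x^2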


Derivation of the method via Newton's method

For every ''n''-tuple of complex numbers, there is exactly one monic polynomial of degree ''n'' that has them as its zeros (keeping multiplicities). This polynomial is given by multiplying all the corresponding linear factors, that is

: g_{\vec z}(X) = (X - z_1) \cdots (X - z_n).

This polynomial has coefficients that depend on the prescribed zeros,

: g_{\vec z}(X) = X^n + g_{n-1}(\vec z) X^{n-1} + \cdots + g_0(\vec z).

Those coefficients are, up to a sign, the elementary symmetric polynomials \alpha_1(\vec z), \dots, \alpha_n(\vec z) of degrees 1, ..., ''n''.

To find all the roots of a given polynomial f(X) = X^n + c_{n-1} X^{n-1} + \cdots + c_0 with coefficient vector (c_{n-1}, \dots, c_0) simultaneously is now the same as to find a solution vector \vec z to the system

: \begin{matrix} c_0 &=& g_0(\vec z) &=& (-1)^n \alpha_n(\vec z) &=& (-1)^n z_1 \cdots z_n, \\ c_1 &=& g_1(\vec z) &=& (-1)^{n-1} \alpha_{n-1}(\vec z), && \\ & \vdots & & & & \\ c_{n-1} &=& g_{n-1}(\vec z) &=& -\alpha_1(\vec z) &=& -(z_1 + z_2 + \cdots + z_n). \end{matrix}

The Durand–Kerner method is obtained as the multidimensional Newton's method applied to this system. It is algebraically more convenient to treat those identities of coefficients as the identity of the corresponding polynomials, g_{\vec z}(X) = f(X). In Newton's method one looks, given some initial vector \vec z, for an increment vector \vec w such that g_{\vec z + \vec w}(X) = f(X) is satisfied up to second and higher order terms in the increment. For this one solves the identity

: f(X) - g_{\vec z}(X) = \sum_{k=1}^n \frac{\partial g_{\vec z}(X)}{\partial z_k} w_k = -\sum_{k=1}^n w_k \prod_{j \ne k}(X - z_j).

If the numbers z_1, \dots, z_n are pairwise different, then the polynomials in the terms of the right-hand side form a basis of the ''n''-dimensional space \mathbb C[X]_{n-1} of polynomials of degree at most ''n'' − 1. Thus a solution \vec w to the increment equation exists in this case. The coordinates of the increment \vec w are simply obtained by evaluating the increment equation

: -\sum_{k=1}^n w_k \prod_{j \ne k}(X - z_j) = f(X) - \prod_{j=1}^n (X - z_j)

at the points X = z_k, which results in

: -w_k \prod_{j \ne k}(z_k - z_j) = -w_k \, g_{\vec z}'(z_k) = f(z_k), \quad \text{that is,} \quad w_k = -\frac{f(z_k)}{\prod_{j \ne k}(z_k - z_j)}.
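The final identity can be checked numerically: the Weierstrass update solves the linearized coefficient system exactly. The sketch below assumes NumPy is available; the random test polynomial and the variable names are illustrative.

import numpy as np

# Check that the Weierstrass update solves the linearized system
#   sum_k w_k * (-prod_{j != k} (X - z_j)) = f(X) - prod_j (X - z_j).
rng = np.random.default_rng(0)
n = 4
true_roots = rng.standard_normal(n) + 1j * rng.standard_normal(n)
f_coeffs = np.poly(true_roots)   # monic coefficients of f, highest degree first
z = true_roots + 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

# Jacobian of z -> coefficients of g_z(X): column k holds the coefficients of
# -prod_{j != k} (X - z_j), a polynomial of degree n - 1.
J = np.column_stack([-np.poly(np.delete(z, k)) for k in range(n)])
residual = f_coeffs[1:] - np.poly(z)[1:]   # the leading 1's cancel
w_newton = np.linalg.solve(J, residual)

# Weierstrass formula for the same increment.
w_weierstrass = np.array([-np.polyval(f_coeffs, z[k]) / np.prod(z[k] - np.delete(z, k))
                          for k in range(n)])
print(np.allclose(w_newton, w_weierstrass))   # expected: True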


Root inclusion via Gerschgorin's circles

In the quotient ring (algebra) of residue classes modulo ƒ(''X''), the multiplication by ''X'' defines an endomorphism that has the zeros of ƒ(''X'') as eigenvalues with the corresponding multiplicities. Choosing a basis, the multiplication operator is represented by its coefficient matrix ''A'', the companion matrix of ƒ(''X'') for this basis.

Since every polynomial can be reduced modulo ƒ(''X'') to a polynomial of degree ''n'' − 1 or lower, the space of residue classes can be identified with the space of polynomials of degree bounded by ''n'' − 1. A problem-specific basis can be taken from Lagrange interpolation as the set of ''n'' polynomials

: b_k(X) = \prod_{j \ne k}(X - z_j), \quad k = 1, \dots, n,

where z_1, \dots, z_n \in \mathbb C are pairwise different complex numbers. Note that the kernel functions for the Lagrange interpolation are L_k(X) = \frac{b_k(X)}{b_k(z_k)}.

For the multiplication operator applied to the basis polynomials one obtains from the Lagrange interpolation

: X \cdot b_k(X) \bmod f(X) = z_k \, b_k(X) + \sum_{j=1}^n w_j \, b_j(X),

where w_j = -\frac{f(z_j)}{\prod_{k \ne j}(z_j - z_k)} are again the Weierstrass updates.

The companion matrix of ƒ(''X'') is therefore

: A = \operatorname{diag}(z_1, \dots, z_n) + \begin{pmatrix}1 \\ \vdots \\ 1\end{pmatrix} \cdot (w_1, \dots, w_n).

From the transposed matrix case of the Gershgorin circle theorem it follows that all eigenvalues of ''A'', that is, all roots of ƒ(''X''), are contained in the union of the disks D(a_{k,k}, r_k) with radii r_k = \sum_{j \ne k} |a_{j,k}|. Here one has a_{k,k} = z_k + w_k, so the centers are the next iterates of the Weierstrass iteration, and the radii r_k = (n - 1)|w_k| are multiples of the Weierstrass updates. If the roots of ƒ(''X'') are all well isolated (relative to the computational precision) and the points z_1, \dots, z_n \in \mathbb C are sufficiently close approximations to these roots, then all the disks will become disjoint, so each one contains exactly one zero. The midpoints of the circles will be better approximations of the zeros.

Every conjugate matrix T A T^{-1} of ''A'' is also a companion matrix of ƒ(''X''). Choosing ''T'' as a diagonal matrix leaves the structure of ''A'' invariant. The root close to z_k is contained in any isolated circle with center z_k regardless of ''T''. Choosing the optimal diagonal matrix ''T'' for every index results in better estimates (see ref. Petkovic et al. 1995).
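A small sketch of these inclusion disks in Python, assuming NumPy; the helper name inclusion_disks and the sample cubic with made-up approximations are illustrative only.

import numpy as np

def inclusion_disks(coeffs, z):
    """Centers z_k + w_k and radii (n - 1)|w_k| of the Gershgorin disks
    derived above; coeffs is a monic coefficient vector, z the current
    pairwise distinct approximations."""
    z = np.asarray(z, dtype=complex)
    n = len(z)
    w = np.array([-np.polyval(coeffs, z[k]) / np.prod(z[k] - np.delete(z, k))
                  for k in range(n)])   # Weierstrass updates
    return z + w, (n - 1) * np.abs(w)

# Example with an arbitrary cubic and rough, made-up approximations.
centers, radii = inclusion_disks([1.0, -3.0, 3.0, -5.0],
                                 [2.6 + 0.1j, 0.2 + 1.4j, 0.2 - 1.4j])
for c, r in zip(centers, radii):
    print(c, r)   # every root of the cubic lies in the union of these disks

When the disks are pairwise disjoint, each contains exactly one zero, as stated above.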


Convergence results

The connection between the Taylor series expansion and Newton's method suggests that the distance from z_k + w_k to the corresponding root is of the order O\big(|w_k|^2\big), if the root is well isolated from nearby roots and the approximation is sufficiently close to the root. So after the approximation is close, Newton's method converges ''quadratically''; that is, the error is squared with every step (which will greatly reduce the error once it is less than 1). In the case of the Durand–Kerner method, convergence is quadratic if the vector \vec z = (z_1, \dots, z_n) is close to some permutation of the vector of the roots of ''f''.

For the conclusion of linear convergence there is a more specific result (see ref. Petkovic et al. 1995). If the initial vector \vec z and its vector of Weierstrass updates \vec w = (w_1, \dots, w_n) satisfy the inequality

: \max_{k} |w_k| \le \frac{1}{2n} \min_{k \ne j} |z_k - z_j|,

then this inequality also holds for all iterates, all inclusion disks D\big(z_k + w_k, (n - 1) |w_k|\big) are disjoint, and linear convergence with a contraction factor of 1/2 holds. Further, the inclusion disks can in this case be chosen as

: D\left(z_k + w_k, \tfrac14 |w_k|\right), \quad k = 1, \dots, n,

each containing exactly one zero of ''f''.
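The stated condition can be tested directly for a given vector of approximations; a minimal sketch in Python assuming NumPy, with an illustrative helper name.

import numpy as np
from itertools import combinations

def guarantees_convergence(coeffs, z):
    """True if max_k |w_k| <= min_{k != j} |z_k - z_j| / (2n), the condition
    quoted above under which the disks D(z_k + w_k, (n - 1)|w_k|) stay
    disjoint and the iteration contracts linearly with factor 1/2."""
    z = np.asarray(z, dtype=complex)
    n = len(z)
    w = np.array([-np.polyval(coeffs, z[k]) / np.prod(z[k] - np.delete(z, k))
                  for k in range(n)])   # Weierstrass updates
    min_sep = min(abs(a - b) for a, b in combinations(z, 2))
    return bool(np.max(np.abs(w)) <= min_sep / (2 * n))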


Failing general convergence

The Weierstrass / Durand–Kerner method is not generally convergent: in other words, it is not true that for every polynomial the set of initial vectors that eventually converge to roots is open and dense. In fact, there are open sets of polynomials that have open sets of initial vectors converging to periodic cycles other than roots (see Reinke et al., 2020).


References

* Bo Jacoby, ''Nulpunkter for polynomier'', CAE-nyt (a periodical for Dansk CAE Gruppe [Danish CAE Group]), 1988.
* Agnethe Knudsen, ''Numeriske Metoder'' (lecture notes), Københavns Teknikum.
* Bo Jacoby, ''Numerisk løsning af ligninger'', Bygningsstatiske Meddelelser (published by the Danish Society for Structural Science and Engineering), volume 63, no. 3–4, 1992, pp. 83–105.
* Victor Pan, ''Univariate Polynomial Root-Finding with Lower Computational Precision and Higher Convergence Rates'', Tech-Report, City University of New York, May 2002.
* Jan Verschelde, 2003.
* Bernhard Reinke, Dierk Schleicher, and Michael Stoll, ''The Weierstrass root finder is not generally convergent'', 2020.


External links

* ''Ada Generic_Roots using the Durand–Kerner Method'', an open-source implementation in Ada
* ''Polynomial Roots'', an open-source implementation in Java
* ''Roots Extraction from Polynomials: The Durand–Kerner Method'', containing a Java applet demonstration