Steffensen's Method

In numerical analysis, Steffensen's method is an iterative method for numerical root-finding, named after Johan Frederik Steffensen, that is similar to the secant method and to Newton's method. Steffensen's method achieves a quadratic order of convergence without using derivatives, whereas Newton's method also converges quadratically but requires derivatives, and the secant method does not require derivatives but converges less quickly than quadratically. Steffensen's method has the drawback that it requires two function evaluations per step, whereas the secant method requires only one evaluation per step, so it is not necessarily the most efficient in terms of computational cost, depending on the number of iterations each requires. Newton's method also requires evaluating two functions per step (the function itself and its derivative), so its computational cost per step matches that of Steffensen's method whenever the derivative is about as costly to evaluate as the original function. Steffensen's method can be derived as an adaptation of Aitken's delta-squared process applied to fixed-point iteration. Viewed in this way, Steffensen's method naturally generalizes to efficient fixed-point calculation in general Banach spaces, whenever fixed points are guaranteed to exist and fixed-point iteration is guaranteed to converge, although possibly slowly, by the Banach fixed-point theorem.


Simple description

The simplest form of the formula for Steffensen's method occurs when it is used to find a zero of a real function f; that is, to find the real value x_\star that satisfies f(x_\star) = 0. Near the solution x_\star, the derivative of the function, f', needs to either exactly or very nearly satisfy -1 < f'(x_\star) < 0. For some functions, Steffensen's method can work even if this condition is not met, but in such a case, the starting value x_0 must be ''very'' close to the actual solution x_\star, and convergence to the solution may be slow. Adjustment of the size of the method's intermediate step, mentioned later, can improve convergence in some of these cases.

Given an adequate starting value x_0, a sequence of values x_0, x_1, x_2, \dots, x_n, \dots can be generated using the formula below. When it works, each value in the sequence is much closer to the solution x_\star than the prior value. The value x_n from the current step generates the value x_{n+1} for the next step, via the formula

: x_{n+1} = x_n - \frac{f(x_n)}{g(x_n)}

for n = 0, 1, 2, 3, \dots, where the slope function g(x) is a composite of the original function f given by the formula

: g(x) = \frac{f\bigl(x + f(x)\bigr)}{f(x)} - 1

or, perhaps more clearly,

: g(x) = \frac{f(x + h) - f(x)}{h} \approx \frac{\mathrm{d}f}{\mathrm{d}x} \equiv f'(x),

where h \equiv f(x) is a step-size between the last iteration point, x, and an auxiliary point located at x + h.

Technically, the function g is called the first-order divided difference of f between those two points. Practically, it is the averaged value of the slope f' of the function f between the last sequence point (x, y) = \bigl( x_n, f(x_n) \bigr) and the auxiliary point at (x, y) = \bigl( x_n + h, f(x_n + h) \bigr), with the size of the intermediate step (and its direction) given by h = f(x_n).

Because the value of g is an approximation for f', its value can optionally be checked to see if it meets the condition -1 < g < 0, which is required to guarantee convergence of Steffensen's algorithm. Although slight non-conformance may not necessarily be dire, any large departure from the condition warns that Steffensen's method is liable to fail, and temporary use of some fallback algorithm is warranted (e.g. the more robust Illinois algorithm, or plain regula falsi). It is only for the purpose of finding h for this auxiliary point that the value of the function f must fulfill the requirement that -1 < f'(x_\star) < 0. For all other parts of the calculation, Steffensen's method only requires the function f to be continuous and to actually have a nearby solution. Several modest modifications of the step h used in the formula for the slope g exist, such as scaling it by a small constant factor, to accommodate functions f that do not quite meet the requirement.
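
To make the recipe concrete, here is a minimal Python sketch of the iteration above (the function name, the example equation, and the simple stopping rule are illustrative choices, not part of the original text). The example f(x) = (2 - x^2)/4 has its root at \sqrt{2}, where f'(\sqrt{2}) = -\sqrt{2}/2 \approx -0.71 lies inside the required interval (-1, 0).

    def steffensen_root(f, x0, tol=1e-12, max_iter=100):
        """Find a root of f near x0 with Steffensen's method.

        Derivative-free: the slope g(x) = f(x + f(x))/f(x) - 1 stands in
        for f'(x), so only f itself is evaluated (twice per step).
        """
        x = x0
        for _ in range(max_iter):
            fx = f(x)
            if abs(fx) < tol:              # close enough to a root
                return x
            gx = f(x + fx) / fx - 1.0      # first-order divided difference
            x = x - fx / gx                # update x_{n+1} = x_n - f(x_n)/g(x_n)
        raise RuntimeError("Steffensen's method failed to converge")

    # Solve x^2 = 2 via f(x) = (2 - x*x)/4, chosen so that
    # f'(sqrt(2)) = -sqrt(2)/2 ~ -0.71 lies in the interval (-1, 0).
    print(steffensen_root(lambda x: (2 - x * x) / 4, 1.0))   # ~1.41421356...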


Advantages and drawbacks

The main advantage of Steffensen's method is that it has quadratic convergence like Newton's method; that is, both methods find roots of an equation f(x) = 0 just as "quickly". In this case, ''quickly'' means that for both methods, the number of correct digits in the answer doubles with each step. But the formula for Newton's method requires evaluation of the function's derivative f' as well as the function f, while Steffensen's method only requires f itself. This is important when the derivative is not easily or efficiently available.

The price for the quick convergence is the double function evaluation: Both f(x_n) and f(x_n + h) must be calculated, which might be time-consuming if f is complicated. For comparison, both regula falsi and the secant method only need one function evaluation per step. The secant method increases the number of correct digits by "only" a factor of roughly 1.6 per step, but one can do twice as many steps of the secant method within a given time. Since the secant method can carry out twice as many steps in the same time as Steffensen's method, in practical use the secant method actually converges faster than Steffensen's method, when both algorithms succeed: The secant method achieves a factor of about (1.6)^2 \approx 2.6 times as many digits for every two steps (two function evaluations), compared to Steffensen's factor of 2 for every one step (two function evaluations).

Similar to most other iterative root-finding algorithms, the crucial weakness in Steffensen's method is choosing a "sufficiently close" starting value x_0. If the value of x_0 is not "close enough" to the actual solution x_\star, the method may fail, and the sequence of values x_0, x_1, x_2, x_3, \dots may either erratically flip-flop between two extremes, or diverge to infinity, or both.
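
The evaluation-count trade-off is easy to check empirically. The following self-contained Python sketch (the helper names and the test function are illustrative assumptions, not from the original text) counts calls to f for a secant iteration and a Steffensen iteration on the same problem:

    def count_evals(method, f, *args):
        """Run a root finder while counting how often it calls f."""
        calls = 0
        def counted(x):
            nonlocal calls
            calls += 1
            return f(x)
        root = method(counted, *args)
        return root, calls

    def secant(f, x0, x1, tol=1e-12):
        f0, f1 = f(x0), f(x1)                  # one evaluation per step after setup
        while abs(f1) > tol:
            x0, x1, f0 = x1, x1 - f1 * (x1 - x0) / (f1 - f0), f1
            f1 = f(x1)
        return x1

    def steffensen(f, x, tol=1e-12):
        fx = f(x)
        while abs(fx) > tol:
            gx = f(x + fx) / fx - 1.0          # two evaluations per step
            x -= fx / gx
            fx = f(x)
        return x

    f = lambda x: (2 - x * x) / 4              # root at sqrt(2), f' there ~ -0.71
    print(count_evals(secant, f, 1.0, 1.5))
    print(count_evals(steffensen, f, 1.0))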


Derivation using Aitken's delta-squared process

The version of Steffensen's method implemented in the MATLAB code shown below can be found using Aitken's delta-squared process for convergence acceleration. To compare the following formulae to the formulae in the section above, notice that x_n = p - p_n. This method assumes starting with a linearly convergent sequence and increases the rate of convergence of that sequence. If the signs of p_n, p_{n+1}, p_{n+2} agree and p_n is "sufficiently close" to the desired limit of the sequence p, then we can assume

: \frac{p_{n+1} - p}{p_n - p} \approx \frac{p_{n+2} - p}{p_{n+1} - p},

so that

: ( p_{n+2} - 2 p_{n+1} + p_n )\, p \approx p_{n+2}\, p_n - p_{n+1}^2.

Solving for the desired limit of the sequence p gives:

: p \approx \frac{p_{n+2}\, p_n - p_{n+1}^2}{p_{n+2} - 2 p_{n+1} + p_n} = p_n - \frac{( p_{n+1} - p_n )^2}{p_{n+2} - 2 p_{n+1} + p_n},

which results in the more rapidly convergent sequence:

: p \approx p_{n+3} = p_n - \frac{( p_{n+1} - p_n )^2}{p_{n+2} - 2 p_{n+1} + p_n}.
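
As a sketch of the idea in Python (the helper name and the cosine example are illustrative, not from the original): the loop below performs two ordinary fixed-point steps and then one Aitken delta-squared extrapolation, which is exactly the structure of the MATLAB code in the next section. Note that here F is the fixed-point map itself, whereas the MATLAB code builds the map as x + f(x) from a root function f.

    import math

    def aitken_accelerate(F, p0, tol=1e-12, max_iter=100):
        """Accelerate the fixed-point iteration p -> F(p) with Aitken's
        delta-squared formula."""
        for _ in range(max_iter):
            p1 = F(p0)                          # two plain fixed-point steps
            p2 = F(p1)
            denom = p2 - 2.0 * p1 + p0
            if denom == 0.0:                    # sequence already settled
                return p2
            p = p0 - (p1 - p0) ** 2 / denom     # delta-squared extrapolation
            if abs(p - p0) < tol:
                return p
            p0 = p                              # restart from the better value
        raise RuntimeError("failed to converge")

    # Example: p -> cos(p) converges only linearly on its own;
    # acceleration recovers the Dottie number ~0.7390851 quickly.
    print(aitken_accelerate(math.cos, 1.0))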


Code example


In Matlab

Here is the source for an implementation of Steffensen's method in MATLAB.

    function p = Steffensen(f, p0, tol)
    % This function takes as inputs: a fixed-point iteration function, f,
    % an initial guess to the fixed point, p0, and a tolerance, tol.
    % The fixed-point iteration function is assumed to be input as an
    % inline function.
    % This function will calculate and return the fixed point, p, of the
    % map x -> f(x) + x (that is, a value with f(p) = 0) to within the
    % desired tolerance, tol.
    format compact  % This shortens the output.
    format long     % This prints more decimal places.
    converged = false;
    for i = 1:1000  % Do a large, but finite, number of iterations so that
                    % if the method fails to converge we will not be stuck
                    % in an infinite loop.
        p1 = f(p0) + p0;  % Calculate the next two guesses for the fixed point.
        p2 = f(p1) + p1;
        p = p0 - (p1 - p0)^2/(p2 - 2*p1 + p0)  % Use Aitken's delta-squared
                                               % method for a better approximation.
        if abs(p - p0) < tol  % Test to see if we are within tolerance.
            converged = true;
            break             % If we are, stop: we have our answer.
        end
        p0 = p;  % Update p0 for the next iteration.
    end
    if ~converged  % If we failed to meet the tolerance, report failure.
        disp('failed to converge in 1000 iterations.')
    end


In Python

Here is the source for an implementation of Steffensen's method in Python.

    from typing import Callable, Iterator

    Func = Callable[[float], float]


    def g(f: Func, x: float, fx: float) -> Func:
        """First-order divided difference function.

        Arguments:
            f: Function input to g
            x: Point at which to evaluate g
            fx: Function f evaluated at x
        """
        return lambda x: f(x + fx) / fx - 1


    def steff(f: Func, x: float, tol: float) -> Iterator[float]:
        """Steffensen algorithm for finding roots.

        This generator yields the x_{n+1} value first; each time the
        generator is iterated again, it yields the next value x_{n+2},
        and so on.

        Arguments:
            f: Function whose root we are searching for
            x: Starting value; on the n-th pass through the loop, x is x_n
            tol: Stop once abs(f(x)) is at most tol
        """
        while True:
            fx = f(x)
            if abs(fx) <= tol:
                break
            else:
                gx = g(f, x, fx)(x)  # Evaluate the divided difference at x.
                x = x - fx / gx      # Update to x_{n+1}.
                yield x              # Yield the new value.
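
One possible way to drive this generator (the example function and starting value are illustrative):

    # Find the root of f(x) = (2 - x*x)/4, i.e. sqrt(2),
    # printing each successive iterate the generator yields.
    f = lambda x: (2 - x * x) / 4
    x = 1.0
    for x in steff(f, x, tol=1e-12):
        print(x)        # x_1, x_2, ... each closer to sqrt(2)
    print("root =", x)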


Generalization to Banach space

Steffensen's method can also be used to find an input x = x_\star for a different kind of function F that produces output the same as its input: x_\star = F(x_\star) for the special value x_\star. Solutions like x_\star are called ''fixed points''. Many such functions can be used to find their own solutions by repeatedly recycling the result back as input, but the rate of convergence can be slow, or the function can fail to converge at all, depending on the individual function. Steffensen's method accelerates this convergence, to make it quadratic.

Momentarily ignoring the issues of a more general Banach space vs. basic real numbers, for the sake of an example: To re-orient the reader to the earlier section, a simple toy model fixed-point function, \tilde F, using any root function f, can be made with \tilde F(x) = x + \varepsilon f(x). Here \varepsilon is a constant with the appropriate sign that is small enough in magnitude to make \tilde F stable under iteration, but large enough for the non-linearity of the function f to be appreciable.

This method for finding fixed points of a real-valued function has been generalized for functions F : X \to X that map a Banach space X onto itself, or even more generally functions F : X \to Y that map from one Banach space X into another Banach space Y. The generalized method assumes that a family of bounded linear operators \bigl\{\, G(u, v) : u, v \in X \,\bigr\} associated with u and v can be devised that (locally) satisfies the condition

: F(u) - F(v) = G(u, v)\,( u - v ). \qquad (1)

The operator G is roughly equivalent to a matrix whose entries are all functions of vector arguments u and v.

Refer again back to the simple function f, given in the first section, where the function merely takes in and puts out real numbers: There, the function g is a ''divided difference''. In the generalized form here, the operator G is the analogue of a divided difference for use in the Banach space. If division is possible in the Banach space, then the linear operator G can be obtained from

: G(u, v) = \bigl[\, F(u) - F(v)\, \bigr] \bigl(\, u - v\, \bigr)^{-1},

which may provide some insight: Expressed in this way, the linear operator G can be more easily seen to be an elaborate version of the divided difference g discussed in the first section, above. The quotient form is shown here for orientation only; it is ''not'' required per se. Note also that division within the Banach space is not necessary for the elaborated Steffensen's method to be viable; the only requirement is that the operator G satisfy condition (1).

Steffensen's method is then very similar to Newton's method, except that it uses the divided difference G\bigl( F(x), x \bigr) instead of the derivative F'(x). Note that for arguments x close to some fixed point x_\star, fixed-point functions F and their linear operators G meeting condition (1) satisfy F'(x) \approx G\bigl( F(x), x \bigr) \approx I, where I is the identity operator.

In the case that division is possible in the Banach space, the generalized iteration formula is given by

: x_{n+1} = x_n + \Bigl[\, I - G\bigl( F(x_n), x_n \bigr)\, \Bigr]^{-1} \Bigl[\, F(x_n) - x_n\, \Bigr],

for n = 1, 2, 3, \dots. In the more general case in which division may not be possible, the iteration formula requires finding a solution x_{n+1} close to x_n for which

: \Bigl[\, I - G\bigl( F(x_n), x_n \bigr)\, \Bigr] \bigl(\, x_{n+1} - x_n\, \bigr) = F(x_n) - x_n.

Equivalently, one may seek the solution x_{n+1} to the somewhat reduced form

: \Bigl[\, I - G\bigl( F(x_n), x_n \bigr)\, \Bigr]\, x_{n+1} = \Bigl[\, F(x_n) - G\bigl( F(x_n), x_n \bigr)\, x_n\, \Bigr],

with all the values inside square brackets being independent of x_{n+1}: The bracketed terms all depend only on x_n. However, the second form may not be as numerically stable as the first; because the first form involves finding a value for a (hopefully) small difference, it may be numerically more likely to avoid excessively large or erratic changes to the iterated value x_n.

If the linear operator G satisfies

: \Bigl\| G(u, v) - G(x, y) \Bigr\| \le k \Bigl( \bigl\| u - x \bigr\| + \bigl\| v - y \bigr\| \Bigr)

for some positive real constant k, then the method converges quadratically to a fixed point of F if the initial approximation x_0 is "sufficiently close" to the desired solution x_\star that satisfies x_\star = F(x_\star).
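
As an illustrative finite-dimensional sketch only: in R^n the "division possible" case amounts to solving a linear system, and one standard way (an assumption here, not given in the text above) to build an operator G satisfying condition (1) is the coordinatewise divided-difference matrix, whose column j is [F(z_j) - F(z_{j-1})]/(u_j - v_j) for intermediate points z_j that agree with u in their first j coordinates and with v elsewhere; the column differences then telescope to F(u) - F(v). All names and the example map below are illustrative.

    # A finite-dimensional sketch (R^n via NumPy) of the generalized iteration.
    import numpy as np

    def divided_difference(F, u, v):
        """Matrix G(u, v) satisfying F(u) - F(v) = G(u, v) @ (u - v)."""
        n = len(u)
        G = np.zeros((n, n))
        z = v.copy()
        Fz_prev = F(z)
        for j in range(n):
            z[j] = u[j]                  # switch coordinate j from v to u
            Fz = F(z)
            step = u[j] - v[j]
            # If step is zero this coordinate contributes nothing, so any
            # column (here zero) keeps the telescoping identity intact.
            G[:, j] = (Fz - Fz_prev) / step if step != 0 else 0.0
            Fz_prev = Fz
        return G

    def steffensen_fixed_point(F, x, tol=1e-12, max_iter=50):
        """Iterate x_{n+1} = x_n + [I - G(F(x_n), x_n)]^{-1} (F(x_n) - x_n)."""
        I = np.eye(len(x))
        for _ in range(max_iter):
            Fx = F(x)
            if np.linalg.norm(Fx - x) < tol:
                return x
            G = divided_difference(F, Fx, x)
            x = x + np.linalg.solve(I - G, Fx - x)   # solve, don't invert
        raise RuntimeError("failed to converge")

    # Example: a contractive 2-D fixed-point map (illustrative).
    F = lambda x: np.array([0.5 * np.cos(x[0]) - 0.1 * x[1],
                            0.3 * np.sin(x[0] + x[1])])
    print(steffensen_fixed_point(F, np.array([0.0, 0.0])))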

