In numerical analysis, catastrophic cancellation is the phenomenon that subtracting good approximations to two nearby numbers may yield a very bad approximation to the difference of the original numbers.
For example, if there are two studs, one 254.5 cm long and the other 253.5 cm long, and they are measured with a ruler that is good only to the centimeter, then the approximations could come out to be 255 cm and 253 cm. These may be good approximations, in relative error, to the true lengths: each approximation is off by only 0.5 cm, less than 0.2% of the true lengths. However, if the ''approximate'' lengths are subtracted, the difference will be 255 cm − 253 cm = 2 cm, even though the true difference between the lengths is 254.5 cm − 253.5 cm = 1 cm. The difference of the approximations, 2 cm, is in error by 100% of the magnitude of the difference of the true values, 1 cm.
Catastrophic cancellation is not affected by how large the inputs are—it applies just as much to large and small inputs.
It depends only on how large the ''difference'' is, and on the ''error'' of the inputs.
Exactly the same error would arise by subtracting 1.5 cm from 3.5 cm as approximations to 2 cm and 3 cm, or by subtracting 1000000001.5 cm from 1000000003.5 cm as approximations to 1000000002 cm and 1000000003 cm: in each case the inputs are off by 0.5 cm, the true difference is 1 cm, and the computed difference of 2 cm is wrong by 100%.
Catastrophic cancellation may happen even if the difference is computed exactly, as in the example above—it is not a property of any particular kind of arithmetic like floating-point arithmetic; rather, it is inherent to subtraction, when the ''inputs'' are approximations themselves. Indeed, in floating-point arithmetic, when the inputs are close enough, the floating-point difference is computed exactly, by the
Sterbenz lemma—there is no rounding error introduced by the floating-point subtraction operation.
Formal analysis
Formally, catastrophic cancellation happens because subtraction is ill-conditioned at nearby inputs: even if approximations <math>\tilde x = x(1 + \delta_x)</math> and <math>\tilde y = y(1 + \delta_y)</math> have small relative errors <math>\delta_x</math> and <math>\delta_y</math> from true values <math>x</math> and <math>y</math>, respectively, the relative error of the difference <math>\tilde x - \tilde y</math> of the approximations from the difference <math>x - y</math> of the true values is inversely proportional to the difference of the true values:

<math>\tilde x - \tilde y = x(1 + \delta_x) - y(1 + \delta_y) = x - y + x\delta_x - y\delta_y = (x - y)\left(1 + \frac{x\delta_x - y\delta_y}{x - y}\right).</math>

Thus, the relative error of the exact difference <math>\tilde x - \tilde y</math> of the approximations from the difference <math>x - y</math> of the true values is

<math>\left|\frac{x\delta_x - y\delta_y}{x - y}\right|,</math>

which can be arbitrarily large if the true values <math>x</math> and <math>y</math> are close.
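For instance, with the stud measurements from the example above, <math>x = 254.5~\text{cm}</math> and <math>y = 253.5~\text{cm}</math> are approximated by 255 cm and 253 cm, so the relative errors of the inputs are <math>\delta_x = 0.5/254.5</math> and <math>\delta_y = -0.5/253.5</math>, and the formula gives

<math>\left|\frac{x\delta_x - y\delta_y}{x - y}\right| = \frac{0.5 + 0.5}{254.5 - 253.5} = 1 = 100\%,</math>

matching the error observed in that example.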
In numerical algorithms
Subtracting nearby numbers in floating-point arithmetic does not always cause catastrophic cancellation, or even any error—by the
Sterbenz lemma, if the numbers are close enough the floating-point difference is exact.
But cancellation may ''amplify'' errors in the inputs that arose from rounding in other floating-point arithmetic.
Example: Difference of squares
Given numbers <math>x</math> and <math>y</math>, the naive attempt to compute the mathematical function <math>x^2 - y^2</math> by the floating-point arithmetic <math>\operatorname{fl}(\operatorname{fl}(x \cdot x) - \operatorname{fl}(y \cdot y))</math> is subject to catastrophic cancellation when <math>x</math> and <math>y</math> are close in magnitude, because the subtraction can expose the rounding errors in the squaring. The alternative factoring <math>(x + y)(x - y)</math>, evaluated by the floating-point arithmetic <math>\operatorname{fl}(\operatorname{fl}(x + y) \cdot \operatorname{fl}(x - y))</math>, avoids catastrophic cancellation because it avoids introducing rounding error leading into the subtraction.
For example, if <math>x = 1 + 2^{-29}</math> and <math>y = 1 + 2^{-30}</math>, then the true value of the difference <math>x^2 - y^2 = 2^{-29} + 3 \cdot 2^{-60}</math> is approximately <math>1.8626451518\ldots \times 10^{-9}</math>. In IEEE 754 binary64 arithmetic, evaluating the alternative factoring <math>(x + y)(x - y)</math> gives this result exactly (with no rounding), but evaluating the naive expression <math>x^2 - y^2</math> gives the floating-point number <math>2^{-29} \approx 1.8626451492\ldots \times 10^{-9}</math>, of which less than half the digits are correct; the remaining digits are wrong because the terms <math>2^{-58}</math> and <math>2^{-60}</math> are lost due to rounding when calculating the intermediate squared values.
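This comparison can be reproduced with a short C program (an illustrative sketch added here; it assumes that double is IEEE 754 binary64 and that intermediate results are not kept in extended precision):

#include <stdio.h>
#include <math.h>

int main(void) {
    /* x = 1 + 2^-29 and y = 1 + 2^-30 are exactly representable in binary64. */
    double x = 1.0 + ldexp(1.0, -29);
    double y = 1.0 + ldexp(1.0, -30);

    double naive    = x * x - y * y;       /* squares are rounded, then the subtraction cancels     */
    double factored = (x + y) * (x - y);   /* sum and difference are exact, so no error is exposed  */

    printf("naive:    %.17g\n", naive);    /* 1.862645149230957e-09  (trailing digits wrong) */
    printf("factored: %.17g\n", factored); /* 1.8626451518330422e-09 (exact result)          */
    return 0;
}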
Example: Complex arcsine
When computing the
complex arcsine function, one may be tempted to use the logarithmic formula directly:

<math>\arcsin(z) = -i \log\left(iz + \sqrt{1 - z^2}\right).</math>
However, suppose <math>z = iy</math> for large <math>y > 0</math>. Then <math>iz = -y</math> and <math>\sqrt{1 - z^2} = \sqrt{1 + y^2}</math>; call the difference between them <math>\varepsilon = \sqrt{1 + y^2} - y</math>—a very small difference, nearly zero. If <math>\sqrt{1 + y^2}</math> is evaluated in floating-point arithmetic giving <math>\operatorname{fl}\bigl(\sqrt{\operatorname{fl}(1 + y^2)}\bigr) = \sqrt{1 + y^2}\,(1 + \delta)</math> with any error <math>\delta \ne 0</math>, where <math>\operatorname{fl}(\cdot)</math> denotes floating-point rounding, then computing the difference <math>\sqrt{1 + y^2}\,(1 + \delta) - y</math> of two nearby numbers, both very close to <math>y</math>, may amplify the error <math>\delta</math> in one input by a factor of <math>\sqrt{1 + y^2}/\varepsilon</math>—a very large factor because <math>\varepsilon</math> was nearly zero.
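How large this amplification factor is can be seen by rationalizing <math>\varepsilon</math>: for large <math>y</math>,

<math>\varepsilon = \sqrt{1 + y^2} - y = \frac{1}{\sqrt{1 + y^2} + y} \approx \frac{1}{2y},</math>

so the factor <math>\sqrt{1 + y^2}/\varepsilon</math> grows like <math>2y^2</math>.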
For instance, for a large value of <math>y</math>, using the naive logarithmic formula in IEEE 754 binary64 arithmetic may give a result with only five out of sixteen digits correct and the remainder all incorrect.
In the case of <math>z = iy</math> for <math>y > 0</math>, using the identity <math>\arcsin(z) = -\arcsin(-z)</math> avoids cancellation: for the reflected argument <math>-z = -iy</math>, the logarithm is taken of <math>i(-z) + \sqrt{1 - (-z)^2} = y + \sqrt{1 + y^2}</math>, so the subtraction is effectively addition with the same sign, which does not cancel.
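A short C99 sketch contrasts the naive formula with the reflected evaluation; the input y = 10^6 is chosen here only for illustration:

#include <stdio.h>
#include <complex.h>

/* Naive complex arcsine via the logarithmic formula arcsin(z) = -i log(iz + sqrt(1 - z^2)). */
static double complex asin_log(double complex z) {
    return -I * clog(I * z + csqrt(1.0 - z * z));
}

int main(void) {
    double complex z = 1.0e6 * I;          /* a large pure-imaginary input z = iy */

    double complex naive = asin_log(z);    /* iz + sqrt(1 - z^2) cancels catastrophically  */
    double complex fixed = -asin_log(-z);  /* identity arcsin(z) = -arcsin(-z): the sum     */
                                           /* inside the logarithm has terms of equal sign  */

    printf("naive: %.17g\n", cimag(naive)); /* many of the trailing digits are wrong        */
    printf("fixed: %.17g\n", cimag(fixed)); /* close to the true value asinh(1e6)           */
    return 0;
}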
Example: Radix conversion
Numerical constants in software programs are often written in decimal, such as in the C fragment
double x = 1.000000000000001; to declare and initialize an IEEE 754 binary64 variable named
x
.
However, 1.000000000000001 is not a binary64 floating-point number; the nearest one, which <code>x</code> will be initialized to in this fragment, is <math>1 + 5 \cdot 2^{-52} = 1.0000000000000011102230246251565\ldots</math>.
Although the radix conversion from decimal floating-point to binary floating-point only incurs a small relative error, catastrophic cancellation may amplify it into a much larger one:
double x = 1.000000000000001; // rounded to 1 + 5*2^-52
double y = 1.000000000000002; // rounded to 1 + 9*2^-52
double z = y - x;             // difference is exactly 4*2^-52
The difference <code>z</code> is <math>4 \cdot 2^{-52} \approx 8.88 \times 10^{-16}</math>. The relative errors of <code>x</code> from 1.000000000000001 and of <code>y</code> from 1.000000000000002 are both below <math>2^{-53} \approx 1.11 \times 10^{-16}</math>, and the floating-point subtraction <code>y - x</code> is computed exactly by the Sterbenz lemma. But even though the inputs are good approximations, and even though the subtraction is computed exactly, the difference of the ''approximations'', <math>4 \cdot 2^{-52} \approx 8.88 \times 10^{-16}</math>, has a relative error of over 11% from the difference <math>1 \times 10^{-15}</math> of the original values as written in decimal: catastrophic cancellation amplified a tiny error in radix conversion into a large error in the output.
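Extending the fragment above with a few print statements (an added illustration; the 1e-15 literal stands for the intended decimal difference) makes the amplified error visible:

#include <stdio.h>

int main(void) {
    double x = 1.000000000000001;   /* rounded to 1 + 5*2^-52 */
    double y = 1.000000000000002;   /* rounded to 1 + 9*2^-52 */
    double z = y - x;               /* exactly 4*2^-52        */

    double intended = 1e-15;        /* the difference of the two decimal values as written */
    printf("z              = %.17g\n", z);                         /* 8.8817841970012523e-16 */
    printf("relative error = %.2g\n", (z - intended) / intended);  /* about -0.11, i.e. 11%  */
    return 0;
}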
Benign cancellation
Cancellation is sometimes useful and desirable in numerical algorithms.
For example, the
2Sum and Fast2Sum algorithms both rely on such cancellation after a rounding error in order to exactly compute what the error was in a floating-point addition operation as a floating-point number itself.
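For example, one step of Fast2Sum can be sketched in C as follows (the function name and test values here are illustrative; the algorithm assumes |a| >= |b|):

#include <stdio.h>

/* Fast2Sum (Dekker): assuming |a| >= |b|, returns s = fl(a + b) and stores in *err
   the rounding error of that addition, so that s + *err equals a + b exactly. */
static double fast_two_sum(double a, double b, double *err) {
    double s = a + b;
    double z = s - a;   /* the part of b that made it into s; exact, by benign cancellation */
    *err = b - z;       /* what was rounded away, recovered as a floating-point number      */
    return s;
}

int main(void) {
    double err;
    double s = fast_two_sum(1.0, 0.1, &err);
    /* err is the exact rounding error of computing 1.0 + 0.1 in binary64, a tiny negative number. */
    printf("s = %.17g\nerr = %.17g\n", s, err);
    return 0;
}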
The function <math>\log(1 + x)</math>, if evaluated naively at points <math>x \approx 0</math>, will lose most of the digits of <math>x</math> in rounding <math>1 + x</math>. However, the function <math>\log(1 + x)</math> itself is well-conditioned at inputs near <math>0</math>. Rewriting it as

<math>\log(1 + x) = x \cdot \frac{\log(1 + x)}{(1 + x) - 1}</math>

exploits cancellation in the floating-point evaluation of <math>(1 + x) - 1</math> to avoid the error from <math>1 + x</math> evaluated directly. This works because the cancellation in the numerator <math>\log(\operatorname{fl}(1 + x))</math> and the cancellation in the denominator <math>\operatorname{fl}(1 + x) - 1</math> counteract each other; the function <math>u \mapsto \log(1 + u)/u</math> is well-enough conditioned near zero that <math>\log(\operatorname{fl}(1 + x))/(\operatorname{fl}(1 + x) - 1)</math> gives a good approximation to <math>\log(1 + x)/x</math>, and thus <math>x \cdot \log(\operatorname{fl}(1 + x))/(\operatorname{fl}(1 + x) - 1)</math> gives a good approximation to <math>\log(1 + x)</math>.
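A C sketch of this rewriting (the guard for the case where 1 + x rounds to exactly 1 is an implementation detail added here, not prescribed by the text):

#include <stdio.h>
#include <math.h>

/* log(1 + x) for small x via the rewriting log(1 + x) = x * log(1 + x) / ((1 + x) - 1):
   the cancellation in w - 1 matches the perturbed argument at which log(w) was evaluated. */
static double log1p_rewritten(double x) {
    double w = 1.0 + x;              /* rounded; may lose most of the digits of x   */
    if (w == 1.0)
        return x;                    /* 1 + x rounded to 1, and log(1 + x) ~ x here */
    return x * (log(w) / (w - 1.0));
}

int main(void) {
    double x = 1e-10;
    printf("naive:     %.17g\n", log(1.0 + x));        /* wrong in roughly half of the digits */
    printf("rewritten: %.17g\n", log1p_rewritten(x));  /* close to the true value             */
    printf("log1p:     %.17g\n", log1p(x));            /* C library reference                 */
    return 0;
}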
References
{{reflist}}