In numerical analysis, the Kahan summation algorithm, also known as compensated summation, significantly reduces the numerical error in the total obtained by adding a sequence of finite-precision floating-point numbers, compared to the obvious approach. This is done by keeping a separate ''running compensation'' (a variable to accumulate small errors), in effect extending the precision of the sum by the precision of the compensation variable.

In particular, simply summing n numbers in sequence has a worst-case error that grows proportional to n, and a root mean square error that grows as \sqrt{n} for random inputs (the roundoff errors form a random walk). With compensated summation, using a compensation variable with sufficiently high precision, the worst-case error bound is effectively independent of n, so a large number of values can be summed with an error that only depends on the floating-point precision of the result.

The algorithm is attributed to William Kahan; Ivo Babuška seems to have come up with a similar algorithm independently (hence Kahan–Babuška summation). Similar, earlier techniques are, for example, Bresenham's line algorithm, which keeps track of the accumulated error in integer operations (although first documented around the same time), and delta-sigma modulation.


The algorithm

In pseudocode, the algorithm is:

 function KahanSum(input)
     var sum = 0.0                    // Prepare the accumulator.
     var c = 0.0                      // A running compensation for lost low-order bits.
     for i = 1 to input.length do     // The array ''input'' has elements indexed input[1] to input[input.length].
         var y = input[i] - c         // ''c'' is zero the first time around.
         var t = sum + y              // Alas, ''sum'' is big, ''y'' small, so low-order digits of ''y'' are lost.
         c = (t - sum) - y            // ''(t - sum)'' cancels the high-order part of ''y''; subtracting ''y'' recovers negative (low part of ''y'').
         sum = t                      // Algebraically, ''c'' should always be zero. Beware overly-aggressive optimizing compilers!
     next i                           // Next time around, the lost low part will be added to ''y'' in a fresh attempt.
     return sum

This algorithm can also be rewritten to use the Fast2Sum algorithm:

 function KahanSum2(input)
     var sum = 0.0                    // Prepare the accumulator.
     var c = 0.0                      // A running compensation for lost low-order bits.
     for i = 1 to input.length do     // The array ''input'' has elements indexed input[1] to input[input.length].
         var y = input[i] + c         // ''c'' is zero the first time around.
         (sum, c) = Fast2Sum(sum, y)  // ''sum'' + ''c'' is an approximation to the exact sum.
     next i                           // Next time around, the lost low part will be added to ''y'' in a fresh attempt.
     return sum
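The compensated loop can be expressed directly in a general-purpose language. The following is a minimal Python sketch of the same idea; the name kahan_sum and the demonstration data are illustrative choices, not part of any particular library.

 def kahan_sum(values):
     """Sum an iterable of floats with Kahan (compensated) summation."""
     total = 0.0
     c = 0.0                    # Running compensation for lost low-order bits.
     for x in values:
         y = x - c              # c is zero the first time around.
         t = total + y          # total is big, y is small: low-order digits of y are lost.
         c = (t - total) - y    # (t - total) recovers the high-order part of y.
         total = t              # Algebraically c is always zero; beware aggressive optimizers.
     return total

 # Example: a million tiny terms added to one large term.
 data = [1.0] + [1e-16] * 1_000_000
 print(sum(data))        # Naive left-to-right summation loses every small term: 1.0.
 print(kahan_sum(data))  # Compensated summation stays close to the exact 1.0000000001.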


Worked example

This example will be given in decimal. Computers typically use binary arithmetic, but the principle being illustrated is the same. Suppose we are using six-digit decimal floating-point arithmetic, sum has attained the value 10000.0, and the next two values of input[i] are 3.14159 and 2.71828. The exact result is 10005.85987, which rounds to 10005.9. With a plain summation, each incoming value would be aligned with sum, and many low-order digits would be lost (by truncation or rounding). The first result, after rounding, would be 10003.1. The second result would be 10005.81828 before rounding and 10005.8 after rounding. This is not correct. However, with compensated summation, we get the correctly rounded result of 10005.9.

Assume that c has the initial value zero.

 y = 3.14159 - 0.00000                 ''y = input[i] - c''
 t = 10000.0 + 3.14159 = 10003.14159   But only six digits are retained.
   = 10003.1                           Many digits have been lost!
 c = (10003.1 - 10000.0) - 3.14159     This must be evaluated as written!
   = 3.10000 - 3.14159                 The assimilated part of ''y'' recovered, vs. the original full ''y''.
   = -0.0415900                        Trailing zeros shown because this is six-digit arithmetic.
 sum = 10003.1                         Thus, few digits from ''input[i]'' met those of ''sum''.

The sum is so large that only the high-order digits of the input numbers are being accumulated. But on the next step, c gives the error.

 y = 2.71828 - (-0.0415900)            The shortfall from the previous stage gets included.
   = 2.75987                           It is of a size similar to ''y'': most digits meet.
 t = 10003.1 + 2.75987                 But few meet the digits of ''sum''.
   = 10005.85987                       And the result is rounded
   = 10005.9                           To six digits.
 c = (10005.9 - 10003.1) - 2.75987     This extracts whatever went in.
   = 2.80000 - 2.75987                 In this case, too much.
   = 0.040130                          But no matter, the excess would be subtracted off next time.
 sum = 10005.9                         Exact result is 10005.85987; this is correctly rounded to 6 digits.

So the summation is performed with two accumulators: sum holds the sum, and c accumulates the parts not assimilated into sum, to nudge the low-order part of sum the next time around. Thus the summation proceeds with "guard digits" in c, which is better than not having any, but is not as good as performing the calculations with double the precision of the input. However, simply increasing the precision of the calculations is not practical in general; if input is already in double precision, few systems supply quadruple precision, and if they did, input could then be in quadruple precision.
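The six-digit trace above can be reproduced mechanically. A small sketch using Python's decimal module, with the precision set to six significant digits as assumed in the example, shows the plain sum losing the low-order digits while the compensated sum reaches 10005.9.

 from decimal import Decimal, getcontext

 getcontext().prec = 6  # Six significant decimal digits, as in the worked example.

 values = [Decimal("10000.0"), Decimal("3.14159"), Decimal("2.71828")]

 # Plain summation: each addition is rounded to six digits.
 plain = Decimal(0)
 for x in values:
     plain = plain + x
 print(plain)   # 10005.8 -- the low-order digits were lost.

 # Compensated (Kahan) summation in the same six-digit arithmetic.
 total = Decimal(0)
 c = Decimal(0)
 for x in values:
     y = x - c
     t = total + y
     c = (t - total) - y
     total = t
 print(total)   # 10005.9 -- the correctly rounded result.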


Accuracy

A careful analysis of the errors in compensated summation is needed to appreciate its accuracy characteristics. While it is more accurate than naive summation, it can still give large relative errors for ill-conditioned sums.

Suppose that one is summing n values x_i, for i = 1, \ldots, n. The exact sum is

: S_n = \sum_{i=1}^n x_i

(computed with infinite precision). With compensated summation, one instead obtains S_n + E_n, where the error E_n is bounded by

: |E_n| \le \big[2\varepsilon + O(n\varepsilon^2)\big] \sum_{i=1}^n |x_i| ,

where \varepsilon is the machine precision of the arithmetic being employed (e.g. \varepsilon \approx 10^{-16} for IEEE standard double-precision floating point). Usually, the quantity of interest is the relative error |E_n| / |S_n|, which is therefore bounded above by

: \frac{|E_n|}{|S_n|} \le \big[2\varepsilon + O(n\varepsilon^2)\big] \frac{\sum_{i=1}^n |x_i|}{\left|\sum_{i=1}^n x_i\right|} .

In the expression for the relative error bound, the fraction \sum |x_i| / |\sum x_i| is the condition number of the summation problem. Essentially, the condition number represents the ''intrinsic'' sensitivity of the summation problem to errors, regardless of how it is computed. The relative error bound of ''every'' (backwards stable) summation method by a fixed algorithm in fixed precision (i.e. not those that use arbitrary-precision arithmetic, nor algorithms whose memory and time requirements change based on the data) is proportional to this condition number. An ''ill-conditioned'' summation problem is one in which this ratio is large, and in this case even compensated summation can have a large relative error. For example, if the summands x_i are uncorrelated random numbers with zero mean, the sum is a random walk, and the condition number will grow proportional to \sqrt{n}. On the other hand, for random inputs with nonzero mean the condition number asymptotes to a finite constant as n \to \infty. If the inputs are all non-negative, then the condition number is 1.

Given a condition number, the relative error of compensated summation is effectively independent of n. In principle, there is the O(n\varepsilon^2) term that grows linearly with n, but in practice this term is effectively zero: since the final result is rounded to a precision \varepsilon, the n\varepsilon^2 term rounds to zero unless n is roughly 1/\varepsilon or larger. In double precision, this corresponds to an n of roughly 10^{16}, much larger than most sums. So, for a fixed condition number, the errors of compensated summation are effectively O(\varepsilon), independent of n.

In comparison, the relative error bound for naive summation (simply adding the numbers in sequence, rounding at each step) grows as O(\varepsilon n) multiplied by the condition number. This worst-case error is rarely observed in practice, however, because it only occurs if the rounding errors are all in the same direction. In practice, it is much more likely that the rounding errors have a random sign, with zero mean, so that they form a random walk; in this case, naive summation has a root mean square relative error that grows as O\left(\varepsilon \sqrt{n}\right) multiplied by the condition number. (Manfred Tasche and Hansmartin Zeuner, ''Handbook of Analytic-Computational Methods in Applied Mathematics'', Boca Raton, FL: CRC Press, 2000.) This is still much worse than compensated summation, however. If the sum can be performed in twice the precision, then \varepsilon is replaced by \varepsilon^2, and naive summation has a worst-case error comparable to the O(n\varepsilon^2) term in compensated summation at the original precision.

By the same token, the \sum |x_i| that appears in the bound on E_n above is a worst-case bound that occurs only if all the rounding errors have the same sign (and are of maximal possible magnitude). In practice, it is more likely that the errors have random sign, in which case the terms in \sum |x_i| are replaced by a random walk; in that case, even for random inputs with zero mean, the error E_n grows only as O\left(\varepsilon \sqrt{n}\right) (ignoring the n\varepsilon^2 term), the same rate at which the sum S_n grows, canceling the \sqrt{n} factors when the relative error is computed. So, even for asymptotically ill-conditioned sums, the relative error for compensated summation can often be much smaller than a worst-case analysis might suggest.
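This error-growth picture is easy to probe empirically. The following Python sketch compares naive and compensated summation against an exactly rounded reference; the helper name kahan_sum, the choice of 10^6 uniform random summands, and the fixed seed are illustrative assumptions, and math.fsum from the standard library supplies the reference sum.

 import math
 import random

 def kahan_sum(values):
     total, c = 0.0, 0.0
     for x in values:
         y = x - c
         t = total + y
         c = (t - total) - y
         total = t
     return total

 random.seed(0)
 data = [random.uniform(0.0, 1.0) for _ in range(1_000_000)]

 exact = math.fsum(data)   # Exactly rounded sum, used as the reference.
 naive = sum(data)         # Left-to-right summation, rounding at each step.
 kahan = kahan_sum(data)

 print("naive relative error:", abs(naive - exact) / abs(exact))
 print("kahan relative error:", abs(kahan - exact) / abs(exact))
 # Typically the naive error is several orders of magnitude larger, while the
 # compensated error stays at the level of machine epsilon (about 1e-16).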


Further enhancements

Neumaier introduced an improved version of the Kahan algorithm, which he calls an "improved Kahan–Babuška algorithm", which also covers the case when the next term to be added is larger in absolute value than the running sum, effectively swapping the role of what is large and what is small. In pseudocode, the algorithm is:

 function KahanBabushkaNeumaierSum(input)
     var sum = 0.0
     var c = 0.0                       // A running compensation for lost low-order bits.
     for i = 1 to input.length do
         var t = sum + input[i]
         if |sum| >= |input[i]| then
             c += (sum - t) + input[i] // If ''sum'' is bigger, low-order digits of ''input[i]'' are lost.
         else
             c += (input[i] - t) + sum // Else low-order digits of ''sum'' are lost.
         endif
         sum = t
     next i
     return sum + c                    // Correction only applied once in the very end.

This enhancement is similar to the replacement of Fast2Sum by 2Sum in Kahan's algorithm rewritten with Fast2Sum. For many sequences of numbers, both algorithms agree, but a simple example due to Peters shows how they can differ: for summing [1.0, +10^{100}, 1.0, -10^{100}] in double precision, Kahan's algorithm yields 0.0, whereas Neumaier's algorithm yields the correct value 2.0.

Higher-order modifications of better accuracy are also possible. For example, Klein suggested a variant, which he called a second-order "iterative Kahan–Babuška algorithm". In pseudocode, the algorithm is:

 function KahanBabushkaKleinSum(input)
     var sum = 0.0
     var cs = 0.0
     var ccs = 0.0
     var c = 0.0
     var cc = 0.0
     for i = 1 to input.length do
         var t = sum + input[i]
         if |sum| >= |input[i]| then
             c = (sum - t) + input[i]
         else
             c = (input[i] - t) + sum
         endif
         sum = t
         t = cs + c
         if |cs| >= |c| then
             cc = (cs - t) + c
         else
             cc = (c - t) + cs
         endif
         cs = t
         ccs = ccs + cc
     end loop
     return sum + cs + ccs
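A direct Python transcription of the Neumaier variant makes the Peters example concrete; the names neumaier_sum and kahan_sum below are chosen for this sketch only.

 def neumaier_sum(values):
     """Improved Kahan–Babuška (Neumaier) summation."""
     total = 0.0
     c = 0.0                        # Running compensation for lost low-order bits.
     for x in values:
         t = total + x
         if abs(total) >= abs(x):
             c += (total - t) + x   # Low-order digits of x are lost.
         else:
             c += (x - t) + total   # Low-order digits of total are lost.
         total = t
     return total + c               # Correction applied once, at the very end.

 def kahan_sum(values):
     total, c = 0.0, 0.0
     for x in values:
         y = x - c
         t = total + y
         c = (t - total) - y
         total = t
     return total

 data = [1.0, 1e100, 1.0, -1e100]
 print(kahan_sum(data))     # 0.0 -- the compensation itself is lost against 1e100.
 print(neumaier_sum(data))  # 2.0 -- the correct result.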


Alternatives

Although Kahan's algorithm achieves O(1) error growth for summing ''n'' numbers, only slightly worse O(\log n) growth can be achieved by pairwise summation: one recursively divides the set of numbers into two halves, sums each half, and then adds the two sums. This has the advantage of requiring the same number of arithmetic operations as the naive summation (unlike Kahan's algorithm, which requires four times the arithmetic and has a latency of four times a simple summation) and can be calculated in parallel. The base case of the recursion could in principle be the sum of only one (or zero) numbers, but to amortize the overhead of recursion, one would normally use a larger base case. The equivalent of pairwise summation is used in many fast Fourier transform (FFT) algorithms and is responsible for the logarithmic growth of roundoff errors in those FFTs. In practice, with roundoff errors of random signs, the root mean square errors of pairwise summation actually grow as O\left(\sqrt{\log n}\right). A sketch of pairwise summation is given below.

Another alternative is to use arbitrary-precision arithmetic, which in principle needs no rounding at all, at a cost of much greater computational effort. A way of performing exactly rounded sums using arbitrary precision is to extend adaptively using multiple floating-point components. This will minimize computational cost in common cases where high precision is not needed. (Raymond Hettinger, Recipe 393090: Binary floating point summation accurate to full precision, a Python implementation of the algorithm from Shewchuk's 1997 article, 28 March 2005.) Another method that uses only integer arithmetic, but a large accumulator, was described by Kirchner and Kulisch; a hardware implementation was described by Müller, Rüb and Rülling.
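The pairwise scheme mentioned above can be sketched in a few lines of Python; the name pairwise_sum and the block-size cutoff of 128 are illustrative choices, not part of any particular library.

 def pairwise_sum(values, block=128):
     """Recursively split-and-sum; roundoff grows only logarithmically with len(values)."""
     n = len(values)
     if n <= block:                 # Small base case: a plain loop amortizes recursion overhead.
         total = 0.0
         for x in values:
             total += x
         return total
     mid = n // 2
     return pairwise_sum(values[:mid], block) + pairwise_sum(values[mid:], block)

 print(pairwise_sum([0.1] * 10_000))  # Close to 1000.0; the roundoff is typically much smaller than a naive loop's.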


Possible invalidation by compiler optimization

In principle, a sufficiently aggressive optimizing compiler could destroy the effectiveness of Kahan summation: for example, if the compiler simplified expressions according to the associativity rules of real arithmetic, it might "simplify" the second step in the sequence

 t = sum + y;
 c = (t - sum) - y;

to

 c = ((sum + y) - sum) - y;

and then to

 c = 0;

thus eliminating the error compensation. In practice, many compilers do not use associativity rules (which are only approximate in floating-point arithmetic) in simplifications unless explicitly directed to do so by compiler options enabling "unsafe" optimizations, although the Intel C++ Compiler is one example that allows associativity-based transformations by default. The original K&R C version of the C programming language allowed the compiler to re-order floating-point expressions according to real-arithmetic associativity rules, but the subsequent ANSI C standard prohibited re-ordering in order to make C better suited for numerical applications (and more similar to Fortran, which also prohibits re-ordering), although in practice compiler options can re-enable re-ordering, as mentioned above.

A portable way to inhibit such optimizations locally is to break one of the lines in the original formulation into two statements, and make two of the intermediate products volatile:

 function KahanSum(input)
     var sum = 0.0
     var c = 0.0
     for i = 1 to input.length do
         var y = input[i] - c
         volatile var t = sum + y
         volatile var z = t - sum
         c = z - y
         sum = t
     next i
     return sum


Support by libraries

In general, built-in "sum" functions in computer languages typically provide no guarantees that a particular summation algorithm will be employed, much less Kahan summation. The BLAS standard for linear algebra subroutines explicitly avoids mandating any particular computational order of operations for performance reasons, and BLAS implementations typically do not use Kahan summation.

The standard library of the Python language specifies an fsum function for exactly rounded summation, using the Shewchuk algorithm to track multiple partial sums.

In the Julia language, the default implementation of the sum function does pairwise summation for high accuracy with good performance, but an external library provides an implementation of Neumaier's variant, named sum_kbn, for the cases when higher accuracy is needed.

In the C# language, the HPCsharp nuget package implements the Neumaier variant and pairwise summation: both as scalar, data-parallel using SIMD processor instructions, and parallel multi-core.
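For instance, Python's built-in sum makes no accuracy guarantee, while math.fsum returns the exactly rounded result:

 import math

 data = [1.0, 1e100, 1.0, -1e100]
 print(sum(data))        # 0.0 -- naive left-to-right summation loses both 1.0 terms.
 print(math.fsum(data))  # 2.0 -- exactly rounded, using Shewchuk's algorithm.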


See also

* Algorithms for calculating variance, which includes stable summation


References


External links


Floating-point Summation, Dr. Dobb's Journal, September 1996