mathematics Mathematics is a field of study that discovers and organizes methods, Mathematical theory, theories and theorems that are developed and Mathematical proof, proved for the needs of empirical sciences and mathematics itself. There are many ar ...

and

computer programming Computer programming or coding is the composition of sequences of instructions, called computer program, programs, that computers can follow to perform tasks. It involves designing and implementing algorithms, step-by-step specifications of proc ...

, exponentiating by squaring is a general method for fast computation of large

positive integer In mathematics, the natural numbers are the numbers 0, 1, 2, 3, and so on, possibly excluding 0. Some start counting with 0, defining the natural numbers as the non-negative integers , while others start with 1, defining them as the positiv ...

powers of a

number A number is a mathematical object used to count, measure, and label. The most basic examples are the natural numbers 1, 2, 3, 4, and so forth. Numbers can be represented in language with number words. More universally, individual numbers can ...

, or more generally of an element of a

semigroup In mathematics, a semigroup is an algebraic structure consisting of a set together with an associative internal binary operation on it. The binary operation of a semigroup is most often denoted multiplicatively (just notation, not necessarily th ...

, like a

polynomial In mathematics, a polynomial is a Expression (mathematics), mathematical expression consisting of indeterminate (variable), indeterminates (also called variable (mathematics), variables) and coefficients, that involves only the operations of addit ...

or a

square matrix In mathematics, a square matrix is a Matrix (mathematics), matrix with the same number of rows and columns. An ''n''-by-''n'' matrix is known as a square matrix of order Any two square matrices of the same order can be added and multiplied. Squ ...

. Some variants are commonly referred to as square-and-multiply algorithms or binary exponentiation. These can be of quite general use, for example in

modular arithmetic In mathematics, modular arithmetic is a system of arithmetic operations for integers, other than the usual ones from elementary arithmetic, where numbers "wrap around" when reaching a certain value, called the modulus. The modern approach to mo ...

or powering of matrices. For semigroups for which additive notation is commonly used, like

elliptic curve In mathematics, an elliptic curve is a smooth, projective, algebraic curve of genus one, on which there is a specified point . An elliptic curve is defined over a field and describes points in , the Cartesian product of with itself. If the ...

s used in

cryptography Cryptography, or cryptology (from "hidden, secret"; and ''graphein'', "to write", or ''-logy, -logia'', "study", respectively), is the practice and study of techniques for secure communication in the presence of Adversary (cryptography), ...

, this method is also referred to as double-and-add.

Basic method

Recursive version

The method is based on the observation that, for any integer

n > 0

, one has:

x^n=
    \begin
                x \, ( x^)^, & \mbox n \mbox \\
                (x^)^ , & \mbox n \mbox
     \end

If the exponent is zero then the answer is 1. If the exponent is negative then we can reuse the previous formula by rewriting the value using a positive exponent. That is,

x^n = \left(\frac\right)^\,.

Together, these may be implemented directly as the following

recursive algorithm In computer science, recursion is a method of solving a computational problem where the solution depends on solutions to smaller instances of the same problem. Recursion solves such recursive problems by using functions that call themselves ...

: Inputs: a real number ''x''; an integer ''n'' Output: xⁿ function exp_by_squaring(x, n) is if ''n'' < 0 then return exp_by_squaring(1 / ''x'', −''n'') else if ''n'' = 0 then return 1 else if ''n'' is even then return exp_by_squaring(''x'' × ''x'', ''n'' / 2) else if ''n'' is odd then return ''x'' × exp_by_squaring(''x'' × ''x'', (''n'' − 1) / 2) end function In each recursive call, the least significant digits of the

binary representation A binary number is a number expressed in the base-2 numeral system or binary numeral system, a method for representing numbers that uses only two symbols for the natural numbers: typically "0" (zero) and "1" (one). A ''binary number'' may also ...

of is removed. It follows that the number of recursive calls is

\lceil \log_2 n\rceil,

the number of

bit The bit is the most basic unit of information in computing and digital communication. The name is a portmanteau of binary digit. The bit represents a logical state with one of two possible values. These values are most commonly represented as ...

s of the binary representation of . So this algorithm computes this number of squares and a lower number of multiplication, which is equal to the number of in the binary representation of . This logarithmic number of operations is to be compared with the trivial algorithm which requires multiplications. This algorithm is not tail-recursive. This implies that it requires an amount of auxiliary memory that is roughly proportional to the number of recursive calls -- or perhaps higher if the amount of data per iteration is increasing. The algorithms of the next section use a different approach, and the resulting algorithms needs the same number of operations, but use an auxiliary memory that is roughly the same as the memory required to store the result.

With constant auxiliary memory

The variants described in this section are based on the formula :

yx^n=
    \begin
                (yx) \, ( x^)^, & \mbox n \mbox \\
                y\,(x^)^ , & \mbox n \mbox.
     \end

If one applies recursively this formula, by starting with , one gets eventually an exponent equal to , and the desired result is then the left factor. This may be implemented as a tail-recursive function: Function exp_by_squaring(x, n) return exp_by_squaring2(1, x, n) Function exp_by_squaring2(y, x, n) if n < 0 then return exp_by_squaring2(y, 1 / x, -n); else if n = 0 then return y; else if n is even then return exp_by_squaring2(y, x * x, n / 2); else if n is odd then return exp_by_squaring2(x * y, x * x, (n - 1) / 2). The iterative version of the algorithm also uses a bounded auxiliary space, and is given by Function exp_by_squaring_iterative(x, n) if n < 0 then x := 1 / x; n := -n; if n = 0 then return 1 y := 1; while n > 1 do if n is odd then y := x * y; n := n - 1; x := x * x; n := n / 2; return x * y The correctness of the algorithm results from the fact that

yx^n

is invariant during the computation; it is

1\cdot x^n=x^n

at the beginning; and it is

yx^1=xy

at the end. These algorithms use exactly the same number of operations as the algorithm of the preceding section, but the multiplications are done in a different order.

Computational complexity

A brief analysis shows that such an algorithm uses

\lfloor \log_2n\rfloor

squarings and at most

\lfloor \log_2n\rfloor

multiplications, where

\lfloor\;\rfloor

denotes the

floor function In mathematics, the floor function is the function that takes as input a real number , and gives as output the greatest integer less than or equal to , denoted or . Similarly, the ceiling function maps to the least integer greater than or eq ...

. More precisely, the number of multiplications is one less than the number of ones present in the

binary expansion A binary number is a number expressed in the base-2 numeral system or binary numeral system, a method for representing numbers that uses only two symbols for the natural numbers: typically "0" ( zero) and "1" ( one). A ''binary number'' may als ...

of ''n''. For ''n'' greater than about 4 this is computationally more efficient than naively multiplying the base with itself repeatedly. Each squaring results in approximately double the number of digits of the previous, and so, if multiplication of two ''d''-digit numbers is implemented in O(''d''^''k'') operations for some fixed ''k'', then the complexity of computing ''x''^''n'' is given by :

\sum\limits_^ \big(2^i O(\log x)\big)^k = O\big((n \log x)^k\big).

2^''k''-ary method

This algorithm calculates the value of ''xⁿ'' after expanding the exponent in base 2^''k''. It was first proposed by

Brauer Brauer or Bräuer is a surname of German origin, meaning "brewer". Notable people with the name include:- * Alfred Brauer (1894–1985), German-American mathematician, brother of Richard * Andreas Brauer (born 1973), German film producer * Arik Bra ...

in 1939. In the algorithm below we make use of the following function ''f''(0) = (''k'', 0) and ''f''(''m'') = (''s'', ''u''), where ''m'' = ''u''·2^''s'' with ''u'' odd. Algorithm: ;Input: An element ''x'' of ''G'', a parameter ''k'' > 0, a non-negative integer and the precomputed values

x^3, x^5, ... , x^

. ;Output: The element ''xⁿ'' in ''G'' y := 1; i := l - 1 while i ≥ 0 do (s, u) := f(n_i) for j := 1 to k - s do y := y² y := y * x^u for j := 1 to s do y := y² i := i - 1 return y For optimal efficiency, ''k'' should be the smallest integer satisfying :

\lg n < \frac + 1.

Sliding-window method

This method is an efficient variant of the 2^''k''-ary method. For example, to calculate the exponent 398, which has binary expansion (110 001 110)₂, we take a window of length 3 using the 2^''k''-ary method algorithm and calculate 1, x³, x⁶, x¹², x²⁴, x⁴⁸, x⁴⁹, x⁹⁸, x⁹⁹, x¹⁹⁸, x¹⁹⁹, x³⁹⁸. But, we can also compute 1, x³, x⁶, x¹², x²⁴, x⁴⁸, x⁹⁶, x¹⁹², x¹⁹⁹, x³⁹⁸, which saves one multiplication and amounts to evaluating (110 001 110)₂ Here is the general algorithm: Algorithm: ;Input: An element ''x'' of ''G'', a non negative integer , a parameter ''k'' > 0 and the pre-computed values

x^3, x^5, ... ,x^

. ;Output: The element ''xⁿ'' ∈ ''G''. Algorithm: y := 1; i := l - 1 while i > -1 do if n_i = 0 then y := y² i := i - 1 else s := max while n_s = 0 do s := s + 1In this line, the loop finds the longest string of length less than or equal to ''k'' ending in a non-zero value. Not all odd powers of 2 up to

x^

need be computed, and only those specifically involved in the computation need be considered. for h := 1 to i - s + 1 do y := y² u := (n_i, n_i-1, ..., n_s)₂ y := y * x^u i := s - 1 return y

Montgomery's ladder technique

Many algorithms for exponentiation do not provide defence against

side-channel attack In computer security, a side-channel attack is a type of security exploit that leverages information inadvertently leaked by a system—such as timing, power consumption, or electromagnetic or acoustic emissions—to gain unauthorized access to ...

s. Namely, an attacker observing the sequence of squarings and multiplications can (partially) recover the exponent involved in the computation. This is a problem if the exponent should remain secret, as with many public-key cryptosystems. A technique called " Montgomery's ladder" addresses this concern. Given the

of a positive, non-zero integer ''n'' = (''n''_''k''−1...''n''₀)₂ with ''n''_k−1 = 1, we can compute ''xⁿ'' as follows: x₁ = x; x₂ = x² for i = k - 2 to 0 do if n_i = 0 then x₂ = x₁ * x₂; x₁ = x₁² else x₁ = x₁ * x₂; x₂ = x₂² return x₁ The algorithm performs a fixed sequence of operations (

up to Two Mathematical object, mathematical objects and are called "equal up to an equivalence relation " * if and are related by , that is, * if holds, that is, * if the equivalence classes of and with respect to are equal. This figure of speech ...

log ''n''): a multiplication and squaring takes place for each bit in the exponent, regardless of the bit's specific value. A similar algorithm for multiplication by doubling exists. This specific implementation of Montgomery's ladder is not yet protected against cache

timing attack In cryptography, a timing attack is a side-channel attack in which the attacker attempts to compromise a cryptosystem by analyzing the time taken to execute cryptographic algorithms. Every logical operation in a computer takes time to execute, an ...

s: memory access latencies might still be observable to an attacker, as different variables are accessed depending on the value of bits of the secret exponent. Modern cryptographic implementations use a "scatter" technique to make sure the processor always misses the faster cache.

Fixed-base exponent

There are several methods which can be employed to calculate ''xⁿ'' when the base is fixed and the exponent varies. As one can see,

precomputation In algorithms, precomputation is the act of performing an initial computation before Run time (program lifecycle phase), run time to generate a lookup table that can be used by an algorithm to avoid repeated computation each time it is executed. ...

s play a key role in these algorithms.

Yao's method

Yao's method is orthogonal to the -ary method where the exponent is expanded in radix and the computation is as performed in the algorithm above. Let , , , and be integers. Let the exponent be written as :

n = \sum_^ n_i b_i,

where

0 \leqslant n_i < h

for all

i \in, w-1 /math>.

Let .

Then the algorithm uses the equality
: x^n = \prod_^ x_i^ = \prod_^ \bigg prod_ x_i\bigg j. Given the element  of , and the exponent  written in the above form, along with the precomputed values , the element  is calculated using the algorithm below:

 y = 1, u = 1, j = h - 1
 while j > 0 do
     for i = 0 to w - 1 do
         if n

_i = j then u = u × x^b_i y = y × u j = j - 1 return y If we set and , then the values are simply the digits of in base . Yao's method collects in ''u'' first those that appear to the highest power ; in the next round those with power are collected in as well etc. The variable ''y'' is multiplied times with the initial , times with the next highest powers, and so on. The algorithm uses multiplications, and elements must be stored to compute .

Euclidean method

The Euclidean method was first introduced in ''Efficient exponentiation using precomputation and vector addition chains'' by P.D Rooij. This method for computing

x^n

in group , where is a natural integer, whose algorithm is given below, is using the following equality recursively: :

x_0^ \cdot x_1^ = \left(x_0 \cdot x_1^q\right)^ \cdot x_1^,

where

q = \left\lfloor \frac \right\rfloor

. In other words, a Euclidean division of the exponent by is used to return a quotient and a rest . Given the base element in group , and the exponent

n

written as in Yao's method, the element

x^n

is calculated using

l

precomputed values

x^, ..., x^

and then the algorithm below. Begin loop Break loop End loop; The algorithm first finds the largest value among the and then the supremum within the set of . Then it raises to the power , multiplies this value with , and then assigns the result of this computation and the value modulo .

Further applications

The approach also works with

s that are not of characteristic zero, for example allowing fast computation of large exponents modulo a number. Especially in

, it is useful to compute powers in a

ring (The) Ring(s) may refer to: * Ring (jewellery), a round band, usually made of metal, worn as ornamental jewelry * To make a sound with a bell, and the sound made by a bell Arts, entertainment, and media Film and TV * ''The Ring'' (franchise), a ...

of integers modulo . For example, the evaluation of : would take a very long time and much storage space if the naïve method of computing and then taking the

remainder In mathematics, the remainder is the amount "left over" after performing some computation. In arithmetic, the remainder is the integer "left over" after dividing one integer by another to produce an integer quotient ( integer division). In a ...

when divided by 2345 were used. Even using a more effective method will take a long time: square 13789, take the remainder when divided by 2345, multiply the

result A result (also called upshot) is the outcome or consequence of a sequence of actions or events. Possible results include gain, injury, value, and victory. Some types of results include the outcome of an action, the final value of a calculation ...

by 13789, and so on. Applying above ''exp-by-squaring'' algorithm, with "*" interpreted as (that is, a multiplication followed by a division with remainder) leads to only 27 multiplications and divisions of integers, which may all be stored in a single machine word. Generally, any of these approaches will take fewer than modular multiplications. The approach can also be used to compute integer powers in a

group A group is a number of persons or things that are located, gathered, or classed together. Groups of people * Cultural group, a group whose members share the same cultural identity * Ethnic group, a group whose members share the same ethnic iden ...

, using either of the rules :, :. The approach also works in

non-commutative In mathematics, a binary operation is commutative if changing the order of the operands does not change the result. It is a fundamental property of many binary operations, and many mathematical proofs depend on it. Perhaps most familiar as a pr ...

semigroups and is often used to compute powers of

matrices Matrix (: matrices or matrixes) or MATRIX may refer to: Science and mathematics * Matrix (mathematics), a rectangular array of numbers, symbols or expressions * Matrix (logic), part of a formula in prenex normal form * Matrix (biology), the ...

. More generally, the approach works with positive integer exponents in every

magma Magma () is the molten or semi-molten natural material from which all igneous rocks are formed. Magma (sometimes colloquially but incorrectly referred to as ''lava'') is found beneath the surface of the Earth, and evidence of magmatism has also ...

for which the binary operation is

power associative In mathematics, specifically in abstract algebra, power associativity is a property of a binary operation that is a weak form of associativity. Definition An algebra (or more generally a magma) is said to be power-associative if the subalgebra g ...

Signed-digit recoding

In certain computations it may be more efficient to allow negative coefficients and hence use the inverse of the base, provided inversion in is "fast" or has been precomputed. For example, when computing , the binary method requires multiplications and squarings. However, one could perform squarings to get and then multiply by to obtain . To this end we define the

signed-digit representation In mathematical notation for numbers, a signed-digit representation is a positional numeral system with a set of signed digits used to encode the integers. Signed-digit representation can be used to accomplish fast addition of integers becau ...

of an integer in radix as :

n = \sum_^ n_i b^i \text , n_i,  < b.

''Signed binary representation'' corresponds to the particular choice and

n_i \in \

. It is denoted by

(n_ \dots n_0)_s

. There are several methods for computing this representation. The representation is not unique. For example, take : two distinct signed-binary representations are given by

(10\bar 1 1100\bar 1 10)_s

and

(100\bar 1 1000\bar 1 0)_s

, where

\bar 1

is used to denote . Since the binary method computes a multiplication for every non-zero entry in the base-2 representation of , we are interested in finding the signed-binary representation with the smallest number of non-zero entries, that is, the one with ''minimal''

Hamming weight The Hamming weight of a string (computer science), string is the number of symbols that are different from the zero-symbol of the alphabet used. It is thus equivalent to the Hamming distance from the all-zero string of the same length. For the mo ...

. One method of doing this is to compute the representation in non-adjacent form, or NAF for short, which is one that satisfies

n_i n_ = 0 \text i \geqslant 0

and denoted by

(n_ \dots n_0)_\text

. For example, the NAF representation of 478 is

(1000\bar 1 000\bar 1 0)_\text

. This representation always has minimal Hamming weight. A simple algorithm to compute the NAF representation of a given integer

n = (n_l n_ \dots n_0)_2

with

n_l = n_ = 0

is the following: for to do Another algorithm by Koyama and Tsuruoka does not require the condition that

n_i = n_ = 0

; it still minimizes the Hamming weight.

Alternatives and generalizations

Exponentiation by squaring can be viewed as a suboptimal addition-chain exponentiation algorithm: it computes the exponent by an addition chain consisting of repeated exponent doublings (squarings) and/or incrementing exponents by ''one'' (multiplying by ''x'') only. More generally, if one allows ''any'' previously computed exponents to be summed (by multiplying those powers of ''x''), one can sometimes perform the exponentiation using fewer multiplications (but typically using more memory). The smallest power where this occurs is for ''n'' = 15: :

x^ = x \times (x \times \times x^2 2)^2

(squaring, 6 multiplies), :

x^ = x^3 \times (^3 2)^2

(optimal addition chain, 5 multiplies if ''x''³ is re-used). In general, finding the ''optimal'' addition chain for a given exponent is a hard problem, for which no efficient algorithms are known, so optimal chains are typically used for small exponents only (e.g. in

compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...

s where the chains for small powers have been pre-tabulated). However, there are a number of

heuristic A heuristic or heuristic technique (''problem solving'', '' mental shortcut'', ''rule of thumb'') is any approach to problem solving that employs a pragmatic method that is not fully optimized, perfected, or rationalized, but is nevertheless ...

algorithms that, while not being optimal, have fewer multiplications than exponentiation by squaring at the cost of additional bookkeeping work and memory usage. Regardless, the number of multiplications never grows more slowly than Θ(log ''n''), so these algorithms improve asymptotically upon exponentiation by squaring by only a constant factor at best.

Notes

References

{{Reflist Exponentials Computer arithmetic algorithms Computer arithmetic