numerical linear algebra Numerical linear algebra, sometimes called applied linear algebra, is the study of how matrix operations can be used to create computer algorithms which efficiently and accurately provide approximate answers to questions in continuous mathemati ...

, the alternating-direction implicit (ADI) method is an iterative method used to solve Sylvester matrix equations. It is a popular method for solving the large matrix equations that arise in

systems theory Systems theory is the Transdisciplinarity, transdisciplinary study of systems, i.e. cohesive groups of interrelated, interdependent components that can be natural or artificial. Every system has causal boundaries, is influenced by its context, de ...

and

control Control may refer to: Basic meanings Economics and business * Control (management), an element of management * Control, an element of management accounting * Comptroller (or controller), a senior financial officer in an organization * Controlling ...

, and can be formulated to construct solutions in a memory-efficient, factored form. It is also used to numerically solve parabolic and

elliptic In mathematics, an ellipse is a plane curve surrounding two focal points, such that for all points on the curve, the sum of the two distances to the focal points is a constant. It generalizes a circle, which is the special type of ellipse in ...

partial differential equations, and is a classic method used for modeling

heat conduction Thermal conduction is the diffusion of thermal energy (heat) within one material or between materials in contact. The higher temperature object has molecules with more kinetic energy; collisions between molecules distributes this kinetic energy u ...

and solving the

diffusion equation The diffusion equation is a parabolic partial differential equation. In physics, it describes the macroscopic behavior of many micro-particles in Brownian motion, resulting from the random movements and collisions of the particles (see Fick's l ...

in two or more dimensions.. It is an example of an

operator splitting This is a list of operator splitting topics. General * Alternating direction implicit method — finite difference method for parabolic, hyperbolic, and elliptic partial differential equations * GRADELA — simple gradient elasticity model * Matri ...

method. The method was developed at

Humble Oil Humble Oil and Refining Co. was an American oil company founded in 1911 in Humble, Texas. In 1919, a 50% interest in Humble was acquired by the Standard Oil of New Jersey which acquired the rest of the company in September 1959. The Humble bran ...

in the mid-1950s by Jim Douglas Jr, Henry Rachford, and Don Peaceman.

ADI for matrix equations

The method

The ADI method is a two step iteration process that alternately updates the column and row spaces of an approximate solution to

AX - XB = C

. One ADI iteration consists of the following steps:

1. Solve for $X^$ , where $\left( A - \beta_ I\right) X^ = X^\left( B - \beta_ I \right) + C.$

2. Solve for $X^$ , where $X^\left( B - \alpha_ I \right) = \left( A - \alpha_ I\right) X^ - C$ .

The numbers

(\alpha_, \beta_)

are called shift parameters, and convergence depends strongly on the choice of these parameters. To perform

K

iterations of ADI, an initial guess

X^

is required, as well as

K

shift parameters,

\_^

When to use ADI

A \in \mathbb^

and

B \in \mathbb^

, then

AX - XB = C

can be solved directly in

\mathcal(m^3 + n^3)

using the Bartels-Stewart method. It is therefore only beneficial to use ADI when matrix-vector multiplication and linear solves involving

A

and

B

can be applied cheaply. The equation

AX-XB=C

has a unique solution if and only if

\sigma(A) \cap \sigma(B) = \emptyset

, where

\sigma(M)

is the

spectrum A spectrum (: spectra or spectrums) is a set of related ideas, objects, or properties whose features overlap such that they blend to form a continuum. The word ''spectrum'' was first used scientifically in optics to describe the rainbow of co ...

M

. However, the ADI method performs especially well when

\sigma(A)

and

\sigma(B)

are well-separated, and

A

and

B

are

normal matrices Normal(s) or The Normal(s) may refer to: Film and television * ''Normal'' (2003 film), starring Jessica Lange and Tom Wilkinson * ''Normal'' (2007 film), starring Carrie-Anne Moss, Kevin Zegers, Callum Keith Rennie, and Andrew Airlie * ''Norma ...

. These assumptions are met, for example, by the Lyapunov equation

AX + XA^* = C

when

A

positive definite In mathematics, positive definiteness is a property of any object to which a bilinear form In mathematics, a bilinear form is a bilinear map on a vector space (the elements of which are called '' vectors'') over a field ''K'' (the elements of w ...

. Under these assumptions, near-optimal shift parameters are known for several choices of

A

and

B

. Additionally, a priori error bounds can be computed, thereby eliminating the need to monitor the residual error in implementation. The ADI method can still be applied when the above assumptions are not met. The use of suboptimal shift parameters may adversely affect convergence, and convergence is also affected by the non-normality of

A

B

(sometimes advantageously).

Krylov subspace In linear algebra, the order-''r'' Krylov subspace generated by an ''n''-by-''n'' matrix ''A'' and a vector ''b'' of dimension ''n'' is the linear subspace spanned by the images of ''b'' under the first ''r'' powers of ''A'' (starting from A^0=I) ...

methods, such as the Rational Krylov Subspace Method, are observed to typically converge more rapidly than ADI in this setting, and this has led to the development of hybrid ADI-projection methods.

Shift-parameter selection and the ADI error equation

The problem of finding good shift parameters is nontrivial. This problem can be understood by examining the ADI error equation. After

K

iterations, the error is given by

X - X^   = \prod_^K \frac \left ( X - X^ \right ) \prod_^K \frac.

Choosing

X^ = 0

results in the following bound on the relative error:

\frac \leq \,  r_K(A) \, _2 \,  r_K(B)^\, _2, \quad r_K(M) =  \prod_^K \frac.

where

\,  \cdot \, _2

is the

operator norm In mathematics, the operator norm measures the "size" of certain linear operators by assigning each a real number called its . Formally, it is a norm defined on the space of bounded linear operators between two given normed vector spaces. Inform ...

. The ideal set of shift parameters

\_^K

defines a

rational function In mathematics, a rational function is any function that can be defined by a rational fraction, which is an algebraic fraction such that both the numerator and the denominator are polynomials. The coefficients of the polynomials need not be ...

r_K

that minimizes the quantity

\,  r_K(A) \, _2 \,  r_K(B)^\, _2

. If

A

and

B

are

and have eigendecompositions

A = V_A\Lambda_AV_A^*

and

B = V_B\Lambda_BV_B^*

, then

\,  r_K(A) \, _2 \,  r_K(B)^\, _2  = \,  r_K(\Lambda_A) \, _2 \,  r_K(\Lambda_B)^\, _2

Near-optimal shift parameters

Near-optimal shift parameters are known in certain cases, such as when

\Lambda_A \subset

, b The comma is a punctuation mark that appears in several variants in different languages. Some typefaces render it as a small line, slightly curved or straight, but inclined from the vertical; others give it the appearance of a miniature fille ...

/math> and

\Lambda_B \subset

, d The comma is a punctuation mark that appears in several variants in different languages. Some typefaces render it as a small line, slightly curved or straight, but inclined from the vertical; others give it the appearance of a miniature fille ...

/math>, where

/math> and

/math> are disjoint intervals on the real line. The

Lyapunov equation The Lyapunov equation, named after the Russian mathematician Aleksandr Lyapunov, is a matrix equation used in the stability analysis of linear dynamical systems. In particular, the discrete-time Lyapunov equation (also known as Stein equation) ...

AX + XA^* = C

, for example, satisfies these assumptions when

A

. In this case, the shift parameters can be expressed in closed form using

elliptic integral In integral calculus, an elliptic integral is one of a number of related functions defined as the value of certain integrals, which were first studied by Giulio Fagnano and Leonhard Euler (). Their name originates from their originally arising i ...

s, and can easily be computed numerically. More generally, if closed, disjoint sets

E

and

F

, where

\Lambda_A \subset E

and

\Lambda_B \subset F

, are known, the optimal shift parameter selection problem is approximately solved by finding an extremal rational function that attains the value

Z_K(E, F) : = \inf_ \frac,

where the infimum is taken over all rational functions of degree

(K, K)

. This approximation problem is related to several results in

potential theory In mathematics and mathematical physics, potential theory is the study of harmonic functions. The term "potential theory" was coined in 19th-century physics when it was realized that the two fundamental forces of nature known at the time, namely g ...

, and was solved by Zolotarev in 1877 for

E

and

F=-E.

The solution is also known when

E

and

F

are disjoint disks in the complex plane.

Heuristic shift-parameter strategies

When less is known about

\sigma(A)

and

\sigma(B)

, or when

A

B

are non-normal matrices, it may not be possible to find near-optimal shift parameters. In this setting, a variety of strategies for generating good shift parameters can be used. These include strategies based on asymptotic results in potential theory, using the Ritz values of the matrices

A

A^

B

, and

B^

to formulate a greedy approach, and cyclic methods, where the same small collection of shift parameters are reused until a convergence tolerance is met. When the same shift parameter is used at every iteration, ADI is equivalent to an algorithm called Smith's method.

Factored ADI

In many applications,

A

and

B

are very large, sparse matrices, and

C

can be factored as

C = C_1C_2^*

, where

C_1 \in \mathbb^, C_2 \in \mathbb^

, with

r = 1, 2

. In such a setting, it may not be feasible to store the potentially dense matrix

X

explicitly. A variant of ADI, called factored ADI, can be used to compute

ZY^*

, where

X \approx ZY^*

. The effectiveness of factored ADI depends on whether

X

is well-approximated by a low rank matrix. This is known to be true under various assumptions about

A

and

B

ADI for parabolic equations

Historically, the ADI method was developed to solve the 2D diffusion equation on a square domain using finite differences. Unlike ADI for matrix equations, ADI for parabolic equations does not require the selection of shift parameters, since the shift appearing in each iteration is determined by parameters such as the timestep, diffusion coefficient, and grid spacing. The connection to ADI on matrix equations can be observed when one considers the action of the ADI iteration on the system at steady state.

Example: 2D diffusion equation

The traditional method for solving the heat conduction equation numerically is the

Crank–Nicolson method In numerical analysis, the Crank–Nicolson method is a finite difference method used for numerically solving the heat equation and similar partial differential equations. It is a Big O notation, second-order method in time. It is Explicit and im ...

. This method results in a very complicated set of equations in multiple dimensions, which are costly to solve. The advantage of the ADI method is that the equations that have to be solved in each step have a simpler structure and can be solved efficiently with the

tridiagonal matrix algorithm In numerical linear algebra, the tridiagonal matrix algorithm, also known as the Thomas algorithm (named after Llewellyn Thomas), is a simplified form of Gaussian elimination that can be used to solve Tridiagonal matrix, tridiagonal systems of equa ...

. Consider the linear diffusion equation in two dimensions, :

=
 \left( +

\right)
 =  ( u_ + u_ )

The implicit Crank–Nicolson method produces the following finite difference equation: :

=
\left(\delta_x^2+\delta_y^2\right)
\left(u_^+u_^n\right)

where: :

\Delta x = \Delta y

and

\delta_p^2

is the central second difference operator for the ''p''-th coordinate :

\delta_p^2 u_=u_-2u_+u_

with

e_p=(1,0)

(0,1)

for

p=x

y

respectively (and

ij

a shorthand for lattice points

(i,j)

). After performing a

stability analysis Stability may refer to: Mathematics *Stability theory, the study of the stability of solutions to differential equations and dynamical systems **Asymptotic stability **Exponential stability **Linear stability **Lyapunov stability **Marginal stabi ...

, it can be shown that this method will be stable for any

\Delta t

. A disadvantage of the Crank–Nicolson method is that the matrix in the above equation is banded with a band width that is generally quite large. This makes direct solution of the

system of linear equations In mathematics, a system of linear equations (or linear system) is a collection of two or more linear equations involving the same variable (math), variables. For example, : \begin 3x+2y-z=1\\ 2x-2y+4z=-2\\ -x+\fracy-z=0 \end is a system of th ...

quite costly (although efficient approximate solutions exist, for example use of the

conjugate gradient method In mathematics, the conjugate gradient method is an algorithm for the numerical solution of particular systems of linear equations, namely those whose matrix is positive-semidefinite. The conjugate gradient method is often implemented as an it ...

preconditioned with

incomplete Cholesky factorization In numerical analysis, an incomplete Cholesky factorization of a symmetric positive definite matrix is a sparse approximation of the Cholesky factorization. An incomplete Cholesky factorization is often used as a preconditioner for algorithms like ...

). The idea behind the ADI method is to split the finite difference equations into two, one with the ''x''-derivative taken implicitly and the next with the ''y''-derivative taken implicitly, :

=

=

The system of equations involved is

symmetric Symmetry () in everyday life refers to a sense of harmonious and beautiful proportion and balance. In mathematics, the term has a more precise definition and is usually used to refer to an object that is invariant under some transformations ...

and tridiagonal (banded with bandwidth 3), and is typically solved using

. It can be shown that this method is unconditionally stable and second order in time and space. There are more refined ADI methods such as the methods of Douglas, or the f-factor method which can be used for three or more dimensions.

Generalizations

The usage of the ADI method as an

scheme can be generalized. That is, we may consider general evolution equations :

\dot u = F_1 u + F_2 u,

where

F_1

and

F_2

are (possibly nonlinear) operators defined on a

Banach Banach (pronounced in German, in Slavic Languages, and or in English) is a Jewish surname of Ashkenazi origin believed to stem from the translation of the phrase "Son of man (Judaism), son of man", combining the Hebrew language, Hebrew word ' ...

space. In the diffusion example above we have

F_1 =

and

F_2 =

Fundamental ADI (FADI)

Simplification of ADI to FADI

It is possible to simplify the conventional ADI method into Fundamental ADI method, which only has the similar operators at the left-hand sides while being operator-free at the right-hand sides. This may be regarded as the fundamental (basic) scheme of ADI method, with no more operator (to be reduced) at the right-hand sides, unlike most traditional implicit methods that usually consist of operators at both sides of equations. The FADI method leads to simpler, more concise and efficient update equations without degrading the accuracy of conventional ADI method.

Relations to other implicit methods

Many classical implicit methods by Peaceman-Rachford, Douglas-Gunn, D'Yakonov, Beam-Warming, Crank-Nicolson, etc., may be simplified to fundamental implicit schemes with operator-free right-hand sides. In their fundamental forms, the FADI method of second-order temporal accuracy can be related closely to the fundamental locally one-dimensional (FLOD) method, which can be upgraded to second-order temporal accuracy, such as for three-dimensional Maxwell's equations in

computational electromagnetics Computational electromagnetics (CEM), computational electrodynamics or electromagnetic modeling is the process of modeling the interaction of electromagnetic fields with physical objects and the environment using computers. It typically involve ...

. For two- and three-dimensional heat conduction and diffusion equations, both FADI and FLOD methods may be implemented in simpler, more efficient and stable manner compared to their conventional methods.

References

{{DEFAULTSORT:Alternating Direction Implicit Method Partial differential equations Numerical differential equations Iterative methods

ADI for matrix equations

The method

When to use ADI

Shift-parameter selection and the ADI error equation

Near-optimal shift parameters

Heuristic shift-parameter strategies

Factored ADI

ADI for parabolic equations

Example: 2D diffusion equation

Generalizations

Fundamental ADI (FADI)

Simplification of ADI to FADI

Relations to other implicit methods

Further reading

References