Gershgorin circle theorem

In mathematics, the Gershgorin circle theorem may be used to bound the spectrum of a square matrix. It was first published by the Soviet mathematician Semyon Aronovich Gershgorin in 1931. Gershgorin's name has been transliterated in several different ways, including Geršgorin, Gerschgorin, Gershgorin, Hershhorn, and Hirschhorn.


Statement and proof

Let A be a complex n \times n matrix, with entries a_{ij}. For i \in \{1, \dots, n\} let R_i be the sum of the absolute values of the non-diagonal entries in the i-th row:

: R_i = \sum_{j \neq i} \left| a_{ij} \right|.

Let D(a_{ii}, R_i) \subseteq \Complex be a closed disc centered at a_{ii} with radius R_i. Such a disc is called a Gershgorin disc.

:Theorem. Every eigenvalue of A lies within at least one of the Gershgorin discs D(a_{ii}, R_i).

''Proof.'' Let \lambda be an eigenvalue of A with corresponding eigenvector x = (x_j). Find i such that the element of x with the largest absolute value is x_i; note that x_i \neq 0, since x is an eigenvector. Since Ax = \lambda x, in particular we take the i-th component of that equation to get:

: \sum_j a_{ij} x_j = \lambda x_i.

Taking a_{ii} to the other side:

: \sum_{j \neq i} a_{ij} x_j = (\lambda - a_{ii}) x_i.

Therefore, dividing by x_i, applying the triangle inequality, and recalling that \left| x_j \right| / \left| x_i \right| \le 1 based on how we picked i,

: \left| \lambda - a_{ii} \right| = \left| \sum_{j \neq i} \frac{a_{ij} x_j}{x_i} \right| \le \sum_{j \neq i} \left| \frac{a_{ij} x_j}{x_i} \right| = \sum_{j \neq i} \left| a_{ij} \right| \frac{\left| x_j \right|}{\left| x_i \right|} \le \sum_{j \neq i} \left| a_{ij} \right| = R_i.

:Corollary. The eigenvalues of A must also lie within the Gershgorin discs C_j corresponding to the columns of A.

''Proof.'' Apply the Theorem to A^\mathsf{T} while recognizing that the eigenvalues of the transpose are the same as those of the original matrix.

Example. For a diagonal matrix, the Gershgorin discs coincide with the spectrum. Conversely, if the Gershgorin discs coincide with the spectrum, the matrix is diagonal.
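
The theorem is easy to check numerically. The following Python sketch (using NumPy; the function name gershgorin_discs is our own illustration, not a standard API) computes the row discs of a matrix and verifies that every eigenvalue lies in at least one of them:

```python
import numpy as np

def gershgorin_discs(A):
    """Return (center, radius) pairs, one per row of A."""
    A = np.asarray(A)
    centers = np.diag(A)
    # R_i = sum of absolute values of the off-diagonal entries in row i
    radii = np.abs(A).sum(axis=1) - np.abs(centers)
    return list(zip(centers, radii))

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

discs = gershgorin_discs(A)
for lam in np.linalg.eigvals(A):
    # Every eigenvalue must lie in at least one disc D(a_ii, R_i);
    # the small tolerance guards against floating-point rounding.
    assert any(abs(lam - c) <= r + 1e-12 for c, r in discs)
print("all eigenvalues lie in the union of the Gershgorin discs")
```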


Discussion

One way to interpret this theorem is that if the off-diagonal entries of a square matrix over the complex numbers have small norms, the eigenvalues of the matrix cannot be "far from" the diagonal entries of the matrix. Therefore, by reducing the norms of off-diagonal entries one can attempt to approximate the eigenvalues of the matrix. Of course, diagonal entries may change in the process of minimizing off-diagonal entries.

The theorem does ''not'' claim that there is one disc for each eigenvalue; if anything, the discs rather correspond to the ''axes'' in \Complex^n, and each expresses a bound on precisely those eigenvalues whose eigenspaces are closest to one particular axis. In the matrix

: \begin{pmatrix} 3 & 2 & 2 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix} \begin{pmatrix} 3 & 2 & 2 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} -3a+2b+2c & 6a-2b-4c & 6a-4b-2c \\ b-a & a+(a-b) & 2(a-b) \\ c-a & 2(a-c) & a+(a-c) \end{pmatrix},

which by construction has eigenvalues a, b, and c with eigenvectors \left(\begin{smallmatrix} 3 \\ 1 \\ 1 \end{smallmatrix}\right), \left(\begin{smallmatrix} 2 \\ 1 \\ 0 \end{smallmatrix}\right), and \left(\begin{smallmatrix} 2 \\ 0 \\ 1 \end{smallmatrix}\right), it is easy to see that the disc for row 2 covers a and b while the disc for row 3 covers a and c. This is however just a happy coincidence; working through the steps of the proof one finds that in each eigenvector the first element is the largest (every eigenspace is closer to the first axis than to any other axis), so the theorem only promises that the disc for row 1 (whose radius can be twice the ''sum'' of the other two radii) covers all three eigenvalues.
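
The similarity-transform example above can be checked directly. A minimal sketch, assuming NumPy and arbitrary test values for a, b, and c (the variable names are our own):

```python
import numpy as np

S = np.array([[3, 2, 2],
              [1, 1, 0],
              [1, 0, 1]], dtype=float)
a, b, c = 0.0, 4.0, 9.0                        # arbitrary eigenvalue choices
A = S @ np.diag([a, b, c]) @ np.linalg.inv(S)  # eigenvalues a, b, c by construction

centers = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centers)
for i, (ctr, rad) in enumerate(zip(centers, radii), start=1):
    covered = [ev for ev in (a, b, c) if abs(ev - ctr) <= rad + 1e-9]
    print(f"row {i}: disc D({ctr:+.1f}, {rad:.1f}) covers {covered}")
```

With these values the printout confirms that the row 1 disc covers all three eigenvalues, as the proof promises; which eigenvalues the other two discs happen to cover depends on the chosen a, b, and c.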


Strengthening of the theorem

If one of the discs is disjoint from the others then it contains exactly one eigenvalue. If however it meets another disc, it is possible that it contains no eigenvalue (for example, A = \left(\begin{smallmatrix} 0 & 1 \\ 4 & 0 \end{smallmatrix}\right) or A = \left(\begin{smallmatrix} 1 & -2 \\ 1 & -1 \end{smallmatrix}\right)). In the general case the theorem can be strengthened as follows:

:Theorem. If the union of ''k'' discs is disjoint from the union of the other ''n'' − ''k'' discs, then the former union contains exactly ''k'' and the latter ''n'' − ''k'' eigenvalues of A, when the eigenvalues are counted with their algebraic multiplicities.

''Proof.'' Let D be the diagonal matrix with entries equal to the diagonal entries of A and let

: B(t) = (1-t) D + t A.

We will use the fact that the eigenvalues are continuous in t, and show that if any eigenvalue moves from one of the unions to the other, then it must be outside all the discs for some t, which is a contradiction.

The statement is true for D = B(0). The diagonal entries of B(t) are equal to those of A, thus the centers of the Gershgorin circles are the same; however, their radii are t times those of A. Therefore, the union of the corresponding ''k'' discs of B(t) is disjoint from the union of the remaining ''n'' − ''k'' for all t \in [0, 1]. The discs are closed, so the distance of the two unions for A is d > 0. The distance for B(t) is a decreasing function of t, so it is always at least d.

Since the eigenvalues of B(t) are continuous functions of t, for any eigenvalue \lambda(t) of B(t) in the union of the ''k'' discs its distance d(t) from the union of the other ''n'' − ''k'' discs is also continuous. Obviously d(0) \ge d. Assume \lambda(1) lies in the union of the ''n'' − ''k'' discs. Then d(1) = 0, so by the intermediate value theorem there exists 0 < t_0 < 1 such that 0 < d(t_0) < d. But every eigenvalue of B(t_0) must lie in one of the two unions, whose mutual distance is at least d, so this means \lambda(t_0) lies outside all the Gershgorin discs, which is impossible. Therefore \lambda(1) lies in the union of the ''k'' discs, and the theorem is proven.

Remarks:
* It is necessary to count the eigenvalues with their algebraic multiplicities, not by distinct values or by eigenvectors. Consider, for example, the matrix

: \begin{pmatrix} 5 & 1 & 0 & 0 & 0 \\ 0 & 5 & 1 & 0 & 0 \\ 0 & 0 & 5 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}.

The union of the first 3 discs is disjoint from the union of the last 2, yet the matrix has only 2 distinct eigenvalues (5 and 1) and only 2 linearly independent eigenvectors (e_1 and e_4). Counted with algebraic multiplicities, however, the eigenvalue 5 occurs 3 times and the eigenvalue 1 occurs 2 times, exactly as the theorem asserts; counted without multiplicities, the statement would fail.
* The continuity of \lambda(t) should be understood in the sense of topology. It is sufficient to show that the roots of a polynomial (as a point in the space \Complex^n) are a continuous function of its coefficients. Note that the inverse map, sending roots to coefficients, is described by Vieta's formulas (note that for characteristic polynomials a_n \equiv 1), which can be proved to be an open map. This proves that the roots as a whole are a continuous function of the coefficients. Since a composition of continuous functions is again continuous, \lambda(t), as a composition of the root map and B(t), is also continuous.
* An individual eigenvalue \lambda(t) can merge with other eigenvalues or appear from the splitting of a previous eigenvalue, which may cast doubt on the notion of continuity. However, when viewed in the space of eigenvalue sets (\Complex^n modulo permutations), the trajectory is still a continuous curve, although not necessarily smooth everywhere.

Added remark:
* The proof given above needs care, because there are two types of continuity concerning eigenvalues: (1) each individual eigenvalue is a usual continuous function (such a representation does exist on a real interval but may not exist on a complex domain), and (2) eigenvalues are continuous as a whole in the topological sense (a mapping from the matrix space, with the metric induced by a norm, to unordered tuples, i.e., the quotient space of \Complex^n under permutation equivalence with the induced metric). Whichever continuity is used in a proof of the Gershgorin disc theorem, it should be justified that the sum of algebraic multiplicities of the eigenvalues remains unchanged on each connected region. A proof using the argument principle of complex analysis requires no eigenvalue continuity of any kind. For a brief discussion and clarification, see Chi-Kwong Li & Fuzhen Zhang (2019), "Eigenvalue continuity and Gersgorin's theorem", Electronic Journal of Linear Algebra (ELA), DOI: https://doi.org/10.13001/ela.2019.5179.
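
The homotopy B(t) = (1-t)D + tA at the heart of the proof can also be observed numerically. A small sketch, assuming NumPy and a test matrix of our own choosing in which one disc union is disjoint from the other:

```python
import numpy as np

# Gershgorin discs: D(5, 1), D(5, 1.1) and D(1, 0.1). The union of the
# first two ([3.9, 6.1] on the real line) is disjoint from [0.9, 1.1],
# so exactly two eigenvalues stay in the first union and one in the
# second, for every t in [0, 1].
A = np.array([[5.0, 1.0, 0.0],
              [1.0, 5.0, 0.1],
              [0.0, 0.1, 1.0]])
D = np.diag(np.diag(A))

# Homotopy from D (t = 0) to A (t = 1): centers stay fixed, radii scale by t.
for t in np.linspace(0.0, 1.0, 5):
    B = (1 - t) * D + t * A
    print(f"t = {t:.2f}: eigenvalues = {np.round(np.linalg.eigvals(B), 3)}")
```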


Application

The Gershgorin circle theorem is useful in solving matrix equations of the form ''Ax'' = ''b'' for ''x'' where ''b'' is a vector and ''A'' is a matrix with a large condition number. In this kind of problem, the error in the final result is usually of the same order of magnitude as the error in the initial data multiplied by the condition number of ''A''. For instance, if ''b'' is known to six decimal places and the condition number of ''A'' is 1000, then we can only be confident that ''x'' is accurate to three decimal places. For very high condition numbers, even very small errors due to rounding can be magnified to such an extent that the result is meaningless.

It would be good to reduce the condition number of ''A''. This can be done by preconditioning: a matrix ''P'' such that ''P'' ≈ ''A''<sup>−1</sup> is constructed, and then the equation ''PAx'' = ''Pb'' is solved for ''x''. Using the ''exact'' inverse of ''A'' would be nice, but finding the inverse of a matrix is something we want to avoid because of the computational expense. Now, since ''PA'' ≈ ''I'' where ''I'' is the identity matrix, the eigenvalues of ''PA'' should all be close to 1. By the Gershgorin circle theorem, every eigenvalue of ''PA'' lies within a known area and so we can form a rough estimate of how good our choice of ''P'' was.
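
A minimal sketch of this use, assuming NumPy and a simple Jacobi (diagonal) preconditioner as a cheap stand-in for P ≈ A^{-1} (the text above does not prescribe any particular construction of P):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
# Diagonally heavy test matrix with widely spread diagonal entries.
A = np.diag(np.linspace(1.0, 1e3, n)) + 0.1 * rng.standard_normal((n, n))

P = np.diag(1.0 / np.diag(A))    # Jacobi preconditioner: invert the diagonal
PA = P @ A                       # should be close to the identity

centers = np.diag(PA)            # all exactly 1 for this choice of P
radii = np.abs(PA).sum(axis=1) - np.abs(centers)
# Small radii mean every eigenvalue of PA is pinned near 1,
# i.e. P was a good approximation to the inverse of A.
print("largest Gershgorin radius of PA:", radii.max())
print("condition number:", np.linalg.cond(A), "->", np.linalg.cond(PA))
```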


Example

Use the Gershgorin circle theorem to estimate the eigenvalues of:

: A = \begin{pmatrix} 10 & 1 & 0 & 1 \\ 0.2 & 8 & 0.2 & 0.2 \\ 1 & 1 & 2 & 1 \\ -1 & -1 & -1 & -11 \end{pmatrix}.

Starting with row one, we take the element on the diagonal, a_{ii}, as the center for the disc. We then take the remaining elements in the row and apply the formula

: \sum_{j \neq i} |a_{ij}| = R_i

to obtain the following four discs:

: D(10, 2), \; D(8, 0.6), \; D(2, 3), \; \text{and} \; D(-11, 3).

Note that we can improve the accuracy of the last two discs by applying the formula to the corresponding columns of the matrix, obtaining D(2, 1.2) and D(-11, 2.2). The eigenvalues are −10.870, 1.906, 10.046, and 7.918. Note that this is a (column) diagonally dominant matrix:

: |a_{ii}| > \sum_{j \neq i} |a_{ji}|.

This means that the diagonal entries dominate the off-diagonal entries in magnitude, which explains why the eigenvalues are so close to the centers of the circles and the estimates are very good. For a random matrix, we would expect the eigenvalues to be substantially further from the centers of the circles.
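
This calculation is easy to reproduce. A short check, assuming NumPy (the row/column bookkeeping mirrors the discs derived above):

```python
import numpy as np

A = np.array([[10, 1, 0, 1],
              [0.2, 8, 0.2, 0.2],
              [1, 1, 2, 1],
              [-1, -1, -1, -11]])

d = np.abs(np.diag(A))
row_radii = np.abs(A).sum(axis=1) - d   # [2.0, 0.6, 3.0, 3.0]
col_radii = np.abs(A).sum(axis=0) - d   # [2.2, 3.0, 1.2, 2.2]

# The spectrum lies in the union of the row discs and also in the
# union of the column discs; here the columns give tighter radii
# for the discs centered at 2 and -11.
print("row discs:   ", list(zip(np.diag(A), row_radii)))
print("column discs:", list(zip(np.diag(A), col_radii)))
print("eigenvalues: ", np.sort(np.linalg.eigvals(A)))
```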


Stronger Conclusions and Taussky’s Theorem

While the original Gershgorin circle theorem applies to all complex square matrices, stronger conclusions can be drawn when the matrix has additional structure, such as being symmetric or irreducible. For a real symmetric matrix A \in \mathbb{R}^{n \times n}, the eigenvalues are real, so the Gershgorin discs reduce to intervals on the real line. In this case:
* Every eigenvalue of A lies within at least one of its Gershgorin intervals.
* If the intervals I_1, \dots, I_n can be divided into two groups, one with ''p'' intervals and the other with ''n'' − ''p'', such that the union of each group is disjoint from the other, then the group with ''p'' intervals contains exactly ''p'' eigenvalues (counting algebraic multiplicities), and the other group contains the remaining ''n'' − ''p''. A numerical check of this counting statement is sketched after this list.

Additionally, a refinement due to Olga Taussky provides further structure for irreducible matrices:
* If an eigenvalue lies at an endpoint of a Gershgorin interval, and the matrix is irreducible, then that eigenvalue is an endpoint of every Gershgorin interval.

This result, known as Taussky's theorem, highlights how the geometry of the Gershgorin intervals tightly constrains eigenvalue locations when the matrix exhibits sufficient connectivity and structure.
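
A minimal sketch of the counting statement, assuming NumPy; the matrix below is our own example, chosen so that the interval groups are disjoint:

```python
import numpy as np

# Gershgorin intervals: [9.4, 10.6], [7.4, 8.6] and [0.8, 1.2].
# The first two form a union disjoint from the third, so exactly
# two eigenvalues lie in that union and one lies in [0.8, 1.2].
A = np.array([[10.0, 0.5, 0.1],
              [ 0.5, 8.0, 0.1],
              [ 0.1, 0.1, 1.0]])

eigs = np.linalg.eigvalsh(A)          # symmetric, so eigenvalues are real
print("eigenvalues:", np.round(eigs, 4))
print("in first union:", np.sum(eigs >= 7.4), "| in second:", np.sum(eigs < 7.4))
```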


Example illustrating Taussky’s Theorem

Consider the symmetric and irreducible tridiagonal matrix:

: A = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix}

This matrix is real, symmetric, and irreducible. Each Gershgorin disc reduces to an interval on the real line:
* Row 1: center = 1, radius = 1 → interval: [0, 2]
* Row 2: center = 2, radius = 2 → interval: [0, 4]
* Row 3: center = 1, radius = 1 → interval: [0, 2]

The union of these intervals covers [0, 4]. The eigenvalues of this matrix are:

: \lambda_1 = 0, \quad \lambda_2 = 1, \quad \lambda_3 = 3

Observe that the smallest eigenvalue, \lambda_1 = 0, lies exactly at the left endpoint of all three Gershgorin intervals: [0, 2], [0, 4], and [0, 2]. By Taussky's theorem, since the matrix is symmetric and irreducible, and one eigenvalue lies at the boundary of a Gershgorin interval, that eigenvalue must lie at the boundary of every Gershgorin interval. This condition is satisfied here. This example illustrates how eigenvalues of symmetric, irreducible matrices can lie exactly on the boundaries of all Gershgorin intervals, as constrained by Taussky's refinement of the Gershgorin circle theorem.
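
Both the intervals and the endpoint behaviour in this example can be verified numerically. A brief sketch, assuming NumPy:

```python
import numpy as np

A = np.array([[ 1, -1,  0],
              [-1,  2, -1],
              [ 0, -1,  1]], dtype=float)

centers = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centers)
intervals = [(c - r, c + r) for c, r in zip(centers, radii)]
print("intervals:", intervals)             # [(0.0, 2.0), (0.0, 4.0), (0.0, 2.0)]

eigs = np.linalg.eigvalsh(A)               # symmetric -> real eigenvalues
print("eigenvalues:", np.round(eigs, 10))  # [0.0, 1.0, 3.0]

# Taussky: the boundary eigenvalue 0 is an endpoint of *every* interval.
lam = eigs[0]
print(all(np.isclose(lam, lo) or np.isclose(lam, hi) for lo, hi in intervals))
```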


See also

* For matrices with non-negative entries, see Perron–Frobenius theorem.
* Doubly stochastic matrix
* Hurwitz-stable matrix
* Joel Lee Brenner
* Metzler matrix
* Muirhead's inequality
* Bendixson's inequality
* Schur–Horn theorem




External links

* PlanetMath: Gershgorin's circle theorem
* Eric W. Weisstein, "Gershgorin Circle Theorem". From MathWorld—A Wolfram Web Resource.
* Semyon Aranovich Gershgorin biography
