Ritz Process
   HOME

TheInfoList



OR:

The Rayleigh–Ritz method is a direct numerical method of approximating
eigenvalues In linear algebra, an eigenvector ( ) or characteristic vector is a vector that has its direction unchanged (or reversed) by a given linear transformation. More precisely, an eigenvector \mathbf v of a linear transformation T is scaled by a ...
, originated in the context of solving physical
boundary value problem In the study of differential equations, a boundary-value problem is a differential equation subjected to constraints called boundary conditions. A solution to a boundary value problem is a solution to the differential equation which also satis ...
s and named after
Lord Rayleigh John William Strutt, 3rd Baron Rayleigh ( ; 12 November 1842 – 30 June 1919), was an English physicist who received the Nobel Prize in Physics in 1904 "for his investigations of the densities of the most important gases and for his discovery ...
and
Walther Ritz Walther Heinrich Wilhelm Ritz (22 February 1878 – 7 July 1909) was a Swiss theoretical physicist. He is most famous for his work with Johannes Rydberg on the Rydberg–Ritz combination principle. Ritz is also known for the variational method n ...
. In this method, an infinite-dimensional
linear operator In mathematics, and more specifically in linear algebra, a linear map (also called a linear mapping, linear transformation, vector space homomorphism, or in some contexts linear function) is a mapping V \to W between two vector spaces that pr ...
is approximated by a finite-dimensional
compression Compression may refer to: Physical science *Compression (physics), size reduction due to forces *Compression member, a structural element such as a column *Compressibility, susceptibility to compression * Gas compression *Compression ratio, of a ...
, on which we can use an
eigenvalue algorithm In numerical analysis, one of the most important problems is designing efficient and Numerical stability, stable algorithms for finding the eigenvalues of a Matrix (mathematics), matrix. These eigenvalue algorithms may also find eigenvectors. Eig ...
. It is used in all applications that involve approximating
eigenvalues and eigenvectors In linear algebra, an eigenvector ( ) or characteristic vector is a vector that has its direction unchanged (or reversed) by a given linear transformation. More precisely, an eigenvector \mathbf v of a linear transformation T is scaled by a ...
, often under different names. In
quantum mechanics Quantum mechanics is the fundamental physical Scientific theory, theory that describes the behavior of matter and of light; its unusual characteristics typically occur at and below the scale of atoms. Reprinted, Addison-Wesley, 1989, It is ...
, where a system of particles is described using a
Hamiltonian Hamiltonian may refer to: * Hamiltonian mechanics, a function that represents the total energy of a system * Hamiltonian (quantum mechanics), an operator corresponding to the total energy of that system ** Dyall Hamiltonian, a modified Hamiltonian ...
, the Ritz method uses trial wave functions to approximate the ground state eigenfunction with the lowest energy. In the
finite element method Finite element method (FEM) is a popular method for numerically solving differential equations arising in engineering and mathematical modeling. Typical problem areas of interest include the traditional fields of structural analysis, heat tran ...
context, mathematically the same algorithm is commonly called the Ritz-Galerkin method. The Rayleigh–Ritz method or Ritz method terminology is typical in mechanical and structural engineering to approximate the eigenmodes and
resonant frequencies Resonance is a phenomenon that occurs when an object or system is subjected to an external force or vibration whose frequency matches a resonant frequency (or resonance frequency) of the system, defined as a frequency that generates a maximu ...
of a structure.


Naming and attribution

The name of the method and its origin story have been debated by historians. It has been called Ritz method after
Walther Ritz Walther Heinrich Wilhelm Ritz (22 February 1878 – 7 July 1909) was a Swiss theoretical physicist. He is most famous for his work with Johannes Rydberg on the Rydberg–Ritz combination principle. Ritz is also known for the variational method n ...
, since the numerical procedure has been published by
Walther Ritz Walther Heinrich Wilhelm Ritz (22 February 1878 – 7 July 1909) was a Swiss theoretical physicist. He is most famous for his work with Johannes Rydberg on the Rydberg–Ritz combination principle. Ritz is also known for the variational method n ...
in 1908-1909. According to A. W. Leissa,
Lord Rayleigh John William Strutt, 3rd Baron Rayleigh ( ; 12 November 1842 – 30 June 1919), was an English physicist who received the Nobel Prize in Physics in 1904 "for his investigations of the densities of the most important gases and for his discovery ...
wrote a paper congratulating Ritz on his work in 1911, but stating that he himself had used Ritz's method in many places in his book and in another publication. This statement, although later disputed, and the fact that the method in the trivial case of a single vector results in the
Rayleigh quotient In mathematics, the Rayleigh quotient () for a given complex Hermitian matrix M and nonzero vector ''x'' is defined as:R(M,x) = .For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugat ...
make the case for the name ''Rayleigh–Ritz'' method. According to S. Ilanko, citing
Richard Courant Richard Courant (January 8, 1888 – January 27, 1972) was a German-American mathematician. He is best known by the general public for the book '' What is Mathematics?'', co-written with Herbert Robbins. His research focused on the areas of real ...
, both
Lord Rayleigh John William Strutt, 3rd Baron Rayleigh ( ; 12 November 1842 – 30 June 1919), was an English physicist who received the Nobel Prize in Physics in 1904 "for his investigations of the densities of the most important gases and for his discovery ...
and
Walther Ritz Walther Heinrich Wilhelm Ritz (22 February 1878 – 7 July 1909) was a Swiss theoretical physicist. He is most famous for his work with Johannes Rydberg on the Rydberg–Ritz combination principle. Ritz is also known for the variational method n ...
independently conceived the idea of utilizing the equivalence between
boundary value problem In the study of differential equations, a boundary-value problem is a differential equation subjected to constraints called boundary conditions. A solution to a boundary value problem is a solution to the differential equation which also satis ...
s of
partial differential equations In mathematics, a partial differential equation (PDE) is an equation which involves a multivariable function and one or more of its partial derivatives. The function is often thought of as an "unknown" that solves the equation, similar to how ...
on the one hand and problems of the
calculus of variations The calculus of variations (or variational calculus) is a field of mathematical analysis that uses variations, which are small changes in Function (mathematics), functions and functional (mathematics), functionals, to find maxima and minima of f ...
on the other hand for numerical calculation of the solutions, by substituting for the variational problems simpler approximating extremum problems in which a finite number of parameters need to be determined. Ironically for the debate, the modern justification of the algorithm drops the
calculus of variations The calculus of variations (or variational calculus) is a field of mathematical analysis that uses variations, which are small changes in Function (mathematics), functions and functional (mathematics), functionals, to find maxima and minima of f ...
in favor of the simpler and more general approach of
orthogonal projection In linear algebra and functional analysis, a projection is a linear transformation P from a vector space to itself (an endomorphism) such that P\circ P=P. That is, whenever P is applied twice to any vector, it gives the same result as if it we ...
as in
Galerkin method In mathematics, in the area of numerical analysis, Galerkin methods are a family of methods for converting a continuous operator problem, such as a differential equation, commonly in a weak formulation, to a discrete problem by applying linear c ...
named after
Boris Galerkin Boris Grigoryevich Galerkin (, surname more accurately romanized as Galyorkin; –12 July 1945) was a Soviet mathematician and an engineer. Biography Early life Galerkin was born on in Polotsk, Vitebsk Governorate, Russian Empire, now part of ...
, thus leading also to the Ritz-Galerkin method naming.


Method

Let T be a
linear operator In mathematics, and more specifically in linear algebra, a linear map (also called a linear mapping, linear transformation, vector space homomorphism, or in some contexts linear function) is a mapping V \to W between two vector spaces that pr ...
on a
Hilbert space In mathematics, a Hilbert space is a real number, real or complex number, complex inner product space that is also a complete metric space with respect to the metric induced by the inner product. It generalizes the notion of Euclidean space. The ...
\mathcal, with
inner product In mathematics, an inner product space (or, rarely, a Hausdorff pre-Hilbert space) is a real vector space or a complex vector space with an operation called an inner product. The inner product of two vectors in the space is a scalar, ofte ...
(\cdot, \cdot). Now consider a finite set of functions \mathcal = \. Depending on the application these functions may be: * A subset of the
orthonormal basis In mathematics, particularly linear algebra, an orthonormal basis for an inner product space V with finite Dimension (linear algebra), dimension is a Basis (linear algebra), basis for V whose vectors are orthonormal, that is, they are all unit vec ...
of the original operator; * A space of splines (as in the
Galerkin method In mathematics, in the area of numerical analysis, Galerkin methods are a family of methods for converting a continuous operator problem, such as a differential equation, commonly in a weak formulation, to a discrete problem by applying linear c ...
); * A set of functions which approximate the
eigenfunctions In mathematics, an eigenfunction of a linear map, linear operator ''D'' defined on some function space is any non-zero function (mathematics), function f in that space that, when acted upon by ''D'', is only multiplied by some scaling factor calle ...
of the operator. One could use the orthonormal basis generated from the eigenfunctions of the operator, which will produce
diagonal In geometry, a diagonal is a line segment joining two vertices of a polygon or polyhedron, when those vertices are not on the same edge. Informally, any sloping line is called diagonal. The word ''diagonal'' derives from the ancient Greek Î ...
approximating matrices, but in this case we would have already had to calculate the spectrum. We now approximate T by T_, which is defined as the matrix with entries (T_)_ = (T \varphi_i, \varphi_j). and solve the eigenvalue problem T_u = \lambda u. It can be shown that the matrix T_ is the
compression Compression may refer to: Physical science *Compression (physics), size reduction due to forces *Compression member, a structural element such as a column *Compressibility, susceptibility to compression * Gas compression *Compression ratio, of a ...
of T to \mathcal. For
differential operators In mathematics, a differential operator is an operator defined as a function of the differentiation operator. It is helpful, as a matter of notation first, to consider differentiation as an abstract operation that accepts a function and retur ...
(such as Sturm-Liouville operators), the inner product (\cdot, \cdot) can be replaced by the
weak formulation Weak formulations are important tools for the analysis of mathematical equations that permit the transfer of concepts of linear algebra to solve problems in other fields such as partial differential equations. In a weak formulation, equations or co ...
\mathcal(\cdot, \cdot). If a subset of the orthonormal basis was used to find the matrix, the eigenvectors of T_ will be
linear combinations In mathematics, a linear combination or superposition is an expression constructed from a set of terms by multiplying each term by a constant and adding the results (e.g. a linear combination of ''x'' and ''y'' would be any expression of the form ...
of orthonormal basis functions, and as a result they will be approximations of the eigenvectors of T.


Properties


Spectral pollution

It is possible for the Rayleigh–Ritz method to produce values which do not converge to actual values in the spectrum of the operator as the truncation gets large. These values are known as spectral pollution. In some cases (such as for the
Schrödinger equation The Schrödinger equation is a partial differential equation that governs the wave function of a non-relativistic quantum-mechanical system. Its discovery was a significant landmark in the development of quantum mechanics. It is named after E ...
), there is no approximation which both includes all eigenvalues of the equation, and contains no pollution. The spectrum of the compression (and thus pollution) is bounded by the
numerical range In the mathematics, mathematical field of linear algebra and convex analysis, the numerical range or field of values of a complex number, complex n \times n square matrix, matrix ''A'' is the set :W(A) = \left\ = \left\ where \mathbf^* denotes t ...
of the operator; in many cases it is bounded by a subset of the numerical range known as the
essential numerical range Essential or essentials may refer to: Biology *Essential amino acid, one that the organism cannot produce by itself Groups and organizations * EQ Media Group, formerly Essential Media Group, a global television production company * Essential M ...
.


For matrix eigenvalue problems

In numerical linear algebra, the Rayleigh–Ritz method is commonly applied to approximate an eigenvalue problem A \mathbf = \lambda \mathbf for the matrix A \in \mathbb^ of size N using a projected matrix of a smaller size m < N, generated from a given matrix V \in \mathbb^ with
orthonormal In linear algebra, two vectors in an inner product space are orthonormal if they are orthogonal unit vectors. A unit vector means that the vector has a length of 1, which is also known as normalized. Orthogonal means that the vectors are all perpe ...
columns. The matrix version of the algorithm is the most simple: # Compute the m \times m matrix V^* A V , where V^* denotes the complex-conjugate transpose of V # Solve the eigenvalue problem V^* A V \mathbf_i = \mu_i \mathbf_i # Compute the Ritz vectors \tilde_i = V \mathbf_i and the Ritz value \tilde_i=\mu_i # Output approximations (\tilde_i,\tilde_i), called the Ritz pairs, to
eigenvalues and eigenvectors In linear algebra, an eigenvector ( ) or characteristic vector is a vector that has its direction unchanged (or reversed) by a given linear transformation. More precisely, an eigenvector \mathbf v of a linear transformation T is scaled by a ...
of the original matrix A. If the subspace with the orthonormal basis given by the columns of the matrix V \in \mathbb^ contains k \leq m vectors that are close to eigenvectors of the matrix A, the Rayleigh–Ritz method above finds k Ritz vectors that well approximate these eigenvectors. The easily computable quantity \, A \tilde_i - \tilde_i \tilde_i\, determines the accuracy of such an approximation for every Ritz pair. In the easiest case m = 1, the N \times m matrix V turns into a unit column-vector v, the m \times m matrix V^* A V is a scalar that is equal to the
Rayleigh quotient In mathematics, the Rayleigh quotient () for a given complex Hermitian matrix M and nonzero vector ''x'' is defined as:R(M,x) = .For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugat ...
\rho(v) = v^*Av/v^*v, the only i = 1 solution to the eigenvalue problem is y_i = 1 and \mu_i = \rho(v), and the only one Ritz vector is v itself. Thus, the Rayleigh–Ritz method turns into computing of the
Rayleigh quotient In mathematics, the Rayleigh quotient () for a given complex Hermitian matrix M and nonzero vector ''x'' is defined as:R(M,x) = .For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugat ...
if m = 1. Another useful connection to the
Rayleigh quotient In mathematics, the Rayleigh quotient () for a given complex Hermitian matrix M and nonzero vector ''x'' is defined as:R(M,x) = .For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugat ...
is that \mu_i = \rho(v_i) for every Ritz pair (\tilde_i, \tilde_i), allowing to derive some properties of Ritz values \mu_i from the corresponding theory for the
Rayleigh quotient In mathematics, the Rayleigh quotient () for a given complex Hermitian matrix M and nonzero vector ''x'' is defined as:R(M,x) = .For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugat ...
. For example, if A is a
Hermitian matrix In mathematics, a Hermitian matrix (or self-adjoint matrix) is a complex square matrix that is equal to its own conjugate transpose—that is, the element in the -th row and -th column is equal to the complex conjugate of the element in the ...
, its
Rayleigh quotient In mathematics, the Rayleigh quotient () for a given complex Hermitian matrix M and nonzero vector ''x'' is defined as:R(M,x) = .For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugat ...
(and thus its every Ritz value) is real and takes values within the closed interval of the smallest and largest eigenvalues of A.


Example

The matrix A = \begin 2 & 0 & 0 \\ 0 & 2 & 1 \\ 0 & 1 & 2 \end has eigenvalues 1, 2, 3 and the corresponding eigenvectors \mathbf x_ = \begin 0 \\ 1 \\ -1 \end, \quad \mathbf x_ = \begin 1 \\ 0 \\ 0 \end, \quad \mathbf x_ = \begin 0 \\ 1 \\ 1 \end. Let us take V = \begin 0 & 0 \\ 1 & 0 \\ 0 & 1 \end, then V^* A V = \begin 2 & 1 \\ 1 & 2 \end with eigenvalues 1, 3 and the corresponding eigenvectors \mathbf y_ = \begin 1 \\ -1 \end, \quad \mathbf y_ = \begin 1 \\ 1 \end, so that the Ritz values are 1, 3 and the Ritz vectors are \mathbf \tilde_ = \begin 0 \\ 1 \\ -1 \end, \quad \mathbf \tilde_ = \begin 0 \\ 1 \\ 1 \end. We observe that each one of the Ritz vectors is exactly one of the eigenvectors of A for the given V as well as the Ritz values give exactly two of the three eigenvalues of A. A mathematical explanation for the exact approximation is based on the fact that the
column space In linear algebra, the column space (also called the range or image) of a matrix ''A'' is the span (set of all possible linear combinations) of its column vectors. The column space of a matrix is the image or range of the corresponding matr ...
of the matrix V happens to be exactly the same as the subspace spanned by the two eigenvectors \mathbf x_ and \mathbf x_ in this example.


For matrix singular value problems

Truncated singular value decomposition (SVD) in numerical linear algebra can also use the Rayleigh–Ritz method to find approximations to left and right singular vectors of the matrix M \in \mathbb^ of size M \times N in given subspaces by turning the singular value problem into an eigenvalue problem.


Using the normal matrix

The definition of the singular value \sigma and the corresponding left and right singular vectors is M v = \sigma u and M^* u = \sigma v. Having found one set (left of right) of approximate singular vectors and singular values by applying naively the Rayleigh–Ritz method to the
Hermitian {{Short description, none Numerous things are named after the French mathematician Charles Hermite (1822–1901): Hermite * Cubic Hermite spline, a type of third-degree spline * Gauss–Hermite quadrature, an extension of Gaussian quadrature me ...
normal matrix M^* M \in \mathbb^ or M M^* \in \mathbb^, whichever one is smaller size, one could determine the other set of left of right singular vectors simply by dividing by the singular values, i.e., u = Mv / \sigma and v = M^* u / \sigma. However, the division is unstable or fails for small or zero singular values. An alternative approach, e.g., defining the normal matrix as A = M^* M \in \mathbb^ of size N \times N, takes advantage of the fact that for a given N \times m matrix W \in \mathbb^ with
orthonormal In linear algebra, two vectors in an inner product space are orthonormal if they are orthogonal unit vectors. A unit vector means that the vector has a length of 1, which is also known as normalized. Orthogonal means that the vectors are all perpe ...
columns the eigenvalue problem of the Rayleigh–Ritz method for the m \times m matrix W^* A W = W^* M^* M W = (M W)^* M W can be interpreted as a singular value problem for the N \times m matrix M W. This interpretation allows simple simultaneous calculation of both left and right approximate singular vectors as follows. # Compute the N \times m matrix M W . # Compute the thin, or economy-sized, SVD M W = \mathbf \Sigma \mathbf V_h, with N \times m matrix \mathbf U, m \times m diagonal matrix \Sigma, and m \times m matrix \mathbf _h. # Compute the matrices of the Ritz left U = \mathbf U and right V_h = \mathbf _h W^* singular vectors. # Output approximations U, \Sigma, V_h, called the Ritz singular triplets, to selected singular values and the corresponding left and right singular vectors of the original matrix M representing an approximate Truncated singular value decomposition (SVD) with left singular vectors restricted to the column-space of the matrix W. The algorithm can be used as a post-processing step where the matrix W is an output of an eigenvalue solver, e.g., such as
LOBPCG Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is a matrix-free method for finding the largest (or smallest) eigenvalues and the corresponding eigenvectors of a symmetric generalized eigenvalue problem :A x= \lambda B x, for a g ...
, approximating numerically selected eigenvectors of the normal matrix A = M^* M.


Example

The matrix M = \begin 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 4 \\ 0 & 0 & 0 & 0 \end has its normal matrix A = M^* M = \begin 1 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 9 & 0 \\ 0 & 0 & 0 & 16 \\ \end, singular values 1, 2, 3, 4 and the corresponding thin SVD A = \begin 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end \begin 4 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 1 \end \begin 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end, where the columns of the first multiplier from the complete set of the left singular vectors of the matrix A, the diagonal entries of the middle term are the singular values, and the columns of the last multiplier transposed (although the transposition does not change it) \begin 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end^* \quad = \quad \begin 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end are the corresponding right singular vectors. Let us take W = \begin 1 / \sqrt & 1 / \sqrt \\ 1 / \sqrt & -1 / \sqrt \\ 0 & 0 \\ 0 & 0 \end with the column-space that is spanned by the two exact right singular vectors \begin 0 & 1 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \end corresponding to the singular values 1 and 2. Following the algorithm step 1, we compute MW = \begin 1 / \sqrt & 1 / \sqrt \\ \sqrt & -\sqrt \\ 0 & 0 \\ 0 & 0 \end, and on step 2 its thin SVD M W = \mathbf \mathbf _h with \mathbf = \begin 0 & 1 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end, \quad \Sigma = \begin 2 & 0 \\ 0 & 1 \end, \quad \mathbf _h = \begin 1 / \sqrt & -1 / \sqrt \\ 1 / \sqrt & 1 / \sqrt \end. Thus we already obtain the singular values 2 and 1 from \Sigma and from \mathbf the corresponding two left singular vectors u as , 1, 0, 0, 0* and , 0, 0, 0, 0*, which span the column-space of the matrix W, explaining why the approximations are exact for the given W. Finally, step 3 computes the matrix V_h = \mathbf _h W^* \mathbf _h = \begin 1 / \sqrt & -1 / \sqrt \\ 1 / \sqrt & 1 / \sqrt \end \, \begin 1 / \sqrt & 1 / \sqrt & 0 & 0 \\ 1 / \sqrt & -1 / \sqrt & 0 & 0 \end = \begin 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end recovering from its rows the two right singular vectors v as , 1, 0, 0* and , 0, 0, 0*. We validate the first vector: M v = \sigma u \begin 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 4 \\ 0 & 0 & 0 & 0 \end \, \begin 0 \\ 1 \\ 0 \\ 0 \end = \, 2 \, \begin 0 \\ 1 \\ 0 \\ 0 \\ 0 \end and M^* u = \sigma v \begin 1 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 3 & 0 & 0 \\ 0 & 0 & 0 & 4 & 0 \end \, \begin 0 \\ 1 \\ 0 \\ 0 \\ 0 \end = \, 2 \, \begin 0 \\ 1 \\ 0 \\ 0 \end. Thus, for the given matrix W with its column-space that is spanned by two exact right singular vectors, we determine these right singular vectors, as well as the corresponding left singular vectors and the singular values, all exactly. For an arbitrary matrix W, we obtain approximate singular triplets which are optimal given W in the sense of optimality of the Rayleigh–Ritz method.


Applications and examples


In quantum physics

In quantum physics, where the spectrum of the
Hamiltonian Hamiltonian may refer to: * Hamiltonian mechanics, a function that represents the total energy of a system * Hamiltonian (quantum mechanics), an operator corresponding to the total energy of that system ** Dyall Hamiltonian, a modified Hamiltonian ...
is the set of discrete energy levels allowed by a quantum mechanical system, the Rayleigh–Ritz method is used to approximate the energy states and wavefunctions of a complicated atomic or nuclear system. In fact, for any system more complicated than a single hydrogen atom, there is no known exact solution for the spectrum of the Hamiltonian. In this case, a trial wave function, \Psi, is tested on the system. This trial function is selected to meet boundary conditions (and any other physical constraints). The exact function is not known; the trial function contains one or more adjustable parameters, which are varied to find a lowest energy configuration. It can be shown that the ground state energy, E_0, satisfies an inequality: E_0 \le \frac. That is, the ground-state energy is less than this value. The trial wave-function will always give an expectation value larger than or equal to the ground-energy. If the trial wave function is known to be
orthogonal In mathematics, orthogonality (mathematics), orthogonality is the generalization of the geometric notion of ''perpendicularity''. Although many authors use the two terms ''perpendicular'' and ''orthogonal'' interchangeably, the term ''perpendic ...
to the ground state, then it will provide a boundary for the energy of some excited state. The Ritz ansatz function is a linear combination of ''N'' known basis functions \left\lbrace\Psi_i\right\rbrace, parametrized by unknown coefficients: \Psi = \sum_^N c_i \Psi_i. With a known Hamiltonian, we can write its expected value as \varepsilon = \frac = \frac \equiv \frac. The basis functions are usually not orthogonal, so that the
overlap matrix In chemical bonds, an orbital overlap is the concentration of orbitals on adjacent atoms in the same regions of space. Orbital overlap can lead to bond formation. The general principle for orbital overlap is that, the greater the overlap between ...
''S'' has nonzero nondiagonal elements. Either \left\lbrace c_i \right\rbrace or \left\lbrace c_i^* \right\rbrace (the conjugation of the first) can be used to minimize the expectation value. For instance, by making the partial derivatives of \varepsilon over \left\lbrace c_i^* \right\rbrace zero, the following equality is obtained for every ''k'' = 1, 2, ..., ''N'': \frac = \frac = 0, which leads to a set of ''N''
secular equation In linear algebra, the characteristic polynomial of a square matrix is a polynomial which is invariant under matrix similarity and has the eigenvalues as roots. It has the determinant and the trace of the matrix among its coefficients. The ...
s: \sum_^N c_j \left( H_ - \varepsilon S_ \right) = 0 \quad \text \quad k = 1,2,\dots,N. In the above equations, energy \varepsilon and the coefficients \left\lbrace c_j \right\rbrace are unknown. With respect to ''c'', this is a homogeneous set of linear equations, which has a solution when the
determinant In mathematics, the determinant is a Scalar (mathematics), scalar-valued function (mathematics), function of the entries of a square matrix. The determinant of a matrix is commonly denoted , , or . Its value characterizes some properties of the ...
of the coefficients to these unknowns is zero: \det \left( H - \varepsilon S \right) = 0, which in turn is true only for ''N'' values of \varepsilon. Furthermore, since the Hamiltonian is a
hermitian operator In mathematics, a self-adjoint operator on a complex vector space ''V'' with inner product \langle\cdot,\cdot\rangle is a linear map ''A'' (from ''V'' to itself) that is its own adjoint. That is, \langle Ax,y \rangle = \langle x,Ay \rangle for al ...
, the ''H'' matrix is also
hermitian {{Short description, none Numerous things are named after the French mathematician Charles Hermite (1822–1901): Hermite * Cubic Hermite spline, a type of third-degree spline * Gauss–Hermite quadrature, an extension of Gaussian quadrature me ...
and the values of \varepsilon_i will be real. The lowest value among \varepsilon_i (i=1,2,..,N), \varepsilon_0, will be the best approximation to the ground state for the basis functions used. The remaining ''N-1'' energies are estimates of excited state energies. An approximation for the wave function of state ''i'' can be obtained by finding the coefficients \left\lbrace c_j \right\rbrace from the corresponding secular equation.


In mechanical engineering

The Rayleigh–Ritz method is often used in
mechanical engineering Mechanical engineering is the study of physical machines and mechanism (engineering), mechanisms that may involve force and movement. It is an engineering branch that combines engineering physics and engineering mathematics, mathematics principl ...
for finding the approximate real
resonant frequencies Resonance is a phenomenon that occurs when an object or system is subjected to an external force or vibration whose frequency matches a resonant frequency (or resonance frequency) of the system, defined as a frequency that generates a maximu ...
of multi
degree of freedom In many scientific fields, the degrees of freedom of a system is the number of parameters of the system that may vary independently. For example, a point in the plane has two degrees of freedom for translation: its two coordinates; a non-infinites ...
systems, such as spring mass systems or
flywheel A flywheel is a mechanical device that uses the conservation of angular momentum to store rotational energy, a form of kinetic energy proportional to the product of its moment of inertia and the square of its rotational speed. In particular, a ...
s on a shaft with varying
cross section Cross section may refer to: * Cross section (geometry) ** Cross-sectional views in architecture and engineering 3D *Cross section (geology) * Cross section (electronics) * Radar cross section, measure of detectability * Cross section (physics) **A ...
. It is an extension of Rayleigh's method. It can also be used for finding buckling loads and post-buckling behaviour for columns. Consider the case whereby we want to find the resonant frequency of oscillation of a system. First, write the oscillation in the form, y(x,t) = Y(x) \cos\omega t with an unknown mode shape Y(x). Next, find the total energy of the system, consisting of a kinetic energy term and a potential energy term. The kinetic energy term involves the square of the time derivative of y(x,t) and thus gains a factor of \omega ^2. Thus, we can calculate the total energy of the system and express it in the following form: E = T + V \equiv A
(x) An emoticon (, , rarely , ), short for emotion icon, is a pictorial representation of a facial expression using characters—usually punctuation marks, numbers and letters—to express a person's feelings, mood or reaction, without needin ...
\omega^2\sin^2 \omega t + B
(x) An emoticon (, , rarely , ), short for emotion icon, is a pictorial representation of a facial expression using characters—usually punctuation marks, numbers and letters—to express a person's feelings, mood or reaction, without needin ...
\cos^2 \omega t By conservation of energy, the average kinetic energy must be equal to the average potential energy. Thus, \omega^2 = \frac = R
(x) An emoticon (, , rarely , ), short for emotion icon, is a pictorial representation of a facial expression using characters—usually punctuation marks, numbers and letters—to express a person's feelings, mood or reaction, without needin ...
/math> which is also known as the
Rayleigh quotient In mathematics, the Rayleigh quotient () for a given complex Hermitian matrix M and nonzero vector ''x'' is defined as:R(M,x) = .For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugat ...
. Thus, if we knew the mode shape Y(x), we would be able to calculate A
(x) An emoticon (, , rarely , ), short for emotion icon, is a pictorial representation of a facial expression using characters—usually punctuation marks, numbers and letters—to express a person's feelings, mood or reaction, without needin ...
/math> and B
(x) An emoticon (, , rarely , ), short for emotion icon, is a pictorial representation of a facial expression using characters—usually punctuation marks, numbers and letters—to express a person's feelings, mood or reaction, without needin ...
/math>, and in turn get the eigenfrequency. However, we do not yet know the mode shape. In order to find this, we can approximate Y(x) as a combination of a few approximating functions Y_i(x) Y(x) = \sum_^N c_i Y_i(x) where c_1,c_2,\cdots,c_N are constants to be determined. In general, if we choose a random set of c_1,c_2,\cdots,c_N, it will describe a superposition of the actual eigenmodes of the system. However, if we seek c_1,c_2,\cdots,c_N such that the eigenfrequency \omega^2 is minimised, then the mode described by this set of c_1,c_2,\cdots,c_N will be close to the lowest possible actual eigenmode of the system. Thus, this finds the lowest eigenfrequency. If we find eigenmodes orthogonal to this approximated lowest eigenmode, we can approximately find the next few eigenfrequencies as well. In general, we can express A
(x) An emoticon (, , rarely , ), short for emotion icon, is a pictorial representation of a facial expression using characters—usually punctuation marks, numbers and letters—to express a person's feelings, mood or reaction, without needin ...
/math> and B
(x) An emoticon (, , rarely , ), short for emotion icon, is a pictorial representation of a facial expression using characters—usually punctuation marks, numbers and letters—to express a person's feelings, mood or reaction, without needin ...
/math> as a collection of terms quadratic in the coefficients c_i: B
(x) An emoticon (, , rarely , ), short for emotion icon, is a pictorial representation of a facial expression using characters—usually punctuation marks, numbers and letters—to express a person's feelings, mood or reaction, without needin ...
= \sum_i \sum_j c_i c_j K_ = \mathbf^\mathsf K \mathbf A
(x) An emoticon (, , rarely , ), short for emotion icon, is a pictorial representation of a facial expression using characters—usually punctuation marks, numbers and letters—to express a person's feelings, mood or reaction, without needin ...
= \sum_i \sum_j c_i c_j M_ = \mathbf^\mathsf M \mathbf where K and M are the stiffness matrix and mass matrix of a discrete system respectively. The minimization of \omega^2 becomes: \frac = \frac \frac = 0 Solving this, \mathbf^\mathsf M \mathbf\frac - \mathbf^\mathsf K \mathbf \frac = 0 K \mathbf c - \fracM\mathbf = \mathbf K \mathbf - \omega^2 M \mathbf = \mathbf For a non-trivial solution of c, we require determinant of the matrix coefficient of c to be zero. \det(K - \omega^2 M)=0 This gives a solution for the first ''N'' eigenfrequencies and eigenmodes of the system, with N being the number of approximating functions.


Simple case of double spring-mass system

The following discussion uses the simplest case, where the system has two lumped springs and two lumped masses, and only two mode shapes are assumed. Hence and . A
mode shape A normal mode of a dynamical system is a pattern of motion in which all parts of the system move sinusoidally with the same frequency and with a fixed phase relation. The free motion described by the normal modes takes place at fixed frequencies. ...
is assumed for the system, with two terms, one of which is weighted by a factor ''B'', e.g. ''Y'' =  , 1nbsp;+ ''B'' , âˆ’1
Simple harmonic motion In mechanics and physics, simple harmonic motion (sometimes abbreviated as ) is a special type of periodic motion an object experiences by means of a restoring force whose magnitude is directly proportional to the distance of the object from ...
theory says that the
velocity Velocity is a measurement of speed in a certain direction of motion. It is a fundamental concept in kinematics, the branch of classical mechanics that describes the motion of physical objects. Velocity is a vector (geometry), vector Physical q ...
at the time when deflection is zero, is the
angular frequency In physics, angular frequency (symbol ''ω''), also called angular speed and angular rate, is a scalar measure of the angle rate (the angle per unit time) or the temporal rate of change of the phase argument of a sinusoidal waveform or sine ...
\omega times the deflection (y) at time of maximum deflection. In this example the
kinetic energy In physics, the kinetic energy of an object is the form of energy that it possesses due to its motion. In classical mechanics, the kinetic energy of a non-rotating object of mass ''m'' traveling at a speed ''v'' is \fracmv^2.Resnick, Rober ...
(KE) for each mass is \frac\omega^2 Y_1^2 m_1 etc., and the
potential energy In physics, potential energy is the energy of an object or system due to the body's position relative to other objects, or the configuration of its particles. The energy is equal to the work done against any restoring forces, such as gravity ...
(PE) for each
spring Spring(s) may refer to: Common uses * Spring (season), a season of the year * Spring (device), a mechanical device that stores energy * Spring (hydrology), a natural source of water * Spring (mathematics), a geometric surface in the shape of a he ...
is \frac k_1 Y_1^2 etc. We also know that without damping, the maximal KE equals the maximal PE. Thus, \sum_^2 \left(\frac \omega^2 Y_i^2 M_i\right)=\sum_^2 \left(\frac K_i Y_i^2\right) The overall amplitude of the mode shape cancels out from each side, always. That is, the actual size of the assumed deflection does not matter, just the mode ''shape''. Mathematical manipulations then obtain an expression for \omega, in terms of B, which can be differentiated with respect to B, to find the minimum, i.e. when d\omega/dB=0. This gives the value of B for which \omega is lowest. This is an upper bound solution for \omega if \omega is hoped to be the predicted fundamental frequency of the system because the mode shape is ''assumed'', but we have found the lowest value of that upper bound, given our assumptions, because B is used to find the optimal 'mix' of the two assumed mode shape functions. There are many tricks with this method, the most important is to try and choose realistic assumed mode shapes. For example, in the case of beam deflection problems it is wise to use a deformed shape that is analytically similar to the expected solution. A quartic may fit most of the easy problems of simply linked beams even if the order of the deformed solution may be lower. The springs and masses do not have to be discrete, they can be continuous (or a mixture), and this method can be easily used in a
spreadsheet A spreadsheet is a computer application for computation, organization, analysis and storage of data in tabular form. Spreadsheets were developed as computerized analogs of paper accounting worksheets. The program operates on data entered in c ...
to find the natural frequencies of quite complex distributed systems, if you can describe the distributed KE and PE terms easily, or else break the continuous elements up into discrete parts. This method could be used iteratively, adding additional mode shapes to the previous best solution, or you can build up a long expression with many Bs and many mode shapes, and then differentiate them partially.


In dynamical systems

The
Koopman operator In mathematics, the composition operator C_\phi with symbol \phi is a linear operator defined by the rule C_\phi (f) = f \circ \phi where f \circ \phi denotes function composition. It is also encountered in composition of permutations in permutati ...
allows a finite-dimensional
nonlinear system In mathematics and science, a nonlinear system (or a non-linear system) is a system in which the change of the output is not proportional to the change of the input. Nonlinear problems are of interest to engineers, biologists, physicists, mathem ...
to be encoded as an infinite-dimensional
linear system In systems theory, a linear system is a mathematical model of a system based on the use of a linear operator. Linear systems typically exhibit features and properties that are much simpler than the nonlinear case. As a mathematical abstractio ...
. In general, both of these problems are difficult to solve, but for the latter we can use the Ritz-Galerkin method to approximate a solution.


The relationship with the finite element method

In the language of the finite element method, the matrix H_ is precisely the ''stiffness matrix'' of the Hamiltonian in the piecewise linear element space, and the matrix S_ is the ''mass matrix''. In the language of linear algebra, the value \epsilon is an eigenvalue of the discretized Hamiltonian, and the vector c is a discretized eigenvector.


See also

*
Rayleigh quotient In mathematics, the Rayleigh quotient () for a given complex Hermitian matrix M and nonzero vector ''x'' is defined as:R(M,x) = .For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugat ...
*
Arnoldi iteration In numerical linear algebra, the Arnoldi iteration is an eigenvalue algorithm and an important example of an iterative method. Arnoldi finds an approximation to the eigenvalues and eigenvectors of general (possibly non- Hermitian) matrices by c ...
*
Sturm–Liouville theory In mathematics and its applications, a Sturm–Liouville problem is a second-order linear ordinary differential equation of the form \frac \left (x) \frac\right+ q(x)y = -\lambda w(x) y for given functions p(x), q(x) and w(x), together with some ...
*
Hilbert space In mathematics, a Hilbert space is a real number, real or complex number, complex inner product space that is also a complete metric space with respect to the metric induced by the inner product. It generalizes the notion of Euclidean space. The ...
*
Galerkin method In mathematics, in the area of numerical analysis, Galerkin methods are a family of methods for converting a continuous operator problem, such as a differential equation, commonly in a weak formulation, to a discrete problem by applying linear c ...


Notes and references

* *


External links


Course on Calculus of Variations, has a section on Rayleigh–Ritz method

Ritz method
in the ''
Encyclopedia of Mathematics The ''Encyclopedia of Mathematics'' (also ''EOM'' and formerly ''Encyclopaedia of Mathematics'') is a large reference work in mathematics. Overview The 2002 version contains more than 8,000 entries covering most areas of mathematics at a graduat ...
'' * {{DEFAULTSORT:Rayleigh-Ritz Method Numerical differential equations Dynamical systems