Root-mean-square Deviation (bioinformatics)

In bioinformatics, the root mean square deviation of atomic positions, or simply root mean square deviation (RMSD), is the measure of the average distance between the atoms (usually the backbone atoms) of superimposed molecules. In the study of globular protein conformations, one customarily measures the similarity in three-dimensional structure by the RMSD of the Cα atomic coordinates after optimal rigid body superposition.

When a dynamical system fluctuates about some well-defined average position, the RMSD from the average over time can be referred to as the ''RMSF'' or root mean square fluctuation. The size of this fluctuation can be measured, for example using Mössbauer spectroscopy or nuclear magnetic resonance, and can provide important physical information. The Lindemann index is a method of placing the RMSF in the context of the parameters of the system.

A widely used way to compare the structures of biomolecules or solid bodies is to translate and rotate one structure with respect to the other to minimize the RMSD. Coutsias, ''et al.'' presented a simple derivation, based on quaternions, for the optimal solid body transformation (rotation-translation) that minimizes the RMSD between two sets of vectors. They proved that the quaternion method is equivalent to the well-known Kabsch algorithm. The solution given by Kabsch is an instance of the solution of the ''d''-dimensional problem, introduced by Hurley and Cattell. The quaternion solution to compute the optimal rotation was published in the appendix of a paper of Petitjean. This quaternion solution and the calculation of the optimal isometry in the ''d''-dimensional case were both extended to infinite sets and to the continuous case in appendix A of another paper of Petitjean.
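The quaternion approach to minimizing the RMSD can be illustrated with a short pure-Python sketch. It is not taken from any of the papers named above: the function names `centered` and `min_rmsd` are hypothetical, the 4×4 matrix follows the commonly quoted quaternion formulation, and plain power iteration is swapped in as a simple stand-in for a proper eigenvalue solver. Both point sets are moved to their centroids (which handles the translation), and the largest eigenvalue of the 4×4 matrix then gives the best achievable overlap under rotation.

```python
import math

def centered(points):
    """Translate points so that their centroid sits at the origin."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [[p[k] - c[k] for k in range(3)] for p in points]

def min_rmsd(mobile, reference, iterations=2000):
    """Minimum RMSD over all rigid rotations and translations (sketch)."""
    x, y = centered(mobile), centered(reference)
    n = len(x)
    # Correlation matrix s[a][b] = sum_i x_i[a] * y_i[b]
    s = [[sum(p[a] * q[b] for p, q in zip(x, y)) for b in range(3)]
         for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Symmetric 4x4 matrix whose largest eigenvalue is the maximal overlap
    m = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shift the spectrum to be non-negative so that power iteration
    # converges to the largest eigenvalue (illustrative solver choice).
    shift = sum(abs(v) for row in m for v in row)
    q = [1.0, 0.5, 0.25, 0.125]  # arbitrary start vector
    for _ in range(iterations):
        w = [sum(m[r][c] * q[c] for c in range(4)) + shift * q[r]
             for r in range(4)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break
        q = [v / norm for v in w]
    # Rayleigh quotient of the unshifted matrix estimates the top eigenvalue
    qq = sum(v * v for v in q)
    lam = sum(q[r] * m[r][c] * q[c] for r in range(4) for c in range(4)) / qq
    ga = sum(v * v for p in x for v in p)
    gb = sum(v * v for p in y for v in p)
    return math.sqrt(max(0.0, ga + gb - 2.0 * lam) / n)

# Made-up test data: the reference rotated by 90 degrees about z, then shifted
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
mobile = [(5.0, -2.0, 1.0), (5.0, -1.0, 1.0), (3.0, -2.0, 1.0), (5.0, -2.0, 4.0)]
print(min_rmsd(mobile, reference))  # expected to be very close to zero
```

Because the sketch only needs the largest eigenvalue, it returns the minimum RMSD without building the rotation matrix itself. Production code would more likely use a linear-algebra library's eigenvalue or SVD routine.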


The equation

: \mathrm{RMSD}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\delta_i^2}

where ''δ_i'' is the distance between atom ''i'' and either a reference structure or the mean position of the ''N'' equivalent atoms. This is often calculated for the backbone heavy atoms ''C'', ''N'', ''O'', and ''Cα'' or sometimes just the ''Cα'' atoms. Normally a rigid superposition which minimizes the RMSD is performed, and this minimum is returned.

Given two sets of ''n'' points \mathbf{v} and \mathbf{w}, the RMSD is defined as follows:

: \begin{align} \mathrm{RMSD}(\mathbf{v}, \mathbf{w}) & = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \|v_i - w_i\|^2} \\ & = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( (v_{ix} - w_{ix})^2 + (v_{iy} - w_{iy})^2 + (v_{iz} - w_{iz})^2 \right)} \end{align}

An RMSD value is expressed in length units. The most commonly used unit in structural biology is the Ångström (Å), which is equal to 10⁻¹⁰ m.
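The two-point-set definition above translates directly into a few lines of Python. This is a minimal sketch with no superposition step: the function name `rmsd` and the toy coordinates are illustrative assumptions, and the atoms are assumed to be paired by index.

```python
import math

def rmsd(v, w):
    """RMSD between two equally long lists of (x, y, z) coordinates."""
    if len(v) != len(w) or not v:
        raise ValueError("need two non-empty point sets of equal length")
    # Sum of squared coordinate differences over all paired atoms
    total = sum((a - b) ** 2 for p, q in zip(v, w) for a, b in zip(p, q))
    return math.sqrt(total / len(v))

# Made-up coordinates: every atom in w is displaced by 1 unit along z
v = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
w = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(rmsd(v, w))  # prints 1.0
```

The result carries the same length unit as the input coordinates, so coordinates in Ångström give an RMSD in Ångström.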


Uses

Typically RMSD is used as a quantitative measure of similarity between two or more protein structures. For example, the CASP protein structure prediction competition uses RMSD as one of its assessments of how well a submitted structure matches the known target structure: the lower the RMSD, the better the model matches the target. Some scientists who study protein folding by computer simulation also use RMSD as a reaction coordinate to quantify where the protein is between the folded state and the unfolded state.

RMSD is also commonly calculated for small organic molecules (called ligands when their binding to macromolecules, such as proteins, is studied), both in the context of docking and in other methods that study the configuration of ligands bound to macromolecules. Note that for ligands (contrary to proteins, as described above), the structures are most commonly not superimposed prior to the calculation of the RMSD.

RMSD is also one of several metrics that have been proposed for quantifying evolutionary similarity between proteins, as well as the quality of sequence alignments.
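The use of RMSD as a reaction coordinate in folding simulations, mentioned above, amounts to computing one RMSD value per simulation frame against a folded reference. The sketch below shows this on invented data: the two-atom `folded` structure and the three `trajectory` frames are hypothetical, and the frames are assumed to be already expressed in the reference's coordinate frame, so no superposition is performed.

```python
import math

def rmsd(v, w):
    """RMSD between two equally long lists of (x, y, z) coordinates."""
    total = sum((a - b) ** 2 for p, q in zip(v, w) for a, b in zip(p, q))
    return math.sqrt(total / len(v))

# Hypothetical folded reference and made-up trajectory frames
folded = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
trajectory = [
    [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)],  # far from the folded state
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],  # closer
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # identical to the reference
]

# One RMSD value per frame: the progress coordinate along the trajectory
profile = [rmsd(frame, folded) for frame in trajectory]
for step, value in enumerate(profile):
    print(step, round(value, 3))
```

In this toy trajectory the profile decreases toward zero as the frames approach the reference structure.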


See also

* Root mean square deviation
* Root mean square fluctuation
* Quaternion – used to optimise RMSD calculations
* Kabsch algorithm – an algorithm used to minimize the RMSD by first finding the best rotation
* GDT – a different structure comparison measure
* TM-score – a different structure comparison measure
* Longest continuous segment (LCS) – a different structure comparison measure
* Global distance calculation (GDC_sc, GDC_all) – structure comparison measures that use full-model information (not just α-carbon) to assess similarity
* Local global alignment (LGA) – protein structure alignment program and structure comparison measure



Further reading

* Shibuya T (2009). "Searching Protein 3-D Structures in Linear Time." Proc. 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2009), ''LNCS'' 5541:1–15.
* Maiorov VN, Crippen GM (1994). "Significance of root-mean-square deviation in comparing three-dimensional structures of globular proteins." ''J Mol Biol'' 235 (2): 625–634. doi:10.1006/jmbi.1994.1017. PMID 8289285. hdl:2027.42/31835. https://deepblue.lib.umich.edu/bitstream/2027.42/31835/1/0000782.pdf


External links


* Molecular Distance Measures – a tutorial on how to calculate RMSD
* RMSD – another tutorial on how to calculate RMSD with example code
* Secondary Structure Matching (SSM) – a tool for protein structure comparison. Uses RMSD.
* Different structure comparison measures. Description and services.
* SuperPose – a protein superposition server. Uses RMSD.
* Structural alignment based on secondary structure matching. By the CCP4 project. Uses RMSD.
* A Python script is available at https://github.com/charnley/rmsd
* An alternate Python script is available at https://github.com/jewettaij/superpose3d