
In the field of molecular modeling, docking is a method which predicts the preferred orientation of one molecule to a second when a ligand and a target are bound to each other to form a stable complex. Knowledge of the preferred orientation in turn may be used to predict the strength of association or binding affinity between two molecules using, for example, scoring functions. The associations between biologically relevant molecules such as proteins, peptides, nucleic acids, carbohydrates, and lipids play a central role in signal transduction. Furthermore, the relative orientation of the two interacting partners may affect the type of signal produced (e.g., agonism vs antagonism). Therefore, docking is useful for predicting both the strength and type of signal produced. Molecular docking is one of the most frequently used methods in structure-based drug design, due to its ability to predict the binding conformation of small molecule ligands to the appropriate target binding site. Characterisation of the binding behaviour plays an important role in the rational design of drugs as well as in elucidating fundamental biochemical processes.


Definition of problem

One can think of molecular docking as a “lock-and-key” problem, in which one wants to find the correct relative orientation of the “key” which will open up the “lock” (where on the surface of the lock the keyhole is, which direction to turn the key after it is inserted, etc.). Here, the protein can be thought of as the “lock” and the ligand as the “key”. Molecular docking may be defined as an optimization problem, which would describe the “best-fit” orientation of a ligand that binds to a particular protein of interest. However, since both the ligand and the protein are flexible, a “hand-in-glove” analogy is more appropriate than “lock-and-key”. During the course of the docking process, the ligand and the protein adjust their conformations to achieve an overall “best fit”, and this kind of conformational adjustment resulting in the overall binding is referred to as “induced fit”. Molecular docking research focuses on computationally simulating the molecular recognition process. It aims to achieve an optimized conformation for both the protein and the ligand, and a relative orientation between them, such that the free energy of the overall system is minimized.
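
Stated as an optimization problem, the goal described above can be written compactly. The symbols below are introduced here purely for illustration and are not taken from any particular docking program: q stands for the rigid-body placement of the ligand relative to the protein (translation and rotation), φ for the internal conformational variables of both molecules, and G for the free energy of the overall system.

```latex
(\mathbf{q}^{*}, \boldsymbol{\phi}^{*})
  = \underset{\mathbf{q},\, \boldsymbol{\phi}}{\arg\min}\; G(\mathbf{q}, \boldsymbol{\phi})
```

In practice G is not known exactly, so docking programs minimize an approximate scoring function in its place.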


Docking approaches

Two approaches are particularly popular within the molecular docking community. One approach uses a matching technique that describes the protein and the ligand as complementary surfaces. The second approach simulates the actual docking process in which the ligand-protein pairwise interaction energies are calculated. Both approaches have significant advantages as well as some limitations. These are outlined below.


Shape complementarity

Geometric matching (shape complementarity) methods describe the protein and ligand as a set of features that make them dockable. These features may include molecular surface or complementary surface descriptors. In this case, the receptor's molecular surface is described in terms of its solvent-accessible surface area and the ligand's molecular surface is described in terms of its matching surface description. The complementarity between the two surfaces amounts to the shape matching description that may help in finding the complementary pose for docking the target and the ligand molecules. Another approach is to describe the hydrophobic features of the protein using turns in the main-chain atoms. Yet another approach is to use a Fourier shape descriptor technique. Whereas shape complementarity based approaches are typically fast and robust, they usually cannot model the movements or dynamic changes in the ligand or protein conformations accurately, although recent developments allow these methods to investigate ligand flexibility. Shape complementarity methods can quickly scan through several thousand ligands in a matter of seconds and estimate whether they can bind at the protein's active site, and they are usually scalable even to protein-protein interactions. They are also much more amenable to pharmacophore based approaches, since they use geometric descriptions of the ligands to find optimal binding.
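
The idea of scoring geometric fit can be illustrated with a deliberately simplified sketch. It is not one of the surface or Fourier descriptor methods mentioned above: the molecules are reduced to occupied cells on a 2D grid, only translations are searched, and the function names, the clash penalty and the example shapes are all assumptions made for illustration.

```python
from itertools import product

def shape_score(receptor, ligand, shift, clash_penalty=10):
    """Reward ligand cells that touch the receptor; penalise overlapping cells."""
    dx, dy = shift
    moved = {(x + dx, y + dy) for x, y in ligand}
    clashes = len(moved & receptor)
    contacts = sum(
        1 for (x, y) in moved - receptor
        if {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)} & receptor
    )
    return contacts - clash_penalty * clashes

def best_placement(receptor, ligand, search_range=range(-4, 7)):
    """Exhaustively scan translations and keep the best-scoring one."""
    return max(product(search_range, repeat=2),
               key=lambda s: shape_score(receptor, ligand, s))

# Hypothetical U-shaped pocket and a 3x2 rectangular ligand.
receptor = {(x, 0) for x in range(5)} | {(0, 1), (4, 1), (0, 2), (4, 2)}
ligand = {(x, y) for x in range(3) for y in range(2)}
placement = best_placement(receptor, ligand)
```

Because the ligand is treated as a rigid shape, the scan is cheap, which reflects why such methods are fast but do not capture conformational change.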


Simulation

Simulating the docking process is much more complicated. In this approach, the protein and the ligand are separated by some physical distance, and the ligand finds its position in the protein's active site after a certain number of “moves” in its conformational space. The moves incorporate rigid body transformations such as translations and rotations, as well as internal changes to the ligand's structure including torsion angle rotations. Each of these moves in the conformation space of the ligand induces a total energetic cost of the system. Hence, the system's total energy is calculated after every move. The obvious advantage of docking simulation is that ligand flexibility is easily incorporated, whereas shape complementarity techniques must use ingenious methods to incorporate flexibility in ligands. Also, it more accurately models reality, whereas shape complementarity techniques are more of an abstraction. The drawback is that simulation is computationally expensive, since it has to explore a large energy landscape. Grid-based techniques, optimization methods, and increased computer speed have made docking simulation more practical.
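
The move-and-evaluate loop described in this section can be sketched as a Metropolis Monte Carlo search. This is a minimal illustration under stated assumptions, not the method of any particular docking program: the ligand is reduced to a point with a single torsion angle (so rigid-body rotations are omitted), `toy_energy` is a hypothetical stand-in for a real interaction energy, and the step size, temperature and site position are arbitrary.

```python
import math
import random

SITE = (1.0, -2.0, 0.5)      # hypothetical binding-site centre
BEST_TORSION = math.pi / 3   # hypothetical preferred torsion angle

def toy_energy(pose):
    """Stand-in energy: squared distance to the site plus a torsional penalty."""
    x, y, z, torsion = pose
    dist2 = (x - SITE[0]) ** 2 + (y - SITE[1]) ** 2 + (z - SITE[2]) ** 2
    return dist2 + (1.0 - math.cos(torsion - BEST_TORSION))

def propose_move(pose, rng, step=0.5):
    """Perturb one degree of freedom: a translation or the torsion rotation."""
    new = list(pose)
    i = rng.randrange(len(new))
    new[i] += rng.uniform(-step, step)
    return tuple(new)

def monte_carlo_dock(start, n_moves=5000, temperature=0.3, seed=0):
    rng = random.Random(seed)
    pose, energy = start, toy_energy(start)
    best_pose, best_energy = pose, energy
    for _ in range(n_moves):
        candidate = propose_move(pose, rng)
        cand_energy = toy_energy(candidate)   # energy is re-evaluated after every move
        delta = cand_energy - energy
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            pose, energy = candidate, cand_energy
            if energy < best_energy:
                best_pose, best_energy = pose, energy
    return best_pose, best_energy

start_pose = (10.0, 10.0, 10.0, 0.0)   # ligand starts away from the site
final_pose, final_energy = monte_carlo_dock(start_pose)
```

Occasionally accepting uphill moves is what lets such a search escape local minima on a rugged energy landscape, at the cost of many energy evaluations.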


Mechanics of docking

To perform a docking screen, the first requirement is a structure of the protein of interest. Usually the structure has been determined using a biophysical technique such as X-ray crystallography, NMR spectroscopy or cryo-electron microscopy (cryo-EM), but it can also be derived from homology modeling. This protein structure and a database of potential ligands serve as inputs to a docking program. The success of a docking program depends on two components: the search algorithm and the scoring function.


Search algorithm

The search space in theory consists of all possible orientations and conformations of the protein paired with the ligand. However, in practice with current computational resources, it is impossible to exhaustively explore the search space: this would involve enumerating all possible distortions of each molecule (molecules are dynamic and exist in an ensemble of conformational states) and all possible rotational and translational orientations of the ligand relative to the protein at a given level of granularity. Most docking programs in use account for the whole conformational space of the ligand (flexible ligand), and several attempt to model a flexible protein receptor. Each "snapshot" of the pair is referred to as a pose. A variety of conformational search strategies have been applied to the ligand and to the receptor. These include:
* systematic or stochastic torsional searches about rotatable bonds
* molecular dynamics simulations
* genetic algorithms to "evolve" new low energy conformations, where the score of each pose acts as the fitness function used to select individuals for the next iteration.
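
The last strategy in the list can be sketched as follows. This is a toy genetic algorithm under stated assumptions: a pose is represented only by a vector of ligand torsion angles, `toy_score` is a hypothetical stand-in for a docking score (lower is better), and the population size, mutation rate and `TARGET` angles are arbitrary illustrative values.

```python
import math
import random

TARGET = [0.5, -1.2, 2.0, 3.0]   # hypothetical low-energy torsions (radians)

def toy_score(torsions):
    """Stand-in docking score; lower is better and 0 is the optimum."""
    return sum(1.0 - math.cos(t - t0) for t, t0 in zip(torsions, TARGET))

def random_individual(rng):
    return [rng.uniform(-math.pi, math.pi) for _ in TARGET]

def crossover(a, b, rng):
    """Uniform crossover: each torsion is inherited from one of the parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(ind, rng, rate=0.2, sigma=0.3):
    """Perturb each torsion with probability `rate`."""
    return [t + rng.gauss(0.0, sigma) if rng.random() < rate else t for t in ind]

def genetic_search(pop_size=40, generations=60, n_keep=10, seed=1):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # The score of each pose acts as the fitness used for selection.
        population.sort(key=toy_score)
        parents = population[:n_keep]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - n_keep)
        ]
        population = parents + children   # elitism: the best poses survive unchanged
    return min(population, key=toy_score)

best = genetic_search()
```

Keeping the best poses unchanged between generations guarantees the best score never worsens, while crossover and mutation supply new conformations to evaluate.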


Ligand flexibility

Conformations of the ligand may be generated in the absence of the receptor and subsequently docked, or conformations may be generated on-the-fly in the presence of the receptor binding cavity, or with full rotational flexibility of every dihedral angle using fragment based docking. Force field energy evaluation is most often used to select energetically reasonable conformations, but knowledge-based methods have also been used. Peptides are both highly flexible and relatively large molecules, which makes modeling their flexibility a challenging task. A number of methods have been developed to allow for efficient modeling of the flexibility of peptides during protein-peptide docking.


Receptor flexibility

Computational capacity has increased dramatically over the last decade, making possible the use of more sophisticated and computationally intensive methods in computer-assisted drug design. However, dealing with receptor flexibility in docking methodologies is still a thorny issue. The main reason behind this difficulty is the large number of degrees of freedom that have to be considered in this kind of calculation. Neglecting receptor flexibility, however, may in some cases lead to poor docking results in terms of binding pose prediction. Multiple static structures experimentally determined for the same protein in different conformations are often used to emulate receptor flexibility. Alternatively, rotamer libraries of amino acid side chains that surround the binding cavity may be searched to generate alternate but energetically reasonable protein conformations.
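
The use of multiple static structures can be sketched as docking the same ligand against each receptor conformation and keeping the best result. Everything concrete here is an assumption for illustration: `ensemble_dock` and `toy_dock` are hypothetical names, a "receptor" is reduced to a pocket width and a "ligand" to a single size, and the score is an arbitrary mismatch penalty rather than a real docking score.

```python
def ensemble_dock(ligand, receptor_conformations, dock):
    """Dock against every receptor conformation; return the best (lowest) score."""
    results = [(dock(ligand, receptor), name)
               for name, receptor in receptor_conformations.items()]
    return min(results)

def toy_dock(ligand_width, pocket_width):
    """Hypothetical score: penalise mismatch between ligand and pocket size."""
    return abs(pocket_width - ligand_width) - 5.0

# Hypothetical pocket widths for three static structures of the same protein.
conformations = {"apo": 4.0, "holo_A": 6.5, "holo_B": 8.0}
best_score, best_conformation = ensemble_dock(6.0, conformations, toy_dock)
```

The cost grows linearly with the number of structures, which is far cheaper than sampling all receptor degrees of freedom but only covers the conformations that were supplied.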


Scoring function

Docking programs generate a large number of potential ligand poses, of which some can be immediately rejected due to clashes with the protein. The remainder are evaluated using some scoring function, which takes a pose as input and returns a number indicating the likelihood that the pose represents a favorable binding interaction, and which ranks one ligand relative to another. Most scoring functions are physics-based molecular mechanics force fields that estimate the energy of the pose within the binding site. The various contributions to binding can be written as an additive equation:
\Delta G_{bind} = \Delta G_{solvent} + \Delta G_{conf} + \Delta G_{int} + \Delta G_{rot} + \Delta G_{t/t} + \Delta G_{vib}
The components consist, in order, of solvent effects, conformational changes in the protein and ligand, free energy due to protein-ligand interactions, internal rotations, association energy of ligand and receptor to form a single complex, and free energy due to changes in vibrational modes. A low (negative) energy indicates a stable system and thus a likely binding interaction. Alternative approaches use modified scoring functions to include constraints based on known key protein-ligand interactions, or knowledge-based potentials derived from interactions observed in large databases of protein-ligand structures (e.g. the Protein Data Bank). There are a large number of structures from X-ray crystallography for complexes between proteins and high affinity ligands, but comparatively fewer for low affinity ligands, as the latter complexes tend to be less stable and therefore more difficult to crystallize. Scoring functions trained with this data can dock high affinity ligands correctly, but they will also give plausible docked conformations for ligands that do not bind. This gives a large number of false positive hits, i.e., ligands predicted to bind to the protein that actually do not when placed together in a test tube. One way to reduce the number of false positives is to recalculate the energy of the top scoring poses using (potentially) more accurate but computationally more intensive techniques such as Generalized Born or Poisson-Boltzmann methods.
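
The additive form of the score described in this section can be sketched as a simple sum over per-pose contributions, followed by ranking. The field names and all numerical values below are hypothetical and chosen only so the arithmetic is easy to follow; a real scoring function would compute each term from the structure of the pose.

```python
from dataclasses import dataclass

@dataclass
class PoseTerms:
    """Free-energy contributions for one pose (hypothetical values, kcal/mol)."""
    name: str
    solvent: float       # solvent effects
    conf: float          # conformational changes in protein and ligand
    interaction: float   # protein-ligand interactions
    rot: float           # internal rotations
    assoc: float         # association of ligand and receptor into one complex
    vib: float           # changes in vibrational modes

def binding_free_energy(p):
    """Additive estimate: the sum of the six contributions."""
    return p.solvent + p.conf + p.interaction + p.rot + p.assoc + p.vib

def rank_poses(poses):
    """Most negative (most favourable) estimated energy first."""
    return sorted(poses, key=binding_free_energy)

poses = [
    PoseTerms("pose_1", -2.0, 1.5, -9.0, 1.0, 2.5, 0.5),
    PoseTerms("pose_2", -1.0, 0.5, -6.0, 0.5, 2.5, 0.5),
    PoseTerms("pose_3", -3.0, 2.0, -7.5, 1.5, 2.5, 0.5),
]
ranked = rank_poses(poses)
```

A rescoring step of the kind mentioned above would take only the first few entries of `ranked` and re-evaluate them with a more expensive method.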


Docking assessment

The interdependence between sampling and scoring function affects the docking capability in predicting plausible poses or binding affinities for novel compounds. Thus, an assessment of a docking protocol is generally required (when experimental data is available) to determine its predictive capability. Docking assessment can be performed using different strategies, such as:
* docking accuracy (DA) calculation;
* the correlation between a docking score and the experimental response, or determination of the enrichment factor (EF);
* the distance between an ion-binding moiety and the ion in the active site;
* the presence of induced-fit models.


Docking accuracy

Docking accuracy is one measure of the fitness of a docking program: it quantifies the ability to predict the correct pose of a ligand relative to the experimentally observed one.
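
One common way to compare a predicted pose with the experimental one is the root-mean-square deviation (RMSD) of atom positions. The sketch below assumes that measure, although the text above does not prescribe it; the 2.0 angstrom cutoff is a frequently used but not universal threshold, the coordinates are invented, and no correction for molecular symmetry is attempted.

```python
import math

def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two identically ordered atom lists."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and contain the same atoms")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(pose_a, pose_b))
    return math.sqrt(total / len(pose_a))

def docking_accuracy(predicted, reference, cutoff=2.0):
    """Fraction of ligands whose predicted pose lies within `cutoff` of the reference."""
    hits = sum(rmsd(p, r) <= cutoff for p, r in zip(predicted, reference))
    return hits / len(reference)

# Hypothetical three-atom ligand: one near-native and one displaced prediction.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
good = [(0.3, 0.0, 0.0), (1.5, 0.4, 0.0), (1.5, 1.5, 0.0)]
bad = [(4.0, 0.0, 0.0), (5.5, 0.0, 0.0), (5.5, 1.5, 0.0)]
accuracy = docking_accuracy([good, bad], [crystal, crystal])
```

The poses are compared in the receptor's coordinate frame without superposition, since a correctly shaped ligand placed in the wrong location should still count as a failure.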


Enrichment factor

Docking screens can also be evaluated by the enrichment of known binders from among a large database of presumed non-binding “decoy” molecules. In this way, the success of a docking screen is evaluated by its capacity to enrich the small number of known active compounds in the top ranks of a screen from among a much greater number of decoy molecules in the database. The area under the receiver operating characteristic (ROC) curve is widely used to evaluate the performance of a screen.
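
Both measures can be computed directly from a ranked list of scores. The sketch below uses one common definition of the enrichment factor (the hit rate in the top fraction divided by the hit rate in the whole database) and computes the ROC area by pairwise comparison of actives and decoys; the scores and labels are invented, and lower scores are assumed to be better.

```python
def enrichment_factor(scores, is_active, fraction=0.1):
    """EF = (actives in top fraction / size of top fraction) / (actives / total)."""
    ranked = sorted(zip(scores, is_active))          # lower score ranks first
    n_top = max(1, int(round(fraction * len(ranked))))
    actives_top = sum(active for _, active in ranked[:n_top])
    actives_all = sum(is_active)
    return (actives_top / n_top) / (actives_all / len(ranked))

def roc_auc(scores, is_active):
    """Probability that a random active outranks a random decoy (ties count half)."""
    actives = [s for s, a in zip(scores, is_active) if a]
    decoys = [s for s, a in zip(scores, is_active) if not a]
    wins = sum((a < d) + 0.5 * (a == d) for a in actives for d in decoys)
    return wins / (len(actives) * len(decoys))

# Hypothetical screen: 2 known binders among 10 molecules.
scores = [-9.1, -8.7, -8.2, -7.9, -7.5, -7.0, -6.8, -6.1, -5.5, -5.0]
labels = [True, False, True, False, False, False, False, False, False, False]
ef_20 = enrichment_factor(scores, labels, fraction=0.2)
auc = roc_auc(scores, labels)
```

An enrichment factor of 1 and an area of 0.5 both correspond to a ranking no better than random selection.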


Prospective

Resulting hits from docking screens are subjected to pharmacological validation (e.g. IC50, affinity or potency measurements). Only prospective studies constitute conclusive proof of the suitability of a technique for a particular target. In the case of G protein-coupled receptors (GPCRs), which are targets of more than 30% of marketed drugs, molecular docking led to the discovery of more than 500 GPCR ligands.


Benchmarking

The potential of docking programs to reproduce binding modes as determined by X-ray crystallography can be assessed by a range of docking benchmark sets. For small molecules, several benchmark data sets for docking and virtual screening exist, e.g. the Astex Diverse Set, consisting of high quality protein-ligand X-ray crystal structures, or the Directory of Useful Decoys (DUD) for evaluation of virtual screening performance. The potential of docking programs to reproduce peptide binding modes can be assessed with Lessons for Efficiency Assessment of Docking and Scoring (LEADS-PEP).


Applications

A binding interaction between a small molecule ligand and an enzyme protein may result in activation or inhibition of the enzyme. If the protein is a receptor, ligand binding may result in agonism or antagonism. Docking is most commonly used in the field of drug design. Most drugs are small organic molecules, and docking may be applied to:
* hit identification – docking combined with a scoring function can be used to quickly screen large databases of potential drugs in silico to identify molecules that are likely to bind to the protein target of interest (see virtual screening). Reverse pharmacology routinely uses docking for target identification.
* lead optimization – docking can be used to predict where and in which relative orientation a ligand binds to a protein (also referred to as the binding mode or pose). This information may in turn be used to design more potent and selective analogs.
* bioremediation – protein-ligand docking can also be used to predict pollutants that can be degraded by enzymes.


See also

* Drug design
* Katchalski-Katzir algorithm
* List of molecular graphics systems
* Macromolecular docking
* Molecular mechanics
* Protein structure
* Protein design
* Software for molecular mechanics modeling
* List of protein-ligand docking software
* Molecular design software
* Docking@Home
* Ibercivis
* ZINC database
* Lead Finder
* Virtual screening
* Scoring functions for docking


External links

* Malinauskas T. "Step by step installation of MGLTools 1.5.2 (AutoDockTools, Python Molecular Viewer and Visual Programming Environment) on Ubuntu Linux 8.04". http://users.ox.ac.uk/~jesu1458/installation_of_autodock_on_ubuntu_linux/ (dead link; archived 2009-02-26 at https://web.archive.org/web/20090226231523/http://users.ox.ac.uk/~jesu1458/installation_of_autodock_on_ubuntu_linux/). Retrieved 2008-07-15.
* Docking@GRID – Project of Conformational Sampling and Docking on Grids: one aim is to deploy some intrinsic distributed docking algorithms on computational Grids; download Docking@GRID open-source Linux version
* Click2Drug.org – Directory of computational drug design tools.
