The ConsensusPathDB is a molecular functional interaction database that integrates information on protein interactions, genetic interactions, signaling, metabolism, gene regulation, and drug-target interactions in humans. ConsensusPathDB currently (release 30) includes such interactions from 32 databases.
ConsensusPathDB is freely available for academic use at http://ConsensusPathDB.org.
Integrated Databases
* Reactome (metabolic and signaling pathways)
* KEGG (only the metabolic pathways have been integrated in ConsensusPathDB)
* HumanCyc (metabolic pathways)
* PID - Pathway Interaction Database (signaling pathways)
* BioCarta (signaling pathways)
* Netpath (signaling pathways)
* IntAct (protein interactions)
* DIP (protein interactions)
* MINT (protein interactions)
* HPRD (protein interactions)
* BioGRID (protein interactions)
* SPIKE (protein interactions, signaling reactions)
* WikiPathways (metabolic and signaling pathways)
* and many more.
Functionalities
The ConsensusPathDB is accessible via a web interface providing a variety of functions.
Search and visualization
Using the web interface, users can search for physical entities (e.g. proteins, metabolites, etc.) or pathways using common names or accession numbers (e.g. UniProt identifiers). Selected interactions can be visualized in an interactive environment as expandable networks. ConsensusPathDB currently allows users to export their models in BioPAX format or as images in several formats.
Shortest path
Users can search for shortest paths of functional interactions between physical entities, based on all interactions in the database. The pathway search can be constrained by forbidding passing through certain physical entities.
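The constrained search described above can be illustrated with a breadth-first search that skips forbidden nodes. This is only a minimal sketch on an unweighted toy network. The function name `shortest_path`, the adjacency-dictionary representation, and the entity names are assumptions made for illustration and do not reflect how ConsensusPathDB implements the search.

```python
from collections import deque


def shortest_path(graph, source, target, forbidden=frozenset()):
    """Return one shortest path from source to target as a list of nodes.

    graph is an adjacency dictionary (node -> list of neighbours).
    Nodes in `forbidden` are never traversed. Returns None if no path exists.
    """
    forbidden = set(forbidden)
    if source in forbidden or target in forbidden:
        return None
    parents = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents and neighbour not in forbidden:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical toy network; the entity names are placeholders.
toy_network = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}

print(shortest_path(toy_network, "A", "D"))
print(shortest_path(toy_network, "A", "D", forbidden={"B"}))
```

Forbidding an entity simply removes it from the set of nodes the search may expand, so the result is the shortest path in the remaining network, or no path at all.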
Data upload
Users can upload their own interaction networks in BioPAX, PSI-MI or SBML files in order to validate and/or extend those networks in the context of the interactions in ConsensusPathDB.
Over-representation analysis
Using the web interface of the database, one can perform over-representation analysis based on biochemical pathways or on neighbourhood-based entity sets (NESTs). NESTs are sub-networks of the overall interaction network that contain all physical entities around a central one within a "radius" (number of interactions from the center). For each predefined set (pathway / NEST), a P-value is computed based on the hypergeometric distribution. It reflects the significance of the observed overlap between the user-specified input gene list and the members of the predefined set.
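A neighbourhood-based entity set as described above can be sketched as a breadth-first traversal that stops at a fixed distance from the central entity. The function name `neighbourhood_set`, the adjacency-dictionary representation, the toy entity names, and the choice to include the center itself in the set are all assumptions made for illustration.

```python
from collections import deque


def neighbourhood_set(graph, center, radius):
    """Return all nodes within `radius` interactions of `center`.

    graph is an adjacency dictionary (node -> list of neighbours).
    The center itself is included in the returned set (an assumption).
    """
    distance = {center: 0}
    queue = deque([center])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue  # do not expand beyond the radius
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return set(distance)


# Hypothetical toy network: a chain P1-P2-P3-P4 with a metabolite M1 on P1.
toy_network = {
    "P1": ["P2", "M1"],
    "P2": ["P1", "P3"],
    "P3": ["P2", "P4"],
    "P4": ["P3"],
    "M1": ["P1"],
}

print(sorted(neighbourhood_set(toy_network, "P2", 1)))
print(sorted(neighbourhood_set(toy_network, "P2", 2)))
```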
Over-representation analyses can be performed with user-specified genes or metabolites.
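A hypergeometric P-value for the overlap between an input list and a predefined set can be sketched as follows. The sketch assumes a one-sided upper-tail test, which is a common choice for over-representation analysis. The source does not specify the exact test variant or the background used by ConsensusPathDB, so the function name, the parameters, and the numbers in the example are hypothetical.

```python
from math import comb


def overrepresentation_p_value(background_size, set_size, input_size, overlap):
    """Probability of observing at least `overlap` members of a predefined set
    when drawing `input_size` entities without replacement from a background
    of `background_size` entities, of which `set_size` belong to the set.
    """
    upper = min(set_size, input_size)  # largest possible overlap
    total = comb(background_size, input_size)
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, input_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: background of 10 genes, a pathway with 4 members,
# an input list of 3 genes, 2 of which are in the pathway.
p = overrepresentation_p_value(background_size=10, set_size=4, input_size=3, overlap=2)
print(p)
```

A small P-value indicates that an overlap of this size or larger would rarely arise by chance under the assumed background. In practice the result depends strongly on how the background is defined, and testing many sets usually calls for a multiple-testing correction.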
External links
* Official website: http://consensuspathdb.org