
The Gene Ontology (GO) is a major bioinformatics initiative to unify the representation of gene and gene product attributes across all species. More specifically, the project aims to: 1) maintain and develop its controlled vocabulary of gene and gene product attributes; 2) annotate genes and gene products, and assimilate and disseminate annotation data; and 3) provide tools for easy access to all aspects of the data provided by the project, and to enable functional interpretation of experimental data using the GO, for example via enrichment analysis. GO is part of a larger classification effort, the Open Biomedical Ontologies, and is one of the Initial Candidate Members of the OBO Foundry. Whereas gene nomenclature focuses on naming genes and gene products, the Gene Ontology focuses on the function of genes and gene products. The GO also extends that effort by using a markup language to make the data (not only of the genes and their products but also of curated attributes) machine readable, and to do so in a way that is unified across all species (whereas gene nomenclature conventions vary by biological taxon).
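
Enrichment analysis, mentioned above, is commonly framed as a one-sided hypergeometric test: given a study set of genes, is a GO term annotated to more of them than chance would predict? The following is a minimal sketch under that assumption. The function and parameter names are illustrative and do not come from any GO tool.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """Probability of seeing at least `hits` annotated genes in the study set.

    population: number of genes in the background set
    annotated:  number of background genes annotated to the GO term
    sample:     number of genes in the study set
    hits:       number of study-set genes annotated to the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers (made up): 10 background genes, 3 annotated to the term,
# a study set of 3 genes, all 3 of which carry the annotation.
p = enrichment_p_value(population=10, annotated=3, sample=3, hits=3)
```

In practice many terms are tested at once, so a multiple-testing correction would normally follow; that step is omitted here.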


Terms and ontology

From a practical view, an ontology is a representation of something we know about. Ontologies consist of representations of things that are detectable or directly observable, and the relationships between those things. There is no universal standard terminology in biology and related domains, and term usages may be specific to a species, research area or even a particular research group. This makes communication and sharing of data more difficult. The Gene Ontology project provides an ontology of defined terms representing gene product properties. The ontology covers three domains:

* cellular component, the parts of a cell or its extracellular environment;
* molecular function, the elemental activities of a gene product at the molecular level, such as binding or catalysis;
* biological process, operations or sets of molecular events with a defined beginning and end, pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms.

Each GO term within the ontology has a term name, which may be a word or string of words; a unique alphanumeric identifier; a definition with cited sources; and an ontology indicating the domain to which it belongs. Terms may also have synonyms, which are classed as being exactly equivalent to the term name, broader, narrower, or related; references to equivalent concepts in other databases; and comments on term meaning or usage. The ontology is structured as a directed acyclic graph, and each term has defined relationships to one or more other terms in the same domain, and sometimes to terms in other domains. The GO vocabulary is designed to be species-neutral, and includes terms applicable to prokaryotes and eukaryotes, and to single-celled and multicellular organisms.

GO is not static: additions, corrections and alterations are suggested by, and solicited from, members of the research and annotation communities, as well as by those directly involved in the GO project. For example, an annotator may request a specific term to represent a metabolic pathway, or a section of the ontology may be revised with the help of community experts. Suggested edits are reviewed by the ontology editors and implemented where appropriate. The GO ontology and annotation files are freely available from the GO website in a number of formats, or can be accessed online using the GO browser AmiGO. The Gene Ontology project also provides downloadable mappings of its terms to other classification systems.
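
Because the ontology is a directed acyclic graph rather than a tree, a term can have several parents, and collecting all of its ancestors means walking every path upward without visiting a shared ancestor twice. The sketch below illustrates this on a toy graph. Only the link from GO:0000016 to GO:0004553 is taken from the example term in this article; every `TOY:` identifier and link is a hypothetical placeholder, not real GO content.

```python
from collections import deque

# child -> list of parents
PARENTS = {
    "GO:0000016": ["GO:0004553"],                  # link taken from the example term
    "GO:0004553": ["TOY:0000002", "TOY:0000003"],  # hypothetical: two parents
    "TOY:0000002": ["TOY:0000001"],                # hypothetical
    "TOY:0000003": ["TOY:0000001"],                # hypothetical
    "TOY:0000001": [],                             # hypothetical root
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links from `term`."""
    seen = set()
    queue = deque(parents.get(term, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already reached through another path
        seen.add(current)
        queue.extend(parents.get(current, []))
    return seen


lactase_ancestors = ancestors("GO:0000016")
```

The `seen` set is what distinguishes this from a tree walk: `TOY:0000001` is reachable through two paths but is recorded once.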


Example term

:id: GO:0000016
:name: lactase activity
:ontology: molecular_function
:def: "Catalysis of the reaction: lactose + H2O = D-glucose + D-galactose." [EC:3.2.1.108]
:synonym: "lactase-phlorizin hydrolase activity" BROAD [EC:3.2.1.108]
:synonym: "lactose galactohydrolase activity" EXACT [EC:3.2.1.108]
:xref: EC:3.2.1.108
:xref: MetaCyc:LACTASE-RXN
:xref: Reactome:20536
:is_a: GO:0004553 ! hydrolase activity, hydrolyzing O-glycosyl compounds
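
A record of this kind is a list of `key: value` lines in which some keys (such as `xref`) repeat. As a rough illustration, the sketch below reads a shortened copy of the example term into a dictionary of lists. The `parse_stanza` helper is hypothetical and handles only the simple cases shown; it is not a full parser for any official GO file format.

```python
from collections import defaultdict

# Shortened copy of the example term, one field per line.
STANZA = """\
id: GO:0000016
name: lactase activity
ontology: molecular_function
xref: EC:3.2.1.108
xref: MetaCyc:LACTASE-RXN
is_a: GO:0004553 ! hydrolase activity, hydrolyzing O-glycosyl compounds
"""


def parse_stanza(text):
    """Map each field name to the list of values it takes in the stanza."""
    fields = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first ": " only, since values also contain colons.
        key, _, value = line.partition(": ")
        # Drop the trailing "! human-readable name" comment, if present.
        value = value.split(" ! ")[0].strip()
        fields[key].append(value)
    return dict(fields)


term = parse_stanza(STANZA)
```

Storing every value in a list keeps repeated fields such as `xref` and `synonym` without special-casing them.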


Annotation

Genome annotation encompasses the practice of capturing data about a gene product, and GO annotations use terms from the GO to do so. Annotations from GO curators are integrated and disseminated on the GO website, where they can be downloaded directly or viewed online using AmiGO. In addition to the gene product identifier and the relevant GO term, GO annotations have at least the following data:

* the ''reference'' used to make the annotation (e.g. a journal article);
* an ''evidence code'' denoting the type of evidence upon which the annotation is based;
* the date and the creator of the annotation.

Supporting information, depending on the GO term and evidence used, and supplementary information, such as the conditions under which the function is observed, may also be included in a GO annotation.

The evidence code comes from a controlled vocabulary of codes, the Evidence Code Ontology, covering both manual and automated annotation methods. For example, ''Traceable Author Statement'' (TAS) means a curator has read a published scientific paper and the metadata for that annotation bears a citation to that paper; ''Inferred from Sequence Similarity'' (ISS) means a human curator has reviewed the output from a sequence similarity search and verified that it is biologically meaningful. Annotations from automated processes (for example, remapping annotations created using another annotation vocabulary) are given the code ''Inferred from Electronic Annotation'' (IEA). In 2010, over 98% of all GO annotations were inferred computationally, not by curators, but as of July 2, 2019, only about 30% of all GO annotations were inferred computationally. Because IEA annotations are not checked by a human, the GO Consortium considers them to be marginally less reliable, and they are commonly made to higher-level, less detailed terms. Full annotation data sets can be downloaded from the GO website. To support the development of annotation, the GO Consortium provides workshops and mentors new groups of curators and developers. Many machine learning algorithms have been designed and implemented to predict Gene Ontology annotations.


Example annotation

:Gene product: Actin, alpha cardiac muscle 1, UniProtKB:P68032
:GO term: heart contraction; GO:0060047 (biological process)
:Evidence code: Inferred from Mutant Phenotype (IMP)
:Reference:
:Assigned by: UniProtKB, June 6, 2008
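
Since each annotation carries an evidence code, a common first step when working with annotation data is to separate curator-reviewed annotations from those inferred electronically (IEA). The sketch below shows one way to model that. The `Annotation` class and its field names are illustrative assumptions, not an official GO data format; the first record is the example annotation above, and the second is made up for contrast.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    gene_product: str  # identifier of the annotated gene product
    go_id: str         # GO term identifier
    evidence: str      # evidence code, e.g. "IMP", "TAS", "ISS", "IEA"
    assigned_by: str   # group that created the annotation


ANNOTATIONS = [
    # From the example annotation above.
    Annotation("UniProtKB:P68032", "GO:0060047", "IMP", "UniProtKB"),
    # Hypothetical record, invented purely to show an IEA annotation.
    Annotation("EX:0001", "GO:0000016", "IEA", "EX_pipeline"),
]


def curated_only(annotations):
    """Keep annotations whose evidence code is anything other than IEA."""
    return [a for a in annotations if a.evidence != "IEA"]


reviewed = curated_only(ANNOTATIONS)
```

Whether to discard IEA annotations depends on the analysis; they are less detailed on average but cover far more gene products.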


Tools

There are a large number of tools, both online and downloadable, that use the data provided by the GO project. The vast majority of these come from third parties; the GO Consortium develops and supports two tools, AmiGO and OBO-Edit.

AmiGO is a web-based application that allows users to query, browse and visualize ontologies and gene product annotation data. It also has a BLAST tool, tools allowing analysis of larger data sets, and an interface to query the GO database directly. AmiGO can be used online at the GO website to access the data provided by the GO Consortium, or can be downloaded and installed for local use on any database employing the GO database schema. It is free open source software and is available as part of the go-dev software distribution.

OBO-Edit is an open source, platform-independent ontology editor developed and maintained by the Gene Ontology Consortium. It is implemented in Java, and uses a graph-oriented approach to display and edit ontologies. OBO-Edit includes a comprehensive search and filter interface, with the option to render subsets of terms to make them visually distinct; the user interface can also be customized according to user preferences. OBO-Edit also has a reasoner that can infer links that have not been explicitly stated, based on existing relationships and their properties. Although it was developed for biomedical ontologies, OBO-Edit can be used to view, search and edit any ontology. It is freely available to download.
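
To give a sense of what inferring unstated links can mean, the sketch below derives new links from asserted ones using two relation types, `is_a` and `part_of`, and a simple composition rule: a chain made only of `is_a` links yields `is_a`, and any chain that includes a `part_of` link yields `part_of`. This is an illustrative assumption about relation properties, not a description of OBO-Edit's actual reasoner, and the term names `A` to `D` are placeholders.

```python
def infer_links(asserted):
    """Return links implied by `asserted` but not stated in it.

    asserted: set of (child, relation, parent) triples, where relation
              is either "is_a" or "part_of".
    """
    links = set(asserted)
    changed = True
    while changed:  # repeat until no new link can be derived
        changed = False
        for (a, r1, b) in list(links):
            for (b2, r2, c) in list(links):
                if b != b2:
                    continue
                # is_a followed by is_a stays is_a; otherwise part_of.
                rel = "is_a" if r1 == r2 == "is_a" else "part_of"
                new = (a, rel, c)
                if new not in links:
                    links.add(new)
                    changed = True
    return links - set(asserted)


ASSERTED = {
    ("A", "is_a", "B"),
    ("B", "part_of", "C"),
    ("C", "is_a", "D"),
}
inferred = infer_links(ASSERTED)
```

The fixed-point loop is quadratic per pass and is meant only to make the rule visible; a production reasoner would use a more efficient graph traversal.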


Consortium

The Gene Ontology Consortium is the set of biological databases and research groups actively involved in the Gene Ontology project. This includes a number of model organism databases and multi-species protein databases, software development groups, and a dedicated editorial office.


History

The Gene Ontology was originally constructed in 1998 by a consortium of researchers studying the genomes of three model organisms: ''Drosophila melanogaster'' (fruit fly), ''Mus musculus'' (mouse), and ''Saccharomyces cerevisiae'' (brewer's or baker's yeast). Many other model organism databases have since joined the Gene Ontology Consortium, contributing not only annotation data but also to the development of the ontologies and of the tools to view and apply the data. Many major plant, animal and microorganism databases contribute to the project. As of July 2019, the GO contained 44,945 terms, and there were 6,408,283 annotations to 4,467 different biological organisms. There is a significant body of literature on the development and use of the GO, and it has become a standard tool in the bioinformatics arsenal. The Consortium's objectives have three aspects: building the gene ontology, assigning ontology terms to genes and gene products, and developing software and databases for the first two objectives. Several analyses of the Gene Ontology using formal, domain-independent properties of classes (the metaproperties) are also starting to appear, for instance an ontological analysis of biological ontologies.


See also

* Blast2GO
* Comparative Toxicogenomics Database
* DAVID bioinformatics
* Interferome
* National Center for Biomedical Ontology
* Critical Assessment of Function Annotation


External links


* AmiGO - the current official web-based set of tools for searching and browsing the Gene Ontology database
* Gene Ontology Consortium - official site
* PlantRegMap - GO annotation for 165 plant species and GO enrichment analysis