BioMart is a community-driven project to provide a single point of access to distributed research data. The BioMart project contributes open source software and data services to the international scientific community. Although the BioMart software is primarily used by the biomedical research community, it is designed in such a way that any type of data can be incorporated into the BioMart framework. The BioMart project originated at the European Bioinformatics Institute as a data management solution for the Human Genome Project.
Since then, BioMart has grown to become a multi-institute collaboration involving various database projects on five continents.
Integration with Ensembl
BioMart lets researchers and bioinformaticians export data from Ensembl, including gene IDs, gene positions, associated variations, protein domains and sequences. The data can be exported in convenient file formats such as FASTA, XLS, CSV, TSV and HTML. Researchers can use the exported data in a variety of applications, including genomic studies, gene expression analysis, and comparative genomics. BioMart's interface lets users customize queries to access specific data sets or features of interest.
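As a rough illustration of such an export, the sketch below builds the kind of XML query that BioMart servers conventionally accept and encodes it into a request URL. It is a minimal sketch under assumptions, not a definitive client: the endpoint URL, the dataset name `hsapiens_gene_ensembl`, and the attribute and filter names are taken from common Ensembl BioMart usage and should be checked against the mart actually being queried. The helper names `build_query` and `build_url` are hypothetical. No network request is made.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET

# Assumption: the Ensembl-hosted BioMart "martservice" endpoint.
MARTSERVICE_URL = "https://www.ensembl.org/biomart/martservice"


def build_query(dataset, attributes, filters=None, formatter="TSV"):
    """Return a BioMart-style XML query as a string (hypothetical helper)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": formatter,        # e.g. TSV, CSV, HTML, FASTA
        "header": "1",                 # include a header row in the export
        "uniqueRows": "1",             # drop duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters restrict which records are returned.
    for name, value in (filters or {}).items():
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    # Attributes select which columns are exported.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


def build_url(xml_query, base_url=MARTSERVICE_URL):
    """URL-encode the XML query into a GET request URL (hypothetical helper)."""
    return base_url + "?" + urlencode({"query": xml_query})


# Example: gene IDs and positions for genes on chromosome 21 (assumed names).
xml_query = build_query(
    "hsapiens_gene_ensembl",
    ["ensembl_gene_id", "chromosome_name", "start_position", "end_position"],
    filters={"chromosome_name": "21"},
)
url = build_url(xml_query)
```

The resulting `url` could be fetched with any HTTP client. Changing the `formatter` argument selects a different export format, provided the target server supports it.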
Software
BioMart is a freely available, open-source, federated database system that provides unified access to disparate, geographically distributed data sources. BioMart allows databases hosted on different servers to be presented seamlessly to users, facilitating collaborative projects. BioMart contains several levels of query optimization to efficiently manage large data sets, and offers a diverse selection of graphical user interfaces and application programming interfaces to allow queries to be performed in whatever manner is most convenient for the user. BioMart's capabilities are extended by integration with several widely used software packages such as Bioconductor, Galaxy, Cytoscape, and Taverna.
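The idea of federation described above can be illustrated with a toy example. The sketch below is not BioMart's implementation. It only shows, with two hypothetical in-memory sources and made-up records, how rows held by independent sources can be combined on a shared key so that the user sees a single result set. All function names and data are assumptions made for illustration.

```python
from typing import Dict, List

Row = Dict[str, str]


def gene_source() -> List[Row]:
    """Hypothetical source 1: gene records (made-up data)."""
    return [
        {"gene_id": "G1", "symbol": "ABC1"},
        {"gene_id": "G2", "symbol": "XYZ2"},
    ]


def protein_source() -> List[Row]:
    """Hypothetical source 2: protein domain records (made-up data)."""
    return [
        {"gene_id": "G1", "domain": "kinase"},
        {"gene_id": "G1", "domain": "SH2"},
        {"gene_id": "G3", "domain": "zinc finger"},
    ]


def federated_join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Combine rows from two sources that share a value for `key`."""
    # Index the right-hand source by key so each lookup is cheap.
    index: Dict[str, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Emit one merged row per matching pair; unmatched rows are dropped.
    joined: List[Row] = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


rows = federated_join(gene_source(), protein_source(), "gene_id")
```

In a real federated system the two sources would sit on different servers and the join key would be an agreed identifier. Here plain Python lists stand in for both.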
Data sources and community
There are around 40 BioMart data sources including the Atlas of UTR Regulatory Activity (AURA), the COSMIC cancer database, Ensembl Genomes, HapMap, InterPro, Mouse Genome Informatics (MGI), Rfam and UniProt. Access is provided by institutions including the European Bioinformatics Institute (EBI) and the Wellcome Trust Sanger Institute in the UK, Cold Spring Harbor Laboratory and the National Center for Biotechnology Information (NCBI) in the United States, and the French National Centre for Scientific Research (CNRS).
The BioMart Central Portal was established to provide a convenient single point of access to this growing pool of data sources.
External links
DATABASE issue dedicated to BioMart
BioMart Project home page: http://biomart.org/
BioMart Users mailing list