HOME

TheInfoList



OR:

The International Nucleotide Sequence Database Collaboration (INSDC) consists of a joint effort to collect and disseminate
database In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and a ...
s containing
DNA Deoxyribonucleic acid (; DNA) is a polymer composed of two polynucleotide chains that coil around each other to form a double helix. The polymer carries genetic instructions for the development, functioning, growth and reproduction of al ...
and
RNA Ribonucleic acid (RNA) is a polymeric molecule that is essential for most biological functions, either by performing the function itself (non-coding RNA) or by forming a template for the production of proteins (messenger RNA). RNA and deoxyrib ...
sequences. It involves the following computerized
database In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and a ...
s: NIG's DNA Data Bank of Japan (
Japan Japan is an island country in East Asia. Located in the Pacific Ocean off the northeast coast of the Asia, Asian mainland, it is bordered on the west by the Sea of Japan and extends from the Sea of Okhotsk in the north to the East China Sea ...
), NCBI's
GenBank The GenBank sequence database is an open access, annotated collection of all publicly available nucleotide sequences and their protein translations. It is produced and maintained by the National Center for Biotechnology Information (NCBI; a par ...
( USA) and the EMBL- EBI's European Nucleotide Archive ( EMBL). New and updated data on
nucleotide Nucleotides are Organic compound, organic molecules composed of a nitrogenous base, a pentose sugar and a phosphate. They serve as monomeric units of the nucleic acid polymers – deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), both o ...
sequences contributed by research teams to each of the three databases are synchronized on a daily basis through continuous interaction between the staff at each the collaborating organizations. All of the data in INSDC is available for free and unrestricted access, for any purpose, with no restrictions on analysis, redistribution, or re-publication of the data. This policy has been a foundational principle of the INSDC since its inception. Since the 1990s, most of the world's major scientific journals have required that sequence data be deposited in an INSDC database as a pre-condition for publication. The DDBJ/ EMBL-EBI/
GenBank The GenBank sequence database is an open access, annotated collection of all publicly available nucleotide sequences and their protein translations. It is produced and maintained by the National Center for Biotechnology Information (NCBI; a par ...
synchronization is maintained according to a number of guidelines which are produced and published by an International Advisory Board. The guidelines consist of a common definition of the feature tables for the databases, which regulate the content and
syntax In linguistics, syntax ( ) is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure (constituenc ...
of the database entries, in the form of a common DTD (''Document Type Definition''). The syntax is called INSDSeq and its core consists of the letter sequence of the
gene expression Gene expression is the process (including its Regulation of gene expression, regulation) by which information from a gene is used in the synthesis of a functional gene product that enables it to produce end products, proteins or non-coding RNA, ...
(
amino acid Amino acids are organic compounds that contain both amino and carboxylic acid functional groups. Although over 500 amino acids exist in nature, by far the most important are the 22 α-amino acids incorporated into proteins. Only these 22 a ...
sequence) and the letter sequence for
nucleotide Nucleotides are Organic compound, organic molecules composed of a nitrogenous base, a pentose sugar and a phosphate. They serve as monomeric units of the nucleic acid polymers – deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), both o ...
bases in the gene or decoded segment. In a DBFetch operation shows a typical INSD entry at the EMBL-EBI database; the same entry at NCBI.


See also

*
Bioinformatics Bioinformatics () is an interdisciplinary field of science that develops methods and Bioinformatics software, software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, ...
* Biological database * List of biological databases *
National Center for Biotechnology Information The National Center for Biotechnology Information (NCBI) is part of the National Library of Medicine (NLM), a branch of the National Institutes of Health (NIH). It is approved and funded by the government of the United States. The NCBI is lo ...
*
Sequence database In the field of bioinformatics, a sequence database is a type of biological database that is composed of a large collection of computerized ("Digital data, digital") nucleic acid sequences, protein sequences, or other polymer sequences stored on a ...


References


External links


Official site


External links

*




DNA Data Bank of Japan

GenBank Nucleotide Search
{{Bioinformatics Bioinformatics organizations Biology organisations based in the United Kingdom Databases in the United Kingdom Population genetics in the United Kingdom South Cambridgeshire District