RefSeq
The Reference Sequence (RefSeq) database is an open-access, annotated and curated collection of publicly available nucleotide sequences (DNA, RNA) and their protein products. RefSeq was introduced in 2000. The database is built by the National Center for Biotechnology Information (NCBI) and, unlike GenBank, provides only a single record for each natural biological molecule (i.e. DNA, RNA or protein) for major organisms ranging from viruses to bacteria to eukaryotes. For each model organism, RefSeq aims to provide separate and linked records for the genomic DNA, the gene transcripts, and the proteins arising from those transcripts. RefSeq is limited to major organisms for which sufficient data are available (121,461 distinct "named" organisms as of July 2022), while GenBank includes sequences for any organism submitted (approximately 504,000 formally described species).

RefSeq categories
The RefSeq collection comprises different data types, with different origins, so it is neces ...

Consensus CDS Project
The Consensus Coding Sequence (CCDS) Project is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies. The CCDS project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented by the National Center for Biotechnology Information (NCBI), Ensembl, and the UCSC Genome Browser. The integrity of the CCDS dataset is maintained through stringent quality assurance testing and ongoing manual curation.

Motivation and background
Biological and biomedical research has come to rely on accurate and consistent annotation of genes and their products on genome assemblies. Reference annotations of genomes are available from various sources, each with its own independent goals and policies, which results in some annotation variation. The CCDS project was established to identify a gold stand ...

GenBank
The GenBank sequence database is an open-access, annotated collection of all publicly available nucleotide sequences and their protein translations. It is produced and maintained by the National Center for Biotechnology Information (NCBI; a part of the National Institutes of Health in the United States) as part of the International Nucleotide Sequence Database Collaboration (INSDC). In October 2024, GenBank contained 34 trillion base pairs from over 4.7 billion nucleotide sequences and more than 580,000 formally described species. The database was started in 1982 by Walter Goad and Los Alamos National Laboratory. GenBank has become an important database for research in biological fields and has grown exponentially in recent years, doubling roughly every 18 months. GenBank is built from direct submissions by individual laboratories, as well as bulk submissions from large-scale sequencing centers.

Submissions
Only original sequences can be submitted to GenBank. ...

Model Organism
A model organism is a non-human species that is extensively studied to understand particular biological phenomena, with the expectation that discoveries made in the model organism will provide insight into the workings of other organisms. Model organisms are widely used to research human disease when human experimentation would be unfeasible or unethical. This strategy is made possible by the common descent of all living organisms, and the conservation of metabolic and developmental pathways and genetic material over the course of evolution. Research using animal models has been central to most of the achievements of modern medicine. It has contributed most of the basic knowledge in fields such as human physiology and biochemistry, and has played significant roles in fields such as neuroscience and infectious disease. The results have included the near-eradication of polio and the development of organ transplantation, and have benefited both humans and animals. From 19 ...

National Center For Biotechnology Information
The National Center for Biotechnology Information (NCBI) is part of the National Library of Medicine (NLM), a branch of the National Institutes of Health (NIH). It is approved and funded by the government of the United States. The NCBI is located in Bethesda, Maryland, and was founded in 1988 through legislation sponsored by US Congressman Claude Pepper. The NCBI houses a series of databases relevant to biotechnology and biomedicine and is an important resource for bioinformatics tools and services. Major databases include GenBank for DNA sequences and PubMed, a bibliographic database for biomedical literature. Other databases include the NCBI Epigenomics database. All these databases are available online through the Entrez search engine. NCBI was directed by David Lipman, one of the original authors of the BLAST sequence alignment program and a widely respected figure in bioinformatics.

GenBank
NCBI had responsibility for making available the GenBank DNA seque ...

Sequence Database
In the field of bioinformatics, a sequence database is a type of biological database that is composed of a large collection of computerized ("digital") nucleic acid sequences, protein sequences, or other polymer sequences stored on a computer. The UniProt database is an example of a protein sequence database. As of 2013 it contained over 40 million sequences and was growing at an exponential rate. Historically, sequences were published in paper form, but as the number of sequences grew, this storage method became unsustainable.

Search
Searching in a sequence database involves looking for similarities between a genomic/protein sequence and a query string, and finding the sequence in the database that "best" matches the target sequence (based on criteria which vary depending on the search method). The number of matches/hits is used to formulate a score that determines the similarity between the query sequence and the sequences in the sequence database. The main goal is ...
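
The hit-counting idea in the Search passage can be illustrated with a minimal Python sketch. It is not the method of any particular search tool: it assumes a toy in-memory database, and the names `kmers`, `best_match`, and the `seq_*` entries are hypothetical. The score here is simply the number of distinct length-k substrings (k-mers) shared between the query and each database sequence.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(query, database, k=3):
    """Return (name, score) for the entry sharing the most k-mers with query.

    The score is the count of distinct k-mers common to both sequences,
    a deliberately simple stand-in for a real similarity score.
    """
    query_kmers = kmers(query, k)
    scores = {
        name: len(query_kmers & kmers(seq, k))
        for name, seq in database.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical toy database of nucleotide sequences.
database = {
    "seq_a": "ATGGCGTACGTTAGC",
    "seq_b": "TTTTCCCCGGGGAAAA",
    "seq_c": "ATGGCGTTCGTTAGG",
}

print(best_match("GGCGTACGTT", database))  # ('seq_a', 7)
```

Real search methods differ in how hits are weighted and extended, which is why the "best" match depends on the criteria of the chosen method.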

Enhancer (genetics)
In genetics, an enhancer is a short (50–1500 bp) region of DNA that can be bound by proteins (activators) to increase the likelihood that transcription of a particular gene will occur. These proteins are usually referred to as transcription factors. Enhancers are cis-acting. They can be located up to 1 Mbp (1,000,000 bp) away from the gene, upstream or downstream from the start site. There are hundreds of thousands of enhancers in the human genome. They are found in both prokaryotes and eukaryotes. Active enhancers typically get transcribed as enhancer or regulatory non-coding RNA, whose expression levels correlate with mRNA levels of target genes. The first discovery of a eukaryotic enhancer was in the immunoglobulin heavy chain gene in 1983. This enhancer, located in the large intron, provided an explanation for the transcriptional activation of rearranged Vh gene promoters while unrearranged Vh promoters remained inactive. Lately, enhancers have been shown to be in ...

Silencer (genetics)
In genetics, a silencer is a DNA sequence capable of binding transcription regulation factors called repressors. DNA contains genes and provides the template to produce messenger RNA (mRNA). That mRNA is then translated into proteins. When a repressor protein binds to the silencer region of DNA, RNA polymerase is prevented from transcribing the DNA sequence into RNA. With transcription blocked, the translation of RNA into proteins is impossible. Thus, silencers prevent genes from being expressed as proteins. RNA polymerase, a DNA-dependent enzyme, reads the DNA template, a sequence of nucleotides, in the 3' to 5' direction while the complementary RNA is synthesized in the 5' to 3' direction. RNA is similar to DNA, except that RNA contains uracil instead of thymine; uracil forms a base pair with adenine. An important region for the activity of gene repression and expression found in RNA is the 3' untranslated region. This is a region on the 3' terminus of RNA that will not ...
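
The base-pairing and direction rules described above can be shown with a small Python sketch. It is a simplified illustration, not a model of real transcriptional regulation: the function name `transcribe` and the `repressor_bound` flag are hypothetical, and the template strand is assumed to be written in the 3' to 5' direction so that the returned RNA reads 5' to 3'.

```python
# DNA template base -> complementary RNA base (uracil pairs with adenine).
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5, repressor_bound=False):
    """Return the RNA (5' to 3') complementary to a DNA template (3' to 5').

    If a repressor is bound to the silencer (hypothetical flag), transcription
    is blocked and no RNA is produced, so an empty string is returned.
    """
    if repressor_bound:
        return ""
    return "".join(COMPLEMENT[base] for base in template_3_to_5.upper())


print(transcribe("TACGGATTC"))                        # AUGCCUAAG
print(repr(transcribe("TACGGATTC", repressor_bound=True)))  # ''
```

With the repressor bound, no RNA is returned, so there is nothing to translate, which mirrors the blocked expression described in the text.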

DNase I Hypersensitive Site
In genetics, DNase I hypersensitive sites (DHSs) are regions of chromatin that are sensitive to cleavage by the DNase I enzyme. In these specific regions of the genome, chromatin has lost its condensed structure, exposing the DNA and making it accessible. This raises the availability of DNA to degradation by enzymes, such as DNase I. These accessible chromatin zones are functionally related to transcriptional activity, since this remodeled state is necessary for the binding of proteins such as transcription factors. Since the discovery of DHSs 30 years ago, they have been used as markers of regulatory DNA regions. These regions have been shown to map many types of cis-regulatory elements including promoters, enhancers, insulators, silencers and locus control regions. A high-throughput measure of these regions is available through DNase-Seq.

Massive analysis
The ENCODE project proposes to map all of the DHSs in the human genome with the intention of cataloging human regulatory D ...

Origin Of Replication
The origin of replication (also called the replication origin) is a particular sequence in a genome at which replication is initiated. Propagation of the genetic material between generations requires timely and accurate duplication of DNA by semiconservative replication prior to cell division to ensure each daughter cell receives the full complement of chromosomes. (Material was copied from this source, which is available under a Creative Commons Attribution 4.0 International License.) Replication can involve either DNA in living organisms such as prokaryotes and eukaryotes, or DNA or RNA in viruses, such as double-stranded RNA viruses. Synthesis of daughter strands starts at discrete sites, termed replication origins, and proceeds in a bidirectional manner until all genomic DNA is replicated. Despite the fundamental nature of these events, organisms have evolved surprisingly divergent strategies that control replication onset. ...

Phylogenetics
In biology, phylogenetics is the study of the evolutionary history of life using observable characteristics of organisms (or genes), a process known as phylogenetic inference. It infers the relationship among organisms based on empirical data and observed heritable traits of DNA sequences, protein amino acid sequences, and morphology. The result is a phylogenetic tree, a diagram depicting the hypothetical relationships among the organisms and reflecting their inferred evolutionary history. The tips of a phylogenetic tree represent the observed entities, which can be living taxa or fossils. A phylogenetic diagram can be rooted or unrooted. A rooted tree diagram indicates the hypothetical common ancestor of the taxa represented on the tree. An unrooted tree diagram (a network) makes no assumption about directionality of character state transformation, and does not show the origin or "root" of the taxa in question. In addition to their use for inferring phylogenetic pa ...

Ribosomal RNA
Ribosomal ribonucleic acid (rRNA) is a type of non-coding RNA which is the primary component of ribosomes, essential to all cells. rRNA is a ribozyme which carries out protein synthesis in ribosomes. Ribosomal RNA is transcribed from ribosomal DNA (rDNA) and then bound to ribosomal proteins to form small and large ribosome subunits. rRNA is the physical and mechanical factor of the ribosome that forces transfer RNA (tRNA) and messenger RNA (mRNA) to process and translate the latter into proteins. Ribosomal RNA is the predominant form of RNA found in most cells; it makes up about 80% of cellular RNA despite never being translated into proteins itself. Ribosomes are composed of approximately 60% rRNA and 40% ribosomal proteins, though this ratio differs between prokaryotes and eukaryotes.

Structure
Although the primary structure of rRNA sequences can vary across organisms, base-pairing within these sequ ...