A microsatellite is a tract of repetitive
DNA
in which certain
DNA motifs (ranging in length from one to six or more
base pairs) are repeated, typically 5–50 times.
Microsatellites occur at thousands of locations within an organism's
genome
. They have a higher
mutation
rate than other areas of DNA
leading to high
genetic diversity
. Microsatellites are often referred to as short tandem repeats (STRs) by
forensic geneticists and in
genetic genealogy, or as simple sequence repeats (SSRs) by plant geneticists.
Microsatellites and their longer cousins, the
minisatellites, together are classified as
VNTR (variable number of
tandem repeats) DNA. The name
"satellite" DNA refers to the early observation that centrifugation of genomic DNA in a test tube separates a prominent layer of bulk DNA from accompanying "satellite" layers of repetitive DNA.
They are widely used for
DNA profiling
in
cancer diagnosis, in
kinship
analysis (especially
paternity testing) and in forensic identification. They are also used in
genetic linkage
analysis to locate a gene or a mutation responsible for a given trait or disease. Microsatellites are also used in
population genetics
to measure levels of relatedness between subspecies, groups and individuals.
History
Although the first microsatellite was characterised in 1984 at the
University of Leicester
by Weller,
Jeffreys
and colleagues as a polymorphic GGAT repeat in the human
myoglobin
gene, the term "microsatellite" was introduced later, in 1989, by Litt and Luty.
The increasing availability of DNA amplification by PCR at the beginning of the 1990s triggered a large number of studies using the amplification of microsatellites as genetic markers for forensic medicine, for paternity testing, and for positional cloning to find the gene underlying a trait or disease. Prominent early applications include the identifications by microsatellite genotyping of the eight-year-old skeletal remains of a British murder victim (
Hagelberg et al. 1991), and of the Auschwitz concentration camp doctor
Josef Mengele
who escaped to South America following World War II (
Jeffreys
et al. 1992).
Structures, locations, and functions
A microsatellite is a tract of tandemly repeated (i.e. adjacent) DNA motifs that range in length from one to six or up to ten nucleotides (the exact definition and delineation to the longer minisatellites varies from author to author),
and are typically repeated 5–50 times. For example, the sequence TATATATATA is a dinucleotide microsatellite, and GTCGTCGTCGTCGTC is a trinucleotide microsatellite (with A being
Adenine
, G
Guanine
, C
Cytosine
, and T
Thymine
). Repeat units of four and five nucleotides are referred to as tetra- and pentanucleotide motifs, respectively. Most
eukaryotes have microsatellites, with the notable exception of some yeast species. Microsatellites are distributed throughout the genome.
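The definition above (a short motif repeated in tandem several times) can be illustrated with a small search routine. The following is a minimal sketch rather than a production repeat finder: the function name `find_microsatellites` and its default thresholds (motifs of 1 to 6 bases, at least 5 copies) are assumptions chosen to mirror the ranges quoted in this section, and only perfect, uninterrupted repeats are detected.

```python
import re

def find_microsatellites(seq, min_unit=1, max_unit=6, min_repeats=5):
    """Return (start, motif, copies) for each perfect tandem repeat tract.

    Hypothetical helper for illustration; thresholds are assumptions.
    """
    # Lazy motif capture prefers the shortest repeating unit; the
    # backreference \1 then requires further adjacent copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits

# The two example sequences from the text
print(find_microsatellites("TATATATATA"))
print(find_microsatellites("GTCGTCGTCGTCGTC"))
```

Applied to the two example sequences, the sketch reports a dinucleotide (TA) tract and a trinucleotide (GTC) tract, each with five copies.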
The human genome, for example, contains 50,000–100,000 dinucleotide microsatellites, and smaller numbers of tri-, tetra- and pentanucleotide microsatellites.
Many are located in non-coding parts of the human genome and therefore do not produce proteins, but they can also be located in regulatory regions and
coding regions.
Microsatellites in non-coding regions may not have any specific function, and therefore might not be
selected against; this allows them to accumulate mutations unhindered over the generations and gives rise to variability that can be used for DNA fingerprinting and identification purposes. Other microsatellites are located in regulatory flanking or
intronic regions of genes, or directly in
codons of genes – microsatellite mutations in such cases can lead to phenotypic changes and diseases, notably in
triplet expansion diseases such as
fragile X syndrome
and
Huntington's disease
.
Telomeres are linear sequences of DNA that sit at the very ends of chromosomes and protect the integrity of genomic material (not unlike an
aglet on the end of a shoelace) during successive rounds of cell division, when the "end replication problem" shortens chromosome ends.
In white blood cells, the gradual shortening of telomeric DNA has been shown to inversely correlate with
ageing
in several sample types. Telomeres consist of repetitive DNA, with the hexanucleotide repeat motif TTAGGG in vertebrates. They are thus classified as
minisatellites. Similarly, insects have shorter repeat motifs in their telomeres that could arguably be considered microsatellites.
Mutation mechanisms and mutation rates

Unlike
point mutations, which affect only a single nucleotide, microsatellite mutations lead to the gain or loss of an entire repeat unit, and sometimes two or more repeats simultaneously. Thus, the
mutation rate
at microsatellite
loci is expected to differ from other mutation rates, such as base substitution rates.
The mutation rate at microsatellite loci depends on the repeat motif sequence, the number of repeated motif units and the purity of the canonical repeated sequence.
A variety of mechanisms for mutation of microsatellite loci have been reviewed,
and their resulting polymorphic nature has been quantified.
The actual cause of mutations in microsatellites is debated.
One proposed cause of such length changes is replication slippage, caused by mismatches between DNA strands while being replicated during
meiosis
.
DNA polymerase
, the enzyme responsible for reading DNA during replication, can slip while moving along the template strand and continue at the wrong nucleotide. DNA polymerase slippage is more likely to occur when a repetitive sequence (such as CGCGCG) is replicated. Because microsatellites consist of such repetitive sequences, DNA polymerase may make errors at a higher rate in these sequence regions. Several studies have found evidence that slippage is the cause of microsatellite mutations.
Typically, slippage in each microsatellite occurs about once per 1,000 generations.
Thus, slippage changes in repetitive DNA are three orders of magnitude more common than
point mutations in other parts of the genome.
Most slippage results in a change of just one repeat unit, and slippage rates vary for different allele lengths and repeat unit sizes,
and within different species.
If there is a large size difference between individual alleles, then there may be increased instability during recombination at meiosis.
Another possible cause of microsatellite mutations is point mutation, in which a single nucleotide is incorrectly copied during replication. A study comparing human and primate genomes found that most changes in repeat number in short microsatellites appear to be due to point mutations rather than slippage.
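The slippage process described in this section can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the function name `simulate_slippage` is hypothetical, each slippage event is assumed to add or remove exactly one repeat unit with equal probability, and the default rate of one event per 1,000 generations is taken from the typical figure quoted above. Real loci deviate from all three assumptions.

```python
import random

def simulate_slippage(repeats, generations, rate=1e-3, rng=None):
    """Track the repeat count of one allele across generations.

    Assumed model: with probability `rate` per generation the allele
    gains or loses a single repeat unit (never dropping below one).
    """
    rng = rng or random.Random()
    history = [repeats]
    for _ in range(generations):
        if rng.random() < rate:
            step = rng.choice((-1, 1))         # gain or loss of one unit
            repeats = max(1, repeats + step)   # keep at least one repeat
        history.append(repeats)
    return history

# Example: a 10-repeat allele followed for 5,000 generations
trajectory = simulate_slippage(10, 5000, rng=random.Random(0))
print(trajectory[-1], "repeats after", len(trajectory) - 1, "generations")
```

Passing a seeded `random.Random` instance makes a run reproducible; different seeds give different trajectories, which is the point of the model.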
Microsatellite mutation rates
Direct estimates of microsatellite mutation rates have been made in numerous organisms, from insects to humans. In the
desert locust
''Schistocerca gregaria'', the microsatellite mutation rate was estimated at 2.1 × 10⁻⁴ per generation per locus.
The microsatellite mutation rate in human male germ lines is five to six times higher than in female germ lines and ranges from 0 to 7 × 10⁻³ per locus per gamete per generation.
In the nematode ''Pristionchus pacificus'', the estimated microsatellite mutation rate ranges from 8.9 × 10⁻⁵ to 7.5 × 10⁻⁴ per locus per generation.
Microsatellite mutation rates vary with base position relative to the microsatellite, repeat type, and base identity.
Mutation rate rises specifically with repeat number, peaking around six to eight repeats and then decreasing again.
Increased heterozygosity in a population will also increase microsatellite mutation rates,
especially when there is a large length difference between alleles. This is likely due to
homologous chromosomes with arms of unequal lengths causing instability during meiosis.
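To give a feel for the per-locus rates quoted in this section, the short sketch below converts a rate into two more intuitive quantities. Both function names are hypothetical, and the calculation assumes that loci and generations mutate independently at a constant rate, which the preceding paragraphs show is only an approximation.

```python
def generations_per_mutation(rate):
    """Mean number of generations between mutations at one locus."""
    return 1.0 / rate

def prob_any_mutation(rate, loci, generations=1):
    """Chance of at least one mutation across `loci` independent loci."""
    return 1.0 - (1.0 - rate) ** (loci * generations)

# Desert locust estimate from the text: 2.1e-4 per generation per locus
print(round(generations_per_mutation(2.1e-4)))   # about 4762 generations

# Illustrative case: 13 loci at an assumed rate of 1e-3, one generation
print(round(prob_any_mutation(1e-3, 13), 4))
```

The second call shows why a single mismatch between parent and child is not unexpected when many loci are typed at once.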
Biological effects of microsatellite mutations
Many microsatellites are located in
non-coding DNA
and are biologically silent. Others are located in regulatory or even
coding DNA – microsatellite mutations in such cases can lead to phenotypic changes and diseases. A genome-wide study estimates that microsatellite variation contributes 10–15% of heritable gene expression variation in humans.
Effects on proteins
In mammals, 20–40% of proteins contain repeating sequences of
amino acids encoded by short sequence repeats.
Most of the short sequence repeats within protein-coding portions of the genome have a repeating unit of three nucleotides, since that length will not cause frame-shifts when mutating.
Each trinucleotide repeating sequence is transcribed and then translated into a repeating series of the same amino acid. In yeasts, the most common repeated amino acids are glutamine, glutamic acid, asparagine, aspartic acid and serine.
Mutations in these repeating segments can affect the physical and chemical properties of proteins, with the potential for producing gradual and predictable changes in protein action.
For example, length changes in tandemly repeating regions in the
Runx2 gene lead to differences in facial length in domesticated dogs (''Canis familiaris''), with an association between longer sequence lengths and longer faces.
This association also applies to a wider range of Carnivora species.
Length changes in polyalanine tracts within the
HOXA13 gene are linked to
hand-foot-genital syndrome, a developmental disorder in humans.
Length changes in other triplet repeats are linked to more than 40 neurological diseases in humans, notably
trinucleotide repeat disorders such as
fragile X syndrome
and
Huntington's disease
.
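The reading-frame argument made at the start of this subsection can be checked with a few lines of code. This is a sketch only: the helper names are hypothetical, and the codon table is a deliberately tiny excerpt of the standard genetic code covering the amino acids named above plus alanine.

```python
# Small excerpt of the standard genetic code (DNA codons, one-letter amino acids)
CODONS = {
    "CAG": "Q", "CAA": "Q",   # glutamine
    "GAA": "E", "GAG": "E",   # glutamic acid
    "AAT": "N", "AAC": "N",   # asparagine
    "GAT": "D", "GAC": "D",   # aspartic acid
    "AGC": "S", "TCT": "S",   # serine
    "GCG": "A", "GCC": "A",   # alanine
}

def translate_repeat(motif, copies):
    """Translate a perfect repeat tract; codons outside the excerpt become '?'."""
    seq = motif * copies
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    return "".join(CODONS.get(seq[i:i + 3], "?") for i in range(0, usable, 3))

def shifts_frame(motif, delta_copies):
    """True if gaining or losing `delta_copies` units changes the reading frame."""
    return (len(motif) * delta_copies) % 3 != 0

print(translate_repeat("CAG", 4))   # a short polyglutamine run
print(shifts_frame("CAG", 1))       # trinucleotide unit: frame preserved
print(shifts_frame("CA", 1))        # dinucleotide unit: frame shifted
```

A change of one trinucleotide unit adds or removes one amino acid without disturbing the downstream codons, whereas a change of one dinucleotide unit shifts every codon that follows.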
Evolutionary changes from replication slippage also occur in simpler organisms. For example, microsatellite length changes are common within surface membrane proteins in yeast, providing rapid evolution in cell properties.
Specifically, length changes in the FLO1 gene control the level of adhesion to substrates.
Short sequence repeats also provide rapid evolutionary change to surface proteins in pathogenic bacteria; this may allow them to keep up with immunological changes in their hosts.
Length changes in short sequence repeats in a fungus (''Neurospora crassa'') control the duration of its
circadian clock
cycles.
Effects on gene regulation
Length changes of microsatellites within
promoters and other
cis-regulatory regions can change gene expression quickly, between generations. The human genome contains many (>16,000) short sequence repeats in regulatory regions, which provide 'tuning knobs' on the expression of many genes.
Length changes in bacterial SSRs can affect
fimbriae formation in ''Haemophilus influenzae'', by altering promoter spacing.
Dinucleotide microsatellites are linked to abundant variation in cis-regulatory control regions in the human genome.
Microsatellites in control regions of the vasopressin 1a receptor gene in voles influence their social behavior and level of monogamy.
In
Ewing sarcoma (a type of painful bone cancer in young humans), a point mutation has created an extended GGAA microsatellite which binds a transcription factor, which in turn activates the EGR2 gene which drives the cancer. In addition, other GGAA microsatellites may influence the expression of genes that contribute to the clinical outcome of Ewing sarcoma patients.
Effects within introns
Microsatellites within
introns also influence phenotype, through means that are not currently understood. For example, a GAA triplet expansion in the first intron of the X25 gene appears to interfere with transcription, and causes
Friedreich's ataxia
.
Tandem repeats in the first intron of the asparagine synthetase gene are linked to acute lymphoblastic leukaemia.
A repeat polymorphism in the fourth intron of the NOS3 gene is linked to hypertension in a Tunisian population.
Reduced repeat lengths in the EGFR gene are linked with osteosarcomas.
An archaic form of splicing preserved in
zebrafish is known to use microsatellite sequences within intronic mRNA for the removal of introns in the absence of U2AF2 and other splicing machinery. It is theorized that these sequences form highly stable
cloverleaf configurations that bring the 3' and 5' intron splice sites into close proximity, effectively replacing the
spliceosome. This method of RNA splicing is believed to have diverged from human evolution at the formation of
tetrapods and to represent an artifact of an
RNA world.
Effects within transposons
Almost 50% of the human genome is contained in various types of transposable elements (also called transposons, or 'jumping genes'), and many of them contain repetitive DNA.
It is probable that short sequence repeats in those locations are also involved in the regulation of gene expression.
Applications
Microsatellites are used for assessing chromosomal DNA deletions in cancer diagnosis. Microsatellites are widely used for
DNA profiling
, also known as "genetic fingerprinting", of crime stains (in forensics) and of tissues (in transplant patients). They are also widely used in
kinship
analysis (most commonly in paternity testing). Also, microsatellites are used for mapping locations within the genome, specifically in
genetic linkage
analysis to locate a gene or a mutation responsible for a given trait or disease. As a special case of mapping, they can be used for studies of
gene duplication
or
deletion. Researchers use microsatellites in
population genetics
and in species conservation projects. Plant geneticists have proposed the use of microsatellites for
marker assisted selection of desirable traits in plant breeding.
Cancer diagnosis
In
tumour
cells, whose controls on replication are damaged, microsatellites may be gained or lost at an especially high frequency during each round of
mitosis
. Hence a tumour cell line might show a different
genetic fingerprint from that of the host tissue, and, especially in
colorectal cancer
, might present with
loss of heterozygosity.
Microsatellites analyzed in primary tissue have therefore been routinely used in cancer diagnosis to assess tumour progression.
Genome-wide association studies (GWAS) have been used to identify microsatellite biomarkers as a source of genetic predisposition in a variety of cancers.
Forensic and medical fingerprinting
Microsatellite analysis became popular in the field of
forensics
in the 1990s.
It is used for the
genetic fingerprinting
of individuals where it permits forensic identification (typically matching a crime stain to a victim or perpetrator). It is also used to follow up
bone marrow transplant
patients.
The microsatellites in use today for forensic analysis are all tetra- or penta-nucleotide repeats, as these give a high degree of error-free data while being short enough to survive degradation in non-ideal conditions. Even shorter repeat sequences would tend to suffer from artifacts such as PCR stutter and preferential amplification, while longer repeat sequences would be more susceptible to environmental degradation and would amplify less well by
PCR.
Another forensic consideration is that the person's
medical privacy must be respected, so that forensic STRs are chosen which are non-coding, do not influence gene regulation, and are not usually trinucleotide STRs which could be involved in
triplet expansion diseases such as
Huntington's disease
. Forensic STR profiles are stored in DNA databanks such as the
UK National DNA Database (NDNAD), the American
CODIS or the Australian NCIDD.
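Comparing a crime-stain profile with a reference profile amounts to checking, locus by locus, whether the same pair of alleles is present. The sketch below illustrates that idea only. The function name `matching_loci` is hypothetical, the locus names are used as labels, the allele values are invented for the example, and no real databank's matching rules or statistics are implied.

```python
def matching_loci(profile_a, profile_b):
    """Compare two STR profiles given as {locus: (allele1, allele2)}.

    Returns the loci whose genotypes agree and the number of loci compared.
    Allele order within a pair is ignored.
    """
    shared = profile_a.keys() & profile_b.keys()
    same = sorted(
        locus for locus in shared
        if sorted(profile_a[locus]) == sorted(profile_b[locus])
    )
    return same, len(shared)

# Invented example genotypes (repeat counts); not real casework data
stain   = {"TH01": (6, 9.3), "FGA": (21, 24), "D8S1179": (13, 13)}
suspect = {"TH01": (9.3, 6), "FGA": (21, 24), "D8S1179": (12, 13)}

same, compared = matching_loci(stain, suspect)
print(f"{len(same)} of {compared} loci match: {same}")
```

In this invented case one locus disagrees, so the two profiles would not be reported as a full match.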
Kinship analysis (paternity testing)
Autosomal
microsatellites are widely used for
DNA profiling
in
kinship
analysis (most commonly in paternity testing). Paternally inherited
Y-STRs (microsatellites on the
Y chromosome
) are often used in
genealogical DNA testing.
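The logic behind paternity testing with autosomal microsatellites is that a child inherits one allele per locus from each parent, so a biological father must share at least one allele with the child at every locus (barring mutation). The following sketch shows only this basic exclusion check. The function name `excluded_loci` and the genotypes are hypothetical, the mother's genotype is ignored, and real analyses additionally weigh mutation rates and allele frequencies.

```python
def excluded_loci(child, alleged_father):
    """List loci at which the alleged father shares no allele with the child.

    Profiles are {locus: (allele1, allele2)}; only loci typed in both
    profiles are considered.
    """
    return [
        locus
        for locus, alleles in child.items()
        if locus in alleged_father
        and not set(alleles) & set(alleged_father[locus])
    ]

# Invented genotypes for illustration
child  = {"TH01": (6, 9.3), "FGA": (21, 24)}
father = {"TH01": (9.3, 7), "FGA": (20, 22)}

print(excluded_loci(child, father))   # loci inconsistent with paternity
```

Because microsatellites mutate comparatively often, a single inconsistent locus is usually interpreted with caution rather than as proof of exclusion.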
Genetic linkage analysis
During the 1990s and the first several years of this millennium, microsatellites were the workhorse genetic markers for genome-wide scans to locate any gene responsible for a given phenotype or disease, using
segregation observations across generations of a sampled pedigree. Although the rise of higher throughput and cost-effective
single-nucleotide polymorphism
(SNP) platforms led to the era of the SNP for genome scans, microsatellites remain highly informative measures of genomic variation for linkage and association studies. Their continued advantage lies in their greater allelic diversity than biallelic SNPs; microsatellites can therefore differentiate alleles within a SNP-defined linkage disequilibrium block of interest. In this way, microsatellites have led to the discovery of genes for type 2 diabetes (TCF7L2) and prostate cancer (the 8q21 region).
Population genetics

Microsatellites were popularized in
population genetics
during the 1990s because, as PCR became ubiquitous in laboratories, researchers were able to design primers and amplify sets of microsatellites at low cost. Their uses are wide-ranging. A microsatellite with a neutral evolutionary history is suitable for measuring or inferring
bottlenecks,
local adaptation, the allelic
fixation index
(FST),
population size, and
gene flow
. As
next generation sequencing becomes more affordable, the use of microsatellites has decreased; however, they remain a crucial tool in the field.
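Two of the quantities mentioned above, genetic diversity within a population and the fixation index between populations, can be computed directly from microsatellite allele counts. The sketch below uses the textbook definitions (expected heterozygosity as one minus the sum of squared allele frequencies, and the fixation index as the proportional reduction in heterozygosity within subpopulations). The function names are hypothetical, the two populations are assumed to be sampled equally, and no sample-size correction is applied, so this is not the estimator used by any particular software package.

```python
from collections import Counter

def expected_heterozygosity(alleles):
    """1 - sum(p_i^2) over allele frequencies at a single locus."""
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def fixation_index(pop_a, pop_b):
    """Simple F_ST for two equally sampled populations at one locus."""
    h_s = (expected_heterozygosity(pop_a) + expected_heterozygosity(pop_b)) / 2
    h_t = expected_heterozygosity(list(pop_a) + list(pop_b))
    return (h_t - h_s) / h_t if h_t else 0.0

# Alleles recorded as repeat counts (invented data)
print(fixation_index([10, 10, 10, 10], [12, 12, 12, 12]))   # fully differentiated
print(fixation_index([10, 12, 10, 12], [10, 12, 10, 12]))   # no differentiation
```

Values near 1 indicate populations that share almost no alleles at the locus; values near 0 indicate populations with similar allele frequencies.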
Plant breeding
Marker assisted selection or marker aided selection (MAS) is an indirect selection process where a
trait of interest is selected based on a
marker (
morphological,
biochemical
or
DNA
/
RNA
variation) linked to a trait of interest (e.g. productivity, disease resistance, stress tolerance, and quality), rather than on the trait itself. Microsatellites have been proposed to be used as such markers to assist plant breeding.
Analysis
Repetitive DNA is not easily analysed by
next generation DNA sequencing methods, because some technologies struggle with
homopolymeric tracts. A variety of software approaches have been created for the analysis of raw next-generation DNA sequencing reads to determine the genotype and variants at repetitive loci.
Microsatellites can be analysed and verified by established PCR amplification and amplicon size determination, sometimes followed by
Sanger DNA sequencing.
In forensics, the analysis is performed by extracting
nuclear DNA
from the cells of a sample of interest, then amplifying specific
polymorphic regions of the extracted DNA by means of the
polymerase chain reaction
. Once these sequences have been amplified, they are resolved either through
gel electrophoresis or
capillary electrophoresis, which allows the analyst to determine how many repeats of the microsatellite sequence in question are present. If the DNA was resolved by gel electrophoresis, it can be visualized by
silver staining (low sensitivity, safe, inexpensive), by an
intercalating dye such as
ethidium bromide (fairly sensitive, moderate health risks, inexpensive) or, as in most modern forensic laboratories, by
fluorescent dyes (highly sensitive, safe, expensive).
Instruments built to resolve microsatellite fragments by capillary electrophoresis also use fluorescent dyes.
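The step from fragment size to repeat number is simple arithmetic, and can be sketched as follows. This is a minimal illustration, not a description of any forensic software: the function name `repeat_count`, the locus, and the 120 bp flank length are hypothetical, and the sketch assumes a perfect (uninterrupted) repeat and an accurately sized fragment.

```python
def repeat_count(amplicon_bp: int, flank_bp: int, motif_bp: int) -> float:
    """Estimate the number of repeat units in a PCR amplicon.

    amplicon_bp: fragment size read from the gel or capillary trace
    flank_bp:    combined length of both primers plus any non-repeat
                 flanking sequence inside the amplicon (assumed known)
    motif_bp:    length of the repeat motif (e.g. 4 for a tetranucleotide)
    """
    if motif_bp <= 0:
        raise ValueError("motif_bp must be positive")
    if amplicon_bp < flank_bp:
        raise ValueError("amplicon is shorter than its flanking sequence")
    return (amplicon_bp - flank_bp) / motif_bp


# Hypothetical tetranucleotide locus with 120 bp of flanking sequence.
for size in (160, 172):
    print(size, "bp ->", repeat_count(size, 120, 4), "repeats")
```

A non-integer result would suggest either a sizing error or an allele containing a partial repeat, so the value is deliberately returned as a float rather than rounded.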
Forensic profiles are stored in major databanks. The
British
database for microsatellite loci identification was originally based on the British
SGM+ system using 10 loci and a
sex marker. The Americans increased this number to 13 loci.
The Australian database is called the NCIDD, and since 2013 it has been using 18 core markers for DNA profiling.
Amplification
Microsatellites can be amplified for identification by the
polymerase chain reaction
(PCR) process, using the unique sequences of flanking regions as
primers. DNA is repeatedly denatured at a high temperature to separate the double strand, then cooled to allow
annealing of primers and the extension of nucleotide sequences through the microsatellite. This process results in production of enough DNA to be visible on
agarose or
polyacrylamide gels; only small amounts of DNA are needed for amplification because thermocycling creates an exponential increase in the replicated segment.
With the abundance of PCR technology, primers that flank microsatellite loci are simple and quick to use, but the development of correctly functioning primers is often a tedious and costly process.
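The exponential increase produced by thermocycling can be illustrated with a short calculation. The sketch below assumes an idealised model in which each cycle multiplies the number of target copies by (1 + efficiency); the function name `copies_after` and the `efficiency` parameter are illustrative assumptions, and real reactions plateau as reagents are depleted.

```python
def copies_after(cycles: int, start_copies: float = 1.0,
                 efficiency: float = 1.0) -> float:
    """Idealised number of target copies after a number of PCR cycles.

    efficiency = 1.0 means every template is copied in every cycle
    (perfect doubling); lower values model incomplete amplification.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


# Perfect doubling from a single template: 2**30 copies after 30 cycles.
print(copies_after(30))
# The same run at a hypothetical 90% per-cycle efficiency.
print(copies_after(30, efficiency=0.9))
```

Even with imperfect efficiency the growth stays exponential, which is why a very small amount of starting DNA is sufficient.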
Design of microsatellite primers
If searching for microsatellite markers in specific regions of a genome, for example within a particular
intron
, primers can be designed manually. This involves searching the genomic DNA sequence for microsatellite repeats, which can be done by eye or by using automated tools such as
RepeatMasker. Once the potentially useful microsatellites are determined, the flanking sequences can be used to design
oligonucleotide
primers which will amplify the specific microsatellite repeat by PCR.
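The search for repeats and their flanking sequences can be sketched with a regular expression. This is a minimal illustration that finds perfect repeats only; it is not how RepeatMasker or any other dedicated tool works. The function name `find_microsatellites`, its thresholds, and the example sequence are all hypothetical.

```python
import re


def find_microsatellites(seq, min_motif=1, max_motif=6, min_repeats=5, flank=20):
    """Return perfect tandem repeats in seq together with their flanks.

    The motif group is lazy, so the shortest motif that satisfies the
    repeat threshold at a given start position is the one reported.
    """
    seq = seq.upper()
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        hits.append({
            "motif": motif,
            "repeats": len(match.group(0)) // len(motif),
            "start": match.start(),
            "end": match.end(),
            # Candidate regions from which primers could be chosen.
            "left_flank": seq[max(0, match.start() - flank):match.start()],
            "right_flank": seq[match.end():match.end() + flank],
        })
    return hits


# Made-up sequence: a (CA)8 repeat between two arbitrary flanks.
example = "GATTACAGGCTTAAGT" + "CA" * 8 + "TTGACCGTAGGCATCA"
for hit in find_microsatellites(example, flank=10):
    print(hit)
```

The flanks returned here are only raw candidate regions; choosing actual primers would still require checking melting temperature, uniqueness in the genome, and similar criteria that this sketch does not attempt.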
Random microsatellite primers can be developed by
cloning
random segments of DNA from the focal species. These random segments are inserted into a
plasmid
or
bacteriophage
vector
, which is in turn implanted into ''
Escherichia coli
'' bacteria. Colonies are then grown and screened with fluorescently labelled oligonucleotide sequences that will hybridize to a microsatellite repeat, if one is present on the DNA segment. If positive clones can be obtained from this procedure, the DNA is sequenced and PCR primers are chosen from sequences flanking such regions to determine a specific
locus. This process involves significant trial and error on the part of researchers, as microsatellite repeat sequences must be predicted and primers that are randomly isolated may not display significant polymorphism.
Microsatellite loci are widely distributed throughout the genome and can be isolated from semi-degraded DNA of older specimens, as all that is needed is a suitable substrate for amplification through PCR.
More recent techniques involve using oligonucleotide sequences consisting of repeats complementary to repeats in the microsatellite to "enrich" the DNA extracted (
microsatellite enrichment). The oligonucleotide probe hybridizes with the repeat in the microsatellite, and the probe/microsatellite complex is then pulled out of solution. The enriched DNA is then cloned as normal, but the proportion of successes is now much higher, drastically reducing the time required to develop the regions for use. However, choosing which probes to use can be a trial-and-error process in itself.
ISSR-PCR
ISSR (for inter-simple sequence repeat) is a general term for a genome region between microsatellite loci. The complementary sequences to two neighboring microsatellites are used as PCR primers; the variable region between them gets amplified. The limited length of amplification cycles during PCR prevents excessive replication of overly long contiguous DNA sequences, so the result will be a mix of a variety of amplified DNA strands which are generally short but vary much in length.
Sequences amplified by ISSR-PCR can be used for DNA fingerprinting. Since an ISSR may be a conserved or nonconserved region, this technique is not useful for distinguishing individuals, but rather for
phylogeography analyses or possibly for delimiting
species
; sequence diversity is lower than in SSR-PCR, but still higher than in actual gene sequences. In addition, microsatellite sequencing and ISSR sequencing are mutually assisting, as one produces primers for the other.
Limitations
Repetitive DNA is not easily analysed by
next generation DNA sequencing methods, which struggle with homopolymeric tracts.
Therefore, microsatellites are normally analysed by conventional PCR amplification and amplicon size determination. The use of PCR means that microsatellite length analysis is prone to PCR limitations like any other PCR-amplified DNA locus. A particular concern is the occurrence of '
null alleles':
* Occasionally, within a sample of individuals such as in paternity testing casework, a mutation in the DNA flanking the microsatellite can prevent the PCR primer from binding and producing an amplicon (creating a "null allele" in a gel assay). Only one allele is then amplified (from the non-mutated homologous chromosome), and the individual may falsely appear to be homozygous. This can cause confusion in paternity casework, and it may be necessary to amplify the microsatellite using a different set of primers. Null alleles are caused especially by mutations in the sequence bound by the 3' end of the primer, where extension commences.
* In species or population analysis, for example in conservation work, PCR primers which amplify microsatellites in one individual or species can work in other species. However, the risk of applying PCR primers across different species is that null alleles become likely, whenever sequence divergence is too great for the primers to bind. The species may then artificially appear to have a reduced diversity. Null alleles in this case can sometimes be indicated by an excessive frequency of homozygotes causing deviations from Hardy-Weinberg equilibrium expectations.
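
The homozygote excess mentioned above can be quantified by comparing observed heterozygosity with the value expected under Hardy-Weinberg equilibrium. The sketch below is one simple way to do this for a single locus; the function name `heterozygosity` and the genotype data are made up for illustration, and the inbreeding coefficient computed here (1 minus observed over expected heterozygosity) is a standard population-genetic statistic that is not discussed in the text above.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Observed and expected heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual,
               where alleles are e.g. repeat counts.
    Returns (observed, expected, f), where f = 1 - observed / expected.
    A clearly positive f indicates an excess of homozygotes.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total_alleles = 2 * n
    # Expected heterozygosity under Hardy-Weinberg: 1 - sum of p_i squared.
    expected = 1.0 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    observed = sum(1 for a, b in genotypes if a != b) / n
    f = 1.0 - observed / expected if expected > 0 else 0.0
    return observed, expected, f


# Hypothetical sample: two alleles (10 and 12 repeats), few heterozygotes.
sample = [(10, 10)] * 4 + [(12, 12)] * 4 + [(10, 12)] * 2
observed, expected, f = heterozygosity(sample)
print(f"observed={observed:.2f} expected={expected:.2f} f={f:.2f}")
```

A homozygote excess is only suggestive: inbreeding or hidden population substructure produce the same signal, so a positive value does not by itself demonstrate null alleles.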
External links
* All known disease-causing short tandem repeats
* MicroSatellite DataBase
* Search tools:
** IMEx
** Imperfect SSR Finder: find perfect or imperfect SSRs in FASTA sequences
** JSTRING: Java Search for Tandem Repeats In Genomes
** Microsatellite repeats finder
** MREPATT
** Mreps
** A tandem repeat search tool for perfect and imperfect repeats; the maximum pattern size depends only on computational power
** Poly
** SSR Finder
** STAR
** SERF De Novo Genome Analysis and Tandem Repeats Finder
** TRED
** TROLL
** Zebrafish Repeats