Disease gene identification is a process by which scientists identify the mutant genotypes responsible for an inherited genetic disorder. Mutations in these genes can include single nucleotide substitutions, single nucleotide additions/deletions, deletion of the entire gene, and other genetic abnormalities.


Significance

Knowing which genes, when non-functional, cause which disorders simplifies the diagnosis of patients and provides insight into the functional characteristics of the mutation. The advent of modern high-throughput sequencing technologies, combined with insights from the growing field of genomics, is making disease gene identification faster and allowing scientists to identify more complex mutations.


Generic gene identification procedure

Disease gene identification techniques often follow the same overall procedure. DNA is first collected from several patients who are believed to have the same genetic disease. Their DNA samples are then analyzed and screened to determine probable regions where the mutation could reside, using the techniques described in the sections below. These probable regions are then aligned with one another, and the overlapping region should contain the mutant gene. If enough of the genome sequence is known, that region is searched for candidate genes. Coding regions of these genes are then sequenced until a mutation is discovered or another patient is discovered, in which case the analysis can be repeated, potentially narrowing down the region of interest. Most disease gene identification procedures differ in the second step, where DNA samples are analyzed and screened to determine regions in which the mutation could reside.
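
The overlap step in this procedure can be sketched as an interval intersection. This is a minimal illustration, not a published pipeline: the `Region` type, the `shared_region` function, and the coordinates are hypothetical, and all regions are assumed to lie on the same chromosome.

```python
from typing import List, Optional, Tuple

# (start, end) in base pairs on one chromosome; end is exclusive (an assumption).
Region = Tuple[int, int]


def shared_region(regions: List[Region]) -> Optional[Region]:
    """Return the interval common to every patient's candidate region, or None."""
    if not regions:
        return None
    start = max(r[0] for r in regions)  # latest start among all regions
    end = min(r[1] for r in regions)    # earliest end among all regions
    return (start, end) if start < end else None


# Hypothetical candidate regions for three patients.
patients = [(1_000_000, 5_000_000), (2_500_000, 6_000_000), (2_000_000, 4_000_000)]
print(shared_region(patients))  # (2500000, 4000000)
```

Adding another patient's region to the list can only keep the shared interval the same or shrink it, which mirrors how each newly discovered patient may narrow the region of interest.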


Pre-genomics techniques

Without the aid of whole-genome sequences, pre-genomics investigations looked at select regions of the genome, often with only minimal knowledge of the gene sequences they were looking at. Genetic techniques capable of providing this sort of information include restriction fragment length polymorphism (RFLP) analysis and microsatellite analysis.


Loss of heterozygosity (LOH)

Loss of heterozygosity (LOH) is a technique that can only be used to compare two samples from the same individual. LOH analysis is often used when identifying cancer-causing oncogenes: one sample consists of (mutant) tumor DNA and the other (control) sample consists of genomic DNA from non-cancerous cells from the same individual. RFLPs and microsatellite markers provide patterns of DNA polymorphisms, which can be interpreted as residing in a heterozygous region or a homozygous region of the genome. Provided that all individuals are affected by the same disease, resulting from the deletion of a single copy of the same gene, all individuals will have one region where their control sample is heterozygous but the mutant sample is homozygous. This region will contain the disease gene.
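
The comparison described above can be sketched as follows. This is an illustrative example only: the marker names (`M1` to `M4`), the genotype encoding as allele pairs, and the function names are assumptions, not part of any specific LOH protocol.

```python
from typing import Dict, List, Set, Tuple

Genotype = Tuple[str, str]            # the two alleles observed at one marker
Sample = Dict[str, Genotype]          # marker name -> genotype


def is_heterozygous(g: Genotype) -> bool:
    return g[0] != g[1]


def loh_markers(control: Sample, tumor: Sample) -> Set[str]:
    """Markers that are heterozygous in the control but homozygous in the tumor."""
    return {
        marker
        for marker, g in control.items()
        if marker in tumor and is_heterozygous(g) and not is_heterozygous(tumor[marker])
    }


def shared_loh(pairs: List[Tuple[Sample, Sample]]) -> Set[str]:
    """Markers showing LOH in every individual's (control, tumor) pair."""
    per_individual = [loh_markers(control, tumor) for control, tumor in pairs]
    return set.intersection(*per_individual) if per_individual else set()


# Hypothetical data for two individuals.
control_1 = {"M1": ("A", "G"), "M2": ("C", "T"), "M3": ("A", "A"), "M4": ("G", "T")}
tumor_1 = {"M1": ("A", "G"), "M2": ("C", "C"), "M3": ("A", "A"), "M4": ("G", "G")}
control_2 = {"M1": ("A", "A"), "M2": ("C", "T"), "M3": ("A", "G"), "M4": ("G", "G")}
tumor_2 = {"M1": ("A", "A"), "M2": ("T", "T"), "M3": ("A", "G"), "M4": ("G", "G")}

print(shared_loh([(control_1, tumor_1), (control_2, tumor_2)]))  # {'M2'}
```

Markers that are already homozygous in the control sample (such as `M3` in the first individual) are uninformative, which is why only heterozygous control markers are considered.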


Post-genomics techniques

With the advent of modern laboratory techniques such as high-throughput sequencing and software capable of genome-wide analysis, sequence acquisition has become progressively cheaper and faster, making disease gene identification techniques more efficient.


Identity by descent mapping

Identity by descent (IBD) mapping generally uses single nucleotide polymorphism (SNP) arrays to survey known polymorphic sites throughout the genome of affected individuals and their parents and/or siblings, both affected and unaffected. While these SNPs probably do not cause the disease, they provide valuable insight into the makeup of the genomes in question. A region of the genome is considered identical by descent if contiguous SNPs share the same genotype. When an affected individual is compared to an affected sibling, all identical regions are recorded (e.g., shaded in red in the figure above). Given that an affected sibling and an unaffected sibling do not have the same disease phenotype, their DNA must by definition be different at the disease locus (barring the presence of a genetic or environmental modifier). Thus, the IBD mapping results can be further refined by removing any regions that are identical in both affected individuals and unaffected siblings. This is then repeated for multiple families, thus generating a small, overlapping fragment, which theoretically contains the disease gene.
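
A rough sketch of this comparison on SNP array data, under stated assumptions: genotypes are encoded as two-letter strings in array order, a shared region is approximated as a run of at least `min_len` contiguous identical genotypes (the threshold is an illustrative choice, not a standard), and the function names are hypothetical.

```python
from typing import List, Set, Tuple


def shared_runs(a: List[str], b: List[str], min_len: int = 3) -> List[Tuple[int, int]]:
    """Half-open index ranges where >= min_len contiguous SNPs have identical genotypes."""
    runs, start = [], None
    n = min(len(a), len(b))
    for i in range(n + 1):
        same = i < n and a[i] == b[i]
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


def covered(runs: List[Tuple[int, int]]) -> Set[int]:
    """All SNP indices covered by a list of half-open runs."""
    return {i for s, e in runs for i in range(s, e)}


def candidate_snps(affected_1, affected_2, unaffected, min_len: int = 3) -> List[int]:
    """SNP indices shared by the affected siblings but not with the unaffected sibling."""
    keep = covered(shared_runs(affected_1, affected_2, min_len))
    drop = covered(shared_runs(affected_1, unaffected, min_len))
    return sorted(keep - drop)


# Hypothetical genotypes at eight consecutive SNPs.
aff_1 = ["AA", "AG", "GG", "AA", "AG", "GG", "AA", "AG"]
aff_2 = ["AA", "AG", "GG", "AA", "AG", "GG", "GG", "AA"]
unaff = ["AA", "AG", "GG", "AG", "GG", "AA", "AA", "AG"]

print(candidate_snps(aff_1, aff_2, unaff))  # [3, 4, 5]
```

Repeating this for several families and intersecting the resulting index sets corresponds to the final narrowing step described above.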


Homozygosity/autozygosity mapping

Homozygosity/autozygosity mapping is a powerful technique, but is only valid when searching for a mutation segregating within a small, closed population. Such a small population, possibly created by the founder effect, will have a limited gene pool, and thus any inherited disease will probably be a result of two copies of the same mutation segregating on the same haplotype. Since affected individuals will probably be homozygous in the region containing the mutation, SNP genotypes are an adequate marker of regions of homozygosity and heterozygosity. Modern SNP arrays are used to survey the genome and identify large regions of homozygosity. Homozygous blocks in the genomes of affected individuals can then be laid on top of each other, and the overlapping region should contain the disease gene. This analysis is often extended by analyzing autozygosity, an extension of homozygosity, in the genomes of affected individuals. This can be accomplished by plotting a cumulative LOD score alongside the overlaid blocks of homozygosity. By taking into consideration the population allele frequencies for all SNPs via autozygosity mapping, the results of homozygosity mapping can be confirmed. Furthermore, if two suspicious regions appear as a result of homozygosity mapping, autozygosity mapping may be able to distinguish between the two (for example, if one block of homozygosity results from a region of the genome with very little diversity, its LOD score will be very low).

Tools for homozygosity mapping:

* HomSI (a homozygous stretch identifier from next-generation sequencing data): a tool that identifies homozygous regions using deep sequence data.
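
The overlay of homozygous blocks and the role of allele frequencies can be illustrated with a small sketch. It rests on simplifying assumptions that are not from the text above: each homozygous SNP contributes log10(1/p) to the cumulative score, where p is the population frequency of the observed allele; genotyping error and linkage between SNPs are ignored; and `min_len` is an arbitrary threshold. Real tools use more careful models.

```python
import math
from typing import Dict, List, Tuple


def homozygous_runs(genotypes: List[str], min_len: int = 3) -> List[Tuple[int, int]]:
    """Half-open SNP index ranges of >= min_len contiguous homozygous genotypes."""
    runs, start = [], None
    for i in range(len(genotypes) + 1):
        hom = i < len(genotypes) and genotypes[i][0] == genotypes[i][1]
        if hom and start is None:
            start = i
        elif not hom and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


def shared_homozygous_snps(individuals: List[List[str]], min_len: int = 3) -> List[int]:
    """SNP indices lying inside a homozygous run in every affected individual."""
    covered = [
        {i for s, e in homozygous_runs(g, min_len) for i in range(s, e)}
        for g in individuals
    ]
    return sorted(set.intersection(*covered)) if covered else []


def cumulative_lod(genotypes: List[str], freqs: List[Dict[str, float]], snps: List[int]) -> float:
    """Sum of log10(1/p) over the given homozygous SNPs (simplified, assumed model)."""
    return sum(-math.log10(freqs[i][genotypes[i][0]]) for i in snps)


# Hypothetical genotypes for two affected individuals at six SNPs.
ind_1 = ["AA", "GG", "CC", "TT", "AG", "CC"]
ind_2 = ["AG", "GG", "CC", "TT", "AA", "CC"]
individuals = [ind_1, ind_2]
block = shared_homozygous_snps(individuals)
print(block)  # [1, 2, 3]

# Hypothetical population frequencies of the observed alleles, per SNP.
rare = [{"A": 0.1}, {"G": 0.1}, {"C": 0.1}, {"T": 0.1}, {"A": 0.1}, {"C": 0.1}]
common = [{"A": 0.99}, {"G": 0.99}, {"C": 0.99}, {"T": 0.99}, {"A": 0.99}, {"C": 0.99}]
print(cumulative_lod(ind_1, rare, block))    # high score: homozygosity is informative
print(cumulative_lod(ind_1, common, block))  # near zero: low-diversity region
```

Under this toy model the same block scores high when the observed alleles are rare and close to zero when they are nearly fixed in the population, which is the distinction autozygosity mapping is described as drawing.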


Genome-wide knockdown studies

Genome-wide knockdown studies are an example of the reverse genetics made possible by the acquisition of whole genome sequences and the advent of genomics and gene-silencing technologies, mainly siRNA and deletion mapping. Genome-wide knockdown studies involve the systematic knockdown or deletion of genes or segments of the genome. This is generally done in prokaryotes or in a tissue culture environment due to the massive number of knockdowns that must be performed. After the systematic knockdown/knockout is completed (and possibly confirmed by mRNA expression analysis), its phenotypic results can be observed. Observation parameters can be selected to target a highly specific phenotype. The resulting dataset is then queried for samples that exhibit phenotypes matching the disease in question; the genes knocked down or out in those samples can then be considered candidate disease genes for the individual in question.
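
The final query step can be sketched as a simple filter over the screen results. Everything specific here is hypothetical: the gene names, the phenotype parameters (`growth_rate`, `cell_size`), and the idea of describing the disease phenotype as an acceptable range per parameter are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Phenotype = Dict[str, float]                 # measured parameter -> value
Profile = Dict[str, Tuple[float, float]]     # parameter -> (low, high) range


def matching_knockdowns(screen: Dict[str, Phenotype], disease_profile: Profile) -> List[str]:
    """Genes whose knockdown phenotype falls inside every range of the disease profile."""
    hits = []
    for gene, phenotype in screen.items():
        if all(
            name in phenotype and low <= phenotype[name] <= high
            for name, (low, high) in disease_profile.items()
        ):
            hits.append(gene)
    return sorted(hits)


# Hypothetical screen results, keyed by the gene that was knocked down.
screen = {
    "geneA": {"growth_rate": 0.35, "cell_size": 1.8},
    "geneB": {"growth_rate": 0.95, "cell_size": 1.0},
    "geneC": {"growth_rate": 0.40, "cell_size": 1.1},
}
disease_profile = {"growth_rate": (0.0, 0.5), "cell_size": (1.5, 2.5)}

print(matching_knockdowns(screen, disease_profile))  # ['geneA']
```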


Whole exome sequencing

Whole exome sequencing is a brute-force approach that uses modern sequencing technology and DNA sequence assembly tools to piece together all coding portions of the genome. The sequence is then compared to a reference genome and any differences are noted. After filtering out all known benign polymorphisms, synonymous changes, and intronic changes (that do not affect splice sites), only potentially pathogenic variants will be left. This technique can be combined with other techniques to further exclude potentially pathogenic variants should more than one be identified.
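
The filtering step can be sketched as follows, assuming the differences from the reference genome have already been annotated. The `Variant` record, its field names, and the consequence labels are hypothetical; real annotation tools use their own vocabularies.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Variant:
    gene: str
    consequence: str                  # e.g. "missense", "synonymous", "intronic", "nonsense"
    known_benign: bool                # listed as a benign polymorphism in some catalogue
    affects_splice_site: bool = False


def potentially_pathogenic(variants: List[Variant]) -> List[Variant]:
    """Drop known benign polymorphisms, synonymous changes, and non-splice intronic changes."""
    kept = []
    for v in variants:
        if v.known_benign:
            continue
        if v.consequence == "synonymous":
            continue
        if v.consequence == "intronic" and not v.affects_splice_site:
            continue
        kept.append(v)
    return kept


# Hypothetical annotated differences from the reference genome.
variants = [
    Variant("geneA", "missense", known_benign=True),
    Variant("geneB", "synonymous", known_benign=False),
    Variant("geneC", "intronic", known_benign=False),
    Variant("geneD", "intronic", known_benign=False, affects_splice_site=True),
    Variant("geneE", "nonsense", known_benign=False),
]

print([v.gene for v in potentially_pathogenic(variants)])  # ['geneD', 'geneE']
```

When more than one variant survives, as in this example, the remaining candidates could be intersected with a region obtained from one of the mapping techniques above.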


See also

* Gene Disease Database
* Gene identification
* Haplotype tagging

