Transcriptome Analysis
A transcriptome is the sum of all RNA transcripts present in a given cell, tissue, or organ of an organism. Transcriptomes include both mRNA, which serves as the intermediate in the central dogma, and noncoding RNAs, which may play other roles in protein synthesis. The central dogma describes how DNA gives rise to proteins through transcription and translation. RNAs are present in a cell in varied concentrations, play various roles outside of the central dogma, and can be classified by length and function. It is through functional elements that the transcriptional and translational activities of genes are regulated. Transcriptome analysis provides information about all RNAs present and can offer valuable insight into tissue-specific genetic mechanisms.
The transcriptome was first investigated in the 1990s in an experiment performed to identify a partial transcriptome of the human brain, in which researchers identified 609 mRNA sequences. Since then, many advances in next-generation sequencing methods have been made. Transcriptomes can now be generated routinely thanks to these advances and to technologies such as microarrays and RNA-Seq. Both methods require computational imaging as well as high read counts and statistical analysis.
Information about gene expression obtained through mRNAs has found many applications. Transcriptome analysis has proven useful for identifying disease processes and the regulatory elements involved in disease progression, has aided drug development through the identification of disease processes, offers insight into therapeutic strategies, and has improved the identification of genes that respond to biotic and abiotic environmental factors, as well as the understanding of how environmental conditions affect gene expression.
Methods
Advantages
The main advantage of this methodology is the insight it gives researchers into the function of genes and the association between gene function and gene expression. A TWAS can take results from a GWAS and extend them to aid in the understanding of disease mechanisms. Additionally, because the method uses loci previously identified by GWAS analysis, a TWAS carries a lower testing burden, as fewer sites are analyzed. Analyzing fewer loci allows each site to be studied in more depth and can give further insight into the functions and associations of the significant loci.
TWAS also has the advantage of reducing the effects of confounding factors. The predictive model considers only the genetic component of expression, not total expression. Total expression also reflects factors such as the environment and epigenetic modifications, which the predictive model does not account for. Leaving these factors out can reduce the accuracy of the predicted expression levels; however, it also reduces the effects of confounding variables on the results.
Another advantage of TWAS is that the results are tissue specific. The level of gene expression differs by tissue, as each tissue has specific splicing patterns and patterns of regulation. Tissue-specific results can show how gene regulation differs between tissue types, how functions are regulated, and whether regulatory mechanisms are shared between tissues or serve different functions in different tissues. Cross-tissue TWAS methods can also identify potential causal genes for diseases and traits on a larger scale, whereas single-tissue methods can determine associations on a case-specific basis.
Limitations
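Most of the limitations in this section concern the step in which expression is predicted from genotype. As a rough illustration of that step, the following Python sketch selects the SNPs within 1 megabase of a gene and computes a predicted expression value as a weighted sum of allele dosages. The SNP names, positions, weights, and helper functions (`cis_snps`, `predict_expression`) are hypothetical and chosen only for illustration; real TWAS tools fit the weights with statistical models trained on a reference panel.

```python
# Illustrative sketch only: positions, weights, and genotypes are made up.
CIS_WINDOW_BP = 1_000_000  # 1 megabase cis range


def cis_snps(snp_positions, gene_start, gene_end, window=CIS_WINDOW_BP):
    """Return the ids of SNPs lying within `window` base pairs of the gene."""
    lower, upper = gene_start - window, gene_end + window
    return [snp for snp, pos in snp_positions.items() if lower <= pos <= upper]


def predict_expression(dosages, weights):
    """Predicted expression as a weighted sum of allele dosages (0, 1, or 2)."""
    return sum(weight * dosages[snp] for snp, weight in weights.items())


# Hypothetical SNP positions (base pairs) and gene coordinates on one chromosome.
snp_positions = {"snp_a": 4_900_000, "snp_b": 5_400_000, "snp_c": 9_000_000}
gene_start, gene_end = 5_000_000, 5_050_000

usable = cis_snps(snp_positions, gene_start, gene_end)

# Hypothetical weights, as if learned from a reference panel.
# snp_c is a trans variant here, so it never receives a weight.
weights = {"snp_a": 0.5, "snp_b": -0.25}
weights = {snp: w for snp, w in weights.items() if snp in usable}

# One individual's allele dosages at each SNP.
person = {"snp_a": 2, "snp_b": 1, "snp_c": 2}
predicted = predict_expression(person, weights)

print(usable, predicted)
```

In this toy example `snp_c` lies about 4 megabases from the gene, so it is ignored even if it truly influences expression, which is the kind of omission the cis-only limitation refers to.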
Many of the disadvantages of TWAS follow from the limits of the model used to predict gene expression levels from genotypes. One disadvantage is that TWAS mainly uses cis-genetic components for imputation and, in most studies, does not identify any trans-genetic variants. Trans-genetic variants are regulatory variants located more than 1 megabase from the gene; even at that distance, many regulatory mechanisms can act over a long range and still affect expression. Leaving these components out lowers the accuracy of the predicted expression levels and can cause deviation between expected and observed expression levels. As mentioned above, another disadvantage is that the predictive model does not take environmental and epigenetic regulation of gene expression into account, which can also lead to discrepancies between predicted and observed expression levels.
Another challenge for TWAS is that expression levels are hard to predict accurately for genes with low heritability. eQTLs rely on a degree of heritability, and when heritability is low, the rate of false positives can be affected and the predictive ability of the model used for TWAS can suffer. Additionally, as with GWAS, these studies can only demonstrate associations. Even when a statistically significant association is seen between the gene or locus of interest and the trait or disease, no causal relationship can be inferred. Establishing a causal relationship requires further studies.
Applications
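The studies summarized in this section report genes whose associations pass a multiple-testing threshold, such as the Bonferroni correction used in the breast cancer study. A minimal Python sketch of that correction follows. The significance level of 0.05 is an assumption (the level used is not stated here), and the gene names and p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def significant_genes(p_values, alpha=0.05):
    """Return genes whose p-value falls below the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sorted(gene for gene, p in p_values.items() if p < threshold)


# Hypothetical p-values for four tested genes (not taken from any study).
p_values = {"gene_1": 1e-9, "gene_2": 0.004, "gene_3": 0.03, "gene_4": 2e-4}

# With four tests the threshold is 0.05 / 4 = 0.0125.
hits = significant_genes(p_values)
print(hits)

# With 8,597 genes tested, an assumed alpha of 0.05 gives roughly 5.8e-6.
print(bonferroni_threshold(0.05, 8597))
```

Because the threshold shrinks as the number of tests grows, testing thousands of genes rather than millions of individual variants leaves a less stringent per-test cutoff, which is the lower testing burden noted under Advantages.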
Schizophrenia
A TWAS was performed following a GWAS investigating loci associated with schizophrenia. The GWAS located over 100 risk loci. The TWAS then identified 157 significant loci using expression data, 35 of which did not align with the GWAS loci. Results were further narrowed using regulatory target investigations. Of these genes, 42 were found to have a statistically significant association with chromatin phenotypes, a regulatory mechanism that could be investigated further. ''MAPK3'' was one association observed to have a large impact on neurodevelopmental phenotypes, and it was further prioritized as a candidate causal gene.
Breast Cancer
In 2018, a TWAS was used to identify candidate causal genes for breast cancer. Data were drawn from The Cancer Genome Atlas, to establish genetic models, and from 229,000 women of European ancestry. In this study, 8,597 genes were evaluated. GWAS had previously associated around 170 loci with at least one variant of breast cancer. In this study, 179 genes were found to have an association with a variant of breast cancer. Of these 179 genes, 48 were statistically significant at a Bonferroni-corrected threshold (as seen on the Manhattan plot above). Of these 48, 14 had never previously been reported to be associated with a risk of breast cancer. Of the other 34 genes, which lie at known risk loci, 23 do not have any associated risk SNPs. Gene knockdowns were then applied to 13 genes whose high predicted expression was associated with an increased risk. Knockdown of 11 of the genes investigated had an effect in a breast cancer cell line, especially in 184A1 normal breast cells. These genes include ''PIDD1'', ''NRBF2'', and ''ABHD8''. All of the genes identified in the study, both up- and down-regulated, had relatively high ''cis''-heritability.
Parkinson's Disease
A TWAS completed in 2021 used the most recent Parkinson's disease (PD) GWAS available at the time, which included 480,000 individuals. From those results, 18 genes were found to have a statistically significant association with PD. The most significant of these was ''LRRC37A2'', which was found to be associated with PD in all 13 brain tissues.
TWAS Atlas
TWAS Atlas