Background
Strand-seq (single-cell and single-strand sequencing) was one of the first single-cell sequencing protocols described in 2012. This genomic technique selectively sequencings the parental template strands in single daughter cells DNA libraries. As a proof of concept study, the authors demonstrated the ability to acquire sequence information from the Watson and/or Crick chromosomal strands in an individual DNA library, depending on the mode of chromatid segregation; a typical DNA library will always contain DNA from both strands. The authors were specifically interested in showing the utility of strand-seq in detecting sister chromatid exchanges (SCEs) at high-resolution. They successfully identified eight putative SCEs in the murine (mouse) embryonic stem (meS) cell line with resolution up to 23 bp. This methodology has also been shown to hold great utility in discerning patterns of non-random chromatid segregation, especially in stem cell lineages. Furthermore, SCEs have been implicated as diagnostic indicators of genome stress, information that has utility in cancer biology. Most research on this topic involves observing the assortment of chromosomal template strands through many cell development cycles and correlating non-random assortment with particular cell fates. Single-cell sequencing protocols were foundational in the development of this technique, but they differ in several aspects.Methodology
Similar methods
Past methods have been used to track the inheritance patterns of chromatids on a per-strand basis and elucidate the process of non-random segregation:Pulse-chase
Chromosome-orientation fluorescence in situ hybridization (CO-FISH)
CO-FISH, or strand-specific fluorescence ''in situ'' hybridization, facilitates strand-specific targeting of DNA with fluorescently-tagged probes. It exploits the uniform orientation of majorWet lab protocols
Cells of interest are cultured either in vivo or in vitro. During S-phase cells are treated with bromodeoxyuridine (BrdU) which is then incorporated into their nascent DNA, acting as a substitute for thymidine. After at least one replication event has occurred, the daughter cells are synchronized at the G2 phase and individually separated by fluorescence-activated cell sorting (FACS). The cells are directly sorted into lysis buffer and their DNA is extracted. Having been arrested at a specified number of generations (usually one), the inheritance patterns of sister chromatids can be assessed. The following methods concentrate on the DNA sequencing of a single daughter cell's DNA. At this point the chromosomes are composed of nascent strands with BrdU in place of thymidine and the original template strands are primed for DNA sequencing library preparation. Since this protocol was published in 2012, the canonical methodology is only well described for Illumina sequencing platforms; the protocol could very easily be adapted for other sequencing platforms, depending on the application. Next, the DNA is incubated with a special dye such that when the BrdU-dye complex is excited by UV light, nascent strands are nicked by photolysis. This process inhibitsBioinformatic processing
Limitations
Strand-seq requires cells undergoing cell division for BrdU labeling, and thus is not applicable to formalin-fixed specimens or non-dividing cells. But it may be applied to normal mitotic cells and tissues, organoids, as well as leukemia and tumor samples using fresh or frozen primary specimens. Strand-seq is using Illumina sequencing, and applications that require sequence information from different sequencing technologies require new protocols, or alternatively integration of data generated using distinct sequencing platforms as recently show-cased. Authors from the initial papers describing Strand-seq showed that they were able to attain a 23bp resolution for mapping SCEs, and other large chromosomal abnormalities are likely to share this mapping resolution (if breakpoint fine-mapping is performed). Resolution, however, is dependent on a combination of the sequencing platform used, library preparation protocols, and the number of cells analysed as well as the depth of sequencing per cell. However, it would be sensical for precision to further increase with sequencing technologies that don't incur errors in homopolymeric repeats.Applications and utility
Identifying sister chromatid exchanges
Strand-seq was initially proposed as a tool to identify sister chromatid exchanges. Being a process that is localized to individual cells, DNA sequencing of more than one cell would naturally scatter these effects and suggest an absence of SCE events. Moreover, classic single cell sequencing techniques are unable to show these events due to heterogeneous amplification biases and dual-strand sequence information, thereby necessitating Strand-seq. Using the reference alignment information, researchers can identify an SCE if the directionality of an inherited template strand changes.Identifying misoriented contigs
Misoriented contigs are present in reference genomes at significant rates (ex. 1% in the mouse reference genome). Strand-seq, in contrast to conventional sequencing methods, can detect these misorientations. Misoriented contigs are present where strand inheritance changes from one homozygous state to the other (ex. WW to CC, or CC to WW). Moreover, this state change is visible in every Strand-seq library, reinforcing the presence of a misoriented contig.Identifying non-random segregation of sister chromatids
Prior to the 1960s, it was assumed that sister chromatids were segregated randomly into daughter cells. However, non-random segregation of sister chromatids has been observed in mammalian cells ever since. There have been a few hypotheses proposed to explain the non-random segregation, including the Immortal Strand Hypothesis and the Silent Sister Hypothesis, one of which may hopefully be verified by methods involving Strand-seq. ‘’Immortal Strand Hypothesis’’ Mutations occur every time a cell divides. Certain long-lived cells (ex. stem cells) may be particularly affected by these mutations. The Immortal Strand Hypothesis proposes that these cells avoid mutation accumulation by consistently retaining parental template strands For this hypothesis to be true, sister chromatids from each and every chromosome must segregate in a non-random fashion. Additionally, one cell will retain the exact same set of template strands after each division, giving the rest to the other cell products of the division. ‘’Silent Sister Hypothesis’’ This hypothesis states that sister chromatids have differing epigenetic signatures, thereby also differing expression regulation. When replication occurs, non-random segregation of sister chromatids ensures the fates of the daughter cells. Assessing the validity of this hypothesis would require a joint analysis of Strand-seq and gene expression profiles for both daughter cells.Discovery of structural variations & aneuploid chromosomes
The output of BAIT shows the inheritance of parental template strands along the genome. Normally, two template strands are inherited for each autosome, and any deviation from this number indicates an instance ofHaplotyping, genome assembly & generation of high-resolution human genetic variation maps
Early-build genomes are quite fragmented, with unordered and unoriented contigs. Using Strand-seq provides directionality information to accompany the sequence, which ultimately helps resolve the placement of contigs. Contigs present in the same chromosome will exhibit the same directionality, provided SCE events have not occurred. Conversely, contigs present in different chromosomes will only exhibit the same directionality in 50% of the Strand-seq libraries. Scaffolds, successive contigs intersected by a gap, can be localized in the same manner. The same principle of using strand direction to distinguish large DNA molecules enables the use of Strand-seq as a tool to construct whole-chromosome haplotypes of genetic variation, from telomere to telomere. Recent reports have shown that Strand-seq can be computationally integrated with long-read sequencing technology, with the unique advantages of both technologies enabling the generation of highly contiguous haplotype-resolved de novo human genome assemblies. These genomic assemblies integrate all forms of genetic variation including single nucleotide variants,Considerations
The possibility that BrdU being substituted for thymine in the genomic DNA could induce double stranded chromosomal breaks and specifically resulting in SCEs has been previously discussed in the literature. Additionally, BrdU incorporation has been suggested to interfere with strand segregation patterns. If this is the case, there would be an inflation in false positive SCEs which may be annotated. Therefore, many cells should be analyzed using the Strand-seq protocol to ensure that SCEs are in fact present in the population. For structural variants detected in single cells, detection of the same variant (on the same haplotype) in more than one cell can exclude BrdU incorporation as a possible cause. The number of single cell strands that need to be sequenced in order for an annotation to be accepted has yet to be proposed and is highly dependent on the questions being asked. As Strand-seq is founded on single cell sequencing techniques, one must consider the problems faced with single cell sequencing as well. These include the lacking standards for cell isolation and amplification. Even though previous Strand-seq studies isolated cells using FACS, microfluidics also serves as an attractive alternative. PCR has been shown to produce more erroneous amplification products compared to strand displacement based methods such as MDA and MALBAC, whereas the latter two techniques generate chimeric reads as a byproduct that can result in erroneous structural variation calls. MDA and MALBAC also generate more dropouts than Strand-seq during SV detection because they require reads that cross the breakpoint of an SV to enable its detection (this is not required for any of the different SV classes that Strand-seq can detect). Strand displacement amplification also tends to generate more sequence and longer products which could be beneficial for long read sequencing technologies.References
*