Background
Parvoviruses are a family of DNA viruses that have single-stranded DNA (ssDNA) genomes enclosed in rugged, icosahedral protein
General process
The entire process of rolling hairpin replication, which has distinct, sequential stages, can be summarized as follows:
*1. The coding portion of the genome is replicated, starting from the 3′-end of the 3′ hairpin, which acts as a primer, and continues until the newly synthesized strand is connected to the 5′-end of the 5′ hairpin, producing a duplex DNA molecule that has two strands of the coding portion of the genome.
*2. mRNA that encodes the viral replication initiator protein is transcribed and subsequently translated to synthesize the protein.
*3. The initiator protein binds to and cleaves the DNA within a region called the origin, which results in the hairpin unfolding into a linear, extended form. At the same time, the initiator protein establishes a replication fork with its helicase activity.
*4. The extended-form hairpin is replicated to create an inverted copy of the telomere on the newly synthesized strand.
*5. The two strands of that end refold back into two hairpins, which repositions the replication fork to switch templates and move in the opposite direction.
*6. DNA replication continues in a linear manner from one end to the other using the opposite strand as a template.
*7. Upon reaching the other end, that end's hairpin is unfolded and refolded to replicate the terminus and once again swap templates and change the direction of replication. This back-and-forth replication is continually repeated, producing a concatemer of multiple copies of the genome.
*8. The viral initiator protein periodically excises individual genomic strands of DNA from the replicative concatemer.
*9. Excised ssDNA genomes are packaged into newly constructed viral capsids.
Preparation for replication
Upon cell entry, a tether about 24 nucleotides in length that attaches the viral protein NS1, essential in replication, to the virion is cleaved off the virion to be reattached later. After cell entry, virions accumulate in the
Essential viral proteins and initiation
Once an infected cell enters S-phase, parvovirus genomes are converted to their duplex form by host replication machinery, and mRNA that encodes non-structural (NS) proteins is transcribed starting from a viral promoter (P4 for MVM). One of these NS proteins is usually called NS1 but is also called Rep1, or Rep68/78 in the genus ''Dependoparvovirus'', to which AAV belongs. NS1 is a site-specific DNA binding protein that acts as the replication initiator protein via nickase activity. It also mediates excision of both ends of the genome from duplex RF intermediates via a transesterification reaction that introduces a nick into specific duplex origin sequences. Key components of NS1 include an HUH endonuclease domain toward the
MVM right-end origin
The right-end hairpin of MVM contains 248 nucleotides organized into a cruciform shape. This region is almost perfectly basepaired, with just three unpaired bases at the axis and a mismatched region positioned 20 nucleotides from the axis. A three nucleotide insertion, AGA or TCT, on one strand separates opposing pairs of NS1 binding sites, creating a 36 basepair-length palindrome that can assume an alternate cruciform configuration. This configuration is expected to destabilize the duplex, which facilitates its ability to function as a hinge. The mismatch of the unpaired bases, rather than the three-nucleotide sequence itself, may help to promote instability of duplex DNA. Fully-duplex linear forms of the right-end hairpin sequence also function as NS1-dependent origins. For many parvoviral telomeres, however, only an initiator binding site next to the nick site is required for the origin function so that the minimal sequences required for nicking are less than 40 basepairs in length. For MVM, the minimal right-end origin is around 125 basepairs in length and includes most of the hairpin sequence because at least three recognition elements are involved: the nick site 5′-CTWWTCA-3′ (element 1), positioned seven nucleotides upstream from a duplex NS1-binding site (element 2) that is oriented to have the attached NS1 complex extending over the nick site, and a second NS1-binding site (element 3), which is adjacent to the hairpin axis. The second binding site is over 100 basepairs away from the nick site but is required for NS1-mediated cleavage. ''In vivo'', there is slight variation in the position of the nick, plus or minus one nucleotide, with one position preferred. During nicking, this site is likely exposed as a single strand and is potentially stabilized as a minimal stem-loop by the tetranucleotide inverted repeats to the sides of the site. Optimal forms of the NS1-binding site contain at least three tandem copies of the 5′-ACCA-3′ sequence. 
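The two sequence motifs named above can be located with a simple pattern search. The following is a minimal sketch, not a validated origin-finding tool: it assumes the IUPAC reading of W as "A or T", scans only the strand it is given, and uses a made-up toy sequence (the names `scan_origin` and `toy` are hypothetical, and the spacer in `toy` is illustrative rather than taken from the MVM genome).

```python
import re

# IUPAC code W stands for A or T, so CTWWTCA expands to CT[AT][AT]TCA.
NICK_SITE = re.compile(r"CT[AT][AT]TCA")
# At least three tandem copies of ACCA, matching the "optimal" site in the text.
NS1_SITE = re.compile(r"(?:ACCA){3,}")


def scan_origin(seq):
    """Return 0-based nick-site starts and (start, end) spans of ACCA repeats."""
    seq = seq.upper()
    nicks = [m.start() for m in NICK_SITE.finditer(seq)]
    sites = [(m.start(), m.end()) for m in NS1_SITE.finditer(seq)]
    return nicks, sites


# Hypothetical toy sequence, NOT a real MVM origin: a nick-site motif,
# a seven-nucleotide spacer, then three tandem ACCA repeats.
toy = "GG" + "CTATTCA" + "GTGTGTG" + "ACCAACCAACCA" + "TT"
nicks, sites = scan_origin(toy)
print(nicks, sites)
```

A real analysis would also scan the complementary strand and would need the additional, distant NS1-binding site described in the text, which a motif search alone cannot confirm as functional.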
Modest alterations to these motifs only have a small effect on affinity, which suggests that each tetranucleotide motif is recognized by different molecules in the NS1 complex. The NS1-binding site that positions NS1 over the nick site in the right-end origin is a high affinity site. With ATP, NS1 binds asymmetrically over the aforementioned sequence, protecting a region 41 basepairs in length from digestion. This footprint extends just five nucleotides beyond the 3′-end of the ACCA repeat but 22 nucleotides beyond the 5′-end so that the footprint ends 15 nucleotides beyond the nick site, placing NS1 in position to nick the origin. Nicking only occurs if the second, distant NS1-binding site is also present in the origin and the entire complex is activated by addition of HMG1. In the absence of NS1, HMG1 binds the hairpin sequence independently, causing it to bend, without protecting any region from digestion. HMG1 can also directly bind to NS1 and mediates interactions between NS1 molecules bound to their recognition elements in the origin, so it is essential for formation of the cleavage complex. The ability of the axis region to reconfigure into a cruciform does not appear to be important in this process. Cleavage is dependent on the correct spacing of the elements of the origin, so additions and deletions can be lethal, whereas substitutions can be tolerated. Addition of HMG1 appears to only slightly adjust the sequences protected by NS1, but the conformation of the intervening DNA changes, folding into a double helical loop that extends about 30 basepairs through a
Terminal resolution
Following nicking, a replication fork is established at the newly exposed 3′ nucleotide that proceeds to unfold and copy the right-end hairpin through a series of melting and reannealing reactions. This process begins once NS1 nicks the inboard end of the original hairpin. The terminal sequence is then copied in the opposite direction, which produces an inverted copy of the original sequence. The end result is a duplex extended-form terminus that contains two copies of the terminal sequence. While NS1 is required for this, it is unclear if unfolding is mediated by its helicase activity in front of the fork or by destabilization of the duplex following DNA binding at one of its 5′-(ACCA)-3′ recognition sites. This process is usually called terminal resolution but also hairpin transfer or hairpin resolution. Terminal resolution occurs with each round of replication, so progeny genomes contain an equal number of each terminal orientation. The two orientations are termed "flip" and "flop", and may be represented as R and r, or B and b, for the flip and flop of the right-end telomere and L and l, or A and a, for the flip and flop of the left-end telomere. Since parvoviral terminal palindromes are imperfect, it is easy to identify which orientation is which. The extended-form duplex telomeres generated during terminal resolution are melted, mediated by NS1 with ATP hydrolysis, causing individual strands to fold back on themselves to create hairpin "rabbit ear" structures that have the flip and flop of the termini. This requires the NS1 helicase activity as well as its site-specific binding activity, the latter of which enables NS1 to bind to symmetrical copies of NS1-binding sites that surround the axis of the extended-form terminus. Rabbit ear formation allows the 3′ nucleotide of the newly synthesized DNA strand to pair with an internal base, which repositions the replication fork in a strand-switching maneuver that primes synthesis of additional linear sequences. 
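The relationship between the flip and flop orientations can be illustrated by treating the inverted copy as the reverse complement of the terminal sequence. This is a simplified sketch under that assumption; the function names and the short example sequences are hypothetical and far shorter than real telomeres.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def orientations(terminus):
    """Return (flip, flop, distinguishable) for a terminal sequence.

    flop is modeled as the inverted copy (reverse complement) of flip.
    """
    flip = terminus.upper()
    flop = reverse_complement(flip)
    return flip, flop, flip != flop


# A perfect palindrome equals its own reverse complement, so the two
# orientations could not be told apart.
print(orientations("GAATTC"))
# An imperfect palindrome (one unpaired central base) gives two
# distinct orientations.
print(orientations("GAATCATTC"))
```

The second call shows why imperfect terminal palindromes make the orientation easy to identify: the asymmetric base appears as C in one orientation and G in the other.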
Switching from DNA synthesis to rabbit-ear formation at the end of terminal resolution may require different types of NS1 complexes. Alternatively, the NS1 complex may remain intact during this switch, being ready to start strand displacement synthesis following refolding into rabbit ears. After the replication fork is repositioned, replication continues toward the left end, using the newly synthesized DNA strand as a template. At the left end of the genome, NS1 is probably required to unfold the hairpin. NS1 appears to be directly involved in melting-out and reconfiguring the resulting extended-form left-end duplexes into rabbit ear structures, though this reaction seems to be less efficient than at the right-end terminus. Dimeric and tetrameric concatemers of the genome are generated successively for MVM. In these concatemers, alternating unit-length genomes are fused through a palindromic junction in left-end to left-end and right-end to right-end orientations. In total, RHR results in coding sequences of the genome being copied twice as often as the termini. Both linear and hairpin configurations of the right-end telomere support initiation of RHR, so resolution of duplex right-end to right-end junctions can occur symmetrically on the basepaired duplex sequence or after this complex is melted and reconfigured into two hairpins. It is unclear which of these two reactions is more common since both appear to produce identical results. For AAV, each telomere is 125 bases in length and capable of folding into a T-shaped hairpin. AAV contains a Rep gene that encodes four Rep proteins, two of which, Rep68 and Rep78, act as replication initiator proteins and fulfill the same functions, such as the nickase and helicase activities, as NS1. They recognize and bind to a (GAGC) sequence in the stem region of the terminus and nick a site 20 bases away termed ''trs''. The same process of terminal resolution as MVM is done for AAV, but at both ends. 
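The alternating arrangement of unit-length genomes in a concatemer can be sketched as a list of junction types. This is an illustrative model only: the function name is hypothetical, and it assumes the dimer is joined left-end to left-end (as in the MVM dimer junction), so that right ends face outward.

```python
def concatemer_junctions(n_units):
    """List junction types between adjacent unit-length genomes.

    Each unit is written as (leading_end, trailing_end) along the
    concatemer. Assumption: the first unit leads with its right end,
    so the first junction is left-end to left-end.
    """
    if n_units < 1:
        raise ValueError("a concatemer needs at least one genome unit")
    units = [("R", "L") if i % 2 == 0 else ("L", "R") for i in range(n_units)]
    # A junction joins the trailing end of one unit to the leading end of the next.
    return [f"{a[1]}:{b[0]}" for a, b in zip(units, units[1:])]


print(concatemer_junctions(2))  # dimer
print(concatemer_junctions(4))  # tetramer
```

Under this assumption a dimer has a single left-end to left-end junction, and a tetramer adds a right-end to right-end junction in the middle flanked by two left-end junctions.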
The other two Rep proteins, Rep52 and Rep40, are not involved in DNA replication but are implicated in synthesis of progeny. AAV replication is dependent on a helper virus that is either an adenovirus or a herpesvirus that coinfects the cell. In the absence of coinfection, the AAV genome is integrated into the host cell's DNA until coinfection occurs. A general rule is that parvoviruses with identical termini, i.e. homotelomeric parvoviruses such as AAV and B19, replicate both ends by terminal resolution, generating equal numbers of flips and flops of each telomere. Parvoviruses that have different termini, i.e. heterotelomeric parvoviruses like MVM, replicate one end by terminal resolution and the other end by asymmetric junction resolution, which conserves a single-sequence orientation and requires different structural arrangements and cofactors to activate NS1's nickase. AAV DNA intermediates containing covalently linked sense and antisense strands yield genomic concatemers under denaturing conditions, indicating that AAV replication also synthesizes duplex concatemers that require some form of junction resolution.
MVM left-end origin
In negative-sense MVM genomes, the left-end hairpin is 121 nucleotides in length and exists in a single flip sequence orientation. This telomere is Y-shaped and contains small internal palindromes that fold into the "ears" of the Y, a duplex stem region 43 nucleotides in length that is interrupted by an asymmetric thymidine residue, and a mismatched "bubble" sequence in which the 5′-GAA-3′ sequence on the inboard arm is opposite of 5′-GA-3′ in the outboard strand. Sequences in this hairpin are involved in both replication and regulation of transcription. The elements involved in these two functions separate the two arms of the hairpin. The left-end telomere of MVM, and likely of all heterotelomeric parvoviruses, cannot function as a replication origin in its hairpin configuration. Instead, a single origin on the lower strand is created when the hairpin is unfolded, extended, and copied to form a duplex basepaired sequence that spans adjacent genomes in the dimer RF. Within this structure, the sequence from the outboard arm that surrounds a GA/TC dinucleotide serves as an origin, OriL<sub>TC</sub>. The equivalent GAA/TTC sequence on the inboard arm that contains the bubble trinucleotide, called OriL<sub>GAA</sub>, does not serve as an origin. The inboard arm and hairpin configuration of the terminus instead appear to function as upstream control elements for the viral transcriptional promoter P4. Additionally, the ability to segregate one arm from nicking appears essential for replication. The minimal linear left-end origin is about 50 basepairs long and extends from two 5′-ACGT-3′ motifs, spaced five nucleotides apart at one end, to a position seven basepairs beyond the nick site. The bubble's GA sequence itself is relatively unimportant, but the space that it occupies is necessary for the origin to function. 
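Pairs of 5′-ACGT-3′ motifs and the spacing between them can be found with a short search. The sketch below is illustrative: `acgt_pairs`, the `max_spacing` cutoff of 10, and the toy sequence are all assumptions made for the example, not values from the MVM genome. Because ACGT is its own reverse complement, scanning one strand is enough to find the motif on both.

```python
import re


def acgt_pairs(seq, max_spacing=10):
    """Return (first_start, second_start, spacing) for ACGT motif pairs.

    spacing counts the nucleotides between the end of the first motif
    and the start of the second. max_spacing is an arbitrary search
    window chosen for illustration.
    """
    seq = seq.upper()
    starts = [m.start() for m in re.finditer("ACGT", seq)]
    pairs = []
    for i, first in enumerate(starts):
        for second in starts[i + 1:]:
            spacing = second - (first + 4)
            if spacing <= max_spacing:
                pairs.append((first, second, spacing))
    return pairs


# Hypothetical toy sequence with two ACGT motifs five nucleotides apart.
toy = "TT" + "ACGT" + "GGCCA" + "ACGT" + "TT"
print(acgt_pairs(toy))
```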
Within the origin, there are three recognition sequences: an NS1-binding site that orients the NS1 complex over the nick site 5′-CTWWTCA-3′, which is located 17 nucleotides downstream (toward the 3′-end), and the two ACGT motifs. These motifs bind a heterodimeric cellular factor called either parvovirus initiation factor (PIF) or glucocorticoid modulating element-binding protein (GMEB). PIF is a site-specific DNA-binding heterodimeric complex that contains two subunits, p96 and p79, and functions as a transcription modulator in the host cell. It binds DNA via a KDWK fold and recognizes two ACGT half-sites. The spacing between these sites can vary significantly for PIF, from one to nine nucleotides, with an optimal spacing of six. PIF stabilizes the binding of NS1 on the active form of the left-end origin, OriL<sub>TC</sub>, but not on the inactive form, OriL<sub>GAA</sub>, because the two complexes are able to establish contact over the bubble dinucleotide. The left-end hairpins of all other species in the ''Protoparvovirus'' genus (included in Kerr et al. under its former name ''Parvovirus''), to which MVM belongs, have bubble asymmetries and PIF-binding sites, though with slight variation in spacing. This suggests that they all share a similar origin segregation mechanism.
Asymmetric junction resolution
Due to the location of the active origin OriL<sub>TC</sub> in the dimer junction, synthesis of new copies of the left-end hairpin in the correct, i.e. flip, orientation is not straightforward since a replication fork moving from this site through the linear bridge structure should synthesize new DNA in the flop orientation. Instead, the left-hand MVM dimer junction is resolved asymmetrically in a process that creates a cruciform intermediate. This maneuver accomplishes two things: it allows synthesis of the new DNA in the correct sequence orientation, and it creates a structure that can be resolved by NS1. This "heterocruciform" model of synthesis suggests that resolution is driven by the NS1 helicase activity and depends on the inherent instability of the duplex palindrome, a property that allows it to switch between its linear and cruciform configurations. NS1 initially introduces a single-strand nick in OriL<sub>TC</sub> in the B ("right") arm of the junction and becomes covalently attached to the DNA on the 5′ side of the nick, exposing a basepaired 3′ nucleotide. Two outcomes can then occur, depending on the speed with which a replication fork is assembled. If assembly is rapid, then while the junction is in its linear configuration, "read-through" synthesis copies the upper strand, which regenerates the duplex junction and displaces a positive-sense strand that feeds back into the replicative pool. This promotes MVM DNA amplification but does not lead to synthesis of new terminal sequences in the correct orientation or to junction resolution. To create a resolvable structure, the initial nicking must be followed by melting and rearrangement of the dimer junction into a cruciform. This is driven by the 3′-to-5′ helicase activity of the 5′-linked NS1 complex. Once this cruciform extends to include sequences beyond the nick site, the exposed primer at the nick site in OriL<sub>TC</sub> undergoes template switching by annealing with its complement in the lower arm of the cruciform. 
If a fork assembles after this point, then the subsequent synthesis unfolds and copies the lower cruciform arm. This creates a heterocruciform intermediate that contains the newly synthesized telomere in the flip sequence orientation that is attached to the lower strand of the B arm. This modified junction is called MJ2. The lower arm of MJ2 is an extended-form duplex palindrome that is essentially identical to those generated during terminal resolution. Once MJ2 is synthesized, the lower arm becomes susceptible to rabbit-ear formation. This repositions the 3′ nucleotide of the newly synthesized copy of the lower arm so that it pairs with inboard sequences on the junction's B arm to prime strand displacement synthesis. If a replication fork is created at this 3′ nucleotide, then the lower strand of the B arm is copied, creating an intermediate junction called MJ1 and progressively displacing the upper strand. This leads to the release of the newly synthesized B turn-around (B-ta) sequence. The residual cruciform, called δJ, is partially single-stranded at the upper part of the B arm and contains the intact upper strand of the junction paired to the lower strand of the A ("left") arm, with an intact copy of the left-end hairpin, ending in a 5′ NS1 complex. Since δJ carries the NS1 helicase, it is presumed to periodically alter configuration. The next step is less certain but can be inferred based on what is known about the process thus far. The NS1 helicase is expected to create a dynamic structure in which the nick site in δJ in the normally inactive A side is temporarily but repeatedly exposed in a single-stranded form during duplex-to-hairpin rearrangements, which allows NS1 to engage the nick site in the origin OriL<sub>GAA</sub> without the help of a cofactor. The nick would leave NS1 covalently attached to the positive-sense "B" strand of δJ and lead to the release of this strand. 
Nicking also leaves open a basepaired 3′ nucleotide on the "A" strand of δJ to prime DNA synthesis. If a replication fork is established here, then the A strand is unfolded and copied to create its duplex extended form. When MVM genomes replicate ''in vivo'', the aforementioned nick may not occur because both ends of the dimer replicative form contain an efficient number of right-end hairpin origins. Therefore, replication forks may progress back toward the dimer junction from the genome's right end, copying the top strand of the B arm before the final resolution nick. This bypasses dimer bridge resolution and recycles the top strand into a replicating duplex dimer pool. In a closely related virus, LuIII, the single-strand nick releases a positive-sense strand with its left-end hairpin in the flop orientation. Unlike MVM, LuIII packages strands of both sense with equal frequency. In the negative-sense strands, the left-end hairpins are all in the flip orientation, while in the positive-sense strands, there are an equal number of flip and flop orientations. Compared to MVM, LuIII contains a two-base insertion immediately 3′ of the nick site in the right origin, which impairs its efficiency. Because of this, the reduced efficiency of replication fork assembly in the genome's right end may favor single-strand nicking by giving it more time to occur.
Synthesis of progeny
Individual progeny genomes are excised from genomic replicative concatemers by introducing breaks in replication origins, usually by the replication initiator protein. This results in the establishment of new replication forks that replicate the telomeres in a combination of terminal resolution and junction resolution and displace individual ssDNA genomes from the replicative molecule. At the end of this process, the telomeres are folded back inwards to form hairpins on excised genomes. The extended-form termini created during excision resemble the extended-form molecules prior to terminal resolution, so they can be melted out and refolded into rabbit ears for additional rounds of replication. Within an infected cell, numerous replicative concatemers are therefore able to arise. Displacement of progeny ssDNA genomes occurs either predominantly or exclusively during active DNA replication, or when cells are assembling viral particles. Displacement of single strands may therefore be associated with packaging viral DNA into capsids. Earlier research suggested that the preassembled viral particle may sequester the genome in a 5′-to-3′ direction as it is displaced from the fork, but more recent research suggests that packaging is performed in a 3′-to-5′ direction driven by the NS1 helicase using newly synthesized single strands. It is not clear if these single strands are released into the nucleoplasm so that packaging complexes are physically separate from replication complexes or if the replication intermediates serve as both replication and packaging substrates. In the latter case, newly displaced progeny genomes would be kept in the replication complex via interactions between their 5′-linked NS1 molecules and NS1 or capsid proteins that are physically associated with replicating DNA. 
Genomes are inserted into the capsid via an entrance called a portal situated at one of the icosahedral 5-fold axes of the capsid, which is possibly opposite of the opening from which genomes are expelled early in the replication cycle. Strand selection for encapsidation likely does not involve specific packaging signals but may be predictable by the Kinetic Hairpin Transfer (KHT) mathematical model, which explains the distribution of the strands and terminal conformations of packaged genomes in terms of the efficiency with which each terminus type can undergo reactions that allow it to be copied and reformed. In other words, the KHT model postulates that the relative efficiency with which two genomic termini are resolved and replicated determines the distribution of amplified replication intermediates created during infection and ultimately the efficiency with which ssDNAs of characteristic polarity and terminal orientations are excised, which will then be packaged with equal efficiency. Preferential excision of particular genomes is only apparent during packaging. Therefore, among parvoviruses that package strands of one sense, replication appears to be biphasic. At early times, both sense strands are excised. This is followed by a switch in the replication mode that allows for exclusive synthesis of a single sense for packaging. A modified form of the KHT model, called the preferential strand displacement model, proposes that the aforementioned switch in replication is caused by the onset of packaging because the substrate for packaging is probably a newly displaced DNA molecule. For heterotelomeric parvoviruses, imbalance of origin firing leads to preferential displacement of negative sense strands from the right-end origin. The relative frequency of sense strands in packaged virions can therefore be used to infer the type of resolution mechanism used during excision. 
Shortly after the start of S-phase, translation of viral mRNA leads to the accumulation of capsid proteins in the nucleus. These proteins form into oligomers that are assembled into intact empty capsids. After encapsidation, complete virions may be exported from the nucleus to the exterior of the cell before disintegration of the nucleus. Disruption of the host cell environment may also occur later on in infection. This results in cell lysis via
Comparison to rolling circle replication
Many small replicons that have circular genomes such as circular ssDNA viruses and circular plasmids replicate via rolling circle replication (RCR), which is a unidirectional, strand displacement form of DNA replication similar to RHR. In RCR, successive rounds of replication, which proceed in a loop around the genome, are initiated and terminated by site-specific single-strand nicks made by a replicon-encoded endonuclease, variously called the nickase, relaxase, mobilization protein (mob), transesterase, or replication protein (Rep). The replication initiator protein of parvoviruses is genetically related to these other endonucleases. RCR initiator proteins contain three motifs considered to be important for replication. Two of these are retained within parvovirus initiator proteins: an HUHUUU cluster, which is presumed to bind to an ion required for nicking, and a YxxxK motif that contains the active-site tyrosine residue that attacks the phosphodiester bond of target DNA. In contrast to RCR initiator proteins, which can join together DNA strands, RHR initiator proteins have only vestigial traces of being able to perform ligation. RCR begins when the initiator protein nicks a DNA strand at a specific sequence in the replication origin region. This is done through a transesterification reaction that forms a 5′-phosphate bond that connects the DNA to the active-site tyrosine and frees the 3′-end hydroxyl (3′-OH) adjacent to the nick site. The 3′-end is then used as a primer for the host DNA polymerase to begin replication while the initiator protein remains attached to the 5′-end of the "original" strand. After one loop of replication around the circular genome, the initiator protein returns to the nick site, i.e. the original initiator complex, while still attached to the parent strand and attacks the regenerated duplex nick site, or a nearby second site in some cases, by means of a topoisomerase-like nicking-joining reaction. 
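The two conserved motifs mentioned above, the HUHUUU cluster and the YxxxK motif, can be searched for in a protein sequence with regular expressions. This is a rough sketch: it assumes that "U" denotes a bulky hydrophobic residue and approximates that class with the residue set I, L, V, M, F, Y, W, which is an assumption of this example. The function name and toy sequence are hypothetical.

```python
import re

# Assumption: "U" (bulky hydrophobic residue) is approximated by this set.
HYDROPHOBIC = "ILVMFYW"
HUH_CLUSTER = re.compile(f"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{3}}")
# YxxxK: tyrosine, any three residues, then lysine.
YXXXK = re.compile(r"Y.{3}K")


def find_motifs(protein):
    """Return 0-based start positions of each motif in a protein sequence."""
    protein = protein.upper()
    return {
        "HUHUUU": [m.start() for m in HUH_CLUSTER.finditer(protein)],
        "YxxxK": [m.start() for m in YXXXK.finditer(protein)],
    }


# Hypothetical toy sequence, not a real initiator protein.
toy = "MSG" + "HVHLLI" + "GGSG" + "YAAAK" + "E"
print(find_motifs(toy))
```

A pattern match of this kind only flags candidate motifs; it says nothing about whether the residues are positioned to bind an ion or to act as the active-site tyrosine.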
During the aforementioned reaction, the initiator protein cleaves a new nick site and is transferred across the analogous phosphodiester bond. It thereby becomes attached to the new 5′-end while ligating the 5′-end of the first strand to which it was originally attached to the 3′-end of the same strand. This second mechanism varies depending on the replicon. Some replicons such as the virus ΦX174 contain a second active tyrosine residue in the initiator protein. Others use the analogous active-site tyrosine in a second initiator protein that is present as part of a multimeric nickase complex. This second nicking reaction may occur after one loop or successive loops may occur in which a concatemer containing multiple copies of the genome is created. The result of this nick is that displaced genomes become detached from the replicative molecule. These copies of the genome are ligated and may either be encapsidated into progeny capsids, provided they are monomeric, or converted to a covalently-closed double-stranded form by a host DNA polymerase for further replication. While RHR generally involves replication of both sense strands in a continuous process, RCR has complementary strand synthesis and genomic strand synthesis occur separately. The strategies used in RHR to engage the nick site are also present in RCR. Most RCR origins are in the form of duplex DNA that has to be melted before nicking. RCR initiators accomplish this by binding to specific DNA-binding sequences in the origin next to the initiation site. The latter site is then melted in a process that consumes ATP and which is assisted by the ability of the separated strands to reconfigure into stem-loop structures. In these structures, the nick site is presented on an exposed loop. Like RHR initiator proteins, many RCR initiator proteins contain helicase activity, which allows them to melt the DNA prior to nicking and serve as the 3′-to-5′ helicase in the replication fork.
Notes
References
Bibliography
*{{cite book |vauthors=Kerr J, Cotmore S, Bloom ME |date=25 November 2005 |title=Parvoviruses |publisher=CRC Press |pages=171–185 |isbn=9781444114782}}