
Origin of replication

The origin of replication is a specific DNA sequence that serves as the starting point for DNA replication. Initiator proteins bind there to recruit the replication machinery and unwind the double helix, enabling bidirectional synthesis of daughter strands so that the genome is duplicated before cell division. This process ensures the accurate and timely copying of genetic information, coordinated with cell cycle progression, transcription, and DNA repair mechanisms.

In prokaryotes, replication typically initiates from a single origin per circular chromosome, such as oriC in Escherichia coli, a ~245 bp sequence containing multiple DnaA boxes and an AT-rich duplex unwinding element (DUE). The initiator protein DnaA binds cooperatively to these high-affinity sites, oligomerizes to melt the DNA at the DUE, and facilitates the loading of the DnaB helicase and other replisome components, ensuring once-per-cell-cycle replication. This tightly regulated, sequence-specific mechanism supports the rapid replication of relatively small bacterial genomes.

In contrast, eukaryotic genomes employ thousands of origins (approximately 1,600 in budding yeast as of 2024, and 20,000–50,000 in humans) distributed across linear chromosomes to accommodate their larger size and complexity. These origins often lack a strict sequence consensus, except in certain model organisms like Saccharomyces cerevisiae, and their specification is influenced by chromatin accessibility, nucleosome positioning, DNA topology, and epigenetic marks rather than fixed motifs alone. The origin recognition complex (ORC), a heterohexameric protein, binds to origins during G1 phase to license replication by loading the MCM2-7 helicase, with activation occurring later in S phase under cyclin-dependent kinase (CDK) control. This distributed and flexible system allows eukaryotes to replicate vast genomes efficiently while preventing re-replication within a single cell cycle.

Fundamental Concepts

Definition and Role

The origin of replication is a discrete genomic locus where DNA unwinding initiates, marking the starting point for the assembly of replication forks that proceed either bidirectionally or unidirectionally to duplicate the genome. This site enables the precise coordination of DNA synthesis, ensuring that parental strands serve as templates for the formation of complementary daughter strands by replicative polymerases.

In the cell cycle, origins play a central role by synchronizing duplication with S-phase entry, allowing each chromosome to be copied exactly once per cycle. This regulation is achieved through licensing: origins are primed during G1 phase by the formation of pre-replicative complexes involving initiator proteins, and each licensed origin can then be activated only once, which prevents re-initiation and over-replication within the same cycle.

The basic initiation process begins with the specific recognition and binding of initiator proteins to the origin, followed by localized destabilization of the DNA helix (often facilitated by AT-rich regions) and the subsequent recruitment of helicases and polymerase machinery to establish active replication forks.

The proper functioning of origins is crucial for maintaining genomic stability, as disruptions can trigger replication stress, leading to DNA damage, mutations, or chromosomal aberrations. Errors in origin activity have been implicated in diseases such as cancer, underscoring their role in preventing genomic instability. The concept traces back to Jacob, Brenner, and Cuzin, who in 1963 proposed the replicon model linking origins to the regulated initiation of DNA replication.

Replicon Model

The replicon model was proposed in 1963 by François Jacob, Sydney Brenner, and François Cuzin, drawing from genetic studies in Escherichia coli. This framework emerged from observations of bacterial chromosome behavior during conjugation and plasmid maintenance, positing that DNA replication is organized into discrete, independently controlled units. The model integrated concepts from earlier work on operons, adapting them to explain how replication initiates and is regulated at specific chromosomal sites.

At its core, a replicon is defined as a chromosomal or extrachromosomal unit of DNA capable of autonomous replication, controlled by an independent initiation site. It consists of two primary components: the replicator, a cis-acting sequence that functions as the origin where replication begins, and the initiator, a diffusible factor (typically a protein) that recognizes the replicator and activates the replication machinery. Replicons also encompass associated control elements, such as partition systems, which ensure the stable segregation of replicated daughter molecules to progeny cells during division. In organisms with multiple origins per chromosome, such as eukaryotes, the length of a replicon is determined by the distance between adjacent origins and is typically 100–200 kb. In bacteria like E. coli with a single chromosomal origin, the replicon spans the entire chromosome of approximately 4.6 Mb.

Experimental evidence for replicon autonomy stemmed from conjugation studies in E. coli, where the F plasmid integrates into the chromosome to form Hfr strains, allowing transfer of chromosomal segments during mating. Upon excision, these hybrid molecules replicated independently, confirming that both plasmid and chromosomal segments function as self-sufficient replicons when separated. Such transfers revealed that replication control is localized to specific sites, independent of the broader chromosomal context.

The replicon model underscores the origin as the rate-limiting element in replication, dictating the timing of initiation and the speed of fork progression to complete duplication once per cell cycle. This has profound implications for understanding replication fidelity and coordination in prokaryotes, and it has influenced subsequent research on replication control across the domains of life.

Structural Features

Sequence Motifs

Origins of replication contain conserved DNA sequence motifs that serve as binding sites for the replication machinery, enabling the initial steps of DNA unwinding and assembly of the replisome. These motifs are modular elements that collectively define the origin's functionality, with variations in arrangement contributing to efficiency and specificity across different organisms.

AT-rich regions, often referred to as DNA unwinding elements (DUEs), are a hallmark of replication origins and typically exhibit 50-70% AT content, which lowers the melting temperature of the DNA duplex to facilitate initial strand separation. These regions, spanning 20-50 base pairs, are prone to melting under physiological conditions because AT pairs are held together by weaker hydrogen bonding than GC pairs, which exposes single-stranded DNA for subsequent binding events. This structural feature is conserved in origins from bacteria, archaea, and eukaryotes, underscoring its essential role in replication initiation.

Consensus sequences are short, highly conserved motifs within origins that provide high-affinity binding sites, typically 9-17 base pairs in length. In bacterial origins, these include 9-bp DnaA box motifs, while eukaryotic examples like Saccharomyces cerevisiae feature 11-17 bp ARS consensus sequences (ACS); the binding strength and specificity are modulated by the precise orientation and spacing of these motifs relative to one another. Such arrangements ensure selective recognition and activation, with mismatches or altered spacing reducing origin efficiency by orders of magnitude.

Bending sites contribute to the architectural flexibility of origins through intrinsically curved DNA segments, often created by phased A-tracts: runs of 4-6 adenine residues spaced every 10-11 base pairs so that they align on one face of the helix. These elements induce a bend of 40-90 degrees, promoting the compaction and distortion of DNA necessary for the assembly of multi-protein complexes at the origin. By facilitating DNA looping or wrapping, bending sites enhance the local accessibility of adjacent motifs during initiation.

The overall length of replication origins varies from approximately 100 to 500 base pairs, accommodating a modular, non-random arrangement of the motifs described above. This variability allows for evolutionary adaptation while maintaining core functionality, with shorter origins often relying on tightly packed elements and longer ones incorporating auxiliary sequences for regulation. Because the arrangement is modular, disruption of an individual motif can impair origin firing without abolishing it entirely, which highlights the interdependent roles of the motifs.
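As a rough illustration of how an AT-rich stretch such as a DUE might be flagged computationally, the sketch below slides a fixed window along a sequence and reports windows whose AT fraction meets a threshold. The function names, the 30-bp window, the thresholds, and the toy sequence are all assumptions chosen for illustration; real origin finders use richer features than AT content alone.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A/T bases in seq (case-insensitive); 0.0 for empty input."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_windows(seq: str, window: int = 30, threshold: float = 0.7):
    """Yield (start, at_fraction) for every window at or above threshold."""
    for start in range(len(seq) - window + 1):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            yield start, frac


# Toy sequence (made up): GC-only flanks around a 30-bp AT-only core.
toy = "GCGCGGCCGC" * 3 + "ATTATAAATTTAATATTAAATATTTAAATA" + "GGCCGCGCGC" * 3

hits = list(at_rich_windows(toy, window=30, threshold=0.9))
best_start, best_frac = max(hits, key=lambda h: h[1])
print(f"best window starts at {best_start} with AT fraction {best_frac:.2f}")
```

The window length and threshold trade sensitivity against noise; with the 20-50 bp and 50-70% figures quoted above, a shorter window or lower threshold would flag more candidates.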

Associated Proteins

Initiator proteins play a central role in recognizing and activating replication origins across domains of life. In prokaryotes, the DnaA protein binds ATP and assembles into oligomeric complexes on origin DNA, forming a right-handed helical filament that wraps and distorts the double helix to promote unwinding. In eukaryotes, the origin recognition complex (ORC), a heterohexameric assembly of Orc1-6 subunits, similarly exhibits ATP-dependent DNA binding, with Orc1's ATPase activity facilitating stable association and oligomerization into ring-like structures that encircle the origin. These complexes, formed through ATP-driven oligomerization, serve as platforms for the subsequent replication machinery and use the underlying DNA sequence motifs as their primary binding targets.

Helicase recruitment follows initiator binding, enabling the initial separation of DNA strands. In both prokaryotes and eukaryotes, initiator proteins coordinate the loading of replicative helicases: DnaA recruits DnaB in bacteria, while ORC, in conjunction with Cdc6 and Cdt1, loads the MCM2-7 complex as head-to-head double hexamers that encircle duplex DNA without initial unwinding. Upon activation at the G1/S transition, each double hexamer separates into two active helicases that encircle and translocate along single-stranded DNA, unwinding the duplex processively at rates of several hundred base pairs per second in prokaryotes and tens of base pairs per second in eukaryotes to establish bidirectional replication forks.

Accessory factors support origin activation by stabilizing unwound regions and managing topological constraints. Single-strand binding proteins, such as SSB in prokaryotes and replication protein A (RPA) in eukaryotes, coat the exposed single-stranded DNA to prevent reannealing, secondary structure formation, and nucleolytic degradation, thereby facilitating polymerase access. Concurrently, topoisomerases I and II alleviate torsional stress generated by unwinding; type IA topoisomerases relax negative supercoils behind the fork, while type II enzymes decatenate intertwined daughter strands and relieve positive supercoils ahead of the fork.

Regulation of initiator and accessory proteins ensures precise control over origin licensing and firing. Phosphorylation by cyclin-dependent kinases (CDKs) in eukaryotes targets components like ORC subunits and Cdc6, promoting their dissociation from origins or their inactivation to prevent re-replication. Ubiquitination further modulates stability; for instance, CDK-phosphorylated Cdc6 is marked by SCF ubiquitin ligases for proteasomal degradation. Additionally, Cdc6's intrinsic ATPase activity, stimulated upon MCM loading, disengages Cdc6 from the ORC-MCM complex, enforcing unidirectional licensing and inhibiting premature re-assembly. These modifications collectively synchronize replication with the cell cycle, with analogous mechanisms regulating DnaA activity in prokaryotes.

Prokaryotic Origins

Bacterial Origins

In bacteria, chromosomal replication initiates at a unique origin, known as oriC in Escherichia coli, which serves as the paradigm for prokaryotic origins and exemplifies the replicon model, in which a single origin controls replication of the entire chromosome. The oriC locus spans approximately 245 base pairs and features 11 DnaA binding sites, termed DnaA boxes, including three high-affinity sites (R1, R2, and R4) that preferentially bind the initiator protein and several low-affinity τ sites that contribute to complex assembly under specific conditions. Adjacent to these boxes lies an AT-rich region with three tandem 13-bp repeats, which acts as the duplex unwinding element (DUE) to facilitate initial DNA strand separation during initiation. This compact structure ensures precise recognition and activation once per cell cycle, with E. coli maintaining a single oriC per chromosome to coordinate bidirectional replication forks that progress to the terminus.

Initiation at oriC is orchestrated by the DnaA protein in its ATP-bound form (DnaA-ATP), which first occupies the high-affinity R1, R2, and R4 boxes to form a nucleoprotein complex, then recruits additional DnaA molecules to the low-affinity sites and the DUE. The integration host factor (IHF) binds nearby, inducing significant DNA bending that wraps the origin around the DnaA complex, thereby promoting torsional stress and melting of the AT-rich repeats within the DUE to expose single-stranded DNA. This unwound region serves as a platform for loading two hexameric DnaB helicases in opposite orientations, delivered by the DnaC loader protein; each helicase encircles single-stranded DNA and unwinds the duplex ahead of its advancing fork to establish the replisome. Insights into DnaA's DNA recognition were advanced by the 2003 crystal structure of its domain IV (the DNA-binding domain) complexed with a DnaA box, which revealed how a helix-turn-helix motif inserts into the major groove for sequence-specific binding.

To prevent over-replication, initiation is tightly regulated through multiple mechanisms, including sequestration of the newly replicated, hemimethylated oriC by the SeqA protein, which binds GATC sites and blocks DnaA access for about one-third of the cell cycle. Additional control occurs via titration of excess DnaA at the datA locus, a chromosomal site ~0.47 Mb from oriC containing multiple DnaA boxes that sequester the initiator and promote its conversion from the active ATP-bound form to the inactive ADP-bound form through hydrolysis. The critical role of DnaA was established in the 1970s through isolation of temperature-sensitive dnaA mutants (dnaAts), which cease initiation at non-permissive temperatures while allowing elongation to complete, demonstrating DnaA's specific function in origin activation.

While most bacteria like E. coli rely on a single origin per chromosome, variations occur in species with multiple chromosomes; for example, Vibrio cholerae has two origins, oriC1 on the large chromosome I and oriC2 on the small chromosome II, enabling staggered replication timing that facilitates resolution of chromosome dimers via recombination during cell division. This dual-origin system ensures coordinated replication and proper partitioning in a bacterium with a naturally bipartite genome, contrasting with the unimodal control in bacteria that carry a single chromosome.
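To make the idea of DnaA boxes as short, degenerate binding sites more concrete, here is a minimal sketch that scans both strands of a sequence for 9-mers within a given number of mismatches of a consensus. The consensus string TTATCCACA is the commonly cited DnaA box form and is an assumption here (the text above does not spell it out); the function names and the toy sequence are hypothetical, and a mismatch count is only a crude stand-in for binding affinity.

```python
CONSENSUS = "TTATCCACA"  # assumption: commonly cited 9-bp DnaA box consensus

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_boxes(seq: str, consensus: str = CONSENSUS, max_mismatches: int = 1):
    """Return (start, strand, mismatches) for every near-consensus site."""
    seq = seq.upper()
    k = len(consensus)
    targets = (("+", consensus), ("-", reverse_complement(consensus)))
    hits = []
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        for strand, target in targets:
            mm = hamming(word, target)
            if mm <= max_mismatches:
                hits.append((i, strand, mm))
    return hits


# Toy sequence (made up): a perfect box, a reverse-strand box, and a
# one-mismatch box, separated by G spacers.
toy = "GGGTTATCCACAGGGGTGTGGATAAGGGTTATCAACAGGG"
for start, strand, mm in find_boxes(toy):
    print(start, strand, mm)
```

Raising `max_mismatches` would start to pick up weaker sites, loosely mirroring the distinction between high- and low-affinity boxes, at the cost of more false positives.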

Archaeal Origins

Archaeal genomes typically contain multiple origins of replication, ranging from 1 to 5 per chromosome, which contrasts with the single origin found in most bacteria. For instance, species in the genus Sulfolobus, such as S. islandicus and S. solfataricus, possess three active origins. Each origin spans approximately 500 base pairs and features conserved 17-base pair sequences known as origin recognition boxes (ORBs), which serve as binding sites for initiator proteins. These ORBs are AT-rich and facilitate the initial recognition step in replication initiation, and AT-rich DNA unwinding elements (DUEs) are commonly present across archaeal origins to promote strand separation.

Initiation at archaeal origins is mediated by proteins homologous to eukaryotic Cdc6 and Orc1, often encoded by multiple genes adjacent to the origins themselves. These Cdc6/Orc1 homologs bind to ORBs either as monomers or dimers, with each ORB typically accommodating one monomer in species like Sulfolobus. Structural studies have revealed that ATP binding induces conformational remodeling in these proteins, enabling DNA distortion and helicase recruitment in a manner analogous to the eukaryotic origin recognition complex (ORC). In Sulfolobus, for example, Orc1-1 forms a complex with the origin DNA upon ATP hydrolysis, stabilizing the binding and preparing the site for further assembly. Studies published in 2025 have identified nucleoid-associated proteins that bind essential motifs within archaeal origins, further refining models of initiation specificity.

The replication mechanism proceeds with the loading of the MCM helicase, facilitated by a homolog of eukaryotic Cdt1, which ensures proper encircling of the DNA duplex. Once loaded, the MCM helicases establish bidirectional replication forks that progress from each origin to complete duplication. Archaeal origins are frequently integrated with transcription units: many are located near or overlap with promoters of replication-related genes, allowing coordinated regulation of replication and transcription to minimize conflicts in these compact genomes.

Diversity in archaeal replication origins is evident across phyla, with Crenarchaeota (e.g., Sulfolobus and Pyrobaculum) generally featuring multiple, well-defined origins rich in ORBs, while Euryarchaeota (e.g., Haloferax and Methanothermobacter) exhibit greater variability, including cases with fewer origins or reliance on different initiator combinations. A 2024 review highlights spatiotemporal control mechanisms in hyperthermophilic archaea, such as temporally staggered firing of origins to manage replication timing under extreme conditions.

Eukaryotic Origins

Model Organisms

In the budding yeast Saccharomyces cerevisiae, autonomously replicating sequence (ARS) elements serve as well-defined origins of replication, first identified in the late 1970s through assays demonstrating their ability to maintain plasmids independently of the chromosome. These compact elements, typically 100-150 base pairs in length, contain an essential 11-bp ARS consensus sequence (ACS) with the motif 5'-WTTTAYRTTTW-3', where W denotes A or T, Y denotes C or T, and R denotes A or G. The S. cerevisiae genome contains approximately 400-500 such origins, which activate stochastically during S phase to ensure timely and complete DNA duplication without over-replication.

The fruit fly Drosophila melanogaster provides another key eukaryotic model, with genome-wide studies identifying roughly 5,000 replication origins distributed across its chromosomes. Many of these origins are associated with CG-rich regions, which exhibit open chromatin and facilitate efficient initiation, similar to CpG islands in vertebrates. In early embryos, where cell cycles are abbreviated to under 10 minutes, origins are closely spaced at intervals of 5-10 kilobases to support the extraordinarily rapid genome replication required for syncytial divisions.

Replication initiation in these model organisms follows a conserved mechanism: the origin recognition complex (ORC) binds the ACS or analogous sequence motifs, recruiting Cdc6 and Cdt1 to load double hexamers of the MCM helicase onto origin DNA during G1 phase. Cyclin-dependent kinase (CDK) phosphorylation then regulates the process by inhibiting re-loading of MCM after G1 and promoting activation in S phase through targeted modifications of ORC, Cdc6, and accessory factors. This licensing strategy is broadly shared among eukaryotes.

Key experimental approaches for mapping origins in yeast and Drosophila include two-dimensional gel electrophoresis, which visualizes replication bubble and fork structures in genomic DNA, and chromatin immunoprecipitation coupled with sequencing (ChIP-seq), which profiles binding of ORC and MCM proteins at high resolution across the genome. A 2025 study in budding yeast elucidated the precise timing of MCM double hexamer assembly at origins like ARS1, demonstrating how CDK-mediated constraints on this step have evolutionarily shaped origin structure and firing efficiency.
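Degenerate consensus motifs such as the yeast ACS are usually written with IUPAC ambiguity codes, and a quick way to see what such a motif matches is to expand it into a regular expression. The sketch below is only illustrative: the helper names and the toy sequence are hypothetical, the motif is passed in as a parameter, and the demo uses the 11-bp form WTTTAYRTTTW often quoted for the ACS (an assumption; substitute whichever consensus you are working with). A motif match alone does not establish a functional origin.

```python
import re

# Standard IUPAC nucleotide codes used here (subset).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def iupac_to_regex(motif: str) -> str:
    """Expand an IUPAC motif into a plain regex fragment."""
    return "".join(IUPAC[ch] for ch in motif.upper())


def find_motif(seq: str, motif: str):
    """Start positions of all matches on the given strand, overlaps included."""
    # The lookahead lets finditer report overlapping occurrences.
    pattern = re.compile("(?=" + iupac_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]


# Toy sequence (made up) carrying two sites that fit the demo motif.
toy = "GGCC" + "ATTTATATTTA" + "GGCC" + "TTTTACGTTTT" + "GGCC"
print(find_motif(toy, "WTTTAYRTTTW"))
```

Scanning the reverse complement as well would be needed for a strand-agnostic search; it is left out to keep the sketch short.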

Mammalian Origins

In mammalian cells, including humans, origins of replication lack the consensus sequences characteristic of simpler eukaryotes like budding yeast, exhibiting instead a high degree of flexibility and sequence independence that complicates their identification and characterization. This variability arises from contextual factors such as chromatin structure and epigenetic marks, allowing origins to form dynamically without fixed motifs. The human genome contains an estimated 50,000 active origins per cell cycle, though the total potential number, including dormant ones, may reach 100,000, with inter-origin spacing typically ranging from 50 to 300 kb. Many of these origins remain dormant during normal replication but can fire under replicative stress to ensure complete genome duplication and maintain stability.

Identification of mammalian origins has relied on methods like short nascent strand sequencing (SNS-seq), which quantifies short nascent DNA strands enriched at active origins to map their locations. More recently, computational tools such as the 2023 deep learning model Ori-FinderH have improved prediction by analyzing Z-curve features of DNA sequences, achieving approximately 92% accuracy in identifying human origins of varying lengths.

The origin recognition complex (ORC), composed of subunits ORC1-6, binds origins in mammals but does so dynamically, with subunit associations fluctuating across the cell cycle rather than maintaining stable tethering. A 2025 study using BrdU incorporation and single-molecule analysis revealed that most replication initiation events are dispersed throughout gene bodies, rather than being confined to promoters, highlighting the stochastic nature of origin usage in human cells.

Regulation of mammalian origins involves tissue-specific timing programs, where origin firing correlates with cell-type-specific chromatin landscapes and transcription patterns. ORC1 is subject to ubiquitination and proteasomal degradation during the S-to-M transition, preventing re-licensing and ensuring once-per-cycle replication. Dysregulated origin firing contributes to genomic instability in cancer, as seen in human papillomavirus (HPV) integrations at common fragile sites, where replication stress promotes breakage and viral genome insertion.
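As a quick sanity check on how the origin counts relate to the quoted spacing, the arithmetic below divides a genome length by a number of origins to get a mean inter-origin distance. The haploid genome size of roughly 3.1 Gb is an assumption brought in for the calculation, and the function name is hypothetical; the result is an average only and says nothing about how unevenly origins are actually distributed.

```python
HAPLOID_GENOME_BP = 3.1e9  # assumption: approximate haploid human genome size


def mean_spacing_kb(genome_bp: float, n_origins: int) -> float:
    """Average distance between origins, in kilobases."""
    return genome_bp / n_origins / 1e3


for n_origins in (50_000, 100_000):
    spacing = mean_spacing_kb(HAPLOID_GENOME_BP, n_origins)
    print(f"{n_origins:>7} origins -> ~{spacing:.0f} kb mean spacing")
```

Under these assumptions, 50,000 active origins correspond to a mean spacing near the lower end of the 50 to 300 kb range, and counting dormant origins roughly halves it.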

Viral Origins

Prokaryotic Viruses

Prokaryotic viruses, particularly bacteriophages, exhibit origins of replication that are compact and often leverage host bacterial machinery while incorporating specialized viral elements to ensure efficient propagation within infected cells. These origins enable rapid genome replication tailored to the lytic or lysogenic cycles, with many phages initiating replication bidirectionally before transitioning to alternative modes for amplification. Such adaptations highlight the evolutionary fine-tuning of phage origins to bacterial hosts, drawing loosely on chromosomal origins like those in Escherichia coli for sequence motifs but optimized for viral lifecycle demands.

A prominent example is the origin of replication (ori) in bacteriophage λ, a temperate phage that infects E. coli. The λ ori spans a region of approximately 200 bp containing four iterons (repeated 17- to 19-bp sequences of hyphenated dyad symmetry) to which the viral O protein binds as dimers, forming a complex that recruits the host DnaB helicase for unwinding. Replication initiates bidirectionally in theta mode from this site early in infection, producing circular daughter molecules, before switching to a rolling-circle mechanism mediated by viral and host factors to generate concatemers for packaging.

In contrast, the single-stranded DNA bacteriophage φX174 employs a distinct origin suited to its genome structure. Its 5,386-bp circular genome, fully sequenced in the 1970s, features the origin at nucleotide 4308, characterized by hairpin loops that serve as recognition sites for the host E. coli Rep helicase. The viral gene A protein nicks the replicative form at this site to initiate synthesis, with Rep unwinding the duplex while binding to the hairpin structures, facilitating primer-independent leading-strand synthesis and reliance on host primase for the lagging strand. This setup enables conversion of the single-stranded viral genome to a double-stranded replicative form, followed by asymmetric rolling-circle replication for progeny production.

Bacteriophage P1, which is maintained as a low-copy plasmid, utilizes a plasmid-like origin with a dedicated partition system for stable inheritance. The system includes parS centromere-like sites bound by the ParB protein, which interacts with the ParA ATPase to ensure equitable distribution during host division, independently of host partitioning functions but requiring RepA for replication initiation. RepA binds iterons at the origin to activate a secondary, DnaA-independent replicon, allowing controlled copy number maintenance in the lysogenic state before lytic replication shifts to host-dependent modes.

Host-virus interactions further refine these origins, as seen in phage T7, where the bifunctional gene 4 protein (gp4) acts as both helicase and primase. The primase domain recognizes specific sequences bearing 5'-GTC-3' and 5'-ATC-3' motifs on the lagging-strand template, synthesizing tetraribonucleotide primers every 40-50 nucleotides to support continuous replication fork progression.

Eukaryotic Viruses

Eukaryotic viruses that infect mammalian and other eukaryotic hosts utilize origins of replication (oris) that are compact, autonomous elements capable of directing DNA replication using a mix of viral and host proteins. These oris typically feature sequence motifs for viral initiator protein binding and AT-rich regions prone to unwinding, mimicking aspects of host chromosomal origins to hijack cellular replication machinery. Unlike prokaryotic phages, these viral oris support replication of larger genomes within the complex eukaryotic nucleus, often linked to viral lifecycle stages such as latency or lytic growth.

A prominent example is the simian virus 40 (SV40) ori, which consists of a core region with three pentanucleotide T-antigen binding sites, an early palindrome, and an adjacent AT-rich DNA unwinding element (DUE). The upstream enhancer contains one or two 72-bp repeats that are also bound by the viral T-antigen helicase, enhancing replication efficiency by facilitating T-antigen assembly into double hexamers that unwind the origin. Studies from the 1980s established that SV40 initiation relies on T-antigen for origin recognition and unwinding, independent of host origin recognition complex (ORC) binding to the viral ori, though it recruits host factors such as DNA polymerase α-primase and RPA for elongation.

In Epstein-Barr virus (EBV), the latent origin oriP comprises two key elements: the family of repeats (FR) for plasmid segregation and the dyad symmetry (DS) element, a site bound by the viral EBNA1 protein to recruit host replication factors. EBNA1 binding to DS establishes the replication start site, while FR binding stabilizes the episome during cell division; this dual function supports persistent infection linked to diseases like Burkitt's lymphoma. EBV also employs a distinct lytic origin, oriLyt, activated during viral reactivation for amplified genome production, in contrast to oriP's role in latency.

Human papillomavirus (HPV) oris are regulated by the viral E2 protein, which binds upstream regulatory elements and recruits the E1 helicase to three specific binding sites near the replication start, including palindromic E1 sites that facilitate E1 oligomerization. E1 forms a hexameric complex at the ori to unwind DNA, initiating bidirectional replication dependent on host polymerases. Recent structural studies have revealed the architecture of the E1 hexamer and its interaction with E2, highlighting how E2 stabilizes E1 loading for efficient viral persistence in epithelial cells.

Adenoviruses initiate replication at origins within their inverted terminal repeats (ITRs), where the viral terminal protein (TP) binds and forms a covalent linkage with the 5' dCMP, priming strand-displacement synthesis without RNA primers. The minimal ori spans the terminal 18 bp of the ITRs, featuring inverted repeats that position TP and the viral DNA polymerase for initiation, enabling replication of the linear genome from both ends. This protein-DNA covalent mechanism distinguishes adenoviral oris from those relying on host primases. Other viral oris, such as EBV oriP, co-opt mammalian cellular replication proteins like MCM and polymerases, adapting host factors for autonomous propagation.

Variations and Advances

Replication Directionality

Replication from origins of replication can proceed in either a unidirectional or bidirectional manner, with the latter being the predominant mode across prokaryotes and eukaryotes. In bidirectional replication, two replication forks diverge in opposite directions from the origin, effectively doubling the rate of genome duplication compared to a single fork. This process is initiated by the loading of two helicase complexes at the origin, each unwinding the DNA helix to allow polymerase access on both strands. For instance, in Escherichia coli, bidirectional replication from the oriC origin covers the 4.6 Mb chromosome in approximately 40 minutes under optimal conditions, facilitated by the coordinated progression of the forks toward the terminus.

Unidirectional replication, in contrast, involves a single replication fork proceeding in one direction from the origin, which is less common and typically observed in certain plasmids rather than chromosomal contexts. A representative example is the γ origin of the R6K plasmid, where replication initiates unidirectionally due to specific sequence elements and initiator proteins like π that direct fork progression in only one orientation, often requiring a nick or specialized protein interactions to establish polarity. The speed of a replication fork can be quantified as v = d/t, where d is the distance replicated and t is the time taken; in E. coli, this rate averages around 600 base pairs per second. Unidirectional modes demand mechanisms to prevent bidirectional initiation, such as asymmetric binding sites or terminators that block the opposing fork.

Some replication systems switch between modes, starting with bidirectional theta-form replication before transitioning to unidirectional rolling-circle replication, particularly in response to cellular cues or copy number control needs. Head-on collisions between replication forks and the transcription machinery can lead to replication stalling or genomic instability; bidirectional replication orients most highly expressed genes co-directionally with fork movement, minimizing such conflicts and reducing mutation rates. The prevalence of bidirectional replication offers evolutionary advantages by halving the time required for genome duplication and lowering error accumulation, as shorter fork travel distances reduce exposure to replication stress; recent studies in archaea, which also predominantly employ multiple bidirectional origins, reinforce this dominance across domains. Origin sequences, such as AT-rich regions and DnaA boxes in bacteria, facilitate the helicase loading that enables this divergent fork establishment.
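The relation v = d/t can be applied directly to the E. coli figures quoted above. The short calculation below is a sketch with hypothetical function names; it assumes forks move at constant speed and, in the bidirectional case, that each of the two forks covers half of the chromosome.

```python
GENOME_BP = 4.6e6  # E. coli chromosome size quoted in the text


def fork_speed_bp_per_s(genome_bp: float, duration_s: float, forks: int = 2) -> float:
    """Per-fork speed v = d / t, where each fork covers genome_bp / forks."""
    return genome_bp / forks / duration_s


def duplication_time_min(genome_bp: float, speed_bp_per_s: float, forks: int = 2) -> float:
    """Time t = d / v (in minutes) for the forks to cover the genome."""
    return genome_bp / forks / speed_bp_per_s / 60


# Speed implied by finishing in 40 minutes with two forks.
implied = fork_speed_bp_per_s(GENOME_BP, 40 * 60)
# Time needed at ~600 bp/s, with two forks versus a single fork.
bidirectional = duplication_time_min(GENOME_BP, 600, forks=2)
unidirectional = duplication_time_min(GENOME_BP, 600, forks=1)

print(f"implied fork speed: ~{implied:.0f} bp/s")
print(f"at 600 bp/s: ~{bidirectional:.0f} min (2 forks), ~{unidirectional:.0f} min (1 fork)")
```

Under these simplifying assumptions a 40-minute duplication implies a per-fork speed somewhat above 600 bp/s, so the two quoted figures are best read as approximate; the single-fork case shows the doubling of duplication time that bidirectional replication avoids.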

Dormant and Flexible Origins

In eukaryotic cells, a significant proportion of licensed replication origins remain dormant and do not fire during a normal S phase, serving as a reserve to ensure complete genome duplication. In budding yeast, approximately 50% of origins exhibit low firing efficiency and function as dormant sites, while in mammals, up to 90% of licensed origins are dormant under unperturbed conditions. These dormant origins are passively replicated by forks from nearby active origins but can be activated when replication forks stall due to stress, such as treatment with hydroxyurea (HU), which slows fork progression and triggers their firing to rescue stalled replication. The activation of dormant origins during such stress is mediated by the ATR kinase, which promotes local origin firing in response to single-stranded DNA accumulation at stalled forks, thereby preventing replication gaps. The firing of replication origins in eukaryotes is inherently and flexible, with only a activated in each to maintain even progression of replication forks across the . This probabilistic selection ensures that dormant origins are interspersed at intervals of approximately 100 , providing without over-. A 2025 study using BrdU incorporation and single-molecule in cells revealed that under normal conditions, most replication events (~80%) occur at dispersed sites throughout the , including gene bodies, rather than being confined to traditional initiation zones, highlighting the high cell-to-cell variability and nature of origin usage. Mechanisms underlying this flexibility include the pre-loading of excess MCM2-7 helicases during , far exceeding the number needed for firing (e.g., ~100,000 complexes in cells versus ~30,000-50,000 active origins), coupled with regulation by cyclin-dependent kinases (CDKs). Low CDK activity in G1 permits licensing, while rising S-phase CDK levels limit firing to a of origins, balancing to avoid conflicts with transcription or excessive fork density. 
The loss or dysfunction of dormant origins has profound consequences for genome integrity: stalled forks go unrescued, collapse under replication stress, and give rise to DNA double-strand breaks. In cells depleted of excess MCM2-7, stalled forks cannot be efficiently rescued, resulting in increased genomic instability, improper chromosome segregation, and heightened sensitivity to replication inhibitors. This vulnerability contributes to pathological states, including accelerated cellular aging through chronic replication stress and senescence, as observed in vivo, where aging tissues show dysregulated dormant origin activation and ATR-dependent responses. In cancer, impaired dormant origin function exacerbates oncogene-induced replication stress, promoting genomic instability and tumor progression, which underscores the tumor-suppressive role of dormant origins.
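The interplay between stochastic firing and passive replication described in this section can be illustrated with a toy simulation. The sketch below is only a simplified model under stated assumptions, not a published method: origins sit at fixed positions on a line, each is assigned a random intended firing time, forks move at a constant speed, and an origin counts as dormant if a fork from any other origin reaches it before its own firing time. The function names, the exponential waiting-time distribution, and all parameter values are hypothetical choices made for illustration.

```python
import random


def fired_mask(positions_kb, firing_times_min, fork_speed_kb_per_min):
    """Return one boolean per origin: True if it fires, False if it stays dormant.

    An origin fires only if its intended firing time comes before the earliest
    arrival of a fork started at any other origin. Checking every other origin
    (not only those that fire) gives the same result: a passively replicated
    origin is always preceded by the fork that replicated it.
    """
    origins = list(zip(positions_kb, firing_times_min))
    fired = []
    for i, (x_i, t_i) in enumerate(origins):
        earliest_arrival = min(
            (t_j + abs(x_i - x_j) / fork_speed_kb_per_min
             for j, (x_j, t_j) in enumerate(origins) if j != i),
            default=float("inf"),  # a lone origin is never passively replicated
        )
        fired.append(t_i < earliest_arrival)
    return fired


def dormant_fraction(n_origins, spacing_kb, mean_wait_min, fork_speed_kb_per_min, seed=0):
    """Fraction of evenly spaced licensed origins that are passively replicated."""
    rng = random.Random(seed)  # seeded so repeated runs are comparable
    positions = [i * spacing_kb for i in range(n_origins)]
    # Hypothetical choice: exponentially distributed intended firing times.
    times = [rng.expovariate(1.0 / mean_wait_min) for _ in range(n_origins)]
    fired = fired_mask(positions, times, fork_speed_kb_per_min)
    return 1.0 - sum(fired) / n_origins


if __name__ == "__main__":
    normal = dormant_fraction(200, 25, 30, fork_speed_kb_per_min=2.0, seed=1)
    stressed = dormant_fraction(200, 25, 30, fork_speed_kb_per_min=0.5, seed=1)
    print(f"dormant fraction, normal forks: {normal:.2f}")
    print(f"dormant fraction, slowed forks: {stressed:.2f}")
```

With the same seed, slowing the forks can only delay fork arrival at each origin, so every origin that fires in the normal run also fires in the slowed run, and the dormant fraction cannot increase. This mirrors, in a very reduced form, how fork slowing gives dormant origins more time to fire; it deliberately omits ATR signalling, limiting firing factors, and replication timing domains.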

Recent Developments

In 2023, researchers introduced Ori-FinderH, a deep learning-based computational tool that combines the Z-curve representation of DNA sequences with convolutional neural networks (CNNs) to predict human origins of replication (ORIs) of varying lengths, reportedly outperforming previous methods with up to 92% sensitivity and specificity on benchmark datasets. Separately, OriGen, a sequence-generation model described in a 2025 preprint, was used to design de novo plasmid origins of replication that retain essential functional elements such as AT-rich regions and DnaA-binding sites; experimental validation reportedly showed successful replication in bacterial hosts for sequences diverging from natural origins by up to 50%.
Advances in structural biology have illuminated the assembly and activation of the minichromosome maintenance (MCM) double hexamer, the core of the replicative helicase. In 2024, cryo-electron microscopy (cryo-EM) studies of human proteins revealed the dynamic loading of the MCM double hexamer onto DNA, capturing intermediate states in which the origin recognition complex (ORC) and CDC6 facilitate head-to-head hexamer assembly, with resolutions down to 3.2 Å highlighting conformational changes necessary for bidirectional helicase activation. Complementing this, 2025 investigations in budding yeast demonstrated how cyclin-dependent kinase (CDK) regulation remodels the MCM double hexamer during the cell cycle, shaping origin firing timing by promoting G1-specific loading and inhibiting re-licensing, thereby influencing evolutionary patterns of origin efficiency across yeast species.
In mammalian systems, recent findings have challenged traditional views of replication initiation sites. A 2025 study using BrdU incorporation coupled with single-molecule sequencing found that most human DNA replication initiates in a dispersed manner across gene bodies, often independent of promoter regions, with over 70% of events occurring in non-canonical, intergenic, or intronic loci rather than in discrete initiation zones.
Concurrently, integrative mapping via ChIP-exo in 2024 showed overlapping binding profiles of ORC and MCM2-7 at human origins, revealing a self-limiting licensing mechanism in which MCM loading displaces ORC, ensuring equitable distribution of licensed origins across the genome, with densities correlating with replication timing domains.
Synthetic applications of engineered origins have expanded into therapeutic contexts, particularly the design of viral vectors for gene therapy. In work on extremophiles, models of archaeal replication timing, derived from organisms such as Haloferax volcanii that can initiate replication without fixed origins, have informed the design of robust replication systems, enhancing yields in harsh conditions like high salinity or temperature, as reviewed in comparative archaeal studies. The stress-induced activation of dormant origins described earlier could offer similar adaptive flexibility in synthetic constructs.
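The Z-curve representation mentioned in connection with Ori-FinderH maps a DNA sequence onto three cumulative coordinates, which makes compositional features such as AT-rich stretches easy to read off numerically. The sketch below implements only the standard Z-curve transform; it is not taken from Ori-FinderH, whose actual feature extraction and network architecture are not described here. The function name z_curve and the handling of ambiguous bases are assumptions made for illustration.

```python
def z_curve(seq):
    """Return the cumulative Z-curve coordinates (x_n, y_n, z_n) of a DNA sequence.

    x_n = (A + G) - (C + T)   purine vs. pyrimidine
    y_n = (A + C) - (G + T)   amino vs. keto
    z_n = (A + T) - (G + C)   weak vs. strong hydrogen bonding
    where A, C, G, T are the base counts in the first n positions.
    """
    steps = {
        "A": (1, 1, 1),
        "C": (-1, 1, -1),
        "G": (1, -1, -1),
        "T": (-1, -1, 1),
    }
    x = y = z = 0
    coords = []
    for base in seq.upper():
        # Assumption: ambiguous symbols such as N leave the curve unchanged.
        dx, dy, dz = steps.get(base, (0, 0, 0))
        x, y, z = x + dx, y + dy, z + dz
        coords.append((x, y, z))
    return coords


if __name__ == "__main__":
    for point in z_curve("ATGC"):
        print(point)
```

In this representation an AT-rich stretch, such as a DNA unwinding element, appears as a steady rise in the z coordinate, which is one reason the transform is convenient as input to a sequence classifier.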
