Replication protein A
Replication protein A (RPA) is a heterotrimeric single-stranded DNA (ssDNA)-binding protein complex that serves as the primary eukaryotic guardian of ssDNA, protecting it from nucleases and secondary structures while coordinating multiple aspects of DNA metabolism, including replication, repair, recombination, and damage response.[1] Composed of three subunits—RPA70 (also known as RPA1, approximately 70 kDa), RPA32 (RPA2, 32–34 kDa), and RPA14 (RPA3, 14 kDa)—RPA features six oligonucleotide/oligosaccharide-binding (OB)-fold domains distributed across the subunits, enabling high-affinity binding to ssDNA in a dynamic, horseshoe-shaped conformation that accommodates 20–30 nucleotides.[1][2] These domains include high-affinity sites in DBD-A and DBD-B of RPA70 for initial ssDNA recognition, with additional domains (DBD-C, DBD-D, DBD-E, and DBD-F) facilitating cooperative assembly and interactions with other proteins.[2] In DNA replication, RPA binds transiently exposed ssDNA at replication forks to prevent degradation and aberrant folding, such as G-quadruplex formation, while stimulating polymerases like DNA polymerase α and δ and aiding fork restart through recruitment of factors like PrimPol.[1] Beyond replication, RPA is indispensable for DNA repair pathways: it supports nucleotide excision repair (NER) by interacting with proteins such as XPA, XPG, and XPF–ERCC1; facilitates homologous recombination (HR) by protecting ssDNA and assisting in RAD51 filament dynamics; and contributes to base excision repair and double-strand break end resection via enzymes like Exo1.[1] Additionally, RPA plays a central role in the DNA damage response by binding ssDNA to activate the ATR kinase pathway through associations with ATRIP, Rad17, and Nbs1, and it undergoes regulatory phosphorylation (e.g., at Ser33 on RPA32 by ATM, ATR, or DNA-PKcs) to fine-tune these processes.[1] RPA's multifunctional nature extends to other cellular processes, including telomere maintenance, 
nucleosome assembly, transcription regulation, and RNA metabolism, underscoring its broader impact on genome stability.[1] In higher eukaryotes like plants, multiple RPA paralogs form distinct complexes (e.g., Types A, B, and C) specialized for chloroplast or nuclear functions, reflecting evolutionary adaptations while conserving core roles in DNA synthesis and repair.[3] Dysregulation or oxidative damage to RPA can impair repair efficiency and promote genomic instability, highlighting its therapeutic potential as a target for cancer treatments via small-molecule inhibitors of its N-terminal domain.[1]
Discovery and Overview
Historical Discovery
The development of an in vitro system for simian virus 40 (SV40) DNA replication in the late 1970s and early 1980s using human cell extracts marked a pivotal advance in understanding eukaryotic DNA synthesis. This system, established by researchers including Thomas J. Kelly, demonstrated that SV40 large T antigen, along with cellular factors, was sufficient to initiate and complete DNA replication from the viral origin. During fractionation of these extracts, a cellular protein fraction, initially termed Fraction A, was found essential for supporting T antigen-dependent unwinding of the origin and subsequent primer synthesis by DNA polymerase α-primase, highlighting its role as a key eukaryotic replication factor.[4] Further purification efforts in the mid-1980s led to the isolation of the active component from Fraction A, named Replication Protein A (RPA) or Replication Factor A (RFA), and also referred to as the human single-stranded DNA-binding protein (hSSB). In 1988, Marc S. Wold and colleagues in the Kelly laboratory at Johns Hopkins University purified RPA to homogeneity from HeLa cell extracts and characterized it as a multisubunit protein required for SV40 DNA replication in vitro. The protein was shown to bind specifically to single-stranded DNA, stabilizing unwound regions at the replication fork and facilitating elongation by cellular polymerases, confirming its indispensable role in eukaryotic DNA synthesis. This work built on studies of pol α-primase activity, where RPA was identified as a stimulatory cofactor during 1983–1987 biochemical assays.[4] Subsequent investigations in the early 1990s elucidated RPA's oligomeric structure, establishing it as a stable heterotrimer composed of subunits of approximately 70, 32, and 14 kDa. Cloning and functional analysis of these subunits, including demonstrations that each is essential for activity, refined the initial observations of multiple polypeptides in the 1988 purification. 
These studies, including genetic complementation experiments, solidified RPA's conserved architecture across species.[5]
RPA's evolutionary conservation was first noted in the late 1980s through the identification of functional homologs in yeast. In 1989, Stephen J. Brill and Bruce Stillman purified a Saccharomyces cerevisiae protein complex, yeast Replication Factor A (yRFA), comprising subunits RFA1, RFA2, and RFA3, which substituted for human RPA in SV40 replication assays and bound single-stranded DNA similarly. Genetic analyses in the 1990s confirmed that mutations in these yeast genes (RFA1–3) were lethal and disrupted DNA metabolism, underscoring RPA's ancient eukaryotic origin.
General Properties
Replication protein A (RPA) is a heterotrimeric protein complex that serves as the primary eukaryotic single-stranded DNA (ssDNA)-binding protein, playing an essential role in stabilizing ssDNA during various genome maintenance processes such as replication, repair, and recombination.[6][1] This complex coats ssDNA to form protective nucleoprotein filaments, shielding it from nuclease degradation and preventing the formation of unwanted secondary structures that could impede DNA transactions.[2][7] RPA exhibits high-affinity binding to ssDNA, with a dissociation constant (K_d) in the range of approximately 1-10 nM, enabling it to rapidly and specifically associate with exposed ssDNA regions.[7][8]
In mammalian cells, RPA is highly abundant, with an estimated 10^5 molecules per cell, ensuring sufficient availability to respond to cellular demands during active DNA synthesis or damage response.[9] The protein complex localizes dynamically to replication forks and DNA damage sites, where its concentration increases transiently to support ongoing genomic processes.[10]
RPA is highly conserved across eukaryotes, from yeast to humans, with its core ssDNA-binding and stabilizing functions remaining largely unchanged since the divergence of major eukaryotic lineages approximately 1.5 billion years ago.[11][12] This evolutionary stability underscores RPA's fundamental importance in eukaryotic DNA metabolism.[6]
Molecular Structure
Subunit Composition
Replication protein A (RPA) in humans is a heterotrimeric complex composed of three subunits: RPA1 (also known as RPA70, approximately 70 kDa), RPA2 (RPA32, approximately 32 kDa), and RPA3 (RPA14, approximately 14 kDa).[6] These subunits are encoded by the RPA1, RPA2, and RPA3 genes, located on chromosomes 17p13.3, 1p35.3, and 7p21.3, respectively.[13] The RPA1 subunit serves as the central scaffold, featuring an N-terminal domain (RPA70N) that mediates protein-protein interactions and a C-terminal domain responsible for associating with the smaller subunits.[14]
The assembly of the RPA heterotrimer involves the formation of a stable RPA32-RPA14 heterodimer, which binds to the C-terminal region of RPA70, resulting in a tightly associated complex with a dissociation constant below 1 nM.[15] This architecture ensures the integrity of the complex under physiological conditions, as the trimer remains stable even in the presence of up to 6 M urea.[16] All three subunits are essential for cellular viability, with deletions or mutations in any subunit, particularly in RPA70, disrupting heterotrimer formation and leading to defects in DNA metabolism.[17]
Intersubunit contacts are primarily mediated through the trimerization core, where the RPA32-RPA14 heterodimer interacts with RPA70 via a conserved zinc-finger motif in the C-terminal domain of RPA70, stabilizing the overall structure and facilitating coordinated function.[18] This motif coordinates a zinc ion essential for maintaining the subunit interfaces, and its disruption impairs complex assembly without affecting individual subunit folding.[19]
DNA-Binding Domains and Conformational Dynamics
Replication protein A (RPA) contains six OB-fold domains, four of which (DBD-A, DBD-B, DBD-C, and DBD-D) make the principal contacts with single-stranded DNA (ssDNA). Three of these four lie in the RPA70 subunit and one in RPA32. DBD-A, spanning approximately residues 181-290 of RPA70, exhibits high-affinity binding to ssDNA and serves as the primary site for initial recognition.[20] DBD-B, encompassing approximately residues 300-422 of RPA70, provides moderate-affinity binding, contributing to the stability of the complex. DBD-C, located at approximately residues 432-616 of RPA70, displays low-affinity binding and supports structural extension along the DNA.[15] Finally, DBD-D, covering approximately residues 43-171 of RPA32, acts as an auxiliary, low-affinity domain that enhances overall binding.
RPA engages ssDNA through multiple binding modes, with footprints of up to approximately 30 nucleotides per trimer, accompanied by allosteric changes that adjust domain positioning. In the initiation phase, DBD-A and DBD-B contact ssDNA first, forming a compact core that accommodates 8-10 nucleotides in the low-occupancy mode.[21] Subsequent recruitment of DBD-C and DBD-D extends the contact, enabling the full 30-nucleotide high-affinity mode where all four domains cooperatively wrap the DNA in a polarized manner.[14] These modes lack sequence specificity, allowing RPA to bind diverse ssDNA regions generated during DNA metabolism.[1]
Conformational dynamics of RPA facilitate its transition between an open state for ssDNA searching and a closed state upon binding. In the open conformation, the domains are flexible, promoting rapid diffusion along DNA; binding induces closure, with structural studies revealing an approximately 70° rotation in the relative orientation of DBD-A and DBD-D to optimize ssDNA engagement.[14] Nuclear magnetic resonance (NMR) spectroscopy has elucidated linker flexibility between domains in the unbound state, while cryo-electron microscopy (cryo-EM) structures capture the closed filament-like arrangement on extended ssDNA, highlighting allosteric propagation from core domains to auxiliary ones.[22]
RPA molecules exhibit cooperativity on ssDNA, polarizing with a 5'-to-3' orientation to form continuous filaments covering ~30-35 nucleotides per trimer without gaps or overlaps. This polarized assembly, driven by inter-trimer contacts between DBD-C and adjacent RPA units, ensures efficient coating of long ssDNA stretches while maintaining accessibility for partner proteins.[2]
Biological Functions
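As a rough sense of scale for the footprint figures quoted above (about 30 to 35 nucleotides per trimer, bound end to end without gaps or overlaps), the number of trimers needed to coat an ssDNA stretch reduces to integer division. The Python sketch below assumes idealized, fully packed binding. The function name `coat_ssdna` and the 1,000-nucleotide example length are hypothetical choices made for illustration, not values taken from the cited studies.

```python
from typing import Tuple


def coat_ssdna(ssdna_nt: int, footprint_nt: int = 30) -> Tuple[int, int]:
    """Return (full_trimers, leftover_nt) for end-to-end, non-overlapping binding.

    ssdna_nt     -- length of the exposed ssDNA in nucleotides
    footprint_nt -- nucleotides occluded by one RPA trimer (assumed constant)
    """
    if ssdna_nt < 0:
        raise ValueError("ssdna_nt must be non-negative")
    if footprint_nt <= 0:
        raise ValueError("footprint_nt must be positive")
    # divmod gives the number of whole footprints and the uncovered remainder.
    return divmod(ssdna_nt, footprint_nt)


if __name__ == "__main__":
    # Hypothetical 1,000 nt stretch, evaluated at both ends of the quoted range.
    for footprint in (30, 35):
        trimers, leftover = coat_ssdna(1000, footprint)
        print(f"footprint {footprint} nt: {trimers} trimers, {leftover} nt uncovered")
```

For a 1,000-nucleotide stretch this gives 33 trimers with 10 nucleotides left over at a 30-nucleotide footprint, and 28 trimers with 20 left over at 35. In practice the remainder could be occupied by a trimer in a shorter binding mode, which this simple model ignores.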
Role in DNA Replication
Replication protein A (RPA) plays a central role in eukaryotic DNA replication by binding single-stranded DNA (ssDNA) generated during the unwinding of the replication fork, thereby stabilizing these regions and coordinating the assembly of replication machinery. During initiation, after pre-replication complex (pre-RC) activation by S-phase kinases, the MCM2-7 helicase initiates unwinding at origins, generating ssDNA that RPA binds with high affinity (dissociation constant, K_d, of approximately 1 nM). This coating stabilizes the nascent replication bubble, prevents strand re-annealing, and supports the function of the Cdc45-MCM2-7-GINS (CMG) helicase complex, essential for origin unwinding and replication start.[23][24]
In the elongation phase, RPA coats the ssDNA on the lagging strand, shielding it from nucleases and secondary structure formation to prevent replication fork collapse. This coating is crucial for maintaining fork progression and coordinating the synthesis of Okazaki fragments. RPA interacts directly with DNA polymerase α-primase, stimulating its primase activity to enhance RNA primer synthesis on the RPA-bound ssDNA template, thereby facilitating efficient priming. Additionally, RPA promotes the handoff of primers from polymerase α to polymerase δ for extension, ensuring continuous lagging-strand synthesis.[25][26]
RPA also integrates replication with cell cycle checkpoints, particularly during S-phase stress. When replication forks stall, extended RPA-ssDNA filaments serve as a platform to recruit and activate the ATR kinase via its cofactor ATRIP, triggering checkpoint signaling to halt cell cycle progression and allow fork recovery. This mechanism prevents genomic instability by coordinating replication fidelity with damage response pathways.[27]
Role in DNA Repair
Replication Protein A (RPA) is essential for multiple DNA repair pathways, where it binds with high affinity to single-stranded DNA (ssDNA) intermediates to prevent nucleolytic degradation, inhibit secondary structure formation, and serve as a platform for recruiting repair proteins.[1] This stabilization is critical across repair processes that generate ssDNA, enabling coordinated progression from damage recognition to synthesis and ligation.[28]
In nucleotide excision repair (NER), RPA aids in damage verification and stabilizes post-incision ssDNA gaps, facilitating resynthesis by DNA polymerase delta.[1] It interacts with xeroderma pigmentosum protein A (XPA) to bend or unwind DNA around lesions, enhancing recognition, and positions endonucleases such as XPG and XPF-ERCC1 for dual incisions, particularly after UV-induced cyclobutane pyrimidine dimers.[29] Acetylation of RPA1 at lysine 163 further strengthens XPA binding, increasing NER efficiency in human cells.[1]
During base excision repair (BER), RPA binds ssDNA flanking abasic sites processed by apurinic/apyrimidinic endonuclease 1 (APE1), supporting long-patch repair by stimulating strand-displacement synthesis via DNA polymerase epsilon and flap endonuclease 1 activity.[30] It coordinates with DNA ligase I to seal nicks after gap filling, enhancing overall repair fidelity for oxidative and alkylative base damage.[31]
In interstrand crosslink repair, RPA coats ssDNA generated during crosslink unhooking by the Fanconi anemia (FA) pathway, preventing collapse of stalled replication forks and promoting downstream homologous recombination.[32] Through its RPA32 subunit, RPA recruits the E3 ubiquitin ligase RFWD3 to these sites, where RFWD3-mediated ubiquitylation of RPA facilitates timely unloading and repair completion, with defects linked to FA genome instability.[32]
RPA also supports translesion synthesis (TLS) by stabilizing ssDNA at lesion-stalled forks and tethering TLS polymerases, such as polymerase eta, to damaged templates via regulation of PCNA mono-ubiquitylation by Rad18.[1] This enables error-free bypass of UV-induced lesions, maintaining replication continuity while minimizing mutagenesis.[33]
Role in DNA Recombination
Replication protein A (RPA) plays a critical role in the initiation of homologous recombination (HR) by coating the single-stranded DNA (ssDNA) generated through resection of double-strand break (DSB) ends. Following DSB formation, exonucleases such as EXO1 and DNA2, in coordination with helicases like BLM, process the DNA ends to produce 3' ssDNA overhangs, which are rapidly bound by RPA to prevent nucleolytic degradation and secondary structure formation. This RPA-ssDNA platform serves as the substrate for recombinase loading, but RPA initially antagonizes the binding of RAD51, the key recombinase in HR, due to its high affinity for ssDNA. The displacement of RPA by RAD51 is mediated by recombination mediators such as RAD52 and BRCA1/2, which facilitate the nucleation and polymerization of RAD51 filaments on the RPA-coated ssDNA, enabling the presynaptic phase of HR.[7][34][35] During strand invasion, RPA stabilizes the 3' overhangs, maintaining their accessibility for homology search and facilitating the formation of displacement loops (D-loops) upon RAD51-mediated invasion of the homologous duplex DNA. Partial displacement of RPA by RAD51 results in mixed nucleoprotein filaments that enhance the efficiency of strand exchange, as complete RPA removal can lead to unstable filaments, while residual RPA may aid in preventing non-specific interactions. This dynamic exchange is crucial for the presynaptic filament's ability to identify homologous sequences and initiate repair synthesis. In alternative non-homologous end joining (alt-NHEJ), particularly in BRCA-deficient cells where HR is compromised, RPA promotes this error-prone pathway by stabilizing the microhomology-containing ssDNA tails generated during resection, allowing annealing at short homologous sequences despite impaired RAD51 loading.[36][37] In meiotic recombination, RPA modulates crossover formation by supporting the assembly of recombination intermediates that lead to class I crossovers. 
RPA's binding to ssDNA at programmed DSBs enables the loading of meiosis-specific recombinases DMC1 and RAD51, and its persistence influences the resolution of joint molecules. Depletion of RPA significantly reduces MLH1 foci, which mark sites of future crossovers, indicating that RPA is required for the maturation of recombination events into crossovers through interactions within the HR machinery, including the MLH1-MLH3 complex. Additionally, RPA suppresses inappropriate recombination by coating ssDNA tails to prevent the formation of secondary structures, such as hairpins or G-quadruplexes, which could lead to aberrant annealing or stalled repair.[38][7]
Protein Interactions and Regulation
Key Interacting Partners
Replication Protein A (RPA) interacts with over 50 known protein partners through its modular domains, enabling dynamic coordination of DNA metabolic processes such as replication, repair, and recombination.[1] These interactions primarily occur via specific sites on the RPA70 and RPA32 subunits, including the N-terminal domain of RPA70 (RPA70N or DBD-F), the C-terminal DNA-binding domain of RPA70 (RPA70C), and the acidic loop and C-terminal winged-helix domain of RPA32 (RPA32C).[1][39] The modular architecture allows RPA to serve as a scaffold, facilitating partner recruitment to single-stranded DNA (ssDNA) and modulating enzymatic activities for efficient substrate handoff.[22] In DNA replication, RPA engages key partners to promote initiation and progression. DNA polymerase alpha (Pol α), along with its associated primase, binds primarily to RPA70N, stimulating primer synthesis and enhancing the fidelity of DNA priming at replication origins.[26] This interaction positions Pol α on RPA-coated ssDNA, enabling the transition from RNA priming to DNA extension.[39] Similarly, the replication factor C (RFC) complex, which loads the proliferating cell nuclear antigen (PCNA) processivity factor, interacts with RPA via RPA32, targeting RFC to primer-template junctions and facilitating polymerase switching during elongation.[40] For DNA repair and recombination, RPA's partners enable damage recognition and strand invasion. 
In nucleotide excision repair (NER), XPA binds to RPA70 DBD-A and RPA32C, stabilizing the pre-incision complex and orienting endonucleases for lesion removal.[41][42] In homologous recombination (HR), RAD51 interacts with RPA through the RPA32 acidic loop, allowing displacement of RPA from ssDNA to form nucleoprotein filaments essential for strand invasion.[43] This exchange is mediated by accessories like BRCA2 via its DSS1 subunit, which targets the RPA32 acidic loop to promote RAD51 loading and HR facilitation.[43] RAD52 also binds RPA32C, aiding RPA removal and annealing of complementary ssDNA strands. Additionally, ATR-ATRIP binds RPA70N, recruiting the kinase complex to RPA-ssDNA platforms for damage signaling and checkpoint activation.[44][45]
Tumor suppressor proteins further modulate RPA's roles in genome maintenance. p53 binds RPA70N, suppressing inappropriate HR and coordinating repair responses to maintain genomic stability.[46] BRCA1 and BRCA2 interact with RPA to enhance HR efficiency; BRCA2, in particular, facilitates RAD51 nucleation on RPA-bound ssDNA through direct or DSS1-mediated contacts.[43]

| Partner | Binding Site | Modulation of RPA Activity | Key Reference |
|---|---|---|---|
| Pol α/primase | RPA70N | Stimulates priming and replication initiation | Maga et al. (2001) [DOI: 10.1074/jbc.M009599200] |
| RFC | RPA32 | Loads PCNA for processive synthesis | Cai et al. (1996) [DOI: 10.1091/mbc.7.10.1865] |
| XPA | RPA70 DBD-A; RPA32C | Scaffolds NER complex assembly | Mer et al. (2000) [DOI: 10.1016/s0092-8674(00)00136-7] |
| RAD51 | RPA32 acidic loop | Enables ssDNA handoff for HR filament formation | Yang et al. (2015) [DOI: 10.1074/jbc.M115.651455] |
| ATR-ATRIP | RPA70N | Recruits for DNA damage signaling | Zou & Elledge (2003) [DOI: 10.1126/science.1083430] |
| p53 | RPA70N | Suppresses HR and aids checkpoint control | Dutta et al. (1993) [DOI: 10.1038/365079a0] |
| BRCA1/2 | RPA70 (via DSS1 for BRCA2) | Promotes RAD51 loading in HR | Wong et al. (2003) [DOI: 10.1038/sj.onc.1206071] |
| RAD52 | RPA32C | Mediates RPA displacement and strand annealing | Park et al. (1996) [DOI: 10.1074/jbc.271.31.18996] |
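
The table can also be read the other way round, grouping partners by the RPA site they use. The Python sketch below does this. The dictionary is transcribed from the Binding Site column, while the names `PARTNER_SITES`, `partners_by_site`, and `shared_sites` are hypothetical helpers introduced only for illustration. Sharing a site in this listing does not by itself establish that two partners compete for it.

```python
from collections import defaultdict
from typing import Dict, List

# Partner -> binding sites, transcribed from the table above.
# Splitting the XPA entry into two sites and shortening the BRCA1/2 entry
# to "RPA70" are simplifications made for this sketch.
PARTNER_SITES: Dict[str, List[str]] = {
    "Pol α/primase": ["RPA70N"],
    "RFC": ["RPA32"],
    "XPA": ["RPA70 DBD-A", "RPA32C"],
    "RAD51": ["RPA32 acidic loop"],
    "ATR-ATRIP": ["RPA70N"],
    "p53": ["RPA70N"],
    "BRCA1/2": ["RPA70"],
    "RAD52": ["RPA32C"],
}


def partners_by_site(partner_sites: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert the partner -> sites mapping into site -> partners."""
    by_site: Dict[str, List[str]] = defaultdict(list)
    for partner, sites in partner_sites.items():
        for site in sites:
            by_site[site].append(partner)
    return dict(by_site)


def shared_sites(partner_sites: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only the sites listed for more than one partner."""
    return {
        site: partners
        for site, partners in partners_by_site(partner_sites).items()
        if len(partners) > 1
    }


if __name__ == "__main__":
    for site, partners in shared_sites(PARTNER_SITES).items():
        print(f"{site}: {', '.join(partners)}")
```

Run as a script, this lists RPA70N (Pol α/primase, ATR-ATRIP, p53) and RPA32C (XPA, RAD52) as the two sites that the table assigns to more than one partner.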