Nucleic acid structure
Nucleic acids are a class of large biomolecules essential to all known forms of life, serving as the primary molecules for storing and transmitting genetic information in cells and viruses.[1] They consist of two main types: deoxyribonucleic acid (DNA), which encodes the genetic instructions for protein synthesis and is primarily found in the nucleus of eukaryotic cells, and ribonucleic acid (RNA), which plays diverse roles in protein synthesis, gene regulation, and other cellular processes.[2] DNA and RNA are both polymers composed of repeating monomeric units called nucleotides, linked together by phosphodiester bonds to form long chains.[2] Each nucleotide monomer in nucleic acids comprises three key components: a nitrogenous base, a five-carbon sugar (ribose or deoxyribose), and one or more phosphate groups.[1] In DNA, the sugar is deoxyribose, and the four nitrogenous bases are adenine (A), thymine (T), cytosine (C), and guanine (G); in RNA, the sugar is ribose, and thymine is replaced by uracil (U), with the other bases remaining the same.[2] The sequence of these bases along the nucleic acid chain encodes genetic information, with specific base-pairing rules—A with T (or U in RNA), and C with G—enabling the complementary nature of strands.[3] The most iconic structural feature of DNA is its double helix configuration, consisting of two antiparallel polynucleotide strands twisted around a common axis, stabilized by hydrogen bonds between complementary base pairs and hydrophobic interactions in the core.[3] This right-handed helix, with a diameter of approximately 2 nm and 10 base pairs per helical turn (spaced 0.34 nm apart), was first proposed by James Watson and Francis Crick in 1953 based on X-ray diffraction data from Rosalind Franklin and Maurice Wilkins.[3] In contrast, RNA is typically single-stranded and flexible, allowing it to fold into intricate secondary structures such as hairpins, loops, and stems through intramolecular base pairing, which are 
crucial for its functional roles like catalysis in ribozymes or recognition in messenger RNA.[2] These structural differences underpin the distinct biological functions of DNA as a stable genetic archive and RNA as a versatile intermediary.[1]
Basic components
Nucleobases
Nucleobases are the aromatic nitrogenous compounds that form the core informational components of nucleic acids, distinguishing DNA from RNA through specific variants. These molecules attach to sugar-phosphate backbones via glycosidic bonds and enable sequence-specific recognition through hydrogen bonding patterns. The primary nucleobases are classified into purines and pyrimidines based on their ring structures, with ionization properties governed by pKa values that ensure neutrality at physiological pH. Purine nucleobases, adenine (A) and guanine (G), possess a bicyclic structure comprising a six-membered pyrimidine ring fused to a five-membered imidazole ring, providing extended aromaticity and rigidity. Adenine, chemically known as 6-aminopurine, features an amino group at position 6, while guanine, or 2-amino-6-oxopurine, includes an amino group at position 2 and a keto group at position 6. Both exist predominantly in their amino-keto tautomeric forms under neutral conditions, with rare enol or imino tautomers occurring transiently and potentially influencing base pairing fidelity. The pKa values for these purines—approximately 4.15 for adenine (protonation at N1) and 9.2 for guanine (deprotonation at N1-H)—position them as neutral species at pH 7, minimizing electrostatic repulsion in nucleic acid polymers.[4][5] Pyrimidine nucleobases, cytosine (C), thymine (T) in DNA, and uracil (U) in RNA, are characterized by a single six-membered heterocyclic ring with nitrogens at positions 1 and 3. Cytosine is 4-amino-2-oxopyrimidine, bearing an amino group at position 4 and a keto group at position 2; thymine is 5-methyl-2,4-dioxopyrimidine, with keto groups at positions 2 and 4 and a methyl substituent at position 5; uracil is 2,4-dioxopyrimidine, identical to thymine but lacking the 5-methyl group. 
This methyl group in thymine enhances hydrophobic interactions and stability in DNA compared to uracil in RNA, contributing to distinct evolutionary roles in genetic storage versus expression. Their pKa values, around 4.5 for cytosine (protonation at N3), 9.7 for thymine, and 9.5 for uracil (deprotonation at N3-H), similarly favor neutral forms at physiological pH.[6][4]
The hydrogen bonding capabilities of these nucleobases dictate complementary pairing. Adenine forms two hydrogen bonds with thymine or uracil: its N6-H donor pairs with O4 of T/U, and its N1 acceptor pairs with N3-H of T/U. Guanine forms three hydrogen bonds with cytosine: its O6 acceptor pairs with N4-H of cytosine, its N1-H donor with N3, and its N2-H donor with O2. These patterns, listed below, ensure specificity and stability in nucleic acid duplexes.
Adenine-Thymine (or Uracil) Pair (2 H-bonds):
- N6-H (A) ... O4 (T/U)
- N1 (A) ... H-N3 (T/U)
Guanine-Cytosine Pair (3 H-bonds):
- O6 (G) ... H-N4 (C)
- N1-H (G) ... N3 (C)
- N2-H (G) ... O2 (C)
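The claim that these pKa values leave the bases essentially neutral at pH 7 can be checked with the Henderson-Hasselbalch relation. The following is a minimal sketch, assuming each base behaves as a single independent ionizable site with the pKa quoted in this section; the function and dictionary names are illustrative only.

```python
def fraction_protonated(pka: float, ph: float = 7.0) -> float:
    """Fraction of a site in its protonated form at a given pH
    (Henderson-Hasselbalch, single-site assumption)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pKa values as quoted in the text. "protonation" means the charged species
# is the protonated cation (A at N1, C at N3); "deprotonation" means the
# charged species is the deprotonated anion (G at N1-H, T/U at N3-H).
SITES = {
    "adenine": (4.15, "protonation"),
    "cytosine": (4.5, "protonation"),
    "guanine": (9.2, "deprotonation"),
    "thymine": (9.7, "deprotonation"),
    "uracil": (9.5, "deprotonation"),
}


def charged_fraction(base: str, ph: float = 7.0) -> float:
    """Fraction of the base carrying a net charge at the given pH."""
    pka, kind = SITES[base]
    protonated = fraction_protonated(pka, ph)
    return protonated if kind == "protonation" else 1.0 - protonated


if __name__ == "__main__":
    for name in SITES:
        print(f"{name:9s} charged fraction at pH 7: {charged_fraction(name):.4f}")
```

Under these assumptions every base is less than 1% charged at pH 7, which is consistent with the neutrality described above. Real pKa values shift with stacking, pairing, and local environment, so the numbers are indicative only.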
Sugars and phosphate backbone
The sugar-phosphate backbone forms the structural scaffold of nucleic acids, consisting of alternating deoxyribose (in DNA) or ribose (in RNA) sugars linked to phosphate groups via phosphodiester bonds. In DNA, the sugar is 2-deoxy-D-ribose, a pentose lacking a hydroxyl group at the 2' carbon position, while RNA incorporates D-ribose, which includes this 2'-OH group. Both sugars exist predominantly in the β-D-furanose (five-membered ring) conformation, with the anomeric carbon (C1') linked to the nucleobase via a β-glycosidic bond, ensuring a consistent orientation in the polymer chain. This furanose form provides rigidity to the backbone while allowing rotational flexibility around the C4'-C5' and C3'-O3' bonds.
The absence of the 2'-OH in deoxyribose enhances DNA's chemical stability by preventing intramolecular nucleophilic attacks that could disrupt the phosphodiester linkages, making DNA suitable for long-term genetic storage. In contrast, ribose's 2'-OH group increases RNA's susceptibility to hydrolysis but also imparts greater conformational flexibility, particularly in single-stranded regions, enabling diverse folding motifs essential for RNA's functional roles. This structural difference influences overall polymer dynamics: RNA duplexes tend toward A-form helices, with a deep, narrow major groove and a shallow, wide minor groove, because the 2'-OH favors the C3'-endo sugar pucker, while DNA favors the more elongated B-form.
Phosphodiester bonds are formed through condensation polymerization, where the 5'-phosphate group of one nucleotide reacts with the 3'-OH group of another, eliminating a water molecule and creating a covalent linkage between the 5' carbon and 3' carbon across the phosphate. This unidirectional 5' to 3' polarity defines the orientation of nucleic acid chains, with synthesis and replication processes proceeding exclusively in this direction. 
The resulting backbone is polyanionic, as each phosphate carries a negative charge at physiological pH (pKa ≈ 1-2), necessitating counterions for charge neutralization and structural integrity. Monovalent cations such as Na⁺ and K⁺ serve as primary counterions, coordinating directly with the negatively charged phosphate oxygens to screen electrostatic repulsion and stabilize the helix. Molecular dynamics studies reveal that Na⁺ interacts more strongly with phosphate groups due to its higher charge density, often forming closer ion-phosphate contacts, whereas K⁺ prefers interactions with nucleobase atoms in the grooves, influencing hydration patterns and minor conformational adjustments. These ion-specific bindings are crucial for maintaining backbone solvation and preventing aggregation in cellular environments.
The 2'-OH group in RNA uniquely enables base-catalyzed hydrolysis of the phosphodiester bond via a transesterification mechanism, where the deprotonated 2'-O⁻ acts as a nucleophile to attack the adjacent phosphorus, forming a 2',3'-cyclic phosphate intermediate and cleaving the chain. This reaction proceeds efficiently under alkaline conditions (pH > 7), with rate enhancements from general base catalysis, rendering RNA far less stable than DNA; phosphodiester bonds in DNA are approximately 100-200 times more resistant to such cleavage due to the missing 2'-OH. This inherent lability contributes to RNA's transient nature in vivo, contrasting with DNA's robustness.
Nucleotides and polymerization
Nucleotides consist of a nucleobase linked to a pentose sugar (ribose in RNA or deoxyribose in DNA) via a β-N-glycosidic bond, with one to three phosphate groups attached to the 5'-oxygen of the sugar, forming nucleoside monophosphates (NMPs), diphosphates (NDPs), or triphosphates (NTPs).[9] The triphosphate forms are the primary substrates for nucleic acid synthesis: deoxyribonucleoside triphosphates (dNTPs) for DNA and ribonucleoside triphosphates (NTPs) for RNA, providing the energy needed for polymerization through cleavage of the high-energy phosphoanhydride bonds.[10] In cells, NTP levels are maintained higher than NMP or NDP levels to support efficient synthesis, with kinases such as nucleoside diphosphate kinase catalyzing the transfer of phosphate from ATP to NDPs.[11] Nucleic acid polymerization occurs enzymatically via DNA or RNA polymerases, which catalyze the template-directed addition of nucleotides to a growing chain. DNA polymerase, first isolated by Arthur Kornberg in 1956, incorporates dNTPs complementary to a DNA template, forming a phosphodiester bond between the 3'-hydroxyl of the primer terminus and the 5'-phosphate of the incoming dNTP, with concomitant release of pyrophosphate (PPi) that drives the reaction forward. RNA polymerase follows a similar mechanism but initiates de novo without a primer, using NTPs to synthesize RNA complementary to a DNA template starting from a promoter sequence, also releasing PPi; this process was elucidated through studies of bacterial enzymes like E. 
coli RNA polymerase.[12] Both enzymes require a template strand to ensure base-pairing specificity, with the incoming nucleotide selected via hydrogen bonding to the template base in the active site.[13]
Polymerization proceeds exclusively in the 5' to 3' direction, where new nucleotides are added to the 3'-hydroxyl end of the chain, resulting in a linear polymer with 5' phosphate and 3' hydroxyl termini.[14] This directionality arises from the chemical mechanism of nucleophilic attack by the 3'-OH on the α-phosphate of the incoming NTP or dNTP, preventing 3' to 5' synthesis.[15] The released PPi is often hydrolyzed by pyrophosphatases, which shifts the equilibrium toward polymer elongation.[16]
DNA and RNA synthesis differ in substrates and fidelity: DNA polymerases use dNTPs lacking the 2'-hydroxyl group, enabling a more stable double helix, while RNA polymerases incorporate rNTPs with the 2'-OH, which introduces greater flexibility but higher reactivity.[17] DNA polymerases possess 3' to 5' exonuclease proofreading activity, achieving error rates of approximately 10^{-7} to 10^{-9} per nucleotide, whereas RNA polymerases lack robust proofreading, resulting in higher error rates of about 10^{-4} to 10^{-5}, suitable for transient RNA molecules.[18][19]
Primary structure
Nucleotide sequence and composition
The primary structure of nucleic acids is defined as the precise linear sequence of nucleotides, determined by the specific order of their nitrogenous bases: adenine (A), cytosine (C), guanine (G), and thymine (T) in deoxyribonucleic acid (DNA), or uracil (U) in place of thymine in ribonucleic acid (RNA). The nucleotides are connected via phosphodiester bonds in the 5' to 3' direction.[20] This sequence encodes genetic information and is conventionally denoted as a string of single-letter symbols, such as 5'-ATGC-3' for a short DNA segment.[21]
The nucleotide composition, particularly the GC content (the percentage of guanine and cytosine bases), significantly affects the stability and thermal properties of the nucleic acid. In double-stranded DNA, higher GC content correlates with increased melting temperature (Tm), the temperature at which half of the duplex molecules have dissociated into single strands, due to the stronger hydrogen bonding between G-C pairs compared to A-T pairs. For oligonucleotides shorter than 20 bases under standard PCR conditions (e.g., 50 mM monovalent salt), an approximate Tm is given by the Wallace rule: T_m = 2 \times (A + T) + 4 \times (G + C) °C, where A, T, G, and C denote the number of each base in the oligonucleotide.[22]
Chargaff's rules describe the equimolar base ratios observed in most double-stranded DNA molecules: the quantity of adenine equals thymine (A = T), and guanine equals cytosine (G = C), arising from complementary base pairing along the two strands.[23] These parity relationships, established through biochemical analyses of DNA from various organisms, do not apply universally; exceptions occur in single-stranded DNA or certain viral genomes where base pairing is absent or incomplete.[24]
Sequence motifs represent recurring patterns within the primary structure, such as simple tandem repeats. A prominent example is the poly-A tail in eukaryotic messenger RNA (mRNA), a homopolymeric stretch of 50–250 adenine residues added post-transcriptionally at the 3' end.[25]
Chemical modifications and stability
Chemical modifications to the primary structure of nucleic acids involve the addition of functional groups to nucleobases or the sugar-phosphate backbone, altering their chemical properties without changing the underlying nucleotide sequence. In DNA, one of the most prevalent modifications is 5-methylcytosine (5mC), which occurs primarily at CpG dinucleotides and plays a central role in epigenetic regulation by influencing gene expression through chromatin remodeling and transcriptional repression.[26] This modification is catalyzed by DNA methyltransferases and is essential for processes such as genomic imprinting and X-chromosome inactivation.[27] Another significant DNA modification is N6-methyladenine (m6A), which is widespread in bacterial genomes where it contributes to restriction-modification systems that protect against foreign DNA invasion. In bacteria like Xanthomonas oryzae, m6A is installed by methyltransferases such as Dam and helps regulate replication and repair pathways.[28] In RNA, over 170 distinct chemical modifications have been identified, particularly in ribosomal RNA (rRNA) and transfer RNA (tRNA), where they fine-tune structure and function.[29] [30] Key examples include pseudouridine (Ψ), formed by isomerization of uridine, which enhances base stacking and hydrogen bonding stability; N6-methyladenosine (m6A), the most abundant internal modification in eukaryotic mRNA that affects splicing, export, and translation; and 2'-O-methylation (Nm), which protects against degradation and modulates ribosome assembly.[30] These modifications are enzymatically installed by writer proteins, such as pseudouridine synthases (PUS enzymes) that catalyze the reversible C-C glycosidic bond formation in Ψ without requiring cofactors.[31] For instance, families like TruA and TruB in bacteria and Pus1-Pus10 in eukaryotes target specific sites in tRNA and rRNA.[31] These modifications significantly impact nucleic acid stability by conferring resistance to 
enzymatic degradation and modulating helical properties. In therapeutic applications, such as small interfering RNA (siRNA), incorporation of 2'-fluoro substitutions at the 2' position of the ribose sugar enhances nuclease resistance, allowing prolonged activity in vivo while maintaining RNA interference efficacy.[32] Similarly, 5mC in DNA reduces backbone flexibility, increasing helix rigidity and protecting against hydrolytic cleavage.[33] In RNA, Ψ and Nm stabilize secondary structures by improving thermodynamic stability and shielding against endonucleases like RNase A.[34]
Detection of these modifications relies on specialized techniques that preserve and identify the altered bases. For 5mC in DNA, bisulfite sequencing is a cornerstone method that converts unmethylated cytosines to uracils via sulfonation and deamination, while 5mC remains resistant, enabling differentiation through subsequent PCR amplification and sequencing.[35] This approach provides genome-wide mapping but requires careful optimization to minimize DNA fragmentation. Base composition, particularly CpG density, influences the prevalence of modifiable sites like those for 5mC.[27]
Secondary structure
Base pairing rules and hydrogen bonding
In nucleic acids, base pairing refers to the specific association between nucleobases that stabilizes secondary structures through hydrogen bonding. In DNA, the canonical Watson-Crick base pairing rules dictate that adenine (A) pairs with thymine (T), and guanine (G) pairs with cytosine (C). These pairings occur between a purine on one strand and a pyrimidine on the complementary strand, ensuring geometric uniformity in the double helix. The specificity arises from complementary hydrogen bond donor and acceptor sites on the bases, which form precise interactions: the A-T pair involves two hydrogen bonds, while the G-C pair forms three, contributing to greater stability in G-C rich regions.[3][36]
The hydrogen bonds in these pairs are typically N-H···O or N-H···N types, involving the Watson-Crick faces of the bases. For A-T, one bond forms between the N3-H of thymine (donor) and N1 of adenine (acceptor), and the second between the amino group at C6 of adenine (donor) and the carbonyl at C4 of thymine (acceptor). In G-C pairing, the three bonds are: N1-H of guanine (donor) to N3 of cytosine (acceptor), the amino group at C2 of guanine (donor) to the carbonyl at C2 of cytosine (acceptor), and the amino group at C4 of cytosine (donor) to the carbonyl at C6 of guanine (acceptor). These interactions not only dictate pairing fidelity but also influence melting temperatures, with the additional hydrogen bond of a G-C pair increasing duplex stability by approximately 1-2 kcal/mol compared to A-T.[37][38]
In RNA, the base pairing rules are analogous but substitute uracil (U) for thymine, forming A-U pairs with two hydrogen bonds (N3-H of uracil to N1 of adenine, and amino at C6 of adenine to carbonyl at C4 of uracil) and retaining G-C pairs with three. RNA often adopts single-stranded conformations with intramolecular base pairing to form stems in hairpin loops or other motifs, where hydrogen bonding patterns remain similar but allow for greater flexibility. 
The G-C pair's stronger bonding (due to the third hydrogen bond) promotes more stable RNA duplexes, as evidenced by higher thermal denaturation temperatures in GC-rich sequences.[39][40]
Beyond strict Watson-Crick pairing, non-canonical interactions like wobble base pairs occur, particularly in RNA. Proposed by Francis Crick, the wobble hypothesis describes relaxed base pairing at the third position of codons during translation, allowing a single tRNA to recognize multiple synonymous codons. For instance, guanine in the anticodon can pair with cytosine in the mRNA through a standard Watson-Crick pair, or with uracil through a two-hydrogen-bond pair whose shifted geometry accommodates the "wobble" without disrupting overall specificity. G-U wobble pairs, common in RNA structures, feature two hydrogen bonds (N1 of G to O2 of U, and O6 of G to N3 of U) and introduce functional diversity, such as in ribosomal RNA where they influence decoding accuracy and structural dynamics. These wobble interactions are weaker than canonical pairs but essential for the degeneracy of the genetic code, reducing the required number of tRNAs from 61 to about 40.[41][42]

| Base Pair | Molecule | Hydrogen Bonds | Key Donors/Acceptors |
|---|---|---|---|
| A-T | DNA | 2 | A(N1)-T(N3); A(N6)-T(O4) |
| G-C | DNA/RNA | 3 | G(N1)-C(N3); G(N2)-C(O2); G(O6)-C(N4) |
| A-U | RNA | 2 | A(N1)-U(N3); A(N6)-U(O4) |
| G-U (wobble) | RNA | 2 | G(N1)-U(O2); G(O6)-U(N3) |
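The pairing rules in the table, together with the Wallace rule quoted in the primary structure section, lend themselves to a short worked example. The sketch below is illustrative only: the function names are hypothetical, it handles plain A/C/G/T DNA strings without ambiguity codes, and the hydrogen-bond count assumes a fully paired Watson-Crick duplex.

```python
# Watson-Crick partners for DNA and hydrogen bonds per pair (from the table).
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
HBONDS_PER_BASE = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(seq: str) -> str:
    """Return the antiparallel complementary strand, written 5' to 3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def duplex_hydrogen_bonds(seq: str) -> int:
    """Count Watson-Crick hydrogen bonds when seq is paired with its complement."""
    return sum(HBONDS_PER_BASE[base] for base in seq.upper())


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate Tm in degrees C for a short oligonucleotide (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    oligo = "ATGC"
    print(reverse_complement(oligo))     # GCAT
    print(duplex_hydrogen_bonds(oligo))  # 10
    print(wallace_tm(oligo))             # 12
```

The Wallace rule is a rough estimate for short oligonucleotides; nearest-neighbor thermodynamic models are normally used when accuracy matters.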
DNA double helix configurations
The DNA double helix, formed through complementary base pairing between adenine-thymine and guanine-cytosine, manifests in several distinct configurations that influence its overall geometry and biological function. These variants arise primarily from differences in backbone conformation, base stacking, and hydration levels, with the B-form representing the predominant structure under physiological conditions.[3]
B-DNA is a right-handed helix characterized by a smooth, elongated structure with approximately 10.5 base pairs per turn in solution (10 in the original fiber-based model), a rise of 0.34 nm per base pair, a helical pitch of about 3.4–3.6 nm, and a twist of about 34–36° between adjacent base pairs. This configuration features distinct major and minor grooves, which facilitate interactions with proteins for processes such as replication and transcription. The structure was first proposed by Watson and Crick based on X-ray diffraction data, with refined parameters derived from fiber diffraction studies.[3][43]
A-DNA, also right-handed but shorter and wider than B-DNA, adopts a more compact form with 11 base pairs per turn, a pitch of about 2.8 nm, a rise of 0.23 nm per base pair, and a twist of approximately 33°. In this conformation, the base pairs are tilted relative to the helix axis, resulting in a deep, narrow major groove and a shallow, wide minor groove. A-DNA is favored under low-humidity conditions, such as in dehydrated fibers, and is commonly observed in DNA-RNA hybrids.[43]
Z-DNA represents a left-handed helix with a zig-zag phosphate backbone, accommodating 12 base pairs per turn, a pitch of 4.5 nm, a rise of 0.37 nm per base pair, and a twist of -30°. Unlike the right-handed forms, Z-DNA has a single deep groove with no clear major/minor groove distinction, with syn glycosidic conformations for purines and anti for pyrimidines. 
This form is stabilized in sequences rich in alternating purine-pyrimidine tracts, particularly GC repeats, and was first identified through crystallographic analysis of synthetic oligonucleotides.
The prevalence of these helical forms is modulated by environmental factors, including hydration, ionic strength, and sequence composition. B-DNA predominates in aqueous, physiological environments with moderate salt concentrations (e.g., ~150 mM NaCl), while A-DNA emerges at relative humidities below 75% or in the presence of alcohols that reduce water activity. Z-DNA formation is promoted by high salt concentrations (e.g., >2 M NaCl) or multivalent cations like Mg²⁺, which screen phosphate repulsions, and is further enhanced in negatively supercoiled contexts or by specific protein binding, though the latter influences are secondary to ionic effects. Sequence motifs, such as AT-rich regions favoring B-DNA stability through optimal stacking, and GC-rich segments predisposing to Z-DNA via favorable syn-anti alternations, also play a key role. pH variations can induce transitions, with acidic conditions occasionally stabilizing A-like forms by protonating bases and altering hydrogen bonding.[43]

| Helix Type | Handedness | Base Pairs per Turn | Pitch (nm) | Rise per Base Pair (nm) | Twist Angle (°) | Key Features |
|---|---|---|---|---|---|---|
| B-DNA | Right | 10.5 | 3.6 | 0.34 | 34 | Major/minor grooves; physiological form |
| A-DNA | Right | 11 | 2.8 | 0.23 | 33 | Tilted bases; low humidity |
| Z-DNA | Left | 12 | 4.5 | 0.37 | -30 | Zig-zag backbone; high salt/GC-rich |
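The helical parameters in the table are geometrically linked: the twist per base pair is 360° divided by the base pairs per turn, and the pitch is the rise multiplied by the base pairs per turn. The sketch below derives these quantities from the tabulated base pairs per turn and rise; the class and attribute names are illustrative. Derived values need not reproduce rounded literature figures exactly, since published parameters come from different experimental sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelixForm:
    """Idealized helix described by base pairs per turn and axial rise."""
    name: str
    bp_per_turn: float
    rise_nm: float
    handedness: int = 1  # +1 for right-handed, -1 for left-handed

    @property
    def twist_deg(self) -> float:
        """Average rotation per base pair; negative for left-handed helices."""
        return self.handedness * 360.0 / self.bp_per_turn

    @property
    def pitch_nm(self) -> float:
        """Axial length of one full helical turn."""
        return self.bp_per_turn * self.rise_nm

    def contour_length_nm(self, n_bp: int) -> float:
        """Axial length of a duplex of n_bp base pairs."""
        return n_bp * self.rise_nm


# Base pairs per turn and rise taken from the table above.
B_DNA = HelixForm("B-DNA", bp_per_turn=10.5, rise_nm=0.34)
A_DNA = HelixForm("A-DNA", bp_per_turn=11.0, rise_nm=0.23)
Z_DNA = HelixForm("Z-DNA", bp_per_turn=12.0, rise_nm=0.37, handedness=-1)

if __name__ == "__main__":
    for form in (B_DNA, A_DNA, Z_DNA):
        print(f"{form.name}: twist {form.twist_deg:.1f} deg, pitch {form.pitch_nm:.2f} nm")
    print(f"1 kb of B-DNA spans about {B_DNA.contour_length_nm(1000):.0f} nm")
```

Treating the helix as perfectly regular is a simplification; real DNA shows sequence-dependent variation in twist and rise from step to step.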
RNA folding motifs
RNA folding motifs are local secondary structural elements that arise from base pairing within single-stranded RNA molecules, enabling diverse functions such as catalysis, regulation, and molecular recognition. Unlike the continuous double helix of DNA, these motifs feature discontinuous helical regions interrupted by unpaired nucleotides, forming compact architectures stabilized by hydrogen bonding and stacking interactions.[44] The most fundamental motif is the stem-loop, consisting of a double-stranded helical stem formed by complementary base pairing and an unpaired loop of 4-7 nucleotides at the apex. Stem-loops can be further diversified by bulges and internal loops, where unpaired nucleotides protrude from one or both strands, respectively, disrupting the continuity of the helix and introducing flexibility or binding sites. For instance, bulge loops with a single unpaired nucleotide on one strand facilitate sharp turns in the RNA backbone, while internal loops with unpaired residues on both strands allow for asymmetric expansions that accommodate tertiary contacts.[44][45] Hairpins represent a specific class of stem-loops where the loop size and sequence confer exceptional stability, particularly tetraloops (four-nucleotide loops) with consensus sequences like GNRA, which exhibit enhanced thermodynamic stability due to non-canonical base interactions and stacking. These tetraloops are among the most stable loop configurations, with free energy contributions up to 4-6 kcal/mol more favorable than larger loops, as determined from optical melting studies. 
Magnesium ions (Mg²⁺) further stabilize these motifs by bridging negatively charged phosphate groups in loops and stems, reducing electrostatic repulsion and promoting compact folding, especially in bulged or internal regions.[46][47]
Beyond simple stem-loops, pseudoknots form when a single-stranded region base-pairs with a complementary sequence outside an existing stem, creating interleaved helices that cross over like a knot and often enhance mechanical rigidity or signaling. Kissing loops occur when the apical loops of two separate hairpins interact via complementary base pairing, forming transient or stable intermolecular contacts that mediate dimerization or regulatory switching.[48][49]
In transfer RNA (tRNA), the cloverleaf secondary structure exemplifies the integration of multiple stem-loops, including the acceptor stem, D-loop, anticodon arm, and T-loop, which collectively form four helical arms connected by loops for amino acid attachment and codon recognition. Riboswitches, regulatory RNA elements in bacterial mRNAs, frequently incorporate pseudoknots and kissing loops alongside stems to sense metabolites like thiamine or guanine, undergoing conformational changes that control gene expression.[50][51]
Computational tools such as mfold and the ViennaRNA package enable prediction and identification of these motifs by minimizing free energy using dynamic programming algorithms that account for base-pairing rules, loop penalties, and stacking energies. Mfold, developed by Zuker, computes optimal and suboptimal foldings for sequences up to several hundred nucleotides, while ViennaRNA extends this with advanced features like pseudoknot prediction and covariance models for motif detection in alignments.[52][53]
Tertiary structure
DNA supercoiling and topology
DNA supercoiling represents a key aspect of the tertiary structure of closed circular DNA molecules, where the double helix, serving as the substrate, undergoes additional coiling beyond its intrinsic helical twist to achieve compaction and facilitate biological processes. In covalently closed circular DNA, such as bacterial plasmids or viral genomes, the topology is invariant unless broken and resealed, leading to superhelical tension that influences DNA accessibility and function.[54]
The topological state of supercoiled DNA is quantified by the linking number (Lk), defined as the sum of the twist (Tw), which measures the helical turns of the two strands around each other, and the writhe (Wr), which captures the coiling of the helical axis in space:
$$\mathrm{Lk} = \mathrm{Tw} + \mathrm{Wr}$$
This relationship, established through mathematical analysis of ribbon topology, holds for any closed DNA duplex.[55] The relaxed linking number (Lk₀) corresponds to the state without supercoiling; for a B-form duplex of N base pairs with about 10.5 base pairs per turn, Lk₀ ≈ N/10.5. Supercoiling arises when Lk deviates from Lk₀, quantified by ΔLk = Lk - Lk₀; negative ΔLk indicates underwinding (negative supercoiling), while positive ΔLk indicates overwinding (positive supercoiling). Negative supercoiling predominates in vivo, promoting DNA unwinding for processes like replication and transcription, whereas positive supercoiling can accumulate ahead of progressing polymerases.[54][56]
To manage superhelical tension, cells employ DNA topoisomerases, enzymes that transiently break and rejoin DNA strands to alter Lk. Type I topoisomerases, such as the Escherichia coli enzyme discovered in 1971, relax supercoils by nicking one strand, changing Lk in steps of ±1 without requiring ATP; they preferentially relieve negative supercoils.[57] Type II topoisomerases, including DNA gyrase, act on both strands, altering Lk in steps of ±2 and often requiring ATP; while most type II enzymes relax supercoils bidirectionally, gyrase uniquely introduces negative supercoils using ATP hydrolysis, counteracting the positive supercoils generated during transcription. In bacteria, gyrase maintains an overall negative superhelical density (σ ≈ -0.06) essential for chromosomal compaction within the confined nucleoid space.[58]
Supercoiled DNA adopts distinct three-dimensional configurations to partition the writhe component. Plectonemic supercoils form right-handed interwound structures where the DNA axis coils around itself, typical in unconstrained bacterial DNA and allowing dynamic partitioning of twist and writhe. 
In contrast, toroidal (or solenoidal) supercoils involve the DNA wrapping left-handedly around a core, as seen in eukaryotic nucleosomes where approximately 147 base pairs wrap 1.65–1.7 turns around the histone octamer, contributing about -1 supercoil per nucleosome.[59] This wrapping constrains negative supercoils, reducing free writhe and aiding chromatin compaction. In bacteria, negative supercoiling driven by gyrase compacts the genome by favoring plectonemic structures and branched domains, while also linking to replication by relieving torsional stress at forks and to transcription by enhancing promoter opening and RNA polymerase progression.[60][61]
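The linking-number bookkeeping described in this section can be illustrated with a small calculation. The sketch below assumes relaxed B-form DNA at 10.5 base pairs per turn and uses the standard definition of superhelical density, σ = ΔLk / Lk₀. The function names and the 5,250 bp plasmid are hypothetical choices made so that the arithmetic is easy to follow.

```python
def relaxed_linking_number(n_bp: int, bp_per_turn: float = 10.5) -> float:
    """Lk0 for a relaxed closed circular duplex of n_bp base pairs."""
    return n_bp / bp_per_turn


def superhelical_density(lk: float, lk0: float) -> float:
    """sigma = (Lk - Lk0) / Lk0; negative values indicate underwinding."""
    return (lk - lk0) / lk0


def linking_number_for_density(n_bp: int, sigma: float,
                               bp_per_turn: float = 10.5) -> float:
    """Lk that yields the requested superhelical density."""
    return relaxed_linking_number(n_bp, bp_per_turn) * (1.0 + sigma)


if __name__ == "__main__":
    n_bp = 5250                                       # hypothetical plasmid
    lk0 = relaxed_linking_number(n_bp)                # 500 helical turns
    lk = linking_number_for_density(n_bp, -0.06)      # about 470
    delta_lk = lk - lk0                               # about -30
    gyrase_cycles = abs(delta_lk) / 2                 # gyrase changes Lk by 2 per cycle
    print(f"Lk0 = {lk0:.0f}, Lk = {lk:.0f}, dLk = {delta_lk:.0f}")
    print(f"Gyrase cycles from relaxed state: {gyrase_cycles:.0f}")
```

In a real closed circle Lk is an integer while Lk₀ generally is not, so ΔLk and σ take discrete values. The sketch ignores this and also says nothing about how ΔLk is partitioned between twist and writhe.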