Citrullination
Citrullination is a calcium-dependent post-translational modification of proteins in which the positively charged amino acid arginine is enzymatically converted to the neutral amino acid citrulline by peptidylarginine deiminase (PAD) enzymes. The resulting loss of positive charge can significantly alter protein structure, function, and interactions.[1][2][3] This process, also known as deimination, occurs primarily in eukaryotic cells. Mammals have five PAD isoforms (PAD1–4 and PAD6), of which PAD2 and PAD4 are the most studied because of their roles in nuclear and cytoplasmic processes.[1][3] The modification is generally considered irreversible in vivo and influences processes such as protein folding, enzymatic activity, and molecular recognition.[1][4]

In physiological contexts, citrullination plays essential roles in cellular homeostasis and immune regulation. It contributes to chromatin decondensation and transcriptional activation through the citrullination of histones, particularly H3 and H4 by PAD4, which facilitates epigenetic remodeling and cell differentiation.[1][2] During inflammation, PAD4-mediated citrullination is critical for the formation of neutrophil extracellular traps (NETs): it promotes chromatin relaxation, which enables DNA release and microbial entrapment as part of innate immunity.[1][2] Citrullination also supports pluripotency in stem cells through histone modifications and regulates extracellular matrix remodeling in tissues such as skin and myelin.[1][3]

Dysregulated citrullination is implicated in numerous pathologies, particularly autoimmune and inflammatory diseases, where it generates neoepitopes that trigger autoantibody production and break immune tolerance.
In rheumatoid arthritis (RA), citrullinated proteins such as vimentin, fibrinogen, and α-enolase in synovial tissues serve as autoantigens. Anti-citrullinated protein antibodies (ACPAs) are detected in 50–70% of patients and correlate with disease severity, and more than 150 citrullinated proteins have been identified in RA joints.[2][3] Similarly, in multiple sclerosis (MS), increased PAD2 and PAD4 activity leads to citrullination of myelin basic protein (MBP) in up to 90% of fulminant cases, contributing to demyelination and inflammation.[2] In systemic lupus erythematosus (SLE), PAD4-driven citrullination of histones in NETs promotes autoimmunity, with anti-CCP antibodies present in 12–50% of erosive cases.[2] Beyond autoimmunity, citrullination is linked to neurodegenerative disorders such as Alzheimer's disease via amyloid-beta modification, to cancer metastasis through PAD4 effects on the extracellular matrix, and to other conditions such as type 1 diabetes.[3]

Recent advances in citrullination research have improved both its detection and its therapeutic targeting. Mass spectrometry-based proteomics now allows site-specific identification, with methods such as chemical derivatization and biotin-thiol enrichment mapping hundreds of sites across human tissues.[3] Machine learning predictions and data-independent acquisition (DIA) mass spectrometry have expanded the known citrullinome, revealing over 800 potential sites in immune cells, while PAD inhibitors show promise in preclinical models of RA, multiple sclerosis, and cancer by reducing pathological NETosis and autoantigen formation.
As of 2025, deep learning methods have improved the precision of citrullination site identification in proteomics, citrullinated peptides are being explored as RA therapeutics, and citrulline has shown anti-inflammatory effects in macrophages.[3][5][6][7] These developments underscore citrullination's dual role as a vital regulator and a disease driver, positioning it as a key biomarker and intervention target in precision medicine.[3]

Definition and Mechanism
Biochemical Process
Citrullination is a post-translational modification (PTM) in which peptidyl-arginine residues in proteins are enzymatically converted to peptidyl-citrulline through hydrolysis of the guanidino group, a process that is calcium-dependent and catalyzed by peptidyl arginine deiminases (PADs).[8][9] This modification alters the chemical properties of the affected residue: the positively charged guanidino group of arginine is replaced by the neutral ureido group of citrulline, resulting in a net loss of one positive charge per modified site.[10][11]

The loss of positive charge disrupts electrostatic interactions within and between proteins, leading to changes in protein conformation, folding, and stability.[2] The modification also increases the hydrophobicity of the local region, which can influence protein solubility, aggregation propensity, and degradation pathways.[12] These structural alterations often impair or modulate protein function, including enzymatic activity, binding affinities, and interactions with other biomolecules.[13]

Citrullination targets arginine residues in a wide array of proteins, such as histones, cytoskeletal components, and nuclear proteins, and occurs primarily during cellular processes such as differentiation and stress responses.[8] Unlike many PTMs, citrullination is irreversible at the residue level and can only be undone indirectly through protein turnover and resynthesis.[14][15]

Enzymatic Catalysis
Citrullination is catalyzed by peptidyl arginine deiminases (PADs), a family of calcium-dependent enzymes comprising five mammalian isoforms: PAD1 through PAD4, which are catalytically active, and PAD6, which lacks enzymatic activity. These enzymes convert peptidyl arginine residues into citrulline through a hydrolytic reaction that does not require an energy input such as ATP. The overall reaction is:

$$\text{Protein-Arg} + \text{H}_2\text{O} \rightarrow \text{Protein-Cit} + \text{NH}_3$$

This process involves hydrolysis of the guanidino group on the arginine side chain, releasing ammonia and forming a neutral ureido group on citrulline.[16]

The catalytic mechanism proceeds via a cysteine-based hydrolysis pathway. Upon binding of calcium ions (Ca²⁺), PADs undergo a conformational change that assembles the active site. The active-site cysteine (e.g., Cys645 in PAD4) acts as a nucleophile, attacking the guanidinium carbon of the substrate arginine to form a tetrahedral intermediate. This intermediate collapses, with a conserved histidine (e.g., His471 in PAD4) facilitating proton transfer to release the amine group as ammonia. Subsequent hydrolysis of the resulting covalent S-alkylthiouronium enzyme intermediate by water yields the citrullinated product and regenerates the enzyme. Calcium is essential: multiple binding sites (up to six in PAD2) induce the necessary structural rearrangements, with high-affinity sites ($K_D$ < 1 µM) initiating binding and moderate-affinity sites ($K_D$ ~250 µM) enabling full activation.[16][17][18]

Kinetically, PAD activation requires elevated calcium concentrations, typically in the micromolar to millimolar range depending on the isoform. For instance, half-maximal activation ($K_{0.5}$) occurs at approximately 140 µM for PAD1 and 560 µM for PAD4 at pH 7.6. The reaction exhibits optimal activity around neutral pH (7.2–7.6), and for PAD4 calcium sensitivity is independent of pH between 6.0 and 8.5.
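
The calcium dependence described above can be illustrated with a simple saturation model. The sketch below is not taken from the cited studies: it assumes a Hill-type activation curve, uses the half-maximal activation values quoted in the text, and treats both the Hill coefficient (`hill_n`, defaulting to 1) and a resting cytosolic calcium level of about 0.1 µM as illustrative assumptions. All function and variable names are hypothetical.

```python
def fractional_activity(ca_um: float, k_half_um: float, hill_n: float = 1.0) -> float:
    """Fraction of maximal PAD activity at a free Ca2+ concentration (in µM).

    Assumes a Hill-type curve: v/Vmax = Ca^n / (K0.5^n + Ca^n).
    hill_n is an illustrative assumption, not a measured value.
    """
    if ca_um < 0 or k_half_um <= 0 or hill_n <= 0:
        raise ValueError("concentrations must be non-negative; K0.5 and n positive")
    if ca_um == 0:
        return 0.0
    ratio = (ca_um / k_half_um) ** hill_n
    return ratio / (1.0 + ratio)


# Half-maximal activation values quoted in the text (pH 7.6), in µM.
K_HALF_UM = {"PAD1": 140.0, "PAD4": 560.0}

# Assumed resting cytosolic Ca2+ (about 0.1 µM); used only for illustration.
RESTING_CA_UM = 0.1

if __name__ == "__main__":
    for isoform, k_half in K_HALF_UM.items():
        for ca in (RESTING_CA_UM, k_half, 10 * k_half):
            frac = fractional_activity(ca, k_half)
            print(f"{isoform}: Ca2+ = {ca:g} µM -> {frac:.4f} of maximal activity")
```

Under these assumptions both isoforms are essentially inactive at resting calcium levels and reach half of their maximal activity only at their respective $K_{0.5}$; a Hill coefficient above 1 would make the transition between the two states steeper.
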
Substrate specificity favors peptidyl arginines over free arginine, particularly those in unstructured or flexible protein regions where the side chain is accessible. Multiple arginine sites per protein can be modified, though efficiency varies with sequence context and isoform; PADs generally prefer motifs with basic residues nearby.[16][17]

Peptidyl Arginine Deiminases (PADs)
Structure and Activation
Peptidyl arginine deiminases (PADs) constitute a family of calcium-dependent enzymes, each with a molecular weight of approximately 70 kDa, that are evolutionarily conserved from bacteria to mammals. In humans, the five PAD isoforms (PAD1–4 and PAD6) share 50–60% sequence identity overall, reflecting their common ancestry within the arginine deiminase superfamily.[19][17]

The core molecular architecture of PADs consists of an N-terminal regulatory domain and a C-terminal catalytic domain. The N-terminal domain features two immunoglobulin-like subdomains that contribute to substrate recognition and protein interactions. The catalytic domain adopts an α/β propeller fold housing the active site, which is built around conserved cysteine, histidine, and aspartate residues (Cys645, His471, and Asp350/Asp473 in human PAD4) that enable nucleophilic attack on the substrate arginine. Some isoforms, such as PAD2 and PAD4, possess additional structural motifs that influence localization or interactions, though these vary across the family. Crystal structures of PAD1, PAD2, and PAD4 confirm this modular organization; PAD2 and PAD4 crystallize as head-to-tail homodimers, whereas PAD1 is monomeric.[20][21]

Activation of PADs requires binding of calcium ions to multiple sites within the enzyme, typically 5–6 per monomer, which triggers a profound conformational shift from an autoinhibited state to an active form. In the apo (calcium-free) state, the N-terminal domain occludes the active site, positioning the catalytic cysteine away from potential substrates and preventing activity. Upon calcium coordination, primarily by aspartate and glutamate residues in loops bridging the N- and C-terminal domains, the enzyme undergoes rigid-body rotations that align the active-site residues, exposing the substrate-binding cleft and enabling catalysis.
This ordered calcium binding, first detailed in structural analyses of PAD2 and PAD4, ensures tight regulation under physiological conditions where intracellular calcium levels are low.[19][20]

Beyond calcium dependence, PAD activity is modulated by post-translational modifications and environmental factors. In the core mechanism, autoinhibition is relieved by calcium alone, but oxidation of the active-site cysteine by reactive oxygen species (ROS) can reversibly or irreversibly inhibit the enzyme, providing a redox-based regulatory layer responsive to oxidative stress. Phosphorylation at specific serine or threonine residues in the regulatory domain has been observed in some contexts to fine-tune activity, though its effects vary by isoform and are less central than calcium signaling. Pharmacological regulation includes irreversible inhibitors such as Cl-amidine, a substrate mimic that forms a covalent adduct with the catalytic cysteine and potently suppresses PAD function across isoforms.[22][23][24]

Evolutionarily, PADs derive from the ancient arginine deiminase superfamily, which includes prokaryotic arginine deiminases (ADIs) involved in energy metabolism via the ADI pathway. Bacterial ADIs are calcium-independent and act on free arginine to produce citrulline and ammonia (ornithine arises only in a later step of the ADI pathway), whereas eukaryotic PADs evolved calcium dependence, likely adapting the mechanism for precise post-translational control in higher organisms. This divergence is evident in phylogenetic analyses showing PADs emerging in metazoans through gene duplication and horizontal transfer events from microbial ancestors.[22][25]

Isoforms and Distribution
Peptidyl arginine deiminases (PADs) comprise a family of five isoforms in mammals, encoded in humans by genes clustered on chromosome 1p36.13. Each isoform exhibits distinct tissue-specific expression patterns and substrate preferences that contribute to its specialized physiological role.[26] The isoforms share a conserved catalytic domain but differ in their N-terminal regulatory regions, which influences calcium-dependent activation and subcellular localization.[27]

PAD1 is primarily expressed in the epidermis and uterus, where it localizes to the cytosol and facilitates keratinocyte differentiation and skin barrier formation through the citrullination of filaggrin and keratins.[27] In hair follicles, PAD1 contributes to the deimination of trichohyalin in the inner root sheath, supporting structural integrity during differentiation.[28] PAD1 activity is particularly prominent in the stratum granulosum of the epidermis.[26]

PAD2 is broadly distributed, with high levels in the brain (especially white matter), skeletal muscle, spleen, salivary glands, and pancreas, and it is found in both cytosolic and nuclear compartments.[27] It plays a key role in central nervous system plasticity and myelin sheath maintenance by citrullinating myelin basic protein and vimentin, and it can also modify histone H3 to regulate gene expression.[26] Hormones such as estrogen regulate PAD2 expression in reproductive tissues.[26]

PAD3 is skin-specific, predominantly expressed in hair follicles and the epidermis, and localizes to the cytosol in structures such as the medulla and the Henle, Huxley, and cuticle layers of the inner root sheath.[28] It acts together with PAD1 to promote terminal epidermal differentiation and hair shaft hardening by citrullinating trichohyalin and filaggrin.[27]

PAD4 is mainly found in hematopoietic cells, including neutrophils and monocytes, where it shuttles between the cytoplasm and nucleus to enable histone citrullination, particularly of histones H3 and H4, facilitating chromatin remodeling.[27] Its nuclear localization sequence allows translocation in response to stimuli, distinguishing it from the predominantly cytosolic isoforms.[26]

PAD6 is expressed predominantly in oocytes and early embryos, with expression also reported in ovaries and testes. It resides in the cytosol and is essential for granulosa cell function, oocyte maturation, and fertility through its involvement in embryonic genome activation and microtubule dynamics.[27] It exhibits pseudogene-like characteristics in some contexts and, as noted under Enzymatic Catalysis, is regarded as lacking enzymatic activity despite its expression in reproductive tissues.[26]

| Isoform | Primary Tissue Distribution | Cellular Localization | Key Substrates |
|---|---|---|---|
| PAD1 | Epidermis, uterus, hair follicles | Cytosol | Filaggrin, trichohyalin, keratins |
| PAD2 | Brain, skeletal muscle, spleen, salivary glands | Cytosol, nucleus | Myelin basic protein, vimentin, histone H3 |
| PAD3 | Hair follicles, epidermis | Cytosol | Trichohyalin, filaggrin |
| PAD4 | Hematopoietic cells (neutrophils, monocytes) | Cytoplasm, nucleus | Histones H3, H4 |
| PAD6 | Oocytes, ovaries, testes | Cytosol | Not well-defined |
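
For readers who want to query the table programmatically, the following sketch encodes its rows as a small Python data structure. The class and function names (`PadIsoform`, `isoforms_in`, `isoforms_targeting`) are illustrative and not part of any existing library; the entries simply transcribe the table, and PAD6 is given an empty substrate tuple because its substrates are not well defined.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PadIsoform:
    """One row of the isoform table (values transcribed, lowercased)."""
    name: str
    tissues: Tuple[str, ...]
    localization: Tuple[str, ...]
    substrates: Tuple[str, ...]


PAD_ISOFORMS = (
    PadIsoform("PAD1", ("epidermis", "uterus", "hair follicles"),
               ("cytosol",), ("filaggrin", "trichohyalin", "keratins")),
    PadIsoform("PAD2", ("brain", "skeletal muscle", "spleen", "salivary glands"),
               ("cytosol", "nucleus"),
               ("myelin basic protein", "vimentin", "histone h3")),
    PadIsoform("PAD3", ("hair follicles", "epidermis"),
               ("cytosol",), ("trichohyalin", "filaggrin")),
    # The table lists "Cytoplasm" for PAD4, so a "cytosol" query will not match it.
    PadIsoform("PAD4", ("hematopoietic cells",),
               ("cytoplasm", "nucleus"), ("histone h3", "histone h4")),
    PadIsoform("PAD6", ("oocytes", "ovaries", "testes"),
               ("cytosol",), ()),  # substrates not well defined
)


def isoforms_in(compartment: str) -> List[str]:
    """Names of isoforms whose listed localization includes the compartment."""
    wanted = compartment.lower()
    return [p.name for p in PAD_ISOFORMS if wanted in p.localization]


def isoforms_targeting(substrate: str) -> List[str]:
    """Names of isoforms whose listed key substrates include the substrate."""
    wanted = substrate.lower()
    return [p.name for p in PAD_ISOFORMS if wanted in p.substrates]


if __name__ == "__main__":
    print("Nuclear isoforms:", isoforms_in("nucleus"))
    print("Trichohyalin-modifying isoforms:", isoforms_targeting("trichohyalin"))
```

Matching is by exact, case-insensitive label, so the results are only as fine-grained as the table itself.
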