
Formylation

Formylation is a fundamental chemical reaction in organic synthesis that introduces a formyl group (–CHO) into an organic molecule, typically targeting aromatic rings or other electron-rich systems to produce aldehydes. It also serves as a critical modification during the initiation of protein synthesis in prokaryotes, mitochondria, and chloroplasts, where the initiator methionyl-tRNA is formylated at the alpha-amino group of methionine. In organic chemistry, formylation is commonly achieved through several named reactions, each suited to specific substrates and conditions. The Vilsmeier–Haack reaction, reported in 1927, employs a chloroiminium ion generated from N,N-dimethylformamide (DMF) and phosphorus oxychloride (POCl₃) as the electrophilic formylating agent, enabling regioselective formylation of electron-rich arenes, heterocycles like furans and indoles, and even activated alkenes, often with high yields under mild conditions. The Gattermann–Koch reaction, introduced in 1897, utilizes carbon monoxide (CO) and hydrogen chloride (HCl) in the presence of aluminum chloride (AlCl₃), often with copper(I) chloride as a co-catalyst, to formylate benzene and its derivatives, particularly alkylbenzenes, and it is historically significant for industrial aldehyde production. Other notable methods include the Duff reaction, which uses hexamethylenetetramine for formylating phenols and indoles under acidic conditions, and the Rieche formylation, which employs dichloromethyl methyl ether with titanium(IV) chloride for aromatic substrates; both offer advantages in regioselectivity and compatibility with sensitive functional groups. These reactions are essential for constructing complex molecules, including pharmaceuticals, dyes, and natural product analogs, because the aldehyde functionality is versatile in subsequent transformations such as aldol condensations or reductions.
Biologically, formylation is catalyzed by methionyl-tRNA formyltransferase using 10-formyltetrahydrofolate as the donor, ensuring the formylated initiator tRNA (fMet-tRNA) recognizes the start codon AUG in bacterial and organellar ribosomes, a process indispensable for efficient translation initiation and distinguishing prokaryotic protein synthesis from eukaryotic counterparts. The formyl group is subsequently removed by peptide deformylase to expose the mature N-terminus, and its presence on bacterial peptides also triggers innate immune responses, such as activation of formyl peptide receptors (FPRs) on mammalian neutrophils or presentation by MHC-like molecules like H2-M3 to CD8⁺ T cells, aiding in pathogen recognition and antimicrobial defense against bacteria like Mycobacterium tuberculosis. Beyond translation, formylation participates in epigenetic modifications, such as Nε-formyllysine in histones arising from oxidative DNA damage or formaldehyde exposure, influencing gene expression and cellular differentiation. Dysregulation of these processes has implications in diseases, including cancer and bacterial infections, highlighting formylation's dual role in synthetic chemistry and cellular regulation.

Overview

Definition and General Principles

Formylation refers to the chemical process of introducing a formyl group (-CHO) into an organic compound, typically through electrophilic substitution or formyl group transfer mechanisms. This reaction functionalizes molecules by attaching the formyl moiety, which is a key functional group in organic synthesis due to its electrophilic nature. The formyl group acts as a one-carbon (C1) building block, enabling the extension of carbon chains or the creation of reactive sites for further transformations.

Formylation reactions are classified based on the atom receiving the formyl group, including C-formylation (attachment to carbon atoms), N-formylation (to nitrogen atoms), and O-formylation (to oxygen atoms). In C-formylation, the reaction often proceeds via electrophilic attack on electron-rich carbon centers, such as in aromatic systems. N-formylation commonly targets amines, forming formamides that protect amine functionality or serve as intermediates. O-formylation, meanwhile, yields formate esters from alcohols or phenols, which are valuable in ester synthesis and as solvents. These types share the common principle of formyl transfer from a donor reagent, with the outcome depending on the nucleophilicity of the substrate and the leaving group ability of the formylating agent.

The reactivity of the formyl group stems from its aldehyde functionality, which can undergo nucleophilic additions, reductions to primary alcohols, or oxidations to carboxylic acids, making it a versatile precursor in synthetic routes. Thermodynamically, formylation pathways are influenced by the stability of intermediates, such as acylium-like species in electrophilic transfers, and by activation energies that vary by mechanism. For instance, common electrophilic formylations must overcome barriers associated with C-H bond cleavage or nucleophilic attack, and catalysts are often used to lower these energies. A general representation of C-formylation is:

$$\text{R-H} + \text{HCOX} \rightarrow \text{R-CHO} + \text{HX}$$

where R-H is the carbon-containing substrate and X is a leaving group, such as a halide or carboxylate.
This equation underscores the substitution-like nature of many C-formylation processes, balancing the energetics of bond formation and cleavage.
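As a quick sanity check on the generic equation above, the following minimal Python sketch verifies that a substitution-type C-formylation conserves every element. The example species (benzene as R-H, formyl chloride as a purely formal HCOX placeholder, benzaldehyde, and HCl) and the helper name `is_balanced` are illustrative assumptions, not reagents or methods prescribed by the text.

```python
from collections import Counter

# Elemental compositions written out by hand for one illustrative instance
# of R-H + HCOX -> R-CHO + HX (R = phenyl, X = Cl). Formyl chloride is used
# only as a formal stand-in for "HCOX"; it is an assumption of this sketch.
BENZENE = Counter({"C": 6, "H": 6})                           # R-H
FORMYL_CHLORIDE = Counter({"C": 1, "H": 1, "O": 1, "Cl": 1})  # HCOX
BENZALDEHYDE = Counter({"C": 7, "H": 6, "O": 1})              # R-CHO
HYDROGEN_CHLORIDE = Counter({"H": 1, "Cl": 1})                # HX


def is_balanced(reactants, products):
    """Return True when both sides contain the same count of every element."""
    left = sum(reactants, Counter())
    right = sum(products, Counter())
    return left == right


if __name__ == "__main__":
    ok = is_balanced([BENZENE, FORMYL_CHLORIDE],
                     [BENZALDEHYDE, HYDROGEN_CHLORIDE])
    print("balanced:", ok)
```

The same check can be reused for any other choice of R and X by writing out the corresponding compositions.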

Historical Development

The introduction of formylation reactions in organic chemistry dates back to the late 19th century, with the Gattermann-Koch reaction serving as one of the earliest methods for introducing formyl groups into aromatic compounds. Discovered in 1897 by German chemists Ludwig Gattermann and Julius Arnold Koch, this reaction utilized carbon monoxide, hydrogen chloride, and aluminum chloride to formylate benzene derivatives, marking a significant advancement in electrophilic aromatic substitution techniques. A variant, the Gattermann reaction, which employs hydrogen cyanide and hydrogen chloride with a Lewis acid catalyst such as aluminum chloride or zinc chloride, was reported by Gattermann in 1906, further expanding the toolkit for aldehyde synthesis on activated aromatics.

Key milestones in the early 20th century included the development of the Vilsmeier-Haack reaction in 1927 by Anton Vilsmeier and Albrecht Haack, which employed N,N-dimethylformamide (DMF) and phosphorus oxychloride to generate a chloromethyleneiminium ion for selective formylation of electron-rich aromatics and heterocycles. This method's mild conditions and versatility quickly made it a cornerstone for synthesizing aldehydes in complex molecules. In 1938, Otto Roelen at Ruhrchemie serendipitously discovered hydroformylation while investigating Fischer-Tropsch synthesis, revealing that cobalt carbonyl catalysts could add hydrogen and a formyl group across alkenes to produce aldehydes, laying the foundation for the industrial Oxo process.

Following World War II, hydroformylation saw rapid industrial adoption, with the first commercial plant operational in 1943 but scaling significantly in the postwar decades to produce aldehydes for downstream applications such as plasticizers, detergents, and resins. Annual global production later exceeded several million tons, driven by rhodium-based catalysts that improved selectivity and efficiency over the original cobalt systems.

In parallel, the biological significance of formylation emerged in the mid-20th century, with the discovery of N10-formyltetrahydrofolate as a key carrier in one-carbon metabolism during the 1950s, highlighting its role in transferring formyl groups for purine biosynthesis and other pathways. By the 1960s, N-formylmethionine was identified as the initiator in bacterial protein synthesis; Kjeld Marcker and Frederick Sanger reported its attachment to tRNA in 1964, while John M. Adams and Mario R. Capecchi demonstrated its essential function in bacterial translation initiation in 1966. These findings illuminated formylation's conserved role in prokaryotic and organellar protein synthesis.

Formylation in Organic Chemistry

Formylation Agents

Formylation agents encompass a range of chemical reagents and catalysts employed to introduce the formyl group (-CHO) into substrates, primarily in synthetic contexts. Common agents include derivatives of formic acid, such as formic acid itself and its esters, which serve as mild and accessible formyl sources. These derivatives are often utilized in N-formylation reactions because they react under relatively benign conditions, such as solvent-free heating or heating in an organic solvent, offering good yields (78–100%) for both aromatic and aliphatic amines while exhibiting high selectivity for primary over secondary amines. Formic acid, in particular, condenses with amines to give formamides, with loss of water, without additional activators in many cases, making it a straightforward and cost-effective option.

Another prominent agent is the Vilsmeier-Haack reagent, generated from N,N-dimethylformamide (DMF) and phosphorus oxychloride (POCl₃). The preparation proceeds via the equation:

$$\ce{(CH3)2NCHO + POCl3 -> [(CH3)2N=CHCl]+ PO2Cl2-}$$

This chloroiminium salt acts as a highly selective electrophile for formylation, particularly effective under mild temperatures (0–120°C) and showing a preference for electron-rich systems. However, the reagent's components pose significant toxicity concerns: POCl₃ is corrosive and hydrolyzes to release hydrogen chloride and phosphoric acid, while DMF is a reproductive toxin and potential carcinogen, necessitating careful handling and ventilation. Environmentally, the process generates phosphorus-containing waste, contributing to pollution risks if not properly managed, though recent flow chemistry adaptations aim to mitigate these issues by reducing reagent exposure.

The Duff reaction employs hexamethylenetetramine (HMTA) as a formylating agent, typically in an organic acid medium such as acetic acid or trifluoroacetic acid, to achieve regioselective ortho-formylation of activated aromatics such as phenols. HMTA decomposes under these conditions to deliver the formyl group, with moderate to good yields and tolerance for substituents such as esters. Its selectivity favors the ortho position, shifting to para when ortho sites are blocked. Toxicity-wise, HMTA exhibits moderate irritation potential and can release formaldehyde, a known carcinogen, upon hydrolysis, limiting its use in high-exposure scenarios; however, it is less acutely toxic than many alternatives. Environmentally, the reaction's reliance on organic acids reduces volatile emissions compared to halogenated reagents, though its byproducts require neutralization for greener profiles.

For carbon-carbon bond formylation, particularly in alkene functionalization, hydroformylation utilizes mixtures of carbon monoxide (CO) and hydrogen (H₂) as the formyl source. This process relies on transition-metal catalysts, with rhodium (Rh)-based systems, often modified with phosphine ligands, providing high activity and regioselectivity toward linear aldehydes (up to 99% selectivity in some cases). Cobalt (Co) catalysts, such as HCo(CO)₄, offer a more economical alternative but with lower selectivity (typically 60–80% linear) and require harsher conditions (120–200°C, high pressures). Both metals exhibit low toxicity in ligand-bound forms, but CO is highly poisonous, demanding enclosed systems to prevent exposure; Rh's scarcity raises sustainability concerns, while Co tolerates feed impurities better. Recent efforts focus on ligand-modified Co catalysts that approach Rh performance, reducing overall environmental impact and waste. Emerging organocatalytic methods enable milder variants under ambient conditions in select cases, improving selectivity and minimizing metal use for greener applications.
| Agent/Catalyst | Key Properties | Selectivity | Toxicity/Environmental Notes |
| --- | --- | --- | --- |
| Formic acid derivatives | Mild, solvent-free compatible | High for primary amines | Low toxicity; biodegradable, low environmental impact |
| Vilsmeier-Haack reagent | Reactive at low temperatures | Electron-rich substrates | Corrosive (POCl₃); generates phosphorus-containing waste |
| Hexamethylenetetramine (Duff) | Acid-catalyzed, regioselective | Ortho to activating groups | Releases formaldehyde (carcinogenic); moderate irritation |
| Rh/Co catalysts (hydroformylation) | High pressure/temperature (Co harsher) | Linear aldehydes (Rh > Co) | CO poisonous; Rh resource-intensive, Co more sustainable |
| Organocatalysts | Ambient conditions | Variable, often high | Metal-free; reduced waste and toxicity |

Aromatic Formylation

Aromatic formylation involves the introduction of a formyl group (-CHO) onto aromatic rings primarily through electrophilic aromatic substitution (EAS) mechanisms, where the aromatic ring acts as a nucleophile toward formyl cation equivalents. This process is particularly effective for electron-rich arenes, as the electrophile targets positions of high electron density, leading to the formation of aromatic aldehydes that serve as versatile intermediates in organic synthesis. Unlike aliphatic formylation, which targets non-aromatic carbons, aromatic methods exploit the stability of the delocalized π-system to facilitate regioselective substitution.

The Gattermann-Koch reaction, developed in 1897, represents a seminal method for direct formylation of benzene and its derivatives using carbon monoxide (CO), hydrogen chloride (HCl), and aluminum chloride (AlCl₃) as a Lewis acid under high pressure. The reaction proceeds via in situ generation of a formyl cation equivalent (HCO⁺) from CO and HCl, which electrophilically attacks the aromatic ring to form a σ-complex intermediate, followed by deprotonation to yield the aldehyde. A simplified equation for the process is:

$$\ce{ArH + CO ->[AlCl3][HCl] ArCHO}$$

This method is best suited for benzene and activated aromatics like alkylbenzenes, yielding products such as benzaldehyde from benzene, though it requires careful control to avoid side reactions from excess HCl.

The Gattermann formylation, a related variant that Ludwig Gattermann reported after the Gattermann-Koch reaction and refined in subsequent works, employs a mixture of hydrogen cyanide (HCN) and HCl with a Lewis acid catalyst to formylate aromatic compounds. Here, the electrophile is an iminium-like species derived from HCN protonation, which undergoes EAS on the arene, followed by hydrolysis to the aldehyde; a simplified adaptation uses zinc cyanide (Zn(CN)₂) with HCl to generate HCN in situ. This approach is advantageous for phenols and ethers, producing hydroxy- and alkoxy-substituted benzaldehydes, but demands anhydrous conditions to prevent polymerization of HCN.

For electron-rich heterocycles and activated aromatics, the Vilsmeier-Haack reaction, reported in 1927 by Anton Vilsmeier and Albrecht Haack, utilizes N,N-dimethylformamide (DMF) and phosphorus oxychloride (POCl₃) to generate the chloromethyleneiminium ion ([Cl-CH=N(CH₃)₂]⁺) as the electrophile. The mechanism involves nucleophilic attack by the arene or heterocycle on this iminium species, forming an iminium adduct that hydrolyzes to the formyl group; it is particularly effective for pyrroles and indoles, yielding 2-formylpyrrole from pyrrole with high regioselectivity. This method's mild conditions make it complementary to the harsher Gattermann-Koch process for sensitive substrates.

In all these reactions, the mechanism relies on electrophilic attack by formyl equivalents on electron-rich aromatic rings, where the stability of the Wheland intermediate (σ-complex) determines regioselectivity based on directing effects. Activating groups such as alkyl or alkoxy moieties direct to ortho and para positions via stabilization of the intermediate, while electron-withdrawing groups like nitro favor meta by destabilizing the ortho/para σ-complexes. Selectivity challenges include preventing polyformylation, which is mitigated by stoichiometric control of reagents and low temperatures, and achieving the desired regioselectivity in substituted benzenes, as seen in the high para preference (up to 80%) in Gattermann-Koch formylation of alkylbenzenes due to steric factors in the intracomplex pathway.

Aliphatic Formylation

Aliphatic formylation refers to the introduction of a formyl group (-CHO) into non-aromatic carbon frameworks, such as alkenes, alkanes, or other sp³/sp²-hybridized carbons, distinguishing it from aromatic methods. Unlike aromatic systems, aliphatic formylation often involves catalytic processes that functionalize unsaturated or saturated hydrocarbons, with hydroformylation serving as the predominant classical approach for generating aliphatic aldehydes. This reaction is essential for synthesizing valuable intermediates in bulk chemical production, emphasizing regioselectivity and catalyst efficiency to favor linear products over branched isomers.

The cornerstone of aliphatic formylation is the hydroformylation reaction, also known as the oxo process, which converts alkenes into aldehydes by adding a mixture of carbon monoxide and hydrogen (syngas) across the carbon-carbon double bond. Discovered accidentally by Otto Roelen in 1938 during Fischer-Tropsch studies at Ruhrchemie, the process was patented as a method for aldehyde synthesis and rapidly scaled industrially. The general equation for a terminal alkene is represented as:

$$\ce{R-CH=CH2 + CO + H2 -> R-CH2-CH2-CHO}$$

where R is an alkyl group, yielding primarily the linear (n-) aldehyde, though branched isomers can form depending on conditions.

This transformation is catalyzed by transition metals, with cobalt and rhodium being the most established. Cobalt catalysts, typically HCo(CO)₄, operate under high-pressure conditions (200–300 bar, 150–200°C) and were used in early industrial implementations, such as the Ruhrchemie process starting in 1943. They provide moderate activity but lower selectivity for linear products (typically 60–70% linear aldehyde). Rhodium catalysts, such as HRh(CO)(PPh₃)₃, enable milder conditions (10–50 bar, 80–120°C) and superior performance, achieving n/iso ratios up to 95:5 with phosphine ligands, making them the standard for modern applications. The choice of catalyst influences not only yield but also the feasibility of converting higher olefins.

The mechanism of hydroformylation proceeds via a catalytic cycle involving the hydrido-metal carbonyl complex, as outlined in the pathway proposed by Heck and Breslow. It begins with coordination of the alkene to the hydrido-metal center and hydride migration to form a metal-alkyl species, followed by CO coordination and migratory insertion to form an acyl intermediate. Subsequent hydrogenolysis regenerates the catalyst and releases the aldehyde. A key feature is the preference for anti-Markovnikov addition in rhodium-phosphine systems, driven by steric hindrance from bulky ligands, which enhances linear selectivity. Radical or ionic pathways are not typically involved in these metal-catalyzed processes.

Industrially, hydroformylation is one of the largest homogeneous catalytic processes, producing over 10 million metric tons of aldehydes annually, primarily for conversion to alcohols via hydrogenation. These alcohols serve as precursors for detergents, plasticizers (e.g., di-2-ethylhexyl phthalate from n-butyraldehyde derived from propene), and solvents, with the process contributing significantly to the global chemical economy. For instance, the low-pressure oxo (LP Oxo) process using rhodium catalysts, licensed by Union Carbide (now Dow), exemplifies high-efficiency production of C₃–C₁₅ aldehydes from the corresponding alkenes. While direct C-H formylation of alkanes remains challenging and less developed, hydroformylation dominates aliphatic aldehyde synthesis due to its scalability and atom economy.
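Two quantities recur in the discussion above: the share of linear aldehyde implied by an n/iso ratio, and the atom economy of the addition reaction. The following Python sketch shows one way to compute both. The function names and the propene-to-butanal example are illustrative assumptions; the atomic masses are standard rounded values.

```python
# Standard atomic masses in g/mol (rounded values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Compositions for the illustrative propene -> butanal example.
PROPENE = {"C": 3, "H": 6}
CARBON_MONOXIDE = {"C": 1, "O": 1}
HYDROGEN = {"H": 2}
BUTANAL = {"C": 4, "H": 8, "O": 1}


def molar_mass(formula):
    """Molar mass in g/mol for a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def linear_fraction(n_parts, iso_parts):
    """Convert an n/iso ratio such as 95:5 into the fraction of linear aldehyde."""
    if n_parts < 0 or iso_parts < 0 or n_parts + iso_parts == 0:
        raise ValueError("ratio parts must be non-negative and not both zero")
    return n_parts / (n_parts + iso_parts)


def atom_economy(product, reactants):
    """Mass of desired product divided by total mass of all reactants."""
    return molar_mass(product) / sum(molar_mass(r) for r in reactants)


if __name__ == "__main__":
    print("linear fraction at 95:5:", linear_fraction(95, 5))
    print("atom economy, propene + CO + H2 -> butanal:",
          atom_economy(BUTANAL, [PROPENE, CARBON_MONOXIDE, HYDROGEN]))
```

Because hydroformylation is a pure addition, every reactant atom ends up in the aldehyde, which is why the computed atom economy comes out at essentially 1.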

Recent Advances in Synthetic Methods

Recent advances in formylation chemistry have emphasized sustainable approaches that leverage abundant feedstocks like CO₂ and biobased sources, minimizing waste and enabling milder reaction conditions compared to traditional methods. A notable development involves the N-formylation of amines using CO₂ and H₂ catalyzed by metal complexes, which facilitates the direct incorporation of CO₂ as a C1 building block. In 2021, one such system achieved efficient formylation of various amines under moderate pressures, yielding formamides with high selectivity and turnover numbers up to 10,000. Building on this, an additive-free pincer complex reported in 2024 enabled selective N-formylation at 120°C and 30 bar, demonstrating broad substrate scope including aromatic and aliphatic amines with minimal byproducts.

Electrochemical methods have emerged as energy-efficient alternatives, particularly for N-formylation using methanol as both reagent and solvent. A 2025 study reported a glassy carbon electrode-mediated process for methylamine formylation, proceeding via a methylisocyanide intermediate to afford N-methylformamide with a faradaic efficiency of up to 34% under ambient conditions, highlighting the role of anodic oxidation in activating methanol without additional reagents. This approach reduces reliance on high-pressure gases and offers scalability.

Catalyst innovations have focused on heterogeneous systems to enhance recyclability and enable one-pot formylation sequences. A 2023 review detailed the use of heterogeneous catalysts, such as metal oxides and supported nanoparticles, in direct formylation reactions, including tandem processes that couple formylation with subsequent transformations like cyclization, often under solvent-free conditions to lower environmental impact. Complementing this, biobased feedstocks have gained traction; a 2025 method employed Cu/C₃N₄ heterogeneous catalysts for oxidative N-formylation of amines with a biomass-derived C1 source, achieving quantitative conversion of the C1 unit at 25–50°C in green solvents, thus reducing waste.

These advancements collectively offer significant benefits, including operation under mild conditions (such as 50°C for oxidative variants) and high atom utilization from CO₂ or biobased sources, addressing limitations of stoichiometric reagents. A specific example is the 2024 scale-up of EDTA-mediated N-formylation of amines with CO₂ under ambient conditions, where recyclable EDTA as an organocatalyst delivered 83% conversion to mono-formamides for primary amines like benzylamine, demonstrating practical viability for larger-scale synthesis without metal additives.
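Faradaic efficiency, quoted above for the electrochemical route, is the fraction of the charge passed that ends up in the desired product. The sketch below shows the standard calculation. The function name, the electron count per product molecule, and the numbers in the usage example are hypothetical placeholders chosen for illustration; they are not taken from the cited study.

```python
FARADAY_CONSTANT = 96485.33212  # coulombs per mole of electrons


def faradaic_efficiency(moles_product, electrons_per_molecule, charge_coulombs):
    """Fraction of passed charge accounted for by the desired product.

    FE = z * F * n / Q, where z is the number of electrons transferred per
    product molecule (an assumption the user must supply for their own
    mechanism), n is the moles of product, and Q is the total charge passed.
    """
    if charge_coulombs <= 0:
        raise ValueError("charge_coulombs must be positive")
    return electrons_per_molecule * FARADAY_CONSTANT * moles_product / charge_coulombs


if __name__ == "__main__":
    # Hypothetical run: 0.1 mmol of product, 2 electrons each, 96.485 C passed.
    fe = faradaic_efficiency(1.0e-4, 2, 96.485)
    print(f"faradaic efficiency: {fe:.1%}")
```

A reported value such as 34% would mean that roughly one third of the charge passed went into forming the formamide, with the rest consumed by side reactions.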

Formylation in Biology

Role in Methanogenesis

In methanogenesis, the biological production of methane by methanogenic archaea, formylation serves as the initial step in the reductive assimilation of carbon dioxide (CO₂) within the Wood-Ljungdahl pathway, enabling the fixation of one-carbon (C1) units essential for energy generation and autotrophic growth. This process occurs in hydrogenotrophic methanogens, which utilize CO₂ and H₂ to produce CH₄, and involves the enzyme formyl-methanofuran dehydrogenase (Fmd or Fwd), a molybdenum- or tungsten-containing iron-sulfur protein that catalyzes the reduction of CO₂ to a formyl group attached to the coenzyme methanofuran (MFR). The reaction proceeds without ATP hydrolysis, distinguishing it from analogous bacterial pathways, and is coupled to electron donation from reduced ferredoxin or hydrogenase-generated low-potential electrons. The core formylation reaction is:

$$\ce{CO2} + 2[\text{H}] + \text{methanofuran} \rightarrow \text{formyl-methanofuran} + \ce{H2O}$$

catalyzed by formyl-methanofuran dehydrogenase, which first reduces CO₂ to a bound formate intermediate at a molybdo- or tungstopterin guanine dinucleotide site before transferring it to MFR's amino group via a binuclear metal center, yielding formyl-MFR (also written as formylmethanofuran). Subsequently, a formyltransferase enzyme (Ftr) in the Wood-Ljungdahl pathway transfers the formyl group from formyl-MFR to tetrahydromethanopterin (H₄MPT), facilitating further C1 assimilation into the methyl branch of the pathway, where it is ultimately converted to methane via coenzyme M and methyl-coenzyme M reductase. This sequence is critical for C1 metabolism in methanogens, as mutations or deletions in fmd genes abolish growth on H₂/CO₂, underscoring the pathway's indispensability.

Biologically, formylation enables methanogens to thrive in anaerobic environments like sediments, ruminant guts, and hydrothermal vents by coupling CO₂ fixation to ATP synthesis through chemiosmotic ion translocation, supporting their role in anoxic ecosystems. Environmentally, methanogenesis contributes significantly to the global carbon cycle, with methanogens mineralizing up to 2% of fixed carbon (approximately 450 Tg annually) into methane, a potent greenhouse gas that influences climate feedback loops.

Formylation in Protein Synthesis

In protein synthesis, formylation plays a critical role in the initiation of translation, particularly in prokaryotes, mitochondria, and chloroplasts, where N-formylmethionine (fMet) serves as the universal initiator amino acid. This process ensures accurate assembly of the ribosomal complex by distinguishing the start site from internal codons. The formyl group is transferred from 10-formyltetrahydrofolate (formyl-THF) to the α-amino group of methionyl-tRNA^Met, forming fMet-tRNA^Met, which binds to the ribosomal P site via initiation factors. In prokaryotes, the enzyme methionyl-tRNA formyltransferase (MTF), encoded by the fmt gene, catalyzes this reaction, as described by the equation:

$$\text{Met-tRNA}^{\text{Met}} + \text{formyl-THF} \rightarrow \text{fMet-tRNA}^{\text{Met}} + \text{THF}$$

This formylation step enhances the fidelity of initiation by promoting stable codon-anticodon pairing at AUG start codons and preventing the initiator tRNA from being used during elongation. Post-initiation, the formyl group is typically removed cotranslationally by peptide deformylase (PDF), followed by potential cleavage of the N-terminal methionine by methionine aminopeptidase (MAP), yielding mature proteins without the formyl modification in most cases.

However, in bacterial proteins, incomplete or alternative processing can leave N-formylmethionine residues intact, forming N-formyl peptides that serve as potent signals for the mammalian innate immune system. These peptides bind to formyl peptide receptors (FPRs), particularly FPR1 and FPR2 on neutrophils and other immune cells, triggering chemotaxis, degranulation, and reactive oxygen species production to combat bacterial infections. Additionally, N-formyl peptides can be presented by MHC-like molecules such as H2-M3 to CD8⁺ T cells, enabling adaptive immune recognition of pathogens like Mycobacterium tuberculosis. This formylation-derived signaling distinguishes prokaryotic proteins from eukaryotic ones, playing a vital role in host defense.

In eukaryotic mitochondria, a homologous system conserves this prokaryotic-like mechanism due to the endosymbiotic origin of mitochondria.
The mitochondrial methionyl-tRNA formyltransferase (MTFMT) performs the formylation of the single mitochondrial tRNA^Met, which serves dual roles in initiation (as fMet-tRNA^Met) and elongation (as Met-tRNA^Met). Unlike the cytosolic translation machinery, where initiation occurs directly with unformylated methionine via eukaryotic initiation factor 2 (eIF2), mitochondrial translation strictly requires fMet for efficient 55S ribosome assembly and peptidyl transferase activity. This conservation highlights the evolutionary retention of bacterial-style initiation in organelles, with MTFMT ensuring that formyl-THF-dependent formylation proceeds without interference from cytosolic deformylases. A similar formylation-dependent initiation occurs in chloroplasts, reflecting their bacterial ancestry and ensuring precise translation of organellar genomes. A key function of formylation emerged from recent studies showing its role in protecting initiator methionines from oxidative damage. In both prokaryotes and mitochondria, methionine residues can oxidize to methionine sulfoxide (MetO) under reactive oxygen species stress, impairing downstream processing. Formylation accelerates the reduction of these oxidized residues by methionine sulfoxide reductases (MsrA and MsrB), with in vitro assays demonstrating 2- to 6-fold higher catalytic efficiency for N-formyl-MetO compared to unformylated MetO. This enhancement prevents accumulation of misprocessed proteins by enabling timely deformylation and maturation, thereby maintaining translational fidelity during oxidative conditions. In Escherichia coli mutants lacking MTF, elevated levels of oxidized N-terminal methionines correlate with increased oxidative stress sensitivity, underscoring formylation's protective mechanism.

Formylation in Purine Biosynthesis

In de novo purine biosynthesis, formylation reactions incorporate carbon atoms into the purine ring using N10-formyl-tetrahydrofolate (N10-formyl-THF) as the one-carbon donor, linking this pathway to broader one-carbon metabolism. Two distinct formylation events occur: the first at step 3, where glycinamide ribonucleotide (GAR) is converted to formylglycinamide ribonucleotide (FGAR), contributing the C8 atom to the purine ring; and the second at step 9, where 5-aminoimidazole-4-carboxamide ribonucleotide (AICAR) is formylated to 5-formaminoimidazole-4-carboxamide ribonucleotide (FAICAR), providing the C2 atom. These steps are essential for assembling inosine monophosphate (IMP), the precursor to adenine and guanine nucleotides required for DNA and RNA synthesis.

The initial formylation is catalyzed by GAR transformylase, encoded by the purN gene in bacteria such as Escherichia coli or present as the GART domain of the trifunctional GART enzyme in mammals. This folate-dependent enzyme transfers the formyl group from N10-formyl-THF directly to GAR via a tetrahedral intermediate, with key active-site residues facilitating hydrogen bonding. The reaction is:

$$\text{GAR} + \text{N}^{10}\text{-formyl-THF} \rightarrow \text{FGAR} + \text{THF}$$

In some bacteria, an ATP-dependent GAR transformylase (PurT) substitutes for PurN, utilizing formate as the formyl source rather than N10-formyl-THF, potentially providing metabolic flexibility under folate limitation. PurT, a monomeric enzyme, likely proceeds through a formyl phosphate intermediate.

The second formylation is mediated by AICAR transformylase, the C-terminal domain of the bifunctional enzyme ATIC (encoded by purH in bacteria or ATIC in humans), which also possesses IMP cyclohydrolase activity. This domain transfers the formyl group from N10-formyl-THF to AICAR, enabling the final ring closure to IMP; kinetic studies show moderate substrate affinity, with Km values around 30–140 μM for AICAR and N10-formyl-THF analogs. Unlike GAR transformylase, AICAR transformylase exhibits no sequence homology to PurN and requires the (6R)-isomer of the cofactor.

These formylation steps are biologically critical, as disruptions impair purine production and nucleic acid synthesis, particularly in rapidly dividing cells reliant on de novo synthesis. For instance, the antifolate inhibitor lometrexol potently targets GAR transformylase (Ki ≈ 60 nM), blocking purine production and exerting cytotoxic effects in tumor cells with limited salvage pathways. This has positioned these enzymes as therapeutic targets in oncology, though polyglutamylation requirements can limit efficacy.
| Enzyme | Step | Substrate → Product | Formyl Donor | Organism Example | Key Feature |
| --- | --- | --- | --- | --- | --- |
| GAR Transformylase (PurN/GART) | 3 | GAR → FGAR | N10-formyl-THF | E. coli (PurN), human (GART) | Adds C8; folate-dependent |
| GAR Transformylase (PurT) | 3 (alternative) | GAR → FGAR | Formate + ATP | E. coli | Folate-independent backup |
| AICAR Transformylase (PurH/ATIC) | 9 | AICAR → FAICAR | N10-formyl-THF | E. coli (PurH), human (ATIC) | Adds C2; bifunctional with cyclohydrolase |
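To give a feel for what an inhibition constant in the tens of nanomolar means for a transformylase, the following Python sketch evaluates the textbook Michaelis-Menten rate law with a competitive inhibitor. Treating the inhibition as purely competitive, the function name, and the Km and substrate concentration in the usage example are all assumptions made for illustration; only the Ki of about 60 nM comes from the text.

```python
def relative_rate(substrate_um, km_um, inhibitor_um, ki_um):
    """Reaction rate as a fraction of Vmax under competitive inhibition.

    v / Vmax = [S] / (Km * (1 + [I] / Ki) + [S])
    All concentrations are in micromolar. The competitive model is an
    assumption of this sketch, not a statement about the real enzyme.
    """
    if km_um <= 0 or ki_um <= 0:
        raise ValueError("km_um and ki_um must be positive")
    if substrate_um < 0 or inhibitor_um < 0:
        raise ValueError("concentrations must be non-negative")
    apparent_km = km_um * (1.0 + inhibitor_um / ki_um)
    return substrate_um / (apparent_km + substrate_um)


if __name__ == "__main__":
    KI_UM = 0.06        # 60 nM, the Ki quoted for lometrexol in the text
    KM_UM = 10.0        # hypothetical Km, chosen only for illustration
    SUBSTRATE_UM = 10.0  # hypothetical substrate concentration

    for inhibitor_um in (0.0, 0.06, 0.6):
        rate = relative_rate(SUBSTRATE_UM, KM_UM, inhibitor_um, KI_UM)
        print(f"[I] = {inhibitor_um:>4} uM -> v/Vmax = {rate:.3f}")
```

Under these assumptions, an inhibitor concentration equal to Ki doubles the apparent Km, so at a substrate concentration equal to Km the rate falls from one half to one third of Vmax.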

Formylation in Histone Modification

Histone formylation involves the addition of a formyl group to the ε-amino group of lysine residues on histone proteins, serving as a post-translational modification (PTM) that functions as an epigenetic mark to regulate chromatin structure and gene expression. This modification is structurally similar to acetylation but introduces a shorter acyl chain; like acetylation, it neutralizes the positive charge of lysine and can alter the interactions of histone tails with DNA and reader proteins. Unlike many canonical histone PTMs, formylation is predominantly non-enzymatic and has been observed across core histones (H2A, H2B, H3, H4) and linker histones (H1).

Histone lysine formylation occurs primarily through non-enzymatic reaction with reactive formyl species generated during oxidative DNA damage, such as 3'-formylphosphate residues formed by deoxyribose oxidation at DNA strand breaks. These formyl groups attach covalently to lysine residues, with notable sites including H3K14 and H4K5, which overlap with common acetylation and methylation positions. Although enzymatic lysine formyltransferases have been identified in bacterial systems, no dedicated histone-specific formyltransferases are confirmed in eukaryotes; instead, the process is also driven by metabolic byproducts such as formaldehyde from one-carbon metabolism or demethylation reactions. This non-enzymatic pathway results in low but detectable levels of formylation (0.04–0.1% of lysines in chromatin), which accumulate particularly under conditions of cellular stress.

Formylation inhibits subsequent histone modifications by blocking the ε-amino group of lysine, thereby preventing acetylation or methylation at the same sites and disrupting the recruitment of regulatory complexes. Discovered in 2007 as a secondary modification linked to oxidative DNA damage, it may contribute to the DNA damage response by altering chromatin accessibility and signaling pathways. For instance, formylated lysines mimic acetylated states but resist processing by most deacetylases, leading to prolonged chromatin compaction.

Regulation of histone formylation involves reversal by deformylases, with HDAC6 identified as a proficient deformylase capable of hydrolyzing formyl groups from lysine residues and restoring the unmodified state. Subsequent studies have further elucidated HDAC6's specificity for formylated peptides, suggesting that it maintains epigenetic balance by counteracting non-enzymatic accumulation. Class I HDACs show limited deformylase activity, highlighting HDAC6's distinctive role in this pathway.

Biologically, formylation contributes to transcriptional repression during stress conditions, such as oxidative damage, by promoting chromatin stability and inhibiting activator binding at modified sites. This repression helps coordinate cellular responses to genotoxic stress, potentially safeguarding genomic integrity while suppressing non-essential transcription. In pathological contexts like cancer, elevated formylation correlates with dysregulated gene expression, underscoring its role as a stress-responsive epigenetic mark.
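The structural comparison between formylation, acetylation, and methylation at the same lysine sites can be made concrete by computing the mass each group adds. The sketch below is illustrative only: it derives monoisotopic mass shifts from elemental compositions and estimates the relative mass accuracy needed to tell formylation apart from dimethylation, which adds nearly the same nominal mass. The function names and the 1500 Da peptide mass are hypothetical choices for the example, not values taken from the studies discussed here.

```python
# Monoisotopic masses (Da) of the elements involved.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

# Net elemental change when each group is installed on the lysine
# epsilon-amino group (each group replaces one amine hydrogen;
# dimethylation replaces two).
MODIFICATIONS = {
    "formyl": {"C": 1, "O": 1},          # +CHO, -H
    "acetyl": {"C": 2, "H": 2, "O": 1},  # +COCH3, -H
    "dimethyl": {"C": 2, "H": 4},        # +2 CH3, -2 H
}


def mass_shift(delta):
    """Return the monoisotopic mass added by an elemental change."""
    return sum(MONOISOTOPIC[element] * count for element, count in delta.items())


def resolving_ppm(mod_a, mod_b, peptide_mass):
    """Relative mass difference (ppm) between two modifications on a peptide.

    peptide_mass is a hypothetical peptide mass in Da supplied by the caller.
    """
    diff = abs(mass_shift(MODIFICATIONS[mod_a]) - mass_shift(MODIFICATIONS[mod_b]))
    return diff / peptide_mass * 1e6


if __name__ == "__main__":
    for name, delta in MODIFICATIONS.items():
        print(f"{name:9s} +{mass_shift(delta):.4f} Da")
    print(f"formyl vs dimethyl on a 1500 Da peptide: "
          f"{resolving_ppm('formyl', 'dimethyl', 1500.0):.1f} ppm")
```

Under these assumptions, the formyl group adds about 14 Da less than an acetyl group, consistent with its shorter acyl chain, while differing from dimethylation by only a few hundredths of a dalton.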

Formylation in Medicine

Formylation as a Therapeutic Target

Formylation pathways in cellular metabolism present viable targets for therapeutic intervention, particularly through inhibition of key enzymes involved in purine synthesis and protein maturation. Glycinamide ribonucleotide (GAR) transformylase, an enzyme in de novo purine biosynthesis that catalyzes the transfer of a formyl group from 10-formyltetrahydrofolate to GAR, has been targeted with antifolate inhibitors to disrupt purine synthesis. Lometrexol, a classical antifolate, potently inhibits GAR transformylase, leading to purine depletion and antitumor effects; it advanced to phase I clinical trials, where weekly administration schedules were identified as tolerable, with dose-limiting toxicities including myelosuppression. Later antifolates such as pemetrexed, which inhibits GAR transformylase alongside other folate-dependent enzymes, have shown efficacy in clinical settings for cancers such as non-small cell lung cancer, underscoring the pathway's continued relevance in antifolate therapy.

In infectious diseases, bacterial peptide deformylase (PDF) serves as a critical antibacterial target due to its role in removing the formyl group from the N-terminal formylmethionine of newly synthesized proteins, a step not required in eukaryotic cytosolic protein synthesis. Actinonin, a naturally derived hydroxamate inhibitor, binds tightly to bacterial PDF and exhibits broad-spectrum activity against Gram-positive and Gram-negative pathogens, including multidrug-resistant strains. Since actinonin emerged as a lead compound in the early 2000s, actinonin-inspired PDF inhibitors have progressed through preclinical and early clinical studies for treating bacterial infections, with structure-activity optimization focusing on potency and pharmacokinetics. In November 2025, Flightpath Biosciences licensed FP530 (Formibactin A), a clinical-stage oral PDF inhibitor for eradicating specific Gram-negative bacteria.

Emerging research post-2020 has explored formylation's role in epigenetics, particularly histone formylation, as a potential cancer therapeutic avenue. This modification, chemically akin to acetylation and often arising under oxidative stress, alters chromatin accessibility and gene expression in tumor cells, potentially disrupting oncogenic signaling. Although the modification is primarily non-enzymatic, pharmacological modulation of formylation levels holds promise for epigenetic reprogramming in cancer, with studies indicating links to altered genome stability in malignancies.

A major challenge across these targets is selectivity: inhibition of human homologs, such as mitochondrial PDF or folate-dependent enzymes, can cause off-target toxicity, so inhibitors must be designed to distinguish bacterial or dysregulated cancer-specific activities from host processes.

Formylation-Related Disorders

Leigh syndrome represents a primary mitochondrial disorder associated with defective formylation, particularly through mutations in the mitochondrial methionyl-tRNA formyltransferase (MTFMT) gene, which impair the formylation of initiator methionyl-tRNA essential for mitochondrial protein synthesis. This disruption leads to combined oxidative phosphorylation deficiencies, predominantly affecting respiratory chain complexes I and IV, and results in neurodegeneration and energy failure in high-demand tissues like the brain and muscle. The condition was first linked to MTFMT mutations in 2011, with subsequent genetic studies in the 2020s confirming its role in milder variants of Leigh syndrome characterized by slower progression compared to other etiologies.

Clinical manifestations of MTFMT-related Leigh syndrome typically emerge in infancy or early childhood, featuring progressive neurological impairment, developmental regression, seizures, and episodic deterioration often triggered by infections or metabolic stress. Brain MRI reveals characteristic bilateral symmetric lesions in the basal ganglia and brainstem, while biochemical tests show elevated lactate levels in blood and cerebrospinal fluid. The prevalence of Leigh syndrome overall is estimated at approximately 1 in 40,000 live births, though MTFMT variants account for a small subset, with carrier frequencies varying by population.
Beyond mitochondrial formylation defects, disruptions in cytosolic formylation pathways, such as those involving folate metabolism, have been implicated in broader metabolic disorders. Folate deficiencies reduce the availability of 10-formyltetrahydrofolate, a key formyl donor for purine biosynthesis, leading to impaired nucleotide production and associated conditions like megaloblastic anemia and an increased risk of neural tube defects. These deficiencies indirectly affect the formylation-dependent steps of purine ring assembly, exacerbating cellular dysfunction in rapidly dividing tissues.

Emerging research has also connected formylation to oxidative stress responses in aging-related pathologies. A 2024 study demonstrated that formylation of initiator methionines facilitates their reduction by methionine sulfoxide reductases, preventing the accumulation of oxidized proteins that contribute to cellular dysfunction in neurodegenerative and age-associated diseases like Alzheimer's and Parkinson's. Defects in this protective mechanism may amplify protein misfolding and mitochondrial damage in aging tissues.

Diagnosis of formylation-related disorders relies on targeted genetic sequencing of MTFMT and related genes, often prompted by clinical suspicion, muscle biopsy showing ragged-red fibers, and enzymatic assays confirming respiratory chain defects. Treatment remains supportive, encompassing nutritional supplementation (e.g., vitamins and cofactors), management of metabolic acidosis, and anticonvulsants for seizures. No approved curative therapies exist as of 2025, though investigational approaches such as gene therapies for specific mutations are in development. Prognosis varies: MTFMT cases show potential for longer survival into adolescence compared to more severe forms, though most patients experience significant disability.
