
Gene expression

Gene expression is the process by which the genetic information encoded in a gene's DNA sequence is converted into a functional product, such as a protein or non-coding RNA, primarily through the sequential steps of transcription and translation. In transcription, the enzyme RNA polymerase synthesizes a complementary messenger RNA (mRNA) strand from the DNA template; in eukaryotic cells this occurs within the nucleus, and the transcript is then exported to the cytoplasm. Translation occurs at ribosomes, where the mRNA is read in triplets (codons) to direct the assembly of amino acids into a polypeptide chain, forming the primary structure of a protein that folds into its functional form. This flow of information, known as the central dogma of molecular biology, enables the manifestation of genetic traits and cellular functions, with only a subset of an organism's genes expressed in any given cell at a specific time.

Gene expression is highly regulated, rather than uniformly constitutive, to maintain cellular homeostasis and adapt to internal and external signals. Regulation occurs at multiple levels, including transcriptional control, where transcription factors bind to promoter regions of DNA to initiate or repress mRNA synthesis; post-transcriptional mechanisms, such as mRNA splicing, capping, polyadenylation, and degradation; and translational controls that modulate protein synthesis efficiency. Epigenetic modifications, like DNA methylation and histone acetylation, further influence chromatin accessibility, thereby fine-tuning gene activity without altering the underlying DNA sequence. These regulatory layers ensure precise spatiotemporal control, allowing multicellular organisms to develop diverse cell types from a single genome; for instance, neurons express genes for neurotransmitter receptors, while muscle cells prioritize those for contractile proteins.

The study and manipulation of gene expression have profound implications for biology and medicine. Dysregulated expression underlies numerous diseases, including cancers driven by oncogene activation or tumor suppressor silencing, and genetic disorders like cystic fibrosis resulting from mutations in the CFTR gene. Techniques such as RNA sequencing and CRISPR-based editing have revolutionized the ability to profile and alter expression patterns, facilitating insights into development, evolution, and therapeutic interventions. Ultimately, gene expression orchestrates the complexity of life, bridging genotype to phenotype across all organisms.

Overview

Definition and importance

Gene expression is the process by which the information encoded in a gene's DNA sequence is converted into a functional product, primarily through the synthesis of RNA and proteins. This involves two main steps: transcription, where the DNA sequence is copied into messenger RNA (mRNA), and translation, where the mRNA sequence is decoded to produce a polypeptide chain that folds into a functional protein. The concept is encapsulated in the central dogma of molecular biology, proposed by Francis Crick, which posits that genetic information flows from DNA to RNA to protein, ensuring the faithful transmission and utilization of genetic instructions within cells. While this framework holds for most cellular processes, exceptions exist, such as reverse transcription in retroviruses, where RNA serves as a template for DNA synthesis.

Gene expression operates across multiple levels, extending beyond protein-coding genes to include the production of non-coding RNAs (ncRNAs), which are not translated into proteins but play crucial regulatory roles. These ncRNAs, such as microRNAs and long non-coding RNAs, modulate gene activity by influencing transcription, RNA stability, and chromatin structure, thereby fine-tuning cellular responses. The overall process thus spans DNA transcription, RNA maturation and, where applicable, protein synthesis.

Gene expression underpins nearly every aspect of cellular and organismal function, from growth and development to environmental adaptation and homeostasis. By selectively activating or repressing specific genes, cells differentiate into specialized types, such as neurons or muscle cells, despite sharing the same genome. For instance, Hox genes, which encode a family of transcription factors, are expressed in precise spatial and temporal patterns during embryonic development to direct body patterning and segmentation in animals. Dysregulation of gene expression can lead to diseases like cancer, underscoring its essential role in maintaining physiological balance and responding to stimuli.

Historical development

The foundations of gene expression were laid in the 1940s through experiments linking genes to biochemical functions. In 1941, George Beadle and Edward Tatum proposed the "one gene-one enzyme" hypothesis based on their studies of Neurospora mutants, demonstrating that specific genes direct the production of individual enzymes involved in metabolic pathways. This idea built on earlier genetic work but shifted focus toward molecular mechanisms. Three years later, in 1944, Oswald Avery, Colin MacLeod, and Maclyn McCarty provided crucial evidence that DNA serves as the genetic material by showing that purified DNA from virulent pneumococci could transform non-virulent strains, ruling out proteins as the transforming principle.

The molecular era began with the elucidation of DNA's structure in 1953 by James Watson and Francis Crick, who described the double helix and its base-pairing rules, implying a mechanism for genetic information storage and replication that underpins gene expression. This paved the way for understanding how genes are read. In 1961, François Jacob and Jacques Monod introduced the concept of messenger RNA (mRNA) as an intermediary carrying genetic instructions from DNA to ribosomes for protein synthesis, detailed in their seminal paper on genetic regulation. That same year, Jacob and Monod proposed the lac operon model in E. coli, illustrating how genes are coordinately regulated through proteins that control transcription in response to environmental signals like lactose. Concurrently, Marshall Nirenberg and J. Heinrich Matthaei deciphered the first codon of the genetic code by using synthetic poly-uridylic acid RNA to direct incorporation of phenylalanine into protein, revealing that UUU specifies phenylalanine and establishing RNA's role in translation.

Subsequent decades revealed greater complexity, particularly in eukaryotes. In 1977, Phillip Sharp and Richard Roberts independently discovered introns, non-coding sequences interrupting eukaryotic genes, through electron microscopy of adenovirus RNA hybridized to DNA, showing that pre-mRNA is spliced to form mature mRNA. This finding challenged the gene continuity assumed from prokaryotic models and highlighted RNA processing as a key step in gene expression. Later milestones included the 1998 discovery of RNA interference (RNAi) by Andrew Fire and Craig Mello, who demonstrated that double-stranded RNA triggers sequence-specific degradation of homologous mRNAs in C. elegans, unveiling a natural mechanism for post-transcriptional gene silencing. From 2012 onward, the adaptation of CRISPR-Cas9 by Martin Jinek, Jennifer Doudna, Emmanuelle Charpentier, and colleagues enabled precise manipulation of gene expression by targeting and editing DNA sequences, revolutionizing studies of regulatory elements. These advances marked a progression from prokaryotic simplicity to eukaryotic intricacy, transforming gene expression from an abstract genetic concept into a manipulable molecular process.

Molecular mechanisms

Transcription

Transcription is the first stage of gene expression, in which the genetic information encoded in DNA is copied into RNA, such as messenger RNA (mRNA), by the enzyme RNA polymerase. This process is template-dependent: RNA polymerase synthesizes an RNA strand complementary to one of the DNA strands, following the usual base-pairing rules except that uracil (U), rather than thymine (T), pairs with adenine (A) in RNA. Transcription converts the stable DNA blueprint into a transient RNA molecule that can be used for protein synthesis or other cellular functions.

In prokaryotes, such as bacteria, transcription is carried out by a single type of RNA polymerase, a multi-subunit enzyme whose core consists of five subunits (two α, one β, one β', and one ω) and catalyzes RNA synthesis. The core enzyme requires a sigma (σ) factor to form the holoenzyme, which enables specific promoter recognition. The primary σ factor, σ70 in Escherichia coli, binds to conserved promoter sequences, including the -10 box (TATAAT consensus) and the -35 box (TTGACA consensus), facilitating the initial binding of RNA polymerase to DNA. Alternative sigma factors recognize different promoters, enabling responses to environmental changes.

In eukaryotes, three distinct RNA polymerases handle transcription: RNA polymerase I (Pol I) synthesizes ribosomal RNA, Pol III produces transfer RNA and other small RNAs, and RNA polymerase II (Pol II) transcribes mRNA and some non-coding RNAs. For mRNA synthesis, Pol II, a large complex with 12 subunits, relies on general transcription factors (GTFs) for promoter recognition and assembly of the pre-initiation complex (PIC). The core promoter often includes the TATA box (TATAAA consensus, located ~25-30 base pairs upstream of the transcription start site), to which the TATA-binding protein (TBP, a subunit of TFIID) binds, bending the DNA and recruiting other GTFs such as TFIIA, TFIIB, TFIIE, TFIIF, and TFIIH. TFIIH's helicase activity unwinds the DNA to form the open complex.

The transcription process consists of three main phases: initiation, elongation, and termination. Initiation begins with promoter recognition and DNA unwinding to form the open complex, followed by synthesis of the first few RNA nucleotides. In prokaryotes, the polymerase often releases several short transcripts without clearing the promoter (abortive initiation) before escaping into productive elongation, and the sigma factor dissociates shortly afterwards, allowing the core enzyme to proceed. In eukaryotes, Pol II leaves the PIC but enters a promoter-proximal paused state before productive elongation, regulated by factors like NELF and DSIF.

During elongation, RNA polymerase moves along the DNA template at an average rate of approximately 40-50 nucleotides per second in prokaryotes and 20-40 nucleotides per second in eukaryotes, adding ribonucleotides to the growing 3' end of the RNA chain, so that the chain extends in the 5' to 3' direction. The enzyme maintains fidelity through kinetic proofreading and induced-fit mechanisms, achieving an error rate of about 10^{-4} to 10^{-5} per nucleotide incorporated, which is lower than base-pairing energetics alone would give. In prokaryotes, elongation is coupled with translation, as ribosomes can bind nascent mRNA while it is still being transcribed, whereas in eukaryotes transcription occurs in the nucleus, separated from translation in the cytoplasm.

Termination ends RNA synthesis and releases the transcript and polymerase. In prokaryotes, two main mechanisms exist: rho-independent termination, where a GC-rich hairpin forms in the RNA followed by a poly-U tract that weakens RNA-DNA interactions, and rho-dependent termination, involving the rho helicase, which translocates along the RNA and disrupts the elongation complex. In eukaryotes, Pol II termination is linked to the polyadenylation signal (AAUAAA) in the pre-mRNA, which triggers cleavage and poly(A) tail addition, followed by the torpedo mechanism, in which the Rat1 exonuclease (XRN2 in humans) degrades the downstream RNA, leading to polymerase release.
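The template-directed pairing described in this section can be sketched in a few lines of Python. This is a minimal illustration, not a model of the enzyme: the function name `transcribe` and the convention of writing the template strand 3' to 5' (so that the returned RNA reads 5' to 3') are assumptions made for the example.

```python
# Pairing applied by RNA polymerase when reading a DNA template:
# template A -> U, T -> A, G -> C, C -> G (uracil replaces thymine in RNA).
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') copied from a template strand.

    The template is assumed to be written 3'->5', so each template base
    lines up with the RNA base it directs (illustrative convention).
    """
    rna = []
    for base in template_3_to_5.upper():
        if base not in TEMPLATE_TO_RNA:
            raise ValueError(f"not a DNA base: {base!r}")
        rna.append(TEMPLATE_TO_RNA[base])
    return "".join(rna)


if __name__ == "__main__":
    # Template 3'-TACGGT-5' directs synthesis of 5'-AUGCCA-3'.
    print(transcribe("TACGGT"))
```

The output has the same sequence as the non-template (coding) DNA strand with U in place of T, which is why gene sequences are conventionally reported from the coding strand.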

RNA processing and maturation

In eukaryotic cells, RNA processing and maturation occur co-transcriptionally and post-transcriptionally to convert primary transcripts, known as pre-mRNAs, into functional mature RNAs capable of export from the nucleus and subsequent use in the cytoplasm. This multifaceted process ensures the removal of non-coding sequences, addition of protective modifications, and quality surveillance to prevent the accumulation of defective molecules. Key steps include 5' capping, 3' polyadenylation, splicing, and specific maturation pathways for non-coding RNAs, culminating in nuclear export through dedicated transport receptors.

The 5' capping of pre-mRNA involves the addition of a 7-methylguanosine (m7G) cap to the first nucleotide via a 5'-5' triphosphate linkage, occurring shortly after transcription initiation by RNA polymerase II. This modification is carried out by three enzymatic activities: RNA triphosphatase removes the gamma phosphate, guanylyltransferase adds GMP, and guanine-7-methyltransferase methylates the guanine at the N7 position. The cap enhances mRNA stability by protecting against 5' exonucleases and facilitates translation initiation by recruiting eukaryotic initiation factor 4E (eIF4E) in the cytoplasm.

Polyadenylation at the 3' end entails cleavage of the pre-mRNA downstream of a polyadenylation signal (typically AAUAAA) followed by the addition of a poly(A) tail of 50-250 adenosine residues. This tail is synthesized by poly(A) polymerase, which adds adenosine residues from ATP without a template, in coordination with the cleavage and polyadenylation specificity factor (CPSF) and cleavage stimulation factor (CstF). The poly(A) tail promotes mRNA export from the nucleus, enhances stability by impeding 3' exonucleolytic degradation, and supports translation by binding poly(A)-binding protein (PABP), which helps circularize the mRNA by bridging the tail to the cap-binding complex.

Splicing removes introns and joins exons through the action of the spliceosome, a large ribonucleoprotein complex assembled stepwise on pre-mRNA introns marked by conserved 5' and 3' splice sites, a branch point, and a polypyrimidine tract. The spliceosome, comprising the U1, U2, U4/U6, and U5 small nuclear ribonucleoproteins (snRNPs), catalyzes two transesterification reactions: the branch point adenosine attacks the 5' splice site to form a lariat intermediate, followed by 3' splice site cleavage and exon ligation. Alternative splicing, where different exon combinations are selected, generates multiple mRNA isoforms from a single gene, expanding proteomic diversity; up to 95% of human multi-exon genes undergo this process, enabling tissue-specific and developmental regulation.

Maturation of non-coding RNAs follows specialized pathways distinct from mRNA processing. Ribosomal RNA (rRNA) precursors are transcribed by RNA polymerase I and processed in the nucleolus, where small nucleolar ribonucleoproteins (snoRNPs) guide chemical modification (box C/D snoRNPs direct 2'-O-methylation and box H/ACA snoRNPs direct pseudouridylation) and facilitate cleavage at specific sites to yield mature 18S, 5.8S, and 28S rRNAs. Transfer RNA (tRNA) maturation, occurring in both nucleus and cytoplasm, involves removal of the 5' leader by RNase P and trimming of the 3' trailer by other nucleases, followed by the template-independent addition of a CCA sequence to the 3' terminus by tRNA nucleotidyltransferase, which is essential for aminoacylation by aminoacyl-tRNA synthetases.

Quality control mechanisms, such as nonsense-mediated decay (NMD), degrade aberrant transcripts harboring premature termination codons (PTCs) located more than 50-55 nucleotides upstream of an exon-exon junction. NMD is triggered during the pioneer round of translation when the ribosome encounters a PTC; the up-frameshift proteins (UPF1, UPF2, UPF3), together with the downstream exon junction complex (EJC), then mark the mRNA for rapid degradation by endonucleases and exonucleases, thereby preventing the synthesis of truncated, potentially harmful proteins. This surveillance pathway targets approximately 5-10% of human transcripts under normal conditions, including those arising from splicing errors.

In eukaryotes, mature mRNAs are exported from the nucleus to the cytoplasm through nuclear pore complexes via receptor-mediated transport. The primary export receptor for bulk mRNA is NXF1 (TAP), which binds the mRNA via adaptor proteins like ALY/REF and interacts with nucleoporins; however, certain transcripts, such as unspliced viral mRNAs or specific cellular mRNAs, use exportins like CRM1 (exportin 1), which recognizes leucine-rich nuclear export signals in the presence of Ran-GTP to facilitate selective export. This export step is tightly coupled to prior processing events, ensuring that only properly capped, polyadenylated, and spliced RNAs are transported.
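The three mRNA processing steps described above (capping, splicing, polyadenylation) can be illustrated with a small string-based sketch. Everything here is a simplification under stated assumptions: the function names, the `"m7G-"` text marker standing in for the cap, and the requirement that each intron begin with GU and end with AG (the canonical splice-site dinucleotides of most introns) are choices made for the example, and intron positions are supplied by the caller rather than predicted.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Join exons by removing introns given as 0-based, half-open (start, end)."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position:
            raise ValueError("introns must not overlap")
        intron = pre_mrna[start:end]
        # Canonical introns start with GU (5' splice site) and end with AG (3' splice site).
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"non-canonical intron at {start}-{end}")
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


def mature_mrna(pre_mrna: str, introns: list[tuple[int, int]], tail_length: int = 200) -> str:
    """Return a text representation of a capped, spliced, polyadenylated mRNA."""
    # "m7G-" is only a label for the 5' cap; the tail is a run of adenosines.
    return "m7G-" + splice(pre_mrna, introns) + "A" * tail_length


if __name__ == "__main__":
    exon1, intron, exon2 = "AUGGCC", "GUAAGUCCUAG", "UUUUAA"
    pre = exon1 + intron + exon2
    print(mature_mrna(pre, [(len(exon1), len(exon1) + len(intron))], tail_length=10))
```

Passing different intron lists for the same pre-mRNA gives different mature sequences, which is the string-level analogue of alternative splicing.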

Translation

Translation is the process by which the genetic information encoded in messenger RNA (mRNA) is decoded to synthesize proteins on ribosomes. This step occurs in the cytoplasm of prokaryotes and eukaryotes, using the genetic code to specify the sequence of amino acids in the polypeptide chain. The core components involved include ribosomes, transfer RNAs (tRNAs), and aminoacyl-tRNA synthetases. Ribosomes consist of two subunits: in prokaryotes, the small 30S subunit and large 50S subunit assemble into the 70S ribosome, while in eukaryotes, the 40S and 60S subunits form the 80S ribosome. tRNAs serve as adaptor molecules that carry specific amino acids to the ribosome, with their anticodon regions base-pairing to mRNA codons; there are typically 20 aminoacyl-tRNA synthetases, one for each amino acid, that catalyze the attachment of amino acids to their cognate tRNAs with high specificity.

The genetic code, elucidated through experiments using synthetic polynucleotides in cell-free systems, comprises 64 codons, triplet sequences of the four nucleotide bases (A, U, G, C): 61 specify the 20 standard amino acids and three are stop signals. The code is degenerate, meaning multiple codons can encode the same amino acid, primarily differing in the third position, which reduces the impact of certain mutations. This degeneracy is accommodated by the wobble hypothesis, which posits that non-standard base pairing (wobble) at the third position of the codon-anticodon interaction allows a single tRNA to recognize multiple synonymous codons. The code is nearly universal across organisms, but exceptions exist, such as in mammalian mitochondria, where AUA codes for methionine instead of isoleucine and UGA specifies tryptophan rather than acting as a stop codon.

Translation proceeds in three main stages: initiation, elongation, and termination. Initiation begins with the assembly of the ribosome on the mRNA at the start codon, AUG, which codes for methionine. In prokaryotes, the small ribosomal subunit binds to the Shine-Dalgarno sequence, a purine-rich region 4-9 nucleotides upstream of the start codon, which positions the subunit precisely through complementarity to the 3' end of 16S rRNA; the initiator tRNA, charged with formyl-methionine, then binds to the P site. In eukaryotes, the 40S subunit, along with initiation factors, binds near the 5' cap of the mRNA and scans downstream to the first AUG in a favorable context defined by the Kozak sequence (typically GCCRCCAUGG, where R is a purine), after which the initiator tRNA (Met-tRNAi) pairs with the start codon and the 60S subunit joins to form the complete 80S ribosome.

During elongation, the ribosome moves along the mRNA in the 5' to 3' direction, incorporating amino acids sequentially. Aminoacyl-tRNAs enter the A site of the ribosome, where codon-anticodon matching triggers GTP hydrolysis by elongation factor EF-Tu (in prokaryotes) or eEF1A (in eukaryotes), providing a proofreading step; accurate matches proceed to peptide bond formation catalyzed by the peptidyl transferase center, a ribozyme activity residing in the 23S rRNA (prokaryotes) or 28S rRNA (eukaryotes) of the large subunit. The nascent peptide chain is transferred from the P-site tRNA to the amino acid in the A site, forming a new peptide bond, after which elongation factor EF-G (prokaryotes) or eEF2 (eukaryotes), powered by GTP hydrolysis, translocates the tRNAs to the P and E sites, advancing the mRNA by one codon; the deacylated tRNA then leaves from the E site. This cycle repeats at an average rate of approximately 20 amino acids per second in prokaryotes under optimal conditions.

Termination occurs when a stop codon (UAA, UAG, or UGA) enters the A site, for which no corresponding tRNA exists. In prokaryotes, release factors RF1 (recognizing UAA/UAG) or RF2 (recognizing UAA/UGA) bind, mimicking tRNA structure, and trigger hydrolysis at the peptidyl transferase center of the ester bond linking the completed polypeptide to the P-site tRNA, releasing the protein; RF3, a GTPase, then facilitates dissociation of RF1/RF2. Ribosome recycling follows, mediated by the ribosome recycling factor (RRF) and EF-G, which split the ribosomal subunits and release the mRNA for reuse in new initiation events.

The fidelity of translation is maintained through multiple mechanisms, achieving an error rate of about 10^{-4} incorrect amino acids per codon, primarily via initial selection accuracy, proofreading after GTPase activation during tRNA accommodation, and accuracy during translocation. This low error rate ensures functional proteins despite the speed of the process. Antibiotics like puromycin target translation by mimicking aminoacyl-tRNA and prematurely terminating elongation through non-specific peptide bond formation in the peptidyl transferase center.
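The decoding logic in this section (64 triplets, degeneracy, start at AUG, stop at UAA/UAG/UGA) can be sketched as follows. This is an illustrative reading of the standard code only: the names `CODON_TABLE` and `translate` are assumptions, initiation is reduced to "first AUG in the sequence" with no Shine-Dalgarno or Kozak context, and the mitochondrial exceptions mentioned above are not handled.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional U, C, A, G ordering of each
# codon position; '*' marks the three stop codons.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # UNN codons
    "LLLLPPPPHHQQRRRR"  # CNN codons
    "IIIMTTTTNNKKSSRR"  # ANN codons
    "VVVVAAAADDEEGGGG"  # GNN codons
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated in this sketch
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: release the chain
        peptide.append(amino_acid)
    return "".join(peptide)


if __name__ == "__main__":
    print(translate("GGAUGUUUGGCUAAGG"))           # one-letter amino acid codes
    print(Counter(CODON_TABLE.values()).most_common(3))  # most degenerate entries
```

Counting the values of `CODON_TABLE` shows the degeneracy directly: leucine, serine and arginine each have six codons, while methionine and tryptophan have one.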

Regulation of gene expression

Transcriptional regulation

Transcriptional regulation governs the initiation and rate of RNA synthesis from DNA templates, primarily through the coordinated action of cis-regulatory elements and trans-acting factors that assemble at promoters. In both prokaryotes and eukaryotes, this process ensures precise control of gene expression in response to cellular needs, with core promoters serving as the primary sites for RNA polymerase recruitment and, in eukaryotes, distal enhancers providing additional regulatory input via long-range interactions.

Promoters contain core elements, such as the TATA box in eukaryotes or the -10 and -35 boxes in prokaryotes, which position the basal transcription machinery, while enhancers are distal DNA sequences that boost transcription when bound by specific factors. Enhancers can loop to promoters over distances of up to megabases, facilitated by the architectural proteins CTCF and cohesin, which stabilize chromatin loops that bring enhancers into proximity with target genes. This looping increases promoter activity by concentrating activators and co-factors at the transcription start site, as demonstrated in studies of developmental genes where CTCF-cohesin depletion disrupts enhancer-promoter contacts without fully abolishing transcription.

Transcription factors (TFs) are proteins that bind DNA to modulate RNA polymerase activity, divided into general TFs required for basal transcription and specific TFs that confer regulatory specificity. General TFs, like TBP (TATA-binding protein), recognize core promoter motifs and recruit RNA polymerase II (Pol II) in eukaryotes, forming the pre-initiation complex essential for all Pol II-dependent genes. Specific TFs, such as p53, bind to cognate DNA sequences in enhancers or promoters to activate or repress target genes in response to signals like DNA damage; p53's transactivation domains interact with co-activators to stimulate transcription, while its repression domains can inhibit transcription through interactions with components of the general machinery. These domains often mediate protein-protein contacts, enabling TFs to recruit or block the transcriptional apparatus.

The Mediator complex acts as a central hub, bridging specific TFs bound at enhancers and promoters to the core Pol II machinery. Composed of over 20 subunits, Mediator integrates signals from diverse TFs, stabilizes the pre-initiation complex, and promotes phosphorylation of Pol II's C-terminal domain to drive the transition to elongation. It collaborates with co-activators, including histone acetyltransferases like p300/CBP, which modify chromatin to facilitate access while Mediator coordinates the overall response.

In prokaryotes, transcriptional regulation often occurs through operons, clusters of genes transcribed as a single mRNA under coordinated control. The lac operon exemplifies inducible regulation: in the absence of lactose, the LacI repressor binds the operator, blocking RNA polymerase access; allolactose binding to LacI relieves repression, allowing transcription of genes for lactose metabolism. The trp operon demonstrates repressible control: high tryptophan levels activate the TrpR repressor to bind the operator, halting synthesis of tryptophan biosynthetic enzymes. In addition, attenuation fine-tunes expression via a leader sequence in the mRNA: when tryptophan is abundant, the ribosome translates the leader peptide without pausing, a terminator hairpin forms, and transcription stops before the structural genes; when tryptophan is scarce, the ribosome stalls at the leader's tryptophan codons, an antiterminator structure forms instead, and transcription continues through the operon. These mechanisms highlight how prokaryotes achieve rapid, resource-efficient gene control without complex chromatin.

Eukaryotic transcriptional regulation is more intricate, relying on combinatorial control where multiple TFs integrate signals at enhancers and promoters to dictate tissue-specific expression patterns. For instance, the myogenic factor MyoD, a basic helix-loop-helix TF, binds E-box motifs in muscle-specific enhancers to activate genes like those for contractile proteins, cooperating with other factors such as MEF2 to establish the myogenic program during development and regeneration. This combinatorial logic allows a limited repertoire of TFs to generate diverse outcomes, with MyoD's activity modulated by partnerships that enhance chromatin opening and Pol II recruitment in myoblasts but not in other cell types.

Recent work suggests that many TFs drive transcriptional activation through biomolecular phase separation, forming liquid-like condensates via intrinsically disordered regions (IDRs). These IDR-driven condensates concentrate TFs, coactivators, and Pol II at super-enhancers, creating hubs that amplify signaling and enhance transcription efficiency, as shown for OCT4 and GCN4, where condensate formation correlates with activation potency. Post-2010 studies, including those on coactivator condensation at enhancers, suggest that phase separation provides a physical basis for selective activation, linking TF multivalency to nuclear organization. Epigenetic marks can influence chromatin accessibility to support these interactions.
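The operon logic described above can be written out as simple Boolean rules. The sketch below is a deliberately coarse model under stated assumptions: function and key names are hypothetical, each input is reduced to a yes/no value, glucose-dependent catabolite control of the lac operon (not covered in this section) is omitted, and attenuation is treated as all-or-nothing although in cells it is graded.

```python
def lac_operon_transcribed(allolactose_present: bool) -> bool:
    """Inducible control: LacI sits on the operator unless allolactose binds it."""
    laci_on_operator = not allolactose_present
    return not laci_on_operator


def trp_operon_state(tryptophan_high: bool) -> dict:
    """Repressible control plus attenuation for the trp operon."""
    # TrpR binds the operator only when tryptophan (its co-repressor) is abundant.
    repressor_on_operator = tryptophan_high
    # The ribosome stalls at the leader's tryptophan codons only when tryptophan is scarce.
    ribosome_stalls_in_leader = not tryptophan_high
    # Stalling favours the antiterminator; unhindered translation favours the terminator.
    hairpin = "antiterminator" if ribosome_stalls_in_leader else "terminator"
    full_length_transcript = (not repressor_on_operator) and hairpin == "antiterminator"
    return {
        "repressor_on_operator": repressor_on_operator,
        "leader_hairpin": hairpin,
        "full_length_transcript": full_length_transcript,
    }


if __name__ == "__main__":
    for sugar in (False, True):
        print("allolactose:", sugar, "-> lac transcribed:", lac_operon_transcribed(sugar))
    for trp in (False, True):
        print("tryptophan high:", trp, "->", trp_operon_state(trp))
```

In this model both trp mechanisms point the same way, which reflects their biological role: repression gives coarse on/off control and attenuation adds a second, independent check on the same signal.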

Epigenetic modifications

Epigenetic modifications encompass heritable changes to DNA and chromatin that do not alter the underlying nucleotide sequence but profoundly influence gene expression patterns by modulating chromatin accessibility and transcriptional activity. These modifications include DNA methylation, histone tail alterations, chromatin remodeling, and the involvement of non-coding RNAs, all of which contribute to stable, long-term regulation of gene activity during development, cell differentiation, and response to environmental cues. Unlike sequence-specific transcriptional controls, epigenetic mechanisms often establish broad, heritable states of gene repression or activation that can persist across cell divisions.

DNA methylation primarily occurs at the fifth carbon of cytosine residues (5-methylcytosine, or 5mC) within CpG dinucleotides, which are concentrated in the promoter-proximal CpG islands of approximately 60% of genes. This modification is catalyzed by DNA methyltransferase enzymes (DNMTs), including DNMT1, which maintains methylation patterns during DNA replication, and the de novo methyltransferases DNMT3A and DNMT3B, which establish new methylation marks. Hypermethylation of CpG islands typically leads to gene silencing by inhibiting transcription factor binding and recruiting repressive proteins such as methyl-CpG-binding protein 2 (MeCP2), which in turn promote chromatin compaction. A key example is genomic imprinting, where parent-of-origin-specific methylation silences one allele of imprinted genes like IGF2, ensuring monoallelic expression critical for embryonic development.

Histone modifications involve covalent attachments to the N-terminal tails of histone proteins, altering chromatin structure and serving as docking sites for regulatory proteins. Acetylation, mediated by histone acetyltransferases (HATs) such as p300/CBP, neutralizes positive charges on lysine residues (e.g., H3K9ac, H3K27ac), promoting open chromatin (euchromatin) and facilitating transcriptional activation by recruiting co-activators. In contrast, methylation by histone methyltransferases (HMTs) can either activate or repress genes depending on the residue and degree of methylation; for instance, trimethylation of histone H3 at lysine 27 (H3K27me3), catalyzed by the EZH2 subunit of Polycomb Repressive Complex 2 (PRC2), mediates transcriptional repression by compacting chromatin and blocking activator access. Phosphorylation, often at serine or threonine residues (e.g., H3S10ph), is associated with chromosome condensation during mitosis but can also enhance transcription in interphase by disrupting repressive chromatin interactions. The histone code hypothesis posits that these modifications form combinatorial patterns that specify distinct chromatin states and recruit specific effector proteins to regulate gene expression.

Chromatin remodeling complexes dynamically alter nucleosome positioning to control DNA accessibility for transcription. The SWI/SNF family, the prototypical ATP-dependent remodelers, use the energy from ATP hydrolysis to slide, eject, or restructure nucleosomes, thereby exposing or occluding promoter regions. In yeast and mammals, SWI/SNF complexes (e.g., BAF in humans) are recruited to enhancers and promoters, where they facilitate transcription factor binding and counteract repressive modifications to activate gene expression during development and differentiation.

Non-coding RNAs play a pivotal role in epigenetic silencing by guiding chromatin-modifying complexes to target loci. The long non-coding RNA Xist exemplifies this in X-chromosome inactivation, where it coats the inactive X chromosome in female mammals, leading to recruitment of PRC2 for H3K27me3 deposition and of DNMT3A for DNA methylation, resulting in stable repression of X-linked genes to achieve dosage compensation.

DNA demethylation counteracts methylation to reactivate genes, particularly during development and cellular reprogramming. This process occurs via passive mechanisms, where failure of maintenance methylation during replication dilutes 5mC over successive divisions, or active pathways involving the ten-eleven translocation (TET) enzymes (TET1-3), which oxidize 5mC to 5-hydroxymethylcytosine (5hmC) and further to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC), facilitating base excision repair and removal of the modified base. TET proteins are important for pluripotency and lineage specification, as their loss impairs the global demethylation waves of early embryos.

Aberrant epigenetic modifications are implicated in diseases, notably cancer, where promoter hypermethylation silences tumor suppressor genes. For example, hypermethylation of the BRCA1 promoter in ovarian and breast cancers impairs homologous recombination repair and sensitizes tumors to poly(ADP-ribose) polymerase inhibitors, with studies reporting prognostic value in patient stratification. Loss-of-function mutations in TET2, leading to hypermethylation, contribute to myeloid malignancies, prompting therapeutic exploration of strategies that restore demethylation to improve patient outcomes.

Post-transcriptional regulation

Post-transcriptional regulation encompasses a suite of mechanisms that modulate gene expression after RNA transcription, primarily by influencing mRNA stability, processing, localization, and translation readiness, thereby fine-tuning protein output without altering transcription rates. These processes occur in the and , involving RNA-binding proteins (RBPs), non-coding RNAs, and enzymatic modifications that determine the fate of individual transcripts. By controlling RNA half-lives, which can vary from minutes to days depending on sequence elements and cellular context, post-transcriptional regulation enables rapid and tissue-specific responses to environmental cues, such as or developmental signals. For instance, mRNA half-lives span a wide range, with some transcripts degrading rapidly within minutes while others persist for days, reflecting a 1000-fold variation in deadenylation rates that directly impacts steady-state levels. A key aspect of mRNA stability involves cis-regulatory elements in the 3' untranslated region (UTR), such as AU-rich elements (), which promote rapid decay when bound by destabilizing factors. AREs, characterized by clusters of and uracil residues, trigger deadenylation—the progressive shortening of the poly(A) tail—and subsequent exonucleolytic degradation, often limiting the lifespan of transcripts encoding cytokines or proto-oncogenes to prevent excessive or . The poly(A)-specific (PARN) plays a central role in this process by catalyzing deadenylation, particularly for mRNAs with short poly(A) tails or those targeted for rapid turnover, thereby integrating stability control with translational efficiency. Alternative processing events, including and , further diversify post-transcriptional outcomes by generating tissue-specific mRNA isoforms from a single pre-mRNA. 
assembles variable combinations, producing isoforms with distinct stability, localization, or function; for example, in cancer, variable splicing of the yields isoforms that enhance and , with v6-containing variants overexpressed in breast and colorectal tumors to promote invasion. Similarly, alternative selects different polyadenylation sites, altering 3' UTR length and thereby modulating miRNA accessibility or RBP binding, which can shift isoform stability across tissues like liver versus brain. These mechanisms allow a single to yield dozens of functional variants, contributing to cellular diversity and disease pathology. RNA-binding proteins (RBPs) serve as versatile executors of post-transcriptional control, binding specific sequence motifs to either stabilize or destabilize mRNAs and influence their localization within the cell. The RBP HuR (human antigen R), for instance, binds in the 3' UTR to protect transcripts like those for growth factors from degradation, extending their half-life and promoting translation during proliferation or stress responses. In contrast, tristetraprolin (TTP) competes for the same , recruiting deadenylation complexes to accelerate mRNA , as seen in the rapid turnover of pro-inflammatory cytokines like TNF-α to resolve immune responses. Beyond stability, RBPs like zipcode-binding protein 1 facilitate mRNA localization to subcellular compartments, such as dendrites in neurons, ensuring localized protein synthesis for . Dysregulation of these RBPs, such as HuR overexpression in tumors, can tip the balance toward pathological expression profiles. MicroRNAs (miRNAs) represent a major class of post-transcriptional regulators, with over 60% of human protein-coding genes harboring conserved binding sites that fine-tune expression by repressing translation or promoting decay. 
miRNA biogenesis begins with transcription of primary miRNAs (pri-miRNAs), which are processed in the nucleus by Drosha and DGCR8 into precursor hairpins (pre-miRNAs), followed by cytoplasmic cleavage by Dicer to yield mature ~22-nucleotide duplexes; one strand then loads into an Argonaute (AGO) protein within the RNA-induced silencing complex (RISC) to guide targeting. miRNAs typically bind the 3' UTR of target mRNAs via partial base-pairing, with nucleotides 2-8 of the miRNA (the "seed") providing specificity; this interaction recruits deadenylation machinery or blocks ribosomal scanning, reducing protein levels by up to 50-70% for most targets. In cancer, for example, miRNAs like miR-21 stabilize oncogenic networks by downregulating tumor suppressors, highlighting their broad regulatory scope.

RNA editing, particularly adenosine-to-inosine (A-to-I) editing by ADAR enzymes, introduces sequence changes that alter mRNA stability, splicing, or coding potential post-transcriptionally. ADAR1 and ADAR2 catalyze A-to-I conversions, and inosine is read as guanosine during translation. Editing is most prominent in the nervous system, where it recodes ~2-3% of adenosines in neuronal transcripts; for example, editing of glutamate receptor subunits like GluA2 modulates calcium permeability, preventing excitotoxicity. In disease, aberrant editing is linked to disorders such as amyotrophic lateral sclerosis (ALS), where reduced ADAR2 activity destabilizes transcripts or generates toxic isoforms, underscoring the importance of editing for neuronal function. These modifications expand transcript and protein diversity without genomic changes, with implications for neurodevelopment and disease resilience.

Long non-coding RNAs (lncRNAs) also contribute to post-transcriptional regulation, often acting in cis to modulate nearby mRNA processing, stability, or localization through direct base-pairing or RBP recruitment. For instance, the lncRNA HOTAIR, transcribed from the HOXC locus, influences post-transcriptional events by interacting with protein complexes that affect splicing or decay of metastasis-associated genes, with elevated levels in cancer promoting isoform shifts that enhance invasiveness.
Post-2015 studies have expanded understanding of lncRNA cis-actions, revealing mechanisms like Xist-mediated silencing of X-chromosome genes via localized mRNA stabilization, and integrating lncRNAs into dynamic regulatory networks. These findings highlight the emerging role of lncRNAs in expression control beyond the transcriptional level.
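
The seed-pairing rule described above (nucleotides 2-8 of the miRNA pairing with the 3' UTR) can be illustrated with a short sketch. It is a minimal illustration under stated assumptions, not a target-prediction tool: it looks only for perfect Watson-Crick seed matches, ignores G:U wobble, site context, and conservation, and the function names and the toy UTR are hypothetical. The miR-21 sequence is given as it is commonly listed for hsa-miR-21-5p and should be checked against a curated database before real use.

```python
# Minimal sketch: locate perfect seed-match sites for a miRNA in a 3' UTR.
# Assumptions (not from the article): perfect pairing only, no G:U wobble,
# hypothetical function names and a made-up UTR sequence.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_match_sites(mirna: str, utr: str) -> list[int]:
    """Return 0-based UTR positions where the seed (miRNA nt 2-8) can pair."""
    seed = mirna[1:8]                   # nucleotides 2-8, 1-based
    site = reverse_complement(seed)     # sequence the seed pairs with
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


# Commonly listed hsa-miR-21-5p sequence (verify before relying on it).
MIR21 = "UAGCUUAUCAGACUGAUGUUGA"
# Hypothetical toy 3' UTR containing two seed-match sites.
TOY_UTR = "GGCAUAAGCUACCGAUAAGCUUU"

print(seed_match_sites(MIR21, TOY_UTR))
```

Real target prediction weighs many more features (site type, flanking AU content, conservation), so a seed match alone should be read as a candidate site rather than evidence of repression.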

Translational and post-translational regulation

Translational regulation controls the efficiency and specificity of protein synthesis from mature mRNA, primarily at the initiation stage, where ribosomes assemble on the mRNA. One key mechanism involves the phosphorylation of eukaryotic initiation factor 2 (eIF2), which inhibits global translation during cellular stress such as the unfolded protein response; for instance, PERK kinase phosphorylates eIF2α to reduce ternary complex formation, thereby attenuating initiation while selectively allowing translation of stress-response genes like ATF4. Internal ribosome entry sites (IRES) provide an alternative cap-independent initiation pathway, enabling translation under conditions where cap-dependent scanning is impaired, as seen in viral mRNAs and certain cellular transcripts like those encoding HIF-1α during hypoxia. Upstream open reading frames (uORFs) in the 5' untranslated region (UTR) of mRNAs often repress translation by sequestering ribosomes or preventing them from initiating at the main ORF, with polymorphic uORFs contributing to inter-individual variation in protein expression levels.

Ribosome profiling, introduced in 2009, has revolutionized the study of translation by sequencing ribosome-protected mRNA fragments, revealing translation rates, pausing events, and the impact of regulatory elements at nucleotide resolution across the transcriptome. This technique has shown, for example, that uORFs and IRES elements modulate translation efficiency in response to environmental cues, providing quantitative insights into how stress or nutrients alter protein output without changing mRNA levels. MicroRNAs (miRNAs), while primarily acting post-transcriptionally on mRNA stability, can also repress translation by interfering with initiation or elongation once ribosomes engage the mRNA.

Post-translational regulation fine-tunes protein function, localization, and degradation after synthesis, often through covalent modifications that respond to cellular signals.
Phosphorylation, catalyzed by protein kinases in response to signaling, adds phosphate groups to serine, threonine, or tyrosine residues, thereby activating or inactivating enzymes in metabolic pathways. Ubiquitination involves the attachment of ubiquitin chains by E3 ligases, marking proteins for degradation via the 26S proteasome and controlling processes like cell cycle progression; for example, the E3 ligase MDM2 ubiquitinates p53, keeping its half-life at approximately 20 minutes under normal conditions. Glycosylation attaches carbohydrate moieties in the endoplasmic reticulum and Golgi, influencing folding, stability, and trafficking, as exemplified by N-linked glycans on antibodies that enhance immune effector functions.

Protein stability is a critical aspect of post-translational control, with the ubiquitin-proteasome pathway degrading short-lived regulatory proteins; proteins like cyclins exhibit half-lives of minutes to hours, allowing rapid responses to signals. Feedback loops integrate these modifications with upstream signals, such as the mTOR pathway, which senses nutrients and growth factors and phosphorylates targets like 4E-BP1, thereby promoting cap-dependent translation initiation and balancing anabolic processes. SUMOylation conjugates the small ubiquitin-like modifier (SUMO) to lysine residues to regulate protein interactions and stress responses, with recent cryo-electron microscopy structures (post-2020) elucidating how the SUMO E1-activating enzyme conjugates SUMO under oxidative or heat stress, thereby stabilizing transcription factors like HIF-1.
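
To make the half-life figures above concrete, the following sketch converts a half-life into a first-order decay constant and computes the fraction of protein remaining after a given time. It assumes simple exponential (first-order) decay with no new synthesis, which is an idealization not stated in the text; the function names are hypothetical.

```python
import math

# Minimal sketch, assuming first-order (exponential) decay and no new synthesis.
# Function names are illustrative, not taken from any library.


def decay_constant(half_life: float) -> float:
    """First-order decay constant k = ln(2) / t_half (units: 1/time)."""
    return math.log(2) / half_life


def fraction_remaining(elapsed: float, half_life: float) -> float:
    """Fraction of the initial protein pool left after `elapsed` time units."""
    return math.exp(-decay_constant(half_life) * elapsed)


# A protein with a ~20-minute half-life, followed for one hour (three half-lives).
k = decay_constant(20.0)
print(f"k = {k:.4f} per minute")
print(f"fraction left after 60 min = {fraction_remaining(60.0, 20.0):.3f}")
```

Under this model a protein with a 20-minute half-life falls to one eighth of its starting level within an hour once synthesis stops, which is why short half-lives allow rapid changes in protein abundance.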

Measurement and quantification

mRNA analysis techniques

mRNA analysis techniques encompass a range of methods designed to detect, quantify, and profile transcripts, providing insights into gene expression levels and patterns. These approaches have evolved from low-throughput hybridization-based assays to high-throughput sequencing technologies, enabling genome-wide analysis with increasing resolution and sensitivity. Traditional methods like Northern blotting offer specificity for individual transcripts, while modern techniques such as RNA sequencing (RNA-seq) allow for comprehensive profiling, including detection of novel and low-abundance RNAs.

Northern blotting, a classical hybridization technique, separates RNA molecules by size using denaturing gel electrophoresis, transfers them to a membrane, and detects specific mRNAs via hybridization with labeled complementary probes, such as radioactive or fluorescent DNA/RNA oligos. Developed in 1977, this method confirms transcript size, abundance, and integrity while distinguishing mature mRNAs from precursors, but it is labor-intensive, requires substantial RNA input (typically 10-30 μg), and lacks high throughput, limiting its use to validation of candidate genes rather than broad profiling.

Reverse transcription quantitative polymerase chain reaction (RT-qPCR) amplifies and quantifies specific mRNA targets after converting RNA to complementary DNA (cDNA) using reverse transcriptase enzymes. Detection relies on fluorescent dyes like SYBR Green, which binds double-stranded DNA, or probe-based systems such as TaqMan, where hydrolysis of a fluorophore-quencher-labeled probe during amplification generates a signal proportional to product accumulation. Quantification uses the cycle threshold (Ct) value, the PCR cycle at which fluorescence exceeds background, allowing relative expression to be calculated via the ΔΔCt method, normalized to stable reference genes like GAPDH to account for input variations; absolute quantification can employ standard curves.
This technique offers high sensitivity (detecting femtogram levels of RNA) and specificity but is limited to predefined targets and prone to biases from reverse transcription efficiency.

Microarray hybridization platforms enable parallel analysis of thousands of transcripts by immobilizing oligonucleotide probes (short DNA sequences, 25-70 nucleotides) on a solid surface, such as a glass slide; labeled cDNA or cRNA from the sample hybridizes to complementary probes, and signal intensity reflects expression levels. Pioneered in 1995, these arrays quantify gene expression through fluorescence scanning, with data normalized for background and technical variability; Affymetrix GeneChips use high-density probe pairs (perfect match and mismatch) synthesized on quartz wafers for mismatch discrimination, while Agilent arrays employ inkjet-printed long oligos on glass for higher specificity and dynamic range. Microarrays provide cost-effective genome-wide snapshots but suffer from cross-hybridization, limited dynamic range (3-4 orders of magnitude), and inability to detect novel transcripts or isoforms.

RNA sequencing (RNA-seq) has revolutionized mRNA analysis by using next-generation sequencing platforms, such as Illumina's short-read technology, to sequence millions of cDNA fragments in parallel. The workflow involves RNA isolation, fragmentation, cDNA synthesis, adapter ligation, amplification, and sequencing, followed by read alignment to a reference genome using tools like STAR or HISAT2, and quantification of transcript abundance via metrics like fragments per kilobase of transcript per million mapped reads (FPKM) or transcripts per million (TPM), which normalize for transcript length and sequencing depth. Introduced in 2008 for mammalian transcriptomes, RNA-seq offers unbiased detection of expressed transcripts, including low-abundance and novel ones, with a dynamic range exceeding six orders of magnitude and single-base resolution for splice junctions.
Single-cell RNA-seq (scRNA-seq) variants, such as Drop-seq developed in 2015, encapsulate individual cells in nanoliter droplets with barcoded beads to profile thousands of cells simultaneously, revealing cellular heterogeneity but introducing challenges like dropout events and data sparsity.

For detecting and quantifying mRNA isoforms arising from alternative splicing, long-read sequencing technologies from Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) provide full-length transcript reads spanning entire molecules (up to 10-20 kb), bypassing the fragmentation issues of short-read RNA-seq. These methods sequence native or amplified RNA/cDNA directly, enabling accurate isoform identification and quantification without reliance on computational reconstruction, as demonstrated in comprehensive studies since 2013 for PacBio Iso-Seq and 2017 for ONT native sequencing. Long-read approaches resolve complex splicing patterns and novel isoforms in 20-50% more transcripts than short-read methods, though they currently offer lower throughput and higher error rates (~0.1% for PacBio HiFi reads and ~0.5-2% for ONT with consensus calling, as of 2025), requiring error correction and hybrid short-long read strategies for optimal accuracy.

To address the loss of spatial context, spatial transcriptomics techniques map mRNA distribution within tissue sections, preserving positional information often lost in dissociated samples. Methods like Visium, launched in 2019, array barcoded capture probes on slides to hybridize poly-A tails from permeabilized tissue slices, followed by reverse transcription, sequencing, and image alignment to generate spatially resolved expression maps at near-cellular resolution (55 μm spots covering 1-10 cells). Recent advancements like Visium HD, launched in 2024, achieve 2 μm subcellular resolution for single-cell-scale profiling.
Building on earlier array-based approaches from 2016, these methods enable profiling of thousands of genes across tissue architecture, revealing microenvironmental gradients, but current spot-based implementations average expression over each spot and give incomplete coverage of non-polyadenylated RNAs. Such data complement bulk or single-cell analyses by integrating expression with spatial context, aiding studies of disease.
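
Two of the quantification schemes described in this section, the ΔΔCt method for RT-qPCR and TPM normalization for RNA-seq, reduce to a few lines of arithmetic. The sketch below is illustrative only: the function names and input numbers are hypothetical, the ΔΔCt calculation assumes 100% amplification efficiency (a doubling per cycle) for both target and reference, and the TPM function takes raw counts and transcript lengths without any effective-length or bias correction.

```python
# Minimal sketch of two normalization schemes; names and numbers are illustrative.


def fold_change_ddct(ct_target_treated: float, ct_ref_treated: float,
                     ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression by the delta-delta-Ct method.

    Assumes perfect doubling per cycle for both target and reference genes.
    """
    delta_treated = ct_target_treated - ct_ref_treated    # normalize to reference
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


def tpm(counts: list[float], lengths_kb: list[float]) -> list[float]:
    """Transcripts per million from read counts and transcript lengths (kb)."""
    rates = [c / l for c, l in zip(counts, lengths_kb)]   # length-normalized counts
    total = sum(rates)
    return [r / total * 1e6 for r in rates]               # rescale to sum to 1e6


# Target crosses threshold 2 cycles earlier (relative to the reference) after treatment.
print(fold_change_ddct(22.0, 18.0, 24.0, 18.0))

# Three transcripts whose counts scale exactly with length get equal TPM.
print(tpm([100, 200, 300], [1.0, 2.0, 3.0]))
```

Because TPM values always sum to one million within a sample, they describe relative composition; comparing them across samples with very different transcriptome composition needs additional care.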

Protein analysis techniques

Protein analysis techniques are essential for assessing the functional outcomes of gene expression, as they enable the detection, quantification, and characterization of translated proteins, including post-translational modifications (PTMs) that influence activity and localization. Unlike mRNA-based methods, these approaches directly measure the end products of gene expression, providing insights into protein abundance, interactions, and functionality in cellular contexts. Common techniques rely on immunological detection, chromatographic separation, or enzymatic reporting to achieve high specificity and sensitivity.

Western blotting is a widely used technique for detecting specific proteins in complex samples. It involves separating proteins by size using SDS-PAGE, transferring them to a nitrocellulose or PVDF membrane, and probing with primary antibodies specific to the target protein, visualized via secondary antibodies linked to enzymes or fluorophores. Developed in 1979, it allows semi-quantitative analysis through densitometry, where band intensity correlates with protein levels, though normalization to loading controls like β-actin or GAPDH is required for accuracy. Western blotting is particularly valuable for confirming protein expression from genes of interest and detecting PTMs such as phosphorylation, with detection limits typically in the nanogram range per lane.

Enzyme-linked immunosorbent assay (ELISA) provides a sensitive method for quantifying proteins, especially secreted or soluble forms, in biological fluids. In the sandwich ELISA format, a capture antibody immobilizes the target on a microplate well, followed by detection with a second, enzyme-conjugated antibody, producing a colorimetric, fluorescent, or chemiluminescent signal proportional to protein concentration. Introduced in 1971, this technique achieves sensitivities as low as ~1 pg/mL for many analytes, making it ideal for low-abundance proteins like cytokines or hormones.
ELISAs are high-throughput and quantitative, often used to measure gene expression outputs in samples such as cell culture supernatants, with variations like competitive ELISA for small molecules.

Mass spectrometry (MS), particularly liquid chromatography-tandem mass spectrometry (LC-MS/MS), enables comprehensive proteomics by identifying and quantifying thousands of proteins simultaneously from complex mixtures. In bottom-up proteomics, proteins are digested into peptides, separated by LC, ionized, and fragmented for sequencing via MS/MS, allowing proteome-wide profiling. Quantification can be label-free, relying on spectral counting or peak intensity, or use stable isotope labeling like SILAC (stable isotope labeling by amino acids in cell culture), where cells are grown in media with heavy isotopes to compare relative abundances with high precision (ratios accurate to <10% error). Introduced in 2002, SILAC is compatible with MS for dynamic studies of gene expression changes. LC-MS/MS excels in PTM identification, such as ubiquitination or glycosylation sites, with recent advancements achieving up to ~5,000 proteins per cell in optimized single-cell workflows, as of 2025.

Flow cytometry facilitates high-throughput analysis of protein expression at the single-cell level, including intracellular targets. Cells are fixed, permeabilized, and stained with fluorescently labeled antibodies specific to the protein of interest, then passed through a laser-interrogated flow stream to measure fluorescence intensity, enabling quantification of expression levels and heterogeneity. Multiplexing with multiple antibodies (up to 40+ colors) allows simultaneous assessment of several proteins, such as transcription factors or signaling molecules, in populations like immune cells. This technique is particularly useful for monitoring expression dynamics in response to stimuli, with sensitivities down to ~1,000 molecules per cell, and supports sorting of expressing cells for downstream analysis.
Reporter assays offer real-time, non-invasive monitoring of protein expression by fusing the gene of interest to a reporter like green fluorescent protein (GFP) or luciferase. In GFP fusions, the fluorescent tag allows visualization and quantification via microscopy or flow cytometry, reflecting the spatiotemporal dynamics of the target protein. Pioneered in 1994, GFP reporters are genetically encoded and require no substrates, enabling live-cell imaging of expression in organisms from bacteria to mammals. Luciferase assays, based on firefly or Renilla enzymes, produce bioluminescent signals upon substrate addition, offering high sensitivity (~10^2-10^3 molecules) for transient or stable transfections, and are commonly used to quantify promoter activity driving gene expression.

Recent advances include proximity labeling techniques like BioID, which uses a promiscuous biotin ligase fused to a bait protein to biotinylate nearby proteins in living cells, enabling identification of interactomes and transient associations via MS. Developed in 2012, BioID captures proteins within ~10 nm, complementing traditional co-immunoprecipitation by labeling under physiological conditions. Additionally, AI-enhanced MS has emerged post-2023, with machine learning models reported to improve peptide identification accuracy by >20% through spectral prediction, accelerating analysis in large-scale gene expression studies. These innovations enhance the resolution of protein-level insights into gene regulation.

Correlation and integration methods

Studies of gene expression have consistently shown that mRNA abundance correlates only moderately with protein levels, with Spearman coefficients typically ranging from 0.4 to 0.6 across large-scale datasets. This discrepancy arises primarily from variations in translation efficiency, influenced by factors such as codon usage and tRNA availability, as well as differences in mRNA and protein degradation rates. For instance, mRNAs with optimal codons are translated more efficiently, leading to higher protein output relative to transcript levels, while unstable proteins degrade rapidly, decoupling steady-state protein abundance from mRNA levels.

To address these discrepancies, multi-omics integration methods combine transcriptomic and proteomic data for a more comprehensive view of gene expression. Ribosome profiling (Ribo-seq), which maps ribosome-protected mRNA fragments to quantify translation, is often paired with RNA-seq to estimate translation efficiency by calculating ribosome density on transcripts. Similarly, integrating Ribo-seq with mass spectrometry-based proteomics enables the identification of translated open reading frames and improves proteome annotation through proteogenomics approaches. These methods reveal that translational regulation, such as alternative translation initiation, contributes significantly to the observed mRNA-protein mismatches.

Mathematical modeling provides a framework for understanding these dynamics at steady state, where protein concentration [P] is determined by the balance of synthesis and degradation rates:

$$[P] = \frac{k_s \cdot [\mathrm{mRNA}]}{k_d}$$

Here, k_s represents the synthesis rate constant (translation rate per mRNA molecule), and k_d is the protein degradation rate constant. This equation highlights how variations in k_s and k_d can buffer or amplify mRNA fluctuations, with empirical studies showing that degradation half-lives span orders of magnitude across proteins.
At the single-cell level, correlations between mRNA and protein levels are even weaker due to stochastic noise in gene expression, often exacerbated by transcriptional bursting and variable translation. Techniques like single-cell RNA sequencing (scRNA-seq) integrated with mass cytometry (CyTOF) allow measurement of transcriptomes and dozens of protein markers, revealing cell-to-cell heterogeneity where noise from low molecule counts dominates. For example, such data show that protein levels in immune cells correlate modestly (Spearman ~0.3-0.5) with scRNA-seq-derived mRNA estimates, underscoring the role of intrinsic stochasticity in expression variability.

Buffering mechanisms further explain the imperfect correlation by stabilizing protein levels against perturbations in mRNA abundance. MicroRNAs (miRNAs) play a key role through regulatory loops in which they bind target mRNAs to repress translation and promote decay, thereby reducing noise and constraining expression variance. This miRNA-mediated buffering is particularly evident in developmental contexts, where it maintains robust protein levels despite fluctuating transcript levels.

Emerging approaches aim to predict these correlations by modeling regulatory impacts on expression. For instance, AlphaFold3 (2024) enables accurate prediction of protein-nucleic acid interactions, which can inform how structural features influence translation efficiency and mRNA stability. Such tools, combined with machine learning on multi-omics data, hold promise for imputing missing protein levels from transcriptomic profiles, though current models remain limited by training data sparsity.
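
The steady-state relation [P] = k_s·[mRNA]/k_d follows from setting d[P]/dt = k_s·[mRNA] − k_d·[P] to zero. The sketch below checks this numerically with a simple forward-Euler integration and also shows the ratio commonly used as a translation-efficiency estimate (ribosome footprint density divided by mRNA abundance). It is a minimal illustration: the rate values are arbitrary, the function names are hypothetical, and the model assumes constant mRNA and first-order protein degradation.

```python
# Minimal sketch, assuming constant mRNA and first-order protein degradation.
# Parameter values and function names are illustrative only.


def steady_state_protein(k_s: float, mrna: float, k_d: float) -> float:
    """Analytical steady state of d[P]/dt = k_s*[mRNA] - k_d*[P]."""
    return k_s * mrna / k_d


def simulate_protein(k_s: float, mrna: float, k_d: float,
                     t_end: float, dt: float = 0.01, p0: float = 0.0) -> float:
    """Forward-Euler integration of the same equation up to time t_end."""
    p = p0
    for _ in range(int(round(t_end / dt))):
        p += dt * (k_s * mrna - k_d * p)   # synthesis minus degradation
    return p


def translation_efficiency(ribo_density: float, mrna_abundance: float) -> float:
    """Ribosome footprint density per unit mRNA (e.g. Ribo-seq TPM / RNA-seq TPM)."""
    return ribo_density / mrna_abundance


analytic = steady_state_protein(k_s=10.0, mrna=5.0, k_d=0.5)
numeric = simulate_protein(k_s=10.0, mrna=5.0, k_d=0.5, t_end=50.0)
print(analytic, round(numeric, 6))
```

Doubling k_d halves the steady-state protein level for the same mRNA input, which is one way two genes with identical transcript abundance can differ in protein abundance.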

Applications and systems

Expression systems in biotechnology

Expression systems in biotechnology are engineered platforms designed to produce specific proteins or molecules at high levels in host organisms or cell-free environments, enabling applications that include therapeutics. These systems use promoters, regulatory elements, and vectors to control gene expression, often mimicking or enhancing natural mechanisms for precise temporal and spatial regulation. By optimizing codon usage, chaperone co-expression, and culture conditions, yields can reach grams per liter, facilitating scalable manufacturing.

Prokaryotic expression systems, particularly in Escherichia coli, are widely used due to their rapid growth, low cost, and ease of genetic manipulation. The T7 RNA polymerase-based system, implemented in pET vectors, drives high-level expression from the strong T7 promoter upon induction with IPTG, achieving protein yields up to 50% of total cellular protein in optimized strains like BL21(DE3). Complementing this, the IPTG-inducible lac promoter allows tunable expression: the allolactose analog IPTG relieves LacI repression, enabling fine control for toxic proteins.

Eukaryotic systems provide post-translational modifications essential for mammalian proteins. In yeast like Saccharomyces cerevisiae, the GAL1 promoter is induced by galactose and repressed by glucose, supporting secreted protein production at levels of 1-10 g/L in engineered strains. For mammalian expression, the cytomegalovirus (CMV) promoter in human embryonic kidney (HEK293) cells drives constitutive high-level transcription, often yielding 100-500 mg/L of glycosylated antibodies via transient transfection.

Inducible systems offer reversible control to mitigate toxicity. The Tet-on and Tet-off systems use doxycycline to modulate a tetracycline transactivator (tTA or rtTA), enabling activation or repression of target genes with minimal leakiness in mammalian cells.
Light-inducible systems, incorporating light-oxygen-voltage (LOV) domains from proteins like VVD, allow optogenetic control of expression through blue light-triggered dimerization, achieving fold-induction ratios over 100 in bacteria and eukaryotes.

Viral vectors facilitate gene delivery in hard-to-transfect cells. Adeno-associated virus (AAV) vectors provide long-term episomal expression with low immunogenicity and are commonly used for gene therapy at doses delivering 10^12-10^14 genomes per kg. Lentiviral vectors integrate transgenes for stable expression in dividing and non-dividing cells, supporting titers up to 10^8 TU/mL for applications like CAR-T cell engineering. For activation without genomic integration of the target, CRISPR-based systems fuse catalytically dead Cas9 (dCas9) to VP64 activators, upregulating endogenous genes by 10-100 fold upon targeting.

Synthetic biology constructs enable complex circuits. Toggle switches, bistable networks built from two mutually repressing genes (for example, lacI paired with a second repressor), maintain two stable expression states switchable by inducers, with response times under 1 hour in E. coli. Oscillators like the repressilator, a ring of three repressor genes (lacI, tetR, cI), generate rhythmic expression with periods of 2-3 hours, demonstrating that designed circuits can produce predictable dynamics.

These systems underpin recombinant protein production, such as human insulin expressed in E. coli using the lac promoter, which has been on the market since 1982 and transformed diabetes treatment. Cell-free systems, like transcription-translation (TXTL) extracts from E. coli, support protein synthesis without cellular constraints, with post-2018 optimizations incorporating energy regeneration and chaperones boosting yields to 1-2 mg/mL for model proteins.
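
The bistability of a toggle switch can be illustrated with a small simulation. The sketch below uses a commonly cited dimensionless two-repressor model, du/dt = α/(1 + v^n) − u and dv/dt = α/(1 + u^n) − v, which is an assumption brought in for illustration rather than something specified in the text above; the parameter values (α = 10, n = 2), the Euler step, and the function name are likewise illustrative.

```python
# Minimal sketch of a mutual-repression toggle switch (dimensionless model).
# Model form, parameters, and names are illustrative assumptions.


def simulate_toggle(u0: float, v0: float, alpha: float = 10.0, n: float = 2.0,
                    t_end: float = 50.0, dt: float = 0.01) -> tuple[float, float]:
    """Forward-Euler integration of two mutually repressing genes u and v."""
    u, v = u0, v0
    for _ in range(int(round(t_end / dt))):
        du = alpha / (1.0 + v ** n) - u   # v represses u; u decays at unit rate
        dv = alpha / (1.0 + u ** n) - v   # u represses v; v decays at unit rate
        u, v = u + dt * du, v + dt * dv
    return u, v


# Same circuit, two different starting points, two different stable states.
high_u = simulate_toggle(u0=5.0, v0=1.0)
high_v = simulate_toggle(u0=1.0, v0=5.0)
print("start u>v ->", tuple(round(x, 2) for x in high_u))
print("start v>u ->", tuple(round(x, 2) for x in high_v))
```

Which state the circuit settles into depends only on its starting point, which is the sense in which such a switch stores one bit of memory; a transient inducer pulse that lowers one repressor's activity would flip it.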

Gene expression in disease and development

Gene expression plays a pivotal role in embryonic development, where spatial and temporal gradients of regulatory proteins establish body axes and segment patterns. In Drosophila, the Bicoid protein forms an anterior-to-posterior concentration gradient that acts as a morphogen, activating target genes such as hunchback in a threshold-dependent manner to specify anterior structures like the head and thorax. This gradient is established by localized maternal mRNA deposition at the anterior pole, followed by translation and diffusion in the syncytial embryo. Similarly, vertebrate somitogenesis involves a segmentation clock, an oscillatory genetic network driven by cyclic expression of genes like hairy and enhancer of split (Hes) family members, which regulates the periodic formation of somites along the body axis. These oscillations, with periods of about 2 hours in mice, arise from negative feedback loops involving Notch, Wnt, and FGF signaling pathways, ensuring synchronized tissue segmentation.

Dysregulated gene expression underlies many diseases, particularly cancer, where aberrant activation of oncogenes and silencing of tumor suppressors drive hallmarks such as sustained proliferative signaling. Amplification of the MYC oncogene, observed in approximately 28% of tumors across various cancers, leads to overexpression that enhances transcription of genes promoting cell proliferation. Conversely, epigenetic silencing of tumor suppressor genes like p16INK4a and MLH1 through promoter hypermethylation occurs frequently in colorectal and other cancers, inactivating pathways that normally halt uncontrolled proliferation. In neurological contexts, the transcription factor CREB mediates activity-dependent gene expression critical for learning and memory; phosphorylation of CREB by kinases in response to neuronal stimulation activates downstream targets such as BDNF and c-fos, strengthening synaptic connections.
Beyond cancer and neurological function, altered gene expression profiles characterize other diseases, including autoimmune disorders and infectious conditions. In systemic lupus erythematosus (SLE), an interferon (IFN) signature, marked by upregulated expression of over 100 IFN-stimulated genes in peripheral blood mononuclear cells, is observed in approximately half of patients and correlates with disease activity and autoantibody production. Single-cell sequencing studies of COVID-19 patients have revealed severity-specific expression changes, such as heightened IFN responses in severe cases, highlighting heterogeneous immune cell states that persist post-infection.

Therapeutic strategies targeting these dysregulations include histone deacetylase (HDAC) inhibitors, which reverse epigenetic silencing of tumor suppressors in certain cancers by promoting histone acetylation and gene reactivation. RNA interference (RNAi) therapeutics, exemplified by patisiran, an siRNA that silences transthyretin (TTR) gene expression, have shown efficacy in hereditary ATTR amyloidosis by reducing toxic protein aggregates and improving neuropathy symptoms in phase III trials.

Gene expression divergence also contributes to evolutionary processes, particularly speciation, where changes in regulatory elements lead to species-specific expression patterns without altering protein-coding sequences. In closely related species like Darwin's finches, cis-regulatory mutations drive differential expression of genes such as BMP4 in beak development, facilitating adaptive morphological divergence. Such expression shifts, often also involving trans-regulatory factors, accumulate over time and can result in hybrid incompatibilities, underscoring the role of regulatory evolution in generating biodiversity.

Gene regulatory networks

Gene regulatory networks (GRNs) consist of interconnected genes and their regulatory elements that collectively control the timing, location, and level of gene expression in response to internal and external signals. These networks integrate transcriptional regulators, such as transcription factors, with cis-regulatory modules to orchestrate complex cellular behaviors. GRNs exhibit modular architectures that enable robustness and evolvability, allowing cells to process information in a manner akin to computational circuits.

Common network motifs, or recurring subgraphs, underpin the functional logic of GRNs. Feed-forward loops (FFLs), for instance, involve a regulator that controls both a direct target and an intermediary regulator of the same target, enabling either rapid signal propagation or a delayed response. In E. coli, FFLs are overrepresented and function in noise filtering and response acceleration. Feedback loops provide stability or amplification; negative feedback dampens fluctuations to maintain steady states, while positive feedback reinforces commitments, such as in cell fate decisions. The lac operon exemplifies a classic motif, where the LacI repressor forms a feedback loop with lactose-induced activation, ensuring efficient lactose metabolism in response to environmental sugars.

Reconstructing GRNs from high-throughput data, such as gene expression profiles, reveals these interactions. The ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) algorithm infers direct regulatory links by estimating mutual information between genes and pruning indirect connections via the data processing inequality, scaling to mammalian genome-wide networks. Boolean networks model GRN dynamics by assigning binary states (on/off) to genes and defining logical rules for activation, capturing network dynamics and attractors that represent stable cell states. These discrete models have been applied to simulate developmental transitions and predict outcomes.

GRNs often display scale-free topologies, where a few highly connected genes regulate many targets, following a power-law degree distribution.
The tumor suppressor TP53 exemplifies a hub, integrating stress signals to activate hundreds of downstream genes involved in apoptosis and cell-cycle arrest. Scale-free networks are robust to random node failures, since most nodes have few connections, and this property enhances resilience to perturbations under genetic or environmental stress.

In development, GRNs coordinate spatial and temporal gene expression patterns. The sea urchin (Strongylocentrotus purpuratus) endomesoderm GRN, modeled by Davidson and colleagues in 2002, illustrates a hierarchical architecture in which upstream inputs like β-catenin activate territorial transcription factors, leading to progressive specification of endoderm and mesoderm lineages through repressive and activating interactions. This model highlights how GRN kernels (small subcircuits) drive irreversible cell fate decisions.

In disease, dysregulated GRNs contribute to pathology, particularly in cancer. The Wnt/β-catenin pathway forms a core GRN module in colorectal cancer, where APC mutations stabilize β-catenin, driving aberrant activation of MYC and CCND1 to promote proliferation. Post-2022 single-cell atlas projects, such as the Human Cell Atlas, have mapped heterogeneous cell states but reveal incomplete GRN coverage due to challenges in inferring context-specific regulation from sparse single-cell data.

GRN dynamics often produce oscillatory expression patterns essential for periodic processes. The mammalian circadian clock GRN features interlocking feedback loops: CLOCK-BMAL1 activates PER and CRY transcription, whose protein products form repressive complexes that inhibit CLOCK-BMAL1, generating ~24-hour rhythms in gene expression across tissues. This oscillatory architecture keeps physiology synchronized with the day-night cycle, with disruptions linked to metabolic disorders.
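
The Boolean-network idea mentioned in this section (binary gene states, logical update rules, attractors as stable states) can be shown on a toy example. The sketch below enumerates every start state of a small network under synchronous updating and collects the attractors it reaches. The two-gene mutual-repression rules and the function names are hypothetical and chosen only for illustration; they are not taken from any published model.

```python
from itertools import product

# Minimal sketch: exhaustive attractor search in a small synchronous Boolean network.
# The toy rules below (two genes repressing each other) are illustrative only.


def step(state, rules):
    """Apply every gene's rule to the current state at once (synchronous update)."""
    return tuple(int(rule(state)) for rule in rules)


def find_attractors(rules):
    """Return the set of attractors, each as a tuple of states in cycle order."""
    attractors = set()
    for start in product((0, 1), repeat=len(rules)):
        seen = []
        state = start
        while state not in seen:          # follow the trajectory until it repeats
            seen.append(state)
            state = step(state, rules)
        cycle = seen[seen.index(state):]  # the repeating part is the attractor
        k = cycle.index(min(cycle))       # rotate to a canonical starting state
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return attractors


# Gene 0 is ON when gene 1 is OFF, and vice versa (mutual repression).
toggle_rules = [lambda s: not s[1], lambda s: not s[0]]

for attractor in sorted(find_attractors(toggle_rules)):
    print(attractor)
```

The two fixed points correspond to the two stable expression states of a mutual-repression circuit. The additional two-state cycle between (0, 0) and (1, 1) is a known artifact of updating all genes simultaneously and typically disappears under asynchronous update schemes.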

Techniques and resources

Experimental tools

Experimental tools for studying gene expression encompass a range of techniques designed to visualize, perturb, and analyze regulatory mechanisms at the molecular level. These methods enable researchers to monitor promoter activity, disrupt gene function, map protein-DNA interactions, and observe dynamic processes in living cells, providing insights into transcriptional control and regulatory networks. Unlike quantification-focused approaches, these tools emphasize functional interrogation and spatial-temporal visualization.

Reporter genes serve as versatile tools for visualizing and quantifying gene expression patterns and promoter activity. The lacZ gene, encoding β-galactosidase from Escherichia coli, is a classic reporter that produces a blue precipitate upon reaction with the substrate X-gal, allowing histological detection of expression in transgenic models such as mice. This system has been widely adopted for mapping developmental expression profiles due to its stability and ease of detection. Green fluorescent protein (GFP), derived from the jellyfish Aequorea victoria, enables non-invasive, real-time visualization of gene expression through its intrinsic fluorescence without requiring substrates or cofactors. Introduced as a marker in 1994, GFP and its variants have revolutionized live-cell imaging by facilitating the tracking of protein localization and expression dynamics in organisms from bacteria to mammals. Dual-luciferase assays enhance reporter precision by co-transfecting firefly luciferase (driven by the promoter of interest) with Renilla luciferase (as an internal control for transfection efficiency), allowing normalized measurement of transcriptional activity through sequential detection. This method, developed in the mid-1990s, minimizes variability from cell number or viability, making it well suited to the analysis of regulatory elements.

Perturbation techniques are essential for dissecting causal relationships in gene expression by selectively inhibiting or knocking down target genes.
RNA interference (RNAi) utilizes small interfering RNAs (siRNAs) or short hairpin RNAs (shRNAs) to trigger sequence-specific mRNA degradation, effectively silencing gene expression. The discovery of RNAi in 1998 demonstrated its potency in , and shRNA expression vectors extended this to stable knockdown in mammalian cells by mimicking pri-miRNA processing. interference () employs a catalytically dead (dCas9) protein guided by single-guide RNAs (sgRNAs) to sterically block transcription initiation or elongation without altering the . Introduced in 2013, achieves tunable repression levels and multiplexed targeting, offering reversibility and minimal off-target effects compared to traditional knockouts. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a key method for mapping (TF) binding sites and epigenetic modifications genome-wide. The technique involves crosslinking proteins to , immunoprecipitating with antibodies specific to the TF or histone mark, and sequencing the enriched fragments to identify binding peaks. Pioneered in 2007, ChIP-seq provides high-resolution profiles of regulatory landscapes, revealing how TFs like or histone acetyltransferases influence expression. Peak calling algorithms then distinguish significant enrichment from background, enabling the annotation of enhancers and promoters associated with active transcription. The (EMSA) detects direct protein-DNA interactions by observing the retarded migration of labeled DNA probes bound to nuclear extracts during native . Developed in 1981, EMSA confirms TF binding to specific motifs, such as to κB sites, and can be supershifted with antibodies for specificity. This low-throughput assay remains valuable for validating interactions identified by high-throughput methods like ChIP-seq. Live-cell imaging techniques capture the spatiotemporal dynamics of gene expression. 
(FRET) uses pairs of fluorescent proteins, such as CFP and YFP fused to interacting partners, to report conformational changes or complex formation upon energy transfer from donor to acceptor emission. In gene expression studies, FRET-based reporters monitor promoter activation or TF dimerization in real time, providing kinetic data on regulatory events. extends this by employing light-sensitive proteins like or cryptochromes to control gene expression with high precision. Post-2015 applications in neural systems have utilized optogenetic tools to modulate transcription in neurons, such as light-inducible for doxycycline-independent control, aiding studies of circuit-specific expression in development and . Safety and ethical considerations are paramount when employing these tools, particularly with recombinant expression systems. Biosafety levels (BSL) classify laboratory practices based on risk: BSL-1 for well-characterized agents like non-pathogenic E. coli used in reporter assays, escalating to BSL-2 for moderate-risk materials involving viral vectors or human-derived cells in RNAi/CRISPR experiments. The CDC's Biosafety in Microbiological and Biomedical Laboratories guidelines mandate containment, , and decontamination protocols to prevent accidental release, while NIH guidelines for research ensure ethical oversight for gene perturbation studies.
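The normalization step in the dual-luciferase assay described in this section reduces to simple arithmetic, sketched below. The function names and the luminescence readings are hypothetical, and the sketch assumes readings have already been background-subtracted; it is an illustration of the ratio calculation, not a vendor protocol.

```python
from statistics import mean
from typing import List, Sequence, Tuple

# One well: (firefly_signal, renilla_signal), in arbitrary luminescence units.
Well = Tuple[float, float]


def normalized_ratios(wells: Sequence[Well]) -> List[float]:
    """Firefly/Renilla ratio per well, correcting for transfection efficiency."""
    ratios = []
    for firefly, renilla in wells:
        if renilla <= 0:
            raise ValueError("Renilla signal must be positive to normalize")
        ratios.append(firefly / renilla)
    return ratios


def fold_induction(test_wells: Sequence[Well],
                   control_wells: Sequence[Well]) -> float:
    """Mean normalized ratio of the test condition relative to the control."""
    return mean(normalized_ratios(test_wells)) / mean(normalized_ratios(control_wells))


if __name__ == "__main__":
    # Made-up replicate wells; raw signals differ because transfection
    # efficiency differs from well to well.
    control = [(1000.0, 500.0), (1200.0, 600.0), (900.0, 450.0)]
    treated = [(4000.0, 500.0), (3000.0, 375.0), (4400.0, 550.0)]
    print("control ratios:", normalized_ratios(control))
    print("treated ratios:", normalized_ratios(treated))
    print("fold induction:", fold_induction(treated, control))
```

In this made-up data set the raw firefly signals vary between replicates, but the firefly/Renilla ratios agree within each condition, and the treated promoter shows a four-fold induction over the control.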

Computational and database resources

Several major public databases serve as central repositories for gene expression data, enabling researchers to access, share, and analyze large-scale datasets. The Gene Expression Omnibus (GEO), maintained by the National Center for Biotechnology Information (NCBI), is a primary archive for functional genomics data, including microarray and high-throughput sequencing experiments on mRNA, non-coding RNA, and protein expression across species. As of 2025, GEO hosts over 8 million samples from more than 260,000 studies, facilitating meta-analyses and validation of expression patterns. The Encyclopedia of DNA Elements (ENCODE) project provides comprehensive data on gene expression and epigenetic regulation in human cells, integrating RNA-seq, ChIP-seq, and other assays to map regulatory elements and their impact on transcription. ENCODE's datasets, spanning thousands of experiments, emphasize functional annotation of the non-coding genome and are accessible via its data portal for querying expression in specific cell types or conditions. Complementing these, the Genotype-Tissue Expression (GTEx) project offers tissue-specific gene expression profiles from 948 postmortem donors across 54 human tissues, linking genetic variants to expression through expression quantitative trait loci (eQTLs) to study regulatory mechanisms. GTEx version 8 data (released 2019) support investigations into heritability and disease-associated expression variation.

Analysis tools and algorithms are essential for processing and interpreting gene expression data from these repositories. DESeq2, an R-based package, is widely used for differential expression analysis of count data from RNA-seq experiments; it fits a negative binomial model to estimate variance and detect significant changes between conditions while controlling the false discovery rate. Introduced in 2014, it has been cited over 20,000 times and remains a standard for differential expression analysis of bulk RNA-seq data, with some use on single-cell data. For exploring functional relationships, the STRING database integrates protein-protein interaction (PPI) networks with gene expression data, combining experimental, computational, and literature-derived evidence to predict co-expression and pathway involvement. STRING's latest version (12.5, 2025) covers over 12,000 organisms and includes tools for network visualization and enrichment analysis, helping to place expression changes in the context of biological pathways.

Prediction models based on machine learning have advanced the forecasting and annotation of gene expression patterns. DeepSEA, a deep learning model, predicts transcription factor (TF) binding sites and chromatin accessibility from DNA sequence, enabling interpretation of the effects of non-coding variants on expression regulation. Developed in 2015, DeepSEA was trained on ENCODE data and achieves high accuracy in variant effect scoring, with applications in prioritizing disease-associated mutations. For expression forecasting, models based on graph neural networks or time-series analysis predict dynamic changes in gene expression under changing conditions, such as developmental trajectories or drug responses, using existing data from public repositories such as GTEx. In single-cell contexts, scGPT (2023), a transformer-based model pretrained on millions of cells, generates and analyzes expression profiles for tasks like cell type annotation and simulation.

Integration platforms facilitate the visualization and cross-referencing of gene expression data with genomic annotations. The UCSC Genome Browser provides an interactive interface for viewing expression tracks alongside reference genomes, allowing users to overlay data from ENCODE or GTEx with epigenetic marks and variants. It supports custom uploads and API access, making it a common starting point for data exploration and hypothesis generation. Data standards like MIAME (Minimum Information About a Microarray Experiment), extended to sequencing data through the related MINSEQE guidelines, ensure that deposited datasets include sufficient metadata for reproducibility, such as experimental design and processing details. However, challenges in reproducibility persist, including batch effects, incomplete metadata, and variability in analysis pipelines, which can lead to inconsistent findings across studies despite standardized submissions to GEO.

Regarding accessibility, most resources, such as DESeq2, are open-source or freely available, promoting widespread adoption, whereas proprietary platforms such as Illumina BaseSpace offer integrated sequencing workflows with commercial hardware compatibility, though they may limit customization. This contrast highlights ongoing efforts to balance innovation with equitable access in gene expression research.
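Controlling the false discovery rate, as mentioned above for differential expression analysis, is commonly done with the Benjamini-Hochberg procedure. The sketch below is a plain-Python illustration of that procedure under the assumption of one p-value per gene; the function name and the example p-values are hypothetical, and this is not the code path of any particular package.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values are
    # monotone: each is the minimum of p * m / rank over equal or higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up p-values for four hypothetical genes.
    p_values = [0.01, 0.04, 0.03, 0.20]
    adjusted = benjamini_hochberg(p_values)
    for p, q in zip(p_values, adjusted):
        flag = "significant" if q < 0.05 else "not significant"
        print(f"p={p:.2f}  adjusted={q:.4f}  {flag}")
```

With these four made-up p-values, only the smallest (0.01, adjusted to 0.04) stays below a 5% false discovery rate threshold, even though three of the raw p-values are below 0.05. Real analyses involve thousands of genes, where this correction matters far more.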

References

  1. [1]
    Gene Expression | Learn Science at Scitable - Nature
    Gene expression involves genes encoding proteins, with transcription and translation as key steps. Only a fraction of genes are expressed at once, and ...
  2. [2]
    Studying Gene Expression and Function - Molecular Biology ... - NCBI
    By examining the expression of so many genes simultaneously, we can now begin to identify and study the gene expression patterns that underlie cellular ...
  3. [3]
    Gene Expression - an overview | ScienceDirect Topics
    Gene expression is the process where DNA information is converted into functional proteins or RNA, involving mRNA transcription and non-coding RNA production.
  4. [4]
    Gene expression and regulation - Autoimmunity - NCBI Bookshelf
    Gene expression is determined by reading gene information during the transcription and translation processes. The product of transcription is mRNA and the ...<|control11|><|separator|>
  5. [5]
    Regulation of Transcription and Gene Expression in Eukaryotes
    Gene expression is controlled on two levels. First, transcription is controlled by limiting the amount of mRNA that is produced from a particular gene.
  6. [6]
    Mechanisms and Measurement of Changes in Gene Expression
    Gene expression (GE) is the synthesis of a functional gene product using the information provided by deoxyribonucleic acid (DNA; Perdew, Vanden-Heuvel, & Peters ...
  7. [7]
    Central Dogma of Molecular Biology - Nature
    Aug 8, 1970 · The central dogma of molecular biology deals with the detailed residue-by-residue transfer of sequential information.
  8. [8]
    Non-coding RNAs as regulators of gene expression and epigenetics
    This has led to the identification and isolation of novel classes of non-coding RNAs (ncRNAs) that influence gene expression by a variety of mechanisms.
  9. [9]
    An Overview of Gene Control - Molecular Biology of the Cell - NCBI
    Cell differentiation generally depends on changes in gene expression rather than on any changes in the nucleotide sequence of the cell's genome.The Different Cell Types of a... · Different Cell Types... · Cell Can Change the...
  10. [10]
    Role of Hox genes in stem cell differentiation - PMC - PubMed Central
    Hox genes are involved in embryonic development as well as in repair mechanisms in the adult body, thus regulating cell fate.<|control11|><|separator|>
  11. [11]
    "Genetic Control of Biochemical Reactions in Neurospora" (1941 ...
    Jun 11, 2014 · Beadle and Tatum experimented on Neurospora, a type of bread mold, and they concluded that mutations to genes affected the enzymes of organisms, ...
  12. [12]
    A Structure for Deoxyribose Nucleic Acid - Nature
    The determination in 1953 of the structure of deoxyribonucleic acid (DNA), with its two entwined helices and paired organic bases, was a tour de force in ...
  13. [13]
    Potent and specific genetic interference by double-stranded RNA in ...
    Feb 19, 1998 · We found that double-stranded RNA was substantially more effective at producing interference than was either strand individually.
  14. [14]
    A Programmable Dual-RNA–Guided DNA Endonuclease ... - Science
    Jun 28, 2012 · Our study reveals a family of endonucleases that use dual-RNAs for site-specific DNA cleavage and highlights the potential to exploit the system for RNA- ...
  15. [15]
    DNA Transcription | Learn Science at Scitable - Nature
    Learn more about the DNA transcription process, where DNA is converted to RNA, a more portable set of instructions for the cell.
  16. [16]
    Mechanism of Bacterial Transcription Initiation: RNA Polymerase - NIH
    Sigma factors recognize specific promoter DNA sequences, interact with transcription activators, participate in promoter DNA opening, and influence the early ...
  17. [17]
    RNA Transcription by RNA Polymerase: Prokaryotes vs Eukaryotes
    Each of these sigma factors recognizes the promoters of the genes in its group, not those "seen" by other sigma factors. This simple example illustrates how ...
  18. [18]
    Bacterial Sigma Factors and Anti-Sigma Factors: Structure, Function ...
    Jun 26, 2015 · Sigma factors are multi-domain subunits of bacterial RNA polymerase (RNAP) that play critical roles in transcription initiation.
  19. [19]
    Eukaryotic RNA Polymerases and General Transcription Factors
    The first step in formation of a transcription complex is the binding of a general transcription factor called TFIID to the TATA box (TF indicates transcription ...
  20. [20]
    RNA polymerase II transcription initiation: A structural view - PNAS
    TBP (and TFIID) binding to the TATA box is an intrinsically slow step, yielding a long-lived protein–DNA complex.
  21. [21]
    The general transcription factors of RNA polymerase II
    TBP is bound to the minor groove of the TATA sequence and induces a sharp bend in the DNA. Coordinates provided by D. Nikolov and S. Burley; figure by K. Das ...
  22. [22]
    Prokaryotic Transcription | Biology for Majors I - Lumen Learning
    Elongation synthesizes mRNA in the 5′ to 3′ direction at a rate of 40 nucleotides per second. Termination liberates the mRNA and occurs either by rho protein ...
  23. [23]
    Transcription elongation mechanisms of RNA polymerases I, II ... - NIH
    The distinct transcription initiation and termination mechanisms of eukaryotic RNA polymerases I, II, and III (Pols I, II, and III) have long been appreciated.
  24. [24]
    RNA Pol II transcription speed - Various - BNID 111604
    Typical RNA Pol II transcription speed is 18-42 bases/sec, with some estimates at 100 bases/sec.
  25. [25]
    Universally high transcript error rates in bacteria - eLife
    May 29, 2020 · Fundamental details of the rate and molecular spectrum of transcript errors were revealed in four bacterial species, providing novel ...
  26. [26]
    A narrow range of transcript-error rates across the Tree of Life
    Jul 11, 2025 · One notable observation is that there is a narrow range of transcript-error rates, between 10−6 and 10−5 errors per ribonucleotide (rNTP) for ...
  27. [27]
    15.5: Prokaryotic Transcription - Elongation and Termination in ...
    Nov 22, 2024 · Elongation in Prokaryotes. The transcription elongation phase begins with the release of the σ subunit from the polymerase.
  28. [28]
    Mechanism of alternative splicing and its regulation - PMC
    Alternative splicing increases gene expression complexity by skipping exons, creating various mature mRNA forms, and is regulated by cis and trans factors.
  29. [29]
    mRNA capping: biological functions and applications
    All eukaryotic mRNA contains a cap structure - an N7-methylated guanosine linked to the first nucleotide of the RNA via a reverse 5′ to 5′ triphosphate linkage ...
  30. [30]
    Roles of mRNA poly(A) tails in regulation of eukaryotic gene ...
    Mar 13, 2023 · Poly(A) tails are added co-transcriptionally in the nucleus and are required for the export of mature mRNAs into the cytoplasm (Figure 1). ...Ii. Poly(a) Tails In Mrna... · Iii. Poly(a) Tails In Mrna... · A. Deadenylation Enzymes
  31. [31]
    Alternative splicing: Human disease and quantitative analysis ... - NIH
    Dec 24, 2020 · Up to 95% of human multi-exon genes undergo alternative splicing to encode proteins with different functions. Moreover, around 15% of human ...
  32. [32]
    Elucidating the Role of C/D snoRNA in rRNA Processing and ...
    Most eukaryotic C/D small nucleolar RNAs (snoRNAs) guide 2′-O methylation (Nm) on rRNA and are also involved in rRNA processing. The four core proteins that ...
  33. [33]
    CCA Addition to tRNA: Implications for tRNA Quality Control - PMC
    The addition of CCA is an important step in the tRNA maturation process, which varies among different organisms (reviewed in [10, 11]). In E. coli and related ...Cca Addition To Trna... · Cca Addition In Trna... · Implication For Trna Quality...
  34. [34]
    Nonsense-Mediated mRNA Decay: Degradation of Defective ...
    Nonsense-mediated mRNA decay (NMD) is a eukaryotic surveillance mechanism that monitors cytoplasmic mRNA translation and targets mRNAs undergoing premature ...
  35. [35]
    Involvement of CRM1, a nuclear export receptor, in mRNA ... - PubMed
    CRM1, a receptor for nuclear export, is involved in mRNA export. Inhibiting CRM1 causes mRNA to accumulate in the nucleus. Results suggest CRM1 is involved in  ...
  36. [36]
    Structure of the bacterial ribosome at 2 Å resolution - eLife
    Sep 14, 2020 · Using cryo-electron microscopy (cryo-EM), we determined the structure of the Escherichia coli 70S ribosome with a global resolution of 2.0 Å.
  37. [37]
    The Structure and Function of the Eukaryotic Ribosome - PMC
    All ribosomes are composed of two subunits, both of which are built from RNA and protein (Figs. 1 and 2). Bacterial ribosomes, for example of Escherichia coli, ...
  38. [38]
    Codon—anticodon pairing: The wobble hypothesis - ScienceDirect
    This hypothesis is explored systematically, and it is shown that such a wobble could explain the general nature of the degeneracy of the genetic code.
  39. [39]
    Evolving genetic code - PMC - NIH
    This finding, together with the deviant nuclear genetic codes in not a few organisms and a number of mitochondria, shows that the genetic code is not universal, ...
  40. [40]
    Unusual Resistance of Peptidyl Transferase to Protein Extraction ...
    Peptidyl transferase, the ribosomal activity responsible for catalysis of peptide bond formation, is resistant to vigorous procedures that are ...
  41. [41]
    Dynamic basis of fidelity and speed in translation - PubMed Central
    Translation occurs fast at ∼20 amino acids per second in vivo,57, 58 equivalent of 50 ms per codon. The translocation step occurs at ∼30–100 µM−1s−1 at 37°C,59 ...
  42. [42]
    The structural basis for release-factor activation during translation ...
    Jun 12, 2019 · In termination of translation, the complete protein is released from the ribosome by a class-1 release factor (RF) recognizing one of the ...
  43. [43]
    Ribosome recycling factor (ribosome releasing factor) is essential for ...
    Ribosome releasing factor, product of the frr gene in Escherichia coli, is responsible for dissociation of ribosomes from mRNA after the termination of ...
  44. [44]
    The frequency of translational misreading errors in E. coli is largely ...
    Estimates of missense error rates (misreading) during protein synthesis vary from 10−3 to 10−4 per codon. The experiments reporting these rates have measured ...
  45. [45]
    Regulatory enhancer–core-promoter communication via ...
    Gene expression is regulated by genomic enhancers that recruit transcription factors and cofactors to activate transcription from target core-promoters.
  46. [46]
    Enhancer–promoter interactions and transcription are ... - Nature
    Dec 5, 2022 · We find that enhancer–promoter (E–P) interactions are largely insensitive to acute (3-h) depletion of CTCF, cohesin or WAPL.
  47. [47]
    Coming full circle: On the origin and evolution of the looping model ...
    In this article, we explore the emergence and development of the looping model as a means for enhancer–promoter communication and review the contrasting ...
  48. [48]
    The Transactivation Domains of the p53 Protein - PMC
    p53 has two transcriptional activation domains. Genetic, biochemical, and structural studies are illuminating the molecular details on how they function in ...
  49. [49]
    Mechanisms of transcriptional regulation by p53 - Nature
    Nov 10, 2017 · p53 is a transcription factor that suppresses tumor growth through regulation of dozens of target genes with diverse biological functions.
  50. [50]
    The Mediator complex as a master regulator of transcription by RNA ...
    Jun 20, 2022 · The Mediator complex, which in humans is 1.4 MDa in size and includes 26 subunits, controls many aspects of RNA polymerase II (Pol II) function.
  51. [51]
    The mediator coactivator complex: functional and physical roles in ...
    Sep 15, 2003 · It plays an key role in activation, bridging DNA-bound activators, the general transcriptional machinery, especially RNA polymerase II, and the ...
  52. [52]
    Transcription Attenuation: Once Viewed as a Novel Regulatory ... - NIH
    The trp operon attenuation mechanism described above is typical of the ribosome-mediated mechanisms that regulate expression of many biosynthetic operons of ...
  53. [53]
    Modeling operon dynamics: the tryptophan and lactose operons as ...
    This paper reviews recent mathematical modeling work on the tryptophan and lactose operons which are, respectively, the classical paradigms for repressible and ...<|separator|>
  54. [54]
    TRANSCRIPTIONAL CONTROL OF MUSCLE DEVELOPMENT BY ...
    The combi- natorial associations of proteins from these two families appear to establish a transcriptional code specific for skeletal muscle gene activation.<|separator|>
  55. [55]
    Master control: transcriptional regulation of mammalian Myod
    Jul 12, 2019 · Myod expression is largely controlled by just two enhancer regions located within a region 24 kb upstream of the transcription start site in mammals.
  56. [56]
    Transcription factors activate genes through the phase separation ...
    Here we report that diverse ADs form phase-separated condensates with the Mediator coactivator. For the OCT4 and GCN4 TFs, we show that the ability to form ...
  57. [57]
    Coactivator condensation at super-enhancers links phase ... - Science
    Our results show that coactivators form phase-separated condensates at SEs and that SE condensates compartmentalize and concentrate the transcription apparatus ...<|control11|><|separator|>
  58. [58]
    The Role of DNA Methylation and Histone Modifications in ... - NIH
    This chapter will discuss the effects of and mechanism by which histone modifications and DNA methylation affect transcriptional regulation.
  59. [59]
    ATP-dependent chromatin remodeling: genetics, genomics and ...
    Mar 1, 2011 · Namely, nucleosome sliding and perturbation by SWI/SNF promotes access to DNA, while nucleosome spacing by ISWI facilitates chromatin ...
  60. [60]
    Role of TET enzymes in DNA methylation, development, and cancer
    TET-mediated 5hmC deposition may therefore trigger passive replication-dependent DNA demethylation on the opposite DNA strand and be important to counter ...Missing: seminal | Show results with:seminal
  61. [61]
    DNA Methylation and Its Basic Function | Neuropsychopharmacology
    Jul 11, 2012 · Methylation of CpG islands can impair transcription factor binding, recruit repressive methyl-binding proteins, and stably silence gene ...Missing: reference | Show results with:reference
  62. [62]
    DNA methylation in genomic imprinting, development, and disease
    May 17, 2001 · DNA methylation is a modification of DNA that generally represses genes and is linked to gene regulation, development, and diseases like cancer.
  63. [63]
    The interplay of histone modifications – writers that read - EMBO Press
    Histones are subject to a vast array of posttranslational modifications including acetylation, methylation, phosphorylation, and ubiquitylation.Active Histone Modifications · Repressive Histone... · H3k27me3 And H3k9me2/3...Missing: hypothesis | Show results with:hypothesis
  64. [64]
    Swi/Snf chromatin remodeling/tumor suppressor complex ... - PNAS
    May 30, 2013 · In vitro studies using reconstituted nucleosomes have shown that the Swi/Snf complex can unwrap, slide, and eject nucleosomes as well as produce ...
  65. [65]
    Nucleosome remodeling by the SWI/SNF complex is ... - PubMed
    Jun 24, 2014 · ATP-dependent chromatin remodeling complexes are key factors in chromatin remodeling, and the SWI/SNF complex is the founding member. While many ...
  66. [66]
    Noncoding RNAs and epigenetic mechanisms during X ... - PubMed
    The Xist transcript triggers gene silencing in cis by coating the future inactive X chromosome. It also induces a cascade of chromatin changes, including ...
  67. [67]
    Xist RNA in action: Past, present, and future | PLOS Genetics
    Sep 19, 2019 · Xist RNA is the master regulator of X-chromosome inactivation (XCI), the epigenetic process that equalizes the dosage of X-linked genes between ...
  68. [68]
    The roles of TET family proteins in development and stem cells
    Jan 15, 2020 · We review recent studies of TET proteins, providing an overview of their structure, functions and roles in pluripotent stem cells and early development.Missing: seminal | Show results with:seminal
  69. [69]
    DNA methylation biomarkers for the diagnosis and treatment ... - NIH
    Extensive research in the last two decades has confirmed that DNA methylation changes are tumor-type specific [54,55]. Interestingly, hypermethylation of ...
  70. [70]
    BRCA1 promoter methylation predicts PARPi response in ovarian ...
    Aug 7, 2025 · However, the role of BRCA1 promoter methylation in guiding clinical management is unclear. Evidence is needed to improve patient selection ...
  71. [71]
    Targeting epigenetic regulators as a promising avenue to overcome ...
    Jul 18, 2025 · We evaluate the possibility and potential mechanisms of targeting epigenetic modifications to overcome resistance in cancer therapy.
  72. [72]
    Alternative splicing and cancer: a systematic review - Nature
    Feb 24, 2021 · The aim of the present study was to conduct a systematic review in order to describe the regulatory mechanisms of alternative splicing, as well as its ...
  73. [73]
    Function, clinical application, and strategies of Pre-mRNA splicing in ...
    Nov 21, 2018 · In this review, we discuss how aberrant splicing isoforms precisely regulate three basic functional aspects in cancer: proliferation, metastasis and apoptosis.
  74. [74]
    RNA-binding proteins tristetraprolin and human antigen R are novel ...
    Jun 2, 2020 · Upon binding to target mRNAs, HuR protects them from degradation by TTP and directs them to ribosome complexes to enhance their translation.Missing: stabilization | Show results with:stabilization
  75. [75]
    Deciphering the rules of microRNA targeting - Nature
    Oct 31, 2016 · MicroRNAs (miRNAs) are thought to directly regulate over 60% of human mRNAs and have been found to be involved in numerous biological ...
  76. [76]
    Gene regulation by long non-coding RNAs and its biological functions
    Dec 22, 2020 · Evidence accumulated over the past decade shows that long non-coding RNAs (lncRNAs) are widely expressed and have key roles in gene regulation.
  77. [77]
    Gene Expression: MRNA Transcript Analysis - NCBI
    The first step in gene expression is transcription of the genetic information in DNA into RNA. The individual building blocks of RNA, ribonucleotides, ...
  78. [78]
    Twenty-Five Years of Quantitative PCR for Gene Expression Analysis
    May 16, 2018 · This review examines the current state of qPCR for gene expression analysis now that the method has reached a mature stage of development and implementation.
  79. [79]
    Considerations for accurate gene expression measurement by ...
    Accurate RT-qPCR analysis could improve clinical diagnosis as well as predictive and prognostic monitoring. Furthermore, improved analytical measurement ...
  80. [80]
    An introduction to DNA microarrays for gene expression analysis
    This tutorial presents a basic introduction to DNA microarrays as employed for gene expression analysis, approaching the subject from a chemometrics ...
  81. [81]
    A survey of best practices for RNA-seq data analysis | Genome Biology
    Jan 26, 2016 · We review all of the major steps in RNA-seq data analysis, including experimental design, quality control, read alignment, quantification of gene and ...
  82. [82]
    Isoform Age - Splice Isoform Profiling Using Long-Read Technologies
    Aug 1, 2021 · In this review, we discuss current and emerging long-read sequencing methodologies for full-length RNA isoform detection and quantification.
  83. [83]
    Comprehensive assessment of mRNA isoform detection methods for ...
    May 10, 2024 · In this study, we conducted a benchmark analysis of thirteen methods implemented in nine tools capable of identifying isoform structures from long-read RNA-seq ...
  84. [84]
    Visualization and analysis of gene expression in tissue sections by ...
    Jul 1, 2016 · Spatial transcriptomics provides quantitative gene expression data and visualization of the distribution of mRNAs within tissue sections.
  85. [85]
    Method of the Year: spatially resolved transcriptomics - Nature
    Jan 6, 2021 · With spatially resolved transcriptomic methods, scientists can get transcriptomic data and know the positional context of those cells in a tissue.
  86. [86]
    Western Blot: Technique, Theory, and Trouble Shooting - PMC - NIH
    This paper will attempt to explain the technique and theory behind western blot, and offer some ways to troubleshoot. Keywords: Bio-medical research, protein, ...Missing: seminal | Show results with:seminal
  87. [87]
    Enzyme-linked immunosorbent assay (ELISA) quantitative assay of ...
    Immunochemistry, Volume 8, Issue 9, September 1971, Pages 871-874, Communication to the editors, Enzyme-linked immunosorbent assay (ELISA) quantitative assay ...
  88. [88]
    Overview of ELISA | Thermo Fisher Scientific - US
    ELISA (enzyme-linked immunosorbent assay) is a plate-based assay technique designed for detecting and quantifying soluble substances such as peptides, proteins ...ELISA Troubleshooting Guide · ELISA Development and... · ELISA Data AnalysisMissing: seminal | Show results with:seminal
  89. [89]
    Shedding Light on Intracellular Proteins using Flow Cytometry
    Jun 3, 2024 · This review discusses the scope of flow cytometry for intracellular protein detection in mammalian cells along with specific applications.Missing: seminal paper
  90. [90]
    Flow Cytometric Methods for the Detection of Intracellular Signaling ...
    Dec 8, 2020 · The flow cytometric detection of intracellular (IC) signaling proteins and transcription factors (TFs) will help to elucidate the regulation of B cell survival ...Missing: seminal | Show results with:seminal
  91. [91]
    Green Fluorescent Protein as a Marker for Gene Expression - Science
    GFP expression can be used to monitor gene expression and protein localization in living organisms.Missing: original | Show results with:original
  92. [92]
    Ten questions to AI regarding the present and future of proteomics
    By combining mass spectrometry data with data from other omics technologies, AI can help us to gain a deeper understanding of biological systems. This can lead ...Abstract · Do you still see a market for... · Mass spectrometry can study...
  93. [93]
    Gene expression and protein abundance: Just how associated are ...
    A study in 2016 also observed a modest Spearman correlation, ranging from 0.41 to 0.676, between mRNA levels and absolute abundance of 6130 proteins measured ...Gene Expression And Protein... · 2. Transcriptome And... · 4. Classification Of Studies
  94. [94]
    On the Dependency of Cellular Protein Levels on mRNA Abundance
    Apr 21, 2016 · Factors such as the delay between transcription and translation or protein stability limit the speed at which proteomes can be adapted purely ...
  95. [95]
    Multiple Transcript Properties Related to Translation Affect mRNA ...
    Our model suggests a negative relationship between mRNA degradation rates and translation elongation rates, as estimated by CAI, nTE, and ribosome density.
  96. [96]
    A review of Ribosome profiling and tools used in Ribo-seq data ...
    Ribosome profiling sequencing (Ribo-Seq) is one of the methods to study translation and its regulation. It is a high throughput technology based on deep ...
  97. [97]
    Rp3: Ribosome profiling-assisted proteogenomics improves ...
    Aug 9, 2024 · Proteogenomics is a multi-omics approach that integrates genomics, transcriptomics, and proteomics, and can be used for a multitude of tasks ...
  98. [98]
  99. [99]
    Central dogma rates and the trade-off between precision ... - Nature
    Jan 8, 2019 · First, the maximal translation is 103.6–104 proteins per mRNA per hour, a bound that can be explained from the ribosome translocation speed ( ...
  100. [100]
    Direct comparison of mass cytometry and single-cell RNA ... - Nature
    May 30, 2024 · Here, we performed scRNA-seq, mass cytometry, and flow cytometry on a single, split sample of human peripheral blood mononuclear cells (PBMCs).
  101. [101]
    Combined protein and transcript single-cell RNA sequencing in ...
    Sep 1, 2022 · Unlike CyTOF, scRNA-seq allows the detection of single-cell transcriptomes. Since the correlation between cell surface protein and mRNA ...
  102. [102]
    MicroRNA Buffering and Altered Variance of Gene Expression in ...
    Apr 9, 2014 · One potential role of miRNAs is to buffer variation in gene expression, although conflicting results have been reported.
  103. [103]
    Accurate prediction of protein–nucleic acid complexes using ...
    Nov 23, 2023 · For predicting protein components, machine learning-guided approaches like RoseTTAFold4 and AlphaFold5 are highly accurate, while RNA structure ...
  104. [104]
    Current limitations in predicting mRNA translation with deep ...
    Aug 20, 2024 · Accuracy and data efficiency in deep learning models of protein expression. ... Highly accurate protein structure prediction with AlphaFold.
  105. [105]
    A gradient of bicoid protein in Drosophila embryos - ScienceDirect
    The maternal gene bicoid (bcd) organizes anterior development in Drosophila. Its mRNA is localized at the anterior tip of the oocyte and early embryo.
  106. [106]
    Bicoid gradient formation and function in the Drosophila pre ...
    Abstract. Bicoid (Bcd) protein distributes in a concentration gradient that organizes the anterior/posterior axis of the Drosophila embryo.
  107. [107]
    structure, function and dynamics of the vertebrate segmentation clock
    Feb 15, 2012 · The segmentation clock is an oscillating genetic network thought to govern the rhythmic and sequential subdivision of the elongating body ...
  108. [108]
    The segmentation clock mechanism moves up a notch - PMC
    The vertebrate segmentation clock is a molecular oscillator that regulates the periodicity of somite formation. Three signalling pathways have been proposed ...
  109. [109]
    The MYC oncogene — the grand orchestrator of cancer growth ... - NIH
    Genomic alterations, including gene amplification, chromosomal translocations and mutations, can increase MYC expression (FIG. 2a). A pan-cancer assessment of ...
  110. [110]
    Epigenetic gene silencing in cancer - PMC - NIH
    Epigenetic gene silencing refers to nonmutational gene inactivation that can be faithfully propagated from precursor cells to clones of daughter cells. The ...
  111. [111]
    Transcription Factors in Long-Term Memory and Synaptic Plasticity
    This review provides a brief overview of experimental work showing that several families of transcription factors, including CREB, C/EBP, Egr, AP-1, and Rel ...
  112. [112]
    Interferon-inducible gene expression signature in peripheral blood ...
    We used global gene expression profiling of peripheral blood mononuclear cells to identify distinct patterns of gene expression that distinguish most SLE ...
  113. [113]
    Single-cell immune profiling reveals distinct immune response in ...
    Sep 16, 2021 · To explore characteristics that might lead to immunopathology in asymptomatic and moderate COVID-19, we performed scRNA-seq together with ...
  114. [114]
    Histone Deacetylases and Their Inhibitors in Cancer Epigenetics - NIH
    Their role in epigenetics has significantly altered the development of anticancer drugs used to treat the most rare, persistent forms of cancer.
  115. [115]
    Safety and Efficacy of RNAi Therapy for Transthyretin Amyloidosis
    Aug 29, 2013 · ALN-TTR01 and ALN-TTR02 suppressed the production of both mutant and nonmutant forms of transthyretin, establishing proof of concept for RNAi therapy.
  116. [116]
    Review Gene Regulation and Speciation - ScienceDirect.com
    We review here how regulatory divergence between species can result in hybrid dysfunction, including recent theoretical support for this model.
  117. [117]
    The role of gene expression in ecological speciation - PubMed Central
    Here we review two potential roles of gene expression in ecological speciation: (1) its indirect role in facilitating population persistence and (2) its direct ...
  118. [118]
    Current approaches to gene regulatory network modelling - PMC
    Network motifs they identified to be significantly more frequent than in randomised networks included feed-forward and feedback loops. These motifs may ...
  119. [119]
    Structure and function of the feed-forward loop network motif - PubMed
    The FFL, a three-gene pattern, is composed of two input transcription factors, one of which regulates the other, both jointly regulating a target gene.
  120. [120]
    ARACNE: An Algorithm for the Reconstruction of Gene Regulatory ...
    Mar 20, 2006 · ARACNE, a novel algorithm, using microarray expression profiles, specifically designed to scale up to the complexity of regulatory networks in mammalian cells.
  121. [121]
    Boolean Models of Genomic Regulatory Networks - PubMed Central
    The Boolean network model is based on the observation that during the regulation of its functional states the cell often exhibits switch-like behavior. Recent ...
  122. [122]
    Scale-free networks in cell biology - Company of Biologists journals
    Nov 1, 2005 · Among the most well-known examples of a hub protein is the tumor suppressor protein p53, which has an abundance of incoming edges, interactions ...
  123. [123]
    Wnt signaling and cancer - Genes & Development
    In this review, the wnt pathway will be covered from the perspective of cancer, with emphasis placed on molecular defects known to promote neoplastic ...
  124. [124]
    Transcriptional architecture of the mammalian circadian clock - PMC
    a) At the core, CLOCK and BMAL1 activate the Per1, Per2, Cry1 and Cry2 genes, whose protein products interact and repress their own transcription. The stability ...