
Intergenic region

Intergenic regions are the DNA sequences situated between protein-coding genes in a genome. They comprise the majority of the DNA in eukaryotic organisms such as humans, where they account for approximately 75% of total genomic content. These regions were historically dismissed as non-functional "junk DNA," but extensive genomic studies have revealed their critical roles in cellular processes.

A primary function of intergenic regions is to house cis-regulatory elements, including enhancers, silencers, and insulators, which modulate the expression of nearby genes by influencing transcription initiation, activation, or repression. Enhancers within these regions can act over long distances, sometimes spanning megabases, by looping to contact promoters, thereby enabling tissue-specific and developmental-stage-specific gene regulation in eukaryotes. Intergenic regions are also sources of non-coding RNAs, such as long intergenic non-coding RNAs (lincRNAs) and microRNAs (miRNAs), which further fine-tune gene expression through mechanisms such as chromatin modification.

The study of intergenic regions has transformed genomics, particularly through projects like ENCODE, which demonstrated pervasive transcription across these areas and blurred the traditional distinction between genic and intergenic space by identifying widespread functional elements. In prokaryotes, intergenic regions are shorter and often contain promoters and operators essential for transcriptional regulation, in contrast to the more complex, expansive regulatory landscapes of eukaryotes. Variation in intergenic sequences contributes to phenotypic diversity, disease susceptibility, and evolutionary adaptation, underscoring the importance of these regions beyond merely spacing genes.

Definition and Basics

Definition

An intergenic region is a segment of DNA located between two consecutive genes on a chromosome, typically spanning from the transcription termination site of the upstream gene (after its stop codon and associated terminator sequences) to the transcription start site of the downstream gene. Such a region often includes the promoter of the downstream gene but excludes its coding sequence, which begins at the start codon. Intergenic regions are predominantly non-coding, meaning they do not encode proteins, but they often contain functional elements such as regulatory sequences that influence nearby gene activity.

The concept of intergenic regions came to prominence during early genome sequencing efforts, particularly with the draft human genome sequences published in 2001, one analysis of which classified approximately 75% of the human genome as intergenic DNA based on initial gene annotations. The terminology evolved from the earlier notion of "spacer DNA," a term used in pre-genomics studies since the 1970s to describe non-transcribed sequences separating genes, especially in contexts such as ribosomal DNA repeats. Precise delineation of intergenic boundaries became essential for genome annotation as sequencing technologies advanced.

A key distinction exists between intergenic and intragenic regions: intergenic sequences lie entirely outside the defined boundaries of any gene, whereas intragenic regions are located within a single gene and include its exons, introns, and associated untranslated regions. For example, in bacterial genomes like that of Escherichia coli, intergenic regions are typically short, averaging 100-200 base pairs, and often separate genes in operons; in contrast, human intergenic regions vary widely and can extend over megabases between genes. Although these regions have regulatory roles in modulating gene expression, they are defined primarily by their position rather than by their function.
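The positional definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: gene coordinates are given as half-open `(start, end)` intervals on a single chromosome, overlapping genes are merged, and the function name `intergenic_intervals` and the toy coordinates are hypothetical rather than taken from any annotation resource.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), 0-based (an assumption of this sketch)


def intergenic_intervals(genes: List[Interval]) -> List[Interval]:
    """Return the gaps lying strictly between consecutive annotated genes.

    Overlapping or nested genes are treated as one genic block, so a gap is
    reported only where no annotated gene covers the sequence.
    """
    ordered = sorted(genes)
    gaps: List[Interval] = []
    if not ordered:
        return gaps
    cursor = ordered[0][1]  # end of the genic block seen so far
    for start, end in ordered[1:]:
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    return gaps


# Toy coordinates (hypothetical): the first two genes overlap.
toy_genes = [(100, 400), (350, 600), (900, 1200), (1500, 1800)]
toy_gaps = intergenic_intervals(toy_genes)
for gap_start, gap_end in toy_gaps:
    print(f"intergenic: {gap_start}-{gap_end} ({gap_end - gap_start} bp)")
```

Because the gaps are defined purely as the complement of the annotated genes, any change to the gene models (a newly annotated gene, a revised transcript end) changes the intergenic set as well.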

Genomic Context

Intergenic regions represent the non-genic portions of the genome, situated between annotated genes, and constitute the majority of genomic space in many eukaryotes. In the human genome, these regions encompass approximately 75% of the total sequence, with protein-coding exons accounting for only about 1.1% and introns covering roughly 24%. This distribution underscores the predominance of non-coding DNA, with intergenic sequences forming the bulk of what lies outside transcribed gene units. In contrast, prokaryotic genomes exhibit much higher coding densities; for example, in Escherichia coli K-12, intergenic regions comprise about 12% of the 4.6 Mb genome, while protein-coding sequences occupy around 88%. These proportions highlight fundamental differences in genome organization, with eukaryotic genomes expanded by extensive non-coding elements compared to the compact structure found in prokaryotes.

Intergenic regions are positioned adjacent to key regulatory elements such as promoters, which initiate transcription at the 5' end of genes, terminators, which signal the end of transcription at the 3' end, and enhancers, which modulate gene expression from potentially distant sites. Although defined primarily by the absence of annotated genes, intergenic areas may harbor pseudogenes (non-functional gene copies) or repetitive elements such as transposons, which annotation pipelines distinguish from the remaining intergenic space. This adjacency integrates intergenic sequences into the broader genomic architecture, influencing spatial organization and potential interactions with nearby functional units.

Identifying intergenic regions poses significant challenges, primarily because it relies on genome assembly and gene prediction algorithms that delineate boundaries with varying accuracy. Tools like GENSCAN employ probabilistic models to predict gene structures from features such as codon usage, splice site signals, and exon-intron composition, so that intergenic regions are defined as the residual, non-predicted segments. However, inaccuracies in predicting long introns or low-expression genes can lead to over- or underestimation of intergenic extents, particularly in complex eukaryotic genomes where repetitive sequences complicate assembly. Advances in long-read sequencing have improved resolution, but bioinformatics pipelines continue to be refined to minimize misannotation.

The non-coding nature of intergenic regions makes them evolutionary hotspots for insertions and deletions (indels), which accumulate at higher rates than in coding sequences because of reduced selective constraint. These structural variants contribute to genomic variation and plasticity across species.
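The proportions quoted in this section can be turned into approximate base-pair counts with simple arithmetic. The sketch below uses only the E. coli K-12 figures given above (4.6 Mb, about 88% protein-coding, about 12% intergenic); the helper name `partition_genome` and its tolerance parameter are hypothetical, and the results are rough estimates, not annotation-derived counts.

```python
def partition_genome(total_bp, fractions, tolerance=0.01):
    """Convert fractional genome composition into approximate base-pair counts.

    Raises ValueError if the fractions do not sum to roughly 1, which guards
    against mixing percentages taken from incompatible annotations.
    """
    total_fraction = sum(fractions.values())
    if abs(total_fraction - 1.0) > tolerance:
        raise ValueError(f"fractions sum to {total_fraction:.3f}, expected about 1")
    return {name: round(total_bp * frac) for name, frac in fractions.items()}


# E. coli K-12 figures as stated in the text.
ecoli = partition_genome(4_600_000, {"protein_coding": 0.88, "intergenic": 0.12})
print(ecoli)

# The human percentages quoted above (75% + 24% + 1.1%) are rounded values,
# so their sum slightly exceeds 1.
human_fraction_sum = 0.75 + 0.24 + 0.011
print(f"human fractions sum to {human_fraction_sum:.3f}")
```

On these figures, roughly 0.55 Mb of the E. coli genome is intergenic, whereas three quarters of the much larger human genome falls into that category.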

Structural Properties

Composition and Sequence Features

Intergenic regions exhibit distinct nucleotide compositions that differ between prokaryotes and eukaryotes. In eukaryotic genomes, these regions are often AT-rich, with an enrichment of homopolymeric poly(dA:dT) tracts that contribute to nucleosome-depleted areas and influence chromatin organization. In prokaryotic genomes, intergenic regions display variable GC content, typically ranging from 40% to 60% in many bacterial species and generally lower than that of adjacent coding sequences owing to mutational biases and selection pressures. Intergenic sequences in both domains also frequently harbor repetitive motifs, including microsatellites and transposable elements, which can make up a substantial portion of the sequence and arise from transposition events or replication slippage.

Sequence features within intergenic DNA can also enable folding into stable secondary structures. Inverted repeats, common in these regions, can form hairpin (stem-loop) structures that affect DNA stability and processing; for instance, in the Saccharomyces cerevisiae genome, approximately 33.5% of identified inverted repeats are wholly contained within intergenic regions, with clustering near 3′ flanks. These structures can influence local supercoiling and cruciform extrusion, though their functional constraints vary by genomic context.

Boundary markers delineate the starts and ends of intergenic regions, often through specific sequence motifs tied to the transcription machinery. In bacteria, rho-independent terminators act as key downstream boundaries: each consists of a GC-rich stem-loop hairpin followed by a polyuridine (U) tract in the transcript, corresponding to an AT-rich sequence in the DNA, which promotes release of the transcript and marks the onset of the intergenic region. In eukaryotes, poly-A tracts similarly mark gene boundaries, with poly-T sequences at 5′ ends and poly-A sequences at 3′ ends observed in organisms like Dictyostelium discoideum, aiding transcription termination and gene demarcation.

Detection and characterization of intergenic composition rely on advanced sequencing technologies. Next-generation sequencing (NGS) enables high-resolution mapping of base-composition profiles and repetitive elements in these regions, revealing heterogeneity beyond simple AT/GC biases. The ENCODE project, launched in 2003, with pilot-phase results published in 2007 and genome-wide results in 2012, has uncovered hidden complexity in human intergenic sequences through integrated analyses of transcription, chromatin accessibility, and regulatory elements, showing that over 30% of transcribed bases originate from intergenic areas with diverse biochemical signatures.
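Two of the sequence features described in this section, GC content and hairpin-forming inverted repeats, are straightforward to compute. The following is a minimal sketch, assuming plain DNA strings and exact (mismatch-free) stems. The function names, the default stem and loop sizes, and the toy sequence are all illustrative assumptions, and real terminator predictors use thermodynamic scoring rather than exact matching.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def reverse_complement(seq):
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_inverted_repeats(seq, stem=6, min_loop=3, max_loop=8):
    """Return (left_arm_start, loop_length) for exact inverted repeats.

    A hit means the `stem`-base left arm is followed, after a loop of
    min_loop..max_loop bases, by its reverse complement, so the two arms
    could pair into a hairpin.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        target = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if seq[j:j + stem] == target:
                hits.append((i, loop))
    return hits


# Toy terminator-like sequence (hypothetical): GC-rich stem, short loop,
# matching stem, then a T-rich tract.
toy = "AT" + "GCCGCC" + "TTAA" + "GGCGGC" + "TTTTTTTT"
print(f"GC content: {gc_content(toy):.2f}")
print("inverted repeats:", find_inverted_repeats(toy))
```

The toy sequence mirrors the layout of a rho-independent terminator as described above: a GC-rich inverted repeat followed by a run of T in the DNA, which corresponds to the U tract in the transcript.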

Length and Distribution

Intergenic regions in bacterial genomes are typically compact, with average lengths ranging from 100 to 300 base pairs, reflecting the high gene density and streamlined architecture of prokaryotic chromosomes. For example, in Escherichia coli, the median intergenic length is approximately 134 base pairs, which packs essential regulatory elements into limited non-coding space. In contrast, eukaryotic intergenic regions exhibit much greater variability and scale, often spanning from 1 kilobase to over 1 megabase, with a median intergenic length in the human genome of approximately 48 kilobases. This expansion accommodates complex regulatory networks and repetitive elements.

The distribution of intergenic regions across genomes is uneven, influenced by chromatin organization and genome architecture. In eukaryotes, genes cluster in gene-dense euchromatin, where shorter intergenic spacers facilitate coordinated expression, while intergenic regions are more expansive in heterochromatin, which is gene-poor and enriched for repressive elements. Genome size further modulates this pattern: viral genomes maintain highly compact intergenic spaces, often with overlapping genes and minimal non-coding DNA to optimize replication efficiency, whereas many eukaryotic genomes feature expansive intergenic regions owing to abundant transposable elements and other repetitive DNA, contributing to their larger overall sizes compared with compact bacterial or viral counterparts.

Variability in intergenic lengths is also shaped by factors such as clustered gene arrangements, which minimize spacing between co-regulated genes. For instance, histone gene clusters in eukaryotes are organized in arrays with short intergenic regions, enabling synchronized, replication-dependent expression. Comparative genomic studies reveal that intergenic lengths generally increase with organismal complexity, correlating with expanded regulatory needs in multicellular lineages, as evidenced by broader intergenic length distributions in vertebrates than in prokaryotes.

To quantify and visualize these lengths and distributions, researchers employ computational tools such as genome browsers, which integrate annotated gene models to display intergenic intervals and chromatin states across species. Such resources allow region sizes and distribution patterns to be measured through interactive tracks, facilitating comparative analyses that do not depend on sequence composition.
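Because intergenic lengths are strongly skewed, especially in eukaryotes, the median is usually a more representative summary than the mean. The short sketch below illustrates this with two invented toy samples: the individual values are hypothetical and were chosen only so that the medians match the figures quoted above (134 bp and 48 kb). The helper name `summarize_lengths` is likewise an assumption.

```python
from statistics import mean, median


def summarize_lengths(lengths):
    """Summarize a list of intergenic lengths (in base pairs)."""
    return {
        "n": len(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "max": max(lengths),
    }


# Invented toy samples, not measured data.
bacterial_like = [95, 120, 134, 150, 300]
eukaryote_like = [2_000, 15_000, 48_000, 90_000, 1_200_000]

bacterial_summary = summarize_lengths(bacterial_like)
eukaryote_summary = summarize_lengths(eukaryote_like)

for label, summary in (("bacterial-like", bacterial_summary),
                       ("eukaryote-like", eukaryote_summary)):
    print(f"{label}: median={summary['median']} bp, mean={summary['mean']} bp")
```

In the eukaryote-like sample a single megabase-scale gap pulls the mean to several times the median, which is why reported intergenic sizes for large genomes are usually medians.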

Biological Functions

Regulatory Mechanisms

Intergenic regions harbor a variety of non-coding regulatory elements that orchestrate gene activity by facilitating or inhibiting transcription. These elements include promoters, enhancers, silencers, and insulators, which interact with transcription factors, co-activators, and repressors to modulate chromatin accessibility and RNA polymerase recruitment. In eukaryotic genomes, such regions often span thousands of base pairs and enable precise spatiotemporal control of gene expression.

Core promoters, typically located immediately upstream of transcription start sites in intergenic space, serve as platforms for assembling the pre-initiation complex. A classic example is the TATA box, a conserved AT-rich motif situated approximately 20-30 base pairs upstream of the start site, which binds TATA-binding protein (TBP) to initiate transcription. Distal enhancers, conversely, can reside up to 1 megabase away within intergenic DNA and loop to contact promoters via chromatin folding, thereby boosting transcription rates through co-activator complexes and histone acetyltransferases. These enhancers are enriched in intergenic regions and exhibit tissue-specific activity, as demonstrated by genome-wide mapping studies.

Silencer sequences in intergenic regions act as binding sites for transcriptional repressors, dampening gene expression by recruiting histone deacetylases or blocking activator access. Insulators, often mediated by the CTCF protein, prevent inappropriate enhancer-promoter interactions and delineate chromatin domains; for instance, CTCF sites in intergenic areas block enhancer activity in a directional manner, maintaining spatial organization. Approximately 50% of CTCF binding sites occur in intergenic regions, underscoring their role in genome topology.

Epigenetic modifications further fine-tune intergenic regulatory functions. Active intergenic regions, particularly enhancers, are marked by histone H3 lysine 4 monomethylation (H3K4me1), which correlates with open chromatin and transcription factor binding, while H3K27 acetylation (H3K27ac) enhances accessibility. In contrast, DNA methylation at CpG islands within intergenic promoters represses transcription by inhibiting transcription factor binding and promoting chromatin compaction; hypomethylation in these areas facilitates activation.

Notable examples illustrate these mechanisms. In Escherichia coli, the intergenic region of the lac operon contains operator sites that bind the LacI repressor, blocking RNA polymerase progression and regulating lactose-inducible transcription. In humans, the beta-globin locus control region (LCR), an intergenic cluster of hypersensitive sites upstream of the beta-globin gene cluster, coordinates erythroid-specific expression by integrating enhancers and insulators, including CTCF-bound elements at HS5.
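As an illustration of how a core promoter motif might be located computationally, the sketch below scans the sequence immediately upstream of a transcription start site for a TATA-like motif. It assumes the commonly cited consensus TATA(A/T)A(A/T) and a search window of 20 to 35 bases upstream. The consensus, the window, the function name `find_tata_like`, and the toy sequence are assumptions of this example and do not come from the text above. Exact-match scanning is also much cruder than the position weight matrices used in practice.

```python
import re

# Lookahead group so that overlapping candidate motifs are all reported.
TATA_LIKE = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_like(upstream, nearest=20, farthest=35):
    """Find TATA-like motifs in the sequence directly upstream of a TSS.

    `upstream` must end at the base just before the transcription start
    site, so its last base is position -1. Returns (position, motif) pairs,
    where position is the negative coordinate of the motif's first base.
    """
    seq = upstream.upper()
    hits = []
    for match in TATA_LIKE.finditer(seq):
        position = match.start() - len(seq)
        if -farthest <= position <= -nearest:
            hits.append((position, match.group(1)))
    return hits


# Toy upstream sequence (hypothetical): the motif starts 30 bases before the TSS.
toy_upstream = "G" * 20 + "TATAAAA" + "C" * 23
tata_hits = find_tata_like(toy_upstream)
print(tata_hits)
```

A motif match alone is weak evidence of promoter activity; in practice it would be combined with signals such as the chromatin marks described above.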

Involvement in Gene Expression

Intergenic regions play a critical role in transcription initiation by serving as platforms for RNA polymerase II (Pol II) recruitment, often through regulatory elements that facilitate the assembly of pre-initiation complexes. For instance, transcribed intergenic enhancers exhibit Pol II occupancy and nascent transcription, enabling recruitment at distal sites to initiate gene expression in a tissue-specific manner. These regions can also harbor pausing sites where Pol II accumulates shortly after initiation, allowing regulatory control before productive elongation; such pausing is mediated by conserved DNA sequence motifs in intergenic areas and influences the timing and efficiency of transcription across metazoan genomes.

In addition to initiation, intergenic regions contribute to transcriptional elongation by providing sequences that modulate Pol II processivity and prevent interference between adjacent genes. RNA polymerase mapping studies have identified bidirectional transcription in intergenic zones that influences elongation, highlighting how these non-coding areas help coordinate the expression of neighboring loci.

Intergenic regions also enable alternative promoter usage, which generates tissue-specific mRNA isoforms, particularly in disease contexts like cancer. Distal CpG islands within intergenic space can act as alternative promoters, driving the expression of protein isoforms with distinct functional properties; for example, such intergenic promoters can produce isoforms of genes like HNF4A that promote tumor progression. Tumor-specific alternative transcription start sites in intergenic regions have also been observed, where they lead to isoform switching that enhances oncogenic signaling via genes such as TCF12.

Regarding mRNA processing and stability, intergenic elements close to the 3' UTR influence polyadenylation signals by harboring cryptic polyadenylation sites that affect cleavage and poly(A) tail addition. In some genes, transcription extending into 3' intergenic DNA creates cryptic sites downstream of the mature 3' end which, if used, destabilize the mRNA by altering its 3' end processing and export efficiency. These intergenic features can thus modulate mRNA decay rates and permit rapid turnover.

Experimental evidence from CRISPR-based editing studies since 2012 demonstrates how intergenic deletions alter gene expression levels, particularly at loci identified by genome-wide association studies (GWAS). For example, CRISPR-Cas9 editing of an intergenic regulatory region near EPDR1, associated with bone mineral density via GWAS, confirmed its role in modulating target gene expression. Similarly, targeted CRISPR activation at non-coding GWAS signals in schizophrenia-linked intergenic variants has shown upregulation of nearby genes, linking these variants to altered expression profiles in neuronal models. Such studies underscore the functional impact of intergenic variants on expression in the absence of coding changes.

Variations Across Organisms

In Prokaryotes

In prokaryotes, intergenic regions are characteristically compact, typically ranging from 50 to 500 base pairs in length, reflecting the streamlined architecture of bacterial and archaeal genomes, which prioritizes coding efficiency. These short spacers often separate genes within operons or divergent gene pairs and frequently harbor bidirectional promoters, enabling coordinated transcription of adjacent genes in opposite directions from a shared regulatory region. This arrangement facilitates rapid transcriptional control in response to environmental cues, as seen in many bacterial species where divergent operons share promoter sequences to optimize resource use under nutrient-limited conditions.

Specific structural features within these intergenic regions include Rho-dependent terminators and transcription attenuators, which play critical roles in fine-tuning gene expression. At Rho-dependent terminators, located primarily at the 3' ends of genes in intergenic space, the Rho factor binds nascent RNA lacking strong secondary structure and releases RNA polymerase, thereby preventing read-through into downstream regions and recycling the transcription machinery. A prominent example of attenuation is the leader sequence of the trp operon, a 162-base-pair region upstream of the structural genes that forms alternative RNA hairpins to modulate transcription according to tryptophan availability, terminating early when tryptophan levels are high and allowing full expression when they are low.

Intergenic regions also serve as hotspots for horizontal gene transfer, where mobile elements like insertion sequences integrate, promoting genetic exchange and adaptation in dynamic microbial environments. Recent studies as of 2025 have also revealed that intergenic regions in bacteria encode numerous small proteins (microproteins), contributing to a previously unexplored functional landscape. In addition, intergenic regions flank antibiotic resistance genes carried on mobile elements such as transposons and integrons, enabling their dissemination across bacterial populations. For instance, in some pathogens, intergenic insertions of mobile elements near resistance loci, such as those encoding beta-lactamases, allow rapid acquisition and expression of resistance under selective pressure from antibiotics. Bacterial pan-genome analyses further highlight intergenic variability, revealing that these non-coding regions exhibit higher sequence diversity than core genes, contributing to phenotypic differences across strains and facilitating adaptation to diverse ecological niches.

Recent studies have uncovered intergenic roles in microbial community dynamics, particularly in encoding regulatory elements for quorum sensing signals that coordinate behaviors such as biofilm formation. In microbiomes, for example, functional screens have identified acyl-homoserine lactone (AHL) quorum sensing genes within intergenic contexts, enabling density-dependent communication among diverse bacteria that enhances collective resilience against stressors. These findings underscore how intergenic variability drives community-level interactions in complex environments.
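The divergent gene pairs mentioned at the start of this section can be recognized from strand annotations alone. The sketch below classifies each pair of adjacent genes by orientation and reports the length of the spacer between them. The tuple layout, the function name `classify_adjacent_pairs`, and the toy genes are hypothetical, and the sketch assumes non-overlapping genes with half-open coordinates.

```python
def classify_adjacent_pairs(genes):
    """Classify adjacent gene pairs by relative orientation.

    `genes` is a list of (name, start, end, strand) tuples with strand "+"
    or "-". Returns (left_name, right_name, orientation, spacer_bp) tuples:
      divergent  - left gene on "-", right gene on "+": both 5' ends face
                   the spacer, so it can hold a shared bidirectional promoter
      convergent - left gene on "+", right gene on "-": both 3' ends face
                   the spacer
      tandem     - both genes on the same strand
    """
    ordered = sorted(genes, key=lambda gene: gene[1])
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        spacer = right[1] - left[2]
        if left[3] == "-" and right[3] == "+":
            orientation = "divergent"
        elif left[3] == "+" and right[3] == "-":
            orientation = "convergent"
        else:
            orientation = "tandem"
        pairs.append((left[0], right[0], orientation, spacer))
    return pairs


# Toy annotation (hypothetical gene names and coordinates).
toy_annotation = [
    ("geneA", 0, 900, "-"),
    ("geneB", 1050, 2000, "+"),
    ("geneC", 2120, 3000, "+"),
    ("geneD", 3080, 3900, "-"),
]
pairs = classify_adjacent_pairs(toy_annotation)
for left_name, right_name, orientation, spacer in pairs:
    print(f"{left_name}/{right_name}: {orientation}, {spacer} bp spacer")
```

Only the divergent class places two promoters in the same spacer, which is why those spacers are the natural candidates for shared bidirectional regulation.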

In Eukaryotes

In eukaryotes, intergenic regions are typically much larger and more structurally diverse than in prokaryotes, often comprising vast stretches of non-coding DNA that play critical roles in gene regulation and genome architecture. These regions, the largest of which are sometimes called intergenic deserts, can span hundreds of kilobases and serve as reservoirs for regulatory elements that modulate gene expression across developmental and environmental contexts. Unlike the compact operon-adjacent spacers of prokaryotes, eukaryotic intergenic spaces enable complex, long-range interactions essential for multicellularity. Recent advances as of 2024 have further elucidated enhancer-promoter specificity in these regions, including at super-enhancers.

A prominent feature of eukaryotic intergenic regions is their capacity to harbor long non-coding RNAs (lncRNAs), which are transcribed from intergenic loci and influence chromatin states and transcriptional programs, particularly at developmental genes. For instance, many lncRNAs act as scaffolds for protein complexes or as decoys for transcription factors, thereby fine-tuning the expression of nearby genes involved in cell differentiation. Additionally, large intergenic deserts frequently contain super-enhancers, which are clusters of enhancers bound by high densities of transcription factors and mediators, driving robust, cell-type-specific expression of developmental genes such as those in the HOX clusters. These super-enhancers often produce enhancer RNAs (eRNAs) that stabilize chromatin loops and amplify signaling.

Intergenic regions also contribute significantly to chromatin organization within topologically associating domains (TADs), self-interacting segments ranging from about 100 kb to 1 Mb in size that compartmentalize the genome in three dimensions. TAD boundaries often reside in intergenic space enriched with insulators like CTCF-binding sites, which prevent ectopic enhancer-promoter contacts and maintain the stable 3D folding essential for coordinated gene expression. Disruption of these intergenic elements can lead to chromatin misfolding and to diseases such as congenital disorders, underscoring their role in preserving spatial genome integrity across eukaryotic species.

In plant genomes, intergenic regions are frequently dominated by transposable elements (TEs), which constitute up to 85% of the non-coding space in some species, facilitating adaptation through epigenetic silencing and insertion-induced variability. These TEs can mobilize under stress, altering the expression of nearby genes and promoting traits such as drought resistance or shifts in flowering time. In animal models such as Drosophila melanogaster, intergenic regions host Polycomb response elements (PREs) that recruit Polycomb repressive complexes to silence developmental genes, ensuring stable epigenetic memory during embryogenesis. These PREs integrate signals from multiple transcription factors to maintain repression patterns.

Advances in single-cell sequencing technologies since 2015 have revealed cell-type-specific transcriptional activity within human intergenic regions, highlighting dynamic enhancer and lncRNA expression that varies across tissues and states. For example, single-cell RNA sequencing of immune cells has identified hundreds of intergenic lncRNAs uniquely upregulated in subsets like T-helper cells, where they modulate immune responses, while chromatin accessibility assays show tissue-specific opening of intergenic super-enhancers in brain neurons versus hepatocytes. These findings emphasize the heterogeneity of intergenic contributions to cellular identity in humans.

Evolutionary Dynamics

Conservation Patterns

Functional intergenic elements, such as promoters and enhancers, exhibit higher levels of sequence conservation than the surrounding spacer sequence because of their regulatory roles. PhastCons scores, which estimate the probability of negative selection on a per-base basis and range from 0 to 1, are notably elevated in these functional regions; for instance, robust cis-regulatory elements in intergenic DNA average around 0.27, while random sequences score approximately 0.03. Protein-coding exons display much higher conservation still, with average PhastCons scores of about 0.65. This disparity reflects the purifying selection that acts on functional non-coding sequences to maintain regulatory integrity.

Selective pressures on intergenic regions vary by function, with strong negative selection preserving regulatory motifs essential for gene control. Transcription factor binding sites and other motifs in intergenic DNA show reduced polymorphism and divergence, indicative of purifying selection, even for moderate-affinity sites. Conversely, certain intergenic regions, particularly those involved in immune responses, experience positive selection; comparisons between the human and chimpanzee genomes reveal accelerated evolution in non-coding sequences near pathogen recognition genes, such as those in the MHC region, in response to selective pressure from infectious agents.

Comparative genomic alignments across mammals, such as those generated by Ensembl's multi-species pipelines, indicate that intergenic regions retain roughly 20-30% sequence identity on average, far lower than the near 100% observed in orthologous exons. These alignments highlight conserved non-coding elements (CNCs) as discrete, highly preserved segments within otherwise variable intergenic space, often comprising less than 5% of the total intergenic sequence but showing exon-like levels of constraint.

Phylogenetic footprinting has been a key method for identifying CNCs, using cross-species alignments to detect evolutionarily stable non-coding sequences likely to harbor regulatory functions. Applied across multiple genomes, this approach uncovers footprints of purifying selection in intergenic regions that align poorly overall but contain motif-rich cores under constraint. Recent pan-genome projects, including the Human Pangenome Reference Consortium, have refined these insights by incorporating structural variants across diverse populations, revealing that conserved intergenic elements exhibit low variability even in non-reference assemblies and thereby updating CNC catalogs at greater resolution.
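The idea behind phylogenetic footprinting, a poorly aligning region containing a short, highly conserved core, can be sketched with a pairwise alignment and a sliding window. This is a deliberately simplified illustration: it assumes two pre-aligned sequences of equal length with `-` for gaps, it uses raw percent identity rather than a phylogenetic model such as the one behind PhastCons, and the function names, window size, threshold, and toy sequences are all hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Fraction of identical bases over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # skip gap columns
        compared += 1
        matches += base_a == base_b
    return matches / compared if compared else 0.0


def conserved_windows(aligned_a, aligned_b, size=10, threshold=0.9):
    """Start positions of alignment windows whose identity meets the threshold."""
    return [
        start
        for start in range(len(aligned_a) - size + 1)
        if percent_identity(aligned_a[start:start + size],
                            aligned_b[start:start + size]) >= threshold
    ]


# Toy alignment (hypothetical): divergent flanks around an identical 10-base core.
species_a = "ACGTTGCAAC" + "TATAAAGGCC" + "GATTACAGGT"
species_b = "TCGATGAAAT" + "TATAAAGGCC" + "CATTGCATGA"

overall_identity = percent_identity(species_a, species_b)
core_starts = conserved_windows(species_a, species_b, size=10, threshold=1.0)
print(f"overall identity: {overall_identity:.2f}")
print("fully conserved windows start at:", core_starts)
```

The contrast between modest overall identity and a perfectly conserved window is the signature that footprinting methods look for, although real pipelines use multi-species alignments and explicit models of neutral substitution.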

Role in Genome Evolution

Intergenic regions serve as mutation hotspots, exhibiting elevated rates of single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) compared to coding sequences. These changes primarily drive neutral evolution, because mutations can accumulate without immediate fitness consequences. Intergenic areas, which often comprise repetitive elements and low-complexity sequences, experience indel mutation rates that are approximately 10% of SNP rates but contribute significantly to genomic diversity through neutral drift. For instance, genome-wide analyses reveal more high-frequency SNV and indel hotspots in intergenic space than predicted by background models, underscoring their role in evolutionary flexibility.

Intergenic duplications exemplify how these regions contribute to the emergence of novel genes, particularly in the evolution of olfactory receptor (OR) gene families. Large-scale, multi-chromosomal duplications originating from intergenic segments have expanded the OR repertoire, with thousands of copies arising through tandem and segmental events that initially reside in non-coding contexts before potential recruitment into functional roles. Such duplications have driven the diversification of OR genes, enabling adaptive responses to environmental olfactory cues via subsequent positive selection on duplicated variants.

Adaptive evolution also draws on non-coding variation near genes, as seen in the lactase persistence trait in humans, where a single nucleotide polymorphism (rs4988235, -13910*T) in an enhancer element approximately 14 kb upstream of the LCT gene, within an intron of the neighboring MCM6 gene, arose around 10,000 years ago in pastoralist populations. This non-coding variant enhances LCT expression after weaning, conferring a selective advantage in dairy-consuming societies and demonstrating how non-coding mutations can spread rapidly under positive selection.

Intergenic regions likewise act as breakpoints for genome rearrangements, including chromosomal inversions and translocations, which reorganize gene order and can promote speciation by suppressing recombination within inverted segments. Evidence from comparative genomics, including reconstructions of ancestral mammalian genomes, shows that many inversion endpoints localize to large intergenic intervals, which minimizes gene disruption and thereby facilitates structural evolution.

Theoretical frameworks building on Motoo Kimura's neutral theory of molecular evolution, proposed in 1968, have been extended to explain intergenic drift, positing that most non-coding mutations are selectively neutral and are fixed by genetic drift rather than by adaptive forces. Subsequent developments, including extensions to eukaryotic non-coding sequences, describe intergenic regions as evolving in a nearly neutral manner, with slightly deleterious variants accumulating at rates governed by effective population size and drift, in contrast to the stronger selection acting on coding areas. This model underscores the intergenic contribution to long-term genomic fluidity without compromising essential functions.
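The central quantitative result of the neutral theory can be stated in one line, and it helps explain why unconstrained intergenic sequence diverges at a steady rate. The derivation below is the standard textbook argument rather than anything specific to this article. The symbols are introduced here for illustration: N is the size of a diploid population, μ is the neutral mutation rate per site per generation, and k is the substitution rate per site per generation.

```latex
\begin{align}
  \text{new neutral mutations per site per generation} &= 2N\mu \\
  \text{fixation probability of each new neutral mutation} &= \frac{1}{2N} \\
  k &= 2N\mu \cdot \frac{1}{2N} = \mu
\end{align}
```

Under strict neutrality the population size cancels, so the substitution rate equals the mutation rate. The dependence on effective population size described above enters only for slightly deleterious variants, whose fixation probability falls below 1/(2N) as the population grows.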

References

  1. [1]
    Intergenic Regions
    Intergenic regions are the stretches of DNA located between genes. In humans, intergenic regions are non-protein-coding and comprise a large majority of the ...
  2. [2]
    What is noncoding DNA?: MedlinePlus Genetics
    Jan 19, 2021 · Noncoding DNA contains sequences that act as regulatory elements, determining when and where genes are turned on and off.
  3. [3]
    Extended intergenic DNA contributes to neuron-specific expression ...
    May 18, 2022 · Intergenic regions contain a large number of cis-regulatory DNA elements, such as enhancers, which perform a variety of functions leading to ...
  4. [4]
    Classification of human genomic regions based on experimentally ...
    For instance, enhancers can be as far as one mega base pairs (1 Mbp) from the target gene in eukaryotes [3], and can be both upstream and downstream of the ...
  5. [5]
    The functions and unique features of long intergenic non-coding RNA
    LincRNAs and mRNAs can positively or negatively regulate the expression of their own genes, or target other genes, by interacting with chromatin-modifying ...
  6. [6]
    3 Characterization of intergenic regions and gene definition - Nature
    Jan 1, 2019 · The prevalence and analysis of ENCODE data are changing the definition and characterization of intergenic and genic regions.
  7. [7]
    The regulatory content of intergenic DNA shapes genome architecture
    Intergenic distance between genes within operons is likely to underestimate the size of DNA used to regulate these genes and this underestimate could contribute ...
  8. [8]
    Extended intergenic DNA contributes to neuron-specific expression ...
    May 18, 2022 · Intergenic regions contain a large number of cis-regulatory DNA elements, such as enhancers, which perform a variety of functions leading to ...
  9. [9]
    Intergenic Region - an overview | ScienceDirect Topics
    An intergenic region is defined as a segment of DNA located between two genes, which may contain regulatory elements or SNPs, and is often the focus of ...
  10. [10]
    The sequence of the human genome - PubMed
    Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. ... Human Genome Project*; Humans; Introns ...
  11. [11]
    Spacer DNA - an overview | ScienceDirect Topics
    Intergenic spacer regions often show a higher degree of variability than the coding genes, making the former more useful for analyses at a lower taxonomic ...
  12. [12]
    Definition of intragenic and intergenic regions - Bio-protocol
    An integration site was defined as being located in the intragenic region if the annotated integration site is located within the gene body of any ...
  13. [13]
    Genome-Wide Analyses in Bacteria Show Small-RNA Enrichment ...
    A third observation of our study is that the average sizes and distributions of intergenic-region lengths are very similar among the species analyzed, ...
  14. [14]
    Genes, pseudogenes, and Alu sequence organization ... - PNAS
    There are a total of 44 pairs of (+,+) genes with median intergenic length 28,950 bp. The median intergenic lengths, 35,568 bp, of (−,−) and 28,905 bp of (+,+) ...
  15. [15]
    An Integrated Encyclopedia of DNA Elements in the Human Genome
    In a pilot phase covering 1% of the genome, the ENCODE project annotated 60% of mammalian evolutionarily constrained bases, but also identified many additional ...
  16. [16]
    The origin, evolution, and functional impact of short insertion ...
    Short insertions and deletions (indels) are the second most abundant form of human genetic variation, but our understanding of their origins and functional ...
  17. [17]
    Homopolymer tract organization in the human malarial parasite ...
    Oct 3, 2014 · Homopolymeric tracts, particularly poly dA.dT, are enriched within the intergenic sequences of eukaryotic genomes where they appear to act ...
  18. [18]
    Genome and sequence determinants governing the expression of ...
    Jun 8, 2020 · Bacterial intergenic regions tend to be lower in GC content than ... According to our model, low GC content bacteria have evolved ...
  19. [19]
    Repetitive DNA sequence detection and its role in the human genome
    Sep 19, 2023 · TRs can be found in intergenic regions and in both the non-coding and coding regions of a variety of genes. Moreover, TRs occur ...
  20. [20]
    Transposable Elements as a Source of Novel Repetitive DNA in the ...
    A number of studies have provided examples of TE sequences that give rise to new repetitive classes, such as microsatellites, minisatellites, and satellite DNA ...
  21. [21]
    The distribution of inverted repeat sequences in the Saccharomyces ...
    Hairpins, the most common RNA secondary structural elements, are produced by intramolecular Watson–Crick binding. The DNA sequence encoding an RNA hairpin must ...
  22. [22]
    Bacterial Transcription Terminators: The RNA 3′-End Chronicles
    Intrinsic termination, sometimes called Rho-independent termination, refers to dissociation of the EC caused solely by interactions of DNA and RNA with RNAP ...
  23. [23]
    Unusual combinatorial involvement of poly-A/T tracts in organizing ...
    We find that Dictyostelium genes are demarcated precisely at their 5′ ends by poly-T tracts and precisely at their 3′ ends by poly-A tracts.
  24. [24]
    An integrated encyclopedia of DNA elements in the human genome
    Sep 5, 2012 · Excluding RNA elements and broad histone elements, 44.2% of the genome is covered. Smaller proportions of the genome are occupied by regions of ...
  25. [25]
    The Evolution of Bacterial Genome Architecture - Frontiers
    May 29, 2017 · Whereas intergenic regions typically constitute 10 ± 5% of a bacterial genome, species subject to drift sometimes can have much greater ...
  26. [26]
    Confining euchromatin/heterochromatin territory: jumonji crosses the ...
    Heterochromatin is typically highly condensed, gene-poor, and transcriptionally silent, whereas euchromatin is less condensed, gene-rich, and more accessible to ...
  27. [27]
    Gene overlapping and size constraints in the viral world
    May 21, 2016 · We sought a unified evolutionary explanation that accounts for their genome sizes, gene overlapping and capsid properties.
  28. [28]
    The maize genome as a model for efficient sequence analysis of ...
    The genomes of flowering plants vary in size from about 0.1 to over 100 gigabase pairs (Gbp), mostly because of polyploidy and variation in the abundance of ...
  29. [29]
    Transcription of histone gene cluster by differential core-promoter ...
    The 100 copies of tandemly arrayed Drosophila linker (H1) and core (H2A/B and H3/H4) histone gene cluster are coordinately regulated during the cell cycle.
  30. [30]
    The size of the genome and the complexity of living beings - Mètode
    Feb 25, 2013 · However, in eukaryotes there is no correlation between genome size and the complexity of the organism. This is known as the C-value paradox. The ...
  31. [31]
    Genome Browser User's Guide
    The Genome Browser offers multiple tools that can correctly convert coordinates between different assembly releases. For more information on conversion tools, ...
  32. [32]
    UCSC Genome Browser Table Browser Tutorial
    The UCSC Table Browser is a flexible tool for accessing and exporting data from genome browser tracks. This tutorial introduces the Table Browser interface and ...
  33. [33]
    Enhancers, gene regulation, and genome organization - PMC
    Apr 1, 2021 · Typically located at long genomic distances from their target genes, enhancers may be in upstream or downstream intergenic regions, in intronic ...
  34. [34]
    Transcriptional regulation by promoters with enhancer function - NIH
    Promoters are located in close proximity to the 5′ end of genes and capable of inducing gene expression.
  35. [35]
    Core Promoters in Transcription: Old Problem, New Insights - PMC
    The TATA box, TATAA, (TSS), the first core promoter element to be identified, was biochemically found to be located 20–30 bp upstream of the transcription start ...
  36. [36]
    Genome-Wide Prediction and Validation of Intergenic Enhancers in ...
    Gene expression in eukaryotes is regulated by the orchestrated binding of regulatory proteins to promoters, enhancers, and other cis-regulatory DNA elements ( ...
  37. [37]
    CTCF: An Architectural Protein Bridging Genome Topology ... - NIH
    Approximately 50% of CTCF binding sites reside within intergenic regions, ~15% are located near promoters and ~40% are intragenic (exons and introns) (Fig.1).
  38. [38]
    CpG islands under selective pressure are enriched with H3K4me3 ...
    Analyzing thirteen human cell lines, we found H3K4me3, H3K27ac and H3K36me3 enrichment in the CGIs that experienced selective events. Further studies using ...
  39. [39]
    The interplay between DNA and histone methylation - PubMed Central
    DNA hypomethylation across intergenic regions and DNA hypermethylation at promoter CpG islands have been described in many cancer contexts, independently of ...
  40. [40]
    A Novel Molecular Switch - PMC - PubMed Central
    The operator of the lac operon, a short stretch of DNA (~17 base pairs), is composed of two nearly identical half sites that is located between the end of the ...
  41. [41]
    Intergenic Transcription in the Human β-Globin Gene Cluster - NIH
    Several kilobases upstream of the ɛ-globin gene are at least five DNase I-hypersensitive sites (HS1 to HS5) which constitute the locus control region (LCR).
  42. [42]
    The landscape of RNA polymerase II transcription initiation in C ...
    Based on the overlap of transcription initiation clusters with mapped transcription factor binding sites, we define 2361 transcribed intergenic enhancers.
  43. [43]
    Conserved DNA sequence features underlie pervasive RNA ...
    Based on their location, pausing sites were classified into one of four major categories: promoter-proximal, gene-body, antisense or intergenic. For defining ...
  44. [44]
    RNA polymerase mapping in plants identifies intergenic regulatory ...
    Our results suggest that bidirectional transcription can identify intergenic genomic regions in plants that play an important role in transcription regulation.
  45. [45]
    Distal CpG islands can serve as alternative promoters to transcribe ...
    We further hypothesized that the tissue-specific usage of CGIs as alternative promoters may be regulated by cell-type–specific transcription factors (TFs).
  46. [46]
    Tumor-specific usage of alternative transcription start sites in ...
    Oct 14, 2011 · Extensive alternative splicing and dual promoter usage generate Tcf-1 protein isoforms with differential transcription control properties.
  47. [47]
    Expression of mouse histone genes: transcription into 3' intergenic ...
    Expression of mouse histone genes: transcription into 3' intergenic DNA and ... mRNA production and stability in serum-stimulated mouse 3T6 fibroblasts.
  48. [48]
    CRISPR‐Cas9–Mediated Genome Editing Confirms EPDR1 ... - NIH
    CRISPR-Cas9 genome editing in the hFOB1.19 cell model supports previous observations, where this regulatory region harboring GWAS-implicated variation operates ...
  49. [49]
    From GWAS signal to function: targeted CRISPR activation enables ...
    Oct 1, 2025 · Our study demonstrates that activating genomic regions harboring specific non-coding GWAS SNPs can modulate gene expression, suggesting that ...
  50. [50]
    Widespread divergent transcription from bacterial and archaeal ...
    Bidirectional promoters enable co-regulation of divergent genes and are enriched in both intergenic and horizontally acquired regions. Divergent transcription ...
  51. [51]
    Growth Temperature and Genome Size in Bacteria Are Negatively ...
    Apr 5, 2013 · Specifically, with increasing habitat temperature and decreasing genome size, the proportion of genomic DNA in intergenic regions decreases.
  52. [52]
    Rho directs widespread termination of intragenic and stable RNA ...
    We found ≈200 Rho-terminated loci that were divided evenly into 2 classes: intergenic (at the ends of genes) and intragenic (within genes).
  53. [53]
    Regulation of Bacterial Gene Expression by Transcription Attenuation
    The key elements required for attenuation control of trp operon expression are found in a 162-bp leader region, which is defined as the region between the ...
  54. [54]
    Adaptive evolution of hybrid bacteria by horizontal gene transfer - PMC
    We conclude that HGT opens windows of positive selection for the subsequent evolution by point mutations; this effect is most pronounced in intergenic regions.
  55. [55]
    Mobile Genetic Elements Associated with Antimicrobial Resistance
    This review aims to outline the characteristics of the major types of mobile genetic elements involved in acquisition and spread of antibiotic resistance
  56. [56]
    A Rapid, Large-Scale Pan-Genome Analysis Tool for Intergenic ...
    Apr 1, 2018 · However, despite overwhelming evidence that variation in intergenic regions in bacteria can directly influence phenotypes, most current ...
  57. [57]
    Functional metagenomic analysis of quorum sensing signaling in a ...
    Oct 28, 2021 · We performed a metagenomic screen for AHL genes in an activated sludge microbial community from the Ulu Pandan wastewater treatment plant (WWTP) in Singapore.
  58. [58]
    Long non-coding RNAs: definitions, functions, challenges ... - Nature
    Jan 3, 2023 · Most lncRNAs evolve more rapidly than protein-coding sequences, are cell type specific and regulate many aspects of cell differentiation and ...
  59. [59]
    Gene regulation by long non-coding RNAs and its biological functions
    Dec 22, 2020 · Evidence accumulated over the past decade shows that long non-coding RNAs (lncRNAs) are widely expressed and have key roles in gene regulation.
  60. [60]
    Superenhancers as master gene regulators and novel therapeutic ...
    Feb 1, 2023 · Superenhancers (SEs), identified as novel epigenetic regulatory elements, are clusters of enhancers with cell-type specificity that can drive the aberrant ...
  61. [61]
    Principles of genome folding into topologically associating domains
    Apr 10, 2019 · The genome of many species is organized into domains of preferential internal chromatin interactions called “topologically associating domains” (TADs).
  62. [62]
    Evolutionary stability of topologically associating domains is ...
    Aug 7, 2018 · TADs contribute to gene regulation by restricting chromatin interactions of regulatory sequences, such as enhancers, with their target genes.
  63. [63]
    Novel Insights into Plant Genome Evolution and Adaptation as ...
    Here, we review some of the most updated examples on the roles of transposable elements (TEs) in plant genome evolution and adaptation through epigenetics ...
  64. [64]
    Transposable Elements Contribute to the Adaptation of Arabidopsis ...
    Aug 9, 2018 · Our results highlight the importance of variations in TEs for the adaptation of plants in general in the context of rapid global climate change.
  65. [65]
    Polycomb Group Response Elements in Drosophila and Vertebrates
    In Drosophila, there are specific regulatory DNA elements called Polycomb group response elements (PREs) that bring PcG protein complexes to the DNA. Drosophila ...
  66. [66]
    A single-cell atlas of chromatin accessibility in the human genome
    Nov 24, 2021 · This rich resource provides a foundation for the analysis of gene regulatory programs in human cell types across tissues, life stages, and organ systems.
  67. [67]
    Cell type-specific novel long non-coding RNA and circular RNA in ...
    We identified hundreds of novel non-coding RNA genes and showed that the majority have cell type-dependent expression.
  68. [68]
    Epigenome and interactome profiling uncovers principles of distal ...
    Oct 10, 2025 · In this calculation, the mean PhastCons scores were 0.274 for robust cCREs, 0.179 for E7 segments, 0.648 for exons, and 0.031 for the random ...
  69. [69]
    Conservation Track Settings - UCSC Genome Browser
    The phastCons scores, by contrast, represent probabilities of negative selection and range between 0 and 1.
  70. [70]
    Conservation and regulatory associations of a wide affinity range of ...
    We found that not only high affinity binding sites, but also numerous moderate and low affinity binding sites, are under negative selection in the mouse genome.
  71. [71]
    Comparative sequencing of human and chimpanzee MHC class I ...
    This report describes a large-scale single-contig comparison between human and chimpanzee genomes via the sequence analysis of almost one-half of the ...
  72. [72]
    Multiple genome alignments - Ensembl
    Multiple alignments are calculated between groups of genomes. These are used to calculate ancestral sequences, age of base, conservation scores and constrained ...
  73. [73]
    Genomic Locations of Conserved Noncoding Sequences and Their ...
    Mar 26, 2016 · The conservation levels of the CNSs are significantly higher than those of random sequences and lincRNA exons. Purifying selection on CNSs is ...
  74. [74]
    Conserved non-coding elements and cis regulation
    Apr 1, 2013 · Phylogenetic footprinting. A technique to identify potential CRMs within conserved non-coding sequences through comparison with orthologous ...
  75. [75]
    Strong Heterogeneity in Mutation Rate Causes Misleading ...
    Given that we focused our analyses on noncoding regions, which are essentially neutrally evolving, these indel hotspots are unlikely to result from selection, ...
  76. [76]
    High rate of mutation and efficient removal by selection of structural ...
    The inferred SV mutation rate is roughly 10% of the SNV rate and ~30% of the short indel rate, indicating that SVs comprise about 8% of new mutations, or ...
  77. [77]
    The landscape and driver potential of site-specific hotspots across ...
    May 13, 2021 · Genome-wide we find more high-frequency SNV and indel hotspots than expected given mutational background models. ... neutral somatic mutation rate ...
  78. [78]
    Large multi-chromosomal duplications encompass many members ...
    The human genome contains thousands of genes that encode a diverse repertoire of odorant receptors (ORs). We report here on the identification and ...
  79. [79]
    Complex Evolution of 7E Olfactory Receptor Genes in Segmental ...
    Most OR genes have arisen by local duplication, but some, especially in humans, have duplicated interchromosomally (Trask et al.
  80. [80]
    On the Evolution of Lactase Persistence in Humans - Annual Reviews
    Aug 31, 2017 · The human lactase persistence-associated SNP −13910*T enables in vivo functional persistence of lactase promoter-reporter transgene expression.
  81. [81]
    Reconstructing the History of Yeast Genomes | PLOS Genetics
    May 15, 2009 · Fourth, if an endpoint of two inversions or translocations falls in a large intergenic region between two genes, it becomes less clear ...
  82. [82]
    Reconstruction of ancestral chromosome architecture and gene ...
    May 31, 2016 · We retraced all chromosomal rearrangements, including gene losses, gene duplications, chromosomal inversions and translocations at single gene ...
  83. [83]
    The importance of the Neutral Theory in 1968 and 50 years on - PMC
    The Neutral Theory of Molecular Evolution asserts that most de novo mutations are either sufficiently deleterious in their effects on fitness that they have ...
  84. [84]
    Neutral Theory, Transposable Elements, and Eukaryotic Genome ...
    Apr 23, 2018 · Kimura's fundamental concept of neutral mutation-random drift, which was published 50 years ago, is re-examined in light of its pervasive influence on ...