
Junk DNA

Junk DNA, a term coined by geneticist Susumu Ohno in 1972, refers to sequences in the genome that were originally hypothesized to have no selective advantage or functional role in the organism's fitness. The concept emerged to address the C-value paradox, the observation that genome sizes vary widely across species without correlating with organismal complexity. For instance, the human genome is approximately 3.2 gigabases, while the onion's is about 16 gigabases, suggesting that much of the DNA is non-functional. In humans, roughly 98% of the genome consists of non-coding DNA, including introns (approximately 26%), pseudogenes (approximately 1–2%), and transposable elements (approximately 45%).

The term "junk DNA" quickly became controversial because it implied evolutionary waste, but early evidence from genomic studies supported the idea that much of this DNA accumulates via neutral evolution without contributing to fitness. Ohno's proposal was rooted in the limited number of protein-coding genes, estimated at 1.5–2.5% of the genome, which left the majority as potential "junk" that could proliferate without deleterious effects due to eukaryotic genome organization. However, the 2012 ENCODE project asserted that at least 80% of the genome shows biochemical activity, sparking debates over what constitutes "function": whether it requires evolutionary selection (selected effect) or merely a causal role in cellular processes.

Subsequent research has revealed that while a significant portion remains non-functional under strict evolutionary criteria, much non-coding DNA plays essential regulatory roles, such as controlling gene expression through enhancers and long non-coding RNAs and contributing to structural maintenance. For example, non-coding sequences near the OCT4 gene regulate development of the vertebral column, influencing differences between mice and snakes by modulating growth. These elements also contribute to genome stability and responses to environmental cues in mammals, challenging the original "junk" label and highlighting exaptation, in which neutral sequences gain function over time.
Today, the consensus views junk DNA as a mix of truly dispensable sequences and functionally important non-coding regions that drive genomic complexity and evolution.

Definitions and Concepts

Definition of Junk DNA

Junk DNA refers to portions of the genome that lack direct protein-coding function and are presumed to evolve without significant selective pressure, allowing them to accumulate mutations freely. The term was coined by geneticist Susumu Ohno in his 1972 paper, where he described it as DNA sequences in the mammalian genome that do not encode functional polypeptides or RNAs, arising largely from gene duplications that become redundant over evolutionary time. This conceptualization emphasized the excess genomic material beyond what is necessary for basic cellular functions, highlighting a paradox in genome size variation across species.

The human genome consists of approximately 98% non-coding DNA, of which junk DNA (presumed non-functional non-genic, repetitive, or intergenic regions) forms a substantial but debated portion. These regions show patterns of neutral evolution, distinguishing them from the roughly 1–2% of the genome occupied by protein-coding exons. Junk DNA is typically viewed as a subset of non-coding DNA, the latter encompassing all sequences outside of exons without implying presumed uselessness.

Representative examples of junk DNA include satellite DNA, which consists of long tandem repeats clustered at centromeres and telomeres; transposons, such as LINEs and SINEs, which make up about 45% of the genome and were initially regarded as parasitic or selfish elements; and pseudogenes, which are inactivated copies of functional genes that no longer produce viable products. These components illustrate the non-informational, sequence-independent nature often attributed to junk DNA, in sharp contrast to the precise, conserved structure of exons, which encode the amino acid sequences of proteins.

Relation to Non-Coding DNA

Non-coding DNA encompasses all genomic sequences that do not directly encode proteins, including introns, untranslated regions (UTRs), and regulatory elements such as promoters, enhancers, and silencers. In humans, non-coding DNA comprises approximately 98–99% of the total DNA, with only 1–2% consisting of protein-coding sequence.

The concept of junk DNA emerged as a historical interpretation of much of this non-coding DNA, presuming it to be non-functional and superfluous to the organism's biology. Coined by Susumu Ohno in 1972, the term described non-protein-coding regions that appeared to lack utility, originally estimating such "junk" to constitute over 90% of the genome based on the limited gene counts available at the time. Thus, junk DNA represents a subset of non-coding DNA, reflecting an early interpretive label rather than a comprehensive classification.

A key distinction lies in their conceptual foundations: non-coding DNA is a descriptive category defined by the absence of protein-coding instructions, whereas junk DNA is interpretive, implying a lack of biological usefulness or contribution to fitness. Not all non-coding DNA qualifies as junk; for example, ribosomal RNA (rRNA) genes are non-coding yet essential for ribosome assembly and protein synthesis. Non-coding DNA thus serves as an umbrella term, while junk DNA specifically denotes the presumed non-functional portions within it.

Historical Development

Early Discoveries and Paradoxes

In the 1940s and 1950s, early biochemical and cytochemical techniques, such as Feulgen microspectrophotometry, enabled the first measurements of nuclear DNA content across species, revealing striking variations in genome size that defied expectations of a direct link to organismal complexity. For instance, early measurements reported that the single-celled eukaryote Amoeba proteus has a haploid genome of approximately 290 gigabase pairs (Gbp), roughly 100 times larger than the human genome at about 3 Gbp; later studies, however, revised this to around 5.4 Gbp. These findings, initially reported through photometric assays of DNA staining, showed that even simple protists could possess vastly more DNA than multicellular animals, prompting questions about the functional necessity of such excess genetic material.

This discrepancy culminated in the formalization of the C-value paradox in 1971, named by Charles A. Thomas Jr., which described the lack of correlation between an organism's morphological or developmental complexity and its haploid DNA content (C-value). Thomas emphasized that while prokaryotes like Escherichia coli have compact genomes of around 4.6 megabase pairs (Mbp), many eukaryotes exhibit enormous expansions; for example, certain salamanders possess C-values exceeding 80 picograms (pg) of DNA per haploid nucleus, more than 20 times the human value of about 3.3 pg, despite lacking proportionally greater complexity. The paradox suggested that much of the DNA in eukaryotic genomes might not contribute directly to essential genetic functions, as genome sizes varied by orders of magnitude without corresponding increases in gene number or organismal sophistication.

Further insights came from Cot analysis in the late 1960s, developed by Roy J. Britten and David E. Kohne, which used DNA reassociation kinetics to quantify sequence repetition and genome complexity. By denaturing DNA and measuring the rate at which complementary strands reannealed under controlled conditions (plotted as Cot curves, where Cot is the product of initial DNA concentration and time), they demonstrated that eukaryotic genomes contain vast numbers of highly repetitive sequences that reassociate rapidly (at low Cot values), along with moderately repetitive ones. Together these comprise a significant fraction of the genome, such as about 45% in calf thymus DNA, and are distinct from slowly reassociating unique or single-copy sequences. These repetitive elements indicated that much of eukaryotic DNA was non-genic and reiterated many times, amplifying the puzzle of why such abundant, seemingly redundant material persisted in genomes.

During the 1970s, additional challenges to the one-gene-one-polypeptide model emerged from observations of heterogeneous nuclear RNA (hnRNA) in eukaryotic cells, which was found to be significantly longer than the messenger RNA (mRNA) that reaches the cytoplasm. Pioneering electron microscopy and hybridization experiments, such as those by Susan M. Berget, Claire Moore, and Phillip A. Sharp in 1977, revealed that adenovirus transcripts contained intervening sequences that were excised during processing to form functional mRNA, pointing to a discontinuous gene structure that overturned the assumption of colinear transcription and translation. These splicing observations underscored how much of the transcribed DNA, primarily non-coding, did not directly encode proteins, reinforcing the paradoxes posed by genome size and repetition.
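The reassociation kinetics behind Cot analysis can be illustrated with the ideal second-order model, in which the fraction of DNA remaining single-stranded is 1 / (1 + k·C0t). The sketch below is only illustrative: the three kinetic components, their genome fractions, and their rate constants are hypothetical values chosen to show the shape of a multi-component curve, not measurements from Britten and Kohne or from any particular genome.

```python
def fraction_single_stranded(cot, k):
    """Ideal second-order reassociation: C/C0 = 1 / (1 + k * C0t)."""
    return 1.0 / (1.0 + k * cot)


def cot_half(k):
    """C0t value at which half of a single component has reassociated."""
    return 1.0 / k


def mixed_curve(cot, components):
    """Sum the ideal curves of several kinetic components.

    components: list of (genome_fraction, rate_constant) pairs whose
    fractions are assumed to sum to 1.
    """
    return sum(frac / (1.0 + k * cot) for frac, k in components)


# Hypothetical three-component genome (illustrative numbers only).
COMPONENTS = [
    (0.10, 1e3),   # highly repetitive: reassociates at very low C0t
    (0.30, 1e0),   # moderately repetitive
    (0.60, 1e-3),  # unique / single-copy: reassociates only at high C0t
]

if __name__ == "__main__":
    # Sample the curve on a log scale, as Cot curves are usually plotted.
    for exponent in range(-4, 5):
        cot = 10.0 ** exponent
        remaining = mixed_curve(cot, COMPONENTS)
        print(f"C0t = 1e{exponent:+d}  single-stranded fraction = {remaining:.3f}")
```

Because each component's half-reassociation point sits at C0t = 1/k, highly repeated sequences (large k) drop out of the single-stranded pool first, which is how repetitive and single-copy fractions can be told apart on a single curve.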

Origin and Evolution of the Term

The term "junk DNA" was first coined by geneticist Susumu Ohno in his 1972 paper presented at the Brookhaven Symposium in Biology, where he proposed that the vast majority of eukaryotic genomes consists of non-genic DNA without direct informational content for protein synthesis. Ohno's conceptualization built upon earlier ideas from the 1940s, particularly those of Nobel laureate Hermann J. Muller, who argued that much of the chromosomal material beyond essential genes was likely superfluous or "useless" given estimates of limited gene numbers relative to genome sizes. This notion arose in response to the C-value paradox, the observation that genome sizes vary widely across species without corresponding differences in complexity.

In the decades that followed, the term gained widespread popularity, including in the media, and was often invoked to explain apparent "genome bloat" from repetitive sequences and pseudogenes. Evolutionary biologist Richard Dawkins referenced "junk" DNA in his 1976 book The Selfish Gene, portraying it as parasitic or inert material that proliferates without benefiting the organism, thereby embedding the concept in public and academic discourse. During this period the phrase was adopted in explanations of why organisms like humans possess far more DNA than seemingly required for coding purposes, reinforcing its role as a label for evolutionary byproducts.

By the 2000s, the term's usage began to shift amid growing evidence of regulatory elements within non-coding regions, rendering "junk DNA" increasingly controversial among researchers. However, scientists like Dan Graur defended its application, arguing that it remains apt for truly neutral sequences that neither contribute to nor detract from fitness, which distinguishes them from functional or deleterious elements. This evolution reflects ongoing semantic refinement rather than outright abandonment.

The term has also permeated public discourse, often leading to misconceptions that equate "junk" DNA with outright evolutionary waste or irrelevance, overshadowing nuanced scientific views on genomic neutrality. Such interpretations have influenced popular media and educational materials, sometimes amplifying the debate.

Scientific Debates

Functional vs. Non-Functional Perspectives

The debate on junk DNA revolves around two primary perspectives: the functionalist view, which posits that the majority of the genome is under purifying selection and thus contributes to organismal fitness, and the neutralist view, which argues that much of it evolves neutrally through genetic drift, accumulating as non-functional sequences. This theoretical tension shapes interpretations of genome evolution, with functionalists emphasizing selective constraints that preserve adaptive elements, while neutralists highlight the prevalence of neutral mutations fixed by random processes.

From the neutralist perspective, inspired by Kimura's neutral theory of molecular evolution, most variation at the molecular level arises from neutral mutations that neither benefit nor harm fitness, leading to the fixation of such changes via genetic drift. This framework explains the abundance of non-coding DNA as largely non-functional "junk," including selfish genetic elements like transposable elements that proliferate autonomously without benefiting the host genome. Orgel and Crick's hypothesis of selfish DNA further supports this, proposing that such sequences act as parasitic replicators, spreading through populations by outcompeting host genes in replication efficiency rather than through positive selection.

In contrast, the functionalist perspective criticizes the neutral theory for underestimating the role of purifying selection, asserting that deleterious mutations are efficiently removed across much of the genome, including non-coding regions, thereby minimizing true junk DNA. Proponents argue that widespread selective constraints maintain functional integrity, citing lower substitution rates in constrained non-coding sequences than in unconstrained ones. Key metrics of selective constraint, such as sequence conservation scores across species, reveal patches of non-coding DNA that evolve slowly under negative selection, indicating functionality beyond protein-coding regions.

Supporting evidence for these views comes from comparative analyses of mutation and substitution rates: nonsynonymous substitutions in coding regions occur at rates far lower than synonymous ones, reflecting strong purifying selection against amino acid changes, while synonymous rates in coding DNA approximate those in unconstrained non-coding DNA, suggesting neutral evolution in the latter. This disparity underscores that while coding regions face intense constraint, much non-coding DNA experiences weaker selection, allowing drift to dominate. Evolutionarily, non-functional DNA sequences can serve as raw material for innovation, such as through duplication events that occasionally yield novel genes under subsequent selection.

ENCODE Project and Its Controversies

The Encyclopedia of DNA Elements (ENCODE) project, launched in 2003 by the National Human Genome Research Institute, aimed to identify all functional elements in the human genome through large-scale genomic assays. In 2012, the consortium released a series of 30 papers in Nature and other journals, including a flagship integrative analysis, asserting that at least 80% of the genome exhibits biochemical activity, such as transcription, histone modifications, or transcription factor binding sites. This claim was based on data from over 1,000 experiments across multiple cell types, suggesting pervasive regulatory roles beyond protein-coding regions.

The 2012 findings sparked significant controversy, primarily over the definition of "function" and the interpretation of biochemical signals as evidence of biological utility. Critics, led by evolutionary biologist Dan Graur, argued in a 2013 Genome Biology and Evolution paper that ENCODE's broad criteria, which equated any detectable biochemical activity with function, ignored evolutionary principles, under which true function requires selective pressure for fitness effects rather than mere noise or transient interactions in assays. Graur dubbed this the "80% fallacy," contending that such estimates could include non-selective artifacts like pervasive low-level transcription, potentially inflating the functional fraction to levels incompatible with observed mutation rates. ENCODE co-leader Ewan Birney defended the work in blog posts and interviews, clarifying that the 80% figure described biochemical activity, not necessarily selected function, and emphasizing its value as a resource for discovering context-dependent roles.

Following the backlash, subsequent analyses refined ENCODE's estimates using evolutionary conservation and genetic perturbation data, lowering the proportion of likely functional DNA. A 2014 study in PLOS Genetics, integrating multiple lines of evidence, estimated that only 8.2% (with a 95% confidence interval of 7.1–9.2%) of the human genome is under purifying selection and thus functional. Other works from 2014 to 2017, including a PNAS review and a Genome Biology and Evolution analysis, suggested upper limits of around 20–25% when accounting for regulatory elements, noting that biochemical assays often detect transient or cell-type-specific signals not indicative of broad utility. ENCODE's Phase 3, with data releases continuing after its completion, shifted focus toward context-specific activity by expanding assays to over 1,300 cell types and tissues, revealing that many elements function only in particular developmental or environmental contexts rather than universally. These refinements have tempered the original claims while affirming ENCODE's role in mapping dynamic genomic regulation.

Known Functions

Regulatory and Structural Roles

Much of what was once termed junk DNA consists of regulatory elements that control gene expression, including promoters, enhancers, silencers, and insulators. Promoters are DNA sequences located near the transcription start sites of genes, serving as binding sites for RNA polymerase and transcription factors to initiate transcription. Enhancers, often distant from their target genes, loop to interact with promoters via chromatin folding, boosting transcription in a tissue-specific manner; the ENCODE project has mapped approximately 1 million such enhancer candidates across the human genome, many residing in non-coding regions. Silencers repress transcription by recruiting repressive complexes, while insulators prevent unwanted interactions between enhancers and promoters, thereby delineating functional genomic domains.

Non-coding RNAs (ncRNAs) transcribed from these regions play crucial roles in post-transcriptional and epigenetic regulation. Long non-coding RNAs (lncRNAs), typically longer than 200 nucleotides, modulate gene expression by interacting with chromatin-modifying complexes or serving as scaffolds for protein assemblies; for instance, the lncRNA Xist coats one X chromosome in female mammals to trigger X-chromosome inactivation, ensuring dosage compensation between sexes by silencing X-linked genes. MicroRNAs (miRNAs), short ncRNAs of about 22 nucleotides, primarily exert post-transcriptional control by binding to messenger RNA (mRNA) targets, leading to their degradation or translational repression; in humans, over 1,000 miRNA genes have been identified, regulating a substantial portion of protein-coding transcripts.

Recent studies have further suggested that non-coding DNA can help cells sense environmental cues and regulate cell fate. For example, certain repetitive non-coding sequences are reported to enable cells to detect and respond to external signals, with potential therapeutic implications.

In addition to regulation, non-coding DNA fulfills essential structural functions in chromosome architecture and stability. Telomeres, repetitive non-coding sequences at chromosome ends (TTAGGG in humans), protect against DNA degradation and fusion events and are maintained by the enzyme telomerase to counteract replicative shortening. Centromeres, large blocks of repetitive non-coding DNA enriched in alpha-satellite sequences, assemble kinetochores to facilitate accurate chromosome segregation during cell division. Interactions with the nuclear lamina, a meshwork of intermediate filaments lining the inner nuclear membrane, anchor non-coding regions to maintain three-dimensional genome folding; for example, lamina-associated domains (LADs) often encompass heterochromatic non-coding sequences, influencing chromatin compaction and gene positioning.

Ultraconserved elements (UCEs), stretches of non-coding DNA over 200 base pairs with 100% sequence identity across humans, mice, and rats, exemplify conserved regulatory functions; many UCEs act as enhancers driving tissue-specific expression of developmental genes, such as those in the Hox clusters. Evidence from CRISPR-Cas9 knockouts further demonstrates functionality: targeted deletions of non-coding enhancers, like those regulating the SOX9 gene, result in limb malformations in mouse models, mirroring human congenital disorders; similarly, excising ultraconserved elements near the DLX5/6 locus disrupts craniofacial development. The ENCODE project aided in mapping these elements, highlighting their biochemical activity.

Evolutionary and Other Functions

Non-coding DNA, once dismissed as junk, plays pivotal roles in evolutionary innovation through transposable elements (TEs), which constitute approximately 45% of the human genome and have been co-opted for new functions over time. TEs such as Alu elements, short interspersed nuclear elements (SINEs) that expanded dramatically in primate lineages, act as drivers of genomic novelty by inserting into regulatory regions, creating lineage-specific enhancers that influence gene expression during development and stress responses. For instance, Alu-derived motifs have fine-tuned inflammatory responses in humans and other primates, contributing to adaptations in immune function and disease susceptibility. These elements, originally selfish replicators, have been exapted to promote evolutionary flexibility, enabling rapid adaptation without disrupting core coding sequences.

As of 2025, research has highlighted additional evolutionary roles, including that of ancient viral DNA embedded in the genome, known as endogenous retroviruses (ERVs), in early human development. These sequences, once considered junk, regulate gene expression in embryonic stages, influencing primate-specific developmental traits.

Pseudogenes, inactivated gene copies derived from duplication or retrotransposition, serve as evolutionary backups by providing raw material for regulatory innovation. Many pseudogenes produce non-coding RNAs that modulate parent gene expression, acting as decoys for microRNAs or as competing endogenous RNAs (ceRNAs) to fine-tune pathways like stress signaling. In mammals, retrogene pseudogenes have buffered conserved pathways, such as those involved in resilience to environmental stressors, by retaining partial functionality that can be reactivated under evolutionary pressure. This reservoir of sequences allows for reversible loss and potential novelty, accelerating divergence in gene families without immediate fitness costs.
In adaptive contexts, non-coding DNA facilitates key evolutionary transitions, including sex-determination systems where repetitive sequences on sex chromosomes promote differentiation and suppress recombination. The accumulation of such elements on the Y chromosome, including near the SRY gene—a master regulator of male development derived from an ancestral SOX3 duplication—has driven the morphogenesis of sex chromosomes across vertebrates, enabling sexual dimorphism. Similarly, in the immune system, non-coding flanks containing recombination signal sequences (RSS) enable V(D)J recombination, shuffling variable (V), diversity (D), and joining (J) segments to generate vast antibody and T-cell receptor diversity essential for pathogen recognition. These mechanisms, embedded in non-coding regions, underpin adaptive immunity's evolutionary success by allowing hypermutation and combinatorial assembly. Beyond direct adaptation, non-coding DNA provides other utilities, such as buffering against deleterious mutations through heterochromatin-mediated silencing, which compacts repetitive regions to prevent gross chromosomal rearrangements and stabilize the genome during replication. Symbiotic contributions from endogenous retroviruses (ERVs), viral remnants integrated into the genome, further illustrate this; for example, ERV-derived syncytin genes encode fusogenic proteins critical for trophoblast fusion and placenta formation in eutherian mammals, a co-option that facilitated viviparity's evolution. Additionally, as of 2025, non-coding elements have been shown to have therapeutic potential, such as in destroying cancer cells by activating immune responses, transforming junk DNA into a tool for oncology. These roles highlight how erstwhile junk DNA has been repurposed for long-term evolutionary stability and innovation.

Current Understanding

Evidence for Truly Non-Functional DNA

Genomic analyses reveal signatures of neutral evolution in large portions of the human genome, particularly in intergenic regions, where substitution rates align closely with neutral expectations and sequence conservation is minimal. These regions accumulate substitutions at rates comparable to synonymous sites in coding sequences, indicating a lack of purifying selection. For instance, comparative alignments show that approximately 80–90% of the genome lacks evolutionary constraint, with intergenic sequences exhibiting high variability across mammals, consistent with non-functional status. A 2023 study using 240 mammalian genomes estimated that only 10.7% (332 Mb) of the human genome is under purifying selection, leaving the majority subject to neutral drift. This low level of conservation supports the persistence of truly non-functional DNA, as functional elements would be expected to show stronger selective pressure.

Comparative genomics further underscores the existence of non-functional DNA through lineage-specific expansions that lack cross-species homologs or detectable effects. In rodents, for example, the rat genome has undergone substantial amplification of transposable elements, such as rodent-specific LINEs, which constitute up to 30% of the sequence and are absent or divergent in other mammals. These expansions do not correlate with conserved regulatory motifs or phenotypic traits unique to rodents, suggesting they represent neutral accumulations without selective advantage. The rat genome assembly indicates that rat-specific repeats occupy ~15% of the sequence and rodent-specific repeats another ~8%, with no evidence of functional recruitment in other lineages. Such patterns indicate that much of the repetitive content arises via unchecked proliferation rather than adaptation.

Experimental manipulations provide direct evidence that substantial non-functional DNA can be removed without fitness consequences. Large-scale deletion studies in mice have targeted non-conserved intergenic regions, yielding viable offspring with no observable phenotypic impacts. Seminal work deleted over 1 Mb of gene deserts, non-coding intervals flanking developmental genes, resulting in homozygous mice indistinguishable from wild-type controls, despite removing more than a thousand conserved non-coding sequences. Similarly, targeted removal of four ultraconserved non-coding elements (out of 481 identified), each spanning up to 731 base pairs with 100% identity across human, mouse, and rat, produced fertile mice lacking any gross abnormalities, challenging assumptions of indispensability for highly conserved sequences. Evolutionary modeling further constrains the functional fraction to 8.2–15%, implying that 85–92% is non-functional based on mutational load tolerances. These results collectively affirm the prevalence of genuinely neutral DNA in mammalian genomes.
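The mutational load reasoning mentioned above can be made concrete with a simple calculation. The following is a minimal sketch, assuming a classical multiplicative-fitness model in which mean fitness relative to a mutation-free genotype is exp(-U), where U is the number of new deleterious mutations per diploid genome per generation. The haploid genome size, per-site mutation rate, and the share of mutations in functional DNA that are deleterious are illustrative assumptions, not values taken from the studies cited in this article.

```python
import math


def deleterious_mutations_per_generation(
    functional_fraction,
    haploid_genome_bp=3.1e9,   # assumed haploid genome size
    mutation_rate=1.2e-8,      # assumed mutations per site per generation
    deleterious_share=0.4,     # assumed share of functional-site mutations that are harmful
):
    """Expected new deleterious mutations per diploid genome (U)."""
    new_mutations = 2 * haploid_genome_bp * mutation_rate
    return new_mutations * functional_fraction * deleterious_share


def mean_relative_fitness(u):
    """Mean fitness relative to a mutation-free genotype: exp(-U)."""
    return math.exp(-u)


def required_offspring_per_survivor(u):
    """Offspring needed per individual so that one mutation-free-equivalent survives."""
    return math.exp(u)


if __name__ == "__main__":
    # Compare the functional fractions discussed in the text with ENCODE's 80%.
    for fraction in (0.082, 0.15, 0.25, 0.80):
        u = deleterious_mutations_per_generation(fraction)
        print(
            f"functional fraction {fraction:.3f}: U = {u:.2f}, "
            f"required offspring ~ {required_offspring_per_survivor(u):.3g}"
        )
```

Under these assumptions the required reproductive output grows exponentially with the functional fraction, which is the intuition behind load-based upper bounds; different parameter choices or fitness models would shift the exact numbers.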

Future Directions and Research Gaps

Advancements in single-cell assays are essential to resolve the functional contributions of non-coding DNA at cellular resolution, enabling researchers to differentiate subtle regulatory activities from stochastic noise in heterogeneous tissues. Single-cell and multi-omics approaches, such as SDR-seq, allow integration of DNA accessibility, RNA expression, and genetic variants to map non-coding effects in individual cells, addressing limitations of bulk assays that mask cell-type specificity. Complementing these, AI-driven models like AlphaGenome predict how non-coding variants influence gene regulation and expression across genomic contexts, improving on earlier methods by handling long DNA sequences and variant combinations with high accuracy. These technologies aim to extend beyond the ENCODE project's broad biochemical signals, which often conflate potential activity with proven function.

Key research gaps persist in understanding the context-dependency of non-coding functions, where regulatory roles may emerge only under specific environmental or developmental cues, complicating universal classifications of "junk." For instance, cis-regulatory elements exhibit conserved yet context-specific effects on gene expression, varying by cell type or stress conditions, yet systematic studies in diverse scenarios remain limited. Additionally, investigations into non-model organisms lag, as most data derive from humans and common models like mice, overlooking evolutionary insights from species with vastly different non-coding landscapes.

Emerging questions center on the role of non-coding elements, particularly repetitive sequences like transposable elements, in pathogenesis, such as cancer progression through genomic instability or immune evasion. In cancer, dysregulation of these elements can drive tumor evolution, yet causal mechanisms and therapeutic targets require further elucidation. Genome engineering offers a pathway to test neutrality by introducing insertions or deletions in non-coding regions of model organisms, identifying phenotypically neutral sites to probe evolutionary constraints without disrupting essential functions.

As of 2025, integrating pangenomics with non-coding analysis highlights population-level variations in these regions, revealing how structural variants and alleles in "junk" DNA contribute to adaptive traits and disease susceptibility across diverse ancestries. This approach uncovers hidden regulatory diversity, challenging static views of non-coding neutrality.

References

  1. [1]
    What We Talk About When We Talk About “Junk DNA” - PMC
    May 10, 2022 · “Junk DNA” is a popular yet controversial concept that states that organisms carry in their genomes DNA that has no positive impact on their fitness.
  2. [2]
    The Case for Junk DNA - PMC - PubMed Central
    May 8, 2014 · Today, “junk DNA” is often used in the broad sense of referring to any DNA sequence that does not play a functional role in development, ...
  3. [3]
    'Junk DNA' tells mice—and snakes—how to grow a backbone
    Aug 1, 2016 · Scientists began discovering junk DNA sequences in the 1960s. These stretches of the genome—also known as noncoding DNA—contain the same genetic ...Missing: review | Show results with:review
  4. [4]
    So much "junk" DNA in our genome - PubMed
    So much "junk" DNA in our genome. Brookhaven Symp Biol. 1972:23:366-70. Author. S Ohno. PMID: 5065367. No abstract available. MeSH terms.
  5. [5]
    What is noncoding DNA?: MedlinePlus Genetics
    Jan 19, 2021 · Only about 1 percent of DNA is made up of protein-coding genes; the other 99 percent is noncoding. Noncoding DNA does not provide ...Missing: credible | Show results with:credible
  6. [6]
    On causal roles and selected effects: our genome is mostly junk
    Dec 5, 2017 · The conundrum that Susumu Ohno, often credited as having first formally promoted the term “junk DNA”, highlighted in his 1972 paper [1] is still ...
  7. [7]
    Non-Coding DNA - National Human Genome Research Institute
    Non-coding DNA corresponds to the portions of an organism's genome that do not code for amino acids, the building blocks of proteins.Missing: percentage | Show results with:percentage
  8. [8]
    A Statistical Framework to Predict Functional Non-Coding Regions ...
    May 27, 2015 · It is estimated that approximately 98% of the human genome is non-protein-coding. Because of the apparent importance of coding regions, many ...
  9. [9]
    What is junk DNA, and what is it worth? - Scientific American
    Feb 12, 2007 · In 1972 the late geneticist Susumu Ohno coined the term "junk DNA" to describe all noncoding sections of a genome, most of which consist of ...<|control11|><|separator|>
  10. [10]
    The mammalian transcriptome and the function of non-coding DNA ...
    Mar 25, 2004 · Many non-coding transcribed sequences are proving to have important regulatory roles, but the functions of the majority remain mysterious.
  11. [11]
    Nuclear genome size: Are we getting closer? - Wiley Online Library
    Jun 25, 2010 · At the beginning of the 1950s, biochemical and cytochemical studies established the constancy of nuclear DNA amount for a given species and ...
  12. [12]
    Largest and Smallest Genome in the World - ResearchGate
    The genome of a cousin, Amoeba proteus, has a mere 290 billion base pairs, making it 100 times larger than the human genome.Missing: original measurement
  13. [13]
    The C-value paradox, junk DNA and ENCODE - ScienceDirect.com
    Nov 6, 2012 · What is the C-value paradox? You might expect more complex organisms to have progressively larger genomes, but eukaryotic genome size fails ...Missing: original | Show results with:original
  14. [14]
    Repeated sequences in DNA. Hundreds of thousands of copies of ...
    Repeated sequences in DNA. Hundreds of thousands of copies of DNA sequences have been incorporated into the genomes of higher organisms.Missing: Cot analysis 1960s
  15. [15]
    Discovery of RNA splicing and genes in pieces - PubMed Central
    Jan 19, 2016 · The Problem: Short-Lived Heterogeneous Nuclear RNA. During the 1960s and 1970s, there was a major conundrum in eukaryotic molecular biology ...
  16. [16]
    On the Immortality of Television Sets: “Function” in the Human ...
    Feb 20, 2013 · Can ENCODE tell us how much junk DNA we carry in our genome? Biochem Biophys Res Commun. 2013, vol. 430 (pg. 1340-1343).
  17. [17]
    Half a Century of Controversy: The Neutralist/Selectionist Debate in ...
    Feb 5, 2024 · The neutralist/selectionist controversy has structured the field and influences the way molecular evolutionary scientists conceive their research.
  18. [18]
    Evolutionary Rate at the Molecular Level - Nature
    Calculating the rate of evolution in terms of nucleotide substitutions seems to give a value so high that many of the mutations involved must be neutral ones.
  19. [19]
    Selfish DNA: the ultimate parasite - Nature
    Apr 17, 1980 · Orgel, L., Crick, F. Selfish DNA: the ultimate parasite. Nature 284, 604–607 (1980). https://doi.org/10.1038/284604a0. Download citation.
  20. [20]
    Theorists Debate How 'Neutral' Evolution Really Is | Quanta Magazine
    Nov 8, 2018 · For 50 years, evolutionary theory has emphasized the importance of neutral mutations rather than adaptive ones at the level of DNA.
  21. [21]
    Widely distributed noncoding purifying selection in the human genome
    We show that a substantial fraction of active purifying selection in human noncoding sequences occurs outside of CNSs and is diffusely distributed across the ...
  22. [22]
    Selective constraint in intergenic regions of human and mouse ...
    The average number of selectively constrained nucleotides within a mammalian intergenic region is at least 2000. This is threefold higher than within a nematode ...
  23. [23]
    Relative Rates of Evolution in the Coding and Control Regions of ...
    In the coding region, there was a significantly higher rate of substitution at synonymous sites than at nonsynonymous sites as well as in the tRNA and rRNA ...
  24. [24]
    The Complex Truth About 'Junk DNA' | Quanta Magazine
    Sep 1, 2021 · The 98% of the human genome that does not encode proteins is sometimes called junk DNA, but the reality is more complicated than that name implies.
  25. [25]
    An integrated encyclopedia of DNA elements in the human genome
    Sep 5, 2012 · The Encyclopedia of DNA Elements (ENCODE) project aims to delineate all functional elements encoded in the human genome. Operationally, we ...
  26. [26]
    ENCODE data describes function of human genome
    Sep 5, 2012 · During the new study, researchers linked more than 80 percent of the human genome sequence to a specific biological function and mapped more ...
  27. [27]
    The ENCODE debacle
    The trumpetings that ENCODE "changes paradigms of genomic function" were frankly appalling.” A. 36. “I recommend Graur et al.'s paper to all biologists ...
  28. [28]
    ENCODE: My own thoughts - Ewan's Blog: Bioinformatician at large
    Sep 5, 2012 · It's clear that 80% of the genome has a specific biochemical activity – whatever that might be. This question hinges on the word “functional” so ...
  29. [29]
    Scientists Clash on the Meaning of ENCODE's Genetic Data
    Apr 12, 2013 · Around 80 percent of the human genome is "functional," the researchers leading the Encyclopedia of DNA Elements (ENCODE) project said.
  30. [30]
    8.2% of the Human Genome Is Constrained: Variation in Rates ... - NIH
    Jul 24, 2014 · We estimate that 8.2% (7.1–9.2%) of the human genome is presently subject to negative selection and thus is likely to be functional.
  31. [31]
    Defining functional DNA elements in the human genome - PNAS
    Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments.
  32. [32]
    An Upper Limit on the Functional Fraction of the Human Genome
    Jul 11, 2017 · The functional fraction of the human genome cannot exceed 15%, based on mutational load considerations.
  33. [33]
    Expanded encyclopaedias of DNA elements in the human ... - Nature
    Jul 29, 2020 · The ENCODE Project aims to delineate precisely and comprehensively the segments of the human and mouse genomes that encode functional elements.
  34. [34]
    NHGRI completes phase 3 of ENCODE project
    Aug 6, 2020 · A monthly update from the NHGRI Director on activities and accomplishments from the institute and the field of genomics.
  35. [35]
    ENCODE Publications
    The Encyclopedia of DNA Elements (ENCODE) project has established a genomic resource for mammalian development, profiling a diverse panel of mouse tissues.
  36. [36]
    An Integrated Encyclopedia of DNA Elements in the Human Genome
    The Encyclopedia of DNA Elements (ENCODE) Project aims to delineate all functional elements encoded in the human genome. Operationally, we define a functional ...
  37. [37]
    [PDF] A User's Guide to the Encyclopedia of DNA Elements (ENCODE)
    Apr 19, 2011 · Cis-regulatory regions include diverse functional elements (e.g., promoters, enhancers, silencers, and insulators) that collectively modulate ...
  38. [38]
    Gene regulation by long non-coding RNAs and its biological functions
    Dec 22, 2020 · The best-known mechanisms of gene repression mediated by lncRNAs are related to gene-dosage compensation. ... The Xist lncRNA exploits ...
  39. [39]
    Long noncoding RNA XIST: Mechanisms for X chromosome ... - NIH
    A growing body of evidence has revealed that the lncRNA XIST, an important regulator in X chromosome dosage compensation in placental mammals, can play pivotal ...
  40. [40]
    Post-transcriptional control of miRNA biogenesis - PMC
    MicroRNAs (miRNAs) are small noncoding RNAs that negatively regulate the expression of a large proportion of cellular mRNAs. They have unique, diverse ...
  41. [41]
    Telomere biology and ribosome biogenesis: structural and ...
    Telomeres are nucleoprotein structures that play a pivotal role in the protection and maintenance of eukaryotic chromosomes. Telomeres and the enzyme ...
  42. [42]
    Nuclear lamins: major factors in the structural organization and ...
    This review provides an up-to-date overview of the functions of nuclear lamins, emphasizing their roles in epigenetics, chromatin organization, DNA replication ...
  43. [43]
    Tying up loose ends: telomeres, genomic instability and lamins - PMC
    Here, we summarize recent research suggesting that telomeres, the capping structures that protect chromosome ends, are stabilized by lamin-binding.
  44. [44]
    Perfect and imperfect views of ultraconserved sequences - PMC
    In addition to enhancer activity, non-coding ultraconserved sequences can have other roles in gene expression regulation. One element upstream of the HoxD locus ...
  45. [45]
    Ultraconserved Elements Occupy Specific Arenas of Three ...
    Jul 10, 2018 · This study explores the relationship between three-dimensional genome organization and ultraconserved elements (UCEs), an enigmatic set of DNA elements that ...
  46. [46]
    Functional interrogation of non-coding DNA through CRISPR ... - NIH
    Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA.
  47. [47]
    Advanced analysis of retrotransposon variation in the human ...
    Apr 25, 2025 · Transposable elements (TEs) comprise approximately 45% of the human genome. Part of this abundance stems from their ability to jump to different ...
  48. [48]
    Primate-specific transposable elements shape transcriptional ...
    Nov 23, 2022 · Here, we describe how many primate-restricted TEs have additional binding sites for lineage-specific transcription factors driving their expression during ...
  49. [49]
  50. [50]
    Pseudogene-derived small interference RNAs regulate gene ...
    Apr 29, 2011 · Over the past years, however, it has become evident that pseudogenes may have diverse functions, mainly in regulating gene expression (3–7).
  51. [51]
    The HAPSTR2 retrogene buffers stress signaling and resilience in ...
    Jan 11, 2023 · We thus identify a novel protein-coding retrogene that buffers a conserved stress response pathway in mammals.
  52. [52]
    The subordinate role of pseudogenization to recombinative deletion ...
    Jul 9, 2025 · Indeed, several studies have put pseudogenes forward as an important source of long noncoding RNAs (lncRNAs), which govern regulatory functions.
  53. [53]
    Junk DNA promotes sex chromosome evolution | Heredity - Nature
    Apr 1, 2009 · These findings in animal species show that the accumulation of junk DNA is an important step in promoting the morphogenesis of sex chromosomes.
  54. [54]
    Generating combinatorial diversity via engineered V(D)J-like ...
    Jul 1, 2025 · V(D)J recombination is integral to the development of antibody diversity and proceeds through a complex DNA cleavage and repair process ...
  55. [55]
    Heterochromatin suppresses gross chromosomal rearrangements at ...
    Jan 11, 2019 · Here, we found in fission yeast that heterochromatin suppresses gross chromosomal rearrangements (GCRs) at centromeres. Mutations in Clr4/Suv39 ...
  56. [56]
    An endogenous retroviral envelope syncytin and its cognate ... - PNAS
    Nov 21, 2017 · Syncytins are envelope genes from endogenous retroviruses that have been captured during evolution for a function in placentation.
  57. [57]
    Endogenous retroviruses regulate periimplantation placental growth ...
    Sep 26, 2006 · This work supports the hypothesis that ERVs play fundamental roles in placental morphogenesis and mammalian reproduction.
  58. [58]
    Evolutionary constraint and innovation across hundreds of placental ...
    Apr 28, 2023 · We estimate that a minimum of 332 Mb (10.7%) of the human genome is under constraint through purifying selection (Fig. 2A) (12). We computed ...
  59. [59]
    Megabase deletions of gene deserts result in viable mice - Nature
  60. [60]
    Deletion of Ultraconserved Elements Yields Viable Mice
    To our surprise, we found that the mice lacking these elements are viable, fertile, and show no apparent abnormalities.
  61. [61]
    A cell type-aware framework for nominating non-coding variants in ...
    Sep 27, 2024 · Unsolved Mendelian cases often lack obvious pathogenic coding variants, suggesting potential non-coding etiologies.
  62. [62]
    SDR-seq Connects Genetic Variants to Disease in Single Cells
    Oct 10, 2025 · Researchers at EMBL have developed a single-cell sequencing tool that enables DNA and RNA to be studied together in the same cell, ...
  63. [63]
    AlphaGenome: AI for better understanding the genome
    Jun 25, 2025 · Today, we introduce AlphaGenome, a new artificial intelligence (AI) tool that more comprehensively and accurately predicts how single variants ...
  64. [64]
    Beyond AlphaFold: how AI is decoding the grammar of the genome
    Aug 18, 2025 · Scientists are seeking to decipher the role of non-coding DNA in the human genome, helped by a suite of artificial-intelligence tools.
  65. [65]
    Mapping the regulatory effects of common and rare non-coding ...
    Feb 19, 2025 · By applying ChromBPNet to GWAS and QTL data, we uncover context-specific regulatory effects underlying genetic associations and integrate these ...
  66. [66]
    Exploring the roles of conserved context‐dependent cis‐regulatory ...
    Jul 4, 2025 · Conserved context-dependent cis-regulatory elements act as a major reservoir of disease-associated polymorphisms in the human genome and are ...
  67. [67]
    A pangenomic approach reveals the sources of genetic variation ...
    Mar 19, 2025 · These non-coding regions are in some cases conserved across more distantly related species, suggesting they could serve important regulatory ...
  68. [68]
    Unmasked: transposable elements as drivers and targets in cancer
    Sep 10, 2025 · further showed that exonization of a primate-specific Alu element into IFNAR2 produces a decoy isoform, IFNAR2-S, which lacks signaling domains ...
  69. [69]
    From Junk DNA to Genomic Treasure: Impacts of Transposable ...
    Aug 13, 2025 · Transposable elements shape development and disease at every level of the central dogma by functioning as regulatory DNA, functional RNA, ...
  70. [70]
    Putative Phenotypically Neutral Genomic Insertion Points in ...
    Mar 10, 2022 · We report putative editing targets for 10 common synthetic biology chassis organisms, including coverage of available RNA-seq data, and provide software to ...
  71. [71]
    A pangenomic approach reveals the sources of genetic variation ...
    Sep 23, 2025 · These non-coding regions are in some cases conserved across more distantly related species, suggesting they could serve important regulatory ...
  72. [72]
    [PDF] From map to blueprint: the plant pan-genome unraveling genetic ...
    Sep 23, 2025 · Beyond gene content, pan-genomic analyses now also encompass regulatory variations and non-coding sequences, providing a more holistic view of ...