Inverted repeat-lacking clade

The inverted repeat-lacking clade (IRLC), also known as the inverted repeat loss clade, is a monophyletic subgroup within the legume subfamily Papilionoideae of the family Fabaceae. It is distinguished by the evolutionary loss of the large inverted repeat (IR) region, a typically duplicated 20–30 kb segment, from the plastid genome, resulting in smaller genome sizes of approximately 121–126 kb. The clade encompasses around 56 genera and over 4,000 species, many of which are ecologically and economically significant herbaceous or woody plants adapted to temperate regions. Its defining plastome feature arose approximately 39 million years ago in the common ancestor of the group and is associated with higher rates of structural rearrangements, inversions, and substitutions than in IR-retaining clades, which has made the IRLC a focus of research into plastid genome stability.

Notable genera include Medicago (e.g., alfalfa, M. sativa), Trifolium (clovers), Astragalus (milkvetches), Wisteria (wisterias), and Glycyrrhiza (licorice), which together have diverse agricultural, medicinal, and ornamental uses. Taxonomically, the clade sits within the fabids and forms a key branch of the papilionoid legumes under the broader Hologalegina group. Although the IR loss is nearly universal across the IRLC, rare exceptions highlight dynamic plastome evolution; for instance, Medicago minima has independently regained a novel ~9 kb IR, underscoring the role of recombination mechanisms in plastid DNA. Studies of IRLC plastomes also have broader implications, including for crop resilience and for understanding homoplastic inversions that affect gene order.

Overview

Definition

The inverted repeat-lacking clade (IRLC) is an informal monophyletic group comprising a diverse assemblage of flowering plants within the subfamily Faboideae (synonym Papilionoideae) of the family Fabaceae (Leguminosae). The clade is one of the most species-rich lineages in the family, encompassing approximately 56 genera and over 4,000 species adapted to a wide range of habitats, though its boundaries are defined primarily through molecular evidence rather than formal taxonomic ranks.

The defining genetic hallmark of the IRLC is the loss of one of the two large inverted repeat (IR) regions in the plastid (chloroplast) genome, a feature shared by nearly all members of the clade. In typical angiosperm plastomes, these IRs, each approximately 25 kb in length, separate the large single-copy (LSC) and small single-copy (SSC) regions, forming a conserved quadripartite structure that stabilizes the genome. The IRLC's IR loss results in a contracted plastome of approximately 121–126 kb and a distinctive arrangement in which the LSC and SSC regions are contiguous without the intervening repeat, a configuration rare among land plants and absent in most other angiosperms.

The clade was first identified and characterized in the early 2000s through phylogenetic analyses of plastid DNA sequences, particularly the matK gene, which revealed the IR loss as a synapomorphy uniting the group. Earlier observations of IR absence in select legume genera, including Pisum, had hinted at this pattern, but comprehensive sampling confirmed its extent across the clade.
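
As a rough illustration of the size arithmetic implied by this description, the following Python sketch adds up the regions of a quadripartite plastome and then counts the IR only once. The region lengths (82 kb LSC, 18 kb SSC, 25 kb IR) are illustrative assumptions chosen to fall within the ranges quoted in this article, not measurements from any particular species, and the function name is hypothetical.

```python
def plastome_length_kb(lsc_kb: float, ssc_kb: float, ir_kb: float, ir_copies: int) -> float:
    """Total plastome length: both single-copy regions plus the IR once per copy."""
    if ir_copies not in (0, 1, 2):
        raise ValueError("ir_copies must be 0, 1, or 2")
    return lsc_kb + ssc_kb + ir_copies * ir_kb


# Illustrative (assumed) region sizes in kb; not data from a real genome.
LSC_KB, SSC_KB, IR_KB = 82.0, 18.0, 25.0

quadripartite = plastome_length_kb(LSC_KB, SSC_KB, IR_KB, ir_copies=2)
ir_lacking = plastome_length_kb(LSC_KB, SSC_KB, IR_KB, ir_copies=1)

print(f"two IR copies: {quadripartite:.0f} kb")  # typical angiosperm layout
print(f"one IR copy:   {ir_lacking:.0f} kb")     # IRLC-style layout
```

With these assumed inputs the totals are 150 kb and 125 kb, so the difference equals the length of one IR copy, in line with the roughly 25 kb contraction described above.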

Key Characteristics

Members of the inverted repeat-lacking clade (IRLC) are predominantly herbaceous or shrubby plants within the legume family, exhibiting the characteristic papilionoid flower structure with a prominent standard (banner) petal dorsally, two lateral wing petals, and a ventral keel comprising fused petals that enclose the reproductive organs. This floral architecture facilitates pollination by insects and is a defining feature across the subfamily. A shared morphological trait among IRLC members is the presence of pinnately compound leaves, which in some lineages, such as the tribe Fabeae, include tendrils that enable climbing and support vining growth forms, as seen in peas (Pisum sativum).

All members form symbiotic nodules with rhizobial bacteria, enabling biological nitrogen fixation that converts atmospheric nitrogen into plant-usable forms and thereby enhances soil fertility in their habitats. Ecologically, IRLC legumes are mainly distributed across temperate to subtropical zones, with numerous species showing adaptations to arid, semi-arid, or disturbed environments, such as steppes and grasslands, where their nitrogen-fixing ability promotes resilience and nutrient cycling. In their nodules, IRLC species exhibit terminal bacteroid differentiation, in which rhizobial cells undergo irreversible enlargement, endoreduplication, and loss of reproductive capacity, driven by nodule-specific cysteine-rich (NCR) peptides; for instance, in Medicago truncatula, over 600 NCR genes are expressed to induce elongated bacteroids that optimize nitrogen fixation.

Taxonomy

Included Tribes

The inverted repeat-lacking clade (IRLC) encompasses several core tribes within the subfamily Papilionoideae of Fabaceae, primarily defined by molecular phylogenetic analyses. These include Astragaleae (containing milkvetches like Astragalus); Caraganeae (peashrubs such as Caragana); Cicereae, characterized by herbaceous species such as chickpeas; Fabeae, featuring temperate herbaceous plants like peas and lentils; Galegeae, with elements such as licorice (Glycyrrhiza); Hedysareae, including sweetvetches; Trifolieae, represented by clovers and alfalfa (Medicago); and Wisterieae, which includes woody climbers like Wisteria. Recent studies have also recognized Glycyrrhizeae (resurrected for Glycyrrhiza and allies) and Adinobotryeae as part of the IRLC. The reassignment of genera from the former tribe Millettieae to Wisterieae reflects the integration of liana-forming lineages into the clade based on shared plastid and nuclear markers. A phylogenomic study narrowed the circumscription of Caraganeae and reduced Astragaleae to the Erophaca-Astragalean clade.

Beyond the defining loss of the IR, tribal synapomorphies in the IRLC are supported by shared molecular markers, including variations in the rpl32-trnL intergenic spacer region, which exhibit conserved patterns across these lineages. Historical taxonomic revisions of the IRLC tribes occurred primarily between 2003 and 2010, driven by molecular studies using matK and trnK sequences that confirmed the monophyly of the core tribes and prompted the exclusion of unrelated groups like Loteae while incorporating Wisterieae. Subsequent works from 2021 onward, integrating phylogenomic data, further refined tribal boundaries, for example through the resurrection of Glycyrrhizeae and the redefinition of Astragaleae.

Major Genera

The inverted repeat-lacking clade (IRLC) encompasses several species-rich genera within the legume family, with Astragalus standing out as the largest, comprising approximately 3,000 species of herbs and small shrubs predominantly distributed across the temperate regions of the Northern Hemisphere. This genus exhibits remarkable morphological diversity, ranging from low-growing cushion plants to taller perennials, and includes toxic species known as locoweeds that can affect grazing livestock in arid and semi-arid habitats. High diversity is evident in Central Asia and neighbouring regions, where many species are adapted to steppe and mountain environments, reflecting the clade's evolutionary success in these areas.

Medicago, with around 87 species, is another prominent genus in the IRLC, primarily found in temperate and subtropical regions. Notable for its role in agriculture, it includes Medicago sativa (alfalfa), a perennial forage crop widely cultivated for hay because of its high protein content and its ability to fix atmospheric nitrogen through symbiosis with rhizobial bacteria. Studies on this genus have advanced understanding of symbiotic nitrogen fixation, highlighting its ecological and economic contributions to sustainable farming practices.

The genus Trifolium, containing about 300 species of clovers, is widespread in temperate grasslands and meadows across Europe, Asia, and North America. These mostly herbaceous perennials or annuals feature trifoliate leaves and inflorescences that attract pollinators, and they play a key role in pasture ecosystems by enhancing soil fertility and biodiversity. Species like Trifolium repens (white clover) are integral to mixed grasslands, supporting forage production and serving as model organisms for studying legume-rhizobia interactions.

Other notable genera in the IRLC include Pisum, with two to three species, primarily the cultivated garden pea (Pisum sativum), a cool-season annual originating from the Mediterranean region and valued for its edible seeds. Cicer, encompassing 46 species native to the Mediterranean region and beyond, features the chickpea (Cicer arietinum) as its sole major crop, an annual pulse that is important in arid zones because it tolerates dry conditions. Similarly, Vicia, with 248 accepted species of vetches distributed mainly in temperate regions and introduced elsewhere, includes economically important taxa like Vicia faba (broad bean), a protein-rich pulse with a long history of cultivation for food and feed. Glycyrrhiza, with about 20 species, includes the licorice plant (Glycyrrhiza glabra), used in medicine and as a flavoring. These genera collectively underscore the IRLC's significance in both wild diversity and human use, with cultivation histories tied to ancient domestication events in the Near East.

Phylogeny and Evolution

Divergence and Age

The inverted repeat-lacking clade (IRLC) of legumes is estimated to have originated approximately 39.0 ± 2.4 million years ago during the late Eocene, based on analyses of the plastid genes matK and rbcL calibrated with macrofossils. These estimates were derived using penalized likelihood rate smoothing on Bayesian phylogenetic trees, with 12 fossil calibrations constraining minimum ages for nodes within Leguminosae, including early papilionoid divergences. Recent phylogenomic studies employing Bayesian methods and expanded datasets have corroborated this late Eocene timing for the IRLC radiation, often incorporating 20 or more fossil calibrations to refine divergence estimates across the family.

The fossil record provides supporting evidence for an Eocene origin and mid-Eocene radiation of IRLC-like legumes. The earliest IRLC-like papilionoid flowers and fruits have been documented from early Eocene deposits in North America, such as the Tepee Trail Formation in Wyoming, featuring morphological traits consistent with derived papilionoids, including IRLC precursors. Fossils attributable to papilionoid legumes also appear in mid-Eocene sediments of the Green River Formation, supporting an early radiation contemporaneous with the molecular estimates.

The divergence and subsequent diversification of the IRLC coincided with major climate shifts during the Eocene-Oligocene transition (approximately 34–37 million years ago), characterized by global cooling and the onset of glaciation, which promoted the expansion of open temperate habitats favorable to IRLC lineages. This environmental reconfiguration likely facilitated adaptive radiations in the herbaceous and shrubby forms typical of the clade, as evidenced by the alignment of fossil occurrences with paleoclimate proxies showing a shift from subtropical to more seasonal temperate conditions in the Northern Hemisphere. Biogeographic analyses further indicate that these changes drove intercontinental dispersals and niche specialization within the IRLC from the late Eocene onward.
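
To make the stated timing comparison concrete, the sketch below converts the 39.0 ± 2.4 million-year estimate into an age interval and intersects it with the 34–37 million-year window given for the Eocene-Oligocene transition. Treating the ± value as a simple symmetric interval is an assumption made for illustration, and the helper names are hypothetical.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]


def age_interval(mean_ma: float, error_ma: float) -> Interval:
    """Symmetric (youngest, oldest) bounds in million years ago (Ma)."""
    return (round(mean_ma - error_ma, 1), round(mean_ma + error_ma, 1))


def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Shared part of two intervals, or None if they do not intersect."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


irlc_age = age_interval(39.0, 2.4)        # estimate quoted in the text
eocene_oligocene = (34.0, 37.0)           # transition window quoted in the text

print("IRLC origin interval (Ma):", irlc_age)
print("Overlap with transition (Ma):", overlap(irlc_age, eocene_oligocene))
```

Under this reading the origin estimate spans 36.6–41.4 Ma, so only its youngest edge (36.6–37.0 Ma) falls within the transition window; most of the interval predates it.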

Relationships within Faboideae

The inverted repeat-lacking clade (IRLC) represents one of the most derived, crownward lineages within the subfamily Papilionoideae (synonym Faboideae), nested deeply within the expansive 50-kb inversion clade that encompasses approximately 95% of the subfamily's species diversity. This positioning places the IRLC as a monophyletic group within the Hologalegina clade, characterized by the loss of one inverted repeat in the plastid genome, which distinguishes it from earlier-branching lineages that retain both repeats despite sharing the defining 50-kb inversion. Within the 50-kb inversion clade, Hologalegina (including the IRLC and robinioid genera) is sister to Indigofereae, with that combined group sister to the Millettieae-Phaseoleae (millettioid-phaseoloid) assemblage; more basally within the inversion clade lie the genistoid and dalbergioid lineages, which form a successive grade leading toward the IRLC and its immediate relatives, reflecting a progression from more ancestral papilionoid forms to the specialized structure of the IRLC.

Phylogenetic resolution of these relationships has been consistently strong in multi-locus and phylogenomic analyses, including a comprehensive study using 1,456 low-copy nuclear loci across Papilionoideae, which recovered the IRLC and its sister groupings with bootstrap support values of 100% and posterior probabilities of ≥0.99. Earlier multi-locus efforts, such as those combining plastid markers like matK and trnL, also upheld the IRLC's placement within the 50-kb inversion clade, though with moderate bootstrap values of around 62% for broader nodes.

The emergence of the IRLC marks a pivotal evolutionary transition in plastid genome organization, occurring after the 50-kb inversion event, which is estimated at approximately 50–60 million years ago during the early diversification of Papilionoideae. This shift underscores the clade's role in the evolution of papilionoid legumes, though the detailed genomic implications are tied to the subsequent IR loss rather than to the inversion itself.

Plastid Genome

Structure and IR Loss

The plastid genomes (plastomes) of the inverted repeat-lacking clade (IRLC) are distinguished by the absence of the large inverted repeat (IR) region, a ~25 kb sequence duplication that flanks the small single-copy (SSC) region in most land plant plastomes and promotes genomic stability through recombination between the two copies. Typical angiosperm plastomes feature a quadripartite structure comprising a large single-copy (LSC) region, two IR copies (IRa and IRb), and the SSC region, yielding total lengths of 140–160 kb. In IRLC species, this canonical arrangement is altered to an LSC-SSC configuration due to the loss of the IR, resulting in more compact plastomes of 110–130 kb; plastomes of 123–124 kb, for instance, reflect this streamlined architecture.

The IR loss in IRLC plastomes arose from the contraction and deletion of one IR copy in the clade's common ancestor, an event predating other major rearrangements and likely facilitated by recombination between the original IR duplicates. This mechanism has been corroborated through sequencing in key genera, such as Medicago (e.g., M. truncatula, which lacks the rRNA-encoding IR) and Pisum (e.g., P. sativum, which shows post-loss inversions), where the single remaining set of rRNA genes resides adjacent to the LSC/SSC boundary. IRLC plastomes generally retain 110–120 genes, encompassing 70–80 protein-coding genes, 30 tRNAs, and 4 rRNAs, with minimal outright gene losses but notable pseudogenization in some lineages.

The absence of the IR reduces recombination-mediated stabilization, thereby elevating the potential for structural rearrangements like inversions and indels; for example, Pisum plastomes exhibit at least five such inversions relative to the ancestral configuration. Comparisons with IR-containing plastomes underscore these differences: IRLC genomes display accelerated rearrangement rates and uniform substitution patterns across regions, unlike the suppressed variation in IR-flanked segments of IR-containing outgroups (~150 kb with intact IR). Sequenced IRLC exemplars, such as Cicer arietinum (125,319 bp, 108 genes, IR loss confirmed via boundary analysis), illustrate this heightened plasticity without compromising core functionality.
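
The structural difference described here can be sketched computationally: an inverted repeat shows up as a stretch of sequence whose reverse complement occurs elsewhere in the same genome. The Python example below searches for such k-mer "seeds" in two toy sequences invented purely for illustration. It is a minimal sketch of the idea, not the boundary analysis used in the cited work; real plastome annotation relies on dedicated alignment tools, and all names here are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def inverted_repeat_seeds(seq: str, k: int) -> list:
    """Return (i, j) pairs where the k-mer at i has its reverse complement
    starting at j, with the second copy downstream and not overlapping."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    seeds = []
    for i in range(len(seq) - k + 1):
        partner = reverse_complement(seq[i:i + k])
        for j in positions.get(partner, []):
            if j - i >= k:  # keep each pair once and skip overlapping hits
                seeds.append((i, j))
    return seeds


# Toy regions (assumed, far shorter than any real plastome).
lsc, ir, ssc = "A" * 10, "GATTACA", "CCCCC"
quadripartite = lsc + ir + ssc + reverse_complement(ir)  # LSC-IRa-SSC-IRb
ir_lacking = lsc + ir + ssc                              # one IR copy lost

print(inverted_repeat_seeds(quadripartite, 7))  # [(10, 22)]
print(inverted_repeat_seeds(ir_lacking, 7))     # []
```

The toy quadripartite sequence yields one seed linking the two IR copies, while the sequence with a single copy yields none, mirroring the LSC-SSC configuration described above.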

Evolutionary Implications

The loss of the inverted repeat (IR) in plastomes of the inverted repeat-lacking clade (IRLC) results in reduced intergenomic recombination between duplicated sequences, leading to decreased stability and elevated substitution rates specifically in the expanded single-copy regions. This exposes former IR genes, now residing in single-copy areas, to higher substitution rates, approximately 3–4 times greater than in IR-retaining taxa, thereby accelerating overall plastome evolution within the IRLC. Such dynamics are evident in papilionoid legumes, where IRLC lineages exhibit notably higher substitution rates than their IR-containing relatives, correlating with increased structural rearrangements like inversions and indels.

These accelerated evolutionary rates in IRLC plastomes may confer adaptive advantages by facilitating rapid divergence of plastid genes, potentially aiding the clade's extensive diversification into over 4,000 species across diverse habitats. Although no direct causal link exists between IR loss and nodulation, the IRLC's genetic dynamism parallels the evolution of nodule-specific cysteine-rich (NCR) peptides, which are characteristic of this clade and enable precise control of bacteroid differentiation in root nodules for enhanced nitrogen fixation. This co-evolutionary pattern with rhizobial symbionts underscores the IRLC's success, particularly in temperate ecosystems where herbaceous genera such as Astragalus dominate.

Rare instances of IR re-expansion, such as the ~9 kb IR in Medicago minima and a smaller 425 bp version in its sister species M. lupulina, demonstrate that IR loss can be reversed through repeat-mediated recombination. These events, occurring within the last 3–9 million years, do not necessarily restore plastome stability, as ongoing illegitimate recombination driven by dispersed repeats continues to promote rearrangements in these lineages. Overall, the IRLC's plastomic innovations correlate with its ecological dominance in temperate regions, reflecting broader co-evolution with microbes without implying a primary role for IR loss in nodulation traits.

Diversity and Distribution

Species Diversity

The inverted repeat-lacking clade (IRLC) encompasses approximately 4,000 species across about 56 genera, representing roughly 20% of the total species diversity in the legume family, which comprises around 20,000 species overall. This substantial contribution highlights the IRLC's role as one of the most species-rich lineages within the subfamily Papilionoideae.

Species richness within the IRLC is highly uneven, dominated by a handful of megadiverse genera that account for the majority of its diversity. The genus Astragalus stands out with roughly 3,000 species, making it the largest genus among all angiosperms and a key driver of the clade's overall species richness. Similarly, Oxytropis harbors more than 300 species, further exemplifying concentrated diversification in select lineages. These hotspots of speciation contrast with smaller genera, underscoring the clade's skewed distribution of diversity.

Diversification patterns in the IRLC feature notable radiations, particularly in temperate herbaceous groups, which led to rapid species accumulation in genera like Astragalus. The clade shows lower diversity in tropical lineages than in temperate ones, in line with its predominant adaptation to cooler climates and an evolutionary bias toward higher-latitude habitats. Conservation challenges affect many IRLC species, with numerous narrow endemics, especially in megadiverse genera, threatened by habitat loss from agricultural intensification; however, no clade-wide extinction events have been recorded to date.
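
The proportions quoted in this section follow from simple division, which the short Python sketch below reproduces. The counts are the rounded figures given in the text (about 4,000 IRLC species, about 20,000 legume species, and roughly 3,000 Astragalus species), so the results are approximate, and the function name is hypothetical.

```python
def percent_share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Rounded species counts as quoted in the text.
IRLC_SPECIES = 4_000
LEGUME_SPECIES = 20_000
ASTRAGALUS_SPECIES = 3_000

irlc_share = percent_share(IRLC_SPECIES, LEGUME_SPECIES)
astragalus_share = percent_share(ASTRAGALUS_SPECIES, IRLC_SPECIES)

print(f"IRLC share of legume species: {irlc_share:.0f}%")
print(f"Astragalus share of IRLC species: {astragalus_share:.0f}%")
```

On these rounded figures the IRLC holds about 20% of legume species, and Astragalus alone accounts for about three quarters of the clade, which is what makes the distribution of richness so uneven.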

Geographic Distribution

The inverted repeat-lacking clade (IRLC) of legumes exhibits a predominantly Holarctic distribution, centered in the temperate zones of the Northern Hemisphere, with primary concentrations in Eurasia and North America. The clade's range extends into subtropical regions, particularly through high-elevation habitats and Mediterranean-adjacent areas. Centers of diversity are prominent in Asia, including the Irano-Turkestan region and the Sino-Himalayan Plateau, where genera like Astragalus dominate ecosystems, and in further regions that support diverse grassland assemblages.

Biogeographic patterns within the IRLC reflect adaptation to temperate and montane environments, with Astragalus species prevalent in arid steppes across Eurasia and extending into North American prairies. In contrast, clovers (Trifolium) are characteristic of European and Mediterranean grasslands, with their highest diversity in temperate regions. The clade has only a limited native presence in the Neotropics; its occurrence there stems largely from historical introductions of genera like Trifolium and Medicago rather than from natural colonization.

The biogeographic history of the IRLC traces to a Laurasian origin in the late Eocene, approximately 39 million years ago, with subsequent diversification driven by vicariance and long-distance dispersal events. Post-Eocene cooling facilitated spread across the northern continents, including Beringian migrations to North America and southward extensions into African highlands via Miocene dispersals. Current threats from climate change particularly affect high-altitude distributions, such as those of Oxytropis in mountain regions, where warming is predicted to contract suitable habitats and shift ranges poleward.

Economic and Ecological Significance

Agricultural Crops

The inverted repeat-lacking clade (IRLC) within the legume family encompasses several economically vital species that form the backbone of global pulse and forage production. Key domesticated crops include chickpeas (Cicer arietinum), lentils (Lens culinaris), peas (Pisum sativum), alfalfa (Medicago sativa), and faba beans (Vicia faba), which together contribute significantly to food and feed. Domestication of these IRLC crops originated approximately 10,000 years ago in the Near East's Fertile Crescent, where early farmers selected for non-shattering pods and larger seeds to enhance harvest efficiency. Today, these and other IRLC species contribute to global dry pulse production, which reached approximately 96 million metric tons in 2023, with chickpeas, lentils, and dry peas accounting for about 35 million tons. Alfalfa ranks as the world's leading forage legume, cultivated on over 30 million hectares according to estimates from around the 2020s, with the United States among the main producers.

These crops offer substantial agricultural benefits, including high protein content, ranging from 20–25% in dry seeds of chickpeas and lentils, which makes them important for nutrition in vegetarian diets. Their symbiotic nitrogen fixation with rhizobial bacteria improves soil fertility, reducing the need for synthetic fertilizers and supporting sustainable farming; for instance, alfalfa can fix up to 200 kg of nitrogen per hectare per year. Modern breeding has yielded high-yielding varieties, such as alfalfa cultivars that produce 15–20 tons of forage per hectare under optimal conditions, enhancing productivity in arid regions.

Despite these advantages, IRLC crops and the systems that depend on them face challenges. Certain Astragalus species, known as locoweeds, produce swainsonine toxins that poison livestock grazing on infested pastures, leading to significant economic losses in rangelands. Breeding efforts are increasingly focused on drought tolerance, with genetic improvements in peas and chickpeas enabling yields to be maintained under water-limited conditions, as demonstrated by varieties that sustain 2–3 tons per hectare with 30% less water.

Other Uses and Roles

Members of the inverted repeat-lacking clade (IRLC) have diverse non-agricultural applications, particularly in medicine and ornamental horticulture. The root extracts of Glycyrrhiza glabra (liquorice), a prominent IRLC species, are widely used in pharmaceuticals for their antiviral and other therapeutic properties, aiding in treatments for respiratory disorders, peptic ulcers, and infections. Similarly, Trigonella foenum-graecum (fenugreek) seeds serve as a key ingredient in health supplements because of their antidiabetic and other reported effects, while also functioning as a spice in culinary applications.

In ornamental horticulture, IRLC genera like Wisteria are valued for landscaping, where their vigorous climbing vines adorn arbors, pergolas, and trellises and provide aesthetic appeal through cascading blooms. Clovers (Trifolium spp.), another IRLC group, are incorporated into lawns as low-maintenance ground cover and support honey production, as their nectar-rich flowers attract bees.

Ecologically, IRLC legumes play significant roles in grassland ecosystems by enhancing biodiversity through nitrogen fixation and habitat provision, contributing to species richness in semi-arid temperate regions. However, certain Astragalus species, such as A. variabilis, exhibit invasive potential as poisonous weeds, disrupting rangelands and causing economic losses in livestock husbandry by inducing locoism. Additionally, IRLC plants serve as vital forage for wildlife, supporting pollinators, insects, and herbivores while bolstering ecosystem services like soil stabilization.

In research, Medicago truncatula stands out as a model organism for studying symbiotic interactions, particularly nitrogen-fixing nodulation with rhizobia, enabling insights into plant-microbe signaling and metabolic exchanges. Biotechnological applications draw on nodule-specific cysteine-rich (NCR) peptides from M. truncatula, which promote bacteroid differentiation and exhibit antimicrobial properties, offering potential tools for enhancing crop resistance.