The inverted repeat-lacking clade (IRLC), also known as the inverted repeat loss clade, is a monophyletic subgroup within the legume subfamily Papilionoideae of the family Fabaceae, distinguished by the evolutionary loss of the large inverted repeat (IR) region, a typically duplicated 20–30 kb segment, in their plastid genomes, resulting in smaller genome sizes of approximately 121–126 kb.[1] This clade encompasses around 56 genera and over 4,000 species, many of which are ecologically and economically significant herbaceous or woody plants adapted to temperate regions.[2][3]
The IRLC's defining plastome feature arose approximately 39 million years ago in the common ancestor of the group, promoting higher rates of structural rearrangements, inversions, and nucleotide substitutions compared to IR-retaining legume clades, which has facilitated extensive research into organelle genome stability and evolution.[1] Notable genera include Medicago (e.g., alfalfa, M. sativa), Trifolium (clovers), Astragalus (milkvetches), Wisteria (wisterias), and Glycyrrhiza (licorice), which collectively represent diverse applications in agriculture, forage, pharmacology, and biodiversity.[1][4] The clade's taxonomic lineage traces through eudicots to the rosids and fabids, positioning it as a key branch in the papilionoid legumes under the broader Hologalegina group.[3]
Although the IR loss is nearly universal across the IRLC, rare exceptions highlight dynamic plastome evolution; for instance, Medicago minima has independently regained a novel ~9 kb IR, underscoring recombination mechanisms in organelle DNA.[1] Studies of IRLC plastomes have broader implications for biotechnology, including engineering crop resilience and understanding homoplastic inversions that affect gene order and function in legumes.[4]
Overview
Definition
The inverted repeat-lacking clade (IRLC) is an informal monophyletic group comprising a diverse assemblage of flowering plants within the subfamily Faboideae (synonym Papilionoideae) of the legume family (Fabaceae).[5] This clade represents one of the most species-rich lineages in Faboideae, encompassing approximately 56 genera and over 4,000 species adapted to a wide range of habitats, though its boundaries are defined primarily through molecular evidence rather than formal taxonomic ranks.[2]
The defining genetic hallmark of the IRLC is the loss of one of the two large inverted repeat (IR) regions in the plastid (chloroplast) genome, a feature shared by nearly all members of the clade, with rare exceptions.[6][1] In typical angiosperm plastomes, these IRs, each approximately 25 kb in length, separate the large single-copy (LSC) and small single-copy (SSC) regions, forming a conserved quadripartite structure that stabilizes the genome.[7] The IRLC's IR loss results in a contracted genome size of approximately 121–126 kb and a distinctive architecture in which the LSC and SSC regions are contiguous without the intervening repeat, a configuration rare among land plants and absent in most other angiosperms.[1]
This clade was first identified and characterized in the early 2000s through phylogenetic analyses of plastid DNA sequences, particularly the matK gene, which revealed the IR loss as a synapomorphy uniting the group.[8] Earlier observations of IR absence in select legume genera, such as Pisum and Cicer, had hinted at this pattern, but comprehensive sampling confirmed its monophyly and extent across the clade.[6]
Key Characteristics
Members of the inverted repeat-lacking clade (IRLC) are predominantly herbaceous or shrubby plants within the papilionoid legumes, exhibiting the characteristic papilionoid flower structure with a prominent standard (banner) petal dorsally, two lateral wing petals, and a ventral keel comprising fused petals that enclose the reproductive organs.[9] This floral morphology facilitates pollination by insects and is a defining feature across the clade.[10]
A shared morphological trait among IRLC species is the presence of pinnately compound leaves, which in some lineages, such as the Fabeae tribe, include tendrils that enable climbing and support in vining growth forms, as seen in peas (Pisum sativum).[9] All members form symbiotic root nodules with rhizobial bacteria, enabling biological nitrogen fixation that converts atmospheric nitrogen into plant-usable forms, thereby enhancing soil fertility in their habitats.[11]
Ecologically, IRLC legumes are mainly distributed across temperate to subtropical zones, with numerous species showing adaptations to arid, semi-arid, or disturbed environments, such as steppes and grasslands, where their nitrogen-fixing ability promotes ecosystem resilience and soil nutrient cycling.[12][13]
In their nodules, IRLC species exhibit terminal bacteroid differentiation, where rhizobial cells undergo irreversible enlargement, endoreduplication, and loss of reproductive capacity, driven by nodule-specific cysteine-rich (NCR) peptides; for instance, in Medicago truncatula, over 600 NCR genes are expressed to induce elongated bacteroids that optimize nitrogen fixation.[14]
Taxonomy
Included Tribes
The inverted repeat-lacking clade (IRLC) encompasses several core tribes within the subfamily Papilionoideae of Fabaceae, primarily defined by molecular phylogenetic analyses. These include Astragaleae (containing milkvetches like Astragalus); Caraganeae (peashrubs such as Caragana); Cicereae, characterized by herbaceous species such as chickpeas; Fabeae, featuring temperate herbaceous plants like peas and lentils; Galegeae, with elements such as licorice (Glycyrrhiza); Hedysareae, including sweetvetches; Trifolieae, represented by clovers and alfalfa (Medicago); and Wisterieae, which includes woody climbers like Wisteria.[2][15] Recent studies have also recognized Glycyrrhizeae (resurrected for Glycyrrhiza and allies) and Adinobotryeae as part of the IRLC.[15]
The reassignment of genera from the former tribe Millettieae to Wisterieae reflects the integration of liana-forming lineages into the clade based on shared plastid and nuclear markers. A 2021 phylogenomic study redefined Caraganeae to include only Caragana and reduced Astragaleae to the Erophaca-Astragalean clade.[2]
Beyond the defining loss of the chloroplast inverted repeat, tribal synapomorphies in the IRLC are supported by shared molecular markers, including variations in the rpl32-trnL intergenic spacer region, which exhibit conserved patterns across these lineages.[16]
Historical taxonomic revisions of the IRLC tribes occurred primarily between 2003 and 2010, driven by molecular studies utilizing plastid matK and trnK sequences that confirmed the monophyly of the core tribes and prompted the exclusion of unrelated groups like Loteae while incorporating Wisterieae. Subsequent works from 2021 onward, integrating phylogenomic data, further refined tribal boundaries, such as the resurrection of Glycyrrhizeae and redefinition of Astragaleae.[8][15][2]
Major Genera
The inverted repeat-lacking clade (IRLC) encompasses several species-rich genera within the Fabaceae family, with Astragalus standing out as the largest, comprising approximately 3,000 species of herbs and small shrubs predominantly distributed across the temperate regions of the Northern Hemisphere.[17] This genus exhibits remarkable morphological diversity, ranging from low-growing cushion plants to taller perennials, and includes toxic species known as locoweeds that can affect livestock grazing in arid and semi-arid habitats. High endemism is evident in Central and West Asia, where many species are adapted to steppe and mountain environments, reflecting the clade's evolutionary success in these areas.[18]
Medicago, with around 87 species, is another prominent genus in the IRLC, primarily found in temperate and subtropical Eurasia and North Africa.[1] Notable for its role in agriculture, it includes Medicago sativa (alfalfa), a perennial forage crop widely cultivated for hay and silage due to its high nutritional value and ability to fix atmospheric nitrogen through symbiotic relationships with rhizobial bacteria. Studies on this genus have advanced understanding of nitrogen fixation mechanisms, highlighting its ecological and economic contributions to sustainable farming practices.[19]
The genus Trifolium, containing about 300 species of clovers, is widespread in temperate grasslands and meadows across Europe, Asia, and North America.[20] These mostly herbaceous perennials or annuals feature trifoliate leaves and inflorescences that attract pollinators, playing a key role in pasture ecosystems by enhancing soil fertility and biodiversity. Species like Trifolium repens (white clover) are integral to mixed grasslands, supporting forage production and serving as model organisms for studying legume-rhizobia interactions.[21]
Other notable genera in the IRLC include Pisum, with two to three species, primarily the cultivated garden pea (Pisum sativum), a cool-season annual originating from the Mediterranean region and valued for its edible seeds.[22] Cicer, encompassing 46 species native to the Mediterranean and Central Asia, features the chickpea (Cicer arietinum) as its sole major crop, an annual pulse essential for human nutrition in arid zones due to its drought tolerance.[23] Similarly, Vicia, with 248 accepted species of vetches distributed mainly in temperate Eurasia and introduced elsewhere, includes economically important taxa like Vicia faba (broad bean), a protein-rich legume with a long history of cultivation for food and fodder.[24] Glycyrrhiza, with about 20 species, includes the licorice plant (Glycyrrhiza glabra), used in confectionery and medicine. These genera collectively underscore the IRLC's significance in both wild diversity and human agriculture, with cultivation histories tied to ancient domestication events in the Near East.
Phylogeny and Evolution
Divergence and Age
The inverted repeat-lacking clade (IRLC) of legumes is estimated to have originated approximately 39.0 ± 2.4 million years ago during the late Eocene, based on molecular clock analyses of plastid genes matK and rbcL calibrated with Tertiary macrofossils.[25] These estimates were derived using penalized likelihood rate smoothing on Bayesian phylogenetic trees, with 12 fossil calibrations constraining minimum ages for key nodes within Leguminosae, including early papilionoid divergences.[25] Recent phylogenomic studies employing Bayesian methods and expanded datasets, including nuclear markers, have corroborated this late Eocene timing for the IRLC crown radiation, often incorporating 20 or more fossil calibrations to refine divergence estimates across the Fabaceae.[26]
The fossil record provides supporting evidence for an Eocene origin and mid-Eocene radiation of IRLC-like legumes. The earliest IRLC-like papilionoid flowers and fruits have been documented from early Eocene deposits in North America, such as the Tepee Trail Formation in Wyoming, featuring morphological traits consistent with derived papilionoids including IRLC precursors.[27] Pollen fossils attributable to papilionoid legumes appear in mid-Eocene sediments from North America (e.g., Green River Formation), supporting an early radiation contemporaneous with the molecular estimates.
The divergence and subsequent diversification of the IRLC coincided with major climate shifts during the Eocene–Oligocene transition (approximately 34–37 million years ago), characterized by global cooling, aridification, and the onset of Antarctic glaciation, which promoted the expansion of open temperate habitats favorable to IRLC lineages. This environmental reconfiguration likely facilitated adaptive radiations in herbaceous and shrubby forms typical of the clade, as evidenced by the alignment of fossil occurrences with paleoclimate proxies showing a shift from subtropical to more seasonal temperate conditions in the Northern Hemisphere.[25] Biogeographic analyses further indicate that these changes drove intercontinental dispersals and niche specialization within the IRLC during the late Eocene to early Oligocene.
Relationships within Faboideae
The inverted repeat-lacking clade (IRLC) represents one of the most derived, crownward lineages within the subfamily Papilionoideae (synonym Faboideae), nested deeply within the expansive 50-kb inversion clade that encompasses approximately 95% of the subfamily's species diversity.[28] This positioning places the IRLC as a monophyletic group within the Hologalegina clade, characterized by the loss of one inverted repeat in the plastid genome, distinguishing it from earlier-branching lineages that retain both repeats despite sharing the defining 50-kb inversion.[28]
Within the 50-kb inversion clade, the Hologalegina clade (including the IRLC and robinioid genera) is sister to Indigofereae, with that combined group sister to the Millettieae-Phaseoleae (millettioid-phaseoloid) assemblage. More basally within this inversion clade lie the genistoid and dalbergioid lineages, which form a successive grade leading toward the IRLC and its immediate relatives, reflecting a progression from more ancestral papilionoid forms to the specialized structure of the IRLC.[28]
Phylogenetic resolution of these relationships has been consistently strong in multi-locus and phylogenomic analyses, including a comprehensive study utilizing 1,456 low-copy nuclear loci across Papilionoideae, which recovered the IRLC and its sister groupings with bootstrap support values of 100% and posterior probabilities of ≥0.99.[28] Earlier multi-locus efforts, such as those combining plastid markers like matK and trnL, also upheld the IRLC's placement within the 50-kb inversion clade, though with moderate bootstrap values around 62% for broader nodes.[29]
The emergence of the IRLC signifies a pivotal evolutionary transition in plastid genome organization, occurring subsequent to the 50-kb inversion event estimated at approximately 50–60 million years ago during the early diversification of Papilionoideae.[25] This shift underscores the clade's role in the adaptive radiation of papilionoid legumes, though detailed genomic implications are tied to subsequent IR loss rather than the inversion itself.[28]
Plastid Genome
Structure and IR Loss
The plastid genomes (plastomes) of the inverted repeat-lacking clade (IRLC) are distinguished by the complete absence of the large inverted repeat (IR) region, a ~25 kb sequence duplication that flanks the small single-copy (SSC) region in most land plant plastomes and promotes genomic stability through homologous recombination. Typical angiosperm plastomes feature a quadripartite structure comprising a large single-copy (LSC) region, two IR copies (IRa and IRb), and the SSC region, yielding total lengths of 140–160 kb. In IRLC species, this canonical arrangement is altered to an LSC-SSC configuration due to the loss of the IR, resulting in more compact plastomes of 110–130 kb. For instance, the Medicago truncatula plastome measures 123–124 kb, reflecting this streamlined architecture.[30][31]
The IR loss in IRLC plastomes arose from the independent contraction and deletion of one IR copy in the clade's common ancestor, an event predating other major rearrangements and likely facilitated by homologous recombination between the original IR duplicates. This mechanism has been corroborated through physical mapping and sequencing in key genera, such as Medicago (e.g., M. truncatula lacking the rRNA-encoding IR) and Pisum (e.g., P. sativum showing post-loss inversions), where the single remaining rRNA operon resides adjacent to the LSC/SSC boundary.[30][31][32]
IRLC plastomes generally retain 110–120 genes, encompassing 70–80 protein-coding genes, 30 tRNAs, and 4 rRNAs, with minimal outright gene losses but notable pseudogenization in some lineages. The absence of the IR reduces recombination-mediated stabilization, thereby elevating the potential for structural rearrangements like inversions and indels; for example, Pisum plastomes exhibit at least five such inversions relative to the Medicago configuration.[33][31]
Comparisons with IR-containing plastomes underscore these differences: IRLC genomes display accelerated rearrangement rates and uniform substitution patterns across regions, unlike the suppressed variation in IR-flanked segments of outgroups like Lotus japonicus (~150 kb with intact IR). Sequenced IRLC exemplars, such as Cicer arietinum (125,319 bp, 108 genes, confirmed IR loss via boundary analysis), illustrate this heightened plasticity without compromising core functionality.[32][33][34]
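As a rough consistency check on the sizes quoted above, the following Python sketch subtracts one IR copy from a quadripartite plastome. It assumes, purely for illustration, that the loss removes exactly one ~25 kb copy and changes nothing else; the function name size_after_ir_loss is hypothetical and not drawn from any cited study.

```python
def size_after_ir_loss(total_kb: float, ir_kb: float = 25.0) -> float:
    """Expected plastome length (kb) after deleting one of two IR copies.

    Assumption for illustration only: the single-copy regions and the
    remaining IR copy keep their original lengths.
    """
    if ir_kb < 0 or total_kb < 2 * ir_kb:
        raise ValueError("total length must be at least twice the IR length")
    return total_kb - ir_kb


# Typical quadripartite range quoted in the text: 140-160 kb, IR of ~25 kb.
low = size_after_ir_loss(140.0)
high = size_after_ir_loss(160.0)
print(f"expected IR-lacking range: {low:.0f}-{high:.0f} kb")
# prints "expected IR-lacking range: 115-135 kb"
```

Under these simplifying assumptions the result, 115–135 kb, is broadly in line with the 110–130 kb reported for IRLC plastomes; the residual difference would have to come from other length changes not modelled here.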
Evolutionary Implications
The loss of the inverted repeat (IR) in plastomes of the inverted repeat-lacking clade (IRLC) results in reduced intergenomic recombination between duplicated sequences, leading to decreased genome stability and elevated mutation rates specifically in the expanded single-copy regions. This structural change exposes former IR genes, now residing in single-copy areas, to higher substitution rates (approximately 3–4 times greater than in IR-retaining taxa), thereby accelerating overall plastome evolution within the IRLC. Such dynamics are evident in papilionoid legumes, where IRLC lineages exhibit notably higher nucleotide substitution rates compared to IR-containing relatives, correlating with increased structural rearrangements like inversions and indels.[35][36]
These accelerated evolutionary rates in IRLC plastomes may confer adaptive advantages by facilitating rapid divergence of plastid genes involved in photosynthesis and metabolism, potentially aiding the clade's extensive diversification into over 4,000 species across diverse habitats. Although no direct causal link exists between IR loss and nodulation, the IRLC's genetic dynamism parallels the evolution of nodule-specific cysteine-rich (NCR) peptides, which are unique to this clade and enable precise control of bacteroid differentiation in root nodules for enhanced nitrogen fixation. This co-evolutionary pattern with rhizobial symbionts underscores the IRLC's success, particularly in temperate ecosystems where herbaceous genera like Astragalus and Medicago dominate.[35][37][38]
Rare instances of IR re-expansion, such as the ~9 kb IR in Medicago minima and a smaller 425 bp version in its sister M. lupulina, demonstrate the reversibility of IR loss through repeat-mediated recombination mechanisms like Holliday junction resolution. These events, occurring within the last 3–9 million years, do not necessarily restore plastome stability, as ongoing illegitimate recombination driven by dispersed repeats continues to promote rearrangements in these lineages. Overall, the IRLC's plastomic innovations correlate with its ecological dominance in temperate regions, reflecting broader co-evolution with soil microbes without implying a primary role for IR loss in nodulation traits.[39][36]
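An inverted repeat, whether ancestral or regained, is in essence a segment whose reverse complement occurs elsewhere in the same molecule. The Python sketch below is a toy illustration of that idea, not the method used in the cited studies: the function names, the seed length k, and the synthetic sequences are all assumptions, and real plastome analyses would rely on dedicated alignment tools.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def longest_inverted_repeat(seq: str, k: int = 12) -> int:
    """Length of the longest exact segment whose reverse complement also
    occurs in a non-overlapping position of the same sequence (0 if none
    reaches k bases)."""
    seq = seq.upper()
    n = len(seq)
    rc = reverse_complement(seq)
    # Index every k-mer of the reverse-complemented sequence as a seed.
    seeds = {}
    for j in range(n - k + 1):
        seeds.setdefault(rc[j:j + k], []).append(j)
    best = 0
    for i in range(n - k + 1):
        for j in seeds.get(seq[i:i + k], ()):
            length = k
            # Extend the seed match to the right as far as it stays exact.
            while (i + length < n and j + length < n
                   and seq[i + length] == rc[j + length]):
                length += 1
            # The partner copy ends at forward-strand position n - j;
            # trim the match so the two copies cannot overlap.
            partner_end = n - j
            if i < partner_end:
                length = min(length, (partner_end - i) // 2)
            if length >= k:
                best = max(best, length)
    return best


# Synthetic example (random sequences, not real plastome data).
rng = random.Random(0)


def random_dna(length: int) -> str:
    return "".join(rng.choice("ACGT") for _ in range(length))


lsc, ir, ssc = random_dna(100), random_dna(60), random_dna(40)
with_ir = lsc + ir + ssc + reverse_complement(ir)   # quadripartite layout
without_ir = lsc + ir + ssc                         # one IR copy removed
print(longest_inverted_repeat(with_ir))     # at least 60, the planted IR
print(longest_inverted_repeat(without_ir))  # expected to be far shorter
```

Exact matching keeps the sketch short; diverged repeat copies, such as a recently regained IR that has accumulated substitutions, would need an approach that tolerates mismatches.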
Diversity and Distribution
Species Diversity
The inverted repeat-lacking clade (IRLC) encompasses approximately 4,000 species across about 56 genera, representing roughly 20% of the total species diversity in the Fabaceae family, which comprises around 20,000 species overall.[40][41] This substantial contribution highlights the IRLC's role as one of the most species-rich lineages within the subfamily Faboideae.
Species richness within the IRLC is highly uneven, dominated by a handful of megadiverse genera that account for the majority of its diversity. The genus Astragalus stands out with over 3,000 species, establishing it as the largest genus among all angiosperms and a key driver of the clade's overall species density. Similarly, Oxytropis harbors more than 300 species, further exemplifying concentrated diversification in select lineages.[42] These hotspots of speciation contrast with smaller genera, underscoring the clade's skewed distribution of diversity.
Diversification patterns in the IRLC feature notable radiations during the Miocene, particularly in temperate herbaceous groups, which facilitated rapid species accumulation in genera like Astragalus.[40] The clade shows lower species diversity in tropical lineages relative to temperate ones, aligning with its predominant adaptation to cooler climates and reflecting evolutionary biases toward higher-latitude habitats.[40]
Conservation challenges affect many IRLC species, with numerous narrow endemics, especially in megadiverse genera, threatened by habitat loss from agricultural intensification and urbanization; however, no complete extinction events across the clade have been recorded to date.[43][44]
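The proportions quoted in this section follow from simple arithmetic on the rounded species counts. The short Python sketch below reproduces them; the constants are the approximate figures from the text, and the helper name share is an illustrative choice.

```python
# Rounded species counts as quoted in the text (approximate figures).
FABACEAE_SPECIES = 20_000
IRLC_SPECIES = 4_000
ASTRAGALUS_SPECIES = 3_000


def share(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


print(f"IRLC share of Fabaceae: {share(IRLC_SPECIES, FABACEAE_SPECIES):.0f}%")
# prints "IRLC share of Fabaceae: 20%"
print(f"Astragalus share of IRLC: {share(ASTRAGALUS_SPECIES, IRLC_SPECIES):.0f}%")
# prints "Astragalus share of IRLC: 75%"
```

On these rounded numbers a single genus accounts for about three quarters of the clade, which is what the description of species richness as highly uneven amounts to.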
Geographic Distribution
The inverted repeat-lacking clade (IRLC) of legumes exhibits a predominantly Holarctic distribution, centered in the temperate zones of the Northern Hemisphere, with primary concentrations in Eurasia and North America.[21] The clade's range extends into subtropical regions of Africa and Asia, particularly through high-elevation habitats and Mediterranean-adjacent areas.[21] Centers of diversity are prominent in Central Asia, including the Irano-Turkestan region and the Sino-Himalayan Plateau, where genera like Astragalus dominate steppe ecosystems, and in the Mediterranean Basin, supporting diverse assemblages in grasslands.[45][46]
Biogeographic patterns within the IRLC reflect adaptation to temperate and montane environments, with Astragalus species prevalent in arid steppes across Central Asia and extending into North American prairies.[47] In contrast, clovers (Trifolium) are characteristic of European and Mediterranean grasslands, with highest diversity in temperate Old World regions.[48] The clade has only a limited presence in the Neotropics, where it occurs primarily through historical introductions of genera like Trifolium and Medicago to South and Central America rather than through natural colonization.[21]
The biogeographic history of the IRLC traces to a Laurasian origin in the late Eocene, approximately 39 million years ago, with subsequent diversification driven by vicariance and long-distance dispersal events.[21] Post-Eocene cooling facilitated spread across northern continents, including Beringian migrations to North America and southward extensions into African highlands via Miocene dispersals.[2] Current threats from climate change particularly affect high-altitude distributions, such as those of Oxytropis in Eurasian and North American mountains, where warming is predicted to contract suitable habitats and shift ranges poleward.[49][50]
Economic and Ecological Significance
Agricultural Crops
The inverted repeat-lacking clade (IRLC) within the Fabaceae family encompasses several economically vital legumes that form the backbone of global pulse and forage production. Key domesticated crops include chickpeas (Cicer arietinum), lentils (Lens culinaris), peas (Pisum sativum), alfalfa (Medicago sativa), and broad beans (Vicia faba), which together contribute significantly to food security and livestock feed.
Domestication of these IRLC crops originated approximately 10,000 years ago in the Near East's Fertile Crescent, where early farmers selected for non-shattering pods and larger seeds to enhance harvest efficiency. Today, these and other IRLC pulses contribute to global dry pulse production, which reached approximately 96 million metric tons in 2023, with chickpeas, lentils, and dry peas accounting for about 35 million tons.[51] Alfalfa ranks as the world's leading forage legume, cultivated on over 30 million hectares according to estimates from the 2020s, primarily in the United States, Argentina, and China.
These crops offer substantial agricultural benefits, including high protein content (20–25% in dry seeds of chickpeas and lentils), making them essential for human nutrition in vegetarian diets and for animal feed. Their symbiotic nitrogen fixation with rhizobial bacteria improves soil fertility, reducing the need for synthetic fertilizers and supporting sustainable farming; for instance, alfalfa can fix up to 200 kg of nitrogen per hectare per year. Modern breeding has yielded high-yielding varieties, such as hybrid alfalfas that produce 15–20 tons of dry matter per hectare under optimal conditions, enhancing productivity in arid regions.
Despite these advantages, the clade also poses agricultural challenges. Cultivated IRLC crops are susceptible to pests, and certain wild Astragalus species, known as locoweeds, produce swainsonine toxins that poison livestock grazing on infested pastures, leading to significant economic losses in rangelands. Breeding efforts are increasingly focused on drought tolerance, with genetic improvements in peas and chickpeas enabling yields to be maintained under water-limited conditions, as demonstrated by varieties that sustain 2–3 tons per hectare with 30% less irrigation.
Other Uses and Roles
Members of the inverted repeat-lacking clade (IRLC) exhibit diverse non-agricultural applications, particularly in medicine and horticulture. The root extracts of Glycyrrhiza glabra (liquorice), a prominent IRLC species, are widely utilized in pharmaceuticals for their anti-inflammatory, antioxidant, antimicrobial, and antiviral properties, aiding in treatments for respiratory disorders, peptic ulcers, and infections.[52] Similarly, Trigonella foenum-graecum (fenugreek) seeds serve as a key ingredient in health supplements due to their antidiabetic, anti-inflammatory, and galactagogue effects, while also functioning as a spice in culinary applications.[53][54]
In ornamental horticulture, IRLC genera like Wisteria are valued for landscaping, where their vigorous climbing vines adorn arbors, pergolas, and trellises, providing aesthetic appeal through cascading blooms.[55] Clovers (Trifolium spp.), another IRLC group, are incorporated into lawns as low-maintenance ground cover and support honey production, as their nectar-rich flowers attract foraging bees.[56][57]
Ecologically, IRLC legumes play significant roles in grassland ecosystems by enhancing biodiversity through nitrogen fixation and habitat provision, contributing to species richness in semi-arid temperate regions.[58][59] However, certain Astragalus species, such as A. variabilis, exhibit invasive potential as poisonous weeds, disrupting rangelands and causing economic losses in livestock husbandry by inducing locoism.[60] Additionally, IRLC plants serve as vital forage for wildlife, supporting pollinators, insects, and herbivores while bolstering overall ecosystem services like soil stabilization.[61][62]
In research, Medicago truncatula stands out as a model organism for studying symbiotic interactions, particularly nitrogen-fixing nodulation with rhizobia, enabling insights into plant-microbe signaling and metabolic exchanges.[63][64] Biotechnological applications leverage nodule-specific cysteine-rich (NCR) peptides from M. truncatula, which promote bacteroid differentiation and exhibit antimicrobial properties, offering potential tools for enhancing crop resistance and sustainable agriculture.[65][66][67]