
WD40 repeat

The WD40 repeat is a short structural motif found in numerous eukaryotic proteins, typically comprising 40–60 amino acid residues and characterized by a conserved glycine-histidine (GH) dipeptide near the N-terminus and a tryptophan-aspartate (WD) dipeptide at the C-terminus. These repeats fold into four anti-parallel β-strands (labeled A–D) that form individual blades in a β-propeller architecture, most commonly with seven blades arranged in a circular scaffold. First identified in G-protein β-subunits, the WD40 repeat serves primarily as a non-enzymatic platform for mediating protein–protein interactions.

WD40 domain-containing proteins are among the most abundant and versatile in eukaryotic genomes, with over 260 non-redundant members identified in humans alone, often grouped into subfamilies based on additional domains or functions. The β-propeller provides a stable, rigid scaffold with distinct surfaces (the top for specific binding, the bottom for core interactions, and the sides for inter-domain contacts), enabling recognition of short linear motifs, post-translational modifications, and even nucleic acids. This adaptability allows WD40 proteins to participate in diverse cellular processes, including signal transduction (e.g., in G-protein signaling), cell cycle regulation (e.g., via CDC20 in the anaphase-promoting complex), signaling through TRAF proteins, and chromatin modification (e.g., in histone methyltransferase complexes).

Notable examples include the G-protein β-subunits, which were pivotal in the motif's discovery and facilitate heterotrimeric G-protein activation in response to G-protein-coupled receptors, and the COP1 protein, which uses its WD40 domain to target transcription factors for degradation in plant photomorphogenesis and animal circadian rhythms. In disease contexts, mutations in WD40 repeat proteins such as WDR45 are linked to neurodegeneration, for example beta-propeller protein-associated neurodegeneration (BPAN), highlighting their essential cellular roles. Overall, the WD40 repeat's evolutionary conservation across eukaryotes underscores its fundamental importance in assembling multiprotein complexes for regulatory functions.

Structure and Sequence

Sequence Motif

The WD40 repeat is a short structural motif typically comprising 40–60 amino acid residues; the name "WD40" reflects the roughly 40-residue length and the terminal tryptophan-aspartate dipeptide. The motif is characterized by a conserved glycine-histidine (GH) dipeptide near the N-terminus and a tryptophan-aspartate (WD) dipeptide at the C-terminus. The sequence of the WD40 repeat exhibits considerable variability, particularly in the flexible loop regions between β-strands, while maintaining a pattern featuring the GH and WD dipeptides with intervening hydrophobic residues, as captured by models like the PROSITE-style pattern [LIVMSTAC]-[LIVMFYWSTAGC]-[LIMSTAG]-[LIVMSTAGC]-x(2)-[DN]-x-{P}-[LIVMWSTAC]-{DP}-[LIVMFSTAG]-W-[DEN]-[LIVMFSTAGCN]. This pattern reflects the essential sequence features that define the motif across diverse proteins. A typical WD40 domain comprises 4–9 repeats, with 7 being the most common number, which collectively fold into a closed propeller architecture. Identification of WD40 repeats in protein sequences relies on computational methods such as hidden Markov models (HMMs), which model the probabilistic patterns of the motif to predict its presence with high accuracy.
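
As a rough illustration of how a sequence pattern like the one quoted above can be applied, the following Python sketch translates the pattern syntax into a regular expression and scans a sequence for matches. The function names and the toy sequence are illustrative assumptions, not part of any published tool, and the converter handles only the syntax elements that occur in this pattern (residue sets, excluded sets, `x`, and repeat counts).

```python
import re

# Pattern as quoted in the text (bracket = allowed residues,
# braces = excluded residues, x = any residue, (n) = repeat count).
WD_PATTERN = (
    "[LIVMSTAC]-[LIVMFYWSTAGC]-[LIMSTAG]-[LIVMSTAGC]-x(2)-[DN]-x-{P}-"
    "[LIVMWSTAC]-{DP}-[LIVMFSTAG]-W-[DEN]-[LIVMFSTAGCN]"
)

# One pattern element: a set, an excluded set, or a single letter,
# optionally followed by (n) or (n,m).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-syntax pattern into a regex string."""
    tokens = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            token = "."                       # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"   # excluded residues
        else:
            token = core                      # residue set or literal
        if low:
            token += "{" + low + ("," + high if high else "") + "}"
        tokens.append(token)
    return "".join(tokens)


def find_wd_signatures(sequence):
    """Return (start, matched_text) for each non-overlapping hit (0-based)."""
    regex = re.compile(prosite_to_regex(WD_PATTERN))
    return [(m.start(), m.group()) for m in regex.finditer(sequence.upper())]


# Hypothetical toy sequence constructed to contain one matching segment.
toy = "DDLASGSSDKTIKLWDLGG"
hits = find_wd_signatures(toy)
print(hits)
```

A rigid pattern of this kind can miss divergent repeats, which is one reason profile-based methods such as HMMs are preferred for real annotation; the sketch is only meant to make the pattern notation concrete.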

Three-Dimensional Structure

The WD40 repeat domain adopts a circular β-propeller fold, typically consisting of 4 to 8 blades arranged symmetrically around a central axis, with each blade formed by four anti-parallel β-strands designated A (innermost) to D (outermost). This architecture creates a tapered structure that serves as a stable scaffold for protein interactions. The overall propeller spans approximately 4–6 nm in diameter, with a central tunnel about 1–2 nm wide.

Because the sequence repeat is offset from the structural blade, the β-strands within each blade are contributed by adjacent repeats: each sequence repeat supplies the outer D strand of one blade and the A, B, and C strands of the next. The inner A and B strands form hydrogen bonds with each other, while the outer C and D strands are linked by flexible loops, enabling the skewed arrangement that propagates around the ring. Closure of the ring occurs via a "velcro" mechanism in the final blade, where the N-terminal strand supplies the outer element and the C-terminal strands supply the inner ones, a closure thought to stabilize the folded propeller.

The propeller's stability is reinforced by a hydrophobic core formed by packed side chains from the β-strands, supplemented by inter-blade salt bridges and hydrogen bonds that rigidify the interfaces between repeats. Crystal structures exemplify this fold, such as the 7-bladed propeller of human WDR5 (PDB: 2GNQ), which demonstrates the symmetric arrangement and conserved core interactions. Variations exist, including incomplete propellers with fewer than 4 full blades that may oligomerize to approximate a complete propeller, or hybrid domains where WD40 motifs fuse with other folds like TPR repeats for specialized functions.
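
The offset between sequence repeats and structural blades can be made concrete with a small bookkeeping sketch. It assumes the commonly described convention in which each sequence repeat contributes strand D to one blade and strands A, B, and C to the following blade, with the ring closing so that the last repeat wraps around to the first blade. The function name and the 0-based indexing are illustrative choices, not taken from any structure-analysis library.

```python
def assign_strands(n_repeats):
    """Map each blade's strands (A-D) to the sequence repeat that supplies them.

    Assumed convention: repeat i gives strand D to blade i and strands
    A, B, C to blade (i + 1) mod n_repeats. Indices are 0-based.
    """
    if n_repeats < 1:
        raise ValueError("n_repeats must be positive")
    blades = [dict() for _ in range(n_repeats)]
    for repeat in range(n_repeats):
        blades[repeat]["D"] = repeat              # outer strand of this blade
        next_blade = (repeat + 1) % n_repeats     # wraps around for ring closure
        for strand in "ABC":
            blades[next_blade][strand] = repeat   # inner strands of next blade
    return blades


blades = assign_strands(7)
for index, blade in enumerate(blades):
    print(index, {s: blade[s] for s in "ABCD"})
```

Under this assumption, blade 0 is the closure blade: its outer D strand comes from the N-terminal repeat (index 0) while its A, B, and C strands come from the C-terminal repeat (index 6), which mirrors the "velcro" description above. Every other blade combines strands from two neighboring repeats.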

Function and Interactions

Protein-Protein Interactions

WD40 repeats facilitate protein-protein interactions primarily through the exposed surfaces of their β-propeller fold, where the top and bottom faces, along with interconnecting loops and edges, provide versatile platforms for molecular recognition and binding. These surfaces are characterized by conserved residues that enable stable or transient associations with partner proteins, allowing WD40 domains to act as central hubs in complex assemblies. In their adaptor role, WD40 repeats serve as scaffolds for multi-protein complexes, recruiting key regulatory elements such as ubiquitin ligases and transcription factors to coordinate sequential or cooperative interactions. The modular nature of the β-propeller architecture supports simultaneous binding of multiple partners, enhancing the efficiency of signaling and regulatory processes at the molecular level.

Binding specificity is dictated by distinct electrostatic and hydrophobic patches distributed across the propeller surface, which selectively engage complementary features on partner molecules; for example, clusters of charged residues on the top face can interact with oppositely charged motifs like those on histone tails. Hydrophobic grooves and rings formed by conserved residues further stabilize these interfaces, ensuring partner selectivity amid diverse cellular contexts. The dynamics of WD40-mediated interactions are supported by flexible loops that undergo localized conformational changes upon binding, which refine the binding pocket and enhance overall affinity without disrupting the core rigidity. These adaptive adjustments allow fine-tuning of interaction strength and duration.

Co-immunoprecipitation assays have confirmed the formation of these multi-component complexes, revealing dependencies on specific surface residues for partner recruitment. Complementarily, NMR spectroscopy has elucidated the transient and dynamic aspects of WD40 interactions, capturing intermediate states and loop movements that underpin binding.

Roles in Cellular Processes

WD40 repeat-containing proteins serve as versatile scaffolds that integrate and modulate diverse cellular pathways, facilitating signal transduction, protein degradation, transcriptional control, and cell fate decisions. These proteins often act as hubs in multiprotein complexes, enabling precise regulation of cellular processes through β-propeller structures that coordinate interactions without enzymatic activity of their own.

In signal transduction, WD40 repeats contribute to pathway modulation by recognizing phosphorylated substrates and linking them to degradation machinery, as seen in the Wnt/β-catenin pathway, where they facilitate β-catenin turnover to prevent aberrant signaling. For instance, the WD40 domain in substrate adaptors binds phosphorylated β-catenin, promoting its ubiquitination and degradation, thereby maintaining low cytoplasmic levels in the absence of Wnt ligands and allowing rapid response to signaling cues. This scaffolding role ensures tight control over pathway activation, preventing constitutive signaling that could lead to uncontrolled proliferation.

WD40 repeats play a key role in protein degradation by functioning as substrate adaptors within cullin-RING ubiquitin ligase complexes, such as CUL4-DDB1, where they selectively recruit targets for ubiquitination and proteasomal breakdown. These adaptors, often containing DWD motifs, dock onto the DDB1 subunit via specific sequence elements, enabling the ligase to target a wide array of substrates involved in DNA repair, replication, and cell cycle progression. This adaptability allows a single cullin scaffold to assemble numerous distinct ligases, each tailored to specific cellular needs, thus regulating protein levels with precision.

In transcription regulation, WD40 repeats mediate the recruitment of co-activators and repressors to chromatin, influencing histone modifications and gene expression patterns. For example, certain WD40 proteins recognize unmodified or symmetrically dimethylated arginine 2 on histone H3, facilitating the assembly of Trithorax group complexes that deposit activating H3K4 methylation marks at promoters and enhancers. Conversely, other WD40 domains bind repressive marks like H3K27me3, aiding polycomb repressive complex 2 in silencing developmental genes, thereby balancing activation and repression to fine-tune transcriptional outputs across cellular contexts.

WD40 repeats are integral to cell cycle and apoptosis control, particularly through their involvement in checkpoint mechanisms that ensure genomic stability. In the anaphase-promoting complex/cyclosome (APC/C), WD40-containing activators bind substrates bearing destruction motifs, triggering their ubiquitination to drive mitotic progression. For instance, the seven WD40 repeats of CDC20 form a β-propeller that recruits securin and cyclin B for degradation, enabling sister chromatid separation and mitotic exit, while dysregulation can activate apoptotic pathways in response to prolonged mitotic arrest.

Dysregulation of WD40 repeat proteins is implicated in various diseases, notably cancer and endocrine disorders, where altered function disrupts normal pathway fidelity. In oncogenesis, aberrant WD40 function can stabilize pro-proliferative signals or impair degradation of oncoproteins, contributing to tumorigenesis across multiple tissues. In endocrine contexts, mutations or misregulation of these proteins interfere with signaling, leading to disruptions in hormone-responsive processes and to associated syndromes.

Occurrence and Evolution

Distribution Across Organisms

WD40 repeats are ubiquitous across eukaryotic organisms, serving as a fundamental structural motif in diverse cellular contexts. Genome-wide analyses have identified approximately 262 non-redundant WD40-containing genes in the human genome, reflecting their extensive role in complex multicellular eukaryotes. In yeast, a simpler eukaryote, the count is notably lower, with 83 WD40 proteins annotated, indicating a scaling with organismal complexity. In contrast, WD40 repeats are rare in prokaryotes, with systematic surveys identifying only about 4,000 such proteins across bacterial and archaeal genomes, primarily in select phyla like Actinobacteria and Proteobacteria. This scarcity supports the hypothesis of a predominantly eukaryotic origin and expansion of the WD40 family, though isolated prokaryotic instances suggest possible ancient horizontal transfer.

Among plants, WD40 repeats exhibit expansions tailored to specific physiological demands, such as growth and metabolism. A 2024 genome-wide study in Capsicum annuum (pepper) revealed 269 CaWD40 genes, many implicated in developmental and metabolic pathways. Similarly, a 2025 pan-WD40ome analysis across 26 diverse maize inbred lines identified 6,849 WD40 genes, highlighting line-specific structural and functional diversity that underscores adaptive variations in plant genomes. In vertebrates, WD40 gene numbers generally exceed those in , correlating with increased regulatory complexity; for instance, counts surpass the ~172 in (). Comprehensive annotation resources like WDSPdb facilitate cross-species comparisons, cataloging over 600,000 predicted WD40 proteins from 4,426 species, including detailed structural predictions for eukaryotic and rare prokaryotic entries.

Evolutionary Conservation

The WD40 repeat domain traces its origins to the last eukaryotic common ancestor (LECA), where it emerged as a versatile structural module essential for protein interactions across diverse cellular functions. Phylogenetic analyses indicate that core WD40 repeats have been highly conserved throughout eukaryotic evolution, maintaining structural integrity and functional roles in phyla ranging from protists to metazoans. This deep conservation underscores the domain's fundamental importance, with homologs identifiable in nearly all eukaryotic lineages but rare or rudimentary in prokaryotes, suggesting a largely eukaryotic-specific innovation.

Sequence conservation of WD40-containing proteins exhibits gradients correlated with the number of repeats, where proteins harboring more repeats (typically 7 or more) display elevated evolutionary conservation compared to those with fewer (often 4–6). Recent pan-genome analyses of maize inbred lines reveal that WD40 proteins with higher repeat counts are more frequently retained in core orthogroups across diverse accessions, reflecting stronger purifying selection and functional constraints. In contrast, proteins with fewer repeats show greater variability and are prone to lineage-specific losses, highlighting how repeat multiplicity influences resilience to evolutionary pressures. These patterns align with observations in other eukaryotes, where multi-repeat architectures correlate with enhanced conservation under selection.

The expansion of WD40 repeats primarily occurred through tandem and segmental gene duplications, enabling the diversification of β-propeller structures while preserving core motifs. Tandem duplications within genes facilitated the addition of repeats, promoting architectural complexity, whereas whole-gene duplications contributed to family-wide proliferation. Horizontal gene transfer appears rare, confined mostly to prokaryotic contexts, and is not a significant driver of eukaryotic WD40 evolution.

Adaptive variations in WD40 repeats reflect lineage-specific evolutionary pressures, particularly in specialized metabolic and signaling pathways. In plants, such as pepper (Capsicum annuum), WD40 proteins have evolved to regulate specialized metabolism, including pigment biosynthesis, as evidenced by genome-wide identifications linking specific CaWD40 genes to pigmentation. In animals, lineage-specific expansions of the WD40 family have supported adaptations in signaling cascades, with duplications enhancing roles in pathways like those involving kinases and regulators. These divergences illustrate how conserved scaffolds accommodate functional innovations without compromising structural fidelity.

Examples and Applications

Notable Proteins Containing WD40 Repeats

One prominent example of a protein featuring WD40 repeats is WDR5, which forms a seven-bladed β-propeller structure essential for its role in histone methyltransferase complexes. WDR5 serves as a core subunit in the MLL/SET1 complex, where its WD40 domain binds to the histone H3 tail, facilitating H3K4 trimethylation and thereby promoting transcriptional activation. This interaction is critical for the complex's catalytic activity, as structural studies reveal that the central cavity of the WDR5 β-propeller accommodates the unmodified arginine 2 of histone H3, stabilizing substrate presentation to the methyltransferase.

In cell cycle regulation, CDC20 exemplifies a WD40 repeat-containing activator of the anaphase-promoting complex/cyclosome (APC/C). CDC20 possesses a C-terminal WD40 domain that forms a β-propeller, enabling it to bind substrates and activate APC/C ubiquitination during mitosis. This domain recognizes destruction motifs like D-boxes and KEN-boxes on mitotic regulators, ensuring timely degradation for the metaphase-to-anaphase transition and mitotic exit.

DDB1, involved in the DNA damage response, is a key adaptor in the cullin4-DDB1 ubiquitin ligase complex, characterized by three WD40 β-propeller domains (BPA, BPB, and BPC) that form a unique triple-propeller architecture. The BPA and BPC propellers create a docking platform for substrate receptors, while BPB interacts with cullin4, facilitating ubiquitination of proteins like histones and replication factors in response to UV-induced damage.

The G-protein signaling pathway is illustrated by transducin β (GNB1), whose seven WD40 repeats assemble into a seven-bladed β-propeller that mediates heterotrimer formation with Gα and Gγ subunits. This structure is pivotal in phototransduction, as the propeller's blades provide interaction surfaces for effector binding, transmitting signals from activated rhodopsin to downstream effectors in rod cells.

In plants, TTG1 (TRANSPARENT TESTA GLABRA1) in Arabidopsis thaliana is a WD40 repeat protein that regulates trichome development through its β-propeller domain, which interacts with MYB and bHLH transcription factors to form an activator complex. Mutations in TTG1 disrupt this complex, leading to glabrous phenotypes and impaired epidermal cell differentiation, underscoring its role in cell fate specification.

Proteins with hybrid domains, such as COP1 in plants, combine WD40 repeats with other motifs to integrate signaling pathways. COP1 features a C-terminal WD40 β-propeller alongside an N-terminal RING finger and a coiled-coil domain, enabling it to function as an E3 ubiquitin ligase in photomorphogenesis by targeting transcription factors like HY5 for degradation in the dark. The WD40 domain in COP1 facilitates substrate recognition and assembly with SPA proteins, balancing light-dependent de-etiolation.

Therapeutic and Research Applications

WD40 repeat domains, traditionally viewed as challenging targets because protein-protein interactions (PPIs) were considered "undruggable," have emerged as viable candidates for small-molecule intervention through systematic ligandability assessments. A 2025 evaluation systematically screened and characterized ligands for over 100 WD40 repeat (WDR)-containing proteins, identifying potent binders for several family members and demonstrating that these β-propeller structures can accommodate drug-like molecules at conserved sites, thereby opening avenues for modulating PPIs in disease contexts. This approach builds on initiatives like the Structural Genomics Consortium's Target 2035 project, which from 2020 to 2025 focused on WDR proteins to develop chemical probes and advance dark proteome targeting.

Proteolysis-targeting chimeras (PROTACs) exploiting WD40 repeats have shown promise for selective protein degradation, particularly via DCAF1 and WDR5. In 2025, a small-molecule probe, OICR-41103, was developed as a potent, selective binder to the DCAF1 WD40 domain, enabling recruitment to the CRL4 ligase for targeted degradation of neo-substrates in cellular models. For WDR5, DCAF1-based PROTACs were reported in 2024, with crystal structures of ternary complexes revealing how these bifunctional molecules induce WDR5 ubiquitination and proteasomal degradation, reducing oncogenic signaling in cancer cells. Similarly, VHL-recruiting WDR5 PROTACs discovered in 2023–2024 achieved nanomolar potency in degrading WDR5, suppressing proliferation in pancreatic ductal adenocarcinoma models.

In cancer therapeutics, WD40 proteins, especially WDR5, are of interest because of their links to oncogenesis, in the case of WDR5 through its role in MLL complexes. A review highlighted WDR5 inhibitors and degraders with activity in leukemias and solid tumors, with expanded profiling in 2024 confirming their efficacy in disrupting proliferation across multiple cancer cell lines. Pharmacologic WDR5 degradation via PROTACs in 2025 further suppressed synovial sarcoma growth in vitro and in vivo by targeting its interaction with SS18-SSX fusion proteins. Emerging links to endocrine dysregulation involve WD40-mediated signaling, though therapeutic modulation remains preclinical.

Research tools leveraging WD40 repeats have advanced functional genomics and structural studies. CRISPR-Cas9 knockout screens have identified essential WD40 proteins in viral replication and cellular pathways; for instance, a 2022 screen pinpointed WDR81 as a host factor for reovirus infection, enabling dissection of WD40 roles in endosomal trafficking. Cryo-electron microscopy (cryo-EM) has provided high-resolution insights into WD40-containing complexes, such as the 2024 structure of the Spo11 core complex bound to DNA, revealing how WD40 repeats from Ski8 organize meiotic double-strand break machinery. These tools facilitate targeted perturbations and visualization of dynamic assemblies.

Challenges in WD40-targeted therapies include achieving specificity amid multi-WD40 architectures and avoiding off-target effects in the expansive WDR family. Pan-WD40ome analyses, such as a 2025 study across maize inbred lines, underscore structural diversity and conservation, informing human efforts but highlighting the need for isoform-selective compounds. Future directions emphasize comprehensive WD40ome mapping and integrative approaches combining AI-driven design with high-throughput screens to overcome these hurdles.