Gene co-expression network

A gene co-expression network (GCN) is an undirected graph model in bioinformatics where nodes represent individual genes and edges denote statistically significant correlations in their expression levels across multiple samples, conditions, or tissues, enabling the inference of potential functional associations among genes without implying direct causation. These networks are constructed from high-throughput transcriptomic data, such as microarray or RNA-sequencing profiles, by first computing pairwise similarity measures (typically Pearson's or Spearman's correlation coefficients) between gene expression profiles, then applying thresholds or transformations to define connections. The result is often an approximately scale-free topology in which a few highly connected "hub" genes dominate. A prominent approach, weighted gene co-expression network analysis (WGCNA), enhances this by raising correlation values to a soft-thresholding power (β) to create continuous adjacency weights, preserving nuanced co-expression information and facilitating the detection of biologically meaningful modules, or clusters, of co-expressed genes.

GCNs leverage the "guilt-by-association" principle, positing that co-expressed genes are likely involved in shared biological pathways, regulatory mechanisms, or responses to perturbations, thus aiding in the functional annotation of unknown genes and the prioritization of candidates for traits like disease susceptibility or crop resilience. In practice, they are analyzed using tools such as R packages (e.g., WGCNA) or web-based platforms, which support visualization, module identification via hierarchical clustering or topological overlap, and integration with other omics data for multi-layer insights. Applications span biomedical research, where GCNs help uncover gene modules linked to Alzheimer's disease or cancer progression, and plant sciences, where GCNs from databases like ATTED-II or RiceFREND inform breeding for stress tolerance. Challenges persist in handling data noise, context-specificity, and distinguishing direct from indirect interactions.

Definition and Fundamentals

Overview and Definition

A gene co-expression network is an undirected graph model that represents relationships between genes based on the similarity of their expression profiles derived from high-throughput transcriptomic data. In this framework, nodes correspond to genes or transcripts, while edges denote the degree of co-expression between connected nodes, typically measured across multiple samples, tissues, or experimental conditions. These networks provide a systems-level view of gene interactions, highlighting patterns where genes exhibiting similar expression behaviors are assumed to share functional associations.

The primary data sources for constructing gene co-expression networks are transcriptomic datasets, including microarray experiments and RNA-sequencing (RNA-seq) profiles, which quantify mRNA abundance in diverse biological contexts such as developmental stages, disease states, or perturbations. For instance, large-scale compendia of expression data from organisms such as humans, yeast, and worms enable the identification of conserved co-expression patterns across species. Key structural components include the nodes (genes), the edges (co-expression similarities), and the overall topology, which can be unweighted, using binary connections above a threshold, or weighted, preserving continuous edge strengths to better reflect nuanced correlations.

In contrast to gene regulatory networks, which model directed causal interactions often mediated by transcription factors and regulatory elements, co-expression networks focus solely on correlative patterns without inferring directionality or mechanisms. This correlative approach assumes that co-expressed genes are likely involved in coordinated biological functions, such as shared pathways, offering a foundation for inferring functional modules.

Biological Interpretation

Gene co-expression networks are grounded in the biological principle known as "guilt by association," which posits that genes exhibiting similar expression patterns across diverse conditions are likely to share common regulatory mechanisms, participate in the same biochemical pathways, or contribute to overlapping cellular functions. This principle leverages the idea that coordinated gene expression reflects underlying functional relationships, allowing researchers to infer the roles of uncharacterized genes from well-studied neighbors in the network.

For instance, modules of co-expressed genes have been identified in yeast during the cell cycle, where genes involved in DNA replication and mitosis show synchronized expression profiles, highlighting conserved regulatory controls. In model organisms like Arabidopsis thaliana, co-expression analysis has revealed clusters of genes responsive to biotic stresses, such as pathogen attacks, where upregulated genes encode defense-related proteins like PR1 and WRKY transcription factors, demonstrating how networks capture adaptive biological responses. These examples underscore the conserved nature of co-expression patterns across species, providing evidence for functional linkages beyond direct physical interactions.

However, interpreting co-expression requires caution, as correlation does not imply direct causal interactions or physical associations between gene products; observed patterns may instead arise indirectly through shared upstream regulators or environmental influences. Common regulators, such as transcription factors, can drive apparent co-expression without the genes operating in the same complex or pathway.

Within systems biology, gene co-expression networks serve as proxies for identifying functional modules (groups of genes that collectively perform coherent biological roles), facilitating a holistic view of cellular organization and response dynamics. This approach integrates expression data with broader network models to elucidate emergent properties of biological systems.

Historical Development

Early Concepts

The emergence of gene co-expression networks traces its origins to the 1990s, coinciding with the development of high-throughput technologies for measuring gene expression on a genomic scale. The introduction of cDNA microarray technology by Schena et al. in 1995 revolutionized the field by enabling the parallel quantification of mRNA levels for thousands of genes, allowing researchers to capture dynamic expression patterns across diverse biological conditions for the first time. This shift from targeted, single-gene analyses to comprehensive profiling laid the groundwork for identifying patterns of coordinated gene activity, highlighting the limitations of reductionist approaches in understanding complex cellular processes.

Building on this technological foundation, early analytical methods emphasized grouping genes with similar expression profiles to infer functional relationships. A seminal contribution came from Eisen et al. in 1998, who presented a system for cluster analysis and visualization of genome-wide expression data using standard statistical algorithms, with Pearson correlation as the similarity measure. Applied to yeast cell cycle data, their approach, implemented in the freely available Cluster and TreeView software, revealed distinct clusters of genes that oscillated in synchrony, often corresponding to known functional categories such as protein synthesis. These findings demonstrated that genes exhibiting correlated expression dynamics are likely co-regulated and involved in shared biological pathways, serving as an initial step toward modeling gene interdependencies.

The primary motivation for these early concepts was to move beyond isolated single-gene studies toward a systems-level comprehension of gene expression, where coordinated expression patterns could reveal underlying regulatory mechanisms and functional modules within the genome. By the late 1990s, this perspective began incorporating influences from graph theory, adapting concepts from related network analyses and nascent protein-protein interaction maps to conceptualize genes as nodes in interconnected structures, thereby facilitating the transition from static clusters to network representations of biological relationships.

Key Milestones

The field of gene co-expression networks saw significant advancements in the early 2000s, building on initial microarray data analyses to enable functional predictions through correlated expression patterns. A foundational contribution came from Butte and Kohane, who in 2000 introduced relevance networks, demonstrating how pairwise mutual information measurements could cluster genes and predict functions based on co-expression similarities across diverse datasets. This approach highlighted the potential of co-expression for inferring biological relationships without prior annotations, influencing subsequent network-based methodologies.

A major breakthrough occurred in 2005 with the introduction of weighted gene co-expression network analysis (WGCNA) by Zhang and Horvath, which formalized a framework for constructing networks using soft thresholding to preserve continuous connectivity weights. This method shifted the field toward assuming scale-free topology in co-expression networks, where a few highly connected hub genes dominate interactions, mirroring the robustness and modularity of biological systems. The scale-free assumption facilitated the identification of biologically meaningful modules and improved network interpretability in large-scale genomic studies.

In 2008, Langfelder and Horvath extended WGCNA through an R package that streamlined network construction, module detection via hierarchical clustering and dynamic tree cutting, and topological analysis, making the approach accessible for widespread adoption in systems biology research. Concurrently, databases like COXPRESdb, first released by Obayashi and Kinoshita in 2007 and refined in 2009 with rank-based correlation measures for cross-species comparability, began aggregating co-expression data from multiple organisms, enabling comparative analyses and resource sharing. These tools solidified co-expression networks as a standard approach for exploring gene functions and regulatory mechanisms.

The 2010s marked expansions driven by next-generation sequencing (NGS), which provided deeper transcriptomic resolution and facilitated co-expression analyses in non-model organisms and under diverse conditions. Integration of NGS data with co-expression frameworks revealed finer-grained networks that capture condition-specific interactions. By the mid-2010s, the field evolved from static models to dynamic and temporal co-expression networks, incorporating time-series data to model evolving relationships during processes like development or disease progression.

In the 2020s, advancements continued with the integration of single-cell RNA sequencing (scRNA-seq), enabling the construction of co-expression networks at cellular resolution to uncover cell-type-specific interactions and heterogeneity. Databases were also updated: GeneFriends, for example, was expanded in 2022 to incorporate data from thousands of samples across humans and mice. Emerging methods incorporated machine learning, exemplified by GeneRAIN in 2025, which uses learned models to capture complex gene relationships from large-scale bulk datasets. These developments have further expanded GCN applications in precision medicine as of 2025.

Construction Methods

Co-expression Measures

Gene co-expression measures quantify the similarity between the expression profiles of pairs of genes across a set of samples, serving as the foundational step in identifying potential regulatory or functional relationships. The most widely adopted measure is the Pearson correlation coefficient, which assesses linear dependencies and was first applied to cluster genes based on microarray data in early genome-wide expression studies. It is computationally efficient for large datasets and is most appropriate when expression levels are approximately Gaussian. The Pearson correlation coefficient $r$ between two gene expression vectors $X$ and $Y$ is defined as $r = \frac{\text{cov}(X, Y)}{\sigma_X \sigma_Y}$, where $\text{cov}(X, Y) = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y})$ is the sample covariance, $\sigma_X$ and $\sigma_Y$ are the sample standard deviations of $X$ and $Y$, and $n$ is the number of samples. Normalizing the covariance in this way yields values between $-1$ and $1$ that express the strength and direction of linear co-variation independently of the scales of the variables.

As a non-parametric alternative, the Spearman rank correlation coefficient evaluates monotonic relationships by ranking expression values and applying the Pearson formula to the ranks, making it robust to the non-Gaussian distributions and outliers common in expression data. Mutual information, a measure from information theory, captures non-linear dependencies by quantifying the reduction in uncertainty about one gene's expression given knowledge of another's. It is computed as $I(X; Y) = H(X) + H(Y) - H(X, Y)$, where $H$ denotes entropy; it generalizes correlation but requires discretization or density estimation for continuous expression data.

In weighted gene co-expression network analysis (WGCNA), raw correlation measures like Pearson are transformed via soft-thresholding to produce connection strengths that approximate scale-free topology, using the adjacency function $a_{ij} = |r_{ij}|^\beta$, where $\beta \geq 1$ is a soft-thresholding power chosen to preserve strong correlations while down-weighting weak ones. The value of $\beta$ is determined by fitting the resulting network's connectivity distribution to a power-law model, and typically ranges from 6 to 12 for expression data.

Key considerations in selecting co-expression measures include handling technical noise through robust variants or filtering, because small sample sizes can inflate spurious correlations. This applies particularly to mutual information, which is sensitive to estimation bias. Pearson correlation is preferred for approximately normally distributed data such as log-transformed RNA-seq counts, while Spearman correlation or mutual information suits skewed or heterogeneous profiles, where Pearson correlation would underestimate non-linear interactions.
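The measures described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the WGCNA implementation: the function names, the toy expression matrix, and the choice to zero the diagonal are assumptions made for the example, and the rank computation does not average ties.

```python
import numpy as np

def pearson_matrix(expr):
    """Pairwise Pearson correlation between rows (genes) of a genes x samples array."""
    return np.corrcoef(expr)

def spearman_matrix(expr):
    """Spearman correlation: Pearson correlation computed on per-gene ranks.

    Ranks come from a double argsort, which does not average ties; this is
    adequate for continuous values but is a simplification for tied data.
    """
    ranks = np.argsort(np.argsort(expr, axis=1), axis=1).astype(float)
    return np.corrcoef(ranks)

def soft_threshold_adjacency(corr, beta=6):
    """Unsigned soft-thresholded adjacency a_ij = |r_ij| ** beta."""
    adj = np.abs(corr) ** beta
    np.fill_diagonal(adj, 0.0)  # drop self-connections (a choice made for this sketch)
    return adj

# Hypothetical toy data: 4 genes measured in 20 samples.
rng = np.random.default_rng(0)
base = rng.normal(size=20)
expr = np.vstack([
    base,                              # gene 0
    base + 0.1 * rng.normal(size=20),  # gene 1: linearly co-expressed with gene 0
    np.exp(base),                      # gene 2: monotonic but non-linear in gene 0
    rng.normal(size=20),               # gene 3: unrelated
])

pearson = pearson_matrix(expr)
spearman = spearman_matrix(expr)
adjacency = soft_threshold_adjacency(pearson, beta=6)
```

In this toy example, the Spearman coefficient between genes 0 and 2 reaches 1 because the relationship is strictly monotonic, whereas the Pearson coefficient is lower because the relationship is not linear.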

Network Building Techniques

Once co-expression similarities, such as Pearson correlation coefficients, are computed between gene pairs, these values are transformed into an adjacency matrix that represents the network structure. The adjacency matrix $A = [a_{ij}]$ is a symmetric $n \times n$ matrix (where $n$ is the number of genes), with $a_{ij}$ indicating the connection strength between genes $i$ and $j$. In binary networks, edges are defined using a hard threshold $\tau$, where $a_{ij} = 1$ if the absolute co-expression value $|r_{ij}| > \tau$ and $0$ otherwise, resulting in an unweighted graph that simplifies analysis but may discard nuanced relationships. In contrast, weighted networks retain continuous values as edge weights, often transforming raw correlations via a soft-thresholding function like $a_{ij} = |r_{ij}|^\beta$ (where $\beta \geq 1$ is a power parameter), preserving more biological information and enabling approximation of scale-free topology.

Threshold selection is critical for binary networks to balance sparsity and connectivity, avoiding overly dense graphs that obscure meaningful connections. Methods include spectral graph theory, which analyzes the eigenvalues of a Laplacian transformation of the adjacency matrix to identify thresholds yielding robust network components, often resulting in more conservative cutoffs than arbitrarily chosen values. Approaches inspired by percolation theory assess the emergence of a giant connected component as $\tau$ decreases, so that the network transitions from a fragmented to a biologically plausible structure without excessive edges.

For weighted networks, the soft-thresholding power $\beta$ is selected to approximate scale-free topology, typically aiming for a linear regression fit of $R^2 > 0.8$ between the log-transformed connectivity frequency $\log p(k)$ and the log-transformed connectivity $\log k$, as implemented in weighted gene co-expression network analysis (WGCNA). This criterion is motivated by real biological networks, whose degree distribution often follows a power law $P(k) \sim k^{-\gamma}$ with exponent $\gamma \approx 2$ to $3$, indicating the presence of hubs and robustness.
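The hard-thresholding rule and the scale-free fit criterion can be sketched as follows. This is a simplified stand-in for the procedure described above, not a reproduction of the WGCNA code: the function names, the histogram binning with `n_bins=10`, and the candidate range of powers are assumptions chosen for illustration.

```python
import numpy as np

def hard_threshold_adjacency(corr, tau):
    """Binary adjacency: a_ij = 1 if |r_ij| > tau, else 0 (no self-loops)."""
    adj = (np.abs(corr) > tau).astype(int)
    np.fill_diagonal(adj, 0)
    return adj

def scale_free_fit(connectivity, n_bins=10):
    """Regress log10 p(k) on log10 k and return (r_squared, slope).

    Returns NaN values when fewer than three usable bins are available.
    """
    counts, edges = np.histogram(connectivity, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = (counts > 0) & (centers > 0)
    if keep.sum() < 3:
        return float("nan"), float("nan")
    x = np.log10(centers[keep])
    y = np.log10(counts[keep] / counts.sum())
    slope, intercept = np.polyfit(x, y, 1)
    ss_res = np.sum((y - (slope * x + intercept)) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    if ss_tot == 0:
        return float("nan"), float(slope)
    return float(1.0 - ss_res / ss_tot), float(slope)

def pick_soft_power(corr, betas=range(1, 13), target_r2=0.8):
    """Smallest beta whose weighted network meets the R^2 target with a negative slope."""
    for beta in betas:
        adj = np.abs(corr) ** beta
        np.fill_diagonal(adj, 0.0)
        r2, slope = scale_free_fit(adj.sum(axis=1))
        if slope < 0 and r2 >= target_r2:
            return beta
    return None  # no candidate power satisfied the criterion

# Toy correlation matrix for three genes.
toy_corr = np.array([[1.0, 0.9, 0.2],
                     [0.9, 1.0, -0.8],
                     [0.2, -0.8, 1.0]])
binary = hard_threshold_adjacency(toy_corr, tau=0.7)

# Hypothetical connectivity values whose frequencies fall off roughly as a power law.
toy_k = np.repeat([1.0, 2.0, 4.0, 8.0], [64, 16, 4, 1])
toy_r2, toy_slope = scale_free_fit(toy_k)
```

Returning `None` when no power meets the target keeps the failure explicit; in practice an analyst would then inspect the fit curve rather than accept a default value.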
Variants of network building accommodate specific data characteristics. Signed networks distinguish positive from negative correlations (e.g., $a_{ij} = \left( \frac{1 + r_{ij}}{2} \right)^\beta$ for signed weights, which maps strongly negative correlations to weights near zero), enabling separate detection of co-activation and co-suppression modules, as demonstrated in analyses of murine expression data where signed structures better captured regulatory mechanisms. For time-series expression data, dynamic networks construct multiple adjacency matrices across time windows or conditions, revealing temporal changes in co-expression, such as during Drosophila embryonic development, where network topology evolves to reflect developmental stages.
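The difference between the unsigned and signed transformations is easiest to see on a strongly negative correlation. The short sketch below uses hypothetical function names and applies the two formulas given in the text.

```python
import numpy as np

def unsigned_adjacency(corr, beta):
    """Unsigned weights: a_ij = |r_ij| ** beta (sign of the correlation is discarded)."""
    return np.abs(corr) ** beta

def signed_adjacency(corr, beta):
    """Signed weights: a_ij = ((1 + r_ij) / 2) ** beta (negative correlations shrink toward 0)."""
    return ((1.0 + corr) / 2.0) ** beta

r = np.array([0.9, -0.9])
unsigned = unsigned_adjacency(r, beta=6)  # both entries equal 0.9 ** 6
signed = signed_adjacency(r, beta=6)      # 0.95 ** 6 for r = 0.9, 0.05 ** 6 for r = -0.9
```

With the unsigned form, anti-correlated genes are connected as strongly as positively correlated ones; with the signed form they are effectively disconnected, which is why the two variants can yield different modules.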

Analysis Techniques

Topological Properties

Gene co-expression networks typically exhibit an approximately scale-free topology, characterized by a power-law degree distribution in which a few genes, known as hubs, possess high connectivity while most have few connections. This architecture arises from the construction process using correlation measures and thresholding, leading to heterogeneous connectivity patterns observed across organisms, such as in tissue-specific profiles and yeast datasets. Hub genes, with their elevated centrality, play pivotal roles in network organization, often corresponding to essential or highly expressed genes.

Scale-free networks confer robustness to random perturbations, such as gene knockouts or expression noise, because such disruptions primarily affect lowly connected nodes, preserving overall functionality. For instance, in plant co-expression networks this architecture buffers against perturbations, with hubs showing stronger selective constraints. The degree distribution follows $P(k) \propto k^{-\gamma}$, where $\gamma$ typically ranges from 2 to 3, confirming near-scale-free behavior in rank-based constructions.

Centrality measures quantify node importance in these networks. Degree centrality for node $i$, denoted $k_i$, is the sum of its adjacencies: $k_i = \sum_j a_{ij}$, where $a_{ij}$ is the adjacency matrix entry; high-degree nodes identify influential hubs. Betweenness centrality measures the fraction of shortest paths passing through a node, calculated as $g_i = \sum_{s \neq i \neq t} \frac{\sigma_{st}(i)}{\sigma_{st}}$, where $\sigma_{st}$ is the number of shortest paths between nodes $s$ and $t$ and $\sigma_{st}(i)$ is the number of those paths that pass through $i$; it highlights nodes critical for global information flow. Closeness centrality assesses how near a node is to all others and is defined as the reciprocal of its total shortest-path distance, $C^{\text{clo}}_i = \frac{1}{\sum_{j \neq i} d_{ij}}$, where $d_{ij}$ is the geodesic distance, identifying efficiently connected nodes.

Local clustering reveals network modularity. The clustering coefficient for node $i$ is $C_i = \frac{2 n_i}{k_i (k_i - 1)}$, where $n_i$ is the number of edges among the neighbors of $i$; values around 0.2–0.6 in co-expression networks indicate dense local connections compared to random graphs (≈0.003). Globally, network density, the ratio of actual edges to possible edges, is low: $\rho = \frac{m}{\binom{n}{2}}$, with $m$ edges and $n$ nodes, reflecting sparse structures that avoid excessive false positives while maintaining connectivity (mean degree 2–5).

Many co-expression networks display small-world properties, combining high clustering coefficients with short path lengths akin to those of random networks. In reported co-expression data, the average shortest path length is approximately 4, similar to that of randomized equivalents (≈2.8–4), yet clustering is markedly higher, facilitating efficient information propagation and modular organization. This enhances biological efficiency, as seen in conserved patterns across datasets.
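As a worked illustration of these definitions, the sketch below computes degree, clustering coefficient, density, and closeness for a small binary network. The function names and the four-gene toy graph (a triangle with one pendant node) are assumptions for the example; the closeness helper only sums distances to reachable nodes.

```python
import numpy as np
from collections import deque

def degree(adj):
    """k_i = sum_j a_ij."""
    return adj.sum(axis=1)

def clustering_coefficient(adj):
    """C_i = 2 n_i / (k_i (k_i - 1)) for a binary undirected graph; 0 when k_i < 2."""
    k = adj.sum(axis=1)
    closed_walks = np.diagonal(adj @ adj @ adj)  # equals 2 * n_i
    possible = k * (k - 1)
    out = np.zeros(len(k), dtype=float)
    mask = possible > 0
    out[mask] = closed_walks[mask] / possible[mask]
    return out

def density(adj):
    """rho = m / C(n, 2), with m the number of undirected edges."""
    n = adj.shape[0]
    m = adj.sum() / 2
    return m / (n * (n - 1) / 2)

def closeness(adj, source):
    """Reciprocal of the summed shortest-path distances from source (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in np.flatnonzero(adj[u]):
            v = int(v)
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return 1.0 / total if total > 0 else 0.0

# Toy network: genes 0, 1, 2 form a triangle; gene 3 is attached only to gene 2.
toy_adj = np.array([[0, 1, 1, 0],
                    [1, 0, 1, 0],
                    [1, 1, 0, 1],
                    [0, 0, 1, 0]])

k = degree(toy_adj)                    # [2, 2, 3, 1]
cc = clustering_coefficient(toy_adj)   # [1, 1, 1/3, 0]
rho = density(toy_adj)                 # 4 edges out of 6 possible
hub_closeness = closeness(toy_adj, 2)  # 1 / (1 + 1 + 1)
```

Gene 2 has the highest degree and closeness but the lowest non-zero clustering coefficient, which is the typical signature of a hub that bridges otherwise separate neighbors.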

Module Detection and Inference

Module detection in gene co-expression networks involves partitioning the network into subsets of genes that exhibit high internal connectivity and low external connectivity, representing potential functional units. Common algorithms for this purpose include hierarchical clustering, clique percolation, and spectral partitioning, which operate primarily on the network's adjacency or similarity matrix. These methods aim to uncover densely connected groups that may correspond to biological pathways or co-regulated processes.

Hierarchical clustering, often implemented in an agglomerative manner, builds a dendrogram from the dissimilarity matrix derived from co-expression similarities, allowing modules to be identified by cutting the dendrogram at appropriate heights. This approach is particularly effective for visualizing structure in large networks and is widely used due to its simplicity and interpretability. In the weighted gene co-expression network analysis (WGCNA) framework, hierarchical clustering is applied to the topological overlap matrix (TOM) to group genes into modules, with dynamic tree-cutting algorithms enhancing the detection of biologically meaningful clusters.

Clique percolation identifies modules as unions of overlapping k-cliques, where a k-clique is a complete subgraph of k nodes and adjacent cliques share k-1 nodes. This method excels at detecting overlapping communities, which are common in biological networks where genes participate in multiple functions. Originally developed for general complex networks, it has been adapted for gene co-expression analysis to reveal dense, interconnected gene groups.

Spectral partitioning leverages the eigenvectors of the network's Laplacian matrix to partition nodes into modules, providing a partition that minimizes the cut size between clusters while maximizing intra-cluster connections. This technique is advantageous for large-scale networks because it scales well computationally and can estimate the number of modules automatically using spectral gaps or clustering indices like silhouette width.

A key measure in WGCNA for enhancing module detection is the topological overlap measure (TOM), which quantifies the relative interconnectedness between genes $i$ and $j$ by accounting for both direct connections and shared neighbors. The TOM is calculated as $\text{TOM}_{ij} = \frac{l_{ij} + a_{ij}}{\min(k_i, k_j) + 1 - a_{ij}}$, where $a_{ij}$ is the adjacency weight between genes $i$ and $j$, $l_{ij} = \sum_{u \neq i, j} a_{iu} a_{uj}$ is the sum of the products of connection weights through all other genes $u$ (representing shared neighbors in weighted networks), and $k_i = \sum_{u \neq i} a_{iu}$ (and similarly $k_j$) is the connectivity (weighted degree) of gene $i$. The corresponding dissimilarity, $1 - \text{TOM}_{ij}$, is used as the input to clustering, promoting robust module identification by emphasizing topological similarity over mere pairwise correlation.

Once detected, modules are interpreted as groups of co-regulated genes likely involved in shared biological functions, such as response to stimuli or metabolic processes. To summarize a module's expression profile, the module eigengene, defined as the first principal component of the expression data of the genes within the module, is computed and serves as a representative meta-gene for downstream analyses such as correlation with traits. Biological interpretation is further achieved through enrichment analysis, where module genes are tested for overrepresentation in Gene Ontology (GO) terms or canonical pathways using enrichment tools, revealing functional themes such as immune signaling.

Validation of detected modules emphasizes stability and reproducibility, often assessed via resampling techniques such as bootstrapping or subsampling of the expression data to evaluate the consistency of module membership across iterations. Methods like Similarity Across Bootstrap RE-sampling (SABRE) quantify robustness by measuring agreement in gene assignments, helping to filter out transient clusters. For condition-specific analyses, dynamic module detection extends static approaches by constructing time- or perturbation-resolved networks, identifying modules that emerge, split, or shift under varying biological contexts, such as stress responses.
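The topological overlap formula and the module eigengene can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the WGCNA package code: the function names are hypothetical, the eigengene is computed on per-gene standardized expression, and its sign is oriented to agree with the average module expression.

```python
import numpy as np

def tom_similarity(adj):
    """TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij) for a symmetric adjacency."""
    a = np.array(adj, dtype=float)
    np.fill_diagonal(a, 0.0)       # with a zero diagonal, the u = i and u = j terms vanish
    shared = a @ a                 # l_ij = sum_u a_iu * a_uj
    k = a.sum(axis=1)              # connectivity k_i
    denom = np.minimum.outer(k, k) + 1.0 - a
    tom = (shared + a) / denom
    np.fill_diagonal(tom, 1.0)     # each gene overlaps fully with itself
    return tom

def module_eigengene(expr_module):
    """First principal component across samples of a genes x samples module matrix."""
    mean = expr_module.mean(axis=1, keepdims=True)
    std = expr_module.std(axis=1, ddof=1, keepdims=True)
    z = (expr_module - mean) / std
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    eigengene = vt[0]
    # Orient the component so that it correlates positively with mean module expression.
    if np.corrcoef(eigengene, z.mean(axis=0))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene

# Toy network: genes 0, 1, 2 form a triangle; gene 3 is attached only to gene 2.
toy_adj = np.array([[0, 1, 1, 0],
                    [1, 0, 1, 0],
                    [1, 1, 0, 1],
                    [0, 0, 1, 0]])
tom = tom_similarity(toy_adj)
dissimilarity = 1.0 - tom  # the form typically passed to hierarchical clustering

# Toy module: three genes that are linear functions of one underlying profile.
profile = np.array([1.0, 2.0, 4.0, 3.0, 5.0])
module_expr = np.vstack([profile, 2.0 * profile + 1.0, 0.5 * profile - 3.0])
eigengene = module_eigengene(module_expr)
```

In the toy graph, genes 0 and 3 are not directly connected but still receive a topological overlap of 0.5 through their shared neighbor, which illustrates how TOM captures indirect similarity that the raw adjacency misses.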

Applications

Disease and Biomarker Discovery

Gene co-expression networks facilitate the identification of disease-associated genes in cancer by revealing module hubs that act as potential oncogenes driving tumor progression. In breast cancer, weighted gene co-expression network analysis (WGCNA) of transcriptomic profiles from 49 cell lines identified five subtype-specific modules, including the turquoise module for the Basal B subtype, which is enriched in signaling-related pathways, with ETS1 emerging as a central hub gene implicated in oncogenic regulation. Similarly, integrative co-expression analyses across datasets have highlighted module hubs that correlate with aggressive phenotypes, supporting their role as therapeutic targets.

In neurological disorders, co-expression networks elucidate pathological pathways, particularly in Alzheimer's disease (AD), where they reveal disruptions in amyloid processing. WGCNA of brain expression data from postmortem samples identified co-expressed modules enriched in synaptic and neuronal functions, with the hub gene ELAVL4 directly interacting with amyloid precursor protein (APP) and β-secretase 1 (BACE1) to modulate amyloid-β production, a hallmark of amyloid plaque formation. Such analyses across multiple brain regions further confirmed amyloid-related gene clusters, including those overlapping with known AD risk loci, providing insights into disease-specific rewiring.

Biomarker discovery leverages differential co-expression networks to detect changes in gene interactions between diseased and healthy states, emphasizing hub genes as prognostic indicators. In The Cancer Genome Atlas (TCGA) data for various cancers, these networks identified hubs like SECISBP2L in response modules and PHTF2 in non-response modules, whose expression levels showed strong associations with overall survival (hazard ratios up to 4.12), enabling risk stratification beyond traditional markers. For instance, in TCGA cohorts, hub genes from differential networks predicted treatment outcomes and survival, with validation across independent datasets confirming their clinical utility.

Case studies in infectious diseases illustrate practical applications, such as the use of WGCNA to uncover severity-associated modules. Analysis of longitudinal blood transcriptomes from patients revealed the brown module, enriched in immune activation pathways with hubs including ITGB2 and ITGAM, which correlated strongly with clinical severity metrics (R = 0.852 for one such metric), highlighting dysregulated immune activation as a key driver in severe cases during outbreaks. These modules informed potential interventions targeting immune overactivation, demonstrating the translational value of co-expression approaches in acute pathologies.
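One common way to quantify differential co-expression between two conditions is to compare Fisher z-transformed correlations. The sketch below is an assumption-laden illustration of that general idea, not the method used in the studies cited above; the function name, the clipping constant, and the simulated data are hypothetical.

```python
import numpy as np

def fisher_z_difference(expr_a, expr_b):
    """Z-scores for the difference in pairwise correlation between two conditions.

    expr_a and expr_b are genes x samples arrays with the same gene order.
    Under the usual Fisher approximation, |z| > 1.96 suggests a correlation
    change at roughly the 5% level for a single gene pair (before any
    multiple-testing correction).
    """
    n_a, n_b = expr_a.shape[1], expr_b.shape[1]
    # Clip to keep arctanh finite for perfectly correlated pairs.
    r_a = np.clip(np.corrcoef(expr_a), -0.999999, 0.999999)
    r_b = np.clip(np.corrcoef(expr_b), -0.999999, 0.999999)
    se = np.sqrt(1.0 / (n_a - 3) + 1.0 / (n_b - 3))
    z = (np.arctanh(r_a) - np.arctanh(r_b)) / se
    np.fill_diagonal(z, 0.0)
    return z

# Simulated example: genes 0 and 1 are co-expressed in "healthy" samples
# but anti-correlated in "disease" samples; gene 2 is unrelated in both.
rng = np.random.default_rng(42)
x_h = rng.normal(size=30)
x_d = rng.normal(size=30)
healthy = np.vstack([x_h, x_h + 0.2 * rng.normal(size=30), rng.normal(size=30)])
disease = np.vstack([x_d, -x_d + 0.2 * rng.normal(size=30), rng.normal(size=30)])

z_scores = fisher_z_difference(healthy, disease)
```

A large positive entry for the gene pair (0, 1) indicates that their correlation is much higher in the healthy group, which is the kind of rewired edge a differential network would retain.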

Functional Annotation and Systems Biology

Gene co-expression networks facilitate the functional annotation of uncharacterized genes through the guilt-by-association principle, where genes within the same module are assumed to share biological roles based on their coordinated expression patterns. This approach has been applied to stress response modules, enabling the prediction of functions for hundreds of previously unknown genes by associating them with well-characterized counterparts in co-expression clusters. For instance, one analysis of expression data from multiple organisms identified 215 functionally unknown genes whose roles were inferred from their network neighborhoods, with high reported accuracy in assigning categories such as signaling. This method leverages the topological structure of networks to propagate known annotations, providing a scalable way to annotate genomes before direct experimental validation is available.

Integration of co-expression networks with other omics data enhances functional insights by refining predictions through multi-layered evidence. Combining transcriptomic co-expression with proteomics provides protein-level corroboration of gene interactions, while incorporating epigenomic features helps identify regulatory influences on expression modules. Bayesian networks have been particularly effective for this integration, modeling probabilistic dependencies across data types to infer refined gene functions and pathways; for example, they have been used to construct directed graphs linking epigenomic modifications to transcriptomic co-expression patterns, improving the resolution of functional modules. Such approaches allow for a systems-level view in which co-expression serves as a scaffold for overlaying diverse molecular data, yielding more robust annotations.

Evolutionary conservation of co-expression patterns across species supports ortholog function prediction by highlighting preserved regulatory relationships. Cross-species comparisons using aligned co-expression networks reveal that orthologous genes often maintain similar network neighborhoods and module membership, enabling the transfer of functional annotations from model organisms to humans. Bayesian methods applied to these networks have shown significant conservation of expression profiles, with implications for inferring conserved biological processes. This cross-species strategy underscores the evolutionary stability of co-expression as a proxy for functional similarity.

In developmental biology, temporal co-expression networks capture dynamic regulatory cascades during embryogenesis, elucidating how genes coordinate to drive sequential patterning events. Time-series analyses in early embryos, for instance, have identified pioneer factors that orchestrate the initial activation of coordinated gene modules, propagating regulatory signals across developmental stages. These networks reveal hub genes that temporally link early zygotic transcription to later morphogenetic processes, providing insights into the hierarchical control of embryogenesis that static snapshots cannot offer. Recent applications as of 2025 include patient-specific co-expression networks for identifying novel subtypes in lung adenocarcinoma, enhancing precision in functional annotation.

Challenges and Future Directions

Methodological Limitations

Gene co-expression networks are susceptible to data-related artifacts that can compromise their reliability. Batch effects, arising from technical variations in sample processing or experimental conditions, often confound co-expression patterns by introducing spurious correlations across heterogeneous datasets. For instance, combining data from multiple sources without proper batch correction reduces the recovery of known functional associations, as demonstrated in analyses of plant gene expression where subset-specific networks outperformed integrated ones. Low sample sizes exacerbate this issue, leading to unstable networks with high variance in edge weights; networks constructed from fewer than 100 samples exhibit significant instability, while those from larger cohorts (e.g., over 300 samples) show improved consistency in topological features. Additionally, sparse RNA-seq count data presents challenges due to the high prevalence of zero counts, which can distort correlation estimates and require specialized preprocessing to avoid biased network inference.

Computational demands pose another major limitation, particularly in scaling to genome-wide analyses. Constructing adjacency matrices involves pairwise comparisons among thousands of genes, resulting in O(n²) time and memory requirements, where n is the number of genes, making the process resource-intensive for large-scale datasets. This scaling hinders the application of co-expression networks to high-dimensional data without substantial computational infrastructure.

Biologically, co-expression networks capture correlation rather than mechanistic relationships, leading to pitfalls such as indirect correlations that mask true regulatory interactions. For example, genes may appear co-expressed due to shared upstream regulators rather than direct functional links, so networks may include transitive edges that dilute specificity. Furthermore, these networks inherently lack directionality: they record association but cannot distinguish causation from mere correlation, which limits their utility in inferring regulatory hierarchies.

Validation of co-expression networks is also challenging, with high rates of false positives arising from thresholding procedures. Arbitrary cutoffs for edge inclusion can inflate spurious connections, particularly in noisy data, and network stability is highly sensitive to outliers that skew correlation measures. Because the choice of threshold amplifies these validation issues, multiple selection criteria are often combined to mitigate them, although this does not fully resolve the inherent uncertainties.

Emerging Advances

Recent advances in gene co-expression networks have leveraged single-cell RNA sequencing (scRNA-seq) to construct cell-type-specific networks, enabling the dissection of intratumor heterogeneity in diseases such as cancer. For instance, studies in 2023 used scRNA-seq data from non-small cell lung cancer (NSCLC) to identify distinct cell subpopulations and their associated co-expression modules, revealing how expression patterns vary across tumor cells and contribute to therapeutic resistance. Similarly, multi-omics analyses of single-cell data have uncovered intra-cell-line heterogeneity in cancer models, highlighting dynamic co-expression shifts that inform tumor evolution and personalized interventions. These cell-type-specific networks provide higher resolution than bulk analyses, facilitating the identification of rare subpopulations that drive disease progression in tumors.

Hypergraph-based models represent a significant development in network construction, moving beyond pairwise interactions to capture higher-order relationships among genes. The Weighted Gene Co-expression Hypernetwork Analysis (WGCHNA), introduced in 2025, models genes as nodes and samples as hyperedges in a weighted hypergraph, allowing multi-sample data to be integrated to detect complex modules with improved topological overlap measures. This approach has been reported to outperform traditional methods in identifying gene clusters, particularly in heterogeneous datasets.

Complementing this, other techniques have advanced the dynamic inference of co-expression networks by incorporating time-series data to model regulatory changes. For example, Bayesian variable selection methods applied to longitudinal expression data in 2023 enable genome-wide detection of dynamic co-expression modules, reducing computational demands while capturing temporal shifts in gene interactions relevant to developmental processes.

Updates to co-expression databases have enhanced accessibility and analytical power for researchers. The GeneFriends database, expanded in 2022, now includes RNA-seq-derived gene and transcript co-expression networks for over 44,000 human and 31,000 mouse genes across multiple tissues, with tissue-specific and dataset-specific views that support cross-species comparisons. AI-driven tools for predicting causal edges within these networks, such as graph neural networks and variational autoencoders, have emerged to infer regulatory directions from co-expression patterns, with applications in single-cell data for more accurate network reconstruction.

Patient-specific co-expression networks are paving the way for precision medicine, particularly in oncology. In lung adenocarcinoma (LUAD), 2025 studies have developed individualized networks using weighted gene co-expression network analysis (WGCNA) integrated with multi-omics data to identify subtype-specific modules predictive of treatment response. These personalized models stratify patients into risk groups based on unique co-expression signatures, enabling tailored therapeutic strategies that account for heterogeneity.
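The hypergraph representation mentioned above, with genes as nodes and samples as hyperedges, can be made concrete with a small sketch. This is not the published WGCHNA algorithm. The membership rule (a gene joins a sample's hyperedge when its z-score reaches a cutoff), the function names, and the use of a standard clique expansion to derive gene-gene weights are all assumptions chosen for illustration.

```python
import numpy as np

def incidence_from_expression(expr, z_cutoff=1.0):
    """Build a binary incidence matrix H (genes x samples).

    Assumed rule: gene g belongs to hyperedge s when its expression in
    sample s is at least z_cutoff standard deviations above its own mean.
    """
    mean = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0  # avoid division by zero for constant genes
    z = (expr - mean) / sd
    return (z >= z_cutoff).astype(float)

def clique_expansion(H, weights=None):
    """Gene-gene adjacency A = H W H^T with the diagonal set to zero.

    A[i, j] is the total weight of the hyperedges (samples) shared by
    genes i and j; W is diagonal with one weight per hyperedge.
    """
    n_edges = H.shape[1]
    w = np.ones(n_edges) if weights is None else np.asarray(weights, dtype=float)
    A = (H * w) @ H.T
    np.fill_diagonal(A, 0.0)
    return A

# Toy data: genes 0 and 1 are high in samples 0-1, gene 2 in samples 2-3.
expr = np.array([
    [5.0, 5.0, 1.0, 1.0],
    [6.0, 6.0, 2.0, 2.0],
    [1.0, 1.0, 7.0, 7.0],
])
H = incidence_from_expression(expr, z_cutoff=0.5)
A = clique_expansion(H)
print(H)
print(A)
```

In this toy case genes 0 and 1 share two hyperedges while gene 2 shares none with either, so a single sample can tie together any number of genes at once instead of only contributing to pairwise correlations.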
