
Proteomics

Proteomics is the systematic, large-scale study of the entire set of proteins, known as the proteome, expressed by a genome in a given cell, tissue, or organism at a specific time, encompassing their structures, functions, interactions, modifications, and dynamics. The term was coined by Australian scientist Marc Wilkins in 1994, and the field emerged in the 1990s as a complement to genomics, driven by advances in protein separation and analysis technologies. It recognizes that the human proteome may comprise around 1 million distinct protein forms, owing to extensive post-translational modifications (PTMs), far more than the approximately 20,000 protein-coding genes in the human genome.

Key approaches include gel-based methods such as two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) for protein separation and visualization, and gel-free techniques such as shotgun proteomics, in which proteins are enzymatically digested into peptides and analyzed by liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS). Mass spectrometry (MS), often using electrospray ionization (ESI) or matrix-assisted laser desorption/ionization (MALDI), is the cornerstone for protein identification, quantification, and characterization of PTMs such as phosphorylation or glycosylation, and it supports both top-down (intact protein) and bottom-up (peptide-level) workflows. Bioinformatics tools, including database search algorithms such as SEQUEST or Mascot, are essential for interpreting MS data and mapping protein interactions or expression profiles across cell types, tissues, or disease states.

In biomedical research, proteomics helps elucidate cellular processes such as protein-protein interactions and signaling pathways, and it has important applications in biomarker discovery for diseases such as cancer, where techniques like laser capture microdissection (LCM) combined with mass spectrometry identify tumor-specific proteins. It supports drug discovery by identifying therapeutic targets, for instance in research using plant-derived compounds, and enables quantitative analysis of proteome changes in response to treatments or environmental perturbations. Despite challenges such as detecting low-abundance proteins, sample variability, and the proteome's dynamic nature, ongoing improvements in sensitivity and unbiased workflows continue to expand its impact on understanding health, disease, and therapeutic responses.

Introduction and Fundamentals

Definition and Scope

Proteomics is defined as the large-scale, systematic study of the structure, function, interactions, and modifications of proteins within a biological system. The field encompasses the comprehensive characterization of proteins, including their identification, quantification, localization, and post-translational alterations, to elucidate their roles in cellular processes and in the biology of the whole organism. At its core, proteomics aims to provide a functional readout of gene expression by analyzing the proteome, which represents the realized protein output of the genome under specific conditions.

The proteome is the complete set of proteins expressed by a genome, cell, tissue, or organism at a given time and under defined environmental conditions. Unlike the relatively stable genome, the proteome is highly dynamic: it varies with developmental stage, environmental stimuli, disease state, and time, which is why proteomic analyses must be context-specific. The scope of proteomics includes both qualitative aspects, such as protein identification and structural elucidation, and quantitative dimensions, such as measuring protein abundance, turnover rates, and interactions to capture dynamic changes across cellular contexts. This breadth allows proteomics to connect molecular detail with systems-level understanding, revealing how proteins execute biological functions.

Proteomics is distinct from genomics, which focuses on the sequencing, structure, and function of genes encoded in DNA and RNA, because it shifts attention to the downstream protein products that directly mediate cellular activities. In contrast to metabolomics, which examines the full complement of small-molecule metabolites produced by cellular metabolism, proteomics targets the macromolecules central to enzymatic, structural, and signaling roles. These distinctions define the position of proteomics among the omics disciplines: it provides insight into the functional proteome that neither nucleic acid-focused genomics nor metabolite-oriented metabolomics can fully supply.

Historical Development and Etymology

The term "proteome," denoting the complete set of proteins expressed by a genome, cell, tissue, or organism at a given time, was coined in 1994 by Marc Wilkins at a two-dimensional electrophoresis symposium in Siena, Italy, while he was a PhD student at Macquarie University in Sydney. This portmanteau blended "protein" and "genome" to parallel the concept of the genome in genomics, marking the conceptual birth of systematic protein analysis beyond studies of individual proteins. The term "proteomics" came into use around the same time to describe the large-scale study of proteomes, and a first dedicated proteomics laboratory was founded in 1995.

The historical roots of proteomics trace back to mid-20th-century advances in protein chemistry, particularly Frederick Sanger's work in the 1950s, in which he determined the primary structure of insulin by amino acid sequencing. He received the Nobel Prize in Chemistry in 1958 for demonstrating that proteins have defined sequences. This achievement shifted the view of proteins from amorphous entities to molecules with precise, determinable structures.

A pivotal technological milestone arrived in 1975 with Patrick H. O'Farrell's development of two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), which separated proteins by isoelectric point and molecular weight and allowed more than 1,000 proteins from a complex sample such as an Escherichia coli lysate to be visualized in a single gel. O'Farrell's method transformed protein profiling from labor-intensive isolation to high-resolution mapping, laying the groundwork for proteome-scale analyses.

The 1990s saw proteomics coalesce as a discipline, propelled by genomic progress. The sequencing of the yeast (Saccharomyces cerevisiae) genome in 1996 enabled the first eukaryotic proteome maps: early 2D-PAGE studies visualized over 1,000 protein spots and identified dozens to hundreds of them, matching them to open reading frames, as reported by several teams that integrated mass spectrometry for unambiguous identification. Matthias Mann's innovations in the mid-1990s, such as nanoelectrospray ionization, dramatically improved sensitivity for peptide sequencing and facilitated proteome-wide coverage.

The Human Genome Project's draft publication in 2001 and completion in 2003 further catalyzed the field by making clear that genomic sequence alone cannot explain dynamic protein functions, interactions, and modifications, which spurred global proteomics initiatives. Institutional momentum built with the founding of the Human Proteome Organization (HUPO) in 2001, which promoted standardized methodologies, fostered collaborations, and launched projects such as the Human Proteome Project to map the protein products of the roughly 20,000 human protein-coding genes. These developments, driven by researchers such as Wilkins, O'Farrell, and Mann, turned proteomics from a biochemical curiosity into an indispensable complement to genomics by the early 2000s.

The Proteome's Complexity

Protein Diversity Through Post-Translational Modifications

Post-translational modifications (PTMs) represent a fundamental layer of protein regulation, involving the covalent attachment or removal of chemical groups on amino acid side chains after ribosomal synthesis of the polypeptide chain. These modifications vastly expand the functional repertoire of the proteome, enabling a single gene to produce multiple protein variants with distinct activities, localizations, and interactions, far surpassing the diversity encoded by the genome alone. Over 500 distinct types of PTMs have been identified in eukaryotes, including phosphorylation, ubiquitination, sumoylation, and others that dynamically fine-tune protein behavior in response to cellular cues.

The mechanisms of PTMs are predominantly enzymatic, with specialized proteins catalyzing the addition or reversal of modifications to ensure precise spatiotemporal control. For instance, phosphorylation involves the transfer of a phosphate group from ATP to serine, threonine, or tyrosine residues, mediated by kinases such as cyclin-dependent kinases (CDKs), which activate or inhibit target proteins by altering their charge and conformation. Similarly, ubiquitination entails the sequential action of E1 activating enzymes, E2 conjugating enzymes, and E3 ligases to attach ubiquitin moieties to lysine residues, often forming polyubiquitin chains that signal for proteasomal degradation and thus regulate protein stability and turnover. Glycosylation, another prevalent PTM, adds carbohydrate moieties in the endoplasmic reticulum or Golgi apparatus via glycosyltransferases, influencing protein folding, trafficking, and cell-cell recognition.

The impact of PTMs on proteome complexity is profound, because they generate structural and functional isoforms that underpin cellular signaling, homeostasis, and adaptation. Phosphorylation alone is estimated to modify approximately 30% of the human proteome at any given time, creating signaling networks essential for processes such as signal transduction and stress responses. In cell cycle control, CDK-mediated phosphorylation of substrates such as the retinoblastoma protein (Rb) promotes progression from G1 to S phase by derepressing E2F transcription factors, illustrating how PTMs impose a temporal order on events. Ubiquitination exemplifies PTM-driven control of protein degradation: K48-linked polyubiquitin chains target misfolded or regulatory proteins to the 26S proteasome for ATP-dependent breakdown, preventing their accumulation and maintaining proteome integrity during development and disease. Together, these modifications amplify the proteome's informational content and allow cells to respond rapidly to environmental changes without altering gene expression.
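The combinatorial effect described above can be made concrete with a small calculation. The following is a minimal sketch, assuming each modifiable site varies independently; the site names and allowed states are hypothetical and chosen only for illustration, not taken from any real protein.

```python
from itertools import product

def count_proteoforms(site_states):
    """Count modification patterns, assuming sites vary independently.

    site_states maps a site label to the list of states it can adopt.
    """
    total = 1
    for states in site_states.values():
        total *= len(states)
    return total

def enumerate_proteoforms(site_states):
    """List every combination of site states as a dict of site -> state."""
    labels = list(site_states)
    return [dict(zip(labels, combo)) for combo in product(*site_states.values())]

# Hypothetical protein with three modifiable residues (illustrative only).
example_sites = {
    "S15": ["unmodified", "phospho"],
    "T80": ["unmodified", "phospho"],
    "K120": ["unmodified", "ubiquitin", "sumo"],
}

if __name__ == "__main__":
    print(count_proteoforms(example_sites), "possible modification patterns")
    for form in enumerate_proteoforms(example_sites)[:3]:
        print(form)
```

Because the count is a product over sites, even a modest number of modifiable residues yields many potential forms from one gene product. In real cells not every combination occurs, so this figure is an upper bound under the independence assumption.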

Context-Dependent Protein Expression and Variants

Protein expression is highly dynamic and context-dependent, varying across cellular compartments, developmental stages, and external conditions to enable adaptive responses in living organisms. This variability arises from regulatory mechanisms that control which proteins are produced, in what quantities, and under which circumstances, thereby shaping the functional proteome beyond static genomic predictions. Such dynamism underlies normal cellular function as well as pathological states, in which shifts in protein profiles can profoundly influence physiological outcomes.

At the transcriptional and post-transcriptional levels, alternative splicing and RNA editing generate diverse protein isoforms from a single gene, significantly expanding proteome complexity in response to cellular context. Alternative splicing allows exons to be included or excluded during mRNA processing, producing multiple protein variants with distinct functions, structures, or localizations; more than 90% of human multi-exon genes undergo alternative splicing, giving rise to tissue-specific isoforms. RNA editing, particularly adenosine-to-inosine modification, further diversifies transcripts by altering codons, which results in amino acid changes that create novel protein isoforms; this process is prevalent in neural tissues and contributes to proteomic heterogeneity by recoding up to thousands of sites across the transcriptome. These mechanisms enable rapid proteome remodeling without genomic alterations, as shown by studies in which splicing events correlate with context-specific isoform expression in cells.

Environmental factors influence protein expression by triggering selective induction or repression of specific protein sets to maintain cellular integrity. Under thermal stress, heat shock proteins (HSPs) such as HSP70 and HSP90 are rapidly upregulated to chaperone misfolded proteins and prevent aggregation, a response that is conserved across eukaryotes and activated within minutes of a temperature increase. Nutrient availability similarly modulates proteome composition: nutrient deprivation or dietary components can alter the expression of metabolic enzymes and signaling proteins, as demonstrated in nutriproteomics studies in which imbalances lead to differential abundance of ribosomal and translational regulators in mammalian cells. Pathogen exposure induces host proteome reprogramming, including the upregulation of immune effectors; during bacterial or viral infection, proteomics reveals infection-specific signatures, such as increased expression of interferon-stimulated genes in response to intracellular pathogens.

In disease contexts, aberrant protein expression drives pathological proteome alterations, particularly in cancer and infections. Cancer cells often overexpress oncoproteins that promote uncontrolled proliferation, and proteomic analyses across tumor types show that elevated levels of these proteins correlate with aggressive phenotypes. In infectious diseases, pathogens hijack the host expression machinery and dysregulate the proteome; viral infections, for instance, can induce overexpression of host factors that aid replication while suppressing antiviral proteins, shifting the proteome in favor of pathogen persistence. These changes show how disease exploits regulatory pathways to alter protein landscapes, often amplifying isoform diversity through splicing dysregulation.

Protein abundance also varies markedly in time and space. Circadian rhythms are estimated to regulate approximately 10% of the proteome in mammalian tissues, with rhythmic proteins peaking in nuclear compartments to coordinate metabolic and transcriptional cycles. Spatially, protein levels differ across organelles and cell types; synaptic proteins in neurons, for example, are reported to fluctuate diurnally by up to 50% in abundance, reflecting localized demands. These quantitative shifts, which can span orders of magnitude, enable precise control over cellular functions and adaptation. Post-translational modifications further diversify these variants, as discussed in the section on protein diversity.

Challenges in Proteomic Research

Limitations Relative to Genomics

Proteomics faces several inherent limitations when compared to genomics, primarily because proteins and nucleic acids differ fundamentally in stability and manipulability. DNA, the subject of genomic analysis, is a highly stable molecule that can be readily amplified using techniques such as the polymerase chain reaction (PCR), allowing sensitive detection of even low-abundance sequences without significant loss of material. Proteins cannot be amplified in a comparable way and must be isolated directly from biological samples, where they exist in dynamic and often low-abundance states. This difference makes genomic studies more scalable and less prone to sensitivity problems.

A second challenge is the inherent instability of proteins, which are susceptible to rapid degradation and enzymatic modification, unlike the chemically robust DNA molecule. Proteins can denature, aggregate, or be cleaved by proteases during sample preparation and storage, leading to inconsistent recovery and altered profiles that do not accurately reflect in vivo conditions. Post-translational modifications (PTMs), such as phosphorylation or glycosylation, add chemical heterogeneity that complicates clean isolation and identification, a level of variability absent from the more uniform nucleic acid backbone. These factors make proteomic sample handling far more labor-intensive and error-prone than the comparatively straightforward extraction and sequencing of genomic material.

The dynamic range of protein concentrations in biological systems presents another profound limitation: it spans 10 or more orders of magnitude (10^10-fold or greater) in complex samples such as human plasma, compared with approximately 10^4-fold for mRNA transcript levels. As a result, high-abundance proteins often dominate detection signals and mask low-abundance ones that are critical for cellular function, such as signaling molecules or rare isoforms, whereas genomic and transcriptomic analyses benefit from more compressed ranges that allow more uniform coverage.

Finally, the "one gene, many proteins" paradigm underscores how genomics underestimates functional diversity. A single gene can produce multiple protein variants through alternative splicing and PTMs, potentially expanding the proteome to over a million distinct forms from the roughly 20,000 human protein-coding genes. Genomic sequencing captures the genetic blueprint but cannot predict these protein-level diversifications, leaving an incomplete view of cellular function that proteomics must laboriously resolve.
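To make the dynamic-range comparison concrete, the span between two concentrations can be expressed in orders of magnitude as the base-10 logarithm of their ratio. The sketch below is illustrative only: the two concentrations are assumed round numbers chosen to represent a very abundant and a very scarce plasma protein, not measured values.

```python
import math

def orders_of_magnitude(high, low):
    """Return log10(high / low), the dynamic range in orders of magnitude."""
    if high <= 0 or low <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(high / low)

# Assumed, illustrative concentrations in grams per liter.
ABUNDANT_PROTEIN_G_PER_L = 40.0   # tens of mg/mL, typical of a dominant plasma protein
RARE_PROTEIN_G_PER_L = 5e-9       # 5 pg/mL, typical of a low-level signaling protein

if __name__ == "__main__":
    span = orders_of_magnitude(ABUNDANT_PROTEIN_G_PER_L, RARE_PROTEIN_G_PER_L)
    print(f"Protein dynamic range in this example: {span:.1f} orders of magnitude")
    print(f"A 10^4-fold mRNA range corresponds to {orders_of_magnitude(1e4, 1):.0f} orders")
```

With these assumed inputs the protein span comes out just under ten orders of magnitude, which is the scale of the gap that an instrument must cover to detect both species in the same sample.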

Analytical and Technical Hurdles

One of the primary analytical hurdles in proteomics is the limited sensitivity for detecting low-abundance proteins in samples with a wide dynamic range. The cellular proteome spans approximately seven orders of magnitude, from one copy per cell to ten million copies, which makes it difficult for mass spectrometry-based methods to identify rare proteins without being overwhelmed by dominant high-abundance species. In complex biological fluids such as plasma the challenge is greater still: the dynamic range reaches up to 12 orders of magnitude, with abundant proteins such as albumin exceeding low-concentration targets like cardiac troponin by more than ten orders of magnitude. Consequently, current techniques often fail to capture low-copy-number proteins, limiting comprehensive proteome coverage.

Sample preparation presents significant technical difficulties, particularly in complex mixtures where extraction biases and contamination distort protein representation. Protein extraction from tissues or fluids such as plasma frequently favors high-abundance proteins; early proteomics studies using data-dependent acquisition identified only a few hundred proteins, with a strong skew toward abundant species. In plasma, pre-analytical variables such as processing delays or storage conditions can lead to contamination from platelets or other cellular components, further complicating downstream analysis and reducing the detection of low-abundance biomarkers. Affinity-based depletion strategies, although intended to remove high-abundance proteins, often give incomplete removal and variable recovery, which perpetuates inconsistencies across samples.

Throughput constraints remain a bottleneck in proteomics, in sharp contrast to the high-speed capabilities of genomics. Unlike genomic sequencing, which can process thousands of samples rapidly, mass spectrometry workflows are time-intensive: individual runs can take hours per sample, and deep coverage requires extensive fractionation. This limitation arises from the need for meticulous sample handling and chromatographic separation, and it often restricts large-scale studies to hundreds rather than millions of analyses, slowing proteome-wide investigations relative to genomics.

Reproducibility issues further hinder proteomic research and stem from both biological variability and technical factors. Biological samples are inherently heterogeneous, for example in cellular composition or physiological state, which amplifies variability in protein yield and detection across replicates. Instrument drift, including fluctuations in spectrometer performance over time or between laboratories, contributes to inconsistent quantification, and platform comparisons show low correlations for many analytes such as cytokines. Multi-laboratory assessments indicate that certain methods can reproducibly quantify over 4,000 proteins, but overall consistency remains limited by these technical variances, so standardized protocols are needed to mitigate drift and sample-to-sample differences.

Experimental Methods in Proteomics

Antibody-Based Detection Techniques

Antibody-based detection techniques in proteomics exploit the highly specific, high-affinity binding between antibodies (immunoglobulins) and their antigens (proteins or protein epitopes) to enable targeted detection, quantification, and characterization of proteins in complex biological samples. This specificity arises from the complementary paratope-epitope interaction, in which the antibody's variable region recognizes unique structural features on the antigen, often with dissociation constants in the nanomolar range. These methods are particularly valuable for low- to medium-throughput analysis and provide orthogonal validation for unbiased approaches such as mass spectrometry.

Key antibody-based techniques include the enzyme-linked immunosorbent assay (ELISA), Western blotting, and flow cytometry. In ELISA, proteins are captured on a solid surface, such as a microplate well, using immobilized antibodies; an enzyme-conjugated detection antibody then binds to the target and produces a colorimetric, fluorescent, or chemiluminescent signal proportional to protein abundance. The sandwich variant enhances sensitivity by employing two antibodies: a capture antibody specific to one epitope and a detection antibody targeting a distinct epitope on the same protein, which reduces non-specific binding and achieves detection limits as low as picograms per milliliter. Western blotting combines gel electrophoresis for separation by protein size with antibody probing on a membrane, allowing proteins to be identified by molecular weight and post-translational modifications such as phosphorylation to be detected. Flow cytometry employs fluorescently labeled antibodies to detect surface or intracellular proteins on individual cells, enabling multiparametric analysis and sorting of heterogeneous cell populations.

These techniques offer exceptional specificity due to antibody-antigen recognition, relative ease of implementation in standard laboratory settings, and, in some formats, the ability to probe proteins in their native conformation. They are cost-effective for targeted assays and provide semi-quantitative or absolute quantification when calibrated with standards. Their limitations include potential cross-reactivity, where antibodies bind non-target proteins sharing similar epitopes and produce false positives, and batch-to-batch variability in antibody quality, which can affect reproducibility. These methods also require prior knowledge of the target proteins for antibody selection and may miss low-abundance or novel proteins for which no suitable reagents exist, so comprehensive proteomics workflows often validate them against physicochemical methods such as mass spectrometry.

In proteomics applications, antibody-based techniques are primarily used to validate candidate proteins identified in high-throughput screens, for example by confirming expression levels or modifications in disease-relevant samples. ELISA is routinely applied to quantify proteins such as cytokines in serum, Western blotting verifies size variants, and array-based formats extend this to multiplexed validation of dozens of targets. Flow cytometry supports proteomics by measuring protein expression at the single-cell level, aiding the study of signaling pathways. Together these methods bridge discovery and functional analysis and underpin applications such as biomarker verification.
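Quantification "when calibrated with standards" can be illustrated with a simple standard-curve lookup. The sketch below interpolates on a log-log scale between the two bracketing standards. This is a deliberate simplification: plate-reader software commonly fits a four-parameter logistic curve instead, and the function name and all numbers here are hypothetical.

```python
import math

def interpolate_concentration(standards, signal):
    """Estimate a concentration from a signal by log-log interpolation.

    standards: list of (concentration, signal) pairs with positive values
    and strictly increasing signal. Raises ValueError when the signal lies
    outside the calibrated range, since extrapolation is unreliable.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s1 <= s0:
            raise ValueError("standard signals must be strictly increasing")
        if s0 <= signal <= s1:
            fraction = (math.log10(signal) - math.log10(s0)) / (
                math.log10(s1) - math.log10(s0)
            )
            log_conc = math.log10(c0) + fraction * (math.log10(c1) - math.log10(c0))
            return 10 ** log_conc
    raise ValueError("signal outside the range of the standards")

# Hypothetical standards: (concentration in pg/mL, background-corrected signal).
example_standards = [(10.0, 0.1), (100.0, 1.0), (1000.0, 10.0)]

if __name__ == "__main__":
    for reading in (0.5, 1.0, 4.0):
        estimate = interpolate_concentration(example_standards, reading)
        print(f"signal {reading:>4} -> about {estimate:.1f} pg/mL")
```

Refusing to extrapolate is a design choice that mirrors common assay practice, where samples reading above the top standard are diluted and re-run rather than estimated.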

Mass Spectrometry and Separation Methods

Mass spectrometry (MS) serves as a cornerstone technology in proteomics for the identification, quantification, and characterization of proteins by analyzing their peptide components. It enables the detection of proteins at low abundance within complex biological samples and provides sequence-specific information through the measurement of ion masses. Unlike antibody-based methods, which rely on targeted recognition, MS offers an unbiased, global view of the proteome.

The fundamental principle of MS is to ionize biomolecules and separate the resulting ions by their mass-to-charge ratio (m/z). Ionization is typically achieved with soft techniques that preserve peptide integrity: electrospray ionization (ESI), which generates multiply charged ions from liquid samples and is ideally suited to online coupling with liquid chromatography (LC), or matrix-assisted laser desorption/ionization (MALDI), which uses a laser to desorb and ionize peptides from a solid matrix, often for direct analysis of gel spots or tissues. Following ionization, mass analyzers such as quadrupoles, time-of-flight (TOF) instruments, or Orbitrap analyzers separate ions by m/z, allowing precise determination of peptide masses. Orbitrap systems, for example, achieve resolving power exceeding 100,000 (at m/z 400), enabling closely related peptides to be distinguished.

Peptide sequencing in MS relies on tandem mass spectrometry (MS/MS), in which a precursor ion is isolated and fragmented (commonly via collision-induced dissociation, CID) and the resulting fragment ions are analyzed to generate spectra that reveal amino acid sequences. These spectra are then matched against protein databases using algorithms such as SEQUEST or Mascot to identify peptides and infer protein identities. Mass accuracy is critical for reliable matching; Orbitrap MS, for instance, delivers average errors below 1 ppm with lock-mass calibration, well within the <5 ppm threshold needed for confident identifications in complex mixtures.

Separation methods are essential for reducing sample complexity prior to MS analysis, enhancing resolution and sensitivity.
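The two quantities that recur in this section, the mass-to-charge ratio of a protonated ion and the mass error in parts per million, follow directly from their definitions. The sketch below assumes positive-mode ionization in which each charge comes from an added proton; the function names and the example peptide mass are illustrative.

```python
PROTON_MASS = 1.007276466  # mass of a proton in daltons

def mz(neutral_mass, charge):
    """m/z of an ion formed by adding `charge` protons to a neutral molecule."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge

def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def within_tolerance(observed_mz, theoretical_mz, tolerance_ppm=5.0):
    """True when the absolute error is inside the chosen ppm window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm

if __name__ == "__main__":
    peptide_mass = 1500.75  # hypothetical neutral monoisotopic mass in Da
    for z in (1, 2, 3):
        print(f"z={z}: m/z = {mz(peptide_mass, z):.4f}")
    theoretical = mz(peptide_mass, 2)
    observed = theoretical + 0.002  # pretend measurement
    print(f"error = {ppm_error(observed, theoretical):.2f} ppm,",
          "accepted" if within_tolerance(observed, theoretical) else "rejected")
```

The same peptide appears at a different m/z for each charge state, which is why ESI spectra of multiply charged ions must be interpreted with the charge in mind. The 5 ppm default mirrors the threshold mentioned in the text and would be tuned to the instrument in practice.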
Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) separates intact proteins first by isoelectric point (pI) via isoelectric focusing (IEF) and then by molecular weight via sodium dodecyl sulfate-PAGE (SDS-PAGE), resolving up to thousands of protein spots from a single sample. Excised spots are then digested in-gel for MS analysis. Alternatively, liquid chromatography (LC), particularly reversed-phase LC (RP-LC), fractionates peptides by hydrophobicity and is often coupled online with ESI-MS for automated workflows. These techniques improve proteome coverage by helping to isolate low-abundance species in samples with a wide dynamic range.

The standard bottom-up proteomics workflow begins with protein extraction from cells or tissues, followed by enzymatic digestion, typically with trypsin, to generate peptides of about 5–20 amino acids, which are more amenable to ionization and fragmentation than intact proteins. Peptides are then separated using LC or gel-based methods, ionized, and subjected to MS/MS for spectral acquisition. Fragmentation patterns are computationally searched against databases such as UniProt, with matches scored by criteria such as peptide mass tolerance and fragment ion coverage to achieve high-confidence protein identifications. This approach, which became established in the early 2000s, has enabled large-scale proteomic studies with identifications exceeding 10,000 proteins per run in optimized setups.
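The digestion step of the bottom-up workflow is routinely simulated in software when a search engine builds its list of candidate peptides. The following is a minimal sketch of such an in silico digest, assuming the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function name, parameters, and test sequence are hypothetical.

```python
def tryptic_digest(sequence, missed_cleavages=0, min_length=5):
    """Return peptides from an in silico tryptic digest.

    Cleavage is assumed C-terminal to K or R, but not before P.
    missed_cleavages allows peptides spanning up to that many skipped sites.
    """
    # Positions where the chain is cut (index of the first residue after the cut).
    cut_points = [
        i + 1
        for i, residue in enumerate(sequence[:-1])
        if residue in "KR" and sequence[i + 1] != "P"
    ]
    bounds = [0] + cut_points + [len(sequence)]

    peptides = []
    for start in range(len(bounds) - 1):
        last_end = min(start + 1 + missed_cleavages, len(bounds) - 1)
        for end in range(start + 1, last_end + 1):
            peptide = sequence[bounds[start]:bounds[end]]
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides

if __name__ == "__main__":
    toy_protein = "AAAAAKGGGGGRPCCCCCKDDDDD"  # made-up sequence for illustration
    print(tryptic_digest(toy_protein))
    print(tryptic_digest(toy_protein, missed_cleavages=1))
```

In the toy sequence the arginine is followed by proline, so no cut is made there and the middle peptide retains an internal R. Real search engines apply further filters (mass range, modifications) that are omitted here.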

High-Throughput and Hybrid Approaches

High-throughput proteomics enables the large-scale analysis of proteomes by scaling up traditional methods to profile thousands of proteins simultaneously, often through unbiased approaches like shotgun proteomics. In shotgun proteomics, also known as bottom-up mass spectrometry, proteins are enzymatically digested into peptides, which are then separated and analyzed by liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS), allowing for the identification and quantification of complex protein mixtures without prior knowledge of the proteome. This method has revolutionized proteome-wide studies by providing deep coverage, with seminal implementations demonstrating its utility in mapping cellular proteomes from minimal sample amounts. Complementing this, protein microarrays facilitate high-throughput interrogation of protein-protein interactions by immobilizing thousands of proteins on a solid surface and probing them with fluorescently labeled partners or analytes, enabling the simultaneous assessment of binding affinities and specificities in a multiplexed format. These arrays have been instrumental in discovering interaction networks, with protocols achieving quantitative measurements of interactions at sub-nanomolar sensitivities. Hybrid approaches integrate proteomics with other omics disciplines or complementary techniques to enhance resolution and context-specific insights. Affinity purification-mass spectrometry (AP-MS) combines targeted protein pull-down using epitope-tagged baits with mass spectrometry to map protein complexes and interactions, often leveraging genomic information to select baits from predicted open reading frames, thereby bridging proteomics and genomics for systems-level network reconstruction. This method has identified thousands of stable complexes in yeast and human cells, with quantitative variants using stable isotope labeling improving specificity by distinguishing true interactors from contaminants. 
Bioorthogonal labeling extends hybrid strategies to live-cell proteomics by incorporating non-canonical amino acids or chemical tags into proteins through metabolic labeling, followed by selective ligation with probes for imaging or enrichment prior to MS analysis; this allows spatiotemporal tracking of protein synthesis and dynamics in native cellular environments without genetic perturbation. Such labeling has enabled the profiling of nascent proteomes in living cells, revealing dynamic changes in protein turnover under stress conditions.

Recent advances in instrumentation have deepened proteome coverage in high-throughput workflows, particularly through nano-liquid chromatography-mass spectrometry (nanoLC-MS). NanoLC employs capillary columns with inner diameters of 50-100 μm to achieve high-resolution peptide separations at low flow rates and couples efficiently with sensitive mass analyzers such as the Orbitrap, allowing over 10,000 proteins to be identified in single runs from mammalian cell lysates by the early 2020s, a marked improvement over earlier limits of a few thousand. In specialized workflows these systems reduce sample requirements to picogram-level inputs while minimizing ion suppression, facilitating applications in low-abundance biomarker discovery.

Emerging single-molecule proteomics via nanopores represents a frontier in hybrid high-throughput methods: proteins or peptides are translocated through biological or solid-state nanopores, and ionic current blockades or associated signals are decoded to read amino acid sequences at the level of individual molecules. Proof-of-concept demonstrations have analyzed short peptides and unfolded full-length proteins, promising ultra-sensitive, label-free analysis of proteomes from minute samples, with the potential to integrate with MS for hybrid validation.

Applications of Proteomics

Drug Discovery and Therapeutic Targeting

Proteomics plays a pivotal role in drug discovery by enabling the identification of disease-relevant proteins through comprehensive analysis of protein expression, modifications, and interactions, thereby facilitating the development of targeted therapies. In particular, it supports the transition from basic research to clinical applications by providing insights into protein alterations associated with pathological states, which can be leveraged to design small-molecule inhibitors, biologics, and personalized treatments. This approach has accelerated the validation of therapeutic targets and the optimization of drug candidates, reducing the risk of off-target effects and improving efficacy profiles. Target identification in drug discovery often relies on proteomic profiling to detect differential protein expression between diseased and healthy tissues, highlighting potential candidates for therapeutic intervention. For instance, mass spectrometry-based proteomics can quantify thousands of proteins simultaneously, revealing upregulated or downregulated species in cancer cells compared to normal counterparts, which informs the selection of druggable targets. A key application is phosphoproteomics, which maps phosphorylation events to uncover hyperactive kinases in diseases like cancer; this has informed the development of kinase inhibitors such as imatinib for chronic myeloid leukemia by targeting BCR-ABL kinase activity. Such strategies prioritize proteins with high therapeutic potential, focusing on those involved in disease progression rather than housekeeping functions. Drug screening benefits from activity-based protein profiling (ABPP), a chemoproteomic technique that uses small-molecule probes to label and quantify the activity of enzymes directly in native proteomes, enabling the discovery of selective inhibitors. 
ABPP probes covalently bind to active sites of target enzymes, allowing researchers to monitor inhibition potency and selectivity across complex biological samples without relying on indirect readouts like cell viability. This method has been instrumental in identifying covalent inhibitors for proteases and other hydrolases in infectious diseases and oncology, streamlining lead optimization by distinguishing on-target engagement from broader proteomic perturbations. By integrating ABPP with high-throughput screening, pharmaceutical pipelines can rapidly triage compounds, enhancing the efficiency of hit-to-lead transitions.

A notable case study is the application of proteomics to trastuzumab (Herceptin), a monoclonal antibody targeting HER2 in breast cancer. Proteomic analyses have confirmed HER2 overexpression in approximately 15-20% of breast tumors, supporting its role as a therapeutic target and guiding patient stratification for treatment. Quantitative proteomics, including reverse-phase protein arrays, has further elucidated downstream signaling changes upon HER2 inhibition, revealing mechanisms of response and resistance that inform combination therapies. Although the initial approval of trastuzumab relied on immunohistochemical HER2 testing, proteomics continues to refine its use in precision oncology.

Pharmacoproteomics extends these efforts by monitoring the dynamic effects of drugs on the proteome, capturing changes in protein abundance, localization, and post-translational modifications in response to treatment. This approach uses time-resolved proteomic profiling to assess drug-induced proteome rewiring, such as pathway activation or compensatory responses, which can predict toxicity or efficacy early in development.
For example, stable isotope labeling by amino acids in cell culture (SILAC) combined with mass spectrometry tracks proteome-wide alterations following kinase inhibitor dosing, aiding in dose optimization and biomarker identification for clinical monitoring. By providing a holistic view of drug action, pharmacoproteomics helps bridge preclinical models and human responses, with the aim of lowering attrition rates in late-stage trials.
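As a rough illustration of how SILAC readouts are commonly summarized, the sketch below converts paired light and heavy intensities into log2 ratios and flags proteins that change beyond a cutoff. The function name, the protein labels, the intensity values, and the two-fold cutoff are all assumptions made for illustration; they are not taken from any study described here.

```python
import math


def silac_log2_ratios(intensities, threshold=1.0):
    """Summarize heavy/light SILAC intensities as log2 ratios.

    intensities: dict mapping protein -> (light_intensity, heavy_intensity).
    threshold:   absolute log2 ratio treated as a change (1.0 = two-fold);
                 this cutoff is an illustrative assumption.
    Returns a dict mapping protein -> (log2_ratio, label).
    """
    results = {}
    for protein, (light, heavy) in intensities.items():
        if light <= 0 or heavy <= 0:
            continue  # ratio is undefined when a channel is missing
        log2_ratio = math.log2(heavy / light)
        if log2_ratio >= threshold:
            label = "up"
        elif log2_ratio <= -threshold:
            label = "down"
        else:
            label = "unchanged"
        results[protein] = (log2_ratio, label)
    return results


# Hypothetical data: light = untreated cells, heavy = inhibitor-treated cells.
example = {
    "KINASE_A": (4.0e6, 1.0e6),
    "CHAPERONE_B": (2.0e6, 8.0e6),
    "HOUSEKEEPER_C": (5.0e6, 5.5e6),
}

for name, (ratio, label) in silac_log2_ratios(example).items():
    print(f"{name}: log2(H/L) = {ratio:+.2f} ({label})")
```

In practice, ratios are usually normalized and tested across replicates before any protein is called regulated; the fixed cutoff here only conveys the idea.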

Biomarker Discovery and Diagnostics

Proteomics plays a pivotal role in biomarker discovery by enabling the identification of protein signatures in biofluids such as plasma, serum, urine, and cerebrospinal fluid, which reflect disease states non-invasively. These signatures often involve altered protein abundance, post-translational modifications, or peptide patterns associated with pathological processes like cancer or neurodegeneration. A classic example is prostate-specific antigen (PSA), a serine protease elevated in prostate cancer, which has been used since the 1980s for screening but suffers from limited specificity due to elevations in benign conditions like prostatitis or hyperplasia, leading to unnecessary biopsies in up to 75% of cases. Proteomic approaches aim to refine such single markers by integrating them into panels that capture multifaceted disease profiles. Recent 2025 studies have identified novel proteomic panels, such as one combining EEF1G, MSLN, BCAM, and TAGLN2 for high-grade serous ovarian cancer detection.

Discovery pipelines typically begin with mass spectrometry (MS)-based profiling of biofluids to generate comprehensive proteomic maps, allowing untargeted detection of hundreds to thousands of proteins in complex samples like plasma. Techniques such as data-independent acquisition (DIA) MS enable high-throughput quantification from microliter volumes of serum, identifying differentially expressed proteins between healthy and diseased cohorts. Candidate biomarkers are then validated using targeted methods, including immunoassays like enzyme-linked immunosorbent assays (ELISA) or multiple reaction monitoring (MRM) MS, to confirm specificity and sensitivity in larger populations. Standardization of this workflow has been promoted through initiatives of the Human Proteome Organization (HUPO), which emphasize reproducibility across labs.
Challenges in proteomic biomarker discovery include the dynamic range of plasma proteins, where abundant species like albumin mask low-abundance candidates, and inter-individual variability due to age, sex, or comorbidities. Successes have come from multi-marker panels that enhance diagnostic accuracy; for instance, 2010s studies on ovarian cancer identified panels combining apolipoproteins, transferrin, and transthyretin, achieving sensitivities of 90-95% for early-stage detection when integrated via multivariate index assays. These panels outperform single markers like CA-125 by reducing false positives in premenopausal women.

Clinical translation is exemplified by FDA-cleared proteomic tests, such as OVA1, cleared in 2009 as the first in vitro diagnostic multivariate index assay (IVDMIA) for assessing ovarian malignancy risk in women with pelvic masses. OVA1 integrates five proteins (CA-125, transthyretin [prealbumin], apolipoprotein A1, beta-2 microglobulin, and transferrin) via a proprietary algorithm, improving triage to surgical specialists with 99% negative predictive value for benign masses. A second-generation test, Overa (cleared in 2016), refined this approach, demonstrating the potential of proteomics to reduce overtreatment.
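The performance figures quoted for biomarker panels (sensitivity, specificity, and predictive values) all derive from a two-by-two table of test results against true disease status. The following minimal sketch shows that arithmetic; the function name and the patient counts are hypothetical and do not come from any trial mentioned above.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic accuracy measures from confusion counts.

    tp: diseased patients with a positive test (true positives)
    fp: healthy patients with a positive test (false positives)
    tn: healthy patients with a negative test (true negatives)
    fn: diseased patients with a negative test (false negatives)
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of diseased detected
        "specificity": tn / (tn + fp),  # fraction of healthy cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical cohort: 100 malignant and 400 benign pelvic masses.
metrics = diagnostic_metrics(tp=90, fp=100, tn=300, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on disease prevalence in the tested population, so a panel can show a high negative predictive value alongside a modest positive predictive value, as in this toy cohort.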

Structural and Interaction Network Analysis

Structural proteomics encompasses techniques aimed at determining the three-dimensional structures of proteins on a proteome-wide scale, providing critical insights into their folding, stability, and function. Traditional methods have been foundational: X-ray crystallography resolves atomic structures by analyzing diffraction patterns from protein crystals, and nuclear magnetic resonance (NMR) spectroscopy elucidates structures in solution by measuring the magnetic resonance of atomic nuclei. They are limited, respectively, by the difficulty of crystallizing proteins and by constraints on protein size. Cryo-electron microscopy (cryo-EM) has emerged as a complementary approach, enabling visualization of large protein complexes in near-native states by imaging vitrified samples, often achieving resolutions below 3 Å. These structural methods are increasingly integrated with mass spectrometry (MS), where techniques like hydrogen-deuterium exchange MS (HDX-MS) and cross-linking MS (XL-MS) provide dynamic information on solvent accessibility and residue proximities, aiding in fold prediction and validation of low-resolution models.

Interaction proteomics focuses on mapping protein-protein interactions (PPIs) to uncover functional networks within the proteome. The yeast two-hybrid (Y2H) system, a genetic assay that detects binary interactions by reconstituting a transcriptional activator in yeast cells, has been pivotal for high-throughput screening, identifying thousands of PPIs among yeast and human proteins. Affinity purification-mass spectrometry (AP-MS), which involves tagging a bait protein, pulling down interactors using affinity beads, and identifying them via MS, excels at capturing stable, multi-protein complexes and has mapped interactomes in diverse systems, including human signaling pathways.
These experimental approaches generate comprehensive PPI datasets, often revealing transient interactions missed by other methods, and are essential for distinguishing direct from indirect associations.

Network analysis of proteomic data integrates structural and interaction information to model biological systems as graphs, where nodes represent proteins and edges denote interactions or structural features. Hub proteins, characterized by high connectivity (often defined as more than 10 to 20 interactions), frequently serve as central coordinators in signaling pathways, such as PI3K hubs that propagate signals in cancer-related cascades, making them vulnerable points for dysregulation. Tools like the STRING database aggregate experimental, predicted, and literature-derived PPIs into searchable networks, enabling visualization of hubs and modules; as of 2025, the STRING database (version 12.5) integrates over 27 billion interactions across more than 12,000 organisms, highlighting pathway enrichments with confidence scores. Such analyses reveal scale-free topologies where hubs drive network robustness, informing targeted perturbations.

In drug discovery, structural and interaction data from proteomics facilitate structure-based drug design, where atomic models of proteins are used to computationally screen and optimize small-molecule ligands. For example, cryo-EM structures of sodium channels combined with PPI networks have guided simulations to develop selective inhibitors, as seen in the design of Nav1.7 blockers for pain. AP-MS-derived interaction maps prioritize hubs as therapeutic targets, enhancing accuracy by accounting for allosteric effects. This accelerates lead optimization, reducing experimental iterations in drug development pipelines.
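The notion of a hub can be made concrete by counting each protein's distinct interaction partners in an edge list. The sketch below is a minimal illustration using only the Python standard library; the function name, the toy protein identifiers, and the degree cutoff of 10 are assumptions chosen to mirror the range quoted above, not a fixed definition.

```python
from collections import defaultdict


def find_hubs(edges, min_degree=10):
    """Return (degrees, hubs) for an undirected interaction network.

    edges:      iterable of (protein_a, protein_b) pairs.
    min_degree: number of distinct partners required to call a hub
                (an illustrative cutoff; published thresholds vary).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    degrees = {protein: len(partners) for protein, partners in neighbors.items()}
    # Sort hubs by decreasing degree, then by name for a stable order.
    hubs = sorted(
        (p for p, d in degrees.items() if d >= min_degree),
        key=lambda p: (-degrees[p], p),
    )
    return degrees, hubs


# Toy network: one hypothetical hub with twelve partners plus two side edges.
toy_edges = [("HUB1", f"P{i}") for i in range(1, 13)]
toy_edges += [("P1", "P2"), ("P3", "P4")]

degrees, hubs = find_hubs(toy_edges)
print("hubs:", hubs)
print("degree of HUB1:", degrees["HUB1"])
```

Storing partners in sets means duplicate edges reported by several experiments are counted once, which matters when interaction lists are merged from multiple sources.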

Bioinformatics and Computational Proteomics

Protein Identification and Quantification

Protein identification in proteomics primarily involves database searching algorithms that match experimental tandem mass spectrometry (MS/MS) spectra to theoretical spectra derived from protein sequence databases. These tools compare observed fragment spectra against fragments predicted from in silico digests of known protein sequences, scoring matches based on mass-to-charge ratios and intensities. Seminal algorithms include SEQUEST, which correlates uninterpreted MS/MS data with sequences using cross-correlation functions to assess spectral similarity, and Mascot, which employs a probabilistic scoring system to evaluate the likelihood of random matches. Such methods enable the assignment of spectra to peptides, facilitating proteome-wide identification from complex samples.

Quantification complements identification by measuring protein abundance levels, either relatively across samples or absolutely in calibrated systems. Label-free approaches, such as spectral counting, estimate abundance by tallying the number of MS/MS spectra assigned to each protein, assuming higher counts correlate with greater abundance; this method is straightforward and avoids labeling but can be biased toward more efficiently ionized peptides. Isotopic labeling techniques provide more precise relative quantification: SILAC incorporates stable isotopes (e.g., 13C or 15N) into proteins during cell culture, allowing direct comparison of light and heavy peptide pairs in the same MS run based on mass shifts. Similarly, iTRAQ uses isobaric tags that yield reporter ions in MS/MS fragmentation, enabling multiplexed quantification of up to eight samples by measuring distinct reporter ion intensities for relative or absolute (with added standards) protein levels.

Software suites like MaxQuant integrate identification and quantification pipelines, processing raw MS data to achieve high peptide identification rates (often >50% for high-resolution spectra) and proteome-wide quantification with part-per-billion mass accuracy.
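The first step of a database search, generating candidate peptides and their expected masses from a protein sequence, can be sketched as follows. This is a simplified illustration and not how SEQUEST, Mascot, or MaxQuant are implemented: it assumes the common trypsin rule (cleave after K or R unless followed by P), allows no missed cleavages or modifications, and uses standard monoisotopic residue masses. The function names and the example sequence are hypothetical.

```python
# Standard monoisotopic residue masses in daltons.
MONO_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide for the terminal H and OH
PROTON = 1.007276   # mass of the charge carrier


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when P follows."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(MONO_MASS[aa] for aa in peptide) + WATER


def peptide_mz(peptide, charge):
    """Mass-to-charge ratio of the protonated peptide at a given charge."""
    return (peptide_mass(peptide) + charge * PROTON) / charge


# Hypothetical sequence: the R is followed by P, so it is not cleaved.
for pep in tryptic_digest("AKGRPLK"):
    print(pep, round(peptide_mass(pep), 4), round(peptide_mz(pep, 2), 4))
```

A search engine would go on to predict fragment ions for each candidate whose precursor mass falls within tolerance of the observed value, then score those predictions against the measured spectrum.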
To control error rates in identifications, false discovery rate (FDR) estimation via the target-decoy approach is standard; this involves searching spectra against both real (target) and reversed or otherwise scrambled (decoy) protein databases, using the decoy hit rate to estimate and filter false positives, typically targeting 1% FDR at peptide and protein levels.

Challenges in protein identification and quantification arise from protein isoforms and post-translational modifications (PTMs), which generate sequence variants and mass shifts that complicate database matches. Isoforms from alternative splicing can lead to redundant or ambiguous assignments, requiring specialized indexing or de novo-assisted searches to resolve. PTMs, such as phosphorylation, add variable mass shifts that must be included as modification-specific residue masses in search parameters, which enlarges the search space and increases both computational cost and false positives. These issues underscore the need for hybrid strategies combining database searching with de novo sequencing to improve accuracy in diverse proteomes.
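The target-decoy filter described above can be sketched in a few lines. The example assumes a concatenated target-decoy search in which each peptide-spectrum match (PSM) carries a score and a decoy flag, and it uses the simple estimator FDR = decoys / targets above a score cutoff. Other estimators exist, and the function name and toy scores are hypothetical.

```python
def filter_at_fdr(psms, max_fdr=0.01):
    """Find the most permissive score cutoff that keeps estimated FDR in bounds.

    psms:    list of (score, is_decoy) tuples, higher score = better match.
    max_fdr: maximum tolerated decoy/target ratio (0.01 = 1% FDR).
    Returns (score_cutoff, accepted_target_count); the cutoff is None
    when no prefix of the ranked list satisfies the bound.
    Score ties are not treated specially in this sketch.
    """
    ranked = sorted(psms, key=lambda psm: psm[0], reverse=True)
    targets = decoys = 0
    cutoff, accepted = None, 0
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among all matches scoring at least `score`.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff, accepted = score, targets
    return cutoff, accepted


# Toy ranked list: the decoy at score 8 spoils everything below score 9
# under a strict 1% bound.
toy_psms = [(10, False), (9, False), (8, True), (7, False)]
print(filter_at_fdr(toy_psms, max_fdr=0.01))
print(filter_at_fdr(toy_psms, max_fdr=0.5))
```

Because the loop keeps the lowest-scoring position at which the bound still holds, the result behaves like a q-value threshold rather than stopping at the first decoy encountered.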

Structure Prediction and Modeling

Structure prediction and modeling in proteomics involve computational algorithms that infer the three-dimensional (3D) architecture of proteins from their sequences, enabling insights into function, interactions, and disease mechanisms without relying solely on experimental determination. These methods are essential in proteomics workflows, where high-throughput sequencing generates vast primary structure data that must be translated into spatial models to understand biological roles. Traditional approaches like homology modeling exploit evolutionary conservation by aligning target sequences to experimentally solved templates in databases such as the Protein Data Bank (PDB), achieving reliable predictions when sequence identity exceeds 30%. Ab initio methods, in contrast, predict structures using physical principles to simulate folding pathways, particularly for novel folds lacking close homologs.

A landmark advancement came with AlphaFold2, a deep learning system that revolutionized the field by achieving unprecedented accuracy in the 2020 Critical Assessment of Structure Prediction (CASP14) competition, with a median backbone root-mean-square deviation (RMSD) of 0.96 Å, approaching experimental resolution for many targets. This breakthrough, powered by attention-based neural networks trained on PDB structures and multiple sequence alignments, has enabled proteome-wide modeling, with predicted structures now available for nearly all human proteins, although confidence varies between regions of a protein. Subsequent developments, such as AlphaFold 3 released in May 2024, have further improved predictions for protein complexes, including interactions with DNA, RNA, ligands, and ions, enhancing applicability to dynamic proteomic systems.

Complementing these, tools like Rosetta employ fragment assembly and energy minimization to generate diverse structural ensembles, useful for refining models and designing variants in de novo scenarios.
Similarly, I-TASSER integrates threading with ab initio refinement to produce ranked ensembles of models, incorporating spatial restraints from predicted contacts for improved accuracy in multi-domain proteins.

In proteomics, predicted models are often validated and refined using mass spectrometry (MS) data, particularly cross-linking MS (XL-MS), which identifies residue pairs that lie within cross-linker reach in native complexes and thereby provides distance restraints to score and constrain computational outputs. For instance, XL-MS-derived distance maps can filter AlphaFold ensembles, resolving ambiguities in flexible regions and supporting predicted interfaces. This integration bridges computational prediction with experimental proteomics, enhancing reliability for dynamic systems. Such modeling aids in dissecting folding pathways for amyloidogenic proteins, linking sequence variations to neurodegeneration. Experimental structures from cryo-EM or other structural methods, as explored in interaction analyses, serve as benchmarks for these predictions.
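One simple way to score a predicted model against XL-MS data is to count how many observed cross-links connect residues that lie within the cross-linker's reach in the model. The sketch below assumes Cα coordinates in ångströms and a 30 Å cutoff, a value often quoted for lysine-reactive linkers such as DSS or BS3; the cutoff, the function name, and the toy coordinates are assumptions, not parameters of any specific tool.

```python
import math


def crosslink_satisfaction(ca_coords, crosslinks, max_distance=30.0):
    """Fraction of cross-links whose residues are within reach in a model.

    ca_coords:    dict mapping residue number -> (x, y, z) of its C-alpha atom.
    crosslinks:   list of (residue_i, residue_j) pairs observed by XL-MS.
    max_distance: C-alpha to C-alpha cutoff in angstroms (assumed value).
    """
    if not crosslinks:
        return 0.0
    satisfied = 0
    for res_i, res_j in crosslinks:
        distance = math.dist(ca_coords[res_i], ca_coords[res_j])
        if distance <= max_distance:
            satisfied += 1
    return satisfied / len(crosslinks)


# Toy model with three residues on a line and two hypothetical cross-links.
toy_coords = {1: (0.0, 0.0, 0.0), 2: (10.0, 0.0, 0.0), 3: (50.0, 0.0, 0.0)}
toy_links = [(1, 2), (1, 3)]
print(crosslink_satisfaction(toy_coords, toy_links))
```

Ranking an ensemble of candidate models by this fraction gives a coarse filter; violated cross-links may also reflect genuine conformational flexibility rather than a wrong model.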

Post-Translational Modification Analysis

Post-translational modifications (PTMs) introduce functional diversity to proteins, and their computational analysis in proteomics involves detecting, predicting, and quantifying these modifications from mass spectrometry (MS) data to understand regulatory mechanisms. Detection typically begins with MS data from enriched samples, such as those using immobilized metal affinity chromatography (IMAC) for phosphopeptides, followed by algorithmic assignment of modification sites. Site localization scores, such as the Ascore or probability-based metrics, evaluate the confidence of PTM placement on specific residues by comparing observed fragment intensities against theoretical spectra for possible isomers. These scores, often integrated into search engines like MaxQuant or Proteome Discoverer, achieve localization probabilities above 95% for high-confidence sites, enabling reliable identification amid spectral noise.

Prediction of PTM sites relies on computational models trained on sequence motifs and structural features to forecast potential modification hotspots. NetPhos, a neural network-based tool, predicts serine, threonine, and tyrosine phosphorylation sites with specificity around 0.88 by recognizing kinase consensus motifs from curated datasets. More advanced approaches, such as deep learning models like DeepMVP or MIND-S, incorporate evolutionary profiles, physicochemical properties, and 3D structures to predict multiple PTM types with AUC values exceeding 0.90, outperforming motif-based methods on benchmark datasets. These models are trained on high-quality annotations, reducing false positives in proteome-wide scans.

Quantification of PTM stoichiometry computationally assesses the fraction of modified protein forms under varying conditions, revealing dynamic regulation. Tools like FLEXIQuant-LF and multiFLEX-LF analyze label-free MS data by comparing modified and unmodified peptide signals, calculating occupancy ratios through precursor intensity ratios and normalization to total protein levels.
For instance, in signaling studies, these methods detect stoichiometry shifts from <10% to >50% upon stimulation, providing insights into pathway activation without isotopic labeling.

Databases centralize PTM knowledge for validation and model training. PhosphoSitePlus curates approximately 500,000 unique PTM sites across species as of 2024, integrating literature and MS evidence with tools for kinase-substrate mapping. This resource supports queries on regulatory contexts, facilitating integration with proteomic workflows.
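The idea of site occupancy can be illustrated with the simplest possible estimate, the modified signal divided by the sum of modified and unmodified signals for the same peptide. This is a naive sketch and not the FLEXIQuant-LF or multiFLEX-LF algorithm; it assumes both peptide forms ionize and are detected equally well, which often does not hold in practice. The function name and intensity values are hypothetical.

```python
def site_occupancy(modified_intensity, unmodified_intensity):
    """Naive PTM occupancy: modified / (modified + unmodified).

    Assumes equal detection efficiency for both peptide forms
    (a strong simplification made only for illustration).
    """
    total = modified_intensity + unmodified_intensity
    if total <= 0:
        raise ValueError("at least one peptide form must have signal")
    return modified_intensity / total


# Hypothetical precursor intensities before and after stimulation.
resting = site_occupancy(modified_intensity=1.0e5, unmodified_intensity=9.0e5)
stimulated = site_occupancy(modified_intensity=6.0e5, unmodified_intensity=4.0e5)
print(f"resting: {resting:.0%}, stimulated: {stimulated:.0%}")
```

Dedicated tools address the unequal-response problem in various ways, for example by inferring modification extent from the loss of the unmodified peptide relative to a reference sample.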

Integration with Systems Biology and Multi-Omics

Proteomics plays a pivotal role in systems biology by providing protein-level insights that complement genomic and other omics data, enabling a more comprehensive understanding of biological systems. In systems biology, the integration of proteomics with other omics layers reveals dynamic regulatory mechanisms that transcriptomics or genomics alone cannot capture, such as post-transcriptional control and protein function in cellular networks. This holistic approach facilitates the modeling of complex interactions, from molecular pathways to organism-wide responses, enhancing predictive capabilities for disease mechanisms and therapeutic interventions.

Proteogenomics exemplifies this integration by leveraging mass spectrometry (MS)-based proteomics to refine genome annotations. By searching MS-derived peptide spectra against genomic sequences, proteogenomics identifies novel peptides arising from unannotated genes, alternative splicing, or mutations, thereby improving gene models and discovering previously unknown protein-coding regions. For instance, early seminal work demonstrated that searching tandem MS spectra against a six-frame translation of genomic DNA can uncover non-canonical protein variants, with applications in human and microbial genomes. More recent advances have produced highly accurate proteogenomic knowledge bases, validating thousands of novel peptides across diverse species and enhancing annotation accuracy in projects like GENCODE.

Multi-omics integration further extends this by correlating proteomic data with transcriptomic and metabolomic profiles to uncover regulatory discrepancies and pathway activities. The Clinical Proteomic Tumor Analysis Consortium (CPTAC), active since the 2010s, has pioneered such efforts through pan-cancer studies that align quantitative proteomics with genomic and transcriptomic data, revealing protein-level alterations driving oncogenesis, such as kinase signaling dysregulation.
These analyses highlight the often poor correlation between mRNA and protein abundance, emphasizing the role of proteomics in identifying functional effectors; for example, CPTAC data from ovarian and other cancers showed that integrating the proteome with other omics layers elucidates metabolic reprogramming in tumors. By 2023, CPTAC's datasets encompassed 10 cancer types, providing resources for discovering multi-omics signatures of therapeutic resistance.

In network modeling, proteomics informs constraint-based approaches like flux balance analysis (FBA) by incorporating protein abundance as constraints on metabolic fluxes, bridging static genome-scale models with dynamic cellular states. Traditional FBA optimizes fluxes under stoichiometric constraints, but integrating proteomic data, such as enzyme levels, allows for realistic bounds on reaction rates, improving predictions of metabolic phenotypes under varying conditions. A key method, IOMA (integrative omics-metabolic analysis), combines proteomic and metabolomic data in FBA frameworks to elucidate flux distributions, demonstrating enhanced accuracy in predicting overflow metabolism. Recent extensions, like constrained allocation FBA, allocate limited protein resources across pathways, revealing trade-offs in growth versus stress responses.

Computational tools facilitate these integrations, with mixOmics serving as a widely adopted R package for multivariate analysis and pathway reconstruction across omics datasets. mixOmics employs sparse partial least squares methods to select correlated features from proteomics, transcriptomics, and other omics data, enabling the identification of shared biological pathways. For example, it has been applied to reconstruct signaling networks in cancer by integrating CPTAC-like data, prioritizing multi-omics modules for downstream validation. This tool's emphasis on feature selection supports interpretable results and systems-level hypotheses in diverse biological contexts.
As of 2025, emerging trends in multi-omics include deeper integration of AI and machine learning for predictive modeling of disease progression and personalized therapies, as well as spatial multi-omics approaches that combine proteomics with transcriptomics to map protein distributions in tissues.
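The mRNA-protein comparison mentioned above is often summarized with a rank correlation computed across genes. The sketch below implements Spearman correlation with the standard library only, assigning average ranks to ties; the function names and the abundance values are hypothetical and are not CPTAC data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene abundances (arbitrary units) in one sample.
mrna = [5.0, 8.0, 2.0, 9.0, 4.0]
protein = [3.0, 4.0, 6.0, 7.0, 1.0]
print(f"Spearman rho = {spearman(mrna, protein):.2f}")
```

A rank-based measure is used here because transcript and protein abundances span several orders of magnitude and are rarely linearly related on the raw scale.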

Advances in Single-Cell and Clinical Proteomics

Single-cell proteomics has seen significant methodological advancements since 2020, enabling the quantification of thousands of proteins from individual cells and revealing cellular heterogeneity in diseases like cancer. Techniques such as nanoPOTS (nanodroplet processing in one pot for trace samples) have evolved to support high-throughput analysis, with the nested nanoPOTS (N2) platform introduced in 2021 achieving identification and quantification of approximately 1,000 proteins per single cell while processing up to 240 cells per chip. Similarly, SCoPE-MS (single-cell proteomics by mass spectrometry) and its extension SCoPE2 have facilitated the detection of over 1,000 proteins from single mammalian cells, allowing for the mapping of protein-level variation during cell differentiation and in heterogeneous tumor microenvironments. These methods have been applied to study tumor heterogeneity, identifying distinct protein signatures in cell subpopulations. Recent 2024–2025 advances include the Chip-Tip workflow, which enhances sensitivity and scalability for analyzing over 1,500 single cells in high-throughput setups, and automated pipelines enabling profiling of 1,536 cells per experiment, pushing toward population-scale studies.

In clinical proteomics, progress in plasma proteome mapping has expanded the depth of detectable proteins, supporting applications in diagnostics. By 2023, large-scale studies using aptamer-based approaches such as SomaScan measured nearly 5,000 proteins across thousands of individuals, linking plasma protein profiles to organ-specific aging and risk assessment. Complementary platforms like Olink Explore HT have profiled over 5,400 proteins in plasma, enabling the discovery of circulating biomarkers for early detection. Recent mass spectrometry-based methods have achieved depths of approximately 4,500 proteins in plasma as of 2025.
Machine learning has further advanced clinical diagnostics by enhancing the interpretation of these proteomes; models developed since 2020 have improved outcome prediction and validation in precision medicine, such as identifying cardiovascular and other disease signatures with reduced false positives.

Looking ahead, proteomics is poised to enable monitoring through wearable biosensors, which could detect protein biomarkers in biofluids like sweat or interstitial fluid for continuous health tracking. In precision oncology, these advances promise to refine therapeutic targeting by combining single-cell proteome data with genomic profiles, facilitating dynamic adjustments to treatments. Spatial proteomics, recognized by Nature Methods as Method of the Year in 2024, integrates multi-omics to provide tissue-contextual insights into protein localization and interactions, with applications in cancer and neurodegeneration.

Despite these gains, challenges persist in scaling single-cell proteomics to population-level studies, where maintaining proteome depth amid increased throughput demands more automated and cost-effective workflows. Analytical bottlenecks, such as handling low-abundance proteins and integrating datasets from diverse cohorts, also hinder broader clinical adoption without sacrificing depth.

  43. [43]
    Detecting protein and post-translational modifications in single cells ...
    Aug 3, 2020 · Conventional protein-detection methods, such as western blots, enzyme-linked immunosorbent assay (ELISA) are generally difficult to downscale ...
  44. [44]
    Mass spectrometry-based proteomics - Nature
    Mar 13, 2003 · Aebersold, R., Mann, M. Mass spectrometry-based proteomics. Nature 422, 198–207 (2003). https://doi.org/10.1038/nature01511.
  45. [45]
    Parts per Million Mass Accuracy on an Orbitrap Mass Spectrometer ...
    Using automatic gain control and narrow mass ranges (SIM scans) we observed an average absolute mass error between 0.6 and 0.7 ppm in recent large scale ...
  46. [46]
    Peer Reviewed: Prefractionation Techniques in Proteome Analysis
    Two-Dimensional Separation for Proteomic Analysis. 2012, 57 ... Effects of chromatography conditions on intact protein separations for top-down proteomics.
  47. [47]
    Proteomics and disease: opportunities and challenges - PMC
    Proteomics has created opportunities to identify, investigate and target proteins that are differentially expressed in health and disease.
  48. [48]
    Applications of Proteomics in Drug Discovery - Technology Networks
    Oct 16, 2025 · Proteomics enables systematic mapping of protein expression and activity across tissues, cell types and disease states. This helps ...
  49. [49]
    Profiling disease-selective drug targets: from proteomics to ...
    Although comparative expression proteomics of diseased versus healthy cells can reveal disease-associated proteins, the approach cannot interrogate protein– ...
  50. [50]
    Optimizing differential expression analysis for proteomics data via ...
    May 9, 2024 · Differential expression analysis (DEA) for proteomics data is crucial for accurate detection of phenotype-specific proteins, which can be useful ...
  51. [51]
    Global Effects of Kinase Inhibitors on Signaling Networks Revealed ...
    The majority of proteomics studies in kinase drug development have focused on direct binding targets of inhibitor compounds. The quantitative phosphoproteomics ...
  52. [52]
    Activity-based protein profiling: A graphical review - PubMed Central
    Activity-based protein profiling (ABPP) is a chemoproteomic technology using small chemical probes to directly interrogate protein function within complex ...
  53. [53]
    Advanced Activity-Based Protein Profiling Application Strategies for ...
    Apr 8, 2018 · This review focuses on the overall workflow of the ABPP technology and on additional advanced strategies for target identification and/or drug discovery.
  54. [54]
    Activity-based protein profiling: Recent advances in medicinal ...
    Apr 1, 2020 · ABPP is a powerful chemical biology technique, using small molecule probes to covalently bind and modify the active sites of proteins.
  55. [55]
    Proteomics for cancer drug design - PMC - NIH
    Cancer drug development relies on proteomic technologies to identify potential biomarkers, mechanisms-of-action, and to identify protein binding hot spots.
  56. [56]
    Clinical Proteomics of Breast Cancer Reveals a Novel Layer of ...
    These findings utilize extensive proteomics to identify a novel luminal breast cancer subtype, highlighting the added value of clinical proteomics in.
  57. [57]
    Targeting HER2-positive breast cancer: advances and future ...
    Nov 7, 2022 · This Review discusses the current standards of care for HER2-positive breast cancer, mechanisms of resistance to HER2-targeted therapy and new therapeutic ...
  58. [58]
    Integrating Pharmacoproteomics into Early-Phase Clinical ...
    The interaction of every drug with the human proteome is complex and diverse. Drugs may affect protein-protein, protein-nucleic acid, other protein interactions ...
  59. [59]
    Precision medicine: from pharmacogenomics to pharmacoproteomics
    Sep 26, 2016 · In this article, we discuss the current status of pharmacogenomics in precision medicine and highlight the needs for concordant analysis at the proteome and ...
  60. [60]
    (PDF) Pharmacoproteomics in drug development - ResearchGate
    Aug 6, 2025 · Proteomes are significantly used to investigate different protein expressions and modifications that will affect the body's biological processes ...
  61. [61]
    Proteomic Approaches for the Discovery of Biofluid Biomarkers of ...
    Aug 31, 2018 · This review is intended as an overview of how modern proteomic techniques (liquid chromatography mass spectrometry (LC-MS/MS) and advanced capture-based ...
  62. [62]
    Proteomics of human biological fluids for biomarker discoveries
    We review proteomic technologies for the identification of biomarkers. These are based on antibodies/aptamers arrays or mass spectrometry (MS), but new ones ...
  63. [63]
    The Role of Proteomics in Biomarker Development for Improved ...
    This review aims to (i) provide an overview of these technologies as well as describe some of the candidate PCa protein biomarkers that have been discovered ...
  64. [64]
    PSA and beyond: alternative prostate cancer biomarkers - PubMed
    A search for alternative prostate cancer biomarkers, particularly those that can predict disease aggressiveness and drive better treatment decisions.
  65. [65]
    Plasma/Serum Proteomics based on Mass Spectrometry - PMC - NIH
    In this review, we comprehensively introduce the background and advancement of technologies for blood proteomics, with a focus on mass spectrometry (MS).
  66. [66]
    High throughput and accurate serum proteome profiling by ...
    Mar 1, 2018 · We present a rapid and robust integrated DIA-based quantitative proteomic workflow for streamlined serum proteomic profiling from 1 μL serum.
  67. [67]
    MS-Based Proteomics of Body Fluids: The End of the Beginning
    May 18, 2023 · MS-based proteomics excels by its untargeted nature, specificity of identification, and quantification, making it an ideal technology for biomarker discovery ...
  68. [68]
    Multicenter Longitudinal Quality Assessment of MS-Based ...
    Feb 7, 2025 · Overall, 71 proteins are reproducibly detectable in all setups in both serum and plasma samples, and 22 of these proteins are FDA-approved ...
  69. [69]
    Proteomics-driven noninvasive screening of circulating serum ...
    Dec 18, 2023 · Mass spectrometry (MS)-based proteomics is in principle an ideal tool for biomarker discovery. However, proteomic analysis of serum or plasma ...
  70. [70]
    Comprehending the Proteomic Landscape of Ovarian Cancer
    May 25, 2021 · Here, we review single and multiple marker panels that have been identified through proteomic investigations of patient sera, effusions, and other biospecimens.
  71. [71]
    Proteomics-Derived Biomarker Panel Improves Diagnostic Precision ...
    In summary, these findings support the use of multi-marker panels for the differential diagnosis of difficult cases resembling endometrioid carcinoma and HGSC.
  72. [72]
    the OVA1 test, from biomarker discovery to FDA clearance - PubMed
    A recipe for proteomics diagnostic test development: the OVA1 test, from biomarker discovery to FDA clearance. Clin Chem. 2010 Feb;56(2):327-9.
  73. [73]
    Recipe for Proteomics Diagnostic Test Development: The OVA1 Test ...
    Feb 1, 2010 · A Recipe for Proteomics Diagnostic Test Development: The OVA1 Test, from Biomarker Discovery to FDA Clearance.
  74. [74]
    Lessons Learned from the First FDA-Cleared In Vitro Diagnostic ...
    The OVA1 test is an In Vitro Diagnostic Multivariate Index Assay (IVDMIA) of Proteomic Biomarkers that has been recently cleared by the FDA for assessing ...
  75. [75]
    NMR and X-ray Crystallography, Complementary Tools in Structural ...
    Here we report a comparison for 263 unique proteins screened by both NMR spectroscopy and X-ray crystallography in our structural proteomics pipeline. Only 21 ...
  76. [76]
    State-of-the-Art and Future Directions in Structural Proteomics
    Native MS complements orthogonal methods such as cryo-EM, NMR spectroscopy, X-ray crystallography, and small-angle scattering methods, providing snapshots ...
  77. [77]
    Structural proteomics, electron cryo-microscopy and structural ...
    Feb 19, 2020 · In this review, we focus on proteomics, electron cryo-microscopy and structural modeling to showcase instances where affinity-purification (AP) and cross- ...
  78. [78]
    Mass Spectrometry Structural Proteomics Enabled by Limited ...
    Sep 19, 2024 · This review focuses on two powerful MS-based techniques for peptide-level readout, namely limited proteolysis-mass spectrometry (LiP-MS) and cross-linking mass ...
  79. [79]
    State-of-the-Art and Future Directions in Structural Proteomics - PMC
    Sep 3, 2025 · CryoEM/X-ray crystallography: Map cross-links onto 3D densities to confirm subunit positions and resolve ambiguous interfaces in medium- ...
  80. [80]
    Yeast Two-Hybrid, a Powerful Tool for Systems Biology - PMC
    This review provides an overview on available yeast two-hybrid methods, in particular focusing on more recent approaches.
  81. [81]
    Recent Advances in Mass Spectrometry-Based Protein Interactome ...
    This review highlights recent advancements in mass spectrometry-based techniques for mapping protein interactomes, including affinity purification, proximity ...
  82. [82]
    Mapping protein–protein interactions by mass spectrometry - Liu
    May 14, 2024 · This review highlights recent advances in enrichment methodologies for interactomes before MS analysis and compares their unique features and specifications.
  83. [83]
    Mass spectrometry‐based protein–protein interaction networks for ...
    Jan 12, 2021 · Here, we review MS techniques that have been instrumental for the identification of protein–protein interactions at a system‐level.
  84. [84]
    Hubs and bottlenecks in plant molecular signalling networks - Dietz
    Oct 19, 2010 · The review introduces the concept of networks, hubs and bottlenecks and describes four examples from plant science in more detail.
  85. [85]
    Protein–protein interaction networks: how can a hub protein bind so ...
    A single protein binds to a very large number of partners. In reality, it does not; rather, protein networks reflect the combination of multiple proteins.
  86. [86]
    STRING database in 2023: protein–protein association networks ...
    Nov 12, 2022 · The STRING database (https://string-db.org/) systematically collects and integrates protein–protein interactions—both physical interactions as ...
  87. [87]
    Identifying Hubs in Protein Interaction Networks | PLOS One
    Hub proteins have been reported to have special properties with respect to their level of co-expression with neighboring proteins in a protein interaction ...
  88. [88]
    Protein structure-based drug design: from docking to molecular ...
    Nov 14, 2017 · In this review we highlight recent advancements in applications of ligand docking tools and molecular dynamics simulations to ligand identification and ...
  89. [89]
    Molecular Docking and Structure-Based Drug Design Strategies - NIH
    The purpose of this review is to examine current molecular docking strategies used in drug discovery and medicinal chemistry.
  90. [90]
    Structure-based drug design: aiming for a perfect fit - Portland Press
    Nov 8, 2017 · In this editorial we provide a brief overview of the powerful impact of structure-based drug design (SBDD), which has its roots in computational and structural ...
  91. [91]
    Homology modeling in the time of collective and artificial intelligence
    Homology modeling is a method for building protein 3D structures using protein primary sequence and utilizing prior knowledge gained from structural ...
  92. [92]
    Before and after AlphaFold2: An overview of protein structure ...
    Feb 27, 2023 · In this mini-review, we provide an overview of the breakthroughs in protein structure prediction before and after AlphaFold2 emergence.
  93. [93]
    Highly accurate protein structure prediction with AlphaFold - Nature
    Jul 15, 2021 · In CASP14, AlphaFold structures were vastly more accurate than competing methods. AlphaFold structures had a median backbone accuracy of 0.96 Å ...
  94. [94]
    Rosetta Commons – The hub for Rosetta modeling software.
    Rosetta Commons drives innovation in biomolecular modeling and design through cutting-edge computational methods and shared software.
  95. [95]
    I-TASSER: a unified platform for automated protein structure ... - Nature
    Mar 25, 2010 · I-TASSER server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm.
  96. [96]
    Cross-Linking Mass Spectrometry for Investigating Protein ...
    Nov 19, 2021 · We summarize the most important cross-linking reagents, software tools, and XL-MS workflows and highlight prominent examples for characterizing proteins.
  97. [97]
    Mechanism of misfolding of the human prion protein revealed by a ...
    Mar 17, 2021 · Prion diseases are among the most relevant neurodegenerative disorder involving the misfolding and aggregation of otherwise-functional proteins.
  98. [98]
    structural insights into how prion proteins encipher heritable ... - Nature
    Jul 13, 2022 · The prion hypothesis embodies the radical concept that prion proteins contain the necessary information for infectious replication within their shape.
  99. [99]
    DeepMVP: deep learning models trained on high-quality data ...
    Aug 26, 2025 · We identified a total of 397,524 PTM sites across the six PTM types, including 33,010 acetylation sites on 6,766 proteins, 15,843 methylation ...
  100. [100]
    Modification Site Localization Scoring: Strategies and Performance
    In this article are discussed the main strategies currently used by software for modification site localization and ways of assessing the performance of these ...
  101. [101]
    Phosphorylation Site Localization Using Probability-Based Scoring
    A probability-based score that measures the likelihood of correct phosphorylation site localization based on the presence and intensity of site-determining ...
  102. [102]
    NetPhos 3.1 - DTU Health Tech - Bioinformatic Services
    The kinase specific predictions are identical to the predictions by NetPhosK 1.0. ...
  103. [103]
    MIND-S is a deep-learning prediction model for elucidating protein ...
    Mar 27, 2023 · We present MIND-S, a deep-learning-based PTM prediction tool utilizing protein-level information combined with its sequence and structure.
  104. [104]
    FLEXIQuant-LF to quantify protein modification extent in label ... - eLife
    Dec 7, 2020 · We introduce FLEXIQuant-LF, a software tool for large-scale identification of differentially modified peptides and quantification of their modification extent.
  105. [105]
    multiFLEX-LF: A Computational Approach to Quantify the ...
    Jan 27, 2022 · A computational tool that builds upon FLEXIQuant, which detects modified peptide precursors and quantifies their modification extent.
  106. [106]
    PhosphoSitePlus - Database Commons
    The number of unique PTMs in PSP is now more than 450 000 from over 22 000 articles and thousands of MS datasets.
  107. [107]
    PhosphoSitePlus
    PhosphoSitePlus® provides comprehensive information and tools for the study of protein post-translational modifications (PTMs) including phosphorylation, ...
  108. [108]
    Improving gene annotation using peptide mass spectrometry - PMC
    We present algorithms to construct and efficiently search spectra against a genomic database, with no prior knowledge of encoded proteins. By searching a corpus ...
  109. [109]
    Proteogenomics produces comprehensive and highly accurate ...
    Proteogenomics is an emerging field in which proteomics and genomics data are combined to improve genome annotation and study impact of genome variations at the ...
  110. [110]
    Improving GENCODE reference gene annotation using a high ...
    Jun 2, 2016 · Here we report a stringent workflow for the interpretation of proteogenomic data that could be used by the annotation community to interpret novel ...
  111. [111]
    Proteogenomic data and resources for pan-cancer analysis
    Aug 14, 2023 · The CPTAC dataset currently includes 10 cancer cohorts of prospectively collected tumors analyzed with genomics, transcriptomics, proteomics, ...
  112. [112]
    Integrating quantitative proteomics and metabolomics with a ...
    Jun 1, 2010 · This study presents a novel approach for integrating quantitative proteomic and metabolomic data with a genome-scale metabolic network model to ...
  113. [113]
    Constrained Allocation Flux Balance Analysis - Research journals
    Jun 29, 2016 · This work introduces a computational genome-scale framework (Constrained Allocation Flux Balance Analysis, CAFBA) which incorporates growth laws into canonical ...
  114. [114]
    Integration of proteomic data with genome‐scale metabolic models
    This review explores methodologies for incorporating proteomics data into genome‐scale models. Available methods are grouped into four distinct categories.
  115. [115]
    An R package for 'omics feature selection and multiple data integration
    We introduce mixOmics, an R package dedicated to the multivariate analysis of biological data sets with a specific focus on data exploration, dimension ...
  116. [116]
    mixOmics – From Single to Multi-Omics Data Integration
    mixOmics is an R package for exploring and integrating omics data, including transcriptomics, proteomics, lipidomics, microbiome, metagenomics and beyond.
  117. [117]
  118. [118]
  119. [119]
    Exploration of cell state heterogeneity using single-cell proteomics ...
    Sep 22, 2023 · SCoPE-MS: mass spectrometry of single mammalian cells quantifies proteome heterogeneity during cell differentiation. Genome Biol. 19, 161 ...
  120. [120]
    Organ aging signatures in the plasma proteome track health and ...
    Dec 6, 2023 · Plasma proteins can model organ aging. To test this, we measured 4,979 proteins in a total of 5,676 subjects across five independent cohorts ( ...
  121. [121]
    The Circulating Proteome Technological Developments, Current ...
    For instance, of the 4608 proteins listed in the 2023 Human Plasma Proteome PeptideAtlas, 44% are covered by the 5416 proteins of the Olink Explore HT and ...
  122. [122]
    Progress and trends on machine learning in proteomics during 1997 ...
    Research after 2020 demonstrates that AI and ML are increasingly being applied to enhance precision medicine and personalized treatment strategies. For ...
  123. [123]
  124. [124]
    Future directions: what LIES ahead for smart biochemical wearables ...
    Jan 6, 2025 · Next-generation non-invasive biochemical wearables hold promise in transforming healthcare by providing real-time, continuous monitoring of ...
  125. [125]
    Proteomics appending a complementary dimension to precision ...
    With the advent of the precision oncology era, proteomics delivered innovative insights into the mechanisms of cancer generation, empowering people to break ...
  126. [126]
    Promises and Challenges of populational Proteomics in Health and ...
    May 17, 2024 · We also highlight emergent challenges associated with study design, analytical considerations, and data integration as population-scale studies ...
  127. [127]
    A review of the current state of single-cell proteomics and future ...
    Jun 7, 2023 · We investigate the challenges associated with working with very small sample volumes and the acute need for robust statistical methods for data ...