
Protein Data Bank

The Protein Data Bank (PDB) is an international open-access archive that stores and disseminates three-dimensional structural data for biological macromolecules, including proteins, nucleic acids, and complex assemblies, primarily determined through experimental methods such as X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy. Established in 1971 at Brookhaven National Laboratory in the United States with just seven initial structures under the leadership of Walter Hamilton, the PDB has grown into a foundational resource for structural biology, enabling advancements in fundamental biology, biomedicine, biotechnology, and energy sciences by providing free access to atomic coordinates, sequences, and associated metadata for visualization, analysis, and research. Managed as a collaborative effort by the Worldwide Protein Data Bank (wwPDB) organization, which was formed in 2003 to ensure a unified global archive, the PDB is operated through distributed data centers, including the Research Collaboratory for Structural Bioinformatics (RCSB) in the US, the Protein Data Bank in Europe (PDBe), the Protein Data Bank Japan (PDBj), and the Biological Magnetic Resonance Data Bank (BMRB). This international framework handles data deposition, validation, and distribution, with the RCSB PDB serving as the primary US hub and offering tools for querying, downloading, and exploring over 244,000 entries as of November 2025, reflecting an annual growth rate of approximately 7% driven by technological advances in structure determination. The archive's economic impact is substantial: the estimated cost to replicate its data exceeds $23 billion, underscoring its role as a living digital resource that supports millions of users worldwide annually in drug discovery, protein engineering, and educational initiatives.

Overview

Definition and Scope

The Protein Data Bank (PDB) is a global, open-access repository that serves as the single worldwide archive for the three-dimensional (3D) structures of biological macromolecules, including proteins, nucleic acids, and their complexes. It primarily collects structures determined through experimental techniques such as X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM), ensuring that the data reflect empirically validated atomic arrangements. Established in 1971 and now managed by the Worldwide Protein Data Bank (wwPDB), the PDB was founded to systematically archive and freely disseminate structural biology data, facilitating its use in research, education, and drug discovery.

The scope of the PDB encompasses atomic coordinate files, associated experimental data (such as diffraction patterns or maps), and comprehensive metadata describing the structure determination, its results, and the biological context. As of November 2025, the archive contains over 244,000 released entries, representing a vast collection of experimentally derived models that span diverse biological systems from simple peptides to large macromolecular assemblies. While the PDB focuses exclusively on experimentally determined structures, it integrates with complementary resources for computed models, such as those generated by computational prediction methods, to provide broader access to predicted structures without including them in the experimental archive. This delimited scope keeps the PDB a reliable source of validated data, emphasizing experimental evidence over theoretical predictions to support reproducible science.

Importance and Impact

The Protein Data Bank (PDB) serves as a foundational resource in structural biology, enabling researchers to elucidate the three-dimensional structures of proteins and other biomolecules, which is essential for understanding their molecular functions, interactions, and mechanisms. This structural insight has profoundly influenced fields such as drug discovery, where atomic-level models guide the development of targeted therapeutics, and protein engineering, which modifies proteins for industrial and medical applications. Nearly 90% of published PDB structures are cited in journals focused on biochemistry and related disciplines, underscoring the archive's centrality to research.

The PDB's educational value stems from its open-access policy under the CC0 1.0 Universal Public Domain Dedication, which waives all copyright and related rights, allowing unrestricted use for research, learning, and teaching worldwide. This supports educational initiatives by providing free access to high-quality structural data and associated visualization tools, enabling students, educators, and non-experts to explore complex biomolecular architectures without specialized software. The RCSB PDB, for instance, offers interactive resources such as 3D viewers and tutorials that broaden access to structural data.

Beyond academia, the PDB has had wider societal impact by accelerating biomedical breakthroughs. During the COVID-19 pandemic, for example, over 1,500 SARS-CoV-2-related structures, including those of the spike protein critical for vaccine development, were rapidly deposited, informing the design of mRNA vaccines and antiviral drugs. Its contributions extend to biotechnology, where structural data has enabled innovations in molecular optimization and design, with an estimated annual use value exceeding $5.5 billion and a replacement cost for archived structures exceeding $23 billion.

Post-2020, the PDB has expanded to integrate computed structure models, such as those from AlphaFold, allowing hybrid analyses that combine experimental data with AI-generated predictions to address gaps in structural coverage and enhance functional studies. This linkage supports advanced research by providing complementary views of protein conformations, fostering innovations in predictive modeling and therapeutic targeting.

History

Establishment

The Protein Data Bank (PDB) was established in October 1971 at Brookhaven National Laboratory (BNL) in Upton, New York, USA, under the leadership of Walter Hamilton, who chaired the laboratory's chemistry department. The initiative arose from suggestions by members of the American Crystallographic Association (ACA) and participants at a 1971 workshop on the use of computers in structural chemistry held at BNL, aiming to create a centralized repository for experimentally determined three-dimensional structures of biological macromolecules. From its inception, the PDB collaborated with the Cambridge Crystallographic Data Centre (CCDC) in the UK to facilitate data exchange and standardization.

The archive launched with an initial collection of seven protein structures, primarily determined by X-ray crystallography, including the structure of rubredoxin from the bacterium Desulfovibrio vulgaris (PDB ID: 7RXN), one of the first small iron-sulfur proteins to be solved at atomic resolution. These early entries represented pioneering work in protein crystallography, such as the myoglobin and hemoglobin studies dating back to the 1950s and 1960s, now digitized for accessibility. Initial operations were funded by the U.S. National Science Foundation (NSF) and the National Institutes of Health (NIH), supporting the basic infrastructure for data collection and distribution.

In its formative years, the PDB faced significant logistical challenges due to the era's limited computing resources, with data submissions and distributions relying on manual methods such as punched cards and magnetic tapes sent via postal mail. To address this, the first formal deposition guidelines and file format were developed in 1972, specifying a text-based record structure compatible with punch-card technology to ensure consistent atomic coordinate representation. Key early contributors included Helen M. Berman, a co-founder who played a pivotal role in establishing the archive's foundational protocols.

Evolution and Milestones

The Protein Data Bank (PDB) experienced steady growth in the 1980s, reaching 100 deposited structures by 1982, primarily from X-ray crystallography experiments. This period also marked the initial integration of nuclear magnetic resonance (NMR) data, with the first NMR-derived structure released in 1988, expanding the archive beyond crystallographic methods to include solution-based structural information. By the early 1990s, the PDB had grown to 1,000 entries (reached in 1993), reflecting increased experimental capabilities and community adoption.

The late 1990s brought significant organizational and technological advancements. The archive reached 10,000 structures in 1999, coinciding with the transfer of management from BNL to the Research Collaboratory for Structural Bioinformatics (RCSB), completed that year to enhance data dissemination and sustainability. Web-based access was adopted during this decade, enabling broader remote querying and visualization of structures through early online interfaces. In 2003, the formation of the Worldwide Protein Data Bank (wwPDB) established a collaborative international framework to ensure unified archiving and validation. The 2000s saw further diversification with the inclusion of cryo-electron microscopy (cryo-EM) structures, starting around 2003 with initial de novo models from electron density maps.

Subsequent decades accelerated expansion and modernization. The PDB surpassed 100,000 entries in 2014, 200,000 in 2023, and 240,000 in late 2024, driven by advances in high-throughput methods and global contributions. In 2021, the PDB celebrated its 50th anniversary with global events organized by the wwPDB, highlighting its foundational role in structural biology. The Electron Microscopy Data Bank (EMDB) formally joined the wwPDB in 2021, integrating cryo-EM density maps more seamlessly with atomic models. Policy updates included mandating the PDBx/mmCIF format for all crystallographic depositions starting July 2019, improving data richness and interoperability. Weekly releases became the standard protocol, facilitating timely access to new data. From 2020 to 2025, enhancements added support for computed structure models, notably through 2024 RCSB updates that integrate predicted models to complement experimental entries.

Organization and Management

Worldwide Protein Data Bank (wwPDB)

The Worldwide Protein Data Bank (wwPDB) was established in 2003 as a non-profit consortium to coordinate the global management of the Protein Data Bank (PDB) archive, evolving from earlier national efforts to ensure a unified, publicly accessible resource for macromolecular structure data. It maintains uniform standards for data deposition, validation, and dissemination while not owning the underlying data, which remains in the public domain. The wwPDB's core tasks encompass developing and implementing deposition and annotation protocols, generating validation reports for submitted structures, and promoting open access to all archived data in line with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. These efforts include coordinating biocuration, data remediation, secure storage, and free distribution services across its network, all provided without charge to researchers worldwide.

Membership in the wwPDB is structured around full core members, which currently include the RCSB PDB (United States), PDBe (Europe), PDBj (Japan), BMRB (United States, focused on nuclear magnetic resonance data), and EMDB (electron microscopy data archive, integrated as a core member in 2021). The consortium also supports associate members, who contribute to specific activities, and federated members, who manage complementary resources through data exchange agreements, with new additions requiring unanimous approval.

Governance operates under the wwPDB Charter. The 2020 update, effective January 1, 2021, revised the 2003 and 2013 versions and emphasizes inclusivity, global equity, and enhanced collaboration. This framework includes annual meetings of core members, an Advisory Committee to address policies and disputes, and community input for standard updates, such as those to the PDBx/mmCIF format, ensuring consistent data integrity and accessibility.

Regional Data Centers

The Worldwide Protein Data Bank (wwPDB) oversees four primary regional data centers that collectively manage the deposition, processing, validation, and dissemination of Protein Data Bank (PDB) data, ensuring uniform access to the global archive. These centers (RCSB PDB in the United States, PDBe in Europe, PDBj in Japan, and BMRB in the United States) operate collaboratively to handle structures determined by various experimental methods, including X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM). Each center contributes specialized expertise while mirroring the full archive weekly to maintain synchronization and redundancy.

The RCSB Protein Data Bank (RCSB PDB), located at Rutgers, The State University of New Jersey, and the San Diego Supercomputer Center at the University of California, San Diego, serves as the primary U.S. site and acts as the official archive keeper for the wwPDB, with sole write access to the PDB core archive. It processes the majority of global depositions, particularly those from the Americas and Oceania, and provides a comprehensive web portal for data access, visualization, and download. RCSB PDB also develops educational resources to support teaching and outreach in structural biology, including interactive tools for exploring molecular structures, and has pioneered advancements such as the Mol* 3D viewer for enhanced data rendering. As the central hub for U.S.-based research, it emphasizes rigorous biocuration and annotation to ensure data quality.

The Protein Data Bank in Europe (PDBe), hosted at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) in Hinxton, United Kingdom, functions as the European regional center and is a founding member of the wwPDB. PDBe focuses on integrating PDB data with broader bioinformatics resources, facilitating connections to genomic and proteomic databases for enhanced analysis. It prioritizes depositions from Europe and Africa and supports cryo-EM data through close ties with the Electron Microscopy Data Bank (EMDB), promoting collaborative initiatives across European research networks. PDBe's contributions include specialized annotation for complex assemblies and ligands, adding value through region-specific outreach and training programs.

The Protein Data Bank Japan (PDBj), based at the Institute for Protein Research at Osaka University, operates as the Asian regional center and has been a founding wwPDB member since 2003. It handles a significant portion of depositions from Asia and the Middle East, offering Japanese-language interfaces and support to encourage participation from regional researchers. PDBj emphasizes the development of computational tools for data processing and analysis, tailored to diverse experimental techniques, and contributes to global biocuration efforts by annotating structures with a focus on Asian scientific priorities. Its role extends to fostering international collaborations, particularly within its region, to broaden the archive's representation.

The Biological Magnetic Resonance Data Bank (BMRB), based in the United States, specializes in archiving NMR spectroscopy data and has been a core wwPDB partner since 2006. BMRB maintains a dedicated archive for chemical shift assignments, restraint data, and other NMR-derived information, which complements atomic coordinate models in the PDB archive. It processes NMR-specific depositions globally, ensuring integration with full structural entries through wwPDB protocols, and provides resources for validating NMR assignments to support hybrid structure determination. As the sole center focused on biomolecular NMR, BMRB enhances the archive's utility for dynamics and conformational studies.

These regional centers coordinate under wwPDB oversight to mirror the entire PDB, EMDB, and BMRB archives weekly, preventing data silos and enabling each to offer unique value-added services, such as region-tailored annotations and interfaces, without duplicating core functions. This distributed model ensures equitable global access and draws on local expertise to sustain the archive's growth and reliability.

Contents

Types of Structures

The Protein Data Bank (PDB) archives a diverse array of biological macromolecules and their complexes, categorized primarily by their molecular composition. Protein structures form the core of the archive, encompassing single-chain polypeptides as well as multi-chain assemblies, such as enzymes, receptors, and structural proteins. Nucleic acids, including DNA and RNA molecules, are also prominently represented, often as standalone entities or in hybrid forms like DNA-RNA complexes. Protein-nucleic acid complexes, such as transcription factors bound to DNA or ribosomes with mRNA and tRNA, highlight functional interactions central to cellular processes. Additionally, the PDB includes lipids, typically as components of membrane protein structures, carbohydrates in the form of linear or branched oligosaccharides (e.g., glycans attached to proteins), and small molecules like ligands, cofactors, and solvent molecules (e.g., water or ions) that modulate biological activity.

Structures in the PDB are further classified by the experimental methods used for their determination, reflecting advances in biophysical techniques. X-ray crystallography dominates the archive, accounting for the vast majority of entries (approximately 81% as of late 2025), and provides high-resolution models (often better than 2 Å) of rigid, crystalline macromolecules, enabling detailed atomic visualization. Nuclear magnetic resonance (NMR) spectroscopy contributes solution-state structures, particularly for smaller, flexible proteins and domains (about 6% of entries), yielding ensembles that capture dynamic conformations in near-physiological conditions. Cryo-electron microscopy (cryo-EM) has seen rapid growth, representing around 12% of the archive by late 2025, and excels at resolving large macromolecular complexes and assemblies (e.g., viruses or ribosomes) at resolutions approaching atomic levels without requiring crystallization. Other methods, such as fiber diffraction or neutron diffraction, are less common but included for specialized cases like fibrous proteins.

The PDB features special collections that emphasize biologically significant or therapeutically relevant structures. Viral structures, including entire virions or key viral proteins (e.g., envelope glycoproteins), support vaccine and therapeutic design. Membrane proteins, often embedded in lipid bilayers, are curated with resources like the Orientations of Proteins in Membranes (OPM) database to aid studies of membrane transport and signaling. Antibody structures, frequently in complex with antigens, facilitate immunology research. Since 2020, there has been a marked increase in disease-related entries, particularly for COVID-19, with thousands of structures deposited to accelerate drug discovery and understanding of viral-host interactions (e.g., spike protein-ACE2 complexes).

Notably, the PDB centers on atomic coordinate models derived from experimental data and does not store raw experimental data such as diffraction images. Cryo-EM density maps are archived in the Electron Microscopy Data Bank (EMDB), and raw diffraction datasets are held in complementary repositories such as the SBGrid Data Bank.

Statistics and Growth

As of November 2025, the Protein Data Bank (PDB) archive contains 245,074 entries. These structures are primarily determined by X-ray crystallography (199,418 entries), followed by electron microscopy (cryo-EM; 30,264 entries) and nuclear magnetic resonance (NMR; 14,632 entries), with the remainder (760 entries) from other or hybrid methods.

The PDB has grown rapidly since its inception, starting with just 7 entries in 1971. Key milestones include reaching 100,000 entries in 2014, 150,000 in 2019, and surpassing 200,000 by early 2023. Post-2020, annual additions have averaged 14,000 to 15,000 structures, driven by advances in instrumentation and high-throughput methods such as cryo-EM. Notable trends include the rapid rise in cryo-EM structures, which constituted less than 1% of the archive in 2010 but grew to approximately 12% by 2025. Multi-method structures, which combine two or more techniques (for example, cryo-EM with another method), have also increased modestly, numbering around 200 to 300 new entries annually in recent years. Geographically, depositions remain dominated by the United States and Europe, though Asia's contributions have grown steadily since the early 2000s.

The PDB undergoes weekly updates, typically on Wednesdays at 00:00 UTC, incorporating new releases, revisions, and status changes. The core archive's file size exceeds 1.4 terabytes as of 2024, encompassing coordinate files, validation reports, and associated data.
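The per-method shares quoted elsewhere in this article (roughly 81%, 12%, and 6%) follow directly from the entry counts above. The short Python sketch below simply redoes that arithmetic; the dictionary labels are shorthand chosen here for illustration, and the numbers are copied from the paragraph above rather than fetched from any live service.

```python
# Entry counts by experimental method, copied from the statistics above
# (November 2025). The labels are illustrative shorthand, not official names.
counts = {
    "X-ray crystallography": 199_418,
    "Electron microscopy": 30_264,
    "NMR": 14_632,
    "Other or hybrid": 760,
}


def method_shares(counts):
    """Return each method's share of the archive as a percentage."""
    total = sum(counts.values())
    return {method: 100.0 * n / total for method, n in counts.items()}


total = sum(counts.values())
shares = method_shares(counts)

print(f"total entries: {total}")
for method, pct in shares.items():
    print(f"{method:<22} {pct:5.1f}%")
```

The four counts add up to the stated archive total, which is a quick consistency check on the figures.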

Data Submission and Validation

Deposition Process

The deposition process for structures in the Protein Data Bank (PDB) begins with researchers preparing their data through the wwPDB OneDep system, a unified online portal that streamlines submissions across all wwPDB partner sites. This platform supports deposition of atomic coordinates and associated experimental data for structures determined by various methods, including X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, cryo-electron microscopy (cryo-EM), neutron diffraction, and combinations thereof, while accommodating multiple file formats such as PDBx/mmCIF. Depositors initiate a session by providing an email address and selecting the principal investigator's institution, receiving a session ID and password to manage uploads securely; sessions remain active for up to three months if not submitted.

Key preparation steps involve generating atomic coordinate files using refinement software like PHENIX or REFMAC5, ensuring inclusion of all residues (including expression tags and disordered segments), and verifying sequences against databases such as UniProt. Metadata, including experimental details like diffraction data or maps, must be compiled alongside ligand information validated against the Chemical Component Dictionary. Local validation is recommended using the wwPDB Validation Server to identify issues early, supplemented by tools like PDB-REDO for model optimization and rebuilding, which helps improve geometry and reduce processing delays.

Once prepared, files are uploaded via the OneDep interface to a preferred regional data center (e.g., RCSB PDB in the United States, PDBe in Europe, or PDBj in Asia), where wwPDB biocurators perform annotation, including standardization of nomenclature and integration of validation reports; this step typically takes 1-2 weeks, though simpler entries may process in hours. Depositions require comprehensive experimental metadata, such as method-specific parameters and resolution metrics. Cryo-EM structures additionally require deposition of map volumes and must meet quality standards assessed through metrics like half-map resolutions and Fourier shell correlation (FSC) curves, with acceptance depending on justification and validation outcomes. Deposition is mandatory for publication in many major journals, which require the assigned PDB ID and wwPDB validation reports to accompany manuscripts.

Following annotation, depositors review the processed files and select a release status before final submission: immediate release (REL), hold until publication (HPUB, up to one year), or private hold (HOLD, up to one year). In response to the COVID-19 pandemic, post-2020 procedures included expedited annotation and release for SARS-CoV-2-related structures to accelerate research, resulting in over 1,500 such entries processed rapidly during 2020-2021. Additionally, the OneDep system has integrated support for computed structure models (CSMs), allowing joint deposition of predicted models (e.g., from AlphaFold) alongside experimental data to enhance archive completeness.

Validation and Quality Control

The wwPDB employs a validation pipeline that integrates automated computational checks with expert biocurator review to assess the quality of submitted structures. Automated tools process model coordinates and experimental data; they include MolProbity for evaluating backbone and side-chain geometry (including Ramachandran outliers and steric clashes) and Mogul for checking ligand geometry against the Cambridge Structural Database, alongside other programs for overall structure quality metrics. Biocurators then perform manual inspections to verify compliance with community standards recommended by wwPDB Validation Task Forces. This pipeline culminates in the generation of wwPDB Validation Reports (detailed PDF and XML documents) for each entry, provided to depositors during processing and released publicly alongside the structure.

Validation criteria are tailored to the experimental technique and focus on key indicators of reliability. For X-ray crystallography, essential metrics include resolution (high-quality structures are typically at ≤1.8 Å) and R-free values to gauge model refinement and overfitting. Cryo-EM structures are scrutinized for map quality, encompassing half-map comparisons and Fourier shell correlation (FSC) curves to confirm the reported resolution. Across methods, chemical accuracy is rigorously checked, including bond lengths, angles, torsion angles, and chirality, to ensure consistency with known chemical principles.

Upon validation, structures with major discrepancies, such as poor fit to the experimental data or unresolved geometric errors, may face rejection or require depositor revisions, though the wwPDB accepts nearly all submissions meeting minimum criteria after corrections. Public reports transparently flag outliers, like non-favored Ramachandran angles or clash scores exceeding thresholds, enabling users to evaluate entry quality.

From 2020 to 2025, enhancements have strengthened validation for emerging data types. Cryo-EM reports now include advanced map-model fit assessments and FSC-based resolution percentiles for better comparability. Ligand validation has improved via geometric diagrams, electron density fits (e.g., using RSCC and RSR metrics), and outcomes from the 2016 Ligand Validation Workshop and 2024 EMDataResource Ligand Model Challenge. In 2025, wwPDB introduced the 3DEM Model-Map percentile slider and Q-score slider to validation reports, aiding in the assessment of local resolution and atom resolvability in cryo-EM maps. Integration with the ModelCIF format, developed by the wwPDB ModelCIF Working Group, facilitates validation and archiving of computed structure models, extending PDB standards to predictive ensembles.
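To make the R-free metric mentioned above more concrete, the sketch below computes the conventional crystallographic R-factor, R = Σ| |F_obs| − |F_calc| | / Σ|F_obs|, separately over a working set and a held-out "free" set of reflections. It is a simplified illustration rather than the wwPDB pipeline's implementation: real refinement programs also apply scaling between observed and calculated amplitudes, which is omitted here, and the function names and toy amplitudes are invented for the example.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, free_flags):
    """Split reflections by a boolean 'free' flag and compute R on each subset.

    Reflections flagged as free are excluded from refinement, so a large gap
    between R-free and R-work hints that the model is overfitted.
    """
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work])
    r_free = r_factor([fo for fo, _ in test], [fc for _, fc in test])
    return r_work, r_free


# Toy amplitudes (made up for illustration only).
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [95.0, 84.0, 57.0, 46.0, 25.0]
free_flags = [False, False, False, True, True]

r_work, r_free = r_work_and_r_free(f_obs, f_calc, free_flags)
print(f"R-work = {r_work:.3f}, R-free = {r_free:.3f}")
```

Because the free reflections are never used to adjust the model, R-free behaves like a cross-validation score, which is why validation reports quote it alongside resolution.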

File Formats

Legacy PDB Format

The legacy Protein Data Bank (PDB) format is a plain-text file structure designed for storing atomic coordinates and associated metadata of biological macromolecules, using fixed-width columns across 80-character lines to ensure compatibility with early computing systems like punched cards. Introduced in 1976 as an evolution of earlier formats, it provided a standardized way to archive three-dimensional structures determined primarily by X-ray crystallography and NMR spectroscopy. It served as the primary medium for data exchange in the PDB archive until support was frozen in 2012. New crystallographic depositions have since transitioned to more flexible alternatives like PDBx/mmCIF, while the legacy PDB format remains accepted for NMR and cryo-EM depositions, and legacy files remain accessible for historical entries.

Key records in the format include the HEADER record, which captures essential metadata such as the structure's classification, deposition date, and unique four-character PDB ID code in fixed positions (columns 1-6 for the record name, 11-50 for classification, 51-59 for the deposition date, and 63-66 for the ID code). The ATOM records describe coordinates for atoms in standard amino acid or nucleotide residues, while HETATM records handle non-standard residues and ligands, following the same layout as ATOM but with the record name changed to "HETATM" in columns 1-6. These coordinate records form the core of the file, listing atomic positions in orthogonal coordinates (X, Y, Z in Ångströms), along with attributes like occupancy and temperature factors.

The format's rigid structure imposes several limitations, including the 80-character line constraint, which restricts detailed annotations and leads to truncated or abbreviated data in complex entries. Atom serial numbers are confined to five digits (columns 7-11), limiting files to a maximum of 99,999 atoms and precluding representation of very large biomolecular assemblies without splitting into multiple files. Additionally, ambiguities arise from fixed column alignments: atom names occupy four characters (columns 13-16) and their alignment carries meaning, so that " CA " (an alpha carbon) must be distinguished from "CA  " (a calcium ion), and alternate conformations and insertion codes are handled inconsistently across software. A representative example is the ATOM record, which adheres to the following column specifications:
Columns | Field name | Description | Example
1-6 | Record name | Fixed as "ATOM  " | ATOM
7-11 | Serial number | Atom index (integer, right-justified) | 1
13-16 | Atom name | Name of the atom (padded with spaces if shorter than 4 characters) | N
17 | Alternate location | Indicator for conformers (blank or A-Z) | (blank)
18-20 | Residue name | Three-letter residue code | ASP
22 | Chain ID | Single character identifier | A
23-26 | Residue sequence | Sequence number (integer) | 23
27 | Insertion code | For residue insertions (A-Z or blank) | (blank)
31-38 | X coordinate | Orthogonal X in Ångströms (real) | 28.000
39-46 | Y coordinate | Orthogonal Y in Ångströms (real) | 33.000
47-54 | Z coordinate | Orthogonal Z in Ångströms (real) | 45.000
55-60 | Occupancy | Fractional occupancy of the atom position (real) | 1.00
61-66 | Temperature factor | B-factor (real) | 15.00
77-78 | Element | Element symbol (right-justified) | N
79-80 | Charge | Atom charge (or blank) | (blank)
An example ATOM line, shown with its fixed-width spacing preserved, reads:
ATOM      1  N   ASP A  23      28.000  33.000  45.000  1.00 15.00           N
Each value sits in its assigned columns, which is what keeps the record parseable despite the format's constraints.
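Because every field lives at a fixed column range, an ATOM or HETATM record can be decoded by slicing rather than by splitting on whitespace. The following Python sketch illustrates this using the column ranges from the table above; the `AtomRecord` class and `parse_atom_line` function are names invented for this example, and the sample line is assembled field by field so that the column widths are explicit.

```python
from dataclasses import dataclass


@dataclass
class AtomRecord:
    serial: int
    name: str
    alt_loc: str
    res_name: str
    chain_id: str
    res_seq: int
    i_code: str
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float
    element: str
    charge: str


def parse_atom_line(line: str) -> AtomRecord:
    """Decode one ATOM/HETATM record by fixed column positions.

    PDB columns are 1-based and inclusive, so columns 7-11 map to the
    Python slice [6:11].
    """
    line = line.rstrip("\n").ljust(80)  # pad so short lines can still be sliced
    if line[0:6] not in ("ATOM  ", "HETATM"):
        raise ValueError("not an ATOM or HETATM record")
    return AtomRecord(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        alt_loc=line[16].strip(),
        res_name=line[17:20].strip(),
        chain_id=line[21].strip(),
        res_seq=int(line[22:26]),
        i_code=line[26].strip(),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
        element=line[76:78].strip(),
        charge=line[78:80].strip(),
    )


# The example record from the text, built field by field (widths in comments).
EXAMPLE = "".join([
    "ATOM  ",    # columns 1-6   record name
    "    1",     # columns 7-11  serial number
    " ",         # column 12     unused
    " N  ",      # columns 13-16 atom name
    " ",         # column 17     alternate location
    "ASP",       # columns 18-20 residue name
    " ",         # column 21     unused
    "A",         # column 22     chain ID
    "  23",      # columns 23-26 residue sequence number
    " ",         # column 27     insertion code
    " " * 3,     # columns 28-30 unused
    "  28.000",  # columns 31-38 X
    "  33.000",  # columns 39-46 Y
    "  45.000",  # columns 47-54 Z
    "  1.00",    # columns 55-60 occupancy
    " 15.00",    # columns 61-66 temperature factor
    " " * 10,    # columns 67-76 unused
    " N",        # columns 77-78 element symbol
])

record = parse_atom_line(EXAMPLE)
print(record)
```

Note that stripping the atom name, as done here for readability, discards the alignment that separates an alpha carbon from a calcium ion; code that needs that distinction should keep the raw four-character field or rely on the element column.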

Modern Formats

The Macromolecular Crystallographic Information File (mmCIF), also known as PDBx/mmCIF, serves as the primary modern format for the Protein Data Bank (PDB), featuring a hierarchical, dictionary-based structure that organizes data into categories and data items for comprehensive representation of macromolecular structures. This format employs a self-describing syntax based on the Crystallographic Information Framework (CIF), allowing for flexible encoding of atomic coordinates, metadata, experimental details, and biological assemblies without the size constraints of earlier formats. Since July 1, 2019, submission of PDBx/mmCIF files has been mandatory for crystallographic depositions to the PDB, ensuring standardized data processing, annotation, and archiving across the Worldwide Protein Data Bank (wwPDB). To support future growth of the archive, PDBx/mmCIF accommodates extended 12-character PDB IDs (prefixed by "pdb_"), as the traditional four-character IDs are anticipated to be exhausted by 2028; the legacy PDB format cannot support these extended IDs.

Key advantages of PDBx/mmCIF include its support for unlimited numbers of atoms, residues, and chains, making it particularly suitable for large structures determined by techniques such as cryo-electron microscopy (cryo-EM), where models can exceed hundreds of thousands of atoms. It also facilitates detailed representation of ligands and non-standard residues through dedicated categories for chemical components and validation reports, enabling integration of quality metrics like clash scores and geometry assessments directly within the file. For instance, the _atom_site category records essential details for each atom, including positional coordinates (via _atom_site.Cartn_x, _atom_site.Cartn_y, and _atom_site.Cartn_z), occupancy, B-factors, and labels for chains, residues, and alternate conformations, providing a robust framework for precise structural description.

In addition to PDBx/mmCIF, the PDB supports complementary formats for specific needs. PDBML, an XML-based representation derived from the PDBx/mmCIF dictionary, was introduced in 2005 to enable machine-readable archival and web services for structures deposited from that year onward. BinaryCIF (BCIF), a compressed binary encoding of mmCIF data, offers enhanced efficiency for transfer and parsing of large datasets, reducing file sizes while maintaining full compatibility with text-based mmCIF tools. For computed structure models (CSMs), such as those from AlphaFold predictions, ModelCIF extends PDBx/mmCIF with additional definitions for ensemble representations and prediction metadata; it was first released in 2018 under the wwPDB ModelCIF Working Group to promote FAIR principles for non-experimental models.

All new PDB entries are now archived in PDBx/mmCIF format, with ongoing efforts to convert legacy structures to this standard, ensuring long-term accessibility and compatibility with modern computational workflows. This transition addresses limitations in older formats, such as fixed record lengths, by providing extensible support for emerging data types like cryo-EM density maps and validation.
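The contrast with the fixed-column legacy format is easiest to see in the _atom_site category, where a `loop_` header names each data item and the rows that follow supply whitespace-separated values in the same order. The sketch below reads such a loop from a small hand-written fragment. It is a deliberately minimal illustration that assumes no quoted or multi-line values; the function name and the sample fragment are invented here, and real files are better handled with an established parser such as Gemmi or Biopython.

```python
SAMPLE = """\
data_demo
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.label_atom_id
_atom_site.label_alt_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_seq_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
_atom_site.occupancy
_atom_site.B_iso_or_equiv
ATOM 1 N N  . ASP A 23 28.000 33.000 45.000 1.00 15.00
ATOM 2 C CA . ASP A 23 29.100 33.900 45.400 1.00 14.20
#
"""


def read_atom_site(text):
    """Return the rows of the first _atom_site loop as a list of dicts.

    Simplification: values are split on whitespace, so quoted strings and
    multi-line text fields (both legal in mmCIF) are not supported.
    """
    tags, rows = [], []
    state = "seek"  # "seek" -> "tags" -> "rows"
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if state == "seek":
            if line == "loop_":
                tags, state = [], "tags"
        elif state == "tags":
            if line.startswith("_"):
                tags.append(line)
            elif tags and all(t.startswith("_atom_site.") for t in tags):
                state = "rows"   # first data row of the loop we want
            else:
                state = "seek"   # some other loop; keep looking
        if state == "rows":
            if line.startswith(("#", "_", "loop_", "data_")):
                break            # end of the atom_site loop
            values = line.split()
            if len(values) != len(tags):
                raise ValueError("row does not match the loop header")
            rows.append(dict(zip(tags, values)))
    return rows


atoms = read_atom_site(SAMPLE)
coords = [
    (float(a["_atom_site.Cartn_x"]),
     float(a["_atom_site.Cartn_y"]),
     float(a["_atom_site.Cartn_z"]))
    for a in atoms
]
print(coords)
```

Since values are looked up by item name rather than by column position, adding a new data item or using atom indices beyond 99,999 does not disturb existing readers, which is the practical benefit of the self-describing layout.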

Accessing and Analyzing Data

Search and Retrieval

The Protein Data Bank (PDB) provides multiple search interfaces for discovering structural data, primarily through the RCSB PDB portal at rcsb.org and the PDBe Knowledge Base (PDBe-KB). The RCSB advanced search allows users to query by polymer sequence similarity, chemical attributes or similarity (e.g., using SMILES or InChI), and author names, enabling precise retrieval of entries matching specific criteria. Filters refine results by experimental method (e.g., X-ray crystallography, cryo-EM), resolution (e.g., high-resolution structures below 2 Å), and deposition or release date, with options to sort by recency. PDBe-KB integrates structural and functional annotations from multiple sources, offering aggregated views for proteins with superposed models and data, accessible via a unified search across experimental and computed structures.

PDB data can be downloaded in several formats to accommodate different user needs. Individual entries are available in PDBx/mmCIF, XML (PDBML), BinaryCIF, or legacy PDB formats, either compressed or uncompressed, directly from the RCSB or PDBe portals. For bulk access, users employ HTTPS-based services that replace the deprecated FTP service, including scripts for mirroring the full archive (e.g., via https://files.wwpdb.org/pub/pdb/data/structures/divided), where files are grouped into subdirectories named after the two middle characters of the PDB ID for efficient retrieval of large datasets. Programmatic access is enabled through RESTful APIs, such as the RCSB Data API for querying entry metadata (e.g., GET https://data.rcsb.org/rest/v1/core/entry/4HHB returns details on structure 4HHB) and the Search API for advanced queries on sequences or ligands. In 2025, the Sequence Coordinates Service was introduced to better integrate protein structures and sequence mappings, replacing the 1D Coordinates Service as of May 31, 2025.

Key features enhance search precision, including structure similarity searches that compare 3D shapes and pairwise alignment of protein chains using algorithms like TM-align. A more recent addition allows filtering by computed structure model (CSM) confidence, with a toggle to include or exclude AlphaFold-derived models, which are marked by an icon in results. The PDB is updated weekly with new releases, ensuring timely access to the latest structures. Usage reflects high demand, with wwPDB services supporting millions of users annually and over 3.6 billion file downloads in 2024 alone across FTP and web-based access. Regional centers like PDBe and PDBj offer mirrored portals with similar search and retrieval capabilities for global accessibility.
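The retrieval routes described above can be sketched as plain URL and payload builders, with no network calls. The two base URLs come from the text; the helper names are hypothetical. The search payload is an assumption about the RCSB Search API request shape (the attribute names `exptl.method` and `rcsb_entry_info.resolution_combined`, the operators, and the endpoint in the comment should all be checked against the current API documentation before use).

```python
import json

FILES_BASE = "https://files.wwpdb.org/pub/pdb/data/structures/divided"
DATA_API_BASE = "https://data.rcsb.org/rest/v1/core/entry"


def mmcif_download_url(pdb_id):
    """URL of the gzipped mmCIF file for a four-character PDB ID."""
    pid = pdb_id.strip().lower()
    if len(pid) != 4:
        raise ValueError("this sketch only handles four-character IDs")
    # Files are grouped by the two middle characters of the ID, e.g. 4hhb -> hh.
    return f"{FILES_BASE}/mmCIF/{pid[1:3]}/{pid}.cif.gz"


def data_api_entry_url(pdb_id):
    """URL for a GET request to the Data API entry endpoint."""
    return f"{DATA_API_BASE}/{pdb_id.strip().upper()}"


def high_res_xray_query(max_resolution=2.0):
    """Assumed Search API payload: X-ray entries with resolution below a cutoff.

    Intended for a POST to https://search.rcsb.org/rcsbsearch/v2/query
    (endpoint and attribute names are assumptions; verify before relying on them).
    """
    def terminal(attribute, operator, value):
        return {
            "type": "terminal",
            "service": "text",
            "parameters": {"attribute": attribute, "operator": operator, "value": value},
        }

    return {
        "query": {
            "type": "group",
            "logical_operator": "and",
            "nodes": [
                terminal("exptl.method", "exact_match", "X-RAY DIFFRACTION"),
                terminal("rcsb_entry_info.resolution_combined", "less", max_resolution),
            ],
        },
        "return_type": "entry",
    }


print(mmcif_download_url("4HHB"))
print(data_api_entry_url("4hhb"))
print(json.dumps(high_res_xray_query(), indent=2))
```

Keeping URL construction separate from the HTTP client makes it easy to swap in any download tool, and to batch or rate-limit requests when mirroring many entries.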

Visualization and Analysis Tools

The RCSB Protein Data Bank (PDB) integrates the Mol* Viewer as its primary web-based tool for interactive 3D visualization and analysis of macromolecular structures, enabling users to explore atomic coordinates, density maps, and associated annotations without software installation. Developed collaboratively by the Protein Data Bank in Europe (PDBe) and RCSB PDB, Mol* supports streaming of large datasets, structure superposition, visualization, and measurement of distances, angles, and torsion angles, facilitating rapid assessment of structural features like binding sites and conformational changes.

For advanced analysis, the RCSB PDB provides the Pairwise Structure Alignment Tool, which computes and visualizes alignments between user-selected PDB entries or uploaded models using algorithms like TM-align, highlighting conserved regions and structural differences. This tool integrates with Mol* for immediate 3D rendering of results, supporting research into evolutionary relationships and functional similarities. Additionally, the PDB's Sequence Viewer and Interaction tools allow analysis of residue-level interactions, hydrogen bonds, and chemical components within visualized structures.

Beyond web interfaces, widely adopted desktop software enhances PDB data exploration. PyMOL, an open-source molecular visualization system, excels in high-quality rendering, ray-tracing, and scripting for custom analyses such as cavity detection and morphing between structures, with plugins for volume data and trajectories. UCSF ChimeraX, the successor to UCSF Chimera, offers multi-scale visualization and tools for density map fitting and ensemble analysis, processing PDB files alongside cryo-EM data for comprehensive workflows. Visual Molecular Dynamics (VMD) specializes in visualizing molecular dynamics simulations, supporting PDB trajectory analysis, volumetric rendering, and plugin extensions for electrostatics and secondary structure calculations. The NCBI's iCn3D provides a free web-based alternative, allowing structure viewing and interaction analysis (e.g., salt bridges, hydrophobic contacts) directly from PDB IDs or files, with export options for images and reports. Together, these tools enable researchers to derive insights from the PDB archive across experimental and computational data.
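The distance, angle, and torsion measurements that viewers report can be reproduced from atomic coordinates with a few vector operations. The sketch below is a generic implementation of the standard formulas, not the code used by Mol* or any other viewer; the helper names and the test coordinates are illustrative.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def distance(a, b):
    """Euclidean distance between two points (same units as the input, e.g. Å)."""
    return math.dist(a, b)


def angle(a, b, c):
    """Angle a-b-c in degrees, measured at the middle point b."""
    u, v = _sub(a, b), _sub(c, b)
    cos_t = _dot(u, v) / (math.hypot(*u) * math.hypot(*v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def dihedral(a, b, c, d):
    """Torsion angle a-b-c-d in degrees, in the range (-180, 180]."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of planes abc and bcd
    x = _dot(n1, n2)
    y = _dot(b2, _cross(n1, n2)) / math.hypot(*b2)
    return math.degrees(math.atan2(y, x))


# Illustrative points, not taken from a PDB entry.
p1, p2, p3, p4 = (0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 1.0)
print(distance(p1, p3), angle(p1, p2, p3), dihedral(p1, p2, p3, p4))
```

Using `atan2` rather than `acos` for the torsion keeps the sign of the angle, which is what distinguishes, for example, left- and right-handed backbone conformations.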
