Protein Data Bank
The Protein Data Bank (PDB) is an international open-access archive that stores and disseminates three-dimensional structural data for biological macromolecules, including proteins, nucleic acids, and complex assemblies, primarily determined through experimental methods such as X-ray crystallography, nuclear magnetic resonance spectroscopy, and cryo-electron microscopy.[1] Established in 1971 at Brookhaven National Laboratory in the United States with just seven initial structures under the leadership of Walter Hamilton, the PDB has grown into a foundational resource for structural biology, enabling advancements in fundamental biology, biomedicine, biotechnology, and energy sciences by providing free access to atomic coordinates, sequences, and associated metadata for visualization, analysis, and research.[2][3] Managed as a collaborative effort by the Worldwide Protein Data Bank (wwPDB) organization, which was formed in 2003 to ensure a unified global archive, the PDB is operated through distributed data centers, including the Research Collaboratory for Structural Bioinformatics (RCSB) in the US, the Protein Data Bank in Europe (PDBe), the Protein Data Bank Japan (PDBj), and the Biological Magnetic Resonance Data Bank (BMRB).[2] This international framework handles data deposition, validation, and distribution, with the RCSB PDB serving as the primary US hub and offering user-friendly tools for querying, downloading, and exploring over 244,000 entries as of November 2025, reflecting an annual growth rate of approximately 7% driven by technological advances in structure determination.[3][4][5] The archive's economic impact is profound, with the estimated cost to replicate its data exceeding $23 billion, underscoring its role as a living digital resource that supports millions of users worldwide annually in drug discovery, protein engineering, and educational initiatives.[3]
Overview
Definition and Scope
The Protein Data Bank (PDB) is a global, open-access repository that serves as the single worldwide archive for the three-dimensional (3D) structures of biological macromolecules, including proteins, nucleic acids, and their complexes.[6] It primarily collects structures determined through experimental techniques such as X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM), ensuring that the data reflect empirically validated atomic arrangements. Established in 1971 and now managed by the Worldwide Protein Data Bank (wwPDB), the PDB was founded to systematically archive and freely disseminate structural biology data, facilitating its use in research, education, and drug discovery.[6]
The scope of the PDB encompasses atomic coordinate files, associated experimental data (such as structure factors for crystallographic entries or restraints for NMR entries), and comprehensive metadata describing the structure determination process, validation results, and biological context. As of November 2025, the archive contains over 244,000 released entries, representing a vast collection of experimentally derived models that span diverse biological systems from simple peptides to large macromolecular assemblies.[4] While the PDB focuses exclusively on experimentally determined structures, it integrates with complementary resources for computed models, such as those generated by artificial intelligence methods, to provide broader access to predicted structures without including them in the core experimental archive. This delimited scope ensures the PDB remains a reliable source for high-quality, validated 3D data, emphasizing empirical evidence over theoretical predictions to support reproducible scientific advancements.
Importance and Impact
The Protein Data Bank (PDB) serves as a foundational resource in structural biology, enabling researchers to elucidate the three-dimensional structures of proteins and other biomolecules, which is essential for understanding their molecular functions, interactions, and mechanisms. This structural insight has profoundly influenced fields such as drug design, where atomic-level models guide the development of targeted therapeutics, and protein engineering, facilitating the modification of proteins for industrial and medical applications. Nearly 90% of published PDB structures are cited in journals focused on biochemistry and molecular biology, underscoring the archive's centrality to structural biology research.[7][8]
The PDB's educational value stems from its open-access policy under the CC0 1.0 Universal Public Domain Dedication, which waives all copyright and related rights, allowing unrestricted use for teaching, learning, and outreach worldwide. This accessibility supports educational initiatives by providing free access to high-quality structural data and associated visualization tools, enabling students, educators, and non-experts to explore complex biomolecular architectures without specialized software. The RCSB PDB, for instance, offers interactive resources like 3D viewers and tutorials that democratize structural biology education.[9][10][8]
Beyond academia, the PDB has driven broader societal impacts by accelerating breakthroughs in medicine and biotechnology; for example, during the COVID-19 pandemic, over 1,500 SARS-CoV-2-related structures, including those of the spike protein critical for vaccine development, were rapidly deposited, informing the design of mRNA vaccines and antiviral drugs.[11][12] Its contributions extend to biotechnology, where structural data has enabled innovations in enzyme optimization and biomaterial design, with an estimated annual use value exceeding $5.5 billion[13] and a replacement cost for archived structures exceeding $23 billion,[14] reflecting the immense economic leverage from enabled discoveries.
Post-2020, the PDB has expanded to integrate computed structure models, such as those from AlphaFold, allowing hybrid analyses that combine experimental data with AI-generated predictions to address gaps in structural coverage and enhance functional studies. This linkage supports advanced research by providing complementary views of protein conformations, fostering innovations in predictive modeling and therapeutic targeting.[15][16]
History
Establishment
The Protein Data Bank (PDB) was established in October 1971 at Brookhaven National Laboratory (BNL) in Upton, New York, USA, under the leadership of Walter Hamilton, who served as the laboratory's chairman of the chemistry department.[2][17] The initiative arose from suggestions by members of the American Crystallographic Association (ACA) and participants at a 1971 workshop on the use of computers in structural chemistry held at BNL, aiming to create a centralized repository for experimentally determined three-dimensional structures of biological macromolecules.[17] From its inception, the PDB collaborated with the Cambridge Crystallographic Data Centre (CCDC) in the UK to facilitate data exchange and standardization.
The archive launched with an initial collection of seven protein structures, primarily determined by X-ray crystallography, including the structure of rubredoxin from the bacterium Desulfovibrio vulgaris (PDB ID: 7RXN), one of the first small iron-sulfur proteins to be solved at atomic resolution.[2][18] These early entries represented pioneering work in protein crystallography, such as those from hemoglobin and myoglobin studies dating back to the 1950s and 1960s, now digitized for accessibility. Initial operations were funded by the U.S. National Science Foundation (NSF) and the National Institutes of Health (NIH), supporting the basic infrastructure for data collection and distribution.[19][20]
In its formative years, the PDB faced significant logistical challenges due to the era's limited computing resources, with data submissions and distributions relying on manual methods such as punched cards and magnetic tapes sent via postal mail.[20][21] To address this, the first formal deposition guidelines and file format were developed in 1972, specifying a text-based record structure compatible with punch-card technology to ensure consistent atomic coordinate representation.[22] Key early contributors included Helen M. Berman, a co-founder who played a pivotal role in establishing the archive's foundational protocols alongside Hamilton.[23][24]
Evolution and Milestones
The Protein Data Bank (PDB) experienced steady growth in the 1980s, reaching 100 deposited structures by 1982, primarily from X-ray crystallography experiments.[25] This period also marked the initial integration of nuclear magnetic resonance (NMR) data, with the first NMR-derived structure released in 1988, expanding the archive beyond crystallographic methods to include solution-based structural information.[2] By the early 1990s, the PDB had grown to 1,000 entries in 1993, reflecting increased experimental capabilities and community adoption.[25]
The late 1990s brought significant organizational and technological advancements. The archive reached 10,000 structures in 1999, coinciding with the transfer of management from Brookhaven National Laboratory to the Research Collaboratory for Structural Bioinformatics (RCSB) at Rutgers University, completed that year to enhance data dissemination and sustainability.[26] Web-based access was adopted during this decade, enabling broader remote querying and visualization of structures through early online interfaces. In 2003, the formation of the Worldwide Protein Data Bank (wwPDB) established a collaborative international framework to ensure unified archiving and validation.[2] The 2000s saw further diversification with the inclusion of cryogenic electron microscopy (cryo-EM) structures, starting around 2003 with initial de novo models from electron density maps.[27]
Subsequent decades accelerated expansion and modernization. The PDB surpassed 100,000 entries in 2014, 200,000 in 2023, and 240,000 in late 2024, driven by advances in high-throughput methods and global contributions.[28][4] In 2021, the PDB celebrated its 50th anniversary with global events organized by the wwPDB, highlighting its foundational role in structural biology.[29] The Electron Microscopy Data Bank (EMDB) formally joined the wwPDB in 2021, integrating cryo-EM density maps more seamlessly with atomic models.[30] Policy updates included mandating the PDBx/mmCIF format for all crystallographic depositions starting July 2019, improving data richness and interoperability.[31] Weekly releases became the standard protocol, facilitating timely access to new data.[1] From 2020 to 2025, enhancements supported computed structure models, notably through 2024 RCSB updates integrating AlphaFold predictions to complement experimental entries.[15]
Organization and Management
Worldwide Protein Data Bank (wwPDB)
The Worldwide Protein Data Bank (wwPDB) was established in 2003 as a non-profit international consortium to coordinate the global management of the Protein Data Bank (PDB) archive, evolving from earlier national efforts to ensure a unified, publicly accessible resource for structural biology data. It maintains uniform standards for data deposition, validation, and dissemination while not owning the underlying data, which remains in the public domain.[6][32]
The wwPDB's core tasks encompass developing and implementing deposition and annotation protocols, generating validation reports for submitted structures, and promoting open access to all archived data in line with FAIR (Findable, Accessible, Interoperable, and Reusable) principles. These efforts include coordinating biocuration, data remediation, secure storage, and free distribution services across its network, all provided without charge to researchers worldwide.[6][32]
Membership in the wwPDB is structured around full core members, which currently include the RCSB PDB (United States), PDBe (Europe), PDBj (Japan), BMRB (United States, focused on nuclear magnetic resonance data), and EMDB (electron microscopy data archive, integrated as a core member in 2021).[6][32][33] The consortium also supports associate members, who contribute to specific activities, and federated members, who manage complementary resources through data exchange agreements, with new additions requiring unanimous approval.[6][32]
Governance operates under the wwPDB Charter, with the 2020 update (effective January 1, 2021) emphasizing inclusivity, global equity, and enhanced collaboration by revising prior versions from 2003 and 2013. This framework includes annual meetings of core members, an Advisory Committee to address policies and disputes, and community input for standard updates, such as those to the PDBx/mmCIF format, ensuring consistent data integrity and accessibility.[32]
Regional Data Centers
The Worldwide Protein Data Bank (wwPDB) oversees four primary regional data centers that collectively manage the deposition, processing, validation, and dissemination of Protein Data Bank (PDB) data, ensuring uniform access to the global archive.[34] These centers (RCSB PDB in the United States, PDBe in Europe, PDBj in Japan, and BMRB in the United States) operate collaboratively to handle structures determined by various experimental methods, including X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and cryo-electron microscopy (cryo-EM).[35] Each center contributes specialized expertise while mirroring the full archive weekly to maintain synchronization and redundancy.[36]
The RCSB Protein Data Bank (RCSB PDB), located at Rutgers, The State University of New Jersey, and the San Diego Supercomputer Center at the University of California San Diego, serves as the primary U.S. site and acts as the official archive keeper for the wwPDB, with sole write access to the PDB core archive.[37] It processes the majority of global depositions, particularly those from North America, and provides a comprehensive web portal for data access, visualization, and download.[38] RCSB PDB also develops educational resources to support teaching and outreach in structural biology, including interactive tools for exploring molecular structures, and has pioneered advancements like the Mol* 3D viewer for enhanced data rendering.[3] As the central hub for U.S.-based research, it emphasizes rigorous biocuration and annotation to ensure data quality and interoperability.[39]
The Protein Data Bank in Europe (PDBe), hosted at the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) in Hinxton, United Kingdom, functions as the European regional center and founding member of the wwPDB.[40] PDBe focuses on integrating PDB data with broader bioinformatics resources, facilitating seamless connections to genomic and proteomic databases for enhanced analysis.[41] It prioritizes depositions from Europe and supports cryo-EM data through close ties with the Electron Microscopy Data Bank (EMDB), promoting collaborative initiatives across European research networks.[30] PDBe's contributions include specialized annotation for complex assemblies and ligands, adding value through region-specific outreach and training programs.[42]
The Protein Data Bank Japan (PDBj), based at the Institute for Protein Research at Osaka University, operates as the Asian regional center and a founding wwPDB partner since 2003.[43] It handles a significant portion of depositions from Asia, offering Japanese-language interfaces and support to encourage participation from regional researchers.[44] PDBj emphasizes the development of computational tools for data processing and analysis, tailored to diverse experimental techniques, and contributes to global biocuration efforts by annotating structures with a focus on Asian scientific priorities.[45] Its role extends to fostering international collaborations, particularly in the Pacific Rim, to broaden the archive's representation.[46]
The Biological Magnetic Resonance Data Bank (BMRB), situated at UConn Health in Farmington, Connecticut, United States, specializes in archiving NMR spectroscopy data as a core wwPDB partner since 2006.[47] BMRB maintains a dedicated repository for chemical shift assignments, restraint data, and other NMR-derived information, which complements atomic coordinate models in the PDB archive.[48] It processes NMR-specific depositions globally, ensuring integration with full structural entries through wwPDB protocols, and provides resources for validating NMR assignments to support hybrid structure determination.[49] As the sole center focused on biomolecular NMR, BMRB enhances the archive's utility for dynamics and conformational studies.[50]
These regional centers coordinate under wwPDB oversight to mirror the entire PDB, EMDB, and BMRB archives weekly, preventing data silos and enabling each to offer unique value-added services, such as region-tailored annotations and interfaces, without duplicating core functions.[34] This distributed model ensures equitable global access and leverages local expertise to sustain the archive's growth and reliability.[6]
Contents
Types of Structures
The Protein Data Bank (PDB) archives a diverse array of biological macromolecules and their complexes, categorized primarily by their molecular composition. Protein structures form the core of the archive, encompassing single-chain polypeptides as well as multi-chain assemblies, such as enzymes, receptors, and structural proteins. Nucleic acids, including DNA and RNA molecules, are also prominently represented, often as standalone entities or in hybrid forms like DNA-RNA complexes. Protein-nucleic acid complexes, such as transcription factors bound to DNA or ribosomes with mRNA and tRNA, highlight functional interactions central to cellular processes. Additionally, the PDB includes lipids, typically as components of membrane protein structures, carbohydrates in the form of linear or branched oligosaccharides (e.g., glycans attached to proteins), and small molecules like ligands, cofactors, and solvent molecules (e.g., water or ions) that modulate biological activity.[51][52]
Structures in the PDB are further classified by the experimental methods used for their determination, reflecting advances in biophysical techniques. X-ray crystallography dominates the archive, accounting for the vast majority of entries (approximately 81% as of late 2025), and provides high-resolution models (often better than 2 Å) of rigid, crystalline macromolecules, enabling detailed atomic visualization. Nuclear magnetic resonance (NMR) spectroscopy contributes solution-state structures, particularly for smaller, flexible proteins and domains (about 6% of entries), yielding ensembles that capture dynamic conformations in near-physiological conditions. Cryo-electron microscopy (cryo-EM) has seen rapid growth, representing around 12% of the archive by late 2025, and excels at resolving large macromolecular complexes and assemblies (e.g., viruses or ribosomes) at resolutions approaching atomic levels without requiring crystallization. Other methods, such as fiber diffraction or electron diffraction, are less common but included for specialized cases like fibrous proteins.[52][53][54][55]
The PDB features special collections that emphasize biologically significant or therapeutically relevant structures. Viral structures, including entire virions or key viral proteins (e.g., envelope glycoproteins), support virology and vaccine design. Membrane proteins, often embedded in lipid bilayers, are curated with resources like the Orientations of Proteins in Membranes (OPM) database to aid studies of transport and signaling. Antibody structures, frequently in complex with antigens, facilitate immunotherapy research. Since 2020, there has been a marked increase in disease-related entries, particularly for SARS-CoV-2, with thousands of structures deposited to accelerate antiviral drug discovery and understanding of viral-host interactions (e.g., spike protein-ACE2 complexes).[56][12][57]
Notably, the PDB focuses exclusively on atomic coordinate models derived from experimental data and does not store raw experimental data, such as electron microscopy density maps or diffraction images, which are archived in complementary repositories like the Electron Microscopy Data Bank (EMDB) or software grids like SBGrid.[58]
Statistics and Growth
As of November 2025, the Protein Data Bank (PDB) archive contains 245,074 entries.[59] These structures are primarily determined by X-ray crystallography (199,418 entries), followed by electron microscopy (cryo-EM; 30,264 entries) and nuclear magnetic resonance (NMR; 14,632 entries), with the remainder (760 entries) from other or hybrid methods.[59]
The PDB has exhibited exponential growth since its inception, starting with just 7 entries in 1971.[28] Key milestones include reaching 100,000 entries in 2014, 150,000 in 2019, and surpassing 200,000 by early 2023.[28] Post-2020, annual additions have averaged 14,000 to 15,000 structures, driven by advances in automation and high-throughput methods such as cryo-EM.[4]
Notable trends include the rapid rise in cryo-EM structures, which constituted less than 1% of the archive in 2010 but grew to approximately 12% by 2025.[60][55] Multi-method structures, combining techniques like X-ray and cryo-EM, have also increased modestly, numbering around 200 to 300 new entries annually in recent years.[61] Geographically, depositions remain dominated by the United States and Europe, though Asia's contributions have grown steadily since the early 2000s.[62]
The PDB undergoes weekly updates, typically on Wednesdays at 00:00 UTC, incorporating new releases, revisions, and status changes.[63] The core archive's file size exceeds 1.4 terabytes as of 2024, encompassing coordinate files, validation reports, and associated data.[64]
Data Submission and Validation
Deposition Process
The deposition process for structures in the Protein Data Bank (PDB) begins with researchers preparing their data through the wwPDB OneDep system, a unified online portal launched in 2014 that streamlines submissions across all wwPDB partner sites.[65] This platform supports deposition of atomic coordinates and associated experimental data for structures determined by various methods, including X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, cryo-electron microscopy (cryo-EM), neutron diffraction, and combinations thereof, while accommodating multiple file formats such as PDBx/mmCIF.[66][67] Depositors initiate a session by providing an email address and selecting the principal investigator's institution, receiving a session ID and password to manage uploads securely; sessions remain active for up to three months if not submitted.[68]
Key preparation steps involve generating atomic coordinate files using refinement software like PHENIX or REFMAC5, ensuring inclusion of all residues (including expression tags and disordered segments), and verifying sequences against databases such as UniProt.[67] Metadata, including experimental details like diffraction data or electron density maps, must be compiled alongside ligand information validated against the Chemical Component Dictionary.[67] Local validation is recommended using the wwPDB Validation Server to identify issues early, supplemented by tools like PDB-REDO for model optimization and rebuilding, which helps improve geometry and reduce processing delays.[69][70]
Once prepared, files are uploaded via the OneDep interface to a preferred regional data center (e.g., RCSB PDB in the US, PDBe in Europe, or PDBj in Japan), where wwPDB biocurators perform annotation, including standardization of nomenclature and integration of validation reports; this step typically takes 1-2 weeks, though simpler entries may process in hours.[65][71] Depositions require comprehensive experimental metadata, such as method-specific parameters and resolution metrics. Cryo-EM structures additionally require deposition of map volumes and must meet quality standards assessed through metrics like half-map resolutions and Fourier Shell Correlation (FSC) curves, with acceptance depending on justification and validation outcomes.[72][73] Submission is mandatory for publications in major journals like Nature and Science, which require inclusion of the assigned PDB ID and wwPDB validation reports in manuscripts.[74]
Following annotation, depositors review the processed files and select a release status before final submission: immediate (REL), hold until publication (HPUB, up to one year), or private hold (HOLD, up to one year).[67] In response to the COVID-19 pandemic, post-2020 procedures included expedited annotation and release for SARS-CoV-2-related structures to accelerate research, resulting in over 1,500 such entries processed rapidly during 2020-2021.[75] Additionally, the OneDep system has integrated support for computed structure models (CSMs), allowing joint deposition of predicted models (e.g., from AlphaFold) alongside experimental data to enhance archive completeness.[15]
Validation and Quality Control
The wwPDB employs a validation pipeline that integrates automated computational checks with expert biocurator review to assess the quality of submitted structures. Automated tools, such as MolProbity for evaluating backbone and side-chain geometry including Ramachandran plot outliers and steric clashes, WHAT IF for overall structure quality metrics, and Mogul for ligand geometry against the Cambridge Structural Database, process model coordinates and experimental data.[76][77] Biocurators then perform manual inspections to verify compliance with community standards recommended by wwPDB Validation Task Forces.[77] This pipeline culminates in the generation of wwPDB Validation Reports (detailed PDF and XML documents) for each entry, provided to depositors during processing and released publicly alongside the structure.[78]
Validation criteria are tailored to the experimental technique and focus on key indicators of reliability. For X-ray crystallography, essential metrics include resolution (e.g., high-quality structures typically at ≤1.8 Å) and R-free values to gauge model refinement and overfitting.[79] Cryo-EM structures are scrutinized for map quality, encompassing half-map resolutions and Fourier Shell Correlation (FSC) curves to confirm reported resolutions.[80] Across methods, chemical accuracy is rigorously checked, including bond lengths, angles, torsion angles, and chirality to ensure consistency with known chemical principles.[77]
Upon validation, structures with major discrepancies, such as poor ligand fit or unresolved geometric errors, may face rejection or require depositor revisions, though the wwPDB accepts nearly all submissions meeting minimum criteria after corrections.[72] Public reports transparently flag outliers, like non-favored Ramachandran angles or clash scores exceeding thresholds, enabling users to evaluate entry quality.[78]
From 2020 to 2025, enhancements have strengthened validation for emerging data types. Cryo-EM reports now include advanced map-model fit assessments and FSC-based resolution percentiles for better comparability.[81] Ligand and carbohydrate validation improved via 2D geometric diagrams, 3D electron density fits (e.g., using RSCC and RSR metrics), and outcomes from the 2016 Ligand Validation Workshop and 2024 EMDataResource Ligand Model Challenge.[82][83] In 2025, wwPDB introduced the 3DEM Model-Map percentile slider and Q-score slider to validation reports, aiding in the assessment of local resolution and atom resolvability in cryo-EM maps.[84] Integration with the ModelCIF format, developed by the wwPDB ModelCIF Working Group, facilitates validation and archiving of computed structure models, extending PDB standards to predictive ensembles.[85]
File Formats
Legacy PDB Format
The legacy Protein Data Bank (PDB) format is a plain-text file structure designed for storing atomic coordinates and associated metadata of biological macromolecules, utilizing fixed-width columns across 80-character lines to ensure compatibility with early computing systems like punched cards. Introduced in 1976 as an evolution of earlier formats, it provided a standardized way to archive three-dimensional structures determined primarily by X-ray crystallography and NMR spectroscopy. This format served as the primary medium for data exchange in the PDB archive from its introduction until support was frozen in 2012. New depositions of crystallographic structures have since transitioned to the more flexible PDBx/mmCIF format, while the legacy PDB format remains accepted for NMR and cryo-EM depositions, and legacy files remain accessible for historical entries.[86][87][88][89]
Key records in the format include the HEADER record, which captures essential metadata such as the structure's classification, deposition date, and unique four-character PDB ID code in fixed positions (columns 1-6 for the record name, 11-50 for classification, 51-59 for the initial date, and 63-66 for the ID code). The ATOM records describe coordinates for atoms in standard amino acid or nucleotide residues, while HETATM records handle non-standard atoms or ligands, following nearly identical layouts to ATOM but with the record name changed to "HETATM" in columns 1-6. These coordinate records form the core of the file, listing atomic positions in orthogonal coordinates (X, Y, Z in Ångströms), along with attributes like occupancy and temperature factors.[90][91]
The format's rigid structure imposes several limitations, including the 80-character line constraint, which restricts detailed annotations and leads to truncated or abbreviated data in complex entries. Atom serial numbers are confined to five digits (columns 7-11), limiting files to a maximum of 99,999 atoms and precluding representation of very large biomolecular assemblies without splitting into multiple files. Additionally, nomenclature ambiguities arise from fixed column alignments: atom names occupy four characters (columns 13-16), and only their padding within those columns distinguishes otherwise identical labels (e.g., "CA" for an alpha-carbon versus "CA" for a calcium ion). Alternate conformations and insertion codes are also handled inconsistently.[91][90]
A representative example is the ATOM record, which adheres to the following column specifications:

| Columns | Field Name | Description | Format/Example |
|---|---|---|---|
| 1-6 | Record name | Fixed as "ATOM " | ATOM |
| 7-11 | Serial number | Atom index (integer, right-justified) | 1 |
| 13-16 | Atom name | Name of the atom (element symbol right-justified in columns 13-14) | N (with spaces if <4 chars) |
| 17 | Alternate location | Indicator for conformers (blank or A-Z) | (blank) |
| 18-20 | Residue name | Residue code of up to three characters (right-justified) | ASP |
| 22 | Chain ID | Single character identifier | A |
| 23-26 | Residue sequence | Sequence number (integer) | 23 |
| 27 | Insertion code | For residue insertions (A-Z or blank) | (blank) |
| 31-38 | X coordinate | Orthogonal X in Ångströms (real) | 28.000 |
| 39-46 | Y coordinate | Orthogonal Y in Ångströms (real) | 33.000 |
| 47-54 | Z coordinate | Orthogonal Z in Ångströms (real) | 45.000 |
| 55-60 | Occupancy | Fractional occupancy of the atom at this position (real) | 1.00 |
| 61-66 | Temperature factor | B-factor (real) | 15.00 |
| 77-78 | Element symbol | Chemical element (right-justified) | N |
| 79-80 | Charge | Atom charge (integer or blank) | (blank) |
Written out as a single record, these values read ATOM 1 N ASP A 23 28.000 33.000 45.000 1.00 15.00 N, with each field space-padded to the column positions listed above so that the line remains parseable despite the format's constraints.[91][90]
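The column layout in the table above lends itself to simple slice-based parsing. The following Python sketch is illustrative only: `AtomRecord`, `parse_atom_line`, and `EXAMPLE_FIELDS` are hypothetical names introduced here and are not part of any wwPDB tool. The exact padding of the atom name inside columns 13-16 is an assumption; the parser strips whitespace, so it does not depend on that choice.

```python
from dataclasses import dataclass


@dataclass
class AtomRecord:
    """Fields of one ATOM/HETATM line (hypothetical container for this sketch)."""
    record_name: str
    serial: int
    atom_name: str
    alt_loc: str
    res_name: str
    chain_id: str
    res_seq: int
    i_code: str
    x: float
    y: float
    z: float
    occupancy: float
    temp_factor: float
    element: str
    charge: str


def parse_atom_line(line: str) -> AtomRecord:
    """Slice one line using the 1-based, inclusive column ranges from the table."""
    padded = line.rstrip("\r\n").ljust(80)  # pad short lines so every slice exists

    def col(first: int, last: int) -> str:
        # Columns are 1-based and inclusive; Python slices are 0-based, end-exclusive.
        return padded[first - 1:last].strip()

    record_name = col(1, 6)
    if record_name not in ("ATOM", "HETATM"):
        raise ValueError(f"not a coordinate record: {record_name!r}")

    return AtomRecord(
        record_name=record_name,
        serial=int(col(7, 11)),
        atom_name=col(13, 16),
        alt_loc=col(17, 17),
        res_name=col(18, 20),
        chain_id=col(22, 22),
        res_seq=int(col(23, 26)),
        i_code=col(27, 27),
        x=float(col(31, 38)),
        y=float(col(39, 46)),
        z=float(col(47, 54)),
        occupancy=float(col(55, 60)),
        temp_factor=float(col(61, 66)),
        element=col(77, 78),
        charge=col(79, 80),
    )


# The example row from the table, built field by field so the widths are explicit.
EXAMPLE_FIELDS = [
    "ATOM  ",    # 1-6   record name
    "    1",     # 7-11  serial number
    " ",         # 12    unused
    " N  ",      # 13-16 atom name (padding here is an assumption)
    " ",         # 17    alternate location
    "ASP",       # 18-20 residue name
    " ",         # 21    unused
    "A",         # 22    chain ID
    "  23",      # 23-26 residue sequence number
    " ",         # 27    insertion code
    " " * 3,     # 28-30 unused
    "  28.000",  # 31-38 X coordinate
    "  33.000",  # 39-46 Y coordinate
    "  45.000",  # 47-54 Z coordinate
    "  1.00",    # 55-60 occupancy
    " 15.00",    # 61-66 temperature factor
    " " * 10,    # 67-76 unused
    " N",        # 77-78 element symbol
    "  ",        # 79-80 charge
]
EXAMPLE_LINE = "".join(EXAMPLE_FIELDS)

record = parse_atom_line(EXAMPLE_LINE)
print(record)
```

Slicing by position rather than splitting on whitespace is deliberate: adjacent fields in real files can run together without a separating space (for instance, a five-digit serial number or a wide coordinate), so whitespace splitting can misassign values. For production use, a maintained parser such as Biopython's `Bio.PDB.PDBParser` is likely a better choice than this sketch.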