An accession number is a unique identifier assigned to an item, record, or acquisition in an institutional collection or database, typically reflecting the order in which it was received or entered. The numbering system supports tracking, cataloging, and retrieval, and serves as a foundational inventory tool in libraries, museums, archives, and scientific repositories.[1][2]
In cultural and archival institutions, accession numbers are essential for establishing legal and physical control over newly acquired materials. In archives such as the U.S. National Archives and Records Administration (NARA), for instance, the number uniquely identifies a group of records transferred into the institution's custody and is often formatted as the year of accession followed by a sequential identifier.[1] In libraries, it denotes the order in which an item was added to the collection, aiding shelving and reference processes.[3] Museums and galleries similarly use these numbers to document artifacts or artworks from the moment of intake, supporting provenance records and preventing loss.
In scientific and bibliographic databases, accession numbers provide stable, permanent references for data entries. The National Center for Biotechnology Information (NCBI) assigns them to nucleotide and protein sequences in repositories like GenBank, using an alphabetical prefix followed by numerals so that each record stays uniquely and persistently identifiable even as its annotations evolve.[2] In broader database contexts, such as those for scholarly articles, the number acts as a supplemental locator alongside DOIs or ISBNs, assigned sequentially upon ingestion to enable precise searches and citations.[4] Specialized applications, like cancer registries under the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) program, employ patient-specific accession numbers to track initial encounters and ensure data integrity in epidemiological studies.[5]
Cultural and Archival Contexts
In Museums and Galleries
In museums and galleries, an accession number is a unique alphanumeric identifier assigned to physical artifacts, artworks, sculptures, paintings, and historical items upon their formal acquisition, enabling precise tracking of ownership, provenance, location, and custodial history. The number formalizes the object's entry into the permanent collection, distinguishing it from temporary loans or unaccessioned items, and supports compliance with ethical standards for cultural heritage preservation.[6]
The acquisition process typically begins with receipt of the object and a preliminary assessment of its condition, authenticity, and fit with the institution's collection policy. A detailed inspection follows, including photographic documentation, measurements, material analysis, and due diligence on legal provenance to confirm lawful ownership and the absence of export or import restrictions. Once the acquisition is approved, the accession number is assigned and recorded in the museum's register or database. The number often combines a year prefix (e.g., 2025 for the acquisition year), a departmental code, and a sequential identifier, such as 2025.14.3 to denote the third object in department 14 that year. For groups of related items acquired together, a single base number may be extended with modifiers (e.g., 2019.9.1.1 through 2019.9.1.5). The International Council of Museums (ICOM) calls for this process to include multidisciplinary review, covering curatorial, legal, and conservation perspectives, to uphold public trust and prevent conflicts of interest.[6][7][8]
Standards for accessioning are guided by ICOM's Code of Ethics, which emphasizes comprehensive documentation, including object descriptions, donor details, and risk assessments, to facilitate accessibility for research, exhibitions, and public engagement. The British Museum uses a comma-separated format, such as 2020,0101.1, in which the year and sequential accession number precede a period and the specific object identifier, aiding database searches and inventory management across its collection of over 8 million items. Similarly, the Louvre employs inventory numbers prefixed by letters denoting categories (e.g., "AO" for Antiquités orientales), ensuring unique identification within departmental registers. These conventions promote interoperability with international databases and adherence to anti-trafficking protocols, such as those from Interpol.[6][9][10]
Accession numbers play a vital role in conservation by linking objects to maintenance records, allowing restorers to reference historical treatments and monitor deterioration over time. For loans to temporary exhibitions, the numbers facilitate condition reporting, insurance valuation, and secure return protocols, minimizing risks during transit. In legal contexts, they support provenance claims and repatriation disputes, as seen in cases involving looted artifacts, by providing a durable audit trail. The same tracking underpins insurance policies, where numbers correlate to appraised values and coverage details.[6][11]
The use of accession numbers evolved in the 19th century amid the professionalization of museums, as institutions sought systematic methods to catalog growing collections and avert losses from disorganized storage or duplicated records. Early examples include the Smithsonian Institution's numbered acknowledgments, introduced in 1885, and retrospective registers compiled for collections like those at Saffron Walden Museum dating back to the 1840s. By the late 1800s, such practices had become standard in major European and American museums to enhance accountability and scholarly access.[12][13][14]
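The dotted formats described above (for example 2025.14.3, or 2019.9.1.1 through 2019.9.1.5) do not sort correctly as plain strings, because "10" sorts before "2". The following is a minimal sketch, assuming a four-digit year followed by one or more dot-separated numeric components; the names `MuseumAccession`, `parse_museum_accession`, and `sort_key` are illustrative and do not come from any museum system, and real institutions vary in what each component means.

```python
import re
from typing import NamedTuple, Tuple

# Assumed pattern: a four-digit year followed by one or more dot-separated
# numeric components, e.g. "2025.14.3" or "2019.9.1.5".
_ACCESSION_RE = re.compile(r"^(\d{4})((?:\.\d+)+)$")


class MuseumAccession(NamedTuple):
    year: int
    parts: Tuple[int, ...]  # components after the year, in order


def parse_museum_accession(text: str) -> MuseumAccession:
    """Split a dotted accession number into its year and numeric components."""
    match = _ACCESSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a dotted accession number: {text!r}")
    year = int(match.group(1))
    # group(2) looks like ".14.3"; drop the empty string before the first dot.
    parts = tuple(int(p) for p in match.group(2).split(".")[1:])
    return MuseumAccession(year, parts)


def sort_key(text: str) -> Tuple[int, Tuple[int, ...]]:
    """Key for numeric ordering, so 2019.9.1.2 sorts before 2019.9.1.10."""
    acc = parse_museum_accession(text)
    return (acc.year, acc.parts)
```

Passing `sort_key` to `sorted` orders a register numerically rather than lexically. Formats such as the British Museum's comma-separated style would need their own pattern.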
In Libraries and Archives
In libraries and archives, an accession number is a unique, sequential identifier assigned to books, documents, and other textual or documentary materials upon their receipt and formal addition to the collection, establishing initial physical and legal control over the items. The number typically reflects the order of acquisition and is often formatted as the year followed by a sequence, such as 2025.04567, indicating both the temporal context and the item's position in the stream of incoming materials. The practice ensures traceability from the moment of transfer, facilitating inventory management, processing, and future reference without relying solely on bibliographic details.[15]
Accession numbers integrate closely with cataloging systems to link items to broader metadata, call numbers, and descriptive records. In MARC 21, accession numbers can be recorded in field 541 subfield $e (immediate source of acquisition) or in locally defined fields (e.g., 090 or 099) to connect with bibliographic entries, enabling searches and shelving arrangements.[16] Integrated Library Systems (ILS) such as Koha and Alma embed this functionality as well: Koha allows entry of accession numbers during cataloging to track items without barcodes, while Alma supports automated generation of accession numbers in holdings records for resource identification and workflow automation.[17][18] This linkage supports efficient circulation, interlibrary loans, and digitization efforts by associating the accession number with detailed provenance and condition notes.
In archival contexts, accession numbers play a critical role in fonds-level control, where a fonds is the entire body of records created or accumulated by a single entity, such as an organization or family, maintained in its original order per the principle of respect des fonds.[19][20] Numbers are assigned to entire accessions (units of materials transferred at one time) to organize record groups within a fonds, aiding arrangement, description, and preservation planning. Deaccessioning, the formal removal of items from the collection due to duplication, poor condition, or misalignment with institutional scope, requires updating accession records, control files, and finding aids to document the disposition and maintain intellectual control.[21]
National libraries exemplify these practices on a large scale. At the Library of Congress, accession numbers are used for special collections and uncataloged materials, such as rare books and manuscripts, to support inventory during processing and digitization projects, complementing LC classification for over 800,000 items in the Rare Book and Special Collections Division.[22] Similarly, the British Library employs accession registers to document transfers of archival materials, ensuring legal custody and enabling access to its holdings through integrated metadata systems that support research and preservation.[23]
Challenges arise in managing duplicates or transfers between institutions, where conflicting numbers can disrupt continuity. Protocols often involve creating duplicate accession records with modified identifiers, or renumbering, so that tracking stays unique while the transfer history is documented.[24] These procedures, guided by professional standards, help mitigate errors in multi-institutional collaborations or mergers and preserve the integrity of collection control.[21]
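The transfer scenario above, where incoming numbers collide with numbers already in use, can be sketched as a small renumbering step. This is an illustration under stated assumptions, not a documented archival protocol: the `-T1`, `-T2` suffix convention and the function name `merge_transferred` are hypothetical, and incoming numbers are assumed to be unique among themselves.

```python
from typing import Dict, Iterable


def merge_transferred(existing: Iterable[str],
                      incoming: Iterable[str],
                      suffix: str = "T") -> Dict[str, str]:
    """Map each incoming accession number to the identifier it will use locally.

    Numbers that do not collide are kept unchanged. Colliding numbers get a
    hypothetical "-T<n>" modifier so both records stay uniquely addressable,
    and the returned mapping documents the original number for the transfer
    history.
    """
    taken = set(existing)
    mapping: Dict[str, str] = {}
    for number in incoming:
        candidate = number
        counter = 0
        # Keep trying modified identifiers until one is free.
        while candidate in taken:
            counter += 1
            candidate = f"{number}-{suffix}{counter}"
        taken.add(candidate)
        mapping[number] = candidate
    return mapping
```

Returning the full mapping, rather than only the new numbers, keeps the original identifier available for finding aids and control files.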
Bioinformatics and Genomics
Sequence Identification in Databases
In bioinformatics and genomics, an accession number is a permanent, unique alphanumeric identifier assigned to nucleotide (DNA or RNA) and protein sequences upon their submission to public repositories. These identifiers ensure that sequences can be reliably referenced, retrieved, and linked across databases, regardless of updates or revisions to the underlying data. For instance, the accession number NM_000123.4 denotes a curated reference mRNA sequence for a human gene in NCBI's RefSeq collection, where the ".4" indicates the version of the sequence record.[2][25]
Accession numbers for primary submissions are assigned through the International Nucleotide Sequence Database Collaboration (INSDC), a partnership among GenBank (hosted by the National Center for Biotechnology Information, NCBI), the European Nucleotide Archive (ENA, hosted by the European Bioinformatics Institute), and the DNA Data Bank of Japan (DDBJ). Submission is typically required for publication in scientific journals. The receiving database assigns the identifier after validating the sequence data, including details such as the locus name, source organism, and sequence length. Daily data exchanges among the partners keep records identical worldwide. Submitters receive the accession number as confirmation of acceptance, and it must be cited in publications to enable verification and access.[26][27][28]
Accession numbers follow standardized formats, typically a short alphabetical prefix followed by a string of digits, in which the prefix encodes information about the sequence type and origin. RefSeq records use a two-letter prefix and an underscore, such as NC_ for complete genomic molecules (e.g., NC_000001 for human chromosome 1) and NP_ for protein sequences (e.g., NP_000101 for a human protein). INSDC records use prefixes without an underscore, such as U for direct nucleotide submissions (e.g., U00096.3 for the complete genome of Escherichia coli strain K-12). The prefixes originated from the need to distinguish submission sources and record types, with expansions in 2018 to longer formats (e.g., six letters plus numbers) to accommodate growing data volumes. The core accession remains fixed for stability, while a version suffix (e.g., .3) increments when the sequence itself changes.[25][29][30]
These identifiers play a crucial role in sequence retrieval and interoperability, particularly through systems like NCBI's Entrez, which allows users to search and fetch records using the accession number as a primary key. This enables cross-referencing between nucleotide and protein databases, as well as integration with tools for alignment, annotation, and phylogenetic analysis. In publications, accession numbers support reproducibility by providing direct links to the exact sequence data analyzed.[2][31][32]
Accession numbers were introduced in the early 1980s alongside the founding of GenBank in 1982 at Los Alamos National Laboratory, marking the beginning of organized public nucleotide sequence archiving. Initially managing just thousands of sequences, the system has expanded dramatically through INSDC coordination, reaching over 4.7 billion nucleotide sequences encompassing 34 trillion base pairs by 2025, reflecting the explosion in genomic data from high-throughput sequencing technologies.[33][34][35]
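To make the prefix-plus-digits structure concrete, here is a minimal parsing sketch. It assumes only two families of identifier: RefSeq-style accessions (two letters, an underscore, digits) and the classic INSDC nucleotide layouts (one letter plus five digits, or two letters plus six digits). The patterns are deliberately incomplete, since protein, whole-genome shotgun, and the newer extended formats are not covered, and the names `SequenceAccession` and `parse_sequence_accession` are illustrative.

```python
import re
from typing import NamedTuple, Optional

# RefSeq-style: two letters, underscore, digits, optional ".version".
_REFSEQ_RE = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")
# Classic INSDC nucleotide: 1 letter + 5 digits, or 2 letters + 6 digits.
_INSDC_RE = re.compile(r"^([A-Z]\d{5}|[A-Z]{2}\d{6})(?:\.(\d+))?$")


class SequenceAccession(NamedTuple):
    accession: str          # stable part, without the version suffix
    version: Optional[int]  # None when no ".N" suffix was given
    style: str              # "refseq" or "insdc"


def parse_sequence_accession(text: str) -> SequenceAccession:
    """Split an identifier such as NM_000123.4 into accession and version."""
    value = text.strip().upper()
    match = _REFSEQ_RE.match(value)
    if match:
        version = int(match.group(3)) if match.group(3) else None
        return SequenceAccession(f"{match.group(1)}_{match.group(2)}", version, "refseq")
    match = _INSDC_RE.match(value)
    if match:
        version = int(match.group(2)) if match.group(2) else None
        return SequenceAccession(match.group(1), version, "insdc")
    raise ValueError(f"unrecognized accession format: {text!r}")
```

Keeping the version as a separate, optional field mirrors how the stable accession and the version suffix are used differently when citing or retrieving a record.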
Versioning and Updates
In bioinformatics databases such as GenBank, the accession number is a stable, unique identifier for a biological sequence record and remains unchanged throughout the record's lifecycle to ensure long-term referenceability.[36] To accommodate revisions to the underlying sequence data, such as corrections to base calls, a versioning system appends a dot followed by an incremental number (e.g., AF123456.2). The suffix begins at .1 upon initial release and increases by 1 with each change to the sequence.[36] Complementing this, the GenInfo Identifier (GI) is a numeric ID assigned to each specific sequence version. Unlike the accession, the GI changes whenever the sequence changes, even by a single base, which allows precise tracking within database tools.[36]
Updates to sequence records are initiated by submitters, who notify the database administrators (e.g., by email to GenBank staff) with the relevant accession number and details of the proposed corrections.[37] Upon validation, a new version is released that incorporates the changes while preserving the original sequence and its associated GI for historical reference. Revision histories, accessible via tools like the GenBank sequence revision history report, document these updates with timestamps, prior GIs, and changelog summaries in a COMMENT field.[38]
This versioning approach supports reproducibility in bioinformatics research by allowing scientists to reference and retrieve the exact sequence states used in analyses, such as multiple sequence alignments or phylogenetic tree constructions, thereby minimizing discrepancies due to post-publication modifications.[39] Search tools like BLAST accept accession.version inputs to query a specific iteration rather than the latest, ensuring alignments reflect the intended data.[40]
A prominent example is the reference genome for SARS-CoV-2 (accession NC_045512), initially released as version .1 in early 2020 and updated to .2 by July 2020 to incorporate refinements during the ongoing pandemic, with subsequent tracking via GI 1798174254 to maintain research continuity amid rapid viral sequencing efforts.
Deprecation of accession numbers is exceedingly rare under International Nucleotide Sequence Database Collaboration (INSDC) policies, occurring only for egregious errors like contamination or invalid submissions. In such cases, records are suppressed from general searches but remain retrievable by accession to preserve scientific integrity. INSDC coordination favors modifying existing accessions over assigning new ones, which prevents data fragmentation across the member databases (GenBank, ENA, DDBJ).[41]
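The practical consequence of accession.version identifiers is that a pinned version and an unpinned accession are different requests. The sketch below, offered under stated assumptions, separates the two parts and builds a retrieval URL for NCBI's E-utilities `efetch` endpoint with the `nuccore` database; it only constructs the URL and performs no network call. The helper names are illustrative, and usage policies such as rate limits and API keys are out of scope.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def split_version(identifier: str) -> Tuple[str, Optional[int]]:
    """Return (accession, version); version is None if no suffix is present."""
    accession, dot, version = identifier.partition(".")
    if not dot:
        return accession, None
    return accession, int(version)


def efetch_fasta_url(identifier: str) -> str:
    """Build an efetch URL that requests the record as FASTA text."""
    query = urlencode({
        "db": "nuccore",
        "id": identifier,  # pass accession.version to pin an exact version
        "rettype": "fasta",
        "retmode": "text",
    })
    return f"{EFETCH_URL}?{query}"


def same_record_different_version(a: str, b: str) -> bool:
    """True when two identifiers share an accession but differ in version."""
    acc_a, ver_a = split_version(a)
    acc_b, ver_b = split_version(b)
    return acc_a == acc_b and ver_a != ver_b
```

For reproducible analyses, recording the full `accession.version` string (for example NC_045512.2) alongside results avoids silently picking up a later revision.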
Healthcare Contexts
Patient Registration and Records
In specialized healthcare contexts, such as cancer registries and pathology laboratories, an accession number is a unique identifier assigned to a patient's reportable case or specimen, facilitating the organization and retrieval of clinical records. The number, often formatted as the year followed by a sequential identifier (e.g., 2025-78901), is generated during the reporting or intake process to link essential patient information, such as demographics and medical history, within registry or laboratory information systems.[5][42] Unlike the lifelong Medical Record Number (MRN), which identifies the patient across all interactions, the accession number tracks individual cases or specimens, such as in cancer reporting, and does not replace the MRN.[42]
The process begins at case reporting or specimen receipt, where staff verify details and enter data into the system, which assigns the accession number automatically to ensure accurate record linkage. The identifier supports data flow for documentation and reporting to national registries. For instance, the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) program relies on accession numbers to uniquely identify patients at reporting facilities for epidemiological tracking and research. The numbers also aid billing and longitudinal studies by providing a standardized reference for cases. As of November 2025, AI-driven algorithms assist in patient record deduplication within electronic health record (EHR) systems, analyzing patterns to merge or flag duplicates and reduce errors, though primarily at the patient (MRN) level rather than per accession.[43]
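The relationship described above, with one lifelong MRN per patient and one accession number per case or specimen, is a one-to-many link. The following is a minimal in-memory sketch of that relationship, assuming accession numbers are unique across the whole index; the class names `PatientRecord` and `CaseIndex` and the sample values are hypothetical and are not drawn from SEER or any registry software.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PatientRecord:
    mrn: str                                              # lifelong patient identifier
    accessions: List[str] = field(default_factory=list)  # one per case or specimen


class CaseIndex:
    """Links many accession numbers to one MRN without replacing the MRN."""

    def __init__(self) -> None:
        self._patients: Dict[str, PatientRecord] = {}
        self._owner_by_accession: Dict[str, str] = {}

    def add_case(self, mrn: str, accession: str) -> None:
        # An accession number must never point at two different cases.
        if accession in self._owner_by_accession:
            raise ValueError(f"accession already registered: {accession}")
        patient = self._patients.setdefault(mrn, PatientRecord(mrn))
        patient.accessions.append(accession)
        self._owner_by_accession[accession] = mrn

    def mrn_for(self, accession: str) -> str:
        return self._owner_by_accession[accession]

    def cases_for(self, mrn: str) -> List[str]:
        return list(self._patients[mrn].accessions)
```

Rejecting a reused accession number at insert time is the design choice that keeps record linkage unambiguous; deduplication at the MRN level is a separate concern.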
Diagnostic Imaging and Orders
In diagnostic imaging, the accession number is a unique identifier for an imaging service request, such as an MRI or CT scan, typically generated by a Radiology Information System (RIS) or Picture Archiving and Communication System (PACS). It may, for instance, take a format like ACC-2025-IMG-456 to denote the year, modality, and sequential order. The number is assigned at order entry to link the requisition directly to the patient's study and to ensure traceability throughout the imaging process.[44][45]
The workflow begins when the RIS creates the accession number upon order placement, often communicated via HL7 order messages (e.g., ORM^O01 for new orders, updates, or cancellations). The number is then propagated to the DICOM Modality Worklist (MWL), where imaging devices retrieve it to populate study headers, avoiding manual data entry errors. From requisition through image acquisition, reporting, and storage in PACS, the accession number tracks the entire chain, and DICOM embeds it in image metadata for interoperability across systems. HL7 supports broader integration by carrying the accession number in fields such as OBR-3 (the filler order number in the OBR segment), enabling communication between RIS, PACS, and other healthcare IT components.[46][47][48]
This identifier is crucial in high-volume radiology departments to prevent mix-ups between patients and orders and to maintain the integrity of diagnostic connections. According to Society for Imaging Informatics in Medicine (SIIM) guidelines, accession numbers are essential for correlating patient information, order details, and scheduling data, particularly when a patient has multiple exams. In busy settings, such as emergency departments, this linkage reduces errors in study association and supports accurate billing and quality assurance.[44]
Accession numbers also tie imaging results back to Electronic Medical Records (EMR) systems: HL7 result messages (e.g., ORU) deliver reports and DICOM carries the images, so reports are attached to the correct patient record. As of 2025, advancements in AI have enhanced this integration by automating workflow elements, such as order reconciliation and preliminary accession assignment in RIS, improving efficiency in PACS-EMR handoffs.[49][50][51]
Challenges arise in handling amendments or cancellations, where updates must preserve the original chain without duplicating or orphaning studies. HL7 supports this through targeted messages keyed to the accession number, but discrepancies in system implementations can disrupt interoperability, requiring robust reconciliation tools in PACS to maintain workflow continuity.[47][52]
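As an illustration of where the accession number sits in an HL7 v2 order message, the sketch below pulls OBR-3 out of a pipe-delimited message. It assumes the default HL7 v2 delimiters (`|` between fields, `^` between components, carriage returns between segments) and that the site places the accession number in OBR-3, as described above; some implementations use a different field. The sample message and the function name `extract_obr3` are fabricated for the example.

```python
from typing import Optional

# Fabricated sample message for illustration only; all values are made up.
SAMPLE_ORM = "\r".join([
    "MSH|^~\\&|RIS|HOSP|PACS|HOSP|20250101120000||ORM^O01|MSG0001|P|2.3",
    "PID|1||MRN12345||DOE^JANE",
    "ORC|NW|ORD789|ACC-2025-IMG-456",
    "OBR|1|ORD789|ACC-2025-IMG-456|CTCHEST^CT Chest",
])


def extract_obr3(message: str) -> Optional[str]:
    """Return the first component of OBR-3, or None if it is absent."""
    # Tolerate newline-separated messages as well as the standard "\r".
    for segment in message.replace("\n", "\r").split("\r"):
        fields = segment.split("|")
        # For non-MSH segments, fields[n] corresponds to field n (OBR-3 -> fields[3]).
        if fields[0] == "OBR" and len(fields) > 3 and fields[3]:
            return fields[3].split("^")[0]
    return None
```

A production interface would normally rely on a full HL7 parser that honors the delimiters declared in MSH-1 and MSH-2 instead of assuming the defaults.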
Regulatory and Financial Systems
SEC EDGAR Filings
In the U.S. Securities and Exchange Commission's (SEC) Electronic Data Gathering, Analysis, and Retrieval (EDGAR) system, an accession number is a unique identifier for each electronic filing submission. The 20-character code consists of the 10-digit Central Index Key (CIK) of the submitting entity or agent, a hyphen, the two-digit acceptance year, another hyphen, and a six-digit sequence number (e.g., 0001628280-24-017503). EDGAR generates it automatically upon successful validation and acceptance of the submission, ensuring traceability for every document filed.[53][54]
The submission process begins with filers preparing documents according to the EDGAR Filer Manual, which outlines validation requirements for electronic formats. Once a submission is transmitted via EDGAR's online platforms, the system performs automated checks. If the submission is accepted, EDGAR assigns the accession number immediately, and filers receive it via confirmation email or the submission status page. This applies to various forms, including annual reports (10-K), quarterly reports (10-Q), and current reports (8-K), for public companies, mutual funds, and other regulated entities. Amendments to prior filings, such as corrections to a 10-K, are submitted as separate documents with their own new accession numbers and are denoted by an "/A" suffix in the form type (e.g., 10-K/A), allowing distinct tracking without altering the original filing's identifier.[55][56]
The primary purpose of the accession number is to enable public access, regulatory auditing, and targeted querying through the SEC's EDGAR search tools, where users can retrieve full filings by entering the number directly. It supports compliance monitoring by linking submissions to filer accounts and timestamps, while also aiding the dissemination of over 3,000 terabytes of data annually to investors and analysts. For instance, Tesla, Inc.'s 10-Q filing for the quarter ended March 31, 2024, was assigned accession number 0001628280-24-017503, submitted on April 24, 2024, and made publicly available shortly thereafter.[57][58][59]
Accession numbers were introduced alongside the EDGAR system's rollout in the early 1990s, with the operational version launching voluntarily in 1992 and electronic filing becoming mandatory by 1995 following phased implementation. By 2025, EDGAR processes approximately 4,700 filings per day, resulting in over 1 million accessions annually, reflecting the system's evolution into a cornerstone of financial transparency.[60][58]
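The 10-2-6 digit layout described above lends itself to a simple validation step before a number is stored or used in a query. The following sketch checks the format and splits it into its three parts; the field names in `EdgarAccession` are this example's own labels, not official SEC terminology, and the helper that strips hyphens is included only because a compact 18-digit form is convenient as a storage key.

```python
import re
from typing import NamedTuple

# Ten digits, hyphen, two-digit year, hyphen, six-digit sequence (20 characters).
_EDGAR_RE = re.compile(r"^(\d{10})-(\d{2})-(\d{6})$")


class EdgarAccession(NamedTuple):
    filer_id: str   # 10-digit CIK of the submitting entity or agent
    year: str       # two-digit acceptance year, kept as text ("24")
    sequence: int   # sequence number within that filer and year


def parse_edgar_accession(text: str) -> EdgarAccession:
    """Validate and split an accession number such as 0001628280-24-017503."""
    match = _EDGAR_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not an EDGAR accession number: {text!r}")
    return EdgarAccession(match.group(1), match.group(2), int(match.group(3)))


def without_dashes(text: str) -> str:
    """Return the 18-digit form with hyphens removed, after validation."""
    parse_edgar_accession(text)  # raises ValueError on malformed input
    return text.strip().replace("-", "")
```

The filer identifier is kept as a string so that its leading zeros survive; converting it to an integer would lose the fixed 10-digit width.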
Patent and Intellectual Property Databases
In patent and intellectual property (IP) databases, accession numbers are unique identifiers for tracking patent applications, granted patents, and related materials, enabling efficient retrieval and management of IP records. For instance, the Derwent World Patents Index (DWPI), a database covering patents from over 40 patent-issuing authorities, assigns a Primary Accession Number (PAN) to each basic patent family in a year-plus-serial format (e.g., 2024-A123456), which uniquely identifies the core invention and links all equivalent filings across jurisdictions.[61] Similarly, for biological materials integral to inventions, the United States Patent and Trademark Office (USPTO) requires deposits in recognized International Depositary Authorities (IDAs) under the Budapest Treaty. Repositories like the American Type Culture Collection (ATCC) issue accession numbers upon receipt, such as ATCC PTA-XXXXX, to certify the viability and availability of the material for patent examination and enforcement.[62][63]
These accession numbers are typically assigned immediately upon filing of a patent application or deposit of materials, ensuring traceability from the initial submission through to publication and grant. They often interconnect with other identifiers, such as national application or publication numbers (e.g., USPTO serial numbers or WO publication numbers for PCT applications), facilitating cross-referencing in global databases. The World Intellectual Property Organization (WIPO) Standard ST.9 establishes standardized codes for bibliographic data on patent documents, including fields for application numbers, priority dates, and applicant details, which helps harmonize how accession-like identifiers are presented and searched across IP offices worldwide. This standardization supports the tracking of patent family equivalents (related applications filed in multiple countries claiming the same priority) through systems like the WIPO PATENTSCOPE database.
In practice, these identifiers enable targeted searches and prior art analysis in major IP databases. For example, Espacenet, operated by the European Patent Office (EPO), allows users to query by application numbers or equivalent identifiers to retrieve over 140 million patent documents, while the USPTO's Patent Application Information Retrieval (PAIR) system uses serial numbers tied to accessions for status tracking and public access post-publication.[64] In the biotechnology sector, accession numbers for biological deposits are particularly vital under 37 CFR 1.801 et seq., where they provide evidence of enablement under 35 U.S.C. 112 by ensuring that skilled artisans can access and reproduce the deposited material, such as microbial strains or cell lines, without undue experimentation. Failure to include such a deposit when necessary can lead to rejection for lack of written description or enablement.[65]
Recent developments as of 2025 have expanded IP database capabilities for tracking AI-related inventions. WIPO's PATENTSCOPE has introduced an Artificial Intelligence Index that categorizes AI patents and assigns searchable tags based on technological subfields. The index leaves core accession structures unchanged but improves metadata linkages for global equivalents, making emerging trends easier to identify.[66] This builds on broader efforts, such as the WIPO Patent Landscape Report on Generative AI, which analyzes over 60,000 AI-related filings to highlight innovation hotspots and facilitate prior art searches in databases like Derwent and Espacenet.[67]
General Database Applications
Academic and Publication Databases
In academic and publication databases, accession numbers are unique, database-specific identifiers assigned sequentially to scholarly articles, datasets, and other research outputs upon indexing, facilitating precise retrieval and persistent referencing. For instance, PubMed employs the PubMed Unique Identifier (PMID), a 1- to 8-digit number without leading zeros, to manage and disseminate records in its repository of over 39 million biomedical literature citations. Similarly, the Web of Science Core Collection uses an accession number denoted by the field tag "UT," a unique alphanumeric code for each document that enables batch searching and citation tracking across its indexed content. These identifiers are distinct from broader persistent identifiers like Digital Object Identifiers (DOIs), though DOIs often incorporate accession-like suffixes for granular resolution, such as in Crossref-registered works.[68][69][70]
Examples of such systems include Scopus, which assigns an Electronic Identifier (EID) or Document ID, typically formatted as "2-s2.0-" followed by a numeric string, to each indexed record for unambiguous document location and author profiling. In data repositories, Dryad issues DOIs with internal alphanumeric components (e.g., doi:10.5061/dryad.1zcrjdhxx) upon dataset submission, ensuring citable links to supporting research materials. These accession numbers enhance discoverability by enabling cross-database searches and integration with author identifiers like ORCID, where authenticated iDs are embedded in publication metadata to link works directly to researcher profiles, reducing name ambiguity and supporting automated updates to scholarly records.[71][72][73]
These identifiers also support versioning of peer-reviewed content, allowing updates or corrections to be tracked without losing historical references, and promote interoperability in citation networks. Compliance with National Information Standards Organization (NISO) guidelines for metadata, such as those outlined in the "Understanding Metadata" primer, ensures standardized description, accessibility, and licensing elements across platforms, supporting machine-readable formats for global scholarly communication. Recent updates to NISO standards, including enhancements for persistent identifiers like ARKs, further improve interoperability in general database applications as of 2025. By 2025, the growth in open access has amplified their role, with major repositories like PubMed and Crossref collectively indexing over 150 million records, driven by policies mandating immediate public access and resulting in significantly increased output since 2014.[74][75][76]
Challenges in this domain include harmonizing identifiers across disparate platforms, as variations in formats and metadata completeness can hinder linking, such as matching ORCIDs or funder IDs in Crossref deposits. Efforts like Crossref's persistent identifier initiatives address these by promoting uniform resolution and metadata enrichment, though ambiguities in funder names and duplicate DOIs persist, requiring ongoing community collaboration to maintain the integrity of the scholarly record.[77][78]
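Because each platform uses its own identifier shape, a first step in cross-database linking is often to guess which kind of identifier a string is. The sketch below is a heuristic under stated assumptions: a PMID is taken to be 1 to 8 digits with no leading zero, a Scopus EID to start with "2-s2.0-" followed by digits, and a DOI to start with "10." followed by a registrant code and a slash. The function name `classify_identifier` is illustrative, and a bare number can of course belong to some other system entirely.

```python
import re

_PMID_RE = re.compile(r"^[1-9]\d{0,7}$")          # 1-8 digits, no leading zero
_SCOPUS_EID_RE = re.compile(r"^2-s2\.0-\d+$")     # "2-s2.0-" plus digits
_DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")        # common DOI shape (heuristic)


def classify_identifier(text: str) -> str:
    """Return 'doi', 'scopus_eid', 'pmid', or 'unknown' for a raw string."""
    value = text.strip()
    # Accept the "doi:" prefix form used in citations.
    if value.lower().startswith("doi:"):
        value = value[4:]
    if _DOI_RE.match(value):
        return "doi"
    if _SCOPUS_EID_RE.match(value):
        return "scopus_eid"
    if _PMID_RE.match(value):
        return "pmid"
    return "unknown"
```

The checks run from most to least distinctive so that a purely numeric PMID is only assumed when nothing more specific matches.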
Enterprise and Information Management Systems
In enterprise and information management systems, accession numbers are unique sequential identifiers assigned to corporate records, documents, assets, and transactions to support organized tracking and retrieval within large-scale organizational databases. These identifiers, often formatted as prefixed sequential codes such as "ENT-2025-00987," are commonly implemented in systems like OpenText Content Manager and SAP Records Management, where they distinguish entries in customer relationship management (CRM) platforms, enterprise resource planning (ERP) modules, and document repositories. Unlike the fixed formats of highly regulated sectors, enterprise accession numbers allow customizable structures that align with internal workflows, so departments can adapt numbering schemes to specific needs such as prefixing by business unit or year.[79][80]
Implementation typically involves auto-generation upon data entry, so each new record receives a distinct number without manual intervention, which supports audit trails and compliance requirements like the General Data Protection Regulation (GDPR). Under GDPR, these numbers help trace personal data processing activities, maintaining accountability by linking records to their origin and modifications over time. In systems such as Salesforce CRM, sequential numbering can be configured for specific objects like invoices or assets, generating sequential identifiers that integrate with broader ERP ecosystems. This automation minimizes duplication and errors in high-volume environments, where manual assignment could lead to inconsistencies.[81]
Practical examples include inventory management in manufacturing firms, where accession numbers track assets from procurement to deployment. Assigning "INV-2025-0456" to a batch of machinery parts in SAP, for example, enables precise stock audits and recall processes. In human resources (HR) systems, they identify employee records or onboarding documents, like "HR-2025-1123" for a personnel file, aiding retention scheduling and access controls across enterprise platforms. These applications provide a reliable reference for cross-departmental queries and reporting.[82]
The primary benefits of accession numbers in these systems are fewer errors in large-scale data handling and faster retrieval than non-unique labeling allows. As of 2025, emerging trends integrate blockchain technology with accession numbers to ensure immutability, linking identifiers to distributed ledgers in platforms like Hyperledger Fabric for tamper-proof audit logs in enterprise records management. This approach addresses compliance demands for verifiable data integrity without altering the customizable nature of enterprise formats, distinguishing them from rigid standards in fields like securities filings.[83][84]
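The customizable, auto-generated schemes described above (such as "ENT-2025-00987" or "INV-2025-0456") can be sketched as a small allocator with a configurable template. This is an in-memory illustration under stated assumptions: the class name `SequenceAllocator` and the template syntax are this example's own, counters are kept per prefix and year, and a real deployment would persist the counter in a database sequence or equivalent so numbers survive restarts and work across processes.

```python
import threading
from collections import defaultdict
from typing import Dict, Tuple


class SequenceAllocator:
    """Issues prefixed sequential identifiers from a configurable template."""

    def __init__(self, template: str = "{prefix}-{year}-{seq:05d}") -> None:
        self._template = template
        self._counters: Dict[Tuple[str, int], int] = defaultdict(int)
        self._lock = threading.Lock()

    def allocate(self, prefix: str, year: int) -> str:
        # The lock keeps concurrent callers from receiving the same number.
        with self._lock:
            self._counters[(prefix, year)] += 1
            seq = self._counters[(prefix, year)]
        return self._template.format(prefix=prefix, year=year, seq=seq)
```

For example, `SequenceAllocator("{prefix}-{year}-{seq:04d}")` would issue four-digit serials in the style of the inventory example, while the default template matches the five-digit "ENT" style.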