Unique identifier
A unique identifier (UID) is a numeric or alphanumeric string associated with a single entity—such as an object, record, or device—to distinguish it within a defined system or context, thereby enabling accurate tracking, retrieval, and management without ambiguity.[1][2] In computer science and information systems, UIDs serve as foundational elements for data integrity, acting as primary keys in relational databases to enforce referential integrity and prevent duplicates during queries or updates.[3] They underpin distributed systems by facilitating collision-resistant labeling, as seen in universally unique identifiers (UUIDs), which employ 128-bit values generated via algorithms outlined in RFC 4122 to achieve near-certain global uniqueness without a centralized authority.[4] Notable implementations include IEEE's extended unique identifiers (EUIs) for network interfaces, ensuring device-level distinction in protocols like Ethernet, and ISO/IEC 15459 standards for supply chain items, where non-significant strings track individual units across lifecycles.[5][6] While UIDs enhance scalability and interoperability, their design must balance uniqueness probability against storage overhead and potential privacy risks in pervasive tracking applications.[7]
Fundamentals
Definition
A unique identifier (UID) is a numeric or alphanumeric string associated with a single entity within a defined system, namespace, or context, ensuring it can be distinguished from all others.[1][2] This identifier serves as a reference mechanism for locating, tracking, or managing the entity, such as a record in a database, a device in a network, or an object in a distributed system.[8] Uniqueness is enforced relative to the scope of application, preventing duplication and supporting operations like data retrieval, updates, and integrity checks.[9][10] In computer science, UIDs are typically permanent and immutable once assigned, facilitating reliable identification across processes or time periods.[3] They underpin data models by acting as primary keys in relational databases, where constraints ensure no two rows share the same value, thus maintaining referential integrity and avoiding ambiguity in queries.[11] For instance, in inventory systems, a UID might link a product to its specifications, sales history, and location without conflation.[12] The design of a UID prioritizes collision resistance, that is, minimizing the probability that two independent assignments yield the same value, often through algorithms that leverage sequences, randomness, or hashing to achieve high uniqueness guarantees within practical constraints. While local UIDs suffice for bounded environments like single databases, broader applications demand mechanisms for global uniqueness to support interoperability across systems.[13] Failure to ensure uniqueness can lead to errors such as data corruption or misattribution, underscoring their foundational role in scalable computing architectures.[14]
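As a minimal sketch of the primary-key behavior described above (using Python's standard sqlite3 module; the product table, its columns, and the SKU-style values are hypothetical examples rather than part of any cited system), the database itself rejects a second row that reuses an existing UID:

```python
# Hedged sketch: a UID acting as a primary key in a relational table,
# using the standard-library sqlite3 module. Names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " product_uid TEXT PRIMARY KEY,"   # uniqueness enforced by the database
    " name        TEXT NOT NULL"
    ")"
)
conn.execute("INSERT INTO product VALUES ('SKU-0001', 'Widget')")

try:
    # A second row with the same UID violates the primary-key constraint.
    conn.execute("INSERT INTO product VALUES ('SKU-0001', 'Duplicate widget')")
except sqlite3.IntegrityError as err:
    print("Rejected duplicate UID:", err)
```

Here the constraint, rather than application code, guarantees that queries referencing product_uid can never be ambiguous.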
Essential Properties
A unique identifier must possess uniqueness as its core property, ensuring that it distinguishes one entity from all others within the defined scope, preventing collisions or duplicates that could compromise data integrity or system functionality.[1][3] This requires mechanisms such as sufficient bit length or algorithmic generation to minimize the probability of overlap, as seen in standards where identifiers are designed to be collision-resistant across distributed environments.[15] Persistence is another essential attribute, meaning the identifier remains stably linked to the entity throughout its lifecycle and is not reassigned to different objects, which supports reliable referencing in databases, tracking systems, and long-term data management.[16][17] Without persistence, changes or reallocation could lead to ambiguity or loss of historical traceability, undermining applications like audit trails or entity resolution.[18] Immutability ensures that once assigned, the identifier does not alter, facilitating consistent retrieval and relationships across systems without requiring updates that risk errors or propagation failures.[16] This property is critical in scenarios involving data migration or integration, where mutable identifiers could introduce inconsistencies.[19] Additionally, opaqueness—where the identifier reveals no inherent information about the entity—enhances security by obscuring patterns that might enable guessing or inference attacks.[16][20] These properties are interdependent and typically enforced through system-level protocols, such as centralized registries or probabilistic guarantees, to maintain reliability in diverse contexts like software development and identity management.[18][3] Failure to uphold them can result in issues like data duplication or failed authentications, as evidenced in distributed computing challenges.[21]
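A brief sketch of how the opaqueness and collision-resistance properties can be realized through random generation (Python's secrets module; the 128-bit length and the function name are assumptions made for illustration, not requirements of any cited standard):

```python
# Hedged sketch: an opaque, randomly generated identifier. The value encodes
# nothing about the entity it labels, and 128 bits of randomness make
# accidental collisions negligible for practical workloads.
import secrets

def new_opaque_uid(num_bytes: int = 16) -> str:
    """Return a URL-safe identifier carrying num_bytes of randomness."""
    return secrets.token_urlsafe(num_bytes)

print(new_opaque_uid())   # e.g. 'pXx0b4cQ9...' -- reveals no pattern to guess
```

Persistence and immutability, by contrast, are policies enforced by the surrounding system (never reassigning or rewriting the value) rather than properties of the generation step itself.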
Classification
By Scope and Persistence
Unique identifiers are classified by their scope, which delineates the domain of guaranteed uniqueness, and by their persistence, which measures the identifier's longevity and resolvability. Scope distinguishes between local identifiers, unique only within a confined context such as a single database table, namespace, or system, and global identifiers, unique across distributed networks, organizations, or universally without reliance on a specific authority.[22][23] Persistence differentiates persistent identifiers, engineered for indefinite validity through resolution mechanisms that withstand changes in storage, ownership, or technology, from transient (or ephemeral) identifiers, which expire after short durations like a session or process lifecycle.[24] Locally persistent identifiers, such as auto-incrementing primary keys in relational databases (e.g., a user_id column unique within one table), ensure entity distinction within a bounded system while surviving restarts or migrations if the database schema persists.[22] These are common in monolithic applications where cross-system coordination is unnecessary, but they risk collisions if data merges across contexts without namespace prefixes. Globally persistent identifiers, like Universally Unique Identifiers (UUIDs) version 4 or Digital Object Identifiers (DOIs), achieve worldwide uniqueness probabilistically or via centralized registries, with persistence maintained by standards ensuring resolvability over decades; for instance, a version 4 UUID carries 122 random bits within its 128-bit value, giving a pairwise collision probability of roughly 1 in 2^122.[15][25] DOIs, prefixed by registrant codes (e.g., 10.1000, the prefix of the International DOI Foundation), resolve to digital objects via the Handle System (handle.net) and have supported scholarly citations since 2000.[24]
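To make the scope distinction concrete, the sketch below pairs a local auto-increment counter with a globally unique UUID version 4 and applies the standard birthday approximation to its 122 random bits (Python standard library; the one-billion workload is an assumed figure for illustration, not a cited measurement):

```python
# Hedged sketch: local vs. global identifier scope, plus a birthday-bound
# estimate of UUIDv4 collision probability.
import uuid
from itertools import count

local_ids = count(1)              # local scope: unique only within this process or table
local_id = next(local_ids)

global_id = uuid.uuid4()          # global scope: 122 random bits (RFC 4122, version 4)

n = 10**9                         # assumed number of UUIDs generated worldwide
p_collision = n * (n - 1) / 2 / 2**122   # birthday approximation
print(local_id, global_id)
print(f"Approximate collision probability for {n} UUIDs: {p_collision:.3e}")
```

Even at this assumed volume the estimated collision probability is on the order of 10^-19, which is why version 4 UUIDs are treated as globally unique without any central registry.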
Locally transient identifiers include process IDs (PIDs) in operating systems like Unix, which uniquely tag running processes on a host (e.g., values from 1 to 32768, recycled after termination) but become invalid once the process exits, aiding short-term resource tracking without global coordination.[26] In web applications, session identifiers stored in cookies serve as local ephemeral tokens unique to each user-browser interaction, discarded after logout or timeout to enhance privacy. Globally transient identifiers appear in network protocols, such as ephemeral port numbers in TCP (typically 49152–65535) or connection IDs in QUIC, which ensure endpoint uniqueness during active flows but rotate or expire to mitigate tracking risks, as discussed in IETF specifications, where reuse cycles prevent indefinite persistence. These transient types prioritize security and efficiency in dynamic environments but demand regeneration mechanisms to avoid reuse conflicts.[26]
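The following sketch contrasts two transient identifiers: the operating-system process ID, which is meaningful only while the process runs, and a session token with an expiry time (Python standard library; the session structure and its 30-minute lifetime are hypothetical choices, not taken from a cited protocol):

```python
# Hedged sketch: transient identifiers that lose validity after a process
# exits or a timeout elapses.
import os
import secrets
import time

print("Process ID:", os.getpid())         # locally transient: recycled after exit

SESSION_LIFETIME_SECONDS = 30 * 60        # assumed timeout for illustration

def new_session() -> dict:
    """Create an ephemeral session record with an opaque token and expiry."""
    return {
        "token": secrets.token_hex(16),    # opaque 128-bit value
        "expires_at": time.time() + SESSION_LIFETIME_SECONDS,
    }

def is_valid(session: dict) -> bool:
    """A transient identifier is honored only before its expiry."""
    return time.time() < session["expires_at"]

session = new_session()
print("Session currently valid:", is_valid(session))
```

Discarding and regenerating such identifiers limits how long any one value can be used to track or replay an interaction, at the cost of the bookkeeping needed to avoid reuse conflicts.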
This dual classification informs design trade-offs: local persistence suits cost-effective, siloed data management, while global persistence enables interoperability in federated systems like the web; transient identifiers reduce privacy exposure in short-lived interactions, though they complicate auditing compared to persistent alternatives. Empirical evaluations, such as those of protocol implementations, show transient IDs lowering collision risks in high-volume scenarios through frequent randomization, while persistent global schemes like UUIDs excel in distributed databases by providing scalability without central bottlenecks.[27][15]