Common Platform Enumeration
Common Platform Enumeration (CPE) is a standardized, machine-readable naming scheme for identifying information technology systems, software, and packages, utilizing URI syntax to encode product names in a structured format.[1] Developed initially by MITRE Corporation to support consistent references across cybersecurity specifications, CPE enables precise identification of IT assets for vulnerability management and security automation.[2] CPE forms a core component of the U.S. National Institute of Standards and Technology's (NIST) Security Content Automation Protocol (SCAP), providing a formal method for naming operating systems, hardware devices, applications, and their versions.[3] The current specification, CPE 2.3, released in August 2011, defines a layered set of capabilities including a naming scheme, procedures for comparing names, and applicability statements that combine multiple CPE names with logical operators.[4]
NIST maintains the official CPE Dictionary, an XML-based repository of approved names that is updated nightly and has accepted public contributions since December 2009, ensuring coverage of current and historical product releases.[1] In practice, CPE facilitates automated security processes by allowing tools to match vulnerabilities, such as those in the Common Vulnerabilities and Exposures (CVE) list, to specific platforms, enhancing risk assessment and patch management across enterprises.[5] By standardizing nomenclature, it reduces ambiguity in IT inventories, supports interoperability among security products, and aids compliance efforts that rely on SCAP-validated tools.[2]
Overview
Definition
Common Platform Enumeration (CPE) is a structured, machine-readable naming scheme developed for identifying information technology products, encompassing hardware, operating systems, applications, and packages.[3] It establishes a standardized nomenclature that enables consistent and unambiguous references to these entities across diverse systems and databases.[5] By providing a formal format for product names, CPE facilitates automated processing and interoperability in security and inventory management contexts.[2]
Key characteristics of CPE include its uniform vocabulary, which draws from a centrally maintained dictionary to ensure consistent terminology for product descriptions, and its extensibility, allowing the incorporation of additional attributes as needed without disrupting existing implementations.[5] Furthermore, CPE names are formatted as Uniform Resource Identifiers (URIs), leveraging URI syntax to guarantee global uniqueness and ease of parsing by machines.[3] This URI-based structure supports precise matching and comparison of product identifiers in large-scale environments.[2]
Unlike Software Identification (SWID) tags, which provide comprehensive metadata for software lifecycle tracking including installation details and entitlements, CPE emphasizes enumeration through concise, standardized naming primarily for cataloging and reference purposes.[6] In practice, CPE plays a critical role in vulnerability identification by enabling tools to map known issues to specific platforms.[5]
Purpose and Applications
Common Platform Enumeration (CPE) serves as a standardized nomenclature for precisely identifying information technology products, including hardware, operating systems, and applications, to facilitate automated processes in security and management. Its primary objective is to provide a structured, machine-readable format that enables consistent naming and classification of IT assets, thereby supporting vulnerability assessment by linking specific platforms to known security issues.[5][2] This approach addresses the challenges of ambiguous product descriptions in traditional inventories, allowing tools to accurately detect and enumerate software and systems across diverse environments.
In practical applications, CPE plays a central role in vulnerability management by enabling lookups against databases like the Common Vulnerabilities and Exposures (CVE) list, where identifiers help determine applicable threats to identified platforms. It integrates with the Security Content Automation Protocol (SCAP) to automate compliance evaluations, such as verifying adherence to security configurations and policies through tagged applicability statements. Additionally, CPE supports enterprise asset management by streamlining software inventory processes, ensuring organizations can track and update IT components efficiently without reliance on inconsistent vendor-specific naming.[5][1][2]
The benefits of CPE include reducing ambiguity in product identification, which minimizes errors in security assessments and patch deployment, and promoting interoperability among tools from different vendors through a shared vocabulary maintained in the official NIST CPE Dictionary. This standardization aids in patch management by directly associating platform names with specific vulnerabilities, enabling proactive risk mitigation and scalable automation in large-scale IT operations.[5][1]
Development
Origins
Common Platform Enumeration (CPE) was developed by the MITRE Corporation under sponsorship from the U.S. Department of Homeland Security (DHS), with initial efforts beginning in the mid-2000s to establish a standardized method for identifying information technology products and platforms. This work aimed to enable machine-readable naming that could support automated security processes, building on the recognition that inconsistent product nomenclature hindered effective vulnerability management across diverse databases.[2]
The primary motivation for CPE's creation stemmed from longstanding challenges in vulnerability reporting, particularly the inconsistencies in naming IT products affected by security issues, which traced back to the early Common Vulnerabilities and Exposures (CVE) program's experiences in the late 1990s. At that time, multiple security databases used varying terms for the same vulnerabilities and associated platforms, complicating correlation, duplication detection, and overall interoperability in threat intelligence sharing.[7][8] By providing a structured, uniform naming scheme, CPE sought to resolve these issues, facilitating better automation in software inventory, configuration assessment, and vulnerability scanning.[2][8]
Key early contributors included MITRE's CVE team, which leveraged its expertise in vulnerability enumeration to drive CPE's design as an open standard. This involved collaboration with industry stakeholders to ensure broad applicability and adoption, emphasizing a community-driven approach to refine the specification for real-world security needs.[2][8] Over time, oversight of CPE transitioned to the National Institute of Standards and Technology (NIST) to further integrate it into broader security automation frameworks.[2]
Version History
The development of Common Platform Enumeration (CPE) began with version 1.1, released on March 13, 2007, by MITRE Corporation under sponsorship from the U.S. Department of Homeland Security.[9] This initial specification introduced a basic structured naming scheme for identifying information technology platforms, encompassing hardware, operating systems, and applications. It emphasized simple string formats using a URI-like syntax, such as cpe:/{part}:{vendor}:{product}:{version}, with optional facets separated by slashes to denote hardware, OS, or application components, enabling straightforward enumeration without complex attributes.[9]
Version 2.2, published on March 11, 2009, marked a significant refinement by MITRE and the National Security Agency.[10] It prescribed a more formalized URI name format, cpe:/{part}:{vendor}:{product}:{version}:{update}:{edition}:{language}, which expanded attributes for greater precision in describing product variations, including support for language tags per IETF standards and edition details. This version also introduced initial dictionary support through an official XML-based collection of standardized CPE names, complete with metadata and submission processes for community contributions, facilitating centralized management and validation.[10]
CPE version 2.3, released by the National Institute of Standards and Technology (NIST) in August 2011, represented a major architectural evolution.[11] It shifted to a modular "specification stack" comprising naming, matching, dictionary, and applicability layers, promoting interoperability within the Security Content Automation Protocol (SCAP). While functional changes were minimal—retaining core URI compatibility—improvements included the well-formed name (WFN) abstraction, new attributes like software edition and target hardware, and enhanced bindings for formatted strings, all aimed at better modularity and extensibility.[11]
As of 2025, version 2.3 remains the current specification, actively maintained by NIST with no full version 3.0 released.[1] Post-2.3 updates have consisted of maintenance releases, such as minor clarifications and bug fixes to the naming pseudocode in 2011, alongside ongoing expansions to the official CPE dictionary through nightly updates to incorporate new product names and refine existing entries.[11][1]
Naming Scheme
URI Format
The Common Platform Enumeration (CPE) URI format provides a structured, machine-readable representation of CPE names, adhering to the generic syntax for Uniform Resource Identifiers (URIs) as defined in RFC 3986. This format ensures consistent identification of information technology products, platforms, and configurations across security tools and databases. A bound CPE name consists of a scheme identifier followed by attribute values separated by colons, with all components mandatory for well-formedness. The general syntax of a CPE 2.3 name is cpe:<cpe_version>:<part>:<vendor>:<product>:<version>:<update>:<edition>:<language>:<sw_edition>:<target_sw>:<target_hw>:<other>, where each attribute is a string value and colons delimit the fields. Here, <cpe_version> specifies the CPE Naming Specification version, <part> indicates the type of platform, and the remaining ten attributes describe the entity's vendor, product, version, update, edition, language, software edition, target software, target hardware, and any other descriptive information, respectively. For example, cpe:2.3:a:ntp:ntp:4.2.8:p3:*:*:*:*:*:* represents a specific NTP application version.[12]
The version indicator, such as "2.3", denotes the version of the CPE Naming Specification under which the name is constructed, ensuring compatibility with the corresponding specification rules (e.g., "2.3" aligns with CPE Naming Specification Version 2.3). The part indicator follows immediately and uses one of three values: "a" for applications, "o" for operating systems, or "h" for hardware, which determines the semantic context for the subsequent attributes.[12]
Reserved URI characters within attribute values must be escaped using hexadecimal encoding to maintain syntactic validity, as per URI rules. For instance, the tilde (~) is encoded as %7E and the exclamation mark (!) as %21. Only reserved URI characters (per RFC 3986) that appear in attribute values require percent-encoding; common characters like periods (.) and hyphens (-) in versions are not encoded. This escaping applies specifically to the URI binding, distinct from the logical Well-Formed Name (WFN) representation, which uses backslash quoting for certain characters.[12]
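The escaping rule above can be illustrated with a minimal Python sketch. The pass-through set (letters, digits, period, hyphen, underscore) is an assumption chosen to reproduce the two encodings quoted in the text; it is not the specification's full encoding table, and the function names `percent_encode` and `percent_decode` are hypothetical.

```python
import string
from urllib.parse import unquote

# Assumption for this sketch: only these characters pass through unchanged.
PASS_THROUGH = set(string.ascii_letters + string.digits + "._-")


def percent_encode(value: str) -> str:
    """Encode every character outside PASS_THROUGH as %XX, one per UTF-8 byte."""
    out = []
    for ch in value:
        if ch in PASS_THROUGH:
            out.append(ch)
        else:
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


def percent_decode(value: str) -> str:
    """Reverse the encoding with the standard library's percent-decoder."""
    return unquote(value)


if __name__ == "__main__":
    print(percent_encode("big~data!"))   # tilde and exclamation mark are encoded
    print(percent_encode("4.2.8-p3"))    # periods and hyphens are left alone
```

A custom encoder is used here because `urllib.parse.quote` treats the tilde as always safe and would leave it unencoded.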
A CPE name is considered well-formed only if all components are present, even when an attribute value is unspecified or not applicable. In the legacy URI binding (names beginning with cpe:/), an unspecified attribute (logical ANY) is represented by an empty component, which produces consecutive colons (::); in the 2.3 formatted string binding (names beginning with cpe:2.3:), ANY is written as an asterisk (*). Both bindings use a hyphen (-) for not applicable (NA). Within formatted strings, asterisks and question marks inside a value also serve as wildcards for matching. Attribute strings must consist of printable UTF-8 characters, with non-alphanumeric characters properly quoted or encoded to avoid parsing errors. This rigid structure facilitates automated processing in vulnerability management systems and standardized dictionaries.[12]
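As a rough illustration of the well-formedness rule, the following Python sketch splits a CPE 2.3 formatted string into named attributes and rebuilds one from a dictionary. It assumes the eleven-attribute layout of the 2.3 formatted string (ending in `other`); the helper names `parse_formatted_string` and `format_name` are hypothetical, and an escaped backslash directly before a colon is not handled.

```python
import re
from typing import Dict

# Attribute order assumed for the CPE 2.3 formatted string binding.
ATTRIBUTES = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)

# Split on colons that are not escaped with a preceding backslash.
_UNESCAPED_COLON = re.compile(r"(?<!\\):")


def parse_formatted_string(name: str) -> Dict[str, str]:
    """Return a dict of attribute values, or raise ValueError if malformed."""
    fields = _UNESCAPED_COLON.split(name)
    if len(fields) != len(ATTRIBUTES) + 2:
        raise ValueError(f"expected 13 colon-separated fields, got {len(fields)}")
    if fields[0] != "cpe" or fields[1] != "2.3":
        raise ValueError("name must start with 'cpe:2.3:'")
    values = fields[2:]
    if values[0] not in ("a", "o", "h", "*", "-"):
        raise ValueError(f"unknown part value: {values[0]!r}")
    return dict(zip(ATTRIBUTES, values))


def format_name(attrs: Dict[str, str]) -> str:
    """Build a formatted string; attributes left out default to '*' (ANY)."""
    return "cpe:2.3:" + ":".join(attrs.get(a, "*") for a in ATTRIBUTES)


if __name__ == "__main__":
    parsed = parse_formatted_string("cpe:2.3:a:ntp:ntp:4.2.8:p3:*:*:*:*:*:*")
    print(parsed["vendor"], parsed["version"], parsed["update"])
    print(format_name({"part": "o", "vendor": "microsoft", "product": "windows"}))
```

Rejecting names with a missing component, rather than padding them, mirrors the requirement that every component be present.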
Component Attributes
The component attributes in a Common Platform Enumeration (CPE) name provide structured fields to identify specific aspects of an IT product, such as its producer, release details, and target environment. These attributes form the core of the Well-Formed Name (WFN) data model, where each is represented as a key-value pair, and unspecified attributes default to the logical value "ANY" to allow for flexible matching. Specified attribute values must be non-empty strings encoded in UTF-8, consisting of printable ASCII characters (hex 00-7F), with alphanumeric characters, underscores, and hyphens permitted; spaces are prohibited, and multi-word names use hyphens or underscores for separation. Special characters like asterisks (*) or question marks (?) are reserved for wildcard usage at the start or end of strings, while other non-alphanumeric characters require backslash escaping.[12]
The vendor attribute identifies the organization or individual responsible for producing the product, such as "microsoft" for software developed by Microsoft Corporation; it is required in formatted CPE names to ensure unambiguous identification and must be specified in lowercase letters without spaces. For example, in a CPE name for Microsoft Windows, the vendor would be "microsoft". This attribute supports precise vulnerability mapping by distinguishing products from different vendors with similar names.[12]
The product attribute denotes the specific name of the product being enumerated, such as "windows" for the operating system or "internet_explorer" for the web browser; like vendor, it is required in formatted names and rendered in lowercase, using underscores to connect multi-word terms (e.g., "adobe_reader"). An example is "product='windows'" in a WFN, which allows enumeration of all variants under that product line. This field is essential for categorizing software, hardware, or firmware in security databases.[12]
The version attribute specifies the release version of the product, such as "10" for Windows 10 or "8.0.6001" for a browser release; it is optional and can incorporate wildcards like "*" to match any version or "?" for single-character substitutions. In a WFN, dots in version numbers must be escaped with a backslash ("\."), as in "version='8\.0\.6001'". This attribute enables targeted identification of version-specific vulnerabilities without enumerating every minor update.[12]
The update attribute captures patch levels, service packs, or incremental updates, such as "sp1" for Service Pack 1; it is optional and follows the same formatting rules as version, allowing values like "beta" for pre-release builds. For instance, "update='sp1'" refines a CPE name to include only patched instances of a product version. This helps in tracking security fixes applied post-release.[12]
The edition attribute describes major variants of a product, such as "pro" or "enterprise"; though deprecated in CPE 2.3 in favor of other fields for new names, it remains supported for backward compatibility with version 2.2 and is optional. An example usage is "edition='enterprise'" for enterprise editions of software. It provides granularity for editions that affect functionality or security profiles.[12]
The language attribute indicates the localized version using RFC 5646 language tags, such as "en-US" for English (United States); it is optional and limited to valid language tags. For example, "language='en-US'" distinguishes region-specific builds that may have unique vulnerabilities. This field is particularly relevant for software with internationalization features.[12]
The sw_edition attribute specifies software-specific editions or tailoring for market segments, such as "basic" or "online"; it is optional and used when the edition field is insufficient, as in "sw_edition='online'" for web-based variants of diagnostic tools. This allows enumeration of application-specific sub-variants not captured by broader edition types.[12]
The target_sw attribute identifies the targeted software platform or operating system on which the product depends or runs, such as "windows_7"; it is optional and formatted with underscores or hyphens for multi-word targets, enabling dependency-aware enumeration like "target_sw='windows_2003'". This is crucial for products like plugins that require a host environment.[12]
The target_hw attribute denotes the targeted hardware architecture or platform, such as "x86" or "x64"; it is optional and supports values indicating instruction sets, as in "target_hw='x64'" for 64-bit systems. This attribute facilitates hardware-specific enumerations, such as firmware tied to particular processors.[12]
Attribute strings in CPE names have practical constraints derived from URI binding, though individual attributes lack stricter per-field limits beyond general string rules; vendor and product fields, being foundational, should prioritize brevity for effective use in security tools. Special characters in attribute values are percent-encoded when the complete CPE URI is formed; a dot would be written as "%2E" only where encoding is actually required.[12]
Specification Stack
Matching Specification
The Common Platform Enumeration (CPE) Name Matching Specification defines the procedures for determining whether a source CPE name matches a target CPE name, enabling precise comparisons in vulnerability management and security automation. This one-to-one matching process operates on well-formed names (WFNs), which are structured as attribute-value pairs representing components such as part, vendor, and product.[13] All string comparisons are performed in a case-insensitive manner after normalization, ensuring consistency regardless of capitalization in the original names.[14]
Matching begins with normalization of the source name, which involves unescaping any quoted or escaped values to interpret special characters correctly. For instance, a backslash (\) precedes characters like asterisks (*) or question marks (?) to treat them as literal data rather than wildcards; unescaping reverses this for comparison. The normalized source is then compared component-by-component to the target WFN, where each attribute-value pair yields one of four relations: EQUAL (=), SUPERSET (⊃, source is broader), SUBSET (⊂, source is narrower), or DISJOINT (∅, no overlap). Undefined components in the source, such as omitted attributes, are treated as "ANY" values, functioning as implicit wildcards that match any corresponding target value.[13][14]
The specification supports three primary matching types: exact, case-insensitive, and wildcard. Exact matching requires full string equality for each component after normalization, resulting in an EQUAL relation across all pairs for a successful match. Case-insensitive matching is inherent to all comparisons, converting strings to lowercase equivalents before evaluation. Wildcard matching allows flexible patterns in source values using unquoted * (matching zero or more characters) or ? (matching a single character), enabling regex-like expressions such as "cpe:2.3:a:vendor:product:9.*:*:*:*:*:*:*:*" to match any version starting with "9.".
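The attribute relations and wildcard rules described so far can be sketched in Python as follows. This is a simplified illustration, not the specification's pseudocode: names are assumed to be plain dictionaries of attribute values, `*` stands for ANY and `-` for NA, backslash-escaped wildcards are not handled, and the function names `compare` and `name_matches` are hypothetical.

```python
import re
from enum import Enum
from typing import Dict


class Relation(Enum):
    EQUAL = "equal"
    SUPERSET = "superset"
    SUBSET = "subset"
    DISJOINT = "disjoint"


ANY, NA = "*", "-"


def _to_regex(value: str) -> "re.Pattern[str]":
    """Translate unquoted * and ? wildcards into an anchored regular expression."""
    parts = []
    for ch in value:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def compare(source: str, target: str) -> Relation:
    """Relate one source attribute value to the corresponding target value."""
    s, t = source.lower(), target.lower()  # comparisons are case-insensitive
    if s == t:
        return Relation.EQUAL
    if s == ANY:
        return Relation.SUPERSET
    if t == ANY:
        return Relation.SUBSET
    if s == NA or t == NA:
        return Relation.DISJOINT
    if ("*" in s or "?" in s) and _to_regex(s).fullmatch(t):
        return Relation.SUPERSET
    return Relation.DISJOINT


def name_matches(source: Dict[str, str], target: Dict[str, str]) -> bool:
    """True when every attribute of the source equals or covers the target's."""
    attrs = set(source) | set(target)
    return all(
        compare(source.get(a, ANY), target.get(a, ANY))
        in (Relation.EQUAL, Relation.SUPERSET)
        for a in attrs
    )


if __name__ == "__main__":
    src = {"part": "a", "vendor": "vendor", "product": "product", "version": "9.*"}
    tgt = {"part": "a", "vendor": "Vendor", "product": "product",
           "version": "9.4.1", "update": "sp1"}
    print(name_matches(src, tgt))
```

Attributes missing from either dictionary default to ANY, which mirrors the treatment of undefined components as implicit wildcards.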
Unquoted * and ? wildcards apply only to the well-formed string representation of the source, expanding the SUPERSET relation for partial matches.[13][14] Within a single WFN, matching is conjunctive (AND), requiring all component pairs to satisfy the relation for the overall name to match. For sets of CPE names, the specification supports disjunctive (OR) operations by applying sequential one-to-one matches, where a target matches the set if it matches any individual name; conjunctive sets require matches against all names in the set. This enables compound queries in broader CPE contexts.[13][14]
Version 2.3 formalized string matching patterns by introducing explicit support for * and ? wildcards, replacing less flexible mechanisms from prior versions like implicit prefixes. This update enhances query flexibility while maintaining backward compatibility through defined escaping rules, allowing more precise applicability in security tools.[13][14]
Dictionary and Applicability Specifications
The Common Platform Enumeration (CPE) Dictionary Specification defines an XML-based repository hosted by the National Institute of Standards and Technology (NIST) that serves as the authoritative source for standardized CPE names identifying information technology products.[15] The dictionary's structure uses a root element <cpe_dict:cpe-list> to organize entries, with each <cpe_dict:cpe-item> containing a well-formed name (WFN), human-readable titles in multiple languages, references to external documentation, notes for clarification, and optional check elements for verification methods.[15] It also accommodates deprecated and legacy names through dedicated <cpe_dict_ext:deprecation> elements that link to replacement entries, ensuring historical continuity while marking obsolete identifiers.[15] As of November 2025, the official dictionary contains over 1,500,000 entries, reflecting ongoing additions and updates.[16]
The CPE Applicability Language Specification provides an XML schema for expressing sets of CPE names through logical combinations, enabling the description of complex platform applicability statements.[17] The schema's root element <cpe:platform-specification> encapsulates one or more <cpe:platform> elements, each featuring a unique @id, titles, remarks, and a <cpe:logical-test> that applies operators such as AND and OR (via @operator attribute) or NOT (via @negate attribute).[17] Nested logical tests allow for intricate queries, such as identifying all versions of a software product excluding a specific patch level, by referencing bound CPE names through <cpe:fact-ref> elements and evaluating results according to defined truth tables.[17]
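A nested logical test of this kind can be evaluated with a short Python sketch. Several simplifications are assumptions made for illustration: the sample XML omits the real schema's namespaces, a fact-ref is treated as true only when its name appears verbatim in a set of installed names (no CPE matching is performed), results are plain booleans rather than the specification's truth tables, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Set

# Simplified, namespace-free sample: product 1.0 is present AND sp1 is NOT.
SAMPLE = """
<platform id="example">
  <logical-test operator="AND" negate="false">
    <fact-ref name="cpe:2.3:a:vendor:product:1.0:*:*:*:*:*:*:*"/>
    <logical-test operator="OR" negate="true">
      <fact-ref name="cpe:2.3:a:vendor:product:1.0:sp1:*:*:*:*:*:*"/>
    </logical-test>
  </logical-test>
</platform>
"""


def evaluate(test: ET.Element, installed: Set[str]) -> bool:
    """Recursively evaluate a logical-test element against installed names."""
    # Direct fact-ref children: true when the referenced name is installed.
    results = [ref.get("name") in installed for ref in test.findall("fact-ref")]
    # Direct nested logical-test children are evaluated recursively.
    results += [evaluate(child, installed) for child in test.findall("logical-test")]
    operator = test.get("operator", "AND").upper()
    outcome = all(results) if operator == "AND" else any(results)
    if test.get("negate", "false").lower() == "true":
        outcome = not outcome
    return outcome


def platform_applies(xml_text: str, installed: Set[str]) -> bool:
    """Evaluate the top-level logical test of a single platform element."""
    root = ET.fromstring(xml_text.strip())
    top = root.find("logical-test")
    if top is None:
        raise ValueError("platform has no logical-test element")
    return evaluate(top, installed)


if __name__ == "__main__":
    base = "cpe:2.3:a:vendor:product:1.0:*:*:*:*:*:*:*"
    patched = "cpe:2.3:a:vendor:product:1.0:sp1:*:*:*:*:*:*"
    print(platform_applies(SAMPLE, {base}))
    print(platform_applies(SAMPLE, {base, patched}))
```

The negated inner test expresses the exclusion case mentioned above: the platform applies when the product is present but the excluded patch level is not.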
Management of the CPE dictionary follows NIST-defined processes that incorporate community input for additions and updates while maintaining integrity.[18] New entries are submitted by vendors or users via a formal process requiring complete details (vendor, product, version) without wildcards, subject to acceptance criteria for uniqueness and conformance; approved submissions are integrated into the official repository.[15] Deprecation occurs for obsolete products through immutable updates that link to successors, with removal limited to corrections of errors, ensuring no loss of traceability.[15]
In CPE version 2.3, the dictionary acts as the primary reference for generating well-formed names, while the applicability language supports authoring of Security Content Automation Protocol (SCAP) content by defining platform sets for vulnerability assessments and compliance checks.[15][17] This integration ensures that SCAP-validated products can reference dictionary entries and logical expressions for precise platform targeting.[17]
Examples and Usage
Illustrative Examples
A simple example of a CPE name for software is cpe:2.3:a:adobe:flash_player:11.2.202.235:-:-:-:-:-:-:-, which identifies a specific version of the Adobe Flash Player application, where "a" denotes the application part, "adobe" is the vendor, "flash_player" is the product, "11.2.202.235" specifies the version, and hyphens ("-") mark the update, edition, language, and remaining attributes as not applicable.[19][12]
For hardware platforms, consider cpe:2.3:h:apple:iphone_6s:-:-:-:-:-:-:-:-, representing the Apple iPhone 6s device, with "h" indicating the hardware part, "apple" as the vendor, "iphone_6s" as the product, and hyphens marking the eight subsequent attributes, including version, as not applicable.[20][12]
Wildcard characters allow for broader matching; for instance, cpe:2.3:a:microsoft:office:-:*:pro:eng:-:-:-:- denotes any update of Microsoft Office Professional in English, where the hyphen after the product marks the version as not applicable, the asterisk ("*") matches any update, "pro" specifies the edition, "eng" the language, and hyphens fill the remaining attributes.[21][12]
To handle special characters in attribute values, such as a tilde ("cpe:2.3:a:vendor_with~product:product_name:-:-:-:-:-:-:-, where the vendor name "vendor_with