Fact-checked by Grok 2 weeks ago

Data element

A data element is a fundamental unit of data in information systems, defined as an indivisible atomic component that conveys a precise and unique meaning within a specific context.^[1] According to the ISO/IEC 11179 standard for metadata registries, it serves as the basic container for data, combining a data element concept—which captures the semantic meaning—and a value domain—which specifies the allowable values and representation format.^[2] This structure ensures reusability and standardization across organizations, facilitating data interoperability in fields like databases, telecommunications, and government reporting.^[3] In practice, data elements are essential for metadata management, where each is described by attributes such as a unique name, unambiguous definition, data type, and constraints to prevent ambiguity and support consistent data exchange.^[3] For instance, the U.S. National Institute of Standards and Technology (NIST) describes a data element as "a basic unit of information that has a unique meaning and subcategories (data items) of distinct value," with examples including gender, race, and geographic location, emphasizing its role in privacy and cybersecurity frameworks.^[4] The ISO/IEC 11179 series, particularly parts 1 through 6, provides the international framework for registering and governing these elements in metadata registries (MDRs), promoting semantic precision and reducing redundancy in large-scale data environments. By enabling precise definitions without circular references or procedural details, data elements underpin data quality improvement and integration in modern applications, from electronic health records to financial transactions.^[5]

Fundamentals

Definition

A data element is an atomic unit of data that is indivisible and carries precise, unambiguous meaning within a specific context.^[4] It represents the smallest meaningful component of information that cannot be further subdivided without losing its semantic integrity, ensuring clarity in data processing and interpretation.^[4] According to ISO/IEC 11179, a data element combines a data element concept—which captures the semantic meaning—and a value domain—which specifies the allowable values and representation.^[6] In metadata, data models, and information exchange, the data element serves as the foundational building block for constructing larger structures, such as records or messages, enabling consistent representation and interoperability across systems.^[7] By providing a standardized unit of meaning, data elements facilitate the organization of complex data hierarchies and support reliable data sharing and analysis. Properties such as identification and representation further characterize these units, though detailed attributes are explored elsewhere. The concept of the data element traces its historical origins to early database theory in the 1960s, particularly through the efforts of the CODASYL Data Base Task Group (DBTG), which formalized data structures in reports that influenced the development of database management systems (DBMS).^[8] These foundational works emphasized atomic data units within network models, evolving over decades into modern data management practices that integrate data elements into relational, NoSQL, and big data architectures for enhanced scalability and semantics.^[9] It is important to distinguish a data element from a related term like data item; according to standards such as those from HHS, the latter often refers to a specific occurrence or instance of the data element, while the data element itself is the definitional atomic unit.^[10]

Properties

A data element is characterized by several core properties that ensure its clarity, reusability, and interoperability in information systems. These include a unique identification, typically in the form of a name or identifier, which distinguishes it within a given context or registry.^[10] A precise definition is essential, providing a concise, unambiguous statement of the element's meaning without circular references or embedded explanations.^[11] Additionally, the data type specifies the nature of the values it can hold, such as string, integer, or date, while the representation term—often qualifiers like "Code," "Amount," or "Identifier"—indicates the general category of representation to promote consistency.^[11]^[10] Optional properties enhance the element's utility and flexibility. Enumerated values may be defined for categorical data, listing permissible options within a value domain to restrict inputs and ensure semantic accuracy.^[11] Synonyms or aliases can be included to accommodate alternative names used in different systems or contexts, facilitating mapping and integration.^[11] Constraints, such as maximum length, format requirements, or units of measure, further delimit the element's valid representations, preventing errors in data capture and processing.^[10] Guidelines for constructing these properties emphasize precision to avoid ambiguity. Definitions should be context-specific, tailored to the domain without vagueness—for instance, specifying "Age: The number of years since birth" rather than a generic term like "how old someone is."^[11] They must remain non-circular, relying on established terms rather than self-referential loops, and unambiguous to support consistent interpretation across users and systems.^[10] An illustrative example is the data element PersonBirthDate, which includes: a unique name "PersonBirthDate"; a definition "The date on which an individual was born"; data type "date"; representation term "Date"; format constraint "YYYY-MM-DD"; and no enumerated values, as it draws from a standard calendar domain.^[10] This set ensures the element's atomic nature as an indivisible unit of data.^[11]

Standardization

ISO/IEC 11179

ISO/IEC 11179 is an international standard developed by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) that provides a framework for metadata registries (MDRs) to register, manage, and describe data elements, concepts, and classifications in a structured manner. First published in parts during the mid-1990s, with initial editions such as ISO/IEC 11179-4 in 1995, the standard has evolved through multiple revisions, reaching its latest editions in 2023 and 2024 across its multi-part structure. It consists of several parts, including frameworks for conceptual schemas (Part 1), metamodels for data and metadata (Part 3), naming and identification principles (Part 5), and registration procedures (Part 6), enabling organizations to ensure semantic consistency and interoperability of data across systems.^[12]^[13] The standard defines key components essential for describing data elements within an MDR. A data element concept represents the abstract meaning or semantic content of a data item, independent of its specific format, such as "Person Birth Date" denoting the date of birth without specifying how it is stored. A data element is a specific instantiation of a data element concept, including its representation (e.g., data type, length), such as "PersonBirthDate" formatted as YYYY-MM-DD. Value domains specify the permissible values or ranges for data elements, either as enumerations (e.g., a list of countries) or qualifiers (e.g., numeric ranges with precision), ensuring controlled and consistent usage. These components collectively support the classification and governance of metadata to facilitate data sharing and reuse. The registration process in ISO/IEC 11179 outlines a formal procedure for submitting, evaluating, and maintaining entries in an MDR to maintain quality and authority. Submission involves providing detailed metadata for a proposed data element, including its concept, representation, and value domain, along with supporting documentation for review by a registration authority. The review process assesses compliance with standard criteria, such as semantic clarity and uniqueness, potentially involving iterations for refinement before approval or rejection. Once registered, data elements undergo stewardship, including versioning to track changes (e.g., updates to value domains) and periodic reviews for obsolescence, ensuring ongoing relevance and traceability. This process promotes accountability through designated stewards responsible for maintenance.^[14] Naming conventions under ISO/IEC 11179 emphasize clarity, consistency, and semantic precision to avoid ambiguity in data element identifiers. Names should employ Upper Camel Case, where each word starts with an uppercase letter and subsequent letters are lowercase, such as "PersonGivenName" for a first name field. A representation term from a controlled list (e.g., "Identifier," "Name," "Date") must conclude the name to indicate the data's form or qualifier, drawn from standardized glossaries to ensure uniformity. Abbreviations are discouraged to prevent misinterpretation, favoring full terms unless explicitly defined in the registry, thereby enhancing readability and machine-processability across diverse systems. As of 2025, recent updates to ISO/IEC 11179, including extensions in Parts 31, 33, and 34 published in 2023 and 2024, have enhanced support for semantic web technologies such as Resource Description Framework (RDF) to improve interoperability with linked data environments. These revisions introduce metamodels for data provenance and conceptual mappings that align with RDF schemas, enabling MDRs to export metadata as triples for integration with ontologies and knowledge graphs, thus bridging traditional data element management with modern semantic ecosystems. Additionally, in 2025, ISO/IEC TR 19583-21 and TR 19583-24 were published, offering SQL instantiation and RDF schema mappings for the ISO/IEC 11179 metamodel to support integration with relational databases and linked data environments.^[15]^[16]^[17] Several standards and frameworks extend the foundational concepts of data elements outlined in ISO/IEC 11179, focusing on domain-specific interoperability and reusable components for information exchange. The ebXML Core Components Technical Specification, developed in the 2000s under ISO/TS 15000-5, defines core components as reusable building blocks for business document exchange, where data elements represent atomic pieces of business information structured within XML schemas to ensure semantic consistency across electronic transactions. This approach promotes the reuse of data elements in supply chain and e-commerce contexts by specifying aggregate and basic components that encapsulate business semantics.^[18] In the United States, the Global Justice XML Data Model (GJXDM), initiated in the early 2000s, provides an object-oriented framework for justice and public safety information sharing, organizing data elements into a dictionary and XML schema to standardize exchanges among law enforcement and judicial entities.^[19] Building on GJXDM, the National Information Exchange Model (NIEM), launched in 2005, expands this to broader government domains by defining a core set of reusable data elements—such as those for persons, activities, and locations—that support XML-based information exchanges while allowing domain-specific extensions.^[20] NIEM's data model emphasizes governance processes to maintain element definitions, facilitating interoperability across federal, state, and local agencies.^[21] For metadata applications, the Dublin Core Metadata Element Set, version 1.1, offers a simple vocabulary of 15 properties, including dc:title for resource naming, designed as basic data elements for describing digital resources in a cross-domain, interoperable manner.^[22] These elements prioritize simplicity and extensibility, enabling lightweight resource discovery without complex hierarchies.^[23] ISO/IEC 19773:2011 further supports data element reuse by extracting modular components from ISO/IEC 11179, including data element concepts for integration into open technical dictionaries, which serve as shared repositories for standardized terminology in engineering and technical applications.^[24] These modules define value spaces and datatypes to ensure consistency in multilingual and multi-domain environments.^[25] By 2025, data element standards have increasingly aligned with web technologies, such as W3C-endorsed schema.org, which provides structured data vocabularies—including types like WebPage and properties for entities—to markup web content as reusable data elements for enhanced search and interoperability.^[26]

Usage in Information Systems

Databases and Data Models

In relational databases, data elements serve as columns within tables, defining the structure and type of information stored for each attribute of an entity. For instance, a column representing a customer's name might use the VARCHAR data type in SQL to accommodate variable-length strings, ensuring efficient storage and querying of textual data.^[27] This organization into rows and columns allows for systematic representation of relationships between data, where each row (or tuple) corresponds to a complete record. Normalization techniques, such as those outlined in Edgar F. Codd's relational model, are applied to these data elements to minimize redundancy and dependency issues, organizing tables to eliminate duplicate information across columns.^[28]^[29] Within conceptual data models, particularly entity-relationship (ER) diagrams, data elements are represented as attributes attached to entities, capturing specific properties that describe real-world objects. An attribute like CustomerID functions as a unique identifier (primary key) linked to the Customer entity, enabling the modeling of one-to-many or many-to-many relationships between entities without data duplication. These attributes can be simple, such as a single-valued field for an employee's ID, or composite, combining multiple data elements like address components (street, city, zip code). This approach ensures that data elements maintain referential integrity and support the translation of ER models into physical database schemas.^[30]^[31] The role of data elements has evolved across database paradigms, originating from hierarchical models in the 1960s and 1970s, where data was structured in tree-like parent-child relationships, to the relational model introduced by Codd in 1970, which emphasized tabular independence and query flexibility. In modern NoSQL databases like MongoDB, data elements appear as key-value pairs within flexible document structures, allowing nested or varying attributes in BSON format without rigid schemas—for example, a user document might include a "preferences" key with sub-elements like language and theme. This shift accommodates diverse data types and scalability needs, contrasting with the fixed columns of relational systems.^[32]^[33]^[34] Interoperability between database schemas relies on mapping data elements to align disparate structures, often facilitated by Extract, Transform, Load (ETL) processes that extract data from source systems, transform elements (e.g., converting date formats or aggregating values), and load them into target databases. Tools in ETL pipelines define mappings to ensure semantic consistency, such as linking a "client_name" field from one schema to "customer_fullname" in another, preventing data silos in integrated environments.^[35]^[36]

Markup Languages and XML

In markup languages, data elements serve as the fundamental building blocks for structuring and exchanging information in a human- and machine-readable format. In XML, data elements are represented as tagged components enclosed by start and end tags, such as <GivenName>John</GivenName>, which encapsulate specific pieces of data while allowing for hierarchical organization and extensibility.^[37] This structure enables the definition of custom tags to represent domain-specific data, ensuring that documents can be parsed and validated consistently across systems.^[37] To enforce consistency and interoperability, XML data elements are typically defined and validated using XML Schema Definition (XSD), a W3C recommendation that specifies constraints on element types, cardinality, and content models. For instance, an XSD can declare an element like <GivenName> with a string type and length restrictions, allowing tools to verify document compliance before processing.^[38] Namespaces in XML further support reusability by qualifying element names to avoid conflicts, as seen in schemas where global elements—reusable across multiple documents—are prefixed with unique URIs. In ebXML, an OASIS and UN/CEFACT standard for electronic business, global elements exemplify this by defining reusable core components, such as <ID> or <Amount>, with attributes for data types and business semantics to facilitate standardized B2B exchanges.^[18] These XML-based data elements find practical application in web services and configuration files, promoting portable data interchange. In SOAP, a protocol for XML messaging in web services, data elements form the payload within the <Body> envelope, enabling structured requests and responses over HTTP for operations like remote procedure calls. Similarly, RESTful APIs can leverage XML payloads where data elements represent resources, though JSON has become more prevalent; in both cases, schemas ensure data integrity during transmission. Configuration files, such as those in enterprise software, use XML data elements to define parameters—like <server><host>[example.com](/page/Example.com)</host></server>—allowing modular and version-controlled settings that are easily parsed by applications. Naming conventions for these elements often draw from ISO/IEC 11179 to promote clarity and semantic consistency. Post-2010 developments have extended these concepts beyond pure XML, with JSON-LD emerging as a W3C recommendation for linked data serialization. In JSON-LD, properties function as data elements annotated with semantic contexts, such as {"@context": {"givenName": "http://schema.org/givenName"}, "givenName": "John"}, enabling JSON documents to link to ontologies like Schema.org for enhanced discoverability and interoperability in web-scale data exchange.^[39] This approach bridges traditional markup with semantic web technologies, treating properties as reusable, context-aware data elements without requiring full XML adoption.

Telecommunications

In telecommunications, data elements refer to the structured, named components within protocol data units (PDUs), which serve as the fundamental units of data exchange between network entities across layers of the OSI model. These elements encapsulate specific information, such as addresses, control flags, or payload details, ensuring interoperability and precise handling in communication protocols. For instance, at the data link layer, a PDU might include data elements like source and destination addresses, while higher layers incorporate session identifiers or error-checking fields.^[40]^[41] A prominent example appears in the Signaling System No. 7 (SS7), a global standard developed by the ITU-T in the late 1970s and refined through the 1980s for circuit-switched telephone networks. In SS7's ISDN User Part (ISUP) messages, data elements function as mandatory or optional parameters, such as the calling party number, which specifies the originator's address in formats including nature of address indicator and numbering plan identification to facilitate call routing and billing. This parameter, defined in ITU-T Recommendation Q.763, ensures unambiguous identification across international networks. In modern 5G networks, data elements are integral to New Radio (NR) protocol messages, as specified by 3GPP standards. For example, in Radio Resource Control (RRC) signaling, information elements like the UE identity or measurement reports within PDUs enable efficient resource allocation and handover procedures; these are encoded using ASN.1 notation in TS 38.331 to minimize overhead while supporting high-speed data transmission.^[42] The evolution of data elements in telecommunications traces from the 1980s ITU-T standards, which emphasized circuit-oriented signaling like SS7 for voice services, to IP-based architectures in the 2000s. The IP Multimedia Subsystem (IMS), standardized by 3GPP, introduced data elements for session initiation and control in packet-switched environments, such as the P-Asserted-Identity header in SIP messages, which carries authenticated user information to support multimedia sessions over IP networks. This shift enabled convergence of voice, data, and video, with IMS data elements ensuring quality-of-service parameters like bandwidth allocation.^[43] As of 2025, data elements play a critical role in Internet of Things (IoT) protocols, particularly MQTT, an OASIS-standardized lightweight publish/subscribe messaging transport. In MQTT, data elements within the payload—such as topic strings and variable-length sensor readings (e.g., temperature or humidity values)—are transmitted with minimal overhead, using fixed headers for control flags and QoS levels to enable reliable, low-bandwidth communication from resource-constrained devices to cloud brokers.

Contemporary Applications

Big Data and Data Lakes

In big data environments, data elements serve as flexible attributes within distributed systems like Apache Hadoop, accommodating the high variety of structured, semi-structured, and unstructured data sources that challenge traditional notions of atomicity and uniformity.^[44]^[45] Hadoop's Hadoop Distributed File System (HDFS) enables the storage and processing of diverse data formats across clusters, allowing data elements—such as individual fields in logs, sensor readings, or multimedia metadata—to be treated as modular components without rigid upfront constraints.^[46] This flexibility supports scalability for massive volumes but complicates ensuring the indivisibility and semantic consistency of data elements amid heterogeneous inputs.^[47] Data lakes address these challenges by storing raw data elements in their native form without predefined schemas, applying a schema-on-read approach that defers structure imposition until analysis time.^[48] In platforms like Amazon Web Services (AWS), data elements are ingested as objects in Amazon S3 buckets, often tagged with metadata elements such as timestamps, source identifiers, or content types to facilitate discovery and later governance.^[48] This method preserves the integrity of varied data elements—from JSON records to binary files—enabling organizations to apply retrospective schemas for compliance, querying, or transformation while avoiding the bottlenecks of upfront normalization.^[49] As of 2025, advancements in data lake technologies include lakehouse architectures, which combine the flexibility of data lakes with the reliability of data warehouses, using open table formats like Apache Iceberg to provide ACID transactions and schema evolution for better governance of data elements in analytical workloads.^[50]^[51] To manage serialization and efficiency, tools like Apache Avro and Apache Parquet are widely used for encoding data elements with embedded schemas in big data pipelines.^[52] Avro employs JSON-defined schemas stored alongside binary data, supporting schema evolution and compact representation ideal for streaming and batch processing of evolving data elements. Parquet, with its columnar format, optimizes for analytical workloads by partitioning data elements into logical segments, reducing I/O for high-volume queries and enhancing compression for velocity-driven environments like real-time ingestion.^[53] These practices ensure data elements remain accessible and performant across distributed storage, balancing flexibility with structured access. Additionally, as of 2025, AI integration in big data pipelines enables automated processing and real-time analytics on data elements, improving velocity and veracity in handling exponential data growth.^[54]^[55] As of 2025, a key trend involves integrating data elements into data mesh architectures, where they are treated as domain-owned assets to enable decentralized analytics and cross-team collaboration.^[56] Originating from principles outlined by Zhamak Dehghani, data mesh decentralizes ownership of data products—bundles of related data elements—across business domains, fostering self-service platforms that treat elements like customer IDs or transaction attributes as governed, shareable resources rather than centralized silos.^[56] This shift supports scalable governance in big data and lake ecosystems, aligning with projections for exponential data growth by emphasizing federated control over monolithic structures.^[57]

AI and Machine Learning

In artificial intelligence and machine learning, data elements serve as the fundamental atomic units or features that form the inputs to models, enabling the processing and analysis of complex datasets. For instance, in computer vision tasks, individual pixel values in images act as discrete data elements that capture color intensity and spatial information, while in natural language processing (NLP), tokenized words or subwords represent data elements derived from text sequences. As of 2025, multimodal models increasingly incorporate diverse data elements such as combined text, images, and audio for more comprehensive AI applications, including agentic systems that autonomously reason and act on these elements. These elements must undergo rigorous preparation, including cleaning to remove noise, duplicates, or inconsistencies, and normalization to scale values uniformly—such as through min-max scaling or z-score standardization—to ensure model stability and prevent dominance by features with larger ranges.^[58]^[59]^[60]^[61]^[62] Within machine learning pipelines, data elements are organized into structured datasets that support efficient training and evaluation workflows. Tools like TensorFlow Datasets provide pre-built collections where data elements are exposed as tf.data pipelines, allowing seamless loading, batching, and transformation for models such as convolutional neural networks or transformers. To maintain reproducibility amid iterative development, versioning systems like Data Version Control (DVC) track changes to these data elements, treating them akin to code commits in Git, which facilitates collaboration and rollback in large-scale AI projects.^[58]^[63] Despite their utility, data elements in AI systems introduce significant challenges related to bias and privacy. Bias can arise from skewed representations within data elements, such as underrepresentation of certain demographic groups in feature distributions, leading to discriminatory model outcomes—a phenomenon observed in selection bias where training data fails to mirror real-world diversity. Privacy concerns are equally critical, with regulations like the EU's General Data Protection Regulation (GDPR) mandating anonymization techniques, such as k-anonymity or differential privacy, to obscure personal identifiers in data elements while preserving analytical utility.^[64] Advancements as of 2025 have enhanced the handling of data elements in distributed and automated AI contexts. In federated learning, data elements remain localized on edge devices, with only model updates aggregated centrally to train shared models without raw data exchange, thereby bolstering privacy in scenarios like mobile health applications, particularly through integration with edge computing for real-time processing. Concurrently, AutoML platforms integrate automatic feature selection algorithms that evaluate and rank data elements based on relevance metrics like mutual information, streamlining pipeline optimization for non-experts and focusing on improved interpretability and handling of unstructured data. Enumerated values from categorical data elements can be encoded as one-hot vectors for such selections, ensuring compatibility with numerical models.^[65]^[66]^[67]^[68]

Issues and Management

Semantic Overloading

Semantic overloading refers to the assignment of multiple, conflicting meanings to a single data element within or across information systems, which introduces ambiguities that undermine data integration and usability. In database contexts, this occurs when a data element, such as a field labeled "ID," is interpreted differently— for instance, as a unique user identifier in one schema versus a transaction reference in another—leading to erroneous mappings and query results during integration efforts. This phenomenon is rooted in the limitations of early database models, where schemas impose overloaded constructs that blend unrelated domain concepts, forcing users to navigate implicit assumptions rather than explicit semantics.^[69] The primary causes of semantic overloading stem from the incremental evolution of legacy systems, where initial simplistic naming and modeling practices become inadequate as organizational needs expand, resulting in reused elements for unrelated purposes without clear documentation. Poor naming conventions exacerbate this by relying on generic terms that fail to capture context-specific nuances, while domain mismatches arise when integrating data from disparate sources, such as during enterprise consolidations. Historical instances from the 1990s data warehousing initiatives illustrate these pitfalls; efforts to unify operational data from siloed systems frequently encountered semantic heterogeneity, contributing to integration failures that delayed or derailed projects by introducing inconsistencies in aggregated views.^[70]^[71] Detection of semantic overloading typically involves semantic analysis tools that employ ontologies to compare and align data element meanings across sources. For example, the Web Ontology Language (OWL) enables the formal representation of concepts and relationships, allowing inference engines to identify conflicts by mapping elements to shared vocabularies and flagging divergences in interpretation. Complementary data profiling techniques examine value patterns, cardinality, and dependencies statistically to uncover hidden ambiguities, such as unexpected overlaps in data distributions that suggest multiple semantics. Mitigation strategies focus on resolving these through ontology-mediated mappings or schema enhancements that explicitly delineate meanings, thereby restoring clarity without overhauling underlying structures.^[72]^[73] The impacts of semantic overloading are particularly acute in scenarios involving data quality degradation during mergers and migrations, where unaddressed ambiguities propagate errors into unified datasets, resulting in inaccurate analytics and compliance risks. For instance, merging customer records from acquired entities can lead to duplicated or misinterpreted identities if overloaded fields are not reconciled, amplifying costs and timelines.

Data Governance and Critical Data Elements

Data governance encompasses the policies, processes, and standards that organizations establish to ensure the effective stewardship, quality, security, and compliance of data elements throughout their lifecycle.^[74] This includes defining roles such as data stewards who oversee data ownership, access controls, and accountability to maintain data integrity from creation and acquisition to usage, archiving, and eventual retirement.^[75] Classification plays a central role, where data elements are categorized based on sensitivity and importance; for instance, personally identifiable information (PII) is often designated as critical due to its potential for privacy breaches and regulatory penalties.^[76] Lifecycle management further involves systematic planning to address risks at each stage, such as encryption during storage and secure disposal to prevent unauthorized access.^[77] Critical Data Elements (CDEs) represent a subset of data elements that are essential to an organization's core operations, decision-making, and risk mitigation, such as financial transaction amounts or security credentials that underpin authentication systems.^[78] These elements are identified through criteria including their direct or indirect impact on business outcomes, regulatory requirements, and potential financial or reputational risks if compromised.^[79] For example, data elements tied to customer financial records qualify as CDEs because inaccuracies could lead to non-compliance with financial reporting standards, while security credentials are prioritized for their role in preventing cyber threats.^[80] Prioritizing CDEs allows organizations to allocate resources efficiently, focusing governance efforts on high-value assets rather than treating all data uniformly.^[81] Established frameworks like the Data Management Body of Knowledge (DAMA-DMBOK) provide structured guidance for assessing and advancing data governance maturity, outlining functional areas such as data quality, metadata management, and policy enforcement to evaluate an organization's progress from ad-hoc practices to optimized, enterprise-wide implementation.^[82] Tools such as Collibra support these efforts by enabling the creation and maintenance of centralized catalogs for CDEs, facilitating discovery, stewardship, and integration with governance workflows as of 2025.^[83] Best practices in data governance emphasize regular auditing to verify compliance and data quality, lineage tracking to document the provenance and transformations of data elements for transparency, and adherence to key regulations like the California Consumer Privacy Act (CCPA) of 2018, which mandates protections for consumer data, or the EU AI Act of 2024, which requires robust data governance for high-risk AI systems.^[84] These practices mitigate risks, including those from semantic overloading where ambiguous data definitions can undermine stewardship, ensuring reliable and ethical data usage across the organization.^[85]

References

[1]
ISO/IEC 11179-1:2015(en), Information technology
In ISO/IEC 11179 the basic container for data is called a data element. ... Note 1 to entry: The definition states that a data element is “indivisible” in some ...
[2]
ISO/IEC 11179 data element representation
Oct 13, 2022 · A Data Element defines how a piece of data is recorded for a specific set of objects (or items of interest), using reusable metadata components.
[3]
ISO/IEC 11179-5:2005 - Information technology
ISO/IEC 11179-5:2005 provides instruction for naming and identification of data element concepts, conceptual domains, data elements, and value domains within a ...
[4]
data element - Glossary | CSRC
Definitions: A basic unit of information that has a unique meaning and subcategories (data items) of distinct value. Examples of data elements include gender, ...
[5]
https://www.iso.org/standard/35346.html
[6]
Data element - EPFL Graph Search
In metadata, the term data element is an atomic unit of data that has precise meaning or precise semantics.
[7]
https://www.aihw.gov.au/getmedia/08edfd46-50c6-406c-9b09-39fe20af19d9/gdd-c04.pdf.aspx
[8]
How Charles Bachman Invented the DBMS, a Foundation of Our ...
Jul 1, 2016 · During the late 1960s the ideas Bachman created for IDS were taken up by the Database Task Group of CODASYL, a standards body for the data ...
[9]
[PDF] The Worlds of Database Systems - Stanford InfoLab
For example, the CODASYL query language had statements that allowed the user to jump from data element to data ele- ment, through a graph of pointers among ...
[10]
[PDF] Meta Data Standard - Texas Health and Human Services
Mar 5, 2025 · ... data element, a unit of data considered indivisible. A data element is ... Data item: One occurrence of a data element. Data layer: The ...
[11]
[PDF] METADATA REGISTRY, ISO/IEC 11179 - UNT Digital Library
Jan 7, 2008 · The representation of a data element consists of a value domain, a data-type, units of measure, and a representation class. A data element ...
[12]
Metadata Element - an overview | ScienceDirect Topics
Data elements metadata includes critical data elements, data elements definitions, data formats, and aliases/synonyms. ▫ Information architecture metadata ...
[13]
ISO/IEC 11179-1:2023 - Information technology
In stock 2–5 day deliveryThis document provides the means for understanding and associating the individual parts of ISO/IEC 11179 and is the foundation for a conceptual understanding ...
[14]
ISO/IEC 11179-6:2023 - Information technology
In stock 2–5 day deliveryThis document defines the type of information to be specified, the conditions to be met, and the procedure(s) to be followed for each item to be registered ...
[15]
ISO/IEC 11179-31:2023 - Metadata registries (MDR)
In stock 2–5 day deliveryThis document provides a specification for an extension to a Metadata Registry (MDR), as specified in ISO/IEC 11179-3, in which metadata that describes data ...
[16]
[PDF] UN/CEFACT – ebXML Core Components Technical Specification ...
Feb 8, 2002 · ebXML Technical Report, Guide to the Core Components Dictionary v1.04 ... Specification and Standardization of Data Elements, International.
[17]
Global Justice XML (Archive) - Bureau of Justice Assistance
The Global JXDM is an object-oriented data model for organizing the content of a data dictionary, the Global Justice XML Data Dictionary (Global JXDD), in a ...
[18]
National Information Exchange Model (NIEM)
NIEM will standardize content (actual data exchange standards), provide tools, and managed processes.
[19]
National Information Exchange Model Naming and Design Rules, 3.0
This document specifies the data model, XML schema, and rules for NIEM 3.0, an information sharing framework using XML for standard information exchange.
[20]
DCMI: Dublin Core™ Metadata Element Set, Version 1.1: Reference ...
The Dublin Core™ Metadata Element Set is a vocabulary of fifteen properties for use in resource description. The name "Dublin" is due to its origin at a 1995 ...
[21]
DCMI Metadata Terms - Dublin Core
Jan 20, 2020 · DCMI metadata terms are an up-to-date specification of metadata terms, including properties, classes, and vocabulary encoding schemes, ...Title · DCMI: Dublin Core · Alternative Title · Release History
[22]
ISO/IEC 19773:2011 - Metadata Registries (MDR) modules
In stockISO/IEC 19773:2011 specifies small modules of data that can be used or reused in applications. These modules have been extracted from ISO/IEC 11179-3, ISO/IEC ...Missing: concepts dictionaries
[23]
ISO/IEC 19773:2011(en), Information technology
A datatype differs from a kind in the following ways: (1) a datatype implies a value space, which is a specialization of an extension (totality of values vs.
[24]
WebPage - Schema.org Type
Every web page is implicitly assumed to be declared to be of type WebPage, so the various properties about that webpage, such as breadcrumb may be used.AboutPage · CollectionPage · FAQPage · ItemPageMissing: W3C | Show results with:W3C
[25]
ISO 19115 Geographic Metadata Information | NASA Earthdata
The document lists the relevant ISO standards to be used to describe science data products. Complete descriptions of NASA collections and products will require ...
[26]
Relational Model in DBMS - GeeksforGeeks
Sep 9, 2025 · The Relational Model organizes data using tables (relations) with rows (tuples) representing records and columns (attributes) representing data ...
[27]
Relational Database: Definition, Examples, and More - Coursera
Oct 15, 2025 · Data in a relational database is stored in tables. The tables are connected by unique IDs or "keys." When a user needs to access specific ...What Is A Relational... · How Do Relational Databases... · Why Is A Relational Database...
[28]
A Brief History of Data Modeling - Dataversity
Jun 7, 2023 · By the end of the 1980s, the hierarchical model was becoming outdated, with Codd's relational model becoming the popular replacement. Query ...
[29]
Key concepts: Entity, attribute, and entity type - IBM
Oct 25, 2022 · An entity is a unique object, an entity type is the type of information stored, and an attribute is a trait of an entity type.
[30]
Types of Attributes in ER Model - GeeksforGeeks
Jul 12, 2025 · The types of attributes in ER model are: Simple, Composite, Single-Valued, Multi-Valued, Derived, Complex, Stored, Key, Null, and Descriptive.
[31]
A brief history of databases: From relational, to NoSQL, to distributed ...
Feb 24, 2022 · The first computer database was built in the 1960s, but the history of databases as we know them, really begins in 1970.
[32]
Document Database - NoSQL | MongoDB
Primitive data handling Presenting JSON data as simple strings and numbers rather than the rich data types supported by native document databases such as ...What Are Documents? · Collections · How Much Easier Are...
[33]
What Is NoSQL? NoSQL Databases Explained - MongoDB
NoSQL databases come in a variety of types based on their data model. The main types are document, key-value, wide-column, and graph. They provide flexible ...NoSQL Data Models · NoSQL Vs SQL Databases · When to Use NoSQL
[34]
What Is ETL Data Mapping And Why Should You Know About It?
Feb 7, 2025 · ETL data mapping is the process of defining the relationship between different data fields in source systems and target systems.
[35]
ETL Data Mapping: Process, Methods, Use Cases, Benefits
Mar 29, 2023 · The first step in ETL mapping is identifying data elements to be extracted from the source system. · The data team must analyze the source data ...
[36]
Extensible Markup Language (XML) 1.0 (Fifth Edition) - W3C
Nov 26, 2008 · This document specifies a syntax created by subsetting an existing, widely used international text processing standard (Standard Generalized Markup Language, ...
[37]
XML Schema Part 1: Structures Second Edition - W3C
Oct 28, 2004 · An XML Schema consists of components such as type definitions and element declarations. These can be used to assess the validity of well-formed ...
[38]
JSON-LD 1.1 - W3C
Jul 16, 2020 · JSON-LD is a lightweight syntax to serialize Linked Data in JSON [ RFC8259 ]. Its design allows existing JSON to be interpreted as Linked Data with minimal ...
[39]
What is a Protocol Data Unit (PDU)? | Definition from TechTarget
Feb 18, 2025 · A protocol data unit (PDU) is the basic unit of exchange between entities that communicate with a specified networking protocol.
[40]
Protocol Data Unit (PDU) - GeeksforGeeks
Jul 23, 2025 · Each layer's information is referred to as a Protocol Data Unit (PDU). Along with the data, it contains protocol-specific control information.
[41]
[PDF] ETSI TS 138 331 V15.3.0 (2018-10)
The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or. GSM identities. These should be ...
[42]
1970s - 1980s - ITU's 160th anniversary
ITU standards were crucial in the creation of modern digital communications, from shaping the development of TCP/IP, the protocol suite that powers the ...Missing: IMS elements
[43]
Apache Hadoop
### Summary of Hadoop Handling Data Elements in Big Data
[44]
What is Big Data? | IBM
Along with traditional structured data, big data can include unstructured data, such as free-form text, images and videos. It can also include semi-structured ...Missing: elements | Show results with:elements
[45]
https://www.ibm.com/think/topics/big-data
[46]
Challenges of Big Data Analysis - PMC - NIH
This article overviews the opportunities and challenges brought by Big Data, with emphasis on the distinguished features of Big Data and statistical and ...
[47]
What is a Data Lake? - Introduction to Data Lakes and Analytics - AWS
### Summary of Data Lakes: Storage, Schema-on-Read, AWS S3, and Metadata Tagging
[48]
Discovering your data with S3 Metadata tables - AWS Documentation
Use Amazon S3 Metadata to accelerate data discovery by easily surfacing, storing, and querying the metadata for your S3 objects. With S3 Metadata, you can ...
[49]
https://docs.aws.amazon.com/AmazonS3/latest/userguide/metadata-tables-overview.html
[50]
Documentation
- **Apache Parquet**: A file format for serializing data elements with schemas, optimized for big data processing.
[51]
Data Mesh Principles and Logical Architecture - Martin Fowler
Dec 3, 2020 · For more on Data Mesh, Zhamak went on to write a full book that covers more details on strategy, implementation, and organizational design.
[52]
Data Trends in 2025: 8 Trends To Follow | Splunk
Aug 16, 2024 · Some key components of Data mesh architecture include: Domain-oriented decentralized data ownership; Self-serve data platform for easy access to ...
[53]
TensorFlow Datasets
TensorFlow Datasets is a collection of datasets ready to use, with TensorFlow or other Python ML frameworks, such as Jax. All datasets are exposed as tf.data. ...Introducing TensorFlow... · Models & datasets · Writing custom datasets · CatalogMissing: elements machine
[54]
What is Normalization in Machine Learning? A ... - DataCamp
Jan 4, 2024 · Normalization is a specific form of feature scaling that transforms the range of features to a standard scale.
[55]
Data Cleaning in ML - GeeksforGeeks
Sep 16, 2025 · Data cleaning is a step in machine learning (ML) which involves identifying and removing any missing, duplicate or irrelevant data.
[56]
Pipelines | Data Version Control · DVC
Open-source version control system for Data Science and Machine Learning projects. Git-like experience to organize your data, models, and experiments.
[57]
Fairness: Types of bias | Machine Learning - Google for Developers
Aug 25, 2025 · Selection bias occurs if a dataset's examples are chosen in a way that is not reflective of their real-world distribution.Reporting bias · Historical bias · Selection bias · Group attribution bias
[58]
Federated learning: what it is and how it works | Google Cloud
Federated learning (FL) is a machine learning approach that enables the training of a shared AI model using data from numerous decentralized edge devices or ...
[59]
https://www.datacamp.com/tutorial/normalization-in-machine-learning
[60]
[PDF] Standards-based Semantic Integration of Manufacturing Information
The physical model is typically flat and contains little and imprecise formal semantics, where one entity is overloaded with several similar concepts and ...
[61]
[PDF] Semantic Database Modeling: Survey, Applications, and Research ...
It reviews the philosophical motivations of semantic models, including the need for high-level modeling abstractions and the reduction of semantic overloading ...
[62]
[PDF] Introduction to the Special Issue on Semantic Integration
Semantic heterogeneity is one of the key challenges in integrating and sharing data across disparate sources, data exchange and migration, data warehous-.Missing: failures | Show results with:failures
[63]
(PDF) OWL-Based Semantic Conflicts Detection and Resolution for ...
Aug 7, 2025 · We use three steps to achieve data interoperability and integration: Step 1. Detect possible semantic conflicts by examining ontologies. Step 2.
[64]
Ontologies and Data Management: A Brief Survey | KI
Aug 13, 2020 · This survey gives an overview of research work on the use of ontologies for accessing incomplete and/or heterogeneous data.Missing: detection mitigation overloading elements
[65]
Heterogeneous data integration: Challenges and opportunities
Semantic heterogeneity as a result of domain evolution. ACM Sigmod Rec., 20 (4) (1991), pp. 16-20. Crossref View in Scopus Google Scholar. [12]. B. Ben Mahria ...Missing: failures | Show results with:failures
[66]
What is Data Governance? - IBM
Data governance is the data management discipline that focuses on the quality, security and availability of an organization's data.
[67]
What Is Data Governance & Why Is It Crucial? - Salesforce
Data governance defines the policies, processes, and roles that keep your data secure, reliable, and optimized for better decision-making and compliance.
[68]
What Is Data Governance? A Comprehensive Guide - Databricks
What is data governance? It describes the processes, policies, tech and more that organizations use to manage and get the most from their data.
[69]
Data Governance Lifecycle: Key Stages & Core Capabilities - Atlan
Apr 25, 2025 · The data governance lifecycle defines how organizations manage data policies, ownership, quality, and compliance—across every step of a data ...
[70]
Critical Data Elements (CDE) Best Practices for Data Governance
Jul 10, 2024 · CDEs are the data that informs and enables an organisation's operations, decision-making processes, risk management, reporting accuracy, and compliance with ...
[71]
Critical data elements: why are they important and how to measure ...
Data elements are the different attributes that describe the data entity. For example, data elements of the customer entity might be a unique id to identify the ...
[72]
[PDF] Critical Data Elements Explained: Defining, Governing, and ...
Feb 21, 2025 · CDEs refer to the essential data attributes that are critical to an organization's operations, decision-making processes, risk management ...
[73]
Critical Data Elements Explained - Dataversity
Oct 31, 2023 · Some data elements are less critical than others and need less governance. These data elements are not considered as important, and per the CDE ...
[74]
Data Management Body of Knowledge (DAMA-DMBOK
DAMA-DMBOK is a globally recognized framework that defines the core principles, best practices, and essential functions of data management.DAMA® Dictionary of Data... · DAMA-DMBOK® Infographics · FAQs
[75]
Collibra Data Catalog software
Make informed decisions with Collibra Data Catalog. Achieve data intelligence and streamline data across sources in a single cloud solution.
[76]
Top Data Governance Best Practices for 2025 - DataTeams AI
Discover essential data governance best practices for 2025 to ensure compliance, security, and effective data management. Learn more now!1. Establish Clear Data... · 2. Implement Comprehensive... · 7. Establish Data Access...<|control11|><|separator|>
[77]
Five reasons why data lineage is essential for regulatory compliance
Aug 6, 2025 · Data lineage improves governance and accountability Governance and accountability are essential for building trust, reducing compliance risks, ...