
Codd's 12 rules

Codd's twelve rules are a set of thirteen criteria (numbered from zero to twelve) proposed in 1985 by Edgar F. Codd, the originator of the relational model, to establish the standards a database management system must meet to be considered fully relational. These rules define the core principles of the relational model, emphasizing data independence, data integrity, and user-friendly access mechanisms, so that systems adhere strictly to relational theory rather than merely incorporating relational features. Developed amid growing commercial interest in relational databases during the 1980s, the rules were first detailed in two Computerworld articles titled "Is Your DBMS Really Relational?" (October 14, 1985) and "Does Your DBMS Run By the Rules?" (October 21, 1985), which addressed vendors' tendency to market non-relational systems as relational for commercial advantage.
The foundation rule (Rule 0) requires that all data management functions be performable using only relational capabilities, setting the baseline for compliance. Subsequent rules cover representing all information in tables (Rule 1), guaranteeing access to data via logical identifiers (Rule 2), handling null values systematically (Rule 3), maintaining an active online catalog as ordinary tables (Rule 4), supporting a comprehensive sublanguage for data definition, manipulation, and control (Rule 5), enabling view updating (Rule 6), providing set-level insert, update, and delete operations (Rule 7), ensuring physical data independence (Rule 8) and logical data independence (Rule 9), enforcing integrity constraints through the relational language (Rule 10), supporting distribution independence (Rule 11), and preventing low-level languages from bypassing relational safeguards (Rule 12). While no commercial database management system has fully satisfied all of the rules, they remain a foundational benchmark for evaluating relational fidelity and have profoundly influenced the design and standardization of modern RDBMSs, including SQL-based systems.

Introduction

Definition and Purpose

Codd's 12 rules, in fact a set of 13 criteria (numbered from Rule 0 to Rule 12), were proposed by Edgar F. Codd in 1985 as a formal evaluation scheme to assess whether a database management system (DBMS) qualifies as truly relational. The rules serve as benchmarks for fidelity to the relational model, which Codd himself introduced in 1970 as a framework for organizing data into relations to simplify data access and management. Often referred to as the "Twelve Commandments" despite the inclusion of the foundational Rule 0, the criteria emphasize that a relational DBMS must manage data exclusively through relational mechanisms, without relying on non-relational or procedural extensions.
The primary purpose of the rules is to protect the integrity of the relational model against dilution by vendors, who in the 1980s frequently marketed hybrid or "born-again" systems as fully relational while incorporating navigational or hierarchical elements that undermined relational principles. Codd developed the rules out of frustration with such misleading claims, aiming to give database users and purchasers a rigorous standard for evaluating vendor products, so that long-term investments in applications, training, and data administration would remain viable. By enforcing strict adherence, the rules promote standardized data structures, integrity constraints, and manipulative capabilities, countering the "performance myth" that non-relational features were necessary for efficiency.
At their core, the rules define a relational DBMS as a system that organizes and manages data using relations (typically represented as tables consisting of rows, or tuples, and columns, or attributes) and that supports declarative query languages for retrieval, insertion, update, and deletion at the relational level, often handling multiple records at a time.
This approach ensures data sublanguage commands are comprehensive and uniformly applicable across the database, distinguishing true relational systems from those requiring low-level navigational access or record-at-a-time processing.

Relation to the Relational Model

The relational model, introduced by Codd in 1970, organizes data into relations (mathematical structures akin to tables) comprising rows (tuples) and columns (attributes) defined over specific domains, with the entire framework rooted in set theory and first-order predicate logic to enable precise querying and manipulation. This abstraction allows data to be represented declaratively without regard to physical storage details, distinguishing it from earlier models such as hierarchical or network databases, which imposed rigid navigational structures.
Codd's 12 rules, formalized in 1985 and elaborated in his later work, extend this foundational model by translating its theoretical principles into practical, verifiable requirements for database management systems (DBMS). Specifically, the rules enforce declarative access to data through non-procedural languages, logical and physical data independence to insulate applications from storage changes, and systematic handling of null values to represent missing or inapplicable information accurately, thereby preventing implementations from drifting toward non-relational paradigms such as pointer-based hierarchies or networks. These requirements ensure that systems adhere strictly to the model's emphasis on logical structures, in which relations maintain integrity through keys and domains rather than relying on implementation-specific optimizations.
A core concept here is the priority of logical over physical representation: users interact with data via operations such as selection, projection, and join, without needing to know how tuples are stored or indexed on disk. The rules serve as a rigorous test of a DBMS's fidelity to the model, confirming that all data manipulation occurs within the relational framework without procedural code or external dependencies. The rules presuppose familiarity with basic relational elements, such as primary keys for uniqueness and query languages for retrieval. For instance, consider an employee relation with attributes EmployeeID (a unique identifier), Name (a character string), and Department (a categorical value):
| EmployeeID | Name | Department |
|---|---|---|
| 101 | | Engineering |
| 102 | Bob Johnson | Sales |
| 103 | Carol Lee | Engineering |
This table exemplifies a relation in which tuples represent facts, attributes are constrained by their domains, and queries (e.g., selecting all engineers) demonstrate the model's declarative power.

Historical Development

Edgar F. Codd's Contributions

Edgar Frank Codd (1923–2003) was a British-born mathematician and computer scientist whose work profoundly influenced modern database systems. Born on August 19, 1923, on the Isle of Portland, England, Codd earned an honors degree in mathematics from Oxford University in 1948 after serving as a pilot in the Royal Air Force during World War II. He later obtained a Ph.D. in computer and communication sciences from the University of Michigan in 1965. Codd joined IBM in 1949 as a mathematical programmer in New York, initially focusing on early computing systems such as the Selective Sequence Electronic Calculator and contributing to the design of the IBM 701 computer in the early 1950s. By 1957, he had moved to Poughkeepsie, New York, where he helped develop the IBM 7030 Stretch, the company's first transistorized supercomputer, advancing concepts in multiprogramming. In 1968, Codd relocated to IBM's San Jose Research Laboratory in California, marking his transition toward database research amid growing needs for managing large-scale data in business environments.
Codd's pivotal contributions began in the late 1960s, when he sought alternatives to the prevailing hierarchical and network database models, such as IBM's Information Management System (IMS) and the Conference on Data Systems Languages (CODASYL) approach, which required programmers to navigate complex pointer-based structures. In a landmark 1970 paper, "A Relational Model of Data for Large Shared Data Banks," published in Communications of the ACM, he introduced the relational model, proposing that data be organized into tables (relations) with rows (tuples) and columns (attributes), linked via keys, to provide data independence and simplify querying without knowledge of physical storage. The model emphasized declarative access, allowing users to specify what they needed rather than how to retrieve it, and shifted data manipulation from navigational to set-based operations. Building on this foundation, Codd extended the relational model throughout the 1970s, refining concepts such as normalization to minimize redundancy and maintain consistency.
By the late 1970s and early 1980s, Codd had grown concerned that database vendors, including IBM itself, were incorporating non-relational features into products such as SQL/DS (for example, low-level navigational interfaces and other deviations from strict relational principles) that diluted the model's purity and complicated user access. Motivated to establish clear criteria for true relational database management systems (RDBMS), Codd advocated rigorous standards that would hold products to relational principles and prevent vendor implementations from undermining the relational paradigm's benefits. In 1981 he received the A.M. Turing Award from the Association for Computing Machinery, recognizing "his fundamental and continuing contributions to the theory and practice of database management systems." Codd retired from IBM in 1984 but continued independent consulting and research.
In 1990, Codd further advanced his ideas through the Relational Model Version 2 (RM/V2), outlined in his book The Relational Model for Database Management: Version 2, which expanded the original framework into over 300 detailed rules and features to address evolving requirements such as temporal data and enhanced integrity constraints. This work reinforced his commitment to evolving the relational model while preserving its mathematical rigor, based on first-order predicate logic and set theory.

Publication and Evolution

E. F. Codd first formally proposed his 12 rules for evaluating relational database management systems (RDBMS) in a two-part article series published in Computerworld magazine on October 14 and October 21, 1985. Titled "Is Your DBMS Really Relational?" and "Does Your DBMS Run by the Rules?", these articles presented the rules as a rigorous test to distinguish truly relational systems from those merely claiming the label, and included Rule 0 as the foundational principle that the system must use relational facilities exclusively to manage the database.
The publication came amid growing commercial interest in relational databases during the 1980s, as vendors, including IBM with its DB2 product, aggressively marketed their offerings as relational, often without full adherence to Codd's emerging criteria. The resulting widespread use of the term "relational" in industry promotion sparked debates about authenticity and prompted Codd to offer the rules as a benchmark for compliance. The rules also influenced the development of the ANSI SQL standards, providing conceptual guidance for features such as data sublanguages and catalog management in revisions after the initial 1986 standard.
Codd refined his ideas through the RM/V2 framework, detailed in his 1990 book The Relational Model for Database Management: Version 2, which updated several rules (for example, by expanding view updatability requirements) and integrated them into a broader vision of relational integrity and temporal support. These refinements addressed limitations in early implementations and aimed to guide future standards, though assessments of commercial compliance varied.

The Rules

Rule 0: The Foundation Rule

Rule 0, known as the Foundation Rule, establishes the fundamental prerequisite for a database management system (DBMS) to be considered truly relational. It requires that any system advertised as, or claimed to be, a relational DBMS must be able to manage the database entirely through its relational facilities, without relying on non-relational mechanisms. This ensures that the entire scope of database operations, from definition and manipulation to integrity enforcement, is handled through relational principles alone.
The rule rules out dependence on non-relational extensions, such as navigational pointers, hierarchical structures, or procedural coding elements, which were common in pre-relational systems such as CODASYL or IMS. Instead, it mandates that all data manipulation be declarative and relation-based, using mathematical relations (tables with rows and columns) to represent and query data. This foundational constraint keeps the system's architecture strictly within the relational model, preventing hybrid approaches that dilute its benefits, such as data independence and simplicity.
In practice, a compliant DBMS cannot depend on low-level, record-oriented interfaces for data access; it must instead provide interfaces built on relational constructs such as tables, primary and foreign keys, and a comprehensive data sublanguage (e.g., SQL or an equivalent) for all operations, including inserts, updates, and deletes. Codd introduced Rule 0 in his 1985 Computerworld articles as the zeroth rule, preceding the other twelve, to emphasize its indispensable role and to address misleading vendor claims of the 1980s, when products were labeled "relational" despite heavy reliance on non-relational procedural add-ons.

Rule 1: The Information Rule

Rule 1, known as the Information Rule, stipulates that all information in a relational database, including both user data and metadata such as database structure definitions, must be represented explicitly at the logical level and in exactly one way: by values in tables. This rule ensures that the database's logical structure is self-contained within the relational framework, without reliance on external files or non-relational mechanisms for storing information. By confining all data representation to tables, the rule promotes a uniform approach to data management, abstracting away physical storage details and emphasizing the logical view presented to users.
At the logical level, this representation hides the underlying physical storage mechanisms, such as file formats or indexing structures, so that users interact solely with tabular data through relational operations. Metadata, including table names, column definitions, and data types, is stored as rows and columns in dedicated system tables, eliminating the need for separate schema files or proprietary formats outside the database. This approach aligns with the foundational relational facilities outlined in Rule 0, ensuring that the entire system operates on a consistent relational basis.
For instance, the definition of a table, such as its columns and their types, would be entered as values in a system catalog table that can be treated like any other relational table. This makes the database self-describing: structural information is as accessible and manipulable as application data, which adds flexibility in administration and maintenance. The rule thus establishes the basis for treating data and metadata uniformly, which is essential for relational systems that support dynamic schema definition and evolution without disrupting applications.

Rule 2: The Guaranteed Access Rule

The Guaranteed Access Rule, designated Rule 2 in Edgar F. Codd's framework for relational database management systems, mandates that each and every datum (atomic value) in the database be logically accessible by specifying a combination of the relation name, primary key value, and attribute name. This addressing scheme ensures unambiguous retrieval of individual scalar values without reliance on implementation-specific details. As a direct corollary of Rule 1's emphasis on representing data solely as values in relations, Rule 2 presupposes that attribute values are atomic, as in first normal form (1NF), and that every relation has a primary key. It rules out access methods based on positional or ordinal references, such as identifying data by the "third field" in a sequential record, and so requires primary keys for unique identification. A practical illustration of this rule involves accessing an employee's salary in a table named employees, where emp_id serves as the primary key. The query would be:
```sql
SELECT salary FROM employees WHERE emp_id = 123;
```
This logical specification avoids any reference to physical constructs, such as "field 5 of record 10," so the access remains independent of storage layout. By requiring a three-part logical addressing mechanism instead of physical pointers or navigational paths, Rule 2 enhances data independence and portability, allowing applications to interact with the database without concern for underlying hardware or storage changes. This foundational rule underscores the relational model's shift toward declarative query languages for robust, scalable data access.
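The three-part address (table name, primary key value, column name) can be tried out with a small sketch. It uses Python's standard sqlite3 module purely as a convenient SQL engine; the table contents and the placeholder names are assumptions made for illustration, not data from Codd's articles.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(124, "B. Example", 62000), (123, "A. Example", 50000)],
)

# Address one datum by table (employees), primary key value (123), and column (salary).
# Nothing in the request depends on row order or storage position.
row = conn.execute("SELECT salary FROM employees WHERE emp_id = ?", (123,)).fetchone()
salary = row[0]
print(salary)
```

The rows are deliberately inserted out of key order to show that the lookup depends only on the key value, not on physical position.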

Rule 3: Systematic Treatment of Null Values

Rule 3 requires that a relational database management system (RDBMS) support null values to represent missing or inapplicable information in a systematic and uniform manner, independent of the data type involved. Nulls must be handled distinctly and differentiated from other representations such as empty character strings, strings of blank characters, zero, or any other number that could otherwise serve as a legitimate data entry. This ensures that nulls function as a dedicated marker solely for the absence of applicable data, avoiding the pitfalls of ad hoc conventions that vary by column or domain.
Central to this rule is the concept of null as a unique indicator for "unknown" or "not applicable" states, which necessitates three-valued logic in query processing and data manipulation. Traditional logic has two truth values (true and false); nulls introduce a third, unknown. For instance, a comparison involving a null value, such as checking whether a salary equals 50,000 when the salary is null, evaluates to unknown rather than true or false, and that result propagates through AND, OR, and NOT according to extended truth tables. Queries can test explicitly for nulls using operators such as IS NULL or IS NOT NULL, ensuring consistent behavior across the database. Primary keys and certain foreign keys can be constrained to disallow nulls, enforcing integrity where completeness is mandatory.
Consider a customers table with columns for first name, last name, and middle initial. The middle initial column can legitimately contain a null for customers whose middle initial is unknown or inapplicable, which is distinct from an empty string, a genuine zero-length value. A query to find customers with no recorded middle initial, such as SELECT * FROM customers WHERE middle_initial IS NULL, must reliably identify these records without conflating them with zero-length strings or other placeholders, thereby maintaining query accuracy.
By mandating this uniform treatment, Rule 3 mitigates ambiguities in data representation and querying that arise from incomplete real-world datasets, such as optional attributes in forms or unavailable measurements in scientific records. It promotes reliable data management and analysis, prevents errors in which missing information is misinterpreted as a specific value, and supports scalable handling of partial data without compromising the relational model's foundational principles.
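The difference between a null, an empty string, and an ordinary value can be seen in a short sketch. It runs SQL through Python's standard sqlite3 module; the customers rows are hypothetical, and SQLite's behavior here is used only as one example of three-valued logic in practice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, first_name TEXT, middle_initial TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        (1, "Ann", None, "Lee"),   # NULL: missing or inapplicable
        (2, "Raj", "", "Patel"),   # empty string: a real, zero-length value
        (3, "Eva", "M", "Stone"),  # ordinary value
    ],
)

def ids(where_clause):
    """Return the ids of customers matching the given WHERE clause."""
    sql = f"SELECT id FROM customers WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

is_null = ids("middle_initial IS NULL")   # explicit null test
eq_null = ids("middle_initial = NULL")    # comparison with NULL is unknown for every row
not_m = ids("middle_initial <> 'M'")      # unknown for the NULL row, so it is filtered out
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]  # unknown, returned as None

print(is_null, eq_null, not_m, null_eq_null)
```

Only IS NULL finds the row with the missing initial; the `=` and `<>` comparisons both evaluate to unknown for that row, so neither predicate selects it.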

Rule 4: Active Online Catalog

Rule 4, known as the Active Online Catalog rule, requires that the description of the entire database be represented at the logical level in the same way as ordinary data, so that authorized users can apply the same relational language to querying the catalog as they do to regular data. This makes the catalog a dynamic component of the database and keeps it consistent with the Information Rule by treating metadata uniformly as relations. The rule emphasizes that the catalog must be stored online and be accessible in real time, so that structural changes are reflected immediately without separate tools or offline processes.
The term "active" signifies that the catalog supports updates and queries, making it an essential, always-available part of the database system rather than a set of static files or external documentation. This relational representation of the catalog, often referred to as a data dictionary, lets users interrogate it with a single data model and language, unlike non-relational systems, in which metadata access demands distinct mechanisms. By embedding the catalog within the relational framework, the rule promotes uniformity and simplifies database administration, since structural modifications are immediately visible to authorized queries.
A practical example is the INFORMATION_SCHEMA.COLUMNS system view, which exposes column names, data types, and other attributes for all tables in a database and can be queried using standard relational operations. Authorized users can execute queries such as SELECT * FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'example_table' to retrieve and analyze table structures dynamically, with the catalog reflecting schema alterations in real time.
The implications of Rule 4 extend to metadata-driven applications, in which software can automatically discover and use database structures for tasks such as report generation or data validation without hard-coded assumptions. It also fosters self-documenting databases, since the catalog itself serves as a comprehensive, queryable description that authorized users can extend into a full-fledged relational data dictionary if the vendor's implementation falls short. The rule reflects the relational model's emphasis on uniformity and simplicity, reducing complexity for developers and administrators while enhancing overall system integrity.
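A catalog that is queried with the same language as ordinary data, and that reflects schema changes immediately, can be sketched as follows. SQLite does not provide INFORMATION_SCHEMA; this sketch assumes its built-in catalog table sqlite_master as the closest analogue, and the table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example_table (id INTEGER PRIMARY KEY, label TEXT)")

def table_names():
    """List user tables by running an ordinary SELECT against the catalog."""
    sql = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

def stored_definition(table):
    """Return the definition text the catalog holds for one table."""
    sql = "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?"
    return conn.execute(sql, (table,)).fetchone()[0]

tables_before = table_names()

# Schema changes are visible through the catalog as soon as they are made.
conn.execute("CREATE TABLE audit_log (entry TEXT)")
conn.execute("ALTER TABLE example_table ADD COLUMN created_at TEXT")

tables_after = table_names()
definition_after = stored_definition("example_table")
print(tables_before, tables_after)
print(definition_after)
```

SQLite's catalog is read through SELECT but is maintained only through DDL statements, so it illustrates the "queryable with the same language" half of the rule more directly than the "updatable" half.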

Rule 5: Comprehensive Data Sublanguage Rule

Rule 5, known as the Comprehensive Data Sublanguage Rule, stipulates that a relational database management system (RDBMS) must include at least one language whose statements are expressible, by a well-defined syntax, as character strings, and that comprehensively supports all essential database operations. This language must handle data definition, view definition, data manipulation (both interactively and by program), integrity constraints, authorization, and transaction boundaries (begin, commit, and rollback). Formulated by Codd in 1985, the rule ensures that the system does not depend on disparate, non-relational tools by mandating a unified, relational sublanguage, often exemplified by SQL, which integrates these functions.
The rule emphasizes a single, powerful language to promote uniformity across database tasks, supporting both interactive use (e.g., via command-line interfaces) and embedded use within host programming languages such as C or Java. This relational foundation, rooted in set theory and predicate logic, distinguishes it from the procedural or navigational languages used in earlier models such as CODASYL, ensuring that operations follow the relational paradigm's declarative nature. For instance, SQL provides Data Definition Language (DDL) commands such as CREATE TABLE for defining structures, Data Manipulation Language (DML) operations such as SELECT, INSERT, UPDATE, and DELETE for data handling, and Data Control Language (DCL) statements such as GRANT for authorization, all within the same syntax.
By requiring such comprehensiveness, Rule 5 improves developer productivity and system maintainability, since users can perform all database interactions without switching between multiple specialized languages or graphical tools that might bypass relational principles. This uniformity reduces complexity in application development and supports consistent enforcement of other rules, such as null value treatment from Rule 3 or catalog access from Rule 4, fostering robust, scalable RDBMS implementations.
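As a rough sketch of what "one language, expressed as character strings" looks like in practice, the following passes data definition, view definition, an integrity constraint, data manipulation, and transaction boundaries through the same interface as plain strings. It uses Python's standard sqlite3 module; the accounts schema is hypothetical, and authorization statements such as GRANT are omitted because SQLite does not implement them.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN and ROLLBACK statements below are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)

statements = [
    # Data definition with a declarative integrity constraint
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))",
    # View definition
    "CREATE VIEW rich_accounts AS SELECT id, balance FROM accounts WHERE balance > 1000",
    # Data manipulation
    "INSERT INTO accounts VALUES (1, 500), (2, 5000)",
    # Transaction boundaries
    "BEGIN",
    "UPDATE accounts SET balance = balance + 100 WHERE id = 1",
    "ROLLBACK",
]
for statement in statements:
    conn.execute(statement)

balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
rich_ids = [row[0] for row in conn.execute("SELECT id FROM rich_accounts")]
print(balance, rich_ids)
```

Because the update is rolled back, the balance of account 1 is unchanged afterwards.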

Rule 6: View Updating Rule

Rule 6, known as the View Updating Rule, requires that all views that are theoretically updatable (those whose definitions permit unambiguous translation of modifications back to the underlying base relations) also be updatable by the system, supporting insert, update, and delete operations. This ensures that the relational database management system (DBMS) treats such views like base tables for data manipulation purposes, without imposing artificial restrictions beyond the theoretical limits.
In the relational model, views are virtual tables derived from base relations via operators such as selection, projection, and equi-join, preserving the structure of relations while providing abstracted perspectives on the data. Updatability is assessed at view-definition time using algorithms such as VU-1 or stronger variants, which analyze the view's expression, base table declarations, and integrity constraints to determine properties such as tuple-insertibility, tuple-deletability, and component-updatability. Simple views, such as those based on a single base table with a selection (e.g., restricting to rows where age is greater than 30) or a projection that retains the primary key, are always theoretically updatable, because each view row maps to exactly one base row, so insertions, updates, and deletions can propagate directly. More complex views, such as those involving many-to-many joins or projections that omit primary keys, may fail updatability tests because of ambiguities such as "quads" (multiple contributing base rows), in which case the system records the restriction via catalog indicators (e.g., not tuple-insertible).
This rule reinforces the relational model's emphasis on abstraction and logical data independence by letting users modify data through customized views without direct access to base tables, thereby simplifying application development. For example, inserting a row into a view of employees over 30 must add a qualifying record to the base employee table, with the system handling the propagation if the view meets the updatability criteria. Together with the comprehensive data sublanguage of Rule 5, Rule 6 ensures that relational operators support full manipulative capabilities across both base relations and views.
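The "employees over 30" example can be sketched with Python's standard sqlite3 module. SQLite is a useful counterexample here: its views are read-only, so the translation from view insert to base-table insert that Rule 6 expects the system to derive has to be written by hand as an INSTEAD OF trigger. The schema, trigger name, and row values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute(
    "CREATE VIEW employees_over_30 AS "
    "SELECT emp_id, name, age FROM employees WHERE age > 30"
)

INSERT_VIA_VIEW = "INSERT INTO employees_over_30 VALUES (1, 'A. Example', 45)"

# Without help, SQLite rejects the insert because the target is a view.
try:
    conn.execute(INSERT_VIA_VIEW)
    direct_insert_accepted = True
except sqlite3.Error:
    direct_insert_accepted = False

# Hand-written translation of a view insert into a base-table insert.
conn.execute(
    """
    CREATE TRIGGER employees_over_30_insert
    INSTEAD OF INSERT ON employees_over_30
    BEGIN
        INSERT INTO employees (emp_id, name, age)
        VALUES (NEW.emp_id, NEW.name, NEW.age);
    END
    """
)
conn.execute(INSERT_VIA_VIEW)

base_rows = conn.execute("SELECT emp_id, name, age FROM employees").fetchall()
view_rows = conn.execute("SELECT emp_id, name, age FROM employees_over_30").fetchall()
print(direct_insert_accepted, base_rows, view_rows)
```

A fully compliant system would accept the first insert directly, since a single-table selection view is theoretically updatable. The trigger above also does not check the view's predicate, so it would accept a row with age 25 that then never appears in the view.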

Rule 7: High-Level Insert, Update, and Delete

Rule 7, known as the High-level Insert, Update, and Delete rule, stipulates that a relational database management system must support insert, update, and delete operations on a multiple-record-at-a-time basis, treating an entire base or derived relation as a single operand rather than processing tuples individually. This ensures that data manipulation can be specified declaratively at a high level, without low-level procedural code or record-by-record processing, in keeping with the relational model's emphasis on set-oriented operations.
The key concept is set-at-a-time processing, exemplified by SQL's relational-algebra-inspired operations, which modify multiple rows in a single statement and let the query optimizer minimize CPU and I/O overhead. For instance, the SQL statement UPDATE employees SET salary = salary * 1.1 WHERE department = 'Sales'; atomically adjusts salaries across all qualifying rows in the employees table, without procedural loops or explicit traversal. This approach not only simplifies user requests but also improves performance in distributed environments by reducing intersite communication costs.
By mandating such high-level capabilities, Rule 7 reinforces the declarative paradigm of the relational model, in which users specify what data to modify rather than how, building on the guaranteed access to individual values from Rule 2 while avoiding dependence on procedural code. This leads to more robust, scalable systems that handle bulk operations efficiently, a principle central to modern relational database implementations.
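A minimal sketch of the set-at-a-time update follows, using Python's standard sqlite3 module. The rows are hypothetical, and the 10% raise is written as integer arithmetic (salary * 110 / 100) instead of salary * 1.1 so that the stored results stay exact integers; that substitution is a choice made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, department TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Sales", 50000), (2, "Sales", 60000), (3, "Engineering", 70000)],
)

# One declarative statement updates every qualifying row; there is no loop
# over records in the application code.
cursor = conn.execute(
    "UPDATE employees SET salary = salary * 110 / 100 WHERE department = 'Sales'"
)
rows_changed = cursor.rowcount

salaries = conn.execute("SELECT emp_id, salary FROM employees ORDER BY emp_id").fetchall()
print(rows_changed, salaries)
```

The application states the condition and the new value once; which rows are visited, and in what order, is left to the DBMS.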

Rule 8: Physical Data Independence

Rule 8, known as Physical Data Independence, states that application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representations or access methods. This principle, articulated by E. F. Codd in his framework for evaluating relational database management systems (RDBMS), ensures that the physical aspects of data storage, such as file structures, hardware devices, and indexing techniques, are isolated from the logical structure of the data, which is represented as relations per the Information Rule. By maintaining this separation, the rule allows database administrators to optimize performance through physical modifications without altering the application code or user interfaces that interact with the logical views.
At its core, physical data independence distinguishes the physical layer, which handles how data is stored and accessed on the underlying media (e.g., disk files or memory allocations), from the logical layer, where data is organized into relations accessible via declarative queries. For instance, a database system might replace one index structure with another to speed up lookups on a particular attribute, or migrate data to a different storage device, without affecting the relational schema or the SQL statements used by applications. This decoupling is achieved through the DBMS's mapping mechanisms, which translate logical requests into physical operations transparently. Codd emphasized that true relational systems must fully support this isolation to avoid a common pitfall of non-relational systems, in which physical details leak into application logic.
A practical example illustrates the rule: consider a relational table storing employee records. If compression is added to the physical storage format to reduce disk usage, applications issuing SELECT queries on the table, such as retrieving employee details via joins, continue to work unchanged, because the DBMS handles the decompression internally.
This capability not only facilitates ongoing performance tuning but also improves system maintainability, since physical upgrades (e.g., adopting solid-state drives) can occur without the costly and error-prone task of rewriting application programs. Overall, Rule 8 promotes a robust architecture in which logical consistency and application reliability are preserved as the physical infrastructure evolves.
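One way to observe physical data independence is to change an access method and rerun an unchanged query. The sketch below uses Python's standard sqlite3 module and adds an index, which is a different physical change from the compression example in the text; the table, rows, and index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "A. Example", "Sales"), (2, "B. Example", "Engineering"), (3, "C. Example", "Sales")],
)

# The application's query text never changes.
QUERY = "SELECT name FROM employees WHERE department = 'Sales' ORDER BY name"

def run_query():
    return conn.execute(QUERY).fetchall()

def query_plan():
    # The last column of each EXPLAIN QUERY PLAN row describes the access path.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]

rows_before, plan_before = run_query(), query_plan()

# Physical change: add an access path. No application code is touched.
conn.execute("CREATE INDEX idx_employees_department ON employees (department)")

rows_after, plan_after = run_query(), query_plan()

print(rows_before == rows_after)
print(plan_before)
print(plan_after)
```

The result rows are identical before and after; only the plan reported by the engine may differ, and its exact wording depends on the SQLite version.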

Rule 9: Logical Data Independence

Rule 9, known as the Logical Data Independence rule, requires that application programs and terminal activities remain logically unimpaired whenever information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables. The rule emphasizes insulating applications from modifications to the logical structure of the base tables, provided that the information content and meaning of the data are preserved.
At its core, logical data independence provides a buffer between the external schema (how users perceive the data) and the conceptual schema (the logical structure of base tables), allowing database administrators to evolve the underlying logical design, for example by renaming tables, adding columns, or splitting tables, while maintaining transparent access for applications through mechanisms such as views. For instance, views can compensate for structural changes by joining or projecting data in a way that reproduces the original schema, thereby hiding the modifications from end users and software. This concept builds on the separation of levels in the three-schema architecture, in which changes at the conceptual level do not propagate to the external level as long as the semantics remain intact.
A practical example involves splitting a single "orders" base table, which originally contained columns for order ID, customer ID, order date, product ID, quantity, and price, into two tables: "order_headers" (with order ID, customer ID, and order date) and "order_details" (with order ID, product ID, quantity, and price), linked by a foreign key. To preserve logical data independence, a view named "orders" can be created that joins these tables and presents a structure identical to the original table; applications querying "orders" continue to function without modification, because the view handles the underlying split transparently.
The primary implication of Rule 9 is that it enables schema evolution in production environments without downtime, recoding of applications, or disruption of ongoing operations, fostering long-term maintainability and adaptability in relational systems. It differs from physical data independence (Rule 8), which addresses storage-level changes, by focusing solely on logical restructurings that affect table definitions rather than file organization or access paths.
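The table-splitting scenario can be sketched end to end with Python's standard sqlite3 module. In this sketch the header table is given the hypothetical name order_headers so that the compatibility view can take over the original name orders; the order_date column and all row values are likewise assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The application's query is written once, against the name "orders".
APP_QUERY = (
    "SELECT order_id, customer_id, product_id, quantity, price "
    "FROM orders ORDER BY order_id, product_id"
)

# Original design: a single base table with one row per order line.
conn.executescript(
    """
    CREATE TABLE orders (
        order_id INTEGER, customer_id INTEGER, order_date TEXT,
        product_id INTEGER, quantity INTEGER, price REAL,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO orders VALUES
        (1, 10, '2024-01-05', 100, 2, 9.5),
        (1, 10, '2024-01-05', 101, 1, 20.0),
        (2, 11, '2024-01-06', 100, 5, 9.5);
    """
)
rows_before = conn.execute(APP_QUERY).fetchall()

# Information-preserving restructuring: split the table, then define a view
# under the old name that joins the two parts back together.
conn.executescript(
    """
    CREATE TABLE order_headers (
        order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT
    );
    CREATE TABLE order_details (
        order_id INTEGER REFERENCES order_headers (order_id),
        product_id INTEGER, quantity INTEGER, price REAL,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO order_headers
        SELECT DISTINCT order_id, customer_id, order_date FROM orders;
    INSERT INTO order_details
        SELECT order_id, product_id, quantity, price FROM orders;
    DROP TABLE orders;
    CREATE VIEW orders AS
        SELECT h.order_id, h.customer_id, h.order_date,
               d.product_id, d.quantity, d.price
        FROM order_headers AS h
        JOIN order_details AS d ON d.order_id = h.order_id;
    """
)
rows_after = conn.execute(APP_QUERY).fetchall()
print(rows_before == rows_after)
```

Read-only applications are unaffected here. Writes through the view are a separate matter, which ties this rule to view updating (Rule 6).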

Rule 10: Integrity Independence

Rule 10, known as the Integrity Independence rule, requires that all integrity constraints specific to a relational database be definable in the relational data sublanguage and storable in the catalog, rather than within application programs. This ensures that the database's integrity mechanisms operate independently of the software applications that access it, allowing constraints to be modified without necessitating changes to external code.

The primary integrity constraints addressed by this rule are entity integrity and referential integrity, both foundational to the relational model. Entity integrity mandates that every component of a primary key be non-null and that the key be unique within its relation, preventing ambiguous or incomplete identification of rows. Referential integrity requires that the value of any foreign key in one relation either match the value of some primary key in the referenced relation or be null, thereby maintaining consistent relationships across the database without orphaned data. These constraints must be specified using the relational data sublanguage, for example through declarative data definition language (DDL) statements, and enforced automatically by the database management system (DBMS) during operations such as inserts, updates, and deletes, with storage in the online catalog as outlined in Rule 4.

For instance, in a schema for an employee database, entity integrity can be enforced by declaring emp_id as the primary key of the employees table, ensuring no null values or duplicates. Referential integrity might then be defined with a clause like FOREIGN KEY (dept_id) REFERENCES departments(dept_id), which the DBMS checks automatically to validate that any dept_id in employees corresponds to an existing row in departments. This DDL approach embeds the constraints directly in the schema, independent of any application logic.

By centralizing integrity constraints in the catalog, Rule 10 facilitates easier maintenance: modifications to business rules, such as tightening referential checks, can be applied once at the database level without recompiling or redeploying multiple applications. This independence enhances portability across different application environments and reduces the risk of integrity violations caused by overlooked code in disparate programs.
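The employee example above can be tried out with Python's built-in sqlite3 module. The following is a minimal sketch rather than a statement about any particular commercial product: SQLite is an assumed stand-in, the sample names are invented, the helper try_insert is hypothetical, and SQLite needs foreign-key enforcement switched on per connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default

# Constraints are declared once, in DDL, not in application code.
conn.executescript("""
    CREATE TABLE departments (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE employees (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER,
        FOREIGN KEY (dept_id) REFERENCES departments(dept_id)
    );
    INSERT INTO departments VALUES (1, 'Research');
    INSERT INTO employees VALUES (100, 'Ada', 1);
""")


def try_insert(params):
    """Hypothetical helper: attempt one INSERT and report the DBMS verdict."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO employees VALUES (?, ?, ?)", params)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


duplicate_key = try_insert((100, "Bob", 1))   # violates entity integrity
orphan_ref = try_insert((101, "Cy", 99))      # no department 99 exists
null_ref = try_insert((102, "Di", None))      # a null foreign key is permitted

# The constraint text lives in the catalog (sqlite_master in SQLite).
catalog_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'employees'"
).fetchone()[0]

print(duplicate_key)
print(orphan_ref)
print(null_ref)
print("FOREIGN KEY" in catalog_sql)
```

No application-side validation appears anywhere in the sketch; the rejections come from the DBMS itself, which is the separation Rule 10 asks for.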

Rule 11: Distribution Independence

Rule 11, known as the Distribution Independence rule, requires that the distribution of data across multiple sites or machines remain fully transparent to end users and applications of a relational database management system (RDBMS). According to E.F. Codd, "A relational DBMS has distribution independence. By this we mean that application programs and on-line terminal activities should continue to operate successfully, unchanged, when data previously stored at one site is relocated to another site or is replicated at several sites." This rule extends the principles of data independence outlined in Rules 8 and 9 by ensuring that the logical and physical aspects of distribution do not affect user interactions.

At its core, the rule accommodates techniques such as horizontal partitioning (dividing tables by rows across sites) and vertical partitioning (dividing tables by columns), allowing the system to manage large datasets efficiently without altering the logical schema or query interfaces. Queries and updates formulated in the relational language, such as SQL, remain valid and return the same results regardless of how the data is distributed, whether it is held centrally or spread across a network. This transparency is achieved through the DBMS's query optimizer and distribution mechanisms, which handle location resolution internally.

For instance, consider a query like SELECT * FROM customers WHERE country = 'USA';. In a compliant system, this executes identically whether the customers table resides on a single server or is sharded horizontally across multiple distributed servers, with the DBMS routing the request and aggregating the results.

The primary implication of Rule 11 is enhanced scalability and flexibility for enterprise environments, enabling the construction of large-scale, federated, or cloud-based databases without modifications to existing applications or user queries. This supports growth in data volume and geographic distribution, as seen in modern distributed RDBMS implementations, while preserving the illusion of a single logical database.
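The customers query above can be used to illustrate the idea on a small scale. The sketch below is a simulation under loud assumptions, not a distributed system: two attached in-memory SQLite databases (hypothetically named site_a and site_b) play the role of remote sites, rows are split by a made-up odd/even rule, and a hand-written temporary view does the routing that a compliant DBMS would perform internally.

```python
import sqlite3

# Invented sample data; USA customers end up on both simulated sites.
ROWS = [
    (1, "Ana", "USA"),
    (2, "Bo", "Canada"),
    (3, "Cy", "Mexico"),
    (4, "Di", "USA"),
]

# The application's query. It is identical for both layouts.
APP_QUERY = "SELECT id, name, country FROM customers WHERE country = 'USA' ORDER BY id"


def single_site():
    """All rows in one table on one 'server'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", ROWS)
    return conn


def two_sites():
    """Rows split horizontally across two attached databases."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS site_a")
    conn.execute("ATTACH DATABASE ':memory:' AS site_b")
    for site in ("site_a", "site_b"):
        conn.execute(
            f"CREATE TABLE {site}.customers_part "
            "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
        )
    for row in ROWS:
        site = "site_a" if row[0] % 2 else "site_b"  # odd ids -> site_a, even ids -> site_b
        conn.execute(f"INSERT INTO {site}.customers_part VALUES (?, ?, ?)", row)
    # A TEMP view may span attached databases; it presents one logical table.
    conn.execute("""
        CREATE TEMP VIEW customers AS
            SELECT id, name, country FROM site_a.customers_part
            UNION ALL
            SELECT id, name, country FROM site_b.customers_part
    """)
    return conn


central_conn = single_site()
sharded_conn = two_sites()
central_rows = central_conn.execute(APP_QUERY).fetchall()
sharded_rows = sharded_conn.execute(APP_QUERY).fetchall()
print(central_rows == sharded_rows)
```

In a system that satisfies Rule 11, the equivalent of the view and the placement rule would be maintained by the DBMS, so that relocating or replicating data requires no change to APP_QUERY.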

Rule 12: The Nonsubversion Rule

Rule 12, known as the Nonsubversion Rule, stipulates that if a relational database management system (RDBMS) provides a low-level language or interface (such as one operating on a single record at a time), that low-level mechanism must not allow users to subvert or bypass the integrity rules, constraints, and protections enforced by the higher-level relational language, which handles multiple records at a time. This rule ensures that all forms of access to the database adhere to the relational model's safeguards, preventing any procedural or record-oriented operations from undermining the system's overall integrity.

The core concept behind this rule is to eliminate potential "back doors" into the database, such as direct file input/output or low-level procedural calls, that could circumvent relational semantics such as integrity and constraint enforcement. For instance, even if the system supports non-relational languages for data manipulation tasks, there must be rigorous proof that these cannot violate constraints defined in the relational language and cataloged within the system. By mandating enforcement of high-level relational protections across all interfaces, the rule promotes uniform application of database rules, safeguarding against inconsistencies that might arise from mixed access methods.

A practical example illustrates this principle: suppose an RDBMS exposes a procedural API designed for record-level updates. Under Rule 12, this API must still enforce primary key constraints, preventing duplicate insertions or null violations that would be impossible through high-level relational operations such as SQL INSERT statements. Such enforcement ensures that developers cannot inadvertently or maliciously introduce data anomalies by exploiting low-level access, maintaining the database's reliability regardless of the interface used.

This rule reinforces the foundational principles of the relational model, particularly Rule 0, by guaranteeing that the system's relational capabilities are not compromised by supplementary features, thereby discouraging vendors from implementing non-relational shortcuts that could erode user trust in the DBMS's relational compliance. Ultimately, it upholds the integrity guarantees outlined in prior rules, ensuring a cohesive relational environment.
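The property described above can be illustrated, though not proven, with a small sqlite3 sketch. Everything here is an assumption made for illustration: the accounts table, the sample addresses, and the two helper functions, where update_record plays the part of a hypothetical record-at-a-time API and update_set the part of a set-oriented statement. Both paths go through the same engine, so both are checked against the same declared constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    INSERT INTO accounts VALUES (1, 'a@example.com');
    INSERT INTO accounts VALUES (2, 'b@example.com');
""")


def update_record(record_id, email):
    """Hypothetical record-at-a-time API: changes exactly one row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("UPDATE accounts SET email = ? WHERE id = ?", (email, record_id))
        return True
    except sqlite3.IntegrityError:
        return False


def update_set(email):
    """Set-at-a-time path: a single statement touching every row."""
    try:
        with conn:
            conn.execute("UPDATE accounts SET email = ?", (email,))
        return True
    except sqlite3.IntegrityError:
        return False


low_level_duplicate = update_record(2, "a@example.com")  # would break UNIQUE
low_level_null = update_record(2, None)                  # would break NOT NULL
high_level_duplicate = update_set("same@example.com")    # would break UNIQUE
low_level_valid = update_record(2, "c@example.com")      # a legal change

final_rows = conn.execute("SELECT id, email FROM accounts ORDER BY id").fetchall()
print(low_level_duplicate, low_level_null, high_level_duplicate, low_level_valid)
print(final_rows)
```

A subversion in the sense of Rule 12 would be any interface through which one of the first three calls succeeded, for example a bulk loader or raw file access that skipped the constraint checks.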

Significance and Impact

Adoption and Compliance

In the 1980s and 1990s, major database vendors, including IBM with its DB2 product (released in 1983) and Sybase, aggressively marketed their systems as relational, often invoking Codd's framework to differentiate them from legacy hierarchical and network models such as IBM's IMS. However, many implementations offered only partial adherence: vendors repackaged existing technologies and made unsubstantiated claims of full relational capability, for example supporting limited features such as basic theta-select operations while ignoring broader requirements of the model. Codd himself contributed to this landscape after leaving IBM in 1984 by founding two consulting firms in 1985 dedicated to evaluating and advising on relational products, influencing vendor designs through education on relational principles.

The ANSI SQL standards of 1986 (SQL-86) and 1992 (SQL-92) directly incorporated core elements of Codd's relational model, such as tabular data representation and declarative query access, establishing a vendor-neutral foundation that propelled widespread adoption while addressing early systems' inconsistencies. Early relational prototypes like Ingres, developed in the 1970s at UC Berkeley and commercialized in the 1980s, demonstrated partial compliance by supporting key features such as relational operations and declarative querying but fell short on the advanced independence rules due to practical constraints. These standards and systems helped embed Codd's vision into industry practice, though full adherence remained elusive.

Later analyses by C.J. Date and Hugh Darwen found that most commercial systems readily passed Rules 1 through 5, which emphasize foundational aspects such as information representation and guaranteed access, but that compliance dropped sharply for Rules 6 through 12, particularly view updating (Rule 6) and distribution independence (Rule 11), owing to SQL's deviations from the model such as duplicate rows and weak constraint support. Rule 12, the nonsubversion rule, posed ongoing challenges because low-level interfaces such as cursors or procedural extensions in products like SQL Server often allowed relational constraints to be bypassed, undermining the model's integrity guarantees. Codd's rules also informed the SQL:1999 extensions, including enhanced semantic integrity mechanisms such as assertions and triggers, which aimed to align better with integrity independence (Rule 10) while accommodating object-relational features.

Overall, no database management system achieved complete compliance with all 12 rules, as practical trade-offs in performance and legacy integration favored partial implementations over theoretical purity. The pattern is evident in vendor products that prioritized performance over strict adherence to higher-level rules such as logical and physical data independence.

Criticisms and Limitations

Codd's 12 rules have been critiqued for their overly strict nature, which often prioritizes theoretical purity over practical implementation in real-world database systems. For instance, Rule 6, the View Updating Rule, requires that all theoretically updatable views be updatable by the system, but this proves impractical for complex views involving joins or aggregations, because determining an unambiguous update becomes computationally intensive and error-prone without additional user intervention. Critics argue that such rigidity ignores essential performance trade-offs, where full compliance could degrade query efficiency or increase development costs without proportional benefits in usability. Additionally, Rule 0, the Foundation Rule, enforces a purist relational approach that resists integration with non-relational elements, making it challenging for hybrid systems that blend structured and unstructured data.

The rules also exhibit significant limitations stemming from their 1985 origins, which predate the rise of object-oriented databases and NoSQL paradigms that handle hierarchical or unstructured data more flexibly. They provide minimal guidance on security beyond the basic authorizations in Rule 5's comprehensive data sublanguage, overlooking concerns such as encryption, access controls for distributed environments, and data protection regulations that became prominent later. Furthermore, the framework does not account for scalability challenges, such as handling petabyte-scale volumes or real-time processing, which require distributed architectures beyond the scope of the rules on logical and physical independence.

Debates surrounding the rules often center on interpretations of compliance, with Codd himself clashing with vendors who exaggerated adherence for marketing purposes, leading to systems that emulated relational features without full integrity enforcement. C.J. Date and David McGoveran advocated stricter interpretations, emphasizing that partial compliance undermines the relational model's foundational logic and arguing that many commercial DBMSs fail on rules such as distribution independence (Rule 11) because of proprietary optimizations. In practice, widespread non-compliance persists, as vendors prioritize usability and performance over exhaustive rule adherence, for example by skipping full view updatability in complex scenarios.

Conceptually, the rules are viewed as aspirational guidelines rather than mandatory standards, serving as benchmarks to guide development but not as exhaustive criteria for relational fidelity. Codd later expanded on them in his Relational Model Version 2 (RM/V2), introducing over 300 features to address gaps in structure, integrity, and manipulation and evolving the original rules into a broader, more nuanced framework.

Modern Perspectives

Current DBMS Compliance

Contemporary relational database management systems (RDBMS), including PostgreSQL and Oracle 26ai (as of 2025), exhibit strong adherence to Codd's foundational Rules 0 through 5, which cover the core relational structure, guaranteed access via tables and keys, systematic null value handling, active catalogs, and a comprehensive data sublanguage. These systems represent all data logically in tables and support declarative query languages such as SQL for manipulation, fulfilling the information rule and guaranteed access without reliance on physical storage details.

Challenges arise with the more advanced rules, particularly Rule 6 on view updating, where support remains partial across major RDBMS. In PostgreSQL, for example, simple views based on a single table are updatable, but complex views involving joins or aggregates require custom rules or triggers for insert, update, or delete operations, limiting full compliance. Oracle similarly restricts updates through join views to key-preserved tables and excludes views with subqueries or aggregates; in 26ai, joined or grouped views outside those limits are updatable only through INSTEAD OF triggers. Some systems that comply with most rules are also criticized for partially subverting Rule 12, the nonsubversion rule, through proprietary procedural language extensions that allow low-level record access bypassing relational interfaces.

Rule 11, distribution independence, shows varied implementation in modern systems and is often enhanced by cloud architectures. Sharding transparency differs: on-premises setups such as PostgreSQL with Citus require application-level awareness for distributed queries, while cloud services abstract this complexity. Amazon Aurora, for instance, approaches distribution independence by automatically managing replication across up to 15 read replicas and a shared storage layer of up to 256 TiB (as of 2025), presenting a unified database interface to users without exposing partitioning details.

At the standards level, ISO/IEC SQL:2023 aligns closely with Codd's relational principles, incorporating features such as JSON support, enhanced temporal support, and property graph queries while maintaining declarative integrity constraints. Non-relational document databases fail Rule 0 outright, as they manage data in document stores rather than relational tables, though they incorporate concepts such as indexing inspired by relational practice. Projects such as Rel, built on Tutorial D, pursue full compliance by enforcing strict relational operations without procedural deviations, serving as a reference point for theoretical adherence. Analyses from the 2020s indicate partial compliance in commercial RDBMS for core functionality, with ongoing improvements in standards driving better support for the independence rules.
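The INSTEAD OF trigger technique mentioned above for join views can be sketched with Python's built-in sqlite3 module. SQLite is used here purely as an assumed stand-in: unlike PostgreSQL or Oracle it treats every view as read-only unless such a trigger exists, so the first attempt fails even though only one base table is affected. Table, view, and trigger names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        quantity    INTEGER NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ana');
    INSERT INTO orders VALUES (10, 1, 3);

    -- A join view: which base table an update should touch is not self-evident.
    CREATE VIEW order_summary AS
        SELECT o.order_id, c.name AS customer_name, o.quantity
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id;
""")

UPDATE_VIEW = "UPDATE order_summary SET quantity = 7 WHERE order_id = 10"

# Without a trigger, SQLite refuses to modify the view.
try:
    conn.execute(UPDATE_VIEW)
    updatable_without_trigger = True
except sqlite3.OperationalError:
    updatable_without_trigger = False

# The trigger spells out how an update on the view maps to a base table.
conn.execute("""
    CREATE TRIGGER order_summary_update
    INSTEAD OF UPDATE ON order_summary
    BEGIN
        UPDATE orders SET quantity = NEW.quantity WHERE order_id = OLD.order_id;
    END
""")
conn.execute(UPDATE_VIEW)
conn.commit()

new_quantity = conn.execute(
    "SELECT quantity FROM orders WHERE order_id = 10"
).fetchone()[0]
print(updatable_without_trigger, new_quantity)
```

The mapping from view update to base-table update is written by hand here, which is exactly the gap relative to Rule 6: a fully compliant system would derive it automatically for every theoretically updatable view.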

Influence on Database Standards

Codd's 12 rules profoundly influenced the formulation of the ANSI/ISO SQL standard beginning in 1986, embedding core relational principles such as the information rule (Rule 1), which mandates that all information be represented explicitly as values in tables; the guaranteed access rule (Rule 2), ensuring logical accessibility via table, row, and column identifiers; the comprehensive data sublanguage rule (Rule 5), which SQL addresses through its declarative query capabilities; and the integrity independence rule (Rule 10), supported by SQL's constraint mechanisms for defining and enforcing integrity independently of application code. These elements provided a rigorous benchmark for relational compliance, guiding the evolution of SQL revisions and encouraging standardized support for relational operations across vendors.

Beyond SQL, the rules are frequently associated with the ACID properties of database transactions, particularly through Rule 10's focus on integrity constraints, which relates to the consistency requirement and supports reliable data manipulation in concurrent environments. They also complemented Codd's earlier normalization theory, forming the core of database design teachings that emphasize anomaly prevention and redundancy reduction in relational schemas.

The broader legacy of the rules extends to application programming interfaces such as ODBC and JDBC, which reflect the data independence principles (Rules 8 and 9) by allowing applications to interact with relational data sources without regard to underlying storage. In NewSQL systems such as CockroachDB, Rule 11's distribution independence is largely realized, as the architecture transparently manages data sharding and replication across nodes and presents a unified SQL interface to users. As an educational staple, the rules are routinely covered in database curricula to illustrate relational fundamentals and system evaluation criteria. They also appear in IEEE standards and publications, where they are cited as benchmarks for database design and development.

Codd's principles of data integrity are echoed in GDPR's clauses on maintaining the accuracy, completeness, and reliability of personal data, reinforcing relational safeguards against corruption. Extensions to semi-structured data are evident in standards such as SQL:2023, which incorporates JSON querying and property graph support while preserving relational access rules for hybrid environments. The enduring emphasis on data independence, spanning physical (Rule 8), logical (Rule 9), and distribution (Rule 11) aspects, continues to shape database principles, with adaptations in big data frameworks such as Spark SQL, which emulates relational rules for scalable, declarative querying over distributed datasets.
