Relational calculus
Relational calculus is a declarative, non-procedural query language for relational databases, founded on first-order predicate logic, that allows users to specify the desired data retrieval conditions without detailing the computational steps involved.[1] Introduced by Edgar F. Codd in 1972, it serves as a theoretical basis for expressing queries on collections of relations.[2] The language comprises two primary variants: tuple relational calculus (TRC), which uses tuple variables ranging over entire rows of relations, and domain relational calculus (DRC), which employs variables ranging over individual attribute values or domains.[3] Both variants leverage logical connectives such as conjunction (∧), disjunction (∨), negation (¬), and quantifiers (∃, ∀) to formulate queries as well-formed formulas, akin to those in applied predicate calculus.[2] For instance, TRC queries often take the form {t | P(t)}, where t is a tuple variable and P(t) is a formula defining the condition for inclusion.[3] A key property of relational calculus is its relational completeness, meaning it can express any query definable using relational algebra operations like selection, projection, join, and union, ensuring equivalent expressive power between the declarative calculus and procedural algebra approaches.[2] This equivalence underpins the theoretical foundations of modern database query languages, including SQL, while emphasizing data independence and logical query specification.[1] Safety conditions are imposed on queries to ensure finite results and domain independence, restricting formulas to those translatable into finite relational algebra expressions.[1][4]Overview
Definition and Purpose
Relational calculus is a non-procedural, logic-based formalism for querying relational databases, derived from first-order predicate logic.[2][5] It enables the expression of queries through logical statements that define the properties of the desired output, without specifying the computational steps required to obtain it.[2][6] The primary purpose of relational calculus is to specify what data to retrieve from a database, contrasting sharply with procedural languages that dictate how the data should be processed.[2][1] In this framework, queries are formulated as logical formulas that, when evaluated against the database relations, produce the set of tuples or values satisfying the given conditions.[5][6] Introduced by Edgar F. Codd in 1970 as a foundational component of the relational model, relational calculus provides a declarative approach to formalize data retrieval operations.[7] Its two primary variants—tuple relational calculus and domain relational calculus—offer different ways to structure these logical expressions.[6][1]Historical Development
The origins of relational calculus trace back to Edgar F. Codd's foundational 1970 paper, where he proposed it as a declarative formalism for querying relational databases, alongside relational algebra, to enable data independence and logical data manipulation within the relational model.[7] This work built on earlier logical foundations, particularly Alfred Tarski's 1941 development of the calculus of relations, which provided the semantic basis for handling binary relations in a first-order predicate framework that influenced Codd's approach to relational semantics.[8] In 1971, Codd extended the concept through his ALPHA sublanguage, explicitly founding it on relational calculus to support precise query expressions using tuple variables and predicates, marking a key step in formalizing non-procedural database operations. The 1970s saw further development, with tuple relational calculus (TRC) refined by Codd and collaborators as a variable-based notation for specifying sets of tuples satisfying conditions. Domain relational calculus (DRC) was formalized later in the decade, notably in Codd's 1972 paper on relational completeness, which demonstrated the equivalence of calculus and algebra in expressive power and introduced criteria for evaluating query language completeness.[2] Raymond Reiter contributed significantly in the late 1970s by formalizing deductive query mechanisms on relational data, integrating logical inference with calculus to handle complex reasoning over databases.[9] A major milestone occurred in 1977 when Moshe Zloof integrated relational calculus principles into Query By Example (QBE), a user-friendly, skeleton-table-based language that translated visual queries into calculus expressions, bridging theory and practical implementation at IBM.[10] During the 1980s, relational calculus played a pivotal role in the standardization of SQL, as the language's declarative syntax drew from calculus semantics to ensure relational completeness, culminating in ANSI's adoption of SQL in 1986 and ISO's in 1987, which solidified its influence on commercial database systems. Over time, relational calculus evolved from a theoretical tool for proving query equivalence into the cornerstone of modern declarative query paradigms, underpinning languages like SQL and enabling scalable, logic-based data retrieval in diverse applications.Types
Tuple Relational Calculus
Tuple relational calculus (TRC) is a non-procedural query specification language for relational databases, grounded in first-order predicate logic and utilizing tuple variables that range over complete tuples within database relations. Formally introduced by Edgar F. Codd in 1972 as part of the relational model, TRC enables the declarative expression of queries by defining the properties that output tuples must satisfy, abstracting away the procedural steps for data retrieval.[2] The core structure of a TRC query is the expression { T \mid P(T) }, where T denotes a free tuple variable representing the result tuples, and P(T) is a well-formed formula constructed from atomic predicates, logical connectives, and quantifiers, potentially involving additional bound tuple variables. Tuple variables in P(T) are either free, meaning they appear unbound and correspond to output components, or bound within the scope of quantifiers to restrict their range over specific relations.[2][1] Atomic formulas form the building blocks of P(T) and include range restrictions such as T \in [r](/page/R), specifying that the tuple variable T ranges over the relation named [r](/page/R), as well as comparison predicates like T.A \theta [c](/page/C) (where A is an attribute of T, \theta is a comparison operator such as =, <, >, \leq, or \geq, and c is a constant value) or T.A \theta S.B (where S is another tuple variable and B is an attribute of S).[2][1] Well-formed formulas are defined recursively: any atomic formula qualifies as a basic well-formed formula; if \phi is a well-formed formula, then so is its negation \neg \phi; if \phi_1 and \phi_2 are well-formed formulas, then \phi_1 \land \phi_2 and \phi_1 \lor \phi_2 are well-formed formulas; and if \phi contains a free tuple variable U, then the quantified formulas \exists U (\phi) (existential) and \forall U (\phi) (universal) are well-formed, with variable renaming required to prevent name conflicts across scopes (e.g., using distinct identifiers like U_1 and U_2).[2] In contrast to domain relational calculus, which employs variables that range over individual attribute domains, tuple relational calculus operates on tuples as indivisible units, fostering more intuitive formulations for queries centered on row-level conditions while potentially requiring more verbose syntax for fine-grained attribute manipulations.[1]Domain Relational Calculus
Domain relational calculus (DRC) is a variant of relational calculus that employs domain variables to represent individual attribute values, treating relations as sets of tuples. Introduced by Michel Lacroix and Alain Pirotte in 1977,[11] it serves as a declarative query language for relational databases, emphasizing the domains underlying attribute values to construct queries via logical predicates. Unlike tuple relational calculus, DRC focuses on attribute-level variables to define the conditions for query results. The core structure of a DRC formula takes the form \{ \langle x_1, x_2, \dots, x_n \rangle \mid \phi(x_1, \dots, x_n) \}, where x_1, \dots, x_n are domain variables ranging over the possible values in the attribute domains, and \phi is a well-formed formula built from atomic predicates and logical connectives.[12] These domain variables are implicitly typed based on the schemas of the relations in the database, ensuring that each x_i corresponds to values from a specific domain. The resulting relation consists of all ordered tuples satisfying \phi, providing a set-based output independent of the query's evaluation order. Atomic formulas in DRC form the building blocks of \phi and include comparisons such as x_i \, \theta \, c (where \theta is an operator like = or <, and c is a constant) or x_i \, \theta \, x_j (comparing two domain variables from compatible domains), as well as membership tests like (x_1, \dots, x_k) \in R, which verifies if the specified values form a tuple in the relation R or its projection.[12] Complex formulas are constructed by nesting these atomics using logical operators (negation, conjunction, disjunction) and quantifiers. Quantifiers in DRC operate on domain variables to express existential or universal conditions over attribute values; for instance, \exists x \, P(x) holds if there exists at least one value in the domain of x making the predicate P true, while \forall x \, P(x) requires P to hold for all such values.[12] This mechanism allows DRC to capture dependencies and constraints at the attribute level, mirroring the quantifier usage in first-order logic but tailored to relational domains. DRC provides a compact syntax for queries that manipulate individual attributes directly, achieving expressive completeness equivalent to tuple relational calculus and relational algebra through its ability to denote any relation definable by these formalisms.[12] In formal notation, the emphasis on ordered tuples in the output ensures precise mapping to relation schemas, with semantics defined denotationally to map formulas to subsets of Cartesian products over typed domains.[12]Syntax
Basic Components
Relational calculus, as a declarative query language for relational databases, relies on a syntax grounded in first-order predicate logic, with foundational elements that enable the expression of queries through logical formulas.[2] The core components include variables, constants, relation names, attribute names, and comparison operators, which form the building blocks for constructing predicates and formulas.[2] Variables in relational calculus are of two types: free variables, which remain unbound and correspond to the attributes of the query's result relation, and bound variables, which are captured by quantifiers to restrict their scope within the formula.[13] Constants represent specific domain values, such as individual atomic values, while relation names denote the database relations (e.g., R), and attribute names or indices (e.g., r) specify components within tuples.[2] Comparison operators, collectively denoted as \theta, include equality (=), inequality (\neq), less than (<), greater than (>), less than or equal to (\leq), and greater than or equal to (\geq), used to compare variables, constants, or attribute values.[2] Logical connectives provide the means to combine simpler formulas into more complex ones, following standard precedence rules where negation (\neg) has the highest precedence, followed by conjunction (\wedge) and disjunction (\vee), with parentheses used to override these.[13] These connectives—\wedge for logical and, \vee for logical or, and \neg for logical not—allow the recursive assembly of well-formed formulas (wffs) from atomic predicates.[2] Quantifiers play a crucial role in binding variables and expressing existential or universal conditions over relations. The existential quantifier \exists asserts that there exists at least one value for the bound variable satisfying the formula, while the universal quantifier \forall requires that all possible values for the bound variable satisfy it.[13] In relational calculus, quantifiers are often range-coupled, such as \exists r or \forall r, where the variable r ranges over tuples in a specified relation, thereby defining the domain for the bound variable.[2] Atomic predicates, the simplest well-formed formulas, consist of equality or comparison statements between terms, where terms can be variables, constants, or attribute references. Examples include relation membership predicates like P(r), indicating that tuple variable r belongs to relation P, or comparison predicates like r \theta s, comparing the i-th component of tuple r with the j-th component of tuple s, or r \theta a, comparing a tuple component with a constant a.[2] These atomic forms evaluate to true or false based on the database instance. Variable scoping ensures that free variables project the attributes of the output relation, while bound variables, restricted by quantifiers, do not appear in the result and are used only to impose conditions.[13] The scope of a quantifier extends over the formula in which it appears, binding all occurrences of the variable within that subformula, with outer quantifiers not affecting inner free variables.[2] Well-formed formulas in relational calculus are constructed recursively from atomic predicates using logical connectives and quantifiers. Specifically, a wff is either an atomic predicate, the negation of a wff, the conjunction or disjunction of two wffs, or a quantified formula \exists v (\phi) or \forall v (\phi), where v is a variable and \phi is a wff containing free occurrences of v.[13] This recursive definition ensures that all formulas are syntactically valid and interpretable within the relational model.[2]Formula Construction
Relational calculus formulas are constructed recursively, beginning with atomic formulas and building up through logical connectives and quantifiers to form well-formed expressions that specify queries over relational databases. Atomic formulas serve as the foundation and include membership in a relation, such as R(t) where t is a tuple variable ranging over relation R, and comparison predicates like t \theta u or t \theta c, where \theta is a comparison operator (e.g., =, <), i and j are attribute positions, u is another tuple variable, and c is a constant.[14][2] These atomic elements are combined using connectives—conjunction (\wedge), disjunction (\vee), and negation (\neg)—to form compound formulas, such as P \wedge Q where P and Q are existing formulas.[2] Quantifiers then extend the structure: the existential quantifier \exists x \, \phi(x) asserts that there exists a value for variable x making subformula \phi true, while the universal quantifier \forall x \, \phi(x) requires \phi to hold for all values of x in its domain; these are applied only to formulas with free occurrences of x.[2][14] A complete formula, often called a range formula or alpha expression, takes the form \{ \mathbf{t} \mid \phi(\mathbf{t}) \}, where \mathbf{t} specifies the output tuple (possibly projecting specific attributes, e.g., (t{{grok:render&&&type=render_inline_citation&&&citation_id=1&&&citation_type=wikipedia}}, t{{grok:render&&&type=render_inline_citation&&&citation_id=2&&&citation_type=wikipedia}})) and \phi is a well-formed formula restricting the range, such as R(t) \wedge condition.[2] In tuple relational calculus, \mathbf{t} is a free tuple variable, while in domain relational calculus, it consists of free domain variables (one per output attribute).[14] The output specification is determined by the free variables in \phi: their arity defines the result relation's degree, and their associations map to attributes from the underlying schema.[14] Domain restrictions are incorporated via atomic membership predicates, e.g., \{ T \mid T \in \text{Employees} \wedge \phi(T) \}, ensuring variables range only over relevant tuples or values.[14] To maintain structural integrity during construction, especially in nested scopes, a renaming mechanism is employed, systematically relabeling bound variables (e.g., from x to y) to prevent capture, where a free variable in an outer formula becomes inadvertently bound by an inner quantifier.[2] Well-formedness rules enforce this recursively: a formula is well-formed if it is atomic, the negation of a well-formed formula, a conjunction or disjunction of well-formed formulas, or a quantified well-formed subformula with no free variables in the scope of the quantifier beyond the bound one; all variables must be declared via quantifiers or free in the output, with no undeclared free variables permitted.[2][14] Common construction patterns leverage these elements to mimic relational operations. For selection, conditions are applied directly to the tuple variables, e.g., R(t) \wedge t = c within the query formula \{ t \mid \dots \}, to retrieve tuples satisfying the condition. Projection arises naturally from free variables, where only specified components of t appear in the output tuple. Joins are formed by correlating variables across relations via comparison predicates, e.g., R(t) \wedge S(u) \wedge t = u, binding multiple ranges into a single formula.[2] These patterns ensure formulas remain declarative, specifying what data satisfies the criteria without detailing how to retrieve it.[14]Semantics
Interpretation of Expressions
The semantics of relational calculus are model-theoretic, interpreting formulas as statements in first-order logic over a database structure consisting of a domain (a set of constant values) and relations interpreted as predicates on that domain.[15] A database instance provides a finite interpretation where the domain is typically restricted to the active domain—the set of all values appearing in the relations of the instance—to ensure computability and finiteness.[15] The evaluation of a relational calculus formula φ with free variables X₁, ..., Xₖ proceeds by defining the result as the set of all tuples t over the active domain such that the database instance satisfies φ under the valuation that assigns t to the free variables: { t | database ⊨ φ[t/X₁, ..., Xₖ] }, where ⊨ denotes logical satisfaction.[15] Satisfaction is defined recursively: for atomic formulas, such as a relation atom R(t) where t is a tuple term, it holds if the valuation of t is an element of the interpreted relation R; equality atoms t = c hold if the valuation matches the constant c from the active domain.[15] Boolean connectives follow standard truth tables—conjunction ∧ holds if both subformulas are satisfied, disjunction ∨ if at least one is, and negation ¬ if the subformula is not—while quantifiers are evaluated over the active domain: existential ∃Y φ holds if there exists some value in the domain assigned to Y making φ true, and universal ∀Y φ if every possible assignment to Y satisfies φ.[15] The resulting relation from evaluating φ consists of the tuples assigned to the free variables that satisfy the formula, with the schema preserving the types of those variables (e.g., attribute names and domains from the database schema).[15] For safe formulas—those adhering to syntactic restrictions ensuring domain independence—the evaluation yields a deterministic, finite output relation independent of the underlying full domain, relying solely on the active domain to avoid non-determinism or infinite results.[15]Query Safety
In relational calculus, safe queries are defined as those whose results are finite and depend solely on the active domain of the database instance, which comprises all constants explicitly appearing in the relations, thereby preventing infinite outputs or results influenced by extraneous domain elements.[16] This domain independence ensures that query evaluation yields consistent, well-defined results across different but equivalent database interpretations.[5] For tuple relational calculus (TRC), safety is typically ensured through range-restricted formulas, where every variable—whether free or bound by existential (∃) or universal (∀) quantifiers—must appear positively in at least one relational literal, such as t \in R, without being restricted solely by negated or quantified conditions.[17] This syntactic condition confines variable bindings to the active domain, avoiding quantification over the potentially infinite set of all possible values. Domain relational calculus (DRC) employs a similar safety criterion, requiring that all quantified variables occur in some atomic membership formula of the form \langle x_1, \dots, x_k \rangle \in R, where R is a relation in the database schema, thus anchoring variables to database constants.[17] Violations of these rules in either TRC or DRC can lead to unsafe queries that produce infinite tuples by ranging over non-database elements or, in some cases, empty results due to over-restrictive negations. The implications of query safety are profound for practical database systems: unsafe queries risk undecidable or computationally infeasible evaluations, whereas safe ones guarantee finite outputs and preserve decidability, making them suitable for implementation in query processors.[18] Safety also aligns relational calculus with relational algebra, as every safe calculus query is equivalent to a safe algebra expression, facilitating translation and optimization.[5] Testing for safety involves syntactic verification, such as confirming the absence of range variables under negation and ensuring all variables meet the range-restriction criteria; these checks are decidable and can transform formulas into an equivalent relational algebra normal form for evaluation.[17] The concept of query safety, emphasizing domain independence for practical use, emerged in the early 1980s through foundational work on evaluable and allowed formula classes in relational calculus.[17]Equivalence and Properties
Relation to Relational Algebra
Relational calculus and relational algebra are theoretically connected through their expressive equivalence in the safe fragment, as established by Codd's theorem. This theorem states that the safe relational calculus—restricted to domain-independent queries—possesses the same expressive power as relational algebra, meaning any query expressible in one can be formulated in the other. The proof relies on constructive translations between the two formalisms, demonstrating relational completeness for database sublanguages.[2][15] A key aspect of this equivalence is the translation of relational calculus formulas into equivalent relational algebra expressions. This process typically proceeds by induction on the structure of the calculus formula, converting atomic formulas, logical connectives, and quantifiers into corresponding algebra operations. For instance, an existential quantification such as \exists T (T \in R \land P(T)) translates to a selection \sigma_{P}(R), where P is a predicate on tuples from relation R. More complex formulas involving multiple relations map to joins, while free variables determine projections to yield the result relation.[19][15] The mapping of components further illustrates this correspondence: atomic membership formulas like T \in R correspond to base relations or selections; conjunctions and disjunctions map to joins and unions, respectively; existential quantifiers align with projections after selections; and universal quantifiers can be expressed using set differences over the active domain. Conversely, the inverse translation from algebra to calculus expresses operators such as selection \sigma_c(E) as \phi_E \land c, projection \pi_{A_1, \dots, A_k}(E) as \exists X_1 \dots \exists X_m . \phi_E (quantifying over non-projected attributes), join E_1 \Join E_2 as \phi_{E_1} \land \phi_{E_2} with compatible variables, and union E_1 \cup E_2 as \phi_{E_1} \lor \phi_{E_2}. These bidirectional mappings ensure semantic equivalence within the safe fragment.[19][15] This theoretical equivalence has significant practical implications for database systems. It enables query optimizers to translate declarative calculus-style queries into procedural algebra expressions for efficient execution, leveraging algebraic rewrite rules like commutativity of selections and projections. Languages like SQL draw from both paradigms: its declarative syntax is inspired by relational calculus, while underlying engines use algebra for optimization, making SQL a hybrid that achieves relational completeness.[15] Outside the safe fragment, however, the full relational calculus is more expressive than relational algebra, as it can formulate queries that produce infinite or undefined results over finite databases, such as those involving unrestricted negation or universal quantification without domain bounds. Such queries are generally undecidable, with problems like safety and equivalence checking reducing to undecidable problems in first-order logic. This limits the full calculus to theoretical analysis, while practical systems restrict to the safe, equivalent fragment.[15]Expressive Completeness
The expressive completeness of relational calculus is formalized by Codd's theorem, which states that the domain-independent (or safe) fragment of relational calculus has the same expressive power as relational algebra: every safe relational calculus query can be equivalently expressed as a relational algebra expression, and conversely, every relational algebra query can be equivalently expressed as a safe relational calculus formula.[2] This equivalence establishes relational completeness as the baseline for query languages in the relational model, ensuring that both formalisms capture the full scope of typeless, first-order queries over relations without domain dependencies that could lead to infinite or undefined results.[2] The proof of this theorem demonstrates mutual inclusion. One direction shows that relational calculus simulates relational algebra by encoding its core operators—such as selection (via atomic formulas), projection (via existential quantification), join (via conjunction), union (via disjunction), and difference (via negation with universal quantification)—directly into logical formulas. The converse direction constructs relational algebra expressions from calculus formulas by translating the logical structure into algebraic operations that verify satisfaction and containment, leveraging the finite nature of databases to ensure equivalence.[2] This shared expressive power enables relational calculus to formulate all standard relational operations, including selection, projection, natural join, Cartesian product, union, and set difference, thereby achieving relational completeness. However, it excludes operations beyond first-order logic, such as aggregation (e.g., computing averages or counts) and ordering (e.g., sorting results), which require extensions outside the core model.[2] A key limitation is the inability to express recursive queries like transitive closure—for instance, identifying all reachable nodes in a graph—which cannot be captured in first-order logic and demands iterative or fixpoint mechanisms.[20] To address such limitations, extensions like Datalog incorporate least fixpoint operators into a logic programming framework, allowing recursive definitions that express transitive closure and other inductive queries while remaining grounded in the relational paradigm.[20] Computationally, for safe fragments such as unions of conjunctive queries (a core subset of safe relational calculus), problems like query containment—determining if one query's results are always a subset of another's—are decidable and NP-complete in complexity.Examples
Basic Query Example
To illustrate the core concepts of relational calculus, consider a simple database schema consisting of two relations: Employees with attributes (EmpID, Name, Dept, Salary) and Depts with attributes (Dept, Location). A basic query in this schema might seek the names of employees in the Sales department. In tuple relational calculus (TRC), this query is expressed as the set of names satisfying the formula: \{ E.\text{Name} \mid E \in \text{Employees} \land \exists D (D \in \text{Depts} \land D.\text{Dept} = E.\text{Dept} \land D.\text{Dept} = \text{`Sales'} ) \} This formula, following the syntax introduced by Codd, declares a target attribute (E.Name) from the Employees relation, restricted by a conjunction of atomic formulas and an existential quantification over a tuple variable D from the Depts relation. In domain relational calculus (DRC), the same query uses domain variables ranging over attribute values and is formulated as: \{ \langle x \rangle \mid \exists y \exists z \exists s \exists l \ ( \langle y, x, z, s \rangle \in \text{Employees} \land \langle z, l \rangle \in \text{Depts} \land z = \text{`Sales'} ) \} Here, x represents the name (output domain), while y, z, s, and l are existentially quantified variables for EmpID, Dept, Salary, and Location, respectively; the formula ensures z (Dept) equals 'Sales' and matches a tuple in Depts. To demonstrate satisfaction, suppose the database contains the following sample data:- Employees: \langle 1, \text{`Alice'}, \text{`Sales'}, 50000 \rangle, \langle 2, \text{`Bob'}, \text{`HR'}, 45000 \rangle, \langle 3, \text{`Charlie'}, \text{`Sales'}, 55000 \rangle
- Depts: \langle \text{`Sales'}, \text{`NY'} \rangle, \langle \text{`HR'}, \text{`LA'} \rangle
- For E = \langle 1, \text{`Alice'}, \text{`Sales'}, 50000 \rangle: The condition holds because \exists D such that D = \langle \text{`Sales'}, \text{`NY'} \rangle satisfies D.Dept = E.Dept = 'Sales', so 'Alice' is included.
- For E = \langle 2, \text{`Bob'}, \text{`HR'}, 45000 \rangle: No D matches E.Dept = 'HR' with 'Sales', so 'Bob' is excluded.
- For E = \langle 3, \text{`Charlie'}, \text{`Sales'}, 55000 \rangle: Similar to Alice, 'Charlie' is included.
Complex Query Example
To illustrate the flexibility of relational calculus in handling multi-relation joins, existential quantification for inclusion, and negation for exclusion, consider an extended schema building on standard employee and department relations. Introduce the Projects relation with attributes (ProjID, Dept, Budget) to track project details, and the Assignments relation with attributes (EmpID, ProjID) to link employees to projects. A representative complex query in this schema is: "Retrieve the names of employees who are assigned to at least one high-budget project (Budget > 100000) but whose department is not located in London." This query demonstrates the calculus's ability to express conditional inclusions across relations while excluding based on a negated condition. In tuple relational calculus (TRC), the query is expressed as: \{ E.\text{Name} \mid E \in \text{Employees} \land \exists A \exists P (A \in \text{Assignments} \land P \in \text{Projects} \land A.\text{EmpID} = E.\text{EmpID} \land A.\text{ProjID} = P.\text{ProjID} \land P.\text{Budget} > 100000) \land \lnot \exists D (D \in \text{Depts} \land D.\text{Dept} = E.\text{Dept} \land D.\text{Location} = \text{`London'} ) \} The existential quantifiers (∃ A, ∃ P) handle the join conditions to identify employees linked to qualifying projects via Assignments and Projects, ensuring only those with Budget > 100000 are included. The negated existential (¬ ∃ D) excludes employees whose department matches a London location in Depts, effectively implementing set difference through logical negation. The free variable E.Name ensures the output is a unary relation of names, and the formula is domain-independent (safe) because all variables are bound to active domain values from the relations, avoiding infinite results. The equivalent domain relational calculus (DRC) formula uses domain variables for attributes: \{ \langle x \rangle \mid \exists \text{e_id} \exists \text{d_name} \exists \text{salary} ( \langle \text{e_id}, x, \text{d_name}, \text{salary} \rangle \in \text{Employees} \land \exists \text{a_emp} \exists \text{a_proj} ( \langle \text{a_emp}, \text{a_proj} \rangle \in \text{Assignments} \land \text{a_emp} = \text{e_id} ) \land \exists \text{p_proj} \exists \text{p_dept} \exists \text{p_budget} ( \langle \text{p_proj}, \text{p_dept}, \text{p_budget} \rangle \in \text{Projects} \land \text{p_proj} = \text{a_proj} \land \text{p_budget} > 100000 ) \land \lnot \exists \text{d_dept} \exists \text{d_loc} ( \langle \text{d_dept}, \text{d_loc} \rangle \in \text{Depts} \land \text{d_dept} = \text{d_name} \land \text{d_loc} = \text{`London'} ) ) \} Here, domain variables (e.g., e_id for EmpID, x for Name) range over attribute values, with existential quantifiers binding joins and the negation excluding London departments; safety is maintained similarly by restricting to active domains. This TRC (or DRC) expression is equivalent to the relational algebra query: π_Name (σ_Budget>100000 (Assignments ⋈ Projects) ⋉ Employees) - π_Name (σ_Location='London' (Employees ⋈ Depts)), where the left semijoin selects qualifying employees and the set difference removes those in London departments, confirming expressive completeness. To demonstrate, suppose the database contains the following sample data:- Employees: \langle 1, \text{`Alice'}, \text{`Sales'}, 50000 \rangle, \langle 2, \text{`Bob'}, \text{`HR'}, 45000 \rangle, \langle 3, \text{`Charlie'}, \text{`Sales'}, 55000 \rangle
- Depts: \langle \text{`Sales'}, \text{`NY'} \rangle, \langle \text{`HR'}, \text{`London'} \rangle
- Projects: \langle 101, \text{`Sales'}, 150000 \rangle, \langle 102, \text{`HR'}, 80000 \rangle
- Assignments: \langle 1, 101 \rangle, \langle 3, 101 \rangle, \langle 2, 102 \rangle
Bob (Emp2, HR, London) assigned to Proj102 (budget 80k <100k) → excluded (no high-budget assignment and London).
Charlie (Emp3, Sales, NY) assigned to Proj101 (high budget), not London → included. Result: { \text{
Alice'}, \text{Charlie'} }.