Algebraic data type
An algebraic data type (ADT) is a composite data type in computer programming, particularly in functional languages, formed by combining simpler types through two fundamental operations: product types, which represent conjunctions ("and") of types, and sum types, which represent disjunctions ("or") of types, mirroring the algebraic operations of multiplication and addition.[1][2] This construction allows programmers to define complex, type-safe data structures that encapsulate multiple possibilities or combinations without runtime errors from invalid states.[3]
Product types, such as tuples or records, bundle multiple values together, enabling the representation of structured data where all components must be present, as in a pair of coordinates (x, y) where the total number of possible values is the product of the individual type sizes.[1][3] Sum types, conversely, define a value as belonging to exactly one of several variants, often called tagged unions or discriminated unions, such as the option type in OCaml or Maybe in Haskell, which can be either a present value (Some v) or absent (None), facilitating safe handling of optional or erroneous results.[1][2]
ADTs are foundational in languages like Haskell, ML, and Rust, promoting expressive and verifiable code by supporting pattern matching for exhaustive decomposition of values, which ensures all cases are handled at compile time.[3] They enable the modeling of recursive structures, such as linked lists or trees, and advanced abstractions like monads, while their algebraic properties allow for formal reasoning about type equivalence and inhabitant counting, akin to mathematical algebra.[1][2]
Fundamentals
Definition
An algebraic data type (ADT) is a composite type formed by combining other types using finite products and sums, providing a structured way to represent data in programming languages, especially functional ones. Products, such as tuples or records, correspond to the algebraic operation of multiplication, combining multiple components into a single value where the possible instances multiply accordingly. Sums, such as variants or discriminated unions, correspond to addition, offering disjoint alternatives where the total possibilities add up. This algebraic intuition allows the number of distinct values of an ADT to be calculated precisely from its constructors, mirroring arithmetic properties like distributivity and identities.[4]
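This counting intuition can be made concrete with a small Haskell sketch (the type and constructor names below are purely illustrative), where comments record the number of distinct values of each type:

```haskell
-- Illustrative types only; the comments count distinct values (inhabitants).
data Suit   = Hearts | Spades                -- 2 values
data Answer = Yes | No | Unknown             -- 3 values

data Both  = Both Suit Answer                -- product: 2 * 3 = 6 values
data OneOf = ASuit Suit | AnAnswer Answer    -- sum: 2 + 3 = 5 values
```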
In contrast to primitive types like integers or booleans, which are atomic and non-composite, ADTs are inductively defined, permitting recursive specifications that construct increasingly complex types from base cases. This recursive nature enables the modeling of hierarchical or nested data structures, such as lists or trees, without relying on external mechanisms like pointers.[5]
ADTs promote type-safe programming by enforcing exhaustive analysis of all possible values, typically through pattern matching, thereby eliminating common runtime errors like null pointer exceptions that arise from unhandled absence cases in languages without such constructs. For instance, an optional value type distinguishes present from absent data explicitly, requiring handling of both to proceed.[6]
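As a minimal illustration, consider a division function that returns an optional result; the Haskell sketch below (function names such as safeDiv are illustrative) shows how both the present and absent cases must be handled before a value can be used:

```haskell
-- Division whose failure case is explicit in the type rather than a runtime error.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing            -- absent result: division by zero
safeDiv x y = Just (x `div` y)   -- present result

-- Callers must match on both constructors to proceed.
describe :: Maybe Int -> String
describe Nothing  = "undefined"
describe (Just n) = "result: " ++ show n
```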
Typical implementations of ADTs emphasize immutability, where constructed values remain unchanged after creation to ensure predictable behavior in concurrent or pure functional contexts, and totality, meaning their definitions cover all cases without inherent partiality.[7]
Product and Sum Types
Algebraic data types are constructed using two fundamental operations: product types and sum types. Product types represent the Cartesian product of two or more types, denoted T \times U, where a value of this type consists of a pair (t, u) with t : T and u : U, ensuring all components are simultaneously accessible and present.[8] This construction corresponds to records or tuples in programming languages, allowing structured aggregation of data from distinct types.[9] For instance, a two-dimensional point can be modeled as the product type \mathbb{Z} \times \mathbb{Z}, capturing both coordinates in a single value.[9]
Sum types, in contrast, embody disjoint unions, denoted T + U, where a value inhabits exactly one of the alternatives, typically via injections such as \mathsf{inl}(t) for t : T or \mathsf{inr}(u) for u : U. The empty type 0, with no inhabitants, serves as the additive unit.[8] These are often realized as tagged unions, with a discriminant tag to identify the active variant and ensure type-safe discrimination at runtime or compile time.[9] This mechanism enables exhaustive representation of mutually exclusive cases, such as the Either construct for computational results that may succeed with a value of type T or fail with one of type U.[8]
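In Haskell-like notation, a product corresponds to a constructor carrying several fields and a sum to a choice of constructors; the sketch below (with illustrative names such as Result standing in for Either) expresses the projections and injections as pattern matches and constructor applications:

```haskell
-- Product: both components are always present and simultaneously accessible.
data Point = Point Int Int            -- roughly Int * Int

-- Sum: exactly one alternative is inhabited, tagged by its constructor.
data Result t u = Ok t | Err u        -- roughly t + u, analogous to Either

sumCoords :: Point -> Int
sumCoords (Point x y) = x + y         -- projections via pattern matching

classify :: Result Int String -> String
classify (Ok n)  = "succeeded with " ++ show n   -- injection inl
classify (Err e) = "failed with " ++ e           -- injection inr
```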
The algebraic structure of these types treats them as polynomials in a type algebra, where products correspond to multiplication and sums to addition, with the unit type 1 acting as the multiplicative identity for products and the empty type 0 as the additive identity for sums.[9] For example, the type L of lists over elements of type T satisfies the equational specification
L \cong 1 + T \times L,
where 1 denotes the empty list and T \times L the non-empty cons cell pairing an element with the tail list.[9]
By nesting products within sums or vice versa, and introducing recursion via least fixed points, these constructors yield complex algebraic data types such as binary trees, defined recursively as \mathsf{Tree} \cong 1 + T \times (\mathsf{Tree} \times \mathsf{Tree}).[8] This recursive combination facilitates the modeling of hierarchical and inductive data structures central to functional programming paradigms.[9]
Historical Development
Origins in Lambda Calculus
The concept of algebraic data types (ADTs) traces its mathematical foundations to Alonzo Church's development of lambda calculus in the early 1930s, where functions were treated as first-class citizens through lambda abstraction, allowing them to be passed as arguments, returned as results, and composed freely in a formal system for logic. This untyped framework, introduced as a set of postulates for the foundation of logic, emphasized abstraction (λx.M) and application (M N) as primitive operations, laying the groundwork for typed variants that would later support structured type constructions akin to ADTs.
Church extended this in the simply typed lambda calculus, formulated in 1940, which introduced base types such as individuals (ι) and propositions (o), along with function types denoted as (αβ), where β is the argument type and α the result type.[10] Types are constructed via abstraction, forming a function type (αβ) from a variable of type β and an expression of type α, and application, which combines a function of type (αβ) with an argument of type β to yield type α.[10] These function types (A → B) prefigure the product and sum constructors central to ADTs by providing a hierarchical type formation mechanism based on abstraction and application, enabling encodings of structured data in subsequent extensions.[10]
In the 1970s, Dana Scott advanced this foundation through domain theory, developing continuous lattices to model recursive types and solve domain equations for infinite data structures.[11] A key innovation was solving equations like D = 1 + A \times D, where D represents a list domain, 1 denotes the empty list, and A \times D captures a pair of an element of type A with the rest of the list, using fixed-point theorems to obtain least solutions as complete partial orders.[11] This approach, detailed in Scott's work on data types as lattices, provided a semantic basis for recursive ADTs by ensuring denotational models for potentially non-terminating computations and self-referential types.[11]
The Hindley-Milner type system, emerging in the 1970s, further formalized polymorphic types that underpin ADT constructions by introducing principal type-schemes in combinatory logic, allowing a single scheme from which all typings derive via substitution.[12] J. Roger Hindley's 1969 theorem established that every typable object has a unique, computable principal type-scheme, enabling polymorphic functions over structured types.[12] Building on this, Robin Milner's 1978 theory integrated polymorphism into programming languages, supporting operations on algebraic structures like lists via types such as \forall \alpha, \beta. (\alpha \to \beta) \times \alpha \text{ list} \to \beta \text{ list} for mapping, with complete type inference.[13] These developments bridged theoretical type systems to practical polymorphic ADTs.[13]
Evolution in Programming Languages
The development of algebraic data types (ADTs) in programming languages began in the 1970s with experimental functional languages that sought to operationalize concepts from lambda calculus, providing concrete mechanisms for defining recursive data structures with type safety.[14] One of the earliest such languages was Hope, developed at the University of Edinburgh, which introduced user-defined ADTs alongside pattern matching to enable expressive functional programming without side effects. Hope's design emphasized simplicity and strong typing, allowing programmers to define data types as sums of products, influencing subsequent work on applicative languages.[15]
In the late 1970s and through the 1980s, the ML family of languages, originating from Robin Milner's work on the LCF theorem prover at Edinburgh, standardized ADTs through the datatype keyword, integrating them with polymorphic type inference based on the Hindley-Milner system.[16] This allowed for concise definitions of complex structures like lists and trees, with automatic handling of recursion and exhaustiveness in pattern matching, marking a shift toward practical, type-safe functional programming in theorem proving and beyond.[17] Standard ML, formalized in the 1980s and 1990s, further refined these features, establishing ADTs as a core element of the language family that includes variants like OCaml.[18]
By the 1990s, Haskell extended ADTs with parametric polymorphism and lazy evaluation, building on precursors like Miranda to create a purely functional language where data types could be defined generically and evaluated on demand.[19] This evolution popularized ADTs in broader functional programming communities, enabling advanced abstractions such as type classes for overloading while maintaining referential transparency.[20] Haskell's design committee incorporated feedback from ML implementations, enhancing ADT flexibility for real-world applications like symbolic computation.
The adoption of ADTs also spurred early experiments in Lisp dialects during the 1970s and 1980s, where researchers explored static typing extensions to Lisp's dynamic model, drawing from LCF's initial Lisp-based implementation to incorporate variant-like structures for safer symbolic manipulation.[21] This influence extended to imperative systems, prompting efforts in languages like Ada (with variant records) and later safer designs that integrated sum types to mitigate runtime errors in object-oriented and procedural paradigms.[16]
Practical Examples
Linked Lists
A singly linked list exemplifies a recursive algebraic data type, leveraging sum and product types to model sequences of elements of arbitrary length. It is defined as a sum type with two constructors: Nil, representing the empty list and corresponding to the unit type (often denoted as 1), and Cons, a product type that pairs an element of type A with another list of type \mathrm{List}\langle A \rangle. This structure encodes the distinction between absence and presence of elements directly in the type system.[22][23]
The recursive nature of this definition enables the representation of lists of any finite size, formalized by the type equation \mathrm{List}\langle A \rangle = 1 + A \times \mathrm{List}\langle A \rangle, where + signifies the disjoint union of the sum type and \times the Cartesian product. Unfolding this equation iteratively yields the empty list (all Nil) or chains of Cons nodes terminating in Nil, ensuring well-founded recursion without cycles.[1]
Lists are constructed using the provided constructors: the empty list is built directly as Nil, while the cons operation prepends an element to an existing list, yielding a new Cons value. For instance, the list containing elements x and y (in that order) is formed as \mathrm{cons}(x, \mathrm{cons}(y, \mathrm{Nil})), with each cons application creating a fresh product without mutating prior structures. This immutability supports safe sharing of sublists in functional contexts.[22]
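A direct Haskell rendering of this definition and of the construction \mathrm{cons}(x, \mathrm{cons}(y, \mathrm{Nil})) might read as follows (the constructor names mirror those used in the text; the binding name example is illustrative):

```haskell
-- List<A> = 1 + A * List<A>, written as a sum of two constructors.
data List a = Nil              -- the empty list (unit case)
            | Cons a (List a)  -- an element paired with the rest of the list

-- The list containing x and y, built without mutating any existing structure.
example :: List Char
example = Cons 'x' (Cons 'y' Nil)
```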
By encoding the empty case explicitly as Nil rather than a special null value, algebraic data types for linked lists enforce type safety at compile time, eliminating common errors like attempting to access the head or tail of an empty list, which would otherwise lead to null dereference exceptions in pointer-based implementations.[24][25]
Tree Structures
Binary trees serve as a quintessential example of nested recursive algebraic data types (ADTs), where sum types enable branching structures by allowing a choice between terminal leaves and internal nodes with subtrees. In this formulation, a binary tree over a type A is defined as a sum type consisting of a leaf constructor representing an empty tree (isomorphic to the unit type 1) and a node constructor that is a product type combining a value of type A with two subtrees of the same tree type.[26][27]
This recursive structure is captured by the type equation
\text{Tree}\langle A \rangle = 1 + A \times \text{Tree}\langle A \rangle \times \text{Tree}\langle A \rangle,
where the sum reflects the disjoint union of leaves and nodes, and the recursion permits arbitrary depth, supporting both balanced trees (with roughly equal subtrees) and skewed trees (devolving into linear chains).[26] In languages like Haskell, this is concretely declared as
```haskell
data Tree a = Leaf | Node a (Tree a) (Tree a)
```

where the declaration arises as the sum of Leaf and the product Node (a value paired with left and right subtrees).[27]
The above definition realizes full binary trees, in which every node has either zero children (a leaf) or exactly two children, distinguishing it from more general binary trees that permit nodes with one child; construction proceeds inductively by applying the node constructor to values and subtrees, such as building a simple tree with root value 1 and a right child of value 2, both with leaf subtrees.[26] Unlike linked lists, which form linear chains through single-child recursion, binary trees introduce branching via the dual subtrees, enabling representation of hierarchical relationships.[27]
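Using the declaration above, the small tree just described can be written directly; a brief sketch (the binding name example is illustrative):

```haskell
data Tree a = Leaf | Node a (Tree a) (Tree a)

-- Root holding 1, an empty left subtree, and a right child holding 2,
-- whose own subtrees are both leaves.
example :: Tree Int
example = Node 1 Leaf (Node 2 Leaf Leaf)
```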
Such tree structures find application in modeling hierarchical data, for instance, expression trees that encode arithmetic operations and operands in a compiler's frontend.[28]
Abstract Syntax Trees
Abstract syntax trees (ASTs) represent the syntactic structure of source code in a form suitable for further processing in compilers and interpreters, and they are naturally modeled using algebraic data types (ADTs).[29] An AST for expressions in a simple programming language can be defined as a recursive sum type, where the sum constructors capture the variants of expression nodes, and product types handle their subcomponents. For instance, the type for arithmetic expressions might be specified as \text{Expr} = \text{Lit}(\mathbb{Z}) + \text{Add}(\text{Expr} \times \text{Expr}) + \text{Sub}(\text{Expr} \times \text{Expr}) + \text{Mul}(\text{Expr} \times \text{Expr}) + \text{Var}(\text{String}), allowing literals, binary operations, and variables as disjoint alternatives.[7]
In compilers, ADTs for ASTs provide a type-safe, structured representation of parsed code, facilitating semantic analysis, optimization, and code generation without resorting to error-prone string manipulation or ad-hoc data structures.[29] This approach ensures that the tree encodes only the essential syntactic information, abstracting away concrete details like parentheses or operator precedence from the parser's output. For example, the expression 2 + (3 \times 4) would be constructed as \text{Add}(\text{Lit}(2), \text{Mul}(\text{Lit}(3), \text{Lit}(4))), where the product types for binary operators nest recursively to reflect the expression's hierarchy.[7]
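A Haskell sketch of such an expression type, the example value above, and an evaluator that must cover every constructor might look as follows (the environment-based eval and the binding names are illustrative, not a fixed interface):

```haskell
data Expr
  = Lit Int
  | Add Expr Expr
  | Sub Expr Expr
  | Mul Expr Expr
  | Var String

-- The expression 2 + (3 * 4) as nested constructor applications.
sample :: Expr
sample = Add (Lit 2) (Mul (Lit 3) (Lit 4))

-- Evaluation over an environment of variable bindings; every constructor
-- must be covered, and the compiler can warn when a case is missing.
eval :: [(String, Int)] -> Expr -> Maybe Int
eval _   (Lit n)   = Just n
eval env (Add a b) = (+) <$> eval env a <*> eval env b
eval env (Sub a b) = (-) <$> eval env a <*> eval env b
eval env (Mul a b) = (*) <$> eval env a <*> eval env b
eval env (Var x)   = lookup x env
```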
The use of ADTs in ASTs offers key advantages, including compile-time guarantees for exhaustive coverage of syntax variants through type systems, which help prevent bugs in parser implementations and downstream analyses.[29] By leveraging sum types, developers can ensure that all possible node types are handled in transformations, reducing runtime errors and improving maintainability in language tools.[7]
Deconstruction Mechanisms
Pattern Matching Basics
Pattern matching is the primary mechanism for deconstructing algebraic data types (ADTs), allowing functions to be defined by specifying cases that correspond to the possible constructors of a sum type, thereby ensuring that all variants are handled explicitly.[30] This process involves testing a value against a set of patterns, each associated with a constructor, and selecting the matching case to execute the corresponding code branch. For product types within constructors, patterns can bind variables to the individual components, facilitating the extraction of structured data.[31]
The syntax typically employs case expressions or match statements, where the scrutinized expression is followed by a series of pattern-action clauses, each introduced as an alternative (for example with a leading |). For instance, a pattern for a constructor C x y would match a value constructed as C v1 v2 by binding x to v1 and y to v2, provided the types align.[30] This binding occurs only if the constructor matches exactly, promoting precise and composable code without manual type checks.[31]
Type safety is enforced through compile-time verification, where the compiler analyzes patterns to detect non-exhaustive matches—cases where not all possible constructors are covered—and issues warnings or errors to prevent runtime failures.[30] In OCaml, for example, incomplete pattern matches result in a compiler warning, encouraging but not strictly enforcing that functions handle every variant of the ADT.[30][32]
A simple illustration is a sum type representing colors:
```ocaml
type color = Red | Green | Blue
```
A function to convert a color to a string might use pattern matching as follows:
```ocaml
let color_to_string c =
  match c with
  | Red -> "red"
  | Green -> "green"
  | Blue -> "blue"
```
This ensures exhaustive coverage of all constructors, with the compiler verifying completeness.[30]
Recursive Pattern Matching
Recursive pattern matching allows the deconstruction of recursive algebraic data types (ADTs) by applying patterns to nested constructors, facilitating the traversal and transformation of structures like lists and trees through inductive definitions. This approach builds on basic pattern matching by enabling recursive calls within case branches, where each step decomposes a constructor and recurses on substructures, ensuring functions process data layer by layer until reaching base cases.[33] For instance, in a list ADT defined as either empty (Nil) or a cons cell (Cons of a head element and tail list), a fold operation can match the structure recursively: the base case handles Nil by returning an initial accumulator, while the recursive case applies a step function to the head and folds over the tail.[34]
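A sketch of such a fold over the Nil/Cons list type, written in Haskell with illustrative names (foldList corresponds to a right fold):

```haskell
data List a = Nil | Cons a (List a)

-- Structural recursion: the base case replaces Nil with an accumulator,
-- and the step function is applied to each head and the folded tail.
foldList :: (a -> b -> b) -> b -> List a -> b
foldList _    z Nil         = z
foldList step z (Cons x xs) = step x (foldList step z xs)

-- Example: summing a list of integers.
total :: List Int -> Int
total = foldList (+) 0
```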
Techniques such as guards and nested patterns enhance recursive matching for more expressive processing. Guards introduce conditions within cases, allowing branches to apply only when predicates hold, such as filtering elements during traversal. Nested patterns handle product-like constructors by binding variables at multiple levels; for example, in a cons pattern Cons x (Cons y xs), the inner cons can further decompose the tail. These features support inductive accumulation, as seen in a length function for lists:
```haskell
import Prelude hiding (length)  -- avoid the clash with the Prelude's length

length :: [a] -> Int
length []     = 0
length (_:xs) = 1 + length xs
```
Here, the pattern (_:xs) ignores the head and recurses on xs, accumulating the count until the empty list base case.[35]
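The guards and nested patterns mentioned above can be illustrated with two short Haskell definitions (function names are illustrative): one uses a guard to count only positive elements, the other uses a nested cons pattern to inspect two adjacent elements at once:

```haskell
-- A guard restricts a branch to values satisfying a predicate.
countPositives :: [Int] -> Int
countPositives []     = 0
countPositives (x:xs)
  | x > 0     = 1 + countPositives xs
  | otherwise = countPositives xs

-- A nested pattern deconstructs two levels of cons cells at once.
hasAdjacentDuplicate :: Eq a => [a] -> Bool
hasAdjacentDuplicate (x : rest@(y : _)) = x == y || hasAdjacentDuplicate rest
hasAdjacentDuplicate _                  = False
```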
For tree structures, recursive pattern matching enables operations like summing node values, defined as an ADT with leaf and internal node constructors:
```haskell
data Tree a = Leaf a | Node (Tree a) (Tree a)

sumTree :: Num a => Tree a -> a
sumTree (Leaf x)   = x
sumTree (Node l r) = sumTree l + sumTree r
```
The matching on Node l r recurses on both subtrees, combining results to compute the total, demonstrating how patterns align with the recursive definition to process arbitrary depths.[34] Regarding error handling, recursive pattern matching on finite ADTs avoids infinite loops through structural induction: each recursive call operates on a strictly smaller substructure (e.g., shorter tail or subtree), guaranteeing termination as the size measure decreases to the base case.[36]
Theoretical Underpinnings
Relation to Type Theory
Algebraic data types (ADTs) emerge as a natural extension of the simply typed lambda calculus, where they are formalized as inductive types equipped with specific introduction and elimination rules. In this framework, an inductive type is defined by a collection of constructors that serve as introduction rules, allowing the formation of terms of the type through finite applications of these constructors, such as the nil constructor for empty lists or cons for non-empty ones.[37] The elimination rules, typically realized through pattern matching or recursion principles, enable the deconstruction of these terms in a type-safe manner, ensuring that functions defined over inductive types respect the structure imposed by the constructors.[38] This duality—construction via introduction and analysis via elimination—mirrors the inductive definitions in intuitionistic type theory and provides a rigorous foundation for ADTs, guaranteeing that all values of the type are generated precisely by the constructors.[39]
While the basic formulation of ADTs in simply typed lambda calculus deals with non-dependent cases, where type constructors do not depend on value indices, this setup previews the richer landscape of dependent types. In dependent type theories, such as the Calculus of Constructions, ADTs generalize to indexed families of types, where the constructors and eliminators can incorporate dependencies on parameters, allowing types like vectors of length n indexed by natural numbers n.[40] However, the non-dependent inductive types central to standard ADTs maintain simplicity by avoiding such value dependencies, focusing instead on fixed-type products and sums that compose data structures without parametric variation in the type itself.[38]
A key benefit of this type-theoretic formulation is the enforcement of totality for functions over ADTs through exhaustive pattern matching. In the simply typed setting extended with inductive types, pattern matching must cover all possible constructors to be well-typed, ensuring that the resulting function is total—defined for every input of the type—unlike partial functions in untyped lambda calculus that may fail to terminate or cover cases.[37] This exhaustiveness check at the type level prevents runtime errors and aligns with the constructive nature of the calculus, where proofs of totality coincide with the programs themselves.[41]
The typed lambda calculus enriched with inductive types for ADTs inherits strong theoretical properties, notably the Church-Rosser confluence theorem, which ensures that beta-reduction and other rewriting steps lead to unique normal forms regardless of the order of reductions.[37] This confluence holds for the full system, including the interaction between lambda abstractions, applications, and the inductive constructors and eliminators, thereby preserving the decidability of type checking and the reliability of normalization in ADT-based computations.[39]
Categorical Perspective
In category theory, algebraic data types (ADTs) are interpreted within categories of types and typed functions, such as the category of sets or more generally Cartesian closed categories enriched with suitable structure. Product types correspond to categorical products, equipped with projection morphisms that allow access to components, while sum types correspond to coproducts, featuring injection morphisms to embed variants into the disjoint union.[42] This framework captures the universal properties of ADTs: for products, any pair of morphisms into a type factors uniquely through the projections, ensuring structured composition; for coproducts, any pair of morphisms out of a type factors uniquely through the injections, enabling exhaustive case analysis.[42]
Recursive ADTs, such as lists or trees, are modeled as initial algebras for endofunctors on the category of types. An initial algebra for a functor F is an object A with a structure map \alpha: F(A) \to A such that for any other algebra (B, \beta: F(B) \to B), there exists a unique homomorphism h: A \to B satisfying h \circ \alpha = \beta \circ F(h). This initiality provides a foundation for induction and recursion over the data type. For example, the list functor is F(X) = 1 + A \times X, where 1 is the terminal object (the unit type, representing the empty-list case) and + denotes the coproduct; the initial algebra yields the least fixed point, consisting of finite lists built from nil and cons constructors.[43][42][44]
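This construction can be sketched in Haskell by separating the list base functor F(X) = 1 + A \times X from its least fixed point; the generic fold below is the unique homomorphism out of the initial algebra (the names ListF, Fix, and cata follow common usage but are illustrative here):

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Base functor for lists: F X = 1 + A * X.
data ListF a x = NilF | ConsF a x
  deriving Functor

-- Least fixed point of a functor: the carrier of the initial algebra.
newtype Fix f = In { out :: f (Fix f) }

type List a = Fix (ListF a)

-- The unique algebra homomorphism from the initial algebra (a catamorphism).
cata :: Functor f => (f b -> b) -> Fix f -> b
cata alg = alg . fmap (cata alg) . out

-- Example algebra: summing a list of integers.
total :: List Int -> Int
total = cata alg
  where
    alg NilF        = 0
    alg (ConsF n s) = n + s
```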
Parametricity in polymorphic type systems, as formalized in relational interpretations of types, yields free theorems that enforce behavioral laws on ADT operations. Specifically, polymorphic functions over ADTs, such as the map function \forall a b. (a \to b) \to \mathrm{List}(a) \to \mathrm{List}(b), respect the constructors due to the relational action of the type constructors: map applies the function to elements while preserving structure, as dictated by the functorial nature of the type.[45] These theorems arise from the uniform treatment of type variables, ensuring that operations cannot distinguish between isomorphic representations and thus maintaining abstraction.[45]
Under parametricity, native ADTs are isomorphic to their Church encodings in suitable categories, such as those modeling the polymorphic lambda calculus. The Church encoding represents a product type A \times B as \forall C. (A \to B \to C) \to C and a sum type A + B as \forall C. (A \to C) \to (B \to C) \to C; parametricity guarantees that these encodings satisfy the same universal properties as the categorical products and coproducts, establishing bijections between the carriers and structure maps.[46] This isomorphism highlights the expressive power of higher-order functions in simulating first-order data structures while preserving categorical semantics.[46]
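A Haskell sketch of these encodings, using rank-2 polymorphism (type and function names are illustrative):

```haskell
{-# LANGUAGE RankNTypes #-}

-- Church-encoded product: a value applies any consumer to both components.
type PairC a b = forall c. (a -> b -> c) -> c

pairC :: a -> b -> PairC a b
pairC x y k = k x y

fstC :: PairC a b -> a
fstC p = p (\x _ -> x)

-- Church-encoded sum: a value, given handlers for both alternatives,
-- invokes the one matching its injection.
type SumC a b = forall c. (a -> c) -> (b -> c) -> c

inlC :: a -> SumC a b
inlC x l _ = l x

inrC :: b -> SumC a b
inrC y _ r = r y
```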
Support in Programming Languages
Functional Languages
Functional languages provide native support for algebraic data types (ADTs) through declarative syntax that directly expresses sum and product types, enabling the construction of complex, type-safe data structures without mutable state.[47][48][49]
In Haskell, ADTs are defined using the data keyword, which specifies a type name, parameters, and constructors representing variants; for example, data Maybe a = Nothing | Just a declares a sum type for optional values.[47] Haskell also supports automatic derivation of standard type classes like Eq and Show via the deriving clause, such as deriving (Eq, Show), which generates equality and string representation implementations based on the type's structure.[47] A unique feature of Haskell's ADT system is its integration with lazy evaluation, which permits the definition and manipulation of infinite data structures, such as an endless list ones = 1 : ones, where only demanded elements are computed.[50]
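Taken together, these features can be illustrated by a minimal module (the option type is renamed Option here to avoid clashing with the Prelude's Maybe):

```haskell
-- A sum type for optional values, with derived equality and printing.
data Option a = None | Some a
  deriving (Eq, Show)

-- Lazy evaluation permits infinite structures; only demanded elements
-- are ever computed.
ones :: [Int]
ones = 1 : ones

main :: IO ()
main = do
  print (Some (3 :: Int) == Some 3)  -- True, via the derived Eq
  print (take 5 ones)                -- [1,1,1,1,1]
```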
Standard ML (SML) and its variants, part of the ML family, use the datatype keyword to define ADTs with named constructors that serve as both builders and pattern matchers; for instance, datatype color = Red | Blue | Green creates a simple sum type.[49] These languages emphasize strong type inference, allowing ADT declarations to be checked and inferred without explicit annotations, ensuring exhaustive pattern matching at compile time.[49]
Scala supports ADTs through case classes, which model product types with automatic structural equality, hashing, and deconstruction support, as in case class Point(x: Int, y: Int).[48] For sum types, sealed traits or abstract classes restrict extensions to the defining file, enabling compiler-verified exhaustiveness in pattern matching; an example is sealed trait Shape extended by case class Circle(radius: Double) extends Shape.[48] Scala's pattern matching includes guards (boolean conditions attached to cases) for refined control, such as case p: Point if p.x > 0 => "positive".[51]
Imperative and Object-Oriented Languages
In imperative and object-oriented languages, algebraic data types are typically emulated rather than natively supported, using constructs like structs for products and discriminated unions for sums, often requiring additional boilerplate for type safety and deconstruction.
In C++, product types are represented using struct or class, which aggregate multiple fields into a single unit, providing a straightforward way to model tuples of values. For sum types, prior to C++17, implementations relied on manually constructed tagged unions, pairing an enum to tag the active variant with a union to store variant-specific data, ensuring type safety through explicit checks. Since C++17, the standard library's std::variant offers a built-in, type-safe discriminated union that holds exactly one value from a specified set of types, with compile-time enforcement via accessors like std::get and runtime checks via std::holds_alternative, closely approximating sum types without undefined behavior. For example, a shape variant might be defined as std::variant<Circle, Rectangle>, where Circle and Rectangle are structs.
Java approximates product types through records, introduced as a preview in Java 14 and standardized in Java 16, which concisely define immutable classes with automatically generated constructors, accessors, equals, hashCode, and toString methods for bundling data. Sum types are supported via sealed classes and interfaces, finalized in Java 17, which restrict subclassing to an explicitly permitted set of types, modeling a closed discriminated union suitable for exhaustive analysis in switches. For instance, an expression type could be sealed interface Expr permits ConstantExpr, PlusExpr, with ConstantExpr declared as record ConstantExpr(int value) {}. With pattern matching for switch finalized in Java 21, such hierarchies can be deconstructed directly in switch expressions and statements for more expressive handling, though integration remains less seamless than in functional languages.[52]
Rust, blending imperative and functional paradigms, natively supports algebraic data types via enum, where variants can be unit-like, tuple-like, or struct-like, enabling both sums (choice of one variant) and products (data within variants). Pattern matching through the match expression provides exhaustive, type-safe deconstruction, with destructuring for nested data; for example, enum IpAddr { V4(u8, u8, u8, u8), V6(String) } models IP addresses as a sum, destructible via match ip { IpAddr::V4(a,b,c,d) => ..., _ => ... }.
A key challenge in these languages is manual tagging for sum types in older or minimalistic environments, where developers must explicitly manage an enumeration alongside a union or equivalent, prone to runtime errors if tags are mismatched. For recursive algebraic data types, such as tree structures, languages without garbage collection like C++ require meticulous memory management using smart pointers (e.g., std::unique_ptr or std::shared_ptr in variants) to handle allocation and deallocation, avoiding leaks, dangling references, or cycles that could lead to undefined behavior. In contrast, garbage-collected languages like Java automatically reclaim memory for recursive structures but may incur performance pauses during collection, complicating real-time applications.