Standard ML

Standard ML (SML) is a general-purpose functional programming language that is statically typed with automatic type inference, emphasizing safety through compile-time type checking and garbage collection, while supporting higher-order functions, polymorphic types, recursive datatypes with pattern matching, exceptions, and a sophisticated module system for abstraction and separate compilation.

Originating as the Meta Language (ML) in the late 1970s at the University of Edinburgh, the language was developed by Robin Milner and his colleagues as part of the LCF (Logic for Computable Functions) theorem prover to provide a reliable meta-language for formal proofs, drawing inspiration from earlier work such as Peter Landin's ISWIM and John McCarthy's Lisp. The language evolved through collaborative efforts in the 1980s: design meetings began in 1982, David MacQueen proposed a module system in 1983, the first formal definition appeared in 1986, and Mads Tofte's 1987 work addressed polymorphic references.

Standardization efforts culminated in The Definition of Standard ML, published in 1990 by Milner, Tofte, and Robert Harper, which provided a rigorous mathematical semantics given by 196 rules, ensuring properties such as deterministic evaluation. The revised edition of 1997, on which MacQueen joined as co-author, refined this to 189 rules, introducing the value restriction on polymorphism to prevent unsoundness and improving the module semantics with opaque signature matching. A Basis Library with 67 interfaces was standardized alongside the 1997 revision for portability across implementations.

SML's type system uses Hindley-Milner polymorphism with unification-based type inference, allowing concise code without explicit annotations. The language also supports imperative features like mutable references and arrays alongside pure functional constructs, making it suitable for both research and practical applications such as theorem proving and compilers.

Notable implementations include SML/NJ (starting in 1986), MLton (an optimizing compiler), and Poly/ML. Implementation research around the language has also produced techniques such as region-based memory management, proposed by Tofte and Jean-Pierre Talpin in 1994. Despite stalled efforts toward a successor such as ML2000 in the 1990s, SML remains influential in programming language research and has shaped later languages through its innovations in type inference and modularity.

History

Origins in the ML Language Family

The ML language family originated in the 1970s at the University of Edinburgh as the Meta Language (ML) for the LCF (Logic for Computable Functions) theorem prover, a system designed to support interactive proof construction with a focus on logical security and mechanized reasoning. Developed under the leadership of Robin Milner, the project began around 1973, when Milner, then at Edinburgh, adapted and extended an earlier LCF implementation from Stanford to create a more robust framework for proofs. The initial ML was an untyped language that served primarily as a scripting tool for defining proof tactics within LCF.

Key contributors to ML's early design included Milner, along with Michael J. C. Gordon, Christopher P. Wadsworth, and others in the group, who formalized ML's structure in their work on LCF. A pivotal innovation during 1973–1978 was the introduction of polymorphic type inference, independently rediscovered by Milner and presented in his 1978 paper, which established the Hindley-Milner algorithm for principal type schemes and let-polymorphism. This advancement transformed ML from an untyped meta-language into a statically typed functional language, emphasizing type safety while preserving expressiveness for theorem-proving tasks.

In the 1980s, ML evolved into standalone typed variants as its utility extended beyond LCF. Edinburgh ML emerged in 1981 through Luca Cardelli's development of a compiler for a free-standing version, initially called "ML under VMS," which decoupled ML from the LCF system and enabled broader experimentation. Concurrently, Cambridge ML was developed at the University of Cambridge, building on Edinburgh's foundations to support proof assistants like HOL and later influencing the creation of Caml in 1985. These implementations marked ML's transition to a general-purpose functional programming language.

The rapid proliferation of ML dialects, including LCF/ML, Cardelli ML, Poly/ML (1988), and others, led to significant divergences in syntax, semantics, and features, resulting in portability issues that hindered collaborative development and widespread adoption. Without a unified specification, programs written for one implementation often failed in another, prompting initial efforts toward standardization in the early 1980s.

Standardization Process

The standardization of Standard ML emerged from efforts within the ML community to unify divergent implementations of the ML language, with roots in discussions under IFIP Working Group 2.1 as early as 1983. In April 1983, Milner drafted an initial proposal for a "Standard ML," emphasizing a core language design that would serve as a portable foundation. This led to the formal formation of the SML Definition Committee following a pivotal meeting in Edinburgh from May 23–25, 1985, chaired by Milner and including key contributors such as Mads Tofte, Robert Harper, and David MacQueen. The committee operated under the auspices of IFIP Working Group 2.1, which focused on algorithmic languages and provided a structured framework for defining programming languages, drawing on the British research tradition centered in Edinburgh.

Key milestones in the standardization process included the release of draft definitions that refined the language's semantics. In March 1986, the committee published a preliminary draft as report ECS-LFCS-86-2 from the University of Edinburgh. This was followed in August 1987 by another draft, ECS-LFCS-87-36, which introduced formal static semantics and principal signatures to support type inference. These drafts facilitated community feedback and iteration, culminating in the final Definition of Standard ML, published in 1990 by MIT Press and authored by Milner, Tofte, and Harper.

The standardization effort initially concentrated on the core language, covering foundational constructs like expressions, types, and functions, with modules handled separately; the module system was then integrated into the final document to support modular programming. This scope aimed to define a minimal, consistent language that could be implemented portably across diverse systems, addressing fragmentation in earlier ML variants. The primary rationale was to promote interoperability and reliability, enabling developers to write code that would behave uniformly in implementations such as Standard ML of New Jersey (SML/NJ) and others, thereby fostering wider adoption in research and education.

Revisions and Modern Developments

In 1997, the Standard ML community released a revised definition of the language, known as SML '97, which integrated the module system more deeply into the core language and provided clearer specifications of its semantics to address ambiguities in the original 1990 standard. This revision, detailed in The Definition of Standard ML (Revised) by Milner, Tofte, Harper, and MacQueen, simplified certain aspects of the language while maintaining backward compatibility, so that existing programs could run unchanged.

Accompanying the revised definition, the Standard ML Basis Library was formalized in 1997 as a comprehensive specification of predefined modules, promoting portability across implementations by standardizing essential functionality for systems and applications programming. This library, documented in The Standard ML Basis Library by Gansner and Reppy, includes modules for data structures, I/O, and time handling, among others, and has been integral to the language's ecosystem since its adoption.

Since the 1997 revision, the core language has seen no major changes, with development focusing instead on implementation extensions and community-driven enhancements. Efforts to develop a successor such as ML2000 stalled in the late 1990s, which contributed to this stability. The ML Family Workshop series continues to foster activity around the language, with events in 2024 (held September 6 in Milan, Italy, co-located with ICFP) and 2025 (held October 16 in Singapore, co-located with ICFP/SPLASH 2025) discussing advancements in ML-family languages, including practical extensions for Standard ML. The community maintains the language through collaborative efforts, such as the SML Family GitHub repository, with ongoing support for implementations like SML/NJ. As of November 4, 2025, the latest SML/NJ release (version 110.99.9) is a bug-fix release, emphasizing stability over new features.

Language Design

Core Principles and Philosophy

Standard ML (SML) is a general-purpose functional language designed with the foundational goals of achieving high expressiveness while ensuring safety and correctness through rigorous formal semantics. Its philosophy emphasizes a balance between theoretical foundations and practical utility, drawing directly on its mathematical foundations to support higher-order functions and abstract mathematical structures. This approach, inspired by early systems like ISWIM and LCF, prioritizes a declarative paradigm where programs are composed as expressions rather than imperative statements, promoting clarity and reducing unintended side effects.

Central to SML's core principles is its strong static typing system, which builds on the Hindley-Milner framework to provide automatic type inference and guarantee type safety without runtime type checks. Polymorphism, particularly parametric polymorphism, enables generic code that operates uniformly across types, enhancing reusability, while the value restriction preserves soundness in the presence of imperative features. Referential transparency is a key tenet for the pure fragment of the language: expressions evaluate predictably based on their values alone, which fosters correctness by minimizing dependencies on mutable state and aids reasoning about program behavior.

SML's design philosophy extends to modularity and abstraction, facilitating the construction of large-scale software through a sophisticated module system comprising signatures, structures, and functors. Functors act as parameterized modules, allowing reusable components that abstract over types and values, thereby supporting hierarchical organization and information hiding without compromising the language's type-safe expressiveness. Overall, these principles reflect SML's commitment to a secure, general-purpose language where formal verification aligns with intuitive programming practices.

Type System Fundamentals

Standard ML employs the Hindley-Milner type system, which enables automatic type inference for expressions without requiring explicit type annotations, ensuring that every well-typed program has a principal type scheme that is the most general possible. This system, foundational to the language's design, guarantees that type checking is decidable and efficient in practice, so that well-typed programs run without runtime type errors. The core inference mechanism is based on Algorithm W, which computes principal types through a process of substitution and unification, as originally formalized for functional programs.

Algorithm W performs type inference by recursively assigning fresh type variables to subexpressions and unifying them to resolve constraints arising from operations like function application. Unification, drawing from Robinson's algorithm, matches types by finding substitutions that make them identical, while incorporating an occurs check to prevent infinite type expansions, such as a type variable occurring within the type it is being unified with. For instance, in inferring the type of the identity function λx. x, Algorithm W assigns a fresh variable α to x, yielding the monotype α → α, which is then generalized to the polymorphic scheme ∀α. α → α. The algorithm is sound and complete: no incorrect typings are inferred, and every typable expression receives its principal type.

Polymorphic types in Standard ML are expressed through universal quantification over type variables, forming type schemes that capture generality, such as ∀α. α → α for functions that work uniformly across types. Type variables, written with a leading prime like 'a or 'b, represent unknowns that can be instantiated to concrete types during use, while a type scheme binds these variables universally to prevent ad-hoc polymorphism and enforce parametricity. In the static semantics, environments map identifiers to such schemes, allowing polymorphic values like list constructors to be reused with different element types without duplication.

The type system provides static guarantees that eliminate runtime type errors, as all type inconsistencies are detected at compile time through inference and checking. Equality type variables, written with a double prime (e.g., ''a), restrict polymorphism to types supporting structural equality, such as integers or datatypes built from equality types, enabling operations like = only where they are defined. Abstract types, introduced via abstype declarations, hide implementation details by treating the type as opaque, exposing only specified operations and preventing direct construction or inspection outside the declaration. These features ensure abstraction and safety without compromising the language's expressiveness.

Type equivalence in Standard ML is structural for tuples and records, where two types are equivalent if their components match recursively by structure, allowing flexible composition without name dependencies. In contrast, datatypes employ nominal equivalence: each datatype declaration introduces a new type, distinct from every other even if the constructors coincide, which supports distinct abstract interpretations of similar layouts. This hybrid approach balances expressivity with the need for identifiable types in polymorphic contexts.
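
As a rough illustration of the points above, the following sketch shows inference without annotations, a polymorphic function used at two types, an equality type variable, and the contrast between structural and nominal equivalence. The names id, compose, member, point, metres, and feet are hypothetical, chosen only for this example, and the types in the comments are what an SML '97 implementation would be expected to report.

```sml
(* No annotations: the compiler infers the principal type of each binding. *)
fun id x = x                       (* 'a -> 'a *)
fun compose f g = fn x => f (g x)  (* ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b *)

(* One polymorphic definition, instantiated at two different types. *)
val three = id 3
val hello = id "hello"

(* member uses =, so its element type must admit equality:
   ''a -> ''a list -> bool *)
fun member x [] = false
  | member x (y :: ys) = x = y orelse member x ys

val found = member 2 [1, 2, 3]
(* member 2.0 [1.0, 2.0] would be rejected: real is not an equality type. *)

(* Tuple types are compared structurally, so a synonym changes nothing. *)
type point = int * int
val same : bool = ((1, 2) : point) = (1, 2)

(* Each datatype declaration creates a distinct type, even when the
   constructors carry the same payload. *)
datatype metres = Metres of int
datatype feet = Feet of int
(* Metres 3 = Feet 3 does not type-check: metres and feet differ. *)
```

The commented-out lines are left as comments on purpose, since uncommenting either one should make the compiler reject the program.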

Basic Constructs

Declarations and Expressions

In Standard ML, programs are constructed from declarations and expressions, forming the foundational building blocks of the language. Declarations introduce bindings for values and functions, while expressions represent computations that evaluate to values. Unlike imperative languages, Standard ML has no statements: every computation is an expression that yields a value, which emphasizes functional composition.

Value declarations use the val keyword to bind immutable identifiers to expressions, following the syntax val pat = exp, where pat is a pattern (typically a simple identifier for basic bindings) and exp is an expression. A binding prefixed with rec is recursive, as in val rec f = fn x => e, where e may refer to f; the right-hand side of a val rec binding must be a fn expression. Optional type variables can follow the keyword, as in val 'a x = e, to scope those variables explicitly to that declaration. Function declarations employ the fun keyword with syntax fun f pat1 ... patn = exp, defining a function f that takes arguments matching the patterns and returns the value of exp; multiple clauses can be separated with | for pattern alternatives. A fun declaration is recursive without any extra keyword, and mutually recursive functions are joined with and. All bindings are statically checked for type consistency during elaboration.

Expressions in Standard ML are categorized into atomic and compound forms. Atomic expressions include literals like integers (42), strings ("hello"), and identifiers (x), as well as record expressions {field = value} and tuples. Compound expressions build upon these through application (exp1 exp2), type ascription (exp : ty), or more complex forms such as the conditional if exp1 then exp2 else exp3, which evaluates to exp2 if exp1 is true and exp3 otherwise. The let expression, let dec in exp end, introduces local declarations dec whose bindings are visible only within exp, providing a mechanism for scoped bindings without polluting the enclosing environment. Additionally, case exp of match dispatches on the value of exp using pattern matches, though basic usage here focuses on simple exhaustive cases without advanced destructuring. Expressions have no side effects unless they explicitly involve mutable structures.

Scoping in Standard ML follows lexical rules, where bindings introduced by declarations are visible from their point of definition onward, unless nested within a let expression, which creates a new environment for its body. In let dec in exp end, the declarations in dec shadow outer bindings but do not affect the surrounding scope, preventing unintended captures. Type variables in value or function bindings are explicitly scoped to that declaration, avoiding leakage into broader contexts. This design supports modular programming by isolating local computations.

Standard ML programs are structured as sequences of top-level declarations, either in source files or interactively via a read-eval-print loop (REPL). A file consists of declarations, optionally separated by semicolons, and the program is evaluated by elaborating and evaluating each declaration in order; there are no statements separate from expressions. In the REPL, each declaration is elaborated and evaluated immediately, binding new identifiers into the persistent environment. This expression-centric model, devoid of imperative statements, aligns with the language's functional paradigm, where computation proceeds through value substitutions. For example, a simple program might declare a value and use it in an expression:
val pi = 3.14159;
val circumference = fn r => 2.0 * pi * r;
circumference 5.0;
Here, pi is bound immutably, circumference defines a function, and the final application evaluates to approximately 31.4159, demonstrating how declarations feed into expressive computations. Type inference automatically deduces types like real for these values, though explicit annotations can be added for clarity.
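
The remaining expression forms described in this section (let, if, case, and shadowing) can be sketched in the same way. This is a minimal example under the assumption of an SML '97 implementation with the Basis Library available for Math.sqrt; the function names hypotenuse, absolute, and describe are invented for illustration.

```sml
(* let introduces local bindings that are invisible outside the body. *)
fun hypotenuse (a : real, b : real) : real =
  let
    val a2 = a * a
    val b2 = b * b
  in
    Math.sqrt (a2 + b2)
  end

(* if is an expression: both branches must have the same type. *)
fun absolute (n : int) : int = if n < 0 then ~n else n

(* case dispatches on a value; the wildcard covers everything else. *)
fun describe (n : int) : string =
  case n of
    0 => "zero"
  | 1 => "one"
  | _ => "many"

(* An inner binding shadows an outer one only within its own scope. *)
val x = 1
val y = let val x = 10 in x + 1 end   (* y = 11; the outer x is still 1 *)
```

The type annotations here are optional and included only for readability; removing them should leave the inferred types unchanged.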

Functions and Lambda Expressions

In Standard ML, functions are first-class citizens, meaning they can be defined anonymously, passed as arguments, returned as results, and stored in data structures, enabling powerful functional abstractions. The core mechanism for creating functions is the lambda abstraction, which allows for concise expression of computations without naming. Anonymous functions are defined using the syntax fn pat => exp, where pat is a pattern (often a simple variable) and exp is the body expression; this evaluates to a value of function type, represented as a closure containing the match, the environment in which the function was created, and an initially empty environment reserved for recursive bindings. For example, fn x => x * 2 defines a function that doubles its input, with inferred type int -> int.

Functions written with several argument patterns are curried: a multi-argument function becomes a chain of single-argument functions of type τ₁ → (τ₂ → ... → τₙ → τ), which facilitates partial application. Thus, a function like fun add x y = x + y has type int -> int -> int, and applying it partially as (add 3) yields a new function of type int -> int that adds 3 to its argument. Currying supports flexible composition: add 3 4 evaluates to 7, while add 3 can be bound to a name for reuse.

A function declared with fun may refer to itself within its own definition; no additional keyword is needed. The declaration fun f pat = exp is equivalent to val rec f = fn pat => exp, and in the val rec form the right-hand side must be a fn expression so that the recursive reference is well defined. For instance, fun factorial n = if n = 0 then 1 else n * factorial (n - 1) computes the factorial with a base case and a recursive step, inferring type int -> int. Mutual recursion between multiple functions is supported by chaining bindings with and, such as val rec even = fn n => n = 0 orelse odd (n - 1) and odd = fn n => n <> 0 andalso even (n - 1), enabling definitions where functions call each other.

Function application in Standard ML uses strict call-by-value semantics, where the function and its argument are fully evaluated, left to right, before the function body is entered. This ensures deterministic behavior and a fixed ordering of side effects, as in (fn x => x + 1) (2 + 3), where 2 + 3 evaluates to 5 prior to application, yielding 6. Unlike lazy evaluation, this strict strategy never creates unevaluated thunks implicitly, which is efficient for most computations but requires explicit encoding (for example, with functions of type unit -> 'a) for infinite data structures.
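
The following sketch gathers the forms discussed in this section in valid SML '97 syntax. Note that fun is recursive on its own, while rec appears only after val. The names double, add, addThree, factorial, factorial', isEven, and isOdd are illustrative rather than taken from any library.

```sml
(* An anonymous function bound to a name. *)
val double = fn x => x * 2                 (* int -> int *)

(* A curried function and a partial application of it. *)
fun add x y = x + y                        (* int -> int -> int *)
val addThree = add 3                       (* int -> int *)

(* fun is recursive by itself; no rec keyword is written. *)
fun factorial n = if n = 0 then 1 else n * factorial (n - 1)

(* The same function with val rec; the right-hand side must be a fn. *)
val rec factorial' = fn n => if n = 0 then 1 else n * factorial' (n - 1)

(* Mutual recursion joins the clauses with and. *)
fun isEven n = n = 0 orelse isOdd (n - 1)
and isOdd n = n <> 0 andalso isEven (n - 1)

(* Call-by-value: the argument 2 + 3 is reduced to 5 before the call. *)
val six = (fn x => x + 1) (2 + 3)
```

The isEven and isOdd pair is only meaningful for non-negative arguments; a negative argument would recurse until the integer range is exhausted.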

Built-in Types and Type Synonyms

Standard ML provides a set of built-in types that form the foundation of its type system, including the monomorphic types int for integers, real for floating-point numbers, string for sequences of characters, bool for boolean values (true and false), char for individual characters, and unit as a singleton type with value (). These types have arity zero and are part of the initial static basis, with int, bool, string, char, and unit admitting equality comparisons via the = and <> operators, while real does not, because equality on floating-point values is unreliable.

Among the built-in types, lists and tuples are constructed using predefined type constructors and syntactic sugar for convenience. Lists are parameterized by the type constructor list of arity one, yielding types like 'a list for homogeneous sequences, constructed via the nil value for empty lists and the infix :: operator for cons cells, with derived syntax [e1, e2, ..., en] for list literals where all elements share the same type. Tuples, on the other hand, are derived from record types and expressed as (e1, e2, ..., en) for n ≥ 2, equivalent to the records {1=e1, 2=e2, ..., n=en} with numeric labels, allowing heterogeneous components of arbitrary types. Both lists and tuples admit equality if all their component types do, enabling structural comparisons on them.

Type synonyms in Standard ML allow developers to create aliases for existing types without introducing new type constructors, using the declaration type tyvarseq tycon = ty to name complex types transparently. For example, one might define type intlist = int list to abbreviate lists of integers, or more generally type 'a slist = 'a * 'a list for a type pairing an element with a list; such synonyms are fully interchangeable with their underlying types and preserve properties like equality admissibility. This mechanism supports polymorphism by allowing parametric synonyms, such as type 'a seq = 'a list, facilitating readable code without altering the underlying types.

Equality types play a crucial role in Standard ML, denoting those types for which structural equality is defined and usable in expressions like e1 = e2, including all primitive types except real and built-in constructors like list and tuples when their arguments admit equality. This distinction ensures safe operations, as non-equality types like real avoid imprecise floating-point comparisons in equality tests. Built-in type constructors extend the primitives, notably ref of arity one for creating mutable references with type 'a ref. A reference type always admits equality, whatever its content type: two references are equal exactly when they are the same cell. Other constructors like list are similarly predefined in the initial basis, providing essential polymorphic structures without requiring user definitions.
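
A short sketch of these built-in types and synonyms follows, assuming an SML '97 implementation with the Basis Library (for Real.==). The binding names pair, xs, ys, labelled, and so on are hypothetical; note that list elements are separated by commas.

```sml
(* Tuples are records with numeric labels; #1 selects the first field. *)
val pair = (1, "one")                      (* int * string *)
val first = #1 pair

(* List literals use commas and are sugar for :: and nil. *)
val xs = [1, 2, 3]
val sameList = xs = 1 :: 2 :: 3 :: nil

(* Transparent synonyms: interchangeable with the types they abbreviate. *)
type intlist = int list
type 'a slist = 'a * 'a list
val ys : intlist = xs
val labelled : string slist = ("head", ["tail"])

(* real is not an equality type, so 0.5 = 0.5 is rejected by the type
   checker; Real.== performs the comparison instead. *)
val sameReal = Real.== (0.5, 0.5)

(* unit has the single value (). *)
val u : unit = ()
```

Because synonyms are transparent, ys and xs have the same type and can be compared or passed interchangeably.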

Advanced Features

Algebraic Data Types

Algebraic data types in Standard ML allow programmers to define custom sum and product types through datatype declarations, enabling the construction of complex, type-safe data structures such as trees or lists. A datatype declaration specifies a new type constructor along with one or more value constructors that define how values of the type are formed, combining variants (sums) via alternatives separated by bars and products via tuples or records. For instance, a binary tree can be declared as datatype 'a tree = Leaf of 'a | Node of 'a tree * 'a tree, where Leaf wraps a single value and Node combines two subtrees.

Each constructor that carries an argument acts as a function creating values of the type, with its type determined by the declaration; for the example, Leaf has type 'a -> 'a tree and Node has type 'a tree * 'a tree -> 'a tree. Constructors cannot produce invalid states like null pointers: every value must be built explicitly using the defined constructors, preventing errors from uninitialized data.

Standard ML supports recursive datatypes, where the type refers to itself within constructor definitions, facilitating structures like trees or the built-in list type (though custom lists can also be defined for illustration). Mutual recursion is enabled by joining several datatype bindings with the and keyword inside a single declaration, allowing types to reference each other, such as datatype even = Zero | EvenSucc of odd and odd = OddSucc of even. The constructors are given distinct names because a later constructor with the same name would shadow an earlier one. This mechanism ensures that recursion is handled at the type level without requiring special annotations.

The type system enforces safety by treating each datatype as a distinct new type, incompatible with others, which eliminates implicit conversions and ensures that values arise only through the declared constructors. Pattern matching, covered elsewhere, provides a natural way to deconstruct these types, but the declaration itself guarantees that all values are well-formed.
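
A brief sketch of the tree datatype in use, together with a mutually recursive pair of datatypes, is given below. The helper functions size, toList, evenToInt, and oddToInt and the constructor names EvenSucc and OddSucc are assumptions made for this example.

```sml
datatype 'a tree = Leaf of 'a | Node of 'a tree * 'a tree

(* Constructors are ordinary functions: Leaf : 'a -> 'a tree. *)
val t = Node (Node (Leaf 1, Leaf 2), Leaf 3)

(* Count the leaves of a tree. *)
fun size (Leaf _) = 1
  | size (Node (l, r)) = size l + size r

(* Collect the leaf values from left to right. *)
fun toList (Leaf x) = [x]
  | toList (Node (l, r)) = toList l @ toList r

(* Mutually recursive datatypes are joined with and. *)
datatype even = Zero | EvenSucc of odd
     and odd = OddSucc of even

fun evenToInt Zero = 0
  | evenToInt (EvenSucc n) = 1 + oddToInt n
and oddToInt (OddSucc n) = 1 + evenToInt n

val two = EvenSucc (OddSucc Zero)
```

By construction, every value of type even denotes an even number, so the invariant is carried by the types rather than checked at run time.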

Pattern Matching

Pattern matching in Standard ML provides a mechanism for destructuring composite values and binding variables to their components in a concise and type-safe manner, forming a cornerstone of the language's expressive power for handling structured data. It appears primarily in case expressions and function definitions, where an input is tested against a sequence of patterns, and the body associated with the first successful match is evaluated. This approach enables declarative specification of behavior based on data shape, avoiding explicit type tests or conditionals.

The primary construct for pattern matching is the case expression, with syntax case exp of pat_1 => exp_1 | ... | pat_n => exp_n, where exp is evaluated to a value that is sequentially matched against the patterns pat_i. The first pattern that matches binds any variables in it to the corresponding subvalues of the value of exp, after which the associated exp_i is evaluated in this extended environment. If no pattern matches, the built-in Match exception is raised. A wildcard _ can be used to match any value without binding variables, often serving as a default case. For example, the following computes the length of a list:
fun length l =
  case l of
    [] => 0
  | _ :: rest => 1 + length rest
Here, [] matches an empty list, while _ :: rest matches a non-empty list, binding rest to its tail.

Standard ML supports a rich set of pattern forms to accommodate various data structures. Variables serve as simple patterns that bind to the entire matched value, such as x in case e of x => .... Constructor patterns apply a data constructor to subpatterns, like Cons(h, t) for a cons cell of a user-defined list type, destructuring the value into head h and tail t. Tuple patterns (pat_1, ..., pat_n) match n-tuples by binding each component to the corresponding subpattern, as in case p of (x, y) => x + y. List patterns include literals like [pat_1, ..., pat_n] for fixed-length lists or cons forms pat_1 :: pat_2 for recursive decomposition. As-patterns, written vid as pat, bind vid to the full matched value while simultaneously decomposing it via pat; for instance, p as (d, _) in a dictionary update function binds p to the pair and d to its first element. This form, also known as a layered pattern, allows access to both the whole and its parts without rebuilding the value.

To ensure robustness, Standard ML compilers perform static checks on match expressions and function definitions involving patterns. Exhaustiveness checking verifies that the patterns cover all possible values of the scrutinized expression's type, issuing a warning if they do not; for algebraic data types, this involves analyzing constructor coverage. If a value falls through an inexhaustive match at run time, the Match exception is raised. Redundancy checking detects superfluous rules that cannot be reached because of earlier patterns, such as a clause following a wildcard. These analyses promote reliable code by catching common errors early, and an inexhaustive match produces a warning rather than an error, so compilation proceeds.
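
The pattern forms above can be sketched as follows. The function names swap, shape, insertSorted, and firstOf are invented for this example, and firstOf is deliberately inexhaustive to show the Match exception; a compiler would be expected to warn about it.

```sml
(* Tuple pattern. *)
fun swap (x, y) = (y, x)

(* List patterns: the empty list, a one-element literal, then cons. *)
fun shape [] = "empty"
  | shape [_] = "singleton"
  | shape (_ :: _ :: _) = "longer"

(* Layered pattern: whole names the entire list while x and rest name
   its parts, so the list need not be rebuilt. *)
fun insertSorted (n, []) = [n]
  | insertSorted (n, whole as (x :: rest)) =
      if n <= x then n :: whole else x :: insertSorted (n, rest)

(* A deliberately inexhaustive match: applying it to [] raises Match. *)
fun firstOf (x :: _) = x

(* The handler turns the Match exception into a default value. *)
val recovered = firstOf ([] : int list) handle Match => ~1
```

In insertSorted, the alternative without a layered pattern would have to write n :: x :: rest, reconstructing a cons cell that already exists.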

Higher-Order Functions and Modules

Standard ML supports higher-order functions, which treat functions as first-class citizens that can be passed as arguments to other functions, returned as results, or stored in data structures. This capability enables powerful abstractions for composing computations and processing collections like lists. A canonical example is the map function from the Basis Library's List structure, which applies a given function to each element of a list, producing a new list of transformed elements. Its type is ('a -> 'b) -> 'a list -> 'b list, allowing polymorphic transformation without specifying the exact function in advance. For instance:
- List.map (fn x => x * 2) [1, 2, 3];
> val it = [2, 4, 6] : int list
This promotes reusable code for list processing. Reduction operations like foldl and foldr further exemplify higher-order functions by accumulating a value over a list using a binary operator. The foldl function performs a left-to-right fold with type ('a * 'b -> 'b) -> 'b -> 'a list -> 'b, starting from an initial accumulator and applying the operator to each element. Similarly, foldr folds right-to-left with the same type signature, enabling efficient computations such as summing a list:
- List.foldl (op +) 0 [1, 2, 3];
> val it = 6 : int
These functions are essential for expressing recursive patterns generically.

The module system in Standard ML provides mechanisms for large-scale abstraction through structures, signatures, and functors, allowing code to be organized into modular units with controlled interfaces. Structures encapsulate a collection of value, type, and submodule bindings, forming a self-contained unit. For example, a structure might define a queue abstraction:
structure Queue =
  struct
    type 'a queue = int * 'a list * 'a list
    val empty = (0, [], [])
    (* additional bindings *)
  end
Signatures specify the interface of a structure, declaring the types, values, and substructures it must provide, ensuring type-safe modular composition. A corresponding signature might be:
signature QUEUE =
  sig
    type 'a queue
    val empty : 'a queue
    (* additional specifications *)
  end
A signature can be ascribed to a structure to check that the structure provides everything the signature specifies, as in structure Q :> QUEUE = Queue.

Functors extend structures by parameterizing them over other structures matching a given signature, enabling generic, reusable components. The syntax is functor F (S : SIG) = struct ... end, where SIG is the parameter signature and the body refers to the argument as S. Functor application instantiates the parameter and generates a new structure, as in structure R = F (SomeStruct). The bindings of the result are elaborated from the functor body, supporting reusable abstractions like generic dictionaries parameterized over ordered keys.

To enforce abstraction, Standard ML uses opaque (sealed) ascription, which hides implementation details by preventing access to type information beyond what the signature reveals. Opaque ascription, via :> SIG, matches a structure to a signature and makes every type that the signature leaves unspecified abstract, so that outside code cannot rely on its representation. Transparent ascription (: SIG) restricts the visible components to those in the signature but lets the identities of its types show through. Sharing constraints in signatures, including functor parameter specifications, enforce equality of types or structures across modules, such as sharing type T1 = T2 to ensure consistent type identities.

The open declaration imports all bindings from a structure into the current namespace, facilitating convenient access without qualification, as in open Queue to use empty directly. This aids readability, although it can also shadow existing bindings of the same name.
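
As one possible illustration of signatures, opaque ascription, and functors working together, the sketch below completes a queue along the lines of the fragments above and adds a small functor. It is an assumption-laden variant rather than the structure shown earlier: the two-list representation, the enqueue and dequeue operations, and the ORDERED, MaxFn, and IntMax names are all hypothetical.

```sml
signature QUEUE =
  sig
    type 'a queue
    val empty   : 'a queue
    val enqueue : 'a * 'a queue -> 'a queue
    val dequeue : 'a queue -> ('a * 'a queue) option
  end

(* Opaque ascription: clients cannot see that a queue is a pair of lists. *)
structure Queue :> QUEUE =
  struct
    type 'a queue = 'a list * 'a list
    val empty = ([], [])
    fun enqueue (x, (front, back)) = (front, x :: back)
    fun dequeue ([], []) = NONE
      | dequeue ([], back) = dequeue (rev back, [])
      | dequeue (x :: front, back) = SOME (x, (front, back))
  end

signature ORDERED =
  sig
    type t
    val compare : t * t -> order
  end

(* A functor: a structure parameterized over any structure matching ORDERED. *)
functor MaxFn (O : ORDERED) =
  struct
    fun max (a, b) = if O.compare (a, b) = LESS then b else a
  end

(* Functor application with an anonymous argument structure. *)
structure IntMax = MaxFn (struct type t = int val compare = Int.compare end)

val q = Queue.enqueue (2, Queue.enqueue (1, Queue.empty))
val front = case Queue.dequeue q of SOME (x, _) => SOME x | NONE => NONE
val bigger = IntMax.max (3, 7)
```

Because Queue is sealed with :>, an expression such as #1 q should be rejected outside the structure, whereas with transparent ascription the pair representation would remain visible.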

Imperative Programming

Mutable State with References

Standard ML provides limited support for imperative programming through mutable references, which serve as the primary mechanism for introducing state into an otherwise functional language. A reference is created by applying the ref constructor to an expression e, resulting in a mutable cell initialized with the value of e; the type of this reference is 'a ref where e has type 'a. Dereferencing occurs with the prefix operator ! applied to a reference r, yielding the current value stored in the cell, of type 'a if r has type 'a ref. Assignment to a reference uses the infix operator :=, as in r := e, which updates the cell with the value of e (requiring e to have type 'a) and returns unit. These operations enable side effects while integrating with the type system, ensuring that mutations are type-safe.

References are polymorphic in their content type: the single type constructor ref yields a type 'a ref for any content type 'a. However, the value restriction limits generalization to maintain soundness. A binding receives a polymorphic type only if its right-hand side is a non-expansive expression (a syntactic value such as a constant, a variable, or a fn abstraction), and an application of ref is expansive. For instance, in val r = ref [] the type of r is not generalized to forall 'a. 'a list ref; the cell must end up with a single element type. In contrast, the abstraction fn x => ref x is a value and receives the polymorphic type forall 'a. 'a -> 'a ref, since each call allocates a fresh cell at one particular type. This restriction avoids unsoundness, such as storing a value of one type in a polymorphic reference and reading it back at an incompatible type.

Imperative control structures in Standard ML are derived from functional primitives combined with references. Sequencing uses ; inside parentheses or a let body: (e1; e2) evaluates e1 for its side effects, discards its value, and then returns the value of e2, so the overall type is that of e2. Loops are available through the derived form while e1 do e2, which the Definition expresses in terms of a recursive function rather than as a primitive; the same effect can be written directly using a recursive function and references to maintain mutable state across iterations. For example, a counter-based loop might use a reference to track progress:
sml
let
  val count = ref 0
  fun loop () =
    if !count < 10 then
      (count := !count + 1; loop ())
    else ()
in
  loop ()
end
Here, the recursive loop function tests the condition on the dereferenced reference, updates it via assignment, and recurses; because the recursive call is in tail position, implementations that optimize tail calls run the loop in constant stack space. Such constructs rely on recursive function bindings, written with val rec or, more commonly, its derived form fun.

This design positions references as a minimal extension to the functional core of Standard ML, confining mutation to explicit imperative constructs. Mutability is visible in types, since a mutable cell always has a type of the form 'a ref, although the type system does not track side effects or rule out aliasing between references. As a result, most Standard ML programs can rely on functional purity for reasoning and compositionality, resorting to references only where imperative state is essential, such as in performance-critical sections or interfaces to external systems.
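The operations described in this section can be put together in a short sketch. The names sumTo, mkRef, counter, and label are hypothetical and chosen only for illustration; the loop uses the derived form while e1 do e2 from the Definition, and mkRef illustrates that a lambda abstraction around ref is still generalized.

```sml
(* Sum 1..n with two references and the derived form `while e1 do e2`. *)
fun sumTo n =
  let
    val i = ref 1        (* loop counter *)
    val total = ref 0    (* running sum *)
  in
    while !i <= n do
      (total := !total + !i;
       i := !i + 1);
    !total
  end

(* fn x => ref x is a syntactic value, so mkRef has type 'a -> 'a ref. *)
val mkRef = fn x => ref x
val counter = mkRef 0         (* int ref *)
val label = mkRef "start"     (* string ref *)
val () = label := "done"      (* assignment returns unit *)
```

Each call to mkRef allocates a fresh cell at one concrete type, which is why counter and label can coexist with different content types.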

Exceptions and Error Handling

Standard ML provides an exception mechanism for handling runtime errors and performing non-local control transfers, allowing programs to signal and recover from exceptional conditions without relying on return codes or global state. Exceptions are values of the built-in type exn, an extensible type whose constructors can only be introduced by exception declarations. An exception constructor may carry a payload of any type, and all exceptions are handled uniformly through pattern matching.

Exceptions are declared using the exception keyword, which introduces a new constructor for the exn type. A simple exception without payload is declared as exception E, yielding a value E of type exn. For exceptions carrying data, the syntax exception E of τ specifies a payload of type τ, giving the constructor E the type τ -> exn; for example, exception Fail of string allows raising a string message. Multiple exceptions can be declared in a single declaration using and, and an exception may be aliased via exception E = E'. Exception declarations obey the same scoping rules as other declarations, so they can be local to a let expression or a structure.

To raise an exception, the raise expression is used: raise exp, where exp evaluates to an exn value, immediately abandons the current evaluation and propagates the exception up through the call stack. For instance, raise Fail "error occurred" signals a failure with a descriptive message. Propagation continues until a handler matches the exception or, if none does, the program terminates.

Handling occurs via the handle construct: exp handle match, where match consists of pattern-matching rules of the form pat1 => exp1 | ... | patn => expn. The expression exp is evaluated first; if it returns a value, the handler is ignored. If it raises an exception e, the patterns are matched against e in order, using the same rules as in case expressions, and the first successful match evaluates the corresponding expi. Unmatched exceptions propagate further up the stack. This mechanism allows selective recovery, such as logging an error and continuing with a default value.

Standard ML includes several built-in exceptions in the initial basis, raised automatically by language constructs or library functions for common errors:
  • Bind: Raised on pattern-binding failures, such as in non-exhaustive val bindings.
  • Match: Raised on non-exhaustive pattern matches in case expressions or function clauses.
  • Overflow: Raised for arithmetic operations exceeding representable bounds.
  • Div: Raised for division by zero.
  • Interrupt: Raised on user interrupts, like Ctrl+C in interactive environments; this exception is provided by some implementations rather than required by the 1997 standard.
These exceptions have type exn and can be handled like user-defined ones; the functions exnName and exnMessage return string representations for debugging. Exception constructors may carry payloads (for example, Fail carries a string), and because every exception has the single type exn, one handler can match any combination of them. The following example demonstrates declaration, raising, and handling:
exception DivideByZero;

fun safeDiv (x, y) =
  if y = 0 then raise DivideByZero
  else x div y;

val result = safeDiv (10, 0) handle DivideByZero => 0;
Here, safeDiv raises DivideByZero when the divisor is zero; the handler attached to the call catches it and yields 0, so evaluation continues normally from that point. In this role, handle is the Standard ML counterpart of a try-catch construct in other languages.
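As a further illustration, the sketch below shows an exception that carries a payload and a handler with more than one rule. ParseError, parseDigit, digitOr, and describe are hypothetical names introduced for this example; String.sub, Char.isDigit, Char.ord, and exnMessage are Basis Library functions.

```sml
(* A payload-carrying exception: a message and the offending position. *)
exception ParseError of string * int

fun parseDigit (s, i) =
  let val c = String.sub (s, i)   (* raises Subscript if i is out of range *)
  in
    if Char.isDigit c then Char.ord c - Char.ord #"0"
    else raise ParseError ("not a digit", i)
  end

(* One handler, two rules: either failure falls back to a default. *)
fun digitOr default (s, i) =
  parseDigit (s, i)
  handle ParseError _ => default
       | Subscript => default

(* Pattern matching on exn values to build a diagnostic string. *)
fun describe (ParseError (msg, pos)) = msg ^ " at " ^ Int.toString pos
  | describe other = exnMessage other
```

Any exception other than ParseError or Subscript would pass through digitOr untouched and continue to propagate.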

Standard Library

Basis Library Overview

The Standard ML Basis Library is a standardized collection of modules that forms the initial environment for programs, ensuring portability and consistency across implementations. Defined as part of the 1997 revision of the standard, it specifies a set of required modules and signatures so that code behaves predictably regardless of the underlying system. The library was developed to address the limitations of earlier versions by providing a rich, general-purpose foundation for applications, with an emphasis on functional idioms and type-safe operations.

The library is organized into structures and signatures: structures implement the functionality, and signatures define the interfaces that guarantee consistent behavior. Conformance requires implementations to provide all mandatory modules, such as those for basic data types and operations, while optional modules extend capabilities for specific environments such as POSIX. Examples of core modules include Int for integer handling, List for list manipulation, and String for string operations, each matching a predefined signature (INTEGER, LIST, and STRING, respectively). This structure supports portability by abstracting platform-specific details, so that the same code compiles and behaves identically on compliant systems.

Key modules cover essential areas, including general-purpose facilities such as input/output (I/O) and time. For I/O, structures such as TextIO and BinIO provide text and binary stream operations for reading from and writing to files and the standard streams in a type-safe manner. The Time structure handles time intervals and timestamps, supporting operations such as arithmetic and comparison of time values for timing applications.

For integer arithmetic, the Int structure (matching the INTEGER signature) offers operations such as addition (+), multiplication (*), and division (div); related structures such as LargeInt provide larger integer ranges, while unsigned arithmetic is provided by the Word structures. For lists, the List structure includes functions such as @ (appending two lists) and rev (reversing a list), together with higher-order functions such as map and foldl, supporting functional composition without mutable state.

Pervasive elements of the Basis Library, such as the int type or the length function, are available unqualified in the top-level environment, while other identifiers are reached through their structures. Names can be qualified, as in List.map, to avoid conflicts in larger programs, or a declaration such as open List can bring all identifiers from the List structure into scope. This approach aligns with SML's module system, allowing Basis components to be used directly inside user-defined modules when building portable applications.
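The difference between pervasive, qualified, and opened names can be seen in a small sketch. The value names (xs, n, total, shown, firstTwo) are hypothetical; the library functions are taken from the Basis Library's List, Int, and String structures.

```sml
val xs = [3, 1, 2]

(* length is pervasive: available at top level without qualification. *)
val n = length xs

(* Qualified names make the origin of each function explicit. *)
val total = List.foldl (op +) 0 xs
val shown = String.concatWith ", " (List.map Int.toString (List.rev xs))

(* open brings a structure's identifiers into scope; local limits its reach. *)
local
  open List
in
  val firstTwo = take (xs, 2)
end
```

Restricting open with local is one way to keep a larger program's namespace from being flooded by a structure's identifiers.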

Specialized Utility Modules

The Standard ML Basis Library includes several specialized modules that extend the core functionality with input/output operations, mathematical computations, date and time handling, operating system interactions, character and string manipulation, and low-level data structures. These modules provide domain-specific utilities that enable programmers to perform tasks such as file handling, trigonometric calculations, and bitwise operations while maintaining the language's type safety and functional paradigm.

I/O Modules

The TextIO structure facilitates text-based input and output using streams of characters and strings. It defines types such as instream for input streams and outstream for output streams, supporting operations like openIn to open a file for reading, inputLine to read a newline-terminated line (returning SOME line, or NONE at end-of-stream), and output to write a string to an output stream. Additional functions include openOut for creating or truncating files, openAppend for appending to files, and print for writing to standard output with automatic flushing; streams are block-buffered by default for files and line-buffered for interactive devices.

Complementing TextIO, the BinIO structure handles binary input and output of 8-bit bytes (Word8.word elements). It provides the same stream types (instream and outstream) and functions such as openIn, openOut, and openAppend, with imperative I/O operations for reading (input, inputN) and writing (output, output1) binary data. This module is essential for low-level file operations where byte-level precision is required; failures such as opening a non-existent file are reported through the Io exception.
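A minimal sketch of line-oriented file I/O with TextIO follows. The helper names writeLines and readLines are hypothetical; the sketch assumes the path is writable and omits cleanup on failure, so an Io exception would leave the stream open.

```sml
(* Write each string as one line, creating or truncating the file. *)
fun writeLines (path, lines) =
  let val out = TextIO.openOut path
  in
    List.app (fn l => TextIO.output (out, l ^ "\n")) lines;
    TextIO.closeOut out
  end

(* Read all lines; inputLine keeps the trailing newline of each line. *)
fun readLines path =
  let
    val ins = TextIO.openIn path
    fun loop acc =
      case TextIO.inputLine ins of
          SOME line => loop (line :: acc)
        | NONE => List.rev acc
    val lines = loop []
  in
    TextIO.closeIn ins;
    lines
  end
```

The accumulator in loop keeps the reading function tail-recursive, so long files do not grow the call stack.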

Math and Utility Modules

The Math structure offers fundamental mathematical constants and functions operating on the real type, following IEEE 754-1985 floating-point semantics. The constants are pi (approximately 3.14159) and e (approximately 2.71828); the functions include sin, cos, and tan for trigonometric operations in radians, sqrt for square roots (returning NaN for negative inputs), exp for the exponential function, ln and log10 for logarithms (returning NaN for negative arguments and negative infinity at zero), and pow for general exponentiation, which returns NaN for negative bases with non-integer exponents and handles other edge cases per IEEE 754. These operations support scientific computing and numerical algorithms within SML programs.

For calendar and time management, the Date structure provides tools for representing and manipulating dates in a specific time zone. It defines an abstract date type constructed via the date function from components such as year, month, day, hour, minute, second, and an optional offset; the conversion functions fromTimeLocal and fromTimeUniv derive dates from time values in the local time zone and in UTC, respectively, and toTime performs the reverse conversion. Formatting is handled by fmt using format specifiers (e.g., %Y for the four-digit year) and by toString, which produces a standard 24-character representation; parsing is handled by scan, which reads from character streams, and fromString, which reads from strings, both returning an option so that failures yield NONE. This module integrates with the Time structure for comprehensive date and time operations.

The OS structure serves as a facade for operating system interactions, grouping the substructures OS.FileSys for file and directory operations, OS.IO for polling, OS.Path for pathname syntax, and OS.Process for process control and environment access. It declares the SysErr exception for runtime errors and provides utilities such as errorMsg to retrieve descriptive strings for system errors, errorName for unique identifiers, and syserror for converting error names to optional codes. These facilities enable portable access to system services, such as file attribute queries in OS.FileSys or process termination in OS.Process, while abstracting platform-specific details.
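The following sketch exercises a few Math functions. The helpers hypot, degToRad, and approxEq are hypothetical names for this example; since real is not an equality type, results are compared within a small tolerance rather than with =.

```sml
(* Length of the hypotenuse of a right triangle. *)
fun hypot (a, b) = Math.sqrt (a * a + b * b)

(* Math.sin and friends expect radians. *)
fun degToRad d = d * Math.pi / 180.0

(* Tolerance-based comparison for reals. *)
fun approxEq (x, y) = Real.abs (x - y) < 1.0E~9

val h = hypot (3.0, 4.0)
val s = Math.sin (degToRad 90.0)
val undefined = Math.sqrt ~1.0    (* NaN rather than an exception *)
```

Because NaN compares unequal to everything, including itself, Real.isNan is the reliable way to detect results such as undefined.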

Char and String Modules

The Char structure supplies predicates and conversions for individual characters, treating them as 8-bit values, with ord and chr for conversion to and from integer codes (chr raises Chr for out-of-range inputs), toLower and toUpper for case conversion, and toString for escaped representations. Predicates include isDigit to check for digits ('0' to '9'), isAlpha for letters, isAlphaNum for alphanumeric characters, isSpace for whitespace, and isCntrl for control characters, facilitating text validation and processing.

Building on Char, the String structure manages immutable sequences of characters with functions for conversion and extraction. Conversions include str to create a single-character string and implode to build a string from a character list, while explode reverses this by producing a char list. Substring operations include sub to access the i-th character (zero-based), substring to extract a contiguous segment of length j starting at i, and extract for the substring from i either to the end or of a specified length; these support string manipulation without mutation.
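A short sketch combining Char and String operations is given below. The functions capitalize and isNumber, and the value names, are hypothetical; all library calls come from the Basis Library.

```sml
(* Upper-case the first character and lower-case the rest. *)
fun capitalize s =
  case String.explode s of
      [] => ""
    | c :: cs => String.implode (Char.toUpper c :: List.map Char.toLower cs)

(* True for non-empty strings consisting only of decimal digits. *)
fun isNumber s = s <> "" andalso List.all Char.isDigit (String.explode s)

val name = "Standard ML"
val first = String.sub (name, 0)             (* zero-based indexing *)
val ml = String.substring (name, 9, 2)       (* start 9, length 2 *)
val rest = String.extract (name, 9, NONE)    (* from 9 to the end *)
```

All of these functions return new values; the original string name is never modified.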

Extensions in Basis

The CharArray structure extends array facilities to mutable sequences of characters, providing low-level data handling akin to the general Array structure but specialized for chars. It includes array to create an array of a given length initialized with a char, fromList for list-to-array conversion, sub for indexed access, update for modification, and vector to convert to an immutable vector, useful for performance-critical text buffering.

Similarly, the Word structure supports unsigned machine-word integers with fixed precision (wordSize bits), enabling bitwise and arithmetic operations for low-level data. Key features include modular arithmetic via +, -, *, div, and mod; bitwise functions like andb, orb, xorb, and notb; shifts with << (left), >> (logical right), and ~>> (arithmetic right); and conversions such as toInt/fromInt and toLarge/fromLarge for interfacing with other numeric types.
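The sketch below shows a few Word and CharArray operations in use. The value names are hypothetical; word literals are written with the 0w prefix (0wx for hexadecimal), and the shift functions are called in qualified prefix form so that no infix declarations are assumed.

```sml
(* Bitwise operations on unsigned words. *)
val flags = Word.orb (0wx30, Word.<< (0w1, 0w2))   (* 0x30 or (1 shifted left by 2) *)
val lowNibble = Word.andb (flags, 0wx0F)
val asInt = Word.toInt flags

(* A small mutable character buffer, frozen into a string at the end. *)
val buf = CharArray.array (3, #"-")
val () = CharArray.update (buf, 1, #"x")
val text = CharArray.vector buf
```

For CharArray, the vector type is string, so CharArray.vector yields an ordinary immutable string.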

Implementations

SML/NJ

Standard ML of New Jersey (SML/NJ) is a prominent implementation of the Standard ML '97 programming language, originally developed in spring 1986 by David MacQueen at Bell Laboratories and Andrew Appel at Princeton University. It emerged as a dedicated compiler aimed at producing highly optimized native code, evolving from earlier ML systems and serving as the primary reference implementation due to its comprehensive support for the language definition and ongoing maintenance by a core group including MacQueen and John Reppy.

Key features of SML/NJ include its interactive read-eval-print loop (REPL), started with the sml command, which enables incremental development, loading of modules, and immediate execution of expressions in an environment with full type checking and garbage collection. The system supports compilation to native code through the Compilation Manager (CM), which uses dependency analysis to minimize recompilation and relies on the MLRISC backend for targets like AMD64. Beyond conformance to Standard ML '97, SML/NJ offers extensions such as first-class continuations via the Cont structure and lazy suspensions (delay and force) via the Susp structure, supporting advanced control abstractions like coroutines.

For performance, SML/NJ employs a two-level bootstrap process, in which an initial lightweight stage builds the full compiler, enabling self-hosting and generation of high-quality code across platforms. The latest release, version 110.99.9 from November 4, 2025, focuses on bug fixes and stability improvements, including enhancements to 64-bit support on macOS. SML/NJ is distributed as free, open-source software under the BSD 3-Clause License and is cross-platform, with official support for Unix-like systems (including Linux), Windows, and macOS, including recent additions for Arm64 architectures. Source distributions and installers are available from the official website.

MLton

MLton is a whole-program optimizing compiler for the Standard ML programming language, designed to generate small, standalone executables with high runtime performance and no dependency on an interactive environment. Unlike interactive systems, MLton performs a complete analysis and optimization of the entire program before code generation, enabling aggressive optimizations that eliminate overhead from higher-order features and produce native code. The resulting executables are self-contained and use techniques such as unboxed native representations for integers, reals, and arrays, as well as fast arbitrary-precision arithmetic via GMP integration.

The compiler's optimization pipeline includes several advanced transformations. Key steps are defunctionalization and closure conversion, which translate higher-order functional code into a first-order static single assignment (SSA) form, eliminating closures and enabling subsequent optimizations like constant propagation and dead code elimination. MLton also offers multiple code generation backends and garbage collection strategies, which can be selected according to the target architecture to reduce memory and execution overhead. Built-in profiling covers allocation, time, and stack usage via the mlprof tool, which relies on instrumented builds to report function-level metrics, so non-profiled builds are not slowed down.

MLton adheres strictly to the SML'97 standard, ensuring full conformance to the revised Definition of Standard ML, while providing extensions such as a C foreign function interface (FFI) for integrating with system libraries and utilities like lexer and parser generators (mllex and mlyacc). The project originated in 1997 as an independent compiler effort, with the first release under the name MLton in 1999, and has been open-source throughout its development. As of 2025, MLton remains actively maintained, with the latest release (20241230) in December 2024 and ongoing updates to support modern platforms.

Poly/ML

Poly/ML is a high-performance implementation of Standard ML, originally developed by David Matthews at the University of Cambridge's Computer Laboratory. It derives its name from an experimental language called Poly, in which the initial version was written, and builds on earlier efforts like Cambridge ML to provide efficient compilation and support for large-scale programming. Its design targets demanding applications through advanced runtime features.

A core strength of Poly/ML lies in its support for parallelism, achieved via native operating system threads with primitives such as fork, mutexes, and condition variables. This enables full multiprocessor utilization, with the thread library and garbage collector both optimized for concurrent execution on shared-memory systems. Higher-level parallelism abstractions can be built on top of these threads, such as future values in Isabelle/ML, which provide implicit scheduling and parallel evaluation. Additionally, its garbage collector is engineered for efficiency with large heaps, accommodating memory-intensive workloads up to the terabyte scale through 64-bit addressing and dynamic heap adjustment.

Poly/ML achieves full conformance to the Standard ML '97 specification, including the complete Basis Library, ensuring portability of Standard ML code across implementations. Beyond the standard, it introduces extensions primarily within the PolyML structure, such as run-time access to the compiler, support for foreign calls, export of compiled code to object files, and the Profiling submodule for measuring execution time and memory allocation. Other enhancements include explicit garbage collection triggers via fullGC, heap analysis functions like objSize, and thread-safe compilation to support parallel code generation. These features extend the language without compromising core compatibility, enabling integration with external systems and performance tuning.

As of 2025, Poly/ML remains actively maintained, with version 5.9.2 released in August 2025 and development continuing through community contributions. It is particularly valued for its robustness in handling complex, resource-heavy tasks, serving as the backend for systems like Isabelle and HOL4, where parallel proof checking and large-scale theorem proving demand efficient memory and concurrency management.

Other Notable Implementations

Moscow ML is a lightweight implementation of Standard ML, designed primarily for educational and research purposes, featuring an interactive interpreter and a compiler that generates standalone executables. It fully conforms to the 1997 revision of Standard ML, including the modules language, and implements most of the Standard ML Basis Library, though it omits functional stream I/O. Originally developed by Peter Sestoft, it is portable across Unix-like systems and Windows, with a small footprint that facilitates quick startup times in interactive sessions, making it suitable for teaching. Moscow ML includes extensions such as higher-order functors and first-class modules, which enhance its expressiveness beyond the strict standard while maintaining compatibility.

MLKit is a research-oriented compiler toolkit for Standard ML, originating from the University of Aarhus, that emphasizes region-based memory management as an alternative to traditional garbage collection. This approach uses region inference to allocate objects in hierarchical regions that are deallocated explicitly or automatically at region boundaries, reducing runtime overhead and enabling predictable memory usage in performance-critical applications. MLKit supports the full SML'97 language definition and a large portion of the Basis Library, and compiles to native code. Its design prioritizes static analysis of region lifetimes, allowing programmers to influence memory behavior through annotations while preserving the language's safety guarantees.

CakeML provides a verified implementation of a significant subset of Standard ML, tailored for formal verification and certified software development, with its entire compiler pipeline mechanically proven correct in the HOL4 theorem prover. This subset encompasses core features like datatypes, pattern matching, higher-order functions, references, exceptions, polymorphism, modules, signatures, and arbitrary-precision integers, but excludes records, I/O operations, and functors to simplify verification. Developed collaboratively by researchers at several institutions, CakeML generates certified machine code, enabling end-to-end guarantees from source code to execution, including a bootstrapping compiler and an interactive REPL. Its focus on formal verification supports applications in trustworthy systems, such as certified operating system components or theorem provers.

These implementations exhibit trade-offs in performance, features, and conformance to the 1997 Standard ML definition. Moscow ML prioritizes lightweight interactivity and educational accessibility, offering full language conformance with rapid startup but less aggressive optimization than whole-program compilers, achieving reasonable execution speeds for interpretive use. MLKit combines research innovation in memory management (regions can yield lower latency than garbage collection in structured programs) with complete SML'97 support, though it requires familiarity with region annotations for optimal efficiency. CakeML, while limited to a verified subset (approximately 80% of SML features), offers the highest level of assurance at the cost of omitted features and potential performance overhead, making it better suited to safety-critical domains than to general-purpose coding. Overall, all three maintain high fidelity to the 1997 standard where applicable, with varying Basis Library coverage: near-complete for Moscow ML and MLKit, and partial for CakeML because of its deliberately restricted scope.

Code Examples

Simple Programs

Standard ML's simple programs highlight its functional nature, concise syntax, and reliance on the Basis Library for common operations like output and list processing. These examples illustrate basic expression evaluation, function definitions via pattern matching, and the use of higher-order functions, all without explicit type declarations thanks to the language's type inference mechanism. A fundamental introductory program is the "Hello, World!" example, which demonstrates output using the TextIO module from the Basis Library. The following code prints a greeting to the standard output stream:
sml
TextIO.print("Hello, world!\n");
Upon execution in an interactive environment like SML/NJ, this evaluates to () (the single value of type unit, used for expressions evaluated only for their effect) and displays "Hello, world!" followed by a newline. To suppress the printing of the result in the REPL, the expression can be bound to an underscore:
sml
val _ = TextIO.print("Hello, world!\n");
The TextIO.print function has the signature string -> unit and writes the given string directly to the standard output stream. Another basic example is a recursive function for computing the factorial of a non-negative integer, defined using pattern matching on the argument as per the core language syntax. The code is:
sml
fun fact 0 = 1
  | fact n = n * fact (n - 1);
This defines fact with a base case where fact 0 returns 1, and a recursive case multiplying n by fact (n - 1) for n > 0. The type is inferred as int -> int. For instance, evaluating fact 5 yields 120, demonstrating the language's support for straightforward recursion without explicit loops. Simple list operations showcase Standard ML's built-in support for functional transformations via the List structure in the Basis Library. Consider mapping a function to double each element in an integer list:
sml
val doubled = List.map (fn x => x * 2) [1, 2, 3, 4];
The List.map function, with signature ('a -> 'b) -> 'a list -> 'b list, applies the anonymous function (fn x => x * 2) (of type int -> int) to each element from left to right, producing [2, 4, 6, 8] of type int list. Similarly, filtering a list to retain only even numbers uses List.filter:
sml
val evens = List.filter (fn x => x mod 2 = 0) [1, 2, 3, 4, 5, 6];
With signature ('a -> bool) -> 'a list -> 'a list, this applies the predicate (fn x => x mod 2 = 0) (type int -> bool) and returns [2, 4, 6] of type int list, preserving the original order of matching elements. These operations exemplify immutable list handling central to the language.
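These list operations compose naturally. As a small sketch, the hypothetical value sumEvenSquares below chains List.filter with List.foldl, whose folding function receives the current element and the accumulator as a pair.

```sml
(* Keep the even numbers, then add up their squares. *)
val sumEvenSquares =
  List.foldl (fn (x, acc) => acc + x * x) 0
             (List.filter (fn x => x mod 2 = 0) [1, 2, 3, 4, 5, 6])
```

The inferred type is int, and no intermediate variable or mutable accumulator is needed.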

Algorithm Implementations

Standard ML's functional style lends itself naturally to expressing classic sorting algorithms using recursion, pattern matching, and higher-order functions on lists. These implementations highlight the language's emphasis on immutability and declarative style, avoiding mutable state or loops. Below, representative examples of insertion sort, merge sort, and quicksort are provided, demonstrating core idioms such as list cons (::) and pattern matching. Efficiency considerations, including tail-recursive optimizations via accumulators, are also discussed to align with the language's support for safe, performant recursion.

Insertion Sort

Insertion sort builds a sorted list incrementally by inserting elements into their correct positions. This showcases pattern matching on lists, where the empty list serves as the base case and the :: operator allows recursive insertion. The insert function places a value x into a sorted list ys, comparing x with the head y of ys:
fun insert x [] = [x]
  | insert x (y :: ys) = if x <= y then x :: y :: ys
                         else y :: insert x ys
The full isort function then iterates over the input list, inserting each element into an accumulating sorted result:
fun isort [] = []
  | isort (x :: xs) = insert x (isort xs)
This recursive definition is intuitive but not tail-recursive, potentially leading to stack overflow for large lists due to the pending insert call. To optimize, a tail-recursive variant uses an accumulator to build the result:
fun isort_tail xs =
    let fun insort acc [] = acc
          | insort acc (x :: xs) = insort (insert x acc) xs
    in insort [] xs end
This approach maintains ascending order and shows how a small helper function such as insert composes with an accumulator-passing loop. In implementations that optimize tail calls, the outer loop runs in constant stack space, although insert itself still recurses to a depth proportional to the length of the accumulated list; the running time remains O(n²).
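The functions above use <=, which defaults to int when nothing else constrains it. One possible generalization, sketched here with the hypothetical names insertBy and isortBy, passes the ordering as a function argument so that the same code sorts any element type.

```sml
(* Insert x into a list already sorted with respect to le. *)
fun insertBy le x [] = [x]
  | insertBy le x (y :: ys) =
      if le (x, y) then x :: y :: ys
      else y :: insertBy le x ys

(* Fold the input into a sorted accumulator, one element at a time. *)
fun isortBy le xs = List.foldl (fn (x, acc) => insertBy le x acc) [] xs

val ascending = isortBy (op <=) [3, 1, 2]
val descending = isortBy (op >=) [3, 1, 2]
val byLength =
  isortBy (fn (a, b) => String.size a <= String.size b) ["ccc", "a", "bb"]
```

Here isortBy has the polymorphic type ('a * 'a -> bool) -> 'a list -> 'a list, so the comparison alone decides both the element type and the direction of the sort.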

Merge Sort

Merge sort employs a divide-and-conquer strategy, splitting lists recursively and merging sorted halves. It guarantees O(n log n) time, making it suitable for demonstrating recursive decomposition in SML; note that the alternating split used here does not preserve the relative order of equal elements, so this variant is not stable. First, the split function divides a list into two roughly equal parts by alternating elements:
fun split [] = ([], [])
  | split [x] = ([x], [])
  | split (x :: y :: xs) = let val (l, r) = split xs
                            in (x :: l, y :: r) end
The merge function combines two sorted lists, using pattern matching to compare heads and recurse on tails:
fun merge [] ys = ys
  | merge xs [] = xs
  | merge (x :: xs) (y :: ys) = if x <= y then x :: merge xs (y :: ys)
                                else y :: merge (x :: xs) ys
The complete msort ties these together:
fun msort [] = []
  | msort [x] = [x]
  | msort xs = let val (l, r) = split xs
                in merge (msort l) (msort r) end
For tail recursion, an accumulator can be introduced in the merge step, though the divide phase remains inherently non-tail-recursive. Some implementations may further optimize this via deforestation, reducing intermediate list allocations. This example underscores SML's strength in composing pure functions for parallelizable algorithms.
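The accumulator-based merge mentioned above might look like the following sketch. The name mergeTail and its tupled argument are choices made for this example; List.revAppend (acc, ys) reverses acc onto the front of ys in a single pass.

```sml
(* Tail-recursive merge: build the result in reverse, then flip it once. *)
fun mergeTail (xs, ys) =
  let
    fun go ([], ys, acc) = List.revAppend (acc, ys)
      | go (xs, [], acc) = List.revAppend (acc, xs)
      | go (x :: xs, y :: ys, acc) =
          if x <= y then go (xs, y :: ys, x :: acc)
          else go (x :: xs, ys, y :: acc)
  in
    go (xs, ys, [])
  end
```

Every recursive call to go is in tail position, so merging long lists does not deepen the call stack, at the cost of one extra traversal to reverse the accumulator.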

Quicksort

Quicksort selects a pivot and partitions the list into elements less than or equal to the pivot and elements greater than it, recursing on each part. In SML, partitioning can use higher-order functions like List.filter, handling duplicates by placing elements equal to the pivot in the lower partition. Using the List structure's filter:
fun qsort [] = []
  | qsort [x] = [x]
  | qsort xs =
      let val pivot = hd xs
          val rest = tl xs
          val lows = List.filter (fn y => y <= pivot) rest
          val highs = List.filter (fn y => y > pivot) rest
      in qsort lows @ pivot :: qsort highs end
The @ operator concatenates the sorted partitions, while hd and tl extract the pivot and the remaining elements. This version averages O(n log n) time but degrades to O(n²) on already sorted inputs; randomized pivot selection can mitigate this, but it requires an implementation-specific random number library, since the Basis Library does not define one, and is not shown here for brevity. These implementations illustrate how SML's list operations foster concise, correct algorithms without side effects.
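An alternative formulation, sketched below under the hypothetical name qsortP, takes the pivot by pattern matching instead of hd and tl and splits the remainder in a single pass with List.partition from the Basis Library.

```sml
(* Quicksort with the head as pivot and a one-pass partition. *)
fun qsortP [] = []
  | qsortP (pivot :: rest) =
      let
        (* lows: elements <= pivot; highs: elements > pivot *)
        val (lows, highs) = List.partition (fn y => y <= pivot) rest
      in
        qsortP lows @ pivot :: qsortP highs
      end
```

Because :: and @ are both right-associative at the same precedence, the final expression parses as qsortP lows @ (pivot :: qsortP highs). The worst case on sorted input is unchanged.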

Advanced Language Features

Standard ML provides advanced mechanisms for abstraction and error management, including modules and functors for structuring code and exceptions for handling errors. These features enable the creation of reusable components and robust programs that gracefully manage exceptional conditions. The language's module system, defined in the revised standard, supports generic programming through functors, allowing structures to be instantiated with different types while preserving type safety. Exceptions, also part of the core definition, permit the interruption of normal control flow to signal and recover from errors like arithmetic overflows or invalid operations.

Error Handling with Exceptions

Standard ML's exception system uses the raise keyword to signal errors and the handle clause to catch them, providing lightweight error handling within ordinary expressions. For instance, the built-in division operator div raises the Div exception when dividing by zero, as specified for the INTEGER signature in the Basis Library. To demonstrate recovery, consider a safe division function that catches this exception and returns a default value:
sml
fun safeDiv (x, y) =
    x div y handle Div => 0;
Evaluating safeDiv (10, 0) yields 0 instead of terminating the program, illustrating how handle matches the raised exception and substitutes an alternative computation. A handler need not be exhaustive: any exception that matches none of its patterns simply continues to propagate to enclosing handlers.
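Returning a default such as 0 can hide the failure from the caller. A common alternative, sketched here with the hypothetical name divOpt, converts the exception into an option value so that the caller must decide what to do with NONE.

```sml
(* Turn the Div exception into an explicit optional result. *)
fun divOpt (x, y) = SOME (x div y) handle Div => NONE

(* Callers choose their own fallback with Option.getOpt. *)
val quotient = Option.getOpt (divOpt (7, 0), ~1)
```

The handler covers the whole expression SOME (x div y), since handle binds more weakly than function application.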

Expression Interpreter Using Datatypes and Pattern Matching

Datatypes in Standard ML allow the definition of algebraic data types for representing structured data, such as abstract syntax trees for expressions, combined with pattern matching in functions for concise evaluation. A common practical scenario is an interpreter for simple arithmetic expressions, where a datatype encodes constants, addition, subtraction, and division. The eval function uses pattern matching to recursively compute the value, raising exceptions for invalid operations like division by zero via the underlying div primitive. Here is an example datatype and evaluator:
```sml
datatype exp = Constant of int
             | Plus of exp * exp
             | Minus of exp * exp
             | Divide of exp * exp;

fun eval (Constant n) = n
  | eval (Plus (e1, e2)) = eval e1 + eval e2
  | eval (Minus (e1, e2)) = eval e1 - eval e2
  | eval (Divide (e1, e2)) = eval e1 div eval e2;
```
For an expression like Divide (Constant 10, Constant 0), eval raises Div, which can be caught externally with handle for error propagation or recovery, such as raising a custom EvalError exception:
```sml
exception EvalError of string;

(* Div is translated into EvalError, which then propagates to the caller. *)
val result = eval (Divide (Constant 10, Constant 0))
                  handle Div => raise EvalError "Division by zero";
```
This setup relies on the compiler's exhaustiveness checking for pattern matches, which warns at compile time if eval omits an expression variant, while exceptions report failures that can only be detected at run time.
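One possible way to complete the picture is sketched below: the evaluator checks the divisor itself and raises EvalError directly, and a small wrapper catches it and produces a printable result. The helper name run and the message format are assumptions made for this illustration, not part of any library.

```sml
datatype exp = Constant of int
             | Plus of exp * exp
             | Minus of exp * exp
             | Divide of exp * exp;

exception EvalError of string;

(* Evaluator that tests the divisor and raises EvalError itself. *)
fun eval (Constant n) = n
  | eval (Plus (e1, e2)) = eval e1 + eval e2
  | eval (Minus (e1, e2)) = eval e1 - eval e2
  | eval (Divide (e1, e2)) =
      let val d = eval e2
      in if d = 0 then raise EvalError "Division by zero"
         else eval e1 div d
      end;

(* Hypothetical helper: turn success or failure into a string. *)
fun run e =
    Int.toString (eval e)
    handle EvalError msg => "error: " ^ msg;

val ok  = run (Divide (Plus (Constant 7, Constant 3), Constant 2));
(* ok = "5" *)
val bad = run (Divide (Constant 10, Minus (Constant 4, Constant 4)));
(* bad = "error: Division by zero" *)
```

Raising EvalError inside eval keeps the primitive Div exception from leaking to callers, at the cost of one extra comparison per division.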

Arbitrary-Precision Integers with the IntInf Module

The Basis Library includes the optional IntInf structure for arbitrary-precision integers, supporting arithmetic beyond the range of the fixed-size int type as well as conversion to and from smaller integer types. This module is essential for computations where overflow is a concern, as IntInf values can grow without bound (limited only by memory). Key operations include fromInt for conversion and the usual arithmetic operators such as + and *. For example, to compute the factorial of a large number:
```sml
fun fact 0 = IntInf.fromInt 1
  | fact n = IntInf.* (IntInf.fromInt n, fact (n - 1));

val largeFact = fact 1000;  (* a very large IntInf.int value *)
```
Here, IntInf.fromInt n converts the int argument of each recursive call, and IntInf.* performs arbitrary-precision multiplication, so no overflow occurs even when the result far exceeds the machine word size. This demonstrates IntInf's utility in numerical applications that require exact integer arithmetic.
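The following sketch, which assumes the implementation provides the IntInf structure, shows an accumulator-based factorial together with IntInf.toString and IntInf.pow. The function name factInf and the choice to raise Domain for negative input are assumptions for illustration.

```sml
(* Iterative factorial: the counter is a plain int, the accumulator an IntInf.int. *)
fun factInf n =
    let
      fun loop (0, acc) = acc
        | loop (k, acc) = loop (k - 1, IntInf.* (IntInf.fromInt k, acc))
    in
      if n < 0 then raise Domain else loop (n, IntInf.fromInt 1)
    end;

val f25  = IntInf.toString (factInf 25);
(* f25 = "15511210043330985984000000" *)

val p100 = IntInf.toString (IntInf.pow (IntInf.fromInt 2, 100));
(* p100 = "1267650600228229401496703205376" *)
```

The accumulator makes the loop tail-recursive, so the stack does not grow with n, and the explicit check avoids non-termination on negative arguments.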

Generic Structures with Functors and Higher-Order Functions

Functors in Standard ML enable the creation of generic modules by parameterizing structures over signatures, facilitating reusable abstractions like sets. A functor body can use higher-order functions that operate on the types and values supplied by its argument, so the same code can be instantiated for many element types. For example, a functor for a set can test membership by passing a comparison-based predicate to a higher-order list function. Consider a signature for ordered elements:
```sml
signature ORD = sig
  type t
  val compare : t * t -> order
end;
```
A functor GenericSet takes an ORD structure and implements a set whose insertion uses List.exists to check for an existing equal element:
```sml
functor GenericSet (Key : ORD) : sig
  type set
  val empty : set
  val insert : Key.t -> set -> set
end = struct
  type set = Key.t list  (* Simplified as list for illustration *)
  val empty = []
  fun insert x s = if List.exists (fn y => Key.compare (x, y) = EQUAL) s
                   then s
                   else x :: s  (* List.exists takes a predicate built from Key.compare *)
end;
```
Instantiating with integers requires an IntOrd structure that provides compare:
```sml
structure IntOrd = struct
  type t = int
  fun compare (x, y) = if x < y then LESS else if x > y then GREATER else EQUAL
end;

structure IntSet = GenericSet (IntOrd);

val s = IntSet.insert 42 IntSet.empty;  (* insert is curried: element first, then the set *)
```
This example shows how functors parameterize over types and operations, with higher-order functions like List.exists (given a predicate that closes over Key.compare) enabling generic behavior. The Basis Library's List structure provides such combinators for list manipulation within functors.
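A slightly fuller sketch of the same idea follows. It uses opaque ascription (:>) so that clients cannot depend on the list representation, adds member, fromList, and size, and instantiates the functor with strings. The names ListSet, StringOrd, and StringSet are hypothetical; String.compare and List.foldl come from the Basis Library.

```sml
signature ORD = sig
  type t
  val compare : t * t -> order
end;

(* List-backed set; :> hides the fact that set is a list. *)
functor ListSet (Key : ORD) :> sig
  type set
  val empty : set
  val insert : Key.t -> set -> set
  val member : Key.t -> set -> bool
  val fromList : Key.t list -> set
  val size : set -> int
end = struct
  type set = Key.t list
  val empty = []
  fun member x s = List.exists (fn y => Key.compare (x, y) = EQUAL) s
  fun insert x s = if member x s then s else x :: s
  (* foldl threads the growing set through insert, one element at a time. *)
  fun fromList xs = List.foldl (fn (x, s) => insert x s) empty xs
  val size = List.length
end;

(* String already provides a suitable compare function. *)
structure StringOrd = struct
  type t = string
  val compare = String.compare
end;

structure StringSet = ListSet (StringOrd);

val s = StringSet.fromList ["ml", "sml", "ml"];
val n = StringSet.size s;                  (* 2 *)
val hasSml = StringSet.member "sml" s;     (* true *)
val hasCaml = StringSet.member "caml" s;   (* false *)
```

Because the result signature is opaque, the list could later be swapped for a balanced tree without changing any client code. Membership here is linear in the size of the set, which is acceptable only for small collections.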

Applications

Major Software Projects

Standard ML has been instrumental in developing several major software projects, particularly in formal verification and theorem proving, where its strong type system and functional paradigm support reliable, modular implementations. Poly/ML, a high-performance implementation of Standard ML, serves as the primary backend for the Isabelle theorem prover, enabling its concurrent proof development environment and integration with higher-order logic components. Isabelle's synchronization mechanisms rely on Poly/ML primitives adapted for parallel processing in proof scripts. Similarly, the HOL4 theorem prover runs on Standard ML implementations such as Poly/ML to check proofs in higher-order logic. SML/NJ has also played a part in tooling: MLWorks, an integrated development environment offering interactive editing, debugging, and compilation for Standard ML programs, could be bootstrapped from SML/NJ. The CakeML project develops a verified compiler for a subset of Standard ML, with its implementation and proofs conducted in HOL4, a theorem prover itself implemented in Standard ML. MLton, known for whole-program optimization, powers applications such as the sml-server web server infrastructure, in which HTTP requests are handled by Standard ML code. These projects highlight Standard ML's impact on verified systems, including theorem provers and self-hosting compilers, where the language's formal semantics support rigorous correctness arguments.

Research and Educational Use

Standard ML has been widely adopted in academic settings for teaching functional programming principles due to its clean syntax, strong type system, and support for modular programming. Universities around the world have incorporated it as an introductory language for computer science courses, emphasizing its role in fostering understanding of higher-order functions, recursion, and type inference without requiring heavy use of side effects. Moscow ML, a light-weight implementation of Standard ML, is particularly suited to educational environments because it is small and portable, making it convenient for exercises and student projects. It implements Standard ML as defined in the 1997 Definition while requiring minimal resources, allowing easy integration into teaching materials for topics like data structures and algorithm design. Key textbooks have further solidified Standard ML's pedagogical value. Larry Paulson's ML for the Working Programmer (2nd edition, 1996) provides a comprehensive guide to programming in Standard ML, covering the core language and modules through practical examples and exercises, and it remains a staple in university curricula for its balance of theory and application. In research, Standard ML has advanced programming language theory by serving as a foundation for exploring polymorphic and modular type systems, with seminal work including type-theoretic interpretations that embed its semantics into richer logical frameworks. This has enabled deeper investigations into the semantics of programming languages, influencing developments in dependent types and proof assistants. A prominent example is the CakeML project, which develops a verified implementation of a substantial subset of Standard ML, including its parser, type checker, and compiler, all proven correct in the HOL4 theorem prover. This work demonstrates Standard ML's utility in formal verification, allowing researchers to build and certify trustworthy software systems while highlighting the language's expressive power for mechanized proofs.
The ML Family Workshop series continues to foster research on Standard ML and its extensions. The 2024 edition was held in Milan in conjunction with ICFP 2024, and the 2025 workshop, held on October 16 in Singapore alongside ICFP/SPLASH 2025, featured discussions on topics such as freezing bidirectional typing, projects like LunarML, and language extensions including implicit modules. These events promote collaboration on theoretical advancements and practical innovations in the extended ML family. Standard ML's adoption extends to functional programming curricula at institutions like Carnegie Mellon University, where Robert Harper's Programming in Standard ML supports courses on programming language foundations. Its influence is also evident in the design of later languages in the ML family, which adapt Standard ML's core and module concepts for broader applicability.
