Evaluation strategy
In a programming language, an evaluation strategy refers to the set of rules that govern the order and timing of evaluating expressions during program execution.[1] These strategies dictate when operands or arguments are computed, particularly in function applications, and play a crucial role in determining computational efficiency, memory usage, and whether a program terminates.[2]
The concept originated in the 1930s with Alonzo Church's lambda calculus, where reduction strategies such as normal order (outermost-leftmost) and applicative order (innermost) were formalized. These ideas influenced early programming languages: Algol 60 introduced call-by-name (a form of lazy evaluation), while Lisp adopted eager evaluation. Later developments, including lazy evaluation in languages like SASL (1970s) and Haskell (1990), expanded the field.[3]
Evaluation strategies are especially prominent in functional programming languages, where expressions are treated as first-class entities and side effects are minimized or absent.[1] The two primary categories are eager evaluation (also known as strict evaluation) and lazy evaluation (non-strict evaluation). In eager evaluation, arguments to a function are fully computed before the function body is executed, following an applicative order from left to right; this approach is common in languages like C, Java, and Standard ML, promoting predictable performance but potentially wasting resources on unused computations.[2] Conversely, lazy evaluation delays the computation of an expression until its value is actually required, enabling features like infinite data structures and conditional execution without evaluating irrelevant branches; it is the default in languages such as Haskell and can be simulated in others via mechanisms like thunks.[1]
Parameter-passing mechanisms often align with these strategies, including call-by-value (eager evaluation of arguments before passing), call-by-reference (passing memory addresses for modification), call-by-name (reevaluating arguments on each use), and call-by-need (an optimized lazy variant that caches results after first evaluation).[2] The choice of strategy influences program behavior in lambda calculus and its extensions, where confluent reduction ensures that results are independent of order in pure settings, though efficiency varies.[1] In functional logic programming, hybrid strategies combine lazy functional evaluation with non-deterministic search to balance completeness and performance.[4]
Introduction
Definition and overview
In a programming language, an evaluation strategy specifies the rules governing the order and timing of expression evaluation, as well as the binding of arguments to formal parameters during computation.[5] This encompasses decisions on whether and when subexpressions are reduced to values before or during function application.[6]
Evaluation strategies consist of two primary components: the evaluation order, which dictates when arguments to a function are computed (e.g., before or after entering the function body), and the binding mechanism, which determines how argument values or expressions are associated with and stored for parameters (e.g., by copying values or sharing references).[5] Key terminology includes eager evaluation (also called strict evaluation), where arguments are fully evaluated prior to function application, aligning with applicative order; and lazy evaluation (non-strict evaluation), where evaluation is postponed until the argument's value is actually required, corresponding to normal order.[7] Strict evaluation is the default in most imperative languages like C and Java, while non-strict evaluation appears in functional languages such as Haskell.[5]
These strategies profoundly influence program behavior and performance. Eager evaluation can improve efficiency for computations where all arguments are needed but may lead to non-termination if an argument diverges (e.g., an infinite loop) even if unused, and it executes side effects immediately, which is critical in imperative languages for predictable sequencing.[7] Lazy evaluation enhances efficiency by skipping unneeded computations—useful in declarative languages for infinite data structures or conditional expressions—and can ensure termination in cases where eager evaluation fails, but it complicates side-effect management and debugging due to deferred execution.[5] Overall, the choice balances resource use, correctness amid side effects, and expressiveness between imperative and declarative paradigms.
To illustrate evaluation points, consider this pseudocode for a simple function call:
```
function add(x, y)
    return x + y

result = add(computeExp1(), computeExp2())
```
In eager (applicative-order) evaluation, computeExp1() and computeExp2() are evaluated to values before the add body executes. In lazy (normal-order) evaluation, these expressions are bound unevaluated and computed only when referenced in the addition.[7]
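This timing difference can be simulated in any strict language by wrapping arguments in zero-argument functions (thunks). The following Python sketch is illustrative only; compute_exp1 and compute_exp2 stand in for the pseudocode above, and the printed messages merely trace when evaluation happens.

```python
def compute_exp1():
    print("computing exp1")
    return 1

def compute_exp2():
    print("computing exp2")
    return 2

def add_eager(x, y):
    # Eager: both arguments were already evaluated at the call site.
    return x + y

def add_lazy(x_thunk, y_thunk):
    # Lazy (simulated): each argument is a thunk, forced only here.
    return x_thunk() + y_thunk()

add_eager(compute_exp1(), compute_exp2())                 # both messages print before add runs
add_lazy(lambda: compute_exp1(), lambda: compute_exp2())  # messages print as the body forces them
```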
Historical development
The foundations of evaluation strategies in programming languages trace back to Alonzo Church's development of lambda calculus in the 1930s, where normal order evaluation—reducing the leftmost outermost redex first—emerged as a core reduction strategy for computing functions on symbolic expressions. This approach influenced subsequent language designs by prioritizing substitution without premature argument evaluation, laying groundwork for non-strict paradigms.
In the mid-20th century, early imperative languages adopted strict evaluation models to align with hardware constraints and predictable execution. Fortran, released in 1957, implemented strict eager evaluation through its call-by-reference mechanism, evaluating arguments fully before passing addresses, which facilitated efficient numerical computations but limited expressiveness for unevaluated forms.[8] Similarly, ALGOL 60 (1960) introduced applicative order evaluation (call-by-value) for functions, where arguments are reduced to values prior to application, alongside call-by-name as a non-strict alternative; ALGOL 58 (1958) had used a substitution model akin to call-by-name.[9]
John McCarthy's seminal 1960 paper on Lisp established applicative order as the default evaluation strategy, diverging from lambda calculus's normal order by evaluating function arguments before application to enable practical implementation on early computers, though it acknowledged the theoretical appeal of normal order for avoiding redundant computations.[10] This choice influenced many subsequent languages, but terminology began to shift; McCarthy's precise distinction between applicative (eager) and normal (lazy-like) orders evolved in later designs, where "lazy evaluation" came to denote delayed computation with memoization, distinct from pure substitution.
Key advancements in non-strict strategies appeared in the 1970s, with lazy evaluation proposed for Lisp dialects through techniques like streams and indefinite data structures, as explored in works by Friedman and Wise, enabling infinite lists without immediate computation.[11] Icon, developed in 1977, introduced call-by-need elements via its goal-directed evaluation and generators, which compute values on demand during backtracking searches, bridging imperative and functional styles.[12]
The rise of functional programming paradigms in the 1980s amplified non-strict strategies for modularity and expressiveness. David Turner's Miranda (1985) adopted lazy evaluation as standard, treating functions as first-class citizens and delaying reductions until needed, which inspired broader adoption.[13] This culminated in the 1990s with Haskell's release (version 1.0 in 1990), which formalized non-strict semantics—implemented via lazy evaluation—to support pure functional programming, equational reasoning, and handling infinite data structures efficiently.[14]
Illustrative Examples
Basic evaluation example
To illustrate the impact of evaluation strategies on program execution, consider a simple function that increments its argument by one, defined in abstract syntax as:
```
inc(x) = x + 1
```
This function is applied to a conditional expression as the argument:
```
inc(if cond then expensive() else 0)
```
Here, cond evaluates to false, expensive() represents a computationally intensive operation (such as a recursive computation that could loop indefinitely if invoked), and the if construct selects between the branches based on cond.
Under strict evaluation (also known as applicative order), the argument to inc is fully evaluated before the function body is entered. The evaluation proceeds as follows:
- Evaluate the operator inc, yielding the increment function.
- Evaluate the argument if cond then expensive() else 0: cond evaluates to false, so the else branch (0) is selected. Because conditionals are special-cased even in strict languages, the then branch is not evaluated, so this form alone does not expose the cost of strictness.
- To make the difference stark, move the conditional into the function itself, as in the classic example: define test(x, y) = if (= x 0) then 0 else y and call test(0, expensive()). In applicative order, both arguments are evaluated before test is applied, so x is reduced to 0 and y triggers the full computation of expensive() (potentially an infinite loop), even though the result would be 0.
The outcome is that strict evaluation performs extraneous work, consuming resources on expensive() despite the conditional rendering it unnecessary.
In contrast, under non-strict evaluation (normal order), arguments are not evaluated until their values are required during function application (detailed further in the Evaluation Orders section). The trace is:
- Substitute the unevaluated argument into inc: inc(if cond then expensive() else 0).
- Evaluate the body of inc: compute (if cond then expensive() else 0) + 1.
- Evaluate the conditional: cond is false, so select the else branch (0) without ever invoking expensive().
- Compute 0 + 1, yielding 1.
Thus, non-strict evaluation avoids the unnecessary computation, producing the correct result 1 efficiently. This example demonstrates how evaluation strategy influences both performance and termination behavior in programs with conditional logic.
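The contrast can be made concrete in a strict host language by deferring the argument behind a thunk. This Python sketch mirrors the test example above under the assumption that expensive() diverges; all names are hypothetical.

```python
import itertools

def expensive():
    for _ in itertools.count():  # diverges: stands in for a non-terminating computation
        pass

def test_strict(x, y):
    # Strict: y was already evaluated by the caller.
    return 0 if x == 0 else y

def test_lazy(x, y_thunk):
    # Non-strict (simulated): y is forced only if needed.
    return 0 if x == 0 else y_thunk()

# test_strict(0, expensive())             # would loop forever before test_strict even starts
print(test_lazy(0, lambda: expensive()))  # prints 0; expensive() is never invoked
```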
Comparison table
| Strategy | Evaluation Timing | Memory Usage | Side-Effect Handling | Termination Guarantees | Example Languages |
|---|---|---|---|---|---|
| Call by Value | Eager evaluation of arguments before function call. | Copies the value, requiring additional space for the local copy. | No side effects on actual parameters; modifications affect only the local copy. | Strict: May fail to terminate if an argument expression diverges, even if the result would terminate without it. | C, Java (primitives) [15] [16] |
| Call by Reference | Arguments evaluated before call; passes location. | Shares memory location; no extra copy. | Allows modification of actual parameters through aliases. | Strict: Similar to call by value; evaluates arguments eagerly. | Fortran, C++ (references), Ada (in out) [15] [17] |
| Call by Copy-Restore | Copy in before call; copy out on return. | Temporary local copy during execution. | Updates actual on return; potential issues with aliases. | Strict: Evaluates arguments eagerly. | Ada (for some modes) [15] [17] |
| Call by Sharing | Passes reference to the object at call time. | Shares the reference; no deep copy for structures. | Modifications affect the shared object. | Strict: Evaluates arguments before passing reference. | Java (objects), Lisp [17] [18] |
| Call by Name | Lazy: reevaluates argument expression each use. | Thunk storage for delayed evaluation; potential multiple computations. | Side effects occur each reevaluation; can affect actuals. | Non-strict: Terminates whenever a terminating reduction strategy does; avoids evaluating unused args. | Algol 60 [15] [16] [17] |
| Call by Need | Lazy: evaluates argument once on first use, memoizing the result. | Stores thunk and result after first evaluation; avoids recomputation. | Typically read-only after evaluation; no repeated side effects. | Non-strict: Like call by name, but more efficient; terminates whenever call by name does. | Haskell [15] [16] |
To illustrate differences, consider the following pseudocode for a function f(x) where x is an argument:
Call by Value (C-like):
```c
int f(int x) { return x * 2; }
int main() { int a = 3; f(a); } // a remains 3
```
Call by Name (Algol-like):
```
procedure f(expr x); ... end;
f(a + b); // x reevaluated as a + b each time used
```
Call by Need (Haskell-like):
```haskell
f x = x * 2
let y = undefined in f y  -- errors only if y is used; memoized after first use
```
[](https://www.cs.fsu.edu/~lacher/courses/COP4020/fall10/lectures/Subroutines.pdf)[](https://www.nyu.edu/classes/jcf/CSCI-GA.2110-001/slides/session4/Subprograms.pdf)
## Evaluation Orders
### Strict evaluation
Strict evaluation, also known as applicative-order evaluation, is a [strategy](/page/Strategy) in which all arguments to a [function](/page/Function) are fully evaluated to their normal form before the [function](/page/Function) body is applied. This approach ensures that subexpressions are reduced to values prior to any combination or substitution in [lambda calculus](/page/Lambda_calculus) terms.[](https://www.macs.hw.ac.uk/~greg/books/gjm.lambook88.pdf) In programming languages, it aligns with eager evaluation, where computations occur immediately upon encountering an expression.
Mechanically, strict evaluation proceeds by selecting the leftmost innermost redex for [reduction](/page/Reduction), evaluating [arguments](/page/Argument) from the inside out before applying the outer [function](/page/Function). This innermost-first order contrasts with strategies that delay [argument](/page/Argument) computation, prioritizing complete resolution of inputs before any substitution takes place. In [lambda calculus](/page/Lambda_calculus), for an application $ M \, N $, $ N $ is reduced to a value before substituting into $ M $. A classic illustration is the identity [function](/page/Function) applied to a diverging [term](/page/Term):
$$
(\lambda x. x) \, \Omega
$$
where $ \Omega = (\lambda x. x \, x) \, (\lambda x. x \, x) $. Under strict evaluation, $ \Omega $ is fully evaluated first, resulting in non-termination as it enters an infinite self-application loop.[](https://www.cs.cornell.edu/courses/cs6110/2016sp/lectures/lecture04.pdf)
One key advantage of strict evaluation is its predictability, particularly for side effects, as all computations occur in a defined sequence, facilitating debugging and reasoning about program behavior. It is the default in most imperative languages, such as C and Java, where immediate evaluation supports efficient resource management for terminating expressions.[](https://www.macs.hw.ac.uk/~greg/books/gjm.lambook88.pdf) However, a significant disadvantage arises with non-terminating arguments: even if the function does not require the argument's value, strict evaluation will attempt to compute it, potentially leading to infinite loops or errors in cases like conditionals with diverging false branches. This can cause programs that would terminate under delayed evaluation to diverge entirely. Strict evaluation is often realized through binding strategies like call by value, which enforce this eager semantics.
### Non-strict evaluation
Non-strict evaluation, also referred to as normal-order evaluation, is a reduction strategy in [lambda calculus](/page/Lambda_calculus) and [functional programming](/page/Functional_programming) where the leftmost outermost beta-redex (reducible expression) is reduced first, delaying the evaluation of arguments until they are actually needed.[](https://www.cs.cornell.edu/courses/cs6110/2018sp/lectures/lec04.pdf) This approach contrasts with strict strategies by prioritizing the overall term structure over immediate argument computation, ensuring that if a normal form exists, the reduction sequence will reach it due to the standardization theorem.[](https://www.cs.yale.edu/flint/trifonov/cs430/notes/17.pdf) In practice, arguments are represented as thunks—suspended computations that encapsulate unevaluated expressions—and are only forced when their values are demanded during the execution of the function body.[](https://www.brics.dk/RS/97/7/BRICS-RS-97-7.pdf)
The mechanics of non-strict evaluation focus on achieving head-normal form, where the term is reduced until the outermost [lambda](/page/Lambda) abstraction is exposed, with inner expressions evaluated [on demand](/page/On-demand) as they contribute to the result. This [on-demand](/page/On-demand) evaluation occurs lazily, meaning subexpressions not required for the final output are never computed, which is particularly useful in functional languages for constructing and manipulating infinite data structures like streams or lists without immediate termination.[](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/history.pdf) For instance, in [Haskell](/page/Haskell), this strategy enables concise definitions of algorithms that process only the necessary portions of potentially unbounded data.[](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/history.pdf)
One key advantage of non-strict [evaluation](/page/Evaluation) is its ability to avoid superfluous computations, thereby improving [efficiency](/page/Efficiency) in scenarios where arguments may not influence the outcome, such as conditional expressions or higher-order functions. It is foundational in purely functional languages, facilitating modular and composable code without the need for explicit control over evaluation timing.[](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/history.pdf) However, it introduces overhead from thunk creation and management, which can increase memory usage through accumulated suspensions and complicate performance prediction.[](https://www.brics.dk/RS/97/7/BRICS-RS-97-7.pdf) Additionally, if the language permits side effects, delayed evaluation can lead to non-intuitive ordering, making [debugging](/page/Debugging) more challenging.[](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/history.pdf)
An illustrative example highlights the terminating behavior of non-strict evaluation: consider the lambda term $(\lambda x.\ 42)\ \Omega$, where $\Omega = (\lambda x.\ x\ x)(\lambda x.\ x\ x)$ is a diverging expression that loops indefinitely under reduction. In non-strict evaluation, the outer redex reduces first to 42, bypassing the evaluation of $\Omega$ entirely, whereas a strict strategy would diverge by attempting to evaluate the argument beforehand.[](https://www.cs.tufts.edu/comp/150AVS/03-lambda.pdf) This demonstrates how non-strict evaluation can yield results in cases where strict evaluation fails. Non-strict evaluation is often implemented via binding strategies like call-by-name, which provide the necessary delay mechanism.[](https://www.brics.dk/RS/97/7/BRICS-RS-97-7.pdf)
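The same behavior can be mimicked in a strict language by delaying the diverging term behind a thunk. In this hedged Python sketch, omega is a hypothetical stand-in for $\Omega$, and the divergence surfaces as a RecursionError rather than a silent loop:

```python
def omega():
    return omega()  # self-application analogue: diverges (RecursionError in Python)

def const42_strict(x):
    return 42       # too late: the argument was already evaluated by the caller

def const42_nonstrict(x_thunk):
    return 42       # the thunk is never forced, so omega never runs

# const42_strict(omega())                  # diverges before the body is reached
print(const42_nonstrict(lambda: omega()))  # prints 42
```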
### Comparison of applicative and normal order
Applicative-order evaluation, also known as strict evaluation, evaluates the arguments to a function before applying the function itself, focusing on the innermost redex (reducible expression) first.[](https://web.mit.edu/6.001/6.037/sicp.pdf) In contrast, [normal-order evaluation](/page/Normal_order), a non-strict approach, applies the function first and evaluates arguments only when they are actually needed, prioritizing the outermost (leftmost) redex.[](https://web.mit.edu/6.001/6.037/sicp.pdf) This difference in timing—pre-application for applicative order versus post-application for normal order—fundamentally affects how expressions are reduced in [lambda calculus](/page/Lambda_calculus) and [functional programming](/page/Functional_programming) languages.[](https://www.cs.cornell.edu/courses/cs6110/2016sp/lectures/lecture04.pdf)
Regarding termination, applicative order can fail to terminate if any argument diverges, as it evaluates all arguments regardless of whether they are used; for instance, applying a function to a non-terminating expression like `(define (p) (p))` will loop infinitely before the function body executes.[](https://web.mit.edu/6.001/6.037/sicp.pdf) [Normal order](/page/Normal_order), however, can achieve termination in such cases by deferring evaluation of unused arguments, ensuring convergence to a normal form if one exists.[](https://www.cs.cornell.edu/courses/cs6110/2016sp/lectures/lecture04.pdf) This property makes [normal order](/page/Normal_order) particularly advantageous for expressions involving conditionals where branches may not be taken.[](https://files.core.ac.uk/download/pdf/354675334.pdf)
Efficiency trade-offs arise from these strategies: applicative order avoids the overhead of creating thunks (delayed computations) and prevents redundant evaluations of the same argument, but it wastes effort on unused computations.[](https://people.csail.mit.edu/jastr/6001/fall07/r26.pdf) [Normal order](/page/Normal_order) saves work by skipping unnecessary evaluations but may recompute arguments multiple times without memoization, leading to higher computational cost in some scenarios; for example, computing `(gcd 206 40)` requires 18 remainder operations under [normal order](/page/Normal_order) but only 4 under applicative order.[](https://web.mit.edu/6.001/6.037/sicp.pdf)
Side effects further highlight the contrasts, as applicative order executes all argument side effects upfront, potentially performing more operations than needed, while [normal order](/page/Normal_order) defers them, which can alter the order or multiplicity of execution unpredictably.[](https://web.mit.edu/6.001/6.037/sicp.pdf) In languages with mutable state, this deferral in [normal order](/page/Normal_order) complicates reasoning about program behavior.[](https://files.core.ac.uk/download/pdf/354675334.pdf)
To illustrate, consider a conditional [function](/page/Function) `test` defined as `(define (test x y) (if (= x 0) 0 y))` applied to `(test 0 (display "side effect"))`, where the second argument produces a [side effect](/page/Side_effect) (printing). Under applicative order, the [side effect](/page/Side_effect) executes immediately before the conditional, printing regardless of the branch. Under [normal order](/page/Normal_order), the [side effect](/page/Side_effect) is deferred and skipped since the true branch is taken, avoiding the print.
**Applicative-order trace:**
1. Evaluate arguments: `x` to 0, `y` to `(display "side effect")` — prints "side effect".
2. Apply `if (= 0 0) 0 y` → returns 0.
**Normal-order trace:**
1. Substitute into body: `if (= 0 0) 0 (display "side effect")`.
2. Evaluate condition: true.
3. Return 0 — no evaluation of `y`, so no print.
This example demonstrates how [normal order](/page/Normal_order) avoids unnecessary [side effect](/page/Side_effect)s, while applicative order incurs them.[](https://web.mit.edu/6.001/6.037/sicp.pdf)
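The two traces can be reproduced in Python by passing the effectful argument either already evaluated or as a thunk; this is a sketch, not Scheme semantics, and the names mirror the example above.

```python
def display(msg):
    print(msg)  # the observable side effect
    return msg

def test_applicative(x, y):
    return 0 if x == 0 else y

def test_normal(x, y_thunk):
    return 0 if x == 0 else y_thunk()

test_applicative(0, display("side effect"))     # prints before test_applicative runs
test_normal(0, lambda: display("side effect"))  # prints nothing: y is never forced
```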
## Strict Binding Strategies
### Call by value
In call by value, each argument expression is fully evaluated to produce a value, which is then copied and bound to the corresponding formal [parameter](/page/Parameter) as a [local variable](/page/Local_variable) before execution of the [procedure](/page/Procedure) or [function](/page/Function) body begins. This mechanism treats the parameter as an independent entity initialized with the argument's value, ensuring that the binding is strict and occurs prior to any side effects within the called [procedure](/page/Procedure).[](https://www.softwarepreservation.org/projects/ALGOL/report/Algol60_report_CACM_1960_June.pdf)[](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)
A key property of call by value is the absence of [aliasing](/page/Aliasing) for mutable state, as modifications to the [parameter](/page/Parameter) do not propagate back to the original argument or its associated data. Any side effects arising from the [evaluation](/page/Evaluation) of the argument expression, such as [function](/page/Function) calls or increments, are realized before the [procedure](/page/Procedure) body executes, promoting predictable behavior in strict [evaluation](/page/Evaluation) contexts.[](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)[](https://www.cs.tufts.edu/~nr/cs257/archive/john-mccarthy/recursive.pdf)
This strategy is prevalent in imperative languages like C and Java. In C, arguments are passed by value, with the order of evaluation unspecified but a sequence point ensuring completion before the function body starts; Java similarly passes copies of primitive values or object references by value, where reassigning the parameter does not alter the caller's reference.[](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf)[](https://docs.oracle.com/javase/tutorial/java/javaOO/arguments.html)
The term "call by value" originated in the [ALGOL 60](/page/ALGOL_60) report, where it specified explicit value assignment to formal parameters declared with the `value` keyword. In contrast, John McCarthy's 1960 Lisp design used applicative-order evaluation—evaluating arguments to normal form before [function application](/page/Function_application), effectively passing evaluated expressions or closures in a functional sense—rather than the imperative notion of value copying that became standard in later languages like ALGOL 60 derivatives. This evolution shifted the semantics from reduction strategies in expression-based systems to parameter binding via duplication in block-structured procedural languages.[](https://www.pls-lab.org/Call-by-value)[](https://www.cs.tufts.edu/~nr/cs257/archive/john-mccarthy/recursive.pdf)[](https://www.softwarepreservation.org/projects/ALGOL/report/Algol60_report_CACM_1960_June.pdf)
The following C example illustrates that mutation of the parameter does not affect the original argument:
```c
#include <stdio.h>

void modify(int param) {
    param = 10; // Modifies local copy only
}

int main() {
    int arg = 5;
    modify(arg);
    printf("arg = %d\n", arg); // Outputs: arg = 5
    return 0;
}
```
This behavior aligns with call by value semantics, where the parameter param receives a copy of arg's value, and changes remain isolated.[19]
Call by reference
Call by reference is a parameter-passing mechanism in which the formal parameter is bound to the memory address (L-value) of the actual argument, allowing the subroutine to access and modify the original data directly without copying.[20][21] This approach is typically used in strict evaluation contexts, where arguments are fully evaluated prior to the binding.[22]
One key advantage of call by reference is its efficiency when passing large data structures, such as arrays or objects, since only the address is transmitted rather than duplicating the entire content.[23] However, this shared access introduces aliasing, where multiple identifiers refer to the same memory location, potentially enabling side effects that alter the caller's data unexpectedly during execution.[22] Such properties make it suitable for operations requiring in-place modifications but demand careful management to avoid non-local effects.
In languages like Pascal, call by reference is explicitly invoked using the var keyword for formal parameters, ensuring the address of the argument is passed and changes propagate back to the caller.[24][25] Similarly, C++ supports this through reference types, declared with the & symbol, which act as aliases to the original variables and differ from pointers by being non-nullable and non-reseatable, promoting safer usage without explicit dereferencing.[26]
Despite these benefits, call by reference can lead to issues such as unintended modifications via aliasing, which complicate program reasoning and debugging, as changes in one part of the code may unexpectedly affect distant variables.[27] Scoping problems may also arise if the lifetime of the referenced data ends before the subroutine completes, potentially causing dangling references or undefined behavior.[22]
For illustration, consider this C++ example where a function increments a variable passed by reference:
```cpp
#include <iostream>

void increment(int& x) {
    x += 1;
}

int main() {
    int value = 5;
    increment(value);
    std::cout << value << std::endl; // Outputs: 6
    return 0;
}
```
Here, the call modifies the original value in main because x aliases its address. A similar effect occurs in Pascal with var:
```pascal
program Example;
var
  value: integer;

procedure Increment(var x: integer);
begin
  x := x + 1;
end;

begin
  value := 5;
  Increment(value);
  writeln(value); { Outputs: 6 }
end.
```
This demonstrates how mutations within the procedure directly alter the caller's variable.[24]
Call by copy-restore
Call by copy-restore, also known as copy-in copy-out or value-result, is a strict parameter-passing mechanism in which the value of an argument is copied into a local parameter upon entry to the subroutine, and any modifications to that parameter are then copied back to the original argument upon exit.[28] This process occurs as part of eager evaluation, ensuring arguments are fully evaluated before the subroutine executes.[23]
The mechanism simulates pass-by-result semantics, allowing subroutines to update caller variables without direct sharing during execution, which can be useful in languages lacking native reference passing to mimic modifiable outputs.[28] It has been used in languages like Ada for in out parameters, though early Fortran versions primarily employed call-by-reference; value-result became more prominent in later standards and other languages.[23]
A key advantage of call by copy-restore is that it prevents aliasing issues within the subroutine, as the parameter operates on an independent copy, avoiding unintended side effects from simultaneous modifications—unlike bidirectional sharing—while still enabling one-way updates to the original argument on return.[28]
For example, consider a pseudocode subroutine that updates a specific element in an array:
```
SUBROUTINE UpdateArray(A, index, newValue)
    // Copy-in: Create local copy of A
    LOCAL_ARRAY = COPY(A)
    // Modify the local copy
    LOCAL_ARRAY[index] = newValue
    // Copy-out: Restore changes to original A
    A = COPY(LOCAL_ARRAY)
END SUBROUTINE
```
In this case, calling UpdateArray(myArray, 5, 42) evaluates myArray and copies it locally before modification, then copies the updated local array back to myArray upon completion, ensuring the caller's array reflects the change without exposing the original to aliasing during execution.[28]
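Python has no native copy-restore mode, but the discipline can be simulated explicitly; in this hypothetical sketch the callee edits a private copy and writes it back in one step on return.

```python
def update_array(a, index, new_value):
    local = list(a)           # copy-in: private local copy
    local[index] = new_value  # work on the copy; no aliasing with the caller's list
    a[:] = local              # copy-out: restore the result into the original on exit

my_array = [10, 20, 30, 40, 50, 60]
update_array(my_array, 5, 42)
print(my_array)  # [10, 20, 30, 40, 50, 42]
```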
Call by sharing
Call by sharing is an evaluation strategy where the caller and the called procedure share access to the same argument object by passing a reference to it, without duplicating the object's contents. This approach was introduced in the CLU programming language to enable efficient passing of complex data structures while maintaining object integrity.[29] In practice, the argument is fully evaluated prior to the call, and a reference to the resulting object—typically treated as a value itself—is bound to the formal parameter, often via a shallow copy of the reference for efficiency.[29]
This strategy strikes a balance between call by value and call by reference by passing the reference by value: rebinding the parameter to a different object is invisible to the caller, while in-place mutations propagate back.[30] It is particularly prevalent in object-oriented languages for handling mutable objects, such as lists or arrays in Python and Ruby, where the object's mutability enables shared modifications without the overhead of deep copying.[31] In Python, for instance, all arguments are passed this way, as the language treats everything as an object and passes references to those objects. By contrast, modern Fortran (since Fortran 2003) supports call by value via the VALUE attribute for scalar arguments, providing a strict alternative to its default reference passing.[31][32]
The shared access in call by sharing can introduce side effects, as mutations to the object within the procedure affect the original argument in the caller's scope, potentially leading to unintended aliasing behaviors if programmers assume independent copies.[31] To illustrate, consider the following Python example where a list is passed to a function that appends an element:
```python
def modify_list(lst):
    lst.append(42)  # Modifies the shared object

my_list = [1, 2, 3]
modify_list(my_list)
print(my_list)  # Outputs: [1, 2, 3, 42]
```
Here, the function modifies the original list because both the parameter lst and the argument my_list reference the same object.[31] Similar behavior occurs in Ruby for mutable objects like arrays, where passing an array allows the callee to alter its contents, affecting the caller's view.
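A companion sketch makes the boundary of call by sharing explicit: in-place mutation is visible to the caller, but rebinding the parameter name is not, because only the reference was passed.

```python
def mutate(lst):
    lst.append(42)  # mutates the shared object: the caller sees this

def rebind(lst):
    lst = [99]      # rebinds the local name only: the caller is unaffected

data = [1, 2, 3]
mutate(data)
rebind(data)
print(data)  # [1, 2, 3, 42]
```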
Call by address
Call by address is a strict parameter-passing mechanism where the explicit memory address of the argument is passed to the function parameter, typically as a pointer value, requiring dereferencing within the function to access or modify the original data.[33] This approach evaluates the argument to obtain its address prior to the call, aligning with strict evaluation semantics. In practice, the function receives a copy of the address (consistent with call-by-value for the pointer itself), but dereferencing allows indirect mutation of the caller's data.[34]
This mechanism provides low-level control over memory locations, enabling operations like dynamic allocation via functions such as malloc in C, where pointers facilitate heap management without unnecessary data copying. However, it is inherently error-prone, as improper dereferencing can lead to segmentation faults, dangling pointers, or buffer overflows, demanding careful programmer oversight. Languages like C implement this through explicit pointer declarations (e.g., int *p), drawing from assembly language influences where direct address manipulation is routine for performance-critical code. PL/I also supports call by address as a default linkage convention, passing parameters by reference to their storage locations unless specified otherwise.[35]
The primary advantage of call by address lies in its efficiency for systems programming, where it minimizes overhead for large or complex data structures by avoiding full copies and supporting in-place modifications. For instance, pointer arithmetic allows traversing and altering arrays efficiently, which is essential for low-level tasks like operating system development or embedded systems. A representative example is a swap function in C that exchanges two integers without returning values:
```c
void swap(int *a, int *b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}
```
This is invoked as swap(&x, &y);, where & yields the addresses of x and y, enabling the function to dereference and update them directly. Such patterns underscore its utility in resource-constrained environments, though they require robust error handling to mitigate risks.
Call by unification
Call by unification is a strict evaluation strategy employed in logic programming languages, where the arguments of a predicate are matched and bound to variables through a process of unification before the body of the predicate is executed.[36] This approach ensures that the terms are made identical via substitutions, enabling declarative specification of computations based on logical inference rather than imperative assignment.[37]
The mechanics of call by unification involve applying an algorithm to find a most general unifier (mgu), which is a substitution that renders two terms syntactically identical while preserving their structure.[37] Unification occurs eagerly upon predicate invocation: the call's arguments are unified with the head of a clause, and if successful, the resulting substitution is extended to the clause's body for further evaluation. This substitution binds variables to terms, such as constants, variables, or complex structures, and is composed incrementally during resolution. The process supports the resolution principle, where unification facilitates matching between goals and facts or rules.[38]
A key property of call by unification is its integration with backtracking, allowing the system to retract bindings and explore alternative clauses upon failure, which is essential for search-based problem solving in nondeterministic domains.[39] Variables are bound to ground terms or other variables, enabling flexible pattern matching without explicit control flow for binding propagation.
This strategy is central to Prolog, where it forms the core mechanism for SLD-resolution (Selective Linear Definite clause resolution), driving the language's execution model since its inception in the early 1970s.[39]
One advantage of call by unification is its support for declarative pattern matching, allowing programmers to specify what relations hold rather than how to compute them, which simplifies expressing logical queries and rules.[36]
For example, in Prolog, the goal `[a|X] = [a,b]` succeeds by unifying the two list terms, binding X to [b], as the list structures match under the substitution {X ← [b]}.[38]
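The core of the mechanism is the unification algorithm itself. The following Python sketch implements a minimal first-order unifier over tuple-encoded terms; like standard Prolog it omits the occurs check, and the term encoding ('cons', head, tail) for lists is an assumption of this sketch.

```python
class Var:
    """A logic variable, identified by name."""
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name

def walk(term, subst):
    # Chase variable bindings to the current representative term.
    while isinstance(term, Var) and term in subst:
        term = subst[term]
    return term

def unify(t1, t2, subst=None):
    """Return a most general unifier extending subst, or None on failure."""
    if subst is None:
        subst = {}
    t1, t2 = walk(t1, subst), walk(t2, subst)
    if isinstance(t1, Var):
        return subst if t1 is t2 else {**subst, t1: t2}
    if isinstance(t2, Var):
        return {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):
            subst = unify(a, b, subst)
            if subst is None:
                return None  # subterms clash: the whole unification fails
        return subst
    return subst if t1 == t2 else None  # constants unify only if equal

# [a|X] = [a,b], with lists encoded as ('cons', head, tail) and 'nil' for [].
X = Var("X")
print(unify(('cons', 'a', X), ('cons', 'a', ('cons', 'b', 'nil'))))
# {X: ('cons', 'b', 'nil')}   i.e. X = [b]
```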
Non-strict Binding Strategies
Call by name
Call by name is a non-strict evaluation strategy in which the argument expression passed to a function is not evaluated at the call site but is instead textually substituted for the corresponding formal parameter within the function body, with the substituted expression being evaluated anew each time it is referenced during execution.[40] This substitution occurs dynamically at runtime, treating the argument as unevaluated text that is inserted wherever the parameter appears, often enclosed in parentheses to preserve syntactic validity.[40] As a result, the argument's side effects, if any, may occur multiple times if the parameter is used more than once in the body.[41]
The core property of call by name lies in its pure textual substitution semantics, which avoids the creation or storage of thunks—delayed computation wrappers—relying instead on direct re-substitution and re-evaluation without intermediate caching.[41] This approach aligns with normal-order reduction in lambda calculus, where outermost redexes are reduced first, but it implements this through syntactic replacement rather than graph-based optimization.[41] Call by name was formally introduced in the Revised Report on the Algorithmic Language ALGOL 60, where it is described as replacing "any formal parameter not quoted in the value list... throughout the procedure body, by the corresponding actual parameter."[40] It influenced the design of early Lisp implementations, where McCarthy incorporated a flexible variant resembling ALGOL's call by name through explicit evaluation control via the eval function, enabling delayed argument handling in symbolic expressions.[42]
A significant disadvantage of call by name is its potential for inefficient repeated evaluation, which can lead to exponential time complexity in cases where an expensive argument is referenced multiple times within the function body.[43] For instance, consider an ALGOL 60-style procedure that computes both the square and double of its parameter:
```
procedure square_plus_double(p);
begin
    integer result;
    result := p * p + 2 * p
end;
```
If invoked as square_plus_double(fib(30)), where fib(n) is a naive recursive Fibonacci function requiring exponential time itself, the argument fib(30) would be fully recomputed three times—twice for p * p and once for 2 * p—resulting in roughly three times the work of a single evaluation; with more references, or with such calls nested inside one another, the waste compounds multiplicatively.[44] This inefficiency arises because each substitution triggers independent evaluation without memoization, making call by name unsuitable for performance-critical applications involving costly computations.[45]
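The recomputation is easy to observe in a Python simulation of call by name, where the parameter is a thunk forced once per textual use; fib(20) substitutes for fib(30) so the demonstration runs quickly, and the counter is a hypothetical instrument.

```python
calls = 0

def fib(n):  # naive exponential-time Fibonacci
    return n if n < 2 else fib(n - 1) + fib(n - 2)

def arg():   # the unevaluated argument expression
    global calls
    calls += 1
    return fib(20)

def square_plus_double(p):
    # Call by name: p is re-evaluated at every occurrence.
    return p() * p() + 2 * p()

square_plus_double(arg)
print(calls)  # 3: one full recomputation per use of p
```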
Call by need
Call by need is a non-strict evaluation strategy that delays the evaluation of function arguments until their values are required, while ensuring that each argument is computed at most once through memoization. In this approach, arguments are represented as thunks—suspended computations—that are evaluated on first demand and then cached, with the result shared across all subsequent uses to avoid redundant work. This mechanism combines the benefits of call by name's laziness with improved efficiency by preventing repeated evaluations of the same expression.[46]
The strategy operates by substituting arguments with let-bindings in the lambda calculus, where evaluation occurs only in contexts that necessitate the value, such as primitive operations or case expressions. Once evaluated, the thunk is updated to hold the value, enabling sharing via graph structures in implementations.[46] Call by need exhibits lazy properties, meaning unevaluated arguments do not cause divergence unless demanded, while its caching provides efficiency gains over pure call by name. In practice, it is often realized through graph reduction techniques, where expressions form a directed acyclic graph (DAG) that is rewritten incrementally, preserving sharing and minimizing duplication during reduction.[46]
Haskell implements call by need as its default evaluation strategy, where lazy evaluation is achieved through this memoized demand-driven approach.[47] One key advantage is the ability to handle infinite data structures without non-termination, as only the portions needed for the computation are evaluated.[47]
For example, consider defining an infinite list of natural numbers in Haskell: naturals = 0 : map (+1) naturals. Accessing the first few elements, such as take 5 naturals, forces evaluation of only the initial segment [0,1,2,3,4], while the tail remains a thunk until further demand, preventing infinite computation.[47]
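Since Python is strict, call by need can only be simulated, but the memoization discipline fits in a small class: a suspended computation that runs at most once and caches its result. This is a sketch, not how a Haskell implementation represents thunks.

```python
class Thunk:
    """A suspended computation, evaluated at most once (call by need)."""
    _pending = object()  # sentinel meaning "not yet evaluated"

    def __init__(self, compute):
        self._compute = compute
        self._value = Thunk._pending

    def force(self):
        if self._value is Thunk._pending:
            self._value = self._compute()  # first demand: run the computation
            self._compute = None           # drop the closure, keep only the value
        return self._value

t = Thunk(lambda: print("evaluating") or 42)
print(t.force())  # prints "evaluating", then 42
print(t.force())  # prints 42 only: the cached result is shared
```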
Call by macro expansion
Call by macro expansion is a non-strict evaluation strategy in which argument expressions are textually substituted into the body of a macro or function definition at compile time, with evaluation deferred until the expanded code is executed as part of the program. Because the substitution happens before the program runs, it enables code transformation without runtime overhead or dynamic thunk creation: delayed evaluation is achieved through compile-time binding rather than runtime mechanisms.
In Lisp, macros facilitate this strategy via definitions like defmacro, where the macro receives unevaluated argument forms and generates a replacement expression that is inserted during compilation. The substitution occurs structurally on s-expressions, allowing arbitrary code generation while preserving the language's homoiconic nature. Lisp macros originated in early dialects like Lisp 1.5 and evolved significantly in MacLisp and Common Lisp, becoming essential for extending the language's syntax and creating domain-specific abstractions.[48] Lisp macros are generally non-hygienic, meaning they can suffer from variable capture, where identifiers in the macro body shadow or are shadowed by those in the calling context; programmers must handle this manually, often by generating fresh symbols with functions like gensym. For example, a macro defined as (defmacro when (condition &body body) `(if ,condition (progn ,@body))) expands a call like (when (> x 0) (print x)) into (if (> x 0) (progn (print x))) at compile time, inlining the conditional logic and avoiding any function call overhead, though care must be taken to avoid capture in more complex cases.
In contrast, languages like Scheme implement hygienic macros, which automatically address variable capture by renaming identifiers in the macro body to fresh names during expansion and tracking syntactic environments to ensure local bindings do not accidentally shadow or be shadowed by those in the calling environment; this is achieved through algorithms that perform linear-time rewriting.[49]
The C preprocessor implements call by macro expansion through function-like macros, processed during translation phase 4: the preprocessor scans for invocations, collects arguments as token sequences within matching parentheses, fully expands any macros within those arguments, and substitutes them textually into the replacement list without evaluating them. This purely lexical replacement treats parameters as placeholders for the argument tokens, after which the result is rescanned for further expansions. An example is #define SQUARE(x) ((x) * (x)), which expands SQUARE(a + b) to ((a + b) * (a + b)), potentially computing a + b twice in the compiled code but resolving all substitutions statically. This approach, standardized in C since its early specifications, promotes efficiency by enabling inline code generation and eliminating call overhead for simple operations.[50]
The primary advantages of call by macro expansion include enhanced code generation efficiency, as the compiler can optimize the fully expanded form, and reduced runtime costs from avoiding argument passing and evaluation delays. In both Lisp and C contexts, it supports powerful metaprogramming while remaining confined to compile-time processing.[48][50]
Call by future
Call by future is a non-strict evaluation strategy that integrates concurrency into argument passing by delegating the evaluation of each function argument to a separate future, or promise, which computes the value asynchronously in parallel with the function body. Upon invocation, the function receives placeholders for these futures rather than waiting for the arguments to evaluate, allowing the caller to proceed immediately without blocking. When the function body requires an argument's value, it accesses the future, which blocks the accessing thread only if the computation is incomplete; otherwise, the result is returned transparently. This mechanism, first formalized in Multilisp, ensures that futures replace themselves with their computed values upon completion, enabling seamless integration into sequential code while exposing parallelism.[51]
The strategy's key properties include support for concurrency through speculative, eager evaluation of arguments in parallel, combined with demand-driven access: the caller blocks only when it actually needs a value, though work spent on values that go unused is wasted rather than avoided. Unlike purely sequential laziness, call by future overlaps the evaluation of independent arguments, potentially reducing overall execution time in parallel environments, though it introduces non-determinism in scheduling and potential blocking. It maintains referential transparency in functional settings by treating futures as first-class values, but requires careful handling of side effects to avoid races. This approach has been implemented in languages such as Multilisp, where the future construct spawns a dedicated task for each argument, and Alice ML, which uses futures for data-driven synchronization in a typed concurrent functional setting.[51][52]
A primary advantage of call by future is its ability to automatically overlap independent computations, maximizing processor utilization without explicit programmer intervention for parallelism. For instance, in Multilisp, implementations demonstrated near-linear speedup for tasks like Quicksort on multiprocessors, where futures for recursive sublist processing run concurrently across multiple levels of recursion.[51] In a parallel map operation, such as applying a function to each element of a list, call by future binds each application to a separate future, allowing all elements to compute simultaneously; the mapping thread then touches these futures as needed, blocking only for unfinished ones while proceeding with completed results. This is particularly effective for data-parallel workloads, as seen in Alice ML's thread-spawning primitives that create futures for concurrent list processing.[52] As a non-strict binding, it ensures arguments do not block the caller, deferring evaluation until demanded within the function.[51]
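The mechanism can be sketched with threads and synchronizing variables. In the following Haskell fragment (the names future and touch echo Multilisp's terminology, but the code is only an illustration; compile with ghc -threaded for actual parallelism), each argument starts evaluating in its own thread, and the caller blocks only when it demands a value that is not yet ready:

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar)
import Control.Exception (evaluate)

-- A future is a placeholder that a background thread fills in.
newtype Future a = Future (MVar a)

-- Start computing a value concurrently and return a placeholder immediately.
future :: a -> IO (Future a)
future x = do
  box <- newEmptyMVar
  _ <- forkIO (evaluate x >>= putMVar box)  -- evaluate in a separate thread
  return (Future box)

-- Demand the value: blocks only if the computation is still running.
touch :: Future a -> IO a
touch (Future box) = readMVar box

main :: IO ()
main = do
  fx <- future (sum [1 .. 1000000 :: Integer])   -- both arguments begin
  fy <- future (product [1 .. 20 :: Integer])    -- evaluating in parallel
  x <- touch fx                                  -- caller blocks here only
  y <- touch fy                                  -- if a result is not ready
  print (x + y)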
Optimistic evaluation
Optimistic evaluation is an adaptive strategy designed for non-strict functional programming languages, where expressions are speculatively evaluated eagerly at runtime, assuming their results will be needed, but with a mechanism to abort and rollback if the computation proves too expensive or unnecessary. This approach complements traditional lazy evaluation (such as call-by-need) by dynamically switching between lazy and eager modes for individual thunks—unevaluated expressions—based on runtime observations from an online profiler. The profiler monitors execution and adjusts speculation decisions, typically initializing with an eager bias and refining it to avoid wasteful computations.[53]
In terms of mechanics, optimistic evaluation operates on let expressions, such as let x = <rhs> in <body>, by introducing a runtime switch that determines whether to evaluate the right-hand side (<rhs>) speculatively using call-by-value semantics or to defer it lazily as a thunk. If speculation succeeds—meaning the value is used and the cost is low—the computation commits seamlessly, avoiding the overhead of thunk creation and blackholing associated with pure laziness. Upon detecting excessive cost (e.g., via time or space thresholds), the system aborts the speculative work by capturing the computation state into a continuation thunk, resuming the lazy strategy only if the value is later demanded. This rollback ensures semantic equivalence to standard non-strict evaluation while enabling eager optimizations. For infinite data structures, a variant called chunky evaluation speculatively computes fixed-size chunks (e.g., 10 elements) of lists, balancing speculation with controlled laziness to prevent unbounded eager expansion.[53][54]
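The speculate-commit-abort cycle can be caricatured in user code. The following Haskell toy sketch (emphatically not GHC's internal mechanism, which lives in the compiler and runtime system) evaluates an expression eagerly under a time budget and falls back to the untouched lazy thunk when the budget is exceeded:

import Control.Exception (evaluate)
import System.Timeout (timeout)

-- Try to evaluate x eagerly for at most the given number of microseconds.
-- On success the computed value is returned ("commit"); on timeout the
-- still-lazy thunk is handed back ("abort"), to be evaluated later only
-- if it is ever demanded. (The timeout is delivered as an asynchronous
-- exception, which may not interrupt a tight non-allocating loop; this
-- is an illustration of the idea only.)
speculate :: Int -> a -> IO a
speculate budget x = do
  r <- timeout budget (evaluate x)
  case r of
    Just v  -> return v   -- speculation paid off: value computed eagerly
    Nothing -> return x   -- too expensive: fall back to laziness

main :: IO ()
main = do
  cheap <- speculate 1000 (2 + 2 :: Integer)                   -- commits at once
  print cheap
  costly <- speculate 1000 (sum [1 .. 100000000 :: Integer])   -- likely aborts
  print (costly `seq` "forced lazily after the abort")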
The properties of optimistic evaluation center on its speculative nature, where commit occurs implicitly on successful use and abort handles failures gracefully, maintaining the non-strict guarantees of the language without introducing strictness errors. It introduces minimal overhead for speculation decisions, typically via lightweight runtime checks, and preserves referential transparency. This strategy is particularly effective in low-contention scenarios, such as programs where most thunks are eventually forced or are inexpensive to evaluate, yielding high throughput by reducing thunk allocation and forcing costs—key bottlenecks in lazy evaluation. Empirical results from implementations show average speedups of 5-25%, with up to 50% improvements in space-leak prone programs, and no program slowing by more than 15%.[53][55]
Optimistic evaluation was implemented experimentally in an early version of the Glasgow Haskell Compiler (GHC) for the Haskell language in 2003, requiring modifications to the compiler and runtime system to support speculation switches and abortion mechanisms. For example, consider a program that repeatedly sums elements from a large generated list; under pure lazy evaluation, thunks accumulate, risking space leaks, but optimistic evaluation speculatively computes list elements as needed, running in near-constant space by committing only useful results and aborting unused speculations. This makes it suitable for performance-critical non-strict code, adapting to workload characteristics without static analysis alone.[53]
Modern applications and variations
Evaluation in functional languages
In functional programming languages, evaluation strategies emphasize non-strict semantics to support higher-order functions, referential transparency, and composability, with lazy evaluation—specifically call-by-need—serving as the default in influential languages like Haskell.[56] This approach delays computation until a value is required, enabling the construction of infinite data structures and avoiding unnecessary work, while sharing computed results across multiple uses to optimize performance.[14] However, functional languages also provide mechanisms for strict evaluation to mitigate potential inefficiencies, such as Haskell's seq function, which forces an argument to weak head normal form before proceeding, and bang patterns (!), which enforce strictness in pattern bindings.[57]
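A standard illustration of these mechanisms is keeping an accumulator strict. In the sketch below, the lazy version builds a chain of addition thunks, while the bang-pattern and seq versions force each intermediate sum and run in constant space:

{-# LANGUAGE BangPatterns #-}

sumLazy :: [Integer] -> Integer           -- acc accumulates a chain of (+) thunks
sumLazy = go 0
  where go acc []       = acc
        go acc (x : xs) = go (acc + x) xs

sumBang :: [Integer] -> Integer           -- the bang pattern forces acc each step
sumBang = go 0
  where go !acc []       = acc
        go !acc (x : xs) = go (acc + x) xs

sumSeq :: [Integer] -> Integer            -- seq forces acc' to weak head normal
sumSeq = go 0                             -- form before the recursive call
  where go acc []       = acc
        go acc (x : xs) = let acc' = acc + x in acc' `seq` go acc' xs

main :: IO ()
main = print (sumBang [1 .. 1000000], sumSeq [1 .. 1000000])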
Implementations of these strategies in functional languages often rely on graph reduction, in which expressions are represented as directed graphs that are incrementally reduced to normal form. In the Glasgow Haskell Compiler (GHC), the dominant Haskell implementation, code is compiled to the Spineless Tagless G-machine (STG), an abstract machine that performs lazy graph reduction without materializing an explicit spine of application nodes, handling higher-order functions efficiently through a uniform closure representation and updatable references. This model supports lazy evaluation by constructing thunks, unevaluated expressions that are shared and updated in place on demand, reducing redundant computation in polymorphic and higher-order contexts.[14]
Key challenges in these implementations include space leaks, in which unevaluated thunks accumulate in memory without being garbage-collected, potentially leading to excessive heap usage in long-running programs.[58] To address this, strictness analysis determines whether a function must evaluate its arguments to produce a defined result, allowing compilers to evaluate strict arguments eagerly and enabling parallelization or specialization.[59] Foundational work formalized strictness analysis through abstract interpretation and type-based inference, providing a basis for precise annotations that balance laziness with efficiency.
A representative example is the handling of infinite streams: in lazy languages like Haskell, streams can be defined recursively without termination issues, as in naturals = 0 : map (+1) naturals, where only the demanded elements are computed, supporting applications like symbolic computation or event streams. In contrast, strict functional languages such as Standard ML and OCaml evaluate arguments eagerly by default, which makes resource use predictable but requires explicit suspensions (for example, OCaml's Lazy module) to express infinite structures.[60]
The evolution of evaluation strategies in functional languages traces from David Turner's Miranda in the 1980s, which popularized lazy evaluation through non-strict semantics and polymorphic types, influencing subsequent designs by enabling concise expressions for complex algorithms.[13] This laziness became central to Haskell's 1990 specification, prioritizing expressiveness over strictness. Modern languages like F#, however, adopt strict-by-default (eager) evaluation to align with imperative ecosystems, using explicit lazy keywords for deferred computation while maintaining functional purity.[61]
Parallel and concurrent evaluation
In parallel and concurrent computing environments, evaluation strategies are adapted to leverage multiple processors or threads while managing shared resources and avoiding unnecessary computations. Futures and promises serve as key mechanisms for parallelism, where a future acts as a placeholder for a value computed asynchronously, allowing the main thread to continue without blocking until the result is needed.[62] Promises complement this by enabling the completion of a future from another thread, facilitating coordination in concurrent programs.[63] Non-strict evaluation strategies, such as lazy evaluation, are particularly useful in actor-based systems to prevent blocking, as messages between actors can carry unevaluated expressions that are computed only when required by the recipient.
Strict evaluation in concurrent settings introduces challenges like race conditions, where multiple threads simultaneously access and modify shared state during immediate argument evaluation, leading to nondeterministic outcomes and potential data corruption.[64] In contrast, lazy strategies enable speculative parallelism by delaying evaluation and allowing multiple threads to explore alternative computation paths concurrently, with results discarded if not needed, thus mitigating blocking and improving resource utilization in functional languages.[65]
Languages like Erlang employ eager evaluation within isolated processes, where each process maintains its own heap and garbage collector, ensuring fault isolation and preventing one process's evaluation from interfering with others through message passing rather than shared memory. Scala integrates futures for parallel execution with lazy vals, which defer initialization until first access and are thread-safe due to atomic synchronization, combining non-strict semantics with concurrent computation.
A prominent technique for parallel lazy evaluation is work stealing, used by the Glasgow Haskell Compiler (GHC) runtime: idle capabilities steal sparks, pointers to unevaluated thunks registered as candidates for parallel evaluation, from other capabilities' spark pools, balancing load and exploiting parallelism without explicit thread management by the programmer.[66]
For example, a parallel fold operation on a list can use futures to compute partial reductions concurrently: in Scala, the list is split into segments, each folded asynchronously via Future, then combined with a final fold on the results, achieving speedup over sequential call-by-need where the entire structure is built lazily before reduction.[67] This contrasts with pure call-by-need, which avoids redundant computations but processes sequentially unless annotated for parallelism.[68]
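An analogous sketch in Haskell, using the parallel package's Control.Parallel.Strategies rather than Scala's Future (shown only for illustration): the list is split into chunks, the chunks are folded in parallel, and the partial sums are combined sequentially:

import Control.Parallel.Strategies (parMap, rdeepseq)

-- Split a list into chunks of at most n elements.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt n xs in h : chunksOf n t

-- Fold each chunk in parallel, then combine the partial results.
-- Build with ghc -threaded and run with +RTS -N to use multiple cores.
parSum :: [Integer] -> Integer
parSum xs = sum (parMap rdeepseq sum (chunksOf 10000 xs))

main :: IO ()
main = print (parSum [1 .. 1000000])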
Optimizations and implementations
In strict evaluation strategies such as call by value, inlining replaces a function call with the body of the called function, reducing invocation overhead and enabling further optimizations like constant propagation and dead code elimination. This technique is particularly effective in higher-order languages, where it can inline polymorphic or generic functions after monomorphization, minimizing indirect calls and improving cache locality.
For call by need, memoization tables store computed values of arguments to prevent redundant evaluations, ensuring each thunk is reduced at most once while sharing results across multiple uses. Implemented via updateable references in the runtime, this approach avoids the exponential time costs of pure call by name, with Haskell's implementation using black holes to detect and prevent infinite loops during updates.[45]
Strictness inference analyzes program structure to determine when lazy expressions can be safely evaluated eagerly, allowing compilers to generate more efficient code by avoiding thunk creation for strict positions. Based on abstract interpretation or type-based analysis, this optimization can eliminate unnecessary thunks in functional programs, as demonstrated in analyses of higher-order strictness for data structures like lists.[69]
The Java Virtual Machine (JVM) employs eager evaluation aligned with call by value, leveraging just-in-time (JIT) compilation to apply optimizations such as method inlining, escape analysis, and loop unrolling on hot paths detected at runtime. HotSpot's tiered compilation starts with interpretation and progresses to optimized native code, achieving near-native performance for long-running applications by adapting to observed call frequencies.[70]
GHC, the Glasgow Haskell Compiler, implements lazy evaluation through graph reduction on the Spineless Tagless G-machine (STG), where expressions form sharing graphs updated in place during reduction to support call by need. This model reduces memory duplication by sharing subexpressions, with the runtime's garbage collector managing graph nodes for efficient traversal.[71]
In strict languages, tail call optimization (TCO) transforms recursive tail calls into jumps, preventing stack overflow and enabling constant stack space for deep recursion, though support varies; some JIT compilers, for instance, recognize self-tail-recursive calls and compile them as loops. In lazy evaluation, space is reclaimed through generational garbage collection, which overwrites evaluated thunks with their results and collects thunks once they become unreachable, mitigating space leaks from long thunk chains, as seen in Haskell's parallel GC variants.[72]
Hybrid approaches balance strict and lazy evaluation, such as strict-by-need in dialects like Icon, where computations are forced eagerly but results memoized to avoid recomputation, reducing overhead in goal-directed programming while preserving sharing benefits. These trade-offs allow selective strictness in lazy systems via inference, improving speed at the cost of analysis time, or lazy regions in strict systems for infinite data handling.
Post-2010 advancements in JIT compilation, such as in GraalVM, enable dynamic adaptation of evaluation strategies by profiling runtime behavior and partially evaluating code paths, switching between eager and lazy modes for hot methods to optimize for specific workloads like numerical computations.