The Design of Experiments
The Design of Experiments is a foundational text in statistics authored by British statistician and geneticist Sir Ronald Aylmer Fisher, first published in 1935 by Oliver & Boyd, which systematically articulates the principles for structuring scientific experiments to enable rigorous inference on causal effects.[1] Fisher's work emerged from his practical experience at Rothamsted Experimental Station, where he developed methods to analyze agricultural field trials amid limited resources and confounding soil variations.[2] The book emphasizes three core principles—randomization, replication, and blocking—as essential for isolating treatment effects from extraneous noise and ensuring the validity of statistical conclusions.[3] Randomization allocates treatments to experimental units by chance to mitigate unknown biases, replication provides estimates of experimental error to assess significance, and blocking groups similar units to control for known sources of variation, thereby enhancing precision without assuming causal structures beyond observed data.[4] These methods, illustrated through examples like psycho-physical tests and crop yield studies, shifted experimental practice from ad hoc observations to controlled designs capable of falsifying hypotheses via null distributions.[1] Fisher's innovations profoundly influenced fields beyond agriculture, including medicine, psychology, and engineering, by establishing randomization as indispensable for causal claims, countering earlier reliance on systematic arrangements prone to undetected bias.[2] The text also introduces concepts like factorial designs for efficient exploration of multiple factors and interactions, principles that underpin modern tools such as ANOVA for dissecting variance components.[3] While subsequent developments have extended these ideas—incorporating response surface methods and optimal designs—Fisher's insistence on integrating design with analysis remains central to empirical science, underscoring that flawed experiments yield unreliable knowledge regardless of analytical sophistication.[4]
Introduction
Overview and Significance
The Design of Experiments is a 1935 book by British statistician and geneticist Ronald A. Fisher that codifies principles for structuring scientific experiments to yield valid causal inferences. Published by Oliver and Boyd in Edinburgh, the text draws from Fisher's practical experience at Rothamsted Experimental Station, where he analyzed long-term agricultural trials, and addresses the limitations of prior observational methods by advocating controlled, probabilistic approaches to experimentation.[5][6] Fisher introduces core techniques including randomization to allocate treatments impartially and mitigate bias from unmeasured factors, replication to estimate error variance precisely, and blocking for local control of known sources of heterogeneity, such as soil variability in field trials. These elements underpin the book's integration of experimental layout with statistical analysis, notably via the analysis of variance (ANOVA) to partition observed variation into treatment effects and residual error. Illustrative examples, like the lady tasting tea demonstration of detecting subtle perceptual differences under randomized conditions, highlight how proper design enables null hypothesis testing to assess evidence against chance alone.[7][8][9] The book's enduring significance stems from establishing experimental design as a distinct scientific discipline, shifting from intuitive adjustments to systematic strategies that maximize inferential power while minimizing resource waste. Its principles revolutionized agricultural research by enabling factorial designs that test multiple factors efficiently, contributing to yield improvements and influencing broader applications in genetics, medicine, and industry. Though critiqued for underemphasizing model assumptions in later statistical paradigms, Fisher's framework remains foundational to modern design of experiments, as evidenced by its adoption in standards like those from the American Society for Quality and ongoing use in randomized controlled trials.[4][2][10]
Publication and Editions
The Design of Experiments was first published in 1935 by Oliver & Boyd in Edinburgh, comprising 248 pages and establishing foundational principles of experimental design derived from Fisher's work at Rothamsted Experimental Station.[5] The initial edition emphasized randomization, replication, and the integration of design with statistical analysis, distinguishing it from prior statistical texts focused primarily on post-hoc methods.[11] The book ran to eight editions in total, revised by Fisher himself until his death in 1962, with publication years as follows: 1935 (first), 1937 (second), 1942 (third), 1947 (fourth), 1949 (fifth), 1951 (sixth), 1960 (seventh), and 1966 (eighth, posthumous).[12] Oliver & Boyd handled all early printings in Edinburgh, while from the fifth edition onward, Hafner Publishing Company co-issued versions in New York, facilitating broader dissemination in the United States.[12] Revisions typically involved minor expansions, such as added subsections on specific designs like confounding and lattice squares, alongside textual clarifications, but the core structure and philosophical emphasis on inductive reasoning in experimentation persisted without major overhaul.[6] Later printings and reprints, including those by Hafner into the 1970s, reproduced the 1966 edition with minimal alterations, preserving its status as a concise treatise rather than an expansive manual.[13]
Author and Historical Context
Ronald Fisher's Background
Ronald Aylmer Fisher was born on 17 February 1890 in East Finchley, London, to George Fisher, an auctioneer at Robinson and Fisher, and Katie Heath, daughter of a solicitor.[14] He was the youngest of eight children, with his twin brother dying in infancy; his mother died of peritonitis in 1904 when Fisher was 14, after which his father lost the family business.[14][15] Despite suffering from extreme myopia that required multiple operations and precluded learning by visual aids, Fisher displayed precocious mathematical talent from a young age, developing an interest in astronomy by age six and attending lectures by Sir Robert Ball shortly thereafter.[14] Fisher received his early education at Stanmore Park School under W. N. Roe, followed by Harrow School from 1904, where he was taught by C. H. P. Mayo and W. N. Roseveare and won the Neeld Medal in mathematics in 1906.[15][14] Because of his poor eyesight he relied heavily on mental computation in his mathematical training, and he was introduced to biology by the teacher Arthur Vassal.[16] In October 1909, he matriculated at Gonville and Caius College, Cambridge, graduating in 1912 with distinction in the mathematical tripos, influenced by his family's academic tradition including an uncle who was a Cambridge Wrangler.[15][14] Following graduation, Fisher worked briefly on a farm in Canada and as a statistician at the Mercantile and General Investment Company in London; rejected for military service in the First World War because of his poor eyesight, he turned instead to teaching.[15] From 1915 to 1919, he taught mathematics and physics at schools including Rugby, continuing independent research in statistics and genetics during this period.[15] In 1917, he married Ruth Eileen Guinness in a secret ceremony; the couple would have nine children.[15] These early experiences honed his quantitative skills, setting the stage for his later innovations in experimental design upon joining Rothamsted Experimental Station in 1919.[15]
Development at Rothamsted Experimental Station
Ronald Fisher joined Rothamsted Experimental Station in 1919 as its first statistician, tasked with analyzing an extensive archive of agricultural data from field trials dating back to the 1840s, including the Broadbalk Wheat Experiment initiated in 1843.[17][18] The station's pre-existing experiments often yielded inconclusive results due to uncontrolled soil heterogeneity, systematic plot arrangements that introduced bias, and insufficient replication, which hindered reliable inference on treatments like fertilizers and crop varieties.[19] To address these limitations, Fisher developed foundational principles for field experiment design, emphasizing randomization to distribute treatments across plots and mitigate unknown sources of variation, thereby enabling valid statistical inference.[20] He advocated replication of treatments to partition observed variance into components attributable to treatments and experimental error, allowing estimation of error degrees of freedom—typically requiring at least three replicates per treatment for practical significance testing.[20] Complementing these, blocking was introduced to group plots into homogeneous units (e.g., based on soil fertility gradients), reducing error variance by accounting for local control without confounding treatment effects.[21] Fisher's innovations extended to efficient layouts such as Latin squares for controlling two sources of variation (e.g., rows and columns in rectangular fields), which he applied to Rothamsted's variety trials and manure experiments starting in the early 1920s.[22] In 1926, he formalized these ideas in the paper "The Arrangement of Field Experiments," prescribing randomized blocks and factorial designs to maximize information yield from limited resources, such as testing multiple nutrient interactions in a single trial.[23] Concurrently, he devised analysis of variance (ANOVA) as a computational framework to dissect multi-way treatment structures, demonstrated on Rothamsted data like the 1923-1925 pot culture trials with nitrogen, phosphorus, and potassium factors.[21] During his 14-year tenure until 1933, Fisher trained Rothamsted researchers in these methods, transforming the station's practices from ad hoc trials to systematically designed experiments that supported robust conclusions on crop responses—evident in over 100 redesigned field layouts by the late 1920s.[17] These developments, rooted in the station's agricultural imperatives, provided the empirical foundation and methodological core for Fisher's 1935 book The Design of Experiments, which synthesized randomization, replication, and blocking as indispensable for causal inference in varied sciences.[19]
Core Principles of Experimental Design
Randomization
Randomization constitutes a core principle in Ronald A. Fisher's framework for experimental design, mandating the random allocation of treatments to experimental units to preclude systematic bias and underpin the validity of inferential statistics. Fisher first articulated this requirement in his 1925 text Statistical Methods for Research Workers, positing that deliberate assignment leaves results exposed to unknown confounding factors that undermine causal attribution, whereas randomization ensures treatments are probabilistically equivalent with respect to all extraneous influences.[24][7] The mechanism operates by treating the assignment process as a random permutation of labels to units, distributing unidentified sources of heterogeneity evenly across groups. This approach yields a randomization distribution under the null hypothesis of no treatment effect, enabling exact significance tests via enumeration of all possible reallocations—a method Fisher illustrated with small-scale examples, such as comparing yields from two fertilizer types on four plots, where the p-value is the proportion of permutations as extreme as or more extreme than the observed outcome.[25][26] In The Design of Experiments (1935), Fisher developed this argument in Chapter II, in the section "Randomisation; the Physical Basis of the Validity of the Test," arguing that it confers model-free validity by grounding inference in the physical act of randomization rather than untestable parametric assumptions about error distributions.[3] Fisher's advocacy stemmed from practical necessities at Rothamsted Experimental Station, where agricultural field trials faced spatial gradients and soil variability; non-random designs, like systematic layouts, risked aligning treatments with these gradients, inflating error variance or masking effects. Randomization mitigates such risks without requiring prior knowledge of nuisance factors, though Fisher cautioned that it does not guarantee balance—probabilistic equivalence suffices for unbiased estimation and valid testing.[27] Critics, including Neyman, later contended that randomization alone inadequately estimates treatment effects in finite samples without superpopulation models, but Fisher's position prioritized exact conditional inference over asymptotic approximations, emphasizing randomization's role in isolating causal impacts from chance.[25] Empirical validations, such as simulations in subsequent statistical literature, affirm that randomized designs yield unbiased variance estimates superior to systematic alternatives when unknown sources of variation abound.
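A minimal sketch of such an enumeration test, assuming hypothetical plot yields for two fertilizer treatments; the function name and numbers are illustrative, not taken from the book.

```python
from itertools import combinations

def randomization_test(values_a, values_b):
    """Exact randomization (permutation) test for a difference in means.

    Under the null hypothesis of no treatment effect, every relabelling of
    the plots is equally likely, so the p-value is the fraction of
    relabellings whose absolute mean difference is at least as large as
    the one actually observed.
    """
    observed = abs(sum(values_a) / len(values_a) - sum(values_b) / len(values_b))
    pooled = list(values_a) + list(values_b)
    n_a = len(values_a)
    as_extreme = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        diff = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
        total += 1
        if diff >= observed:
            as_extreme += 1
    return as_extreme / total

# Hypothetical yields from four plots per fertilizer (illustrative numbers only).
print(randomization_test([29.9, 31.4, 26.6, 33.7], [26.6, 23.4, 28.5, 24.2]))
```

With eight plots split four and four, only 70 reallocations exist, so the enumeration is exact rather than approximate, mirroring the logic Fisher describes.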
Replication
Replication refers to the repetition of each treatment across multiple independent experimental units within a single experiment, enabling the separation of treatment effects from random variation. This principle, one of the three foundational elements of modern experimental design alongside randomization and blocking, was formalized by Ronald A. Fisher to provide an estimate of the inherent variability or "error" in the experimental material, independent of any systematic treatment differences.[28] Without replication, observed differences between treatments cannot be reliably attributed to the treatments themselves, as they may arise solely from uncontrolled fluctuations in the experimental units, such as soil fertility gradients in agricultural trials. The primary statistical benefit of replication is the estimation of the error variance, denoted as \sigma^2, through the within-treatment variability, which supplies the degrees of freedom required for tests of significance, such as the analysis of variance (ANOVA). For instance, with r replicates per treatment, the mean square error from ANOVA provides an unbiased estimator of \sigma^2, allowing computation of the standard error of a treatment mean as \sigma / \sqrt{r}, which declines in proportion to 1/\sqrt{r} as the replication level grows and thus enhances the precision and power of detecting genuine effects.[29] Fisher emphasized that this error estimate, derived under randomization, ensures the validity of inference because, under the null hypothesis of no treatment differences, observed treatment contrasts behave like samples from this error distribution.[7] In practice, the number of replicates balances the need for precision against resource constraints; Fisher recommended sufficient replication to yield adequate degrees of freedom for error estimation, typically at least two or three per treatment in simple designs, though more may be required for high variability or small effects. At Rothamsted Experimental Station, where Fisher applied these principles to crop yield trials from the 1920s onward, replication across plots—often in randomized blocks—averaged out local environmental heterogeneity, as demonstrated in early barley and wheat experiments where replicated plots revealed significant variety differences only after error partitioning.[20] Excessive replication without corresponding randomization or blocking, however, risks inefficient resource use, as it amplifies rather than mitigates systematic biases if not properly controlled.[30]
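A short sketch of how replication feeds the error estimate, assuming hypothetical replicate yields; the pooled within-treatment variance stands in for \sigma^2, and the standard error of each treatment mean shrinks as 1/\sqrt{r}. The function name and data are illustrative only.

```python
from statistics import mean, variance

def replication_summary(replicates):
    """Pool the within-treatment variability from replicated observations.

    `replicates` maps each treatment to its list of replicate values. The
    pooled within-treatment variance estimates sigma^2, and the standard
    error of a treatment mean based on r replicates is sqrt(sigma^2 / r).
    """
    ss_error = sum((len(obs) - 1) * variance(obs) for obs in replicates.values())
    df_error = sum(len(obs) - 1 for obs in replicates.values())
    s2 = ss_error / df_error  # unbiased estimate of the error variance
    summaries = {name: (mean(obs), (s2 / len(obs)) ** 0.5)  # (mean, std. error)
                 for name, obs in replicates.items()}
    return summaries, s2, df_error

# Hypothetical yields with r = 4 replicates per treatment (illustrative only).
data = {"control": [5.1, 4.8, 5.4, 5.0], "fertilised": [5.9, 6.2, 5.7, 6.1]}
print(replication_summary(data))
```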
Blocking and Local Control
Blocking, also termed local control, involves partitioning experimental units into groups, or blocks, anticipated to exhibit greater homogeneity within than between groups, thereby isolating and reducing the influence of identifiable nuisance factors on treatment effect estimates.[31] This technique enhances experimental precision by confining extraneous variation to between-block comparisons, allowing more accurate attribution of differences to treatments via subsequent randomization within blocks.[28] Ronald A. Fisher integrated blocking as one of three foundational principles of experimental design—alongside randomization and replication—particularly for agricultural trials where environmental heterogeneity, such as soil fertility gradients, could otherwise confound results.[20] In practice, blocks are formed based on prior knowledge of variation sources; for instance, in field experiments, contiguous plots form blocks to control for spatial gradients in nutrient availability or topography, as Fisher advocated during his tenure at Rothamsted Experimental Station starting in 1919.[32] Treatments are then randomly allocated within each block, yielding designs like the randomized block design (RBD), where analysis of variance partitions total variability into components attributable to blocks, treatments, and residual error.[33] This partitioning quantifies block effects separately, reducing the error mean square and thereby elevating the F-statistic for treatment tests, which improves statistical power without biasing estimates under proper randomization.[34] Fisher's emphasis on local control stemmed from empirical observations at Rothamsted, where unblocked layouts in early 20th-century trials yielded high residual variation, masking subtle varietal or fertilizer effects; by 1921, he had implemented block-based designs in wheat experiments, demonstrating error reductions of up to 50% compared to fully randomized alternatives.[35] Local control does not eliminate all uncontrolled variation but targets known factors, complementing randomization's role in handling unknowns; failure to block on major nuisances can inflate error variances, as evidenced in simulations where omitting soil blocks doubled required sample sizes for equivalent power.[36] While effective, blocking restricts full randomization, necessitating careful block definition to avoid confounding if treatments correlate with block factors.
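A minimal sketch of the randomized-block partition described above, with hypothetical yields laid out as treatments by blocks; the function and numbers are illustrative only.

```python
def rbd_anova(table):
    """Partition sums of squares for a randomized block design.

    `table[i][j]` is the response for treatment i in block j (every
    treatment appears exactly once in every block). Returns the treatment,
    block, and error sums of squares with their degrees of freedom.
    """
    t, b = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (t * b)
    ss_total = sum((y - grand) ** 2 for row in table for y in row)
    ss_treat = b * sum((sum(row) / b - grand) ** 2 for row in table)
    ss_block = t * sum((sum(table[i][j] for i in range(t)) / t - grand) ** 2
                       for j in range(b))
    ss_error = ss_total - ss_treat - ss_block
    return {"treatments": (ss_treat, t - 1),
            "blocks": (ss_block, b - 1),
            "error": (ss_error, (t - 1) * (b - 1))}

# Hypothetical 3 treatments x 4 blocks (illustrative numbers only).
yields = [[20.1, 18.9, 21.3, 19.7],
          [22.4, 21.0, 23.1, 21.8],
          [19.5, 18.2, 20.6, 19.0]]
print(rbd_anova(yields))
```

Dividing each sum of squares by its degrees of freedom gives the mean squares whose ratio forms the F-test discussed in the ANOVA section below.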
Key Concepts and Methods
Null Hypothesis and Significance Testing
In Ronald Fisher's statistical methodology, the null hypothesis constitutes a precise statement positing no effect from the experimental treatment or no difference among compared groups, enabling the computation of the exact probability distribution for the observed data under this assumption.[1] This approach, articulated in The Design of Experiments (1935), underpins significance testing by providing a benchmark against which experimental results are evaluated. Fisher emphasized that the null hypothesis must be formulated specifically enough to derive the sampling distribution analytically, often facilitated by randomization in experimental design, which ensures the validity of the probability calculations.[37] Significance testing, as delineated by Fisher, entails calculating the p-value: the probability, assuming the null hypothesis holds true, of observing data at least as inconsistent with the null as the actual results obtained.[37] A low p-value indicates that the data are improbable under the null, furnishing inductive evidence against it, though Fisher cautioned that the null is never proven but may be disproved.[1] Unlike the Neyman-Pearson framework, which incorporates alternative hypotheses and concepts of power and error rates, Fisher's method focuses solely on the null, rejecting rigid decision rules in favor of interpreting p-values on a continuous scale of evidential strength.[37] He advocated thresholds like 0.05 or 0.01 for gauging significance in practice, but stressed their arbitrariness, prioritizing the magnitude of the p-value. A paradigmatic illustration appears in Chapter II of The Design of Experiments, the "lady tasting tea" experiment, where a woman claims the ability to discern whether milk or tea was added first to a cup.[2] Under the null hypothesis of no discriminative power, eight cups—four prepared each way—are randomized and presented blindly; correct identification of all yields a p-value of \frac{1}{\binom{8}{4}} = \frac{1}{70} \approx 0.014, obtained by direct enumeration of the 70 equally likely ways of selecting four cups, rejecting the null at conventional levels. This example underscores how randomization in design permits exact significance assessment without approximations, highlighting replication's role in enhancing precision against variability.[1] Fisher integrated null hypothesis testing with experimental principles like randomization, replication, and blocking to maximize the sensitivity of tests—reducing the likelihood of failing to detect true effects by minimizing extraneous variance—while maintaining the integrity of p-value computations. He argued that efficient designs amplify evidence against the null when effects exist, as "the value of the experiment is increased whenever it permits the null hypothesis to be more readily disproved."[1] This causal realism in inference prioritizes designs yielding data most discrepant from the null under alternative realities, fostering rigorous scientific induction over mere data collection.[37]
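The 1-in-70 figure, and the null distribution behind it, follow directly from the combinatorics of the randomized eight-cup layout; the snippet below reproduces that enumeration (the function name and its default arguments are illustrative).

```python
from math import comb

def tea_tasting_pvalue(cups=8, milk_first=4, correct=4):
    """P(at least `correct` milk-first cups identified) under the null.

    With `cups` cups of which `milk_first` are milk-first, and the subject
    asked to pick exactly `milk_first` cups, the number of correct picks
    follows a hypergeometric distribution when she is merely guessing.
    """
    total = comb(cups, milk_first)
    return sum(comb(milk_first, k) * comb(cups - milk_first, milk_first - k)
               for k in range(correct, milk_first + 1)) / total

print(tea_tasting_pvalue())           # 1/70 ~ 0.0143: all four identified
print(tea_tasting_pvalue(correct=3))  # 17/70 ~ 0.243: three or more correct
```

The second call illustrates Fisher's point that getting only three of the four cups right is unconvincing: the chance of doing at least that well by guessing exceeds 24%.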
Analysis of Variance (ANOVA)
Analysis of variance (ANOVA) is a statistical method pioneered by Ronald Fisher in the early 1920s to evaluate the effects of experimental treatments on quantitative outcomes, particularly in agricultural field trials where multiple factors influence variability in yields or responses.[38] Fisher developed ANOVA while analyzing historical data at Rothamsted Experimental Station, enabling the decomposition of total observed variation into additive components attributable to specific sources, such as treatments, environmental blocks, and residual error, thereby isolating genuine treatment effects from random noise.[39] This approach contrasted with earlier pairwise comparisons, which lacked efficiency for multifactor designs, by providing a unified framework to test the null hypothesis that all treatment means are equal against alternatives of systematic differences.[40] In Fisher's framework, as elaborated in The Design of Experiments (1935), ANOVA operates by partitioning the total sum of squares (SST, measuring overall deviation from the grand mean) into sums of squares for treatments (SSTr), blocks (SSB, to account for local soil or field heterogeneity), and error (SSE, representing unexplained variation).[3] Degrees of freedom are correspondingly allocated: for a one-way layout with t treatments and n replicates per treatment, treatment degrees of freedom are t-1, error degrees of freedom are t(n-1), and the mean square for treatments (MST = SSTr / (t-1)) divided by the mean square error (MSE = SSE / [t(n-1)]) yields the F-statistic.[41] Fisher derived the null distribution of this variance ratio—later tabulated as the F-distribution—so that the probability of a ratio as large as that observed can be assessed against significance thresholds (e.g., p < 0.05), with published tables serving small-sample practice.[42] For factorial experiments, central to Fisher's advocacy for efficient designs, ANOVA extends to multi-way classifications, estimating main effects and interactions by further subdividing treatment sums of squares; for instance, in a 2x2 factorial with factors A and B, the model includes terms for A, B, and the AB interaction, tested via nested F-ratios against appropriate error terms.[43] This hierarchical structure enhances power by detecting synergies or antagonisms among factors, as demonstrated in Fisher's Rothamsted analyses of manure types and crop rotations, where interactions revealed that nitrogenous fertilizers amplified yields only under specific soil conditions.[44] Blocking integrates with ANOVA by removing between-block variance, increasing sensitivity to treatments; in randomized block designs, the model is Y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}, where \tau_i are treatment effects (constrained to sum to zero) and \beta_j are block effects, with the ANOVA table laid out as follows (a computational sketch of the corresponding F-test appears after the table):
| Source | Degrees of Freedom | Sum of Squares | Mean Square | F-ratio |
|---|---|---|---|---|
| Treatments | t-1 | SSTr | MST = SSTr/(t-1) | MST/MSE |
| Blocks | b-1 | SSB | MSB = SSB/(b-1) | (not always tested) |
| Error | (t-1)(b-1) | SSE | MSE = SSE/((t-1)(b-1)) | - |
| Total | tb-1 | SST | - | - |
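The F-ratio in the table's first row is the quantity referred to the F-distribution; the short sketch below shows that step under assumed mean squares and degrees of freedom, using SciPy's F distribution for the tail probability. The numeric inputs are placeholders, not values from the book.

```python
from scipy.stats import f  # F distribution for the variance-ratio test

def f_test(ms_treatment, df_treatment, ms_error, df_error):
    """Variance-ratio (F) test from ANOVA mean squares.

    Returns the F statistic MST/MSE and the probability, under the null
    hypothesis of equal treatment means, of a ratio at least this large.
    """
    f_stat = ms_treatment / ms_error
    return f_stat, f.sf(f_stat, df_treatment, df_error)

# Hypothetical mean squares for 5 treatments in 8 blocks (illustrative only):
# treatment df = 5 - 1 = 4, error df = (5 - 1) * (8 - 1) = 28.
print(f_test(ms_treatment=12.6, df_treatment=4, ms_error=3.1, df_error=28))
```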
Confounding and Experimental Efficiency
Confounding refers to the deliberate inseparability of certain effects in experimental designs, particularly in factorial arrangements where higher-order interactions are sacrificed to estimate main effects and lower-order interactions with fewer experimental units. In Ronald Fisher's framework, this technique emerged as a solution to the impracticality of complete replication in experiments with many factors, allowing researchers to confound triple or higher interactions with block differences while preserving orthogonality for the primary effects of interest.[3] Fisher outlined these methods in his 1935 book The Design of Experiments, emphasizing that such confounding maintains the validity of variance analysis provided the confounded effects are negligible or of subordinate importance.[1] This approach enhances experimental efficiency by reducing the required block size without proportionally increasing error variance. For instance, in a 2^3 factorial design, the eight treatment combinations can be split into two blocks of four by confounding the three-factor interaction with the block difference, enabling the estimation of main effects and two-factor interactions with higher precision than a layout forcing all eight combinations into a single, more heterogeneous block.[11] Efficiency gains arise because confounding permits smaller, more homogeneous blocks or more factors within resource constraints, minimizing the degrees of freedom lost to block effects while concentrating them on treatments; Yates and Fisher demonstrated this in agricultural trials where the number of treatment combinations exceeded the feasible block size, estimating efficiency as the ratio of error mean squares reduced via partial confounding.[28] The method assumes higher-order interactions contribute minimally to variation, a principle validated in Rothamsted field trials where confounded designs yielded more precise treatment comparisons than unblocked alternatives.[20] Randomization remains essential to mitigate systematic confounding from uncontrolled variables, such as soil gradients, ensuring that the confounded effects are not further biased by external factors. Fisher distinguished planned confounding, in which inseparability is introduced deliberately in block designs to boost efficiency, from uncontrolled confounding, which introduces bias; this distinction underscores causal inference by isolating treatment effects from nuisance variables.[45] In practice, double confounding—further sacrificing subordinate interactions—extends applicability to experiments with seven or more factors, as detailed in Fisher's later editions, achieving up to 50% reductions in experimental units for equivalent power in detecting main effects.[3] Such designs prioritize empirical precision over exhaustive estimation, aligning with Fisher's causal realism that experiments test specific hypotheses rather than all possible interactions.[46]
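A minimal sketch of the block split described above for a 2^3 factorial, confounding the three-factor interaction with the block difference; the function name and labelling convention are illustrative assumptions, not notation from the book.

```python
from itertools import product

def confound_abc(factors=("A", "B", "C")):
    """Split a 2^3 factorial into two blocks by confounding the ABC interaction.

    Each treatment combination is a tuple of 0/1 levels. Its block is the
    parity of the sum of levels, which is exactly the sign of the ABC
    contrast; the three-factor interaction is therefore inseparable from
    the block difference, while main effects and two-factor interactions
    remain estimable within blocks.
    """
    blocks = {0: [], 1: []}
    for levels in product((0, 1), repeat=len(factors)):
        label = "".join(f for f, lvl in zip(factors, levels) if lvl) or "(1)"
        blocks[sum(levels) % 2].append(label)
    return blocks

print(confound_abc())
# {0: ['(1)', 'BC', 'AC', 'AB'], 1: ['C', 'B', 'A', 'ABC']}
```

The even-parity group is the classical principal block {(1), AB, AC, BC}; any estimate of ABC is identical to the block contrast and is therefore given up.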
Content Structure
Chapter Summaries
Chapter I: Introduction
Fisher introduces the necessity of statistical methods in experimental design to address disputes over evidence, emphasizing that inductive inferences must account for uncertainty through probability rather than inverse probability, which he rejects for its logical inconsistencies. He argues that laboratory logic requires experiments designed to yield valid conclusions by controlling variation sources, rather than mere data accumulation. The chapter sets the foundation for rigorous experimentation, critiquing ad hoc interpretations and advocating planned designs to advance scientific knowledge.[11]
Chapter II: The Principles of Experimentation, Illustrated by a Psycho-Physical Experiment
Using a tea-tasting experiment where a subject distinguishes milk-first from tea-first preparations, Fisher illustrates core principles: the null hypothesis of no discrimination, randomization to ensure valid significance tests, and replication to enhance sensitivity. Randomization prevents systematic biases, allowing exact probability calculations under the null (e.g., 1 in 70 chance for correct guesses in an 8-cup trial). He stresses that significance levels, such as 5%, guide inference without proving alternatives, and qualitative refinements can boost design efficiency without scaling up sample size.[11]
Chapter III: A Historical Experiment on Growth Rate (Darwin’s Experiment on the Growth of Plants)
Fisher reanalyzes Darwin's data on crossed versus self-fertilized plant heights, applying pairing to reduce environmental variation and Student's t-test to assess significance (e.g., t = 2.148 for 15 pairs, indicating the superiority of the crossed plants). He critiques non-randomized arrangements like Galton's for inflating apparent effects through data manipulation, underscoring randomization's role in unbiased error estimation. The chapter extends to testing wider hypotheses, showing that randomization validates distributional assumptions and prevents fallacious conclusions from systematic biases.[11]
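The paired analysis Fisher applies here can be sketched as a generic paired t-test on the matched differences; the heights passed in below are placeholders rather than Darwin's measurements, and SciPy supplies the t-distribution tail probability.

```python
from statistics import mean, stdev
from scipy.stats import t as t_dist

def paired_t(crossed, selfed):
    """Paired t-test on matched differences (crossed minus self-fertilised).

    Pairing removes plot-to-plot environmental variation before testing:
    t = dbar / (s_d / sqrt(n)) on n - 1 degrees of freedom.
    """
    diffs = [c - s for c, s in zip(crossed, selfed)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / n ** 0.5)
    return t_stat, 2 * t_dist.sf(abs(t_stat), n - 1)  # two-sided p-value

# Placeholder heights (not Darwin's figures); pass all 15 matched pairs in practice.
print(paired_t([22.1, 19.8, 24.3, 21.0], [18.4, 20.2, 19.9, 17.6]))
```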
Chapter IV: An Agricultural Experiment in Randomised Blocks
In a crop variety trial across 8 blocks of 5 plots each, Fisher demonstrates randomized blocks to separate treatment effects from block heterogeneity via analysis of variance (ANOVA), allocating degrees of freedom (e.g., 4 for varieties, 28 for error). Replication across blocks averages plot errors for precision, while randomization avoids biases from systematic layouts. He advises compact blocks and edge discarding to minimize residual error, illustrated by a barley yield example showing improved variety comparisons.[11]
Chapter V: The Latin Square
Fisher presents Latin squares for experiments with two nuisance factors (e.g., rows and columns), randomizing treatments under double restrictions to estimate error after adjusting for these (e.g., 20 degrees of freedom in a 6x6 square). Faulty analyses that lump row/column effects into error reduce precision, and systematic squares risk underestimating true variability, as in Tedin's trials. Extensions to Graeco-Latin squares handle additional factors combinatorially, with a potato fertilizer example highlighting practical error control.[11]
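A Latin square layout can be sketched programmatically; the snippet below is an illustrative construction (shuffling rows, columns, and treatment labels of a cyclic square) rather than Fisher's own randomization procedure, which drew from enumerated standard squares, but it produces a valid square satisfying the double restriction.

```python
import random

def random_latin_square(treatments):
    """Randomized Latin square: each treatment once per row and per column.

    Starts from the cyclic square and randomizes by shuffling rows, columns,
    and treatment labels. Note this simple shuffle does not reach every
    possible square, but every result is a valid Latin square.
    """
    n = len(treatments)
    square = [[(i + j) % n for j in range(n)] for i in range(n)]
    random.shuffle(square)              # permute rows
    cols = list(range(n))
    random.shuffle(cols)                # permute columns
    labels = list(treatments)
    random.shuffle(labels)              # permute treatment labels
    return [[labels[square[i][c]] for c in cols] for i in range(n)]

for row in random_latin_square("ABCDE"):
    print(" ".join(row))
```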
Chapter VI: The Factorial Design in Experimentation
Fisher advocates factorial designs over single-factor tests to detect interactions, as in a 4-factor, 16-combination experiment yielding main effects and pairwise interactions efficiently. These designs support broader inductive inferences by varying conditions simultaneously, incorporating subsidiary factors like timing at minimal extra cost. For unreplicated cases with many factors, higher-order interactions serve as error estimates, enabling robust conclusions from complex systems.[11]
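A minimal sketch of how main effects and interactions are estimated from a 2^k factorial by contrasts; the helper name and the 2^2 responses shown are illustrative assumptions, not data from the book.

```python
from itertools import product

def factorial_effects(responses):
    """Main effects and interactions for a 2^k factorial.

    `responses` maps each treatment combination, a tuple of k levels coded
    0 (low) or 1 (high), to its response. Each effect is the contrast sum
    divided by half the number of runs, i.e. the average response where the
    contrast is +1 minus the average where it is -1.
    """
    k = len(next(iter(responses)))
    half = 2 ** (k - 1)
    effects = {}
    for subset in product((0, 1), repeat=k):
        if not any(subset):
            continue                      # grand mean term, skipped
        name = "".join("ABCDEFGH"[i] for i, s in enumerate(subset) if s)
        contrast = 0.0
        for levels, y in responses.items():
            coef = 1
            for lvl, s in zip(levels, subset):
                if s:
                    coef *= 1 if lvl else -1
            contrast += coef * y
        effects[name] = contrast / half
    return effects

# Hypothetical 2^2 experiment (illustrative responses only).
obs = {(0, 0): 20.0, (1, 0): 30.0, (0, 1): 25.0, (1, 1): 41.0}
print(factorial_effects(obs))  # effects for A, B, and the AB interaction
```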
Chapter VII: Confounding
To manage field heterogeneity in large designs, Fisher introduces confounding, where main effects or interactions are estimated from subsets of replications (e.g., partial confounding in 8- or 27-treatment layouts). Orthogonal sets allow controlled aliasing of higher-order terms, preserving key comparisons via ANOVA. This balances precision gains against information loss on confounded effects, essential for scalable agricultural trials.[11]
Chapter VIII: Special Cases of Partial Confounding
Building on confounding, Fisher details applications like dummy treatments for quality-quantity interactions or material comparisons, as in an 81-plot fertilizer experiment. Partial strategies confound minor interactions to prioritize main effects, with analysis interpreting aliased terms cautiously. Early examples underscore design flexibility in interpreting confounded outcomes without sacrificing overall efficiency.[11]
Chapter IX: The Increase of Precision by Concomitant Measurements (Statistical Control)
Concomitant variables, like covariates, enhance precision through regression adjustments (e.g., correcting yields by soil metrics), tested for significance post hoc. Arbitrary corrections risk bias, but valid ones reduce error variance without confounding treatments. A practical example illustrates improved estimates in correlated measurements, emphasizing statistical control's role in handling observational-like variation within experiments.[11]
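A minimal sketch of the covariance-adjustment idea, using the pooled within-treatment regression of the response on a concomitant variable; the function name and the (covariate, yield) pairs are hypothetical, and the significance test of the slope is omitted here.

```python
from statistics import mean

def covariance_adjusted_means(groups):
    """Adjust treatment means of y for a concomitant variable x.

    `groups` maps each treatment to a list of (x, y) pairs. The slope is the
    pooled within-treatment regression of y on x; each treatment mean of y
    is corrected to the overall mean of x, removing variation in y that
    merely tracks the covariate.
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    grand_x = mean(all_x)
    sxx = sxy = 0.0
    for pairs in groups.values():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
    slope = sxy / sxx
    return {t: mean(y for _, y in pairs) - slope * (mean(x for x, _ in pairs) - grand_x)
            for t, pairs in groups.items()}

# Hypothetical (soil_index, yield) pairs per treatment (illustrative only).
data = {"control": [(3.1, 18.0), (2.8, 17.2), (3.6, 19.1)],
        "treated": [(3.4, 21.0), (2.9, 19.8), (3.8, 22.4)]}
print(covariance_adjusted_means(data))
```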
Chapter X: The Generalisation of Null Hypotheses (Fiducial Probability)
Fisher generalizes null hypothesis testing beyond means to ANOVA components, using t- and χ² distributions for interactions and comparisons. Multiplicity of tests requires information-based precision measures, extending to fiducial intervals for parameter bounds. This framework unifies significance across design types, cautioning against overinterpreting non-significant interactions.[11]
Chapter XI: The Measurement of Amount of Information
Information is quantified as the reciprocal of the variance of an efficient estimate (that is, its precision), applicable to estimation in frequencies, regressions, and assays, where good designs minimize the loss of information (e.g., in linkage studies). Biological assays optimize dose allocations for efficient potency estimates. The chapter ties design to inference, stressing experiments that maximize relevant information while controlling variance sources.[11]
[Image: Ronald Fisher depicted in stained glass at Gonville and Caius College]