
Likelihood principle

The Likelihood Principle (LP) is a normative principle in statistical inference asserting that, given observed data x from an experiment with parameter \theta, all relevant evidence about \theta is fully captured by the likelihood function L(\theta \mid x), and any two experiments yielding proportional likelihood functions provide equivalent inferential information, independent of the experimental design, sampling procedure, or stopping rules. Formally introduced by Allan Birnbaum in 1962, the LP emerged from efforts to unify the foundations of inference during the mid-20th-century debates between frequentist and Bayesian approaches, building on earlier ideas from R. A. Fisher and others on likelihood as a measure of evidence. Birnbaum demonstrated that the LP logically follows from two other principles: the Sufficiency Principle, which states that inferences should depend only on sufficient statistics summarizing the data without loss of information, and the Conditionality Principle, which requires conditioning inferences on the experiment actually performed, as indicated by observed ancillary statistics, rather than on hypothetical repetitions of the experiment. This derivation positioned the LP as a cornerstone for methods such as maximum likelihood estimation and Bayesian updating, where posterior distributions are proportional to the likelihood times the prior.

The LP has profound implications for statistical practice, emphasizing post-data analysis over pre-data design considerations, such as sample size or optional stopping, which it deems irrelevant to evidential interpretation. It aligns closely with Bayesian and likelihood-based inference, promoting coherence by ensuring that equivalent likelihoods lead to identical conclusions, as seen in applications to Poisson processes or normal models where the likelihood fully parameterizes uncertainty. However, it remains controversial, particularly among frequentists, who argue that it neglects long-run error rates, confidence coverage, and the role of unobserved outcomes in assessing procedure validity, leading to critiques that LP-compliant methods can yield intervals with poor repeated-sampling properties. Despite these debates, the LP continues to influence modern inference, including robust Bayesian techniques and discussions of model adequacy.

Core Concepts

Definition and Formal Statement

The likelihood principle is a foundational concept in statistical inference that asserts the evidential content of experimental data regarding a parameter is fully captured by the likelihood function derived from the observed outcome, independent of the broader experimental design or sampling procedure. To understand this, first distinguish the likelihood function from probability: while a probability function treats the data as random and the parameter as fixed, the likelihood function L(\theta \mid x) treats the observed data x as fixed and views the parameter \theta as variable, defined as L(\theta \mid x) = f(x \mid \theta), where f is the probability density or mass function of the data given \theta. The function is unique up to a positive constant multiplier, as scaling by such a constant does not alter comparative inferences about \theta. Formally, the likelihood principle states that if two experiments E and E' share the same parameter space and their observed outcomes x and y yield proportional likelihood functions—i.e., f(x \mid \theta) = c \cdot g(y \mid \theta) for some positive constant c and all \theta—then the evidential meaning of the outcomes is identical, and all inferences about \theta should coincide, irrespective of differences in the experiments' structures. In other words, "the evidential meaning of any outcome x of any experiment E is characterized completely by the likelihood function c f(x, \theta), and is otherwise independent of the structure of E." This principle implies that all relevant information from the experiment about \theta resides solely in the likelihood function for the observed data, excluding considerations of the space of possible unobserved outcomes.
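As a minimal numerical sketch of this proportionality claim (Python with NumPy; the success/failure counts and the scaling constant are illustrative assumptions, not part of the formal statement), scaling a likelihood by any positive constant leaves both the relative support across parameter values and the maximizing value unchanged:

```python
import numpy as np

theta = np.linspace(0.01, 0.99, 99)       # grid of candidate parameter values

def lik(theta):                           # likelihood kernel for 7 successes, 3 failures (assumed data)
    return theta**7 * (1 - theta)**3

c = 120.0                                 # arbitrary positive constant (e.g., a binomial coefficient)
L1, L2 = lik(theta), c * lik(theta)

# Identical relative support and identical maximizer: proportional likelihoods
# carry the same evidential content under the likelihood principle.
assert np.allclose(L1 / L1.max(), L2 / L2.max())
assert theta[np.argmax(L1)] == theta[np.argmax(L2)]
print(theta[np.argmax(L1)])               # ~0.7, the maximum likelihood estimate
```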

Relation to Likelihood Function

The likelihood function, denoted L(\theta \mid x), is defined as the joint probability or probability density of the observed data x given the parameter \theta, treated as a function of \theta for fixed x; it is typically taken up to a proportionality constant independent of \theta. This formulation, introduced by Ronald A. Fisher, shifts the focus from the data as random to the parameter as the variable of interest, enabling direct assessment of how well different \theta values explain the fixed observations. A key property of the likelihood function is its invariance under reparameterization: if the parameter is transformed via \phi = g(\theta) where g is a one-to-one function, the likelihood L(\phi \mid x) in the new parameterization preserves the relative evidential support for different values compared to L(\theta \mid x), ensuring consistent inference regardless of how \theta is expressed. Another central role is in maximum likelihood estimation (MLE), where the point estimate \hat{\theta} is obtained as \hat{\theta} = \arg\max_{\theta} L(\theta \mid x), selecting the value that maximizes the probability (or density) of the observed data. For computational purposes, the log-likelihood \ell(\theta \mid x) = \log L(\theta \mid x) is preferred, as the logarithm converts products over independent observations into sums, simplifying differentiation and optimization while preserving the maximizing argument due to the monotonicity of the log function.

For common distributions, explicit forms facilitate this process. For an independent sample from a normal distribution with mean \mu and variance \sigma^2, the log-likelihood is \ell(\mu, \sigma^2 \mid x) = -\frac{n}{2} \log(2\pi \sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^n (x_i - \mu)^2, where n is the sample size. For a binomial distribution with n trials and k successes, parameterized by success probability p, it is \ell(p \mid k, n) = k \log p + (n - k) \log(1 - p), up to an additive constant. The likelihood function underpins the likelihood principle by encapsulating all evidential content about \theta from the data: valid inferences must rely solely on L(\theta \mid x) and exclude considerations of the sampling distribution, which describes hypothetical data generation rather than the observed evidence. This separation emphasizes the function's role as the sole basis for comparing parameter plausibility.
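The following sketch (Python with NumPy and SciPy; the simulated data, the fixed variance, and the sample sizes are assumptions made purely for illustration) evaluates these two log-likelihoods and confirms numerically that their maximizers match the familiar closed-form estimates:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=100)           # simulated sample from N(mu=5, sigma=2)

def normal_loglik(mu, x, sigma2=4.0):                  # normal log-likelihood in mu, sigma^2 held fixed
    n = len(x)
    return -0.5 * n * np.log(2 * np.pi * sigma2) - np.sum((x - mu) ** 2) / (2 * sigma2)

mu_hat = minimize_scalar(lambda m: -normal_loglik(m, x), bounds=(0.0, 10.0), method="bounded").x
print(mu_hat, x.mean())                                # numeric MLE agrees with the sample mean

def binom_loglik(p, k, n):                             # binomial log-likelihood up to an additive constant
    return k * np.log(p) + (n - k) * np.log(1 - p)

k, n = 9, 12
p_hat = minimize_scalar(lambda p: -binom_loglik(p, k, n),
                        bounds=(1e-6, 1 - 1e-6), method="bounded").x
print(p_hat, k / n)                                    # both close to 0.75
```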

Illustrative Examples

Sampling Without Replacement Example

Consider an urn containing a large number N of balls, where the proportion of red balls is \theta and the proportion of black balls is 1 - \theta, with \theta unknown. Two different sampling procedures are possible: one involves drawing two balls with replacement, while the other involves drawing two balls without replacement. In both cases, suppose the observed outcome is the ordered sequence red followed by black (denoted RB). For sampling with replacement, the probability of observing RB is \theta \cdot (1 - \theta), so the likelihood function is L(\theta \mid \text{RB}) = \theta (1 - \theta). For sampling without replacement, the probability of observing RB is \frac{\theta N}{N} \cdot \frac{(1 - \theta) N}{N - 1} = \theta (1 - \theta) \cdot \frac{N}{N - 1}. The likelihood function is therefore L(\theta \mid \text{RB}) \propto \theta (1 - \theta), identical in shape to the with-replacement case (differing only by a constant factor that does not affect relative likelihoods or inferences). According to the likelihood principle, all statistical evidence about \theta from the observed outcome RB is contained in this likelihood function, which is the same under both sampling procedures. Consequently, any posterior distribution for \theta (under a given prior) or likelihood-based inference, such as a maximum likelihood estimate or likelihood interval, must be identical regardless of whether the sampling was with or without replacement, disregarding differences in the "stopping rule" or sampling procedure. In contrast, frequentist methods typically condition on the fixed sample size n = 2 and use the full sampling distribution, leading to different inferences: the binomial distribution for sampling with replacement versus the hypergeometric for sampling without replacement. For finite N, this results in slightly different confidence intervals for \theta, as the variance of the estimator differs between the procedures, thereby incorporating extraneous information about the sampling mechanism beyond the observed data. This demonstrates a violation of the likelihood principle by frequentist approaches.
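A short numerical check (Python with NumPy; the urn size N = 100 is an assumed value for illustration) confirms that the two likelihoods differ only by the constant factor N/(N-1) and therefore identify the same maximum likelihood estimate:

```python
import numpy as np

N = 100                                          # assumed urn size for illustration
theta = np.linspace(0.01, 0.99, 99)              # candidate proportions of red balls

L_with = theta * (1 - theta)                     # P(RB) with replacement
L_without = theta * (1 - theta) * N / (N - 1)    # P(RB) without replacement

# Constant ratio => proportional likelihoods => identical relative evidence about theta.
assert np.allclose(L_without / L_with, N / (N - 1))
print(theta[np.argmax(L_with)], theta[np.argmax(L_without)])   # same maximizer, 0.5
```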

Binomial Sampling Example

The likelihood principle is often illustrated through binomial sampling in scenarios involving optional stopping rules, where the evidential meaning of the observed data remains unchanged regardless of how the stopping decision was made. Consider independent Bernoulli trials, such as repeated coin flips with unknown success probability \theta, where success corresponds to heads. Two sampling schemes can yield the same observed data but differ in their stopping rules: one uses a fixed number of trials (binomial sampling), while the other employs sequential sampling that stops based on the outcomes observed (e.g., negative binomial sampling, stopping after a fixed number of failures).

A concrete example involves observing 9 heads and 3 tails. In the fixed-sample scheme, 12 flips are performed regardless of outcomes, following a binomial distribution X \sim \text{Binomial}(12, \theta). The likelihood function is L(\theta \mid x=9) = \binom{12}{9} \theta^9 (1-\theta)^3 \propto \theta^9 (1-\theta)^3. In the sequential scheme, flipping continues until the 3rd tail is observed (a negative binomial distribution for the number of trials until 3 failures, with failure probability 1-\theta), which happens to require 12 flips for this data. The likelihood is L(\theta \mid x=9) = \binom{11}{2} \theta^9 (1-\theta)^3 \propto \theta^9 (1-\theta)^3. The likelihood principle asserts that these proportional likelihoods provide identical evidence about \theta, so inferences should not depend on the stopping rule; the constant of proportionality carries no evidential value.

Under the likelihood principle, estimation and inference rely solely on the observed data. The maximum likelihood estimator (MLE) is \hat{\theta} = 9/12 = 0.75, and measures of uncertainty, such as intervals derived from the likelihood (e.g., via the likelihood ratio or Bayesian credible intervals with noninformative priors), ignore any hypothetical additional flips that might have occurred under a different stopping rule. This ensures that the inference conditions on the observed 9 heads and 3 tails without reference to unobserved possibilities. In contrast, frequentist approaches, which emphasize long-run error rates, produce inferences that vary with the stopping rule because they incorporate the full sampling distribution, including potential unobserved data. For testing H_0: \theta = 0.5 against H_1: \theta > 0.5, the p-value under fixed binomial sampling is P(X \geq 9 \mid \theta=0.5) \approx 0.073, failing to reject at \alpha = 0.05. Under negative binomial sampling, it is P(X \geq 9 \mid \theta=0.5) \approx 0.033, rejecting H_0. This discrepancy arises because optional stopping can alter error rates (e.g., inflating Type I error), but the likelihood principle dismisses such dependence as irrelevant to the evidential interpretation of the data.
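The contrast can be reproduced directly (Python with SciPy; the only assumptions beyond the numbers in the text are the scipy.stats parameterizations used): the two designs share the likelihood kernel \theta^9 (1-\theta)^3, yet their p-values for H_0: \theta = 0.5 differ.

```python
from scipy.stats import binom, nbinom

theta0, heads, tails = 0.5, 9, 3

# Fixed design: X ~ Binomial(12, theta); one-sided p-value P(X >= 9 | theta = 0.5).
p_fixed = binom.sf(heads - 1, heads + tails, theta0)
print(round(p_fixed, 3))        # ~0.073, not significant at alpha = 0.05

# Sequential design: flip until the 3rd tail; the number of heads ~ NegBinom(r=3, p=1-theta)
# (scipy counts "failures" before the r-th "success", here heads before the 3rd tail).
p_seq = nbinom.sf(heads - 1, tails, 1 - theta0)
print(round(p_seq, 3))          # ~0.033, significant at alpha = 0.05

# The likelihood kernels theta^9 (1 - theta)^3 coincide, so the likelihood principle
# treats the evidence as identical despite the differing p-values.
```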

Foundational Principles

The Law of Likelihood

The Law of Likelihood, a term introduced by Ian Hacking (1965) and later developed by A. W. F. Edwards and Richard Royall, articulates a fundamental aspect of how observed data serve as evidence for comparing statistical hypotheses within a specified model. It posits that, given a statistical model, a particular body of data supports one hypothesis over another to the degree that the likelihood of the first hypothesis given the data exceeds that of the second. This principle, closely related to the broader likelihood principle, emphasizes relative evidential support without invoking prior beliefs or long-run frequencies. Formally, if the likelihood function evaluated at \theta_1 given x, denoted L(\theta_1 \mid x), exceeds L(\theta_2 \mid x), then x provides evidence in favor of \theta_1 over \theta_2 by the factor L(\theta_1 \mid x) / L(\theta_2 \mid x); the reverse holds if the inequality is reversed. The mathematical expression for this comparative measure is the likelihood ratio:

\Lambda(x; \theta_1, \theta_2) = \frac{L(\theta_1 \mid x)}{L(\theta_2 \mid x)}

This quantifies the strength of evidence: \Lambda > 1 indicates support for \theta_1, \Lambda = 1 suggests equivalence, and \Lambda < 1 favors \theta_2. Under equal prior probabilities for the two hypotheses, the likelihood ratio corresponds directly to the posterior odds, linking it to Bayesian updating while remaining independent of any specific prior.

The implications of the Law of Likelihood extend to establishing a coherent scale for evidential strength in statistical inference. It forms the cornerstone of likelihood-based hypothesis comparison, where ratios guide judgments about competing models and parameter values by focusing solely on how well the hypotheses account for the observed data. For instance, in forensic science and genetics, likelihood ratios are routinely used to weigh evidence, such as the probability of a match under competing source profiles. Despite its utility, the Law of Likelihood has inherent limitations, as it delivers only relative measures of support between hypotheses and does not yield absolute probabilities or error rates for individual claims. It requires a fully specified statistical model and cannot address issues like model adequacy or the overall plausibility of a hypothesis in isolation.
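As a hedged illustration of the likelihood ratio (Python with SciPy; the data of 9 successes in 12 trials and the two hypothesized parameter values are chosen only for concreteness), the ratio quantifies how strongly the observations favor one parameter value over another:

```python
from scipy.stats import binom

k, n = 9, 12                          # observed successes and trials (illustrative)
L1 = binom.pmf(k, n, 0.75)            # likelihood at theta_1 = 0.75
L2 = binom.pmf(k, n, 0.50)            # likelihood at theta_2 = 0.50

LR = L1 / L2
print(round(LR, 2))                   # ~4.8: the data favor theta = 0.75 over theta = 0.5 by this factor
```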

Birnbaum's Sufficiency and Conditionality

In 1962, Allan Birnbaum demonstrated that the likelihood principle can be derived as a consequence of two foundational principles in statistical inference: the sufficiency principle and the conditionality principle. This result, often referred to as Birnbaum's theorem, posits that if inferences adhere to both the sufficiency and conditionality principles, they must also satisfy the likelihood principle, which asserts that the evidential meaning of data is fully captured by the likelihood function. Birnbaum's argument aimed to provide a rigorous foundation for the likelihood principle by reducing it to these arguably more intuitive principles, thereby bridging frequentist and Bayesian perspectives on evidence.

The sufficiency principle, as formulated by Birnbaum, states that the evidential import of an experimental outcome depends only on a sufficient statistic for the parameter of interest, rather than the full dataset. Formally, if t = t(x) is a sufficient statistic for an experiment E with observed outcome x, and E' is the derived experiment with outcomes given by t, then the evidence from E given x equals the evidence from E' given t, denoted Ev(E, x) = Ev(E', t). This principle, building on Ronald Fisher's earlier work on sufficiency, ensures that details in the data irrelevant to the parameter do not influence inferences. For instance, in estimating a binomial probability, the principle justifies basing conclusions solely on the number of successes, ignoring the specific sequence of trials, as the count is sufficient.

Complementing sufficiency, Birnbaum's conditionality principle emphasizes that inferences should be conditioned on the component experiment actually performed, disregarding hypothetical alternatives. In a mixture experiment E comprising subexperiments E_h with known mixing probabilities, if outcome (E_h, x_h) is observed, the evidence equals that from performing E_h alone: Ev(E, (E_h, x_h)) = Ev(E_h, x_h). This counters the inclusion of sampling-plan details that might dilute evidential meaning, such as in optional stopping scenarios where the decision to continue sampling is ancillary. Birnbaum illustrated this with a hypothetical setup involving two instruments selected with probabilities 0.73 and 0.27; the evidence from using one instrument matches what it would provide in isolation, irrespective of the selection process.

Birnbaum's proof that these principles entail the likelihood principle proceeds by constructing a hypothetical mixture experiment. Suppose two experiments E and E' yield outcomes x and y with proportional likelihood functions, f(x \mid \theta) = c \cdot g(y \mid \theta) for some constant c and all \theta. Birnbaum constructs a mixture experiment E^* with E and E' as components, each selected with probability 1/2. Within E^*, the outcomes (E, x) and (E', y) have proportional likelihoods, so a sufficient statistic maps them to the same value, and the sufficiency principle equates their evidential meaning. By the conditionality principle, the evidence from each observed component equals the evidence from that component performed alone. Thus, Ev(E, x) = Ev(E^*, (E, x)) = Ev(E^*, (E', y)) = Ev(E', y), establishing the likelihood principle. This derivation highlights the principles' role in eliminating non-likelihood-based considerations, though it relies on idealized experimental frames.
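A compact way to see the structure of the argument is the chain of equalities below (a LaTeX sketch in Birnbaum's Ev notation, with E^* denoting the equal-probability mixture of E and E'):

```latex
% Chain of equalities in Birnbaum's argument; E^* is the 50-50 mixture of E and E'.
\begin{align*}
\mathrm{Ev}(E, x)
  &= \mathrm{Ev}\bigl(E^*, (E, x)\bigr)   && \text{conditionality} \\
  &= \mathrm{Ev}\bigl(E^*, (E', y)\bigr)  && \text{sufficiency, since } f(x \mid \theta) = c\, g(y \mid \theta) \\
  &= \mathrm{Ev}(E', y)                   && \text{conditionality}
\end{align*}
```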

Historical Context

Early Developments

The roots of likelihood-based inference trace back to the late 18th and early 19th centuries, when the concept of likelihood as a measure of evidential support emerged in applications to astronomical observation. Pierre-Simon Laplace, in his work on error theory and planetary perturbations, employed what would later be recognized as likelihood ratios to assess the relative plausibility of different hypotheses given observational data, such as estimating the masses of planets from their gravitational effects. Historical analyses of this period highlight how these early uses in astronomy treated likelihood as a direct comparator of hypotheses without invoking prior probabilities, marking an implicit shift toward evidential reasoning over full Bayesian inversion.

In the 1920s, R. A. Fisher advanced these ideas into a systematic framework, introducing maximum likelihood estimation (MLE) as a method for parameter estimation that maximizes the likelihood function, the probability of the observed data viewed as a function of the unknown parameters. In his seminal 1922 paper, Fisher argued that MLE provides consistent and efficient estimates, emphasizing the likelihood function's role in capturing all relevant information from the data about the parameters, distinct from long-run frequency interpretations of probability. Fisher explicitly rejected the Bayesian approach of inverse probability, which he viewed as mathematically flawed due to its reliance on arbitrary priors and integration over parameters, instead advocating likelihood as the foundation for inductive inference. This rejection facilitated a broader shift in statistical practice away from Bayesian methods toward likelihood-centered techniques, influencing fields like genetics and experimental design where Fisher applied MLE to real data problems.

Fisher further developed these ideas in the 1930s through fiducial inference, a method that derives probability statements about parameters directly from the data by inverting likelihood-based pivotal quantities, without priors. In his 1933 work, Fisher presented fiducial inference as a way to quantify uncertainty in parameters using the likelihood's pivotal properties, such as in estimating a normal mean from sample data, positioning it as an alternative to both frequentist error rates and Bayesian posteriors. The 1930s also saw the Neyman-Pearson framework incorporate likelihood ratios into hypothesis testing, formalizing the most powerful tests for simple hypotheses via the ratio of likelihoods under alternative models. In their 1933 lemma, Neyman and Pearson demonstrated that rejecting the null when the likelihood ratio exceeds a threshold minimizes the Type II error rate for a fixed Type I error rate, but this approach remained anchored in frequentist control of long-run error probabilities rather than pure evidential support from the data. This integration of likelihood ratios into testing procedures bridged Fisher's estimation ideas with decision-theoretic inference, though it highlighted tensions between likelihood as evidence and frequentist conditioning on hypothetical repetitions.

Birnbaum's Formalization

In 1962, Allan Birnbaum published his seminal paper "On the Foundations of Statistical Inference" in the Journal of the American Statistical Association, where he formally articulated the likelihood principle as a cornerstone of statistical evidential reasoning. Birnbaum defined the principle as the assertion that the evidential import of an experimental outcome for inference about unknown parameters is fully described by the likelihood function derived from that outcome, irrespective of the broader sampling model or reference set. This formalization positioned the principle as a normative standard for inference, emphasizing that inferences should depend solely on the relative likelihoods of parameters given the observed data.

Birnbaum's key innovation was a rigorous proof demonstrating that the likelihood principle logically follows from the conjunction of two established principles: the sufficiency principle and the conditionality principle. The sufficiency principle, originally developed by Ronald A. Fisher, posits that the evidential meaning of an outcome is unaltered when the data are summarized by a sufficient statistic, as this captures all relevant information about the parameters. Building on this, the conditionality principle states that in a mixture of experiments, the evidential meaning of an observed outcome from one component should be assessed conditionally on that component having occurred, disregarding the unperformed alternatives. Birnbaum showed that these two principles together entail the likelihood principle by constructing a hypothetical mixture-experiment framework, wherein outcomes yielding proportional likelihood functions must carry identical evidential weight, thereby resolving potential inconsistencies in procedures that violate this entailment.

Upon publication, Birnbaum's work received immediate endorsement from prominent statisticians, including Leonard J. Savage, who described the argument as a "landmark in statistics" and suggested it could pave the way for broader acceptance of Bayesian methods. However, it also ignited debates concerning the validity and applicability of the underlying axioms, particularly among frequentists who questioned whether the sufficiency and conditionality principles were universally defensible in all inferential contexts. Birnbaum's formalization thus not only unified disparate foundational ideas but also highlighted tensions between likelihood-based and frequentist paradigms, influencing subsequent discussions on the philosophy of statistical evidence.

Debates and Applications

Arguments in Favor

The likelihood principle asserts that the evidential import of observed data for inferences about parameters is encapsulated entirely within the likelihood function, rendering extraneous details about the sampling process, such as potential unobserved outcomes, evidentially irrelevant. This evidential coherence ensures that inferences depend only on the observed data's fit to the model, avoiding distortions from hypothetical scenarios that did not occur. Birnbaum formalized this by defining the "evidential meaning" of an outcome as depending solely on the likelihoods assigned to different parameter values, emphasizing that features of the sampling model beyond the likelihood contribute nothing to the evidence.

This approach aligns closely with empirical scientific practice, particularly in experimental sciences such as physics and genetics, where interpretations prioritize the relative support the data provide for competing hypotheses over ancillary sampling considerations. Edwards highlighted how likelihood-based assessments mirror the intuitive evidential evaluations scientists perform, such as comparing how well each hypothesis explains the observed phenomena without invoking improbable alternatives from the broader experimental frame. By focusing on fit via the ratio of likelihoods, the approach facilitates objective comparisons that resonate with replicable scientific reasoning.

The likelihood principle is inherently compatible with Bayesian inference, as the posterior distribution is directly proportional to the likelihood multiplied by the prior; when employing a uniform prior, the posterior simplifies to being proportional to the likelihood alone, thereby determining inferences without additional adjustments. This integration avoids the arbitrary error-rate calibrations often required in frequentist methods, allowing for coherent updating based purely on observed data. Berger and Wolpert further underscore this synergy, noting that likelihoodist procedures can serve as building blocks for Bayesian analyses while preserving evidential focus.

Practically, adherence to the likelihood principle streamlines statistical analysis by eliminating dependence on stopping rules in sequential testing, thereby resolving paradoxes where identical terminal data yield divergent inferences merely because of differing unobserved paths. For instance, in ongoing experiments, it permits flexible data accumulation without altering evidential conclusions upon stopping, promoting efficiency and reducing methodological constraints in adaptive designs. This benefit is particularly valuable in resource-constrained settings like clinical trials, where it supports consistent inference across varied sampling trajectories.

Criticisms from Frequentist Perspectives

Frequentist statisticians criticize the likelihood principle for disregarding the sampling distribution of the data, which is essential for controlling long-run error rates such as Type I and Type II errors in repeated applications of a procedure. By focusing solely on the likelihood function for the observed data, the principle fails to account for the probabilities of other possible outcomes, thereby undermining guarantees of reliable performance across hypothetical repetitions of the experiment. This omission leads to procedures that may appear evidentially equivalent under the principle but lack controlled error properties, conflicting with the frequentist emphasis on objective performance criteria.

David A. S. Fraser's structural inference framework further critiques the likelihood principle by advocating for inference based on the full experimental frame, including ancillary statistics and the broader model structure, rather than the likelihood alone. Fraser and collaborators argue that the principle suppresses relevant information about the experiment's design and variability, such as that carried by ancillary statistics, which is crucial for accurate inference. This approach highlights how the likelihood principle's narrow focus can lead to paradoxical or inadequate conclusions in settings where the complete data-generating process must inform the analysis.

Deborah Mayo's severity principle offers another key objection, positing that valid inferences require claims to pass severe tests capable of detecting errors with high probability, a requirement unmet by likelihood-based methods. Unlike frequentist procedures, which evaluate evidence through error probabilities tied to the sampling distribution, the likelihood principle does not ensure such severity, potentially allowing weakly supported claims to be accepted without scrutiny of their robustness against alternatives. A specific concern arises with confidence intervals, where the principle would equate inferences from procedures yielding identical likelihoods for the observed data, even if their coverage probabilities differ substantially, thus ignoring frequentist standards for consistent long-run performance.

Thought Experiments and Case Studies

One prominent thought experiment illustrating tensions with the likelihood principle is the voltmeter scenario, originally articulated by J. W. Pratt to highlight differences in evidential interpretation between likelihood-based and frequentist approaches. In this example, an engineer measures the plate voltages of a sample of tubes, obtaining values from 75 to 99 volts with a mean of 87 and standard deviation of 4, using a voltmeter accurate to ±0.01 volt but with a maximum reading of 100 volts. A high-range voltmeter capable of accurately reading over 100 volts was broken and unavailable during the experiment. The likelihood function for the observed voltages is identical to what it would be if the high-range voltmeter had been used, implying equivalent evidence about the true voltage under the likelihood principle. However, frequentists argue that the limited-range instrument provides weaker evidence against substantially higher voltages, as it could not have detected deviations above 100 volts reliably, whereas the high-range meter's availability would have allowed for broader coverage and better error control.

Experimental design arguments further critique the principle by demonstrating how sampling procedures, such as optional stopping or truncation, can alter frequentist error properties without affecting the likelihood function, potentially leading to what frequentists regard as misleading inferences. A classic illustration is the stopping rule paradox. Consider testing a coin for fairness (p = 0.5). In a fixed sample of n = 10 flips, observing 10 heads gives a one-sided p-value of approximately 0.001, reflecting the improbability of such data under the fixed design. In contrast, consider a sequential design in which sampling stops upon observing a specific pattern or count that yields terminal data with the same likelihood, as in the binomial versus negative binomial case below; the p-value can differ markedly because of the stopping rule, even though the LP deems the evidence equivalent. This paradox, recognized early in discussions of inductive probability, highlights how the principle sets aside design-specific error control.

A related case involves interval estimation for a success probability under truncated versus fixed sampling, where the likelihood principle equates evidential meaning but frequentists identify sampling-induced bias. For instance, suppose the data consist of 12 successes and 6 failures: under fixed-sample sampling (n = 18 trials), a standard 95% confidence interval for the success probability p is approximately [0.45, 0.88] using the normal approximation, reflecting unbiased coverage properties. If instead sampling stops after the 12th success (a negative binomial design, with 6 failures observed en route), the likelihood remains proportional to p^{12}(1-p)^6, suggesting the same interval per the likelihood principle. Frequentists, however, view the truncated procedure as biased toward higher estimates of p, requiring adjusted intervals (e.g., wider or shifted) to maintain nominal coverage, as the stopping rule inflates the chance of overestimating p. This discrepancy underscores how the principle overlooks design-specific error control, potentially yielding unreliable inferences in practice.

Attempts to resolve these issues often invoke the conditionality principle, which advocates basing inferences solely on the observed data while conditioning on ancillary aspects of the design, such as the realized sample size or instrument choice, thereby aligning with likelihood-based reasoning. In the voltmeter case, for example, conditioning on the reading falling within the meter's operational range would equate the two scenarios, avoiding reliance on hypothetical unobservable outcomes. Birnbaum formalized this approach, showing its equivalence to the likelihood principle when combined with sufficiency.
However, such resolutions remain debated, as frequentists argue that conditionality discards essential information about experimental error rates and design integrity, exacerbating rather than mitigating the principle's incompatibilities with long-run frequency guarantees.
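The two frequentist quantities quoted above can be checked in a few lines (Python, standard library only; the Wald normal-approximation interval is used exactly as described in the text):

```python
import math

# Fixed-n case: 12 successes in 18 trials, 95% Wald (normal-approximation) interval for p.
k, n = 12, 18
p_hat = k / n
se = math.sqrt(p_hat * (1 - p_hat) / n)
print(round(p_hat - 1.96 * se, 2), round(p_hat + 1.96 * se, 2))   # ~0.45, 0.88

# Stopping-rule paradox reference point: 10 heads in 10 fixed flips of a fair coin.
print(round(0.5 ** 10, 4))                                        # ~0.001 one-sided p-value
```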

Modern Interpretations

Bayesian Perspectives

The likelihood principle finds a natural alignment with Bayesian inference through Bayes' theorem, which updates prior beliefs about parameters using observed data via the posterior distribution: P(\theta \mid x) \propto P(x \mid \theta) \pi(\theta), where P(x \mid \theta) is the likelihood and \pi(\theta) is the prior. With a fixed prior, the likelihood fully captures the evidential contribution of the data to inferences about \theta, ensuring that all relevant information is incorporated without extraneous considerations from the sampling process. This core mechanism embodies the principle by rendering inferences insensitive to aspects of the experiment that do not alter the likelihood, such as sample size or ancillary statistics.

Preceding Birnbaum's formalization, Harold Jeffreys championed the use of likelihood ratios as objective measures of evidential support in Bayesian hypothesis testing. In his influential Theory of Probability, Jeffreys argued that the ratio of likelihoods under competing hypotheses quantifies the relative degree of support provided by the data, advocating this approach over frequentist error-based criteria to avoid inconsistencies in evidence assessment. This advocacy positioned Bayes factors as central to Bayesian hypothesis testing, emphasizing their role in scaling posterior odds directly from prior odds.

A key advantage of the likelihood principle within Bayesian frameworks is its resolution of frequentist paradoxes, such as those arising from optional stopping rules. By conditioning inferences solely on the observed data's likelihood, ignoring the stopping mechanism, Bayesian methods maintain consistent posteriors regardless of whether data collection was halted early because of interim results or continued to a fixed sample size, thereby eliminating artificial discrepancies in evidential strength. However, adherence to the likelihood principle in Bayesian inference introduces limitations tied to prior specification, as the principle itself provides no guidance on selecting \pi(\theta). Different priors can yield divergent posteriors even for identical likelihoods, potentially undermining the objectivity of inferences if subjective or misspecified priors are employed, though objective choices like Jeffreys priors aim to mitigate this by ensuring invariance.
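A minimal conjugate-prior sketch (Python with SciPy; the Beta(1, 1) uniform prior and the 9 heads / 3 tails data are assumptions carried over from the earlier binomial example) shows why the posterior is untouched by the stopping rule: both designs contribute the same likelihood kernel \theta^9 (1-\theta)^3.

```python
from scipy.stats import beta

a, b = 1, 1                        # uniform Beta(1, 1) prior
heads, tails = 9, 3                # observed data, regardless of how sampling stopped

# Fixed-n binomial and stop-at-3rd-tail negative binomial designs share the kernel
# theta**heads * (1 - theta)**tails, so both yield the same Beta posterior.
posterior = beta(a + heads, b + tails)
print(round(posterior.mean(), 3))        # ~0.714, identical under either stopping rule
print(posterior.interval(0.95))          # 95% credible interval, also stopping-rule invariant
```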

Connections to Other Inference Paradigms

The likelihood principle stands in sharp contrast to frequentist paradigms, which rely on long-run error probabilities derived from the full sampling distribution, including outcomes not observed in the data. According to the principle, such methods, including p-values and confidence intervals, are invalid as measures of evidence because they incorporate information about hypothetical non-observed data, thereby violating the evidential basis limited to the observed likelihood. This tension arises because frequentist procedures evaluate performance over repeated sampling, which the likelihood principle deems irrelevant to the evidential meaning of a specific outcome.

Fiducial inference, introduced by R. A. Fisher as an alternative to Bayesian inverse-probability methods, initially drew on likelihood considerations by using pivotal quantities to derive probability statements about parameters directly from the data. However, Fisher's later refinements diverged from a strict likelihood basis by emphasizing conditional properties and recognizable subsets, leading to inconsistencies that limited its general applicability. In cases where fiducial arguments rely on sufficient statistics or pivotal quantities aligned with the observed likelihood, the likelihood principle subsumes these methods, treating them as special instances of likelihood-based inference.

Structural inference, developed by D. A. S. Fraser, extends beyond the likelihood principle by incorporating ancillary statistics and group-theoretic structures from the model to resolve paradoxes like marginalization issues, providing a more comprehensive framework for handling model structure. Unlike the likelihood principle, which effectively sets aside such structural details to focus solely on the likelihood function, structural approaches retain them to derive inferential bases that account for observable implications of the model. Despite these differences, the likelihood principle, fiducial inference, and structural inference share an emphasis on the observed data over uninformative aspects of the broader model or prior assumptions, aiming to base conclusions on what was actually observed rather than on hypothetical repetitions. They diverge, however, in their treatment of uncertainty: the likelihood principle confines it to relative evidential support via the likelihood ratio, while fiducial and structural methods introduce additional probabilistic or geometric interpretations to quantify absolute uncertainty.

Contemporary Applications and Extensions

In evidence-based medicine, the likelihood principle informs the application of likelihood ratios to evaluate diagnostic tests, enabling clinicians to update prior probabilities based solely on the observed result's evidential content. Post-2000 guidelines, such as those from the Centre for Evidence-Based Medicine, recommend likelihood ratios to quantify how test results modify disease probability, combining pre-test probabilities with test characteristics while adhering to the principle by conditioning inference on the observed result rather than ancillary sampling details. This approach gained prominence in clinical practice following the 2007 rationalization of likelihood ratios in medical education, which emphasized their role in avoiding frequentist biases like error-rate distortions. In meta-analyses, likelihood-based methods further apply the principle to synthesize evidence from multiple studies, focusing on the relative support provided by the combined likelihoods to mitigate sampling biases inherent in fixed stopping rules or selective reporting.

In machine learning, objective Bayes methods leverage the likelihood principle for model selection, integrating non-informative priors with the observed data's likelihood to balance fit and complexity. The Bayesian information criterion (BIC), for instance, approximates the marginal likelihood under objective priors and is routinely applied in algorithms for tasks like clustering and regression, penalizing complexity by incorporating the maximized log-likelihood and sample size. This aligns with the principle's emphasis on data-dependent evidence, as BIC selects models that maximize approximate marginal likelihood without extraneous frequentist considerations, a practice solidified in contemporary reviews of information criteria for high-dimensional data.

Extensions of the likelihood principle, as formalized by Royall in 1997, generalize its application to non-independent and identically distributed (non-i.i.d.) data through the law of likelihood, which measures evidential support via relative likelihood ratios regardless of distributional assumptions. Royall's paradigm extends beyond i.i.d. settings by defining evidence through the likelihood function's shape, applicable to sequential or heterogeneous observations, and has influenced modern evidential statistics that prioritize this conditional framework. In genomics, these extensions manifest in variant-calling pipelines, where likelihood-based frameworks compute genotype probabilities from sequencing reads treated as non-i.i.d. observations; the Genome Analysis Toolkit (GATK), for example, uses likelihood-based models, including genotype likelihoods, to call variants such as single nucleotide polymorphisms (SNPs), accounting for read mapping through local re-assembly.

Recent developments in the 2020s have applied the likelihood principle to out-of-distribution (OOD) detection in machine learning systems, where the likelihood path, tracking probability density along data trajectories, guides detection from partial or anomalous inputs, aligning with the principle's focus on observed evidence to enhance model robustness. This addresses ethical concerns in automated decision-making by ensuring inferences from incomplete datasets avoid unsubstantiated extrapolations, promoting fairer outcomes in applications like autonomous systems.
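For concreteness, a small BIC comparison is sketched below (Python with NumPy; the maximized log-likelihoods, parameter counts, and sample size are invented illustrative numbers, not results from any real model):

```python
import numpy as np

def bic(max_loglik, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L_hat); lower is preferred."""
    return n_params * np.log(n_obs) - 2.0 * max_loglik

n = 200                                               # assumed number of observations
print(bic(max_loglik=-310.0, n_params=2, n_obs=n))    # simpler model
print(bic(max_loglik=-305.0, n_params=6, n_obs=n))    # richer model; extra parameters must earn their keep
```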
