
Design effect

The design effect (often abbreviated as DEFF or deff) is a key concept in statistics that measures the ratio of the variance of an estimator under a complex sampling design—such as cluster, stratified, or multi-stage sampling—to the variance that same estimator would have under simple random sampling (SRS) of equivalent size. This ratio quantifies the efficiency loss or gain from the sampling design, where values greater than 1 indicate inflated variance due to factors like intra-cluster correlation, typically requiring larger sample sizes to achieve the same precision as SRS. Introduced by statistician Leslie Kish in his 1965 book Survey Sampling, the concept has become fundamental for planning surveys, adjusting effective sample sizes, and estimating standard errors in complex designs. In practice, the design effect is calculated using formulas that incorporate design-specific elements, such as the intra-cluster correlation (ρ) and average cluster size (m or n). For cluster sampling, a common form is DEFF = 1 + (m - 1)ρ, where ρ represents the correlation among units within clusters, often leading to DEFF values between 1.1 and 3.0 in real-world applications. The effective sample size is then derived as the original sample size divided by DEFF, aiding researchers in sample size planning and variance correction for analyses like mean or proportion estimation. Beyond sampling, design effects also account for non-sampling errors, such as those from interviewer variability, and can be decomposed into components (e.g., sampling vs. measurement effects) for more nuanced survey evaluation. The importance of the design effect lies in its role in enhancing the accuracy of survey inferences, particularly in large-scale studies like national censuses or health trials, where ignoring it can lead to underestimated variances and overconfident conclusions. Extensions of Kish's original formulation, including combined design effects for multiple variance sources, continue to evolve in statistical methodology to address modern survey challenges like unequal selection probabilities and post-stratification.

Overview and Fundamentals

Introduction to Design Effect

The design effect, often denoted as deff, quantifies the relative efficiency of a complex sampling design in survey research by comparing the variance of an estimator under that design to the variance it would have under simple random sampling (SRS) of the same sample size. Specifically, it is defined as deff = Var(estimator under complex design) / Var(estimator under SRS), providing a measure of how the sampling strategy impacts precision. This concept, introduced by Leslie Kish in his foundational work on survey sampling, allows researchers to evaluate whether intricate designs—such as those involving multiple stages—enhance or diminish the reliability of estimates compared to straightforward SRS. In practice, the design effect is crucial for assessing efficiency losses or gains in large-scale surveys, where factors like clustering, stratification, and unequal selection probabilities are common to reduce costs or improve coverage. For instance, complex designs can lead to deff values greater than 1, indicating increased variance and reduced precision relative to SRS, or less than 1, signaling efficiency improvements. This adjustment is essential for accurate variance estimation and sample size planning, ensuring that survey results reflect true population characteristics without understating uncertainty. A classic conceptual example arises in cluster sampling, where the population is divided into groups (clusters) and entire clusters are selected rather than individual units randomly. Because elements within clusters tend to be more similar to each other than to those in other clusters—due to shared geographic or social factors—the intra-cluster correlation inflates the variance of the estimator, typically resulting in deff > 1 and a less precise estimate than SRS for the same sample size. Closely related, the effective sample size represents the equivalent SRS sample size that would yield the same precision, calculated as the actual sample size divided by deff.
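To make these two definitions concrete, here is a minimal R sketch using made-up variance values (the numbers are illustrative assumptions, not results from any survey):

```r
# Illustrative only: deff as a variance ratio, and the effective sample size.
var_complex <- 4.2   # assumed variance of the estimator under the complex design
var_srs     <- 2.8   # assumed variance under SRS of the same size
n <- 1000            # actual sample size

deff  <- var_complex / var_srs   # 1.5: the design inflates variance by 50%
n_eff <- n / deff                # ~667: the SRS size with equal precision
c(deff = deff, n_eff = n_eff)
```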

Core Definitions and Notations

In survey sampling, the finite population of size N is denoted U = \{1, 2, \dots, N\}, where y_i represents the value associated with unit i \in U. A sample s of fixed size n is selected from U according to a probability sampling design, with first-order inclusion probabilities \pi_i = \Pr(i \in s) for each unit i. The corresponding sampling weights are defined as w_i = 1 / \pi_i for units i \in s, ensuring unbiased estimation under the design. The Horvitz-Thompson (HT) estimator for the total Y = \sum_{i \in U} y_i is \hat{Y} = \sum_{i \in s} y_i w_i, which is unbiased since E(\hat{Y}) = Y. The exact design-based variance of \hat{Y} is given by V(\hat{Y}) = \frac{1}{2} \sum_{i \in U} \sum_{j \in U} \left( \pi_i \pi_j - \pi_{ij} \right) \left( \frac{y_i}{\pi_i} - \frac{y_j}{\pi_j} \right)^2, where \pi_{ij} = \Pr(i \in s, j \in s) denotes the second-order inclusion probabilities. For the mean \bar{Y} = Y / N, the HT estimator is \hat{\bar{Y}} = \hat{Y} / N, and its variance is V(\hat{\bar{Y}}) = V(\hat{Y}) / N^2. Under simple random sampling without replacement (SRSWOR) of size n, the variance of the corresponding estimator \hat{\bar{Y}}_{\text{SRS}} is V_{\text{SRS}}(\hat{\bar{Y}}) = \left(1 - \frac{n}{N}\right) \frac{S^2}{n}, where S^2 = \frac{1}{N-1} \sum_{i \in U} (y_i - \bar{Y})^2 is the population variance. Kish's design effect, denoted \text{Deff}, quantifies the efficiency loss or gain of the complex design relative to SRS for estimating parameters like the mean, defined as \text{Deff}(\hat{\bar{Y}}) = \frac{V(\hat{\bar{Y}})}{V_{\text{SRS}}(\hat{\bar{Y}})}. This measures how the sampling design and estimation procedure affect precision compared to SRS of equivalent size n. For designs with unequal inclusion probabilities (and thus unequal weights), Kish derived an approximation focusing on the weighting component, assuming the primary source of variance inflation is the variation in w_i. Normalizing the weights by their mean \bar{w} = n^{-1} \sum_{i \in s} w_i, the weight-based design effect for the mean is \text{Deff}_w = \frac{1}{n} \sum_{i \in s} \left( \frac{w_i}{\bar{w}} \right)^2 = 1 + \text{CV}^2(w), where \text{CV}^2(w) is the squared coefficient of variation of the weights. This formula arises from approximating V(\hat{\bar{Y}}) by ignoring joint inclusion terms in the HT variance (valid under sampling with replacement or when correlations are negligible) and assuming the population is large, yielding V(\hat{\bar{Y}}) \approx n^{-1} N^{-2} \sum_{i \in U} y_i^2 / \pi_i. This approximation holds under assumptions such as weights being uncorrelated with the study variable y_i and equal variances across strata in designs like stratified sampling with proportional allocation, where the variance inflation arises primarily from weight variability. The design effect on the standard error, denoted \text{Deft}, is the square root of \text{Deff}: \text{Deft} = \sqrt{\text{Deff}}, interpreted as the factor by which the standard error of the estimator under the complex design exceeds (or is reduced below) that under SRS; for example, \text{Deft} > 1 indicates inflated uncertainty due to the design. These approximations commonly assume a negligible finite population correction (i.e., n/N \approx 0, omitting the (1 - n/N) term) and independence among inclusion indicators (approximating without-replacement sampling as with-replacement for variance calculation). The effective sample size can be expressed as n_{\text{eff}} = n / \text{Deff}.
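The weight-based approximation is simple to compute directly. The sketch below, using an arbitrary hypothetical weight vector, verifies numerically that \text{Deff}_w equals 1 + \text{CV}^2(w) when the CV uses the population-style (divide-by-n) variance:

```r
# Kish's weight-based design effect for a hypothetical weight vector.
w <- c(1.0, 1.5, 2.0, 0.8, 1.2, 3.0, 1.1, 0.9)   # assumed sampling weights
n <- length(w)

deff_w <- n * sum(w^2) / sum(w)^2              # same as mean((w / mean(w))^2)
cv2_w  <- ((n - 1) / n) * var(w) / mean(w)^2   # CV^2 with divide-by-n variance
c(deff_w = deff_w, one_plus_cv2 = 1 + cv2_w)   # identical by algebra
```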

Effective Sample Size

The effective sample size, denoted n_{\text{eff}}, represents the size of a simple random sample (SRS) that would yield the same level of precision for an estimator as the actual sample of size n drawn under a complex sampling design. It is derived directly from the design effect (d_{\text{eff}}), defined as the ratio of the variance of the estimator under the complex design to its variance under an SRS of the same size n: d_{\text{eff}} = V_{\text{complex}}(\hat{\theta}) / V_{\text{SRS}}(\hat{\theta}). Since the variance under SRS is approximately proportional to 1/n (ignoring the finite population correction for large populations), V_{\text{SRS}}(\hat{\theta}) \approx \sigma^2 / n, where \sigma^2 is the population variance. Under the complex design, V_{\text{complex}}(\hat{\theta}) \approx d_{\text{eff}} \cdot \sigma^2 / n. To match this variance with an SRS of size n_{\text{eff}}, set \sigma^2 / n_{\text{eff}} = d_{\text{eff}} \cdot \sigma^2 / n, yielding n_{\text{eff}} = n / d_{\text{eff}}. This derivation highlights how n_{\text{eff}} quantifies sampling efficiency: when d_{\text{eff}} > 1, as often occurs in complex designs due to clustering or unequal probabilities, n_{\text{eff}} < n, meaning the sample provides less information than an SRS of equal size. For example, in a national health survey with n = 1000 and d_{\text{eff}} = 1.25 due to moderate clustering, n_{\text{eff}} = 800, so a 25% larger sample would be needed to match SRS precision. Conversely, d_{\text{eff}} < 1 (e.g., from stratification) increases n_{\text{eff}}, enhancing efficiency. In practice, n_{\text{eff}} guides sample size planning to achieve target precision, such as standard errors for means or proportions. For designs with unequal inclusion probabilities \pi_i (the probability that unit i is selected), the effective sample size accounts for variance inflation from weighting. The formula is n_{\text{eff}} = (\sum \pi_i)^2 / \sum \pi_i^2, where the sums are over the population; this approximates n / d_{\text{eff}} under the assumption that \sum \pi_i = n (the expected sample size). Step by step, the Horvitz-Thompson estimator for the population mean under unequal probabilities has an exact variance involving the terms \sum_{i} (1 - \pi_i) \frac{y_i^2}{\pi_i} + \sum_{i \neq j} (\pi_{ij} - \pi_i \pi_j) \frac{y_i y_j}{\pi_i \pi_j}, but approximating by ignoring joint inclusions and assuming a large population reduces the comparison with SRS (where all \pi_i = n/N, yielding d_{\text{eff}} = 1) to a factor driven by the dispersion of the \pi_i. Thus d_{\text{eff}} \approx n \sum \pi_i^2 / (\sum \pi_i)^2, leading to n_{\text{eff}} = (\sum \pi_i)^2 / \sum \pi_i^2 when normalizing for the expected \sum \pi_i = n. This is often estimated from the sample using weights w_i = 1/\pi_i, as n_{\text{eff}} = (\sum w_i)^2 / \sum w_i^2 with the weights normalized to sum to n. In cluster sampling, n_{\text{eff}} simplifies when focusing on the intra-class correlation \rho, the correlation among units within clusters. For equal cluster sizes n_c, n_{\text{eff}} = n / [1 + (n_c - 1)\rho], where \rho > 0 reduces n_{\text{eff}} by increasing within-cluster similarity. This formula arises from the variance of the cluster mean being (\sigma^2 / n_c)[1 + (n_c - 1)\rho], inflating the overall design variance relative to SRS. Limitations include the assumption of a uniform d_{\text{eff}} across variables, which rarely holds since \rho or weight variability differs by outcome (e.g., \rho might be 0.05 for a demographic characteristic but 0.20 for a behavioral one). Additionally, n_{\text{eff}} ignores finite population corrections and higher-order inclusion probabilities, potentially understating precision in small populations, and it requires accurate estimation of \pi_i or \rho, which can vary by stage.
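Both effective-sample-size formulas above translate directly into code; the following sketch uses hypothetical weights and assumed values of the cluster size and \rho:

```r
# Effective sample size from weights and from the cluster-sampling formula.
w <- c(1.0, 1.5, 2.0, 0.8, 1.2, 3.0, 1.1, 0.9)  # hypothetical weights
n_eff_w <- sum(w)^2 / sum(w^2)   # <= length(w); equality iff all weights equal

m   <- 20     # assumed units per cluster (n_c in the text)
rho <- 0.05   # assumed intra-class correlation
n   <- 1000   # actual sample size
n_eff_cluster <- n / (1 + (m - 1) * rho)   # 1000 / 1.95, about 513
c(n_eff_weights = n_eff_w, n_eff_cluster = n_eff_cluster)
```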

Historical Context

Origins in Survey Sampling

The need for efficient large-scale data collection during World War II significantly advanced survey sampling practices in the United States, particularly through the U.S. Census Bureau's development of probability-based methods to assess unemployment, labor force dynamics, and population characteristics on a national scale. Led by statisticians such as Morris Hansen, these efforts addressed the practical constraints of full enumeration by introducing multi-stage and clustered designs that balanced cost and coverage, marking the emergence of design considerations in the 1940s. By the 1950s, such techniques had become standard in ongoing national surveys like the Current Population Survey, initiated in 1940 and refined postwar to support labor force measurement. The intellectual roots of these advancements trace back to Jerzy Neyman's 1934 analysis of stratified sampling, which demonstrated its superior efficiency over simple random sampling by minimizing variance through the division of populations into homogeneous subgroups. Neyman illustrated that optimal allocation within strata—proportional to subgroup size and variability—could substantially reduce estimation errors, prefiguring later ideas about the impact of sampling structure on precision without explicitly quantifying a universal ratio. This work influenced early U.S. applications, where stratification helped mitigate the inefficiencies of geographically dispersed data collection during wartime exigencies. Post-World War II expansions at the U.S. Census Bureau further highlighted how practical designs, such as clustering and multi-stage sampling, often inflated variances relative to theoretical simple random benchmarks, prompting innovations in error assessment and sample optimization. In the 1950s literature, the term "design efficiency" emerged to capture this relative performance, referring to the ratio of variances under complex versus simple designs, as systematically explored in Hansen, Hurwitz, and Madow's comprehensive treatise on sample survey methods. These discussions underscored the trade-offs in real-world surveys, where clustering reduced logistical costs but increased sampling errors due to intracluster correlations. This foundational concept evolved into the more formalized "design effect" in subsequent decades.

Key Developments and Contributors

The formal introduction of the design effect (Deff) and its square root, the design effect on the standard error (Deft), occurred in Leslie Kish's seminal 1965 book Survey Sampling, where he emphasized their application to clustered designs to quantify the efficiency loss due to intra-cluster correlation compared to simple random sampling. Kish's framework built on earlier Neyman-era foundations in probability sampling but provided practical tools for survey practitioners to adjust sample sizes and variance estimates in complex designs. In the 1970s and 1980s, researchers extended Kish's concepts to multi-stage sampling designs, accounting for interactions between clustering, stratification, and unequal probabilities at multiple levels. Keith Rust and collaborators advanced these extensions by developing methods to decompose design effects in multi-stage surveys, enabling better planning for large-scale national assessments such as education surveys. Concurrently, work focused on stratification effects, refining Deff calculations to incorporate stratum-specific variances and improve precision in disproportionate stratified samples. The 1990s and 2000s saw further specialization, with Sharon L. Lohr's 1999 book Sampling: Design and Analysis deriving design effects tailored to regression slopes in cluster samples, highlighting how clustering inflates variances for slope estimators in linear models. In 2015, Kimberly Henry and Richard Valliant extended Kish's Deff to calibration weighting scenarios, proposing measures that capture the joint impact of unequal weights from calibration adjustments and non-response in single-stage samples, thus aiding variance inflation adjustments in post-stratified surveys. Post-2010 developments have integrated design effects with online and nonprobability surveys and with advanced statistical methods for enhanced estimation. Researchers have explored machine learning techniques to improve survey processes, including nonresponse modeling and weighting adjustments in complex designs. In the 2020s, Bayesian approaches using hierarchical models have been applied to survey inference, incorporating complex design features like clustering and weighting in multi-stage surveys.

Design Effects by Sampling Design

Baselines in Simple Random Sampling

Simple random sampling (SRS) is a fundamental probability sampling method in which every subset of n units from a finite population of size N has an equal chance of being selected, typically without replacement. Under SRS without replacement (SRSWOR), each unit in the population has an equal probability n/N of being included in the sample, ensuring no unit is selected more than once. The sample mean \bar{y}_{\text{SRS}} = \frac{1}{n} \sum_{i \in S} y_i serves as an unbiased estimator of the population mean \mu. Its variance, accounting for the finite population correction, is given by \text{Var}(\bar{y}_{\text{SRS}}) = \left(1 - \frac{n}{N}\right) \frac{S^2}{n}, where S^2 = \frac{1}{N-1} \sum_{i=1}^N (y_i - \mu)^2 is the population variance. This formula arises from the dependence introduced by sampling without replacement, which induces a slight negative correlation among selected units. To derive this variance, write the sample mean as \bar{y}_{\text{SRS}} = \frac{1}{n} \sum_{i=1}^n X_i, where the X_i are the observations drawn into the sample. The variance is \text{Var}(\bar{y}_{\text{SRS}}) = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \text{Cov}(X_i, X_j) = \frac{1}{n^2} \left( \sum_{i=1}^n \text{Var}(X_i) + \sum_{i \neq j} \text{Cov}(X_i, X_j) \right). Each \text{Var}(X_i) = \frac{N-1}{N} S^2, so the first sum is n \cdot \frac{N-1}{N} S^2. For the covariances, there are n(n-1) off-diagonal terms, and \text{Cov}(X_i, X_j) = -\frac{S^2}{N} for i \neq j, yielding n(n-1) \left( -\frac{S^2}{N} \right). Substituting and simplifying gives \text{Var}(\bar{y}_{\text{SRS}}) = \frac{S^2}{n} \left[ \frac{N-1}{N} - \frac{n-1}{N} \right] = \frac{S^2}{n} \cdot \frac{N - n}{N} = \left(1 - \frac{n}{N}\right) \frac{S^2}{n}. This derivation highlights the finite population correction factor (1 - n/N), which approaches 1 as N \to \infty. SRS serves as the baseline for design effects because it assumes no additional structure or correlations beyond the uniform selection probabilities and without-replacement dependence, resulting in a design effect (deff) of exactly 1 by definition. In complex sampling designs, the design effect is computed as the ratio of the variance under the complex design to this baseline variance, quantifying any efficiency loss or gain relative to SRS.
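The derived formula can be checked by Monte Carlo simulation on a synthetic population (the population below is an arbitrary assumption used only for the check):

```r
# Simulation check of Var(ybar) = (1 - n/N) * S^2 / n under SRSWOR.
set.seed(1)
N <- 500; n <- 50
y  <- rnorm(N, mean = 10, sd = 3)   # synthetic finite population
S2 <- var(y)                        # S^2 with the N - 1 divisor, as above

theoretical <- (1 - n / N) * S2 / n
ybars <- replicate(20000, mean(sample(y, n)))  # sample() is SRSWOR by default
c(theoretical = theoretical, empirical = var(ybars))  # values closely agree
```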

Unequal Selection Probabilities

Unequal selection probabilities occur in survey sampling when units are chosen with varying chances of inclusion, often to improve efficiency by oversampling rare or important subgroups. Common sources include probability proportional to size (PPS) sampling, where the inclusion probability \pi_i is proportional to an auxiliary variable like unit size; size-biased sampling that favors larger elements to reduce variance for size-related estimates; and post-stratification adjustments that reweight the sample to match known population distributions after initial selection. For the Horvitz-Thompson (HT) estimator, which unbiasedly estimates the population total as \hat{T} = \sum_{i \in s} \frac{y_i}{\pi_i}, where w_i = 1/\pi_i are the inverse probability weights, the design effect under unequal probabilities can be approximated as \mathrm{deff} \approx \frac{n \sum w_i^2 y_i^2}{(\sum w_i y_i)^2}. This approximation arises from the variance of the HT estimator relative to simple random sampling (SRS), where the variance inflation stems primarily from the heterogeneity in weights; under with-replacement sampling and small \pi_i, \mathrm{Var}(\hat{T}) \approx \sum_U w_i^2 y_i^2 (1 - \pi_i), and estimating this with sample data yields the numerator while the denominator reflects the squared estimated total. When weights vary haphazardly—independent of the study variable y—Kish derived an approximation for the design effect in ratio-mean estimation as \mathrm{deff} \approx 1 + \mathrm{CV}(w)^2, where \mathrm{CV}(w) is the coefficient of variation of the weights. This formula highlights the variance inflation from weight dispersion: for example, if weights vary due to post-stratification with \mathrm{CV}(w) = 0.2, then \mathrm{deff} \approx 1.04, a modest reduction in effective sample size compared to equal-probability sampling, where \mathrm{deff} = 1. The design effect can deviate from these weight-only approximations when y correlates with \pi_i. Positive correlation (e.g., in PPS where larger units with higher y values have greater \pi_i) can amplify variance inflation, often yielding \mathrm{deff} > 1 + \mathrm{CV}(w)^2; for instance, oversampling high-income households in an income survey can increase deff beyond the baseline weight effect. Conversely, negative correlation—such as undersampling high-y units like rare events in targeted strata—can reduce variance, resulting in \mathrm{deff} < 1, effectively boosting precision relative to SRS. Spencer (2000) provides an extended approximation \mathrm{deff} \approx (1 - r_{yP})^2 + 1 + \mathrm{CV}(w)^2 (under negligible residual correlations), where r_{yP} is the correlation between y and \pi, illustrating how negative r_{yP} mitigates weight-induced inflation.
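A small simulation illustrates how strongly the deff under PPS depends on the relationship between y and the selection probabilities. The sketch below—synthetic population, with-replacement PPS draws, and the Hansen-Hurwitz total estimator, all choices made purely for illustration—shows variance inflation when y is unrelated to size but a dramatic efficiency gain when y is nearly proportional to it:

```r
# Monte Carlo deff for a population total under PPS with replacement.
set.seed(42)
N <- 2000; n <- 100; reps <- 5000
size <- runif(N, 1, 10)
p    <- size / sum(size)            # per-draw selection probabilities

deff_total <- function(y) {
  hh  <- replicate(reps, { s <- sample(N, n, replace = TRUE, prob = p)
                           mean(y[s] / p[s]) })   # Hansen-Hurwitz total
  srs <- replicate(reps, N * mean(y[sample(N, n, replace = TRUE)]))
  var(hh) / var(srs)
}
y_unrelated    <- rnorm(N, 50, 10)           # no y-size link: deff > 1
y_proportional <- 5 * size + rnorm(N, 0, 1)  # y ~ size: deff far below 1
c(deff_unrelated    = deff_total(y_unrelated),
  deff_proportional = deff_total(y_proportional))
```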

Cluster Sampling

In cluster sampling, the population is divided into naturally occurring groups known as clusters, such as geographic areas, schools, or households, from which a random sample of these clusters is selected as primary sampling units. All elements within the selected clusters, or a random subsample of them, are then included in the survey. This approach is efficient for studying large, dispersed populations where obtaining a complete frame of individual elements is impractical but cluster-level frames are available. The design effect in cluster sampling quantifies the inflation in variance due to the dependence among elements within clusters, primarily driven by the intra-cluster correlation. Under a design-based framework for single-stage cluster sampling with equal cluster sizes, the design effect for the mean estimator is given by \text{deff} = 1 + (m - 1)\rho, where m is the number of elements per cluster and \rho is the intra-class correlation coefficient, which measures the proportion of total variance attributable to differences between clusters. This formula derives from the decomposition of total variance into between-cluster and within-cluster components, akin to a one-way random effects ANOVA model. The total population variance is \sigma^2 = \sigma_b^2 + \sigma_w^2, where \sigma_b^2 is the between-cluster variance and \sigma_w^2 is the within-cluster variance, with \rho = \sigma_b^2 / \sigma^2. The variance of the sample mean under simple random sampling is \sigma^2 / n. In cluster sampling with n = km (k clusters of m elements each), the variance becomes \frac{\sigma_b^2}{k} + \frac{\sigma_w^2}{n} = \frac{\sigma^2}{n} \left[1 + (m - 1)\rho \right], yielding the design effect as the ratio of this variance to the simple random sampling variance. This derivation assumes negligible finite population correction and equal cluster sizes, highlighting how positive \rho (due to cluster homogeneity) reduces the effective sample size. The intra-class correlation \rho is estimated from sample data using an ANOVA-like approach on the observed values. Compute the mean square between clusters (MSB) and the mean square within clusters (MSW) from the sample; then \hat{\rho} = \frac{\text{MSB} - \text{MSW}}{\text{MSB} + (m - 1) \text{MSW}}, a consistent estimator under the random effects model, with MSB capturing between-cluster variation and MSW the within-cluster variation. This method requires sufficient clusters for reliable estimation and assumes balanced cluster sizes. In social surveys, typical \rho values range from 0.01 to 0.1 for variables like demographic characteristics, behaviors (e.g., smoking status, physical activity), and attitudes, reflecting moderate intra-cluster similarity. For example, with m = 20, this range yields deff values of approximately 1.2 to 2.9, indicating that cluster sampling may require 20% to 190% more elements than simple random sampling to achieve equivalent precision. In practice, cluster sampling may also incorporate unequal selection probabilities across clusters, adding a separate layer to the overall design effect.
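The ANOVA-based estimator of \rho and the resulting deff can be computed as follows on simulated balanced clusters (the between- and within-cluster standard deviations are assumptions chosen so that the true \rho is 0.1):

```r
# Estimate rho from a one-way ANOVA, then deff = 1 + (m - 1) * rho_hat.
set.seed(7)
k <- 40; m <- 20                          # clusters and units per cluster
cluster <- factor(rep(1:k, each = m))
y <- rnorm(k, sd = 1)[cluster] + rnorm(k * m, sd = 3)  # true rho = 1/(1+9) = 0.1

tab <- anova(aov(y ~ cluster))
MSB <- tab["cluster", "Mean Sq"]          # between-cluster mean square
MSW <- tab["Residuals", "Mean Sq"]        # within-cluster mean square
rho_hat <- (MSB - MSW) / (MSB + (m - 1) * MSW)
c(rho_hat = rho_hat, deff = 1 + (m - 1) * rho_hat)   # deff near 2.9
```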

Stratified and Multi-Stage Designs

In stratified sampling, the population is divided into homogeneous subgroups or strata based on key characteristics, with independent simple random samples drawn from each stratum according to a specified allocation, such as proportional or optimal. This design reduces sampling variance compared to simple random sampling (SRS) by ensuring representation across subgroups and minimizing within-stratum variability, leading to a design effect (deff) typically less than 1. The deff for the mean under stratified sampling with finite population corrections is given by \text{deff} = \frac{\sum_h W_h^2 S_h^2 (1 - n_h/N_h)/n_h}{S^2 (1 - n/N)/n}, where W_h = N_h/N is the stratum weight, S_h^2 is the population variance within stratum h, n_h and N_h are the sample and population sizes in stratum h, n and N are the total sample and population sizes, and S^2 is the overall population variance, which decomposes into within-stratum and between-stratum components. This formula derives from the reduced variance of the stratified estimator relative to SRS, where the numerator captures stratum-specific contributions adjusted for allocation and finite corrections, while the denominator represents the SRS variance. For proportional allocation (n_h = n W_h) and negligible finite population corrections (large N), the deff simplifies to \sum_h W_h S_h^2 / S^2, which is less than 1 whenever the between-stratum variance is positive, quantifying the efficiency gain from stratification. Multi-stage designs extend stratified sampling by incorporating multiple levels of clustering, such as primary sampling units (PSUs) selected within strata, followed by subsampling of secondary sampling units (SSUs) or elements within PSUs. In a stratified two-stage design, the overall deff combines stratification, clustering, and weighting effects across stages, often approximated as a weighted average of stratum-specific deffs. The Chen-Rust formula provides a model-based extension for this, expressing the deff for the mean as \text{deff} = \sum_h W_h \cdot \text{deff}_h, where \text{deff}_h = \left[1 + (m_h - 1) \rho_h \right] \left[1 + \text{CV}_{w_h}^2 \right], with m_h as the average cluster size in stratum h, \rho_h as the intra-cluster correlation, and \text{CV}_{w_h} as the coefficient of variation of weights within stratum h. For three-stage designs, the formula generalizes to a product of stage-specific clustering effects multiplied by the weighting deff: \text{deff} = \left[1 + (m - 1) \rho_1 \right] \left[1 + (l - 1) \rho_2 \right] \left[1 + \text{CV}_w^2 \right], where m and l are average sample sizes at the first and second subsampling stages, \rho_1 and \rho_2 are the respective intra-cluster correlations, and \text{CV}_w is the overall weight variation; stratification is incorporated by averaging these across strata. This product form arises because variance inflation factors at each stage compound multiplicatively under independence assumptions, allowing decomposition of the total deff into stage contributions. Stratification interacts with multi-stage clustering by reducing the overall deff, as it groups similar PSUs into strata, lowering the effective intra-cluster correlation \rho_h compared to unstratified cluster sampling, where clustering alone inflates deff above 1. In multi-stage settings, the total deff approximates the product of stage-specific deffs (\text{deff}_\text{total} \approx \prod \text{deff}_\text{stage}), but stratification mitigates clustering inflation by enhancing homogeneity within selected units at early stages. For instance, in national health surveys, stratification by geography and demographics is combined with multi-stage PSU and household clustering, yielding deffs around 1.5–3 for key health indicators, lower than the 2–5 typical for unstratified clusters due to the variance reduction from allocation. Similarly, the Current Population Survey employs stratified multi-stage sampling with state-level strata and clustered households, enabling precise national and subdomain inferences.
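The three-stage product form is straightforward to apply at the planning stage; the sketch below wraps it in a function with hypothetical inputs:

```r
# Product approximation for a three-stage design, as given above:
# deff = [1 + (m - 1) rho1] [1 + (l - 1) rho2] [1 + CV_w^2].
deff_three_stage <- function(m, l, rho1, rho2, cv_w) {
  (1 + (m - 1) * rho1) * (1 + (l - 1) * rho2) * (1 + cv_w^2)
}
# Assumed planning values: 10 SSUs per PSU, 5 elements per SSU, modest ICCs.
deff_three_stage(m = 10, l = 5, rho1 = 0.05, rho2 = 0.02, cv_w = 0.3)
# 1.45 * 1.08 * 1.09, roughly 1.71
```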

Advanced Variations and Estimators

Model-Based vs. Design-Based Frameworks

In the design-based framework, the design effect (deff) is derived solely from the sampling mechanism, treating the finite population as fixed and focusing on the variability induced by the probability structure of the sample design. This approach, pioneered by Kish, defines deff as the ratio of the variance of an estimator under the complex design to its variance under simple random sampling (SRS) of the same size, deff = Var_complex(θ̂) / Var_SRS(θ̂). For instance, in cluster sampling, Kish's formula approximates deff = 1 + (b - 1)ρ, where b is the average cluster size and ρ is the intraclass correlation, emphasizing probability-derived inclusion probabilities without reliance on data models. In contrast, the model-based framework estimates deff from an assumed superpopulation model, viewing the population as a realization of random variables and incorporating auxiliary information to adjust variances. Here, deff arises from the data-generating process, such as regression or random effects models, where auxiliary covariates help predict unsampled units and reduce variance beyond design alone. For example, in a random effects model for cluster sampling, the model-based deff can be expressed as \text{deff} = 1 + (b^* - 1)\rho, where b^* is an effective cluster size adjusted for weights or pseudo-inclusion probabilities derived from the model, and ρ is estimated via methods like maximum likelihood. This approach uses pseudo-inclusion probabilities implicitly through model-based weights, allowing for more flexible variance adjustments in complex surveys. The key differences lie in their inferential foundations: design-based inference relies on the randomization from the sample design for unbiasedness, assuming a fixed population, while model-based inference depends on the correctness of the superpopulation model for consistency, treating the population as random and leveraging correlations for efficiency. Design-based deff ignores substantive data structure, potentially overestimating variance when useful auxiliaries are available, whereas model-based deff can borrow strength across units but risks bias if the model is misspecified. The 1980s marked a historical shift toward model-based methods for handling complex surveys, particularly in small area estimation, as seen in the seminal Fay-Herriot model, which integrated direct survey estimates with regression on covariates to improve precision for sparse domains.

Design Effects with Weighting and Calibration

Calibration weighting involves adjusting initial sampling weights to align with known population totals for auxiliary variables, thereby reducing bias from nonresponse, undercoverage, or frame imperfections while potentially improving estimator precision when auxiliaries correlate with the study variable. This process, often implemented via methods such as raking or generalized regression (GREG), modifies weights multiplicatively, introducing additional variability that inflates the design effect beyond that of the initial sampling design. A practical measure of the design effect attributable to calibration, as proposed by Henry and Valliant, approximates the variance inflation as the average squared ratio of calibrated weights to initial weights: \text{deff} = \frac{1}{n} \sum_{i=1}^n \left( \frac{w_i^{\text{cal}}}{w_i^{\text{init}}} \right)^2, where n is the sample size, w_i^{\text{cal}} are the calibrated weights, and w_i^{\text{init}} are the initial design-based weights. This metric captures the efficiency loss from the adjustment factors, assuming they average to 1, and is particularly useful for single-stage samples where calibration to multiple auxiliaries can substantially alter weight distributions. Post-stratification, a special case of calibration, adjusts weights within predefined population cells (e.g., age-sex categories) to match known marginal totals, often using raking procedures that iteratively scale weights across dimensions until convergence. The resulting design effect similarly reflects variance inflation from these adjustments, derived as the ratio of the calibrated estimator's variance to that under simple random sampling, incorporating the variability induced by cell-specific scaling factors. In practice, raking can yield design effects close to those of full calibration but may exhibit greater inflation if cross-classifications lead to extreme weight adjustments, as observed in household surveys where demographic margins are enforced. While post-stratification enhances representativeness for subpopulation estimates, its design effect is typically computed via the overall coefficient of variation of the final weights, emphasizing the trade-off between bias correction and increased sampling error. Haphazard weights, as conceptualized by Kish, extend the basic unequal probability framework to scenarios where weights are derived from estimated population ratios using auxiliary data, rather than direct selection probabilities. In such cases, the design effect approximates Kish's original formula for inefficient weighting but accounts for the correlation between the study variable and the auxiliaries used in ratio estimation: \text{deff} \approx 1 + b \cdot \text{CV}(w)^2, where \text{CV}(w) is the coefficient of variation of the weights, and b is the slope from the regression of the study variable on the auxiliary (or weight predictor). This extension highlights how well-chosen auxiliaries can mitigate variance inflation (if b < 1) compared to purely haphazard unequal weights, which assume no correlation and yield b = 1, resulting in \text{deff} = 1 + \text{CV}(w)^2. The approach is design-based and applies to post-sampling adjustments where initial equal-probability samples are reweighted using estimated totals.
These weighting methods offer clear benefits in bias reduction—calibration and post-stratification can substantially lower mean squared error by aligning estimates with benchmarks—but they often increase the design effect, particularly when auxiliaries show poor correlation with the target variable, leading to unstable weights and diminished effective sample size. For instance, if auxiliaries like age or education fail to predict the outcome (low R^2 in auxiliary models), the variance penalty can outweigh bias gains, sometimes doubling the design effect relative to unadjusted weights. Practitioners must evaluate this trade-off using simulation or historical data to ensure adjustments enhance overall efficiency.
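The calibration design-effect measure described above reduces to one line of code once the two weight vectors are in hand; the weights and adjustment factors below are hypothetical:

```r
# Average squared ratio of calibrated to initial weights.
w_init <- c(10, 12, 9, 15, 11, 8, 14, 10)              # assumed design weights
adj    <- c(1.1, 0.9, 1.3, 0.8, 1.0, 1.2, 0.7, 1.05)   # assumed calibration factors
w_cal  <- w_init * adj

deff_cal <- mean((w_cal / w_init)^2)   # exceeds 1 when the factors vary
deff_cal                               # about 1.05 here
```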

Specialized Estimators (e.g., Regression Slopes)

In cluster sampling designs, the design effect extends beyond simple means to more complex estimators such as regression slopes, where clustering induces correlations that alter the variance relative to simple random sampling (SRS). For ordinary least squares (OLS) estimation of a regression slope under a random intercept model, Sharon L. Lohr derived an approximation for the design effect as \text{deff} = \frac{1 + (m-1)\rho_\beta}{1 + (m-1)\rho_y}, where m is the cluster size, \rho_\beta is the intraclass correlation for the predictor variable, and \rho_y is the intraclass correlation for the response variable. This formula captures how clustering affects both the numerator (covariance between response and predictor) and the denominator (variance of the predictor) in the slope estimator, yielding deff values less than 1 when response clustering is stronger than predictor clustering, though typically deff exceeds 1 due to shared cluster effects. The derivation relies on sandwich variance estimators, which account for clustered errors by combining model-based expectations with design-based adjustments for intra-cluster dependence. Under assumptions of equal cluster sizes and a linear model with random intercepts, the variance of the slope estimator incorporates the covariance matrix of random effects, leading to the ratio form that adjusts the SRS variance for design-induced inflation or deflation. Generalized least squares can further refine this by weighting observations to mitigate clustering, often reducing deff compared to unweighted OLS. Design effects for other specialized estimators, such as totals or ratios in multi-stage sampling, follow similar principles but incorporate additional stages of clustering or stratification. For population totals in two-stage cluster designs, deff approximates the product of stage-specific effects, typically 1 + (m_1 - 1)\rho_1 for primary units times 1 + (m_2 - 1)\rho_2 for secondary units, where m_1 and m_2 are stage sizes and \rho_1 and \rho_2 are the respective intraclass correlations. Ratios, like prevalence rates, exhibit deff influenced by the correlation between numerator and denominator, often lower than for means if positive covariation offsets clustering variance. These specialized design effects find application in longitudinal surveys, where cluster designs (e.g., repeated measures within households or schools) lead to varying slopes across clusters, necessitating adjustments to assess trends in outcomes like health or education metrics over time. For instance, in studies tracking child development, the deff for slopes relating exposure to cognitive growth accounts for intra-family correlations, informing more precise inference on intervention effects.
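Under the ratio form quoted above for the OLS slope, the deff is easy to tabulate for plausible ICC pairs; the values below are assumptions chosen for illustration (writing rho_x for the predictor's ICC, denoted \rho_\beta in the text):

```r
# Slope design effect under the random intercept model, per the ratio above.
deff_slope <- function(m, rho_x, rho_y) {
  (1 + (m - 1) * rho_x) / (1 + (m - 1) * rho_y)
}
deff_slope(m = 20, rho_x = 0.15, rho_y = 0.05)  # 3.85 / 1.95, about 1.97
deff_slope(m = 20, rho_x = 0.02, rho_y = 0.10)  # below 1: response clustering dominates
```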

Applications and Practical Uses

Variance Estimation and Adjustment

In survey analysis, the design effect (deff) quantifies the impact of a complex sampling design on the variance of an estimator relative to simple random sampling (SRS), and it is essential for adjusting standard errors to reflect the true sampling variability. The adjusted standard error is computed as SE_{\text{adjusted}} = SE_{\text{naive}} \times \text{Deft}, where \text{Deft} = \sqrt{\text{deff}} and SE_{\text{naive}} is the standard error assuming SRS. This multiplier accounts for elements such as clustering, stratification, and unequal weighting that inflate or deflate variance. For instance, in cluster sampling, deff often exceeds 1, increasing the adjusted SE by the Deft factor. To incorporate this adjustment into confidence intervals, the adjusted SE is substituted into standard interval formulas, such as the normal approximation for large samples: \hat{\theta} \pm z_{\alpha/2} \times SE_{\text{adjusted}}, where \hat{\theta} is the point estimate and z_{\alpha/2} is the critical value (e.g., 1.96 for 95% confidence). For smaller samples or non-normal distributions, a t-distribution with degrees of freedom adjusted for the design may be used. This process ensures intervals capture the design-induced uncertainty, preventing overconfidence in estimates from complex surveys. In complex designs where analytical computation of deff is infeasible, replication methods provide empirical estimates by simulating the sampling process. The bootstrap method generates multiple resamples (typically 500–1,000) from the original data, preserving strata and clusters, then computes the estimator for each; the variance across resamples yields the complex variance, from which deff is obtained as the ratio to the SRS variance. Similarly, the jackknife method creates replicates by systematically omitting groups of primary sampling units (e.g., clusters) and recalculating the estimator, estimating variance from the spread of these pseudo-estimates. These techniques are particularly useful for nonlinear statistics in multi-stage designs. For nonlinear estimators, such as logistic regression coefficients, Taylor series linearization approximates the variance by expanding the estimator around population parameters using the first-order terms of a Taylor series. This linear approximation, \hat{\theta} \approx \theta + \sum (\partial f / \partial y_i) (y_i - \mu_i), allows application of design-based variance formulas to the linearized form, incorporating deff through the survey structure (e.g., weights and clusters). The method is implemented in software for proportions, ratios, and regression models, ensuring accurate SEs that reflect the design. Ignoring deff in variance estimation leads to systematic underestimation of SEs, resulting in confidence intervals that are too narrow and overstated statistical significance, which can produce misleading inferences about population parameters. This risk is heightened in clustered or weighted designs, where naive SRS assumptions fail to capture intra-cluster correlations or selection biases.
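The SE and confidence-interval adjustment is a one-line rescaling; the estimate, naive SE, and deff below are hypothetical inputs:

```r
# Deft-adjusted standard error and 95% confidence interval.
theta_hat <- 0.42; se_naive <- 0.015; deff <- 2.1   # assumed inputs

deft        <- sqrt(deff)
se_adjusted <- se_naive * deft
ci <- theta_hat + c(-1, 1) * qnorm(0.975) * se_adjusted
round(c(se_adjusted = se_adjusted, lower = ci[1], upper = ci[2]), 4)
```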

Sample Size Planning

Sample size planning in complex survey designs relies on the design effect (deff) to adjust the required sample size to achieve a desired level of precision, such as a specified margin of error for estimating population parameters. Under simple random sampling (SRS), the sample size for estimating a proportion p with margin of error d is n_{\text{SRS}} = \frac{Z^2 p (1-p)}{d^2}, where Z is the standard normal critical value for the chosen confidence level (1.96 for 95% confidence); complex designs inflate this baseline through clustering, unequal probabilities, or other features that increase variance. The adjusted sample size is then n_{\text{complex}} = n_{\text{SRS}} \times \text{deff}, with any anticipated efficiency gain from features like stratification (which can push deff below 1 in purely stratified designs) reducing the requirement correspondingly, so that the effective sample size matches SRS precision. This approach, rooted in design-based inference, allows planners to anticipate the impact of the sampling strategy before data collection. To implement this, planners follow structured steps to estimate deff. First, anticipate the design features: for cluster sampling, deff is often approximated as 1 + (m-1)\rho, where m is the average cluster size and \rho is the intraclass correlation coefficient measuring within-cluster similarity. Typical \rho values from literature or prior surveys range from 0.01 to 0.05 for social and health outcomes; for example, assuming \rho = 0.05 and m = 30 yields deff = 1 + 29 \times 0.05 = 2.45, more than doubling the required sample size relative to SRS. If pilot data are available, compute deff empirically from a small preliminary sample by comparing variances under the complex design to SRS equivalents. Literature reviews of similar surveys provide benchmarks, such as deff values of 1.5–3 for multi-stage household clusters in national health surveys. Once estimated, plug the deff into the adjusted formula, iterating if design parameters like cluster size change. Budget constraints further guide planning, particularly in cluster designs where costs per cluster (e.g., travel and training) exceed costs per unit within clusters. Increasing cluster size m reduces the number of clusters needed but inflates deff through the higher \rho impact, potentially requiring more total units; conversely, more smaller clusters minimize deff but raise fixed costs. Planners balance this by minimizing total cost C = c_0 k + c_1 n, where k is the number of clusters, n = k m, c_0 is the cost per cluster, and c_1 is the marginal cost per unit, subject to the deff-adjusted precision target. Optimal m often falls around 20–30 for many surveys to control both deff and expenses. In stratified designs, adjust stratum-specific allocations by incorporating within-stratum deff: for proportional allocation, set n_h = n \frac{N_h}{N}, then scale each by the stratum's deff (e.g., if one stratum uses clusters with deff = 1.8, inflate its n_h accordingly, while the overall efficiency gain from stratification may yield total deff < 1). This ensures domain-specific precision without over-sampling low-variance strata.
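These planning steps can be scripted directly; the sketch below uses assumed planning values for p, d, m, and \rho, not recommendations:

```r
# Deff-adjusted sample size for estimating a proportion.
p <- 0.5; d <- 0.03             # assumed proportion and margin of error
z <- qnorm(0.975)               # 95% confidence
m <- 30; rho <- 0.05            # anticipated cluster size and ICC

n_srs     <- ceiling(z^2 * p * (1 - p) / d^2)   # about 1068
deff      <- 1 + (m - 1) * rho                  # 2.45
n_complex <- ceiling(n_srs * deff)              # about 2617
c(n_srs = n_srs, deff = deff, n_complex = n_complex)
```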

Real-World Survey Examples

The National Health Interview Survey (NHIS), conducted annually by the Centers for Disease Control and Prevention (CDC), employs a stratified multistage cluster sampling design with primary sampling units (PSUs) selected as clusters of addresses within geographic areas. This approach results in design effects (deff) typically ranging from approximately 2 to 4 for key health estimates, attributable to the clustering of households within PSUs, which increases variance compared to simple random sampling. Variance adjustments in NHIS analysis incorporate these effects through software like SUDAAN, ensuring reliable standard error estimates for indicators such as health insurance coverage and chronic conditions, where DEFT values (square root of deff) often fall between 1.5 and 2.0. The European Social Survey (ESS), a cross-national study across multiple European countries, utilizes stratified multistage probability sampling, often involving regional stratification followed by cluster selection of addresses or individuals. Design effects in the ESS generally range from 1.5 to 3, influenced by varying selection probabilities and clustering, with country-specific estimates like 1.32 in some rounds due to moderate intra-cluster correlation. To mitigate these effects, the ESS applies calibration weighting during post-estimation, which helps stabilize variances for social and attitudinal indicators while maintaining comparability across nations. During the COVID-19 pandemic (2020-2023), rapid surveys relying on online panels exhibited notably high design effects, often exceeding 5, stemming from nonprobability sampling and haphazard weighting to adjust for coverage biases in hastily assembled respondent pools. For instance, the U.S. Census Bureau's Household Pulse Survey, a weekly online data collection on pandemic impacts, experienced substantial design effects reflecting variance inflation from unequal probabilities and clustering in digital recruitment. Similarly, various international rapid polls faced elevated deff due to convenience sampling via social media or opt-in panels, complicating precise tracking of infection rates and behaviors. In the Indian National Family Health Survey-5 (NFHS-5, 2019-2021), a two-stage stratified cluster design with villages as PSUs yielded a design effect of approximately 2.1 for key health indicators, such as immunization coverage and anemia prevalence, driven by intra-cluster homogeneity in rural and urban settings. This deff value, derived from DEFT calculations in survey tables, underscores the efficiency loss from clustering about 20-30 households per PSU, necessitating adjustments in variance estimation for national and district-level inferences on maternal and child health. A critical lesson from these pandemic-era applications is that failing to account for design effects in rapid surveys and polls can lead to overstated precision, with unadjusted confidence intervals appearing unduly narrow and masking true uncertainty in estimates like vaccine uptake or public sentiment.

Implementation in Software

Overview of Tools and Packages

Several statistical software packages and tools facilitate the computation of design effects (deff) in complex survey analysis, enabling researchers to account for sampling structures such as stratification, clustering, and weighting. These tools typically integrate deff calculations into broader survey estimation routines, often by comparing variances under complex designs to simple random sampling equivalents. Open-source options predominate in recent developments, promoting accessibility and reproducibility in research workflows. In R, the survey package, developed by Thomas Lumley, provides core functionality for survey data analysis through the svydesign() function, which constructs survey design objects incorporating sampling features like clusters and strata. Design effects for estimates such as means and totals are obtained by passing deff = TRUE to estimation functions like svymean() and svytotal(), with variances computed by Taylor linearization or by replication methods (including the bootstrap), depending on how the design object is specified. Complementing this, the srvyr package offers a tidyverse-compatible interface, allowing users to apply survey designs to data frames and compute summaries with design effects via wrappers around survey functions, enhancing usability for modern data pipelines. Proprietary software like SAS includes the DEFF option in PROC SURVEYMEANS, which calculates design effects for means and related statistics under specified survey designs, including finite population corrections. Similarly, Stata's svy: prefix commands, followed by postestimation with estat effects, yield design effects (DEFF and DEFT) and related quantities for a range of estimators, integrating seamlessly with its survey data setup via svyset. These tools are widely used in institutional settings for their robust handling of large datasets. For Python, dedicated support remains limited, with users often relying on custom implementations or extensions in libraries like statsmodels for basic weighted variance computations that can approximate deff. A more comprehensive option is the samplics package (introduced around 2021), which supports complex survey designs and provides modules for estimating population parameters with design-based variances, enabling deff derivation through variance ratios. In Julia, the Survey.jl package, released in 2022, offers scalable analysis for large survey datasets with support for multi-stage and stratified designs, including replication-based variance estimation that facilitates deff calculations for efficient, high-performance computing. Open-source packages such as those in R, Python, and Julia emphasize reproducibility by providing transparent, version-controlled code and documentation, allowing researchers to verify and extend deff computations without licensing barriers, in contrast to proprietary systems like SAS and Stata that require paid access but offer integrated enterprise support.

Code Examples and Best Practices

In R, the survey package provides robust tools for incorporating complex survey designs into analyses, including the computation of design effects (deff) to quantify the efficiency loss relative to simple random sampling. A typical workflow begins by creating a survey design object with svydesign(), specifying clusters via the id argument (representing primary sampling units or PSUs), strata if applicable, and sampling weights. Estimators like svymean() can then compute means along with their design effects by setting deff = TRUE. For instance, using the built-in apiclus1 dataset, a one-stage cluster sample of California school districts, the following code defines a clustered design and estimates the mean API score in 2000 with its deff:
```r
library(survey)
data(api)   # loads apiclus1 along with the other api datasets
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
svymean(~api00, dclus1, deff = TRUE, na.rm = TRUE)
```
This yields a mean api00 of approximately 644.2 with a standard error of about 23.5 and a deff of roughly 9.3, indicating that the clustered design inflates the variance of the mean several-fold compared to simple random sampling of the same size. The deff far exceeds 1, as expected in cluster sampling where intra-cluster correlation increases variability. Best practices for deff computation emphasize fully specifying the survey design to account for all features like weights and PSUs, as omitting them leads to underestimated variances. Always check whether deff values exceed 1 in clustered designs to confirm the impact of the sampling structure, and use na.rm = TRUE in estimators or adjust weights to handle missing data without biasing results. For reproducible analyses, document the design object creation explicitly, including finite population corrections (fpc) when known, to ensure transparency in variance adjustments. Common pitfalls include assuming simple random sampling via software defaults, such as using base R functions like mean() without a design object, which ignores clustering and weighting and implicitly treats deff as 1 even in complex samples. Another error is over-relying on a single average deff across variables for planning or interpretation, as deff is estimator-specific and can vary substantially within one survey (e.g., from 0.73 for height to 6.82 for BMI). Integration with the tidyverse ecosystem via the srvyr package enhances reproducibility for deff computations by enabling dplyr-style syntax on survey objects, with survey_mean(deff = TRUE) called inside summarize(). For the apiclus1 example, this might look like: apiclus1 %>% as_survey_design(ids = dnum, weights = pw, fpc = fpc) %>% group_by(stype) %>% summarize(api00 = survey_mean(api00, deff = TRUE)), facilitating group-wise deff estimates in tidy pipelines.

References

  1. [1]
    Section 1. Introduction - Statistique Canada
    Jun 30, 2020 · The design effect is the ratio of variance under a survey design to a simple random sample. Effective sample size is the simple random sample ...
  2. [2]
    Considering the design effect in cluster sampling - PMC - NIH
    The design effect (Deff) is a correction factor used to adjust sample size in cluster sampling, representing variance inflation. It is calculated as 1 + p(n-1) ...
  3. [3]
    Design Effect: Definition, Examples - Statistics How To
    A design effect (DEFF) is an adjustment made to find a survey sample size, due to a sampling method (eg cluster sampling, respondent driven sampling, or ...
  4. [4]
    [PDF] Chapter VI Estimating components of design effects for use in ...
    Kish (1965) coined the term "design effect" to denote the ratio of the variance of any estimate, say, z , obtained from a complex design to the variance of z ...
  5. [5]
    Design Effects and Effective Sample Size
    Sep 29, 2025 · The design effect (deff) is the ratio of the variance of a survey statistic under the complex design to the variance of the survey statistic under an SRS.
  6. [6]
    [PDF] Design Effects for the Weighted Mean and Total Estimators Under ...
    The design effect is widely used in survey sampling for developing a sampling design and for reporting the effect of the sampling design in estimation and ...
  7. [7]
    Encyclopedia of Survey Research Methods - Design Effects (deff)
    The design effect (deff) is a survey statistic computed as the quotient of the variability in the parameter estimate of interest resulting ...
  8. [8]
    [PDF] History and Development of the Theoretical Foundations of Survey ...
    The design effect is defined as the ratio of the actual variance of a statistic under the specified design to the variance which would be achieved under a ...
  9. [9]
    Morris H. Hansen - U.S. Census Bureau
    Jun 22, 1983 · Morris H. Hansen was, perhaps, the most influential statistician in the evolution of survey methodology in the 20th century.Missing: WWII | Show results with:WWII
  10. [10]
    Current Population Survey History - U.S. Census Bureau
    Oct 9, 2023 · The households in the revised sample were in 68 "Primary Sampling Units" (PSU's), comprising 125 counties and independent cities. By 1945, about ...
  11. [11]
  12. [12]
    [PDF] neyman-1934.pdf - Error Statistics Philosophy
    ON THE Two DIFFERENT ASPECTS OF THE REPRESENTATIVE METHOD: THE METHOD OF STRATIFIED SAMPLING AND THE METHOD. OF PURPOSIVE SELECTION. By JERZY NEYMAN. (Biometric ...
  13. [13]
    [PDF] Morris Hansen - National Academy of Sciences
    Hansen was an advocate of the principle that in most cases inference from sample surveys should be based on the design of the surveys rather than on assumed ...
  14. [14]
    (PDF) DESIGN EFFECTS AND SURVEY PLANNING - ResearchGate
    The design effects of survey estimates can be used as tools for measuring sample efficiency and for survey planning. Kish (1965) defined the design effect ...
  15. [15]
    [PDF] 1978: DESIGN EFFECTS IN A COMPLEX MULTISTAGE SAMPLE
    This paper presents the results of an empirical examination of design effects of attributes and proportions estimated from a complex sample survey.
  16. [16]
    [PDF] Methods for Design Effects - SCB
    Problems arise due to differences of deft values between variables (Kish 1965, Section 14.1). A new paper makes abundantly clear, with 3 variables from 56 ...
  17. [17]
    [PDF] Sampling: - Design and Analysis - WordPress.com
    be better for science and society if they were simply not done. This book gives you guidance on how to tell when a sample is valid or not, and how to design ...
  18. [18]
    [PDF] A design effect measure for calibration weighting in single-stage ...
    While the Kish design effects attempt to measure the impact of variable weights, they are informative only under special circumstances, do not. Page 4. 316.
  19. [19]
    An Introduction to Machine Learning Methods for Survey Researchers
    Jan 2, 2018 · This special issue aims to familiarize survey researchers and social scientists with the basic concepts in machine learning and highlights five common methods.Abstract · Tuning Parameters For... · A Recap Of Explanatory...Missing: big 2020s
  20. [20]
    Bayesian Ideas in Survey Sampling: The Legacy of Basu | Sankhya A
    Oct 16, 2023 · The aim of this paper is to explore and discuss the potential role of Bayesian ideas and techniques in modern survey sampling.
  21. [21]
    [PDF] Simple Random Sampling - University of Michigan
    Sep 11, 2012 · The goal is to estimate the mean and the variance of a variable of interest in a finite population by collecting a random sample from it.
  22. [22]
    [PDF] survey sampling
    Sep 21, 2007 · Throughout the book the reader's attention is called to possible frame defects and their effects on sample design. Problems invite the reader to ...Missing: Deft | Show results with:Deft
  23. [23]
    Population Survey or Descriptive Study | StatCalc | User Guide - CDC
    For simple random sampling, the values for Design Effect and Clusters are 1 by definition. Example: The following example investigates whether the ...
  24. [24]
    Chapter 8 Sampling with probabilities proportional to size
    As a consequence, the sampling units have unequal size. The sampling units of unequal size are selected by probabilities proportional to their size (pps).
  25. [25]
    [PDF] An Approximate Design Effect for Unequal Weighting When ...
    For simplicity, we will discuss single-stage unequal probability sampling with replacement. Heuristic extension of the results to sampling without replacement is ...
  26. [26]
    [PDF] chapter 2. sampling design - U.S. Environmental Protection Agency
    In cluster sampling, the total population is divided into a number of relatively small subdivisions, or clusters, and then some of the subdivisions are randomly selected ...
  27. [27]
    What Is an Intracluster Correlation Coefficient? Crucial Concepts for ...
    The smaller the design effect, the larger the effective sample size. A high k (number of clusters) and a low m (number of elements within a cluster) give the ...
  28. [28]
  29. [29]
    Intraclass Correlation Coefficients Typical of Cluster-Randomized ...
    ICCs indicating substantial within-practice clustering were calculated for age (ICC = 0.151), race (ICC = 0.265), and such behaviors as smoking (ICC = 0.118) ...
  30. [30]
    [PDF] Sampling - European Social Survey
    Design Effects/ Effective Sample Size. As indicated, a variety of complex sample designs such as multistage stratified and clustered sampling was used in the ...
  31. [31]
    [PDF] Design and Estimation for the National Health Interview Survey ...
    Design PSUs totaled 498 prior to 1960; 3 were added in 1960 for Alaska and Hawaii. ... NOTES: CV is coefficient of variation, and deft is design effect.
  32. [32]
    An Extension of Kish's Formula for Design Effects to Two - NIH
    Mar 13, 2017 · Kish (1965) coined the term 'design effect' (DEFF) to describe the ratio of the variance obtained from the complex sampling plan to the variance ...
  33. [33]
  34. [34]
    [PDF] Sample Designs of the Medical Expenditure Panel Survey ...
    Overview of 2006–2015 NHIS Sample Design. As for earlier NHIS sample designs, the 2006−2015 NHIS design was based on a stratified multi- stage sample design.
  35. [35]
    Summary of Model-Based Justification for Kish's Design Effect Formula
  36. [36]
    [PDF] Design effects: model-based versus design-based approach
    ... design effect of multi-stage cluster sampling with unequal inclusion probabilities can be written as deff = deff_p × deff_c (3.11). The concept of the ...
  37. [37]
    Post-stratification or non-response adjustment? - Survey Practice
    Jul 31, 2016 · The base weights have the lowest degree of variability, which translates to the lowest apparent unequal weighting design effect of 1.054.
  38. [38]
    [PDF] Using the Lasso to Select Auxiliary Vectors for the Construction of ...
    Jun 30, 2017 · ... increases variance without ... adjustment reduces bias, often significantly, exemplifying the importance of calibration in survey results.
  39. [39]
    Design Effects for a Regression Slope in a Cluster Sample
    Aug 8, 2025 · In this article, design effects of ordinary least squares and generalized least squares estimators of a regression slope are derived under ...
  40. [40]
    [PDF] Design Effects and Generalized Variance Functions
    Kish, L. (1965), Survey Sampling. New York: John Wiley. Lee, K. H. (1972), “Partially balanced designs for half sample replication method of variance.
  41. [41]
    [PDF] Replication-Based Variance Estimation Methods for Survey Data ...
    Replication methods, like jackknife and BRR, replace complex algebra with simple repeated analysis, enabling variance estimation for nonlinear quantities. ...
  42. [42]
    Taylor Series Linearization (TSL) - Sage Research Methods
    The TSL method uses the linear terms of a Taylor series expansion to approximate the estimator by a linear function of the observed data. The ...
  43. [43]
  44. [44]
    [PDF] Section 2: Preparing the Sample Overview
    Jan 26, 2017 · Multi-stage cluster sampling is one of the most common sample designs for national surveys and it is the recommended method for most STEPS ...
  45. [45]
    [PDF] Sampling Guidelines: Principles and Implementation for the ...
    Finally, our estimate for the overall design effect Deff = 1.32, as Deff_p = 1. Since the net sample size is 71, the estimated effective sample is n_eff = n_net / Deff ...
  46. [46]
    Are We There Yet? Big Surveys Significantly Overestimate COVID ...
    Sep 3, 2021 · Accurate surveys are the primary tool for understanding public opinion towards and barriers preventing COVID-19 vaccine uptake.
  47. [47]
    [PDF] National Family Health Survey (NFHS-5), 2019-21 - The DHS Program
    design effect (DEFT), the relative standard error (SE/R), and the 95 percent confidence limits (R±2SE) for each variable. The DEFT is considered undefined ...
  48. [48]
    [PDF] Defining the role of open source software in research reproducibility
    May 18, 2022 · An open source license does not by itself support reuse; good-quality software design and documentation are needed in at least some measure ...
  49. [49]
    [PDF] survey.pdf
    The strata argument is used only to compute finite population corrections, the same variables must be included in formula to compute stratified sampling ...
  50. [50]
    [PDF] Complex sampling and R. - faculty.washington.edu
    The survey package always uses formulas to specify variables in a survey data set. ... Basic estimation ideas. Individuals are sampled with known ...
  51. [51]
    PROC SURVEYMEANS Statement - SAS Help Center
    Sep 29, 2025 · DEFF requests the design effect for MEAN; DF requests the degrees of ... This option has no effect on tables that use the STACKING option.
  52. [52]
    [PDF] estat — Postestimation statistics for survey data - Stata
    Options for estat lceffects: deff and deft request that the design-effect measures DEFF and DEFT be displayed. This is the default, unless direct ...
  53. [53]
    samplics - PyPI
    Samplics is a python package that implements a set of sampling techniques for complex survey designs. These survey sampling techniques are organized into ...
  54. [54]
    [PDF] samplics: a Python Package for selecting, weighting and analyzing ...
    Dec 8, 2021 · samplics is a Python package developed to provide a comprehensive set of APIs to select random samples, adjust sample weights, produce design- ...
  55. [55]
    Home · Survey.jl
    Dec 22, 2023 · This Julia package aims to provide an efficient framework for rapidly growing sizes of survey datasets. Several software tools are available to ...
  56. [56]
    Open source and reproducible and inexpensive infrastructure for ...
    Jan 2, 2024 · To fill that gap, we developed a workflow that allows for reproducible model training, testing, and evaluation. We leveraged public GitHub ...
  57. [57]
    8.2 Specifying the survey design | Introduction to Regression ...
    When incorporating a complex survey design, always use the full dataset. Do not exclude any observations, not even to apply inclusion/exclusion criteria or to ...
  58. [58]
    Survey Data Analysis with R - OARC Stats - UCLA
    They serve the same function as the PSU and strata variables (which are used in a Taylor series linearization) to correct the standard errors of the estimates for ...
  59. [59]
    [PDF] srvyr: 'dplyr'-Like Syntax for Summary Statistics of Survey Data
    Aug 19, 2024 · If provided a svyrep.design object from the survey package, it will turn it into a srvyr object, so that srvyr functions will work with it.
  60. [60]
    Chapter 5 Descriptive analyses | Exploring Complex Survey Data ...
    This chapter discusses how to analyze measures of distribution (eg, cross-tabulations), central tendency (eg, means), relationship (eg, ratios), and dispersion ...