
Bernoulli sampling

Bernoulli sampling is a fundamental probability sampling method in survey statistics, characterized by the independent selection of each population unit with a fixed inclusion probability π, typically set to achieve an expected sample size of n = Nπ, where N is the population size. This design results in a random sample size that follows a binomial distribution Bin(N, π), distinguishing it from fixed-size methods such as simple random sampling. In Bernoulli sampling, the inclusion of units is determined through independent Bernoulli trials, so the probability of selecting any particular subset of k units is π^k (1-π)^{N-k}, which maximizes the entropy among designs with equal first-order inclusion probabilities. The method is particularly straightforward to implement, as it requires no coordination between selections, and its independence property makes it a baseline for more complex designs, such as Poisson sampling, which generalizes it by allowing varying inclusion probabilities π_i. The key estimator in Bernoulli sampling is the Horvitz-Thompson estimator of the population total, given by \hat{Y} = \sum_{i \in s} y_i / \pi, where s is the realized sample and y_i are the unit values; its unbiasedness stems from the design's known inclusion probabilities. The variance of this estimator under equal π is V(\hat{Y}) = \frac{1 - \pi}{\pi} \sum_{i=1}^N y_i^2, or equivalently N \frac{1 - \pi}{\pi} (\bar{Y}^2 + S^2), where \bar{Y} is the population mean and S^2 = \frac{1}{N} \sum_{i=1}^N (y_i - \bar{Y})^2 is the population variance, reflecting the added variability from the random sample size. While the random sample size can be a drawback in practice (potentially leading to inefficiencies if n deviates significantly from its expectation), Bernoulli sampling excels in theoretical analyses and scenarios requiring independence between selections, such as bootstrap methods or multi-stage surveys.

Definition and Fundamentals

Definition

Bernoulli sampling is a probability-based sampling method used in statistics to select elements from a finite population of size N. In this approach, each unit in the population is independently included in the sample with a fixed probability p (where 0 < p < 1), known as the inclusion probability or sampling rate. This process treats the inclusion of each unit as a separate Bernoulli trial, resulting in a random sample size S that varies from 0 to N. Unlike sampling designs with predetermined sizes, the expected sample size is Np, providing flexibility in scenarios where exact control over sample size is not required. The method operates under two key assumptions: independence among selections, ensuring that the inclusion of one unit does not influence any other, and a uniform probability p applied to all units. Since selections are binary (included or not) and made at most once per unit, the sample consists of distinct units without duplicates, forming a random subset of the population. The random sample size S follows a binomial distribution with parameters N and p. This design is particularly useful in large-scale surveys where computational efficiency or variable response rates are concerns. The method is named after the Swiss mathematician Jacob Bernoulli, whose foundational work on probability, Ars Conjectandi (1713), introduced Bernoulli trials as independent experiments with two outcomes. The broader theory of probability sampling designs, including those with independent inclusions such as Bernoulli sampling, was developed in the early 20th century by statisticians such as Jerzy Neyman.

Mathematical Formulation

In Bernoulli sampling from a finite population of size N, each unit i = 1, \dots, N is included in the sample independently with fixed probability p, where 0 < p < 1. This process is modeled using indicator random variables X_i, defined such that X_i = 1 if unit i is selected and X_i = 0 otherwise, with P(X_i = 1) = p and P(X_i = 0) = 1 - p for each i. The X_i are independent across units, and the resulting sample consists of the units for which X_i = 1. The sample size S, the number of units selected, is given by S = \sum_{i=1}^N X_i. Since the X_i are independent Bernoulli random variables with parameter p, S follows a binomial distribution: S \sim \operatorname{Bin}(N, p). The expected sample size is E[S] = Np, obtained by linearity of expectation: E[S] = \sum_{i=1}^N E[X_i] = \sum_{i=1}^N p = Np. The variance of the sample size is \operatorname{Var}(S) = Np(1-p), which is the variance of a binomial random variable, or directly \operatorname{Var}(S) = \sum_{i=1}^N \operatorname{Var}(X_i) = \sum_{i=1}^N p(1-p) = Np(1-p), since the X_i are independent. To estimate the total \tau = \sum_{i=1}^N y_i of a variable of interest y associated with the units, note that the first-order inclusion probability of each unit is \pi_i = P(X_i = 1) = p. The Horvitz-Thompson estimator is then \hat{\tau} = \frac{1}{p} \sum_{i: X_i=1} y_i = \sum_{i \in \mathcal{S}} \frac{y_i}{p}, where \mathcal{S} denotes the realized sample. This estimator is unbiased for \tau under the design-based framework of probability sampling.
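The mechanics above can be sketched in a short simulation; the population values, seed, and replication count below are arbitrary illustrations, not part of any standard reference implementation:

```python
import random

# Simulate Bernoulli sampling from a synthetic population and check
# E[S] = N*p and the unbiasedness of the Horvitz-Thompson total estimator.
random.seed(42)

N, p = 1000, 0.1
y = [float(i % 50) for i in range(N)]   # arbitrary study variable
tau = sum(y)                            # true population total

R = 2000                                # number of replications
sizes, ht_estimates = [], []
for _ in range(R):
    # Independent Bernoulli(p) inclusion decision for each unit
    sample = [y[i] for i in range(N) if random.random() < p]
    sizes.append(len(sample))
    ht_estimates.append(sum(sample) / p)    # Horvitz-Thompson: sum y_i / p

mean_size = sum(sizes) / R
mean_ht = sum(ht_estimates) / R
print(mean_size)   # close to N*p = 100
print(mean_ht)     # close to tau
```

Averaged over replications, the realized sample size concentrates near Np and the Horvitz-Thompson estimates concentrate near the true total, in line with the formulas above.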

Statistical Properties

Bias and Unbiased Estimators

In Bernoulli sampling, the Horvitz-Thompson estimator of the population total, \hat{\tau} = \sum_{i=1}^N \frac{X_i y_i}{\pi_i}, is unbiased, where X_i is the inclusion indicator for unit i, y_i is the value associated with unit i, and \pi_i = p is the fixed inclusion probability for all units. The expected value is E[\hat{\tau}] = \sum_{i=1}^N E\left[\frac{X_i y_i}{p}\right] = \sum_{i=1}^N y_i \cdot \frac{E[X_i]}{p} = \sum_{i=1}^N y_i = \tau, since E[X_i] = p for each i due to the Bernoulli nature of the selection process. The sample mean \bar{y} = \frac{1}{S} \sum_{i \in s} y_i, where s is the realized sample and S = \sum_{i=1}^N X_i is the random sample size, is a biased estimator of the population mean \mu = \tau / N, since the random denominator S introduces a ratio effect and the empty-sample case must be assigned a default value. An unbiased alternative is the Horvitz-Thompson mean estimator \hat{\mu} = \frac{1}{N} \hat{\tau}, which directly inherits the unbiasedness of \hat{\tau}. Unbiasedness of these estimators holds under the conditions of a fixed probability p for all units, independent selections across units (as in the Bernoulli design), and complete response with no non-response bias, ensuring that the observed y_i match the intended values without additional measurement error. A key feature simplifying unbiased estimation in Bernoulli sampling is that all inclusion probabilities are equal (\pi_i = p for i = 1, \dots, N), which avoids the complexities of unequal-probability sampling, where the \pi_i vary and require unit-specific adjustments.
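Because the population below is tiny, the bias comparison can be verified exactly by enumerating every possible sample rather than simulating; the values are toy illustrations:

```python
from itertools import product

# Enumerate all 2^N inclusion patterns for a tiny population (N = 6) to
# compare the HT mean estimator with the naive sample mean (taken as 0
# when the sample is empty, as in the usual empty-sample convention).
y = [2.0, 3.0, 5.0, 7.0, 11.0, 13.0]
N, p = len(y), 0.3
mu = sum(y) / N

e_ht = 0.0      # E[(1/(N p)) * sum_{i in s} y_i]
e_naive = 0.0   # E[sample mean], with an empty sample contributing 0
for pattern in product([0, 1], repeat=N):
    prob = 1.0
    for x in pattern:
        prob *= p if x else (1 - p)
    s = [y[i] for i in range(N) if pattern[i]]
    e_ht += prob * (sum(s) / (N * p))
    if s:
        e_naive += prob * (sum(s) / len(s))

print(e_ht)      # exactly mu: the HT mean estimator is unbiased
print(e_naive)   # mu * (1 - (1-p)^N): the naive mean is biased downward
```

The enumeration shows the HT mean estimator hitting μ exactly, while the naive mean falls short by μ(1-p)^N, the contribution lost to the empty-sample case.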

Variance and Sampling Distribution

The Horvitz-Thompson estimator \hat{\tau} for the population total \tau = \sum_{i=1}^N y_i under Bernoulli sampling has variance \Var(\hat{\tau}) = \frac{1-p}{p} \sum_{i=1}^N y_i^2, where the sum is over all units and p is the inclusion probability. This expression arises from the independence of the inclusion indicators \delta_i \sim \Bernoulli(p), leading to \Var(\hat{\tau}) = \sum_{i=1}^N \left( \frac{y_i}{p} \right)^2 \Var(\delta_i) = \frac{1-p}{p} \sum_{i=1}^N y_i^2. It can be rewritten in terms of the population mean \bar{y} = \frac{1}{N} \sum_{i=1}^N y_i and the population variance S^2 = \frac{1}{N-1} \sum_{i=1}^N (y_i - \bar{y})^2 as \Var(\hat{\tau}) = \frac{1-p}{p} \sum_{i=1}^N (y_i - \bar{y})^2 + \frac{1-p}{p} N \bar{y}^2 = \frac{(1-p)(N-1)}{p} S^2 + \frac{1-p}{p} N \bar{y}^2, highlighting the contributions from population variability and the squared mean. The sampling distribution of the Horvitz-Thompson estimator \hat{\tau} is approximately normal for large population size N when the expected sample size Np is fixed or sufficiently large. This follows from the central limit theorem applied to the sum of independent terms \frac{\delta_i y_i}{p}, each with finite variance, yielding \hat{\tau} \approx \mathcal{N}\left( \tau, \frac{1-p}{p} \sum_{i=1}^N y_i^2 \right). Such asymptotic normality facilitates inference procedures, including confidence intervals based on the estimated variance. The sample size n = \sum_{i=1}^N \delta_i under Bernoulli sampling follows an exact binomial distribution \Bin(N, p). For large N and small p with fixed expected size \lambda = Np, the distribution approximates a Poisson distribution with parameter \lambda, i.e., n \approx \Pois(\lambda), which is useful for modeling sparse sampling scenarios. The second-order joint inclusion probabilities in Bernoulli sampling are \pi_{ij} = P(\delta_i = 1, \delta_j = 1) = p^2 for all i \neq j, reflecting the independence of inclusions. 
These probabilities enter the general Horvitz-Thompson variance formula \operatorname{Var}(\hat{\tau}) = \sum_{i=1}^N \sum_{j=1}^N (\pi_{ij} - \pi_i \pi_j) \frac{y_i}{\pi_i} \frac{y_j}{\pi_j}, where the cross terms vanish since \pi_{ij} = \pi_i \pi_j = p^2 for i \neq j, reducing it to the independent-sum form; the joint probabilities remain essential, however, for variance estimation in broader unequal-probability designs or when approximating the variance from sample data.
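The algebraic equivalence of the two variance expressions above can be confirmed numerically; the population values below are arbitrary:

```python
# Numeric check that the direct variance formula for the HT total under
# Bernoulli sampling matches its decomposition into S^2 and squared-mean
# terms (toy population values).
y = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
N, p = len(y), 0.25

ybar = sum(y) / N
S2 = sum((v - ybar) ** 2 for v in y) / (N - 1)   # population variance

var_direct = (1 - p) / p * sum(v * v for v in y)
var_decomp = (1 - p) * (N - 1) / p * S2 + (1 - p) / p * N * ybar ** 2
print(var_direct, var_decomp)   # identical up to rounding
```

Both forms evaluate to the same number, since \sum y_i^2 = (N-1)S^2 + N\bar{y}^2.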

Estimation Procedures

Point Estimation

In Bernoulli sampling, each unit in a finite population of size N is independently included in the sample with fixed probability p, so the random sample size follows a binomial distribution \operatorname{Bin}(N, p). The Horvitz-Thompson estimator provides an unbiased estimate of the population total \tau = \sum_{i=1}^N y_i, given by \hat{\tau} = \frac{1}{p} \sum_{i \in S} y_i, where S denotes the realized sample and y_i is the value associated with unit i. The corresponding unbiased estimator of the population mean \mu = \tau / N is then \hat{\mu} = \hat{\tau} / N = \frac{1}{N p} \sum_{i \in S} y_i. This estimator aggregates the observed values using expansion weights of 1/p for each selected unit, effectively scaling the sample to represent the full population under the independent inclusion mechanism. For binary outcomes, where y_i = 1 if unit i possesses a particular attribute (success) and 0 otherwise, the population proportion p_{\text{pop}} = \mu is estimated with the same estimator: \hat{p}_{\text{pop}} = \frac{1}{N} \sum_{i \in S} \frac{y_i}{p} = \frac{1}{N p} \sum_{i \in S} y_i, which counts the successes in the sample and scales by the inverse inclusion probability. This approach remains unbiased, as the Horvitz-Thompson principle ensures E[\hat{p}_{\text{pop}}] = p_{\text{pop}}, leveraging the known p to correct for the probabilistic selection. The random sample size introduces challenges, particularly when the sample is empty (S = \emptyset), which occurs with probability (1 - p)^N and yields \hat{\mu} = 0 or \hat{p}_{\text{pop}} = 0 under the Horvitz-Thompson estimator; this convention preserves unbiasedness but can lead to underestimation in small populations or at low p. To address this, one adjustment is to condition on a non-empty sample: given the observed size |S| \geq 1, the conditional design coincides with simple random sampling without replacement of that size, allowing the use of conditionally unbiased estimators derived from the realized sample size.
Alternatively, imputation techniques may be applied, such as assigning a fallback value (e.g., a mean from auxiliary information) in the empty-sample case before aggregation with weights 1/p, though this trades mild bias for reduced variance in practice. The variance of \hat{\mu} under Bernoulli sampling is \frac{1-p}{p N^2} \sum_{i=1}^N y_i^2, highlighting the efficiency gains from higher p.
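A minimal sketch of point estimation for a proportion, using a synthetic population with an assumed true proportion of 0.25 and an arbitrary seed:

```python
import random

# Horvitz-Thompson point estimation of a population proportion under
# Bernoulli sampling (synthetic binary population).
random.seed(7)
N, p = 500, 0.2
y = [1 if i % 4 == 0 else 0 for i in range(N)]   # true proportion = 0.25

sample = [y[i] for i in range(N) if random.random() < p]
p_hat = sum(sample) / (N * p)    # (1/(N p)) * sum of sampled y_i
print(len(sample), p_hat)        # expected size 100; p_hat near 0.25

# Probability of an empty sample, (1 - p)^N: negligible at this N and p
print((1 - p) ** N)
```

With N = 500 and p = 0.2 the empty-sample probability (1-p)^N is vanishingly small, so the empty-sample adjustments above matter mainly for small populations or very low sampling rates.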

Interval Estimation

Interval estimation in Bernoulli sampling involves constructing confidence intervals for population parameters, such as the mean \mu, using the Horvitz-Thompson estimator \hat{\mu} derived from the sampled units. The normal-approximation confidence interval for the mean is \hat{\mu} \pm z_{\alpha/2} \sqrt{\widehat{\operatorname{Var}}(\hat{\mu})}, where \widehat{\operatorname{Var}}(\hat{\mu}) is estimated from the sample second moments adjusted for the inclusion probability p, specifically \widehat{\operatorname{Var}}(\hat{\mu}) = \frac{1-p}{N^2 p^2} \sum_{i \in s} y_i^2 for constant p, or more generally via the sample variance s^2 / (N p) for large samples. Bootstrap methods provide an alternative for estimating the variability of \hat{\mu}, particularly useful when the expected sample size N p is small. In the context of Poisson (Bernoulli) sampling, a studentized bootstrap approach resamples from the design by generating bootstrap inclusion indicators I_i^* independently with probability p, computes bootstrap replicates \hat{\mu}^*, and uses the distribution of (\hat{\mu}^* - \hat{\mu}) / \sqrt{\widehat{\operatorname{Var}}(\hat{\mu}^*)} to form percentile-t intervals, achieving second-order accurate coverage with error o_p((N p)^{-1/2}). For estimating a proportion Q, exact methods adapt the Clopper-Pearson interval to the sampling design by conditioning on the realized sample size n: given k successes in the sample, the conditional distribution is \operatorname{Binomial}(n, Q), yielding the interval based on beta quantiles, \left[ \operatorname{Beta}^{-1}(\alpha/2; k, n-k+1),\; \operatorname{Beta}^{-1}(1-\alpha/2; k+1, n-k) \right], which provides conservative coverage under the design. These intervals maintain nominal coverage asymptotically as N p \to \infty, but good finite-sample performance requires adjustments when N p is small (e.g., via the bootstrap) to avoid undercoverage of the normal approximation or excessive width of the exact methods.
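The normal-approximation interval above can be computed directly from a realized sample; the synthetic population, seed, and the constant 1.96 (the usual 95% normal quantile) are illustrative assumptions:

```python
import math
import random

# Normal-approximation 95% confidence interval for the population mean
# under Bernoulli sampling, using the design-based variance estimator
# Var_hat(mu_hat) = (1-p)/(N^2 p^2) * sum of sampled y_i^2.
random.seed(1)
N, p = 2000, 0.15
y = [random.gauss(50.0, 10.0) for _ in range(N)]   # synthetic population
mu = sum(y) / N

sample = [y[i] for i in range(N) if random.random() < p]
mu_hat = sum(sample) / (N * p)
var_hat = (1 - p) / (N ** 2 * p ** 2) * sum(v * v for v in sample)
half = 1.96 * math.sqrt(var_hat)
print(mu_hat - half, mu_hat + half)   # 95% CI; should usually cover mu
```

Over repeated draws such intervals cover the true mean at roughly the nominal 95% rate when Np is large, consistent with the asymptotic discussion above.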

Comparisons to Other Sampling Methods

With Simple Random Sampling

Bernoulli sampling differs from simple random sampling (SRS) in its design mechanism: whereas SRS selects a fixed sample size n from a finite population of size N without replacement, ensuring that every subset of size n is equally likely and no duplicates occur, Bernoulli sampling includes each population unit independently with fixed probability p, yielding a random sample size that follows a binomial distribution \operatorname{Bin}(N, p) and, likewise, no duplicates, since inclusions are one-time binary decisions. This independent inclusion process makes Bernoulli sampling the equal-probability form of Poisson sampling, allowing straightforward implementation at the cost of variability in the realized sample size. In terms of efficiency for estimating the population mean \mu, the Horvitz-Thompson (HT) estimator under Bernoulli sampling, \hat{\mu}_{\text{Bern}} = \frac{1}{N} \sum_{i \in S} \frac{y_i}{p}, is unbiased with approximate variance \operatorname{Var}(\hat{\mu}_{\text{Bern}}) \approx \frac{1-p}{Np} (S^2 + \bar{Y}^2), where S^2 is the population variance, \bar{Y} is the population mean, and Np is the expected sample size. In contrast, the sample mean under SRS, \bar{y}_{\text{SRS}}, has variance \operatorname{Var}(\bar{y}_{\text{SRS}}) = \frac{1 - n/N}{n} S^2. When the expected sample size Np equals n, Bernoulli sampling exhibits higher variance than SRS because of the randomness in sample size and the lack of negative dependence among inclusion indicators, making it less efficient except asymptotically, where for p = n/N and large N the designs become equivalent. This increased variability stems from the independent-inclusion structure, which does not benefit from a finite population correction as effectively as SRS.
Bernoulli sampling is preferable in scenarios requiring simplicity or extensions to unequal inclusion probabilities, such as Poisson sampling with \pi_i \propto |y_i| to minimize variance, or in online and streaming data environments, where independent decisions per unit make it easy to maintain a sample over evolving datasets without tracking fixed-size constraints. Conversely, SRS is more suitable when a fixed sampling budget is essential and uniform probabilities suffice, as it provides tighter control over sample size and lower variance for the same expected effort. As p \to 0 with Np = n fixed, the Bernoulli sample size approaches a Poisson(n) distribution, and the design exhibits greater variability in the estimator than SRS, since independent inclusions raise the design variance without the stabilizing effect of a fixed size.
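The efficiency gap at matched expected sample size can be seen empirically; the population, seed, and replication count below are assumed toy values:

```python
import random
import statistics

# Compare the empirical variance of the HT mean under Bernoulli sampling
# with that of the sample mean under SRS, at matched expected size n = Np.
random.seed(3)
N, n = 400, 40
p = n / N
y = [float((i * 37) % 100) for i in range(N)]   # arbitrary population

bern, srs = [], []
for _ in range(4000):
    s = [y[i] for i in range(N) if random.random() < p]
    bern.append(sum(s) / (N * p))                      # HT mean, Bernoulli
    srs.append(statistics.fmean(random.sample(y, n)))  # mean, SRS w/o repl.

print(statistics.variance(bern), statistics.variance(srs))
# The Bernoulli variance exceeds the SRS variance at the same expected size
```

The gap is large here because the HT variance includes the \bar{Y}^2 term driven by sample-size randomness, while the SRS sample mean depends only on S^2.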

With Sampling Without Replacement

Bernoulli sampling, the equal-probability special case of Poisson sampling, involves independent inclusion decisions for each population unit, resulting in zero covariance between the inclusion indicators \delta_i and \delta_j for i \neq j. This independence simplifies theoretical analysis and computation but leads to a random sample size following a binomial distribution. In contrast, sampling-without-replacement methods, such as simple random sampling (SRS) of fixed size n, introduce dependence among inclusion indicators, with \operatorname{Cov}(\delta_i, \delta_j) = -\frac{n}{N} \left(1 - \frac{n}{N}\right) \frac{1}{N-1} for i \neq j, where N is the population size. This negative covariance reflects the constraints of no duplicates and fixed sample size, creating a compensatory effect that reduces overall variability in estimators compared to the independent structure of Bernoulli sampling. The dependence in without-replacement designs yields variance reduction for estimators like the Horvitz-Thompson (HT) estimator, particularly when matching the expected sample size Np of Bernoulli sampling. For instance, rejective sampling and conditional Poisson sampling, methods that generate fixed-size samples while approximating target inclusion probabilities, exhibit lower design variance than Bernoulli sampling at the same expected size, as the fixed size eliminates variability in sample count and exploits the negative correlations. This reduction is quantified by a factor akin to the finite population correction (1 - n/N), which accounts for the decreased variance in finite populations; in Bernoulli sampling, the absence of this correction inflates the variance relative to fixed-size without-replacement alternatives. The HT estimator under Bernoulli sampling, \hat{\tau} = \sum_{i \in s} y_i / p, benefits from computational simplicity but lacks this gain. Quantitative comparisons often express the without-replacement variance as a fraction of the Bernoulli variance, emphasizing the trade-off between simplicity and efficiency. Implementation differences further highlight the contrasts.
Bernoulli sampling is computationally straightforward, requiring only an independent Bernoulli trial for each unit, with no tracking of prior selections or exclusions, making it ideal for streaming data or large-scale simulations. Without-replacement methods, by contrast, demand algorithms to enforce distinct selections, such as reservoir sampling for streaming data, where the population arrives sequentially; a reservoir algorithm maintains a fixed-size set of candidates, replacing elements with probability inversely proportional to the number of items seen so far. Such mechanisms add overhead, especially for unequal probabilities or variable-length streams, but ensure deterministic sample sizes beneficial for budgeting in surveys. A representative example illustrates the variance disparity for the sample size S = \sum_i \delta_i. Consider a population of N = 100 with inclusion probability p = 0.1, giving expected size n = 10; under Bernoulli sampling, S \sim \operatorname{Binomial}(100, 0.1), so \operatorname{Var}(S) = Np(1-p) = 9. In contrast, fixed-size without-replacement sampling with n = 10 has \operatorname{Var}(S) = 0, eliminating size variability entirely and underscoring the stabilizing effect of dependence.
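The worked example can be reproduced directly; the replication count and seed are arbitrary:

```python
import random
import statistics

# Reproduce the example: N = 100, p = 0.1, so E[S] = 10 and
# Var(S) = N*p*(1-p) = 9 under Bernoulli sampling, versus Var(S) = 0
# for any fixed-size design.
random.seed(5)
N, p = 100, 0.1

sizes = []
for _ in range(20000):
    sizes.append(sum(1 for _ in range(N) if random.random() < p))

print(statistics.fmean(sizes))      # close to N*p = 10
print(statistics.variance(sizes))   # close to N*p*(1-p) = 9
```

A fixed-size design would print a variance of exactly 0 for the same experiment, since every draw returns n = 10 units.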

Applications

In Survey Sampling

Bernoulli sampling forms the basis for probability-proportional-to-size (PPS) surveys by assigning each unit an inclusion probability p_i proportional to a predefined size measure, such as aggregate economic value or unit count. This generalization, known as Poisson sampling, uses independent Bernoulli trials with unequal probabilities, resulting in a random sample size while ensuring the first-order inclusion probabilities align with the size measures. Such designs are self-weighting when p_i is set directly proportional to the normalized size measure, so the inverse inclusion probability serves as the natural weight for unbiased estimation of totals without additional adjustments. This approach is particularly effective in PPS for reducing the variance of estimated totals compared to equal-probability methods, as larger units contribute more reliably to the sample. In large-scale surveys, Bernoulli sampling offers significant implementation advantages, especially in distributed environments like web-based data collection platforms. Each unit's selection occurs independently via a simple Bernoulli trial with probability p, requiring no central coordination to achieve a fixed sample size and thus avoiding the logistics of coordinating across remote or decentralized respondents. This scalability suits massive populations, such as online panels or national registries, where the random sample size, following a binomial distribution, adapts flexibly to frame uncertainties without predefined quotas. For instance, in web surveys, respondents can be selected probabilistically upon access, streamlining operations while maintaining probabilistic rigor. Bernoulli sampling also facilitates non-response handling by distinctly modeling the selection and response mechanisms, leveraging the known inclusion probabilities to apply inverse probability weighting (IPW) targeted at response biases.
The base weights, as inverses of p_i, account for selection, while subsequent IPW multiplies these by the inverse of estimated response probabilities (e.g., from logistic models fitted to auxiliary variables), enabling separate correction for non-response without conflating it with design effects. This separation enhances efficiency and reduces bias in respondent-only estimates, particularly in designs where non-response rates vary with unit characteristics. A practical example of Bernoulli sampling's application appears in the U.S. Census Bureau's survey operations, where Poisson sampling variants are used for variance estimation in economic surveys and to maximize overlap in primary sampling unit selection for frame evaluation in educational surveys.
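A minimal sketch of the unequal-probability (Poisson) variant described above, with hypothetical size measures and an assumed target expected sample size; probabilities are capped at 1 for units whose size would push p_i above it:

```python
import random

# Poisson sampling sketch: inclusion probabilities proportional to a size
# measure, with Horvitz-Thompson weights 1/p_i for the total estimator.
random.seed(11)
sizes = [5, 20, 80, 10, 35, 150, 60, 40]   # hypothetical size measures
n_expected = 3
total_size = sum(sizes)
probs = [min(1.0, n_expected * s / total_size) for s in sizes]

y = [2.0 * s for s in sizes]    # study variable roughly tracking size
tau = sum(y)

R = 5000
est = []
for _ in range(R):
    ht = sum(y[i] / probs[i]
             for i in range(len(y)) if random.random() < probs[i])
    est.append(ht)
print(sum(est) / R)   # close to tau: HT stays unbiased with unequal p_i
```

Because y here is nearly proportional to the size measure, the ratios y_i / p_i are almost constant, which is exactly the situation where PPS-style designs cut the variance of the estimated total.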

In Monte Carlo Methods

Bernoulli sampling plays a key role in Monte Carlo methods by enabling the generation of independent inclusions for stochastic paths and simulations, facilitating unbiased estimation of expectations and integrals. In Monte Carlo integration, consider estimating the integral \int f(x) \, dx over a unit-measure domain: one generates N candidate points x_i from the appropriate distribution (e.g., uniform), includes each independently with probability p via Bernoulli sampling, and applies the Horvitz-Thompson estimator \hat{\mu} = \frac{1}{Np} \sum_{i \in s} f(x_i), where s denotes the realized sample of included points. This yields an unbiased estimator of the integral, analogous to importance sampling in broader frameworks. The sampling distribution of this estimator exhibits higher variance than fixed-size simple random sampling due to the binomial variability in sample size, though it retains the standard Monte Carlo convergence rate of O(N^{-1/2}). To mitigate this, Bernoulli sampling is often combined with variance reduction techniques. Antithetic variates, for instance, generate negatively correlated pairs by applying the transformation 1 - u to the uniform random variables u used to produce the Bernoulli inclusions, thereby reducing estimator variance in integration tasks. Control variates further enhance efficiency by incorporating auxiliary variables with known expectations to adjust the estimator, and apply directly to Bernoulli-sampled paths in simulations. In financial applications, particularly credit risk modeling, Bernoulli sampling models individual default events in credit portfolios for Value-at-Risk (VaR) computations via simulation. Each obligor's default is simulated as a Bernoulli trial with probability equal to its default rate, often within Bernoulli mixture models to capture correlations; multiple scenarios are then generated to approximate the portfolio loss distribution and derive its quantiles. This approach is essential for handling the discrete nature of defaults in large-scale simulations, where the aggregate loss is the sum of these Bernoulli outcomes scaled by exposures.
A specific use arises in Markov chain Monte Carlo (MCMC), where Bernoulli proposals drive the Metropolis-Hastings algorithm for target distributions over discrete state spaces, such as in Bayesian variable selection. Here, the proposal distribution selects a coordinate and proposes flipping its value with a small probability (effectively a Bernoulli trial per component), enabling exploration of high-dimensional inclusion-indicator vectors while satisfying detailed balance.
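The Monte Carlo integration scheme described above can be sketched for a known integral; the integrand, sample size, and seed are illustrative choices:

```python
import random

# Estimate I = integral of x^2 over [0,1] (true value 1/3) by drawing N
# uniform candidates, keeping each with probability p (Bernoulli sampling),
# and applying the Horvitz-Thompson estimator (1/(N p)) * sum f(x_i).
random.seed(9)
N, p = 50000, 0.2


def f(x):
    return x * x


total = 0.0
for _ in range(N):
    x = random.random()
    if random.random() < p:     # independent Bernoulli inclusion
        total += f(x)
estimate = total / (N * p)
print(estimate)                 # close to 1/3
```

Only about Np = 10000 function evaluations are retained, yet dividing by Np rather than the random kept count keeps the estimator unbiased for the integral.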
