
Bernoulli process

In probability theory, a Bernoulli process is a discrete-time stochastic process consisting of a sequence of independent Bernoulli trials, where each trial has exactly two possible outcomes, typically termed "success" with probability p (where 0 \leq p \leq 1) and "failure" with probability 1 - p, and the success probability remains constant across all trials. The outcomes can be represented by indicator random variables X_i taking value 1 for success and 0 for failure, with \mathbb{E}[X_i] = p and \mathrm{Var}(X_i) = p(1-p). This process models fundamental random experiments such as repeated coin flips or other binary events, and serves as the foundation for more complex distributions.

Named after the Swiss mathematician Jacob Bernoulli (1654–1705), the concept emerged from his pioneering work on probability, particularly in his posthumously published book Ars Conjectandi (1713), where he formalized the binomial expansion for general success probabilities and proved an early version of the law of large numbers applicable to sequences of such trials. Prior contributions, such as those by Blaise Pascal and Pierre de Fermat in the 1650s, had explored fair coin scenarios (p = 1/2) leading to the binomial coefficients via Pascal's triangle, but Bernoulli extended this to arbitrary p, establishing the process's generality. His theorem asserts that, as the number of trials increases, the sample proportion of successes converges to p with high probability, underscoring the process's role in empirical estimation and statistical inference.

The Bernoulli process underpins several key probability distributions derived from partial sums or waiting times within the sequence. For instance, the total number of successes in n fixed trials follows a binomial distribution with parameters n and p, characterized by probability mass function P(K = k) = \binom{n}{k} p^k (1-p)^{n-k}. The number of trials until the first success adheres to a geometric distribution, with P(X = k) = (1-p)^{k-1} p for k = 1, 2, \dots, while the number of trials until the r-th success follows a negative binomial distribution, P(Y = y) = \binom{y-1}{r-1} p^r (1-p)^{y-r} for y = r, r+1, \dots. These distributions are essential in applied fields such as quality control and queueing, where binary events recur independently.

Beyond basic applications, the Bernoulli process connects to broader modeling, such as the simple random walk (where position is the cumulative sum of outcomes) and extensions like the Beta-Bernoulli process for Bayesian inference with unknown p. Its independence assumption distinguishes it from dependent binary sequences, such as Markov chains with memory of past outcomes, ensuring tractable computations via the product of individual probabilities. Violations of independence or constant probability lead to alternative models, but the Bernoulli framework remains a cornerstone for approximating real-world phenomena under ideal conditions.

Definition and Interpretation

Basic Definition

A Bernoulli process is a discrete-time stochastic process consisting of a sequence of independent and identically distributed random variables \{X_n : n = 1, 2, \dots \}, where each X_n follows a Bernoulli distribution with success probability p (where 0 \leq p \leq 1), taking the value 1 with probability p and 0 with probability 1-p. The parameter p represents the fixed probability of success in each trial, remaining constant across the sequence. In a more general formulation, the Bernoulli process can be viewed as a measurable map from a probability space to the sequence space \{0,1\}^\mathbb{N}, where each coordinate corresponds to the outcome of a single trial. The concept is named after Jacob Bernoulli, who introduced the foundational ideas of repeated independent trials in the context of games of chance in his posthumously published work Ars Conjectandi (1713). The sum of the first n variables in the process follows a binomial distribution with parameters n and p.
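
As a concrete illustration, the following Python sketch (not from the original text; function names and parameter values are illustrative choices) simulates sample paths of a Bernoulli(p) process with NumPy and checks that the partial sum S_n has the Binomial(n, p) mean np and variance np(1-p).

```python
# A minimal simulation sketch, assuming NumPy is available.
import numpy as np

def simulate_bernoulli(n_trials: int, p: float, rng: np.random.Generator) -> np.ndarray:
    """Return one sample path (X_1, ..., X_n) of a Bernoulli(p) process."""
    return (rng.random(n_trials) < p).astype(int)

rng = np.random.default_rng(seed=0)
n, p, reps = 100, 0.3, 10_000   # illustrative parameters

# S_n for many independent sample paths.
sums = np.array([simulate_bernoulli(n, p, rng).sum() for _ in range(reps)])

print(f"empirical mean of S_n: {sums.mean():.2f}  (theory: np = {n * p})")
print(f"empirical var  of S_n: {sums.var():.2f}  (theory: np(1-p) = {n * p * (1 - p):.2f})")
```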

Probabilistic Interpretation

The Bernoulli process can be intuitively understood as a sequence of repeated independent trials, each resulting in one of two possible outcomes: success or failure. A classic example is the repeated flipping of a coin, where "success" might correspond to heads and "failure" to tails. For a fair coin, the probability of success is p = 0.5, making each flip equally likely to yield either outcome; in contrast, a biased coin has p \neq 0.5, such as p = 0.7 for heads, reflecting a consistent but uneven bias across all flips. In real-world applications, the Bernoulli process models sequences of binary yes/no events, such as detecting defects in manufactured items during inspections, where each item is independently defective with fixed probability p or non-defective with probability 1 - p. Another example arises in discrete-time arrival modeling, where each time slot (e.g., a second or minute) has a probability p of an arrival occurring, independently of previous slots, to approximate sparse arrival patterns in systems like queueing at a service counter. A key feature of the Bernoulli process is the independence of trials, meaning the outcome of any single trial does not influence or depend on the results of others, ensuring that the probability p remains constant regardless of prior events. This independence underpins the process's reliability in modeling scenarios where past occurrences provide no predictive information about future ones. The process exhibits stationarity, as the marginal distribution of each trial is identical (always Bernoulli with parameter p), implying that the probabilistic structure does not shift over time or across trials. This stationary property makes the Bernoulli process suitable for long sequences where consistent behavior is expected.

Formal Definition

Probability Space

The probability space underlying the Bernoulli process, in a measure-theoretic treatment, consists of a sample space \Omega = \{0,1\}^{\mathbb{N}}, which is the set of all infinite sequences of binary outcomes, where each coordinate represents the result of a single trial (success or failure). This space models the infinite sequence of independent trials characteristic of the process. The measurable structure is provided by the product \sigma-algebra \mathcal{F} on \Omega, generated by the cylinder sets, which are events depending on only finitely many coordinates. Specifically, for any finite n \in \mathbb{N} and subset A \subseteq \{0,1\}^n, the cylinder set C_{n,A} = \{\omega \in \Omega : (\omega_1, \dots, \omega_n) \in A\} belongs to the generating algebra, and \mathcal{F} = \sigma(\{C_{n,A} : n \in \mathbb{N}, A \subseteq \{0,1\}^n\}). This \sigma-algebra ensures that events of interest, such as those based on partial observations, are measurable. The product \sigma-algebra coincides with the Borel \sigma-algebra generated by the product topology on \Omega. The natural filtration \{\mathcal{F}_n\}_{n \in \mathbb{N}} is defined by \mathcal{F}_n = \sigma(X_1, \dots, X_n), where X_k: \Omega \to \{0,1\} is the coordinate projection X_k(\omega) = \omega_k for each k \in \mathbb{N}, representing the outcome of the k-th trial as a random variable. These \sigma-algebras form an increasing sequence \mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \cdots \subseteq \mathcal{F}, capturing the information revealed progressively by the trials.

Bernoulli Measure

The Bernoulli measure \mu_p, for 0 \leq p \leq 1, is the infinite product probability measure on the product space \{0,1\}^{\mathbb{N}} equipped with the product \sigma-algebra, where each marginal measure on the n-th coordinate is the Bernoulli measure \nu_p satisfying \nu_p(\{1\}) = p and \nu_p(\{0\}) = 1-p. This construction ensures that the coordinate projections X_n: \{0,1\}^{\mathbb{N}} \to \{0,1\} form a sequence of independent and identically distributed Bernoulli(p) random variables under \mu_p. The existence of \mu_p follows from the Kolmogorov extension theorem applied to the consistent family of finite-dimensional distributions given by the product of \nu_p on finite cylinders. For a cylinder set C defined by fixing the first n coordinates to a specific string (x_1, \dots, x_n) \in \{0,1\}^n with exactly k ones (successes), the measure is \mu_p(C) = p^k (1-p)^{n-k}. More generally, for a cylinder set depending on any finite collection of coordinates \{i_1, \dots, i_m\} fixed to values (a_1, \dots, a_m) with \sum_j a_j = k' ones, \mu_p(C) = p^{k'} (1-p)^{m - k'}. This measure extends uniquely to the full \sigma-algebra by Carathéodory's extension theorem, as the finite-dimensional distributions are consistent under the independent, identical-distribution assumption. The Bernoulli measure \mu_p is a probability measure, satisfying \mu_p(\{0,1\}^{\mathbb{N}}) = 1, since each \nu_p is a probability measure and the infinite product preserves this property. It is atomless when 0 < p < 1, as every singleton \{x\} for x \in \{0,1\}^{\mathbb{N}} has measure zero, being the intersection of nested cylinders whose measures approach zero (product of infinitely many factors each at most \max(p, 1-p) < 1). Additionally, \mu_p is invariant under the one-sided shift transformation T: \{0,1\}^{\mathbb{N}} \to \{0,1\}^{\mathbb{N}} defined by T(x_1, x_2, \dots) = (x_2, x_3, \dots), meaning \mu_p(T^{-1}(A)) = \mu_p(A) for all measurable A, due to the identical distribution of coordinates.
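
The cylinder-set formula is easy to evaluate directly. The sketch below (illustrative, not from the original text; it uses Python's fractions module for exact arithmetic) computes \mu_p(C) for a fixed initial pattern and verifies that the cylinders of a given length partition the space, so their measures sum to 1.

```python
# A minimal sketch of the cylinder-set measure mu_p(C) = p^k (1-p)^(n-k).
from fractions import Fraction
from itertools import product

def cylinder_measure(pattern, p):
    """mu_p of the cylinder fixing the first len(pattern) coordinates,
    where k = number of 1s in the pattern."""
    k = sum(pattern)
    n = len(pattern)
    return p**k * (1 - p)**(n - k)

p = Fraction(1, 3)
print(cylinder_measure((1, 0, 1, 1), p))   # p^3 (1-p)^1 = 2/81

# The 2^4 cylinders of length 4 partition {0,1}^N, so measures sum to 1.
total = sum(cylinder_measure(w, p) for w in product((0, 1), repeat=4))
print(total)                               # 1
```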

Key Properties

Law of Large Numbers

In the context of a Bernoulli process \{X_i\}_{i=1}^\infty consisting of independent Bernoulli random variables with success probability p \in (0,1), the law of large numbers describes the convergence of the sample mean \bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i to the expected value E[X_i] = p. The weak law of large numbers states that for every \epsilon > 0, P(|\bar{X}_n - p| \geq \epsilon) \to 0 as n \to \infty. This result was originally established by Jacob Bernoulli specifically for sequences of Bernoulli trials in his seminal 1713 treatise Ars Conjectandi. A straightforward proof of the weak law relies on Chebyshev's inequality and the independence of the X_i. Since \mathrm{Var}(X_i) = p(1-p) for each i, it follows that \mathrm{Var}(\bar{X}_n) = p(1-p)/n, so P(|\bar{X}_n - p| \geq \epsilon) \leq \frac{\mathrm{Var}(\bar{X}_n)}{\epsilon^2} = \frac{p(1-p)}{n \epsilon^2} \to 0 as n \to \infty. The strong law of large numbers provides a sharper almost-sure convergence: \bar{X}_n \to p almost surely as n \to \infty. This general form for independent identically distributed random variables with finite mean was proved by Andrey Kolmogorov in 1930. For the Bernoulli process, the proof exploits the independence, identical distribution, and boundedness (0 \leq X_i \leq 1) of the variables to apply Kolmogorov's theorem directly; a common sketch uses truncation of the variables to bound moments and invokes the Borel–Cantelli lemma to show that large deviations occur only finitely often with probability 1. This almost-sure convergence implies that the long-run proportion of successes in the process equals p with probability 1, justifying the intuitive notion of the law of averages in repeated Bernoulli trials. The diminishing variance \mathrm{Var}(\bar{X}_n) = p(1-p)/n \to 0 as n \to \infty further underscores the increasing concentration of \bar{X}_n around p.
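
To see Chebyshev's bound from the proof sketch at work, the following illustrative Python experiment (parameter values are arbitrary choices, not from the original text) estimates P(|\bar{X}_n - p| \geq \epsilon) by simulation and compares it with the bound p(1-p)/(n\epsilon^2).

```python
# A minimal sketch comparing the empirical deviation probability
# with the Chebyshev upper bound, assuming NumPy is available.
import numpy as np

rng = np.random.default_rng(1)
p, eps, reps = 0.25, 0.05, 5_000

for n in (100, 1_000, 10_000):
    xbar = rng.binomial(n, p, size=reps) / n      # sample means over n trials
    empirical = np.mean(np.abs(xbar - p) >= eps)  # estimate of P(|Xbar_n - p| >= eps)
    chebyshev = p * (1 - p) / (n * eps**2)        # Chebyshev upper bound
    print(f"n={n:6d}  empirical={empirical:.4f}  Chebyshev bound={min(chebyshev, 1):.4f}")
```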

Binomial Distribution

In the context of a Bernoulli process, which consists of a sequence of independent Bernoulli trials each with success probability p, the cumulative number of successes after n trials is a key random variable. Let X_i denote the outcome of the i-th trial, where X_i = 1 for success and X_i = 0 for failure. The sum S_n = \sum_{i=1}^n X_i represents the total number of successes in these n trials and follows a binomial distribution, denoted S_n \sim \operatorname{Bin}(n, p). This distribution was first systematically derived by Jacob Bernoulli in his seminal work Ars Conjectandi. The probability mass function of the binomial distribution gives the probability of exactly k successes in n trials as P(S_n = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \dots, n, where \binom{n}{k} = \frac{n!}{k!(n-k)!} is the binomial coefficient, counting the number of ways to choose k successes out of n trials. This formula arises directly from the independence of the trials and the multiplicative probability structure of the process. The binomial random variable S_n has expected value E[S_n] = np and variance \operatorname{Var}(S_n) = np(1-p), which follow from the linearity of expectation and the independence of the X_i, since E[X_i] = p and \operatorname{Var}(X_i) = p(1-p). These moments quantify the average number of successes and the spread around that average, scaling linearly with the number of trials n.
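
A minimal Python check of these formulas (illustrative, standard library only) computes the pmf with math.comb and confirms that it sums to 1 and reproduces the moments np and np(1-p).

```python
# A minimal sketch of the binomial pmf and its moments.
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(S_n = k) = C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.4   # illustrative parameters
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
mean = sum(k * q for k, q in enumerate(pmf))
var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))

print(f"sum of pmf = {sum(pmf):.6f}")                           # 1.000000
print(f"mean = {mean:.4f}  (np = {n * p})")                     # 4.0
print(f"var  = {var:.4f}  (np(1-p) = {n * p * (1 - p):.2f})")   # 2.40
```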

Central Limit Theorem

In the Bernoulli process, consisting of independent Bernoulli trials with success probability p \in (0,1), the partial sum S_n = \sum_{i=1}^n X_i follows a binomial distribution with parameters n and p. The central limit theorem (CLT) asserts that the normalized version of this sum converges in distribution to a standard normal as the number of trials grows large. Specifically, \frac{S_n - n p}{\sqrt{n p (1-p)}} \xrightarrow{d} \mathcal{N}(0,1) as n \to \infty, where \xrightarrow{d} denotes convergence in distribution. This result follows from the more general CLT for independent, identically distributed random variables with finite variance, since each X_i has mean p and variance p(1-p). A standard proof employs characteristic functions, exploiting the independence of the trials. Consider the centered and scaled variables Y_i = \frac{X_i - p}{\sqrt{p(1-p)}}, which are i.i.d. with mean 0 and variance 1. The characteristic function of Y_i is \phi_Y(t) = \exp\left( -i t \frac{p}{\sqrt{p(1-p)}} \right) \phi\left( \frac{t}{\sqrt{p(1-p)}} \right), where \phi(t) = (1-p) + p e^{it} is the characteristic function of X_i. The characteristic function of the normalized sum Z_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n Y_i is \left[ \phi_Y\left( \frac{t}{\sqrt{n}} \right) \right]^n. For small u = t / \sqrt{n}, the cumulant expansion gives \log \phi_Y(u) = -\frac{1}{2} u^2 + o(u^2), since the first cumulant (the mean) is zero and the second is 1. Thus, n \log \phi_Y\left( \frac{t}{\sqrt{n}} \right) \to -\frac{t^2}{2}, and by Lévy's continuity theorem for characteristic functions, Z_n \xrightarrow{d} \mathcal{N}(0,1). Alternatively, the Lindeberg condition can be verified directly, as the variables have bounded support and thus satisfy the necessary uniformity in their contributions to the sum's variance. This asymptotic normality has significant implications for statistical inference in the Bernoulli process. For large n, the distribution of S_n can be approximated by a normal distribution with mean n p and variance n p (1-p), enabling the construction of approximate confidence intervals for the unknown parameter p. A common (1-\alpha)-level interval is given by \hat{p} \pm z_{\alpha/2} \sqrt{ \frac{\hat{p}(1-\hat{p})}{n} }, where \hat{p} = S_n / n is the sample proportion and z_{\alpha/2} is the upper \alpha/2-quantile of the standard normal distribution; this approximation improves as n increases, provided n p (1-p) is sufficiently large (typically at least 5 or 10). The rate of convergence in the CLT for the Bernoulli process is quantified by the Berry–Esseen theorem, which provides an explicit bound on the Kolmogorov distance between the distribution of the normalized sum and the standard normal. For the i.i.d. Bernoulli case, the error is bounded by C \frac{p^2 + (1-p)^2}{\sqrt{n p (1-p)}}, where C is an absolute constant, approximately 0.4097 at best, yielding an overall error of order O(1/\sqrt{n}). This bound confirms that the normal approximation becomes reliable for moderately large n, with the precise rate depending on p; it is tightest when p = 1/2 and worsens near the boundaries p \approx 0 or p \approx 1.
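
The confidence-interval recipe can be tested empirically. The sketch below (a simulation under assumed parameters p = 0.3 and n = 400, chosen for illustration) estimates the coverage of the nominal 95% interval \hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n} over many replications.

```python
# A minimal coverage check for the normal-approximation interval,
# assuming NumPy is available.
import numpy as np

rng = np.random.default_rng(2)
p_true, n, reps, z = 0.3, 400, 10_000, 1.959964   # z_{0.025} for a 95% interval

s = rng.binomial(n, p_true, size=reps)            # S_n for many experiments
p_hat = s / n
half = z * np.sqrt(p_hat * (1 - p_hat) / n)       # interval half-width
covered = np.mean((p_hat - half <= p_true) & (p_true <= p_hat + half))
print(f"empirical coverage of nominal 95% interval: {covered:.3f}")
# Here n p (1-p) = 84, well above the rule-of-thumb threshold, so coverage
# should be close to 0.95.
```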

Dynamical Systems Connections

Bernoulli Shift

The Bernoulli shift, also known as the Bernoulli map or shift transformation, is a fundamental example of a measure-preserving dynamical system arising from the Bernoulli process. It is defined on the sequence space \{0,1\}^\mathbb{N}, consisting of all infinite sequences \omega = (\omega_1, \omega_2, \dots) where each \omega_i \in \{0,1\}. The shift map \sigma: \{0,1\}^\mathbb{N} \to \{0,1\}^\mathbb{N} acts by \sigma(\omega_1, \omega_2, \dots) = (\omega_2, \omega_3, \dots), effectively discarding the first coordinate and shifting the sequence to the left. This transformation preserves the Bernoulli measure \mu_p, which is the infinite product measure on \{0,1\}^\mathbb{N} with marginal probability p for the coordinate being 1 (and 1-p for 0), assuming 0 < p < 1. Specifically, for any measurable set A, \mu_p(\sigma^{-1}(A)) = \mu_p(A), as the shift aligns with the independent structure of the product measure, mapping cylinder sets to equivalent-measure sets. The Bernoulli shift is ergodic with respect to \mu_p, meaning that the only invariant sets under \sigma have measure 0 or 1. Consequently, for any integrable function f on \{0,1\}^\mathbb{N}, the time average \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(\sigma^k \omega) equals the space average \int f \, d\mu_p for \mu_p-almost every \omega. This ergodicity reflects the irreducibility of the dynamics, where orbits densely explore the space without fixed subsystems of positive measure. Furthermore, the Bernoulli shift exhibits the strong mixing property, a stronger form of ergodicity where \lim_{n \to \infty} \mu_p(\sigma^{-n}(A) \cap B) = \mu_p(A) \mu_p(B) for all measurable sets A, B. For cylinder sets the decorrelation is immediate: once n exceeds the number of coordinates on which A depends, \sigma^{-n}(A) and B depend on disjoint coordinates, so \mu_p(\sigma^{-n}(A) \cap B) = \mu_p(A) \mu_p(B) exactly. This rapid decorrelation underscores the chaotic, independent nature of the underlying Bernoulli trials.
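
The mixing identity can be checked numerically for cylinder sets. In the illustrative sketch below (not from the original text), A = B = \{\omega : \omega_1 = 1\}; for n \geq 1 the events \sigma^{-n}(A) and B depend on disjoint coordinates, so the joint measure equals the product \mu_p(A)\mu_p(B).

```python
# A minimal Monte Carlo check of shift mixing on cylinder sets,
# assuming NumPy is available.
import numpy as np

rng = np.random.default_rng(3)
p, length, reps = 0.5, 60, 200_000
omega = rng.random((reps, length)) < p     # sample paths; column j is coordinate j+1

def A_shifted(n):
    """Indicator of sigma^{-n}(A) = {omega_{n+1} = 1}, as a boolean column."""
    return omega[:, n]

B = omega[:, 0]                            # indicator of B = {omega_1 = 1}

for n in (0, 1, 5, 20):
    joint = np.mean(A_shifted(n) & B)
    print(f"n={n:2d}  mu(shift^-n A and B) ~ {joint:.4f}  vs  mu(A)mu(B) = {p * p:.4f}")
# n = 0 gives mu(A) = 0.5; for n >= 1 the estimate is near 0.25, as predicted.
```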

Symbolic Dynamics Examples

One prominent example of a Bernoulli shift in symbolic dynamics is the doubling map on the unit interval, defined by T(x) = 2x \mod 1 for x \in [0,1). This map is topologically conjugate to the one-sided Bernoulli shift on the space of binary sequences via the binary expansion representation x = 0.\omega_1 \omega_2 \omega_3 \dots, where each \omega_i \in \{0,1\}, and the conjugacy is given by the map h: \{0,1\}^\mathbb{N} \to [0,1) that sends a sequence to its corresponding real number, with the shift \sigma on sequences corresponding to the action of T on expansions (noting that expansions are unique except on a countable set of dyadic rationals). The Lebesgue measure on [0,1) serves as the invariant measure for the doubling map T, and it corresponds under this conjugacy to the Bernoulli measure \mu_{1/2}, which is the infinite product of uniform Bernoulli trials with probability 1/2 on \{0,1\}. This correspondence preserves the ergodic properties of the system, such as mixing, under the conjugacy. Another key example arises with the middle-thirds Cantor set, which serves as the support for a Bernoulli measure with equal probabilities p = 1/2. Points in the Cantor set \mathcal{C} \subset [0,1] admit ternary expansions using only digits 0 and 2, i.e., x = \sum_{k=1}^\infty a_k / 3^k where each a_k \in \{0,2\}, and the shift map acts by deleting the first digit and shifting the remaining sequence, inducing a homeomorphism between \mathcal{C} and the symbolic space \{0,2\}^\mathbb{N}. The corresponding measure on \mathcal{C} is the pushforward of \mu_{1/2} under this representation, making the system measure-theoretically isomorphic to the Bernoulli shift.
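
The conjugacy between the doubling map and the binary shift can be verified on finite expansions. The following sketch (illustrative; exact for dyadic rationals up to floating-point rounding) checks that T \circ h agrees with h \circ \sigma on a sample bit string.

```python
# A minimal sketch of the conjugacy T(h(omega)) = h(sigma(omega)).
def doubling(x: float) -> float:
    """The doubling map T(x) = 2x mod 1 on [0, 1)."""
    return (2 * x) % 1.0

def h(bits) -> float:
    """Conjugacy: send a binary sequence to x = 0.b1 b2 b3 ... in [0, 1)."""
    return sum(b / 2 ** (k + 1) for k, b in enumerate(bits))

bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
x = h(bits)
# Applying T to x matches dropping the first bit (the shift sigma).
print(abs(doubling(x) - h(bits[1:])) < 1e-12)   # True
```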

Odometer Transformation

The odometer transformation, also known as the adding machine, is defined on the p-adic integers X = \mathbb{Z}_p = \lim_{\leftarrow k} \mathbb{Z}/p^k \mathbb{Z} for a fixed prime p, a compact topological group under p-adic addition. Equivalently, elements of X can be represented as x = \sum_{k=0}^\infty d_k p^k with digits d_k \in \{0, 1, \dots, p-1\}, topologized as the product space \{0,1,\dots,p-1\}^\mathbb{N}. The transformation T: X \to X is given by adding 1 (translation by the generator 1), which in the digit representation increments the least significant digit d_0 and propagates the carry to higher digits as needed (e.g., if d_0 = p-1, set d_0 = 0 and add 1 to d_1, etc.), mimicking the rollover of a mechanical odometer. This map preserves the normalized Haar measure \mu on X, the unique translation-invariant probability measure, which in the digit representation corresponds to the infinite product of uniform measures on \{0,1,\dots,p-1\} (each digit value with probability 1/p), i.e., the i.i.d. uniform measure on the digits. Unlike the one-sided Bernoulli shift on (\mathbb{Z}/p\mathbb{Z})^\mathbb{N}, which is a non-invertible left-shift map, the odometer T is bijective because addition by 1 on the group \mathbb{Z}_p admits an inverse (addition by -1), making it an automorphism of the measure space (X, \mu). The system (X, \mu, T) is a rank-one transformation, constructed via cutting and stacking with constant cutting parameter r_n = p and no spacers; as a rank-one system with discrete spectrum, it is ergodic with zero entropy but not mixing. Topologically, T is minimal on X, meaning every orbit is dense, and it admits a unique invariant probability measure \mu, rendering the system strictly ergodic. In ergodic theory, the odometer serves as a canonical example of a strictly ergodic \mathbb{Z}-action on a compact group via translation, with discrete rational spectrum consisting of the p-power roots of unity (eigenvalues e^{2\pi i m / p^k} for integers m, k) and zero entropy, contrasting with the positive-entropy, Lebesgue-spectrum Bernoulli shifts while sharing the underlying symbolic digit structure. This makes it useful for studying spectral invariants and factors in classifications of measure-preserving transformations, such as in constructions of prime systems or extensions with odometer factors.
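
The digit-level description of the odometer translates directly into code. The sketch below (an illustrative finite truncation; the true odometer acts on infinite digit sequences) implements add-with-carry in base p, with the least significant digit first.

```python
# A minimal sketch of the odometer (adding-machine) step on base-p digits.
def odometer_step(digits, p=2):
    """Add 1 to a base-p digit sequence (least significant digit first),
    propagating the carry: a finite truncation of T(x) = x + 1 on Z_p."""
    out = list(digits)
    for i, d in enumerate(out):
        if d < p - 1:
            out[i] = d + 1       # no carry needed past this digit
            return out
        out[i] = 0               # digit rolls over; carry moves to position i+1
    out.append(1)                # overflow only happens for finite truncations
    return out

x = [1, 1, 0, 1]                 # 1 + 1*2 + 0*4 + 1*8 = 11 in base 2
print(odometer_step(x))          # [0, 0, 1, 1] = 12: two digits roll over
```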

Applications and Extensions

Bernoulli Sequences

A Bernoulli sequence is defined as a realization or sample path \omega \in \{0,1\}^\mathbb{N} drawn from the Bernoulli measure \mu_p on the space of infinite sequences, where each coordinate \omega_n is independently 1 with probability p \in (0,1) and 0 with probability 1-p. This measure assigns to the cylinder set determined by an initial segment \sigma of length k the probability \mu_p([\sigma]) = p^{|\sigma|_1} (1-p)^{|\sigma|_0}, where |\sigma|_1 and |\sigma|_0 denote the number of 1s and 0s in \sigma, respectively. Almost all Bernoulli sequences with respect to \mu_p exhibit strong randomness properties. By the strong law of large numbers, the empirical frequency of 1s converges to p, implying that for p = 1/2 the sequences are balanced in the sense of having asymptotically equal frequencies of 0s and 1s. Moreover, these sequences are almost surely non-periodic, as any fixed periodic structure has \mu_p-measure zero under the independence and identical distribution of the coordinates. They are also incompressible, meaning their initial segments cannot be significantly shortened by any algorithmic description, a consequence of satisfying all \mu_p-Martin-Löf tests for randomness. In contrast to deterministic sequences, Bernoulli sequences lack the structured repetition found in periodic binary strings or de Bruijn sequences, which are finite or cyclic constructs designed to include every possible substring of a given length exactly once. While de Bruijn sequences serve applications in coding and testing due to their exhaustive coverage, Bernoulli sequences, being probabilistic realizations, avoid such predictability and instead embody the inherent irregularity of independent trials. The term "Bernoulli sequence" is sometimes used interchangeably with the Bernoulli process itself, referring to the abstract sequence of random variables, but in this context, it specifically emphasizes the pathwise properties of individual realizations \omega.
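
One pathwise property, the equidistribution of finite blocks for p = 1/2, can be observed empirically. The illustrative sketch below (not from the original text) counts length-3 blocks along one simulated path; each of the 8 patterns should appear with frequency near 2^{-3} = 0.125.

```python
# A minimal block-frequency check along a single sample path,
# assuming NumPy is available.
import numpy as np
from collections import Counter

rng = np.random.default_rng(4)
omega = (rng.random(200_000) < 0.5).astype(int)   # one path with p = 1/2

k = 3
blocks = Counter(tuple(map(int, omega[i:i + k])) for i in range(len(omega) - k + 1))
for pattern, count in sorted(blocks.items()):
    print(pattern, f"{count / (len(omega) - k + 1):.4f}")   # each ~ 0.125
```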

Randomness Extraction

In biased Bernoulli processes, where the success probability p \neq 0.5, the generated bits deviate from uniformity, posing challenges for applications that demand high-quality randomness, such as cryptographic key generation and simulations. Randomness extraction techniques address this by transforming sequences of biased bits into nearly uniform random bits, leveraging the underlying independence of the trials while mitigating the bias. The seminal von Neumann extractor, introduced in 1951, provides a simple debiasing method for independent Bernoulli trials. It pairs consecutive bits (X_{2i-1}, X_{2i}): if the pair is (1,0), it outputs 1; if (0,1), it outputs 0; pairs (0,0) and (1,1) are discarded. This yields independent uniform bits (each 0 or 1 with probability 1/2), regardless of p (assuming 0 < p < 1), because the conditional probabilities balance out: P(\text{output 1} \mid \text{output}) = \frac{p(1-p)}{2p(1-p)} = \frac{1}{2}. The probability of producing an output per pair is 2p(1-p), yielding an efficiency of p(1-p) output bits per input bit. To enhance efficiency, especially for long sequences, iterated extensions apply the extractor recursively to the discarded information. Elias's 1972 block-based approach processes variable-length runs of identical bits, encoding them to extract more bits per block and approaching the source entropy h(p) = -p \log_2 p - (1-p) \log_2 (1-p) for large blocks. Peres's 1992 method builds on this by iteratively applying extraction in a hierarchical manner, reusing bits from previous discards to achieve asymptotically optimal rates near h(p) as iterations increase. These iterations improve output yield, for instance approaching 1 bit per input bit when p = 0.5 asymptotically with increasing iterations, compared to the basic extractor's 0.25. The basic von Neumann extractor's efficiency p(1-p) \leq 1/4 limits its practicality when bias is strong (p near 0 or 1), as most pairs are discarded. More advanced alternatives, such as those that explicitly model the source distribution, can extract closer to the full entropy but require knowledge of p or adaptive estimation. In modern cryptographic contexts post-2000, such as quantum random number generators and privacy amplification, sophisticated extractors (e.g., seeded variants) have superseded basic methods for robust, high-entropy output in adversarial settings.
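
The pairing rule is straightforward to implement. The following Python sketch (illustrative; the seed and bias are arbitrary choices) debiases a Bernoulli(0.8) stream and reports the output frequency and yield, which should land near 1/2 and p(1-p) = 0.16 respectively.

```python
# A minimal sketch of the von Neumann debiasing procedure.
import random

def von_neumann_extract(bits):
    """Map pairs (1,0) -> 1 and (0,1) -> 0; discard (0,0) and (1,1).
    Valid for i.i.d. Bernoulli(p) input with 0 < p < 1."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)        # (1,0) -> 1, (0,1) -> 0
    return out

random.seed(0)
p = 0.8                          # heavily biased source
raw = [1 if random.random() < p else 0 for _ in range(100_000)]
ext = von_neumann_extract(raw)

print(f"input frequency of 1s:  {sum(raw) / len(raw):.4f}")    # ~0.80
print(f"output frequency of 1s: {sum(ext) / len(ext):.4f}")    # ~0.50
print(f"yield: {len(ext) / len(raw):.4f} bits/input bit (theory: {p * (1 - p):.2f})")
```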
