In mathematics, a normal number is a real number whose representation in every integer base b \geq 2 exhibits uniform distribution of digits, such that every finite sequence of length k appears with asymptotic frequency exactly 1/b^k.[1] This property, known as absolute normality, ensures that the digits behave as if randomly and independently chosen from the uniform distribution over the alphabet of base b.[2] The concept was introduced by Émile Borel in 1909, who defined it in the context of base expansions and proved that the set of absolutely normal numbers has full Lebesgue measure (i.e., almost every real number in [0,1) is absolutely normal).[1] It remains an open conjecture whether all irrational algebraic numbers are normal.

A weaker notion is normality in base b (or b-normality), where the property holds only for a fixed base b, requiring that blocks of digits of any length appear with the correct frequency in the base-b expansion.[2] Even more restricted is simple normality in base b, where each single digit from 0 to b-1 occurs with frequency 1/b, but longer blocks need not be uniformly distributed.[1] Borel also showed that, for any fixed base, almost all numbers are normal in that base: the set of numbers normal in a specific base b has Lebesgue measure 1, even though it is meager in the Baire category sense (a countable union of nowhere dense sets).[2]

The first explicit construction of an absolutely normal number was given by Wacław Sierpiński in 1917, through a limiting construction ensuring the required digit frequencies.[1] A famous example is the Champernowne constant in base 10, formed by concatenating the positive integers (0.123456789101112...), which was proven normal in base 10 by D. G. Champernowne in 1933; whether it is normal in any other base remains open.[1] Other constructed normal numbers include the Copeland–Erdős constant (the concatenation of the primes, normal in base 10) and certain Liouville numbers that are computable and absolutely normal.[2]

Despite these constructions, determining whether specific irrational constants like \pi, e, or \sqrt{2} are normal remains an open problem, though statistical evidence from computed digits suggests they likely are.[3] Normality has connections to ergodic theory, uniform distribution modulo 1, and Diophantine approximation; for instance, a number \alpha is b-normal if and only if the sequence \{ b^n \alpha \} of fractional parts is uniformly distributed modulo 1.[1] Key results include Wolfgang Schmidt's 1960 theorem establishing that normality in multiplicatively dependent bases is equivalent, while for multiplicatively independent bases there exist numbers normal in one but not the other.[1]
Definitions
Single-base normality
A real number x whose base-b expansion is non-terminating (as is the case for irrational numbers) is said to be normal in base b (where b \geq 2 is an integer) if, for every positive integer k \geq 1 and every possible sequence of k digits from the alphabet \{0, 1, \dots, b-1\}, the limiting relative frequency of that sequence's occurrences in the base-b expansion of x equals 1/b^k.[4][5] This concept was introduced by Émile Borel in 1909 to formalize a notion of statistical randomness in digit expansions.[6]

Formally, let \xi_1 \xi_2 \xi_3 \dots denote the base-b expansion of x \in [0,1), where each \xi_i \in \{0, 1, \dots, b-1\}. For a fixed k and a specific sequence s = (s_1, \dots, s_k), the number of occurrences of s in the first N digits is the count of starting positions m such that 1 \leq m \leq N - k + 1 and \xi_m \xi_{m+1} \dots \xi_{m+k-1} = s. Then, x is normal in base b if

\lim_{N \to \infty} \frac{1}{N - k + 1} \times (\text{number of occurrences of } s \text{ in the first } N \text{ digits}) = \frac{1}{b^k}

holds for every k \geq 1 and every such sequence s.[4][7]

For the simplest case of k = 1, this reduces to the condition that each single digit from 0 to b-1 appears with asymptotic density 1/b in the expansion, a property known as simple normality in base b.[5] Normality in a single base b is a special case of the stronger notion of absolute normality, which requires the property to hold simultaneously in every integer base b \geq 2.[7]
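The limiting frequency in this definition can be checked empirically on a finite prefix of an expansion. The following Python sketch (illustrative only; the helper name and the choice of a Champernowne prefix are this example's own) tallies all overlapping blocks of length k among the first N digits and compares the result with the target 1/b^k:

```python
from itertools import product

def block_frequencies(digits: str, k: int, base: int = 10):
    """Empirical frequency of every length-k block among the
    N - k + 1 overlapping windows of a digit string (base <= 10)."""
    n_windows = len(digits) - k + 1
    counts = {''.join(p): 0 for p in product('0123456789'[:base], repeat=k)}
    for m in range(n_windows):
        counts[digits[m:m + k]] += 1
    return {s: c / n_windows for s, c in counts.items()}

# Prefix of the base-10 Champernowne expansion 0.123456789101112...
champernowne = ''.join(str(n) for n in range(1, 30000))[:100000]

freqs = block_frequencies(champernowne, k=2)
target = 1 / 100  # 1/b^k with b = 10, k = 2
worst = max(abs(f - target) for f in freqs.values())
print(f"max deviation from 1/100 over all 2-digit blocks: {worst:.5f}")
```

For a normal number such as the Champernowne constant the deviations shrink as the prefix grows, whereas for an eventually periodic (rational) expansion they do not.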
Absolute normality
A real number is said to be absolutely normal if it is normal in every integer base b \geq 2.[8] This condition requires that the base-b expansion of the number satisfies the frequency uniformity of digits and blocks for all such bases simultaneously, extending the notion of normality beyond a single base.[4]

The concept of absolute normality builds on Émile Borel's foundational work of 1909, in which he introduced normal numbers in a fixed base and proved their existence with full measure, laying the groundwork for multi-base generalizations; the term "absolutely normal" originates from this work.[9] Wacław Sierpiński in 1917 provided the first explicit construction of such a number, through a limiting method ensuring digit uniformity across all bases, demonstrating that at least one exists without relying solely on probabilistic arguments.[10]

Absolute normality implies normality in any specific base b \geq 2, since the condition encompasses all bases, but the converse fails: by Schmidt's theorem, for multiplicatively independent bases there exist numbers normal in one base but not the other.[11] In terms of prevalence, Borel established that almost all real numbers in the unit interval are absolutely normal with respect to Lebesgue measure, meaning the set of non-absolutely-normal numbers has measure zero.[9] Despite this ubiquity, no simple explicit example of an absolutely normal number, such as a constant defined by a closed-form expression, is known; all known examples arise from intricate constructive proofs.[12]
Fundamental Properties
Frequency conditions
A normal number in base b satisfies the frequency condition that, for every positive integer k, each possible block of k digits from the alphabet \{0, 1, \dots, b-1\} appears in the base-b expansion with asymptotic frequency exactly b^{-k}.[7] This equiprobability ensures that single digits (k = 1) each occur with frequency 1/b, pairs (k = 2) with frequency 1/b^2, and so on for longer blocks, reflecting a balanced distribution across all finite sequences.[4]

This block frequency property implies that the digits in the expansion behave statistically as if they were independently and uniformly drawn from \{0, 1, \dots, b-1\}, with the limiting frequencies of joint occurrences matching those of the independent uniform product measure.[6] For instance, the frequency of a specific block of length k is precisely the product of the individual digit probabilities under uniformity (in base 10, the block 42 has frequency 1/100 = (1/10)(1/10)), underscoring the randomness-like quality of normal expansions.[13]

Émile Borel established in 1909 that almost every real number in the unit interval is normal in base b, meaning the set of non-normal numbers has Lebesgue measure zero; this result extends to absolute normality across all bases b \geq 2 and is proven using measure-theoretic arguments on the probabilities of digit sequences.[4] Borel's theorem demonstrates that normal numbers are generic in the measure-theoretic sense, though it provides no explicit examples beyond probabilistic existence.[1]

Despite this abundance, verifying normality for specific numbers remains challenging, as it requires controlling the asymptotic frequencies of the entire infinite expansion, and no natural constant like \pi or e has been proven normal; all known proofs concern explicitly constructed numbers rather than naturally occurring transcendental or algebraic irrationals.[14]
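Borel's measure-theoretic argument admits an elementary illustration: digits drawn independently and uniformly at random satisfy the block-frequency condition with probability one, by the law of large numbers. A minimal Python simulation (the block 042 and the sample size are arbitrary choices of this example):

```python
import random

# Digits drawn independently and uniformly from {0,...,9} produce block
# frequencies near 1/b^k, the behaviour normality demands of an expansion.
random.seed(1)
digits = ''.join(random.choice('0123456789') for _ in range(200_000))

k, target = 3, 10 ** -3
windows = len(digits) - k + 1
count = sum(digits[m:m + k] == '042' for m in range(windows))
print(f"frequency of block 042: {count / windows:.6f} (target {target})")
```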
Disjunctive property
A number x in [0, 1) is said to be disjunctive in base b (where b \geq 2 is an integer) if its base-b expansion contains every possible finite sequence of digits from the alphabet \{0, 1, \dots, b-1\} as a contiguous substring at least once.[15] Equivalently, the infinite sequence of digits in the expansion is a disjunctive sequence, meaning its set of factors (subwords) includes all finite words over the given alphabet.[15] This property emphasizes the exhaustive presence of patterns without regard to their relative frequencies.

Disjunctivity is a weaker condition than normality: every normal number in base b is disjunctive in that base, but the converse does not hold, as a disjunctive sequence may contain all finite patterns without adhering to the required asymptotic frequencies of 1/b^k for sequences of length k.[15][16] The implication from normality to disjunctivity follows directly from the frequency condition for normal numbers. Specifically, for any fixed finite sequence s of length k, the proportion of occurrences of s in the first N digits approaches 1/b^k > 0 as N \to \infty, so the number of occurrences grows without bound and s must occur at least once (in fact, infinitely often).[16]

This disjunctive property renders the base-b expansions of normal numbers universal in the sense that they embed every conceivable finite digit string, serving as a complete "library" of all possible patterns over the alphabet.[15] For instance, the Champernowne constant, constructed by concatenating all positive integers in base 10, is normal and hence disjunctive, containing, for example, any finite initial segment of the decimal expansion of any real number as a substring.[16]
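A finite prefix can only witness disjunctivity up to a fixed block length, but the effect is easy to observe. The sketch below (in Python, with an arbitrarily chosen prefix length) confirms that every 3-digit block already occurs in a modest prefix of the Champernowne expansion:

```python
from itertools import product

# Check that every 3-digit block occurs in a finite prefix of the
# base-10 Champernowne expansion -- a finite glimpse of disjunctivity.
prefix = ''.join(str(n) for n in range(1, 5000))  # digits of 1 2 3 ... 4999

missing = [''.join(p) for p in product('0123456789', repeat=3)
           if ''.join(p) not in prefix]
print(f"3-digit blocks absent from the first {len(prefix)} digits: {len(missing)}")
```

The count printed is 0: every block 000 through 999 already appears, either as a 3-digit integer or inside a longer one (e.g. 007 inside 1007).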
Examples and Constructions
Explicit normal numbers
One of the earliest explicit constructions of an absolutely normal number is due to Wacław Sierpiński in 1917. Rather than writing down a digit expansion directly, Sierpiński specified the number as the infimum of an explicitly defined set of reals, obtained through a limiting covering argument that excludes numbers failing the required block frequencies in some base.[1]

Another early explicit construction is the Champernowne constant C_b in base b \geq 2, formed by the concatenation of the base-b representations of the positive integers in ascending order. For base 10, this yields the decimal expansion 0.123456789101112131415\dots. D. G. Champernowne proved in 1933 that C_b is normal in base b.[17]

A related construction is the Copeland–Erdős constant, obtained by concatenating the base-b representations of the prime numbers in order. In base 10, it begins 0.235711131719232931\dots. A. H. Copeland and P. Erdős established in 1946 that this constant is normal in base 10, with the proof relying on the density of the primes to ensure balanced digit frequencies.

Additional algorithmic constructions generate normal numbers by engineering digit streams to satisfy exact frequency conditions for all block lengths. For instance, de Bruijn sequences, which cyclically include every possible substring of length n over an alphabet of size b exactly once, can be adapted recursively via Eulerian paths in de Bruijn digraphs to build expansions where subblocks of increasing lengths appear with uniform asymptotic density (see the sketch below). This approach, which avoids the redundancy of concatenative methods like Champernowne's, produces a class of b-adic normal numbers.[18]

Generalizations using Turing machines link such constructions to algorithmic randomness, where computable sequences are designed to mimic the frequency properties of random strings. In the 1980s, work on program-length complexity and sophistication provided frameworks for generating contrived yet provably normal numbers through universal Turing machine simulations, emphasizing that normality criteria are computable enough to be built into a construction.
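As a concrete illustration of the de Bruijn ingredient, the following Python sketch generates the sequence B(b, n) by the standard Lyndon-word (Fredricksen–Kessler–Maiorana) method; the normal-number constructions referenced above concatenate such blocks for increasing n, which this sketch does not attempt:

```python
def de_bruijn(b: int, n: int) -> str:
    """de Bruijn sequence B(b, n): a cyclic string of length b**n in which
    every length-n word over {0,...,b-1} occurs exactly once (b <= 10)."""
    alphabet = ''.join(str(d) for d in range(b))
    a = [0] * (b * n)
    sequence = []

    def db(t: int, p: int) -> None:
        # Recursively enumerate Lyndon words whose length divides n.
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, b):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return ''.join(alphabet[i] for i in sequence)

# Every 3-digit binary word occurs exactly once cyclically in B(2, 3):
print(de_bruijn(2, 3))  # prints "00010111"
```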
Non-normal numbers
All rational numbers fail to be normal in any integer base b \geq 2, as their base-b expansions are either terminating (equivalent to eventually repeating the digit 0) or eventually periodic.[19] This periodicity implies that the digits are not equidistributed, with certain blocks repeating indefinitely and others entirely absent, violating the frequency condition for normality.[19] For instance, in base 10, \frac{1}{3} = 0.\overline{3}, where only the digit 3 appears after the decimal point, so the frequencies of digits 0, 1, 2, and 4 through 9 are all zero.[19]

Even among irrational numbers, non-normality arises when expansions exhibit biases, such as limited digit usage or disproportionate frequencies. Liouville numbers, a class of transcendental irrationals constructed to satisfy strong Diophantine approximation properties, provide examples where some members fail the frequency conditions due to engineered digit patterns that favor certain values.[20] For instance, specific Liouville numbers can be built with long strings of repeated digits (such as runs of b-1 in base b), ensuring that the base-b expansion is not even simply normal, let alone normal.[21]

Another straightforward construction is an irrational whose base-10 expansion uses only the digits 0 and 1, such as \sum_{k=1}^\infty 10^{-k(k+1)/2} = 0.101001000100001\dots, where the 1s are separated by increasing runs of zeros (1 zero, then 2, then 3, etc.).[21] Here, digits 2 through 9 never appear, so their frequencies are zero, precluding normality.[21] More generally, Stoneham-type numbers such as \sum_{j=1}^\infty b^{-c^j} (for coprime integers b, c \geq 2) produce irrationals with extensive blocks of zeros in base bc, biasing the digit distribution.[21]

Despite their prevalence among explicit constructions, the set of all non-normal numbers (in a fixed base) has Lebesgue measure zero in the unit interval.[4] This follows from Borel's 1909 theorem, which establishes that almost every real number (in the measure-theoretic sense) is normal in every integer base b \geq 2.[4]
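The zero-one example above can be generated exactly with rational arithmetic, making the digit bias visible. A short Python sketch (the number of series terms and of digits are arbitrary cutoffs for this illustration):

```python
from collections import Counter
from fractions import Fraction

# Digits of sum_{k>=1} 10^{-k(k+1)/2} = 0.101001000100001..., generated
# exactly with rational arithmetic; digits 2-9 never occur, so the number
# cannot be simply normal (hence not normal) in base 10.
x = sum(Fraction(1, 10 ** (k * (k + 1) // 2)) for k in range(1, 25))

digits = []
for _ in range(200):          # first 200 decimal digits
    x *= 10
    d = int(x)                # integer part is the next digit
    digits.append(d)
    x -= d

print(Counter(digits))        # Counter({0: 181, 1: 19})
```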
Theoretical Connections
Finite-state machines
The concept of normality in the digit expansion of a real number can be characterized through its resistance to compression by finite-state machines, computational devices with a fixed number of internal states that process input sequences deterministically or non-deterministically. Specifically, a number is normal in base b if and only if its infinite digit sequence cannot be compressed by any lossless finite-state compressor below the trivial rate of one base-b symbol per input digit, meaning the sequence exhibits maximal information content per symbol under finite-state processing. This incompressibility implies that no finite-state automaton can predict or encode the sequence more efficiently than simply copying it, as any attempt to exploit patterns fails against the uniform frequency of all finite substrings.

A prominent example of a dictionary-based compression method is the Ziv-Lempel algorithm (LZ78), which builds a dictionary of previously seen phrases to encode repetitions. Normal sequences resist significant compression by such methods, achieving compression ratios close to 1, whereas non-normal sequences, such as periodic ones, compress efficiently because their repetitive structure allows the dictionary to represent large portions with short codes. For instance, the Champernowne constant, known to be normal in base 10, remains largely incompressible under LZ78, highlighting how normality enforces randomness against dictionary-based finite-state strategies.[22][23]

The connection between normality and finite-state incompressibility emerged from developments in algorithmic information theory during the 1960s, with foundational work by Andrey Kolmogorov, Ray Solomonoff, and Gregory Chaitin on program-size complexity, later extended to infinite sequences by Per Martin-Löf, Claus-Peter Schnorr, and others studying computational notions of randomness. A seminal result was proved by Claus-Peter Schnorr and Harald Stimm in 1972, establishing that a sequence is normal if and only if no finite-state martingale (a betting strategy computable by a finite automaton) can succeed on it with unbounded capital growth, providing an automata-theoretic characterization equivalent to the frequency condition for normality. This theorem underscores the discrete, computational perspective on normality, distinct from continuous dynamical views like equidistribution.[4][24]

The relationship with Kolmogorov complexity is more delicate: normality does not force high program-size complexity, since computable normal numbers such as Champernowne's have prefixes of length N describable by programs of length O(\log N), far below the maximum of roughly N \log_2 b bits attained by algorithmically random sequences. Normality is instead the finite-state analogue of maximal complexity: a sequence is normal exactly when its finite-state dimension equals 1, so any finite-state compressor achieving a rate below one symbol per digit on it would contradict normality.
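The compression contrast can be made concrete with a toy phrase count for LZ78, where fewer parsed phrases indicate more exploitable repetition. A minimal Python sketch (the phrase count stands in for compressed size; the two prefixes compared are arbitrary choices):

```python
def lz78_phrase_count(s: str) -> int:
    """Number of phrases in the LZ78 parsing of s; fewer phrases
    mean the dictionary captures more repetition (better compression)."""
    dictionary, phrase, count = set(), '', 0
    for ch in s:
        phrase += ch
        if phrase not in dictionary:
            dictionary.add(phrase)
            count += 1
            phrase = ''
    return count + (1 if phrase else 0)

champernowne = ''.join(str(n) for n in range(1, 30000))[:100000]
periodic = '142857' * (100000 // 6)   # digits of 1/7, eventually periodic

# The periodic string parses into far fewer (hence longer) phrases.
print("Champernowne phrases:", lz78_phrase_count(champernowne))
print("periodic phrases:    ", lz78_phrase_count(periodic))
```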
Equidistribution and ergodic theory
A real number \alpha \in [0,1) is normal in base b \geq 2 if and only if the sequence of fractional parts \{b^n \alpha\}, where \{\cdot\} denotes the fractional part, is equidistributed in the unit interval [0,1) with respect to Lebesgue measure.[25] This equidistribution ensures that the base-b digits of \alpha appear with the correct asymptotic frequencies, as the position of b^n \alpha \bmod 1 determines the digits from position n + 1 onward.

Equidistribution of the sequence \{b^n \alpha\} can be characterized using Weyl's criterion, which states that the sequence is equidistributed modulo 1 if and only if

\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^N e^{2\pi i m b^n \alpha} = 0

for every integer m \neq 0.[25] This exponential sum condition provides a Fourier-analytic tool for verifying normality by bounding discrepancies in the distribution.

In ergodic theory, normality connects to the dynamics of the base-b shift map on the symbolic space \{0,1,\dots,b-1\}^\mathbb{N} equipped with the product Bernoulli measure assigning equal probability 1/b to each digit. This shift is ergodic, and a sequence is normal in base b precisely if its orbit under the shift is generic for this measure, meaning Birkhoff averages of cylinder functions converge to their space integrals.[26] Equivalently, on the interval [0,1), the map T_b(x) = \{b x\} preserves Lebesgue measure and is ergodic, so by the Birkhoff ergodic theorem almost every \alpha has an equidistributed orbit under T_b and hence is normal in base b.

For non-integer bases \beta > 1, the concept extends via \beta-expansions generated by the \beta-transformation T_\beta(x) = \{\beta x\}, which admits an absolutely continuous invariant measure and is ergodic. Thus, Lebesgue-almost every x \in [0,1) is \beta-normal, meaning blocks of digits in its greedy \beta-expansion occur with the frequencies dictated by the Parry measure.

Normality also arises in continued fraction expansions through the ergodic Gauss map G(x) = \{1/x\} on (0,1), which preserves the Gauss measure \mu(dx) = dx/(\log 2 \, (1+x)) and is ergodic. A number is continued fraction normal if blocks of partial quotients appear with the frequencies given by \mu; almost every x satisfies this by the ergodic theorem. Explicit constructions of numbers that are both continued fraction normal and absolutely normal, leveraging Champernowne-like sequences adapted to the Gauss dynamics, were provided by Becher and Yuhjtman in 2019.[27]
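The Weyl sum can be estimated numerically from a digit string, using the observation that \{b^n \alpha\} is the number whose base-b digits are those of \alpha shifted left by n places. A rough Python sketch for the base-10 Champernowne constant (the truncation precision PREC and the parameters N and m are arbitrary choices of this example):

```python
import cmath

# Estimate (1/N) * sum_{n=1}^{N} exp(2*pi*i * m * {10^n * alpha}) for the
# base-10 Champernowne constant alpha, reading each fractional part
# directly off the digit string instead of computing huge powers of 10.
digits = ''.join(str(n) for n in range(1, 5000))  # Champernowne digits
PREC = 30  # digits kept when approximating each fractional part

def frac_part(shift: int) -> float:
    """Approximate {10^shift * alpha} from the digit string."""
    return int(digits[shift:shift + PREC]) / 10 ** PREC

N, m = 10_000, 1
S = sum(cmath.exp(2j * cmath.pi * m * frac_part(n)) for n in range(1, N + 1))
print(f"|Weyl sum| / N = {abs(S) / N:.4f}  (tends to 0 for a normal alpha)")
```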