Algorithmic information theory

Algorithmic information theory, also known as the theory of Kolmogorov complexity, is a subfield of information theory and theoretical computer science that quantifies the intrinsic information content of individual finite objects—such as binary strings or sequences—by defining their complexity as the length, in bits, of the shortest program capable of generating them on a universal Turing machine. This measure, termed Kolmogorov complexity and denoted K(x) for an object x, captures the minimal description length required to specify x algorithmically, providing a foundational tool for analyzing randomness, structure, and the limits of compression without relying on probabilistic ensembles.

The theory emerged in the mid-1960s through independent contributions from three pioneers: Ray Solomonoff, who laid early groundwork in inductive inference by linking program length to prediction universality in 1964; Andrey Kolmogorov, who formalized three quantitative approaches to the definition of information in individual objects, emphasizing algorithmic descriptions, in his 1965 paper; and Gregory Chaitin, who developed related notions of program-size complexity for binary sequences in 1966, highlighting its connections to randomness. These works built on earlier concepts from computability theory, such as Alan Turing's universal machine, but shifted focus from decision problems to the descriptive power of programs as a measure of information. Unlike Claude Shannon's classical information theory, which quantifies average uncertainty in message sources via entropy, algorithmic information theory applies to singular instances and reveals that most strings are incompressible—meaning K(x) \approx n for a string x of length n—thus providing a precise, albeit uncomputable, definition of algorithmic randomness.

Central to the theory are several key principles and extensions. The invariance theorem ensures that K(x) is robust up to an additive constant across different universal Turing machines, making it a language-independent complexity measure. Mutual information between objects, defined as I(x:y) = K(x) + K(y) - K(x,y), quantifies shared algorithmic structure, analogous to Shannon's mutual information but for individual objects. The theory's uncomputability—arising because no algorithm can compute K(x) for arbitrary x—stems from the halting problem, yet prefix-free and monotone variants enable practical approximations through enumerable semimeasures, such as the universal semimeasure M(x).

These foundations have profound implications: they underpin the minimum description length (MDL) principle for model selection in machine learning, where the best model minimizes the total description length of data and model; clarify notions of simplicity in philosophy and physics; and influence fields like data compression, where incompressible objects represent true randomness. Despite its theoretical depth, algorithmic information theory faces challenges in application due to uncomputability, leading to approximations like compression-based estimators or logical-depth measures that incorporate computational time. Ongoing research extends AIT to quantum settings, networks, and biological systems, reinforcing its role as a bridge between computation, probability, and information.

Introduction

Core Principles

Algorithmic information theory provides a foundation for quantifying the information content of individual objects using computational resources. The core concept is Kolmogorov complexity, defined as the length, in bits, of the shortest program that outputs a given finite object—such as a binary string—on a universal Turing machine and then halts. This measure, introduced independently by Ray Solomonoff, Andrey Kolmogorov, and Gregory Chaitin, focuses on the minimal computational description required to reconstruct the object exactly. This definition captures the intrinsic information content of an object because it identifies the most concise algorithmic specification possible, transcending choices in representation, language, or encoding scheme. The complexity is invariant up to an additive constant under changes in the universal machine or programming formalism, ensuring it reflects an absolute, machine-independent essence of the object's randomness or structure. Unlike statistical measures that average over distributions, algorithmic complexity assesses a single instance's inherent compressibility through computation.

To illustrate, consider a repetitive binary string like 01010101 (eight bits). This can be generated by a short program that instructs the machine to output "01" four times, yielding low algorithmic complexity relative to its length. In contrast, a random string of the same length, such as 10111001, lacks discernible patterns and requires a program essentially as long as the string itself to specify it bit by bit, resulting in high complexity approximately equal to its length. This disparity highlights how algorithmic complexity distinguishes structured, compressible objects from those embodying maximal intrinsic randomness. The theory's motivation draws from computability theory's exploration of effective procedures and the Church–Turing thesis, seeking a precise notion of simplicity and inference that aligns with Occam's razor by favoring minimal descriptions. It formalizes the idea that an object's information content is the effort needed to compute it from no input, bridging logical and quantitative views of information.
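Because K(x) itself is uncomputable, any hands-on illustration has to fall back on a compressor as a crude upper-bound proxy. The minimal Python sketch below, assuming nothing beyond the standard library, contrasts a highly regular string with incompressible-looking bytes by comparing their zlib-compressed sizes.

```python
import os
import zlib

def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data: a crude upper-bound proxy for K(x)."""
    return len(zlib.compress(data, 9))

# A highly regular string: "01" repeated many times.
regular = b"01" * 5000          # 10,000 bytes of pattern
# An irregular string of the same length, drawn from the OS entropy source.
irregular = os.urandom(10_000)

print("regular  :", compressed_size(regular), "bytes after compression")
print("irregular:", compressed_size(irregular), "bytes after compression")
# The regular string compresses to a tiny fraction of its length, while the
# random bytes stay near 10,000 bytes: compressors can witness low complexity
# but can never certify high complexity.
```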

Significance and Scope

Algorithmic information theory (AIT) has profoundly influenced the understanding of randomness in mathematics and computer science by formalizing it as the incompressibility of individual objects, rather than relying on probabilistic ensembles as in classical information theory. In this framework, a finite string is deemed random if its shortest algorithmic description is approximately as long as the string itself, capturing the intuitive notion that truly random data defies efficient compression. This definition, central to algorithmic randomness, provides a rigorous, objective measure that bridges computability theory and probability theory, enabling proofs of randomness for specific sequences without assuming underlying distributions.

The theory's principles underpin philosophical and statistical concepts like Occam's razor, which favors simpler explanations, by quantifying simplicity through the minimal program length required to describe phenomena. This connection manifests in the minimum description length (MDL) principle, which selects models that minimize the total description length of both the model and the data it encodes, thereby balancing complexity and fit in inductive inference. Developed by Jorma Rissanen, MDL draws directly from AIT to formalize parsimony in model choice, influencing fields from statistics to machine learning, where shorter descriptions imply greater explanatory power.

Despite its foundational insights, AIT's scope is inherently limited by the uncomputability of core measures like Kolmogorov complexity, which cannot be algorithmically determined for arbitrary inputs due to the halting problem. However, the theory circumvents this through asymptotic analyses, universal approximations via self-delimiting codes, and time-bounded variants like Levin complexity, which provide practical bounds and computable proxies for real-world applications. These limitations highlight AIT's role as a theoretical framework rather than a direct computational tool, guiding the development of feasible heuristics.

In contemporary contexts as of 2025, the theory continues to shape AI by informing model selection criteria like MDL, which aids in choosing parsimonious architectures amid vast parameter spaces, as seen in methods that favor compact, non-redundant representations for better generalization. Similarly, in data analysis, MDL-inspired techniques enhance pattern mining by identifying concise, non-redundant structures in massive datasets, improving the efficiency of exploratory and summarization tasks. These applications underscore AIT's enduring relevance in addressing generalization and interpretability challenges in data-driven systems.
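To make the two-part MDL idea concrete, the toy sketch below (an illustrative simplification, not Rissanen's full formulation; the 8-bit parameter precision is an arbitrary assumption) compares the cost of transmitting a binary sequence literally against the cost of first describing a quantized Bernoulli parameter and then encoding the data at roughly n·H(p) bits given that parameter.

```python
import math

def entropy(p: float) -> float:
    """Binary entropy H(p) in bits; 0 by convention at the endpoints."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def two_part_mdl_bits(bits: str, param_precision: int = 8) -> float:
    """Toy two-part code: param_precision bits for a quantized Bernoulli
    parameter, plus ~n*H(p) bits for the data given that parameter."""
    n = len(bits)
    p = bits.count("1") / n
    return param_precision + n * entropy(p)

def literal_bits(bits: str) -> int:
    """Baseline: transmit the raw sequence, one bit per symbol."""
    return len(bits)

biased = "1" * 900 + "0" * 100      # strongly biased sequence
balanced = "10" * 500               # balanced but still highly structured

for name, s in [("biased", biased), ("balanced", balanced)]:
    print(name, "literal:", literal_bits(s),
          "two-part Bernoulli:", round(two_part_mdl_bits(s), 1))
# The Bernoulli model pays off only when it actually captures structure
# (the biased sequence); for the alternating one a richer model class,
# e.g. one describing the repetition, would be needed to win.
```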

Historical Development

Early Foundations

The early foundations of algorithmic information theory took shape in the 1960s, a period marked by intense exploration in artificial intelligence, where researchers grappled with core challenges in logic, probability, and learning to enable machines to infer general rules from limited observations. This era saw the rise of symbolic AI approaches, such as heuristic search and early neural networks, but also highlighted the need for formal measures of inference and randomness to ground predictive models in computational terms. These motivations drove independent efforts to quantify information and complexity algorithmically, bridging computability theory with probabilistic reasoning.

Ray Solomonoff initiated this development in 1964 with his seminal paper on inductive inference, proposing algorithmic probability as a foundation for prediction by defining the probability of a data sequence as the aggregate likelihood of all programs that could generate it on a universal Turing machine. This universal prior, derived from the halting probabilities of self-delimiting programs, offered an optimal solution to Bayes' rule in prediction tasks, emphasizing shorter programs as more plausible explanations and formalizing Occam's razor computationally. Solomonoff's framework addressed the problem of induction by enabling extrapolation from finite data to unseen events, positioning algorithmic methods as more general than conventional statistical models of prediction.

Independently, Andrey Kolmogorov advanced the field in 1965 through his publication defining the complexity of finite objects—such as binary strings—as the length of the shortest program that produces them, providing a non-probabilistic, absolute measure of information distinct from Shannon entropy. This deterministic approach focused on the minimal descriptive resources needed for individual objects, responding to logical questions about randomness by deeming a string random if no significantly shorter program exists to generate it, thus laying a foundation for analyzing finite patterns without reliance on probabilities. Kolmogorov's ideas complemented this context by offering a tool to evaluate the randomness of observed data in stochastic processes.

In parallel during the mid-1960s, Gregory Chaitin contributed key insights, beginning with his 1966 submission on program lengths for binary sequences and later extending the approach to self-delimiting codes that ensure prefix-free program sets for robust complexity measures. These self-delimiting constructions prevented ambiguities in program interpretation, enhancing the invariance of complexity across machines and addressing limitations in earlier definitions. Chaitin's subsequent work also introduced the omega number in the 1970s as the halting probability of a universal self-delimiting machine; though rooted in his 1960s explorations, it quantified algorithmic randomness for infinite objects and underscored uncomputability barriers in formal logical systems.

Major Advancements

In the 1970s, Gregory Chaitin advanced algorithmic information theory by recasting program-size complexity in terms of self-delimiting programs and by introducing the halting probability Ω, defined as the probability that a randomly generated program halts on a universal prefix-free machine. Ω, also known as Chaitin's constant, encodes the halting problem in its binary expansion, rendering it algorithmically random and uncomputable, with profound implications for the limits of formal systems. These developments built on earlier foundations by emphasizing probabilistic interpretations of complexity, influencing subsequent work on randomness and incompleteness.

Also in the 1970s, Leonid Levin contributed significantly through his formulation of universal search, an optimal algorithm for solving inversion problems by enumerating and verifying candidate solutions in order of increasing resource use, achieving a running time within a constant multiplicative factor (dependent on the solver's program length) of the fastest algorithm for the problem. This approach, detailed in his analysis of sequential search problems, provided a template for universal problem-solving that balances generality and efficiency, impacting fields like optimization and machine learning. Levin's work also extended to optimal algorithms more broadly, highlighting trade-offs in computational resources and laying groundwork for average-case complexity analysis.

In the 1990s and 2000s, Peter Gács and collaborators developed resource-bounded variants of Kolmogorov complexity, such as time-bounded measures that restrict computation to polynomial time, enabling computable approximations while preserving key invariance properties up to additive constants. These variants addressed the uncomputability of classical measures by incorporating feasible resource limits, facilitating applications in cryptography and learning algorithms. Concurrently, explorations connected algorithmic information theory to quantum computing, with early quantum generalizations of Kolmogorov complexity proposed to model information processing in quantum systems, revealing differences in compressibility for quantum states compared to classical ones.

From the 2010s to 2025, algorithmic information theory saw increased applications in security and cryptography, where concepts like incompressibility tests informed randomness evaluations and analyses of information leakage. In AI ethics, it informed discussions on explainability limits, demonstrating through uncomputability arguments that black-box models may inherently resist full interpretability, prompting frameworks for ethical oversight in automated decision-making. Critiques of uncomputability were addressed via practical approximations, such as the Lempel-Ziv family of compressors, which estimate complexity through dictionary-based compression and yield the normalized compression distance for clustering and classification tasks.
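The normalized compression distance mentioned above can be approximated with any off-the-shelf compressor. The rough sketch below assumes zlib as the compressor (the literature often uses stronger ones) and follows the standard formula NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).

```python
import os
import zlib

def c(data: bytes) -> int:
    """Compressed length under zlib, standing in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

a = b"the quick brown fox jumps over the lazy dog " * 20
b = b"the quick brown fox leaps over the lazy cat " * 20
z = os.urandom(len(a))  # unrelated, incompressible bytes

print("ncd(a, b):", round(ncd(a, b), 3))   # similar texts -> smaller distance
print("ncd(a, z):", round(ncd(a, z), 3))   # unrelated data -> distance near 1
```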

Mathematical Foundations

Kolmogorov Complexity

The Kolmogorov complexity of a finite binary string x, denoted K(x), is defined as the length of the shortest program p (in bits) that, when executed by a fixed universal Turing machine U, outputs exactly x and then halts: K(x) = \min \{ |p| : U(p) = x \}. This measure captures the minimal algorithmic description length required to specify x without loss of information, independent of any particular encoding beyond the choice of U.

A key property is subadditivity with respect to concatenation, which states that the complexity of the concatenation of two strings x and y is at most the sum of their individual complexities plus a constant: K(xy) \leq K(x) + K(y) + c, where c is a constant independent of x and y but dependent on U. This reflects that a program for xy can be constructed by combining programs for x and y with fixed overhead for concatenation. The symmetry of information property provides a balanced view of mutual dependence: K(xy) = K(x) + K(y \mid x) + O(\log K(xy)) = K(y) + K(x \mid y) + O(\log K(xy)), where the conditional complexity K(y \mid x) is the length of the shortest program that outputs y given x as input to U: K(y \mid x) = \min \{ |p| : U(p, x) = y \}. This equates the added description length for y given x with that for x given y, up to logarithmic terms, highlighting the symmetric information content in pairs of strings. The chain rule follows naturally, bounding the joint complexity: K(xy) \leq K(x) + K(y \mid x) + O(\log K(xy)). These relations enable decomposition of total complexity into conditional contributions, analogous to the chain rule for Shannon entropy but for individual objects.

For simple strings, informal computations illustrate these ideas, though exact values depend on the specific U and are practically uncomputable due to the undecidability of the halting problem. Consider the string x consisting of n zeros, denoted 0^n; a program can specify "print n zeros," requiring O(\log n) bits to encode n in binary, so K(0^n) = O(\log n). In contrast, a random string r of length n with no discernible pattern has K(r) \geq n - O(1), as any shorter program would imply regularity absent in true randomness. These examples underscore that low-complexity strings are compressible, while high-complexity ones resist algorithmic shortening.
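To make the K(0^n) = O(\log n) upper bound concrete, the following sketch encodes a hypothetical "print n zeros" description as a self-delimiting Elias gamma code for n plus a fixed cost for the print routine; the 64-bit routine cost is an arbitrary placeholder, and only the logarithmic growth matters.

```python
import math

def elias_gamma_length(n: int) -> int:
    """Bits used by the self-delimiting Elias gamma code for n >= 1:
    2*floor(log2 n) + 1."""
    return 2 * int(math.log2(n)) + 1

ROUTINE_BITS = 64  # hypothetical fixed cost of the "repeat a zero" routine

for n in (8, 1_000, 1_000_000, 10**9):
    description = ROUTINE_BITS + elias_gamma_length(n)
    print(f"n = {n:>13,}: description ~ {description:>3} bits vs literal {n} bits")
# The description grows like 2*log2(n), while writing 0^n literally costs n
# bits, so K(0^n) is O(log n) up to the machine-dependent constant.
```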

Prefix-Free and Universal Variants

To address limitations of the plain Kolmogorov complexity, such as the inability to unambiguously decode concatenated programs without additional delimiters, the prefix-free variant was introduced, which restricts the domain of the universal Turing machine U to a prefix-free set of programs. The prefix complexity of a binary string x, denoted K_p(x), is defined as the length of the shortest prefix-free program p such that U(p) = x, formally K_p(x) = \min \{ |p| : p \in \mathrm{dom}(U), U(p) = x \}, where \mathrm{dom}(U) forms a prefix-free set ensuring no program is a proper prefix of another. This construction allows for instantaneous, unambiguous decoding, making it suitable for modeling sequential data transmission or nested computations in algorithmic information theory.

Building on this, the universal semimeasure m(x) provides a probabilistic counterpart, defined as m(x) = \sum_{p : U(p)=x} 2^{-|p|}, where the sum is over all prefix-free programs p that output x. Due to the prefix-free condition and Kraft's inequality, \sum_x m(x) \leq 1, establishing m as a semimeasure that dominates any computable semimeasure up to a multiplicative constant, thereby serving as a universal prior that approximates any effective probability distribution on strings. This links descriptive complexity to algorithmic probability, as K_p(x) \approx -\log_2 m(x), with the approximation holding within an additive constant independent of x.

A key consequence is Chaitin's constant \Omega, the halting probability of the universal prefix-free machine U, given by \Omega = \sum_{p \in \mathrm{dom}(U)} 2^{-|p|} = \sum_x m(x), which converges to a value between 0 and 1 and encodes the total probability mass of all halting computations. This \Omega is algorithmically random and uncomputable, capturing the inherent uncertainty of the halting problem within the framework of prefix complexity.
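The Kraft inequality and the definitions of m(x) and \Omega can be made tangible with a toy machine whose halting programs are given by a small, hand-written prefix-free table. The table below is entirely hypothetical; a real universal machine has an infinite, enumerable domain and its \Omega cannot be computed this way.

```python
from collections import defaultdict

# Hypothetical halting programs of a toy prefix-free machine: program -> output.
TOY_MACHINE = {
    "0":    "0",
    "10":   "01",
    "110":  "0101",
    "1110": "01",   # a second, longer program for the same output
}

def is_prefix_free(programs) -> bool:
    """Check that no program is a proper prefix of another."""
    progs = sorted(programs)
    return all(not progs[i + 1].startswith(progs[i]) for i in range(len(progs) - 1))

assert is_prefix_free(TOY_MACHINE)

# m(x): total weight 2^{-|p|} of programs producing x; Omega: weight of all halters.
m = defaultdict(float)
for program, output in TOY_MACHINE.items():
    m[output] += 2.0 ** -len(program)
omega = sum(m.values())

for output, weight in sorted(m.items()):
    print(f"m({output!r}) = {weight}")
print("Omega (toy) =", omega, "which is <= 1 by Kraft's inequality")
```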

Key Properties and Theorems

Uncomputability and Limits

One of the fundamental limitations of algorithmic information theory is the uncomputability of Kolmogorov complexity, which means no algorithm exists that can compute the exact value of K(x) for arbitrary strings x. This uncomputability arises directly from the undecidability of the halting problem, as established by Alan Turing in 1936. The proof proceeds by contradiction with a Berry-paradox-style argument: assume a Turing machine M computes K(x). Define a computable function f(n) that enumerates all binary strings in order of increasing length and, using M, finds and outputs the smallest string s such that K(s) > n. Since f is computable, K(f(n)) \leq K(n) + c for some constant c, which is at most approximately \log_2 n + c. However, by construction, K(f(n)) > n. For sufficiently large n, \log_2 n + c < n, yielding a contradiction.

The uncomputability of K(x) is intimately connected to the busy beaver function \Sigma(n), which denotes the maximum number of 1's that an n-state Turing machine can produce on a blank tape before halting. Both exhibit non-computable behavior, and variants of busy beaver-like functions defined via Kolmogorov complexity—such as the maximum N with K(N) \leq n—grow faster than any computable function, highlighting how algorithmic complexity escapes recursive bounds. This rapid growth underscores the theoretical barriers in predicting or bounding complexity measures precisely.

Despite these limitations, approximations to K(x) are feasible in practice. Upper bounds are straightforward: the length of any specific program that outputs x provides an immediate upper estimate on K(x), and compression algorithms such as Lempel-Ziv variants yield practical approximations by constructing short descriptions. Lower bounds rely on counting arguments: the number of distinct outputs from programs of length at most m is at most 2^{m+1} - 1, implying that if m < n - O(1), most strings of length n must have K(x) > m, thus establishing that typical random strings are incompressible up to nearly their full length.

These uncomputability results imply that algorithmic information theory serves primarily as a theoretical framework for understanding limits on compression, prediction, and inference, rather than a directly applicable tool. In practical scenarios, such as data compression or randomness testing, heuristics and upper-bound estimators are employed, as exact K(x) values remain forever out of reach, providing instead asymptotic benchmarks for what is possible in principle.
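The counting argument for lower bounds is simple enough to evaluate directly. The sketch below makes no claim about any particular machine; it only tabulates the bound, showing how quickly the possible fraction of compressible strings collapses.

```python
# Counting argument: there are at most 2**m - 1 programs shorter than m bits,
# so at most 2**(n - k) - 1 of the 2**n strings of length n can satisfy
# K(x) < n - k.  The fraction of such "k-bit compressible" strings is < 2**-k.
n = 64
for k in range(1, 11):
    compressible_bound = (2 ** (n - k) - 1) / 2 ** n
    print(f"compressible by >= {k:2d} bits: fraction < {compressible_bound:.6f} (< 2^-{k})")
# Already at k = 10, fewer than one string in a thousand can be compressed by
# ten or more bits, whatever the choice of universal machine.
```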

Invariance and Universality

One of the foundational results in algorithmic information theory is the invariance theorem, which establishes that the Kolmogorov complexity of an object is robust with respect to the choice of universal Turing machine used to define it. Specifically, for any two universal Turing machines U and V, there exists a constant c, independent of the input x, such that |K_U(x) - K_V(x)| \leq c, where K_U(x) denotes the length of the shortest program for U that outputs x. This theorem, originally articulated in the earliest formulations of the theory, ensures that complexities differ by at most a fixed additive constant across equivalent computational models.

The proof of the invariance theorem relies on the simulation capabilities of universal machines. Given machines U and V, one can construct a fixed program prefix p for U that simulates V on any input q, effectively translating q into an equivalent program for U with overhead bounded by |p|, the length of the simulator. Thus, K_U(x) \leq K_V(x) + |p|, and symmetrically with the roles of U and V exchanged, yielding a bounded difference with c = \max(|p_{UV}|, |p_{VU}|). This simulation argument highlights the asymptotic machine-independence of the measure, as the constant c does not grow with the complexity of x.

A related universality property holds for the semimeasure of the prefix-free variant, m(x) = \sum_{p : U(p) = x} 2^{-|p|}, which defines a universal lower semicomputable semimeasure. For any other lower semicomputable semimeasure \mu, there exists a constant c_\mu > 0 such that m(x) \geq c_\mu \mu(x) for all x, meaning m dominates \mu up to a multiplicative constant. This domination arises because m effectively enumerates and weights all lower semicomputable semimeasures via universal simulation, subsuming their contributions.

These properties have profound consequences for the theory: Kolmogorov complexity K(x) (and its prefix variant) can be treated as a universal, machine-independent measure of intrinsic information content, valid asymptotically without dependence on specific computational details. This invariance enables the theory's broad applicability, as results hold uniformly across models, facilitating comparisons and extensions in diverse computational frameworks.
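The simulation argument can be summarized in two lines; the display below is a standard reconstruction, with c_{UV} and c_{VU} denoting the lengths of the respective simulator prefixes:

K_U(x) \leq K_V(x) + c_{UV}, \qquad K_V(x) \leq K_U(x) + c_{VU},

hence |K_U(x) - K_V(x)| \leq \max(c_{UV}, c_{VU}) for all x.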

Applications and Extensions

Randomness and Incompressibility

In algorithmic information theory, randomness for individual infinite binary sequences is characterized through the lens of incompressibility using Kolmogorov complexity. A sequence is deemed Martin-Löf random if it passes all effective statistical tests for randomness. This criterion captures the intuitive notion that random sequences resist algorithmic compression: every initial segment requires a description nearly as long as the segment itself, up to a fixed additive constant. The equivalence between Martin-Löf randomness and this incompressibility property—for prefix complexity, requiring K(x) \geq n - c for every initial segment x of length n, where c is a constant—was established by showing that sequences passing all effective statistical tests for randomness are precisely those whose prefixes are incompressible.

The incompressibility criterion extends to finite strings, where nearly all binary strings of length n are incompressible, meaning K(x) \geq n - O(1) for typical x with respect to the uniform probability measure; the set of substantially compressible strings has measure approaching zero as n grows. This probabilistic near-universality underscores that randomness is the default state in the space of all possible sequences, with compressible strings forming a negligible minority that admit succinct algorithmic descriptions. Formally, the proportion of incompressible strings of length n is 1 - o(1), reflecting the foundational role of incompressibility in quantifying intrinsic randomness.

This framework has key applications in establishing theoretical limits for compression. Any universal compression algorithm, such as those based on prefix codes, cannot compress all inputs; it succeeds only on compressible strings, while incompressible ones—the truly random ones—remain at or near their original length, providing an absolute bound on achievable compression ratios. In randomness testing, practical suites like the NIST statistical test suite indirectly approximate incompressibility by checking for patterns that would allow compression, such as linear dependencies or frequency biases, though they cannot fully capture the uncomputable Kolmogorov measure and thus serve as heuristics for evaluating pseudorandom generators.

An illustrative example is the decimal digits of \pi, which appear statistically random—passing empirical tests for uniformity and independence—yet are highly compressible in the Kolmogorov sense. A short program can compute arbitrary prefixes of \pi's digits using series expansions such as the Leibniz formula, yielding K(x) = O(\log n) for an n-digit prefix x, far below n. This discrepancy highlights how algorithmic randomness distinguishes intrinsic incompressibility from mere statistical unpredictability, as \pi's digits, though computable and thus non-random, mimic random behavior due to their aperiodic, pattern-free expansion.
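As an example of the kind of heuristic test mentioned above, the sketch below implements a monobit (frequency) test in the style of the NIST suite, using only the standard library: it checks whether the balance of ones and zeros is consistent with fair coin flips, which is a necessary but far from sufficient condition for incompressibility.

```python
import math
import os

def monobit_p_value(bits: str) -> float:
    """Frequency (monobit) test: p-value under the null hypothesis that the
    bits are i.i.d. fair coin flips (NIST SP 800-22 style)."""
    n = len(bits)
    s = sum(1 if b == "1" else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

# Bits from the OS entropy source: should pass comfortably.
random_bits = "".join(format(byte, "08b") for byte in os.urandom(4096))
# A trivially compressible sequence that nevertheless passes this single test,
# illustrating why statistical tests only approximate incompressibility.
alternating_bits = "01" * (len(random_bits) // 2)

print("random      p =", round(monobit_p_value(random_bits), 4))
print("alternating p =", round(monobit_p_value(alternating_bits), 4))
```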

Connections to Other Fields

Algorithmic information theory has significantly influenced machine learning, particularly through the minimum description length (MDL) principle, which formalizes Occam's razor by minimizing the total description length of a model and the data it encodes. This approach draws directly from Kolmogorov complexity, where the ideal MDL code corresponds to the shortest program that generates the data, balancing model simplicity against explanatory power to avoid overfitting. In practice, MDL approximations use two-part codes—describing the model parameters followed by the data given the model—enabling robust model selection in tasks such as regression and clustering, as implemented in frameworks that approximate universal priors.

Complementing MDL, Solomonoff induction provides a foundational framework for predictive modeling by assigning prior probabilities to hypotheses based on their algorithmic complexity, favoring shorter descriptions in a Bayesian update over observed data. This universal scheme, rooted in algorithmic probability, achieves optimal prediction in the limit by converging to the true underlying process with error bounded by the Kolmogorov complexity of that process, influencing modern approaches to sequential prediction and reinforcement learning. Although uncomputable, resource-bounded variants approximate it effectively for real-world applications, demonstrating rapid convergence even with limited samples.

In physics and thermodynamics, algorithmic information theory establishes deep parallels between Kolmogorov complexity and thermodynamic entropy, treating program lengths as analogous to microstate counts in statistical mechanics. Algorithmic entropy, defined via prefix-free machines, quantifies the information content of individual states much like Shannon entropy does for ensembles, allowing derivations of thermodynamic relations such as dE = T dS - P dV + \mu dN, where observables like runtime map to energy and volume. Paul Vitányi has extended these ideas to physical complexity measures, incorporating spatial and temporal constraints in descriptions to model real-world systems, as explored in applications bridging computation and natural processes.

Cryptography leverages Kolmogorov complexity to rigorously distinguish pseudorandom sequences from truly random ones, as outputs from computable generators exhibit low complexity due to their short descriptions. This incompressibility criterion underpins tests for cryptographic strength, where high-complexity strings resist efficient prediction, informing the design of secure pseudorandom generators whose output is indistinguishable from randomness by polynomial-time algorithms. Such connections highlight limitations of purely information-theoretic definitions of randomness in computational settings, guiding hybrid approaches in secure protocol development.

Recent extensions in the 2020s have generalized Kolmogorov complexity to quantum settings, defining quantum complexity via deterministic-control quantum Turing machines that process mixed states and approximate outputs with fidelity bounds. This framework reveals quantum correlations through complexity-based measures symmetric to classical ones, enabling analysis of entanglement and state-copying limits beyond no-cloning theorems. In AI safety, incompressibility bounds from algorithmic information theory impose fundamental limits on explainability, proving that simpler explanations of complex models inevitably err on some inputs, with complexity gaps growing exponentially in input dimensions. These results yield regulatory impossibilities, showing that no framework can ensure both unrestricted capabilities and fully interpretable, low-error explanations, informing oversight strategies via structural constraints on model behavior.

Algorithmic information theory also extends to networks and biological systems.
In network science, measures of algorithmic complexity quantify the intrinsic structure of graphs and multiplex networks, revealing how additional layers encode information beyond simple topologies. For biological applications, AIT provides tools for causal discovery in dynamical systems, such as gene regulatory networks, by using algorithmic probability to infer generative mechanisms from data while minimizing description length.
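As with strings, graph complexity in practice is estimated with compression-based proxies. The sketch below is a rough, hypothetical illustration rather than the block-decomposition methods used in the research literature: it compares the compressed size of the adjacency matrix of a regular ring lattice with that of an equally dense random graph.

```python
import random
import zlib

def adjacency_bytes(edges, n: int) -> bytes:
    """Serialize an undirected graph on n nodes as a flattened 0/1 adjacency matrix."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = adj[v][u] = 1
    return bytes(bit for row in adj for bit in row)

def compressed_size(data: bytes) -> int:
    return len(zlib.compress(data, 9))

n, degree = 256, 4
# Regular ring lattice: each node linked to its `degree` nearest neighbours.
lattice = [(i, (i + k) % n) for i in range(n) for k in range(1, degree // 2 + 1)]
# Random graph with the same number of edges.
random.seed(0)
all_pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
random_graph = random.sample(all_pairs, len(lattice))

print("lattice :", compressed_size(adjacency_bytes(lattice, n)), "bytes")
print("random  :", compressed_size(adjacency_bytes(random_graph, n)), "bytes")
# The structured lattice compresses far better than the random graph,
# mirroring the lower algorithmic complexity of regular topologies.
```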
