
A Mathematical Theory of Communication

A Mathematical Theory of Communication is a seminal article by American mathematician and electrical engineer Claude Elwood Shannon, published in two parts in the Bell System Technical Journal in July (pp. 379–423) and October (pp. 623–656) 1948. The paper establishes the mathematical foundations of information theory by quantifying information as a probabilistic measure independent of semantics, introducing entropy to describe uncertainty in message sources, and defining channel capacity as the maximum reliable transmission rate over noisy channels. Shannon structures the analysis around a general communication model comprising an information source producing symbols, a transmitter (encoder) that converts them into signals, a channel that may introduce noise, a receiver (decoder) that reconstructs the signals, and a destination that interprets the output.

For discrete noiseless systems, he demonstrates that messages can be encoded efficiently using a binary alphabet, with the fundamental unit of information being the bit, equivalent to selecting one of two equally likely possibilities. Entropy, denoted H = -\sum p_i \log_2 p_i, quantifies the average information per symbol from a source with symbol probabilities p_i, serving as both a measure of uncertainty and the minimum average number of bits needed for unique decodability. Extending the theory to noisy channels, Shannon introduces mutual information I(X;Y) = H(X) - H(X|Y) to capture the reduction in uncertainty about the input X given the output Y, and the conditional entropy (equivocation) H(X|Y) to model noise-induced errors. The channel capacity C is the supremum of mutual information over input distributions, C = \max_{p(x)} I(X;Y), representing the highest rate for error-free communication in the limit of long sequences. Key theorems, including the noisy-channel coding theorem, assert that rates below capacity are achievable with arbitrarily low error probability via random coding, while rates above it are not, thus separating data compression from error correction.

The paper also addresses continuous channels, deriving capacity formulas such as C = W \log_2 (1 + S/N) for bandlimited channels with bandwidth W and signal-to-noise ratio S/N, and explores sampling theorems linking the discrete and continuous cases. Building on prior work by Ralph Hartley (1928) on the logarithmic measure of information and Harry Nyquist (1928) on telegraph signaling rates, Shannon's framework revolutionized communication engineering by providing rigorous limits and efficiencies. Its influence extends to computer science, cryptography, and machine learning, underpinning modern digital technologies from error-correcting codes to data compression algorithms.

Background and Publication

Historical Context

The development of electrical telegraphy in the mid-19th century marked a pivotal advancement in long-distance communication, primarily through Samuel F. B. Morse's electric telegraph and Morse code, which enabled the transmission of messages over wires using electrical pulses. Demonstrated publicly in 1837 and first commercially deployed in 1844 between Washington, D.C., and Baltimore, the telegraph revolutionized signaling by allowing near-instantaneous communication across continents, laying the groundwork for electrical communication networks. By the late 19th century, telephony emerged as a natural extension, with Alexander Graham Bell patenting the telephone in 1876, which converted sound waves into electrical signals for voice transmission over similar infrastructure. This innovation spurred rapid expansion in the early 20th century, as telephone networks grew to support widespread voice communication, though both systems initially relied on deterministic engineering approaches focused on signal fidelity rather than probabilistic measures of information.

In 1928, Ralph V. L. Hartley, an engineer at Bell Laboratories, published a seminal paper introducing a quantitative, logarithmic measure of information based on the number of possible symbols and their selection, shifting focus from physical signal properties to the informational content transmitted. Hartley's paper, titled "Transmission of Information," proposed that the information conveyed in a message is proportional to the logarithm of the number of equally likely alternatives, providing an early framework for measuring communication efficiency in multi-symbol systems. This logarithmic approach addressed limitations in earlier metrics by accounting for the multiplicity of choices in signaling, influencing subsequent probabilistic models. Shannon's later concept of entropy refined Hartley's measure by incorporating symbol probabilities for uneven distributions.

Concurrent with Hartley's contributions, research in the 1920s and 1930s advanced understanding of channel limitations, exemplified by Harry Nyquist's 1928 analysis of telegraph transmission, which established that a channel's bandwidth determines the maximum rate of independent pulses transmissible without intersymbol interference. Nyquist's result, derived from telegraph transmission theory, quantified the relationship between bandwidth and signaling speed, highlighting the need to consider noise in practical systems. By the early 1940s, engineers were intensively studying signal-to-noise ratios to optimize performance in noisy environments, as communication networks expanded amid growing demands for reliable transmission.

World War II profoundly shaped communication engineering by exposing the inadequacies of deterministic models in handling noise and uncertainty, particularly through advancements in cryptography and radar technologies. Cryptographic efforts required analyzing secure message transmission over imperfect channels, while radar systems demanded robust signal processing amid environmental interference, underscoring the limitations of classical engineering in probabilistic settings. These wartime challenges at institutions like Bell Laboratories motivated a transition toward mathematical frameworks that could quantify reliability in the presence of noise.

Claude Shannon's pre-1948 career provided a strong foundation for his later work, beginning with his 1937 master's thesis at the Massachusetts Institute of Technology, "A Symbolic Analysis of Relay and Switching Circuits," which applied Boolean algebra to the design of electromechanical switching systems, bridging symbolic logic and circuit engineering. This thesis demonstrated how binary states could model complex circuits, earning recognition for its innovative use of mathematical symbolism in practical telephony problems. During World War II, while employed at Bell Laboratories, Shannon contributed to cryptographic work for U.S. national defense, developing techniques to evaluate code security and transmission vulnerabilities, which deepened his insight into information handling under uncertainty.

Publication Details

"A Mathematical Theory of Communication" was originally published in two parts in the Technical Journal. Part I appeared in the July 1948 issue (Volume 27, Issue 3, pages 379–423), and Part II in the October 1948 issue (Volume 27, Issue 4, pages 623–656). The paper was authored by Claude E. Shannon, a and employed at Bell Laboratories at the time. It spans approximately 79 pages, including figures, diagrams, and mathematical appendices that illustrate key concepts in discrete and continuous communication systems. Upon publication, the paper garnered limited immediate attention owing to its highly content and appearance in a specialized with restricted circulation. However, it received praise from fellow at , such as , who later described Shannon's contributions as casting "as much light on the problem of the communication as can be shed" and of primary importance comparable to foundational principles in . The work was reprinted in book form as The Mathematical Theory of Communication in 1949 by the , featuring an introductory essay by Warren Weaver that contextualized its broader implications for communication science. It was later included in the comprehensive anthology Claude Elwood Shannon: Collected Papers, edited by N. J. A. Sloane and Aaron D. Wyner and published by IEEE Press in 1993. Since the , the original paper has been freely accessible online through various academic archives, including those hosted by institutions like and the , facilitating its widespread study and citation in subsequent research.

Discrete Information Sources

Entropy Measure

In Shannon's framework, a discrete information source is modeled as a stochastic process that generates a sequence of symbols from a finite alphabet, where each symbol occurs with a given probability. This setup assumes the source produces symbols independently unless specified otherwise, capturing the fundamental uncertainty inherent in the selection process. The entropy H serves as the foundational measure of the average uncertainty or information content per symbol in such a source. It is defined by the formula H = -\sum_{i=1}^n p_i \log_2 p_i, where p_i represents the probability of the i-th symbol from the alphabet of n symbols, and the logarithm base 2 yields units in bits. This quantity can be interpreted as the expected number of yes/no questions needed to determine a symbol's identity, reflecting the source's unpredictability.

The entropy function arises uniquely from a small set of axioms that any reasonable measure of uncertainty should satisfy: continuity in the probabilities p_i, monotonicity in the number of outcomes (the measure is non-decreasing as more equiprobable symbols are added), and additivity for compound choices (the uncertainty of a choice made in stages equals the weighted sum of the uncertainties of the individual stages). These axioms lead to the logarithmic form, as proven through functional equations showing that only H = K \sum p_i \log (1/p_i) (with K = 1/\log 2 for units of bits) complies, distinguishing it from other potential measures such as arithmetic means.

Entropy exhibits key properties that underscore its role as an information metric: it is concave in the probability distribution and increases as the probabilities become more nearly equal; it achieves its maximum value of \log_2 n bits for a uniform distribution over n symbols, representing complete unpredictability; and it equals zero for deterministic sources where one symbol has probability 1 and the others 0, indicating no uncertainty. For sources with dependencies, such as Markov processes where symbol probabilities depend on prior symbols, the concept extends to the joint entropy of sequences of n symbols: H(X_1, \dots, X_n) = -\sum p(x_1, \dots, x_n) \log_2 p(x_1, \dots, x_n), where the sum runs over all possible sequences and p(x_1, \dots, x_n) is the joint probability of the sequence. This measures the total uncertainty in the joint occurrence of the sequence. For stationary processes, the average information rate per symbol, which characterizes the long-term output of the source, is given by the limit H = \lim_{n \to \infty} \frac{1}{n} H(X_1, \dots, X_n). This rate converges for ergodic sources, providing a stable measure of information production independent of sequence length.
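For illustration, the entropy formula can be evaluated directly in code; the short Python sketch below uses a hypothetical four-symbol distribution (not an example from the paper) and checks the uniform and deterministic boundary cases.

```python
import math

def entropy_bits(probs):
    """Shannon entropy H = -sum p_i * log2(p_i) in bits; terms with p_i = 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical source over a 4-symbol alphabet.
p = [0.5, 0.25, 0.125, 0.125]
print(entropy_bits(p))               # 1.75 bits per symbol
print(entropy_bits([0.25] * 4))      # 2.0 bits = log2(4), the maximum for n = 4
print(entropy_bits([1.0, 0, 0, 0]))  # 0.0 bits for a deterministic source
```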

Source Coding Theorem

The Source Coding Theorem, also known as Shannon's noiseless coding theorem, provides the fundamental limit for lossless compression of a discrete memoryless information source. It asserts that if a source emits symbols from a finite alphabet with probabilities p_i and entropy H = -\sum p_i \log_2 p_i, then any uniquely decodable code for representing the symbols must have an average codeword length L satisfying L \geq H bits per symbol. Furthermore, there exist such codes with L < H + 1. This theorem implies that the entropy H represents the irreducible minimum average bit rate for faithful representation of the source without loss, setting the ultimate bound on data compression efficiency for noiseless channels. Codes achieving lengths close to H are optimal in the asymptotic sense, minimizing redundancy while ensuring unique decodability.

A sketch of the proof begins with the lower bound, derived from coding constraints: for any uniquely decodable code the Kraft inequality \sum_i 2^{-l_i} \leq 1 holds, and the non-negativity of the Kullback–Leibler divergence between the source distribution p_i and the distribution q_i proportional to 2^{-l_i} then yields H \leq L. The achievability part relies on the asymptotic equipartition property (AEP), or typicality: for blocks of n symbols, the probability of atypical sequences vanishes as n \to \infty, and the roughly 2^{nH} typical sequences each have probability approximately 2^{-nH}, allowing assignment of binary codewords of length roughly nH bits and yielding an average length per symbol approaching H. The bound L < H + 1 for symbol-by-symbol coding follows from choosing codeword lengths l_i = \lceil \log_2 (1/p_i) \rceil, which satisfy the Kraft inequality \sum_i 2^{-l_i} \leq 1 and therefore correspond to a realizable prefix code.

A practical example is Huffman coding, an algorithm that constructs optimal prefix codes for known symbol probabilities by repeatedly merging the two least probable nodes of a binary tree, achieving average lengths within 1 bit of the entropy bound for any discrete source. For a binary source where one symbol occurs with probability p and the other with 1-p, the entropy is the binary entropy function h(p) = -p \log_2 p - (1-p) \log_2 (1-p), while any single-symbol code must use L = 1 bit; for instance, p = 0.1 gives L = 1 bit compared with h(0.1) \approx 0.469, and block coding approaches the entropy value more closely. Redundancy, defined as the difference L - H, quantifies the inefficiency of a code; optimal codes such as Huffman codes reduce it to near zero for large alphabets or blocks, though single-symbol codes can leave residual redundancy of up to 1 bit. To approach the entropy rate exactly, block coding extends the theorem to sequences of n symbols, treating each block as a super-symbol with entropy nH; as n increases, the per-symbol average length L_n / n \to H, enabling arbitrarily efficient compression for stationary ergodic sources.
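As a hedged illustration of the bound H \leq L < H + 1, the sketch below computes Huffman codeword lengths for an arbitrary example distribution (not taken from the paper) and compares the resulting average length with the entropy.

```python
import heapq, math

def huffman_lengths(probs):
    """Return codeword lengths of an optimal (Huffman) prefix code for the given probabilities."""
    # Heap items: (probability, unique tiebreaker, indices of symbols merged under this node).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:          # every symbol under the merged node gains one bit
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

probs = [0.4, 0.3, 0.2, 0.1]       # hypothetical source
H = -sum(p * math.log2(p) for p in probs)
L = sum(p * l for p, l in zip(probs, huffman_lengths(probs)))
print(f"entropy H = {H:.3f} bits, Huffman average length L = {L:.3f} bits")
# The theorem guarantees H <= L < H + 1 (here roughly 1.846 <= 1.9 < 2.846).
```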

Discrete Noiseless Channels

Channel Capacity

In a discrete noiseless channel, the input is drawn from a finite alphabet of size M, and the output is identical to the input with no distortion or loss of information. This model assumes that each symbol from the input alphabet \{x_1, \dots, x_M\} can be transmitted perfectly and sequentially over time, typically normalized to one symbol per unit time. Such channels, exemplified by idealized telegraph or teletype links, represent the simplest form of communication, in which reliability is guaranteed without error-correction mechanisms. The capacity C of a discrete noiseless channel is defined as the maximum rate at which information can be transmitted reliably, measured in bits per symbol. It is given by the formula C = \max_{p(x)} H(X) = \log_2 M, where H(X) = -\sum_{p(x_i) > 0} p(x_i) \log_2 p(x_i) is the entropy of the input distribution p(x), and the maximum is achieved when the input is uniformly distributed over the M symbols. This derivation follows from the fact that the channel preserves all input information, so its capacity equals the highest entropy the input can provide, allowing the channel to convey up to \log_2 M bits per symbol without loss. The implications are foundational: any information source with entropy rate R \leq C can be encoded and transmitted over the channel without error or loss, by matching the source to the input distribution that achieves capacity. Conversely, sources with R > C cannot be transmitted faithfully, leading to inevitable information loss. For more general noiseless channels with varying symbol durations or constraints on allowable sequences, the capacity extends to C = \log_2 X_0, where X_0 is the largest real solution of the characteristic equation \sum_i X_0^{-t_i} = 1 for symbol durations t_i; in the basic case of equal unit durations this reduces to \log_2 M.
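As a sketch of the more general formula (the durations and the bisection tolerance below are illustrative assumptions, not values from the paper), the capacity in bits per unit time can be computed by solving \sum_i X^{-t_i} = 1 numerically:

```python
import math

def noiseless_capacity(durations, tol=1e-12):
    """Capacity (bits per unit time) of a noiseless channel whose symbols take the given
    durations: C = log2(X0), where X0 is the largest real root of sum_i X0**(-t_i) = 1."""
    f = lambda x: sum(x ** (-t) for t in durations) - 1.0
    lo, hi = 1.0 + 1e-9, 2.0
    while f(hi) > 0:            # f decreases for x > 1; expand until the sign changes
        hi *= 2
    while hi - lo > tol:        # bisection on the bracketed root
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.log2((lo + hi) / 2)

print(noiseless_capacity([1, 1, 1, 1]))  # 2.0 bits per unit time: log2 M for M = 4 equal symbols
print(noiseless_capacity([1, 2]))        # ~0.694 bits per unit time for a dot/dash-like alphabet
```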

Fundamental Theorem for Noiseless Channels

The fundamental theorem for noiseless channels establishes the precise limits of reliable communication in the absence of noise. Specifically, for a noiseless channel with capacity C bits per channel use, it is possible to transmit the output of any memoryless source with entropy rate R < C bits per source symbol such that the probability of error P_e in decoding blocks of length n satisfies P_e \to 0 as the block length n \to \infty. This theorem, central to Shannon's framework, demonstrates that the channel capacity C = \log_2 |\mathcal{Y}|—where |\mathcal{Y}| is the size of the channel alphabet—serves as both an achievable rate and an upper bound for error-free transmission of compressed source data. The achievability part of the proof proceeds in two steps: first, apply the source coding theorem to compress sequences of n source symbols into binary representations requiring approximately nR bits with negligible error probability; second, map these bits directly onto sequences of \lceil nR / C \rceil channel symbols using a one-to-one correspondence, exploiting the noiseless nature of the channel for exact recovery at the receiver. As n grows large, the encoding inefficiency vanishes, allowing transmission at rates arbitrarily close to C without errors. For the converse, reliable transmission at rates R > C is impossible, as the channel can distinguish at most 2^{nC} distinct output sequences of length n, limiting the number of reliably transmittable source messages; any attempt to exceed this bound results in P_e bounded away from zero, as shown using inequalities relating the entropy of the source to the channel's information-carrying capacity. In practice, for memoryless sources, fixed-length block codes can achieve the capacity C exactly when the source aligns with the channel's structure, such as by grouping source symbols into blocks whose total entropy is an integer multiple of \log_2 |\mathcal{Y}|, enabling direct bijective mapping without variable-length overhead. This result underscores the theorem's operational significance, confirming that noiseless channels operate at full efficiency for rates below capacity through asymptotic block coding.
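A minimal sketch of the second, lossless mapping step (the bit string and ternary alphabet are arbitrary illustrative choices): a compressed block of n bits is re-expressed as channel symbols from an alphabet of size |\mathcal{Y}| by base conversion, using about n / \log_2 |\mathcal{Y}| symbols.

```python
import math

def bits_to_channel_symbols(bits, alphabet_size):
    """Map a bit string losslessly onto channel symbols (base conversion)."""
    n_sym = math.ceil(len(bits) / math.log2(alphabet_size))
    value = int(bits, 2)
    symbols = []
    for _ in range(n_sym):
        value, r = divmod(value, alphabet_size)
        symbols.append(r)
    return symbols[::-1]

def channel_symbols_to_bits(symbols, alphabet_size, n_bits):
    """Invert the mapping at the receiver, exploiting the noiseless channel."""
    value = 0
    for s in symbols:
        value = value * alphabet_size + s
    return format(value, f"0{n_bits}b")

bits = "1011001110"
syms = bits_to_channel_symbols(bits, alphabet_size=3)        # hypothetical ternary channel
assert channel_symbols_to_bits(syms, 3, len(bits)) == bits   # exact recovery
print(len(bits), "bits sent as", len(syms), "ternary symbols")  # 10 bits -> 7 symbols (10 / log2 3 ≈ 6.3)
```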

Discrete Channels with Noise

Mutual Information

In discrete channels with noise, mutual information serves as a fundamental measure of the dependence between the input symbol X and the output symbol Y, quantifying the amount of information about X that is conveyed by Y. It builds on the concept of conditional entropy (the equivocation), which represents the average uncertainty in X given knowledge of Y. The conditional entropy H(X|Y) is formally defined as H(X|Y) = -\sum_{x,y} p(x,y) \log_2 p(x|y), where p(x,y) is the joint distribution over input-output pairs and p(x|y) is the conditional probability of x given y. This quantity averages the entropy of X conditioned on each possible value of Y, weighted by the output probability p(y). Mutual information I(X;Y) is then expressed as the difference between the marginal entropy H(X) of the input and the conditional entropy H(X|Y): I(X;Y) = H(X) - H(X|Y). Equivalently, it can be written in a symmetric form that highlights the reduction in uncertainty for both variables: I(X;Y) = \sum_{x,y} p(x,y) \log_2 \frac{p(x,y)}{p(x)p(y)}, where p(x) and p(y) are the marginal distributions. This summation measures the shared information between X and Y, capturing how much the joint distribution deviates from independence under the product of the marginals. Key properties of mutual information include non-negativity, I(X;Y) \geq 0, with equality if and only if X and Y are independent; symmetry, I(X;Y) = I(Y;X); and additivity for independent channel uses, where I(X_1 X_2; Y_1 Y_2) = I(X_1; Y_1) + I(X_2; Y_2) if the pairs (X_1, Y_1) and (X_2, Y_2) are independent. These properties arise directly from the probabilistic structure and ensure that mutual information behaves as a valid measure of statistical dependence. Noisy discrete channels are modeled using transition probabilities p(y|x), which form the rows of a stochastic matrix specifying the probability of each output y given input x. The joint distribution p(x,y) = p(x) p(y|x) incorporates the input distribution p(x), so the mutual information depends on the choice of p(x). The channel capacity C, representing the maximum reliable transmission rate, is obtained by optimizing over input distributions: C = \max_{p(x)} I(X;Y). This maximization yields the supremum of mutual information across all possible input ensembles, providing a theoretical upper bound on the information rate through the channel.
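As an illustrative sketch (the transition matrix below is a hypothetical binary symmetric channel, not an example worked in the paper), the symmetric form of I(X;Y) can be evaluated from an input distribution and a channel matrix:

```python
import numpy as np

def mutual_information_bits(p_x, p_y_given_x):
    """I(X;Y) = sum_{x,y} p(x,y) * log2[ p(x,y) / (p(x) p(y)) ] for a discrete noisy channel.
    p_x: input distribution (length m); p_y_given_x: m-by-k transition matrix with rows summing to 1."""
    p_x = np.asarray(p_x, dtype=float)
    W = np.asarray(p_y_given_x, dtype=float)
    p_xy = p_x[:, None] * W                 # joint distribution p(x, y)
    p_y = p_xy.sum(axis=0)                  # output marginal p(y)
    mask = p_xy > 0                         # skip zero-probability pairs
    ratio = p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask]
    return float((p_xy[mask] * np.log2(ratio)).sum())

# Binary symmetric channel with crossover probability 0.1 and a uniform input:
W = [[0.9, 0.1],
     [0.1, 0.9]]
print(mutual_information_bits([0.5, 0.5], W))   # ≈ 0.531 bits, which equals 1 - h2(0.1)
```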

Channel Coding Theorem

The noisy-channel coding theorem states that for a memoryless channel with capacity C bits per use, reliable communication is possible at any rate R < C, in the sense that there exist block codes of length n with 2^{nR} codewords such that the probability of decoding error P_e satisfies P_e \to 0 as n \to \infty; conversely, for any R > C, P_e is bounded below by a positive constant independent of n. The proof of achievability employs random coding, in which 2^{nR} codewords are independently generated according to the capacity-achieving input distribution, followed by typical-set decoding, which identifies the unique codeword that is jointly typical with the received sequence; this argument shows that the error probability, averaged over random codebooks, decays to zero for R < C. The converse proof uses a bound on the equivocation (akin to Fano's inequality) to show that the mutual information between the input and output sequences satisfies I(X^n; Y^n) \leq nC + o(n), which implies that P_e cannot approach zero for R > C. A canonical example is the binary symmetric channel, where the input and output alphabets are \{0,1\} and each bit flips independently with probability p < 1/2; its capacity is C = 1 - h_2(p), with the binary entropy function h_2(p) = -p \log_2 p - (1-p) \log_2 (1-p), achieved by a uniform input distribution. For physical channels, the discrete model relates to bandwidth W by approximating the channel as transmitting up to 2W symbols per second (via the sampling theorem), yielding a maximum reliable rate of 2WC bits per second.
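To make the binary symmetric channel example concrete, the sketch below (with an assumed crossover probability of 0.1) scans input distributions numerically and confirms that the uniform input attains the closed-form capacity C = 1 - h_2(p):

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_mutual_information(q, p):
    """I(X;Y) for a binary symmetric channel with crossover p and input P(X=1) = q."""
    r = q * (1 - p) + (1 - q) * p          # P(Y = 1)
    return h2(r) - h2(p)                   # I = H(Y) - H(Y|X), and H(Y|X) = h2(p)

p = 0.1                                    # assumed crossover probability
best_q = max((i / 1000 for i in range(1001)), key=lambda q: bsc_mutual_information(q, p))
print(best_q, bsc_mutual_information(best_q, p))   # 0.5 and ≈ 0.531 bits
print(1 - h2(p))                                   # closed form C = 1 - h2(p) ≈ 0.531 bits
```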

Continuous Channels

Entropy for Continuous Sources

In the extension of information theory to continuous random variables, Shannon introduced the concept of differential entropy to quantify the uncertainty associated with a continuous information source. For a continuous random variable X with probability density p(x), the differential entropy h(X) is defined as h(X) = -\int_{-\infty}^{\infty} p(x) \log_2 p(x) \, dx, where the integral is taken over the support of p(x). This measure arises naturally as the continuous analog of discrete entropy, obtained in the limit as the quantization of the continuous variable becomes finer. Unlike discrete entropy, which is always non-negative, the differential entropy h(X) can take negative values, reflecting the fact that it is not an absolute measure of information but a relative one that depends on the scale of measurement. Key properties include invariance under translations, meaning h(X + c) = h(X) for any constant c, since shifting the density does not alter the integral. It also scales with volume: for a distribution confined to a region of volume v, the differential entropy is at most \log_2 v, attained by the uniform distribution over that region. More generally, differential entropy is concave, so it increases (or remains unchanged) when distributions are mixed, and it satisfies subadditivity bounds such as h(X, Y) \leq h(X) + h(Y), with equality if X and Y are independent.

For joint distributions, the differential entropy of two continuous random variables X and Y with joint density p(x, y) is given by h(X, Y) = -\iint p(x, y) \log_2 p(x, y) \, dx \, dy. The conditional differential entropy h(X \mid Y) is then defined as h(X \mid Y) = h(X, Y) - h(Y), which quantifies the remaining uncertainty in X given knowledge of Y. This satisfies h(X \mid Y) \leq h(X), indicating that conditioning cannot increase uncertainty on average. These extensions parallel their discrete counterparts while accounting for the continuous nature of the underlying spaces.

A significant result is that, among all continuous distributions with a fixed variance \sigma^2, the Gaussian distribution maximizes the differential entropy. For a one-dimensional Gaussian with density p(x) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left(-\frac{x^2}{2\sigma^2}\right), the entropy is h(X) = \frac{1}{2} \log_2 (2\pi e \sigma^2). This maximum-entropy property underscores the Gaussian's role as the "most random" distribution under a quadratic constraint, with generalizations to higher dimensions involving the determinant of the covariance matrix. The relation between differential entropy and discrete entropy emerges when a continuous source is approximated by discretizing it into bins of width \Delta: for small \Delta, the entropy of the quantized variable is approximately h(X) - \log_2 \Delta, where the -\log_2 \Delta term reflects the measurement resolution and h(X) captures the intrinsic uncertainty of the source. This connection justifies using differential entropy to model analog sources in communication systems.
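As a hedged numerical check of the Gaussian formula (the variance and integration range below are arbitrary choices for illustration), the closed form \frac{1}{2}\log_2(2\pi e \sigma^2) can be compared against a direct Riemann-sum evaluation of the defining integral; note that the result is negative for a sufficiently narrow density, unlike discrete entropy.

```python
import math

def gaussian_differential_entropy(sigma):
    """Closed form: h(X) = 0.5 * log2(2 * pi * e * sigma^2) bits."""
    return 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)

def numerical_differential_entropy(pdf, lo, hi, n=100000):
    """Approximate h(X) = -integral of p(x) log2 p(x) dx by a midpoint Riemann sum over [lo, hi]."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        p = pdf(x)
        if p > 0:
            total -= p * math.log2(p) * dx
    return total

sigma = 0.1   # narrow density, so h(X) comes out negative
pdf = lambda x: math.exp(-x * x / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)
print(gaussian_differential_entropy(sigma))                       # ≈ -1.275 bits
print(numerical_differential_entropy(pdf, -10 * sigma, 10 * sigma))  # ≈ -1.275 bits
```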

Capacity of Continuous Channels

In the context of continuous channels, the capacity C is defined as the maximum mutual information between the input X and output Y, given by C = \max_{p(x)} I(X;Y), where the maximization is over all input distributions p(x) subject to appropriate constraints. Mutual information for continuous random variables is expressed as I(X;Y) = h(Y) - h(Y|X), with h(Y) denoting the differential entropy of the output and h(Y|X) the conditional differential entropy. For channels with additive noise, where Y = X + Z and Z is independent of X, the term h(Y|X) = h(Z) remains fixed regardless of the input distribution, as it depends solely on the noise characteristics. Thus, capacity reduces to maximizing the output entropy h(Y) under the given constraints, which effectively determines the fundamental limit on reliable communication rates. In bandwidth-limited scenarios, the Nyquist-Shannon sampling theorem plays a crucial role, establishing that a signal bandlimited to bandwidth W can be perfectly reconstructed from samples taken at the Nyquist rate of 2W samples per second, thereby linking the continuous-time channel to a discrete-time equivalent with 2W dimensions per second. This discretization allows the capacity to be computed as if the channel were discrete, scaled by the sampling rate. The canonical example is the additive white Gaussian noise (AWGN) channel, where the noise Z is Gaussian with zero mean and variance N. Under a power constraint where the average input power satisfies \mathbb{E}[X^2] \leq S, the capacity-achieving input distribution is Gaussian with variance S, yielding C = \frac{1}{2} \log_2 \left(1 + \frac{S}{N}\right) bits per transmission (i.e., per real dimension). For a bandlimited channel of bandwidth W, the total capacity becomes C = W \log_2 \left(1 + \frac{S}{N}\right) bits per second, with N representing the noise power within the bandwidth. Here, S is the signal power and N the noise power, highlighting the signal-to-noise ratio (SNR) as the key parameter governing performance. This formulation extends naturally to multidimensional channels, such as those whose inputs and outputs are represented as vectors in an n-dimensional signal space. In n dimensions, the capacity approximates \frac{n}{2} \log_2 \left(1 + \frac{S}{N}\right) bits per n-dimensional symbol under the power constraint, with the overall rate scaling linearly with the number of dimensions (e.g., via increased bandwidth or transmission time). This scaling underscores how higher-dimensional signal representations, common in practical modulation schemes, increase the total rate without altering the per-dimension limit.
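A small worked example of the bandlimited formula (the telephone-like bandwidth and SNR values are illustrative assumptions, not figures from the paper):

```python
import math

def awgn_capacity_bits_per_second(bandwidth_hz, signal_power, noise_power):
    """Shannon–Hartley capacity C = W * log2(1 + S/N) for a bandlimited AWGN channel."""
    return bandwidth_hz * math.log2(1 + signal_power / noise_power)

# Hypothetical channel: W = 3000 Hz and S/N = 1000 (30 dB).
print(awgn_capacity_bits_per_second(3000, 1000, 1))   # ≈ 29,900 bits per second
```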

Legacy and Applications

Influence on Information Theory

Claude Shannon's 1948 paper "A Mathematical Theory of Communication" is widely regarded as the foundational text that established information theory as a distinct discipline, providing a rigorous mathematical framework that unified models for discrete and continuous communication systems and shifted the analytical focus from physical signal properties to the probabilistic nature of information itself. By defining information in terms of uncertainty reduction via entropy, Shannon enabled the quantification of communication efficiency independent of content semantics, establishing core concepts such as entropy and mutual information that underpin the field. Among the paper's key theoretical advancements was the principle of separation between source coding and channel coding, which demonstrated that optimal communication can be achieved by independently optimizing data compression at the source and error correction at the channel, a result that simplified system design and proved rates asymptotically achievable. Additionally, the work laid precursors to rate-distortion theory through its source coding theorem, which quantified the minimal rate needed for faithful representation of information sources, influencing later extensions to lossy compression scenarios. The paper's academic impact extended beyond engineering, inspiring developments in algorithmic information theory during the 1960s, notably Andrei Kolmogorov's formulation of Kolmogorov complexity as the length of the shortest program describing an object, which built on Shannon's entropy to explore the incompressibility of individual objects rather than ensemble averages. Shannon received the National Medal of Science in 1966 for his contributions to the mathematical theories of communication and information. By 2025, the paper had amassed over 192,000 citations, reflecting its enduring influence. One noted criticism of the theory is its emphasis on transmission rates and syntactic information at the expense of semantic meaning, a limitation acknowledged and contextualized in Warren Weaver's introduction to the 1949 book version, which outlined broader levels of communication, including semantic and effectiveness problems beyond the technical one.

Modern Applications

Concepts from A Mathematical Theory of Communication underpin modern digital communication systems, particularly through error-correcting codes that approach the theoretical limits of reliable transmission. Turbo codes, introduced in 1993, enable near-capacity performance by iteratively decoding concatenated convolutional codes, and they were standardized for use in 3G and 4G mobile networks to combat channel noise effectively. Similarly, low-density parity-check (LDPC) codes, originally proposed in 1962 but later revived for their efficiency, achieve performance close to the Shannon limit with lower decoding complexity than turbo codes, forming the basis for the data channels in 5G networks.

In data compression, Shannon's entropy serves as the foundational measure for optimal source coding, quantifying the minimum average number of bits required to represent a source without loss. Huffman coding, a prefix-free coding method that assigns shorter codes to more probable symbols based on frequency estimates, is integral to standards such as ZIP files via the DEFLATE format, JPEG for lossless encoding of quantized transform coefficients, and MP3 for compressing audio spectra. These methods ensure efficient storage and transmission by approximating the entropy bound, reducing file sizes while preserving essential data.

Machine learning leverages entropy and mutual information for tasks involving uncertainty and dependency. In decision trees, such as those built by the ID3 algorithm, Shannon entropy measures the impurity or unpredictability of class labels in subsets, guiding splits that maximize information gain to build predictive models. Mutual information, quantifying shared information between variables, is widely used in feature selection to identify relevant inputs that reduce redundancy and improve model performance, as formalized in information-theoretic frameworks.

Cryptography draws on information-theoretic principles to define absolute security bounds. Building on concepts from the 1948 paper, Shannon's 1949 analysis showed that the one-time pad achieves perfect secrecy—where the ciphertext reveals no information about the plaintext—if the key is as long as the message, truly random, and used only once. This establishes the theoretical limit for unconditional security, influencing modern designs that seek to approach these bounds under computational constraints.

In neuroscience, Shannon entropy models the efficiency of neural coding by estimating the information capacity of spike trains. Researchers apply it to quantify variability in neuronal firing, revealing how sensory stimuli are encoded with minimal redundancy in sensory systems. In bioinformatics, entropy assesses genetic information content, such as measuring diversity in allele distributions or the uncertainty in codon assignments within the genetic code. These applications highlight entropy's role in understanding information flow from molecular sequences to population-level diversity. A sketch of the entropy-based decision-tree criterion follows below.
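To make the decision-tree use of entropy concrete, the following sketch computes the information gain of a candidate split, the quantity maximized by ID3-style algorithms; the toy labels and split are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy of the parent node minus the size-weighted entropy of the child groups."""
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)

# Hypothetical dataset: class labels at a node, and the two groups a candidate split produces.
parent = ["yes", "yes", "yes", "no", "no", "no", "yes", "no"]
split = [["yes", "yes", "yes", "yes"], ["no", "no", "no", "no"]]
print(information_gain(parent, split))   # 1.0 bit: this split removes all class uncertainty
```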
