Expected value
In probability theory, the expected value (also known as the mathematical expectation, expectation, or simply the mean) of a random variable is a measure of central tendency that represents the long-run average value of the random variable over infinitely many independent repetitions of the associated experiment. For a discrete random variable X taking values x_i with probabilities p_i, the expected value is E[X] = \sum x_i p_i; for a continuous random variable with probability density function f(x), it is E[X] = \int_{-\infty}^{\infty} x f(x) \, dx.[1] This concept quantifies the "average" outcome weighted by the likelihood of each possibility, distinguishing it from the most probable value, and serves as a cornerstone for understanding distributions in statistics.

The concept of expected value originated in the 17th century from analyses of games of chance, with Christiaan Huygens introducing it in 1657 in his treatise De Ratiociniis in Ludo Aleae to compute fair divisions in interrupted games; it was later formalized by Abraham de Moivre in 1718 and advanced by Pierre-Simon Laplace in 1814.[2]

Key properties of expected value underpin its utility across disciplines, with linearity of expectation being particularly notable: for any random variables R_1 and R_2 and constants a_1, a_2, E[a_1 R_1 + a_2 R_2] = a_1 E[R_1] + a_2 E[R_2], holding even without independence between the variables.[3] This property enables efficient computations in complex scenarios, such as using indicator random variables, where E[I_A] = P(A) for an event A.[3]

In statistics, expected value defines the population mean \mu, guiding hypothesis testing and confidence intervals; in economics and finance, it informs risk assessment by calculating weighted averages of potential profits and costs, as in net present value analyses for investments where outcomes are probabilistic.[4] For instance, in evaluating a drilling project, expected value aggregates probabilities of dry holes (70%) versus successful yields (30%) to determine long-term viability, potentially yielding positive returns like $425,000 on average despite variability.[4]

Beyond these core applications, expected value extends to decision theory and optimization, where it maximizes utility under uncertainty, as in expected utility theory for rational choice.[5] It also appears in the analysis of algorithms, such as the coupon collector problem, where the expected number of trials to gather all n types is n H_n (with H_n the n-th harmonic number), approximately n \ln n + \gamma n for large n, illustrating its role in computational complexity.[3] Overall, expected value remains indispensable for modeling uncertainty, from insurance pricing to expectations computed in machine learning, always balancing probability against payoff.[4]
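As an informal check of the coupon collector figure above, the following sketch (a hypothetical Python example, not drawn from the cited sources; the helper coupons_needed is invented for illustration) compares a simulated average number of trials with n H_n:

```python
import random

def coupons_needed(n):
    """Draw coupons uniformly at random until all n types have been seen."""
    seen, trials = set(), 0
    while len(seen) < n:
        seen.add(random.randrange(n))
        trials += 1
    return trials

n, runs = 20, 10_000
empirical = sum(coupons_needed(n) for _ in range(runs)) / runs
harmonic = sum(1 / k for k in range(1, n + 1))  # H_n

print(f"empirical mean trials: {empirical:.2f}")
print(f"n * H_n:               {n * harmonic:.2f}")  # ~ n ln n + gamma * n
```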
History and Etymology
Historical Development
The concept of expected value emerged in the mid-17th century amid efforts to resolve disputes in gambling, particularly through the correspondence between Blaise Pascal and Pierre de Fermat in 1654. Prompted by the Chevalier de Méré, they addressed the "problem of points," which involved fairly dividing stakes in an interrupted game of chance, such as dice or cards, based on the probabilities of completing the game. Their exchange, preserved in letters, laid foundational principles for calculating fair shares proportional to winning chances, marking the inception of systematic probability reasoning applied to expectations in games.[6]

Building on this, Christiaan Huygens formalized the idea in his 1657 treatise De Ratiociniis in Ludo Aleae, the first published work on probability theory. Huygens introduced mathematical expectation as the value a player could reasonably anticipate from a game, using it to analyze fair divisions and advantages in various chance scenarios, such as lotteries and dice rolls. His propositions equated expectation to the weighted average of possible outcomes, providing a practical tool for gamblers and establishing expectation as a core probabilistic concept.[7]

Jacob Bernoulli advanced the notion significantly in his posthumously published 1713 work Ars Conjectandi, extending expectations beyond simple games to broader combinatorial outcomes and moral certainty. Bernoulli demonstrated how repeated trials converge to the expected value, introducing the law of large numbers as a theorem justifying the reliability of expectations in empirical settings. His analysis connected expectations to binomial expansions, influencing applications in annuities and demographics.[8]

Abraham de Moivre further refined these ideas in his 1718 book The Doctrine of Chances, where he developed approximations linking expectations to the binomial distribution for large numbers of trials. De Moivre's methods allowed estimation of expected outcomes in complex scenarios, bridging combinatorial probability with continuous approximations and enhancing the precision of expectation calculations in insurance and gaming.[9]

The modern rigorous framework for expected value was established by Andrey Kolmogorov in his 1933 monograph Grundbegriffe der Wahrscheinlichkeitsrechnung, which axiomatized probability theory using measure theory. Kolmogorov integrated expectation as the Lebesgue integral of a random variable over the probability space, unifying discrete and continuous cases within a general abstract setting and enabling its application across mathematics and sciences.[10]
Etymology
The term "expectation" in probability theory originated in the 17th century, deriving from the Latin expectatio, which was introduced in Frans van Schooten's 1657 Latin translation of Christiaan Huygens' treatise De ratiociniis in ludo aleae. This work, based on Huygens' unpublished Dutch manuscript Van Rekeningh in Spelen van Gluck (1656), addressed problems in games of chance, where the concept denoted the anticipated monetary gain a player could reasonably foresee from fair play. The Latin root exspectatio, from the verb exspectare meaning "to look out for" or "to await," aligned with the gambling context of awaiting outcomes, emphasizing a balanced anticipation rather than mere hope.[11][12]

In French, the parallel term espérance mathématique ("mathematical hope" or "mathematical expectation") first appeared in a letter by Gabriel Cramer dated May 21, 1728, marking its initial documented use with the modern probabilistic meaning. This phrasing influenced subsequent works, including Pierre-Simon Laplace's adoption of espérance in Théorie analytique des probabilités (1812), where it signified the weighted average outcome. Meanwhile, in German mathematical literature, Erwartungswert ("expected value") emerged as an equivalent, with roots traceable to early 18th-century translations; for instance, Jakob Bernoulli employed related Latin expressions like valor expectationis (value of expectation) in Ars Conjectandi (1713) to describe anticipated gains, and occasionally mediocris to denote the mean or average value in probabilistic calculations.[11][13][14]

The English usage evolved further in the 19th century, with Augustus De Morgan coining "mathematical expectation" in An Essay on Probabilities (1838) to formalize the numerical aspect of the concept. By the 20th century, "expected value" supplanted "expectation" in many English texts to underscore its role as a precise average, avoiding connotations of subjective anticipation; this shift is evident in works like Arne Fisher's The Mathematical Theory of Probabilities (1915), which used the term to highlight the mean of a random variable's distribution.[11]
Notations and Terminology
Standard Notations
The standard notation for the expected value of a random variable X is E[X], where E stands for expectation. Alternative notations include \mathcal{E}(X) or \mathbb{E}[X], the latter often using blackboard bold to distinguish it in printed texts. The integral form \int x \, dF(x) represents the expected value in terms of the cumulative distribution function F.[15] For conditional expectation, the notation E[X \mid Y] is commonly used, indicating the expected value of X given the random variable Y. In statistics, the expected value of a random variable is frequently denoted by \mu, representing the population mean.[16] For multiple random variables, the expectation of the product XY is written E[XY] (occasionally E[X, Y]).[17]
Related Concepts
Variance serves as a fundamental measure of the dispersion or spread of a random variable's values around its expected value, quantifying the average squared deviation from the mean. Formally, for a random variable X, the variance is defined as \operatorname{Var}(X) = E[(X - E[X])^2], which captures the second central moment of the distribution.[18] This concept highlights how expected value acts as the central tendency from which variability is assessed, with higher variance indicating greater unpredictability in outcomes relative to the mean.[19]

Covariance extends this idea to pairs of random variables, measuring the joint variability between them by assessing how deviations from their respective expected values tend to align. It is defined as \operatorname{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])] for random variables X and Y, where positive values suggest that above-average occurrences in one variable correspond with above-average occurrences in the other, indicating positive association.[20] Conceptually, covariance links the expected values of X and Y to their shared fluctuations, providing insight into dependence without assuming linearity.[21]

The moment-generating function (MGF) of a random variable X, denoted M_X(t) = E[e^{tX}], encapsulates all moments of the distribution, with the expected value E[X] corresponding to the first moment, obtained by differentiating the MGF and evaluating at t = 0. This relation underscores expected value as the foundational moment from which higher-order moments like variance derive.[22] In essence, the MGF provides a generating tool where expected value emerges as the first derivative, facilitating analysis of distributional properties.[23]

In statistics, the sample mean represents an empirical average computed from observed data, serving as an estimator of the theoretical expected value, which is the population parameter defined probabilistically. While the sample mean varies with each realization of the data, the expected value remains fixed as the long-term average under repeated sampling.[24] This distinction emphasizes that expected value is an intrinsic property of the random variable's distribution, whereas the sample mean approximates it through finite observations.[25]

The law of large numbers conceptually ties these ideas together by stating that, under suitable conditions, the sample mean converges to the expected value as the number of independent observations increases, justifying the use of empirical averages to infer theoretical expectations. This convergence, often in probability or almost surely, illustrates how repeated sampling diminishes the influence of variability around the expected value.[26] Thus, it bridges the gap between the abstract expected value and practical statistical inference.[27]
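A minimal sketch (hypothetical Python using only NumPy, not from the cited sources) ties these concepts together: it estimates E[X] by a sample mean, the variance as the mean squared deviation, and E[X] again from a central finite difference of the empirical MGF at t = 0:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1_000_000)  # E[X] = 2, Var(X) = 4

sample_mean = x.mean()                         # estimates E[X]
sample_var = ((x - sample_mean) ** 2).mean()   # estimates E[(X - E[X])^2]

def mgf(t):
    """Empirical moment-generating function E[e^{tX}] over the sample."""
    return np.exp(t * x).mean()

h = 1e-4
mgf_mean = (mgf(h) - mgf(-h)) / (2 * h)        # M'(0) recovers E[X]

print(sample_mean, sample_var, mgf_mean)       # ~2, ~4, ~2
```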
Core Definitions
Finite Discrete Random Variables
A finite discrete random variable X takes on a finite number of distinct values x_1, x_2, \dots, x_n in the real numbers, each occurring with probability P(X = x_i) = p_i > 0, where the probability mass function satisfies \sum_{i=1}^n p_i = 1. The expected value E[X], also known as the mean or first moment, is defined as the sum E[X] = \sum_{i=1}^n x_i p_i. This formulation arises in the axiomatic foundations of probability, where the expectation captures the center of mass of the distribution under a discrete uniform measure scaled by probabilities.[28][29]

The expected value serves as a weighted average of the possible outcomes, with the probabilities p_i acting as weights that reflect their relative likelihoods; if all p_i = 1/n, it reduces to the arithmetic mean of the x_i. This interpretation aligns with the law of large numbers, indicating that the sample average from many independent repetitions of the experiment converges to E[X].[30]

For a fair six-sided die, where X denotes the face value shown and each outcome from 1 to 6 has probability 1/6, the expected value is E[X] = \sum_{k=1}^6 k \cdot \frac{1}{6} = \frac{21}{6} = 3.5. This result implies that, over many rolls, the average outcome approaches 3.5, even though no single roll yields this value.[28]

Consider a biased coin flip where X is the payoff: +5 for heads (with P(\text{heads}) = 0.6) and -5 for tails (with P(\text{tails}) = 0.4). The expected value is E[X] = 0.6 \cdot 5 + 0.4 \cdot (-5) = 3 - 2 = 1. In repeated plays, the average payoff would thus approach +1 per flip.[31]
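The two worked examples above can be checked mechanically. The sketch below (hypothetical Python; the helper expectation is invented for illustration) computes the exact weighted averages and a simulated long-run die average:

```python
from fractions import Fraction
import random

# Exact weighted averages for the two worked examples.
die = {k: Fraction(1, 6) for k in range(1, 7)}
coin = {5: Fraction(3, 5), -5: Fraction(2, 5)}  # +5 w.p. 0.6, -5 w.p. 0.4

def expectation(pmf):
    """E[X] = sum of value * probability over the support."""
    return sum(x * p for x, p in pmf.items())

print(expectation(die))   # 7/2 = 3.5
print(expectation(coin))  # 1

# Long-run average of simulated die rolls approaches 3.5.
rolls = [random.randint(1, 6) for _ in range(100_000)]
print(sum(rolls) / len(rolls))
```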
Countable Discrete Random Variables
For a countable discrete random variable X taking values in a countable set \{x_i : i \in \mathbb{Z}\}, the expected value is defined as E[X] = \sum_{i=-\infty}^{\infty} x_i P(X = x_i), provided the series converges absolutely, meaning \sum_{i=-\infty}^{\infty} |x_i| P(X = x_i) < \infty.[32] This absolute convergence ensures the sum is well-defined regardless of the enumeration of the support, distinguishing it from the finite case, where simple summation always applies without convergence concerns.[32] The expectation exists and is finite if and only if \sum |x_i| P(X = x_i) < \infty, which is equivalent to both the positive part \sum_{x_i > 0} x_i P(X = x_i) < \infty and the negative part \sum_{x_i < 0} |x_i| P(X = x_i) < \infty.[32]

A classic example is the geometric distribution, modeling the number of failures before the first success in independent Bernoulli trials with success probability p \in (0,1], where P(X = k) = p (1-p)^k for k = 0, 1, 2, \dots. Here, E[X] = \frac{1-p}{p}, and the series converges due to the exponential decay of probabilities.[33] Another is the Poisson distribution with parameter \lambda > 0, where P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!} for k = 0, 1, 2, \dots, yielding E[X] = \lambda, with convergence assured by the factorial growth in the denominator.[34]

A finite expectation may fail to exist for distributions with heavy tails, where probabilities decay too slowly, causing the series \sum |x_i| P(X = x_i) to diverge. For instance, consider P(X = n) = \frac{1}{n(n+1)} for n = 1, 2, \dots, which satisfies normalization but leads to E[X] = \sum_{n=1}^{\infty} \frac{n}{n(n+1)} = \sum_{n=1}^{\infty} \frac{1}{n+1} = \infty, so no finite expectation exists.[32]
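To see the convergence behavior described above, a small sketch (hypothetical Python; truncated_mean is an invented helper) approximates the geometric and Poisson means by partial sums and shows the divergence of the heavy-tailed example:

```python
import math

def truncated_mean(pmf, N):
    """Partial sum of sum_k k * p(k) for k = 0..N."""
    return sum(k * pmf(k) for k in range(N + 1))

p, lam = 0.3, 4.0
geom = lambda k: p * (1 - p) ** k                                # failures before first success
pois = lambda k: math.exp(-lam) * lam ** k / math.factorial(k)

print(truncated_mean(geom, 500), (1 - p) / p)  # both ~ 2.333
print(truncated_mean(pois, 100), lam)          # both ~ 4.0

# Heavy tail: p(n) = 1/(n(n+1)) normalizes, but sum n * p(n) grows like log N.
heavy = lambda n: 1 / (n * (n + 1))
for N in (10, 1_000, 100_000):
    print(N, sum(n * heavy(n) for n in range(1, N + 1)))  # diverges as N grows
```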
Continuous Random Variables
For a continuous random variable X with probability density function f(x), the expected value E[X] is defined as the Lebesgue integral E[X] = \int_{-\infty}^{\infty} x f(x) \, dx, provided the integral exists.[35][36] This requires that f(x) \geq 0 for all x, \int_{-\infty}^{\infty} f(x) \, dx = 1, and absolute convergence of the integral, i.e., \int_{-\infty}^{\infty} |x| f(x) \, dx < \infty.[35][36] Without absolute convergence, the expected value is undefined, even if the principal value exists.[36]

An equivalent expression for E[X] can be obtained using the cumulative distribution function F(x) = \int_{-\infty}^{x} f(t) \, dt, which facilitates computation in cases where differentiating the CDF to obtain f(x) is cumbersome: E[X] = \int_{0}^{\infty} [1 - F(x)] \, dx - \int_{-\infty}^{0} F(x) \, dx. This tail formula decomposes the expectation into contributions from the positive and negative parts of X, with each integral representing the expected contribution from the respective tail of the distribution.[37]

A classic example is the uniform distribution on the interval [a, b], where a < b and the density is f(x) = \frac{1}{b-a} for x \in [a, b] and 0 otherwise. The expected value is E[X] = \int_{a}^{b} x \cdot \frac{1}{b-a} \, dx = \frac{a + b}{2}, the midpoint of the interval, reflecting the symmetry of the distribution.[35]

For the exponential distribution with rate parameter \lambda > 0, the density is f(x) = \lambda e^{-\lambda x} for x \geq 0 and 0 otherwise. The expected value is E[X] = \int_{0}^{\infty} x \lambda e^{-\lambda x} \, dx = \frac{1}{\lambda}, which corresponds to the mean waiting time in a Poisson process with rate \lambda.[35][38] Using the tail formula, since F(x) = 1 - e^{-\lambda x} for x \geq 0, it simplifies to \int_{0}^{\infty} e^{-\lambda x} \, dx = \frac{1}{\lambda}, confirming the result without direct integration against the density.[37]
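Both routes to E[X] for the exponential distribution can be checked numerically. The sketch below (hypothetical Python, using scipy.integrate.quad) evaluates the density-based definition and the tail formula:

```python
import numpy as np
from scipy.integrate import quad

lam = 0.5  # exponential rate; E[X] should be 1/lam = 2

# Direct definition: integral of x * f(x) over [0, inf).
density_mean, _ = quad(lambda x: x * lam * np.exp(-lam * x), 0, np.inf)

# Tail formula: integral of P(X > x) = 1 - F(x) = e^{-lam x} over [0, inf).
tail_mean, _ = quad(lambda x: np.exp(-lam * x), 0, np.inf)

print(density_mean, tail_mean)  # both ~ 2.0
```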
Advanced Definitions
General Real-Valued Random Variables
In measure-theoretic probability, the expected value of a real-valued random variable X: \Omega \to \mathbb{R} defined on a probability space (\Omega, \mathcal{F}, P) is given by the Lebesgue integral E[X] = \int_{\Omega} X(\omega) \, dP(\omega), provided this integral exists.[39] This definition is equivalent to the integral with respect to the cumulative distribution function F_X of X, E[X] = \int_{-\infty}^{\infty} x \, dF_X(x), where the integral is understood in the Riemann–Stieltjes sense.[40][41]

The expected value E[X] exists and is finite if and only if E[|X|] < \infty, where E[|X|] = \int_{\Omega} |X(\omega)| \, dP(\omega). If E[X^+] = \infty while E[X^-] < \infty (or vice versa), where X^+ = \max(X, 0) and X^- = -\min(X, 0), then E[X] may be defined as +\infty (respectively -\infty), although the absolute expectation E[|X|] is then infinite.

This measure-theoretic formulation unifies the cases of discrete and continuous random variables: for discrete X taking values in a countable set, the expectation reduces to an integral with respect to the counting measure on that set, recovering the summation form; for continuous X, it corresponds to integration with respect to Lebesgue measure weighted by the density (when it exists).[39]

As an illustration, consider a general Bernoulli random variable X on (\Omega, \mathcal{F}, P) such that X(\omega) = 1 if \omega \in A \in \mathcal{F} with P(A) = p \in [0,1] and X(\omega) = 0 otherwise. Then E[X] = \int_{\Omega} X(\omega) \, dP(\omega) = 1 \cdot P(A) + 0 \cdot P(A^c) = p, and E[|X|] = p < \infty.
Infinite Expected Values
In probability theory, the expected value E[X] of a real-valued random variable X is defined as E[X^+] - E[X^-], where X^+ = \max(X, 0) and X^- = -\min(X, 0) are the positive and negative parts, respectively.[42] If E[X^+] = +\infty and E[X^-] < \infty, then E[X] = +\infty; similarly, E[X] = -\infty if E[X^-] = +\infty and E[X^+] < \infty.[42] The expectation is undefined if both E[X^+] = +\infty and E[X^-] = +\infty.[42]

A classic illustration of an infinite expected value is the St. Petersburg paradox, first posed by Nicolaus Bernoulli in a 1713 letter and later analyzed by Daniel Bernoulli in 1738.[43] In this game, a fair coin is flipped until the first heads appears on the k-th trial, yielding a payoff of 2^k units; the expected value is \sum_{k=1}^\infty 2^k \cdot (1/2)^k = \sum_{k=1}^\infty 1 = +\infty.[43] Despite this infinite expectation, rational agents typically value the game at only a finite amount, often due to considerations of utility or risk aversion rather than the raw expectation.[43]

Examples of distributions with infinite or undefined expectations include the Cauchy distribution and certain Pareto distributions. For the standard Cauchy distribution with probability density function f(x) = \frac{1}{\pi(1 + x^2)} for x \in \mathbb{R}, the expectation is undefined because both \int_{-\infty}^0 |x| f(x) \, dx = +\infty and \int_0^\infty x f(x) \, dx = +\infty. Similarly, for a Pareto distribution with shape parameter \alpha \leq 1 and minimum value x_m > 0, the density is f(x) = \frac{\alpha x_m^\alpha}{x^{\alpha+1}} for x \geq x_m, and the expectation E[X] = +\infty since the integral \int_{x_m}^\infty x f(x) \, dx diverges.

Such infinite expectations have significant implications, particularly in limit theorems and applications. For instance, the strong law of large numbers fails to converge to a finite limit when the expectation is infinite; for nonnegative random variables with E[X] = +\infty, the sample average \bar{X}_n satisfies \bar{X}_n \to +\infty almost surely as n \to \infty.[42] In finance and risk management, distributions with infinite means, such as heavy-tailed Pareto models for losses or returns, challenge traditional risk measures like value-at-risk, as extreme events dominate and standard averaging breaks down, necessitating alternative approaches like tail-dependence analysis or infinite-mean estimators.[44]
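Simulation makes the failure of averaging vivid. In the sketch below (hypothetical Python with NumPy, not from the cited sources), sample means of St. Petersburg payoffs keep growing with the sample size, and Cauchy sample means refuse to settle:

```python
import numpy as np

rng = np.random.default_rng(1)

def st_petersburg(n):
    """Payoff 2^k, where k is the flip index of the first heads."""
    k = rng.geometric(0.5, size=n)  # flips until first heads (k >= 1)
    return 2.0 ** k

for n in (10**3, 10**5, 10**7):
    print(n, st_petersburg(n).mean())  # grows roughly like log2(n), never settles

# Cauchy sample means do not converge either: E[X] is undefined.
for n in (10**3, 10**5, 10**7):
    print(n, rng.standard_cauchy(n).mean())
```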
Properties
Basic Properties
The expected value, often denoted E[X] for a random variable X, possesses several fundamental algebraic properties that underpin its utility in probability theory. These properties hold under minimal assumptions, such as the finiteness of the expected value, and apply to both discrete and continuous random variables. They are derived directly from the definitions of expected value as a sum or integral, leveraging the linearity of summation and integration.

One of the most essential properties is linearity of expectation, which states that for any constants a and b and random variables X and Y (which may be dependent or independent), E[aX + bY] = a E[X] + b E[Y]. This holds regardless of the joint distribution of X and Y, making it particularly powerful for computations involving sums of random variables. The proof follows from the definition: in the discrete case, E[aX + bY] = \sum (a x_i + b y_j) P(X = x_i, Y = y_j) = a \sum x_i P(X = x_i, Y = y_j) + b \sum y_j P(X = x_i, Y = y_j) = a E[X] + b E[Y], using the linearity of sums; a similar argument applies to integrals in the continuous case.

Another basic property is monotonicity: if X \leq Y almost surely (i.e., with probability 1) and both expected values are finite, then E[X] \leq E[Y]. This follows by applying linearity to E[Y - X] = E[Y] - E[X] and noting that Y - X \geq 0 almost surely, which implies E[Y - X] \geq 0 (see non-negativity below). As a proof sketch, in the discrete case the sum \sum (y_j - x_i) P(X = x_i, Y = y_j) \geq 0 since each term is non-negative; integration yields the continuous analog.

Non-negativity asserts that if X \geq 0 almost surely, then E[X] \geq 0 (assuming finiteness). The proof is immediate from the definition, as the sum or integral of non-negative terms weighted by non-negative probabilities cannot be negative. This property extends naturally to the expected value of a constant: for any constant c, E[c] = c, since the random variable equals c with probability 1 and the sum or integral simplifies directly to c.

A useful consequence arises with indicator random variables. For an event A, the indicator 1_A (which equals 1 if A occurs and 0 otherwise) has E[1_A] = P(A), directly from the definition since E[1_A] = 1 \cdot P(A) + 0 \cdot (1 - P(A)) = P(A). This connection highlights how expected value generalizes probability measures.
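The two properties most used in practice, linearity under dependence and E[1_A] = P(A), can be illustrated numerically. A minimal sketch (hypothetical Python with NumPy; the variables are chosen deliberately dependent):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=1_000_000)
y = x ** 2 + rng.normal(size=1_000_000)  # deliberately dependent on x

a, b = 3.0, -2.0
lhs = (a * x + b * y).mean()
rhs = a * x.mean() + b * y.mean()
print(lhs, rhs)  # agree up to sampling noise despite the dependence

# Indicator variable: E[1_A] = P(A), here with A = {X > 1}.
indicator = (x > 1).astype(float)
print(indicator.mean())  # ~ P(X > 1) = 0.1587 for a standard normal
```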
Inequalities
Markov's inequality is a fundamental result in probability theory that bounds the tail probability of a non-negative random variable using its expected value. For a non-negative random variable X with finite expectation and any a > 0, P(X \geq a) \leq \frac{E[X]}{a}. This inequality holds under the assumption that E[X] < \infty, and it applies to both discrete and continuous random variables. The proof relies on the integral representation of the expectation for non-negative X: E[X] = \int_0^\infty P(X \geq t) \, dt. Since P(X \geq t) is non-increasing, E[X] \geq \int_0^a P(X \geq t) \, dt \geq a \cdot P(X \geq a), leading directly to the bound. For non-negative integer-valued X and integer a, a similar summation argument uses E[X] = \sum_{k=1}^\infty P(X \geq k) \geq a \cdot P(X \geq a). Equality holds if P(X = 0) + P(X = a) = 1.

Chebyshev's inequality extends Markov's result to bound deviations from the mean using the variance. For a random variable X with finite mean \mu = E[X] and variance \sigma^2 = \operatorname{Var}(X) < \infty, and for any k > 0, P(|X - \mu| \geq k \sigma) \leq \frac{1}{k^2}. This assumes the existence of the second moment E[X^2] < \infty, and the bound is distribution-free. The proof follows by applying Markov's inequality to the non-negative random variable Y = (X - \mu)^2: P(|X - \mu| \geq k \sigma) = P(Y \geq k^2 \sigma^2) \leq E[Y] / (k^2 \sigma^2) = \sigma^2 / (k^2 \sigma^2) = 1/k^2. For k > 1, equality is attained by the three-point distribution with P(X = \mu \pm k\sigma) = \frac{1}{2k^2} and P(X = \mu) = 1 - \frac{1}{k^2}.

Jensen's inequality relates the expected value of a function to the function of the expected value for convex functions. If \phi is a convex function and X is a random variable with finite expectation E[X] < \infty, then \phi(E[X]) \leq E[\phi(X)], provided E[|\phi(X)|] < \infty. For concave \phi, the inequality reverses. The standard proof uses a supporting line at E[X]: convexity guarantees a constant c with \phi(x) \geq \phi(E[X]) + c\,(x - E[X]) for all x, and taking expectations of both sides yields E[\phi(X)] \geq \phi(E[X]). For twice-differentiable \phi, non-negativity of \phi'' implies convexity and supports the result via Taylor expansion. Equality holds if \phi is affine on the support of X or if X is constant almost surely.

Hölder's inequality generalizes the Cauchy–Schwarz inequality to bound the expectation of products using conjugate exponents. For random variables X and Y with finite moments E[|X|^p] < \infty and E[|Y|^q] < \infty, where p > 1 and q = p/(p-1) (so 1/p + 1/q = 1), |E[XY]| \leq \left( E[|X|^p] \right)^{1/p} \left( E[|Y|^q] \right)^{1/q}. The case p = q = 2 recovers Cauchy–Schwarz. The proof employs Young's inequality for products: for non-negative a and b, ab \leq \frac{a^p}{p} + \frac{b^q}{q}. Applying this with a = |X| / (E[|X|^p])^{1/p} and b = |Y| / (E[|Y|^q])^{1/q} and taking expectations gives E[ab] \leq \frac{1}{p} + \frac{1}{q} = 1, which rearranges to the stated bound. Equality holds when |X|^p and |Y|^q are proportional almost surely.
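Markov's and Chebyshev's bounds can be compared against empirical tail frequencies. A short sketch (hypothetical Python with NumPy, using an exponential sample with mean and variance equal to 1):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.exponential(scale=1.0, size=1_000_000)  # E[X] = 1, Var(X) = 1

# Markov: P(X >= a) <= E[X] / a for a > 0.
for a in (2.0, 5.0, 10.0):
    print(a, (x >= a).mean(), "<=", x.mean() / a)

# Chebyshev: P(|X - mu| >= k * sigma) <= 1 / k^2.
mu, sigma = x.mean(), x.std()
for k in (2.0, 3.0):
    print(k, (np.abs(x - mu) >= k * sigma).mean(), "<=", 1 / k**2)
```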
Convergence and Limits
In probability theory, the expected value of a sequence of random variables does not necessarily converge to the expected value of the limit under mere pointwise or probabilistic convergence, necessitating specific conditions to interchange limits and expectations. These conditions arise from measure-theoretic foundations and ensure the preservation of integrability and the validity of limit operations on expectations.

The monotone convergence theorem provides one such condition for non-negative sequences. Specifically, if (X_n)_{n=1}^\infty is a sequence of non-negative random variables such that X_n \uparrow X almost surely (i.e., 0 \leq X_1(\omega) \leq X_2(\omega) \leq \cdots \leq X(\omega) for almost all \omega), then \mathbb{E}[X_n] \uparrow \mathbb{E}[X].[45] This theorem guarantees that the expectations increase monotonically to the expectation of the limit, allowing the interchange of limit and expectation under monotonicity.

A more general result is the dominated convergence theorem, which relaxes the monotonicity requirement at the cost of an integrability bound. If X_n \to X almost surely, and there exists a random variable Y with \mathbb{E}[|Y|] < \infty such that |X_n| \leq Y almost surely for all n, then \mathbb{E}[X_n] \to \mathbb{E}[X] and \mathbb{E}[|X_n - X|] \to 0.[45] In probabilistic terms, the almost sure convergence can be weakened to convergence in probability under the same domination condition. This theorem is pivotal for establishing convergence of expectations in settings where sequences are bounded by an integrable dominator, such as in stochastic processes or limit theorems.

Even without domination or monotonicity, uniform integrability offers a sufficient condition for interchanging limits and expectations. A sequence (X_n) is uniformly integrable if \lim_{c \to \infty} \sup_n \mathbb{E}[|X_n| \mathbf{1}_{|X_n| \geq c}] = 0. If X_n \to X almost surely, \mathbb{E}[|X_n|] < \infty for all n, and (X_n) is uniformly integrable, then \mathbb{E}[X] < \infty and \mathbb{E}[X_n] \to \mathbb{E}[X].[46] Uniform integrability controls the contribution of large tails uniformly across the sequence, ensuring L¹ convergence and thus the desired limit for expectations; it is equivalent to the condition that \mathbb{E}[|X_n - X|] \to 0 under almost sure convergence.[46]

Fatou's lemma provides an inequality rather than an equality, serving as a foundational tool for proving the above theorems. For a sequence of non-negative random variables X_n \geq 0, it states that \mathbb{E}[\liminf_{n \to \infty} X_n] \leq \liminf_{n \to \infty} \mathbb{E}[X_n].[45] This lower semicontinuity of the expectation functional holds without additional assumptions beyond non-negativity, bounding the expectation of the limit inferior by the limit inferior of the expectations.

Convergence in probability alone does not suffice to preserve expectations, as illustrated by counterexamples where the mass of the distribution "escapes" to infinity. Consider a uniform random variable U on [0,1], and define X_n = n if U \leq 1/n and X_n = 0 otherwise. Then X_n \to 0 in probability, since \mathbb{P}(|X_n| > \epsilon) = 1/n \to 0 for any \epsilon > 0, but \mathbb{E}[X_n] = n \cdot (1/n) = 1 \not\to 0.[47] This "spiking" or "moving bump" phenomenon highlights the need for tail control, as the rare but large values prevent expectation convergence despite probabilistic convergence to zero.
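The counterexample above is easy to reproduce. In the sketch below (hypothetical Python with NumPy), the event {X_n ≠ 0} becomes rare while the sample mean of X_n stays near 1:

```python
import numpy as np

rng = np.random.default_rng(4)
u = rng.uniform(size=1_000_000)

# X_n = n on {U <= 1/n}, else 0: X_n -> 0 in probability, yet E[X_n] = 1.
for n in (10, 1_000, 100_000):
    x_n = np.where(u <= 1 / n, float(n), 0.0)
    # P(X_n != 0) shrinks like 1/n, but the mean hovers near 1 (noisily for large n).
    print(n, (x_n > 0).mean(), x_n.mean())
```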
Expected Values of Distributions
Discrete Distributions
The expected value of a discrete random variable X with probability mass function p(x) is given by E[X] = \sum_x x \, p(x), where the sum is over the support of X. For the Bernoulli distribution, X takes values 0 or 1 with success probability p, so the PMF is p(0) = 1 - p and p(1) = p. The expected value is E[X] = 0 \cdot (1 - p) + 1 \cdot p = p.[48]

The binomial distribution models the number of successes in n independent Bernoulli trials, each with success probability p. The PMF is p(x) = \binom{n}{x} p^x (1 - p)^{n - x} for x = 0, 1, \dots, n. The expected value follows from the linearity of expectation applied to the sum of n indicator variables, yielding E[X] = np.[48]

The negative binomial distribution counts the number of failures before the r-th success in independent Bernoulli trials with success probability p. The PMF is p(x) = \binom{x + r - 1}{x} p^r (1 - p)^x for x = 0, 1, 2, \dots. The expected value is E[X] = r(1 - p)/p, derived by viewing X as the sum of r independent geometric random variables, each counting failures before a success.[49]

The Poisson distribution with parameter \lambda > 0 models the number of events in a fixed interval, with PMF p(k) = \frac{\lambda^k e^{-\lambda}}{k!} for k = 0, 1, 2, \dots. The expected value is E[X] = \sum_{k=0}^\infty k \frac{\lambda^k e^{-\lambda}}{k!} = \lambda e^{-\lambda} \sum_{k=1}^\infty \frac{\lambda^{k-1}}{(k-1)!} = \lambda, recognizing the remaining sum as e^\lambda.[50]

The geometric distribution, in the convention of trials until the first success, has PMF p(x) = (1 - p)^{x-1} p for x = 1, 2, 3, \dots, where p is the success probability. The expected value is E[X] = \sum_{x=1}^\infty x (1 - p)^{x-1} p = \frac{1}{p}, obtained by differentiating the geometric series \sum_{x=0}^\infty q^x = \frac{1}{1 - q} at q = 1 - p.[51]

These expected values are summarized in the table below, followed by a brief numerical check.

| Distribution | Parameters | Expected Value E[X] |
|---|---|---|
| Bernoulli | p \in (0,1) | p |
| Binomial | n \in \mathbb{N}, p \in (0,1) | np |
| Negative Binomial | r \in \mathbb{N}, p \in (0,1) | r(1-p)/p |
| Poisson | \lambda > 0 | \lambda |
| Geometric | p \in (0,1) | 1/p |
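As a cross-check of the table, the sketch below (hypothetical Python using scipy.stats; note that scipy's geom follows the trials-until-success convention and nbinom counts failures before the r-th success, matching the text) prints each library mean next to the closed form:

```python
from scipy import stats

n, p, r, lam = 10, 0.3, 4, 2.5

print(stats.bernoulli.mean(p), p)                       # p
print(stats.binom.mean(n, p), n * p)                    # np
print(stats.nbinom.mean(r, p), r * (1 - p) / p)         # failures before r-th success
print(stats.poisson.mean(lam), lam)                     # lambda
print(stats.geom.mean(p), 1 / p)                        # trials until first success
```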
Continuous Distributions
For continuous random variables, the expected value is defined as the integral of the product of the variable and its probability density function (pdf) over the entire real line, provided the integral converges: E[X] = \int_{-\infty}^{\infty} x f(x) \, dx, where f(x) is the pdf.[52] This contrasts with discrete cases by replacing summation with integration, providing the long-run average value under the distribution.[52]

Common continuous distributions have closed-form expected values derived through direct integration. For the uniform distribution on [a, b] with pdf f(x) = \frac{1}{b-a} for a \leq x \leq b (and 0 otherwise), the expected value is E[X] = \int_a^b x \cdot \frac{1}{b-a} \, dx = \frac{a + b}{2}.[52] For the exponential distribution with rate parameter \lambda > 0 and pdf f(x) = \lambda e^{-\lambda x} for x \geq 0 (and 0 otherwise), integration by parts yields E[X] = \int_0^{\infty} x \lambda e^{-\lambda x} \, dx = \frac{1}{\lambda}.[53]

The normal distribution N(\mu, \sigma^2), with pdf f(x) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right), has expected value E[X] = \mu, as the mean parameter directly locates the distribution's center, verifiable by symmetry or completing the square in the integral.[54] For the gamma distribution with shape \alpha > 0 and scale \beta > 0, pdf f(x) = \frac{x^{\alpha-1} e^{-x/\beta}}{\beta^\alpha \Gamma(\alpha)} for x > 0 (and 0 otherwise), the expected value integrates to E[X] = \alpha \beta.[55] Similarly, the beta distribution on [0, 1] with shape parameters \alpha > 0 and \beta > 0, pdf f(x) = \frac{x^{\alpha-1} (1-x)^{\beta-1}}{B(\alpha, \beta)} where B is the beta function, gives E[X] = \frac{\alpha}{\alpha + \beta} via beta function properties in the integral.[56]

The following table summarizes the parameters and expected values for these distributions; a numerical check follows the table.

| Distribution | Parameters | Expected Value E[X] |
|---|---|---|
| Uniform | a, b (a < b) | \frac{a + b}{2} |
| Exponential | \lambda > 0 | \frac{1}{\lambda} |
| Normal | \mu, \sigma^2 > 0 | \mu |
| Gamma | \alpha > 0, \beta > 0 | \alpha \beta |
| Beta | \alpha > 0, \beta > 0 | \frac{\alpha}{\alpha + \beta} |
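A similar cross-check for the continuous table (hypothetical Python using scipy.stats; note that scipy parameterizes the exponential by scale = 1/\lambda and the gamma by shape and scale):

```python
from scipy import stats

a, b = 1.0, 5.0
lam = 0.5
mu, sigma = 2.0, 3.0
alpha, beta = 2.0, 3.0

print(stats.uniform.mean(loc=a, scale=b - a), (a + b) / 2)      # midpoint
print(stats.expon.mean(scale=1 / lam), 1 / lam)                 # 1/lambda
print(stats.norm.mean(loc=mu, scale=sigma), mu)                 # mu
print(stats.gamma.mean(alpha, scale=beta), alpha * beta)        # shape * scale
print(stats.beta.mean(alpha, beta), alpha / (alpha + beta))
```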
Computation and Extensions
Numerical Computation
When closed-form expressions for the expected value of a random variable are unavailable or computationally intractable, numerical methods provide approximations by leveraging sampling, integration techniques, or series approximations. These approaches are essential in fields like finance, physics, and machine learning, where distributions may be complex or high-dimensional.[57]

Monte Carlo simulation offers a straightforward way to estimate the expected value by generating independent samples from the underlying distribution. For a random variable X with distribution F, the estimator is the sample mean \hat{\mu} = \frac{1}{n} \sum_{i=1}^n x_i, where the x_i are drawn from F; it converges to E[X] as n \to \infty by the law of large numbers. This method is unbiased and widely used for its simplicity in multidimensional settings.[57]

Importance sampling enhances Monte Carlo estimation, particularly for rare events or expectations involving heavy-tailed distributions, by drawing samples from a proposal distribution g that is easier to sample from and reweighting them to match the target distribution f. The estimator becomes \hat{\mu} = \frac{1}{n} \sum_{i=1}^n x_i \frac{f(x_i)}{g(x_i)}, reducing variance when g approximates the behavior of f in the regions of interest. This technique, rooted in variance reduction strategies, is crucial for efficient computation in risk analysis and particle physics simulations.

For continuous random variables where the density f(x) is known, numerical integration approximates E[X] = \int x f(x) \, dx using quadrature rules. Methods like Simpson's rule divide the integration domain into subintervals and apply polynomial approximations, yielding high accuracy for smooth functions with error scaling as O(h^4), where h is the step size. Gaussian quadrature, which chooses optimal nodes and weights, is particularly effective for expectations over finite intervals, and is exact for polynomials up to degree 2n - 1 with n points. These techniques are implemented in standard libraries for reliable one-dimensional computations.[58]

In discrete cases with countable support, the expected value is an infinite sum E[X] = \sum_{k=1}^\infty k p_k, which can be approximated by truncating at a finite N such that the tail \sum_{k=N+1}^\infty k p_k is bounded below a tolerance. Error bounds rely on tail estimates, such as geometric decay if probabilities decrease exponentially, ensuring the remainder is less than \epsilon with a controlled truncation level N. Adaptive strategies adjust N dynamically based on partial sums to balance accuracy and efficiency.

Software tools facilitate these computations in practice. In Python, libraries like NumPy provide functions such as numpy.mean for Monte Carlo sample averages, and SciPy offers scipy.integrate.quad for quadrature-based expectations. Similarly, R includes mean for simulations and integrate for numerical integration in its base distribution, with extensions like mc2d for advanced Monte Carlo variance reduction. These implementations handle large-scale approximations efficiently without requiring custom code.
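As an illustration of plain Monte Carlo versus importance sampling, the sketch below (hypothetical Python; the target E[X \cdot 1\{X > 4\}] for a standard normal X and the shifted-exponential proposal are invented for illustration) reweights tail samples and compares both estimates against a quadrature reference:

```python
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(5)

# Target: E[X * 1{X > 4}] for a standard normal X -- a rare-event expectation,
# so plain Monte Carlo wastes almost all of its samples.
f = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)  # N(0,1) density

# Plain Monte Carlo: very few samples land beyond 4.
x = rng.normal(size=1_000_000)
plain = np.where(x > 4, x, 0.0).mean()

# Importance sampling: propose from an exponential shifted to start at 4,
# then reweight each draw by f(y) / g(y).
y = 4 + rng.exponential(scale=1.0, size=1_000_000)
g = np.exp(-(y - 4))                 # proposal density at y
weighted = (y * f(y) / g).mean()

exact, _ = quad(lambda t: t * f(t), 4, np.inf)  # quadrature reference
print(plain, weighted, exact)        # weighted is far less noisy than plain
```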
Error analysis is vital for assessing reliability. For the Monte Carlo estimator, the variance is \frac{\mathrm{Var}(X)}{n}, leading to a standard error of \sqrt{\frac{\mathrm{Var}(X)}{n}}; confidence intervals follow from the central limit theorem, approximating \hat{\mu} \pm z_{\alpha/2} \sqrt{\frac{s^2}{n}} where s^2 estimates \mathrm{Var}(X) and z_{\alpha/2} is the normal quantile. Importance sampling reduces this variance but requires checking effective sample size via weight diagnostics to avoid instability. Quadrature errors are deterministic and bounded by rule-specific formulas, while truncation errors use remainder theorems for guarantees. These metrics guide the choice of sample size or grid resolution to achieve desired precision.[57][58]
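A minimal sketch of the Monte Carlo error analysis described above (hypothetical Python; the sample size and the 1.96 normal quantile are chosen for a roughly 95% interval):

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.exponential(scale=2.0, size=10_000)  # true mean is 2.0

mean = x.mean()
se = x.std(ddof=1) / np.sqrt(len(x))  # standard error of the sample mean
z = 1.96                               # ~95% normal quantile

print(f"estimate {mean:.3f} +/- {z * se:.3f}")  # interval usually covers 2.0
```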