Conditional expectation
In probability theory, the conditional expectation of a random variable X, given that another random variable Y takes the value y, is defined as the expected value of X with respect to its conditional probability distribution given Y = y.[1] This concept provides a way to compute averages under partial information about the random outcome, generalizing the unconditional expectation to scenarios where conditioning on events or other variables refines the prediction.[1] Formally, for discrete random variables, the conditional expectation is given by \mathbb{E}[X \mid Y = y] = \sum_x x \cdot P(X = x \mid Y = y), while for continuous cases, it takes the form \mathbb{E}[X \mid Y = y] = \int x \cdot f_{X \mid Y}(x \mid y) \, dx, where f_{X \mid Y} is the conditional density function.[1] In the more general measure-theoretic framework, the conditional expectation \mathbb{E}[X \mid \mathcal{G}] with respect to a sub-σ-algebra \mathcal{G} is the \mathcal{G}-measurable random variable, unique up to almost sure equality, that satisfies \int_A \mathbb{E}[X \mid \mathcal{G}] \, dP = \int_A X \, dP for all A \in \mathcal{G}; its existence relies on the Radon-Nikodym theorem.[2]

This rigorous definition was first established by Andrey Kolmogorov in his 1933 monograph Foundations of the Theory of Probability, which axiomatized probability using measure theory and introduced conditional expectation as a projection onto the space of \mathcal{G}-measurable functions.[3]

Key properties of conditional expectation include linearity: \mathbb{E}[aX + bZ \mid Y] = a \mathbb{E}[X \mid Y] + b \mathbb{E}[Z \mid Y] for constants a, b, and the tower property (or law of iterated expectations): \mathbb{E}[\mathbb{E}[X \mid Y]] = \mathbb{E}[X], which underscores its role in breaking down complex expectations hierarchically.[1] Additionally, if X is \mathcal{G}-measurable, then \mathbb{E}[X \mid \mathcal{G}] = X, reflecting that full information about X yields the variable itself.[2]

Conditional expectation is pivotal in numerous fields, enabling the computation of expected values under constraints or additional data, such as in Bayesian inference, where it represents posterior means.[4] It forms the basis for advanced stochastic processes, including martingales, in which the conditional expectation of the future value equals the current value, and is essential in filtering theory, optimal prediction, and financial mathematics for modeling asset prices under uncertainty.[4] In statistics, it underpins regression analysis, where the conditional expectation function describes the relationship between predictors and responses.[5]
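The discrete formula and the tower property can be checked numerically. The following Python sketch uses a small, hypothetical joint probability mass function (the specific probabilities are illustrative assumptions, not taken from any referenced source) to compute \mathbb{E}[X \mid Y = y] by the discrete formula and to confirm that \mathbb{E}[\mathbb{E}[X \mid Y]] = \mathbb{E}[X].

```python
# Minimal sketch: discrete conditional expectation and the tower property,
# assuming a small made-up joint pmf P(X = x, Y = y).
from collections import defaultdict

# Hypothetical joint distribution; values chosen only for illustration.
joint = {
    (1, 0): 0.10, (2, 0): 0.20, (3, 0): 0.10,
    (1, 1): 0.25, (2, 1): 0.05, (3, 1): 0.30,
}

# Marginal distribution of Y: P(Y = y) = sum_x P(X = x, Y = y)
p_y = defaultdict(float)
for (x, y), p in joint.items():
    p_y[y] += p

# E[X | Y = y] = sum_x x * P(X = x, Y = y) / P(Y = y)
def cond_exp_x_given_y(y):
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y[y]

# Unconditional expectation E[X]
e_x = sum(x * p for (x, _), p in joint.items())

# Tower property: E[E[X | Y]] = sum_y E[X | Y = y] * P(Y = y)
tower = sum(cond_exp_x_given_y(y) * p for y, p in p_y.items())

print(f"E[X]        = {e_x:.4f}")
for y in sorted(p_y):
    print(f"E[X | Y={y}] = {cond_exp_x_given_y(y):.4f}")
print(f"E[E[X | Y]] = {tower:.4f}")  # matches E[X] up to floating-point error
```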
Examples
Dice Rolling
Consider the experiment of rolling two fair six-sided dice, where the two dice are independent and each is uniformly distributed over the outcomes {1, 2, 3, 4, 5, 6}. Let X denote the sum of the numbers shown on the two dice. The unconditional expected value E[X] is 7, as it equals the sum of the expected values of the individual dice, each of which has E[\text{die}] = (1+2+3+4+5+6)/6 = 3.5. Now suppose we are given the information that the first die shows a 1. This restricts the possible outcomes to the six equally likely cases where the second die shows 1 through 6, yielding sums of 2, 3, 4, 5, 6, or 7. The conditional expectation E[X \mid \text{first die} = 1] is the average over this restricted sample space: E[X \mid \text{first die} = 1] = \frac{2 + 3 + 4 + 5 + 6 + 7}{6} = \frac{27}{6} = 4.5.

This value, 4.5, contrasts with the unconditional expectation of 7, illustrating how the additional information that the first die shows a 1 updates the prediction of the total sum downward, since the first contribution is now fixed at the low value 1 while the second die remains unbiased. In essence, the conditional expectation serves as the best (in the mean squared error sense) prediction of X given the partial information from the first die, averaging the possible outcomes consistent with that information.
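The calculation can be reproduced by direct enumeration of the 36 equally likely outcomes. The sketch below does so in Python and adds a Monte Carlo check; the simulation, its seed, and its sample size are illustrative choices rather than part of the example itself.

```python
# Sketch: exact enumeration and Monte Carlo check of the dice example.
import random
from itertools import product

# Exact enumeration over the 36 equally likely outcomes of two fair dice
outcomes = list(product(range(1, 7), repeat=2))
e_sum = sum(a + b for a, b in outcomes) / len(outcomes)

# Restrict to outcomes where the first die shows 1
conditioned = [(a, b) for a, b in outcomes if a == 1]
e_sum_given_first_is_1 = sum(a + b for a, b in conditioned) / len(conditioned)

print(f"E[X] = {e_sum}")                                    # 7.0
print(f"E[X | first die = 1] = {e_sum_given_first_is_1}")   # 4.5

# Monte Carlo estimate of the same conditional expectation
random.seed(0)
rolls = [(random.randint(1, 6), random.randint(1, 6)) for _ in range(100_000)]
subset = [a + b for a, b in rolls if a == 1]
print(f"Monte Carlo estimate: {sum(subset) / len(subset):.3f}")  # close to 4.5
```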
Rainfall Data
In a practical setting, conditional expectation can be illustrated using historical monthly rainfall data collected over multiple years in a temperate region. Let X represent the rainfall amount in a given month, measured in inches. The conditional expectation E[X \mid \text{summer months}] is computed as the average of the observed rainfall values specifically for the summer months (June, July, and August), providing an empirical estimate of the expected rainfall given that the month falls in summer. This approach leverages the law of total expectation in a data-driven manner, where the overall expected rainfall is a weighted average of seasonal conditionals.

For instance, consider a small sample of observed summer rainfall values from historical records: 2.5 inches, 3.0 inches, and 1.8 inches. The conditional expectation is then (2.5 + 3.0 + 1.8)/3 ≈ 2.43 inches, representing the best estimate of typical summer rainfall based on these observations.

To highlight how conditioning on the season reduces variability compared to the unconditional case, examine the following table of sample monthly rainfall data from one year, categorized by season; the computational sketch after the table reproduces these averages. The unconditional monthly average across all months is approximately 2.8 inches, with higher spread due to seasonal differences. In contrast, the summer observations cluster more tightly around their conditional average of approximately 2.43 inches, illustrating how conditional expectation narrows the focus and typically lowers uncertainty for predictions within the conditioned event.

| Month | Season | Rainfall (inches) |
|---|---|---|
| June | Summer | 2.5 |
| July | Summer | 3.0 |
| August | Summer | 1.8 |
| December | Winter | 4.2 |
| January | Winter | 3.5 |
| February | Winter | 2.0 |
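As a check, the seasonal averages can be recomputed directly from the table. The following Python sketch simply transcribes the six observations above, computes the unconditional mean and the two empirical conditional means, and verifies the law of total expectation for this sample (each season carries weight 3/6 = 0.5 here).

```python
# Sketch: empirical conditional expectations from the rainfall table above.

# Data transcribed from the table (month, season, rainfall in inches)
data = [
    ("June", "Summer", 2.5),
    ("July", "Summer", 3.0),
    ("August", "Summer", 1.8),
    ("December", "Winter", 4.2),
    ("January", "Winter", 3.5),
    ("February", "Winter", 2.0),
]

# Unconditional monthly average E[X] over the six observations
overall = sum(r for _, _, r in data) / len(data)

# Empirical conditional expectation E[X | season]: mean within each season
def conditional_mean(season):
    values = [r for _, s, r in data if s == season]
    return sum(values) / len(values)

print(f"E[X]          ~ {overall:.2f} inches")                        # ~ 2.83
print(f"E[X | Summer] ~ {conditional_mean('Summer'):.2f} inches")     # ~ 2.43
print(f"E[X | Winter] ~ {conditional_mean('Winter'):.2f} inches")     # ~ 3.23

# Law of total expectation: weighted average of seasonal means recovers E[X]
by_season = 0.5 * conditional_mean("Summer") + 0.5 * conditional_mean("Winter")
print(f"0.5*E[X|Summer] + 0.5*E[X|Winter] ~ {by_season:.2f} inches")  # ~ 2.83
```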