Conditional Probability and Conditional Expectation
The Discrete Case
For any two events \(E\) and \(F\), the conditional probability of \(E\) given \(F\) is defined, as long as \(P(F) > 0\), by
\[ P(E|F) = \frac{P(E F)}{P(F)}. \]
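As a small illustration of this definition, the sketch below enumerates the outcomes of two fair dice and computes \(P(E|F)\) as \(P(EF)/P(F)\). The dice example, the events, and the helper names `prob` and `cond_prob` are assumptions chosen for illustration; they are not part of the text above.

```python
from fractions import Fraction
from itertools import product

# Assumed example: the 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on a single outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond_prob(e, f):
    """P(E|F) = P(EF) / P(F); only defined when P(F) > 0."""
    p_f = prob(f)
    if p_f == 0:
        raise ValueError("P(F) must be positive")
    return prob(lambda o: e(o) and f(o)) / p_f

def sum_is_eight(o):      # event E: the two dice sum to 8
    return o[0] + o[1] == 8

def first_is_three(o):    # event F: the first die shows 3
    return o[0] == 3

print(cond_prob(sum_is_eight, first_is_three))  # 1/6
```

Exact fractions are used so that the result can be compared with a hand calculation without rounding error.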
Hence, if \(X\) and \(Y\) are discrete random variables, then it is natural to define the conditional probability mass function of \(X\) given that \(Y = y\), for all values of \(y\) such that \(P(Y = y) > 0\), by
\[ \begin{align*} p_{X|Y}(x|y) &= P(X = x | Y = y) \\ &= \frac{P(X = x, Y = y)}{P(Y = y)} \\ &= \frac{p(x,y)}{p_Y(y)}. \end{align*} \]
Similarly, the conditional probability distribution function of \(X\) given that \(Y=y\) is defined, for all \(y\) such that \(P(Y = y) > 0\), by
\[ \begin{align*} F_{X|Y}(x|y) &= P(X \leq x | Y = y) \\ &= \sum_{a \leq x} p_{X|Y}(a|y). \end{align*} \]
The conditional expectation of \(X\) given that \(Y = y\) is defined by \[ E[X | Y = y] = \sum_{x} x P(X = x | Y = y) = \sum_{x} x p_{X|Y}(x|y). \]
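The three discrete definitions above can be sketched in code as follows. The joint probability mass function `joint_pmf` is a hypothetical table invented for illustration, and the function names are likewise assumptions rather than anything fixed by the text.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y); keys are (x, y) pairs, values sum to 1.
joint_pmf = {
    (1, 1): Fraction(1, 8),
    (1, 2): Fraction(1, 4),
    (2, 1): Fraction(3, 8),
    (2, 2): Fraction(1, 4),
}

def marginal_y(joint, y):
    """p_Y(y): sum of p(x, y) over all x."""
    return sum(p for (_, b), p in joint.items() if b == y)

def conditional_pmf(joint, y):
    """Return {x: p_{X|Y}(x|y)}; requires P(Y = y) > 0."""
    p_y = marginal_y(joint, y)
    if p_y == 0:
        raise ValueError("P(Y = y) must be positive")
    return {a: p / p_y for (a, b), p in joint.items() if b == y}

def conditional_cdf(joint, x, y):
    """F_{X|Y}(x|y): sum of p_{X|Y}(a|y) over all a <= x."""
    return sum(p for a, p in conditional_pmf(joint, y).items() if a <= x)

def conditional_expectation(joint, y):
    """E[X | Y = y]: sum of x * p_{X|Y}(x|y) over all x."""
    return sum(a * p for a, p in conditional_pmf(joint, y).items())

print(conditional_pmf(joint_pmf, 1))          # pmf of X given Y = 1
print(conditional_expectation(joint_pmf, 1))  # 7/4
```

For this table, \(p_Y(1) = 1/2\), so \(p_{X|Y}(1|1) = 1/4\) and \(p_{X|Y}(2|1) = 3/4\), which gives \(E[X | Y = 1] = 7/4\).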
The Continuous Case
If \(X\) and \(Y\) have a joint probability density function \(f(x, y)\), then the conditional probability density function of \(X\), given that \(Y = y\), is defined for all values of \(y\) such that \(f_Y(y) > 0\), by
\[ \begin{align*} f_{X|Y}(x|y) &= \frac{f(x,y)}{f_Y(y)} \\ &= \frac{f(x,y)}{\int_{-\infty}^{\infty} f(x,y) dx}. \end{align*} \] The conditional distribution function of \(X\) given that \(Y = y\) is defined by \[ \begin{align*} F_{X|Y}(x|y) &= P(X \leq x | Y = y) \\ &= \int_{-\infty}^{x} f_{X|Y}(a|y) da \\ &= \int_{-\infty}^{x} \frac{f(a,y)}{f_Y(y)} da \\ &= \frac{1}{f_Y(y)} \int_{-\infty}^{x} f(a,y) da. \end{align*} \]
The conditional expectation of \(X\) given that \(Y = y\) is defined, for all values of \(y\) such that \(f_Y(y) > 0\), by \[ \begin{align*} E[X | Y = y] &= \int_{-\infty}^{\infty} x f_{X|Y}(x|y) dx \\ &= \int_{-\infty}^{\infty} x \frac{f(x,y)}{f_Y(y)} dx \\ &= \frac{1}{f_Y(y)} \int_{-\infty}^{\infty} x f(x,y) dx. \end{align*} \]
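As a numerical illustration of the last formula, the sketch below approximates \(E[X | Y = y]\) with a midpoint rule. The joint density \(f(x, y) = x + y\) on the unit square is an assumed example, and `midpoint_integral` is a hypothetical helper written for this sketch. For this density, \(f_Y(y) = 1/2 + y\) and \(E[X | Y = y] = (1/3 + y/2)/(1/2 + y)\), so the value at \(y = 1/2\) should be close to \(7/12\).

```python
def midpoint_integral(g, lo, hi, n=1000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

def joint_density(x, y):
    """Assumed joint density: f(x, y) = x + y on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0

def conditional_expectation(y, lo=0.0, hi=1.0):
    """E[X | Y = y] = (1 / f_Y(y)) * integral of x f(x, y) dx.

    The integration limits default to the support of X in this example.
    """
    f_y = midpoint_integral(lambda x: joint_density(x, y), lo, hi)
    if f_y <= 0:
        raise ValueError("f_Y(y) must be positive")
    numerator = midpoint_integral(lambda x: x * joint_density(x, y), lo, hi)
    return numerator / f_y

approx = conditional_expectation(0.5)
print(approx, 7 / 12)  # numerical estimate next to the exact value
```

The infinite integration limits are replaced by the support \([0, 1]\) because the assumed density vanishes outside it; a density with unbounded support would need a different quadrature scheme.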
Computing Expectations by Conditioning
Let us denote by \(E[X|Y]\) that function of the random variable \(Y\) whose value at \(Y = y\) is \(E[X|Y = y]\). Note that \(E[X|Y]\) is itself a random variable. An extremely important property of conditional expectation is that for all random variables \(X\) and \(Y\) \[ E[X] = E[E[X|Y]]. \]
If \(Y\) is a discrete random variable, then \[ E[X] = \sum_{y} E[X|Y = y] P(Y = y). \]
If \(Y\) is continuous with density \(f_Y(y)\), then \[ E[X] = \int_{-\infty}^{\infty} E[X|Y = y] f_Y(y) dy. \]
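The discrete form of this identity can be checked on a small example. The sketch below computes \(E[X]\) once directly from a joint probability mass function and once by conditioning on \(Y\). The table `joint_pmf` and the function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y); keys are (x, y) pairs, values sum to 1.
joint_pmf = {
    (1, 1): Fraction(1, 8),
    (1, 2): Fraction(1, 4),
    (2, 1): Fraction(3, 8),
    (2, 2): Fraction(1, 4),
}

def expectation_direct(joint):
    """E[X] computed directly: sum of x * p(x, y) over all (x, y)."""
    return sum(x * p for (x, _), p in joint.items())

def expectation_by_conditioning(joint):
    """E[X] computed as the sum over y of E[X | Y = y] * P(Y = y)."""
    total = Fraction(0)
    for y in {b for (_, b) in joint}:
        p_y = sum(p for (_, b), p in joint.items() if b == y)
        if p_y == 0:
            continue  # E[X | Y = y] is undefined here and contributes nothing
        e_given_y = sum(x * p for (x, b), p in joint.items() if b == y) / p_y
        total += e_given_y * p_y
    return total

print(expectation_direct(joint_pmf))           # 13/8
print(expectation_by_conditioning(joint_pmf))  # 13/8
```

Here \(E[X | Y = 1] = 7/4\) and \(E[X | Y = 2] = 3/2\), each weighted by \(P(Y = y) = 1/2\), which gives \(7/8 + 6/8 = 13/8\), in agreement with the direct computation.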