By Bot
Jun 09'24

Exercise

For correlated random variables [math]X[/math] and [math]Y[/math] it is natural to ask for the expected value of [math]X[/math] given [math]Y[/math]. For example, Galton calculated the expected value of the height of a son given the height of the father. He used this to show that tall men can be expected to have sons who are, on average, less tall than their fathers. Similarly, students who do very well on one exam can be expected to do less well on the next exam. This is called regression on the mean. To define this conditional expected value, we first define the conditional density of [math]X[/math] given [math]Y = y[/math] by

[[math]] f_{X|Y}(x|y) = \frac {f_{X,Y}(x,y)}{f_Y(y)}\ , [[/math]]

where [math]f_{X,Y}(x,y)[/math] is the joint density of [math]X[/math] and [math]Y[/math], and [math]f_Y[/math] is the density of [math]Y[/math], assumed positive at [math]y[/math]. Then, if [math]X[/math] takes values in [math][a,b][/math], the conditional expected value of [math]X[/math] given [math]Y = y[/math] is

[[math]] E(X|Y = y) = \int_a^b x f_{X|Y}(x|y)\, dx\ . [[/math]]
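The two definitions above can be checked numerically. The following is a minimal sketch, not part of the exercise: the joint density [math]f_{X,Y}(x,y) = x + y[/math] on the unit square and the helper names `integrate`, `conditional_expectation`, and `example_joint` are illustrative assumptions. It approximates [math]f_Y(y)[/math] and [math]E(X|Y = y)[/math] with a midpoint rule.

```python
def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def conditional_expectation(joint, y, a, b):
    """Approximate E(X | Y = y) when X ranges over [a, b].

    `joint(x, y)` is the joint density f_{X,Y}(x, y).
    """
    # Marginal density f_Y(y): integrate the joint density over x.
    f_y = integrate(lambda x: joint(x, y), a, b)
    if f_y <= 0:
        raise ValueError("f_Y(y) must be positive for the conditional density to exist")
    # E(X | Y = y) = integral of x * f_{X,Y}(x, y) / f_Y(y) over [a, b].
    return integrate(lambda x: x * joint(x, y), a, b) / f_y


def example_joint(x, y):
    """Hypothetical joint density f(x, y) = x + y on the unit square (assumption)."""
    return x + y


cond_mean = conditional_expectation(example_joint, 0.5, 0.0, 1.0)
print(round(cond_mean, 6))
```

For this assumed density the integral can also be done by hand: [math]f_Y(y) = \tfrac{1}{2} + y[/math], so at [math]y = 0.5[/math] the conditional expected value is [math](\tfrac{1}{3} + \tfrac{1}{4})/1 = \tfrac{7}{12}[/math], which the numerical value should match closely.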

For the normal density in Exercise, show that the conditional density [math]f_{X|Y}(x|y)[/math] is normal with mean [math]\rho y[/math] and variance [math]1 - \rho^2[/math]. From this we see that if [math]X[/math] and [math]Y[/math] are positively correlated [math](0 \lt \rho \lt 1)[/math], and if [math]y \gt E(Y)[/math], then the expected value of [math]X[/math] given [math]Y = y[/math] is [math]\rho y[/math], which is less than [math]y[/math] because [math]E(Y) = 0[/math] here and so [math]y \gt 0[/math] (i.e., we have regression on the mean).
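The claimed mean and variance can be sanity-checked numerically before attempting the proof. This sketch is not a solution to the exercise. It assumes the density in question is the standard bivariate normal (both marginals with mean 0 and variance 1, correlation [math]\rho[/math]); the function names, the values [math]\rho = 0.6[/math] and [math]y = 1.5[/math], and the truncation of the integral to [math][-12, 12][/math] are all illustrative choices.

```python
import math


def bivariate_normal(x, y, rho):
    """Assumed joint density: standard bivariate normal with correlation rho."""
    s = 1.0 - rho * rho
    exponent = -(x * x - 2.0 * rho * x * y + y * y) / (2.0 * s)
    return math.exp(exponent) / (2.0 * math.pi * math.sqrt(s))


def integrate(g, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def conditional_moments(rho, y, lo=-12.0, hi=12.0):
    """Approximate the mean and variance of X given Y = y.

    The infinite range of X is truncated to [lo, hi]; the tails beyond
    that are negligible for the values used below.
    """
    f_y = integrate(lambda x: bivariate_normal(x, y, rho), lo, hi)
    mean = integrate(lambda x: x * bivariate_normal(x, y, rho), lo, hi) / f_y
    second = integrate(lambda x: x * x * bivariate_normal(x, y, rho), lo, hi) / f_y
    return mean, second - mean * mean


rho, y = 0.6, 1.5
mean, var = conditional_moments(rho, y)
print(round(mean, 6), round(var, 6))
```

With these values the printed mean should be close to [math]\rho y = 0.9[/math] and the variance close to [math]1 - \rho^2 = 0.64[/math]; the conditional mean lies below [math]y = 1.5[/math], as the regression statement predicts.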