Let [math]X[/math] be a continuous random variable with mean [math]\mu(X)[/math] and variance [math]\sigma^2(X)[/math], and let [math]X^* = (X - \mu)/\sigma[/math] be its standardized version. Verify directly that [math]\mu(X^*) = 0[/math] and [math]\sigma^2(X^*) = 1[/math].
Let [math]\{X_k\}[/math], [math]1 \leq k \leq n[/math], be a sequence of independent random variables, all with mean 0 and variance 1, and let [math]S_n[/math], [math]S_n^*[/math], and [math]A_n[/math] be their sum, standardized sum, and average, respectively. Verify directly that [math]S_n^* = S_n/\sqrt{n} = \sqrt{n} A_n[/math].
Let [math]\{X_k\}[/math], [math]1 \leq k \leq n[/math], be a sequence of independent random variables, all with mean [math]\mu[/math] and variance [math]\sigma^2[/math], and let [math]Y_k = X_k^*[/math] be their standardized versions. Let [math]S_n[/math] and [math]T_n[/math] be the sums of the [math]X_k[/math] and the [math]Y_k[/math], respectively, and let [math]S_n^*[/math] and [math]T_n^*[/math] be their standardized versions. Show that [math]S_n^* = T_n^* = T_n/\sqrt{n}[/math].
Suppose we choose independently 25 numbers at random (uniform density) from the interval [math][0,20][/math]. Write the normal densities that approximate the densities of their sum [math]S_{25}[/math], their standardized sum [math]S_{25}^*[/math], and their average [math]A_{25}[/math].
Write a program to choose independently 25 numbers at random from [math][0,20][/math], compute their sum [math]S_{25}[/math], and repeat this experiment 1000 times. Make a bar graph for the density of [math]S_{25}[/math] and compare it with the normal approximation from the preceding exercise. How good is the fit? Now do the same for the standardized sum [math]S_{25}^*[/math] and the average [math]A_{25}[/math].
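One possible starting point for such a program is sketched below in Python, using only the standard library. The function names (`simulate_sums`, `histogram_density`, `normal_density`), the choice of 20 bins, and the fixed seed are illustrative assumptions, not part of the exercise. The sketch prints the bar heights next to the normal density instead of drawing the graph.

```python
import math
import random

def simulate_sums(trials=1000, n=25, low=0.0, high=20.0, seed=0):
    """Return `trials` simulated values of the sum of n uniform draws on [low, high]."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    return [sum(rng.uniform(low, high) for _ in range(n)) for _ in range(trials)]

def normal_density(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def histogram_density(values, bins=20):
    """Return (bin_centers, heights), with heights scaled so the bars have total area 1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        i = min(int((v - lo) / width), bins - 1)  # the maximum falls in the last bin
        counts[i] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    heights = [c / (len(values) * width) for c in counts]
    return centers, heights

n, low, high = 25, 0.0, 20.0
mu = n * (low + high) / 2                      # mean of the sum of n uniforms
sigma = math.sqrt(n * (high - low) ** 2 / 12)  # standard deviation of that sum

sums = simulate_sums()
centers, heights = histogram_density(sums)
for c, h in zip(centers, heights):
    print(f"{c:8.2f}  empirical={h:.5f}  normal={normal_density(c, mu, sigma):.5f}")
```

The same functions can be reused for [math]S_{25}^*[/math] and [math]A_{25}[/math] by transforming each simulated sum before building the histogram.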
In general, the Central Limit Theorem gives a better estimate than Chebyshev's inequality for the average of a sum. To see this, let [math]A_{25}[/math] be the average calculated in the preceding exercise, and let [math]N[/math] be the normal approximation for [math]A_{25}[/math]. Modify your program from that exercise to provide a table of the function [math]F(x) = P(|A_{25} - 10| \geq x)[/math], computed as the fraction of the 1000 trials for which [math]|A_{25} - 10| \geq x[/math]. Do the same for the function [math]f(x) = P(|N - 10| \geq x)[/math]. (You can use the normal table or the procedure NormalArea for this.) Now plot on the same axes the graphs of [math]F(x)[/math], [math]f(x)[/math], and the Chebyshev function [math]g(x) = 4/(3x^2)[/math]. How do [math]f(x)[/math] and [math]g(x)[/math] compare as estimates for [math]F(x)[/math]?
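The table asked for here might be produced along the following lines. This is a Python sketch, not the book's NormalArea procedure: it uses `math.erfc` from the standard library for the normal tail, and the grid of [math]x[/math] values, the seed, and the function names are assumptions made for illustration.

```python
import math
import random

def normal_two_sided_tail(x, sigma):
    """P(|N - mu| >= x) for a normal N with standard deviation sigma."""
    return math.erfc(x / (sigma * math.sqrt(2)))

def chebyshev_bound(x, variance):
    """Chebyshev's upper bound variance / x^2 (not capped at 1, to match g(x))."""
    return variance / x ** 2

trials, n = 1000, 25
rng = random.Random(1)  # fixed seed (an assumption) for repeatability
averages = [sum(rng.uniform(0, 20) for _ in range(n)) / n for _ in range(trials)]

variance = (20 ** 2 / 12) / n  # variance of the average of 25 uniforms on [0, 20]
sigma = math.sqrt(variance)

def empirical_tail(x):
    """Fraction of the simulated averages at distance at least x from 10."""
    return sum(abs(a - 10) >= x for a in averages) / trials

print("   x      F(x)      f(x)      g(x)")
for x in [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]:
    print(f"{x:4.1f}  {empirical_tail(x):8.4f}  "
          f"{normal_two_sided_tail(x, sigma):8.4f}  {chebyshev_bound(x, variance):8.4f}")
```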
The Central Limit Theorem says the sums of independent random variables tend to look normal, no matter what crazy distribution the individual variables have. Let us test this by a computer simulation. Choose independently 25 numbers from the interval [math][0,1][/math] with the probability density [math]f(x)[/math] given below, and compute their sum [math]S_{25}[/math]. Repeat this experiment 1000 times, and make up a bar graph of the results. Now plot on the same graph the density [math]\phi(x) = \mbox{normal}(x,\mu(S_{25}),\sigma(S_{25}))[/math]. How well does the normal density fit your bar graph in each case?
- [math]f(x) = 1[/math].
- [math]f(x) = 2x[/math].
- [math]f(x) = 3x^2[/math].
- [math]f(x) = 4|x - 1/2|[/math].
- [math]f(x) = 2 - 4|x - 1/2|[/math].
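To simulate from these densities, one option is the inverse-transform method: if [math]U[/math] is uniform on [math][0,1][/math] and [math]F[/math] is the cumulative distribution function of the density, then [math]F^{-1}(U)[/math] has that density. The Python sketch below assumes this method (the exercise does not prescribe one). The inverse functions and the per-draw means and variances in `MOMENTS` were worked out by hand from the five densities above, and the dictionary names are illustrative.

```python
import math
import random

# Inverse cumulative distribution functions on [0, 1], keyed by density.
INVERSE_CDFS = {
    "1": lambda u: u,
    "2x": lambda u: math.sqrt(u),
    "3x^2": lambda u: u ** (1.0 / 3.0),
    "4|x-1/2|": lambda u: (1 - math.sqrt(1 - 2 * u)) / 2 if u <= 0.5
                          else (1 + math.sqrt(2 * u - 1)) / 2,
    "2-4|x-1/2|": lambda u: math.sqrt(u / 2) if u <= 0.5
                            else 1 - math.sqrt((1 - u) / 2),
}

# (mean, variance) of a single draw, computed by hand from each density.
MOMENTS = {
    "1": (1 / 2, 1 / 12),
    "2x": (2 / 3, 1 / 18),
    "3x^2": (3 / 4, 3 / 80),
    "4|x-1/2|": (1 / 2, 1 / 8),
    "2-4|x-1/2|": (1 / 2, 1 / 24),
}

def sample_sum(name, n, rng):
    """Sum of n independent draws from the density labelled `name`."""
    inverse = INVERSE_CDFS[name]
    return sum(inverse(rng.random()) for _ in range(n))

rng = random.Random(2)  # fixed seed (an assumption) for repeatability
n, trials = 25, 1000
for name, (mean, var) in MOMENTS.items():
    sums = [sample_sum(name, n, rng) for _ in range(trials)]
    observed = sum(sums) / trials
    print(f"f(x) = {name:12s} mu(S_n) = {n * mean:7.3f}  "
          f"sigma(S_n) = {math.sqrt(n * var):6.3f}  observed mean = {observed:7.3f}")
```

The lists of simulated sums can then be binned into a bar graph and compared with the normal density having the printed mean and standard deviation.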
How large must [math]n[/math] be before [math]S_n = X_1 + X_2 +\cdots+ X_n[/math] is approximately normal? This number is often surprisingly small. Let us explore this question with a computer simulation. Choose [math]n[/math] numbers from [math][0,1][/math] with probability density [math]f(x)[/math], where [math]n = 3[/math], 6, 12, 20, and [math]f(x)[/math] is each of the densities in the preceding exercise. Compute their sum [math]S_n[/math], repeat this experiment 1000 times, and make a bar graph with 20 bars of the results. How large must [math]n[/math] be before you get a good fit?
A surveyor is measuring the height of a cliff known to be about 1000 feet. He assumes his instrument is properly calibrated and that his measurement errors are independent, with mean [math]\mu = 0[/math] and variance [math]\sigma^2 = 10[/math]. He plans to take [math]n[/math] measurements and form the average. Estimate, using (a) Chebyshev's inequality and (b) the normal approximation, how large [math]n[/math] should be if he wants to be 95 percent sure that his average falls within 1 foot of the true value. Now estimate, using (a) and (b), what value should [math]\sigma^2[/math] have if he wants to make only 10 measurements with the same confidence?
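The arithmetic behind both estimates can be checked with a short Python sketch. It assumes the usual forms of the two bounds: Chebyshev gives [math]P(|A_n - \mu| \geq \epsilon) \leq \sigma^2/(n\epsilon^2)[/math], and the normal approximation requires [math]z\,\sigma/\sqrt{n} \leq \epsilon[/math], where [math]z[/math] is the two-sided 95 percent point of the standard normal. The function names are hypothetical, and `statistics.NormalDist` requires Python 3.8 or later.

```python
import math
from statistics import NormalDist

def n_chebyshev(variance, eps, alpha):
    """Smallest n with variance / (n * eps^2) <= alpha."""
    return math.ceil(round(variance / (alpha * eps ** 2), 9))  # round guards float noise

def n_normal(variance, eps, alpha):
    """Smallest n with z * sqrt(variance / n) <= eps, z the two-sided normal point."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(z ** 2 * variance / eps ** 2)

def max_variance_chebyshev(n, eps, alpha):
    """Largest variance for which Chebyshev still guarantees the tolerance."""
    return alpha * n * eps ** 2

def max_variance_normal(n, eps, alpha):
    """Largest variance for which the normal approximation meets the tolerance."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return n * eps ** 2 / z ** 2

variance, eps, alpha = 10.0, 1.0, 0.05
print("n (Chebyshev):", n_chebyshev(variance, eps, alpha))
print("n (normal):   ", n_normal(variance, eps, alpha))
print("variance for n = 10 (Chebyshev):", max_variance_chebyshev(10, eps, alpha))
print("variance for n = 10 (normal):   ", round(max_variance_normal(10, eps, alpha), 3))
```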