Central Limit Theorem & Normal Approximations
Engineering Statistics, Section 5.4
Josh Engwer (TTU)
23 March 2016
PART I: CENTRAL LIMIT THEOREM
Expected Value & Variance of a Sum of iid rv's

Proposition: Let X₁, …, Xₙ be a random sample from a population and let c₁, …, cₙ be constants. Then:

  E[c₁X₁ + ⋯ + cₙXₙ] = c₁E[X₁] + ⋯ + cₙE[Xₙ]
  V[c₁X₁ + ⋯ + cₙXₙ] = c₁²V[X₁] + ⋯ + cₙ²V[Xₙ]

PROOF, CASE I (n = 2): X₁, X₂ iid (discrete population)

  E[c₁X₁ + c₂X₂] := Σ_{(j,k) ∈ Supp(X₁,X₂)} (c₁j + c₂k) p_{(X₁,X₂)}(j,k)
                  = Σ_{j ∈ Supp(X₁)} Σ_{k ∈ Supp(X₂)} (c₁j + c₂k) p_{X₁}(j) p_{X₂}(k)   (by iid)
                  = c₁ Σ_{j ∈ Supp(X₁)} j p_{X₁}(j) + c₂ Σ_{k ∈ Supp(X₂)} k p_{X₂}(k)
                  := c₁E[X₁] + c₂E[X₂]
Expected Value & Variance of a Sum of iid rv's (continued)

Proposition (restated): Let X₁, …, Xₙ be a random sample from a population and let c₁, …, cₙ be constants. Then:

  E[c₁X₁ + ⋯ + cₙXₙ] = c₁E[X₁] + ⋯ + cₙE[Xₙ]
  V[c₁X₁ + ⋯ + cₙXₙ] = c₁²V[X₁] + ⋯ + cₙ²V[Xₙ]

PROOF, CASE II (n = 2): X₁, X₂ iid (continuous population)

  E[c₁X₁ + c₂X₂] := ∫∫_{Supp(X₁,X₂)} (c₁x₁ + c₂x₂) f_{(X₁,X₂)}(x₁,x₂) dx₁ dx₂
                  = ∫_{Supp(X₁)} ∫_{Supp(X₂)} (c₁x₁ + c₂x₂) f_{X₁}(x₁) f_{X₂}(x₂) dx₁ dx₂   (requires independence)
                  = c₁ ∫_{Supp(X₁)} x₁ f_{X₁}(x₁) dx₁ + c₂ ∫_{Supp(X₂)} x₂ f_{X₂}(x₂) dx₂
                  := c₁E[X₁] + c₂E[X₂]
Properties of Sample Mean

Proposition: Let X₁, …, Xₙ be a random sample from a distribution with mean µ and variance σ². Then:

  µ_X̄ = E[X̄] = µ
  σ²_X̄ = V[X̄] = σ²/n
  σ_X̄ = σ/√n

PROOF:

  E[X̄] = E[(1/n)(X₁ + ⋯ + Xₙ)] = (1/n)[E[X₁] + ⋯ + E[Xₙ]] = (1/n)[µ + ⋯ + µ] = (1/n)[nµ] = µ
  V[X̄] = V[(1/n)(X₁ + ⋯ + Xₙ)] = (1/n²)[V[X₁] + ⋯ + V[Xₙ]] = (1/n²)[σ² + ⋯ + σ²] = (1/n²)[nσ²] = σ²/n
  σ_X̄ = √(V[X̄]) = √(σ²/n) = σ/√n
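The σ²/n rule can be checked empirically. The following simulation is my own sketch, not part of the slides: it draws many samples of size n = 25 from a Uniform(0,1) population (whose variance is σ² = 1/12) and compares the variance of the resulting sample means against the predicted σ²/n = 1/300. The population choice and all parameter values here are illustrative.

```python
import random
import statistics

# Empirical sketch (not from the slides): for iid Uniform(0,1) draws the
# population variance is sigma^2 = 1/12, so the proposition predicts
# V[Xbar] = sigma^2/n = 1/(12n).
random.seed(42)
n = 25          # sample size
reps = 20000    # number of independent samples

sample_means = [
    statistics.fmean(random.random() for _ in range(n))
    for _ in range(reps)
]

predicted = (1 / 12) / n                       # = 1/300
observed = statistics.variance(sample_means)   # sample variance of the means
print(f"E[Xbar]: predicted 0.5,     observed {statistics.fmean(sample_means):.4f}")
print(f"V[Xbar]: predicted {predicted:.5f}, observed {observed:.5f}")
```

With this many replications the observed variance of the sample means lands very close to 1/300, while the variance of a single draw stays near 1/12, which is the whole content of the proposition.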
Properties of Sample Total

Proposition: Let X₁, …, Xₙ be a random sample from a distribution with mean µ and variance σ². Then:

  µ_{X₁+⋯+Xₙ} = E[X₁ + ⋯ + Xₙ] = nµ
  σ²_{X₁+⋯+Xₙ} = V[X₁ + ⋯ + Xₙ] = nσ²
  σ_{X₁+⋯+Xₙ} = σ√n

PROOF:

  E[X₁ + ⋯ + Xₙ] = E[nX̄] = nE[X̄] = nµ
  V[X₁ + ⋯ + Xₙ] = V[nX̄] = n²V[X̄] = n²[σ²/n] = nσ²
  σ_{X₁+⋯+Xₙ} = √(V[X₁ + ⋯ + Xₙ]) = √(nσ²) = σ√n
Sample Mean & Sample Total of a Random Sample from a Normal Population

The preceding properties of sample means & sample totals can be applied to particular population distributions:

Proposition: Let random sample X₁, …, Xₙ iid∼ Normal(µ, σ²). Then, for any sample size n > 1:

  X̄ ∼ Normal(µ, σ²/n)
  X₁ + ⋯ + Xₙ ∼ Normal(nµ, nσ²)

PROOF (of the first claim):

  X ∼ Normal(µ, σ²) ⟹ E[X] = µ, V[X] = σ²
  E[X̄] = E[(1/n)(X₁ + ⋯ + Xₙ)] = (1/n)[E[X₁] + ⋯ + E[Xₙ]] = (1/n)[µ + ⋯ + µ] = (1/n)[nµ] = µ
  V[X̄] = V[(1/n)(X₁ + ⋯ + Xₙ)] = (1/n²)[V[X₁] + ⋯ + V[Xₙ]] = (1/n²)[σ² + ⋯ + σ²] = (1/n²)[nσ²] = σ²/n
  Since a linear combination of independent normal rv's is itself normal,
  ∴ X̄ ∼ Normal(µ, σ²/n)
Sample Mean & Sample Total of a Random Sample from a Normal Population (continued)

Proposition (restated): Let random sample X₁, …, Xₙ iid∼ Normal(µ, σ²). Then, for any sample size n > 1:

  X̄ ∼ Normal(µ, σ²/n)
  X₁ + ⋯ + Xₙ ∼ Normal(nµ, nσ²)

PROOF (of the second claim):

  X ∼ Normal(µ, σ²) ⟹ E[X] = µ, V[X] = σ²
  E[X₁ + ⋯ + Xₙ] = E[nX̄] = nE[X̄] = nµ
  V[X₁ + ⋯ + Xₙ] = V[nX̄] = n²V[X̄] = n²[σ²/n] = nσ²
  Since a sum of independent normal rv's is itself normal,
  ∴ X₁ + ⋯ + Xₙ ∼ Normal(nµ, nσ²)
Central Limit Theorem (CLT)

The Central Limit Theorem is considered a fundamental theorem of Statistics:

Theorem (Central Limit Theorem): Let X₁, …, Xₙ be a random sample from a non-normal distribution with mean µ and variance σ². Then:

  The sample mean X̄ is approximately normal:  X̄ ∼ Normal(µ, σ²/n)  (approx.)
  The sample total X₁ + ⋯ + Xₙ is approximately normal:  X₁ + ⋯ + Xₙ ∼ Normal(nµ, nσ²)  (approx.)

Requirement for this normal approximation to be valid: n > 30
The larger the sample size n, the better the approximation.

PROOF: Beyond the scope of this course.
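The CLT can be watched in action with a short simulation (my own sketch, not part of the slides): draw many sample means of a strongly skewed Exponential(1) population (µ = 1, σ² = 1) with n > 30, and compare them against what Normal(µ, σ²/n) predicts. The distribution choice and parameter values are illustrative.

```python
import random
import statistics

# Illustrative sketch (not from the slides): sample means of a skewed
# Exponential(1) population (mu = 1, sigma^2 = 1) should be approximately
# Normal(mu, sigma^2/n) once n > 30, per the CLT.
random.seed(7)
n = 50        # sample size (> 30, as the slides require)
reps = 10000  # number of sample means to draw

means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

print(f"mean of sample means  = {statistics.fmean(means):.4f}  (CLT predicts 1.0)")
print(f"stdev of sample means = {statistics.stdev(means):.4f}  (CLT predicts {1/n**0.5:.4f})")

# Fraction of sample means within one predicted standard error of mu;
# a normal distribution puts about 68.3% of its mass there.
within = sum(abs(m - 1.0) <= 1/n**0.5 for m in means) / reps
print(f"P(|Xbar - mu| <= sigma/sqrt(n)) ~ {within:.3f}")
```

Even though a single Exponential(1) draw is far from normal, the simulated sample means match the Normal(1, 1/50) predictions closely, which is exactly what the theorem claims.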
PART II: NORMAL APPROXIMATION TO THE BINOMIAL
The Sum of Several iid Bernoulli(p) rv's

Proposition: Let random sample X₁, …, Xₙ iid∼ Bernoulli(p). Then, for any sample size n > 1:

  X₁ + ⋯ + Xₙ ∼ Binomial(n, p)

PROOF (sketch):

  X ∼ Bernoulli(p) ⟹ E[X] = p, V[X] = pq   (where q = 1 − p)
  E[X₁ + ⋯ + Xₙ] = E[X₁] + ⋯ + E[Xₙ] = nE[X₁] = np   (by iid)
  V[X₁ + ⋯ + Xₙ] = V[X₁] + ⋯ + V[Xₙ] = nV[X₁] = npq   (by iid)
  These match the mean and variance of Binomial(n, p); indeed, the sum counts the successes in n independent trials, which is exactly a Binomial(n, p) rv.
  ∴ X₁ + ⋯ + Xₙ ∼ Binomial(n, p)
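The distributional claim can also be verified numerically. This sketch (my own, not from the slides; the helper name `bernoulli_sum_pmf` is hypothetical) builds the pmf of X₁ + ⋯ + Xₙ by repeated convolution of Bernoulli(p) pmfs and checks it against the Binomial(n, p) pmf term by term.

```python
from math import comb

def bernoulli_sum_pmf(n, p):
    """pmf of X1 + ... + Xn for iid Bernoulli(p), built by repeated convolution."""
    pmf = [1.0]                        # pmf of the empty sum (always 0)
    for _ in range(n):
        new = [0.0] * (len(pmf) + 1)
        for k, prob in enumerate(pmf):
            new[k]     += prob * (1 - p)   # next trial is a failure
            new[k + 1] += prob * p         # next trial is a success
        pmf = new
    return pmf

n, p = 10, 0.3
conv = bernoulli_sum_pmf(n, p)
binom = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
assert all(abs(a - b) < 1e-12 for a, b in zip(conv, binom))
print("convolution of 10 Bernoulli(0.3) pmfs matches the Binomial(10, 0.3) pmf")
```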
Normal Approximation to the Binomial

Corollary: Let X ∼ Binomial(n, p). Then:

  X ∼ Normal(µ = np, σ² = npq)  (approx.)   (where q = 1 − p)

Requirement for this normal approximation to be valid: min{np, nq} ≥ 10
i.e. it is required that both np ≥ 10 and nq ≥ 10.

NOTE: Remember that normal distributions are symmetric.
If min{np, nq} < 10, then the binomial distribution is too skewed.

PROOF: Let X₁, …, Xₙ iid∼ Bernoulli(p) ⟹ µ_{X₁} = E[X₁] = p and σ²_{X₁} = V[X₁] = pq.
Then X := X₁ + ⋯ + Xₙ ∼ Binomial(n, p). Moreover, the CLT asserts that

  X₁ + ⋯ + Xₙ ∼ Normal(µ = nµ_{X₁}, σ² = nσ²_{X₁})  (approx.)
  ∴ X ∼ Binomial(n, p) ⟹ X ∼ Normal(µ = np, σ² = npq)  (approx.)
Normal Approximation to Binomial Probability

Corollary: Let X ∼ Binomial(n, p). Then:

  P(X ≤ x) = Bi(x; n, p) ≈ Φ((x + 0.5 − np) / √(npq))   (where q = 1 − p)

Requirement for this normal approximation to be valid: min{np, nq} ≥ 10

NOTE: The continuity correction term +0.5 improves the approximation.

PROOF: Assume the requirement min{np, nq} ≥ 10 is satisfied. Then:

  X ∼ Binomial(n, p) ⟹ X ∼ Normal(µ = np, σ² = npq)  (approx.)
  ⟹ Bi(x; n, p) = P(X ≤ x) ≈ P(Z ≤ (x − µ)/σ) = Φ((x − µ)/σ) = Φ((x − np)/√(npq))

Adding the continuity correction +0.5 in the numerator of Φ improves the approximation.
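A quick numeric comparison (my own sketch, not from the slides; the parameter values n = 100, p = 0.5, x = 55 are illustrative) shows the continuity-corrected approximation against the exact binomial cdf, with and without the +0.5 term:

```python
from math import comb, sqrt
from statistics import NormalDist

# Numeric sketch (not from the slides): exact Binomial cdf vs. the
# continuity-corrected normal approximation Phi((x + 0.5 - np)/sqrt(npq)).
n, p, x = 100, 0.5, 55          # np = nq = 50 >= 10, so the approximation applies
q = 1 - p

exact  = sum(comb(n, k) * p**k * q**(n - k) for k in range(x + 1))
approx = NormalDist().cdf((x + 0.5 - n * p) / sqrt(n * p * q))
naive  = NormalDist().cdf((x - n * p) / sqrt(n * p * q))  # no correction

print(f"exact Bi(55; 100, 0.5)     = {exact:.4f}")
print(f"with continuity correction = {approx:.4f}")
print(f"without correction         = {naive:.4f}")
```

For these values the corrected approximation lands within a few thousandths of the exact cdf, while the uncorrected one is noticeably worse, illustrating why the +0.5 term is kept.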
Binomial Density Plots (pmf's) for Sample Sizes n = 2, 5, 10, 20, 50

[Figures not reproduced: binomial pmf plots for sample sizes n = 2, 5, 10, 20, and 50.]
PART III: NORMAL APPROXIMATION TO THE POISSON
The Sum of Several iid Poisson(λ) rv's

The preceding properties of sample totals can be applied to particular population distributions:

Proposition: Let random sample X₁, …, Xₙ iid∼ Poisson(λ). Then, for any sample size n > 1:

  X₁ + ⋯ + Xₙ ∼ Poisson(nλ)

PROOF (sketch):

  X ∼ Poisson(λ) ⟹ E[X] = λ, V[X] = λ
  E[X₁ + ⋯ + Xₙ] = E[X₁] + ⋯ + E[Xₙ] = nE[X₁] = nλ   (by iid)
  V[X₁ + ⋯ + Xₙ] = V[X₁] + ⋯ + V[Xₙ] = nV[X₁] = nλ   (by iid)
  These match the mean and variance of Poisson(nλ); indeed, a sum of independent Poisson rv's is itself Poisson, with rate equal to the sum of the rates.
  ∴ X₁ + ⋯ + Xₙ ∼ Poisson(nλ)
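As with the Bernoulli case, the distributional claim can be checked numerically. This sketch (my own, not from the slides; the helper name `pois_pmf` is hypothetical) convolves two Poisson(λ) pmfs and compares the result to the Poisson(2λ) pmf term by term:

```python
from math import exp, factorial

def pois_pmf(k, lam):
    """Poisson(lam) pmf at integer k >= 0."""
    return exp(-lam) * lam**k / factorial(k)

# Sanity-check sketch (not from the slides): the convolution of two
# Poisson(lam) pmfs should reproduce the Poisson(2*lam) pmf.
lam = 3.0
for m in range(15):
    conv = sum(pois_pmf(j, lam) * pois_pmf(m - j, lam) for j in range(m + 1))
    assert abs(conv - pois_pmf(m, 2 * lam)) < 1e-12
print("convolution of two Poisson(3) pmfs matches the Poisson(6) pmf (first 15 terms)")
```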
Normal Approximation to the Poisson

Corollary: Let X ∼ Poisson(λ). Then:

  X ∼ Normal(µ = λ, σ² = λ)  (approx.)

Requirement for this normal approximation to be valid: λ > 20

NOTE: Remember that normal distributions are symmetric.
If λ ≤ 20, then the Poisson distribution is too skewed.

PROOF: Let X₁, …, Xₙ iid∼ Poisson(λ/n) ⟹ µ_{X₁} = E[X₁] = λ/n and σ²_{X₁} = V[X₁] = λ/n.
Then X := X₁ + ⋯ + Xₙ ∼ Poisson(λ). Moreover, the CLT asserts that

  X₁ + ⋯ + Xₙ ∼ Normal(µ = nµ_{X₁}, σ² = nσ²_{X₁})  (approx.)
  ∴ X ∼ Poisson(λ) ⟹ X ∼ Normal(µ = λ, σ² = λ)  (approx.)
Normal Approximation to Poisson Probability

Corollary: Let X ∼ Poisson(λ). Then:

  P(X ≤ x) = Pois(x; λ) ≈ Φ((x + 0.5 − λ) / √λ)

Requirement for this normal approximation to be valid: λ > 20

NOTE: The continuity correction term +0.5 improves the approximation.

PROOF: Assume the requirement λ > 20 is satisfied. Then:

  X ∼ Poisson(λ) ⟹ X ∼ Normal(µ = λ, σ² = λ)  (approx.)
  ⟹ Pois(x; λ) = P(X ≤ x) ≈ P(Z ≤ (x − µ)/σ) = Φ((x − µ)/σ) = Φ((x − λ)/√λ)

Adding the continuity correction +0.5 in the numerator of Φ improves the approximation.
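A numeric comparison for the Poisson case (my own sketch, not from the slides; the values λ = 25, x = 30 are illustrative) pits the exact cdf against the continuity-corrected approximation:

```python
from math import exp, sqrt
from statistics import NormalDist

# Numeric sketch (not from the slides): exact Poisson cdf vs. the
# continuity-corrected normal approximation Phi((x + 0.5 - lam)/sqrt(lam)).
lam, x = 25.0, 30               # lam > 20, so the approximation applies

# Exact cdf via the running-product form of the pmf (avoids large factorials).
exact, term = 0.0, exp(-lam)    # term = pmf at k = 0
for k in range(x + 1):
    exact += term
    term *= lam / (k + 1)       # pmf(k+1) = pmf(k) * lam/(k+1)

approx = NormalDist().cdf((x + 0.5 - lam) / sqrt(lam))
print(f"exact Pois(30; 25)         = {exact:.4f}")
print(f"with continuity correction = {approx:.4f}")
```

For λ = 25 the two values agree to within a couple of thousandths, consistent with the λ > 20 validity rule on the slide.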
Poisson Density Plots (pmf's)

Notice the Poisson density curves for λ = 1, 2, 5, 10, 15 are skewed.
[Figure not reproduced.]
Poisson Density Plots (pmf's)

Notice the Poisson density curves for λ ≥ 20 are nearly symmetric.
[Figure not reproduced.]
Textbook Logistics for Section 5.4

Difference(s) in Notation:

  CONCEPT                      TEXTBOOK NOTATION            SLIDES/OUTLINE NOTATION
  Probability of Event         P(A)                         P(A)
  Support of a r.v.            "All possible values of X"   Supp(X)
  pmf of a r.v.                p_X(x)                       p_X(k)
  Expected Value of r.v.       E(X)                         E[X]
  Variance of r.v.             V(X)                         V[X]
  Sample Total                 T_o                          ΣX_k
  pmf of Sample Mean           p_X̄(x̄)                      p_X̄(k)
  pmf of Sample Variance       p_S²(s²)                     p_S²(k)

Ignore the Lognormal Approximation (bottom of pg. 236).
The Lognormal distribution was part of Section 4.5, which was skipped.
Fin.