Central Limit Theorem & Normal Approximations
Engineering Statistics, Section 5.4
Josh Engwer (TTU)
23 March 2016
PART I: CENTRAL LIMIT THEOREM
Expected Value & Variance of a Sum of iid rv's

Proposition: Let X₁, …, Xₙ be a random sample from a population and let c₁, …, cₙ be constants. Then:

  E[c₁X₁ + ⋯ + cₙXₙ] = c₁E[X₁] + ⋯ + cₙE[Xₙ]
  V[c₁X₁ + ⋯ + cₙXₙ] = c₁²V[X₁] + ⋯ + cₙ²V[Xₙ]

PROOF, CASE I (n = 2): X₁, X₂ iid (discrete population)

  E[c₁X₁ + c₂X₂] := Σ_{(j,k) ∈ Supp(X₁,X₂)} (c₁j + c₂k) p_{(X₁,X₂)}(j,k)
                  = Σ_{j ∈ Supp(X₁)} Σ_{k ∈ Supp(X₂)} (c₁j + c₂k) p_{X₁}(j) p_{X₂}(k)   (by iid)
                  = c₁ Σ_{j ∈ Supp(X₁)} j p_{X₁}(j) + c₂ Σ_{k ∈ Supp(X₂)} k p_{X₂}(k)
                  := c₁E[X₁] + c₂E[X₂]
Expected Value & Variance of a Sum of iid rv's (continued)

Proposition (restated): Let X₁, …, Xₙ be a random sample from a population and let c₁, …, cₙ be constants. Then:

  E[c₁X₁ + ⋯ + cₙXₙ] = c₁E[X₁] + ⋯ + cₙE[Xₙ]
  V[c₁X₁ + ⋯ + cₙXₙ] = c₁²V[X₁] + ⋯ + cₙ²V[Xₙ]

PROOF, CASE II (n = 2): X₁, X₂ iid (continuous population)

  E[c₁X₁ + c₂X₂] := ∫∫_{Supp(X₁,X₂)} (c₁x₁ + c₂x₂) f_{(X₁,X₂)}(x₁,x₂) dx₁ dx₂
                  = ∫_{Supp(X₁)} ∫_{Supp(X₂)} (c₁x₁ + c₂x₂) f_{X₁}(x₁) f_{X₂}(x₂) dx₁ dx₂   (requires independence)
                  = c₁ ∫_{Supp(X₁)} x₁ f_{X₁}(x₁) dx₁ + c₂ ∫_{Supp(X₂)} x₂ f_{X₂}(x₂) dx₂
                  := c₁E[X₁] + c₂E[X₂]
Properties of Sample Mean

Proposition: Let X₁, …, Xₙ be a random sample from a distribution with mean µ and variance σ². Then:

  µ_X̄ = E[X̄] = µ
  σ²_X̄ = V[X̄] = σ²/n
  σ_X̄ = σ/√n

PROOF:

  E[X̄] = E[(1/n)(X₁ + ⋯ + Xₙ)] = (1/n)[E[X₁] + ⋯ + E[Xₙ]] = (1/n)[µ + ⋯ + µ] = (1/n)[nµ] = µ
  V[X̄] = V[(1/n)(X₁ + ⋯ + Xₙ)] = (1/n²)[V[X₁] + ⋯ + V[Xₙ]] = (1/n²)[σ² + ⋯ + σ²] = (1/n²)[nσ²] = σ²/n
  σ_X̄ = √(V[X̄]) = √(σ²/n) = σ/√n
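The σ²/n rule can be checked empirically. The following simulation is my own sketch, not part of the slides: it draws many samples of size n = 25 from a Uniform(0,1) population (whose variance is σ² = 1/12) and compares the variance of the resulting sample means against the predicted σ²/n = 1/300. The population choice and all parameter values here are illustrative.

```python
import random
import statistics

# Empirical sketch (not from the slides): for iid Uniform(0,1) draws the
# population variance is sigma^2 = 1/12, so the proposition predicts
# V[Xbar] = sigma^2/n = 1/(12n).
random.seed(42)
n = 25          # sample size
reps = 20000    # number of independent samples

sample_means = [
    statistics.fmean(random.random() for _ in range(n))
    for _ in range(reps)
]

predicted = (1 / 12) / n                       # = 1/300
observed = statistics.variance(sample_means)   # sample variance of the means
print(f"E[Xbar]: predicted 0.5,     observed {statistics.fmean(sample_means):.4f}")
print(f"V[Xbar]: predicted {predicted:.5f}, observed {observed:.5f}")
```

With this many replications the observed variance of the sample means lands very close to 1/300, while the variance of a single draw stays near 1/12, which is the whole content of the proposition.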
Properties of Sample Total

Proposition: Let X₁, …, Xₙ be a random sample from a distribution with mean µ and variance σ². Then:

  µ_{X₁+⋯+Xₙ} = E[X₁ + ⋯ + Xₙ] = nµ
  σ²_{X₁+⋯+Xₙ} = V[X₁ + ⋯ + Xₙ] = nσ²
  σ_{X₁+⋯+Xₙ} = σ√n

PROOF:

  E[X₁ + ⋯ + Xₙ] = E[nX̄] = nE[X̄] = nµ
  V[X₁ + ⋯ + Xₙ] = V[nX̄] = n²V[X̄] = n²[σ²/n] = nσ²
  σ_{X₁+⋯+Xₙ} = √(V[X₁ + ⋯ + Xₙ]) = √(nσ²) = σ√n
Sample Mean & Sample Total of a Random Sample from a Normal Population

The preceding properties of sample means & sample totals can be applied to particular population distributions:

Proposition: Let random sample X₁, …, Xₙ iid∼ Normal(µ, σ²). Then, for any sample size n > 1:

  X̄ ∼ Normal(µ, σ²/n)
  X₁ + ⋯ + Xₙ ∼ Normal(nµ, nσ²)

PROOF (of the first claim):

  X ∼ Normal(µ, σ²) ⟹ E[X] = µ, V[X] = σ²
  E[X̄] = E[(1/n)(X₁ + ⋯ + Xₙ)] = (1/n)[E[X₁] + ⋯ + E[Xₙ]] = (1/n)[µ + ⋯ + µ] = (1/n)[nµ] = µ
  V[X̄] = V[(1/n)(X₁ + ⋯ + Xₙ)] = (1/n²)[V[X₁] + ⋯ + V[Xₙ]] = (1/n²)[σ² + ⋯ + σ²] = (1/n²)[nσ²] = σ²/n
  Since a linear combination of independent normal rv's is itself normal,
  ∴ X̄ ∼ Normal(µ, σ²/n)
Sample Mean & Sample Total of a Random Sample from a Normal Population (continued)

Proposition (restated): Let random sample X₁, …, Xₙ iid∼ Normal(µ, σ²). Then, for any sample size n > 1:

  X̄ ∼ Normal(µ, σ²/n)
  X₁ + ⋯ + Xₙ ∼ Normal(nµ, nσ²)

PROOF (of the second claim):

  X ∼ Normal(µ, σ²) ⟹ E[X] = µ, V[X] = σ²
  E[X₁ + ⋯ + Xₙ] = E[nX̄] = nE[X̄] = nµ
  V[X₁ + ⋯ + Xₙ] = V[nX̄] = n²V[X̄] = n²[σ²/n] = nσ²
  Since a sum of independent normal rv's is itself normal,
  ∴ X₁ + ⋯ + Xₙ ∼ Normal(nµ, nσ²)
Central Limit Theorem (CLT)

The Central Limit Theorem is considered a fundamental theorem of Statistics:

Theorem (Central Limit Theorem): Let X₁, …, Xₙ be a random sample from a non-normal distribution with mean µ and variance σ². Then:

  The sample mean X̄ is approximately normal:  X̄ ∼ Normal(µ, σ²/n)  (approx.)
  The sample total X₁ + ⋯ + Xₙ is approximately normal:  X₁ + ⋯ + Xₙ ∼ Normal(nµ, nσ²)  (approx.)

Requirement for this normal approximation to be valid: n > 30
The larger the sample size n, the better the approximation.

PROOF: Beyond the scope of this course.
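The CLT can be watched in action with a short simulation (my own sketch, not part of the slides): draw many sample means of a strongly skewed Exponential(1) population (µ = 1, σ² = 1) with n > 30, and compare them against what Normal(µ, σ²/n) predicts. The distribution choice and parameter values are illustrative.

```python
import random
import statistics

# Illustrative sketch (not from the slides): sample means of a skewed
# Exponential(1) population (mu = 1, sigma^2 = 1) should be approximately
# Normal(mu, sigma^2/n) once n > 30, per the CLT.
random.seed(7)
n = 50        # sample size (> 30, as the slides require)
reps = 10000  # number of sample means to draw

means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

print(f"mean of sample means  = {statistics.fmean(means):.4f}  (CLT predicts 1.0)")
print(f"stdev of sample means = {statistics.stdev(means):.4f}  (CLT predicts {1/n**0.5:.4f})")

# Fraction of sample means within one predicted standard error of mu;
# a normal distribution puts about 68.3% of its mass there.
within = sum(abs(m - 1.0) <= 1/n**0.5 for m in means) / reps
print(f"P(|Xbar - mu| <= sigma/sqrt(n)) ~ {within:.3f}")
```

Even though a single Exponential(1) draw is far from normal, the simulated sample means match the Normal(1, 1/50) predictions closely, which is exactly what the theorem claims.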
PART II: NORMAL APPROXIMATION TO THE BINOMIAL
The Sum of Several iid Bernoulli(p) rv's

Proposition: Let random sample X₁, …, Xₙ iid∼ Bernoulli(p). Then, for any sample size n > 1:

  X₁ + ⋯ + Xₙ ∼ Binomial(n, p)

PROOF (sketch):

  X ∼ Bernoulli(p) ⟹ E[X] = p, V[X] = pq   (where q = 1 − p)
  E[X₁ + ⋯ + Xₙ] = E[X₁] + ⋯ + E[Xₙ] = nE[X₁] = np   (by iid)
  V[X₁ + ⋯ + Xₙ] = V[X₁] + ⋯ + V[Xₙ] = nV[X₁] = npq   (by iid)
  These match the mean and variance of Binomial(n, p); indeed, the sum counts the successes in n independent trials, which is exactly a Binomial(n, p) rv.
  ∴ X₁ + ⋯ + Xₙ ∼ Binomial(n, p)
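The distributional claim can also be verified numerically. This sketch (my own, not from the slides; the helper name `bernoulli_sum_pmf` is hypothetical) builds the pmf of X₁ + ⋯ + Xₙ by repeated convolution of Bernoulli(p) pmfs and checks it against the Binomial(n, p) pmf term by term.

```python
from math import comb

def bernoulli_sum_pmf(n, p):
    """pmf of X1 + ... + Xn for iid Bernoulli(p), built by repeated convolution."""
    pmf = [1.0]                        # pmf of the empty sum (always 0)
    for _ in range(n):
        new = [0.0] * (len(pmf) + 1)
        for k, prob in enumerate(pmf):
            new[k]     += prob * (1 - p)   # next trial is a failure
            new[k + 1] += prob * p         # next trial is a success
        pmf = new
    return pmf

n, p = 10, 0.3
conv = bernoulli_sum_pmf(n, p)
binom = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
assert all(abs(a - b) < 1e-12 for a, b in zip(conv, binom))
print("convolution of 10 Bernoulli(0.3) pmfs matches the Binomial(10, 0.3) pmf")
```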
Normal Approximation to the Binomial

Corollary: Let X ∼ Binomial(n, p). Then:

  X ∼ Normal(µ = np, σ² = npq)  (approx.)   (where q = 1 − p)

Requirement for this normal approximation to be valid: min{np, nq} ≥ 10
i.e. it is required that both np ≥ 10 and nq ≥ 10.

NOTE: Remember that normal distributions are symmetric.
If min{np, nq} < 10, then the binomial distribution is too skewed.

PROOF: Let X₁, …, Xₙ iid∼ Bernoulli(p) ⟹ µ_{X₁} = E[X₁] = p and σ²_{X₁} = V[X₁] = pq.
Then X := X₁ + ⋯ + Xₙ ∼ Binomial(n, p). Moreover, the CLT asserts that

  X₁ + ⋯ + Xₙ ∼ Normal(µ = nµ_{X₁}, σ² = nσ²_{X₁})  (approx.)
  ∴ X ∼ Binomial(n, p) ⟹ X ∼ Normal(µ = np, σ² = npq)  (approx.)
Normal Approximation to Binomial Probability

Corollary: Let X ∼ Binomial(n, p). Then:

  P(X ≤ x) = Bi(x; n, p) ≈ Φ((x + 0.5 − np) / √(npq))   (where q = 1 − p)

Requirement for this normal approximation to be valid: min{np, nq} ≥ 10

NOTE: The continuity correction term +0.5 improves the approximation.

PROOF: Assume the requirement min{np, nq} ≥ 10 is satisfied. Then:

  X ∼ Binomial(n, p) ⟹ X ∼ Normal(µ = np, σ² = npq)  (approx.)
  ⟹ Bi(x; n, p) = P(X ≤ x) ≈ P(Z ≤ (x − µ)/σ) = Φ((x − µ)/σ) = Φ((x − np)/√(npq))

Adding the continuity correction +0.5 in the numerator of Φ improves the approximation.
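A quick numeric comparison (my own sketch, not from the slides; the parameter values n = 100, p = 0.5, x = 55 are illustrative) shows the continuity-corrected approximation against the exact binomial cdf, with and without the +0.5 term:

```python
from math import comb, sqrt
from statistics import NormalDist

# Numeric sketch (not from the slides): exact Binomial cdf vs. the
# continuity-corrected normal approximation Phi((x + 0.5 - np)/sqrt(npq)).
n, p, x = 100, 0.5, 55          # np = nq = 50 >= 10, so the approximation applies
q = 1 - p

exact  = sum(comb(n, k) * p**k * q**(n - k) for k in range(x + 1))
approx = NormalDist().cdf((x + 0.5 - n * p) / sqrt(n * p * q))
naive  = NormalDist().cdf((x - n * p) / sqrt(n * p * q))  # no correction

print(f"exact Bi(55; 100, 0.5)     = {exact:.4f}")
print(f"with continuity correction = {approx:.4f}")
print(f"without correction         = {naive:.4f}")
```

For these values the corrected approximation lands within a few thousandths of the exact cdf, while the uncorrected one is noticeably worse, illustrating why the +0.5 term is kept.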
Binomial Density Plots (pmf's) for Sample Sizes n = 2, 5, 10, 20, 50

[Figures not reproduced: binomial pmf plots for sample sizes n = 2, 5, 10, 20, and 50.]
PART III: NORMAL APPROXIMATION TO THE POISSON
The Sum of Several iid Poisson(λ) rv's

The preceding properties of sample totals can be applied to particular population distributions:

Proposition: Let random sample X₁, …, Xₙ iid∼ Poisson(λ). Then, for any sample size n > 1:

  X₁ + ⋯ + Xₙ ∼ Poisson(nλ)

PROOF (sketch):

  X ∼ Poisson(λ) ⟹ E[X] = λ, V[X] = λ
  E[X₁ + ⋯ + Xₙ] = E[X₁] + ⋯ + E[Xₙ] = nE[X₁] = nλ   (by iid)
  V[X₁ + ⋯ + Xₙ] = V[X₁] + ⋯ + V[Xₙ] = nV[X₁] = nλ   (by iid)
  These match the mean and variance of Poisson(nλ); indeed, a sum of independent Poisson rv's is itself Poisson, with rate equal to the sum of the rates.
  ∴ X₁ + ⋯ + Xₙ ∼ Poisson(nλ)
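As with the Bernoulli case, the distributional claim can be checked numerically. This sketch (my own, not from the slides; the helper name `pois_pmf` is hypothetical) convolves two Poisson(λ) pmfs and compares the result to the Poisson(2λ) pmf term by term:

```python
from math import exp, factorial

def pois_pmf(k, lam):
    """Poisson(lam) pmf at integer k >= 0."""
    return exp(-lam) * lam**k / factorial(k)

# Sanity-check sketch (not from the slides): the convolution of two
# Poisson(lam) pmfs should reproduce the Poisson(2*lam) pmf.
lam = 3.0
for m in range(15):
    conv = sum(pois_pmf(j, lam) * pois_pmf(m - j, lam) for j in range(m + 1))
    assert abs(conv - pois_pmf(m, 2 * lam)) < 1e-12
print("convolution of two Poisson(3) pmfs matches the Poisson(6) pmf (first 15 terms)")
```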
Normal Approximation to the Poisson

Corollary: Let X ∼ Poisson(λ). Then:

  X ∼ Normal(µ = λ, σ² = λ)  (approx.)

Requirement for this normal approximation to be valid: λ > 20

NOTE: Remember that normal distributions are symmetric.
If λ ≤ 20, then the Poisson distribution is too skewed.

PROOF: Let X₁, …, Xₙ iid∼ Poisson(λ/n) ⟹ µ_{X₁} = E[X₁] = λ/n and σ²_{X₁} = V[X₁] = λ/n.
Then X := X₁ + ⋯ + Xₙ ∼ Poisson(λ). Moreover, the CLT asserts that

  X₁ + ⋯ + Xₙ ∼ Normal(µ = nµ_{X₁}, σ² = nσ²_{X₁})  (approx.)
  ∴ X ∼ Poisson(λ) ⟹ X ∼ Normal(µ = λ, σ² = λ)  (approx.)
Normal Approximation to Poisson Probability

Corollary: Let X ∼ Poisson(λ). Then:

  P(X ≤ x) = Pois(x; λ) ≈ Φ((x + 0.5 − λ) / √λ)

Requirement for this normal approximation to be valid: λ > 20

NOTE: The continuity correction term +0.5 improves the approximation.

PROOF: Assume the requirement λ > 20 is satisfied. Then:

  X ∼ Poisson(λ) ⟹ X ∼ Normal(µ = λ, σ² = λ)  (approx.)
  ⟹ Pois(x; λ) = P(X ≤ x) ≈ P(Z ≤ (x − µ)/σ) = Φ((x − µ)/σ) = Φ((x − λ)/√λ)

Adding the continuity correction +0.5 in the numerator of Φ improves the approximation.
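A numeric comparison for the Poisson case (my own sketch, not from the slides; the values λ = 25, x = 30 are illustrative) pits the exact cdf against the continuity-corrected approximation:

```python
from math import exp, sqrt
from statistics import NormalDist

# Numeric sketch (not from the slides): exact Poisson cdf vs. the
# continuity-corrected normal approximation Phi((x + 0.5 - lam)/sqrt(lam)).
lam, x = 25.0, 30               # lam > 20, so the approximation applies

# Exact cdf via the running-product form of the pmf (avoids large factorials).
exact, term = 0.0, exp(-lam)    # term = pmf at k = 0
for k in range(x + 1):
    exact += term
    term *= lam / (k + 1)       # pmf(k+1) = pmf(k) * lam/(k+1)

approx = NormalDist().cdf((x + 0.5 - lam) / sqrt(lam))
print(f"exact Pois(30; 25)         = {exact:.4f}")
print(f"with continuity correction = {approx:.4f}")
```

For λ = 25 the two values agree to within a couple of thousandths, consistent with the λ > 20 validity rule on the slide.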
Poisson Density Plots (pmf's)

Notice the Poisson density curves for λ = 1, 2, 5, 10, 15 are skewed.
[Figure not reproduced.]
Poisson Density Plots (pmf's)

Notice the Poisson density curves for λ ≥ 20 are nearly symmetric.
[Figure not reproduced.]
Textbook Logistics for Section 5.4

Difference(s) in Notation:

  CONCEPT                      TEXTBOOK NOTATION            SLIDES/OUTLINE NOTATION
  Probability of Event         P(A)                         P(A)
  Support of a r.v.            "All possible values of X"   Supp(X)
  pmf of a r.v.                p_X(x)                       p_X(k)
  Expected Value of r.v.       E(X)                         E[X]
  Variance of r.v.             V(X)                         V[X]
  Sample Total                 T_o                          ΣX_k
  pmf of Sample Mean           p_X̄(x̄)                      p_X̄(k)
  pmf of Sample Variance       p_S²(s²)                     p_S²(k)

Ignore the Lognormal Approximation (bottom of pg. 236).
The Lognormal distribution was part of Section 4.5, which was skipped.
Fin.