Lecture 9: The law of large numbers and central limit theorem

Theorem 1.14
Let $X_1, X_2, \ldots$ be independent random variables with finite expectations.

(i) (The SLLN). If there is a constant $p \in [1,2]$ such that
$$\sum_{i=1}^{\infty} \frac{E|X_i|^p}{i^p} < \infty, \tag{1}$$
then
$$\frac{1}{n}\sum_{i=1}^{n}(X_i - EX_i) \to_{a.s.} 0.$$

(ii) (The WLLN). If there is a constant $p \in [1,2]$ such that
$$\lim_{n\to\infty} \frac{1}{n^p}\sum_{i=1}^{n} E|X_i|^p = 0, \tag{2}$$
then
$$\frac{1}{n}\sum_{i=1}^{n}(X_i - EX_i) \to_{p} 0.$$

UW-Madison (Statistics), Stat 709, Lecture 9, 2018

Remarks
Note that (1) implies (2) (Lemma 1.6).
The result in Theorem 1.14(i) is called Kolmogorov's SLLN when $p = 2$ and is due to Marcinkiewicz and Zygmund when $1 \le p < 2$.
An obvious sufficient condition for (1) with $p \in (1,2]$ is $\sup_n E|X_n|^p < \infty$.
The WLLN and SLLN have many applications in probability and statistics.

Example 1.32
Let $f$ and $g$ be continuous functions on $[0,1]$ satisfying $0 \le f(x) \le Cg(x)$ for all $x$, where $C > 0$ is a constant. We now show that
$$\lim_{n\to\infty} \int_0^1\!\int_0^1\!\cdots\!\int_0^1 \frac{\sum_{i=1}^{n} f(x_i)}{\sum_{i=1}^{n} g(x_i)}\, dx_1\, dx_2 \cdots dx_n = \frac{\int_0^1 f(x)\,dx}{\int_0^1 g(x)\,dx} \tag{3}$$
(assuming that $\int_0^1 g(x)\,dx \ne 0$).

Example 1.32 (continued)
Let $X_1, X_2, \ldots$ be i.i.d. random variables having the uniform distribution on $[0,1]$. By Theorem 1.2,
$$E[f(X_1)] = \int_0^1 f(x)\,dx < \infty, \qquad E[g(X_1)] = \int_0^1 g(x)\,dx < \infty.$$
By the SLLN (Theorem 1.13(ii)),
$$\frac{1}{n}\sum_{i=1}^{n} f(X_i) \to_{a.s.} E[f(X_1)], \qquad \frac{1}{n}\sum_{i=1}^{n} g(X_i) \to_{a.s.} E[g(X_1)].$$
By Theorem 1.10(i),
$$\frac{\sum_{i=1}^{n} f(X_i)}{\sum_{i=1}^{n} g(X_i)} \to_{a.s.} \frac{E[f(X_1)]}{E[g(X_1)]}. \tag{4}$$
Since the random variable on the left-hand side of (4) is bounded by $C$, result (3) follows from the dominated convergence theorem and the fact that the left-hand side of (3) is the expectation of the random variable on the left-hand side of (4).
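The limit (3) can be checked numerically. The sketch below is only an illustration under assumed choices: $f(x) = x^2$ and $g(x) = x$ (so $0 \le f \le g$ on $[0,1]$ with $C = 1$), and the helper name `mean_ratio` is hypothetical. It estimates the expectation of the ratio in (4) by Monte Carlo and compares it with $\int_0^1 f / \int_0^1 g = 2/3$.

```python
import random

def f(x):
    return x * x  # assumed example: 0 <= f(x) <= C*g(x) on [0,1] with C = 1

def g(x):
    return x      # assumed example: integral over [0,1] equals 1/2

def mean_ratio(n, reps, seed=0):
    """Monte Carlo estimate of E[sum f(X_i) / sum g(X_i)], X_i i.i.d. U(0,1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [rng.random() for _ in range(n)]
        total += sum(f(x) for x in xs) / sum(g(x) for x in xs)
    return total / reps

if __name__ == "__main__":
    target = (1 / 3) / (1 / 2)  # ratio of the two integrals
    for n in (1, 10, 100, 1000):
        print(n, round(mean_ratio(n, reps=2000), 4), round(target, 4))
```

For $n = 1$ the ratio is simply $X_1$, so the estimate sits near $1/2$; as $n$ grows it should move toward $2/3$, in line with (3).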

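Example
Let $T_n = \sum_{i=1}^{n} X_i$, where the $X_n$'s are independent random variables satisfying $P(X_n = \pm n^{\theta}) = 0.5$ and $\theta > 0$ is a constant. We want to show that $T_n/n \to_{a.s.} 0$ when $\theta < 0.5$.

For $\theta < 0.5$,
$$\sum_{n=1}^{\infty} \frac{EX_n^2}{n^2} = \sum_{n=1}^{\infty} \frac{n^{2\theta}}{n^2} = \sum_{n=1}^{\infty} \frac{1}{n^{2-2\theta}} < \infty.$$
By the Kolmogorov strong law of large numbers, $T_n/n \to_{a.s.} 0$.

Example (Exercise 165)
Let $X_1, X_2, \ldots$ be independent random variables. Suppose that
$$\frac{1}{\sigma_n}\sum_{j=1}^{n}(X_j - EX_j) \to_d N(0,1),$$
where $\sigma_n^2 = \mathrm{var}\big(\sum_{j=1}^{n} X_j\big)$.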
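A short simulation can make the first example concrete. The following is a sketch only: $\theta = 0.3$ and the function names `kolmogorov_partial_sum` and `running_average_path` are assumptions chosen for illustration. It evaluates partial sums of the series in condition (1) with $p = 2$ and traces one path of $T_n/n$.

```python
import random

def kolmogorov_partial_sum(theta, n_max):
    """Partial sum of sum_n E(X_n^2)/n^2 = sum_n n**(2*theta - 2)."""
    return sum(n ** (2 * theta - 2) for n in range(1, n_max + 1))

def running_average_path(theta, n_max, seed=0):
    """One simulated path of T_n / n, where X_n = +/- n**theta w.p. 1/2 each."""
    rng = random.Random(seed)
    t = 0.0
    path = []
    for n in range(1, n_max + 1):
        sign = 1.0 if rng.random() < 0.5 else -1.0
        t += sign * n ** theta
        path.append(t / n)
    return path

if __name__ == "__main__":
    theta = 0.3  # assumed value below 0.5
    print("partial sum of the series:", kolmogorov_partial_sum(theta, 10_000))
    path = running_average_path(theta, 100_000)
    for n in (10, 1_000, 100_000):
        print(n, path[n - 1])
```

For $\theta < 0.5$ the partial sums stay below the integral bound $1 + 1/(1 - 2\theta)$, and the simulated averages should shrink toward 0, although a single path is only suggestive of almost sure convergence.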

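Example (Exercise 165, continued)
We want to show that
$$\frac{1}{n}\sum_{j=1}^{n}(X_j - EX_j) \to_p 0 \quad \text{iff} \quad \sigma_n/n \to 0.$$
If $\sigma_n/n \to 0$, then by Slutsky's theorem,
$$\frac{1}{n}\sum_{j=1}^{n}(X_j - EX_j) = \frac{\sigma_n}{n}\cdot\frac{1}{\sigma_n}\sum_{j=1}^{n}(X_j - EX_j) \to_d 0,$$
and convergence in distribution to a constant is equivalent to convergence in probability.

Assume now that $\sigma_n/n$ does not converge to 0 but $\frac{1}{n}\sum_{j=1}^{n}(X_j - EX_j) \to_p 0$. Without loss of generality (passing to a subsequence), assume that $\sigma_n/n \to c \in (0,\infty]$. By Slutsky's theorem,
$$\frac{1}{\sigma_n}\sum_{j=1}^{n}(X_j - EX_j) = \frac{n}{\sigma_n}\cdot\frac{1}{n}\sum_{j=1}^{n}(X_j - EX_j) \to_p 0.$$
This contradicts the fact that $\sum_{j=1}^{n}(X_j - EX_j)/\sigma_n \to_d N(0,1)$. Hence, $\frac{1}{n}\sum_{j=1}^{n}(X_j - EX_j)$ does not converge to 0 in probability.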

The central limit theorem
The WLLN and SLLN may not be useful in approximating the distributions of (normalized) sums of independent random variables. We need to use the central limit theorem (CLT), which plays a fundamental role in statistical asymptotic theory.

Theorem 1.15 (Lindeberg's CLT)
Let $\{X_{nj}, j = 1, \ldots, k_n\}$ be independent random variables with $k_n \to \infty$ as $n \to \infty$ and
$$0 < \sigma_n^2 = \mathrm{var}\Big(\sum_{j=1}^{k_n} X_{nj}\Big) < \infty, \qquad n = 1, 2, \ldots.$$
If
$$\frac{1}{\sigma_n^2}\sum_{j=1}^{k_n} E\Big[(X_{nj} - EX_{nj})^2 I_{\{|X_{nj} - EX_{nj}| > \varepsilon\sigma_n\}}\Big] \to 0 \quad \text{for any } \varepsilon > 0, \tag{5}$$
then
$$\frac{1}{\sigma_n}\sum_{j=1}^{k_n}(X_{nj} - EX_{nj}) \to_d N(0,1).$$
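Condition (5) can be evaluated exactly for simple arrays. As an illustration only, the sketch below assumes $k_n = n$ and $X_{nj} = X_j$ with $P(X_j = \pm j^{\theta}) = 0.5$ (the two-point variables from the earlier example); the function name `lindeberg_sum` is hypothetical. Because $|X_j| = j^{\theta}$ is non-random, each truncated second moment is either $j^{2\theta}$ or 0.

```python
import math

def lindeberg_sum(theta, n, eps):
    """Left-hand side of (5) for X_j = +/- j**theta with probability 1/2 each.

    EX_j = 0 and |X_j| = j**theta, so E[X_j^2 I{|X_j| > eps*sigma_n}]
    equals j**(2*theta) if j**theta > eps*sigma_n and 0 otherwise.
    """
    variances = [j ** (2 * theta) for j in range(1, n + 1)]
    sigma_n = math.sqrt(sum(variances))
    tail = sum(v for j, v in enumerate(variances, start=1)
               if j ** theta > eps * sigma_n)
    return tail / sigma_n ** 2

if __name__ == "__main__":
    for n in (5, 50, 500, 5000):
        print(n, lindeberg_sum(theta=0.3, n=n, eps=0.1))
```

For this assumed array the largest term $n^{\theta}$ grows more slowly than $\sigma_n$, which is of order $n^{\theta + 1/2}$, so for each fixed $\varepsilon$ the indicator eventually vanishes for every $j$ and the sum in (5) becomes exactly 0.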

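Proof
Considering $(X_{nj} - EX_{nj})/\sigma_n$, without loss of generality we may assume $EX_{nj} = 0$ and $\sigma_n^2 = 1$ in this proof.

Let $t \in \mathbb{R}$ be given. From the inequality
$$\big|e^{\sqrt{-1}\,tx} - (1 + \sqrt{-1}\,tx - t^2x^2/2)\big| \le \min\{|tx|^2, |tx|^3\},$$
the ch.f. of $X_{nj}$ satisfies
$$\big|\phi_{X_{nj}}(t) - (1 - t^2\sigma_{nj}^2/2)\big| \le E\big(\min\{|tX_{nj}|^2, |tX_{nj}|^3\}\big),$$
where $\sigma_{nj}^2 = \mathrm{var}(X_{nj})$.

For any $\varepsilon > 0$, the right-hand side of the previous expression is bounded by
$$E\big(|tX_{nj}|^3 I_{\{|X_{nj}| < \varepsilon\}}\big) + E\big(|tX_{nj}|^2 I_{\{|X_{nj}| \ge \varepsilon\}}\big),$$
which is bounded by
$$\varepsilon|t|^3\sigma_{nj}^2 + t^2 E\big(X_{nj}^2 I_{\{|X_{nj}| \ge \varepsilon\}}\big).$$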
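The Taylor-type inequality for $e^{\sqrt{-1}\,tx}$ used at the start of the proof can be sanity-checked numerically. The sketch below is not part of the proof; it evaluates both sides on a grid of values $u = tx$, with hypothetical helper names `remainder`, `bound` and `inequality_holds` and a small tolerance for floating-point rounding.

```python
import cmath

def remainder(u):
    """|exp(i*u) - (1 + i*u - u**2/2)| for a real number u = t*x."""
    z = 1j * u
    return abs(cmath.exp(z) - (1 + z - u * u / 2))

def bound(u):
    """Right-hand side min{|u|^2, |u|^3} of the inequality."""
    return min(abs(u) ** 2, abs(u) ** 3)

def inequality_holds(grid, tol=1e-12):
    """Check remainder(u) <= bound(u) on every grid point (up to rounding)."""
    return all(remainder(u) <= bound(u) + tol for u in grid)

if __name__ == "__main__":
    grid = [k / 100 for k in range(-5000, 5001)]  # u from -50 to 50
    print(inequality_holds(grid))
```

The cubic bound is the sharper one for $|u| < 1$ and the quadratic bound for $|u| > 1$, which is why the proof splits the expectation at $|X_{nj}| < \varepsilon$ and $|X_{nj}| \ge \varepsilon$.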

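Proof (continued)
Summing over $j$ and using $\sigma_n^2 = 1$, we obtain that
$$\sum_{j=1}^{k_n}\big|\phi_{X_{nj}}(t) - (1 - t^2\sigma_{nj}^2/2)\big| \le \sum_{j=1}^{k_n}\Big\{\varepsilon|t|^3\sigma_{nj}^2 + t^2E\big(X_{nj}^2 I_{\{|X_{nj}| \ge \varepsilon\}}\big)\Big\} = \varepsilon|t|^3 + t^2\sum_{j=1}^{k_n}E\big(X_{nj}^2 I_{\{|X_{nj}| \ge \varepsilon\}}\big) \to \varepsilon|t|^3$$
by condition (5). Since $\varepsilon > 0$ is arbitrary and $t$ is fixed,
$$\sum_{j=1}^{k_n}\big|\phi_{X_{nj}}(t) - (1 - t^2\sigma_{nj}^2/2)\big| \to 0.$$
Also by condition (5) and $\sigma_n^2 = 1$,
$$\max_{j \le k_n}\frac{\sigma_{nj}^2}{\sigma_n^2} \le \varepsilon^2 + \max_{j \le k_n}E\big(X_{nj}^2 I_{\{|X_{nj}| > \varepsilon\}}\big) \to \varepsilon^2.$$
Since $\varepsilon > 0$ is arbitrary,
$$\lim_{n\to\infty}\max_{j \le k_n}\frac{\sigma_{nj}^2}{\sigma_n^2} = 0. \tag{6}$$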

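Proof (continued)
This implies that $1 - t^2\sigma_{nj}^2$ are all between 0 and 1 for large enough $n$. Using the inequality
$$|a_1 \cdots a_m - b_1 \cdots b_m| \le \sum_{j=1}^{m}|a_j - b_j|$$
for any complex numbers $a_j$'s and $b_j$'s with $|a_j| \le 1$ and $|b_j| \le 1$, $j = 1, \ldots, m$, we obtain that
$$\Big|\prod_{j=1}^{k_n} e^{-t^2\sigma_{nj}^2/2} - \prod_{j=1}^{k_n}\big(1 - t^2\sigma_{nj}^2/2\big)\Big| \le \sum_{j=1}^{k_n}\Big|e^{-t^2\sigma_{nj}^2/2} - \big(1 - t^2\sigma_{nj}^2/2\big)\Big|,$$
which is bounded by
$$t^4\sum_{j=1}^{k_n}\sigma_{nj}^4 \le t^4\max_{j \le k_n}\sigma_{nj}^2 \to 0,$$
since $|e^x - 1 - x| \le x^2/2$ if $|x| \le 1/2$ (applied here with $x = -t^2\sigma_{nj}^2/2 \le 0$) and $\sum_{j=1}^{k_n}\sigma_{nj}^2 = \sigma_n^2 = 1$.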

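Proof (continued)
Then
$$\Big|\prod_{j=1}^{k_n}\phi_{X_{nj}}(t) - \prod_{j=1}^{k_n}e^{-t^2\sigma_{nj}^2/2}\Big| \le \sum_{j=1}^{k_n}\Big|\phi_{X_{nj}}(t) - e^{-t^2\sigma_{nj}^2/2}\Big| \le \sum_{j=1}^{k_n}\Big|\phi_{X_{nj}}(t) - \big(1 - t^2\sigma_{nj}^2/2\big)\Big| + \sum_{j=1}^{k_n}\Big|e^{-t^2\sigma_{nj}^2/2} - \big(1 - t^2\sigma_{nj}^2/2\big)\Big| \to 0$$
as previously shown. Thus,
$$\prod_{j=1}^{k_n}\phi_{X_{nj}}(t) = \prod_{j=1}^{k_n}e^{-t^2\sigma_{nj}^2/2} + o(1) = e^{-t^2/2} + o(1),$$
i.e., the ch.f. of $\sum_{j=1}^{k_n}X_{nj}$ converges to the ch.f. of $N(0,1)$ for every $t$. By Theorem 1.9(ii), the result follows.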

Remarks
Condition (5) is called Lindeberg's condition.
From the proof, Lindeberg's condition implies (6), which is called Feller's condition.
Feller's condition (6) means that all terms in the sum $\sigma_n^2 = \sum_{j=1}^{k_n}\sigma_{nj}^2$ are uniformly negligible as $n \to \infty$.
If Feller's condition is assumed, then Lindeberg's condition is not only sufficient but also necessary for the result in Theorem 1.15, which is the well-known Lindeberg-Feller CLT. A proof can be found in Billingsley (1995, pp. 359-361).
Note that neither Lindeberg's condition nor Feller's condition is necessary for the result in Theorem 1.15 (Exercise 158).

Liapounov's condition
A sufficient condition for Lindeberg's condition is the following Liapounov's condition, which is somewhat easier to verify:
$$\frac{1}{\sigma_n^{2+\delta}}\sum_{j=1}^{k_n}E|X_{nj} - EX_{nj}|^{2+\delta} \to 0 \quad \text{for some } \delta > 0. \tag{7}$$

Example 1.33
Let $X_1, X_2, \ldots$ be independent random variables. Suppose that $X_i$ has the binomial distribution $Bi(p_i, 1)$, $i = 1, 2, \ldots$, and that
$$\sigma_n^2 = \sum_{i=1}^{n}\mathrm{var}(X_i) = \sum_{i=1}^{n}p_i(1 - p_i) \to \infty \quad \text{as } n \to \infty.$$
For each $i$, $EX_i = p_i$ and
$$E|X_i - EX_i|^3 = (1 - p_i)^3p_i + p_i^3(1 - p_i) \le 2p_i(1 - p_i).$$
Hence $\sum_{i=1}^{n}E|X_i - EX_i|^3 \le 2\sigma_n^2$, so that $\sigma_n^{-3}\sum_{i=1}^{n}E|X_i - EX_i|^3 \le 2/\sigma_n \to 0$, i.e., Liapounov's condition (7) holds with $\delta = 1$. Thus, by Theorem 1.15,
$$\frac{1}{\sigma_n}\sum_{i=1}^{n}(X_i - p_i) \to_d N(0,1). \tag{8}$$
It can be shown (exercise) that the condition $\sigma_n \to \infty$ is also necessary for result (8).

The following are useful corollaries of Theorem 1.15 and Theorem 1.9(iii).
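The bound in Example 1.33 is easy to evaluate numerically. The sketch below assumes, purely for illustration, success probabilities $p_i = 1/(i+1)$, for which $\sigma_n^2$ diverges slowly; the function name `liapounov_ratio` is hypothetical. It computes the left-hand side of (7) with $\delta = 1$ and compares it with the bound $2/\sigma_n$.

```python
import math

def liapounov_ratio(ps):
    """Return (ratio, sigma2) for independent Bi(p_i, 1) variables.

    ratio  = sum_i E|X_i - p_i|^3 / sigma_n^3   (condition (7) with delta = 1)
    sigma2 = sum_i p_i (1 - p_i)
    """
    sigma2 = sum(p * (1 - p) for p in ps)
    third = sum((1 - p) ** 3 * p + p ** 3 * (1 - p) for p in ps)
    return third / sigma2 ** 1.5, sigma2

if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        ps = [1 / (i + 1) for i in range(1, n + 1)]  # assumed p_i
        ratio, sigma2 = liapounov_ratio(ps)
        print(n, ratio, 2 / math.sqrt(sigma2))
```

With these assumed $p_i$ the ratio decreases only at a logarithmic rate, since $\sigma_n^2$ grows like $\log n$; this shows how slowly condition (7) can take effect when most $p_i$ are close to 0.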

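Corollary 1.2 (Multivariate CLT)
For i.i.d. random $k$-vectors $X_1, \ldots, X_n$ with a finite $\Sigma = \mathrm{var}(X_1)$,
$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}(X_i - EX_1) \to_d N_k(0, \Sigma).$$

Corollary 1.3
Let $X_{ni} \in \mathbb{R}^{m_i}$, $i = 1, \ldots, k_n$, be independent random vectors with $m_i \le m$ (a fixed integer), $n = 1, 2, \ldots$, $k_n \to \infty$ as $n \to \infty$, and $\inf_{i,n}\lambda_{-}[\mathrm{var}(X_{ni})] > 0$, where $\lambda_{-}[A]$ is the smallest eigenvalue of $A$. Let $c_{ni} \in \mathbb{R}^{m_i}$ be vectors such that
$$\lim_{n\to\infty}\Big(\max_{1 \le i \le k_n}\|c_{ni}\|^2 \Big/ \sum_{i=1}^{k_n}\|c_{ni}\|^2\Big) = 0.$$

(i) If $\sup_{i,n}E\|X_{ni}\|^{2+\delta} < \infty$ for some $\delta > 0$, then
$$\sum_{i=1}^{k_n}c_{ni}^{\tau}(X_{ni} - EX_{ni}) \Big/ \Big[\sum_{i=1}^{k_n}\mathrm{var}(c_{ni}^{\tau}X_{ni})\Big]^{1/2} \to_d N(0,1). \tag{9}$$

(ii) If whenever $m_i = m_j$, $1 \le i < j \le k_n$, $n = 1, 2, \ldots$, $X_{ni}$ and $X_{nj}$ have the same distribution with $E\|X_{ni}\|^2 < \infty$, then (9) holds.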

Remarks
Proving Corollary 1.3 is a good exercise.
Applications of these corollaries can be found in later chapters.
More results on the CLT can be found, for example, in Serfling (1980) and Shorack and Wellner (1986).

More on Pólya's theorem
Let $Y_n$ be a sequence of random variables, $\{\mu_n\}$ and $\{\sigma_n\}$ be sequences of real numbers such that $\sigma_n > 0$ for all $n$, and
$$(Y_n - \mu_n)/\sigma_n \to_d N(0,1).$$
Then, by Proposition 1.16,
$$\lim_{n\to\infty}\sup_x\big|F_{(Y_n - \mu_n)/\sigma_n}(x) - \Phi(x)\big| = 0, \tag{10}$$
where $\Phi$ is the c.d.f. of $N(0,1)$.

Asymptotic normality
(10) implies that for any sequence of real numbers $\{c_n\}$,
$$\lim_{n\to\infty}\Big|P(Y_n \le c_n) - \Phi\Big(\frac{c_n - \mu_n}{\sigma_n}\Big)\Big| = 0,$$
i.e., $P(Y_n \le c_n)$ can be approximated by $\Phi\big(\frac{c_n - \mu_n}{\sigma_n}\big)$, regardless of whether $\{c_n\}$ has a limit.
Since $\Phi\big(\frac{t - \mu_n}{\sigma_n}\big)$ is the c.d.f. of $N(\mu_n, \sigma_n^2)$, $Y_n$ is said to be asymptotically distributed as $N(\mu_n, \sigma_n^2)$ or simply asymptotically normal.

Examples
For example, $\sum_{i=1}^{k_n}c_{ni}^{\tau}X_{ni}$ in Corollary 1.3 is asymptotically normal.
This can be extended to random vectors. For example, $\sum_{i=1}^{n}X_i$ in Corollary 1.2 is asymptotically distributed as $N_k(nEX_1, n\Sigma)$.
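The uniform approximation in (10) can be illustrated with a case where the exact c.d.f. is available. The sketch below assumes $Y_n$ is a sum of $n$ i.i.d. $Bi(p, 1)$ variables, so $\mu_n = np$ and $\sigma_n^2 = np(1-p)$; the function names are hypothetical. It computes $\sup_x |P(Y_n \le x) - \Phi((x - \mu_n)/\sigma_n)|$ exactly by checking both sides of every jump of the binomial c.d.f.

```python
import math

def normal_cdf(z):
    """C.d.f. of N(0,1), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binomial_pmf(n, p, k):
    """P(Y_n = k) for Y_n ~ Bi(p, n), computed on the log scale for stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

def sup_distance(n, p):
    """sup_x |P(Y_n <= x) - Phi((x - mu_n)/sigma_n)| with mu_n = n*p."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    worst, cdf_left = 0.0, 0.0
    for k in range(n + 1):
        phi = normal_cdf((k - mu) / sigma)
        cdf_right = cdf_left + binomial_pmf(n, p, k)
        # The step c.d.f. jumps at k, so compare just below and at the jump.
        worst = max(worst, abs(cdf_left - phi), abs(cdf_right - phi))
        cdf_left = cdf_right
    return worst

if __name__ == "__main__":
    for n in (25, 100, 400, 1600):
        print(n, sup_distance(n, p=0.3))
```

Because the distance is a supremum over all $x$, the same numbers bound the error of $\Phi((c_n - \mu_n)/\sigma_n)$ for any sequence $\{c_n\}$, whether or not it has a limit.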