Exam 1 Spring 2015 Statistics for Applications 3/5/2015

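18.443 Exam 1, Spring 2015. Statistics for Applications. 3/5/2015.

1. Log Normal Distribution: A random variable $X$ follows a Lognormal$(\theta, \sigma^2)$ distribution if $\ln(X)$ follows a Normal$(\theta, \sigma^2)$ distribution. For the normal random variable $Y = \ln(X)$:

The probability density function of $Y$ is
$$f(y \mid \theta, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(y-\theta)^2}{2\sigma^2}}, \quad -\infty < y < \infty.$$

The moment-generating function of $Y$ is
$$M_Y(t) = E[e^{tY} \mid \theta, \sigma^2] = e^{t\theta + \frac{1}{2}\sigma^2 t^2}.$$

(a). Compute the first two moments of a random variable $X \sim$ Lognormal$(\theta, \sigma^2)$:
$$\mu_1 = E[X \mid \theta, \sigma^2] \quad \text{and} \quad \mu_2 = E[X^2 \mid \theta, \sigma^2].$$
Hint: Note that $X = e^{Y}$ and $X^2 = e^{2Y}$ where $Y \sim N(\theta, \sigma^2)$, and use the moment-generating function of $Y$.

(b). Suppose that $X_1, \ldots, X_n$ is an i.i.d. sample of size $n$ from the Lognormal$(\theta, \sigma^2)$ distribution. Find the method-of-moments estimates of $\theta$ and $\sigma^2$. Hint: evaluate $\mu_2/\mu_1^2$ and find a method-of-moments estimate for $\sigma^2$ first.

(c). For the log-normal random variable $X = e^{Y}$, where $Y \sim$ Normal$(\theta, \sigma^2)$, prove that the probability density of $X$ is
$$f(x \mid \theta, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \left(\frac{1}{x}\right) e^{-\frac{(\ln(x)-\theta)^2}{2\sigma^2}}, \quad 0 < x < \infty.$$

(d). Suppose that $X_1, \ldots, X_n$ is an i.i.d. sample of size $n$ from the Lognormal$(\theta, \sigma^2)$ distribution. Find the mle for $\theta$ assuming that $\sigma^2$ is known to equal $\sigma_0^2$.

(e). Find the asymptotic variance of the mle for $\theta$ in (d).

Solution: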

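(a).
$$\mu_1 = E[X] = E[e^{Y}] = M_Y(1) = e^{\theta + \sigma^2/2}$$
$$\mu_2 = E[X^2] = E[e^{2Y}] = M_Y(2) = e^{2\theta + 2\sigma^2}$$

(b). First, note that
$$\mu_2/\mu_1^2 = e^{\sigma^2}.$$
It follows that a method-of-moments estimate for $\sigma^2$ is
$$\hat{\sigma}^2 = \ln(\hat{\mu}_2/\hat{\mu}_1^2), \quad \text{where} \quad \hat{\mu}_1 = \frac{1}{n}\sum_{i=1}^{n} X_i, \quad \hat{\mu}_2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2.$$
Substituting $\hat{\sigma}^2$ for $\sigma^2$ in the formula for $\mu_1$ we get
$$\hat{\mu}_1 = e^{\hat{\theta} + \hat{\sigma}^2/2} \implies \hat{\theta} = \ln(\hat{\mu}_1) - \hat{\sigma}^2/2.$$

(c). Consider the transformation $X = e^{Y}$, which has the inverse $y = \ln(x)$ with $dy/dx = 1/x$. It follows that
$$f_X(x) = f_Y(\ln(x)) \left|\frac{dy}{dx}\right| = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(\ln(x)-\theta)^2}{2\sigma^2}} \cdot \frac{1}{x}.$$

(d). The log of the density function for a single realization $x$ is
$$\ln[f(x \mid \theta, \sigma_0^2)] = -\tfrac{1}{2}\ln(2\pi\sigma_0^2) - \ln(x) - \frac{(\ln(x)-\theta)^2}{2\sigma_0^2}.$$
For a sample $x_1, \ldots, x_n$, the log-likelihood function is
$$\ell(\theta) = \sum_{i=1}^{n} \ln[f(x_i \mid \theta, \sigma_0^2)] = -\frac{1}{2\sigma_0^2}\sum_{i=1}^{n} (\ln(x_i)-\theta)^2 + (\text{terms not depending on } \theta).$$
$\ell(\theta)$ is maximized (equivalently, the sum of squares is minimized) by
$$\hat{\theta} = \frac{1}{n}\sum_{i=1}^{n} \ln(x_i),$$
the mle from the sample of $Y_i = \ln(x_i)$ values.

(e). The asymptotic variance satisfies
$$\frac{1}{Var(\hat{\theta})} \approx E\left[-\frac{d^2\ell(\theta)}{d\theta^2}\right].$$
Since $-\dfrac{d^2\ell(\theta)}{d\theta^2} = \dfrac{n}{\sigma_0^2}$ is constant,
$$Var(\hat{\theta}) = \sigma_0^2/n.$$
This asymptotic variance is in fact the actual variance of $\hat{\theta}$.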
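As a numerical check on parts (b) and (d), the following is a minimal Python sketch of the two estimators. The function names, the seed, and the "true" values of theta and sigma are hypothetical choices made for illustration; they are not part of the exam.

```python
import math
import random


def lognormal_mom(xs):
    """Method-of-moments estimates (theta_hat, sigma2_hat) as in part (b)."""
    n = len(xs)
    m1 = sum(xs) / n                  # first sample moment
    m2 = sum(x * x for x in xs) / n   # second sample moment
    sigma2_hat = math.log(m2 / m1 ** 2)
    theta_hat = math.log(m1) - sigma2_hat / 2
    return theta_hat, sigma2_hat


def lognormal_mle_theta(xs):
    """MLE of theta as in part (d): the mean of the log-observations."""
    return sum(math.log(x) for x in xs) / len(xs)


if __name__ == "__main__":
    rng = random.Random(0)            # fixed seed, illustrative only
    theta, sigma = 1.0, 0.5           # hypothetical true parameter values
    xs = [math.exp(rng.gauss(theta, sigma)) for _ in range(20000)]
    print("MoM (theta, sigma^2):", lognormal_mom(xs))
    print("MLE theta:", lognormal_mle_theta(xs))
```

With a large simulated sample both estimates of theta should land near the value used to generate the data, and the mle should vary less from sample to sample, consistent with its variance of $\sigma_0^2/n$.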

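2. The Pareto distribution is used in economics to model values exceeding a threshold (e.g., liability losses greater than \$100 million for a consumer products company). For a fixed, known threshold value of $x_0 > 0$, the density function is
$$f(x \mid x_0, \theta) = \theta x_0^{\theta} x^{-\theta-1}, \quad x \ge x_0, \text{ and } \theta > 1.$$
Note that the cumulative distribution function of $X$ is
$$P(X \le x) = F_X(x) = 1 - \left(\frac{x}{x_0}\right)^{-\theta}.$$

(a). Find the method-of-moments estimate of $\theta$.

(b). Find the mle of $\theta$.

(c). Find the asymptotic variance of the mle.

(d). What is the large-sample asymptotic distribution of the mle?

Solution:

(a). Compute the first moment of a Pareto random variable $X$:
$$\mu_1 = \int_{x_0}^{\infty} x f(x \mid x_0, \theta)\,dx = \int_{x_0}^{\infty} \theta x_0^{\theta} x^{-\theta}\,dx = \theta x_0^{\theta} \left[\frac{x^{1-\theta}}{1-\theta}\right]_{x_0}^{\infty} = \theta x_0^{\theta} \cdot \frac{x_0^{1-\theta}}{\theta-1} = \frac{x_0\,\theta}{\theta-1}.$$
Solving $\mu_1 = \hat{\mu}_1 = \bar{x}$ for $\theta$ gives
$$\hat{\theta} = \frac{\bar{x}}{\bar{x} - x_0}.$$

(b). For a single observation $X = x$, we can write
$$\log[f(x \mid \theta)] = \ln(\theta) + \theta \ln(x_0) - (\theta+1)\ln(x)$$
$$\frac{\partial \log[f(x \mid \theta)]}{\partial\theta} = \frac{1}{\theta} + \ln(x_0) - \ln(x)$$
$$\frac{\partial^2 \log[f(x \mid \theta)]}{\partial\theta^2} = -\frac{1}{\theta^2}$$
The mle for $\theta$ solves
$$0 = \ell'(\theta) = \frac{\partial}{\partial\theta}\left(\sum_{i=1}^{n} \ln[f(x_i \mid \theta)]\right) = \sum_{i=1}^{n}\left[\frac{1}{\theta} + \ln(x_0) - \ln(x_i)\right] = \frac{n}{\theta} + n\ln(x_0) - \sum_{i=1}^{n}\ln(x_i),$$
so
$$\hat{\theta} = \left[\frac{1}{n}\sum_{i=1}^{n}\ln(x_i/x_0)\right]^{-1} = \frac{1}{\frac{1}{n}\sum_{i=1}^{n}\ln(x_i) - \ln(x_0)}.$$

(c). The asymptotic variance of $\hat{\theta}$ is
$$Var(\hat{\theta}) \approx \frac{1}{n I(\theta)} = \frac{\theta^2}{n},$$
because
$$I(\theta) = E\left[-\frac{\partial^2 \ln[f(x \mid \theta)]}{\partial\theta^2}\right] = \frac{1}{\theta^2}.$$

(d). The asymptotic distribution of $\hat{\theta}$ is
$$\sqrt{n}(\hat{\theta} - \theta) \xrightarrow{D} N(0, I(\theta)^{-1}) = N(0, \theta^2),$$
or, approximately for large $n$,
$$\hat{\theta} \sim N\left(\theta, \frac{\theta^2}{n}\right).$$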
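The Pareto estimators in (a) through (c) can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, and `sample_pareto` draws data by inverting the cumulative distribution function given in the problem, which is one convenient way to simulate but is not prescribed by the exam.

```python
import math
import random


def pareto_mom(xs, x0):
    """Method-of-moments estimate: theta_hat = xbar / (xbar - x0)."""
    xbar = sum(xs) / len(xs)
    return xbar / (xbar - x0)


def pareto_mle(xs, x0):
    """MLE: reciprocal of the mean of ln(x_i / x0)."""
    mean_log_ratio = sum(math.log(x / x0) for x in xs) / len(xs)
    return 1.0 / mean_log_ratio


def pareto_mle_se(theta_hat, n):
    """Plug-in asymptotic standard error, sqrt(theta_hat^2 / n)."""
    return theta_hat / math.sqrt(n)


def sample_pareto(rng, x0, theta, n):
    """Inverse-CDF sampling from F(x) = 1 - (x / x0)^(-theta)."""
    # 1 - rng.random() lies in (0, 1], so the power is always defined.
    return [x0 * (1.0 - rng.random()) ** (-1.0 / theta) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed, illustrative only
    x0, theta = 100.0, 3.0           # hypothetical threshold and parameter
    xs = sample_pareto(rng, x0, theta, 5000)
    theta_hat = pareto_mle(xs, x0)
    print("MoM:", pareto_mom(xs, x0))
    print("MLE:", theta_hat, "+/-", pareto_mle_se(theta_hat, len(xs)))
```

The method-of-moments estimate relies on the sample mean of a heavy-tailed variable, so in simulations it tends to be noisier than the mle, especially when theta is close to 1.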

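3. Distributions derived from Normal random variables. Consider two independent random samples from two normal distributions:

$X_1, \ldots, X_n$ are $n$ i.i.d. Normal$(\mu_1, \sigma_1^2)$ random variables.
$Y_1, \ldots, Y_m$ are $m$ i.i.d. Normal$(\mu_2, \sigma_2^2)$ random variables.

(a). If $\mu_1 = \mu_2 = 0$, find two statistics
$$T_1(X_1, \ldots, X_n, Y_1, \ldots, Y_m), \quad T_2(X_1, \ldots, X_n, Y_1, \ldots, Y_m),$$
each of which is a $t$ random variable and which are statistically independent. Explain in detail why your answers have a $t$ distribution and why they are independent.

(b). If $\sigma_1^2 = \sigma_2^2 > 0$, define a statistic $T_3(X_1, \ldots, X_n, Y_1, \ldots, Y_m)$ which has an $F$ distribution. An $F$ distribution is determined by the numerator and denominator degrees of freedom. State the degrees of freedom for your statistic $T_3$.

(c). For your answer in (b), define the statistic
$$T_4(X_1, \ldots, X_n, Y_1, \ldots, Y_m) = \frac{1}{T_3(X_1, \ldots, X_n, Y_1, \ldots, Y_m)}.$$
What is the distribution of $T_4$ under the conditions of (b)?

(d). Suppose that $\sigma_1^2 = \sigma_2^2$. If
$$S_X^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2 \quad \text{and} \quad S_Y^2 = \frac{1}{m-1}\sum_{i=1}^{m}(Y_i - \bar{Y})^2$$
are the sample variances of the two samples, show how to use the $F$ distribution to find $P(S_X^2/S_Y^2 > c)$.

(e). Repeat question (d) if it is known that $\sigma_1^2 = 2\sigma_2^2$.

Solution:

(a). Consider
$$T_1 = \frac{\sqrt{n}\,\bar{X}}{S_X}, \quad T_2 = \frac{\sqrt{m}\,\bar{Y}}{S_Y},$$
where
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \sim N(\mu_1, \sigma_1^2/n), \quad S_X^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2 \sim \frac{\sigma_1^2}{n-1}\,\chi^2_{n-1},$$
$$\bar{Y} = \frac{1}{m}\sum_{i=1}^{m} Y_i \sim N(\mu_2, \sigma_2^2/m), \quad S_Y^2 = \frac{1}{m-1}\sum_{i=1}^{m}(Y_i - \bar{Y})^2 \sim \frac{\sigma_2^2}{m-1}\,\chi^2_{m-1}.$$
We know from theory that $\bar{X}$ and $S_X^2$ are independent, and $\bar{Y}$ and $S_Y^2$ are independent, and all 4 are mutually independent because they depend on independent samples.

For $\mu_1 = 0$, we can write
$$T_1 = \frac{\sqrt{n}\,\bar{X}/\sigma_1}{\sqrt{S_X^2/\sigma_1^2}} \sim t_{n-1},$$
a $t$ distribution with $(n-1)$ degrees of freedom, because the numerator is a $N(0,1)$ random variable independent of the denominator, which is $\sqrt{\chi^2_{n-1}/(n-1)}$.

And for $\mu_2 = 0$, we can write
$$T_2 = \frac{\sqrt{m}\,\bar{Y}/\sigma_2}{\sqrt{S_Y^2/\sigma_2^2}} \sim t_{m-1},$$
a $t$ distribution with $(m-1)$ degrees of freedom, because the numerator is a $N(0,1)$ random variable independent of the denominator, which is $\sqrt{\chi^2_{m-1}/(m-1)}$. $T_1$ depends only on the $X$ sample and $T_2$ only on the $Y$ sample, so the two statistics are independent.

(b). For $\sigma_1^2 = \sigma_2^2$ consider the statistic
$$T_3 = \frac{S_X^2}{S_Y^2} = \frac{S_X^2/\sigma_1^2}{S_Y^2/\sigma_2^2}.$$
The numerator is a $\chi^2_{n-1}$ random variable divided by its degrees of freedom $(n-1)$ and the denominator is an independent $\chi^2_{m-1}$ random variable divided by its degrees of freedom $(m-1)$. By definition the distribution of such a ratio is an $F$ distribution with $(n-1)$ and $(m-1)$ degrees of freedom in the numerator/denominator.

(c). The inverse of an $F$ random variable is also an $F$ random variable; the degrees of freedom for numerator and denominator reverse. So $T_4 \sim F_{m-1,\,n-1}$.

(d). In general we know:
$$\frac{(n-1)S_X^2}{\sigma_1^2} \sim \chi^2_{n-1}, \quad \frac{(m-1)S_Y^2}{\sigma_2^2} \sim \chi^2_{m-1},$$
which are independent. Dividing each by its degrees of freedom gives $S_X^2/\sigma_1^2$ and $S_Y^2/\sigma_2^2$, so we can develop the expression:
$$P\left(\frac{S_X^2}{S_Y^2} > c\right) = P\left(\frac{S_X^2/\sigma_1^2}{S_Y^2/\sigma_2^2} > c\,\frac{\sigma_2^2}{\sigma_1^2}\right) = P\left(F_{n-1,\,m-1} > c\,\frac{\sigma_2^2}{\sigma_1^2}\right).$$
The answer is the upper-tail probability of an $F$ distribution with $(n-1), (m-1)$ degrees of freedom, equal to the probability of exceeding $c\,\sigma_2^2/\sigma_1^2$.

For (d), use $\sigma_2^2/\sigma_1^2 = 1$, and for (e) use $\sigma_2^2/\sigma_1^2 = 1/2$.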
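The rescaling argument in (d) and (e) can be checked by simulation. The sketch below uses only the Python standard library, which has no F distribution function, so it estimates the tail probability by Monte Carlo instead of evaluating it exactly. All function names, sample sizes, and parameter values are hypothetical.

```python
import random


def sample_variance(vals):
    """Sample variance with the (len - 1) divisor, matching S^2 above."""
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / (len(vals) - 1)


def f_threshold(c, sigma1_sq, sigma2_sq):
    """Value an F(n-1, m-1) variable must exceed for S_X^2 / S_Y^2 > c."""
    return c * sigma2_sq / sigma1_sq


def simulate_ratio_tail(n, m, sigma1_sq, sigma2_sq, c, reps=20000, seed=0):
    """Monte Carlo estimate of P(S_X^2 / S_Y^2 > c) for zero-mean normals."""
    rng = random.Random(seed)
    s1, s2 = sigma1_sq ** 0.5, sigma2_sq ** 0.5
    hits = 0
    for _ in range(reps):
        xs = [rng.gauss(0.0, s1) for _ in range(n)]
        ys = [rng.gauss(0.0, s2) for _ in range(m)]
        if sample_variance(xs) / sample_variance(ys) > c:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    n, m, c = 8, 12, 2.0             # hypothetical sample sizes and cutoff
    # Case (e)-style variances: sigma_1^2 = 2, sigma_2^2 = 1.
    print("threshold:", f_threshold(c, 2.0, 1.0))
    print("unequal variances, cutoff c:", simulate_ratio_tail(n, m, 2.0, 1.0, c))
    # Should roughly agree with equal variances and the rescaled cutoff.
    print("equal variances, rescaled cutoff:",
          simulate_ratio_tail(n, m, 1.0, 1.0, f_threshold(c, 2.0, 1.0), seed=1))
```

The two simulated probabilities should agree up to Monte Carlo error, which is the content of the identity $P(S_X^2/S_Y^2 > c) = P(F_{n-1,\,m-1} > c\,\sigma_2^2/\sigma_1^2)$.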

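4. Hardy-Weinberg (Multinomial) Model of Gene Frequencies. For a certain population, gene frequencies are in equilibrium: the genotypes AA, Aa, and aa occur with probabilities $(1-\theta)^2$, $2\theta(1-\theta)$, and $\theta^2$. A random sample of 50 people from the population yielded the following data:

Genotype   AA   Aa   aa
Count      35   10    5

The table counts can be modeled as the multinomial distribution:
$$(X_1, X_2, X_3) \sim \text{Multinomial}(n = 50,\; p = ((1-\theta)^2,\, 2\theta(1-\theta),\, \theta^2)).$$

(a). Find the mle of $\theta$.

(b). Find the asymptotic variance of the mle.

(c). What is the large-sample asymptotic distribution of the mle?

(d). Find an approximate 90% confidence interval for $\theta$. To construct the interval you may use the following table of cumulative probabilities for a standard normal $N(0,1)$ random variable $Z$:

P(Z < z)   z
0.99       2.326
0.975      1.960
0.950      1.645
0.90       1.282

(e). Using the mle $\hat{\theta}$ in (a), 1000 samples from the Multinomial$(n = 50,\; p = ((1-\hat{\theta})^2,\, 2\hat{\theta}(1-\hat{\theta}),\, \hat{\theta}^2))$ distribution were randomly generated, and mle estimates were computed for each sample: $\hat{\theta}^*_j$, $j = 1, \ldots, 1000$. For the true parameter $\theta_0$, the sampling distribution of $\Delta = \hat{\theta} - \theta_0$ is approximated by that of $\Delta^* = \hat{\theta}^* - \hat{\theta}$. The 50-th largest value of $\Delta^*$ was $+0.065$ and the 50-th smallest value was $-0.067$. Use this information and the estimate in (a) to construct a (parametric) bootstrap confidence interval for the true $\theta_0$. What is the confidence level of the interval? (If you do not have an answer to part (a), assume the mle $\hat{\theta} = 0.5$.)

Solution:

(a). Find the mle of $\theta$.
$$(X_1, X_2, X_3) \sim \text{Multinomial}(n,\; p = ((1-\theta)^2,\, 2\theta(1-\theta),\, \theta^2))$$
Log-likelihood for $\theta$:
$$\ell(\theta) = \log(f(x_1, x_2, x_3 \mid p_1(\theta), p_2(\theta), p_3(\theta))) = \log\left(\frac{n!}{x_1!\,x_2!\,x_3!}\, p_1(\theta)^{x_1} p_2(\theta)^{x_2} p_3(\theta)^{x_3}\right)$$
$$= x_1\log((1-\theta)^2) + x_2\log(2\theta(1-\theta)) + x_3\log(\theta^2) + (\text{non-}\theta\text{ terms})$$
$$= (2x_1 + x_2)\log(1-\theta) + (2x_3 + x_2)\log(\theta) + (\text{non-}\theta\text{ terms})$$
First derivative of the log-likelihood:
$$\ell'(\theta) = -\frac{2x_1 + x_2}{1-\theta} + \frac{2x_3 + x_2}{\theta}$$
Setting $\ell'(\theta) = 0$ and solving gives
$$\hat{\theta} = \frac{2x_3 + x_2}{2x_1 + 2x_2 + 2x_3} = \frac{2x_3 + x_2}{2n} = \frac{2(5) + 10}{2(50)} = 0.2.$$

(b). Find the asymptotic variance of the mle.
$$Var(\hat{\theta}) \approx \frac{1}{E[-\ell''(\theta)]}$$
Second derivative of the log-likelihood:
$$\ell''(\theta) = \frac{d}{d\theta}\left[-\frac{2x_1 + x_2}{1-\theta} + \frac{2x_3 + x_2}{\theta}\right] = -\frac{2x_1 + x_2}{(1-\theta)^2} - \frac{2x_3 + x_2}{\theta^2}$$
Each $X_i$ is Binomial$(n, p_i(\theta))$, so
$$E[X_1] = n p_1(\theta) = n(1-\theta)^2, \quad E[X_2] = n p_2(\theta) = 2n\theta(1-\theta), \quad E[X_3] = n p_3(\theta) = n\theta^2,$$
$$E[-\ell''(\theta)] = \frac{2n(1-\theta)^2 + 2n\theta(1-\theta)}{(1-\theta)^2} + \frac{2n\theta^2 + 2n\theta(1-\theta)}{\theta^2} = \frac{2n}{1-\theta} + \frac{2n}{\theta} = \frac{2n}{\theta(1-\theta)}.$$
Plugging in $\hat{\theta}$:
$$\hat{\sigma}^2_{\hat{\theta}} = \frac{\hat{\theta}(1-\hat{\theta})}{2n} = \frac{0.2(1-0.2)}{2 \times 50} = 0.16/100 = (0.4/10)^2 = (0.04)^2.$$

(c). The asymptotic distribution of $\hat{\theta}$ is $N\left(\theta, \dfrac{\theta(1-\theta)}{2n}\right)$.

(d). An approximate 90% confidence interval for $\theta$ is given by
$$\left\{\theta : \hat{\theta} - z(\alpha/2)\sqrt{\widehat{Var}(\hat{\theta})} < \theta < \hat{\theta} + z(\alpha/2)\sqrt{\widehat{Var}(\hat{\theta})}\right\},$$
where $\alpha = 1 - 0.90$, $z(.05) = 1.645$, and $\sqrt{\widehat{Var}(\hat{\theta})} = 0.04$. Since $1.645 \times 0.04 = 0.0658$, the approximate 90% confidence interval is
$$\{\theta : 0.20 - 0.0658 < \theta < 0.20 + 0.0658\} = (0.1342,\, 0.2658).$$

(e). For the bootstrap distribution of the errors $\Delta = \hat{\theta} - \theta_0$ (where $\theta_0$ is the true value), the approximate 5% and 95% quantiles are $\delta_{lower} = -0.067$ and $\delta_{upper} = 0.065$, since 50 of the 1000 bootstrap errors lie beyond each value. The approximate 90% confidence interval is
$$\{\theta : \hat{\theta} - \delta_{upper} < \theta < \hat{\theta} - \delta_{lower}\} = [0.2 - 0.065,\; 0.2 + 0.067] = [0.135,\, 0.267].$$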
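The arithmetic in parts (a), (b), (d) and (e) can be reproduced with a short Python sketch. The function names are hypothetical; `statistics.NormalDist().inv_cdf` from the standard library is used in place of the printed normal table, so the interval endpoints differ from the hand calculation only in the rounding of 1.645.

```python
import math
from statistics import NormalDist


def hw_mle(x_AA, x_Aa, x_aa):
    """MLE of theta: (2 * x_aa + x_Aa) / (2n)."""
    n = x_AA + x_Aa + x_aa
    return (2 * x_aa + x_Aa) / (2 * n)


def hw_se(theta_hat, n):
    """Plug-in standard error, sqrt(theta_hat * (1 - theta_hat) / (2n))."""
    return math.sqrt(theta_hat * (1 - theta_hat) / (2 * n))


def normal_ci(theta_hat, se, level=0.90):
    """Two-sided normal-approximation interval at the given level."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return theta_hat - z * se, theta_hat + z * se


def bootstrap_ci(theta_hat, delta_lower, delta_upper):
    """Interval (theta_hat - delta_upper, theta_hat - delta_lower)."""
    return theta_hat - delta_upper, theta_hat - delta_lower


theta_hat = hw_mle(35, 10, 5)        # counts from the exam table
se = hw_se(theta_hat, 50)
print("mle:", theta_hat, "se:", se)
print("normal 90% CI:", normal_ci(theta_hat, se))
print("bootstrap 90% CI:", bootstrap_ci(theta_hat, -0.067, 0.065))
```

Note that the bootstrap interval subtracts the upper quantile to get the lower endpoint and the lower quantile to get the upper endpoint, which is why it is slightly asymmetric around 0.2.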

MIT OpenCourseWare http://ocw.mit.edu 18.443 Statistics for Applications Spring 2015 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
