Qualifying Exam Solutions: Theoretical Statistics

1. (a) For the first sampling plan, the expectation of any statistic $W(X_1, X_2, \ldots, X_n)$ is a polynomial in $\theta$ of degree less than $n+1$. Hence $\tau(\theta) = 1/\theta$ cannot have an unbiased estimator under this plan. For the second plan, let $Y$ be the number of $X_i$'s observed. Then
$$P(Y = y) = \theta(1-\theta)^{y-1}, \quad y = 1, 2, \ldots; \qquad E_\theta(Y) = 1/\theta.$$
Therefore, $Y$ is an unbiased estimator of $1/\theta$, and the second sampling plan should be employed.

(b) Observe the $X_i$'s until you get two $X_i$'s equal to $1$. Let $Y_1$ be the number of $X_i$'s up to and including the first $1$, and let $Y_2$ be the number of $X_i$'s between the first and second $1$'s (including the second). Then $Y_1$ and $Y_2$ are iid, and $E_\theta(Y_i) = 1/\theta$ implies that $E_\theta(Y_1 Y_2) = 1/\theta^2$.

2. (a) The joint pmf is $(1/N)^n\, I(X_{(n)} \le N)\, I(X_{(1)} \ge 1)$, where the $X_i$'s take values in $\{1, 2, \ldots, N\}$ and $X_{(r)}$ is the $r$th order statistic. By the Factorization Theorem, $X_{(n)}$ is sufficient. We show it is complete. For $k \le N$,
$$P(X_{(n)} \le k) = P(X_1 \le k) \cdots P(X_n \le k) = (k/N)^n,$$
and
$$P(X_{(n)} = k) = P(X_{(n)} \le k) - P(X_{(n)} \le k-1) = (k/N)^n - ((k-1)/N)^n.$$
Let $h(X_{(n)})$ satisfy $E_N[h(X_{(n)})] = 0$ for every $N = 1, 2, \ldots$. Putting $N = 1$ gives $h(1) = 0$; putting $N = 2$ gives $h(1)P(X_{(n)} = 1) + h(2)P(X_{(n)} = 2) = 0$, so $h(2) = 0$ since $P(X_{(n)} = N) > 0$ for every $N \ge 1$. Proceeding this way by induction, $h = 0$ identically. Therefore, $X_{(n)}$ is complete.

(b) An unbiased estimator of $N$ is $2X_1 - 1$. Hence the BUE of $N$ is $E(2X_1 - 1 \mid X_{(n)}) = 2E(X_1 \mid X_{(n)}) - 1$. (Suggestion from Professor Ghosh: it is already acceptable if a student stops here.) Incidentally,
$$E(X_1 \mid X_{(n)} = x_{(n)}) = x_{(n)}\, P(X_1 = x_{(n)} \mid X_{(n)} = x_{(n)}) + \sum_{x < x_{(n)}} x\, P(X_1 = x \mid X_{(n)} = x_{(n)}),$$
where
$$P(X_1 = x_{(n)} \mid X_{(n)} = x_{(n)}) = \frac{\sum_{k=1}^{n} \frac{k}{n} \binom{n}{k} \left(\frac{1}{N}\right)^k \left(\frac{x_{(n)}-1}{N}\right)^{n-k}}{\sum_{k=1}^{n} \binom{n}{k} \left(\frac{1}{N}\right)^k \left(\frac{x_{(n)}-1}{N}\right)^{n-k}}$$
and, by symmetry, $P(X_1 = x \mid X_{(n)} = x_{(n)}) = \big[1 - P(X_1 = x_{(n)} \mid X_{(n)} = x_{(n)})\big]/(x_{(n)} - 1)$ for each $x < x_{(n)}$, from which $E(X_1 \mid X_{(n)} = x_{(n)})$ can be found.
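As a sanity check on the last display, here is a minimal Monte Carlo sketch (not part of the original solutions; the values of $N$, $n$, $m$, and the seed are arbitrary illustrations) comparing the empirical conditional probability with the binomial-sum ratio:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)
N, n, m, n_rep = 10, 5, 7, 400_000

# Empirical P(X_1 = m | X_(n) = m) for X_i iid uniform on {1, ..., N}.
X = rng.integers(1, N + 1, size=(n_rep, n))
cond = X.max(axis=1) == m
print((X[cond, 0] == m).mean())

# The binomial-sum ratio derived above (the 1/N factors cancel, as they must).
num = sum((k / n) * comb(n, k) * (1 / N) ** k * ((m - 1) / N) ** (n - k)
          for k in range(1, n + 1))
den = sum(comb(n, k) * (1 / N) ** k * ((m - 1) / N) ** (n - k)
          for k in range(1, n + 1))
print(num / den)  # agrees with the empirical value up to simulation noise
```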
3. (a) It can be shown (or quoted as known) that $T = \sum_{i=1}^n X_i$ is the complete sufficient statistic for $\lambda$. If we can find $f$ such that $E[f(T)] = e^{a\lambda}$, then $f(T)$ is the UMVUE. Consider $f(u) = b^u$. Then
$$E(b^T) = \sum_{k=0}^{\infty} b^k \frac{(n\lambda)^k}{k!} e^{-n\lambda} = e^{-n\lambda} \sum_{k=0}^{\infty} \frac{(bn\lambda)^k}{k!} = e^{-n\lambda + bn\lambda}.$$
Let $b = 1 + a/n$. We have $E[(1 + a/n)^T] = e^{a\lambda}$. Therefore, $(1 + a/n)^T$ is the UMVUE of $e^{a\lambda}$.

(b) Note that
$$E(2^{X_1}) = \sum_{k=0}^{\infty} 2^k \frac{\lambda^k}{k!} e^{-\lambda} = e^{\lambda}.$$
So $2^{X_1}$ is an unbiased estimator of $e^{\lambda}$. Because $T$ is complete sufficient, $E(2^{X_1} \mid T)$ must be the UMVUE of $e^{\lambda}$. Since the UMVUE is unique, using the result in (a) with $a = 1$, we have $E(2^{X_1} \mid T) = (1 + 1/n)^T$.

4. (a) Use $\pi_{a,b} = \mathrm{Beta}(a, b)$ as the prior distribution. Then the posterior distribution of $p$ is $\mathrm{Beta}(a + x,\, n - x + b)$, and the Bayes estimator of $p$ is
$$\hat{p}_{a,b} = \frac{a + x}{n + a + b} = \frac{n}{n + a + b}\,\hat{p} + \frac{a}{n + a + b}.$$
When $\alpha > 0$, $\beta > 0$, and $\alpha + \beta < 1$, we can let $a = n\beta/\alpha$ and $b = n(1 - \alpha - \beta)/\alpha$, both of which are positive, so that $\delta(X) = \alpha\hat{p} + \beta$ is a Bayes estimator and thus is admissible. (More sophisticated proofs are possible, for example by using Karlin's Theorem, which leads to stronger conclusions.)

(b) The risk function of $\delta(X)$ is
$$R(\delta, p) = E_p[(\alpha\hat{p} + \beta - p)^2] = E_p(\alpha\hat{p} - \alpha p)^2 + [\beta - (1-\alpha)p]^2 = \frac{\alpha^2 p(1-p)}{n} + [\beta - (1-\alpha)p]^2$$
$$= \Big[(1-\alpha)^2 - \frac{\alpha^2}{n}\Big]p^2 + \Big[\frac{\alpha^2}{n} - 2\beta(1-\alpha)\Big]p + \beta^2.$$
Setting $(1-\alpha)^2 - \alpha^2/n = 0$ and $\alpha^2/n - 2\beta(1-\alpha) = 0$, that is, setting
$$\alpha = \frac{\sqrt{n}}{1 + \sqrt{n}}, \qquad \beta = \frac{1}{2(1 + \sqrt{n})},$$
$R(\delta, p)$ becomes a constant. Therefore, the corresponding estimator
$$\delta(X) = \frac{\sqrt{n}}{1 + \sqrt{n}}\,\hat{p} + \frac{1}{2(1 + \sqrt{n})}$$
is minimax. As a matter of fact, this $\delta(X)$ is the Bayes estimator corresponding to the prior distribution $\mathrm{Beta}(\sqrt{n}/2, \sqrt{n}/2)$.
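As an illustration of the constant-risk calculation in 4(b), the following short sketch (ours; the choice $n = 25$ is arbitrary) evaluates $R(\delta, p)$ on a grid of $p$ values and shows it is flat:

```python
import numpy as np

n = 25
alpha = np.sqrt(n) / (1 + np.sqrt(n))
beta = 1 / (2 * (1 + np.sqrt(n)))

p = np.linspace(0.01, 0.99, 9)
risk = alpha**2 * p * (1 - p) / n + (beta - (1 - alpha) * p)**2
print(risk)                            # constant across p
print(1 / (4 * (1 + np.sqrt(n))**2))   # its common value, beta^2 = 1/144 here
```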
5. (a) Applying the Neyman-Pearson Lemma, the rejection region of the size $\alpha$ UMP test consists of those $(x_1, x_2, \ldots, x_n)$ such that
$$\frac{f(x_1, \ldots, x_n \mid H_a)}{f(x_1, \ldots, x_n \mid H_0)} = \frac{(2\pi\sigma^2)^{-n/2} \exp\big(-\frac{1}{2\sigma^2}\sum (x_i - \theta_{i0})^2\big)}{(2\pi\sigma^2)^{-n/2} \exp\big(-\frac{1}{2\sigma^2}\sum x_i^2\big)} = \exp\Big(\frac{1}{\sigma^2}\sum x_i\theta_{i0}\Big)\exp\Big(-\frac{1}{2\sigma^2}\sum \theta_{i0}^2\Big) > k,$$
or equivalently, $\sum x_i\theta_{i0} > k'$, with $P(\sum X_i\theta_{i0} > k' \mid H_0) = \alpha$. Since the $X_i$'s are independent of each other, the distribution of $\sum X_i\theta_{i0}$ under $H_0$ is $N(0, \sigma^2\sum\theta_{i0}^2)$. Therefore, we can choose $k' = \sigma z_\alpha \sqrt{\sum\theta_{i0}^2}$, and the rejection region of the size $\alpha$ UMP test is
$$\Big\{(x_1, \ldots, x_n) : \sum x_i\theta_{i0} > \sigma z_\alpha \sqrt{\textstyle\sum\theta_{i0}^2}\Big\}.$$

(b) The MLE of $\theta_i$ is $\hat{\theta}_i = X_i$ for $1 \le i \le n$. So the LRT statistic is
$$\lambda = \frac{(2\pi\sigma^2)^{-n/2}\exp\big(-\frac{1}{2\sigma^2}\sum x_i^2\big)}{(2\pi\sigma^2)^{-n/2}\exp\big(-\frac{1}{2\sigma^2}\sum (x_i - \hat{\theta}_i)^2\big)} = \exp\Big(-\frac{1}{2\sigma^2}\sum x_i^2\Big).$$
The rejection region is $\lambda < c$ for some constant $0 < c < 1$, or equivalently $\sum X_i^2 > c'$ for some constant $c'$. Let $T = \sum X_i^2$. Under $H_0$, $T/\sigma^2 \sim \chi^2_n$. So $H_0$ is rejected when $T = \sum_{i=1}^n X_i^2 > \sigma^2\chi^2_{n,\alpha}$. Using the normal approximation, the rejection region is $\sum x_i^2 > n\sigma^2 + \sigma^2\sqrt{2n}\, z_\alpha$.

(c) For the test obtained in (a), under $H_a$, $\sum X_i\theta_{i0} = n^{-1/3}\sum X_i$ follows a normal distribution with mean $n^{1/3}$ and variance $n^{1/3}\sigma^2$. So its power is
$$1 - \Phi\Big(\frac{\sigma z_\alpha n^{1/6} - n^{1/3}}{\sigma n^{1/6}}\Big) = 1 - \Phi\Big(z_\alpha - \frac{n^{1/6}}{\sigma}\Big) \to 1 \quad \text{as } n \to \infty,$$
where $\Phi$ is the cdf of the standard normal distribution.

For the test obtained in (b), under $H_a$, $\sum X_i^2$ approximately follows a normal distribution with mean $n\sigma^2 + n^{1/3}$ and variance $2n\sigma^4 + 4n^{1/3}\sigma^2$. So its power is, as $n \to \infty$,
$$1 - \Phi\Big(\frac{n\sigma^2 + \sigma^2\sqrt{2n}\,z_\alpha - (n\sigma^2 + n^{1/3})}{\sqrt{2n\sigma^4 + 4n^{1/3}\sigma^2}}\Big) \approx 1 - \Phi\Big(z_\alpha - \frac{n^{-1/6}}{\sqrt{2}\,\sigma^2}\Big) \to \alpha.$$
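The two limits in (c) can be seen numerically. Below is an illustrative simulation sketch (assuming, as in part (c), $\theta_{i0} = n^{-1/3}$, and taking $\sigma = 1$, $\alpha = 0.05$; the sample sizes and seed are arbitrary, and SciPy is used only for quantiles):

```python
import numpy as np
from scipy.stats import norm, chi2

rng = np.random.default_rng(1)
n, sigma, alpha, n_rep = 1000, 1.0, 0.05, 10_000
theta0 = np.full(n, n ** (-1 / 3))
z_a = norm.ppf(1 - alpha)

X = rng.normal(theta0, sigma, size=(n_rep, n))  # each row is a sample under H_a
# Test (a): reject when sum(x_i * theta_i0) exceeds its H_0 critical value.
reject_a = X @ theta0 > sigma * z_a * np.sqrt((theta0**2).sum())
# Test (b): reject when sum(x_i^2) exceeds sigma^2 times the chi-square quantile.
reject_b = (X**2).sum(axis=1) > sigma**2 * chi2.ppf(1 - alpha, df=n)
print(reject_a.mean())  # about 0.93 here, tending to 1 as n grows
print(reject_b.mean())  # about 0.08 here, tending to alpha as n grows
```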
6. (a) Method 1: Let $\mu = (\mu_1, \mu_2, \mu_3, \mu_4)^t$. Then the null hypothesis is $H_0: A^t\mu = 0$ for the given $4 \times 2$ contrast matrix $A$. Since the columns of $A$ are not orthogonal, we replace $A$ by a matrix $\tilde{A}$ whose orthonormal columns span the same space:
$$\tilde{A} = \begin{pmatrix} 1/2 & 1/(2\sqrt{5}) \\ -1/2 & 3/(2\sqrt{5}) \\ -1/2 & -3/(2\sqrt{5}) \\ 1/2 & -1/(2\sqrt{5}) \end{pmatrix}.$$
Then we can write $H_0: \tilde{A}^t\mu = 0$. Let $\bar{y} = (\bar{y}_{1\cdot}, \ldots, \bar{y}_{4\cdot})^t$. Note that the rank of $\tilde{A}(\tilde{A}^t\tilde{A})^{-1}\tilde{A}^t$ is $2$ and $\bar{y} \sim N(\mu, (\sigma^2/5) I)$. Then, under the null hypothesis, we have
$$\frac{5}{\sigma^2}\,\bar{y}^t\tilde{A}(\tilde{A}^t\tilde{A})^{-1}\tilde{A}^t\bar{y} = \frac{5}{\sigma^2}\Big[\frac{1}{4}(\bar{y}_{1\cdot} - \bar{y}_{2\cdot} - \bar{y}_{3\cdot} + \bar{y}_{4\cdot})^2 + \frac{1}{20}(\bar{y}_{1\cdot} + 3\bar{y}_{2\cdot} - 3\bar{y}_{3\cdot} - \bar{y}_{4\cdot})^2\Big] \sim \chi^2_2.$$
Let
$$F = \frac{\frac{5}{8}(\bar{y}_{1\cdot} - \bar{y}_{2\cdot} - \bar{y}_{3\cdot} + \bar{y}_{4\cdot})^2 + \frac{1}{8}(\bar{y}_{1\cdot} + 3\bar{y}_{2\cdot} - 3\bar{y}_{3\cdot} - \bar{y}_{4\cdot})^2}{\mathrm{MSE}}, \quad \text{where} \quad \mathrm{MSE} = \frac{1}{16}\sum_{i=1}^{4}\sum_{j=1}^{5}(y_{ij} - \bar{y}_{i\cdot})^2.$$
Then, under the null hypothesis, $F \sim F_{2,16}$, and $H_0$ is rejected if $F \ge F_{\alpha,2,16}$, where $F_{\alpha,2,16}$ is the upper $\alpha$ quantile of the $F_{2,16}$ distribution.
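Before turning to Method 2, here is a quick numerical check (ours, based on the basis $\tilde{A}$ above; the group means are hypothetical) that the quadratic form decomposes into the two contrast terms with coefficients $1/4$ and $1/20$, has rank $2$, and, multiplied by $5$, reproduces the $4 \times 4$ matrix quoted in Method 2 below:

```python
import numpy as np

# Orthonormal columns of A-tilde, as reconstructed in Method 1.
A_tilde = np.column_stack([
    np.array([1.0, -1.0, -1.0, 1.0]) / 2.0,
    np.array([1.0, 3.0, -3.0, -1.0]) / (2.0 * np.sqrt(5.0)),
])
P = A_tilde @ np.linalg.inv(A_tilde.T @ A_tilde) @ A_tilde.T

y = np.array([2.3, 1.7, 2.9, 3.1])  # hypothetical group means, for illustration
quad = y @ P @ y
decomp = (y[0] - y[1] - y[2] + y[3])**2 / 4 \
       + (y[0] + 3*y[1] - 3*y[2] - y[3])**2 / 20
print(np.linalg.matrix_rank(P))   # 2: the chi-square degrees of freedom
print(quad, decomp)               # the two agree
print(np.round(5 * P, 4))         # equals the 4x4 matrix in Method 2 below
```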
Method 2: Directly use the result that the $F$ statistic is
$$F = \hat{\mu}^t A\big[A^t(X^tX)^{-1}A\big]^{-1}A^t\hat{\mu}\,/\,(2\,\mathrm{MSE}),$$
where $X$ is the design matrix and $\hat{\mu} = (\bar{y}_{1\cdot}, \bar{y}_{2\cdot}, \bar{y}_{3\cdot}, \bar{y}_{4\cdot})^t$. In fact,
$$A\big[A^t(X^tX)^{-1}A\big]^{-1}A^t = \begin{pmatrix} 3/2 & -1/2 & -2 & 1 \\ -1/2 & 7/2 & -7/2 & 1/2 \\ -2 & -7/2 & 7/2 & -1/2 \\ 1 & 1/2 & -1/2 & 3/2 \end{pmatrix}.$$

(b) Method 1: Directly use the result that
$$\hat{\mu}_{H_0} = \hat{\mu} - (X^tX)^{-1}A\big[A^t(X^tX)^{-1}A\big]^{-1}A^t\hat{\mu},$$
that is,
$$\begin{pmatrix} \hat{\mu}_{1,H_0} \\ \hat{\mu}_{2,H_0} \\ \hat{\mu}_{3,H_0} \\ \hat{\mu}_{4,H_0} \end{pmatrix} = \begin{pmatrix} \bar{y}_{1\cdot} \\ \bar{y}_{2\cdot} \\ \bar{y}_{3\cdot} \\ \bar{y}_{4\cdot} \end{pmatrix} - \frac{1}{5}\begin{pmatrix} 3/2 & -1/2 & -2 & 1 \\ -1/2 & 7/2 & -7/2 & 1/2 \\ -2 & -7/2 & 7/2 & -1/2 \\ 1 & 1/2 & -1/2 & 3/2 \end{pmatrix}\begin{pmatrix} \bar{y}_{1\cdot} \\ \bar{y}_{2\cdot} \\ \bar{y}_{3\cdot} \\ \bar{y}_{4\cdot} \end{pmatrix}.$$
Other methods are also acceptable, including using Lagrange multipliers, orthogonal projections, etc.

7. (a) The density of $X_i$ is
$$f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\Big\{-\frac{1}{2(1-\rho^2)}\Big[\frac{(x_1-\mu_1)^2}{\sigma_1^2} - \frac{2\rho(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1\sigma_2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2}\Big]\Big\}.$$
The conditional density of $X_{2i}$ given $X_{1i}$ is
$$f_{X_2\mid X_1}(x_2 \mid x_1) = \frac{1}{\sqrt{2\pi(1-\rho^2)}\,\sigma_2} \exp\Big\{-\frac{1}{2(1-\rho^2)\sigma_2^2}\Big[(x_2 - \mu_2) - \rho\frac{\sigma_2}{\sigma_1}(x_1 - \mu_1)\Big]^2\Big\}.$$
Therefore, $\tau^2 = (1-\rho^2)\sigma_2^2$. Then we have the moment estimators
$$\hat{\mu}_1 = \bar{X}_1, \quad \hat{\mu}_2 = \bar{X}_2, \quad \hat{\sigma}_1^2 = S_1^2, \quad \hat{\sigma}_2^2 = S_2^2, \quad \hat{\rho} = S_{12}/(S_1S_2), \quad \hat{\tau}^2 = (1-\hat{\rho}^2)\hat{\sigma}_2^2.$$
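The identification $\tau^2 = (1-\rho^2)\sigma_2^2$ can be checked by simulation: the residual of $X_2$ regressed on $X_1$ should have variance $\tau^2$. A sketch with illustrative parameter values (not from the exam):

```python
import numpy as np

rng = np.random.default_rng(4)
s1, s2, rho, n = 1.0, 2.0, 0.6, 200_000
cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
X = rng.multivariate_normal([0.0, 0.0], cov, size=n)

slope = np.cov(X[:, 0], X[:, 1])[0, 1] / X[:, 0].var()  # ~ rho * s2 / s1
resid = X[:, 1] - slope * X[:, 0]
print(resid.var())            # empirical residual variance
print((1 - rho**2) * s2**2)   # tau^2 = 2.56 for these values
```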
(b) Clearly, $V(X_{1i}^2) = 2\sigma_1^4$ and $V(X_{2i}^2) = 2\sigma_2^4$. Note that
$$X_{2i} \mid X_{1i} \sim N\Big(\rho\frac{\sigma_2}{\sigma_1}X_{1i},\ (1-\rho^2)\sigma_2^2\Big).$$
For the other terms, we need
$$E(X_{1i}^2X_{2i}^2) = E\big[X_{1i}^2 E(X_{2i}^2 \mid X_{1i})\big] = E\Big\{X_{1i}^2\Big[(1-\rho^2)\sigma_2^2 + \rho^2\frac{\sigma_2^2}{\sigma_1^2}X_{1i}^2\Big]\Big\} = (1-\rho^2)\sigma_1^2\sigma_2^2 + 3\rho^2\sigma_1^2\sigma_2^2 = (1+2\rho^2)\sigma_1^2\sigma_2^2.$$
We also need
$$E(X_{1i}^3X_{2i}) = E\big[X_{1i}^3E(X_{2i} \mid X_{1i})\big] = \rho\frac{\sigma_2}{\sigma_1}E(X_{1i}^4) = 3\rho\sigma_1^3\sigma_2 \quad \text{and} \quad E(X_{1i}X_{2i}^3) = 3\rho\sigma_1\sigma_2^3.$$
Therefore, we have
$$V(X_{1i}X_{2i}) = (1+\rho^2)\sigma_1^2\sigma_2^2, \qquad \mathrm{Cov}(X_{1i}^2, X_{2i}^2) = 2\rho^2\sigma_1^2\sigma_2^2,$$
$$\mathrm{Cov}(X_{1i}^2, X_{1i}X_{2i}) = 2\rho\sigma_1^3\sigma_2, \qquad \mathrm{Cov}(X_{2i}^2, X_{1i}X_{2i}) = 2\rho\sigma_1\sigma_2^3.$$
Therefore, we have the covariance matrix of $(X_{1i}^2, X_{2i}^2, X_{1i}X_{2i})$ equal to
$$\Sigma = \begin{pmatrix} 2\sigma_1^4 & 2\rho^2\sigma_1^2\sigma_2^2 & 2\rho\sigma_1^3\sigma_2 \\ 2\rho^2\sigma_1^2\sigma_2^2 & 2\sigma_2^4 & 2\rho\sigma_1\sigma_2^3 \\ 2\rho\sigma_1^3\sigma_2 & 2\rho\sigma_1\sigma_2^3 & (1+\rho^2)\sigma_1^2\sigma_2^2 \end{pmatrix}.$$
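The entries of $\Sigma$ can be compared against a Monte Carlo estimate; the following sketch uses illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(2)
s1, s2, rho, n = 1.0, 2.0, 0.6, 500_000
cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]
X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
W = np.column_stack([X[:, 0]**2, X[:, 1]**2, X[:, 0] * X[:, 1]])

Sigma = np.array([
    [2 * s1**4,                  2 * rho**2 * s1**2 * s2**2, 2 * rho * s1**3 * s2],
    [2 * rho**2 * s1**2 * s2**2, 2 * s2**4,                  2 * rho * s1 * s2**3],
    [2 * rho * s1**3 * s2,       2 * rho * s1 * s2**3,       (1 + rho**2) * s1**2 * s2**2],
])
print(np.round(np.cov(W.T), 2))  # empirical covariance of (X1^2, X2^2, X1*X2)
print(Sigma)                     # the matrix derived above
```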
Let x = σ, y = σ z = ρσ σ. We have x y = ρ σ = ρ σ =. z σ σ Then, we have the limiting variance of n(ˆρ ρ) as ( ) σ ρ ρ 4 ρ σσ ρσσ 3 σ σ σ σ ρ σσ σ 4 ρσ σ 3 ρσσ 3 ρσ σ 3 ( + ρ )σσ Therefore, we have n(ˆρ ρ) approx N(0, ( ρ ) ). ρ σ ρ σ σ σ = ( ρ ). To compute the limiting distribution of ˆτ, we define g(x, y, z) = y z /x. Then, we have = z x x = y = z z x Let x = σ, y = σ z = ρσ σ. We have Then, we have ( ρ σ σ =( ρ ) σ 4. x y z = ρ σ σ = = ρσ σ ) σ ρσ 4 ρ σσ ρσσ 3 σ ρ σσ σ 4 ρσ σ 3 ρσσ 3 ρσ σ 3 ( + ρ )σσ ρ σ σ ρσ σ 7
Thus, we have that $\sqrt{n}(\hat{\tau}^2 - \tau^2)$ is approximately $N(0, 2(1-\rho^2)^2\sigma_2^4)$.
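Finally, both delta-method variances can be confirmed by simulation. The sketch below (illustrative parameter values; second moments taken about zero, as in part (b)) compares $n \cdot \mathrm{Var}(\hat{\rho})$ with $(1-\rho^2)^2$ and $n \cdot \mathrm{Var}(\hat{\tau}^2)$ with $2(1-\rho^2)^2\sigma_2^4$:

```python
import numpy as np

rng = np.random.default_rng(3)
s1, s2, rho, n, n_rep = 1.0, 1.5, 0.5, 2000, 4000
cov = [[s1**2, rho * s1 * s2], [rho * s1 * s2, s2**2]]

rho_hat = np.empty(n_rep)
tau2_hat = np.empty(n_rep)
for r in range(n_rep):
    X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    m20 = (X[:, 0]**2).mean()         # moment estimator of sigma_1^2
    m02 = (X[:, 1]**2).mean()         # moment estimator of sigma_2^2
    m11 = (X[:, 0] * X[:, 1]).mean()  # moment estimator of rho*sigma_1*sigma_2
    rho_hat[r] = m11 / np.sqrt(m20 * m02)
    tau2_hat[r] = m02 - m11**2 / m20  # = (1 - rho_hat^2) * sigma_2_hat^2
print(n * rho_hat.var(), (1 - rho**2)**2)                # both ~ 0.5625
print(n * tau2_hat.var(), 2 * (1 - rho**2)**2 * s2**4)   # both ~ 5.70
```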