Supplementary material for the paper Identifiability and bias reduction in the skew-probit model for a binary response


DongHyuk Lee and Samiran Sinha
Department of Statistics, Texas A&M University, College Station, TX, USA
Emails: dhyuklee@stat.tamu.edu, sinha@stat.tamu.edu

S.1 Figures and tables of the simulation study

In this section, we provide the simulation results for scenarios 5-12. Figures S.1-S.8 present boxplots for each parameter corresponding to simulation scenarios 5-12, respectively. Tables S.1-S.3 contain the mean and the standard deviation of the computation time for simulation scenarios 5-16. A detailed discussion of the simulation results can be found in Section 4 of the manuscript.

S.2 Comparison between the optimization algorithms

Here we compare different optimization algorithms for estimating the parameters under the five methods. In particular, we compared the algorithms Nelder-Mead, BFGS, L-BFGS-B, nlm, nlminb, ucminf, newuoa, bobyqa, and nmkb, all available in the R package optimx, in terms of the mean and standard deviation of the estimators (Tables S.4, S.5, S.6), the computation time (Table S.7), and the number of non-converging datasets out of … replications (Table S.8). Here we present the results for scenario 6 only (X ~ Normal(…, (√(4/3))²), β1 = …, δ = 4, β0 = .42, p_m = 4%). However, based on our short and limited simulation study, the results for the other scenarios follow the same trend as those of scenario 6. The simulation results can be summarized as follows:

- When the sample size is large, the resulting estimates of β0, β1, and δ are quite close regardless of the algorithm and the method of estimation.
- With large sample sizes, ucminf computes the estimates much faster than the other algorithms.
- For methods J and C, all algorithms except BFGS produce very close results for β0, β1, and δ. Also, the mean bias from BFGS is slightly larger than that from the other algorithms when the sample size is small.
- For method N, the estimates seem to differ across algorithms.
- Nelder-Mead, BFGS, L-BFGS-B, nlm, and nlminb suffer from non-convergence, especially for method N.

Based on these findings, ucminf seems to be by far the best among the algorithms we considered.
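To illustrate the kind of comparison summarized above, the following minimal sketch fits one model with several optimizers and records the estimates, a convergence flag, and the elapsed time. It is a stand-in under stated assumptions, not the simulation code behind Tables S.4-S.8: it uses only the base R optimizers (optim with Nelder-Mead, BFGS, and L-BFGS-B, plus nlm and nlminb) rather than optimx, an ordinary probit likelihood rather than the skew-probit one, and a toy data set whose seed, sample size, and coefficients are arbitrary. The names negloglik, run_one, and comparison are chosen here for illustration.

```r
# Toy data: ordinary probit model (an assumption, not a scenario from the paper)
set.seed(1)                       # arbitrary seed
n <- 500                          # arbitrary sample size
x <- runif(n, -2, 2)
X <- cbind(1, x)
y <- rbinom(n, 1, pnorm(0.5 + 1 * x))

# Negative log-likelihood, written with log.p = TRUE for numerical stability
negloglik <- function(b) {
  eta <- as.vector(X %*% b)
  -sum(y * pnorm(eta, log.p = TRUE) + (1 - y) * pnorm(-eta, log.p = TRUE))
}

start <- c(0, 0)

# Fit with one algorithm; return estimates, convergence flag, elapsed seconds
run_one <- function(name) {
  elapsed <- system.time(
    fit <- switch(name,
      nlm    = { r <- nlm(negloglik, start)
                 list(par = r$estimate, ok = r$code %in% 1:2) },
      nlminb = { r <- nlminb(start, negloglik)
                 list(par = r$par, ok = r$convergence == 0) },
      { r <- optim(start, negloglik, method = name)   # optim-based methods
        list(par = r$par, ok = r$convergence == 0) })
  )[["elapsed"]]
  data.frame(algorithm = name, b0 = fit$par[1], b1 = fit$par[2],
             converged = fit$ok, seconds = elapsed)
}

algos <- c("Nelder-Mead", "BFGS", "L-BFGS-B", "nlm", "nlminb")
comparison <- do.call(rbind, lapply(algos, run_one))
print(comparison)
```

Repeating run_one over many simulated data sets and summarizing b0, b1, converged, and seconds per algorithm would give tables of the same shape as those reported in this section.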

S.3 R codes

## Necessary libraries
library(sn)
library(ucminf)

## Method N: negative log-likelihood of the skew-probit model
loglik <- function(paras, X, y) {
  delta <- paras[length(paras)]
  eta <- as.vector(X %*% paras[-length(paras)])
  mu <- psn(as.vector(eta), alpha = delta)
  mu[which(mu == 0)] <- min(mu[mu != 0])
  mu[which(mu == 1)] <- max(mu[mu != 1])
  re <- sum(y * log(mu) + (1 - y) * log(1 - mu))
  return(-re)
}

## Method B: Fisher information matrix
imat <- function(y, X, paras) {
  inf.mat = matrix(0, nrow = length(paras), ncol = length(paras))
  delta = paras[length(paras)]
  eta <- as.vector(X %*% paras[-length(paras)])
  mu <- psn(as.vector(eta), alpha = delta)
  # mu[which(mu == 0)] <- min(mu[mu != 0])
  mu[which(mu == 0)] <- 1e-…
  # mu[which(mu == 1)] <- max(mu[mu != 1])
  mu[which(mu == 1)] <- 1 - 1e-…
  term0 = dnorm(eta) * pnorm(delta * eta)
  term1 = mu * (1 - mu)
  term2 = term0 * term0 / term1
  term3 = exp(-0.5 * eta^2 * (1 + delta^2))
  term4 = term3 * term3
  inf.mat.b <- 4 * t(term2 * X) %*% X
  inf.mat.bd <- -2 * colSums((term0 * term3 / term1) * X) / (pi * (1 + delta^2))
  inf.mat.d <- sum(term4 / term1) / (pi * (1 + delta^2))^2
  inf.mat[-length(paras), -length(paras)] <- inf.mat.b
  inf.mat[length(paras), length(paras)] <- inf.mat.d
  inf.mat[length(paras), -length(paras)] <- inf.mat.bd
  inf.mat[-length(paras), length(paras)] <- inf.mat.bd
  return(inf.mat)
}

## Method J: log-likelihood penalized by 0.5 * log(det(information matrix))
Jloglikp <- function(paras, X, y) {
  inf.mat = matrix(0, nrow = length(paras), ncol = length(paras))
  delta = paras[length(paras)]
  eta <- as.vector(X %*% paras[-length(paras)])
  mu <- psn(as.vector(eta), alpha = delta)
  # mu[which(mu == 0)] <- min(mu[mu != 0])
  mu[which(mu == 0)] <- 1e-…
  # mu[which(mu == 1)] <- max(mu[mu != 1])
  mu[which(mu == 1)] <- 1 - 1e-…
  term0 = dnorm(eta) * pnorm(delta * eta)
  term1 = mu * (1 - mu)
  term2 = term0 * term0 / term1
  term3 = exp(-0.5 * eta^2 * (1 + delta^2))
  term4 = term3 * term3
  inf.mat.b <- 4 * t(term2 * X) %*% X
  inf.mat.bd <- -2 * colSums((term0 * term3 / term1) * X) / (pi * (1 + delta^2))
  inf.mat.d <- sum(term4 / term1) / (pi * (1 + delta^2))^2
  inf.mat[-length(paras), -length(paras)] <- inf.mat.b
  inf.mat[length(paras), length(paras)] <- inf.mat.d
  inf.mat[length(paras), -length(paras)] <- inf.mat.bd
  inf.mat[-length(paras), length(paras)] <- inf.mat.bd
  if (det(inf.mat) < 0) qnty = 0 else qnty = 0.5 * log(det(inf.mat))
  #######
  re <- sum(y * log(mu) + (1 - y) * log(1 - mu)) + qnty
  return(-re)
}

## Method C: log-likelihood penalized by a Cauchy prior with scale 2.5
Cloglikp <- function(paras, X, y) {
  delta = paras[length(paras)]
  eta <- as.vector(X %*% paras[-length(paras)])
  mu <- psn(as.vector(eta), alpha = delta)
  mu[which(mu == 0)] <- min(mu[mu != 0])
  mu[which(mu == 1)] <- max(mu[mu != 1])
  re <- sum(y * log(mu) + (1 - y) * log(1 - mu)) - sum(log(1 + paras^2 / 2.5^2))
  return(-re)
}

## Method G: log-likelihood penalized with the generalized information matrix
GJloglikp <- function(paras, X, y) {
  inf.mat = matrix(0, nrow = length(paras), ncol = length(paras))
  delta = paras[length(paras)]
  eta <- as.vector(X %*% paras[-length(paras)])
  mu <- psn(as.vector(eta), alpha = delta)
  # mu[which(mu == 0)] <- min(mu[mu != 0])
  mu[which(mu == 0)] <- 1e-…
  # mu[which(mu == 1)] <- max(mu[mu != 1])
  mu[which(mu == 1)] <- 1 - 1e-…
  term0 = dnorm(eta) * pnorm(delta * eta)
  term1 = mu * (1 - mu)
  term2 = term0 * term0 / term1
  term3 = exp(-0.5 * eta^2 * (1 + delta^2))
  term4 = term3 * term3
  inf.mat.b <- 4 * t(term2 * X) %*% X
  inf.mat.bd <- -2 * colSums((term0 * term3 / term1) * X) / (pi * (1 + delta^2))
  inf.mat.d <- sum(term4 / term1) / (pi * (1 + delta^2))^2
  inf.mat[-length(paras), -length(paras)] <- inf.mat.b
  inf.mat[length(paras), length(paras)] <- inf.mat.d
  inf.mat[length(paras), -length(paras)] <- inf.mat.bd
  inf.mat[-length(paras), length(paras)] <- inf.mat.bd
  if (det(inf.mat) < 0) qnty = 0 else qnty = 0.5 * log(det(inf.mat))
  #######
  re <- sum(y * log(mu) + (1 - y) * log(1 - mu)) + qnty -
    as.numeric(0.5 * t(paras) %*% inf.mat %*% paras)
  return(-re)
}

###########################################################
## Data generation
set.seed(…)
n <- …
b0 <- …
b1 <- …
delta <- 4
x <- runif(n, -2, 2)
X <- cbind(1, x)
eta <- as.numeric(b0 + b1 * x)
p <- psn(eta, alpha = delta)
y <- rbinom(n, 1, p)

## Probit regression
PR <- glm(y ~ x, family = binomial(link = "probit"))

## Initial value for beta parameters
beta <- coef(PR)
delta <- runif(1, …, …)

## Method N
fit_naive <- ucminf(c(beta, delta), fn = loglik, X = X, y = y, hessian = 2)
Nest <- fit_naive$par
Nse <- sqrt(diag(fit_naive$invhessian))
## Columns: estimate, standard error, z value, two-sided p-value, 95% confidence limits
coef_naive <- cbind(Nest, Nse, Nest / Nse, 2 * (1 - pnorm(abs(Nest / Nse))),
                    Nest + qnorm(0.025) * Nse, Nest + qnorm(0.975) * Nse)

## Method B
store_boot <- matrix(0, nrow = …, ncol = 3)
k <- 0
total.boot <- 0
n <- nrow(X)
while (TRUE) {
  total.boot <- total.boot + 1
  cat(k, " ")
  if (!(k %% …)) cat("\n")
  idx.boot <- sample(1:n, n, replace = TRUE)
  beta.boot <- coef(glm(y[idx.boot] ~ x[idx.boot], family = binomial(link = "probit")))
  fit_boot <- ucminf(c(beta.boot, delta), fn = loglik, X = X[idx.boot, ],
                     y = y[idx.boot], hessian = …)
  k <- k + 1
  store_boot[k, ] <- fit_boot$par

  if (k == …) {
    cat("\n")
    break
  }
}
## Bias-corrected estimate: twice the naive estimate minus the bootstrap mean
Bmle <- 2 * Nest - apply(store_boot, 2, mean)
se <- sqrt(diag(solve(imat(y, X, Bmle))))
coef_BC <- cbind(Bmle, se, Bmle / se, 2 * (1 - pnorm(abs(Bmle / se))),
                 Bmle + qnorm(0.025) * se, Bmle + qnorm(0.975) * se)

## Method J
fit_Jeff <- ucminf(c(beta, delta), fn = Jloglikp, X = X, y = y, hessian = 2)
Jest <- fit_Jeff$par
Jse <- sqrt(diag(fit_Jeff$invhessian))
coef_Jeff <- cbind(Jest, Jse, Jest / Jse, 2 * (1 - pnorm(abs(Jest / Jse))),
                   Jest + qnorm(0.025) * Jse, Jest + qnorm(0.975) * Jse)

## Method G
fit_GJ <- ucminf(c(beta, delta), fn = GJloglikp, X = X, y = y, hessian = 2)
Gest <- fit_GJ$par
Gse <- sqrt(diag(fit_GJ$invhessian))
coef_GJ <- cbind(Gest, Gse, Gest / Gse, 2 * (1 - pnorm(abs(Gest / Gse))),
                 Gest + qnorm(0.025) * Gse, Gest + qnorm(0.975) * Gse)

## Method C
fit_Cauchy <- ucminf(c(beta, delta), fn = Cloglikp, X = X, y = y, hessian = 2)
Cest <- fit_Cauchy$par
Cse <- sqrt(diag(fit_Cauchy$invhessian))
coef_Cauchy <- cbind(Cest, Cse, Cest / Cse, 2 * (1 - pnorm(abs(Cest / Cse))),
                     Cest + qnorm(0.025) * Cse, Cest + qnorm(0.975) * Cse)
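The information matrix computed in imat rests on two first derivatives of the skew-normal distribution function F(η; δ): with respect to η it is 2φ(η)Φ(δη), and with respect to δ it is -exp(-η²(1 + δ²)/2) / (π(1 + δ²)). The sketch below checks both closed forms against central finite differences. It assumes the standard skew-normal density 2φ(t)Φ(δt) and evaluates its CDF with base R's integrate as a dependency-free stand-in for sn::psn. The names psn_base, dmu_deta, and dmu_ddelta, as well as the evaluation points and step size, are chosen here for illustration only.

```r
# Skew-normal CDF by numerical integration of the density 2 * phi(t) * Phi(delta * t).
# Base-R stand-in for sn::psn (an assumption made so the sketch has no dependencies).
psn_base <- function(eta, delta) {
  vapply(eta, function(u) {
    integrate(function(t) 2 * dnorm(t) * pnorm(delta * t),
              lower = -Inf, upper = u, rel.tol = 1e-8)$value
  }, numeric(1))
}

# Closed-form derivatives that appear in the information matrix
dmu_deta   <- function(eta, delta) 2 * dnorm(eta) * pnorm(delta * eta)
dmu_ddelta <- function(eta, delta) {
  -exp(-0.5 * eta^2 * (1 + delta^2)) / (pi * (1 + delta^2))
}

# Central finite differences at a few arbitrary points
eta   <- c(-1, -0.3, 0.2, 0.8)
delta <- 2
h     <- 1e-3
fd_eta   <- (psn_base(eta + h, delta) - psn_base(eta - h, delta)) / (2 * h)
fd_delta <- (psn_base(eta, delta + h) - psn_base(eta, delta - h)) / (2 * h)

# Largest absolute discrepancy between finite differences and closed forms
max_err_eta   <- max(abs(fd_eta - dmu_deta(eta, delta)))
max_err_delta <- max(abs(fd_delta - dmu_ddelta(eta, delta)))
print(c(max_err_eta = max_err_eta, max_err_delta = max_err_delta))
```

Both discrepancies should be small relative to the derivatives themselves, which supports the term0 and term3 expressions used to assemble the β-β, β-δ, and δ-δ blocks of the matrix.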

Table S.1: Mean and standard deviation of the computation time in seconds for simulation scenarios 5-8.

Scenario  n  Method: N  B  J  G  C
(.988) (33.67) (.827) (.44) (.29)
(4.564) ( ) (2.334) (2.736) (.722)
(7.68) (95.3) (3.74) (4.773) (.445)
(6.7) (92.6) (6.45) (8.845) (2.784)
(5.9) (793.9) (5.342) (2.62) (7.2)

(.837) (66.953) (.68) (.629) (.339)
(3.932) (47.77) (.525) (.35) (.725)
(4.772) (79.979) (2.872) (2.59) (.342)
(4.49) (882.93) (6.77) (4.72) (2.75)
(5.) (65.99) (3.735) (.9) (6.579)

(.86) (6.72) (.89) (.93) (.287)
(4.895) (479.68) (.742) (2.77) (.73)
(.38) (4.366) (3.44) (4.45) (.433)
(7.) (294.45) (6.338) (9.54) (2.892)
(23.27) ( ) (2.79) (25.326) (7.92)

(.76) (5.36) (.656) (.667) (.33)
(4.57) (49.2) (.5) (.456) (.743)
(9.3) (94.99) (2.822) (2.794) (.367)
(5.48) ( ) (5.274) (4.457) (2.564)
(5.35) (299.52) (2.9) (9.996) (5.95)

Table S.2: Mean and standard deviation of the computation time in seconds for simulation scenarios 9-12.

Scenario  n  Method: N  B  J  G  C
(.979) (57) (.822) (.66) (.39)
(4.62) ( ) (2.48) (3.98) (.827)
(7.449) ( ) (3.834) (6.379) (.535)
(7.352) ( ) (6.389) (4.482) (3.45)
(5.859) (64.78) (5.539) (43.728) (7.89)

(.877) (62.64) (.732) (.627) (.343)
(4.5) (536.72) (.657) (.27) (.8)
(6.56) ( ) (2.8) (.84) (.48)
(6.24) (85.355) (5.23) (3.5) (2.65)
(4.44) (687.28) (2.32) (9.474) (6.42)

(.874) (38.42) (.776) (.54) (.275)
(5.3) ( ) (.742) (3.36) (.84)
(9.738) (62.34) (3.378) (7.452) (.652)
(6.359) (275.74) (6.767) (6.474) (3.76)
(2.33) ( ) (7.52) (44.928) (7.5)

(.686) (36.53) (.674) (.68) (.334)
(4.567) (479.97) (.472) (.27) (.695)
(9.285) (63.782) (2.757) (2.257) (.38)
(7.852) ( ) (5.29) (3.438) (2.567)
(2.553) ( ) (3.34) (9.577) (6.59)

Table S.3: Mean and standard deviation of the computation time in seconds for simulation scenarios 13-16.

Scenario  n  Method: N  B  J  G  C
(2.324) (3.245) (4) (.57) (.352)
(6.347) (57.78) (2.493) (4.855) (6)
(.464) (278.7) (4.66) (.3) (.93)
(4.6) ( ) (9.357) (9.773) (3.862)
(7.435) (695.3) (9.27) (4.9) (9.27)

(2.292) (79.94) (.836) (5) (.427)
(5.72) (65.24) (.882) (.899) (.9)
(8.82) (6.2) (3.77) (2.727) (.762)
(9.888) (65.853) (7.289) (5.27) (3.396)
(6.42) (67.874) (5.38) (3.24) (7.959)

(2.94) (8.483) (3) (.47) (.348)
(6.88) ( ) (2.33) (4.565) (8)
(2.239) (32.6) (4.836) (8.758) (.995)
(24.92) ( ) (.88) (6.2) (3.97)
(43.29) ( ) (29.32) (34.232) (9.475)

(2.59) (58) (.85) (.3) (.47)
(5.874) (569.56) (.929) (2.373) (.9)
(2.847) (35.694) (3.835) (3.745) (.769)
(22.548) ( ) (8.367) (5.933) (3.58)
(33.355) ( ) (2.552) (3.49) (8.6)

Table S.4: The mean (standard deviation) of different estimators of the intercept parameter for scenario 6. The true value of β0 was .42.

Method (N, J, G, C)  n  Algorithms: Nelder-Mead  BFGS  L-BFGS-B  nlm  nlminb  ucminf  newuoa  bobyqa  nmkb
(.335) (.456) (.345) (.436) (.36) (.42) (.339) (.338) (.346)
(.55) (.53) (.56) (.75) (.62) (.67) (.57) (.56) (.55)
(.63) (.88) (.63) (.63) (.63) (.63) (.63) (.63) (.63)
(.4) (.4) (.4) (.4) (.4) (.4) (.4) (.4) (.4)
(4) (4) (4) (4) (4) (4) (4) (4) (4)

(.) (.53) (.) (.) (.) (.) (.) (.) (.28)
(.77) (.77) (.77) (.77) (.77) (.77) (.77) (.77) (.77)
(.55) (.289) (.55) (.55) (.55) (.55) (.55) (.55) (.55)
(.38) (.38) (.38) (.38) (.38) (.38) (.38) (.38) (.38)
(3) (3) (3) (3) (3) (3) (3) (3) (3)

(.392) (.292) (.43) (.4) (.387) (.365) (.365) (.372) (.48)
(.296) (.325) (.32) (.32) (.287) (.244) (.256) (.264) (.333)
(.89) (.96) (.23) (.23) () (.37) (.5) (.6) (.235)
(.76) (.95) (.82) (.83) (.83) (.75) (.62) (.62) (.5)
(.48) (.49) (.44) (.4) (.48) (.7) (.4) (.42) (.47)

(.238) (.24) (.238) (.238) (.238) (.238) (.238) (.238) (.238)
(.42) (.4) (.42) (.42) (.42) (.42) (.42) (.42) (.42)
(.68) (.5) (.68) (.68) (.68) (.68) (.68) (.68) (.68)
(.4) (.4) (.4) (.4) (.4) (.4) (.4) (.4) (.4)
(4) (4) (4) (4) (4) (4) (4) (4) (4)

Table S.5: The mean (standard deviation) of different estimators of the slope parameter for scenario 6. The true value of β1 was ….

Method (N, J, G, C)  n  Algorithms: Nelder-Mead  BFGS  L-BFGS-B  nlm  nlminb  ucminf  newuoa  bobyqa  nmkb
(.337) (.353) (.35) (.28) (.36) (.29) (.343) (.342) (.325)
(.83) (.82) (.83) (.77) (.87) (.78) (.82) (.82) (.8)
(.) (.29) (.) (.) (.99) (.) (.) (.99) (.)
(.68) (.68) (.68) (.68) (.68) (.68) (.68) (.68) (.68)
(.4) (.4) (.4) (.4) (.4) (.4) (.4) (.4) (.4)

() (.9) () () () () () () ()
(.26) (.26) (.26) (.26) (.25) (.26) (.26) (.26) (.26)
(.89) (.5) (.89) (.89) (.89) (.89) (.89) (.89) (.89)
(.65) (.65) (.65) (.65) (.65) (.65) (.65) (.65) (.65)
(.4) (.4) (.4) (.4) (.4) (.4) (.4) (.4) (.4)

(.578) (.47) (.59) (.586) (.562) (.53) (.553) (.559) (.622)
(.355) (.372) (.357) (.354) (.335) (.296) (.325) (.33) (.45)
(.224) (.67) (.235) (.232) (.227) () (.22) (.28) (.278)
(.24) (.29) (.27) (.27) (.26) (.23) (.22) (.2) (.4)
(.78) (.78) (.79) (.78) (.78) (.83) (.79) (.78) (.78)

(.293) (.292) (.293) (.293) (.293) (.293) (.293) (.293) (.293)
(.76) (.76) (.76) (.76) (.76) (.76) (.76) (.76) (.76)
(.) (.) (.) (.) (.) (.) (.) (.) (.)
(.68) (.68) (.68) (.68) (.68) (.68) (.68) (.68) (.68)
(.4) (.4) (.4) (.4) (.4) (.4) (.4) (.4) (.4)

Table S.6: The mean (standard deviation) of different estimators of the skewness parameter for scenario 6. The true value of δ was 4.

Method (N, J, G, C)  n  Algorithms: Nelder-Mead  BFGS  L-BFGS-B  nlm  nlminb  ucminf  newuoa  bobyqa  nmkb
(2663.5) (8.56) (378.) (6234.9) (988.2) (469.) (23.2) (3.3) (378.9)
(982.5) (7.77) (36.9) (422.) (94.4) (6757.5) (6.4) (8.53) (4749.7)
(82.) (.2) (85.6) (264.5) (65.) (3935.) (8.663) (4.659) (2983.8)
(29.6) (4.875) (24.) (39.3) (.847) (445.) (3.722) (2.277) (33.)
(.868) (.866) (.866) (.866) (.866) (.866) (.866) (.866) (.866)

(.9) (.778) (.9) (.9) (.9) (.9) (.9) (.9) (.27)
(.432) (.423) (.43) (.43) (.43) (.43) (.43) (.43) (.43)
(.484) (.922) (.483) (.483) (.483) (.483) (.483) (.483) (.483)
(.269) (.265) (.267) (.268) (.268) (.268) (.268) (.265) (.268)
(.77) (.77) (.769) (.769) (.769) (.769) (.769) (.769) (.769)

(2.38) (.97) (2.5) (2.434) (2.89) (.82) (2.49) (2.86) (2.864)
(.943) (.978) (.954) (.88) (.654) (.33) (.568) (.58) (2.65)
(6) (.72) (.62) (.48) (3) (.434) (.839) (.793) (.628)
(.3) (.23) (.83) (.85) (.66) (.43) (.8) (.5) (.48)
(.34) (.33) (5) (.5) (.33) (.3) (.7) (.7) (.34)

(.7) (.79) (.7) (.7) (.7) (.7) (.7) (.7) (.7)
(.349) (.346) (.35) (.35) (.35) (.35) (.35) (.35) (.35)
(.336) (.58) (.337) (.337) (.337) (.337) (.337) (.337) (.337)
(.8) (.75) (.78) (.79) (.79) (.79) (.79) (.78) (.79)
(.75) (.75) (.749) (.749) (.749) (.749) (.749) (.749) (.749)

Table S.7: The mean (standard deviation) of the computation time for different algorithms for scenario 6.

Method (N, J, G, C)  n  Algorithms: Nelder-Mead  BFGS  L-BFGS-B  nlm  nlminb  ucminf  newuoa  bobyqa  nmkb
(.745) (3.436) (2.849) (.757) (.556) (.882) (7.66) (6.78) (.35)
(3.74) (6.892) (5.968) (3.26) (2.72) (3.987) (5.64) (4.65) (2.95)
(5.32) (.699) (6.949) (3.73) (2.72) (4.799) (2.55) (23.8) (3.866)
(6.97) (9.634) (5.695) (3.37) (2.63) (4.428) (22.8) (32.25) (4.754)
(3.79) (8.64) (6.73) (2.94) (4.98) (5.254) (24.94) (48.748) (8.46)

(.252) (.269) (.68) (.47) (.37) (.68) (.727) (2.852) ()
(3.36) (2.66) (.694) (.82) (.787) (.458) (5.46) (.484) (.889)
(6.65) (5.532) (3.272) (.558) (.945) (2.768) (2.482) (24.22) (3.737)
(.485) (8.96) (5.685) (2.549) (3.742) (5.75) (23.96) (46.842) (7.7)
(25.89) (7.377) (.495) (5.65) (8.48) (2.89) (46.677) (88.544) (5.869)

(.337) (.688) () (.557) (.63) (.637) (3.9) (5.78) (.56)
(2.874) (3.9) (2.22) (4.439) (.78) (.36) (7.23) (.462) (2.732)
(4.766) (2.36) (3.47) (7.786) (2.9) (2.4) (7.742) (.265) (5.44)
(7.725) (8.) (7.349) (8.46) (4.3) (4.533) (3.82) (4.557) (9.834)
(4.596) (2.96) (6.258) (7.883) (.587) (.888) (9.852) (.32) (8.497)

(.747) (.24) (.42) (.229) (.256) (.343) (.88) (.338) (.499)
(.543) (.532) (.83) (.426) (.458) (.77) (2.392) (4.24) (.4)
(3.7) (4.72) (.536) (.74) (5) (.278) (5.52) (.548) (.93)
(5.769) (4.8) (2.952) (.249) (.887) (2.57) (.6) (22.363) (3.746)
(2.83) (8.667) (5.83) (2.885) (4.374) (5.98) (23.93) (43.749) (8.38)

Table S.8: The number of non-convergent datasets for different algorithms for scenario 6.

Method (N, J, G, C)  n  Algorithms: Nelder-Mead  BFGS  L-BFGS-B  nlm  nlminb  ucminf  newuoa  bobyqa  nmkb

[Boxplots of the intercept, slope, and skewness estimates for methods N, B, J, G, and C; one panel per sample size n.]

Figure S.1: Simulation results based on … replications when X ~ Normal(…, (√(4/3))²), δ = 4, β0 = .77, β1 = …, and p_m = 2%. The numbers in the boxplots are the empirical coverage probabilities for the nominal level based on the standard error derived from the Fisher information matrix. The horizontal line in each figure indicates the true value of the parameter. N: Naive MLE, B: Bootstrap bias correction, J: Penalized likelihood estimation with Jeffreys' prior, G: Penalized likelihood estimation with generalized information matrix, C: Penalized likelihood estimation with Cauchy distribution.

[Boxplots of the intercept, slope, and skewness estimates for methods N, B, J, G, and C; one panel per sample size n.]

Figure S.2: Simulation results based on … replications when X ~ Normal(…, (√(4/3))²), δ = 4, β0 = .42, β1 = …, and p_m = 4%. The numbers in the boxplots are the empirical coverage probabilities for the nominal level based on the standard error derived from the Fisher information matrix. The horizontal line in each figure indicates the true value of the parameter. N: Naive MLE, B: Bootstrap bias correction, J: Penalized likelihood estimation with Jeffreys' prior, G: Penalized likelihood estimation with generalized information matrix, C: Penalized likelihood estimation with Cauchy distribution.

[Boxplots of the intercept, slope, and skewness estimates for methods N, B, J, G, and C; one panel per sample size n.]

Figure S.3: Simulation results based on … replications when X ~ Normal(…, (√(4/3))²), δ = 8, β0 = .73, β1 = …, and p_m = 2%. The numbers in the boxplots are the empirical coverage probabilities for the nominal level based on the standard error derived from the Fisher information matrix. The horizontal line in each figure indicates the true value of the parameter. N: Naive MLE, B: Bootstrap bias correction, J: Penalized likelihood estimation with Jeffreys' prior, G: Penalized likelihood estimation with generalized information matrix, C: Penalized likelihood estimation with Cauchy distribution.

[Boxplots of the intercept, slope, and skewness estimates for methods N, B, J, G, and C; one panel per sample size n.]

Figure S.4: Simulation results based on … replications when X ~ Normal(…, (√(4/3))²), δ = 8, β0 = .44, β1 = …, and p_m = 4%. The numbers in the boxplots are the empirical coverage probabilities for the nominal level based on the standard error derived from the Fisher information matrix. The horizontal line in each figure indicates the true value of the parameter. N: Naive MLE, B: Bootstrap bias correction, J: Penalized likelihood estimation with Jeffreys' prior, G: Penalized likelihood estimation with generalized information matrix, C: Penalized likelihood estimation with Cauchy distribution.

[Boxplots of the intercept, slope, and skewness estimates for methods N, B, J, G, and C; one panel per sample size n.]

Figure S.5: Simulation results based on … replications when X ~ .5 Normal(…, (√(…/3))²) + .5 Normal(…, (√(…/3))²), δ = 4, β0 = .85, β1 = …, and p_m = 2%. The numbers in the boxplots are the empirical coverage probabilities for the nominal level based on the standard error derived from the Fisher information matrix. The horizontal line in each figure indicates the true value of the parameter. N: Naive MLE, B: Bootstrap bias correction, J: Penalized likelihood estimation with Jeffreys' prior, G: Penalized likelihood estimation with generalized information matrix, C: Penalized likelihood estimation with Cauchy distribution.

[Boxplots of the intercept, slope, and skewness estimates for methods N, B, J, G, and C; one panel per sample size n.]

Figure S.6: Simulation results based on … replications when X ~ .5 Normal(…, (√(…/3))²) + .5 Normal(…, (√(…/3))²), δ = 4, β0 = .35, β1 = …, and p_m = 4%. The numbers in the boxplots are the empirical coverage probabilities for the nominal level based on the standard error derived from the Fisher information matrix. The horizontal line in each figure indicates the true value of the parameter. N: Naive MLE, B: Bootstrap bias correction, J: Penalized likelihood estimation with Jeffreys' prior, G: Penalized likelihood estimation with generalized information matrix, C: Penalized likelihood estimation with Cauchy distribution.

[Boxplots of the intercept, slope, and skewness estimates for methods N, B, J, G, and C; one panel per sample size n.]

Figure S.7: Simulation results based on … replications when X ~ .5 Normal(…, (√(…/3))²) + .5 Normal(…, (√(…/3))²), δ = 8, β0 = .82, β1 = …, and p_m = 2%. The numbers in the boxplots are the empirical coverage probabilities for the nominal level based on the standard error derived from the Fisher information matrix. The horizontal line in each figure indicates the true value of the parameter. N: Naive MLE, B: Bootstrap bias correction, J: Penalized likelihood estimation with Jeffreys' prior, G: Penalized likelihood estimation with generalized information matrix, C: Penalized likelihood estimation with Cauchy distribution.

[Boxplots of the intercept, slope, and skewness estimates for methods N, B, J, G, and C; one panel per sample size n.]

Figure S.8: Simulation results based on … replications when X ~ .5 Normal(…, (√(…/3))²) + .5 Normal(…, (√(…/3))²), δ = 8, β0 = .37, β1 = …, and p_m = 4%. The numbers in the boxplots are the empirical coverage probabilities for the nominal level based on the standard error derived from the Fisher information matrix. The horizontal line in each figure indicates the true value of the parameter. N: Naive MLE, B: Bootstrap bias correction, J: Penalized likelihood estimation with Jeffreys' prior, G: Penalized likelihood estimation with generalized information matrix, C: Penalized likelihood estimation with Cauchy distribution.
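The empirical coverage probabilities quoted in the figure captions are the proportion of replications in which a Wald interval, built from the estimate and its Fisher-information standard error, contains the true parameter value. The sketch below shows that computation under simplifying assumptions: it uses an ordinary probit model fitted with glm instead of the skew-probit methods, and the seed, number of replications, sample size, true coefficients, and the 0.95 level are all arbitrary choices made here for illustration.

```r
set.seed(2)                 # arbitrary seed
n_rep  <- 200               # number of replications (illustrative)
n      <- 200               # sample size (illustrative)
b_true <- c(0.5, 1)         # true intercept and slope (illustrative)
level  <- 0.95              # nominal level assumed for this sketch
z      <- qnorm(1 - (1 - level) / 2)

covered <- matrix(NA, nrow = n_rep, ncol = 2)
for (r in seq_len(n_rep)) {
  x <- runif(n, -2, 2)
  y <- rbinom(n, 1, pnorm(b_true[1] + b_true[2] * x))
  fit <- glm(y ~ x, family = binomial(link = "probit"))
  est <- coef(fit)
  se  <- sqrt(diag(vcov(fit)))            # from the estimated information matrix
  # TRUE when the Wald interval est +/- z * se contains the true value
  covered[r, ] <- (est - z * se <= b_true) & (b_true <= est + z * se)
}

# Empirical coverage probability for the intercept and the slope
coverage <- colMeans(covered)
print(coverage)
```

With a well-behaved estimator the two proportions should be close to the nominal level. Coverage well below it signals biased estimates or understated standard errors.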
