
Multilevel Models Project Working Paper, December 1998

Consistent estimators for multilevel generalised linear models using an iterated bootstrap

Harvey Goldstein (hgoldstn@ioe.ac.uk)

Introduction

Several papers have addressed the issue of the parameter biases which can occur when fitting multilevel models with non-Normal responses. Breslow and Clayton (1993) discuss various fitting procedures, including those based upon linearising transformations, maximum likelihood, and Bayesian estimation using MCMC. Direct maximum likelihood or restricted maximum likelihood, while feasible for simple models, quickly becomes intractable as the number of random effects increases. MCMC via Gibbs sampling is an attractive alternative, but the choice of prior distribution for the random parameters is important and there are difficulties in choosing diffuse or uninformative priors (Browne, 1998). Approximate methods based upon linearising transformations and quasilikelihood estimation are attractive, since they pose no serious computational problems and can be fitted using modifications to existing multilevel software packages.

Rodriguez and Goldman (1995) illustrate how severe underestimation can occur in a simple variance components model with binary responses, especially for the level 2 variance; they use a first order MQL method (Goldstein, 1991). Goldstein (1995) and Goldstein and Rasbash (1996) develop improved linearising approximations and show that, for models with adequate numbers of level 1 units per level 2 unit, these give satisfactory results. Nevertheless, where the number of level 1 units per level 2 unit is small, and for binary responses as in the Rodriguez-Goldman data sets, there is still some underestimation. In this paper we set out a procedure (Kuk, 1995) which yields asymptotically unbiased and consistent estimates for such models and which can be applied in general to any kind of non-linear multilevel model.

Iterative bootstrap (IB) bias correction

We shall illustrate the procedure with a simple 2-level variance components model, as follows:

    logit(\pi_{ij}) = \beta_0 + \beta_1 x_{ij} + u_j,
    u_j \sim N(0, \sigma_u^2),
    y_{ij} \sim Binomial(1, \pi_{ij}).

Given a set of initial estimates, obtained using for example the first order MQL approximation,

    \hat{\sigma}_u^{2(0)}, \hat{\beta}_0^{(0)}, \hat{\beta}_1^{(0)},    (1)

we generate a set of bootstrap samples from the model using the estimates (1), and averaging over these we obtain the set of bootstrap estimates

    \tilde{\sigma}_u^{2(0)}, \tilde{\beta}_0^{(0)}, \tilde{\beta}_1^{(0)}.    (2)

We now obtain the bootstrap estimate of the bias by subtracting (2) from (1). These bias estimates are added to the initial parameter estimates (1) as a first adjustment, giving the new bias-corrected estimates

    \hat{\sigma}_u^{2(1)}, \hat{\beta}_0^{(1)}, \hat{\beta}_1^{(1)}.    (3)

We then generate a new set of bootstrap samples from the model based upon the estimates given by (3), subtract the new mean bootstrap parameter estimates from (3) to obtain updated bias estimates, and add these to the initial estimates (1) to obtain a new set of bias-corrected estimates. Kuk (1995) demonstrates that, when it converges, this procedure gives asymptotically consistent and unbiased parameter estimates. In the present case the bootstrap samples have been generated parametrically, by sampling from the distributions with estimated parameters: from a Normal distribution for the level 2 residuals and from a binomial distribution (with denominator one) at level 1.

The procedure relies upon the assumed model structure correctly representing the data hierarchy. In some cases this may not hold, for example if an important level is omitted; the procedure therefore does not protect against such forms of model misspecification. An important case is that of discrete response models where we may have, say, extra-binomial variation. In such cases the procedure can give different solutions depending on which estimation method is used.

Care needs to be taken with small variance estimates. To estimate the bias we need to allow negative estimates of variances. If an initial estimate is zero then, clearly, resetting negative bootstrap sample means to zero implies that the bias estimate will never be negative, so the updated estimate will remain at zero. Moreover, as confirmed by simulations, all the estimates will exhibit a downward bias if negative bootstrap means are reset to zero. We also note that where an unbiased variance estimate is close to zero the bias is in any case small, so that full bias correction is less important and, for example, a second order PQL estimate may be adequate (see below).

The bootstrap replicates from the final bootstrap set will generally have too small a variance and so cannot be used directly for inference. If we knew the functional relationship between the bias-corrected value and the biased value, this could be used to transform each of the bootstrap replicate estimates, and the transformed values then used for inference. We discuss a procedure for doing this below. In MLwiN version 1.0 the procedure is to use a scaling factor for each parameter, calculated as follows: for each parameter in turn we take the ratio of the final bias-corrected estimate to the final bootstrap replicate mean, and multiply all the final replicate parameter values by this ratio. These scaled values are used to construct approximately correct standard errors and quantile estimates.
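To make the recursion concrete, the following is a minimal Python sketch of the iterated bootstrap for the model above. The estimator crude_fit is a hypothetical stand-in (a pooled logistic fit plus a moment estimator for the level 2 variance); in a real analysis the fits would come from first order MQL in a multilevel package, and the number of bootstrap samples per iteration would be governed by the running-mean rule (4) described in the next section rather than being fixed. Note that the variance component is deliberately allowed to go negative during the recursion, as discussed above, and is clipped only when data are simulated.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta, x):
    """Parametric bootstrap draw from the 2-level logit model:
    u_j ~ N(0, sigma_u^2), y_ij ~ Binomial(1, pi_ij)."""
    s2, b0, b1 = theta
    u = rng.normal(0.0, np.sqrt(max(s2, 0.0)), size=x.shape[0])  # clip variance only here
    p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x + u[:, None])))
    return rng.binomial(1, p)

def crude_fit(y, x):
    """Hypothetical stand-in estimator: pooled logistic regression for the fixed
    effects (ignoring clustering) plus a moment estimator for sigma_u^2 from the
    group means of working residuals.  It is biased, which is what the IB corrects."""
    X = np.column_stack([np.ones(x.size), x.ravel()])
    yf = y.ravel().astype(float)
    beta = np.zeros(2)
    for _ in range(25):                       # Newton-Raphson for the logistic fit
        p = np.clip(1.0 / (1.0 + np.exp(-X @ beta)), 1e-9, 1 - 1e-9)
        beta += np.linalg.solve(X.T @ ((p * (1 - p))[:, None] * X), X.T @ (yf - p))
    p = np.clip(1.0 / (1.0 + np.exp(-X @ beta)), 1e-9, 1 - 1e-9).reshape(y.shape)
    z = (y - p) / (p * (1 - p))               # working residuals on the logit scale
    s2 = z.mean(axis=1).var(ddof=1) - (1.0 / (p * (1 - p))).mean() / y.shape[1]
    return np.array([s2, beta[0], beta[1]])   # s2 may be negative: deliberately kept

def iterated_bootstrap(theta_hat, fit, x, n_boot=50, max_iter=30, tol=1e-3):
    """Kuk's recursion: corrected = initial estimates (1)
    + (current corrected values - mean of bootstrap fits generated from them)."""
    theta = theta_hat.copy()
    for _ in range(max_iter):
        boot_mean = np.mean([fit(simulate(theta, x), x) for _ in range(n_boot)], axis=0)
        theta_new = theta_hat + (theta - boot_mean)
        if np.max(np.abs(theta_new - theta)) < tol:
            break
        theta = theta_new
    return theta_new

# Illustration with an assumed design (larger groups than in the paper's simulation):
J, n = 200, 10
x = rng.normal(size=(J, n))
y = simulate(np.array([1.0, 1.0, 1.0]), x)    # true sigma_u^2 = beta_0 = beta_1 = 1
theta0 = crude_fit(y, x)
print("initial:", theta0, "IB corrected:", iterated_bootstrap(theta0, crude_fit, x))
```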

A simulation

We simulate 100 replications of the model above for a binary (0,1) response, with all three parameters equal to 1.0, with 50 level 2 units and 2 level 1 units per level 2 unit. This is a rather extreme case in which we would expect serious underestimation of the parameters.

To decide how many bootstrap samples we need at each iteration of the procedure, we keep a running mean such that when, at the t-th bootstrap sample, the running means \bar{\theta}_t, \bar{\theta}_{t-1}, \bar{\theta}_{t-2} satisfy

    |\bar{\theta}_t - \bar{\theta}_{t-1}| < \epsilon   and   |\bar{\theta}_{t-1} - \bar{\theta}_{t-2}| < \epsilon,    (4)

we accept convergence. We have chosen the value of \epsilon as 0.001 and set a minimum number of samples of 10. We note, in passing, that the device of maintaining a suitable running statistic to judge convergence is applicable to bootstrap sampling whenever attention is focused on other functions of the parameters, for example a standard deviation or a percentile estimate.

We then need a criterion for judging convergence of the bootstrap bias-corrected estimates. In an application, convergence needs to be monitored closely, especially for small values of the random parameters. We finally adopted the following criteria for the simulations. We compute the average of the current and previous two estimates, say \bar{\theta}_t, and the average of the three estimates prior to these, say \bar{\theta}_{t-1}, and judge convergence as follows:

    |\bar{\theta}_t - \bar{\theta}_{t-1}| / \bar{\theta}_t < 0.02    if \bar{\theta}_t \geq 0.25,
    |\bar{\theta}_t - \bar{\theta}_{t-1}| < 0.005                    if \bar{\theta}_t < 0.25.    (5)

For small estimated values convergence is often slow, and an absolute rather than a relative criterion seems appropriate.
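As a sketch, these two stopping rules might be coded as follows (function names are ours; draws holds successive bootstrap values of one statistic, and history holds successive bias-corrected estimates of one parameter):

```python
import numpy as np

def enough_bootstraps(draws, eps=1e-3, min_samples=10):
    """Rule (4): stop drawing bootstrap samples once the running mean of the
    statistic has changed by less than eps over each of the last two samples."""
    t = len(draws)
    if t < max(min_samples, 3):
        return False
    means = np.cumsum(draws) / np.arange(1, t + 1)
    return abs(means[-1] - means[-2]) < eps and abs(means[-2] - means[-3]) < eps

def ib_converged(history):
    """Rule (5): compare the mean of the last three IB estimates with the mean of
    the three before; relative tolerance 0.02 above 0.25, absolute 0.005 below."""
    if len(history) < 6:
        return False
    cur, prev = np.mean(history[-3:]), np.mean(history[-6:-3])
    if cur >= 0.25:
        return abs(cur - prev) / cur < 0.02
    return abs(cur - prev) < 0.005
```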

The mean number of iterations required was 13.8 and the mean number of bootstrap samples per iteration was 80.5. The basic results are given in Table 1. We have used the standard deviation rather than the variance for reporting means, since the distribution of the latter is more skewed.

Table 1. Simulation results: initial MQL/PQL estimates and iterated bootstrap (IB) estimates, with standard errors in parentheses. The IB correction was applied to the first order MQL (IGLS) estimates.

                          Level 2 s.d.               Intercept                  Slope
                          Initial      IB            Initial      IB            Initial      IB
  1st order MQL (IGLS)    0.49 (0.03)  0.98 (0.06)   0.89 (0.03)  1.05 (0.04)   0.91 (0.03)  1.07 (0.04)
  1st order PQL (RIGLS)   0.49 (0.04)  -             0.88 (0.03)  -             0.88 (0.03)  -
  2nd order PQL (IGLS)    0.84 (0.06)  -             1.03 (0.04)  -             1.02 (0.03)  -
  2nd order PQL (RIGLS)   0.93 (0.07)  -             1.07 (0.04)  -             1.10 (0.04)  -

The standard errors are computed over the simulation replications. It is clear that the serious underestimation of all the parameters has been eliminated: the final estimates are unbiased within the limits of sampling error. The initial second order PQL estimates of the fixed parameters using Iterative Generalised Least Squares (IGLS, which is maximum likelihood in the multivariate Normal case) in fact show no bias, but there is underestimation of the standard deviation. With Restricted Iterative Generalised Least Squares (RIGLS, which is restricted maximum likelihood in the multivariate Normal case) the variance estimate is less biased, although there appears to be a slight overestimation of the slope parameter. Interestingly, the first order PQL (RIGLS) estimates are no better than the first order MQL (IGLS) estimates, which suggests that second order PQL estimates should be used where possible for exploratory purposes. We also notice that the ratios of the standard errors for the IB and first order MQL estimates are approximately the same as the ratios of the parameter estimates, lending support to the scaling procedure suggested above.

It would of course be possible to start with the second order PQL estimates and use that estimation procedure for the bootstrapping. One difficulty is that each estimation then takes rather longer, which will usually be an important consideration. Secondly, in some cases (5% in the present case) the second order procedure fails to converge, whereas the first order one almost always does. We note, however, that discarding those replicates where convergence fails does not invalidate the IB procedure.

At convergence we generate a final sequence of bootstrap samples to provide estimates of precision, confidence intervals, etc. The number of samples required for such purposes will generally be larger than the number used in the updating, but as pointed out above we can use a running statistic for judging convergence at any prespecified accuracy.
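A sketch of the scaling device described earlier, applied to this final replicate set (the helper name is ours; replicates is a replicates-by-parameters array):

```python
import numpy as np

def scale_replicates(replicates, theta_ib):
    """Scale each parameter's final bootstrap replicates by the ratio of the
    bias-corrected estimate to the replicate mean; approximately correct standard
    errors and quantiles are then read off the scaled values."""
    scaled = replicates * (theta_ib / replicates.mean(axis=0))
    se = scaled.std(axis=0, ddof=1)
    lo, hi = np.percentile(scaled, [2.5, 97.5], axis=0)
    return se, (lo, hi)
```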

Figure 1 shows the relationship between the final and initial estimates and illustrates how substantial adjustments can be made when the initial estimates are moderately large.

Figure 1. Final iterative bootstrap estimate of the level 2 standard deviation by initial estimate. The value plotted at an initial estimate of zero is the mean over the 22 such values.

Interval estimation

Once convergence has been achieved, a final group of replicates can be produced as the basis for inference. As pointed out above, however, these will generally have too little variation. One solution would be to take every replicate set and use the IB to produce bias-corrected estimates, which could then be used directly for inference. This procedure, however, is too computationally intensive to be practical in most circumstances. Note that we cannot simply bias-correct selected percentiles, since the rank orders will differ among the parameters.

An alternative procedure, which applies only to the random parameters, is as follows. For each replicate in the final group we will have simulated a set of residuals from the assumed underlying multivariate Normal distribution. Using these generated residuals we can obtain the empirical covariance matrix at each level of the model. Each element of this matrix (termed a generated parameter) corresponds to a random parameter estimate for the replicate, and we use the relationship between these two sets for our functional transformation. We note that this also allows us to establish functional relationships for any function of the random parameters. A suitable smoothing curve, such as a cubic spline, relating the generated parameters to the estimated parameters is then required; by making the replicate set large enough we can obtain any required accuracy.
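A minimal sketch of this transformation for a single random parameter, assuming SciPy's smoothing spline is an acceptable choice of curve (the function name is ours):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def transform_replicates(estimated, generated):
    """Smooth the generated parameters (empirical values computed from each
    replicate's simulated residuals) against the corresponding replicate
    estimates, then map each estimate to an approximately unbiased value."""
    order = np.argsort(estimated)
    curve = UnivariateSpline(estimated[order], generated[order], k=3)
    return curve(estimated)    # transformed replicate values, used for inference
```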

This procedure does not deal with the fixed parameters. Here, however, the simple scaling procedure may be adequate, and the second order PQL estimates are typically almost unbiased.

The same device can also be used to speed up the iterations, giving an accelerated iterated bootstrap. Consider the first replicate set. For a given parameter, if the distribution of the estimates covers the initial sample estimate \hat{\theta}^{(0)}, then the relationship between the generated parameter, as response, and the estimate obtained at that replicate allows us to obtain a predicted unbiased estimate. If this is not the case at the first iteration, we continue until it occurs. Using this estimate of the parameter we then iterate over a few further replicate sets to obtain an accurate unbiased estimate. From the final replicate set we then obtain the relationship to be used for inference.

Conclusions

The procedure outlined here is quite general and can be applied to any non-linear multilevel model. As mentioned above, it will usually not be necessary where there are sufficient level 1 units per level 2 unit. In practice, where the number of such units is small, a useful strategy is to base model exploration on the second order PQL (RIGLS) estimates and then to compute final bias-corrected estimates using the first order MQL, as here. In many cases, however, the second order PQL (RIGLS) estimates will be perfectly adequate.

Criteria are required for judging convergence and for the number of bootstrap samples, and the optimum criteria will generally depend on the data themselves; further work on this would be useful. The bias-correction procedure may not always converge, or convergence may be extremely slow. With MQL estimation neither of these problems has been encountered, but they seem more likely to occur with PQL estimation, which is a further reason for preferring the former to the latter.

References

Breslow, N.E. and Clayton, D.G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88, 9-25.

Goldstein, H. (1991). Nonlinear multilevel models, with an application to discrete response data. Biometrika, 78, 45-51.

Goldstein, H. (1995). Multilevel Statistical Models. London, Edward Arnold; New York, Halstead Press.

Goldstein, H. and Rasbash, J. (1996). Improved approximations for multilevel models with binary responses. Journal of the Royal Statistical Society, Series A, 159, 505-513.

Kuk, A.Y.C. (1995). Asymptotically unbiased estimation in generalised linear models with random effects. Journal of the Royal Statistical Society, Series B, 57, 395-407.

Rodriguez, G. and Goldman, N. (1995). An assessment of estimation procedures for multilevel models with binary responses. Journal of the Royal Statistical Society, Series A, 158, 73-89.