Generalized Multilevel Regression Example for a Binary Outcome

Alannah Hines
6 years ago
Psy 510/610 Multilevel Regression, Spring

HLM

Specifications for this Bernoulli HLM2 run

Problem Title: no title
The data source for this run = C:\jason\HLM\SQMS\school binary.mdm
The command file for this run = C:\jason\HLM\SQMS\school binary 3.hlm
Output file name = C:\jason\HLM\SQMS\hlm2.html
The maximum number of level-1 units = 2184
The maximum number of level-2 units = 28
The maximum number of micro iterations = 14
Method of estimation: full maximum likelihood via EM-Laplace 2 via adaptive Gaussian quadrature
Maximum number of macro iterations = 100
Number of adaptive Gaussian quadrature points = 10
Distribution at Level-1: Bernoulli
The outcome variable is GUN

Summary of the model specified

Level-1 Model
    Prob(GUNij=1|βj) = ϕij
    log[ϕij/(1 - ϕij)] = ηij
    ηij = β0j + β1j*(raceij) + β2j*(erosionij)

Level-2 Model
    β0j = γ00 + γ01*(dropoutj) + u0j
    β1j = γ10
    β2j = γ20

RACE EROSION have been centered around the grand mean.
DROPOUT has been centered around the grand mean.

Level-1 variance = 1/[ϕij(1-ϕij)]

Mixed Model
    ηij = γ00 + γ01*dropoutj + γ10*raceij + γ20*erosionij + u0j

Results for Non-linear Model with the Logit Link Function

Unit-Specific Model, PQL Estimation - (macro iteration 12)

τ
INTRCPT1,β0

Standard error of τ
INTRCPT1,β0

Random level-1 coefficient    Reliability estimate
INTRCPT1,β0

The value of the log-likelihood function at iteration 2 = E+003

Final estimation of fixed effects: (Unit-specific model)

Fixed Effect         p-value
INTRCPT2, γ00        <0.001
DROPOUT, γ01
INTRCPT2, γ10
INTRCPT2, γ20        <0.001

Fixed Effect         Odds Ratio    Confidence Interval

INTRCPT2, γ00                      (0.032,0.063)
DROPOUT, γ01                       (0.911,1.140)
INTRCPT2, γ10                      (0.842,2.264)
INTRCPT2, γ20                      (1.681,2.972)

Final estimation of fixed effects (Unit-specific model with robust standard errors)

Fixed Effect         p-value
INTRCPT2, γ00        <0.001
DROPOUT, γ01
INTRCPT2, γ10
INTRCPT2, γ20        <0.001

Fixed Effect         Odds Ratio    Confidence Interval
INTRCPT2, γ00                      (0.033,0.061)
DROPOUT, γ01                       (0.885,1.175)
INTRCPT2, γ10                      (0.930,2.052)
INTRCPT2, γ20                      (1.689,2.957)

Final estimation of variance components

Random Effect    Standard Deviation    Variance Component    χ2    p-value
INTRCPT1, u0                                                       <0.001

Results for Unit-Specific Model, EM Laplace-2 Estimation Iteration 29

τ
INTRCPT1,β0

Standard error of τ
INTRCPT1,β0

Random level-1 coefficient    Reliability estimate
INTRCPT1,β0

The log-likelihood at EM Laplace-2 iteration 29 is E+003

Final estimation of fixed effects (Unit-specific model)

Fixed Effect         p-value
INTRCPT2, γ00        <0.001
DROPOUT, γ01
INTRCPT2, γ10
INTRCPT2, γ20        <0.001

Fixed Effect         Odds Ratio    Confidence Interval
INTRCPT2, γ00                      (0.028,0.062)
DROPOUT, γ01                       (0.918,1.137)
INTRCPT2, γ10                      (0.626,3.080)
INTRCPT2, γ20                      (1.488,3.396)
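The odds ratios and confidence intervals in the HLM tables are back-transformations of the log-odds coefficients. The following is a minimal R sketch of that conversion, assuming a Wald-type interval with a normal critical value (HLM may use a t-based critical value, so its limits can differ slightly). The coefficient `b` and standard error `se` are hypothetical values chosen for illustration, not estimates from this output.

```r
# Hypothetical log-odds coefficient and standard error
# (illustrative values only, NOT taken from the HLM output)
b  <- 0.50
se <- 0.20

# The odds ratio is the exponentiated log-odds coefficient
odds_ratio <- exp(b)

# Wald-type 95% interval: build it on the log-odds scale, then exponentiate.
# Assumption: normal critical value; a t-based value would widen it slightly.
crit <- qnorm(0.975)
ci   <- exp(b + c(-1, 1) * crit * se)

print(odds_ratio)
print(ci)
```

Because the interval is built on the log-odds scale, it is not symmetric around the odds ratio once exponentiated; an interval that includes 1 corresponds to a nonsignificant coefficient.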

Statistics for the current model

Deviance =
Number of estimated parameters = 5

Results For Unit-Specific Model, Adaptive Gaussian Quadrature iteration 2

τ
INTRCPT1,β0

Standard error of τ
INTRCPT1,β0

Final estimation of fixed effects (Unit-specific model)

Fixed Effect         p-value
INTRCPT2, γ00        <0.001
DROPOUT, γ01
INTRCPT2, γ10
INTRCPT2, γ20        <0.001

Statistics for the current model

Deviance =
Number of estimated parameters = 5

Results for Population-Average Model

The value of the log-likelihood function at iteration 3 = E+003

Final estimation of fixed effects: (Population-average model)

Fixed Effect         p-value
INTRCPT2, γ00        <0.001
DROPOUT, γ01
INTRCPT2, γ10
INTRCPT2, γ20        <0.001

Fixed Effect         Odds Ratio    Confidence Interval
INTRCPT2, γ00                      (0.034,0.066)
DROPOUT, γ01                       (0.911,1.146)
INTRCPT2, γ10                      (0.859,2.185)
INTRCPT2, γ20                      (1.689,2.907)

Final estimation of fixed effects (Population-average model with robust standard errors)

Fixed Effect         p-value
INTRCPT2, γ00        <0.001
DROPOUT, γ01
INTRCPT2, γ10
INTRCPT2, γ20        <0.001

Fixed Effect         Odds Ratio    Confidence Interval
INTRCPT2, γ00                      (0.035,0.064)
DROPOUT, γ01                       (0.875,1.193)
INTRCPT2, γ10                      (0.960,1.956)

INTRCPT2, γ20                      (1.715,2.863)

The predicted probability can be computed from the results. I use the population-average results to more accurately estimate the proportion of students in the population who have carried a gun. The following formula can be used when the predictors are mean centered (or when testing the intercept-only model) and the proportion is desired for the case in which all predictors equal their means:

    ϕij = 1 / (1 + e^(-ηij))

where ηij is the predicted log odds given the regression, ηij = β0 + β1(DROPOUT) + β2(RACE) + β3(EROSION). ηij is easy to calculate if all predictors are 0 (i.e., equal to their means when centered), because ηij is then simply equal to β0:

    ϕ = 1 / (1 + e^(-(-3.045))) = .045

Thus, approximately 4.5% of students in the population are expected to report carrying a gun in the previous year.

R

> library(lme4)
> # convert the outcome to numeric codes and shift them to 0/1
> mydata$gun <- as.numeric(mydata$gun)
> mydata$gun <- mydata$gun - 1
> mydata$race <- as.numeric(mydata$race)

NOTE: Use listwise deletion to make sure centering is based on the same number of cases as used in the model.

> mydata <- subset(mydata, !is.na(gun) & !is.na(race) & !is.na(erosion) & !is.na(dropout))
> # grand-mean center the predictors
> mydata$race <- mydata$race - mean(mydata$race)
> mydata$erosion <- mydata$erosion - mean(mydata$erosion)
> mydata$dropout <- mydata$dropout - mean(mydata$dropout)
> View(mydata)
> rm(model1)
> #Laplace approximation
> model1 <- glmer(gun ~ race + erosion + dropout + (1 | schnum), family = binomial, data = mydata)
> summary(model1)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: binomial  ( logit )
Formula: gun ~ race + erosion + dropout + (1 | schnum)
   Data: mydata

     AIC      BIC   logLik deviance df.resid

Scaled residuals:
    Min      1Q  Median      3Q     Max

Random effects:
 Groups Name        Variance Std.Dev.
 schnum (Intercept)
Number of obs: 2274, groups:  schnum, 28


Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)                              <
race
erosion
dropout

Correlation of Fixed Effects:
        (Intr) race   erosin
race
erosion
dropout

> #adaptive quadrature, 15 integration points
> model2 <- glmer(gun ~ race + erosion + dropout + (1 | schnum), family = binomial, nAGQ = 15, data = mydata)
> summary(model2)
Generalized linear mixed model fit by maximum likelihood (Adaptive Gauss-Hermite Quadrature, nAGQ = 15) ['glmerMod']
 Family: binomial  ( logit )
Formula: gun ~ race + erosion + dropout + (1 | schnum)
   Data: mydata

     AIC      BIC   logLik deviance df.resid

Scaled residuals:
    Min      1Q  Median      3Q     Max

Random effects:
 Groups Name        Variance Std.Dev.
 schnum (Intercept)
Number of obs: 2274, groups:  schnum, 28

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)                              <
race
erosion
dropout

Correlation of Fixed Effects:
        (Intr) race   erosin
race
erosion
dropout

Note: Special code (0 + slopevar) is needed to suppress the additional random intercept that each random-effects term generates by default, which seems atypical to me. Also, adaptive quadrature (nAGQ greater than 1) does not appear to be available for models with random slopes.

> #to have more than one random slope and a single intercept, use something like
> model1 <- glmer(gun ~ race + erosion + dropout + (race | schnum) + (0 + erosion | schnum), family = binomial, data = mydata)

> summary(model1)
Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
 Family: binomial  ( logit )
Formula: gun ~ race + erosion + dropout + (race | schnum) + (0 + erosion | schnum)
   Data: mydata

     AIC      BIC   logLik deviance df.resid

Scaled residuals:
    Min      1Q  Median      3Q     Max

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 schnum   (Intercept)
          race
 schnum.1 erosion
Number of obs: 2274, groups:  schnum, 28

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)                              <
race
erosion
dropout

Correlation of Fixed Effects:
        (Intr) race   erosin
race
erosion
dropout

SPSS

Because SPSS only provides PQL estimates, which I do not recommend using when other methods are available, I do not present a full example here. Below is syntax, however, for estimating a binary model.

GENLINMIXED
  /DATA_STRUCTURE SUBJECTS=schnum
  /FIELDS TARGET=gun
  /TARGET_OPTIONS DISTRIBUTION=BINOMIAL LINK=LOGIT
  /FIXED EFFECTS=race cerosion cdropout USE_INTERCEPT=TRUE
  /RANDOM USE_INTERCEPT=TRUE SUBJECTS=schnum COVARIANCE_TYPE=VARIANCE_COMPONENTS
  /BUILD_OPTIONS TARGET_CATEGORY_ORDER=DESCENDING INPUTS_CATEGORY_ORDER=DESCENDING MAX_ITERATIONS=1500 CONFIDENCE_LEVEL=95 DF_METHOD=SATTERTHWAITE.

The "TARGET" is the outcome and the "INPUTS" are the predictors. The SUBJECTS variable is the group designation. TARGET_CATEGORY_ORDER=DESCENDING and INPUTS_CATEGORY_ORDER=DESCENDING specify that the 0 group is used as the comparison (typically what is desired) for the dependent and the independent variables, respectively. If omitted, the 1 group is used as the default.
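
Returning to the predicted-probability formula in the HLM section, the conversion from log odds to a probability can be checked in R. This is a minimal sketch that uses the intercept value of -3.045 from the handout's worked equation; the object names `eta`, `phi_manual`, and `phi_plogis` are illustrative only.

```r
# Intercept from the worked example in the HLM section
# (predicted log odds when all centered predictors equal 0)
eta <- -3.045

# Inverse logit written out: 1 / (1 + e^(-eta))
phi_manual <- 1 / (1 + exp(-eta))

# plogis() in base R computes the same inverse-logit transformation
phi_plogis <- plogis(eta)

print(round(phi_manual, 3))  # 0.045
print(round(phi_plogis, 3))  # 0.045
```

With a fitted glmer model, the same transformation could be applied to the intercept, for example plogis(fixef(model1)["(Intercept)"]). Note that glmer estimates are unit-specific, so that value describes a student in a school with a random effect of 0 rather than the population-average proportion.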
