Mixed models in R using the lme4 package Part 3: Inference based on profiled deviance

Douglas Bates
Department of Statistics, University of Wisconsin - Madison
<Bates@Wisc.edu>
Madison, January 11, 2011

Outline
1. Profiling the deviance
2. Plotting the profiled deviance
3. Profile pairs
4. Profiling models with fixed-effects for covariates
5. Summary

Likelihood ratio tests and deviance
In section 2 we described the use of likelihood ratio tests (LRTs) to compare a reduced model (say, one that omits a random-effects term) to the full model.
The test statistic in an LRT is the change in the deviance, which is negative twice the log-likelihood.
We always use maximum likelihood fits (i.e. REML=FALSE) to evaluate the deviance.
In general we calculate p-values for an LRT from a $\chi^2$ distribution with degrees of freedom equal to the difference in the number of parameters in the models.
The important thing to note is that a likelihood ratio test is based on fitting the model under each set of conditions.
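In symbols, the quantities on this slide can be written as follows. The notation is an assumption introduced here and is not on the slides: $\theta$ is the full parameter vector, $L$ the likelihood, $\hat\theta_0$ and $\hat\theta_1$ the maximum likelihood estimates under the reduced and full models, and $p_0$, $p_1$ their numbers of parameters.

```latex
\[
  d(\theta) = -2 \log L(\theta), \qquad
  T = d(\hat\theta_0) - d(\hat\theta_1), \qquad
  T \;\overset{\text{approx.}}{\sim}\; \chi^2_{\,p_1 - p_0}
  \quad \text{under the reduced model.}
\]
```

Both $d(\hat\theta_0)$ and $d(\hat\theta_1)$ require their own fit, which is the point of the last bullet.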

Profiling the deviance versus one parameter
There is a close relationship between confidence intervals and hypothesis tests on a single parameter. When, e.g., $H_0: \beta_1 = \beta_{1,0}$ versus $H_a: \beta_1 \neq \beta_{1,0}$ is not rejected at level $\alpha$, then $\beta_{1,0}$ is in a $1 - \alpha$ confidence interval on the parameter $\beta_1$.
For linear fixed-effects models it is possible to determine the change in the deviance from fitting the full model only. For mixed-effects models we need to fit the full model and all the reduced models to perform the LRTs. In practice we fit some of them and use interpolation.
The profile function evaluates such a profile of the change in the deviance versus each of the parameters in the model.

Transforming the LRT statistic
The LRT statistic for a test of a fixed value of a single parameter would have a $\chi^2_1$ distribution, which is the square of a standard normal.
If a symmetric confidence interval were appropriate for the parameter, the LRT statistic would be quadratic with respect to the parameter.
We plot the square root of the LRT statistic because it is easier to assess whether the plot looks like a straight line than it is to assess if it looks like a quadratic.
To accentuate the straight-line behavior we use the signed square root transformation, which returns the negative square root to the left of the estimate and the positive square root to the right.
This quantity can be compared to a standard normal. We write it as $\zeta$.
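The signed square root can be illustrated without lme4 on a much simpler model. The sketch below is not from the slides: it assumes an i.i.d. normal sample (simulated here), profiles the deviance over the mean on a grid, forms $\zeta$, and inverts it by interpolation to get a 95% interval. The names `prof_dev`, `mu_grid` and `ci` are hypothetical.

```r
# Simulated data, purely illustrative (an assumption, not the Dyestuff data).
set.seed(1)
y <- rnorm(30, mean = 10, sd = 2)
n <- length(y)

# Profiled deviance (-2 log-likelihood) at a fixed mu: sigma^2 is replaced
# by its conditional ML estimate mean((y - mu)^2).
prof_dev <- function(mu) {
  s2 <- mean((y - mu)^2)
  n * (log(2 * pi * s2) + 1)
}

mu_hat  <- mean(y)                                   # ML estimate of mu
mu_grid <- seq(mu_hat - 2, mu_hat + 2, length.out = 41)

# Change in deviance relative to the full fit (the LRT statistic) ...
lrt <- sapply(mu_grid, prof_dev) - prof_dev(mu_hat)
# ... and its signed square root: negative to the left of the estimate.
zeta <- sign(mu_grid - mu_hat) * sqrt(pmax(lrt, 0))

# Interpolate mu as a function of zeta and read off the standard normal
# quantiles to obtain a profile-based 95% confidence interval.
ci <- approx(zeta, mu_grid, xout = qnorm(c(0.025, 0.975)))$y
print(ci)
```

Plotting `zeta` against `mu_grid` gives the kind of curve the profile plots show: the closer it is to a straight line, the better a normal approximation describes the estimator.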

Evaluating and plotting the profile
> pr1 <- profile(fm1m <- lmer(Yield ~ 1 + (1 | Batch), Dyestuff, REML = FALSE))
> xyplot(pr1, aspect = 1.3)
[Figure: profile $\zeta$ plots in three panels, .sig01, .lsig and (Intercept), with $\zeta$ on the vertical axis]
The parameters are $\sigma_b$, $\log(\sigma)$ ($\sigma$ is the residual standard deviation) and $\mu$. The vertical lines delimit 50%, 80%, 90%, 95% and 99% confidence intervals.

Alternative profile plot
> xyplot(pr1, aspect = 0.7, absVal = TRUE)
[Figure: profile plots of $|\zeta|$ in three panels, .sig01, .lsig and (Intercept)]
Numerical values of the confidence interval limits are obtained from the method for the confint generic.
> confint(pr1)
                  2.5 %     97.5 %
.sig01        12.201753   84.06289
.lsig          3.643622    4.21446
(Intercept) 1486.451500 1568.54849

Changing the confidence level
As for other methods for the confint generic, we use the level argument to obtain a confidence level other than the default of 0.95.
> confint(pr1, level = 0.99)
                  0.5 %      99.5 %
.sig01               NA  113.692643
.lsig          3.571293    4.326347
(Intercept) 1465.874011 1589.126022
Note that the lower 99% confidence limit for $\sigma_1$ is undefined.
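The calls on the last three slides can be collected into one script. This is a sketch assuming a current release of lme4 is installed. The slides date from 2011, so details may differ: for instance, recent versions label the residual parameter `.sigma` rather than `.lsig`, and the printed limits need not match the output shown above.

```r
library(lme4)   # assumed installed; provides lmer(), profile() and Dyestuff

# Maximum likelihood fit (REML = FALSE), as the slides require for deviance.
fm1M <- lmer(Yield ~ 1 + (1 | Batch), Dyestuff, REML = FALSE)

# Profile the deviance with respect to each parameter of the model.
pr1 <- profile(fm1M)

# Profile-based confidence intervals at the default and at a wider level.
ci95 <- confint(pr1)
ci99 <- confint(pr1, level = 0.99)
print(ci95)
print(ci99)
```

The plots are produced from the same object, e.g. `lattice::xyplot(pr1)` for the $\zeta$ plots and `lattice::splom(pr1)` for the pairs plot.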

Interpreting the univariate plots
A univariate profile $\zeta$ plot is read like a normal probability plot:
- a sigmoidal (elongated "S"-shaped) pattern, like that for the (Intercept) parameter, indicates overdispersion relative to the normal distribution;
- a bending pattern, usually flattening to the right of the estimate, indicates skewness of the estimator and warns us that the confidence intervals will be asymmetric;
- a straight line indicates that the confidence intervals based on the quantiles of the standard normal distribution are suitable.
Note that the only parameter providing a more-or-less straight line is $\sigma$, and this plot is on the scale of $\log(\sigma)$, not $\sigma$ or, even worse, $\sigma^2$.
We should expect confidence intervals on $\sigma^2$ to be asymmetric. In the simplest case of a variance estimate from an i.i.d. normal sample, the confidence interval is derived from quantiles of a $\chi^2$ distribution, which is quite asymmetric (although many software packages provide standard errors of variance component estimates as if they were meaningful).
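The i.i.d. normal case mentioned in the last bullet is easy to check numerically. The sketch below is an illustration added here, with simulated data and a hypothetical helper `asym`: it computes the classical 95% interval for $\sigma^2$ from $\chi^2$ quantiles and measures how lopsided that interval is on the scales of $\sigma^2$, $\sigma$ and $\log(\sigma)$.

```r
# Simulated sample, illustrative only (sample size and sd are assumptions).
set.seed(2)
y  <- rnorm(10, mean = 0, sd = 50)
n  <- length(y)
s2 <- var(y)                      # usual unbiased variance estimate

# (n - 1) * s2 / sigma^2 follows a chi-square distribution with n - 1 df,
# so the 95% limits for sigma^2 come from its 97.5% and 2.5% quantiles.
ci_var   <- (n - 1) * s2 / qchisq(c(0.975, 0.025), df = n - 1)
ci_sd    <- sqrt(ci_var)          # same interval on the sigma scale
ci_logsd <- log(ci_sd)            # and on the log(sigma) scale

# Ratio of upper to lower distance from the estimate; 1 means symmetric.
asym <- function(ci, est) (ci[2] - est) / (est - ci[1])
ratios <- c(variance = asym(ci_var, s2),
            sd       = asym(ci_sd, sqrt(s2)),
            log_sd   = asym(ci_logsd, log(sqrt(s2))))
print(ratios)
```

The ratios shrink toward 1 as one moves from $\sigma^2$ to $\sigma$ to $\log(\sigma)$, which is the ordering the slides describe for the profile plots.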

Profile $\zeta$ plots for $\log(\sigma)$, $\sigma$ and $\sigma^2$
[Figure: profile $\zeta$ plots in three panels, on the scales of $\log(\sigma)$, $\sigma$ and $\sigma^2$]
We can see moderate asymmetry on the scale of $\sigma$ and stronger asymmetry on the scale of $\sigma^2$.
The issue of which of the ML or REML estimates of $\sigma^2$ is closer to being unbiased is a red herring: $\sigma^2$ is not a sensible scale on which to evaluate the expected value of an estimator.

Profile $\zeta$ plots for $\log(\sigma_1)$, $\sigma_1$ and $\sigma_1^2$
[Figure: profile $\zeta$ plots in three panels, on the scales of $\log(\sigma_1)$, $\sigma_1$ and $\sigma_1^2$]
For $\sigma_1$ the situation is more complicated because 0 is within the range of reasonable values. The profile flattens as $\sigma \to 0$, which means that intervals on $\log(\sigma)$ are unbounded.
Obviously the estimator of $\sigma_1^2$ is terribly skewed, yet most software ignores this and provides standard errors on variance component estimates.

Profile pairs plots
The information from the profile can be used to produce pairwise projections of likelihood contours. These correspond to pairwise joint confidence regions.
Such a plot (next slide) can be somewhat confusing at first glance.
Concentrate initially on the panels above the diagonal, where the axes are the parameters in the scale shown in the diagonal panels. The contours correspond to 50%, 80%, 90%, 95% and 99% pairwise confidence regions.
The two lines in each panel are profile traces, which are the conditional estimate of one parameter given a value of the other.
The actual interpolation of the contours is performed on the $\zeta$ scale, which is shown in the panels below the diagonal.

Profile pairs for model fm1
> splom(pr1)
[Figure: profile pairs plot (scatter plot matrix) for .sig01, .lsig and (Intercept)]

About those p-values
Statisticians have been far too successful in propagating concepts of hypothesis testing and p-values, to the extent that quoting p-values is essentially a requirement for publication in some disciplines.
When models were being fit by hand calculation it was important to use any trick we could come up with to simplify the calculation. Often the results were presented in terms of the simplified calculation without reference to the original idea of comparing models.
We often still present model comparisons as properties of terms in the model without being explicit about the underlying comparison of models with the term and without the term.
The approach I recommend for assessing the importance of particular terms in the fixed-effects part of the model is to fit the model with and without the term, then use a likelihood ratio test (the anova function).
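A minimal sketch of that recommendation follows. The data set (`sleepstudy`, shipped with lme4) and the model formulas are assumptions chosen for illustration and do not appear on the slides; the pattern is what matters: two maximum likelihood fits, one comparison.

```r
library(lme4)   # assumed installed; provides lmer() and the sleepstudy data

# Illustrative models only: the same random effects, with and without
# the fixed-effects term Days. Both are ML fits (REML = FALSE).
fm_without <- lmer(Reaction ~ 1 + (1 | Subject), sleepstudy, REML = FALSE)
fm_with    <- lmer(Reaction ~ 1 + Days + (1 | Subject), sleepstudy, REML = FALSE)

# The LRT statistic is the drop in deviance (-2 log-likelihood)
# obtained by adding the term.
lrt_stat <- as.numeric(-2 * logLik(fm_without) + 2 * logLik(fm_with))

# anova() performs the same comparison and adds the chi-square p-value.
cmp <- anova(fm_without, fm_with)
print(cmp)
```

Writing the comparison this way keeps the two models explicit instead of presenting the p-value as a property of the term.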

Hypothesis tests versus confidence intervals
As mentioned earlier, hypothesis tests and confidence intervals are two sides of the same coin.
For a categorical covariate, it often makes sense to ask "Is there a significant effect for this factor?", which we could answer with a p-value. We may, in addition, want to know how large the effect is and how precisely we have estimated it, i.e. a confidence interval.
For a continuous covariate we generally want to know the coefficient estimate and its precision (i.e. a confidence interval) in preference to a p-value for a hypothesis test.
When we have many observations and only a moderate number of fixed and random effects, the distribution of the fixed-effects coefficient estimators is well approximated by a multivariate normal derived from the estimates, their standard errors and correlations.
With comparatively few observations it is worthwhile using profiling to check on the sensitivity of the fit to the values of the coefficients.
As we have seen, estimates of variance components can be poorly behaved and it is worthwhile using profiling to check their precision.

Profiling a model for the classroom data
> pr8 <- profile(fm8 <- lmer(mathgain ~ mathkind + minority + ses + (1 | classid) + (1 | schoolid), classroom, REML = FALSE))
[Figure: profile $\zeta$ plots, one panel per parameter: (Intercept), mathkind, minorityy, ses, .sig01, .sig02 and .lsig]
The fixed-effects coefficient estimates (top row) have good normal approximations (i.e. a 95% confidence interval will be closely approximated by estimate ± 1.96 × standard error).
The estimators of $\sigma_1$, $\sigma_2$ and $\log(\sigma)$ are also well approximated by a normal. If anything, the estimators of $\sigma_1$ and $\sigma_2$ are skewed to the left rather than skewed to the right.

Profile pairs for many parameters
[Figure: profile pairs plot (scatter plot matrix) for .sig01, .sig02, minorityy, mathkind, (Intercept), .lsig and ses]

Summary
Profiles of the deviance with respect to the parameters in the model allow us to assess the variability in the parameters in terms of how well the model can be fit.
We apply the signed square root transformation to the change in the deviance to produce $\zeta$. When the Gaussian approximation to the distribution of the parameter estimate is appropriate, this function will be close to a straight line.
Profile zeta plots and profile pairs plots provide visual assessment of the precision of parameter estimates.
Typically the distribution of variance component estimates is highly skewed to the right and poorly approximated by a Gaussian, implying that standard errors of such estimates are of little value.