The Likelihood Ratio Test

The likelihood ratio test is a general-purpose test designed to evaluate nested statistical models in a way that is strictly analogous to the F-test for reduced models (RM) and full models (FM) commonly employed with linear models (see Biostatistics Worksheet 402). In both, failure to reject the null hypothesis results in model simplification. The likelihood ratio test works not only with linear models, as shown here, but may be applied to a very wide array of problems involving Generalized Linear Models (GLM), where Maximum Likelihood (ML) or Restricted Maximum Likelihood (REML) methods are used to estimate model parameters. The latter methods/models include, among others, Logistic Regression (see GLM 020), Poisson Regression (see GLM 040), and Linear Mixed Models (see LMM 060), described in Worksheets on the Biologist's Analytic Toolkit Website under Statistical Models. For direct comparison of results, the data set analyzed here is the same as for the general F test (Biostatistics Worksheet 402). As can be seen, the F test and the likelihood ratio test give similar, but not exactly the same, results. Helpful discussion of this approach appears in Kutner et al. (KNNL), Applied Linear Statistical Models, 5th Edition, and numerous statistics websites.

Example in R:

#LOG LIKELIHOOD AND LIKELIHOOD RATIO TEST
setwd("c:/data/models/")
K=read.table("KNNLCh9SurgicalUnit.txt")
K
attach(K)

Fitting Full and Reduced linear models:

Full Model:

#FITTING THE FULL LINEAR MODEL
FM=lm(Y~X1+X2+X3+X4+X5+factor(X6))
FMg=glm(Y~X1+X2+X3+X4+X5+factor(X6))
anova(FM)
anova(FMg)

Note: R's function glm() is also employed here since it produces an object class for which the generic wrapper anova() dispatches to anova.glm(), which reports deviance (likelihood ratio) results by default.

> K
    X1 X2  X3   X4 X5 X6     Y
1  6.7 62  81 2.59 50  0 6.544
2  5.1 59  66 1.70 39  0 5.999
3  7.4 57  83 2.16 55  0 6.565
4  6.5 73  41 2.01 48  0 5.854
5  7.8 65 115 4.30 45  0 7.759
6  5.8 38  72 1.42 65  1 5.852
7  5.7 46  63 1.91 49  1 6.250
8  3.7 68  81 2.57 69  1 6.619
9  6.0 67  93 2.50 58  0 6.962
10 3.7 76  94 2.40 48  0 6.875
11 6.3 84  83 4.13 37  0 6.613
12 6.7 51  43 1.86 57  0 5.549
13 5.8 96 114 3.95 63  1 7.361
...
44 6.5 56  77 2.85 41  0 6.288
45 3.4 77  93 1.48 69  0 6.178
46 6.5 40  84 3.00 54  1 6.416
47 4.5 73 106 3.05 47  1 6.867
48 4.8 86 101 4.10 35  1 7.170
49 5.1 67  77 2.86 66  1 6.365
50 3.9 82 103 4.55 50  0 6.983
51 6.6 77  46 1.95 50  0 6.005
52 6.4 85  40 1.21 58  0 6.361
53 6.4 59  85 2.33 63  0 6.310
54 8.8 78  72 3.20 56  0 6.478

> anova(FM)
            Df Sum Sq Mean Sq  F value    Pr(>F)
X1           1 0.7763  0.7763  12.5579 0.0009042 ***
X2           1 2.5888  2.5888  41.8803 5.187e-08 ***
X3           1 6.3341  6.3341 102.4704 2.157e-13 ***
X4           1 0.0246  0.0246   0.3976 0.5313820
X5           1 0.1265  0.1265   2.0460 0.1592180
factor(X6)   1 0.0522  0.0522   0.8448 0.3627348
Residuals   47 2.9053  0.0618
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

σ̂_FM = √0.0618 = 0.2486   <- "standard error" (standard deviation of the residuals) for the full model

> anova(FMg)
Model: gaussian, link: identity
Terms added sequentially (first to last)
            Df Deviance Resid. Df Resid. Dev
NULL                           53    12.8077
X1           1   0.7763        52    12.0315
X2           1   2.5888        51     9.4427
X3           1   6.3341        50     3.1085
X4           1   0.0246        49     3.0840
X5           1   0.1265        48     2.9575
factor(X6)   1   0.0522        47     2.9053
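
As a quick cross-check of the quoted residual standard deviation (this sketch is not part of the original worksheet and assumes the FM object fitted above is in the workspace), the value can be recovered directly from the residual mean square in the anova(FM) table:

#CHECKING THE RESIDUAL STANDARD ERROR BY HAND (a sketch; assumes FM exists)
MSE.FM = sum(residuals(FM)^2)/df.residual(FM)   #2.9053/47 = 0.0618, the Residuals Mean Sq above
sqrt(MSE.FM)                                    #0.2486, matches summary(FM)$sigma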

Reduced Model:

#FITTING A REDUCED LINEAR MODEL
RM=lm(Y~X1+X2+X3+X5)
RMg=glm(Y~X1+X2+X3+X5)
anova(RM)
anova(RMg)

> anova(RM)
           Df Sum Sq Mean Sq  F value    Pr(>F)
X1          1 0.7763  0.7763  12.8495  0.000776 ***
X2          1 2.5888  2.5888  42.8528 3.349e-08 ***
X3          1 6.3341  6.3341 104.8499 9.118e-14 ***
X5          1 0.1484  0.1484   2.4561  0.123503
Residuals  49 2.9602  0.0604
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

σ̂_RM = √0.0604 = 0.2458   <- "standard error" (standard deviation of the residuals) for the reduced model

> anova(RMg)
Model: gaussian, link: identity
Terms added sequentially (first to last)
      Df Deviance Resid. Df Resid. Dev
NULL                     53    12.8077
X1     1   0.7763        52    12.0315
X2     1   2.5888        51     9.4427
X3     1   6.3341        50     3.1085
X5     1   0.1484        49     2.9602

Estimating Standard Error using Maximum Likelihood (r is the difference in number of parameters between models FM & RM):

n := 54    k := 7    r := 2
σ_FM := 0.2486247    σ_RM := 0.2457874

MLσ_FM := σ_FM·√((n − k)/n)        MLσ_FM = 0.231951
MLσ_RM := σ_RM·√((n − k + r)/n)    MLσ_RM = 0.234132

> #CALCULATING MAXIMUM LIKELIHOOD STANDARD DEVIATION
> n=length(K[,1])
> n    #NUMBER OF CASES IN DATASET K
[1] 54
> k=length(K)
> k    #NUMBER OF VARIABLES IN FM
[1] 7
> r=2  #DIFFERENCE IN NUMBER OF VARIABLES FM VS RM
> r
[1] 2
> #EXTRACTING STANDARD ERRORS:
> #FOR FM:
> FMsigma = summary(FM)$sigma
> FMsigma
[1] 0.2486247
> #FOR RM:
> RMsigma = summary(RM)$sigma
> RMsigma
[1] 0.2457874
> #MAXIMUM LIKELIHOOD STANDARD DEVIATIONS
> #FOR FM:
> FMsigma.ML = FMsigma*sqrt((n-k)/n)
> FMsigma.ML
[1] 0.2319511
> #FOR RM:
> RMsigma.ML = RMsigma*sqrt((n-k+r)/n)
> RMsigma.ML
[1] 0.234132
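
The rescaling above is equivalent to dividing each model's error sum of squares by n instead of by its residual degrees of freedom. A minimal sketch (not in the original worksheet; it assumes FM, RM, and n from the code above) arrives at the same ML standard deviations directly:

#ML STANDARD DEVIATIONS COMPUTED DIRECTLY AS sqrt(SSE/n) (a sketch; assumes FM, RM, and n exist)
sqrt(sum(residuals(FM)^2)/n)   #0.2319511, same as FMsigma.ML
sqrt(sum(residuals(RM)^2)/n)   #0.2341320, same as RMsigma.ML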

Calculating Log Likelihoods for Each Model:

The ln likelihood values for FM and RM (ln = natural log, base e):

> # LOG LIKELIHOOD OF MODELS
> # FM:
> sum(log(dnorm(x = Y, mean = predict(FM), sd = FMsigma.ML)))
[1] 2.28368
> logLik(FM)
'log Lik.' 2.28368 (df=8)
> # RM:
> sum(log(dnorm(x = Y, mean = predict(RM), sd = RMsigma.ML)))
[1] 1.778311
> logLik(RM)
'log Lik.' 1.778311 (df=6)

Note: the log likelihoods for each model are calculated here using maximum likelihood estimates of the standard error for each model separately. This contrasts with the use of the standard error from the FM only in the test below.

Likelihood Ratio Test:

Assumptions:
- Standard Linear Regression depends on specifying in advance which variable is to be considered 'dependent' and which 'independent'. This decision matters, as changing the roles of Y & X usually produces a different result.
- Y_1, Y_2, Y_3, ..., Y_n (dependent variable) is a random sample. Note: Although a Normal distribution is assumed here for Y in a linear model, in other instances of the likelihood ratio test this assumption does not apply.
- X_1, X_2, X_3, ..., X_n (independent variable), with each value of X_i matched to Y_i.

Within this setup, two models for the relationship between the X and Y variables are explicitly compared:

Full Model:    Y_i = β_0 + Σ_j β_j X_ij + ε_i
Reduced Model: Y_i = β_0 + Σ_k β_k X_ik + ε_i

where:
Y_i and [X_1, X_2, ..., X_j] are matched dependent and independent variables,
β_0 is the y intercept of the regression line (translation),
β_j are slope coefficients for the full set of independent variables X_1, X_2, ..., X_j,
β_k are slope coefficients for a smaller set of independent variables within X_j, and
ε_i is the error in the prediction of Y_i, a random variable ~ N(0, σ²).

Hypotheses:
H_0: the coefficients in j but NOT INCLUDED in k equal 0. Note: this is always the more parsimonious (i.e., smaller) model.
H_1: at least some of these coefficients are not 0.
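
For comparison with the SSE-based statistic used below, the generic likelihood ratio statistic can also be formed directly from the two log likelihoods just computed. This sketch (not part of the original worksheet; it assumes FM and RM are in the workspace) gives a slightly different chi-square value and p-value because each model carries its own ML variance estimate:

#GENERIC LOG-LIKELIHOOD-BASED STATISTIC (a sketch; assumes FM and RM exist)
LRT.ll = as.numeric(2*(logLik(FM) - logLik(RM)))   #2*(2.28368 - 1.778311), approx 1.011
1 - pchisq(LRT.ll, df=2)                           #approx 0.60; similar but not identical to the 0.6415 reported below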

Degrees of Freedom:

n = 54    <- n = number of matched observations in the dataset
k = 7     <- k = number of variables in FM
r = 2     <- r = difference in number of variables between FM & RM

Sum of Squares and Standard Error for FM:

> anova(FM)
            Df Sum Sq Mean Sq  F value    Pr(>F)
X1           1 0.7763  0.7763  12.5579 0.0009042
X2           1 2.5888  2.5888  41.8803 5.187e-08
X3           1 6.3341  6.3341 102.4704 2.157e-13
X4           1 0.0246  0.0246   0.3976 0.5313820
X5           1 0.1265  0.1265   2.0460 0.1592180
factor(X6)   1 0.0522  0.0522   0.8448 0.3627348
Residuals   47 2.9053  0.0618

> anova(RM)
           Df Sum Sq Mean Sq  F value    Pr(>F)
X1          1 0.7763  0.7763  12.8495  0.000776 ***
X2          1 2.5888  2.5888  42.8528 3.349e-08 ***
X3          1 6.3341  6.3341 104.8499 9.118e-14 ***
X5          1 0.1484  0.1484   2.4561  0.123503
Residuals  49 2.9602  0.0604

> #LIKELIHOOD RATIO TEST:
> #SUM OF SQUARES ERROR FOR MODELS:
> SSE.FM = sum((Y-predict(FM))^2)  #SSE for FM
> SSE.FM
[1] 2.90527
> SSE.RM = sum((Y-predict(RM))^2)  #SSE for RM
> SSE.RM
[1] 2.960161

s := 0.2486247    SSE_FM := 2.90527    SSE_RM := 2.960161

> #STANDARD ERROR FOR FM:
> s=summary(FM)$sigma
> s
[1] 0.2486247
> s=sqrt(summary(FMg)$dispersion)
> s
[1] 0.2486247

^ Standard errors are the square root of MSE, see above.

Relative Likelihoods:

LFM := e^(−(1/2)·SSE_FM/s²)    LFM = 6.2241 × 10^-11
LRM := e^(−(1/2)·SSE_RM/s²)    LRM = 3.9926 × 10^-11

Likelihoods (see eq. 1.26 in KNNL):

C := 1/(2πs²)^(n/2)    C = 1.2296 × 10^11
Λ_FM := C·LFM          Λ_FM = 7.653
Λ_RM := C·LRM          Λ_RM = 4.9091

> #RELATIVE LIKELIHOODS FOR THE MODELS:
> LFM = exp(-(1/2)*(SSE.FM/s^2))  #TIMES CONSTANT C
> LFM
[1] 6.224145e-11
> LRM = exp(-(1/2)*(SSE.RM/s^2))  #TIMES CONSTANT C
> LRM
[1] 3.992573e-11
> #CONSTANT C:
> n    #NUMBER OF CASES IN DATASET K
[1] 54
> C=1/((2*pi*s^2)^(n/2))  #CONSTANT IN EQ 1.26 IN KNNL
> C
[1] 122956414826
> LCFM=C*LFM
> LCFM
[1] 7.652985
> LCRM=C*LRM
> LCRM
[1] 4.909125
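
The likelihoods above connect directly to the test statistic computed next: because the constant C cancels in the ratio, minus two times the natural log of the likelihood ratio reproduces (SSE_RM − SSE_FM)/s². A short check (not in the original worksheet; it assumes LCFM and LCRM, or equivalently LFM and LRM, from the code above):

#-2 LOG OF THE LIKELIHOOD RATIO (a sketch; assumes LCFM and LCRM exist)
-2*log(LCRM/LCFM)   #0.8880001, identical to (SSE.RM - SSE.FM)/s^2 used as the test statistic below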

Likelihood Ratio Test Statistic:

SSE_FM := 2.9053    SSE_RM := 2.9602

LRT := (SSE_RM − SSE_FM)/s²    LRT = 0.8881    <- difference from the R value below is due to rounding

> #LOG LIKELIHOOD RATIO STATISTIC:
> LRT=(SSE.RM - SSE.FM)/s^2
> LRT  #LOG LIKELIHOOD RATIO STATISTIC
[1] 0.8880001

Critical Value of the Test:

α := 0.05                 <- Probability of Type I error must be explicitly set
CV := qchisq(1 − α, r)    CV = 5.9915    <- note: degrees of freedom reflect the difference between the models

Decision Rule: IF LRT > CV, THEN REJECT H_0; OTHERWISE ACCEPT H_0.

LRT = 0.8881 < CV = 5.9915, so H_0 is not rejected.

Probability Value:

P := 1 − pchisq(LRT, r)    P = 0.6414

> #PROBABILITY OF NULL HYPOTHESIS RM
> P=1-pchisq(LRT,2)
> P  #PROBABILITY
[1] 0.6414654

IMPORTANT NOTE: FAILURE to reject H_0 in this test means that the MORE PARSIMONIOUS model RM is PREFERRED!

Prototype in R:

#LIKELIHOOD RATIO TEST:
anova(RM,FM,test="LRT")
anova(RMg,FMg,test="LRT")

> #LIKELIHOOD RATIO TEST:
> anova(RM,FM,test="LRT")
Model 1: Y ~ X1 + X2 + X3 + X5
Model 2: Y ~ X1 + X2 + X3 + X4 + X5 + factor(X6)
  Res.Df    RSS Df Sum of Sq Pr(>Chi)
1     49 2.9602
2     47 2.9053  2  0.054891   0.6415

> anova(RMg,FMg,test="LRT")
Model 1: Y ~ X1 + X2 + X3 + X5
Model 2: Y ~ X1 + X2 + X3 + X4 + X5 + factor(X6)
  Resid. Df Resid. Dev Df Deviance Pr(>Chi)
1        49     2.9602
2        47     2.9053  2 0.054891   0.6415
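
As a usage note (not part of the original worksheet): if the lmtest package is installed, lrtest(RM, FM) runs a comparable test built from the models' log likelihoods. Because it uses logLik() rather than the FM-only standard error, its chi-square statistic and p-value match the generic calculation sketched earlier (approximately 1.01 and 0.60) rather than the 0.888 and 0.6415 reported above.

#ALTERNATIVE logLik-BASED TEST (a sketch; assumes the lmtest package is available)
library(lmtest)
lrtest(RM, FM)   #Chisq approx 1.01 on 2 Df, p approx 0.60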