Alastair Hall ECG 790F: Microeconometrics Spring Computer Handout # 2. Estimation of binary response models : part II


Alastair Hall
ECG 790F: Microeconometrics, Spring 2006
Computer Handout # 2
Estimation of binary response models: part II

In this handout, we discuss the estimation of binary response models with and without heteroscedasticity using proc qlim. We use the same example as in Computer Handout # 1 and, for brevity, focus purely on the probit model. Recall that in this example we wish to model the probability that an individual votes yes in a referendum on whether the local tax rate should be increased to provide additional funding for schools. The probability of voting yes is assumed to be determined as follows:

    P(YESVM) = f(PUB12, PUB34, PUB5, PRIV, YEARS, SCHOOL, LOGINC, PTCON)

where

    YESVM  = I{individual votes yes},
    PUB12  = I{individual has 1 or 2 children in public school},
    PUB34  = I{individual has 3 or 4 children in public school},
    PUB5   = I{individual has 5 or more children in public school},
    PRIV   = I{individual has 1 or more children in private school},
    SCHOOL = I{individual is employed as a teacher (public or private)},
    YEARS  = # of years living in Troy community,
    LOGINC = log of annual household income in $,
    PTCON  = log of property taxes paid per year in $,

and I{.} denotes an indicator variable that takes the value one if the event in the curly brackets occurs and zero otherwise.

In Computer Handout # 1, we described how this model can be estimated using proc logistic. The model can also be estimated in proc qlim as follows [1]:

    proc qlim data=main;
      model yesvm = PUB1_2 PUB3_4 PUB5 PRIV YEARS SCHOOL loginc PTCON /
            type=bprobit covest=hess optmethod=nr;
      endogenous discrete=(yesvm 0 1);
    run;

[1] This assumes that the data are read into the data set main using the same commands as in Computer Handout # 1.

The role of the commands is as follows:

- type=bprobit: this implements a binary response probit estimation; for logit use blogit.
- covest=hess: this causes the covariance matrix of the MLE to be estimated via the Hessian of the log likelihood function; the other option is op, which would cause the estimate to be based on the outer product of the gradient.
- optmethod=nr: this sets the numerical optimization method to Newton-Raphson; another option is qn, which yields the so-called quasi-Newton method.
- endogenous discrete=(yesvm 0 1): this instructs SAS that the dependent variable is discrete and takes the values zero or one.

It is also possible to control the number of iterations within the routine using maxiter= (the default is 100) and the starting values using start=.

The output is as follows:

    The QLIM Procedure
    Binary Probit Estimates
    Algorithm converged.

    Model Fit Summary
    Dependent Variable           YESVM
    Number of Observations       95
    Log Likelihood               -52.70244
    Maximum Absolute Gradient    3.0704E-12
    Number of Iterations         6
    Optimization Method          Newton-Raphson
    AIC                          123.40488
    Schwarz Criterion            146.38977

    Discrete Response Profile
    Index   YESVM   Frequency   Percent
    0       0       36          37.89
    1       1       59          62.11

    Goodness-of-Fit Measures for Discrete Choice Models
    Measure                  Value     Formula
    Likelihood Ratio (R)     20.669    2 * (LogL - LogL0)
    Upper Bound of R (U)     126.07    - 2 * LogL0
    Aldrich-Nelson           0.1787    R / (R+N)
    Cragg-Uhler 1            0.1955    1 - exp(-R/N)
    Cragg-Uhler 2            0.2661    (1-exp(-R/N)) / (1-exp(-U/N))
    Estrella                 0.2115    1 - (1-R/U)^(U/N)
    Adjusted Estrella        0.0280    1 - ((LogL-K)/LogL0)^(-2/N*LogL0)
    McFadden's LRI           0.1639    R / U
    Veall-Zimmermann         0.3133    (R * (U+N)) / (U * (R+N))
    McKelvey-Zavoina         0.3918
    N = # of observations, K = # of regressors

    Parameter Estimates
                                  Standard             Approx
    Parameter   DF   Estimate     Error      t Value   Pr > |t|   Gradient
    Intercept   1    -4.1560      4.5863     -0.91     0.3648     2.54E-13
    PUB1_2      1     0.1859      0.4352      0.43     0.6693     9.45E-14
    PUB3_4      1     0.5324      0.4799      1.11     0.2673     1.16E-13
    PUB5        1     0.2039      0.7683      0.27     0.7907     -555E-17
    PRIV        1    -0.3298      0.4649     -0.71     0.4780     -644E-17
    YEARS       1    -0.0145      0.0153     -0.95     0.3440     3.07E-12
    SCHOOL      1     1.6654      0.8616      1.93     0.0532     3E-13
    loginc      1     1.4591      0.4800      3.04     0.0024     2.5E-12
    PTCON       1    -1.4777      0.6489     -2.28     0.0228     1.68E-12
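The goodness-of-fit values in the output can be checked by hand from the reported log likelihood and the response profile. The Python sketch below is not part of the original handout and its variable names are illustrative. It assumes that LogL0 is the log likelihood of an intercept-only model, which for a binary response equals n0*log(n0/N) + n1*log(n1/N), and that K = 9 counts the intercept along with the eight regressors.

```python
import math

# Values read from the proc qlim output above.
n0, n1 = 36, 59          # frequencies of YESVM = 0 and YESVM = 1
N = n0 + n1              # number of observations (95)
logL = -52.70244         # maximized log likelihood of the probit model
K = 9                    # assumption: 8 regressors plus the intercept

# Assumption: LogL0 is the intercept-only log likelihood, which depends
# only on the sample proportions of zeros and ones.
logL0 = n0 * math.log(n0 / N) + n1 * math.log(n1 / N)

R = 2 * (logL - logL0)   # likelihood ratio statistic
U = -2 * logL0           # upper bound of R

measures = {
    "Likelihood Ratio (R)": R,
    "Upper Bound of R (U)": U,
    "Aldrich-Nelson": R / (R + N),
    "Cragg-Uhler 1": 1 - math.exp(-R / N),
    "Cragg-Uhler 2": (1 - math.exp(-R / N)) / (1 - math.exp(-U / N)),
    "Estrella": 1 - (1 - R / U) ** (U / N),
    "Adjusted Estrella": 1 - ((logL - K) / logL0) ** (-2 / N * logL0),
    "McFadden's LRI": R / U,
    "Veall-Zimmermann": (R * (U + N)) / (U * (R + N)),
    # Information criteria from the Model Fit Summary.
    "AIC": -2 * logL + 2 * K,
    "Schwarz Criterion": -2 * logL + K * math.log(N),
}

for name, value in measures.items():
    print(f"{name:22s} {value:10.4f}")
```

McKelvey-Zavoina is omitted because it depends on the fitted index values for each observation, which cannot be recovered from the printed output alone.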

Notice the following features of this output:

- The estimates are the same as those obtained in Computer Handout # 1 using proc logistic.
- The reported statistics are different from those generated by proc logistic.
- A lot of goodness-of-fit statistics have been proposed for this model!

One advantage of proc qlim is that it can be used to estimate binary response models with heteroscedasticity. Suppose in our example that we wish to estimate the model with heteroscedasticity of the form $\sigma_i^2 = \exp(\theta \, PRIV_i)$. This can be implemented as follows.

    proc qlim data=main;
      model yesvm = PUB1_2 PUB3_4 PUB5 PRIV YEARS SCHOOL loginc PTCON /
            type=bprobit covest=hess optmethod=nr;
      endogenous discrete=(yesvm 0 1);
      hetero PRIV;
    run;

The only difference from our original program is the inclusion of the command hetero PRIV. The output is as follows.

    The QLIM Procedure
    Binary Probit Estimates with Heteroscedasticity
    Algorithm converged.

    Model Fit Summary
    Dependent Variable           YESVM
    Number of Observations       95
    Log Likelihood               -52.27072
    Maximum Absolute Gradient    0.0001472
    Number of Iterations         14
    Optimization Method          Newton-Raphson
    AIC                          124.54143
    Schwarz Criterion            150.08020

    Discrete Response Profile
    Index   YESVM   Frequency   Percent
    0       0       36          37.89
    1       1       59          62.11

    Parameter Estimates
                                  Standard             Approx
    Parameter   DF   Estimate     Error      t Value   Pr > |t|   Gradient
    Intercept   1    -7.2962      5.8095     -1.26     0.2092     0.000014
    PUB1_2      1     0.001571    0.5628      0.00     0.9978     0.000021
    PUB3_4      1     0.3540      0.5928      0.60     0.5504     0.000013
    PUB5        1     1.1178      1.0963      1.02     0.3079     -0.00001
    PRIV        1    -0.4496      7.4650     -0.06     0.9520     8.574E-8
    YEARS       1    -0.006539    0.0179     -0.37     0.7148     -0.00001
    SCHOOL      1     1.9683      1.0146      1.94     0.0524     2.307E-6
    loginc      1     1.8180      0.5637      3.23     0.0013     0.000147
    PTCON       1    -1.5335      0.7258     -2.11     0.0346     0.0001
    HET1        1     5.8697      23.5334     0.25     0.8030     -2.68E-7

The estimate of $\theta$ is denoted HET1 in the output. It is natural to want to test whether heteroscedasticity is present, that is, to test the null hypothesis $H_0 : \theta = 0$ (homoscedasticity) versus $H_1 : \theta \neq 0$ (heteroscedasticity of the form given). This can be done in two ways using the output above:

- using the Wald test, which is just the square of the t statistic given in the output, that is, $W = (0.25)^2 = 0.0625$ with a p-value of 0.803;
- using the LR test, which is $LR = 2\{(-52.27072) - (-52.70244)\} = 0.8634$, where the second log likelihood is that of the homoscedastic probit reported earlier. Compared with a chi-square distribution with one degree of freedom, this gives a p-value of approximately 0.353.

Both statistics are insignificant at conventional levels, and so we fail to reject the null of homoscedasticity in this case.

Other forms of heteroscedasticity can be estimated using the following options in the hetero command; here we assume the heteroscedasticity is driven by $x_i'\theta$, where $x = (x_1, x_2)'$ are variables in the data set and $\theta = (\theta_1, \theta_2)'$ are unknown parameters.

- hetero x1 x2: $\sigma_i^2 = \exp(x_i'\theta)$; in this case the output contains coefficient estimates HET1 and HET2, which are estimates of $\theta_1$ and $\theta_2$ respectively;
- hetero x1 x2 / link=exp: $\sigma_i^2 = \exp(x_i'\theta)$ (i.e. the same as the previous case, and so exp is the default);
- hetero x1 x2 / link=linear: $\sigma_i^2 = x_i'\theta$;
- hetero x1 x2 / link=linear square: $\sigma_i^2 = (x_i'\theta)^2$; the square option can also be used with link=exp.
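The Wald and LR tests of homoscedasticity discussed above can be reproduced from the two sets of output. The Python sketch below is not part of the original handout. It assumes both statistics are compared with a chi-square distribution with one degree of freedom, for which the upper-tail probability P(chi2 > x) equals erfc(sqrt(x/2)); the function and variable names are illustrative.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability P(chi-square_1 > x), one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Log likelihoods read from the two Model Fit Summary tables.
logl_homoscedastic = -52.70244    # restricted model (theta = 0)
logl_heteroscedastic = -52.27072  # unrestricted model (hetero PRIV)

# LR statistic: twice the gain in log likelihood from relaxing theta = 0.
lr = 2.0 * (logl_heteroscedastic - logl_homoscedastic)
p_lr = chi2_sf_1df(lr)

# Wald statistic: squared ratio of the HET1 estimate to its standard error.
theta_hat, se_theta = 5.8697, 23.5334
wald = (theta_hat / se_theta) ** 2
p_wald = chi2_sf_1df(wald)

print(f"LR   = {lr:.4f}, p-value = {p_lr:.4f}")
print(f"Wald = {wald:.4f}, p-value = {p_wald:.4f}")
```

Because the Wald statistic here uses the unrounded estimate and standard error, it differs slightly from the value obtained by squaring the printed t value of 0.25. The p-value in each case is the upper-tail probability, so a large p-value means the data give little evidence against homoscedasticity.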