Discrete Choice Modeling


[Part 1] 1/15
0 Introduction
1 Summary
2 Binary Choice
3 Panel Data
4 Bivariate Probit
5 Ordered Choice
6 Count Data
7 Multinomial Choice
8 Nested Logit
9 Heterogeneity
10 Latent Class
11 Mixed Logit
12 Stated Preference
13 Hybrid Choice
William Greene
Stern School of Business
New York University

[Part 1] 2/15 Objectives in Model Building
Specification: guided by underlying theory
  Modeling framework
  Functional forms
Estimation: coefficients, partial effects, model implications
Statistical inference: hypothesis testing
Prediction: individual and aggregate
Model assessment (fit, adequacy) and evaluation
Model extensions
  Interdependencies, multiple part models
  Heterogeneity
  Endogeneity and causal inference
Exploration: estimation and inference methods

[Part 1] 3/15 Regression Basics
The MODEL
  Modeling the conditional mean: regression
Other features of interest
  Modeling quantiles
  Conditional variances or covariances
  Modeling probabilities for discrete choice
  Modeling other features of the population

[Part 1] 4/15 Application: Health Care Usage
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Data downloaded from the Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. This is a large data set: there are 27,326 observations altogether. The number of observations per individual ranges from 1 to 7. (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987.)
Variables in the file are
  DOCTOR   = 1(Number of doctor visits > 0)
  HOSPITAL = 1(Number of hospital visits > 0)
  HSAT     = health satisfaction, coded 0 (low) - 10 (high)
  DOCVIS   = number of doctor visits in last three months
  HOSPVIS  = number of hospital visits in last calendar year
  PUBLIC   = insured in public health insurance = 1; otherwise = 0
  ADDON    = insured by add-on insurance = 1; otherwise = 0
  HHNINC   = household nominal monthly net income in German marks / 10000 (4 observations with income = 0 were dropped)
  HHKIDS   = children under age 16 in the household = 1; otherwise = 0
  EDUC     = years of schooling
  AGE      = age in years
  MARRIED  = marital status
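The variable definitions and the unbalanced panel layout above can be illustrated with a short sketch. The records below are hypothetical toy values, not rows from the JAE data set; the sketch only shows how an indicator such as DOCTOR = 1(DOCVIS > 0) is built and how group-size frequencies relate to the number of individuals and the number of observations.

```python
from collections import Counter

# Hypothetical toy records: (person id, number of doctor visits).
# These are illustrative values, not rows from the actual data file.
records = [
    (1, 0), (1, 2), (1, 1),   # person 1 observed in 3 periods
    (2, 0),                   # person 2 observed in 1 period
    (3, 4), (3, 0),           # person 3 observed in 2 periods
]

# DOCTOR = 1(DOCVIS > 0): binary indicator derived from the count.
doctor = [1 if docvis > 0 else 0 for _, docvis in records]

# Number of observations per individual (the group sizes).
obs_per_person = Counter(pid for pid, _ in records)

# Frequency of each group size: how many individuals have T = 1, 2, 3, ...
size_freq = Counter(obs_per_person.values())

# In an unbalanced panel, the number of groups is the sum of the
# frequencies, and the number of observations is sum(size * frequency).
n_groups = sum(size_freq.values())
n_obs = sum(size * count for size, count in size_freq.items())

print(doctor, dict(size_freq), n_groups, n_obs)
```

The same two sums can be used as a quick sanity check on any reported table of group-size frequencies.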

[Part 1] 5/15 Household Income
[Figure: kernel density estimator and histogram of household income]

[Part 1] 6/15 Regression: Income on Education
----------------------------------------------------------------------
Ordinary least squares regression
LHS=LOGINC    Mean                 =    -.92882
              Standard deviation   =     .47948
              Number of observs.   =        887
Model size    Parameters           =          2
              Degrees of freedom   =        885
Residuals     Sum of squares       =  183.19359
              Standard error of e  =     .45497
Fit           R-squared            =     .10064
              Adjusted R-squared   =     .09962
Model test    F[ 1, 885] (prob)    =  99.0 (.0000)
Diagnostic    Log likelihood       = -559.06527
              Restricted(b=0)      = -606.10609
              Chi-sq [ 1] (prob)   =  94.1 (.0000)
Info criter.  LogAmemiya Prd. Crt. =   -1.57279
----------------------------------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
Constant   -1.71604***   .08057           -21.299    .0000
EDUC         .07176***   .00721             9.951    .0000      10.9707
Note: ***, **, * = Significance at 1%, 5%, 10% level.
----------------------------------------------------------------------
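As a rough check on how the fit statistics in this output relate to one another, the sketch below recomputes the model F statistic from R-squared and the likelihood ratio chi-squared from the two log likelihoods. It assumes the standard textbook formulas, F = (R²/k) / ((1 - R²)/(n - k - 1)) and LR = 2(logL - logL_restricted); the variable names are illustrative, and the inputs are copied from the output above.

```python
# Values copied from the regression of LOGINC on EDUC.
n = 887                       # number of observations
k = 1                         # number of slope coefficients (EDUC)
r_squared = 0.10064
loglik = -559.06527           # log likelihood of the fitted model
loglik_restricted = -606.10609  # log likelihood with slopes set to zero

# Model F statistic: tests that all slope coefficients are zero.
f_stat = (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))

# Likelihood ratio statistic for the same hypothesis.
lr_stat = 2.0 * (loglik - loglik_restricted)

print(round(f_stat, 1), round(lr_stat, 1))
```

Both values agree with the reported F[1, 885] and Chi-sq[1] up to rounding of the printed inputs.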

[Part 1] 7/15 Specification and Functional Form
----------------------------------------------------------------------
Ordinary least squares regression
LHS=LOGINC    Mean                 =    -.92882
              Standard deviation   =     .47948
              Number of observs.   =        887
Model size    Parameters           =          3
              Degrees of freedom   =        884
Residuals     Sum of squares       =  183.00347
              Standard error of e  =     .45499
Fit           R-squared            =     .10157
              Adjusted R-squared   =     .09954
Model test    F[ 2, 884] (prob)    =  50.0 (.0000)
Diagnostic    Log likelihood       = -558.60477
              Restricted(b=0)      = -606.10609
              Chi-sq [ 2] (prob)   =  95.0 (.0000)
Info criter.  LogAmemiya Prd. Crt. =   -1.57158
----------------------------------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
Constant   -1.68303***   .08763           -19.207    .0000
EDUC         .06993***   .00746             9.375    .0000      10.9707
FEMALE      -.03065      .03199             -.958    .3379      .42277
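One way to read the two specifications side by side is through the F test for the single added regressor, FEMALE. The sketch below assumes the usual restricted versus unrestricted sum-of-squares form of the F statistic and compares it with the square of the reported t ratio; with one restriction the two should coincide up to rounding. The sums of squares are copied from the EDUC-only output (restricted) and the EDUC plus FEMALE output (unrestricted).

```python
# Residual sums of squares copied from the two regression outputs.
ssr_restricted = 183.19359     # LOGINC on constant and EDUC
ssr_unrestricted = 183.00347   # LOGINC on constant, EDUC and FEMALE
df_unrestricted = 884          # n - number of parameters = 887 - 3
n_restrictions = 1             # only the FEMALE coefficient is restricted

# F statistic for H0: coefficient on FEMALE equals zero.
f_stat = ((ssr_restricted - ssr_unrestricted) / n_restrictions) / (
    ssr_unrestricted / df_unrestricted
)

# Reported t ratio (b/St.Er.) for FEMALE; its square should match F.
t_female = -0.958
t_squared = t_female ** 2

print(round(f_stat, 3), round(t_squared, 3))
```

Both numbers are close to 0.92, well below conventional critical values, which is consistent with the reported p-value of .3379 for FEMALE.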

[Part 1] 8/15 Interesting Partial Effects
----------------------------------------------------------------------
Ordinary least squares regression
LHS=LOGINC    Mean                 =    -.92882
              Standard deviation   =     .47948
              Number of observs.   =        887
Model size    Parameters           =          5
              Degrees of freedom   =        882
Residuals     Sum of squares       =  171.87964
              Standard error of e  =     .44145
Fit           R-squared            =     .15618
              Adjusted R-squared   =     .15235
Model test    F[ 4, 882] (prob)    =  40.8 (.0000)
Diagnostic    Log likelihood       = -530.79258
              Restricted(b=0)      = -606.10609
              Chi-sq [ 4] (prob)   = 150.6 (.0000)
Info criter.  LogAmemiya Prd. Crt. =   -1.62978
----------------------------------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
Constant   -5.26676***   .56499            -9.322    .0000
EDUC         .06469***   .00730             8.860    .0000      10.9707
FEMALE      -.03683      .03134            -1.175    .2399      .42277
AGE          .15567***   .02297             6.777    .0000      50.4780
AGE^2       -.00161***   .00023            -7.014    .0000      2620.79

Partial effect of age:
\frac{\partial E[\log Income \mid x]}{\partial Age} = \beta_{Age} + 2 \beta_{Age^2} \, Age
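Because AGE enters the regression both linearly and squared, its partial effect on expected log income is not a single coefficient but the derivative b_AGE + 2 * b_AGE2 * Age, which changes with age. The sketch below evaluates that expression with the reported estimates; the function and variable names are illustrative, and the coefficients are rounded as printed, so the results are approximate.

```python
# Estimated coefficients copied from the regression output (rounded as printed).
b_age = 0.15567     # coefficient on AGE
b_age2 = -0.00161   # coefficient on AGE squared


def partial_effect_age(age):
    """Derivative of expected log income with respect to age."""
    return b_age + 2.0 * b_age2 * age


# Age at which the partial effect is zero (peak of the quadratic profile).
turning_point = -b_age / (2.0 * b_age2)

for age in (30, 40, 50.478, 60):
    print(age, round(partial_effect_age(age), 5))
print("turning point:", round(turning_point, 2))
```

With these rounded estimates the effect is positive at younger ages, crosses zero at roughly age 48, and is slightly negative at the sample mean age of 50.478.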

[Part 1] 9/15
[Figure: log income as a function of Age; partial effect with respect to Age]

[Part 1] 10/15 Modeling Categorical Variables
Theoretical foundations
Econometric methodology
  Models
  Statistical bases
  Econometric methods
Applications

[Part 1] 11/15 Categorical Variables
Observed outcomes
  Inherently discrete: number of occurrences, e.g., family size
  Multinomial: the observed outcome indexes a set of unordered labeled choices
  Implicitly continuous: the observed data are discrete by construction, e.g., revealed preferences; our main subject
  Discrete, cardinal: counts of occurrences
Implications
  For model building
  For analysis and prediction of behavior

[Part 1] 12/15 Simple Binary Choice: Insurance

[Part 1] 13/15 Ordered Outcome Self Reported Health Satisfaction

[Part 1] 14/15 Counts of Occurrences

[Part 1] 15/15 Multinomial Unordered Choice