Quant Econ Pset 2: Logit

Hosein Joshaghani
Due date: February 20, 2017

The main goal of this problem set is to get used to logit, both its mechanics and its economics. To fully grasp this useful tool, we need a little statistics, mathematics, and economics.

1 Logit derivation

The goal of this question is to study the properties of the logit function and its derivation.

1.1 Gumbel distribution

Use the pdf and cdf of the Gumbel distribution to reproduce the following graph in Python.

[Figure: pdf and cdf of the Gumbel distribution]

Recall that for the type I extreme value (Gumbel) distribution we have

$$g(x) = e^{-x} e^{-e^{-x}}, \qquad G(x) = e^{-e^{-x}},$$

with mean $\gamma$, the Euler-Mascheroni constant, and variance $\pi^2/6$.
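As a starting point for 1.1, the sketch below codes the Gumbel pdf and cdf and checks the stated moments by numerical integration. It is only a sketch under assumptions: the helper `trapezoid` and the truncation bounds `LO` and `HI` are illustrative choices that do not come from the problem set, and the standard library is used so the block runs without extra packages.

```python
import math

def gumbel_pdf(x):
    """Density g(x) = exp(-x) * exp(-exp(-x)) of the standard Gumbel."""
    return math.exp(-x - math.exp(-x))

def gumbel_cdf(x):
    """Distribution function G(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

def trapezoid(f, a, b, n=20000):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

# Assumption: the probability mass outside [LO, HI] is negligible.
LO, HI = -10.0, 40.0
total_mass = trapezoid(gumbel_pdf, LO, HI)
mean = trapezoid(lambda x: x * gumbel_pdf(x), LO, HI)
variance = trapezoid(lambda x: (x - mean) ** 2 * gumbel_pdf(x), LO, HI)

print("integral of pdf:", total_mass)
print("mean           :", mean)
print("variance       :", variance, "vs pi^2/6 =", math.pi ** 2 / 6)
```

Evaluating `gumbel_pdf` and `gumbel_cdf` on a grid and passing the values to a plotting library of your choice gives the graph the question asks for.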

1.2 Gumbel - Gumbel = Logistic ≈ Normal!

Show that the difference between two extreme value variables follows the logistic distribution. That is, if $\varepsilon_{ni}$ and $\varepsilon_{nj}$ are iid extreme value, then $\varepsilon = \varepsilon_{ni} - \varepsilon_{nj}$ follows the logistic distribution:

$$f_\varepsilon(s) = \frac{e^{s}}{(1 + e^{s})^2}, \qquad F_\varepsilon(s) = \frac{e^{s}}{1 + e^{s}}.$$

Hint:

$$\begin{aligned}
F_\varepsilon(s) &= \mathrm{Prob}(\varepsilon_{ni} - \varepsilon_{nj} < s) = \mathrm{Prob}(\varepsilon_{ni} < s + \varepsilon_{nj}) \\
&= \int_{-\infty}^{\infty} \mathrm{Prob}(\varepsilon_{ni} < s + \varepsilon_{nj} \mid \varepsilon_{nj})\, g(\varepsilon_{nj})\, d\varepsilon_{nj} \\
&= \int_{-\infty}^{\infty} G(s + \varepsilon_{nj})\, g(\varepsilon_{nj})\, d\varepsilon_{nj} \\
&= \int_{-\infty}^{\infty} e^{-e^{-(s + \varepsilon_{nj})}}\, e^{-\varepsilon_{nj}}\, e^{-e^{-\varepsilon_{nj}}}\, d\varepsilon_{nj} \\
&= \dots \\
&= \frac{e^{s}}{1 + e^{s}}
\end{aligned}$$

Now draw the pdf and cdf of the logistic distribution and compare them with the normal. Convince yourself that extreme value errors and independent normal errors are nearly indistinguishable empirically. (Note: Don't forget the variance of the logistic distribution. What is the variance of Gumbel - Gumbel, if they are independent?)

Finally, explain why the binary logit model is derived as follows:

$$P_{n1} = \frac{e^{(X_{n1} - X_{n0})\beta}}{1 + e^{(X_{n1} - X_{n0})\beta}}, \qquad P_{n0} = 1 - P_{n1}$$

1.3 Derivation of Multinomial Logit Formula

Show that if the unobserved component of utility is distributed iid extreme value (i.e. Gumbel) for each alternative, then the choice probabilities take the form

$$P_{ni} = \frac{e^{V_{ni}}}{\sum_j e^{V_{nj}}} \qquad (1)$$

Hint: Start from the conditional probability $P_{ni} \mid \varepsilon_{ni}$, that is, the probability that alternative $i$ is chosen given the unobserved utility of choice $i$, and use the independence assumption to show that

$$P_{ni} \mid \varepsilon_{ni} = \prod_{j \neq i} e^{-e^{-(\varepsilon_{ni} + V_{ni} - V_{nj})}}$$

Then use the same method used for the binary logit model in problem 1.2 to derive the multinomial logit formula of equation (1).
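Before deriving equation (1) analytically, it can help to see it hold in a simulation. The sketch below draws iid Gumbel errors, records which alternative has the highest utility, and compares the simulated shares with the closed-form logit probabilities. The utilities in `V`, the seed, and the number of draws are hypothetical values chosen only for illustration.

```python
import math
import random

def draw_gumbel(rng):
    """Draw from the standard Gumbel by inverting G(x) = exp(-exp(-x))."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
        u = rng.random()
    return -math.log(-math.log(u))

def logit_probs(V):
    """Closed-form logit probabilities e^{V_i} / sum_j e^{V_j}."""
    expV = [math.exp(v) for v in V]
    total = sum(expV)
    return [e / total for e in expV]

def simulated_shares(V, n_draws, rng):
    """Share of draws in which each alternative has the highest utility."""
    counts = [0] * len(V)
    for _ in range(n_draws):
        utils = [v + draw_gumbel(rng) for v in V]
        counts[utils.index(max(utils))] += 1
    return [c / n_draws for c in counts]

V = [0.5, 0.0, -1.0]      # hypothetical representative utilities V_nj
rng = random.Random(0)    # fixed seed so the run is reproducible
shares = simulated_shares(V, 100_000, rng)
probs = logit_probs(V)
for i, (s, p) in enumerate(zip(shares, probs)):
    print(f"alternative {i}: simulated {s:.4f}, formula {p:.4f}")
```

With two alternatives the same experiment illustrates the binary case of 1.2, since only the difference of the two Gumbel draws matters for the choice.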

1.4 Derivatives and Elasticities

How does the choice probability $P_{ni}$ respond to a change in one of the characteristics of alternative $i$? For instance, we may want to predict how the market share of a product responds to an improvement in its performance. To answer these sorts of questions we need derivatives and elasticities. For the logit model of equation (1) show that

$$\frac{\partial P_{ni}}{\partial z_{ni}} = \frac{\partial V_{ni}}{\partial z_{ni}} P_{ni} (1 - P_{ni}),$$

where $V_{ni}$ depends on $z_{ni}$.

However, economists often measure response by elasticities rather than derivatives, since elasticities are normalized for the variables' units. An elasticity is the percentage change in one variable that is associated with a one-percent change in another variable. Show that the elasticity of $P_{ni}$ with respect to $z_{ni}$, a variable entering the utility of alternative $i$, is

$$E_{i z_{ni}} = \frac{\partial P_{ni}}{\partial z_{ni}} \frac{z_{ni}}{P_{ni}} = \frac{\partial V_{ni}}{\partial z_{ni}} z_{ni} (1 - P_{ni})$$

Other useful terms are the cross derivative and the cross elasticity, which capture the response of $P_{ni}$ to changes in the characteristics of an alternative $j \neq i$. Show that:

$$\frac{\partial P_{ni}}{\partial z_{nj}} = -\frac{\partial V_{nj}}{\partial z_{nj}} P_{ni} P_{nj}$$

$$E_{i z_{nj}} = \frac{\partial P_{ni}}{\partial z_{nj}} \frac{z_{nj}}{P_{ni}} = -\frac{\partial V_{nj}}{\partial z_{nj}} z_{nj} P_{nj}$$

2 Maximum Likelihood Estimation

A sample of $N$ decision makers is obtained for the purpose of estimation.

2.1 Log Likelihood function

Since the logit probabilities take a closed form, the traditional maximum-likelihood procedures can be applied. Show that the log likelihood function is then

$$LL(\beta) = \sum_{n=1}^{N} \sum_{i} y_{ni} \ln P_{ni} \qquad (2)$$

where $y_{ni} = 1$ if decision maker $n$ chose alternative $i$ and $0$ otherwise.

2.2 First order condition

Show that, when utility is linear in parameters ($V_{ni} = \beta' x_{ni}$), the first order condition for the problem $\max_\beta LL(\beta)$ is given by:

$$\sum_{n=1}^{N} \sum_{i} (y_{ni} - P_{ni})\, x_{ni} = 0 \qquad (3)$$

What is the interpretation of this equation? Show that the maximum likelihood estimates of $\beta$ are those that make the predicted average of each explanatory variable equal to the observed average in the sample.
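The first order condition (3) can be checked numerically. The sketch below simulates choices from a logit model with linear utility, maximizes the average log likelihood by plain gradient ascent, and then compares the observed and predicted averages of each attribute. Everything about the data is an assumption made for illustration: the sample size, the number of alternatives and attributes, `beta_true`, the uniform attribute draws, and the step size of 1.0. Gradient ascent is used only because it needs no extra packages; it is not the method the problem set prescribes.

```python
import math
import random

def probs(beta, X_n):
    """Logit probabilities for one decision maker; X_n lists one attribute vector per alternative."""
    v = [sum(b * x for b, x in zip(beta, x_i)) for x_i in X_n]
    m = max(v)  # subtract the max for numerical stability
    e = [math.exp(v_i - m) for v_i in v]
    s = sum(e)
    return [e_i / s for e_i in e]

def score(beta, X, y):
    """Average gradient of LL: (1/N) sum_n sum_i (y_ni - P_ni) x_ni."""
    K = len(beta)
    g = [0.0] * K
    for X_n, y_n in zip(X, y):
        P_n = probs(beta, X_n)
        for i, x_i in enumerate(X_n):
            resid = (1.0 if i == y_n else 0.0) - P_n[i]
            for k in range(K):
                g[k] += resid * x_i[k]
    return [g_k / len(X) for g_k in g]

# Simulated data (all values hypothetical): N people, J alternatives, K attributes.
rng = random.Random(1)
N, J, K = 500, 3, 2
beta_true = [1.0, -0.5]
X = [[[rng.uniform(-1, 1) for _ in range(K)] for _ in range(J)] for _ in range(N)]
y = [rng.choices(range(J), weights=probs(beta_true, X_n))[0] for X_n in X]

# Gradient ascent on the average log likelihood, which is globally concave in beta.
beta_hat = [0.0] * K
for _ in range(2000):
    g = score(beta_hat, X, y)
    if max(abs(g_k) for g_k in g) < 1e-10:
        break
    beta_hat = [b + 1.0 * g_k for b, g_k in zip(beta_hat, g)]

# Observed average uses the chosen alternative; predicted average weights by P_ni.
observed = [sum(X_n[y_n][k] for X_n, y_n in zip(X, y)) / N for k in range(K)]
predicted = [sum(p * x_i[k] for X_n in X for p, x_i in zip(probs(beta_hat, X_n), X_n)) / N
             for k in range(K)]
print("beta_hat          :", beta_hat)
print("observed averages :", observed)
print("predicted averages:", predicted)
```

At the maximum the two sets of averages agree up to the convergence tolerance, which is exactly what equation (3) states once it is divided by $N$.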

2.3 Goodness of fit

The likelihood ratio index is defined as

$$\rho = 1 - \frac{LL(\hat\beta)}{LL(0)} \qquad (4)$$

In two sentences, interpret this index. Is it similar to $R^2$ in linear regression? For instance, can you compare the likelihood ratio indices from two models estimated on two different datasets? (Hint: the answer is NO! Explain why.)

3 Logit in action

We analyze data on supplementary health insurance coverage. Although you can do all of this exercise in Python, it is more efficient to do it in Stata. You may want to replicate this exercise in Python later.

The data come from wave 5 (2002) of the Health and Retirement Study (HRS), a panel survey sponsored by the National Institute on Aging. The sample is restricted to Medicare beneficiaries. The HRS contains information on a variety of medical service uses. The elderly can obtain supplementary insurance coverage either by purchasing it themselves or by joining employer-sponsored plans. We use the data to analyze the purchase of private insurance (ins) from any source, including private markets or associations. The insurance coverage broadly measures both individually purchased and employer-sponsored private supplementary insurance, and includes Medigap plans and other policies.

Explanatory variables include health status, socioeconomic characteristics, and spouse-related information. Self-assessed health-status information is used to generate a dummy variable (hstatusg) that measures whether health status is good, very good, or excellent. Other measures of health status are the number of limitations (up to five) on activities of daily living (adl) and the total number of chronic conditions (chronic). Socioeconomic variables used are age, gender, race, ethnicity, marital status, years of education, and retirement status (respectively, age, female, white, hisp, married, educyear, retire); household income (hhincome); and log household income if positive (linc). Spouse retirement status (sretire) is an indicator variable equal to 1 if a retired spouse is present.

3.1 Logit Estimation

Using the logit command in Stata, estimate a logit model for Pr(ins = 1) in which the value of having insurance depends linearly on household income (hhincome) and socioeconomic variables (age, female, white, hisp, married, educyear, retire). Interpret the estimated coefficients and the reported log likelihood.

3.2 Comparing Predicted Outcome with Actual Outcome

Now we want to compare the predicted outcome from this simple logit model with the actual outcome. Use Stata's predict command to store predicted values for the probability of having insurance from a simple logit model in which hhincome is the only explanatory variable. Then estimate the predicted outcome from an OLS regression of the same model and compare the results. Interpret. Hint: Your result should look like the following.

[Figure: predicted outcomes from the logit and OLS models compared with the actual outcome]

3.3 Logit in Python (optional)

Repeat the same exercise using Python.

4 Consumer Surplus

For policy analysis, the researcher is often interested in measuring the change in consumer surplus that is associated with a particular policy. For example, if a new alternative is being considered, such as building a light rail system in a city, then it is important to measure the benefits of the project to see if they warrant the costs. Similarly, a change in the attributes of an alternative can have an impact on consumer surplus that is important to assess. Degradation of the water quality of rivers harms the anglers who can no longer fish as effectively at the damaged sites. Measuring this harm in monetary terms is a central element of legal action against the polluter.

Under the logit assumptions, the consumer surplus associated with a set of alternatives takes a closed form that is easy to calculate. By definition, a person's consumer surplus is the utility, in dollar terms, that the person receives in the choice situation. The decision maker chooses the alternative that provides the greatest utility. Consumer surplus is therefore

$$CS_n = \frac{1}{\alpha_n} \max_j (U_{nj}),$$

where $\alpha_n$ is the marginal utility of income: $dU_n / dY_n = \alpha_n$. The researcher does not observe $U_{nj}$ and therefore cannot use this expression to calculate the decision maker's consumer surplus. Instead, the researcher observes $V_{nj}$ and knows the distribution of the remaining portion of utility. With this information, the researcher is able to calculate the expected consumer surplus:

$$E(CS_n) = \frac{1}{\alpha_n} E\left[\max_j (V_{nj} + \varepsilon_{nj})\right]$$

4.1 Log-sum term

Assume utility is linear in income (hence $\alpha_n$ is constant with respect to income). Show that if each $\varepsilon_{nj}$ is iid extreme value, then

$$E(CS_n) = \frac{1}{\alpha_n} \ln\left( \sum_{j=1}^{J} e^{V_{nj}} \right) + C$$

where $C$ is an unknown constant that represents the fact that the absolute level of utility cannot be measured. Hint: read chapter 4 of Small and Rosen (1981).

Note that the argument in parentheses in this expression is the denominator of the logit choice probability. Aside from the division and addition of constants, expected consumer surplus in a logit model is simply the log of the denominator of the choice probability. It is often called the log-sum term.

4.2 Marginal Utility of Income

It is important to measure the marginal utility of income, $\alpha_n$, for welfare analysis. In many choice models we use the price of the products as one of the attributes of the product. Do you have any suggestion for measuring $\alpha_n$? Explain the assumptions you need in order to measure the marginal utility of income.
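The log-sum result can be illustrated with a short simulation. The sketch below computes the log-sum term, uses it to value the addition of a new alternative, and checks by Monte Carlo that the expected maximum utility under iid standard Gumbel errors equals the log-sum plus the Euler-Mascheroni constant, a term that is absorbed into $C$. The utilities, the new alternative, and the value of `alpha` are hypothetical and chosen only for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # mean of the standard Gumbel distribution

def logsum(V):
    """Log of the logit denominator, ln(sum_j exp(V_j)), computed stably."""
    m = max(V)
    return m + math.log(sum(math.exp(v - m) for v in V))

def delta_expected_cs(V_before, V_after, alpha):
    """Change in E(CS_n); the unknown constant C cancels in the difference."""
    return (logsum(V_after) - logsum(V_before)) / alpha

def simulated_expected_max(V, n_draws, rng):
    """Monte Carlo estimate of E[max_j (V_j + eps_j)] with iid Gumbel eps."""
    total = 0.0
    for _ in range(n_draws):
        best = -math.inf
        for v in V:
            u = rng.random()
            while u == 0.0:  # avoid log(0)
                u = rng.random()
            best = max(best, v - math.log(-math.log(u)))
        total += best
    return total / n_draws

V_before = [1.0, 0.5]       # hypothetical utilities of the existing alternatives
V_after = [1.0, 0.5, 0.8]   # same set plus a hypothetical new alternative
alpha = 0.1                 # hypothetical marginal utility of income

sim = simulated_expected_max(V_after, 200_000, random.Random(2))
print("simulated E[max]:", sim)
print("logsum + gamma  :", logsum(V_after) + EULER_GAMMA)
print("change in E(CS) :", delta_expected_cs(V_before, V_after, alpha))
```

Because $C$ cancels when two choice situations are compared, only the change in the log-sum term, scaled by $1/\alpha_n$, matters for policy analysis.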