Lecture 1: Logit. Quantitative Methods for Economic Analysis. Seyed Ali Madani Zadeh and Hosein Joshaghani. Sharif University of Technology

February 2017

Road map
1. Discrete Choice Models
2. Binary Logit
3. Logit
4. Power and Limitations of Logit
   - Taste Variation
   - Independence from Irrelevant Alternatives
   - Dynamic models
5. Derivatives and Elasticities
6. Estimation: Maximum Likelihood (ML)
7. Review of a few case studies
   - Transportation: McFadden et al. (1977)
   - Health:
   - Education:
   - Labor: Card (2001) JLE, Sicherman (1991) JLE

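General Discrete Choice Model
Discrete choice models are usually derived under an assumption of utility-maximizing behavior by the decision maker.
Utility = representative utility (observable) + error (unobservable):
$$U_{ni} = V_{ni} + \varepsilon_{ni}$$
Choice probability:
$$\begin{aligned}
P_{ni} &= \text{Prob}(U_{ni} > U_{nj} \ \forall j \neq i) \\
&= \text{Prob}(V_{ni} + \varepsilon_{ni} > V_{nj} + \varepsilon_{nj} \ \forall j \neq i) \\
&= \text{Prob}(\varepsilon_{nj} - \varepsilon_{ni} < V_{ni} - V_{nj} \ \forall j \neq i) \\
&= \int_{\varepsilon} I(\varepsilon_{nj} - \varepsilon_{ni} < V_{ni} - V_{nj} \ \forall j \neq i) \, f(\varepsilon) \, d\varepsilon
\end{aligned}$$
where $I(\cdot)$ is an indicator function, $\varepsilon = \varepsilon_{nj} - \varepsilon_{ni}$, and $f(\cdot)$ is the pdf of $\varepsilon$.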

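Binary Logit Model
Only two choices: i is either 1 or 2.
Type I Extreme Value (Gumbel) distribution:
$$f(\varepsilon_{ni}) = e^{-\varepsilon_{ni}} e^{-e^{-\varepsilon_{ni}}}, \qquad F(\varepsilon_{ni}) = e^{-e^{-\varepsilon_{ni}}}$$
The difference between two extreme value variables follows the logistic distribution. That is, if $\varepsilon_{ni}$ and $\varepsilon_{nj}$ are iid extreme value, then $\varepsilon = \varepsilon_{nj} - \varepsilon_{ni}$ follows the logistic distribution:
$$f_{\varepsilon}(s) = \frac{e^{s}}{(1 + e^{s})^{2}}, \qquad F_{\varepsilon}(s) = \frac{e^{s}}{1 + e^{s}}$$
Proof: see problem set 2!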
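The claim that the difference of two iid extreme value errors is logistic can be checked by simulation before working through the proof. The following is a minimal sketch, not part of the original slides: the seed, the number of draws, and the evaluation grid are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed (assumption)
n_draws = 200_000                # arbitrary sample size (assumption)

# Two independent standard Gumbel (type I extreme value) samples.
eps_i = rng.gumbel(loc=0.0, scale=1.0, size=n_draws)
eps_j = rng.gumbel(loc=0.0, scale=1.0, size=n_draws)
diff = eps_j - eps_i

def logistic_cdf(s):
    # F(s) = e^s / (1 + e^s), written in an equivalent form.
    return 1.0 / (1.0 + np.exp(-s))

# Compare the empirical CDF of the difference with the logistic CDF.
grid = np.linspace(-4.0, 4.0, 9)
empirical = np.array([(diff <= s).mean() for s in grid])
max_gap = np.max(np.abs(empirical - logistic_cdf(grid)))
print(f"largest gap between empirical and logistic CDF: {max_gap:.4f}")
```

The gap should shrink toward zero as the number of draws grows, which is what the analytical result predicts.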

Binary Logit Model
Closed-form solution for the choice probability:
$$P_{ni} = F_{\varepsilon}(V_{ni} - V_{nj}) = \frac{e^{V_{ni} - V_{nj}}}{1 + e^{V_{ni} - V_{nj}}}$$
Linear assumption: $V_{ni} = X_{ni}'\beta$
$$P_{ni} = \frac{e^{(X_{ni} - X_{nj})'\beta}}{1 + e^{(X_{ni} - X_{nj})'\beta}}, \qquad P_{nj} = 1 - P_{ni}$$

Binary Logit Model
Two main assumptions:
- Distributional assumption: error differences are distributed logistic.
- Independence: the unobserved portion of utility for one alternative is unrelated to the unobserved portion of utility for another alternative.
The ultimate goal of the researcher is to represent utility so well that the only remaining aspects constitute simply white noise; that is, the goal is to specify utility well enough that a logit model is appropriate. Seen in this way, the logit model is the ideal rather than a restriction.

Binary Logit Model: Misspecification
If you think that the unobserved portion of utility is correlated over alternatives given your specification of representative utility, then you have three options:
1. use a different model that allows for correlated errors, such as those described later (Probit, Mixed Logit, GEV),
2. respecify representative utility so that the source of the correlation is captured explicitly and thus the remaining errors are independent,
3. use the logit model under the current specification of representative utility, considering the model to be an approximation.

Binary Logit Model: Improvements
Improving bus service in areas where the service is so poor that few travelers take the bus would be less effective than making the same improvement in areas where bus service is already sufficiently good to induce a moderate share of travelers to choose it (but not so good that nearly everyone does).
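The bus-service point follows from the S-shape of the logistic curve: the same increase in representative utility moves the choice probability most when the probability is near one half. A small sketch makes this concrete; the three baseline utility differences and the size of the improvement are hypothetical numbers chosen only for illustration.

```python
import math

def binary_logit_prob(v_diff):
    """P_ni = exp(V_ni - V_nj) / (1 + exp(V_ni - V_nj))."""
    return 1.0 / (1.0 + math.exp(-v_diff))

improvement = 0.5   # hypothetical rise in the bus's representative utility

# Hypothetical baseline utility differences (bus minus car) in three areas.
areas = [("poor service", -4.0), ("moderate service", 0.0), ("excellent service", 4.0)]

gains = {}
for label, v_diff in areas:
    before = binary_logit_prob(v_diff)
    after = binary_logit_prob(v_diff + improvement)
    gains[label] = after - before
    print(f"{label}: bus share {before:.3f} -> {after:.3f} (gain {gains[label]:.3f})")
```

Under these assumed numbers the gain in bus share is largest in the moderate-service area and small at both extremes.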

Multinomial Logit Model: McFadden (1974)
Given independence, for each given $\varepsilon_{ni}$, the probability that decision maker n chooses i is:
$$P_{ni} \mid \varepsilon_{ni} = \text{Prob}(\varepsilon_{nj} < \varepsilon_{ni} + V_{ni} - V_{nj} \ \forall j \neq i) = \prod_{j \neq i} e^{-e^{-(\varepsilon_{ni} + V_{ni} - V_{nj})}}$$
Notice the role of the independence and distributional assumptions.

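Multinomial Logit Model: McFadden (1974)
Of course, $\varepsilon_{ni}$ is not given, so the choice probability is
$$P_{ni} = \int \left( \prod_{j \neq i} e^{-e^{-(\varepsilon_{ni} + V_{ni} - V_{nj})}} \right) e^{-\varepsilon_{ni}} e^{-e^{-\varepsilon_{ni}}} \, d\varepsilon_{ni} = \frac{e^{V_{ni}}}{\sum_{j} e^{V_{nj}}}$$
Proof: see problem set 2!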
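As a sanity check on the closed form (not a substitute for the proof), the integral over the Gumbel density can be evaluated numerically and compared with the logit formula. This is a sketch under stated assumptions: the utilities in `V`, the integration limits, and the grid size are hypothetical choices, and the integral uses a plain trapezoid rule.

```python
import numpy as np

V = np.array([1.0, 0.5, -0.3])   # hypothetical representative utilities
i = 0                            # alternative whose probability we compute

def closed_form(V, i):
    # Logit formula e^{V_i} / sum_j e^{V_j}, shifted by max(V) for stability.
    expV = np.exp(V - V.max())
    return expV[i] / expV.sum()

def by_integration(V, i, lo=-10.0, hi=30.0, n=200_001):
    eps = np.linspace(lo, hi, n)
    density = np.exp(-eps) * np.exp(-np.exp(-eps))        # Gumbel pdf of eps_i
    prod = np.ones_like(eps)
    for j in range(len(V)):
        if j != i:
            # Gumbel CDF evaluated at eps_i + V_i - V_j.
            prod *= np.exp(-np.exp(-(eps + V[i] - V[j])))
    integrand = prod * density
    # Trapezoid rule written out by hand.
    return np.sum((integrand[1:] + integrand[:-1]) / 2.0 * np.diff(eps))

p_closed = closed_form(V, i)
p_numeric = by_integration(V, i)
print(p_closed, p_numeric)
```

The two numbers should agree to several decimal places for any choice of utilities.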

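Identification: The Scale Parameter
So far, we have assumed that $\varepsilon_{ni}$ has a Gumbel distribution, whose variance is $\pi^2/6$. What if in our model $\varepsilon_{ni}$ has a different variance, say $\sigma^2 \pi^2/6$?
$$U_{ni} = V_{ni} + \varepsilon_{ni}$$
Since the scale of utility is irrelevant to behavior, utility can be divided by $\sigma$ without changing behavior; the rescaled error $\varepsilon_{ni}/\sigma$ then has variance $\pi^2/6$. Under the linear specification, the choice probabilities are
$$P_{ni} = \frac{e^{(\beta/\sigma)' x_{ni}}}{\sum_{j} e^{(\beta/\sigma)' x_{nj}}}$$
The parameter $\sigma$ is called the scale parameter.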

Identification: The Scale Parameter
- Only $\beta/\sigma$ can be estimated; $\beta$ and $\sigma$ are NOT separately identified.
- The parameters $\beta$ are estimated, but these estimated parameters are actually estimates of the original coefficients $\beta$ divided by the scale parameter $\sigma$.
- The coefficients that are estimated indicate the effect of each observed variable relative to the variance of the unobserved factors.
- A larger variance in unobserved factors leads to smaller coefficients, even if the observed factors have the same effect on utility.
- Poor specification ⇒ higher variance of unobserved factors ⇒ smaller coefficients.
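The identification argument can be illustrated numerically: multiplying both the coefficients and the scale parameter by the same constant leaves every choice probability unchanged, while raising the scale alone pulls the probabilities toward equal shares. The sketch below assumes hypothetical attribute values and coefficients; `logit_probs` is a helper defined here for illustration only.

```python
import numpy as np

def logit_probs(beta, sigma, X):
    """Choice probabilities when utility is divided by the scale sigma."""
    v = X @ (beta / sigma)
    expv = np.exp(v - v.max())       # subtract the max for numerical stability
    return expv / expv.sum()

# Hypothetical attributes of three alternatives (rows) and coefficients.
X = np.array([[1.0, 2.0],
              [0.5, 1.0],
              [0.0, 3.0]])
beta = np.array([0.8, -0.4])

p_base = logit_probs(beta, 1.0, X)
p_rescaled = logit_probs(3.0 * beta, 3.0, X)   # same ratio beta / sigma
p_noisy = logit_probs(beta, 4.0, X)            # more unobserved variance

print(p_base, p_rescaled, p_noisy)
```

Because `p_base` and `p_rescaled` coincide, data on choices alone cannot distinguish the two parameterizations.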

Power and Limitations of Logit
1. Systematic taste variation
2. Proportional substitution across alternatives (the IIA assumption)
3. Dynamic models: unobserved factors have to be independent over time in repeated choice situations.

Taste Variation
Logit can represent systematic taste variation, that is, taste variation that relates to observed characteristics of the decision maker:
- larger families prefer larger cars
- families with more kids prefer homes close to schools
but not random taste variation, that is, differences in tastes that cannot be linked to observed characteristics:
- families with more frequent road trips prefer larger cars
- more religious families prefer homes closer to mosques.

Taste Variation
If taste variation is at least partly random, logit is a misspecification. As an approximation, logit might be able to capture the average tastes fairly well even when tastes are random, since the logit formula seems to be fairly robust to misspecifications. The researcher might therefore choose to use logit even when he knows that tastes have a random component, for the sake of simplicity.

Independence from Irrelevant Alternatives
When the attributes of one alternative improve (e.g., its price drops), the probability of its being chosen rises. When a cell-phone manufacturer launches a new product with extra features, the firm is interested in knowing the extent to which the new product will draw customers away from its other cell phones rather than from competitors' phones.
The logit model implies a certain pattern of substitution across alternatives. If substitution actually occurs in this way given the researcher's specification of representative utility, then the logit model is appropriate. To allow for more general patterns of substitution and to investigate which pattern is most accurate, more flexible models are needed.

Independence from Irrelevant Alternatives
The relative odds of choosing i over k are the same no matter what other alternatives are available or what the attributes of the other alternatives are:
$$\frac{P_{ni}}{P_{nk}} = \frac{e^{V_{ni}} / \sum_{j} e^{V_{nj}}}{e^{V_{nk}} / \sum_{j} e^{V_{nj}}} = \frac{e^{V_{ni}}}{e^{V_{nk}}} = e^{V_{ni} - V_{nk}} = e^{(X_{ni} - X_{nk})'\beta}$$
Since the ratio is independent of alternatives other than i and k, it is said to be independent from irrelevant alternatives.
The IIA property is realistic in some choice situations but not in others; the classic counterexample is the red-bus and blue-bus problem.
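The red-bus and blue-bus problem can be reproduced in a few lines. In the usual telling, commuters are split evenly between car and a blue bus; a red bus that is identical to the blue bus in everything but color is then added. The sketch below assumes equal representative utilities (set to zero, an arbitrary normalization) and uses a small helper, `logit_probs`, written for this illustration.

```python
import math

def logit_probs(utilities):
    """Multinomial logit probabilities from a dict of representative utilities."""
    m = max(utilities.values())
    expv = {name: math.exp(v - m) for name, v in utilities.items()}
    total = sum(expv.values())
    return {name: e / total for name, e in expv.items()}

# Hypothetical utilities: all alternatives equally attractive.
two = logit_probs({"car": 0.0, "blue bus": 0.0})
three = logit_probs({"car": 0.0, "blue bus": 0.0, "red bus": 0.0})

print("before:", two)
print("after: ", three)
print("odds car/blue bus:", two["car"] / two["blue bus"],
      three["car"] / three["blue bus"])
```

Logit keeps the car to blue-bus odds at one, so the car share falls from 1/2 to 1/3. Intuition suggests the car share should stay near 1/2 with the two buses splitting the rest, which is why IIA is unrealistic here.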

Panel Data

Derivatives and Elasticities
To what extent do these probabilities change in response to a change in some observed factor?
1. To what extent will the probability of choosing a given car increase if the vehicle's fuel efficiency is improved?
2. To what extent will the probability of households choosing, say, a Toyota decrease if the fuel efficiency of a Honda improves?
$$\frac{\partial P_{ni}}{\partial z_{ni}} = \frac{\partial V_{ni}}{\partial z_{ni}} P_{ni} (1 - P_{ni})$$
Proof: see problem set 2!
What is this partial derivative if utility is linear in $z_{ni}$? For a given $\partial V_{ni} / \partial z_{ni}$, this derivative is largest when $P_{ni} = 1/2$. Interpret.

Derivatives and Elasticities
Cross derivatives:
$$\frac{\partial P_{ni}}{\partial z_{nj}} = -\frac{\partial V_{nj}}{\partial z_{nj}} P_{ni} P_{nj}$$
Proof: see problem set 2!
What is this partial derivative if utility is linear in $z_{nj}$? How does this derivative change with $P_{ni}$ and $P_{nj}$? Interpret.
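Both derivative formulas can be checked against finite differences. The sketch below assumes utility is linear in a single attribute, $V_{nj} = \beta_z z_{nj}$, so that $\partial V_{nj} / \partial z_{nj} = \beta_z$; the coefficient and attribute values are hypothetical.

```python
import numpy as np

def probs(z, beta_z):
    """Logit probabilities when V_j = beta_z * z_j (linear utility, assumed)."""
    v = beta_z * z
    expv = np.exp(v - v.max())
    return expv / expv.sum()

beta_z = -0.3                      # hypothetical coefficient, e.g. on cost
z = np.array([2.0, 3.5, 1.0])      # hypothetical attribute of each alternative
i, j, h = 0, 1, 1e-6               # alternatives to compare, step size

P = probs(z, beta_z)
own_analytic = beta_z * P[i] * (1.0 - P[i])     # dP_i / dz_i
cross_analytic = -beta_z * P[i] * P[j]          # dP_i / dz_j

def central_difference(k):
    # Numerical derivative of P_i with respect to z_k.
    z_up, z_dn = z.copy(), z.copy()
    z_up[k] += h
    z_dn[k] -= h
    return (probs(z_up, beta_z)[i] - probs(z_dn, beta_z)[i]) / (2.0 * h)

own_numeric = central_difference(i)
cross_numeric = central_difference(j)
print(own_analytic, own_numeric)
print(cross_analytic, cross_numeric)
```

With a negative coefficient, raising alternative i's own attribute lowers its probability, while raising alternative j's attribute raises it, so the own and cross derivatives have opposite signs.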

Log Likelihood function
Contribution of an individual observation to the likelihood:
$$\prod_{i} (P_{ni})^{y_{ni}}$$
where $y_{ni} = 1$ if person n chose i and zero otherwise. Assuming independence,
$$L(\beta) = \prod_{n=1}^{N} \prod_{i} (P_{ni})^{y_{ni}}$$
then the log likelihood is
$$LL(\beta) = \sum_{n=1}^{N} \sum_{i} y_{ni} \ln P_{ni}$$
McFadden (1974) shows that $LL(\beta)$ is globally concave for linear-in-parameters utility.

Estimation: ML
Maximum Likelihood Estimator:
$$\hat{\beta}_{ML} = \arg\max_{\beta} \log L(\beta)$$
First order condition:
$$\frac{\partial \log L(\beta)}{\partial \beta} = 0$$
It can be shown that the FOC is equivalent to (see problem set 2 for the proof!)
$$\frac{1}{N} \sum_{n} \sum_{i} y_{ni} x_{ni} = \frac{1}{N} \sum_{n} \sum_{i} P_{ni} x_{ni}$$
$\hat{\beta}_{ML}$ makes the predicted average of each explanatory variable equal to the observed average in the sample.
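The first order condition can be seen at work on simulated data. The sketch below is one possible way to maximize the log likelihood, using Newton-Raphson steps with NumPy; it is not the estimation routine used in the course. The sample size, number of alternatives, true coefficients, and seed are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)            # arbitrary seed (assumption)
N, J, K = 2000, 3, 2                      # people, alternatives, attributes (assumed)
beta_true = np.array([1.0, -0.5])         # hypothetical true coefficients

# Simulate attributes and utility-maximizing choices with Gumbel errors.
X = rng.normal(size=(N, J, K))
U = X @ beta_true + rng.gumbel(size=(N, J))
Y = np.zeros((N, J))
Y[np.arange(N), U.argmax(axis=1)] = 1.0   # y_ni = 1 for the chosen alternative

def choice_probs(beta):
    v = X @ beta
    v = v - v.max(axis=1, keepdims=True)  # stabilize the exponentials
    expv = np.exp(v)
    return expv / expv.sum(axis=1, keepdims=True)

def log_likelihood(beta):
    return np.sum(Y * np.log(choice_probs(beta)))

beta = np.zeros(K)
for _ in range(25):
    P = choice_probs(beta)
    grad = np.einsum("nj,njk->k", Y - P, X)          # score vector
    xbar = np.einsum("nj,njk->nk", P, X)             # probability-weighted mean of x
    hess = -(np.einsum("nj,njk,njl->kl", P, X, X) - xbar.T @ xbar)
    step = np.linalg.solve(hess, grad)
    beta = beta - step                               # Newton-Raphson update
    if np.max(np.abs(step)) < 1e-10:
        break

P = choice_probs(beta)
observed_avg = np.einsum("nj,njk->k", Y, X) / N
predicted_avg = np.einsum("nj,njk->k", P, X) / N
print("estimate:", beta, "log likelihood:", log_likelihood(beta))
print("observed vs predicted averages:", observed_avg, predicted_avg)
```

At the maximum the observed and predicted averages of each attribute coincide, which is the first order condition stated on the slide. Newton steps are a natural choice here because the log likelihood is globally concave under linear-in-parameters utility.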

Goodness of Fit
The likelihood ratio index:
$$\rho = 1 - \frac{\log L(\hat{\beta})}{\log L(0)}$$
It is important to note that the likelihood ratio index is not at all similar in its interpretation to the $R^2$ used in regression, despite both statistics having the same range.
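A short calculation shows how the index behaves. The sketch assumes that with all parameters at zero every one of the J alternatives has probability 1/J, so that $\log L(0) = N \ln(1/J)$; the sample size, number of alternatives, and maximized log likelihood below are hypothetical numbers, not results from any study.

```python
import math

def likelihood_ratio_index(ll_beta_hat, ll_zero):
    """rho = 1 - LL(beta_hat) / LL(0)."""
    return 1.0 - ll_beta_hat / ll_zero

N, J = 500, 3                       # hypothetical sample size and alternatives
ll_zero = N * math.log(1.0 / J)     # LL(0) when each alternative has probability 1/J
ll_hat = -400.0                     # hypothetical maximized log likelihood

rho = likelihood_ratio_index(ll_hat, ll_zero)
rho_no_gain = likelihood_ratio_index(ll_zero, ll_zero)
print(f"rho = {rho:.4f}, rho with no improvement = {rho_no_gain:.1f}")
```

The index is zero when the estimated model does no better than zero parameters and approaches one as the fitted probabilities of the chosen alternatives approach one.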

Case Study 1: Transportation
McFadden (1977)
- Introduction of a new transit method: BART
- Why do we need a behavioral model?
- Structural versus reduced-form estimation

Case Study 2: Labor Market Impacts of Immigrants
David Card (2001) JLE
- This paper estimated a set of multinomial logit models for the probabilities of working in six different occupation groups.
- The estimated coefficients were then used to assign probabilities of working in different occupations.

Case Study 3: Redistribution of Resources within the Family
McGarry and Schoeni (1995) JHR
- Do parents give greater financial assistance to their adult children who have lower incomes?
- Do adult children give greater financial and time assistance to their less wealthy elderly parents?
Why are redistributional aspects of transfers important?
- Altruism reduces the effect of government assistance programs, because government assistance crowds out familial assistance.
- As discussed by Barro, if individuals are altruistic, then there is no difference between tax and debt

Table 12 (continued)
Results suggest that parents give more to their less well-off children and elderly parents.

Case Study 4: Agglomeration Benefits and Location Choice
Head et al. (1995) JIE
- Do externalities based on geographical proximity affect firms' location choices? How pervasive are they?
- Do the agglomeration effects operate on a nationality-specific basis?
- This paper estimates a location choice model using data on Japanese investors who established new manufacturing plants in the US.

A logit model for the location decision of Japanese industries
The profitability of state s for investor j:
$$\theta_{s} + \alpha_{US} \ln A^{US}_{js} + \alpha_{J} \ln A^{J}_{js} + \alpha_{G} \ln A^{G}_{js} + \varepsilon_{js} \qquad (1)$$
- $\theta_{s}$ captures the attractiveness of state s to the average investor.
- $A^{US}_{js}$, $A^{J}_{js}$, and $A^{G}_{js}$ are agglomeration variables, measured as counts of US, Japanese, and Japanese industrial group (keiretsu) member establishments.
$$\Pr(js) = \frac{\exp\left(\theta_{s} + \sum_{i \in \{US, J, G\}} \alpha_{i} \ln A^{i}_{js}\right)}{\sum_{l} \exp\left(\theta_{l} + \sum_{i \in \{US, J, G\}} \alpha_{i} \ln A^{i}_{jl}\right)} \qquad (2)$$

- Japanese establishments do not simply mimic the geographical pattern of US establishments.
- Instead, initial investments by Japanese firms spur subsequent investors in the same industry or industrial group to select the same states.
- This pattern of location choice supports agglomeration-externalities theory rather than inter-state endowment differences theory.

References
Card, David. Immigrant inflows, native outflows, and the local labor market impacts of higher immigration. Journal of Labor Economics 19, no. 1 (2001): 22-64.
Head, Keith, John Ries, and Deborah Swenson. Agglomeration benefits and location choice: Evidence from Japanese manufacturing investments in the United States. Journal of International Economics 38, no. 3 (1995): 223-247.
McGarry, Kathleen, and Robert F. Schoeni. Transfer behavior in the health and retirement study: Measurement and the redistribution of resources within the family. Journal of Human Resources (1995): S184-S226.