Economics Multinomial Choice Models


Economics 217 - Multinomial Choice Models

So far, most extensions of the linear model have centered on either a binary choice between two options (work or don't work) or censoring. Many questions in economics involve consumers choosing among more than two varieties of a good:
- Ready-to-eat cereal
- Vacation destinations
- Type of car to buy

Firms also face such multinomial choices:
- In which country to operate
- Where to locate a store
- Which CEO to hire

Techniques to evaluate these questions are complex, but widely used in practice. Generally, they are referred to as Discrete Choice Models, or Multinomial Choice Models.

Multinomial Choice - The basic framework

Suppose there are individuals, indexed by i.
- They choose from J options of a good, and may only choose one option.
- If individual i chooses option k, they receive utility $U_{ik}$, where
\[ U_{ik} = V_{ik} + \varepsilon_{ik} \]
- $V_{ik}$ is observable utility (observable to the econometrician). It can be linked to things like product characteristics, demographics, etc.
- $\varepsilon_{ik}$ is random utility. The econometrician doesn't see it, but knows its distribution. This actually makes the problem a bit more reasonable to characterize empirically.
- Utility maximization: individual i chooses option k if
\[ U_{ik} > U_{ij} \quad \forall\, j \neq k \]

This maximization problem involves comparing observable utility for each option, while accounting for random utility.

Multinomial Choice - The basic framework

From here, there are a variety of techniques that one can use to estimate multinomial choice models.
- Multinomial Logit is the easiest, and will be derived below.
  - It assumes a particular functional form that has questionable properties, but produces closed-form solutions.
  - There are two ways to derive the multinomial logit. We will go over the easier approach, though I have also derived the second approach in the notes.
- Nested Logit is more realistic: consumers choose between larger groups (car vs. truck) before making more refined choices (two-door vs. four-door).
  - It also yields closed-form solutions, but results can depend on the choice of "nests".
- Additional extensions to multinomial choice are beyond this course, but can be used if you understand the basic assumptions:
  - Multinomial Probit (requires heavy computation)
  - Random coefficients logit (variation in how agents value attributes of choices)

Multinomial Distribution

Recall the binomial distribution:
\[ f(y; p) = \frac{n!}{y!\,(n-y)!}\, p^{y} (1-p)^{n-y} \]
Remember that p is the probability that some event (e.g. unemployment) occurs, and y is the number of times the event occurs in n attempts; n - y is the number of times the event does not occur.

When there are more than two choices, the distribution generalizes to the multinomial. Defining $\pi_j$ as the probability that option j is chosen, and $y_j$ as the number of times option j is chosen out of n, the multinomial distribution is written as:
\[ f(y; \pi) = \frac{n!}{\prod_{j=1}^{J} y_j!} \prod_{j=1}^{J} \pi_j^{y_j} = \frac{n!}{y_1!\, y_2! \cdots y_{J-1}!\, y_J!}\; \pi_1^{y_1} \pi_2^{y_2} \cdots \pi_{J-1}^{y_{J-1}} \pi_J^{y_J} \]
This is the probability mass function that is used for maximum likelihood. We wish to estimate the J $\pi$'s. Do you think we can? Or do you think that we need to?

Let's now take the next step and link the likelihood function to data.
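As a quick numerical check of the multinomial formula, the following base R sketch evaluates it term by term and compares the result with R's built-in `dmultinom`. The counts and probabilities are made-up illustration values, not data from the course.

```r
# Multinomial pmf written out from the formula:
#   n! / prod(y_j!) * prod(pi_j^y_j)
multinom_pmf <- function(y, prob) {
  n <- sum(y)
  factorial(n) / prod(factorial(y)) * prod(prob^y)
}

y    <- c(2, 1, 1)        # hypothetical counts for J = 3 options (n = 4)
prob <- c(0.5, 0.3, 0.2)  # hypothetical choice probabilities, summing to one

by_hand  <- multinom_pmf(y, prob)
built_in <- dmultinom(y, prob = prob)  # base R (stats) multinomial pmf

# With J = 2 the multinomial collapses to the binomial
binom_case <- multinom_pmf(c(3, 1), c(0.6, 0.4))
binom_ref  <- dbinom(3, size = 4, prob = 0.6)

print(c(by_hand = by_hand, built_in = built_in))
print(c(binom_case = binom_case, binom_ref = binom_ref))
```

Because the probabilities sum to one, only J - 1 of them are free, which is one way to approach the question posed on the slide.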

Multinomial Logit - Derivation

Recall that for the logit model we link the log odds ratio to data:
\[ \log\left(\frac{p_i}{1-p_i}\right) = x_i^T \beta \]
We exponentiate and rearrange to get:
\[ p_i = \frac{\exp(x_i^T \beta)}{1 + \exp(x_i^T \beta)} \]
We must extend this link to multiple options in the multinomial model.
- Since there is a linear dependency in our probabilities (i.e. they sum to one), we must choose a reference group.
- We write the log odds ratio relative to the reference group (j = 1) as:
\[ \log\left(\frac{\pi_{ij}}{\pi_{i1}}\right) = x_{ij}^T \beta_j \]

Note that the relative probability is specific to j:
- $\beta_j$: the effect of some covariate on the choice between j and 1 may vary by j.
- $x_{ij}$ is a vector of covariates for i that may vary by j. E.g. price matters for the choice between compact cars, but not between compact and luxury.
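For readers who want the "exponentiate and rearrange" step for the binary logit spelled out, one way to write it is the following, using the shorthand $\eta_i = x_i^T \beta$ (the symbol $\eta_i$ is introduced here only for brevity and is not part of the original slides):

```latex
\begin{align*}
\log\left(\frac{p_i}{1-p_i}\right) = \eta_i
  &\;\Longrightarrow\; \frac{p_i}{1-p_i} = \exp(\eta_i) \\
  &\;\Longrightarrow\; p_i = \exp(\eta_i) - p_i \exp(\eta_i) \\
  &\;\Longrightarrow\; p_i \left(1 + \exp(\eta_i)\right) = \exp(\eta_i) \\
  &\;\Longrightarrow\; p_i = \frac{\exp(\eta_i)}{1 + \exp(\eta_i)}
\end{align*}
```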

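Multinomial Logit - Derivation

Exponentiate and solve for $\pi_{ij}$:
\[ \pi_{ij} = \pi_{i1} \exp(x_{ij}^T \beta_j) \]
Next, use the requirement that all probabilities sum to 1:
\[ \pi_{i1} + \sum_{j=2}^{J} \pi_{ij} = 1 \]
Substituting for $\pi_{ij}$, we get:
\[ \pi_{i1} + \sum_{j=2}^{J} \pi_{i1} \exp(x_{ij}^T \beta_j) = 1 \]
Solving for $\pi_{i1}$:
\[ \pi_{i1} = \frac{1}{1 + \sum_{j=2}^{J} \exp(x_{ij}^T \beta_j)} \]
Thus, the probability of option j, $\pi_{ij}$, is
\[ \pi_{ij} = \frac{\exp(x_{ij}^T \beta_j)}{1 + \sum_{s=2}^{J} \exp(x_{is}^T \beta_s)} \]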
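The closed form above is easy to check numerically. The sketch below is a minimal base R illustration with made-up linear indices; `mnl_probs` and `eta` are hypothetical names chosen here, with `eta` standing for the values of x_ij' beta_j for options 2 through J.

```r
# Multinomial logit probabilities with option 1 as the reference group.
# eta holds the linear indices for options 2..J; option 1's index is
# normalized to zero, which gives the "1 +" in the denominator.
mnl_probs <- function(eta) {
  denom <- 1 + sum(exp(eta))
  c(1, exp(eta)) / denom
}

eta <- c(0.4, -1.0)       # hypothetical indices for options 2 and 3
p   <- mnl_probs(eta)

total    <- sum(p)              # probabilities sum to one
log_odds <- log(p[-1] / p[1])   # log odds relative to option 1 recover eta

print(p)
print(total)
print(log_odds)
```

Taking the log odds of each option relative to option 1 returns the original indices, which is exactly the link the derivation started from.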

Multinomial Logit - Assumptions

The multinomial logit formula is pretty simple:
\[ \pi_{ij} = \frac{\exp(x_{ij}^T \beta_j)}{1 + \sum_{s=2}^{J} \exp(x_{is}^T \beta_s)} \]
The multinomial logit has a pretty sharp property that is usually not good in practice: Independence of Irrelevant Alternatives (IIA).
- Precisely, when choosing between two goods, substitution with other goods does not matter.
- To see IIA in practice, take the ratio of probabilities between some good j and another good k:
\[ \frac{\pi_{ij}}{\pi_{ik}} = \frac{\exp(x_{ij}^T \beta_j)}{\exp(x_{ik}^T \beta_k)} \]

Thus, the relative probabilities of two outcomes do not depend on the other J - 2 outcomes. Techniques such as multinomial probit and nested logit avoid this strong prediction.
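IIA can be seen directly with a few numbers. In this base R sketch (hypothetical indices, hypothetical helper name `mnl_probs`), the ratio of the probabilities of options 2 and 1 stays the same whether a third option is unattractive, very attractive, or absent.

```r
# Multinomial logit probabilities, option 1 as reference (index 0)
mnl_probs <- function(eta) c(1, exp(eta)) / (1 + sum(exp(eta)))

p_base    <- mnl_probs(c(0.4, -1.0))  # options 1, 2, 3
p_better3 <- mnl_probs(c(0.4,  2.0))  # option 3 made far more attractive
p_no3     <- mnl_probs(c(0.4))        # option 3 removed from the choice set

# Ratio of option 2 to option 1 in each scenario
ratios <- c(base    = p_base[2]    / p_base[1],
            better3 = p_better3[2] / p_better3[1],
            no3     = p_no3[2]     / p_no3[1])

print(ratios)        # all three equal exp(0.4)
print(p_better3[3])  # even though option 3 now takes a large share
```

The levels of the probabilities change a great deal across the three scenarios, but options 1 and 2 always lose share to option 3 in the same proportion, which is the substitution pattern that is often unrealistic in practice.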

Multinomial Logit - Estimation in R

There are a few packages in R to estimate the multinomial logit.
- mlogit is the best.
- The package also includes a number of datasets that we can use to demonstrate the model.
- Since it is pretty simple, we will use the dataset "Cracker".

After loading mlogit, you can call the data internal to the package via the following commands:

    library(mlogit)
    data("Cracker", package = "mlogit")
    str(Cracker)

Each row represents an individual, and "choice" represents the chosen brand. This will be the outcome variable.

For each brand of cracker, the dataset contains the following information:
- The price observed by individual i, price.
- Whether or not there was an in-store display observed by individual i, disp.
- Whether or not there was a newspaper ad observed by individual i, feat.

Multinomial Logit - Estimation in R

To set up the data.frame for estimation, you must create an mlogit data object:

    data_c <- mlogit.data(Cracker, shape = "wide", choice = "choice", varying = c(2:13))

- "data_c" is the mlogit data object, built from data in "wide" format.
- "Cracker" is the original data frame.
- shape = "wide" tells mlogit that the data are listed in wide format: one row per individual, with a separate column for each brand's value of each variable.
- varying = c(2:13) indicates the columns of the dataset holding the variables that vary across brands (the prices individuals observe, the advertisements they see).

To estimate the model, run:

    m <- mlogit(choice ~ price + disp + feat, data_c)
    summary(m)

You can estimate the model with product-specific coefficients using:

    m2 <- mlogit(choice ~ 0 | price + disp + feat, data_c)
    summary(m2)
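To make the link between the likelihood and the estimates concrete, here is a rough base R sketch of maximum likelihood for a stripped-down logit choice model on simulated data. It is not how mlogit is implemented, and it is simpler than either model above: it assumes a single generic price coefficient and no intercepts. All names (`price`, `beta_true`, `probs`, `loglik`) and values are hypothetical.

```r
set.seed(217)
N <- 2000                 # simulated individuals
J <- 3                    # options
beta_true <- -1.5         # hypothetical price coefficient

# Hypothetical prices: one row per individual, one column per option
price <- matrix(runif(N * J, min = 1, max = 3), nrow = N, ncol = J)

# Logit choice probabilities for every individual and option
probs <- function(beta) {
  ev <- exp(beta * price)
  ev / rowSums(ev)
}

# Simulate one choice per individual from the model probabilities
P <- probs(beta_true)
choice <- apply(P, 1, function(p) sample.int(J, size = 1, prob = p))

# Log-likelihood: sum of log probabilities of the options actually chosen
loglik <- function(beta) sum(log(probs(beta)[cbind(1:N, choice)]))

# One-dimensional maximization of the log-likelihood
fit <- optimize(loglik, interval = c(-10, 10), maximum = TRUE)
beta_hat <- fit$maximum

print(c(true = beta_true, estimate = beta_hat))
```

With several coefficients one would switch from `optimize` to a multivariate optimizer such as `optim`, but the structure of the log-likelihood is the same.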

Extra: Multinomial Logit from Extreme Value Distribution

Choices are independent of one another, and $\varepsilon_{ik}$ follows a type I extreme value distribution (also known as the Gumbel distribution):
\[ f(\varepsilon_{ik}) = \exp(-\varepsilon_{ik}) \exp(-\exp(-\varepsilon_{ik})) \]
\[ \Pr(\varepsilon < \varepsilon_{ik}) = F(\varepsilon_{ik}) = \exp(-\exp(-\varepsilon_{ik})) \]
Recall from utility maximization that individual i chooses option k if
\[ U_{ik} > U_{ij} \quad \forall\, j \neq k \]
We now seek the probability that this outcome occurs, which can then be compared empirically to the share of agents that choose option k over all others.
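The density and CDF above can be sanity-checked numerically. This base R sketch (function names are chosen here for illustration) integrates the pdf over a wide finite range, compares a partial integral with the CDF, and draws Gumbel variates by inverting the CDF, which gives ε = -log(-log(u)) for u uniform on (0, 1).

```r
# Type I extreme value (Gumbel) pdf and CDF as written on the slide.
# exp(-e - exp(-e)) is the same as exp(-e) * exp(-exp(-e)).
gumbel_pdf <- function(e) exp(-e - exp(-e))
gumbel_cdf <- function(e) exp(-exp(-e))

# The density is negligible outside [-10, 30], so a finite range suffices
total_mass <- integrate(gumbel_pdf, lower = -10, upper = 30)$value
cdf_at_1   <- integrate(gumbel_pdf, lower = -10, upper = 1)$value

# Inverse-CDF sampling: solve u = exp(-exp(-e)) for e
set.seed(1)
draws <- -log(-log(runif(1e5)))

print(total_mass)                 # close to 1
print(c(cdf_at_1, gumbel_cdf(1))) # the two should agree
print(mean(draws))                # Gumbel mean is about 0.5772
```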

Extra: Multinomial Logit - Derivation

First, let's consider option k against some other option j. The probability that the consumer purchases k is:
\[ \Pr(U_{ik} > U_{ij}) = \Pr(V_{ik} + \varepsilon_{ik} > V_{ij} + \varepsilon_{ij}) \]
Rearranging to isolate $\varepsilon_{ij}$:
\[ \Pr(U_{ik} > U_{ij}) = \Pr(V_{ik} - V_{ij} + \varepsilon_{ik} > \varepsilon_{ij}) \]
This simply says that the difference in observable utility plus $\varepsilon_{ik}$ is greater than $\varepsilon_{ij}$. Put differently, unobserved utility in option j is not sufficient to make up for the other factors influencing the decision between k and j.

Imposing the CDF of the Gumbel distribution, and treating $\varepsilon_{ik}$ as a conditioning variable, we have:
\[ \Pr(U_{ik} > U_{ij} \mid \varepsilon_{ik}) = F(V_{ik} - V_{ij} + \varepsilon_{ik}) \]
Given $\varepsilon_{ik}$, what is the probability that this occurs for all $j \neq k$?

Extra: Multinomial Logit - Derivation

Since unobserved utility is independent across goods, the probability of the intersection of these events is just their probabilities multiplied together. So, the probability that k is chosen over j for all $j \neq k$, conditional on $\varepsilon_{ik}$, is:
\[ \Pr(U_{ik} > U_{ij} \;\forall\, j \neq k \mid \varepsilon_{ik}) = \prod_{j \neq k} F(V_{ik} - V_{ij} + \varepsilon_{ik}) \]
For the final step before some algebra, recall that this is a conditional probability. We still need to account for the possible values of $\varepsilon_{ik}$.

Formally, the unconditional probability that k is chosen, $P_{ik}$, is written as:
\[ P_{ik} = \Pr(U_{ik} > U_{ij} \;\forall\, j \neq k) = \int_{-\infty}^{\infty} \Pr(U_{ik} > U_{ij} \;\forall\, j \neq k \mid \varepsilon_{ik})\, f(\varepsilon_{ik})\, d\varepsilon_{ik} \]
Basically, what we're doing is taking each $\Pr(U_{ik} > U_{ij} \;\forall\, j \neq k \mid \varepsilon_{ik})$, and then weighting by the pdf $f(\varepsilon_{ik})$.

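Extra: Multinomial Logit - Derivation

Imposing the solution for the choice of k conditional on $\varepsilon_{ik}$:
\[ P_{ik} = \int_{-\infty}^{\infty} \prod_{j \neq k} F(V_{ik} - V_{ij} + \varepsilon_{ik})\, f(\varepsilon_{ik})\, d\varepsilon_{ik} \]
Imposing the parameterization of the extreme value distribution, we have:
\[ P_{ik} = \int_{-\infty}^{\infty} \prod_{j \neq k} \exp\big(-\exp(-(V_{ik} - V_{ij} + \varepsilon_{ik}))\big)\, \exp(-\varepsilon_{ik}) \exp(-\exp(-\varepsilon_{ik}))\, d\varepsilon_{ik} \]
Note that $\exp(-\exp(-\varepsilon_{ik})) = \exp\big(-\exp(-(V_{ik} - V_{ik} + \varepsilon_{ik}))\big)$, which is exactly the j = k term of the product. The product can therefore run over all j, and we can simplify to:
\[ P_{ik} = \int_{-\infty}^{\infty} \prod_{j} \exp\big(-\exp(-(V_{ik} - V_{ij} + \varepsilon_{ik}))\big)\, \exp(-\varepsilon_{ik})\, d\varepsilon_{ik} \]
Simplifying this is not too hard, once you note a few convenient features of the extreme value distribution.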

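Extra: Multinomial Logit - Derivation

Remember that the product of exponentials is just the exponential of the sum of the exponents:
\[ \prod_{j} \exp(x_j) = \exp\Big(\sum_{j} x_j\Big) \]
Thus,
\[ P_{ik} = \int_{-\infty}^{\infty} \prod_{j} \exp\big(-\exp(-(V_{ik} - V_{ij} + \varepsilon_{ik}))\big)\, \exp(-\varepsilon_{ik})\, d\varepsilon_{ik} = \int_{-\infty}^{\infty} \exp\Big(-\sum_{j} \exp(-(V_{ik} - V_{ij} + \varepsilon_{ik}))\Big) \exp(-\varepsilon_{ik})\, d\varepsilon_{ik} \]
Using a similar rule, we can factor out $\exp(-\varepsilon_{ik})$:
\[ P_{ik} = \int_{-\infty}^{\infty} \exp\Big(-\exp(-\varepsilon_{ik}) \sum_{j} \exp(-(V_{ik} - V_{ij}))\Big) \exp(-\varepsilon_{ik})\, d\varepsilon_{ik} \]
The next step is tricky. What is the relationship between $\exp(-\varepsilon_{ik})$ and $\exp(-\varepsilon_{ik})\, d\varepsilon_{ik}$?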

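Extra: Multinomial Logit - Derivation

Time for a change of variables, where
\[ t = -\exp(-\varepsilon_{ik}), \qquad dt = \exp(-\varepsilon_{ik})\, d\varepsilon_{ik}, \qquad t \in (-\infty, 0) \]
Thus,
\[ P_{ik} = \int_{-\infty}^{\infty} \exp\Big(-\exp(-\varepsilon_{ik}) \sum_{j} \exp(-(V_{ik} - V_{ij}))\Big) \exp(-\varepsilon_{ik})\, d\varepsilon_{ik} = \int_{-\infty}^{0} \exp\Big(t \sum_{j} \exp(-(V_{ik} - V_{ij}))\Big)\, dt \]
Completing the integral:
\[ P_{ik} = \left[ \frac{\exp\big(t \sum_{j} \exp(-(V_{ik} - V_{ij}))\big)}{\sum_{j} \exp(-(V_{ik} - V_{ij}))} \right]_{-\infty}^{0} \]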
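The evaluation of the completed integral at its two limits can be written out as follows. The shorthand $S_{ik}$ is introduced here only to keep the expression compact; it is not notation from the slides.

```latex
\begin{align*}
S_{ik} &\equiv \sum_{j} \exp\big(-(V_{ik} - V_{ij})\big) > 0 \\
P_{ik} &= \left[ \frac{\exp(t\, S_{ik})}{S_{ik}} \right]_{-\infty}^{0}
        = \frac{\exp(0)}{S_{ik}} - \lim_{t \to -\infty} \frac{\exp(t\, S_{ik})}{S_{ik}}
        = \frac{1}{S_{ik}} - 0
        = \frac{1}{\sum_{j} \exp\big(-(V_{ik} - V_{ij})\big)}
\end{align*}
```

The lower limit vanishes because $S_{ik}$ is a sum of exponentials and therefore strictly positive, so $\exp(t\, S_{ik}) \to 0$ as $t \to -\infty$.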

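Extra: Multinomial Logit - Derivation

And finally, simplify:
\[ P_{ik} = \frac{1}{\sum_{j} \exp(-(V_{ik} - V_{ij}))} \]
\[ \phantom{P_{ik}} = \frac{1}{\sum_{j} \exp(-V_{ik}) \exp(V_{ij})} \qquad \text{(use exponent rule)} \]
\[ \phantom{P_{ik}} = \frac{1}{\exp(-V_{ik}) \sum_{j} \exp(V_{ij})} \qquad \text{(factor out } \exp(-V_{ik})\text{)} \]
\[ \phantom{P_{ik}} = \frac{\exp(V_{ik})}{\sum_{j} \exp(V_{ij})} \qquad \text{(multiply top and bottom by } \exp(V_{ik})\text{)} \]
From here, we usually assume that observed utility is a function of covariates, $V_{ij} = X_{ij} \beta$. Thus,
\[ P_{ik} = \frac{\exp(X_{ik} \beta)}{\sum_{j} \exp(X_{ij} \beta)} \]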
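The whole derivation can be checked numerically: integrating the conditional choice probability against the Gumbel density should reproduce the closed-form logit probabilities. The base R sketch below does this for made-up utilities; the function names and the values in `V` are hypothetical, and the finite integration range is an approximation that relies on the integrand being negligible outside it.

```r
gumbel_pdf <- function(e) exp(-e - exp(-e))   # exp(-e) * exp(-exp(-e))
gumbel_cdf <- function(e) exp(-exp(-e))

# P_ik as the integral over eps_ik of
#   prod_{j != k} F(V_k - V_j + eps_ik) * f(eps_ik)
choice_prob_integral <- function(V, k) {
  integrand <- function(e) {
    sapply(e, function(e1) prod(gumbel_cdf(V[k] - V[-k] + e1)) * gumbel_pdf(e1))
  }
  integrate(integrand, lower = -10, upper = 30, rel.tol = 1e-8)$value
}

V <- c(1.0, 0.5, -0.2)   # hypothetical observed utilities for J = 3 options

p_integral <- sapply(seq_along(V), function(k) choice_prob_integral(V, k))
p_closed   <- exp(V) / sum(exp(V))   # closed-form logit probabilities

print(rbind(p_integral, p_closed))
```

The two rows should agree to within the tolerance of the numerical integration, which is the practical payoff of the extreme value assumption: no integration is needed at estimation time.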