Estimating Mixed Logit Models with Large Choice Sets. Roger H. von Haefen (NC State & NBER) and Adam Domanski (NOAA). July 2013

Motivation
- Bayer et al. (JPE, 2007)
- Sorting model / housing choice
- 250,000 individuals / alternatives
- Estimate conditional logit model
- Why? Sampling of alternatives
- But restrictive substitution patterns

Research Objectives
- Develop an estimation strategy for mixed logit models applied to large choice set problems
- Estimate latent class models with a variation of the Expectation-Maximization (EM) algorithm
- Quantify the efficiency / bias / run time tradeoffs in an outdoor recreation application

Outline
- Background
- Latent class models
- EM algorithm
- Simulations
- Application
- Future directions

Discrete Choice Analysis
- Choice from a large set of alternatives

Discrete Choice Analysis
- Conditional indirect utility: $U_{ij} = x_{ij}\beta + \varepsilon_{ij}$

Discrete Choice Analysis
- Decision rule: alternative $j$ is chosen iff $U_{ij} = \max\{U_{i1}, \ldots, U_{iJ}\}$

Discrete Choice Analysis
Conditional Logit Model (McFadden 1974)
- Assuming $\varepsilon_{ij}$ is iid type I extreme value, then:
  $P_{ij} = \dfrac{\exp(x_{ij}\beta)}{\sum_{m=1}^{J} \exp(x_{im}\beta)}$
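As a minimal sketch of the conditional logit formula on this slide, the function below computes the choice probabilities for one individual. The attribute matrix `X` and coefficient vector `beta` are made-up illustrative values, not numbers from the talk.

```python
import numpy as np

def logit_probs(X, beta):
    """Conditional logit choice probabilities for one individual.

    X is a (J, K) array of alternative attributes, beta a (K,) coefficient vector.
    """
    v = X @ beta                 # systematic utilities x_ij * beta
    ev = np.exp(v - v.max())     # subtracting the max avoids overflow; ratios are unchanged
    return ev / ev.sum()

# Hypothetical attributes for three alternatives and two coefficients
X = np.array([[1.0, 0.5], [0.2, 1.5], [0.0, 0.0]])
beta = np.array([1.0, -0.5])
p = logit_probs(X, beta)
print(p)
```

The probabilities sum to one, and the alternative with the highest systematic utility (the first one here) receives the largest probability.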

Discrete Choice Analysis
Independence of Irrelevant Alternatives (IIA)
  $\dfrac{P_{ij}}{P_{ik}} = \dfrac{\exp(x_{ij}\beta)}{\exp(x_{ik}\beta)}$
- Restrictive substitution patterns
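The IIA property can be checked numerically: the ratio of two logit probabilities depends only on those two alternatives, so it is unchanged when another alternative is dropped from the choice set. The utilities below are hypothetical values chosen for illustration.

```python
import numpy as np

def logit_probs(v):
    """Logit probabilities from a vector of systematic utilities."""
    ev = np.exp(v - np.max(v))
    return ev / ev.sum()

v = np.array([0.75, -0.55, 0.0, 1.2])    # hypothetical utilities for four alternatives
p_full = logit_probs(v)                  # all four alternatives available
p_sub = logit_probs(v[:3])               # fourth alternative removed

ratio_full = p_full[0] / p_full[1]
ratio_sub = p_sub[0] / p_sub[1]
print(ratio_full, ratio_sub)             # the two ratios coincide
```

This invariance is what lets a logit model be estimated on a subset of alternatives, and it is also the source of the restrictive substitution patterns noted on the slide.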

Computational Challenges w/ Large Choice Sets
Three approaches:
- Aggregation
- Separability
- Sampling

Sampling of Alternatives
Ex: five individuals, 15 alternatives (the chosen alternative is shown in red on the original slides)

Full sample: individuals A through E each face alternatives 1 through 15.

50% sample:
Individual   Sampled alternatives
A            1 3 7 8 11 12 14 15
B            2 5 6 7 9 10 13 14
C            1 5 6 8 9 11 12 13
D            3 4 5 7 8 9 10 13
E            2 3 4 6 7 10 12 15
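A sampled choice set like the one in this example could be drawn as sketched below. The sketch assumes the common design in which the chosen alternative is always kept and the remaining slots are filled uniformly at random without replacement; the slides do not spell out the sampling protocol, and the chosen alternatives listed here are hypothetical.

```python
import numpy as np

def sample_alternatives(chosen, n_alts, n_sample, rng):
    """Return a sorted choice set of size n_sample that contains `chosen`.

    The other n_sample - 1 members are drawn uniformly without replacement
    from the non-chosen alternatives (an assumed sampling protocol).
    """
    others = np.delete(np.arange(n_alts), chosen)
    drawn = rng.choice(others, size=n_sample - 1, replace=False)
    return np.sort(np.append(drawn, chosen))

rng = np.random.default_rng(0)
chosen = [2, 6, 10, 4, 14]               # hypothetical choices (0-based) for individuals A-E
subsets = [sample_alternatives(c, 15, 8, rng) for c in chosen]
for name, s in zip("ABCDE", subsets):
    print(name, s + 1)                   # print with 1-based labels as on the slide
```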

Sampling of Alternatives
- McFadden (1978) proved consistency of this approach
- But the proof relies on the independence of irrelevant alternatives (IIA) assumption
- Does not generalize to non-IIA models
- So there is no theoretical justification for using sampling with mixed logit models

How should sampling work?
Monte Carlo simulation #1
- Fixed coefficient logit model
- 500, 1000, or 2000 individuals making a single discrete choice
- 100 choice alternatives
- 4 fixed coefficients
- Sampling w/ 5, 10, 25 and 50 alternatives
- Maximum likelihood estimation
- 250 replications
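One replication of a design like this can be sketched as follows. Everything specific here is an assumption made for illustration: the true coefficients, standard normal attributes, 1000 individuals, a sample of 10 alternatives, and a plain Newton-Raphson optimizer. The talk's actual data-generating process and estimation code are not shown in the slides.

```python
import numpy as np

def fit_logit(X, y, iters=20):
    """Newton-Raphson MLE for conditional logit. X: (N, J, K); y: (N,) chosen index."""
    N, J, K = X.shape
    beta = np.zeros(K)
    for _ in range(iters):
        v = X @ beta
        p = np.exp(v - v.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        xbar = np.einsum('nj,njk->nk', p, X)              # probability-weighted mean attributes
        grad = (X[np.arange(N), y] - xbar).sum(axis=0)    # score
        info = np.einsum('nj,njk,njl->kl', p, X, X) - xbar.T @ xbar   # minus the Hessian
        beta = beta + np.linalg.solve(info, grad)
    return beta

rng = np.random.default_rng(1)
N, J, K, m = 1000, 100, 4, 10
beta_true = np.array([1.0, -1.0, 0.5, -0.5])              # assumed true coefficients
X = rng.normal(size=(N, J, K))
U = X @ beta_true + rng.gumbel(size=(N, J))               # type I extreme value errors
y = U.argmax(axis=1)

# Sampled choice sets: chosen alternative (placed first) plus m - 1 random others
Xs = np.empty((N, m, K))
for n in range(N):
    others = np.delete(np.arange(J), y[n])
    keep = np.append(y[n], rng.choice(others, size=m - 1, replace=False))
    Xs[n] = X[n, keep]

beta_full = fit_logit(X, y)                               # full choice set
beta_samp = fit_logit(Xs, np.zeros(N, dtype=int))         # sampled choice set
print(beta_full, beta_samp)
```

Repeating this over many replications and sample sizes would give bias and standard-error summaries of the kind the slides report.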

[Figure: How should sampling work? Fixed coefficient means. Mean parameter bias and relative standard error by sample-of-alternatives size (100, 50, 25, 10, 5 alternatives).]

Mixed Logit
- Preference parameters vary randomly across the population
- Continuous mixing distribution: $P_i = \int L_i(\beta_i)\, f(\beta_i \mid \theta)\, d\beta_i$
- Finite mixing distribution: $P_i = \sum_{c=1}^{C} s_{ic}(\delta)\, L_{ic}(\beta_c)$
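The two mixing cases can be sketched in code as follows. The continuous case is approximated by averaging logit probabilities over random draws, which is the usual simulation approach; independent normal coefficients are assumed. The finite case is an exact weighted sum over classes. All numeric values are hypothetical.

```python
import numpy as np

def logit_probs(X, beta):
    """Logit probabilities for one individual; X is (J, K)."""
    v = X @ beta
    ev = np.exp(v - v.max())
    return ev / ev.sum()

def mixed_logit_continuous(X, mu, sigma, n_draws, rng):
    """Simulated probabilities with beta ~ N(mu, diag(sigma^2)) (assumed distribution)."""
    draws = mu + sigma * rng.standard_normal((n_draws, mu.size))
    return np.mean([logit_probs(X, b) for b in draws], axis=0)

def mixed_logit_finite(X, betas, shares):
    """Latent class probabilities: class shares times class-specific logit probabilities."""
    return sum(s * logit_probs(X, b) for s, b in zip(shares, betas))

rng = np.random.default_rng(0)
X = np.array([[1.0, 0.5], [0.2, 1.5], [0.0, 0.0]])        # hypothetical attributes
mu, sigma = np.array([1.0, -0.5]), np.array([0.5, 0.5])
p_cont = mixed_logit_continuous(X, mu, sigma, 500, rng)
p_fin = mixed_logit_finite(X, [np.array([1.0, -0.5]), np.array([-0.5, 1.0])], [0.6, 0.4])
print(p_cont, p_fin)
```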

What can go wrong?
Monte Carlo simulation #2
- Continuous mixing distribution (normal)
- 500, 1000, or 2000 individuals making a single discrete choice
- 100 choice alternatives
- 2 fixed coefficients, 2 random coefficients
- Sampling w/ 5, 10, 25 and 50 alternatives
- Maximum simulated likelihood estimation
- 250 replications

[Figure: What can go wrong? Fixed coefficient means. Mean parameter bias and relative standard error by sample-of-alternatives size (100, 50, 25, 10, 5 alternatives).]

[Figure: What can go wrong? Random coefficient means. Mean parameter bias and relative standard error by sample-of-alternatives size (100, 50, 25, 10, 5 alternatives).]

[Figure: What can go wrong? Random coefficient standard deviations. Mean parameter bias and relative standard error by sample-of-alternatives size (100, 50, 25, 10, 5 alternatives).]

What can go wrong?
Monte Carlo simulation #3
- Discrete mixing distribution (2 latent classes)
- 500, 1000, or 2000 individuals making a single discrete choice
- 100 choice alternatives
- 2 fixed coefficients, 2 random coefficients
- Sampling w/ 5, 10, 25 and 50 alternatives
- Maximum likelihood estimation
- 250 replications

[Figure: What can go wrong? Fixed coefficient means. Mean parameter bias and relative standard error by sample-of-alternatives size (100, 50, 25, 10, 5 alternatives).]

[Figure: What can go wrong? Latent class probability coefficient means. Mean parameter bias and relative standard error by sample-of-alternatives size (100, 50, 25, 10, 5 alternatives).]

[Figure: What can go wrong? Random coefficient means. Mean parameter bias and relative standard error by sample-of-alternatives size (100, 50, 25, 10, 5 alternatives).]

Practical Dilemma
Mixed logit
  (+) Overcomes behavioral limitations of IIA
  (+) More flexibly accounts for unobserved preference heterogeneity
  (-) Does not generate consistent estimates w/ sampling (McConnell and Tseng 2000; Nerella and Bhat 2004; our results)
Fixed parameter logit
  (-) Limited by IIA
  (-) Limited ability to account for unobserved heterogeneity (nested logit?)
  (+) Does generate consistent estimates w/ sampling

Our Contribution
- Develop an expectation-maximization (EM) approach to estimate latent class mixed logit models for large choice set problems
- Embeds sampling of alternatives at the M step
- Computationally tractable for large (but not innumerable) choice sets
- Can account for unobserved attributes / endogeneity using the Berry (1994) contraction mapping

Our Contribution
- Monte Carlo simulations suggest consistency
- Need a relatively large sample size for precise estimates
- Quantify the small sample bias / precision / run time tradeoff with a recreation data set

Related Literature
Fox (RAND, 2007), Spiller (Ph.D. diss., 2011)
- Maximum score estimator using pairwise comparisons
- Nonparametric approach that allows for heteroskedasticity in the errors across individuals but homoskedasticity and limited correlations across alternatives for a given individual
- Works with choice sets that are effectively innumerable
- Counterfactual analysis?
- Assumes IIA
- Only works with fixed parameter specifications
- Can incorporate group specific (not alternative specific) constants

Latent Class Model
Intuition:
- Population can be segmented into a finite number of types or classes
- Analyst does not observe class membership (probabilistic)
- Within each class, preferences are homogeneous
- But across classes, preferences are heterogeneous

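Latent Class Model
Setup:
  $LL = \sum_{i=1}^{N} \ln \left[ \sum_{c=1}^{C} s_{ic}(\delta)\, L_{ic}(\beta_c) \right]$
where:
- Pr(class membership), with demographics $z_i$: $s_{ic}(\delta) = \dfrac{\exp(z_i\delta_c)}{\sum_{l=1}^{C} \exp(z_i\delta_l)}$
- Conditional likelihood: $L_{ic}(\beta_c) = \prod_{j=1}^{J} \left[ \dfrac{\exp(x_{ij}\beta_c)}{\sum_{m=1}^{J} \exp(x_{im}\beta_c)} \right]^{1_{ij}}$, where $1_{ij}$ equals 1 if individual $i$ chooses alternative $j$ and 0 otherwise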
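A minimal sketch of this log-likelihood is given below, assuming one observed choice per individual. Array shapes, the normalization of the first class's membership coefficients to zero, and all numeric values are illustrative assumptions rather than details from the slides.

```python
import numpy as np

def class_shares(Z, delta):
    """s_ic as an (N, C) array. Z: (N, M) demographics; delta: (M, C) coefficients."""
    a = Z @ delta
    ea = np.exp(a - a.max(axis=1, keepdims=True))
    return ea / ea.sum(axis=1, keepdims=True)

def class_likelihoods(X, y, betas):
    """L_ic as an (N, C) array. X: (N, J, K); y: (N,) chosen index; betas: (C, K)."""
    v = np.einsum('njk,ck->ncj', X, betas)
    ev = np.exp(v - v.max(axis=2, keepdims=True))
    p = ev / ev.sum(axis=2, keepdims=True)
    return p[np.arange(len(y)), :, y]          # probability of the chosen alternative

def log_likelihood(X, y, Z, betas, delta):
    """Sum over individuals of the log of the share-weighted class likelihoods."""
    mix = (class_shares(Z, delta) * class_likelihoods(X, y, betas)).sum(axis=1)
    return np.log(mix).sum()

rng = np.random.default_rng(0)
N, J, K, C = 6, 4, 2, 2
X = rng.normal(size=(N, J, K))
y = rng.integers(0, J, size=N)
Z = np.column_stack([np.ones(N), rng.normal(size=N)])     # constant plus one demographic
betas = np.array([[1.0, -0.5], [-0.5, 1.0]])
delta = np.array([[0.0, 0.3], [0.0, -0.2]])               # first class normalized to zero
ll = log_likelihood(X, y, Z, betas, delta)
print(ll)
```

The log of a sum inside `log_likelihood` is what makes direct maximization awkward and motivates the EM approach.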

Expectation-Maximization (EM) Algorithm
- Attractive when estimating mixture models or models with latent data (i.e., class membership)
- Transforms the maximization of a log of a sum (mixed logit) into a recursive maximization of a sum of logs (logit)
- Because the M step involves logit estimation, which embeds the IIA assumption, one can employ sampling

Latent Class Model via EM Algorithm
Expectation Step
- Construct the expectation of the likelihood conditional on the data and current parameter estimates
- Using Bayes' rule, construct the probability of being in class $c$ using the full choice set:
  $\Pr(c \mid \delta^t, \beta^t, y_i) = \dfrac{s_{ic}(\delta^t)\, L_{ic}(\beta_c^t)}{\sum_{l=1}^{C} s_{il}(\delta^t)\, L_{il}(\beta_l^t)}$
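The Bayes' rule update in the E step amounts to a row-wise normalization. The sketch below takes the class shares and class-conditional likelihoods as given (N, C) arrays; the numbers are hypothetical.

```python
import numpy as np

def e_step(shares, lik):
    """Posterior class membership probabilities.

    shares: (N, C) prior class probabilities s_ic at the current estimates.
    lik:    (N, C) class-conditional likelihoods L_ic on the full choice set.
    """
    joint = shares * lik
    return joint / joint.sum(axis=1, keepdims=True)

shares = np.array([[0.5, 0.5], [0.7, 0.3]])
lik = np.array([[0.2, 0.1], [0.05, 0.05]])
post = e_step(shares, lik)
print(post)
```

In the second row the likelihoods are equal across classes, so the observed choice is uninformative and the posterior equals the prior shares (0.7, 0.3). In the first row the posterior shifts toward the class that better explains the choice.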

Latent Class Model via EM Algorithm
Maximization Step
- Update parameter estimates by maximizing the conditional expected log-likelihood, with the class probabilities from the E step held fixed:
  $\max_{\delta,\beta} \sum_{i=1}^{N} \sum_{c=1}^{C} \Pr(c \mid \delta^t, \beta^t, y_i) \ln \left[ s_{ic}(\delta)\, L_{ic}(\beta_c) \right]$
- The problem separates into two pieces:
  - $\max_{\delta} \sum_{i=1}^{N} \sum_{c=1}^{C} \Pr(c \mid \delta^t, \beta^t, y_i) \ln s_{ic}(\delta)$ (separate estimation)
  - $\max_{\beta} \sum_{i=1}^{N} \sum_{c=1}^{C} \Pr(c \mid \delta^t, \beta^t, y_i) \ln L_{ic}(\beta_c)$ (separate logit estimation; can use sampling!)

Latent Class Model via EM Algorithm
Clarification:
- E step: use full choice set
  - Generally straightforward, but problematic with innumerable choice sets
- M step: use sample of alternatives
  - Logit estimation

Latent Class Model via EM Algorithm
- Iterate until convergence (small change in parameters)
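The full iteration can be sketched as below: the E step uses the full choice set, the M step updates each class's coefficients on a sampled choice set, and the loop stops when parameters change little. This is an illustrative simplification under stated assumptions, not the authors' implementation: two classes, constant class shares (no demographics), one Newton step per M step, a fixed sample of alternatives, and simulated data with made-up parameters.

```python
import numpy as np

def probs(X, beta):
    """Logit probabilities; X is (N, J, K), result is (N, J)."""
    v = X @ beta
    p = np.exp(v - v.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)

def newton_step(X, y, w, beta):
    """One Newton step on the weighted logit log-likelihood (weights w held fixed)."""
    N, _, K = X.shape
    p = probs(X, beta)
    xbar = np.einsum('nj,njk->nk', p, X)
    grad = (w[:, None] * (X[np.arange(N), y] - xbar)).sum(axis=0)
    info = np.einsum('n,nj,njk,njl->kl', w, p, X, X) - (w[:, None] * xbar).T @ xbar
    return beta + np.linalg.solve(info + 1e-8 * np.eye(K), grad)  # tiny ridge for stability

# Simulated data (all values hypothetical)
rng = np.random.default_rng(0)
N, J, K, m = 600, 30, 2, 10
true_betas = np.array([[1.5, -1.0], [-1.0, 1.5]])
cls = rng.integers(0, 2, size=N)
X = rng.normal(size=(N, J, K))
U = np.einsum('njk,nk->nj', X, true_betas[cls]) + rng.gumbel(size=(N, J))
y = U.argmax(axis=1)

# Fixed sample of alternatives: chosen alternative first, then m - 1 random others
Xs = np.empty((N, m, K))
for n in range(N):
    others = np.delete(np.arange(J), y[n])
    keep = np.append(y[n], rng.choice(others, size=m - 1, replace=False))
    Xs[n] = X[n, keep]
ys = np.zeros(N, dtype=int)

betas = np.array([[1.0, 0.0], [0.0, 1.0]])     # starting values
shares = np.array([0.5, 0.5])
rows = np.arange(N)
for iteration in range(500):
    # E step: posterior class probabilities from the full choice set
    lik = np.column_stack([probs(X, b)[rows, y] for b in betas])
    post = shares * lik
    post /= post.sum(axis=1, keepdims=True)
    # M step: class shares in closed form, class coefficients on the sampled set
    new_shares = post.mean(axis=0)
    new_betas = np.array([newton_step(Xs, ys, post[:, c], betas[c])
                          for c in range(len(betas))])
    change = max(np.abs(new_betas - betas).max(), np.abs(new_shares - shares).max())
    betas, shares = new_betas, new_shares
    if change < 1e-6:                          # small change in parameters
        break
print(iteration, shares, betas)
```

Because the likelihood is not globally concave, a run like this would normally be repeated from several starting values.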

Latent Class Model via EM Algorithm
Issues:
- Likelihood function is not globally concave
  - Try different starting values
- Inference: three approaches
  - Bootstrapping
  - Plug final estimates into the full likelihood Hessian
  - Gradients from the final step of the EM algorithm + OPG formula (Ruud 1991)

Latent Class Model via EM Algorithm
Issues (cont.):
- Model selection
  - Information criteria
- Unobserved characteristics / endogeneity
  - Because logit is a mean-fitting distribution, we can use the Berry (1994) contraction mapping to efficiently estimate alternative specific constants
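The contraction mapping idea can be sketched as follows: alternative specific constants are adjusted by the log difference between observed and predicted shares until the two match. The setup is hypothetical (simulated utilities, first constant normalized to zero), and the slides do not show how the authors embed this step in the EM iterations.

```python
import numpy as np

def predicted_shares(asc, V):
    """Average logit probabilities. V: (N, J) utilities excluding the constants."""
    u = V + asc
    p = np.exp(u - u.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    return p.mean(axis=0)

def berry_contraction(observed, V, tol=1e-12, max_iter=5000):
    """Solve for constants that make predicted shares match observed shares."""
    asc = np.zeros(V.shape[1])
    for _ in range(max_iter):
        step = np.log(observed) - np.log(predicted_shares(asc, V))
        step[0] = 0.0                      # first alternative is the normalized base
        asc = asc + step
        if np.abs(step).max() < tol:
            break
    return asc

rng = np.random.default_rng(0)
N, J = 200, 5
V = 0.5 * rng.normal(size=(N, J))                      # hypothetical x_ij * beta terms
true_asc = np.array([0.0, 0.4, -0.3, 0.2, -0.5])       # hypothetical constants
observed = predicted_shares(true_asc, V)               # shares implied by the true constants
asc_hat = berry_contraction(observed, V)
print(asc_hat)
```

Solving for the constants this way avoids including one parameter per alternative in the likelihood search, which matters when the choice set has hundreds of sites.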

[Figure: Monte Carlo evidence, LC w/ EM. Fixed coefficient means. Mean parameter bias and relative standard error by sample-of-alternatives size (100, 50, 25, 10, 5 alternatives).]

[Figure: Monte Carlo evidence, LC w/ EM. Latent class probability coefficient means. Mean parameter bias and relative standard error by sample-of-alternatives size (100, 50, 25, 10, 5 alternatives).]

[Figure: Monte Carlo evidence, LC w/ EM. Random coefficient means. Mean parameter bias and relative standard error by sample-of-alternatives size (100, 50, 25, 10, 5 alternatives).]

Empirical Application
- 1997 Wisconsin angler data
- 512 anglers making site choices from 569 recreation sites (primarily lakes)
- Site choice influenced by travel costs, 15 site attributes (e.g., catch rates, bathrooms), and demographics (kids, income)

[Figure: Conditional Logit Results (average across 200 runs). Scenario 4: Agricultural Runoff Mgmt, a 5% catch rate increase of all fish at all non-urban/forest/refuge sites. Mean WTP per trip ($) by sample size (Full, 50%, 25%, 12.5%, 5%, 2%, 1%).]

Conditional Logit Results (average across 200 runs)

Sample Size (%)     50%    25%    12.5%   5%     2%      1%
Sample Size (#)     285    142    71      28     11      6
Efficiency Loss     6%     16%    33%     76%    165%    272%
Bias                1%     3%     6%      13%    28%     42%
Time Savings        56%    80%    90%     98%    99%     99%

[Figure: Latent Class Results (average across 25 runs). Scenario 4: Agricultural Runoff Mgmt, a 5% catch rate increase of all fish at all non-urban/forest/refuge sites. Mean WTP per trip ($) by sample size (Full, 50%, 25%, 12.5%, 5%, 2%, 1%).]

Latent Class Results (average across 25 runs)

Sample Size (%)     50%    25%    12.5%   5%     2%        1%
Sample Size (#)     285    142    71      28     11        6
Efficiency Loss     10%    28%    51%     76%    84632%    18360%
Bias                19%    15%    6%      6%     34%       60%
Time Savings        33%    56%    75%     84%    81%       86%

Summary
- By exploiting a modified EM algorithm, one can estimate random coefficient discrete choice models with sampled choice sets
- Tradeoffs in terms of efficiency, bias & run time
- Our results suggest that moderately sized samples can generate good estimates in reasonable amounts of time

Extensions
- Mixed count data models
- Non-linear pricing models

Thank You! Contact me with any comments: roger_von_haefen@ncsu.edu