Computer Lab II Biogeme & Binary Logit Model Estimation


Computer Lab II: Biogeme & Binary Logit Model Estimation
Evanthia Kazagli, Anna Fernandez Antolin & Antonin Danalet
Transport and Mobility Laboratory, School of Architecture, Civil and Environmental Engineering, École Polytechnique Fédérale de Lausanne
September 23, 2014
EK, AFA, AD (TRANSP-OR), Computer Lab II, September 23, 2014

Today
- Further introduction to BIOGEME
- Estimation of Binary Logit models

How does BIOGEME work?
Input to BIOGEME:
- Model file: model.mod
- Data file: data.dat
- Parameters: default.par
Output from BIOGEME:
- Results: .html
- Final model: .res
- Data statistics, etc.: .sta, .log, .rep, ...

BIOGEME - Data file
- File extension: .dat
- First row contains column (variable) names.
- One observation per row.
- Each row must contain a choice indicator.
- Example with the Netherlands transportation mode choice data: choice between car and train.

BIOGEME - Data file: netherlands.dat

id    choice  rail_cost  rail_time  car_cost  car_time
1     0       40         2.5        5         1.167
2     0       35         2.016      9         1.517
3     0       24         2.017      11.5      1.966
4     0       7.8        1.75       8.333     2
5     0       28         2.034      5         1.267
...
219   1       35         2.416      6.4       1.283
220   1       30         2.334      2.083     1.667
221   1       35.7       1.834      16.667    2.017
222   1       47         1.833      72        1.533
223   1       30         1.967      30        1.267

In netherlands.dat, the id column is the unique identifier of observations.

In netherlands.dat, the choice column is the choice indicator: 0 for car and 1 for train.
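The data file rules above (header row, one observation per row, a choice indicator in every row) can be checked before running an estimation. The following is a minimal sketch using only the Python standard library; the helper names `read_dat` and `check_choice` are hypothetical and are not part of BIOGEME. The sample text reproduces the first five rows of netherlands.dat shown on the slide.

```python
import io

# First five rows of netherlands.dat, as shown on the slide.
SAMPLE = """\
id choice rail_cost rail_time car_cost car_time
1 0 40 2.5 5 1.167
2 0 35 2.016 9 1.517
3 0 24 2.017 11.5 1.966
4 0 7.8 1.75 8.333 2
5 0 28 2.034 5 1.267
"""


def read_dat(handle):
    """Parse a whitespace-separated .dat file into a list of dicts.

    The first row is taken as the column (variable) names and every
    following non-empty row as one observation.
    """
    header = handle.readline().split()
    rows = []
    for line_no, line in enumerate(handle, start=2):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} columns, got {len(fields)}"
            )
        rows.append(dict(zip(header, map(float, fields))))
    return rows


def check_choice(rows, valid=(0, 1)):
    """Raise if any observation lacks a valid choice indicator (0: car, 1: train)."""
    for row in rows:
        if row.get("choice") not in valid:
            raise ValueError(f"observation {row.get('id')}: invalid choice indicator")


rows = read_dat(io.StringIO(SAMPLE))
check_choice(rows)
print(len(rows), "observations read")
```

To check a file on disk, the same functions could be called as `read_dat(open("netherlands.dat"))`, assuming the file sits in the working directory.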

BIOGEME - Model file
- File extension: .mod
- Must be consistent with the data file.
- Contains deterministic utility specifications, model type, etc.
- The model file contains different [Sections] describing different elements of the model specification.

BIOGEME - Model file
How can we write the following deterministic utility functions in BIOGEME?

$$V_{\text{car}} = \text{ASC}_{\text{car}} + \beta_{\text{time}}\,\text{time}_{\text{car}} + \beta_{\text{cost}}\,\text{cost}_{\text{car}}$$

$$V_{\text{rail}} = \beta_{\text{time}}\,\text{time}_{\text{rail}} + \beta_{\text{cost}}\,\text{cost}_{\text{rail}}$$

BIOGEME - Model file

[Choice]
choice

[Beta]
// Name     DefaultValue  LowerBound  UpperBound  status
ASC_CAR     0.0           -100.0      100.0       0
ASC_RAIL    0.0           -100.0      100.0       1
BETA_COST   0.0           -100.0      100.0       0
BETA_TIME   0.0           -100.0      100.0       0

[Utilities]
// Id  Name  Avail  linear-in-parameter expression
0      Car   one    ASC_CAR * one + BETA_COST * car_cost + BETA_TIME * car_time
1      Rail  one    ASC_RAIL * one + BETA_COST * rail_cost + BETA_TIME * rail_time


BIOGEME - Model file: two open questions about the specification
- What is one?
- Which is the type of model?

BIOGEME - Model file

[Expressions]
// Define here arithmetic expressions for name that are not directly
// available from the data
one = 1

[Model]
// Currently, only $MNL (multinomial logit), $NL (nested logit), $CNL
// (cross-nested logit) and $NGEV (Network GEV model) are valid keywords
//
$MNL
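To make concrete what the model file specifies, the sketch below evaluates the two utility functions and the resulting binary logit probabilities in plain Python. It is not how BIOGEME is implemented. The parameter values in `BETA` are hypothetical and chosen only for illustration (they are not estimates), and the two observations are rows 1 and 219 of netherlands.dat as shown on the slide. With only two alternatives, $MNL reduces to the binary logit $P(\text{car}) = 1 / (1 + e^{V_{\text{rail}} - V_{\text{car}}})$.

```python
import math

# Hypothetical parameter values, for illustration only (NOT estimation results).
# ASC_RAIL is fixed to 0, mirroring status = 1 in the [Beta] section.
BETA = {"ASC_CAR": 0.5, "ASC_RAIL": 0.0, "BETA_COST": -0.05, "BETA_TIME": -1.0}
ZERO = {name: 0.0 for name in BETA}  # the "naive" model with all parameters at 0

# Rows 1 and 219 of netherlands.dat (choice 0: car, 1: train).
DATA = [
    {"choice": 0, "rail_cost": 40.0, "rail_time": 2.5, "car_cost": 5.0, "car_time": 1.167},
    {"choice": 1, "rail_cost": 35.0, "rail_time": 2.416, "car_cost": 6.4, "car_time": 1.283},
]


def utilities(obs, beta):
    """Deterministic utilities, as written in the [Utilities] section."""
    v_car = (beta["ASC_CAR"] + beta["BETA_COST"] * obs["car_cost"]
             + beta["BETA_TIME"] * obs["car_time"])
    v_rail = (beta["ASC_RAIL"] + beta["BETA_COST"] * obs["rail_cost"]
              + beta["BETA_TIME"] * obs["rail_time"])
    return v_car, v_rail


def prob_car(obs, beta):
    """Binary logit probability of choosing car."""
    v_car, v_rail = utilities(obs, beta)
    return 1.0 / (1.0 + math.exp(v_rail - v_car))


def log_likelihood(data, beta):
    """Sum of the log-probabilities of the observed choices."""
    total = 0.0
    for obs in data:
        p_car = prob_car(obs, beta)
        total += math.log(p_car if obs["choice"] == 0 else 1.0 - p_car)
    return total


for obs in DATA:
    print(f"P(car) = {prob_car(obs, BETA):.3f}")
print(f"LL at hypothetical values: {log_likelihood(DATA, BETA):.3f}")
print(f"LL at zero: {log_likelihood(DATA, ZERO):.3f}")
```

Estimation amounts to searching for the parameter values that maximize `log_likelihood` over the whole sample; that search is what BIOGEME performs.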

Model and Data Files
- How to read and modify model files? How to read data files?
- Use GNU Emacs, vi, TextEdit (Mac) or Wordpad (Windows).
- Notepad (Windows) should not be used!

BIOGEME - Results (Netherlands dataset)
- General model information
- Coefficient estimates


Binary Logit Case Study
Available datasets:
- Airline itinerary choice (Boeing)
- Choice-Lab marketing
- Mode choice in Netherlands
- Residential Telephone Services
- Mode choice in Switzerland (Optima)
Descriptions are available on the course webpage.
The Optima dataset does not contain .mod files. A specification has to be proposed as an assignment (next lab session).

How to go through the Case Studies
- Choose a dataset to work with (data descriptions are available on the course webpage).
- Copy the files related to the chosen dataset and case study from the course webpage.
- Go through the .mod files with the help of the descriptions.
- Run the .mod files with BIOGEME.
- Interpret the results and compare your interpretation with the one we have proposed.
- Develop other model specifications.

Course webpage
http://transp-or.epfl.ch/ > Teaching > Mathematical modeling of behavior > Laboratories
- BIOGEME software (including documentation and utilities)
- For each Case Study: data files; model specification files; possible interpretation of results.

Today's plan
1. Independent work on 2-3 Case Studies: choose a case study; estimate a model; interpret the results.
2. Group work: gather in groups; generate a .mod file (base); test an idea/hypothesis.

Specifying models: Recommended steps
- Formulate a-priori hypotheses: expectations and intuition regarding the explanatory variables that are likely to be significant for mode choice.
- Specify a minimal model: start simple and include the main factors affecting the mode choice of (rational) travelers. This will be your starting point.
- Continue adding and testing variables that improve the initial model, both in terms of causality and in terms of how well the model reproduces the choices actually observed in the sample.

Evaluating models
The main indicators used to evaluate and compare the various models are summarised here:
- Informal tests:
  - signs and relative magnitudes of the parameter values β (checked against our a-priori expectations);
  - trade-offs among attributes, given by ratios of pairs of parameters (e.g. a reasonable value of time).
- Overall goodness-of-fit measure:
  - adjusted rho-square (likelihood ratio index): it penalizes the number of estimated parameters, so it is suitable for comparing models with different numbers of explanatory variables. We check this value to get a first idea of which model might be better (among models of the same type), but it is not a statistical test.
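The informal indicators above are simple to compute by hand. The sketch below assumes the usual definitions, $\rho^2 = 1 - LL(\hat\beta)/LL(0)$ and $\bar\rho^2 = 1 - (LL(\hat\beta) - K)/LL(0)$ with $K$ the number of estimated parameters, and takes the value of time as the ratio $\beta_{\text{time}} / \beta_{\text{cost}}$. All numbers in the example are hypothetical and are not results from the Netherlands dataset.

```python
def rho_square(ll_null, ll_final):
    """Likelihood ratio index: 1 - LL(beta_hat) / LL(0)."""
    return 1.0 - ll_final / ll_null


def adjusted_rho_square(ll_null, ll_final, n_params):
    """Adjusted likelihood ratio index: 1 - (LL(beta_hat) - K) / LL(0)."""
    return 1.0 - (ll_final - n_params) / ll_null


def value_of_time(beta_time, beta_cost):
    """Trade-off between time and cost, in cost units per unit of time."""
    return beta_time / beta_cost


# Hypothetical values for illustration only.
ll_null, ll_final, n_params = -100.0, -50.0, 3
print(f"rho-square:          {rho_square(ll_null, ll_final):.3f}")
print(f"adjusted rho-square: {adjusted_rho_square(ll_null, ll_final, n_params):.3f}")
print(f"value of time:       {value_of_time(-1.0, -0.05):.1f} cost units per time unit")
```

The units of the value of time follow the data: with times in hours and costs in a given currency, the ratio reads as currency per hour. A negative value would signal that one of the two parameters has an unexpected sign.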

Evaluating models (cont.)
- Statistical tests:
  - t-test values: a parameter is statistically significant at the 95% level of confidence when its t-statistic is larger than about 2 in absolute value (the exact threshold is 1.96);
  - final log-likelihood for the full set of parameters: it should be markedly different from the values of the naive approaches (null log-likelihood and log-likelihood at constants); we look for high values of the likelihood ratio test statistic $-2\,(LL(0) - LL(\hat\beta))$, which indicate a model significantly different from the naive one.
- Test of entire models:
  - likelihood ratio test, with statistic $-2\,(LL(\hat\beta_R) - LL(\hat\beta_U))$: used to test the null hypothesis that two models are equivalent, under the requirement that one is a restricted version of the other. Under the null hypothesis, the statistic is $\chi^2$ distributed with $K_U - K_R$ degrees of freedom, where $K_U$ and $K_R$ are the numbers of parameters of the unrestricted and restricted models, respectively.
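The likelihood ratio test described above can be carried out with a few lines of Python. This is a sketch under stated assumptions: the log-likelihood values and parameter counts in the example are hypothetical, and the helper `chi2_sf` is a small stand-in for a statistics library, computing the upper-tail probability of a $\chi^2$ variable with integer degrees of freedom from the standard recurrence for the regularized incomplete gamma function.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    y = x / 2.0
    # Closed-form starting points: Q(1/2, y) = erfc(sqrt(y)) and Q(1, y) = exp(-y).
    if df % 2:
        s, q = 0.5, math.erfc(math.sqrt(y))
    else:
        s, q = 1.0, math.exp(-y)
    # Recurrence Q(s + 1, y) = Q(s, y) + y**s * exp(-y) / Gamma(s + 1).
    while s < df / 2.0:
        q += y ** s * math.exp(-y) / math.gamma(s + 1.0)
        s += 1.0
    return q


def likelihood_ratio_test(ll_restricted, ll_unrestricted, k_restricted, k_unrestricted):
    """Return (statistic, degrees of freedom, p-value) of the likelihood ratio test."""
    statistic = -2.0 * (ll_restricted - ll_unrestricted)
    df = k_unrestricted - k_restricted
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical example: the unrestricted model adds one parameter.
stat, df, p_value = likelihood_ratio_test(-125.0, -120.0, 2, 3)
print(f"statistic = {stat:.2f}, df = {df}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis: the restriction is not supported.")
else:
    print("Cannot reject the null hypothesis: prefer the restricted model.")
```

The same function covers the comparison with the naive model: pass $LL(0)$ as the restricted log-likelihood with zero parameters.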