Queensland University of Technology Transport Data Analysis and Modeling Methodologies


Lab Session #11 (Mixed Logit Analysis II)

You are given accident, environmental, traffic, and roadway geometric data from 275 segments of highway in Washington State. The data are from 1990. The injury data consist of three possible outcomes: no injury, possible injury, and injury. Your task is to estimate a mixed logit model of these three possible discrete outcomes.

The mixed logit model allows for parameter variations across roadway segments (i.e., variations in β). A mixing distribution is introduced, giving injury-severity proportions (see Train 2003):

$$P_{in} = \int \frac{\mathrm{EXP}[\beta_i X_{in}]}{\sum_{I} \mathrm{EXP}[\beta_I X_{In}]} \, f(\beta \mid \varphi) \, d\beta \qquad (3)$$

where f(β | φ) is the density function of β, with φ referring to a vector of parameters of the density function (mean and variance), and all other terms are as previously defined. Equation 3 is the formulation for the mixed logit model.

For model estimation, β can now account for segment-specific variations of the effect of X on injury-severity proportions, with the density function f(β | φ) used to determine β. Mixed logit proportions are then a weighted average for different values of β across roadway segments, where some elements of the vector β may be fixed and some may be randomly distributed. If the parameters are random, the mixed logit weights are determined by the density function f(β | φ). Most studies have used a continuous form of this density function in model estimation (such as a normal distribution), and this is what you are to use.

In your specification, consider random variable possibilities including constant or fixed (C), normally distributed (N), and log-normally distributed (L).

1. The results of your best model specification.
2. A discussion of the logical process that led you to the selection of your final specification (the theory behind the inclusion of your selected variables). Include t-statistics and justify the signs of your variables.
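The integral in Equation 3 has no closed form, so in practice it is approximated by averaging ordinary logit proportions over draws of β. The following is a minimal sketch of that averaging, not the NLOGIT implementation: it uses pseudo-random normal draws from the Python standard library rather than Halton draws, and the function names, the single random parameter, and the fixed utilities 0.5 and 0.0 are illustrative assumptions.

```python
import math
import random


def logit_shares(utilities):
    """Multinomial logit proportions for one list of utilities."""
    m = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def simulated_mixed_logit_shares(x, mean, sd, n_draws=200, seed=0):
    """Approximate Equation 3 by averaging logit shares over normal draws.

    Hypothetical setup: outcome 0 has utility beta * x with
    beta ~ Normal(mean, sd); outcomes 1 and 2 have fixed utilities
    0.5 and 0.0 (made-up values for illustration only).
    """
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(n_draws):
        beta = rng.gauss(mean, sd)  # one draw from f(beta | phi)
        shares = logit_shares([beta * x, 0.5, 0.0])
        totals = [t + s for t, s in zip(totals, shares)]
    return [t / n_draws for t in totals]


if __name__ == "__main__":
    print(simulated_mixed_logit_shares(x=1.0, mean=-0.5, sd=1.0))
```

Setting `sd=0.0` collapses the average to the standard multinomial logit proportions, which is a quick way to check the simulation against the fixed-parameter case.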

Variables available for your specification are (in file Ex16-1.txt):

Variable    Explanation
ID          Segment ID number
FREQ        Number of accidents
ROUTE       Route number
LENGTH      Segment length in miles
INCLANES    Number of lanes in increasing milepost direction
DECLANES    Number of lanes in decreasing milepost direction
WIDTH       Total combined width of all lanes
MIMEDSH     Minimum median shoulder in feet
MXMEDSH     Maximum median shoulder in feet
SPEED       Speed limit (mi/h)
URB         Indicates urban area (1=yes, 0=no)
FC          Functional class (1=local, 2=collector, 3=arterial, 4=principal arterial, 5=interstate)
AADT        Average Annual Daily Traffic
SINGLE      Daily percentage of single unit trucks
DOUBLE      Daily percentage of tractor and trailer trucks
TRAIN       Daily percentage of tractor and two-trailer trucks
PEAKHR      Percent of daily traffic in the peak hour
GRADEBR     Number of grade breaks in the segment
MIGRADE     Minimum grade in the segment
MXGRADE     Maximum grade in the segment
MXGRDIFF    Maximum grade difference in the segment
TANGENT     Tangent length in the segment
CURVES      Number of curves in the segment
MINRAD      Minimum radius in feet
ACCESS      Segment access control (0=none, 1=partial, 3=full)
MEDWIDTH    Median width (1=less than 30ft; 2=30 to 40ft; 3=40 to 50ft; 4=50 to 60ft; 5=high)
FRICTION    Friction value (0 to 100 with 100 being high)
ADTLANE     Average daily travel per lane
SLOPE       Segment slope (0=flat, 1=slight, 2=medium, 3=high)
INTECHAG    Indicates number of interchanges in the segment
AVEPRE      Average precipitation per month in inches
AVESNOW     Average snowfall per month in inches

--> read;nvar=32;nobs=825;names=
    ID,INJFREQ,ROUTE,LENGTH,INCLANES,DECLANES,WIDTH,MIMEDSH,
    MXMEDSH,SPEED,URB,FC,AADT,
    SINGLE,DOUBLE,TRAIN,PEAKHR,GRADEBR,MIGRADE,MXGRADE,MXGRDIFF,
    TANGENT,CURVES,MINRAD,ACCESS,MEDWIDTH,
    FRICTION,ADTLANE,SLOPE,
    INTECHAG,AVEPRE,AVESNOW;
    FILE=D:Ex16-1.txt$
--> create;laneadt=aadt/(inclanes+declanes)$
--> create;lnlanadt=log(laneadt)$
--> create;lnaadt=log(aadt)$
--> create;density=laneadt/length$
--> create;if(friction<=30)lowfri=1$
--> create;if(friction>30&friction<50)medfri=1$
--> create;if(friction>=50)hifri=1$
--> create;curvmile=curves/length$
--> create;if(curvmile<=0.5)lowcvmil=1;(else)lowcvmil=0$
--> create;if(curvmile>0.5&curvmile<=2.5)medcvmil=1;(else)medcvmil=0$
--> create;if(curvmile>2.5)hicvmil=1;(else)hicvmil=0$
--> create;truck=single+double+train$
--> create;pcttruck=truck/aadt$
--> create;if(medwidth=1)med030=1$
--> create;if(medwidth=2)med3040=1$
--> create;if(medwidth=3)med4050=1$
--> create;if(medwidth=4)med5060=1$
--> create;if(medwidth=5)med60=1$
--> create;if(speed<=50)speed1=1$
--> create;if(speed<=55)speed2=1$
--> create;if(speed>55)speed3=1$
--> create;if(speed>=55)speed4=1$
--> create;if(fc=1)local=1$
--> create;if(fc=5)intstate=1$
--> create;if(access=0)none=1$
--> create;if(access=1)partial=1$
--> create;if(access=2)full=1$
--> create;if(slope=0)flat=1$
--> create;if(slope=1)slight=1$
--> create;if(slope=2)medium=1$
--> create;if(slope=0|slope=1)slpflat=1;(else)slpflat=0$
--> create;if(slope=2)slpmed=1;(else)slpmed=0$
--> create;if(avepre<=1.5)lowpre=1;(else)lowpre=0$
--> create;if(avepre>1.5&avepre<=2.5)medpre=1;(else)medpre=0$
--> create;if(avepre>2.5)hipre=1;(else)hipre=0$
--> create;if(avesnow<=1)norsnow=1$
--> create;if(avesnow>1)hisnow=1$
--> create;lanewid=(inclanes+declanes)/width$
--> dstat;rhs=lanewid$

Descriptive Statistics
All results based on nonmissing observations.
===============================================================================
Variable   Mean            Std.Dev.        Minimum         Maximum         Cases
===============================================================================
-------------------------------------------------------------------------------
All observations in current sample
-------------------------------------------------------------------------------
LANEWID    .809060949E-01  .643688046E-02  .392156863E-01  .869565217E-01  825

--> create;if(lanewid<12)nlanwid=1;(else)nlanwid=0$
--> create;if(lanewid>12)wlanwid=1;(else)wlanwid=0$
--> create;intmi=intechag/length$
--> create;gbmile=gradebr/length$
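The create commands above build per-lane and per-mile rates together with mutually exclusive indicator variables. A minimal Python sketch of the same transformations for a single segment record is given here for readers without NLOGIT; the function name `derive_indicators`, the dict layout, and the sample values are illustrative assumptions and are not taken from Ex16-1.txt.

```python
def derive_indicators(row):
    """Return selected derived variables for one segment record (a dict)."""
    lanes = row["inclanes"] + row["declanes"]
    curvmile = row["curves"] / row["length"]  # curves per mile
    return {
        "laneadt": row["aadt"] / lanes,            # traffic per lane
        "curvmile": curvmile,
        # Exactly one of the three curvature indicators equals 1.
        "lowcvmil": int(curvmile <= 0.5),
        "medcvmil": int(0.5 < curvmile <= 2.5),
        "hicvmil": int(curvmile > 2.5),
        "intmi": row["intechag"] / row["length"],  # interchanges per mile
        "gbmile": row["gradebr"] / row["length"],  # grade breaks per mile
    }


if __name__ == "__main__":
    # Hypothetical segment, values made up for illustration.
    sample = {"inclanes": 2, "declanes": 2, "aadt": 40000.0, "length": 2.0,
              "curves": 3, "intechag": 1, "gradebr": 4}
    print(derive_indicators(sample))
```

Coding the categories from a single continuous value with non-overlapping ranges keeps the indicators mutually exclusive, so one category can be left out of a utility function as the base.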

--> nlogit;lhs=injfreq;
    choices=pdo,pinj,inj;
    model:
    U(pdo)=a0+a1*laneadt+a3*minrad/
    U(pinj)=b0+b2*truck/
    U(inj)=c3*friction+c2*intmi+c1*gbmile
    ;fcn=a0(c),a1(c),a3(n),
    b0(c),b2(n),c2(n),c3(c),c1(n);rpl;frequencies;parameter;pts=200,halton$

Normal exit from iterations. Exit status=0.

Start values obtained using nonnested model
Maximum Likelihood Estimates
Model estimated: Sep 14, 2010 at 11:06:53AM.
Dependent variable               Choice
Weighting variable               None
Number of observations           258
Iterations completed             5
Log likelihood function          -4485.876
R2=1-LogL/LogL*    Log-L fncn    R-sqrd    RsqAdj
No coefficients    -5116.2374    .12321    .10233
Constants only.    Must be computed directly.
                   Use NLOGIT ;...; RHS=ONE $
Chi-squared[ 6]             = 111.53902
Prob [ chi squared > value ] = .00000
Response data are given as frequencies.
Number of obs.= 275, skipped 17 bad obs.

Variable   Coefficient    Standard Error   b/St.Er.   P[|Z|>z]
A0         2.11216686     .38393093         5.501     .0000
A1         .690742D-05    .479817D-05       1.440     .1500
A3         -.125860D-04   .598640D-05      -2.102     .0355
B0         1.73119412     .37948435         4.562     .0000
B2         -.05631942     .00690283        -8.159     .0000
C2         -.11618126     .07426145        -1.564     .1177
C3         .02623092      .00746039         3.516     .0004
C1         -.02657617     .02626467        -1.012     .3116

Normal exit from iterations. Exit status=0.

Random Parameters Logit Model
Maximum Likelihood Estimates
Model estimated: Sep 14, 2010 at 11:09:57AM.
Dependent variable               INJFREQ
Weighting variable               None
Number of observations           825
Iterations completed             30
Log likelihood function          -4440.983
Restricted log likelihood        -5116.237
Chi squared                      1350.508
Degrees of freedom               12
Prob[ChiSqd > value] =           .0000000
R2=1-LogL/LogL*    Log-L fncn    R-sqrd    RsqAdj
No coefficients    -5116.2374    .13198    .11132
Constants only.    Must be computed directly.
                   Use NLOGIT ;...; RHS=ONE $
At start values    -4485.8756    .01001    -.01356
Response data are given as frequencies.

Random Parameters Logit Model
Replications for simulated probs. = 200
Number of obs.= 275, skipped 17 bad obs.

Variable   Coefficient    Standard Error   b/St.Er.   P[|Z|>z]
Random parameters in utility functions
A0         2.63055229     .57514058         4.574     .0000
A1         .113482D-04    .724898D-05       1.565     .1175
A3         .188581D-04    .276536D-04        .682     .4953
B0         2.73068134     .64190615         4.254     .0000
B2         -.13353509     .03899427        -3.424     .0006
C2         -1.26570151    .34650281        -3.653     .0003
C3         .04349874      .01148259         3.788     .0002
C1         -.17878575     .08955385        -1.996     .0459
Derived standard deviations of parameter distributions
CsA0       .000000        ......(Fixed Parameter).......
CsA1       .000000        ......(Fixed Parameter).......
NsA3       .00044556      .00010979         4.058     .0000
CsB0       .000000        ......(Fixed Parameter).......
NsB2       .10031812      .03522847         2.848     .0044
NsC2       2.27834370     .51217501         4.448     .0000
CsC3       .000000        ......(Fixed Parameter).......
NsC1       .31433085      .16756988         1.876     .0607
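The two sets of estimates can be compared using the reported log-likelihoods, and each normally distributed parameter can be summarized by the share of its distribution that lies above zero. The sketch below uses only numbers printed in the output above. It assumes the fixed-parameter start-value model is the restricted form of the random parameters model (the four estimated standard deviations set to zero), and the helper names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def share_positive(mean, sd):
    """Share of a Normal(mean, sd) parameter distribution above zero."""
    return normal_cdf(mean / sd)


# Log-likelihoods reported in the output above.
ll_zero = -5116.2374    # no coefficients
ll_fixed = -4485.876    # fixed-parameter (start value) model
ll_random = -4440.983   # random parameters logit model

# Likelihood ratio statistic; 4 degrees of freedom, one for each
# estimated standard deviation (A3, B2, C2, C1).
lr_stat = -2.0 * (ll_fixed - ll_random)

# Rho-squared relative to the no-coefficients model.
rho2_fixed = 1.0 - ll_fixed / ll_zero
rho2_random = 1.0 - ll_random / ll_zero

# Means and standard deviations of the normally distributed parameters.
random_params = {
    "A3": (0.188581e-04, 0.00044556),
    "B2": (-0.13353509, 0.10031812),
    "C2": (-1.26570151, 2.27834370),
    "C1": (-0.17878575, 0.31433085),
}

if __name__ == "__main__":
    print("LR statistic:", round(lr_stat, 3))
    print("rho-squared (fixed, random):",
          round(rho2_fixed, 5), round(rho2_random, 5))
    for name, (mean, sd) in random_params.items():
        print(name, "share above zero:", round(share_positive(mean, sd), 4))
```

The likelihood ratio statistic is 89.786, well above the 5% chi-squared critical value of 9.488 for 4 degrees of freedom, which favors the random parameters specification under the nesting assumption stated above. The share-above-zero figures help when justifying signs: a random parameter with a negative mean can still be positive for a portion of the segments.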