Comparison of Logit Models to Machine Learning Algorithms for Modeling Individual Daily Activity Patterns


Daniel Fay, Peter Vovsha, Gaurav Vyas (WSP USA)

Logit vs. Machine Learning Models
Logit models:
- Convenient model properties
- Easy replication of observed aggregate shares
- Suffer from combinatorial explosion of alternatives
- Mostly linear additive specifications of utilities
Machine learning models:
- Capture non-linear effects of variables and their combinations
- Many different ML methods available
- Prioritize individual prediction rather than aggregate shares
- Suffer from systematic over- or under-prediction
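For reference, the "linear additive" utility and the resulting choice probability can be written in the standard multinomial logit form. The symbols below (x for the explanatory variables, beta for the coefficients, C for the choice set) are generic textbook notation assumed here, not taken from the slides.

```latex
% Systematic utility of alternative i for person n: linear and additive in the variables
V_{ni} = \beta_i^{\top} x_n

% Multinomial logit probability of choosing alternative i from choice set C
P_n(i) = \frac{\exp(V_{ni})}{\sum_{j \in C} \exp(V_{nj})}
```

When every alternative but one has its own constant, the maximum-likelihood estimates reproduce the observed sample shares, which is the "easy replication of aggregate shares" property listed above.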

Research Focus
- Individual prediction of daily activity pattern types as part of an activity-based model (ABM)
- Resolving combinatorial explosion of alternatives
- Applying model constraints to decision trees
- Behavioral insights from combinations of variables provided by decision trees

Individual Daily Activity Pattern (DAP) Types
3 categories for each person-day:
- Mandatory: at least one work, university, or school trip
- Non-mandatory: at least one non-mandatory trip with no mandatory trips
- Home: no participation in out-of-home activities
Distinct travel patterns for each type

Modeling Coordinated Daily Activity Patterns
- Important to model DAP type for household members simultaneously
- Trinary choice model applied to household members jointly
- Leads to explosion in the number of alternatives: 27 combinations (3^3) for a 3-person family, 2,187 combinations (3^7) for a 7-person family
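The growth in joint alternatives can be checked by direct enumeration. This is a minimal sketch; the one-letter codes and the function name `joint_alternatives` are illustrative choices, not part of the original slides.

```python
from itertools import product

# One code per DAP type: Mandatory, Non-mandatory, Home (codes are illustrative)
DAP_TYPES = ("M", "N", "H")


def joint_alternatives(household_size):
    """Enumerate every joint DAP combination for a household of the given size."""
    return list(product(DAP_TYPES, repeat=household_size))


# The number of joint alternatives is 3 ** household_size
for size in (1, 2, 3, 7):
    print(size, len(joint_alternatives(size)))
```

Each additional household member triples the choice set, which is why a joint logit specification becomes unwieldy for large households.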

Machine Learning Applied to DAP
Objectives:
- Precision of predicted DAP, both for individuals and for aggregate shares
- Find a method to resolve the combinatorial explosion of the set of alternatives
- Identify key variable combinations and their non-linear impacts

Machine Learning Applied to DAP
Individual accuracy: Random Forest model vs. logit model
[Figure not reproduced in this transcription]

Machine Learning Applied to DAP
Resolving combinatorial explosion:
- Adjusted initial random forest probabilities using correlations between patterns
- Pairwise correlations
- Performed iteratively until convergence
- Eliminates the explosion of choices inherent in logit models
[Flow diagram elements: Household Travel Survey; Random Forest Classifier; DAP Probabilities; IDAP Correlation Matrices; Adjustments to IDAP Probabilities; Adjusted DAP Probabilities]

Random Forest Classifier Applied to DAP
Aggregate accuracy:
[Figure not reproduced in this transcription]

Applying Constraints to Decision Trees to Guarantee Desired Model Elasticity
- Constrain the first splits of the decision tree
- Find the optimal split at each leaf node
- Train subsequent branches of the tree
[Tree diagram: Age at the first level, Gender at the second level, Income at the third level]
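One way to read these three steps is sketched below: the variable used at each of the first levels is fixed in advance, only its threshold is optimized, and deeper levels are free to split on any variable. This is a small pure-Python illustration using Gini impurity; the function names, the `forced` list, and the toy data are assumptions, and the authors' actual implementation is not described on the slide.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Return (score, feature, threshold) with the lowest weighted Gini, or None."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build(rows, labels, forced, features, depth=0, max_depth=4):
    """Grow a tree whose first len(forced) levels split on the forced variables."""
    constrained = depth < len(forced)
    candidates = [forced[depth]] if constrained else features
    split = best_split(rows, labels, candidates) if depth < max_depth else None
    # Free splits must reduce impurity; forced splits are kept regardless
    if split is None or (not constrained and split[0] >= gini(labels)):
        return Counter(labels).most_common(1)[0][0]
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build([r for r, _ in left], [y for _, y in left],
                      forced, features, depth + 1, max_depth),
        "right": build([r for r, _ in right], [y for _, y in right],
                       forced, features, depth + 1, max_depth),
    }


# Toy data in which income alone explains the label, yet age is forced to split first
rows = [{"age": 30, "income": 20}, {"age": 30, "income": 90},
        {"age": 70, "income": 20}, {"age": 70, "income": 90}]
labels = ["Home", "Non-mandatory", "Home", "Non-mandatory"]
tree = build(rows, labels, forced=["age"], features=["age", "income"])
print(tree)
```

In the toy data an unconstrained tree would split on income at the root; forcing age first shows how the constraint changes the tree's shape while the lower branches are still fitted to the data.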

Key Combinations of Variables: Retirees
[Figure: predicted DAP regions (Non-mandatory, Home) plotted by Age (axis marks 0, 79.5, 100) and Income (axis marks 0k, 75k, 150k)]

Key Combinations of Variables: Pre-School Children
[Figure: predicted DAP regions (Non-mandatory, Home, Mandatory) plotted by Age (axis marks 0, 4.5, 7) and Non-Worker at Home (Yes, No)]

Key Combinations of Variables: Pre-School Children
Segment: 4 years or older, no non-worker at home
- Full-time worker at home?
  - Yes: Home
  - No: Non-worker with non-mandatory activity?
    - Yes: Non-mandatory
    - No: Mandatory

Key Combinations of Variables: Full-Time and Part-Time Workers
Segment: part-time worker, 29 years or older
- Male?
  - Yes: Mandatory
  - No: Pre-school child at home?
    - Yes: Home
    - No: Mandatory

Key Combinations of Variables: Non-Workers and Retirees
Segment: 79 years or younger
- Income 75k or more?
  - Yes: Non-mandatory
  - No: More cars than workers?
    - Yes: Non-mandatory
    - No: Home
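A branch of a fitted tree like this one can be written out as plain rules. The sketch below encodes one reading of the non-workers and retirees tree; the assignment of outcomes to branches is reconstructed from the flattened slide text, and the function name, the argument names, and the income unit (thousands) are assumptions.

```python
def predict_dap_nonworker(age, income_k, cars, workers):
    """Most likely DAP type for a non-worker or retiree aged 79 or younger.

    income_k is household income in thousands (unit assumed from the "75k" label).
    """
    if age > 79:
        raise ValueError("these rules cover only persons aged 79 or younger")
    if income_k >= 75:
        return "Non-mandatory"
    # Lower-income households: car availability relative to workers decides
    if cars > workers:
        return "Non-mandatory"
    return "Home"


print(predict_dap_nonworker(age=68, income_k=40, cars=1, workers=1))
```

The interaction between income and car sufficiency is the kind of variable combination that would have to be specified by hand in a logit utility function.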

Conclusions
- ML methods represent a viable alternative to traditional logit models for complex multi-dimensional choices. They may improve individual model fit significantly.
- ML may systematically over-predict or under-predict certain choices; in this regard, making ML models easy to calibrate in an aggregate sense is an important direction.
- ML methods provide additional insights into travel behavior by revealing certain non-linear combinations of variables that are otherwise difficult to guess and test with traditional logit models.
- However, some concerns have to be addressed before ML can be put into practice.