Lending Club Loan Portfolio Optimization
Fred Robson (frobson), Chris Lucas (cflucas)


CS221 Artificial Intelligence, Stanford University, Autumn

Overview

Lending Club is an online peer-to-peer lending platform that matches borrowers with lenders. Each investor is trying to build the best portfolio of loans. Our project uses artificial intelligence techniques to try to build the optimal portfolio of Lending Club loans. In particular, our algorithms try to build the optimal portfolio from the loans offered by Lending Club in any given month.

Introduction

Lending Club is an online peer-to-peer lending platform, matching individual borrowers and lenders. An individual looking to borrow enters her information on the platform. An individual looking to lend money then browses the platform and uses that information to choose which loans to invest in. Each loan has a list of characteristics: interest rate, job title, annual income, etc. All investors are trying to choose the loans that will give them the best returns.

A 2014 New York Times article [1] described how many fund managers use their own credit algorithms to identify loans that may be underpriced or overpriced, and cherry-pick the ones they want. A loan is underpriced if its risk of default is lower than that of other loans offering the same interest rate. In the diagram below, the three red loans would be underpriced. For our project, we created our own algorithm that cherry-picks the best loans. These cherry-picked loans are combined into the optimal investment portfolio.

Literature Review & Similar Studies

We found similar work by Rob Gerritsen [2], which leverages data mining techniques to identify high-risk loans and lending behavior. Gerritsen founded the Exclusive Ore Inc. consulting group, which specializes in data mining for a wide variety of industries including retail, finance, and consumer packaged goods.
In this particular study, Gerritsen used Naïve Bayes and decision tree algorithms for classification, yielding an accuracy rate of approximately 85 percent. Although his objective was primarily classification, reviewing this study gave us an invaluable understanding of the tools and algorithms to rely on when analyzing loan data, most specifically Sharpe's ratio.

We also looked at similar studies discussing data mining within the banking sector and, more specifically, the potential pitfalls and accuracy tradeoffs of running various algorithms on lending and loan acquisition data. Ahmed et al. [3] built predictive models with the j48, BayesNet and Naive Bayes algorithms in order to classify loan applications between private banks and the agriculture industry. There was also some investigation into the efficacy of neural networks and linear regression for building rating models for loans.

Regarding Lending Club specifically, we found a data science article [8] in which the loan holder's description of what the loan was for turned out to be an important indicator for predicting default. This article was important to the development of our project, as it turned our attention to the text descriptions and gave us some intuition about how to learn from them when formulating our own predictions.

Data

Lending Club has published data on every loan it issued in the period 2011-2015 [4]. We used this data to train and test our algorithms. For each loan, the data includes characteristics of the borrower, as well as whether or not the borrower defaulted. We split our database of loans into four partitions: a training and a testing data set for each of the two terms (thirty-six and sixty months). The raw data contains sixty fields for each loan; however, not every field has an intuitive use for our learning models, so we removed those that did not. Finally, of the loans we had left, we removed any entries that were missing any of the remaining fields.

We did make some simplifications to the dataset. In particular, we assumed that there were no irregular payments: each month the individual either pays the monthly installment, or is in default and therefore pays nothing. In practice, irregular payments arise from late fees and payment plans. If a borrower is late paying the installment, the late fee increases the payment. If the individual is regularly struggling to make the payment, Lending Club will agree to a payment plan that reduces the monthly payments so that the borrower does not default. We decided against modelling irregular payments in order to avoid complexity that was not needed for a robust analysis of the Lending Club dataset, a decision further justified by the fact that only ~.2% of the loans in our data set relied on payment plans.

Baseline & Oracle

We implemented baseline and oracle algorithms in order to set a reliable success metric for our model and project.
For our baseline, we constructed the simplest portfolio investment strategy: invest a proportionally equal amount in every loan for a given period. In this case, our total profit was the weighted average return to date of all the loans issued in the given month. For our oracle, we assumed perfect future knowledge: the oracle picked the loans that we knew had achieved the maximum return.

[Figure: returns to date by investment month, Jan-11 to Sep-15, in two panels (36 Month Loans, 60 Month Loans); series: Baseline, Oracle.]

The Y axis represents returns to date: if you invested $[…] in the portfolio at the date on the X axis, how many dollars you would have now, in 2016, if you never reinvested any of the returns.

Approach (i): Markov Decision Process

Our first approach was to treat investing in a portfolio as a Markov Decision Process (MDP), in which you choose which loans to invest in and each loan has a chance of randomly defaulting. We defined our MDP as follows:

State: portfolio of loans, cash, date
Actions: invest in a loan offered on the Lending Club platform in a given month
Successors: the different portfolios made possible by defaults
Reward: the total payment received from all the loans in the portfolio that month

For a graphical example of the MDP, see the appendix.

In this MDP model, however, the transition probabilities are unknown, so we need to estimate them. To do so, we predicted each loan's chance of default in any given month and treated the loans as independent of one another.

Estimating Probability of Default

In order to estimate the transition probabilities, we need to estimate

$$P(D_x = 1 \mid D_1 = 0, \dots, D_{x-1} = 0), \qquad 0 < x < \text{Loan Maturity}$$

where $D_x = 1$ indicates loan default in month $x$.

We used machine learning to estimate each of these. We ran ten iterations with a step size of […] and the loss function below:

$$\text{Loss}_{\text{squared}}(x, y, w) = (\phi(x) \cdot w - y)^2$$

Our feature vector had 23 features: each loan holder's debt-to-income ratio, the number of years they have been employed, the loan's grade, and 20 binary variables, one for each of the top twenty most meaningful words.

Example Feature Vector:

Feature              Value
Total Debt/Income    .67
Loan Grade           A3
Employment Length    2
Business             […]
(18 other words)     …
Medic                […]

In order to pull the most meaningful words, we used TF-IDF and Porter stemming (see appendix).

The resulting weights gave some interesting results. Whether or not the description of the loan contained the word "bill" had the greatest positive weight.
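As an illustration of the regression described above, the following minimal sketch fits a weight vector by stochastic gradient descent on the squared loss $(\phi(x) \cdot w - y)^2$. The use of SGD for this step, the three toy features, the step size of 0.05, the training examples, and the clipping of predictions to [0, 1] are all assumptions made for illustration; they are not details taken from the paper.

```python
import random

# Toy training set (hypothetical values): each example is (phi(x), y), where
# phi(x) = [bias, debt-to-income ratio, description contains "bill"] and
# y = 1.0 if the loan defaulted in the month being modelled, else 0.0.
EXAMPLES = [
    ([1.0, 0.67, 1.0], 1.0),
    ([1.0, 0.10, 0.0], 0.0),
    ([1.0, 0.45, 1.0], 1.0),
    ([1.0, 0.20, 0.0], 0.0),
]


def dot(w, phi):
    """Inner product phi(x) . w."""
    return sum(wi * xi for wi, xi in zip(w, phi))


def mean_squared_loss(w, examples):
    """Average of (phi(x) . w - y)^2 over the examples."""
    return sum((dot(w, phi) - y) ** 2 for phi, y in examples) / len(examples)


def sgd_squared_loss(examples, iterations=10, step_size=0.05, seed=0):
    """Fit w with one gradient step per example, for `iterations` passes."""
    rng = random.Random(seed)
    data = list(examples)
    w = [0.0] * len(data[0][0])
    for _ in range(iterations):
        rng.shuffle(data)
        for phi, y in data:
            residual = dot(w, phi) - y
            # Gradient of (phi . w - y)^2 with respect to w is 2 * residual * phi.
            w = [wi - step_size * 2.0 * residual * xi for wi, xi in zip(w, phi)]
    return w


def predict_default_probability(w, phi):
    """Clip the linear score to [0, 1] so it can be read as a probability."""
    return min(1.0, max(0.0, dot(w, phi)))


weights = sgd_squared_loss(EXAMPLES)
print("loss at w = 0:   ", mean_squared_loss([0.0, 0.0, 0.0], EXAMPLES))
print("loss after SGD:  ", mean_squared_loss(weights, EXAMPLES))
```

A squared loss on a 0/1 label does not by itself keep predictions inside [0, 1], which is why the sketch clips them; a logistic loss would be the usual alternative if calibrated probabilities were the goal.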
The weight on the word "start" was highly positive for the first month; by the 36th month, however, it had become negative.

MDP Problems

The MDP approach had significant challenges that limited its effectiveness. Most significantly, the search space became too large to explore feasibly. Each month a lender can invest in any subset of the $n$ loans on offer, so there are $2^n$ possible actions. Additionally, if there are $m$ loans in the investor's portfolio, there are $2^m$ different successor states, depending on which loans default.

To address this challenge, we looked for ways to constrain the search space and scale our approach down to a more realistic level. What we ultimately settled on was that in any given month an investor could select a single loan, from a random sample of 5, in which to invest the entire investment amount. Although these modifications took us further from an accurate model of the real world, we were able to generate optimal results given the circumstances and get some baseline intuition for what an MDP could bring to the model.
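To give a sense of scale for the search-space problem described above, the short sketch below counts the subsets of loans an investor could pick in one month and the default patterns a held portfolio could follow. The loan counts used (5, 20, 50) are hypothetical, and the brute-force enumeration is included only as a sanity check for tiny inputs.

```python
from itertools import combinations


def count_actions(n_offered: int) -> int:
    """Number of distinct subsets of n offered loans (including the empty set)."""
    return 2 ** n_offered


def count_successors(m_held: int) -> int:
    """Number of default / no-default patterns over m independently held loans."""
    return 2 ** m_held


def enumerate_subsets(items):
    """Brute-force list of every subset; only feasible for very small inputs."""
    items = list(items)
    return [
        subset
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    ]


# The closed form agrees with explicit enumeration for a small case.
assert len(enumerate_subsets(range(5))) == count_actions(5)

for n in (5, 20, 50):
    print(f"{n} loans on offer -> {count_actions(n)} possible subsets")
```

With 50 loans on offer there are already 2^50 (about 1.1e15) subsets, whereas the constrained version of the problem, one loan chosen from a sample of 5, has only 5 actions per month.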

MDP results

Clearly, the results for this approach were not positive; the MDP approach consistently underperformed the baseline.

[Figure: returns to date by investment month, Jan-11 to Sep-15, in two panels (36 Month Loans, 60 Month Loans); series: Baseline, Oracle, MDP.]

However, given the extreme constraints we placed on the MDP, the quality of the results is not surprising. Because we limited our actions to investing in only one loan, the portfolio is extremely volatile: if that one loan defaults, the whole portfolio goes to zero. Additionally, we limited the MDP to choosing one of 5 random loans, and amongst those 5 there may simply not be many loans that deserve to be cherry-picked. Together these two constraints likely explain the poor results.

Approach (ii): Sharpe's Ratio

Due to the complexity and poor results of the MDP approach, we decided to use a simpler method: maximizing Sharpe's ratio.

$$\text{Sharpe Ratio} = \frac{E[\text{Portfolio Return}] - k}{\sigma_{\text{Portfolio}}}$$

The Sharpe ratio balances the expected return of the portfolio against its riskiness; $k$ is the constant subtracted from the expected return (in the standard Sharpe ratio, the risk-free rate). The optimal portfolio maximizes the Sharpe ratio.

In order to calculate the Sharpe ratio we need to estimate:

$$E[R_P] = \sum_i w_i \, E[R_i]$$

$$\sigma_P^2 = \sum_i \sum_j w_i \, w_j \, \mathrm{Cov}(r_i, r_j)$$

Therefore, for each loan we need to estimate its expected return, its variance, and its covariance with other loans.

Estimating Variance and Expected Return

Using the probabilities of default previously calculated for our Markov Decision Process, we ran twenty Monte Carlo simulations for each loan, in which each month the loan had a probability of defaulting given by the probabilities we learnt for the MDP. The expected return is the average across all the simulations; the variance is the variance across the simulations.

Estimating Covariance

In order to predict the covariance between loans, we used k-means clustering to cluster our loans according to zip code and home ownership. To find the covariance between two loans, we find the covariance between their respective clusters:

$$\mathrm{Cov}(x, y) = \mathrm{Cov}([a_1, a_2, \dots, a_n], [b_1, b_2, \dots, b_m]) \quad \text{where } a_i \in k_x, \; b_j \in k_y$$

Here $k_x$ and $k_y$ are the clusters containing loans $x$ and $y$.

Stochastic Gradient Descent

Finally, we calculate the weight of each investment so that we maximize the Sharpe ratio. To do so we used stochastic gradient descent. As W is as large as 2[…], we had to use relatively few iterations due to time constraints. In the end, we used […] iterations and a step size of […].

Maximizing Sharpe Ratio Results

[Figure: returns to date by investment month, Jan-11 to Sep-15, in two panels (36 Month Loans, 60 Month Loans); series: Baseline, Oracle, Max SR.]

Excitingly, our Sharpe ratio portfolio significantly outperformed the baseline for both the 36-month and the 60-month loans.

Further Improvements

We see a few potential routes for developing our project further. First, the prediction algorithm that feeds our second approach, although successful, could be improved by using more features in our vector and by running more than ten iterations, which would bring us closer to the truly optimal values for the default probabilities, expected returns and variances of each of the loans.

More fundamentally, we could change the machine learning approach we used to predict the probability of default. Much of the other literature [7] notes that only a small percentage of borrowers default and that the dataset is therefore very unbalanced.
To combat this skew, some have used random forests, or have limited the dataset so that the number of loans that defaulted equals the number of loans that did not.

As another point of improvement, we could flesh out the aforementioned payment plan capabilities. As Lending Club expands its database of loans, the relevance of payment plans will steadily increase and will require extra work and logic to account for. We feel this would make our model a more holistic representation of loan behavior in the real world.

Conclusions

In this report, we have built two methods to try to generate an optimal portfolio of Lending Club loans. Though our MDP approach was not successful, our Sharpe ratio approach consistently outperformed the baseline on our test dataset. Whether it would be effective beyond the test dataset and in the real world is more uncertain. Our dataset only includes loans from 2011-2016, when there were no financial crises. As a result, our algorithm suggests investing in low-grade loans with higher interest rates. These low-grade loans do well when economic conditions are good, but do badly when economic conditions decline. Therefore, if there were an economic crash similar to the Great Recession, the portfolio our algorithm recommended would likely do badly.

Appendix

Example MDP

[Figure: example MDP.]

TF-IDF Algorithm

Term Frequency-Inverse Document Frequency (TF-IDF) is an algorithm for extracting the most important words in a document while discounting words that appear across many documents. tf(t, d) is the number of times the term t occurs in description d, while idf(t, D) is the log of the total number of descriptions in D divided by the number of descriptions that contain the term t. Hence, frequently occurring words are weighted less. We pulled the top twenty most meaningful words and, for each loan, created binary toggle features indicating whether or not a given feature word was present in the loan description.

Porter-stemming Algorithm

We leveraged some existing code [5] for reducing the words in each loan description to their root word, or word stem. We used this method to minimize the number of words we needed to iterate over, by consolidating the many similarly rooted words in the descriptions.

References

[1] oans-that-avoid-banks-maybe-not.html?_r=
[2] Gerritsen, Rob. Assessing Loan Risks: A Data Mining Case Study. XCore Case Studies. Exclusive Ore Inc., Nov. 25. Web. 26 Oct. 2016. < pdf>.
[3] Hamid, Aboobyda J., Tarig M. Ahmed, and Nazlı İkizler. Developing Prediction Model and Analyzing Pitfalls of Loan Risk In Banks. Machine Learning and Applications: An International Journal, 26 Mar. 2016. Web. 25 Oct. 2016.
[4]
[5]
[6]
[7] Yhat. Machine Learning for Predicting Bad Loans. Dec 24.
[8] ing-club-loan-analysis-making-money-withlogistic-regression
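As a supplement to the TF-IDF description in the appendix, the sketch below scores terms with tf(t, d) times idf(t, D) and then builds binary word features for a loan description. Several choices here are assumptions rather than the authors' procedure: terms are ranked by their TF-IDF score summed over all descriptions, tokenization is a plain whitespace split, Porter stemming is omitted, and the three descriptions are invented examples.

```python
import math
from collections import Counter


def tf_idf_scores(descriptions):
    """Return a Counter mapping each term to its tf-idf summed over descriptions.

    tf(t, d)  = number of times t occurs in description d
    idf(t, D) = log(total descriptions / descriptions containing t)
    """
    tokenized = [d.lower().split() for d in descriptions]
    n_docs = len(tokenized)

    # Document frequency: in how many descriptions each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens in tokenized:
        for term, tf in Counter(tokens).items():
            scores[term] += tf * math.log(n_docs / doc_freq[term])
    return scores


def top_terms(descriptions, k):
    """The k highest-scoring terms, ties broken alphabetically."""
    scores = tf_idf_scores(descriptions)
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


def binary_features(description, vocabulary):
    """One 0/1 toggle per vocabulary word: is the word in the description?"""
    tokens = set(description.lower().split())
    return [1 if word in tokens else 0 for word in vocabulary]


# Hypothetical loan descriptions, for illustration only.
DESCRIPTIONS = [
    "loan to pay medical bill",
    "loan to start a business",
    "loan for credit card bill",
]

scores = tf_idf_scores(DESCRIPTIONS)
print(top_terms(DESCRIPTIONS, 5))
print(binary_features("loan to pay a bill", ["bill", "start"]))
```

In this toy corpus the word "loan" appears in every description, so its idf is log(1) = 0 and it contributes nothing, which is the down-weighting of common words that the appendix describes.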

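To make the Sharpe ratio objective from Approach (ii) concrete, the following sketch evaluates the ratio for a given weight vector from per-loan expected returns and a covariance matrix, using the standard portfolio mean and variance formulas. The two-loan inputs, the function name, and the default k = 0 are hypothetical; the sketch only evaluates the objective and does not reproduce the authors' optimization.

```python
import math


def portfolio_sharpe(weights, expected_returns, covariance, k=0.0):
    """Sharpe ratio (E[R_P] - k) / sigma_P for the given portfolio weights.

    E[R_P]    = sum_i w_i * E[R_i]
    sigma_P^2 = sum_i sum_j w_i * w_j * Cov(r_i, r_j)
    """
    n = len(weights)
    expected = sum(w * r for w, r in zip(weights, expected_returns))
    variance = sum(
        weights[i] * weights[j] * covariance[i][j]
        for i in range(n)
        for j in range(n)
    )
    return (expected - k) / math.sqrt(variance)


# Two hypothetical, uncorrelated loans with identical return and risk.
EXPECTED_RETURNS = [0.10, 0.10]
COVARIANCE = [
    [0.04, 0.00],
    [0.00, 0.04],
]

single_loan = portfolio_sharpe([1.0, 0.0], EXPECTED_RETURNS, COVARIANCE)
equal_split = portfolio_sharpe([0.5, 0.5], EXPECTED_RETURNS, COVARIANCE)
print("all in one loan:", single_loan)
print("equal split:    ", equal_split)
```

Splitting the investment across the two uncorrelated loans keeps the expected return the same but lowers the variance, so the ratio rises from 0.5 to roughly 0.71. This is the diversification effect that the single-loan MDP portfolio could not benefit from.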

More information

Gas storage: overview and static valuation

Gas storage: overview and static valuation In this first article of the new gas storage segment of the Masterclass series, John Breslin, Les Clewlow, Tobias Elbert, Calvin Kwok and Chris Strickland provide an illustration of how the four most common

More information

Reasoning with Uncertainty

Reasoning with Uncertainty Reasoning with Uncertainty Markov Decision Models Manfred Huber 2015 1 Markov Decision Process Models Markov models represent the behavior of a random process, including its internal state and the externally

More information

Predicting Market Fluctuations via Machine Learning

Predicting Market Fluctuations via Machine Learning Predicting Market Fluctuations via Machine Learning Michael Lim,Yong Su December 9, 2010 Abstract Much work has been done in stock market prediction. In this project we predict a 1% swing (either direction)

More information

The exam is closed book, closed calculator, and closed notes except your three crib sheets.

The exam is closed book, closed calculator, and closed notes except your three crib sheets. CS 188 Spring 2016 Introduction to Artificial Intelligence Final V2 You have approximately 2 hours and 50 minutes. The exam is closed book, closed calculator, and closed notes except your three crib sheets.

More information

Reinforcement Learning and Simulation-Based Search

Reinforcement Learning and Simulation-Based Search Reinforcement Learning and Simulation-Based Search David Silver Outline 1 Reinforcement Learning 2 3 Planning Under Uncertainty Reinforcement Learning Markov Decision Process Definition A Markov Decision

More information

Chapter 2 Uncertainty Analysis and Sampling Techniques

Chapter 2 Uncertainty Analysis and Sampling Techniques Chapter 2 Uncertainty Analysis and Sampling Techniques The probabilistic or stochastic modeling (Fig. 2.) iterative loop in the stochastic optimization procedure (Fig..4 in Chap. ) involves:. Specifying

More information

MS&E 448 Final Presentation High Frequency Algorithmic Trading

MS&E 448 Final Presentation High Frequency Algorithmic Trading MS&E 448 Final Presentation High Frequency Algorithmic Trading Francis Choi George Preudhomme Nopphon Siranart Roger Song Daniel Wright Stanford University June 6, 2017 High-Frequency Trading MS&E448 June

More information

Visualization on Financial Terms via Risk Ranking from Financial Reports

Visualization on Financial Terms via Risk Ranking from Financial Reports Visualization on Financial Terms via Risk Ranking from Financial Reports Ming-Feng Tsai 1,2 Chuan-Ju Wang 3 (1) Department of Computer Science, National Chengchi University, Taipei 116, Taiwan (2) Program

More information

BSc (Hons) Software Engineering BSc (Hons) Computer Science with Network Security

BSc (Hons) Software Engineering BSc (Hons) Computer Science with Network Security BSc (Hons) Software Engineering BSc (Hons) Computer Science with Network Security Cohorts BCNS/ 06 / Full Time & BSE/ 06 / Full Time Resit Examinations for 2008-2009 / Semester 1 Examinations for 2008-2009

More information

ECS171: Machine Learning

ECS171: Machine Learning ECS171: Machine Learning Lecture 15: Tree-based Algorithms Cho-Jui Hsieh UC Davis March 7, 2018 Outline Decision Tree Random Forest Gradient Boosted Decision Tree (GBDT) Decision Tree Each node checks

More information

arxiv: v1 [cs.ai] 7 Jan 2018

arxiv: v1 [cs.ai] 7 Jan 2018 Trading the Twitter Sentiment with Reinforcement Learning Catherine Xiao catherine.xiao1@gmail.com Wanfeng Chen wanfengc@gmail.com arxiv:1801.02243v1 [cs.ai] 7 Jan 2018 Abstract This paper is to explore

More information

How Can YOU Use it? Artificial Intelligence for Actuaries. SOA Annual Meeting, Gaurav Gupta. Session 058PD

How Can YOU Use it? Artificial Intelligence for Actuaries. SOA Annual Meeting, Gaurav Gupta. Session 058PD Artificial Intelligence for Actuaries How Can YOU Use it? SOA Annual Meeting, 2018 Session 058PD Gaurav Gupta Founder & CEO ggupta@quaerainsights.com Audience Poll What is my level of AI understanding?

More information

ASC Topic 718 Accounting Valuation Report. Company ABC, Inc.

ASC Topic 718 Accounting Valuation Report. Company ABC, Inc. ASC Topic 718 Accounting Valuation Report Company ABC, Inc. Monte-Carlo Simulation Valuation of Several Proposed Relative Total Shareholder Return TSR Component Rank Grants And Index Outperform Grants

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 5 Issue 2, Mar Apr 2017

International Journal of Computer Science Trends and Technology (IJCST) Volume 5 Issue 2, Mar Apr 2017 RESEARCH ARTICLE Stock Selection using Principal Component Analysis with Differential Evolution Dr. Balamurugan.A [1], Arul Selvi. S [2], Syedhussian.A [3], Nithin.A [4] [3] & [4] Professor [1], Assistant

More information

Approximating the Confidence Intervals for Sharpe Style Weights

Approximating the Confidence Intervals for Sharpe Style Weights Approximating the Confidence Intervals for Sharpe Style Weights Angelo Lobosco and Dan DiBartolomeo Style analysis is a form of constrained regression that uses a weighted combination of market indexes

More information

Importance Sampling for Fair Policy Selection

Importance Sampling for Fair Policy Selection Importance Sampling for Fair Policy Selection Shayan Doroudi Carnegie Mellon University Pittsburgh, PA 15213 shayand@cs.cmu.edu Philip S. Thomas Carnegie Mellon University Pittsburgh, PA 15213 philipt@cs.cmu.edu

More information

Health Insurance Market

Health Insurance Market Health Insurance Market Jeremiah Reyes, Jerry Duran, Chanel Manzanillo Abstract Based on a person s Health Insurance Plan attributes, namely if it was a dental only plan, is notice required for pregnancy,

More information

Portfolio Management Package Insights A quarterly briefing with best practices and thought leadership concepts from your Portfolio Management Package

Portfolio Management Package Insights A quarterly briefing with best practices and thought leadership concepts from your Portfolio Management Package Portfolio Management Package Insights A quarterly briefing with best practices and thought leadership concepts from your Portfolio Management Package (PMP) team Contents 1. New Special Handling Code (First

More information

CS 343: Artificial Intelligence

CS 343: Artificial Intelligence CS 343: Artificial Intelligence Markov Decision Processes II Prof. Scott Niekum The University of Texas at Austin [These slides based on those of Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC

More information

Lecture 2: Fundamentals of meanvariance

Lecture 2: Fundamentals of meanvariance Lecture 2: Fundamentals of meanvariance analysis Prof. Massimo Guidolin Portfolio Management Second Term 2018 Outline and objectives Mean-variance and efficient frontiers: logical meaning o Guidolin-Pedio,

More information

Introduction to Fall 2007 Artificial Intelligence Final Exam

Introduction to Fall 2007 Artificial Intelligence Final Exam NAME: SID#: Login: Sec: 1 CS 188 Introduction to Fall 2007 Artificial Intelligence Final Exam You have 180 minutes. The exam is closed book, closed notes except a two-page crib sheet, basic calculators

More information

Artificially Intelligent Forecasting of Stock Market Indexes

Artificially Intelligent Forecasting of Stock Market Indexes Artificially Intelligent Forecasting of Stock Market Indexes Loyola Marymount University Math 560 Final Paper 05-01 - 2018 Daniel McGrath Advisor: Dr. Benjamin Fitzpatrick Contents I. Introduction II.

More information

DFAST Modeling and Solution

DFAST Modeling and Solution Regulatory Environment Summary Fallout from the 2008-2009 financial crisis included the emergence of a new regulatory landscape intended to safeguard the U.S. banking system from a systemic collapse. In

More information

International Journal of Computer Engineering and Applications, Volume XII, Issue II, Feb. 18, ISSN

International Journal of Computer Engineering and Applications, Volume XII, Issue II, Feb. 18,   ISSN Volume XII, Issue II, Feb. 18, www.ijcea.com ISSN 31-3469 AN INVESTIGATION OF FINANCIAL TIME SERIES PREDICTION USING BACK PROPAGATION NEURAL NETWORKS K. Jayanthi, Dr. K. Suresh 1 Department of Computer

More information

Predicting Economic Recession using Data Mining Techniques

Predicting Economic Recession using Data Mining Techniques Predicting Economic Recession using Data Mining Techniques Authors Naveed Ahmed Kartheek Atluri Tapan Patwardhan Meghana Viswanath Predicting Economic Recession using Data Mining Techniques Page 1 Abstract

More information

Session 5. A brief introduction to Predictive Modeling

Session 5. A brief introduction to Predictive Modeling SOA Predictive Analytics Seminar Malaysia 27 Aug. 2018 Kuala Lumpur, Malaysia Session 5 A brief introduction to Predictive Modeling Lichen Bao, Ph.D A Brief Introduction to Predictive Modeling LICHEN BAO

More information

Accelerated Option Pricing Multiple Scenarios

Accelerated Option Pricing Multiple Scenarios Accelerated Option Pricing in Multiple Scenarios 04.07.2008 Stefan Dirnstorfer (stefan@thetaris.com) Andreas J. Grau (grau@thetaris.com) 1 Abstract This paper covers a massive acceleration of Monte-Carlo

More information

Final Examination CS540: Introduction to Artificial Intelligence

Final Examination CS540: Introduction to Artificial Intelligence Final Examination CS540: Introduction to Artificial Intelligence December 2008 LAST NAME: FIRST NAME: Problem Score Max Score 1 15 2 15 3 10 4 20 5 10 6 20 7 10 Total 100 Question 1. [15] Probabilistic

More information

Draft. emerging market returns, it would seem difficult to uncover any predictability.

Draft. emerging market returns, it would seem difficult to uncover any predictability. Forecasting Emerging Market Returns Using works CAMPBELL R. HARVEY, KIRSTEN E. TRAVERS, AND MICHAEL J. COSTA CAMPBELL R. HARVEY is the J. Paul Sticht professor of international business at Duke University,

More information

Decision Trees An Early Classifier

Decision Trees An Early Classifier An Early Classifier Jason Corso SUNY at Buffalo January 19, 2012 J. Corso (SUNY at Buffalo) Trees January 19, 2012 1 / 33 Introduction to Non-Metric Methods Introduction to Non-Metric Methods We cover

More information

An enhanced artificial neural network for stock price predications

An enhanced artificial neural network for stock price predications An enhanced artificial neural network for stock price predications Jiaxin MA Silin HUANG School of Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR S. H. KWOK HKUST Business

More information

The Optimization Process: An example of portfolio optimization

The Optimization Process: An example of portfolio optimization ISyE 6669: Deterministic Optimization The Optimization Process: An example of portfolio optimization Shabbir Ahmed Fall 2002 1 Introduction Optimization can be roughly defined as a quantitative approach

More information

16 MAKING SIMPLE DECISIONS

16 MAKING SIMPLE DECISIONS 247 16 MAKING SIMPLE DECISIONS Let us associate each state S with a numeric utility U(S), which expresses the desirability of the state A nondeterministic action A will have possible outcome states Result

More information

Portfolio theory and risk management Homework set 2

Portfolio theory and risk management Homework set 2 Portfolio theory and risk management Homework set Filip Lindskog General information The homework set gives at most 3 points which are added to your result on the exam. You may work individually or in

More information

Random Variables and Probability Distributions

Random Variables and Probability Distributions Chapter 3 Random Variables and Probability Distributions Chapter Three Random Variables and Probability Distributions 3. Introduction An event is defined as the possible outcome of an experiment. In engineering

More information

Better decision making under uncertain conditions using Monte Carlo Simulation

Better decision making under uncertain conditions using Monte Carlo Simulation IBM Software Business Analytics IBM SPSS Statistics Better decision making under uncertain conditions using Monte Carlo Simulation Monte Carlo simulation and risk analysis techniques in IBM SPSS Statistics

More information

Executing Effective Validations

Executing Effective Validations Executing Effective Validations By Sarah Davies Senior Vice President, Analytics, Research and Product Management, VantageScore Solutions, LLC Oneof the key components to successfully utilizing risk management

More information

Asset Allocation and Risk Assessment with Gross Exposure Constraints

Asset Allocation and Risk Assessment with Gross Exposure Constraints Asset Allocation and Risk Assessment with Gross Exposure Constraints Forrest Zhang Bendheim Center for Finance Princeton University A joint work with Jianqing Fan and Ke Yu, Princeton Princeton University

More information

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern Monte-Carlo Planning: Introduction and Bandit Basics Alan Fern 1 Large Worlds We have considered basic model-based planning algorithms Model-based planning: assumes MDP model is available Methods we learned

More information

Predicting Online Peer-to-Peer(P2P) Lending Default using Data Mining Techniques

Predicting Online Peer-to-Peer(P2P) Lending Default using Data Mining Techniques Predicting Online Peer-to-Peer(P2P) Lending Default using Data Mining Techniques Jae Kwon Bae, Dept. of Management Information Systems, Keimyung University, Republic of Korea. E-mail: jkbae99@kmu.ac.kr

More information

TDT4171 Artificial Intelligence Methods

TDT4171 Artificial Intelligence Methods TDT47 Artificial Intelligence Methods Lecture 7 Making Complex Decisions Norwegian University of Science and Technology Helge Langseth IT-VEST 0 helgel@idi.ntnu.no TDT47 Artificial Intelligence Methods

More information

Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration

Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration Piyush Rai CS5350/6350: Machine Learning November 29, 2011 Reinforcement Learning Supervised Learning: Uses explicit supervision

More information

Markov Decision Processes

Markov Decision Processes Markov Decision Processes Robert Platt Northeastern University Some images and slides are used from: 1. CS188 UC Berkeley 2. AIMA 3. Chris Amato Stochastic domains So far, we have studied search Can use

More information

And The Winner Is? How to Pick a Better Model

And The Winner Is? How to Pick a Better Model And The Winner Is? How to Pick a Better Model Part 2 Goodness-of-Fit and Internal Stability Dan Tevet, FCAS, MAAA Goodness-of-Fit Trying to answer question: How well does our model fit the data? Can be

More information

Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration

Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration Reinforcement Learning (1): Discrete MDP, Value Iteration, Policy Iteration Piyush Rai CS5350/6350: Machine Learning November 29, 2011 Reinforcement Learning Supervised Learning: Uses explicit supervision

More information

CEC login. Student Details Name SOLUTIONS

CEC login. Student Details Name SOLUTIONS Student Details Name SOLUTIONS CEC login Instructions You have roughly 1 minute per point, so schedule your time accordingly. There is only one correct answer per question. Good luck! Question 1. Searching

More information

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern

Monte-Carlo Planning: Introduction and Bandit Basics. Alan Fern Monte-Carlo Planning: Introduction and Bandit Basics Alan Fern 1 Large Worlds We have considered basic model-based planning algorithms Model-based planning: assumes MDP model is available Methods we learned

More information

Modelling the Sharpe ratio for investment strategies

Modelling the Sharpe ratio for investment strategies Modelling the Sharpe ratio for investment strategies Group 6 Sako Arts 0776148 Rik Coenders 0777004 Stefan Luijten 0783116 Ivo van Heck 0775551 Rik Hagelaars 0789883 Stephan van Driel 0858182 Ellen Cardinaels

More information