A Joint Credit Scoring Model for Peer-to-Peer Lending and Credit Bureau


Credit Research Centre and University of Edinburgh
raffaella.calabrese@ed.ac.uk
Joint work with Silvia Osmetti and Luca Zanin
Credit Scoring and Credit Control conference, 31 August 2017

Outline
1. The copula function; The bivariate model
2. Data; Empirical results
3.

P2P lending

Peer-to-peer (P2P) lending allows direct lending between lenders and borrowers through a platform. In 2014, P2P lending generated approximately $5.5 billion in loans in the US. To improve the predictive accuracy of scoring models for P2P lending, we suggest using credit bureau information on whether the borrower has defaulted on any loan. We propose a bivariate regression model for unbalanced binary data (the BivGEV model).

Let $Y$ be a binary response defined as

$$Y = \begin{cases} 1 & \text{if the borrower defaults} \\ 0 & \text{otherwise.} \end{cases}$$

Let $x = (x_1, x_2, \dots, x_p)$ be a vector of $p$ covariates. We model the probability of default as $P(Y = 1) = \pi(x'\beta)$, where $\beta = (\beta_0, \beta_1, \dots, \beta_p)$ are the regression parameters. Symmetric link functions $\pi(\cdot)$, such as the logit and the probit, are inaccurate when the binary classification is strongly unbalanced (Calabrese et al., 2015; King and Zeng, 2001; Wang and Dey, 2010).

GEV link function

[Figure: "Link functions". Vertical axis: Pr(Y=1), from 0 to 1; horizontal axis from -4 to 4. Curves: Logistic, GEV(tau = 0.25), GEV(tau = 0.25), GEV(tau = 1), as listed in the extracted legend.]

We suggest modelling the probability of default $\pi(x'\beta)$ using the GEV distribution:

$$\pi(x; \beta, \tau) = \begin{cases} \exp\left\{ -\left[ 1 + \tau \left( \beta_0 + \sum_{j=1}^{p} \beta_j x_j \right) \right]_+^{-1/\tau} \right\} & \tau \neq 0 \\ \exp\left\{ -\exp\left[ -\left( \beta_0 + \sum_{j=1}^{p} \beta_j x_j \right) \right] \right\} & \tau = 0 \end{cases}$$

where $\tau$ is the shape parameter and $[x]_+ = \max(x, 0)$. The GEV distribution is very flexible, with the shape parameter $\tau$ controlling the tail behaviour. The R package BGEVA is available on CRAN.
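As a rough illustration of the GEV response function described here, the following Python sketch evaluates the probability of default for a given linear predictor. It is not the BGEVA implementation; the function name `gev_link` and the argument `eta` (standing for $\beta_0 + \sum_j \beta_j x_j$) are assumptions made for this example, and no attempt is made to guard against numerical overflow for extreme inputs.

```python
import math


def gev_link(eta: float, tau: float) -> float:
    """Probability of default under a GEV response function (illustrative sketch).

    eta: linear predictor beta_0 + sum_j beta_j * x_j (hypothetical name).
    tau: GEV shape parameter.
    """
    if tau == 0.0:
        # Limiting (Gumbel) case of the GEV distribution function.
        return math.exp(-math.exp(-eta))
    base = 1.0 + tau * eta
    if base <= 0.0:
        # Positive part is zero: the probability sits at the boundary of the support.
        # For tau > 0 the exponent diverges, giving 0; for tau < 0 it vanishes, giving 1.
        return 0.0 if tau > 0.0 else 1.0
    return math.exp(-(base ** (-1.0 / tau)))


if __name__ == "__main__":
    # Compare the response at a few linear-predictor values for several shapes.
    for tau in (-0.25, 0.0, 0.25, 1.0):
        probs = [round(gev_link(eta, tau), 4) for eta in (-2.0, 0.0, 2.0)]
        print(f"tau={tau:+.2f}: {probs}")
```

Unlike the logit or probit, this response function is not symmetric around 0.5: at `eta = 0` it equals `exp(-1)` for every value of `tau`, and the shape parameter changes how quickly each tail is approached.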

The copula function

A function $C : I^2 \to I$, with $I^2 = [0,1] \times [0,1]$ and $I = [0,1]$, is a bivariate copula if it is the joint cumulative distribution function of a random vector $(U, V)$ with uniform marginals on $[0,1]$:

$$C_\lambda(u, v) = P(U \le u, V \le v), \qquad 0 \le u \le 1, \quad 0 \le v \le 1,$$

where the copula parameter $\lambda \in \Lambda$ describes the association between the marginals. Copula functions capture the dependence structure between the marginals and allow the specification of multivariate distributions with arbitrary dependence structures.

Some characteristics of the main copula functions

Copula     Dependence                  Tail dependence
Gaussian   radially symmetric          no asymptotic tail dependence
Clayton    asymmetric (exchangeable)   strong left (lower) tail dependence
Gumbel     asymmetric (exchangeable)   strong right (upper) tail dependence
Frank      radially symmetric          no asymptotic tail dependence
Joe        asymmetric (exchangeable)   strong right (upper) tail dependence
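The tail behaviour of the asymmetric families can be made concrete with tail dependence coefficients. The sketch below uses the standard closed forms for the Clayton (lower tail), Gumbel and Joe (upper tail) copulas; the function names and the example parameter values are illustrative assumptions, not part of the talk.

```python
def clayton_lower_tail(theta: float) -> float:
    """Lower tail dependence coefficient of a Clayton copula, theta > 0."""
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    return 2.0 ** (-1.0 / theta)


def gumbel_upper_tail(theta: float) -> float:
    """Upper tail dependence coefficient of a Gumbel copula, theta >= 1."""
    if theta < 1.0:
        raise ValueError("theta must be at least 1")
    return 2.0 - 2.0 ** (1.0 / theta)


def joe_upper_tail(theta: float) -> float:
    """Upper tail dependence coefficient of a Joe copula, theta >= 1."""
    if theta < 1.0:
        raise ValueError("theta must be at least 1")
    return 2.0 - 2.0 ** (1.0 / theta)


if __name__ == "__main__":
    # Illustrative parameter values only.
    print("Clayton, theta=2:", clayton_lower_tail(2.0))
    print("Gumbel,  theta=2:", gumbel_upper_tail(2.0))
    print("Joe,     theta=2:", joe_upper_tail(2.0))
```

For the Gaussian copula with correlation strictly below one in absolute value, and for the Frank copula, both coefficients are zero, which is what "no asymptotic tail dependence" means in the table.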

$Y = (Y_1, Y_2)$ is a bivariate binary response variable, with each component taking values in $\{0, 1\}$. The marginal probabilities are

$$\pi_1(x; \beta_1, \tau_1) = P(Y_1 = 1 \mid x; \beta_1, \tau_1), \qquad \pi_2(x; \beta_2, \tau_2) = P(Y_2 = 1 \mid x; \beta_2, \tau_2).$$

The marginal probabilities are modelled using the GEV distribution. The BivGEV model is defined using the copula function:

$$\pi_{11}(x; \delta, \tau) = C_\lambda\big(\pi_1(x; \beta_1, \tau_1), \pi_2(x; \beta_2, \tau_2)\big) = C_\lambda\Big( \exp\big\{ -[1 + \tau_1 \eta_1]_+^{-1/\tau_1} \big\}, \; \exp\big\{ -[1 + \tau_2 \eta_2]_+^{-1/\tau_2} \big\} \Big),$$

where $\eta_k$ denotes the linear predictor of margin $k$ and $\tau_k \neq 0$. The BivGEV model is estimated by maximum likelihood.
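To show how a copula turns two marginal default probabilities into a joint distribution for $(Y_1, Y_2)$, here is a minimal Python sketch. It assumes a Clayton copula purely for illustration; the function names and the numeric inputs are hypothetical, and this is not the estimation code used by the authors.

```python
def clayton_copula(u: float, v: float, theta: float) -> float:
    """Clayton copula C(u, v) for theta > 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)


def joint_cells(pi1: float, pi2: float, theta: float) -> dict:
    """Probabilities of the four outcomes of (Y1, Y2) implied by the copula."""
    p11 = clayton_copula(pi1, pi2, theta)  # P(Y1 = 1, Y2 = 1)
    return {
        (1, 1): p11,
        (1, 0): pi1 - p11,               # P(Y1 = 1, Y2 = 0)
        (0, 1): pi2 - p11,               # P(Y1 = 0, Y2 = 1)
        (0, 0): 1.0 - pi1 - pi2 + p11,   # P(Y1 = 0, Y2 = 0)
    }


def conditional_p2p_default(pi1: float, pi2: float, theta: float) -> tuple:
    """Return P(Y2 = 1 | Y1 = 1) and P(Y2 = 1 | Y1 = 0)."""
    cells = joint_cells(pi1, pi2, theta)
    return cells[(1, 1)] / pi1, cells[(0, 1)] / (1.0 - pi1)


if __name__ == "__main__":
    # Illustrative inputs: marginal probabilities and a copula parameter.
    pi1, pi2, theta = 0.05, 0.24, 0.5
    print(joint_cells(pi1, pi2, theta))
    print(conditional_p2p_default(pi1, pi2, theta))
```

In a maximum likelihood fit, each observation would contribute the logarithm of the cell probability matching its observed pair $(y_1, y_2)$. The two conditional probabilities are what lets credit bureau information sharpen the P2P default prediction.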

Data

We analyse 12,579 loans with a 60-month term provided by Lending Club from 2010 to the first quarter of 2012.

$$Y_1 = \begin{cases} 1 & \text{if the borrower is reported in default by the credit bureau} \\ 0 & \text{otherwise} \end{cases}$$

$$Y_2 = \begin{cases} 1 & \text{if the borrower defaults on the P2P loan} \\ 0 & \text{otherwise} \end{cases}$$

The default rate is 24% for the P2P loans and 5% for the credit bureau.

The determinants of the scoring models for credit bureau default and P2P lending default are:

- Loan purpose.
- Housing situation: mortgage; rent; own or other situation.
- Interest rate.
- Annual income.
- Revolving utilization.
- Inquiries in the last 6 months.
- DTI: monthly debt payments to monthly income.
- Delinquency in the last 2 years.
- Open accounts.
- Credit history length.
- Loan amount to annual income.
- Spatial variables defined using the first digit of the ZIP Code.

Empirical results

Copula     Copula parameter λ   Kendall's tau
Gaussian   0.147                0.094
Clayton    0.104                0.049
Gumbel     1.150                0.132
Frank      1.050                0.115
Joe        1.480                0.210

Copula     AIC        BIC
Gaussian   12325.60   12538.08
Clayton    12325.58   12538.06
Gumbel     12325.51   12537.99
Frank      12325.91   12538.39
Joe        12326.36   12538.84
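The link between a copula parameter and Kendall's tau can be checked with the standard closed-form relations for the Gaussian, Clayton and Gumbel families. The sketch below applies them to the parameter values reported in the table; the function names are illustrative assumptions. The Frank and Joe families have no comparably simple closed form and are left out.

```python
import math


def kendall_tau_gaussian(rho: float) -> float:
    """Kendall's tau of a Gaussian copula with correlation rho."""
    return 2.0 / math.pi * math.asin(rho)


def kendall_tau_clayton(theta: float) -> float:
    """Kendall's tau of a Clayton copula, theta > 0."""
    return theta / (theta + 2.0)


def kendall_tau_gumbel(theta: float) -> float:
    """Kendall's tau of a Gumbel copula, theta >= 1."""
    return 1.0 - 1.0 / theta


if __name__ == "__main__":
    # Copula parameters as reported in the table above.
    reported = {
        "Gaussian": (kendall_tau_gaussian, 0.147),
        "Clayton": (kendall_tau_clayton, 0.104),
        "Gumbel": (kendall_tau_gumbel, 1.150),
    }
    for name, (tau_fn, parameter) in reported.items():
        print(f"{name}: parameter={parameter}, tau={tau_fn(parameter):.3f}")
```

The computed values may differ slightly from the table because the reported parameters are rounded.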

Estimated coefficients. Columns, in order: Default Credit Bureau, Default P2P lending. Rows showing a single value have an estimate in only one of the two columns.

Variable                        Estimates
Car financing                   0.157
House                           0.094
Major purchase                  0.576
Small business                  0.379
Rent                            0.320
Interest rate                   0.123    0.055
ln(annual income)               0.727    0.313
ln(revolving utilization)       0.145    0.048
Inquiries last 6 months         0.068
Delinquency last 2 years        0.137
Open accounts                   0.021
DTI                             0.021
Credit history length           0.026    0.007
Loan amount to annual income    2.37     0.301
Intercept                       4.298    1.848
τ                               -0.8     -0.1

Out of sample

Model       MSE+     MAE+     AUC      H
Probit      0.5555   0.7392   0.6190   0.0558

$Y_2 = 1 \mid Y_1 = 1$

Model       MSE+     MAE+     AUC      H
BivGEV      0.3792   0.6109   0.5969   0.1529
BivProbit   0.3805   0.6117   0.5930   0.1520

$Y_2 = 1 \mid Y_1 = 0$

Model       MSE+     MAE+     AUC      H
BivGEV      0.5654   0.7465   0.6200   0.0783
BivProbit   0.5656   0.7463   0.6198   0.0788

Out of time

Model       MSE+     MAE+     AUC      H
Probit      0.5570   0.7407   0.6671   0.0790

$Y_2 = 1 \mid Y_1 = 1$

Model       MSE+     MAE+     AUC      H
BivGEV      0.3616   0.5975   0.7897   0.3907
BivProbit   0.3629   0.5984   0.7910   0.3910

$Y_2 = 1 \mid Y_1 = 0$

Model       MSE+     MAE+     AUC      H
BivGEV      0.5657   0.7472   0.6642   0.1238
BivProbit   0.5661   0.7471   0.6637   0.1197
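The slides do not define MSE+ and MAE+. A convention used in the rare-event scoring literature is to compute the squared and absolute prediction errors over observed defaulters only, so that the measure is not dominated by the majority class. Assuming that convention applies here, a minimal sketch looks as follows; the function name `mse_mae_plus` and the toy data are hypothetical.

```python
def mse_mae_plus(y_true, p_hat):
    """MSE and MAE restricted to observed defaulters (y = 1).

    Assumed definition: errors are 1 - p_hat for observations with y = 1.
    Lower values mean defaulters receive higher predicted probabilities.
    """
    errors = [1.0 - p for y, p in zip(y_true, p_hat) if y == 1]
    if not errors:
        raise ValueError("no defaulters in the sample")
    mse_plus = sum(e * e for e in errors) / len(errors)
    mae_plus = sum(abs(e) for e in errors) / len(errors)
    return mse_plus, mae_plus


if __name__ == "__main__":
    # Toy example: three borrowers, two of whom defaulted.
    y = [1, 0, 1]
    p = [0.5, 0.9, 0.0]
    print(mse_mae_plus(y, p))
```

Under this reading, the lower MSE+ and MAE+ of the bivariate models for $Y_2 = 1 \mid Y_1 = 1$ indicate that borrowers reported in default by the credit bureau receive higher predicted P2P default probabilities when they do default.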

We introduced a bivariate regression model that is accurate in classifying defaults. We implemented the model in an R package that will be publicly available. We find that using the information from the credit bureau improves the predictive accuracy of a scoring model for P2P lending.