Recovery Risk: Application of the Latent Competing Risks Model to Non-performing Loans


Mauro R. Oliveira, Francisco Louzada

Abstract

This article proposes a method for measuring the latent risks involved in the recovery of non-performing loans at financial institutions and other firms that run collection and recovery processes. To that end, we apply the competing risks model known in the literature as the promotion time model. The result is the probability of credit recovery for a portfolio segmented into groups according to the information available. Within the competing risks framework, the technique yields an estimate of the number of latent events that contribute to the credit recovery event. With these results in hand, we compare groups of defaulters in terms of their risk of, or susceptibility to, recovery during the collection process, and thereby determine where collection actions are most efficient. We specify the Poisson distribution for the number of latent causes leading to recovery and the Weibull distribution for the time to recovery. The model's parameters are estimated by maximum likelihood. Finally, the model is applied to a sample of defaulted loans from a financial institution.

Keywords: Competing Risks, Credit Recovery, Default.

1. Introduction

Statistical methods are used at almost every stage of a successful business. In the financial industry, market surveys are first applied to new product launches, followed by Scorecard models to grant credit to new customers, Behavior Scoring models to increase loyalty and revenue per customer, and finally Collection Scoring models, which are statistical techniques intended to optimize the collection and recovery of credits in default.

Statistical models therefore enable process automation, which is crucial for the industry's players to sustain high portfolio growth. In this paper, we apply a statistical method that, as far as the authors are aware, has not yet been used to analyze the collection process at a financial institution. The results make it possible to confirm how and when collection actions are most efficient for the bank and therefore provide inputs for proposing improvements to the recovery process.

For this effort, we have a dataset of approximately 22 thousand loans from a financial institution that entered default between 2009 and 2011. The rule characterizing default was the same for all customers: an installment payment 90 days past due. The same collection process was applied to every contract, that is, all contracts were subjected to the same actions by the collections department. The financial institution that provided the data kept the collection methods confidential. The collection process considered a 24-month workout period for defaulted loans and, based on a collection rule, certain steps were taken in an attempt to recover non-performing credits.

To apply the proposed methodology, we consider only fully recovered contracts and totally lost contracts. That is, a contract that has been only partly recovered at the end of the 24-month period does not enter the database. Our database therefore contains the time to full recovery for recovered contracts and, for fully lost contracts, no recovery date. In survival analysis terminology, the times for the latter are regarded as censored.

Table 1 summarizes the contracts that make up the database available for modeling. They comprise 22,109 defaulted contracts, of which approximately 64% had not been recovered by the end of the 24-month recovery period. The bank made available only two items of customer information, or model covariates. One concerns the customer's risk profile and is referred to as Behavior Score range (FX-BS), taking the values 1, 2, 3 and 4; the other concerns the contracted product and is called contract amount range (FX-CV), also taking the values 1, 2, 3 and 4. To illustrate the results more clearly and make it easier to compare susceptibility to recovery across groups, we consider only some ranges of the available covariates. The data in Tables 1, 2 and 3 (developed by the authors) indicate that customer profiles with Behavior Score range 2 and contract amount range 2 show the highest recovery rates. These results are supported by those obtained in Section 3 with the competing risks model.

Table 1

| Group | Contracts | Recovered | Unrecovered | % Non-recovery | Average recovery time (months) |
|---|---|---|---|---|---|
| Population | 22,109 | 8,047 | 14,062 | 63.60% | 9.85 |
| Range 1 Contracted amount | 5,532 | 2,036 | 3,496 | 63.19% | 10.35 |
| Range 2 Contracted amount | 5,478 | 2,552 | 2,926 | 53.41% | 18.94 |
| Range 1 Behavior Score | 7,245 | 1,719 | 5,526 | 76.27% | 11.69 |
| Range 2 Behavior Score | 5,503 | 3,280 | 2,223 | 40.39% | 21.63 |

Table 2

| Sub-group | Contracts | Recovered | Unrecovered | % Non-recovery | Average recovery time (months) |
|---|---|---|---|---|---|
| Subpopulation: Range 1 Contracted amount | 2,895 | 1,203 | 1,692 | 58.44% | 10.84 |
| Range 1 Behavior Score | 1,338 | 347 | 991 | 74.06% | 11.99 |
| Range 2 Behavior Score | 1,557 | 856 | 701 | 45.02% | 20.16 |

Table 3

| Sub-group | Contracts | Recovered | Unrecovered | % Non-recovery | Average recovery time (months) |
|---|---|---|---|---|---|
| Subpopulation: Range 2 Contracted amount | 3,270 | 1,694 | 1,576 | 48.19% | 9.19 |
| Range 1 Behavior Score | 1,827 | 618 | 1,209 | 66.17% | 10.66 |
| Range 2 Behavior Score | 1,443 | 1,076 | 367 | 25.43% | 19.38 |
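The "% Non-recovery" column is simply the share of contracts still unrecovered (censored) at month 24. As a small arithmetic check, the sketch below recomputes that share from the Table 1 counts. The dictionary and function names are hypothetical and introduced only for illustration; the counts are copied from Table 1.

```python
# Counts copied from Table 1: (recovered, unrecovered) per group.
table1 = {
    "Population": (8047, 14062),
    "Contracted amount, range 1": (2036, 3496),
    "Contracted amount, range 2": (2552, 2926),
    "Behavior Score, range 1": (1719, 5526),
    "Behavior Score, range 2": (3280, 2223),
}


def non_recovery_rate(recovered: int, unrecovered: int) -> float:
    """Share of contracts that were not recovered within the 24-month window."""
    return unrecovered / (recovered + unrecovered)


rates = {group: non_recovery_rate(*counts) for group, counts in table1.items()}
for group, rate in rates.items():
    print(f"{group}: {rate:.2%}")
```

The recomputed shares should agree with the tabulated percentages up to rounding in the last digit.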

Given this data structure, that is, information on whether an event occurred and on the time to its occurrence, one may apply the statistical methodology known as survival analysis. The theory and its application to real data are widely discussed in the literature, particularly in medicine, where, for example, studies have examined the survival time of patients subjected to different treatments and drugs. We recommend Maller & Zhou (1996) and Ibrahim et al. (2001) to interested readers. Competing risks modeling, the subject of this article, is well known in the literature and has been extensively discussed in papers such as Cooner et al. (2006), Cooner et al. (2007) and Xu et al. (2011). Beyond these, this type of modeling has gained importance mainly through the work of Chen et al. (1999), Tsodikov et al. (2003), and Tournoud & Ecochard (2007).

To the best of the authors' knowledge, competing risks modeling has not yet been applied to the risk behavior of credit portfolios during the recovery process. By analyzing the risk behavior leading to the event of interest (recovery), we can calculate the probability of recovery for a given contract within the chosen period. With the modeling results in hand, we compare the estimated latent risks in the process leading to recovery across customer groups, together with the respective probabilities of recovery. Our main objective is to identify the customer characteristics that result in greater efficiency in the recovery process.

The remaining sections are organized as follows: Section 2 presents the competing risks model and how its parameters are estimated; Section 3 applies the model to the database; Section 4 presents conclusions and a discussion of the results.

2. Model Formulation

In a competing risks model, recovery, or any other relevant event, is regarded as the result of causes that operate concurrently over time. Two statistical distributions are therefore used to formulate the model: one for the random variable time to event, and another for the random variable that models the number of competing events. For an in-depth study of the matter, we recommend the books by Crowder (2010) and Pintilie (2006), among others.

Our dataset is made up of a population of approximately 22 thousand defaulted contracts from a Brazilian financial institution's credit portfolio. All contracts are subject to the same collection rule, that is, the same collection actions were implemented over a recovery process lasting at most 24 months. The recovery process has one of two outcomes: for recovered loans, the time to full recovery is observed; for unrecovered loans, as in survival analysis, the time is considered censored at month 24.

In this paper, we use the probability distributions most frequently employed in the survival analysis and competing risks literature. We assume that the time to the event, represented by the random variable $T$, follows the Weibull distribution, and that the number of risk events, represented by the random variable $M$, follows the Poisson distribution. The formulation below assumes that some clients may not be susceptible to recovery, so that the number of competing risks for the recovery event may be zero. The model is generally known as the promotion time model and has appeared in earlier works such as Chen et al. (1999) and Yakovlev & Tsodikov (1996).

We therefore assume that $M$ is Poisson distributed with probability mass function

$$P(M = m) = \frac{\theta^{m} \exp(-\theta)}{m!},$$

where $\theta > 0$ and $m = 0, 1, 2, \ldots$

For every $i = 1, 2, \ldots, m$, let $T_i$ be the time to recovery due to the $i$-th risk factor, assumed to be independent of the number of risks $M$. Each $T_i$ is assumed to follow the Weibull distribution, whose probability density function is

$$f(t) = \gamma \beta^{\gamma} t^{\gamma - 1} \exp\left(-(\beta t)^{\gamma}\right),$$

where $t > 0$, $\gamma > 0$ and $\beta > 0$, with corresponding distribution function $F(t) = 1 - \exp\left(-(\beta t)^{\gamma}\right)$.

The time of occurrence of the event is defined as the minimum over all $m$ risk factors, that is, $Y = \min(T_0, T_1, \ldots, T_m)$, where $T_0$ is taken to be infinite (the convention in Chen et al., 1999), so that a contract with $M = 0$ is never recovered. As shown in Bereta et al. (2011), Chen et al. (1999) and Yakovlev & Tsodikov (1996), the probability density function of $Y$ is

$$f_Y(t) = \theta f(t) \exp\left[-\theta F(t)\right].$$

The same references give the survival function as

$$S_Y(t) = \exp\left[-\theta F(t)\right].$$

As expected, the database has a large number of unrecovered contracts. The literature regards these as immune to the event; in our case, they are regarded as contracts lost due to default. The model output that estimates this share is known as the cure fraction. It is the limit of $S_Y(t)$ as $t \to \infty$ and is given by $\exp(-\theta)$ in this case.

To estimate the model's parameters, we use maximum likelihood estimation in the presence of censored observations. The censoring indicator is such that $\delta_i = 1$ if contract $i$ is recovered and $\delta_i = 0$ otherwise. The likelihood function is therefore

$$L(\Theta \mid t) = \prod_{i=1}^{n} f_Y(t_i \mid \Theta)^{\delta_i} \, S_Y(t_i \mid \Theta)^{1 - \delta_i},$$

where $\Theta = (\theta, \gamma, \beta)$.
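To make the estimation step concrete, the following is a minimal sketch of the censored log-likelihood for the promotion time model with a Weibull baseline. It is not the authors' code. It assumes the rate form of the Weibull density as written above, uses synthetic data drawn by inverting $S_Y$, and every function name and parameter value in it is hypothetical. Instead of a full three-parameter optimizer, it runs a crude grid search over θ with γ and β held fixed, which is enough to show that the likelihood peaks near the value used to simulate.

```python
import math
import random


def weibull_cdf(t, gamma, beta):
    """Weibull CDF in the rate form of the text: F(t) = 1 - exp(-(beta*t)^gamma)."""
    return 1.0 - math.exp(-((beta * t) ** gamma))


def weibull_pdf(t, gamma, beta):
    """Weibull density: gamma * beta^gamma * t^(gamma-1) * exp(-(beta*t)^gamma)."""
    return gamma * beta**gamma * t ** (gamma - 1.0) * math.exp(-((beta * t) ** gamma))


def log_likelihood(params, times, deltas):
    """Censored log-likelihood: sum of delta*log f_Y(t) + (1-delta)*log S_Y(t)."""
    theta, gamma, beta = params
    total = 0.0
    for t, delta in zip(times, deltas):
        total += -theta * weibull_cdf(t, gamma, beta)  # log S_Y(t), needed in both cases
        if delta == 1:  # recovered: add log(theta * f(t)) to obtain log f_Y(t)
            total += math.log(theta * weibull_pdf(t, gamma, beta))
    return total


def simulate(n, params, horizon=24.0, seed=0):
    """Draw n (time, delta) pairs by inverting S_Y; censor at `horizon` months."""
    theta, gamma, beta = params
    rng = random.Random(seed)
    times, deltas = [], []
    for _ in range(n):
        u = max(rng.random(), 1e-300)  # uniform draw, kept away from 0
        p = -math.log(u) / theta       # value of F(t) that solves S_Y(t) = u
        if p < 1.0:
            t = (-math.log1p(-p)) ** (1.0 / gamma) / beta
        else:
            t = math.inf               # u <= exp(-theta): contract is never recovered
        recovered = t <= horizon
        times.append(t if recovered else horizon)
        deltas.append(1 if recovered else 0)
    return times, deltas


# Hypothetical "true" parameters (theta, gamma, beta), chosen only for illustration.
true_params = (1.0, 1.2, 0.05)
times, deltas = simulate(5000, true_params)

# Crude one-dimensional search over theta with gamma and beta held at their true values.
grid = [0.50, 0.75, 1.00, 1.25, 1.50]
best_theta = max(grid, key=lambda th: log_likelihood((th, 1.2, 0.05), times, deltas))
censored_share = 1.0 - sum(deltas) / len(deltas)
print("theta on the grid with the highest log-likelihood:", best_theta)
print("share censored at month 24:", round(censored_share, 3))
```

In practice, all three parameters would be estimated jointly by passing the negative of `log_likelihood` to a general-purpose numerical optimizer. The grid search here only keeps the sketch free of external dependencies.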

3. Application

Competing risks modeling gives lenders a practical interpretation of the estimated parameters. The Poisson parameter θ is the expected value of the random variable M and models the number of latent risks leading to the relevant event. It is easily interpreted for the purpose of comparing risk profiles: groups with a larger number of factors leading to recovery are more susceptible to recovery. In this case, we may also say that these are the groups with the highest risk of recovery.

Tables 4 and 5 show the estimated parameters and allow easy comparison of the Poisson parameter estimates across customer groups. The results in Tables 4 and 5 support the data shown in Tables 1, 2 and 3. We find that customers with contracted amount range 2 and behavior score range 2 show the highest estimated values of θ and, therefore, the greatest chance of an effective credit recovery process.

Table 4

| Group | γ | β | θ | exp(-θ) |
|---|---|---|---|---|
| Value Range I | 1.157 | 18.762 | 0.614 | 0.510 |
| Value Range II | 1.157 | 18.762 | 0.871 | 0.418 |
| BS Range I | 1.260 | 23.152 | 0.413 | 0.661 |
| BS Range II | 1.260 | 23.152 | 1.422 | 0.241 |

Table 5

| Group | Subgroup | γ | β | θ | exp(-θ) |
|---|---|---|---|---|---|
| Value Range I | BS Range I | 1.297 | 28.504 | 0.541 | 0.581 |
| Value Range I | BS Range II | 1.297 | 28.504 | 1.458 | 0.232 |
| Value Range II | BS Range I | 1.304 | 18.551 | 0.544 | 0.580 |
| Value Range II | BS Range II | 1.304 | 18.551 | 1.849 | 0.157 |

Equipped with the three parameters estimated by the competing risks model, Θ = (θ, γ, β), we show in Tables 6 and 7 the survival values of contracts at 12, 18 and 24 months. Under our model, S_Y(12 months) is the probability that a defaulted contract has not yet been recovered by month 12. Since the contract tracking and collection period is capped at 24 months, the values of S_Y(24 months) represent the probability of non-recovery of the non-performing contracts at the end of the 24-month period set for collection efforts. Note that these values, in the S_Y(24 months) column of Tables 6 and 7, are very close to the observed non-recovery rates presented in Table 1 (for Table 6) and in Tables 2 and 3 (for Table 7).

Table 6

| Group | S_Y(12 months) | S_Y(18 months) | S_Y(24 months) | % Unrecovered |
|---|---|---|---|---|
| Value Range I | 75.89% | 68.56% | 63.65% | 63.19% |
| Value Range II | 67.63% | 58.56% | 53.70% | 53.41% |
| BS Range I | 86.39% | 80.74% | 76.46% | 76.27% |
| BS Range II | 60.46% | 47.93% | 39.74% | 40.39% |

Table 7

| Group | Subgroup | S_Y(12 months) | S_Y(18 months) | S_Y(24 months) | % Unrecovered |
|---|---|---|---|---|---|
| Value Range I | BS Range I | 86.04% | 79.51% | 74.22% | 74.06% |
| Value Range I | BS Range II | 66.68% | 53.91% | 44.78% | 45.02% |
| Value Range II | BS Range I | 79.03% | 71.45% | 66.36% | 66.17% |
| Value Range II | BS Range II | 44.94% | 31.91% | 24.83% | 25.43% |

Graphs 1 and 2 help compare the recovery risk profiles of the combined profiles formed by Behavior Score range and contracted amount range. As expected, a greater number of latent competing risks for credit recovery is associated with a greater chance, or risk, of occurrence of the event of interest (recovery). Finally, Box 1 ranks the combined profiles by the degree of risk associated with the recovery and collection process, in increasing order.

Box 1

Low: FX-BS1 combined with FX-CV1
Lower Medium: FX-BS1 combined with FX-CV2
Higher Medium: FX-BS2 combined with FX-CV1
High: FX-BS2 combined with FX-CV2
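As a worked example of how the survival values follow from the estimates, the sketch below evaluates S_Y(t) = exp[-θ F(t)] at 12, 18 and 24 months using the Table 4 parameters. One assumption needs to be stated loudly: the tabulated β is treated here as a Weibull scale parameter in months, so that F(t) = 1 - exp(-(t/β)^γ). This differs from the way the density is written in Section 2, where β multiplies t, but a hand check suggests it is the reading under which the tabulated survival values are approximately reproduced. The function and variable names are hypothetical.

```python
import math


def survival(t, theta, gamma, scale):
    """S_Y(t) = exp(-theta * F(t)), with F(t) = 1 - exp(-(t/scale)^gamma).

    Assumption: `scale` is the tabulated beta read as a scale parameter in months.
    """
    cdf = 1.0 - math.exp(-((t / scale) ** gamma))
    return math.exp(-theta * cdf)


# (gamma, beta, theta) copied from Table 4.
table4 = {
    "Value Range I": (1.157, 18.762, 0.614),
    "Value Range II": (1.157, 18.762, 0.871),
    "BS Range I": (1.260, 23.152, 0.413),
    "BS Range II": (1.260, 23.152, 1.422),
}

# Survival at 12, 18 and 24 months for each group.
curves = {
    group: [survival(t, theta, gamma, beta) for t in (12, 18, 24)]
    for group, (gamma, beta, theta) in table4.items()
}
for group, values in curves.items():
    print(group, " ".join(f"{v:.2%}" for v in values))
```

Any remaining gap between a computed value and the corresponding table entry should be treated as unexplained rather than as a correction to the published figures.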

4. Conclusion

This article presents a new way to measure the efforts of collection departments, which are present in virtually every credit-granting industry. We thereby offer an additional tool to be used jointly with existing methods in the collection and recovery process. The purpose of a good collection policy is to indicate where more effort should be employed and where excessive expenditure of resources is unnecessary, resulting in a structured policy and cost-effective recovery. We applied the statistical method known as latent competing risks modeling and were able to compare groups of customers according to the probability of recovery of their non-performing loans. We expect that, with this additional statistical tool applied to the recovery process, lenders will be able to pursue the objective of maximizing collections, with an immediate reduction in the losses arising from their financing activities.

5. Acknowledgments

This study was funded by CNPq and FAPESP, Brazil.

Authors

Mauro Ribeiro de Oliveira Júnior has a Bachelor's Degree in Mathematics from Universidade Federal de São Carlos (2003), a Master's Degree in Mathematics from Universidade Estadual de Campinas (2006) and an MBA in Risk Management from FIPECAFI (2012). He is currently a Doctoral Candidate in Statistics at UFSCar. E-mail: mauroexatas@gmail.com

Francisco Louzada has a PhD in Statistics from Oxford University (1998), a Master's Degree in Computer Sciences and Computing Mathematics from Universidade de São Paulo (1991) and a Bachelor's Degree in Statistics from Universidade Federal de São Carlos (1988). He is currently a Tenured Professor at Universidade de São Paulo. E-mail: louzada@icmc.usp.br

References

BERETA, E. M., LOUZADA, F., & FRANCO, M. A. P. (2011). The Poisson-Weibull Distribution. Advances and Applications in Statistics, v. 22, 107-118.
CHEN, M.-H., IBRAHIM, J. G., & SINHA, D. (1999). A new Bayesian model for survival data with a surviving fraction. Journal of the American Statistical Association, v. 94, 909-919.
COONER, F., BANERJEE, S., & MCBEAN, A. M. (2006). Modelling geographically referenced survival data with a cure fraction. Statistical Methods in Medical Research, v. 15(1), 307-324.
COONER, F., BANERJEE, S., CARLIN, B. P., & SINHA, D. (2007). Flexible cure rate modeling under latent activation schemes. Journal of the American Statistical Association, v. 102, 560-572.
CROWDER, M. J. (2010). Classical Competing Risks. CRC Press.
IBRAHIM, J. G., CHEN, M.-H., & SINHA, D. (2001). Bayesian Survival Analysis. Springer, New York.
MALLER, R. A., & ZHOU, X. (1996). Survival Analysis with Long-Term Survivors. Wiley, New York.
PINTILIE, M. (2006). Competing Risks: A Practical Perspective, Vol. 58. Wiley.
TOURNOUD, M., & ECOCHARD, R. (2007). Application of the promotion time cure model with time-changing exposure to the study of HIV/AIDS and other infectious diseases. Statistics in Medicine, v. 26, 1008-1021.
TSODIKOV, A. D., IBRAHIM, J. G., & YAKOVLEV, A. Y. (2003). Estimating cure rates from survival data: An alternative to two-component mixture models. Journal of the American Statistical Association, v. 98, 1063-1078.
XU, R., MCNICHOLAS, P. D., DESMOND, A. F., & DARLINGTON, G. A. (2011). A first passage time model for long-term survivors with competing risks. The International Journal of Biostatistics, v. 7(1), 1-15.
YAKOVLEV, A. Y., & TSODIKOV, A. D. (1996). Stochastic Models of Tumor Latency and Their Biostatistical Applications. World Scientific, Singapore.