Application of Statistical Techniques in Group Insurance
Chit Wai Wong, John Low, Keong Chuah & Jih Ying Tioh
AIA Australia
This presentation has been prepared for the 2016 Financial Services Forum. The Institute Council wishes it to be understood that opinions put forward herein are not necessarily those of the Institute and the Council is not responsible for those opinions.
Agenda
- Background and Motivation
- Stochastic Claims Techniques and Applications
- Generalised Linear Models and Applications
- GLM Case Study: Understanding What Drives TPD Claim Delays
Some Relevant Developments within General Insurance
- 1970s: Generalised Linear Models; basic chain ladder
- 1993: Thomas Mack, standard error of chain ladder
- 2002: England & Verrall, stochastic claims reserving
- 2008: Mario Wüthrich, stochastic claims reserving for solvency purposes
- To date: Research with applications in pricing? Based on non-aggregated data?
Moving Forward to the 21st Century
- Anticipated improvements in data quality and in the volume of data available
- Further insights to be gained from using individual member data
- Significant improvements in computing power and in the accessibility of statistical packages
Suggested Statistical Package
R, with RStudio as the integrated development environment
- Widely used by practising statisticians and researchers for statistical analysis
- Various built-in packages accessible to users who might not have very strong coding experience
- Active user groups where questions can be asked and are often answered quickly
- Free of charge and open source
Analysis Capabilities Need to Expand
The chain ladder and its deterministic variants are heavily used in almost all aspects of group insurance work, but the toolkit can be extended:
- Deterministic chain ladder, supplemented by stochastic methods and GLM
- Additional insights potentially lead to enhanced pricing, reserving and capital management
- Optimised return on capital
STOCHASTIC CLAIMS TECHNIQUES AND APPLICATIONS
Stochastic Claims Models
Full models
- Most reflective of the claims profile
- Need policy/member data
- Need computing power
Simplified models
- Reflective of the key drivers of the claims profile
- Reduced policy/member data and computing requirements
- Require setting homogeneous groups with sufficient credibility
Observations
- Many variants of stochastic models have emerged over the last 30 years in the general insurance area
- Most of these models are chain ladder based
- Some are easy to implement in spreadsheets through the bootstrapping method
- Full models using individual data could become more popular as computing power increases
Why Stochastic Claims Models?
Provide a predictive distribution
- Provide more than just mean and variance
- Link probability to amount at risk
- Useful in setting risk margins, pricing and capital
- Could be extended to enterprise value at risk
Provide a measure of variability
Produce asymmetric distributions
- Avoid understating the tail
Pricing options and guarantees
- Profit share for group life business
[Figure: claims distribution with the 99.5% point marked]
Capital Risk Margins Calibration
- Stochastic chain ladder models could be used to understand IBNR distributions at the aggregated level or by accident/incurred year
- Support the calibration of risk margins on IBNR
- Most models can produce asymmetric distributions to avoid understating the tail
- Some variations could be implemented in Excel
- R has a very comprehensive package (ChainLadder): https://cran.r-project.org/web/packages/chainladder/chainladder.pdf

AY/DY      1      2      3      4      5    6    7
2007   3,855  3,428  2,057  1,374  1,037  598  340
2008   3,979  3,538  2,123  1,418  1,070  618  272
2009   4,263  3,791  2,274  1,519  1,146  615  270
2010   3,992  3,549  2,129  1,423    986  710  163
2011   4,217  3,750  2,250  2,139    901  742  232
2012   5,108  4,542  2,618  2,067  1,214  921  356
2013   6,283  5,812  3,191  2,424  1,415  915  295

[Figure: simulated IBNR distributions by incurred year; panels: All Years, 2011, 2012, 2013]
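
As a starting point for the stochastic variants, the sketch below runs the basic deterministic chain ladder on the AY/DY table above. It rests on two assumptions that the slide does not state: the figures are incremental claims, and only the cells on or before the latest diagonal are observed (seven development years for 2007 down to one for 2013), with the remaining cells being simulated output. Python and all variable names are used for illustration only.

```python
from itertools import accumulate

# Assumed observed incremental claims (upper-left triangle of the slide's table).
incremental = {
    2007: [3855, 3428, 2057, 1374, 1037, 598, 340],
    2008: [3979, 3538, 2123, 1418, 1070, 618],
    2009: [4263, 3791, 2274, 1519, 1146],
    2010: [3992, 3549, 2129, 1423],
    2011: [4217, 3750, 2250],
    2012: [5108, 4542],
    2013: [6283],
}

# The chain ladder works on cumulative claims.
cumulative = {ay: list(accumulate(row)) for ay, row in incremental.items()}
n_dev = max(len(row) for row in cumulative.values())

# Volume-weighted development factors from development year j+1 to j+2.
factors = []
for j in range(n_dev - 1):
    rows = [c for c in cumulative.values() if len(c) > j + 1]
    factors.append(sum(c[j + 1] for c in rows) / sum(c[j] for c in rows))

# Project each accident year to ultimate; the outstanding amount is ultimate less latest.
ibnr = {}
for ay, c in cumulative.items():
    ultimate = c[-1]
    for f in factors[len(c) - 1:]:
        ultimate *= f
    ibnr[ay] = ultimate - c[-1]

for ay, amount in ibnr.items():
    print(ay, round(amount))
```

A stochastic method such as the bootstrap would resample residuals around this fit and repeat the projection many times to build the IBNR distributions shown on the slide.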
Profit Share Pricing
- A profit share arrangement creates an asymmetric return for the insurer
- The variability of claim outcomes needs to be considered when setting the cost of the profit share
Profit Share Pricing
Cost of profit share = f(profit share refund %, premiums, claims distribution)
Drivers of the cost: profit share refund %, premiums, volatility in the claims distribution
- The process to determine premiums is iterative
- Initial premiums could be chosen based on risk tolerance, say, to cover claims at the 95th percentile
- The simulation process could be repeated by varying premiums and profit share refund %
- Higher premiums lead to lower exposure to claims variability for the insurer but reduce competitiveness
- Final premiums and profit share refund % will be based on the balance between the insurer's and the client's objectives
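
The iterative process above can be sketched with a small simulation. The slide gives only the general form f(...), so everything specific here is an assumption made for illustration: claims are drawn from a lognormal distribution with a hypothetical mean and coefficient of variation, and the refund is taken as refund % × max(premium − claims, 0), ignoring expenses and any loss carry-forward.

```python
import math
import random
import statistics

def simulate_claims(n_sims, mean, cv, seed=0):
    """Draw annual claims from a lognormal with the given mean and CV (assumed shape)."""
    rng = random.Random(seed)
    sigma = math.sqrt(math.log(1.0 + cv ** 2))
    mu = math.log(mean) - 0.5 * sigma ** 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n_sims)]

def profit_share_cost(premium, refund_pct, claims):
    """Expected refund, assuming refund = refund_pct * max(premium - claims, 0)."""
    refunds = [refund_pct * max(premium - c, 0.0) for c in claims]
    return statistics.fmean(refunds)

# Hypothetical portfolio: expected claims of 10 (in $m) with a 15% coefficient of variation.
claims = simulate_claims(n_sims=20_000, mean=10.0, cv=0.15)

# Initial premium set to cover claims at the 95th percentile.
initial_premium = statistics.quantiles(claims, n=100)[94]

# Repeat the calculation while varying premium and refund %.
for loading in (1.00, 1.05):
    for refund_pct in (0.25, 0.50):
        premium = initial_premium * loading
        cost = profit_share_cost(premium, refund_pct, claims)
        print(f"premium={premium:.2f} refund%={refund_pct:.0%} cost={cost:.3f}")
```

Under this simplified refund formula the cost scales linearly with the refund % and rises with the premium, which is the trade-off the final bullets describe.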
GENERALISED LINEAR MODEL AND APPLICATIONS
Generalised Linear Model
A statistical model that relates a response variable to a set of explanatory variables (rating factors).
Response variable examples
- Incidence rate
- Lapse rate
- Termination rate
- Lodgement delay
Explanatory variable examples
- Age
- Gender
- Policy duration
- Smoking status
Why Use Generalised Linear Model?
Moving away from one-way analysis
- One-way analysis does not take account of correlations between factors.
- GLM is multivariate, which allows for correlations and interactions between factors.
- Rating factors can be ranked, and discovery of new rating factors is made easier using statistical packages.
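
A small made-up example shows how one-way analysis can mislead when factors are correlated. The exposures and claim counts below are hypothetical, built so that the true incidence relativities are 2.0 for older members and 1.5 for smokers, with most older members being smokers. The multiplicative model is fitted with the method of marginal totals, which solves the same equations as a Poisson GLM with log link and main effects only; this is a sketch, not the presenters' method.

```python
# Hypothetical exposure (member-years) and claim counts by age group and smoking status.
exposure = {("young", "non-smoker"): 900, ("young", "smoker"): 100,
            ("old", "non-smoker"): 200, ("old", "smoker"): 800}
claims = {("young", "non-smoker"): 9.0, ("young", "smoker"): 1.5,
          ("old", "non-smoker"): 4.0, ("old", "smoker"): 24.0}
ages = ["young", "old"]
smokers = ["non-smoker", "smoker"]

def one_way_rate(level, position):
    """Incidence rate for one level of one factor, ignoring the other factor."""
    cells = [k for k in exposure if k[position] == level]
    return sum(claims[k] for k in cells) / sum(exposure[k] for k in cells)

one_way_age = one_way_rate("old", 0) / one_way_rate("young", 0)
one_way_smoker = one_way_rate("smoker", 1) / one_way_rate("non-smoker", 1)

# Multiplicative model: rate = a[age] * b[smoker]. Alternate between the two
# factors so that fitted claims match observed claims on every margin.
a = {x: 1.0 for x in ages}
b = {s: 1.0 for s in smokers}
for _ in range(200):
    for x in ages:
        a[x] = sum(claims[x, s] for s in smokers) / sum(exposure[x, s] * b[s] for s in smokers)
    for s in smokers:
        b[s] = sum(claims[x, s] for x in ages) / sum(exposure[x, s] * a[x] for x in ages)

glm_age = a["old"] / a["young"]
glm_smoker = b["smoker"] / b["non-smoker"]

print(f"age relativity:    one-way {one_way_age:.2f}, multivariate {glm_age:.2f}")
print(f"smoker relativity: one-way {one_way_smoker:.2f}, multivariate {glm_smoker:.2f}")
```

The one-way relativities overstate both effects because each factor picks up part of the other; the multivariate fit separates them.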
Why Use Generalised Linear Model?
Moving from aggregated data to individual data
[Figure: aggregated data, summarised by link ratios]

Claim ID  Member ID  Loss Date   Reported Date  Claim Paid  Delay (months)
0005      M61868     5/06/2012   5/07/2012      200000      1.0
3005      M003       18/07/2014  16/10/2014     150000      3.0
03656     M30540     6/09/2010   25/03/2011     80000       6.6
9873      M2168      1/07/2011   29/10/2011     400000      3.9

- Group insurance claims analyses are typically performed on aggregated data using the chain ladder approach.
- This approach shows whether delays are getting longer or shorter, but does not necessarily help us understand what is driving them.
The advantages:
- Further insights could be generated by identifying what drives claim delays using individual claims data.
- Volume and richness of data will continue to grow.
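
The delay column can be derived from the two date columns. The sketch below assumes dates are in day/month/year format and that a month is taken as 365.25 / 12 days; the slide does not state this convention, but it is consistent with the four delays shown.

```python
from datetime import datetime

DAYS_PER_MONTH = 365.25 / 12  # assumption: average month length

# (claim ID, loss date, reported date) from the slide's sample records.
records = [
    ("0005", "5/06/2012", "5/07/2012"),
    ("3005", "18/07/2014", "16/10/2014"),
    ("03656", "6/09/2010", "25/03/2011"),
    ("9873", "1/07/2011", "29/10/2011"),
]

def delay_months(loss_date, reported_date):
    """Reporting delay in months, rounded to one decimal place."""
    fmt = "%d/%m/%Y"
    days = (datetime.strptime(reported_date, fmt) - datetime.strptime(loss_date, fmt)).days
    return round(days / DAYS_PER_MONTH, 1)

delays = {claim_id: delay_months(loss, reported) for claim_id, loss, reported in records}
print(delays)
```

A per-claim delay like this is the response variable that the case study later models against member-level factors.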
Questions GLM Can Help Answer
- What is the optimal grouping of rating factors?
- Which customer segments should the business target?
- What is the impact of rating factors on the dependent variable?
- Are there any interactions between rating factors?
- Which are the most important drivers of experience?
Sourcing data for GLM analysis
- GLM analysis is not limited to data that is available internally.
- External sources of data can be linked to internal data using postcodes to extend the analysis.
Mitchell LGA example:
- Community profiles: top employing industry is manufacturing; occupation profile is light blue
- Economic data: unemployment rate of 7.5% at the Dec-15 quarter
- Socio-Economic Indexes for Areas (SEIFA) 2011: area is not disadvantaged
Can these factors help explain claims experience?
Source: abs.gov.au and employment.gov.au
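
Linking by postcode amounts to a lookup join. In the sketch below the postcodes, claim IDs and field names are all hypothetical placeholders; only the Mitchell LGA attributes come from the slide.

```python
# Hypothetical postcode lookup built from external (ABS / employment) data.
external_by_postcode = {
    "9991": {"lga": "Mitchell", "unemployment_rate": 0.075,
             "top_industry": "Manufacturing", "seifa_disadvantaged": False},
}

# Hypothetical internal claim records carrying the member's residential postcode.
internal_claims = [
    {"claim_id": "C1", "postcode": "9991", "delay_months": 3.0},
    {"claim_id": "C2", "postcode": "9992", "delay_months": 6.6},
]

# Left join: keep every claim, adding external fields where the postcode is known.
enriched = [{**claim, **external_by_postcode.get(claim["postcode"], {})}
            for claim in internal_claims]

for row in enriched:
    print(row)
```

Claims whose postcode has no external match keep only their internal fields, so unmatched records can be reviewed rather than silently dropped.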
Visualisations
R can be used to visualise the analysis.
[Figure: map shaded by claim delay ("Higher Delay"), built from internal data + ABS Digital Boundaries data for Australia]
CASE STUDY: UNDERSTANDING WHAT DRIVES TPD CLAIM DELAYS
Using Generalised Linear Models
Background and Motivation
[Figure: timeline from incidence to lodgement, with the claim delay in between]
- Volatility in claim delay drives uncertainty in pricing, profit and capital reporting for group insurance
- Mis-estimation of claim delays by a few months could cost millions!
- With anticipated continued improvement in data quality and availability of external data, can we utilise member-level claims data to gain further insights into what could be driving claim delays?
What Drives Claim Delay?
Awareness is often cited, but can we dig deeper?
GLM was identified as the appropriate tool to:
- Uncover and test new factors
- Rank these factors by importance
- Define the appropriate banding for these factors (e.g. age)
- Estimate the impact that the factors might have on the claim delay
[Diagram: candidate drivers of claim delay: age, awareness, exposure to aggressive lawyer advertising?, cause of claim, unemployment, education]
Searching for Relevant Factors and Ranking them by Importance
A forward selection algorithm may be used to identify rating factors.
One-factor model
- Fit each of the rating factors in turn
- Select the one-factor model that best fits the data
Multi-factor model
- Fit each of the remaining rating factors in turn as additional factors; some may drop off
Final model
- Contains only statistically significant factors
The process stops when adding more factors yields no improvement in explanatory power.
[Diagram: Claim Delay; Age; Education]
#forward package https://cran.r-project.org/web/packages/forward/forward.pdf
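
The selection loop can be sketched as follows. This is an illustration under stated assumptions, not the presenters' implementation: the data are synthetic, the model is ordinary least squares on log delay (a Gaussian GLM with identity link), the fit criterion is AIC, and factors are only added, never dropped. All function and factor names are hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def aic(columns, y):
    """AIC (up to a constant) of a least-squares fit with an intercept plus the given columns."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    beta = solve(XtX, Xty)
    rss = sum((yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2 for row, yi in zip(X, y))
    return len(y) * math.log(rss / len(y)) + 2 * p

def forward_select(factors, y):
    """Add the factor that most improves AIC; stop when no factor improves it."""
    selected, remaining = [], list(factors)
    best = aic([], y)
    while remaining:
        scores = {name: aic([factors[s] for s in selected] + [factors[name]], y)
                  for name in remaining}
        name = min(scores, key=scores.get)
        if scores[name] >= best:
            break
        best = scores[name]
        selected.append(name)
        remaining.remove(name)
    return selected

# Synthetic data: two real effects on log delay and one pure-noise indicator.
rng = random.Random(1)
n = 400
factors = {name: [float(rng.random() < 0.5) for _ in range(n)]
           for name in ("solicitor", "older", "noise")}
log_delay = [1.0 + 0.8 * s - 0.4 * o + rng.gauss(0.0, 0.3)
             for s, o in zip(factors["solicitor"], factors["older"])]

selected = forward_select(factors, log_delay)
print(selected)
```

The order in which factors enter gives a rough ranking by importance. A stepwise variant would also re-test earlier factors after each addition, which is how some may drop off.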
Claim Delay and Age
Q1: Are older members potentially more reliant on their cover, and therefore more aware of it?
[Figure: relative delay (longer/shorter) by age band [0,30), [30,40), [40,50), [50,55), [55,60), [60,70), with results presented relative to a reference band. Annotations: "Indifferent? Retirement is still far away"; "Preservation Age". Darker toned bars indicate higher numbers of claims in the group; a higher number of * indicates higher statistical significance.]
R can be used to identify the appropriate age bands:
- Start with a model with an explanatory variable that contains all raw ages.
- R will highlight the ages that are statistically significantly different, which informs how the ages should be grouped.
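
Once the bands are chosen, raw ages are mapped to them before refitting the model. The sketch below uses the band edges shown on the slide; the sample ages and function name are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Band edges from the slide: [0,30), [30,40), [40,50), [50,55), [55,60), [60,70).
BAND_EDGES = [0, 30, 40, 50, 55, 60, 70]

def age_band(age):
    """Return the half-open band label that contains the given age."""
    if not BAND_EDGES[0] <= age < BAND_EDGES[-1]:
        raise ValueError(f"age {age} is outside the banded range")
    i = bisect_right(BAND_EDGES, age) - 1
    return f"[{BAND_EDGES[i]},{BAND_EDGES[i + 1]})"

# Hypothetical claimant ages; the counts per band indicate how credible each band is.
ages = [24, 29, 30, 41, 52, 57, 58, 63]
claims_per_band = Counter(age_band(a) for a in ages)
print(claims_per_band)
```

Narrower bands near 50 to 60 keep the detail around preservation age, at the price of fewer claims in each band.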
Claim Delay and Cause of Claim
Q2: Are certain causes of claim susceptible to longer delays?
- Conclusive diagnosis is potentially complex and could take time?
- Social perception?
[Figure: relative delay (longer/shorter) by cause of claim, with results presented relative to a reference cause. Darker toned bars indicate higher numbers of claims in the group; a higher number of * indicates higher statistical significance.]
* Small number of claims with significantly longer delays, related to more complicated conditions
Claim Delay and Unemployment
Q3: Do claimants who reside in areas with high unemployment rates have relatively shorter claim delays?
- Higher unemployment, shorter claim delay. Higher reliance?
[Figure: visualisation of higher unemployment against shorter delay]
Claim Delay and Solicitor Advertising
Q4: Does solicitor advertising affect claim delay?
- Claims with solicitor representation may come in late because claimants were initially unaware of their cover.
- Late-onset awareness influenced by advertising?
[Figure: relative delay (longer/shorter) by solicitor representation (Yes, Unknown, No), with results presented relative to a reference level. A higher number of * indicates higher statistical significance.]
Other factors to consider
- Frequency and form of communication between superannuation funds and their members
- Other transactional data (payment reminder notices)
- Claims management process
- Type of cover: default vs. opt-in vs. underwritten cover
- Whether the member has access to a financial planner
- Socio-economic status (consumer profiling based on residential address)
Additional Insights and Potential Benefits
Additional rating factors (age, cause of claim, unemployment and others) inform potential benefits:
- Better segmentation of risk, assessment of ultimate claim cost and trends in awareness levels
- Enhanced pricing based on characteristics of customers under the scheme
- More accurate reserving
- Optimised return on capital
Conclusion
- Stochastic claims models and GLM are tools that can be used to unravel useful additional insights
- With the continued increase in computing power and increased access to quality data, these techniques become easier to implement
- Results should not be viewed mechanically without judgement
- No models are correct, but some are useful!
Thank You
DISCLAIMER
The content is current as at the date set out on the cover page of this presentation and may be subject to change. This presentation provides general information only, without taking into account the objectives, financial situation, needs or personal circumstances of any individual. This presentation may contain projections concerning financial information and statements concerning future economic performance and events, plans and objectives relating to management, operations, products and services, and assumptions underlying these projections and statements. It is possible that actual results and financial conditions may differ, possibly materially, from the anticipated results and financial condition indicated in these projections and statements.
References
- England, P.D. and Verrall, R.J. (2002). Stochastic Claims Reserving in General Insurance.
- abs.gov.au: demographic profiles, socio-economic indexes
- employment.gov.au: unemployment rate
- https://cran.r-project.org/web/packages/chainladder/chainladder.pdf: Chain Ladder R package
- https://cran.r-project.org/web/packages/forward/forward.pdf: Forward search R package