Session 40 PD, How Would I Get Started With Predictive Modeling?
Moderator: Douglas T. Norris, FSA, MAAA
Presenters: Timothy S. Paris, FSA, MAAA; Sandra Tsui Shan To, FSA, MAAA; Qinqing (Annie) Xue, FSA, CERA, MAAA
SOA Antitrust Disclaimer
SOA Presentation Disclaimer
How Would I Get Started with Predictive Modeling?
#040PD - Society of Actuaries Annual Meeting
Sandra To, VP and Deputy Chief Reserving Actuary
October 2016
Definitions: Let's start by agreeing on what we are discussing
Predictive modeling: a process used in predictive analytics to create a statistical model of future behavior
Predictive analytics: the area of data mining concerned with forecasting probabilities and trends
Data mining: the practice of examining large databases in order to generate new information
Big data: extremely large data sets that may be analyzed computationally to reveal patterns, trends and associations, especially relating to human behavior and interactions
Business Goals: Begin with the end in mind
There are many applications of predictive modeling. What will your model do?
Source: Towers-Watson, The Future of Predictive Modeling, Emphasis 2012
Perils of a Poorly Designed Plan
Model Relevancy
Things to Consider
Define short-, mid- and long-term business goals.
  How can data modeling support these initiatives?
  What do you expect to get out of the effort?
  Reject unclear objectives.
  Big Data is an expensive proposition. Do it for a purpose, not just because it is trendy.
Determine monetary investment sizing.
  How much are you willing to invest?
  How quickly must the initiative move forward?
  When must the initial phase be complete?
Define internal data resources.
Define external data opportunities.
Identify internal talent.
  Build and invest in resources, contract or partner?
Skills Needed to Create Effective Modeling Projects
Actuaries vs Data Science?
Source: http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Tools for Creating Effective Models
Spreadsheets (Microsoft Excel); Neural networks; Data mining; Linear & logistic regression testing; Business rules; Cox regression; Assumptions; Clustering; Historical data; Scorecards; Actuarial modeling platforms; Association rules; Prophet; SAP; Multiple models; MoSes; SQL Server; Ensemble; SAS; AXIS; Etc.; Decision trees; Segmentation; Chaining; Composition; Rule set models; Restricted Boltzmann Machine; Industry benchmarks; Behavioral metrics
Many Technologies and Potential Partners
Talent & Framework
Plan for success
How do we determine that the output makes business sense?
  Active communication plan throughout the development process
  Interactive process with business owners, technology team, sales, operations, etc.
How do we make sure we deliver on our investments?
  Know the size of the opportunities
  How do we leverage what we have already built to achieve these opportunities?
  Plan to operationalize what we have built
How do we ensure that we meet our business objectives?
  Performance indicators
  Dashboards
  Owner for the process
  Look for improvements; it should be an iterative process
How Would I Get Started with Predictive Modeling from a Newbie's Perspective
Newbie's Motto: Just Do It
Conferences, Internet, Meetings, Coffee Chats
What's My Role?
R&D; Client Markets; New Solutions; Costing; IF Management
Nitin Nayak, Stephen Abrokwah, JJ Carroll, Li Lin, Allen Pinkham, Tommy Wade, Jane Wang, Brian Carteaux
What's in the beginning, in the middle, and in the end?
Application Data Reports MIB, MVR, UW, Predictive Modeling Presentations Third-party Data
Where do we fit?
Business Objective, Data Insights, Cost Benefit, Business Strategy, Consumer Experience, Experience Study, Model Insights, Feedback Loop
A Case Study
Demonstrate the predictive modeling process
Identify factors associated with age 50+ life insurance ownership
HRS 2014 survey data: 20,000 people age 50+
Marital status, education, number of kids, job status, ... and life insurance ownership
Overall understanding
Data preparation
Select subgroup of interest
Select potential predictors
Multivariate logistic model
Select predictors based on fitness measures
Final predictors
Performance metrics
Close financial protection gap for pre-retirement population
Select potential predictors
Univariate Analysis; Frequency Tables

Years of Education       Comparator     Odds Ratio
Post College             High School    1.272
College Graduate         High School    1.265
Some College             High School    1.087
Less than High School    High School    0.499
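As a minimal sketch of how an odds ratio like those in the univariate analysis can be derived from a frequency table: the odds of ownership in one education group are divided by the odds in the comparator group. The function name and the counts below are hypothetical illustrations, not figures from the HRS data.

```python
def odds_ratio(owners_group, non_owners_group, owners_ref, non_owners_ref):
    """Odds of ownership in a group divided by odds in the comparator group."""
    odds_group = owners_group / non_owners_group
    odds_ref = owners_ref / non_owners_ref
    return odds_group / odds_ref


# Hypothetical frequency-table counts (assumed for illustration only).
# Group of interest: 300 owners, 100 non-owners -> odds 3.0
# Comparator group:  200 owners, 100 non-owners -> odds 2.0
example_ratio = odds_ratio(300, 100, 200, 100)
print(example_ratio)
```

A ratio above 1 suggests the group is more likely than the comparator to own life insurance; a ratio below 1 suggests it is less likely.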
Select predictors
x: for someone who does not own life insurance, the probability that he/she is predicted to be a life insurance owner (false positive rate)
y: for someone who does own life insurance, the probability that he/she is predicted to be a life insurance owner (true positive rate)
Age; Age + Gender; Age + Gender + Region
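The x and y axes described above are those of a ROC curve. The sketch below shows one way to trace such a curve from model scores and summarize it with the area under the curve; the labels and scores are made up for illustration, and the helper names are assumptions rather than anything from the presentation.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; everyone scoring at or above it
    # is predicted to be an owner.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the piecewise-linear ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical data: 1 = owns life insurance, 0 = does not.
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_points = roc_points(example_labels, example_scores)
example_auc = area_under_curve(example_points)
print(example_points)
print(example_auc)
```

Comparing the area for candidate predictor sets (for example Age versus Age + Gender) is one common way to judge whether an added predictor improves discrimination.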
Final Model
#    Predictors
1    CURRENT JOB STATUS
2    1ST ADDRESS STATE
3    OWN-RENT HOME
4    REGULAR USE OF WEB FOR EMAIL
5    PENSION INCOME
6    YEARS OF EDUCATION
7    COUNT OF KIDS
8    INCOME TAX RETURN
9    MARITAL STATUS
10   SEX OF INDIVIDUAL
11   NUMBER DRINKS- PER DAY
12   WHAT PERCENT TAKE RISKS
13   CURRENT AGE
14   SMOKE CIGARETTES NOW

Correctly classified life insurance owners: 75%
Correctly classified non-owners of life insurance: 63%
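The two "correctly classified" percentages are the share of actual owners predicted as owners and the share of actual non-owners predicted as non-owners. A minimal sketch of that calculation follows; the labels and predictions are hypothetical and are not meant to reproduce the figures above.

```python
def classification_rates(y_true, y_pred):
    """Return (share of owners correctly classified, share of non-owners correctly classified)."""
    owners = sum(1 for y in y_true if y == 1)
    non_owners = len(y_true) - owners
    true_pos = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    true_neg = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    return true_pos / owners, true_neg / non_owners


# Hypothetical outcomes: 1 = owner, 0 = non-owner.
example_actual = [1, 1, 1, 1, 0, 0, 0, 0]
example_predicted = [1, 1, 1, 0, 0, 0, 1, 0]
owner_rate, non_owner_rate = classification_rates(example_actual, example_predicted)
print(owner_rate, non_owner_rate)
```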
Newbie's Motto: Just Do It
Legal notice
© 2016 Swiss Re. All rights reserved. You are not permitted to create any modifications or derivative works of this presentation or to use it for commercial or other public purposes without the prior written permission of Swiss Re. The information and opinions contained in the presentation are provided as at the date of the presentation and are subject to change without notice. Although the information used was taken from reliable sources, Swiss Re does not accept any responsibility for the accuracy or comprehensiveness of the details given. All liability for the accuracy and completeness thereof or for any damage or loss resulting from the use of the information contained in this presentation is expressly excluded. Under no circumstances shall Swiss Re or its Group companies be liable for any financial or consequential loss relating to this presentation.
Timothy Paris, FSA, MAAA
Session 040: How Would I Get Started With Predictive Modeling?
Variable Annuity Case Study
October 24, 2016
Industry Model Development
By Company
By Quarter
By Guarantee Type
Moneyness
Distribution Channel
Contract Size
Interaction with Partial Withdrawals
Interpreting Experience Data
Translating to assumptions is very difficult using traditional methods!
Avoid missing important factors?
Adequacy of company-level data?
Interactions between factors?
Avoid double counting?
Changes over time?
Process transparency and consistency?
Industry Data; Traditional Analysis; Statistical Techniques; Expert Judgment
Simple Linear Modeling: $y$, $x$
Classical Linear Modeling: $E(y \mid x)$
Generalized Linear Modeling (GLM): $g[E(y \mid x)]$
Flexible framework
Non-normal
Non-constant variance
Generalized Linear Modeling
Logistic Regression Model
$\ln\left(\frac{\mu}{1-\mu}\right) = \beta_0 + \sum_i \beta_i x_i$
Log of odds is a linear function of key factors
Binary values, such as surrenders or deaths
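To make the logistic regression relationship concrete, the sketch below turns a linear combination of factors into a probability and then checks that the log of the odds recovers the linear predictor. The coefficients and factor values are hypothetical, chosen only for illustration; they are not fitted values from the case study.

```python
import math


def linear_predictor(beta0, betas, xs):
    """beta_0 + sum_i beta_i * x_i"""
    return beta0 + sum(b * x for b, x in zip(betas, xs))


def predicted_probability(beta0, betas, xs):
    """Inverse of the logit link: mu = 1 / (1 + exp(-eta))."""
    eta = linear_predictor(beta0, betas, xs)
    return 1.0 / (1.0 + math.exp(-eta))


def log_odds(mu):
    """Logit link: ln(mu / (1 - mu))."""
    return math.log(mu / (1.0 - mu))


# Hypothetical coefficients and factor values (assumptions for illustration).
example_beta0 = -3.0
example_betas = [0.8, -0.5]
example_xs = [1.2, 1.0]

example_mu = predicted_probability(example_beta0, example_betas, example_xs)
print(example_mu)            # modeled probability of the binary event
print(log_odds(example_mu))  # recovers the linear predictor
```

Because the link is the log of the odds, exponentiating a coefficient gives the multiplicative change in odds per unit change in that factor.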
Goodness of Fit; Predictive Power
Akaike's Information Criterion; Actual-to-Expected Ratios; Expert Judgment; Much More
Akaike's Information Criterion
$AIC = 2k - 2\ln(L)$
Metric to compare the relative quality of alternative models. Lower is better.
Rewards goodness of fit (L), but with a penalty for more model factors (k) to mitigate the risk of overfitting the model on training data.
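As a rough sketch of how this criterion compares two candidate models for a binary outcome, the code below computes the log-likelihood of predicted probabilities and applies AIC = 2k - 2 ln(L). The outcomes, predicted probabilities and parameter counts are hypothetical.

```python
import math


def bernoulli_log_likelihood(y_true, probs):
    """ln(L) for binary outcomes given predicted event probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(y_true, probs)
    )


def aic(log_likelihood, k):
    """Akaike's Information Criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood


# Hypothetical outcomes and two hypothetical models' predictions.
example_outcomes = [1, 0, 1, 0]
model_a_probs = [0.5, 0.5, 0.5, 0.5]   # intercept only, k = 1
model_b_probs = [0.8, 0.2, 0.8, 0.2]   # one added factor, k = 2

aic_a = aic(bernoulli_log_likelihood(example_outcomes, model_a_probs), k=1)
aic_b = aic(bernoulli_log_likelihood(example_outcomes, model_b_probs), k=2)
print(aic_a, aic_b)  # the lower value indicates the preferred model
```

Here the extra factor improves the likelihood by more than its penalty, so the second model has the lower AIC.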
Actual-to-Expected Ratios
$A/E$
Develop E using training data, compare to A from test data
Examine in aggregate, by cohorts, and over time
Look at range of outcomes and tails
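A minimal sketch of the actual-to-expected comparison: expected values come from a model built on training data, and actual events come from held-out test records, summed by cohort and in aggregate. The record layout and the values are assumptions for illustration.

```python
from collections import defaultdict


def actual_to_expected(records):
    """records: iterable of (cohort, actual_event, expected_probability).

    Returns a dict of A/E ratios by cohort plus an "aggregate" entry.
    """
    actual = defaultdict(float)
    expected = defaultdict(float)
    for cohort, actual_event, expected_probability in records:
        actual[cohort] += actual_event
        expected[cohort] += expected_probability
    ratios = {cohort: actual[cohort] / expected[cohort] for cohort in actual}
    ratios["aggregate"] = sum(actual.values()) / sum(expected.values())
    return ratios


# Hypothetical test-set records: actual surrender indicator (A) and the
# model's expected surrender probability (E) for each contract.
example_records = [
    ("2014", 1, 0.5),
    ("2014", 0, 0.5),
    ("2015", 1, 0.25),
    ("2015", 1, 0.25),
    ("2015", 0, 0.5),
]
example_ratios = actual_to_expected(example_records)
print(example_ratios)
```

A ratio near 1 in every cohort, not only in aggregate, suggests that the model is not relying on offsetting errors.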
Expert Judgment
Business context, sensibility, materiality, parsimony
More data usually beats more complex models
Let the data speak
Use simple models for complex data, and complex models for simple data
Factor Exploration
Akaike's Information Criterion; Actual-to-Expected Ratios; Expert Judgment; Model Selection; Much More
Factor Exploration
Company Customization and Benchmarking
Actuarial good practice
Benchmarking
Stakeholders want to know
Early warning for management actions
Company Customization
Similar to traditional actuarial credibility theory
Avoid unnecessary and speculative guesswork whenever possible
Balance between industry and company data
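The presentation does not say how the balance between industry and company data is struck. As one illustration from traditional credibility theory, the sketch below uses a limited-fluctuation (square-root) credibility factor Z and blends Z times the company rate with (1 - Z) times the industry rate. The function names, the full-credibility standard and all rates are hypothetical assumptions.

```python
import math


def credibility_factor(company_events, full_credibility_events):
    """Square-root rule: Z = min(1, sqrt(n / n_full))."""
    return min(1.0, math.sqrt(company_events / full_credibility_events))


def blended_rate(company_rate, industry_rate, z):
    """Credibility-weighted estimate: Z * company + (1 - Z) * industry."""
    return z * company_rate + (1.0 - z) * industry_rate


# Hypothetical inputs: 400 observed company events against an assumed
# full-credibility standard of 1,600 events gives Z = 0.5.
example_z = credibility_factor(400, 1600)
example_rate = blended_rate(company_rate=0.08, industry_rate=0.05, z=example_z)
print(example_z, example_rate)
```

A company with little experience leans mostly on the industry rate, and the weight shifts toward its own experience as events accumulate.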
Benefits
Goes beyond the endless series of reactionary point estimates to quantify the range of behavioral values
Consistent mathematical framework for assumption setting and review/updates
Allows for company-level customization from the maximum data set (industry)
Industry Data; Traditional Analysis; Statistical Techniques; Expert Judgment