And The Winner Is? How to Pick a Better Model

Part 2: Goodness-of-Fit and Internal Stability
Dan Tevet, FCAS, MAAA

Goodness-of-Fit
- Trying to answer the question: how well does our model fit the data?
- Can be measured on training data or on holdout data
- By identifying areas of poor model fit, we may be able to improve our model
- A few ways to measure goodness-of-fit:
  - Squared or absolute error
  - Likelihood/log-likelihood
  - AIC/BIC
  - Deviance/deviance residuals
  - Plot of actual versus predicted target

Squared Error & Absolute Error
- For each record, calculate the squared or absolute difference between the actual and predicted target variable
- Easy and intuitive, but generally inappropriate for insurance data, and can lead to selection of the wrong model
- Squared error is appropriate for Normal data, but insurance data is generally not Normal
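A minimal sketch of the per-record calculation described above. The claim amounts and predictions are made up for illustration; the point is only that a single large loss can dominate the squared-error total on skewed data.

```python
def squared_errors(actual, predicted):
    """Per-record squared difference between actual and predicted."""
    return [(a - p) ** 2 for a, p in zip(actual, predicted)]


def absolute_errors(actual, predicted):
    """Per-record absolute difference between actual and predicted."""
    return [abs(a - p) for a, p in zip(actual, predicted)]


# Hypothetical (illustrative) claim amounts: mostly zero, one large loss.
actual = [0.0, 0.0, 500.0, 20000.0]
predicted = [300.0, 300.0, 400.0, 1500.0]

sq = squared_errors(actual, predicted)
ab = absolute_errors(actual, predicted)

mse = sum(sq) / len(sq)  # mean squared error
mae = sum(ab) / len(ab)  # mean absolute error

# Share of the total squared error that comes from the single largest miss.
share_of_largest = max(sq) / sum(sq)
print(mse, mae, share_of_largest)
```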

Residuals
- Raw residual = y_i - μ_i, where y_i is the actual value of the target variable and μ_i is the predicted value
- In simple linear regression, residuals are supposed to be Normally distributed, and departure from Normality indicates poor fit
- For insurance data, raw residuals are highly skewed and generally not useful

Likelihood
- The probability, as predicted by our model, that what actually did occur would occur
- A GLM calculates the parameters that maximize likelihood
- Higher likelihood means better model fit (in very simple terms)
- Problem with likelihood: adding a variable never lowers the likelihood on the training data, so likelihood alone always favors the larger model
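As an illustration of "the probability that what actually did occur would occur", the sketch below evaluates a Poisson log-likelihood for two sets of predictions. The Poisson choice, the claim counts, and the predicted means are all assumptions made for this example.

```python
import math


def poisson_log_likelihood(counts, means):
    """Log of the probability of the observed counts under Poisson means."""
    return sum(
        y * math.log(mu) - mu - math.lgamma(y + 1)
        for y, mu in zip(counts, means)
    )


# Hypothetical claim counts for five policies.
counts = [0, 1, 0, 2, 0]

# Two hypothetical sets of predictions: one at the sample mean (0.6),
# one that is too low for every policy.
ll_good = poisson_log_likelihood(counts, [0.6] * 5)
ll_poor = poisson_log_likelihood(counts, [0.3] * 5)

print(ll_good, ll_poor)  # the better-fitting predictions score higher
```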

AIC & BIC: penalized measures of fit
- Akaike Information Criterion (AIC) = -2*(Log-Likelihood) + 2*(Number of Parameters in Model)
- Bayesian Information Criterion (BIC) = -2*(Log-Likelihood) + (Number of Parameters in Model)*ln(Number of Records in Dataset)
- Good rule for deciding which variables to include: unless a variable improves (lowers) AIC or BIC, don't include it
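The two formulas above translate directly into code. In this sketch the log-likelihoods, parameter counts, and record count are hypothetical; lower AIC or BIC is better, so here the extra variable does not earn its place.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: -2*LL + 2*k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_records):
    """Bayesian Information Criterion: -2*LL + k*ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_records)


# Hypothetical fits: adding one variable raises the log-likelihood
# only slightly, from -1250.0 to -1249.5.
n_records = 10000
aic_base = aic(-1250.0, 5)
aic_bigger = aic(-1249.5, 6)
bic_base = bic(-1250.0, 5, n_records)
bic_bigger = bic(-1249.5, 6, n_records)

# The variable is kept only if it lowers the criterion.
keep_by_aic = aic_bigger < aic_base
keep_by_bic = bic_bigger < bic_base
print(aic_base, aic_bigger, keep_by_aic, keep_by_bic)
```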

Deviance
- Saturated model: the model with the highest possible likelihood
  - One indicator variable for each record, so the model fits the data perfectly
- Deviance = 2*(log-likelihood of saturated model - log-likelihood of fitted model)
- GLMs minimize deviance
- Like squared error, but reflects the shape of the assumed distribution
- We generally fit skewed distributions to insurance data (Tweedie, gamma, etc.), and thus deviance is more appropriate than squared error

Deviance in Math Poisson: Gamma: Tweedie: Normal:
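For reference, the standard textbook (unscaled) deviances for these four distributions can be written as follows. The notation is an assumption of this sketch: w_i is the record weight, y_i the actual value, μ_i the predicted value, and p the Tweedie power parameter with 1 < p < 2; in the Poisson case y ln(y/μ) is taken as 0 when y = 0. Dividing by the dispersion parameter gives the scaled deviance, which corresponds to twice the log-likelihood difference.

```latex
\begin{align*}
\text{Poisson:}\quad D &= 2\sum_i w_i\left[\,y_i\ln\frac{y_i}{\mu_i} - (y_i-\mu_i)\right] \\
\text{Gamma:}\quad D &= 2\sum_i w_i\left[-\ln\frac{y_i}{\mu_i} + \frac{y_i-\mu_i}{\mu_i}\right] \\
\text{Tweedie:}\quad D &= 2\sum_i w_i\left[\frac{y_i^{2-p}}{(1-p)(2-p)} - \frac{y_i\,\mu_i^{1-p}}{1-p} + \frac{\mu_i^{2-p}}{2-p}\right] \\
\text{Normal:}\quad D &= \sum_i w_i\,(y_i-\mu_i)^2
\end{align*}
```

The Normal case reduces to weighted squared error, which is why deviance can be read as squared error adapted to the assumed distribution.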

Deviance Residuals
- Square root of each record's (weighted) deviance contribution, times the sign of actual minus predicted
- Measures the amount by which the model missed, but reflects the assumed distribution
- Should be approximately Normally distributed; a far departure from Normality indicates that an incorrect distribution has been chosen
- Ideally, there should be no discernible pattern in the deviance residuals
  - The model should miss randomly, not systematically
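A minimal sketch of the deviance residual for a single record, assuming a gamma distribution and the standard gamma unit deviance. The function names and the severities used are illustrative only.

```python
import math


def gamma_unit_deviance(y, mu):
    """Standard gamma unit deviance for one record (y > 0, mu > 0)."""
    return 2.0 * (-math.log(y / mu) + (y - mu) / mu)


def deviance_residual(y, mu, weight=1.0):
    """Signed square root of the record's weighted deviance contribution."""
    d = weight * gamma_unit_deviance(y, mu)
    sign = 1.0 if y >= mu else -1.0
    # max() guards against a tiny negative value from floating-point rounding.
    return sign * math.sqrt(max(d, 0.0))


# Hypothetical severities against a predicted mean of 100.
r_over = deviance_residual(200.0, 100.0)   # actual above predicted
r_exact = deviance_residual(100.0, 100.0)  # actual equals predicted
r_under = deviance_residual(50.0, 100.0)   # actual below predicted
print(r_over, r_exact, r_under)
```

A histogram of these residuals across all records, or a scatter plot of them against the predicted values, is what the diagnostics below examine.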

Deviance Residual Diagnostics
- Histogram of deviance residuals: look for approximate Normality (bell shape)
  - Far departure from Normality generally indicates that an incorrect distribution has been chosen
  - Can also indicate poor fit
- Scatter plot of deviance residuals versus predicted target variable
  - Should be an uninformative cloud
  - A pattern in this plot indicates an incorrect distribution

Example: Selecting Severity Model
- Goal is to select a distribution to model severity
- Two common choices: Gamma and Inverse Gaussian
  - Gamma: V(μ) = μ^2, so the variance of severity is proportional to mean severity squared
  - Inverse Gaussian: V(μ) = μ^3, so the variance of severity is proportional to mean severity cubed
- Two lines of business
  - LOB1 is high-frequency, low-severity
  - LOB2 is low-frequency, high-severity

Deviance Residual Histogram LOB1, Gamma GLM

Deviance Residual Histogram LOB1, IG GLM

Deviance Residual Histogram LOB2, Gamma GLM

Deviance Residual Histogram LOB2, IG GLM

Actual vs Predicted Target
- Scatter plot of actual target variable (on y-axis) versus predicted target variable (on x-axis)
- If the model fits well, the plot should produce a straight line, indicating close agreement between actual and predicted
- Focus on areas where the model seems to miss
- If there are many records, may need to bucket them (such as into percentiles)
- Depending on scale, may need to plot on a log-log scale
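The bucketing step can be sketched as follows: sort records by predicted value, split them into equal-count buckets, and average actual and predicted within each bucket. The function name and the toy data are assumptions for illustration; the resulting pairs are what would be plotted (optionally after taking logs of both averages).

```python
def bucket_actual_vs_predicted(actual, predicted, n_buckets=10):
    """Return (avg predicted, avg actual) per equal-count bucket,
    with records ordered by predicted value."""
    pairs = sorted(zip(predicted, actual))
    n = len(pairs)
    rows = []
    for b in range(n_buckets):
        lo = b * n // n_buckets
        hi = (b + 1) * n // n_buckets
        chunk = pairs[lo:hi]
        if not chunk:
            continue  # more buckets than records
        avg_pred = sum(p for p, _ in chunk) / len(chunk)
        avg_act = sum(a for _, a in chunk) / len(chunk)
        rows.append((avg_pred, avg_act))
    return rows


# Toy data in which the model is exactly right, so every bucket
# lies on the 45-degree line.
predicted = [float(x) for x in range(1, 21)]
actual = list(predicted)
rows = bucket_actual_vs_predicted(actual, predicted, n_buckets=4)
print(rows)
```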

Example of Actual vs Predicted

Example of Log of Actual vs Log of Predicted

Measuring Internal Stability
- Process of determining how robust model results are
- Getting a second opinion (and third and fourth and fifth) on how well the model performs
- Goals
  - Guard against overfitting
  - Select models that are more stable
  - Better understand inherent volatility of results

Validation 101: Assess model on holdout data
- Split data into training-test-validation
  - Training: build model
  - Validation: assess stability of model
  - Lift: calculate model lift
- Why is it more complex than this?
  - Randomly splitting data doesn't necessarily guard against overfitting
  - Data may be too thin for such a rigid split
  - Doesn't provide a great diversity of opinions

Overfitting can happen if models aren't validated out-of-time
- The same storm hits all homes in an area, the same bad winter impacts auto claims in a region, etc.
- Through out-of-time validation, we can help guard against overfitting
- How do we use this? With five years of data (Year 1 through Year 5):
  - Training: Year 1, Year 2, Year 3
  - Validation: Year 4
  - Lift: Year 5
- Examine model fit on validation set
  - If reasonable
  - If not
- Determining reasonableness is often more art than science
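The year-based split above can be sketched as a simple partition on a year field. The record layout (a dict with a "year" key) and the function name are assumptions made for this example.

```python
def out_of_time_split(records, training_years, validation_years, lift_years):
    """Partition records by year rather than at random."""
    split = {"training": [], "validation": [], "lift": []}
    for record in records:
        year = record["year"]
        if year in training_years:
            split["training"].append(record)
        elif year in validation_years:
            split["validation"].append(record)
        elif year in lift_years:
            split["lift"].append(record)
    return split


# Hypothetical policy records spread evenly over Years 1 to 5.
records = [{"policy_id": i, "year": 1 + i % 5} for i in range(10)]
split = out_of_time_split(records, {1, 2, 3}, {4}, {5})
print({name: len(rows) for name, rows in split.items()})
```

Because an event such as a storm affects many records in the same year, holding out whole years keeps correlated records from appearing on both sides of the split.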

Example of Plot of Actual vs Predicted on Holdout
[Figure: two panels, Training and Validation (Out of Time)]

Cross-validation is very useful when data is thin and can give us more confidence in results
- Split data into subsets
- Refit model on each subset and compare results across subsets
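One common way to do this is k-fold cross-validation, sketched below: the model is refit on all folds but one and scored on the fold left out, and the scores are compared across folds. This is the standard k-fold variant and may differ in detail from the procedure the slide has in mind. The "model" here is just the training mean and the score is mean absolute error, both stand-ins chosen for illustration.

```python
import random


def k_fold_indices(n_records, k, seed=0):
    """Shuffle record indices and deal them into k folds."""
    idx = list(range(n_records))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(records, folds, fit, score):
    """Refit on all folds but one, score on the held-out fold."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [r for i, r in enumerate(records) if i not in held]
        test = [records[i] for i in held_out]
        model = fit(train)
        scores.append(score(model, test))
    return scores


# Stand-in model and score, for illustration only.
def fit_mean(train):
    return sum(train) / len(train)


def mean_abs_error(model, test):
    return sum(abs(y - model) for y in test) / len(test)


records = [float(x) for x in range(1, 21)]
folds = k_fold_indices(len(records), k=5)
scores = cross_validate(records, folds, fit_mean, mean_abs_error)
print(scores)  # a wide spread across folds suggests unstable results
```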

Bootstrapping
- Re-sampling technique that allows us to get more out of our data
- Start with a dataset and sample from it with replacement
  - Some records will get pulled multiple times, and some will not get pulled at all
  - Generally, we create a dataset with the same number of records as our original dataset
- Can create many bootstrap datasets, and each dataset can be thought of as an alternate reality
- Since each bootstrap is an alternate reality, we can use bootstrapping to construct confidence intervals and get more opinions on model performance
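Sampling with replacement takes only a few lines. This sketch uses integer IDs as stand-in records and a fixed seed for repeatability; it also counts how many records were pulled more than once and how many were never pulled.

```python
import random
from collections import Counter


def bootstrap_sample(records, rng):
    """Draw len(records) records with replacement."""
    n = len(records)
    return [records[rng.randrange(n)] for _ in range(n)]


rng = random.Random(42)
records = list(range(1000))  # stand-in record IDs
sample = bootstrap_sample(records, rng)

times_pulled = Counter(sample)
never_pulled = len(records) - len(times_pulled)
pulled_more_than_once = sum(1 for c in times_pulled.values() if c > 1)
print(len(sample), never_pulled, pulled_more_than_once)
```

Repeating the draw with different random states produces the many "alternate reality" datasets on which a model or a metric can be re-evaluated.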

We can use bootstrapping to put confidence intervals around lift measures
1. Understand how significant the victory is
   - Model A currently in production, with Gini of 35.4
   - Challenger Model B has Gini of 36.9
   - Should we implement Model B?
2. Better understanding of uncertainty
   - New model expected to generate $1M in additional revenue in first 3 months
   - Actual revenue is $850K
   - Did model fail, or is this normal variation?
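For the first question, one possible approach is a percentile bootstrap interval on the Gini difference between the two models. The sketch below assumes equal exposure per record, a Lorenz-curve definition of Gini (on a 0 to 1 scale, so 35.4 would be 0.354), and simulated data; none of these details come from the slides.

```python
import random


def gini(actual, predicted):
    """Gini index from the Lorenz curve of actual losses,
    with records ordered by predicted value (equal exposure assumed)."""
    order = sorted(range(len(actual)), key=lambda i: predicted[i])
    total = sum(actual)
    n = len(actual)
    cum = 0.0
    area = 0.0
    for i in order:
        prev = cum
        cum += actual[i] / total
        area += (prev + cum) / 2.0 / n  # trapezoid of width 1/n
    return 1.0 - 2.0 * area


def bootstrap_gini_difference(actual, pred_a, pred_b,
                              n_boot=500, level=0.90, seed=0):
    """Percentile interval for Gini(B) - Gini(A) over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(actual)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        act = [actual[i] for i in idx]
        if sum(act) == 0:
            continue  # Gini undefined when the resample has no losses
        g_a = gini(act, [pred_a[i] for i in idx])
        g_b = gini(act, [pred_b[i] for i in idx])
        diffs.append(g_b - g_a)
    diffs.sort()
    lo = diffs[int((1.0 - level) / 2.0 * len(diffs))]
    hi = diffs[int((1.0 + level) / 2.0 * len(diffs)) - 1]
    return lo, hi


# Simulated portfolio: model B tracks the true mean more closely than A.
rng = random.Random(7)
true_mean = [rng.uniform(100.0, 1000.0) for _ in range(300)]
actual = [rng.gammavariate(2.0, m / 2.0) for m in true_mean]
pred_a = [m * rng.uniform(0.5, 1.5) for m in true_mean]
pred_b = [m * rng.uniform(0.8, 1.2) for m in true_mean]

lo, hi = bootstrap_gini_difference(actual, pred_a, pred_b)
print(lo, hi)  # an interval that includes 0 means the win is not clear-cut
```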
