Raising Your Actuarial IQ (Improving Information Quality)

CAS Data Management Educational Materials Working Party, with Martin E. Ellingsworth

Actuarial IQ: Introduction
- IQ stands for Information Quality.
- An introduction to data quality and data management, being written by the CAS Data Management Educational Materials Working Party.
- Directed at actuarial analysts as much as actuarial data managers: what every actuary should know about data quality and data management.

Working Party Publications
- Book reviews of data management and data quality texts in the Actuarial Review, starting with the August 2006 edition.
- These reviews are combined and compared in "Survey of Data Management and Data Quality Texts," CAS Forum, Winter 2007, www.casact.org.
- This presentation is based on our upcoming paper, "Actuarial IQ (Information Quality)," to be published in the Winter 2008 edition of the CAS Forum.

What is Data Quality?
- Quality data is data that is appropriate for its purpose.
- Quality is a relative, not an absolute, concept.
- Data for an annual rate study may not be appropriate for a class relativity analysis.
- Promising predictor variables in predictive modeling may not have been coded or processed with that purpose in mind.

Introduction and Horror Stories
Presented by Aleksey Popelyukhin

Data Flow
- Information quality involves all steps of the data flow.
- [Flow diagram: only the labels "Collection", "Actuarial", "To improve" and "Making" survive in the transcript.]

Principles on Data Quality: Perspectives
- ASB: ASOP No. 23, Data Quality
- CAS Data Management and Information Committee: White Paper on Data Quality
- Richard T. Watson: Data Management: Databases and Organizations

ASOP No. 23
Due consideration to the following:
- Appropriateness for intended purpose
- Reasonableness and comprehensiveness
- Any known, material limitations
- The cost and feasibility of obtaining alternative data
- The benefit to be gained from an alternative data set
- Sampling methods

White Paper on Data Quality
Evaluating data quality consists of examining data for:
- Validity
- Accuracy
- Reasonableness
- Completeness

Watson: 18 Dimensions of Data Quality
- Many overlap with the previously mentioned principles.
- Others describe ways of storing data, e.g., representational consistency, precision.
- Others go beyond data characteristics to processing and management, e.g., stewardship, sharing, timeliness, interpretation.

Redman: Manage Information Chain
- Establish management responsibilities
- Describe information chart
- Understand customer needs
- Establish measurement system
- Establish control and check performance
- Identify improvement opportunities
- Make improvements

Data Quality Measurement
- Quantify traditional aspects of quality data, such as accuracy, consistency, uniqueness, timeliness and completeness, using a score assigned by an expert.
- Measure the consequences of data quality problems: count the number of times in a sample that data quality errors cause errors in analyses, and the severity of those errors.
- Use measurement to motivate improvement.
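The two measurement ideas on the last slide above (scoring a traditional aspect such as completeness, and measuring how often data errors lead to errors in analyses) can be sketched in a few lines. This is only an illustration: the field names, the audit sample and the severity scale are hypothetical and do not come from the working party's paper.

```python
def completeness(records, field):
    """Share of records in which `field` is populated (not None or blank)."""
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def consequence_rate(audit_sample):
    """Share of sampled analyses in which a data error caused an analysis
    error, and the average severity of those errors.

    Each item uses hypothetical keys: 'caused_error' (bool) and
    'severity' (a number assigned by the reviewer).
    """
    hits = [a for a in audit_sample if a["caused_error"]]
    rate = len(hits) / len(audit_sample)
    avg_severity = sum(a["severity"] for a in hits) / len(hits) if hits else 0.0
    return rate, avg_severity


# Hypothetical data, for illustration only.
records = [
    {"age": 35, "gender": "F"},
    {"age": None, "gender": "M"},
    {"age": 52, "gender": ""},
    {"age": 41, "gender": "M"},
]
audit_sample = [
    {"caused_error": True, "severity": 3},
    {"caused_error": False, "severity": 0},
    {"caused_error": True, "severity": 1},
    {"caused_error": False, "severity": 0},
]

age_completeness = completeness(records, "age")
rate, avg_severity = consequence_rate(audit_sample)
```

Tracking numbers like these over time is one way to use measurement to motivate improvement.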

Metadata
- A big help in describing data: metadata!
- Data that describes the data
- Key data management tool
- Reduces risky assumptions: does CWP mean Closed with Payment or Closed without Payment?

Example: What is the Marital Status Variable?
What is in the Marital Status variable? Single? Married? Polygamist?

Marital Status   Frequency   Percent
(blank)              5,053      14.3
1                    2,043       5.8
2                    9,657      27.4
4                        2       0
D                        4       0
M                    2,971       8.4
S                   15,554      44.1
Total               35,284     100

Does S mean Single or Separated?

Example of Metadata: Marital Status

Value   Description
1       Married, data from source 1, straight move of field ms_code
2       Single, data from source 1, straight move of field ms_code
4       Divorced, data from source 1, straight move of field ms_code
D       Divorced, data from source 2, straight move of mstatus
M       Married, data from source 2, straight move of mstatus
S       Single, data from source 2, straight move of mstatus
Blank   Marital status is missing
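Once metadata like the Marital Status table above is available, the duplicate codes can be combined mechanically. The sketch below uses the codes and frequencies shown on the slides. Representing the blank code as an empty string and labelling unknown codes "Undocumented" are assumptions made for this illustration.

```python
from collections import Counter

# Metadata from the slide: raw code -> documented meaning.
# The blank code is represented as an empty string (an assumption).
MARITAL_STATUS_METADATA = {
    "1": "Married",
    "2": "Single",
    "4": "Divorced",
    "D": "Divorced",
    "M": "Married",
    "S": "Single",
    "": "Missing",
}

# Raw frequencies from the slide's frequency table.
raw_counts = {"": 5053, "1": 2043, "2": 9657, "4": 2, "D": 4, "M": 2971, "S": 15554}


def standardize(counts, metadata):
    """Combine the counts of raw codes that share one documented meaning."""
    combined = Counter()
    for code, n in counts.items():
        # Codes absent from the metadata are flagged rather than guessed at.
        combined[metadata.get(code, "Undocumented")] += n
    return dict(combined)


standardized = standardize(raw_counts, MARITAL_STATUS_METADATA)
```

With the metadata applied, the seven raw codes collapse to four categories, and the 5,053 blank records are explicitly identified as missing.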

What Is In It?
- Business rules
- Processing rules
- Report compilation and extraction process
- Other

What Is In It? Business Rules
Data elements:
- Definition of field, e.g., how claims are defined, how exposure is calculated
- Format of field, e.g., mm/dd/yyyy or #,##0.00
- Valid values and interdependencies, e.g., alpha only; Driver = Yes and Age > 15

What Is In It? Processing Rules
- How the database is populated
- Sources of data
- Handling of missing data
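The three kinds of business rule named above (field format, valid values, interdependencies) translate directly into record-level checks. The sketch below is illustrative only: the field names `loss_date`, `state`, `driver` and `age` are hypothetical, and only the rule patterns (mm/dd/yyyy, alpha only, Driver = Yes and Age > 15) come from the slide.

```python
from datetime import datetime


def check_record(record):
    """Return a list of business-rule violations for one record."""
    problems = []

    # Format rule: dates must follow mm/dd/yyyy.
    try:
        datetime.strptime(record["loss_date"], "%m/%d/%Y")
    except ValueError:
        problems.append("loss_date is not a valid mm/dd/yyyy date")

    # Valid-values rule: the field may contain letters only.
    if not record["state"].isalpha():
        problems.append("state is not alpha only")

    # Interdependency rule: a driver must be older than 15.
    if record["driver"] == "Yes" and not record["age"] > 15:
        problems.append("driver = Yes requires age > 15")

    return problems


# Hypothetical records, for illustration only.
good = {"loss_date": "03/15/2006", "state": "CT", "driver": "Yes", "age": 34}
bad = {"loss_date": "2006-03-15", "state": "C7", "driver": "Yes", "age": 12}

print(check_record(good))
print(check_record(bad))
```

Keeping the rules in one function (or one table) also documents them, which is itself a form of metadata.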

What Is In It? Report Compilation and Extraction Process
- How data is selected or bypassed:
  - Fiscal period
  - Accounting date for transactions
  - Actuarial evaluation date
- Calculations
- Mappings

What Is In It? Other
- Process flow documentation
- Versioning

Why Do Actuaries Need Metadata?
- Better analysis
- Avoid being misinformed about a variable and what it represents
- Did anything change during the experience period?
- It helps only if you:
  - Ask to receive it
  - Actually compare metadata lists / files
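Comparing metadata lists is easy to automate once each version is held as a simple mapping. The sketch below assumes two hypothetical versions of the metadata for a claim-status field. The codes and descriptions are invented for illustration and echo the CWP ambiguity mentioned earlier.

```python
def compare_metadata(old, new):
    """Report codes added, removed, or redefined between two metadata versions."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return added, removed, changed


# Hypothetical metadata at two points in the experience period.
metadata_2005 = {"O": "Open", "CWP": "Closed with payment"}
metadata_2006 = {"O": "Open", "CWP": "Closed without payment", "R": "Reopened"}

added, removed, changed = compare_metadata(metadata_2005, metadata_2006)
```

Here the comparison shows that the meaning of CWP changed during the experience period, which is exactly the kind of change that would distort an analysis if it went unnoticed.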

Example of Metadata
- Statistical plans in the P/C industry
- General reporting
- Data element definitions
- Standardize to the extent possible

Data Collection: Data Supplier Management
- Let suppliers know what you want.
- Provide feedback to suppliers.
- Balance the following:
  - Known issues with the supplier
  - Importance to the business
  - Supplier willingness to experiment together
  - Ease of meeting face to face

In this step, data are put into standardized structures and then combined into larger, more centralized data sets. Actuarial IQ introduces two ways to improve IQ in this step:
- Exploratory Data Analysis (EDA)
- Data audits

EDA: Preprocessing

EDA: Overview
- Typically the first step in analyzing data
- Purpose:
  - Find outliers and errors
  - Explore the structure of the data
- Uses simple statistics and graphical techniques
- Examples for numeric data include histograms, descriptive statistics and frequency tables

EDA: Histograms
[Histogram: frequency (0 to 25,000) by License Year, with x-axis ticks at 600, 900, 1200, 1500 and 1800.]
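A histogram of a date-like field makes keying errors visible at a glance, which is what the License Year example above illustrates. The sketch below is a text-only version with hypothetical data. The bin width and the 1900 cutoff used to flag suspect bins are assumptions chosen for the illustration.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count values per bin; each bin is labelled by its lower edge."""
    return Counter((v // bin_width) * bin_width for v in values)


# Hypothetical license years; 1066 and 198 stand in for keying errors.
license_years = [1986, 1996, 2000, 1998, 1066, 1975, 198, 2001, 1993]

bins = histogram(license_years, bin_width=100)
for lower in sorted(bins):
    print(f"{lower:>5} | {'*' * bins[lower]}")

# Sparse bins far from the bulk of the data are candidates for review.
# The 1900 cutoff is an illustrative assumption.
suspect_bins = sorted(b for b in bins if b < 1900)
```

The same binning feeds any plotting library if a graphical histogram is preferred.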

EDA: Descriptive Statistics

Statistic            Policyholder Age
Mean                             36.9
Standard Error                    0.1
Median                           35.0
Mode                             32.0
Standard Deviation               13.2
Sample Variance                 174.4
Kurtosis                          0.5
Skewness                          0.7
Range                              84
Minimum                            16
Maximum                           100
Sum                         1,114,357
Count                          30,226
Largest(2)                        100
Smallest(2)                        16

EDA: Categorical Data

EDA: Data Cubes
- Usually frequency tables
- Example: search for missing gender values

Gender    Frequency   Percent
(blank)       5,054      14.3
F            13,032      36.9
M            17,198      48.7
Total        35,284     100
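Most of the descriptive statistics listed above can be produced with Python's standard `statistics` module. The sketch below uses a small hypothetical list of ages. Kurtosis and skewness are omitted because the standard library does not provide them.

```python
import statistics


def describe(values):
    """Descriptive statistics similar to those listed on the slide."""
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return {
        "mean": statistics.mean(values),
        "standard_error": sd / n ** 0.5,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "standard_deviation": sd,
        "sample_variance": statistics.variance(values),
        "range": max(values) - min(values),
        "minimum": min(values),
        "maximum": max(values),
        "sum": sum(values),
        "count": n,
    }


# Hypothetical policyholder ages, for illustration only.
ages = [16, 32, 32, 35, 41, 100]
summary = describe(ages)
```

A maximum of 100 next to a median in the mid-30s is the kind of value that deserves a second look, whether it turns out to be a real policyholder or a default code.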

EDA: Data Cubes
Example: identify inconsistent coding of marital status
- Missing values
- Multiple codes for the same status
- Underutilized data elements?

Marital Status   Frequency   Percent
(blank)              5,053      14.3
1                    2,043       5.8
2                    9,657      27.4
4                        2       0
D                        4       0
M                    2,971       8.4
S                   15,554      44.1
Total               35,284     100

EDA: Missing Data

                BUSINESS TYPE   Gender      Age   License Year
N Valid                35,284   35,284   30,242         30,250
N Missing                   0        0    5,042          5,034
Percentile 25                             27.00       1,986.00
Percentile 50                             35.00       1,996.00
Percentile 75                             45.00       2,000.00

EDA: Summary
- Before data is analyzed, it must be:
  - Gathered
  - Cleaned
  - Integrated
- EDA techniques are used to explore the data: to detect missing values, to identify invalid values and to highlight outliers.
- Use histograms, descriptive statistics and frequency tables.
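A valid/missing count per field, like the one in the missing-data table above, takes only a few lines. In the sketch below the records are hypothetical, and treating both `None` and blank strings as missing is a choice made for the illustration, not necessarily the presenters' method. On the slide, Gender shows no missing values even though the gender frequency table lists 5,054 blank entries, which suggests the blank was counted there as a valid value.

```python
def valid_and_missing(records, fields):
    """Count valid and missing values per field.

    None and blank strings both count as missing (an assumption).
    """
    result = {}
    for field in fields:
        missing = sum(1 for r in records if r.get(field) in (None, ""))
        result[field] = {"valid": len(records) - missing, "missing": missing}
    return result


# Hypothetical records, for illustration only.
records = [
    {"gender": "F", "age": 27, "license_year": 1986},
    {"gender": "M", "age": None, "license_year": 1996},
    {"gender": "", "age": 45, "license_year": None},
    {"gender": "M", "age": 35, "license_year": 2000},
]

counts = valid_and_missing(records, ["gender", "age", "license_year"])
```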

Data Audits
- ASOP No. 23 does not require actuaries to audit data, but audits are good to understand.
- Main idea: compare the data intended for use to its original source, e.g., policy applications or notices of loss.
- Top-down: check that totals from one source match the totals from a reliable source.
- Bottom-up: follow a sample of input records through all the processing to the final report.

Quality: Models
On its way to results, data can be:
- Rejected: wrong format
- Underutilized: wrong model
- Distorted: wrong model
Parameterization is a crucial component in the overall process quality.

- Model design quality
- Implementation quality
- Testing and documentation
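The top-down check described above amounts to a reconciliation between totals built from detail records and totals from a control source. The sketch below is a minimal illustration. The field names, the ledger figures and the tolerance are hypothetical.

```python
def top_down_check(detail_records, control_totals, key, amount, tolerance=0.01):
    """Compare totals built from detail records with control-source totals.

    Returns {key: detail total minus control total} for every key whose
    difference exceeds the tolerance.
    """
    totals = {}
    for r in detail_records:
        totals[r[key]] = totals.get(r[key], 0.0) + r[amount]

    differences = {}
    for k in totals.keys() | control_totals.keys():
        diff = totals.get(k, 0.0) - control_totals.get(k, 0.0)
        if abs(diff) > tolerance:
            differences[k] = round(diff, 2)
    return differences


# Hypothetical claim detail and ledger totals, for illustration only.
claims = [
    {"accident_year": 2004, "paid": 1200.00},
    {"accident_year": 2004, "paid": 800.00},
    {"accident_year": 2005, "paid": 500.00},
]
ledger = {2004: 2000.00, 2005: 650.00}

differences = top_down_check(claims, ledger, "accident_year", "paid")
```

Any key that appears in the result is a starting point for a bottom-up trace of individual records.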

Model Design Quality
- Model selection and validation
- Parameter estimation
- Verification
- Model performance
- Did I use the right model?
- Did I use the model right?

Model Performance
- Models predict observable events.
- Outcomes can be compared to predictions, leading to:
  - Model improvements
  - Model recalibration
  - Model rejection
- These lead in turn to higher process quality.
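Comparing outcomes to predictions can be as simple as tracking the average error (bias) and the average absolute error. The sketch below is one possible illustration. The data, the thresholds and the mapping from error size to an action ("keep", "recalibrate", "reject") are assumptions, not the presenters' method.

```python
def performance(predicted, actual):
    """Mean error (bias) and mean absolute error of predictions vs outcomes."""
    errors = [a - p for p, a in zip(predicted, actual)]
    bias = sum(errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return bias, mae


def recommend(bias, mae, scale, recalibrate_at=0.05, reject_at=0.25):
    """Map relative error to an action; both thresholds are illustrative."""
    if mae / scale > reject_at:
        return "reject"
    if abs(bias) / scale > recalibrate_at:
        return "recalibrate"
    return "keep"


# Hypothetical predictions and observed outcomes.
predicted = [100.0, 110.0, 120.0, 130.0]
actual = [108.0, 121.0, 127.0, 140.0]

bias, mae = performance(predicted, actual)
action = recommend(bias, mae, scale=sum(actual) / len(actual))
```

In this example every prediction is too low by a similar amount, so the errors point to a systematic bias that recalibration could address rather than to a wrong model.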

Implementation Quality
- Programming languages (C++, VBA, SQL): many books on good design patterns
- Formulae in a spreadsheet are also programming: no books on good design patterns
- Need good software design to simplify:
  - Usage
  - Testing
  - Modifications / improvements
  - Recovery (side benefit)

Implementation Quality: Separation of Data and Algorithms
- Data does not belong in the template.
- [...] do not belong to the template either.

Implementation Quality: Layering
- Simplifies navigation
- Optimizes workflow
- Shortens the learning curve
- Each step on its own tab

Testing and Documentation
- Validation: black-box treatment, comparing results with correct ones
- Verification: inside-the-box treatment, checking formulae

1. Should be an integral part of development
2. Should be performed by outsiders
3. Should be well documented

[Diagram; surviving labels, order uncertain: Database Documenter, diagram builder, Excel's CTRL ~ (displays formulae), external source definition, structured comments, attribute texts.]

Testing and Documentation: Self-Documenting Features
- Database Documenter
- Excel's named ranges and expressions
- External range definitions
- Structured comments

[Screenshot: a self-documented loss triangle for State CT and LOB WC, with accident years 1994 to 1998 by age (12, 24, 36, 48, 60), annotated with the attributes Shape: Triangle, Amount: Losses and Cumulative: True. The cell values cannot be recovered in order from the transcript.]

Testing and Documentation: Version Management
- One can link document properties to spreadsheet cells.

Testing and Documentation: Documenting Workflow
- Smart diagrams can be automated.

Working Example
Presented by Martin Ellingsworth

PWC 2004 Study
"The key is to understand the impact data is having on your business and do something about it. Data quality is at the core: if you improve your data you will directly impact your overall business results."
Global Data Management Survey 2004, PricewaterhouseCoopers

Conclusion
- Data quality is a core issue affecting the quality and usefulness of the actuarial work product.
- Data quality is not just about how data is coded: the phrase "information quality" is coined to emphasize the impact of processes on the quality of the final product.

Conclusion: Ways to Improve Actuarial IQ
- Applying data quality principles
- Defining and using metadata
- Measuring data quality to track progress and awareness of quality audit
- Utilizing Exploratory Data Analysis to identify outliers and explore the structure of a dataset
- Testing the quality of actuarial models
- Clarifying actuarial presentations and reports
- Employing actuarial data management best practices

Conclusion: Expansions of the Actuarial Frame of Reference
- Data is a corporate asset that needs to be managed, and actuaries can play a role.
- Data needs to be appropriate for all of its intended uses.
- Expansion of data quality principles to support these broader perspectives.

Acknowledgement
The working party would like to thank the Insurance Data Management Association (www.idma.org) for their help in:
- Developing a shortlist of texts that would be relevant to actuaries, and
- Reviewing our papers

Author, Author
This presentation is a publication of the CAS Data Management and Information Educational Materials Working Party:
- Keith P. Allen
- Robert Neil Campbell, Chairperson
- Louise A. Francis
- David Dennis Hudson
- Gary W. Knoble
- Rudy A. Palenik
- Aleksey Popelyukhin, Ph.D.
- Virginia R. Prevosto
- Lijuan Zhang
