Estimating Parameters for Incomplete Data. William White

Insurance Agent. Auto insurance agency task. Claims in a week: 294 340 384 457 680 855 974 1193 1340 1884 2558 9743. Boss, is this a good representation of the population?

Insurance Agent. Things to think of: How should it look? The distribution should be skewed right. Claims: 294 340 384 457 680 855 974 1193 1340 1884 2558 9743. [Figure: frequency of claims plotted against $ per claim.]

Insurance Agent. Exponential distribution. [Figure: exponential density curves for parameter values 1 and .0001; the parameter symbol and axis layout were lost in extraction.]

Insurance Agent. How can we estimate the value of the parameter? Find an estimator. What is an estimator? It uses sample data to find approximations of the actual population parameters.

Estimator. What do we need to look for? Consistent: the estimator value converges to the population value as the sample size grows. [Figure: estimate error around the true parameter, plotted against sample size.]

Estimator. What do we need to look for? Efficient: for a fixed sample size, there is less variability in the estimator. For example, with normally distributed data, sample means have less variability than sample medians. [Figure: sample mean compared with sample median.]

Estimator. What do we need to look for? Unbiased: the expected value of the estimator equals the population parameter, so across many repeated samples the estimates center on the true value. [Figure: estimates around the true parameter, plotted against sample size.]
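
The efficiency comparison can be checked by simulation. The sketch below is only illustrative: it assumes a standard normal population, 25 observations per sample, and 2000 repeated samples, none of which come from the slides. The function name `estimator_spread` is hypothetical.

```python
import random
import statistics

def estimator_spread(n_samples=2000, sample_size=25, seed=0):
    """Return (variance of sample means, variance of sample medians)."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(n_samples):
        # Assumption: standard normal population, chosen only for illustration.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return statistics.pvariance(means), statistics.pvariance(medians)

var_mean, var_median = estimator_spread()
print(f"variance of sample means:   {var_mean:.4f}")
print(f"variance of sample medians: {var_median:.4f}")
```

Under these assumptions the sample means should show the smaller variance, which is what "more efficient" means here.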
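
Unbiasedness can also be illustrated by simulation. This sketch assumes an exponential population with a true mean of 1000 (a made-up value) and uses the sample mean as the estimator; the sample size of 12 mirrors the number of claims, and `average_estimate` is a hypothetical helper.

```python
import random
import statistics

def average_estimate(true_mean=1000.0, sample_size=12, n_samples=5000, seed=1):
    """Average the sample-mean estimate over many repeated samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_samples):
        # random.expovariate takes the rate, which is 1 / mean.
        sample = [rng.expovariate(1.0 / true_mean) for _ in range(sample_size)]
        estimates.append(statistics.mean(sample))
    return statistics.mean(estimates)

avg = average_estimate()
print(f"average of the estimates: {avg:.1f} (true mean 1000.0)")
```

Each individual estimate can be far from 1000, but their average should land close to it.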

Maximum Likelihood Estimator. Sir Ronald A. Fisher (1890-1962). The maximum likelihood estimator (MLE) was developed to solve the problem of estimation. Written in 1912; completed in 1922.

Maximum Likelihood Estimator. Characteristics of the MLE. Very versatile: applies to most types of data. Simple: can be very efficient with little calculation.

Maximum Likelihood Estimator. Uses the likelihood function, which measures how probable it is to obtain the sample results that were actually obtained. For independent random variables, it is the product of their probability density functions (pdfs).

Maximum Likelihood Estimator. [Figure: plot with a Probability axis.]

Maximum Likelihood Estimator. Likelihood function. Sample data (claims): 294 340 384 457 680 855 974 1193 1340 1884 2558 9743. What parameter value is most likely for our sample? If we knew [expression missing]. Note: each factor is a probability density, not a probability.

Maximum Likelihood Estimator. Likelihood function. Probability density function. Our samples are identically distributed. Restated: if we had a value for the parameter, what is the likelihood that we would get this sample set? Because the events are independent of each other, the likelihood is the product of the individual densities. Note: each factor is a probability density, not a probability.

Maximum Likelihood Estimator. Likelihood function. Note: each factor is a probability density, not a probability.

Maximum Likelihood Estimator. What makes our product maximized? [Figure: plot with a Probability axis.]

Maximum Likelihood Estimator. Loglikelihood function. Taking the product can be cumbersome; working with the logarithm is often easier because the logarithm of a product is a sum. Do logarithms change our evaluation? No: because the logarithm is an increasing function, the likelihood and the loglikelihood reach their maximum at the same parameter value.
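
A small numerical check of that claim, as a sketch only: it assumes the exponential density written with a rate parameter, `rate * exp(-rate * x)` (the slides' own symbol was lost), and searches a hypothetical grid of candidate rates rather than using calculus.

```python
import math

claims = [294, 340, 384, 457, 680, 855, 974, 1193, 1340, 1884, 2558, 9743]

def likelihood(rate, data):
    """Product of exponential densities rate * exp(-rate * x)."""
    return math.prod(rate * math.exp(-rate * x) for x in data)

def loglikelihood(rate, data):
    """Sum of log densities: log(rate) - rate * x for each observation."""
    return sum(math.log(rate) - rate * x for x in data)

# Hypothetical search grid: rates 0.00001, 0.00002, ..., 0.00200.
candidate_rates = [k * 1e-5 for k in range(1, 201)]

best_by_likelihood = max(candidate_rates, key=lambda r: likelihood(r, claims))
best_by_loglik = max(candidate_rates, key=lambda r: loglikelihood(r, claims))

print(best_by_likelihood, best_by_loglik)
```

Both searches pick the same grid point, the candidate closest to one over the sample mean.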

Maximum Likelihood Estimator Example using the Exponential Distribution

Maximum Likelihood Estimator

Maximum Likelihood Estimator. With calculus we can find the MLE by taking the derivative of the loglikelihood, setting it equal to 0, and solving for the parameter. (We can use the second derivative to check that it is a maximum.) Because this is our estimate for the population parameter, we are also concluding that the sample mean is an estimate for the population mean.

What Do We Think? Let's use our claims with the exponential distribution: sample mean = 1725.2.

What Do We Think? Why are there no claims below 294? Claims: 294 340 384 457 680 855 974 1193 1340 1884 2558 9743. [Figure: probability of claim plotted against $ per claim.]
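
The slide formulas did not survive extraction, so the following is a reconstruction under an assumption: the exponential density is written with a rate parameter $\lambda$ (a symbol chosen here, not taken from the slides), for a sample $x_1, \dots, x_n$.

```latex
\begin{align*}
f(x;\lambda) &= \lambda e^{-\lambda x}, \qquad x > 0 \\
L(\lambda) &= \prod_{i=1}^{n} \lambda e^{-\lambda x_i}
            = \lambda^{n} e^{-\lambda \sum_{i=1}^{n} x_i} \\
\ell(\lambda) &= \ln L(\lambda) = n \ln \lambda - \lambda \sum_{i=1}^{n} x_i \\
\ell'(\lambda) &= \frac{n}{\lambda} - \sum_{i=1}^{n} x_i = 0
  \quad\Longrightarrow\quad
  \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}} \\
\ell''(\lambda) &= -\frac{n}{\lambda^{2}} < 0
\end{align*}
```

The negative second derivative confirms a maximum. Since the mean of this distribution is $1/\lambda$, the estimated population mean is $1/\hat{\lambda} = \bar{x}$, the sample mean.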
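
The sample mean quoted on this slide can be reproduced directly from the twelve claims. The rate estimate in the sketch assumes the rate parameterization of the exponential distribution, which is an assumption about the slides' notation.

```python
claims = [294, 340, 384, 457, 680, 855, 974, 1193, 1340, 1884, 2558, 9743]

# For an exponential model, the MLE of the mean is the sample mean.
sample_mean = sum(claims) / len(claims)

# Assumed rate parameterization: rate = 1 / mean.
rate_hat = 1.0 / sample_mean

print(f"sample mean: {sample_mean:.1f}")
print(f"estimated rate: {rate_hat:.6f}")
```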

Deductible. We forgot there is a $250 deductible! No one is going to file a claim if the damage is not worth $250. Incomplete data (truncated): 10 12 16 17 22 25 27 33 35 39 45 47 53 57 65 71 81 89 99 103 115 122 139 140 156 185 194 225 243 294 340 384 457 680 855 974 1193 1340 1884 2558 9743

Incomplete Data. The MLE also works with incomplete data. Incomplete data occurs when specific observations are either lost or not recorded exactly. There are two types. Truncated data: observations are excluded entirely. Censored data: the number of observations is known, but the exact values of some observations are unknown.

Incomplete Data. Truncated data. Vehicle insurance with a deductible of $250: claims are filed only when the loss is greater than $250. 10 12 16 17 22 25 27 33 35 39 45 47 53 57 65 71 81 89 99 103 115 122 139 140 156 185 194 225 243 294 340 384 457 680 855 974 1193 1340 1884 2558 9743

Incomplete Data. This is an example of data that is truncated from below, or from the left, since the data below the set value, $250, is truncated. Data is truncated from above, or from the right, when values above a set value are excluded. [Figure: probability of claim against $ per claim, undefined below $250; axis marks at $250 and $5,000.]

Incomplete Data. Censored data. 10 12 16 17 22 25 27 33 35 39 45 47 53 57 65 71 81 89 99 103 115 122 139 140 156 185 194 225 243 294 340 384 457 680 855 974 1193 1340 1884 2558 9743. Policy limit: all values above $1,000 are set equal to $1,000. 10 12 16 17 22 25 27 33 35 39 45 47 53 57 65 71 81 89 99 103 115 122 139 140 156 185 194 225 243 294 340 384 457 680 855 974 1000 1000 1000 1000 1000

Incomplete Data. This example is censored from above, or from the right, since the data above the set value, $1,000, is censored. Data is censored from below, or from the left, when values below a set value are censored. [Figure: probability of claim against $ per claim, with the censoring point at $1,000; axis marks at $500 and $1,000.]

Incomplete Data. Estimate with deductible and policy limit: 294 340 384 457 680 855 974 1000 1000 1000 1000 1000. What are we estimating? We want an estimate for the complete data set, using only the truncated and censored data. Complete data: 10 12 16 17 22 25 27 33 35 39 45 47 53 57 65 71 81 89 99 103 115 122 139 140 156 185 194 225 243 294 340 384 457 680 855 974 1193 1340 1884 2558 9743. We want our estimate to be unbiased.

Incomplete Data. Estimating with incomplete data. Group X: modified values, the claim amount (capped at the policy limit). Group Y: modified values, the amount paid (claim amount minus the deductible). Group X: 294 340 384 457 680 855 974 1000 1000 1000 1000 1000. Group Y: 44 90 134 207 430 605 724 750 750 750 750 750
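
The two kinds of incompleteness can be reproduced from the complete data set on these slides. This is a minimal sketch; the constant names are illustrative, and truncation and censoring are applied separately here, as on the slides.

```python
losses = [10, 12, 16, 17, 22, 25, 27, 33, 35, 39, 45, 47, 53, 57, 65, 71,
          81, 89, 99, 103, 115, 122, 139, 140, 156, 185, 194, 225, 243,
          294, 340, 384, 457, 680, 855, 974, 1193, 1340, 1884, 2558, 9743]

DEDUCTIBLE = 250
POLICY_LIMIT = 1000

# Truncation from below: losses at or under the deductible never appear.
truncated = [x for x in losses if x > DEDUCTIBLE]

# Censoring from above: every loss is still counted, but large ones are
# recorded only as the policy limit.
censored = [min(x, POLICY_LIMIT) for x in losses]

print(len(losses), "losses in the complete data")
print(len(truncated), "losses remain after truncation:", truncated)
print(censored.count(POLICY_LIMIT), "losses are recorded at the policy limit")
```

Truncation shrinks the data set to the twelve received claims, while censoring keeps all 41 observations and hides the exact size of the five largest.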
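
Group X and Group Y follow mechanically from the twelve received claims. The sketch below applies the $1,000 policy limit and then subtracts the $250 deductible; the variable names are illustrative.

```python
received_claims = [294, 340, 384, 457, 680, 855, 974,
                   1193, 1340, 1884, 2558, 9743]

DEDUCTIBLE = 250
POLICY_LIMIT = 1000

# Group X: claim amounts, capped at the policy limit.
group_x = [min(x, POLICY_LIMIT) for x in received_claims]

# Group Y: amounts paid, i.e. the capped claim minus the deductible.
group_y = [x - DEDUCTIBLE for x in group_x]

print("Group X:", group_x)
print("Group Y:", group_y)
```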

Incomplete Data. [Figure: probability curve with axis marks 1, 250, 250+y, 750, and y.] Group Y: 44 90 134 207 430 605 724 750 750 750 750 750

Incomplete Data. Estimating with incomplete data. [Figure: probability curve with axis marks 1, 250, 250+y, 750, and y.]

Incomplete Data. Solving with incomplete data. [Figure: probability curve with axis marks 750 and y.]
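
The formulas on these slides were lost in extraction. The following reconstruction is a sketch under stated assumptions: an exponential model with rate $\lambda$ (a symbol assumed here), survival function $S(x) = e^{-\lambda x}$, deductible $250$, policy limit $1000$, and payments $y = x - 250$. Seven payments are exact and five are censored at $750$.

```latex
\begin{align*}
\text{exact payment } y_j:&\quad
  \frac{f(250 + y_j)}{S(250)}
  = \frac{\lambda e^{-\lambda(250 + y_j)}}{e^{-250\lambda}}
  = \lambda e^{-\lambda y_j} \\
\text{censored payment}:&\quad
  \frac{S(1000)}{S(250)}
  = \frac{e^{-1000\lambda}}{e^{-250\lambda}}
  = e^{-750\lambda} \\
L(\lambda) &= \Bigl(\prod_{j=1}^{7} \lambda e^{-\lambda y_j}\Bigr)
              \bigl(e^{-750\lambda}\bigr)^{5}
            = \lambda^{7} e^{-\lambda(2234 + 3750)}
            = \lambda^{7} e^{-5984\lambda} \\
\ell(\lambda) &= 7 \ln \lambda - 5984\lambda \\
\ell'(\lambda) &= \frac{7}{\lambda} - 5984 = 0
  \quad\Longrightarrow\quad
  \hat{\lambda} = \frac{7}{5984}, \qquad
  \frac{1}{\hat{\lambda}} = \frac{5984}{7} \approx 854.86
\end{align*}
```

Dividing by $S(250)$ accounts for the truncation at the deductible, and the censored payments contribute only the probability of exceeding the limit.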

What's Our Result? Boss, is this a good representation of the population? (Excel file.) What do we need to tell the boss? The estimated mean is $854.86. If we compare this to our complete data set mean, $565.05, we observe that our estimate is too high. This may mean that we have a considerable number of accidents below the deductible.

What's Our Result? The results show that it is a good representation of our received claims, but it is not a good representation of our population.

Incomplete Data. Why should we use the MLE? "One of the major attractions of this estimator is that it is almost always available. That is, if you can write an expression for the desired probabilities, you can execute this method. If you cannot write and evaluate an expression for probabilities using your model, there is no point in postulating that model in the first place because you will not be able to use it to solve your problem." (Klugman, Panjer, and Willmot)

Thanks! Dr. Troy Riggs, Project Advisor. Dr. Matt Lunsford, Seminar Instructor.

References
Klugman, Stuart A., Harry H. Panjer, and Gordon E. Willmot. Loss Models: From Data to Decisions. New York: John Wiley and Sons, Inc, 1998.
---. Loss Models: From Data to Decisions. 2nd ed. New York: John Wiley and Sons, Inc, 2004.
Myung, In Jae. "Tutorial on Maximum Likelihood Estimation." Journal of Mathematical Psychology. 47 (2003): 93.
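
Both figures on this slide can be reproduced in a few lines. The sketch assumes an exponential model, for which the MLE of the mean under a deductible and a policy limit is the total amount paid divided by the number of uncensored payments; the slides used an Excel file, so this is an equivalent calculation rather than the author's own.

```python
payments = [44, 90, 134, 207, 430, 605, 724, 750, 750, 750, 750, 750]
MAX_PAYMENT = 750  # policy limit 1000 minus deductible 250

# Payments equal to the maximum are censored; the rest are exact.
uncensored_count = sum(1 for y in payments if y < MAX_PAYMENT)

# Exponential MLE of the mean with truncated and censored data.
estimated_mean = sum(payments) / uncensored_count

complete_data = [10, 12, 16, 17, 22, 25, 27, 33, 35, 39, 45, 47, 53, 57,
                 65, 71, 81, 89, 99, 103, 115, 122, 139, 140, 156, 185,
                 194, 225, 243, 294, 340, 384, 457, 680, 855, 974, 1193,
                 1340, 1884, 2558, 9743]
complete_mean = sum(complete_data) / len(complete_data)

print(f"estimated mean from incomplete data: {estimated_mean:.2f}")
print(f"mean of the complete data set:       {complete_mean:.2f}")
```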