Estimation Procedure for Parametric Survival Distribution Without Covariates

The maximum likelihood estimates of the parameters of commonly used survival distributions can be obtained with SAS. Two procedures are available for fitting survival models:

1. PROC LIFEREG
2. PROC PHREG

PROC PHREG is more popular, but PROC LIFEREG is not obsolete. In fact, PROC LIFEREG does some things better than PROC PHREG, and it does other things that PROC PHREG cannot do at all. The greatest limitation of PROC LIFEREG is that it does not handle time-dependent covariates, something at which PROC PHREG excels.

It should be mentioned that:

- PROC PHREG only allows right censoring, while PROC LIFEREG handles right-, left-, and interval-censored data.
- PROC PHREG only gives nonparametric estimates of the survival function (which can be difficult to interpret).
- Certain hypotheses about the shape of the hazard function can be tested with PROC LIFEREG.
- PROC LIFEREG produces more efficient estimates (with smaller standard errors) than PROC PHREG if the shape of the survival function is known.
- In PROC PHREG, we have to create sets of dummy (indicator) variables in the DATA step to represent categorical data. PROC LIFEREG creates such variables automatically.

We discuss PROC LIFEREG in this chapter. Note that PROC PHREG performs semi-parametric regression analysis using a method known as partial likelihood. The reasons for using this method (and hence PROC PHREG) will become apparent in the following chapters.

Example: The remission times of 42 patients with acute leukemia were recorded in a clinical trial to assess the ability of 6-mercaptopurine (6-MP) to maintain remission. Each patient was randomized to receive 6-MP or a placebo. The study was terminated after one year. The remission times, in weeks, for the 21 patients who received 6-MP are (a plus sign marks a censored time):

6, 6, 6, 7, 10, 13, 16, 22, 23, 6+, 9+, 10+, 11+, 17+, 19+, 20+, 25+, 32+, 32+, 34+, 35+

Let t denote the survival time (exact or censored) and let C be a dummy variable with C = 0 if t is censored and C = 1 otherwise. Assume that the data have been saved in C:\Example.dat as a text file with two columns (t in the first column and C in the second), separated by space(s). The following SAS code for PROC LIFEREG can be used to obtain the maximum likelihood estimates of the parameters of the lognormal distribution for the observed survival data in C:\Example.dat.

    data B;
      infile 'C:\Example.dat';
      input t c;
    run;
    proc lifereg data=b;
      model t*c(0) = / covb d=lnormal;
    run;
    quit;

The class of regression models estimated by PROC LIFEREG is known as the accelerated failure time (AFT) model. What PROC LIFEREG actually estimates is a special case of the AFT model that is quite similar in form to an ordinary linear regression model. Let $T_i$ be a random variable denoting the event time for the $i$th individual in the sample. With no covariates, the model is

$$\log T_i = \beta_0 + \sigma \varepsilon_i \qquad (1)$$

where $\varepsilon_i$ is a random disturbance term and $\beta_0$ and $\sigma$ are parameters to be estimated. The estimated parameters of the lognormal distribution are $\hat{\mu}$ = Intercept and $\hat{\sigma}$ = Scale.
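As a cross-check outside SAS, the log-likelihood that model (1) implies for right-censored data can be written out directly. The following is a minimal Python sketch, not the algorithm PROC LIFEREG uses. It assumes normal disturbances (the lognormal case) and works on the log-time scale, as the SAS listing for this example does (dependent variable Log(t)). The names `loglik_lognormal`, `norm_cdf`, `times`, and `events` are illustrative. An uncensored time contributes the density of log t, and a censored time contributes the survival probability.

```python
import math

# Remission times (weeks) for the 6-MP group; event = 1 if observed, 0 if censored.
times = [6, 6, 6, 7, 10, 13, 16, 22, 23,
         6, 9, 10, 11, 17, 19, 20, 25, 32, 32, 34, 35]
events = [1] * 9 + [0] * 12


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def loglik_lognormal(mu, sigma, times, events):
    """Censored-data log-likelihood of log T = mu + sigma * eps, eps ~ N(0, 1)."""
    total = 0.0
    for t, d in zip(times, events):
        z = (math.log(t) - mu) / sigma
        if d == 1:
            # Observed event: log density of log(t).
            total += -math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z
        else:
            # Right-censored: log of the probability of surviving past t.
            total += math.log(1.0 - norm_cdf(z))
    return total


# Evaluate at the Intercept and Scale values reported in the SAS listing.
ll_hat = loglik_lognormal(3.2031, 0.9787, times, events)
print(round(ll_hat, 3))
```

Evaluated at Intercept = 3.2031 and Scale = 0.9787, the function should agree with the log-likelihood in the SAS listing (about -19.492) up to rounding, and moving either parameter away from these values should lower it.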

    The SAS System                    10:17 Wednesday, January 25, 2006

    The LIFEREG Procedure

    Model Information
    Data Set                    WORK.B
    Dependent Variable          Log(t)
    Censoring Variable          C
    Censoring Value(s)          0
    Number of Observations      21
    Noncensored Values          9
    Right Censored Values       12
    Left Censored Values        0
    Interval Censored Values    0
    Name of Distribution        Lognormal
    Log Likelihood              -19.49230747

    Number of Observations Read    21
    Number of Observations Used    21

    Parameter Information
    Parameter    Effect
    Intercept    Intercept

    Algorithm converged.

    Analysis of Parameter Estimates
    Parameter  DF  Estimate  Standard Error  95% Confidence Limits  Chi-Square  Pr > ChiSq
    Intercept   1    3.2031          0.2861    2.6423     3.7639        125.31      <.0001
    Scale       1    0.9787          0.2506    0.5925     1.6166

    Estimated Covariance Matrix
                 Intercept       Scale
    Intercept     0.081872    0.035723
    Scale         0.035723    0.062796
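The columns of the parameter table can be reproduced from the estimates and the covariance matrix. The sketch below is a hedged Python cross-check rather than SAS's internal code. It assumes Wald-type limits, with the Scale limits formed on the log scale and then exponentiated; that construction is an assumption, chosen because it reproduces the listing. The variable names are illustrative.

```python
import math

# Values copied from the SAS listing above.
intercept, scale = 3.2031, 0.9787
cov = [[0.081872, 0.035723],
       [0.035723, 0.062796]]
z975 = 1.959964  # 97.5th percentile of the standard normal distribution

# Standard errors are the square roots of the diagonal of the covariance matrix.
se_intercept = math.sqrt(cov[0][0])
se_scale = math.sqrt(cov[1][1])

# Wald chi-square for H0: Intercept = 0.
chi_sq = (intercept / se_intercept) ** 2

# Intercept: symmetric Wald limits.
int_lo = intercept - z975 * se_intercept
int_hi = intercept + z975 * se_intercept

# Scale: limits on the log scale, then exponentiated (assumed construction).
factor = math.exp(z975 * se_scale / scale)
scale_lo = scale / factor
scale_hi = scale * factor

print(round(se_intercept, 4), round(se_scale, 4))
print(round(int_lo, 4), round(int_hi, 4))
print(round(scale_lo, 4), round(scale_hi, 4))
```

The asymmetric limits for Scale follow from working with log(Scale), which keeps the lower limit positive.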

In ordinary linear regression, the distribution of the disturbance term is normal. PROC LIFEREG allows other distributions for the disturbance term $\varepsilon$. For each of these distributions, there is a corresponding distribution for T:

    Distribution of T    Distribution of ε
    Exponential          Extreme value (one parameter)
    Weibull              Extreme value (two parameters)
    Log-normal           Normal
    Log-logistic         Logistic
    Gamma                Log-gamma

Note that all AFT models are named for the distribution of T rather than the distribution of log T or ε.

Exponential Distribution: To fit the exponential distribution with PROC LIFEREG, we specify DIST=EXPONENTIAL as an option in the MODEL statement. As we saw in Chapter 6, an exponential distribution for T corresponds to a constant hazard function. That is,

$$\log h(t) = \beta_0^*$$

We add the * to distinguish this coefficient from the coefficient in model (1). It can be shown that the two models are completely equivalent. In fact, we have $\beta_0^* = -\beta_0$. The estimated parameter of the exponential distribution can therefore be obtained by $\hat{\lambda} = \exp(-\text{INTERCEPT})$.

The relationship between the parameters in the log-hazard model (log h(t)) and the log-survival-time model (log T) is more complicated for other distributions.
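The equivalence of the two models in the exponential case can be made explicit. The short derivation below is a sketch that assumes the rate parameterization $S(t) = e^{-\lambda t}$. The symbol $E$, a unit exponential random variable, is introduced here for illustration only.

```latex
\begin{aligned}
h(t) = \lambda
  &\;\Rightarrow\; \log h(t) = \log\lambda = \beta_0^{*}, \\
T = E/\lambda,\quad E \sim \mathrm{Exp}(1)
  &\;\Rightarrow\; \log T = -\log\lambda + \log E = \beta_0 + \varepsilon, \\
  &\;\Rightarrow\; \beta_0 = -\log\lambda = -\beta_0^{*}, \qquad \sigma = 1 .
\end{aligned}
```

Here $\varepsilon = \log E$ follows the standard extreme value distribution, which is why the exponential model pairs with the one-parameter extreme value distribution and why the scale is held at 1 in this case.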

    The SAS System                    09:35 Wednesday, January 25, 2006

    The LIFEREG Procedure

    Model Information
    Data Set                    WORK.B
    Dependent Variable          Log(t)
    Censoring Variable          C
    Censoring Value(s)          0
    Number of Observations      21
    Noncensored Values          9
    Right Censored Values       12
    Left Censored Values        0
    Interval Censored Values    0
    Name of Distribution        Exponential
    Log Likelihood              -20.9870319

    Number of Observations Read    21
    Number of Observations Used    21

    Algorithm converged.

    Analysis of Parameter Estimates
    Parameter      DF  Estimate  Standard Error  95% Confidence Limits  Chi-Square  Pr > ChiSq
    Intercept       1    3.6861          0.3333    3.0328     4.3394        122.29      <.0001
    Scale           0    1.0000          0.0000    1.0000     1.0000
    Weibull Scale   1   39.8888         13.2963   20.7547    76.6628
    Weibull Shape   0    1.0000          0.0000    1.0000     1.0000

    Lagrange Multiplier Statistics
    Parameter  Chi-Square  Pr > ChiSq
    Scale          1.9466      0.1630
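For the exponential model the maximum likelihood estimate has a closed form, so the listing can be checked by hand. The following Python sketch is illustrative and is not what PROC LIFEREG runs internally. It assumes the rate parameterization, with hazard λ estimated by events divided by total time at risk, and it assumes that the reported log-likelihood refers to the density of log t. The variable names are hypothetical.

```python
import math

# Remission times (weeks) for the 6-MP group; event = 1 if observed, 0 if censored.
times = [6, 6, 6, 7, 10, 13, 16, 22, 23,
         6, 9, 10, 11, 17, 19, 20, 25, 32, 32, 34, 35]
events = [1] * 9 + [0] * 12

d = sum(events)          # number of observed remissions ending
exposure = sum(times)    # total weeks at risk, censored or not

lam_hat = d / exposure                 # closed-form MLE of the constant hazard
intercept = -math.log(lam_hat)         # AFT intercept, log(exposure / d)
se_intercept = 1.0 / math.sqrt(d)      # from the observed information, which equals d
weibull_scale = math.exp(intercept)    # the "Weibull Scale" row, equal to 1 / lam_hat

# Log-likelihood on the log-time scale: the density of log T at log t is t * f(t),
# so each uncensored observation adds log(t) to the usual exponential log-likelihood.
loglik = (d * math.log(lam_hat) - lam_hat * exposure
          + sum(math.log(t) for t, e in zip(times, events) if e == 1))

# p-value of the Lagrange multiplier statistic, chi-square with 1 degree of freedom.
lm_p = math.erfc(math.sqrt(1.9466 / 2.0))

print(round(intercept, 4), round(se_intercept, 4), round(weibull_scale, 4))
print(round(loglik, 4), round(lm_p, 4))
```

The Lagrange multiplier test asks whether freeing the scale parameter (that is, moving from the exponential to the Weibull model) would improve the fit. With a p-value of about 0.16, the listing gives no strong evidence against the exponential model.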

Weibull Distribution: To fit the Weibull distribution with PROC LIFEREG, we specify DIST=WEIBULL as an option in the MODEL statement. The estimated parameters of the Weibull distribution can be obtained by $\hat{\lambda} = \exp(-\text{INTERCEPT})$ and $\hat{\gamma} = 1/\text{SCALE}$.

Log-Logistic Distribution: To fit the log-logistic distribution with PROC LIFEREG, we specify DIST=LLOGISTIC as an option in the MODEL statement. The estimated parameters of the log-logistic distribution can be obtained by $\hat{\lambda} = \exp(-\text{INTERCEPT}/\text{SCALE})$ and $\hat{\gamma} = 1/\text{SCALE}$.

Gamma Distribution: We discussed two different gamma distributions: the standard (2-parameter) gamma distribution and the generalized (3-parameter) gamma distribution. PROC LIFEREG fits the generalized gamma distribution. Note that the exponential, Weibull, standard gamma, and log-normal distributions (but not the log-logistic) are all special cases of the generalized gamma distribution. To fit the generalized gamma distribution with PROC LIFEREG, we specify DIST=GAMMA as an option in the MODEL statement. The estimated parameters of the generalized gamma distribution can be obtained by $\hat{\lambda} = \exp(-\text{INTERCEPT})$, with $\hat{\alpha}$ and $\hat{\gamma}$ computed from the SCALE and SHAPE estimates.

When the shape parameter is 0, we get the log-normal distribution. When it is 1.0, we have the Weibull distribution. When the shape parameter and the scale parameter are equal, we have the standard gamma distribution.
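The mapping from the PROC LIFEREG output to the distribution parameters depends on how each distribution is parameterized. The derivation below is a sketch under two assumed parameterizations, $S(t) = \exp\{-(\lambda t)^{\gamma}\}$ for the Weibull and $S(t) = 1/(1 + \lambda t^{\gamma})$ for the log-logistic. If the survival functions are written differently, the conversion formulas change accordingly. $E$ denotes a unit exponential variable and $L$ a standard logistic variable, both introduced for illustration.

```latex
\begin{aligned}
\text{Weibull:}\quad
(\lambda T)^{\gamma} = E
  &\;\Rightarrow\; \log T = -\log\lambda + \tfrac{1}{\gamma}\log E \\
  &\;\Rightarrow\; \text{INTERCEPT} = -\log\lambda,\quad \text{SCALE} = 1/\gamma \\
  &\;\Rightarrow\; \hat{\lambda} = \exp(-\text{INTERCEPT}),\quad \hat{\gamma} = 1/\text{SCALE}. \\[6pt]
\text{Log-logistic:}\quad
\log\lambda + \gamma\log T = L
  &\;\Rightarrow\; \log T = -\tfrac{\log\lambda}{\gamma} + \tfrac{1}{\gamma}L \\
  &\;\Rightarrow\; \text{INTERCEPT} = -\tfrac{\log\lambda}{\gamma},\quad \text{SCALE} = 1/\gamma \\
  &\;\Rightarrow\; \hat{\lambda} = \exp(-\text{INTERCEPT}/\text{SCALE}),\quad \hat{\gamma} = 1/\text{SCALE}.
\end{aligned}
```

In both cases the disturbance term ($\log E$ or $L$) has the distribution paired with that model, extreme value for the Weibull and logistic for the log-logistic.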

As for the standard gamma distribution, there is no direct way of fitting it in PROC LIFEREG. We cannot impose the constraint SCALE = SHAPE in PROC LIFEREG, since it does not handle equality constraints. However, PROC LIFEREG allows fixing both the scale and shape parameters at specific values. For example, we can use

    proc lifereg data=b;
      model t*c(0) = / d=gamma noshape1 shape1=0.7 noscale scale=0.7;
    run;
    quit;

We can try a range of values to find the common value of the shape and scale parameters that maximizes the log-likelihood.