Survival Analysis APTS 2016/17 Preliminary material

Survival Analysis APTS 2016/17 Preliminary material Ingrid Van Keilegom KU Leuven (ingrid.vankeilegom@kuleuven.be) August 2017

1 Introduction
2 Common functions in survival analysis
3 Parametric survival distributions
4 Exercises
5 References

1 Introduction

What is survival analysis?

Survival analysis (or duration analysis) is an area of statistics that models and studies the time until an event of interest takes place.

In practice, for some subjects the event of interest cannot be observed, for various reasons, e.g.
- the event is not yet observed at the end of the study
- another event takes place before the event of interest
- ...

In survival analysis the aim is to model time-to-event data in an appropriate way, so that correct inference can be made while taking these special features of the data into account.

Examples

Medicine:
- time to death for patients having a certain disease
- time to getting cured from a certain disease
- time to relapse of a certain disease

Agriculture:
- time until a farm experiences its first case of a certain disease

Sociology ("duration analysis"):
- time to find a new job after a period of unemployment
- time until re-arrest after release from prison

Engineering ("reliability analysis"):
- time to the failure of a machine

2 Common functions in survival analysis

Let $T$ be a non-negative continuous random variable, representing the time until the event of interest.

Denote
- $F(t) = P(T \le t)$: distribution function
- $f(t)$: probability density function

For survival data, we rather consider
- $S(t)$: survival function
- $H(t)$: cumulative hazard function
- $h(t)$: hazard function
- $\mathrm{mrl}(t)$: mean residual life function

Knowing one of these functions suffices to determine the other functions.

Survival function

$S(t) = P(T > t) = 1 - F(t)$

- Probability that a randomly selected individual will survive beyond time $t$
- Decreasing function, taking values in $[0, 1]$
- Equals 1 at $t = 0$ and 0 at $t = \infty$

Cumulative hazard function

$H(t) = -\log S(t)$

- Increasing function, taking values in $[0, +\infty]$
- $S(t) = \exp(-H(t))$

Hazard function (or hazard rate)

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{1}{P(T \ge t)} \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t)}{\Delta t} = \frac{f(t)}{S(t)} = -\frac{d}{dt} \log S(t) = \frac{d}{dt} H(t)$$

- $h(t)$ measures the instantaneous risk of dying right after time $t$, given that the individual is alive at time $t$
- Positive function (not necessarily increasing or decreasing)
- The hazard function $h(t)$ can have many different shapes and is therefore a useful tool to summarize survival data

[Figure: Hazard functions of different shapes. Hazard versus time for four cases: exponential; Weibull, rho=0.5; Weibull, rho=1.5; bathtub.]
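The identities linking the density, survival, cumulative hazard and hazard functions can be checked numerically. The sketch below is only an illustration: it assumes a Weibull-type survival function $S(t) = \exp(-\lambda t^{\rho})$ with illustrative parameter values, and the function names are hypothetical. It compares $h(t) = f(t)/S(t)$ with a finite-difference approximation of $\frac{d}{dt} H(t)$.

```python
import math

# Illustrative parameter values (an assumption for this sketch).
lam, rho = 0.06, 1.5

def survival(t):
    # S(t) = exp(-lam * t^rho)
    return math.exp(-lam * t ** rho)

def density(t):
    # f(t) = -dS/dt = rho * lam * t^(rho - 1) * exp(-lam * t^rho)
    return rho * lam * t ** (rho - 1) * math.exp(-lam * t ** rho)

def cumulative_hazard(t):
    # H(t) = -log S(t)
    return -math.log(survival(t))

def hazard_from_ratio(t):
    # h(t) = f(t) / S(t)
    return density(t) / survival(t)

def hazard_from_derivative(t, eps=1e-6):
    # h(t) = dH/dt, approximated by a central finite difference
    return (cumulative_hazard(t + eps) - cumulative_hazard(t - eps)) / (2 * eps)

for t in (0.5, 2.0, 5.0):
    print(t, hazard_from_ratio(t), hazard_from_derivative(t))
```

The two hazard columns should agree up to the finite-difference error, which is what "knowing one of these functions suffices to determine the others" looks like in practice.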

Mean residual life function

The mrl function measures the expected remaining lifetime for an individual of age $t$. As a function of $t$, we have

$$\mathrm{mrl}(t) = \frac{\int_t^{\infty} S(s)\,ds}{S(t)}$$

This result is obtained from

$$\mathrm{mrl}(t) = E(T - t \mid T > t) = \frac{\int_t^{\infty} (s - t) f(s)\,ds}{S(t)}$$

Mean life time:

$$E(T) = \mathrm{mrl}(0) = \int_0^{\infty} s f(s)\,ds = \int_0^{\infty} S(s)\,ds$$
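As a rough numerical illustration of the mrl formula, the sketch below integrates a survival function with the trapezoidal rule. It is a minimal sketch under stated assumptions: the helper name `mean_residual_life`, the truncation point `upper` and the rate value are all hypothetical choices, and the test case is a constant-hazard survival function $S(t) = \exp(-\lambda t)$, for which the expected remaining lifetime is $1/\lambda$ at every age $t$.

```python
import math

def mean_residual_life(survival, t, upper=200.0, steps=200_000):
    """Approximate mrl(t) = (integral of S(s) ds from t to infinity) / S(t).

    The infinite upper limit is truncated at `upper`, which is only
    reasonable when S(upper) is negligible (an assumption of this sketch).
    The integral is computed with the trapezoidal rule.
    """
    width = (upper - t) / steps
    total = 0.5 * (survival(t) + survival(upper))
    for i in range(1, steps):
        total += survival(t + i * width)
    return total * width / survival(t)

lam = 0.14  # illustrative rate

def exp_survival(s):
    # Constant-hazard survival function S(s) = exp(-lam * s)
    return math.exp(-lam * s)

print(mean_residual_life(exp_survival, 0.0))  # mean life time E(T)
print(mean_residual_life(exp_survival, 5.0))  # remaining life at age 5
print(1 / lam)                                # value both should be close to
```

For other survival functions the value of `upper` has to be chosen so that the neglected tail is small; heavy-tailed models may need a much larger truncation point.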

3 Parametric survival distributions

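Exponential distribution

Characterized by one parameter $\lambda > 0$:

$$S_0(t) = \exp(-\lambda t)$$
$$f_0(t) = \lambda \exp(-\lambda t)$$
$$h_0(t) = \lambda$$

- Leads to a constant hazard function
- Empirical check: plot of the log of the survival estimate versus time, which should be roughly a straight line through the origin with slope $-\lambda$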

[Figure: Hazard and survival function for the exponential distribution. Panels: hazard versus time and survival versus time, with Lambda=0.14.]
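The empirical check for the exponential model can be sketched on simulated data. The example below is a hedged illustration, not the procedure used in the course: it assumes fully observed event times (no censoring), so that the empirical survival function can stand in for the survival estimate, and the sample size, seed and 5% tail cut-off are arbitrary choices.

```python
import math
import random

random.seed(0)   # fixed seed so the run is reproducible
lam = 0.14       # illustrative rate
n = 2000         # hypothetical sample size

# Simulated, fully observed event times (no censoring is assumed here).
times = sorted(random.expovariate(lam) for _ in range(n))

# Empirical survival at the i-th ordered time: share of subjects still event-free.
points = []
for i, t in enumerate(times, start=1):
    s_hat = (n - i) / n
    if s_hat >= 0.05:  # drop the noisy tail, where log S_hat is unstable
        points.append((t, math.log(s_hat)))

# Least-squares slope of log S_hat on t for a line through the origin.
slope = sum(t * y for t, y in points) / sum(t * t for t, _ in points)
print("fitted slope:", slope, "expected to be near:", -lam)
```

With censored data the empirical survival function would have to be replaced by an estimator that accounts for censoring, which this sketch does not attempt.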

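Weibull distribution

Characterized by a scale parameter $\lambda > 0$ and a shape parameter $\rho > 0$:

$$S_0(t) = \exp(-\lambda t^{\rho})$$
$$f_0(t) = \rho \lambda t^{\rho - 1} \exp(-\lambda t^{\rho})$$
$$h_0(t) = \rho \lambda t^{\rho - 1}$$

- hazard decreases monotonically with time if $\rho < 1$
- hazard increases monotonically with time if $\rho > 1$
- hazard is constant over time if $\rho = 1$ (exponential case)

Empirical check: plot log cumulative hazard versus log time. Since $\log H_0(t) = \log \lambda + \rho \log t$, the plot should be roughly a straight line with slope $\rho$.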

[Figure: Hazard and survival function for the Weibull distribution. Panels: hazard versus time and survival versus time, each with two curves: Lambda=0.31, Rho=0.5 and Lambda=0.06, Rho=1.5.]
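The Weibull statements above can be illustrated with a short sketch. It uses the two parameter pairs from the figure legend; the function names and the evaluation times are hypothetical. It prints the hazard at a few times, to show the decreasing and increasing cases, and the slope of log cumulative hazard against log time, which under the model equals $\rho$.

```python
import math

def weibull_hazard(t, lam, rho):
    # h_0(t) = rho * lam * t^(rho - 1)
    return rho * lam * t ** (rho - 1)

def weibull_cumulative_hazard(t, lam, rho):
    # H_0(t) = -log S_0(t) = lam * t^rho
    return lam * t ** rho

def loglog_slope(lam, rho, t1=1.0, t2=8.0):
    # Slope of log H_0(t) against log t between two time points.
    y1 = math.log(weibull_cumulative_hazard(t1, lam, rho))
    y2 = math.log(weibull_cumulative_hazard(t2, lam, rho))
    return (y2 - y1) / (math.log(t2) - math.log(t1))

eval_times = (1.0, 2.0, 4.0, 8.0)  # arbitrary evaluation times
for lam, rho in [(0.31, 0.5), (0.06, 1.5)]:
    hazards = [weibull_hazard(t, lam, rho) for t in eval_times]
    print("rho =", rho, "hazards:", hazards, "log-log slope:", loglog_slope(lam, rho))
```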

Gompertz distribution

Characterized by two parameters $\lambda > 0$ and $\gamma > 0$:

$$S_0(t) = \exp\left[-\lambda \gamma^{-1} (\exp(\gamma t) - 1)\right]$$
$$f_0(t) = \lambda \exp(\gamma t) \exp\left[-\lambda \gamma^{-1} (\exp(\gamma t) - 1)\right]$$
$$h_0(t) = \lambda \exp(\gamma t)$$

- hazard increases from $\lambda$ at time 0 to $\infty$ at time $\infty$
- $\gamma = 0$ corresponds to the exponential case
- The Gompertz distribution can also be presented with $\gamma \in \mathbb{R}$: for $\gamma < 0$ the hazard is decreasing and the cumulative hazard does not go to $\infty$ when $t \to \infty$, so part of the population will never experience the event

[Figure: Hazard and survival function for the Gompertz distribution. Panels: hazard versus time and survival versus time, each with two curves: Lambda=0.03, Gamma=0.5 and Lambda=0.00006, Gamma=2.]
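The remark about $\gamma < 0$ can be made concrete. From the survival formula, the cumulative hazard $\lambda \gamma^{-1}(\exp(\gamma t) - 1)$ tends to $-\lambda/\gamma$ as $t \to \infty$ when $\gamma < 0$, so the survival function levels off at $\exp(\lambda/\gamma)$. The sketch below checks this numerically; the function names and the negative-$\gamma$ parameter values are hypothetical choices made for illustration.

```python
import math

def gompertz_survival(t, lam, gamma):
    # S_0(t) = exp[-(lam / gamma) * (exp(gamma * t) - 1)];
    # gamma = 0 is treated as the exponential case.
    if gamma == 0:
        return math.exp(-lam * t)
    return math.exp(-(lam / gamma) * math.expm1(gamma * t))

def never_event_fraction(lam, gamma):
    # Limit of S_0(t) as t -> infinity: positive only when gamma < 0.
    if gamma >= 0:
        return 0.0
    return math.exp(lam / gamma)

# gamma > 0 (values from the figure legend): survival goes to 0.
print(gompertz_survival(40.0, 0.03, 0.5))

# gamma < 0 (hypothetical values): survival levels off above 0.
print(gompertz_survival(100.0, 0.3, -0.5), never_event_fraction(0.3, -0.5))
```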

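Log-logistic distribution

A random variable $T$ has a log-logistic distribution if $\log T$ has a logistic distribution.

Characterized by two parameters $\lambda > 0$ and $\kappa > 0$:

$$S_0(t) = \frac{1}{1 + (t\lambda)^{\kappa}}$$
$$f_0(t) = \frac{\kappa t^{\kappa - 1} \lambda^{\kappa}}{\left[1 + (t\lambda)^{\kappa}\right]^2}$$
$$h_0(t) = \frac{\kappa t^{\kappa - 1} \lambda^{\kappa}}{1 + (t\lambda)^{\kappa}}$$

The median event time is only a function of the parameter $\lambda$: setting $S_0(t) = 1/2$ gives $(t\lambda)^{\kappa} = 1$, so $\mathrm{Med}(T) = 1/\lambda$.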

[Figure: Hazard and survival function for the log-logistic distribution. Panels: hazard versus time and survival versus time, each with two curves: Lambda=0.2, Kappa=1.5 and Lambda=0.2, Kappa=0.5.]
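The median of the log-logistic model solves $S_0(t) = 1/2$. The sketch below locates it by bisection directly from the survival formula, so it does not rely on any closed-form expression; it uses the two parameter pairs from the figure legend, and the function names and search bounds are hypothetical.

```python
def loglogistic_survival(t, lam, kappa):
    # S_0(t) = 1 / (1 + (t * lam)^kappa)
    return 1.0 / (1.0 + (t * lam) ** kappa)

def loglogistic_hazard(t, lam, kappa):
    # h_0(t) = kappa * t^(kappa - 1) * lam^kappa / (1 + (t * lam)^kappa)
    return kappa * t ** (kappa - 1) * lam ** kappa / (1.0 + (t * lam) ** kappa)

def median_by_bisection(lam, kappa, lo=0.0, hi=1e6, iterations=200):
    # S_0 is decreasing; the search assumes S_0(lo) > 1/2 > S_0(hi).
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if loglogistic_survival(mid, lam, kappa) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for lam, kappa in [(0.2, 1.5), (0.2, 0.5)]:
    print("kappa =", kappa, "median:", median_by_bisection(lam, kappa))
```

Both parameter pairs share the same $\lambda$, and the two printed medians coincide, which illustrates that the median does not depend on $\kappa$.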

Log-normal distribution

Resembles the log-logistic distribution but is mathematically less tractable.

A random variable $T$ has a log-normal distribution if $\log T$ has a normal distribution.

Characterized by two parameters $\mu$ and $\gamma > 0$:

$$S_0(t) = 1 - F_N\left(\frac{\log(t) - \mu}{\gamma}\right)$$
$$f_0(t) = \frac{1}{t \sqrt{2\pi}\,\gamma} \exp\left[-\frac{1}{2\gamma^2} (\log(t) - \mu)^2\right]$$

where $F_N$ denotes the standard normal distribution function.

The median event time is only a function of the parameter $\mu$: $\mathrm{Med}(T) = \exp(\mu)$

[Figure: Hazard and survival function for the log-normal distribution. Panels: hazard versus time and survival versus time, each with two curves: Mu=1.609, Gamma=0.5 and Mu=1.609, Gamma=1.5.]
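The log-normal hazard has no simple closed form, but it can be evaluated as $f_0(t)/S_0(t)$. The sketch below is an illustration under the assumption that $\gamma$ is the standard deviation of $\log T$, as the survival formula implies; it builds the standard normal distribution function from `math.erf`, and the function names are hypothetical. Parameter values are taken from the figure legend.

```python
import math

def standard_normal_cdf(z):
    # F_N(z), expressed through the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def lognormal_survival(t, mu, gamma):
    # S_0(t) = 1 - F_N((log t - mu) / gamma)
    return 1.0 - standard_normal_cdf((math.log(t) - mu) / gamma)

def lognormal_density(t, mu, gamma):
    # f_0(t) = exp(-(log t - mu)^2 / (2 gamma^2)) / (t * gamma * sqrt(2 pi))
    z = (math.log(t) - mu) / gamma
    return math.exp(-0.5 * z * z) / (t * gamma * math.sqrt(2.0 * math.pi))

def lognormal_hazard(t, mu, gamma):
    # h_0(t) = f_0(t) / S_0(t)
    return lognormal_density(t, mu, gamma) / lognormal_survival(t, mu, gamma)

mu = 1.609
for gamma in (0.5, 1.5):
    median = math.exp(mu)
    print("gamma =", gamma,
          "S at exp(mu):", lognormal_survival(median, mu, gamma),
          "hazard at t = 1, 5, 10:",
          [lognormal_hazard(t, mu, gamma) for t in (1.0, 5.0, 10.0)])
```

The survival value printed at $t = \exp(\mu)$ is one half for both values of $\gamma$, in line with the statement that the median depends on $\mu$ only.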

4 Exercises

1. Find a few more practical situations where time-to-event data are of interest, and try to imagine why the event of interest can sometimes not be observed in these situations.
2. Show that the four common functions in survival analysis (survival function, cumulative hazard function, hazard function and mean residual life function) all determine the law of the random variable of interest in a unique way.

5 References

Some textbooks on survival analysis:

- Cox, D.R. and Oakes, D. (1984). Analysis of survival data, Chapman and Hall, New York.
- Fleming, T.R. and Harrington, D.P. (1991). Counting processes and survival analysis, Wiley, New York.
- Hougaard, P. (2000). Analysis of multivariate survival data, Springer, New York.
- Kalbfleisch, J.D. and Prentice, R.L. (1980). The statistical analysis of failure time data, Wiley, New York.
- Klein, J.P. and Moeschberger, M.L. (1997). Survival analysis, techniques for censored and truncated data, Springer, New York.
- Kleinbaum, D.G. and Klein, M. (2005). Survival analysis, a self-learning text, Springer, New York.