Chapter 2 ( ) Fall 2012


Bios 323: Applied Survival Analysis
Qingxia (Cindy) Chen
Chapter 2 ( ) Fall 2012

Definitions and Notation

There are several equivalent ways to characterize the probability distribution of a survival random variable. Some of these are familiar; others are special to survival analysis. We will focus on the following terms:
   The density function $f(t)$
   The survivor function $S(t)$
   The hazard function $h(t)$
   The cumulative hazard function $H(t)$

Density function $f(t)$

For discrete r.v.'s (probability mass function): suppose that $T$ takes values in $a_1, a_2, \ldots, a_n$. Then
$f(t) = \Pr(T = t) = f_j$ if $t = a_j$, $j = 1, 2, \ldots, n$, and $f(t) = 0$ if $t \ne a_j$, $j = 1, 2, \ldots, n$.

Density function for continuous r.v.'s:
$f(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \Pr(t \le T \le t + \Delta t)$

Survivorship function: $S(t) = P(T \ge t)$.

In other settings, the cumulative distribution function, $F(t) = P(T \le t)$, is of interest. In survival analysis, our interest tends to focus on the survival function, $S(t)$.

For a continuous random variable:
$S(t) = \int_t^{\infty} f(u)\,du$

Figure 1: Plot of probability density function (Exponential(0.5) density f(x) against time, with x and x + dx marked)

The survival function $S(x)$ corresponds to the area under the curve to the right of $x$.
$f(x)\,dx \approx P(x \le X < x + dx) = S(x) - S(x + dx)$.
$f(x)\,dx$ is the infinitesimal probability of failure at $x$, unconditional on whether the individual is alive just prior to $x$.

For a discrete random variable:
$S(t) = P(T > t) = \sum_{u > t} f(u) = \sum_{a_j > t} f(a_j) = \sum_{a_j > t} f_j$

Notes:

1. From the definition of $S(t)$ for a continuous variable, $S(t) = 1 - F(t)$ as long as $F(t)$ is absolutely continuous w.r.t. the Lebesgue measure. [That is, $F(t)$ has a density function.]
2. For a discrete variable, we have to decide what to do if an event occurs exactly at time $t$; i.e., does that become part of $F(t)$ or $S(t)$?
3. To get around this problem, several books define $S(t) = P(T > t)$, or else define $F(t) = P(T < t)$ (e.g. Collett). K&M used $S(t) = P(T > t)$.

Hazard Function $h(t)$

Sometimes called an instantaneous failure rate, the force of mortality, or the age-specific failure rate.

1. Continuous random variables:
$h(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \Pr(t \le T \le t + \Delta t \mid T \ge t)$
$\quad = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \frac{\Pr([t \le T \le t + \Delta t] \cap [T \ge t])}{\Pr(T \ge t)}$
$\quad = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \frac{\Pr(t \le T \le t + \Delta t)}{\Pr(T \ge t)}$
$\quad = \frac{f(t)}{S(t)}$
$h(t)\,dt$ is the infinitesimal probability of failure at the next instant after $t$, given that one is alive at $t$.

2. Discrete random variables:

$h(a_j) \equiv h_j = \Pr(T = a_j \mid T \ge a_j) = \frac{\Pr(T = a_j)}{\Pr(T \ge a_j)} = \frac{f(a_j)}{S(a_{j-1})} = \frac{f(a_j)}{\sum_{k: a_k > a_{j-1}} f(a_k)}$

Cumulative Hazard Function $H(t)$

Continuous random variables:
$H(t) = \int_0^t h(u)\,du$

Discrete random variables:
$H(t) = \sum_{k: a_k \le t} h_k$

Relationship between $S(t)$ and $h(t)$

We've already shown that, for a continuous r.v.,
$h(t) = \frac{f(t)}{S(t)}$

For a left-continuous survivor function $S(t)$, we can show:
$f(t) = -S'(t)$

We can use this relationship to show that:
$-\frac{d}{dt}[\log S(t)] = -\frac{S'(t)}{S(t)} = -\frac{-f(t)}{S(t)} = \frac{f(t)}{S(t)}$

So another way to write $h(t)$ is as follows:
$h(t) = -\frac{d}{dt}[\log S(t)]$
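The two expressions for the hazard can be checked numerically. The sketch below assumes a hypothetical lifetime with $S(t) = 1/(1+t)^2$ (so $f(t) = 2/(1+t)^3$), which is not from the notes and is chosen only because its derivatives are easy to write down; the function names are illustrative.

```python
import math

# Hypothetical lifetime, for illustration only:
# S(t) = 1 / (1 + t)^2, hence f(t) = -S'(t) = 2 / (1 + t)^3.
def survival(t):
    return 1.0 / (1.0 + t) ** 2

def density(t):
    return 2.0 / (1.0 + t) ** 3

def hazard(t):
    """h(t) = f(t) / S(t)."""
    return density(t) / survival(t)

def neg_dlog_survival(t, eps=1e-6):
    """Central finite difference approximation to -d/dt log S(t)."""
    return -(math.log(survival(t + eps)) - math.log(survival(t - eps))) / (2 * eps)

for t in (0.5, 1.5, 4.0):
    # The two columns should agree up to finite-difference error.
    print(t, hazard(t), neg_dlog_survival(t))
```

For this choice of $S(t)$ the hazard is $2/(1+t)$, so both columns should decrease with $t$.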

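Relationship between $S(t)$ and $H(t)$

Continuous case:
$H(t) = \int_0^t h(u)\,du$
$\quad = \int_0^t \frac{f(u)}{S(u)}\,du$
$\quad = \int_0^t -\frac{d}{du} \log S(u)\,du$
$\quad = -\log S(t) + \log S(0)$
$\Rightarrow S(t) = e^{-H(t)}$

Discrete case: Suppose that $a_1 < a_2 < \cdots < a_K$, and $a_j \le t < a_{j+1}$.

1st way to derive it:
$S(t) = P(T > t) = P(T \ge a_{j+1})$
$\quad = P(T \ge a_1, T \ge a_2, \ldots, T \ge a_{j+1})$
$\quad = P(T \ge a_1)\,P(T \ge a_2 \mid T \ge a_1) \cdots P(T \ge a_{j+1} \mid T \ge a_j)$
$\quad = P(T \ge a_1)\,[1 - P(T = a_1 \mid T \ge a_1)] \cdots [1 - P(T = a_j \mid T \ge a_j)]$
$\quad = 1 \cdot (1 - h(a_1)) \cdots (1 - h(a_j))$
$\quad = \prod_{k: a_k \le t} (1 - h(a_k))$.

2nd way to derive it: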

Since we have
$h(a_j) = \frac{f(a_j)}{S(a_{j-1})} = \frac{S(a_{j-1}) - S(a_j)}{S(a_{j-1})} = 1 - \frac{S(a_j)}{S(a_{j-1})}$, where $j = 1, \ldots, K$,
it follows that
$S(a_j) = (1 - h(a_j))\,S(a_{j-1}) = \cdots = (1 - h(a_j)) \cdots (1 - h(a_1))\,S(a_0) = (1 - h(a_j)) \cdots (1 - h(a_1))$.
The last equality holds because $S(a_0) = 1$. Now we have
$S(a_j) = \prod_{a_k \le a_j} \{1 - h(a_k)\}$.
Since $h(x) = 0$ for $x \ne a_1, \ldots, a_K$, we have
$S(t) = S(a_j) = \prod_{k: a_k \le t} \{1 - h(a_k)\}$.

Cox defines
$H(t) = -\sum_{k: a_k \le t} \log(1 - h_k)$   (1)
so that $S(t) = e^{-H(t)}$ in the discrete case, as well.

K&M used
$H(t) = \sum_{k: a_k \le t} h_k$.   (2)

Equation (2) is an approximation of (1) when the $h_k$ are small. (Try: $-\log(1 - h_k)/h_k \to 1$ when $h_k \to 0$.)
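A small numerical sketch of the two discrete definitions follows. The hazard values are hypothetical and chosen to be small, so that definition (2) should land close to definition (1); nothing here comes from a real dataset.

```python
import math

# Hypothetical discrete hazards h_k at ordered support points a_1 < a_2 < ...
hazards = [0.02, 0.05, 0.03, 0.04]

# Product form of the survival function after the last support point.
survival = math.prod(1 - h for h in hazards)

# Definition (1): exp(-H) reproduces the product exactly.
H_cox = -sum(math.log(1 - h) for h in hazards)

# Definition (2): sum of hazards, an approximation when the h_k are small.
H_km = sum(hazards)

print(survival, math.exp(-H_cox), math.exp(-H_km))
```

With larger hazards the gap between the last two printed values widens, which is one way to see why (2) is described as an approximation.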

Example (discrete): $f_j = P(X = j) = 1/3$, $j = 1, 2, 3$.
$S(x) = ?$ (in Figure 2)
$h(x) = ?$

Figure 2: Survival function for a discrete random lifetime (survival probability against time)
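One way to work this example is to apply the discrete definitions directly, using exact fractions. The sketch below follows the convention $S(x) = P(X > x)$ and $h_j = P(X = a_j \mid X \ge a_j)$ used in these notes; the function names are illustrative.

```python
from fractions import Fraction

# Support points a_j and masses f_j from the example: P(X = j) = 1/3, j = 1, 2, 3.
support = [1, 2, 3]
mass = {a: Fraction(1, 3) for a in support}

def survival(x):
    """S(x) = P(X > x): total mass strictly to the right of x."""
    return sum(f for a, f in mass.items() if a > x)

def hazard(a_j):
    """h_j = P(X = a_j | X >= a_j) = f(a_j) / P(X >= a_j)."""
    at_risk = sum(f for a, f in mass.items() if a >= a_j)
    return mass[a_j] / at_risk

def survival_from_hazards(x):
    """Product form: S(x) = product over a_k <= x of (1 - h_k)."""
    result = Fraction(1)
    for a in support:
        if a <= x:
            result *= 1 - hazard(a)
    return result

for a in support:
    print(a, hazard(a), survival(a), survival_from_hazards(a))
```

The last two columns should agree at every support point, which illustrates the product formula for the discrete survival function.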

Measuring Central Tendency in Survival

Mean survival (call this $\mu$):
$\mu = \sum_{j=1}^{n} a_j f_j$ for discrete $T$
$\mu = \int_0^{\infty} u f(u)\,du = \int_0^{\infty} S(u)\,du$ for continuous $T$
Mean survival is the area under the survival curve.

Mean residual life: $\mathrm{mrl}(x) = E(X - x \mid X > x)$. For a continuous variable $X$,
$\mathrm{mrl}(x) = \frac{\int_x^{\infty} (t - x) f(t)\,dt}{S(x)} = \frac{\int_x^{\infty} S(t)\,dt}{S(x)}$ (integration by parts).
Ex.: cancer survivors might want to know how long they can expect to live, on average, after 5 years of relapse-free survival. The Census has been reporting remaining life expectancy in years, stratified by gender and race. According to the 2005 data, for women of all races, mrl(0) = 80.4, mrl(65) = 20, and mrl(75) = ( ).

Median survival (call this $\tau$) is defined by
$S(\tau) = 0.5$
In practice, we don't usually hit the median survival at exactly one of the failure times. In this case, the estimated median survival is the smallest time $\tau$ such that
$\hat{S}(\tau) \le 0.5$

The $p$th quantile (also referred to as the $100p$th percentile) of the distribution of $X$, $x_p$, satisfies $S(x_p) \le 1 - p$, i.e. $x_p = \inf\{t : S(t) \le 1 - p\}$.

Example: $X \sim$ exponential($\lambda$). What are the mean, mrl(x) and median?
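The three summaries can be approximated numerically from $S(t)$ alone, which gives a way to check hand calculations for the exponential example. This is a minimal sketch assuming $S(t) = e^{-\lambda t}$ with a hypothetical rate $\lambda = 0.5$; the integration limit and step count are arbitrary illustrative choices.

```python
import math

lam = 0.5  # hypothetical rate, chosen only for illustration

def survival(t):
    return math.exp(-lam * t)

def area_under_survival(start, upper=200.0, steps=200_000):
    """Midpoint-rule approximation to the integral of S(t) over [start, upper].

    `upper` stands in for infinity; S(upper) is negligible for this rate.
    """
    width = (upper - start) / steps
    return sum(survival(start + (i + 0.5) * width) for i in range(steps)) * width

def mean_residual_life(x):
    """mrl(x) = (integral of S(t) from x to infinity) / S(x)."""
    return area_under_survival(x) / survival(x)

def quantile(p, lo=0.0, hi=200.0):
    """x_p = inf{t : S(t) <= 1 - p}, located by bisection (S is decreasing)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if survival(mid) <= 1 - p:
            hi = mid
        else:
            lo = mid
    return hi

print(area_under_survival(0.0), 1 / lam)    # mean vs. 1/lambda
print(mean_residual_life(3.0), 1 / lam)     # mrl(3) vs. 1/lambda
print(quantile(0.5), math.log(2) / lam)     # median vs. log(2)/lambda
```

Each line prints the numerical value next to the closed form it is expected to match; the same functions can be reused for other survival curves by swapping out `survival`.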

Example: $X \sim$ Log-normal($\mu$, $\sigma^2$), what is $x_p$?

Hazard functions can be of different shapes, as shown in Figure 3.

Figure 3: Hazard functions of different shapes (h(x) against x; curves labelled (i) to (v))

(i) constant: e.g. survival of patients with advanced chronic disease
(ii) increasing: e.g. aging after 65
(iii) decreasing: e.g. survival after surgery
(iv) bathtub-shaped: e.g. age-specific mortality
(v) hump-shaped: e.g. tuberculosis

Estimating the survival or hazard function

We can estimate the survival (or hazard) function in two ways:
1. by specifying a parametric model for $h(t)$ based on a particular density function $f(t)$
2. by developing an empirical estimate of the survival function (i.e., non-parametric estimation)

If no censoring: The empirical estimate of the survival function, $\tilde{S}(t)$, is the proportion of individuals with event times greater than $t$.
Ex. 1, 2, 3

With censoring: If there are censored observations, then $\tilde{S}(t)$ is not a good estimate of the true $S(t)$, so other non-parametric methods must be used to account for censoring (life-table methods, Kaplan-Meier estimator).
Ex. 1, 2+, 3
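For the uncensored example (event times 1, 2, 3), the empirical estimate is just a proportion. The sketch below is illustrative; the function name is hypothetical, and it deliberately does not handle the censored example (1, 2+, 3), for which the notes point to life-table or Kaplan-Meier methods.

```python
def empirical_survival(event_times, t):
    """Proportion of individuals whose event time is strictly greater than t.

    Valid only when every event time is fully observed (no censoring).
    """
    return sum(1 for x in event_times if x > t) / len(event_times)

times = [1, 2, 3]  # uncensored event times from the example
for t in [0, 1, 1.5, 2, 3]:
    print(t, empirical_survival(times, t))
```

The estimate is a step function that drops by 1/3 at each observed event time.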

Some Parametric Survival Distributions

1. The Exponential distribution (1 parameter, $\lambda > 0$)

$f(t) = \lambda e^{-\lambda t}$ for $t \ge 0$
$S(t) = \int_t^{\infty} f(u)\,du = e^{-\lambda t}$
$h(t) = \frac{f(t)}{S(t)} = \lambda$ (constant hazard!)
$H(t) = \int_0^t h(u)\,du = \int_0^t \lambda\,du = \lambda t$
Check: Does $S(t) = e^{-H(t)}$?

median: solve $0.5 = S(\tau) = e^{-\lambda \tau}$ $\Rightarrow$ $\tau = \frac{-\log(0.5)}{\lambda}$
mean: $\int_0^{\infty} u \lambda e^{-\lambda u}\,du = \lambda^{-1}$
mrl and median: $X \sim$ exponential($\lambda$). What are the mean, mrl(x) and median?
lack of memory: $\forall\, t_0 > 0$, $T - t_0 \mid T > t_0 \sim T$ (reason? HW)
coef. of variation $= \frac{\text{s.d.}}{\text{mean}} = 1$
empirical check of the data: plot $\log(\hat{S}(t))$ vs. $t$ (should approximate a straight line through the origin). What's the slope? (reason? HW)

If $T$ has an arbitrary continuous dist'n, then $H(T)$ has an exponential dist'n with unit parameter (reason? HW. Hint: $S(T) \sim$ Unif(0, 1) for any arbitrary continuous r.v.)

2. The Weibull distribution (2 parameters), Weibull($\gamma$, $\lambda$)

Generalizes the exponential:
$S(t) = e^{-\lambda t^{\gamma}}$
$f(t) = -\frac{d}{dt} S(t) = \gamma \lambda t^{\gamma - 1} e^{-\lambda t^{\gamma}}$
$h(t) = \gamma \lambda t^{\gamma - 1}$
$H(t) = \int_0^t h(u)\,du = \lambda t^{\gamma}$
$\lambda$: the scale parameter
$\gamma$: the shape parameter

The Weibull distribution is convenient because of its simple form. It allows several hazard shapes:
$\gamma = 1$: constant hazard
$0 < \gamma < 1$: decreasing hazard
$\gamma > 1$: increasing hazard

It is an important generalization of the exponential distribution; it allows for a power dependence of the hazard on time.

empirical check of the data: plot $\log(-\log \hat{S}(t))$ vs. $\log t$. The plot should give approximately a straight line with slope $\gamma$ and intercept $\log \lambda$ (reason?)
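The empirical check rests on the identity $\log(-\log S(t)) = \log \lambda + \gamma \log t$. The sketch below verifies it on the true Weibull survival function rather than on an estimate $\hat{S}(t)$; the shape and scale values are hypothetical.

```python
import math

gamma_shape, lam = 1.7, 0.3  # hypothetical shape and scale values

def survival(t):
    """Weibull survival function S(t) = exp(-lam * t**gamma_shape)."""
    return math.exp(-lam * t ** gamma_shape)

def loglog_point(t):
    """Return (log t, log(-log S(t))) for one time point."""
    return math.log(t), math.log(-math.log(survival(t)))

points = [loglog_point(t) for t in (0.5, 1.0, 2.0, 4.0)]

# Slope and intercept of the line through the first and last transformed points.
(x0, y0), (x1, y1) = points[0], points[-1]
slope = (y1 - y0) / (x1 - x0)
intercept = y0 - slope * x0

print(slope, gamma_shape)          # slope vs. gamma
print(intercept, math.log(lam))    # intercept vs. log(lambda)
```

With real data the points would come from an estimated survival curve, so the line would hold only approximately.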

Figure 4: Hazard functions of Weibull Function

3. log-normal: log-normal distribution (with parameters $\mu$ and $\sigma$)

$f(t) = \frac{1}{\sqrt{2\pi}\,\sigma t}\, e^{-\frac{(\log(t) - \mu)^2}{2\sigma^2}} = \phi\!\left(\frac{\log(t) - \mu}{\sigma}\right) / (t\sigma)$
$S(t) = 1 - \Phi\!\left(\frac{\log(t) - \mu}{\sigma}\right)$
$F(t) = \Phi\!\left(\frac{\log(t) - \mu}{\sigma}\right)$
$\lambda(t) = \frac{f(t)}{S(t)}$ (involves the incomplete normal integral)
where $\phi$ is the density function of the standard normal distribution and $\Phi$ is the cumulative distribution function of the standard normal distribution.

simple to apply if there is no censoring
sensitive to the small failure times
The log-logistic dist'n provides a good approximation to the log-normal distribution (and may frequently be a preferable survival time model)
$\log(T) \sim N(\mu, \sigma^2)$

Figure 5: Hazard functions of Log-Normal Function

4. log-logistic: $X \sim$ Log-logistic($\mu$, $\sigma^2$) if $Y = \ln X \sim$ logistic($\mu$, $\sigma^2$).

$W \sim$ standardized logistic, then $f_W(w) = e^{w} / \{1 + e^{w}\}^2$, $S_W(w) = 1 / \{1 + e^{w}\}$.
$Y = \mu + \sigma W \sim$ logistic($\mu$, $\sigma^2$) with pdf
$f_Y(y) = \frac{e^{(y - \mu)/\sigma}}{\sigma (1 + e^{(y - \mu)/\sigma})^2}$.
$S_X(x) = S_W\!\left(\frac{\ln x - \mu}{\sigma}\right) = \frac{1}{1 + \exp\{\frac{\ln x - \mu}{\sigma}\}} = \frac{1}{1 + \lambda x^{\alpha}}$, where $\alpha = 1/\sigma$ and $\lambda = e^{-\mu/\sigma}$.
$h_X(x) = \frac{\lambda \alpha x^{\alpha - 1}}{1 + \lambda x^{\alpha}}$.

$h_X(x)$ is monotone decreasing when $\alpha \le 1$: it decreases from $\infty$ when $\alpha < 1$ and from $\lambda$ when $\alpha = 1$.
For $\alpha > 1$, $h_X(x)$ increases initially to a maximum value at time $\{(\alpha - 1)/\lambda\}^{1/\alpha}$, and then decreases to 0 as time approaches infinity.

relatively simple explicit forms for $S(t)$, $f(t)$ and $\lambda(t)$ (vs. log-normal)
more convenient in handling censored data than the log-normal distribution
provides a good approximation to the log-normal distribution except in the extreme tails
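The location of the hazard peak for $\alpha > 1$ can be checked against a brute-force grid search. The parameter values and the grid below are hypothetical choices for illustration.

```python
import math

alpha, lam = 2.5, 0.4  # hypothetical values with alpha > 1

def hazard(x):
    """Log-logistic hazard h(x) = lam * alpha * x**(alpha-1) / (1 + lam * x**alpha)."""
    return lam * alpha * x ** (alpha - 1) / (1 + lam * x ** alpha)

# Closed-form location of the maximum stated in the notes.
peak = ((alpha - 1) / lam) ** (1 / alpha)

# Brute-force check on a grid 0.01, 0.02, ..., 20.00.
grid = [0.01 * i for i in range(1, 2001)]
grid_argmax = max(grid, key=hazard)

print(peak, grid_argmax)
```

The grid maximizer should sit within one grid step of the closed-form value, since the hazard rises to a single peak and then falls.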

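5. Gamma distribution: another extension of the exponential distribution.

$X \sim$ gamma($\lambda$, $\gamma$), $\lambda, \gamma > 0$,
$f(x) = \frac{\lambda^{\gamma} x^{\gamma - 1} e^{-\lambda x}}{\Gamma(\gamma)}$

No closed form for $h(\cdot)$ and $S(\cdot)$.
$\gamma = 1$: exponential($\lambda$).
$\gamma \to \infty$: a normal distribution.
$\lambda = 1/2$, $\gamma$ an integer: $\chi^2_{2\gamma}$.
When $\gamma > 1$, $h(x)$ is monotone increasing with $h(0) = 0$ and $h(x) \to \lambda$ as $x \to \infty$.
When $\gamma < 1$, $h(x)$ is monotone decreasing with $h(0) = \infty$ and $h(x) \to \lambda$ as $x \to \infty$.
Not widely used; the Weibull is more popular.

Figure 6: Hazard functions of Gamma Function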

Figure 7: Hazard functions of different shapes (h(x) against x; curves labelled (i) to (v))

(i) constant: e.g. exponential
(ii) increasing: e.g. Weibull ($\gamma > 1$)
(iii) decreasing: e.g. Weibull ($0 < \gamma < 1$)
(iv) bathtub-shaped: e.g. Lifetime Distribution (3 parameters) (see Dimitrakopoulou et al., IEEE Transactions on Reliability, 2007), exponential power distribution with $\alpha < 1$ (Smith-Bain, 1975)
(v) hump-shaped: e.g. log-normal

Why use one versus another?

technical convenience for estimation and inference
explicit simple forms for $f(t)$, $S(t)$, and $h(t)$
qualitative shape of the hazard function

One can usually distinguish between a one-parameter model (like the exponential) and a two-parameter model (like the Weibull or log-normal) in terms of the adequacy of fit to a dataset.

Without a lot of data, it may be hard to distinguish between the fits of various 2-parameter models (e.g., Weibull vs. log-normal).

Choice of distributions

1. convenience for statistical inference
2. existence of explicit, simple forms for $S(t)$, $f(t)$ and $h(t)$
3. capability of representing both over- and under-dispersion relative to the exponential distribution (coef. of variation $= \frac{\text{s.d.}}{\text{mean}}$)
4. qualitative shape of the hazard (monotonicity)
5. behavior of $S(t)$ for small times (guarantee period)
6. behavior of $S(t)$ for large times (medical research)
7. any connection with a special stochastic model of failure

Ways to compare different distributions (to highlight the differences or as a basis for an empirical analysis)

1. not effective to consider the density function directly; concentrate on plotting and tabulating
2. plot $h(t)$ or $\log h(t)$ vs. $t$ or $\log(t)$
3. $H(t)$ or $\log S(t)$ or other transforms vs. $t$ or $\log(t)$

Discrete failure time models: (group cont. data because it's imprecise)
There is no theoretical justification for adopting particular parametric models for discrete failure time data in many applications.

Some properties useful in assessing distributional form

log h(t)                                            H(t), log h(t)
Is it constant? exponential                         linear in t? exponential
Is it linear in t? Gompertz (ρ_0 = 0)               linear in t? Gompertz (ρ_0 = 0)
Is it linear in log t? Weibull                      linear in log t? Weibull
Is it nonmonotonic? Log normal, Log logistic        asymptotically linear in t? Distribution with exponential tail

Regression models for survival data

A typical survival regression setting: Let $X$ be the failure time and $Z^t = (Z_1, \ldots, Z_p)$ be a $p$-dimensional vector of explanatory variables.
Q: What are we going to model?

Approach 1: log-linear
$\ln X = \mu + \gamma^t Z + \sigma W$, $W \sim$ some known distribution $F$; $\mu$, $\sigma$, $\gamma$ unknown.

Three choices of $F$:
1. $F$ is normal, that is $W \sim N(0, 1)$; then $\ln X \sim N(\mu + \gamma^t Z, \sigma^2)$.
2. $F$ is the standard extreme value distribution, that is $f_W(w) = \exp\{w - e^{w}\}$, $-\infty < w < \infty$. Then $X \sim$ Weibull($\alpha$, $\lambda$), with $\alpha = 1/\sigma$, $\lambda = e^{-\left(\frac{\mu}{\sigma} + \frac{\gamma^t Z}{\sigma}\right)}$.
3. $F$ is standardized logistic; then it is the log-logistic regression model.

Two interpretations of $\gamma$:

Let $Z_1$, $Z_2$ be different covariate values; then
$\frac{E(X \mid Z_1)}{E(X \mid Z_2)} = \frac{e^{\mu + \gamma^t Z_1} E(e^{\sigma W})}{e^{\mu + \gamma^t Z_2} E(e^{\sigma W})} = e^{\gamma^t (Z_1 - Z_2)}$
$\ln \frac{E(X \mid Z_1)}{E(X \mid Z_2)} = \gamma^t (Z_1 - Z_2)$.
A unit increase in $Z$ leads to a $\gamma$ increase in the log ratio of the means.

Let $S_0(x) = P(X > x \mid Z = 0)$; then $P(X > x \mid Z) = S_0(x e^{-\gamma^t Z})$. Time is accelerated if $e^{-\gamma^t Z} > 1$ and decelerated if $e^{-\gamma^t Z} < 1$.

Approach 2: multiplicative or additive hazard models

Multiplicative models
$h(x \mid Z = z) = h_0(x)\,c(\beta^t z)$, where $c$ is a non-negative function of covariates.

Cox proportional hazards
$h(x \mid Z) = h_0(x)\,e^{\beta^t z}$
$h_0(x)$ is the baseline hazard; it may be unspecified or parameterized, e.g., $h_0(x) = e^{\alpha_0 + \alpha_1 x + \alpha_2 x^2}$.
$\beta$ is the log hazard ratio when $c(\cdot) = \exp(\cdot)$, and does not depend on time. For each unit increase in $Z$, there is a $\beta$ increase in the log hazard ratio.
$H(x \mid Z) = e^{\beta^t Z} H_0(x)$, and $S(x \mid Z) = (S_0(x))^{e^{\beta^t Z}}$.

Additive models
$h(x \mid Z) = h_0(x) + \sum_{j=1}^{p} z_j(x)\,\beta_j(x)$.
The effects of covariates on survival are allowed to vary with time. Additive models are less frequently used than multiplicative models.
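The proportional hazards relation between $S(x \mid Z)$ and $S_0(x)$ can be illustrated with a small numerical sketch. The Weibull baseline, the coefficient values, and the covariate vector below are all hypothetical; the Cox model itself leaves $h_0$ unspecified.

```python
import math

# Hypothetical baseline: Weibull cumulative hazard H0(x) = lam * x**gam.
lam, gam = 0.2, 1.5
beta = [0.7, -0.3]   # hypothetical log hazard ratios
z = [1.0, 2.0]       # hypothetical covariate vector

def baseline_cumhaz(x):
    return lam * x ** gam

def baseline_survival(x):
    """S0(x) = exp(-H0(x))."""
    return math.exp(-baseline_cumhaz(x))

def relative_risk(z):
    """exp(beta' z), the multiplier applied to the baseline hazard."""
    return math.exp(sum(b * zj for b, zj in zip(beta, z)))

def survival_given_z(x, z):
    """S(x | Z = z) = exp(-exp(beta' z) * H0(x))."""
    return math.exp(-relative_risk(z) * baseline_cumhaz(x))

x = 2.0
# Both expressions should give the same conditional survival probability.
print(survival_given_z(x, z), baseline_survival(x) ** relative_risk(z))
```

Setting every covariate to zero makes the multiplier equal to 1, so the conditional survival reduces to the baseline $S_0(x)$.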


More information

Universität Regensburg Mathematik

Universität Regensburg Mathematik Universität Regensburg Mathematik Modeling financial markets with extreme risk Tobias Kusche Preprint Nr. 04/2008 Modeling financial markets with extreme risk Dr. Tobias Kusche 11. January 2008 1 Introduction

More information
