Consistent Dynamic Affine Mortality Models for Longevity Risk Applications

Craig Blackburn (a,*), Michael Sherris (b)

(a) School of Risk and Actuarial Studies, and CEPAR, University of New South Wales, Sydney, Australia, 2052
(b) School of Risk and Actuarial Studies, and CEPAR, University of New South Wales, Sydney, Australia, 2052

Abstract

This paper proposes and calibrates a consistent multi-factor affine term structure mortality model for longevity risk applications. We show that this model is appropriate for fitting historical mortality rates. Without traded mortality instruments the choice of risk-neutral measure is not unique, and we fix it by fitting to observed historical mortality rates in our framework. We show that the risk-neutral parameters can be calibrated and are relatively insensitive to the historical period chosen. Importantly, the framework provides consistent future survival curves with the same parametric form as the initial curve in the risk-neutral measure. The multiple risk factors allow for applications in pricing and more general risk management problems.

A state-space representation is used to estimate parameters for the model with the Kalman filter. A measurement error variance is included for each age to capture the effect of sample population size. Swedish mortality data is used to assess 2- and 3-factor implementations of the model. A 3-factor model specification is shown to provide a good fit to the observed survival curves, especially for older ages. Bootstrapping is used to derive parameter estimate distributions and residual analysis is used to confirm model fit. We use the Heath-Jarrow-Morton forward rate framework to verify consistency and to simulate cohort survivor curves under the risk-neutral measure.
JEL Classifications: G, G, G3, C3, C5, C5, J

Keywords: Mortality model, longevity risk, multi-factor, affine, arbitrage-free, consistent, Kalman filter, Swedish mortality

* Corresponding author.
Email addresses: c.blackburn@unsw.edu.au (Craig Blackburn), m.sherris@unsw.edu.au (Michael Sherris)

Preprint submitted to Elsevier, February 8,
Electronic copy available at: http://ssrn.com/abstract=834

1. Introduction

In this paper we propose and calibrate a consistent multi-factor dynamic affine mortality model for modelling longevity risk, based on the Affine Term Structure Model (ATSM) (Duffie and Kan [6] and Dai and Singleton [3]). We follow methods similar to those of Lando [] and Schonbucher [5] for modelling the term structure of defaultable bonds. The goal of this paper is to show that multi-factor ATSMs are appropriate models for fitting historical mortality rates. Since there are no traded instruments on mortality risk, the risk-neutral measure is not unique, and for illustration we fix it by fitting to observed mortality rates. The affine framework has an advantage over econometric models in that we can make use of the mathematical tools developed for the risk management and pricing of credit risky bonds. We will show that our chosen risk-neutral dynamics fit historical mortality rates very well when calibrating the model and, in the chosen data-set, are relatively independent of the historical data period chosen.

We encounter a problem similar to that reported by Luciano and Vigna [] when fitting affine processes to our population mortality data-set. By introducing multiple factors, we derive a suitable model that contains both non-mean reverting and mean reverting factors. This allows us to identify and interpret how each factor affects different ages in the data-set. All factors are time-homogeneous and Gaussian, so that closed-form solutions for the survivor curves can be easily derived.

A number of stochastic affine short-rate mortality models have been proposed in the literature, but there is little coverage of the calibration of these models. Dahl and Moller [] and Milevsky and Promislow [3] both propose single factor models for a single cohort. The survival curves need to be solved via numerical techniques, and neither paper analyses whether these mortality models are appropriate for the pricing or management of mortality risk.
Biffis [7] proposes two models for a single cohort. These are a Gaussian mean-reverting process with jumps, with a time-varying mean-reversion level, and a 2-factor stochastic mean-reverting square-root diffusion model. Biffis [7] shows these models are flexible enough to produce a variety of survival curves. Schrager [6] models the whole term structure of mortality rates based on population data using a Thiele or Makeham mortality law. We limit our data-set to the population aged 50 and over and define the force of mortality short-rate as the sum of the individual factors. This simplification provides an excellent fit to the data-set and gives simple closed-form expressions for the survival curves.

Although the proposed short-rate and ATSM models are arbitrage-free, they do not consider the consistency property of the risk-neutral survival curves; Bjork and Christensen [8] provide a discussion of consistency. The consistency requirement is imposed under the risk-neutral dynamics and constrains forecast survival curves to have the same parametric form as the original fitted parametric curves. Models that do not satisfy the consistency property will need to be re-parameterised at future dates. If the model is rich enough to capture the dynamics of mortality rates then re-parameterisation will be unnecessary. As a result the proposed model provides a more reliable basis for hedging and pricing longevity risk transactions.

A way to ensure consistency is to model the observed forward mortality rate as a restricted exponential model (De Rossi [4]). As noted in Milevsky and Promislow [3], traditional mortality rates are forward rates and are analogous to forward interest rates. Chiarella and Kwon [] show that ATSMs, similar to our formulation, can be represented as a forward rate model with a deterministic volatility function. We derive the forward force of mortality process and show that the volatility function meets the consistency requirement.
Unlike econometric models, Lee and Carter [] for example, our time-series dynamics, or transition equations, are defined in the model as latent stochastic factors. The measurement equation is the observed survivor curve, which is exponentially affine in the factors. Representing the ATSM in this state-space form allows the model parameters to be estimated using the Kalman filter (Kalman [9]). An explicit allowance is included for an age-dependent measurement error. Measurement errors are assumed independent between ages and a parametric function of age is used to estimate these errors.

Given our risk-neutral measure, Girsanov's theorem is used to change to a real-world measure to model the historical mortality rates. Instead of assuming a constant price of mortality risk for each factor, as in the completely affine model of Duffie and Kan [6], we adopt the essentially affine model introduced by Duffee [5]. This decouples the risk-neutral parameters from the real-world, or historical, drift.

Swedish data is used to calibrate and assess the model fit. A state-space re-sampling method, as in Stoffer and Wall [7], is used to produce bootstrapped confidence intervals for each of the parameter estimates. Under the model assumptions, the bootstrapped distribution of each parameter is asymptotically normal. Different model assumptions are assessed; these include 2- and 3-factor independent models. We have analysed different model structures and have not found support for factor dependence. We show that a 2-factor model is able to fit the data up to the age of 80, while the 3-factor models are able to capture the majority of data variation over the whole age range in our data-set.

We compare our 3-factor model to the arbitrage-free version of the Nelson-Siegel model (Christensen et al. []). We show this model produces a similar result to our 3-factor model but, due to the factor dependence of the Nelson-Siegel model, it does not meet the consistency requirement.

Section 2 outlines our ATSM methodology and derives our proposed 2- and 3-factor models.
Section 3 summarises the Kalman filter estimation methodology and outlines the Swedish data used to assess the models. We calibrate the proposed models to the data-set and use bootstrapping techniques to obtain a distribution of each parameter estimate. Section 4 provides an analysis of the results from fitting the models to Swedish mortality data. Section 5 shows an application of risk-neutral cohort survivor curves with parameter uncertainty. Section 6 concludes with a summary of the paper.

2. Affine Mortality Model

We apply techniques from credit risky securities, as in Lando [], to mortality modelling. We are interested in the first stopping time, $\tau$, of a hazard process, $\mu(t;x)$, for a person aged $x$ at time zero. We start with a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, where $\mathbb{P}$ is the real-world probability measure. The information at time $t$ is given by $\mathcal{F}_t = \mathcal{G}_t \vee \mathcal{H}_t$. The sub-filtration $\mathcal{G}_t$ contains all financial and actuarial information except the actual time of death. The sub-filtration $\mathcal{H}_t$ is the $\sigma$-algebra with death information.

Let $N(t) := 1_{\{\tau \le t\}}$. If the compensator $A(t) = \int_0^t \mu(s;x)\,ds$ is a predictable process of $N(t)$, then $dM(t) = dN(t) - dA(t)$ is a $\mathbb{P}$-martingale, where $dA(t) = \mu(t;x)\,dt$. There also exists another measure under which $dM(t)$ is a $\mathbb{Q}$-martingale and the compensator becomes $dA(t) = \mu^Q(t;x)\,dt$, where $\mu^Q(t;x) = (1 + \phi(t))\mu(t;x)$ and $\phi(t) \ge -1$, so that the intensity remains non-negative. We set $\phi(t) = 0$ and do not price unsystematic risk. This assumes we are modelling population mortality rates, and we set $\mu^Q(t;x) = \mu(t;x)$.

Definition 1. In the absence of arbitrage there exists an equivalent measure $\mathbb{Q}$ under which $P(t,T;x)$ is the time-$t$ value of a terminal payment $X_T$ made at time $T$. The payment $X_T$, which is a $\mathcal{G}$-adapted process, is conditional on survival to the end of the period $T$; otherwise the value is zero. The time-$t$ value can be represented as

\[ P(t,T;x) = E^Q\left[ e^{-\int_t^T r(s)\,ds}\, X_T\, 1_{\{\tau > T\}} \,\middle|\, \mathcal{F}_t \right] = 1_{\{\tau > t\}}\, E^Q\left[ e^{-\int_t^T [r(s) + \mu(s;x)]\,ds}\, X_T \,\middle|\, \mathcal{G}_t \right]. \tag{1} \]

Proof: We can use the law of iterated expectations to show this; see Bielecki and Rutkowski [6] for detail.

We assume independence between interest and mortality rates and a terminal payment $X_T = 1$. Equation (1) becomes

\[ P(t,T;x) = 1_{\{\tau > t\}}\, E^Q\left[ e^{-\int_t^T r(s)\,ds} \,\middle|\, \mathcal{G}_t \right] E^Q\left[ e^{-\int_t^T \mu(s;x)\,ds} \,\middle|\, \mathcal{G}_t \right] = 1_{\{\tau > t\}}\, B(t,T)\, S(t,T;x), \tag{2} \]

where $B(t,T)$ is the time-$t$ bond price and $S(t,T;x)$ is the risk-neutral survival probability for a cohort aged $x$ at time 0.

This definition lends itself to forward rate modelling in the Heath-Jarrow-Morton (HJM) framework (Heath et al. [8]). Forward mortality models have been proposed by Miltersen and Persson [4] and Barbarin [3]. Bauer et al. [5] and Bauer et al. [4] propose a volatility structure for mortality and a calibration procedure. The calibration of affine models is complicated when dealing with multiple cohorts, and cohort forecasts are exogenous to the model. Rather, Chiarella and Kwon [] show that the ATSM is a special case of the HJM framework with a deterministic volatility function. We use this property to calibrate our model in the ATSM framework using population data. If our risk-neutral parameters are not time-varying, an assumption of our framework, we can use HJM to estimate cohort survival curves. The movement from a population model to a cohort model only requires a change in the initial forward mortality curve. The initial forward curve can be a forecast from the ATSM framework or given exogenously.

In the following model definition we drop the reference to $x$; in the ATSM, $x$ is fixed and refers to the lower age bound in the mortality data-set.

Definition 2.
Using the same filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, the $\mathcal{G}$-adapted process $\mu(t)$ represents the instantaneous mortality rate. There exists an equivalent martingale measure $\mathbb{Q}$ such that the survival probability can be represented as

\[ S(t,T) = E^Q\left[ e^{-\int_t^T \mu(s)\,ds} \,\middle|\, \mathcal{G}_t \right]. \tag{3} \]

The general solution of equation (3) can be shown to be

\[ S(t,T) = e^{-B(t,T)' Z_t + C(t,T)}, \tag{4} \]

where $B(t,T)$ and $C(t,T)$ are the solutions, for our given short-rate mortality process, to the following set of ordinary differential equations:

\[ \frac{dB(t,T)}{dt} = \Delta^{Q\prime} B(t,T) - 1, \qquad B(T,T) = 0, \tag{5} \]

\[ \frac{dC(t,T)}{dt} = -\frac{1}{2} \sum_{j=1}^{n} \left( \Sigma' B(t,T) B(t,T)' \Sigma \right)_{j,j}, \qquad C(T,T) = 0. \tag{6} \]

The instantaneous mortality intensity is defined as the sum of the latent factors,

\[ \mu(t) = 1' Z_t. \]
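To make the role of the two ordinary differential equations concrete, the following is a minimal single-factor sketch, not the authors' implementation. It assumes the sign convention $S(t,T) = e^{-B(t,T) Z_t + C(t,T)}$ with a scalar risk-neutral drift $-\delta Z_t\,dt$, rewrites the equations in time to maturity $\tau = T - t$ (so that $dB/d\tau = 1 - \delta B$ and $dC/d\tau = \tfrac{1}{2}\sigma^2 B^2$, both starting from zero), and integrates them with a classical fourth-order Runge-Kutta scheme. The function names and the parameter values in the usage line are hypothetical.

```python
import math


def _rhs(b, delta, sigma):
    """Right-hand sides of the loading ODEs in time to maturity tau."""
    return 1.0 - delta * b, 0.5 * sigma * sigma * b * b


def solve_loadings(delta, sigma, tau, n_steps=1000):
    """Integrate B and C from 0 to tau with classical RK4.

    Assumes one Gaussian factor with dZ = -delta * Z dt + sigma dW
    under the risk-neutral measure and B(0) = C(0) = 0.
    """
    h = tau / n_steps
    b, c = 0.0, 0.0
    for _ in range(n_steps):
        # The C equation depends on B only, so each stage is driven
        # by the intermediate value of B.
        k1b, k1c = _rhs(b, delta, sigma)
        k2b, k2c = _rhs(b + 0.5 * h * k1b, delta, sigma)
        k3b, k3c = _rhs(b + 0.5 * h * k2b, delta, sigma)
        k4b, k4c = _rhs(b + h * k3b, delta, sigma)
        b += h * (k1b + 2.0 * k2b + 2.0 * k3b + k4b) / 6.0
        c += h * (k1c + 2.0 * k2c + 2.0 * k3c + k4c) / 6.0
    return b, c


def survival_probability(z, delta, sigma, tau):
    """Risk-neutral survival probability exp(-B * z + C) for one factor."""
    b, c = solve_loadings(delta, sigma, tau)
    return math.exp(-b * z + c)


# Hypothetical inputs: factor level 0.01, delta = 0.1, sigma = 0.01, 10 years.
print(survival_probability(0.01, 0.1, 0.01, 10.0))
```

For a Gaussian factor the numerical solution can be compared with the closed-form loadings of the independent-factor models, which is a convenient check before moving to specifications where no closed form is available.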

Proof: See Duffie and Kan [6].

For an $n$-factor model, $B(t,T)'$ is the transpose of the vector of $n$ factor loadings, $C(t,T)$ is a constant and $Z_t$ are the latent factors. The average force of mortality curve is affine in the factors and given by

\[ \bar{\mu}(t,T) = -\frac{1}{T-t} \log[S(t,T)] = \frac{B(t,T)' Z_t - C(t,T)}{T-t}. \tag{7} \]

The $n$ latent factors under the risk-neutral measure are defined by the Gaussian process

\[ dZ_t = -\Delta^Q Z_t\,dt + \Sigma\,dW^Q_t, \]

where $\Delta^Q \in \mathbb{R}^{n \times n}$ and $\Sigma \in \mathbb{R}^{n \times n}$. There is no restriction on the sign of $\Delta^Q$. Each risk-neutral factor can then be one of three processes: mean reverting to zero, non-mean reverting, or a random walk. The covariance matrix, $\Sigma$, is diagonal and the Brownian motions, $W_t$, are independent.

2.1. Multi-Factor Affine Models

In this section we present the solution for 2- and 3-factor ATSMs based on the ordinary differential equations in (5) and (6). In results not presented here, we have tested more general dependent and correlated specifications for the drift and volatility of the risk-neutral process, $Z_t$, but found little evidence to support these more complex models. Instead, in this paper we present 2- and 3-factor models with independent and uncorrelated factors. We compare our results to the Nelson-Siegel model. This model has fewer parameters than our 3-factor model and adds dependence in the drift of one of the factors; this dependence allows the Nelson-Siegel curve to be generated.

Corollary 1. The risk-neutral dynamics of each model are defined as follows.

2-factor independent model:

\[ \begin{pmatrix} dZ_{1,t} \\ dZ_{2,t} \end{pmatrix} = -\begin{pmatrix} \delta_1 & 0 \\ 0 & \delta_2 \end{pmatrix} \begin{pmatrix} Z_{1,t} \\ Z_{2,t} \end{pmatrix} dt + \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix} \begin{pmatrix} dW^Q_{1,t} \\ dW^Q_{2,t} \end{pmatrix} \tag{8} \]

3-factor independent model:

\[ \begin{pmatrix} dZ_{1,t} \\ dZ_{2,t} \\ dZ_{3,t} \end{pmatrix} = -\begin{pmatrix} \delta_1 & 0 & 0 \\ 0 & \delta_2 & 0 \\ 0 & 0 & \delta_3 \end{pmatrix} \begin{pmatrix} Z_{1,t} \\ Z_{2,t} \\ Z_{3,t} \end{pmatrix} dt + \begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & \sigma_3 \end{pmatrix} \begin{pmatrix} dW^Q_{1,t} \\ dW^Q_{2,t} \\ dW^Q_{3,t} \end{pmatrix} \tag{9} \]

3-factor Nelson-Siegel model:

\[ \begin{pmatrix} dZ_{1,t} \\ dZ_{2,t} \\ dZ_{3,t} \end{pmatrix} = -\begin{pmatrix} 0 & 0 & 0 \\ 0 & \delta & -\delta \\ 0 & 0 & \delta \end{pmatrix} \begin{pmatrix} Z_{1,t} \\ Z_{2,t} \\ Z_{3,t} \end{pmatrix} dt + \begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & \sigma_3 \end{pmatrix} \begin{pmatrix} dW^Q_{1,t} \\ dW^Q_{2,t} \\ dW^Q_{3,t} \end{pmatrix} \tag{10} \]

We can solve explicitly for $B(t,T)$ and $C(t,T)$ from equations (5) and (6). For the independent models the factor loadings are

\[ B_i(t,T) = \frac{1 - e^{-\delta_i (T-t)}}{\delta_i}, \tag{11} \]

and the $C(t,T)$ solution for the $n$-factor independent models is

\[ C(t,T) = \sum_{i=1}^{n} \frac{\sigma_i^2}{2\delta_i^3} \left[ \frac{1}{2}\left(1 - e^{-2\delta_i (T-t)}\right) - 2\left(1 - e^{-\delta_i (T-t)}\right) + \delta_i (T-t) \right]. \tag{12} \]

The factor loadings of the Nelson-Siegel model are given by

\[ B_1(t,T) = (T-t), \qquad B_2(t,T) = \frac{1 - e^{-\delta (T-t)}}{\delta}, \qquad B_3(t,T) = \frac{1 - e^{-\delta (T-t)}}{\delta} - (T-t)\, e^{-\delta (T-t)}, \tag{13} \]

and the solution for $C(t,T)$ is given in Christensen et al. [].

Proof: By making use of Definition 2 we can substitute (8) and (9) into (5) and (6) to obtain the above solution. A complete derivation of the arbitrage-free Nelson-Siegel model is given in Christensen et al. [].

By making use of Corollary 1 we can now give the form of the average force of mortality curve that is estimated in our proposed models.

Definition 3. Under the risk-neutral measure, the average force of mortality for the $n$-factor independent models is derived by combining equations (7), (11) and (12) and is given by

\[ \bar{\mu}(t,T) = \frac{1}{T-t} \sum_{i=1}^{n} \left[ \frac{1 - e^{-\delta_i (T-t)}}{\delta_i}\, Z_{i,t} - \frac{\sigma_i^2}{2\delta_i^3} \left( \frac{1}{2}\left(1 - e^{-2\delta_i (T-t)}\right) - 2\left(1 - e^{-\delta_i (T-t)}\right) + \delta_i (T-t) \right) \right], \tag{14} \]

and that of the 3-factor Nelson-Siegel model is derived by combining equations (7) and (13), with the value of $C(t,T)$ given in Christensen et al. []:

\[ \bar{\mu}(t,T) = \frac{1}{T-t} \left[ (T-t)\, Z_{1,t} + \frac{1 - e^{-\delta (T-t)}}{\delta}\, Z_{2,t} + \left( \frac{1 - e^{-\delta (T-t)}}{\delta} - (T-t)\, e^{-\delta (T-t)} \right) Z_{3,t} - C(t,T) \right]. \tag{15} \]

2.2. Change of Measure

So far the derivation has been under our risk-neutral measure. In the absence of pricing information we use the essentially affine model of Duffee [5]. This removes the strong link between the factor loadings, $B(t,T)$, and the real-world or historical drift, and effectively allows us to select different drift terms for the risk-neutral and real-world stochastic processes.

Definition 4. Under the real-world $\mathbb{P}$-measure the $n$-factor stochastic process, $Z_t$, is related to the risk-neutral process by Girsanov's theorem, such that $Z_t$ has the following definition:

\[ dZ_t = K[\Psi - Z_t]\,dt + \Sigma\,dW^P_t, \tag{16} \]

where $K \in \mathbb{R}^{n \times n}$, $\Psi \in \mathbb{R}^{n}$ and $\Sigma \in \mathbb{R}^{n \times n}$.

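Proof: In the essentially affine model we specify the market price of mortality risk as $\Lambda_t = \lambda_0 + \lambda_1 Z_t$, where $\Lambda_t \in \mathbb{R}^{n}$, $\lambda_0 \in \mathbb{R}^{n}$ and $\lambda_1 \in \mathbb{R}^{n \times n}$. By Girsanov's theorem the change of measure from the $n$-factor real-world Brownian motion to the risk-neutral measure is $dW^Q_t = dW^P_t + \Lambda_t\,dt$, and the stochastic process under the $\mathbb{P}$-measure is

\[ dZ_t = [\Theta - \Delta^Q Z_t]\,dt + \Sigma[\lambda_0 + \lambda_1 Z_t]\,dt + \Sigma\,dW^P_t = [\Theta + \Sigma\lambda_0]\,dt - [\Delta^Q - \Sigma\lambda_1] Z_t\,dt + \Sigma\,dW^P_t, \]

where $\Theta$ is the constant term of the risk-neutral drift, which is zero in the specification above. This can be written in the form of (16) with $K = \Delta^Q - \Sigma\lambda_1$ and $K\Psi = \Theta + \Sigma\lambda_0$.

Corollary 2. The long term mean level of mortality rates, $\Psi$, is set to 0. This allows each factor to decay towards zero at a rate defined by $K$. The real-world dynamics of each model are defined as follows.

2-factor independent model:

\[ \begin{pmatrix} dZ_{1,t} \\ dZ_{2,t} \end{pmatrix} = -\begin{pmatrix} \kappa_1 & 0 \\ 0 & \kappa_2 \end{pmatrix} \begin{pmatrix} Z_{1,t} \\ Z_{2,t} \end{pmatrix} dt + \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{pmatrix} \begin{pmatrix} dW^P_{1,t} \\ dW^P_{2,t} \end{pmatrix} \tag{17} \]

3-factor independent and Nelson-Siegel models:

\[ \begin{pmatrix} dZ_{1,t} \\ dZ_{2,t} \\ dZ_{3,t} \end{pmatrix} = -\begin{pmatrix} \kappa_1 & 0 & 0 \\ 0 & \kappa_2 & 0 \\ 0 & 0 & \kappa_3 \end{pmatrix} \begin{pmatrix} Z_{1,t} \\ Z_{2,t} \\ Z_{3,t} \end{pmatrix} dt + \begin{pmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & \sigma_3 \end{pmatrix} \begin{pmatrix} dW^P_{1,t} \\ dW^P_{2,t} \\ dW^P_{3,t} \end{pmatrix} \tag{18} \]

The flexibility of this definition allows the real-world speed of mean reversion, $K$, to be estimated without influencing the risk-neutral parameters. Duffee [5] shows that by adding square-root diffusion factors, the essentially affine specification of the price of risk structure reduces to the completely affine model. We focus on Gaussian dynamics for the model, where the real-world drift and risk-neutral parameters are separately estimated in the Kalman filter.

2.3. Forward Mortality Rate

The ATSM is a special case of the HJM framework. We define the forward mortality rate in the following Proposition.

Proposition 1. The forward mortality dynamics under the risk-neutral measure for the proposed independent models are

\[ d\mu(t,T) = \sum_{i=1}^{n} \left[ \frac{\sigma_i^2}{\delta_i}\, e^{-\delta_i (T-t)} \left(1 - e^{-\delta_i (T-t)}\right) dt + \sigma_i\, e^{-\delta_i (T-t)}\, dW^Q_{i,t} \right] = \nu_\mu(t,T)\,dt + \sigma_\mu(t,T)\,dW^Q_t. \tag{19} \]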
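As a numerical sanity check on the drift term of the Proposition, the sketch below compares it with the HJM no-arbitrage drift, that is, the volatility multiplied by its integral over maturities. It is an illustration rather than part of the original derivation: it assumes a single factor with volatility function $\sigma e^{-\delta (T-t)}$, uses a simple trapezoid rule for the integral, and all function names and parameter values are hypothetical.

```python
import math


def forward_vol(sigma, delta, tau):
    """Forward mortality volatility sigma * exp(-delta * tau), tau = T - t."""
    return sigma * math.exp(-delta * tau)


def hjm_drift_numeric(sigma, delta, tau, n=2000):
    """HJM drift: vol(tau) times the integral of vol(u) for u in [0, tau],
    with the integral approximated by the trapezoid rule."""
    h = tau / n
    total = 0.5 * (forward_vol(sigma, delta, 0.0) + forward_vol(sigma, delta, tau))
    for k in range(1, n):
        total += forward_vol(sigma, delta, k * h)
    return forward_vol(sigma, delta, tau) * total * h


def drift_closed_form(sigma, delta, tau):
    """Closed-form drift (sigma^2 / delta) * e^{-delta tau} * (1 - e^{-delta tau})."""
    decay = math.exp(-delta * tau)
    return sigma ** 2 / delta * decay * (1.0 - decay)


# Hypothetical parameters: a mean-reverting and a non-mean-reverting factor.
for sigma, delta, tau in [(1e-3, 0.1, 20.0), (5e-4, -0.09, 30.0)]:
    print(hjm_drift_numeric(sigma, delta, tau), drift_closed_form(sigma, delta, tau))
```

The two columns agree up to the quadrature error for both signs of $\delta$, so the check covers the mean-reverting and the non-mean-reverting case.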

Proof: From our survivor curve definition, equation (4), we can derive the forward mortality rate as

\[ \mu(t,T) = -\frac{\partial}{\partial T} \log[S(t,T)] = \frac{\partial}{\partial T}\left[ B(t,T)' Z_t - C(t,T) \right] = \sum_{i=1}^{n} \left[ e^{-\delta_i (T-t)} Z_{i,t} - \frac{\sigma_i^2}{2\delta_i^3} \left( \delta_i e^{-2\delta_i (T-t)} - 2\delta_i e^{-\delta_i (T-t)} + \delta_i \right) \right]. \tag{20} \]

Using Ito's Lemma we can determine the dynamics of $\mu(t,T)$ as

\[ d\mu(t,T) = \frac{\partial \mu}{\partial t}\,dt + \sum_{i=1}^{n} \frac{\partial \mu}{\partial Z_{i,t}}\,dZ_{i,t} = \sum_{i=1}^{n} \left[ \left( \delta_i e^{-\delta_i (T-t)} Z_{i,t} - \frac{\sigma_i^2}{2\delta_i^3} \left( 2\delta_i^2 e^{-2\delta_i (T-t)} - 2\delta_i^2 e^{-\delta_i (T-t)} \right) \right) dt + e^{-\delta_i (T-t)}\,dZ_{i,t} \right]. \]

From equations (8) and (9) the risk-neutral $dZ_t$ is defined as

\[ dZ_{i,t} = -\delta_i Z_{i,t}\,dt + \sigma_i\,dW^Q_{i,t} \]

for each factor. Upon substitution,

\[ d\mu(t,T) = \sum_{i=1}^{n} \left[ \left( \delta_i e^{-\delta_i (T-t)} Z_{i,t} - \frac{\sigma_i^2}{2\delta_i^3} \left( 2\delta_i^2 e^{-2\delta_i (T-t)} - 2\delta_i^2 e^{-\delta_i (T-t)} \right) \right) dt - e^{-\delta_i (T-t)} \delta_i Z_{i,t}\,dt + \sigma_i e^{-\delta_i (T-t)}\,dW^Q_{i,t} \right] \]

\[ = \sum_{i=1}^{n} \left[ \frac{\sigma_i^2}{\delta_i}\, e^{-\delta_i (T-t)} \left(1 - e^{-\delta_i (T-t)}\right) dt + \sigma_i e^{-\delta_i (T-t)}\,dW^Q_{i,t} \right]. \tag{21} \]

We can see that equation (21) meets the HJM arbitrage-free drift requirement, given by

\[ \nu_\mu(t,T) = \sigma_\mu(t,T) \int_t^T \sigma_\mu(t,s)'\,ds = \sum_{i=1}^{n} \left[ \sigma_i e^{-\delta_i (T-t)} \int_t^T \sigma_i e^{-\delta_i (s-t)}\,ds \right] = \sum_{i=1}^{n} \left[ \frac{\sigma_i^2}{\delta_i}\, e^{-\delta_i (T-t)} \left(1 - e^{-\delta_i (T-t)}\right) \right], \]

where $\sigma_\mu(t,T)$ is a vector of volatility functions. The volatility function for each model is given in Section 2.4.

2.4. Consistent Forward Mortality Curves

One of the aims of this paper is to propose and assess consistent models for pricing and risk management applications. Consistent forward curves maintain the same functional form of the survival curve across time. The analysis of consistent forward curves by Bjork and Christensen [8] can be applied to our mortality model. We use this analysis to test whether the consistency requirement is met in the proposed 2- and 3-factor independent models and in the Nelson-Siegel model proposed in Christensen et al. []. For each of the models we can identify the vector of volatilities as

\[ \sigma_{2f}(x) = \left[ \sigma_1 e^{-\delta_1 x} \quad \sigma_2 e^{-\delta_2 x} \right], \]

\[ \sigma_{3f}(x) = \left[ \sigma_1 e^{-\delta_1 x} \quad \sigma_2 e^{-\delta_2 x} \quad \sigma_3 e^{-\delta_3 x} \right], \]

\[ \sigma_{NS}(x) = \left[ \sigma_1 \quad \sigma_2 e^{-\delta x} \quad \sigma_3\, \delta x\, e^{-\delta x} \right]. \]

These represent the 2-factor, 3-factor and Nelson-Siegel models respectively, with $x = T - t$. With these volatilities under the risk-neutral measure, the proposed independent 2- and 3-factor models meet the consistency requirement, while the Nelson-Siegel model, although arbitrage-free, is not consistent.

3. Model Calibration

3.1. Kalman Filter

The Kalman filter is an optimal linear estimator (Kalman [9]). The filter is optimal in the sense that it minimises the error covariance, and it has been increasingly used in financial applications, including the estimation of affine term structure models (Babbs and Nowman [] and Andersen and Lund []).

Our model can easily be represented in state-space form. From Definition 3, we have the measurement equations for the proposed independent and Nelson-Siegel models, equations (14) and (15) respectively. We introduce a measurement error, $\psi_t$, to these equations, since there are many more ages in our model than factors. This is given by

\[ \bar{\mu}(t,T) = \frac{B(t,T)' Z_t - C(t,T)}{T-t} + \psi_t, \qquad \psi_t \sim N(0, R). \tag{22} \]

The measurement noise covariance matrix, $R \in \mathbb{R}^{m \times m}$, is assumed to be time-invariant, and $m$ represents the number of ages in the data-set. We assume measurement noise is independent between ages, so that $R$ is a diagonal matrix. Rather than estimating a separate measurement error for each age, we estimate the error with a simple parametric curve; this reduces the number of parameters to be estimated to three.
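The parametric form used for the diagonal of the covariance matrix, with all other entries zero, is

\[ R(t,T) = \frac{1}{T-t} \sum_{i=1}^{T-t} \left[ r_c + r_1 e^{r_2 i} \right], \tag{23} \]

where the values of $r_c$, $r_1$ and $r_2$ are estimated as part of the optimal parameter-set $\hat{\theta}$.

The transition system of equations is given by the real-world dynamics described for each model in Corollary 2. The discretisation of the transition system, equation (16), is given by

\[ Z_{t_i} = e^{-K(t_i - t_{i-1})} Z_{t_{i-1}} + \eta_{t_i}, \qquad \eta_{t_i} \sim N(0, Q_{t_i}), \]

\[ Q_{t_i} = E\left[ \left( \int_{t_{i-1}}^{t_i} e^{-K(t_i - s)} \Sigma\,dW^P_s \right) \left( \int_{t_{i-1}}^{t_i} e^{-K(t_i - s)} \Sigma\,dW^P_s \right)' \right]. \]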
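The transition step can be sketched for a single factor as follows. This is a minimal illustration, assuming a diagonal $K$ and $\Psi = 0$, so that each factor is a scalar Ornstein-Uhlenbeck process; in that case the innovation variance has the standard closed form $\sigma^2 (1 - e^{-2\kappa \Delta t}) / (2\kappa)$, which tends to $\sigma^2 \Delta t$ as $\kappa \to 0$ (the random walk case). The function names and the parameter values are hypothetical and are not the estimates reported in the paper.

```python
import math
import random


def exact_transition(kappa, sigma, dt=1.0):
    """Return (phi, q) for Z_new = phi * Z_old + eta, eta ~ N(0, q).

    phi = exp(-kappa * dt); q is the variance of the stochastic integral
    of exp(-kappa * (t_i - s)) * sigma dW_s over one time step.
    """
    phi = math.exp(-kappa * dt)
    if abs(kappa) < 1e-12:
        q = sigma ** 2 * dt  # random walk limit
    else:
        q = sigma ** 2 * (1.0 - math.exp(-2.0 * kappa * dt)) / (2.0 * kappa)
    return phi, q


def simulate_factor(z0, kappa, sigma, n_steps, dt=1.0, seed=0):
    """Simulate one latent factor path on a regular time grid."""
    rng = random.Random(seed)
    phi, q = exact_transition(kappa, sigma, dt)
    path = [z0]
    for _ in range(n_steps):
        path.append(phi * path[-1] + rng.gauss(0.0, math.sqrt(q)))
    return path


# Hypothetical annual simulation of a slowly mean-reverting factor.
print(simulate_factor(z0=0.01, kappa=0.05, sigma=4e-4, n_steps=5))
```

Because the step uses the exact conditional mean and variance, the simulated path has the correct distribution for any step length, which matters when the data are observed annually.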

Under the assumptions of the state-space model, where all the prediction errors are Gaussian, the Kalman filter calculates a multivariate normal likelihood. The log-likelihood is computed as

\[ L(\theta) = -\frac{1}{2} \sum_{t=1}^{N} \left[ n \log(2\pi) + \ln|S_t| + \epsilon_t' S_t^{-1} \epsilon_t \right], \]

where $\epsilon_t$ is the prediction error and $S_t$ is the residual variance-covariance matrix. Non-linear optimisation is required to find the optimal parameter-set, $\hat{\theta}$, that maximises the likelihood function, summarised as $\hat{\theta} = \arg\max_\theta L(\theta)$.

3.2. Bootstrapping State-Space Models

Due to the flat likelihood function around the optimal parameter-set, the Hessian matrix is numerically unstable. Instead we use a state-space bootstrap method to re-sample the observed data with the standardised residuals from the optimal parameter-set, $\hat{\theta}$. This produces a distribution of parameter estimates that should be asymptotically Gaussian. The re-sampling method is detailed in Stoffer and Wall [7]. The process used is as follows:

1. Determine the standardised residuals from the optimal parameter-set,

\[ \hat{e}_t = \hat{S}_t^{-1/2}\, \hat{\epsilon}_t. \tag{24} \]

2. Re-sample the standardised residuals with replacement to generate a bootstrap set of residuals, $e^*_t$, excluding the first four residuals because of Kalman filter start-up irregularities.

3. Recompute the transition and measurement equations given the bootstrap residuals, $e^*_t$:

\[ Z^*_{t+1} = e^{-\hat{K}} Z^*_t + \hat{\mathcal{K}}_t\, \hat{S}_t^{1/2}\, e^*_t, \qquad \bar{\mu}^*_t = \hat{B} Z^*_t - \hat{C} + \hat{S}_t^{1/2}\, e^*_t, \]

where $\hat{\mathcal{K}}_t$ denotes the Kalman gain.

4. Use the bootstrap data-set, $\bar{\mu}^*_t$, to estimate a new set of parameters that maximise the likelihood, $\hat{\theta}^* = \arg\max_{\theta^*} L(\theta^*)$, where $\theta^*$ is the parameter set determined from the bootstrapped data-set.

5. Repeat steps 2 to 4 five hundred (500) times to obtain a sample distribution for each of the parameter estimates.

3.3. Mortality Data

The survival probability is determined from the mortality data-set as

\[ S(t,T) = \prod_{s=0}^{T-t-1} \left( 1 - q(x+s, t) \right), \qquad q(x,t) = 1 - e^{-m(x,t)}, \]

where the one-year death probability is $q(x,t)$, for a person aged $x$ in year $t$, and $m(x,t)$ is the central rate of mortality, which is calculated from the mortality data-set as

\[ m(x,t) = \frac{D(x,t)}{E(x,t)} = \frac{\text{number of deaths aged } x \text{ in year } t}{\text{exposure aged } x \text{ in year } t}. \]

Population exposures and the number of deaths are sourced from the Human Mortality Database for Sweden for the years 1910 to 2007. In the data-set the death data is taken from the recorded number of deaths in a year, whereas population exposures by age are mostly estimates. In the first half of the 20th century accurate population estimates were only available in census years, which occurred every ten years; the population is estimated between census years. Since the 1970s a registry system has been used to track population estimates on a yearly basis. As a result, the mortality rates prior to the 1970s include smoothing of the population estimates.

Figure 1: Population $\bar{\mu}(t,T)$ (axes: Year, Probability)

Figure 1 plots the average force of mortality, $\bar{\mu}(t,T)$, for Swedish males aged 50 and over for the years 1910 to 2007. The figure clearly shows the improvement in mortality over the course of the 20th century and the exponential shape of the curve. Mortality improvements have occurred at differing rates for different ages.

Figure 2 gives the mean and first two principal components of $\bar{\mu}(t,T)$ for the Swedish data-set. The mean and first principal component explain 96.8% of the variation in the data. This suggests that 2- and 3-factor mortality models should parsimoniously capture variations in observed mortality data. This figure gives insight into the appropriate definition of the risk-neutral dynamics of the models. Non-mean reverting factors will produce factor loadings that are exponentially increasing with age. By adding dependence, as in the Nelson-Siegel model, we can produce various shapes in the factor loadings that may improve the fit of the model.

Figure 2: Principal Component Analysis (series: Mean Component, 1st PC (96.8%), 2nd PC (98.7%))

4. Calibration Results

4.1. Model Goodness of Fit

Table 1 shows the maximum log-likelihood and the Root Mean Squared Error (RMSE) for each of the models. The 3-factor models have similar log-likelihood and RMSE; both provide a better fit than the 2-factor model. The 3-factor independent model estimates 5 parameters and 294 latent factors (3 factor estimates for each of the 98 years of data we tested).

                           3-Factor       3-Factor         2-Factor
                           Independent    Nelson-Siegel    Independent
Log Likelihood             348            353              384
RMSE                       .8             .9               .89
No. of Model Parameters    5              3
No. of Factors Estimated   294            294              196
AICb                                      -                444

Table 1: Comparison of log-likelihood, RMSE and the number of parameters estimated for each model. An AICb test is also performed.

ML            2-Factor      95% Confidence        3-Factor      95% Confidence        3-Factor        95% Confidence
Estimates     Independent   Lower      Upper      Independent   Lower      Upper      Nelson-Siegel   Lower      Upper
δ_1 (NS: δ)   -.963         -.64       -.9448     -.8           -.79       -.8958     -.593           -.578      -.763
δ_2           .55           .37        .3354      .498          -.494      .669       -               -          -
δ_3           -             -          -          -.488         -.5667     -.39       -               -          -
κ_1           .543          .7         .587       .46           .794       .58        .3              .585       .684
κ_2           .3            .8         .39999     .37           .675       .338       .65             .4         .649
κ_3           -             -          -          .44           .5         .44        .37             .          .3446
σ_1           4.33e-4       3.79e-4    4.784e-4   3.94e-4       .585e-4    5.97e-4    4.89e-3         3.69e-3    5.43e-3
σ_2           7.567e-3      5.35e-3    8.4e-3     4.866e-4      4.376e-4   9.684e-4   .498e-4         .545e-4    .584e-4
σ_3           -             -          -          6.43e-5       5.95e-5    .335e-4    3.849e-5        3.844e-5   7.33e-5
r_1           5.6e-         9.7e-3     5.395e-    .e-           6.6e-3     .79e-      7.4e-3          3.e-3      7.7e-3
r_2           .39886        .3936      .4775      .38457        .36794     .454       .3974           .3969      .444
r_c           .48e-8        .73e-8     .36e-8     .457e-8       .6e-8      .635e-8    .896e-8         .49e-8     .95e-8

Table 2: Model fit results with Bootstrap Confidence Intervals

A modified Akaike Information Criterion (AIC) test for state-space models is performed (Cavanaugh and Shumway [9]). The AICb test uses the bootstrapped log-likelihood results to estimate the penalty term. There is no improvement in using the 3-factor independent model over the 3-factor Nelson-Siegel model, while the 2-factor model is rejected. The AICb test is calculated as

\[ AICb = -2\log L(\hat{\theta}) + 2\left[ \frac{1}{N} \sum_{i=1}^{N} \left( -2\log L(\hat{\theta}^*_i) \right) - \left( -2\log L(\hat{\theta}) \right) \right], \tag{25} \]

where $N$ is the number of bootstrap distributions, $L(\hat{\theta}^*_i)$ is the maximum likelihood for the $i$th bootstrap distribution and $L(\hat{\theta})$ is our best estimate. The number of bootstrap distributions used in all the models is 500.

4.2. Parameter Estimates

Table 2 gives the estimated parameters for the 2- and 3-factor independent models and the 3-factor Nelson-Siegel model, along with 95% confidence intervals from the state-space bootstrapping method described in Section 3.2. All the $\delta$ parameters of the 2-factor independent and Nelson-Siegel models are significant. The confidence interval for the $\delta_2$ parameter of the 3-factor independent model includes zero, indicating this parameter is not significant.
This would reduce the factor loading to a constant and weigh all ages equally. The real-world drift parameters, $\kappa$, are difficult to estimate, with wide confidence intervals and a number of outliers. A value close to zero indicates a random walk, while large values indicate the factor quickly reverts to its long term mean of 0. The volatility matrix, $\Sigma$, and the measurement variance parameters $r_1$, $r_2$ and $r_c$ are significant for all models, with only a small number of outliers. These results show that although the real-world drift is difficult to estimate, and hence forecast, the risk-neutral drift and diffusion parameters are relatively stable.

4.3. Residual Analysis

Figure 3 shows the standardised force of mortality residuals, $\hat{m}$, for the 2- and 3-factor independent models. The 3-factor Nelson-Siegel model is very similar to the 3-factor independent model. These residuals are re-constructed from the fitted $\bar{\mu}(t,T)$ residuals, $\hat{e}$. The relationship is given as

\[ \hat{e}_{x,t}(\tau) = \frac{1}{\tau} \sum_{i=0}^{\tau-1} \hat{m}_{x+i,t}, \tag{26} \]

where $\tau$ is the number of years above the lower age boundary, $x$. In Figure 3 we only distinguish between positive and negative signs of the force of mortality residuals. If the model is a good fit, the residuals should be independently distributed. Figure 3a shows the 2-factor independent model's residuals; we can see that prior to 1970 the 2-factor model provides a good fit. After 1970 there has been a structural change in Swedish mortality rates and the 2-factor model cannot capture this change. The 3-factor models can capture this change, and the residuals indicate a good fit for the whole data-set.

Figure 3: Model Residuals. (a) 2-Factor Independent Model Residuals; (b) 3-Factor Independent Model Residuals (horizontal axis: Year)

4.4. In-Sample Analysis

We can show the structural change in mortality rates from the in-sample analysis of each of the models. Figure 4 shows the percentage error of the survival curve estimates for each of the models for an earlier and a later sample year, in panels (a) and (b) respectively. The scale of each percentage error plot is different above and below the age of 85. In the earlier year the 2-factor model is comparable to the 3-factor models, while in the later year the 2-factor model's percentage error increases exponentially over the age of 80.

Figure 4: 2- and 3-Factor Survival Curve Fit Errors. (a) Fit error, earlier sample year; (b) Fit error, later sample year (vertical axis: Percentage Error; series: 2-Factor Independent, 3-Factor Independent, 3-Factor Nelson-Siegel)

Figure 5 summarises the fit results with the Mean Absolute Relative Error (MARE) for all time periods. The 3-factor models provide a better fit, but the percentage error of all models is very low under age 80. Over age 80 the 3-factor models are required to capture the variation in the survival curve, indicating that one of the factors is modelling the population after 1970. There are limited differences between the 3-factor models.

Figure 5: 2- and 3-Factor Survival Curve MARE (vertical axis: Percentage Error; series: 2-Factor Independent, 3-Factor Independent, 3-Factor Nelson-Siegel)

4.5. Estimated Factors and Factor Loadings

The latent factors and the factor loadings for the 2-factor independent model are shown in Figure 6. The first factor loading, $B_1$, is exponentially increasing with age and shows the general mortality trend. The corresponding factor, $Z_1$, shows a slowly improving mortality rate between 1910 and 1970, with a fairly large variation between years. Since the 1970s this trend has changed; mortality rates are improving at an increased rate with a lower volatility between years. The second factor, $Z_2$, decreases between 1910 and 1960. Since 1960, the volatility is lower and the downward trend has stopped. Its corresponding factor loading, $B_2$, is downward sloping, which indicates greater sensitivity to the factor at younger ages. Prior to 1960 the mortality rates of the younger age population were more sensitive to the mortality improvements due to $Z_2$, and prior to 1970 the older ages saw relatively small mortality improvements. Since 1970, mortality rates at all ages have been improving, although the trend in $Z_2$ has stopped.

Figure 6: 2-Factor Independent Model. (a) Factors; (b) Factor Loadings

The factors and factor loadings for the 3-factor independent model are shown in Figure 7. The first two factors and factor loadings are very similar to the 2-factor model results. The same downward trend of $Z_2$ also stops around 1960, while the factor loading, $B_2$, shows that the older ages are more sensitive to this factor than in the 2-factor model. After 1970, the general downward trend of mortality rates is given by $Z_1$ and factor loading $B_1$. The additional factor and factor loading, $Z_3$ and $B_3$, capture the mortality trend after 1970; we can see that the scale of $B_3$ is significantly larger than $B_1$ in the old age population. This factor is level until the 1960s and captures some of the old age volatility, then increases at a constant rate. This implies that the mortality rates of the older ages are improving at a slower rate than those of the rest of the population after 1970.

Figure 7: 3-Factor Independent Model. (a) Factors; (b) Factor Loadings

The Nelson-Siegel factor loadings have only 1 free parameter. This introduces factor dependence, and one of the factor loadings is constant, indicating that the factor has equal weight on all ages. This loss of flexibility gives only a slightly worse fit than the 3-factor independent model.

4.6. Model Robustness

Figure 8 shows the robustness of the estimated factors and factor loadings to the starting year of the data-set. The 3-factor independent model is re-run a number of times with different starting years. We test every 10 years from 1910 to 1980; the data-set still finishes in the year 2007. The factors $Z_1$ and $Z_2$ show almost identical fit results for each starting year, while $Z_3$, which affects the old age population, shows a very similar shape for each starting year, although there is an offset. One of our modelling requirements is that the parameters of the risk-neutral dynamics, given in Corollary 1, remain relatively stable across time. Factor loading $B_1$ is very stable for each starting year, while $B_2$ fluctuates between an increasing and a decreasing function of age. This can be seen in Table 2, where parameter $\delta_2$ of the 3-factor independent model is not significant. Factor loading $B_3$ represents the old age population; it is relatively stable, with the function increasing as the start year increases.

Figure 8: Robustness test. (a) Factor $Z_1$; (b) Factor $Z_2$; (c) Factor $Z_3$; (d) Factor $B_1$; (e) Factor $B_2$; (f) Factor $B_3$ (one series per starting year, 1910 to 1980)

              Starting Year
Volatility    1910       1920       1930       1940       1950       1960       1970       1980
σ_1           3.9e-4     .65e-4     .98e-4     .74e-4     .5e-4      .39e-4     .3e-4      .3e-4
σ_2           4.87e-4    3.74e-4    .7e-4      .76e-4     3.9e-4     .5e-4      .64e-4     .4e-4
σ_3           6.43e-5    4.99e-5    4.4e-5     3.5e-5     .94e-5     .9e-5      .73e-5     3.58e-5

Table 3: Fitted volatility for different starting years

Table 3 shows the volatility of the fitted model for each starting year. We expect to see higher volatility prior to 1970 due to the recording method of the population estimates. As we remove the earlier population mortality rates from the data-set, the volatility of the fitted stochastic processes generally decreases, with the volatility becoming more stable in the later years.

5. Consistent Survivor Curves

Population forecasts are easily generated for the whole term structure from the Kalman filter. We extract the estimated mortality rates for a cohort aged 50 in 2008 and use this as our initial forward mortality curve $\mu(0,T)$. Using the forward mortality rate definition in equation (19) and our calibrated volatility function, we use Monte Carlo HJM simulations (Glasserman [7]) to produce a distribution of the survival curve.
This approach is preferred to simulating the short-rate process, as in Schrager [26], where only initial factor values are given to define the cohort.
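The simulation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: each independent factor is assumed to contribute a forward-rate volatility of the form sigma_k * exp(-delta_k * (T - t)), the initial curve is a hypothetical Gompertz-shaped curve, and the values of `sigma` and `delta` are placeholders rather than the fitted estimates. The drift is the discrete no-arbitrage drift commonly used for HJM discretisations (in the spirit of Glasserman's treatment), chosen so that the mean simulated survival curve reproduces the initial curve under the risk-neutral measure. The function name `simulate_survival_hjm` and the 5%/95% band are likewise illustrative choices.

```python
import numpy as np


def simulate_survival_hjm(mu0, sigma, delta, h=1.0, n_paths=4000, seed=0):
    """Simulate cohort survival curves from a Gaussian HJM forward mortality model.

    mu0[j] is the initial forward mortality rate mu(0, t_j) on the grid t_j = j*h.
    Factor k is ASSUMED to have forward-rate volatility
    sigma[k] * exp(-delta[k] * (T - t)); factors are independent.
    Returns an array of shape (n_paths, len(mu0) + 1) whose column j holds the
    simulated probability of surviving from time 0 to time t_j.
    """
    rng = np.random.default_rng(seed)
    mu0 = np.asarray(mu0, dtype=float)
    sigma = np.asarray(sigma, dtype=float)[:, None]
    delta = np.asarray(delta, dtype=float)[:, None]
    n_steps = mu0.size
    n_factors = sigma.shape[0]

    fwd = np.tile(mu0, (n_paths, 1))            # fwd[p, j] = mu(t_i, t_j) on path p
    log_surv = np.zeros((n_paths, n_steps + 1))

    for i in range(n_steps):
        # Survive [t_i, t_{i+1}) at the current spot mortality rate mu(t_i, t_i).
        log_surv[:, i + 1] = log_surv[:, i] - fwd[:, i] * h
        if i + 1 == n_steps:
            break
        # Volatility of each remaining forward rate, by time to maturity.
        tau = h * np.arange(1, n_steps - i)
        vol = sigma * np.exp(-delta * tau)      # shape (n_factors, n_steps - i - 1)
        # Discrete HJM drift: keeps discounted survival probabilities martingales.
        cum = np.cumsum(vol * h, axis=1)
        cum_prev = np.hstack([np.zeros((n_factors, 1)), cum[:, :-1]])
        drift = ((cum**2 - cum_prev**2) / (2.0 * h)).sum(axis=0)
        shocks = rng.standard_normal((n_paths, n_factors)) @ vol
        fwd[:, i + 1:] += drift * h + np.sqrt(h) * shocks

    return np.exp(log_surv)


# Hypothetical inputs: Gompertz-like initial curve over 50 annual steps,
# and placeholder volatility parameters (not the paper's estimates).
mu0 = 0.003 * np.exp(0.1 * np.arange(50))
surv = simulate_survival_hjm(mu0, sigma=[2e-4, 2e-4, 5e-5], delta=[-0.08, 0.05, -0.10])
lower, upper = np.percentile(surv, [5, 95], axis=0)   # illustrative band
initial_curve = np.concatenate([[1.0], np.exp(-np.cumsum(mu0))])
```

Comparing `surv.mean(axis=0)` with `initial_curve` gives a quick consistency check: under these assumptions the two should agree up to Monte Carlo error. Because the model is Gaussian, individual paths can produce negative mortality rates; the sketch does not correct for this.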

We run, HJM simulations to generate 9% confidence intervals shown in Figure 9. Parameter uncertainty is introduced through sampling from either a Gaussian approximation or the bootstrap distribution of the parameter estimates from the risk-neutral process. Figure 9a shows simulation results when the initial forward mortality curve is known with certainty, i.e. the curve is given exogenously to the model. In this framework we can also simulate the case when the initial forward mortality curve is endogenously estimated from our optimal parameter set, shown in Figure 9b. Uncertainty in the initial forward mortality curve is due to the distribution of the real-world drift parameters, K, in equation (6). From our analysis in Section 4, the real-world drift parameters have a number of outliers and are difficult to estimate. Figure 9 shows that our Gaussian assumption for the risk-neutral drift and volatility parameters is valid. The bootstrap distribution does not diverge far from the no-uncertainty case. In contrast, when we introduce parameter uncertainty in the real-world drift there is a large divergence in the upper confidence interval when sampling from the bootstrap distribution. This problem of forecasting mortality rates is not limited to this framework, but this framework allows these risks to be quantified.

Figure 9: 3-Factor Independent Model Risk-Neutral Distribution. Panels: (a) Certain initial forward curve, (b) Uncertain initial forward curve. Series: Forward Mortality Rate, No Parameter Uncertainty, Gaussian Distribution, Bootstrap Distribution. Vertical axis: Survival Probability.

6. Conclusion

The results in this paper show that the Affine Term Structure Model (ATSM) is an appropriate tool for modelling mortality data.
We have derived a closed-form risk-adjusted survival curve and shown modelling results under an essentially affine transform. We have also derived our models in a consistent framework. The models share a problem familiar from interest rate modelling in this framework: the real-world drift parameters have large confidence intervals and are statistically insignificant. Analysing the bootstrap distribution of the parameter estimates explains where some of this difficulty lies. The bootstrap distribution also highlights the problem of assuming Gaussian standard errors when calculating confidence intervals. The goal of the framework is to produce consistent forecasts that can be used in pricing and risk management applications; the ATSM framework is a convenient and well-developed tool for this purpose. We show, using Swedish mortality data, that the model can be readily implemented. The flexibility of the model allows it to have a wide range of applications in the pricing and risk management of mortality risk.

7. Acknowledgement

The authors acknowledge the financial support of ARC Linkage Grant Project LP883398 Managing Risk with Insurance and Superannuation as Individuals Age with industry partners PwC and APRA and the Australian Research Council Centre of Excellence in Population Ageing Research (project number CE9). Comments from anonymous reviewers much improved an earlier version of the paper.

[1] T. Andersen, J. Lund, Estimating continuous-time stochastic volatility models of the short-term interest rate, Journal of Econometrics 77 (1997) 343–377.
[2] S.H. Babbs, K.B. Nowman, Kalman Filtering of Generalized Vasicek Term Structure Models, The Journal of Financial and Quantitative Analysis 34 (1999) 115.
[3] J. Barbarin, Heath-Jarrow-Morton modelling of longevity bonds and the risk minimization of life insurance portfolios, Insurance: Mathematics and Economics 43 (2008) 41–55.
[4] D. Bauer, M. Börger, J. Ruß, On the pricing of longevity-linked securities, Insurance: Mathematics and Economics 46 (2010) 139–149.
[5] D. Bauer, M. Börger, J. Ruß, H.J. Zwiesler, The Volatility of Mortality, Asia-Pacific Journal of Risk and Insurance 3 (2008) 35.
[6] T. Bielecki, M. Rutkowski, Credit Risk: Modelling, Valuation and Hedging, Springer-Verlag, 2002.
[7] E. Biffis, Affine processes for dynamic mortality and actuarial valuations, Insurance: Mathematics and Economics 37 (2005) 443–468.
[8] T. Bjork, B.J. Christensen, Interest Rate Dynamics and Consistent Forward Rate Curves, Mathematical Finance 9 (1999) 323–348.
[9] J.E. Cavanaugh, R.H. Shumway, A Bootstrap Variant of AIC for State-Space Model Selection, Statistica Sinica 7 (1997) 473–496.
[10] C. Chiarella, O.K. Kwon, Classes of Interest Rate Models under the HJM Framework, Asia-Pacific Financial Markets 8 (2001).
[11] J.H.E. Christensen, F.X. Diebold, G.D. Rudebusch, An arbitrage-free generalized Nelson–Siegel term structure model, Econometrics Journal 12 (2009) C33–C64.
[12] M. Dahl, T. Moller, Valuation and hedging of life insurance liabilities with systematic mortality risk, Insurance: Mathematics and Economics 39 (2006) 193–217.
[13] Q. Dai, K. Singleton, Specification analysis of affine term structure models, The Journal of Finance 55 (2000) 1943–1978.
[14] G. De Rossi, Kalman filtering of consistent forward rate curves: a tool to estimate and model dynamically the term structure, Journal of Empirical Finance 11 (2004) 277–308.
[15] G.R. Duffee, Term Premia and Interest Rate Forecasts in Affine Models, The Journal of Finance 57 (2002) 405–443.
[16] D. Duffie, R. Kan, A yield-factor model of interest rates, Mathematical Finance 6 (1996) 379–406.
[17] P. Glasserman, Monte Carlo Methods in Financial Engineering, Springer, 1st edition, 2003.
[18] D. Heath, R. Jarrow, A. Morton, Bond Pricing and the Term Structure of Interest Rates: A New Methodology for Contingent Claims Valuation, 1992.
[19] R. Kalman, A new approach to linear filtering and prediction problems, Journal of Basic Engineering 82 (1960) 35–45.
[20] D. Lando, On Cox processes and credit risky securities, Review of Derivatives Research 2 (1998) 99–120.
[21] R. Lee, L. Carter, Modeling and forecasting US mortality, Journal of the American Statistical Association 87 (1992) 659–671.
[22] E. Luciano, E. Vigna, Non mean reverting affine processes for stochastic mortality, International Centre for Economic Research 4/2005 (2005).
[23] M. Milevsky, S.D. Promislow, Mortality derivatives and the option to annuitise, Insurance: Mathematics and Economics 29 (2001) 299–318.
[24] K.R. Miltersen, S.A. Persson, Is mortality dead? Stochastic forward force of mortality rate determined by no arbitrage, Working Paper, University of Bergen (2005).
[25] P.J. Schonbucher, Term structure modelling of defaultable bonds, Review of Derivatives Research 2 (1998) 161–192.
[26] D. Schrager, Affine stochastic mortality, Insurance: Mathematics and Economics 38 (2006) 81–97.
[27] D.S. Stoffer, K.D. Wall, Resampling in State Space Models, State Space and Unobserved Component Models: Theory and Applications (2004) 7 58.