N-State Endogenous Markov-Switching Models

Shih-Tang Hwu, Chang-Jin Kim, Jeremy Piger

This Draft: January 2017

Abstract: We develop an N-regime Markov-switching regression model in which the latent state variable driving the regime switching is endogenous. The model admits a wide variety of patterns of correlation between the state variable and the regression disturbance term, while still maintaining computational feasibility. We provide an iterative filter that generates objects of interest, including the model likelihood function and estimated regime probabilities. The model provides for a simple test of the null hypothesis of exogenous switching. Using simulation experiments, we demonstrate that the maximum likelihood estimator performs well in finite samples and that a likelihood ratio test of exogenous switching has good size and power properties. We provide results from two applications of the endogenous switching model: a three-state model of U.S. business cycle dynamics and a three-state volatility model of U.S. equity returns. In both cases we find statistically significant evidence in favor of endogenous switching.

Keywords: Regime-switching, business cycle asymmetry, nonlinear models, volatility feedback, equity premium

JEL Classifications: C13, C22, E32, G12

We thank seminar participants at the Federal Reserve Bank of St. Louis, the University of Oregon, the University of Washington, the University of California, Santa Barbara, and the 2017 ASSA meetings for helpful comments.

Shih-Tang Hwu: Department of Economics, University of Washington (hwus@uw.edu)
Chang-Jin Kim: Department of Economics, University of Washington (changjin@u.washington.edu)
Jeremy Piger: Department of Economics, University of Oregon (jpiger@uoregon.edu)

1 Introduction

Regression models with time-varying parameters have become a staple of the applied econometrician's toolkit. A particularly prevalent version of these models is the Markov-switching regression of Goldfeld and Quandt (1973), in which parameters switch between some finite number of regimes, and this switching is governed by an unobserved Markov process. Hamilton (1989) makes an important advance by extending the Markov-switching framework to an autoregressive process, and providing an iterative filter that produces both the model likelihood function and filtered regime probabilities. Hamilton's paper initiated a large number of applications of Markov-switching models, and these models are now a standard approach to describe the dynamics of many macroeconomic and financial time series. Hamilton (2008) and Piger (2009) provide surveys of this literature.

Hamilton's Markov-switching regression model assumes that the Markov state variable governing the timing of regime switches is strictly exogenous, and thus independent of the regression disturbance at all leads and lags. Diebold et al. (1994) and Filardo (1994) extend the Hamilton model to allow the transition probabilities governing the Markov process to be partly determined by strictly exogenous or predetermined information, which could include lagged values of the dependent variable. However, this time-varying transition probability (TVTP) formulation maintains the assumption that the state variable is independent of the contemporaneous value of the regression disturbance. The large literature applying Markov-switching models has almost exclusively focused on either the Hamilton (1989) fixed transition probability model or the TVTP extension, which we will collectively refer to as exogenous switching models.

Despite the popularity of this exogenous switching framework, it is natural in many applications to think of the state process as contemporaneously correlated with the regression disturbance, which we refer to as endogenous switching. For example, a common application of the Markov-switching regression is to models where the dependent variable is an aggregate measure of some macroeconomic or financial variable, and the state variable is meant to capture the business cycle regime (e.g., expansion and recession). It seems reasonable that shocks to these aggregate quantities, such as real GDP, would contribute simultaneously to changes in the business cycle phase. More generally, both the state variable and the disturbance term to the dependent variable may be influenced simultaneously by a number of unmodeled elements. For example, in the Hamilton (1989) regime-switching autoregressive model of real GDP growth, both the state variable capturing the business cycle phase and the shock to real GDP are likely influenced by other factors, such as monetary and fiscal policy.

Motivated by such arguments, Kim et al. (2008) develop an endogenous switching regression model, in which the state variable and the regression disturbance term are determined simultaneously. Kang (2014) incorporates the Kim et al. (2008) model of endogenous switching inside of a more general state-space model. However, a significant drawback of this existing endogenous switching literature is that it is largely limited to the case of two regimes. [1] This limits the potential application of the model considerably, as there is evidence for more than two regimes in many empirical implementations of the Markov-switching model. For example, in models of real activity, Boldin (1996) finds evidence for a three-regime switching model of business cycle dynamics for real GDP, while Hamilton (2005) does the same for the unemployment rate. For asset prices, Garcia and Perron (1996) and Guidolin and Timmermann (2005) provide evidence for a three-regime switching mean and volatility model of U.S. interest rates and equity returns, respectively. In a Markov-switching VAR, Sims and Zha (2006) find the best fit using nine regimes, primarily capturing changes in conditional volatility.

In this paper, we develop an N-regime endogenous Markov-switching regression model. In the two-regime case, the model collapses to that in Kim et al. (2008).
For more than two regimes, the model admits a wide variety of patterns of correlation between the state variable and regression disturbance term. Despite this flexibility, the model maintains computational feasibility, and can be estimated via maximum likelihood using extensions to the filter in Hamilton (1989). The parameterization of the model also allows for a simple test of the null hypothesis of exogenous switching. Using simulation experiments, we demonstrate that the maximum likelihood estimator performs well in finite samples and that a likelihood ratio test of the null hypothesis of exogenous switching has good size and power properties.

We consider two applications of our N-regime endogenous switching model. In the first, we test for endogenous switching in a three-regime switching mean model of U.S. real GDP growth. In the second, we consider endogenous switching inside of a three-regime version of the Turner et al. (1989) volatility feedback model of U.S. equity returns. We find statistically significant evidence of endogenous switching in both of these models, as well as quantitatively large differences in parameter estimates resulting from allowing for endogenous switching.

[1] Kim et al. (2008) propose a version of their model for more than two regimes, but it is very restrictive in terms of the patterns of correlation between the state variable and the regression disturbance term that can be captured. In particular, their N-state model implies that larger positive values of the regression disturbance term are monotonically related to larger values of the state variable. Among other things, this makes results from this model highly dependent on the arbitrary decision of how the states are labeled.

2 An N-State Endogenous Markov-Switching Model

Consider the following Gaussian regime-switching model:

\[ y_t = g(x_t, y_{t-1}, \ldots, y_{t-p}, S_t, S_{t-1}, \ldots, S_{t-p}) + \sigma_{S_t}\varepsilon_t, \tag{1} \]
\[ \varepsilon_t \sim \text{i.i.d. } N(0, 1), \]

where $g(\cdot)$ is a conditional mean function, $y_t$ is scalar, $x_t$ is a $k \times 1$ vector of observed exogenous variables, and $S_t \in \{0, 1, \ldots, N-1\}$ is an integer-valued state variable indicating which of $N$ different regimes is active at time $t$. Both $y_t$ and $x_t$ are assumed to be covariance stationary. Examples of equation (1) include a regime-switching regression model:

\[ y_t = x_t'\beta_{S_t} + \sigma_{S_t}\varepsilon_t, \tag{2} \]

as well as a regime-switching autoregression:

\[ y_t = \mu_{S_t} + \phi_1\left(y_{t-1} - \mu_{S_{t-1}}\right) + \phi_2\left(y_{t-2} - \mu_{S_{t-2}}\right) + \cdots + \phi_p\left(y_{t-p} - \mu_{S_{t-p}}\right) + \sigma_{S_t}\varepsilon_t. \tag{3} \]

For simplicity of exposition, we focus on the regime-switching regression model in equation (2) throughout this paper. However, the algorithms presented below for estimation and filtering are easily extended to the more general case of equation (1).

In an N-state Markov-switching model, the discrete regime indicator variable $S_t$ follows an N-state Markov process. Here we will allow the Markov process to have time-varying transition probabilities as in Diebold et al. (1994) and Filardo (1994):

\[ p_{ij,t} = \Pr(S_t = i \mid S_{t-1} = j, z_t). \tag{4} \]

In (4), the transition probability is influenced by the strictly exogenous or predetermined conditioning information in $z_t$, and is thus time varying. We assume that $z_t$ is covariance stationary.

To model the dependence of the transition probability on $z_t$, it will be useful to alternatively describe $S_t$ as the outcome of the values of $N-1$ continuous latent variables, $S^*_{1,t}, S^*_{2,t}, \ldots, S^*_{N-1,t}$. To do so, we employ a multinomial probit specification:

\[ S_t = \begin{cases} 0, & 0 = \max\{0, S^*_{1,t}, S^*_{2,t}, \ldots, S^*_{N-1,t}\} \\ 1, & S^*_{1,t} = \max\{0, S^*_{1,t}, S^*_{2,t}, \ldots, S^*_{N-1,t}\} \\ \;\vdots & \\ N-1, & S^*_{N-1,t} = \max\{0, S^*_{1,t}, S^*_{2,t}, \ldots, S^*_{N-1,t}\}, \end{cases} \tag{5} \]

where each of the $N-1$ latent variables follows a symmetric process:

\[ S^*_{i,t} = \gamma_{i,S_{t-1}} + z_t'\delta_{i,S_{t-1}} + \eta_{i,t}, \tag{6} \]
\[ \eta_{i,t} \sim \text{i.i.d. } N(0, 1), \quad i = 1, \ldots, N-1. \]

This provides enough structure to parameterize the transition probabilities for the Markov process:

\[ p_{0j,t} = \Pr\left(\eta_{1,t} < -c_{1,j,t},\; \eta_{2,t} < -c_{2,j,t},\; \ldots,\; \eta_{N-1,t} < -c_{N-1,j,t}\right), \tag{7} \]
\[ p_{ij,t} = \Pr\left(-\eta_{i,t} < c_{i,j,t},\; \left\{(\eta_{m,t} - \eta_{i,t}) < (c_{i,j,t} - c_{m,j,t}) : m = 1, \ldots, N-1,\; m \neq i\right\}\right), \tag{8} \]

where $c_{i,j,t} = \gamma_{i,j} + z_t'\delta_{i,j}$, $i = 1, \ldots, N-1$, $j = 0, \ldots, N-1$.

To model endogenous switching, we assume that the joint probability density for $\varepsilon_t$ and $\eta_t = (\eta_{1,t}, \eta_{2,t}, \ldots, \eta_{N-1,t})'$ is independent and identically distributed multivariate Gaussian:

\[ \begin{bmatrix} \varepsilon_t \\ \eta_t \end{bmatrix} \sim \text{i.i.d. } N(0_N, \Sigma), \tag{9} \]

where:

\[ \Sigma = \begin{bmatrix} 1 & \rho_1 & \rho_2 & \cdots & \rho_{N-1} \\ \rho_1 & 1 & \rho_1\rho_2 & \cdots & \rho_1\rho_{N-1} \\ \rho_2 & \rho_2\rho_1 & 1 & \cdots & \rho_2\rho_{N-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho_{N-1} & \rho_{N-1}\rho_1 & \rho_{N-1}\rho_2 & \cdots & 1 \end{bmatrix}. \tag{10} \]

This multivariate Gaussian density implies that the correlation between $\varepsilon_t$ and $\eta_{i,t}$ is given by the parameter $\rho_i$. In turn, these $\rho_i$ parameters control the extent of endogenous switching in the model. Indeed, the exogenous switching model is nested through the parameter restriction $\rho_1 = \rho_2 = \cdots = \rho_{N-1} = 0$. Further, the $\rho_i$ parameters have a straightforward interpretation in terms of endogenous switching: when $\rho_i$ is positive, larger values of $\varepsilon_t$ are associated with an increased likelihood of $S_t = i$ occurring relative to $S_t = 0$. When $(\rho_i - \rho_m)$ is positive, larger values of $\varepsilon_t$ are associated with an increased likelihood of $S_t = i$ occurring relative to $S_t = m$. The converse is also true. Note that nothing in this model takes a stand on the direction of causality, and the model could be consistent with causality running from $\varepsilon_t$ to $S_t$, from $S_t$ to $\varepsilon_t$, or bi-directional causality.

Note that the covariance matrix in (10) implies that conditional on $\varepsilon_t$, the $\eta_{i,t}$ are uncorrelated:

\[ E\left(\eta_{\tau,t}\,\eta_{\tau',t} \mid \varepsilon_t\right) = 0, \quad \tau \neq \tau'. \tag{11} \]

This assumption is required to identify the parameters of the model. Specifically, the covariance parameters in (11) are not separately identified from the $\gamma_{i,S_{t-1}}$ parameters in (6). Further, the assumption plays the role of a normalization, as these parameters are present only to parameterize the transition probabilities of the Markov process.

To gain further intuition into how our proposed endogenous Markov-switching model differs from the exogenous Markov-switching model, it is useful to consider the probability of transitioning between states, conditional on $\varepsilon_t$:

\[ \tilde{p}_{ij,t} = \Pr(S_t = i \mid S_{t-1} = j, z_t, \varepsilon_t). \tag{12} \]

For the exogenous switching model, this conditional transition probability is equal to the unconditional transition probability, so that $\tilde{p}_{ij,t} = p_{ij,t}$. For the endogenous switching model the conditional and unconditional transition probabilities will not be equal, and the realization of $\varepsilon_t$ can signal markedly different probabilities of transitioning between regimes.
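
As a rough illustration of the transition probabilities in equations (7), (8) and (12) for the case N = 3, the sketch below evaluates both the unconditional and the conditional probabilities with a bivariate Gaussian CDF. It is a minimal sketch under stated assumptions rather than the authors' implementation: the function name `transition_probs`, its argument layout, and the numerical values in the usage lines are hypothetical, and SciPy's `multivariate_normal.cdf` is simply one available CDF routine. The conditional case uses the implication of (10) that, given the disturbance, each latent shock is independent Gaussian with mean rho_i * eps_t and variance 1 - rho_i^2.

```python
import numpy as np
from scipy.stats import multivariate_normal


def transition_probs(c, rho, eps=None):
    """Transition probabilities out of one lagged state j when N = 3.

    c   : (c_1, c_2), the probit indices c_{i,j,t} for the lagged state j
    rho : (rho_1, rho_2), correlations between eps_t and eta_{1,t}, eta_{2,t}
    eps : None gives the unconditional probabilities; a float gives the
          probabilities conditional on that value of eps_t
    Returns an array (p_0, p_1, p_2). Names and layout are illustrative.
    """
    c = np.asarray(c, dtype=float)
    rho = np.asarray(rho, dtype=float)
    if eps is None:
        # Unconditional law of (eta_1, eta_2) taken from the matrix in (10)
        mean = np.zeros(2)
        cov = np.array([[1.0, rho[0] * rho[1]], [rho[0] * rho[1], 1.0]])
    else:
        # Given eps_t, the eta_i are independent N(rho_i * eps_t, 1 - rho_i^2)
        mean = rho * eps
        cov = np.diag(1.0 - rho**2)

    def prob(a_rows, upper):
        # Pr(A @ eta < upper), where A @ eta is again bivariate Gaussian
        a = np.array(a_rows, dtype=float)
        return multivariate_normal.cdf(upper, mean=a @ mean, cov=a @ cov @ a.T)

    p0 = prob([[1, 0], [0, 1]], -c)                      # eta_1 < -c_1, eta_2 < -c_2
    p1 = prob([[-1, 0], [-1, 1]], [c[0], c[0] - c[1]])   # -eta_1 < c_1, eta_2 - eta_1 < c_1 - c_2
    p2 = prob([[0, -1], [1, -1]], [c[1], c[1] - c[0]])   # -eta_2 < c_2, eta_1 - eta_2 < c_2 - c_1
    return np.array([p0, p1, p2])


# Hypothetical inputs, chosen only to exercise the function
c_example, rho_example = (0.3, -0.2), (0.5, -0.9)
print(transition_probs(c_example, rho_example))            # unconditional
print(transition_probs(c_example, rho_example, eps=1.0))   # conditional on eps_t = 1
```

Evaluating the function over a grid of disturbance values gives curves of the kind discussed next. The CDF is computed numerically, so the three probabilities sum to one only up to the integration tolerance.
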
As an example of this, Figure 1 plots unconditional and conditional transition probabilities against alternative realizations of $\varepsilon_t \in [-3, 3]$ for a particular parameterization of a three-state ($N = 3$) endogenous switching model. In this example, the dependence on $z_t$ has been eliminated, and the correlation parameters have been set to $\rho_1 = 0.5$ and $\rho_2 = 0.9$. The figure shows that the conditional probability of transitioning regimes can vary in extreme directions depending on the outcome of $\varepsilon_t$. For example, focusing on the diagonal entries, the conditional probability of continuing in the $S_t = 0$ regime ($\tilde{p}_{00,t}$) increases gradually from around 0.3 to above 0.8 as $\varepsilon_t$ moves from a large negative value (-3) toward 0. This transition probability then falls rapidly to near 0 as $\varepsilon_t$ increases from 0 to around 2. The other continuation probabilities, $\tilde{p}_{11,t}$ and $\tilde{p}_{22,t}$, also display dramatic shifts that cover the entire probability range as $\varepsilon_t$ is varied. Alternative parameterizations for $\rho_1$ and $\rho_2$ give alternative patterns of $\tilde{p}_{ij,t}$. An example of this is given in Figure 2, which depicts the transition probabilities when $\rho_1 = 0.9$ and $\rho_2 = 0.9$.

These figures also demonstrate that the conditional transition probability can differ markedly from the unconditional transition probability, which is given by the horizontal dashed lines in each figure. As will be shown in detail in the next section, the ratio of these two probabilities is an important quantity in distinguishing the likelihood function for the endogenous switching model from that for the exogenous switching model.

3 Likelihood Calculation, State Filtering and Tests for Endogenous Switching

In this section we describe how both the likelihood function and filtered and smoothed probabilities of the states can be calculated for the endogenous switching model. [2] We will also describe how these calculations differ from those for the exogenous switching model. Finally, we discuss how tests of the null hypothesis of exogenous switching vs. the alternative hypothesis of endogenous switching can be conducted.

[2] The model likelihood for Markov-switching models will be invariant to an arbitrary relabeling of regimes. We assume throughout that the model has been appropriately normalized. Specific strategies for normalization will be discussed for the empirical analysis presented in Section 5.

Collect the model parameters into the vector $\theta$, and let $Z_t = \{z_t, z_{t-1}, \ldots\}$ and $\Psi_t = \{y_t, y_{t-1}, \ldots\}$ indicate the history of observed $z_t$ and $y_t$ through date $t$. As in Filardo (1994), the conditional likelihood value for $y_t$, $f(y_t \mid \Psi_{t-1}, Z_t, \theta)$, $t = 1, \ldots, T$, can be constructed recursively using an extension of the iterative formulas in Hamilton (1989) to the case of time-varying transition probabilities: [3]

\[ f(y_t \mid \Psi_{t-1}, Z_t, \theta) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} f(y_t \mid S_t = i, S_{t-1} = j, \Psi_{t-1}, Z_t, \theta)\, p_{ij,t}\, \Pr(S_{t-1} = j \mid \Psi_{t-1}, Z_{t-1}, \theta), \tag{13} \]
\[ \Pr(S_t = i \mid \Psi_t, Z_t, \theta) \propto \sum_{j=0}^{N-1} f(y_t \mid S_t = i, S_{t-1} = j, \Psi_{t-1}, Z_t, \theta)\, p_{ij,t}\, \Pr(S_{t-1} = j \mid \Psi_{t-1}, Z_{t-1}, \theta). \tag{14} \]

These equations are then iterated recursively to obtain the log likelihood function $L(\theta) = \sum_{t=1}^{T} \log\left[f(y_t \mid \Psi_{t-1}, Z_t, \theta)\right]$ and the filtered state estimates $\Pr(S_t = i \mid \Psi_t, Z_t, \theta)$, $t = 1, \ldots, T$.

[3] For notational convenience, we suppress the dependence of probability density functions on the regressors, $x_t$, throughout this section. Equations (13) and (14) make use of the assumption, implicit in equation (2), that conditional on $x_t$ and the state indicator $S_t$, the probability density function of $y_t$ does not depend on $z_t$. This is without loss of generality, since $x_t$ may include elements of $z_t$.

To initialize the recursion we require an initial filtered state probability, $\Pr(S_0 = i \mid \Psi_0, Z_0, \theta)$, $i = 0, \ldots, N-1$, calculation of which can be quite involved. Here we follow the usual practice, suggested by Hamilton (1989), of approximating this initial probability with an unconditional probability. In the case of time-varying transition probabilities, we use the unconditional state probability computed assuming $z_t$ is always at its sample mean. Denote this probability as $\Pr(S_t = i \mid \bar{z})$, $i = 0, \ldots, N-1$, where $\bar{z}$ is the sample mean of $z_t$. Next, define $\bar{p}_{ij} = \Pr(S_t = i \mid S_{t-1} = j, \bar{z})$, and collect these in a matrix of transition probabilities as:

\[ \bar{P} = \begin{bmatrix} \bar{p}_{00} & \bar{p}_{01} & \cdots & \bar{p}_{0,N-1} \\ \bar{p}_{10} & \bar{p}_{11} & \cdots & \bar{p}_{1,N-1} \\ \vdots & \vdots & \ddots & \vdots \\ \bar{p}_{N-1,0} & \bar{p}_{N-1,1} & \cdots & \bar{p}_{N-1,N-1} \end{bmatrix}. \tag{15} \]

Finally, define:

\[ A = \begin{bmatrix} I_N - \bar{P} \\ \iota_N' \end{bmatrix}, \]

where $I_N$ is the $N \times N$ identity matrix and $\iota_N$ is an $N \times 1$ vector of ones. The vector holding $\Pr(S_t = i \mid \bar{z})$, $i = 0, \ldots, N-1$, is then computed as the last column of the matrix $(A'A)^{-1}A'$.

The key element required to compute each step of the recursion in (13) and (14) is $f(y_t \mid S_t, S_{t-1}, \Psi_{t-1}, Z_t, \theta)$, and it is here that we see the distinction in the likelihood function between the exogenous and endogenous switching models. In the exogenous switching model, the state indicators $S_t = i$ and $S_{t-1} = j$ simply define the mean and variance of a Gaussian distribution for $y_t$, such that:

\[ f(y_t \mid S_t = i, S_{t-1} = j, \Psi_{t-1}, Z_t, \theta) = \frac{1}{\sigma_i}\,\phi\!\left(\frac{y_t - x_t'\beta_i}{\sigma_i}\right), \]

where $\phi(\cdot)$ indicates the standard normal probability density function. By contrast, when there is endogenous switching, the state variables $S_t = i$ and $S_{t-1} = j$ indicate not just the parameters of the relevant data generating process, but additionally provide information about which values of the random disturbance, $\varepsilon_t$, are most likely. In the case of endogenous switching:

\[ f(y_t \mid S_t = i, S_{t-1} = j, \Psi_{t-1}, Z_t, \theta) = \frac{\tilde{p}_{ij,t}}{p_{ij,t}} \left[\frac{1}{\sigma_i}\,\phi\!\left(\frac{y_t - x_t'\beta_i}{\sigma_i}\right)\right]. \tag{16} \]

This equation, which is derived in Appendix A.1, can be interpreted as follows. The term in brackets is the regime-dependent conditional density of $y_t$ for the exogenous switching model. This density is then weighted by a ratio of probabilities of transitioning from regime $j$ to regime $i$, where the probability in the numerator is conditional on the regime-specific value of $\varepsilon_t$ and the probability in the denominator is not. The unconditional transition probability $p_{ij,t}$ can be interpreted as the average value of $\tilde{p}_{ij,t}$ with respect to the unconditional distribution of $\varepsilon_t$. In other words, $p_{ij,t}$ gives the average probability of transitioning from state $j$ to state $i$ with respect to $\varepsilon_t$. Thus, equation (16) says that if the value of $\varepsilon_t$ signals an above-average probability of transitioning from state $j$ to state $i$, then the likelihood value for $y_t$ conditional on $S_t = i$ and $S_{t-1} = j$ will be higher than would be calculated under the exogenous switching model.

Returning to Figures 1 and 2, the ratio $\tilde{p}_{ij,t}/p_{ij,t}$ can be far from unity, meaning the likelihood function for the exogenous switching model may be substantially misspecified in the presence of endogenous switching. In general, estimation assuming exogenous switching will lead to biased parameter estimates as well as biased filtered and smoothed state probabilities when endogenous switching is present.

The recursion provided by equations (13) and (14) can be used to construct the value of the likelihood function for any value of $\theta$. The likelihood function can then be numerically maximized with respect to $\theta$ to obtain the maximum likelihood estimates, $\hat{\theta}$. [4] Given these estimates, the recursion can be run again to provide the filtered state probability evaluated at the maximum likelihood estimates, $\Pr(S_t = i \mid \Psi_t, Z_t, \hat{\theta})$.

In many applications we also require the so-called smoothed state probability $\Pr(S_t = i \mid \Psi_T, Z_T, \hat{\theta})$, which provides inference on $S_t$ conditional on all available sample information. To compute the smoothed probabilities, we can apply the recursive filter provided in Kim and Nelson (1999b), which remains valid for the N-state endogenous Markov-switching model described in Section 2.
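
Before turning to the smoothing recursion, the filtering steps in equations (13), (14) and (16), together with the initialization based on the last column of (A'A)^{-1}A', can be sketched as follows. This is a minimal sketch under stated assumptions and not the authors' code: the function names `ergodic_probs`, `regime_density` and `filter_step` are hypothetical, the matrices of unconditional and conditional transition probabilities are assumed to be supplied by the caller, and the numbers in the usage lines are illustrative only.

```python
import numpy as np
from scipy.stats import norm


def ergodic_probs(P):
    """Unconditional state probabilities for a transition matrix with
    P[i, j] = Pr(S_t = i | S_{t-1} = j), so that columns sum to one."""
    n = P.shape[0]
    A = np.vstack([np.eye(n) - P, np.ones((1, n))])   # A = [I - P; iota']
    return np.linalg.solve(A.T @ A, A.T)[:, -1]       # last column of (A'A)^{-1} A'


def regime_density(y, xb, sigma, p, p_tilde):
    """Equation (16): N x N array whose (i, j) entry is f(y_t | S_t = i, S_{t-1} = j).

    xb, sigma : length-N arrays holding x_t' beta_i and sigma_i
    p         : unconditional transition probabilities p_{ij,t}
    p_tilde   : transition probabilities conditional on the regime-specific
                disturbance (y_t - x_t' beta_i) / sigma_i
    """
    gauss = norm.pdf((y - xb) / sigma) / sigma        # exogenous-switching density for each i
    # Where p is zero the ratio is set to one; those cells get zero weight in the filter
    ratio = np.divide(p_tilde, p, out=np.ones_like(p), where=p > 0)
    return ratio * gauss[:, None]


def filter_step(y, xb, sigma, p, p_tilde, prior):
    """One pass through (13) and (14); prior[j] = Pr(S_{t-1} = j | Psi_{t-1})."""
    joint = regime_density(y, xb, sigma, p, p_tilde) * p * prior[None, :]
    lik = joint.sum()                                 # equation (13)
    return lik, joint.sum(axis=1) / lik               # equation (14), normalized


# Hypothetical three-state example; p_tilde = p reproduces exogenous switching
P = np.full((3, 3), 0.05) + 0.85 * np.eye(3)
prior = ergodic_probs(P)
lik, filtered = filter_step(0.2, np.array([-1.0, 0.0, 1.0]), np.ones(3), P, P, prior)
print(lik, filtered)
```

Summing the log of `lik` over t gives the log likelihood for a candidate parameter vector. Under endogenous switching the only change is that `p_tilde` differs from `p`, which is where the ratio weighting in (16) enters.
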
Beginning with the final filtered probability, $\Pr(S_T = j \mid \Psi_T, Z_T, \hat{\theta})$, $j = 0, \ldots, N-1$, the following equation can be applied recursively, for $t = T-1, \ldots, 1$:

\[ \Pr(S_t = i \mid \Psi_T, Z_T, \hat{\theta}) = \sum_{j=0}^{N-1} \sum_{k=0}^{N-1} \Pr(S_{t-1} = j, S_t = i, S_{t+1} = k \mid \Psi_T, Z_T, \hat{\theta}), \tag{17} \]

where:

\[ \Pr(S_{t-1} = j, S_t = i, S_{t+1} = k \mid \Psi_T, Z_T, \hat{\theta}) = \frac{\Pr(S_t = i, S_{t+1} = k \mid \Psi_T, Z_T, \hat{\theta})\; p_{ki,t+1}\; \Pr(S_t = i, S_{t-1} = j \mid \Psi_t, Z_t, \hat{\theta})}{\Pr(S_t = i, S_{t+1} = k \mid \Psi_t, Z_t, \hat{\theta})}. \tag{18} \]

For additional details of the derivation of equation (18), see Kim (1994) and Kim and Nelson (1999b).

To conclude this section, we describe how statistical hypothesis tests of the null hypothesis of exogenous switching can be conducted. Our N-state endogenous switching model collapses to a standard exogenous Markov-switching model in the case where:

\[ \rho_1 = \rho_2 = \cdots = \rho_{N-1} = 0. \tag{19} \]

Thus, the null hypothesis of exogenous switching can be tested by any suitable joint test of the $N-1$ zero restrictions in (19). In the simulation studies presented in Section 4, we will consider the finite sample performance of both Wald and likelihood ratio tests of these restrictions.

[4] One practical computational difficulty in constructing the likelihood function is that it requires computing the unconditional and conditional transition probabilities, $p_{ij,t}$ and $\tilde{p}_{ij,t}$, which involves calculation of a multivariate Gaussian cumulative distribution function (CDF), for which there is no closed form solution. In our empirical implementation of the endogenous switching model we use Matlab's mvncdf command to numerically compute the required integrals. Appendix A.2 provides an explicit characterization of these CDFs for the case of N = 3.

4 Monte Carlo Evidence

In this section we describe results from a Monte Carlo simulation study designed to evaluate the finite sample performance of the maximum likelihood estimator (MLE) applied to data generated from an endogenous switching model. We also evaluate the size and power performance of hypothesis tests for endogenous switching. To focus on the results most germane to the addition of endogenous switching, we consider a simplified version of the general model presented in Section 2. In particular, we focus on the Gaussian Markov-switching mean model:

\[ y_t = \mu_{S_t} + \sigma\varepsilon_t, \tag{20} \]
\[ \varepsilon_t \sim \text{i.i.d. } N(0, 1), \]

where $S_t \in \{0, 1, 2\}$ is a three-state Markov process that evolves with fixed transition probabilities $p_{ij} = \Pr(S_t = i \mid S_{t-1} = j)$. In all Monte Carlo simulations we set $\mu_{S_t} \in \{-1, 0, 1\}$ and $\sigma = 1$. The Markov process evolves according to the endogenous switching model outlined in Section 2 with $z_t = 0$ for all $t$. Across alternative Monte Carlo experiments we vary the persistence of the transition probabilities for remaining in a regime from a high persistence case ($p_{00} = p_{11} = p_{22} = 0.9$) to a low persistence case ($p_{00} = p_{11} = p_{22} = 0.7$). In both the high and low persistence cases we spread the residual probability evenly across the remaining transitions. We vary the size of the correlation parameters from $\rho_1 = \rho_2 = 0.9$ to $\rho_1 = \rho_2 = 0.5$. Finally, we consider two sample sizes, $T = 300$ and $T = 500$. Performance is measured using the mean and root mean squared error (RMSE) of the estimates of each parameter across 1000 Monte Carlo simulations. The RMSE, reported in parentheses, is computed relative to the true value for each parameter.

Table 1 presents results regarding the performance of the MLE that incorrectly assumes exogenous switching, and demonstrates that the bias in this incorrectly specified MLE can be severe. The bias in the $\mu_i$ parameters increases as the state persistence falls, with the amount of bias reaching as high as 67% of the true parameter value in the case of $\mu_0$. Estimation bias is also visible in the estimates of the conditional variance term, with the bias in some cases above 15% of the true parameter value. The estimation bias is not a small sample phenomenon, with similar bias observed for $T = 300$ as for $T = 500$. The bias decreases as the correlation parameters, $\rho_1$ and $\rho_2$, fall from 0.9 to 0.5. However, despite this substantially lessened importance of endogenous switching, the MLE that ignores endogenous switching still generates very biased parameter estimates, with bias reaching as high as 43% of the true parameter value for $\mu_0$.

Table 2 shows results for the same variety of data generating processes, but with the MLE now applied to the correctly specified model. These results demonstrate that the MLE of the correctly specified model performs very well, with mean parameter estimates that are close to the true value, and RMSE statistics that are small. The performance of the correctly specified estimator seems largely unaffected by the extent of state persistence or the value of the correlation parameters. The sample size also does not have large effects on the mean estimates although, not surprisingly, the RMSE is higher when the sample size is smaller.

Finally, Table 3 shows results of simulations meant to assess the finite sample performance of both Wald and likelihood ratio (LR) tests of the null hypothesis of exogenous switching, which is parameterized as a test of the joint restriction $\rho_1 = \rho_2 = 0$. We again consider two sample sizes, as well as a high and low state persistence case. To evaluate the size of the Wald and LR tests, we first consider the case where the true data generating process has $\rho_1 = \rho_2 = 0$. To evaluate the power of these tests we consider two cases, one in which the extent of endogenous switching is high ($\rho_1 = \rho_2 = 0.9$) and the other where endogenous switching is more moderate ($\rho_1 = \rho_2 = 0.5$). The size results are based on rejection rates of 5%-level tests using asymptotic critical values. The power results are based on rejection rates using size-adjusted 5% critical values. Beginning with the size of the tests, both the Wald and LR tests are moderately oversized when $T = 300$, with rejection rates around 10% for the Wald test and 8% for the LR test. Not surprisingly, the empirical size is closer to the level implied by the asymptotic critical values when the sample size is larger.
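
The mechanics of a single LR test of exogenous switching can be sketched as below. This is an illustration under stated assumptions, not the procedure used to produce Table 3: the function name `lr_test_exogenous` is hypothetical, the log likelihood values in the usage line are made up, and the asymptotic reference distribution is assumed to be chi-square with degrees of freedom equal to the number of zero restrictions on the correlation parameters (two in the three-state Monte Carlo design).

```python
from scipy.stats import chi2


def lr_test_exogenous(loglik_endog, loglik_exog, n_restrictions, level=0.05):
    """Likelihood ratio test of rho_1 = ... = rho_{N-1} = 0.

    loglik_endog   : maximized log likelihood of the unrestricted (endogenous) model
    loglik_exog    : maximized log likelihood of the restricted (exogenous) model
    n_restrictions : number of correlation parameters set to zero under the null
    Returns the statistic, its chi-square p-value (an assumed asymptotic
    approximation), and the rejection decision at the given level.
    """
    stat = 2.0 * (loglik_endog - loglik_exog)
    p_value = chi2.sf(stat, df=n_restrictions)
    return stat, p_value, p_value < level


# Made-up log likelihood values for a three-state model (two restrictions)
print(lr_test_exogenous(-100.0, -103.0, n_restrictions=2))
```

Repeating this calculation across simulated samples generated under the null and recording the rejection frequency gives an empirical size of the kind reported for the test.
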
The LR test in particular has empirical size very close to 5% when $T = 500$. Turning to the power results, the LR test displays consistently high rejection rates ranging between 69% and 100%. The Wald test is less powerful, with rejection rates ranging from 42% to 90%.

Overall, the Monte Carlo results suggest that ignoring endogenous switching can lead to substantial bias in the MLE when endogenous switching is in fact present. This bias persists into large sample sizes, and for both high and moderate values of the parameters that control the extent of endogenous switching. The MLE that accounts for endogenous switching performed very well, yielding accurate parameter estimates. Finally, the LR test for exogenous switching was effective, with approximately correct size and good power.

5 Applications in Macroeconomics and Finance

In this section, we consider two applications of the N-state endogenous Markov-switching model. In Section 5.1, we consider endogenous switching in a three-state model of U.S. business cycle dynamics. In Section 5.2 we extend the two-state endogenous-switching volatility feedback model in Kim et al. (2008) to allow for three volatility regimes.

5.1 U.S. Business Cycle Fluctuations

One empirical characteristic of the U.S. business cycle highlighted by Burns and Mitchell (1946) is asymmetry in the behavior of real output across business cycle phases. In his seminal paper, Hamilton (1989) captures asymmetry in the business cycle using a two-state Markov-switching autoregressive model of U.S. real GNP growth. His model identifies one phase as relatively brief periods of steep declines in output, and the other as relatively long periods of gradual output increases. Using quarterly data from 1952:Q2 to 1984:Q4, Hamilton (1989) shows that the estimated shifts between the two phases accord well with the National Bureau of Economic Research (NBER) chronology of U.S. business cycle peaks and troughs.

While Hamilton's original model captures the short and steep nature of recessions relative to expansions, it does not incorporate an important feature of the business cycle that was prevalent over the sample period he considered: recessions were typically followed by high-growth recovery phases that pushed output back toward its pre-recession level. This bounce-back effect is evident in the post-recession real GDP growth rates shown in Table 4. In order to capture this high-growth recovery phase, Sichel (1994) and Boldin (1996) extend Hamilton's original model to a three-state Markov-switching model. Here we also use a three-state Markov-switching model to capture mature expansions, recessions, and a post-recession expansion phase. In particular, we assume that the U.S. real GDP growth rate is described by the following three-state Markov-switching mean model:

\[ \Delta y_t = \mu_{S_t} + \sigma\varepsilon_t, \tag{21} \]
\[ \varepsilon_t \sim \text{i.i.d. } N(0, 1), \]

where $y_t$ is U.S. log real GDP and $S_t \in \{0, 1, 2\}$ is a Markov-switching state variable that evolves with fixed transition probabilities $p_{ij}$. Note that in this model U.S. real GDP growth follows a white noise process inside of each regime. This intra-regime lack of dynamics is consistent with the results of Kim et al. (2005) and Camacho and Perez-Quiros (2007), who find that traditional linear autoregressive dynamics in U.S. real GDP growth are largely absent once mean growth is allowed to follow a three-regime Markov-switching process.

We restrict the model in two ways. First, we restrict $\mu_0 > 0$, $\mu_1 < 0$, and $\mu_2 > 0$, which serves to identify $S_t = 1$ as the recession regime, and $S_t = 0$ and $S_t = 2$ as expansion regimes. Second, following Boldin (1996), we restrict the matrix of transition probabilities so that the states occur in the order $0 \rightarrow 1 \rightarrow 2$:

\[ P = \begin{bmatrix} p_{00} & 0 & 1 - p_{22} \\ 1 - p_{00} & p_{11} & 0 \\ 0 & 1 - p_{11} & p_{22} \end{bmatrix}. \]

In combination with the restrictions on $\mu_{S_t}$, this form of the transition matrix restricts the regimes to occur in the order: mature expansion, then recession, then post-recession expansion. We will consider two versions of this model, one in which the Markov-switching is assumed to be exogenous, so that $\rho_1 = \rho_2 = 0$, and one that allows for endogenous switching.

We estimate this model using data on quarterly U.S. real GDP growth from 1947:Q1 to 2016:Q2. Over this sample period, there are two prominent types of structural change in the U.S. business cycle that are empirically relevant. The first is the well-documented reduction in real GDP growth volatility in the early 1980s known as the Great Moderation (Kim and Nelson (1999a), McConnell and Perez-Quiros (2000)). To capture this reduction in volatility, we include a one-time change in the conditional volatility parameter, $\sigma$, in 1984:Q1, the date identified by Kim and Nelson (1999a) as the beginning of the Great Moderation. The second, as identified in Kim and Murray (2002) and Kim et al. (2005), is the lack of a high-growth recovery phase following the three most recent NBER recessions. To capture this change in post-recession growth rates, we include a one-time break in $\mu_2$. Finally, to allow for the possibility that the nature of endogenous switching changed along with the nature of the post-recession regime, we allow for breaks in $\rho_1$ and $\rho_2$. These breaks in $\mu_2$, $\rho_1$ and $\rho_2$ are also dated to 1984:Q1, although results are insensitive to alternative break dates between 1984:Q1 and the beginning of the 1990-1991 recession. All other model parameters are assumed to be constant over the entire sample period.

The second and third columns of Table 5 show the maximum likelihood estimation results when we assume exogenous switching. The estimates show a prominent high-growth recovery phase before 1984:Q1 ($\mu_{2,1} \gg \mu_0$). The estimates also show that this high-growth recovery phase has disappeared in recent recessions, and indeed been replaced with a low-growth post-recession phase ($\mu_{2,2} < \mu_0$). The conditional volatility parameter, $\sigma$, falls by 50% after 1984, consistent with the large literature on the Great Moderation.
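
To make the restricted transition matrix of this model concrete, the sketch below builds it from the three continuation probabilities and reports the implied expected duration of each phase, using the standard Markov chain result that a regime with continuation probability p lasts 1 / (1 - p) periods on average. It is a hedged illustration only: the function names `cyclic_transition_matrix` and `expected_durations` are hypothetical, and the continuation probabilities in the usage lines are made-up values, not estimates from Table 5.

```python
import numpy as np


def cyclic_transition_matrix(p00, p11, p22):
    """Restricted matrix with P[i, j] = Pr(S_t = i | S_{t-1} = j).

    The zero entries force the regimes to occur in the order 0 -> 1 -> 2 -> 0.
    """
    return np.array([
        [p00,       0.0,       1.0 - p22],
        [1.0 - p00, p11,       0.0],
        [0.0,       1.0 - p11, p22],
    ])


def expected_durations(P):
    """Expected number of periods spent in each regime once it is entered."""
    return 1.0 / (1.0 - np.diag(P))


# Made-up continuation probabilities, for illustration only
P = cyclic_transition_matrix(0.9, 0.7, 0.8)
print(P.sum(axis=0))           # each column sums to one
print(expected_durations(P))   # quarters per visit for quarterly data
```

Because expected duration rises steeply as a continuation probability approaches one, even modest differences in an estimated continuation probability translate into noticeably different implied phase lengths.
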
The maximum likelihood estimates assuming endogenous switching are shown in the fourth and fifth columns of Table 5. A likelihood ratio test rejects the null hypothesis of exogenous switching at the 5% level (p-value = 0.045). The estimates of the correlation parameters prior to 1984 are such that $\rho_2 < \rho_1 < 0$. This pattern of correlations means that larger positive values of $\varepsilon_t$ increase the likelihood of $S_t = 0$ (mature expansion) relative

to $S_t = 1$ (recession) and $S_t = 2$ (post-recession expansion), and increase the likelihood of $S_t = 1$ relative to $S_t = 2$. These estimates switch signs after 1984, such that $\rho_1$ and $\rho_2$ are both positive. In this case, larger positive values of $\varepsilon_t$ increase the likelihood of $S_t = 1$ and $S_t = 2$ relative to $S_t = 0$.

There is also evidence of bias in the parameter estimates of the exogenous switching model. The estimated mean growth rate in the post-recession expansion phase is substantially different when accounting for endogenous switching. Also, the continuation probability for the post-recession phase, $p_{22}$, is substantially lower when accounting for endogenous switching, meaning that the length of these phases is overstated by the results from exogenous switching models. Finally, results of the Ljung-Box test, shown in the bottom panel of the table, show that accounting for endogenous switching eliminates autocorrelation in the disturbance term that is present in the exogenous switching model.

Figure 3 displays the smoothed state probabilities for both the exogenous and endogenous switching models, and shows the distortion in estimated state probabilities that can occur from ignoring endogenous switching. From panel (a), we see that the smoothed probability that the economy is in a mature expansion ($S_t = 0$) is often lower for the exogenous switching model than the endogenous switching model, while panel (c) shows that the opposite is true for the smoothed probability of the post-recession expansion phase ($S_t = 2$). Put differently, the endogenous switching model suggests a quicker transition from the post-recession expansion phase to the mature expansion phase than does the exogenous switching model.

5.2 Volatility Regimes in U.S. Equity Returns

An empirical regularity of U.S. equity returns is that low returns are contemporaneously associated with high volatility.
This is a counterintuitive result, as classical portfolio theory implies the equity risk premium should respond positively to the expectation of future volatility. One explanation for this observation is that while investors do require an increase

in expected return for expected future volatility, they are often surprised by news about realized volatility. This volatility feedback creates a reduction in prices in the period in which the increase in volatility is realized. The volatility feedback effect has been investigated extensively in the literature by French et al. (1987), Turner et al. (1989), Campbell and Hentschel (1992), Bekaert and Wu (2000) and Kim et al. (2004).

Turner et al. (1989) (TSN hereafter) model the volatility feedback effect with a two-state Markov-switching model:

$$r_t = \theta_1 E\left(\sigma^2_{S_t} \mid I_{t-1}\right) + \theta_2 \left[ E\left(\sigma^2_{S_t} \mid I_t^{*}\right) - E\left(\sigma^2_{S_t} \mid I_{t-1}\right) \right] + \sigma_{S_t} \varepsilon_t, \qquad \varepsilon_t \sim \text{i.i.d. } N(0, 1),$$

where $r_t$ is a measure of excess equity returns, $I_t = \{r_t, r_{t-1}, \ldots\}$, and $I_t^{*}$ is an information set that includes $I_{t-1}$ and the information investors observe during period $t$. $S_t \in \{0, 1\}$ is a discrete variable that follows a two-state Markov process with fixed transition probabilities $p_{ij}$. To normalize the model, TSN restrict $\sigma_1 > \sigma_0$, so that state 1 is the higher volatility regime.

One estimation difficulty with the above model is that there exists a discrepancy between the investors' and the econometrician's information sets. In particular, while $I_{t-1}$ may be summarized by returns up to period $t-1$, the information set $I_t^{*}$ includes information that is not summarized in the econometrician's data set on observed returns. To handle this estimation difficulty, TSN use actual volatility, $\sigma^2_{S_t}$, to approximate $E(\sigma^2_{S_t} \mid I_t^{*})$. That is, they estimate

$$r_t = \theta_1 E\left(\sigma^2_{S_t} \mid I_{t-1}\right) + \theta_2 \left[ \sigma^2_{S_t} - E\left(\sigma^2_{S_t} \mid I_{t-1}\right) \right] + \sigma_{S_t} u_t, \tag{22}$$

$$u_t = \varepsilon_t + \theta_2 \left[ E\left(\sigma^2_{S_t} \mid I_t^{*}\right) - \sigma^2_{S_t} \right].$$

Kim, Piger, and Startz (2008) (KPS, hereafter) point out that this approximation introduces classical measurement error into the state variable in the estimated equation, thus

rendering it endogenous. KPS propose a two-state endogenous Markov-switching model to deal with this endogeneity problem. Again, this two-state model of endogenous switching is identical to the N-state endogenous switching model proposed in Section 2 when N = 2.

However, there is substantial evidence for more than two volatility regimes in U.S. equity returns (Guidolin and Timmermann (2005)). Here, we extend the TSN and KPS exogenous and endogenous switching volatility feedback models to allow for three volatility regimes. Specifically, we extend the volatility feedback model in equation (22) to allow for three regimes, $S_t \in \{0, 1, 2\}$, with fixed transition probabilities $p_{ij}$. For normalization we assume $\sigma_2 > \sigma_1 > \sigma_0$, so that state 2 is the highest volatility regime.

To estimate the three-state volatility feedback model, we measure excess equity returns using monthly returns for a value-weighted portfolio of all NYSE-listed stocks in excess of the one-month Treasury Bill rate. The sample period extends from January 1952 to December 2015.

The second and third columns of Table 7 show the estimation results when we assume exogenous switching. The estimates are consistent with a positive relationship between the risk premium and expected future volatility ($\theta_1 > 0$) and a substantial volatility feedback effect ($\theta_2 \ll 0$). The estimates also suggest a dominant volatility feedback effect, as $\theta_1$ is small in absolute value relative to $\theta_2$.

The fourth and fifth columns of Table 7 show the results when we allow for endogenous switching. First, there is statistically significant evidence in favor of endogenous switching, with a likelihood ratio test rejecting the null hypothesis of exogenous switching at the 5% level (p-value = 0.034). The primary difference in the estimated parameters is a much smaller volatility feedback effect (smaller $\theta_2$ in absolute value) in the endogenous switching model than was found in the exogenous switching model.
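The likelihood ratio test of exogenous switching referred to above compares the maximized log-likelihoods of the restricted ($\rho_1 = \rho_2 = 0$) and unrestricted models. The Python sketch below shows the mechanics under the assumption that the statistic is referred to a chi-squared distribution with degrees of freedom equal to the number of restricted correlation parameters. The function name is hypothetical and the log-likelihood values are made up for illustration; they are not taken from Table 7.

```python
from scipy.stats import chi2

def lr_test_exogenous_switching(loglik_exog, loglik_endog, n_restrictions):
    """Likelihood ratio test of H0: all correlation parameters equal zero.

    loglik_exog    : maximized log-likelihood under exogenous switching
    loglik_endog   : maximized log-likelihood under endogenous switching
    n_restrictions : number of correlation parameters set to zero under H0
    """
    lr_stat = 2.0 * (loglik_endog - loglik_exog)
    # Assumed reference distribution: chi-squared with n_restrictions d.o.f.
    p_value = chi2.sf(lr_stat, df=n_restrictions)
    return lr_stat, p_value

# Hypothetical log-likelihoods for a three-state model (rho_1 = rho_2 = 0 under H0).
lr_stat, p_value = lr_test_exogenous_switching(-2150.0, -2146.5, n_restrictions=2)
reject_at_5_percent = p_value < 0.05
```

In a model with breaks in the correlation parameters, such as the real GDP application, the number of restrictions would count each pre-break and post-break parameter separately.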
The estimated correlation parameters have different signs, with $\rho_1 < 0$ and $\rho_2 > 0$. Thus, large values of $u_t$ in equation (22) increase the likelihood of $S_t = 0$ (low volatility regime) relative to $S_t = 1$ (medium volatility regime), and increase the likelihood of $S_t = 2$ (highest volatility regime) relative to both $S_t = 0$ and $S_t = 1$.

Figure 4 shows the risk premium implied by three different volatility feedback models:

the exogenous switching model with three states (red dashed line), the endogenous switching model with three states (blue solid line), and the endogenous switching model with two states (green dotted line). The three-state endogenous switching model produces a risk premium that is more variable than the other models across volatility states. In particular, the risk premium from the three-state endogenous switching model rises above the risk premium from the other models during the highest volatility state, which from Figure 5 is seen to be highly correlated with NBER recessions. However, during the other volatility states, the risk premium from the three-state endogenous switching model is generally below that from the other models. On average, our model suggests a 9% risk premium, similar to that estimated by Kim et al. (2004) using the volatility feedback model assuming exogenous switching over the period 1952 to 1999. This estimated risk premium is higher than that of Fama and French (2002), who estimate an average risk premium of 2.5% using the average dividend yield plus the average dividend growth rate for the S&P 500 index over the period 1951 to 2000.

6 Conclusion

We proposed a novel N-state Markov-switching regression model in which the state indicator variable is correlated with the regression disturbance term. The model admits a wide variety of patterns for this correlation, while maintaining computational feasibility. Maximum likelihood estimation can be performed using extensions to the filter in Hamilton (1989), and the parameterization of the model allows for a simple test of the null hypothesis of exogenous switching. In simulation experiments, the maximum likelihood estimator performed well, and a likelihood ratio test of exogenous switching had good size and power properties. We allowed for endogenous switching in two applications: a switching mean model of U.S. real GDP, and a switching volatility model of U.S. equity returns.
We find statistically significant evidence of endogenous switching in both of these models, as well as quantitatively large differences in parameter estimates resulting from allowing for endogenous switching.

Appendix A.1: Derivation of $f(y_t \mid S_t, S_{t-1}, \Psi_{t-1}, Z_t, \theta)$

The iterative filter presented in Section 3 requires calculation of the regime-dependent density $f(y_t \mid S_t, S_{t-1}, \Psi_{t-1}, Z_t, \theta)$, where $y_t$ represents the random variable generated by the data generating process in equation (2) along with the endogenous regime-switching process described in Section 2. We have again suppressed the conditioning of this density on the covariates $x_t$. This appendix derives this regime-dependent density.

Let $y_t^{*}$ denote a realization of $y_t$ for which we wish to compute $f(y_t^{*} \mid S_t = i, S_{t-1} = j, \Psi_{t-1}, Z_t, \theta)$. Applying Bayes' rule yields:

$$f(y_t^{*} \mid S_t = i, S_{t-1} = j, \Psi_{t-1}, Z_t, \theta) = \frac{f(y_t^{*}, S_t = i \mid S_{t-1} = j, \Psi_{t-1}, Z_t, \theta)}{\Pr(S_t = i \mid S_{t-1} = j, \Psi_{t-1}, Z_t, \theta)} \tag{A.1}$$

The denominator of equation (A.1) is the time-varying transition probability, $p_{ij,t}$. Consider the following CDF of the numerator of (A.1):

$$\begin{aligned}
\Pr(y_t < y_t^{*}, S_t = i \mid S_{t-1} = j, \Psi_{t-1}, Z_t)
&= \int_{-\infty}^{y_t^{*}} f(y_t, S_t = i \mid S_{t-1} = j, \Psi_{t-1}, Z_t)\, dy_t \\
&= \int_{-\infty}^{\frac{y_t^{*} - x_t'\beta_i}{\sigma_i}} f(\varepsilon_t, S_t = i \mid S_{t-1} = j, \Psi_{t-1}, Z_t)\, d\varepsilon_t \\
&= \int_{-\infty}^{\frac{y_t^{*} - x_t'\beta_i}{\sigma_i}} \Pr(S_t = i \mid \varepsilon_t, S_{t-1} = j, \Psi_{t-1}, Z_t)\, f(\varepsilon_t \mid S_{t-1} = j, \Psi_{t-1}, Z_t)\, d\varepsilon_t \\
&= \int_{-\infty}^{\frac{y_t^{*} - x_t'\beta_i}{\sigma_i}} \Pr(S_t = i \mid \varepsilon_t, S_{t-1} = j, \Psi_{t-1}, Z_t)\, f(\varepsilon_t)\, d\varepsilon_t
\end{aligned}$$

where the validity of moving to the last line in this derivation is ensured by the independence of $\varepsilon_t$ over time, the exogeneity of $Z_t$, and the independence of $\varepsilon_t$ and $S_{t-1}$. Finally,

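differentiating this CDF with respect to $y_t^{*}$ yields:

$$f(y_t^{*}, S_t = i \mid S_{t-1} = j, \Psi_{t-1}, Z_t, \theta) = \Pr\left(S_t = i \,\Big|\, \frac{y_t^{*} - x_t'\beta_i}{\sigma_i}, S_{t-1} = j, \Psi_{t-1}, Z_t\right) f\left(\frac{y_t^{*} - x_t'\beta_i}{\sigma_i}\right) \tag{A.2}$$

where $(y_t^{*} - x_t'\beta_i)/\sigma_i$ is a realization of the random variable $\varepsilon_t$. The first term in (A.2) is the conditional transition probability, $\tilde{p}_{ij,t}$. Given the marginal Gaussian distribution for $\varepsilon_t$, the second term in equation (A.2) is:

$$f\left(\frac{y_t^{*} - x_t'\beta_i}{\sigma_i}\right) = \frac{1}{\sigma_i}\, \phi\left(\frac{y_t^{*} - x_t'\beta_i}{\sigma_i}\right)$$

Combining the above results, we have:

$$f(y_t^{*} \mid S_t = i, S_{t-1} = j, \Psi_{t-1}, Z_t, x_t) = \frac{\tilde{p}_{ij,t}}{p_{ij,t}} \left[ \frac{1}{\sigma_i}\, \phi\left(\frac{y_t^{*} - x_t'\beta_i}{\sigma_i}\right) \right]$$

which is equation (16) evaluated at $y_t^{*}$.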
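As a rough illustration of the final expression in this appendix, the following Python sketch evaluates the regime-dependent density as the ratio of the conditional to the unconditional transition probability times a Gaussian density. The function and argument names are hypothetical, the two probabilities are taken as given inputs rather than computed, and the numerical values are made up.

```python
import numpy as np
from scipy.stats import norm

def regime_density(y, x, beta_i, sigma_i, p_cond, p_uncond):
    """(p_cond / p_uncond) * (1 / sigma_i) * phi((y - x'beta_i) / sigma_i).

    p_cond   : transition probability conditional on the implied eps_t
    p_uncond : unconditional (time-varying) transition probability
    """
    eps = (y - np.dot(x, beta_i)) / sigma_i  # implied realization of eps_t
    return (p_cond / p_uncond) * norm.pdf(eps) / sigma_i

# Hypothetical inputs: an intercept-only regression with regime mean 0.8.
x = np.array([1.0])
beta_i = np.array([0.8])
density_exog = regime_density(1.2, x, beta_i, 0.7, p_cond=0.90, p_uncond=0.90)
density_endog = regime_density(1.2, x, beta_i, 0.7, p_cond=0.95, p_uncond=0.90)
```

When the conditional and unconditional probabilities coincide, as they do under exogenous switching, the ratio equals one and the expression reduces to the usual Gaussian regime density.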

Appendix A.2: Transition Probabilities for a Three-State Endogenous Markov-Switching Model

The unconditional and conditional transition probabilities defined in Section 2 take the form of a multivariate normal cumulative distribution function (CDF). In this appendix we explicitly characterize these CDFs for the case where N = 3. First, define $c_{1,j,t} = \gamma_{1,j} + z_t'\delta_{1,j}$ and $c_{2,j,t} = \gamma_{2,j} + z_t'\delta_{2,j}$. For unconditional transition probabilities, we have the following results:

1. $\Pr(S_t = 0 \mid S_{t-1} = j, z_t) = p_{0j,t} = \Pr(\eta_{1,t} < -c_{1,j,t},\ \eta_{2,t} < -c_{2,j,t})$ is calculated using the CDF of the following multivariate normal distribution:

$$\begin{pmatrix} \eta_{1,t} \\ \eta_{2,t} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho_1\rho_2 \\ \rho_1\rho_2 & 1 \end{pmatrix} \right)$$

2. $\Pr(S_t = 1 \mid S_{t-1} = j, z_t) = p_{1j,t} = \Pr(-\eta_{1,t} < c_{1,j,t},\ (\eta_{2,t} - \eta_{1,t}) < (c_{1,j,t} - c_{2,j,t}))$ is calculated using the CDF of the following multivariate normal distribution:

$$\begin{pmatrix} -\eta_{1,t} \\ \eta_{2,t} - \eta_{1,t} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 1 - \rho_1\rho_2 \\ 1 - \rho_1\rho_2 & 2(1 - \rho_1\rho_2) \end{pmatrix} \right)$$

3. Finally, $\Pr(S_t = 2 \mid S_{t-1} = j, z_t) = p_{2j,t}$ is calculated as:

$$p_{2j,t} = 1 - p_{0j,t} - p_{1j,t}$$
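The three unconditional probabilities above can be evaluated numerically with a bivariate normal CDF. The Python sketch below does this with `scipy.stats.multivariate_normal.cdf`; the function name is hypothetical, and the scalars `c1` and `c2` stand in for $c_{1,j,t}$ and $c_{2,j,t}$ for a given lagged state and date.

```python
import numpy as np
from scipy.stats import multivariate_normal

def unconditional_transition_probs(c1, c2, rho1, rho2):
    """Return [p0, p1, p2] for a given lagged state, with N = 3."""
    r = rho1 * rho2  # covariance between eta_1 and eta_2
    # State 0: Pr(eta_1 < -c1, eta_2 < -c2).
    cov0 = [[1.0, r], [r, 1.0]]
    p0 = multivariate_normal.cdf([-c1, -c2], mean=[0.0, 0.0], cov=cov0)
    # State 1: Pr(-eta_1 < c1, eta_2 - eta_1 < c1 - c2).
    cov1 = [[1.0, 1.0 - r], [1.0 - r, 2.0 * (1.0 - r)]]
    p1 = multivariate_normal.cdf([c1, c1 - c2], mean=[0.0, 0.0], cov=cov1)
    # State 2: the remaining probability mass.
    p2 = 1.0 - p0 - p1
    return np.array([p0, p1, p2])

# Symmetric benchmark: zero thresholds and no correlation.
probs = unconditional_transition_probs(c1=0.0, c2=0.0, rho1=0.0, rho2=0.0)
```

In this symmetric benchmark the probabilities can be derived by hand as 1/4, 3/8 and 3/8, which gives a simple check on the signs used in the two CDF evaluations.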

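For conditional transition probabilities, we have the following results:

1. $\Pr(S_t = 0 \mid S_{t-1} = j, z_t, \varepsilon_t) = \tilde{p}_{0j,t} = \Pr(\eta_{1,t} < -c_{1,j,t},\ \eta_{2,t} < -c_{2,j,t} \mid \varepsilon_t)$ can be calculated as:

$$\tilde{p}_{0j,t} = \Pr(\eta_{1,t} < -c_{1,j,t} \mid \varepsilon_t)\, \Pr(\eta_{2,t} < -c_{2,j,t} \mid \varepsilon_t) = \Phi\left(\frac{-c_{1,j,t} - \rho_1\varepsilon_t}{\sqrt{1 - \rho_1^2}}\right) \Phi\left(\frac{-c_{2,j,t} - \rho_2\varepsilon_t}{\sqrt{1 - \rho_2^2}}\right)$$

where $\Phi(\cdot)$ denotes the CDF of the univariate standard normal density. This is justified by the conditional independence assumption detailed in equation (11).

2. $\Pr(S_t = 1 \mid S_{t-1} = j, z_t, \varepsilon_t) = \tilde{p}_{1j,t} = \Pr(-\eta_{1,t} < c_{1,j,t},\ (\eta_{2,t} - \eta_{1,t}) < (c_{1,j,t} - c_{2,j,t}) \mid \varepsilon_t)$ is calculated as the CDF of the following multivariate normal density:

$$\begin{pmatrix} -\eta_{1,t} \\ \eta_{2,t} - \eta_{1,t} \end{pmatrix} \Bigg|\, \varepsilon_t \sim N\left( \begin{pmatrix} -\rho_1\varepsilon_t \\ (\rho_2 - \rho_1)\varepsilon_t \end{pmatrix}, \begin{pmatrix} 1 - \rho_1^2 & 1 - \rho_1^2 \\ 1 - \rho_1^2 & 2 - \rho_1^2 - \rho_2^2 \end{pmatrix} \right)$$

3. Finally, $\Pr(S_t = 2 \mid S_{t-1} = j, z_t, \varepsilon_t) = \tilde{p}_{2j,t}$ is calculated as:

$$\tilde{p}_{2j,t} = 1 - \tilde{p}_{0j,t} - \tilde{p}_{1j,t}$$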
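The conditional probabilities above can be sketched in Python as follows. The function names are hypothetical and the parameter values are illustrative only. As a consistency check, the sketch also averages the conditional probability of state 0 over $\varepsilon_t \sim N(0, 1)$ by numerical integration, which should recover the corresponding unconditional probability from the bivariate normal CDF with covariance $\rho_1\rho_2$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import multivariate_normal, norm

def conditional_transition_probs(c1, c2, rho1, rho2, eps):
    """Return [p0, p1, p2] conditional on eps_t for a given lagged state."""
    v1 = 1.0 - rho1**2  # Var(eta_1 | eps)
    v2 = 1.0 - rho2**2  # Var(eta_2 | eps)
    # State 0: product of two univariate normal CDFs (conditional independence).
    p0 = (norm.cdf((-c1 - rho1 * eps) / np.sqrt(v1))
          * norm.cdf((-c2 - rho2 * eps) / np.sqrt(v2)))
    # State 1: bivariate normal CDF for (-eta_1, eta_2 - eta_1) given eps.
    mean = [-rho1 * eps, (rho2 - rho1) * eps]
    cov = [[v1, v1], [v1, v1 + v2]]
    p1 = multivariate_normal.cdf([c1, c1 - c2], mean=mean, cov=cov)
    return np.array([p0, p1, 1.0 - p0 - p1])

def averaged_p0(c1, c2, rho1, rho2):
    """Integrate the conditional state-0 probability against the N(0,1) density."""
    integrand = lambda e: conditional_transition_probs(c1, c2, rho1, rho2, e)[0] * norm.pdf(e)
    value, _ = quad(integrand, -8.0, 8.0)
    return value

# Hypothetical parameter values.
c1, c2, rho1, rho2 = 0.4, -0.3, -0.5, 0.3
probs_at_eps = conditional_transition_probs(c1, c2, rho1, rho2, eps=1.0)
p0_avg = averaged_p0(c1, c2, rho1, rho2)
r = rho1 * rho2
p0_uncond = multivariate_normal.cdf([-c1, -c2], mean=[0.0, 0.0], cov=[[1.0, r], [r, 1.0]])
```

Setting both correlation parameters to zero makes the conditional probabilities independent of $\varepsilon_t$, which corresponds to the exogenous switching case.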

References

Bekaert, G. and G. Wu (2000). Asymmetric volatility and risk in equity markets. Review of Financial Studies 13 (1), 1-42.
Boldin, M. D. (1996). A check on the robustness of Hamilton's Markov switching model approach to the economic analysis of the business cycle. Studies in Nonlinear Dynamics and Econometrics 1 (1), 35-46.
Burns, A. F. and W. C. Mitchell (1946). Measuring business cycles. New York: National Bureau of Economic Research.
Camacho, M. and G. Perez-Quiros (2007). Jump and rest effects of U.S. business cycles. Studies in Nonlinear Dynamics and Econometrics 11 (4).
Campbell, J. Y. and L. Hentschel (1992). No news is good news: An asymmetric model of changing volatility in stock returns. Journal of Financial Economics 31 (3), 281-318.
Diebold, F. X., J.-H. Lee, and G. C. Weinbach (1994). Regime switching with time-varying transition probabilities. In C. Hargreaves (Ed.), Nonstationary Time Series Analysis and Cointegration, Advanced Texts and Econometrics, Oxford and New York, pp. 283-302. Oxford University Press.
Fama, E. F. and K. R. French (2002). The equity premium. Journal of Finance 57 (2), 637-659.
Filardo, A. J. (1994). Business cycle phases and their transitional dynamics. Journal of Business and Economic Statistics 12 (3), 299-308.
French, K. R., W. G. Schwert, and R. F. Stambaugh (1987). Expected stock returns and volatility. Journal of Financial Economics 19 (1), 3-29.
Garcia, R. and P. Perron (1996). An analysis of the real interest rate under regime shifts. Review of Economics and Statistics 78 (1), 111-125.

Goldfeld, S. M. and R. E. Quandt (1973). A Markov model for switching regressions. Journal of Econometrics 1 (1), 3-16.
Guidolin, M. and A. Timmermann (2005). Economic implications of bull and bear regimes in UK stock and bond returns. Economic Journal 115 (500), 111-143.
Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57 (2), 357-384.
Hamilton, J. D. (2005). What's real about the business cycle? Federal Reserve Bank of St. Louis Review 87 (4), 435-452.
Hamilton, J. D. (2008). Regime switching models. In S. N. Durlauf and L. E. Blume (Eds.), New Palgrave Dictionary of Economics, 2nd Edition. Palgrave MacMillan.
Kang, K. H. (2014). Estimation of state-space models with endogenous Markov regime-switching parameters. Econometrics Journal 17 (1), 56-82.
Kim, C.-J. (1994). Dynamic linear models with Markov switching. Journal of Econometrics 60 (1-2), 1-22.
Kim, C.-J., J. Morley, and J. Piger (2005). Nonlinearity and the permanent effects of recessions. Journal of Applied Econometrics 20 (2), 291-309.
Kim, C.-J., J. C. Morley, and C. R. Nelson (2004). Is there a positive relationship between stock market volatility and the equity premium? Journal of Money, Credit and Banking 36 (3), 339-360.
Kim, C.-J. and C. J. Murray (2002). Permanent and transitory components of recessions. Empirical Economics 27 (2), 163-183.
Kim, C.-J. and C. R. Nelson (1999a). Has the U.S. economy become more stable? A Bayesian approach based on a Markov-switching model of the business cycle. Review of Economics and Statistics 81 (4), 608-616.

Kim, C.-J. and C. R. Nelson (1999b). State-Space Models with Regime Switching. Cambridge, MA: The MIT Press.
Kim, C.-J., J. Piger, and R. Startz (2008). Estimation of Markov regime-switching regressions with endogenous switching. Journal of Econometrics 143 (2), 263-273.
McConnell, M. M. and G. Perez-Quiros (2000). Output fluctuations in the United States: What has changed since the early 1980's? American Economic Review 90 (5), 1464-1476.
Piger, J. (2009). Econometrics: Models of regime changes. In B. Mizrach (Ed.), Encyclopedia of Complexity and System Science, New York. Springer.
Sichel, D. E. (1994). Inventories and the three phases of the business cycle. Journal of Business and Economic Statistics 12 (3), 269-277.
Sims, C. A. and T. Zha (2006). Were there regime switches in U.S. monetary policy? American Economic Review 96 (1), 54-81.
Turner, C. M., R. Startz, and C. R. Nelson (1989). A Markov model of heteroskedasticity, risk, and learning in the stock market. Journal of Financial Economics 25 (1), 3-22.

Figure 1: $\Pr(S_t = i \mid S_{t-1} = j)$ vs. $\Pr(S_t = i \mid S_{t-1} = j, \varepsilon_t)$, with $\rho_1 = 0.5$, $\rho_2 = 0.9$

Notes: These graphs show the unconditional transition probability, $\Pr(S_t = i \mid S_{t-1} = j)$ (horizontal dashed line), and the transition probability conditional on the continuous disturbance term in equation (2), $\Pr(S_t = i \mid S_{t-1} = j, \varepsilon_t)$ (solid line). In all panels, $j \to i$ indicates transitions from state $j$ to state $i$, and the x-axis measures alternative values of $\varepsilon_t$.

Figure 2: $\Pr(S_t = i \mid S_{t-1} = j)$ vs. $\Pr(S_t = i \mid S_{t-1} = j, \varepsilon_t)$, with $\rho_1 = 0.9$, $\rho_2 = 0.9$

Notes: These graphs show the unconditional transition probability, $\Pr(S_t = i \mid S_{t-1} = j)$ (horizontal dashed line), and the transition probability conditional on the continuous disturbance term in equation (2), $\Pr(S_t = i \mid S_{t-1} = j, \varepsilon_t)$ (solid line). In all panels, $j \to i$ indicates transitions from state $j$ to state $i$, and the x-axis measures alternative values of $\varepsilon_t$.

Figure 3: Smoothed State Probabilities for Three Regime Model of Real GDP Growth

Panels: (a) Probability of $S_t = 0$; (b) Probability of $S_t = 1$; (c) Probability of $S_t = 2$

Notes: Smoothed probability of mature expansion phase ($S_t = 0$), recession phase ($S_t = 1$), and post-recession recovery phase ($S_t = 2$) from 1947:Q2 to 2016:Q2. Dotted lines denote the regime probability estimated by the exogenous switching model, and solid lines represent the regime probability estimated by the endogenous switching model. NBER recessions are shaded.

Figure 4: Risk Premium from Alternative Volatility Feedback Models

Notes: Risk premium implied by different Markov-switching volatility feedback models. The red dashed line reports the risk premium produced by the exogenous switching model with three states, the green dotted line reports the risk premium produced by the endogenous switching model with two states, and the blue solid line reports the risk premium produced by the endogenous switching model with three states. NBER recessions are shaded.

Figure 5: Smoothed State Probabilities from Three Regime Volatility Feedback Model with Endogenous Switching

Panels: (a) Probability of $S_t = 0$; (b) Probability of $S_t = 1$; (c) Probability of $S_t = 2$

Notes: Smoothed probability of low volatility phase ($S_t = 0$), medium volatility phase ($S_t = 1$), and high volatility phase ($S_t = 2$). NBER recessions are shaded.