A Bayesian Evaluation of Alternative Models of Trend Inflation

Todd E. Clark
Federal Reserve Bank of Cleveland

Taeyoung Doh
Federal Reserve Bank of Kansas City

April 2011

Abstract

This paper uses Bayesian methods to assess alternative models of trend inflation. We first use Bayesian metrics to compare the fits of alternative models. We then use Bayesian methods of model averaging to account for uncertainty surrounding the model of trend inflation, to obtain an alternative estimate of trend inflation in the U.S. and to generate medium-term, model-average forecasts of inflation. Reflecting models common in reduced-form inflation modeling and forecasting, we specify a range of models of inflation, including: AR with constant trend; AR with trend equal to last period's inflation rate; local level model; AR with random walk trend; AR with trend equal to the long-run expectation from the Survey of Professional Forecasters (SPF); and AR with time-varying parameters. We consider versions of the models with constant shock variances and with stochastic volatility. Our results show that, in terms of model fit and density forecast accuracy, the models with stochastic volatility dominate those with constant volatility. For core inflation, our Bayesian measures of model fit indicate the SPF and local level specifications of trend are about equally good. However, practically speaking, the differences in forecast performance are small enough that it is difficult to draw meaningful distinctions among alternative models of trend inflation.

Keywords: Likelihood, model combination, forecasting

JEL Classifications: E31, E37, C11

Clark (corresponding author): Economic Research Dept.; Federal Reserve Bank of Cleveland; P.O. Box 6387; Cleveland, OH 44101; todd.clark@clev.frb.org. Doh: Economic Research Dept.; Federal Reserve Bank of Kansas City; 1 Memorial Drive; Kansas City, MO 64198; taeyoung.doh@kc.frb.org.
We gratefully acknowledge helpful conversations with Gianni Amisano and Francesco Ravazzolo. The views expressed herein are solely those of the authors and do not necessarily reflect the views of the Federal Reserve Banks of Cleveland or Kansas City or the Federal Reserve Board of Governors.

1 Introduction

Recently, some forecasts of U.S. inflation made during the sharp 2007-2009 recession and the early stages of the ensuing recovery have highlighted the crucial role of the concept of trend inflation incorporated in alternative forecasting models. As Williams (2009) describes, models in which the inflation trend is represented by the 10-year ahead inflation forecast from the Survey of Professional Forecasters yield materially higher forecasts of inflation and, in turn, a lower risk of deflation than do models in which the inflation trend is simply a function of past inflation. More generally, prior studies such as Kozicki and Tinsley (1998) and Clark and McCracken (2008) have shown that the concept of trend embedded in an inflation model plays a key role in longer-term inflation forecasts. At the same time, the amount of time variation in mean or trend inflation has implications for the persistence of inflation relative to trend. The persistence of inflation also plays a key role in forecasts. For example, some estimates based on models allowing breaks in the inflation process in the early 1990s yield little inflation persistence, suggesting inflation will relatively quickly revert to trend after a departure from trend (e.g., Williams (2009)). Other estimates that treat the inflation process as constant through much of the 1980s and 1990s yield higher persistence and inflation forecasts that more slowly return to baseline after a shock.

But as Williams (2009) suggests, exactly how the trend in inflation should be modeled is not clear. For example, for most of the period since 1995, the Survey of Professional Forecasters projected long-term CPI inflation (10 years ahead) of 2.5 percent. Yet core inflation was generally well below that threshold for most of the period. Since late 2007, long-term forecasts of PCE inflation from the Survey of Professional Forecasters (SPF) have consistently been a bit higher than longer-term projections from the FOMC.[1]
[1] For example, in late April 2010, the central tendency of FOMC participants' longer-run projection of PCE inflation (the longer-run projections represent each participant's assessment of the rate to which each variable would be expected to converge over time under appropriate monetary policy and in the absence of further shocks) was 1.5 to 2.0 percent. In the mid-May Survey of Professional Forecasters, the median forecast of average PCE inflation for 2009-2018 was 2.3 percent.

Some researchers and forecasters (e.g., Macroeconomic Advisers (2007)) simply take long-run survey expectations as a good measure of trend; Clark and Davig's (2008) estimates of a joint model of actual inflation and inflation expectations show that long-run survey expectations are essentially the same as trend. Other research suggests a range of possible and reasonable models of trend inflation.

A number of studies and still-used forecasting models measure inflation expectations, or in effect, trend inflation, with past inflation (e.g., Brayton, Roberts, and Williams 1999, Gordon 1998, and Macroeconomic Advisers 1997).[2] Another array of studies has modeled trend inflation as following a random walk (e.g., Cogley, Primiceri, and Sargent (2010), Cogley and Sbordone (2008), Ireland (2007), Kiley (2008), Kozicki and Tinsley (2006), Piger and Rasche (2008), and Stock and Watson (2007, 2010)). Inflation less trend follows an autoregressive process in some of these studies (e.g., Cogley, Primiceri, and Sargent (2010)) but not in others (Stock and Watson (2007)). In addition, some research using random walk trends (Stock and Watson (2007) and Cogley, Primiceri, and Sargent (2010)) has found that the size of the trend component in inflation has varied over time.

Accordingly, this paper uses Bayesian methods to assess alternative models of trend inflation. We first use Bayesian metrics to compare the fits of alternative models. We then use Bayesian methods of model averaging to account for uncertainty surrounding the model of trend inflation, to obtain an alternative estimate of trend inflation in the U.S. and to generate medium-term, model-average forecasts of inflation. Reflecting models common in reduced-form inflation modeling and forecasting, we specify a range of models of inflation. We use predictive likelihoods to weight each model and forecast and construct probability-weighted average estimates of trend inflation and inflation forecasts. For forecasting, we consider not only point predictions but also density forecasts (specifically, deflation probabilities and average log predictive scores). Morley and Piger (2010) follow a broadly similar Bayesian model averaging method to estimate the trend and business cycle component of GDP.

In the interest of simplicity, we focus on univariate models of inflation.
Studies such as Atkeson and Ohanian (2001), Clark and McCracken (2008, 2010), and Stock and Watson (2003, 2007) have found that, in U.S. data since at least the mid-1980s, univariate models of inflation typically forecast better than do multivariate models. Our set of models incorporates significant differences in the trend specification: AR with constant trend; AR with trend equal to last period's inflation rate; local level model; AR with random walk trend; AR with trend equal to the long-run expectation from the Survey of Professional Forecasters (SPF); and AR with time-varying parameters. Finally, we consider versions of the models with constant shock variances and with time-varying shock variances (stochastic volatility).

[2] In a related formulation, Cogley (2002) develops an exponential smoothing model of trend.

To further highlight the importance of trend specification to inflation inferences, in assessing deflation probabilities we also consider some specifications that include an unemployment gap.

Our results show that, in terms of model fit and density forecast accuracy, the models with stochastic volatility dominate those with constant volatility. The incorporation of stochastic volatility also materially affects estimates of the probability of deflation, sharply lowering them in the past decade, a period in which deflation became a concern of policymakers and others. Among alternative models of trend, for core inflation, our Bayesian measures of model fit for the full sample indicate the SPF and local level specifications of trend are about equally good, and strongly dominate other specifications of trend. However, we also find that model fit has evolved considerably over the sample. For example, relative to other models, the fit of the local level model with stochastic volatility improved significantly in the last few years of the sample. Up through 2006, the survey-based trend specification yielded a better fit of the data. For inflation in the GDP price index, the full sample evidence puts the most weight on the AR model in which trend inflation is last period's inflation rate, and much less weight (although not zero weight) on the local level model with stochastic volatility. However, among alternative models of trend, out-of-sample forecast performance differences are small enough to make it difficult to draw meaningful distinctions among alternative models of trend inflation. For example, for core inflation, while the local level-stochastic volatility model typically yields the smallest root mean square errors (RMSEs), the improvements over other models are very small, practically speaking.

The paper proceeds as follows. Section 2 describes the data. Sections 3 and 4 present the models and estimation methodology, respectively. Section 5 presents the results.
An appendix provides details of the estimation algorithms, priors, and computation of predictive likelihoods. Section 6 concludes.

2 Data

We focus on modeling and forecasting core PCE inflation, which refers to the price index for personal consumption expenditures excluding food and energy, the Federal Reserve's preferred measure of inflation. But we also include some results for a broader measure of inflation, the GDP price index, often used in assessments of inflation dynamics and forecast comparisons. Inflation rates are computed as an annualized log percent change ($400 \ln(P_t/P_{t-1})$).

Our time period of focus (for our estimates) is 1960 through 2009. To obtain training sample estimates (section 4 provides more detail on the samples), we use other data to extend the core PCE and GDP price indexes back in time. Specifically, we merge the published core PCE inflation rate (beginning in 1959:Q2) with: (1) a constructed measure of core inflation for 1947:Q2 through 1959:Q1 that excludes energy goods (for which data go back to 1947) but not energy services (for which data only go back to 1959); and (2) overall CPI inflation for 1913:Q2 through 1947:Q1.[3] In analysis of inflation in the GDP price index, for which published data extend back through 1947, we merge the published GDP inflation rate (beginning in 1947:Q2) with overall CPI inflation for 1913:Q2 through 1947:Q1.

For our models that rely on survey forecasts of long-run inflation as the measure of trend inflation, we use the survey-based long-run (5- to 10-year-ahead) PCE inflation expectations series of the Federal Reserve Board of Governors' FRB/US econometric model. The FRB/US measure splices econometric estimates of inflation expectations from Kozicki and Tinsley (2001a) early in the sample to 5- to 10-year-ahead survey measures compiled by Richard Hoey and, later in the sample, to 10-year-ahead values from the Federal Reserve Bank of Philadelphia's Survey of Professional Forecasters. In presenting the results, we refer to this series as PTR, using the notation of the FRB/US model.

Finally, for our analysis of deflation probabilities from models that include an economic activity indicator, we use an unemployment gap, defined as the unemployment rate for men between the ages of 25 and 54 less a one-sided estimate of the trend in that unemployment rate. The use of an unemployment rate for prime-age men helps to reduce the influence of long-term demographic trends.
Following Clark (2011), we use exponential smoothing to estimate the trend, with a smoothing coefficient of 0.02. With the exception of the CPI index obtained from the Bureau of Labor Statistics, we obtained all of the data from the FAME database of the Federal Reserve Board of Governors.

[3] We used the methodology of Whelan (2002) to construct the measure of core inflation from raw data series on total PCE and energy goods spending and prices. As to the historical CPI, we obtained a 1967 base year index from the website of the Bureau of Labor Statistics and seasonally adjusted it with the X11 algorithm.
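The two data transformations described in this section, the annualized log percent change and the one-sided exponentially smoothed trend behind the unemployment gap, can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the recursion form trend_t = trend_{t-1} + 0.02 (x_t - trend_{t-1}), the initialization of the trend at the first observation, and the sample values are all assumptions made for the example.

```python
import math

def annualized_log_change(prices):
    """Quarterly inflation in annualized percent: 400 * ln(P_t / P_{t-1})."""
    return [400.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def one_sided_trend(series, alpha=0.02):
    """One-sided exponential smoothing with smoothing coefficient alpha.

    Assumed recursion: trend_t = trend_{t-1} + alpha * (x_t - trend_{t-1}).
    Starting the trend at the first observation is an assumption.
    """
    trend = [series[0]]
    for x in series[1:]:
        trend.append(trend[-1] + alpha * (x - trend[-1]))
    return trend

def unemployment_gap(unemployment, alpha=0.02):
    """Unemployment rate less its one-sided trend estimate."""
    trend = one_sided_trend(unemployment, alpha)
    return [u - t for u, t in zip(unemployment, trend)]

# Hypothetical inputs, for illustration only.
price_index = [100.0, 100.5, 101.1, 101.6]
inflation = annualized_log_change(price_index)
gap = unemployment_gap([4.0, 4.2, 4.8, 5.5])
```

Because the smoother only uses data through period t, the resulting gap is available in real time, which is what makes it usable in a recursive forecasting exercise.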

3 Models

3.1 Constant volatility

Our baseline model specification relates current inflation to past inflation and current and past rates of trend inflation:

$$\pi_t - \pi_t^* = b(L)(\pi_{t-1} - \pi_{t-1}^*) + v_t, \tag{1}$$

where $\pi_t$ denotes actual inflation and $\pi_t^*$ denotes trend inflation. The trend corresponds to the moving endpoint concept of Kozicki and Tinsley (1998, 2001a,b): $\pi_t^* = \lim_{h \to \infty} E_t \pi_{t+h}$. In the absence of deterministic terms, this concept of trend inflation is the same as in the Beveridge and Nelson (2001) decomposition.

Building on this baseline, we consider a range of models, each incorporating a different specification of the trend in inflation. These specifications include:

$$\begin{aligned}
\pi_t^* &= \text{constant} && \text{(constant trend)} \\
\pi_t^* &= \pi_{t-1} && \text{($\pi_{t-1}$ trend)} \\
\pi_t^* &= \text{long-run survey forecast} && \text{(PTR trend)} \\
\pi_t^* &= \pi_{t-1}^* + n_t && \text{(random walk trend)}
\end{aligned}$$

The first model is a stationary AR model, written in demeaned form; Villani (2009) develops an approach for estimating such models with a steady state prior. The second model is a stationary AR model in the first difference of inflation. The third and fourth models are AR models in detrended inflation, where trend is defined as the PTR trend in one case and a random walk process in the other.

We consider two other related model formulations. The first is the local level model, considered in such studies as Stock and Watson (2007):

$$\pi_t = \pi_t^* + v_t, \qquad \pi_t^* = \pi_{t-1}^* + n_t. \tag{2}$$

This model simplifies the version of equation (1) with a random walk trend by imposing AR coefficients of zero. The local level model implies a filtered trend estimate that could be computed by exponential smoothing. The AR model with a random walk trend in inflation generalizes this local level model by allowing autoregressive dynamics in the deviation from trend; this form of the AR model is equivalent to the trend-cycle model of Watson (1986).
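To illustrate the remark that the local level model implies a trend estimate computable by exponential smoothing, the sketch below simulates the model with constant variances and applies the steady-state Kalman filter, which reduces to exponential smoothing with a fixed gain. The gain formula is the standard steady-state result for this model, not something taken from the paper; the function names, starting values, and seed are hypothetical, and the variances passed in at the end are the posterior means reported in Section 5.1.

```python
import math
import random

def simulate_local_level(n, var_v, var_n, trend0=2.0, seed=0):
    """Simulate pi_t = trend_t + v_t, trend_t = trend_{t-1} + n_t."""
    rng = random.Random(seed)
    trend, inflation, trends = trend0, [], []
    for _ in range(n):
        trend += rng.gauss(0.0, math.sqrt(var_n))
        inflation.append(trend + rng.gauss(0.0, math.sqrt(var_v)))
        trends.append(trend)
    return inflation, trends

def steady_state_gain(var_v, var_n):
    """Steady-state Kalman gain for the local level model.

    The one-step-ahead trend variance P solves P^2 = var_n * P + var_n * var_v,
    and the gain is P / (P + var_v).
    """
    p = 0.5 * (var_n + math.sqrt(var_n ** 2 + 4.0 * var_n * var_v))
    return p / (p + var_v)

def smooth(inflation, gain):
    """Exponential smoothing: est_t = est_{t-1} + gain * (pi_t - est_{t-1})."""
    est = [inflation[0]]  # initializing at the first observation is an assumption
    for pi in inflation[1:]:
        est.append(est[-1] + gain * (pi - est[-1]))
    return est

gain = steady_state_gain(var_v=0.223, var_n=0.490)
inflation, true_trend = simulate_local_level(200, var_v=0.223, var_n=0.490)
filtered_trend = smooth(inflation, gain)
```

A larger trend-shock variance relative to the noise variance raises the gain, so the filtered trend tracks actual inflation more closely.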

The second alternative model we consider is an AR specification with time variation in all coefficients, considered in such studies as Cogley and Sargent (2005):

$$\pi_t = b_{0,t} + \sum_{i=1}^{p} b_{i,t}\,\pi_{t-i} + v_t, \qquad b_t = b_{t-1} + n_t, \tag{3}$$

where the vector $b_t$ contains the intercept and slope coefficients of the AR model and $\mathrm{var}(n_t) = Q$. For this time-varying parameters (TVP) model, we follow Cogley and Sargent (2005) and others in the growing TVP literature in estimating trend inflation as the instantaneous mean, as implied in each period by the intercept and slope coefficients.

In our examination of the role of trend inflation in assessing the probability of deflation, we supplement our analysis with two models augmented to include the unemployment gap. Letting $y_t$ denote the vector containing $\pi_t - \pi_t^*$ and $u_t$, where $u$ denotes the unemployment gap, we use a conventional VAR (without intercept):

$$y_t = b(L)\,y_{t-1} + v_t. \tag{4}$$

Finally, as to the lag order of the models with autoregressive dynamics (every model except the local level), for computational tractability we fix the lag order at two for all models. As a robustness check, we estimated most of the models with stochastic volatility using four lags and obtained very similar results (for four lags versus two) for model fit and forecast performance.

3.2 Stochastic volatility

In light of existing evidence of sharp changes over time in the volatility of inflation (e.g., Clark (2011), Cogley and Sargent (2005), and Stock and Watson (2007)), we consider versions of the models above supplemented to allow for time variation in the residual variances. As emphasized in Clark (2011) and Jore, Mitchell, and Vahey (2010), modeling changes in volatility is likely to be essential for accurate forecasts of measures that require the entire forecast density, e.g., the probability of deflation.
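With stochastic volatility, the basic model specification is:

$$\begin{aligned}
\pi_t - \pi_t^* &= b(L)(\pi_{t-1} - \pi_{t-1}^*) + v_t, \\
v_t &= \lambda_t^{0.5}\,\epsilon_t, \quad \epsilon_t \sim N(0, 1), \\
\log(\lambda_t) &= \log(\lambda_{t-1}) + \nu_t, \quad \nu_t \sim N(0, \phi).
\end{aligned} \tag{5}$$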

This specification applies to the constant trend, $\pi_{t-1}$ trend, PTR trend, and random walk trend models. In the case of the random walk trend specification, the model also includes an equation for the trend:

$$\pi_t^* = \pi_{t-1}^* + n_t, \quad n_t \sim N(0, \sigma_n^2). \tag{6}$$

The local level model with stochastic volatility takes the form given in Stock and Watson (2007):

$$\begin{aligned}
\pi_t &= \pi_t^* + v_t, \\
\pi_t^* &= \pi_{t-1}^* + n_t, \\
v_t &= \lambda_{v,t}^{0.5}\,\epsilon_{v,t}, \quad \epsilon_{v,t} \sim N(0, 1), \\
n_t &= \lambda_{n,t}^{0.5}\,\epsilon_{n,t}, \quad \epsilon_{n,t} \sim N(0, 1), \\
\log(\lambda_{v,t}) &= \log(\lambda_{v,t-1}) + \nu_{v,t}, \quad \nu_{v,t} \sim N(0, \phi_v), \\
\log(\lambda_{n,t}) &= \log(\lambda_{n,t-1}) + \nu_{n,t}, \quad \nu_{n,t} \sim N(0, \phi_n).
\end{aligned} \tag{7}$$

In this paper, we generalize the approach of Stock and Watson (2007) by actually estimating the key parameters of the model, the variances of shocks to log volatilities, rather than treating them as fixed. However, we obtained very similar results from the model when we effectively fixed these variances at the Stock-Watson values (of 0.04) by setting the prior means of $\phi_v$ and $\phi_n$ at 0.04 and the prior degrees of freedom at 10,000.

The TVP model with stochastic volatility takes the form given in Cogley and Sargent (2005), simplified to a univariate process:

$$\begin{aligned}
\pi_t &= b_{0,t} + \sum_{i=1}^{p} b_{i,t}\,\pi_{t-i} + v_t, \\
b_t &= b_{t-1} + n_t, \quad \mathrm{var}(n_t) = Q, \\
v_t &= \lambda_t^{0.5}\,\epsilon_t, \quad \epsilon_t \sim N(0, 1), \\
\log(\lambda_t) &= \log(\lambda_{t-1}) + \nu_t, \quad \nu_t \sim N(0, \phi).
\end{aligned} \tag{8}$$

Finally, for the bivariate VAR models that include both inflation less trend and the unemployment gap, letting $y_t$ denote the vector of variables included, the bivariate VAR with stochastic volatility takes the form:

$$\begin{aligned}
y_t &= B(L)\,y_{t-1} + v_t, \\
v_t &= A^{-1}\Lambda_t^{0.5}\,\epsilon_t, \quad \epsilon_t \sim N(0, I_2), \quad \Lambda_t = \mathrm{diag}(\lambda_{1,t}, \lambda_{2,t}), \\
\log(\lambda_{i,t}) &= \log(\lambda_{i,t-1}) + \nu_{i,t}, \quad \nu_{i,t} \sim N(0, \phi_i), \quad i = 1, 2,
\end{aligned} \tag{9}$$

where $A$ is a lower triangular matrix with ones on the diagonal and a coefficient $a_{21}$ in row 2, column 1. Again, for simplicity, for all models with autoregressive dynamics (every model except the local level), we fix the lag order at two.

4 Estimation

4.1 Algorithms

Focusing on the 1960 to 2009 period, we estimate the models described above using Bayesian Markov Chain Monte Carlo (MCMC) methods. More specifically, to limit the influence of priors, we use empirical Bayes methods, specifying the priors for our model estimates on the basis of posterior estimates obtained from an earlier sample. For this purpose, we divide our data sample into three pieces: an estimation sample of 1960-2009, an intermediate estimation sample of 1947-59 for which we generate posterior estimates that form the basis of our priors for the estimation sample, and a training sample of 1913-1946 that we use to set priors for estimation in the intermediate sample.

We use Gibbs samplers to estimate the models with constant volatilities. For the constant trend model, we use the algorithm detailed in Villani (2009). For the $\pi_{t-1}$ trend and PTR trend specifications, the algorithm takes the Normal-diffuse form of Kadiyala and Karlsson (1997). Estimation of the local level and random walk trend models is a straightforward application of Gibbs sampling, using state-space representations of the models. Estimation of the TVP model is described in sources such as Cogley and Sargent (2001); the algorithm for the local level model is effectively the same, with a lag length of 0 in the AR model.[4] In all cases (except the local level model), we follow the approach of Cogley and Sargent (2005), among others, in discarding draws with explosive autoregressive roots (and re-drawing).
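The rule of discarding draws with explosive autoregressive roots can be sketched for the AR(2) case used throughout the paper. This is an illustrative rejection step, not the authors' sampler: the helper names, the proposal distribution in the usage example, and the strict treatment of roots on the unit circle as inadmissible are all assumptions.

```python
import cmath
import random

def ar2_is_stationary(b1, b2):
    """True if both roots of z^2 - b1*z - b2 = 0 lie strictly inside the unit circle."""
    disc = cmath.sqrt(b1 * b1 + 4.0 * b2)
    roots = ((b1 + disc) / 2.0, (b1 - disc) / 2.0)
    return all(abs(r) < 1.0 for r in roots)

def draw_until_stationary(draw_coefficients, max_tries=1000):
    """Re-draw (b1, b2) until the implied AR(2) process is non-explosive."""
    for _ in range(max_tries):
        b1, b2 = draw_coefficients()
        if ar2_is_stationary(b1, b2):
            return b1, b2
    raise RuntimeError("no stationary draw found within max_tries")

# Hypothetical stand-in for a conditional posterior draw of the AR coefficients.
rng = random.Random(1)
b1, b2 = draw_until_stationary(lambda: (rng.gauss(0.6, 0.3), rng.gauss(0.3, 0.3)))
```

In a full Gibbs sampler the callable would be the conditional posterior for the AR coefficients given the current trend and volatility states; here it is only a placeholder.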
We use Metropolis-within-Gibbs MCMC algorithms to estimate the models with stochastic volatilities, combining some of the key Gibbs sampling steps for the constant volatility models with Cogley and Sargent's (2005) Metropolis algorithm (taken from Jacquier, Polson, and Rossi (1994)) for stochastic volatility. For the constant trend, $\pi_{t-1}$ trend, and PTR trend specifications with stochastic volatility, our algorithms are the same as those used in Clark (2011) and Clark and Davig (2011). For the TVP model with stochastic volatility, our algorithm takes the form described in Cogley and Sargent (2005).[5] Again, for all models, we discard draws with explosive autoregressive roots. The appendix provides more detail on all of the algorithms for estimation with stochastic volatility.

[4] For all of our models with time-varying trends or coefficients, we use the algorithm of Durbin and Koopman (2002) for the backward smoothing and simulation, because the Durbin-Koopman algorithm is faster in the software we used.

All of our reported results are based on samples of 5000 posterior draws. To ensure the reliability of our results, we estimated each model with a large number of MCMC draws, obtained by first performing burn-in draws and then taking additional draws, from which we retained every $k$th draw to obtain a sample of 5000 draws. Skipping draws is intended to reduce correlation across retained posterior draws. Koop and Potter (2008) show that MCMC chains for VARs with TVP can be quite slow to mix, a finding confirmed in Clark and Davig (2011). Drawing on this previous evidence of convergence properties and our own selected checks of convergence properties, we use larger burn-in samples and higher skip intervals for models with latent states (unobserved trends or time-varying volatility) than models without latent states. The appendix includes a table with the burn-in counts and skip intervals used for each model.

4.2 Model assessment

Similar to Morley and Piger's (2010) approach to assessing models of the business cycle component of GDP, we construct a measure of trend inflation which addresses model uncertainty by averaging over different models of trend inflation. We assume equal prior weight for each model and use the following posterior model probability to average across different models:

$$w_i(\pi^{(T)}) = \frac{p(M_i \mid \pi^{(T)})}{\sum_{j=1}^{n} p(M_j \mid \pi^{(T)})}, \tag{10}$$

where $M_i$ denotes model $i$ and $\pi^{(T)}$ denotes the time series of inflation up to period $T$. To assess the congruency of each model with the data and compute the posterior model probabilities that determine the model weights, we follow Geweke and Amisano (2010) in using 1-step ahead predictive likelihoods.
Sources such as Geweke (1999) and Geweke and Whiteman (2006) emphasize the close relationship between the predictive likelihood and marginal likelihood: as stated in Geweke (1999, p.15), "... the marginal likelihood summarizes the out-of-sample prediction record... as expressed in... predictive likelihoods."

[5] Again, though, we use the Durbin and Koopman (2002) smoother rather than the Carter and Kohn (1994) smoother used by Cogley and Sargent (2005).
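The weighting in equation (10) can be sketched as follows, assuming equal prior model weights so that each model's posterior probability is proportional to the exponential of its log predictive likelihood. The function name and the log predictive likelihood values are hypothetical; subtracting the maximum before exponentiating is a numerical-stability choice of this sketch, not a step described in the paper.

```python
import math

def model_weights(log_pls):
    """Posterior model probabilities from log predictive likelihoods (equal priors).

    The maximum is subtracted before exponentiating to avoid underflow;
    the normalization leaves the weights unchanged.
    """
    top = max(log_pls.values())
    scaled = {name: math.exp(value - top) for name, value in log_pls.items()}
    total = sum(scaled.values())
    return {name: s / total for name, s in scaled.items()}

# Hypothetical log predictive likelihoods, for illustration only.
example = {"model_a": -155.0, "model_b": -156.0, "model_c": -168.0}
weights = model_weights(example)
```

Because the log values are exponentiated, a gap of one log point already implies odds of about 2.7 to 1, and a gap of a dozen points leaves the weaker model with essentially zero weight.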

Following Geweke and Amisano (2010), we use the log predictive likelihood defined as

$$\log PL(M_i) = \sum_{t=s}^{S} \log p(\pi_t^o \mid \pi^{(t-1)}, M_i), \tag{11}$$

where $\pi_t^o$ denotes the observed outcome for inflation in period $t$ and $\pi^{(t-1)}$ denotes the history of inflation up to period $t-1$. Following studies such as Bauwens, et al. (2011), we compute $p(\pi_t^o \mid \pi^{(t-1)}, M_i)$ from the simulated predictive density, using a kernel smoother to estimate the empirical density (from draws of forecasts). Finally, in computing the log predictive likelihood, we sum the log values over different samples, detailed below.

In averaging trends and forecasts from their posterior densities, we follow the mixture of distributions approach described in Bjornland, et al. (2010). Specifically, from the posterior sample of 5000 draws, we sample 5000 draws with replacement, taking the draw from model $i$'s density with probability equal to that model's posterior weight $w_i$. We then form the statistics of interest (median trend, etc.) from this mixture distribution.

4.3 Forecasting

To assess the role of the trend model in medium-term forecasting, we consider the accuracy of forecasts of horizons from 1 to 16 quarters. The role of trend inflation in the forecast will increase with the forecast horizon, at a rate that depends on the persistence of inflation relative to trend. We assess the accuracy of point forecasts using mean errors and root mean square errors (RMSEs) and the accuracy of density forecasts using average log predictive scores (computed using the density estimated from forecast draws). The predictive scores provide the broadest possible measure of the calibration of the density forecasts.

We evaluate forecasts over two samples: 1975:Q1-2009:Q4 and 1985:Q1-2009:Q4. In forming the first forecast, for 1975:Q1, we estimate each model using data from 1960:Q1 through 1974:Q4, and form forecasts from horizons 1 through 16.
We then proceed to move forward a quarter, estimate each model using 1960:Q1-1975:Q1 data, and form forecasts for horizons 1 through 16. We continue with this recursive approach to forecasting through the rest of the sample.

We also consider another specific aspect of the density forecast that may be particularly dependent on the specification of the trend in inflation: the probability of deflation, defined as inflation (over 4 quarters) next year of less than 0. More specifically, our deflation probability is defined as the probability of an average inflation rate less than 0 for periods $t+4$ through $t+7$, when forecasting starting in period $t$ with models estimated using data through period $t-1$.
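Given a set of simulated forecast paths, the deflation probability defined above can be approximated as the share of paths whose average inflation over the relevant quarters is negative. The sketch below assumes that, with data through period t-1, period t is the 1-step-ahead forecast, so periods t+4 through t+7 correspond to horizons 5 through 8; that mapping, the function name, and the toy draws are assumptions of this example.

```python
def deflation_probability(forecast_draws, first_h=5, last_h=8):
    """Share of draws with average inflation below 0 over horizons first_h..last_h.

    forecast_draws[d][h - 1] is draw d of the h-step-ahead forecast of
    annualized quarterly inflation.
    """
    count = 0
    for path in forecast_draws:
        window = path[first_h - 1:last_h]
        if sum(window) / len(window) < 0.0:
            count += 1
    return count / len(forecast_draws)

# Three hypothetical forecast paths of length 8.
draws = [
    [1.0] * 8,
    [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0],
    [1.0, 1.0, 1.0, 1.0, -3.0, 1.0, 1.0, 1.5],
]
probability = deflation_probability(draws)
```

Averaging four annualized quarterly log changes gives the four-quarter inflation rate, so a single negative quarter (as in the third path) does not by itself count as deflation.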

For each model, for each (retained) draw in the MCMC chain, we draw forecasts from the posterior distribution using an approach like that of Cogley, Morozov, and Sargent (2005). For models with time-varying coefficients (TVP), trends (local level, random walk), or stochastic volatility, we simulate the latent variables over the forecast horizon, using their random walk structure. For example, to incorporate uncertainty associated with time variation in $\lambda_t$ over the forecast horizon, we sample innovations to $\log \lambda_{t+h}$ from a normal distribution with variance $\phi$, and use the random walk specification to compute $\log \lambda_{t+h}$ from $\log \lambda_{t+h-1}$. For each period of the forecast horizon, we then sample shocks $v_{t+h}$ to the equation for inflation with a variance of $\lambda_{t+h}$ and compute the forecast draw of $\pi_{t+h}$ from the AR structure and drawn shocks. The appendix provides more detail on our simulation of predictive densities, for the models with stochastic volatility.

For all of the models with latent variables, the forecast distributions computed with this approach account for uncertainty about the latent variables, i.e., in many of the model specifications, the evolution of trend inflation. In the case of the model using the PTR measure of trend, for simplicity we fix the trend over the forecast horizon of $t+1$ through $t+H$ at the value of PTR in period $t$. As a result, the predictive density from the PTR trend model abstracts from uncertainty about the evolution of trend over the forecast horizon. However, based on comparisons conducted in the research for Clark (2011), this simplification has little effect on the results.

5 Results

This section proceeds by first presenting model parameter estimates for the full sample of data and then reporting model fit for the full sample. The subsequent sections present the trend estimates and forecast results (with forecasting results for core PCE inflation and GDP inflation presented in separate subsections).
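As an illustration of the forecast simulation described in Section 4.3, the sketch below generates one forecast path for an AR(2) in deviations from a trend that is held fixed over the horizon (as for the PTR trend model), with the log variance following a random walk. In the procedure described above, one such path would be drawn for each retained posterior draw using that draw's parameter values; here the function name and every parameter value are hypothetical.

```python
import math
import random

def simulate_forecast_path(dev_lags, trend, b, log_lam, phi, horizon, rng):
    """Simulate one path of inflation forecasts under stochastic volatility.

    dev_lags: latest deviations from trend, most recent first (length 2 for AR(2)).
    b: AR coefficients on those lags; log_lam: last estimated log variance;
    phi: variance of innovations to the log variance.
    """
    lags = list(dev_lags)
    path = []
    for _ in range(horizon):
        log_lam += rng.gauss(0.0, math.sqrt(phi))        # random walk in log variance
        shock = rng.gauss(0.0, math.exp(0.5 * log_lam))  # shock with variance lambda
        dev = sum(bi * li for bi, li in zip(b, lags)) + shock
        path.append(trend + dev)                         # trend held fixed over the horizon
        lags = [dev] + lags[:-1]
    return path

# Hypothetical parameter values, for illustration only.
rng = random.Random(42)
paths = [
    simulate_forecast_path([0.4, 0.1], 2.0, (0.5, 0.3), math.log(0.5), 0.08, 16, rng)
    for _ in range(1000)
]
```

For the local level, random walk trend, and TVP models the trend or coefficients would also be simulated forward with their own random walk innovations rather than held fixed.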
5.1 Full sample parameter estimates

Tables 1 and 2 provide parameter estimates (posterior means and standard deviations) for full sample estimates of the constant volatility and stochastic volatility models, respectively. In the interest of brevity, we report results only for core PCE inflation; results for inflation in the GDP price index are qualitatively very similar.

Consistent with some prior studies (e.g., Kozicki and Tinsley 2002), the AR models that allow time variation in mean or trend inflation yield modestly lower sums of AR coefficients, reflecting reduced persistence of inflation relative to mean or trend. For example, with constant volatility, the sum of coefficients for the constant trend AR model is about 0.95, while the sum of coefficients for the PTR trend and random walk trend models is a little above 0.8. The same pattern is evident in the models with stochastic volatility.

Within the set of constant volatility models, the local level specification yields modestly larger shocks to trend ($\sigma_n^2 = 0.490$) than to the noise component ($\sigma_v^2 = 0.223$). Consistent with some prior studies (e.g., Stock and Watson 2007), the local level model attributes much of the movement in inflation to the trend component. By comparison, the variance of shocks to trend is considerably smaller (although still measurable) in the random walk trend model, which incorporates autoregressive dynamics.

Consistent with recent evidence from studies such as Cogley and Sargent (2005) and Clark (2011), all estimates of the models with stochastic volatility imply sizable variation over time in the variance of shocks to inflation. For example, in the PTR trend model, the posterior mean estimate of the variance of shocks to log volatility is 0.079. Figure 1's plot of the posterior median of the time series of volatility (specifically, Figure 1 plots the median and 70% credible set for $\lambda_t^{0.5}$) confirms the considerable movement of volatility over the sample, dominated by the Great Moderation and a rise in volatility near the end of the sample, associated with the severe recession. The estimates of the variance of shocks to log volatility (all shown in Table 2) and the time series of volatilities (in the interest of brevity, Figure 1 provides estimates for just the PTR model) are similar across all models with autoregressive dynamics.
Estimates differ somewhat for the local level model, which yields less variability in the size of shocks to the noise component of the model. Instead, the local level model yields more sizable variation in the size of shocks to trend.

5.2 Model fit

To assess overall model likelihood and fit, Table 3 reports log predictive likelihoods (specifically, sums of 1-step ahead log likelihoods for 1975-2009). With the core PCE measure of inflation, among models with constant volatilities, the PTR trend specification fits the data far better than most of the other models, with a log predictive likelihood of -167.958 for the 1975-2009 sample. Based on this measure of Bayesian fit, the other models fit the data much worse (recall that, for model comparison, the differences in log scores would be exponentiated, so a difference in logs of a few points is a very large difference in probability).

The fits of the constant trend, random walk trend, and TVP models are comparable, with likelihoods of about -173. The local level model, of particular interest in light of the Stock and Watson (2007) evidence in support of a version with time-varying volatility, ranks worst among constant-volatility models of core inflation.

However, allowing for stochastic volatility yields a somewhat different view of congruency with the data. For each specification of inflation trend and AR dynamics, the log predictive likelihood is considerably better with stochastic volatility than constant volatility. For example, with the PTR trend model of core PCE inflation, the version with stochastic volatility yields a log predictive likelihood of -154.759, compared to -167.958 for the version with constant volatility. In some cases, differences in likelihoods across models with stochastic volatility are smaller than differences in likelihoods across models with constant volatility. In addition, the model rankings change somewhat. With stochastic volatility, the best-fitting model is the local level specification. But the PTR trend model fits the data almost as well as the local level model. The other models do not fit nearly as well, with log predictive likelihoods of about -159. Translated into model probabilities as described in section 4, the full-sample evidence gives a 53 percent probability to the local level-SV model, a 45 percent probability to the PTR trend specification, probabilities of less than 1 percent to the other stochastic volatility specifications, and probabilities of essentially 0 to the constant volatility models.[6]

Results for the GDP price index are similar in some important respects. Among models with constant volatilities, the PTR trend and local level models yield, respectively, the best and worst predictive likelihoods. Models with stochastic volatility fit the data much better than the models with constant volatilities.
For example, the PTR trend specification yields a log predictive likelihood of -202.423, while the same model with stochastic volatility has a likelihood of -186.071. However, the rankings of the models that best fit the data differ significantly across the core PCE and GDP price index measures of inflation. With the GDP price index, the π_{t-1} trend model with stochastic volatility fits the data best. The PTR trend-sv and TVP-SV models are next best and quite similar in fit. Translated into model probabilities, the log predictive likelihoods give a 71 percent probability to the π_{t-1} trend-sv model, 11 percent probability to the local level-sv specification, 6 or 7 percent probabilities to the PTR trend-SV and TVP-SV models, smaller probabilities to the other models with stochastic volatility, and essentially 0 probabilities to the models with constant volatility.

[6] For two inflation models that include an unemployment gap, we report log predictive likelihoods but not model weights. We include these models to highlight (later in this section) the role of trend inflation in determining deflation probabilities. Because the overall fit of the models is of interest, we report their likelihoods. However, in light of our focus on models of trend inflation, we do not include these specifications in the model combinations considered below. Accordingly, we do not report model weights or probabilities for these specifications.

To further assess congruence of the models with data over time, we follow examples such as Geweke and Amisano (2010) in plotting the time series of cumulative log predictive likelihoods. To facilitate this graphical assessment, we take the local level model with constant volatility as the benchmark, and form, for every other model, the difference between its cumulative log predictive likelihood and the benchmark cumulative log predictive likelihood. The results for core PCE inflation provided in Figure 2 indicate that there have been important changes over time in the relative fit of the models. For much of the 1980s, these relative likelihoods were negative for every model, indicating the local level model best fit the data. The superiority of other models didn't emerge until the mid-1990s. From the mid-1990s onward, the ranking of the constant volatility models changed little, remaining near the full sample ranking (PTR trend best, etc.). Relative to constant volatility models, the stochastic volatility versions rapidly gain advantage starting in the mid-1990s. This pattern likely reflects the influences of the Great Moderation on volatility and the success of the stochastic volatility models in capturing those influences. Moving forward in time, the rankings of the stochastic volatility models shift around some. For example, based on cumulative likelihoods from the mid-1990s through 2006, the PTR trend-sv model fits the data better than the local level-sv model, but for the remainder of the sample, the local level model fits the data slightly better.
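The benchmark-relative series plotted in Figure 2 can be sketched as a running sum of differences in 1-step-ahead log predictive likelihoods. The function name `relative_cumulative_log_pl` and the numbers below are hypothetical and purely illustrative; they are not taken from the paper's tables.

```python
from itertools import accumulate

def relative_cumulative_log_pl(model_log_pl, benchmark_log_pl):
    """Cumulative log predictive likelihood of a model minus that of the benchmark."""
    if len(model_log_pl) != len(benchmark_log_pl):
        raise ValueError("both series must cover the same periods")
    diffs = [m - b for m, b in zip(model_log_pl, benchmark_log_pl)]
    # Running sum of the per-period differences.
    return list(accumulate(diffs))

# Hypothetical per-period log predictive likelihoods (illustrative only).
benchmark = [-0.50, -0.40, -0.90, -1.20]  # stands in for local level, constant volatility
candidate = [-0.70, -0.60, -0.50, -0.60]  # stands in for an alternative model

path = relative_cumulative_log_pl(candidate, benchmark)
print(path)
```

A negative value at a given date means the benchmark has fit better up to that date; a path that turns positive later corresponds to the kind of change in relative fit described above.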
Qualitatively, the results for GDP inflation provided in Figure 3 are similar in important respects. Most notably, relative to constant volatility models, the stochastic volatility versions rapidly gain advantage starting in the mid-1990s. Over time, the rankings of the stochastic volatility models shift around. For much of the 1990s, the fits of most of the models with stochastic volatility were quite similar. Over the subsequent decade, some larger differences emerged, with the π_{t-1} trend specification rising to the top.

In light of these changes over time in the rankings of models, from a forecasting perspective it may be desirable to use model weights computed at each point in time from a rolling window of predictive likelihoods. Studies such as Jore, Mitchell, and Vahey (2010) and Kascha and Ravazzolo (2010) use rolling windows of scores to compute model weights for forecasting. Accordingly, in this paper, in combining forecasts, we will consider model averages based on 10-year rolling windows of predictive likelihoods (these are the weights available for pseudo-real time forecasting). [7]

Figure 4 presents these weights for the models of core PCE inflation (we omit the corresponding figure for GDP inflation in the interest of brevity). Consistent with the cumulative log likelihoods described above, in the mid-1980s, the local level model (with constant volatility) receives the most weight, roughly 70 percent for a short period of time. From the mid-1990s until about 2009, the constant volatility models receive essentially no weight. But in the last few observations of the sample, the PTR trend and TVP models receive some weight. From the late 1980s through about 2003, all of the stochastic volatility models receive some modest weight, with rankings that move around over time, except that the PTR trend specification receives the most weight for much of this period. Starting in about 2005, the weight given to the local level-sv model rises sharply, to nearly 100 percent, before trailing off in the last year or so of the forecast sample. Section 6 will examine how using these rolling sample-determined weights affects forecast accuracy.

These results on model fit offer a somewhat different and more complicated picture than that of Stock and Watson (2007), which suggested a local level model with stochastic volatility as the best model of inflation. Some, although not all, of the differences may be attributable to the broader set of trend specifications considered in this paper. Other differences could be due to this paper's use of Bayesian, rather than frequentist, concepts of model fit. In the case of core PCE inflation, while the local level-sv model best fits the full sample, the PTR trend-sv model fits the data almost as well.
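The rolling-window weights shown in Figure 4 can be sketched as follows. The sketch assumes equal prior model probabilities and weights proportional to the exponentiated sum of log predictive likelihoods over the trailing window; with quarterly data (also an assumption here), a 10-year window would be 40 observations. The function name `rolling_weights` and the input values are hypothetical.

```python
import math

def rolling_weights(log_pl_by_model, window):
    """Model weights from sums of log predictive likelihoods over a trailing window.

    Returns one dict of weights per window end date. Each set of weights uses
    only scores inside its window, so it would be available to a forecaster
    standing at the end of that window.
    """
    names = list(log_pl_by_model)
    n_obs = len(log_pl_by_model[names[0]])
    weights = []
    for end in range(window, n_obs + 1):
        sums = {m: sum(log_pl_by_model[m][end - window:end]) for m in names}
        top = max(sums.values())
        # Subtract the maximum before exponentiating for numerical stability.
        unnorm = {m: math.exp(s - top) for m, s in sums.items()}
        total = sum(unnorm.values())
        weights.append({m: u / total for m, u in unnorm.items()})
    return weights

# Hypothetical scores: model B fits better early on, model A later (illustrative only).
scores = {
    "A": [-1.0, -1.0, -1.0, -0.5, -0.5, -0.5],
    "B": [-0.5, -0.5, -0.5, -1.0, -1.0, -1.0],
}
weights = rolling_weights(scores, window=3)
print(weights[0], weights[-1])
```

Because each window drops older scores, the weights can move sharply when relative fit changes, which is the behavior described for Figure 4.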
For GDP inflation, the π_{t-1} trend model with stochastic volatility fits the data better than the local level-sv model. Perhaps even more importantly, the rankings of models have changed quite a bit over time. Accordingly, it is difficult to say that a single model of trend inflation best fits the data.

[7] Using 5-year rolling windows of predictive likelihoods yielded similar results.

5.3 Trend estimates

Figures 5 and 6 present, for core PCE inflation, estimates of trend inflation obtained from each of the models considered (specifically, posterior medians along with 70 percent credible sets). In the interest of chart readability, most of the chart panels provide just a single trend series (median estimate) along with its credible set. The upper left panel, for the constant trend specification, reports the trend estimate from that model along with actual inflation and the PTR trend, the survey-based measure of long-run inflation expectations. The PTR trend is repeated in the upper right panel, which also provides the trend defined as last period's inflation rate.

Those models that estimate a time-varying trend (local level, random walk trend, and TVP) imply considerable variation over time in trend inflation, with non-trivial differences across models. The local level model yields a trend that is considerably more variable than the PTR measure of trend. Indeed, the local level model attributes much of the movement in actual inflation to trend shifts. However, consistent with estimates such as Stock and Watson (2007), allowing time-variation in volatilities results in a local level trend that, since the mid-1980s, is considerably smoother than the trend estimated from the local level model with constant volatility.

Compared to the local level model, the random walk trend specification yields a broadly similar trend. At least in the model with constant volatility, the trend from the random walk trend model is smoother than the estimate from the local level model. Broadly, the random walk trend estimate is comparable to PTR, but with some considerable divergences over time. For example, in the late 1960s, the random walk trend estimate rose faster than PTR did and then remained at a higher level than PTR. But in the late 1970s, the PTR measure rose to a higher level than the random walk trend estimate, and remained at a higher level until about 1990. In the past few years of the sample, the random walk trend edged up (more so in the model with stochastic volatility than constant volatility), while PTR remained steady. The TVP models yield trend time series that are smoother than the other econometric estimates and PTR, but with similar contours.
Of course, the estimates of trend from these models are considerably different from the trends assumed for the π_{t-1} trend model and estimated in the constant trend specification.

Figure 7 presents the results of accounting for model uncertainty by averaging core PCE trend estimates across models, based on the full-sample predictive likelihoods (i.e., on an ex post basis). As noted above, we obtain the density of model-average forecasts by sampling from the individual model densities, with a mixture approach based on the model weights. Given the large weights the PTR trend-sv and local level-sv models receive for the full sample, averaging all models based on the full-sample likelihoods yields a model average trend estimate that looks like an equally weighted average of the trends from these two specifications. Consequently, prior to the stabilization of inflation, trend inflation is less variable than actual inflation but more variable than the PTR trend.

But based on the changes in model fit over time described above, some might prefer to give more models some weight. For instance, in the forecasting literature (point and density forecasts), equal model weights often perform as well as or better than score-based weights (e.g., Kascha and Ravazzolo (2010) and Mazzi, Mitchell, and Montana (2010)). Accordingly, the lower panel of Figure 7 presents a trend estimate obtained by applying equal weights to all of the models with stochastic volatility. This approach yields a somewhat different, more variable, trend. For example, while the likelihood-based average shows trend inflation to have been largely constant in recent years, the simple average shows some upward drift in trend late in the sample, as well as a very recent dip.

In the interest of brevity, Figures 8 and 9 present a reduced set of results for GDP inflation. Qualitatively, the estimates of trends from individual models with stochastic volatility in Figure 8 are similar to those for core PCE inflation (Figure 6). Those models that estimate a time-varying trend (local level, random walk trend, and TVP) imply considerable variation over time in trend inflation, with non-trivial differences across models. The local level model yields a trend that is considerably more variable than the PTR measure of trend. Moreover, the local level-sv estimate of trend in GDP inflation is considerably more variable than the local level-sv estimate of trend in core PCE inflation. Again, the TVP models yield trend time series that are smoother than the other econometric estimates and PTR, but with similar contours. The model-average estimates of trend GDP inflation in Figure 9 are more variable than the average estimates for core PCE inflation (Figure 7).
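The mixture approach to model averaging mentioned above can be sketched as a two-step draw: first pick a model with probability equal to its weight, then take one of that model's posterior draws. The function name `mixture_draws`, the simulated inputs, and the equal weights below are all hypothetical and illustrative; the paper's actual sampler may differ in detail.

```python
import random

def mixture_draws(draws_by_model, weights, n_draws, seed=0):
    """Sample from a model-average density given per-model draws and model weights."""
    rng = random.Random(seed)
    names = list(draws_by_model)
    probs = [weights[m] for m in names]
    combined = []
    for _ in range(n_draws):
        # Step 1: choose a model with probability equal to its weight.
        model = rng.choices(names, weights=probs, k=1)[0]
        # Step 2: take one draw from that model's simulated density.
        combined.append(rng.choice(draws_by_model[model]))
    return combined

# Hypothetical posterior draws of trend inflation from two models (illustrative only).
sim = random.Random(1)
draws = {
    "local level-SV": [sim.gauss(2.0, 0.5) for _ in range(1000)],
    "PTR trend-SV": [sim.gauss(1.8, 0.3) for _ in range(1000)],
}
weights = {"local level-SV": 0.5, "PTR trend-SV": 0.5}

combined = mixture_draws(draws, weights, n_draws=5000)
median = sorted(combined)[len(combined) // 2]
print(median)
```

Medians and credible sets of the model-average estimate can then be read off the combined draws in the same way as for an individual model.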
With averaging based on predictive likelihoods, the GDP inflation estimates put most weight on the π_{t-1} trend model, which of course yields a highly variable estimate of trend. Applying equal weights to all models with stochastic volatility smooths out the trend estimate, but leaves it more variable than in the case of the core PCE inflation estimates. For example, in the last decade of the sample, the equally-weighted average trend moved significantly higher and then more than reversed the gain.

5.4 Forecast results: core PCE inflation

Tables 4 and 5 present results for point forecasts of core PCE inflation (point forecasts are defined as posterior means), in the form of mean errors (Table 4) and RMSEs (Table 5). To facilitate comparisons, the RMSEs are reported as ratios of RMSEs for each model relative to the RMSE of the local level-sv model; for the local level-sv model, the table provides the levels of the RMSEs. The tables include results for each individual model and for three averages: a PL-weighted average that weights the model forecasts by rolling 10-year predictive likelihoods, a simple average of all forecasts, and a simple average of just the models with SV. As noted above, the forecasts are combined by sampling from the appropriate mixtures of distributions. The use of a 10-year window means that the predictive likelihood-weighted forecasts only begin in 1985; for the longer sample, the tables report values of NA. A forecast obtained from selecting the model with the highest predictive likelihood for each 10-year window didn't perform any better than those shown.

Consistent with results in such studies as Clark (2011), the mean forecast errors in Table 4 show that every model consistently overstated inflation over the 1975-2009 and 1985-2009 samples. The bias increases with the forecast horizon. For most of the models, the bias in point forecasts is similar with constant volatility and stochastic volatility. The exception is the constant trend model, for which the longer horizon bias is considerably smaller with the stochastic volatility specification than with constant volatility.

The RMSE results accord with Stock and Watson (2007): in data since 1985, a local level model with stochastic volatility is difficult to beat. For the 1985-2009 sample, the local level-sv model is more accurate than every other model at every horizon (evidenced by RMSE ratios that are above 1), with the exception of the PTR-SV model in a few instances. In comparison, the constant trend model stands out as least accurate in the 1985-2009 sample. That said, the differences in accuracy across all models are small to modest, peaking at about 15 percent.
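The two point-forecast metrics reported in Tables 4 and 5 can be sketched as below. The sign convention for the forecast error (actual minus forecast, so that overstated inflation gives a negative mean error) is an assumption of this sketch, and the function names and data are hypothetical.

```python
import math

def mean_error(actual, forecast):
    """Average of actual minus forecast; negative means forecasts were too high."""
    return sum(a - f for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root mean squared forecast error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

# Hypothetical inflation outcomes and forecasts (illustrative only).
actual = [2.0, 1.5, 1.8, 2.2]
benchmark_forecast = [2.1, 1.7, 1.9, 2.3]  # stands in for the local level-sv model
other_forecast = [2.3, 1.9, 2.0, 2.6]      # stands in for a competing model

# A ratio above 1 means the competing model is less accurate than the benchmark.
rmse_ratio = rmse(actual, other_forecast) / rmse(actual, benchmark_forecast)
bias = mean_error(actual, benchmark_forecast)
print(rmse_ratio, bias)
```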
For the sample starting in 1975, the advantages of the local level-sv model are somewhat smaller, and sometimes several other models beat it, in terms of RMSE accuracy. For this longer sample, the PTR trend specification seems to come closest to matching the forecasting performance of the local level-sv model, yielding RMSE ratios that are most often below or very close to 1.

In these point forecasting results, there appears to be at most a small gain to combining forecasts. For the full 1975-2009 sample, the simple averaging approaches consistently improve, albeit only slightly, on the local level-sv model. More generally, across all horizons, the average forecasts are relatively accurate. But for the 1985-2009 sample, the averages are slightly less accurate than the local level-sv model, another finding in line with the Stock and Watson (2007) evidence on the difficulty of beating the model in data since the mid-1980s. However, for the most part, the differences in point forecast accuracy across models or methods are quite small from a practical perspective. Based on RMSEs, there seems to be little to distinguish the alternative models of trend inflation.

For the purpose of assessing forecast performance in the broadest way possible, Table 6 reports average log predictive scores, that is, averages of log predictive likelihoods computed from simulated predictive densities. Within the set of models with constant volatilities, the PTR trend specification scores best (consistent with the 1975-2009 log predictive likelihoods described above), at all horizons. Consistent with the findings of Clark (2011), in most cases the models with stochastic volatility score better than the corresponding models with constant volatility. At all horizons, the local level-sv model scores best. However, the differences among a number of the models are small to modest. As in the case of the RMSE results, by the log score metric, the average forecasts offer no advantage over the good individual models. So while these results show that models with stochastic volatility are superior to models with constant volatility, from a practical perspective, based on average log scores, there again seems to be little to distinguish the alternative models of trend inflation.

Bigger differences in model forecasts emerge with deflation probabilities (as indicated above, deflation is defined as annual average inflation less than 0), reported in Figures 8 and 9. The models with constant volatilities (top panel of Figure 7) consistently yield higher probabilities of deflation than the corresponding models with stochastic volatilities (lower panel). The probability of deflation is higher in models with unit root trends (π_{t-1} trend and local level) than in the other models.
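Given simulated forecast paths from a model's predictive density, a deflation probability under the definition above (annual average inflation less than 0) can be sketched as the share of paths whose average over the year is negative. The function name `deflation_probability`, the four-quarter path layout, and the numbers are hypothetical and illustrative.

```python
def deflation_probability(paths):
    """Share of simulated paths whose average inflation over the year is below zero."""
    below = sum(1 for path in paths if sum(path) / len(path) < 0.0)
    return below / len(paths)

# Hypothetical simulated paths of quarterly inflation over one year (illustrative only).
paths = [
    [1.0, 0.5, 0.2, 0.1],
    [-0.5, -0.2, 0.1, -0.3],
    [0.3, -0.1, -0.4, -0.2],
    [2.0, 1.5, 1.0, 0.8],
]
prob = deflation_probability(paths)
print(prob)
```

Because the probability depends on the whole lower tail of the simulated density, models with similar point forecasts can still imply quite different deflation risks.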
Modeling trend inflation with PTR generally yields lower probabilities of deflation than most of the other models. These results highlight the importance of careful modeling of trend inflation in assessing the risks of deflation associated with periods of low inflation, such as 2003 and 2009.

The role of the trend becomes even more important when the inflation forecasting model includes the unemployment gap. Figure 9 presents deflation probabilities for the π_{t-1} trend and PTR trend models, in their AR forms and in the bivariate forms that also include the unemployment gap (the predictive likelihoods in Table 3 show that, in most cases, the addition of the unemployment gap causes a modest reduction in model fit). Not surprisingly, in the past two recessions,