
Ghent University
Faculty of Economics and Business Administration
Academic year 2015-2016

Return predictability
Can you outperform the historical average?

Gilles Bekaert & Thibaut Van Weehaeghe
Under the supervision of Prof. Dries Heyman

Abstract

In this paper, we test the robustness of a generalized predictive regression framework and of combination methods. Testing the forecasting accuracy of different variables and combination methods over several forecasting horizons, we find that most valuation and macroeconomic variables fail to outperform the historical average. However, combining these individual predictors using a broad range of combination methods yields better results than the historical average. Although principal component based combination methods show promising results, we argue that a simple mean combination method is more robust across specifications and regions. We introduce some new variables that show good out-of-sample forecasting accuracy on a 10-year horizon.

Non-Technical Summary

The academic literature has identified several variables with predictive ability for future market returns. An early leader has been the dividend-price ratio, which appeared to give a forecaster potential market-timing ability. Other popular valuation ratios with presumed predictive ability are the book-to-market ratio and the earnings-price ratio. Unfortunately, during the stock market's rally in the nineties, the traditional valuation ratios gave a bearish timing signal and their predictive ability broke down. This highlights the difficulty facing any variable in a forecasting exercise: although many have value at times, they also fail to perform at others.

A potential way to deal with these instabilities is to combine the information from all individual signals. We examine the results of doing this with a host of combination methods over multiple forecast horizons. We evaluate whether this approach can outperform a simple historical average as our best estimate of future returns, since it has been shown that many individual predictors fail to do so.

Our thesis shows that, indeed, a simple combination of individual variables significantly outperforms the historical average over a long sample in the US. This result seems to hold for multiple forecast horizons. For a more recent period in both the US and Europe, the results are slightly worse: although the combination methods perform better than many individual variables, their outperformance of the historical average is not significant.

Finally, we evaluate the performance of some new predictors with potential longer-term forecasting ability. The good performance of these variables suggests that it might be worthwhile for practitioners to explore variables other than the traditional ones identified in the academic literature. Our tool provides a good basis for exploring the potential market-timing ability of identified signals.

Table of Contents

1 Introduction
2 Literature overview
3 Methodology
  3.1 Forecasting regression framework
  3.2 Forecast combination framework
  3.3 Forecast evaluation framework
4 Empirical results
  4.1 One-month ahead forecast horizon
  4.2 One-quarter forecast horizon
  4.3 One-year forecast horizon
  4.4 Ten-year forecast horizon
5 Robustness and Sensitivity analysis
  5.1 Long-term horizon robustness
  5.2 Analysis of PC combinations
6 Conclusion
7 References

List of Figures

Figure 1: Rolling 10-year average (log) excess returns
Figure 2: One-month ahead mean combination forecast
Figure 3: One-month ahead forecast variance and MSFE scatterplot
Figure 4: One-quarter ahead forecasts dcmsfe
Figure 5: One-quarter ahead forecasts
Figure 6: One-quarter ahead mean combination forecast
Figure 7: One-quarter ahead forecasts (extra variables)
Figure 8: One-quarter ahead dcmsfe (extra variables)
Figure 9: One-year realized excess return (log)
Figure 10: One-year ahead forecasts
Figure 11: One-year ahead dcmsfes
Figure 12: One-year ahead combination weights
Figure 13: One-year ahead mean combination forecasts
Figure 14: Ten-year realized gross return (log)
Figure 15: Ten-year ahead forecasts
Figure 16: Ten-year ahead mean combination forecast
Figure 17: Ten-year ahead MaxROS forecast
Figure 18: Ten-year ahead combination weights
Figure 19: Long-term horizon R²OS sensitivity
Figure 20: Long-term horizon ΔU sensitivity
Figure 21: Medium-term horizon R²OS sensitivity
Figure 22: Long-term horizon ΔU sensitivity
Figure 23: R²OS sensitivity of remaining variables
Figure 24: ΔU sensitivity of remaining variables
Figure 25: Combination methods R²OS sensitivity
Figure 26: Combination methods ΔU sensitivity

List of Tables

Table 1: One-month ahead forecast
Table 2: Robustness of one-month ahead forecasts
Table 3: One-month ahead forecast for other regions
Table 4: One-quarter ahead forecast
Table 5: One-quarter ahead forecasts with extra variables
Table 6: One-year ahead forecasts
Table 7: Ten-year ahead forecasts
Table 8: Robustness of ten-year ahead forecasts
Table 9: Robustness of PC combination (one-month)
Table 10: Robustness of PC combination (one-quarter)
Table 11: Robustness of PC combination (one-quarter, extra variables)
Table 12: Robustness of PC combination (one-year)

List of Abbreviations

AIEA: Average Investor Equity Allocation
BM: Book-to-market ratio
CAY: Consumption-to-wealth ratio
DDM: Dividend Discount Model
DE: Dividend-payout ratio
DFM: Dynamic factor model
DFR: Default return spread
DFY: Default yield spread
dgdp: Change in GDP
dgdp10: 10-year trailing dgdp
dip: Change in Industrial Production
DMSFE: Discounted mean squared forecast error
DP: Dividend-price ratio
dvol: Change in volume
DY: Dividend yield
E10P: Trailing ten-year earnings-to-price
EP: Earnings-price ratio
FED: FED-variable
HA: Historical average
IK: Investment-to-capital ratio
INFL: Inflation
MA: Moving average
MCGVA: Market-cap to Gross Value Added
MSFE: Mean squared forecast error
NTIS: Net equity expansion
OBV: On-balance volume
OLS: Ordinary least squares
OOS: Out-of-sample
PNW: Corporate profits after tax of nonfinancial corporate business to net worth of nonfinancial corporate business
PC(A): Principal component (analysis)
PMI: Purchasing Managers Index
PRES: Price pressure
SOP: Sum-of-the-parts method
SVAR: Stock variance
TBILL: Treasury bill
TBOND: Long-term yield on Treasury Bond
TBONDR: Long-term return on Treasury Bond
TERM: Term spread
TFAHDI: Total Financial Assets to Households Disposable Income
U: Unemployment
VOL: Volume
YGAP: Output gap

1 Introduction

Ever since the idea of constant expected returns was challenged, researchers and practitioners have attempted to predict the time-varying nature of equity premia. For a long time, this appeared to be a quest for the holy grail: Timmermann (2008) calls return predictability elusive. Although a host of predictive relations were documented, many of them relied on in-sample fit. Goyal & Welch (2008) show that many relationships disappear when tested in an out-of-sample context that simulates the environment of a real-time forecaster. Doing better than the historical average remains a challenging task. [1]

[1] Time series regressions are not the only models for predicting future returns. Duarte & Rosa (2015) review several types of forecasting model, starting from the historical average, a dividend discount model (DDM), cross-sectional regressions, time-series regressions and surveys. We focus on time-series regressions and evaluate their performance against a historical average benchmark.

The difficulty any predictive relationship has in outperforming the historical average over a lengthy sample period stems from the data generating process of the equity premium, which, as Rapach, Strauss & Zhou (2010) argue, is highly complex and constantly evolving. Timmermann (2008) states that investors' searches for successful forecasting models actually cause the data generating process of financial returns to change over time.

Rapach et al. (2010) show that forecast combination appears to be a successful method for out-of-sample equity premium prediction. While individual predictive regression forecasts exhibit too much volatility to plausibly reflect changes in equity premia, the historical average is too smooth and ignores information in predictive variables that potentially capture the time-varying nature of expected returns. Thus, forecast combination achieves a middle ground between excessively noisy forecasts and the largely uninformative historical average.

Expected returns are a valuable input for many decision makers. A mean-variance investor requires an expected return, which is often the historical average. Longer-term expected returns are often requested by corporates facing investment decisions or by pension funds. A model that tractably incorporates information from a host of variables and outperforms the historical average can be a valuable tool. We develop a general tool for out-of-sample equity premium prediction using combination methods. While much of the literature on forecast combination has focused on one-period ahead forecasts, our general model looks at the results of the combination methods for short-, medium- and longer-term horizons.

Our results show that many individual valuation and macroeconomic variables fail to outperform the historical average on a consistent basis for the one-month ahead horizon, corroborating the findings of Neely, Rapach, Tu & Zhou (2014). For the same forecast horizon, we find that technical indicators and combination methods outperform the historical average. While the principal component (PC) combination method, used in the original paper, achieves outsized gains both in terms of statistical forecast accuracy and in terms of utility gains for a mean-variance investor, we show that simpler combination methods also add value. Moreover, we show that the PC combination method is very sensitive to the number of principal components selected. Explaining a sizeable portion of the variance of a large-dimensional space of predictor variables would require a relatively high number of principal components, thereby increasing the forecast variance substantially and possibly eroding the stabilization gains from combination. Finally, we show that the simple combination methods deliver better results than individual signals, relative to the historical average, but we cannot claim significance.

When looking at the results for the one-quarter ahead forecasts, we find that the simple combination methods achieve significantly lower mean squared forecast errors (MSFE) relative to the historical average. Similar results are found when forecasting one year ahead. Finally, we turn our attention to forecasts with a ten-year horizon. We evaluate the performance of a weighting method that maximizes the out-of-sample R² and minimizes the covariance of the in-sample error terms for forecasting 10-year returns.
Furthermore, we identify some new variables that perform extremely well both in- and out-of-sample. We obtain sizeable MSFE reductions relative to the historical average using both simple combination methods and the aforementioned weighting method. Given the worse performance of traditional variables in recent history, it is worthwhile to explore other potential predictors that might correlate better with time-varying risk premia or exploit market inefficiencies better.

Guarding against data mining is important for academics and practitioners. Besides using test statistics that correct for data mining, Ilmanen (2011) highlights potential ways to mitigate overfitting, namely requiring a certain economic logic, conducting cross-validation on other samples, running extensive robustness checks, keeping the model specification simple, or dynamically selecting indicators from a large pool of candidates. We refrain from using parameter restrictions, evaluate robustness at length and test models across multiple regions.

Finally, we test the performance of a simple average of individual predictions from a certain number of principal components of the underlying data. If the underlying space of variables is large, our PC combination method forecasts exhibit very high variance. The simple average of univariate regressions on the selected principal components could be a good alternative. Our results on quarterly data show its benefit relative to the multivariate PC combination method. It would be interesting to explore this method further on higher-frequency data and a larger set of underlying predictor variables.

The remainder of our thesis is organized as follows. In Section 2, we briefly outline the history of the equity risk premium and discuss the literature closest in spirit to our work. In Section 3, we outline the methodology behind our model, and we present the results for various forecast horizons in Section 4. In Section 5, we provide some additional robustness tests, and we conclude in Section 6.

2 Literature overview

To provide some perspective, we first show evidence on the historically realized excess return (ex post equity premium) in the US, the EU, Germany and the UK in the past decades. In Figure 1 (see below), we plot average annual 10-year rolling realized excess returns (log) over a risk-free rate (T-bill). Most of the time, decade-long realized excess returns of equities have been positive, but there have been periods in which this was not the case. The first decade of the new millennium has certainly been a poor one for equities. For the US, 3.4% of all months can be characterized as months with a negative realized 10-year rolling excess return of stocks over cash. Nevertheless, on average, excess returns of stocks over the risk-free rate have been positive. For the 1950-2014 period, the compound average excess return of the S&P500 has been 5.5%.

The literature has had a hard time explaining the sizeable outperformance of equities over a risk-free alternative. Mehra & Prescott (1985) argued that the historical US realized excess return of equities is greater than can be rationalized in the context of the standard neoclassical paradigm of financial economics. There is insufficient volatility in consumption growth to explain the magnitude of the equity premium unless a very high risk aversion coefficient is assumed. Ever since, the profession has attempted to provide possible explanations for this equity premium puzzle.

Ilmanen (2011) surveys the literature and summarizes some possible explanations for this puzzle: rare disaster risk, structural uncertainty, long-run risk and behavioural explanations. Rare disaster risk posits that, if investors assigned a higher probability to rare catastrophic events, or "black swan" events, than actually materialized ex post, this can explain the high observed equity risk premium.

A second possible explanation is that investors do not fully know the structure of the economic system, in contrast to the models and assumptions of neoclassical finance. It follows naturally that an investor requires a higher premium to compensate for this uncertainty.

Figure 1: Rolling 10-year average (log) excess returns

A third explanation, long-run risk, posits that investors are not so much concerned with short-run volatility in consumption as with long-run growth rates, which are inherently more difficult to estimate.

Finally, we can turn to behavioural explanations for the high observed equity premium: myopia and the house money effect. Both explanations need to be combined with loss aversion in the spirit of Kahneman & Tversky (1979) and their prospect theory. Traditional theory states that investors calculate their expected utility as the probability-weighted sum of all utility outcomes. Prospect theory, however, adds risk aversion, and investors choose between several investment opportunities given their level of risk aversion (Ilmanen, 2011). The myopic loss aversion model assumes that the longer the investment horizon of an investor is, the more an expected return advantage will attract him. If this investor evaluates his portfolio on a regular basis, the perceived advantage of a risky asset over a riskless one within each short evaluation period will be close to zero. Therefore, introducing loss aversion results in higher risk premia. The house money effect model builds on the previous model with a time-varying loss aversion that varies with the prior gains and losses of the investor (Mehra & Prescott, 1985).

Much like the literature on ex post realized excess returns, the literature on ex ante equity premium forecasting has changed dramatically over the last century. Traditional finance theories often assumed the equity premium to be constant over time. In this case, the best estimate of future excess returns is simply a long-run historical average excess return. Through experience and empirical evidence, the idea of time-varying expected returns gradually replaced the old paradigm. There are two competing explanations for this time variation: rational time-varying risk premia and irrational mispricing.

Fama & French (1989) show that the expected excess returns of stocks and corporate bonds move together. The variables that measure default and term premiums in bond returns also predict the variation in stock returns. Therefore, risk premia of stocks are time-varying. The term spread is typically low around business-cycle peaks and higher near troughs. The slopes for the default spread and dividend yield increase from high-grade bonds to low-grade bonds and from bonds to stocks. Therefore, the variation of this premium is higher for low-grade bonds and stocks, as these higher slopes indicate a higher sensitivity to unexpected changes in business conditions. Shiller (1981) provides evidence against the efficient market hypothesis, as he finds that prices are too volatile. This volatility can only be explained by very volatile ex ante real interest rates or by markets simply being irrational.

Various authors have suggested a large number of variables with presumed predictive ability for equity returns. Dow (1920) and Fama & French (1989) propose the dividend-price ratio. Campbell & Shiller (1988, 1998) propose the earnings-price ratio. Kothari & Shanken (1997) and Pontiff & Schall (1998) suggest book-to-market ratios. Next to valuation ratios, nominal interest rates were suggested by Fama & Schwert (1977), Campbell (1987), Breen, Glosten & Jagannathan (1989) and Ang & Bekaert (2007). Additionally, the inflation rate, term and default spreads, corporate issuing activity, the consumption-wealth ratio and stock market volatility have all been proposed as variables with predictive ability for returns. [2] However, most of these studies focus on in-sample predictive ability.

[2] Nelson (1976), Fama & Schwert (1977), Campbell & Vuolteenaho (2004), Campbell (1987), Fama & French (1989), Baker & Wurgler (2000), Boudoukh et al. (2007), Lettau & Ludvigson (2001) and Guo (2002).

In an important study, Goyal & Welch (2008) survey these variables and explore their predictive ability for the equity premium in an out-of-sample (OOS) context. They conclude that a market timer with access only to the information available at that time would not have benefited relative to using a simple historical average as the best estimate of future returns. They find that most models are unstable or spurious. Although time-varying equity premia are largely accepted in the literature, beating the constant risk-premium assumption in a forecasting exercise is inherently difficult for investors.

Campbell & Thompson (2008) follow up on this study and are more positive about equity premium forecasting. They find that beating the historical average is possible once weak restrictions are imposed on the signs of coefficients and return forecasts. They argue that even small out-of-sample explanatory power can provide meaningful utility gains to mean-variance investors.

In addition to prediction on an individual level, Rapach et al. (2010) propose a combination approach to forecast the equity premium in an out-of-sample context. Both the econometric and the macroeconomic value of the estimated model are explored.

The reason behind the combination of individual predictive regression models is that these models cannot fully approximate the data-generating process for expected equity returns. Combining 15 individual predictive regression models results in a significant reduction of the model uncertainty that accompanies individual models. Furthermore, their model outperforms the historical average forecast of the equity premium. The link with the real economy is investigated in four ways. Firstly, the combination approach typically mimics the higher risk premium in economic downturns caused by heightened risk aversion. Secondly, out-of-sample gains are found during periods of weak growth, which speaks in favour of the model's ability to forecast business-cycle fluctuations. Thirdly, the model is related to macroeconomic risk. Lastly, instabilities in the individual predictive regression models, which are related to the real economy, favour the combination approach.

Timmermann (2013) gives several reasons why the forecasting results of a combination of individual predictors are almost always better than those of models using a single predictor. Identifying the predictive accuracy of the individual forecasts is very difficult, as their performance is mainly state-dependent. By combining these individual predictors, diversification gains are generated. Firstly, the dimensionality is reduced, as the combination weights are a single summary measure of information. Secondly, optimal combinations look at the individual estimation errors and estimate weights according to forecasting accuracy, which means that more weight is granted to predictors with lower estimation errors. Thirdly, only in a world without model misspecification and with complete access to the information underlying the individual forecasts is there no need for forecast combination. This is the so-called Irrelevance Proposition.
However, he also emphasizes the existence of the forecast combination puzzle. This puzzle states that simple equal-weighted forecast combinations often perform better than more sophisticated combination schemes. The explanation is that, for the more sophisticated schemes, the cost of estimating the combination weights exceeds the gains from combining optimally, either because the estimation errors are large or because the potential gains are small.

However, not everyone agrees that macroeconomic variables are the best predictors. Neely et al. (2011) emphasize the use of technical indicators instead of popular macroeconomic variables. They consider three types of technical indicators: moving average, momentum and on-balance volume. For example, on-balance volume indicates a buy signal if the trading volume in the past 3 months was higher than in the past 12 months. They find that models based on technical indicators have economically significant out-of-sample forecasting power and generate utility gains. Furthermore, unlike models based on macroeconomic variables, they pick up the decline in the equity risk premium near cycle peaks. However, a combination of both types of model is the best out-of-sample forecasting model, as macroeconomic models tend to pick up the rise in the equity risk premium later in recessions.

Duarte & Rosa (2013) combine different models that forecast the equity risk premium. They include the historical mean of realized returns, dividend discount models (DDMs), cross-sectional regressions, time-series regressions and surveys. They find that time-series regressions have mixed results, but all have large variances. Furthermore, dividend discount models perform better than other models at the short horizon, but perform worst at long horizons. The opposite is found for cross-sectional regressions. An important conclusion is also drawn about the level of the equity risk premium. The premium is not high because stocks are expected to have high returns, but because bond yields are low. This is why the equity risk premium was so high in 2013: bond yields were extremely low. Policy makers should therefore be aware that their monetary policy has a large impact on asset prices and on the real economy as a whole.

Similar to many DDM models, Ferreira & Santa-Clara (2011) forecast stock market returns using the dividend-price ratio, earnings growth and price-earnings ratio growth. They call the sum of these three parts the sum-of-the-parts (SOP) method. In the simple version of the SOP, returns in the next period are forecast using the current dividend-price ratio and the 20-year moving average of earnings growth, while the PE ratio growth is set to zero. They find a gain in out-of-sample performance in comparison with other predictive regressions. This gain can mainly be explained by the fact that no parameters need to be estimated, so that there is no estimation error.
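As a rough illustration of the simple SOP forecast described above, the sketch below adds the current dividend-price ratio to a trailing average of earnings growth and sets the growth in the price-earnings ratio to zero. The function name `sop_forecast`, its arguments, the 240-month window (20 years of monthly data) and the input values are assumptions made for this example; they are not taken from Ferreira & Santa-Clara (2011) and need not match the specification used elsewhere in this thesis.

```python
import numpy as np

def sop_forecast(dp_current, earnings_growth, window=240, pe_growth=0.0):
    """Simple sum-of-the-parts return forecast (illustrative sketch).

    dp_current      : current dividend-price ratio (per period)
    earnings_growth : history of per-period earnings growth rates, oldest first
    window          : length of the trailing average (assumed: 240 months = 20 years)
    pe_growth       : expected growth of the price-earnings ratio (zero in the simple SOP)
    """
    g = np.asarray(earnings_growth, dtype=float)
    if g.size < window:
        raise ValueError("need at least `window` observations of earnings growth")
    expected_earnings_growth = g[-window:].mean()  # trailing moving average
    # No regression coefficients are estimated: the forecast is a plain sum.
    return pe_growth + expected_earnings_growth + dp_current

# Hypothetical monthly inputs: 0.5% earnings growth, 0.2% dividend-price ratio.
growth_history = np.full(300, 0.005)
forecast = sop_forecast(dp_current=0.002, earnings_growth=growth_history)
```

Because every input is observed directly, the only choice left to the forecaster in this sketch is the length of the trailing window.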

Most papers forecast returns for the U.S. Jordan, Vivian & Wohar (2014) use macroeconomic variables and technical indicators to forecast European gross returns. They find evidence that even the individual variables and indicators have predictive power. Furthermore, combining these variables and indicators increases the forecast accuracy, and these forecasting gains are mostly larger than those found for the U.S. They show that theoretically motivated restrictions on the individual variables, as argued for by Campbell & Thompson (2008), did not add value over the evaluation period 1995-2011 in the EU.

3 Methodology

In this section we explain the econometric methodology of our out-of-sample forecasting exercise, the combination methods and the metrics we use for evaluating our forecasts.

3.1 Forecasting regression framework

In a forecasting exercise it is important to distinguish between in-sample and out-of-sample predictive ability. Usually, in-sample predictive ability is estimated by running the regression given in Equation (1), where $r_{t+l}$ is the (nominal or excess) continuously compounded cumulative log return from $t+1$ to $t+l$ and $x_t$ is a variable with potential predictive ability, observed in period $t$.

$$r_{t+l} = \alpha + \beta x_t + \varepsilon_{t+l} \tag{1}$$

Alternatively, out-of-sample predictive ability replicates the situation of a forecaster in real time, using only information available at time $t$. This regression is given in Equation (2). Here, $r_t$ is the $l$-period continuously compounded cumulative log return from $t-l+1$ to $t$ [3] and $x_{t-l}$ is the predictor variable lagged $l$ periods. Thus, for $l = 1$, we regress $r_t$ on a 1-period lagged predictor.

$$r_t = \alpha + \beta x_{t-l} + \varepsilon_t \tag{2}$$

From this model, we estimate time-varying ordinary least squares (OLS) coefficients using a recursive (expanding) estimation window, and our forecast is then given by Equation (3). [4] We estimate $l$-period ahead forecasts for a host of predictor variables. The total sample of $T$ observations is divided into a first part of $m$ observations, used to estimate stable coefficients, and a second part of $T-m$ observations used in our evaluation.

$$\hat{r}_{t+l} = \hat{\alpha}_t + \hat{\beta}_t x_t \tag{3}$$

[3] If $l$ is equal to 1, then the return is just the return at time $t$.

[4] The choice of a recursive (expanding) estimation window is motivated by recent literature and by Pesaran & Timmermann (2007), who argue that, in the presence of structural breaks, the optimal estimation window is an expanding one that includes pre-break data, due to the bias-efficiency trade-off.

In the constant equity risk premium environment, the historical average is our best estimate of future returns. Thus, $\bar{r}_{t+l} = \frac{l}{t}\sum_{k=1}^{t} r_k$ is the natural benchmark against which to evaluate our out-of-sample forecasts from Equation (3). This will be discussed in more detail in Section 3.3.

Importantly, we do not impose the Campbell & Thompson parameter restrictions; we impose non-negativity constraints only when forecasting excess returns.

Following Neely et al. (2011), we evaluate two types of technical indicators. [5] The first rule is a moving-average (MA) rule, generating a buy or a sell signal ($S_t = 1$ or $S_t = 0$, respectively) by comparing two moving averages:

$$S_t = \begin{cases} 1 & \text{if } MA_{s,t} \geq MA_{l,t} \\ 0 & \text{if } MA_{s,t} < MA_{l,t} \end{cases} \tag{4}$$

where

$$MA_{j,t} = \frac{1}{j} \sum_{i=0}^{j-1} P_{t-i} \quad \text{for } j = s, l \tag{5}$$

with $P_t$ the price index level and $s$ ($l$) a short (long) MA lookback window ($s < l$). We consider monthly MA rules with $s = 1, 2, 3$ and $l = 9, 12$.

[5] Neely et al. (2011) consider three types of technical indicators. We consider MA rules to be a generalization of momentum signals with the added benefit of removing the shadow effect of observations dropping out in the last period. This point is made by Ilmanen (2011), p. 95.

In addition, we evaluate a signal incorporating volume information, namely on-balance volume (OBV). This is defined as:

$$OBV_t = \sum_{k=1}^{t} VOL_k D_k \tag{6}$$

where $VOL_k$ is the trading volume during period $k$ and $D_k$ is a binary variable that is 1 if the index is up during period $k$ and -1 otherwise. Signals are then formed from $OBV_t$ as

$$S_t = \begin{cases} 1 & \text{if } MA^{OBV}_{s,t} \geq MA^{OBV}_{l,t} \\ 0 & \text{if } MA^{OBV}_{s,t} < MA^{OBV}_{l,t} \end{cases} \tag{7}$$

where

$$MA^{OBV}_{j,t} = \frac{1}{j} \sum_{i=0}^{j-1} OBV_{t-i} \quad \text{for } j = s, l. \tag{8}$$

We use the same lookback windows as for the MA rule. To arrive at forecasts for the equity risk premium we use the same predictive regression framework as described above, estimating Equation (2) with our signals $S_t$.

3.2 Forecast combination framework

Our combination methods can be grouped into two classes. The first class can be labelled forecast pooling: combining two or more individual forecasts from a panel of forecasts to arrive at a single, pooled forecast. This methodology was introduced by Bates & Granger (1969). The second class can be labelled factor combinations, in which the comovements among a large number of economic variables are treated as arising from a small number of unobserved sources, or factors (Stock & Watson, 2010).

Rapach et al. (2010) argue that forecast pooling appears successful for out-of-sample equity premium prediction because it achieves a middle ground. Individual predictive regression forecasts are very volatile (and prone to structural instability), while the historical average is simply too smooth and ignores information contained in economic and financial variables that could capture time-varying equity risk premia. Combining the individual forecasts uses the information contained therein while avoiding an excessively noisy forecast.

Thus, for forecast pooling, we combine our $K$ individual predictor forecasts $\hat{r}_{i,t+l}$ made at time $t$ using a weighting approach. The combination forecast $\hat{r}^{c}_{t+l}$ is formed using Equation (9):

$$\hat{r}^{c}_{t+l} = \sum_{i=1}^{K} \omega_{i,t}\, \hat{r}_{i,t+l} \tag{9}$$

The simplest weighting scheme is the mean combination forecast with $\omega_{i,t} = 1/K$. The median combination forecast takes the median value of the $K$ individual forecasts, and the trimmed mean combination forecast sets $\omega_{i,t} = 0$ for the smallest and largest individual forecasts and $\omega_{i,t} = 1/(K-2)$ for the remaining ones.
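The steps described so far can be sketched in a few lines of Python: building the MA and OBV signals of Equations (4) to (8), producing recursive one-step-ahead forecasts as in Equations (2) and (3) together with the historical-average benchmark, and pooling individual forecasts as in Equation (9). This is a minimal sketch under stated assumptions rather than the code behind our results: all function names are hypothetical, the horizon is fixed at $l = 1$, the first period of the OBV series is treated as a down period, and at least three individual forecasts are assumed for the trimmed mean.

```python
import numpy as np

def moving_average(p, j):
    """MA_{j,t} = (1/j) * sum_{i=0}^{j-1} p_{t-i}; NaN until j observations exist."""
    p = np.asarray(p, dtype=float)
    out = np.full(p.shape, np.nan)
    c = np.cumsum(np.insert(p, 0, 0.0))
    out[j - 1:] = (c[j:] - c[:-j]) / j
    return out

def ma_signal(p, short, long):
    """S_t = 1 if the short MA is at or above the long MA, else 0 (Equations (4), (7))."""
    ma_s, ma_l = moving_average(p, short), moving_average(p, long)
    signal = np.where(ma_s >= ma_l, 1.0, 0.0)
    signal[np.isnan(ma_l)] = np.nan  # no signal before the long window is filled
    return signal

def on_balance_volume(price, volume):
    """OBV_t = sum_{k<=t} VOL_k * D_k with D_k = +1 if the index rose in k, else -1."""
    price = np.asarray(price, dtype=float)
    volume = np.asarray(volume, dtype=float)
    # Assumption: the first period has no previous price and is counted as "down".
    d = np.where(np.diff(price, prepend=price[0]) > 0, 1.0, -1.0)
    return np.cumsum(volume * d)

def recursive_forecast(r, x, m):
    """Expanding-window OLS forecasts for l = 1, plus the historical average."""
    r, x = np.asarray(r, dtype=float), np.asarray(x, dtype=float)
    fc = np.full(len(r), np.nan)
    ha = np.full(len(r), np.nan)
    for t in range(m, len(r) - 1):
        # Regress r_s on x_{s-1} using only data available at time t.
        beta, alpha = np.polyfit(x[:t], r[1:t + 1], 1)
        fc[t + 1] = alpha + beta * x[t]   # forecast of r_{t+1}
        ha[t + 1] = r[:t + 1].mean()      # benchmark forecast of r_{t+1}
    return fc, ha

def pool_forecasts(F):
    """Mean, median and trimmed-mean combinations of a (T, K) forecast matrix."""
    F = np.asarray(F, dtype=float)
    ordered = np.sort(F, axis=1)
    return {
        "mean": F.mean(axis=1),
        "median": np.median(F, axis=1),
        "trimmed": ordered[:, 1:-1].mean(axis=1),  # drop smallest and largest
    }
```

In use, `recursive_forecast` would be called once per predictor, the resulting forecast columns stacked into a matrix, and that matrix passed to `pool_forecasts`; rows that still contain NaN (the first $m$ periods) would have to be dropped first.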

Another option is to let the weights depend inversely on the historical performance of the individual predictor forecasts. The Discounted MSFE (DMSFE) method determines ex-ante weights as follows:

$$\omega_{i,t} = \frac{\phi_{i,t}^{-1}}{\sum_{j=1}^{K} \phi_{j,t}^{-1}} \tag{10}$$

where

$$\phi_{i,t} = \sum_{s=m}^{t-l} \theta^{\,t-l-s} \left(r_{s+l} - \hat{r}_{i,s+l}\right)^{2} \tag{11}$$

and θ is a discount factor. This method assigns greater weights in the combination Equation (9) to individual predictors with lower MSFE values; if θ < 1, more emphasis is placed on recent forecasting performance, while θ = 1 implies no discounting.⁶

The second class of forecast combination attempts to trace out a small number of principal components in the underlying data and has a long-standing record in the literature, as summarized by Stock & Watson (2010). Early applications of so-called dynamic factor models (DFMs) focused on large amounts of macroeconomic data and attempted to explain the variation in the underlying data by a small number of factors. Closer to our paper in application, Neely et al. (2011) estimate the predictive ability of macroeconomic variables and technical indicators in conjunction. Employing a principal component approach, they tractably incorporate all information while avoiding in-sample overfitting.

Our individual predictor variables can be expressed as $x_t = (x_{1,t}, \ldots, x_{N,t})'$. We then take the first J principal components of $x_t$, with J < N, always estimated using data available up to time t to simulate the situation of a forecaster in real time. This can be expressed as:

$$\hat{f}_{k,t} = (\hat{f}_{1,k,t}, \ldots, \hat{f}_{J,k,t})' \tag{12}$$

We then generate a principal component (PC) forecast using a predictive regression:

$$\hat{r}_{PC,t+l} = \hat{\alpha}_{PC,t} + \hat{\beta}_{PC,t}'\, \hat{f}_{t,t} \tag{13}$$

with $\hat{\alpha}_{PC,t}$ and $\hat{\beta}_{PC,t}$ the OLS intercept and J-vector of slope coefficient estimates from regressing $\{r_k\}_{k=l+1}^{t}$ on a constant and $\{\hat{f}_{k,t}\}_{k=1}^{t-l}$. Neely et al. (2011) determine J (the number of principal components) using the Onatski (2011) ED algorithm.⁷ We set J = 3 and evaluate how much of the variance is explained by the first three principal components. Additionally, we check for robustness using different values of J.

An important difference between the two combination methods is that the PC-based forecast is much more volatile than the forecast pooling method. After selecting J principal components, the forecast is made using a multivariate predictive regression (see Equation (13)). A multivariate regression based on the individual variables has poor forecasting performance, attributable to its high volatility. As Rapach et al. (2010) point out, the mean combining method shrinks the multivariate (= kitchen sink) forecast towards the historical average. The choice of J therefore involves a clear trade-off: increasing J extracts more information from the individual predictor variables but makes the forecast more volatile and reduces the stabilization benefit of forecast combination.

Rapach et al. (2010) note that forecast pooling and factor approaches represent different strategies for utilizing a range of information sets and that it will be interesting for future research to explore potential gains from using them in conjunction. One possibility to combine both methods would be to extract J principal components from the underlying space of variables and perform individual predictive regressions on each principal component, as in Equation (3). Then, we simply take an average of the J forecasts, reducing the forecast variance.

⁶ Stock & Watson (2010) consider an important practical difference between the simple and the MSFE weighting methods. While the simple weighting methods use only contemporaneous forecasts, the DMSFE method weights the individual predictors based on their historical performance and thus requires a historical track record. Because we have a sufficiently long data sample, we use the historical performance of the individual predictors inside the in-sample estimation period (pre-m).
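As an illustration of the two data-driven combinations in Equations (10) to (13), the sketch below computes DMSFE weights from a hold-out window and a real-time principal component forecast. It is a simplified sketch under stated assumptions rather than the implementation behind our tables: the function names are hypothetical, predictors are standardized before the principal components are extracted (an assumption, since the text does not specify the scaling), and degenerate cases such as a zero hold-out error or a constant predictor are not handled.

```python
import numpy as np

def dmsfe_weights(actual, forecasts, theta=0.9):
    """DMSFE weights, Equations (10)-(11).

    actual:    shape (S,)   realized returns over the hold-out period
    forecasts: shape (S, K) the K individual forecasts of those returns
    Rows are ordered oldest to newest, so the last row gets discount theta**0.
    """
    actual = np.asarray(actual, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    n_obs = actual.shape[0]
    discount = theta ** np.arange(n_obs - 1, -1, -1)
    phi = discount @ (actual[:, None] - forecasts) ** 2   # shape (K,)
    inv_phi = 1.0 / phi
    return inv_phi / inv_phi.sum()

def pc_forecast(X, r, J=3, l=1):
    """Principal component forecast of r_{t+l}, Equations (12)-(13).

    X: shape (t, N) predictors x_1..x_t, r: shape (t,) returns r_1..r_t.
    Only data through time t is used (assumption: predictors are standardized).
    """
    X = np.asarray(X, dtype=float)
    r = np.asarray(r, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    F = Z @ Vt[:J].T                                   # f_{1,t} .. f_{t,t}
    A = np.column_stack([np.ones(r.size - l), F[:-l]])  # regressors f_1..f_{t-l}
    coef, *_ = np.linalg.lstsq(A, r[l:], rcond=None)    # targets r_{l+1}..r_t
    return float(coef[0] + F[-1] @ coef[1:])
```

With `theta=1.0` the first function weights predictors by their undiscounted inverse MSFE. Because `pc_forecast` re-estimates both the components and the regression at each date t, it has to be called once per forecast origin in an out-of-sample loop.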
Theoretically, this approach should contain the same information from the underlying variables, with the added benefit of reducing noise from the principal component forecast. We will explore this further in Section 5.2.

⁷ The ED algorithm typically selects J = 3 from a 28-dimensional space. It is called the edge distribution algorithm because it exploits the square-root shape of the edge of the eigenvalue distribution.

3.3 Forecast evaluation framework

In order to statistically evaluate our forecasts $\hat{r}_{t+l}$ (both individual and combination), we use the out-of-sample R² statistic, $R^2_{OS}$, introduced by Campbell & Thompson (2008). The $R^2_{OS}$ statistic compares the forecast $\hat{r}_{t+l}$ with the natural benchmark $\bar{r}_{t+l}$, the historical average, and measures the reduction in MSFE relative to that benchmark. The formula is given by Equation (14):

$$R^2_{OS} = 1 - \frac{\sum_{k=1}^{T-l} \left(r_{m+k} - \hat{r}_{m+k}\right)^2}{\sum_{k=1}^{T-l} \left(r_{m+k} - \bar{r}_{m+k}\right)^2} \tag{14}$$

When $R^2_{OS} > 0$, the forecast outperforms the historical average in terms of MSFE. Statistical significance is tested using the MSFE-adjusted statistic suggested by Clark & West (2007), evaluating the null hypothesis that $R^2_{OS} \leq 0$ against the alternative that $R^2_{OS} > 0$. The test statistic is based on Equation (15):

$$f_{t+l} = \left(r_{t+l} - \bar{r}_{t+l}\right)^2 - \left[\left(r_{t+l} - \hat{r}_{t+l}\right)^2 - \left(\bar{r}_{t+l} - \hat{r}_{t+l}\right)^2\right] \tag{15}$$

By regressing $\{f_{s+l}\}_{s=m}^{T-l}$ on a constant, we obtain a t-value whose statistical significance we evaluate using the standard normal distribution.⁸

Because significantly positive $R^2_{OS}$ values are often still small (especially at higher frequencies), we also evaluate economic significance, which is of more importance to a real-time forecaster. We calculate the realized utility gain for a mean-variance investor who allocates every period between stocks and cash. As the benchmark, we take an investor with access only to the historical average. At the end of period t, she allocates the following share of the portfolio to equities in period t + 1:

$$\omega_{0,t} = \frac{1}{\gamma} \left(\frac{\bar{r}_{t+l}}{\hat{\sigma}^2_{t+l}}\right) \tag{16}$$

where $\hat{\sigma}^2_{t+l}$ is simply a rolling-window estimate of historically realized stock variance.⁹ We set γ = 3 and forecast volatility using a 5-year rolling window.¹⁰ The choice of estimation window for the volatility forecast should have little influence on the results, as it is common to both investors considered. The average realized utility of the investor is then:

$$U_0 = \mu_0 - \frac{1}{2} \gamma \sigma_0^2 \tag{17}$$

where $\mu_0$ and $\sigma_0^2$ are the out-of-sample mean and variance, respectively, of the investor's portfolio returns. We then repeat the exercise for an investor with access to a forecast from an individual predictor or a combination method. At the end of period t, she allocates the following share of her portfolio to equities:

$$\omega_{j,t} = \frac{1}{\gamma} \left(\frac{\hat{r}_{j,t+l}}{\hat{\sigma}^2_{t+l}}\right) \tag{18}$$

where $\hat{\sigma}^2_{t+l}$ is the same rolling-window estimate of historically realized stock variance and γ = 3. The average realized utility over the out-of-sample period is equal to:

$$U_j = \mu_j - \frac{1}{2} \gamma \sigma_j^2 \tag{19}$$

where $\mu_j$ and $\sigma_j^2$ are the out-of-sample mean and variance, respectively, of the investor's portfolio returns. The difference between Equation (19) and Equation (17) is then annualized to arrive at a so-called certainty equivalent return, which can be interpreted as the portfolio management fee that an investor would be willing to pay to have access to the information in the forecast relative to the historical average.

In the same mean-variance exercise, we calculate Sharpe ratios over the out-of-sample period as the excess return of the portfolio divided by its volatility, and report the difference in Sharpe ratios between the two investors.

⁸ For a forecast horizon L > 1, we estimate the standard Newey-West OLS coefficient covariance by setting the bandwidth to (1.5 × L), as advocated in Clark & West (2007). We are aware that this approach can work reasonably well but exhibits size distortions as the forecast horizon increases. Alternative HAC estimators can improve size performance. Additionally, it would be worthwhile to explore finite-sample tests with bootstrapped critical values that can guard against data mining (Clark & McCracken, 2011).

⁹ Following the literature, we constrain the allocation weights to range between 0 and 150%.

¹⁰ In this case, we follow Neely et al. (2011) instead of Rapach et al. (2010), who use 10 years.
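The statistical part of the evaluation framework, Equations (14) and (15), can be sketched as follows. This is a minimal sketch for a one-period horizon only: the t-statistic uses the plain OLS standard error of the constant, so the Newey-West correction we apply for longer horizons is deliberately left out. The function names are hypothetical.

```python
import numpy as np
from math import erf, sqrt

def r2_os(actual, forecast, benchmark):
    """Out-of-sample R^2 of Equation (14), as a fraction (multiply by 100 for %)."""
    a, f, b = (np.asarray(x, dtype=float) for x in (actual, forecast, benchmark))
    return 1.0 - np.sum((a - f) ** 2) / np.sum((a - b) ** 2)

def clark_west(actual, forecast, benchmark):
    """MSFE-adjusted statistic based on Equation (15), one-step horizon.

    Returns the t-statistic from regressing f_{t+1} on a constant and the
    one-sided p-value from the standard normal distribution.
    Assumption: no HAC correction, so this is only suitable for l = 1.
    """
    a, f, b = (np.asarray(x, dtype=float) for x in (actual, forecast, benchmark))
    adj = (a - b) ** 2 - ((a - f) ** 2 - (b - f) ** 2)
    t_stat = adj.mean() / (adj.std(ddof=1) / np.sqrt(adj.size))
    p_value = 0.5 * (1.0 - erf(t_stat / sqrt(2.0)))   # 1 - Phi(t_stat)
    return t_stat, p_value
```

The third term in Equation (15) is what distinguishes the MSFE-adjusted statistic from a plain comparison of squared errors: it removes the noise that the larger model adds to its forecast under the null, which is why a forecast can be significant even when its $R^2_{OS}$ is close to zero.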

4 Empirical results

In this section we report the results from our empirical exercise. We discuss different forecast horizons, ranging from one month up to a long-term forecast horizon of ten years. We start by updating some well-known results from the literature up to December 2015 and then check the robustness of the results by testing other regions, countries or settings. The variables we use that are the same as in Rapach et al. (2010) are from Amit Goyal's website.¹¹

4.1 One-month forecast horizon

For the one-month ahead forecasts, we follow the paper by Neely et al. (2011), who forecast excess returns, introduce technical variables into the predictive regression literature and combine them with the original variables of Goyal & Welch (2008). Specifically, we add two types of technical indicators: moving averages and on-balance volume.¹² The authors argue that technical indicators tend to pick up the decline in the equity risk premium near cyclical peaks. In combination with macroeconomic variables, which particularly pick up the rise in the equity risk premium near troughs, out-of-sample forecasting ability and utility gains are expected to be even higher. We use the same notation as in the original paper for the technical variables: MA(1,9), MA(1,12), MA(2,9), MA(2,12), MA(3,9), MA(3,12) and OBV(1,9), OBV(1,12), OBV(2,9), OBV(2,12), OBV(3,9), OBV(3,12). We use monthly data with the evaluation period ranging from 1966:1 to 2015:12.¹³ The risk-aversion coefficient γ is set to 5 and the discount factor θ is set to 0.9.

¹¹ http://www.hec.unil.ch/agoyal/

¹² For their calculation, see Methodology: Section 3.1.

¹³ We use an in-sample period of 16 years. The hold-out period (used to calculate combination weights for the DMSFE and MSFE methods) is 10 years and is taken inside the in-sample period (i.e. 1956:1-1965:12).

The remaining variables used in the model are the following:

- Dividend-price ratio (log), DP: difference between the log of dividends paid (12-month moving sum) and the log of stock prices on the S&P 500 index.
- Dividend yield (log), DY: difference between the log of dividends and the log of lagged stock prices.

- Earnings-price ratio (log), EP: difference between the log of earnings (12-month moving sum) and the log of stock prices on the S&P 500 index.
- Dividend-payout ratio (log), DE: difference between the log of dividends and the log of earnings.
- Stock variance, SVAR: sum of squared daily returns on the S&P 500 index.
- Book-to-market ratio, BM: ratio of book value to market value for the Dow Jones Industrial Average.
- Net equity expansion, NTIS: ratio of twelve-month moving sums of net issues by NYSE-listed stocks to total end-of-year market capitalization of NYSE stocks.
- Treasury bill rate, TBILL: interest rate on a 3-month Treasury bill.
- Long-term yield, TBOND: long-term government bond yield.
- Long-term return, TBONDR: return on long-term government bonds.
- Term spread, TERM: difference between the long-term yield and the Treasury bill rate.
- Default yield spread, DFY: difference between BAA- and AAA-rated corporate bond yields.
- Default return spread, DFR: difference between long-term corporate bond and long-term government bond returns.
- Inflation, INFL: calculated from the CPI. We lag this variable, as it only becomes available in the next period.

We obtain results similar to those of the aforementioned authors. The original empirical evidence, evaluating the period 1966:1-2008:12, shows that the combination based on the first 3 principal components produces an $R^2_{OS}$ of 1.66 (%) and an annualized portfolio utility gain (in %) of 5.3. Our updated results (see Table 1) show that the extended exercise has an $R^2_{OS}$ of 1.69 with the PC combination method and an annualized portfolio utility gain (in %) of 5.64.¹⁴ We report a portfolio Sharpe gain of 0.67 versus the historical average. The results are robust across a range of weighting schemes.¹⁵ All methods produce significantly positive $R^2_{OS}$ gains at the 1% level and deliver economically meaningful gains for a mean-variance investor. The results for the non-PC weighting schemes are smaller, but still substantial. From the results of the individual predictor variables we can infer that the technical variables deliver statistically and economically meaningful gains. In Figure 2 (see below), we report the results for the PC combination method only. Panel B of Figure 2 reveals that the out-of-sample gains are primarily located in recessions, which agrees with the evidence in Neely et al. (2014).

While the original study attempts to forecast excess returns, newer literature seems to focus on forecasting gross returns. For instance, Jordan et al. (2014) use raw or gross returns and point to several reasons. First, they argue that the risk-free rate is known at the time of the forecast, since they forecast one-month gross returns. While we agree, we note that for any forecast horizon an investor knows what risk-free rate she can lock in right now. Second, the theoretical basis for return predictability in Campbell & Shiller (1998) is derived with realized log gross returns. Lastly, they test the robustness of their results using excess returns and find similar results.

In addition to forecasting excess returns, the authors of the original study impose both a parameter and a non-negativity restriction along the lines of Campbell & Thompson (2008). It is argued that risk considerations typically imply a positive expected equity risk premium (excess return) based on macroeconomic variables. An important finding in Jordan et al. (2014) is that parameter restrictions do not improve predictability outside the US. The theoretical basis for parameter restrictions is often unclear, except perhaps for valuation ratios. A clear example is the fragile relation between SVAR and subsequent returns. Although the contemporaneous correlation between stock market volatility and equity returns is strongly negative, the predictive relation is hard to pin down.¹⁶ Furthermore, we note that, for the US and for the period ranging from 195:01 to 2015:12, 40.5% of all months can be characterized as months with a negative realized excess return.

Testing for robustness when using gross returns and imposing non-negativity restrictions, we present the results in Table 2 (see below). We find that the results are robust across the board. In all three cases (excess restricted, excess unrestricted and gross unrestricted) the results are qualitatively similar and remain significant. We also report the results for the PC combination method with J = 4 and find similar results. The first four principal components explain 69.7% of the variance of the predictor variables. Increasing J further, thus extracting more information from the underlying variables, significantly reduces the performance of the PC combination forecasts. This is driven by the increased volatility in our estimates of future returns. In addition, different parameter values for γ and θ do not change our results much.

Next, we look at the potential reasons behind the reduction in MSFE relative to the historical average. In Figure 3, we plot the forecast variance against the MSFE.¹⁷ It shows that the mean combination method achieves a significantly lower forecast variance than many of the predictor variables. Although not as smooth as the historical average (HA), it still contains information about the time-varying nature of equity risk premia, as seen by the lower MSFE. The PC combination method has a much higher forecast variance, but this is counterbalanced by a larger reduction in MSFE. The figure reveals a clear picture of the difference between the two combining methods. While the forecast pooling method mainly works via forecast variance reduction, the principal component approach benefits from greater accuracy in its predictions, at least in the tested sample.

¹⁴ The first 3 principal components explain approximately 63.5% of the variance of the predictor variables.

¹⁵ The alternative weighting schemes come very close to simple equally weighted combinations (= mean combination).

¹⁶ See Ilmanen (2011, p. 148) and Pollet & Wilson (2008).

¹⁷ We only plot the mean combining method. The other forecast pooling methods are very close to equally weighted combinations.
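The trade-off between forecast variance and MSFE discussed above follows from a simple identity: the MSFE equals the squared forecast bias, plus the variance of the forecast, plus the variance of the realized return, minus twice their covariance. The sketch below computes these components for one forecast series. It is illustrative only, and the function name and dictionary keys are hypothetical.

```python
import numpy as np

def msfe_decomposition(actual, forecast):
    """Split the MSFE of a forecast into bias, variance and covariance terms.

    Identity (population moments, ddof=0):
        MSFE = bias^2 + Var(forecast) + Var(actual) - 2 * Cov(actual, forecast)
    """
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    cov = np.mean((a - a.mean()) * (f - f.mean()))
    return {
        "msfe": float(np.mean((a - f) ** 2)),
        "bias_sq": float((a.mean() - f.mean()) ** 2),
        "forecast_var": float(f.var()),
        "actual_var": float(a.var()),
        "cov": float(cov),
    }
```

Read this way, a pooled forecast lowers the MSFE mostly through the `forecast_var` term, whereas a more volatile forecast such as the PC combination can only compensate for its higher variance through a larger covariance with realized returns.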

Table 1: One-month ahead forecast

Predictor    R² (%)   R²OS (%)     ΔU      ΔS
BM           -0,04    -1,36        -1,51   -0,1
DP            0,44     0,   *       0,83    0,05
DY            0,50     0,6  **      1,5     0,09
EP            0,17    -0,03         0,53    0,0
TBILL         0,66     0,48 ***     3,6     0,4
TBOND         0,9      0,83 ***     3,6     0,4
TERM          0,33    -0,9          1,49    0,15
DFY          -0,11    -0,67        -1,55   -0,08
TBONDR        0,68     0,17 **      0,99    0,14
DFR           0,09    -0,53         0,16    0,0
SVAR          1,13    -0,11         0,79    0,05
NTIS         -0,1     -0,79        -0,65    0,01
DE           -0,05    -0,4          0,63    0,07
INFL         -0,07    -0,05         0,81    0,09
MA(1,9)       0,37     0,45 *       ,1      0,19
MA(1,12)      0,44     0,58 *       ,41     0,3
MA(2,9)       0,55     0,75 **      ,73     0,6
MA(2,12)      0,69     0,90 ***     ,84     0,30
MA(3,9)       0,85     1,04 ***     3,04    0,31
MA(3,12)      0,16     0,7          1,71    0,16
OBV(1,9)      0,34     0,51 *       1,79    0,16
OBV(1,12)     0,38     0,40 *       1,47    0,13
OBV(2,9)      0,18     0,4          1,31    0,13
OBV(2,12)     0,65     0,94 ***     ,54     0,6
OBV(3,9)      0,41     0,36         1,51    0,13
OBV(3,12)     0,59     0,85 ***     ,33     0,5
Mean                   0,81 ***     1,76    0,14
Median                 0,9  ***     ,8      0,
Trim                   0,8  ***     1,90    0,16
DMSFE                  0,81 ***     1,84    0,15
MSFE                   0,81 ***     1,77    0,14
PC                     1,69 ***     5,64    0,66

Out-of-sample equity risk premium forecasting and mean-variance investor asset allocation results for individual variables and combination methods over the evaluation period 1966:01-2015:12. R²OS measures the percentage reduction in MSFE of the forecast relative to the historical average. ΔU is the annual management fee that an investor with a risk-aversion coefficient of five would be willing to pay to have access to the respective forecast. ΔS is the annual Sharpe-ratio gain for a mean-variance investor with the same risk-aversion coefficient. Statistical significance of R²OS is evaluated with the Clark & West (2007) MSFE-adjusted statistic, testing H0: R²OS ≤ 0 against HA: R²OS > 0. *, ** and *** indicate significance at the 10%, 5% and 1% levels, respectively.
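The ΔU column follows from Equations (16) to (19). A minimal sketch of that calculation is given below, assuming returns are expressed as decimals per period and that the portfolio return equals the risk-free return plus the equity weight times the excess return. The function names are hypothetical, and the variance forecast is taken as given rather than estimated from a rolling window.

```python
import numpy as np

def allocation_weights(forecast, var_forecast, gamma=5.0, lower=0.0, upper=1.5):
    """Equity share from Equations (16)/(18), constrained to [0, 150%]."""
    w = np.asarray(forecast, dtype=float) / (gamma * np.asarray(var_forecast, dtype=float))
    return np.clip(w, lower, upper)

def realized_utility(portfolio_returns, gamma=5.0):
    """Average realized utility, Equations (17)/(19): mean - 0.5 * gamma * variance."""
    p = np.asarray(portfolio_returns, dtype=float)
    return p.mean() - 0.5 * gamma * p.var(ddof=1)

def utility_gain(excess, rf, forecast, benchmark, var_forecast,
                 gamma=5.0, periods_per_year=12):
    """Annualized certainty equivalent gain of `forecast` over `benchmark`.

    excess, rf: realized excess equity returns and risk-free returns (decimals).
    forecast, benchmark: forecasts of the excess return made one period earlier.
    """
    excess = np.asarray(excess, dtype=float)
    rf = np.asarray(rf, dtype=float)
    w_j = allocation_weights(forecast, var_forecast, gamma)
    w_0 = allocation_weights(benchmark, var_forecast, gamma)
    u_j = realized_utility(rf + w_j * excess, gamma)
    u_0 = realized_utility(rf + w_0 * excess, gamma)
    return periods_per_year * (u_j - u_0)
```

Multiplying the result by 100 gives the gain in percent per year, the unit in which ΔU is reported. Because both investors use the same variance forecast, any difference in utility comes from the return forecast alone.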

Table 2: Robustness of one-month ahead forecasts

             Excess Restricted (Original)   Excess Unrestricted      Gross Unrestricted
             R²OS (%)     ΔU                R²OS (%)     ΔU          R²OS (%)     ΔU
Mean         0,81 ***     1,76              1,01 ***     ,69         0,79 ***     1,49
Median       0,9  ***     ,8                0,86 ***     ,8          0,67 **      1,45
TrimMean     0,8  ***     1,90              0,90 ***     ,44         0,7  ***     1,3
DMSFE        0,81 ***     1,84              1,01 ***     ,76         0,79 ***     1,53
MSFE         0,81 ***     1,77              1,01 ***     ,71         0,79 ***     1,50
PC           1,69 ***     5,7               1,64 ***     5,7         1,40 ***     5,61

             Original (J=4)                 Excess Unrestricted (J=4)   Gross Unrestricted (J=4)
             R²OS (%)     ΔU                R²OS (%)     ΔU          R²OS (%)     ΔU
PC           1,99 ***     5,31              1,3  ***     5,31        1,00 ***     4,53

Out-of-sample equity risk premium forecasting and mean-variance investor asset allocation results for combination methods over the evaluation period 1966:01-2015:12. R²OS measures the percentage reduction in MSFE of the forecast relative to the historical average. ΔU is the annual management fee that an investor with a risk-aversion coefficient of five would be willing to pay to have access to the respective forecast. Statistical significance of R²OS is evaluated with the Clark & West (2007) MSFE-adjusted statistic, testing H0: R²OS ≤ 0 against HA: R²OS > 0. *, ** and *** indicate significance at the 10%, 5% and 1% levels, respectively.

Figure 2: One-month ahead principal component combination forecast

One-month ahead out-of-sample equity premium prediction using the principal component combination method (1966:01-2015:12). Panel A shows the (excess, restricted) principal component equity risk premium forecast (blue), based on the first 3 principal components, relative to the historical average (gray, dotted). Vertical bars show NBER recession periods. Panel B shows the cumulative difference in MSFE for the historical average relative to the principal component forecast. Panel C gives the mean-variance investor equity allocation weights (risk-aversion coefficient of five) for an investor using the historical average (black) and the principal component forecast (blue).

Figure 3: One-month ahead forecast variance and MSFE scatterplot

In a next step, we test the robustness of these results with a similar out-of-sample exercise in the US, the EU, Germany (GER) and the United Kingdom (UK).¹⁸ A common finding in the forecasting literature is that the out-of-sample gains are often situated in the pre-1980 period. However, Jordan et al. (2014) show good results using a simple average combination method of both technical and macro predictors to predict gross unrestricted returns for the period 1995:01-2011:06. As we are testing the robustness of the previous exercise, we forecast excess restricted returns, which is the same as the original case of Neely et al. (2009). Because of data availability, we use a set of variables that is available for all four regions:

- Dividend-price ratio (log), DP: difference between the log of dividends paid (12-month moving sum) and the log of stock prices on the S&P 500 index.
- Dividend yield (log), DY: difference between the log of dividends and the log of lagged stock prices.
- Earnings-price ratio (log), EP: difference between the log of earnings (12-month moving sum) and the log of stock prices on the S&P 500 index.
- Dividend-payout ratio (log), DE: difference between the log of dividends and the log of earnings.
- Treasury bill rate, TBILL: interest rate on a 3-month Treasury bill.
- Term spread, TERM: difference between the long-term yield and the Treasury bill rate.
- Price pressure, PRES: ratio of the number of rising stocks in the previous month to the number of falling stocks.¹⁹
- Change in volume, dvol: monthly change in the turnover volume of traded stocks (in the index).²⁰

For the technical variables, we use the following MAs: MA(1,9), MA(1,12), MA(2,9), MA(2,12), MA(3,9), MA(3,12). Our data sample ranges from 1980:1 to 2015:12 and the evaluation period is 1992:1 to 2015:12. In order to keep the out-of-sample period large enough, we do not use the weighting methods that require a hold-out period (MSFE and DMSFE). The risk-aversion coefficient γ is set to 3 and we set J = 2.²¹

We present the results in Table 3 (see below). The $R^2_{OS}$ values of the individual predictor forecasts show mixed results. The traditional DP and DY ratios significantly outperform the historical average forecast only in the UK. The technical MAs have out-of-sample predictive ability, except for the UK. The averaging combination methods all have positive out-of-sample predictive ability, although significant only in a few cases. The PC combination method achieves positive out-of-sample gains relative to the historical average in 3 out of 4 cases. Overall, it is clear that the out-of-sample gains are much less pronounced than in the original exercise. Figure 2 already suggested that a more recent sample shows worse results. However, the combination methods still add value relative to the individual predictive regressions. Looking at the economic gains for a mean-variance investor, the principal component combination forecast shows much better results in terms of annualized utility gains, except for Germany.

A somewhat puzzling finding is that the technical moving averages perform considerably worse in both Germany and the UK than in the US and EU. The performance of the MAs seems to be largely determined by the stock market performance during the in-sample period. The theoretically expected positive sign of the beta coefficients in the predictive regressions of the MAs does not always hold. In many cases, the early years of the evaluation period show negative beta coefficients, meaning that a sell signal results in a relatively higher expected return forecast. Only towards the end of the sample period do the coefficients behave as expected. The model often needs a considerable in-sample period to train its coefficients.

Our evidence suggests that, although the results are less convincing than for the long sample, the combination methods still improve upon the results of the individual predictive regressions. A simple average of the same set of individual predictor variables provides positive forecasting power relative to the historical average forecast (although not always significant) and delivers economically meaningful gains to an investor attempting to time the market across various markets. The gains appear to be situated near recession periods and driven mainly by the MAs, which pick up the decline in the equity risk premium near business cycle peaks. It is certainly worth exploring a larger set of underlying variables, identified in the literature as having predictive ability, across various countries.

¹⁸ Data is gathered from Datastream via the same codes that Jordan et al. (2014) use, except for the European Datastream Index.

¹⁹ This data is gathered from Datastream: TOTMKUS(FS) & TOTMKUS(RS).

²⁰ This data is gathered from Datastream: TOTMKUS(VO).

²¹ Due to the number of variables in this exercise, setting J = 3 would explain approximately 75% of the variance of the underlying individual predictor variables. To make the exercise comparable with the original, we set J = 2, which explains 63.7% (US), 61.% (EU), 61.1% (GER) and 59.3% (UK) of the variance from the 14-dimensional space of variables.
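The "excess restricted" forecasts used in this exercise combine a predictive regression with restrictions along the lines of Campbell & Thompson (2008). The sketch below shows one way such a restricted one-step forecast could be produced at a single forecast origin. It rests on assumptions that are ours for illustration: the function name is hypothetical, and when the slope has the wrong sign we fall back to the historical average of the returns observed so far, which is one possible reading of "setting the coefficient to zero".

```python
import numpy as np

def restricted_forecast(x, r, expected_sign=1.0):
    """One-step-ahead forecast of r_{t+1} with sign and non-negativity restrictions.

    x: predictor values x_1..x_t, r: excess returns r_1..r_t (same length).
    The regression uses only data through time t (expanding window).
    """
    x = np.asarray(x, dtype=float)
    r = np.asarray(r, dtype=float)
    # Regress r_{k+1} on a constant and x_k for k = 1..t-1.
    A = np.column_stack([np.ones(x.size - 1), x[:-1]])
    (alpha, beta), *_ = np.linalg.lstsq(A, r[1:], rcond=None)
    if beta * expected_sign < 0:
        # Wrong-signed slope: drop the predictor (assumption: use the
        # historical average of returns available at time t).
        forecast = r.mean()
    else:
        forecast = alpha + beta * x[-1]
    # Non-negativity restriction on the forecast equity premium.
    return float(max(forecast, 0.0))
```

Calling this function at every date in the evaluation period, each time with the data available up to that date, yields the recursive forecasts. It also makes the training issue noted above visible: with a short in-sample window the estimated slope of a moving-average signal can easily come out negative, in which case the restriction switches the signal off.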

Table 3: One-month ahead forecast for other regions

             R²OS (%)                                   ΔU
             US        EU        GER       UK           US      EU      GER     UK
DP           -0,73     -0,56     -1,09     1,56***      -0,45   -0,74   -3,8    ,77
DY           -0,61     -0,56     -0,56     1,69***      -1,06   -0,70   -,73    ,99
EP            0,68     -0,51     -,0       0,09          ,1     -0,85   -4,34   3,31
DE           -1,81     -0,31     -0,37     -1,13        -1,6    -0,65   -0,50   0,30
TBILL        -1,7      -0,6       0,05     -0,55        -0,71   -0,48   -0,41   -0,87
TERM         -0,59      1,03**    1,0***   1,44**       -0,0     4,7     ,43    3,87
PRES         -0,7       0,08      0,78**   -0,74        -0,17    0,56    1,38   -0,5
dvol         -0,43     -1,0      -1,19     0,66*        -0,6    -4,00   -1,76   ,8
MA(1,9)       0,6       0,7       0,55*    -1,96         ,36     ,97     1,04   -1,45
MA(1,12)      0,54      0,63      0,60*    -1,55         ,68     ,59     0,98   -1,4
MA(2,9)       0,13      0,06     -0,4      -1,6          1,38    1,45   -0,06   -1,3
MA(2,12)      1,09**    0,86*     0,46     -0,97         3,48    3,31    1,00   0,15
MA(3,9)       1,44**    0,17     -0,64     -1,91         3,87    ,30    -0,9    -0,30
MA(3,12)     -0,40      0,19      0,3      -0,64         1,16    ,31     0,55   0,30
Mean          0,19      0,35      0,       0,70*         1,11    1,35    0,0    0,4
Median        0,45      0,30      0,47*    0,70*         1,5     1,30    0,94   0,3
TrimMean      0,        0,31      0,3*     0,76***       1,17    1,38    0,43   0,66
PC            0,48      0,40     -0,15     1,***         ,7      ,40     0,06   ,40

Out-of-sample equity risk premium forecasting and mean-variance investor asset allocation results for individual variables and combination methods over the evaluation period 1992:01-2015:12. R²OS measures the percentage reduction in MSFE of the forecast relative to the historical average. ΔU is the annual management fee that an investor with a risk-aversion coefficient of five would be willing to pay to have access to the respective forecast. Statistical significance of R²OS is evaluated with the Clark & West (2007) MSFE-adjusted statistic, testing H0: R²OS ≤ 0 against HA: R²OS > 0. *, ** and *** indicate significance at the 10%, 5% and 1% levels, respectively.

4.2 One-quarter forecast horizon

For one-quarter ahead forecasts of the equity risk premium, we follow the original methodology of Rapach et al. (2010). We do not impose theoretically motivated restrictions, for reasons explained in Section 4.1. We have quarterly data from 195:1 to 2015:4 and the evaluation period starts in 1965:1.²² We start with the original set of variables (see Section 4.1) and add the investment-to-capital ratio from Cochrane (1991).

²² A difference with the original study is that our in-sample period starts in 195:1 instead of 1947:1.

- Dividend-price ratio (log), DP: difference between the log of dividends paid (12-month moving sum) and the log of stock prices on the S&P 500 index.
- Dividend yield (log), DY: difference between the log of dividends and the log of lagged stock prices.
- Earnings-price ratio (log), EP: difference between the log of earnings (12-month moving sum) and the log of stock prices on the S&P 500 index.
- Dividend-payout ratio (log), DE: difference between the log of dividends and the log of earnings.
- Stock variance, SVAR: sum of squared daily returns on the S&P 500 index.
- Book-to-market ratio, BM: ratio of book value to market value for the Dow Jones Industrial Average.
- Net equity expansion, NTIS: ratio of twelve-month moving sums of net issues by NYSE-listed stocks to total end-of-year market capitalization of NYSE stocks.
- Treasury bill rate, TBILL: interest rate on a 3-month Treasury bill.
- Long-term yield, TBOND: long-term government bond yield.
- Long-term return, TBONDR: return on long-term government bonds.
- Term spread, TERM: difference between the long-term yield and the Treasury bill rate.
- Default yield spread, DFY: difference between BAA- and AAA-rated corporate bond yields.
- Default return spread, DFR: difference between long-term corporate bond and long-term government bond returns.

- Inflation, INFL: calculated from the CPI. We lag this variable, as it only becomes available in the next period.
- Investment-to-capital ratio, IK: ratio of aggregate investment to aggregate capital for the entire economy.

We extend the evaluation period of the original research. We set the risk-aversion coefficient γ = 5 and select the first J = 3 principal components.²³ We present the results in Table 4. The results show that, of the individual predictive variables, only IK significantly outperforms the historical average over the sample period. Combining the individual regression forecasts, however, significantly outperforms the benchmark. This result holds for all weighting methods. The PC combination method, on the other hand, has a negative $R^2_{OS}$. In terms of utility gains for a mean-variance investor, all combination methods show positive results, with annualized management fees ranging from 1.4% up to 2.51%.

In Figure 4, we show the difference in cumulative MSFE of the individual predictive regression forecasts relative to the historical average. A steadily upward-sloping line indicates consistent outperformance relative to the historical average. The figure shows clearly that hardly any of the individual variables provides consistent gains over a lengthy timespan. These results are in line with the original results in Rapach et al. (2010) and with the findings of Goyal & Welch (2008).

Figure 5 plots the unrestricted, excess return forecasts of the individual predictors. One thing that stands out is the low equity risk premia predicted by the traditional valuation ratios DP, DY and EP. An investor following this bearish signal through the nineties would have missed out on a big rally in equity markets. This can be partly attributed to firms' greater reliance on share buybacks to distribute cash to shareholders.

In Figure 6, we show the results for the mean combination method. Panel B reveals that the average of all individual predictive regression forecasts outperforms the historical average, as seen by the upward-sloping line. However, the gains are concentrated in the pre-1980 period. The negative signal of the valuation ratios is still captured by the equally weighted combination method. The model seems to suffer losses, relative to the historical average, in the late nineties. The results of the other weighting methods are comparable to those of the simple average combination.

²³ In Rapach et al. (2009), the risk aversion is set at 3. Setting J = 3 explains 66.3% of the variance in the individual predictive variables.

Table 4: One-quarter ahead forecast

Predictor    R²OS (in %)    ΔU      ΔS
BM           -3,46          -1,91   -0,08
DP           -0,81           0,17    0,0
DY           -0,88           1,8     0,15
EP           -,40            0,03   -0,01
TBILL        -4,01           ,45     0,1
TBOND        -3,36           ,76     0,
TERM         -3,15           0,69    0,19
DFY          -,49           -,18    -0,08
TBONDR       -1,00          -0,0     0,06
DFR          -0,18           1,0     0,10
SVAR         -10,96         -,4     -0,04
NTIS         -,77           -1,0     0,0
DE           -,8             0,39    0,05
INFL         -0,95           ,56     0,3
IK            1,54 ***       ,98     0,9
Mean          ,35  ***       ,51     0,1
Median        ,3   ***       1,4     0,11
Trim          ,31  ***       ,6      0,18
DMSFE         ,4   **        ,4      0,0
MSFE          ,6   ***       ,48     0,1
PC           -4,3            1,94    0,4

Out-of-sample equity risk premium forecasting and mean-variance investor asset allocation results for individual variables and combination methods over the evaluation period 1965:01-2015:12. R²OS measures the percentage reduction in MSFE of the forecast relative to the historical average. ΔU is the annual management fee that an investor with a risk-aversion coefficient of five would be willing to pay to have access to the respective forecast. ΔS is the annual Sharpe-ratio gain for a mean-variance investor with the same risk-aversion coefficient. Statistical significance of R²OS is evaluated with the Clark & West (2007) MSFE-adjusted statistic, testing H0: R²OS ≤ 0 against HA: R²OS > 0. *, ** and *** indicate significance at the 10%, 5% and 1% levels, respectively.

Figure 4: One-quarter ahead forecasts MSFE difference

Cumulative MSFE difference between the historical average (HA) and the individual predictive regression forecasts. The evaluation period is 1965:1-2015:4. Vertical bars show NBER recession periods. An upward-sloping line indicates that the individual predictive regression forecast outperforms the historical average.
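The line plotted for each predictor in this type of figure is a running sum of squared-error differences. A minimal sketch, with a hypothetical function name, is:

```python
import numpy as np

def cumulative_se_difference(actual, forecast, benchmark):
    """Cumulative squared forecast error of the benchmark minus that of the forecast.

    The series rises in periods where the forecast beats the benchmark and
    falls where it does worse; its final value is positive exactly when the
    out-of-sample R^2 of Equation (14) is positive.
    """
    a, f, b = (np.asarray(x, dtype=float) for x in (actual, forecast, benchmark))
    return np.cumsum((a - b) ** 2 - (a - f) ** 2)

# Hypothetical example: the forecast is better in the first period only.
print(cumulative_se_difference([1.0, 2.0, 3.0], [1.0, 2.0, 2.0], [2.0, 2.0, 2.0]))
```

Plotting this series against time, rather than reporting only its end point, is what reveals whether a positive $R^2_{OS}$ comes from steady outperformance or from a few short episodes.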

Figure 5: One-quarter ahead forecasts

Out-of-sample unrestricted, excess return forecasts of the individual predictive regressions (blue). The evaluation period spans 1965:1-2015:4. The dotted line shows the historical average benchmark return model. Vertical bars show NBER recession periods.

Figure 6: One-quarter ahead mean combination forecast

One-quarter ahead out-of-sample equity premium prediction using the mean combination method (1965:1-2015:4). Panel A shows the (excess, unrestricted) mean combination equity risk premium forecast (blue), based on the individual forecasts, relative to the historical average (gray, dotted). Vertical bars show NBER recession periods. Panel B shows the cumulative difference in MSFE for the historical average relative to the mean combination forecast. Panel C gives the mean-variance investor equity allocation weights (risk-aversion coefficient of five) for an investor using the historical average (black) and the mean combination forecast (blue).

Next, we evaluate the out-of-sample performance of some new variables with in-sample predictive ability, as seen by their positive in-sample adjusted R² values. The new variables are the following:

FED-model, FED: calculated as EP divided by TBOND
Unemployment, U: level of unemployment in the United States (in %) 4
Purchasing Managers Index, PMI: monthly survey from the Institute for Supply Management (ISM) reflecting the state of the manufacturing sector 5
Consumption-to-wealth, CAY: Lettau & Ludvigson (2001). Data is from Amit Goyal.
Average Investor Equity Allocation, AIEA 6
Change in Gross Domestic Product (GDP), dgdp: Y/Y change in Industrial Production in the United States 7
Market-cap to Gross Value Added, MCGVA 8

In addition, we add two moving averages, MA(1,3) and MA(1,4), and two on-balance volume technical indicators, OBV(1,3) and OBV(1,4). Because the data frequency is lower, technical signals are harder to implement. We present the results in Table 5.

Two of the new variables, AIEA and MCGVA, were proposed by John Hussman and deserve further explanation. We begin with AIEA. The total amount of cash and bonds held by investors is calculated as the sum of the total outstanding liabilities of households, non-financial corporations, state and local governments, the federal government and the rest of the world. This supply of cash and bonds increases with the growth of the economy. The total amount of stocks is estimated by the total market value of all stocks of financial and nonfinancial corporate businesses. This amount increases either if new shares are issued or if share prices rise. If the economy grows, the supply of cash and bonds grows at roughly the same pace, so the price of stocks has to rise if investors want to keep the same allocation to stocks. In general, if the AIEA is low, returns are expected to be higher than normal: the investor expects the dividend return, plus the price return needed to keep the AIEA constant as the supply of cash and bonds rises, plus the price return earned when the AIEA reverts to its normal level. If the AIEA is high, the investor expects the dividend return, plus the price return needed to keep the AIEA constant with rising supply, minus the price return lost when the AIEA falls back to normal levels. 9

The rationale for the MCGVA variable is Hussman's argument that GDP includes the income of foreign companies in the US and excludes the income of US citizens abroad. Therefore, the GVA of domestic corporations gives a better estimate of the true valuation in the US.

Many of the new variables show positive out-of-sample forecasting ability. All combination methods improve relative to the original set of variables. Most striking is the good performance of the PC combination forecast. The first three principal components explain, on average, 60.3% of the variance of the individual variables over the out-of-sample period. In Figure 7 we show the individual predictor forecasts of the new variables over the evaluation sample and in Figure 8 we plot the cumulative MSFE difference relative to the historical average.

4 https://search.stlouisfed.org/search?&client=researchnew&proxystylesheet=research&site=research&output=xml_no_dtd&num=30&getfields=*&q=level %0of%0unemployment
5 https://search.stlouisfed.org/search?&client=researchnew&proxystylesheet=research&site=research&output=xml_no_dtd&num=30&getfields=*&q=pmi
6 https://research.stlouisfed.org/fred/graph/?g=qis
7 https://search.stlouisfed.org/search?&client=researchnew&proxystylesheet=research&site=research&output=xml_no_dtd&num=30&getfields=*&q=gross %0domestic%0product%0us
8 http://www.hussmanfunds.com/wmc/wmc150518.htm
9 http://www.philosophicaleconomics.com/013/1/the-single-greatest-predictor-of-future-stockmarket-returns/
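The AIEA mechanics described above can be illustrated with a minimal sketch. It assumes that the allocation is defined as the market value of stocks divided by the sum of stocks plus cash and bonds, which is how we read the source in footnote 9. The function name and all numbers are hypothetical and are not FRED data.

```python
def aiea(stock_value, borrower_liabilities):
    """Average investor equity allocation: stocks / (stocks + cash and bonds).

    borrower_liabilities holds the outstanding liabilities of each borrower
    group (households, non-financial corporations, governments, rest of the
    world); their sum proxies the supply of cash and bonds (an assumption).
    """
    cash_and_bonds = sum(borrower_liabilities)
    return stock_value / (stock_value + cash_and_bonds)


# Made-up numbers for illustration only.
liabilities = [15.0, 10.0, 3.0, 20.0, 12.0]      # sums to 60
allocation_before = aiea(40.0, liabilities)      # 40 / (40 + 60)

# If cash and bonds grow by 5% and investors keep the same allocation,
# the market value of stocks has to grow by 5% as well.
grown = [x * 1.05 for x in liabilities]
required_stock_value = allocation_before / (1.0 - allocation_before) * sum(grown)
```

In this toy example the required stock value rises from 40 to 42, which is the "price return to keep the AIEA constant" component of the expected return.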

Table 5: One-quarter ahead forecasts with extra variables

          R²is     R²os          ΔU      ΔS
BM        -0,1     -3,46         -1,91   -0,08
DP        0,75     -0,81         0,17    0,0
DY        1,13     -0,88         1,8     0,15
EP        -0,06    -,40          0,03    -0,01
TBILL     0,88     -4,01         ,45     0,1
TBOND     0,08     -3,36         ,76     0,
TERM      0,89     -3,15         0,69    0,19
DFY       -0,5     -,49          -,18    -0,08
TBONDR    0,56     -1,00         -0,0    0,06
DFR       1,50     -0,18         1,0     0,10
SVAR      -0,39    -10,96        -,4     -0,04
NTIS      -0,36    -,77          -1,0    0,0
DE        -0,04    -,8           0,39    0,05
INFL      1,66     -0,95         ,56     0,3
IK        ,14      1,54 **       ,98     0,9
FED       ,05      1,57 **       3,90    0,36
U         1,08     0,53          0,88    0,08
PMI       3,7      ,33 ***       -0,85   0,08
CAY       ,76      1,5 ***       ,14     0,4
AIEA      3,38     ,01 ***       1,01    0,18
dgdp      1,94     0,64 *        -0,87   0,05
MCGVA     1,6      -0,05 *       0,54    0,09
MA(1,3)   0,30     -0,38         1,0     0,07
MA(1,4)   -0,06    -0,85         0,71    0,05
OBV(1,3)  1,4      0,38          ,13     0,4
OBV(1,4)  1,49     0,85          1,51    0,14
Mean               ,85 ***       ,57     0,
Median             ,98 ***       ,13     0,18
Trim               ,8 ***        ,4      0,0
DMSFE              ,77 ***       ,56     0,
MSFE               ,81 ***       ,57     0,
PC                 ,0 ***        4,63    0,67

Out-of-sample equity risk premium forecasting and mean-variance investor asset allocation results for individual variables and combination methods over the evaluation period 1965:01-2015:12. R²is measures the in-sample adjusted R². R²os measures the percentage reduction in MSFE of the forecast relative to the historical average. ΔU is the annual management fee that an investor with a risk aversion coefficient of five would be willing to pay to have access to the respective forecast. ΔS is the annual Sharpe-ratio gain for a mean-variance investor with the same risk-aversion coefficient. Statistical significance of R²os is evaluated with the Clark & West (2007) MSFE-adjusted statistic testing H0: R²os ≤ 0 against HA: R²os > 0. *, ** and *** indicate significance at the 10%, 5% and 1% levels, respectively.
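As an illustration of the two statistics in the table note, the sketch below computes R²os and the Clark & West MSFE-adjusted statistic for a one-step-ahead forecast. It is a simplified version under stated assumptions: the function names are hypothetical, the p-value uses a standard normal approximation, and no autocorrelation correction is applied, which would be needed for overlapping multi-period forecasts.

```python
import math
import numpy as np


def r2_os(actual, forecast, benchmark):
    """Percentage reduction in MSFE of `forecast` relative to `benchmark`."""
    actual, forecast, benchmark = (np.asarray(a, dtype=float)
                                   for a in (actual, forecast, benchmark))
    msfe_model = np.mean((actual - forecast) ** 2)
    msfe_bench = np.mean((actual - benchmark) ** 2)
    return 100.0 * (1.0 - msfe_model / msfe_bench)


def clark_west(actual, forecast, benchmark):
    """MSFE-adjusted statistic and one-sided p-value for H0: R2_os <= 0."""
    actual, forecast, benchmark = (np.asarray(a, dtype=float)
                                   for a in (actual, forecast, benchmark))
    # Adjusted loss differential: benchmark loss minus model loss, plus the
    # term that corrects for the noise of estimating the larger model.
    f = ((actual - benchmark) ** 2
         - ((actual - forecast) ** 2 - (benchmark - forecast) ** 2))
    # t-statistic of the mean of f (regression of f on a constant).
    t_stat = f.mean() / (f.std(ddof=1) / math.sqrt(len(f)))
    p_value = 0.5 * math.erfc(t_stat / math.sqrt(2.0))  # upper normal tail
    return t_stat, p_value
```

Because the adjusted differential adds a non-negative term, the test can reject even when R²os itself is small or slightly negative.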

Figure 7: One-quarter ahead forecasts (extra variables)
Out-of-sample unrestricted, excess return forecasts of individual predictive regressions (blue). The evaluation period spans 1965:1-2015:4. The dotted line shows the historical average benchmark return model. Vertical bars show NBER recession periods.

Figure 8: One-quarter ahead MSFE difference (extra variables)
Cumulative MSFE difference between the historical average (HA) and the individual predictive regression forecasts. The evaluation period is 1965:1-2015:4. Vertical bars show NBER recession periods. An upward sloping line indicates that the individual predictive regression forecast outperforms the historical average.
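The line plotted in such a figure can be sketched as a running sum of squared-error differences. This is our reading of "cumulative MSFE difference" and should be treated as an assumption; the function name is hypothetical.

```python
import numpy as np


def cumulative_sq_error_difference(actual, forecast, benchmark):
    """Running sum of (benchmark error)^2 - (model error)^2.

    The series rises in periods where the model forecast is closer to the
    realised return than the historical-average benchmark.
    """
    actual, forecast, benchmark = (np.asarray(a, dtype=float)
                                   for a in (actual, forecast, benchmark))
    return np.cumsum((actual - benchmark) ** 2 - (actual - forecast) ** 2)
```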

4.3 One-year forecast horizon

To test the forecasting ability of our individual variables and combination methods over a medium-term horizon, we forecast unrestricted, one-year ahead (log) excess returns. We start the evaluation period in 1965:1. 30 The risk-aversion coefficient is set to 3 and J = 3. We use the following variables: 31

Dividend-price ratio, DP
Earnings-price ratio, EP
Treasury bill rate, TBILL
Treasury bond rate, TBOND
Term spread, TERM
Treasury bond return, TBONDR
Dividend-payout ratio, DE
Inflation, INFL
Investment-to-capital ratio, IK
FED-model, FED
Unemployment, U
Purchasing Managers Index, PMI
Consumption-to-wealth, CAY
Change in Gross Domestic Product, dgdp
Ten-year rolling average of dgdp, dgdp10
Market-cap to Gross Value Added, MCGVA
Average Investor Equity Allocation, AIEA

We find that the R²os is significant for some variables, with CAY having the highest forecast accuracy over the whole sample, 7.03%. However, almost all combination methods show higher accuracy than the individual predictors, except for MSFE and PCA, for which the gains are insignificant. Although some variables consistently beat the historical average, forecast accuracy during the 1970s was generally poor and some variables have underperformed since then. This is also the case for one combination method, namely the PCA.

In terms of utility gains for a mean-variance investor, some individual variables deliver higher gains than the combinations. FED, TBILL and TBOND have higher values than the Median, which has the highest utility gain of all combination methods. Remarkably, the PCA also underperforms here. This suggests that PCA is probably not the best combination method for forecasting returns over the medium or long term.

30 The DMSFE and MSFE weighting schemes require a hold-out period. We take this period inside the in-sample period (pre-1965). The hold-out period in this case is 7 years.
31 The choice of predictor variables is based primarily on in-sample performance.
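The utility gain ΔU discussed above can be sketched as the difference in certainty-equivalent returns between an investor who allocates on the model forecast and one who allocates on the historical average. The sketch makes several assumptions that are not confirmed by the text: the function names are hypothetical, the equity weight is clipped to the range 0 to 1.5, and the variance forecast is supplied by the caller.

```python
import numpy as np


def mv_weights(forecast, variance, gamma, bounds=(0.0, 1.5)):
    """Mean-variance equity weight: forecast / (gamma * variance), clipped."""
    w = np.asarray(forecast, dtype=float) / (gamma * np.asarray(variance, dtype=float))
    return np.clip(w, bounds[0], bounds[1])   # bounds are an assumption


def certainty_equivalent(weights, excess_ret, risk_free, gamma):
    """Average realised utility: mean minus (gamma / 2) times variance."""
    port = (np.asarray(risk_free, dtype=float)
            + np.asarray(weights, dtype=float) * np.asarray(excess_ret, dtype=float))
    return port.mean() - 0.5 * gamma * port.var(ddof=1)


def utility_gain(fc_model, fc_bench, variance, excess_ret, risk_free,
                 gamma, periods_per_year=4):
    """Annualised certainty-equivalent gain in percent (the ΔU column)."""
    ce_model = certainty_equivalent(mv_weights(fc_model, variance, gamma),
                                    excess_ret, risk_free, gamma)
    ce_bench = certainty_equivalent(mv_weights(fc_bench, variance, gamma),
                                    excess_ret, risk_free, gamma)
    return 100.0 * periods_per_year * (ce_model - ce_bench)
```

A forecast can have a low R²os and still produce a positive ΔU, because the utility measure depends on the portfolio weights implied by the forecast rather than on squared forecast errors.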

Table 6: One-year ahead forecasts

          R²is     R²os          ΔU      ΔS
DP        4,3      -6,1          0,48    0,01
EP        1,19     -8,           0,34    0,0
E10P      ,98      -7,59         -0,86   -0,0
FED       6,3      3,6 **        4,69    0,8
TBILL     ,3       -11,45        4,49    0,9
TBOND     0,14     -14,06        4,15    0,6
TERM      3,91     -4,5          3,14    0,5
TBONDR    1,46     -,30          1,65    0,1
DE        0,8      -8,1          1,64    0,11
INFL      ,87      -4,98         3,4     0,19
U         1,98     -1,1          0,08    0,01
PMI       5,97     ,13 *         1,7     0,16
CAY       7,8      4,53 ***      3,13    0,1
IK        5,96     ,41 *         ,56     0,14
AIEA      11,0     5,41 ***      1,83    0,16
dgdp      ,8       0,79          0,58    0,08
dgdp10    6,3      0,50 ***      3,6     0,0
MCGVA     4,5      -4,6          1,15    0,09
Mean               8,96 ***      ,86     0,18
Median             9,6 ***       ,93     0,18
Trim               9,00 ***      ,73     0,17
DMSFE              6,85 ***      ,8      0,13
MSFE               8,56 ***      ,46     0,15
PC                 -15,11        0,54    0,08

One-year ahead out-of-sample equity risk premium forecasting and mean-variance investor asset allocation results for individual variables and combination methods over the evaluation period 1965:01-2015:12. R²is measures the in-sample adjusted R². R²os measures the percentage reduction in MSFE of the forecast relative to the historical average. ΔU is the annual management fee that an investor with a risk aversion coefficient of five would be willing to pay to have access to the respective forecast. ΔS is the annual Sharpe-ratio gain for a mean-variance investor with the same risk-aversion coefficient. Statistical significance of R²os is evaluated with the Clark & West (2007) MSFE-adjusted statistic testing H0: R²os ≤ 0 against HA: R²os > 0. *, ** and *** indicate significance at the 10%, 5% and 1% levels, respectively.

Figure 9: One-year realized excess return (log)
Plots the one-year trailing (log) excess returns on the S&P500 for the period 1952:1-2015:4. Vertical bars delineate NBER recession periods.

Figure 10: One-year ahead forecasts
One-year ahead out-of-sample unrestricted, excess return forecasts of individual predictive regressions (blue). The evaluation period spans 1965:1-2015:4. The dotted line shows the historical average benchmark return model. Vertical bars show NBER recession periods.

Figure 11: One-year ahead MSFE difference
Cumulative MSFE difference between the historical average (HA) and the individual predictive regression forecasts. The evaluation period is 1965:1-2015:4. Vertical bars show NBER recession periods. An upward sloping line indicates that the individual predictive regression forecast outperforms the historical average.

Figure 12: One-year ahead combination weights
DMSFE and MSFE combination weights over the evaluation period 1965:1-2015:4. For DMSFE, the discount factor is 0.9, while there is no discounting for the MSFE method.
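The weights in this figure can be sketched as inverse discounted squared forecast errors, normalised to sum to one. This follows the usual DMSFE construction in the forecast-combination literature and is an assumption about the exact implementation; the function name is hypothetical. Setting the discount factor to 1 gives the undiscounted MSFE weights.

```python
import numpy as np


def dmsfe_weights(actual, forecasts, theta=0.9):
    """Combination weights from discounted squared forecast errors.

    actual:    realised returns over the hold-out window, shape (T,)
    forecasts: individual forecasts for the same periods, shape (T, N)
    theta:     discount factor; the most recent error gets weight 1
    """
    actual = np.asarray(actual, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    T = len(actual)
    discounts = theta ** np.arange(T - 1, -1, -1)
    phi = discounts @ (actual[:, None] - forecasts) ** 2   # one value per model
    inverse = 1.0 / phi        # assumes no model has zero accumulated error
    return inverse / inverse.sum()
```

With a discount factor below one, a model that has erred recently loses weight faster than a model whose errors lie further in the past.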

Figure 13: One-year ahead mean combination forecasts
One-year ahead out-of-sample equity premium prediction using the mean combination method (1965:1-2015:4). Panel A shows the (excess, unrestricted) mean combination equity risk premium forecast (blue), relative to the historical average (gray, dotted). Vertical bars show NBER recession periods. Panel B shows the cumulative difference in MSFE for the historical average relative to the mean combination forecast. Panel C gives the mean-variance investor equity allocation weights (risk aversion coefficient of five) for an investor using the historical average (black) and the mean combination forecast (blue).

4.4 Ten-year forecast horizon

In this subsection, we report results for ten-year forecasts. A longer-term perspective is relevant for many investors but is less prevalent in the forecasting literature. For this exercise, we focus on a host of valuation, macroeconomic and financial variables with potential predictive ability, as suggested by their in-sample adjusted R² values. We forecast unrestricted, gross (log) returns using quarterly data from 1952:1-2015:4 and evaluate the period from 1982:1 to 2015:4. 32 The out-of-sample period is considerably shorter, which is unavoidable when forecasting long-horizon returns. The following variables are used:

Dividend-price ratio (log), DP
Dividend-yield ratio (log), DY
Earnings-price ratio (log), EP
Ten-year trailing real earnings-price ratio (log), E10P
Unemployment, U
Output gap, YGAP
Ten-year trailing average GDP growth, dgdp10
Market-cap to Gross Value Added, MCGVA
Corporate profits after tax of nonfinancial corporate business to net worth of nonfinancial corporate business, PNW 33
Average Investor Equity Allocation, AIEA
Total Financial Assets to Households Disposable Income, TFAHDI 34

It is worth exploring some of these variables, and the choice behind them, further. We use four traditional ratios with potential market-timing ability. Dividend yield was the most popular indicator for a long time, until it failed completely in the late nineties. The breakdown can be attributed to a change in corporate behaviour, switching from dividend payments to share repurchases as a means to distribute profits to shareholders. We also use the basic 12-month trailing earnings yield and the ten-year trailing real earnings yield, made popular by Shiller.

Next, we test the out-of-sample forecasting ability over a ten-year horizon of three macroeconomic variables, namely unemployment, the output gap and ten-year trailing average GDP growth. These indicators should capture real activity and correlate well with business cycles, giving them potential market-timing ability.

TFAHDI shows the savings of households compared to their income. If the savings rate increases, one would expect the rate of financial investment to rise as well. However, this is not the case, as the ratio mainly reflects the increase in valuations. High valuations do not mean direct wealth for investors, as the assets still need to be sold and are probably overvalued. Overall, the higher the savings rate of households, the lower the expected long-term returns, as a high rate indicates more overvalued financial assets. 35

PNW should have some forecasting ability because stock prices are an approximation of future expected cash flows. Current and expected profits after tax therefore reflect cash flows in the future.

In this section, we introduce a new weighting scheme, which we call MaxROS. We do not consider this a combination method, as it simply selects the best variable. This selection is made every period by maximizing the following ratio:

\[ \max_{\omega} \; \frac{\omega' R_{OS,t}}{\omega' \Sigma \, \omega} \qquad (20) \]

with $R_{OS,t}$ the out-of-sample R² of the individual predictors at time t and $\Sigma$ the in-sample residual covariance matrix. The purpose of this weighting scheme is to combine both in-sample and out-of-sample forecast evaluation metrics. In this sense, individual predictor variables are chosen more aggressively than in the previous combination methods. A well-performing variable will be given a weight of 100%. A clear drawback is that this weighting function is heavily prone to model breakdown. The benefit of noise reduction from forecast combination is entirely lost. 36

32 As we need 10-year realized returns to calculate our errors, we need a larger in-sample period. This in-sample period is 30 years in this case, with a hold-out period of 15 years.
33 Data is gathered from the FRED Database website.
34 Data is gathered from the FRED Database website.
35 http://www.hussmanfunds.com/wmc/wmc160516.htm
36 It would be interesting to extend the weighting scheme and consider possible restrictions on the weights, so the selection becomes a combination method.
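One possible reading of the MaxROS selection is sketched below. It assumes that the weight vector is restricted to put 100% on a single predictor, so that the ratio for predictor i reduces to its out-of-sample R² divided by its in-sample residual variance (the i-th diagonal element of Σ). This restriction and the function name are our assumptions, not a statement of the exact procedure used for the tables.

```python
import numpy as np


def max_ros_selection(r2_os, sigma):
    """Put a weight of 100% on the predictor with the highest ratio.

    r2_os: out-of-sample R2 of each predictor at time t, shape (N,)
    sigma: in-sample residual covariance matrix, shape (N, N)
    """
    r2_os = np.asarray(r2_os, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # For a unit weight vector e_i the ratio w'R / (w' Sigma w) is R_i / Sigma_ii.
    scores = r2_os / np.diag(sigma)
    weights = np.zeros(len(r2_os))
    weights[np.argmax(scores)] = 1.0
    return weights
```

Because the whole weight moves to one predictor, the output changes abruptly whenever the ranking changes, which is the model-breakdown risk noted in the text.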

In Table 7, we present the results for both the individual predictors and the combination methods. Traditional dividend-yield ratios perform poorly out-of-sample. Both earnings-yield variables do much better, in terms of forecasting accuracy as well as utility gains for an investor. Real activity variables have a lower in-sample fit, and their out-of-sample performance is mixed. The output gap seems to have the most timing ability, but the result is not significant. Our results show that AIEA has the highest out-of-sample performance of all individual variables, with an R²os of almost 87.7%. The forecasting ability of TFAHDI and MCGVA is also high and significant at the 1% level. For a mean-variance investor, the best variables appear to be EP, AIEA and TFAHDI.

Turning our attention to the combination methods, we notice that all of them significantly outperform the historical average. The largest reduction in MSFE relative to the historical average is achieved by the MaxROS selection mechanism. Figure 18 shows that its performance is driven almost entirely by AIEA. Clearly, this method is highly sensitive to a variable breaking down. The diversification benefit of other combination methods is lost here. The simple weighting schemes (mean, median, trimmed mean) and the PC combination forecast deliver out-of-sample gains of around 62% relative to the historical average and show utility gains from 1% to 1.5%. Another interesting finding is that the DMSFE and MSFE weighting functions are no longer quasi-equally weighted (see Figure 18). This can be attributed to larger differences in MSFEs (due to the summation of log returns).

Our results indicate that the forecasted returns of AIEA moved very closely with the realised ten-year return. Furthermore, TFAHDI, MCGVA and PNW show great forecasting ability. In this sample, our own selection ratio (MaxROS) performs very well. This is because it places almost 100% of the weight on AIEA throughout the sample, whereas DMSFE, for example, also uses all the other variables. In that sense, we have created a weighting method that is much more aggressive than the other methods. This is not necessarily a good thing, as AIEA may stop working in the future, and the performance of this method would break down with it.

Table 7: Ten-year ahead forecasts

          R²is     R²os          ΔU      ΔS
DP        53,73    -3,08         -0,9    0,00
DY        5,48     3,13          0,13    0,01
EP        55,6     34,67 **      ,07     0,07
E10P      67,0     47,89 ***     1,7     0,07
U         3,       4,48          0,67    0,04
YGAP      7,64     15,41         1,6     0,05
dgdp10    14,81    -8,9          0,57    0,0
AIEA      86,86    89,73 ***     1,95    0,11
TFAHDI    70,77    7,01 ***      1,89    0,10
MCGVA     69,81    66,74 ***     0,66    0,0
PNW       7,79     17,78 ***     -1,9    -0,07
Mean               63,91 ***     1,39    0,07
Median             6,80 **       1,16    0,05
Trim               6,79 ***      1,46    0,07
DMSFE              70,49 ***     1,17    0,05
MSFE               74,9 ***      1,53    0,07
MaxROS             88,8 ***      1,91    0,10
PC                 61,40 ***     0,97    0,04

Ten-year ahead out-of-sample equity return forecasting and mean-variance investor asset allocation results for individual variables and combination methods over the evaluation period 1982:01-2015:4. R²is measures the in-sample adjusted R². R²os measures the percentage reduction in MSFE of the forecast relative to the historical average. ΔU is the annual management fee that an investor with a risk aversion coefficient of five would be willing to pay to have access to the respective forecast. ΔS is the annual Sharpe-ratio gain for a mean-variance investor with the same risk-aversion coefficient. Statistical significance of R²os is evaluated with the Clark & West (2007) MSFE-adjusted statistic testing H0: R²os ≤ 0 against HA: R²os > 0. *, ** and *** indicate significance at the 10%, 5% and 1% levels, respectively.

The main conclusion that can be drawn here is that forecasts over the long term perform much better than forecasts over the very short term. However, we are aware that this good performance is mainly driven by the remarkable forecasting ability of AIEA, TFAHDI and MCGVA.

Figure 14: Ten-year realized gross return (log)
Ten-year trailing (log) gross returns on the S&P500 for the period 1962:1-2015:4. Vertical bars delineate NBER recession periods.
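A trailing multi-period log return of this kind is simply a rolling sum of one-period log returns. The sketch below shows the computation for quarterly data; the function name and the annualisation convention are assumptions made for illustration.

```python
import numpy as np


def trailing_log_return(log_returns, horizon, annualize=False, periods_per_year=4):
    """Rolling sum of one-period log returns over `horizon` periods.

    Entry k covers periods k .. k + horizon - 1, so the output has
    len(log_returns) - horizon + 1 elements.
    """
    r = np.asarray(log_returns, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(r)))
    total = csum[horizon:] - csum[:-horizon]
    if annualize:
        total = total * periods_per_year / horizon
    return total
```

For a ten-year return on quarterly data the horizon is 40, which is why the first trailing observation only becomes available ten years after the start of the sample.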

Figure 15: Ten-year ahead forecasts
Ten-year ahead out-of-sample unrestricted, gross return (annualized) forecasts of individual predictive regressions (blue). The evaluation period spans 1965:1-2015:4. The dotted line shows the historical average benchmark return model. The dashed line shows the annualized ten-year subsequently realized unrestricted gross return. Vertical bars show NBER recession periods.

Figure 16: Ten-year ahead mean combination forecast
Ten-year ahead (annualized) out-of-sample return prediction using the mean combination method (1965:1-2015:4). Panel A shows the (gross, unrestricted) mean combination equity return forecast (blue), relative to the historical average (gray, dotted). The dashed line shows the ten-year annualized, subsequently realized gross return. Vertical bars show NBER recession periods. Panel B shows the cumulative difference in MSFE for the historical average relative to the mean combination forecast. Panel C gives the mean-variance investor equity allocation weights (risk aversion coefficient of five) for an investor using the historical average (black) and the mean combination forecast (blue).

Figure 17: Ten-year ahead MaxROS forecast
Ten-year ahead (annualized) out-of-sample return prediction using the MaxROS combination method (1965:1-2015:4). Panel A shows the (gross, unrestricted) MaxROS combination equity return forecast (blue), relative to the historical average (gray, dotted). The dashed line shows the ten-year annualized, subsequently realized gross return. Vertical bars show NBER recession periods. Panel B shows the cumulative difference in MSFE for the historical average relative to the MaxROS forecast. Panel C gives the mean-variance investor equity allocation weights (risk aversion coefficient of five) for an investor using the historical average (black) and the MaxROS combination forecast (blue).

Figure 18: Ten-year ahead combination weights
DMSFE, MSFE and MaxROS combination weights over the evaluation period 1982:1-2015:4. For DMSFE, the discount factor is 0.9, while there is no discounting for the MSFE method.

5 Robustness and Sensitivity analysis

5.1 Long-term horizon robustness

We test the robustness of the long-term signals to additional lags in data availability. Specifically, we lag the variables AIEA, TFAHDI, MCGVA and PNW by three extra quarters. Our results from Section 4.4 remain intact. In fact, the mean combination achieves a greater reduction in MSFE relative to the historical average. This result is puzzling.

Table 8: Robustness of ten-year ahead forecasts

          R²is     R²os          ΔU      ΔS
DP        53,73    -3,08         0,77    0,01
DY        5,48     3,13 *        1,48    0,07
EP        55,6     34,67 **      ,88     0,07
E10P      67,0     47,89 ***     1,5     0,04
U         3,       4,48          0,85    0,04
YGAP      7,64     15,41         1,6     0,06
AIEA      78,30    84,05 ***     ,08     0,11
TFAHDI    70,10    73,08 ***     3,84    0,
dgdp      14,81    -8,9          0,61    0,04
MCGVA     65,64    66,68 ***     ,4      0,09
PNW       30,94    4,67 ***      -0,95   -0,07
Mean               64, ***       1,78    0,05
Median             6,30 **       1,87    0,05
Trim               63,5 **       1,74    0,05
DMSFE              6,6 ***       ,33     0,07
MSFE               67,91 ***     ,7      0,07
MaxROS             68,18 ***     ,65     0,10
PC                 46,68 ***     1,9     0,05

Ten-year ahead out-of-sample equity return forecasting and mean-variance investor asset allocation results for individual variables and combination methods over the evaluation period 1982:01-2015:4. AIEA, TFAHDI, MCGVA and PNW are lagged an extra 3 quarters. R²is measures the in-sample adjusted R². R²os measures the percentage reduction in MSFE of the forecast relative to the historical average. ΔU is the annual management fee that an investor with a risk aversion coefficient of five would be willing to pay to have access to the respective forecast. ΔS is the annual Sharpe-ratio gain for a mean-variance investor with the same risk-aversion coefficient. Statistical significance of R²os is evaluated with the Clark & West (2007) MSFE-adjusted statistic testing H0: R²os ≤ 0 against HA: R²os > 0. *, ** and *** indicate significance at the 10%, 5% and 1% levels, respectively.

Next, we do a sensitivity analysis over different forecasting horizons. The evaluation period is 1982:1-2015:4 for all runs; we use quarterly data and forecast unrestricted, gross returns from one quarter ahead up to 40 quarters ahead, with J = 3 and γ = 5. We divide the variables into groups in order to present a clearer picture.

In Figures 19 and 20, we show R²os gains and utility gains, respectively, for variables that appear to work well as long-term valuation signals. These variables are EP, E10P, AIEA, TFAHDI, MCGVA, PNW and YGAP. E10P clearly outperforms EP in Figure 19, but a simple EP appears to be a fairly good signal for a mean-variance investor, as seen in Figure 20. While AIEA appears to be our best long-term variable in terms of statistical predictive ability, it is not the leader in terms of utility gain. Here, EP, TFAHDI and MCGVA show better results. All three variables show a local minimum in the medium-term horizon, around 15 quarters ahead.

In Figures 21 and 22, we show the same results, for R²os and utility gain, respectively, for variables that appear to do well in the medium term. In Figure 21, three variables stand out, namely U, CAY and IK. Other variables appear to have some predictive ability for horizons around 20 quarters, but not as much as the aforementioned. In terms of utility gains, DP and DY appear to do better. Both seem to deliver their highest gains for the long-horizon forecasts. CAY appears to do worse than many others.

Next, Figures 23 and 24 show the results for variables that appear to have hardly any statistical predictive ability. While TBONDR appears to be the only variable with some predictive ability, particularly for shorter horizons, the utility gains in Figure 24 again show a different picture. Although gains are smaller than for other groups, BM, FED, TBONDR, DFR and SVAR appear to have some value for an investor.

Finally, Figures 25 and 26 show the same exercise for the combination methods. It is clear that the mean combination is the most robust in terms of statistical and economic gains over all forecast horizons.
The DMSFE and MSFE approaches lean quite strongly on the forecasting performance of some long-term signals, which increases their performance but makes them more prone to structural instability and model breakdown. The MaxROS model selection shows two clusters of good performance, indicating that its performance derives from CAY and AIEA. Finally, the principal component combination does not add much value for quarterly forecasts, except at the longer-term horizons. We will explore the PC combination further in the next subsection.

While it is hard to draw conclusions from this exercise, it does show that variables with explanatory power do not always deliver the largest utility gains to an investor. In addition, signals appear to work for different horizons, indicating that they capture different aspects of the business cycle.
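The horizon sensitivity exercise can be sketched as a loop over forecast horizons, each time running a recursive (expanding-window) predictive regression against the historical average. The sketch is illustrative only: the function name, the univariate OLS specification and the synthetic data are assumptions and do not reproduce the results above.

```python
import numpy as np


def horizon_r2_os(x, r, horizon, first_forecast):
    """R2_os (in %) of an expanding-window predictive regression.

    x[t] is the predictor observed at t and r[t] the log return of period t.
    The target at t is y[t] = r[t+1] + ... + r[t+horizon].
    """
    x = np.asarray(x, dtype=float)
    r = np.asarray(r, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(r)))
    y = csum[horizon + 1:] - csum[1:len(r) - horizon + 1]
    if first_forecast - horizon + 1 < 3:
        raise ValueError("first_forecast leaves too few realised targets")
    sse_model = sse_bench = 0.0
    for t in range(first_forecast, len(y)):
        n = t - horizon + 1            # only targets fully realised by time t
        X = np.column_stack((np.ones(n), x[:n]))
        beta, *_ = np.linalg.lstsq(X, y[:n], rcond=None)
        sse_model += (y[t] - (beta[0] + beta[1] * x[t])) ** 2
        sse_bench += (y[t] - y[:n].mean()) ** 2   # historical average benchmark
    return 100.0 * (1.0 - sse_model / sse_bench)


# Synthetic data in which x predicts the next period's return.
rng = np.random.default_rng(0)
x = rng.normal(size=300)
r = np.empty(300)
r[0] = 0.0
r[1:] = 0.5 * x[:-1] + 0.01 * rng.normal(size=299)
sensitivity = {h: horizon_r2_os(x, r, h, first_forecast=80) for h in (1, 4, 8)}
```

Restricting the estimation sample to targets that are fully realised at the forecast date avoids look-ahead bias, which matters most at long horizons where the last `horizon` observations are not yet usable.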

Figure 19: Long-term horizon R²os sensitivity
Figure plots the R²os gains relative to the historical average of a selection of variables for different quarterly forecast horizons. The evaluation period is 1982:1-2015:4.

Figure 20: Long-term horizon ΔU sensitivity
Figure plots the ΔU gains relative to the historical average, for a mean-variance investor with a risk-aversion coefficient of five and J = 3, of a selection of variables for different quarterly forecast horizons. The evaluation period is 1982:1-2015:4.