Prospective book-to-market ratio and expected stock returns

Kewei Hou, Yan Xu, Yuzhao Zhang

Feb 2016

We propose a novel stock return predictor, the prospective book-to-market, defined as the present value of expected future demeaned book-to-market ratios. We find that the aggregate prospective book-to-market ratio can significantly predict stock market returns, with adjusted R-squared between 5.0% and 5.8% out-of-sample. In addition, a high-minus-low investment strategy based on the prospective book-to-market ratio generates significant monthly alphas ranging from 13.4 to 20.8 basis points across various factor models, and the return spread is also shown to be non-redundant as an alternative value factor in pricing the cross-section of stock returns.

Kewei Hou: Fisher College of Business, The Ohio State University, 2100 Neil Avenue, Columbus, OH 43210, USA; phone: +1 614 292 0552; fax: +1 614 292 7062; email: hou.28@osu.edu.
Yan Xu: Faculty of Business and Economics, University of Hong Kong; Pokfulam Road, Hong Kong; phone: +852 2859 7037; fax: +852 2548 1152; email: yanxuj@hku.hk.
Yuzhao Zhang: Rutgers Business School, Rutgers, The State University of New Jersey, Newark, NJ 07102, USA; phone: +1 973 353 2727; fax: +1 973 353 1006; email: yzhang@business.rutgers.edu


1 Introduction

In this paper, we propose a new stock return predictor by decomposing the book-to-market ratio into permanent and transitory components. Our decomposition relates the present value of demeaned stock returns to the temporary component of the book-to-market ratio, the present value of the demeaned book-to-market ratio, and the present value of demeaned return on equity. When the expected return moves with any or all of these three terms, future stock returns become predictable once the investor observes new information about the book-to-market ratio. Specifically, we focus on the prospective book-to-market [1], defined as the expected sum of all future book-to-market ratios measured relative to its long run trend. When the expected sum of future book-to-market ratios is above its long run trend, it signals either that the expected return is above its long run trend, or that the market value is temporarily low relative to the book value and is expected to rise in the future. Indeed, we find that the prospective book-to-market is particularly useful in predicting next period returns.

Empirically, we model the prospective book-to-market by assuming a simple autoregressive form for the book-to-market ratio and then estimating its infinite sum while taking the historical average into account. Similar to ?, which utilizes the difference in persistence of state variables to better predict stock returns, in our setup the superior predictive power of the prospective book-to-market depends on the persistence of the book-to-market ratio and its current level relative to the long run trend.

Our data include returns and book-to-market ratios at three different levels: the market, industry portfolios, and the cross-section of individual firms. In out-of-sample tests, we use only currently available information to ensure there is no look-ahead bias. There are two parameters to estimate. For the long run trend, we simply use the historical sample average as a proxy.
The estimation of the autoregressive coefficient of the book-to-market ratio deserves further elaboration. For both the market and the industry portfolios, we rely on a simple autoregression to measure persistence. In addition to OLS, we also conduct robust regression to minimize the effect of outliers in the return predictability tests. [2]

[1] The naming of this term is analogous to ?, although in a different setting.

[2] In the meantime, we are aware that we will suffer from the well-known ? bias that occurs when the sample size is small, especially at the early stage of the out-of-sample period. This bias constitutes another difficulty: our point estimates will be imprecise in the early sample period, thereby contaminating the return predictability results. We thus also employ other econometric tools to address such bias, for example, the recursive mean least squares.

We find that our prospective book-to-market ratio is a significant return predictor at the market, industry, and individual stock levels. At the market level, the prospective book-to-market ratio produces out-of-sample adjusted R-squared between 5.0% and 5.8%, in contrast to the conclusion of ? that market returns cannot reliably be predicted out of sample. Moreover, as shown in ?, these out-of-sample R-squared values imply substantial economic gains for the investor. In industry level time series tests, we show that the prospective book-to-market ratio predicts returns on the 48 industry portfolios. Moreover, using a zero-cost long-short strategy, the industry prospective book-to-market ratio is shown to generate a significant monthly spread of 2.3% to 2.4% in risk-adjusted returns across industries, whereas the original book-to-market ratio fails to do so, consistent with ?.

At the individual firm level, firm-by-firm estimation of the persistence in book-to-market ratios is very imprecise because of the shorter sample periods, so we estimate a straightforward pooled OLS regression at the industry level and assign the resulting estimate to the individual stocks within that industry. Interestingly, we observe substantial cross-industry differences in the persistence parameter, which creates additional heterogeneity in book-to-market ratios when we sort firms into portfolios to develop a highly profitable investment strategy. To demonstrate the forecasting power of our new predictor, we long (short) firms when the expected sum of future book-to-market ratios is higher (lower) than its historical average, after controlling for firm size. This strategy generates significant monthly alphas ranging from 13.4 to 20.8 basis points over models with q-factors, the Fama-French 3 factors, the 3 factors augmented with the momentum factor, the Fama-French 5 factors, and the 5 factors augmented with the momentum factor.

We provide time series spanning tests by regressing the returns of the standard HML factor and the alternative annually formed HML factor with updated price information on our prospective book-to-market factor. We find that these two versions of the HML factor are spanned by our prospective factor, but not the other way around. Finally, we contribute to the debate on whether HML is a redundant factor in the existing factor models. Although these two annually formed HML factors are indeed redundant in the Fama-French 5 factor model, the prospective factor is not redundant and can be useful in pricing the cross-section of stock returns.

Methodologically, we extend the model in ? and decompose the book-to-market ratio into transitory and permanent components. In deriving the transitory component of the book-to-market ratio, our model implies stock return predictability from a multiple period sum of expected book-to-market ratios, which we term the prospective book-to-market ratio. This is a similar construct to the prospective interest rate differentials in ? and ?, which have been shown to predict currency excess returns.

Finally, our work also contributes to the literature on the value premium in the cross-section of stock returns. In particular, recent studies have focused on decomposing the book-to-market ratio and examining each component's return predictive power. For example, ? demonstrate that most of the return predictability of book-to-market comes from the within-industry component. ? and ? also examine the components of the book-to-market ratio. ? propose the use of the priced component of book-to-market, which outperforms the raw value of book-to-market. In contrast, we study the transitory component of the book-to-market ratio as a starting point to improve the predictive power of the raw value of book-to-market.

The paper is organized as follows. In Section 2, we detail our present-value model and present the permanent-temporary components decomposition. In Section 3, we introduce the data. We then report the estimation of model parameters, present predictive regressions of expected returns, and compare out-of-sample portfolio performance. We also explore variations of our proposed predictor and conduct robustness checks. Section 4 concludes.

2 Model

We start from the definition of the stock return

$$\frac{P_{t+1}}{P_t}\left(1 + \frac{D_{t+1}}{P_{t+1}}\right) = R_{t+1},$$

where $P_t$, $D_t$, and $R_t$ denote the stock price, dividend, and return, respectively. We use lower case letters to denote the logarithms of these variables. Let $\delta_t$ be the log dividend-price ratio

$$\delta_t = \ln\left(1 + \exp(dp_t)\right),$$

and if we take logs on both sides, we have

$$p_{t+1} - p_t + \delta_{t+1} = r_{t+1}.$$

Taking expectations at time $t$ and iterating forward, we have

$$\begin{aligned}
E_t p_{t+1} - p_t + E_t \delta_{t+1} &= E_t r_{t+1} \\
E_t p_{t+2} - E_t p_{t+1} + E_t \delta_{t+2} &= E_t r_{t+2} \\
&\;\;\vdots \\
E_t p_{t+j} - E_t p_{t+j-1} + E_t \delta_{t+j} &= E_t r_{t+j}
\end{aligned}$$

Summing these equations over $j = 1, \ldots, k$, the price terms telescope and we have

$$E_t p_{t+k} - p_t + \sum_{j=1}^{k} E_t \delta_{t+j} = \sum_{j=1}^{k} E_t r_{t+j}.$$

We then write the expected return $E_t r_{t+1}$ as the sum of the risk premium $\mu_t$ and the risk free rate $i_t$, and we specify the dynamics of the risk free rate, risk premium, and log dividend-price ratio as

$$\begin{aligned}
i_t - \bar{i} &= \phi\,(i_{t-1} - \bar{i}) + \text{error} \\
\mu_t - \bar{\mu} &= \gamma\,(\mu_{t-1} - \bar{\mu}) + \text{error} \\
\delta_t - \bar{\delta} &= \beta\,(\delta_{t-1} - \bar{\delta}) + \text{error}
\end{aligned}$$

That is, we assume these three variables all follow simple first order autoregressive processes, with AR(1) coefficients φ, γ, and β and long run trends $\bar{i}$, $\bar{\mu}$, and $\bar{\delta}$, respectively. Like ?, ?, and ?, by doing so we abstract away from specifying a utility function and deriving the dynamics of expected returns. It is also worth noting that a different modeling approach may impose cross-equation restrictions on these AR(1) coefficients (?, ?, ?), while we only require estimation of β in this paper. Moreover, our method does not require these restrictions, as it does not rely on Campbell and Shiller's approximate identity for the log dividend-price ratio.
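As a worked step, the AR(1) assumptions above give closed forms for the infinite sums of expected deviations. The sketch below assumes the error terms have zero conditional mean and that all AR(1) coefficients lie strictly inside the unit interval; neither condition is stated explicitly in the text. The index $t+j-1$ in the last line follows from the convention $E_t r_{t+1} = \mu_t + i_t$, so that $E_t r_{t+j} = E_t[\mu_{t+j-1} + i_{t+j-1}]$.

```latex
\begin{align*}
E_t\left[\delta_{t+j} - \bar{\delta}\right]
  &= \beta^{j}\left(\delta_t - \bar{\delta}\right), \\
\sum_{j=1}^{\infty} E_t\left[\delta_{t+j} - \bar{\delta}\right]
  &= \left(\delta_t - \bar{\delta}\right)\sum_{j=1}^{\infty}\beta^{j}
   = \frac{\beta\left(\delta_t - \bar{\delta}\right)}{1-\beta},
   \qquad |\beta| < 1, \\
\sum_{j=1}^{\infty} E_t\left[\mu_{t+j-1} - \bar{\mu}\right]
  &= \left(\mu_t - \bar{\mu}\right)\sum_{j=0}^{\infty}\gamma^{j}
   = \frac{\mu_t - \bar{\mu}}{1-\gamma},
   \qquad |\gamma| < 1.
\end{align*}
```

The same argument applied to $i_t$ yields $(i_t - \bar{i})/(1-\phi)$, which is why persistence enters the decomposition through ratios of the form $1/(1-\text{coefficient})$.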

We then let $\tau = \bar{\mu} + \bar{i} - \bar{\delta}$ and $k \to \infty$. Recalling that $E_t r_{t+j} = E_t[\mu_{t+j-1} + i_{t+j-1}]$, we have

$$\lim_{j\to\infty}\left[E_t p_{t+j} - j\tau\right] - p_t + \sum_{j=1}^{\infty}\left[E_t \delta_{t+j} - \bar{\delta}\right] = \sum_{j=1}^{\infty} E_t\left(\mu_{t+j-1} - \bar{\mu}\right) + \sum_{j=1}^{\infty} E_t\left[i_{t+j-1} - \bar{i}\right] \qquad (1)$$

The modeling methodology follows ?, which focuses on the sum of deviations of the expected future interest rate from its long run trend. ? develop an empirical proxy for this prospective interest rate and show that it predicts currency returns beyond the conventional carry trade. According to the ? decomposition, $\lim_{j\to\infty}\left[E_t p_{t+j} - j\tau\right]$ can be seen as the permanent component of the stock price, $p^P_t$. After we eliminate the permanent component of the stock price, both sides of the equation are stationary.

Note that, on its face, this equation is just another identity expressing the temporary component of book-to-market as the present value of returns and cash flows. The model is thus reminiscent of the well-known Campbell and Shiller approximate identity relating the log dividend-price ratio to the present value of returns and cash flows, and of Vuolteenaho's approximate identity relating book-to-market to the present value of returns and cash flows (?, ?). However, they differ conceptually. Our decomposition is on an unobservable term, namely the transitory component of book-to-market, and our equation holds as an identity rather than an approximation. Also, as put in ?, ?'s (1991) return decomposition is related to the Beveridge-Nelson (1981) decomposition in the time-series literature, such that cash-flow news corresponds to the shock to the random-walk component of the log stock price and expected-return news to the shock to the stationary component of the log stock price. Our decomposition does not necessarily have this implication, and we use it only to motivate the present-value identity, so we refrain from modeling the dynamics of book-to-market in our empirical work.

We then simplify the permanent-transitory components decomposition as

$$p^P_t - p_t + \frac{\beta\left(\delta_t - \bar{\delta}\right)}{1-\beta} = \frac{\mu_t - \bar{\mu}}{1-\gamma} + \frac{i_t - \bar{i}}{1-\phi} \qquad (2)$$

Next, we apply the same approach to the log book equity $b_t = \log(B_t)$. Define the log dividend-book equity ratio as $\psi_t = \ln\left(1 + \exp(db_t)\right)$; then we also have

$$b^P_t - b_t + \frac{\beta\left(\psi_t - \bar{\psi}\right)}{1-\beta} = \frac{g_t - \bar{g}}{1-\xi} + \frac{i_t - \bar{i}}{1-\phi} \qquad (3)$$

where the expected excess return on equity $g_t$ also follows a simple first order autoregressive process, with AR(1) coefficient ξ and long run trend $\bar{g}$:

$$g_t = E_t\left[roe_{t+1}\right] - i_t, \qquad g_t - \bar{g} = \xi\,(g_{t-1} - \bar{g}) + e_t.$$

Now define the log book-to-market ratio as

$$\theta_t \equiv \log\left(BE_t / ME_t\right) = b_t - p_t.$$

Then, if we subtract (3) from (2),

$$\left(\theta_t - \theta^P_t\right) - \frac{\beta}{1-\beta}\left[\left(\psi_t - \bar{\psi}\right) - \left(\delta_t - \bar{\delta}\right)\right] = \frac{\mu_t - \bar{\mu}}{1-\gamma} - \frac{g_t - \bar{g}}{1-\xi}.$$

Next, we log-linearize (?) and exploit the same assumption as ?, namely that in the steady state the historical dividend-price and dividend-book equity ratios are equal:

$$\rho = 1\Big/\left[1 + \frac{D}{P}\right] = 1\Big/\left[1 + \frac{D}{B}\right].$$

Then we obtain, for both the log dividend-price ratio and the log dividend-book equity ratio,

$$\begin{aligned}
\delta_t = \ln\left(1 + \exp(dp_t)\right) &\approx (1-\rho)\left(dp_t - \overline{dp}\right) + \kappa_t \\
\psi_t = \ln\left(1 + \exp(db_t)\right) &\approx (1-\rho)\left(db_t - \overline{db}\right) + \kappa_t
\end{aligned}$$

We emphasize that we follow the convention in this literature and assume that the historical dividend-price and dividend-book equity ratios are known to the investor, in contrast to ?, which estimates the sample mean at each time $t$. Since $db_t - dp_t = p_t - b_t = -\theta_t$, it then follows that

$$\left(\theta_t - \theta^P_t\right) \approx \frac{\mu_t - \bar{\mu}}{1-\gamma} - \frac{g_t - \bar{g}}{1-\xi} - \frac{(1-\rho)\,\beta\left(\theta_t - \bar{\theta}\right)}{1-\beta}.$$

This equation decomposes the temporary component of the book-to-market ratio into three parts, namely the infinite sums of three terms: future demeaned expected returns, expected demeaned return on equity, and the demeaned log book-to-market ratio. Thus, if the book-to-market ratio is temporarily high, it may be that investors expect a persistently above-average future discount rate or persistently below-average future cash flows. If there is no time variation in cash flows and discount rates, then the current change in book-to-market purely reflects the time variation of book-to-market in the future. Now rewrite the equation as

$$\frac{\mu_t - \bar{\mu}}{1-\gamma} \approx \left(\theta_t - \theta^P_t\right) + \frac{g_t - \bar{g}}{1-\xi} + \frac{(1-\rho)\,\beta\left(\theta_t - \bar{\theta}\right)}{1-\beta}.$$

If the investor expects no time variation in cash flows, and the book-to-market ratio is independent of its own history, then returns can be predicted by the conventional regression in which the lagged book-to-market ratio is the predictor. However, the investor updates her beliefs every time she receives new information about cash flows and/or the book-to-market ratio, so all future cash flows and/or book-to-market ratios are expected to fluctuate around their own averages. Therefore, instead of running the conventional predictive regression, we find from this equation that the three variables on the right-hand side could separately or jointly predict future returns. We label the variable

$$\pi = \frac{(1-\rho)\,\beta\left(\theta_t - \bar{\theta}\right)}{1-\beta}$$

as the prospective book-to-market; this is the novel predictor we develop in this paper. By taking into account the persistence and long run trend of the book-to-market ratio, the prospective book-to-market ratio is much more volatile than the original book-to-market ratio. In our analysis we focus our attention on the prospective book-to-market, and we treat the AR(1) coefficient γ of the return process as constant, so it plays no role in the time series predictive regression.
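The definition of π can be checked numerically. The sketch below is only illustrative: the function names and the input values are hypothetical, not taken from the paper. It compares the closed form with a long truncated sum of AR(1) forecasts of the demeaned log book-to-market ratio, which is what "prospective" means here.

```python
def prospective_bm(theta_t, theta_bar, beta, rho=0.96):
    """Closed form: (1 - rho) * beta * (theta_t - theta_bar) / (1 - beta)."""
    if not -1.0 < beta < 1.0:
        raise ValueError("the geometric sum only converges for |beta| < 1")
    return (1.0 - rho) * beta * (theta_t - theta_bar) / (1.0 - beta)


def prospective_bm_by_summation(theta_t, theta_bar, beta, rho=0.96, horizon=2000):
    """Truncated sum of AR(1) forecasts of the demeaned ratio, scaled by (1 - rho)."""
    total = 0.0
    expected_deviation = theta_t - theta_bar
    for _ in range(horizon):
        # Under AR(1), E_t[theta_{t+j}] - theta_bar = beta**j * (theta_t - theta_bar).
        expected_deviation *= beta
        total += expected_deviation
    return (1.0 - rho) * total


# Hypothetical inputs chosen for illustration only.
closed_form = prospective_bm(theta_t=-0.40, theta_bar=-0.70, beta=0.80)
summed = prospective_bm_by_summation(theta_t=-0.40, theta_bar=-0.70, beta=0.80)
```

Because the factor 1/(1 - β) grows without bound as β approaches one, small changes in a highly persistent β estimate can move π a great deal, which is the source of the volatility noted above.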
Even abstracting away from the estimation of γ, we still achieve superior performance in return predictability. Also, we do not attempt to estimate the temporary component of the book-to-market ratio, as doing so requires more time series modeling and defeats our purpose of developing a parsimonious new return predictor. Finally, we leave the term involving return on equity to separate research. [3]

[3] In a companion paper, we propose a slightly different empirical approach to better suit the much weaker persistence of return on equity, and show that this framework also allows the prospective ROE to predict returns. In fact, we introduce a much larger array of accounting variables to improve their return predictive power, for the U.S. equity market and international markets, all within the same framework.

3 Data and empirical results

3.1 Predicting market returns

We use three sets of data for the empirical analysis. We first apply the methodology to the aggregate market book-to-market ratio and examine the predictive power of the prospective book-to-market. For the aggregate market data, we rely on those from ?, available on Amit Goyal's website. The book value is from Value Line and Dow Jones. The annual book-to-market ratio is the ratio of book value at the end of the previous year to market value at the end of the current year, for the Dow Jones Industrial Average. It starts in 1921, and the market return starts in 1926. Both samples end in 2013.

3.1.1 Parameter estimation

To construct the prospective book-to-market, we start by using the sample historical mean as a proxy for the long run trend, and the sample first order autoregression coefficient as a proxy for the persistence of the log book-to-market ratio. To facilitate a fair comparison with the benchmark log book-to-market ratio, these parameters are re-estimated every year using only data available at the time of estimation. This estimation requires some history of the log book-to-market ratio to initialize. Specifically, we use the first 10 observations to obtain the initial estimates of the moving average and persistence. Each year thereafter, we add one more observation to the data and re-estimate the parameters, eventually obtaining the time series of the estimated parameters. [4]

[4] Our main results do not depend on the choice of the beginning year in the estimation.

Table 1 Panel A presents the summary statistics of the historical mean and persistence parameters. The sample mean of the moving average is -0.531, slightly higher than the full sample average (-0.676, shown in Panel B). The average persistence is 0.772, confirming that the log book-to-market ratio is a slow moving random variable. Finally, the literature has set the parameter ρ to 0.96 per year, as in ? and ?. We follow this custom and treat this parameter as a constant; therefore its value does not affect our empirical work. Our baseline prospective book-to-market is constructed as

$$\pi = \frac{\beta\left(\theta - \bar{\theta}\right)}{1-\beta},$$

in which the persistence parameter β is the simple OLS estimate. Both the historical mean and persistence parameters are updated each year. We emphasize that each observation is constructed using only information available at that point in time, so there is no look-ahead bias in obtaining the π variable. Our main results are based on π, and we also examine alternative specifications in the robustness check section.

Table 1 about here.

Table 1 Panel B presents the summary statistics of the market excess returns, log book-to-market ratios, and prospective log book-to-market ratios. Market excess returns and log book-to-market ratios have been extensively studied in the literature, so we focus on our main predictor, the prospective log book-to-market ratio. We note that the mean of the prospective book-to-market is slightly smaller in magnitude than that of the original variable (-0.560 vs. -0.676), and the standard deviation of the prospective book-to-market ratio (14.728) is much larger than that of the original variable (0.496). The amplified variability arises because the prospective book-to-market can be large, especially when β approaches one. For example, the maximum of π is as large as 123.092 (in 1933), and the minimum value is -33.076.
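The recursive construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the AR(1) slope is estimated by OLS with an intercept, and the constant factor (1 - ρ) is dropped as in the baseline definition.

```python
def ar1_slope(series):
    """OLS slope from regressing x_t on a constant and x_{t-1}."""
    x_lag, x_now = series[:-1], series[1:]
    n = len(x_lag)
    mean_lag = sum(x_lag) / n
    mean_now = sum(x_now) / n
    cov = sum((a - mean_lag) * (b - mean_now) for a, b in zip(x_lag, x_now))
    var = sum((a - mean_lag) ** 2 for a in x_lag)
    return cov / var


def expanding_prospective_bm(log_bm, burn_in=10):
    """One pi value per date from the burn_in-th observation onward.

    Each value uses only observations up to and including that date,
    so later data cannot leak into earlier estimates.
    """
    out = []
    for end in range(burn_in, len(log_bm) + 1):
        window = log_bm[:end]                    # information available at this date
        theta_bar = sum(window) / len(window)    # recursive historical mean
        beta = ar1_slope(window)                 # recursive AR(1) persistence
        # pi explodes as beta approaches one; no guard is applied here.
        out.append(beta * (window[-1] - theta_bar) / (1.0 - beta))
    return out
```

With 10 burn-in observations, a series of length T yields T - 9 values of π, and changing the final observation leaves all earlier values untouched.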
To ensure that our results are not driven by the particularly large positive outlier, we also conduct a robust regression that uses an iteratively re-weighted least squares algorithm, whose estimates are less sensitive to outliers in the data than the OLS estimates. Furthermore, the prospective book-to-market ratio is not nearly as persistent as the original variable, which is also expected because the prospective variable removes the persistence of the original book-to-market ratio. Panel C shows the pairwise correlations among market excess returns, log book-to-market ratios, and prospective log book-to-market ratios. The correlation between the original and prospective book-to-market ratios is 0.510, suggesting that the two variables are different and yet still share common information.

3.1.2 Predictive regressions

Our main goal in this paper is to compare the predictive power of the prospective ratio with that of the original variable. We first examine the predictability of the market risk premium in a time-series setting; the results are reported in Table 2.

Table 2 about here.

Panel A presents the full sample in-sample (IS) predictive regression results. Note that in-sample refers to the regression being conducted using the full sample; there is no look-ahead bias in our prospective variable. We first show that the original book-to-market ratio marginally predicts the market risk premium, with a moderate t-statistic of 1.70. [5] More importantly, when we turn to the regression with π as the predictor, the coefficient on π is 0.004 and highly significant, with a t-statistic of 5.86 and an adjusted $R^2$ of 8%. The economic significance is also impressive: a one standard deviation shock in π predicts a positive 5.89% (computed as 0.004 × 14.728) excess return the following year.

Many known return predictors fail in the post oil-shock sample (?). Accordingly, we examine the performance of the predictors from 1975 to 2013 in Panel B. As suggested, the log book-to-market ratio itself completely loses its predictive power for market returns during this period (t = 0.59 and $R^2$ = 0.01).
On the contrary, the coefficient on the prospective book-to-market, proxied by π, remains very significant. The point estimate, 0.008, is about twice as large as that in the full sample and is statistically significant, with a t-statistic of 2.71 and an adjusted $R^2$ of 9%. As discussed above, to minimize the effect of outliers, we winsorize the 1933 observation (Panel C) by replacing it with the next largest value. This procedure reduces the statistical significance of π, with a t-statistic of 2.60 and an adjusted $R^2$ of 3%. To further control for outliers, we also winsorize 5% of the π observations; the results are in Panel D. The slope coefficient, 0.008, is significant with a t-statistic of 1.98. It is worth noting that in both winsorized samples, the log book-to-market ratio itself turns insignificant at the 10% level.

[5] Because our sample period covers the most recent financial crisis period, this result is generally consistent with, albeit different from, those in ? and ?.

? cautions that in-sample predictability often fails to translate into out-of-sample (OOS) predictability. To address this issue, we examine the out-of-sample predictive power in Table 3; the metrics we use are adjusted $R^2$, RMSE, and MSE-F, as advocated by ?. [6]

Table 3 about here.

Panel A presents the out-of-sample predictive test results using the first 15 observations [7] to initialize the regression (also known as the burn-in period). As documented in the extant literature, the original book-to-market ratio shows negative values in all 3 metrics we examine, suggesting that the original book-to-market ratio fails to outperform a simple historical moving average estimate. In contrast, the prospective book-to-market ratio (π) exhibits strong out-of-sample performance in all 3 measures we use. The adjusted out-of-sample $R^2$ is 4.3% with a p-value of 0.01, significantly outperforming the naïve model using the simple moving average. The extant literature also finds that the commonly used predictors perform even worse in the modern sample starting only from 1975.
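The out-of-sample comparison against the historical mean can be sketched as follows. The paper's own formulas are in its appendix, which is not reproduced in this section, so the definitions below are the commonly used ones and are an assumption here: the unadjusted OOS R-squared (no degrees-of-freedom correction), the difference in RMSE, and the MSE-F statistic for one-step-ahead forecasts. All function names are hypothetical.

```python
def ols_fit(x, y):
    """Intercept and slope of a simple OLS regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def oos_statistics(predictor, returns, burn_in=15):
    """Recursive one-step-ahead forecasts versus the historical-mean benchmark.

    predictor[t] must already be lagged, i.e. known before returns[t] is realized.
    """
    err_model, err_mean = [], []
    for t in range(burn_in, len(returns)):
        intercept, slope = ols_fit(predictor[:t], returns[:t])  # data before t only
        forecast = intercept + slope * predictor[t]
        benchmark = sum(returns[:t]) / t                        # historical mean
        err_model.append(returns[t] - forecast)
        err_mean.append(returns[t] - benchmark)
    n_forecasts = len(err_model)
    mse_model = sum(e * e for e in err_model) / n_forecasts
    mse_mean = sum(e * e for e in err_mean) / n_forecasts
    return {
        "n_forecasts": n_forecasts,
        "r2_oos": 1.0 - mse_model / mse_mean,
        "delta_rmse": mse_mean ** 0.5 - mse_model ** 0.5,
        "mse_f": n_forecasts * (mse_mean - mse_model) / mse_model,
    }
```

All three statistics are positive exactly when the predictor's forecasts have a lower mean squared error than the historical mean. The p-values reported in the text require critical values that this sketch does not compute.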
To investigate whether our proposed predictor also suffers from this reduced predictive power in the most recent period, we perform the out-of-sample test using the first 45 observations to initialize the estimate; the results are in Panel B. Because we use another 10 observations to start estimating the prospective ratio, this specification effectively tests the predictive power only in the post oil-shock period. We find that the raw book-to-market ratio entirely loses its predictive power in this out-of-sample test, with a p-value of 0.79. In sharp contrast, the prospective book-to-market ratio still generates a significant $R^2$ of 5.0% with a p-value of 0.02. Collectively, the results in Tables 2 and 3 demonstrate the strong predictive power of our prospective ratio both in- and out-of-sample and in different sample periods.

[6] We reproduce the formulas for these measures in the appendix; all are statistics examining the relative performance of a predictor against the historical mean.

[7] Because we use another 10 observations to start estimating the prospective book-to-market ratio, the first 25 observations are not used to evaluate the model performance. Our results are not sensitive to the choice of the out-of-sample period.

3.1.3 Robustness

We have shown strong return predictability using empirical proxies for the long run trend and persistence in the model. To further investigate the robustness of the preceding results, we examine several alternative constructs. [8] As the prospective book-to-market relies critically on the estimate of β, the time variation of the prospective book-to-market will also be sensitive to outliers in the raw book-to-market ratio. Therefore we also consider iteratively re-weighted least squares (?). The new estimate ($\beta_{RLS}$) is generated by an iteratively re-weighted least squares algorithm, which gives lower weight to points that do not fit well. Specifically, the weights at each iteration are calculated by applying the bi-square function to the residuals from the previous iteration. The RLS estimates are known to be less sensitive to outliers in the data than the OLS estimates. With this alternative measure of persistence, we construct the alternative π measure

$$\pi = \frac{\beta_{RLS}\left(\theta - \bar{\theta}\right)}{1-\beta_{RLS}}.$$

Table 1 Panel A also presents the summary statistics of the alternative persistence measure ($\beta_{RLS}$). In general, $\beta_{RLS}$ is similar to β, and the linear correlation between the alternative π measure and the baseline π is 0.892 (Table 1 Panel C).
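The iteratively re-weighted least squares idea can be sketched as follows. This is a generic bisquare M-estimator, not necessarily the exact routine behind the reported estimates: the tuning constant 4.685, the scale estimate (median absolute residual divided by 0.6745), and the absence of any leverage adjustment are all assumptions of this sketch, and the function names are hypothetical.

```python
def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def median(values):
    ordered = sorted(values)
    mid = len(ordered) // 2
    return ordered[mid] if len(ordered) % 2 else 0.5 * (ordered[mid - 1] + ordered[mid])


def bisquare_line_fit(x, y, tuning=4.685, max_iter=50, tol=1e-8):
    """Start from OLS, then repeatedly reweight with the bisquare function."""
    intercept, slope = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(max_iter):
        resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
        scale = median([abs(r) for r in resid]) / 0.6745   # robust residual scale
        if scale == 0.0:
            break                                          # fit is already exact
        weights = []
        for r in resid:
            u = r / (tuning * scale)
            # Bisquare weight: smooth down-weighting, zero beyond the cutoff.
            weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
        new_intercept, new_slope = weighted_line_fit(x, y, weights)
        converged = abs(new_slope - slope) < tol
        intercept, slope = new_intercept, new_slope
        if converged:
            break
    return intercept, slope


def robust_ar1(series):
    """Robust AR(1) fit: regress x_t on x_{t-1} with bisquare weights."""
    return bisquare_line_fit(series[:-1], series[1:])
```

One caveat on this design: in an autoregression, an outlying observation also appears as a regressor in the next period, and residual-based weights alone may not down-weight such high-leverage points.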
Panel A of Table 2 establishes the full sample predictive power of this alternative measure, with a t-statistic of 5.41 and an adjusted $R^2$ of 8%. Panel B presents the post oil-shock sample, and Panels C and D examine the two winsorized samples. In the post oil-shock sample and the winsorized samples, the alternative π measure is statistically significant in every case, with t-statistics of 3.08, 2.81, and 2.13, respectively. Interestingly, its point estimates are almost identical at 0.007 across all 4 panels. This is because the measure is more robust to outliers by construction. The significance of the alternative π measure remains when we examine the OOS performance in Table 3. The OOS $R^2$ is 4.1% in the full sample and 5.8% in the post oil-shock sample, with p-values of 0.01 and 0.02, respectively. As the OLS AR(1) coefficient is potentially sensitive to outliers, it is not surprising that this alternative measure sometimes outperforms our baseline prospective book-to-market ratio, which corroborates our main findings.

[8] We are more concerned with large values of π due to the estimation uncertainty of β. Another concern is that the estimate of β is biased downward, especially when the sample size is small. We alternatively try the recursive mean least squares to address this issue, for the market level, industry portfolios, and the cross-section of stock returns. The results are similar, and we do not report them to save space. Adjusting the estimate of β upward may, of course, increase the time variation of π.

3.2 Predicting industry portfolio returns

In this section we turn to industry level stock returns. ? finds that the book-to-market ratio not only predicts returns at the market level; industry book-to-market ratios also predict industry portfolio returns in a time-series setting. Based on the strong predictability of our prospective ratio at the aggregate level, we investigate whether the prospective book-to-market ratios at the industry level can also predict industry returns. Our industry classification, returns data, industry book values, and industry market values are all from Ken French's data library. We compute end-of-year industry book-to-market ratios by dividing book value at the end of the previous year by market value at the end of the current year. The sample period for the industry portfolio data is 1926 to 2013, and we use the 48 industry portfolios. [9]

3.2.1 Parameter estimation

We use the Fama-French 48 industry portfolios as test assets and estimate the moving average and persistence parameters for each industry using the same method as that used for the market book-to-market ratio.
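The timing convention for the industry ratio (book value from the previous year end, market value from the current year end) is easy to get wrong by one year. A minimal sketch, in which the function name and the year-keyed dictionary layout are hypothetical:

```python
import math


def log_book_to_market(book_value, market_value):
    """Map year -> log(book value at end of year - 1 / market value at end of year).

    book_value and market_value are dicts keyed by calendar year. Years without
    a strictly positive lagged book value or current market value are skipped.
    """
    out = {}
    for year in sorted(market_value):
        lagged_book = book_value.get(year - 1)
        if lagged_book is not None and lagged_book > 0 and market_value[year] > 0:
            out[year] = math.log(lagged_book / market_value[year])
    return out


# Made-up numbers for illustration only.
example = log_book_to_market({2000: 50.0, 2001: 60.0}, {2001: 100.0, 2002: 120.0})
```

Here the 2001 ratio pairs the end-of-2000 book value with the end-of-2001 market value, and the resulting series can then be fed to the same recursive mean and AR(1) estimation used for the market.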
Table 4 presents the summary statistics of industry excess returns and industry log book-to-market ratios of each industry.

[9] Results on 12 and 38 industry portfolios are reported in the appendix.

Table 4 about here.

There is ample variation in excess returns and book-to-market ratios across industries. For example, the highest average excess return is 20% per year (Aircraft and Business Supplies) and the lowest is 9% (Utilities, Communication, Retail, and a few other industries). The Transportation industry registers the highest log book-to-market ratio (0.34) and Pharmaceutical Products the lowest (-1.27). The AR(1) coefficients of excess returns are negative for most industries, and those of log book-to-market are all very high.

Table 5 about here.

Table 5 presents the summary statistics of the estimated parameters of each industry. Naturally, the industries with the highest bm and the industries with the most persistent bm do not necessarily coincide. For instance, the bm of the Coal industry is the most persistent (average β is 0.895) and that of the Apparel industry is the least persistent (average β is 0.530). This finding leads to different patterns of heterogeneity between bm and π.

Table 6 about here.

Table 6 presents the summary statistics of π of each industry. The Coal industry shows the most negative π at -6.89, and Automobiles and Trucks shows the highest π at 0.457.

Table 7 about here.

Table 7 further shows the cross-industry average of the summary statistics after we pool all the industry estimates. Panel A presents those of the estimated parameters. Similar to the market level estimates from the last section, the sample mean of the average persistence of the 48 industry portfolios is 0.719, suggesting that the log book-to-market ratios are also persistent at the industry level. Panel B shows the summary statistics of industry returns, industry book-to-market ratios, and industry prospective book-to-market ratios (π). Compared with the original book-to-market ratios, the prospective book-to-market ratios display a higher volatility (2.606 vs. 0.492) and a lower persistence (0.755 vs. 0.803). Both features are consistent with how we construct the variables, as discussed in the previous section. Panel C presents the correlations among the variables of interest. At the industry level, the correlation between the original and prospective book-to-market ratios is quite high, 0.911.

3.2.2 Predictive regressions and industry portfolio sorts

We next examine whether the prospective book-to-market ratios help predict industry portfolio returns. Table 8 presents the in-sample predictive regressions of annual returns on the 48 industries [10] on lagged industry bm, π, and the alternative π measure. All time-series predictive regressions use the Newey-West correction with 3 lags. For the sake of brevity, we only display the coefficient and t-statistic of the predictor and the adjusted $R^2$ of the predictive regression.

Table 8 about here.

For the original bm, 16 coefficients out of 48 industries are significant with p-values less than 5%, and the average adjusted $R^2$ is 1.59%. For the prospective book-to-market ratio π, 17 coefficients out of 48 industries are significant with p-values less than 5%, and the average adjusted $R^2$ is 1.50%. For the alternative π measure, 15 coefficients out of 48 industries are significant with p-values less than 5%, and the average adjusted $R^2$ is 1.39%. Overall, the 3 predictors show similar predictive power in sample.

Table 9 about here.

To examine the out-of-sample predictive power of our variables, Tables 9 and 10 present the OOS results at the industry level. [11]

[10] For robustness, we also present results of these tests for 12 industry portfolios and 38 industry portfolios in the Appendix.

[11] We use the first 15 observations to initialize the estimate.

Table 10 about here.

To compare directly with the out-of-sample results for the aggregate market, we apply the same time-series out-of-sample methods to every industry and count the number of industries for which the p-value of the adjusted R² measure is less than 5% and 10% [12]. Using the original book-to-market ratio, 8 of the 48 industry R²s have a p-value lower than 5%. The 8 industries are Healthcare, Textiles, Construction, Aircraft, Petroleum and Natural Gas, Personal Services, Real Estate, and Other. The prospective book-to-market ratio significantly predicts 7 industries OOS: Healthcare, Construction, Aircraft, Petroleum and Natural Gas, Real Estate, Trading, and Other. Although we do not perform a formal test, both 7 and 8 exceed the number of portfolios that would be expected to be significant just by chance. The numbers of portfolio OOS R²s that are significant at the 10% level are 10 (Fabricated Products and Transportation, in addition to the 8 listed above) and 9 (Fabricated Products and Utilities, in addition to the 7 listed above) for the original and prospective ratios, respectively. We also examine the alternative measure π; its OOS results are shown in Table 11. For the alternative π, the numbers of industry portfolio OOS R²s that are significant at the 5% and 10% levels are 5 and 7, respectively. Overall, the original and the two prospective book-to-market ratios demonstrate similar predictive power in the out-of-sample tests.

Table 11 about here.

The industry portfolios naturally provide us with a cross-section that increases the power of our test. To test whether the book-to-market and prospective book-to-market ratios predict returns across industries, we sort the 48 industries into quintiles at every year end, based on the available book-to-market ratio. The high and low quintiles each include 10 industry portfolios. We examine the mean returns of the 5 quintiles, as well as the zero-cost High-minus-Low industry portfolios, in Table 12 Panel A.

Table 12 about here.

[12] Using p-values of the other two metrics yields identical results.
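The quintile sort on industry portfolios described above can be sketched as follows for a single formation date. This is an illustrative sketch only: the function name `high_minus_low`, the dictionary inputs, the rounding rule for group boundaries, and equal weighting of industries within a quintile are assumptions that the text does not specify.

```python
def high_minus_low(signal, next_return, n_groups=5):
    """Sort portfolios on a signal; return (group mean returns, High-minus-Low).

    signal and next_return are dicts keyed by industry name for one formation
    date. Industries are equally weighted within each group (an assumption).
    """
    names = sorted(signal, key=lambda k: signal[k])
    n = len(names)
    group_means = []
    for g in range(n_groups):
        # With n = 48 and 5 groups, these boundaries give sizes 10, 9, 10, 9, 10,
        # so the low and high groups each hold 10 industries.
        lo = round(g * n / n_groups)
        hi = round((g + 1) * n / n_groups)
        members = names[lo:hi]
        group_means.append(sum(next_return[m] for m in members) / len(members))
    return group_means, group_means[-1] - group_means[0]
```

Repeating the sort at each year end and averaging the resulting High-minus-Low returns over time gives a spread of the kind reported in Table 12 Panel A; risk adjustment requires a further time-series regression on the factor returns.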

Industry portfolio excess returns increase with both bm and π, and both strategies generate cross-industry return spreads of 3.7% per year. The risk-adjusted returns, however, show a different pattern. Panel B shows the α relative to the Fama-French 3-factor model. The High-minus-Low α for the book-to-market ratio is 0.5% per year with a t-statistic of 0.63, whereas that for the prospective book-to-market ratio is significant at 2.3% per year with a t-statistic of 2.51. For robustness, we also examine the portfolios formed on the alternative measure π. The mean return spread is 3.8% with a t-statistic of 3.64, and α is 2.4% with a t-statistic of 2.58. The results are quantitatively similar to those generated by π. Therefore, the industry prospective book-to-market ratio generates a significant spread in risk-adjusted returns across industries, but the original book-to-market ratio fails to do so, consistent with the findings in ?.

3.2.3 Robustness

Similar to the robustness check for the aggregate market, we perform robustness checks for industry portfolios using the industry-level alternative measure π, which is constructed in the same way as that for the market. Table 8 Panel A presents the in-sample SUR results, and the alternative π shows similarly significant predictive power for industry portfolio returns. The t-statistic for π is 7.64 and the system R² is 8%. Panels B and C present the industry portfolio sorting results; the methodology remains the same. We focus on the mean returns and 3-factor α of the zero-cost portfolios sorted by the alternative π. The mean return spread is 3.8% with a t-statistic of 3.64, and α is 2.4% with a t-statistic of 2.58. The results are similar to those generated by π. The OOS results are shown in Table 11 and are also comparable with the OOS results of π.
3.3 Cross-section of returns

The evidence from the stock market and industry portfolios indicates that the prospective book-to-market ratio has superior predictive power for next-period returns relative to the raw log book-to-market ratio. In this section we turn to the cross-section of individual stocks and examine whether the predictability translates into profitable investment strategies.

We take stock returns from CRSP and accounting data from Compustat. Our sample starts with all common shares (share code 10 or 11) traded on NYSE, Amex, and Nasdaq. For these firms, we calculate the book value of equity (shareholder equity, plus balance sheet deferred taxes, plus balance sheet investment tax credits, minus preferred stock) at the end of June. We set missing values of balance sheet deferred taxes and investment tax credits equal to zero. To calculate the value of preferred stock, we set it equal to the redemption value if available, or else the liquidation value, or else the carrying value. Our main sample of individual firms starts in July 1959 and ends in December 2013, and includes stock returns, firms' SIC codes, and accounting information.

3.3.1 Parameter estimation

As stated earlier, unlike the industry portfolio data, our CRSP-Compustat data begin in 1959. We still conduct all our tests in the out-of-sample (OOS) period to avoid any look-ahead bias. Yet estimating the first-order autocorrelation coefficient and the long run trend becomes a more acute problem in the cross-section of individual stocks, as we have a panel with a large cross-sectional dimension but a small time-series dimension. To overcome this difficulty, we pool the individual firms into groups, estimate the parameters for each group, and then assign the group values to the individual firms. To be consistent with our industry portfolio results, we assume that all firms within the same industry share the same AR(1) coefficient and long run trend. At the end of this section, we also report results that are based mainly on estimates of individual firms' AR(1) coefficients and long run trends, supplemented by their group counterparts only when individual estimates based on a short history become highly unreliable. This alternative produces very similar results. Our out-of-sample period runs from 1962:07 to 2013:12.
At the beginning of July 1962, we pool the firms according to their industry SIC codes at that time. We then estimate a pooled OLS regression on this panel to generate a first estimate of β, using the log book-to-market values between 1959 and 1962. We take each firm's past book-to-market ratios and compute their value-weighted average as the long run trend θ. Again, for every year afterwards, we expand our estimation window to generate a new set of estimates. Based on these estimates, we calculate the prospective book-to-market ratio and assign it to each firm within that industry [13].

Table 13 about here.

Table 13 reports the estimates and standard deviations of β̂ and θ for each of the 48 industries during the sample period. There are large variations across industries. For example, the Smoke industry has an AR(1) coefficient of 0.966, the largest estimate among all industries. The Health industry, on the other hand, has the smallest estimate, at only 0.730. The Smoke industry generates a very stable estimate with a standard deviation of only 0.018, while the Other industry has the most volatile estimate, with a very large standard deviation of 0.379. Across all firms, the average firm's β is estimated to be 0.820, which is lower than the average of the market or industry estimates. Turning to the estimate of the long run trend, we find that the Smoke industry again has the largest estimate of 0.605, while the Drugs industry has the smallest estimate of 1.193. The Health industry's estimate has the largest standard deviation of 0.420, while the smallest standard deviation, only 0.092, comes from the Paper industry. On average, the long run trend of US firms is 0.325.

3.3.2 Prospective factor

We examine whether the return predictability of the prospective book-to-market ratio translates into profitable portfolio performance. Our key variable is still π = β[θ_t − θ]/[1 − β]. The raw book-to-market ratio is known to produce the value anomaly. In addition, ? show that two alternative ways of constructing book-to-market portfolios, using the June-end market value and using the current month market value in calculating the book-to-market ratio, can produce significant alpha above and beyond that of the raw book-to-market ratio portfolio.
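As an illustration of the estimation steps in Section 3.3.1, the following minimal sketch computes a pooled AR(1) coefficient for the firms in one industry, a value-weighted long run trend, and the resulting prospective book-to-market ratio. The function names, the common intercept in the pooled regression, and the input layout (one list of log book-to-market values per firm) are assumptions made for this example; the formula in `prospective_bm` reads the key variable as π = β(θ_t − θ)/(1 − β), with θ_t the current log book-to-market ratio and θ the long run trend.

```python
def pooled_ar1_beta(firm_series):
    """Pooled OLS slope of bm_t on bm_{t-1} with a common intercept.

    firm_series is a list of per-firm lists of log book-to-market values,
    ordered in time; all firms are assumed to belong to one industry.
    """
    x, y = [], []
    for series in firm_series:
        x.extend(series[:-1])  # lagged values
        y.extend(series[1:])   # current values
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def value_weighted_mean(values, weights):
    """Value-weighted average, used here as the long run trend theta."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def prospective_bm(bm_now, beta, theta):
    """pi = beta * (bm_now - theta) / (1 - beta); requires beta < 1."""
    return beta * (bm_now - theta) / (1.0 - beta)
```

Because the scaling term β/(1 − β) grows quickly as β approaches one, industries with highly persistent book-to-market ratios receive much larger values of π in absolute terms for the same deviation from trend.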
Our aim in this section is then to contrast our proposed prospective book-to-market ratio portfolio against the standard book-to-market ratio along with these two alternatives.

[13] We have also tried a firm fixed-effect regression, a two-way fixed-effect regression, and a random-effect regression (GLS), and obtain similar results in the next section. We have also tried equally weighted past book-to-market ratios within each industry as the long run trend and obtain similar results in the next section.

Following ? and ?, we construct the prospective factor using six value-weighted portfolios formed on size and the prospective book-to-market ratio. At the end of June of year t, stocks are assigned to two size-sorted portfolios with the median NYSE market equity as the breakpoint. We value-weight these portfolios and rebalance every June. The prospective factor's high-minus-low return is the average return on the two portfolios with the highest prospective book-to-market ratios minus the average return on the two portfolios with the lowest prospective book-to-market ratios.

Table 14 about here.

Panel A of Table 14 reports the basic portfolio characteristics of this prospective book-to-market ratio factor. We present the mean, standard deviation, maximum, minimum, and Sharpe ratio. During our sample period, the prospective book-to-market factor has a mean return of 36.3 basis points per month, with an annualized Sharpe ratio of 0.559. Panel B of Table 14 reports the time-series regression results of the prospective book-to-market factor on various factor asset pricing models. We consider the following risk factors: the q-factors MKT, ME, IA, and ROE as in ?; the Fama-French three factors MKT, SMB, and HML as in ?; the three factors augmented with the momentum factor (UMD); and the Fama-French five factors MKT, SMB, HML, RMW, and CMA as in ?. The t-statistics are based on Newey-West standard errors with 6 lags and are reported in parentheses. All the factors start from 1963:07, except the q-factors, which are available from 1967:01. We find that the αs for all the regressions are significantly different from zero. Among all the models, the q-factors leave the largest alpha, 20.8 basis points per month (t=2.09), while the FF five factors leave the smallest alpha, 13.4 basis points per month (t=1.88).
Generally, we find that the prospective factor loads weakly and negatively on the market portfolio and also loads on the size factor. Among the q-factors, it loads strongly and positively on investment and strongly on profitability; among the five factors, it loads weakly and negatively on RMW and positively on CMA.
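The factor construction described above can be sketched for a single holding period as follows. This is a hedged illustration rather than the exact procedure: the text specifies the median NYSE size breakpoint and value weighting, but not the breakpoints for the prospective book-to-market sort, so the 30th and 70th NYSE percentiles used here are an assumption borrowed from the usual 2×3 construction. The function names and the dictionary keys `me`, `pi`, `ret`, and `nyse` are likewise hypothetical.

```python
from statistics import median


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def prospective_factor_return(stocks):
    """High-minus-low return from a 2x3 sort on size and pi.

    stocks: list of dicts with keys 'me' (market equity), 'pi', 'ret'
    (holding-period return) and 'nyse' (bool). Breakpoints use NYSE stocks
    only; the 30/70 pi breakpoints are an assumption, not from the text.
    Raises ZeroDivisionError if any of the four corner portfolios is empty.
    """
    nyse = [s for s in stocks if s["nyse"]]
    size_bp = median(s["me"] for s in nyse)
    pis = sorted(s["pi"] for s in nyse)
    lo_bp, hi_bp = percentile(pis, 0.3), percentile(pis, 0.7)

    def vw_return(members):
        total = sum(s["me"] for s in members)
        return sum(s["me"] * s["ret"] for s in members) / total

    legs = {}
    for size_name, in_size in (("S", lambda s: s["me"] <= size_bp),
                               ("B", lambda s: s["me"] > size_bp)):
        for pi_name, in_pi in (("L", lambda s: s["pi"] <= lo_bp),
                               ("H", lambda s: s["pi"] > hi_bp)):
            members = [s for s in stocks if in_size(s) and in_pi(s)]
            legs[size_name + pi_name] = vw_return(members)
    # Average of the two high-pi portfolios minus average of the two low-pi ones.
    return 0.5 * (legs["SH"] + legs["BH"]) - 0.5 * (legs["SL"] + legs["BL"])
```

Averaging over the small and big portfolios within each leg keeps the high-minus-low return roughly neutral with respect to size, which is the motivation for the two-way sort.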

In Table 15, we run additional time-series regressions and test whether our proposed prospective HML factor adds value in the presence of the competing versions of HML: the standard version using the ME at Dec 31 of year t, which we denote as HML A,L following ?, and the annually updated version using the ME at June 30 of year t+1 (HML A,C) following ?, along with other conventional factors. These two versions are constructed similarly in that they both use the annual observation of ME in calculating the book-to-market ratio. Since HML does not belong to the q-factors, we only examine the FF 3-factor model, the 3-factor model augmented with UMD, the FF 5-factor model, and the 5-factor model augmented with UMD. Each time, we replace the standard HML factor with our prospective factor. In addition, we specifically examine whether the two competing versions can be spanned by the prospective HML factor.

Table 15 about here.

Panel A presents results with the standard HML factor as the dependent variable, where π HML is our prospective factor. All the models show that the standard HML is spanned by π HML with an insignificant alpha, ranging from 0.177 to 0.115 (t-stat from 1.63 to 1.19). It is worth noting that almost all of the alphas are negative, except when UMD is added to the 3-factor model. In the simple regression, HML A,L is about 1.105 times π HML. When we add more factors, it seems that MKT and SMB do not capture the return of HML A,C beyond π HML. Consistent with ?, HML A,L is an inefficient way to load momentum into a portfolio. In particular, in the presence of π HML, HML A,C is negatively correlated with UMD and positively correlated with CMA.

Panel B presents results with the HML A,C factor as the dependent variable. Again, all the models show that HML A,C is spanned by π HML with an insignificant alpha, ranging from 0.115 to 0.093 (t-stat from 1.12 to 0.97). This time, almost all of the alphas are positive, except in the case of the 5-factor model.
Compared with the results in Panel A, this is consistent with HML A,L being a somewhat inferior portfolio to HML A,C. In the simple regression, HML A,C is about 0.987 times π HML. When we add more factors, again MKT and SMB do not capture the return of HML A,C. Further, UMD does not load significantly.
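A spanning test of this kind regresses one factor on another and asks whether the intercept is distinguishable from zero. The following is a minimal sketch for the simple-regression case (one factor on a constant and π HML), with a Newey-West (Bartlett kernel) standard error for the alpha. The function name `spanning_alpha` is hypothetical, no small-sample correction is applied, and the 6-lag default simply mirrors the lag choice stated for Table 14; whether Table 15 uses the same choice is an assumption.

```python
def spanning_alpha(y, x, lags=6):
    """OLS of y on a constant and x.

    Returns (alpha, slope, Newey-West t-statistic of alpha).
    """
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    alpha = y_bar - slope * x_bar
    resid = [yi - alpha - slope * xi for xi, yi in zip(x, y)]

    # Inverse of X'X for the regressors (1, x).
    sx, sx2 = sum(x), sum(xi * xi for xi in x)
    det = n * sx2 - sx * sx
    inv = [[sx2 / det, -sx / det], [-sx / det, n / det]]

    # Moment vectors h_t = e_t * (1, x_t).
    h = [(e, e * xi) for e, xi in zip(resid, x)]

    def gamma(j):
        g = [[0.0, 0.0], [0.0, 0.0]]
        for t in range(j, n):
            for a in range(2):
                for b in range(2):
                    g[a][b] += h[t][a] * h[t - j][b]
        return g

    # Long-run covariance of the moments with Bartlett weights.
    s = gamma(0)
    for j in range(1, lags + 1):
        w = 1.0 - j / (lags + 1.0)
        g = gamma(j)
        for a in range(2):
            for b in range(2):
                s[a][b] += w * (g[a][b] + g[b][a])

    # Var(alpha) is the (0, 0) element of inv * S * inv.
    tmp = [sum(inv[0][k] * s[k][b] for k in range(2)) for b in range(2)]
    var_alpha = sum(tmp[b] * inv[b][0] for b in range(2))
    return alpha, slope, alpha / var_alpha ** 0.5
```

An absolute t-statistic on alpha below conventional critical values is what the text describes as the dependent factor being spanned; the multi-factor regressions in Table 15 extend the same idea to several regressors.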

Finally, we consider the debate on whether HML is a redundant factor in the presence of other factors. ? find that HML can be completely described by the other four factors, resulting in an insignificant alpha, and thus conclude that HML is a redundant factor in measuring abnormal returns. ? conduct similar regressions and find that the HML A,L alpha still does not survive the time-series regressions even after incorporating the UMD factor. They further show that the version based on the monthly updated book-to-market ratio, HML M,C, does generate significant alpha due to its heavy negative loading on momentum. Therefore, after showing that prospective HML is not spanned by HML A,L or HML A,C but rather spans these two factors, in the presence of other factors, our next goal is to examine whether prospective HML can be useful in asset pricing models.

Table ?? about here.

In Panel A of Table ??, we focus on the q-factors, while the dependent variables are four different versions of HML. The first model replicates the result from Table 15 that π HML has a significant alpha of 0.211. The next three models show that the q-factors do explain the standard HML, consistent with ?, and the HML A,C. Finally, consistent with ?, HML A,C also has a significant alpha of 0.395. In Panel B, we move on to the FF 5-factors without HML. The results show that all versions of HML are highly positively related to CMA and do not produce significant alpha. A closer look indicates that our π HML does produce the largest alpha, 0.120, with the largest t-value among all. Finally, following ?, we add the momentum factor to the above model. The results show that π HML, along with HML M,C, generates a significant alpha of 0.183 (t=1.99). It is interesting to note that π HML is the least loaded on CMA, with a regression coefficient of 0.736 (t=7.84), compared in particular with 1.061 for HML A,C, and the least loaded on UMD, with a regression coefficient of 0.086 (t= 2.38), compared in particular with 0.524 for HML M,C.
Overall, these regression results suggest that our prospective HML, like the monthly updated version of HML, is not redundant. However, the latter version is refreshed every month (and thus loads heavily on UMD), while our prospective HML still relies on the Dec 31 year t value of ME and is refreshed only annually.

4 Conclusion

We model the transitory component of the book-to-market ratio as the sum of the present values of three demeaned terms: stock return, return on equity, and book-to-market. We propose an empirical proxy for the last term, the prospective book-to-market ratio, as a novel stock return predictor. This new variable is more volatile than the book-to-market ratio and requires estimates of the long run trend and persistence. Our data include the market, industry portfolios, and the cross-section of individual firms. We find that the prospective book-to-market ratio can significantly predict the market return with adjusted R-squared between 4.1% and 6.1% out-of-sample, can generate a cross-industry risk-adjusted monthly return spread of 1.6%, and that a cross-sectional high-minus-low strategy based on it generates significant monthly alpha ranging from 13.4 to 20.8 basis points across various workhorse factor models. As the prospective book-to-market factor spans, instead of being spanned by, the standard HML factor, it is also useful as an alternative value factor in factor models.

References

Asness, C. S., R. B. Porter, and R. L. Stevens (2000). Predicting stock returns using industry-relative firm characteristics. Available at SSRN: http://ssrn.com/abstract=213872.

Beveridge, S. and C. R. Nelson (1981). A new approach to decomposition of economic time series into permanent and transitory components with particular attention to measurement of the business cycle. Journal of Monetary Economics 7 (2), 151-174.

Campbell, J. Y. (1991). A variance decomposition for stock returns. Economic Journal 101 (405), 157-179.

Campbell, J. Y. and R. J. Shiller (1988). Stock prices, earnings, and expected dividends. The Journal of Finance 43 (3), 661-676.

Campbell, J. Y. and R. J. Shiller (1991). Yield spreads and interest rate movements: A bird's eye view. The Review of Economic Studies 58 (3), 495-514.

Campbell, J. Y. and S. B. Thompson (2008). Predicting excess stock returns out of sample: Can anything beat the historical average? The Review of Financial Studies 21 (4), 1509-1531.

Cochrane, J. H. (1992). Explaining the variance of price-dividend ratios. The Review of Financial Studies 5 (2), 243-280.

Cochrane, J. H. (2008). The dog that did not bark: A defense of return predictability. The Review of Financial Studies 21 (4), 1533-1575.

Cochrane, J. H. (2011). Presidential address: Discount rates. The Journal of Finance 66 (4), 1047-1108.

Cohen, R. B., P. A. Gompers, and T. Vuolteenaho (2002). Who underreacts to cash-flow news? Evidence from trading between individuals and institutions. Journal of Financial Economics 66 (2-3), 409-462.

Daniel, K. and S. Titman (2006). Market reactions to tangible and intangible information. The Journal of Finance 61 (4), 1605-1643.

Engel, C. (2011). The real exchange rate, real interest rates, and the risk premium. University of Wisconsin, working paper.