Salience Theory and Stock Prices: Empirical Evidence

Mathijs Cosemans and Rik Frehen

Abstract

We present empirical evidence on the asset pricing implications of salience theory. In our model, investors overweight salient past returns when forming expectations about future stock returns. Due to these salience distortions, investors are attracted to stocks with salient upsides. The excess demand for these stocks leads to overvaluation and low subsequent returns. Conversely, stocks with salient downsides become undervalued and earn high future returns. We find strong empirical support for these predictions in the cross-section of U.S. stocks. Consistent with a behavioral interpretation of our results, we find that the predictive power of salience is stronger among stocks with greater limits to arbitrage and during high-sentiment periods. We show that the salience effect is robust to controlling for risk factors, proxies of investor attention, and measures of lottery demand.

Keywords: salience theory, probability weighting, asset pricing, return predictability

JEL classification: D03, G11, G12, G14

For helpful comments and suggestions we thank Dion Bongaerts, Pedro Bordalo, Sebastian Ebert, Nicola Gennaioli, Yigitcan Karabulut, Mathijs van Dijk, Oliver Spalt, Marta Szymanowska, Wolf Wagner, and seminar and conference participants at Erasmus University, Tilburg University, and the 2016 Research on Behavioral Finance Conference.

Rotterdam School of Management, Erasmus University, Burgemeester Oudlaan 50, 3062 PA Rotterdam, Netherlands, E-mail: mcosemans@rsm.nl.

Tilburg University, Warandelaan 2, 5000 LE Tilburg, Netherlands, E-mail: r.g.p.frehen@uvt.nl.


1 Introduction

Traditional asset pricing theory assumes that investors are fully rational and use all available information when choosing between risky assets. However, a large body of research suggests that investors have limited attention and processing power (see, e.g., Kahneman (1973)).¹ Bordalo, Gennaioli, and Shleifer (2012), henceforth BGS (2012), argue that due to these cognitive limitations, the attention of decision makers is drawn to the most unusual attributes of the options they face. As a consequence, these salient attributes receive disproportionate weighting in their decisions, while less salient attributes are neglected. BGS (2012) propose a novel theory of choice under risk that formalizes such salient thinking and demonstrate that salience can account for fundamental puzzles in decision theory, such as the Allais paradox and the existence of preference reversals.

In this paper, we present empirical evidence on the asset pricing implications of salience theory. Specifically, we test the predictions of the salience-based asset pricing model of Bordalo, Gennaioli, and Shleifer (2013a) for the cross-section of stock returns. In the model, the demand for risky assets is influenced by the salience of asset payoffs in different states of the world. Salience is defined in the psychology literature as the phenomenon that when one's attention is differentially directed to one portion of the environment rather than to others, the information contained in that portion will receive disproportionate weighting in subsequent judgments (Taylor and Thompson (1982)). A key premise of the salience model is that choices are made in context, which means that investors evaluate each risky asset by comparing its payoffs to those of the available alternatives. This context dependence is motivated by a large body of experimental evidence showing that preferences depend on the context in which choices are presented.² The salient payoffs of a stock are therefore those that stand out relative to the payoffs of the other stocks in the market. Because investors focus their attention on a stock's salient payoffs and ignore its non-salient payoffs, they are attracted to stocks with salient upsides. The excess demand for these stocks causes them to become overvalued and to earn low future returns. Conversely, stocks with salient downsides are unattractive to investors, causing them to become undervalued and to earn high future returns.

¹ Hirshleifer (2015) provides a recent overview of this literature.
² See Camerer (1995) for a comprehensive survey of this literature.

Any application of salience theory requires a specification of the set of states of the world that can occur and their objective probabilities. Following Barberis, Mukherjee, and Wang (2016), we assume that, when making a trading decision, investors mentally represent each stock by the distribution of its past returns, which they view as a proxy for the stock's future return distribution. Investors thus infer the set of possible future return states from the set of past return states. Because past returns are realized, their objective probabilities are known. Investors who engage in salient thinking then form a context-dependent representation of each stock by replacing these objective state probabilities with decision weights that depend on the salience of the past return in each state. Specifically, we suggest that investors form expectations about future stock returns by extrapolating salience-weighted daily returns over the past month.

Motivated by our theoretical framework, we define the salience theory (ST) value of a stock as the distortion in return expectations caused by salient thinking. ST is positive when the return expectation of salient thinkers exceeds the return forecast computed using objective probabilities, which happens when the highest past returns on a stock are salient. In this case, investors focus on the upside potential of the stock and neglect its downside risk, thereby effectively acting as risk seekers and accepting a negative risk premium. Conversely, when the lowest past returns on the stock are salient, ST is negative and investor attention is focused on downside risk. Investors then exhibit risk-averse behavior and demand a positive risk premium for holding the stock.

Because salience distortions stem from investors' cognitive limitations, salient thinkers are assumed to engage in narrow framing: when evaluating a stock, they do not think about its contribution to the return distribution of their portfolio.
This form of narrow framing implies that salience is determined solely by past stock returns and does not depend on investor-specific characteristics such as their portfolio or wealth. Consequently, demand for salient stocks will be correlated across investors and can exert temporary pressure on stock prices, as long as there are limits to arbitrage that prevent rational investors from correcting mispricing. We therefore expect the predictive power of the salience theory variable for future returns to be stronger among stocks for which arbitrage is more costly. We further predict the impact of salience on future returns to be more pronounced among stocks with larger ownership by individual investors, who are typically assumed to be less sophisticated than professional investors and therefore more prone to salient thinking.

Our empirical results provide strong support for the predictions of the salience model. First, we show that stocks with salient upsides earn lower future returns than stocks with salient downsides. A univariate portfolio analysis shows that the return difference between stocks in the highest and lowest ST deciles is statistically significant and economically large. Specifically, the average excess return for the zero-cost strategy that buys high-ST stocks and shorts low-ST stocks ranges from -1.91% per month for the equal-weighted portfolio to -0.80% per month for the value-weighted portfolio. These return differences are unexplained by standard market, size, value, momentum, and liquidity factors, with five-factor alphas ranging from -2.04% (EW) to -1.01% (VW) per month. We also construct double-sorted portfolios and perform firm-level Fama-MacBeth regressions to ensure that the salience effect we identify is not just a repackaging of existing return anomalies. We find that our salience measure retains significant explanatory power for returns after controlling for an extensive list of firm characteristics that are known to explain cross-sectional variation in returns. Additional tests confirm that the relation between ST and future returns is also robust to alternative specifications of the salience measure, different portfolio weighting schemes, other definitions of the state space, and alternative estimation methods. The results also hold for different subperiods and across various subsamples that exclude penny stocks, NASDAQ stocks, and illiquid stocks.

Second, we find a stronger cross-sectional relation between salience and future returns among stocks with low institutional ownership, consistent with the hypothesis that salient thinking is more prevalent among individual investors.
Furthermore, we show that the predictive power of salience for future returns is stronger among stocks with greater limits to arbitrage. Additional analyses reveal that the strength of the salience effect varies not only across stocks but also over time. In particular, we find that the impact of salience is larger during high-sentiment periods when unsophisticated investors are more likely to participate in the market. Collectively, these findings lend support to a behavioral interpretation of the relation between salience and future returns.

Third, we find support for the prediction that a stock becomes mispriced because investors' perception of its future return distribution is distorted by the returns on other stocks in the market. Specifically, we show that the ability of ST to explain cross-sectional differences in future stock returns weakens when the salience of a stock's past returns is defined in isolation, rather than in the context of all other stocks available for choice. Changes in the context used to evaluate stocks affect the predictive power of ST because they lead to changes in salience and decision weights, and consequently, to changes in investors' return expectations and trading decisions. These results highlight the importance of the context-dependent probability weighting function in salience theory.

We explore three alternative explanations for the negative relation between ST and future returns. First, we consider the possibility that our salience measure proxies for lottery demand, because extreme stock returns are more likely to be salient. Several theoretical models predict that investors are attracted to lottery-like assets, because they overweight the small probability of a large gain that these assets offer (Barberis and Huang (2008)) or because they have a preference for skewness (Mitton and Vorkink (2007)). In contrast, in the salience model, extreme stock returns are overweighted not because they have small probabilities but because they are salient relative to the market return. Moreover, the pricing implications of salience are derived without assuming that investors have lottery preferences. Consistent with these theoretical differences, we find that the return-forecasting power of salience is not subsumed by measures of lottery demand proposed in the literature, such as a stock's idiosyncratic skewness and maximum daily return.

The attention-induced price pressure hypothesis of Barber and Odean (2008) provides another potential explanation for our findings. This hypothesis asserts that individual investors are more likely to buy attention-grabbing stocks, because they face a search problem when choosing stocks to buy. An increase in attention is therefore expected to result in temporary positive price pressure.
In contrast, in salience theory, attention is drawn to salient states, rather than to salient stocks. Salience affects prices by distorting decision weights and return expectations, and not by narrowing down the set of stocks that investors consider for purchase. To distinguish between the two theories, we exploit their opposite predictions for stocks with salient downsides. According to the investor attention theory, such stocks should become overpriced because both positive and negative attention-grabbing events lead to net buying by individual investors. Salience theory predicts that these stocks will be underpriced because investors focus on their downside risk. Empirically, we find that stocks with salient downsides earn higher future returns, which lends support to the salience theory interpretation of our results. We also explicitly control for several attention proxies using bivariate sorts and Fama-MacBeth regressions and find that the negative relation between ST and future returns remains economically and statistically significant.

Finally, one may be concerned that our salience measure simply picks up the short-term return reversal effect documented by Jegadeesh (1990) and Lehmann (1990). An extreme positive (negative) daily stock return drives up (down) monthly returns and will be salient if the market return on that day is moderate. Note, however, that our salience measure ST is defined as the difference between salience-weighted and equal-weighted daily returns. Hence, instead of capturing reversal, ST measures the incremental effect of salience on stock prices, conditional on investors using past returns to forecast future returns. Our empirical results confirm that the salience effect is robust to controlling for reversal in the bivariate portfolio sorts and in the Fama-MacBeth regressions. As a further robustness check, we introduce an additional one-month lag between the construction of ST and the measurement of subsequent returns. Again, the evidence indicates that the negative relation between salience and future returns is distinct from the short-term reversal effect.

Our work adds to the growing literature that studies the asset pricing implications of behavioral choice theories. Most of this work focuses on the prospect theory of Kahneman and Tversky (1979). At the aggregate level, Benartzi and Thaler (1995) and Barberis, Huang, and Santos (2001) show that prospect theory can account for the equity premium puzzle. In the cross-section, a large number of papers provide empirical support for the prediction of Barberis and Huang (2008) that lottery-type assets earn lower returns.³ In the framework of Barberis and Huang (2008), investors care about future gains and losses at the portfolio level.
In contrast, Barberis, Mukherjee, and Wang (2016) assume that investors derive utility from past stock-level gains and losses and overvalue stocks whose historical return distributions are appealing under prospect theory. We contribute to this literature by providing empirical evidence on the pricing implications of a novel theory of choice under risk in which preferences are driven by the psychologically motivated mechanism of salience.

³ Examples include Boyer, Mitton, and Vorkink (2010), Bali, Cakici, and Whitelaw (2011), Conrad, Dittmar, and Ghysels (2013), Boyer and Vorkink (2014), Conrad, Kapadia, and Xing (2014), and Eraker and Ready (2015).

Our paper also adds to the large literature that examines the consequences of limited attention for asset prices. Studies have shown that investors underreact to news when they are distracted (e.g., DellaVigna and Pollet (2009), Hirshleifer, Lim, and Teoh (2009)). Other work shows that limited attention can generate return predictability when investors neglect specific types of information (e.g., Peng and Xiong (2006), Cohen and Frazzini (2008), Da, Gurun, and Warachka (2014)). Prior work has also studied the impact of attention-grabbing events on stock prices and trading behavior. Da, Engelberg, and Gao (2011) find support for the price pressure hypothesis of Barber and Odean (2008). Hartzmark (2015) argues that investors are more likely to sell the best- and worst-ranked stocks in their portfolio because extreme positions are more likely to enter their consideration set. Our work complements these papers by studying the impact of salience on the actual choice between the stocks in the consideration set in the final stage of the decision process.

Finally, our paper contributes to the rapidly expanding literature on the impact of salience on individual decision making. A series of recent papers have shown that salience theory can account for evidence on decision making in a wide range of fields, including consumer choice (Bordalo, Gennaioli, and Shleifer (2013b)), judicial decisions (Bordalo, Gennaioli, and Shleifer (2015)), tax effects (Chetty, Looney, and Kroft (2009)), education choice (Choi, Lou, and Mukherjee (2016)), and corporate policy choices (Dessaint and Matray (2016)). To the best of our knowledge, our paper is the first to provide empirical evidence on the asset pricing implications of salience.

The paper proceeds as follows. Section 2 summarizes salience theory and discusses its implications for stock returns. Section 3 describes the data and Section 4 presents our empirical evidence on the cross-sectional relation between salience and future returns. Section 5 explores the role of the choice context and state space in the salience model.
Section 6 considers alternative explanations for our findings and provides results for additional robustness checks. Section 7 concludes.

2 Salience Theory and Stock Prices

In this section, we discuss the conceptual framework that relates salience theory to stock prices. In Section 2.1, we review the salience model and highlight the differences with prospect theory. Section 2.2 discusses the salience function that we adopt and explains how salience distorts decision weights. In Section 2.3, we summarize the salience-based asset pricing model of BGS (2013a) and formalize its empirical implications. Section 2.4 describes the construction of our salience measure.

2.1 Salience Theory

The first key premise of salience theory (ST) is that decision makers direct their attention to the most salient payoffs of the lotteries available for choice. Because of this distorted attention allocation, decision makers overweight the states of the world in which these salient payoffs occur when evaluating each lottery.⁴ The second central idea is that choices are made in context, i.e., that the decision maker compares the payoffs of each lottery to those of the available alternatives. Salient payoffs are therefore defined as those that differ most from the payoffs of the other lotteries in the choice set. This definition is motivated by the observation that differences are more accessible to a decision maker than absolute values (Kahneman (2003)). The salience model of BGS (2012) combines the ideas of endogenous attention allocation and context-dependent choice by specifying a context-dependent weighting function to transform objective probabilities into decision weights.

An important implication of the weighting function in salience theory is that payoffs in the tails of the distribution are only overweighted if they are salient. In sharp contrast, in the cumulative prospect theory (CPT) of Tversky and Kahneman (1992), decision weights are distorted by a fixed weighting function, which implies that tail events are always overweighted. In other words, whereas in prospect theory only the rank of payoffs affects the distortion of decision weights, in salience theory the magnitude of payoffs and the choice context also matter. BGS (2012) demonstrate that by adopting a context-dependent weighting function, salience theory can account for several violations of expected utility theory, such as preferences that are unstable and dependent on the context in which choices are presented. In contrast to prospect theory, salience can explain most of these anomalies without requiring a value function that is concave for gains and convex for losses. In particular, the decision maker exhibits risk-seeking behavior when the upsides (i.e., the highest payoffs) of a lottery are salient and is risk averse when the downsides are salient.

⁴ This idea is in line with recent evidence in neuroeconomics. For example, Fehr and Rangel (2011) show that agents evaluate goods by aggregating information about multiple attributes, with decision weights driven by attention.

To illustrate the differences between probability weighting in ST and CPT, consider a simple example. Assume that an agent has to choose between two correlated lotteries L1 and L2:

Probability    0.10     0.30    0.60
Payoff L1      $2000    $0      $1000
Payoff L2      $2000    $300    $850

In both lotteries, the highest payoff of $2000 occurs in the low-probability state. In CPT, the low probability associated with this high payoff is overweighted, because the decision maker evaluates each lottery in isolation. In ST, the low-probability state is non-salient because both lotteries yield the same payoff. As a result, instead of being overweighted, the state cancels out in the salient thinker's evaluation of the two lotteries and therefore does not affect choice.

Recent experimental evidence provided by Mormann and Frydman (2016) confirms that salience can causally affect risk preferences by distorting decision weights. After performing a structural estimation of the salience model and the prospect theory model, they find that the context-dependent weighting function of salience theory can explain much of the observed variation in risk taking, while cumulative prospect theory requires a large degree of concavity in the value function to fit the data. Our goal in this paper is not to run a horse race between salience and prospect theory, but rather to provide empirical evidence on the consequences of salient thinking for the cross-section of returns.

2.2 Salience-Based Probability Weighting

To measure the salience of the payoff \(x_{is}\) of lottery i in state s, BGS (2012) propose a salience function that maps payoffs into salience values:

\[
\sigma(x_{is}, \bar{x}_s) = \frac{|x_{is} - \bar{x}_s|}{|x_{is}| + |\bar{x}_s| + \theta}, \tag{1}
\]

where \(\theta > 0\) and \(\bar{x}_s = \sum_{i=1}^{N} x_{is} / N\), with N denoting the number of lotteries. The salience function in Equation (1) satisfies four conditions: (i) ordering; (ii) diminishing sensitivity; (iii) reflection; and (iv) convexity. The ordering property implies that the salience of state s for lottery i increases in the distance between its payoff and the average payoff in state s of all lotteries in the choice set.
Diminishing sensitivity implies that salience decreases as absolute payoff levels rise uniformly for all lotteries. Put differently, a difference in payoffs is perceived less intensely when it occurs at higher payoff levels. According to reflection, salience only depends on the magnitude of payoffs, rather than their sign. In other words, reflecting gains into losses does not change the salience of a state because perception is sensitive to differences in absolute values. Finally, convexity means that diminishing sensitivity gets weaker as absolute payoff levels increase.⁵ A smaller value of the parameter \(\theta\) in Equation (1) increases the convexity of the salience function. More importantly, \(\theta\) controls the salience of states in which a lottery has a zero payoff. If \(\theta\) were excluded, zero-payoff states would have maximal salience, regardless of the average payoff level.

Given the salience function in (1), the salient thinker ranks the states for each lottery. Based on this ranking, objective probabilities are replaced by decision weights, which are defined by:

\[
\tilde{\pi}_{is} = \pi_s \, \omega_{is}, \tag{2}
\]

where \(\omega_{is}\) is the salience weight:

\[
\omega_{is} = \frac{\delta^{k_{is}}}{\sum_{s'} \delta^{k_{is'}} \pi_{s'}}, \qquad \delta \in (0, 1], \tag{3}
\]

where \(k_{is}\) is the salience ranking of state s for lottery i, which ranges from 1 (most salient) to S (least salient). S denotes the set of states of the world, where each state \(s \in S\) occurs with objective probability \(\pi_s\), such that \(\sum_{s=1}^{S} \pi_s = 1\). The decision weights are normalized so that they sum to 1, i.e., the expected distortion is zero (\(E[\omega_{is}] = 1\)). In contrast to the objective probabilities, the decision weights are lottery-specific because they depend on the salience of lottery-specific payoffs.

The parameter \(\delta\) in Equation (3) captures the degree to which salience distorts decision weights and proxies for the decision maker's cognitive ability. When \(\delta = 1\), there are no salience distortions and decision weights are equal to objective probabilities (\(\omega_{is} = 1\)). This case corresponds to the rational decision maker. When \(\delta < 1\), the decision maker is a salient thinker who overweights salient states (\(\omega_{is} > 1\)) and underweights non-salient states (\(\omega_{is} < 1\)).
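As an illustration of Equations (1) to (3), the following minimal sketch computes salience values, salience weights, and decision weights for the two lotteries L1 and L2 from Section 2.1. The function names are hypothetical, the parameter values θ = 0.1 and δ = 0.7 are borrowed from the BGS (2012) calibration mentioned in Section 2.4, and the handling of tied salience values (ties keep their original order) is an assumption made only for this example.

```python
def salience(x, x_bar, theta=0.1):
    """Salience of payoff x against the state average x_bar, as in Equation (1)."""
    return abs(x - x_bar) / (abs(x) + abs(x_bar) + theta)


def salience_weights(sal, probs, delta=0.7):
    """Salience weights of Equation (3); rank 1 is the most salient state."""
    # Sort state indices by decreasing salience. Ties keep their original
    # order, which is an arbitrary choice made only for this illustration.
    order = sorted(range(len(sal)), key=lambda s: sal[s], reverse=True)
    rank = [0] * len(sal)
    for k, s in enumerate(order, start=1):
        rank[s] = k
    # Normalization ensures that the expected weight equals one.
    norm = sum(delta ** rank[s] * probs[s] for s in range(len(sal)))
    return [delta ** rank[s] / norm for s in range(len(sal))]


probs = [0.10, 0.30, 0.60]
lotteries = {"L1": [2000.0, 0.0, 1000.0], "L2": [2000.0, 300.0, 850.0]}

# State-by-state average payoff across the lotteries in the choice set.
x_bar = [sum(p[s] for p in lotteries.values()) / len(lotteries)
         for s in range(len(probs))]

results = {}
for name, payoffs in lotteries.items():
    sal = [salience(x, m) for x, m in zip(payoffs, x_bar)]
    omega = salience_weights(sal, probs)
    decision = [p * w for p, w in zip(probs, omega)]  # Equation (2)
    results[name] = (sal, omega, decision)
    print(name, [round(v, 3) for v in sal], [round(w, 3) for w in omega])
```

Under these assumptions, the $2000 state has zero salience for both lotteries because the payoffs coincide, so it receives the lowest rank and a salience weight below one, while the middle state, in which the payoffs differ most, is overweighted.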
When \(\delta \to 0\), the salient thinker considers only the most salient payoff of a lottery and neglects all other payoffs.

⁵ Formally, assume there are two states \(s\) and \(s'\) and two lotteries i and j. Let \(x^{\min}_s\) and \(x^{\max}_s\) denote the lowest and highest payoff in state s. Ordering implies that if the interval \([x^{\min}_s, x^{\max}_s]\) is a subset of \([x^{\min}_{s'}, x^{\max}_{s'}]\), then \(\sigma(x_{is}, x_{js}) < \sigma(x_{is'}, x_{js'})\). Diminishing sensitivity implies that if \(x_{is}, x_{js} > 0\), then for any \(\epsilon > 0\), \(\sigma(x_{is} + \epsilon, x_{js} + \epsilon) < \sigma(x_{is}, x_{js})\). Reflection implies that if \(x_{is}, x_{js}, x_{is'}, x_{js'} > 0\), then \(\sigma(x_{is}, x_{js}) < \sigma(x_{is'}, x_{js'}) \iff \sigma(-x_{is}, -x_{js}) < \sigma(-x_{is'}, -x_{js'})\). Convexity implies that if \(x_{is}, x_{js} > 0\), then for any \(\epsilon, z > 0\), the difference \(\sigma(x_{is} + z, x_{js} + z) - \sigma(x_{is} + z + \epsilon, x_{js} + z + \epsilon)\) decreases with z.

2.3 Salience-Based Asset Pricing Model

In this section, we summarize the salience-based asset pricing model proposed by Bordalo, Gennaioli, and Shleifer (2013a) to assess how salience impacts trading decisions and stock prices. BGS (2013a) start from a two-period consumption-based model with a measure one of identical investors. Each investor has linear utility over current (t = 0) and future (t = 1) values of consumption.⁶ There is no time discounting, i.e., the subjective discount factor equals 1.⁷ The initial endowment of the investor consists of her wealth \(w_0\) and a holding of one unit of each of the N available stocks. Stock i has a current price \(p_i\) and yields a payoff \(x_{is}\) in state \(s \in S\) at t = 1. At t = 0, the investor trades an amount \(\alpha_i\) of each asset i to maximize her expected utility:

\[
\max_{\{\alpha_i\}} \; u(c_0) + E[\omega_{is} \, u(c_{1,s})] \tag{4}
\]
\[
\text{s.t.} \quad c_0 = w_0 - \sum_{i=1}^{N} \alpha_i p_i, \qquad c_{1,s} = \sum_{i=1}^{N} (\alpha_i + 1) x_{is}.
\]

Equation (4) differs from the standard portfolio choice problem by weighting the asset payoff \(x_{is}\) in each state by its salience weight \(\omega_{is}\). The first-order condition for a solution to this problem is:

\[
p_i \, u'(c_0) = E[\omega_{is} \, x_{is} \, u'(c_{1,s})] = \sum_{s=1}^{S} \pi_s \, \omega_{is} \, x_{is} \, u'(c_{1,s}), \qquad \forall i \in N. \tag{5}
\]

Note that aside from distorting the objective state probabilities \(\pi_s\) by the salience weights \(\omega_{is}\), the investor's valuation of payoffs is standard. The salience weighting of \(x_{is}\) reflects that the attention of a salient thinker is drawn to salient payoffs, while non-salient payoffs are neglected. Compared to an expected utility maximizer who evaluates payoffs using objective probabilities, a salient thinker wants to buy a larger (smaller) amount of asset i when its highest (lowest) payoffs are salient.

⁶ Linear utility is assumed to illustrate how the mechanism of payoff salience and context dependence can generate shifts in risk attitudes, without relying on an S-shaped value function. The implications of salience for stock prices can also be derived in a mean-variance framework with risk-averse investors, analogous to the approach taken by Barberis, Mukherjee, and Wang (2016) to study the impact of prospect theory on future returns. In this framework, traditional mean-variance investors hold the tangency portfolio, whereas salient thinkers adjust the tangency portfolio by tilting their holdings towards stocks with salient upsides. The main prediction derived from this alternative model coincides with the key prediction of the consumption-based model of BGS (2013a), namely that stocks with salient upsides earn lower subsequent returns, while stocks with salient downsides earn higher future returns.
⁷ This assumption is innocuous as it only affects the risk-free rate, which plays no role in our analysis.

The pricing implications of this salience-driven demand for stocks can be derived by combining the optimal trading decisions of all investors with the market clearing condition \(\alpha_i = 0\) for all i. In equilibrium all investors hold the market portfolio and asset prices are given by:⁸

\[
p_i = E[\omega_{is} \, x_{is}] = E[x_{is}] + \operatorname{cov}[\omega_{is}, x_{is}], \qquad \forall i \in N. \tag{6}
\]

Equation (6) has an intuitive interpretation. The first term on the right-hand side shows that in the absence of salience distortions, the stock price is simply the expected value of its future payoff, where the expectation is calculated using the objective (undistorted) state probabilities. The second term captures the impact of salient thinking on stock prices. When the highest payoffs of the stock are salient, i.e., when \(\operatorname{cov}[\omega_{is}, x_{is}] > 0\), the investor overvalues the stock because her attention is drawn to the stock's upside potential. In contrast, when the lowest payoffs are salient (i.e., \(\operatorname{cov}[\omega_{is}, x_{is}] < 0\)), the investor focuses on the downside risk of the stock and is only willing to hold the stock when its price is below the rational price \(E[x_{is}]\).

To obtain the implications of salience for expected returns, we divide both sides of (6) by \(p_i\). After some rearrangements, we obtain the following equation for expected returns:

\[
E[r_{is}] = -\operatorname{cov}[\omega_{is}, r_{is}] \equiv -ST_i, \qquad \forall i \in N, \tag{7}
\]

where \(ST_i\) stands for stock i's salience theory value. Equation (7) captures the main empirical prediction of the salience-based model that we test in Section 4. Specifically, stocks with salient upsides (positive ST) have lower expected returns, while stocks with salient downsides (negative ST) earn higher future returns.
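The rearrangement that leads from Equation (6) to Equation (7) can be sketched as follows. The sketch assumes that the return in state s is the net return \(r_{is} = x_{is}/p_i - 1\), a definition that is not stated explicitly in the text, and it uses the normalization \(E[\omega_{is}] = 1\) from Section 2.2.

```latex
% Assumption for this sketch: the net return in state s is r_{is} = x_{is}/p_i - 1.
\begin{align*}
p_i &= E[\omega_{is} x_{is}]
  && \text{Equation (6)} \\
1 &= E\!\left[\omega_{is}\,\frac{x_{is}}{p_i}\right]
   = E[\omega_{is}(1 + r_{is})]
  && \text{divide by } p_i \\
1 &= E[\omega_{is}] + E[\omega_{is} r_{is}]
   = 1 + E[r_{is}] + \operatorname{cov}[\omega_{is}, r_{is}]
  && \text{since } E[\omega_{is}] = 1 \\
E[r_{is}] &= -\operatorname{cov}[\omega_{is}, r_{is}] = -ST_i .
\end{align*}
```

The third line uses \(E[\omega_{is} r_{is}] = \operatorname{cov}[\omega_{is}, r_{is}] + E[\omega_{is}] E[r_{is}]\), which is why a positive covariance between salience weights and returns maps into a negative expected return.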
Note that when investors are rational (\(\delta = 1\)), there are no salience distortions and all states are equally salient, \(\omega_{is} = 1\) for all s. In this case, \(\operatorname{cov}[\omega_{is}, r_{is}] = 0\) and the expected return in (7) is zero, since investors are risk-neutral and do not discount the future.⁹

⁸ To see this, recall that \(E[\omega_{is}] = 1\) and note that for a linear utility function \(u'(c_0)/u'(c_1) = 1\).
⁹ The aim of the BGS (2013a) model is to formalize the impact of salience on stock prices. Of course, in reality there are many other determinants of prices and returns, which may be correlated with ST. In our empirical analysis we therefore control for an extensive list of risk factors and firm characteristics known to explain variation in returns.

2.4 Construction of Salience Measure

To test the prediction that a stock's salience theory value negatively predicts future returns, we need to specify the set of states of the world that can occur and the corresponding probabilities. In an experimental setting where subjects are asked to choose between lotteries, the feasible payoff combinations and their probabilities are given. In an empirical application, however, the definition of the state space is less clear. Following Barberis, Mukherjee, and Wang (2016), we assume that when choosing between stocks, the most important attributes for investors are the stock's past returns, which they view as proxies for future returns.¹⁰ In other words, each stock is characterized in investors' minds by the distribution of its past returns and investors infer the set of possible future states from the states that occurred in the past. Specifically, in our analysis, the state space is formed by the daily returns in month t. Because each of these past returns was realized, its objective probability is known and equal to the inverse of the number of trading days in month t.

We compute ST over a one-month window for two reasons. First, in our empirical analysis, we predict one-month-ahead stock returns.¹¹ Because a one-month window of past returns matches the one-month forecasting horizon, the number of past states is (almost) identical to the number of future states. Second, because the selective attention that distorts decision weights stems from cognitive limitations, salient thinkers may only recall the most recent returns.¹² Consistent with a shorter memory span, Greenwood and Shleifer (2014) find that expectations of individual investors are more sensitive to the most recent past returns than expectations of professional investors.

¹⁰ Barberis, Greenwood, Jin, and Shleifer (2015) develop a consumption-based model in which some investors form expectations about future aggregate stock market returns by extrapolating past returns, while other investors hold rational beliefs. A key implication of the model is that the stock market becomes overvalued after positive cash flow news, because extrapolators cause the initial price jump to be amplified. Subsequently, prices reverse, leading to lower future returns. Extending this model to an economy with multiple risky assets in which investors extrapolate salience-weighted past returns is an important direction for future work but beyond the scope of this paper.
¹¹ Strictly speaking, given the daily state space, \(E[r_{is}]\) in Equation (7) is the expected daily return in the next period. We predict monthly, rather than daily, returns to facilitate comparison of our results to those in the literature predicting monthly returns. We find similar results when predicting the average daily return in the next month.
¹² Bordalo, Gennaioli, and Shleifer (2015) combine the salience model with a model of limited recall.

The salience of state $s$ for stock $i$ depends on the difference between the return on the stock ($r_{is}$) and the average return across all stocks ($\bar{r}_s$) on that day, i.e., Equation (1) becomes:

$$\sigma(r_{is}, \bar{r}_s) = \frac{|r_{is} - \bar{r}_s|}{|r_{is}| + |\bar{r}_s| + \theta}. \tag{8}$$

To illustrate the calculation of salience values, consider the following example. Suppose that on day $s$, the return on stock $i$ is 10% and the market return is 5%. On day $s'$, the stock return is 5% and the market return is 0%. Although the difference between stock and market returns is the same on both days, the stock's return is more salient to the investor on day $s'$ because of diminishing sensitivity, captured by the denominator in Equation (8). Intuitively, the stock's outperformance of 5% stands out more on a day when the market is flat than on a day when the market goes up.

Equation (8) implies that salience is determined solely by an individual stock's return relative to the market return and does not depend on investor-specific characteristics such as the return on their overall portfolio.¹³ This form of narrow framing implies that if a daily stock return is salient to one salient thinker, it will also be salient to all other salient thinkers. Consequently, demand for salient stocks will be correlated across investors and can exert temporary pressure on stock prices, as long as there are limits to arbitrage that prevent rational investors from correcting mispricing.

¹³ The assumption that investors engage in stock-level narrow framing is common in the literature that studies the impact of mental accounting on trading decisions and asset prices (e.g., Barberis and Huang (2001), Barberis, Huang, and Thaler (2006), Ingersoll and Jin (2013), and Barberis, Mukherjee, and Wang (2016)). Notable exceptions are Barberis and Huang (2008) and Hartzmark (2015), who consider framing of gains and losses at the portfolio level.

For each stock, we rank the daily returns in each month in descending order on their salience. Subsequently, we calculate the corresponding salience weights $\omega_{is}$ according to Equation (3). We then obtain ST by computing the covariance between daily returns and salience weights in each month.

To compute salience values and salience weights, we need to specify values for the parameters θ and δ. In our implementation, we use the values employed by BGS (2012) to match experimental evidence on long-shot lotteries, namely θ = 0.1 and δ = 0.7. Note that the salience model has fewer parameters than the cumulative prospect theory of Tversky and Kahneman (1992), which requires the specification of two value function parameters and two weighting function parameters. This greater parsimony limits the degrees of freedom, thereby reducing the risk of overfitting.
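The construction of ST described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and Equation (3) is not restated in this section, so the weight formula below, $\omega_{is} = \delta^{k_{is}} / \sum_{s'} \pi_{s'} \delta^{k_{is'}}$ with $k_{is}$ the salience rank, is an assumption chosen to be consistent with $E[\omega_{is}] = 1$. Ties in salience are broken by calendar order, which is also an assumption.

```python
def salience(r, r_bar, theta=0.1):
    """Salience of stock return r against market return r_bar, as in Equation (8)."""
    return abs(r - r_bar) / (abs(r) + abs(r_bar) + theta)


def salience_weights(stock, market, theta=0.1, delta=0.7):
    """Salience weights for one stock-month of daily returns.

    ASSUMPTION: Equation (3) is taken to be
    omega_s = delta**k_s / sum_j(pi_j * delta**k_j),
    where k_s is the salience rank (1 = most salient) and pi_j = 1/S.
    """
    n_days = len(stock)
    sal = [salience(r, m, theta) for r, m in zip(stock, market)]
    # Rank days in descending order of salience (stable sort: ties keep calendar order).
    order = sorted(range(n_days), key=lambda s: -sal[s])
    rank = [0] * n_days
    for k, s in enumerate(order, start=1):
        rank[s] = k
    raw = [delta ** k for k in rank]
    norm = sum(raw) / n_days  # probability-weighted sum with pi_j = 1/S
    return [x / norm for x in raw]


def st_value(stock, market, theta=0.1, delta=0.7):
    """ST as the covariance between salience weights and daily returns (pi_s = 1/S)."""
    w = salience_weights(stock, market, theta, delta)
    n_days = len(stock)
    mean_r = sum(stock) / n_days
    mean_w = sum(w) / n_days  # equals 1 by construction
    return sum((wi - mean_w) * (ri - mean_r) for wi, ri in zip(w, stock)) / n_days


# The two days from the example in the text, with theta = 0.1:
print(round(salience(0.10, 0.05), 3))  # 0.2
print(round(salience(0.05, 0.00), 3))  # 0.333
```

With θ = 0.1, the flat-market day has the higher salience value even though the 5% outperformance is identical on both days, which is the diminishing-sensitivity effect described in the text.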

Our salience measure ST has an intuitive interpretation. To see this, write ST as:

$$ST_{i,t} \equiv \mathrm{cov}[\omega_{is,t}, r_{is,t}] = \sum_{s}^{S} \pi_{s,t}\, \omega_{is,t}\, r_{is,t} - \sum_{s}^{S} \pi_{s,t}\, r_{is,t} = E^{ST}[r_{is,t}] - \bar{r}_{is,t}, \tag{9}$$

where the second equality follows from $E[\omega_{is}] = 1$ and where the last equality follows from the fact that $\pi_s = 1/S$, with $S$ equal to the number of trading days in month t. The subscript t is used to emphasize that the ST measure for month t is computed using the returns on each day $s$ in that month. Equation (9) shows that ST is equal to the difference between salience-weighted past returns and equal-weighted past returns. ST therefore measures the distortion in return expectations caused by salient thinking.¹⁴ Investors overestimate (underestimate) the future return on high (low) ST stocks because they overweight the most salient past returns. For instance, when the highest past returns of a stock are salient, investors excessively focus on the upside potential of the stock and push up its price above the fundamental value, leading to lower returns in the future.

¹⁴ The rational benchmark here is the expected return computed using undistorted, objective probabilities. Note that we do not claim that the use of past returns to forecast future returns is rational. In fact, given the low serial correlation in returns, predicting future returns based on past returns may not be optimal. However, what matters is that in practice, individual investors do extrapolate past returns (e.g., Greenwood and Shleifer (2014)). Conditional on investors using past returns, we examine the incremental effect of salience distortions on stock prices.

3 Data

Our data come from CRSP and Compustat and consist of the daily and monthly return, the book and market value of equity, the price, and the trading volume for all firms listed on the NYSE, AMEX, and NASDAQ. The sample covers the period from January 1926 to December 2015. We include a stock in the analysis for month t if it satisfies the following criteria: First, there should be a minimum of 15 daily return observations within the given month to compute ST. Second, historical data should be available to compute each of the firm characteristics that we use as control variables.

We measure firm size (ME) by the log of the market value of equity and book-to-market (BM) as the ratio of the book and market value of equity. Following Fama and French (1992), we calculate book-to-market using accounting data from Compustat as of December of the previous year and exclude firms with negative book equity. Because Compustat does not have book common equity (BE) data for the first part of our sample period, we obtain BE data from Kenneth French's data library for the period 1926-1953.¹⁵ Momentum (MOM) is measured as the cumulative return over the 11 months prior to the current month. Amihud (2002) illiquidity (ILLIQ) is computed as the absolute daily return divided by the daily dollar trading volume, averaged over all trading days within the month. Market beta (BETA) is estimated from a regression of daily excess stock returns on the daily excess market return over a one-month window. Idiosyncratic volatility (IVOL) is defined as the standard deviation of the residuals from this regression. Short-term reversal (REV) is defined as the stock return in the previous month t − 1. MAX (MIN) is the maximum (minimum) daily return on a stock within each month, as in Bali, Cakici, and Whitelaw (2011). The prospect theory (TK) value of a stock is constructed using a five-year window of monthly returns following the approach of Barberis, Mukherjee, and Wang (2016).

Skewness (SKEW) is the skewness of daily stock returns. Coskewness (COSKEW) is defined as the coskewness of daily stock returns with daily market returns, computed using the approach of Harvey and Siddique (2000). Idiosyncratic skewness (ISKEW) is defined as the skewness of the residuals from a Fama and French (1993) three-factor model regression, as in Boyer, Mitton, and Vorkink (2010). Following Bali, Cakici, and Whitelaw (2011), we compute total skewness, coskewness, and idiosyncratic skewness using daily returns over a one-year period in order to have sufficient observations to adequately capture skewness. Downside beta (DBETA) is estimated from a regression of daily excess stock returns on the daily excess market return over a one-year window, using only days on which the market return was below the average daily market return during that year, as in Ang, Chen, and Xing (2006). We winsorize all variables cross-sectionally at the 1st and 99th percentiles.
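Several of the daily-data control variables defined above (ILLIQ, BETA, IVOL, MAX, MIN) and the cross-sectional winsorization can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the authors' implementation. The function names are hypothetical. Skipping zero-volume days, the n − 2 degrees-of-freedom correction for IVOL, and the "inclusive" percentile interpolation are choices made here that the text does not specify.

```python
import statistics


def amihud_illiq(returns, dollar_volume):
    """Average of |daily return| / daily dollar volume within one month.

    ASSUMPTION: days with zero dollar volume are skipped.
    """
    ratios = [abs(r) / v for r, v in zip(returns, dollar_volume) if v > 0]
    return sum(ratios) / len(ratios)


def beta_and_ivol(excess_stock, excess_market):
    """OLS slope of excess stock returns on excess market returns (BETA)
    and the standard deviation of the regression residuals (IVOL).

    ASSUMPTION: residual variance uses n - 2 degrees of freedom.
    """
    n = len(excess_stock)
    mx = sum(excess_market) / n
    my = sum(excess_stock) / n
    sxx = sum((x - mx) ** 2 for x in excess_market)
    sxy = sum((x - mx) * (y - my) for x, y in zip(excess_market, excess_stock))
    beta = sxy / sxx
    alpha = my - beta * mx
    resid = [y - alpha - beta * x for x, y in zip(excess_market, excess_stock)]
    ivol = (sum(e * e for e in resid) / (n - 2)) ** 0.5
    return beta, ivol


def max_min(returns):
    """MAX and MIN: largest and smallest daily return within the month."""
    return max(returns), min(returns)


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip a cross-section of values at the given percentiles.

    ASSUMPTION: percentiles use the 'inclusive' interpolation of
    statistics.quantiles.
    """
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, lo), hi) for v in values]
```

In a full pipeline these functions would be applied per stock-month (the first three) and per month across all stocks (the last one).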
After performing our main unconditional analysis, we condition our study of the relation between a stock's salience value and its future return on various limits to arbitrage and on investor sentiment. In addition to firm size, illiquidity, and idiosyncratic volatility, we consider two other proxies for limits to arbitrage: institutional ownership and analyst coverage. Institutional ownership (IO) is defined as the fraction of shares outstanding held by institutional investors, which is available from the Thomson Reuters Institutional Holdings (13F) database from 1980 onwards. We lag IO by one quarter to avoid any look-ahead bias when predicting future returns. Analyst coverage (NOA) is measured as the natural log of one plus the number of analysts that have issued an earnings forecast for the stock, which is available from the Institutional Brokers Estimate System (I/B/E/S) data set from 1976 onwards. Finally, we measure investor sentiment using the monthly sentiment index of Baker and Wurgler (2006), which starts in July 1965.

¹⁵ http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

4 Cross-Sectional Relation Between Salience and Stock Returns

This section presents empirical evidence on the cross-sectional relation between salience and returns. In Section 4.1 we perform a univariate analysis by sorting stocks into portfolios based on their ST value. Next, we control for a variety of firm characteristics by forming double-sorted portfolios in Section 4.2 and by running firm-level Fama-MacBeth regressions in Section 4.3. Finally, in Sections 4.4 and 4.5 we report results for conditional analyses that examine the impact of limits to arbitrage and investor sentiment on the strength of the relation between salience and future stock returns.

4.1 Univariate Portfolio Sorts

The main prediction from the salience model in Section 2 is that stocks with a high (low) salience theory value will earn lower (higher) subsequent returns. In this section, we perform our first test of this prediction by performing univariate portfolio sorts. At the end of each month, we sort all stocks in our sample into ten portfolios based on their salience theory value ST. We then calculate the equal-weighted and value-weighted return of each ST portfolio over the next month.

Table 1 reports for each decile the time series average of the one-month-ahead excess portfolio return, the four-factor alpha obtained from the Carhart (1997) model, and the five-factor alpha obtained from the Carhart (1997) model extended with a liquidity factor. The liquidity factor is constructed as the monthly innovation in the value-weighted average of the Amihud (2002) illiquidity measure across all stocks in the CRSP universe.¹⁶ The last row shows the average excess return and alpha for the zero-cost strategy that buys high-ST stocks (decile 10) and shorts low-ST stocks (decile 1).

¹⁶ Our results are robust to using the Pastor and Stambaugh (2003) liquidity factor. In the paper we employ the Amihud (2002) liquidity factor because the Pastor and Stambaugh (2003) factor is only available from 1968 onwards.
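The univariate sort described above can be sketched for a single formation month as follows. This is a hedged illustration, not the authors' procedure. The function names and the dictionary-based inputs are hypothetical, and the use of equal-count breakpoints over all stocks in the sample is an assumption, since the text does not state how decile breakpoints are set.

```python
def decile_portfolios(st_values, next_returns, weights=None, n_groups=10):
    """Sort stocks on ST and return the next-month return of each portfolio.

    st_values, next_returns, weights: dicts keyed by a stock identifier.
    weights=None gives equal-weighted returns; passing market values
    gives value-weighted returns.
    ASSUMPTION: breakpoints split the sample into equal-count groups,
    and there are at least n_groups stocks.
    """
    ranked = sorted(st_values, key=st_values.get)  # ascending ST
    n = len(ranked)
    portfolio_returns = []
    for g in range(n_groups):
        members = ranked[g * n // n_groups:(g + 1) * n // n_groups]
        if weights is None:
            w = {i: 1.0 for i in members}
        else:
            w = {i: weights[i] for i in members}
        total = sum(w.values())
        ret = sum(w[i] * next_returns[i] for i in members) / total
        portfolio_returns.append(ret)
    return portfolio_returns


def high_minus_low(portfolio_returns):
    """Return of the zero-cost strategy: long decile 10, short decile 1."""
    return portfolio_returns[-1] - portfolio_returns[0]
```

Repeating this for every month yields a time series of high-minus-low returns, whose average and factor-model alphas correspond to the kind of quantities reported in the last row of Table 1.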

The results in Table 1 provide strong support for our theoretical prediction that stocks with salient upsides earn significantly lower future returns than stocks whose downsides are salient. The first column shows that the equal-weighted (EW) portfolio returns decline nearly monotonically with the salience value of the stocks in the portfolios. The differences in the performance of high- and low-ST stocks are not only statistically significant but are also large in economic terms. Specifically, the average excess return on the EW high-low ST portfolio is -1.91% per month, with a Newey and West (1987) t-statistic of -13.13. This return difference is unexplained by standard market, size, value, momentum, and liquidity factors, with four- and five-factor alphas equal to -2.07% and -2.04%, respectively, and corresponding t-statistics of -14.37 and -14.41.

The right-hand panel of Table 1 shows that the return difference between the highest and lowest ST deciles is also significant for the value-weighted (VW) portfolios. As expected, the results are less pronounced than for the EW portfolios because large stocks tend to have lower retail ownership and smaller limits to arbitrage. Nevertheless, the effect of salience on VW portfolio returns remains sizeable, with a return spread of -0.80% per month (t-stat = -5.24). Again, we find no evidence that this return difference is driven by differences in factor exposures. The four- and five-factor alphas of the VW high-low ST portfolio are close to -1% per month and significant at the 1% level.¹⁷

¹⁷ As a robustness check, we have also computed return-weighted portfolios that are constructed by weighting each individual stock return by its gross return in the previous month. Asparouhova, Bessembinder, and Kalcheva (2013) demonstrate that this procedure is effective in correcting a potential bias in equal-weighted returns that can arise from noise in security prices. The unreported results show that the return-weighted average raw return and alphas on the high-low ST portfolio are very similar to their equal-weighted counterparts in Table 1.

To shed more light on the composition of the ST-sorted portfolios, we compute the cross-sectional average of various characteristics for the stocks in each decile. Table 2 reports the time series mean of the characteristics across all months in the sample for the EW (panel A) and VW (panel B) portfolios. Panel A shows that the portfolio sort generates a large spread in ST, ranging from -3.26 for the lowest-ST decile to 6.26 for the decile of stocks with highest ST. In the other columns we relate this variation in ST to firm characteristics. We observe an inverse U-shaped relation between ST and firm size, with stocks in the extreme ST deciles being smaller on average. Small stocks are more likely to have salient returns because they tend to be more volatile. Table 2 confirms that the stocks in deciles 1 and 10 have higher idiosyncratic volatility. High- and low-ST stocks also tend to be more illiquid and have a higher market beta. Unsurprisingly, ST is positively associated with the contemporaneous monthly stock return (REV). An extreme positive (negative) daily stock return drives up (down) monthly returns and will be salient if the market return on that day is moderate.¹⁸ We find an inverse U-shaped relation between ST and MAX and no clear relation between ST and the prospect theory variable TK. Finally, both total and idiosyncratic skewness increase with ST, because positively skewed stocks are more likely to have salient upsides. We observe similar, albeit somewhat less pronounced, patterns for the VW portfolios in panel B.

¹⁸ As shown in Section 2.4, ST can be interpreted as the difference between salience-weighted (SW) and equal-weighted returns. A high daily stock return that is salient pushes up the EW return but has an even larger impact on the SW return because it receives more weight. Hence, the high daily return leads to both a higher monthly return (REV) and a larger difference between salience-weighted and equal-weighted returns, i.e., a higher ST. In contrast, if the high stock return is non-salient because the market return on that day is also high, ST will be much smaller. In the limit, in the absence of salience distortions in return expectations, ST is zero and does not predict reversal.

Overall, the univariate analysis provides preliminary evidence of a strong negative relation between a stock's ST value and its return in the next month, consistent with the predictions of the salience-based asset pricing model in Section 2. The return difference between high- and low-ST stocks is economically large and statistically significant, regardless of the portfolio weighting scheme that is used. Furthermore, the return spread is not explained by common risk factors. However, a potential concern is that ST is related to a number of firm characteristics that have been shown to explain variation in returns. In the following two subsections we therefore examine whether the negative relation between ST and future returns is robust to controlling for these characteristics.

4.2 Bivariate Portfolio Sorts

In this section we study the relation between ST and future returns using bivariate portfolio sorts to account for firm characteristics. We construct double-sorted portfolios as follows. First, at the end of each month, we sort stocks into deciles based on one of the control variables. Next, within each decile, stocks are further sorted into deciles based on ST so that a total of 100 portfolios is created. For each of these portfolios, we record the realized return over the next month. Finally, we average the returns of the salience deciles across the different deciles of the control variable.

Table 3 presents the results of this sorting exercise. For each of the ST-sorted deciles, we report the EW (panel A) and VW (panel B) monthly excess return. The last set of rows shows the