Investor Clienteles and Asset Pricing Anomalies *

David Lesmond
Mihail Velikov

November 6, 2015

PRELIMINARY DRAFT: DO NOT CITE OR CIRCULATE

Abstract

This paper shows that the profitability of anomaly trading strategies is strongly concentrated among stocks that have the highest concentration of 500- and 1000-share trade size clusters. Value-weighted decile long-short anomaly strategies executed in the high 500- or high 1000-share trade size frequency terciles generate average gross returns, net returns, and alphas that are over two times higher than the equivalent for strategies executed in the low 500- or low 1000-share trade size frequency terciles. Conversely, the concentration of 100-share trades does not help improve the performance of anomaly trading strategies. We argue that this finding is consistent with the existence of trade clienteles where informed traders congregate in the 500- or 1000-share trade clusters, compared to the relatively uninformed traders who congregate in the 100-share trade size clusters.

JEL classification: G11, G12, E52.

Keywords: Asset Pricing, Anomalies, Trade Clusters, Market Microstructure, Return Predictability

*The views expressed in this paper are those of the authors and do not necessarily reflect the position of the Federal Reserve Bank of Richmond or the Federal Reserve System. We thank James Weston for discussions and comments. All mistakes are ours alone.

David Lesmond: Freeman School of Business, Tulane University, M136 Goldring/Woldenberg Hall II, New Orleans, LA 70118. Email: dlesmond@tulane.edu.

Mihail Velikov: Federal Reserve Bank of Richmond, 502 S Sharp St, Baltimore, MD 21201. Email: mihail.velikov@rich.frb.org.

1 Introduction

Academic researchers have documented dozens of cross-sectional "anomalies" and have used them as examples of violations of both the weak and semi-strong forms of market efficiency. As a result, a host of papers have attempted either to identify the source of the price inefficiency through behavioral arguments, such as Lee and Swaminathan (2000) and Hvidkjaer (2008), or to explain the persistence of anomalies through trading costs, such as Lesmond et al. (2004) and, more recently, Novy-Marx and Velikov (2014). But little research has been devoted to analyzing whether there are identifiable clienteles that actively trade anomalies and whether these clienteles, if identified, systematically trade in the same universe of stocks. Identifying trade clienteles that actively invest in anomalies can aid in better understanding the trading mechanisms through which anomalies occur.

In this study, we categorize investor clienteles as those trades that are clustered at 100-, 500-, or 1000-share sizes, and we assess whether there is a distinctive trading pattern specific to each trade size cluster across the 23 anomaly trading strategies from Novy-Marx and Velikov (2014). We show that the profitability of anomaly trading strategies is strongly concentrated among stocks that have the highest concentration of 500- and 1000-share trades. Value-weighted decile long-short anomaly strategies executed in the high 500- or high 1000-share trade size frequency terciles generate average returns that are more than twice as high as those of the equivalent strategies executed in the low 500- or low 1000-share trade size frequency terciles. The opposite holds for conditional double sorts on the 100-share trade size frequencies and the anomaly signals.
Across the low- and medium-turnover anomalies, portfolios dominated by 500- or 1000-share trade clusters earn a robust and significant alpha of more than 1% per month, more than double that of portfolios dominated by 100-share trade clusters, regardless of the anomaly. We argue that this finding is consistent with the existence of trade clienteles where informed traders congregate in the 500- or 1000-share trade clusters, compared to the relatively uninformed traders who congregate in the 100-share trade size clusters.

Trade size has been examined by Barclay and Warner (1993), who argue that trade sizes from 500 shares to 9,900 shares per trade contain more value-relevant information than do smaller trade sizes. Hasbrouck (1995) and Chakravarty (2001) document the presence of stealth trading by institutional investors and show that medium-sized trades (defined as trades between 501 and 9,900 shares) tend to have a disproportionately greater aggregate price impact, which is attributable to informed traders disguising at least some of their trades. Keim and Madhavan (1995) provide empirical evidence that institutions often break up their orders into discrete trade sizes and fill them over time, and Chordia and Subrahmanyam (2004) model conditions in which traders find it optimal to break up their orders to minimize price impact. Battalio and Mendenhall (2005) use the 500-share trade size as a minimum delineation for more informed trades and contend that trade sizes of 100 to 400 shares correspond to the trading interests of less informed traders. Alexander and Peterson (2007) find that 500- and 1000-share trade size clusters do indeed experience higher price impact costs, consistent with informed trading. In addition, they argue that these trade size clusters may arise naturally because large orders¹ are likely broken down into smaller medium-sized trades, resulting in trade clusters at 500 and 1000 shares.

Given the potential that 500- and 1000-share trade size clusters reflect more informed trade, it is an empirical question whether they are useful in predicting future returns. We explicitly study whether trade size clusters are predictive of the returns accruing to a wide range of anomalies identified in the literature. Previous studies have typically focused on a single anomaly at a time.
¹ Since large orders are likely to involve informed institutions, given the analysis of Hasbrouck (1991) and Chakravarty (2001), the subsequent medium-sized rounded trades are more likely to be information-based.

As noted previously, Hvidkjaer (2008) and Lesmond et al. (2004) study only the momentum anomaly, without regard to the vast number of other anomalies that have been identified in the literature. The work of Novy-Marx and Velikov (2014) is more expansive, but they focus on the importance of transaction costs in evaluating the profitability of the anomaly strategies. We take a far different stance by showing that specific investor clienteles congregate in certain trade size clusters. Consistent with Alexander and Peterson (2007), we attempt to show that there is a commonality in the trading of these anomalies as seen through trade size clusters. Our argument is that these trade size clusters can delineate informed from uninformed trading and, thus, that sorting first by the frequency of trades occurring at the clusters should produce a sort in the profitability of the anomalies.

Easley and O'Hara (1987) theorize that trade size matters because it is correlated with private information about the security's true value. What is relevant for asset pricing is not the number of informed trades, but rather the fraction of trades that come from informed traders. In their model, based on an exogenous signal, large trade sizes (principally block trades) matter because they change the perception of the value of a security. Yet much of the literature is focused on small trades that presumably reflect retail or uninformed trades. Lee (1992) and Battalio and Mendenhall (2005) note the importance of small trades in predicting future returns of post-earnings announcement drift strategies. Hvidkjaer (2006) argues that large trades show no evidence of underreaction and that large trade imbalances have little impact on subsequent returns, concluding that momentum could partly be driven by the behavior of small traders. Our results suggest that trade size clusters are more important for understanding the sources of anomaly profits than a focus on the behavior of retail or uninformed traders alone.
We rely on the findings of Alexander and Peterson (2007), who note that trades cluster at 100, 500, and 1000 shares, where, in particular, the 500- and 1000-share trades are representative of informed trading. For each firm and each day, we count the trades at each of the 100-, 500-, and 1000-share sizes, divide each count by the total number of trades that day, and then average this ratio over the month. We first sort stocks into terciles based on each trade size frequency (cluster) and then into deciles based on an anomaly signal. For the low- and medium-turnover anomalies, we show clear monotonicity of the average returns of the anomaly strategies with the trade size frequency of 500- and 1000-share sizes. In other words, portfolios that experience the largest concentration of 500- or 1000-share trades earn more than double the average returns of portfolios that are avoided by these traders. The alphas accruing to the highest concentration of larger trade clusters are substantial, often exceeding 1% per month, while the alpha for the portfolio avoided by the 500- and 1000-share trades is less than half as large. Conversely, there is little differentiation between the portfolios dominated and avoided by the 100-share traders, indicating little informativeness in the 100-share trade clusters.

We note that since 2000, decimalization of stock quotes and the advent of high-frequency trading have significantly affected trading behavior, resulting in a vast increase (decrease) in the number of 100-share (500-share) trades. We use this exogenous shock as a control for the informativeness of larger trade clusters. We do not find a decrease in the performance of the hedged 500- and 1000-share portfolios. Rather, we find that the monotonicity across trade size portfolios for the 500- and 1000-share trades remains and, surprisingly, that the profitability accruing to the portfolios with the highest concentrations of 500- and 1000-share trade size clusters is only strengthened. This is particularly evident for the low-turnover anomalies, where the alphas in the pre-decimalization period are approximately 0.5% per month, while the alphas in the post-decimalization period are approximately 0.75% per month.

We show that investor clienteles exist and can be identified on the basis of trade size clusters, where the most informative trade size clusters are delineated by 500- and 1000-share trades.
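The frequency measure described above (the daily share of trades at each cluster size, averaged over the month) can be illustrated with a minimal Python sketch. The input layout, a list of `(firm, month, day, shares)` tuples, and the function name `monthly_cluster_frequencies` are hypothetical choices made for illustration, not the data format or code used in the study.

```python
from collections import defaultdict

CLUSTER_SIZES = (100, 500, 1000)

def monthly_cluster_frequencies(trades):
    """trades: iterable of (firm, month, day, shares) tuples (assumed layout).

    Returns {(firm, month): {size: frequency}}, where each frequency is the
    daily fraction of trades at exactly that size, averaged over the days
    on which the firm traded in that month.
    """
    # Count trades per firm-day: all trades, and trades at each cluster size.
    totals = defaultdict(int)
    hits = defaultdict(lambda: dict.fromkeys(CLUSTER_SIZES, 0))
    for firm, month, day, shares in trades:
        key = (firm, month, day)
        totals[key] += 1
        if shares in CLUSTER_SIZES:
            hits[key][shares] += 1

    # Collect the daily ratios by firm-month, then average them.
    daily = defaultdict(list)
    for (firm, month, day), n in totals.items():
        counts = hits[(firm, month, day)]
        daily[(firm, month)].append({s: counts[s] / n for s in CLUSTER_SIZES})
    return {
        key: {s: sum(d[s] for d in days) / len(days) for s in CLUSTER_SIZES}
        for key, days in daily.items()
    }
```

Averaging daily ratios weights every trading day equally, whereas dividing monthly counts by monthly totals would weight busy days more heavily; the sketch follows the daily-average wording of this paragraph.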
We show that anomaly strategies executed among stocks associated with the informed trader clientele perform significantly better than strategies executed in the universe of stocks dominated by less informed traders. This is the first paper to comprehensively analyze a wide variety of anomalies for a common indicator that points to informed trading. Although our study is in very preliminary stages, it has the potential to contribute to several different strands of the literature. In the next draft of this study, we plan to...

2 Trade Size and Firm Attribute Controls

The sample includes all ordinary common stocks listed on the NYSE and the American Stock Exchange (AMEX) in the period January 1983 through December 2012. Transactions data on NASDAQ stocks became available in January 1987, hence those stocks are included in the sample from that time on. Real estate investment trusts, stocks of companies incorporated outside the U.S., and closed-end funds are eliminated from the sample. Return data and unsigned share volume data are from the Center for Research in Security Prices (CRSP) files. Transactions data are obtained from the Institute for the Study of Security Markets (ISSM) and the Trade And Quote (TAQ) data sets. The ISSM data set includes all trades for stocks listed on NYSE/AMEX from 1983 to 1992 and on NASDAQ from 1987 to 1992, while TAQ covers 1993 to present for all exchanges. Trades with irregular terms are excluded, and trades are run through a simple price-based error filter to exclude likely erroneous prices. Because our focus is on trade size, we use only the trades database for both ISSM and TAQ, which removes the need to match each trade with the prevailing quote. We do use the quote database to calculate the bid-ask spread applicable to the closing price in order to estimate the costs of implementing the trade.

The trade size frequency variables are the sum of 100-share (T100), 500-share (T500), and 1000-share (T1000) trades over a month divided by the total number of trades that month, which yields monthly firm-level frequencies within each trade size category. We also analyze trades of between 100 and 500 shares, between 500 and 1000 shares, and between 1000 and 5000 shares, as well as 5000-share trades and trades of more than 5000 shares.

2.1 Trade Size Clusters over Time

Figure 1 plots the monthly cross-sectional means across firms for the trade size frequency variables over time. Following 2000, the pattern in trading changed dramatically. The share of trades executed in 100-share sizes increased from about 20% in 1999 to over 70% by the end of 2013. On the other hand, the average 500- and 1000-share frequencies dropped from over 10% in 1999 to under 3% in 2013.

This regime shift coincides with the decimalization of stock quotes. The NYSE Fact Book reports statistics showing average trade sizes falling dramatically after decimalization. The average trade size in 1999 for NYSE-listed firms was 1,205 shares per trade. By 2004, after decimalization, the average trade size had fallen to just over 390 shares per trade. In 2010, the average trade size had dwindled to 220 shares per trade, and in 2014 it was approximately 140 shares per trade. In section 3.2 we explicitly consider how this regime shift affects our results by splitting our sample into pre- and post-decimalization periods.

Similarly, figure 2 plots the monthly cross-sectional mean and the 30th and 70th percentiles across firms for the trade size frequency variables over time. The distance between the percentile bounds and the means varies over time, but the percentiles exhibit similar temporal trends.

2.2 Determinants of Trade Size Clusters

Next, we focus on the determinants of the trade size frequency variables. Table 1 reports results from monthly Fama-MacBeth cross-sectional regressions of the trade size frequency variables on firm characteristics. Specifications (1)-(3) use T100 as the dependent variable, specifications (4)-(6) use T500 as the dependent variable, and specifications (7)-(9) use T1000 as the dependent variable.

All three trade size frequency variables are persistent. The coefficients on the lagged trade size frequency variables are high and statistically significant in all specifications, even after controlling for firm characteristics related to liquidity. Moreover, the average cross-sectional $R^2$ in specification (1), for example, is 0.57, while the corresponding number for specification (2) is 0.22. In other words, firm characteristics, including size, book-to-market, Amihud's illiquidity, and transaction costs, do not have much power in explaining the cross-sectional variation in the trade frequencies. It should be noted that even though some of the t-statistics, such as the one on log(prc) in specification (2), are high and statistically significant at conventional levels, the persistent nature of the dependent variable makes them overstated, in spite of the Newey-West correction. The lack of correlation with commonly employed characteristics is important, since we argue that the variables of interest in this study proxy for different investor clienteles, which cannot be captured through conventional variables.

3 Anomalies within Trade Size Clusters

In this section we discuss the performance of anomaly trading strategies within trade size clusters. In section 3.1 we present the main results of the study, which include conditional double sorts on the three trade size frequency variables (T100, T500, and T1000) and anomaly signals. We show that the profitability of the anomaly trading strategies is strongly concentrated among the high T500 and T1000 terciles. In section 3.2, we address the issue of the dramatic regime shift in trading following the decimalization of U.S. equity markets in 2000-2001. We show that the results hold across both pre- and post-decimalization samples. Finally, in section 3.3 we show that the results also hold when we evaluate the average strategy returns net of transaction costs. Moreover, we confirm the findings of Novy-Marx and Velikov (2014) that the high-turnover anomalies do not exhibit significant profits.

All tests use the 23 anomalies from Novy-Marx and Velikov (2014). Unless otherwise noted, all strategies consist of a time series of value-weighted returns on a long/short self-financing portfolio, constructed using a decile sort on an anomaly signal. Table 2 documents the anomalies and provides brief descriptions of the signals used for sorting, the rebalancing frequencies, and the appropriate references. For further details on the construction, see Novy-Marx and Velikov (2014) or the relevant references.

Novy-Marx and Velikov (2014) establish a taxonomy of anomalies in the cross section of stock returns. They divide trading strategies into three groups, low-, mid-, and high-turnover strategies, corresponding roughly to strategies where each of the long and short sides on average turns over less than once per year, between one and five times per year, and more than five times per year, respectively. They note that the strategies exploiting the three different groups of anomalies exhibit different transaction costs. Since transaction costs can affect the different investor clienteles differently, we also look at the anomalies by the three turnover groups.

3.1 Main Results

In this section we present the main results of this study. Table 3 reports conditional double sorts on the trade size frequency variables and an anomaly signal. In each month, stocks are first sorted into terciles based on one of the three trade size frequency variables (T100, T500, or T1000). Then, within each tercile, stocks are sorted into deciles based on an anomaly signal.
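The conditional double sort for a single month can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the procedure used in the study: bins are cut into equal counts (the caption of figure 2 suggests 30th and 70th percentile breakpoints for the terciles), the signal is assumed to be signed so that the top decile is the long side, and the dictionary keys `t500`, `signal`, `mktcap`, and `ret` are hypothetical names.

```python
def split_into_groups(items, key, n_groups):
    """Sort items by key and cut them into n_groups nearly equal-sized bins."""
    ranked = sorted(items, key=key)
    n = len(ranked)
    return [ranked[i * n // n_groups:(i + 1) * n // n_groups]
            for i in range(n_groups)]

def value_weighted_return(stocks):
    """Market-cap-weighted average return of a portfolio."""
    total_cap = sum(s["mktcap"] for s in stocks)
    return sum(s["mktcap"] * s["ret"] for s in stocks) / total_cap

def double_sort_long_short(stocks, freq_key="t500", n_terciles=3, n_deciles=10):
    """Long-short decile return within each frequency tercile (low to high).

    Each tercile needs at least n_deciles stocks so that no decile is empty.
    """
    results = []
    for tercile in split_into_groups(stocks, lambda s: s[freq_key], n_terciles):
        # Second-stage sort happens inside the tercile, hence "conditional".
        deciles = split_into_groups(tercile, lambda s: s["signal"], n_deciles)
        long_leg = value_weighted_return(deciles[-1])
        short_leg = value_weighted_return(deciles[0])
        results.append(long_leg - short_leg)
    return results
```

Repeating this every month and averaging the three resulting time series would give one row of average returns of the kind described for table 3.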
For each double sort, within each trade size frequency tercile, the table reports the average return [t-stat] of the long-short decile value-weighted anomaly portfolio. Panels A, B, and C report results for the low-, mid-, and high-turnover anomalies, respectively. Here, we focus on the full time period, which is 01/1983 to 12/2013.

Across the low- and mid-turnover anomalies (Panels A and B), the strategies executed in the high T500 and T1000 terciles earn average returns that are significantly higher than the strategies executed in the low T500 and T1000 terciles. For example, a decile value-weighted ValMomProf strategy executed in the high T500 tercile generates a staggering 1.99% per month with a t-statistic of 5.02. The corresponding number for the low T500 tercile is 0.77% with a t-statistic of 2.83. This pattern holds consistently across all 17 low- and medium-turnover strategies. Across these, the average returns in the high T500 tercile are, on average, over 2.2 times as large as the average returns for the low T500 tercile. The corresponding number for the T1000 sorts is 2.3.

Moreover, across almost all the low- and medium-turnover anomalies, there seems to be a monotonically increasing pattern in the average anomaly strategy returns across the terciles. For example, the average returns (t-stats) to the Net Issuance (M) anomaly strategy are 0.58 (3.33), 0.82 (3.43), and 1.17 (3.60) for the low, mid, and high T500 terciles, respectively. Similarly, the average returns (t-stats) to the Net Issuance (M) anomaly strategy are 0.63 (3.69), 0.88 (4.28), and 1.06 (3.95) for the low, mid, and high T1000 terciles, respectively.

On the other hand, the T100 sorts do not exhibit similar patterns. In fact, they show the opposite pattern: the anomaly strategies executed in the low T100 terciles exhibit slightly higher returns than the strategies executed in the high T100 universe. This is not surprising, since T100 is mechanically negatively correlated with T500 and with T1000. If a stock's frequency of trades in 100-share sizes is high, it is likely that its frequencies of 500- and 1000-share sizes are low.
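As a quick arithmetic check on the Net Issuance (M) figures quoted above, the short Python sketch below tests the tercile returns for monotonicity and computes the high-to-low ratio. Only the six returns come from the text; the helper names are illustrative.

```python
# Average monthly returns (%) for Net Issuance (M) as quoted in the text,
# ordered low, mid, high tercile.
reported = {
    "T500": [0.58, 0.82, 1.17],
    "T1000": [0.63, 0.88, 1.06],
}

def is_monotonic_increasing(values):
    """True if every value is strictly larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))

def high_to_low_ratio(values):
    """Ratio of the high-tercile return to the low-tercile return."""
    return values[-1] / values[0]

for name, returns in reported.items():
    print(name, is_monotonic_increasing(returns),
          round(high_to_low_ratio(returns), 2))
# prints:
# T500 True 2.02
# T1000 True 1.68
```

These ratios are for a single anomaly; the multiples of 2.2 and 2.3 cited above are averages across the 17 low- and medium-turnover strategies, so an individual strategy can sit above or below them.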
In untabulated results, which will be included in the next draft of the study, we show that the predictive power of trade clustering on anomaly profits stems entirely from the 500- and 1000-share frequencies. On the other hand, clustering around 100 shares does not seem to predict anomaly profitability.

These results are consistent with the existence of investor clienteles. If, as argued by previous literature, informed traders cluster around medium trade sizes, such as 500 and 1000 shares, and they trade on and profit from these anomalies, we should expect to see precisely the results documented in table 3. Namely, the anomaly strategies perform better when executed in the universe of stocks where we expect to see more informed traders.

Figures 3-5 are a graphical depiction of the results in table 3. Figure 3 plots the average returns for the anomaly strategies across the T100 terciles, while figures 4 and 5 do the same for the T500 and T1000 terciles, respectively. In all figures, panels (a) and (b) focus on the low- and mid-turnover anomalies, respectively. It is clear from figures 4 and 5 that the low- and mid-turnover anomaly strategies earn significantly higher returns when executed in the universe of stocks on which we hypothesize that informed investors focus.

Table 4 provides an alternative test for the results obtained in table 3 and figures 3-5. It reports results from monthly Fama-MacBeth cross-sectional regressions of returns on anomaly signals, trade size frequency variables, and interaction terms. The regressions are estimated separately for each anomaly. Panels A, B, and C report results for the low-, mid-, and high-turnover anomalies, respectively. The first column presents results for regressions of the form

$$r_{tj} = \alpha + \beta x_{tj} + \varepsilon_{tj},$$

where $x_{tj}$ is an anomaly characteristic signed to predict returns positively. For example, for the Investment anomaly, $x_{tj}$ is the negative of the sum of the changes in plant, property, and equipment and inventories. All regressions include intercept terms, but their estimates are omitted for expositional purposes.
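The two-step Fama-MacBeth logic behind the first-column regressions can be sketched in Python: estimate the cross-sectional slope month by month, then test whether the time-series average of those slopes differs from zero. The sketch assumes a single regressor and a plain, unadjusted standard error; it does not reproduce whatever adjustments (for example, a Newey-West correction) are applied in the tables, and the function names and input layout are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

def ols_slope(x, y):
    """Slope from a one-regressor OLS that includes an intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx

def fama_macbeth(panel):
    """panel: {month: (signals, returns)}, one entry per stock in each list.

    Step 1: run a cross-sectional regression of returns on the signal
    in every month. Step 2: average the monthly slopes and form a
    t-statistic from their time-series standard deviation.
    """
    slopes = [ols_slope(x, y) for x, y in panel.values()]
    estimate = mean(slopes)
    std_error = stdev(slopes) / sqrt(len(slopes))  # no autocorrelation adjustment
    return estimate, estimate / std_error
```

With a panel of three months whose cross-sectional slopes are 2, 1, and 3, the function returns an average slope of 2 and a t-statistic of about 3.46.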
With the exception of Size and, to an extent, Piotroski's F-score, all anomaly characteristics significantly predict returns in the cross section. The rest of the columns, however, include our variables of interest, as well as interaction terms between the trade size frequency variables and the anomaly characteristics. They take the following form:

$$r_{tj} = \alpha + \beta_1 x_{tj} + \beta_2 TX_{tj} + \beta_3 x_{tj} TX_{tj} + \varepsilon_{tj},$$

where $x_{tj}$ is again an anomaly characteristic and $TX_{tj}$ is $T100_{tj}$ in columns (2)-(4), $T500_{tj}$ in columns (5)-(7), and $T1000_{tj}$ in columns (8)-(10). According to our hypothesis, we expect to see positive and statistically significant coefficients $\beta_3$ on the interaction terms for the T500 and T1000 specifications. In panels A and B, with the exception of the coefficient in the regressions that include the Size anomaly characteristic, all other estimated coefficients on the interaction terms for the T500 and T1000 specifications are positive, and almost all of them are significant. For example, the estimate for $\beta_3$ in panel B for the T500 specification is 8.24 with a t-statistic of 3.13. On the other hand, for the T100 specification, the interaction terms are mostly negative and far fewer are significant. This corroborates our earlier results from table 3 and figures 3-5 and effectively serves as a robustness test.

3.2 Results Pre- and Post-Decimalization

As noted in section 2.1, there has been a dramatic regime shift in trading behavior since the late 1990s, primarily due to decimalization and the increase in high-frequency trading activity. With the advancements in algorithmic trading, it has become increasingly easy for traders to break down their orders into smaller and smaller sizes, which is also evident in the time-series behavior of our main variables of interest. A natural concern, then, is that our results are primarily driven by what happened before 2000. We address this concern in this section.

Figures 6-8 repeat the analysis in figures 3-5 by breaking down the sample into two subperiods. Panels (a), (c), and (e) restrict the sample to the 1983-1999 period and report results for low-, mid-, and high-turnover anomaly strategies, respectively. Similarly, panels (b), (d), and (f) focus on the 2000-2013 period and report results for low-, mid-, and high-turnover anomaly strategies, respectively.

Figures 4 and 5 clearly demonstrate that our main results hold for both samples. Although there have been some shifts in the overall profitability of some of the strategies (such as most of the high-turnover ones, for example), the monotonic increase in the strategy profitability across the T500 and T1000 terciles is just as evident in the latter sample as in the earlier one. In fact, in figure 4 the Net Issuance (M) anomaly strategy has more pronounced differences in the latter sample. Thus, it appears that as long as there is some trading done in the larger clusters (500 or 1000 shares), we can infer something about the profitability of anomaly trading strategies.

3.3 Results Net of Transaction Costs

So far in the analysis we have ignored the fact that the patterns observed across the T500 and T1000 terciles do not hold for the high-turnover anomalies as they do for the low- and mid-turnover anomalies. In this section we show why this is the case. Novy-Marx and Velikov (2014) show that transaction costs eat up the profits from trading high-turnover anomalies, and we confirm their finding here.

Figures 9-11 repeat the analysis in figures 3-5 by documenting the average returns to the strategies net of transaction costs. The transaction cost estimation follows Novy-Marx and Velikov (2014). None of the high-turnover anomalies seems to exhibit significantly positive returns. Thus, it is no surprise that the patterns observed for low- and mid-turnover anomalies do not hold. If our hypothesis of trade clienteles is correct, then it is no surprise that informed traders do not trade these strategies, since they are not profitable.
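The subperiod comparison of section 3.2 amounts to splitting each strategy's monthly return series at the start of 2000 and comparing the averages. A minimal Python sketch follows, assuming returns are keyed by hypothetical `(year, month)` tuples; the split year comes from the 1983-1999 and 2000-2013 subperiods named in the text.

```python
from statistics import mean

def subperiod_means(monthly_returns, split_year=2000):
    """monthly_returns: {(year, month): return}.

    Returns (pre_mean, post_mean), where the pre-period covers all months
    before split_year and the post-period covers split_year onward.
    """
    pre = [r for (year, _), r in monthly_returns.items() if year < split_year]
    post = [r for (year, _), r in monthly_returns.items() if year >= split_year]
    return mean(pre), mean(post)
```

Applying the function to the long-short series of each trade size tercile gives the pre- and post-decimalization averages that can then be checked for the same low-to-high ordering.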

4 Conclusions

We show that trade size clusters are an important determinant in the pricing of anomalies. We consider the effect of trade size clusters across 23 anomalies and find that 500- and 1000-share trade size clusters delineate more informed trading than do 100-share trade size clusters. Portfolios dominated by 500 (1000) share trade size clusters outperform by 100% those that show little concentration in these trade sizes. Across the low- and medium-turnover anomalies, we see increasing performance, as evidenced in either returns or alphas, moving from low to high concentrations of 500 (1000) share trade cluster portfolios. The 100-share trade cluster portfolios experience decreasing performance moving from low to high concentrations, indicating the uninformativeness of small trade size clusters.

The large trade size cluster results are robust to splitting the sample into pre- and post-decimalization periods and to accounting for transaction costs. Large trade portfolios continue to earn significant returns in the post-decimalization period, but the profitability is confined to the low- and medium-turnover anomalies. Traders concentrating in these portfolios appear to act strategically by focusing on anomalies that require less frequent trading, adding to the literature on stealth trading.

This paper attempts to expand upon the research that explores trade size, but we are distinctive in that we combine two strands of literature that have previously been explored separately. These include the retail trade literature, which requires separate buy and sell volume and is behaviorally based, and an older literature that explores the pricing of more informed trades. We combine these two literature streams into one picture showing the importance of both categories of trade size and, in particular, of trade size clusters at 500 and 1000 shares.
The main features of this approach are the ease of implementation and the sheer significance of the results. We view these results as fundamental to better understanding aspects of asset pricing that have previously been considered only as either behavioral in nature or relevant in conjunction with an information event. We show that trade size clusters are critically important in determining the source of profits across the most prominent anomalies used in the literature.

References Alexander, G., Peterson, M. 2007. An analysis of trade-size clustering and its relation to stealth trading. Journal of Financial Economics. Amihud, Y. 2002. Illiquidity and stock returns: Cross-section and timeseries effects. Journal of Financial Markets, 5, 31 56. Ang, A., Hodrick, R. J., Xing, Y., Zhang, X. 2006. The cross-section of volatility and expected returns. Journal of Finance, 61, 259 299. Barclay, M., Warner, J. 1993. Stealth trading and volatility. Journal of Financial Economics, 34, 281 305. Battalio, R. H., Mendenhall, R. R. 2005. Earnings expectations, investor trade size, and anomalous returns around earnings announcements. Journal of Financial Economics, 77, 289 319. Brandt, M. W., Kishore, R., Santa-Clara, P., Venkatachalam, M. 2008. Earnings announcements are full of surprises. Working paper. Campbell, J. Y., Hilscher, J., Szilagyi, J. 2008. In search of distress risk. Journal of Finance, 63, 2899 2939. Chakravarty, S. 2001. Stealth-trading: Which traders trades move stock prices?. Journal of Financial Economics, 61, 289 307. Chen, L., Novy-Marx, R., Zhang, L. 2010. An alternative three-factor model. Working paper. Chordia, T., Subrahmanyam, A. 2004. Order imbalance and individual stock returns: Theory and evidence. Journal of Financial Economics, 72, 485 518. 15

Cooper, M. J., Gulen, H., Schill, M. J. 2008. Asset growth and the cross-section of stock returns. Journal of Finance, 63, 1609 1651. Da, Z., Liu, Q., Schaumurg, E. 2014. A closer look at the short-term reversal. Management Science, 60, 658 674. Easley, D., O Hara, M. 1987. Price, trade size, and information in security markets. Journal of Financial Economics, 19, 69 90. Fama, E. F., French, K. R. 1993. Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33, 3 56. Fama, E. F., French, K. R. 2008. Dissecting anomalies. Journal of Finance, 63, 1653 1678. Foster, G., Olsen, C., Shevlin, T. 1984. Earnings releases, anomalies, and the behavior of security returns. The Accounting Review, 59, 574 603. Hasbrouck, J. 1991. Measuring the Information Content of Stock Trades. The Journal of Finance, XLVI, 179 207. Hasbrouck, J. 1995. One Security, Many Markets: Determining the Contributions to Price Discovery. The Journal of Finance, 50, 1175 1199. Hasbrouck, J. 2009. Trading costs and returns for U.S. equities: estimating effective costs from daily data. Journal of Finance, 64, 1446 1477. Heston, S. L., Sadka, R. 2011. Seasonality in the cross-section of stock returns. Journal of Financial Economics, 87, 418 445. Hvidkjaer, S. 2006. A trade-based analysis of momentum. Review of Financial Studies, 19, 457 491. 16

Hvidkjaer, S. 2008. Small trades and the cross-section of stock returns. Review of Financial Studies, 21(3), 1123 1151. Jegadeesh, N., Titman, S. 1993. Returns to buying winners and selling losers: implications for stock market efficiency. Journal of Finance, 48, 65 91. Keim, D., Madhavan, A. 1995. Anatomy of the trading process empirical evidence on the behavior of institutional traders,. Journal of Financial, 37, 371 398. Lee, C. M. C., Swaminathan, B. 2000. Price Momentum and Trading Volume. Journal of Finance, 55, 2017 2069. Lee, C. M. 1992. Earnings news and small traders. Journal of Accounting and Economics, 15, 265 302. Lesmond, D. A., Schill, M. J., Zhou, C. 2004. The illusory nature of momentum profits. Journal of financial economics, 71, 349 380. Lyandres, E., Sun, L., Zhang, L. 2008. Investment-based underperformance following seasoned equity offerings. Review of Financial Studies, 21, 2825 2855. Moskowitz, T., Grinblatt, M. 1999. Do industries explain momenum? Journal of Finance, 54, 1249 1290. Novy-Marx, R. 2013. The other side of value: The gross profitability premium. Journal of Financial Economics, 108, 1 28. Novy-Marx, R. 2014. The quality dimension of value investing. Working paper. Novy-Marx, R., Velikov, M. 2014. A Taxonomy of Anomalies and their Trading Costs. NBER Working Paper No. 20721. 17

Piotroski, J. 2000. Value investing: The use of historical financial statement information to separate winners from losers. Journal of Accounting Research, 38, 1-41.
Sloan, R. G. 1996. Do stock prices fully reflect information in accruals and cash flows about future earnings? The Accounting Review, 71, 289-315.

Figure 1: Average trade size frequencies over time. For each stock, for each month, T100, T500, and T1000 are the frequencies of trades that occur in 100-, 500-, and 1000-share sizes, respectively. T100500 is the frequency of trades that occur in share sizes between 100 and 500. The figure plots the monthly cross-sectional mean across firms of the four trade size frequency variables over time.
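The frequency variables defined in the Figure 1 caption could be computed along the following lines. This is only a minimal sketch: the function name `trade_size_frequencies`, the plain list of trade sizes as input, the scaling as a fraction of the trade count (rather than, say, percent), and the strict reading of "between 100 and 500" are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter


def trade_size_frequencies(trade_sizes):
    """Share of trades at each size cluster for one stock-month.

    trade_sizes: iterable of share quantities, one entry per trade.
    Frequencies are fractions of the trade count (an assumption; the
    paper may scale them differently).
    """
    sizes = list(trade_sizes)
    n = len(sizes)
    if n == 0:
        return None  # no trades in the month, so frequencies are undefined
    counts = Counter(sizes)
    # Strictly between 100 and 500 shares (exclusive bounds are an assumption)
    between = sum(1 for s in sizes if 100 < s < 500)
    return {
        "T100": counts[100] / n,
        "T500": counts[500] / n,
        "T1000": counts[1000] / n,
        "T100500": between / n,
    }


# Made-up trades for one stock in one month
example = trade_size_frequencies([100, 100, 200, 500, 1000, 300, 100, 2500])
```

Averaging each of these values across firms within a month would give one point on the corresponding line in the figure.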

Figure 2: Trade size frequency tercile breakpoints over time. The figure plots the monthly cross-sectional mean, 30th percentile, and 70th percentile across firms of the trade size frequency variables over time. Panels (a), (b), and (c) plot the results for T100, T500, and T1000, respectively.

Figure 3: Conditional double sorts on T100 and an anomaly signal. In each month, stocks are first sorted into terciles based on T100. Then, within each tercile, stocks are sorted into deciles based on an anomaly signal. For each anomaly signal, the figure plots average returns on decile long-short value-weighted anomaly portfolios within the T100 terciles. Panels (a), (b), and (c) report low-, mid-, and high-turnover anomaly strategies, respectively. For further details on the construction of the anomaly signals, see Table 2. The sample period is 01/1983 to 12/2013.
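The conditional double sort described in this caption can be sketched for a single month as follows. The sketch assumes 30th and 70th percentile breakpoints for the terciles (as plotted in Figure 2), percentile breakpoints with linear interpolation computed over all stocks in the cross-section for both sorts, and market equity as the portfolio weight. The function names, the record layout, and the toy numbers are hypothetical, and the paper's actual breakpoint conventions may differ.

```python
from bisect import bisect_left


def percentile(sorted_xs, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    pos = (len(sorted_xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (sorted_xs[hi] - sorted_xs[lo]) * (pos - lo)


def assign_groups(values, cuts):
    """0-based group index per value; ties at a breakpoint go to the lower group."""
    if not values:
        return []
    xs = sorted(values)
    breaks = [percentile(xs, q) for q in cuts]
    return [bisect_left(breaks, v) for v in values]


def value_weighted_return(group):
    """Market-equity-weighted average return of a list of stock records."""
    total = sum(s["me"] for s in group)
    return sum(s["me"] * s["ret"] for s in group) / total


def long_short_by_tercile(stocks):
    """Conditional double sort for one month.

    stocks: list of dicts with keys 'freq' (e.g. T500), 'signal', 'me', 'ret'.
    Returns {tercile: top-decile minus bottom-decile value-weighted return},
    with terciles indexed 0 (low), 1 (middle), 2 (high).
    """
    terciles = assign_groups([s["freq"] for s in stocks], [30, 70])
    spreads = {}
    for t in range(3):
        members = [s for s, g in zip(stocks, terciles) if g == t]
        deciles = assign_groups([s["signal"] for s in members], range(10, 100, 10))
        long_leg = [s for s, d in zip(members, deciles) if d == 9]
        short_leg = [s for s, d in zip(members, deciles) if d == 0]
        if long_leg and short_leg:  # skip terciles too thin to fill both legs
            spreads[t] = value_weighted_return(long_leg) - value_weighted_return(short_leg)
    return spreads


# Toy cross-section with made-up numbers (returns in percent)
month = [
    {"freq": 0, "signal": 1.0, "me": 10.0, "ret": 1.0},
    {"freq": 1, "signal": 2.0, "me": 30.0, "ret": 2.0},
    {"freq": 2, "signal": 1.0, "me": 10.0, "ret": 0.5},
    {"freq": 3, "signal": 2.0, "me": 30.0, "ret": 2.5},
    {"freq": 4, "signal": 1.0, "me": 10.0, "ret": 0.0},
    {"freq": 5, "signal": 2.0, "me": 30.0, "ret": 3.0},
]
spreads = long_short_by_tercile(month)
```

Averaging the monthly spreads over the sample period, tercile by tercile, would yield the kind of average long-short return the figure plots.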

Figure 4: Conditional double sorts on T500 and an anomaly signal. In each month, stocks are first sorted into terciles based on T500. Then, within each tercile, stocks are sorted into deciles based on an anomaly signal. For each anomaly signal, the figure plots average returns on decile long-short value-weighted anomaly portfolios within the T500 terciles. Panels (a), (b), and (c) report low-, mid-, and high-turnover anomaly strategies, respectively. For further details on the construction of the anomaly signals, see Table 2. The sample period is 01/1983 to 12/2013.

Figure 5: Conditional double sorts on T1000 and an anomaly signal. In each month, stocks are first sorted into terciles based on T1000. Then, within each tercile, stocks are sorted into deciles based on an anomaly signal. For each anomaly signal, the figure plots average returns on decile long-short value-weighted anomaly portfolios within the T1000 terciles. Panels (a), (b), and (c) report low-, mid-, and high-turnover anomaly strategies, respectively. For further details on the construction of the anomaly signals, see Table 2. The sample period is 01/1983 to 12/2013.

Figure 6: Conditional double sorts on T100 and an anomaly signal within subsamples. In each month, stocks are first sorted into terciles based on T100. Then, within each tercile, stocks are sorted into deciles based on an anomaly signal. For each anomaly signal, the figure plots average returns on decile long-short value-weighted anomaly portfolios within the T100 terciles. Panels (a), (c), and (e) use the 1983-1999 period and report results for low-, mid-, and high-turnover anomaly strategies, respectively. Similarly, panels (b), (d), and (f) use the 2000-2013 period and report results for low-, mid-, and high-turnover anomaly strategies, respectively. For further details on the construction of the anomaly signals, see Table 2.

Figure 7: Conditional double sorts on T500 and an anomaly signal within subsamples. In each month, stocks are first sorted into terciles based on T500. Then, within each tercile, stocks are sorted into deciles based on an anomaly signal. For each anomaly signal, the figure plots average returns on decile long-short value-weighted anomaly portfolios within the T500 terciles. Panels (a), (c), and (e) use the 1983-1999 period and report results for low-, mid-, and high-turnover anomaly strategies, respectively. Similarly, panels (b), (d), and (f) use the 2000-2013 period and report results for low-, mid-, and high-turnover anomaly strategies, respectively. For further details on the construction of the anomaly signals, see Table 2.

Figure 8: Conditional double sorts on T1000 and an anomaly signal within subsamples. In each month, stocks are first sorted into terciles based on T1000. Then, within each tercile, stocks are sorted into deciles based on an anomaly signal. For each anomaly signal, the figure plots average returns on decile long-short value-weighted anomaly portfolios within the T1000 terciles. Panels (a), (c), and (e) use the 1983-1999 period and report results for low-, mid-, and high-turnover anomaly strategies, respectively. Similarly, panels (b), (d), and (f) use the 2000-2013 period and report results for low-, mid-, and high-turnover anomaly strategies, respectively. For further details on the construction of the anomaly signals, see Table 2.

Figure 9: Net returns from conditional double sorts on T100 and an anomaly signal. In each month, stocks are first sorted into terciles based on T100. Then, within each tercile, stocks are sorted into deciles based on an anomaly signal. For each anomaly signal, the figure plots average net returns on decile long-short value-weighted anomaly portfolios within the T100 terciles. The transaction cost estimation follows Novy-Marx and Velikov (2014). Panels (a), (b), and (c) report low-, mid-, and high-turnover anomaly strategies, respectively. For further details on the construction of the anomaly signals, see Table 2. The sample period is 01/1983 to 12/2013.
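To make the notion of a net return concrete, the sketch below subtracts a generic turnover-times-cost charge from a gross return. It is an illustration under stated assumptions only: the function `net_return`, its arguments, and the proportional one-way cost per stock are hypothetical, and the snippet is not the Novy-Marx and Velikov (2014) procedure that the caption refers to.

```python
def net_return(gross_return, old_weights, new_weights, one_way_cost):
    """Gross return minus the cost of moving from old to new portfolio weights.

    Weights are signed (short positions negative). one_way_cost maps each
    stock to a proportional cost per unit of weight traded, for example half
    of an effective spread estimate (an assumption made for illustration).
    """
    names = sorted(set(old_weights) | set(new_weights))
    cost = sum(
        abs(new_weights.get(n, 0.0) - old_weights.get(n, 0.0)) * one_way_cost[n]
        for n in names
    )
    return gross_return - cost


# Made-up long-short example: the short leg rotates from stock B to stock C
net = net_return(
    gross_return=0.02,
    old_weights={"A": 1.0, "B": -1.0},
    new_weights={"A": 1.0, "C": -1.0},
    one_way_cost={"A": 0.001, "B": 0.002, "C": 0.004},
)
```

Here only B and C trade, so the charge is 0.002 + 0.004 and the net return is roughly 0.014. Strategies that rebalance more often or hold costlier stocks lose more of their gross return under this kind of accounting.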

Figure 10: Net returns from conditional double sorts on T500 and an anomaly signal. In each month, stocks are first sorted into terciles based on T500. Then, within each tercile, stocks are sorted into deciles based on an anomaly signal. For each anomaly signal, the figure plots average net returns on decile long-short value-weighted anomaly portfolios within the T500 terciles. The transaction cost estimation follows Novy-Marx and Velikov (2014). Panels (a), (b), and (c) report low-, mid-, and high-turnover anomaly strategies, respectively. For further details on the construction of the anomaly signals, see Table 2. The sample period is 01/1983 to 12/2013.

Figure 11: Net returns from conditional double sorts on T1000 and an anomaly signal. In each month, stocks are first sorted into terciles based on T1000. Then, within each tercile, stocks are sorted into deciles based on an anomaly signal. For each anomaly signal, the figure plots average net returns on decile long-short value-weighted anomaly portfolios within the T1000 terciles. The transaction cost estimation follows Novy-Marx and Velikov (2014). Panels (a), (b), and (c) report low-, mid-, and high-turnover anomaly strategies, respectively. For further details on the construction of the anomaly signals, see Table 2. The sample period is 01/1983 to 12/2013.

Table 1: Fama-MacBeth Regressions of Trade Sizes on Characteristics
The table reports results from monthly Fama-MacBeth cross-sectional regressions of the trade size frequency variables on firm characteristics. Amihud's illiquidity measure is estimated monthly, following Amihud (2002), while Tcosts is the effective bid-ask spread measure from Hasbrouck (2009). All variables are winsorized at the 1% and 99% levels. T-statistics in brackets are estimated using Newey-West standard errors with 12 lags. The sample period is 02/1983 to 12/2013.

                      y = T100                   y = T500                   y = T1000
                  (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)
Const            6.15    47.32     7.80     4.58    10.55     5.20     3.11     9.77     2.70
              [20.61]   [8.13]  [15.06]   [8.68]   [7.86]   [7.13]   [7.49]   [5.72]   [6.66]
T100(t-1)        0.76              0.78
              [33.47]           [40.33]
T500(t-1)                                    0.52              0.57
                                          [18.20]           [24.81]
T1000(t-1)                                                              0.67              0.66
                                                                     [50.60]           [31.37]
log(me)                  -1.74    -0.26             -0.41    -0.23             -0.44    -0.09
                       [-4.21]  [-5.30]           [-3.94]  [-4.85]           [-2.73]  [-2.52]
log(prc)                  0.20     0.04             -0.02    -0.01             -0.04    -0.01
                       [10.71]  [10.51]           [-4.47]  [-4.49]           [-5.03]  [-5.20]
Amihud                   -8.60     1.10              1.42    -0.03             -9.45    -2.68
                       [-1.14]   [1.13]            [1.07]  [-0.06]           [-3.90]  [-4.44]
log(b/m)                 -0.12     0.12             -0.16    -0.09             -0.90    -0.23
                       [-0.50]   [2.32]           [-3.39]  [-3.66]           [-4.43]  [-5.45]
Tcosts                   -4.12    -0.80              0.10     0.03              1.79     0.60
                       [-4.19]  [-6.26]            [0.61]   [0.44]            [7.53]   [6.71]
Average R2       0.60     0.23     0.69     0.30     0.13     0.42     0.46     0.20     0.55
n                 371      372      371      371      372      371      371      372      371
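The estimation steps named in the Table 1 caption (monthly cross-sectional regressions, winsorization at the 1% and 99% levels, and Newey-West t-statistics with 12 lags on the time series of coefficients) can be sketched as below. This is a minimal illustration on synthetic data, not the authors' code: the function names, the Bartlett-kernel form of the Newey-West variance, and the use of NumPy are assumptions.

```python
import numpy as np


def winsorize(x, lower=1.0, upper=99.0):
    """Clip a cross-section at its lower and upper percentiles."""
    lo, hi = np.percentile(x, [lower, upper])
    return np.clip(x, lo, hi)


def fama_macbeth(panels, lags=12):
    """Fama-MacBeth estimates with Newey-West t-statistics.

    panels: list of (y, X) pairs, one per month; X excludes the constant.
    Returns (mean coefficients, t-statistics), with the constant first.
    """
    betas = []
    for y, X in panels:
        Xc = np.column_stack([np.ones(len(y)), X])
        coef, *_ = np.linalg.lstsq(Xc, y, rcond=None)  # month-by-month OLS
        betas.append(coef)
    betas = np.asarray(betas)  # shape (months, coefficients)
    T = betas.shape[0]
    mean = betas.mean(axis=0)
    dev = betas - mean
    # Long-run variance of each coefficient series with Bartlett weights
    lrv = (dev * dev).sum(axis=0) / T
    for l in range(1, min(lags, T - 1) + 1):
        w = 1.0 - l / (lags + 1.0)
        lrv += 2.0 * w * (dev[l:] * dev[:-l]).sum(axis=0) / T
    se = np.sqrt(lrv / T)
    return mean, mean / se


# Synthetic example: 60 months, 200 firms, one characteristic
rng = np.random.default_rng(0)
panels = []
for _ in range(60):
    x = winsorize(rng.normal(size=200))
    slope = 0.5 + 0.1 * rng.normal()  # slope varies from month to month
    y = 1.0 + slope * x + rng.normal(scale=0.1, size=200)
    panels.append((y, x.reshape(-1, 1)))
coef, tstat = fama_macbeth(panels)
```

The reported coefficient is the time-series average of the monthly estimates, and the lag correction matters when those monthly estimates are autocorrelated, which is plausible for persistent variables such as trade size frequencies.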

Table 2: The anomalies
All strategies consist of a time series of value-weighted returns on a long/short self-financing portfolio, constructed using a decile sort on an anomaly signal. Column 2 indicates the relevant reference, and column 3 reports the signal used for sorting. The last column indicates the frequency of rebalancing. For further details on the construction, see Novy-Marx and Velikov (2014) or the relevant references.

Panel A: Low Turnover
Anomaly | Reference(s) | Signal | Rebal.
Size | Fama and French (1993) | Market equity | Annual
Gross Profitability | Novy-Marx (2013) | Gross Profitability | Annual
Value | Fama and French (1993) | Book-to-market equity | Annual
ValProf | Novy-Marx (2014) | Sum of firms' ranks in univariate sorts on book-to-market and gross profitability | Annual
Accruals | Sloan (1996) | Accruals | Annual
Asset Growth | Cooper et al. (2008) | Asset Growth | Annual
Investment | Lyandres et al. (2008) | Investment | Annual
Piotroski's F-score | Piotroski (2000) | Piotroski's F-score | Annual

Panel B: Mid Turnover
Anomaly | Reference(s) | Signal | Rebal.
Net Issuance (M) | Fama and French (2008) | Net stock issuance | Monthly
Return-on-book equity | Chen et al. (2010) | Return-on-book equity | Monthly
Failure Probability | Campbell et al. (2008) | Failure Probability | Monthly
ValMomProf | Novy-Marx (2014) | Sum of firms' ranks in univariate sorts on book-to-market, gross profitability, and momentum | Monthly
ValMom | Novy-Marx (2014) | Sum of firms' ranks in univariate sorts on book-to-market and momentum | Monthly
Idiosyncratic Volatility | Ang et al. (2006) | Idiosyncratic volatility, measured as the residuals of regressions of their past three months' daily returns on the daily returns of the Fama-French three factors | Monthly
Momentum | Jegadeesh and Titman (1993) | Prior year's stock performance excluding the most recent month | Monthly
PEAD (SUE) | Foster et al. (1984) | Standardized Unexpected Earnings (SUE) | Monthly
PEAD (CAR3) | Brandt et al. (2008) | Cumulative three-day abnormal return around announcement (days minus one to one) | Monthly

Table 2: Continued

Panel C: High Turnover
Anomaly | Reference(s) | Signal | Rebal.
Industry Momentum | Moskowitz and Grinblatt (1999) | Industry past month's return | Monthly
Industry Relative Reversals | Da, Liu and Schaumburg (2014) | Difference between a firm's prior month's return and the prior month's return of their industry | Monthly
High-frequency Combo | Novy-Marx and Velikov (2014) | Sum of firms' ranks in the univariate sorts on industry relative reversals and industry momentum | Monthly
Short-run Reversals | Jegadeesh and Titman (1993) | Prior month's returns | Monthly
Seasonality | Heston and Sadka (2011) | Average return in the calendar month over the preceding five years | Monthly
Industry Relative Reversals (Low Volatility) | Novy-Marx and Velikov (2014) | Industry relative reversals, restricted to stocks with idiosyncratic volatility lower than the NYSE median for the month | Monthly

Table 3: Conditional Double Sorts on Trade Size and an Anomaly Signal
The table reports conditional double sorts on trade size frequency variables and an anomaly signal. In each month, stocks are first sorted into terciles based on one of the three trade size frequency variables. Then, within each tercile, stocks are sorted into deciles based on an anomaly signal. For each double sort, within each trade size frequency tercile, the table reports the average return [t-stat] of the long-short decile value-weighted anomaly portfolio. Panels A, B, and C report results for the low-, mid-, and high-turnover anomalies, respectively. For further details on the construction of the anomaly signals, see Table 2. The sample period is 01/1983 to 12/2013.

Panel A: Low-turnover anomalies
                                   T100 Terciles              T500 Terciles             T1000 Terciles
Anomaly                           (L)      (2)      (H)      (L)      (2)      (H)      (L)      (2)      (H)
Size                            -0.12    -0.02     0.05    -0.13     0.11    -0.19    -0.15     0.08    -0.38
                              [-0.29]  [-0.07]   [0.23]  [-0.45]   [0.37]  [-0.59]  [-0.63]   [0.27]  [-1.00]
Gross Profitability              0.37     1.20     0.53     0.62     0.75     1.17     0.41     0.58     1.22
                               [0.85]   [4.21]   [1.98]   [2.28]   [2.57]   [2.88]   [1.74]   [1.74]   [2.91]
Value                            0.79     0.03     0.27     0.18     0.53     0.15     0.26     0.45    -0.08
                               [2.63]   [0.13]   [0.94]   [0.78]   [1.87]   [0.40]   [1.07]   [1.34]  [-0.20]
ValProf                          1.19     0.97     0.73     0.68     1.18     1.56     0.37     0.98     1.81
                               [3.30]   [3.29]   [2.42]   [2.58]   [3.66]   [3.89]   [1.59]   [3.07]   [4.73]
Accruals                         0.56     0.40     0.32     0.17     0.53     0.47     0.28     0.91     0.57
                               [1.58]   [1.61]   [1.41]   [0.72]   [2.08]   [1.33]   [1.38]   [3.01]   [1.67]
Asset Growth                     0.83     0.57     0.19     0.47     0.53     0.43     0.26     0.36     0.44
                               [2.19]   [2.13]   [0.78]   [1.82]   [1.78]   [1.20]   [1.14]   [1.41]   [1.13]
Investment                       0.63     0.59     0.31     0.41     0.63     0.47     0.32     0.54     1.12
                               [2.24]   [2.61]   [1.40]   [1.96]   [2.65]   [1.60]   [1.77]   [2.43]   [3.13]
Piotroski F-score                1.01     0.39     0.23     0.52     0.14     0.80     0.62     0.29     0.91
                               [2.56]   [1.47]   [0.96]   [2.06]   [0.42]   [1.91]   [3.06]   [0.78]   [2.13]

Panel B: Mid-turnover anomalies
                                   T100 Terciles              T500 Terciles             T1000 Terciles
Anomaly                           (L)      (2)      (H)      (L)      (2)      (H)      (L)      (2)      (H)
Net Issuance (M)                 1.17     0.89     0.66     0.54     0.81     1.33     0.61     0.83     0.98
                               [4.62]   [4.96]   [2.46]   [3.16]   [3.43]   [3.92]   [3.86]   [3.89]   [2.84]
ROE                              0.62     0.86     0.71     0.81     0.71     0.83     0.57     0.96     0.88
                               [1.60]   [3.27]   [2.98]   [2.96]   [2.45]   [2.00]   [2.33]   [3.32]   [1.87]
Failure Probability              2.15     1.41     0.93     1.43     1.28     2.53     0.88     1.62     2.35
                               [4.06]   [3.00]   [2.63]   [4.01]   [2.87]   [4.93]   [2.48]   [3.62]   [4.40]
ValMomProf                       2.26     1.54     1.16     1.07     1.71     2.31     0.73     1.98     2.53
                               [4.69]   [4.03]   [3.63]   [3.90]   [4.50]   [5.16]   [2.83]   [5.48]   [5.29]
ValMom                           1.83     1.42     1.28     0.90     1.63     2.16     0.67     1.85     2.04
                               [3.46]   [3.27]   [3.74]   [2.56]   [4.04]   [4.13]   [2.21]   [4.49]   [3.63]
Idiosyncratic Volatility         2.52     1.32     0.94     1.34     1.20     2.56     0.98     1.58     2.47
                               [4.21]   [2.68]   [2.04]   [2.96]   [2.23]   [4.80]   [2.22]   [3.44]   [4.25]
Momentum                         2.31     1.50     1.10     1.25     1.39     2.59     0.68     1.67     2.59
                               [3.75]   [3.05]   [2.79]   [3.03]   [2.82]   [4.49]   [1.72]   [3.40]   [4.26]
PEAD (SUE)                       0.76     0.48     0.64     0.55     0.63     0.81     0.39     0.67     1.11
                               [2.85]   [2.55]   [3.13]   [3.21]   [2.79]   [2.62]   [2.57]   [2.84]   [3.19]
PEAD (CAR3)                      1.14     1.04     1.10     0.97     0.77     1.85     0.65     1.15     1.94
                               [3.36]   [4.17]   [5.41]   [4.12]   [3.37]   [5.05]   [3.87]   [4.71]   [5.06]

Table 3: Continued

Panel C: High-turnover anomalies
                                   T100 Terciles              T500 Terciles             T1000 Terciles
Anomaly                           (L)      (2)      (H)      (L)      (2)      (H)      (L)      (2)      (H)
Industry Momentum                0.79     0.42     0.26     0.20     0.31     0.24     0.20     0.66     0.91
                               [2.25]   [1.47]   [0.77]   [0.73]   [0.93]   [0.58]   [0.69]   [1.92]   [2.32]
Industry Relative Reversals      0.96     0.17     0.75     0.66     0.26     0.71     0.90     0.52     0.38
                               [1.78]   [0.50]   [2.76]   [2.25]   [0.76]   [1.35]   [3.23]   [1.71]   [0.71]
High-frequency Combo             1.68     1.22     0.82     0.87     1.20     1.25     0.95     1.22     0.94
                               [4.86]   [5.10]   [3.52]   [4.12]   [4.90]   [3.81]   [5.07]   [5.12]   [2.48]
Short-run Reversals              0.89     0.08     0.37     0.63     0.07     0.76     0.71     0.13     0.29
                               [1.59]   [0.19]   [1.08]   [1.80]   [0.18]   [1.35]   [2.22]   [0.34]   [0.50]
Seasonality                      1.01     0.83     1.17     1.15     0.75     1.14     1.09     0.87     0.98
                               [2.84]   [3.10]   [4.33]   [4.02]   [2.49]   [3.05]   [5.25]   [2.89]   [2.44]
IRR (LowVol)                     1.52     1.22     1.33     1.40     1.22     1.00     1.32     1.16     1.78
                               [3.05]   [5.94]   [6.40]   [7.01]   [5.52]   [2.89]   [8.16]   [4.54]   [3.46]

Table 4: Fama-MacBeth Regressions of Returns on Anomaly Signals and Trade Size
This table reports results from monthly Fama-MacBeth cross-sectional regressions of returns on anomaly signals, trade size frequency variables, and interaction terms. The regressions are estimated separately for each anomaly. The first column contains slope coefficients from regressions with the following specification:
$$r_{tj} = \alpha + \beta x_{tj} + \varepsilon_{tj}$$
The rest of the columns contain slope coefficients from regressions with the following specification:
$$r_{tj} = \alpha + \beta_1 x_{tj} + \beta_2 TX_{tj} + \beta_3 x_{tj} TX_{tj} + \varepsilon_{tj}$$
In all of these specifications, $x_{tj}$ is the anomaly characteristic. $TX_{tj}$ stands for $T100_{tj}$ in columns (2)-(4), $T500_{tj}$ in columns (5)-(7), and $T1000_{tj}$ in columns (8)-(10). Panels A, B, and C report results for low-, mid-, and high-turnover anomalies, respectively. Independent variables are winsorized at the 1% and 99% levels. For further details on the construction of the anomaly signals, see Table 2. The sample period is 01/1983 to 12/2013.
Panel A: Low-turnover anomalies
                                 T100 Specification         T500 Specification         T1000 Specification
Anomaly                   (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)      (9)     (10)
                            β       β1       β2       β3       β1       β2       β3       β1       β2       β3
Size                    -0.00    -0.00     0.19     0.00    -0.00    -1.24    -0.00    -0.00    -0.98    -0.00
                      [-0.84]  [-0.99]   [0.29]   [1.00]  [-0.00]  [-0.58]  [-0.34]  [-0.12]  [-0.31]  [-0.01]
Gross Profitability      0.78     0.88     0.10    -1.23     0.49    -1.88     2.55     0.44    -1.04     2.57
                       [4.32]   [2.35]   [0.11]  [-1.47]   [2.21]  [-0.71]   [0.88]   [2.18]  [-0.29]   [0.69]
Value                    0.39     0.63     0.86    -1.22     0.16    -3.64     2.80     0.14    -3.49     3.67
                       [5.05]   [4.76]   [1.08]  [-3.54]   [1.57]  [-1.39]   [2.81]   [1.61]  [-0.96]   [2.88]
ValProf                  0.00     0.00     3.09    -0.00     0.00    -8.94     0.00     0.00    -7.53     0.00
                       [7.37]   [5.17]   [2.30]  [-4.13]   [3.32]  [-2.07]   [2.67]   [4.06]  [-1.32]   [2.10]
Accruals                 1.76     2.68    -0.34    -1.53     1.28    -1.02    10.69     2.05     0.12     4.80
                       [4.95]   [3.00]  [-0.48]  [-0.68]   [2.02]  [-0.44]   [1.30]   [3.91]   [0.04]   [0.44]
Asset Growth             1.03     1.68    -3.89    -3.12     0.68     3.06     3.38     0.48     5.81     5.11
                       [8.97]   [6.25]  [-3.37]  [-4.63]   [4.03]   [0.87]   [1.60]   [3.38]   [1.29]   [2.09]
Investment               1.85     2.85    -0.81    -4.71     1.47    -0.24     8.57     0.86     1.30    15.36
                       [7.62]   [4.40]  [-1.07]  [-3.60]   [4.22]  [-0.10]   [1.42]   [2.92]   [0.39]   [2.08]
Piotroski F-score        0.07     0.15    -0.16    -0.16     0.04    -4.48     0.97     0.06    -1.76     0.82
                       [1.78]   [2.10]  [-0.12]  [-1.06]   [0.74]  [-1.29]   [2.16]   [1.79]  [-0.37]   [1.28]
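One way to read the interaction specification in the Table 4 caption, offered as an interpretive sketch rather than a derivation given by the authors: differentiating the second regression with respect to the anomaly characteristic shows that the return sensitivity to the signal is linear in the trade size frequency.

```latex
\frac{\partial r_{tj}}{\partial x_{tj}} = \beta_1 + \beta_3 \, TX_{tj}
```

A positive $\beta_3$ therefore means that the return earned per unit of the signal rises with the trade size frequency, and a negative $\beta_3$ means that it falls. For Value in Panel A, for instance, $\beta_3$ is -1.22 [-3.54] in the T100 specification but 2.80 [2.81] and 3.67 [2.88] in the T500 and T1000 specifications.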