feds · May 21, 2020

Zeroing in on the Expected Returns of Anomalies

Abstract

We zero in on the expected returns of long-short portfolios based on 120 stock market anomalies by accounting for (1) effective bid-ask spreads, (2) post-publication effects, and (3) the modern era of trading technology that began in the early 2000s. Net of these effects, the average anomaly's expected return is a measly 8 bps per month. The strongest anomalies return only 10-20 bps after accounting for data-mining with either out-of-sample tests or empirical Bayesian methods. Expected returns are negligible despite cost optimizations that produce impressive net returns in-sample and the omission of additional trading costs like price impact.

Finance and Economics Discussion Series Divisions of Research & Statistics and Monetary Affairs Federal Reserve Board, Washington, D.C. Zeroing in on the Expected Returns of Anomalies Andrew Y. Chen and Mihail Velikov 2020-039 Please cite this paper as: Chen, Andrew Y., and Mihail Velikov (2020). “Zeroing in on the Expected Returns of Anomalies,” Finance and Economics Discussion Series 2020-039. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2020.039. NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Andrew Y. Chen (Federal Reserve Board, andrew.y.chen@frb.gov) and Mihail Velikov (Pennsylvania State University, velikov@psu.edu) ∗

May 2020

∗ First posted to SSRN: November 2017. This paper originated from a conversation with Svetlana Bryzgalova. We thank Marie Briere (HFPE discussant), Victor DeMiguel, Yesol Huh, Nina Karnaukh, Alberto Martin-Utrera (FDU discussant), Andy Neuhierl, Steve Sharpe, Nitish Sinha, Ingrid Tierens (Jacobs Levy discussant), Tugkan Tuzun, Michael Weber, Haoxiang Zhu, and seminar participants at the Federal Reserve Board, Penn State University, University of Georgia, the 11th Annual Hedge Fund and Private Equity Research Conference, 2019 Finance Down Under Meetings, 2019 Eastern Finance Association Meetings, and 2019 Jacobs Levy Frontiers in Quantitative Finance conference for helpful comments. The views expressed herein are those of the authors and do not necessarily reflect the position of the Board of Governors of the Federal Reserve or the Federal Reserve System.

1. Introduction

The literature on stock market anomalies has documented more than one hundred predictors of the cross-section of stock returns. Using historical data, these papers demonstrate market-neutral returns that average around 8% per year. These anomalies range from those based on past return patterns, to those based purely on accounting variables, and still others based on institutional stock holdings. Few economic risk factors or behavioral theories are so broad that they can make a dent in this wide variety of return predictors.

Anomalies’ expected returns, however, may be much lower than the mean returns found in the literature. With only a couple of exceptions, the literature ignores trading costs, which can significantly reduce expected payoffs, and thus expected returns. Moreover, the historical data used in these papers are stale. The literature uses data going back to the 1920s, leading to questions about whether returns from so long ago are still relevant. Indeed, data-mining bias and investor learning imply that returns in recent years have been much smaller (McLean and Pontiff 2016). And the early 2000s saw a revolution in information and trading technologies, implying that data from earlier decades may not be informative about the future (Chordia, Subrahmanyam, and Tong 2014).

In this paper, we zero in on the expected returns of anomalies by accounting for both trading costs and the staleness of historical data. Our main result is that, net of these effects, expected returns are effectively zero.

Figure 1 illustrates how we “zero in.” To generate this figure, we replicate 120 anomaly signals, construct long-short portfolios using state-of-the-art cost mitigation techniques, and reduce portfolio payoffs by half of the effective bid-ask spread whenever a portfolio weight is adjusted. Each bar, moving from left to

Figure 1: Anomaly Mean Long-Short Returns. Error bars show one standard error. [Bar chart. Bars, left to right: Gross In-Sample, Net In-Sample, Net Post-Pub, Net Post-Pub & Post-2005. Vertical axis: Mean Net Return Across 120 Anomalies (bps per month), 0 to 70.]

right, provides a more refined estimate of the average anomaly’s expected return.

The first bar is the mean return before trading costs (gross return) within the original papers’ sample periods (in-sample). In our dataset we find an impressive 66 bps per month. Accounting for trading costs reduces the expected return to 38 bps, which is still a notable 4.6% per year. Adding post-publication effects, however, results in a measly 13 bps per month. Additionally restricting the sample to the modern era of trading technology (post-2005), we should expect a negligible 8 bps per month.1 These results omit additional trading costs such as price impact and short-sale fees. Indeed, short-sale costs average 10-20 basis points per month (Cohen, Diether, and Malloy 2007; Drechsler and Drechsler 2016), and would wipe out the remaining profits.

Though the average anomaly is unprofitable, perhaps the strongest anomalies still offer large expected returns. Indeed, seven anomalies have mean net returns in excess of 60 bps per month in the data that is both post-publication

1 We thank Marie Briere for suggesting this analysis. Post-2003 and post-2004 samples lead to similar results.

and post-2005. This performance should be viewed with suspicion, however, as some portion of it must be due to luck. Indeed, reporting only anomalies with the largest mean returns is the very definition of data-mining.

We find, however, that even the strongest anomalies offer negligible expected returns. To come to this conclusion, we use two data-mining adjustments that have distinct statistical motivations. Despite their different origins, both adjustments lead to the same quantitative result.

The first data-mining adjustment is a simple out-of-sample test. We sort anomalies based on information available in their in-sample periods, and then average post-publication and post-2005 net returns within quantiles. This exercise avoids using the same data to select and make inference on anomalies, thus eliminating data-mining bias. We examine four in-sample predictors: the net return, the net Sharpe ratio, the return reduction due to trading costs, and turnover.

The best expected returns come from sorting anomalies on their in-sample net Sharpe ratios. The top quartile has a mean net return of 21 bps per month in post-publication and post-2005 data. But net returns are not monotonic, and the second strongest predictor (turnover) produces at most only 14 bps per month. Moreover, these results require using equal-weighted implementations. Restricting our implementations to value-weighting implies expected returns of 11 bps per month, at best.

The second data-mining adjustment uses an empirical Bayes estimator. These estimators have been shown to effectively adjust for data-mining in a wide variety of settings (Efron 2012; Azevedo et al. 2019; Liu, Moon, and Schorfheide 2020; Chen and Zimmermann 2019). Such estimators compare the cross-anomaly dispersion of mean returns to their standard errors to determine how much dispersion is due to luck. The overall contribution of luck is estimated

from empirical data using frequentist methods, and then adjustments for individual anomalies are derived using Bayesian formulas, hence the name “empirical Bayes.”

This estimation finds that most of the dispersion in mean net returns in post-publication and post-2005 data is due to luck. As a result, even the 90th percentile anomaly has an expected return of 20 bps per month after adjusting for data-mining. Even worse, implementations that use only value-weighting produce only 6 bps in the 90th percentile. These results are remarkably consistent with our first data-mining adjustment, despite their very different methodologies.

Our data-mining results are intuitive given the distribution of mean net returns in recent data. The distribution closely resembles a standard normal distribution. Only 9% of t-stats exceed 2.0 in absolute value, not far from the 5% implied by a standard normal. Thus, the data can be largely explained by the null of no predictability, and returns in the right tail of the distribution are mostly due to luck.

These results may be surprising, as other papers find size, B/M, and momentum survive trading costs (Novy-Marx and Velikov 2016; Frazzini, Israel, and Moskowitz 2015; Briere et al. 2019). Individual anomalies, however, have noisy mean returns that are very sensitive to the sample period. Size, B/M, and momentum have positive net returns of 30 to 70 bps in the 1998-2013 sample studied by Frazzini, Israel, and Moskowitz (2015), but their net returns drop to between -30 and +25 bps post-2005. This change in performance is consistent with the fact that standard errors on mean returns are around 40 bps per month for samples of 15 years. This fragility demonstrates the importance of aggregating across many anomalies, as we do in our paper.

A limitation of our study is that we do not allow for combining multiple

anomalies. Combining anomalies can improve portfolio performance, particularly when accounting for trading costs (Novy-Marx and Velikov 2016, for example). Indeed, DeMiguel et al. (Forthcoming) apply state-of-the-art optimization techniques to 50 anomalies, and find that combining anomalies has very powerful effects on trading costs in the long historical sample. Combining anomalies, however, does not allow for sharp inferences about more recent data. It is only by averaging over 120 anomalies that we can obtain the small standard errors in Figure 1. Indeed, our large dataset allows us to make sharp inferences about the recent performance of the best anomalies. Both data-mining adjustments produce standard errors on mean net returns of around 5-10 bps.

Another limitation is that we measure trading costs with effective spreads. Spreads are a lower bound on trading costs because they correspond to the smallest market orders, but one might argue that even lower costs can be obtained with the strategic use of limit orders. Indeed, Frazzini, Israel, and Moskowitz (2018) argue that traders can act as market makers and pay negative spreads, receiving rather than paying trading costs. Acting as a market maker, however, results in adverse selection costs (Glosten and Milgrom 1985) and execution risk (Cont and Kukanov 2017). Moreover, theory suggests that there are fundamental trading costs that cannot be avoided regardless of the implementation (Kyle and Obizhaeva 2016), and empirical studies find that effective bid-ask spreads are closely related to these fundamental costs (Fong, Holden, and Tobek 2017).

In the remainder of the Introduction, we relate our study to existing literature. Section 2 describes our methods. Section 3 presents results for the average anomaly. Section 4 examines the strongest anomalies. We examine size, B/M, and momentum in Section 4.3. Section 5 concludes.
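The empirical Bayes logic described in the Introduction (compare cross-anomaly dispersion with standard errors, then shrink individual estimates) can be illustrated with a minimal sketch. It assumes a textbook normal-normal model with a method-of-moments estimate of the signal variance; this is an illustrative stand-in, not necessarily the estimator used in this paper, and the function name `empirical_bayes_shrink` and the example numbers are hypothetical.

```python
import numpy as np


def empirical_bayes_shrink(mean_ret, std_err):
    """Shrink noisy mean returns toward the cross-anomaly average.

    Assumed model (illustrative only): observed mean = true mean + noise,
    with noise variance std_err**2 and true means drawn from a common
    normal distribution.
    """
    mean_ret = np.asarray(mean_ret, dtype=float)
    std_err = np.asarray(std_err, dtype=float)

    grand_mean = mean_ret.mean()
    # Frequentist step: dispersion left over after removing average noise.
    total_var = mean_ret.var(ddof=1)
    noise_var = np.mean(std_err ** 2)
    signal_var = max(total_var - noise_var, 0.0)

    # Bayesian step: weight on each anomaly's own estimate.
    weight = signal_var / (signal_var + std_err ** 2)
    shrunk = grand_mean + weight * (mean_ret - grand_mean)
    return shrunk, weight


# Hypothetical inputs in bps per month (not taken from the paper).
shrunk_precise, w_precise = empirical_bayes_shrink([0.0, 10.0, 20.0], [5.0, 5.0, 5.0])
shrunk_noisy, w_noisy = empirical_bayes_shrink([0.0, 10.0, 20.0], [20.0, 20.0, 20.0])
```

When standard errors are large relative to the dispersion of mean returns, the estimated signal variance hits zero and every anomaly is shrunk all the way to the cross-anomaly average, which is the sense in which dispersion is attributed to luck.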

Relation to the Literature

In a closely-related study, Novy-Marx and Velikov (2016) (NV) find that trading costs have a large effect on the mean returns of twenty-three anomalies. However, there are several reasons why NV’s results do not allow for inference about expected returns on the “anomaly zoo” (McLean and Pontiff 2016; Freyberger, Neuhierl, and Weber 2017; Feng, Giglio, and Xiu 2017; etc.).

First and foremost, the anomalies studied in NV are not representative of the anomaly zoo. NV’s anomalies are “twenty-three of the best known, and strongest performing, anomaly strategies.” In contrast, the anomalies in anomaly zoo papers like McLean and Pontiff (2016) (MP) are drawn from a more-or-less exhaustive literature search, and include dozens of anomalies that are not popularly known.2

Unlike NV, our anomalies include all 68 of MP’s anomalies that allow for cost optimization and 52 additional anomalies from Green, Hand, and Zhang (2017) and Hou, Xue, and Zhang (2017). This large set of anomalies also differentiates our paper from other trading cost studies, all of which examine small sets of well-known anomalies (Frazzini, Israel, and Moskowitz 2015, and Briere et al. 2019, for example).3 Moreover, we reconcile our results with studies of selected anomalies by using data-mining adjustments. These adjustments allow us to study the strongest anomalies using objective statistics, unlike previous papers which use judgment to determine notable anomalies.

The second reason NV’s results cannot be used to study expected returns is

2 McLean and Pontiff’s anomalies “were mostly identified with search engines such as Econlit by searching for articles in finance and accounting journals using words such as ‘cross-section.’” David McLean informed us that they also surveyed asset pricing experts to make sure they were not missing anything.
3 For other trading cost studies, see Stoll and Whaley (1983), Schultz (1983), Ball, Kothari, and Shanken (1995), Knez and Ready (1996), Pontiff and Schill (2001), Korajczyk and Sadka (2004), Lesmond, Schill, and Zhou (2004), Hanna and Ready (2005), McLean (2010), Hou, Kim, and Werner (2016), Patton and Weller (2017), Frazzini, Israel, and Moskowitz (2015), and Briere et al. (2019). For other papers on the decay of predictability over time, see Schwert (2003), Marquering, Nisser, and Valla (2006), Huang and Huang (2013), Chordia, Subrahmanyam, and Tong (2014), Jacobs and Müller (2017), Chu, Hirshleifer, and Ma (2017), and Chen and Zimmermann (2019).

that NV’s trading cost measure exhibits a large upward bias in recent years. NV measure trading costs using Hasbrouck’s (2009) low-frequency spreads, and as we show in Section 2.2, low-frequency spreads are upward biased by 25-50 bps after 2003. This bias is consistent with changes in the trading environment since decimalization (Jahan-Parvar and Zikes 2019).

We account for this bias by using high-frequency (HF) spreads from NYSE’s Trade and Quote (TAQ) database. Spreads from TAQ serve as benchmark measures of liquidity in the microstructure literature (Goyenko, Holden, and Trzcinka 2009; Fong, Holden, and Trzcinka 2017), and indeed all low frequency (LF) spreads demonstrate their validity by examining their correlations with HF spreads (Corwin and Schultz 2012, for example).

Finally, it is not obvious how to combine NV’s trading cost effects with the performance decay found in other papers. While performance decay tends to reduce net returns in recent data, trading costs have plummeted too, with opposite effects. Moreover, theories of limited arbitrage predict that performance decay and trading costs interact cross-sectionally, implying that the measurement of expected returns must be done at the anomaly level. To accommodate these interactions, we provide the first joint study of trading costs and performance decay that uses empirical trading cost data.4

2. Anomalies Data, Trading Cost Measurement, and Portfolio Implementations

Here we describe our methods. We begin with the anomalies data (Section 2.1), then describe trading cost measurement (Section 2.2), and then describe

4 Huang and Huang (2013) also examine trading costs and post-publication returns for many anomalies and find that expected returns are positive, but they impute trading costs based on statistics reported in the literature and study only 14 anomalies.

our portfolio implementations (Section 2.3).

2.1. Anomalies Data

Our anomalies dataset is created from Chen and Zimmermann’s (2019) (CZ’s) set of 156 cross-sectional return predictors from 115 publications in accounting, economics, and finance journals. This dataset contains all 97 from McLean and Pontiff (2016) and adds 59 predictors from Green, Hand, and Zhang (2017), Hou, Xue, and Zhang (2017), and Harvey, Liu, and Zhu (2016).

Chen and Zimmermann show that their replicated predictors perform quite well. The average in-sample (original publication’s sample) return is 0.72% per month, with an average t-stat of 4.3. Moreover, their in-sample returns are very similar to hand collected statistics from the original publications, differing by only a handful of basis points on average.

We exclude 34 predictors that have difficult-to-evaluate trading costs. Many of these predictors are created from event studies (such as Ritter’s (1991) study of long-run IPO performance) that are difficult to compare with predictors that change on a regular basis. In particular, the optimal rebalancing of event-study-based portfolios is difficult to determine, and rebalancing has a large effect when examining trading costs. We also exclude predictors that are too discrete to be used in our trading cost mitigation techniques such as Hong and Kacperczyk’s (2009) sin stock classification. Continuity is important, because our most reliable cost mitigation, the buy-hold spread, relies on the continuity of the predictor for more efficient rebalancing.

We also exclude the Fama and MacBeth (1973) CAPM beta and Kelly and Jiang’s (2014) tail risk factor because some academics may object that these are not anomalies. Nevertheless, including them has almost no effect on our results.

The anomalies are constructed from the usual data sources. More than half of the predictors focus on Compustat data, and about 30% use purely price data. Most of the remainder use analyst forecasts, though several focus on institutional ownership data, trading volume, or specialized data (such as Gompers, Ishii, and Metrick’s (2003) governance index). Appendix A.1 provides a list of the anomalies. For further details, please see Chen and Zimmermann (2019).

2.2. Trading Cost Measurement

We measure returns before trading costs using the ubiquitous monthly CRSP data. Then to adjust for trading costs, we track portfolio weights, and each time a position is entered or exited, we assume the effective half spread is paid. This notion of trading costs is also studied in Hanna and Ready (2005), Korajczyk and Sadka (2004), and Novy-Marx and Velikov (2016).

To understand this trading cost measure, it helps to know that prices in CRSP are predominately determined by closing auctions.5 The hypothetical anomaly portfolios studied by academics would have added additional demand or supply to these auctions, increasing the prices for buys and decreasing the prices for sells. These price deviations, then, would reduce returns compared to the CRSP benchmark. Our trading cost aims to measure the minimal amount by which these prices would have been moved.6 An alternative method for measuring trading costs is to exclusively use intraday data as in Knez and Ready (1996), but this would deviate significantly from the anomalies literature which is based on closing auction prices.

Our measure of the minimal price deviation is the effective half bid-ask

5 The NYSE and NASDAQ closing auctions are described at https://www.nyse.com/article/nyse-closing-auction-insiders-guide and https://www.nasdaqtrader.com/content/productsservices/Trading//ClosingCrossfaq.pdf.

6 We are grateful to Haoxiang Zhu for suggesting this interpretation.

spread, that is, the absolute difference between the trade price and the prevailing quoted midpoint. Supposing that the prevailing midpoint is an unbiased estimate of the frictionless price, a buy trade “overpays” by the effective half spread and a sell trade receives too little by the same amount. Effective spreads use trades that are actually executed, and typically imply smaller spreads than quoted prices due to price improvement (Stoll 2003).

We use high-frequency (HF) data to compute spreads whenever it is available. Our HF data combines the Daily TAQ, Monthly TAQ, and ISSM datasets. Computation of spreads follows Holden and Jacobsen (2014) (HJ) closely.7 To match the monthly data frequencies used in the anomalies literature, we first aggregate to a daily level by taking a share-weighted average of intra-day spreads, and then aggregate across days within each month by taking a simple average following Hanna and Ready (2005) and others. Anomaly returns are measured using end-of-month closing prices and thus one may argue that end-of-month spreads are a better match. However, averaging across the month ensures that our spreads are not sensitive to outliers. For additional details see Appendix A.2.

Our HF data provide a mostly continuous history of transactions on the NYSE and AMEX from 1983-2016.8 These datasets are sufficient for estimating trading costs of anomalies post-publication, as 97% of anomalies are published after 1983. However, we also wish to study the effects of cost optimization, and to avoid data-mining bias we run our optimizations on pre-publication data.

Thus, we compute effective spreads pre-1983 (and whenever HF data is missing) using low frequency (LF) proxies based on daily CRSP data. Rather than

7 We are grateful to Craig Holden for providing SAS code on his website.

8 Data for NASDAQ stocks is somewhat shorter (1987-2016), as ISSM is missing NASDAQ data before 1987. The older ISSM data also features several gaps in data.
NASDAQ data is missing in April and May 1987, April and July 1988, and November and December 1989. In addition, there are 46 trading days with no data for NASDAQ stocks between 1987 and 1991, and 146 trading days with no data for NYSE/AMEX. These data gaps are also found by Barber, Odean, and Zhu (2008).

choose any particular LF proxy, we compute four different LF proxies and use the simple average as our spread. The four LF proxies we use are Hasbrouck’s (2009) Gibbs estimate (Gibbs), Corwin and Schultz’s (2012) high-low spread (HL), Abdi and Ranaldo’s (2017) close-high-low spread (CHL), and Fong, Holden, and Tobek’s (2017) implementation of Kyle and Obizhaeva’s (2016) invariance-based volume-over-volatility measure (VoV).

This approach is motivated by the idea that the LF proxies are a forecast (or backcast) of the unobserved high frequency effective spread. The literature on economic forecasting has shown that a simple average of forecasts (a.k.a. combination forecasts) significantly outperforms individual forecasts in a wide variety of settings (Bates and Granger 1969; Timmermann 2006). This improvement can be understood from a simple diversification argument: the predictive power of a particular forecast varies across observations, and combining multiple forecasts averages out these errors. The averaging of multiple LF illiquidity proxies is also used in Karnaukh, Ranaldo, and Soderlind (2015), who find that averaging improves on using the constituent proxies alone. Indeed, we find that our LF average outperforms any individual LF proxy in terms of its ability to match HF data. For further details see Appendix A.3.

[Table 1 about here.]

Table 1 illustrates the performance of our LF average proxy. Panel A begins by showing that our four LF proxies, while highly correlated, still contain distinct information. The typical correlation is around 75%, but can be as low as 59% (between HL and VoV). These results suggest that the logic of combination forecasts applies here: by combining proxies we can average out their errors.

Panels B and C show that this logic works. These panels compare our LF average with HF spreads when they are available. The LF average has the highest correlation with TAQ spreads, at 90%. In comparison, the best individual LF proxies are Gibbs and VoV, which both have 84% correlations with TAQ. Panel C shows a similar result with ISSM. The LF average has an even higher 94% correlation with ISSM spreads, compared to 90% for the best individual LF proxy, CHL.

Though LF spreads are highly correlated with HF spreads, they exhibit a strong bias, especially in recent data. This problem is shown in Figure 2, which plots the median difference between LF and HF spreads over time. Post-2003, LF spreads are biased upward by 25-50 basis points. This bias indicates that it is important to use HF data to examine trading costs in recent years, and that the LF trading costs used by Novy-Marx and Velikov (2016) overestimate expected costs going forward.

[Figure 2 about here.]

Figure 3 illustrates how our combined effective spread measure has evolved over time. Trading costs rise sharply in the early 1970s as NASDAQ stocks enter the CRSP universe. Costs rise further in the late 1980s, a phenomenon which is seen in other papers (Corwin and Schultz 2012; Abdi and Ranaldo 2017). Trading costs plummet in the 2000s as electronic trading and decimalization have improved liquidity. Overall, our combined effective spread is consistent with key features of stock market history.

[Figure 3 about here.]

2.3. Portfolio Implementations

We examine three different implementations for each anomaly: (1) academic implementations, (2) constrained cost optimizations that allow for equal weighting, and (3) constrained cost optimizations that enforce value-weighting.
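The spread construction of Section 2.2 (effective half spreads, share-weighted within each day, simple-averaged across days, with the LF average as a fallback when HF data is missing) can be sketched as follows. This is a minimal illustration under stated assumptions: the half spread is scaled by the midpoint to express it as a proportion, the tuple layout of `trades` is hypothetical, and the function names are not from the paper or from the Holden and Jacobsen (2014) code.

```python
from collections import defaultdict


def effective_half_spread(trade_price, midpoint):
    """Absolute gap between trade price and prevailing midpoint.

    Dividing by the midpoint (to get a proportional spread) is an
    assumption made here for illustration.
    """
    return abs(trade_price - midpoint) / midpoint


def monthly_hf_spread(trades):
    """Aggregate one stock-month of trades to a single spread.

    trades: iterable of (day, trade_price, midpoint, shares) tuples
    (a hypothetical layout). Within a day, spreads are share-weighted;
    across days, a simple average is taken.
    """
    weighted_sum = defaultdict(float)
    share_sum = defaultdict(float)
    for day, price, mid, shares in trades:
        weighted_sum[day] += shares * effective_half_spread(price, mid)
        share_sum[day] += shares
    daily = [weighted_sum[day] / share_sum[day] for day in weighted_sum]
    return sum(daily) / len(daily)


def combined_spread(hf_spread, lf_proxies):
    """Use the HF spread when available, else the simple LF average."""
    if hf_spread is not None:
        return hf_spread
    return sum(lf_proxies) / len(lf_proxies)


# Hypothetical trades for one stock-month (not real data).
example_trades = [
    (1, 10.01, 10.00, 100),
    (1, 10.03, 10.00, 300),
    (2, 19.98, 20.00, 50),
]
example_monthly = monthly_hf_spread(example_trades)
example_fallback = combined_spread(None, [0.01, 0.02, 0.03, 0.04])
```

Share-weighting within the day lets large trades count for more, while the simple average across days keeps a single unusual day from dominating the monthly figure.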

Implementation is important because the more general notion of trading costs includes not only the direct costs of trades (e.g. effective spreads), but also the lost returns that come from avoiding the direct costs (Perold 1988). Thus, a full accounting of trading costs requires the study of cost optimization. Moreover, the relevant implementation depends on the investor in question, so we study two versions of our constrained optimized implementation.

2.3.1. Academic Implementations

Our academic implementations are simply equal-weighted long-short quintiles. We sort stocks into quintiles based on the signal, equally-weight stocks, and re-calculate portfolio weights when the signal updates.9

This implementation represents the modal approach in the literature. Almost all anomaly papers report either equal-weighted portfolios or equal-weighted regressions, but only a minority report value-weighted portfolios (Green, Hand, and Zhang 2013). Similarly, though the decile and quintile sorts are both frequently reported, many papers that study decile sorts also combine the 9th and 10th deciles in the long leg of their hedge portfolios, suggesting that the original authors would similarly advocate the use of quintile sorts.

2.3.2. Constrained Optimized Implementations

Optimal implementation with many assets and proportional trading costs is an extremely difficult problem. Theoretical solutions have been found only by imposing stark approximations such as uncorrelated returns (Liu 2004) or an exogenous and constant target portfolio (Leland 2000). For tractability, empirical studies often optimize within a restricted set of linear portfolio rules (Brandt, Santa-Clara, and Valkanov 2009; DeMiguel et al. Forthcoming; Moallemi and

9 For a detailed list of signal updating frequencies see Appendix A.1.

Saglam 2017), despite the fact that theory tends to imply non-linear policies.

We optimize within a set of simple non-linear rules that capture the intuition from optimal theory. This set of rules is called the “buy/hold spread” (also known as “banding”), and is best described with an example: a 20/40 buy/hold spread goes long stocks with signals that are in the top 20th percentile, but only exits stocks that have signals below the top 40th percentile (and similarly for the short end). Between the 20th and 40th percentiles is an inaction region where no trading occurs.

Inaction regions are a key feature of optimal trading under trading costs (Magill and Constantinides 1976). Intuitively, while frictionless trading implies that one could always benefit from trading to improve the expected return, with frictions there are states in which the cost of trading outweighs this benefit.

Empirical evidence supports this intuition. Novy-Marx and Velikov (2016, 2019) show that the buy/hold spread outperforms other rules commonly used in industry. Buy/hold spreads also have the advantage that they nest the standard academic implementation: a 20/20 buy/hold rule is equivalent to the standard quintile sort. This feature makes it easy to interpret how our optimization improves on the academic benchmark.

The buy/hold spread rules only prescribe which stocks to long or short; they do not prescribe the weights of each position. For stocks that are prescribed to go long or short, we consider both equal-weighting and value-weighting stocks in our optimization. This choice allows our constrained optimal implementation to tilt toward larger and more liquid stocks if the lower cost of trading outweighs the gain in expected gross returns. One could consider a more complex weighting function, but we consider only equal- and value-weighting to avoid overfitting.
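The buy/hold spread rule described above can be sketched for the long leg as follows. This is a minimal illustration, not the paper's code: the function names, the dictionary-based inputs, and the tie-breaking implied by sorting are all assumptions, and the short leg would follow by applying the same rule to the negated signal.

```python
def top_fraction_rank(signals):
    """Map each stock to its rank from the top, as a fraction in (0, 1].

    signals: dict of stock -> signal value (higher = stronger).
    """
    ordered = sorted(signals, key=signals.get, reverse=True)
    n = len(ordered)
    return {stock: (i + 1) / n for i, stock in enumerate(ordered)}


def buy_hold_long_leg(prev_held, signals, enter=0.20, hold=0.40):
    """Long-leg membership under an enter/hold banding rule.

    A stock is bought when it ranks within the top `enter` fraction and
    is sold only once it falls outside the top `hold` fraction. Stocks
    between the two thresholds are left untouched (the inaction region).
    """
    ranks = top_fraction_rank(signals)
    new_held = set()
    for stock, rank in ranks.items():
        if rank <= enter:
            new_held.add(stock)          # enter (or keep) a top-ranked stock
        elif rank <= hold and stock in prev_held:
            new_held.add(stock)          # inaction region: keep, do not buy
    return new_held


# Hypothetical universe of ten stocks; A has the strongest signal.
example_signals = {name: 10 - i for i, name in enumerate("ABCDEFGHIJ")}
previously_held = {"C", "D", "E"}

banded = buy_hold_long_leg(previously_held, example_signals, enter=0.20, hold=0.40)
quintile = buy_hold_long_leg(previously_held, example_signals, enter=0.20, hold=0.20)
```

In this made-up example, C and D sit in the inaction region and are retained, E has fallen below the hold threshold and is sold, and setting the hold threshold equal to the enter threshold reproduces the plain quintile sort.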

Optimization proceeds in two steps. In the first step, we choose the buy/hold spread exit parameter to maximize the average in-sample net return of anomalies within turnover quartiles. We apply this optimization twice: first assuming equal-weighting and a buy/hold enter threshold of 20% using all stocks, and second assuming value-weighting and a buy/hold enter threshold of 10% using NYSE stocks only. In the second step, we choose equal- or value-weighting to maximize in-sample net returns at the anomaly level. For further details see Appendix A.4. When we examine cost optimizations that enforce value-weighting, we simply enforce value-weighting in the second step of the optimization.

Our optimization is clearly constrained. We take as given a buy/hold decision rule, the enter thresholds of these rules, and only allow for equal or value weighting. Optimizing over additional choices would, by construction, improve performance in-sample, but would lead to more overfitting. Indeed, the fact that our optimization dramatically improves net returns in-sample suggests that the cost of more overfitting outweighs the benefits. These costs tend to be large in portfolio choice (DeMiguel, Garlappi, and Uppal 2009, for example).

We optimize using only in-sample information for similar reasons. Our main object of interest is the mean net return in samples that are both post-publication and post-2005. Optimizing using only in-sample information ensures that our main object of interest is not affected by data-mining bias coming from optimization.

3. Zeroing in on the Average Anomaly

Having described our methods, we can now zero in on expected returns. We begin with academic implementations because they are widely understood and

thus are helpful for understanding how the anomaly zoo interacts with trading costs. We then present our first main result which examines cost-optimized implementations (Section 3.2).

3.1. The Average Academic Implementation

Table 2 shows that academic implementations offer no expected returns at all. Though the historical gross return (in-sample) was 66 bps per month, one should expect a net return closer to -3 bps going forward (net of costs and post-publication). Notably, our large set of anomalies produces a standard error on the post-publication net return of just 5 bps.

[Table 2 about here.]

Table 2 offers a few decompositions for understanding this lack of expected returns. The post-publication row shows that roughly half of the in-sample gross returns are eliminated by data-mining bias and changes in the investing environment, consistent with McLean and Pontiff (2016). Though this decay is large, post-publication data still imply a notable 30 bps per month of expected returns (4% per year) before trading costs.

Trading costs wipe out the remaining expected returns, however. A second decomposition shows that this return reduction (column d) is roughly equal to the product of 2-sided turnover (column c) and the average spread paid (column d). As the typical anomaly turns over 15% of its long portfolio and 15% of its short portfolio each month, the total 2-sided turnover is 30%. Multiplying this turnover by the average paid post-publication spread of 111 bps (column d) leads to the return reduction of 32 bps.

The large impact of trading costs may be surprising, since decimalization implies that the quoted spread on many stocks is just one penny. Dividing $0.01 by

the typical share price of $20 leads to a tiny spread of 5 bps, far from the 111 bps post-publication spread paid in Table 2.

Trading costs are extremely right-skewed, however, and anomaly strategies require trading stocks from all over the liquidity spectrum. Thus, the typical spread paid by an anomaly strategy is more similar to the mean spread, and much larger than the modal spread one typically sees at a brokerage.

This skewness is seen in Figure 4, which compares distributions of spreads in 2014. NYSE spreads (dotted line) display a mode at around 5 basis points, consistent with the tiny spread implied by decimalization. The NYSE contains many stocks with much larger spreads, however, as seen in the long right tail of the distribution. Indeed, about 20% of NYSE stocks have effective spreads in excess of 20 bps.

[Figure 4 about here.]

Anomaly portfolios load up on this right tail. The distribution of spreads paid by academic implementations in 2014 (solid line) shares the same mode as the NYSE distribution, but the peak is only half as tall, and the missing mass is shifted into the right tail. As a result, the mean spread paid by anomaly strategies in 2014 is 67 bps, more than 4 times the average NYSE spread of 16 bps.

While academic portfolios tend to trade stocks that are more illiquid than the NYSE, their trading costs are similar to those of the broad universe of stocks. Indeed, the anomaly paid spread distribution (solid line) lines up closely with the distribution for all stocks (dash-dotted line), and is significantly shifted to the left compared with the distribution for the Russell 2000 (dashed line).

Returning to Table 2, the “in-sample” row shows that academic implementations are not even profitable in-sample. Compared to post-publication results, turnover is about the same in-sample, but the average spread paid is more than

twiceaslarge,andthusthereturnreductiondoublesto61bpspermonth. This returnreductioneffectivelywipesoutthein-samplegrossreturn. These results suggest academic strategies naively trade stocks that are too illiquid. Butsimplyavoidingilliquidstocksmaynotbewise, aspredictabilityis muchstrongerinthemoreilliquidstocks. Indeed,Novy-MarxandVelikov(2019) findthatsimplyavoidingilliquidstocksalsonaive, asthereductioningrossreturnsisaslargeorlargerthantheimprovementintradingcosts. 3.2. TheAverageCost-OptimizedAnomaly This section presents our first main result. Here we zero in on the expected returnsofcost-optimizedimplementations. We begin by showing that our constrained optimization is very effective. Panel B of Table 2 shows that, relative to the academic implementation, optimization improves in-sample net returns by 33 bps per month, leading to a noteworthy 38 bps net return. This improvement comes from a 35% decrease inturnoveranda38%decreaseinthespreadspaid,whilethelostreturnsarejust 7bps(66-59bps). Post-publication,however,themeannetreturnisjust13bpspermonth.This negligible return comes from the fact that the gross return drops to just 20 bps post-publication. Thus,evenwithaminisculereturnreductionof8bps,thenet returnistiny. Figure 5 provides a more graphic view of this decline in performance. This figureshowsthedetailsofourestimatesasaneventstudy:weaveragenetreturns across120anomalieswithineachmonthrelativetopublication(lightline). The extremevolatilityofthelightlineisareminderthatanomaliesportfoliosarenot atallsurebets. 18
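As a quick check on the trading-cost decomposition in Section 3.1, the sketch below multiplies 2-sided turnover by the average spread paid, and also reproduces the one-penny decimalization calculation. The function names are hypothetical and are not taken from the paper's code; the inputs are the rounded figures quoted in the text, so the product (about 33 bps) differs slightly from the 32 bps reported in Table 2, presumably because of rounding.

```python
def return_reduction_bps(one_sided_turnover: float, spread_paid_bps: float) -> float:
    """Approximate monthly return reduction: 2-sided turnover times spread paid."""
    two_sided_turnover = 2.0 * one_sided_turnover  # long leg plus short leg
    return two_sided_turnover * spread_paid_bps


def quoted_spread_bps(tick_dollars: float, share_price: float) -> float:
    """Quoted spread in basis points implied by a given tick size and share price."""
    return tick_dollars / share_price * 1e4


# Rounded post-publication figures quoted in Section 3.1.
reduction = return_reduction_bps(one_sided_turnover=0.15, spread_paid_bps=111.0)
net_return = 30.0 - reduction  # gross post-publication return minus trading costs

# One-penny spread on a typical $20 stock.
penny_spread = quoted_spread_bps(tick_dollars=0.01, share_price=20.0)

print(f"return reduction: {reduction:.1f} bps, net return: {net_return:.1f} bps")
print(f"decimalization-implied spread: {penny_spread:.1f} bps")
```

The gap between the 5 bps quoted spread and the 111 bps spread paid is what the discussion of right-skewed trading costs in Section 3.1 explains.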

[Figure 5 about here.]

The dark line shows the trailing 5-year moving average net return, once again averaging across 120 anomalies. This moving average shows a sharp decline in performance, dropping from about 40 bps before publication to around 12 bps afterwards.

Returning to Table 2, the "Post-Pub & Post-2005" row further isolates expected returns by accounting for the change in trading technologies that happened during the early 2000s. This change saw an explosion in trading volume and institutional activity, which implies that pre-2005 data are unlikely to be representative of the future (Chordia, Subrahmanyam, and Tong 2014). We account for this change by limiting the data to anomaly-months that are both post-publication and post-2005.[10] In this more refined isolation, the typical anomaly is expected to return only 8 bps per month, with a standard error of just 4 bps.

Even this tiny 8 bps per month may be unachievable on larger scales, as Panel B allows for equal-weighting for ease of comparison with the broader anomalies literature. Panel C limits our cost-optimized strategies to value-weighting. There we find 4 bps per month of expected returns. Despite the small standard error of 3 bps per month, these expected returns are statistically indistinguishable from zero.

4. Zeroing in on the Strongest Anomalies

We've seen that the average anomaly's expected return is effectively zero. But what should we expect from the strongest anomalies? This section presents our second main result: the strongest anomalies' expected returns are only 10-20 bps per month.

[10] Using only post-2003 or post-2004 data leads to very similar results.

To come to this result, we need to account for data-mining bias. To understand this, it helps to examine the heterogeneity in post-publication and post-2005 ("post-pub05") mean net returns, shown in Figure 6. Some anomalies have notable net returns. Cash flow to price (CF2Price), tangibility (Tangibili), and momentum for young firms (MomYoung) all produce net returns in excess of 80 bps per month in this recent sample.

[Figure 6 about here.]

A portion of these large net returns is due to data-mining bias, however. This bias is clearly seen if we break down the mean post-pub05 net return of predictor $i$ into two components

$$\bar{r}_i = \mu_i + \varepsilon_i \tag{1}$$

where $\bar{r}_i$ is the observed mean, $\mu_i$ is the true expected return, and $\varepsilon_i$ is a zero-mean noise term due to sampling variability. Suppose we define large net returns as those where $\bar{r}_i$ is larger than the 80th percentile $\bar{r}_{80}$. Then the conditional expectation for large net returns is

$$\mathbb{E}(\bar{r}_i \mid \bar{r}_i > \bar{r}_{80}) = \mathbb{E}(\mu_i \mid \bar{r}_i > \bar{r}_{80}) + \underbrace{\mathbb{E}(\varepsilon_i \mid \bar{r}_i > \bar{r}_{80})}_{>0}. \tag{2}$$

The noise term $\mathbb{E}(\varepsilon_i \mid \bar{r}_i > \bar{r}_{80})$ is positive because mining for large mean returns also selects for large realizations of noise. As a result, the mean returns in the right tail $\mathbb{E}(\bar{r}_i \mid \bar{r}_i > \bar{r}_{80})$ are upward biased compared to their true returns $\mathbb{E}(\mu_i \mid \bar{r}_i > \bar{r}_{80})$.

We examine two approaches to removing the bias $\mathbb{E}(\varepsilon_i \mid \bar{r}_i > \bar{r}_{80})$. Section 4.1 uses an out-of-sample test, and Section 4.2 uses an empirical Bayesian adjustment. Though the methods are very different, they lead to very similar results.
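The selection effect in Equation (2) can be illustrated with a small simulation. This is only a sketch under assumed inputs: the number of simulated anomalies, the dispersion of true returns, and the noise level are hypothetical choices made for illustration and are not estimates from the paper.

```python
import random
import statistics

random.seed(0)

N_ANOMALIES = 2000    # hypothetical; larger than 120 to reduce simulation noise
TRUE_SD_BPS = 10.0    # assumed dispersion of true expected returns mu_i
NOISE_SD_BPS = 40.0   # assumed sampling noise in each observed mean

mu = [random.gauss(0.0, TRUE_SD_BPS) for _ in range(N_ANOMALIES)]
eps = [random.gauss(0.0, NOISE_SD_BPS) for _ in range(N_ANOMALIES)]
r_bar = [m + e for m, e in zip(mu, eps)]  # Equation (1)

# Select "large" observed means: those above the 80th percentile.
cutoff = sorted(r_bar)[int(0.8 * N_ANOMALIES)]
selected = [i for i in range(N_ANOMALIES) if r_bar[i] > cutoff]

mean_observed = statistics.fmean([r_bar[i] for i in selected])
mean_true = statistics.fmean([mu[i] for i in selected])
mean_noise = statistics.fmean([eps[i] for i in selected])

# Equation (2): the observed conditional mean splits into a true part and a
# noise part, and selection makes the noise part positive.
print(f"observed {mean_observed:.1f} = true {mean_true:.1f} + noise {mean_noise:.1f}")
```

With these assumed parameters most of the apparent performance of the selected group is noise, which is the bias that the two adjustments in this section aim to remove.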

4.1. Data-Mining Adjustments Using Out-of-Sample Tests

A simple way to remove the bias in Equation (2) is with an out-of-sample test. Specifically, we sort anomalies based on in-sample predictors, and then measure post-pub05 net returns within quantiles to measure conditional expectations. This exercise ensures that the data used to select anomalies is not the same as that used to evaluate them, thus eliminating data-mining bias.

Formally, suppose we use net returns as the in-sample predictor, and focus on anomalies above the 80th percentile $\bar{r}_{IS,80}$. Then the conditional expectation of the post-pub05 net return is

$$\mathbb{E}(\bar{r}_i \mid \bar{r}_i > \bar{r}_{IS,80}) = \mathbb{E}(\mu_i \mid \bar{r}_i > \bar{r}_{IS,80}) + \mathbb{E}(\varepsilon_i \mid \bar{r}_i > \bar{r}_{IS,80}) \tag{3}$$

$$= \mathbb{E}(\mu_i \mid \bar{r}_i > \bar{r}_{IS,80}), \tag{4}$$

where $\mathbb{E}(\varepsilon_i \mid \bar{r}_i > \bar{r}_{IS,80}) = 0$ because monthly stock returns are nearly i.i.d. and thus sampling error in the mean $\varepsilon_i$ is uncorrelated across the two non-overlapping samples. The sample analogue of $\mathbb{E}(\bar{r}_i \mid \bar{r}_i > \bar{r}_{IS,80})$, then, provides an unbiased estimate of the true expected return $\mathbb{E}(\mu_i \mid \bar{r}_i > \bar{r}_{IS,80})$.

We consider the following in-sample predictors: the mean net return, net Sharpe ratio, return reduction from trading costs, and turnover. In-sample net returns would predict post-pub05 net returns if $\mu_i$ is persistent across samples, and Sharpe ratios would predict for similar reasons. Trading costs should predict because net returns are the difference between gross returns and trading costs, and once again trading costs may be persistent. Turnover, finally, may predict as it is one of the components of trading costs.

Table 3 shows the results. The table shows the mean post-pub05 net return of anomalies grouped by predictor quartiles. Predictability is weak and fragile.
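The logic of Equations (3) and (4) can be sketched with simulated data: anomalies are selected on one sample and evaluated on a second, independent sample. All parameter values below are hypothetical and chosen only for illustration; the sketch assumes the noise terms are independent across the two samples, as the text argues for non-overlapping samples.

```python
import random
import statistics

random.seed(1)

N_ANOMALIES = 4000    # hypothetical; large to keep simulation noise small
TRUE_SD_BPS = 10.0    # assumed dispersion of true expected returns
NOISE_SD_BPS = 40.0   # assumed sampling noise, independent across samples

mu = [random.gauss(0.0, TRUE_SD_BPS) for _ in range(N_ANOMALIES)]
r_in = [m + random.gauss(0.0, NOISE_SD_BPS) for m in mu]   # selection sample
r_out = [m + random.gauss(0.0, NOISE_SD_BPS) for m in mu]  # evaluation sample

# Select on the in-sample mean only.
cutoff = sorted(r_in)[int(0.8 * N_ANOMALIES)]
top = [i for i in range(N_ANOMALIES) if r_in[i] > cutoff]

in_sample_mean = statistics.fmean([r_in[i] for i in top])    # biased upward
out_sample_mean = statistics.fmean([r_out[i] for i in top])  # Equation (4)
true_mean = statistics.fmean([mu[i] for i in top])

print(f"in-sample {in_sample_mean:.1f}, out-of-sample {out_sample_mean:.1f}, "
      f"true {true_mean:.1f}")
```

In this simulation the out-of-sample mean of the selected group tracks its true mean, while the in-sample mean is far larger.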

In implementations that allow for equal-weighting (Panel A), the best net returns come from using the net Sharpe ratio, with the top quartile producing expected returns of 21.2 bps per month. But the net returns from this sort are not monotonically increasing, and indeed, three out of four predictors fail to produce monotonicity. Moreover, the second strongest predictor (Turnover) produces only 14.3 bps per month in its top quartile.

[Table 3 about here.]

Predictability is essentially gone when using only value-weighting (Panel B). The net Sharpe ratio sort is very fragile, with the second quartile performing much better than the first and third, suggesting that the 11.4 bps in its top quartile cannot be trusted. Indeed, only turnover seems to produce a reliable improvement in mean returns, and it only leads to a statistically insignificant 9.7 bps per month in its top quartile.

Overall, post-pub05 mean net returns show little predictability in out-of-sample tests. Taken together, these results lead us to conclude that the strongest anomalies offer at most 10-20 bps per month, once data-mining bias is accounted for.

4.2. Data-Mining Adjustments Using Empirical Bayes

As an alternative data-mining adjustment, we study an "empirical Bayesian" estimator. This method can be motivated by Equation (2). Bias comes from the noise term $\mathbb{E}(\varepsilon_i \mid \bar{r}_i > \bar{r}_{80})$. Thus, one can remove bias by directly estimating $\mathbb{E}(\mu_i \mid \bar{r}_i > \bar{r}_{80})$. In other words, what the econometrician really wishes to know is $\mu_i$ for the strongest anomalies, and thus our goal is not the conditional sample mean $\mathbb{E}(\bar{r}_i \mid \bar{r}_i > \bar{r}_{80})$, but the conditional expectation of true returns $\mathbb{E}(\mu_i \mid \bar{r}_i > \bar{r}_{80})$.

Given an estimated model, Bayes rule provides the logic for computing this expectation. And to generate an estimated model, we specify a DGP and fit it to empirical data using frequentist methods. This combination of empirical frequentist methods and Bayesian logic gives the name "empirical Bayes." Empirical Bayes has been shown to effectively remove data-mining bias in a variety of settings (Efron 2011; Azevedo et al. 2019; Liu, Moon, and Schorfheide 2020).

We first develop the adjustment and then examine adjusted expected returns. Throughout this section, we refer to mean returns that are post-publication, post-2005, and net of trading costs. For ease of reading, we drop all of the qualifiers in what follows ("Sharpe ratio" refers to the post-publication, post-2005, net Sharpe ratio).

4.2.1. Empirical Bayes Methodology

The Sharpe ratio for predictor $i$ is normally distributed around the true Sharpe ratio

$$\frac{\bar{r}_i}{\sigma_i} \sim N\left(\frac{\mu_i}{\sigma_i},\ SE(SR_i)\right), \tag{5}$$

where $\sigma_i$ is the volatility of net returns and $SE(SR_i)$ is the standard error for Sharpe ratio $i$. The normal distribution is justified by the central limit theorem and the fact that the sample sizes are on the order of hundreds. We assume Sharpe ratios are uncorrelated across predictors, consistent with the near-zero median correlation in returns across anomalies (McLean and Pontiff 2016; Green, Hand, and Zhang 2014; Chen and Zimmermann 2019).

Modeling Sharpe ratios rather than mean returns effectively rescales portfolios to have the same volatility. We find that modeling mean returns leads to even smaller expected returns, consistent with the strong performance of net Sharpe ratios as in Table 3.

We assume $\sigma_i$ is observed. This assumption can be justified by the small standard error in sample volatility for samples of 360 months.[11] Under this assumption and the standard assumption of zero autocorrelation in monthly returns, $SE(SR_i) = SE(\bar{r}_i)/\sigma_i = 1/\sqrt{T_i}$.

True Sharpe ratios are location-scale t-distributed

$$\frac{\mu_i}{\sigma_i} \sim t\left(\mu_{SR}, \sigma_{SR}, \nu_{SR}\right), \tag{6}$$

where $\mu_{SR}$ is the location (mean), $\sigma_{SR}$ is the scale (dispersion), and $\nu_{SR}$ is the degrees of freedom parameter. This bell-shaped distribution is consistent with the data (Figure 6). Using a t-distribution allows for fat tails and thus the idea that there may be a few predictors that are truly exceptional.

Equations (5) and (6) summarize the model. The model has just three parameters: $\mu_{SR}$, $\sigma_{SR}$, and $\nu_{SR}$. For simplicity, we fix $\nu_{SR}$ at different values to examine how our results change.

Given $\nu_{SR}$, method of moments implies a simple estimate (Xie, Kou, and Brown 2012)[12]

$$\hat{\mu}_{SR} \equiv \frac{1}{N}\sum_{i=1}^{N} \frac{\bar{r}_i}{\sigma_i} \tag{8}$$

$$\hat{\sigma}^2_{SR} \equiv \max\left\{ \left(\frac{\nu_{SR}-2}{\nu_{SR}}\right) \left[ \frac{1}{N}\sum_{i=1}^{N}\left(\frac{\bar{r}_i}{\sigma_i} - \hat{\mu}_{SR}\right)^2 - \frac{1}{N}\sum_{i=1}^{N}\frac{1}{T_i} \right],\ 0 \right\}. \tag{9}$$

[11] If the monthly return is normally distributed, then sample volatility is $\hat{\sigma}_i = \frac{\sigma_i}{\sqrt{T-1}}\chi_{T-1}$. Then the standard error of $\hat{\sigma}_i$ is $0.037\,\sigma_i$ for a sample size of 30 years.

[12] To see this, note

$$\mathbb{E}\left[(\bar{r}_i/\sigma_i - \mu_{SR})^2\right] = \mathbb{E}\left[(\mu_i/\sigma_i - \mu_{SR})^2 + 2(\mu_i/\sigma_i - \mu_{SR})\delta_i + \delta_i^2\right], \tag{7}$$

where $\delta_i$ is a noise term. The cross term drops out, and then population moments are replaced by sample moments to arrive at (9). Restricting the parameter set to non-negative $\sigma^2_{SR}$ results in the max operation.

Intuitively, the grand mean is estimated using the average of all Sharpe ratios, and the scale parameter is estimated as the dispersion in Sharpe ratios $\frac{1}{N}\sum_{i=1}^{N}\left(\frac{\bar{r}_i}{\sigma_i} - \hat{\mu}_{SR}\right)^2$ that cannot be accounted for by noise $\frac{1}{N}\sum_{i=1}^{N}\frac{1}{T_i}$. Finally, the factor $\left(\frac{\nu_{SR}-2}{\nu_{SR}}\right)$ adjusts for the assumed fat tail parameter $\nu_{SR}$.

Given estimated parameters, we calculate the bias-adjusted expected return for predictor $i$ with

$$\hat{\mu}_i \equiv \mathbb{E}\left[\frac{\mu_i}{\sigma_i} \,\middle|\, \bar{r}_i, \sigma_i, \hat{\mu}_{SR}, \hat{\sigma}_{SR}, \nu_{SR}\right]\sigma_i. \tag{10}$$

That is, the bias-adjusted return is the conditional expectation of the true Sharpe ratio given all available information, rescaled by volatility. We rescale by volatility for ease of comparison with our other results.

Equation (10) is free of data-mining bias, even for predictors with large $\bar{r}_i$. This feature comes from the fact that Equation (10) already conditions on all available information. This property is sometimes considered a paradox (Dawid 1994), but Senn (2008) demonstrates that it is entirely logical. Indeed, the removal of data-mining bias using estimations analogous to Equation (10) has been demonstrated in numerous settings (Efron 2011; Azevedo et al. 2019; Liu, Moon, and Schorfheide 2020; Chen and Zimmermann 2019).

The mechanics of the adjustment can be seen in the special case $\nu_{SR} \to \infty$. In this case, normal-normal updating formulas imply

$$\hat{\mu}_i = \hat{s}_i\,\hat{\mu}_{SR}\,\sigma_i + (1-\hat{s}_i)\,\bar{r}_i \tag{11}$$

where the "shrinkage" $\hat{s}_i$ is given by

$$\hat{s}_i \equiv \frac{1/T_i}{\hat{\sigma}^2_{SR} + 1/T_i}. \tag{12}$$
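The method-of-moments estimates in Equations (8) and (9) and the shrinkage formulas in Equations (11) and (12) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and the three-anomaly inputs are hypothetical, and the shrinkage step covers only the normal-normal special case ($\nu_{SR} \to \infty$), not the general conditional expectation in Equation (10).

```python
import math
from statistics import fmean


def fit_method_of_moments(sharpe_ratios, n_months, nu_sr=math.inf):
    """Equations (8)-(9): grand mean and squared scale of true Sharpe ratios."""
    mu_sr = fmean(sharpe_ratios)
    dispersion = fmean([(s - mu_sr) ** 2 for s in sharpe_ratios])
    noise = fmean([1.0 / t for t in n_months])  # average squared standard error
    tail_factor = 1.0 if math.isinf(nu_sr) else (nu_sr - 2.0) / nu_sr
    sigma2_sr = max(tail_factor * (dispersion - noise), 0.0)
    return mu_sr, sigma2_sr


def shrink_mean_return(r_bar, sigma, t_months, mu_sr, sigma2_sr):
    """Equations (11)-(12): shrink an observed mean toward the grand mean."""
    s = (1.0 / t_months) / (sigma2_sr + 1.0 / t_months)
    return s * mu_sr * sigma + (1.0 - s) * r_bar


# Hypothetical monthly Sharpe ratios and sample lengths for three anomalies.
sharpe = [-0.2, 0.1, 0.4]
months = [100, 100, 100]
mu_sr, sigma2_sr = fit_method_of_moments(sharpe, months)

# Third anomaly: observed mean of 80 bps with volatility of 200 bps (Sharpe 0.4).
adjusted = shrink_mean_return(80.0, 200.0, 100, mu_sr, sigma2_sr)
print(f"mu_SR = {mu_sr:.2f}, sigma2_SR = {sigma2_sr:.3f}, adjusted = {adjusted:.1f} bps")

# When observed dispersion is below the average noise, the max operator binds
# and the estimated dispersion of true Sharpe ratios is zero.
_, sigma2_tight = fit_method_of_moments([0.0, 0.1, 0.2], months)
```

In the second call the observed dispersion is smaller than the noise level, so the estimate hits the zero bound and shrinkage is complete.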

Intuitively, we shrink large $\bar{r}_i$ toward the grand mean $\hat{\mu}_{SR}\sigma_i$. Predictors with smaller samples are shrunk more, as they are more vulnerable to data-mining bias. The overall shrinkage is determined by $\hat{\sigma}_{SR}$, where in the extreme case that there is no dispersion in true Sharpe ratios, shrinkage is 100%. Equation (11) shows our estimator is closely related to the celebrated James and Stein (1961) estimator. Thus, similar estimators can also be derived from quadratic loss arguments, as well as Galtonian reverse regression (Stigler 1990).

4.2.2. Empirical Bayes Results

Table 4 describes the estimation results and bias-adjusted returns. Panel A shows our baseline cost optimizations, which allow for equal-weighting. There we find that, assuming true Sharpe ratios are approximately normal ($\nu_{SR} = 100$), the standard deviation of true Sharpe ratios is 0.20 (annualized). Considering that the mean standard error on the observed net Sharpe ratio is 0.35, this implies that the adjustment is very large (Equation (11)). Indeed, 80th and 90th percentile adjusted net post-pub05 returns are only about 20 bps per month. Assuming that true Sharpe ratios are fat tailed ($\nu_{SR} = 4$) has almost no effect on the results. These results are quantitatively very similar to those from our predictability-based adjustment (Table 3).

[Table 4 about here.]

Bias adjustments for implementations that only use value-weighting (Panel B) are even stronger. Indeed, our estimates imply that there is no dispersion of true Sharpe ratios at all. This result comes from the fact that the dispersion in observed Sharpe ratios is smaller than the average standard error, and thus method of moments hits the positivity constraint on $\sigma_{SR}$. In other words, all value-weighted anomalies have the same true Sharpe ratios, and the strongest expected returns come only from taking on more volatility. As a result, the 90th percentile of adjusted net post-pub returns is just 6.4 bps per month. This result is also consistent with our predictability results, where we saw that no in-sample information is a reliable predictor of post-pub05 net returns.

The intuition for these results can be seen in Figure 6. Only 11 out of 120 anomalies produce t-stats > 2.0 in absolute value, not far from the 6 implied by a model in which there is no predictability ($\sigma_{SR} = \mu_{SR} = 0$). As a result, noise can account for most of the heterogeneity in post-pub05 performance. Bayesian logic then implies that bias adjustments are large, leading to our finding that even the strongest anomalies offer only 10-20 bps of expected returns.

4.3. Performance of Size, B/M, and Momentum

Objective statistics show that the strongest anomalies provide little in expected returns. But these results group famous anomalies like size, B/M, and momentum with lesser-known ones from the broader anomaly zoo. This section examines the performance of size, B/M, and momentum and compares our results with the literature, which tends to focus on these well-known anomalies.

We find that size, B/M, and momentum have unremarkable performance, in line with the broader anomaly zoo in post-pub05 samples. This can be seen in Figure 6, in which B/M is represented by "BM," and momentum is represented by "Mom12m." These famous anomalies lie in the middle of the distribution, centered around zero.

Table 5 takes a closer look at these anomalies. Our baseline results emphasize the post-pub05 sample, in which size, B/M, and momentum net -26 bps, 33 bps, and 16 bps, respectively. This sample corresponds to 2006-2016, as size, B/M, and momentum are all published before our post-2005 period begins.

This poor performance appears to conflict with the findings of other papers, which often find that these select anomalies perform well net of costs. In particular, Frazzini, Israel, and Moskowitz (2015) (FIM) conclude that size, value, and momentum "are robust, implementable, and sizeable in the face of transactions costs."

[Table 5 about here.]

FIM's conclusions, however, come from examining long historical samples going back to 1926, and mean net returns are highly sensitive to the sample period. Indeed, FIM's results for the 1998-2013 sample show more of a mixed result. We reprint these results in the "FIM (2015)" column of Table 5. There we see that in the more recent data, size and B/M have notable net returns, but momentum has a slightly negative net return.

This sensitivity is also seen using our methodology. While size, B/M, and momentum are unremarkable post-2005, they seem to have above-average performance in 1998-2013, as seen in the 3rd column of Table 5. Indeed, this earlier sample performance is as good as or better than that reported by FIM for the same sample period.

Consistent with these fragile results, Table 5 shows that individual anomalies produce huge standard errors of 30-60 bps per month. This sampling noise makes it impossible to tell if any individual anomaly has strong performance in the modern era of trading technology. Indeed, our post-2005 net returns are not statistically different from any of the other results shown in the table.

Overall, Table 5 highlights the importance of studying a large set of anomalies for making inference about expected returns. The performance of individual anomalies is very noisy in the post-2005 period. It is only by aggregating information over many anomalies that we can make precise measurements of what we should expect after excluding stale data.

5. Conclusion

We zero in on the expected returns of anomalies by accounting for trading costs and the staleness of historical data. Net of these effects, the expected return on even the best anomalies is effectively zero.

This conclusion comes from applying data-mining adjustments to data that includes high-frequency trading costs and a large set of anomalies. High-frequency data is necessary as low-frequency spreads are biased upward in recent years. A large set of anomalies is required as individual anomaly returns are very noisy after excluding stale data. Finally, data-mining adjustments are required to control for the bias that comes from selecting the best anomalies. Our study is unique in combining these datasets and methods.

In combination with recent findings, our results provide a complete accounting for the average return on the anomaly zoo. Previous papers show that the gross return is about 15% publication bias (McLean and Pontiff 2016; Chen and Zimmermann 2019). We find that trading costs account for another 40%, and that the remaining net returns (45%) are traded away over time, consistent with the idea that mispricing is removed as information proliferates and technology improves (Chordia, Subrahmanyam, and Tong 2014; McLean and Pontiff 2016).

This decomposition paints a picture of a dynamic equilibrium process, but one more in line with Lo's (2004) adaptive market hypothesis or "efficiently inefficient" markets (Grossman and Stiglitz 1980; Gârleanu and Pedersen 2018) than standard dynamic equilibrium models (Campbell and Cochrane 1999). Every month, researchers find imperfections in the existing market equilibrium. As information about predictability diffuses and trading technology improves, the net returns of these imperfections are traded away, leading to a new equilibrium.

A. Appendix

A.1. Description of the Anomaly Dataset

Table A.1: List of Cross-Sectional Return Predictors Part 1/3

This table lists the anomalies in our dataset. For further details, please see the Appendix of Chen and Zimmermann (2019). Freq lists the rebalancing frequencies we assume.

| Acronym | Description | Freq | Publication |
|---|---|---|---|
| AccrAbn | Abnormal Accruals | A | Xie 2001 AR |
| AccrOper | Percent Operating Accruals | A | Hafzalla et al 2011 AR |
| AccrPct | Percent Total Accruals | A | Hafzalla et al 2011 AR |
| Accruals | Accruals | A | Sloan 1996 AR |
| AdExpGr | Growth in advertising expenses | A | Lou 2014 RFS |
| AnnounRet | Earnings announcement return | Q | Chan et al 1996 JF |
| AssetCGr | Change in current operating assets | A | Richardson et al 2005 JAE |
| InvestAG | Asset Growth | A | Cooper et al 2008 JF |
| ATurn | Asset Turnover | A | Soliman 2008 AR |
| BEgrowth | Sustainable Growth | A | Lockwood Prombutr 2010 JFR |
| BetaSquared | CAPM beta squared | M | Fama MacBeth 1973 JPE |
| BidAskSpread | Bid-ask spread | M | Amihud Mendelson 1986 JFE |
| BM | Book to market | A | Fama French 1992 JF |
| BMent | Enterprise component of BM | A | Penman et al 2007 JAR |
| BMlev | Leverage component of BM | A | Penman et al 2007 JAR |
| CAPXgr | Change in capex (two years) | A | Anderson Garcia-Feijoo 2006 JF |
| Cash | Cash to assets | Q | Palazzo 2012 JFE |
| CF2Price | Cash flow to market | A | Lakonishok et al 1994 JF |
| CFOper2Price | Operating Cash flows to price | A | Desai et al 2004 AR |
| DebtFinC | Composite debt issuance | A | Lyandres Sun Zhang 2008 RFS |
| DeferRev | Deferred Revenue | A | Prakash Sinha 2012 CAR |
| DepGr | Change in depreciation to gross PPE | A | Holthausen Larcker 1992 JAE |
| EarnCons | Earnings Consistency | Q | Alwathainani 2009 BAR |
| EarnSupBig | Earnings surprise of big firms | M | Hou 2007 RFS |
| EarnSurp | Earnings Surprise | Q | Foster et al 1984 AR |
| EffFrontier | Efficient frontier index | A | Nguyen Swanson 2009 JFQA |
| EntMult | Enterprise Multiple | A | Loughran Wellman 2011 JFQA |
| EP | Earnings-to-Price Ratio | A | Basu 1977 JF |
| EPforecast | Earnings Forecast | M | Elgers Lo Pfeiffer 2001 AR |
| EPSDisp | EPS Forecast Dispersion | M | Diether et al 2002 JF |
| EPSForeLT | Long-term EPS forecast | M | La Porta 1996 JF |
| EPSrevise | Earnings forecast revisions | M | Chan et al 1996 JF |
| Eq2AGr | Change in equity to assets | A | Richardson et al 2005 JAE |
| ExcludExp | Excluded Expenses | M | Doyle et al 2003 RAS |
| ExtFinNet | Net external financing | A | Bradshaw et al 2006 JAE |
| FailurePr | Failure probability | Q | Campbell et al 2008 JF |
| FinLiabGr | Change in financial liabilities | A | Richardson et al 2005 JAE |
| GIndex | Governance Index | A | Gompers et al 2003 QJE |
| GM2SaleGr | Gross Margin growth over sales growth | A | Abarbanell Bushee 1998 AR |
| Herf | Industry concentration (Herfindahl) | A | Hou Robinson 2006 JF |
| High52 | 52 week high | M | George Hwang 2004 JF |
| IdioVol | Idiosyncratic risk | M | Ang et al 2006 JF |
| Illiquid | Amihud's illiquidity | M | Amihud 2002 JFM |
| IndMom | Industry Momentum | M | Grinblatt Moskowitz 1999 JFE |
| IndRetBig | Industry return of big firms | M | Hou 2007 RFS |

Table A.2: List of Cross-Sectional Return Predictors Part 2/3

| Acronym | Description | Freq | Publication |
|---|---|---|---|
| InstOwnSI | Inst own among high short interest | Q | Asquith Pathak Ritter 2005 JFE |
| IntanBM | Intangible return using BM | A | Daniel Titman 2006 JF |
| IntanCFP | Intangible return using CF to P | A | Daniel Titman 2006 JF |
| IntanEP | Intangible return using EP | A | Daniel Titman 2006 JF |
| IntanSP | Intangible return using Sale2P | A | Daniel Titman 2006 JF |
| InvestGr | Change in capital inv (ind adj) | A | Abarbanell Bushee 1998 AR |
| Invntory | Inventory Growth | A | Thomas Zhang 2002 RAS |
| InvToRev | Investment to revenue | A | Titman et al 2004 JFQA |
| KZ | Kaplan Zingales index | A | Lamont et al 2001 RFS |
| LaborGr | Employment growth | A | Bazdresch Belo Lin 2014 JPE |
| Leverage | Market leverage | A | Bhandari 1988 JFE |
| LiabCGr | Change in current operating liabilities | A | Richardson et al 2005 JAE |
| LTAssetGr | Change in Noncurrent Operating Assets | A | Soliman 2008 AR |
| LTNOAgr | Growth in Long term net operating assets | A | Fairfield et al 2003 AR |
| MaxRet | Maximum return over month | M | Bali et al 2010 JF |
| Mom12m | Momentum (12 month) | M | Jegadeesh Titman 1993 JF |
| Mom12to7 | Intermediate Momentum | M | Novy-Marx 2012 JFE |
| Mom1813 | Momentum-Reversal | M | De Bondt Thaler 1985 JF |
| Mom1m | Short term reversal | M | Jegadeesh 1989 JF |
| Mom36m | Long-run reversal | A | De Bondt Thaler 1985 JF |
| Mom6Jnk | Junk Stock Momentum | M | Avramov et al 2007 JF |
| Mom6m | Momentum (6 month) | M | Jegadeesh Titman 1993 JF |
| MomVol | Momentum and Volume | M | Lee Swaminathan 2000 JF |
| MomYoung | Firm Age - Momentum | M | Zhang 2004 JF |
| NDebtFin | Net debt financing | A | Bradshaw et al 2006 JAE |
| NDebtPrice | Net debt to price | A | Penman et al 2007 JAR |
| NEqFin | Net equity financing | A | Bradshaw et al 2006 JAE |
| NOA | Net Operating Assets | A | Hirshleifer et al 2004 JAE |
| NPayYield | Net Payout Yield | A | Boudoukh et al 2007 JF |
| NWCgr | Change in Net Working Capital | A | Soliman 2008 AR |
| OperLeverage | Operating Leverage | A | Novy-Marx 2010 ROF |
| OptVol | Option Volume to Stock Volume | M | Johnson So 2012 JFE |
| OptVolGr | Option Volume relative to recent average | M | Johnson So 2012 JFE |
| OrderBacklog | Order backlog | A | Rajgopal et al 2003 RAS |
| OrgCap | Organizational Capital | A | Eisfeldt Papanikolaou 2013 JF |
| OScore | OScore | A | Dichev 1998 JFE |
| PayYield | Payout Yield | A | Boudoukh et al 2007 JF |
| PensionFunding | Pension Funding Status | A | Franzoni Marin 2006 JF |
| PMGrowth | Change in Profit Margin | A | Soliman 2008 AR |
| Price | Price | M | Blume Husic 1972 JF |
| PriceDelay | Price delay | M | Hou Moskowitz 2005 RFS |
| ProfCash | Cash-based operating profitability | A | Ball et al 2016 JFE |
| ProfGross | gross profits / total assets | A | Novy-Marx 2013 JFE |
| ProfitMargin | Profit Margin | A | Soliman 2008 AR |
| ProfOper | operating profits / book equity | A | Fama French 2006 JFE |

Table A.3: List of Cross-Sectional Return Predictors Part 3/3

| Acronym | Description | Freq | Publication |
|---|---|---|---|
| RDirtSurp | Real dirty surplus | A | Landsman et al 2011 AR |
| RealEstate | Real estate holdings | A | Tuzel 2010 RFS |
| RetConglomerate | Conglomerate return | M | Cohen Lou 2012 JFE |
| Rev2Price | Sales-to-price | A | Barbee et al 1996 FAJ |
| RevG2InvG | Sales growth over inventory growth | A | Abarbanell Bushee 1998 AR |
| RevG2OHG | Sales growth over overhead growth | A | Abarbanell Bushee 1998 AR |
| RevGrowth | Revenue Growth Rank | A | Lakonishok et al 1994 JF |
| RevSurprise | Revenue Surprise | Q | Jegadeesh Livnat 2006 JFE |
| RoA | earnings / assets | Q | Balakrishnan et al 2010 JAE |
| RoE | net income / book equity | A | Haugen Baker 1996 JFE |
| Seasonality | Return Seasonality | M | Heston Sadka 2008 JFE |
| ShareIs1 | Share issuance (5 year) | A | Daniel Titman 2006 JF |
| ShareIs5 | Share issuance (1 year) | A | Pontiff Woodgate 2008 JF |
| VolumeShare | Share Volume | Q | Datar Naik Radcliffe 1998 JFM |
| ShortInterest | Short Interest | Q | Dechow et al 2001 JFE |
| Size | Size | A | Banz 1981 JFE |
| OSmirkNTM | Volatility smirk near the money | M | Xing Zhang Zhao 2010 JFQA |
| OSmirkCP | Put volatility minus call volatility | M | Yan 2011 JFE |
| Tangibility | Tangibility | A | Hahn Lee 2009 JF |
| Tax2E | Taxable income to income | A | Lev Nissim 2004 AR |
| TaxGr | Change in Taxes | Q | Thomas Zhang 2011 JAR |
| ATurnGr | Change in Asset Turnover | A | Soliman 2008 AR |
| TurnovVol | Share turnover volatility | M | Chordia et al 2001 JFE |
| CF2Pvar | Cash-flow to price variance | A | Haugen Baker 1996 JFE |
| Volume2Mkt | Volume to market equity | M | Haugen Baker 1996 JFE |
| VolumeDol | Past trading volume | M | Brennan et al 1998 JFE |
| VolumeSD | Volume Variance | M | Chordia et al 2001 JFE |
| VolumeTrend | Volume Trend | M | Haugen Baker 1996 JFE |
| ZeroTrade | Days with zero trades | M | Liu 2006 JFE |
| ZScore | Altman Z-Score | A | Dichev 1998 JFE |

A.2. Details of High Frequency Data

The HF effective spread for the kth trade of a given stock is

$$[\text{Effective Spread}]_k = 2\,\left|\log(P_k) - \log(M_k)\right|, \tag{13}$$

where $P_k$ is the price of the kth trade and $M_k$ is the midpoint of the matched consolidated best bid and offer (BBO) quote.

We use Daily TAQ (DTAQ) data with its millisecond-to-nanosecond timestamps whenever it is available (October 2003 to December 2016). Holden and Jacobsen (2014) find that DTAQ leads to a more accurate and precise measurement of effective spreads in the modern market environment relative to the Monthly TAQ (MTAQ) data with its second-level timestamps.

DTAQ spreads use Holden and Jacobsen's (2014) (HJ's) DTAQ code. ISSM and MTAQ spreads use HJ's monthly code. For pre-1999 data, we add a 2 second delay to the HJ interpolation-matching algorithm. For data in 1999-2002 we use the 1 millisecond delay following HJ's MTAQ code.

In addition to the data screens used by HJ, we also discard any spreads > 40% at the trade level (before averaging), following Abdi and Ranaldo (2017). We also adapt the mode screens to ISSM data following Lou and Shu (2014).

The details of the data cleaning are described below.

A.2.1. ISSM Data Details

We adapt HJ's MTAQ code to calculate ISSM spreads.

One of HJ's screens deletes quotes in which the offer or bid size are ≤ 0 or missing. These depth fields are missing or appear to have errors in some subsamples of the data, and we choose not to apply this screen on these subsamples.
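Equation (13) and the 40% trade-level screen can be sketched as follows. This is an illustrative sketch rather than HJ's code: the function names are hypothetical, trades are assumed to be already matched to their BBO midpoints, and the simple equal-weighted average is an assumption, since this appendix does not state the weighting used when averaging.

```python
import math

MAX_TRADE_SPREAD = 0.40  # discard trade-level spreads above 40%


def effective_spread(trade_price: float, midpoint: float) -> float:
    """Equation (13): twice the absolute log distance between price and midpoint."""
    return 2.0 * abs(math.log(trade_price) - math.log(midpoint))


def mean_effective_spread(matched_trades):
    """Average spread over (price, midpoint) pairs after the trade-level screen.

    Equal weighting is an assumption made for this sketch.
    Returns None when no trade survives the screen.
    """
    spreads = [effective_spread(p, m) for p, m in matched_trades]
    kept = [s for s in spreads if s <= MAX_TRADE_SPREAD]
    return sum(kept) / len(kept) if kept else None


# A trade one cent above a $20.00 midpoint has a spread of roughly 10 bps.
one_cent = effective_spread(20.01, 20.00)

# The second trade is 50% above its midpoint and is dropped by the screen.
average = mean_effective_spread([(20.01, 20.00), (30.00, 20.00)])
print(f"{one_cent * 1e4:.2f} bps, average after screen {average * 1e4:.2f} bps")
```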

NASDAQ stocks in ISSM from 1987-1989 are all missing depth data. Roughly half of the stocks in MTAQ from January 1, 1993 to April 5, 1993 (inclusive) have zero for all observations of depth, while close to 0% of stocks have zeros beginning April 6. HJ use the depth screen in order to avoid withdrawn quotes. We choose not to use the depth screen on these subsamples, as the noise in LF spreads is likely to be much larger than the errors introduced by withdrawn quotes.

Quotes are excluded if any of the following hold:
• Time is before 9:00 am or after 4:00 pm
• Mode is in (C, D, F, G, I, L, N, P, S, V, X, Z)
• BID > OFR and BID > 0 and OFR > 0
• BID > 0 and OFR = 0
• OFR - BID > 5 and BID > 0 and OFR > 0
• OFR ≤ 0 or missing
• BID ≤ 0 or missing
• ofrsize ≤ 0 or missing
• bidsize ≤ 0 or missing.

NASDAQ listed stocks from 1987-1989 and NYSE listed stocks in 1986 are not subject to the size filters as they are all missing ofrsize and bidsize.

Trades are kept if all of the following hold:
• Time is after 9:30 am and before 4:00 pm
• Price > 0
• Type = T
• Cond not in (C, L, N, R, O, Z) and Size > 0
• From TAQ and correction field is zero

We add a 2-second interpolated delay using Holden and Jacobsen's (2014) interpolation code.
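The ISSM quote screens listed above can be expressed as a single predicate. The sketch below is a hypothetical illustration and not HJ's or the authors' code: the `Quote` record and its field names are assumptions, and the `apply_size_filter` switch stands in for the subsamples that are exempt from the size filters.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional

EXCLUDED_MODES = {"C", "D", "F", "G", "I", "L", "N", "P", "S", "V", "X", "Z"}


@dataclass
class Quote:
    """Hypothetical quote record; field names are illustrative only."""
    t: time
    mode: str
    bid: Optional[float]
    ofr: Optional[float]
    bidsize: Optional[float]
    ofrsize: Optional[float]


def exclude_quote(q: Quote, apply_size_filter: bool = True) -> bool:
    """Return True if the quote fails any of the listed screens."""
    if q.t < time(9, 0) or q.t > time(16, 0):
        return True
    if q.mode in EXCLUDED_MODES:
        return True
    # Non-positive or missing prices; this also covers "BID > 0 and OFR = 0".
    if q.ofr is None or q.ofr <= 0 or q.bid is None or q.bid <= 0:
        return True
    # Both prices are positive from here on.
    if q.bid > q.ofr:        # crossed quote
        return True
    if q.ofr - q.bid > 5:    # implausibly wide quote
        return True
    if apply_size_filter:
        if q.ofrsize is None or q.ofrsize <= 0:
            return True
        if q.bidsize is None or q.bidsize <= 0:
            return True
    return False


good = Quote(time(10, 0), "A", 19.99, 20.01, 5, 5)
crossed = Quote(time(10, 0), "A", 20.02, 20.01, 5, 5)
no_depth = Quote(time(10, 0), "A", 19.99, 20.01, None, None)
print(exclude_quote(good), exclude_quote(crossed), exclude_quote(no_depth))
```

Passing `apply_size_filter=False` mirrors the treatment of subsamples in which the depth fields are missing, so that a quote like `no_depth` is retained.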

A.2.2. MTAQ Data Details

We follow HJ's MTAQ code to calculate MTAQ spreads. MTAQ data spans Jan 1, 1993 to Dec 31, 2014 with trades and quotes timestamped to the second.

Quotes are excluded if any of the following hold:
• Time is before 9:00 am or after 4:00 pm
• Mode is in (4, 7, 9, 11, 13, 14, 15, 19, 20, 27, 28)
• BID > OFR and BID > 0 and OFR > 0
• BID > 0 and OFR = 0
• OFR - BID > 5 and BID > 0 and OFR > 0
• OFR ≤ 0 or missing
• BID ≤ 0 or missing
• ofrsiz ≤ 0 or missing
• bidsiz ≤ 0 or missing.

Data from January 1, 1993 to April 5, 1993 are not subject to the size filters because about 50% of stocks have zero for all observations of ofrsize and bidsize during this period. In contrast, close to 0% have zeros beginning April 6, 1993, suggesting there are errors for bid and offer sizes at the beginning of the MTAQ data.

Trades are kept if all of the following hold:
• Time is after 9:30 am and before 4:00 pm
• Price > 0
• Type = T
• Corr = 0

Following Holden and Jacobsen (2014), we delay quotes as follows:
• Add 2 second interpolated delay pre-1999
• Add 1 millisecond interpolated delay based on HJ for 1999-2002

A.2.3. DTAQ Data Details

We exactly follow HJ's DTAQ code to calculate DTAQ spreads. DTAQ spans Sep 10, 2003 to the present with trades, quotes, and NBBOs originally timestamped to the millisecond. On Aug 25, 2015 the Daily TAQ timestamps were switched to the microsecond and on Oct 24, 2016 the Daily TAQ timestamps were switched to the nanosecond. Our DTAQ code uses nanosecond timestamps throughout even though some of the trailing digits will be zeros during the millisecond and microsecond eras.

Observations in the DTAQ NBBO and quote file are excluded if any of the following hold:
• Qu_Cond not in (A, B, H, O, R, W)
• Ask ≤ 0 or missing
• Asksize ≤ 0 or missing
• Bid ≤ 0 or missing
• Bidsize ≤ 0 or missing

Observations in the DTAQ NBBO are also excluded if Qu_Cancel = B. Observations in the quote file are also excluded if Bid > Ask or Ask - Bid > 5.

We also keep only quotes that meet the following additional restrictions:
• (Qu_Source = C and NatBBO_Ind = 1) or (Qu_Source = N and NatBBO_Ind = 4)
• sym_suffix = ''
• Time is between 9:00 am and 4:00 pm

Trades are kept if all of the following hold:
• Tr_Corr = 00
• price > 0
• sym_suffix = ''
• Time is between 9:30 am and 4:00 pm

Following Holden and Jacobsen (2014), we delay quotes as follows:
• Add 1 nanosecond (one-billionth of a second) delay post Oct 24, 2016
• Add 1 microsecond (one-millionth of a second) delay post Jul 24, 2015
• Add 1 millisecond (one-thousandth of a second) delay post Sep 9, 2003

Explicitly, the Holden and Jacobsen (2014) DTAQ code adds a nanosecond delay, but due to the variable data availability in DTAQ the delays are as listed above.

A.3. Details of Low Frequency Spreads

Three of our four proxies build off of Roll's (1984) classic microstructure model. The Roll model assumes that the true value of a stock follows a random walk, and that the observed trade prices deviate from the true value by the effective spread. The fourth proxy uses a completely different framework: the Kyle and Obizhaeva (2016) microstructure invariance hypothesis. All 4 proxies have been shown to be highly correlated with HF spreads.

The LF proxies we use are as follows:

1. Hasbrouck's (2009) Gibbs sampler estimate of the Roll model (Gibbs)

Hasbrouck (2009) estimates the Roll model using Bayesian methods (Gibbs sampler) and daily closing prices. Identification comes from the "bid-ask bounce," the phenomenon in which buyer-initiated trades tend to occur at higher prices than seller-initiated trades. Bid-ask bounce induces a negative serial correlation in transaction prices that is stronger for stocks that are more expensive to trade. The Bayesian approach ensures that the measured serial correlation is negative, and thus the estimated spread is well defined. Our Gibbs proxy is estimated using annual samples, following the approach recommended in Hasbrouck (2009).

Gibbs forms the basis for transaction costs in several other studies of portfolio returns, including Brandt, Santa-Clara, and Valkanov (2009); Hand and Green (2011); Novy-Marx and Velikov (2016); and DeMiguel et al. (Forthcoming).

2. Corwin and Schultz's (2012) High-Low Spread (HL)

Corwin and Schultz (2012) estimate the Roll model from daily high and low prices (hence, HL) that are available in CRSP. Identification comes from the fact that the daily high-low ratio reflects both spreads and return volatility, but these two components decay at different rates. Thus, the comparison of 1-day and 2-day price ranges provides information about the effective spread.

HL is used in many studies including Karnaukh, Ranaldo, and Soderlind (2015); McLean and Pontiff (2016); Koch, Ruenzi, and Starks (2016); and Chen and Zimmermann (2019).

3. Abdi and Ranaldo's (2017) Close-High-Low (CHL)

Abdi and Ranaldo's (2017) CHL proxy estimates the Roll model using daily closing prices as well as the daily high and low (hence, CHL). Abdi and Ranaldo's identification builds off the insight that the average of the daily high and low prices (the midpoint) contains important information about the true price. Abdi and Ranaldo (2017) show that CHL outperforms both Gibbs and HL using a number of empirical tests.

4. Volume-over-Volatility (VoV), based on Kyle and Obizhaeva's (2016) microstructure invariance hypothesis

Our last LF proxy takes a rather different approach. Rather than build off of Roll (1984), VoV is based on Kyle and Obizhaeva’s (2016) microstructure invariance hypothesis. In particular, we use Fong, Holden, and Tobek’s (2017) (FHT’s) implementation:

\[
[\mathrm{VoV}]_{i,t} = \frac{8.0\,[\text{StdDev of Daily Returns}]^{2/3}}{[\text{Mean Real Daily Dollar Volume}]^{1/3}} \tag{14}
\]

where [VoV]_{i,t} is the proxy for the effective spread for stock i in month t, the 2/3 and 1/3 exponents are predictions of Kyle and Obizhaeva’s (2016) invariance hypothesis, and the 8.0 coefficient was chosen by FHT to fit the average monthly TAQ effective spread in their U.S. sample. Nominal dollar volume is converted to real dollar volume using the CPI.

The invariance hypothesis is that the distribution of transaction costs is the same across assets and time periods when expressed in terms of “business time,” that is, the speed with which “bets” arrive at the market. This hypothesis leads to the prediction that the constant term in trading costs (alternatively, the bid-ask spread) is proportional to the right-hand side of Equation (14). Fong, Holden, and Tobek (2017) find that VoV is the best performing LF proxy among many proxies in terms of correlations and RMSE with respect to TAQ spreads.

HL and CHL both use daily high and low prices. For days in which stocks do not trade, we use the most recent observation of high and low prices. As noted in Abdi and Ranaldo (2017) and Corwin and Schultz (2012), on days in which stocks do not trade CRSP provides closing quoted spreads, and closing quoted spreads are very highly correlated with effective HF spreads in the recent sample. In these cases, we do not use the closing quoted spread, in order to keep the interpretation of our LF proxy average simple.
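Equation (14) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FHT's or the authors' code: the function name `vov_spread`, the `cpi_deflator` argument, and the use of the sample standard deviation are hypothetical choices, and the units of the inputs are left to the caller.

```python
from statistics import mean, stdev


def vov_spread(daily_returns, daily_dollar_volume, cpi_deflator=1.0, coef=8.0):
    """Sketch of Equation (14) for one stock-month.

    daily_returns: daily returns within the month.
    daily_dollar_volume: nominal daily dollar volume within the month.
    cpi_deflator: hypothetical price-level factor used to convert
        nominal volume to real volume (assumption, not from the paper).
    coef: the 8.0 coefficient reported in the text.
    """
    sigma = stdev(daily_returns)  # sample std dev; an assumption
    real_volume = [v / cpi_deflator for v in daily_dollar_volume]
    mean_volume = mean(real_volume)
    if mean_volume <= 0:
        return None  # proxy undefined without positive volume
    return coef * sigma ** (2 / 3) / mean_volume ** (1 / 3)


returns = [0.01, -0.01, 0.02, -0.02]
base = vov_spread(returns, [1e6] * 4)
# The 1/3 exponent means eight times the volume halves the proxy.
high_volume = vov_spread(returns, [8e6] * 4)
```

The last two lines illustrate the scaling implied by the exponents: holding volatility fixed, the proxy falls with the cube root of real dollar volume.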

The LF proxies require multiple firm-day observations to compute a spread for a given firm-month. We follow the original papers and do not compute the proxy if the data is insufficient. Specifically, HL requires 12 daily observations, CHL requires 12 eligible days following the definition in Abdi and Ranaldo (2017), VoV requires 5 positive volume and 11 non-zero return observations, and Gibbs requires the sampler to converge.

We compute a LF average if we have at least one LF proxy with data. In 12.24% of observations, all LF and HF spreads are missing data. These missing observations have little effect on our main results, however, as only 0.27% of post-1993 observations are missing, and 90% of our anomalies are published after 1993. If ISSM, TAQ, and the LF spreads are all missing, we match the firm to the nearest firm with available data in terms of Euclidean distance of market equity rank and idiosyncratic volatility rank. If idiosyncratic volatility is missing, we use just the market equity rank. This data filling procedure follows Novy-Marx and Velikov (2016).

A.4. Details of Cost Optimization

Table A.4 illustrates the first step of our optimization. Panel A shows the net returns of equal-weighted quintile strategies within turnover quartiles, after implementing a variety of buy/hold spreads. The panel shows that buy/hold spreads improve the net returns of high turnover anomalies, but do not help much among anomalies with low turnover. Anomalies in the 3rd turnover quartile perform best on average using a 20/35 buy/hold spread; that is, long positions are entered in the top 20th percentile of the anomaly signal and are only exited when they drop below the top 35th percentile. 4th turnover quartile anomalies benefit significantly from a 20/50 buy/hold spread, but they do not produce positive net returns on average.
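The buy/hold rule described above can be sketched as a single rebalancing step for the long leg. This is a minimal illustration, not the authors' implementation: the function name `update_long_leg`, the percentile convention (0 is the strongest signal, so "top 20th percentile" means a value of at most 20), and the tickers are all assumptions.

```python
def update_long_leg(held, top_pct, buy=20.0, hold=35.0):
    """One rebalance of the long leg under a buy/hold spread.

    held: set of stocks currently in the long leg.
    top_pct: dict mapping stock -> signal percentile measured from
        the top (0 = strongest signal); a hypothetical convention.
    buy: stocks at or inside this percentile are bought.
    hold: stocks already held are kept until they fall outside it.
    """
    new_held = set()
    for stock, pct in top_pct.items():
        if pct <= buy:
            new_held.add(stock)  # inside the buy region: enter or stay
        elif stock in held and pct <= hold:
            new_held.add(stock)  # held and still inside the hold band
    return new_held


held = {"A", "B"}
signal = {"A": 30.0, "B": 40.0, "C": 10.0, "D": 30.0}
with_spread = update_long_leg(held, signal, buy=20.0, hold=35.0)
no_spread = update_long_leg(held, signal, buy=20.0, hold=20.0)
```

With the 20/35 rule, stock A is kept even though it has slipped out of the top 20th percentile, which is how the spread reduces turnover. Setting `hold` equal to `buy` recovers the plain quintile sort, under which A would be sold. The short leg would mirror this logic from the bottom of the signal distribution.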

Panel B shows that buy/hold spreads are reliably effective for value-weighted NYSE decile strategies. As with equal-weighted quintiles, buy/hold spreads do not significantly improve the net returns of anomalies with below-median turnover. Buy/hold spreads produce significantly positive net returns for 3rd turnover quartile anomalies and even the 4th turnover quartile anomalies, however. These results are consistent with Novy-Marx and Velikov (2016), who also find that the trading costs for low turnover anomalies are too small to justify implementing a buy/hold spread.

Bold numbers indicate the best-performing buy/hold spreads for each stock weighting and turnover quartile combination. In the last step of our cost mitigation, we choose the stock weighting and breakpoint choice that maximizes the net return in-sample, given the bold buy/hold spreads in Table A.4. This last step of the optimization is done at the anomaly level, and is not shown in the table.

Figures A.2 and A.3 show that our cost mitigation is effective in-sample. The figures show the distribution of in-sample net returns before (Figure A.2) and after (Figure A.3) cost mitigation. Rather than use bars to indicate the histogram counts, we list acronyms, with each acronym identifying a different anomaly. Full references for each acronym are found in Appendix A.1.

Figure A.2 shows that net returns before cost mitigation feature a long left tail. While most anomalies have positive net returns ranging between 0 and 60 bps per month, many anomalies have very negative net returns of -50 to -300 bps. Averaging across all anomalies leads to the tiny net return of 5 bps per month in Table 2.

Anomalies with above-median turnover are shown in bold. These high turnover anomalies occupy the vast majority of the left tail of net returns. They include many momentum anomalies like 12-month momentum (Mom12m) and momentum among junk-rated firms (Mom6Jnk), but also include a variety of unrelated anomalies like idiosyncratic volatility (IdioVol), earnings forecast dispersion (EPSDisp), and detrended trading volume (VolumeTre). Persistent anomaly signals like B/M (BM) and size (Size) are little affected by bid-ask spreads and occupy the right tail of this distribution.

Cost mitigation should be very helpful with this left tail of net returns. As seen in Table A.4, value-weighting combined with a buy/hold spread produces positive net returns even among anomalies in the highest turnover quartile.

Indeed, Figure A.3 shows that our cost mitigation is quite effective in-sample. The long left tail of net returns from Figure A.2 is gone. As a result, the average anomaly net return increases to a notable 38 bps per month.

Cost mitigation techniques used on each anomaly are also shown in Figure A.3. Anomalies that use value-weighting are shown in italics. Strategies that use buy/hold spreads larger than 5 percentage points are underlined. We do not underline equal-weighted 20/25 buy/hold spreads as the improvement in net returns is very small (Table A.4).

60% of anomalies perform best using value-weighting once trading costs are accounted for. A large fraction of these anomalies work best with a combination of value-weighting and a buy/hold spread. Indeed, most of the anomalies with negative net returns before optimization (bold) become profitable once both of these techniques are applied.

The anomalies that are rescued by cost mitigation include the momentum anomalies (Mom6m, Mom12m, Mom6Jnk, etc.). Indeed, momentum anomalies move from among the worst performers using the academic strategies to among the best performers once value-weighting and buy/hold spreads are applied. Other anomalies that are significantly improved by cost mitigation include idiosyncratic volatility (IdioVol), the distress anomaly (FailurePr), and the forecasted earnings-price ratio (EPforecas).

Still, there are a few anomalies that cost mitigation cannot resuscitate. Many of these are related to information diffusion, such as price delay (PriceDela) or the earnings surprise of matched large firms (EarnSupBig). Intuitively, profiting on slow information diffusion may require trading neglected and illiquid stocks, as well as frequent trading.

The net returns in Figure A.3 are largely not available to the public, however. Many readers may not be able to trade on the anomalies until after they are published. Even the academics who developed the original strategies in Figure A.3 likely cannot earn the in-sample profits, as the strategies were developed toward the end of the in-sample period.

Table A.4: Optimizing Buy-Hold Spreads: Mean Net Returns In-Sample by Turnover Quartile

The table shows mean net returns in-sample for various buy/hold spread trading rules within turnover quartiles. Bold numbers indicate the best-performing buy/hold spread for each turnover quartile. Turnover quartiles are calculated using the EW quintile benchmark (panel A) and the VW NYSE deciles (panel B). For buy/hold spreads in panel A, we enter a long position for stocks that enter the top 20th percentile of the anomaly signal, but only exit the long position when the stock drops below the percentile indicated by the buy/hold lower bound in the table. Similarly, we enter short positions when stocks enter the bottom 20th percentile, but only exit when stocks rise above the indicated buy/hold lower bound. Panel B enters long positions when stocks enter the 10th NYSE percentile and exits when the stock drops below the NYSE percentile indicated by the buy/hold lower bound.

Panel A: EW Quintiles
                      Buy/Hold Lower Bound
Turnover Quartile     20      25      30      35      40      45      50
Q1                   0.39    0.39    0.38    0.37    0.36    0.34    0.33
Q2                   0.31    0.32    0.31    0.31    0.30    0.29    0.28
Q3                   0.12    0.16    0.17    0.18    0.17    0.17    0.17
Q4                  -0.65   -0.51   -0.41   -0.34   -0.29   -0.24   -0.21

Panel B: VW NYSE Deciles
                      Buy/Hold Lower Bound
Turnover Quartile     10      20      30      40      50
Q1                   0.33    0.28    0.26    0.24    0.23
Q2                   0.34    0.32    0.30    0.26    0.22
Q3                   0.16    0.23    0.22    0.19    0.19
Q4                   0.07    0.23    0.28    0.31    0.32
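The first optimization step amounts to picking, for each turnover quartile, the buy/hold lower bound with the highest mean in-sample net return. A small sketch of that selection, using the Q3 and Q4 rows of Table A.4, is given here. The helper name `best_bound` is hypothetical, and only the two high-turnover quartiles are included because ties at the table's two-decimal precision make the low-turnover choices ambiguous.

```python
# Mean in-sample net returns copied from the Q3 and Q4 rows of Table A.4.
PANEL_A_BOUNDS = [20, 25, 30, 35, 40, 45, 50]
PANEL_A = {
    "Q3": [0.12, 0.16, 0.17, 0.18, 0.17, 0.17, 0.17],
    "Q4": [-0.65, -0.51, -0.41, -0.34, -0.29, -0.24, -0.21],
}
PANEL_B_BOUNDS = [10, 20, 30, 40, 50]
PANEL_B = {
    "Q3": [0.16, 0.23, 0.22, 0.19, 0.19],
    "Q4": [0.07, 0.23, 0.28, 0.31, 0.32],
}


def best_bound(bounds, net_returns):
    """Return the buy/hold lower bound with the highest mean net return."""
    best_index = max(range(len(bounds)), key=lambda i: net_returns[i])
    return bounds[best_index]


ew_choice = {q: best_bound(PANEL_A_BOUNDS, r) for q, r in PANEL_A.items()}
vw_choice = {q: best_bound(PANEL_B_BOUNDS, r) for q, r in PANEL_B.items()}
```

For the equal-weighted panel this selects lower bounds of 35 and 50 for the 3rd and 4th turnover quartiles, matching the 20/35 and 20/50 spreads discussed in the text.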

Figure A.2: Distribution of Net Returns: In-Sample, Before Cost Optimization. We adjust anomaly returns for effective bid-ask spreads (Figure 3). All portfolios use equal-weighted quintile sorts, following the modal approach in the literature. Anomalies with above-median turnover (15% per month, two-sided) are shown in bold. Hash marks indicate larger bins. Published anomaly strategies have a long left tail in net returns, and produce an average net return of only 5 bps per month.

[Figure: histogram in which each count is drawn as an anomaly acronym. Horizontal axis: Net Return In-Sample (bps per month). Vertical axis: Count. Legend: bold = Turnover > 15% per month.]

Figure A.3: Cost Optimization Results: Distribution of Net Returns: In-Sample. We mitigate transaction costs by applying value-weighting and/or buy/hold spreads to 120 anomaly portfolios. Buy/hold spreads are chosen to maximize net returns in-sample following Table A.4. Stock weighting is chosen to maximize the in-sample net return given the optimized buy/hold spread. Italicized anomalies benefit from value-weighting. Underlined anomalies benefit from buy/hold spreads. Bold indicates anomalies with negative net returns before cost mitigation. Hash marks indicate larger bins. Cost mitigation leads to positive net returns for the vast majority of anomalies, and raises the average net return to 38 bps per month.

[Figure: histogram in which each count is drawn as an anomaly acronym. Horizontal axis: Net Return In-Sample (bps per month). Vertical axis: Count. Legend: italics = Value-Weighted; underline = Buy/Hold Spread; bold = Net Ret < 0 Before Mitigation.]

A.5. Additional Results

Figure A.4: Distribution of Publication Years.

Figure A.5: Distribution of Post-Publication Sample Lengths.

[Figure: horizontal axis: Years Since Publication. Vertical axis: # of Anomalies with Returns.]

Table A.5: Returns Gross and Net of Trading Costs: Post-Pub and Post-2005

This table shows the same calculations as Table 2 but uses post-2005 data only.

                         (a)       (b)         (c)          (d)≈(b)×(c)   (e)=(a)-(d)
                         Gross     Turnover    Ave Spread   Return        Net
                         Return    (2-sided)   Paid         Reduction     Return

Panel A: Equal-Weighted Long-Short Quintiles
Post-Pub & Post-2005     0.30      0.30        1.11         0.32          -0.03
                        (0.10)    (0.10)      (0.10)       (0.10)        (0.10)

Panel B: Cost-Mitigated using Value-Weighting and Buy/Hold Spreads
Post-Pub & Post-2005     0.20      0.20        0.60         0.08          0.13
                        (0.10)    (0.10)      (0.10)       (0.10)        (0.10)

Panel C: Cost-Mitigated using Buy/Hold Spreads, Value-Weighted only
Post-Pub & Post-2005     0.12      0.19        0.31         0.05          0.07
                        (0.10)    (0.10)      (0.10)       (0.10)        (0.10)

References

Abdi, Farshid and Angelo Ranaldo. “A Simple Estimation of Bid-Ask Spreads from Daily Close, High, and Low Prices”. The Review of Financial Studies 30.12 (2017), pp. 4437–4480.
Azevedo, Eduardo M et al. “Empirical Bayes Estimation of Treatment Effects with Many A/B Tests: An Overview”. AEA Papers and Proceedings. Vol. 109. 2019, pp. 43–47.
Ball, Ray, SP Kothari, and Jay Shanken. “Problems in measuring portfolio performance: An application to contrarian investment strategies”. Journal of Financial Economics 38.1 (1995), pp. 79–107.
Barber, Brad M, Terrance Odean, and Ning Zhu. “Do retail trades move markets?” The Review of Financial Studies 22.1 (2008), pp. 151–186.
Bates, John M and Clive WJ Granger. “The combination of forecasts”. Journal of the Operational Research Society 20.4 (1969), pp. 451–468.
Brandt, Michael W, Pedro Santa-Clara, and Rossen Valkanov. “Parametric portfolio policies: Exploiting characteristics in the cross-section of equity returns”. The Review of Financial Studies 22.9 (2009), pp. 3411–3447.
Briere, Marie et al. “Stock Market Liquidity and the Trading Costs of Asset Pricing Anomalies”. Available at SSRN 3380239 (2019).
Campbell, John Y and John H Cochrane. “By Force of Habit: A Consumption-Based Explanation of Aggregate Stock Market Behavior”. Journal of Political Economy 107.2 (1999), pp. 205–251.
Chen, Andrew Y and Tom Zimmermann. “Publication Bias and the Cross-Section of Stock Returns”. The Review of Asset Pricing Studies (Nov. 2019). raz011. ISSN: 2045-9920. eprint: https://academic.oup.com/raps/advance-article-pdf/doi/10.1093/rapstu/raz011/31618483/raz011.pdf. URL: https://doi.org/10.1093/rapstu/raz011.
Chordia, Tarun, Avanidhar Subrahmanyam, and Qing Tong. “Have capital market anomalies attenuated in the recent era of high liquidity and trading activity?” Journal of Accounting and Economics 58.1 (2014), pp. 41–58.
Chu, Yongqiang, David Hirshleifer, and Liang Ma. The causal effect of limits to arbitrage on asset pricing anomalies. Tech. rep. National Bureau of Economic Research, 2017.
Cohen, Lauren, Karl B Diether, and Christopher J Malloy. “Supply and demand shifts in the shorting market”. The Journal of Finance 62.5 (2007), pp. 2061–2096.
Cont, Rama and Arseniy Kukanov. “Optimal order placement in limit order markets”. Quantitative Finance 17.1 (2017), pp. 21–39.
Corwin, Shane A and Paul Schultz. “A simple way to estimate bid-ask spreads from daily high and low prices”. The Journal of Finance 67.2 (2012), pp. 719–760.
Dawid, AP. “Selection paradoxes of Bayesian inference”. Lecture Notes-Monograph Series (1994), pp. 211–220.
DeMiguel, Victor, Lorenzo Garlappi, and Raman Uppal. “Optimal versus naive diversification: How inefficient is the 1/N portfolio strategy?” The Review of Financial Studies 22.5 (2009), pp. 1915–1953.
DeMiguel, Victor et al. “A portfolio perspective on the multitude of firm characteristics” (Forthcoming).
Drechsler, Itamar and Qingyi Freda Drechsler. “The Shorting Premium and Asset Pricing Anomalies” (2016).
Efron, Bradley. Large-scale inference: empirical Bayes methods for estimation, testing, and prediction. Vol. 1. Cambridge University Press, 2012.

Efron, Bradley. “Tweedie’s formula and selection bias”. Journal of the American Statistical Association 106.496 (2011), pp. 1602–1614.
Fama, Eugene F and James D MacBeth. “Risk, return, and equilibrium: Empirical tests”. The Journal of Political Economy (1973), pp. 607–636.
Feng, Guanhao, Stefano Giglio, and Dacheng Xiu. “Taming the Factor Zoo” (2017).
Fong, Kingsley YL, Craig W Holden, and Charles A Trzcinka. “What are the best liquidity proxies for global research?” Review of Finance 21.4 (2017), pp. 1355–1401.
Fong, Kingsley, Craig Holden, and Ondrej Tobek. “Are Volatility Over Volume Liquidity Proxies Useful For Global Or US Research?” (2017).
Frazzini, Andrea, Ronen Israel, and Tobias J Moskowitz. “Trading costs”. Available at SSRN 3229719 (2018).
Frazzini, Andrea, Ronen Israel, and Tobias J Moskowitz. “Trading costs of asset pricing anomalies” (2015).
Freyberger, Joachim, Andreas Neuhierl, and Michael Weber. Dissecting characteristics nonparametrically. Tech. rep. National Bureau of Economic Research, 2017.
Gârleanu, Nicolae and Lasse Heje Pedersen. “Efficiently inefficient markets for assets and asset management”. The Journal of Finance 73.4 (2018), pp. 1663–1712.
Glosten, Lawrence R and Paul R Milgrom. “Bid, ask and transaction prices in a specialist market with heterogeneously informed traders”. Journal of Financial Economics 14.1 (1985), pp. 71–100.
Gompers, Paul, Joy Ishii, and Andrew Metrick. “Corporate governance and equity prices”. The Quarterly Journal of Economics 118.1 (2003), pp. 107–156.

Goyenko, Ruslan Y, Craig W Holden, and Charles A Trzcinka. “Do liquidity measures measure liquidity?” Journal of Financial Economics 92.2 (2009), pp. 153–181.
Green, Jeremiah, John RM Hand, and Frank Zhang. “The remarkable multidimensionality in the cross-section of expected US stock returns”. Available at SSRN 2262374 (2014).
Green, Jeremiah, John RM Hand, and X Frank Zhang. “The characteristics that provide independent information about average US monthly stock returns”. The Review of Financial Studies (2017), hhx019.
Green, Jeremiah, John RM Hand, and X Frank Zhang. “The supraview of return predictive signals”. Review of Accounting Studies 18.3 (2013), pp. 692–730.
Grossman, Sanford J and Joseph E Stiglitz. “On the impossibility of informationally efficient markets”. The American Economic Review 70.3 (1980), pp. 393–408.
Hand, John RM and Jeremiah Green. “The importance of accounting information in portfolio optimization”. Journal of Accounting, Auditing & Finance 26.1 (2011), pp. 1–34.
Hanna, J Douglas and Mark J Ready. “Profitable predictability in the cross section of stock returns”. Journal of Financial Economics 78.3 (2005), pp. 463–505.
Harvey, Campbell R, Yan Liu, and Heqing Zhu. “... and the cross-section of expected returns”. The Review of Financial Studies 29.1 (2016), pp. 5–68.
Hasbrouck, Joel. “Trading costs and returns for US equities: Estimating effective costs from daily data”. The Journal of Finance 64.3 (2009), pp. 1445–1477.
Holden, Craig W and Stacey Jacobsen. “Liquidity measurement problems in fast, competitive markets: Expensive and cheap solutions”. The Journal of Finance 69.4 (2014), pp. 1747–1785.

Hong, Harrison and Marcin Kacperczyk. “The price of sin: The effects of social norms on markets”. Journal of Financial Economics 93.1 (2009), pp. 15–36.
Hou, Kewei, Sehoon Kim, and Ingrid M Werner. “(Priced) Frictions” (2016).
Hou, Kewei, Chen Xue, and Lu Zhang. Replicating Anomalies. Tech. rep. National Bureau of Economic Research, 2017.
Huang, Jing-Zhi and Zhijian James Huang. “Real-Time Profitability of Published Anomalies: An Out-of-Sample Test”. Quarterly Journal of Finance 3 (2013).
Institute for the Study of Security Markets. NYSE/AMEX and NASDAQ Historical Tick Data. Wharton Research Data Services, http://www.whartonwrds.com/datasets/crsp/.
Jacobs, Heiko and Sebastian Müller. “Anomalies across the globe: Once public, no longer existent?” (2017).
Jahan-Parvar, Mohammad and Filip Zikes. “When do low-frequency measures really measure transaction costs?” (2019).
James, William and Charles Stein. “Estimation with quadratic loss”. Proceedings of the fourth Berkeley symposium on mathematical statistics and probability. Vol. 1. 1961, pp. 361–379.
Karnaukh, Nina, Angelo Ranaldo, and Paul Soderlind. “Understanding FX liquidity”. The Review of Financial Studies 28.11 (2015), pp. 3073–3108.
Kelly, Bryan and Hao Jiang. “Tail risk and asset prices”. Review of Financial Studies (2014), hhu039.
Knez, Peter J and Mark J Ready. “Estimating the profits from trading strategies”. The Review of Financial Studies 9.4 (1996), pp. 1121–1163.
Koch, Andrew, Stefan Ruenzi, and Laura Starks. “Commonality in liquidity: a demand-side explanation”. The Review of Financial Studies 29.8 (2016), pp. 1943–1974.

Korajczyk, Robert A and Ronnie Sadka. “Are momentum profits robust to trading costs?” The Journal of Finance 59.3 (2004), pp. 1039–1082.
Kyle, Albert S and Anna A Obizhaeva. “Market microstructure invariance: Empirical hypotheses”. Econometrica 84.4 (2016), pp. 1345–1404.
Leland, Hayne. “Optimal portfolio implementation with transactions costs and capital gains taxes”. Haas School of Business Technical Report (2000).
Lesmond, David A, Michael J Schill, and Chunsheng Zhou. “The illusory nature of momentum profits”. Journal of Financial Economics 71.2 (2004), pp. 349–380.
Liu, Hong. “Optimal consumption and investment with transaction costs and multiple risky assets”. The Journal of Finance 59.1 (2004), pp. 289–338.
Liu, Laura, Hyungsik Roger Moon, and Frank Schorfheide. “Forecasting with dynamic panel data models”. Econometrica 88.1 (2020), pp. 171–201.
Lo, Andrew W. “The Adaptive Markets Hypothesis: Market Efficiency from an Evolutionary Perspective”. Journal of Portfolio Management 30 (2004), pp. 15–29.
Lou, Xiaoxia and Tao Shu. “Price impact or trading volume: Why is the Amihud (2002) illiquidity measure priced”. Available at SSRN 2291942 (2014).
Magill, Michael JP and George M Constantinides. “Portfolio selection with transactions costs”. Journal of Economic Theory 13.2 (1976), pp. 245–263.
Marquering, Wessel, Johan Nisser, and Toni Valla. “Disappearing anomalies: a dynamic analysis of the persistence of anomalies”. Applied Financial Economics 16.4 (2006), pp. 291–302.
McLean, R David. “Idiosyncratic risk, long-term reversal, and momentum”. Journal of Financial and Quantitative Analysis 45.4 (2010), pp. 883–906.
McLean, R David and Jeffrey Pontiff. “Does academic research destroy stock return predictability?” The Journal of Finance 71.1 (2016), pp. 5–32.

Moallemi, Ciamac C. and Mehmet Saglam. “Dynamic Portfolio Choice with Linear Rebalancing Rules”. Journal of Financial and Quantitative Analysis 52.3 (2017), pp. 1247–1278.
New York Stock Exchange. Daily TAQ (Historical Trades Quotes). Wharton Research Data Services, http://www.whartonwrds.com/datasets/crsp/.
New York Stock Exchange. Monthly TAQ (Historical Trades Quotes). Wharton Research Data Services, http://www.whartonwrds.com/datasets/crsp/.
Novy-Marx, Robert and Mihail Velikov. “A taxonomy of anomalies and their trading costs”. Review of Financial Studies 29.1 (2016), pp. 104–147.
Novy-Marx, Robert and Mihail Velikov. “Comparing Cost-Mitigation Techniques”. Financial Analysts Journal 75.1 (2019), pp. 85–102.
Patton, Andrew J and Brian M Weller. “What you see is not what you get: The costs of trading market anomalies” (2017).
Perold, Andre F. “The implementation shortfall: Paper versus reality”. Journal of Portfolio Management 14.3 (1988), p. 4.
Pontiff, Jeffrey and Michael Schill. “Long-run seasoned equity offering returns: Data snooping, model misspecification, or mispricing? A costly arbitrage approach” (2001).
Ritter, Jay R. “The long-run performance of initial public offerings”. The Journal of Finance 46.1 (1991), pp. 3–27.
Roll, Richard. “A simple implicit measure of the effective bid-ask spread in an efficient market”. The Journal of Finance 39.4 (1984), pp. 1127–1139.
Schultz, Paul. “Transaction costs and the small firm effect: A comment”. Journal of Financial Economics 12.1 (1983), pp. 81–88.
Schwert, G William. “Anomalies and market efficiency”. Handbook of the Economics of Finance 1 (2003), pp. 939–974.

Senn, Stephen. “A note concerning a selection “paradox” of Dawid’s”. The American Statistician 62.3 (2008), pp. 206–210.
Stigler, Stephen M. “The 1988 Neyman memorial lecture: a Galtonian perspective on shrinkage estimators”. Statistical Science 5.1 (1990), pp. 147–155.
Stoll, Hans R. “Market microstructure”. Handbook of the Economics of Finance. Vol. 1. Elsevier, 2003, pp. 553–604.
Stoll, Hans R and Robert E Whaley. “Transaction costs and the small firm effect”. Journal of Financial Economics 12.1 (1983), pp. 57–79.
Timmermann, Allan. “Forecast combinations”. Handbook of Economic Forecasting 1 (2006), pp. 135–196.
WRDS: Center for Research in Security Prices. CRSP/Compustat Merged Database. Wharton Research Data Services, http://www.whartonwrds.com/datasets/crsp/.
Xie, Xianchao, SC Kou, and Lawrence D Brown. “SURE estimates for a heteroscedastic hierarchical model”. Journal of the American Statistical Association 107.500 (2012), pp. 1465–1479.

Tables and Figures

Table 1: Correlations Between Low-Frequency Proxies and High-Frequency Effective Bid-Ask Spreads

Correlations are pooled. We examine four low frequency proxies for spreads: Gibbs is Hasbrouck’s (2009) Gibbs estimate of the Roll model, HL is Corwin and Schultz’s (2012) high-low spread, CHL is Abdi and Ranaldo’s (2017) close-high-low, and VoV (volume-over-volatility) is Fong, Holden, and Tobek’s (2017) implementation of Kyle and Obizhaeva’s (2016) microstructure invariance hypothesis. LF_Ave is the equal-weighted average of the four low frequency proxies. TAQ and ISSM are computed from high-frequency data. The low frequency measures are imperfectly correlated, suggesting that they contain distinct information. LF_Ave has the highest correlation with high-frequency spreads. LF spread data are available at http://sites.google.com/site/chenandrewy/code-and-data/.

Panel A: LF spread correlations (1926-2017; 2,114,436 obs.)
          Gibbs   HL      CHL     VoV
Gibbs     1.00
HL        0.68    1.00
CHL       0.76    0.88    1.00
VoV       0.75    0.59    0.74    1.00

Panel B: Correlations with TAQ (1993-2014; 1,183,068 obs.)
          TAQ     Gibbs   HL      CHL     VoV     LF_Ave
TAQ       1.00
Gibbs     0.84    1.00
HL        0.71    0.67    1.00
CHL       0.80    0.74    0.88    1.00
VoV       0.84    0.73    0.60    0.75    1.00
LF_Ave    0.90    0.90    0.86    0.93    0.87    1.00

Panel C: Correlations with ISSM (1983-1992; 262,381 obs.)
          ISSM    Gibbs   HL      CHL     VoV     LF_Ave
ISSM      1.00
Gibbs     0.88    1.00
HL        0.84    0.79    1.00
CHL       0.90    0.84    0.92    1.00
VoV       0.86    0.82    0.66    0.78    1.00
LF_Ave    0.94    0.95    0.90    0.95    0.88    1.00

Table 2: Zeroing in on the Average Anomaly’s Expected Return

We estimate the average net return (e) of 120 anomaly long-short portfolios after accounting for effective bid-ask spreads and stale data. All figures are in bps per month except for turnover, which is a ratio per month. Figures average across months and then across anomalies, with standard errors in parentheses. Panel A examines the typical academic implementation (Section 2.3.1). Panels B and C examine cost-optimized implementations (Section 2.3.2). Columns (a)-(d) report an approximate net return decomposition. Anomalies are drawn from McLean and Pontiff (2016), Green, Hand, and Zhang (2017), and Hou, Xue, and Zhang (2017) (Section 2.1, Tables A.1-A.3). After accounting for trading costs and stale data, the expected return is approximately zero. Source: Center for Research in Security Prices, New York Stock Exchange, and Institute for the Study of Security Markets.

                         (a)       (b)         (c)          (d)≈(b)×(c)   (e)=(a)-(d)
                         Gross     Turnover    Ave Spread   Return        Net
                         Return    (2-sided)   Paid         Reduction     Return

Panel A: Equal-Weighted Long-Short Quintiles
In-Sample                66        0.31        219          61            5
                         (4)       (0.04)      (6)          (7)           (6)
Post-Publication         30        0.30        111          32            -3
                         (4)       (0.04)      (6)          (5)           (5)

Panel B: Cost-Optimized
In-Sample                59        0.20        136          21            38
                         (4)       (0.02)      (7)          (2)           (3)
Post-Publication         20        0.20        60           8             13
                         (4)       (0.02)      (6)          (1)           (4)
Post-Pub & Post-2005     14        0.20        46           6             8
                         (4)       (0.02)      (4)          (1)           (4)

Panel C: Cost-Optimized, Value-Weighted only
In-Sample                46        0.20        86           16            30
                         (4)       (0.02)      (5)          (2)           (3)
Post-Publication         12        0.19        31           5             7
                         (3)       (0.02)      (5)          (1)           (3)
Post-Pub & Post-2005     7         0.19        21           3             4
                         (3)       (0.02)      (3)          (0)           (3)
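The column identities in Table 2 can be checked with a short script. This is an illustrative sketch only: the `Row` class and helper names are hypothetical, and the numbers are copied from the rounded table entries, so (e) = (a) - (d) is expected to hold only up to rounding. The product (b)×(c) is computed for comparison with (d) and is not expected to match it exactly, since the caption describes the decomposition as approximate.

```python
from dataclasses import dataclass


@dataclass
class Row:
    label: str
    gross: float      # (a), bps per month
    turnover: float   # (b), ratio per month
    spread: float     # (c), bps
    reduction: float  # (d), bps per month
    net: float        # (e), bps per month


# Rounded point estimates copied from Table 2.
TABLE_2 = [
    Row("A: In-Sample", 66, 0.31, 219, 61, 5),
    Row("A: Post-Publication", 30, 0.30, 111, 32, -3),
    Row("B: In-Sample", 59, 0.20, 136, 21, 38),
    Row("B: Post-Publication", 20, 0.20, 60, 8, 13),
    Row("B: Post-Pub & Post-2005", 14, 0.20, 46, 6, 8),
    Row("C: In-Sample", 46, 0.20, 86, 16, 30),
    Row("C: Post-Publication", 12, 0.19, 31, 5, 7),
    Row("C: Post-Pub & Post-2005", 7, 0.19, 21, 3, 4),
]


def implied_net(row):
    """Net return implied by (e) = (a) - (d)."""
    return row.gross - row.reduction


def turnover_times_spread(row):
    """The product (b) x (c), for comparison with column (d)."""
    return row.turnover * row.spread


for row in TABLE_2:
    print(row.label, implied_net(row), round(turnover_times_spread(row), 1))
```

For the cost-optimized Post-Pub & Post-2005 row, for example, 14 - 6 gives the 8 bps per month reported in column (e).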

Table 3: The Best Expected Returns Using Out-of-Sample Tests

To avoid data-mining bias, we sort anomaly portfolios based on in-sample data and average net returns post-publication and post-2005. Quartiles are numbered starting from the worst expected net return a priori; for example, quartile 1 has the highest turnover, and quartile 4 has the lowest turnover. All portfolio implementations use cost-optimization following Section 2.3.2. Panel B restricts implementations to value-weighting. Even the strongest anomalies have expected returns of only 10-20 bps per month. Source: Center for Research in Security Prices, New York Stock Exchange, and Institute for the Study of Security Markets.

Panel A: Including Equal-Weighting
                     Post-Pub Post-05 Net Return (bps monthly)
In-Sample            Predictor Quartile
Predictor            1 (Worst)   2        3        4 (Best)
Net Return           4.8         6.0      12.5     10.6
                     (5.8)       (6.2)    (6.8)    (7.4)
Net Sharpe           4.9         3.5      3.9      21.2
                     (6.9)       (7.0)    (6.4)    (6.2)
Return Reduction     14.0        7.0      8.5      4.3
                     (7.1)       (6.4)    (5.7)    (7.1)
Turnover             1.4         7.2      11.0     14.3
                     (8.1)       (6.2)    (5.5)    (6.5)

Panel B: Value-Weighted Only
                     Post-Pub Post-05 Net Return (bps monthly)
In-Sample            Predictor Quartile
Predictor            1 (Worst)   2        3        4 (Best)
Net Return           1.4         9.3      -4.6     10.6
                     (6.6)       (7.8)    (7.4)    (8.3)
Net Sharpe           -0.7        10.2     -4.2     11.4
                     (7.0)       (8.7)    (7.7)    (7.0)
Return Reduction     4.1         1.9      9.3      0.2
                     (8.4)       (7.1)    (7.4)    (7.2)
Turnover             2.4         -0.5     4.3      9.7
                     (8.1)       (7.0)    (7.2)    (7.9)

Table 4: Empirical Bayes Estimates of the Best Expected Returns

We adjust large mean net returns in post-publication and post-2005 (post-pub05) samples for data-mining using empirical Bayes. Bootstrapped standard errors are in parentheses. Adjustments assume Sharpe ratios are the sum of the true Sharpe ratio and an error term, and true Sharpe ratios are t-distributed with d.o.f. $\nu_{SR}$, scale $\sigma_{SR}$, and mean $\mu_{SR}$. Given $\nu_{SR}$, we estimate $\sigma_{SR}$ and $\mu_{SR}$ by method of moments (Equation (7)). Adjusted expected returns are computed from the conditional expectation of true Sharpe ratios (Equation (10)). Value-weighted implementations imply that method of moments hits a positivity constraint, and thus $\hat{\sigma}_{SR} = 0$. Even the strongest anomalies have expected returns of only 5-20 bps per month, consistent with Table 3. Source: Center for Research in Security Prices, New York Stock Exchange, and Institute for the Study of Security Markets.

Panel A: Including Equal-Weighting
Parameters (annualized)                          Post-Pub05 Net Return (bps monthly)
Assumed        Estimated                         Percentile
$\nu_{SR}$     $\hat{\sigma}_{SR}$  $\hat{\mu}_{SR}$   50       70       80       90
100            0.20       0.11                   10.2     14.3     18.9     21.3
               (0.06)     (0.03)                 (3.2)    (4.1)    (4.2)    (5.0)
4              0.15       0.11                   10.0     14.3     18.1     20.2
               (0.05)     (0.03)                 (3.1)    (3.8)    (4.0)    (4.3)

Panel B: Value-Weighted Only
Parameters (annualized)                          Post-Pub05 Net Return (bps monthly)
Assumed        Estimated                         Percentile
$\nu_{SR}$     $\hat{\sigma}_{SR}$  $\hat{\mu}_{SR}$   50       70       80       90
100            0.00       0.04                   4.1      4.9      5.6      6.4
               (0.06)     (0.03)                 (3.5)    (4.4)    (4.7)    (5.3)
4              0.00       0.04                   4.1      4.9      5.6      6.4
               (0.04)     (0.03)                 (3.6)    (4.3)    (4.6)    (5.2)
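The shrinkage idea behind Table 4 can be illustrated with the simplest possible case. The sketch below swaps in a normal prior for the t-distributed prior described in the caption (a reasonable stand-in only when the degrees of freedom are large), so it is not the paper's Equation (10). The function name `shrink_sharpe` and the example standard error of 0.20 are hypothetical. Under this normal-normal assumption the conditional expectation of the true Sharpe ratio is a weighted average of the prior mean and the observed value.

```python
def shrink_sharpe(observed_sr, se, mu, sigma):
    """Posterior mean of the true Sharpe ratio under a normal prior.

    Assumes observed_sr = true_sr + error, with error ~ N(0, se^2) and
    true_sr ~ N(mu, sigma^2). This normal prior is a simplification of
    the t prior in the text, used here only for illustration.
    """
    if sigma == 0 and se == 0:
        return observed_sr  # no prior spread and no noise: nothing to shrink
    weight = sigma ** 2 / (sigma ** 2 + se ** 2)  # weight on the data
    return mu + weight * (observed_sr - mu)


# Hypothetical anomaly: observed annualized Sharpe ratio 0.50, standard error 0.20.
adjusted = shrink_sharpe(0.50, 0.20, mu=0.11, sigma=0.20)

# With zero prior dispersion, every estimate collapses to the prior mean.
collapsed = shrink_sharpe(0.50, 0.20, mu=0.04, sigma=0.0)
```

The second call shows why a scale estimate of zero is so consequential: when the prior has no dispersion, an impressive observed Sharpe ratio carries no weight and the adjusted value equals the prior mean.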

Table 5: Performance of Size, B/M, and Momentum

Returns are in bps per month. Post-Pub05 is our baseline post-publication and post-2005 sample, and is equivalent to 2006-2016 for these three anomalies. FIM (2015) is taken from Table IV of Frazzini, Israel, and Moskowitz (2015). Size, B/M, and momentum perform well in earlier data, consistent with FIM. The performance of individual anomalies is highly sensitive to the sample period, and thus we need many anomalies to estimate expected returns post-2005. Source: Center for Research in Security Prices, New York Stock Exchange, and Institute for the Study of Security Markets.

Panel A: Size
             Post-Pub05                 FIM (2015)
Return       2006-2016    1998-2013     1998-2013
Gross        -25.8        60.0          66.5
             (33.9)       (39.2)        (22.1)
Net          -33.1        48.5          54.3
             (33.8)       (39.2)        (21.9)

Panel B: B/M
             Post-Pub05                 FIM (2015)
Return       2006-2016    1998-2013     1998-2013
Gross        32.9         79.9          40.5
             (28.7)       (31.3)        (36.2)
Net          24.5         66.4          29.3
             (29.1)       (31.8)        (36.6)

Panel C: Momentum
             Post-Pub05                 FIM (2015)
Return       2006-2016    1998-2013     1998-2013
Gross        16.4         36.2          18.8
             (60.6)       (59.9)        (47.1)
Net          12.6         28.8          -6.4
             (60.5)       (59.9)        (45.8)

Figure 2: The Bias in Low-Frequency Effective Spread Proxies. We take the difference between low-frequency proxies and TAQ spreads at the firm-month level, and then take the median across firms to calculate an error in each month. Low frequency spreads are from Hasbrouck (2009) (Gibbs), Corwin and Schultz (2012) (HL), Abdi and Ranaldo (2017) (CHL), and Kyle and Obizhaeva (2016) (VoV). Post-decimalization, low-frequency proxies are biased upward by roughly 25-50 bps.

[Figure: time series, 1990 to 2020. Vertical axis: Median(LF Spread - TAQ Spread), (% Point). Series: Gibbs, HL, CHL, VoV.]
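The bias calculation in the caption (firm-month differences, then a cross-sectional median per month) can be sketched as follows. The function name `monthly_median_bias`, the tuple layout of the input rows, and the example numbers are assumptions made for illustration, not the authors' data format.

```python
from collections import defaultdict
from statistics import median


def monthly_median_bias(rows):
    """Median across firms of (LF proxy - TAQ spread), by month.

    rows: iterable of (month, firm, lf_spread, taq_spread) tuples,
        with None marking a missing spread (hypothetical layout).
    Returns a dict mapping month -> median difference.
    """
    diffs = defaultdict(list)
    for month, _firm, lf_spread, taq_spread in rows:
        if lf_spread is None or taq_spread is None:
            continue  # skip firm-months without both measures
        diffs[month].append(lf_spread - taq_spread)
    return {month: median(values) for month, values in sorted(diffs.items())}


# Made-up firm-month spreads in percent.
example = [
    ("2010-01", "A", 0.50, 0.20),
    ("2010-01", "B", 0.40, 0.30),
    ("2010-01", "C", 1.00, 0.10),
    ("2010-02", "A", 0.30, 0.30),
    ("2010-02", "B", None, 0.25),
]
bias = monthly_median_bias(example)
```

Using the median rather than the mean keeps a single illiquid firm with a very large proxy error (firm C in the example) from dominating the monthly figure.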

Figure 3: Combined Effective Spreads Over Time. Spreads combine high-frequency and low-frequency data. We use high-frequency Daily TAQ (DTAQ), Monthly TAQ (MTAQ), and ISSM when available. Otherwise, we use the average of four low frequency proxies: Gibbs (Hasbrouck 2009), HL (Corwin and Schultz 2012), CHL (Abdi and Ranaldo 2017), and VoV (Kyle and Obizhaeva 2016). The combined spread tracks well-known structural changes like the entry of NASDAQ (early 1970s) and decimalization (early 2000s). LF spread data are available at http://sites.google.com/site/chenandrewy/code-and-data/. Source: Center for Research in Security Prices, New York Stock Exchange, and Institute for the Study of Security Markets.

[Figure: time series, 1920 to 2020. Vertical axis: Effective spread (%). Series: 75th percentile, 50th percentile, 25th percentile. Annotations mark the data source by period: Average of LF proxies, ISSM, MTAQ, DTAQ.]
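The combination rule in the caption can be sketched as a simple fallback. The function name `combined_spread`, the dictionary inputs, and in particular the priority order among the three high-frequency sources are assumptions; the text only says that high-frequency data are used when available and that the LF average is used otherwise.

```python
def combined_spread(hf_sources, lf_proxies):
    """Pick a spread for one firm-month, preferring high-frequency data.

    hf_sources: dict with optional keys "DTAQ", "MTAQ", "ISSM".
    lf_proxies: dict with optional keys "Gibbs", "HL", "CHL", "VoV".
    Missing values are None or absent. Returns (spread, source_label).
    """
    for name in ("DTAQ", "MTAQ", "ISSM"):  # assumed priority order
        value = hf_sources.get(name)
        if value is not None:
            return value, name
    available = [v for v in lf_proxies.values() if v is not None]
    if available:
        # Equal-weighted average of whichever LF proxies have data.
        return sum(available) / len(available), "LF_Ave"
    return None, None  # left for a separate data-filling step


hf_case = combined_spread({"MTAQ": 0.4}, {"Gibbs": 1.0})
lf_case = combined_spread({}, {"Gibbs": 1.0, "HL": None, "CHL": 0.5, "VoV": None})
empty_case = combined_spread({}, {})
```

When every source is missing the sketch returns `None`; in the paper those firm-months are filled by matching to the nearest firm with available data, which is outside the scope of this example.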

Figure 4: Distribution of Spreads Paid by Academic Implementations in 2014. We compare the effective spreads paid by academic implementations with those of all stocks, NYSE stocks, and Russell 2000 stocks. "Paid by anomaly portfolios" pools across all trades implied by 120 academic implementations in 2014. Other distributions are pooled across all stock-months in 2014. Academic implementations trade stocks across the entire liquidity spectrum, resulting in large trading costs despite the near-zero modal spreads of recent years. Source: Center for Research in Security Prices, New York Stock Exchange, and Institute for the Study of Security Markets.

[Figure: x-axis: Effective Spread (%), 0-0.8; y-axis: Frequency; series: Paid by Anomaly Portfolios, All Stocks, NYSE, Russell 2000]

Figure 5: Event-Time Net Returns for Cost-Optimized Implementations. For a given month relative to publication, light lines plot the mean net return across all anomalies. Dark lines show the trailing 5-year moving average of mean returns, and dashed lines show 2 standard error confidence bounds. Cost optimization is effective before publication, but net returns become tiny afterwards. Source: Center for Research in Security Prices, New York Stock Exchange, and Institute for the Study of Security Markets.

[Figure: x-axis: Years Since Publication, -10 to 30; y-axis: Mean Net Return (% Monthly); series: Ave in Month, Trailing 5-year Ave, 2 SE C.I.]
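The two averages plotted in Figure 5 can be sketched as follows: first a cross-anomaly mean for each month relative to publication, then a trailing moving average of those means. The function names, the input layout, and the choice to average over whatever months are available early in the window are assumptions for illustration. The confidence bounds are left out because the caption does not say how the standard errors are computed.

```python
from collections import defaultdict
from statistics import mean


def event_time_means(returns):
    """Mean net return across anomalies for each event month.

    returns: iterable of (anomaly, months_since_publication, net_return).
    """
    by_month = defaultdict(list)
    for _anomaly, event_month, net_return in returns:
        by_month[event_month].append(net_return)
    return {t: mean(values) for t, values in sorted(by_month.items())}


def trailing_average(series, window=60):
    """Trailing moving average over the last `window` event months.

    Assumption: early months average over the shorter available history.
    """
    return {
        t: mean(series[s] for s in series if t - window < s <= t)
        for t in series
    }


# Hypothetical toy returns (% monthly), not taken from the paper.
returns = [
    ("x", -1, 0.4), ("y", -1, 0.2),
    ("x", 0, 0.1), ("y", 0, 0.3),
    ("x", 1, 0.0),
]
means = event_time_means(returns)
smooth = trailing_average(means, window=2)
```

With `window=60` the second function corresponds to a 5-year trailing average on monthly data; the toy call uses `window=2` only to keep the example small.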

Figure 6: Heterogeneity in Cost-Optimized Mean Net Returns Post-Publication and Post-2005. Many anomalies have notable net returns, but the distribution closely resembles the null of no predictability, consistent with the idea that notable net returns are largely due to luck. Source: Center for Research in Security Prices, New York Stock Exchange, and Institute for the Study of Security Markets.

[Figure: histogram; x-axis: Net Return Post-Pub and Post-2005 (bps per month), -100 to 120; y-axis: Count]

Cite this document

APA

Andrew Y. Chen and Mihail Velikov (2020). Zeroing in on the Expected Returns of Anomalies (FEDS 2020-039). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2020-039

BibTeX

@techreport{wtfs_feds_2020_039,
  author = {Andrew Y. Chen and Mihail Velikov},
  title = {Zeroing in on the Expected Returns of Anomalies},
  type = {Finance and Economics Discussion Series},
  number = {2020-039},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2020},
  url = {https://whenthefedspeaks.com/doc/feds_2020-039},
  abstract = {We zero in on the expected returns of long-short portfolios based on 120 stock market anomalies by accounting for (1) effective bid-ask spreads, (2) post-publication effects, and (3) the modern era of trading technology that began in the early 2000s. Net of these effects, the average anomaly's expected return is a measly 8 bps per month. The strongest anomalies return only 10-20 bps after accounting for data-mining with either out-of-sample tests or empirical Bayesian methods. Expected returns are negligible despite cost optimizations that produce impressive net returns in-sample and the omission of additional trading costs like price impact.},
}