ifdp · January 31, 2006

Should We Expect Significant Out-of-Sample Results When Predicting Stock Returns?

Abstract

Using Monte Carlo simulations, I show that typical out-of-sample forecast exercises for stock returns are unlikely to produce any evidence of predictability, even when there is in fact predictability and the correct model is estimated.


Board of Governors of the Federal Reserve System
International Finance Discussion Papers
Number 855
February 2006

Erik Hjalmarsson

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors. Recent IFDPs are available on the Web at www.federalreserve.gov/pubs/ifdp/.

Erik Hjalmarsson ∗
Division of International Finance, Federal Reserve Board, Mail Stop 20, Washington, DC 20551, USA

February 2006

JEL classification: C15; C53; G14.
Keywords: Stock return predictability; Out-of-sample tests.

∗ Tel.: +1-202-452-2436; fax: +1-202-263-4850; email: erik.hjalmarsson@frb.gov. The views presented in this paper are solely those of the author and do not represent those of the Federal Reserve Board or its staff.

1 Introduction

Forecasting models play a central role in economics and finance, both for practitioners and academics. Although standard, in-sample, econometric methods are usually relied upon to judge the validity of a given model, out-of-sample tests are often deemed to provide the most robust assessment of any econometric forecasting model. Goyal and Welch (2003, 2004) provide perhaps the most noted expression of this belief in recent times; they argue that virtually all variables that have been proposed as predictors of future stock returns provide no predictive gains in out-of-sample exercises.

Should one interpret these findings by Goyal and Welch as conclusive evidence that stock returns are not predictable? There are several reasons to adopt a more nuanced view. Inoue and Kilian (2004) show that out-of-sample tests typically have lower power than in-sample tests. They also argue that the widely held belief that out-of-sample tests are less susceptible to data-mining is not generally true. Campbell and Thompson (2005) take a different approach and show that by imposing some common sense restrictions when forming the out-of-sample forecasts, there is in fact fairly strong evidence of out-of-sample predictive ability in stock returns.

This note adds to the above strand of literature by reporting the results from simulated out-of-sample exercises. I show that the results of Goyal and Welch (2003, 2004) do not imply that previous in-sample results are spurious. In fact, using Monte Carlo simulations, similar out-of-sample results to those of Goyal and Welch are found even when the postulated forecasting model is in fact the true data generating process. This result stems from the fact that any predictive component in stock returns must be small, if it does exist. Therefore, if the predictive relationship is estimated poorly, the conditional forecasting model may be outperformed by the unconditional benchmark model which assumes that expected returns are constant over time.
Put in other words, in order to produce good forecasts when the slope coefficient in a linear regression is small, you are often better off setting it equal to zero, rather than using a noisy estimate of it. To accurately estimate a very small coefficient, large amounts of data are needed. The results in this paper show that when testing for stock return predictability, the sample sizes in most relevant cases are simply too small relative to the size of the slope coefficient for any predictive ability to show up in out-of-sample exercises.

The findings in this paper, of course, do not tell us that stock returns are predictable, but merely that we should not disregard the econometric in-sample results in favour of out-of-sample results. As shown here, when the level of predictability is low, it is quite possible to correctly specify and estimate the true predictive relationship without obtaining any improvement in out-of-sample forecasts.

However, this raises the question of the practical use of identifying a predictive relationship if, in fact, that relationship cannot be used to improve upon forecasts. It is evident that much care must be used in forming the forecast if it is to perform better than a simple unconditional alternative. For instance, Campbell and Thompson (2005) manage to improve the out-of-sample performance substantially by imposing some simple economically motivated restrictions on the forecasts. On the econometric side, it is of course also important to form the best possible estimate of the relevant parameters in the model. At present, most econometric studies on stock return predictability have focused primarily on the issues of testing, rather than point estimation. Hopefully, there will also be some advances in the estimation area over the next few years.

2 Simulation design

To create simulated samples of stock returns and predictor variables, I rely on the standard data generating process (dgp) most often found in the stock return predictability literature (e.g. Campbell and Yogo, 2005). Let $r_t$ denote the excess stock return in period $t = 1, \ldots, T$, and let $x_t$ denote the corresponding value of some scalar predictor variable, such as the dividend- or earnings-price ratio. The variables $r_t$ and $x_t$ are generated according to

$$r_t = \alpha + \beta x_{t-1} + u_t, \tag{1}$$

$$x_t = \rho x_{t-1} + v_t, \tag{2}$$

where the joint innovations, $w_t = (u_t, v_t)'$, are iid normal with unit variance and correlation $\delta$. The autoregressive root $\rho$ is modelled as local-to-unity and takes values $\rho = 1 + c/T$, where $c$ is the so-called local-to-unity parameter.

To capture the often large negative correlation between $u_t$ and $v_t$, the parameter $\delta$ is set equal to $-0.9$. The local-to-unity parameter is set equal to either $c = -20$ or $c = -2$ and the sample size $T$ is set equal to either 600 or 1200 to represent sample sizes of 50 or 100 years of monthly data.¹ The intercept $\alpha$ is set equal to 0.005 and I let the slope coefficient vary between zero and 0.05. Campbell and Yogo (2005), who analyze predictability in aggregate U.S. stock returns, present their empirical results in a format standardized to conform with a unit variance in the innovations $u_t$ and $v_t$ and show that in most cases the OLS estimate of $\beta$ is between 0.01 and 0.02 for monthly data.

The aim of the Monte Carlo study is to compare the out-of-sample forecasts of $r_t$ based on an estimate of equation (1) to those based on a model of constant expected returns (i.e. $\beta = 0$). These forecasts will be referred to as the conditional and unconditional forecasts, respectively.

The simulated out-of-sample exercises are performed in the following manner. The first half of the sample is used to form the initial estimates of the conditional and unconditional models. The estimate of the unconditional model is, of course, merely the mean of the returns observed up to that point in time. For each time period in the remaining half of the sample, the one-step-ahead forecasts based on the conditional and unconditional models are calculated and the estimates of the forecast models are updated using the additional data that become available to the forecaster every period. The slope coefficients are estimated using standard OLS. The mean squared errors (MSE) from every forecast are calculated, as well as the corresponding Diebold and Mariano (1995) (DM) statistic, which tests the null hypothesis of no additional predictive accuracy in the conditional forecast compared to the unconditional one. In order to assess the impact of poorly estimated $\beta$s, I also form conditional forecasts using the true value of $\beta$. The value of $\alpha$ is still estimated, however.
All simulation results are based on 10,000 repetitions.

¹ Simulations based on annual parameter values, using sample sizes of 50 and 100 years, were also performed but are not reported here. These simulations delivered qualitatively identical results to the monthly ones presented here.
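The simulation design described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the paper: the function names (`simulate`, `oos_errors`, `mse_ratio`), the starting value $x_0 = 0$, the default of 200 repetitions, and the choice to report the ratio of MSEs averaged over repetitions are all assumptions made here for illustration.

```python
import numpy as np

def simulate(T, beta, c, alpha=0.005, delta=-0.9, rng=None):
    """Draw one sample of length T from equations (1)-(2).

    Returns r (r_1..r_T) and xlag (x_0..x_{T-1}), so that r[i] is
    generated from xlag[i]. The start value x_0 = 0 is an assumption.
    """
    rng = np.random.default_rng() if rng is None else rng
    rho = 1.0 + c / T                        # local-to-unity root
    cov = np.array([[1.0, delta], [delta, 1.0]])
    w = rng.multivariate_normal(np.zeros(2), cov, size=T)
    u, v = w[:, 0], w[:, 1]
    x = np.zeros(T + 1)
    for t in range(1, T + 1):
        x[t] = rho * x[t - 1] + v[t - 1]     # equation (2)
    xlag = x[:-1]
    r = alpha + beta * xlag + u              # equation (1)
    return r, xlag

def oos_errors(r, xlag):
    """Recursive one-step-ahead forecast errors over the second half."""
    T = len(r)
    start = T // 2
    e_cond = np.empty(T - start)
    e_uncond = np.empty(T - start)
    for k, t in enumerate(range(start, T)):
        r_past, x_past = r[:t], xlag[:t]     # data available to the forecaster
        # Unconditional model: mean of returns observed so far.
        e_uncond[k] = r[t] - r_past.mean()
        # Conditional model: OLS of r_t on a constant and x_{t-1}.
        X = np.column_stack([np.ones(t), x_past])
        a_hat, b_hat = np.linalg.lstsq(X, r_past, rcond=None)[0]
        e_cond[k] = r[t] - (a_hat + b_hat * xlag[t])
    return e_cond, e_uncond

def mse_ratio(T, beta, c, reps=200, seed=0):
    """Unconditional MSE over conditional MSE, averaged across repetitions."""
    rng = np.random.default_rng(seed)
    mse_c = mse_u = 0.0
    for _ in range(reps):
        r, xlag = simulate(T, beta, c, rng=rng)
        e_c, e_u = oos_errors(r, xlag)
        mse_c += np.mean(e_c ** 2)
        mse_u += np.mean(e_u ** 2)
    return mse_u / mse_c
```

A call such as `mse_ratio(600, 0.015, -20)` gives a rough counterpart to one point on the OLS-based curve; a ratio above one means the conditional forecast had the lower MSE. With far fewer than 10,000 repetitions the value is noisy, so it should be read as indicative only.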

3 Results

The results of the paper are presented in Figures 1 and 2, which show the Monte Carlo evidence from samples with sizes $T = 600$ and $T = 1200$, respectively. The top two panels in both figures give the results corresponding to $c = -20$, and the two lower panels correspond to $c = -2$.

Panel (a) in Figure 1 shows the ratios between the MSEs for the unconditional and conditional forecasts, when $c = -20$. When, on average, the conditional forecast outperforms the unconditional one, this ratio is greater than one, and vice versa. As is evident from the plot, when the conditional forecast is based on the OLS estimate of $\beta$, the true value of $\beta$ needs to be greater than 0.015 for the conditional forecast to outperform the unconditional one on average. The conditional forecast based on the true value of $\beta$ does, of course, always outperform the unconditional one. The same results for $c = -2$ are shown in Panel (c). The conditional forecast, based on the OLS estimates, now performs better relative to the unconditional one; a true value of $\beta$ greater than 0.006 is sufficient for the conditional forecast to beat the unconditional one on average.

Panels (b) and (d) show the rejection rates for a one-sided 5% DM test of higher accuracy in the conditional forecast versus the unconditional forecast, based on the MSE.² Clearly, when using OLS estimates of $\beta$ to form the conditional forecasts, the DM test lacks power to reject the null hypothesis of equal forecasting ability in the relevant regions of the parameter space. For $\beta = 0.015$, the rejection rates are 5.8% and 20.5% for $c = -20$ and $c = -2$, respectively. The chance of detecting predictive ability through the DM test is thus extremely limited when $\beta$ is small. Indeed, even when the true value of $\beta$ is used in forming the conditional forecasts the rejection rates remain low.

In Figure 2, the Monte Carlo results from samples with sizes $T = 1200$, representing 100 years of monthly data, are reported.
As expected, the conditional forecasts based on OLS estimates of $\beta$ perform substantially better than in the $T = 600$ case. A value of $\beta$ greater than 0.007, for $c = -20$, and 0.004 for $c = -2$, is sufficient for the conditional forecast to outperform the unconditional one, on average. For $c = -20$, the DM test statistic is still not very powerful, although for $c = -2$ there is now a fair chance of rejecting the null for reasonable parameter values; for $\beta = 0.015$ the rejection rates are 15.3% and 69.3%, for $c = -20$ and $c = -2$, respectively.

To sum up, in a monthly sample spanning 50 years it is often difficult to detect any predictive ability, when $\beta$ is small, even under such perfect circumstances as in a controlled Monte Carlo experiment where the true functional form of the model is known and completely stable over time. In reality, the model is likely to be only, at best, a decent approximation of the true data generating process, which is also not likely to remain unchanged over a 50 year time span. This point, of course, is even more valid for the 100 year sample, where a stable model for the entire time span seems even less probable. Practical limitations on data availability also often restrict researchers to time spans of around 50 years or shorter. For instance, use of the short interest rate as a predictor is typically only considered for data after 1952 when the Fed unpegged the interest rate, and accounting variables, such as the book-to-market ratio, are often only available for an even shorter period.

The overall interpretation of these results must be that, in practice, it should be difficult to detect any out-of-sample predictability in stock returns, even without any demands for statistical significance but merely as evaluated by the MSE.

² The DM statistic is calculated using the long-run variance estimator of Andrews and Monahan (1992) with a quadratic spectral kernel.

References

Andrews, D.W.K., and C.J. Monahan, 1992. An Improved Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimator, Econometrica 60, 953-966.

Campbell, J.Y., and S.B. Thompson, 2005. Predicting the Equity Premium Out of Sample: Can Anything Beat the Historical Average?, Working Paper, Harvard University.

Campbell, J.Y., and M. Yogo, 2005. Efficient Tests of Stock Return Predictability, forthcoming Journal of Financial Economics.

Diebold, F.X., and R.S. Mariano, 1995. Comparing predictive accuracy, Journal of Business and Economic Statistics 13, 253-263.

Goyal, A., and I. Welch, 2003. Predicting the equity premium with dividend ratios, Management Science 49, 639-654.

Goyal, A., and I. Welch, 2004. A comprehensive look at the empirical performance of equity premium prediction, NBER Working Paper 10483.

Inoue, A., and L. Kilian, 2004. In-sample or out-of-sample tests of predictability: which one should we use?, forthcoming Econometric Reviews.

Figure 1: Results from Monte Carlo simulations with a simulated monthly sample of 600 observations. The top two panels show the results for $c = -20$ and the bottom panels for $c = -2$. The left panels, (a) and (c), display the ratios between the mean squared errors (MSE) for the unconditional and conditional forecasts; a ratio greater than one implies that the conditional forecast outperforms the unconditional one. The right hand panels show the 5% rejection rates for the Diebold and Mariano (DM) test of higher accuracy in the conditional forecast versus the unconditional forecast, based on the MSE. The dashed lines show the results for conditional forecasts based on the OLS estimate of $\beta$ and the dotted lines the results for conditional forecasts based on the true value of $\beta$. The flat lines in the left hand graphs indicate a value of one.

Figure 2: Results from Monte Carlo simulations with a simulated monthly sample of 1200 observations. The top two panels show the results for $c = -20$ and the bottom panels for $c = -2$. The left panels, (a) and (c), display the ratios between the mean squared errors (MSE) for the unconditional and conditional forecasts; a ratio greater than one implies that the conditional forecast outperforms the unconditional one. The right hand panels show the 5% rejection rates for the Diebold and Mariano (DM) test of higher accuracy in the conditional forecast versus the unconditional forecast, based on the MSE. The dashed lines show the results for conditional forecasts based on the OLS estimate of $\beta$ and the dotted lines the results for conditional forecasts based on the true value of $\beta$. The flat lines in the left hand graphs indicate a value of one.
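The rejection rates in the right hand panels of Figures 1 and 2 come from a one-sided DM test on the squared forecast errors. A minimal sketch of such a statistic is given below. It is a simplification made here for illustration: it uses the plain sample variance of the loss differential rather than the Andrews and Monahan (1992) long-run variance estimator used in the paper, and it compares the statistic with the standard normal 5% critical value of 1.645. The function names are hypothetical.

```python
import math
import numpy as np

def dm_statistic(e_cond, e_uncond):
    """DM-type statistic based on squared-error loss.

    Positive values favour the conditional forecast. The variance is the
    plain sample variance of the loss differential (an assumption here;
    the paper uses a long-run variance estimator instead).
    """
    e_cond = np.asarray(e_cond, dtype=float)
    e_uncond = np.asarray(e_uncond, dtype=float)
    d = e_uncond ** 2 - e_cond ** 2          # loss differential per period
    n = d.size
    return d.mean() / math.sqrt(d.var(ddof=1) / n)

def rejects_one_sided(stat, crit=1.645):
    """One-sided 5% test: reject equal accuracy if stat exceeds crit."""
    return stat > crit
```

Applying `dm_statistic` to the forecast errors from each simulated sample and averaging `rejects_one_sided` over repetitions gives an estimated rejection rate of the kind plotted in panels (b) and (d), subject to the simplified variance estimate.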


Cite this document

APA

Erik Hjalmarsson (2006). Should We Expect Significant Out-of-Sample Results When Predicting Stock Returns? (IFDP 2006-855). Board of Governors of the Federal Reserve System, International Finance Discussion Papers. https://whenthefedspeaks.com/doc/ifdp_2006-855

BibTeX

@techreport{wtfs_ifdp_2006_855,
  author = {Erik Hjalmarsson},
  title = {Should We Expect Significant Out-of-Sample Results When Predicting Stock Returns?},
  type = {International Finance Discussion Papers},
  number = {2006-855},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2006},
  url = {https://whenthefedspeaks.com/doc/ifdp_2006-855},
  abstract = {Using Monte Carlo simulations, I show that typical out-of-sample forecast exercises for stock returns are unlikely to produce any evidence of predictability, even when there is in fact predictability and the correct model is estimated.},
}