ifdp · October 31, 2007

The Stambaugh Bias in Panel Predictive Regressions

Abstract

This paper analyzes predictive regressions in a panel data setting. The standard fixed effects estimator suffers from a small sample bias, which is the analogue of the Stambaugh bias in time-series predictive regressions. Monte Carlo evidence shows that the bias and resulting size distortions can be severe. A new bias-corrected estimator is proposed, which is shown to work well in finite samples and to lead to approximately normally distributed t-statistics. Overall, the results show that the econometric issues associated with predictive regressions when using time-series data to a large extent also carry over to the panel case. The results are illustrated with an application to predictability in international stock indices.

Board of Governors of the Federal Reserve System
International Finance Discussion Papers
Number 914
November 2007

Erik Hjalmarsson

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors. Recent IFDPs are available on the Web at www.federalreserve.gov/pubs/ifdp/.

Erik Hjalmarsson*
Division of International Finance
Federal Reserve Board, Mail Stop 20, Washington, DC 20551, USA
November 29, 2007

JEL classification: C22, C23, G1.
Keywords: Panel data; Pooled regression; Predictive regression; Stock return predictability.

* Helpful comments have been provided by Lennart Hjalmarsson, Randi Hjalmarsson, and George Korniotis, as well as seminar participants at the European Summer Meeting of the Econometric Society in Vienna. Tel.: +1-202-452-2426; fax: +1-202-263-4850; email: erik.hjalmarsson@frb.gov. The views in this paper are solely the responsibility of the author and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or of any other person associated with the Federal Reserve System.

1 Introduction

Predictive regressions are important tools for evaluating and testing economic models. Although tests of stock return predictability, and the related market efficiency hypothesis, are probably the most common application, many rational expectations models can be tested in a similar manner (Mankiw and Shapiro, 1986). Traditionally, forecasting regressions have been evaluated in time-series frameworks. However, with the increased availability of data, in particular international financial and macroeconomic data, it becomes natural to extend the single time-series framework to a panel data setting; for instance, Cohen et al. (2003) and Polk et al. (2006) rely on predictive panel data regressions in some of their analyses.

It is well known that the apparently simple linear regression model most often used for evaluating predictability in fact raises some very tough econometric issues. The high degree of persistence found in many predictor variables, such as the earnings- or dividend-price ratios in the prototypical stock return forecasting regression, is at the root of most econometric problems associated with predictive regressions. The near persistence of the regressors, coupled with a strong contemporaneous correlation between the innovations in the regressor and the regressand, causes standard OLS estimates to suffer from a small sample bias and normal t-tests to have the wrong size; this is the so-called Stambaugh (1999) bias in predictive regressions.

In the panel case, with pooled regressions, it turns out that as long as no fixed effects are included, the pooled estimator is unbiased. However, once one allows for fixed effects in the pooled regression, an analogue of the Stambaugh bias is also present in the panel case. This result can be understood in light of the representation of the bias in predictive regressions derived by Stambaugh.
Under the assumption that the predictor variable follows an auto-regressive process, he shows that the bias in the OLS estimate of the slope coefficient in the predictive regression is a function of the bias of the OLS estimate of the auto-regressive coefficient in the predictor variable. It is well known that the bias in the OLS estimator of auto-regressive coefficients is more severe if an intercept is included in the regression equation. Therefore, in the time-series case, the Stambaugh bias is less severe if no intercept is included in the predictive regression. This is, of course, mostly of theoretical interest since in almost all empirical applications an intercept is required. The same idea holds in the panel case; but, rather than differentiating between the case of intercept or no intercept, the relevant cases are now a common intercept or individual intercepts, i.e. fixed effects.

In this paper, I propose a simple bias correction to the fixed effects estimator in pooled predictive regressions. An analogous representation of the time-series Stambaugh bias is also derived. It is shown that the asymptotic bias in the fixed effects estimator in the predictive regression can be expressed as a function of the bias in the pooled fixed effects estimator of the auto-regressive coefficient in the predictor variable. The results in this paper complement those of Hjalmarsson (2007), which also studies the bias in the fixed effects estimator but does not explicitly analyze the connection with the Stambaugh bias in time-series regressions or the direct bias correction procedure suggested here.

The bias-corrected estimator is straightforward to implement. The key parameter on which the bias depends is the auto-regressive root in the regressor variable. The practical implementation of the bias-corrected fixed effects estimator is therefore facilitated by the fact that even though the fixed effects estimator of the auto-regressive coefficient is biased, an alternative unbiased and consistent estimator is readily available. Since the bias-corrected estimator is approximately asymptotically normal, it becomes trivial to perform inference on the slope parameter.

Simulation results show that the bias-corrected estimator works well in finite samples. These simulations also show the importance of controlling for the bias in the panel case. The average rejection rates for the t-test corresponding to the standard fixed effects estimator exceed 75 percent in some cases, under the null hypothesis of no predictability, and for a nominal five percent test.

As an illustration of the methods derived in this paper, I test for stock return predictability in an international panel of returns from 18 different stock indices using the corresponding dividend- and earnings-price ratios, as well as the book-to-market values, as predictors. The empirical results clearly illustrate the theoretical results in the paper. Based on the results from the standard fixed effects estimator, the evidence in favour of return predictability is very strong, using any of the three predictor variables. However, when using the robust methods developed here, the evidence disappears almost completely. Thus, both the simulation results and the empirical results clearly show that the Stambaugh bias is at least as important in panel regressions as it is in time-series regressions.

The rest of the paper is organized as follows.
Section 2 outlines the panel model and shows the Stambaugh bias in panel predictive regressions. Section 3 describes the bias-corrected estimator and Section 4 illustrates the small sample properties of this estimator, as well as those of the pooled estimator without fixed effects. Section 5 shows the results from the empirical application to stock return predictability and offers a brief conclusion. The Appendix outlines the derivations of the main results.

2 The Stambaugh bias

2.1 Model and assumptions

Consider a panel model with dependent variable $y_{i,t}$, $i = 1,\dots,n$, $t = 1,\dots,T$, and corresponding regressor, $x_{i,t}$. Here, $i$ represents the cross-sectional dimension (e.g. firm or country) and $t$ represents the time-series dimension. The behavior of $y_{i,t}$ and $x_{i,t}$ is modelled as follows,

$$y_{i,t} = \alpha_i + \beta x_{i,t-1} + u_{i,t}, \tag{1}$$

$$x_{i,t} = \gamma_i + \rho x_{i,t-1} + v_{i,t}, \tag{2}$$

where $\rho = 1 + c/T$. The auto-regressive root of the regressor is parameterized as being local-to-unity, which captures the near unit-root behavior of many predictor variables but is less restrictive than a pure unit-root assumption (e.g. Cavanagh et al., 1995, and Campbell and Yogo, 2006). The model can be seen as a panel analogue of the time-series models studied by Mankiw and Shapiro (1986), Cavanagh et al. (1995), Stambaugh (1999), Lewellen (2004), and Campbell and Yogo (2006), among others.

The innovation processes are assumed to be martingale difference sequences with finite fourth order moments, and the regressor $x_{i,t}$ is generally endogenous in the sense that $u_{i,t}$ and $v_{i,t}$ are contemporaneously correlated. That is, let $w_{i,t} = (u_{i,t}, v_{i,t})'$ and let $\mathcal{F}_t = \{ w_{i,s} \mid s \le t,\ i = 1,\dots,n \}$ be the filtration generated by the innovation processes. Then, for all $i = 1,\dots,n$ and $t = 1,\dots,T$, $E[w_{i,t} \mid \mathcal{F}_{t-1}] = 0$, $E[w_{i,t} w_{i,t}' \mid \mathcal{F}_{t-1}] = \Omega_i = [(\omega_{11i}, \omega_{12i}), (\omega_{12i}, \omega_{22i})]'$ and $\Omega = \lim_{n \to \infty} n^{-1} \sum_{i=1}^{n} \Omega_i$. Finally, it is assumed that the innovations are cross-sectionally independent.¹

Let $J_i(r)$ denote the limiting process of the scaled regressor $x_{i,t}$. That is, as $T \to \infty$, $x_{i,t=[Tr]}/\sqrt{T} \Rightarrow J_i(r)$, where $J_i(r)$, defined in the Appendix, is the standard asymptotic process for a near unit-root variable. Also, let $\underline{J}_i = J_i - \int_0^1 J_i$ be the demeaned version of $J_i$ and let $\Omega_{xx} \equiv E\left[\int_0^1 J_i^2\right]$ and $\underline{\Omega}_{xx} \equiv E\left[\int_0^1 \underline{J}_i^2\right]$.

Following the work of Phillips and Moon (1999), results for the panel estimators are derived using sequential limits, which implies first keeping $n$ fixed and letting $T$ go to infinity, and then letting $n$ go to infinity.
Such sequential convergence is denoted $(T, n \to \infty)_{seq}$.²

2.2 The bias in the fixed effects estimator

Let $\tilde{y}_{i,t} = y_{i,t} - \frac{1}{nT} \sum_{i=1}^{n} \sum_{t=1}^{T} y_{i,t}$ denote the overall demeaned data and let $\underline{y}_{i,t} = y_{i,t} - \frac{1}{T} \sum_{t=1}^{T} y_{i,t}$ denote the time-series demeaned data. Define $\tilde{x}_{i,t}$ and $\underline{x}_{i,t}$ analogously. The pooled estimator of $\beta$ when there are no individual effects, i.e. when $\alpha_i \equiv \alpha$ for all $i$, is given by

$$\hat{\beta}_{Pool} = \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \tilde{x}_{i,t-1}^{2} \right)^{-1} \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \tilde{y}_{i,t} \tilde{x}_{i,t-1} \right). \tag{3}$$

¹ In order to highlight the effects of the Stambaugh bias in panel regressions, the effects of cross-sectional dependence are not considered. In certain applications it may be desirable to allow for clustering of the errors either across time for a given individual $i$, or across individuals (i.e. cross-sectional correlation). As shown by Thompson (2006), it is straightforward to construct standard error estimators that control for such clustering across both time and individuals. His framework could easily be used in the current context and the details are omitted.

² Subject to potential rate restrictions, such as $n/T \to 0$, these results can generally be shown to hold as $n$ and $T$ go to infinity jointly; technical proofs of such joint convergence are not pursued in the current study, however.
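To make the setup concrete, the following is a minimal sketch of how equations (1) and (2) might be simulated and the pooled estimator in equation (3) computed. It is not taken from the paper: the function names, the start value $x_{i,0} = 0$, the zero intercepts, the Gaussian innovations with unit variances, and the choice to demean the lagged regressor by the mean of the lagged observations actually used are all assumptions made for illustration.

```python
import math
import random


def simulate_panel(n, T, beta, c, delta, seed=0):
    """Simulate equations (1)-(2) with alpha_i = gamma_i = 0 (assumed).

    Returns (y, x_lag), where y[i][t] is paired with the lagged
    regressor value x_lag[i][t] that generated it.
    """
    rng = random.Random(seed)
    rho = 1.0 + c / T  # local-to-unity auto-regressive root
    y, x_lag = [], []
    for i in range(n):
        x_prev = 0.0  # assumed start value x_{i,0} = 0
        y_i, x_i = [], []
        for t in range(T):
            v = rng.gauss(0.0, 1.0)
            e = rng.gauss(0.0, 1.0)
            # u has unit variance and correlation delta with v
            u = delta * v + math.sqrt(1.0 - delta ** 2) * e
            y_i.append(beta * x_prev + u)
            x_i.append(x_prev)
            x_prev = rho * x_prev + v
        y.append(y_i)
        x_lag.append(x_i)
    return y, x_lag


def pooled_estimator(y, x_lag):
    """Equation (3): slope estimate after removing the overall means."""
    n, T = len(y), len(y[0])
    y_bar = sum(sum(row) for row in y) / (n * T)
    x_bar = sum(sum(row) for row in x_lag) / (n * T)
    num = sum((y[i][t] - y_bar) * (x_lag[i][t] - x_bar)
              for i in range(n) for t in range(T))
    den = sum((x_lag[i][t] - x_bar) ** 2
              for i in range(n) for t in range(T))
    return num / den
```

A call such as `pooled_estimator(*simulate_panel(20, 100, 0.05, -5.0, -0.7))` would give one draw of the pooled estimate for a design resembling the one used in the simulation section.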

The fixed effects estimator, allowing for individual effects, is given by

$$\hat{\beta}_{FE} = \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \underline{x}_{i,t-1}^{2} \right)^{-1} \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \underline{y}_{i,t} \underline{x}_{i,t-1} \right). \tag{4}$$

As shown by Hjalmarsson (2007), and outlined in the Appendix, as $(T, n \to \infty)_{seq}$,

$$\sqrt{n}\,T \left( \hat{\beta}_{Pool} - \beta \right) \Rightarrow N\left( 0, \omega_{11} \Omega_{xx}^{-1} \right), \tag{5}$$

under the assumption that $\alpha_i \equiv \alpha$ for all $i$, whereas

$$T \left( \hat{\beta}_{FE} - \beta \right) \to_p -\omega_{12} \left( \int_0^1 \int_0^r e^{(r-s)c}\, ds\, dr \right) \underline{\Omega}_{xx}^{-1}, \tag{6}$$

whenever $\omega_{12} \neq 0$.³ Thus, the estimator without individual effects is asymptotically unbiased and normally distributed; summing up over the cross-section in the pooled estimator eliminates the usual near unit-root asymptotic distributions found in the time-series case. The fixed effects estimator, on the other hand, suffers from a second order bias; in practice, this means that the estimator will exhibit a small sample bias and test statistics will not have standard distributions.

The intuition behind these results is that when pooling the data, independent cross-sectional information dilutes the endogeneity effects and thus potentially alleviates the bias effects seen in the time-series case; persistent regressors that are exogenous do not cause any inferential issues. This intuition holds when no individual intercepts are allowed in the specification.
The bias in the fixed effects estimation arises because the time-series demeaning induces a correlation between the innovation processes $u_{i,t}$ and the demeaned regressors $\underline{x}_{i,t-1}$; intuitively, this happens because information available after time $t-1$ is used in the demeaning of $x_{i,t-1}$.⁴

2.3 An alternative representation of the fixed effects bias

In the case of a predictive time-series regression, Stambaugh (1999) shows that the bias in the OLS estimator of the slope coefficient $\beta$ in equation (1) is a function of the bias in the OLS estimator of the AR coefficient $\rho$ in equation (2). Here we derive an analogous result for the fixed effects estimator in the panel case.

Note that $-\int_0^1 \int_0^r e^{(r-s)c}\, ds\, dr = -(e^c - c - 1)/c^2$ and let $\theta(c) \equiv -(e^c - c - 1)/c^2$. The limiting bias of $\hat{\beta}_{FE}$ is thus given by $T^{-1} \left( \omega_{12} \theta(c) / \underline{\Omega}_{xx} \right)$. Let the fixed effects estimator of $\rho$ be given by

$$\hat{\rho}_{FE} = \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \underline{x}_{i,t-1}^{2} \right)^{-1} \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \underline{x}_{i,t} \underline{x}_{i,t-1} \right). \tag{7}$$

As shown in Moon and Phillips (2000), the bias in $\hat{\rho}_{FE}$ is in fact equal to $T^{-1} \left( \omega_{22} \theta(c) / \underline{\Omega}_{xx} \right)$. Thus, the limiting bias in the pooled fixed effects estimator of $\beta$ can be written as a function of the limiting bias in the fixed effects estimator of the auto-regressive parameter $\rho$. That is,

$$\underset{(T, n \to \infty)_{seq}}{\text{p-lim}}\; T \left( \hat{\beta}_{FE} - \beta \right) = \underset{(T, n \to \infty)_{seq}}{\text{p-lim}}\; \frac{\omega_{12}}{\omega_{22}}\, T \left( \hat{\rho}_{FE} - \rho \right). \tag{8}$$

This is the analogue of the expression for the bias in the time-series estimator of $\beta$ given by Stambaugh (1999). The Stambaugh bias thus carries over directly to pooled regressions, once fixed effects are included.

In the time-series case, it is well known that standard least squares estimates of auto-regressive coefficients close to unity are much more biased when there is an intercept included in the regression. These effects also carry over to a predictive regression with persistent regressors; if no intercept is included in the regression, the Stambaugh bias will be much smaller. Of course, an intercept is required in almost all time-series applications. The panel data therefore gets us halfway: if only a common intercept is included in the pooled regression, the resulting estimator is well behaved, but once individual intercepts are included the bias shows up in the panel case as well.

3 A bias-corrected estimator

3.1 The infeasible estimator

For a known $c$, a bias-corrected fixed effects estimator is given by

$$\hat{\beta}^{+}_{FE} = \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \underline{x}_{i,t-1}^{2} \right)^{-1} \left( \sum_{i=1}^{n} \sum_{t=1}^{T} \underline{y}_{i,t} \underline{x}_{i,t-1} - nT \omega_{12} \theta(c) \right). \tag{9}$$

As shown in the Appendix, as $(T, n \to \infty)_{seq}$,

$$\sqrt{n}\,T \left( \hat{\beta}^{+}_{FE} - \beta \right) \Rightarrow N\left( 0,\; \omega_{11} \underline{\Omega}_{xx}^{-1} - \left( \omega_{12} \theta(c) \right)^{2} \underline{\Omega}_{xx}^{-2} \right). \tag{10}$$

Thus, the bias-corrected estimator $\hat{\beta}^{+}_{FE}$ is asymptotically normally distributed and converges at the same $\sqrt{n}\,T$-rate as the standard pooled estimator.

³ In the special case of $\omega_{12} = 0$, it follows easily that $\hat{\beta}_{FE}$ is also asymptotically normally distributed with convergence rate $\sqrt{n}\,T$.

⁴ Polk et al. (2006) make the same conjecture regarding inference in pooled predictive regressions, namely that independent cross-sectional information dilutes the endogeneity effects, but do not recognize that this intuition fails in the presence of fixed effects. Their regressor is nearly exogenous however, and their empirical conclusions should therefore still be fairly accurate.
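The correction in equation (9) can be sketched in a few lines. The block below is an illustrative sketch rather than the author's code: the function names are hypothetical, `omega12` and `c` are treated as known inputs as in the infeasible case, and the small-$|c|$ branch in `theta` is a numerical convenience added here to avoid cancellation error near $c = 0$.

```python
import math


def theta(c):
    """theta(c) = -(e^c - c - 1) / c^2, with limit -1/2 as c -> 0."""
    if abs(c) < 1e-4:
        # Taylor expansion around c = 0 (added here for numerical stability)
        return -0.5 - c / 6.0
    return -(math.exp(c) - c - 1.0) / c ** 2


def demean_rows(z):
    """Subtract each unit's own time-series mean (fixed effects transform)."""
    out = []
    for row in z:
        m = sum(row) / len(row)
        out.append([v - m for v in row])
    return out


def fe_estimators(y, x_lag, omega12, c):
    """Return (beta_FE, beta_FE_plus) following equations (4) and (9).

    y[i][t] is paired with the lagged regressor x_lag[i][t];
    omega12 and c are assumed known (the infeasible case).
    """
    n, T = len(y), len(y[0])
    yd, xd = demean_rows(y), demean_rows(x_lag)
    sxx = sum(v * v for row in xd for v in row)
    sxy = sum(a * b for ry, rx in zip(yd, xd) for a, b in zip(ry, rx))
    beta_fe = sxy / sxx
    beta_plus = (sxy - n * T * omega12 * theta(c)) / sxx
    return beta_fe, beta_plus
```

Because $\theta(c) < 0$, a negative $\omega_{12}$ makes the subtracted term positive, so the corrected estimate lies below the uncorrected one, in line with the upward bias described in the text for that case.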

3.2 The feasible estimator

In order to implement $\hat{\beta}^{+}_{FE}$ in practice, estimates of $c$ and $\omega_{12}$ are required. The parameter $\omega_{12}$ is the average covariance between the error terms $u_{i,t}$ and $v_{i,t}$ and can be estimated by averaging the estimates of the individual covariances $\omega_{12i}$; estimates of $\omega_{12i}$ can be formed using fitted residuals from either the pooled or time-series estimates of equations (1) and (2).⁵ In practice, the implementation of $\hat{\beta}^{+}_{FE}$ will not be very sensitive to the exact way of estimating $\omega_{12}$. Rather, the crucial parameter is $c$, which is more difficult to estimate consistently.

In the time-series case, consistent estimation of $c$ is not possible. That is, $\rho$ can be estimated consistently, but not with enough precision to identify $c = T(\rho - 1)$. This is also the reason why the Stambaugh bias is difficult to correct in practice in time-series regressions; for instance, given the lack of precise knowledge of $\rho$, Lewellen (2004) suggests a bias correction that leads to conservative tests by imposing a maximum value on the bias under the assumption that $\rho \le 1$. In the panel data case, it is possible to estimate $c$ consistently.

As discussed previously, the pooled estimate of $\rho$ is biased when including fixed effects. This bias naturally carries over to the estimator of $c = T(\rho - 1)$ as well. However, as discussed in Moon and Phillips (2000), even when there are fixed effects in equation (2), a consistent estimator of $c$ is obtained by simply using the plain pooled estimator without any demeaning of the variables. The estimator of $\rho$ with fixed effects is biased for reasons similar to those of the fixed effects estimator of $\beta$; by not demeaning the data, the bias is no longer present.
Intuitively, the fixed effects, or intercepts, in equation (2) can be ignored in the estimation of $\rho$ because when the root $\rho = 1 + c/T$ is close to unity, there is enough variation in $x_{i,t}$ that these intercepts are of negligible importance. Therefore, let $\hat{\rho}_{Pool}$ be the plain pooled estimator of $\rho$,

$$\hat{\rho}_{Pool} = \left( \sum_{i=1}^{n} \sum_{t=1}^{T} x_{i,t-1}^{2} \right)^{-1} \left( \sum_{i=1}^{n} \sum_{t=1}^{T} x_{i,t} x_{i,t-1} \right), \tag{11}$$

and define the corresponding estimator of $c$ as $\hat{c} = T(\hat{\rho}_{Pool} - 1)$. Moon and Phillips (2000) show that this estimator of $c$ is consistent; again, observe that the data used in estimating $c$ is not time-series demeaned and that demeaning the data in the time-series dimension will lead to a bias in the estimator.

A feasible version of $\hat{\beta}^{+}_{FE}$ is thus given by replacing $\omega_{12} \theta(c)$ with $\hat{\omega}_{12} \theta(\hat{c})$ in equation (9).

Formally, the asymptotic normality of $\hat{\beta}^{+}_{FE}$ is shown only for the infeasible version of the estimator, which is based on the true (unknown) value of $c$. Although it is outside the scope of this paper to derive the exact limiting distribution of the feasible version of the estimator, the simulation results below show that inference based on the assumption of normality works well also in this case.

⁵ Recall that although both the time-series and pooled fixed effects estimators of $\beta$ and $\rho$ are generally biased in finite samples, they are still consistent estimators. Estimators of the covariance $\omega_{12i}$ based on the fitted residuals will therefore be consistent.
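The two inputs to the feasible correction might be computed along the following lines. This is a sketch under stated assumptions, not the paper's implementation: the function names are hypothetical, each `x[i]` is assumed to hold $x_{i,0}, \dots, x_{i,T}$, and the covariance of the residuals is computed as a mean cross product, which presumes the fitted residuals have (approximately) zero mean.

```python
def rho_pool(x):
    """Equation (11): pooled AR(1) slope from raw, non-demeaned data.

    x[i] is assumed to hold x_{i,0}, ..., x_{i,T} for unit i.
    """
    num = sum(row[t] * row[t - 1] for row in x for t in range(1, len(row)))
    den = sum(row[t - 1] ** 2 for row in x for t in range(1, len(row)))
    return num / den


def c_hat(x):
    """Estimator of the local-to-unity parameter: T * (rho_pool - 1)."""
    T = len(x[0]) - 1
    return T * (rho_pool(x) - 1.0)


def omega12_hat(u_res, v_res):
    """Average over units of the mean cross product of fitted residuals.

    u_res[i] and v_res[i] are residuals from equations (1) and (2);
    they are assumed to have mean zero within each unit.
    """
    per_unit = [sum(a * b for a, b in zip(u, v)) / len(u)
                for u, v in zip(u_res, v_res)]
    return sum(per_unit) / len(per_unit)
```

Plugging `omega12_hat(...)` and `c_hat(...)` into the correction term of equation (9) in place of the true values then gives the feasible estimator described above.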

3.3 Practical inference

Finally, in order to perform feasible inference using either $\hat{\beta}_{Pool}$ or $\hat{\beta}^{+}_{FE}$, one only needs estimates of $\Omega_{xx}$, $\underline{\Omega}_{xx}$, and $\omega_{11}$. Natural estimators of $\Omega_{xx}$ and $\underline{\Omega}_{xx}$ are given by $\hat{\Omega}_{xx} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} \tilde{x}_{i,t-1}^{2}$ and $\hat{\underline{\Omega}}_{xx} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} \underline{x}_{i,t-1}^{2}$, respectively. Let $\hat{u}_{i,t}$ be the fitted residuals and $\hat{\omega}_{11} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{T} \sum_{t=1}^{T} \hat{u}_{i,t}^{2}$.⁶ An estimate of the variance of $\hat{\beta}_{Pool}$ is thus given by $\hat{\omega}_{11} \hat{\Omega}_{xx}^{-1}$.

Similarly, the natural estimator of $\underline{\Phi}_{ux}$, the variance term for the bias-corrected estimator defined in the Appendix, is given by $\hat{\underline{\Phi}}^{+}_{ux} = \hat{\omega}_{11} \hat{\underline{\Omega}}_{xx} - \left( \hat{\omega}_{12} \theta(\hat{c}) \right)^{2}$. However, this estimator of $\underline{\Phi}_{ux}$ suffers from the drawback that it is not necessarily positive. Furthermore, subtracting off the term coming from the bias correction of the estimator, without controlling for the possibility that the feasible bias correction induces additional variance in the estimator of $\beta$ through the sampling error in $\hat{c}$, may lead to too low an estimate of the variance of the feasible version of $\hat{\beta}^{+}_{FE}$. That is, as mentioned above, the exact limiting distribution of the feasible estimator is unknown, and it therefore seems reasonable to use an estimator that is more robust. Thus, I propose to use the estimator $\hat{\underline{\Phi}}_{ux} = \hat{\omega}_{11} \hat{\underline{\Omega}}_{xx}$, and estimate the variance of $\hat{\beta}^{+}_{FE}$ by $\hat{\omega}_{11} \hat{\underline{\Omega}}_{xx}^{-1}$; this will result in a more conservative, i.e. larger, estimate of the variance. In both the pooled and fixed effects cases, therefore, standard estimators can be used to estimate the variance of the estimators.

Since the distributions of $\hat{\beta}_{Pool}$ and $\hat{\beta}^{+}_{FE}$ are (approximately) asymptotically normal, implementing tests on the slope coefficient becomes trivial. The t-statistic for the estimator $\hat{\beta}_{Pool}$, for instance, will satisfy

$$t_{Pool} = \frac{\hat{\beta}_{Pool} - \beta_0}{\sqrt{\hat{\omega}_{11} \hat{\Omega}_{xx}^{-1} / (nT^2)}} \Rightarrow N(0, 1). \tag{12}$$
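As a rough illustration of equation (12), the pooled t-statistic could be computed as below. The function name and data layout are assumptions made here (each `y[i][t]` is paired with its lagged regressor `x_lag[i][t]`), and the residuals are taken from the pooled regression with a common intercept.

```python
import math


def pooled_t_stat(y, x_lag, beta0=0.0):
    """t-statistic of equation (12) for the pooled estimator."""
    n, T = len(y), len(y[0])
    y_bar = sum(sum(row) for row in y) / (n * T)
    x_bar = sum(sum(row) for row in x_lag) / (n * T)
    # overall demeaned data, as in the definition of the pooled estimator
    yt = [[v - y_bar for v in row] for row in y]
    xt = [[v - x_bar for v in row] for row in x_lag]
    sxx = sum(v * v for row in xt for v in row)
    sxy = sum(a * b for ry, rx in zip(yt, xt) for a, b in zip(ry, rx))
    beta = sxy / sxx
    # omega11_hat: average squared fitted residual
    ssr = sum((a - beta * b) ** 2
              for ry, rx in zip(yt, xt) for a, b in zip(ry, rx))
    omega11 = ssr / (n * T)
    omega_xx = sxx / (n * T ** 2)
    se = math.sqrt(omega11 / omega_xx / (n * T ** 2))
    return (beta - beta0) / se
```

Note that $\hat{\omega}_{11} \hat{\Omega}_{xx}^{-1} / (nT^2)$ reduces to $\hat{\omega}_{11}$ divided by the sum of squared demeaned regressors, so this is the familiar OLS t-statistic written in the paper's scaling.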
The t-statistic $t^{+}_{FE}$ corresponding to $\hat{\beta}^{+}_{FE}$ is constructed in an analogous manner using $\hat{\underline{\Omega}}_{xx}$ instead of $\hat{\Omega}_{xx}$. In the simulations and empirical illustrations below, we also consider the properties of the t-statistic corresponding to the standard fixed effects estimator, $t_{FE}$, which again is identical to $t_{Pool}$ with $\hat{\Omega}_{xx}$ replaced by $\hat{\underline{\Omega}}_{xx}$; given the above discussion, $t_{FE}$ will not be standard normally distributed unless $\omega_{12} = 0$. However, inference using $\hat{\beta}_{FE}$ and $t_{FE}$ under the normality assumption provides a useful illustration of the biases that occur if one ignores the issues resulting from the endogeneity and persistence of the regressor.

4 Simulation evidence

To evaluate the small sample properties of the panel data estimators proposed in this paper, a Monte Carlo study is performed. In the first experiment, the properties of the point estimates are considered. Equations (1) and (2) are simulated for the case with a single regressor. The innovations $(u_{i,t}, v_{i,t})$ are drawn from normal distributions with mean zero, unit variance, and correlations $\delta = 0$, $-0.4$, $-0.7$, and $-0.95$. The slope parameter $\beta$ is set equal to $0.05$ and the local-to-unity parameter $c$ is set to $-5$. The sample size is given by $T = 100$, $n = 20$. The small value of $\beta$ is chosen in order to reflect the fact that most forecasting regressions are used to test a null of $\beta = 0$, and any plausible alternative is often close to zero. The intercepts $\alpha_i$ are all set equal to zero. All results are based on 10,000 repetitions.

Three different estimators are considered: the pooled estimator with no fixed effects, $\hat{\beta}_{Pool}$, the fixed effects estimator, $\hat{\beta}_{FE}$, and the bias-corrected fixed effects estimator, $\hat{\beta}^{+}_{FE}$. The bias-correction term in the estimator $\hat{\beta}^{+}_{FE}$ is estimated by $\hat{\omega}_{12} \theta(\hat{c})$, where $\hat{c}$ is the panel estimate of the local-to-unity parameter and $\hat{\omega}_{12}$ is estimated as $n^{-1} \sum_{i=1}^{n} \hat{\omega}_{12i}$ with $\hat{\omega}_{12i}$ the covariance between the residuals from a time-series estimation of equation (1) and the residuals from the pooled estimation of equation (2).⁷

The results are shown in Figure 1. $\hat{\beta}_{Pool}$ and $\hat{\beta}^{+}_{FE}$ are virtually unbiased whereas $\hat{\beta}_{FE}$ exhibits a rather substantial bias when the absolute value of the correlation $\delta$ is large. The bias-corrected estimator, $\hat{\beta}^{+}_{FE}$, has a slightly less peaked distribution than the standard pooled estimator, $\hat{\beta}_{Pool}$, but overall the bias correction appears to produce good point estimates.

The second part of the Monte Carlo study concerns the size and power of the pooled t-tests. The same setup as above is used, but, in order to calculate the power of the tests, the slope coefficient $\beta$ now varies between $-0.05$ and $0.05$. The tests are evaluated under the assumption that the limiting distributions are standard normal; i.e. the null is rejected for absolute test values greater than 1.96.

⁶ Alternatively, the estimator of $\omega_{11}$ could be replaced by a robust variance estimator to allow for heteroskedasticity in the error terms.
Panel A in Table 1 shows the average sizes of the nominal five percent tests under the null hypothesis of $\beta = 0$ for the two-sided t-tests corresponding to the three different estimators considered above. Figure 2 shows the average rejection rates of the five percent two-sided t-tests, evaluating a null of $\beta = 0$ for different values of the true $\beta$; that is, the power curves of the tests. Again, the results are based on 10,000 repetitions.

Apart from the test based on the standard fixed effects estimator, the other two tests perform very well in terms of size, with actual rejection rates very close to five percent in the nominal five percent test. Table 1 and the power curves in Figure 2 clearly show the effects of the second order bias in the fixed effects estimator. The test based on the bias-corrected fixed effects estimator has similar power properties to the test based on the standard pooled estimator.

In practice, the assumption that $\rho$ is identical for all $i$, i.e. that the regressors all have the same persistence, may seem restrictive. I therefore briefly analyze the robustness of the bias-correction method proposed here to deviations from this assumption. In particular, size simulations identical to those reported in Panel A of Table 1 are shown in Panel B when the individual local-to-unity parameters, $c_i$, are drawn from a uniform distribution with support $[-20, -2]$. As is seen, the results are very similar, and the bias correction appears fairly robust to this generalization. The standard pooled estimator should not be affected, since the assumption of a common parameter $c$ is not needed in deriving its asymptotic result.

In summary, the simulation evidence shows the importance of controlling for the bias arising from fitting individual intercepts in the pooled regression. The bias correction of the fixed effects estimator appears to work well, producing nearly unbiased results and correctly sized tests with good power. In cases where individual effects are not present, the pooled estimator performs well also when the regressors are highly endogenous, as the theory would predict.

5 Empirical illustration and conclusion

To illustrate the methods developed in this paper, I consider the question of stock return predictability in an international data set. The data are obtained from the MSCI database and consist of a panel of total returns for stock markets in 18 different countries and three corresponding forecasting variables: the dividend- and earnings-price ratios as well as the book-to-market values. With varying success, all three of these variables have been used extensively in tests of stock return predictability for U.S. data (e.g. Lewellen, 2004, and Campbell and Yogo, 2006), and to a lesser degree in international data (e.g. Ang and Bekaert, 2007). All three of these forecasting variables are highly persistent, and since they are all valuation ratios, their innovations are likely to be highly correlated with the innovations to the returns process.

The data are on a monthly basis and the returns data span the period 1970.1 to 2002.12, though not all forecasting variables or all countries are available for this whole time-period.

⁷ As mentioned before, the exact estimation procedure for the $\omega_{12i}$s does not play a crucial role in the properties of $\hat{\beta}^{+}_{FE}$.
In particular, I have data for stock indices in the following countries: Australia, Austria, Belgium, Canada, Denmark, France, Germany, Hong Kong, Italy, Japan, the Netherlands, Norway, Singapore, Spain, Sweden, Switzerland, the UK, and the USA.⁸ The dividend-price ratio ($d - p$) is available for all countries except Hong Kong and for the entire sample period from 1970.1 onwards. The earnings-price ratio ($e - p$) is available for all countries except Italy and Switzerland, from 1974.12 onwards. The book-to-market value ($b - p$) is available for all countries from 1974.12 onwards. All returns are expressed in U.S. dollars, and the dependent variable in the predictive regressions is given by the excess return over the 1-month U.S. T-bill rate. Finally, all data are log-transformed.

The results from the pooled forecasting regressions are shown in Table 2. The estimates of $c$ and the correlation between the innovations in the returns and predictor processes show that the forecasting variables are clearly near unit-root processes and highly endogenous. The standard pooled fixed effects estimator, $\hat{\beta}_{FE}$, delivers highly significant estimates and clearly rejects the null hypothesis of no predictability. Given the high persistence and endogeneity found in the data, however, these results are likely to be upward biased.

⁸ Hong Kong is, of course, not a country.

As seen from the estimates based on the bias-corrected fixed effects estimator, $\hat{\beta}^{+}_{FE}$, significance disappears when controlling for the bias induced by the time-series demeaning in the fixed effects estimator; $\hat{\beta}^{+}_{FE}$ is implemented in a manner identical to that described in the simulation section above. Overall, the case for stock return predictability in this international data set, using any of the three predictor variables, must be considered very weak. These results are in line with the extensive study of stock return predictability in international data by Hjalmarsson (2007), which suggests that the predictive power of valuation ratios is typically weak in international data. Ang and Bekaert (2007) also find in a smaller international sample, covering France, Germany, the UK and the USA over a similar sample period, that the predictive ability of the dividend-price ratio is very weak in all four of these countries.⁹

The empirical results illustrate well the difficulties of performing inference in regressions with persistent and endogenous variables, and that these difficulties also prevail when a panel of data, rather than a single time-series, is available. Indeed, judging by the vast difference between the estimates and test statistics resulting from the standard fixed effects estimator and those from the robust estimators, it is clear that the bias effects can be as large in panel estimations as in time-series regressions.

⁹ Ang and Bekaert (2007) do argue that the dividend-price ratio has some predictive ability when considered jointly with the short interest rate.

A The asymptotic properties of $\hat{\beta}_{Pool}$, $\hat{\beta}_{FE}$, and $\hat{\beta}^{+}_{FE}$

Hjalmarsson (2007) derives the asymptotic properties of $\hat{\beta}_{Pool}$ and $\hat{\beta}_{FE}$ in a similar setting to the one considered here, although he does not consider the bias-corrected estimator $\hat{\beta}^{+}_{FE}$. The following derivations therefore primarily recollect those found in Hjalmarsson (2007).

Given the conditions on $u_{i,t}$ and $v_{i,t}$, as $T \to \infty$, $\frac{1}{\sqrt{T}} \sum_{t=1}^{[Tr]} w_{i,t} \Rightarrow B_i(r) = BM(\Omega_i)(r)$, where $B_i(\cdot) = (B_{1i}(\cdot), B_{2i}(\cdot))'$ denotes a two-dimensional Brownian motion. As $T \to \infty$, $x_{i,t=[Tr]}/\sqrt{T} \Rightarrow J_i(r)$, where $J_i(r) = \int_0^r e^{(r-s)c}\, dB_{2,i}(s)$. Analogous results hold for the time-series demeaned data, $\underline{x}_{i,t}$, with $J_i$ replaced by $\underline{J}_i$.

First note that $\tilde{x}_{i,t=[Tr]}/\sqrt{T} = x_{i,t}/\sqrt{T} + O_p\left(1/\sqrt{n}\right) \Rightarrow J_i(r)$, so that the overall demeaning has no asymptotic effects. By standard results as $T \to \infty$, $\frac{1}{T} \sum_{t=1}^{T} u_{i,t} \tilde{x}_{i,t-1} \Rightarrow \int_0^1 dB_{1,i} J_i$ and $\frac{1}{T^2} \sum_{t=1}^{T} \tilde{x}_{i,t-1}^{2} \Rightarrow \int_0^1 J_i^2$. Since $B_{1,i}$ and $J_i$ are iid across $i$ and $E\left[\int_0^1 dB_{1,i} J_i\right] = 0$, it follows that as $n \to \infty$, $\frac{1}{n} \sum_{i=1}^{n} \int_0^1 J_i^2 \to_p \Omega_{xx}$ and $\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \int_0^1 dB_{1,i} J_i \Rightarrow N(0, \Phi_{ux})$, where $\Phi_{ux} \equiv E\left[\left(\int_0^1 dB_{1,i} J_i\right)^2\right]$, by the weak law of large numbers and the central limit theorem, respectively. Thus, as $(T, n \to \infty)_{seq}$, $\frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} \tilde{x}_{i,t-1}^{2} \to_p \Omega_{xx}$ and $\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{1}{T} \sum_{t=1}^{T} u_{i,t} \tilde{x}_{i,t-1} \Rightarrow N(0, \Phi_{ux})$. It follows that $\sqrt{n}\,T \left(\hat{\beta}_{Pool} - \beta\right) \Rightarrow N\left(0, \Phi_{ux} \Omega_{xx}^{-2}\right)$. By the Itô isometry, $\Phi_{ux} = \omega_{11} E\left[\int_0^1 J_i^2\right] = \omega_{11} \Omega_{xx}$ and thus $\Phi_{ux} \Omega_{xx}^{-2} = \omega_{11} \Omega_{xx}^{-1}$.

Similarly, $\frac{1}{n} \sum_{i=1}^{n} \frac{1}{T^2} \sum_{t=1}^{T} \underline{x}_{i,t-1}^2 \to_p \underline{\Omega}_{xx}$, as $(T, n \to \infty)_{seq}$. However, simple calculations yield that $E\left[\int_0^1 dB_{1,i} \underline{J}_i\right] = -\omega_{12} \left(\int_0^1 \int_0^r e^{(r-s)c} \, ds \, dr\right) \neq 0$, and it follows that $\frac{1}{n} \sum_{i=1}^{n} \frac{1}{T} \sum_{t=1}^{T} u_{i,t} \underline{x}_{i,t-1} \to_p -\omega_{12} \left(\int_0^1 \int_0^r e^{(r-s)c} \, ds \, dr\right)$, as $(T, n \to \infty)_{seq}$. Thus, $T (\hat{\beta}_{FE} - \beta) \to_p -\omega_{12} \left(\int_0^1 \int_0^r e^{(r-s)c} \, ds \, dr\right) \underline{\Omega}_{xx}^{-1}$ as $(T, n \to \infty)_{seq}$.

By removing the mean of the term $\int_0^1 dB_{1,i} \underline{J}_i$ in the bias-corrected estimator $\hat{\beta}^{+}_{FE}$, the central limit theorem once more applies and, by the same arguments as for $\hat{\beta}_{Pool}$, $\sqrt{n} T (\hat{\beta}^{+}_{FE} - \beta) \Rightarrow N(0, \underline{\Phi}_{ux} \underline{\Omega}_{xx}^{-2})$, where $\underline{\Phi}_{ux} = E\left[\left(\int_0^1 dB_{1,i} \underline{J}_i - E\left[\int_0^1 dB_{1,i} \underline{J}_i\right]\right)^2\right] = E\left[\left(\int_0^1 dB_{1,i} \underline{J}_i - \omega_{12} \theta(c)\right)^2\right]$. By the Itô isometry, it follows that $\underline{\Phi}_{ux} = \omega_{11} \underline{\Omega}_{xx} - (\omega_{12} \theta(c))^2$ and $\underline{\Phi}_{ux} \underline{\Omega}_{xx}^{-2} = \omega_{11} \underline{\Omega}_{xx}^{-1} - (\omega_{12} \theta(c))^2 \underline{\Omega}_{xx}^{-2}$.
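As a worked check on the bias term, the double integral appearing in the limit of $T(\hat{\beta}_{FE} - \beta)$ can be evaluated in closed form by elementary calculus. The evaluation at $c = -5$ is only an illustration, chosen because that value is used in Panel A of Table 1; it is not a figure reported in the paper.

```latex
\begin{align*}
\int_0^1 \int_0^r e^{(r-s)c} \, ds \, dr
  &= \int_0^1 \frac{e^{rc} - 1}{c} \, dr
   = \frac{e^{c} - 1 - c}{c^{2}}, \qquad c \neq 0, \\
\left. \frac{e^{c} - 1 - c}{c^{2}} \right|_{c = -5}
  &= \frac{e^{-5} + 4}{25} \approx 0.160, \\
\lim_{c \to 0} \frac{e^{c} - 1 - c}{c^{2}} &= \frac{1}{2}.
\end{align*}
```

The integral is strictly positive for every $c$, so the probability limit of $T(\hat{\beta}_{FE} - \beta)$ is nonzero whenever $\omega_{12} \neq 0$, and it shrinks in magnitude as $c$ becomes more negative, that is, as the regressor becomes less persistent.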

References

[1] Ang, A., and G. Bekaert, 2007. Stock Return Predictability: Is it There? Review of Financial Studies 20, 651-707.
[2] Campbell, J.Y., and M. Yogo, 2006. Efficient Tests of Stock Return Predictability, Journal of Financial Economics 81, 27-60.
[3] Cavanagh, C., G. Elliott, and J. Stock, 1995. Inference in Models with Nearly Integrated Regressors, Econometric Theory 11, 1131-1147.
[4] Cohen, R., C. Polk, and T. Vuolteenaho, 2003. The Value Spread, Journal of Finance 58, 609-641.
[5] Hjalmarsson, E., 2007. Predicting Global Stock Returns, Working Paper, Federal Reserve Board.
[6] Lewellen, J., 2004. Predicting Returns with Financial Ratios, Journal of Financial Economics 74, 209-235.
[7] Mankiw, N.G., and M.D. Shapiro, 1986. Do We Reject Too Often? Small Sample Properties of Tests of Rational Expectations Models, Economics Letters 20, 139-145.
[8] Moon, H.R., and P.C.B. Phillips, 2000. Estimation of Autoregressive Roots near Unity using Panel Data, Econometric Theory 16, 927-998.
[9] Phillips, P.C.B., and H.R. Moon, 1999. Linear Regression Limit Theory for Nonstationary Panel Data, Econometrica 67, 1057-1111.
[10] Polk, C., S. Thompson, and T. Vuolteenaho, 2006. Cross-Sectional Forecasts of the Equity Premium, Journal of Financial Economics 81, 101-141.
[11] Stambaugh, R., 1999. Predictive Regressions, Journal of Financial Economics 54, 375-421.
[12] Thompson, S.B., 2006. Simple Formulas for Standard Errors that Cluster by Both Firm and Time, Working Paper, Harvard University.

Table 1: Size results from the Monte Carlo study. The table shows the average rejection rates under the null of $\beta = 0$, for the two-sided t-tests corresponding to the respective estimators; the nominal size of the tests is 5 percent. The differing values of $\delta$ are given in the top row of the table and the results are based on 10,000 repetitions. The sample size used is $T = 100$ and $n = 20$. In Panel A, the local-to-unity parameter, $c$, is set equal to $-5$. In Panel B, separate local-to-unity parameters $c_i$ are drawn for each $i$ from a uniform distribution with support $[-20, -2]$.

Estimator                 delta = 0.0   delta = -0.4   delta = -0.7   delta = -0.95

Panel A: c = -5
$\hat{\beta}_{Pool}$         0.050         0.051          0.054          0.050
$\hat{\beta}_{FE}$           0.052         0.211          0.546          0.807
$\hat{\beta}^{+}_{FE}$       0.054         0.052          0.056          0.054

Panel B: c_i ~ U[-20, -2]
$\hat{\beta}_{Pool}$         0.053         0.051          0.053          0.053
$\hat{\beta}_{FE}$           0.056         0.150          0.362          0.584
$\hat{\beta}^{+}_{FE}$       0.056         0.054          0.059          0.064
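The size distortion reported for the fixed effects estimator in Table 1 can be illustrated with a small simulation. The sketch below is not the author's code: the function names (`demean`, `t_stat`, `rejection_rates`) are hypothetical, and it assumes normal innovations with unit variances, a zero initial value $x_{i,0} = 0$, zero individual intercepts, a homoskedastic OLS standard error, and a critical value of 1.96. It uses far fewer repetitions than the 10,000 in the paper, so its output is only roughly comparable to the table.

```python
import math
import random


def demean(values):
    """Subtract the sample mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def t_stat(y, x):
    """t-statistic of the OLS slope of y on x (both already demeaned)."""
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    sigma2 = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y)) / len(y)
    return beta / math.sqrt(sigma2 / sxx)


def rejection_rates(n=20, T=100, c=-5.0, delta=-0.95, reps=200, seed=0):
    """Share of replications with |t| > 1.96 under the null beta = 0."""
    rng = random.Random(seed)
    rho = 1.0 + c / T                      # local-to-unity autoregressive root
    scale = math.sqrt(1.0 - delta ** 2)
    counts = {"pooled": 0, "fe": 0}
    for _ in range(reps):
        panel_y, panel_x = [], []
        for _ in range(n):
            x_prev, y_i, x_i = 0.0, [], []  # assumed start: x_{i,0} = 0
            for _ in range(T):
                v = rng.gauss(0.0, 1.0)
                u = delta * v + scale * rng.gauss(0.0, 1.0)  # corr(u, v) = delta
                x_i.append(x_prev)          # regressor x_{i,t-1}
                y_i.append(u)               # y_{i,t} = u_{i,t} under the null
                x_prev = rho * x_prev + v   # x_{i,t} = rho * x_{i,t-1} + v_{i,t}
            panel_y.append(y_i)
            panel_x.append(x_i)
        # Pooled: stack all units and remove only the overall mean.
        pooled_y = demean([val for row in panel_y for val in row])
        pooled_x = demean([val for row in panel_x for val in row])
        # Fixed effects: remove each unit's own time-series mean, then stack.
        fe_y = [val for row in panel_y for val in demean(row)]
        fe_x = [val for row in panel_x for val in demean(row)]
        counts["pooled"] += abs(t_stat(pooled_y, pooled_x)) > 1.96
        counts["fe"] += abs(t_stat(fe_y, fe_x)) > 1.96
    return {name: hits / reps for name, hits in counts.items()}


if __name__ == "__main__":
    print(rejection_rates())
```

With the default settings, which mirror the $\delta = -0.95$ column of Panel A, one would expect the fixed effects rejection rate to come out far above the nominal 5 percent and the pooled rate to stay close to it, although the exact figures depend on the seed and the number of repetitions. The only difference between the two branches is whether the mean is removed overall or unit by unit, which isolates the time-series demeaning as the source of the distortion.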

Table 2: Results from the empirical regressions. The table shows the point estimates and corresponding t-statistics (in parentheses) from the pooled regressions of excess stock returns onto either the dividend price ratio ($d - p$), the earnings price ratio ($e - p$), or the book-to-market value ($b - p$). The first column indicates which of the three forecasting variables is used and the second and third columns give the size of the panel used in the regression. The next two columns give the results for the standard fixed effects estimator and the bias-corrected fixed effects estimator, respectively. The final two columns give the estimate of the local-to-unity parameter in the regressors and the average correlation between the innovations to the returns and the regressors, respectively.

Variable    n     T     $\hat{\beta}_{FE}$   $\hat{\beta}^{+}_{FE}$   $\hat{c}_{pool}$   $\hat{\delta}$

d - p       17    396    0.007               -0.002                   -0.004             -0.771
                        (3.840)             (-1.254)
e - p       16    337    0.011                0.000                    0.091             -0.697
                        (4.924)             (-0.089)
b - p       18    337    0.008                0.002                   -1.538             -0.835
                        (4.016)              (0.948)

Figure 1: Estimation results from the Monte Carlo study. The graphs show the kernel density estimates of the estimated slope coefficients, for samples with $T = 100$ and $n = 20$. The solid lines, labeled Pooled in the legend, show the results for the standard pooled estimator without individual intercepts, $\hat{\beta}_{Pool}$; the long dashed lines, labeled Fixed Effects, show the results for the standard fixed effects estimator, $\hat{\beta}_{FE}$; the dotted lines, labeled Bias Corrected FE, show the results for the bias-corrected fixed effects estimator, $\hat{\beta}^{+}_{FE}$. All results are based on 10,000 repetitions.

Figure 2: Power results from the Monte Carlo study. The graphs show the average rejection rates for a two-sided 5 percent t-test of the null hypothesis of $\beta = 0$, for samples with $T = 100$ and $n = 20$. The x-axis shows the true value of the parameter $\beta$, and the y-axis indicates the average rejection rate. The solid lines, labeled Pooled, give the results for the t-test corresponding to the standard pooled estimator without individual intercepts, $\hat{\beta}_{Pool}$; the long dashed lines, labeled Fixed Effects, show the results for the t-test corresponding to the standard fixed effects estimator, $\hat{\beta}_{FE}$; the dotted lines, labeled Bias Corrected FE, show the results for the t-test corresponding to the bias-corrected fixed effects estimator, $\hat{\beta}^{+}_{FE}$. The flat lines indicate the 5% rejection rate. All results are based on 10,000 repetitions.

Cite this document
APA
Erik Hjalmarsson (2007). The Stambaugh Bias in Panel Predictive Regressions (IFDP 2007-914). Board of Governors of the Federal Reserve System, International Finance Discussion Papers. https://whenthefedspeaks.com/doc/ifdp_2007-914
BibTeX
@techreport{wtfs_ifdp_2007_914,
  author = {Erik Hjalmarsson},
  title = {The Stambaugh Bias in Panel Predictive Regressions},
  type = {International Finance Discussion Papers},
  number = {2007-914},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2007},
  url = {https://whenthefedspeaks.com/doc/ifdp_2007-914},
  abstract = {This paper analyzes predictive regressions in a panel data setting. The standard fixed effects estimator suffers from a small sample bias, which is the analogue of the Stambaugh bias in time-series predictive regressions. Monte Carlo evidence shows that the bias and resulting size distortions can be severe. A new bias-corrected estimator is proposed, which is shown to work well in finite samples and to lead to approximately normally distributed t-statistics. Overall, the results show that the econometric issues associated with predictive regressions when using time-series data to a large extent also carry over to the panel case. The results are illustrated with an application to predictability in international stock indices.},
}