ifdp · April 30, 2007

A Note on the Coefficient of Determination in Models with Infinite Variance Variables

Abstract

Since the seminal work of Mandelbrot (1963), alpha-stable distributions with infinite variance have been regarded as a more realistic distributional assumption than the normal distribution for some economic variables, especially financial data. After providing a brief survey of theoretical results on estimation and hypothesis testing in regression models with infinite-variance variables, we examine the statistical properties of the coefficient of determination in models with alpha-stable variables. If the regressor and error term share the same index of stability alpha < 2, the coefficient of determination has a nondegenerate asymptotic distribution on the entire [0, 1] interval, and the density of this distribution is unbounded at 0 and 1. We provide closed-form expressions for the cumulative distribution function and probability density function of this limit random variable. In contrast, if the indices of stability of the regressor and error term are unequal, the coefficient of determination converges in probability to either 0 or 1, depending on which variable has the smaller index of stability. In an empirical application, we revisit the Fama-MacBeth two-stage regression and show that in the infinite-variance case the coefficient of determination of the second-stage regression converges to zero in probability even if the slope coefficient is nonzero.

Board of Governors of the Federal Reserve System International Finance Discussion Papers Number 895 May 2007 A Note on the Coefficient of Determination in Models with Infinite Variance Variables Jeong-Ryeol Kurz-Kim Mico Loretan NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors. Recent IFDPs are available on the Web at www.federalreserve.gov/pubs/ifdp/. This paper can be downloaded without charge from the Social Science Research Network electronic library at http://www.ssrn.com/.

Keywords: Regression models, α-stable distributions, infinite variance, coefficient of determination, Fama-MacBeth regression, Monte Carlo simulation, signal-to-noise ratio, density transformation theorem.
JEL classifications: C12, C13, C21, G12

* The views expressed in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the staff of the Deutsche Bundesbank, the Board of Governors of the Federal Reserve System, or of any other person associated with the Federal Reserve System. We are grateful to Jean-Marie Dufour, Neil R. Ericsson, Peter C.B. Phillips, Werner Ploberger, Jonathan H. Wright, and participants of a workshop at the Federal Reserve Board for valuable comments, and to Zhenyu Wang for the data used in the empirical section of this paper.

† Jeong-Ryeol Kurz-Kim, corresponding author. Research support from the Alexander von Humboldt Foundation is gratefully acknowledged. Deutsche Bundesbank, Wilhelm-Epstein-Strasse 14, 60431 Frankfurt am Main, Germany; Email: jeong-ryeol.kurzkim@bundesbank.de; Tel: +49-69-9566-4576; Fax: +49-69-9566-2982.

‡ Mico Loretan. Mailstop 18, Board of Governors of the Federal Reserve System, Washington DC 20551, USA; Email: mico.loretan@frb.gov; Tel: +1-202-452-2219; Fax: +1-202-263-4850.

1 Introduction

Granger and Orr (1972) begin their article, "'Infinite variance' and research strategy in time series analysis," by questioning the uncritical use of the normal distribution assumption in economic modelling and estimation:

    It is standard procedure in economic modelling and estimation to assume that random variables are normally distributed. In empirical work, confidence intervals and significance tests are widely used, and these usually hinge on the presumption of a normal population. Lately, there has been a growing awareness that some economic data display distributional characteristics that are flatly inconsistent with the hypothesis of normality.

Due in part to the influential seminal work of Mandelbrot (1963), $\alpha$-stable distributions are often considered to provide the basis for more realistic distributional assumptions for some economic data, especially for high-frequency financial time series such as those of exchange rate fluctuations and stock returns. Financial time series are typically fat-tailed and excessively peaked around their mean, phenomena that can be better captured by $\alpha$-stable distributions with $1 < \alpha < 2$ than by the normal distribution, for which $\alpha = 2$.[1] The $\alpha$-stable distributional assumption with $\alpha \le 2$ is thus a generalization of, rather than an alternative to, the Gaussian distributional assumption. If an economic series fluctuates according to an $\alpha$-stable distribution with $\alpha < 2$, it is known that many of the standard methods of statistical analysis, which often rest on the asymptotic properties of sample second moments, do not apply in the conventional way. In particular, as we demonstrate in this paper, the coefficient of determination, a standard criterion for judging goodness of fit in a regression model, has several nonstandard statistical properties if $\alpha < 2$.
The linear regression model is one of the most commonly used and basic econometric tools, not only for the analysis of macroeconomic relationships but also for the study of financial market data. Typical examples for the latter case are estimation of the ex-post version of the capital asset pricing model (CAPM) and the two-stage modelling approach of Fama and MacBeth (1973). Because of the prevalence of heavy-tailed distributions in financial time series, it is of interest to study how regression models perform when the data are heavy-tailed rather than normally distributed.

The first purpose of the present paper is to survey theoretical results of estimation and hypothesis testing in regression models with infinite-variance distributions, and the second is to establish that infinite variance of the regression variables has important consequences for the statistical properties of the coefficient of determination and tests of the hypothesis that this coefficient is equal to zero. Third, we revisit the Fama-MacBeth two-stage regression approach and demonstrate that infinite variance of the regression variables can affect decisively the interpretation of the empirical results.

[1] The normal distribution is the only member of the family of $\alpha$-stable distributions that has finite second (and higher-order) moments; all other members of this family have infinite variance.

The rest of our paper is structured as follows. In Section 2 we provide a brief summary of the properties of $\alpha$-stable distributions and of aspects of estimation, hypothesis testing, and model diagnostic checking in regression models with infinite-variance regressors and disturbance terms. Section 3 provides a detailed analysis of the asymptotic properties of the coefficient of determination in regression models with infinite-variance variables. In our empirical application, presented in Section 4, we revisit the data used in Fama and French (1992), and we show that the statistical and/or economic interpretation of their findings can be quite different under the maintained assumption of $\alpha$-stable distributions from an interpretation based on the assumption of normal distributions. Section 5 summarizes the paper and offers some concluding remarks.

2 Framework

2.1 $\alpha$-stable distributions

A random variable $X$ is said to have a stable distribution if, for any positive integer $n > 2$, there exist constants $a_n > 0$ and $b_n \in \mathbb{R}$ such that $X_1 + \cdots + X_n \overset{d}{=} a_n X + b_n$, where $X_1, \ldots, X_n$ are independent copies of $X$ and $\overset{d}{=}$ signifies equality in distribution. The coefficient $a_n$ above is necessarily of the form $a_n = n^{1/\alpha}$ for some $\alpha \in (0, 2]$ (see Feller, 1971, Section VI). The parameter $\alpha$ is called the index of stability of the distribution, and a random variable $X$ with index $\alpha$ is called $\alpha$-stable. An $\alpha$-stable distribution is described by four parameters and will be denoted by $S(\alpha, \beta, \gamma, \delta)$.
Closed-form expressions for the probability density functions of $\alpha$-stable distributions are known to exist only for three special cases.[2] However, closed-form expressions for the characteristic functions of $\alpha$-stable distributions are readily available. One parameterization of the logarithm of the characteristic function of $S(\alpha, \beta, \gamma, \delta)$ is

$$\ln E\left(e^{i\tau X}\right) = i\delta\tau - \gamma^{\alpha}|\tau|^{\alpha}\left(1 + i\beta\,\mathrm{sign}(\tau)\,\omega(\tau, \alpha)\right), \tag{1}$$

where $\mathrm{sign}(\tau) = -1$ for $\tau < 0$, $\mathrm{sign}(\tau) = 0$ for $\tau = 0$, and $\mathrm{sign}(\tau) = +1$ for $\tau > 0$; and $\omega(\tau, \alpha) = -\tan(\pi\alpha/2)$ for $\alpha \neq 1$ and $\omega(\tau, \alpha) = (2/\pi)\ln|\tau|$ for $\alpha = 1$.

The tail shape of an $\alpha$-stable distribution is determined by its index of stability $\alpha \in (0, 2]$. Skewness is governed by $\beta \in [-1, 1]$; the distribution is symmetric about $\delta$ if and only if $\beta = 0$. The scale and location parameters of $\alpha$-stable distributions are denoted by $\gamma > 0$ and $\delta \in \mathbb{R}$, respectively. When $\alpha = 2$, the log characteristic function given by equation (1) reduces to $i\delta\tau - \gamma^{2}\tau^{2}$, which is that of a Gaussian random variable with mean $\delta$ and variance $2\gamma^{2}$.

[2] The three special cases are: (i) the Gaussian distribution $S(2, 0, \gamma, \delta) \equiv N(\delta, 2\gamma^{2})$, (ii) the symmetric Cauchy distribution $S(1, 0, \gamma, \delta)$, and (iii) the Lévy distribution $S(0.5, \pm 1, \gamma, \delta)$; see Zolotarev (1986), Section 2, and Rachev et al. (2005), Section 7.

For $\alpha < 2$ and $|\beta| < 1$, the tail properties of an $\alpha$-stable random variable $X$ satisfy

$$\lim_{x \to \infty} x^{\alpha}\, P(X > x) = C(\alpha)\,\gamma^{\alpha}(1 + \beta)/2 \quad \text{and} \tag{2}$$

$$\lim_{x \to \infty} x^{\alpha}\, P(X < -x) = C(\alpha)\,\gamma^{\alpha}(1 - \beta)/2, \tag{3}$$

i.e., both tails of the probability density function (pdf) of $X$ are asymptotically Paretian. For $\alpha < 2$ and $\beta = +1$ ($-1$), the distribution is maximally right-skewed (left-skewed) and only the right (left) tail is asymptotically Paretian.[3] The term $C(\alpha)$ in equations (2) and (3) is given by

$$C(\alpha) = \frac{1 - \alpha}{\Gamma(2 - \alpha)\cos(\pi\alpha/2)} \quad \text{for } \alpha \neq 1 \tag{4}$$

and $C(1) = 2/\pi$ for $\alpha = 1$; see, e.g., Samorodnitsky and Taqqu (1994), p. 17. The function $C(\alpha)$, which is shown in Figure 1, is continuous and strictly decreasing in $\alpha \in (0, 2)$, with $\lim_{\alpha \downarrow 0} C(\alpha) = 1$ and $\lim_{\alpha \uparrow 2} C(\alpha) = 0$.[4] In consequence, even though all stable distributions with $\alpha < 2$ have asymptotically Paretian tails, as $\alpha \uparrow 2$ proportionately less and less of the distribution's probability mass is located in the tail region. In addition, the density's tails decline at an increasingly rapid rate as $\alpha \uparrow 2$, thereby limiting the likelihood of observing very large draws conditional on the draw coming from the tail region. These observations explain why potentially very large sample sizes are required if one desires to estimate the index of stability with adequate precision if $\alpha$ is close to but smaller than 2.
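The parameterization in equation (1) and the tail constant in equation (4) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `log_char_fn` and `tail_constant` are illustrative and do not come from the paper.

```python
import math


def log_char_fn(tau, alpha, beta, gamma, delta):
    """Log characteristic function of S(alpha, beta, gamma, delta), equation (1)."""
    if tau == 0.0:
        return 0j  # ln E exp(0) = 0
    sign = 1.0 if tau > 0 else -1.0
    if alpha == 1.0:
        omega = (2.0 / math.pi) * math.log(abs(tau))
    else:
        omega = -math.tan(math.pi * alpha / 2.0)
    scale_term = (gamma * abs(tau)) ** alpha  # gamma^alpha * |tau|^alpha
    return 1j * delta * tau - scale_term * (1.0 + 1j * beta * sign * omega)


def tail_constant(alpha):
    """Tail constant C(alpha) of equation (4) on (0, 2), with C(1) = 2/pi."""
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must lie in (0, 2)")
    if abs(alpha - 1.0) < 1e-12:
        return 2.0 / math.pi  # limit value at alpha = 1
    return (1.0 - alpha) / (math.gamma(2.0 - alpha) * math.cos(math.pi * alpha / 2.0))
```

Evaluating `log_char_fn` at `alpha = 2` reproduces the Gaussian form $i\delta\tau - \gamma^{2}\tau^{2}$ up to floating-point error, and evaluating `tail_constant` on a grid illustrates the monotone decline from 1 toward 0 described in the text.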
[Figure 1 somewhere here]

Because $E|X|^{\xi} = \lim_{b \to \infty} \int_{0}^{b} P(|X|^{\xi} > x)\,dx$, it follows that $E|X|^{\xi} < \infty$ for $\xi \in (0, \alpha)$ and $E|X|^{\xi} = \infty$ for $\xi \ge \alpha$ if $X$ is $\alpha$-stable with $\alpha \in (0, 2)$.[5] Only moments of order up to but not including $\alpha$ are finite if $\alpha < 2$, and a non-Gaussian stable distribution's index of stability is also equal to its maximal moment exponent.[6] In particular, if $\alpha \in (1, 2)$, the variance is infinite but the mean exists. For $\alpha > 1$, it follows that $E(X) = \delta$; in addition, for $\beta = 0$, $\delta$ is equal to the distribution's mode and median irrespective of the value of $\alpha$, justifying the use of the term "central location parameter" for $\delta$ in the finite-mean or symmetric cases. In addition, for $\alpha \neq 1$, one can show that $S(\alpha, \beta, \gamma, \delta) \overset{d}{=} \gamma \cdot S(\alpha, \beta, 1, \delta/\gamma)$.[7] We make use of this property below in the derivations of Theorem 1 and Remark 3.

[3] For $\alpha < 1$ and $\beta = +1$, $P(X < \delta) = 0$, i.e., the distribution's support is bounded below by $\delta$. Zolotarev (1986, Theorem 2.5.3) and Samorodnitsky and Taqqu (1994, pp. 17-18) provide expressions for the rate of decline of the non-Paretian tail if $\beta = \pm 1$ and $\alpha \ge 1$.

[4] The function $C(\alpha)$ is smooth on the entire interval $(0, 2)$. The numerator and the second term in the denominator of equation (4) both converge to 0 as $\alpha \to 1$; $C(1) = 2/\pi$ therefore follows from an application of L'Hôpital's Rule.

[5] Ibragimov and Linnik (1971, Theorem 2.6.4) show that this result holds not only for $\alpha$-stable distributions, but that it pertains to all distributions that are in the domain of attraction of an $\alpha$-stable distribution. Ibragimov and Linnik (1971, Theorem 2.6.1) provide necessary and sufficient conditions for a probability distribution to lie in the domain of attraction of an $\alpha$-stable law.

The class of $\alpha$-stable distributions is an interesting distributional candidate for disturbances in regression models because (i) it is able to capture the relative frequencies of extreme vs. ordinary observations in the economic variables, (ii) it has the convenient statistical property of closure under convolution, and (iii) only $\alpha$-stable distributions can serve as limiting distributions of sums of independent and identically distributed (iid) random variables, as proven in Zolotarev (1986). The latter two properties are appealing for regression analysis, given that disturbances can be viewed as random variables which represent the sum of all external effects not captured by the regressors. For more details on the properties of $\alpha$-stable distributions, we refer to Gnedenko and Kolmogorov (1954), Feller (1971), Zolotarev (1986), and Samorodnitsky and Taqqu (1994). The role of the $\alpha$-stable distribution in financial market and econometric modelling is surveyed in McCulloch (1996) and Rachev et al. (1999).

2.2 Regression models with infinite-variance variables

Let $X$ and $Y$ be two jointly symmetric $\alpha$-stable (henceforth, S$\alpha$S) random variables with $\alpha > 1$, i.e., we require $X$ and $Y$ to have finite means. Our main reason for concentrating on the case $\alpha > 1$ lies in its empirical relevance.
Estimated maximal moment exponents for most empirical financial data, such as exchange rates and stock prices, are generally greater than 1.5; see, for example, de Vries (1991) and Loretan and Phillips (1994). An econometric (purposeful) reason for studying the case $\alpha > 1$ is that, for $\alpha$-stable distributions with $\alpha > 1$, regression analysis that is based on sample second moments, such as least squares, is still asymptotically consistent for the regression coefficients, even though the limit distributions of these regression coefficients are nonstandard.[8]

Suppose that the regression of a random variable $Y$ on a random variable $X$ is linear, i.e., there exists a constant $\theta$ such that

$$E(Y \mid X) = \theta X \quad \text{a.s.}, \tag{5}$$

with

$$\theta = \frac{[Y, X]_{\alpha}}{\gamma_x^{\alpha}},$$

where $\gamma_x$ is the scale parameter of the S$\alpha$S random variable $X$ and $[\cdot, \cdot]_{\alpha}$ in the numerator is the covariation (covariance in the Gaussian case), which can be calculated as $E\left(XY^{\langle \xi - 1 \rangle}\right) / E\left(|Y|^{\xi}\right)$ for all $\xi \in (1, \alpha)$, with $a^{\langle \xi \rangle} \equiv |a|^{\xi}\,\mathrm{sign}(a)$.

[6] The maximal moment exponent of a distribution is either a finite positive number, or it is infinite if a distribution has finite moments of all orders. For a Student-t distribution, the degrees of freedom parameter is equal to its maximal moment exponent.

[7] This result also holds for the case $\alpha = 1$ and $\beta = 0$.

[8] Another reason for this restriction comes from the viewpoint of statistical modelling. The conditional expectation of the bivariate symmetric stable distribution in (5) is, as in the Gaussian case, linear in $X$ only if $\alpha \in (1, 2)$. The regression function is in general nonlinear, or rather only asymptotically linear, under other conditions. For more on bivariate linearity, see Samorodnitsky and Taqqu (1994, Sections 4 and 5).

For estimation and diagnostics, the relation (5) can be written as a regression model with a constant term,

$$y_t = c + \theta x_t + u_t, \tag{6}$$

where the maintained hypothesis is that $u_t$ is iid S$\alpha$S, with $\alpha \in (1, 2]$. The econometric issues of interest are to estimate $\theta$ properly, to test the hypothesis of significance for the estimated parameter, usually based on the t-statistic, as well as to compute model diagnostics, such as the coefficient of determination, the Durbin-Watson statistic, and the F-test of parameter constancy across subsamples.

The effects of infinite variance in the regressor and disturbance term can be substantial. If the variables share the same index of stability $\alpha$, the ordinary least squares (OLS) estimate of $\theta$ is still consistent, but its asymptotic distribution is $\alpha$-stable with the same $\alpha$ as the underlying variables. Furthermore, the convergence rate to the true parameter is $T^{(\alpha - 1)/\alpha}$, smaller than the rate $T^{1/2}$ which applies in the finite-variance case. If $\alpha < 2$, OLS loses its best linear unbiased estimator (BLUE) property, i.e., it is no longer the minimum-dispersion estimator in the class of linear estimators of $\theta$. In addition, the asymptotic efficiency of the OLS estimator converges to zero as the index of stability $\alpha$ declines to 1. Blattberg and Sargent (1971) (henceforth, BS) derived the BLUE for $\theta$ in (6) if the value of $\alpha$ is known.
The BS estimator is given by

$$\hat{\theta}_{BS} = \frac{\sum_{t=1}^{T} x_t^{\langle 1/(\alpha - 1) \rangle}\, y_t}{\sum_{t=1}^{T} |x_t|^{\alpha/(\alpha - 1)}}, \qquad 1 < \alpha \le 2, \tag{7}$$

which coincides with the OLS estimator if $\alpha = 2$. Kim and Rachev (1999) prove that the asymptotic distribution of the BS estimator is also $\alpha$-stable. Samorodnitsky et al. (2007) consider an optimal power estimate based on the BS estimator for unknown $\alpha$, and they also provide an optimal linear estimator of the regression coefficients for various configurations of the indices of stability of $x_t$ and $u_t$. Other efficient estimators of the regression coefficients have been studied as well; Kanter and Steiger (1974) propose an unbiased $L_1$-estimator, which excludes very large shocks in its estimation to avoid excess sensitivity due to outliers. Using a weighting function, McCulloch (1998) considers a maximum-likelihood estimator which is based on an approximation to a symmetric stable density.
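Equation (7) translates directly into a few lines of code. The sketch below assumes NumPy and a regression without intercept (or data that have already been centered), since equation (7) contains no constant term; the names `signed_power` and `bs_estimator` are illustrative, not taken from Blattberg and Sargent (1971).

```python
import numpy as np


def signed_power(a, p):
    """Signed power a^<p> = |a|^p * sign(a), applied elementwise."""
    a = np.asarray(a, dtype=float)
    return np.sign(a) * np.abs(a) ** p


def bs_estimator(x, y, alpha):
    """Blattberg-Sargent estimate of theta, equation (7), for known alpha in (1, 2]."""
    if not 1.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (1, 2]")
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    numerator = np.sum(signed_power(x, 1.0 / (alpha - 1.0)) * y)
    denominator = np.sum(np.abs(x) ** (alpha / (alpha - 1.0)))
    return numerator / denominator
```

With `alpha = 2.0` the exponents reduce to 1 and 2, so the function returns the no-intercept OLS slope, in line with the statement that the two estimators coincide in the Gaussian case. As $\alpha$ falls toward 1 the exponents grow quickly, which places ever more weight on the largest observations of $x_t$.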

Hypothesis testing is also affected considerably when the regressors and disturbance terms have infinite-variance stable distributions. For example, the t-statistic, commonly used to test the null hypothesis of parameter significance, no longer has a conventional Student-t distribution if $\alpha < 2$. Rather, as established by Logan et al. (1973), its pdf has modes at $-1$ and $+1$; for $\alpha < 1$ these modes are infinite. Kim (2003) provides empirical distributions of the t-statistic for finite degrees of freedom and various values of $\alpha$ by simulation. The usual applied goodness-of-fit test statistics, such as the likelihood ratio, Lagrange multiplier, and Wald statistics, also no longer have the conventional asymptotic $\chi^{2}$ distribution, but have a stable $\chi^{2}$ distribution, a term that was introduced by Mittnik et al. (1998).

In time series regressions with infinite-variance innovations, Phillips (1990) shows that the limit distributions of the augmented Dickey-Fuller tests for a unit root are functionals of Lévy processes, whereas they are functionals of Brownian motion processes in the finite-variance case. The F-test statistic for parameter constancy that is based on the residuals from a sample split test has an F-distribution in the conventional, finite-variance case. Kurz-Kim et al. (2005) obtain the limiting distribution of the F-test if the random variables have infinite variance. As shown by the authors, as well as by Runde (1993), the limiting distribution of the F-statistic for $\alpha < 2$ behaves completely differently from the Gaussian case: whereas in the latter case the statistic converges to 1 under the null as the degrees of freedom for both numerator and denominator of the statistic approach infinity, in the former case the statistic converges to a ratio of two independent, positive, and maximally right-skewed $\alpha/2$-stable distributions.
This result is used below to derive closed-form expressions for the pdf and cumulative distribution function (cdf) of the limiting distribution of the $R^{2}$ statistic if the regressor and disturbance term share the same index of stability $\alpha < 2$.

Moreover, commonly used criteria for judging the validity of some of the maintained hypotheses of a regression model, such as the Durbin-Watson statistic and the Box-Pierce Q-statistic, would be inappropriate if one were to rely on conventional critical values. Phillips and Loretan (1991) study the properties of the Durbin-Watson statistic for regression residuals with infinite variance, and Runde (1997) examines the properties of the Box-Pierce Q-statistic for random variables with infinite variance. Loretan and Phillips (1994) and Phillips and Loretan (1994) establish that both the size of tests of covariance stationarity under the null and the rate of divergence of these tests under the alternative are strongly affected by failure of standard moment conditions; indeed, standard tests of covariance stationarity are inconsistent if population second moments do not exist.

3 Asymptotic properties of the coefficient of determination in models with $\alpha$-stable regressors and error terms

3.1 Basic results

For the general asymptotic theory of stochastic processes with stable random variables, we refer to Resnick (1986) and Davis and Resnick (1985a, 1985b, 1986). Our results in this section are, in large part, an application of their work to the regression diagnostic context. The maintained assumptions are:

1. The relationship between the dependent and independent variable conforms to the classical bivariate linear regression model,

$$y_t = c + \theta x_t + u_t, \qquad t = 1, \ldots, T. \tag{8}$$

2. $u_t$ is iid S$\alpha$S$(\alpha_u, 0, \gamma_u, 0)$, with $\alpha_u \in (1, 2)$.

3. $x_t$ is exogenous and is also iid S$\alpha$S$(\alpha_x, 0, \gamma_x, 0)$, with $\alpha_x \in (1, 2)$.

4. The regressor and the error term have the same index of stability, i.e., $\alpha_x = \alpha_u = \alpha$.

5. The coefficients $c$ and $\theta$ are consistently estimated by $\hat{c}$ and $\hat{\theta}$.[9]

The fourth assumption, that the regressor and the error term have the same index of stability, is rather strong, and its validity may be difficult to ascertain in empirical applications. In Corollary 2 below, we examine the consequences of having unequal values for the indices of stability for $x_t$ and $u_t$ for the asymptotic properties of the coefficient of determination.
The coefficient of determination measures the proportion of the total squared variation in the dependent variable that is explained by the regression:

$$R^{2} = \frac{\text{Explained Sum of Squares}}{\text{Total Sum of Squares}} = \frac{\sum_{t=1}^{T} (\hat{y}_t - \bar{y})^{2}}{\sum_{t=1}^{T} (y_t - \bar{y})^{2}}.$$

Because $\hat{y}_t - \bar{y} = \hat{\theta}(x_t - \bar{x})$ and $y_t - \bar{y} = \hat{\theta}(x_t - \bar{x}) + \hat{u}_t$, where $\bar{y}$ and $\bar{x}$ are the respective sample averages of $y_t$ and $x_t$, and because $\sum_{t=1}^{T} (x_t - \bar{x})\hat{u}_t = 0$ by construction, the coefficient of determination may be written as

$$R^{2} = \frac{\hat{\theta}^{2} \sum_{t=1}^{T} (x_t - \bar{x})^{2}}{\hat{\theta}^{2} \sum_{t=1}^{T} (x_t - \bar{x})^{2} + \sum_{t=1}^{T} \hat{u}_t^{2}}. \tag{9}$$

[9] If $\alpha_x = \alpha_u$, OLS is known to generate consistent estimates of $c$ and $\theta$. See Samorodnitsky et al. (2007) for an overview and discussion of estimation methods that are consistent for various combinations of $\alpha_u$ and $\alpha_x$.
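The equivalence between the definition of $R^{2}$ and equation (9) is purely algebraic and holds for OLS with a constant term in any sample. The sketch below, which assumes NumPy and uses the illustrative name `r_squared_two_ways`, computes both expressions so that the identity can be confirmed on arbitrary data, including heavy-tailed draws.

```python
import numpy as np


def r_squared_two_ways(x, y):
    """Return R^2 from its definition (ESS/TSS) and from equation (9), using OLS."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_centered = x - x.mean()
    # OLS slope and intercept for y_t = c + theta * x_t + u_t.
    theta_hat = np.sum(x_centered * (y - y.mean())) / np.sum(x_centered ** 2)
    c_hat = y.mean() - theta_hat * x.mean()
    fitted = c_hat + theta_hat * x
    residuals = y - fitted
    # Definition: explained sum of squares over total sum of squares.
    r2_definition = np.sum((fitted - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)
    # Equation (9): signal term over signal term plus residual sum of squares.
    signal = theta_hat ** 2 * np.sum(x_centered ** 2)
    r2_equation_9 = signal / (signal + np.sum(residuals ** 2))
    return r2_definition, r2_equation_9
```

The two returned values agree up to floating-point error because the OLS residuals are orthogonal to the centered regressor, which is exactly the step used in the derivation above.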

Since $x_t^{2}$ and $u_t^{2}$ are in the normal domain of attraction of a stable distribution with index of stability $\alpha/2$, norming by $T^{-2/\alpha}$ rather than by $T^{-1}$ is required to obtain non-degenerate limits for the sums of the squared variables. Because $\hat{\theta} \overset{p}{\to} \theta$ by the assumption of consistent estimation, an application of the law of large numbers to $\bar{x}$, the continuous mapping theorem, and the results of Davis and Resnick (1985b) yield the following expression for the joint limiting distribution of the elements in equation (9):

$$\begin{aligned}
\Big(T^{-2/\alpha}\gamma_u^{-2}\sum_{t=1}^{T}\hat{u}_t^{2},\ \hat{\theta}^{2}\,T^{-2/\alpha}\gamma_x^{-2}\sum_{t=1}^{T}(x_t - \bar{x})^{2}\Big)
&\sim \Big(T^{-2/\alpha}\gamma_u^{-2}\sum_{t=1}^{T}u_t^{2},\ \theta^{2}\,T^{-2/\alpha}\gamma_x^{-2}\sum_{t=1}^{T}x_t^{2}\Big) \\
&= \Big(T^{-2/\alpha}\sum_{t=1}^{T}(u_t/\gamma_u)^{2},\ \theta^{2}\,T^{-2/\alpha}\sum_{t=1}^{T}(x_t/\gamma_x)^{2}\Big) \\
&\overset{d}{\to} \big(S_u,\ \theta^{2} S_x\big).
\end{aligned} \tag{10}$$

For $\alpha < 2$, the random variables $S_u$ and $S_x$ are independent, maximally right-skewed, and positive stable random variables with index of stability $\alpha/2 < 1$, $\beta = +1$, $\gamma = 1$,[10] $\delta = 0$, and log characteristic function

$$\ln E\left(e^{i\tau S_x}\right) = \ln E\left(e^{i\tau S_u}\right) = -|\tau|^{\alpha/2}\left(1 - i\,\mathrm{sign}(\tau)\tan(\pi\alpha/4)\right). \tag{11}$$

We therefore conclude that, under the five maintained assumptions of this section, the $R^{2}$ statistic of the regression model (8) has the following asymptotic distribution.

Theorem 1 Under the maintained assumptions of the regression model in equation (8), the coefficient of determination is distributed asymptotically as

$$R^{2} \overset{d}{\to} \frac{\theta^{2}\gamma_x^{2} S_x}{\theta^{2}\gamma_x^{2} S_x + \gamma_u^{2} S_u} = \frac{\eta S_x}{\eta S_x + S_u} = \frac{\eta Z}{\eta Z + 1} = \tilde{R}(\alpha, \eta), \text{ say}, \tag{12}$$

where $\eta = (\theta\gamma_x/\gamma_u)^{2} \ge 0$[11] and $Z = S_x/S_u$. For $\alpha < 2$, $S_x$ and $S_u$ are independent and are identically distributed with log characteristic functions given by equation (11).

Thus, for $\alpha < 2$ and $\eta > 0$, the coefficient of determination does not converge to a constant but has a nondegenerate asymptotic distribution on the interval $[0, 1]$. This contrasts starkly with the standard, finite-variance result, which is stated here for completeness.

[10] To prove that $\gamma = 1$, see equation (13.3.14) on p. 529 of Brockwell and Davis (1991). In that equation, put $C = C(\alpha/2)$, where $C(\cdot)$ is given by equation (4), and employ the recursive relationship $\Gamma(2 - \alpha/2) = (1 - \alpha/2)\,\Gamma(1 - \alpha/2)$.

[11] Observe that $\eta = 0$ if and only if $\theta = 0$, as the dispersion parameters $\gamma_x$ and $\gamma_u$ are necessarily positive.
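A small Monte Carlo experiment can make the contrast between the infinite-variance and finite-variance cases concrete. The sketch below is an illustration, not the simulation design used by the authors: it assumes NumPy, draws symmetric stable variates with the Chambers-Mallows-Stuck method (a standard sampler that is swapped in here), and uses the hypothetical names `sym_stable` and `simulate_r2`. In a bivariate regression with a constant, $R^{2}$ equals the squared sample correlation, which is what the code computes.

```python
import numpy as np


def sym_stable(alpha, size, rng):
    """Chambers-Mallows-Stuck draws from S(alpha, 0, 1, 0), alpha in (0, 2]."""
    v = rng.uniform(-np.pi / 2.0, np.pi / 2.0, size)
    w = rng.exponential(1.0, size)
    if alpha == 1.0:
        return np.tan(v)  # symmetric Cauchy case
    return (np.sin(alpha * v) / np.cos(v) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))


def simulate_r2(alpha, theta=1.0, T=2000, reps=200, seed=0):
    """Simulate R^2 for y = theta * x + u with x and u iid S(alpha, 0, 1, 0)."""
    rng = np.random.default_rng(seed)
    r2 = np.empty(reps)
    for i in range(reps):
        x = sym_stable(alpha, T, rng)
        u = sym_stable(alpha, T, rng)
        y = theta * x + u
        xc = x - x.mean()
        yc = y - y.mean()
        # Squared sample correlation, identical to equation (9) under OLS.
        r2[i] = np.sum(xc * yc) ** 2 / (np.sum(xc ** 2) * np.sum(yc ** 2))
    return r2
```

With `theta = 1` and unit scales, $\eta = 1$. For `alpha = 2.0` the simulated values cluster tightly around $\eta/(\eta + 1) = 0.5$, whereas for `alpha = 1.5` they remain widely dispersed over $[0, 1]$ even for large `T`, which is the behavior Theorem 1 describes.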

Corollary 1 If(cid:11)=2,andhenceifx andu have(cid:133)nitevariance,thelimitvariablesS andS inTheorem1 t t x u are non-random constants and are, in fact, equal to 2.12 In the (cid:133)nite-variance case, then, the limit of R2 as T is given by !1 (cid:18)2(cid:27)2 (cid:17) R2 x = ; p ! (cid:18)2(cid:27)2 +(cid:27)2 (cid:17)+1 x u where now (cid:17) =((cid:18)(cid:27) =(cid:27) )2. x u In the (cid:133)nite-variance case, the model(cid:146)s asymptotic signal-to-noise ratio, (cid:17) =((cid:18)(cid:27) =(cid:27) )2, is constant, as is x u therefore the limit of the coe¢ cient of determination. In contrast, in the in(cid:133)nite-variance case the model(cid:146)s limitingsignal-to-noiseratioisgivenby(cid:17)Z,where(cid:17) =((cid:18)(cid:13) =(cid:13) )2 andZ =S =S ,andisthereforearandom x u x u variable even asymptotically; it is this feature that causes the randomness of R((cid:11);(cid:17)). We postpone a fuller discussion of the intuition that underlies this result to the end of this section, after we provide a detailed e analysis of the statistical properties of R. Beforedoingso,however,wenotethatthefourthmaintainedassumption,i.e.,thattheindicesofstability e of the regressor and error term in (8) be the same, is crucial for obtaining the result that the asymptotic distribution of R is nondegenerate. Indeed, if the two indices of stability di⁄er, the asymptotic properties of the R2 statistic are as follows. e Corollary 2 SupposethatthemaintainedassumptionsofTheorem1applyexceptthat(cid:11) =(cid:11) ,i.e.,suppose x u 6 that the indices of stability of the regressor and error term are unequal. Let (cid:18) =0 to rule out the trivial case 6 from further consideration. Then, if (cid:11) x <(cid:11) u , 1 R2 =o p T2=(cid:11)u(cid:0) 2=(cid:11)x ; and (cid:15) (cid:0) (cid:0) (cid:1) if (cid:11) u <(cid:11) x , R2 =o p T2=(cid:11)x(cid:0) 2=(cid:11)u . 
Thus, $R^2$ converges to 1 in probability if $\alpha_x < \alpha_u$, and it converges to 0 in probability if $\alpha_u < \alpha_x$.

Proof. These results follow immediately from the fact that if $\alpha_x \neq \alpha_u$, different norming factors, viz., $T^{2/\alpha_x}$ and $T^{2/\alpha_u}$, are needed in equation (10) to achieve joint convergence of the terms $\hat\theta^2 \sum_{t=1}^{T}(x_t - \bar{x})^2$ and $\sum_{t=1}^{T}\hat{u}_t^2$ to the limiting random variables $S_x$ and $S_u$. Whenever the two norming factors differ, the larger of the two factors dominates the ratio that defines $R^2$ as $T \to \infty$, and this statistic must therefore converge either to 0 or 1 in probability. Suppose first that $\alpha_x < \alpha_u$; since $T^{2/\alpha_x} > T^{2/\alpha_u}$, we find

$$T^{-2/\alpha_x}\sum \hat{u}_t^2 = T^{-2/\alpha_u}\, T^{2/\alpha_u - 2/\alpha_x} \sum \hat{u}_t^2 = o_p\left(T^{2/\alpha_u - 2/\alpha_x}\right).$$

Therefore,

$$R^2 = \frac{\hat\theta^2\, T^{-2/\alpha_x}\sum (x_t - \bar{x})^2}{\hat\theta^2\, T^{-2/\alpha_x}\sum (x_t - \bar{x})^2 + T^{-2/\alpha_x}\sum \hat{u}_t^2} \overset{d}{\to} \frac{\theta^2\gamma_x^2 S_x}{\theta^2\gamma_x^2 S_x + o_p\left(T^{2/\alpha_u - 2/\alpha_x}\right)} \overset{p}{\to} 1.$$

Similarly, if $\alpha_u < \alpha_x$, $T^{-2/\alpha_u}\sum (x_t - \bar{x})^2 = o_p\left(T^{2/\alpha_x - 2/\alpha_u}\right)$, and $R^2 \overset{p}{\to} 0$.

[12] Recall that in the finite-variance case, $\gamma^2 = \sigma^2/2$; therefore, norming by $T^{-1}\gamma_x^{-2}$ and $T^{-1}\gamma_u^{-2}$ in equation (10) produces a constant of 2.

Heuristically, if $\alpha_x \neq \alpha_u$ and $\theta \neq 0$, the limiting distribution of the $R^2$ statistic is degenerate at 0 or 1 because the model's asymptotic signal-to-noise ratio is either zero (if $\alpha_u < \alpha_x$) or infinite (if $\alpha_x < \alpha_u$). From an examination of the proof of this corollary, we can also deduce that if $\alpha_x \neq \alpha_u$, the fifth maintained assumption (that the regression coefficients are estimated consistently) could be relaxed, to require merely that an estimation method be employed that guarantees $\hat\theta \neq o_p(1)$; the result that $R^2$ converges either to 0 or 1 would continue to hold in this case.

3.2 Qualitative properties of $\widetilde{R}$

Returning to the main case of $\alpha_x = \alpha_u = \alpha$, we note that the random variable $\widetilde{R}$ is defined for all values of $\alpha \in (0,2)$, even though in a regression context one would typically assume that $\alpha \in (1,2)$. We now establish some important qualitative properties of $\widetilde{R}$.

Remark 1 For $\eta > 0$, the median of $\widetilde{R}$, $m$, equals $\eta/(\eta+1)$.

Proof.
For $\eta > 0$, observe that

$$P\left(\widetilde{R} \le \frac{\eta}{\eta+1}\right) = P\left(\frac{\eta S_x}{\eta S_x + S_u} \le \frac{\eta}{\eta+1}\right) = P\left(S_x \le \frac{1}{\eta+1}(\eta S_x + S_u)\right) = P\left((\eta+1)S_x - \eta S_x \le S_u\right) = P(S_x \le S_u).$$

Because $S_x$ and $S_u$ are iid and have continuous cdfs, $P(S_x \le S_u) = 0.5$ by an application of Fubini's Theorem.[13]

[13] See, e.g., Resnick (1999, p. 155).

Thus, $m$ is equal to the non-random limit of $R^2$ in the finite-variance case. Since $S_x$ and $S_u$ are positive a.s., we also have $P(S_x/S_u \le 1) \equiv P(Z \le 1) = 0.5$, i.e., the median of $Z$ is equal to 1, regardless of the value of $\alpha$. As we will demonstrate rigorously later in this paper, the probability mass of $Z$ is highly concentrated around 1 for values of $\alpha$ close to 2. Conversely, for small values of $\alpha$, $Z$ is unlikely to be close to 1; instead, it is very likely that one will obtain a draw of $Z$ that is either very small, i.e., close to 0, or very large. A small or large draw of $Z$ has a crucial effect on the model's signal-to-noise ratio, $\eta Z$, and therefore also on $R^2$. This suggests that an informal measure of the effect of infinite variance in the regression variables on the value of $R^2$ in a given sample may be based on the difference between the model's coefficient of determination and a consistent estimate of its median $m$, say $\hat{m} = \hat\eta/(\hat\eta+1)$, where $\hat\eta = (\hat\theta\hat\gamma_x/\hat\gamma_u)^2$. The larger the difference between $R^2$ and $\hat{m}$, the more important the effect is of having obtained a small (or large) value of $Z$.

The following remark shows that a finite-variance property of $R^2(\eta)$ for $\eta > 0$, viz., $R^2(1/\eta) = 1 - R^2(\eta)$, carries over in a natural way to $\widetilde{R}$.

Remark 2 For $\eta > 0$, the distribution of $\widetilde{R}(\alpha,\eta)$ is skew-symmetric, viz.,

$$\widetilde{R}(\alpha,\eta) \overset{d}{=} 1 - \widetilde{R}(\alpha,1/\eta),$$

or, equivalently, $\widetilde{R}(\alpha,m) \overset{d}{=} 1 - \widetilde{R}(\alpha,1-m)$. The pdf of $\widetilde{R}$ therefore satisfies

$$f_{\widetilde{R}(\alpha,m)}(r) = f_{\widetilde{R}(\alpha,1-m)}(1-r) \quad \forall\, r \in [0,1].$$

The distribution of $\widetilde{R}$ is symmetric about 0.5 for $\eta = 1$.

Proof. Recall that $S_x$ and $S_u$ are iid.
Thus, for $\eta > 0$,

$$1 - \widetilde{R}(\alpha,1/\eta) = 1 - \frac{(1/\eta)S_x}{(1/\eta)S_x + S_u} = \frac{S_u}{(1/\eta)S_x + S_u} = \frac{\eta S_u}{\eta S_u + S_x} \overset{d}{=} \frac{\eta S_x}{\eta S_x + S_u} = \widetilde{R}(\alpha,\eta).$$

The symmetry of $\widetilde{R}$ about 0.5 for $\eta = 1$ follows immediately from this result and the fact that the distribution's support is the interval $[0,1]$.

Next, as the following remark shows, the pdf of $\widetilde{R}$ has infinite modes at 0 and 1, i.e., at the endpoints of its support.

Remark 3 (i) For $\eta > 0$, the pdf of $\widetilde{R}$ is unbounded at 0 and 1, i.e., $f_{\widetilde{R}}(0) = f_{\widetilde{R}}(1) = \infty$. (ii) The cdf of $\widetilde{R}$ is continuous on $[0,1]$, and the distribution does not have atoms at 0 and 1.

Proof. To demonstrate the validity of the first part of this remark, we apply a standard result for the pdf of the ratio of two random variables,[14] adapted to the present case where the random variables in the numerator and denominator are both strictly positive. For $\eta > 0$, set $V = \eta S_x$ and $W = \eta S_x + S_u$. We have

$$f_{\widetilde{R}}(r) = \int_0^\infty w\, f_{V,W}(rw, w)\, dw, \quad 0 \le r \le 1,$$

where the joint pdf $f_{V,W}(\cdot,\cdot)$ is nonzero on $\mathbb{R}_+ \times \mathbb{R}_+$. The case $r = 1$ can occur only if $S_u = 0$; if $S_u = 0$, however, the random variables $V$ and $W$ are perfectly dependent, their joint pdf is nonzero only on the positive 45°-halfline, and the joint pdf $f_{V,V}(w,w)$ reduces to $(1/\sqrt{2})\, f_V(w)$, $w \ge 0$. Hence, for $r = 1$ we find

$$f_{\widetilde{R}}(1) = \int_0^\infty w\, f_{V,V}(1 \cdot w, w)\, dw = \frac{1}{\sqrt{2}} \int_0^\infty w\, f_V(w)\, dw = \frac{1}{\sqrt{2}}\, E(\eta S_x) = \infty.$$

By Remark 2, we have $f_{\widetilde{R}}(0) = \infty$ as well.

The continuity of the cdf of $\widetilde{R}$ on $[0,1]$ for $\eta > 0$ follows from the continuity of the cdfs of $S_x$ and $S_u$ on $\mathbb{R}_+$ and the fact that their pdfs are equal to zero at the origin. For example, one finds that $P(\widetilde{R} = 1) = P(S_u = 0) = 0$; the result $P(\widetilde{R} = 0) = 0$ then follows from Remark 2.

The fact that the probability density function of $\widetilde{R}$ has infinite singularities may seem unusual. However, the presence of singularities is a regular feature of pdfs that are based on ratios of stable random variables. For example, Logan et al. (1973) and Phillips and Hajivassiliou (1987) showed that if $\alpha < 1$, the density of the t-statistic has infinite modes at $-1$ and $+1$; similarly, Phillips and Loretan (1991) demonstrated that if $\alpha < 2$, this feature is also present in the asymptotic distributions of the von Neumann ratio and the normalized Durbin-Watson test statistic.
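The three remarks of this subsection can be illustrated with a small simulation. The sketch below is not part of the paper's analysis: it covers only the case $\alpha = 1$ and relies on the assumption, a standard fact about the Lévy distribution, that a positive stable variable with index 1/2 can be drawn as the reciprocal of a squared standard normal. The function names (`draw_r_tilde`, `simulate`, `tail_share`) are hypothetical.

```python
import random
import statistics


def draw_r_tilde(eta, rng):
    """Draw one value of R~ = eta*S_x / (eta*S_x + S_u) for alpha = 1.

    Assumption: for alpha = 1, S_x and S_u are Levy distributed, i.e. each
    equals 1 / n**2 with n standard normal.  Substituting S_x = 1/n_x**2 and
    S_u = 1/n_u**2 and clearing fractions gives the form used below, which
    avoids dividing by very small numbers.
    """
    n_x = rng.gauss(0.0, 1.0)
    n_u = rng.gauss(0.0, 1.0)
    return eta * n_u ** 2 / (eta * n_u ** 2 + n_x ** 2)


def simulate(eta, n_draws=20000, seed=0):
    """Return a list of independent draws of R~ for signal-to-noise eta."""
    rng = random.Random(seed)
    return [draw_r_tilde(eta, rng) for _ in range(n_draws)]


def tail_share(draws, width=0.05):
    """Fraction of draws lying within `width` of either endpoint of [0, 1]."""
    hits = sum(1 for r in draws if r < width or r > 1.0 - width)
    return hits / len(draws)


if __name__ == "__main__":
    eta = 3.0
    draws = simulate(eta)
    mirrored = [1.0 - r for r in simulate(1.0 / eta, seed=1)]
    print("sample median of R~(eta):      ", statistics.median(draws))
    print("eta / (eta + 1):               ", eta / (eta + 1.0))
    print("sample median of 1 - R~(1/eta):", statistics.median(mirrored))
    print("share near 0 or 1 (eta = 1):   ", tail_share(simulate(1.0, seed=2)))
```

In such a run the sample median should lie close to $\eta/(\eta+1)$ (Remark 1), the mirrored draws should have a similar median (Remark 2), and the share of draws near the endpoints should be visibly larger than the 10 percent a uniform variable would give, in line with Remark 3.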
3.3 The cdf and pdf of $\widetilde{R}$

The remarks in the preceding subsection provide important qualitative information about some of the distributional properties of $\widetilde{R}$. However, they do not address issues such as whether the distribution has modes beyond those at 0 and 1, whether the discontinuity of the pdf at the endpoints is simple or if $f_{\widetilde{R}}(r)$ diverges (and, if so, at which rate) as $r \downarrow 0$ or $r \uparrow 1$, or how much of the distribution's mass is concentrated near the endpoints of the support. To examine these issues, we provide expressions for the cdf and pdf of $\widetilde{R}$ in this subsection. It is possible to do so because $\widetilde{R}$ is a continuously differentiable and invertible function of the ratio of two independent, maximally right-skewed, and positive $\alpha$-stable random variables, and because closed-form expressions for the cdf and pdf of this ratio are known. The latter expressions are provided in the following proposition.

[14] See, e.g., Mood, Graybill, and Boes (1974), p. 187.

Proposition 1 (Zolotarev 1986, p. 205; Runde 1993, p. 11) Let $S_1$ and $S_2$ be two iid positive $\alpha$-stable random variables with common parameters $\alpha/2 \in (0,1)$, $\beta = +1$, $\gamma = 1$, and $\delta = 0$. Set $Z = S_1/S_2$. For $z \ge 0$, the cdf of $Z$ is given by

$$F_Z(z) = P(Z \le z) = \frac{1}{\pi\alpha/2} \arctan\left(\frac{z^{\alpha/2} + \cos(\pi\alpha/2)}{\sin(\pi\alpha/2)}\right) - \frac{1}{\alpha} + 1. \tag{13}$$

Differentiating this expression with respect to $z$, the pdf of $Z$ for $z > 0$ is obtained as

$$f_Z(z) = \frac{d}{dz} F_Z(z) = \frac{\sin(\pi\alpha/2)}{\pi z \left[z^{-\alpha/2} + z^{\alpha/2} + 2\cos(\pi\alpha/2)\right]}. \tag{14}$$

As $Z$ is a positive random variable, $F_Z(z) = f_Z(z) = 0$ for $z < 0$.

Figure 2 somewhere here

The cdf of the random variable $Z$ is shown in Figure 2 for various values of $\alpha$ between 1.98 and 0.25.[15] The random variable $Z$ has several interesting properties. First, note that $\lim_{z \downarrow 0} f_Z(z) = \infty$ and that the rate of divergence to infinity of $f_Z(z)$ as $z \downarrow 0$ is given by $(1/z)^{1-\alpha/2}$; thus, the pdf of $Z$ has a one-sided infinite singularity at 0. Second, as $z \to \infty$, $f_Z(z) \approx \kappa \cdot z^{-\alpha/2-1}$ for a suitable constant $\kappa > 0$. This result, along with $P(Z > 0) = 1$, implies that $Z$ lies in the normal domain of attraction of a positive stable distribution, say $Z_0$, with index of stability $\alpha/2$ and $\beta = +1$, the same parameters as those of the variables $S_1$ and $S_2$.[16] Hence, the mean of $Z$ is infinite for all values of $\alpha < 2$.
Third, in the special case of $\alpha = 1$, $S_1$ and $S_2$ are each distributed as a Lévy $\alpha$-stable random variable, which is well known to be equivalent to the inverse of a $\chi^2(1)$ random variable. For $\alpha = 1$, then, the pdf of $Z$ reduces to $\left(\pi z^{1/2}(1+z)\right)^{-1}$, which is also the pdf of an $F_{1,1}$ distribution; see Runde (1993).

[15] Runde (1993) graphs pdfs of $Z$ for values of $\alpha$ between 1.0 and 1.9.

[16] See Mittnik et al. (1998) for a discussion of some of the properties of the stable law $Z_0$.

As was noted earlier, the median of $Z$ is equal to 1 for all values of $\alpha \in (0,2)$. The regression model's signal-to-noise ratio is given by the random variable $\eta Z$ if $\alpha < 2$, whereas it is given by the constant $\eta$ in the standard, i.e., finite-variance case. The fact that the random variable which multiplies $\eta$ has a median of 1 helps to develop further the intuition that underlies the result of Remark 1, viz., that the median of $\widetilde{R}$, $\eta/(\eta+1)$, is the same in both the finite-variance and the infinite-variance cases. Finally, an inspection of equation (13) reveals that, for every fixed $\varepsilon > 0$, $\lim_{\alpha \uparrow 2} P(Z < 1 - \varepsilon) = 0$ and $\lim_{\alpha \uparrow 2} P(Z > 1 + \varepsilon) = 0$; put differently, $Z$ converges in probability to 1 as $\alpha \uparrow 2$. The probability mass of $Z$ therefore becomes perfectly concentrated at 1 as $\alpha \uparrow 2$, even though, of course, its mean remains infinite as long as $\alpha < 2$.

From Theorem 1, we have $\widetilde{R} = \eta Z/(\eta Z + 1) = g(Z)$, say. Note that $Z \equiv S_x/S_u$ satisfies the conditions of Proposition 1 and that the function $Z = g^{-1}(\widetilde{R}) = (1/\eta)\left(\widetilde{R}/(1-\widetilde{R})\right)$ is continuously differentiable and strictly increasing in the interior of its domain. We are therefore able to provide the following expressions for the cdf and pdf of $\widetilde{R}$ by an application of the density transformation theorem.[17]

Theorem 2 For $r \in (0,1)$ and $\eta > 0$, set $z = g^{-1}(r) = (1/\eta)\left(r/(1-r)\right)$, and let the cdf and pdf of $Z$ be given by equations (13) and (14). The cdf of $\widetilde{R}$ for $r \in (0,1)$ is given by

$$F_{\widetilde{R}}(r) = F_Z\left[g^{-1}(r)\right]. \tag{15}$$

Furthermore, $F_{\widetilde{R}}(0) = 0$ and $F_{\widetilde{R}}(1) = 1$. The pdf of $\widetilde{R}$ for $r \in (0,1)$ is given by

$$f_{\widetilde{R}}(r) = \left|\frac{d}{dr} g^{-1}(r)\right| f_Z\left[g^{-1}(r)\right] = \frac{1}{\eta(1-r)^2} \cdot \frac{\sin(\pi\alpha/2)}{\pi\, g^{-1}(r)\left([g^{-1}(r)]^{-\alpha/2} + [g^{-1}(r)]^{\alpha/2} + 2\cos(\pi\alpha/2)\right)} = \frac{\sin(\pi\alpha/2)}{\pi r(1-r)} \cdot \left(z^{-\alpha/2} + z^{\alpha/2} + 2\cos(\pi\alpha/2)\right)^{-1}, \tag{16}$$

where $z = r/\left(\eta(1-r)\right)$. As $r \downarrow 0$ or $r \uparrow 1$, $f_{\widetilde{R}}(r)$ diverges to infinity at a rate proportional to $(1/r)^{1-\alpha/2}$ and $\left(1/(1-r)\right)^{1-\alpha/2}$, respectively.

Proof. The results stated in equations (15) and (16) follow immediately from Proposition 1 and the density transformation theorem. Because $\lim_{r \downarrow 0} dg^{-1}(r)/dr = \eta^{-1}$, the rate of divergence of $f_{\widetilde{R}}(r)$ as $r \downarrow 0$ is equal to (apart from the multiplicative constant $\eta^{-1}$) that of $f_Z(z)$ as $z \downarrow 0$, which is $(1/z)^{1-\alpha/2}$. Finally, it follows from Remark 2 that as $r \uparrow 1$ the pdf of $\widetilde{R}$ also diverges to infinity at this rate.

The probability density functions and cumulative distribution functions of $\widetilde{R}(\alpha,\eta)$ for values of $\alpha$ between 0.25 and 1.98 are graphed in Figures 3 and 4. (In all cases, we have set $\eta = 1$.) The pdfs in Figure 3 are shown with a logarithmic scale on the ordinate. Since we know that $f_{\widetilde{R}}(0) = f_{\widetilde{R}}(1) = \infty$, we graph the functions only for $r \in \left(10^{-13}, 1 - 10^{-13}\right)$. The graphs show that

• If $\alpha$ is close to but less than 2, e.g., if $\alpha = 1.98$ or $\alpha = 1.90$, the pdf has an interior mode, and most of the probability mass of $\widetilde{R}$ is concentrated near its median. Conversely, only very little mass is located near 0 and 1, and the pdfs register only mild increases as $r$ approaches either edge of the distribution's support.

• For $\alpha = 1.75$ and $\alpha = 1.50$, the distribution of $\widetilde{R}$ continues to have an interior mode (as well as, of course, the two unbounded modes at 0 and 1). However, the distribution is noticeably less concentrated around the interior mode than if $\alpha$ is closer to 2.

• By $\alpha = 1.20$, the interior mode has disappeared and the distribution is nearly uniform over the entire interval $[0,1]$.

• If $\alpha$ takes on even smaller values, less and less of the probability mass of $\widetilde{R}$ is located near the median, and more and more of it is concentrated close to 0 and 1.

• If $\alpha = 0.25$, about 75 percent of the probability mass lies within 0.001 of the two endpoints of the distribution, while the probability of observing a realization of $\widetilde{R}$ in $[0.25, 0.75]$ is less than 5 percent.

[17] See, e.g., Mood, Graybill, and Boes (1974, p. 200).

Figures 3 and 4 somewhere here

A heuristic summary of these properties of $\widetilde{R}$ is straightforward. We begin by recalling that the multiplicative term $C(\alpha)$, shown in equation (4) and Figure 1, affects the probability of tail-region values of the random variables in question, and that the rate of decline in the tail areas of the density of $\alpha$-stable random variables increases as $\alpha \uparrow 2$.
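Equations (13) to (16) are simple enough to evaluate directly, which gives one way to tabulate how the probability mass of the limit variable shifts with $\alpha$. The following is a minimal sketch under the reading of those equations used here (cdf and pdf of $Z$ with index $\alpha/2$, then the change of variable $z = r/(\eta(1-r))$); the function names are hypothetical and the code is an illustration, not the authors' implementation.

```python
import math


def cdf_z(z, alpha):
    """Equation (13): cdf of Z = S_x / S_u, for 0 < alpha < 2."""
    if z <= 0.0:
        return 0.0
    a = alpha / 2.0
    s, c = math.sin(math.pi * a), math.cos(math.pi * a)
    return math.atan((z ** a + c) / s) / (math.pi * a) - 1.0 / alpha + 1.0


def pdf_z(z, alpha):
    """Equation (14): pdf of Z for z > 0 (zero otherwise)."""
    if z <= 0.0:
        return 0.0
    a = alpha / 2.0
    s, c = math.sin(math.pi * a), math.cos(math.pi * a)
    return s / (math.pi * z * (z ** (-a) + z ** a + 2.0 * c))


def cdf_r(r, alpha, eta=1.0):
    """Equation (15): cdf of the limit variable R~ on [0, 1]."""
    if r <= 0.0:
        return 0.0
    if r >= 1.0:
        return 1.0
    return cdf_z(r / (eta * (1.0 - r)), alpha)


def pdf_r(r, alpha, eta=1.0):
    """Equation (16): pdf of R~ for r strictly between 0 and 1."""
    a = alpha / 2.0
    z = r / (eta * (1.0 - r))
    bracket = z ** (-a) + z ** a + 2.0 * math.cos(math.pi * a)
    return math.sin(math.pi * a) / (math.pi * r * (1.0 - r) * bracket)


def central_mass(alpha, eta=1.0, lo=0.25, hi=0.75):
    """Probability that R~ falls in the interval [lo, hi]."""
    return cdf_r(hi, alpha, eta) - cdf_r(lo, alpha, eta)


if __name__ == "__main__":
    # Tabulate the mass in the middle half of [0, 1] for eta = 1.
    for alpha in (1.98, 1.75, 1.5, 1.2, 0.5):
        print(alpha, round(central_mass(alpha), 4))
```

A quick sanity check on any such implementation is that `cdf_r(eta / (eta + 1), alpha, eta)` returns 0.5 for every admissible $\alpha$, as Remark 1 requires, and that `pdf_r(r, alpha, eta)` equals `pdf_r(1 - r, alpha, 1 / eta)`, as Remark 2 requires.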
Suppose first that $\alpha$ is very close to 2; then, $C(\alpha)$ is close to 0, and the fraction of observations of $x_t$ and $u_t$ that fall into the respective Paretian-tail regions is therefore very low; moreover, given the fairly rapid decay of the density's tails for $\alpha$ close to 2, the likelihood of obtaining a very large draw, conditional on obtaining a draw from the Paretian tail area, is also low. As a result, the probability of observing large observations of $x_t$ and $u_t$ is quite low. This, in turn, makes it unlikely that one observes a very large draw of either $S_x$ or $S_u$ and thus a value of $Z$ that is either close to 0 or very large. Therefore, if $\alpha$ is very close to 2, $Z$ is likely close to its median of 1, and most of the mass of $\widetilde{R}$ is concentrated near its median, $\eta/(\eta+1)$. Next, as $\alpha$ moves down and away from 2, say to around 1.5, $C(\alpha)$ increases rapidly, leading to a higher frequency of observing tail-region draws for $x_t$ and $u_t$. In addition, as the density in the tail region declines more slowly for smaller values of $\alpha$, one is much more likely to obtain very large draws of the regressor and error term than if $\alpha$ is close to 2. In consequence, if $\alpha$ is around 1.5, it is quite likely that one obtains draws of $Z$ that are either very close to zero or very large, and thus more of the probability mass of $\widetilde{R}$ is located near the edges of its support. Conversely, the interior mode of $\widetilde{R}$ is considerably less pronounced than if $\alpha$ is close to 2. Finally, as $\alpha$ decreases further, $C(\alpha)$ rises further, and both the frequency of tail observations and the likelihood that any draws from the tail areas will be very large increase. Therefore, it is very likely that the largest few observations of $x_t$ or $u_t$ will dominate the realization of $Z$ and therefore the realization of $\widetilde{R}$. As a result, if $\alpha$ is small the central mode of $\widetilde{R}$ vanishes entirely and almost all of its probability mass is located very close to the endpoints of the distribution's support. In the limit, as $\alpha \downarrow 0$, $\widetilde{R}$ converges to a Bernoulli random variable, for which all of the probability mass is located at 0 and 1.

4 An empirical application

Fama and MacBeth (1973) proposed the so-called Fama-MacBeth regression to test the hypothesis of a linear relationship between risk and risk premium in stock returns in a cross-sectional setting. Let $r_{it}$ be the return on market portfolio $i$ at time $t$, where $i = 1,\ldots,N$ and $t = 1,\ldots,T$; denote the average return of portfolio $i$ as $\bar{r}_i = T^{-1}\sum_{t=1}^{T} r_{it}$; denote the average portfolio return at time $t$ as $R_t = N^{-1}\sum_{i=1}^{N} r_{it}$; and denote the average portfolio return across all time periods by $\mu_R = T^{-1}\sum_{t=1}^{T} R_t$.
The first-stage Fama-MacBeth regression is an ex post CAPM,

$$r_{it} = \theta_{0i} + \theta_i R_t + u_t, \quad t = 1,\ldots,T, \tag{17}$$

where $E(u_t) = 0$, $E(u_t R_t) = 0$, and $u_t$ is iid $S\alpha S$ with the same index $\alpha \in (1,2]$ as $r_{it}$. We may assume that the distribution of $\theta_i$ has a finite mean and variance, say, $E(\theta)$ and $\mathrm{Var}(\theta)$. Denote the OLS estimates of the regression coefficients in equation (17) by $\hat\theta_{0i}$ and $\hat\theta_i$. The second-stage Fama-MacBeth regression is given by

$$\bar{r}_i = \lambda_0 + \lambda_1 \hat\theta_i + \varepsilon_i, \quad i = 1,\ldots,N, \tag{18}$$

where $\varepsilon_i$ is iid $S\alpha S$ with the same index $\alpha$ as $r_{it}$, $E(\varepsilon_i) = 0$, and $E(\varepsilon_i \hat\theta_i) = 0$.
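The two stages in equations (17) and (18) can be sketched in a few lines of code. The example below is an illustration only: it uses plain bivariate OLS with an intercept, takes a hypothetical `returns[i][t]` array as input, and reports the second-stage coefficient of determination as explained variation over explained plus residual variation. None of the function names come from the paper.

```python
def ols(x, y):
    """Bivariate OLS with intercept; returns (intercept, slope, r_squared).

    r_squared is explained / (explained + residual) sum of squares.
    Assumes x is not constant and y is not perfectly constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    explained = slope ** 2 * sxx
    residual = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, explained / (explained + residual)


def fama_macbeth(returns):
    """Two-pass regression on returns[i][t] (portfolio i, period t)."""
    n_portfolios = len(returns)
    n_periods = len(returns[0])
    # R_t: cross-sectional average return in period t.
    market = [sum(row[t] for row in returns) / n_portfolios
              for t in range(n_periods)]
    # First stage, equation (17): one time-series regression per portfolio.
    betas = [ols(market, row)[1] for row in returns]
    # r_bar_i: time-series average return of portfolio i.
    mean_returns = [sum(row) / n_periods for row in returns]
    # Second stage, equation (18): cross-sectional regression on the betas.
    lambda_0, lambda_1, r_squared = ols(betas, mean_returns)
    return betas, lambda_0, lambda_1, r_squared


if __name__ == "__main__":
    # Noise-free toy data: returns are exact multiples of a common factor.
    loadings = [0.5, 1.0, 1.5]
    factor = [0.01, -0.02, 0.03, 0.02]
    toy = [[b * f for f in factor] for b in loadings]
    print(fama_macbeth(toy))
```

With noise-free toy data of this kind the first stage recovers the loadings and the second-stage fit is perfect; replacing the noise-free returns with heavy-tailed disturbances is what drives the second-stage $R^2$ toward zero in the setting studied here.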

The $R^2$ statistic of the second-stage Fama-MacBeth regression is given by

$$R^2 = \frac{N^{-1}\hat\lambda_1^2 \sum_{i=1}^{N}\left(\hat\theta_i - \bar{\hat\theta}\right)^2}{N^{-1}\hat\lambda_1^2 \sum_{i=1}^{N}\left(\hat\theta_i - \bar{\hat\theta}\right)^2 + N^{-1}\sum_{i=1}^{N}\hat\varepsilon_i^2}, \tag{19}$$

where $\bar{\hat\theta}$ is the cross-sectional average of the $\hat\theta_i$. This statistic has the following asymptotic properties.

Theorem 3 If the individual portfolio returns $r_{it}$ follow an iid $S\alpha S$ distribution with $\alpha \in (1,2]$ and if $\mu_R > 0$, the coefficient of determination in (19) has the following limits as $T \to \infty$ and $N \to \infty$:

• If $\alpha = 2$, $R^2 \overset{p}{\to} \eta/(\eta+1)$, where $\eta = \lambda_1^2\, \mathrm{Var}(\theta)/\mathrm{Var}(\varepsilon)$; and

• If $\alpha < 2$, $R^2 = o_p\left(N^{1-2/\alpha}\right)$.

Thus, if $\alpha < 2$, $R^2 \overset{p}{\to} 0$, at a rate that is proportional to $N^{1-2/\alpha}$.

Proof. The result for the finite-variance case follows immediately from Corollary 1. For $\alpha < 2$, observe that the normalized estimator of $\theta_i$, $T^{(\alpha-1)/\alpha}\left(\hat\theta_i - E(\theta)\right)$, is in the domain of attraction of an $\alpha$-stable distribution for fixed values of $T$. As $T \to \infty$, the dispersion of $\hat\theta_i$ about $E(\theta)$ converges to 0, and the distributional properties of the estimated regressors $\hat\theta_i$ converge to those of $\theta_i$; by assumption, the variance of $\theta_i$ is finite. Thus, as $N \to \infty$ and $T \to \infty$, the numerator in equation (19) converges to $\lambda_1^2\, \mathrm{Var}(\theta)$. In contrast, the second summand in the denominator of (19) requires norming by $N^{2/\alpha} > N$ in order to attain a proper limit. The coefficient of determination therefore converges to 0 in probability as $N \to \infty$ and $T \to \infty$, at a rate of $N^{1-2/\alpha}$.
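To get a feel for the rate $N^{1-2/\alpha}$ in Theorem 3, it can help to compute the exponent and the implied shrinkage when the cross-section grows. The helper names below are hypothetical, and the numbers describe only the order of magnitude that the theorem's rate suggests, not a finite-sample prediction.

```python
def rate_exponent(alpha):
    """Exponent 1 - 2/alpha of the N**(1 - 2/alpha) rate, for 1 < alpha < 2."""
    if not 1.0 < alpha < 2.0:
        raise ValueError("the rate applies to 1 < alpha < 2")
    return 1.0 - 2.0 / alpha


def shrink_factor(alpha, n_small, n_large):
    """Ratio of the rate at n_large to the rate at n_small."""
    return (n_large / n_small) ** rate_exponent(alpha)


if __name__ == "__main__":
    for alpha in (1.5, 1.75):
        print(alpha, rate_exponent(alpha), shrink_factor(alpha, 100, 1000))
```

For a tenfold increase in $N$, the factor is roughly 0.46 for $\alpha = 1.5$ and roughly 0.72 for $\alpha = 1.75$, so the decay is faster the further $\alpha$ lies below 2.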
This result does not conflict with the one provided in Theorem 1, as the present case is one of an unbalanced regression design: the regressor has an asymptotically finite variance, whereas the error term has infinite variance, implying that the asymptotic signal-to-noise ratio is zero. Instead, this result is closely related to the one provided in Corollary 2, which examined the asymptotic limit of $R^2$ if $\alpha_x \neq \alpha_u$. We note that even if $T$ is fixed (as is generally taken to be the case in Fama-MacBeth regressions), the dispersion of $\hat\theta_i$ will likely be quite a bit smaller than that of $\varepsilon_i$, indicating that the model's signal-to-noise ratio, $\eta$, and hence the median of $R^2$, in the second-stage regression will be quite small unless $\lambda_1$ is sufficiently large in absolute value.

These qualitative observations are confirmed by a small-scale Monte Carlo simulation, shown in Table 1, in which we report the median value of $R^2$ as a function of two values of $\alpha$ and selected values of $T$, $N$, and $\mu_R$.[18] It is evident for both $\alpha = 1.5$ and $\alpha = 1.75$ that the median value of $R^2$ declines as $N$ increases if $T$ is fixed, that this effect is particularly strong if $T$ is large, and that this effect is more pronounced for $\alpha = 1.5$ than it is for $\alpha = 1.75$. The final result is as one would expect, given that Theorem 3 states that the rate of convergence of $R^2$ to zero increases as $\alpha$ moves down further from 2.

[18] The design of the simulation and the choices of values for $\alpha$, $T$, $N$ and $\mu_R$ were influenced by a desire to maximize the empirical relevance of the simulation exercise. We chose $\alpha = 1.5$ and $\alpha = 1.75$ because $\hat\alpha \ge 1.5$ for most empirical economic data. We study the cases of $T = 100$, 250, 1,000, and 2,500 because $T = 250$ corresponds approximately to the number of business days in a calendar year. The values of $N = 30$, 100, 500, and 1,000 correspond to the numbers of stocks contained in certain well-known stock price indices, such as the U.S. Dow-Jones Industrial and German DAX indices, the U.K. FTSE-100 index, the U.S. S&P-500 index, etc. The choice of $\mu_R = 0$ provides a reference to contrast the cases of $\theta_i = 0$ and $\theta_i \neq 0$; $\mu_R = 0.1$ is particularly relevant for the empirical study provided below.

Table 1 somewhere here

On the basis of the small value of the coefficient of determination from the Fama-MacBeth regression, Jagannathan and Wang (1996) confirm the finding of Fama and MacBeth (1973) of a "flat" relation between average return and market beta. They report a very low coefficient of determination of 1.35% = 0.0135 for the Sharpe-Lintner-Black (SLB) static CAPM. Regarding "thick-tailed" phenomena in empirical data, Fama and French (1992) conjectured that neglecting the heavy-tails phenomenon of the data does not lead to serious errors in the interpretation of empirical results.

In the following, we use the same CRSP dataset as was used by Jagannathan and Wang (1996); the data are very similar to those that were used in the study of Fama and French (1992). The data consist of stock returns of nonfinancial firms listed on the NYSE and AMEX from July 1963 until December 1990 covered by CRSP alone; the frequency of observation is monthly. In the preceding notation, we have $T = 330$ and $N = 100$. Figure 5 displays the time series of these monthly returns.

Figure 5 somewhere here

For our analysis we need to obtain point estimates of the index of stability of the stock returns and determine whether the estimates are less than 2. Under the assumption of symmetry, which implies that the left and right tails of the returns distribution possess the same maximal moment exponent and dispersion coefficient, the point estimate of $\alpha$ for monthly stock returns in the CRSP dataset using the Hill method (Hill, 1975) is 1.77, with a standard deviation of 0.15.[19] On the basis of these estimates, normality ($\alpha = 2$) can be excluded only at a confidence level of approximately 87.5 percent. However, inference about the width of the confidence interval for the Hill estimator is valid only asymptotically; in finite samples, the Hill-method estimates are known to be quite sensitive to even minor departures from exactly Paretian tail behavior.[20] In contrast, the method of Dufour and Kurz-Kim (2007) provides exact confidence intervals for finite samples. By their method, the point estimate of $\alpha$ for the monthly stock returns data is 1.78, and the exact finite-sample 90% confidence interval for this point estimate is [1.64, 1.99]. This result also does not offer very strong evidence against the hypothesis $\alpha = 2$. Nevertheless, because of estimation uncertainty in small samples, and because this uncertainty is especially severe if $\alpha$ is close to 2, the data can be regarded as being in the domain of attraction of a stable distribution with $\alpha < 2$.[21] We therefore proceed to investigate the consequences of this finding for the proper interpretation of the low $R^2$ statistic reported by Jagannathan and Wang (1996).

We designed Monte Carlo simulations to obtain the cdf of $R^2$ for our empirical data, first under the assumption that the returns data are in the domain of attraction of an $\alpha$-stable distribution with $\alpha < 2$, and second under the assumption of normality ($\alpha = 2$). The simulation was calibrated to the main characteristics of the empirical data; we set $\alpha = 1.78$, $T = 330$, $N = 100$, and we set the expected return equal to the average annual return in the full sample, i.e., $\mu_R = 0.1088$. The number of replications of the first-stage and second-stage Fama-MacBeth regressions is 100,000, for both values of $\alpha$.
The simulated cdfs of the $R^2$-statistic are shown in Figure 6, where a vertical line is drawn at $R^2 = 0.0135$ to indicate the in-sample value of the coefficient of determination. The shapes of the two curves are rather different, with the one for $\alpha = 1.78$ rising much more quickly for small values of $R^2$.

Figure 6 somewhere here

[19] In this estimation, we used 0.0031 as the centering offset for the empirical data; this adjustment is necessary because the Hill estimator is not location-invariant. The offset is equal to the estimated location parameter obtained by the quantile estimation method of McCulloch (1986). The choice of the number of order statistics to include in the Hill method was determined by the Monte Carlo method of Dufour and Kurz-Kim (2007). For the present dataset, this method indicated the use of 43% of all observations. The Hill estimator uses extreme observations from both tails of the empirical distribution under the assumption of symmetry, but it uses only observations from the right (left) tail under the assumption of right-skewed (left-skewed) asymmetry. In the case of the monthly stock returns, the distribution is clearly left-skewed, i.e., the largest negative returns are larger in magnitude in the sample than the largest positive returns; see Figure 5. Under the assumption of left-skewed asymmetry, the point estimate of $\alpha$ for the left tail using the Hill method is 1.47, with one standard deviation of 0.18.

[20] Stable distributions have tails that are asymptotically Paretian. In finite samples, and especially if the index of stability is not far below 2, it is known that the tails of stable distributions are not approximated particularly well by Pareto distributions with the same value of $\alpha$. See Resnick (2006, pp. 86–9) for a discussion of the consequences of these finite-sample features for the reliability of the Hill estimator.

[21] For a broader discussion of how to decide if $\alpha < 2$, see McCulloch (1997).

The simulated median $R^2$ of the second-stage Fama-MacBeth regression is 0.384 for $\alpha = 2$, but it is only 0.072 for $\alpha = 1.78$. The simulated probability of obtaining $R^2 \le 0.0135$ is a minuscule 1.55 percent for $\alpha = 2$, but it is a much more sizable 21.88 percent for $\alpha = 1.78$; thus, if $\alpha = 1.78$ the event $R^2 \le 0.0135$ is about 14 times more probable than if $\alpha = 2$. On the basis of these findings, we conclude that the inference drawn from the low value of $R^2$ by Fama and French (1992), namely that the empirical usefulness of the SLB CAPM is refuted, does not seem to be robust once proper allowance is made for the distributional properties of the data that give rise to this statistic.

5 Concluding remarks

After providing a brief overview of some of the properties of $\alpha$-stable distributions, this paper surveys the literature on the estimation of linear regression models with infinite-variance variables and associated methods of conducting hypothesis and specification tests. Our paper adds to the already-wide body of knowledge that there are substantial differences between regression models with infinite-variance and finite-variance regressors and error terms by examining the properties of the coefficient of determination. In the infinite-variance case with iid regressors and error terms that share the same index of stability $\alpha$, we find that the $R^2$ statistic does not converge to a constant but instead that it has a nondegenerate asymptotic distribution on the $[0,1]$ interval, with a pdf that has infinite singularities at 0 and 1. We provide closed-form expressions for the cdf and pdf of this limit random variable. If the regressors and error term do not have the same index of stability, we show that the coefficient of determination collapses either to 0 or to 1, depending on whether the model's signal-to-noise ratio converges asymptotically to zero or infinity.
Finally, we provide an empirical application of our methods to the Fama-MacBeth two-stage regression setup, and we show that the coefficient of determination asymptotically converges to 0 in probability if the regression variables have infinite variance. This, in turn, strongly suggests that low values of the $R^2$ statistic should not, by themselves, be taken as proof of a "flat" relationship between the dependent variable and the regressor.

In view of the random nature of the limit law $\widetilde{R}$ if the regressors and error terms share the same index of stability, and given our related finding that the coefficient of determination converges to zero in probability if the tail index of the disturbance term is smaller than that of the regressor, a case that may be difficult to rule out in empirical practice unless the sample size is very large, we view our results as establishing that one should not rely on $R^2$ as a measure of the goodness of fit of a regression model whenever the regressors and disturbance terms are sufficiently heavy-tailed to call into question the existence of second (population) moments. At the very least, if one chooses to report the coefficient of determination in regressions with infinite-variance variables at all, one should also report a point estimate of the median of $\widetilde{R}$, $\hat{m} = \hat\eta/(\hat\eta+1)$, where $\eta$ is as in Theorem 1. In addition, one should indicate whether the error terms and regressors may reasonably be assumed to share the same index of stability. If the validity of that assumption is in doubt, the authors should also indicate which of the two parameters is likely to be smaller and how far apart the two parameters may plausibly be.

It is widely known, and it is certainly stressed in all introductory econometrics textbooks, that a high value of $R^2$ does not provide a sufficient basis for concluding that an empirical regression model is a "good" explanation of the dependent variable, or even that the regression is correctly specified. Nevertheless, one suspects, researchers may view low values of $R^2$ in an empirical regression as an indication that the (linear) relationship is either weak or unreliable. A direct implication of the work presented in this paper is that whenever the data are characterized by significant outlier activity, a low value of $R^2$ should not, by itself, be used to disqualify the model from further consideration.

Several extensions to the work presented here are possible. First, the regression F-statistic is a simple function of the coefficient of determination; e.g., $F = (T-2) \cdot R^2/(1-R^2)$ in the bivariate regression case. Given the close connection between the two statistics, it seems useful to study if and how the distributional properties of the regression F-statistic are affected by the presence of $\alpha$-stable regressors and error terms under both the null hypothesis, $\theta = 0$, and the alternative hypothesis, $\theta \neq 0$. It would also be useful to elaborate on our idea, offered after Remark 1 in subsection 3.2, that the difference between the estimate of $R^2$ and a consistent estimate of its median may serve as a diagnostic check of the size of the effect of infinite variance on $R^2$.
For example, it may be feasible to develop an asymptotic theory of the distributional properties of this difference. It also seems desirable to study how well the distribution of $\tilde{R}$ approximates the empirical distribution of $R^2$ in finite samples, for various types of heavy-tailed distributions that are in the domain of attraction of S$\alpha$S distributions, and for various types of estimators (such as OLS, Blattberg-Sargent's BLUE, and the least absolute deviation estimator). In addition, an extension to a multiple-regression framework may produce additional insights into the properties of the coefficient of determination in the infinite-variance case. Finally, the theoretical results presented in our paper depend crucially on the assumption that the random variables are iid. Relaxing this assumption would seem to be useful, as many economic and financial time series, especially if they are sampled at very high frequencies, are characterized by interesting dependence and heterogeneity features. Introducing serial dependence and heterogeneity, especially conditional heterogeneity, would serve the purpose of studying how the properties of $\tilde{R}$ may be affected by such departures from the basic case of iid variables. The authors are considering conducting research to extend the work presented in this paper along these lines.
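The quantities discussed in this section can be illustrated with a short Python sketch. It is not code from the paper and makes several assumptions: the regression is bivariate with an intercept, the regressor and error term are standard symmetric $\alpha$-stable draws with a common index generated by the Chambers-Mallows-Stuck transformation (chosen here purely for illustration), and all function names are hypothetical. Because $\eta$ is defined in Theorem 1, the median formula takes an estimate $\hat{\eta}$ as an input rather than computing it.

```python
import math
import random


def sas_draw(alpha, rng):
    """One draw from a standard symmetric alpha-stable law (0 < alpha <= 2),
    via the Chambers-Mallows-Stuck transformation (illustrative choice)."""
    v = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    return (math.sin(alpha * v) / math.cos(v) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))


def r_squared(x, y):
    """R^2 of the OLS regression of y on x with an intercept; in the
    bivariate case this equals the squared sample correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def f_statistic(r2, t):
    """Regression F-statistic in the bivariate case: (T - 2) R^2 / (1 - R^2)."""
    return (t - 2) * r2 / (1.0 - r2)


def median_estimate(eta_hat):
    """Point estimate of the median of the limit law: eta_hat / (eta_hat + 1).
    eta_hat is assumed to be an estimate of eta as defined in Theorem 1."""
    return eta_hat / (eta_hat + 1.0)


def simulate_r2(alpha, theta, t, reps, seed=0):
    """Sorted R^2 values from `reps` simulated samples of size `t`, with
    y = theta * x + u and x, u independent symmetric alpha-stable draws."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        x = [sas_draw(alpha, rng) for _ in range(t)]
        y = [theta * xi + sas_draw(alpha, rng) for xi in x]
        values.append(r_squared(x, y))
    return sorted(values)


if __name__ == "__main__":
    draws = simulate_r2(alpha=1.5, theta=1.0, t=500, reps=200)
    # Approximate 5%, 50%, and 95% sample quantiles of the simulated R^2.
    print("R^2 quantiles:", draws[9], draws[99], draws[189])
    print("F at the sample median R^2:", f_statistic(draws[99], 500))
    print("Median estimate for eta_hat = 1:", median_estimate(1.0))
```

Running the script prints a few sample quantiles of the simulated $R^2$ values, which gives a rough sense of how dispersed the statistic is when the regressor and error term share an index of stability below 2. A much larger number of replications would be needed for any quantitative comparison with the limit distribution.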

References

Blattberg, R. and T. Sargent (1971): Regression with non-Gaussian stable disturbances: Some sampling results, Econometrica 39, 501–510.
Brockwell, P.J. and R.A. Davis (1991): Time Series: Theory and Models (2nd ed). Springer: New York.
Davis, R.A. and S.I. Resnick (1985a): Limit theory for moving averages of random variables with regularly varying tail probabilities, Annals of Probability 13, 179–195.
Davis, R.A. and S.I. Resnick (1985b): More limit theory for sample correlation functions of moving averages, Stochastic Processes and Their Applications 20, 257–279.
Davis, R.A. and S.I. Resnick (1986): Limit theory for sample covariance and correlation functions of moving averages, Annals of Statistics 14, 533–558.
De Vries, C. (1991): On the relation between GARCH and stable processes, Journal of Econometrics 48, 313–324.
Dufour, J.-M. and J.-R. Kurz-Kim (2007): Exact inference and optimal invariant estimation for the stability parameter of symmetric α-stable distributions, Journal of Empirical Finance, forthcoming.
Fama, E.F. and K.R. French (1992): The cross-section of expected stock returns, Journal of Finance 47, 427–465.
Fama, E.F. and J.D. MacBeth (1973): Risk, return, and equilibrium: Empirical tests, Journal of Political Economy 71, 607–636.
Feller, W. (1971): An Introduction to Probability Theory and its Applications, Vol. 2 (2nd ed). Wiley: New York.
Gnedenko, B.V. and A.N. Kolmogorov (1954): Limit Distributions for Sums of Independent Random Variables. Addison-Wesley: Reading, Mass.
Granger, C.W.J. and D. Orr (1972): “Infinite variance” and research strategy in time series analysis, Journal of the American Statistical Association 67, 275–285.
Ibragimov, I.A. and Yu.V. Linnik (1971): Independent and Stationary Sequences of Random Variables. Wolters-Noordhoff: Groningen.
Hill, B.M. (1975): A simple general approach to inference about the tail of a distribution, Annals of Statistics 3, 1163–1174.
Jagannathan, R. and Z. Wang (1996): The conditional CAPM and cross-section of expected returns, Journal of Finance 51, 3–53.
Kanter, M. and W.L. Steiger (1974): Regression and autoregression with infinite variance, Advances in Applied Probability 6, 768–783.
Kim, J.-R. (2003): Finite-sample distributions of self-normalized sums, Computational Statistics 18, 493–504.
Kim, J.-R. and S.T. Rachev (1999): Asymptotic distributions of BLUE in the presence of infinite variance residuals, Unpublished manuscript.
Kurz-Kim, J.-R., S.T. Rachev and G. Samorodnitsky (2005): F-type Distributions for Heavy-tailed Variates, Unpublished manuscript.
Logan, B.F., C.L. Mallows, S.O. Rice and L.A. Shepp (1973): Limit distributions of self-normalized sums, Annals of Probability 1, 788–809.
Loretan, M. and P.C.B. Phillips (1994): Testing the covariance stationarity of heavy-tailed time series, Journal of Empirical Finance 1, 211–248.

Mood, A.M., F.A. Graybill and D.C. Boes (1974): Introduction to the Theory of Statistics (3rd ed). McGraw-Hill: New York.
Mandelbrot, B. (1963): The variation of certain speculative prices, Journal of Business 36, 394–419.
McCulloch, J.H. (1986): Simple consistent estimators of stable distribution parameters, Communications in Statistics, Computation and Simulation 15, 1109–1136.
McCulloch, J.H. (1996): Financial applications of stable distributions, in: Handbook of Statistics: Statistical Methods in Finance, Vol. 14, 393–425, eds. G.S. Maddala and C.R. Rao. Elsevier Science: Amsterdam.
McCulloch, J.H. (1997): Measuring tail thickness to estimate the stable index α: A critique, Journal of Business and Economic Statistics 15, 74–81.
McCulloch, J.H. (1998): Linear regression with stable disturbances, in: A Practical Guide to Heavy Tails, 359–376, eds. R. Adler, R. Feldman, and M.S. Taqqu. Birkhäuser: Berlin and Boston.
Mittnik, S., S.T. Rachev and J.-R. Kim (1998): Chi-square-type distributions for heavy-tailed variates, Econometric Theory 14, 339–354.
Phillips, P.C.B. (1990): Time series regression with a unit root and infinite-variance errors, Econometric Theory 6, 44–62.
Phillips, P.C.B. and V.A. Hajivassiliou (1987): Bimodal t-ratios, Cowles Foundation Discussion Paper No. 842.
Phillips, P.C.B. and M. Loretan (1991): The Durbin-Watson ratio under infinite-variance errors, Journal of Econometrics 47, 85–114.
Phillips, P.C.B. and M. Loretan (1994): On the theory of testing covariance stationarity under moment condition failure, in: Advances in Econometrics and Quantitative Economics. Essays in Honor of Professor C.R. Rao, 198–233, eds. G.S. Maddala, P.C.B. Phillips, and T.N. Srinivasan. Blackwell: Oxford.
Rachev, S.T., J.-R. Kim and S. Mittnik (1999): Stable Paretian econometrics, Part I and II, The Mathematical Scientists 24, 24–55 and 113–127.
Rachev, S.T., C. Menn and F.J. Fabozzi (2005): Fat-Tailed and Skewed Asset Return Distributions. Wiley: New York.
Resnick, S.I. (1986): Point processes, regular variation and weak convergence, Advances in Applied Probability 18, 66–138.
Resnick, S.I. (1999): A Probability Path. Birkhäuser: Berlin and Boston.
Resnick, S.I. (2006): Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer: New York.
Runde, R. (1993): A note on the asymptotic distribution of the F-statistic for random variables with infinite variance, Statistics & Probability Letters 18, 9–12.
Runde, R. (1997): The asymptotic null distribution of the Box-Pierce Q-statistic for random variables with infinite variance, with an application to German stock returns, Journal of Econometrics 78, 205–216.
Samorodnitsky, G., S.T. Rachev, J.-R. Kurz-Kim and S. Stoyanov (2007): Asymptotic distribution of linear unbiased estimators in the presence of heavy-tailed stochastic regressors and residuals, Probability and Mathematical Statistics, forthcoming.
Samorodnitsky, G. and M.S. Taqqu (1994): Stable Non-Gaussian Random Processes. Chapman & Hall: New York.
Zolotarev, V.M. (1986): One-Dimensional Stable Distributions, Translations of Mathematical Monographs, Vol. 65. American Mathematical Society: Providence.

Table 1: Median value of $R^2$ as a function of $\alpha$, $T$, $N$, and $\mu_R$

α     T     N     μ_R=0.0  μ_R=0.1  μ_R=0.3  μ_R=0.5  μ_R=1.0
1.50  100   30    0.0404   0.0425   0.0576   0.1009   0.2963
1.50  100   100   0.0162   0.0172   0.0292   0.0596   0.2019
1.50  100   500   0.0068   0.0075   0.0147   0.0325   0.1206
1.50  100   1000  0.0046   0.0055   0.0114   0.0264   0.0973
1.50  250   30    0.0402   0.0426   0.0779   0.1598   0.4417
1.50  250   100   0.0161   0.0190   0.0448   0.1058   0.3304
1.50  250   500   0.0064   0.0075   0.0220   0.0565   0.2020
1.50  250   1000  0.0047   0.0058   0.0172   0.0452   0.1667
1.50  1000  30    0.0387   0.0484   0.1499   0.3320   0.6748
1.50  1000  100   0.0162   0.0223   0.0940   0.2272   0.5558
1.50  1000  500   0.0065   0.0104   0.0521   0.1341   0.3994
1.50  1000  1000  0.0046   0.0072   0.0399   0.1079   0.3443
1.50  2500  30    0.0403   0.0580   0.2478   0.4806   0.7962
1.50  2500  100   0.0155   0.0294   0.1621   0.3581   0.6973
1.50  2500  500   0.0066   0.0130   0.0883   0.2243   0.5507
1.50  2500  1000  0.0047   0.0103   0.0737   0.1896   0.4970
1.75  100   30    0.0488   0.0543   0.1332   0.2944   0.6410
1.75  100   100   0.0260   0.0328   0.1032   0.2413   0.5756
1.75  100   500   0.0177   0.0222   0.0779   0.1941   0.5055
1.75  100   1000  0.0149   0.0199   0.0720   0.1778   0.4792
1.75  250   30    0.0474   0.0642   0.2509   0.4899   0.7993
1.75  250   100   0.0265   0.0430   0.2066   0.4264   0.7560
1.75  250   500   0.0169   0.0290   0.1571   0.3500   0.6950
1.75  250   1000  0.0143   0.0251   0.1440   0.3273   0.6730
1.75  1000  30    0.0470   0.1193   0.5351   0.7665   0.9309
1.75  1000  100   0.0265   0.0871   0.4612   0.7124   0.9115
1.75  1000  500   0.0169   0.0635   0.3910   0.6507   0.8865
1.75  1000  1000  0.0144   0.0579   0.3663   0.6257   0.8744
1.75  2500  30    0.0480   0.2185   0.7214   0.8804   0.9677
1.75  2500  100   0.0255   0.1704   0.6599   0.8474   0.9578
1.75  2500  500   0.0169   0.1251   0.5900   0.8066   0.9452
1.75  2500  1000  0.0149   0.1202   0.5674   0.7902   0.9394

The numbers in the body of the table are the medians from simulated distributions with 100,000 replications.

Figure 1. The function C(α), 0 < α < 2

Figure 2. Cumulative distribution functions of $Z \equiv S_x / S_u$

Figure 3. Probability density functions of $\tilde{R}(\alpha, \eta)$, $\eta = 1$

Figure 4. Cumulative distribution functions of $\tilde{R}(\alpha, \eta)$, $\eta = 1$

Figure 5. CRSP Returns, July 1963 to December 1992 (vertical axis: monthly return; horizontal axis: year)

Figure 6. Simulated cdf of $R^2$, second-stage Fama-MacBeth regressions (vertical axis: cumulative probability; horizontal axis: $R^2$; solid line: α = 1.78; dotted line: α = 2)

Cite this document
APA
Jeong-Ryeol Kurz-Kim and Mico Loretan (2007). A Note on the Coefficient of Determination in Models with Infinite Variance Variables (IFDP 2007-895). Board of Governors of the Federal Reserve System, International Finance Discussion Papers. https://whenthefedspeaks.com/doc/ifdp_2007-895
BibTeX
@techreport{wtfs_ifdp_2007_895,
  author = {Jeong-Ryeol Kurz-Kim and Mico Loretan},
  title = {A Note on the Coefficient of Determination in Models with Infinite Variance Variables},
  type = {International Finance Discussion Papers},
  number = {2007-895},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2007},
  url = {https://whenthefedspeaks.com/doc/ifdp_2007-895},
  abstract = {Since the seminal work of Mandelbrot (1963), alpha-stable distributions with infinite variance have been regarded as a more realistic distributional assumption than the normal distribution for some economic variables, especially financial data. After providing a brief survey of theoretical results on estimation and hypothesis testing in regression models with infinite-variance variables, we examine the statistical properties of the coefficient of determination in models with alpha-stable variables. If the regressor and error term share the same index of stability alpha<2, the coefficient of determination has a nondegenerate asymptotic distribution on the entire [0, 1] interval, and the density of this distribution is unbounded at 0 and 1. We provide closed-form expressions for the cumulative distribution function and probability density function of this limit random variable. In contrast, if the indices of stability of the regressor and error term are unequal, the coefficient of determination converges in probability to either 0 or 1, depending on which variable has the smaller index of stability. In an empirical application, we revisit the Fama-MacBeth two-stage regression and show that in the infinite-variance case the coefficient of determination of the second-stage regression converges to zero in probability even if the slope coefficient is nonzero.},
}