ifdp · May 31, 1992

The Power of Cointegration Tests

Abstract

A cointegration test statistic based upon estimation of an error correction model can be approximately normally distributed when no cointegration is present. By contrast, the equivalent Dickey-Fuller statistic applied to residuals from a static relationship has a non-standard asymptotic distribution. When cointegration exists, the error-correction test generally is more powerful than the Dickey-Fuller test. These differences arise because the latter imposes a possibly invalid common factor restriction. The issue is general and has ramifications for system-based cointegration tests. Monte Carlo analysis and an empirical study of U.K. money demand demonstrate the differences in power.

Board of Governors of the Federal Reserve System International Finance Discussion Papers Number 431

June 1992


Jeroen J.M. Kremers, Neil R. Ericsson, and Juan J. Dolado

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors.


Key words and phrases: cointegration, Dickey-Fuller statistic, econometrics, error correction, power, statistical inference, unit roots.


Jeroen J.M. Kremers, Neil R. Ericsson, and Juan J. Dolado¹

1 Introduction

Contrasting inferences about the presence of cointegration often appear in empirical investigations. For example, in applying the commonly used “two-step” procedure proposed by Engle and Granger (1987), the Dickey-Fuller unit-root test may only marginally reject the null hypothesis of no cointegration, if it rejects at all. By contrast, the coefficient on the error-correction term in the corresponding dynamic model of the same data may be “highly statistically significant”, strongly supporting cointegration; cf. Kremers (1989), Hendry and Ericsson (1991a), and Campos and Ericsson (1988). Both procedures are tests of cointegration, so why should there be such a contrast? A plausible explanation centers on an implicit common factor restriction imposed when using the Dickey-Fuller statistic to test for cointegration. If that restriction is invalid, the Dickey-Fuller test remains consistent, but loses power relative to cointegration tests that do not impose a common factor restriction, such as those based upon the estimated error-correction coefficient.

This paper examines the asymptotic and finite sample properties of the two procedures for a simple, single-lag, bivariate process. Even with more lags and more variables, the reason for the low power of the Dickey-Fuller test remains. The error-correction-based test is preferable because it uses available information more efficiently than the Dickey-Fuller test.

Section 2 describes the process of interest and derives the relationship between the error-correction mechanism and the equation from which the Dickey-Fuller statistic is calculated. Section 3 presents the asymptotic distribution of each test statistic under the null hypothesis of no cointegration, while Section 4 gives the corresponding asymptotic distributions under the alternative hypothesis of cointegration, using fixed and “near non-cointegrated” alternatives. Section 5 generalizes the results for testing in multivariate, multiple-lag systems. Section 6 interprets some Monte Carlo finite sample evidence in light of the asymptotic formulae. Section 7 empirically illustrates the two testing procedures with Hendry and Ericsson’s (1991b) quarterly data on U.K. narrow money demand. Derivations of all new results appear in the Appendix.

¹ Forthcoming in a special issue of the Oxford Bulletin of Economics and Statistics entitled Testing Integration and Cointegration, Anindya Banerjee and David F. Hendry (eds.), Vol. 54, No. 3, August 1992. The authors of this paper are staff economists in the Ministry of Finance, The Hague, The Netherlands; the International Finance Division, Federal Reserve Board, Washington, D.C., U.S.A.; and the Research Department, Bank of Spain, Madrid, Spain, respectively. The first author is also a visiting professor at Erasmus University, Rotterdam (OCFEB). This paper represents the views of the authors and should not be interpreted as reflecting those of the Dutch Ministry of Finance, the Board of Governors of the Federal Reserve System, the Bank of Spain, or other members of their staff. This paper was prepared in part while the second and third authors were visiting INSEE and CEPREMAP, whom we thank for generous hospitality. We are grateful to Javier Andrés, Anindya Banerjee, Julia Campos, Christian Gourieroux, David Hendry, Søren Johansen, Augustin Maravall, Alain Monfort, Mark Salmon, Jim Stock, and Hong-Anh Tran for helpful discussions, and to Lisa Barrow and Rafael Domenech for research assistance. All numerical results were obtained using PC-NAIVE and PC-GIVE Version 6.01; cf. Hendry and Neale (1990) and Hendry (1989).

2 A Simple Bivariate Process

Using a simple dynamic bivariate process, this paper focuses on the relative merits of the two-step Engle-Granger and single-step dynamic-model procedures for testing for the existence of cointegration. See Engle and Granger (1987) on the former and Banerjee, Dolado, Hendry, and Smith (1986) inter alia on the latter. The former is characterized by a Dickey-Fuller (DF) statistic used to test for the existence of a unit root in the residuals of a static cointegrating regression. The latter is based upon the t-ratio of the coefficient on the error-correction term in a dynamic model reparameterized as an error-correction mechanism (ECM), noting that cointegration implies and is implied by an ECM. This t-ratio is denoted the ECM statistic. This section describes the data generation process (DGP) and derives the analytical relationship between the ECM and the equation for the DF statistic.

The bivariate process considered is one of the simplest imaginable, and has been used elsewhere for expository purposes; cf. Davidson, Hendry, Srba, and Yeo (1978) and Banerjee, Dolado, Hendry, and Smith (1986). It is a linear first-order vector autoregression with normal disturbances, at least one unit root, and Granger-causality in only one direction. For expositional convenience, this DGP is written as a conditional ECM (1) and a marginal unit-root process (2):

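(1)  $\Delta y_t = a\,\Delta z_t + b\,(y - z)_{t-1} + \varepsilon_t$

(2)  $\Delta z_t = u_t$

where

$\begin{pmatrix} \varepsilon_t \\ u_t \end{pmatrix} \sim IN\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_\varepsilon^2 & 0 \\ 0 & \sigma_u^2 \end{pmatrix} \right), \qquad t = 1, \ldots, T,$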

and where $\Delta$ is the first-difference operator $1 - L$, $L$ is the lag operator, and $T$ is the sample size. The variables $y_t$ and $z_t$ are integrated of order one [denoted I(1)] and are possibly cointegrated. For $y = \ln Y$ and $z = \ln Z$, $a$ is the short-run elasticity of $Y$ with respect to $Z$. The parameter $b$ is the error-correction coefficient in the conditional model of $y_t$, given lagged $y$ and current and lagged $z$; and $\varepsilon_t$ and $u_t$ are the disturbances in this conditional/marginal factorization. Without loss of generality, the cointegrating vector for $(y_t, z_t)'$ is $(1, -1)$ if $y_t$ and $z_t$ are cointegrated.
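As a minimal sketch, the DGP (1)-(2) can be simulated with the Python standard library. The function name `simulate_dgp`, the zero initial values, the seed, and the default burn-in of twenty observations are illustrative assumptions rather than details of the software used in the paper.

```python
import random

def simulate_dgp(T, a, b, sigma_u=1.0, sigma_eps=1.0, burn_in=20, seed=0):
    """Simulate levels (y_t, z_t) from the conditional ECM (1) and the random walk (2)."""
    rng = random.Random(seed)
    y, z = [0.0], [0.0]                      # zero initial values (an assumption)
    for _ in range(T + burn_in):
        u = rng.gauss(0.0, sigma_u)          # disturbance in (2)
        eps = rng.gauss(0.0, sigma_eps)      # disturbance in (1)
        dz = u                               # (2): Delta z_t = u_t
        dy = a * dz + b * (y[-1] - z[-1]) + eps   # (1)
        y.append(y[-1] + dy)
        z.append(z[-1] + dz)
    # Drop the burn-in; T + 1 levels remain, giving T first differences.
    return y[burn_in:], z[burn_in:]

y, z = simulate_dgp(T=20, a=0.5, b=-0.05, sigma_u=6.0)
print(len(y), len(z))
```

Setting `b=0.0` gives two I(1) series that are not cointegrated, while any `b` in (−1, 0) makes $y_t - z_t$ stationary.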

For simplicity, the (hypothesized) cointegrating vector is assumed known. Such a priori knowledge of the cointegrating vector arises frequently in economic models of long-run behavior, as in modeling (logs of) consumers’ expenditure and disposable income, wages and prices, money and income, or the exchange rate and foreign and domestic price levels.² Also, $z_t$ is assumed weakly exogenous for the parameters in the conditional model (1); see Engle, Hendry, and Richard (1983) and Johansen (1992a).

As Section 5 shows, the logical issues arising from common factor restrictions apply to processes more general than (1)-(2). Specifically, the cointegrating vector or vectors may be estimated and may enter more than one equation (e.g., no weak exogeneity); and a constant term, seasonal dummies, additional variables, and additional lags may be included. However, some statistics’ distributions are more complicated with such generalizations, so we focus on this bivariate case.

The parameter space is restricted to $\{0 \le a \le 1,\ -1 < b \le 0\}$. In many empirical studies, $a \approx 0.5$ and $b \approx -0.1$, with $\sigma_u^2 > \sigma_\varepsilon^2$. That is, the short-run elasticity ($a$) is smaller than the long-run elasticity (unity), adjustment to remaining disequilibria is slow, and the innovation error variance for the regressor process is larger than that of the conditional ECM.

The variables $y_t$ and $z_t$ are cointegrated or not, depending upon whether $b < 0$ or $b = 0$. Thus, tests of cointegration rely upon some estimate of $b$. In the ECM approach, equation (1) itself is estimated by OLS (denoted by a circumflex ^):

(3)  $\Delta y_t = \hat{a}\,\Delta z_t + \hat{b}\,w_{t-1} + \hat{\varepsilon}_t,$

where the putative disequilibrium is:

(4)  $w_t = y_t - z_t.$

The t-ratio based upon $\hat{b}$ is the ECM statistic, denoted $t_{ECM}$. It is used to test the null hypothesis that $b = 0$, i.e., that $y_t$ and $z_t$ are not cointegrated with a cointegrating vector $(1, -1)$.
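The ECM statistic can be computed directly from the normal equations of the two-regressor, no-intercept regression (3). The sketch below is one way to do so; the function name `ecm_t_stat`, the degrees-of-freedom correction ($T - 2$), and the illustrative parameter values are assumptions, not specifications from the paper.

```python
import math
import random

def ecm_t_stat(y, z):
    """t-ratio on w_{t-1} in the no-intercept OLS regression (3)."""
    n = len(y) - 1
    dy = [y[t] - y[t - 1] for t in range(1, n + 1)]
    dz = [z[t] - z[t - 1] for t in range(1, n + 1)]
    w_lag = [y[t - 1] - z[t - 1] for t in range(1, n + 1)]   # (4), lagged once
    # Moment sums for the two-regressor normal equations.
    s11 = sum(d * d for d in dz)
    s22 = sum(w * w for w in w_lag)
    s12 = sum(d * w for d, w in zip(dz, w_lag))
    s1y = sum(d * v for d, v in zip(dz, dy))
    s2y = sum(w * v for w, v in zip(w_lag, dy))
    det = s11 * s22 - s12 * s12
    a_hat = (s22 * s1y - s12 * s2y) / det
    b_hat = (s11 * s2y - s12 * s1y) / det
    resid = [v - a_hat * d - b_hat * w for v, d, w in zip(dy, dz, w_lag)]
    sigma2 = sum(r * r for r in resid) / (n - 2)   # d.o.f. correction: an assumption
    se_b = math.sqrt(sigma2 * s11 / det)           # estimated standard error of b_hat
    return b_hat / se_b

# Illustrative data from (1)-(2) with hypothetical parameter values.
rng = random.Random(0)
a, b = 0.5, -0.5
y, z = [0.0], [0.0]
for _ in range(200):
    u, eps = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    y.append(y[-1] + a * u + b * (y[-1] - z[-1]) + eps)
    z.append(z[-1] + u)
t_ecm = ecm_t_stat(y, z)
print(round(t_ecm, 2))
```

With strong error correction (here $b = -0.5$) the statistic is large and negative, so the null of no cointegration is rejected at any conventional critical value.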

The DF statistic derives from a different regression, so it is helpful to establish the relationship between the DF regression equation and the ECM in (1). Specifically, subtract $\Delta z_t$ from both sides of (1) and re-arrange:

(5)  $\Delta (y - z)_t = b\,(y - z)_{t-1} + [(a - 1)\,\Delta z_t + \varepsilon_t].$

Noting (4), equation (5) may be rewritten as:

(6)  $\Delta w_t = b\,w_{t-1} + e_t,$

where the disturbance $e_t$ is:

(7)  $e_t = (a - 1)\,\Delta z_t + \varepsilon_t.$

OLS estimation of (6) (denoted by a tilde ~) generates:

(8)  $\Delta w_t = \tilde{b}\,w_{t-1} + \tilde{e}_t.$

The t-ratio based upon $\tilde{b}$ is the DF statistic, denoted $t_{DF}$ here [$\hat{\tau}$ in Dickey and Fuller (1979)]. This t-ratio is also used to test whether or not $y_t$ and $z_t$ are cointegrated with cointegrating vector $(1, -1)$. See Dickey and Fuller (1979, 1981) and Engle and Granger (1987).

² See Davidson, Hendry, Srba, and Yeo (1978), Hendry, Muellbauer, and Murphy (1990), Sargan (1964), Nymoen (1992), Hendry and Ericsson (1991a, 1991b), and Johansen and Juselius (1990a, 1990b) inter alia.
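For comparison, the DF statistic needs only the single-regressor regression of $\Delta w_t$ on $w_{t-1}$. A minimal sketch follows; the function name `df_t_stat` and the degrees-of-freedom correction ($T - 1$) are assumptions made for illustration.

```python
import math

def df_t_stat(y, z):
    """t-ratio on w_{t-1} in the no-intercept OLS regression of dw_t on w_{t-1}."""
    w = [yi - zi for yi, zi in zip(y, z)]          # w_t = y_t - z_t
    dw = [w[t] - w[t - 1] for t in range(1, len(w))]
    w_lag = w[:-1]
    s_ww = sum(x * x for x in w_lag)
    b_tilde = sum(x * d for x, d in zip(w_lag, dw)) / s_ww
    resid = [d - b_tilde * x for x, d in zip(w_lag, dw)]
    sigma2 = sum(r * r for r in resid) / (len(dw) - 1)   # d.o.f. correction: an assumption
    return b_tilde / math.sqrt(sigma2 / s_ww)

# Tiny hand-checkable example: w alternates between 0 and 1, so b_tilde = -1.
t_df = df_t_stat([0.0, 1.0, 0.0, 1.0, 0.0], [0.0] * 5)
print(t_df)
```

In this toy example the residual sum of squares is 2 on 3 degrees of freedom, so the statistic equals $-\sqrt{3}$, approximately −1.73. Note that the function never looks at $\Delta z_t$ separately, which is the information loss discussed in the text.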

In contrast to the estimated ECM in (3), the estimated DF equation (8) ignores potential information contained in $\Delta z_t$. Equivalently, (6) imposes the restriction that $a$ equals unity. That is, the short-run elasticity ($a$) equals the long-run elasticity (unity). More generally, (6) imposes a common factor, as follows from rewriting (4) and (6):

(9)  $y_t = z_t + w_t, \qquad w_t = (1 + b)\,w_{t-1} + e_t$

(10)  $[1 - (1 + b)L]\,y_t = [1 - (1 + b)L]\,z_t + e_t,$

where $[1 - (1 + b)L]$ is the factor common to $y_t$ and $z_t$ in (10).³

The transformation of (1) to (6), (9), and (10) provides several insights. First, (1), (6), (9), and (10) are equivalent representations, given the relationship between the errors $\varepsilon_t$ and $e_t$ in (7); but the two errors are not equal unless $a = 1$ or $\Delta z_t = 0$. Second, and relatedly, the common factor restriction in (10) [and so in (6) and (9)] is invalid unless $a = 1$, noting that:

(11)  $[1 - (1 + b)L]\,y_t = [a - (a + b)L]\,z_t + \varepsilon_t,$

from (1). Interestingly, even if the common factor restriction is invalid, $e_t$ remains white noise for this DGP. Nonetheless, $e_t$ is not an innovation with respect to current and lagged $z$ and lagged $y$; cf. Granger (1983) and Hendry and Richard (1982) on the distinction between white noise and innovations. Since empirically estimated short- and long-run elasticities often differ markedly (as noted above), imposing their equality in the DF statistic is rather arbitrary. Third, (9) motivates the use of unit-root statistics in testing for cointegration. If $w_t$ has a unit root, then $w_t$ is nonstationary, $b = 0$, and $y_t$ and $z_t$ are not cointegrated with the cointegrating vector $(1, -1)$. Conversely, if $w_t$ has its root inside the unit circle, then $w_t$ is stationary, $b < 0$, and $y_t$ and $z_t$ are cointegrated.
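The distinction between white noise and an innovation can be checked numerically. The sketch below builds $e_t = (a - 1)\Delta z_t + \varepsilon_t$ from simulated disturbances and compares its first-order autocorrelation with its correlation with $\Delta z_t$. The parameter values, the seed, and the helper name `corr` are illustrative assumptions.

```python
import math
import random

def corr(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)
a, sigma_u, sigma_eps, T = 0.5, 6.0, 1.0, 5000   # hypothetical values with a != 1
e, dz = [], []
for _ in range(T):
    u = rng.gauss(0.0, sigma_u)          # Delta z_t = u_t
    eps = rng.gauss(0.0, sigma_eps)
    e.append((a - 1.0) * u + eps)        # e_t = (a - 1) * Delta z_t + eps_t
    dz.append(u)

rho_e = corr(e[1:], e[:-1])   # serial correlation of e_t: near zero (white noise)
rho_ez = corr(e, dz)          # correlation with current Delta z_t: far from zero
print(round(rho_e, 3), round(rho_ez, 3))
```

For these values the population correlation between $e_t$ and $\Delta z_t$ is $(a - 1)\sigma_u/\sigma_e = -3/\sqrt{10} \approx -0.95$, so $e_t$ is serially uncorrelated yet highly predictable from current $\Delta z_t$.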

³ See Hendry and Mizon (1978) and Sargan (1964, 1980) on common factors.

3 Distribution of the Statistics under the Null Hypothesis (No Cointegration)

The null hypothesis is no cointegration: that is, $b = 0$ in (1)-(2). Because $w_{t-1}$ [in (3) and (8)] is not stationary under this hypothesis, distributional results from “standard” asymptotic theory do not apply. This section describes the asymptotic distributions of the DF and ECM statistics under that null hypothesis, and obtains a normal approximation to the distribution of the ECM t-ratio when $a \neq 1$.

For expositional convenience, we adopt certain notational conventions concerning Brownian motion (or Wiener) processes. Consider a normal, independently and identically distributed variable $\eta_t$, $t = 1, \ldots, T$: that is, $\eta_t \sim IN(0, \sigma_\eta^2)$. In this paper, $\eta_t$ is usually either $e_t$, $\varepsilon_t$, or $u_t$. Define $B_{T,\eta}(r)$ as the partial sum $\sum_{t=1}^{[Tr]} \eta_t / \sqrt{T\sigma_\eta^2}$, where $r$ lies in $[0,1]$, and $[Tr]$ is the integer part of $Tr$. As discussed in Phillips (1987b), $B_{T,\eta}(r)$ converges weakly to a standardized Wiener process, denoted $B_\eta(r)$. Frequently, the argument $r$ is suppressed, as is the range of integration over $r$, when that range is $[0,1]$. Thus, integrals such as $\int_0^1 B_\eta(r)^2\,dr$ are written as $\int B_\eta^2$. The symbol “$\Rightarrow$” denotes weak convergence of the associated probability measures as the sample size $T \to \infty$. See Banerjee, Dolado, Galbraith, and Hendry (1992) for a detailed discussion of Wiener processes.

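The DF statistic [from (8)] is:

(12)  $t_{DF} = \tilde{b}/\mathrm{ese}(\tilde{b})$

$\qquad = \left(\sum w_{t-1}^2\right)^{-1}\left(\sum w_{t-1}\,\Delta w_t\right) \Big/ \left[\tilde{\sigma}_e^2 \left(\sum w_{t-1}^2\right)^{-1}\right]^{1/2}$

$\qquad = \left(\sum w_{t-1}^2\right)^{-1/2}\left(\sum w_{t-1}\,e_t\right) / \tilde{\sigma}_e,$

where ese(·) is the estimated standard error of its argument, $\tilde{\sigma}_e^2$ is the estimated residual variance in (8), and all summations $\sum$ are from 1 to $T$ unless otherwise noted. The last equality uses $\Delta w_t = e_t$, which holds under the null. Dickey and Fuller (1979) show that:

(13)  $t_{DF} \Rightarrow \dfrac{\int B_e\,dB_e}{\sqrt{\int B_e^2}}$

under the null hypothesis. Dickey [in Fuller (1976, p. 373)] tabulates by Monte Carlo the finite sample distribution for $t_{DF}$, from which critical values may be taken for constructing a unit-root test.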

The DF statistic has several important properties. First, its distribution is skewed to the left, and it has a negative median. In part because of these characteristics, the use of (negative) one-sided normal critical values may result in over-rejection under the null hypothesis. Second, the distribution of the DF statistic is invariant to $\sigma_u$, $\sigma_\varepsilon$, and $a$, even in finite samples; cf. (12).
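The first property can be illustrated with a small simulation of the DF statistic under the null, where $w_t$ is a pure random walk. This is a sketch only: the sample size, number of replications, seed, and function names are illustrative assumptions and do not reproduce the tabulation in Fuller (1976).

```python
import math
import random

def df_t_from_w(w):
    """DF t-ratio from the no-intercept regression of dw_t on w_{t-1}."""
    dw = [w[t] - w[t - 1] for t in range(1, len(w))]
    w_lag = w[:-1]
    s_ww = sum(x * x for x in w_lag)
    b = sum(x * d for x, d in zip(w_lag, dw)) / s_ww
    rss = sum((d - b * x) ** 2 for x, d in zip(w_lag, dw))
    return b / math.sqrt(rss / (len(dw) - 1) / s_ww)

def simulate_null_df(T=100, reps=4000, seed=0):
    """Sorted Monte Carlo draws of the DF statistic when w_t is a random walk."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        w = [0.0]
        for _ in range(T):
            w.append(w[-1] + rng.gauss(0.0, 1.0))
        stats.append(df_t_from_w(w))
    return sorted(stats)

stats = simulate_null_df()
median = stats[len(stats) // 2]
share_df = sum(t < -1.95 for t in stats) / len(stats)        # Dickey-Fuller 5% value
share_normal = sum(t < -1.645 for t in stats) / len(stats)   # one-sided normal 5% value
print(round(median, 2), round(share_df, 3), round(share_normal, 3))
```

The median comes out negative, the rejection share at −1.95 is close to the nominal 5%, and the share at the normal critical value −1.645 is noticeably higher, which is the over-rejection described above.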

Banerjee, Dolado, Hendry, and Smith (1986, Theorem 4) derive the asymptotic distribution of the t-ratio on $b$ in the ECM (3). Our Appendix corrects their formula and obtains a simpler normal approximation for $a \neq 1$. Since $\sum \Delta z_t w_{t-1}$ is $O_p(T)$ and $E(u_t \varepsilon_t) = 0$, the ECM t-ratio is:

(14)  $t_{ECM} = \hat{b}/\mathrm{ese}(\hat{b}) = \left(\sum w_{t-1}^2\right)^{-1/2}\left(\sum w_{t-1}\,\varepsilon_t / \hat{\sigma}_\varepsilon\right) + O_p(T^{-1/2}),$

where $\hat{\sigma}_\varepsilon^2$ is the estimated residual variance in (3), and Mann and Wald’s (1943) order notation is used. Ignoring the term of $O_p(T^{-1/2})$, (14) is identical to the DF statistic in (12), except that $\varepsilon_t$ appears rather than $e_t$. Using properties of independent Brownian motion, the limiting distribution of $t_{ECM}$ is:

(15)  $t_{ECM} \Rightarrow \dfrac{\int B_e\,dB_\varepsilon}{\sqrt{\int B_e^2}} = \dfrac{(a-1)\int B_u\,dB_\varepsilon + s^{-1}\int B_\varepsilon\,dB_\varepsilon}{\sqrt{(a-1)^2\int B_u^2 + 2(a-1)s^{-1}\int B_u B_\varepsilon + s^{-2}\int B_\varepsilon^2}},$

where $s$ is the ratio $\sigma_u/\sigma_\varepsilon$ (assumed strictly positive).

As will be discussed below, the distribution of $t_{ECM}$ depends on the relative importance of the two terms comprising $e_t$ in (7), which are $(a-1)\Delta z_t$ and $\varepsilon_t$. Specifically, it is useful to define a “signal-to-noise” ratio:

(16)  $q = -(a-1)s,$

where $q^2$ is the variance of $(a-1)\Delta z_t$ relative to that of $\varepsilon_t$. Equally, $q^2$ is $R^2/(1-R^2)$, where $R^2$ is the population $R^2$ with $b = 0$ for $\Delta w_t$ regressed on $w_{t-1}$ and $\Delta z_t$, as in (28) below.

The asymptotic distribution of the ECM statistic has several unusual properties. First, because $\Delta z_t$ is observed and is conditioned upon in estimating (3), $q$ measures the amount of information present on the invalidity of the common factor restriction (for a given $T$). Second, and relatedly, when $a = 1$ (and so $q = 0$), (15) simplifies to the DF distribution (13), noting that $e_t = \varepsilon_t$ (and hence $B_e = B_\varepsilon$) for $a = 1$. Third, for $a \neq 1$, (15) can be reparameterized in terms of $q$ exclusively, rather than $a$ and $s$ separately:

(17)  $t_{ECM} \Rightarrow -\,\dfrac{\int B_u\,dB_\varepsilon - q^{-1}\int B_\varepsilon\,dB_\varepsilon}{\sqrt{\int B_u^2 - 2q^{-1}\int B_u B_\varepsilon + q^{-2}\int B_\varepsilon^2}}.$

The asymptotic distribution of $t_{ECM}$ is sensitive to $a$ and $s$ only insofar as they enter $q$. Fourth, for large $q$, (17) is approximately a standardized normal distribution:

(18)  $t_{ECM} \Rightarrow N(0,1) + O_p(q^{-1}).$

This second approximation is “small-σ” in nature or, equivalently, assumes the signal-to-noise ratio for (3) to be large; cf. Kadane (1970, 1971).⁴ As $q$ varies from small to large, the asymptotic distribution of $t_{ECM}$ shifts from the DF distribution to the normal distribution. To obtain (18), note that (17) is:

(19)  $t_{ECM} \Rightarrow -\,\dfrac{\int B_u\,dB_\varepsilon}{\sqrt{\int B_u^2}} + O_p(q^{-1}).$

Since $B_u$ and $B_\varepsilon$ are independent Brownian motions, the ratio in (19) is normally distributed; see Phillips and Park (1988).

Thus, when the common factor restriction in (9) is invalid and $\Delta z_t$ contributes substantively to the determination of $\Delta y_t$, the t-ratio on the error-correction term in (3) is approximately normal, even when the error-correction coefficient is zero and so $y_t$ and $z_t$ are not cointegrated. That simplifies conducting inference with $t_{ECM}$ when $q$ is large.⁵ The distribution of $t_{DF}$ is independent of $a$, $\sigma_u$, and $\sigma_\varepsilon$ (and thus of $s$ and $q$), even in finite samples, so no parallel approximation exists for $t_{DF}$.

To summarize, in so far as distributions under the null are concerned, $t_{ECM}$ has a distinct advantage over $t_{DF}$ when $q$ is known to be large because of the former’s approximate normality under that condition. The next section considers distributions under the alternative hypothesis of cointegration, and so the issue of power.
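Because the null distribution of $t_{ECM}$ depends on $a$ and $s$ only through $q$, it can help to compute $q$ and the implied population $R^2$ for candidate parameter values. The following is a small sketch of the definition $q = -(a-1)s$ and the relation $q^2 = R^2/(1-R^2)$; the function names are hypothetical.

```python
def signal_to_noise(a, s):
    """q = -(a - 1) * s, with s = sigma_u / sigma_eps."""
    return -(a - 1.0) * s

def population_r2(q):
    """Invert q^2 = R^2 / (1 - R^2) to obtain R^2 = q^2 / (1 + q^2)."""
    return q * q / (1.0 + q * q)

for a, s in [(1.0, 1.0), (0.5, 6.0), (0.5, 16.0)]:
    q = signal_to_noise(a, s)
    print(a, s, q, round(population_r2(q), 3))
```

With $a = 0.5$, the values $s = 6$ and $s = 16$ give $q = 3$ and $q = 8$, corresponding to population $R^2$ values of 0.9 and roughly 0.985, while $a = 1$ gives $q = 0$ regardless of $s$.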

4 Distribution of the Statistics under the Alternative Hypothesis (Cointegration)

The alternative hypothesis is cointegration: namely, $b < 0$ in (1)-(2). This section examines the asymptotic distributions of the DF and ECM statistics under both fixed and local alternatives. A priori, the distributions derived under either alternative could approximate the underlying finite sample distributions well, so both alternatives are of interest. Under a fixed alternative, $w_{t-1}$ in (3) and (8) is stationary, so distributional results follow from conventional central limit theorems. Under a local alternative, the non-conventional asymptotic theory developed by Phillips (1988) for near-integrated series can be applied.

⁴ Complementary interpretations exist. From (1) and (2) with $b = 0$ and $a \neq 0$, $y_t$ and $z_t$ are virtually identical series for large $q$ (a constant term and factor of proportionality aside) because the variance of $a\Delta z_t$ is large relative to that of $\varepsilon_t$. Thus, $y_t$ and $z_t$ appear cointegrated, giving rise to “standard” inferential procedures for $b$. This reasoning does not apply to the DF statistic because it is invariant to the variance of $e_t$.

⁵ If no information is available on the magnitude of $q$, then it appears advisable to use the DF critical values for the ECM statistic because they are larger in absolute value than the critical values for the normal. This choice follows from the definition of statistical size involving the supremum over the appropriate parameter space, here, being over the range of $a$ and $s$.

Section 4.1 compares the asymptotic distributions of the DF and ECM statistics under a fixed alternative; Section 4.2 compares them under a local alternative. When $a = 1$, the two statistics are asymptotically equivalent. When $a \neq 1$, the ECM test can be arbitrarily more powerful than the DF test.

4.1 Distributions under a Fixed Alternative

Under a fixed alternative, this subsection analyzes the components of the DF and ECM statistics, from which the properties of the statistics themselves can be compared.

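For the DF statistic, the numerator is:

(20)  $\tilde{b} = \left(\sum w_{t-1}^2\right)^{-1}\left(\sum w_{t-1}\,\Delta w_t\right) = b + \left(\sum w_{t-1}^2\right)^{-1}\left(\sum w_{t-1}\,e_t\right),$

from which it follows that:

(21)  $T^{1/2}(\tilde{b} - b) \Rightarrow N(0, \sigma_e^2/\sigma_w^2),$

where $\sigma_w^2 = \sigma_e^2/[1 - (1+b)^2]$. The denominator of the DF statistic is:

(22)  $\mathrm{ese}(\tilde{b}) = T^{-1/2}\,\sigma_e/\sigma_w + O_p(T^{-1}).$

For the ECM statistic, the numerator is:

(23)  $\hat{b} = b + \left(\sum w_{t-1}^2\right)^{-1}\left(\sum w_{t-1}\,\varepsilon_t\right) + O_p(T^{-1}),$

which implies:

(24)  $T^{1/2}(\hat{b} - b) \Rightarrow N(0, \sigma_\varepsilon^2/\sigma_w^2).$

The denominator of the ECM statistic is:

(25)  $\mathrm{ese}(\hat{b}) = T^{-1/2}\,\sigma_\varepsilon/\sigma_w + O_p(T^{-1}).$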

Combining these results obtains a relationship between the two statistics:

(26)  $\dfrac{t_{ECM}}{t_{DF}} = \dfrac{\hat{b}/\mathrm{ese}(\hat{b})}{\tilde{b}/\mathrm{ese}(\tilde{b})} = \sigma_e/\sigma_\varepsilon + O_p(T^{-1/2}).$

That is, the ECM statistic is approximately $\sigma_e/\sigma_\varepsilon$ times the DF statistic. That factor of proportionality is at least unity, and in general is greater than unity, noting that:

(27)  $\sigma_e/\sigma_\varepsilon = \left\{[(a-1)^2\sigma_u^2 + \sigma_\varepsilon^2]/\sigma_\varepsilon^2\right\}^{1/2} = (1 + q^2)^{1/2} \geq 1$

from (7). The degree of inequality depends upon $q$. Relative power is likewise affected, as illustrated in Section 6 via Monte Carlo.
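The proportionality in (26)-(27) can be checked on one long simulated sample under a fixed alternative. The sketch below computes both t-ratios and compares their ratio with $(1+q^2)^{1/2}$. The parameter values, sample size, seed, and the function name `t_stats` are assumptions chosen for illustration.

```python
import math
import random

def t_stats(y, z):
    """Return (t_ECM, t_DF) from the no-intercept OLS regressions (3) and (8)."""
    n = len(y) - 1
    dy = [y[t] - y[t - 1] for t in range(1, n + 1)]
    dz = [z[t] - z[t - 1] for t in range(1, n + 1)]
    wl = [y[t - 1] - z[t - 1] for t in range(1, n + 1)]
    dw = [p - r for p, r in zip(dy, dz)]
    s11 = sum(d * d for d in dz)
    s22 = sum(w * w for w in wl)
    s12 = sum(d * w for d, w in zip(dz, wl))
    s1y = sum(d * v for d, v in zip(dz, dy))
    s2y = sum(w * v for w, v in zip(wl, dy))
    det = s11 * s22 - s12 * s12
    a_hat = (s22 * s1y - s12 * s2y) / det
    b_hat = (s11 * s2y - s12 * s1y) / det
    rss_ecm = sum((v - a_hat * d - b_hat * w) ** 2 for v, d, w in zip(dy, dz, wl))
    t_ecm = b_hat / math.sqrt(rss_ecm / (n - 2) * s11 / det)
    b_til = sum(w * v for w, v in zip(wl, dw)) / s22
    rss_df = sum((v - b_til * w) ** 2 for v, w in zip(dw, wl))
    t_df = b_til / math.sqrt(rss_df / (n - 1) / s22)
    return t_ecm, t_df

rng = random.Random(0)
a, b, sigma_u, sigma_eps, T = 0.5, -0.5, 2.0, 1.0, 20000   # hypothetical values
y, z = [0.0], [0.0]
for _ in range(T):
    u, eps = rng.gauss(0.0, sigma_u), rng.gauss(0.0, sigma_eps)
    y.append(y[-1] + a * u + b * (y[-1] - z[-1]) + eps)
    z.append(z[-1] + u)

t_ecm, t_df = t_stats(y, z)
q = -(a - 1.0) * sigma_u / sigma_eps          # here q = 1
ratio, predicted = t_ecm / t_df, math.sqrt(1.0 + q * q)
print(round(ratio, 3), round(predicted, 3))
```

With $q = 1$ the predicted ratio is $\sqrt{2} \approx 1.414$, and the simulated ratio should lie close to it for a sample this long.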

Intuition for the differences between the statistics is as follows. The ECM regression conditions on both $\Delta z_t$ and $w_{t-1}$, whereas the DF regression conditions on only $w_{t-1}$, thereby losing potentially valuable information from $\Delta z_t$. Rewriting (5) helps clarify:

(28)  $\Delta w_t = (a - 1)\,\Delta z_t + b\,w_{t-1} + \varepsilon_t,$

where, as an extreme example, $\varepsilon_t \approx 0$, $a \neq 1$, and Var($\Delta z_t$) is “substantial” (and so $q$ is large). The ECM (28) has a near perfect fit, $a$ and $b$ are estimated with near exact precision, and the ECM t-ratio for $b$ is (arbitrarily) large. However, the DF statistic is invariant to the variance of $e_t$ (and so to the values of $a$ and $s$), and the distribution of the DF statistic depends upon only $b$ and $T$. For a suitably small (but nonzero) value of $b$ and a given $T$, the DF statistic has little power (e.g., approximating its size) while the ECM statistic has power close to unity. This arises because the DF statistic ignores valuable information about $\Delta z_t$ that is present in $e_t$. Nevertheless, both statistics are $O_p(T^{1/2})$ under a fixed alternative, so motivating a local alternative to obtain distributions of $O_p(1)$.

4.2 Distributions under a Local Alternative

To formalize the previous intuition, we apply Phillips’s (1988) noncentral distribution theory to analyze the local asymptotic properties of the test statistics. The DGP is (1)-(2) with the local alternative:

(29)  $b = e^{c/T} - 1 \approx c/T,$

where $c$ is a negative fixed scalar. The local alternative (29) parallels the usual Pitman-type local alternative, except that, in order to obtain statistics of $O_p(1)$, (29) differs from the null by $O_p(T^{-1})$, rather than by $O_p(T^{-1/2})$.

To proceed, we follow Phillips (1987b) and use the diffusion process:

(30)  $K_\eta(r) = \int_0^r e^{(r-j)c}\,dB_\eta(j) = B_\eta(r) + c\int_0^r e^{(r-j)c}\,B_\eta(j)\,dj,$

where $K_\eta(r)$ is an implicit function of $c$. If $c = 0$, then $K_\eta(r)$ is $B_\eta(r)$. As with $B_\eta$, the argument $r$ and the limits of integration are dropped if no ambiguity arises from doing so.
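A discrete-time approximation helps to visualize the diffusion in (30). The sketch below uses the near-integrated autoregression $x_t = e^{c/T}x_{t-1} + \eta_t$ with unit-variance $\eta_t$, whose scaled path $T^{-1/2}x_{[Tr]}$ approximates $K_\eta(r)$. The function names, grid size, number of replications, and the value $c = -5$ are illustrative assumptions.

```python
import math
import random

def simulate_K(c, T, etas):
    """Approximate K(r) on the grid r = t/T via x_t = exp(c/T) x_{t-1} + eta_t."""
    rho = math.exp(c / T)
    x, path = 0.0, []
    for eta in etas:
        x = rho * x + eta
        path.append(x / math.sqrt(T))     # scaling assumes unit-variance eta_t
    return path

def integral_K2(path):
    """Riemann-sum approximation to the integral of K(r)^2 over [0, 1]."""
    return sum(k * k for k in path) / len(path)

rng = random.Random(0)
T, reps = 200, 300
mean_null = mean_local = 0.0
for _ in range(reps):
    etas = [rng.gauss(0.0, 1.0) for _ in range(T)]
    mean_null += integral_K2(simulate_K(0.0, T, etas)) / reps    # c = 0: Brownian motion
    mean_local += integral_K2(simulate_K(-5.0, T, etas)) / reps  # c < 0: mean reversion
print(round(mean_null, 2), round(mean_local, 2))
```

For $c = 0$ the average of $\int K^2$ is near $1/2$, the value for Brownian motion; a negative $c$ pulls the path toward zero and shrinks that average.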

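Under the local alternative (29), the DF statistic is distributed as:

(31)  $t_{DF} \Rightarrow c\left(\int K_e^2\right)^{1/2} + \dfrac{\int K_e\,dB_e}{\sqrt{\int K_e^2}};$

see Phillips (1987b, p. 541; 1988, (26)). As shown in the Appendix, the ECM statistic is distributed as:

(32)  $t_{ECM} \Rightarrow c\,(1+q^2)^{1/2}\left(\int K_e^2\right)^{1/2} + \dfrac{(a-1)\int K_u\,dB_\varepsilon + s^{-1}\int K_\varepsilon\,dB_\varepsilon}{\sqrt{(a-1)^2\int K_u^2 + 2(a-1)s^{-1}\int K_u K_\varepsilon + s^{-2}\int K_\varepsilon^2}}.$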

Properties of the asymptotic distributions in (31) and (32) are closely related to results under the null hypothesis. First, when $c = 0$, (32) simplifies to the distribution under the null, (17). Likewise, the asymptotic distribution (31) for the DF statistic reduces to (13) under the null. Second, when $a = 1$, (32) simplifies to the DF distribution (31). Third, for $a \neq 1$, (32) can be reparameterized in terms of $c$ and $q$ exclusively:

(33)  $t_{ECM} \Rightarrow c\,(1+q^2)^{1/2}\left(\int K_e^2\right)^{1/2} - \dfrac{\int K_u\,dB_\varepsilon - q^{-1}\int K_\varepsilon\,dB_\varepsilon}{\sqrt{\int K_u^2 - 2q^{-1}\int K_u K_\varepsilon + q^{-2}\int K_\varepsilon^2}}.$

Fourth, for large $q$, (33) is approximately normal with unit variance:

(34)  $t_{ECM} \Rightarrow N\!\left(c\,(1+q^2)^{1/2}\left(\int K_u^2\right)^{1/2},\ 1\right) + O_p(q^{-1}),$

conditional on the process for $u_t$. Fifth, the unconditional mean of $t_{ECM}$ can be approximated as:

(35)  $E(t_{ECM}) \approx \psi/\sqrt{2},$ where $\psi = c\,(1+q^2)^{1/2}$.

The powers of the DF and ECM statistics can be summarized as follows. For a given pair of values for $c$ and $T$, the DF statistic has an associated asymptotic power, derivable from (31) and its critical value. For the same $(c, T)$ pair and some comparable critical value, $q$ can be arbitrarily large, in which case the ECM statistic is conditionally approximately normally distributed with unit variance. Further, its unconditional mean is negative and arbitrarily large in absolute value, so its power can be arbitrarily close to unity. Thus, the ECM test has greater power than the DF test when $q$ is sufficiently large, and the two tests have the same power when $q = 0$.
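The approximation to the unconditional mean, $\psi/\sqrt{2}$ with $\psi = c\,(1+q^2)^{1/2}$, is simple arithmetic once $c$ and $q$ are fixed. The sketch below evaluates it for $c = Tb$, using the approximation $b \approx c/T$ from the local alternative. The function name and the particular values $T = 20$ and $b = -0.05$ are chosen here for illustration.

```python
import math

def approx_mean_t_ecm(c, q):
    """psi / sqrt(2), where psi = c * (1 + q^2)^(1/2)."""
    psi = c * math.sqrt(1.0 + q * q)
    return psi / math.sqrt(2.0)

T, b = 20, -0.05
c = T * b                      # uses b ~ c/T, so c = -1
for q in (0, 3, 8):
    print(q, round(approx_mean_t_ecm(c, q), 2))
```

For $q = 0$, 3, and 8 this gives approximately −0.71, −2.24, and −5.70, so the approximate mean moves well below conventional critical values as $q$ grows, even though $c$ is unchanged.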

5 Generalizations

The common factor “problem” of the DF statistic remains when (1) includes additional variables, additional lags of variables, a constant term, seasonal dummies, and/or a more complicated cointegrating vector. Furthermore, augmented versions of the DF statistic [such as Dickey and Fuller’s (1981) ADF statistic] and non-parametric corrections [such as in Phillips (1987a) and Phillips and Perron (1988)] do not resolve this problem. This section examines the common factor problem for a more general structure. It then shows how common factors can appear in systems procedures, as illustrated by Stock and Watson’s (1988) test for common trends and avoided by Johansen’s (1988) procedure.

Consider three generalizations of (1): lagged as well as current values of $\Delta y_t$ and $\Delta z_t$ may appear, $z_t$ is a vector rather than a scalar, and the cointegrating vector is $(1, -\lambda')$, being normalized on $y$ but being otherwise unrestricted. Letting $d(L)$ and $a(L)$ be suitable scalar and vector polynomials in the lag operator $L$, (1) becomes:

(36)  $d(L)\,\Delta y_t = a(L)'\,\Delta z_t + b\,(y - \lambda' z)_{t-1} + \varepsilon_t.$

Subtracting $d(L)\lambda'\Delta z_t$ from both sides (rather than $\Delta z_t$ as in Section 2) obtains:

(37)  $d(L)\,\Delta (y - \lambda' z)_t = b\,(y - \lambda' z)_{t-1} + \{[a(L)' - d(L)\lambda']\,\Delta z_t + \varepsilon_t\}$

or

(38)  $d(L)\,\Delta w_t = b\,w_{t-1} + e_t,$

where

(39)  $w_t = y_t - \lambda' z_t$

and

(40)  $e_t = [a(L)' - d(L)\lambda']\,\Delta z_t + \varepsilon_t.$

Equations (38), (39), and (40) generalize (6), (4), and (7). When $e_t$ is not white noise, (38) is not a regression equation, and below we comment on that case.

The ADF statistic is based upon (38), and so imposes the common factor restriction:

(41)  $a(L) = d(L)\,\lambda.$

If invalid, that restriction implies a loss of information (and so a loss of power) for the ADF test relative to the ECM test from (36). The caveat about common factors applies to other single-equation unit-root-type cointegration tests constructed from a static relationship between $y_t$ and $z_t$, including Phillips’s (1987a) $Z_\alpha$ and $Z_t$ statistics, Phillips and Perron’s (1988) generalizations thereon, and Sargan and Bhargava’s (1983) statistic. The problem is not with the unit root tests per se: they may be quite useful for determining an individual series’s order of integration. Rather, the difficulty arises from testing for cointegration via testing for a unit root (or the lack thereof) in the purported disequilibrium measure $y_t - \lambda' z_t$.

The ADF tests applied to (38) may encounter an additional difficulty. Whereas $e_t$ is white noise in the simple example (6), it need not be in (38); cf. (7) and (40). If not, then, in order to generate white noise errors, the ADF regression would need a lag length longer than that required in the ECM. Conversely, choosing too short a lag length for the ADF statistic can create misleading inferences; cf. Kremers (1988).

System analysis of cointegration faces similar problems. In a system notation following Johansen (1988), let $x_t$ denote the entire vector of I(1) variables under study, of dimension $p \times 1$. One interesting and commonly used representation for $x_t$ is the Gaussian, finite-order vector autoregressive process:

(42)  $\pi(L)\,x_t = \nu_t, \qquad \nu_t \sim IN(0, \Omega)$

or

(43)  $\Delta x_t = \pi\,x_{t-1} + \Gamma(L)\,\Delta x_{t-1} + \nu_t,$

where $\pi(L)$ is the $\ell$th order, $p \times p$ matrix polynomial $\sum_{i=0}^{\ell} \pi_i L^i$, $\Gamma(L)$ is a related $p \times p$ matrix polynomial, and $\pi = -\pi(1)$. But for the normalization $\pi_0 = I_p$, $\pi(L)$ is unrestricted; so $\pi$ and $\Gamma(L)$ are also unrestricted. Cointegration of variables in $x_t$ implies that $\pi$ is of reduced rank ($r$, say), so $\pi$ can be factorized as:

(44)  $\pi = \alpha\beta',$

where $\alpha$ and $\beta$ are full-rank $p \times r$ matrices. The rows of $\beta'$ are cointegrating vectors, and the coefficients in $\alpha$ are the weights on the cointegrating vectors in each equation. Some “systems” procedures focus on the roots of $\beta' x_t$ rather than on the properties of $x_t$ itself. Such procedures impose “system common factors”, as can be seen by premultiplying (43) by $\beta'$:

(45)  $\beta'\Delta x_t = (\beta'\alpha)\,\beta' x_{t-1} + \beta'\Gamma(L)\,\Delta x_{t-1} + \beta'\nu_t$

or

(46)  $(I_r - G(L)L)\,\Delta w_t = (\beta'\alpha)\,w_{t-1} + \xi_t,$

where $w_t$ is now the vector $\beta' x_t$, $G(L)$ is an $r \times r$ matrix polynomial in $L$, and $\xi_t$ is:

(47)  $\xi_t = [\beta'\Gamma(L) - G(L)\beta']\,\Delta x_{t-1} + \beta'\nu_t.$

Equations (46)-(47) parallel (38) and (40) for a single equation.

The disturbance $\xi_t$ may contain valuable, predictable information for two reasons. First, unless the restriction $G(L)\beta' = \beta'\Gamma(L)$ holds, lags of $\Delta x_t$ enter $\xi_t$. Second, if $z_t$ is weakly exogenous, then $\beta' x_t$ may be explained in part by current $z_t$ [as in (1)]. Both reasons imply a loss of information from analyzing $w_t$ rather than $x_t$ when testing for cointegration.

As an example, Stock and Watson’s (1988) test for common trends imposes common factors, except when the maintained hypothesis is $p$ common trends (i.e., no cointegration). Stock and Watson’s statistic is derived from a vector autoregression in the hypothesized common trends $\beta_\perp' x_t$ [their equation (3.1)], which is an autoregression “complementing” (46). Unless $\beta_\perp$ is square, their autoregression omits lags in $\beta' x_t$, and so ignores potentially valuable information.

Johansen (1988, 1991) and Johansen and Juselius (1990a) derive a likelihood-based method for testing the rank of $\pi$ and, conditional upon a given rank, conducting inference about $\alpha$ and $\beta$. Because (43) is the basis for inference, this method avoids common factor problems. All short-run dynamics in $\Gamma(L)$ are unrestricted, and so are “structural” rather than “error” dynamics: the Johansen procedure parallels the ECM procedure, but with the system complete. Conversely, the ECM procedure is a special case of Johansen’s for a system in which the cointegrating vectors appear in only the equation of interest. Under that condition, it is valid to analyze only the equation of interest, as a conditional equation; cf. Dolado, Ericsson, and Kremers (1989) and Johansen (1992a).

6 Finite Sample Evidence

To analyze the size and power of the DF and ECM tests, a set of Monte Carlo experiments were conducted with (1) and (2) as the DGP. Without loss of generality, $\sigma_\varepsilon^2 = 1$. That leaves the parameters $(s, a, b)$ and the sample size $T$ as experimental design variables, noting that $s$ now is $\sigma_u$. This Monte Carlo study is solely meant to illustrate the common factor issue, so we chose a full factorial design of:

(48)  $(a, s) = [(1.0, 1),\ (0.5, 6),\ (0.5, 16)]$
      $b = (0.0$ [no cointegration], $-0.05$ [cointegration]$)$
      $T = 20,$

resulting in six experiments. The number of replications per experiment was $N = 10{,}000$, the first twenty observations of each replication were discarded in order to attenuate the effect of initial values, and new $z$’s were generated for each replication.

The parameter values were chosen with the following in mind. For $a = 1.0$ (and so $q = 0$), only $s = 1$ is considered, since the analytical results in Sections 3 and 4 imply exact or asymptotic invariance of the statistics to $s$ when the common factor restriction is valid. For $a = 0.5$, the values $s = 6$ and $s = 16$ imply $q = 3$ and $q = 8$ respectively, with the latter very “strongly” violating the common factor restriction. The two values of $b$, 0.0 and −0.05, imply lack of and existence of cointegration respectively, although, in the latter case, the stationary root of the system is still large: 0.95. Finally, the sample size is small by most econometric standards, and implies a low power of the DF statistic for the nonzero value of $b$.

Table 1 lists rejection frequencies of the DF and ECM statistics under the hypotheses of no cointegration and cointegration. These rejection frequencies correspond to size and power, provided the correct critical values are used. Panels A and B of the table report rejection frequencies for one-sided tests at two nominal sizes, 5% and 1%. For each, three critical values are examined: those from Dickey in Fuller (1976, Table 8.5.2, p. 373) for $T = 25$, those of the normal distribution, and (for power) those estimated from our Monte Carlo with $b = 0$. The values of $b$ and $q$ appear at the top of the table: they define the experiments, and $q$ in particular is important for the ECM statistic.

In Panel A (5% critical values) under “no cointegration”, rejection frequencies for tpr are virtually unchanged as q varies, in line with the invariance result. With the Dickey-Fuller critica] value, the rejection frequency for tecyy matches that of tor for q = 0, and shrinks to well below the nominal rejection frequency for large q (e.g., 3.5%

Table 1. Rejection Frequencies and Estimated Means of the Statistics

                                  no cointegration: b = 0.0         cointegration: b = -0.05
Critical Value and Statistic      q = 0     q = 3     q = 8         q = 0     q = 3     q = 8

A. Rejection Frequency at the 5% critical value (in per cent)

Dickey-Fuller (-1.95)
  DF                               0.4       5.6       0.4           9.6       9.9      10.1
  ECM                              0.4       4.1       3.5          10.3      50.2      91.6
Gaussian (-1.645)
  DF                               9.4       9.5       9.7          17.3      17.3      17.4
  ECM                              9.5       7.2       6.4          18.1      60.6      94.3
Estimated¹
  DF                             [-2.01]   [-2.03]   [-2.02]         8.2       8.6       8.8
  ECM                            [-2.02]   [-1.88]   [-1.80]         8.9      52.4      92.9

B. Rejection Frequency at the 1% critical value (in per cent)

Dickey-Fuller (-2.66)
  DF                               1.1       1.3       1.2           2.1       2.3       2.3
  ECM                              1.3       1.2       0.9           2.1      30.2      82.8
Gaussian (-2.326)
  DF                               2.5       2.7       2.4           4.5       4.5       4.6
  ECM                              2.6       2.1       1.7           4.7      39.2      87.3
Estimated¹
  DF                             [-2.76]   [-2.80]   [-2.77]         1.6       1.7       1.7
  ECM                            [-2.80]   [-2.76]   [-2.62]         1.6      27.9      83.4

C. Estimated Means of the Statistics²

  mean($t_{DF}$)                  -0.34     -0.38     -0.37         -0.95     -0.96     -0.95
  mean($t_{ECM}$)                                                   -0.93     -2.09     -5.08
  $\gamma/\sqrt{2}$                0.0       0.0       0.0          -0.71     -2.24     -5.70

¹ Under the null of no cointegration, Monte Carlo estimates of the critical values are reported, in square brackets. Under the alternative, rejection frequencies are reported. The estimated critical values used for the DF statistic are the averages of those obtained under the null: -2.02 for 5% and -2.78 for 1%. The estimated critical values used for the ECM statistic are those obtained under the null, and they vary with q.

² Monte Carlo standard errors on the estimated means are approximately 0.01.

for q = 8). With the Gaussian critical value, the rejection frequency for $t_{ECM}$ is 9.5% for q = 0, approximately double the nominal value, and tends toward the nominal value for large q. Such over-rejection limits the use of Gaussian critical values in practice.

In Panel A under “cointegration”, the power of the DF statistic is approximately 10%, whether with Dickey-Fuller or estimated critical values. As expected, its power is insensitive to q and to the choice of critical value. The power of the ECM statistic for q = 0 is virtually identical to that of the DF statistic. However, as q increases, so does the power of the ECM statistic. At q = 8, its power is over 90%. The common factor restriction is disastrous for the Dickey-Fuller procedure in such instances. Conversely, the ECM procedure can gain markedly in power because it allows more flexible dynamics than the DF procedure. Panel B reports similar results at the 1% critical value.

Panel C lists the estimated means of $t_{DF}$ and $t_{ECM}$ across experiments, and the approximate asymptotic mean of $t_{ECM}$, which is $\gamma/\sqrt{2}$. The estimated mean of the DF statistic appears invariant to q, as implied by Sections 3 and 4. Its estimated mean is more negative with cointegration than without cointegration, reflecting inter alia the negative noncentrality $c(\int K_e^2)^{1/2}$ in (31). The estimated mean of $t_{ECM}$ is not invariant to q. Under the null of no cointegration, it tends to zero as q increases. With cointegration, the estimated mean of $t_{ECM}$ is approximately $\gamma/\sqrt{2}$, and becomes large and negative as q increases. In these experiments, q = 3 and q = 8 appear quite “large” for the mean of $t_{ECM}$, but not for tail properties. That suggests using the Dickey-Fuller or related critical values for $t_{ECM}$ rather than Gaussian critical values, in order to control size.
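The $\gamma/\sqrt{2}$ row of Panel C can be reproduced by simple arithmetic. The sketch below uses $\gamma = c(1+q^2)^{1/2}$ from (A21) and assumes T = 20, so that c is approximately Tb = -1 for b = -0.05. The sample size is an assumption made here, and the function name is illustrative.

```python
import math

def approx_mean_t_ecm(q, b=-0.05, T=20):
    """Approximate asymptotic mean gamma/sqrt(2), with gamma = c*sqrt(1 + q^2).

    Assumption: c is taken as T*b, with T = 20 chosen for illustration.
    """
    c = T * b
    gamma = c * math.sqrt(1.0 + q * q)
    return gamma / math.sqrt(2.0)

for q in (0, 3, 8):
    print(q, round(approx_mean_t_ecm(q), 2))
```

Under that assumption the three values round to -0.71, -2.24 and -5.70, the figures reported for the cointegration experiments.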

7 Empirical Evidence

This section tests for cointegration in Hendry and Ericsson’s (1991b) quarterly data on U.K. money demand to show how the DF and ECM statistics can differ empirically. The data are nominal M1 (M), 1985 price total final expenditure (Y), the corresponding deflator (P), the three-month local authority interest rate (R3), and the (learning-adjusted) retail sight deposit interest rate (Rra). Below, lower case denotes logarithms. Hendry and Ericsson (1991b) describe the data in their appendix. Johansen (1992b) finds that m and p appear I(2), and are cointegrated as m - p, which is I(1). Thus, to avoid possible inferential complexities with I(2) variables, we consider whether or not m - p, y, Δp, R3, and Rra are cointegrated. The static regression of these variables obtains:

(49) $$(m-p)_t = -0.07\,y_t + 0.94\,\Delta p_t - 2.1\,R3_t + 6.9\,Rra_t + 11.8$$

T = 100 [1964(3)-1989(2)], $\hat{\sigma} = 9.646\%$, dw = 0.18.

While direct statistical inference on the estimated coefficients in (49) is difficult, note that the income elasticity is negative, not positive; and the inflation elasticity is positive, not negative. Neither property is “economically sensible”. Additionally, the two interest rate semi-elasticities are numerically quite different in absolute magnitude, so an interest rate differential does not seem plausible as a measure of the opportunity cost.

The augmented Dickey-Fuller regression ADF(4) for the residuals $w_t$ from (49) is:

(50) $$\Delta w_t = \underset{(0.053)}{-0.182}\,w_{t-1} + \sum_{i=1}^{4}\phi_i\,\Delta w_{t-i}$$

T = 95 [1965(4)-1989(2)], $\hat{\sigma} = 3.690\%$, $t_{ADF} = -3.41$.

Here and in equations below, $\phi_i$ denotes a generic coefficient, and standard errors are in parentheses. MacKinnon’s (1991) 10% critical value for the DF statistic is -4.25 for T = 95, so the variables do not appear cointegrated by this measure. Even so, the coefficient on $w_{t-1}$ is negative and large numerically, implying a root of approximately 0.8.
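The ADF regression in (50) can be sketched as follows. This is an illustration with plain least squares, not the authors' software. `adf_t_stat` is a hypothetical helper that omits a constant, on the assumption that the residuals come from a static regression which already includes one. The resulting statistic has to be compared with cointegration critical values such as MacKinnon's, not with Student-t tables.

```python
import numpy as np

def adf_t_stat(w, lags=4):
    """t-ratio on w_{t-1} in dw_t = b*w_{t-1} + sum_i phi_i*dw_{t-i} + error."""
    w = np.asarray(w, dtype=float)
    dw = np.diff(w)                        # dw[k] = w[k+1] - w[k]
    n = len(dw)
    y = dw[lags:]                          # dependent variable
    cols = [w[lags:-1]]                    # lagged level aligned with y
    for i in range(1, lags + 1):
        cols.append(dw[lags - i:n - i])    # i-th lagged difference
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[0] / np.sqrt(cov[0, 0])

# Illustrative data only: white noise is strongly stationary,
# so the statistic should be far out in the lower tail.
rng = np.random.default_rng(0)
noise = rng.standard_normal(300)
print(adf_t_stat(noise, lags=4))
```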

In the error-correction framework, the long-run relationship between the variables may be obtained by estimating an autoregressive distributed lag in the variables and solving numerically for that long-run solution. Estimating the fifth-order autoregressive distributed lag for m - p, y, Δp, R3, and Rra obtains this long-run solution:

(51) $$(m-p)_t = \cdots + \underset{(0.7)}{7.2}\,Rra_t - \underset{(2.9)}{0.8}$$

T = 100 [1964(3)-1989(2)]. The long-run income elasticity is near unity, and inflation has a strong negative long-run effect. Further, the interest-rate coefficients are nearly equal in magnitude, opposite in sign, so in the long run, interest rates appear to matter only through the net interest rate (R3 - Rra, denoted Rr).

Re-estimating the autoregressive distributed lag as an error-correction model obtains:

(52) $$\Delta(m-p)_t = \underset{(0.023)}{-0.149}\,w_{t-1} + \sum_{i=1}^{4}\phi_i\,\Delta(m-p)_{t-i} + \sum_{i=0}^{4}\phi_i'\,(\Delta y_{t-i},\ \Delta^2 p_{t-i},\ \Delta R3_{t-i},\ \Delta Rra_{t-i})'$$

T = 100 [1964(3)-1989(2)], $\hat{\sigma} = 1.320\%$, $t_{ECM} = -6.39$,

where the lagged residual from (51) is now $w_{t-1}$, the error-correction term. Even in this highly over-parameterized model, the ECM statistic exceeds MacKinnon’s (1991) DF 1% critical value of -5.18. The equation standard error in (52) is far smaller than that in (50), implying that the common factor restriction in (50) is invalid [COMFAC $\chi^2(20) = 64.6$].
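The mapping from an autoregressive distributed lag to its long-run solution and error-correction form, and the common factor restriction that the DF regression imposes, can be illustrated with a first-order case. The sketch below uses a simplified ADL(1,1), $y_t = a_1 y_{t-1} + b_0 x_t + b_1 x_{t-1} + \varepsilon_t$, not the fifth-order model estimated above. All function names and numbers are hypothetical.

```python
def long_run_multiplier(a1, b0, b1):
    """Static solution y = k*x of the ADL(1,1): k = (b0 + b1) / (1 - a1)."""
    return (b0 + b1) / (1.0 - a1)

def ecm_form(a1, b0, b1):
    """Return (impact, adjustment, long_run) such that
    dy_t = impact*dx_t + adjustment*(y_{t-1} - long_run*x_{t-1}) + eps_t."""
    return b0, a1 - 1.0, long_run_multiplier(a1, b0, b1)

def satisfies_common_factor(a1, b0, b1, tol=1e-12):
    """Common factor (COMFAC) restriction: b1 = -a1*b0, i.e. the model is
    a static regression with AR(1) errors, so impact = long-run effect."""
    return abs(b1 + a1 * b0) <= tol

# Illustrative coefficients: unit long-run effect, impact effect 0.5.
a1, b0, b1 = 0.9, 0.5, -0.4
impact, adjustment, long_run = ecm_form(a1, b0, b1)

# One-step check that the ADL and ECM parameterizations coincide.
y_prev, x_prev, x_now = 2.0, 1.5, 1.7
adl_value = a1 * y_prev + b0 * x_now + b1 * x_prev
ecm_value = y_prev + impact * (x_now - x_prev) + adjustment * (y_prev - long_run * x_prev)
print(abs(adl_value - ecm_value) < 1e-12, satisfies_common_factor(a1, b0, b1))
```

In this example the short-run and long-run effects differ, so the common factor restriction fails, which is the situation in which the text expects the DF test to lose power relative to the ECM test.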

The contrast between the DF and ECM statistics is robust to the choice of lag length and to whether or not long-run price homogeneity is imposed. Further, results from system analysis match the ECM results above. For a corresponding vector autoregression, Ericsson, Campos, and Tran (1991) test and strongly reject the null of no cointegration in favor of one cointegrating vector, using Johansen’s (1988, 1991) procedure. The system estimate of the first cointegrating vector is (1  -0.77  5.67  5.82  -7.72), close to that in (51), noting that signs on unnormalized coefficients reverse. The first column in the estimated weighting matrix $\hat{\alpha}$ is (-0.22  0.00  0.04  0.07  0.01)′, consistent with weak exogeneity of Δp, y, R3, and Rra in the money equation for the cointegrating vector. That exogeneity permits valid conditional inference in the money equation, such as with the autoregressive distributed lag above.

The ECM statistic in (52) contains an estimated cointegrating vector, so the appropriateness of MacKinnon’s tables for this $t_{ECM}$ is as yet a conjecture, albeit a natural one. As an alternative, consider Hendry and Ericsson’s (1991b) equation (6), a constant, parsimonious simplification of an autoregressive distributed lag in the money demand variables:

(53) $$\Delta(m-p)_t = \cdots - \underset{(0.009)}{0.093}\,(m-p-y)_{t-1} + \underset{(0.004)}{0.023}$$

T = 100 [1964(3)-1989(2)], $\hat{\sigma} = 1.313\%$, $t_{ECM} = -10.87$.

This equation imposes the long-run coefficients on prices and income, thus mirroring the analysis in Sections 2-4. While the error correction coefficient is somewhat smaller than before, the ECM statistic is even more highly significant than in (52). Prices and income have short-run elasticities of 0.31 and zero respectively, which contrast with their unit long-run elasticities and imply substantial violation of the common factor restriction in (50). Hendry and Ericsson (1991b, Section 4) further discuss the economic and statistical merits of (53).

8 Summary

Over the last several years, testing for cointegration has become an important facet of the empirical analysis of economic time series, and various tests have been proposed and widely applied. This paper illustrates how a statistic based upon the estimation of an ECM can be approximately normally distributed when no cointegration is present, even though the equivalent DF statistic has a non-normal asymptotic distribution. With cointegration, the ECM statistic can generate more powerful tests than those based upon the DF statistic applied to the residuals of a static cointegrating relationship. These differences arise because the DF statistic ignores potentially valuable information: specifically, it imposes a possibly invalid common factor restriction. Phrased somewhat differently, a loss of information can occur from assuming error dynamics rather than structural dynamics. Both empirical and Monte Carlo finite sample evidence support these analytical results.

Appendix: Asymptotic Distributions

This appendix is divided into Parts I, II, and III, which respectively derive distributions under the null hypothesis of no cointegration, distributions under a fixed alternative of cointegration, and distributions under a local alternative of cointegration. Subsections A and B within each part concern the distributions of the DF and ECM statistics respectively. Proofs for the distributions of the DF statistic already exist in the literature. However, because the proofs are similar for the ECM statistic, both statistics are examined below. In brief, the proofs proceed by rescaling summations to be $O_p(1)$, applying the functional limit results in Table A.1, and dropping terms of $o_p(1)$.

The notation for Brownian motion is used throughout; see Section 3. As a convenient reference for the building blocks of the proofs, Table A.1 lists correspondences between sample moments and limiting distributions. See Billingsley (1968, Chapters 2 and 4), White (1984), Phillips (1986, Appendix; 1987a; 1987b; 1988), Phillips and Durlauf (1986), Phillips and Park (1988), Banerjee, Dolado, Hendry, and Smith (1986, Appendix), and Banerjee, Dolado, Galbraith, and Hendry (1992) for derivation of the results in the table.

I Distributions under the Null Hypothesis (No Cointegration)

In Part I, the DGP is (1)-(2) under the null hypothesis that b = 0.

I.A The DF Statistic

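The DF statistic is:

(A1) $$\begin{aligned} t_{DF} &= \Big(\sum w_{t-1}^2\Big)^{-1/2}\Big(\sum w_{t-1}\,\Delta w_t/\hat{\sigma}_e\Big) \\ &= \Big(T^{-2}\sum w_{t-1}^2/\sigma_e^2\Big)^{-1/2}\Big(T^{-1}\sum w_{t-1}e_t/\sigma_e^2\Big) + O_p(T^{-1/2}) \\ &\Rightarrow \Big(\int B_e\,dB_e\Big)\Big/\sqrt{\int B_e^2}. \end{aligned}$$

This is the “Dickey-Fuller” distribution. See Dickey and Fuller (1979) and Phillips (1987a) for details. Different values of a, $\sigma_u$, and $\sigma_\varepsilon$ affect only the variance of $e_t$ ($\sigma_e^2$), and so only the scaling of $w_t$. From (A1), the (exact) distribution of $t_{DF}$ is invariant to the scaling of $w_t$, and so to the choice of a, $\sigma_u$, and $\sigma_\varepsilon$.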
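The exact invariance of the DF statistic to the scaling of w holds because multiplying w by a constant k multiplies the cross-product in the numerator by k squared, while the residual standard error and the square root of the sum of squares each pick up a factor of |k|, so the scale cancels. A small numerical check follows, with a hypothetical helper `df_t_stat` and simulated data chosen only for illustration.

```python
import numpy as np

def df_t_stat(w):
    """DF t-ratio from regressing dw_t on w_{t-1} without a constant."""
    w = np.asarray(w, dtype=float)
    w_lag, dw = w[:-1], np.diff(w)
    sww = w_lag @ w_lag
    b_hat = (w_lag @ dw) / sww
    resid = dw - b_hat * w_lag
    sigma_hat = np.sqrt(resid @ resid / (len(dw) - 1))
    return (w_lag @ dw) / (sigma_hat * np.sqrt(sww))

rng = np.random.default_rng(0)
w = np.cumsum(rng.standard_normal(50))   # illustrative random walk
print(np.isclose(df_t_stat(w), df_t_stat(8.0 * w)))
```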

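Table A.1. Asymptotic Distributions of Sample Moments Under the Null Hypothesis of No Cointegration

Sample Moment                            Brownian Motion Representation                              Alternative Representation

Basic Relationships

$T^{-2}\sum (y_t^*)^2$                   $\sigma_\varepsilon^2\int B_\varepsilon^2$                   -
$T^{-2}\sum z_t^2$                       $\sigma_u^2\int B_u^2$                                       -
$T^{-2}\sum z_t y_t^*$                   $\sigma_u\sigma_\varepsilon\int B_u B_\varepsilon$           -

$T^{-1}\sum y_{t-1}^*\varepsilon_t$      $\sigma_\varepsilon^2\int B_\varepsilon\,dB_\varepsilon$     $(\sigma_\varepsilon^2/2)[B_\varepsilon(1)^2 - 1]$
$T^{-1}\sum z_{t-1}u_t$                  $\sigma_u^2\int B_u\,dB_u$                                   $(\sigma_u^2/2)[B_u(1)^2 - 1]$
$T^{-1}\sum w_{t-1}e_t$                  $\sigma_e^2\int B_e\,dB_e$                                   $(\sigma_e^2/2)[B_e(1)^2 - 1]$

$T^{-1}\sum y_{t-1}^*u_t$                $\sigma_\varepsilon\sigma_u\int B_\varepsilon\,dB_u$         -
$T^{-1}\sum z_{t-1}\varepsilon_t$        $\sigma_u\sigma_\varepsilon\int B_u\,dB_\varepsilon$         -
$T^{-1/2}\sum \Delta z_t\varepsilon_t$   $\sigma_u\sigma_\varepsilon\int dB_u\,dB_\varepsilon$        $N(0, \sigma_u^2\sigma_\varepsilon^2)$

Implied Auxiliary Relationships

$T^{-1}\sum w_{t-1}\varepsilon_t$        $\sigma_e\sigma_\varepsilon\int B_e\,dB_\varepsilon$
                                         or $(a-1)\sigma_u\sigma_\varepsilon\int B_u\,dB_\varepsilon + \sigma_\varepsilon^2\int B_\varepsilon\,dB_\varepsilon$
$T^{-2}\sum w_t^2$                       $\sigma_e^2\int B_e^2$
                                         or $(a-1)^2\sigma_u^2\int B_u^2 + 2(a-1)\sigma_u\sigma_\varepsilon\int B_u B_\varepsilon + \sigma_\varepsilon^2\int B_\varepsilon^2$

Notes:

1. The variable $y_t^*$ is defined as: $y_t^* = \sum_{j=1}^{t}\varepsilon_j$.

2. Because $u_t$ and $\varepsilon_t$ are independent and $e_t = (a-1)u_t + \varepsilon_t$, it follows that $\sigma_e B_e = (a-1)\sigma_u B_u + \sigma_\varepsilon B_\varepsilon$ and $\sigma_e\,dB_e = (a-1)\sigma_u\,dB_u + \sigma_\varepsilon\,dB_\varepsilon$. Likewise, under the local alternative, $\sigma_e K_e = (a-1)\sigma_u K_u + \sigma_\varepsilon K_\varepsilon$ and $\sigma_e\,dK_e = (a-1)\sigma_u\,dK_u + \sigma_\varepsilon\,dK_\varepsilon$.

3. Under the local alternative, three of the formulae in the table change:

$T^{-1}\sum w_{t-1}e_t \Rightarrow \sigma_e^2\int K_e\,dB_e$, $T^{-1}\sum w_{t-1}\varepsilon_t \Rightarrow \sigma_e\sigma_\varepsilon\int K_e\,dB_\varepsilon$, and $T^{-2}\sum w_t^2 \Rightarrow \sigma_e^2\int K_e^2$,

with corresponding adjustments for their decompositions.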
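The “alternative representation” column rests on an exact finite-sample identity. For a random walk $y_t^* = \sum_{j \le t}\varepsilon_j$ with $y_0^* = 0$, squaring $y_t^* = y_{t-1}^* + \varepsilon_t$ and summing gives $\sum_{t=1}^{T} y_{t-1}^*\varepsilon_t = \tfrac{1}{2}\big(y_T^{*2} - \sum_{t=1}^{T}\varepsilon_t^2\big)$, which after division by T has the limit $(\sigma_\varepsilon^2/2)[B_\varepsilon(1)^2 - 1]$. The check below is illustrative only, and the function names are hypothetical.

```python
import numpy as np

def stochastic_integral_sum(eps):
    """sum_{t=1}^T y_{t-1}*eps_t for the random walk y_t = y_{t-1} + eps_t, y_0 = 0."""
    y = np.concatenate(([0.0], np.cumsum(eps)))   # y_0, y_1, ..., y_T
    return float(y[:-1] @ eps)

def alternative_form(eps):
    """(y_T^2 - sum eps_t^2) / 2, the finite-sample analogue of (sigma^2/2)[B(1)^2 - 1]."""
    y_T = eps.sum()
    return 0.5 * float(y_T ** 2 - eps @ eps)

rng = np.random.default_rng(0)
eps = rng.standard_normal(200)
print(np.isclose(stochastic_integral_sum(eps), alternative_form(eps)))
```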

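I.B The ECM Statistic

The OLS estimator $(\hat{a}\ \hat{b})'$ in (3) is:

(A2) $$\begin{bmatrix}\hat{a}\\ \hat{b}\end{bmatrix} = \begin{bmatrix}\sum(\Delta z_t)^2 & \sum \Delta z_t w_{t-1}\\ \sum w_{t-1}\Delta z_t & \sum w_{t-1}^2\end{bmatrix}^{-1}\begin{bmatrix}\sum \Delta z_t\,\Delta y_t\\ \sum w_{t-1}\,\Delta y_t\end{bmatrix}.$$

Substituting the definition of $\Delta y_t$ into (A2) and pre-multiplying by the matrix diag($T^{1/2}$, T) obtains:

(A3) $$\begin{bmatrix}T^{1/2}(\hat{a}-a)\\ T\hat{b}\end{bmatrix} = \begin{bmatrix}T^{-1}\sum(\Delta z_t)^2 & T^{-3/2}\sum \Delta z_t w_{t-1}\\ T^{-3/2}\sum w_{t-1}\Delta z_t & T^{-2}\sum w_{t-1}^2\end{bmatrix}^{-1}\begin{bmatrix}T^{-1/2}\sum \Delta z_t\,\varepsilon_t\\ T^{-1}\sum w_{t-1}\,\varepsilon_t\end{bmatrix} \Rightarrow \begin{bmatrix}(\sigma_\varepsilon/\sigma_u)\int dB_u\,dB_\varepsilon\\ \{(a-1)\sigma_u\sigma_\varepsilon\int B_u\,dB_\varepsilon + \sigma_\varepsilon^2\int B_\varepsilon\,dB_\varepsilon\}/\{\sigma_e^2\int B_e^2\}\end{bmatrix}.$$

The rates of convergence for $\hat{a}$ and $\hat{b}$ imply that:

(A4) $$\hat{\sigma}_\varepsilon^2 = \sum \hat{\varepsilon}_t^2/(T-2) = \sigma_\varepsilon^2 + O_p(T^{-1/2}).$$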

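By partitioned inversion of the matrices involved in calculating $t_{ECM}$, and applying the limit results in Table A.1, the ECM statistic is:

(A5) $$\begin{aligned} t_{ECM} &= \Big(\sum w_{t-1}^2 - \big[\sum w_{t-1}\Delta z_t\big]\big[\sum(\Delta z_t)^2\big]^{-1}\big[\sum \Delta z_t w_{t-1}\big]\Big)^{-1/2} \cdot \Big(\sum w_{t-1}\Delta y_t - \big[\sum w_{t-1}\Delta z_t\big]\big[\sum(\Delta z_t)^2\big]^{-1}\big[\sum \Delta z_t\,\Delta y_t\big]\Big)\Big/\hat{\sigma}_\varepsilon \\ &= \Big(T^{-2}\sum w_{t-1}^2 - T^{-1}\big[T^{-1}\sum \Delta z_t w_{t-1}\big]\big[T^{-1}\sum(\Delta z_t)^2\big]^{-1}\big[T^{-1}\sum \Delta z_t w_{t-1}\big]\Big)^{-1/2} \cdot \Big(T^{-1}\sum w_{t-1}\varepsilon_t - T^{-1/2}\big[T^{-1}\sum \Delta z_t w_{t-1}\big]\big[T^{-1}\sum(\Delta z_t)^2\big]^{-1}\big[T^{-1/2}\sum \varepsilon_t\,\Delta z_t\big]\Big)\Big/\hat{\sigma}_\varepsilon \\ &= \Big(T^{-2}\sum w_{t-1}^2\Big)^{-1/2}\Big(T^{-1}\sum w_{t-1}\varepsilon_t\Big)\Big/\sigma_\varepsilon + O_p(T^{-1/2}), \end{aligned}$$

where all summations after the second equality sign are scaled to be $O_p(1)$. From Table A.1, it follows that:

(A6) $$t_{ECM} \Rightarrow \frac{\int B_e\,dB_\varepsilon}{\sqrt{\int B_e^2}} = \frac{(a-1)\int B_u\,dB_\varepsilon + s^{-1}\int B_\varepsilon\,dB_\varepsilon}{\sqrt{(a-1)^2\int B_u^2 + 2(a-1)s^{-1}\int B_u B_\varepsilon + s^{-2}\int B_\varepsilon^2}},$$

noting the relation between $e_t$, $u_t$, and $\varepsilon_t$ (and so between $B_e$, $B_u$, and $B_\varepsilon$). When a = 1, (A6) simplifies to the Dickey-Fuller distribution. When a ≠ 1, (A6) can be reparameterized in terms of q rather than a and s:

(A7) $$t_{ECM} \Rightarrow \frac{\int B_u\,dB_\varepsilon - q^{-1}\int B_\varepsilon\,dB_\varepsilon}{\sqrt{\int B_u^2 - 2q^{-1}\int B_u B_\varepsilon + q^{-2}\int B_\varepsilon^2}}.$$

For large q, (A7) simplifies to:

(A8) $$t_{ECM} \Rightarrow \frac{\int B_u\,dB_\varepsilon}{\sqrt{\int B_u^2}} + O_p(q^{-1}),$$

where the leading term is standardized normal; see Phillips and Park (1988). Thus, $t_{ECM}$ is itself approximately distributed as a standardized normal variate:

(A9) $$t_{ECM} \Rightarrow N(0,1) + O_p(q^{-1}).$$

Equations (A3) and (A6) correct Banerjee, Dolado, Hendry, and Smith (1986, Theorem 4).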
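The normality of the leading term in (A8) can be seen by conditioning. The step is spelled out here as a sketch, not quoted from Phillips and Park (1988). It uses only the independence of $B_u$ and $B_\varepsilon$, which follows from the independence of $u_t$ and $\varepsilon_t$ noted in Table A.1:

```latex
\int_0^1 B_u\,dB_\varepsilon \;\Big|\; \{B_u(r)\}_{0\le r\le 1}
  \;\sim\; N\!\Big(0,\ \int_0^1 B_u^2\Big)
\quad\Longrightarrow\quad
\frac{\int_0^1 B_u\,dB_\varepsilon}{\sqrt{\int_0^1 B_u^2}} \;\Big|\; \{B_u(r)\}_{0\le r\le 1}
  \;\sim\; N(0,1).
```

Because the conditional distribution does not depend on the path of $B_u$, the ratio is also N(0,1) unconditionally.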

II Distributions under a Fixed Alternative Hypothesis (Cointegration)

The DGP is (1)-(2) with b fixed, and such that -1 < b < 0. Asymptotic distributions follow directly from standard proofs with stationary variables, so details are omitted.

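II.A The DF Statistic

For the DF statistic, the numerator is:

(A10) $$\hat{b} = \Big(\sum w_{t-1}^2\Big)^{-1}\Big(\sum w_{t-1}\,\Delta w_t\Big) = b + \Big(\sum w_{t-1}^2\Big)^{-1}\Big(\sum w_{t-1}e_t\Big).$$

From (A10), it follows that:

(A11) $$T^{1/2}(\hat{b} - b) \Rightarrow N(0,\ \sigma_e^2/\sigma_w^2),$$

where $\sigma_w^2 = \sigma_e^2/[1 - (1+b)^2]$. The denominator of the DF statistic is:

(A12) $$ese(\hat{b}) = T^{-1/2}\sigma_e/\sigma_w + O_p(T^{-1}).$$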

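II.B The ECM Statistic

The OLS estimator $(\hat{a}\ \hat{b})'$ is (A2), and $E(\Delta z_t w_{t-1}) = 0$, so:

(A13) $$T^{1/2}(\hat{b} - b) \Rightarrow N(0,\ \sigma_\varepsilon^2/\sigma_w^2).$$

The denominator of the ECM statistic is:

(A14) $$ese(\hat{b}) = T^{-1/2}\sigma_\varepsilon/\sigma_w + O_p(T^{-1}),$$

paralleling (A12) but with $\sigma_\varepsilon$ appearing in place of $\sigma_e$.

By substitution:

(A15) $$\begin{aligned} t_{ECM}/t_{DF} &= [\hat{b}_{ECM}/\hat{b}_{DF}]\big/[ese(\hat{b}_{ECM})/ese(\hat{b}_{DF})] \\ &= [1 + O_p(T^{-1/2})]\big/[\sigma_\varepsilon/\sigma_e + O_p(T^{-1/2})] \\ &= \sigma_e/\sigma_\varepsilon + O_p(T^{-1/2}), \end{aligned}$$

where the subscripts ECM and DF distinguish the estimates of b from (A2) and (A10) respectively.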
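Result (A15) can be made concrete for the Monte Carlo design. Since $e_t = (a-1)u_t + \varepsilon_t$ with $u_t$ and $\varepsilon_t$ independent, $\sigma_e^2 = (a-1)^2\sigma_u^2 + \sigma_\varepsilon^2$, so $\sigma_e/\sigma_\varepsilon = (1+q^2)^{1/2}$ when q is taken as $-(a-1)s$ with $s = \sigma_u/\sigma_\varepsilon$. Those definitions are inferred here from the parameter values reported for the experiments. The helper below is a hypothetical illustration of that arithmetic.

```python
import math

def asymptotic_t_ratio(a, s):
    """Large-sample ratio t_ECM / t_DF = sigma_e / sigma_eps = sqrt(1 + q^2).

    Assumes q = -(a - 1)*s with s = sigma_u / sigma_eps.
    """
    q = -(a - 1.0) * s
    return math.sqrt(1.0 + q * q)

# Parameter pairs (a, s) matching the three experimental designs.
for a, s in ((1.0, 1.0), (0.5, 6.0), (0.5, 16.0)):
    print(a, s, asymptotic_t_ratio(a, s))
```

The ratio equals one when the common factor restriction holds (a = 1) and grows roughly in proportion to q otherwise.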

III Distributions under a Local Alternative Hypothesis (Cointegration)

The DGP is (1)-(2) under the local alternative hypothesis that $b = e^{c/T} - 1$, following (e.g.) Phillips (1987b) and Johansen (1989).

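III.A The DF Statistic

The DF statistic is:

(A16) $$\begin{aligned} t_{DF} &= \Big(\sum w_{t-1}^2\Big)^{-1/2}\cdot\Big(\sum w_{t-1}\,\Delta w_t/\hat{\sigma}_e\Big) \\ &= c\,\Big(T^{-2}\sum w_{t-1}^2/\sigma_e^2\Big)^{1/2} + \Big(T^{-2}\sum w_{t-1}^2/\sigma_e^2\Big)^{-1/2}\Big(T^{-1}\sum w_{t-1}e_t/\sigma_e^2\Big) + O_p(T^{-1/2}) \\ &\Rightarrow c\,\Big(\int K_e^2\Big)^{1/2} + \Big(\int K_e\,dB_e\Big)\Big/\sqrt{\int K_e^2}. \end{aligned}$$

See Phillips (1987b) for details. As under the null hypothesis, the distribution of $t_{DF}$ is invariant to the choice of a, $\sigma_u$, and $\sigma_\varepsilon$.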

III.B The ECM Statistic

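The OLS estimator $(\hat{a}\ \hat{b})'$ is still (A2). From the first equality in (A3), the rates of convergence for $\hat{a}$ and $\hat{b}$ are the same under the local alternative as under the null hypothesis. Thus:

(A17) $$\hat{\sigma}_\varepsilon^2 = \sigma_\varepsilon^2 + O_p(T^{-1/2}).$$

Substituting (1) as a local alternative into the first equality of (A5) and applying the limit results from Table A.1, the ECM statistic is:

(A18) $$\begin{aligned} t_{ECM} &= Tb\,\Big(T^{-2}\sum w_{t-1}^2 - T^{-1}\big[T^{-1}\sum \Delta z_t w_{t-1}\big]\big[T^{-1}\sum(\Delta z_t)^2\big]^{-1}\big[T^{-1}\sum \Delta z_t w_{t-1}\big]\Big)^{1/2}\Big/\hat{\sigma}_\varepsilon \\ &\quad + \Big(T^{-2}\sum w_{t-1}^2 - T^{-1}\big[T^{-1}\sum \Delta z_t w_{t-1}\big]\big[T^{-1}\sum(\Delta z_t)^2\big]^{-1}\big[T^{-1}\sum \Delta z_t w_{t-1}\big]\Big)^{-1/2} \cdot \Big(T^{-1}\sum w_{t-1}\varepsilon_t - T^{-1/2}\big[T^{-1}\sum \Delta z_t w_{t-1}\big]\big[T^{-1}\sum(\Delta z_t)^2\big]^{-1}\big[T^{-1/2}\sum \varepsilon_t\,\Delta z_t\big]\Big)\Big/\hat{\sigma}_\varepsilon \\ &= c\,(\sigma_e/\sigma_\varepsilon)\Big(T^{-2}\sum w_{t-1}^2/\sigma_e^2\Big)^{1/2} + \Big(T^{-2}\sum w_{t-1}^2\Big)^{-1/2}\Big(T^{-1}\sum w_{t-1}\varepsilon_t/\sigma_\varepsilon\Big) + O_p(T^{-1/2}). \end{aligned}$$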

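It follows that:

(A19) $$\begin{aligned} t_{ECM} &\Rightarrow c\,(1+q^2)^{1/2}\Big(\int K_e^2\Big)^{1/2} + \frac{\int K_e\,dB_\varepsilon}{\sqrt{\int K_e^2}} \\ &= c\,(1+q^2)^{1/2}\Big(\int K_e^2\Big)^{1/2} + \frac{(a-1)\int K_u\,dB_\varepsilon + s^{-1}\int K_\varepsilon\,dB_\varepsilon}{\sqrt{(a-1)^2\int K_u^2 + 2(a-1)s^{-1}\int K_u K_\varepsilon + s^{-2}\int K_\varepsilon^2}}, \end{aligned}$$

noting the definition of q.

When a = 1, (A19) simplifies to distribution (A16) for the Dickey-Fuller statistic. When a ≠ 1, (A19) can be reparameterized in terms of c and q:

(A20) $$t_{ECM} \Rightarrow c\,(1+q^2)^{1/2}\Big(\frac{q^2}{1+q^2}\int K_u^2 - \frac{2q}{1+q^2}\int K_u K_\varepsilon + \frac{1}{1+q^2}\int K_\varepsilon^2\Big)^{1/2} + \frac{\int K_u\,dB_\varepsilon - q^{-1}\int K_\varepsilon\,dB_\varepsilon}{\sqrt{\int K_u^2 - 2q^{-1}\int K_u K_\varepsilon + q^{-2}\int K_\varepsilon^2}},$$

noting that $(1+q^2)K_e^2 = q^2K_u^2 - 2qK_uK_\varepsilon + K_\varepsilon^2$. In order to obtain a “large-q” approximation without having $t_{ECM} \to -\infty$, we hold $c(1+q^2)^{1/2}$ constant while expanding in q. Thus, we define a new parameter $\gamma$, which is:

(A21) $$\gamma = c\,(1+q^2)^{1/2}.$$

For large q and constant $\gamma$, (A20) simplifies to:

(A22) $$t_{ECM} \Rightarrow \gamma\Big(\int K_u^2\Big)^{1/2} + \frac{\int K_u\,dB_\varepsilon}{\sqrt{\int K_u^2}} + O_p(q^{-1}).$$

Derivation of the distribution of (A22) parallels Phillips and Park (1988, p. 114, Proof of Theorem 2.3). The bivariate Brownian motion $(B_\varepsilon, K_u)'$ is defined on a probability space, denoted $(\Omega, F, P)$. Let $F_u$ denote the sub σ-field of F generated by $K_u$. Then the second term on the right-hand side of (A22) is a standardized normal distribution, conditional on $F_u$ (and also unconditionally). Thus, $t_{ECM}$ is itself approximately conditionally distributed as a standardized normal variate:

(A23) $$t_{ECM}\,|\,F_u \Rightarrow N\Big(\gamma\Big(\int K_u^2\Big)^{1/2},\ 1\Big) + O_p(q^{-1}).$$

In essence, (A23) is conditional on $\{u_t\}$, and so on $\{z_t\}$.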

Comparison of the unconditional distributions of $t_{ECM}$ and $t_{DF}$ requires several steps. First, note that the distribution of $t_{DF}$ in (A16) is invariant to q. Thus, for given values of T, c, and its critical value, $t_{DF}$ has a given power, p* (say). Second, $(\int K_u^2)^{1/2}$ in (A23) is non-negative; and, for any $\delta$ ($1 > \delta > 0$), there exists a $\kappa > 0$ such that:

(A24) $$\text{Prob}\Big[\Big(\int K_u^2\Big)^{1/2} > \kappa\Big] > 1 - \delta.$$

Third, note that c is negative; and $\gamma$ in (A23) is $c(1+q^2)^{1/2}$, which is O(q). Now, consider a critical value for $t_{ECM}$ equivalent to that for $t_{DF}$. For some q large enough, $\gamma(\int K_u^2)^{1/2}$ [and so $t_{ECM}$ itself] is more negative than that critical value with probability arbitrarily close to unity. Thus, for large q, tests using $t_{ECM}$ have greater power than those using $t_{DF}$.

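An approximation to the unconditional mean of $t_{ECM}$ helps in analyzing the Monte Carlo simulations:

(A25) $$E(t_{ECM}) \approx E\Big[\gamma\Big(\int K_u^2\Big)^{1/2}\Big] \approx \gamma\Big[E\Big(\int K_u^2\Big)\Big]^{1/2} \approx \gamma/\sqrt{2}.$$

The two approximations arriving at $\gamma[E(\int K_u^2)]^{1/2}$ are standard. The derivation of $E(\int K_u^2)$ proceeds as follows.

The integral $\int K_u^2$ can be generated as the large-T limit of $T^{-2}\sum \xi_t^2/\sigma_\nu^2$ for the process:

(A26) $$\xi_t = \rho\,\xi_{t-1} + \nu_t,$$

where $\rho = e^{c/T}$, c < 0, $\xi_0 = 0$, and $\nu_t$ is an innovation with variance $\sigma_\nu^2$. Without loss of generality, $\sigma_\nu^2 = 1$. For any t > 0,

(A27) $$E(\xi_t^2) = (1 - \rho^{2t})/(1 - \rho^2) = (1 - e^{2ct/T})/(1 - e^{2c/T})$$

by repeated substitution of (A26). Thus, it follows that:

(A28) $$E\Big(T^{-2}\sum_{t=1}^{T}\xi_t^2\Big) = T^{-2}\sum_{t=1}^{T}\frac{1 - e^{2ct/T}}{1 - e^{2c/T}} \approx \frac{\int_0^1 [1 - e^{2cr}]\,dr}{T\,[1 - e^{2c/T}]}.$$

Applying L’Hôpital’s rule (as T → ∞), the large-sample limit of (A28) is:

(A29) $$\lim_{T\to\infty} E\Big(T^{-2}\sum \xi_t^2\Big) = (e^{2c} - 1 - 2c)/(4c^2).$$

Applying L’Hôpital’s rule again (this time as q → ∞ and so as c → 0) obtains:

(A30) $$\lim_{c\to 0}\ \lim_{T\to\infty} E\Big(T^{-2}\sum \xi_t^2\Big) = \lim_{c\to 0} e^{2c}/2 = 1/2.$$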
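The limits in (A28)-(A30) can be checked numerically. The sketch below evaluates the finite-T expectation implied by (A27) and compares it with the closed form in (A29). It assumes a unit innovation variance, and the function names are hypothetical.

```python
import math

def expected_scaled_sum(c, T):
    """E(T^-2 * sum_{t=1}^T xi_t^2) using E(xi_t^2) = (1 - rho^(2t)) / (1 - rho^2),
    with rho = exp(c/T) and unit innovation variance (an assumption)."""
    rho2 = math.exp(2.0 * c / T)
    total = sum((1.0 - rho2 ** t) / (1.0 - rho2) for t in range(1, T + 1))
    return total / T ** 2

def limit_in_T(c):
    """Closed form (exp(2c) - 1 - 2c) / (4 c^2); expm1 limits cancellation for small c."""
    return (math.expm1(2.0 * c) - 2.0 * c) / (4.0 * c * c)

print(expected_scaled_sum(-1.0, 20000), limit_in_T(-1.0))
print(limit_in_T(-1e-3))
```

The first line compares a large-T evaluation with the closed form at c = -1, and the second evaluates the closed form close to c = 0, where it approaches 1/2.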

References

Banerjee, A., J.J. Dolado, J.W. Galbraith, and D.F. Hendry (1992) Co-integration, Error-Correction and the Econometric Analysis of Non-stationary Data, Oxford, Oxford University Press, forthcoming.

Banerjee, A., J.J. Dolado, D.F. Hendry, and G.W. Smith (1986) “Exploring Equilibrium Relationships in Econometrics Through Static Models: Some Monte Carlo Evidence”, Oxford Bulletin of Economics and Statistics, 48, 3, 253-277.

Billingsley, P. (1968) Convergence of Probability Measures, New York, John Wiley and Sons.

Campos, J. and N.R. Ericsson (1988) “Econometric Modeling of Consumers’ Expenditure in Venezuela”, International Finance Discussion Paper No. 325, Board of Governors of the Federal Reserve System, Washington, D.C.

Davidson, J.E.H., D.F. Hendry, F. Srba, and S. Yeo (1978) “Econometric Modelling of the Aggregate Time-series Relationship between Consumers’ Expenditure and Income in the United Kingdom”, Economic Journal, 88, 352, 661-692.

Dickey, D.A. and W.A. Fuller (1979) “Distribution of the Estimators for Autoregressive Time Series with a Unit Root”, Journal of the American Statistical Association, 74, 366, 427-431.

Dickey, D.A. and W.A. Fuller (1981) “Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root”, Econometrica, 49, 4, 1057-1072.

Dolado, J.J., N.R. Ericsson, and J.J.M. Kremers (1989) “Inference in Conditional Dynamic Models with Integrated Variables”, paper presented at the European Meeting of the Econometric Society, Munich, Germany, September.

Engle, R.F. and C.W.J. Granger (1987) “Co-integration and Error Correction: Representation, Estimation, and Testing”, Econometrica, 55, 2, 251-276.

Engle, R.F., D.F. Hendry, and J.-F. Richard (1983) “Exogeneity”, Econometrica, 51, 2, 277-304.

Ericsson, N.R., J. Campos, and H.-A. Tran (1991) “PC-GIVE and David Hendry’s Econometric Methodology”, International Finance Discussion Paper No. 406, Board of Governors of the Federal Reserve System, Washington, D.C.; Revista de Econometria, forthcoming.

Fuller, W.A. (1976) Introduction to Statistical Time Series, New York, John Wiley and Sons.

Granger, C.W.J. (1983) “Forecasting White Noise” in A. Zellner (ed.) Applied Time Series Analysis of Economic Data, Washington, D.C., Bureau of the Census, 308-314.

Hendry, D.F. (1989) PC-GIVE: An Interactive Econometric Modelling System, Version 6.0/6.01, Oxford, Institute of Economics and Statistics and Nuffield College, University of Oxford.

Hendry, D.F. and N.R. Ericsson (1991a) “An Econometric Analysis of U.K. Money Demand in Monetary Trends in the United States and the United Kingdom by Milton Friedman and Anna J. Schwartz”, American Economic Review, 81, 1, 8-38.

Hendry, D.F. and N.R. Ericsson (1991b) “Modeling the Demand for Narrow Money in the United Kingdom and the United States”, European Economic Review, 35, 4, 833-881.

Hendry, D.F. and G.E. Mizon (1978) “Serial Correlation as a Convenient Simplification, Not a Nuisance: A Comment on a Study of the Demand for Money by the Bank of England”, Economic Journal, 88, 351, 549-563.

Hendry, D.F., J.N.J. Muellbauer, and A. Murphy (1990) “The Econometrics of DHSY”, Chapter 13 in J.D. Hey and D. Winch (eds.) A Century of Economics: 100 Years of the Royal Economic Society and the Economic Journal, Oxford, Basil Blackwell, 298-334.

Hendry, D.F. and A.J. Neale (1990) PC-NAIVE: An Interactive Program for Monte Carlo Experimentation in Econometrics, Version 6.01, Oxford, Institute of Economics and Statistics and Nuffield College, University of Oxford (documentation by D.F. Hendry, A.J. Neale, and N.R. Ericsson).

Hendry, D.F. and J.-F. Richard (1982) “On the Formulation of Empirical Models in Dynamic Econometrics”, Journal of Econometrics, 20, 1, 3-33.

Johansen, S. (1988) “Statistical Analysis of Cointegration Vectors”, Journal of Economic Dynamics and Control, 12, 2/3, 231-254.

Johansen, S. (1989) “The Power Function of the Likelihood Ratio Test for Cointegration”, Preprint No. 1989:8, Institute of Mathematical Statistics, University of Copenhagen, Copenhagen, Denmark.

Johansen, S. (1991) “Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models”, Econometrica, 59, 6, 1551-1580.

Johansen, S. (1992a) “Cointegration in Partial Systems and the Efficiency of Single-equation Analysis”, Journal of Econometrics, 52, 3, 389-402.

Johansen, S. (1992b) “Testing Weak Exogeneity and the Order of Cointegration in UK Money Demand Data”, Journal of Policy Modeling, 14, 3, 313-334.

Johansen, S. and K. Juselius (1990a) “Maximum Likelihood Estimation and Inference on Cointegration - With Applications to the Demand for Money”, Oxford Bulletin of Economics and Statistics, 52, 2, 169-210.

Johansen, S. and K. Juselius (1990b) “Some Structural Hypotheses in a Multivariate Cointegration Analysis of the Purchasing Power Parity and the Uncovered Interest Parity for UK”, Preprint No. 1990:1, Institute of Mathematical Statistics, University of Copenhagen, Copenhagen, Denmark; Journal of Econometrics, forthcoming.

Kadane, J.B. (1970) “Testing Overidentifying Restrictions When the Disturbances Are Small”, Journal of the American Statistical Association, 65, 329, 182-185.

Kadane, J.B. (1971) “Comparison of k-Class Estimators When the Disturbances Are Small”, Econometrica, 39, 5, 723-737.

Kremers, J.J.M. (1988) “Long-run Limits on the US Federal Debt”, Economics Letters, 28, 3, 259-262.

Kremers, J.J.M. (1989) “U.S. Federal Indebtedness and the Conduct of Fiscal Policy”, Journal of Monetary Economics, 23, 2, 219-238.

MacKinnon, J.G. (1991) “Critical Values for Cointegration Tests”, Chapter 13 in R.F. Engle and C.W.J. Granger (eds.) Long-run Economic Relationships: Readings in Cointegration, Oxford, Oxford University Press, 267-276.

Mann, H.B. and A. Wald (1943) “On Stochastic Limit and Order Relationships”, Annals of Mathematical Statistics, 14, 3, 217-226.

Nymoen, R. (1992) “Finnish Manufacturing Wages 1960-1987: Real-wage Flexibility and Hysteresis”, Journal of Policy Modeling, 14, 4, in press.

Phillips, P.C.B. (1986) “Understanding Spurious Regressions in Econometrics”, Journal of Econometrics, 33, 3, 311-340.

Phillips, P.C.B. (1987a) “Time Series Regression with a Unit Root”, Econometrica, 55, 2, 277-301.

Phillips, P.C.B. (1987b) “Towards a Unified Asymptotic Theory for Autoregression”, Biometrika, 74, 3, 535-547.

Phillips, P.C.B. (1988) “Regression Theory for Near-integrated Time Series”, Econometrica, 56, 5, 1021-1043.

Phillips, P.C.B. and S.N. Durlauf (1986) “Multiple Time Series Regression with Integrated Processes”, Review of Economic Studies, 53, 4, 473-495.

Phillips, P.C.B. and J.Y. Park (1988) “Asymptotic Equivalence of Ordinary Least Squares and Generalized Least Squares in Regressions with Integrated Regressors”, Journal of the American Statistical Association, 83, 401, 111-115.

Phillips, P.C.B. and P. Perron (1988) “Testing for a Unit Root in Time Series Regression”, Biometrika, 75, 2, 335-346.

Sargan, J.D. (1964) “Wages and Prices in the United Kingdom: A Study in Econometric Methodology” in P.E. Hart, G. Mills, and J.K. Whitaker (eds.) Econometric Analysis for National Economic Planning, Colston Papers, Vol. 16, London, Butterworths, 25-63 (with discussion); reprinted in D.F. Hendry and K.F. Wallis (eds.) (1984) Econometrics and Quantitative Economics, Oxford, Basil Blackwell, 275-314.

Sargan, J.D. (1980) “Some Tests of Dynamic Specification for a Single Equation”, Econometrica, 48, 4, 879-897.

Sargan, J.D. and A. Bhargava (1983) “Testing Residuals from Least Squares Regression for Being Generated by the Gaussian Random Walk”, Econometrica, 51, 1, 153-174.

Stock, J.H. and M.W. Watson (1988) “Testing for Common Trends”, Journal of the American Statistical Association, 83, 404, 1097-1107.

White, H. (1984) Asymptotic Theory for Econometricians, Orlando, Florida, Academic Press.


Cite this document

APA

Jeroen J.M. Kremers, Neil R. Ericsson, & Juan J. Dolado (1992). The Power of Cointegration Tests (IFDP 1992-431). Board of Governors of the Federal Reserve System, International Finance Discussion Papers. https://whenthefedspeaks.com/doc/ifdp_1992-431

BibTeX

@techreport{wtfs_ifdp_1992_431,
  author = {Jeroen J.M. Kremers and Neil R. Ericsson and Juan J. Dolado},
  title = {The Power of Cointegration Tests},
  type = {International Finance Discussion Papers},
  number = {1992-431},
  institution = {Board of Governors of the Federal Reserve System},
  year = {1992},
  url = {https://whenthefedspeaks.com/doc/ifdp_1992-431},
  abstract = {A cointegration test statistic based upon estimation of an error correction model can be approximately normally distributed when no cointegration is present. By contrast, the equivalent Dickey-Fuller statistic applied to residuals from a static relationship has a non-standard asymptotic distribution. When cointegration exists, the error-correction test generally is more powerful than the Dickey-Fuller test. These differences arise because the latter imposes a possibly invalid common factor restriction. The issue is general and has ramifications for system-based cointegration tests. Monte Carlo analysis and an empirical study of U.K. money demand demonstrate the differences in power.},
}