ifdp · September 30, 1978

Distributed Lag Order Determination

International Finance Discussion Papers Number 126

October 1978

DISTRIBUTED LAG ORDER DETERMINATION by

Richard Meese

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment by a writer that he has had access to unpublished material) should be cleared with the author or authors.

Distributed Lag Order Determination

by Richard Meese*

Introduction

This paper is organized as follows: parameterization problems endemic to time series models are discussed in section one. New approaches to the parameterization problems are summarized and then applied to the problem of simultaneously estimating the length and the coefficients of a distributed lag regression model. The asymptotic properties of the new estimator of the distributed lag model (DLM) are examined in section two, and the small sample properties of the estimator are examined in section three using Monte Carlo experiments.

1. Time Series Parameterization Problems

Considerable economic analysis is carried out using time series data. The large forecasting models of the U.S. economy that help determine macro-policy are notable examples. Hence it is important that econometricians develop and use estimation procedures that are appropriate for time series. An appropriate regression technique for handling this type of data is generalized least squares (GLS), a procedure that dates back to the work of Aitken (1). Consider the generalized linear regression (GR) model:

$$y = X\beta + u, \quad \text{with} \tag{1.1}$$

where $y$ is $T \times 1$, $X$ is $T \times K$, $\beta$ is $K \times 1$, and $u$ is $T \times 1$,

a) $X$ full column rank,

b) $E(u\,|\,X) = 0$, and

c) $E(uu'\,|\,X) = \Sigma$, positive definite.

Since $\Sigma$ is positive definite, so is $\Sigma^{-1}$. Let $G'G = \Sigma^{-1}$. After premultiplication of $y$ and $X$ by $G$, the least squares regression of $Gy$ on $GX$ is best linear unbiased (BLU). Although theoretically elegant, this procedure is only a paradigm since the researcher rarely knows the disturbance covariance matrix.

*Economist, International Finance Division. John Geweke and Arthur Goldberger made helpful comments on earlier drafts of the manuscript. Any errors which may remain are my own.
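The premultiplication step can be illustrated with a minimal sketch. It assumes the special case of a stationary first-order autoregressive disturbance with a known coefficient `rho`, and a single regressor without intercept; the function names are hypothetical and chosen only for this illustration. For this case the transformation below satisfies $G'G \propto \Sigma^{-1}$, which is all that least squares on the transformed data requires.

```python
import math

def ar1_whiten(values, rho):
    """Premultiply a series by G for a stationary AR(1) disturbance.

    Assumes |rho| < 1 and that rho is known (illustrative special case).
    The first observation is rescaled; the rest are quasi-differenced.
    """
    first = math.sqrt(1.0 - rho ** 2) * values[0]
    rest = [values[t] - rho * values[t - 1] for t in range(1, len(values))]
    return [first] + rest

def gls_slope(y, x, rho):
    """GLS estimate of beta in y = x * beta + u with one regressor."""
    gy, gx = ar1_whiten(y, rho), ar1_whiten(x, rho)
    # Ordinary least squares of Gy on Gx (no intercept).
    return sum(a * b for a, b in zip(gx, gy)) / sum(a * a for a in gx)
```

When `rho` is unknown, as the text goes on to discuss, it has to be replaced by a consistent estimate, which is what makes the procedure "feasible" rather than exact GLS.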

Estimation of $\beta$ with an unknown $\Sigma$ matrix has been one focal point of the time series econometric literature for many years. Since an unrestricted $\Sigma$ matrix contains $T(T+1)/2$ distinct parameters, some restrictions on the autocovariance function of the disturbance process are necessary. The earliest estimators of $\beta$ were derived using the assumption that the disturbance process followed a first order autoregression, AR(1), $u(t) = \rho u(t-1) + \varepsilon(t)$. In what follows $\varepsilon(t)$ shall always denote a process which is independent and identically distributed (iid) with zero mean and variance $\sigma^2$. For the AR(1) process $\Sigma$ contains two unknown parameters, $\rho$ and $\sigma^2$, and an asymptotically efficient estimator of $\beta$ can be obtained using a variety of procedures. Hannan (7) and Amemiya (3) have worked out estimators of $\beta$ with less stringent assumptions on the disturbance process. They assume that the disturbance follows an ARMA(p,q) process,

$$\sum_{j=0}^{p} \alpha_j u(t-j) = \sum_{i=0}^{q} \gamma_i \varepsilon(t-i), \quad \text{where} \tag{1.2}$$

a) $\alpha_0 = \gamma_0 = 1$,

b) $p$ and $q$ unknown non-negative integers, and

c) the zeros of $\sum_{j=0}^{p} \alpha_j z^j = 0$ ($z$ complex) and $\sum_{i=0}^{q} \gamma_i z^i = 0$ lie outside the unit circle.

Hannan's estimator of $\beta$ is developed in the frequency domain while Amemiya's estimator is formulated in the time domain. Both may be interpreted as multistage GLS procedures which require consistent estimation of the parameters of $G$ or $\Sigma$ as an intermediate step. Since the autocovariance function of the disturbance process is unknown, the Amemiya and Hannan procedures can only be shown to be asymptotically efficient. To prove asymptotic efficiency (and asymptotic normality) of either coefficient estimator, the estimates of the parameters of $G$ or $\Sigma$ must "improve" as the sample size increases. This requires that the following conditions be satisfied: (a) The number of parameters characterizing the disturbance process must be allowed to increase without bound. (b) The number of observations must increase at a faster rate than the number of parameters so that the ratio of parameters to observations tends to zero as each tends to infinity. Point (a) ensures that the approximation of the true disturbance process improves as the sample size increases, and point (b) ensures that the estimate of the approximation is consistent.

In this paper we shall be concerned with the estimation of the special case of model 1.1 in which the columns of $X$ are successive lagged values of the same variable. Consider the distributed lag model:

$$y(t) = \sum_{s=0}^{M} \beta(s)x(t-s) + \varepsilon(t), \quad \text{with} \tag{1.3}$$

a) $M$ a fixed unknown non-negative integer,

b) $\sum_{s=0}^{M} \beta(s)^2 < \infty$, $\beta(M) \neq 0$,

c) $(x(t), \varepsilon(t))'$ a zero mean jointly covariance stationary process, and

d) $E(\varepsilon(t)\,|\,x(t-s)) = 0$ for all $t$ and $s$.

Model 1.3 has an observation matrix $X$ with unknown column dimension. Although models 1.1 and 1.3 have striking dissimilarities, they have a common parameterization problem. Feasible GLS estimators (Hannan and Amemiya) of model 1.1 with disturbance process 1.2 require close attention to points (a) and (b) above. Since $M$ is unknown in model 1.3, a defensible procedure in this context is to expand the length of the fitted distributed lag indefinitely as sample size increases so that specification error is avoided asymptotically. Again, the number of parameters must be allowed to increase without bound as the sample size tends to infinity, while the ratio of parameters to sample size converges to zero.

In practice, the parameterization problem is solved by increasing the dimension of the parameter space deterministically with sample size $T$. For example, let $m$ be the maximum length of the distributed lag that is to be fit for a given sample size. If a deterministic rule is followed, we choose $m$ as a function of $T$, $m(T)$, so that $m \to \infty$ as $T \to \infty$ and $\lim_{T\to\infty} m(T)/T = 0$. This guarantees that for sufficiently large $T$, $m(T) > M$, and underfitting of model 1.3, i.e., $m < M$, is avoided asymptotically. Also, since $\lim_{T\to\infty} m(T)/T = 0$, the estimator of the coefficients of the distributed lag can be shown to have desirable asymptotic properties.

The feasible GLS procedures can be made operational by use of the deterministic rule $m = m(T)$ described above. For Hannan efficient estimation of model 1.1, a consistent estimator of the spectral density of the disturbance process can be obtained by expanding the width of the spectral window as a function of the sample size. Amemiya's estimator of model 1.1 is made operational by expanding the length of the residual autoregression as a function of sample size.
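A deterministic rule of this kind can be sketched in a few lines. The power rule and its default exponent below are assumptions made purely for illustration; the text requires only that $m(T) \to \infty$ and $m(T)/T \to 0$, and `max_lag` is a hypothetical name.

```python
def max_lag(T, exponent=0.5):
    """Hypothetical deterministic rule m(T) = floor(T ** exponent).

    Any exponent strictly between 0 and 1 gives m(T) -> infinity
    and m(T) / T -> 0 as T grows.
    """
    if not 0.0 < exponent < 1.0:
        raise ValueError("exponent must lie strictly between 0 and 1")
    return int(T ** exponent)

# The maximum lag grows with T while the ratio m(T)/T shrinks.
for T in (50, 200, 5000, 500000):
    m = max_lag(T)
    print(T, m, m / T)
```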

Recently, Akaike (2) and Parzen (13, 14, 15) have suggested a new resolution of the type of parameterization problems discussed above. Both authors have suggested methods for choosing the order (length) of an autoregressive process when the order is unknown. Their procedures are similar to regression strategies since one estimates a set of autoregressive models whose length varies from zero to $m$ ($m$ chosen as a function of the sample size), choosing the order that is best according to some criterion. Akaike's decision rule is based on the principle of maximum likelihood estimation while Parzen's criterion minimizes the one step ahead mean square prediction error. It has been shown (14, p. 14) that for any autoregressive process and large $T$, the Parzen criterion selects an order that is bounded above by the order chosen using Akaike's criterion. Although this result does not imply that Akaike's decision rule is less useful, we choose to restrict attention to Parzen's criterion, which is called CAT for "criterion autoregressive transfer function."

Parzen's criterion was derived for use in selecting the order of the estimated autoregressive process, but his decision rule can be applied to the problem of estimating $M$, the length of the distributed lag in model 1.3. One chooses a lag length $m^*$ which gives the minimum value of

$$\mathrm{CAT}(i) = \frac{1}{T}\sum_{j=0}^{i} \hat\sigma_j^{-2} - \hat\sigma_i^{-2}, \quad i = 0, 1, \ldots, m, \tag{1.4}$$

where $\hat\sigma_j^2$ is the residual variance from the regression of $y(t)$ on current and $j$ lagged values of the independent variable $x(t)$. The variable $T$ denotes sample size, and $m$ is chosen as a function of $T$ in a manner described below (section two). A rigorous derivation of criterion 1.4 can be found in Parzen (14, pp. 16-20).
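The selection rule in 1.4 reduces to simple arithmetic once the residual variances are available. The sketch below assumes they have already been computed by fitting lags of order 0 through m; the function names are hypothetical and the input numbers in the usage line are made up for illustration.

```python
def cat_values(residual_variances, T):
    """Return [CAT(0), ..., CAT(m)] given residual variances for orders 0..m."""
    values, running = [], 0.0
    for s2 in residual_variances:
        running += 1.0 / s2                      # sum of inverse variances up to order i
        values.append(running / T - 1.0 / s2)    # CAT(i) as in 1.4
    return values

def select_lag(residual_variances, T):
    """Return the order i that minimizes CAT(i)."""
    cat = cat_values(residual_variances, T)
    return min(range(len(cat)), key=cat.__getitem__)

# Illustrative input: the residual variance stops falling after order 2.
print(select_lag([4.0, 2.0, 1.0, 1.0], T=100))
```

In this illustration the extra lag at order 3 leaves the residual variance unchanged, so the first term of CAT penalizes it and order 2 is chosen.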

The following is a paraphrase of Parzen's rationale for the use of CAT as a method of order estimation. Let $s(t)$ be a zero mean, normal, covariance stationary process with autocovariance function

$$R_s(v) = E(s(t) \cdot s(t+v)), \quad v = 0, \pm 1, \ldots. \tag{1.5}$$

We assume that $s(t)$ has autoregressive representation,

$$\sum_{j=0}^{\infty} a_\infty(j)s(t-j) = \varepsilon(t), \quad a_\infty(0) = 1. \tag{1.6}$$

Define the m-memory prediction error as

$$\varepsilon_m(t) = s(t) - E[s(t)\,|\,s(t-1), s(t-2), \ldots, s(t-m)]. \tag{1.7}$$

The normality of $s(t)$ implies that the m-memory prediction error is linear in past and present $s(t)$,

$$\varepsilon_m(t) = \sum_{j=0}^{m} a_m(j)s(t-j), \quad a_m(0) = 1. \tag{1.8}$$

Since $\varepsilon_m(t)$ is uncorrelated with past values of $s(t)$,

$$E(\varepsilon_m(t) \cdot s(t-k)) = 0, \quad k = 1, \ldots, m, \tag{1.9}$$

the m-memory autoregressive coefficients $a_m(j)$, $j = 1, \ldots, m$ can be found by solving a set of $m$ Yule-Walker equations, where $a_m(0)$ is defined to be one:

$$\sum_{j=0}^{m} a_m(j)R_s(j-k) = 0, \quad k = 1, \ldots, m. \tag{1.10}$$

The m-memory prediction error variance is given by

$$\sigma_m^2 = E(\varepsilon_m(t))^2 = \sum_{j=0}^{m} a_m(j)R_s(j), \quad a_m(0) = 1. \tag{1.11}$$
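The Yule-Walker system in 1.10 and the variances in 1.11 can be solved recursively for all orders up to m. The sketch below uses the Levinson-Durbin recursion, which is a standard technique swapped in here for illustration and not something the text prescribes; `yule_walker` is a hypothetical name, and the sign convention follows 1.8 (coefficients of the prediction error filter, with the leading coefficient equal to one).

```python
def yule_walker(R, m):
    """Solve sum_j a_m(j) R(j-k) = 0, k = 1..m, by Levinson-Durbin.

    R is a list of autocovariances R(0), ..., R(m).
    Returns (a, variances): a = [1, a_m(1), ..., a_m(m)] and
    variances[k] = prediction error variance of the order-k fit.
    """
    a = [1.0]
    sigma2 = R[0]
    variances = [sigma2]
    for k in range(1, m + 1):
        # Covariance between the order-(k-1) error and s(t-k).
        acc = sum(a[j] * R[k - j] for j in range(k))
        kappa = -acc / sigma2
        a = [1.0] + [a[j] + kappa * a[k - j] for j in range(1, k)] + [kappa]
        sigma2 *= 1.0 - kappa ** 2
        variances.append(sigma2)
    return a, variances
```

For an AR(1) process $s(t) = 0.8\,s(t-1) + \varepsilon(t)$ with unit innovation variance, the autocovariances are $R_s(v) = 0.8^{v}/0.36$, and the recursion returns $a_2 = (1, -0.8, 0)$ with prediction error variance one for orders one and two.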

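From 1.6 $\varepsilon(t)$ is the infinite-memory prediction error obtained from the projection of $s(t)$ on its infinite past,

$$\varepsilon(t) = s(t) - E[s(t)\,|\,s(t-1), s(t-2), \ldots]. \tag{1.12}$$

Let $\sigma_\infty^2$ denote the infinite-memory prediction error variance,

$$\sigma_\infty^2 = E(\varepsilon(t))^2 = \sum_{j=0}^{\infty} a_\infty(j)R_s(j), \quad a_\infty(0) = 1, \tag{1.13}$$

and define the transfer functions

$$g_m(z) = 1 + \sum_{j=1}^{m} a_m(j)z^j, \qquad g_\infty(z) = 1 + \sum_{j=1}^{\infty} a_\infty(j)z^j$$

for $z$ complex, and let

$$\gamma_\infty(e^{i\omega}) = \sigma_\infty^{-2}g_\infty(e^{i\omega}), \qquad \hat\gamma_m(e^{i\omega}) = \hat\sigma_m^{-2}\hat g_m(e^{i\omega}), \quad \text{and} \quad \hat\varepsilon_m(t) = \sum_{j=0}^{m} \hat a_m(j)s(t-j). \tag{1.14}$$

The $\hat\sigma_m^2$, $\hat a_m(j)$ and $\hat g_m(\cdot)$ are consistent estimates of $\sigma_m^2$, $a_m(j)$ and $g_m(\cdot)$ respectively which are obtained by solving the sample Yule-Walker equations, where $\hat a_m(0)$ is defined to be one:

$$\sum_{j=0}^{m} \hat a_m(j)\hat R_s(j-k) = 0, \quad k = 1, \ldots, m,$$

$$\hat R_s(v) = \frac{1}{T}\sum_{t=1}^{T-v} s(t)s(t+v), \quad \text{and} \tag{1.15}$$

$$\hat\sigma_m^2 = \sum_{j=0}^{m} \hat a_m(j)\hat R_s(j).$$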

The idea is to approximate the autoregressive process 1.6, of unknown but possibly infinite order, by a finite order process so as to minimize the one-step ahead mean square prediction error associated with the approximation of $s(t)$ by an AR(m). As a measure of the one-step ahead mean square prediction error Parzen takes (14, p. 19)

$$J_m = E(\hat\varepsilon_m(t) - \varepsilon(t))^2 = E\int_{-\pi}^{\pi} \left|\hat\gamma_m(e^{i\omega}) - \gamma_\infty(e^{i\omega})\right|^2 f(\omega)\,d\omega, \tag{1.16}$$

where $f(\omega)$ is a spectral density function on $(-\pi, \pi)$ given by

$$f(\omega) = \frac{\sigma_\infty^2}{2\pi}\left|g_\infty(e^{i\omega})\right|^{-2}. \tag{1.17}$$

For large $T$ Parzen shows (14, pp. 16-20) that approximately,

$$J_m \approx \frac{1}{T}\sum_{j=1}^{m} \sigma_j^{-2} + (\sigma_\infty^{-2} - \sigma_m^{-2}). \tag{1.18}$$

This mean square error expression is the sum of two terms, $(\sigma_\infty^{-2} - \sigma_m^{-2})$ representing bias and $\frac{1}{T}\sum_{j=1}^{m} \sigma_j^{-2}$ representing variability of $\hat\gamma_m$. Because $\sigma_\infty^{-2}$ is not a function of $m$, to find $m^*$, the optimal order of the AR process, it is sufficient to find the minimum of

$$\mathrm{CAT}(i) = J_i - \sigma_\infty^{-2} = \frac{1}{T}\sum_{j=1}^{i} \sigma_j^{-2} - \sigma_i^{-2}, \quad i = 1, \ldots, m. \tag{1.19}$$

In practice, when using the CAT criterion to fit an autoregressive model, it is necessary to replace the $\sigma_j^2$, $j = 1, \ldots, m$ in formula 1.19 by their consistent estimates. See Parzen (14, pp. 20-23) for further discussion of this point.

Although CAT's theoretical justification is completely different from that of the residual variance criterion, Theil (18, pp. 543-545), the two methods of determining the order of the distributed lag model 1.3 have similarities. The method of selecting the variables to be included in a regression model by choosing the specification with smallest residual variance can be used since the expected value of the residual variance of the erroneous model minus the expectation of the residual variance of the true model is non-negative. On average, one chooses the correct specification of the regression model, but the residual variance criterion produces estimates of the population variance of $y$ given $x$ that are biased downward, and will not choose the correct specification of the regression model if it is not one of the models that is being considered. For fixed $m$ the expression $\frac{1}{T}\sum_{j=0}^{m} \hat\sigma_j^{-2} - \hat\sigma_m^{-2}$ converges to $\operatorname{plim}_{T\to\infty}(-\hat\sigma_m^{-2})$ because the first term of CAT, $\frac{1}{T}\sum_{j=0}^{m} \hat\sigma_j^{-2}$, converges to zero as $T \to \infty$ (see section two below). In large samples, minimizing CAT(i), $i = 1, \ldots, m$ is thus similar to choosing the model specification with smallest residual variance, because minimizing $-\hat\sigma^{-2}$ is equivalent to choosing the model with smallest $\hat\sigma^2$. Despite the similarity, we produce asymptotic distribution results in the next section which indicate the superiority of CAT over the residual variance criterion.


2. Properties of an estimator of the distributed lag model 1.3 when CAT is used to determine the order of the lag distribution.

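Consider the distributed lag model 1.3 with the additional assumptions

e) $\lim_{s\to\infty} \mathrm{cov}(x(t), x(t-s)) = 0$,

f) $x(t)$ has finite fourth order moments, and

g) $\varepsilon(t)$ is normally distributed with zero mean and unit variance for all $t$.

It will be convenient to write model 1.3 in matrix notation,

$$y = X_M\beta_M + \varepsilon, \tag{1.20}$$

where $y$ is $T \times 1$, $X_M$ is $T \times (M+1)$, $\beta_M$ is $(M+1) \times 1$, and $\varepsilon$ is $T \times 1$. Let $m$ denote the maximum length of the distributed lag that is to be fit for a given $T$. For $m > M$ partition $X_m = (X_M, X_{m-M})$ and write 1.20 as

$$y = (X_M \;\; X_{m-M})\begin{pmatrix}\beta_M \\ \beta_{m-M}\end{pmatrix} + \varepsilon, \quad \text{where } \beta_{m-M} = 0, \tag{1.21}$$

with $X_M$ of dimension $T \times (M+1)$ and $X_{m-M}$ of dimension $T \times (m-M)$. When $m < M$ partition $X_M = (X_m^*, X_{M-m}^*)$ and write 1.20 as

$$y = X_m^*\beta_m^* + v, \quad v = \varepsilon + X_{M-m}^*\beta_{M-m}^*. \tag{1.22}$$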

Assumptions (c), (e), and (f) are sufficient to show

$$\operatorname*{plim}_{T\to\infty}\left(\frac{X_M'X_M}{T}\right) = Q(M) \quad (\text{any fixed } M), \tag{1.23}$$

where $Q(M)$ is an $(M+1) \times (M+1)$ matrix with $(i,j)$ element equal to $\mathrm{cov}(x(t-i), x(t-j))$. When $m > M$ is fixed, define

$$\operatorname*{plim}_{T\to\infty}\frac{1}{T}\begin{pmatrix}X_M'X_M & X_M'X_{m-M} \\ X_{m-M}'X_M & X_{m-M}'X_{m-M}\end{pmatrix} = \begin{pmatrix}Q(M) & Q(M,m-M) \\ Q(M,m-M)' & Q(m-M)\end{pmatrix} = Q(m). \tag{1.24}$$

Given that the limit matrix of 1.24 is nonsingular,

$$\operatorname*{plim}_{T\to\infty}\left(\frac{X_m'X_m}{T}\right)^{-1} = Q(m)^{-1} \quad (M, m \text{ fixed}) \tag{1.25}$$

exists since the elements of the inverse of a matrix are continuous functions of the elements of the matrix itself (18, p. 363). Similarly, when $m < M$ define

$$\operatorname*{plim}_{T\to\infty}\frac{1}{T}\begin{pmatrix}X_m^{*\prime}X_m^* & X_m^{*\prime}X_{M-m}^* \\ X_{M-m}^{*\prime}X_m^* & X_{M-m}^{*\prime}X_{M-m}^*\end{pmatrix} = \begin{pmatrix}Q^*(m) & Q^*(m,M-m) \\ Q^*(m,M-m)' & Q^*(M-m)\end{pmatrix} \quad (m, M \text{ fixed}). \tag{1.26}$$

It is also true that the limit matrix of 1.26 has an inverse by the same argument given above (18, p. 363). In what follows we will be interested in expanding the maximum length of the fitted distributed lag indefinitely as $T \to \infty$. Suppose we let $m = m(T)$ so that $m(T) \to \infty$ as $T \to \infty$, and $\lim_{T\to\infty} m(T)/T = 0$. Clearly, for $T$ sufficiently large, $m(T) > M$. Now consider the sequence of $(M+1) \times (M+1)$ matrices for any sample size $T$;

$$\frac{1}{T}X_M'X_M \geq \frac{1}{T}X_M'P_1X_M \geq \frac{1}{T}X_M'P_2X_M \geq \cdots \geq \frac{1}{T}X_M'P_{m(T)-M}X_M > 0, \tag{1.27}$$

where $\geq$ and $>$ denote matrix ordering and $P_i = (I - X_i(X_i'X_i)^{-1}X_i')$, $i = 1, \ldots, m(T)-M$, where $X_i$ is the $T \times i$ matrix whose $j$th column, $j = 1, \ldots, i$, is composed of $x(t)$ lagged $M+j$ times. The inverses of the matrices in 1.27 also form a monotone sequence,$^{6}$

$$T(X_M'X_M)^{-1} \leq T(X_M'P_1X_M)^{-1} \leq \cdots \leq T(X_M'P_{m(T)-M}X_M)^{-1} < \infty. \tag{1.28}$$

Finally, we define $\hat\sigma_m^2$ as the error sum of squares $e'e$ from a regression of $y(t)$ on $x(t), x(t-1), \ldots, x(t-m)$, $t = 1, \ldots, T$, divided by $T-m-1$.

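For any fixed $m$ and iid $\varepsilon(t)$, $\operatorname{plim}_{T\to\infty}(X_m'\varepsilon/T) = 0$ (4, pp. 23-24).$^{7}$ When $m < M$,

$$\operatorname*{plim}_{T\to\infty}\hat\sigma_m^2 = \operatorname*{plim}_{T\to\infty}\left(\frac{T}{T-m-1}\right) \cdot \operatorname*{plim}_{T\to\infty}\frac{1}{T}\left[\beta_{M-m}^{*\prime}X_{M-m}^{*\prime}P_m^*X_{M-m}^*\beta_{M-m}^* + 2\beta_{M-m}^{*\prime}X_{M-m}^{*\prime}P_m^*\varepsilon + \varepsilon'P_m^*\varepsilon\right], \tag{1.29}$$

where $P_m^* = (I - X_m^*(X_m^{*\prime}X_m^*)^{-1}X_m^{*\prime})$. Using 1.26,

$$\operatorname*{plim}_{T\to\infty}\hat\sigma_m^2 = \sigma^2 + \beta_{M-m}^{*\prime}\left[Q^*(M-m) - Q^*(m,M-m)'Q^*(m)^{-1}Q^*(m,M-m)\right]\beta_{M-m}^* > \sigma^2. \tag{1.30}$$

For $m \geq M$, $\operatorname{plim}\hat\sigma_m^2 = \sigma^2$ since $(I - X_m(X_m'X_m)^{-1}X_m')X_M = 0$ when $m \geq M$. Therefore, $\hat\sigma_m^2$ is a consistent estimator of $\sigma^2$ when $m \geq M$, $m$ fixed. For $m < M$, $\operatorname{plim}_{T\to\infty}\hat\sigma_m^2 > \sigma^2$. Now consider the case $m = m(T)$:

$$\operatorname*{plim}_{T\to\infty}\hat\sigma_{m(T)}^2 = \operatorname*{plim}_{T\to\infty}\left[\frac{\varepsilon'\left(I - X_{m(T)}(X_{m(T)}'X_{m(T)})^{-1}X_{m(T)}'\right)\varepsilon}{T-m(T)-1}\right]. \tag{1.31}$$

Let $N_{m(T)} = X_{m(T)}(X_{m(T)}'X_{m(T)})^{-1}X_{m(T)}'$. Then from 1.31,

$$\operatorname*{plim}_{T\to\infty}\hat\sigma_{m(T)}^2 = \lim_{T\to\infty}\left(\frac{T}{T-m(T)-1}\right)\operatorname*{plim}_{T\to\infty}\left(\frac{\varepsilon'\varepsilon}{T} - \frac{\varepsilon'N_{m(T)}\varepsilon}{T}\right) = \sigma^2 - \operatorname*{plim}_{T\to\infty}\left(\frac{\varepsilon'N_{m(T)}\varepsilon}{T}\right). \tag{1.32}$$

Observe that

$$\frac{\varepsilon'N_{m(T)}\varepsilon}{T} \geq 0 \text{ for all } T \text{ and } m(T), \text{ and that } E\left(\frac{\varepsilon'N_{m(T)}\varepsilon}{T}\right) = \frac{\sigma^2 m(T)}{T}. \tag{1.33}$$

Clearly, $\lim_{T\to\infty} E\left(\varepsilon'N_{m(T)}\varepsilon/T\right) = \lim_{T\to\infty} \sigma^2 m(T)/T = 0$, from which $\operatorname{plim}_{T\to\infty}\left(\varepsilon'N_{m(T)}\varepsilon/T\right) = 0$.$^{9}$ Therefore, $\hat\sigma_{m(T)}^2$ is a consistent estimator of $\sigma^2$ provided $\lim_{T\to\infty} m(T)/T = 0$.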

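We are now ready to examine the large sample properties of CAT as a model selection criterion. For sufficiently large $T$, $m(T) > M$ so without loss of generality, we need only examine the case for which $m(T) > M$. Consider the probability limit of CAT(m(T)),

$$\operatorname*{plim}_{T\to\infty}\mathrm{CAT}(m(T)) = \operatorname*{plim}_{T\to\infty}\left[\frac{1}{T}\sum_{j=0}^{m(T)}\hat\sigma_j^{-2} - \hat\sigma_{m(T)}^{-2}\right] = \operatorname*{plim}_{T\to\infty}\frac{1}{T}\sum_{j=0}^{M}\hat\sigma_j^{-2} + \operatorname*{plim}_{T\to\infty}\frac{1}{T}\sum_{j=M+1}^{m(T)}\hat\sigma_j^{-2} - \operatorname*{plim}_{T\to\infty}\hat\sigma_{m(T)}^{-2}. \tag{1.34}$$

We shall analyze each part separately:

$$\operatorname*{plim}_{T\to\infty}\frac{1}{T}\sum_{j=0}^{M}\hat\sigma_j^{-2} = \lim_{T\to\infty}\left(\frac{1}{T}\right) \times \operatorname*{plim}_{T\to\infty}\sum_{j=0}^{M}\hat\sigma_j^{-2} \leq \lim_{T\to\infty}\frac{1}{T}(M+1)\sigma^{-2} = 0, \tag{1.35}$$

since for $j \leq M$, $\operatorname{plim}_{T\to\infty}\hat\sigma_j^{-2} \leq \sigma^{-2}$. Therefore, $\operatorname{plim}_{T\to\infty}\frac{1}{T}\sum_{j=0}^{M}\hat\sigma_j^{-2} = 0$. We have already established that $\operatorname{plim}_{T\to\infty}(-\hat\sigma_{m(T)}^{-2}) = -\sigma^{-2}$, so there is one part of 1.34 left to consider. Now

$$E\left(\frac{1}{T}\sum_{j=M+1}^{m(T)}\hat\sigma_j^{-2}\right) = \frac{1}{T}\sum_{j=M+1}^{m(T)}E(\hat\sigma_j^{-2}) = \frac{1}{T}\sum_{j=M+1}^{m(T)}\frac{T-j-1}{\sigma^2(T-j-3)}, \tag{1.36}$$

since $\hat\sigma_j^2(T-j-1)/\sigma^2$ is distributed as a $\chi^2(T-j-1)$ for $j = M+1, \ldots, m(T)$, and using the definition of expectation, the expected value of the reciprocal of a $\chi^2(T-j-1)$ variate can be shown to be $1/(T-j-3)$. The last term on the right hand side of 1.36 is bounded above,

$$\frac{1}{T}\sum_{j=M+1}^{m(T)}\frac{T-j-1}{\sigma^2(T-j-3)} \leq \frac{m(T)-M}{T\sigma^2}\left(\frac{T-m(T)-1}{T-m(T)-3}\right). \tag{1.37}$$

The limit as $T \to \infty$ of the right hand side of 1.37 is equal to zero, so $\operatorname{plim}_{T\to\infty}\frac{1}{T}\sum_{j=M+1}^{m(T)}\hat\sigma_j^{-2} = 0$ by lemma 1 in footnote 9. Returning to our original problem 1.34, we have

$$\operatorname*{plim}_{T\to\infty}\mathrm{CAT}(m(T)) = \operatorname*{plim}_{T\to\infty}\left[\frac{1}{T}\sum_{j=0}^{m(T)}\hat\sigma_j^{-2} - \hat\sigma_{m(T)}^{-2}\right] = -\sigma^{-2}. \tag{1.38}$$

Provided $m(T)$ increases at a slower rate than $T$, i.e., $\lim_{T\to\infty} m(T)/T = 0$, CAT(m) is a consistent estimator of $-\sigma^{-2}$.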
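The bound in 1.37 can be checked numerically. The sketch below evaluates both sides of 1.37 for a few sample sizes; the square-root rule used for $m(T)$, the choice $M = 2$, and the function names are assumptions made only for this illustration.

```python
def expected_sum(T, M, m, sigma2=1.0):
    """Left side of 1.37: (1/T) * sum_{j=M+1}^{m} (T-j-1) / (sigma2 * (T-j-3))."""
    return sum((T - j - 1) / (sigma2 * (T - j - 3)) for j in range(M + 1, m + 1)) / T

def upper_bound(T, M, m, sigma2=1.0):
    """Right side of 1.37; requires T - m - 3 > 0."""
    return (m - M) / (T * sigma2) * (T - m - 1) / (T - m - 3)

# Illustrative rule m(T) = floor(sqrt(T)) with true lag length M = 2.
for T in (100, 10000, 1000000):
    m = int(T ** 0.5)
    print(T, m, expected_sum(T, 2, m), upper_bound(T, 2, m))
```

Each summand is largest at $j = m(T)$, which is why replacing every term by that value gives the right hand side; both columns shrink toward zero as $T$ grows because $m(T)/T$ does.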

Define $m^*$ as the lag length which gives the minimum value of

$$\mathrm{CAT}(i) = \frac{1}{T}\sum_{j=0}^{i}\hat\sigma_j^{-2} - \hat\sigma_i^{-2}.$$

Proposition 1: The $\lim_{T\to\infty}\mathrm{Prob}(m^* \geq M) = 1$, i.e., the probability that minimizing CAT(i), $i = 0, 1, \ldots, m(T)$ results in specification error goes to zero as $T \to \infty$.

Proof: Suppose we look at a finite subcollection of the set $\{i = 0, 1, 2, \ldots, m(T)\}$ that does not contain any integers greater than or equal to $M$. Let $I = \{0, 1, 2, \ldots, M-1\}$. Then $\operatorname{plim}_{T\to\infty}[\min_{i \in I}\mathrm{CAT}(i)] = \min_{i \in I}(\operatorname{plim}_{T\to\infty}\mathrm{CAT}(i))$ since the min function does not depend upon the sample size $T$, and $\operatorname{plim}_{T\to\infty}\mathrm{CAT}(i)$ exists by virtue of 1.30. The $\min_{i \in I}(\operatorname{plim}_{T\to\infty}\mathrm{CAT}(i)) > -\sigma^{-2}$ from 1.30.

If we look at a different finite subcollection of the $i$'s that contains at least one integer greater than or equal to $M$,

$i^* = \{$non-negative integers $< M$, and at least one integer $\geq M\}$,

then

$$\operatorname*{plim}_{T\to\infty}\left(\min_{i \in i^*}\mathrm{CAT}(i)\right) = \min_{i \in i^*}\left(\operatorname*{plim}_{T\to\infty}\mathrm{CAT}(i)\right) = -\sigma^{-2}.$$

Since $m(T) \to \infty$ as $T \to \infty$, for sufficiently large $T$ we have $m(T) > M$ and there exists a finite subcollection of $\{i = 0, 1, \ldots, m(T)\}$ that contains at least one integer greater than or equal to $M$ as $T \to \infty$. Therefore, $\lim_{T\to\infty}\mathrm{Prob}(m^* \geq M) = 1$.

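Proposition 2: Let $k$ be a fixed positive integer. Then $T \cdot (\mathrm{CAT}(M+k) - \mathrm{CAT}(M))$ converges in distribution to $\sigma^{-2}(2k - \chi^2(k))$ as $T \to \infty$.

Proof: Let $e_m$ denote the residual vector from the fitted distributed lag of order $m$, so that $\hat\sigma_m^2 = e_m'e_m/(T-m-1)$. Then

$$T(\mathrm{CAT}(M+k) - \mathrm{CAT}(M)) = \sum_{j=M+1}^{M+k}\hat\sigma_j^{-2} + T\hat\sigma_M^{-2} - T\hat\sigma_{M+k}^{-2}$$

$$= \sum_{j=M+1}^{M+k}\hat\sigma_j^{-2} + \frac{T}{T-M-k-1} \cdot \frac{k\,\hat\sigma_M^2 - (e_M'e_M - e_{M+k}'e_{M+k})}{\hat\sigma_M^2\,\hat\sigma_{M+k}^2}. \tag{1.39}$$

Because $\hat\sigma_{M+i}^2$ is a consistent estimator of $\sigma^2$ for $i = 0, 1, \ldots, k$, $k < \infty$, and $(e_M'e_M - e_{M+k}'e_{M+k})/\sigma^2$ has a $\chi^2(k)$ distribution independent of $T$, expression 1.39 converges in distribution to

$$k\sigma^{-2} + (k - \chi^2(k))\sigma^{-2} = \sigma^{-2}(2k - \chi^2(k)).$$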

Proposition 3: With $e_m$ the residual vector from the fitted distributed lag of order $m$,

$$\mathrm{Cov}(e_{M+j}'e_{M+j} - e_{M+i}'e_{M+i},\; e_{M+i}'e_{M+i} - e_{M+K}'e_{M+K}) = 0, \quad 0 \leq j < i < K.$$

Proof: For $0 \leq j < i \leq K$, it is true that $e_{M+j}'e_{M+j} - e_{M+i}'e_{M+i}$ and $e_{M+i}'e_{M+i}$ are independent (18, pp. 84-85, 139). This implies

$$\mathrm{cov}(e_{M+j}'e_{M+j} - e_{M+i}'e_{M+i},\; e_{M+i}'e_{M+i}) = 0,$$

whence

$$\mathrm{cov}(e_{M+j}'e_{M+j},\; e_{M+i}'e_{M+i}) = \mathrm{Var}(e_{M+i}'e_{M+i}) = 2(T-M-i-1)\sigma^4. \tag{1.40}$$

The second line follows from the fact that $e_{M+i}'e_{M+i}/\sigma^2$ has a chi-square distribution with $(T-M-i-1)$ degrees of freedom, and the variance of a chi-square variate is equal to twice the number of degrees of freedom. The $\mathrm{cov}(e_{M+j}'e_{M+j} - e_{M+i}'e_{M+i},\; e_{M+i}'e_{M+i} - e_{M+K}'e_{M+K}) = 0$ since for $0 \leq j < i < K$, and using 1.40 above,

$$\mathrm{cov}(e_{M+j}'e_{M+j} - e_{M+i}'e_{M+i},\; e_{M+i}'e_{M+i} - e_{M+K}'e_{M+K}) = 2(T-M-K-1)\sigma^4 - 2(T-M-K-1)\sigma^4 + 2(T-M-i-1)\sigma^4 - 2(T-M-i-1)\sigma^4 = 0. \tag{1.41}$$

Using propositions 2 and 3 we can now calculate the $\lim_{T\to\infty}\mathrm{Prob}(m^* = M+j)$, $j = 0, 1, \ldots, k$. It has already been established that $\lim_{T\to\infty}\mathrm{Prob}(m^* = M-j) = 0$, $j = 1, \ldots, M$. Define the following:

$$S_j = 2j - (e_M'e_M - e_{M+j}'e_{M+j})/\sigma^2, \quad 0 < j \leq k, \qquad S_0 = 0. \tag{1.42}$$

Observe that $S_j \sim 2j - \chi^2(j)$ and $S_j = S_{j-1} + 2 - w_j$, where the $w_j$'s are distributed as independent $\chi^2(1)$ variates. Define

$$p_j = \mathrm{Prob}(Z_j > 0\,|\,Z_{j-1} > 0), \quad j = 1, 2, \ldots, \quad \text{and} \quad p_j^* = \mathrm{Prob}(-Z_j > 0\,|\,-Z_{j-1} > 0), \quad j = 1, 2, \ldots, \tag{1.43}$$

where $Z_j \sim 2j - \chi^2(j)$; $Z_0 = 0$, $Z_j = Z_{j-1} + y_j$, $y_j \sim 2 - \chi^2(1)$ and $y_j$ is independent of $Z_{j-1}$. Now

$$\lim_{T\to\infty}\mathrm{Prob}(m^* = M+j) = \mathrm{Prob}(-S_j > 0,\; S_1 - S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0,\; S_{j+1} - S_j > 0,\; S_{j+2} - S_j > 0,\; \ldots)$$

$$= \mathrm{Prob}(-S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0) \cdot \mathrm{Prob}(S_{j+1} - S_j > 0,\; \ldots), \tag{1.44}$$

since the random variables $(S_i - S_j)$, $0 \leq i \leq j-1$ are independent of $(S_n - S_j)$, $n \geq j+1$ by proposition 3. Note that

$$\mathrm{Prob}(-S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0) = \mathrm{Prob}(S_{j-1} - S_j > 0) \cdot \mathrm{Prob}(S_{j-2} - S_j > 0\,|\,S_{j-1} - S_j > 0) \cdot \mathrm{Prob}(S_{j-3} - S_j > 0\,|\,S_{j-2} - S_j > 0,\; S_{j-1} - S_j > 0) \cdots$$

$$\cdots \mathrm{Prob}(S_{j-i} - S_j > 0\,|\,S_{j-i+1} - S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0) \cdots. \tag{1.45}$$

Given $1 < i \leq j$ observe that

$$\mathrm{Prob}(S_{j-i} - S_j > 0\,|\,S_{j-i+1} - S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0) = \mathrm{Prob}(-Z_i > 0\,|\,-Z_{i-1} > 0,\; \ldots,\; -Z_1 > 0)$$

$$= \mathrm{Prob}(-Z_{i-1} - y_i > 0\,|\,-Z_{i-1} > 0,\; \ldots,\; -Z_1 > 0) = \mathrm{Prob}(-Z_{i-1} - y_i > 0\,|\,-Z_{i-1} > 0), \tag{1.46}$$

because $y_i$ is independent of $Z_n$, $n = 1, \ldots, i-1$. Therefore,

$$\mathrm{Prob}(-S_j > 0,\; S_1 - S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0) = \prod_{i=1}^{j}\mathrm{Prob}(S_{j-i} - S_j > 0\,|\,S_{j-i+1} - S_j > 0) = \prod_{i=1}^{j} p_i^*. \tag{1.47}$$

Similarly, one can show

$$\mathrm{Prob}(S_{j+1} - S_j > 0,\; S_{j+2} - S_j > 0,\; \ldots) = \prod_{i=1}^{\infty} p_i. \tag{1.48}$$

Putting the two results together,

$$\lim_{T\to\infty}\mathrm{Prob}(m^* = M+j) = \prod_{i=1}^{j} p_i^* \prod_{i=1}^{\infty} p_i. \tag{1.49}$$

In order to calculate the limiting probability that $m^*$ is equal to $M+j$, we need to approximate expression 1.49. To accomplish this end let $u_{i-1} \sim \chi^2(i-1)$, $v \sim \chi^2(1)$ independent of $u_{i-1}$, and for $i \geq 2$ note that

$$p_i = \mathrm{Prob}(Z_i > 0\,|\,Z_{i-1} > 0) = \mathrm{Prob}(2i - u_{i-1} - v > 0\,|\,2(i-1) - u_{i-1} > 0)$$

$$= \mathrm{Prob}(u_{i-1} + v < 2i\,|\,u_{i-1} < 2(i-1)) = \mathrm{Prob}(u_i < 2i\,|\,u_{i-1} < 2(i-1)). \tag{1.50}$$

[Figure 1: shaded region in the $(u_{i-1}, v)$ plane]

The shaded area of figure 1 represents the set of $u_{i-1}$ and $v$ such that $u_{i-1} + v < 2i$ and $u_{i-1} < 2(i-1)$. Let $F_{u_i}(\cdot)$, $F_{u_{i-1}}(\cdot)$, and $F_v(\cdot)$ denote the cumulative distribution functions (cdf) of $u_i$, $u_{i-1}$ and $v$ and let $f_{u_i}$, $f_{u_{i-1}}$, and $f_v$ denote the corresponding probability density functions (pdf). It is clear from figure 1 that there are several representations of $p_i$ in terms of these cdf and pdf. Since the $p_i$ must be calculated numerically, we have chosen the following expression for $p_i$ to minimize computational cost:

$$p_i = \frac{F_{u_i}(2i) - \int_0^2 \left(F_{u_{i-1}}(2i-v) - F_{u_{i-1}}(2(i-1))\right)f_v(v)\,dv}{F_{u_{i-1}}(2(i-1))} = \frac{F_{u_i}(2i) - \int_0^2 F_{u_{i-1}}(2i-v)f_v(v)\,dv}{F_{u_{i-1}}(2(i-1))} + F_v(2). \tag{1.51}$$

Numerical integration of the second term in the numerator of 1.51 was carried out as follows. The closed interval [0,2] was divided into 20,000 disjoint intervals, and the value of $f_v(v)$, $0 \leq v \leq 2$ was calculated using the approximation

$$f_v(v) = \mathrm{Prob}(\chi^2(1) \leq v + .0001) - \mathrm{Prob}(\chi^2(1) \leq v - .0001). \tag{1.52}$$

A similar procedure is used to find $p_i^*$, $i \leq j$:

$$p_i^* = \mathrm{Prob}(-Z_i > 0\,|\,-Z_{i-1} > 0) = \mathrm{Prob}(2i - u_{i-1} - v < 0\,|\,2(i-1) - u_{i-1} < 0)$$

$$= \mathrm{Prob}(u_{i-1} + v > 2i\,|\,u_{i-1} > 2(i-1)) = \mathrm{Prob}(u_i > 2i\,|\,u_{i-1} > 2(i-1)). \tag{1.53}$$
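The conditional probability $p_i$ in 1.50 can be approximated along the lines of 1.51 with a short numerical sketch. This is not the author's program: the chi-square cdf is computed here from a series expansion, the integral over [0, 2] is approximated cell by cell using the exact probability mass of $v$ in each cell, and the names `chi2_cdf` and `p_under` as well as the number of cells are assumptions for illustration.

```python
import math

def chi2_cdf(x, k):
    """CDF of a chi-square variate with k degrees of freedom.

    Uses the series for the regularized incomplete gamma function,
    which is adequate for the moderate arguments needed here.
    """
    if x <= 0.0:
        return 0.0
    if k == 0:
        return 1.0  # chi-square(0) is degenerate at zero
    a, z = k / 2.0, x / 2.0
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    total, n = term, 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return min(total, 1.0)

def p_under(i, cells=2000):
    """Approximate p_i = Prob(u_{i-1} + v < 2i | u_{i-1} < 2(i-1))."""
    if i == 1:
        return chi2_cdf(2.0, 1)  # u_0 is identically zero
    h = 2.0 / cells
    integral = 0.0
    for c in range(cells):
        lo, hi = c * h, (c + 1) * h
        mass = chi2_cdf(hi, 1) - chi2_cdf(lo, 1)   # Prob(lo < v <= hi)
        integral += chi2_cdf(2 * i - (lo + hi) / 2.0, i - 1) * mass
    denom = chi2_cdf(2.0 * (i - 1), i - 1)
    # Second form of 1.51.
    return (chi2_cdf(2.0 * i, i) - integral) / denom + chi2_cdf(2.0, 1)
```

Weighting by the probability mass of each cell, instead of evaluating the density itself, sidesteps the singularity of the $\chi^2(1)$ density at zero.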

[Figure 2: shaded region in the $(u_{i-1}, v)$ plane]

The shaded area of figure 2 represents the set of $u_{i-1}$ and $v$ such that $u_{i-1} + v > 2i$ and $u_{i-1} > 2(i-1)$.

$$p_i^* = \frac{1 - F_{u_{i-1}}(2(i-1)) - \int_0^2 \left(F_{u_{i-1}}(2i-v) - F_{u_{i-1}}(2(i-1))\right)f_v(v)\,dv}{1 - F_{u_{i-1}}(2(i-1))}$$

$$= \frac{1 - F_{u_{i-1}}(2(i-1)) + F_{u_{i-1}}(2(i-1))F_v(2) - \int_0^2 F_{u_{i-1}}(2i-v)f_v(v)\,dv}{1 - F_{u_{i-1}}(2(i-1))}. \tag{1.54}$$

Table 2.1 summarizes the results of these calculations.

Table 2.1 Limiting Probabilities that $m^* = M+j$, $j = 0, \ldots, 14$.

Columns: $j$; $M+j$; $\lim_{T\to\infty}\mathrm{Prob}(m^* = M+j) = \prod_{i=1}^{j} p_i^* \prod_{i=1}^{\infty} p_i$.

[Table entries are illegible in the source.]

The figures in Table 2.1 indicate that as $T \to \infty$, the CAT criterion does not prevent asymptotic overfitting of the distributed lag model, i.e. $\lim \mathrm{Prob}(m^* > M) = .5646$. As $T \to \infty$ Parzen's criterion selects the true lag length approximately 44% of the time. The limiting probabilities that $m^* = M+j$ converge rapidly to zero as $j$ increases. Because the use of the CAT criterion results in a non-zero probability that $m^* = M$, there is a gain in asymptotic efficiency when using this criterion to estimate the DLM. Consider the estimator of an unconstrained distributed lag model of unknown order called Hannan inefficient (HI), (7). To get consistent estimates of the coefficients using HI, the fitted distributed lag $m(T)$ must be expanded to infinity with sample size, while the ratio of $m(T)$ over $T$ goes to zero (17, p. 304). The terminology "Hannan inefficient" stems from the fact that the lag distribution must be expanded indefinitely in both directions, so there is no way of incorporating prior information on the lag distribution (one-sidedness or known lag length) into the estimation procedure. Our analysis is comparable to the HI procedure if model 1.3 is amended to include $M$ future $x$'s. None of the preceding analysis (propositions 1-3) is affected if we fit symmetric two-sided lag distributions of order $m(T)$. Once the random variables $\hat\sigma_j^2$, $j = 1, \ldots, m(T)$ are redefined as the error sum of squares from a two sided distributed lag with $j$ leads and lags divided by $T-2j-1$, the previous results follow with appropriate changes in degrees of freedom. The salient feature of these results is that $\lim_{T\to\infty}\mathrm{Prob}(m^* = M) > 0$. Also, since the limiting probability that $m^* = M+j$ goes to zero as $j$ gets large, for $\delta > 0$ there exists some integer $M_0 = M_0(\delta)$ such that $\lim_{T\to\infty}\mathrm{Prob}(m^* > M_0) < \delta$. Hence there is a gain in asymptotic efficiency over HI when CAT is used to estimate the length of a symmetric two sided distributed lag model. As $T \to \infty$, the CAT criterion selects a finite lag length $m^* \geq M$ with non-zero probability.$^{14}$

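We conclude this section with a discussion of the large sample properties of $\hat\beta_{m^*} = (X_{m^*}'X_{m^*})^{-1}X_{m^*}'y$. We shall focus attention on the first $M+1$ components of $\hat\beta_{m^*}$.

Proposition 4: Let $\hat\beta_{m^*} = (\hat\beta_M', \hat\beta_{m^*-M}')'$. Then $\operatorname{plim}\hat\beta_M = \beta_M$.

Proof: By proposition 1 we need only consider those values of $m^*$ such that $m^* \geq M$. Let $\{x(t)\}$ denote the entire history of the $x$ process, $\{\ldots, x_{t-1}, x_t, x_{t+1}, \ldots\}$. Then

$$\lim E(\hat\beta_{m^*}\,|\,\{x(t)\}, m^*) = (X_{m^*}'X_{m^*})^{-1}X_{m^*}'X_M\beta_M = \begin{pmatrix}\beta_M \\ 0\end{pmatrix}, \tag{1.55}$$

where $\beta_M$ is $(M+1) \times 1$ and the zero vector is $(m^*-M) \times 1$, and

$$\lim \mathrm{Var}(\hat\beta_{m^*}\,|\,m^*, \{x(t)\}) = \sigma^2(X_{m^*}'X_{m^*})^{-1}.$$

The $(M+1) \times (M+1)$ upper left hand corner block of $\sigma^2(X_{m^*}'X_{m^*})^{-1}$ is equal to $\sigma^2(X_M'P_{m^*-M}X_M)^{-1}$. Denote this matrix by $V(m^*)$. The expectation of $V(m^*)$ with respect to $m^*$ is equal to the variance of the first $M+1$ components of $\hat\beta_{m^*}$ conditioned on $\{x(t)\}$ alone:

$$E_{m^*}(V(m^*)\,|\,\{x(t)\}) = \sum_{m^*=0}^{M-1} V(m^*)f(m^*, T\,|\,\{x(t)\}) + \sum_{m^*=M}^{m(T)} V(m^*)f(m^*, T\,|\,\{x(t)\}), \tag{1.56}$$

where $f(m^*, T\,|\,\{x(t)\})$ is the pdf of $m^*$ given the $x$ process. The sample size is included as an argument of $f(\cdot)$ to re-emphasize the dependence of $m^*$ on $T$. Since $\operatorname{plim}_{T\to\infty} V(m^*)$ is a well defined matrix (see footnote 15 below), the first term on the right hand side of 1.56 converges to zero as $T \to \infty$ by proposition 1. The second term,

$$\sum_{m^*=M}^{m(T)} V(m^*)f(m^*, T\,|\,\{x(t)\}) = \sigma^2\sum_{m^*=M}^{m(T)} (X_M'P_{m^*-M}X_M)^{-1}f(m^*, T\,|\,\{x(t)\}) \leq \sigma^2(X_M'P_{m(T)-M}X_M)^{-1}\sum_{m^*=M}^{m(T)} f(m^*, T\,|\,\{x(t)\}), \tag{1.57}$$

where $\leq$ denotes matrix ordering. The matrix $P_{m(T)-M}$ is equal to $I - X_{m(T)-M}(X_{m(T)-M}'X_{m(T)-M})^{-1}X_{m(T)-M}'$. Observe that

$$\operatorname*{plim}_{T\to\infty}\sigma^2(X_M'P_{m(T)-M}X_M)^{-1}\sum_{m^*=M}^{m(T)} f(m^*, T\,|\,\{x(t)\}) = \lim_{T\to\infty}\left(\frac{\sigma^2}{T}\right)\operatorname*{plim}_{T\to\infty}\left(\frac{X_M'P_{m(T)-M}X_M}{T}\right)^{-1} \cdot 1 = 0,$$

since $\operatorname{plim}_{T\to\infty}\left(X_M'P_{m(T)-M}X_M/T\right)^{-1}$ is a well defined matrix.$^{15}$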

Therefore, the first $M+1$ components of $\hat\beta_{m^*}$ consistently estimate $\beta_M$. Conditional on $\{x(t)\}$ and a fixed $m^* \geq M$, the limiting distribution of $\sqrt{T}\left(\hat\beta_{m^*} - \begin{pmatrix}\beta_M \\ 0\end{pmatrix}\right)$ is normal with zero mean vector and covariance matrix $\sigma^2Q(m^*)^{-1}$. This is true because for fixed $m^* \geq M$, $X_{m^*}'\varepsilon/\sqrt{T} \sim N\left(0, \sigma^2(X_{m^*}'X_{m^*}/T)\right)$ and $\operatorname{plim}_{T\to\infty}\left(X_{m^*}'X_{m^*}/T\right)^{-1} = Q(m^*)^{-1}$. The innovation in our analysis has been the estimation of both $M$ and the distributed lag coefficients, but it is applied work that motivates this discussion of the conditional limiting distribution of $\hat\beta_{m^*}$ given $m^*$ and $\{x(t)\}$. Once $m^*$ has been selected using the CAT criterion, it is convenient for the researcher to act as if $m^*$ were fixed when performing hypothesis tests using coefficient estimates and their estimated covariance matrix. In the next section we examine the bias associated with conventional coefficient t-tests when CAT is used to select a DLM from a set of competing models of various lag lengths.

Although we have worked out the limiting probabilities that $m^* = M+i$, $i = 0, 1, \ldots, m(T)$ for a distributed lag regression model with an arbitrary zero mean covariance stationary $x$ process and normal independent disturbances, the problem of determining the limiting distribution of $\sqrt{T}(\hat\beta_{m^*} - \beta_{m^*})$ or some other function of $\hat\beta_{m^*}$ remains a difficult problem. The column dimension of $X_{m^*}$ is a random variable with no upper bound, and "unconventional" central limit theorems are required to examine the limiting distribution of $(1/\sqrt{T})X_{m^*}'\varepsilon$. This is true even if one restricts attention to $\hat\beta_M = (X_M'P_{m^*-M}X_M)^{-1}X_M'P_{m^*-M}y$, the first $M+1$ components of $\hat\beta_{m^*}$. The paper by Sims (17) discusses the problems associated with infinite dimensional parameter spaces; we shall not pursue the subject further.

3. A Monte Carlo study of the use of the CAT criterion to select the order of the distributed lag model.

In this section we report results from a series of Monte Carlo experiments in which the CAT criterion was used to select the length of the coefficient lag distribution in the regression model 1.3 of section one,

$$y(t) = \sum_{s=0}^{M}\beta(s)x(t-s) + \varepsilon(t), \quad t = 1, \ldots, T,$$

when $M$ is an unknown nonnegative integer. The explanatory variable $x(t)$ was generated by the covariance stationary process

$$(1 - .8L)^2x(t) = \varepsilon^*(t), \quad t = 1, \ldots, T, \tag{1.59}$$

where $L$ denotes the lag operator. Both $\varepsilon^*(t)$ and $\varepsilon(t)$ are "pseudo-random" standard normal variates independent of one another. This particular parameterization for the $x(t)$ process was chosen since the autocovariance function of $x(t)$ closely resembles that of a typical U.S. time series. For each experiment we chose samples of size 50, 100, and 200, and one of the following lag distributions:

a) No lag distribution; $\beta(0) = 1.0$ and $\beta(i) = 0$, $i \neq 0$.

b) Linear decay; $\beta(i) = 1.0$, $0 \leq i \leq 4$ and $\beta(i) = 1.0 - .1(i-4)$, $5 \leq i \leq 13$. (1.60)

c) Box; $\beta(0) = .5 = \beta(6)$, $\beta(i) = .8$, $1 \leq i \leq 5$.

d) Infinite geometric; $\beta(i) = .8^i$, $i = 0, 1, \ldots$.$^{18}$
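The experimental design above can be sketched as a small data generator. The lag weights and the AR(2) recursion below follow one reading of the formulas in this section, $(1 - .8L)^2 x(t) = \varepsilon^*(t)$, that is $x(t) = 1.6x(t-1) - .64x(t-2) + \varepsilon^*(t)$; the burn-in length, the truncation point for the geometric lag, the seed handling, and the function names are all assumptions made for illustration.

```python
import random

def lag_weights(kind, truncation=60):
    """Lag distributions (a)-(d); the geometric case is truncated for simulation."""
    if kind == "none":
        return [1.0]
    if kind == "linear_decay":
        return [1.0] * 5 + [1.0 - 0.1 * (i - 4) for i in range(5, 14)]
    if kind == "box":
        return [0.5] + [0.8] * 5 + [0.5]
    if kind == "geometric":
        return [0.8 ** i for i in range(truncation)]
    raise ValueError("unknown lag distribution: " + kind)

def simulate(beta, T, burn_in=200, seed=0):
    """Return (x, y), each of length T, for lag weights beta."""
    rng = random.Random(seed)
    M = len(beta) - 1
    x = [0.0, 0.0]
    for _ in range(burn_in + T + M):
        # (1 - .8L)^2 x(t) = e*(t) written as an AR(2) recursion.
        x.append(1.6 * x[-1] - 0.64 * x[-2] + rng.gauss(0.0, 1.0))
    x = x[-(T + M):]  # keep M presample values so every y(t) has all its lags
    y = [sum(b * x[t - s] for s, b in enumerate(beta)) + rng.gauss(0.0, 1.0)
         for t in range(M, M + T)]
    return x[M:], y
```

A replication of the experiment would then fit lags of order 0 through the maximum order to `(x, y)` and record the order that minimizes CAT.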

For each sample size the maximum order of the fitted distributed lag models, $m(T)$, was chosen as an increasing function of $T$: when $T = 50$, distributed lags of order 0-12 were fit; for $T = 100$, 0-18; and for $T = 200$, 0-27. For each replication and sample size the CAT criterion estimated a particular lag distribution. After 100 replications the following summary statistics (i-viii) were calculated.

(i) The average order of the lag distribution that was selected by minimizing CAT(i), $i = 0, 1, \ldots, m(T)$, denoted $\bar{m}^*(T)$. Let $m^*(T,k)$ denote the order chosen for sample size $T$ and replication $k$. The average order $\bar{m}^*(T)$ is given by

$$\bar{m}^*(T) = \frac{1}{100}\sum_{k=1}^{100} m^*(T,k), \quad T = 50, 100, 200. \tag{1.61}$$

We include this statistic in the analysis since it gives a general indication of the performance of CAT as sample size increases. For example, when the CAT criterion is applied to the finite lag distributions (a), (b), (c), we would expect the average $m^*$ to coincide more closely with the population lag length $M$, the larger is the sample size. When CAT is used to estimate the lag distribution which is of infinite length, lag distribution (d), we would expect the average $\bar{m}^*(T)$ to be an increasing function of sample size.

(ii) The average estimate of σ², the variance of the disturbance term (σ² = Var(ε(t)) = 1), denoted σ̂²(T). Let σ̂²(T,k) denote the estimate of σ² for sample size T and replication k. Then σ̂²(T) is given by

\[ \hat{\sigma}^{2}(T) = \frac{1}{100} \sum_{k=1}^{100} \hat{\sigma}^{2}(T,k), \qquad T = 50, 100, 200. \tag{1.62} \]

This statistic is important since the residual variance is one component of the estimated variance of the coefficient estimates. If σ̂² is biased downwards (upwards), we would expect coefficient t statistics to be too large (small) on average, provided the coefficient estimator is unbiased. In any event we would expect bias in σ̂²(T) to diminish as sample size increases.

(iii) A Kolmogorov-Smirnov (K-S) statistic to test the null hypothesis that the sum of the estimated coefficients was equal to the sum of the true coefficients. Let CSUM(T,k) denote the sum of the coefficients from the estimated lag distribution of order m*(T,k) minus the sum of the population distributed lag coefficients, divided by the (T,k)-th standard error of the estimated sum. Assume that the CSUM(T,k) are independent and identically distributed normal variates with zero mean and unit variance, denoted N(0,1). Suppose CSUM(T,k) is re-indexed by CSUM(T,ℓ) so that the CSUM(T,ℓ), ℓ=1, ..., 100, form a monotone increasing sequence. Let Φ(CSUM(T,ℓ)) denote the cumulative distribution function (CDF) of a N(0,1) evaluated at CSUM(T,ℓ). The two-sided K-S(T), T=50, 100, 200, statistics reported below are equal to the maximum absolute difference between the sample and population CDF of CSUM(T,ℓ),

\[ \text{K-S}(T) = \max_{\ell = 1, \ldots, 100} \left| \frac{\ell}{100} - \Phi(\mathrm{CSUM}(T,\ell)) \right|, \qquad T = 50, 100, 200. \tag{1.63} \]

The null hypothesis that the estimated sum is equal to the true sum is rejected at the 5% (1%) significance level if the K-S(T) statistic exceeds .136 (.163).

An asterisk denotes acceptance of the null hypothesis at the 5% significance level.
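A minimal sketch of the K-S calculation in (1.63) follows. The function names are hypothetical. The critical-value helper assumes the usual large-sample coefficients 1.36 (5%) and 1.63 (1%) divided by the square root of the number of replications, which reproduces the cutoffs .136 and .163 quoted above for 100 replications.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ks_statistic(csum):
    """max over l of | l/n - Phi(CSUM_(l)) | for the ratios sorted in increasing order."""
    ordered = sorted(csum)
    n = len(ordered)
    return max(abs((l + 1) / n - normal_cdf(z)) for l, z in enumerate(ordered))


def critical_value(n, coefficient=1.36):
    """Large-sample two-sided cutoff; 1.36 for the 5% level, 1.63 for the 1% level."""
    return coefficient / math.sqrt(n)
```

The null hypothesis would be rejected at the 5% level when `ks_statistic(ratios) > critical_value(100)`, where `ratios` is the list of 100 standardized sums CSUM(T,k).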

We choose to report a K-S statistic for the sum of the estimated coefficients since for some distributed lag models this sum represents the cumulative or long run response of an endogenous variable to a once and for all change in an exogenous variable. The researcher may want to know if the use of the CAT criterion distorts this statistic. It should be noted that the same information concerning the distribution of the sum of the estimated coefficients could have been obtained from a standard t-statistic. Failure to do so was an oversight on the author's part.

(iv) The average bias of the coefficient estimates, denoted BIAS(i,T). Let β̂(i,T,k) denote the ith estimated coefficient, i=0, 1, ..., m*(T,k), for sample size T and replication k, and let β(i) denote the value of the ith population coefficient from (1.60) a-d. For T=50, 100, 200 define β*(i,T,k) as

\[ \beta^{*}(i,T,k) = \begin{cases} \hat{\beta}(i,T,k) & \text{if } i = 0, 1, \ldots, m^{*}(T,k), \text{ and} \\ 0 & \text{if } i = m^{*}(T,k)+1, \ldots, T^{.6}. \end{cases} \tag{1.64} \]

Then BIAS(i,T) is given by

\[ \mathrm{BIAS}(i,T) = \frac{1}{100} \sum_{k=1}^{100} \left[ \beta^{*}(i,T,k) - \beta(i) \right], \qquad i = 0, 1, \ldots, T^{.6}, \quad T = 50, 100, 200. \tag{1.65} \]

This statistic is important since it helps determine the reliability of coefficient point estimates when the lag distribution has been estimated using the CAT criterion.

(v) The number of times that coefficient (i,T) was included in the estimated lag distribution, denoted NTIME(i,T). Define the variable Q(i,T,k) as follows:

\[ Q(i,T,k) = \begin{cases} 1 & \text{if } i \le m^{*}(T,k), \text{ and} \\ 0 & \text{if } i = m^{*}(T,k)+1, \ldots, T^{.6}. \end{cases} \tag{1.66} \]

Then NTIME(i,T) is given by

\[ \mathrm{NTIME}(i,T) = \sum_{k=1}^{100} Q(i,T,k), \qquad i = 0, 1, \ldots, T^{.6}, \quad T = 50, 100, 200. \tag{1.67} \]

The values NTIME(i,T), i=0, 1, ..., T^{.6}, T=50, 100, 200, can be used to compute the sample frequencies that m*(T) = i, i=0, ..., T^{.6}, which can then be compared to the limiting probabilities that m* = i, i=0, ..., (M+14), that were calculated in section two, pages 18-26. The sample frequency that m*(T) = i for any sample size and lag distribution is given by

\[ \frac{\mathrm{NTIME}(i,T) - \mathrm{NTIME}(i+1,T)}{100} \ \text{ for } i = 0, \ldots, T^{.6}-1, \qquad \text{and} \qquad \frac{\mathrm{NTIME}(i,T)}{100} \ \text{ for } i = T^{.6}. \tag{1.68} \]

We expect there to be greater coincidence between the sample and limiting probabilities that m*(T) = i, i=0, 1, ..., T^{.6}, the larger the sample size.
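As an illustration of (1.68), the short sketch below converts a column of NTIME counts into sample frequencies of the selected order and recovers the average order from them. The function names are hypothetical; the counts are the NTIME(50,i) column reported in Table 2.2 for lag distribution (a), T=50.

```python
# NTIME(50, i), i = 0, ..., 12, from Table 2.2, lag distribution (a), T=50
ntime_a_50 = [100, 51, 45, 38, 34, 30, 25, 24, 21, 18, 16, 13, 7]


def order_frequencies(ntime, replications=100):
    """Sample frequency that m* = i: successive differences of NTIME, then the last count."""
    freqs = [(ntime[i] - ntime[i + 1]) / replications for i in range(len(ntime) - 1)]
    freqs.append(ntime[-1] / replications)
    return freqs


def average_order(ntime, replications=100):
    """Average selected order implied by the sample frequencies."""
    return sum(i * f for i, f in enumerate(order_frequencies(ntime, replications)))


freqs = order_frequencies(ntime_a_50)
print(round(freqs[0], 2))                   # frequency that m* = 0 -> 0.49
print(round(average_order(ntime_a_50), 2))  # -> 3.22
```

The implied average order, 3.22, agrees with the value of m*(50) reported in the general statistics of Table 2.2, and the frequency .49 for m* = M = 0 is consistent with the order being chosen correctly in roughly half the replications.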

(vi) The average mean square error of the estimated coefficients, denoted MSE(i,T):

\[ \mathrm{MSE}(i,T) = \frac{1}{100} \sum_{k=1}^{100} \left[ \beta^{*}(i,T,k) - \beta(i) \right]^{2}, \qquad i = 0, 1, \ldots, T^{.6}, \quad T = 50, 100, 200. \tag{1.69} \]

This statistic is useful since we will compare it to the average estimated variance of each coefficient, EVAR(i,T), i=0, ..., T^{.6}, T=50, 100, 200, which is described below.
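The zero-padding convention behind BIAS(i,T) and MSE(i,T) can be sketched as follows. The helper names and the toy inputs are hypothetical: each replication contributes its fitted coefficients for lags 0 through m*(T,k) and zeros for every longer lag up to the maximum order, and the population coefficients are padded (or truncated) the same way.

```python
def pad_to_max_order(coefficients, max_order):
    """Coefficients for lags 0..max_order, with zeros beyond the supplied lags."""
    kept = list(coefficients)[: max_order + 1]
    return kept + [0.0] * (max_order + 1 - len(kept))


def bias_and_mse(estimates, beta, max_order):
    """Average bias and mean square error by lag across replications.

    estimates: one list of fitted coefficients per replication (length m* + 1).
    beta: population coefficients, beginning at lag 0.
    """
    padded = [pad_to_max_order(b, max_order) for b in estimates]
    truth = pad_to_max_order(beta, max_order)
    n = len(padded)
    lags = range(max_order + 1)
    bias = [sum(p[i] - truth[i] for p in padded) / n for i in lags]
    mse = [sum((p[i] - truth[i]) ** 2 for p in padded) / n for i in lags]
    return bias, mse


# Two toy replications of lag distribution (a): orders 0 and 1 were selected.
bias, mse = bias_and_mse([[1.1], [0.9, 0.2]], beta=[1.0], max_order=2)
```

In the toy example the lag-1 coefficient is included in only one of the two replications, so its average bias is .2/2 = .1 and its mean square error is .04/2 = .02.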

(vii) The average estimated variance of coefficient (i,T), denoted EVAR(i,T). Let σ̂²(T,k)[(X(T,k)'X(T,k))^{-1}]_{ii} denote the diagonal elements of the estimated covariance matrix of the β̂(i,T,k), i=0, 1, ..., T^{.6}. Define

\[ \sigma^{*2}(T,k)\left[ (X^{*}(T,k)'X^{*}(T,k))^{-1} \right]_{ii} = \begin{cases} \hat{\sigma}^{2}(T,k)\left[ (X(T,k)'X(T,k))^{-1} \right]_{ii} & \text{for } i = 0, 1, \ldots, m^{*}(T,k), \text{ and} \\ 0 & \text{for } i = m^{*}(T,k)+1, \ldots, T^{.6}. \end{cases} \tag{1.70} \]

Then EVAR(i,T) is given by

\[ \mathrm{EVAR}(i,T) = \frac{1}{100} \sum_{k=1}^{100} \sigma^{*2}(T,k)\left[ (X^{*}(T,k)'X^{*}(T,k))^{-1} \right]_{ii}, \qquad i = 0, 1, \ldots, T^{.6}, \quad T = 50, 100, 200. \]

If BIAS(i,T) is small, we expect EVAR(i,T) and MSE(i,T) to be roughly the same. Should EVAR(i,T) be biased upwards (downwards), then coefficient t-tests will be too small (large). We examine the bias of coefficient hypothesis tests under the null hypothesis that β(i) = 0, i=0, 1, ..., T^{.6}, using F(i,T,k) described below.

(viii) The sample CDF of the ratio of each coefficient to its estimated standard error, assuming the ratio has a N(0,1) distribution. Define F(i,T,k) as

\[ F(i,T,k) = \begin{cases} \Phi\!\left( \hat{\beta}(i,T,k) \Big/ \left( \hat{\sigma}^{2}(T,k)\left[ (X(T,k)'X(T,k))^{-1} \right]_{ii} \right)^{1/2} \right) & \text{for } i = 0, 1, \ldots, m^{*}(T,k), \text{ and} \\ .5 & \text{for } i = m^{*}(T,k)+1, \ldots, T^{.6}. \end{cases} \tag{1.71} \]

If the CAT criterion selected an estimated lag distribution of order m*(T,k) < T^{.6}, then the (m*(T,k)+1) through (T^{.6})-th coefficients were arbitrarily assigned a cumulative probability of .5. The cumulative normal probabilities F(i,T,k) are found in the 13 columns of tables 2.3, 2.5, 2.7, and 2.9, a table for each lag distribution (1.60) a-d. For example, the entry corresponding to the column headed by 3 and the row associated with Lag 6 in Table 2.3 - Lag Distribution (a) - T=50, page 43, is 2. This means that F(6,50,k) is in the third probability cell; F(6,50,k) is less than .05 and greater than or equal to .025 for 2 out of the 100 replications for lag distribution (a), sample size 50, and the coefficient corresponding to the x variable lagged six periods. The null hypothesis that the ratio of an estimated coefficient to its estimated standard error is distributed as a N(0,1) is incorrect for the first M+1 coefficients of each lag distribution and correct for all others. Theoretically, the sample ratios of those coefficients whose population value is zero should be distributed across the columns of tables 2.3, 2.5, 2.7, and 2.9 in proportion to the cumulative probabilities at the head of each column. For example, given any lag distribution there should be approximately 30 replications in the first five columns in all rows where the population coefficient is zero, since the first five columns represent a cumulative probability of .30. We shall return to this point during the analysis of the Monte Carlo results.
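The assignment of each coefficient to a probability cell can be sketched as below. This is an illustration only: the cell boundaries are read from the column headings of tables 2.3, 2.5, 2.7, and 2.9, and treating column 7 as the point mass at .5 (reserved for coefficients that were not fitted), with six intervals on either side, is an inference from the 13-column layout rather than something the text states. The function names are hypothetical.

```python
import bisect
import math

# Lower boundaries of the twelve probability intervals (assumed from the table headings)
LOWER_EDGES = [0.0, 0.0125, 0.025, 0.05, 0.1, 0.3, 0.5, 0.7, 0.9, 0.95, 0.975, 0.9875]


def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_cell(t_ratio=None):
    """Column number (1-13) for one coefficient in one replication.

    t_ratio is the coefficient divided by its estimated standard error;
    None means the lag was beyond the selected order, so F = .5 by construction.
    """
    if t_ratio is None:
        return 7
    p = normal_cdf(t_ratio)
    interval = bisect.bisect_right(LOWER_EDGES, p)  # 1..12, cells closed on the left
    return interval if interval <= 6 else interval + 1


def tally(t_ratios):
    """Number of replications falling in each of the 13 columns."""
    counts = [0] * 13
    for t in t_ratios:
        counts[probability_cell(t) - 1] += 1
    return counts
```

Under this layout the first five columns cover probabilities from 0 to .30, which matches the statement that about 30 of 100 replications should fall there when the null hypothesis is true.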

Last, the statistics described in (i-vii) above are found in tables 2.2, 2.4, 2.6, and 2.8; these tables correspond to the four lag distributions (1.60) a-d. The first three statistics (i-iii) are found under the heading general statistics, while (iv-vii) are found under the heading coefficient statistics.

Table 2.2

Lag Distribution (a)

T=50

General Statistics

m*(50) = 3.22     σ̂²(50) = .8836     K-S(50) = .0974*

Coefficient Statistics

BIAS(50,i)    NTIME(50,i)    MSE(50,i)    EVAR(50,i)    LAG-i
  .0281          100          .0264        .0116          0
 -.0040           51          .0755        .0332          1
  .0370           45          .0634        .0316          2
 -.00898          38          .0701        .0282          3
 -.0207           34          .0412        .0251          4
  .0135           30          .0439        .0211          5
  .0105           25          .0394        .0194          6
 -.0135           24          .0378        .0181          7
  .00456          21          .0590        .0162          8
 -.00225          18          .0466        .0140          9
  .00461          16          .0354        .0115         10
 -.000409         13          .0264        .00640        11
 -.000504          7          .00856       .00152        12

Table 2.2 cont.

Lag Distribution (a)

T=100

General Statistics

m*(100) = 3.07     σ̂²(100) = .9476     K-S(100) = .1013*

Coefficient Statistics

BIAS(100,i)   NTIME(100,i)   MSE(100,i)   EVAR(100,i)   LAG-i
  .0178          100          .00902       .00543         0
 -.0213           56          .0254        .0157          1
  .0122           41          .0247        .0101          2
  .0104           34          .0273        .0120          3
 -.0274           28          .0199        .0107          4
  .0177           25          .0169        .00949         5
  .00356          21          .0180        .00845         6
 -.0211           18          .0154        .00760         7
  .0136           15          .0212        .00694         8
 -.0000761        13          .0146        .00627         9
 -.00212          12          .00755       .00485        10
  .000331          8          .00678       .00433        11
  .0111            8          .00570       .00431        12
 -.0121            8          .00753       .00399        13
 -.00256           7          .00921       .00304        14
  .0114            5          .00984       .00223        15
 -.00753           4          .00440       .00163        16
  .00233           3          .000761      .000705       17
  .000430          1          .0000185     .000110       18

Table 2.2 cont.

Lag Distribution (a)

T=200

General Statistics

m*(200) = 3.11     σ̂²(200) = .9747     K-S(200) = .0778*

Coefficient Statistics

BIAS(200,i)   NTIME(200,i)   MSE(200,i)   EVAR(200,i)   LAG-i
  .00856         100          .00372       .00230         0
 -.00993          51          .0134        .00656         1
  .00718          36          .0142        .00632         2
  .00480          32          .0116        .00524         3
 -.0137           26          .00952       .00425         4
  .000530         21          .00623       .00356         5
  .00488          17          .00641       .00317         6
  .00207          16          .00433       .00264         7
 -.00140          12          .00426       .00233         8
  .00399          11          .00124       .00218         9
 -.00667          10          .00239       .00212        10
  .00528          10          .00331       .00202        11
  .000179          9          .00259       .00196        12
 -.00381           9          .00290       .00182        13
  .0116            8          .00629       .00168        14
 -.00910           8          .00289       .00128        15
  .000691          5          .000579      .00111        16
 -.00411           5          .000949      .00109        17
  .00780           5          .00253       .000958       18
 -.00371           4          .00118       .000913       19
  .00430           4          .00174       .000860       20
 -.00364           4          .00104       .000621       21
 -.00147           3          .000127      .000402       22
  .00101           2          .000186      .000330       23
  .000744          2          .0000993     .000210       24
  .0000909         1          .000000826   .000504       25
    -              0            -            -           26
    -              0            -            -           27

Table 2.3 - Lag Distribution (a)

T=50, T=100, T=200

Number of replications in each probability cell

[Table body (13 probability-cell columns by lag) is not legible in the scanned source.]

Table 2.4

Lag Distribution (b)

T=50

General Statistics

m*(50) = 11.17     σ̂²(50) = .9678     K-S(50) = .0645*

Coefficient Statistics

BIAS(50,i)    NTIME(50,i)    MSE(50,i)    EVAR(50,i)    LAG-i
 -.000726        100          .0413        .0386          0
  .00821         100          .143         .129           1
 -.00368         100          .169         .144           2
  .0289          100          .126         .142           3
 -.0309          100          .101         .142           4
 -.00246         100          .141         .141           5
  .00548         100          .150         .140           6
 -.0107          100          .152         .142           7
  .0255          100          .194         .143           8
 -.0403          100          .224         .137           9
  .0436           98          .193         .109          10
 -.0143           74          .139         .0655         11
 -.0151           45          .0394        .0170         12

Table 2.4 cont.

Lag Distribution (b)

T=100

General Statistics

m*(100) = 12.65     σ̂²(100) = .9709     K-S(100) = .0705*

Coefficient Statistics

BIAS(100,i)   NTIME(100,i)   MSE(100,i)   EVAR(100,i)   LAG-i
 -.0124          100          .0125        .0134          0
  .03[?]2        100          .0507        .0472          1
 -.0367          100          .0431        .0531          2
  .0321          100          .0483        .0533          3
 -.0130          100          .0659        .0533          4
 -.000332        100          .0626        .0535          5
 -.0162          100          .0655        .0533          6
  .0141          100          .0575        .0533          7
  .00265         100          .0550        .0534          8
  .0164          100          .0636        .0535          9
 -.0225          100          .0507        .0509         10
  .00419          99          .0632        .0366         11
 -.0150           63          .0506        .0229         12
  .0265           39          .0312        .0150         13
 -.00234          28          .0197        .00899        14
 -.0134           16          .0100        .00532        15
 -.00139          10          .00757       .00335        16
  .00635           6          .00759       .00196        17
 -.00189           4          .00119       .000479       18

Table 2.4 cont.

Lag Distribution (b)

T=200

General Statistics

m*(200) = 13.60     σ̂²(200) = .9726     K-S(200) = .0917*

Coefficient Statistics

BIAS(200,i)   NTIME(200,i)   MSE(200,i)   EVAR(200,i)   LAG-i
 -.00547         100          .00613       .00578         0
  .0158          100          .0178        .0205          1
 -.0287          100          .0180        .0229          2
  .0323          100          .0255        .0229          3
 -.0205          100          .0246        .0229          4
 -.000843        100          .0220        .0228          5
  .00678         100          .0242        .0228          6
  .0681          100          .0207        .0228          7
 -.0135          100          .0201        .0229          8
  .00826         100          .0251        .0230          9
 -.000897        100          .0242        .0225         10
 -.00953         100          .0344        .0185         11
  .00990          78          .0234        .0121         12
  .00232          47          .0128        .00762        13
 -.00654          31          .0178        .00522        14
 -.000163         23          .0117        .00393        15
 -.000289         18          .00420       .00304        16
  .00124          14          .00436       .00240        17
 -.00226          11          .00507       .00201        18
  .00577          10          .00408       .00153        19
 -.00342           7          .00378       .00120        20
  .000717          6          .00309       .00100        21
 -.000182          5          .00150       .000814       22
  .000406          4          .00116       .000602       23
  .000904          3          .00108       .000424       24
 -.00106           2          .000474      .000230       25
 -.00128           1          .000164      .000230       26
    -              0            -            -           27

Table 2.5 - Lag Distribution (b)

T=50, T=100, T=200

Number of replications in each probability cell

[Table body (13 probability-cell columns by lag) is not legible in the scanned source.]

Table 2.6

Lag Distribution (c)

T=50

General Statistics

m*(50) = 7.52     σ̂²(50) = .9232     K-S(50) = .0752*

Coefficient Statistics

BIAS(50,i)    NTIME(50,i)    MSE(50,i)    EVAR(50,i)    LAG-i
 -.0237          100          .0357        .0290          0
  .0273          100          .116         .0951          1
  .0133          100          .127         .106           2
 -.0319          100          .0839        .106           3
  .0445          100          .0936        .106           4
 -.0181          100          .138         .0966          5
 -.0541           96          .121         .0635          6
  .0238           51          .0999        .0393          7
  .00714          36          .0580        .0281          8
  .0149           27          .0586        .0198          9
 -.0150           19          .0398        .0143         10
 -.00698          15          .0324        .00928        11
  .00625           8          .0102        .00253        12

m*(100) = 8.18     σ̂²(100) = .9665     K-S(100) = .0581*

-.0206 0363 00869

~.0246

-.0137 0279

~.0328 0305

-.0204 0157

-.0101 0100

-.00350

-.00918 0149

~. 0117 00518 000177

-.000429

Table 2.6 cont":

Lag Distribution (c)

T=100

General Statistics .

Coefficient Statistics

BIAS (100, i) NTIME(100,i) MSE(100,i) EVAR(100,i) LAG~i

100 100 100 100

.0127 -0458 0535 0431 0476 0517 .0332 . 0332 .0208 .0158 -0232 .0238 0147 .0117 .0132 00826. 00595 . 00286 -00135

0119 -0415 -0466 -0465 -0465 -0439 -0269 -0159 -0122 -0100 -00911 - 00798 -00737 -00628 -00518 -00376 - 00201 -000941 -000225

5T

WOON KDE WHE O

H °o

11

a ee ee

m*(200) = 7.86     σ̂²(200) = .9810     K-S(200) = .0801*

Coefficient Statistics

Table 2.6 cont..

Lag Distribution (c)}

T=200

General Statistics

“52

BIAS(200,i) _NTIME(200,i) __MSE(200,i) _EVAR(200,4) __LAG-i

--0104 -0143 -00793

-.0124

-.00273 -0164

-.0227 -00956 - 00296 -00129

-.000682

-.0118 -O111

-.00553 -00643

~. 00628 -00405

~.00392 -00138 -00339

- .00392 -00292

-.00216 -000 766

-.000612 - 000992

--000362

100 100 100 100 100 100 100

OPP EPEHPEPNWWRERYUOW

-00516 -0211 -0282 -0220 -0191 -0203 -0170 -0132 -00592 -00469 -00858 -00741 -00507 -00399 -00393

§.00225

- 00113 -000485 -000699 -000685 -00134 -00173 -000466 -00005 86

«0000374

-0000984 -0000131

-00540 .0189 .0210 -0211 .0121 -0200 -0124 .00712 -00479 . 00366 .00305 00245 .00200 .00173 .00150 -00109 . 000947 -000923 -000791 .000716 .000548 - 000323 .000260 -000257 -000257 - 000228 -0000648°

FOUWUMAN AUF WNE O

Table 2.7 - Lag Distribution (c)

T=50, T=100, T=200

Number of replications in each probability cell

[Table body (13 probability-cell columns by lag) is not legible in the scanned source.]

-.110 -0718 -0624

-.0782 0232

~.0312 - 0386

~-.0336 -0125

-.0611 - 0380

-.128 +267

Coefficient Statistics

BIAS(50,i) __NTIME(50,i) _ MSE(50,i) _EVAR(50,i) _LAG-1

100 100 100 100 100 100 100 100 100

Table 2.8

Lag Distribution (d)

T=50

General Statistics

-0555 -114 -139 -144 | ~157 - 163 -171 -202 -189 - 166 - 166 -200 -140

m*(50) = 11.53     σ̂²(50) = 1.280     K-S(50) = .7004

56

G-i 0518 0 «171 1 193 2 195 3 -194 4 ~193 5 191 6 191 7 187 8 183 9 L71 10. 129 11

0344 12

m*(100) = 16.43     σ̂²(100) = 1.024     K-S(100) = .4030

BIAS(100,i)

Coefficient Statistics

Table 2.8 cont.

Lag Distribution (d)

T=100

General Statistics

NTIME(100,i) | MSE(100,i) |

EVAR(100, i)

57

LAG-i

oe eo on ne cn

~-.0205 - 00888 -0296 -.00849 -.0189 - 00353 -0131 -.0133 -.00213 -00221 -00520 -.00469 -.0177 -0153 -00530 -00225 -.0259 -0147 -0370

100 100 100 100 100 100 100 100 100 100

~ 100

100 100

-0155 . 20384 .0566 -0527 -0548 .0527 -0586 _ 0710 - 0667. .0625 .0611 .0631 -0553 -0497 . 0531 -0502 -0435_ -.0344 .0144

-0158 -0538 -0600 0601

. 0601

-0602 -0603

+0603

-0602

~.0596

-0597 0591 -0576 -0537 0444 -0334 -0194 -00444

WWDUNDUMNEWNHEHO

m*(200) = 20.75     σ̂²(200) = .9959     K-S(200) = .2566

0138 0181 00589 00831 00143 .00269 00505 00406 .0113 00942 00944 0169 0105 00704, 00173 00906 0143 0338 0243 0209 00869 00124 00750 0116 000980 0123 000397 000952

Table 2.8 cont.

Lag Distribution (d)

T=200

General Statistics

Coefficient Statistics

100 100 100 100 100 100 100 100 100 100 100 100 100 100 100

-00676 0251 -0302 -0221 -0234 -0238 -0277 -0315 0251 -0241 -0268 -0247 -0212 -0270 - 0296 0271 - 0243 -0219 -0214 .0193 -0162 -0167 - 00970 -00959 -0101 -00641 00744 -00327

-00620 -0216 -0241 0242 - 0242 -0241 -0241 -0241 0242 0243 -0242 -0242 +0242 -0241 -0240 -0234 - 0224 -0203 -0170 - 0137 -0113 -00916 - 00746 -00600 -00494 - 00387 -00271 - 000710

58

BIAS (200, i) NTIME(200,i) |MSE(200,i) EVAR(200,i) LAG-i

Table 2.9 - Lag Distribution (d)

T=50, T=100, T=200

Number of replications in each probability cell

[Table body (13 probability-cell columns by lag) is not legible in the scanned source.]

For each replication let m* denote the lag length selected by the CAT criterion and let M denote the true lag length. The results of the Monte Carlo experiments indicate that when the CAT criterion is applied to the truncated lag distributions a and c, m* is greater than or equal to M except for 4 replications of lag distribution c, sample size 50, and m* is equal to M for approximately half the replications. The sample probabilities that m* = M+i, i=0, 1, ..., (T^{.6} - 1), roughly correspond to the limiting probabilities that m* = M+i, i=0, 1, ..., 14, which were tabulated in section two, page 26. As one would expect, there is greater coincidence between the sample and limiting probabilities the larger the sample size.

The coefficients in lag distribution b decline linearly from one to zero in steps of .10. Coefficients whose population value is close to zero are frequently excluded from the lag distribution chosen by the CAT criterion, although the frequency of replications where m* is less than M declines as sample size increases. A similar result holds for the infinite geometric lag distribution d. In this experiment, the average m* is an increasing function of the sample size. Although we have not analyzed the large sample properties of the CAT criterion when the population distributed lag is infinite, the results of experiment d indicate that the bias of the estimated coefficients is small despite the specification error.

For experiments a, b, and c the null hypothesis that the sum of the estimated coefficients is equal to the sum of the population coefficients is easily accepted at a 5% significance level. This hypothesis is rejected for experiment d (all sample sizes) at any reasonable significance level. The latter result is not surprising, as the fitted lag distributions are too short. For sample sizes of 50, 100, and 200, the average m* is equal to 11.5, 16.4, and 20.8, and the difference between the sum of the experiment d population distributed lag coefficients (5.0) and the sum of the first 11, 16, and 21 population coefficients is .43, .14, and .05. This analysis suggests that the majority of the ratios (the sum of the estimated coefficients minus the actual sum, divided by the standard error of the sum) used in the calculation of the Kolmogorov-Smirnov statistics are positive. So it is unlikely that these sample ratios for experiment d are realizations from a N(0,1) population, and hence the null hypothesis is rejected.
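The omitted coefficient mass quoted above can be checked directly from the geometric lag distribution β(i) = .8^i, whose coefficients sum to 1/(1 - .8) = 5.0. The sketch below uses a hypothetical helper name and is only an arithmetic check.

```python
def omitted_mass(n_included, ratio=0.8):
    """Sum of all geometric lag coefficients minus the sum of the first n_included."""
    total = 1.0 / (1.0 - ratio)
    included = sum(ratio ** i for i in range(n_included))
    return total - included


for n in (11, 16, 21):
    print(n, round(omitted_mass(n), 2))
# 11 0.43
# 16 0.14
# 21 0.05
```

Equivalently, the omitted mass after n coefficients is 5(.8)^n, which gives the same three values.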

There is a discrepancy between the mean square error (MSE) of the estimated coefficients for all experiments and sample sizes and the average estimated variance of the coefficients, EVAR. This discrepancy has two components. First, one can only expect the approximate equality of MSE and EVAR when the coefficient bias is zero. The average bias of the estimated coefficients over all experiments is small, so the coefficient bias explains only a part of the MSE-EVAR disparity. Second, the average estimate of σ², the variance of the disturbance term, is biased downwards for experiments a, b, and c and upwards for experiment d, except for experiment d, T=200, where σ̂² is essentially unbiased. In all cases the bias tends to zero as sample size increases. The direction of the σ̂² bias is as expected since the average m* exceeds M for the finite distributed lag models a, b, and c, and m* is always too short for the infinite distributed lag model d. For those coefficients which are always included in the lag distribution selected by the CAT criterion, experiments a-c, EVAR is usually less than MSE. The reverse is true for experiment d except for the case T=200, when EVAR and MSE roughly coincide. These results reflect the direction of the σ̂² bias. Last, for those coefficients which are frequently omitted from the lag distribution selected by the CAT criterion, in all experiments, there is greater disparity between EVAR and MSE the smaller is NTIME.

We now turn our attention to the analysis of tables 2.3, 2.5, 2.7, and 2.9. As was stated earlier, the null hypothesis that the ratio of an estimated coefficient to its estimated standard error is distributed as a N(0,1) is incorrect for the first M+1 coefficients of each lag distribution and correct for all others. When the null hypothesis is true, one would expect to find a distribution of the F(i,T,k) across the columns of tables 2.3, 2.5, 2.7, and 2.9 in proportion to the probabilities at the head of each column. Since a different order lag distribution is selected each replication, and since those coefficients not included in the fitted model are assigned a cumulative probability of .5, we cannot expect the F(i,T,k) to be distributed across the columns of these tables in the manner described above, except for those coefficients just beyond the end of the true lag distribution which are frequently included in the fitted model, but whose population value is zero. Despite the drawbacks of this analysis, it is still possible to make general statements concerning the type I and type II error probabilities associated with the maintained hypothesis β(i) = 0, for all i, when the CAT criterion selects the order of the estimated distributed lag model.

A review of tables 2.3, 2.5, 2.7, and 2.9 indicates that for the first M+1 coefficients, the probability of a type II error (accepting the hypothesis β(i) = 0, i=0, ..., M, when it is false) decreases with sample size and increases as one moves closer to the end of the true lag distribution. To see this, note that the number of replications in the last column of tables 2.5, 2.7, and 2.9 for the first M+1 coefficients increases (to a maximum of 100) as sample size increases, but the number of replications in the last column decreases as one moves closer to the end of the true lag distribution. This generality does not apply to the first experiment, lag distribution a (M=0), since in this case there are 100 replications in the first row and last column of table 2.3 for all sample sizes. For those estimated coefficients whose population value is zero, the probability of a type I error (rejection of the null hypothesis β(i) = 0, i=M+1, ..., when it is true) tends to decline as sample size increases, and as one moves further away from the end of the true lag distribution. To see this, note that the number of replications in column 7 of tables 2.3, 2.5, 2.7, and 2.9 for the M+1 through (T^{.6})-th coefficients increases with sample size, and is larger (with a maximum of 100) for those coefficients corresponding to the longest lags. In summation, it is only for those estimated coefficients in a band around the true lag length M that the coefficient hypothesis tests tend to be biased, and it is quite likely that the EVAR-MSE discrepancy is a principal source of this bias.

The results of this section constitute strong evidence for the use of the CAT criterion to estimate the order and the coefficients of the distributed lag model 1.3. Using moderate size samples we have found corroborative evidence for the limiting probabilities that m* = M+i, i = 0, 1, ..., M, derived in section two. Despite the similarity of the CAT criterion to regression strategies (18, pp. 603-606), coefficient t-statistics (except for those coefficients in a band around the true lag length M) are not biased by the use of the CAT selection procedure. But until we derive the large sample properties of Parzen's criterion for the case of infinite lag distributions, the applicability of the CAT criterion is limited to circumstances where the researcher has a priori knowledge that the lag distribution is finite.
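The selection rule discussed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the program used for the experiments: it assumes Parzen's autoregressive form of the criterion carried over to the lag regression, CAT(m) = (1/T) Σ_{j=0}^{m} 1/s²(j) − 1/s²(m), with s²(j) the degrees-of-freedom-adjusted residual variance of the lag-j fit. The function name `cat_lag_order`, the seed, and the simulated lag distribution (1, 0.8, 0.5) are hypothetical.

```python
import numpy as np

def cat_lag_order(y, x, max_lag):
    """Return (m_star, cat) for lag lengths m = 0..max_lag.

    Assumed criterion: CAT(m) = (1/T) * sum_{j=0}^{m} 1/s2[j] - 1/s2[m],
    where s2[j] is the adjusted residual variance of the lag-j regression.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(y)
    T = n - max_lag                      # common estimation sample for every m
    y_eff = y[max_lag:]
    # Column k holds x(t-k) for t = max_lag, ..., n-1.
    lags = np.column_stack([x[max_lag - k: n - k] for k in range(max_lag + 1)])
    s2 = np.empty(max_lag + 1)
    for m in range(max_lag + 1):
        X = lags[:, : m + 1]
        beta, *_ = np.linalg.lstsq(X, y_eff, rcond=None)
        resid = y_eff - X @ beta
        s2[m] = resid @ resid / (T - (m + 1))
    cat = np.cumsum(1.0 / s2) / T - 1.0 / s2
    return int(np.argmin(cat)), cat

# Hypothetical experiment: finite lag distribution with true length M = 2.
rng = np.random.default_rng(0)
n = 400
true_beta = np.array([1.0, 0.8, 0.5])
x = rng.standard_normal(n)
y = np.convolve(x, true_beta)[:n] + rng.standard_normal(n)
m_star, cat = cat_lag_order(y, x, max_lag=8)
print("selected lag length:", m_star)
```

All candidate lag lengths are fitted on the same trimmed sample so that the residual variances are comparable; that is a design choice of this sketch rather than something stated in the text.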

Footnotes

1 See Kmenta (10), pages 282-294, or Johnston (9), pages 259-265, for a discussion of these procedures.

2 Sims (17) provides an excellent survey of the various estimators of the distributed lag model. He shows (pp. 305-308, 326-329) that there exists a sequence of m's converging to infinity with T, $m/T \to 0$, so that ordinary least squares, feasible generalized least squares, and Hannan inefficient (7) estimators of the DLM all have the same asymptotic distribution.

3 In spectral theory, see Hannan (8, pp. 273-288) for a discussion of the relationship between expanding parameterizations and sample size. Amemiya (3) does not provide guidelines for choosing the order of the residual autoregression as a function of the sample size, but the ideas presented in Hannan are still applicable.

4 Theoretically, the CAT criterion selects the order of an approximating autoregressive process which minimizes the one step ahead mean square prediction error. In (12) a set of AR models is estimated using monthly economic data (1960-1974) to see if the CAT criterion selects the appropriate order of an AR process. These experiments are inconclusive, but they suggest that the CAT decision rule does not always select AR models that minimize mean square prediction error.

5 See Hannan (8), pages 204-220, especially theorem 6.

6 The matrix result $A > B > 0 \Rightarrow B^{-1} > A^{-1}$ is well known; see Goldberger (6, p. 38).

7 Anderson shows that $T^{-1/2} X_m' u$ has a limiting normal distribution with zero mean vector and covariance matrix $\sigma^2 Q(m)$, given assumptions that are satisfied by our covariance stationary x process. His theorem 2.6.1 (4, pp. 23-24) implies that

$$\operatorname{plim}_{T \to \infty} T^{-1} X_m' u = 0 \quad \text{for } m \text{ fixed.}$$

8 See Theil (18, p. 380). When m < M, $\operatorname{plim}_{T \to \infty} \hat{\sigma}_m^2$ is strictly greater than $\sigma^2$ because B(M) ≠ 0 and

[matrix expression illegible in the source]

and the lower right hand block of the limit matrix has full rank M-m.

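9 Lemma 1: Let s(t) be a sequence of random variables, s(t) ≥ 0 for all t, E(s(t)) < ∞ for all t. If $\lim_{t \to \infty} E(s(t)) = 0$, then $\operatorname{plim}_{t \to \infty} s(t) = 0$.

Proof: Let δ > 0, and let f(s(t)) denote the probability density function (pdf) of s(t). Then

$$E(s(t)) = \int_0^{\delta} s(t) f(s(t))\, ds(t) + \int_{\delta}^{\infty} s(t) f(s(t))\, ds(t),$$

$$E(s(t)) \ge \delta \int_{\delta}^{\infty} f(s(t))\, ds(t),$$

$$\frac{1}{\delta} E(s(t)) \ge \int_{\delta}^{\infty} f(s(t))\, ds(t) = \text{Prob}(s(t) > \delta).$$

Consequently

$$\frac{1}{\delta} \lim_{t \to \infty} E(s(t)) \ge \lim_{t \to \infty} \text{Prob}(s(t) > \delta) = \lim_{t \to \infty} \text{Prob}(|s(t) - 0| > \delta),$$

thus $0 \ge \lim_{t \to \infty} \text{Prob}(|s(t) - 0| > \delta)$, that is: $\operatorname{plim}_{t \to \infty} s(t) = 0$.

10 If a function $m^{\circ}(T)$ had been chosen so that $m^{\circ}(T) \to \infty$ as $T \to \infty$ while $\lim_{T \to \infty} m^{\circ}(T)/T > 0$ (for example $m^{\circ}(T) = b \cdot T$, 0 < b < 1), then the arguments presented on pages 16-17 cannot be used to show $\operatorname{plim}_{T \to \infty} \text{CAT}(m^{\circ}(T)) = -\sigma^{-2}$. This is so because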

$$\lim_{T \to \infty} E(s_m) = \sigma^{-2} b,$$

and the conditions for lemma 1 of footnote 9 are no longer satisfied.
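The bound behind lemma 1 of footnote 9, Prob(s(t) > δ) ≤ E(s(t))/δ, can be illustrated numerically. The sketch below is not taken from the paper: the sequence s(t) = Z²/t with Z standard normal (so that E(s(t)) = 1/t), the value of δ, the seed, and the helper name `tail_and_bound` are all assumptions chosen for illustration.

```python
import numpy as np

def tail_and_bound(samples, delta):
    """Return (empirical Prob(s > delta), empirical bound E(s)/delta)."""
    samples = np.asarray(samples, dtype=float)
    return float(np.mean(samples > delta)), float(samples.mean() / delta)

rng = np.random.default_rng(1)
delta = 0.05
results = {}
for t in (1, 10, 100, 1000):
    # s(t) >= 0 by construction, and E(s(t)) = 1/t goes to zero as t grows.
    s = rng.standard_normal(20_000) ** 2 / t
    results[t] = tail_and_bound(s, delta)

for t, (tail, bound) in results.items():
    print(t, tail, bound)
```

As the expectation shrinks, the bound forces the tail probability toward zero, which is the content of the lemma. When the expectation converges to a nonzero limit, as in footnote 10, the bound no longer delivers that conclusion.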

11 I am indebted to John Geweke, my thesis advisor, for his help in deriving the results on pages 19-25.

12 For n events $A_1, \ldots, A_n$ one uses the definition of conditional probability and an inductive argument to show

$$P\Big(\bigcap_{i=1}^{n} A_i\Big) = P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_1 \cap A_2) \cdots P\Big(A_n \,\Big|\, \bigcap_{i=1}^{n-1} A_i\Big).$$

There are n! such formulae.

13 The mechanics on pages 20-22 can also be used to analyze the large sample properties of the residual variance model fitting criterion, Theil (18, pp. 543-5): the expression $T(\hat{\sigma}^2_{M+k} - \hat{\sigma}^2_M)$ converges in distribution to $\sigma^2 (k - \chi^2(k))$. Then

$$\lim_{T \to \infty} \text{Prob}(m^* = M) = \prod_{i=1}^{\infty} R_i, \qquad R_i = \text{Prob}(u_{i-1} + v < i \mid u_{i-1} < (i-1)), \quad i = 1, \ldots,$$

where $u_i$ and $v$ are defined as they were in the text. In this case, the limiting probability that m* = M is not greater than zero because the $R_i$ converge too slowly to one. To see this note that the expectation of $\sigma^2 (j - \chi^2(j))$ is zero and the variance of $\sigma^2 (j - \chi^2(j))$ is $2 j \sigma^4$. The convergence of the $R_i$ to one is too slow for the product of the $R_i$ to be bounded away from zero. The probability that $\sigma^2 (j - \chi^2(j))$ is greater than zero approaches .5 as j gets large. For the CAT criterion, $E\, \sigma^{-2} (2j - \chi^2(j)) = j/\sigma^2$, an increasing function of j. The probability that $\sigma^{-2} (2j - \chi^2(j))$ is greater than zero goes to one as $j \to \infty$. Thus the convergence of the $R_i$ to one is faster for the CAT criterion, and $\lim_{T \to \infty} \text{Prob}(m^* = M) > 0$.

14 The statement in the text must be interpreted with care. If x(t) is a white noise process, there is no gain in asymptotic efficiency over HI for the first M+1 coefficients since

$\operatorname{plim}_{T \to \infty} T^{-1} X_m' X_m$, any fixed m > M, is a diagonal matrix. If x(t) follows an ARMA(p,q) process, then there is a gain in asymptotic efficiency over HI for some of the first M+1 coefficients, and it is the form of $\operatorname{plim}_{T \to \infty} T^{-1} X_m' X_m$, any fixed m > M, which determines the coefficients with smaller asymptotic variance.
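The contrast drawn in footnote 13 between the residual variance criterion and the CAT criterion turns on two chi-square probabilities: Prob(j − χ²(j) > 0) = Prob(χ²(j) < j), and Prob(2j − χ²(j) > 0) = Prob(χ²(j) < 2j). A rough simulation, assuming NumPy and with an arbitrary seed, number of draws, and set of j values (none of which come from the paper), suggests how differently the two behave as j grows.

```python
import numpy as np

rng = np.random.default_rng(2)
draws = 200_000
probs = {}
for j in (1, 5, 25, 100):
    chi2 = rng.chisquare(j, size=draws)
    probs[j] = (
        float(np.mean(chi2 < j)),       # residual variance case: j - chi2(j) > 0
        float(np.mean(chi2 < 2 * j)),   # CAT case: 2j - chi2(j) > 0
    )

for j, (p_resid, p_cat) in probs.items():
    print(j, round(p_resid, 3), round(p_cat, 3))
```

The first probability drifts toward one half while the second climbs toward one, which is the sense in which the factors in the product converge to one faster under the CAT criterion.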

15 We must first establish that $\operatorname{plim}_{T \to \infty} (X_M' P_{m(T)-M} X_M / T)$ exists, and then that it is positive definite. Suppose we look at a sequence of non-zero quadratic forms in the matrices 1.27 of the text, page 14. The limit of this sequence exists since the quadratic forms are monotone decreasing and bounded below.

Proposition F1: As $T \to \infty$, the limit of the sequence of non-zero quadratic forms in the matrices of 1.27 is greater than zero, i.e., $\operatorname{plim}_{T \to \infty} (X_M' P_{m(T)-M} X_M / T)$ is positive definite.

Proof: Suppose the x process has moving average representation

$$x(t) = \sum_{s=0}^{\infty} b(s)\, e(t-s), \quad b(0) = 1, \quad \text{and} \quad \sum_{s=0}^{\infty} b(s)^2 < \infty.$$

The matrix $C = \operatorname{plim}_{T \to \infty} (X_M' P_{m(T)-M} X_M / T)$ can be interpreted as the matrix of 1 to (M+1) step ahead prediction error variances and covariances from a projection of x(t) on its infinite past history. Let Ω(t) be the set of observed innovations at time t, Ω(t) = {..., e(t-1), e(t)}. The ith diagonal element of C, C(i,i), is equal to

$$E\Big[\Big(\sum_{s=0}^{M+1-i} b(s)\, e(t+M+2-i-s)\Big)^2 \,\Big|\, \Omega(t)\Big] = \sum_{s=0}^{M+1-i} b(s)^2\, E\big[ e(t+M+2-i-s)^2 \mid \Omega(t) \big] = \sigma_e^2 \sum_{s=0}^{M+1-i} b(s)^2,$$

for i = 1, ..., M+1. C(i,i) is the (M+2-i) step ahead prediction error variance of x(t). The covariances C(i,j) = min{C(i,i), C(j,j)}. Let s be a (M+1) x 1 vector, s ≠ 0. Suppose s'Cs = 0. This implies that there exists some j, 0 < j ≤ M+1, such that x(t+j) is perfectly predictable (with probability one) for all t. This is impossible since x(t) is nondeterministic. Therefore, C is positive definite. The conclusion in the text is appropriate since the inverse of C is a continuous function of the elements of C.
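The diagonal elements described in footnote 15 are h-step-ahead prediction error variances, which equal the innovation variance times the sum of the first h squared moving average coefficients. The sketch below computes them for a hypothetical second order autoregression; the coefficients 1.2 and -0.5, the unit innovation variance, the choice M = 3, and the function names are assumptions for illustration only.

```python
import numpy as np

def ma_coefficients(phi1, phi2, n_terms):
    """Moving average weights b(0..n_terms-1) of x(t) = phi1*x(t-1) + phi2*x(t-2) + e(t)."""
    b = np.zeros(n_terms)
    b[0] = 1.0
    if n_terms > 1:
        b[1] = phi1
    for s in range(2, n_terms):
        b[s] = phi1 * b[s - 1] + phi2 * b[s - 2]
    return b

def prediction_error_variances(b, sigma2, M):
    """C(i,i) for i = 1..M+1, i.e. the (M+2-i)-step-ahead error variances."""
    cum = sigma2 * np.cumsum(b ** 2)      # cum[h-1] = sigma2 * sum_{s<h} b(s)^2
    horizons = np.arange(M + 1, 0, -1)    # horizons M+1, M, ..., 1 for i = 1, ..., M+1
    return cum[horizons - 1]

M = 3
b = ma_coefficients(1.2, -0.5, M + 1)
diag_C = prediction_error_variances(b, 1.0, M)
print(diag_C)
```

Every entry is at least the innovation variance, since b(0) = 1, so none of the diagonal elements can be zero for a nondeterministic process.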

16 The standard normal variates are termed pseudo-random since they are generated by the method of Box and Muller (5) on a Univac 1110 digital computer at the University of Wisconsin, Madison.
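For readers unfamiliar with the method of Box and Muller, a minimal sketch follows. It is not the program run on the Univac 1110: it uses Python's standard `random` module as the uniform generator, and the seed and sample size are arbitrary assumptions.

```python
import math
import random

def box_muller(rng):
    """Turn two independent uniforms into two independent N(0,1) variates."""
    u1 = 1.0 - rng.random()   # lies in (0, 1], which avoids log(0)
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)

rng = random.Random(1978)
sample = [z for _ in range(50_000) for z in box_muller(rng)]
mean = sum(sample) / len(sample)
var = sum((z - mean) ** 2 for z in sample) / (len(sample) - 1)
print(len(sample), mean, var)
```

The sample mean and variance should be close to 0 and 1, which is a quick sanity check rather than a test of normality.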

17 Most quarterly economic time series are well represented by stochastic second order difference equations; see Sargent (16, chapter XI).

18 An infinite geometric lag distribution is not appropriate for model 1.3; we include this parameterization in our set of experiments to gain insight into the behavior of the CAT criterion when applied to an infinite lag distribution.

19 Despite the fact that the disturbance term of 1.3 is distributed as a N(0,1), we have not shown that the estimated coefficient vector selected by the CAT criterion is normally distributed. In empirical work it is convenient to assume that the order of the fitted model is fixed, and to proceed with conventional hypothesis tests as if they were appropriate. The results of the Monte Carlo experiments reported in the text indicate that this simplification does not result in test statistics which are grossly distorted.

20 We chose to analyze the null hypothesis that the sample ratios of the estimated coefficients to their estimated standard errors are distributed as a N(0,1), although it is incorrect for the first M+1 coefficients, since the magnitude of the t ratio is more often than not the decision criterion used by empirical researchers in determining what variables to keep in their models.

Bibliography

1) Aitken, A.C. (1935), "On Least Squares and Linear Combinations of Observations," Proceedings of the Royal Society of Edinburgh, 55, pp. 42-48.

2) Akaike, H. (1974), "A New Look at the Statistical Model Identification," IEEE Trans. Auto. Control, Vol. AC-19, pp. 716-723.

3) Amemiya, T. (1973), "Generalized Least Squares with an Estimated Autocovariance Matrix," Econometrica, 41, No. 4, pp. 723-732.

4) Anderson, T.W. (1971), The Statistical Analysis of Time Series, New York: John Wiley and Sons.

5) Box, G.E.P., and Muller, M.E. (1958), "A Note on the Generation of Random Normal Deviates," Ann. Math. Statistics, 29, pp. 610-611.

6) Goldberger, A.S. (1964), Econometric Theory, New York: John Wiley and Sons.

7) Hannan, E.J. (1963), "Regression for Time Series," in Proceedings of a Symposium on Time Series Analysis, M. Rosenblatt (ed.), New York: John Wiley and Sons.

8) Hannan, E.J. (1970), Multiple Time Series, New York: John Wiley and Sons.

9) Johnston, J. (1963), Econometric Methods, New York: McGraw-Hill Book Company Inc.

10) Kmenta, J. (1971), Elements of Econometrics, New York: Macmillan Publishing Co., Inc.

11) Lindgren, B.W. (1968), Statistical Theory, London: Macmillan.

12) Meese, R. (1978), "Distributed Lag Order Determination With an Application to the Multiperiod Theory of the Firm," unpublished Ph.D. thesis, University of Wisconsin, Madison.

13) Parzen, E. (1974), "Some Recent Advances in Time Series Analysis," IEEE Trans. Auto. Control, Vol. AC-19, pp. 723-730.

14) Parzen, E. (1975), "Multiple Time Series: Determining the Order of Approximating Autoregressive Schemes," Technical Report 23, State University of New York (SUNY) at Buffalo.

15) Parzen, E. (1976), "An Approach to Time Series Modeling and Forecasting Illustrated by Hourly Electricity Demands," Technical Report No. 37, SUNY at Buffalo.

16) Sargent, T. (1977), Notes on Macroeconomic Theory, University of Minnesota.

17) Sims, C. (1974), "Distributed Lags," in M. Intriligator and D. Kendrick (eds.), Frontiers of Quantitative Economics, Volume II, Amsterdam: North Holland.

18) Theil, H. (1971), Principles of Econometrics, New York: John Wiley and Sons.
