Distributed Lag Order Determination
International Finance Discussion Papers Number 126
October 1978
DISTRIBUTED LAG ORDER DETERMINATION by
Richard Meese
NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment by a writer that he has had access to unpublished material) should be cleared with the author or authors.
by Richard Meese*
Introduction
This paper is organized as follows. Parameterization problems endemic to time series models are discussed in section one. New approaches to these problems are summarized and then applied to the problem of simultaneously estimating the length and the coefficients of a distributed lag regression model. The asymptotic properties of the new estimator of the distributed lag model (DLM) are examined in section two, and the small sample properties of the estimator are examined in section three using Monte Carlo experiments.
1. Time Series Parameterization Problems
Considerable economic analysis is carried out using time series data. The large forecasting models of the U.S. economy that help determine macro-policy are notable examples. Hence it is important that econometricians develop and use estimation procedures that are appropriate for time series. An appropriate regression technique for handling this type of data is generalized least squares (GLS), a procedure that dates back to the work of Aitken (1). Consider the generalized linear regression (GR) model:

$$y = X\beta + u, \qquad (1.1)$$

where $y$ is $T \times 1$, $X$ is $T \times K$, $\beta$ is $K \times 1$, and $u$ is $T \times 1$, with
a) $X$ full column rank,
b) $E(u \mid X) = 0$, and
c) $E(uu' \mid X) = \Sigma$, positive definite.

Since $\Sigma$ is positive definite, so is $\Sigma^{-1}$. Let $G'G = \Sigma^{-1}$. After premultiplication of $y$ and $X$ by $G$, the least squares regression of $Gy$ on $GX$ is best linear unbiased (BLU). Although theoretically elegant, this procedure is only a paradigm since the researcher rarely knows the disturbance covariance matrix.

*Economist, International Finance Division. John Geweke and Arthur Goldberger made helpful comments on earlier drafts of the manuscript. Any errors which may remain are my own.
Estimation of $\beta$ with an unknown $\Sigma$ matrix has been one focal point of the time series econometric literature for many years. Since an unrestricted $\Sigma$ matrix contains $T(T+1)/2$ distinct parameters, some restrictions on the autocovariance function of the disturbance process are necessary. The earliest estimators of $\beta$ were derived using the assumption that the disturbance process followed a first order autoregression, AR(1), $u(t) = \rho u(t-1) + \varepsilon(t)$. In what follows $\varepsilon(t)$ shall always denote a process which is independent and identically distributed (iid) with zero mean and variance $\sigma^2$. For the AR(1) process $\Sigma$ contains two unknown parameters, $\rho$ and $\sigma^2$, and an asymptotically efficient estimator of $\beta$ can be obtained using a variety of procedures. Hannan (7) and Amemiya (3) have worked out estimators of $\beta$ with less stringent assumptions on the disturbance process. They assume that the disturbance follows an ARMA(p,q) process,

$$\sum_{j=0}^{p} \alpha_j u(t-j) = \sum_{i=0}^{q} \gamma_i \varepsilon(t-i), \qquad (1.2)$$

where $p$ and $q$ are unknown non-negative integers and the zeros of $\sum_{j=0}^{p} \alpha_j z^j = 0$ and $\sum_{i=0}^{q} \gamma_i z^i = 0$ ($z$ complex) lie outside the unit circle.
Hannan's estimator of $\beta$ is developed in the frequency domain while Amemiya's estimator is formulated in the time domain. Both may be interpreted as multistage GLS procedures which require consistent estimation of the parameters of $G$ or $\Sigma$ as an intermediate step. Since the autocovariance function of the disturbance process is unknown, the Amemiya and Hannan procedures can only be shown to be asymptotically efficient. To prove asymptotic efficiency (and asymptotic normality) of either coefficient estimator, the estimates of the parameters of $G$ or $\Sigma$ must "improve" as the sample size increases. This requires that the following conditions be satisfied:
(a) The number of parameters characterizing the disturbance process must be allowed to increase without bound.
(b) The number of observations must increase at a faster rate than the number of parameters, so that the ratio of parameters to observations tends to zero as each tends to infinity.
Point (a) ensures that the approximation of the true disturbance process improves as the sample size increases, and point (b) ensures that the estimate of the approximation is consistent.

In this paper we shall be concerned with the estimation of the special case of model 1.1 in which the columns of $X$ are successive lagged values of the same variable. Consider the distributed lag model:

$$y(t) = \sum_{s=0}^{M} \beta(s)\, x(t-s) + \varepsilon(t), \qquad (1.3)$$

with
a) $M$ a fixed unknown non-negative integer,
b) $\sum_{s=0}^{M} \beta(s)^2 < \infty$, $\beta(M) \neq 0$,
c) $(x(t), \varepsilon(t))'$ a zero mean jointly covariance stationary process, and
d) $E(\varepsilon(t) \mid x(t-s)) = 0$ for all $t$ and $s$.
Model 1.3 has an observation matrix $X$ with unknown column dimension. Although models 1.1 and 1.3 have striking dissimilarities, they have a common parameterization problem. Feasible GLS estimators (Hannan and Amemiya) of model 1.1 with disturbance process 1.2 require close attention to points (a) and (b) above. Since $M$ is unknown in model 1.3, a defensible procedure in this context is to expand the length of the fitted distributed lag indefinitely as sample size increases so that specification error is avoided asymptotically. Again, the number of parameters must be allowed to increase without bound as the sample size tends to infinity, while the ratio of parameters to sample size converges to zero.

In practice, the parameterization problem is solved by increasing the dimension of the parameter space deterministically with sample size $T$. For example, let $m$ be the maximum length of the distributed lag that is to be fit for a given sample size. If a deterministic rule is followed, we choose $m$ as a function of $T$, $m(T)$, so that $m(T) \to \infty$ as $T \to \infty$ and $\lim_{T\to\infty} m(T)/T = 0$. This guarantees that for sufficiently large $T$, $m(T) > M$, and underfitting of model 1.3, i.e., $m < M$, is avoided asymptotically. Also, since $\lim_{T\to\infty} m(T)/T = 0$, the estimator of the coefficients of the distributed lag can be shown to have desirable asymptotic properties.

The feasible GLS procedures can be made operational by use of the deterministic rule $m = m(T)$ described above. For Hannan efficient estimation of model 1.1, a consistent estimator of the spectral density of the disturbance process can be obtained by expanding the width of the spectral window as a function of the sample size. Amemiya's estimator of model 1.1 is made operational by expanding the length of the residual autoregression as a function of sample size.
Recently, Akaike (2) and Parzen (13, 14, 15) have suggested
a new resolution of the type of parameterization problems discussed
above. Both authors have suggested methods for choosing the order
(length) of an autoregressive process when the order is unknown.
Their procedures are similar to regression strategies since one
estimates a set of autoregressive models whose length varies from zero to m (m chosen as a function of the sample size), choosing the order that is best according to some criterion. Akaike's decision rule is based on the principle of maximum likelihood estimation while Parzen's criterion minimizes the one step ahead mean square prediction error. It has been shown (14, p. 14)
that for any autoregressive process and large T, the Parzen criterion selects an order that is bounded above by the order chosen using Akaike's criterion. Although this result does
not imply that Akaike's decision rule is less useful, we choose to restrict attention to Parzen's criterion, which is called CAT for "criterion autoregressive transfer function."
Parzen's criterion was derived for use in selecting the order of the estimated autoregressive process, but his decision rule can be applied to the problem of estimating $M$, the length of the distributed lag in model 1.3. One chooses a lag length $m^*$ which gives the minimum value of

$$\mathrm{CAT}(i) = \frac{1}{T}\sum_{j=0}^{i} \hat\sigma_j^{-2} - \hat\sigma_i^{-2}, \qquad i = 0, 1, \ldots, m, \qquad (1.4)$$

where $\hat\sigma_j^2$ is the residual variance from the regression of $y(t)$ on current and $j$ lagged values of the independent variable $x(t)$. The variable $T$ denotes sample size, and $m$ is chosen as a function of $T$ in a manner described below (section two). A rigorous derivation of criterion 1.4 can be found in Parzen (14, pp. 16-20).
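The selection rule in 1.4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's program: the function names (`solve_linear`, `residual_variance`, `cat_lag_order`) are hypothetical, every regression is run on a common sample that discards the first `m_max` observations (the text does not say how presample values are treated), and $T$ in 1.4 is taken to be the number of observations actually used.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x


def residual_variance(y, x, j, start):
    """Error sum of squares divided by (n - j - 1) from the regression of
    y(t) on x(t), ..., x(t-j) for t = start, ..., len(y) - 1 (no intercept,
    since model 1.3 is stated for zero mean processes)."""
    rows = [[x[t - s] for s in range(j + 1)] for t in range(start, len(y))]
    yy = y[start:]
    k = j + 1
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yt for r, yt in zip(rows, yy)) for a in range(k)]
    beta = solve_linear(xtx, xty)
    ssr = sum((yt - sum(bc * rc for bc, rc in zip(beta, r))) ** 2
              for r, yt in zip(rows, yy))
    return ssr / (len(yy) - j - 1)


def cat_lag_order(y, x, m_max):
    """Return (m_star, [CAT(0), ..., CAT(m_max)]) following criterion 1.4."""
    n = len(y) - m_max          # assumed effective sample size
    cat, running = [], 0.0
    for i in range(m_max + 1):
        inv = 1.0 / residual_variance(y, x, i, m_max)
        running += inv          # running sum of sigma_hat_j^{-2}, j = 0..i
        cat.append(running / n - inv)
    m_star = min(range(m_max + 1), key=lambda i: cat[i])
    return m_star, cat
```

Calling `cat_lag_order(y, x, m_max)` with two equal-length lists returns the minimizing lag length together with the whole CAT sequence, which makes it easy to inspect how flat the criterion is near its minimum.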
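The following is a paraphrase of Parzen's rationale for the use of CAT as a method of order estimation. Let $s(t)$ be a zero mean, normal, covariance stationary process with autocovariance function

$$R_s(v) = E(s(t)\, s(t+v)), \qquad v = 0, \pm 1, \ldots \qquad (1.5)$$

We assume that $s(t)$ has autoregressive representation,

$$\sum_{j=0}^{\infty} \alpha_\infty(j)\, s(t-j) = \varepsilon(t), \qquad \alpha_\infty(0) = 1. \qquad (1.6)$$

Define the m-memory prediction error as

$$\varepsilon_m(t) = s(t) - E[s(t) \mid s(t-1), s(t-2), \ldots, s(t-m)]. \qquad (1.7)$$

The normality of $s(t)$ implies that the m-memory prediction error is linear in past and present $s(t)$,

$$\varepsilon_m(t) = \sum_{j=0}^{m} \alpha_m(j)\, s(t-j), \qquad \alpha_m(0) = 1. \qquad (1.8)$$

Since $\varepsilon_m(t)$ is uncorrelated with past values of $s(t)$,

$$E(\varepsilon_m(t)\, s(t-k)) = 0, \qquad k = 1, \ldots, m, \qquad (1.9)$$

the m-memory autoregressive coefficients $\alpha_m(j)$, $j = 1, \ldots, m$ can be found by solving a set of $m$ Yule-Walker equations, where $\alpha_m(0)$ is defined to be one:

$$\sum_{j=0}^{m} \alpha_m(j)\, R_s(j-k) = 0, \qquad k = 1, \ldots, m. \qquad (1.10)$$

The m-memory prediction error variance is given by

$$\sigma_m^2 = E(\varepsilon_m(t))^2 = \sum_{j=0}^{m} \alpha_m(j)\, R_s(j), \qquad \alpha_m(0) = 1. \qquad (1.11)$$

From 1.6, $\varepsilon(t)$ is the infinite-memory prediction error obtained from the projection of $s(t)$ on its infinite past,

$$\varepsilon_\infty(t) = \varepsilon(t) = s(t) - E[s(t) \mid s(t-1), s(t-2), \ldots]. \qquad (1.12)$$

Let $\sigma_\infty^2$ denote the infinite-memory prediction error variance,

$$\sigma_\infty^2 = E(\varepsilon_\infty(t))^2 = \sum_{j=0}^{\infty} \alpha_\infty(j)\, R_s(j), \qquad \alpha_\infty(0) = 1, \qquad (1.13)$$

and define the transfer functions

$$g_m(z) = 1 + \sum_{j=1}^{m} \alpha_m(j)\, z^j, \qquad g_\infty(z) = 1 + \sum_{j=1}^{\infty} \alpha_\infty(j)\, z^j$$

for $z$ complex, and let

$$\gamma_\infty(e^{i\omega}) = \sigma_\infty^{-2}\, g_\infty(e^{i\omega}), \qquad \hat\gamma_m(e^{i\omega}) = \hat\sigma_m^{-2}\, \hat g_m(e^{i\omega}), \qquad \text{and} \qquad \hat\varepsilon_m(t) = \sum_{j=0}^{m} \hat\alpha_m(j)\, s(t-j). \qquad (1.14)$$

The $\hat\sigma_m^2$, $\hat\alpha_m(j)$ and $\hat g_m(z)$ are consistent estimates of $\sigma_m^2$, $\alpha_m(j)$ and $g_m(z)$ respectively, which are obtained by solving the sample Yule-Walker equations, where $\hat\alpha_m(0)$ is defined to be one:

$$\sum_{j=0}^{m} \hat\alpha_m(j)\, \hat R_s(j-k) = 0, \quad k = 1, \ldots, m; \qquad \hat R_s(v) = \frac{1}{T}\sum_{t=1}^{T-v} s(t)\, s(t+v); \qquad \hat\sigma_m^2 = \sum_{j=0}^{m} \hat\alpha_m(j)\, \hat R_s(j). \qquad (1.15)$$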
The idea is to approximate the autoregressive process 1.6, of unknown but possibly infinite order, by a finite order process so as to minimize the one-step ahead mean square prediction error associated with the approximation of $s(t)$ by an AR(m). As a measure of the one-step ahead mean square prediction error Parzen takes (14, p. 19)

$$J_m = E(\hat\varepsilon_m(t) - \varepsilon_\infty(t))^2 = E\int_{-\pi}^{\pi} \left|\hat\gamma_m(e^{i\omega}) - \gamma_\infty(e^{i\omega})\right|^2 f(\omega)\, d\omega, \qquad (1.16)$$

where $f(\omega)$ is a spectral density function on $(-\pi, \pi)$ given by

$$f(\omega) = \frac{\sigma_\infty^2}{2\pi}\, \frac{1}{\left|g_\infty(e^{i\omega})\right|^2}. \qquad (1.17)$$

For large $T$ Parzen shows (14, pp. 16-20) that approximately,

$$J_m \approx \frac{1}{T}\sum_{j=1}^{m} \sigma_j^{-2} + (\sigma_\infty^{-2} - \sigma_m^{-2}). \qquad (1.18)$$

This mean square error expression is the sum of two terms, $(\sigma_\infty^{-2} - \sigma_m^{-2})$ representing bias and $\frac{1}{T}\sum_{j=1}^{m} \sigma_j^{-2}$ representing variability of $\hat\gamma_m$. Because $\sigma_\infty^{-2}$ is not a function of $m$, to find $m^*$, the optimal order of the AR process, it is sufficient to find the minimum of

$$\mathrm{CAT}(i) = \frac{1}{T}\sum_{j=1}^{i} \sigma_j^{-2} - \sigma_i^{-2}, \qquad i = 1, \ldots, m. \qquad (1.19)$$

In practice, when using the CAT criterion to fit an autoregressive model, it is necessary to replace the $\sigma_j^2$, $j = 1, \ldots, m$ in formula 1.19 by their consistent estimates. See Parzen (14, pp. 20-23) for further discussion of this point.
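As a rough illustration of 1.10, 1.11, 1.15 and 1.19, the sketch below computes sample autocovariances, solves the Yule-Walker equations for every order up to $m$, and evaluates CAT. It uses the Levinson-Durbin recursion as the solver, which is a standard way of solving these equations but is a choice made here, not something the text specifies. All function names are hypothetical, and the series is assumed to have zero mean as in the text.

```python
def sample_autocov(s, max_lag):
    """R_hat_s(v) = (1/T) * sum_{t} s(t) s(t+v), v = 0..max_lag (zero mean assumed)."""
    T = len(s)
    return [sum(s[t] * s[t + v] for t in range(T - v)) / T
            for v in range(max_lag + 1)]


def levinson_durbin(r, m):
    """Solve the Yule-Walker equations 1.10 recursively for orders 1..m.

    Uses the sign convention of 1.8: sum_j alpha(j) s(t-j) = eps(t), alpha(0) = 1.
    Returns (alpha for order m, [sigma2_0, ..., sigma2_m]).
    """
    alpha = [1.0]
    sigma2 = [r[0]]
    for k in range(1, m + 1):
        acc = sum(alpha[j] * r[k - j] for j in range(k))
        kappa = -acc / sigma2[-1]                 # reflection coefficient
        padded = alpha + [0.0]
        alpha = [padded[j] + kappa * padded[k - j] for j in range(k + 1)]
        sigma2.append(sigma2[-1] * (1.0 - kappa ** 2))
    return alpha, sigma2


def cat_ar(sigma2, T):
    """CAT(i) for i = 1..m as in 1.19; sigma2[j] is the order-j error variance."""
    cat, running = [], 0.0
    for i in range(1, len(sigma2)):
        running += 1.0 / sigma2[i]
        cat.append(running / T - 1.0 / sigma2[i])
    return cat
```

With the exact autocovariances of an AR(1) process with coefficient .5 and unit innovation variance, `levinson_durbin` returns $\alpha_2 = (1, -.5, 0)$ and prediction error variances $(4/3, 1, 1)$, so `cat_ar` is minimized at order one, as it should be.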
Although CAT's theoretical justification is completely different from that of the residual variance criterion, Theil (18, pp. 543-545), the two methods of determining the order of the distributed lag model 1.3 have similarities. The method of selecting the variables to be included in a regression model by choosing the specification with smallest residual variance can be used since the expected value of the residual variance of the erroneous model minus the expectation of the residual variance of the true model is non-negative. On average, one chooses the correct specification of the regression model, but the residual variance criterion produces estimates of the population variance of $y$ given $x$ that are biased downward, and will not choose the correct specification of the regression model if it is not one of the models that is being considered. For fixed $m$ the expression $\frac{1}{T}\sum_{j=0}^{m} \hat\sigma_j^{-2} - \hat\sigma_m^{-2}$ converges to $\operatorname{plim}_{T\to\infty}(-\hat\sigma_m^{-2})$ because the first term of CAT, $\frac{1}{T}\sum_{j=0}^{m} \hat\sigma_j^{-2}$, converges to zero as $T \to \infty$ (see section two below). In large samples, minimizing CAT(i), $i = 1, \ldots, m$ is thus similar to choosing the model specification with smallest residual variance, because minimizing $-\hat\sigma^{-2}$ is equivalent to choosing the model with smallest $\hat\sigma^2$. Despite the similarity, we produce asymptotic distribution results in the next section which indicate the superiority of CAT over the residual variance criterion.
2. Properties of an estimator of the distributed lag model 1.3 when CAT is used to determine the order of the lag distribution.
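Consider the distributed lag model 1.3 with the additional assumptions
e) $\lim_{s\to\infty} \mathrm{cov}(x(t), x(t-s)) = 0$,
f) $x(t)$ has finite fourth order moments, and
g) $\varepsilon(t)$ is normally distributed with zero mean and unit variance for all $t$.

It will be convenient to write model 1.3 in matrix notation,

$$y = X_M \beta_M + \varepsilon, \qquad (1.20)$$

where $y$ is $T \times 1$, $X_M$ is $T \times (M+1)$, $\beta_M$ is $(M+1) \times 1$, and $\varepsilon$ is $T \times 1$. Let $m$ denote the maximum length of the distributed lag that is to be fit for a given $T$. For $m > M$ partition $X_m = (X_M, X_{m-M})$ and write 1.20 as

$$y = (X_M,\; X_{m-M}) \begin{pmatrix} \beta_M \\ \beta_{m-M} \end{pmatrix} + \varepsilon, \qquad \text{where } \beta_{m-M} = 0, \qquad (1.21)$$

with $X_M$ of dimension $T \times (M+1)$ and $X_{m-M}$ of dimension $T \times (m-M)$. When $m < M$ partition $X_M = (X_m^*, X_{M-m}^*)$ and write 1.20 as

$$y = X_m^* \beta_m^* + v, \qquad v = \varepsilon + X_{M-m}^* \beta_{M-m}^*. \qquad (1.22)$$

Assumptions (c), (e), and (f) are sufficient to show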
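$$\operatorname*{plim}_{T\to\infty} \frac{X_M' X_M}{T} = Q(M) \qquad (\text{any fixed } M), \qquad (1.23)$$

where $Q(M)$ is an $(M+1) \times (M+1)$ matrix with $(i,j)$ element equal to $\mathrm{cov}(x(t-i), x(t-j))$. When $m > M$ is fixed, define

$$\operatorname*{plim}_{T\to\infty} \frac{1}{T} \begin{pmatrix} X_M' X_M & X_M' X_{m-M} \\ X_{m-M}' X_M & X_{m-M}' X_{m-M} \end{pmatrix} = \begin{pmatrix} Q(M) & Q(M, m-M) \\ Q(M, m-M)' & Q(m-M) \end{pmatrix} = Q(m). \qquad (1.24)$$

Given that the limit matrix of 1.24 is nonsingular,

$$\operatorname*{plim}_{T\to\infty} \left( \frac{X_m' X_m}{T} \right)^{-1} \qquad (M, m \text{ fixed}) \qquad (1.25)$$

exists since the elements of the inverse of a matrix are continuous functions of the elements of the matrix itself (18, p. 363). Similarly, when $m < M$ define

$$\operatorname*{plim}_{T\to\infty} \frac{1}{T} \begin{pmatrix} X_m^{*\prime} X_m^* & X_m^{*\prime} X_{M-m}^* \\ X_{M-m}^{*\prime} X_m^* & X_{M-m}^{*\prime} X_{M-m}^* \end{pmatrix} = \begin{pmatrix} Q^*(m) & Q^*(m, M-m) \\ Q^*(m, M-m)' & Q^*(M-m) \end{pmatrix} \qquad (m, M \text{ fixed}). \qquad (1.26)$$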
It is also true that the limit matrix of 1.26 has an inverse by the same argument given above (18, p. 363). In what follows we will be interested in expanding the maximum length of the fitted distributed lag indefinitely as $T \to \infty$. Suppose we let $m = m(T)$ so that $m(T) \to \infty$ as $T \to \infty$, and $\lim_{T\to\infty} m(T)/T = 0$. Clearly, for $T$ sufficiently large, $m(T) > M$. Now consider the sequence of $(M+1) \times (M+1)$ matrices for any sample size $T$:

$$\frac{1}{T}(X_M' X_M) \ge \frac{1}{T}(X_M' P_1 X_M) \ge \frac{1}{T}(X_M' P_2 X_M) \ge \cdots \ge \frac{1}{T}(X_M' P_{m(T)-M} X_M) > 0, \qquad (1.27)$$

where $>$ and $\ge$ denote matrix ordering and $P_i = I - X_i (X_i' X_i)^{-1} X_i'$, $i = 1, \ldots, m(T)-M$, where $X_i$ is the $T \times i$ matrix whose $j$th column, $j = 1, \ldots, i$, is composed of $x(t)$ lagged $M+j$ times. The inverses of the matrices in 1.27 also form a monotone sequence,$^{6}$

$$T(X_M' X_M)^{-1} \le T(X_M' P_1 X_M)^{-1} \le \cdots \le T(X_M' P_{m(T)-M} X_M)^{-1} < \infty. \qquad (1.28)$$
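Finally, we define $\hat\sigma_m^2$ as the error sum of squares $\hat\varepsilon'\hat\varepsilon$ from a regression of $y(t)$ on $x(t), x(t-1), \ldots, x(t-m)$, $t = 1, \ldots, T$, divided by $T-m-1$.

For any fixed $m$ and iid $\varepsilon(t)$, $\operatorname{plim}_{T\to\infty} (X_m' \varepsilon / T) = 0$ (4, pp. 23-24).$^{7}$ When $m < M$,

$$\operatorname{plim} \hat\sigma_m^2 = \lim_{T\to\infty} \frac{T}{T-m-1} \cdot \operatorname{plim} \frac{1}{T}\left[ \beta_{M-m}^{*\prime} X_{M-m}^{*\prime} P_m^* X_{M-m}^* \beta_{M-m}^* + 2\beta_{M-m}^{*\prime} X_{M-m}^{*\prime} P_m^* \varepsilon + \varepsilon' P_m^* \varepsilon \right], \qquad (1.29)$$

where $P_m^* = I - X_m^* (X_m^{*\prime} X_m^*)^{-1} X_m^{*\prime}$, so that

$$\operatorname{plim} \hat\sigma_m^2 = \sigma^2 + \beta_{M-m}^{*\prime} \left[ Q^*(M-m) - Q^*(m, M-m)'\, Q^*(m)^{-1}\, Q^*(m, M-m) \right] \beta_{M-m}^* > \sigma^2. \qquad (1.30)$$

For $m \ge M$, $\operatorname{plim} \hat\sigma_m^2 = \sigma^2$ since $[I - X_m (X_m' X_m)^{-1} X_m'] X_M = 0$ when $m \ge M$. Therefore, $\hat\sigma_m^2$ is a consistent estimator of $\sigma^2$ when $m \ge M$, $m$ fixed. For $m < M$, $\operatorname{plim} \hat\sigma_m^2 > \sigma^2$. Now consider the case $m = m(T)$:

$$\operatorname{plim} \hat\sigma_{m(T)}^2 = \operatorname{plim} \frac{\varepsilon' \left[ I - X_{m(T)} (X_{m(T)}' X_{m(T)})^{-1} X_{m(T)}' \right] \varepsilon}{T - m(T) - 1}. \qquad (1.31)$$

Let $N_{m(T)} = X_{m(T)} (X_{m(T)}' X_{m(T)})^{-1} X_{m(T)}'$.$^{8}$ Then from 1.31,

$$\operatorname{plim} \hat\sigma_{m(T)}^2 = \lim_{T\to\infty} \left( \frac{T}{T - m(T) - 1} \right) \left[ \operatorname{plim} \frac{\varepsilon'\varepsilon}{T} - \operatorname{plim} \frac{\varepsilon' N_{m(T)} \varepsilon}{T} \right]. \qquad (1.32)$$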
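Observe that $\varepsilon' N_{m(T)} \varepsilon / T \ge 0$ for all $T$ and $m(T)$, and that

$$E\left( \frac{\varepsilon' N_{m(T)} \varepsilon}{T} \right) = \frac{\sigma^2 m(T)}{T}. \qquad (1.33)$$

Clearly, $\lim_{T\to\infty} E(\varepsilon' N_{m(T)} \varepsilon / T) = \lim_{T\to\infty} \sigma^2 m(T)/T = 0$, from which $\operatorname{plim}_{T\to\infty} (\varepsilon' N_{m(T)} \varepsilon / T) = 0$.$^{9}$ Therefore, $\hat\sigma_{m(T)}^2$ is a consistent estimator of $\sigma^2$ provided $\lim_{T\to\infty} m(T)/T = 0$.

We are now ready to examine the large sample properties of CAT as a model selection criterion. For sufficiently large $T$, $m(T) > M$, so without loss of generality we need only examine the case for which $m(T) > M$. Consider the probability limit of CAT(m(T)),

$$\operatorname{plim}_{T\to\infty} \mathrm{CAT}(m(T)) = \operatorname{plim} \left[ \frac{1}{T} \sum_{j=0}^{m(T)} \hat\sigma_j^{-2} - \hat\sigma_{m(T)}^{-2} \right] = \operatorname{plim} \frac{1}{T} \sum_{j=0}^{M} \hat\sigma_j^{-2} + \operatorname{plim} \frac{1}{T} \sum_{j=M+1}^{m(T)} \hat\sigma_j^{-2} - \operatorname{plim} \hat\sigma_{m(T)}^{-2}. \qquad (1.34)$$

We shall analyze each part separately:

$$\operatorname{plim}_{T\to\infty} \frac{1}{T} \sum_{j=0}^{M} \hat\sigma_j^{-2} = \lim_{T\to\infty} \left( \frac{1}{T} \right) \times \operatorname{plim} \sum_{j=0}^{M} \hat\sigma_j^{-2} \le \lim_{T\to\infty} \frac{1}{T} (M+1)\, \sigma^{-2} = 0, \qquad (1.35)$$

since for $j \le M$, $\operatorname{plim} \hat\sigma_j^{-2} \le \sigma^{-2}$. Therefore, $\operatorname{plim}_{T\to\infty} \frac{1}{T} \sum_{j=0}^{M} \hat\sigma_j^{-2} = 0$. We have already established that $\operatorname{plim}_{T\to\infty} (-\hat\sigma_{m(T)}^{-2}) = -\sigma^{-2}$, so there is one part of 1.34 left to consider. Now

$$E\left( \frac{1}{T} \sum_{j=M+1}^{m(T)} \hat\sigma_j^{-2} \right) = \frac{1}{T} \sum_{j=M+1}^{m(T)} E(\hat\sigma_j^{-2}) = \frac{1}{T} \sum_{j=M+1}^{m(T)} \frac{T-j-1}{\sigma^2 (T-j-3)}, \qquad (1.36)$$

since $\hat\sigma_j^2 (T-j-1)/\sigma^2$ is distributed as a $\chi^2(T-j-1)$ for $j = M+1, \ldots, m(T)$, and, using the definition of expectation, the expected value of the reciprocal of a $\chi^2(T-j-1)$ variate can be shown to be $1/(T-j-3)$. The last term on the right hand side of 1.36 is bounded above:

$$\frac{1}{T} \sum_{j=M+1}^{m(T)} \frac{T-j-1}{\sigma^2 (T-j-3)} \le \frac{m(T) - M}{T \sigma^2} \left( \frac{T - m(T) - 1}{T - m(T) - 3} \right). \qquad (1.37)$$

The limit as $T \to \infty$ of the right hand side of 1.37 is equal to zero, so $\operatorname{plim}_{T\to\infty} \frac{1}{T} \sum_{j=M+1}^{m(T)} \hat\sigma_j^{-2} = 0$ by lemma 1 in footnote 9. Returning to our original problem 1.34, we have

$$\operatorname{plim}_{T\to\infty} \mathrm{CAT}(m(T)) = \operatorname{plim}_{T\to\infty} \left[ \frac{1}{T} \sum_{j=0}^{m(T)} \hat\sigma_j^{-2} - \hat\sigma_{m(T)}^{-2} \right] = -\sigma^{-2}. \qquad (1.38)$$

Provided $m(T)$ increases at a slower rate than $T$, i.e., $\lim_{T\to\infty} m(T)/T = 0$, CAT(m(T)) is a consistent estimator of $-\sigma^{-2}$.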
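The role of the rate condition can be seen numerically. The short sketch below evaluates the right hand side of 1.36 and the bound in 1.37 for growing $T$. The rule $m(T) = \lfloor \sqrt{T} \rfloor$, the value $M = 4$, and the function names are assumptions made purely for illustration; any rule with $m(T) \to \infty$ and $m(T)/T \to 0$ would serve.

```python
import math


def expected_overfit_term(T, m, M, sigma2=1.0):
    """Right hand side of 1.36: (1/T) * sum_{j=M+1}^{m} (T-j-1) / (sigma2 (T-j-3))."""
    return sum((T - j - 1) / (sigma2 * (T - j - 3)) for j in range(M + 1, m + 1)) / T


def upper_bound(T, m, M, sigma2=1.0):
    """Right hand side of 1.37."""
    return (m - M) / (T * sigma2) * (T - m - 1) / (T - m - 3)


def illustrate(M=4, sizes=(100, 10_000, 1_000_000)):
    """Return (T, m(T), expectation, bound) for an assumed rule m(T) = floor(sqrt(T))."""
    rows = []
    for T in sizes:
        m = math.isqrt(T)
        rows.append((T, m, expected_overfit_term(T, m, M), upper_bound(T, m, M)))
    return rows
```

Under these assumptions both quantities shrink roughly like $m(T)/T$, which is why the middle term of 1.34 vanishes in the limit.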
Define $m^*$ as the lag length which gives the minimum value of

$$\mathrm{CAT}(i) = \frac{1}{T} \sum_{j=0}^{i} \hat\sigma_j^{-2} - \hat\sigma_i^{-2}.$$

Proposition 1: $\lim_{T\to\infty} \mathrm{Prob}(m^* \ge M) = 1$, i.e., the probability that minimizing CAT(i), $i = 0, 1, \ldots, m(T)$ results in specification error goes to zero as $T \to \infty$.

Proof: Suppose we look at a finite subcollection of the set $\{i = 0, 1, 2, \ldots, m(T)\}$ that does not contain any integers greater than or equal to $M$. Let $\bar{i} = \{0, 1, 2, \ldots, M-1\}$. Then $\operatorname{plim}_{T\to\infty} [\min_{\bar{i}} \mathrm{CAT}(i)] = \min_{\bar{i}} (\operatorname{plim}_{T\to\infty} \mathrm{CAT}(i))$ since the min function does not depend upon the sample size $T$, and $\operatorname{plim}_{T\to\infty} \mathrm{CAT}(i)$ exists by virtue of 1.30. The $\min_{\bar{i}} (\operatorname{plim}_{T\to\infty} \mathrm{CAT}(i)) > -\sigma^{-2}$ from 1.30.

If we look at a different finite subcollection of the $i$'s that contains at least one integer greater than or equal to $M$,

$i^*$ = {non-negative integers $< M$, and at least one integer $\ge M$},

then

$$\operatorname{plim}_{T\to\infty} \left( \min_{i^*} \mathrm{CAT}(i) \right) = \min_{i^*} \left( \operatorname{plim}_{T\to\infty} \mathrm{CAT}(i) \right) = -\sigma^{-2}.$$

Since $m(T) \to \infty$ as $T \to \infty$, for sufficiently large $T$ we have $m(T) > M$ and there exists a finite subcollection of $\{i = 0, 1, \ldots, m(T)\}$ that contains at least one integer greater than or equal to $M$ as $T \to \infty$. Therefore, $\lim_{T\to\infty} \mathrm{Prob}(m^* \ge M) = 1$.
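Proposition 2: Let $k$ be a fixed positive integer. Then $T \cdot (\mathrm{CAT}(M+k) - \mathrm{CAT}(M))$ converges in distribution to $\sigma^{-2}(2k - \chi^2(k))$ as $T \to \infty$.

Proof:

$$T(\mathrm{CAT}(M+k) - \mathrm{CAT}(M)) = \sum_{j=M+1}^{M+k} \hat\sigma_j^{-2} + T\hat\sigma_M^{-2} - T\hat\sigma_{M+k}^{-2} = \sum_{j=M+1}^{M+k} \hat\sigma_j^{-2} + \frac{T(\hat\sigma_{M+k}^2 - \hat\sigma_M^2)}{\hat\sigma_M^2\, \hat\sigma_{M+k}^2}, \qquad (1.39)$$

where $\hat\sigma_{M+i}^2 = \hat\varepsilon_{M+i}'\hat\varepsilon_{M+i}/(T-M-i-1)$. Because $\hat\varepsilon_{M+i}'\hat\varepsilon_{M+i}/(T-M-i-1)$ is a consistent estimator of $\sigma^2$ for $i = 1, \ldots, k$, $k < \infty$, and $(\hat\varepsilon_M'\hat\varepsilon_M - \hat\varepsilon_{M+k}'\hat\varepsilon_{M+k})/\sigma^2$ has a $\chi^2(k)$ distribution independent of $T$, expression 1.39 converges in distribution to

$$k\sigma^{-2} + \sigma^{-2}(k - \chi^2(k)) = \sigma^{-2}(2k - \chi^2(k)).$$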
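One direct consequence of proposition 2 is that, in the limit, CAT(M+k) falls below CAT(M) exactly when $\chi^2(k) > 2k$. The sketch below evaluates that tail probability for integer $k$ using the closed-form series for the chi-square upper tail. It covers only this pairwise comparison, not the probability that $m^* = M+k$, which depends on the joint behavior across all lag lengths. The function names are hypothetical.

```python
import math


def chi2_sf(x, k):
    """Upper tail P(chi2(k) > x) for integer k >= 1, via closed-form series."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if k % 2 == 0:
        # exp(-x/2) * sum_{r=0}^{k/2-1} (x/2)^r / r!
        term = total = 1.0
        for r in range(1, k // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # erfc(sqrt(x/2)) + sqrt(2x/pi) exp(-x/2) * sum_{r=1}^{(k-1)/2} x^(r-1)/(1*3*...*(2r-1))
    term, total = 1.0, 0.0
    for r in range(1, (k - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total)


def pairwise_overfit_prob(k):
    """Limiting probability that CAT(M+k) < CAT(M), i.e. P(chi2(k) > 2k)."""
    return chi2_sf(2.0 * k, k)
```

For $k = 1$ the value is $\mathrm{erfc}(1)$, about .157, and for $k = 2$ it is $e^{-2}$, about .135, so a single pairwise comparison already favors the longer lag with non-negligible probability.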
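Proposition 3:

$$\mathrm{Cov}\left( \hat\varepsilon_{M+j}'\hat\varepsilon_{M+j} - \hat\varepsilon_{M+i}'\hat\varepsilon_{M+i},\;\; \hat\varepsilon_{M+i}'\hat\varepsilon_{M+i} - \hat\varepsilon_{M+K}'\hat\varepsilon_{M+K} \right) = 0, \qquad 0 \le j \le i \le K.$$

Proof: For $0 \le j \le i \le K$, it is true that $(\hat\varepsilon_{M+j}'\hat\varepsilon_{M+j} - \hat\varepsilon_{M+i}'\hat\varepsilon_{M+i})$ and $\hat\varepsilon_{M+i}'\hat\varepsilon_{M+i}$ are independent (18, pp. 84-85, 139). This implies

$$\mathrm{cov}\left( \hat\varepsilon_{M+j}'\hat\varepsilon_{M+j} - \hat\varepsilon_{M+i}'\hat\varepsilon_{M+i},\; \hat\varepsilon_{M+i}'\hat\varepsilon_{M+i} \right) = 0,$$

whence

$$\mathrm{cov}\left( \hat\varepsilon_{M+j}'\hat\varepsilon_{M+j},\; \hat\varepsilon_{M+i}'\hat\varepsilon_{M+i} \right) = \mathrm{Var}(\hat\varepsilon_{M+i}'\hat\varepsilon_{M+i}) = 2(T-M-i-1)\sigma^4. \qquad (1.40)$$

The second line follows from the fact that $\hat\varepsilon_{M+i}'\hat\varepsilon_{M+i}/\sigma^2$ has a chi-square distribution with $(T-M-i-1)$ degrees of freedom, and the variance of a chi-square variate is equal to twice the number of degrees of freedom. The covariance in the proposition is zero since, for $0 \le j \le i \le K$ and using 1.40 above,

$$\mathrm{cov}\left( \hat\varepsilon_{M+j}'\hat\varepsilon_{M+j} - \hat\varepsilon_{M+i}'\hat\varepsilon_{M+i},\; \hat\varepsilon_{M+i}'\hat\varepsilon_{M+i} - \hat\varepsilon_{M+K}'\hat\varepsilon_{M+K} \right) = 2(T-M-K-1)\sigma^4 - 2(T-M-K-1)\sigma^4 + 2(T-M-i-1)\sigma^4 - 2(T-M-i-1)\sigma^4 = 0. \qquad (1.41)$$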
Using propositions 2 and 3 we can now calculate $\lim_{T\to\infty} \mathrm{Prob}(m^* = M+j)$, $j = 0, 1, \ldots, k$. It has already been established that $\lim_{T\to\infty} \mathrm{Prob}(m^* = M-j) = 0$, $j = 1, \ldots, M$. Define the following:

$$S_j = 2j + (\hat\varepsilon_{M+j}'\hat\varepsilon_{M+j} - \hat\varepsilon_M'\hat\varepsilon_M)/\sigma^2, \quad 0 < j \le k, \qquad S_0 = 0. \qquad (1.42)$$

Observe that $S_j \sim 2j - \chi^2(j)$ and $S_j = S_{j-1} + 2 - w_j$, where the $w_j$'s are distributed as independent $\chi^2(1)$ variates. Define

$$p_j = \mathrm{Prob}(Z_j > 0 \mid Z_{j-1} > 0) \qquad \text{and} \qquad p_j^* = \mathrm{Prob}(-Z_j > 0 \mid -Z_{j-1} > 0), \qquad j = 1, 2, \ldots, \qquad (1.43)$$

where $Z_j \sim 2j - \chi^2(j)$, $Z_0 = 0$, $Z_j = Z_{j-1} + y_j$, $y_j \sim 2 - \chi^2(1)$, and $y_j$ is independent of $Z_{j-1}$. Now

$$\lim_{T\to\infty} \mathrm{Prob}(m^* = M+j) = \mathrm{Prob}(-S_j > 0,\; S_1 - S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0,\; S_{j+1} - S_j > 0,\; S_{j+2} - S_j > 0,\; \ldots)$$
$$= \mathrm{Prob}(-S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0) \cdot \mathrm{Prob}(S_{j+1} - S_j > 0,\; S_{j+2} - S_j > 0,\; \ldots), \qquad (1.44)$$

since the random variables $(S_i - S_j)$, $0 \le i \le j-1$, are independent of $(S_n - S_j)$, $n \ge j+1$, by proposition 3. Note that

$$\mathrm{Prob}(-S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0) = \mathrm{Prob}(S_{j-1} - S_j > 0) \cdot \mathrm{Prob}(S_{j-2} - S_j > 0 \mid S_{j-1} - S_j > 0) \cdots \mathrm{Prob}(S_{j-i} - S_j > 0 \mid S_{j-i+1} - S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0) \cdots \qquad (1.45)$$

Given $1 \le i \le j$ observe that

$$\mathrm{Prob}(S_{j-i} - S_j > 0 \mid S_{j-i+1} - S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0) = \mathrm{Prob}(-Z_i > 0 \mid -Z_{i-1} > 0,\; \ldots,\; -Z_1 > 0)$$
$$= \mathrm{Prob}(-Z_{i-1} - y_i > 0 \mid -Z_{i-1} > 0,\; \ldots,\; -Z_1 > 0) = \mathrm{Prob}(-Z_{i-1} - y_i > 0 \mid -Z_{i-1} > 0), \qquad (1.46)$$

because $y_i$ is independent of $Z_n$, $n = 1, \ldots, i-1$. Therefore,

$$\mathrm{Prob}(-S_j > 0,\; S_1 - S_j > 0,\; \ldots,\; S_{j-1} - S_j > 0) = \prod_{i=1}^{j} \mathrm{Prob}(S_{j-i} - S_j > 0 \mid S_{j-i+1} - S_j > 0) = \prod_{i=1}^{j} p_i^*. \qquad (1.47)$$

Similarly, one can show

$$\mathrm{Prob}(S_{j+1} - S_j > 0,\; S_{j+2} - S_j > 0,\; \ldots) = \prod_{i=1}^{\infty} p_i. \qquad (1.48)$$

Putting the two results together,

$$\lim_{T\to\infty} \mathrm{Prob}(m^* = M+j) = \prod_{i=1}^{j} p_i^* \prod_{i=1}^{\infty} p_i. \qquad (1.49)$$
In order to calculate the limiting probability that $m^*$ is equal to $M+j$, we need to approximate expression 1.49. To accomplish this end let $u_{i-1} \sim \chi^2(i-1)$, $v \sim \chi^2(1)$ independent of $u_{i-1}$, and for $i \ge 2$ note that

$$p_i = \mathrm{Prob}(Z_i > 0 \mid Z_{i-1} > 0) = \mathrm{Prob}(2i - u_{i-1} - v > 0 \mid 2(i-1) - u_{i-1} > 0)$$
$$= \mathrm{Prob}(u_{i-1} + v < 2i \mid u_{i-1} < 2(i-1)) = \mathrm{Prob}(u_i < 2i \mid u_{i-1} < 2(i-1)). \qquad (1.50)$$

Figure 1 (axes $u_{i-1}$ and $v$)

The shaded area of figure 1 represents the set of $u_{i-1}$ and $v$ such that $u_{i-1} + v < 2i$ and $u_{i-1} < 2(i-1)$. Let $F_{u_i}(\cdot)$, $F_{u_{i-1}}(\cdot)$, and $F_v(\cdot)$ denote the cumulative distribution functions (cdf) of $u_i$, $u_{i-1}$ and $v$, and let $f_{u_i}$, $f_{u_{i-1}}$, and $f_v$ denote the corresponding probability density functions (pdf). It is clear from figure 1 that there are several representations of $p_i$ in terms of these cdf and pdf. Since the $p_i$ must be calculated numerically, we have chosen the following expression for $p_i$ to minimize computational cost:

$$p_i = \frac{F_{u_i}(2i) - \int_0^2 \left[ F_{u_{i-1}}(2i - v) - F_{u_{i-1}}(2(i-1)) \right] f_v(v)\, dv}{F_{u_{i-1}}(2(i-1))} = \frac{F_{u_i}(2i) - \int_0^2 F_{u_{i-1}}(2i - v) f_v(v)\, dv + F_v(2)\, F_{u_{i-1}}(2(i-1))}{F_{u_{i-1}}(2(i-1))}. \qquad (1.51)$$

Numerical integration of the second term in the numerator of 1.51 was carried out as follows. The closed interval [0,2] was divided into 20,000 disjoint intervals, and the value of $f_v(v)$, $0 \le v \le 2$, was calculated using the approximation

$$f_v(v) \approx \mathrm{Prob}(\chi^2(1) < v + .0001) - \mathrm{Prob}(\chi^2(1) < v - .0001). \qquad (1.52)$$

A similar procedure is used to find $p_i^*$, $i \le j$:

$$p_i^* = \mathrm{Prob}(-Z_i > 0 \mid -Z_{i-1} > 0) = \mathrm{Prob}(2i - u_{i-1} - v < 0 \mid 2(i-1) - u_{i-1} < 0)$$
$$= \mathrm{Prob}(u_{i-1} + v > 2i \mid u_{i-1} > 2(i-1)) = \mathrm{Prob}(u_i > 2i \mid u_{i-1} > 2(i-1)). \qquad (1.53)$$

Figure 2 (axes $u_{i-1}$ and $v$; boundary at $u_{i-1} = 2(i-1)$)

The shaded area of figure 2 represents the set of $u_{i-1}$ and $v$ such that $u_{i-1} + v > 2i$ and $u_{i-1} > 2(i-1)$.

$$p_i^* = \frac{1 - F_{u_{i-1}}(2(i-1)) - \int_0^2 \left[ F_{u_{i-1}}(2i - v) - F_{u_{i-1}}(2(i-1)) \right] f_v(v)\, dv}{1 - F_{u_{i-1}}(2(i-1))} = \frac{1 - F_{u_{i-1}}(2(i-1))\left[ 1 - F_v(2) \right] - \int_0^2 F_{u_{i-1}}(2i - v) f_v(v)\, dv}{1 - F_{u_{i-1}}(2(i-1))}. \qquad (1.54)$$
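The numerical scheme behind 1.51 and 1.54 can be sketched as follows. The sketch follows the spirit of 1.52, weighting the integrand by increments of the $\chi^2(1)$ cdf over small subintervals of [0,2], but the exact grid (midpoint evaluation over `n` equal subintervals) and all function names are assumptions rather than the author's program. The chi-square cdf is computed from the closed-form series for integer degrees of freedom.

```python
import math


def chi2_cdf(x, k):
    """P(chi2(k) <= x) for integer k >= 1, via closed-form series."""
    if x <= 0.0:
        return 0.0
    half = x / 2.0
    if k % 2 == 0:
        term = total = 1.0
        for r in range(1, k // 2):
            term *= half / r
            total += term
        return 1.0 - math.exp(-half) * total
    term, total = 1.0, 0.0
    for r in range(1, (k - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    upper = (math.erfc(math.sqrt(half))
             + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total)
    return 1.0 - upper


def overlap_integral(i, n=20000):
    """Approximate int_0^2 [F_{u_{i-1}}(2i - v) - F_{u_{i-1}}(2(i-1))] f_v(v) dv
    by summing integrand values times chi2(1) cdf increments."""
    h = 2.0 / n
    base = chi2_cdf(2.0 * (i - 1), i - 1)
    total = 0.0
    for s in range(n):
        a, b = s * h, (s + 1) * h
        weight = chi2_cdf(b, 1) - chi2_cdf(a, 1)     # stands in for f_v(v) dv
        total += (chi2_cdf(2.0 * i - 0.5 * (a + b), i - 1) - base) * weight
    return total


def p_i(i, n=20000):
    """Expression 1.51, for i >= 2."""
    return (chi2_cdf(2.0 * i, i) - overlap_integral(i, n)) / chi2_cdf(2.0 * (i - 1), i - 1)


def p_star_i(i, n=20000):
    """Expression 1.54, for i >= 2."""
    base = chi2_cdf(2.0 * (i - 1), i - 1)
    return (1.0 - base - overlap_integral(i, n)) / (1.0 - base)
```

Working with cdf increments rather than density values avoids evaluating the $\chi^2(1)$ density at zero, where it is unbounded. A coarser grid such as `n=2000` is usually enough for a quick check.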
Table 2.1 summarizes the results of these calculations.
Table 2.1 Limiting Probabilities that $m^* = M+j$, $j = 0, \ldots, 14$.

Columns: $j$; $M+j$; $\lim_{T\to\infty} \mathrm{Prob}(m^* = M+j) = \prod_{i=1}^{j} p_i^* \prod_{i=1}^{\infty} p_i$.
The figures in Table 2.1 indicate that as $T \to \infty$, the CAT criterion does not prevent asymptotic overfitting of the distributed lag model, i.e. $\lim_{T\to\infty} \mathrm{Prob}(m^* > M) = .5646$. As $T \to \infty$ Parzen's criterion selects the true lag length approximately 44% of the time. The limiting probabilities that $m^* = M+j$ converge rapidly to zero as $j$ increases. Because the use of the CAT criterion results in a non-zero probability that $m^* = M$, there is a gain in asymptotic efficiency when using this criterion to estimate the DLM.

Consider the estimator of an unconstrained distributed lag model of unknown order called Hannan inefficient (HI) (7). To get consistent estimates of the coefficients using HI, the fitted distributed lag $m(T)$ must be expanded to infinity with sample size, while the ratio of $m(T)$ over $T$ goes to zero (17, p. 304). The terminology "Hannan inefficient" stems from the fact that the lag distribution must be expanded indefinitely in both directions, so there is no way of incorporating prior information on the lag distribution (one-sidedness or known lag length) into the estimation procedure. Our analysis is comparable to the HI procedure if model 1.3 is amended to include $M$ future x's. None of the preceding analysis (propositions 1-3) is affected if we fit symmetric two-sided lag distributions of order $m(T)$. Once the random variables $\hat\sigma_j^2$, $j = 1, \ldots, m(T)$ are redefined as the error sum of squares from a two sided distributed lag with $j$ leads and lags divided by $T-2j-1$, the previous results follow with appropriate changes in degrees of freedom. The salient feature of these results is that $\lim_{T\to\infty} \mathrm{Prob}(m^* = M) > 0$. Also, since the limiting probability that $m^* = M+j$ goes to zero as $j$ gets large, for $\delta > 0$ there exists some integer $M_0 = M_0(\delta)$ such that $\lim_{T\to\infty} \mathrm{Prob}(m^* > M_0) < \delta$. Hence there is a gain in asymptotic efficiency over HI when CAT is used to estimate the length of a symmetric two sided distributed lag model. As $T \to \infty$, the CAT criterion selects a finite lag length $m^* \ge M$ with non-zero probability.$^{14}$
We conclude this section with a discussion of the large sample properties of $\hat\beta_{m^*} = (X_{m^*}' X_{m^*})^{-1} X_{m^*}' y$. We shall focus attention on the first $M+1$ components of $\hat\beta_{m^*}$.

Proposition 4: Let $\hat\beta_{m^*} = (\hat\beta_M', \hat\beta_{m^*-M}')'$. Then $\operatorname{plim}_{T\to\infty} \hat\beta_M = \beta_M$.

Proof: By proposition 1 we need only consider those values of $m^*$ such that $m^* \ge M$. Let $\{x(t)\}$ denote the entire history of the x process, $\{\ldots, x_{t-1}, x_t, x_{t+1}, \ldots\}$. Then

$$\lim E(\hat\beta_{m^*} \mid \{x(t)\}, m^*) = (X_{m^*}' X_{m^*})^{-1} X_{m^*}' X_M \beta_M = \begin{pmatrix} \beta_M \\ 0 \end{pmatrix}, \qquad (1.55)$$

where $\beta_M$ is $(M+1) \times 1$ and the zero vector is $(m^*-M) \times 1$, and

$$\lim \mathrm{Var}(\hat\beta_{m^*} \mid m^*, \{x(t)\}) = \sigma^2 (X_{m^*}' X_{m^*})^{-1}.$$

The $(M+1) \times (M+1)$ upper left hand corner block of $\sigma^2 (X_{m^*}' X_{m^*})^{-1}$ is equal to $\sigma^2 (X_M' P_{m^*-M} X_M)^{-1}$. Denote this matrix by $V(m^*)$. The expectation of $V(m^*)$ with respect to $m^*$ is equal to the variance of the first $M+1$ components of $\hat\beta_{m^*}$ conditioned on $\{x(t)\}$ alone:

$$E_{m^*}(V(m^*) \mid \{x(t)\}) = \sum_{m^*=0}^{M-1} V(m^*)\, f(m^*, T \mid \{x(t)\}) + \sum_{m^*=M}^{m(T)} V(m^*)\, f(m^*, T \mid \{x(t)\}), \qquad (1.56)$$

where $f(m^*, T \mid \{x(t)\})$ is the pdf of $m^*$ given the x process. The sample size is included as an argument of $f(\cdot)$ to re-emphasize the dependence of $m^*$ on $T$. Since $\operatorname{plim}_{T\to\infty} V(m^*)$ is a well defined matrix (see footnote 15 below), the first term on the right hand side of 1.56 converges to zero as $T \to \infty$ by proposition 1. The second term,

$$\sum_{m^*=M}^{m(T)} V(m^*)\, f(m^*, T \mid \{x(t)\}) = \sigma^2 \sum_{m^*=M}^{m(T)} (X_M' P_{m^*-M} X_M)^{-1} f(m^*, T \mid \{x(t)\}) \le \sigma^2 (X_M' P_{m(T)-M} X_M)^{-1} \sum_{m^*=M}^{m(T)} f(m^*, T \mid \{x(t)\}), \qquad (1.57)$$

where $\le$ denotes matrix ordering. The matrix $P_{m(T)-M}$ is equal to $I - X_{m(T)-M} (X_{m(T)-M}' X_{m(T)-M})^{-1} X_{m(T)-M}'$. Observe that

$$\operatorname{plim}_{T\to\infty} \sigma^2 (X_M' P_{m(T)-M} X_M)^{-1} \sum_{m^*=M}^{m(T)} f(m^*, T \mid \{x(t)\}) = \lim_{T\to\infty} \left( \frac{\sigma^2}{T} \right) \operatorname{plim}_{T\to\infty} \left( \frac{X_M' P_{m(T)-M} X_M}{T} \right)^{-1} \cdot 1 = 0,$$

since $\operatorname{plim}_{T\to\infty} (X_M' P_{m(T)-M} X_M / T)^{-1}$ is a well defined matrix.$^{15}$

Therefore, the first $M+1$ components of $\hat\beta_{m^*}$ consistently estimate $\beta_M$. Conditional on $\{x(t)\}$ and a fixed $m^* \ge M$, the limiting distribution of $\sqrt{T}\left( \hat\beta_{m^*} - (\beta_M', 0')' \right)$ is normal with zero mean vector and covariance matrix $\sigma^2 Q(m^*)^{-1}$. This is true because for fixed $m^* \ge M$, $\frac{1}{\sqrt{T}} X_{m^*}' \varepsilon \sim N\left( 0, \sigma^2 (X_{m^*}' X_{m^*} / T) \right)$ and $\operatorname{plim}_{T\to\infty} (X_{m^*}' X_{m^*} / T)^{-1} = Q(m^*)^{-1}$. The innovation in our analysis has been the estimation of both $M$ and the distributed lag coefficients, but it is applied work that motivates this discussion of the conditional limiting distribution of $\hat\beta_{m^*}$ given $m^*$ and $\{x(t)\}$. Once $m^*$ has been selected using the CAT criterion, it is convenient for the researcher to act as if $m^*$ were fixed when performing hypothesis tests using coefficient estimates and their estimated covariance matrix. In the next section we examine the bias associated with conventional coefficient t-tests when CAT is used to select a DLM from a set of competing models of various lag lengths.
Although we have worked out the limiting probabilities that $m^* = M+i$, $i = 0, 1, \ldots, m(T)$ for a distributed lag regression model with an arbitrary zero mean covariance stationary x process and normal independent disturbances, determining the limiting distribution of $\sqrt{T}(\hat\beta_{m^*} - \beta_{m^*})$ or some other function of $\hat\beta_{m^*}$ remains a difficult problem. The column dimension of $X_{m^*}$ is a random variable with no upper bound, and "unconventional" central limit theorems are required to examine the limiting distribution of $(1/\sqrt{T}) X_{m^*}' \varepsilon$. This is true even if one restricts attention to $\hat\beta_M = (X_M' P_{m^*-M} X_M)^{-1} X_M' P_{m^*-M}\, y$, the first $M+1$ components of $\hat\beta_{m^*}$. The paper by Sims (17) discusses the problems associated with infinite dimensional parameter spaces; we shall not pursue the subject further.
3. A Monte Carlo study of the use of the CAT criterion to select the order of the distributed lag model.
In this section we report results from a series of Monte Carlo experiments in which the CAT criterion was used to select the length of the coefficient lag distribution in the regression model 1.3 of section one,

$$y(t) = \sum_{s=0}^{M} \beta(s)\, x(t-s) + \varepsilon(t), \qquad t = 1, \ldots, T,$$

when $M$ is an unknown nonnegative integer. The explanatory variable $x(t)$ was generated by the covariance stationary process

$$x(t)\,(1 - .8L)^2 = e^*(t), \qquad t = 1, \ldots, T, \qquad (1.59)$$

where $L$ denotes the lag operator. Both $e^*(t)$ and $\varepsilon(t)$ are "pseudo-random" standard normal variates independent of one another. This particular parameterization for the $x(t)$ process was chosen since the autocovariance function of $x(t)$ closely resembles that of a typical U.S. time series. For each experiment we chose samples of size 50, 100, and 200, and one of the following lag distributions:

a) No lag distribution; $\beta(0) = 1.0$ and $\beta(i) = 0$, $i \neq 0$.
b) Linear decay; $\beta(i) = 1.0$, $0 \le i \le 4$ and $\beta(i) = 1.0 - .1(i-4)$, $5 \le i \le 13$. (1.60)
c) Box; $\beta(0) = .5 = \beta(6)$, $\beta(i) = .8$, $1 \le i \le 5$.
d) Infinite geometric; $\beta(i) = .8^i$, $i = 0, 1, \ldots$.$^{18}$
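The experimental design above can be mimicked with a short simulation. The sketch below is a rough reconstruction under several assumptions that should be read as such: process 1.59 is taken to be the AR(2) $x(t) = 1.6x(t-1) - .64x(t-2) + e^*(t)$ implied by $(1-.8L)^2$, the infinite geometric lag is truncated at 60 terms, a burn-in of 200 observations is discarded, and the function names and the `kind` labels are hypothetical.

```python
import random


def lag_coefficients(kind, truncate=60):
    """Lag distributions (a)-(d) of 1.60; (d) is truncated at `truncate` terms."""
    if kind == "none":
        return [1.0]
    if kind == "linear":
        return [1.0] * 5 + [1.0 - 0.1 * (i - 4) for i in range(5, 14)]
    if kind == "box":
        return [0.5] + [0.8] * 5 + [0.5]
    if kind == "geometric":
        return [0.8 ** i for i in range(truncate)]
    raise ValueError("unknown lag distribution: %s" % kind)


def simulate(kind, T, burn=200, rng=None):
    """Draw one sample (x, y) of length T from the assumed design."""
    rng = rng or random.Random(0)
    beta = lag_coefficients(kind)
    n = T + burn + len(beta)
    x = [0.0, 0.0]
    for _ in range(n):
        # (1 - .8L)^2 x(t) = e*(t)  =>  x(t) = 1.6 x(t-1) - .64 x(t-2) + e*(t)
        x.append(1.6 * x[-1] - 0.64 * x[-2] + rng.gauss(0.0, 1.0))
    x = x[2:]
    start = len(x) - T          # start >= len(beta), so every needed lag exists
    y = [sum(b * x[t - s] for s, b in enumerate(beta)) + rng.gauss(0.0, 1.0)
         for t in range(start, len(x))]
    return x[start:], y
```

A replication of the experiment would call `simulate` once per replication and sample size and pass the result to whatever order selection routine is being studied.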
The maximum order of the fitted distributed lag models increased with sample size: when T=50, distributed lags of order 0-12 were fit; for T=100, 0-18; and for T=200, 0-27. For each replication and sample size the CAT criterion estimated a particular lag distribution. After 100 replications the following summary statistics (i-viii) were calculated.

(i) The average order of the lag distribution that was selected by minimizing CAT(i), $i = 0, 1, \ldots, m(T)$, denoted $\bar{m}^*(T)$. Let $m^*(T,k)$ denote the order chosen for sample size $T$ and replication $k$. The average order $\bar{m}^*(T)$ is given by

$$\bar{m}^*(T) = \frac{1}{100} \sum_{k=1}^{100} m^*(T,k), \qquad T = 50, 100, 200. \qquad (1.61)$$

We include this statistic in the analysis since it gives a general indication of the performance of CAT as sample size increases. For example, when the CAT criterion is applied to the finite lag distributions (a), (b), (c), we would expect the average $m^*$ to coincide more closely with the population lag length $M$ the larger the sample size. When CAT is used to estimate the lag distribution which is of infinite length, lag distribution (d), we would expect the average $\bar{m}^*(T)$ to be an increasing function of sample size.

(ii) The average estimate of $\sigma^2$, the variance of the disturbance term ($\sigma^2 = \mathrm{Var}(\varepsilon(t)) = 1$), denoted $\bar{\hat\sigma}^2(T)$. Let $\hat\sigma^2(T,k)$ denote the estimate of $\sigma^2$ for sample size $T$ and replication $k$. Then $\bar{\hat\sigma}^2(T)$ is given by

$$\bar{\hat\sigma}^2(T) = \frac{1}{100} \sum_{k=1}^{100} \hat\sigma^2(T,k), \qquad T = 50, 100, 200. \qquad (1.62)$$

This statistic is important since the residual variance is one component of the estimated variance of the coefficient estimates. If $\hat\sigma^2$ is biased downwards (upwards), we would expect coefficient t statistics to be too large (small) on average, provided the coefficient estimator is unbiased. In any event we would expect bias in $\bar{\hat\sigma}^2(T)$ to diminish as sample size increases.
(iii) A Kolmogorov-Smirnov (K-S) statistic to test the null hypothesis that the sum of the estimated coefficients was equal to the sum of the true coefficients. Let CSUM(T,k) denote the sum of the coefficients from the estimated lag distribution of order m*(T,k) minus the sum of the population distributed lag coefficients, divided by the (T,k)-th standard error of the estimated sum. Assume that the CSUM(T,k) are independent and identically distributed normal variates with zero mean and unit variance, denoted N(0,1). Suppose CSUM(T,k) is re-indexed by CSUM(T,$\ell$) so that the CSUM(T,$\ell$), $\ell$=1, ..., 100 form a monotone increasing sequence. Let $\Phi$(CSUM(T,$\ell$)) denote the cumulative distribution function (CDF) of a N(0,1) evaluated at CSUM(T,$\ell$). The two-sided K-S(T), T=50, 100, 200 statistics reported below are equal to the maximum absolute difference between the sample and population CDF of CSUM(T,$\ell$),
$\text{K-S}(T) = \max_{\ell = 1, \ldots, 100} \left| \frac{\ell}{100} - \Phi(\mathrm{CSUM}(T,\ell)) \right|, \quad T = 50, 100, 200.$ (1.63)
The null hypothesis that the estimated sum is equal to the true sum is rejected at the 5% (1%) significance level if the K-S(T) statistic exceeds .136 (.163).
An asterisk denotes acceptance of the null hypothesis at the 5% significance level.
We choose to report a K-S statistic for the sum of the estimated coefficients since for some distributed lag models this sum represents the cumulative or long run response of an endogenous variable to a once and for all change in an exogenous variable. The researcher may want to know if the use of the CAT criterion distorts this statistic. It should be noted that the same information concerning the distribution of the sum of the estimated coefficients could have been obtained from a standard t-statistic. Failure to do so was an oversight on the author's part.
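The K-S statistic of (1.63) and its critical values can be sketched as follows. This is a minimal illustration, not the original program: the function names are hypothetical, and the constants 1.36 and 1.63 are the usual large-sample coefficients, which reproduce the quoted thresholds of .136 and .163 for 100 replications. The sketch follows the formula in the text, which compares only $\ell/100$ with the normal CDF; the textbook two-sided statistic also compares $(\ell-1)/100$.

```python
from statistics import NormalDist

def ks_statistic(csum):
    """K-S statistic as in (1.63): max over l of | l/n - Phi(CSUM_(l)) |."""
    ordered = sorted(csum)            # re-index into a monotone increasing sequence
    n = len(ordered)
    phi = NormalDist().cdf            # CDF of N(0,1)
    return max(abs((l + 1) / n - phi(v)) for l, v in enumerate(ordered))

def ks_critical_value(n, coefficient=1.36):
    """Large-sample critical value: coefficient / sqrt(n)."""
    return coefficient / n ** 0.5
```

With n = 100, `ks_critical_value(100)` and `ks_critical_value(100, 1.63)` give the 5% and 1% thresholds used in the text.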
(iv) The average bias of the coefficient estimates, denoted BIAS(i,T). Let $\hat\beta(i,T,k)$ denote the ith estimated coefficient, i=0, 1, ..., m*(T,k), for sample size T and replication k, and let $\beta(i)$ denote the value of the ith population coefficient from 2.12 a-d.
For T=50, 100, 200 define $\beta^*(i,T,k)$ as
$\beta^*(i,T,k) = \begin{cases} \hat\beta(i,T,k) & \text{if } i = 0, 1, \ldots, m^*(T,k), \text{ and} \\ 0 & \text{if } i = m^*(T,k)+1, \ldots, T^{.6}. \end{cases}$ (1.64)
Then BIAS(i,T) is given by
$\mathrm{BIAS}(i,T) = \frac{1}{100} \sum_{k=1}^{100} \left( \beta^*(i,T,k) - \beta(i) \right),$ (1.65)
$i = 0, 1, \ldots, T^{.6}$, and T=50, 100, 200.
This statistic is important since it helps determine the reliability of coefficient point estimates when the lag distribution has been estimated using the CAT criterion.
(v) The number of times that coefficient (i,T) was included in the estimated lag distribution, denoted NTIME(i,T). Define the variable Q(i,T,k) as follows:
$Q(i,T,k) = \begin{cases} 1 & \text{if } i \le m^*(T,k), \text{ and} \\ 0 & \text{if } i = m^*(T,k)+1, \ldots, T^{.6}. \end{cases}$ (1.66)
Then NTIME(i,T) is given by
$\mathrm{NTIME}(i,T) = \sum_{k=1}^{100} Q(i,T,k), \quad i = 0, 1, \ldots, T^{.6}, \quad T = 50, 100, 200.$ (1.67)
The values NTIME(i,T), i=0, 1, ..., $T^{.6}$, T=50, 100, 200 can be used to compute the sample frequencies that m*(T) = i, i=0, ..., $(T^{.6})$, which can then be compared to the limiting probabilities that m* = i, i=0, ..., (M+14) that were calculated in section two, pages 18-26. The sample frequency that m*(T) = i for any sample size and lag distribution is given by
$(\mathrm{NTIME}(i,T) - \mathrm{NTIME}(i+1,T))/100$ for $i = 0, \ldots, (T^{.6} - 1)$, and $\mathrm{NTIME}(i,T)/100$ for $i = T^{.6}$. (1.68)
We expect there to be greater coincidence between the sample and limiting probabilities that m*(T) = i, i=0, 1, ..., $T^{.6}$, the larger is the sample size.
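The bookkeeping in (1.66)-(1.68) can be illustrated with a short Python sketch. The helper names are hypothetical; the NTIME counts are those reported for lag distribution (a), T=50, in Table 2.2. As a consistency check, the mean of the implied distribution of m* equals the sum of NTIME(i,T) over i >= 1 divided by the number of replications, which should reproduce the reported average order.

```python
def ntime(m_star, max_order):
    """NTIME(i): number of replications whose selected order m* is at least i."""
    return [sum(1 for m in m_star if i <= m) for i in range(max_order + 1)]

def order_frequencies(counts):
    """Sample frequency that m* = i, following (1.68)."""
    n = counts[0]   # lag 0 is always included, so NTIME(0) is the replication count
    freq = [(counts[i] - counts[i + 1]) / n for i in range(len(counts) - 1)]
    freq.append(counts[-1] / n)   # highest order fitted
    return freq

# NTIME(50,i), i = 0..12, for lag distribution (a) as reported in Table 2.2
table_2_2_T50 = [100, 51, 45, 38, 34, 30, 25, 24, 21, 18, 16, 13, 7]
freq = order_frequencies(table_2_2_T50)
mean_order = sum(i * f for i, f in enumerate(freq))
```

Here `mean_order` comes out at 3.22, the value of m*(50) reported in Table 2.2, and `freq[0]` is .49, the share of replications in which the true order M = 0 was selected.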
(vi) The average mean square error of the estimated coefficients, denoted MSE(i,T):
$\mathrm{MSE}(i,T) = \frac{1}{100} \sum_{k=1}^{100} \left( \beta^*(i,T,k) - \beta(i) \right)^2,$ (1.69)
$i = 0, 1, \ldots, T^{.6}$, T=50, 100, 200.
This statistic is useful since we will compare it to the average estimated variance of each coefficient EVAR(i,T), i=0, ..., $T^{.6}$, T=50, 100, 200, which is described below.
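A minimal sketch of how BIAS(i,T) and MSE(i,T) are accumulated over replications, with coefficients beyond the selected order set to zero as in (1.64). The function names and the toy inputs are hypothetical, and the sketch assumes a finite population lag distribution so that population coefficients beyond its end are zero.

```python
def pad(coefficients, max_order):
    """Extend a coefficient list with zeros up to lag max_order."""
    return list(coefficients) + [0.0] * (max_order + 1 - len(coefficients))

def bias_and_mse(replications, beta, max_order):
    """Average bias and mean square error of each coefficient, lags 0..max_order.

    replications: one list of estimated coefficients per replication (length m*+1)
    beta: population coefficients beta(0), ..., beta(M)
    """
    padded = [pad(est, max_order) for est in replications]   # beta*(i,T,k)
    true = pad(beta, max_order)
    n = len(padded)
    lags = range(max_order + 1)
    bias = [sum(p[i] - true[i] for p in padded) / n for i in lags]
    mse = [sum((p[i] - true[i]) ** 2 for p in padded) / n for i in lags]
    return bias, mse

# Two toy replications with selected orders 0 and 1; true model has beta(0) = 1 only.
bias, mse = bias_and_mse([[1.2], [0.8, 0.4]], beta=[1.0], max_order=2)
```

Note that a lag which is rarely selected contributes mostly zeros to both averages, which is why NTIME(i,T) matters when reading the BIAS and MSE columns.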
(vii) The average estimated variance of coefficient (i,T), denoted EVAR(i,T). Let $\hat\sigma^2(T,k)\,(X(T,k)'X(T,k))^{-1}_{ii}$ denote the diagonal elements of the estimated covariance matrix of the $\hat\beta(i,T,k)$, i=0, 1, ..., $T^{.6}$. Define
$\hat\sigma^2(T,k)\,(X^*(T,k)'X^*(T,k))^{-1}_{ii} = \begin{cases} \hat\sigma^2(T,k)\,(X(T,k)'X(T,k))^{-1}_{ii} & \text{for } i = 0, 1, \ldots, m^*(T,k), \text{ and} \\ 0 & \text{for } i = m^*(T,k)+1, \ldots, T^{.6}. \end{cases}$ (1.70)
Then EVAR(i,T) is given by
$\mathrm{EVAR}(i,T) = \frac{1}{100} \sum_{k=1}^{100} \hat\sigma^2(T,k)\,(X^*(T,k)'X^*(T,k))^{-1}_{ii},$
$i = 0, \ldots, T^{.6}$, T=50, 100, 200.
If BIAS(i,T) is small, we expect EVAR(i,T) and MSE(i,T) to be roughly the same. Should EVAR(i,T) be biased upwards (downwards), then coefficient t-tests will be too small (large). We examine the bias of coefficient hypothesis tests under the null hypothesis that $\beta(i) = 0$, i=0, 1, ..., $T^{.6}$ using F(i,T,k) described below.
(viii) The sample CDF of the ratio of each coefficient to its estimated standard error, assuming the ratio has a N(0,1) distribution. Define F(i,T,k) as
$F(i,T,k) = \begin{cases} \Phi\left( \hat\beta(i,T,k) \big/ \left( \hat\sigma^2(T,k)\,(X(T,k)'X(T,k))^{-1}_{ii} \right)^{1/2} \right) & \text{for } i = 0, 1, \ldots, m^*(T,k), \text{ and} \\ .5 & \text{for } i = m^*(T,k)+1, \ldots, T^{.6}. \end{cases}$ (1.71)
If the CAT criterion selected an estimated lag distribution of order m*(T,k) < $T^{.6}$, then the (m*(T,k) + 1) through $(T^{.6})$-th coefficients were arbitrarily assigned a cumulative probability
of .5. The cumulative normal probabilities F(i,T,k) are found in 13 columns of tables 2.3, 2.5, 2.7, and 2.9, a table for each lag distribution (1.60) a-d. For example, the entry corresponding to the column headed by 3 and the row associated with Lag 6 in Table 2.3 - Lag Distribution (a) - T=50, page 43 is 2. This means that F(6,50,k) is in the third probability cell; F(6,50,k) is less than .05 and greater than or equal to .025 for 2 out of the 100 replications for lag distribution (a), sample size 50, and the coefficient corresponding to the x variable lagged six periods. The null hypothesis that the ratio of an estimated coefficient to its estimated standard error is distributed as a N(0,1) is incorrect for the first M+1 coefficients of each lag distribution and correct for all others. Theoretically, the sample ratios of those coefficients whose population value is zero should be distributed across the columns of tables 2.3, 2.5, 2.7, and 2.9 in proportion to the cumulative probabilities at the head of each column. For example, given any lag distribution there should be approximately 30 replications in the first five columns in all rows where the population coefficient is zero, since the first five columns represent a cumulative probability of .30. We shall return to this point during the analysis of the Monte Carlo results.
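The assignment of F(i,T,k) to the 13 probability cells can be sketched as below. The cell boundaries are taken from the column headings of tables 2.3, 2.5, 2.7, and 2.9; treating column 7 as the cell reserved for the arbitrary value .5 given to excluded coefficients is an assumption inferred from the later discussion of column 7, and the function name is hypothetical.

```python
from bisect import bisect_right

# Boundaries of the 12 probability intervals [lower, upper)
EDGES = [0.0, .0125, .025, .05, .1, .3, .5, .7, .9, .95, .975, .9875, 1.0]

def cell(F, included=True):
    """Column number (1-13) of the probability cell containing F."""
    if not included:
        return 7                            # excluded coefficient: assigned .5 (assumed column 7)
    k = min(bisect_right(EDGES, F), 12)     # 1..12: which interval F falls in
    return k if k <= 6 else k + 1           # intervals above .5 sit in columns 8-13
```

For instance, a value of F between .025 and .05 lands in column 3, as in the worked example for Lag 6, and under a true null hypothesis an included coefficient falls in columns 1-5 with probability `EDGES[5]`, that is .30.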
Last, the statistics described in (i-vii) above are found in tables 2.2, 2.4, 2.6, and 2.8; these tables correspond to the four lag distributions 1.60 a-d. The first three statistics (i-iii) are found under the heading general statistics, while (iv-vii) are found under the heading coefficient statistics.
Table 2.2
Lag Distribution (a)
T=50
General Statistics
m*(50) = 3.22    $\hat\sigma^2(50)$ = .8836    K-S(50) = .0974*
Coefficient Statistics
BIAS(50,i)   NTIME(50,i)   MSE(50,i)   EVAR(50,i)   LAG-i
 .0281         100          .0264       .0116         0
-.0040          51          .0755       .0332         1
 .0370          45          .0634       .0316         2
-.00898         38          .0701       .0282         3
-.0207          34          .0412       .0251         4
 .0135          30          .0439       .0211         5
 .0105          25          .0394       .0194         6
-.0135          24          .0378       .0181         7
 .00456         21          .0590       .0162         8
-.00225         18          .0466       .0140         9
 .00461         16          .0354       .0115        10
-.000409        13          .0264       .00640       11
-.000504         7          .00856      .00152       12
Table 2.2 cont.
Lag Distribution (a)
T=100
General Statistics
m*(100) = 3.07    $\hat\sigma^2(100)$ = .9476    K-S(100) = .1013*
Coefficient Statistics
BIAS(100,i)   NTIME(100,i)   MSE(100,i)   EVAR(100,i)   LAG-i
 .0178          100           .00902       .00543         0
-.0213           56           .0254        .0157          1
 .0122           41           .0247        .0101          2
 .0104           34           .0273        .0120          3
-.0274           28           .0199        .0107          4
 .0177           25           .0169        .00949         5
 .00356          21           .0180        .00845         6
-.0211           18           .0154        .00760         7
 .0136           15           .0212        .00694         8
-.0000761        13           .0146        .00627         9
-.00212          12           .00755       .00485        10
 .000331          8           .00678       .00433        11
 .0111            8           .00570       .00431        12
-.0121            8           .00753       .00399        13
-.00256           7           .00921       .00304        14
 .0114            5           .00984       .00223        15
-.00753           4           .00440       .00163        16
 .000233          3           .000761      .000705       17
 .000430          1           .0000185     .000110       18
Table 2.2 cont.
Lag Distribution (a)
T=200
General Statistics
m*(200) = 3.11    $\hat\sigma^2(200)$ = .9747    K-S(200) = .0778*
Coefficient Statistics
BIAS(200,i)   NTIME(200,i)   MSE(200,i)    EVAR(200,i)   LAG-i
 .00856         100           .00372        .00230         0
-.00993          51           .0134         .00656         1
 .00718          36           .0142         .00632         2
 .00480          32           .0116         .00524         3
-.0137           26           .00952        .00425         4
 .000530         21           .00623        .00356         5
 .00488          17           .00641        .00317         6
 .00207          16           .00433        .00264         7
-.00140          12           .00426        .00233         8
 .00399          11           .00124        .00218         9
-.00667          10           .00239        .00212        10
 .00528          10           .00331        .00202        11
 .000179          9           .00259        .00196        12
-.00381           9           .00290        .00182        13
 .0116            8           .00629        .00168        14
-.00910           8           .00289        .00128        15
 .000691          5           .000579       .00111        16
-.00411           5           .000949       .00109        17
 .00780           5           .00253        .000958       18
-.00371           4           .00118        .000913       19
 .00430           4           .00174        .000860       20
-.00364           4           .00104        .000621       21
-.00147           3           .000127       .000402       22
 .00101           2           .000186       .000330       23
 .000744          2           .0000993      .000210       24
 .0000909         1           .000000826    .000504       25
 -                0           -             -             26
 -                0           -             -             27
Table 2.3 - Lag Distribution (a)
T=50, 100, 200
Number of replications in each probability cell
Rows are indexed by lag; the 13 columns are the probability cells 0.0-.0125, .0125-.025, .025-.05, .05-.1, .1-.3, .3-.5, .5, .5-.7, .7-.9, .9-.95, .95-.975, .975-.9875, and .9875-1.0. The cell counts are not legible in this copy.
Table 2.4
Lag Distribution (b)
T=50
General Statistics
m*(50) = 11.17    $\hat\sigma^2(50)$ = .9678    K-S(50) = .0645*
Coefficient Statistics
BIAS(50,i)   NTIME(50,i)   MSE(50,i)   EVAR(50,i)   LAG-i
-.000726       100          .0413       .0386         0
 .00821        100          .143        .129          1
-.00368        100          .169        .144          2
 .0289         100          .126        .142          3
-.0309         100          .101        .142          4
-.00246        100          .141        .141          5
 .00548        100          .150        .140          6
-.0107         100          .152        .142          7
 .0255         100          .194        .143          8
-.0403         100          .224        .137          9
 .0436          98          .193        .109         10
-.0143          74          .139        .0655        11
-.0151          45          .0394       .0170        12
Table 2.4 cont.
Lag Distribution (b)
T=100
General Statistics
m*(100) = 12.65    $\hat\sigma^2(100)$ = .9709    K-S(100) = .0705*
Coefficient Statistics
BIAS(100,i)   NTIME(100,i)   MSE(100,i)   EVAR(100,i)   LAG-i
-.0124          100           .0125        .0134          0
 .03?2          100           .0507        .0472          1
-.0367          100           .0431        .0531          2
 .0321          100           .0483        .0533          3
-.0130          100           .0659        .0533          4
-.000332        100           .0626        .0535          5
-.0162          100           .0655        .0533          6
 .0141          100           .0575        .0533          7
 .00265         100           .0550        .0534          8
 .0164          100           .0636        .0535          9
-.0225          100           .0507        .0509         10
 .00419          99           .0632        .0366         11
-.0150           63           .0506        .0229         12
 .0265           39           .0312        .0150         13
-.00234          28           .0197        .00899        14
-.0134           16           .0100        .00532        15
-.00139          10           .00757       .00335        16
 .00635           6           .00759       .00196        17
-.00189           4           .00119       .000479       18
Table 2.4 cont.
Lag Distribution (b)
T=200
General Statistics
m*(200) = 13.60    $\hat\sigma^2(200)$ = .9726    K-S(200) = .0917*
Coefficient Statistics
BIAS(200,i)   NTIME(200,i)   MSE(200,i)   EVAR(200,i)   LAG-i
-.00547         100           .00613       .00578         0
 .0158          100           .0178        .0205          1
-.0287          100           .0180        .0229          2
 .0323          100           .0255        .0229          3
-.0205          100           .0246        .0229          4
-.000843        100           .0220        .0228          5
 .00678         100           .0242        .0228          6
 .0681          100           .0207        .0228          7
-.0135          100           .0201        .0229          8
 .00826         100           .0251        .0230          9
-.000897        100           .0242        .0225         10
-.00953         100           .0344        .0185         11
 .00990          78           .0234        .0121         12
 .00232          47           .0128        .00762        13
-.00654          31           .0178        .00522        14
-.000163         23           .0117        .00393        15
-.000289         18           .00420       .00304        16
 .00124          14           .00436       .00240        17
-.00226          11           .00507       .00201        18
 .00577          10           .00408       .00153        19
-.00342           7           .00378       .00120        20
 .000717          6           .00309       .00100        21
-.000182          5           .00150       .000814       22
 .000406          4           .00116       .000602       23
 .000904          3           .00108       .000424       24
-.00106           2           .000474      .000230       25
-.00128           1           .000164      .000230       26
 -                0           -            -             27
Table 2.5 - Lag Distribution (b)
T=50, 100, 200
Number of replications in each probability cell
Rows are indexed by lag; the 13 columns are the probability cells 0.0-.0125, .0125-.025, .025-.05, .05-.1, .1-.3, .3-.5, .5, .5-.7, .7-.9, .9-.95, .95-.975, .975-.9875, and .9875-1.0. The cell counts are not legible in this copy.
Table 2.6
Lag Distribution (c)
T=50
General Statistics
m*(50) = 7.52    $\hat\sigma^2(50)$ = .9232    K-S(50) = .0752*
Coefficient Statistics
BIAS(50,i)   NTIME(50,i)   MSE(50,i)   EVAR(50,i)   LAG-i
-.0237         100          .0357       .0290         0
 .0273         100          .116        .0951         1
 .0133         100          .127        .106          2
-.0319         100          .0839       .106          3
 .0445         100          .0936       .106          4
-.0181         100          .138        .0966         5
-.0541          96          .121        .0635         6
 .0238          51          .0999       .0393         7
 .00714         36          .0580       .0281         8
 .0149          27          .0586       .0198         9
-.0150          19          .0398       .0143        10
-.00698         15          .0324       .00928       11
 .00625          8          .0102       .00253       12
Table 2.6 cont.
Lag Distribution (c)
T=100
General Statistics
m*(100) = 8.18    $\hat\sigma^2(100)$ = .9665    K-S(100) = .0581*
Coefficient Statistics (each column listed in order of LAG-i = 0, 1, ..., 18)
BIAS(100,i): -.0206, .0363, .00869, -.0246, -.0137, .0279, -.0328, .0305, -.0204, .0157, -.0101, .0100, -.00350, -.00918, .0149, -.0117, .00518, .000177, -.000429
NTIME(100,i): 100, 100, 100, 100; the remaining entries are not legible
MSE(100,i): .0127, .0458, .0535, .0431, .0476, .0517, .0332, .0332, .0208, .0158, .0232, .0238, .0147, .0117, .0132, .00826, .00595, .00286, .00135
EVAR(100,i): .0119, .0415, .0466, .0465, .0465, .0439, .0269, .0159, .0122, .0100, .00911, .00798, .00737, .00628, .00518, .00376, .00201, .000941, .000225
Table 2.6 cont.
Lag Distribution (c)
T=200
General Statistics
m*(200) = 7.86    $\hat\sigma^2(200)$ = .9810    K-S(200) = .0801*
Coefficient Statistics (each column listed in order of LAG-i = 0, 1, ..., 26)
BIAS(200,i): -.0104, .0143, .00793, -.0124, -.00273, .0164, -.0227, .00956, .00296, .00129, -.000682, -.0118, .0111, -.00553, .00643, -.00628, .00405, -.00392, .00138, .00339, -.00392, .00292, -.00216, .000766, -.000612, .000992, -.000362
NTIME(200,i): 100, 100, 100, 100, 100, 100, 100; the remaining entries are not legible
MSE(200,i): .00516, .0211, .0282, .0220, .0191, .0203, .0170, .0132, .00592, .00469, .00858, .00741, .00507, .00399, .00393, .00225, .00113, .000485, .000699, .000685, .00134, .00173, .000466, .0000586, .0000374, .0000984, .0000131
EVAR(200,i): .00540, .0189, .0210, .0211, .0121, .0200, .0124, .00712, .00479, .00366, .00305, .00245, .00200, .00173, .00150, .00109, .000947, .000923, .000791, .000716, .000548, .000323, .000260, .000257, .000257, .000228, .0000648
Table 2.7 - Lag Distribution (c)
T=50, 100, 200
Number of replications in each probability cell
Rows are indexed by lag; the 13 columns are the probability cells 0.0-.0125, .0125-.025, .025-.05, .05-.1, .1-.3, .3-.5, .5, .5-.7, .7-.9, .9-.95, .95-.975, .975-.9875, and .9875-1.0. The cell counts are not legible in this copy.
Table 2.8
Lag Distribution (d)
T=50
General Statistics
m*(50) = 11.53    $\hat\sigma^2(50)$ = 1.280    K-S(50) = .7004
Coefficient Statistics (each column listed in order of LAG-i = 0, 1, ..., 12)
BIAS(50,i): -.110, .0718, .0624, -.0782, .0232, -.0312, .0386, -.0336, .0125, -.0611, .0380, -.128, .267
NTIME(50,i): 100, 100, 100, 100, 100, 100, 100, 100, 100; the remaining entries are not legible
MSE(50,i): .0555, .114, .139, .144, .157, .163, .171, .202, .189, .166, .166, .200, .140
EVAR(50,i): .0518, .171, .193, .195, .194, .193, .191, .191, .187, .183, .171, .129, .0344
Table 2.8 cont.
Lag Distribution (d)
T=100
General Statistics
m*(100) = 16.43    $\hat\sigma^2(100)$ = 1.024    K-S(100) = .4030
Coefficient Statistics (each column listed in order of LAG-i = 0, 1, ..., 18)
BIAS(100,i): -.0205, .00888, .0296, -.00849, -.0189, .00353, .0131, -.0133, -.00213, .00221, .00520, -.00469, -.0177, .0153, .00530, .00225, -.0259, .0147, .0370
NTIME(100,i): 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100; the remaining entries are not legible
MSE(100,i): .0155, .0384, .0566, .0527, .0548, .0527, .0586, .0710, .0667, .0625, .0611, .0631, .0553, .0497, .0531, .0502, .0435, .0344, .0144
EVAR(100,i) (18 of the 19 entries are legible): .0158, .0538, .0600, .0601, .0601, .0602, .0603, .0603, .0602, .0596, .0597, .0591, .0576, .0537, .0444, .0334, .0194, .00444
Table 2.8 cont.
Lag Distribution (d)
T=200
General Statistics
m*(200) = 20.75    $\hat\sigma^2(200)$ = .9959    K-S(200) = .2566
Coefficient Statistics (each column listed in order of LAG-i = 0, 1, ..., 27)
BIAS(200,i) (signs are not legible): .0138, .0181, .00589, .00831, .00143, .00269, .00505, .00406, .0113, .00942, .00944, .0169, .0105, .00704, .00173, .00906, .0143, .0338, .0243, .0209, .00869, .00124, .00750, .0116, .000980, .0123, .000397, .000952
NTIME(200,i): 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100; the remaining entries are not legible
MSE(200,i): .00676, .0251, .0302, .0221, .0234, .0238, .0277, .0315, .0251, .0241, .0268, .0247, .0212, .0270, .0296, .0271, .0243, .0219, .0214, .0193, .0162, .0167, .00970, .00959, .0101, .00641, .00744, .00327
EVAR(200,i): .00620, .0216, .0241, .0242, .0242, .0241, .0241, .0241, .0242, .0243, .0242, .0242, .0242, .0241, .0240, .0234, .0224, .0203, .0170, .0137, .0113, .00916, .00746, .00600, .00494, .00387, .00271, .000710
Table 2.9 - Lag Distribution (d)
T=50, 100, 200
Number of replications in each probability cell
Rows are indexed by lag; the 13 columns are the probability cells 0.0-.0125, .0125-.025, .025-.05, .05-.1, .1-.3, .3-.5, .5, .5-.7, .7-.9, .9-.95, .95-.975, .975-.9875, and .9875-1.0. The cell counts are not legible in this copy.
For each replication let m* denote the lag length selected by the CAT criterion and let M denote the true lag length. The results of the Monte Carlo experiments indicate that when the CAT criterion is applied to the truncated lag distributions a and c, m* is greater than or equal to M except for 4 replications of lag distribution c, sample size 50, and m* is equal to M for approximately half the replications. The sample probabilities that m* = M+i, i=0, 1, ..., ($T^{.6}$ - M) roughly correspond to the limiting probabilities that m* = M+i, i=0, 1, ..., 14 which were tabulated in section two, page 26. As one would expect, the sample and limiting probabilities coincide more closely the larger the sample size.
The coefficients in lag distribution b decline linearly from one to zero in steps of .10. Coefficients whose population value is close to zero are frequently excluded from the lag distribution chosen by the CAT criterion, although the frequency of replications where m* is less than M declines as sample size increases. A similar result holds for the infinite geometric lag distribution d. In this experiment, the average m* is an increasing function of the sample size. Although we have not analyzed the large sample properties of the CAT criterion when the population distributed lag is infinite, the results of experiment d indicate that the bias of the estimated coefficients is small despite the specification error.
For experiments a, b, and c the null hypothesis that the sum of the estimated coefficients is equal to the sum of the population coefficients is easily accepted at a 5% significance level. This hypothesis is rejected for experiment d (all sample sizes) at any reasonable significance level. The latter result is not surprising as the fitted lag distributions are too short. For sample sizes of 50, 100, and 200, the average m* is equal to 11.5, 16.4, and 20.8, and the difference between the sum of experiment d population distributed lag coefficients (5.0) and the sum of the first 11, 16, and 21 population coefficients is .43, .14, and .05. This analysis suggests that the majority of the ratios (the sum of the estimated coefficients minus the actual sum divided by the standard error of the sum) used in the calculation of the Kolmogorov-Smirnov statistics are positive. So it is unlikely that these sample ratios for experiment d are realizations from a N(0,1) population and hence the null hypothesis is rejected.
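The omitted-tail figures quoted above follow from the geometric series: with $\beta(i) = .8^i$, the coefficients from lag n onward sum to $.8^n/(1 - .8)$. A short check in Python (the function name is hypothetical):

```python
def omitted_tail(n_included, rate=0.8):
    """Sum of beta(i) = rate**i over all lags i >= n_included."""
    return rate ** n_included / (1.0 - rate)

# Part of the total sum (5.0) missed when only the first 11, 16, or 21
# population coefficients are included.
tails = [round(omitted_tail(n), 2) for n in (11, 16, 21)]
```

Rounded to two decimals, `tails` reproduces the .43, .14, and .05 given in the text.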
There is a discrepancy between the mean square error (MSE) of the estimated coefficients for all experiments and sample sizes and the average estimated variance of the coefficients, EVAR. This discrepancy has two components. First, one can only expect the approximate equality of MSE and EVAR when the coefficient bias is zero. The average bias of the estimated coefficients over all experiments is small, so the coefficient bias explains only a part of the MSE-EVAR disparity. Second, the average estimate of $\sigma^2$, the variance of the disturbance term, is biased downwards for experiments a, b, and c and upwards for experiment d, except for experiment d, T=200, where $\hat\sigma^2$ is essentially unbiased. In all cases the bias tends to zero as sample size increases. The direction of the $\hat\sigma^2$ bias is as expected since the average m* exceeds M for the finite distributed lag models a, b, and c, and m* is always too short for the infinite distributed lag model d. For those coefficients which are always included in the lag distribution selected by the CAT criterion, experiments a-c, EVAR is usually less than MSE. The reverse is true for experiment d except for the case T=200, when EVAR and MSE roughly coincide. These results reflect the direction of the $\hat\sigma^2$ bias. Last, for those coefficients which are frequently omitted from the lag distribution selected by the CAT criterion, all experiments, the disparity between EVAR and MSE is greater the smaller is NTIME.
We now turn our attention to the analysis of tables 2.3, 2.5, 2.7, and 2.9. As was stated earlier, the null hypothesis that the ratio of an estimated coefficient to its estimated standard error is distributed as a N(0,1) is incorrect for the first M+1 coefficients of each lag distribution and correct for all others. When the null hypothesis is true, one would expect to find a distribution of the F(i,T,k) across the columns of tables 2.3, 2.5, 2.7, and 2.9 in proportion to the probabilities at the head of each column. Since a different order lag distribution is selected each replication, and since those coefficients not included in the fitted model are assigned a cumulative probability of .5, we cannot expect the F(i,T,k) to be distributed across the columns of these tables in the manner described above, except for those coefficients just beyond the end of the true lag distribution which are frequently included in the fitted model, but whose population value is zero. Despite the drawbacks of this analysis, it is still possible to make general statements concerning the type I and type II error probabilities associated with the maintained hypothesis $\beta(i) = 0$, for all i, when the CAT criterion selects the order of the estimated distributed lag model.
A review of tables 2.3, 2.5, 2.7, and 2.9 indicates that for the first M+1 coefficients, the probability of a type II error (accepting the hypothesis $\beta(i) = 0$, i=0, ..., M-1 when it is false) decreases with sample size and increases as one moves closer to the end of the true lag distribution. To see this note that the number of replications in the last column of tables 2.5, 2.7, and 2.9 for the first M+1 coefficients increases (to a maximum of 100) as sample size increases, but the number of replications in the last column decreases as one moves closer to the end of the true lag distribution. This generality does not apply to the first experiment, lag distribution a (M=0), since in this case there are 100 replications in the first row and last column of table 2.3 for all sample sizes. For those estimated coefficients whose population value is zero, the probability of a type I error (rejection of the null hypothesis $\beta(i) = 0$, i=M+1, ... when it is true) tends to decline as sample size increases, and as one moves further away from the end of the true lag distribution. To see this note that the number of replications in column 7 of tables 2.3, 2.5, 2.7 and 2.9 for the M+1 through $(T^{.6})$-th coefficients increases with sample size, and is larger (with a maximum of 100) for those coefficients corresponding to the longest lags. In summation, it is only for those estimated coefficients in a band around the true lag length M that the coefficient hypothesis tests tend to be biased, and it is quite likely that the EVAR-MSE discrepancy is a principal source of this bias.
The results of this section constitute strong evidence for the use of the CAT criterion to estimate the order and the coefficients of the distributed lag model 1.3. Using moderate size samples we have found corroborative evidence for the limiting probabilities that m* = M+i, i=0, 1, ..., M derived in section two. Despite the similarity of the CAT criterion to regression strategies (18, pp. 603-606), coefficient t-statistics (except for those coefficients in a band around the true lag length M) are not biased by the use of the CAT selection procedure. But until we derive the large sample properties of Parzen's criterion for the case of infinite lag distributions, the applicability of the CAT criterion is limited to circumstances where the researcher has a priori knowledge that the lag distribution is finite.
Foctnotes
1366 Kmenta (10); pages 282-294 or Johnston (9), pages 259- 265 for a discussion of these procedures.
* Sims (17) provides an excellent survey of the various estimators of the distributed lag model. He shows (pp. 305-308, 326-329) that there exists a sequence of m's converging to inifinity with T, m/T>0, so that ordinary least squares, feasible generalized least squares, and Hannan inefficient (7) estimators of the DLM all have the same asymptotic distribution.
3 In spectral theory, see Hannan (8, pp. 273-288) for a discussion of the relationship between expanding parameterizations and sample size. Amemiya (3) does not provide guidelines for choosing the order of the residual autoregression as a function of the sample size, but the ideas presented in Hannan are still applicable.
4 Theoretically, the CAT criterion selects the order of an approximating autoregressive process which minimizes the one step ahead mean square prediction error. In (12) a set of AR models is estimated using monthly economic data (1960-1974) to see if the CAT criterion selects the appropriate order of an AR process. These experiments are inconclusive, but they suggest that the CAT decision rule does not always select AR models that minimize mean square prediction error.
5 See Hannan (8), pages 204-220, especially theorem 6.
6 The matrix result $A > B > 0 \Rightarrow B^{-1} > A^{-1}$ is well known; see Goldberger (6, p. 38).
7 Anderson shows that $\sqrt{T}(\hat{\beta}_m - \beta_m)$ has a limiting normal distribution with zero mean vector and covariance matrix $\sigma^2 Q(m)^{-1}$ given assumptions that are satisfied by our covariance stationary x process. His theorem 2.6.1 (4, pp. 23-24) implies that $\operatorname{plim}_{T\to\infty} T^{-1} X_m' u = 0$ for m fixed.
8 See Theil (18, p. 380). When m < M, $\operatorname{plim}_{T\to\infty} \hat{\sigma}^2$ is strictly greater than $\sigma^2$ because $\beta(M) \neq 0$ and
7 Cay, _ (0 0 met 0 | Qin) -9* (m,n) "Q* a) Tic)
-
and the lower right hand block of the limit matrix has full rank M-n.
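9 Lemma 1: Let s(t) be a sequence of random variables, $s(t) \ge 0$ for all t, $E(s(t)) < \infty$ for all t. If $\lim_{t\to\infty} E(s(t)) = 0$, then $\operatorname{plim}_{t\to\infty} s(t) = 0$.
Proof: Let $\delta > 0$, and let f(s(t)) denote the probability density function (pdf) of s(t). Then
$$E(s(t)) = \int_0^{\delta} s(t) f(s(t))\,ds(t) + \int_{\delta}^{\infty} s(t) f(s(t))\,ds(t),$$
$$E(s(t)) \ge \delta \int_{\delta}^{\infty} f(s(t))\,ds(t),$$
$$\frac{1}{\delta} E(s(t)) \ge \int_{\delta}^{\infty} f(s(t))\,ds(t) = \operatorname{Prob}(s(t) > \delta).$$
Consequently
$$\frac{1}{\delta} \lim_{t\to\infty} E(s(t)) \ge \lim_{t\to\infty} \operatorname{Prob}(s(t) > \delta) = \lim_{t\to\infty} \operatorname{Prob}(|s(t) - 0| > \delta),$$
thus $0 \ge \lim_{t\to\infty} \operatorname{Prob}(|s(t) - 0| > \delta)$, that is: $\operatorname{plim}_{t\to\infty} s(t) = 0$.
10 If a function $m^{\circ}(T)$ had been chosen so that $m^{\circ}(T) \to \infty$ as $T \to \infty$ while $\lim_{T\to\infty} m^{\circ}(T)/T > 0$, for example $m^{\circ}(T) = b \cdot T$, $0 < b < 1$, then the arguments presented on pages 16-17 cannot be used to show $\operatorname{plim}_{T\to\infty} \mathrm{CAT}(m^{\circ}(T)) = -\sigma^{-2}$. This is so because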
lim E sm =ob,
Too
and the conditions for lemma 1 footnote 9 are no longer satisfied.
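The inequality that drives Lemma 1 (footnote 9) is the bound Prob(s(t) > δ) ≤ E(s(t))/δ. The following Python sketch checks that bound by simulation for a hypothetical nonnegative sequence s(t), taken here to be exponential with mean 1/t. The distribution, the value of δ, the seed, and the function name are illustrative assumptions and are not part of the paper.

```python
import random

def tail_probability(t, delta, draws=20000, seed=0):
    """Monte Carlo estimate of Prob(s(t) > delta) for s(t) ~ Exponential(mean 1/t)."""
    rng = random.Random(seed)
    # random.expovariate takes the rate (1 / mean), so rate = t gives mean 1/t.
    hits = sum(1 for _ in range(draws) if rng.expovariate(t) > delta)
    return hits / draws

delta = 0.5  # hypothetical threshold
for t in (1, 10, 100):
    mean = 1.0 / t                    # E(s(t)), which tends to zero as t grows
    bound = min(1.0, mean / delta)    # bound E(s(t)) / delta, capped at one
    print(t, round(tail_probability(t, delta), 4), round(bound, 4))
```

As t grows the bound shrinks to zero and the estimated tail probability shrinks with it, which is the content of the lemma for this particular sequence.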
11 I am indebted to John Geweke, my thesis advisor, for his help in deriving the results on pages 19-25.
12 For n events $A_1, \ldots, A_n$ one uses the definition of conditional probability and an inductive argument to show
$$P\Big(\bigcap_{i=1}^{n} A_i\Big) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 \cap A_2) \cdots P\Big(A_n \,\Big|\, \bigcap_{i=1}^{n-1} A_i\Big).$$
There are n! such formulae.
13 The mechanics on pages 20-22 can also be used to analyze the large sample properties of the residual variance model fitting criterion, Theil (18, pp. 543-5): the expression $T(\hat{\sigma}^2_{M+k} - \hat{\sigma}^2_{M})$ converges in distribution to $\sigma^2(k - \chi^2(k))$. Then
$$\lim_{T\to\infty} \operatorname{Prob}(m^* = M) = \prod_{i=1}^{\infty} R_i, \qquad R_i = \operatorname{Prob}(u_{i-1} + v < i \mid u_{i-1} < (i-1)), \quad i = 1, \ldots,$$
where $u_{i-1}$ and v are defined as they were in the text. In this case, the limiting probability that m* = M is not greater than zero because the $R_i$ converge too slowly to one. To see this note that the expectation of $\sigma^2(j - \chi^2(j))$ is zero and the variance of $\sigma^2(j - \chi^2(j))$ is $2j\sigma^4$. The convergence of $R_i$ to one as $i \to \infty$ is too slow for the product of the $R_i$ to be bounded away from zero. The probability that $\sigma^2(j - \chi^2(j))$ is greater than zero approaches .5 as j gets large. For the CAT criterion, $E\,\sigma^2(2j - \chi^2(j)) = j\sigma^2$, an increasing function of j. The probability that $\sigma^2(2j - \chi^2(j))$ is greater than zero goes to one as $j \to \infty$. Thus the convergence of the $R_i$ to one is faster for the CAT criterion, and $\lim_{T\to\infty} \operatorname{Prob}(m^* = M) > 0$.
14 The statement in the text must be interpreted with care. If x(t) is a white noise process, there is no gain in asymptotic efficiency over HI for the first M+1 coefficients since
$\operatorname{plim}_{T\to\infty} T^{-1} X_m' X_m$, any fixed m > M, is a diagonal matrix. If x(t) follows an ARMA(p,q) process, then there is a gain in asymptotic efficiency over HI for some of the first M+1 coefficients, and it is the form of $\operatorname{plim}_{T\to\infty} T^{-1} X_m' X_m$, any fixed m > M, which determines the coefficients with smaller asymptotic variance.
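The contrast drawn in footnote 13 rests on two probabilities: the chance that j − χ²(j) is positive, which settles near one half, and the chance that 2j − χ²(j) is positive, which rises toward one. The Python sketch below estimates both by simulation. The function name, the number of draws, the seed, and the chosen values of j are illustrative assumptions; σ² is dropped because it does not affect the sign.

```python
import random

def prob_positive(j, penalty, draws=20000, seed=0):
    """Estimate Prob(penalty * j - chi2(j) > 0) by simulation."""
    rng = random.Random(seed)
    # A chi-square variate with j degrees of freedom is Gamma(shape=j/2, scale=2).
    hits = sum(
        1 for _ in range(draws)
        if penalty * j - rng.gammavariate(j / 2.0, 2.0) > 0
    )
    return hits / draws

for j in (1, 5, 25, 100):
    # penalty=1 mimics the residual variance criterion, penalty=2 the CAT criterion.
    print(j, prob_positive(j, penalty=1), prob_positive(j, penalty=2))
```

With a penalty of one the estimates stay close to one half as j grows, while with a penalty of two they move quickly toward one, in line with the argument of the footnote.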
15 We must first establish that $\operatorname{plim}_{T\to\infty} T^{-1} X_M' P_{m(T)} X_M$ exists, and then that it is positive definite. Suppose we look at a sequence of non-zero quadratic forms in the matrices 1.27 of the text, page 14. The limit of this sequence exists since the quadratic forms are monotone decreasing and bounded below.
Proposition F1: As $T \to \infty$, the limit of the sequence of non-zero quadratic forms in the matrices of 1.27 is greater than zero, i.e., $\operatorname{plim}_{T\to\infty} T^{-1} X_M' P_{m(T)} X_M$ is positive definite.
Proof: Suppose the x process has moving average representation
$$x(t) = \sum_{s=0}^{\infty} b(s)\,e(t-s), \qquad b(0) = 1, \qquad \sum_{s=0}^{\infty} b(s)^2 < \infty.$$
The matrix $C = \operatorname{plim}_{T\to\infty} T^{-1} X_M' P_{m(T)} X_M$ can be interpreted as the matrix of 1 to (M+1) step ahead prediction error variances and covariances from a projection of x(t) on its infinite past history. Let $\Omega(t)$ be the set of observed innovations at time t, $\Omega(t) = \{\ldots, e(t-1), e(t)\}$. The ith diagonal element of C, C(i,i), is equal to
$$E\big(x(t+M+2-i)^2 \mid \Omega(t)\big) = E\Big[\Big(\sum_{s=0}^{M+1-i} b(s)\,e(t+M+2-i-s)\Big)^2 \,\Big|\, \Omega(t)\Big] = \sum_{s=0}^{M+1-i} b(s)^2\, E\big[e(t+M+2-i-s)^2 \mid \Omega(t)\big] = \sigma^2 \sum_{s=0}^{M+1-i} b(s)^2,$$
for i = 1, ..., M+1. C(i,i) is the (M+2-i) step ahead prediction error variance of x(t). The covariances C(i,j) = min{C(i,i), C(j,j)}. Let s be a (M+1) x 1 vector, s ≠ 0. Suppose s'Cs = 0. This implies that there exists some j, 0 < j ≤ M+1, such that x(t+j) is perfectly predictable (with probability one) for all t. This is impossible since x(t) is nondeterministic. Therefore, C is positive definite. The conclusion in the text is appropriate since the inverse of C is a continuous function of the elements of C.
16 The standard normal variates are termed pseudo-random since they are generated by the method of Box and Muller (5) on a Univac 1110 digital computer at the University of Wisconsin, Madison.
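For readers unfamiliar with the method of Box and Muller cited in footnote 16, the transform can be sketched in a few lines of Python. This is a minimal illustration only: the uniform generator (Python's built-in one), the seed, and the function name are assumptions, and nothing here reproduces the Univac 1110 routine actually used for the experiments.

```python
import math
import random

def box_muller(n, seed=0):
    """Return n pseudo-random N(0,1) variates via the Box-Muller transform."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        u1 = 1.0 - rng.random()   # shift to (0, 1] so that log(u1) is finite
        u2 = rng.random()
        r = math.sqrt(-2.0 * math.log(u1))
        theta = 2.0 * math.pi * u2
        # Each pair of uniforms yields two independent standard normal variates.
        out.append(r * math.cos(theta))
        out.append(r * math.sin(theta))
    return out[:n]

z = box_muller(10000)
mean = sum(z) / len(z)
var = sum((v - mean) ** 2 for v in z) / (len(z) - 1)
print(round(mean, 3), round(var, 3))
```

The printed sample mean and variance should lie close to zero and one respectively, which is a quick sanity check on the generated variates.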
17 Most quarterly economic time series are well represented by stochastic second order difference equations; see Sargent (16, chapter XI).
18 An infinite geometric lag distribution is not appropriate for model 1.3; we include this parameterization in our set of experiments to gain insight into the behavior of the CAT criterion when applied to an infinite lag distribution.
19 Despite the fact that the disturbance term of 1.3 is distributed as N(0,1), we have not shown that the estimated coefficient vector selected by the CAT criterion is normally distributed. In empirical work it is convenient to assume that the order of the fitted model is fixed, and to proceed with conventional hypothesis tests as if they were appropriate. The results of the Monte Carlo experiments reported in the text indicate that this simplification does not result in test statistics which are grossly distorted.
20 We chose to analyze the null hypothesis that the sample ratios of the estimated coefficients to their estimated standard errors are distributed as N(0,1), although it is incorrect for the first M+1 coefficients, since the magnitude of the t ratio is more often than not the decision criterion used by empirical researchers in determining what variables to keep in their models.
Bibliography
1) Aitken, A.C. (1935), "On Least Squares and Linear Combinations of Observations," Proceedings of the Royal Society of Edinburgh, 55, pp. 42-48.
2) Akaike, H. (1974), "A New Look at the Statistical Model Identification," IEEE Trans. Auto. Control, Vol. AC-19, pp. 716-723.
3) Amemiya, T. (1973), "Generalized Least Squares with an Estimated Autocovariance Matrix," Econometrica, 41, No. 4, pp. 723-732.
4) Anderson, T.W. (1971), The Statistical Analysis of Time Series, New York: John Wiley and Sons.
5) Box, G.E.P., and Muller, M.E. (1958), "A Note on the Generation of Random Normal Deviates," Ann. Math. Statistics, 28, pp. 610-611.
6) Goldberger, A.S. (1964), Econometric Theory, New York: John Wiley and Sons.
7) Hannan, E.J. (1963), "Regression for Time Series," in Proceedings of a Symposium on Time Series Analysis, M. Rosenblatt (ed.), New York: John Wiley and Sons.
8) Hannan, E.J. (1970), Multiple Time Series, New York: John Wiley and Sons.
9) Johnston, J. (1963), Econometric Methods, New York: McGraw-Hill Book Company Inc.
10) Kmenta, J. (1971), Elements of Econometrics, New York: Macmillan Publishing Co., Inc.
11) Lindgren, B.W. (1968), Statistical Theory, London: Macmillan.
12) Meese, R. (1978), "Distributed Lag Order Determination With an Application to the Multiperiod Theory of the Firm," unpublished Ph.D. thesis, University of Wisconsin, Madison.
13) Parzen, E. (1974), "Some Recent Advances in Time Series Analysis," IEEE Trans. Auto. Control, Vol. AC-19, pp. 723-730.
14) Parzen, E. (1975), "Multiple Time Series: Determining the Order of Approximating Autoregressive Schemes," Technical Report 23, State University of New York (SUNY) at Buffalo.
15) Parzen, E. (1976), "An Approach to Time Series Modeling and Forecasting Illustrated by Hourly Electricity Demands," Technical Report No. 37, SUNY at Buffalo.
16) Sargent, T. (1977), Notes on Macroeconomic Theory, University of Minnesota.
17) Sims, C. (1974), "Distributed Lags," in M. Intriligator and D. Kendrick (eds.), Frontiers of Quantitative Economics, Volume II, Amsterdam: North Holland.
18) Theil, H. (1971), Principles of Econometrics, New York: John Wiley and Sons.