ifdp · September 30, 1991

Parameter Constancy, Mean Square Forecast Errors, and Measuring Forecast Performance: An Exposition, Extensions, and Illustration

Abstract

Parameter constancy and a model's mean square forecast error are two commonly used measures of forecast performance. By explicit consideration of the information sets involved, this paper clarifies the roles that each plays in analyzing a model's forecast accuracy. Both criteria are necessary for "good" forecast performance, but neither (nor both) is sufficient. Further, these criteria fit into a general taxonomy of model evaluation statistics, and the information set corresponding to a model's mean square forecast error leads to a new test statistic, forecast-model encompassing. Two models of U.K. money demand illustrate the various measures of forecast accuracy.

Board of Governors of the Federal Reserve System International Finance Discussion Papers Number 412 October 1991


Neil R. Ericsson

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors.


Key words and phrases: forecasting, diagnostic testing, econometrics, encompassing, evaluation criteria, information sets, mean square error, parameter constancy, predictive failure, statistical inference.


1. Introduction

Parameter constancy and the mean square forecast error (MSFE) are two commonly used measures of the forecast performance of empirical macro-models. Parameter constancy has long been viewed as a desirable economic and statistical property, and it is closely linked to the issue of predictive failure; cf. Chow (1960) and Hendry (1979). Further, parameter constancy can imply super exogeneity, which is necessary to sustain counter-factual policy simulations of an econometric model; cf. Hendry (1988). Lack of parameter constancy can induce apparent unit roots, posing potential difficulties when testing for cointegration; cf. Hendry and Neale (1991). The MSFE is a common criterion for evaluating the performance of alternative macro-models; cf. Fair (1986) for a general discussion and Meese and Rogoff (1983) for a classic example with models of the exchange rate.

Sometimes, the literature has viewed these two forecast criteria as competing rather than complementary. Thus, this paper aims to clarify the roles of parameter constancy and the MSFE in evaluating the forecast accuracy of a model.

Section 2 works through some simple examples to show that (i) parameter constancy is neither necessary nor sufficient for minimizing MSFE across a given set of models, and (ii) both criteria together are necessary but not sufficient to obtain the best forecasting model, even on only the data available from the given set of models. Section 3 explains why, showing that parameter constancy and minimizing MSFE are criteria that evaluate a given model (respectively) against that model's own data and against other models' data. Both are reasonable criteria, but other criteria are also important for determining the forecast adequacy of an empirical model. Section 4 introduces a new model evaluation criterion, "forecast-model" encompassing, and the corresponding test statistic. Further, Section 4 shows that minimum MSFE, forecast encompassing, and forecast-model encompassing parallel variance dominance, variance encompassing, and parameter encompassing respectively. Section 5 discusses several implications for forecasting integrated and cointegrated variables. Section 6 comments briefly on the role of time-varying coefficient models in forecasting. Section 7 illustrates the various forecast-based criteria with an application to two models of narrow money demand in the United Kingdom.

Forthcoming in a special issue of the Journal of Policy Modeling entitled Cointegration, Exogeneity, and Policy Analysis. The author is a staff economist in the Division of International Finance, Federal Reserve Board. The views expressed in this paper are solely the responsibility of the author and should not be interpreted as reflecting those of the Board of Governors of the Federal Reserve System or other members of its staff. I am grateful to Julia Campos, Frank Diebold, Hali Edison, David Hendry, David Howard, Ross Levine, Jaime Marquez, Garry Schinasi, and P.A.V.B. Swamy for useful comments and discussions, and to Hong-Anh Tran for invaluable research assistance. I am particularly indebted to Edison, Schinasi, and Swamy, whose empirical papers on modeling and forecasting exchange rates motivated this paper; cf. Edison (1985), Edison and Klovland (1987), Edison (1991), Schinasi and Swamy (1989), and Swamy and Schinasi (1989). All numerical results were obtained using PC-GIVE Version 6.01; cf. Hendry (1989).

Before turning to the heart of the paper, three remarks may be helpful. First, the results in this paper are quite general. However, to illustrate the central concepts, simple, static, Gaussian models are used as examples throughout. See Hendry and Richard (1982) for a framework in which more general results may be obtained.

Second, in order to abstract from sampling issues, results are often presented as “asymptotic”. This in no way invalidates the results, but simply permits a clearer exposition.

Third, the concept of an "adequate forecasting model" is intentionally left vague. Roughly, such a model efficiently uses the information available for creating forecasts. It is defined in part by its negation. For instance, a model is inadequate for forecasting if its forecast errors are predictable, a situation including both parameter nonconstancy and lack of forecast encompassing (as will be seen below). For Gaussian processes, minimum MSFE is a condition for forecast adequacy, but Section 2 below shows that it is not sufficient because the corresponding errors may still be predictable.

2. Parameter Constancy and Minimizing MSFE

This section shows that: (i) parameter constancy is neither necessary nor sufficient for minimizing MSFE across a given set of models, and (ii) both criteria together are necessary (but not sufficient) to obtain the best forecasting model on the data available. Four "propositions" establish (i) and (ii), which are illustrated by some simple examples. The analytical form of the MSFE and of Chow's (1960) "prediction interval" statistic clarifies the absence of relationship between minimizing MSFE over a set of models and obtaining parameter constancy for an individual model.

To show the lack of connection between parameter constancy and minimizing MSFE, consider the following simple process in which the dependent variable $y_t$ is linearly dependent upon three regressors, each of which is normally and independently distributed.

Models and the data generation process. Suppose that the data $(y_t;\ t=1,\dots,T+n)$ are generated by:

$$y_t = \beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t} + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{NID}(0,\sigma^2), \tag{1a}$$

and the $x_{it}$'s are normally and independently distributed (NID):

$$x_{it} \sim \mathrm{NID}(0,\omega_{ii}), \qquad i = 1, 2, 3, \tag{1b}$$

where $\omega_{ii}$, the variance of $x_{it}$, may change over time. The double index on $\omega_{ii}$ denotes that the variance is the $i$th diagonal element from the (diagonal) contemporaneous covariance matrix ($\Omega$) of the $\{x_{it}\}$. Equations (1a)-(1b) are referred to as the data generation process (DGP). To exclude trivial cases, $\sigma^2$ and all $\omega_{ii}$'s are positive and all $\beta_i$'s are nonzero.

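The econometrician does not know the DGP, and estimates the following (mis-specified) models by OLS over a subsample $[1,T]$ and evaluates their forecast performance over $n$ periods $[T+1,T+n]$.

$$M_1:\quad y_t = \alpha_1 x_{1t} + u_{1t}, \qquad u_{1t} \sim \mathrm{NID}(0,\sigma_1^2) \tag{2a}$$

$$M_2:\quad y_t = \gamma_1 x_{1t} + \gamma_2 x_{2t} + u_{2t}, \qquad u_{2t} \sim \mathrm{NID}(0,\sigma_2^2) \tag{2b}$$

$$M_3:\quad y_t = \delta_1 x_{1t} + \delta_3 x_{3t} + u_{3t}, \qquad u_{3t} \sim \mathrm{NID}(0,\sigma_3^2) \tag{2c}$$

The sets of coefficients $\{\alpha_1\}$, $\{\gamma_1,\gamma_2\}$, and $\{\delta_1,\delta_3\}$ are used to distinguish the models' coefficients from the underlying coefficients of the DGP, i.e., $\{\beta_1,\beta_2,\beta_3\}$. For convenience, $\hat y_{ij}$ denotes the prediction of $y$ in period $j$, using the parameter estimates from model $M_i$ estimated over $[1,T]$. For example, $\hat y_{2j}$ is:

$$\hat y_{2j} = \hat\gamma_1 x_{1j} + \hat\gamma_2 x_{2j}, \qquad j = T+1,\dots,T+n, \tag{3}$$

where $\hat\gamma_1$ and $\hat\gamma_2$ are the coefficients $\gamma_1$ and $\gamma_2$ estimated over $[1,T]$.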

The MSFE for the $i$th model is:

$$\mathrm{MSFE}_i = \mathcal{E}\Big[\sum_{j=T+1}^{T+n} (y_j-\hat y_{ij})^2 / n\Big], \qquad i = 1, 2, 3, \tag{4a}$$

where the expectation $\mathcal{E}[\cdot]$ is over $\{\varepsilon_t\}$. For the models discussed in this paper, each term in the summation in (4a) has the same expectation, so $\mathrm{MSFE}_i = \mathcal{E}[(y_j-\hat y_{ij})^2]$, independent of $j$, for $j=T+1,\dots,T+n$. In practice, the MSFE is estimated by the sample average of the squared forecast errors:

$$\widehat{\mathrm{MSFE}}_i = \sum_{j=T+1}^{T+n} (y_j-\hat y_{ij})^2 / n \tag{4b}$$

for the $i$th model. Most of the discussion in this paper is in terms of the underlying population moments, i.e., the MSFE, thereby abstracting from the additional complication of the sampling distribution of (4b).²

Parameter constancy may be evaluated by any of a number of statistics, with Chow's (1960, pp. 594-595) "prediction interval" statistic being one of the more common.³ The Chow statistic can be written as:

$$\begin{aligned}
\mathrm{CHOW}_i(n,T-k_i) &= \big[\text{quadratic form in the forecast-error vector } Y_{T+1}^{T+n} - \hat Y_{i,T+1}^{T+n}\text{; expression not recoverable from the source}\big] \\
&= \sum_{j=T+1}^{T+n} \frac{(y_j-\hat y_{ij})^2}{n\,\hat\sigma_i^2} + O_p(T^{-1}) \\
&= \widehat{\mathrm{MSFE}}_i / \hat\sigma_i^2 + O_p(T^{-1}),
\end{aligned} \tag{5}$$

where $Y_{T+1}^{T+n} = (y_{T+1}\ \cdots\ y_{T+n})'$; $\hat Y_{i,T+1}^{T+n}$ is the forecast of $Y_{T+1}^{T+n}$ by model $M_i$, $\hat\sigma_i^2$ is the estimated equation error variance for model $M_i$ over $[1,T]$, $k_i$ is the number of regressors in model $M_i$, and $O_p(T^{-1})$ denotes a term of order $T^{-1}$ in probability. As (5) clarifies, the Chow statistic in effect tests whether each of the forecast errors of a given model has zero mean, i.e., $\mathcal{E}(y_j-\hat y_{ij})=0$ for $j=T+1,\dots,T+n$. It does so by comparing the mean square forecast error against the estimated error variance over the estimation subsample. Under the null hypothesis, and with fixed regressors and normal disturbances, the Chow statistic is distributed as $F(n,T-k_i)$. "Significant" Chow statistics are often referred to as "predictive failure"; cf. Hendry (1979).⁴

² Note that the estimation and forecast periods do not overlap. By contrast, (e.g.) dynamic simulation uses overlapping (usually identical) estimation and "forecast" periods. Mean square forecast errors from such simulations may have quite different properties from those discussed herein. See Hendry and Richard (1982), Chong and Hendry (1986), and Pagan (1989) on the role of dynamic simulation in model comparison.

³ Chow (1960) also discusses a parameter constancy test statistic based on the analysis of covariance, in which estimates of the coefficients over the two subsamples are compared for equality; see Fisher (1922) for its original development. This statistic is distributed as $F(k_i,T+n-2k_i)$ under the null hypothesis, with classical assumptions about the regressors and disturbances. This covariance test statistic is sometimes (and confusingly) referred to as the "Chow statistic" although Chow (1960, p. 592) was well aware of its presence in the literature. In the current paper, the phrase "Chow statistic" refers exclusively to Chow's prediction interval statistic (5). Wilson (1978) discusses conditions under which each of the Chow (prediction interval) statistic and the covariance statistic is uniformly most powerful. Fisher (1970) and Dufour (1980) present intuitive derivations of the two statistics.

To simplify the analysis even further, suppose that T is large enough so the uncertainty in estimating the model parameters can be ignored when considering the characteristics of the MSFE and the Chow statistic. This assumption and virtually all assumptions in (1)-(2) are for expositional purposes only, and most of the results below obtain under more general conditions (e.g., nonlinearity; autocorrelated, multicollinear, endogenous regressors; more or fewer regressors relative to those in the models here; non-normality of the errors and/or regressors).
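The large-$T$ approximation in (5) can be sketched in a few lines. This is a minimal illustration, not the exact finite-sample Chow statistic: it computes the estimated MSFE of (4b) and divides it by an in-sample error variance that is assumed to be supplied. The function names and the numerical inputs are hypothetical.

```python
def estimated_msfe(actual, forecast):
    """Sample average of squared forecast errors, as in (4b)."""
    errors = [y - f for y, f in zip(actual, forecast)]
    return sum(e * e for e in errors) / len(errors)


def chow_ratio(actual, forecast, sigma2_hat):
    """Leading term of (5): estimated MSFE over the in-sample error variance.

    The O_p(1/T) correction for coefficient uncertainty is ignored, so this
    is only the large-T approximation, not Chow's exact F statistic.
    """
    return estimated_msfe(actual, forecast) / sigma2_hat


# Hypothetical numbers, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 1.5, 3.5, 3.5]
ratio = chow_ratio(actual, forecast, sigma2_hat=0.25)
```

A ratio near one says the forecast errors are about as variable as the in-sample residuals; a ratio well above one is the pattern described above as predictive failure.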

Examples 1 and 2 consider two situations, one in which all population data moments are constant and the other in which some of them change over time.

Example 1: constant population data moments. This is equivalent to having ($\omega_{ii}$, $\beta_i$; $i=1,2,3$) and $\sigma^2$ constant in (1).

All models (i.e., $M_1$, $M_2$, $M_3$) will have constant parameters because the corresponding (OLS) estimators are functions of the sample data moments, with the sample data moments being constant in expectation (by assumption). For the DGP in (1), OLS for each model in (2) is unbiased for the relevant subset of $\{\beta_1,\beta_2,\beta_3\}$, and is so only because the regressors are uncorrelated with each other and are static. That property does not generalize. However, even with (e.g.) correlated regressors, constant population data moments are sufficient for parameter constancy.

⁴ The first term on the last line of (5) is $1/n$ times Hendry's (1979) $\chi^2$ statistic for testing the numerical accuracy of the forecasts. Chow's statistic tests their statistical accuracy by accounting for the uncertainty arising from estimating (rather than knowing) the regression coefficients. This affects only the finite sample distribution of the statistic: Hendry's and Chow's statistics are asymptotically equivalent. Because coefficient uncertainty is ignored for the most part in this paper, the equivalency proves useful, given the simpler form of Hendry's statistic.

From (1) and (2), it follows directly that the mean square forecast errors for $M_1$, $M_2$, and $M_3$ are:

$$\mathrm{MSFE}_1 = \sigma^2 + \beta_2^2\omega_{22} + \beta_3^2\omega_{33} \tag{6a}$$

$$\mathrm{MSFE}_2 = \sigma^2 + \beta_3^2\omega_{33} \tag{6b}$$

$$\mathrm{MSFE}_3 = \sigma^2 + \beta_2^2\omega_{22}. \tag{6c}$$

Clearly, $M_1$ has the largest MSFE; the ranking of $M_2$ and $M_3$ depends upon the relative magnitudes of $\beta_2^2\omega_{22}$ and $\beta_3^2\omega_{33}$. This indeterminacy leads to the first proposition.

Proposition 1. If a model has (empirically) constant parameters, it can have either a smaller or a larger MSFE than some other model.

That is, parameter constancy is not sufficient for obtaining the smallest MSFE among a set of models.
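The indeterminacy behind Proposition 1 can be checked numerically. The sketch below evaluates the population MSFE of a model that omits some of the three independent regressors, under the paper's large-$T$ assumption; each omitted regressor $i$ contributes $\beta_i^2\omega_{ii}$. The function name and all parameter values are hypothetical, chosen only so that the ranking of $M_2$ and $M_3$ flips between the two cases.

```python
def population_msfe(beta, omega, sigma2, included):
    """Population MSFE of a model that uses only the regressors in `included`.

    With independent zero-mean regressors and a large estimation sample, each
    omitted regressor i adds beta[i]**2 * omega[i] to the error variance.
    """
    omitted = [i for i in range(len(beta)) if i not in included]
    return sigma2 + sum(beta[i] ** 2 * omega[i] for i in omitted)


# Hypothetical parameter values, chosen only for illustration.
beta = [1.0, 0.5, 2.0]        # beta_1, beta_2, beta_3
sigma2 = 1.0
models = {"M1": {0}, "M2": {0, 1}, "M3": {0, 2}}   # indices of included regressors

# Case A: x2 is the more variable regressor, so M2 (which includes it) ranks first.
omega_a = [1.0, 4.0, 0.1]     # omega_11, omega_22, omega_33
msfe_a = {m: population_msfe(beta, omega_a, sigma2, cols) for m, cols in models.items()}

# Case B: x3 is the more variable regressor, so M3 ranks first instead.
omega_b = [1.0, 0.1, 4.0]
msfe_b = {m: population_msfe(beta, omega_b, sigma2, cols) for m, cols in models.items()}
```

In both cases every model has constant parameters, yet which of $M_2$ and $M_3$ has the smaller MSFE depends entirely on the regressor variances.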

Nonconstant population data moments help demonstrate the lack of necessity.

Example 2: nonconstant population data moments. Suppose that the variance of $x_{2t}$ increases from $\omega_{22}$ to $\omega_{22}^*$ at time $T+1$ and remains at $\omega_{22}^*$ thereafter.

For models $M_1$ and $M_3$, the increase from $\omega_{22}$ to $\omega_{22}^*$ implies a forecast error variance larger than the estimation subsample error variance, so the Chow statistic will indicate parameter nonconstancy. If regressors are correlated, either or both models may have coefficient nonconstancy, apparent (e.g.) through graphs of the recursively estimated coefficients.

The mean square forecast errors for the models are:

$$\mathrm{MSFE}_1 = \sigma^2 + \beta_2^2\omega_{22}^* + \beta_3^2\omega_{33} \tag{7a}$$

$$\mathrm{MSFE}_2 = \sigma^2 + \beta_3^2\omega_{33} \tag{7b}$$

$$\mathrm{MSFE}_3 = \sigma^2 + \beta_2^2\omega_{22}^*. \tag{7c}$$

Again, $M_1$ has the largest MSFE, but the ranking of those for $M_2$ and $M_3$ could be the same as (or different from) the ranking of (6), depending upon $\omega_{22}^*$. Further, whether or not a model exhibits parameter nonconstancy has little to do with its ranking by MSFE. For instance, $M_3$ can have a smaller or larger MSFE than $M_2$, depending upon the values of $\beta_2^2\omega_{22}^*$ and $\beta_3^2\omega_{33}$, but $M_3$ exhibits parameter nonconstancy whereas $M_2$ does not. This indeterminacy implies another proposition.

Proposition 2. If a model has (empirically) nonconstant parameters, it can have either a smaller or larger MSFE than some other model.

That is, parameter constancy is not necessary for obtaining the smallest MSFE among a set of models.
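A small simulation can illustrate Example 2 and Proposition 2 together. The sketch below is a rough illustration under stated assumptions: model coefficients are set to their population values (the large-$T$ simplification used in the text), the variance of the second regressor rises from 1 to 4 in the forecast period, and $\beta_3$ is deliberately large. All numerical values, the seed, and the function names are hypothetical.

```python
import random


def simulate(n, omega, beta, sigma, rng):
    """Draw n observations (x, y) from the DGP (1a)-(1b)."""
    rows = []
    for _ in range(n):
        x = [rng.gauss(0.0, w ** 0.5) for w in omega]
        y = sum(b * xi for b, xi in zip(beta, x)) + rng.gauss(0.0, sigma)
        rows.append((x, y))
    return rows


def mean_sq_error(rows, coefs):
    """Mean squared error of the linear predictor with the given coefficients."""
    return sum((y - sum(c * xi for c, xi in zip(coefs, x))) ** 2
               for x, y in rows) / len(rows)


rng = random.Random(12345)
beta, sigma = [1.0, 1.0, 3.0], 1.0                            # hypothetical values
estimation = simulate(20000, [1.0, 1.0, 1.0], beta, sigma, rng)
forecast = simulate(20000, [1.0, 4.0, 1.0], beta, sigma, rng)  # omega_22 rises to 4

# Population coefficients of each model; a zero marks an omitted regressor.
m2 = [1.0, 1.0, 0.0]   # M2 omits x3
m3 = [1.0, 0.0, 3.0]   # M3 omits x2

msfe_m2 = mean_sq_error(forecast, m2)
msfe_m3 = mean_sq_error(forecast, m3)
ratio_m2 = msfe_m2 / mean_sq_error(estimation, m2)   # near 1: M2 looks constant
ratio_m3 = msfe_m3 / mean_sq_error(estimation, m3)   # well above 1: predictive failure
```

With these values the nonconstant model $M_3$ still has the smaller MSFE, since $\beta_2^2\omega_{22}^* = 4$ is less than $\beta_3^2\omega_{33} = 9$; choosing a smaller $\beta_3$ reverses the ranking without changing which model is constant.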

Whether the parameters of the "other" model are constant or not makes no difference to either Proposition 1 or Proposition 2, and this provides a different view on the lack of necessity and sufficiency.

Proposition 3. For both Propositions 1 and 2, the constancy or otherwise of the "other" model is immaterial.

For instance, consider a fourth model:

$$M_4:\quad y_t = \theta_2 x_{2t} + u_{4t}, \qquad u_{4t} \sim \mathrm{NID}(0,\sigma_4^2), \tag{8a}$$

which has MSFE:

$$\mathrm{MSFE}_4 = \sigma^2 + \beta_1^2\omega_{11} + \beta_3^2\omega_{33}. \tag{8b}$$

Model $M_4$ has constant parameters, but its MSFE may be smaller or larger than that for $M_1$ (which has nonconstant parameters), depending upon the relative variances of the regressors. Hence, parameter constancy is neither necessary nor sufficient for minimizing MSFE across a given set of models. Table 1 summarizes the properties of these models; Figure 1 provides a schematic of their relationships in terms of MSFE.

The ranking of models by MSFE can change across subsamples as well. For instance, the ranking of the models above depends upon the variances of the regressors, and those need not remain constant over subsamples. Unless a model is well-specified in a very general sense (i.e., “congruent” with the evidence; cf. Hendry (1987), Campos and Ericsson (1988), and White (1989)), there is no guarantee whatsoever that an observed ranking in mean square forecast error will obtain over different sample periods.

Finally, consider the relationship between the properties of parameter constancy and minimizing MSFE, and an adequate forecasting model.

Proposition 4. Individually and jointly, parameter constancy and minimizing MSFE are necessary but not sufficient to ensure an adequate forecasting model.

Necessity is shown by considering the implications of lacking either property. Specifically, forecast errors from a nonconstant model contain a predictable element (e.g.) by imposing an incorrect coefficient on the variable with nonconstant moments. For instance, $M_3$ imposes a zero coefficient on $x_{2t}$. Also, a model that does not minimize MSFE does not do so because it makes inefficient use of the information available for forecasting. A model that has constant parameters and does minimize MSFE across a set of models meets a necessary condition for being an adequate forecasting model. However, that condition is only necessary, and is not sufficient.

Table 1. Models for the discussion of MSFE and parameter constancy

Model | Equation | MSFE | Constancy¹
$M_1$ | $y_t = \alpha_1 x_{1t} + u_{1t}$ | $\sigma^2 + \beta_2^2\omega_{22}^* + \beta_3^2\omega_{33}$ | No
$M_2$ | $y_t = \gamma_1 x_{1t} + \gamma_2 x_{2t} + u_{2t}$ | $\sigma^2 + \beta_3^2\omega_{33}$ | Yes
$M_3$ | $y_t = \delta_1 x_{1t} + \delta_3 x_{3t} + u_{3t}$ | $\sigma^2 + \beta_2^2\omega_{22}^*$ | No
DGP | $y_t = \beta_1 x_{1t} + \beta_2 x_{2t} + \beta_3 x_{3t} + \varepsilon_t$ | $\sigma^2$ | Yes

Note:
¹ Model constancy is evaluated under the condition that the variance of $x_{2t}$ changes at time $T+1$, i.e., that $\omega_{22}^* \neq \omega_{22}$. Hence, models excluding $x_{2t}$ are nonconstant.

Figure 1. The ranking of MSFE across models.¹ ² [Schematic of the models linked by arrows; not reproduced.]

Notes:
¹ Arrows denote direction of decreasing MSFE. The ranking of MSFE is indeterminate for each of the following pairs of models: $(M_1, M_4)$, $(M_2, M_3)$, and $(M_3, M_4)$.
² Models in boldface are always constant. Models in roman are nonconstant if $\omega_{22}^* \neq \omega_{22}$, and are constant otherwise.

Lack of sufficiency can be shown as follows. For the DGP and models above, model $M_2$ has constant parameters and, if $\beta_2^2\omega_{22}^*$ is larger than $\beta_3^2\omega_{33}$, $M_2$ minimizes MSFE among the models $M_1$, $M_2$, $M_3$, and $M_4$. However, the forecast errors for $M_2$ are:

$$u_{2j} = y_j - \hat y_{2j} = \beta_3 x_{3j} + \varepsilon_j, \qquad j = T+1,\dots,T+n, \tag{9}$$

and so are predictable on the data sets available to the models $M_1$, $M_2$, $M_3$, and $M_4$. The regressor $x_{3t}$ is valuable in forecasting $y_t$, and $M_2$ ignores that information but $M_3$ does not. Technically speaking, the forecast error $u_{2j}$ is not an innovation with respect to the information set generated by models $M_1$, $M_2$, $M_3$, and $M_4$.
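The predictability in (9) is easy to see in simulated data. The sketch below regresses the forecast errors of $M_2$ on the omitted regressor $x_3$ and, for comparison, does the same with the pure innovation $\varepsilon_j$. It assumes the model coefficients equal their population values (large $T$), and the parameter values, seed, and function name are hypothetical.

```python
import random

rng = random.Random(2024)
beta3, omega33, sigma = 2.0, 1.0, 1.0   # hypothetical values
n = 20000

x3 = [rng.gauss(0.0, omega33 ** 0.5) for _ in range(n)]
eps = [rng.gauss(0.0, sigma) for _ in range(n)]

# Forecast errors, taking coefficients at their population values (large T).
u2 = [beta3 * x + e for x, e in zip(x3, eps)]   # M2 omits x3, as in (9)
u_dgp = eps                                     # a model using all three regressors


def slope_through_origin(x, u):
    """OLS slope from regressing u on x without an intercept."""
    return sum(xi * ui for xi, ui in zip(x, u)) / sum(xi * xi for xi in x)


slope_m2 = slope_through_origin(x3, u2)      # close to beta3: errors are predictable
slope_dgp = slope_through_origin(x3, u_dgp)  # close to zero: nothing left to predict
```

A nonzero slope means the forecast error is not an innovation relative to an information set containing $x_3$, which is the sense in which $M_2$ remains inadequate even when it has the smallest MSFE among $M_1$ through $M_4$.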

This analysis clarifies why minimizing MSFE is not enough for obtaining a good forecasting model. Although model $M_2$ minimizes MSFE over the set of models $M_1$, $M_2$, $M_4$, and (if $\beta_3^2\omega_{33} < \beta_2^2\omega_{22}^*$) $M_3$, the forecast errors of each model may be (in part) predictable from some other model's data.

Conversely, it is possible to create a model from the data of those four models which uniformly dominates $M_1$, $M_3$, $M_4$, and $M_2$ in MSFE, and which has constant parameters. One such model, denoted $M_5$, is:

$$M_5:\quad y_t = \lambda_1 x_{1t} + \lambda_2 x_{2t} + \lambda_3 x_{3t} + u_{5t}, \qquad u_{5t} \sim \mathrm{NID}(0,\sigma_5^2), \tag{10a}$$

with MSFE:

$$\mathrm{MSFE}_5 = \sigma^2. \tag{10b}$$

Model $M_5$ has a smaller MSFE than even $M_2$ and has constant coefficients. Model $M_5$ happens to be the DGP and happens to nest $M_1$, $M_2$, $M_3$, and $M_4$, but neither property is necessary for it to "dominate" the other models in terms of MSFE. Rather, model $M_5$ dominates $M_1$, $M_2$, $M_3$, and $M_4$ because the forecast errors of any one of those four models are in part predictable from the regressors used in $M_5$, but not conversely. Note also that $\varepsilon_j$ itself may be in part predictable on a larger information set, in which case the corresponding model's MSFE would be smaller than $\sigma^2$ in (10b). Hendry (1986) discusses some of these issues in the related context of n-step ahead ex ante forecasts from macro-models.

If a model $M_i$ minimizes the MSFE over a set of models $\{M_j\}$, that shows that the other models are worse in a specific sense. It does not show that $M_i$ is a good forecasting model, even on only the data available in the models $\{M_j\}$. Even jointly, parameter constancy and minimum MSFE do not ensure efficient forecasting from the information available. Hence, there exists a need for more powerful tools in evaluating the forecast performance of models. The key to designing those tools is the information set against which models are being evaluated when MSFEs are compared. In Section 3, information sets resolve the logical status of parameter constancy vis-a-vis minimum MSFE. In Section 4, information sets define a taxonomy for test criteria, with parameter constancy and minimum MSFE being members of that taxonomy.

3. Information Sets

Information sets help clarify both why the MSFE and parameter constancy are sensible criteria for evaluating how "well" a model forecasts, and why having constant parameters and minimizing MSFE over a set of models are not in general sufficient conditions for obtaining an adequate forecasting model. The MSFE and tests of parameter constancy evaluate a given model against different sources of information, being either other models' MSFEs or the given model's fit over the estimation subsample. The former is obvious; the latter follows from (5), the equation for the Chow statistic. Expressed somewhat differently, the Chow statistic evaluates a given model over different subsamples of that model's data, whereas minimizing MSFE evaluates several models over a given subsample but across the models' different datasets. The informational content of an alternative model's data and of the data of one's own model need not be (and generally are not) equivalent, so tests based on those information sets need not give similar results.

4. The Roles of Parameter Constancy and MSFE in Empirical Modeling

This section discusses how parameter constancy and minimizing MSFE fit into a general framework for evaluating (and designing) empirical models. That framework is based on the information sets against which models are evaluated and designed. It clarifies the relationship of parameter constancy and minimizing MSFE to other test statistics. It also results in a new test statistic, forecast-model encompassing, which is a more general and more stringent criterion for evaluating forecast performance than minimizing MSFE.

How well or poorly designed an empirical economic model is depends upon its ability (or lack thereof) to capture salient features of the data and to deliver reliable inference on economic issues (e.g., coefficient estimates, predictions, policy effects). Many statistics exist for evaluating such properties of a model; they relate to goodness-of-fit, absence of residual autocorrelation and heteroscedasticity, valid exogeneity, predictive ability, parameter constancy, the statistical and economic interpretation of estimated coefficients, the validity of a priori restrictions, and the ability of a model to account for properties of alternative models. These test statistics can serve as criteria both for evaluating existing specifications and for designing new ones. Table 2 summarizes the statistics, which are arranged by the type of information generating testable null hypotheses:

(A) the data of one's own model,

(B) the measurement system of the data,

(C) economic theory, and

(D) the data of alternative models.

For details, see Spanos (1986), Hendry and Richard (1982), Ericsson and Hendry (1985), Hendry (1987), and Ericsson, Campos, and Tran (1991).

Parameter constancy belongs to category (A3) in Table 2 [the relative future of the data of one's own model], and is at the heart of model design, both statistically and economically. Most estimation techniques require parameter constancy for valid inference, and those that seem not to do so, still posit "meta-parameters" assumed constant over time. Since economic systems appear far from constant empirically, and the coefficients of derived ("non-structural" or "reduced form") equations may alter when any of the underlying parameters or data correlations change, it is important to identify empirical models which have reasonably constant parameters and which remain interpretable when some change occurs.⁵ That puts a premium on good theory. Conversely, empirical models with constant parameterizations in spite of "structural change" elsewhere in the economy may provide the seeds of fruitful research in economic theory. Parameter constancy typically is evaluated by comparing parameter estimates of a given model obtained from different subsamples of data. Recursive estimation of an equation provides an incisive tool for investigating parameter constancy, both through the sequence of estimated coefficient values and via the associated Chow statistics; cf. Dufour (1982).

⁵ See Goldfeld and Sichel (1990) for a discussion of the nonconstancy of many estimated money-demand equations. That nonconstancy implies nonconstancy in one or more of the equations of the underlying data generation process.

Table 2. Evaluation/design criteria

Information Set | Null Hypothesis | Alternative Hypothesis | References
(A) own model's data | | |
(A1) relative past | innovation errors | first-order residual autocorrelation | Durbin and Watson (1950, 1951)
 | | qth-order residual autocorrelation | Box and Pierce (1970); Godfrey (1978), Harvey (1981, p. 173)
 | | invalid parameter restrictions | Johnston (1963, p. 126)
 | | qth-order ARCH | Engle (1982)
 | | heteroscedasticity quadratic in regressors | White (1980, p. 825), Nicholls and Pagan (1983)
 | | qth-order RESET | Ramsey (1969)
 | normality of the errors | skewness and excess kurtosis | Jarque and Bera (1980)
(A2) relative present | weakly exogenous regressors | invalid conditioning | Sargan (1958, 1980), Engle, Hendry, and Richard (1983)
(A3) relative future | constant parameters, adequate forecasts | parameter nonconstancy, predictive failure | Fisher (1922), Chow (1960), Brown, Durbin, and Evans (1975), Hendry (1979)
(B) measurement system | data admissibility | "impossible" predictions of observables |
(C) economic theory | theory consistency | "implausible" coefficients, predictions; no cointegration | Engle and Granger (1987)
(D) alternative models' data | | |
(D1) relative past | variance dominance | poor fit relative to an alternative model | Hendry and Richard (1982)
 | variance encompassing | inexplicable observed error variance | Cox (1961, 1962), Pesaran (1974), Hendry (1983)
 | parameter encompassing | significant additional variables | Johnston (1963, p. 126), Mizon and Richard (1986)
(D2) relative present | exogeneity encompassing | inexplicable valid conditioning | Hendry (1988)
(D3) relative future | MSFE dominance | poor forecasts relative to those of alternative models | Granger (1989), Granger and Deutsch (1991)
 | forecast encompassing | informative forecasts from alternative models | Chong and Hendry (1986)
 | forecast-model encompassing | regressors from alternative models valuable for forecasting | (this paper)

Minimizing MSFE, like parameter constancy, focuses on the "relative future", but on that of alternative models’ data rather than on the data of one's own model. Thus, MSFE dominance belongs to category (D3). Because the structure of (D3) parallels that of (D1) [the relative past of alternative models’ data], (D1) is briefly discussed to elucidate the connections between the two. Also, for reasons which will be apparent shortly, the criterion of minimizing MSFE will be referred to as MSFE dominance.

Parameter encompassing, variance encompassing, and variance dominance. Consider the following two alternative non-nested linear models, both claiming to explain $y_t$:

$$M_1:\quad y_t = \delta_1' z_{1t} + \nu_{1t}, \qquad \nu_{1t} \sim \mathrm{NID}(0,\sigma_1^2) \tag{11a}$$

$$M_2:\quad y_t = \delta_2' z_{2t} + \nu_{2t}, \qquad \nu_{2t} \sim \mathrm{NID}(0,\sigma_2^2) \tag{11b}$$

The notation is distinct from that in Section 2 above. In (11), $\delta_1$ and $\delta_2$ are $k_1\times 1$ and $k_2\times 1$ vectors of unknown parameters. The vectors $z_{1t}$ and $z_{2t}$ are of $k_1$ and $k_2$ regressors respectively, with each vector having at least some variables which are not in common with those in the other vector. For simplicity, assume that none are in common. To ensure the feasibility of the parameter-encompassing and forecast-model encompassing statistics, assume that $T > k_1+k_2$ and $n > \max(k_1,k_2)$.

As alternative models, (11a) entails the irrelevance of $z_{2t}$ in explaining $y_t$, given $z_{1t}$; and vice versa for (11b). In any event, the variables $y_t$, $z_{1t}$, and $z_{2t}$ are generated by some process, and, under the simplifying (but inessential) assumption of joint normality, $z_{1t}$ and $z_{2t}$ can be linked using:

$$z_{1t} = \Pi z_{2t} + \zeta_{1t}, \tag{12}$$

where $\Pi$ is defined by $\mathcal{E}(z_{2t}\zeta_{1t}')=0$, and (again for expositional simplicity) $\mathcal{E}(\zeta_{1t}\zeta_{1t}')=\Omega$.

Substituting (12) into (11a) obtains:

$$\begin{aligned}
y_t &= \delta_1' z_{1t} + \nu_{1t} \\
&= (\delta_1'\Pi) z_{2t} + (\nu_{1t} + \delta_1'\zeta_{1t}) \\
&= (\delta_2') z_{2t} + \nu_{2t}.
\end{aligned} \tag{13}$$

In (13), the parameter $\delta_2$ and the error $\nu_{2t}$ are derived from (11a) and (12), being $\Pi'\delta_1$ and $\nu_{1t}+\delta_1'\zeta_{1t}$ respectively. Consequently, (13) is what model (11a) predicts model (11b) should find, and it implies several hypotheses, including:

$$H_a:\quad \delta_2 = \Pi'\delta_1 \tag{14a}$$

and

$$H_b:\quad \sigma_2^2 = \sigma_1^2 + \delta_1'\Omega\delta_1. \tag{14b}$$

These hypotheses are called parameter encompassing and variance encompassing respectively, and the positive definiteness of $\Omega$ in the latter implies variance dominance:

$$H_c:\quad \sigma_1^2 < \sigma_2^2. \tag{14c}$$

$H_a$, $H_b$, and $H_c$ are implications of omitted variable bias in (11b), assuming that (11a) plus (12) are the DGP. These three hypotheses, albeit in reverse order, generate the evaluation criteria for (D1) on Table 2; cf. Hendry (1983) and Mizon and Richard (1986).
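The step from variance encompassing to variance dominance can be written out explicitly. The lines below are a short sketch of that reasoning, using the symbols of (11)-(14) with $\zeta_{1t}$ denoting the error in (12); it assumes, as holds when (11a) plus (12) are the DGP, that $\nu_{1t}$ is uncorrelated with $\zeta_{1t}$.

```latex
\begin{aligned}
\sigma_2^2 = \operatorname{Var}(\nu_{2t})
  &= \operatorname{Var}(\nu_{1t} + \delta_1'\zeta_{1t}) \\
  &= \operatorname{Var}(\nu_{1t}) + \delta_1'\,\mathcal{E}(\zeta_{1t}\zeta_{1t}')\,\delta_1
     \qquad \text{(assuming } \nu_{1t} \text{ and } \zeta_{1t} \text{ are uncorrelated)} \\
  &= \sigma_1^2 + \delta_1'\Omega\delta_1 , \\
\sigma_2^2 - \sigma_1^2
  &= \delta_1'\Omega\delta_1 > 0
     \qquad \text{for } \delta_1 \neq 0 \text{ and } \Omega \text{ positive definite.}
\end{aligned}
```

The converse does not hold: $\sigma_1^2 < \sigma_2^2$ can be true without the gap being exactly $\delta_1'\Omega\delta_1$, which is why variance dominance is the weaker of the two hypotheses.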

Parameter encompassing by (11a) of (11b) may be tested using the formula in (14a) or by testing whether $z_{2t}$ is irrelevant if added to (11a). To see the latter, let $\delta_2$ be unconstrained, and define the $k_2\times 1$ vector $\gamma$ as $\delta_2-\Pi'\delta_1$. $H_a$ is equivalent to $\gamma=0$. By substitution of $\delta_2 = \Pi'\delta_1+\gamma$ in (13),

$$y_t = \delta_1' z_{1t} + \gamma' z_{2t} + \nu_{1t}. \tag{15}$$

Thus, $H_a$ is equivalent to claiming that $z_{2t}$ has no power in explaining $y_t$, given $z_{1t}$ (or in explaining the residuals from (11a)). In practice, it is simplest to estimate $\delta_1$ and $\gamma$ jointly in (15) and test $\gamma=0$ with the standard F-statistic.
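The F-test of $\gamma=0$ in (15) can be sketched with nothing more than least squares on the restricted and unrestricted regressions. The code below is a minimal illustration using a hand-rolled OLS routine rather than any econometrics package; the data are simulated from a hypothetical process in which $M_1$ is true, with one regressor per model, and all names, values, and the seed are assumptions made for the example.

```python
import random


def ols_rss(columns, y):
    """Residual sum of squares from regressing y on `columns` (no intercept)."""
    k = len(columns)
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    a = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
         + [sum(p * q for p, q in zip(columns[i], y))] for i in range(k)]
    for i in range(k):
        pivot = a[i][i]
        a[i] = [v / pivot for v in a[i]]
        for r in range(k):
            if r != i:
                factor = a[r][i]
                a[r] = [vr - factor * vi for vr, vi in zip(a[r], a[i])]
    coefs = [row[k] for row in a]
    fitted = [sum(c * col[t] for c, col in zip(coefs, columns)) for t in range(len(y))]
    return sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))


def encompassing_f(own, rival, y):
    """F statistic for gamma = 0 when the rival's regressors are added, as in (15)."""
    rss_restricted = ols_rss(own, y)
    rss_unrestricted = ols_rss(own + rival, y)
    dof = len(y) - len(own) - len(rival)
    return ((rss_restricted - rss_unrestricted) / len(rival)) / (rss_unrestricted / dof)


# Hypothetical DGP in which M1 is true: y = z1 + noise, z1 = 0.5 * z2 + zeta.
rng = random.Random(7)
T = 500
z2 = [rng.gauss(0.0, 1.0) for _ in range(T)]
z1 = [0.5 * v + rng.gauss(0.0, 1.0) for v in z2]
y = [v + rng.gauss(0.0, 1.0) for v in z1]

f_m1_vs_m2 = encompassing_f([z1], [z2], y)   # small: z2 adds nothing to M1
f_m2_vs_m1 = encompassing_f([z2], [z1], y)   # large: z1 matters, so M2 fails
```

Running the test in both directions is what makes it an encompassing comparison: here $M_1$ should parameter-encompass $M_2$, while $M_2$ should be rejected against $M_1$.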

Variance encompassing may be tested either using (14b) or by testing the insignificance in (11a) of the fitted values from (11b). That is, hypothesis $H_b$ can be tested by jointly estimating $\delta_1$ and $\alpha$ in:

$$y_t = \delta_1' z_{1t} + \alpha \hat y_{2t} + \nu_{1t}, \tag{16}$$

and testing that $\alpha=0$ (where $\hat y_{2t}=\hat\delta_2' z_{2t}$). Testing $\alpha=0$ is equivalent to testing $\sigma_2^2 = \sigma_1^2 + \delta_1'\Omega\delta_1$; see Davidson and MacKinnon (1981, p. 789), Hendry (1983), and Mizon and Richard (1986). Equally, $H_b$ is equivalent to claiming that a certain linear combination of $z_{2t}$, namely $\hat y_{2t}$, has no power in explaining the residuals from (11a). Because (16) is testing for the insignificance of that certain linear combination, rather than of any linear combination (as in (15)), the test of $\alpha=0$ is a narrower test than that of $\gamma=0$. The t-statistic on $\alpha$ in (16) is Davidson and MacKinnon's (1981) P statistic: it is asymptotically $N(0,1)$ when $M_1$ is true and is asymptotically equivalent to Cox's (1961) statistic for testing non-nested hypotheses, as applied to linear regression models by Pesaran (1974).
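The t-statistic on $\alpha$ in (16) can be computed directly when each model has a single regressor, since the two-regressor OLS formulas are available in closed form. The sketch below is an illustration under that simplifying assumption; the simulated process, in which $M_1$ is true, and all names, values, and the seed are hypothetical.

```python
import math
import random


def t_on_second(a, b, y):
    """t statistic on b when y is regressed on a and b (two regressors, no intercept)."""
    saa = sum(v * v for v in a)
    sbb = sum(v * v for v in b)
    sab = sum(p * q for p, q in zip(a, b))
    say = sum(p * q for p, q in zip(a, y))
    sby = sum(p * q for p, q in zip(b, y))
    det = saa * sbb - sab * sab
    coef_a = (sbb * say - sab * sby) / det
    coef_b = (saa * sby - sab * say) / det
    rss = sum((yt - coef_a * at - coef_b * bt) ** 2 for yt, at, bt in zip(y, a, b))
    s2 = rss / (len(y) - 2)                 # residual variance estimate
    return coef_b / math.sqrt(s2 * saa / det)


def fitted_values(z, y):
    """Fitted values from regressing y on the single regressor z."""
    slope = sum(p * q for p, q in zip(z, y)) / sum(v * v for v in z)
    return [slope * v for v in z]


# Hypothetical DGP in which M1 is true: y = z1 + noise, z1 = 0.5 * z2 + zeta.
rng = random.Random(11)
T = 500
z2 = [rng.gauss(0.0, 1.0) for _ in range(T)]
z1 = [0.5 * v + rng.gauss(0.0, 1.0) for v in z2]
y = [v + rng.gauss(0.0, 1.0) for v in z1]

t_m1 = t_on_second(z1, fitted_values(z2, y), y)   # alpha in (16): should be insignificant
t_m2 = t_on_second(z2, fitted_values(z1, y), y)   # roles reversed: strongly significant
```

With one regressor per model the fitted values are proportional to the rival regressor, so this test coincides with the test of $\gamma=0$ in (15); the two differ only when $z_{2t}$ has several elements.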

The logic of the hypotheses in (14) is as follows: variance dominance is necessary, but not sufficient, for variance encompassing, which in turn is necessary, but not sufficient, for parameter encompassing. Conversely, if $\gamma = 0$ in (15), then $\alpha$ also must be zero in (16), from which it follows that $\sigma_1^2$ is less than $\sigma_2^2$ because $z_{1t}$ is not an exact linear transformation of $z_{2t}$.

In light of the preceding analysis, it readily follows that MSFE dominance parallels variance dominance and that two other forecast criteria (forecast encompassing and the new forecast-model encompassing) parallel variance encompassing and parameter encompassing. The remainder of this section explores those connections between (D1) and (D3).

Forecast-model encompassing, forecast encompassing, and MSFE dominance. Assume that the two alternative models (11a) and (11b) have been estimated over the sample period [1,T] and are being used to forecast over [T+1,T+n]. The forecasts from the two models are:

$$ \hat{y}_{1j} = \hat{\delta}_1' z_{1j} \tag{17a} $$

$$ \hat{y}_{2j} = \hat{\delta}_2' z_{2j} , \qquad j = T+1,\ldots,T+n, \tag{17b} $$

ignoring (again) the uncertainty arising from estimating coefficients over [1,T]. As with (13) above, under $M_1$,

$$ \begin{aligned} y_j &= \delta_1' z_{1j} + \nu_{1j} \\ &= (\delta_1'\Pi) z_{2j} + (\nu_{1j} + \delta_1'\varepsilon_{1j}) \\ &= (\delta_2') z_{2j} + [(y_j - \hat{y}_{1j}) + \delta_1'\varepsilon_{1j}] \\ &= \hat{y}_{2j} + (y_j - \hat{y}_{2j}) , \end{aligned} \tag{18} $$

where the last line follows from (17b) and the equality $y_j = y_j$. Equation (18) implies two testable hypotheses:

$$ H_a^*:\ \delta_2 = \Pi'\delta_1 \tag{19a} $$

and

$$ H_b^*:\ \mathcal{E}(y_j - \hat{y}_{2j})^2 = \mathcal{E}(y_j - \hat{y}_{1j})^2 + \delta_1'\Omega^*\delta_1 , \tag{19b} $$

where an asterisk * denotes the corresponding hypothesis or matrix over the forecast period. The second hypothesis may be written as:

$$ H_b^*:\ MSFE_2 = MSFE_1 + \delta_1'\Omega^*\delta_1 , \tag{19c} $$

and implies:

$$ H_c^*:\ MSFE_1 < MSFE_2 . \tag{19d} $$

These three hypotheses $H_a^*$, $H_b^*$, and $H_c^*$ are called "forecast-model" encompassing, forecast encompassing, and MSFE dominance. Forecast encompassing could be called MSFE encompassing as well; see Chong and Hendry (1986). From (19d), it follows that an adequate forecasting model must minimize MSFE (asymptotically), but doing so is a necessary (and not sufficient) condition for obtaining an adequate forecasting model, as discussed in Section 2. The design of tests of $H_a^*$ and $H_b^*$ parallels that of those of $H_a$ and $H_b$.

Forecast-model encompassing by (11a) of (11b) may be tested using the formula in (19a) or by testing whether $z_{2j}$ is irrelevant in explaining the forecast errors from (17a). As with parameter encompassing, let $\gamma = \delta_2 - \Pi'\delta_1$. By substitution in (18),

$$ y_j = \delta_1' z_{1j} + \gamma' z_{2j} + \nu_{1j} , \qquad j = T+1,\ldots,T+n, \tag{20a} $$

or

$$ y_j - \hat{y}_{1j} = \gamma' z_{2j} + \nu_{1j} . \tag{20b} $$

$H_a^*$ is equivalent to $\gamma = 0$ and so to claiming that $z_{2j}$ has no power in explaining the forecast errors from (17a). For large T, fixed $z_{ij}$'s, and normal $\nu_{1j}$, it is straightforward to show that the standard F-statistic testing $\gamma = 0$ in (20b) is distributed as $F(k_2, n-k_2)$ under $M_1$; for stochastic (weakly exogenous) $z_{ij}$'s, it is distributed as $F(k_2, n-k_2)$ for large n. See the Appendix for details.
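The regression in (20b) can be sketched as below. The sample sizes are chosen to mirror a 62-observation estimation period and a 38-observation forecast period with $k_2 = 5$, but the data are simulated and every parameter value and function name is an assumption for illustration only.

```python
import numpy as np

def forecast_model_encompassing_f(y_f, yhat1_f, Z2_f):
    """F-statistic for gamma = 0 in (20b): forecast errors of M1 regressed on z2."""
    e1 = y_f - yhat1_f                       # forecast errors from (17a)
    n, k2 = Z2_f.shape
    gamma, *_ = np.linalg.lstsq(Z2_f, e1, rcond=None)
    resid = e1 - Z2_f @ gamma
    rss_u = float(resid @ resid)             # unrestricted residual sum of squares
    rss_r = float(e1 @ e1)                   # gamma = 0 imposed
    return ((rss_r - rss_u) / k2) / (rss_u / (n - k2)), (k2, n - k2)

# Hypothetical simulated data with M1 as the DGP.
rng = np.random.default_rng(0)
T, n, k1, k2 = 62, 38, 2, 5
Z2 = rng.normal(size=(T + n, k2))
Pi = rng.normal(size=(k1, k2))
Z1 = Z2 @ Pi.T + rng.normal(size=(T + n, k1))
y = Z1 @ np.array([1.0, -0.5]) + rng.normal(size=T + n)

# Estimate M1 over [1,T], then forecast over [T+1,T+n].
d1_hat, *_ = np.linalg.lstsq(Z1[:T], y[:T], rcond=None)
yhat1_f = Z1[T:] @ d1_hat

F, dof = forecast_model_encompassing_f(y[T:], yhat1_f, Z2[T:])
print(f"F{dof} = {F:.3f}")
```

Because the statistic conditions on the estimated $\hat{\delta}_1$, its $F(k_2, n-k_2)$ distribution is only a large-T approximation, as the text notes.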

An exact test of forecast-model encompassing also exists. It can be motivated most easily by recognizing why the F-statistic for $\gamma = 0$ in (20b) does not have a simple exact distribution: the statistic conditions upon the estimated value of $\delta_1$, thereby ignoring the uncertainty inherent in the corresponding estimator of $\delta_1$. The solution is simple: estimate $\delta_1$ and $\gamma$ jointly. To do so, consider the following model:

$$ y_t = \delta_1' z_{1t} + \gamma' z_{2t}^* + \nu_{1t} , \qquad t = 1,\ldots,T+n, \tag{21} $$

where the entire data sample [1,T+n] is used, and $z_{2t}^*$ is zero over [1,T] and equal to $z_{2t}$ over [T+1,T+n]. The F-statistic for $\gamma = 0$ in (21) is exactly distributed as $F(k_2, T+n-k_1-k_2)$ under $M_1$ for fixed $z_{it}$'s and normal $\nu_{1t}$, asymptotically so for stochastic (weakly exogenous) $z_{it}$'s. Instrumental variable and recursive generalizations of the test statistics for $\gamma = 0$ in (20b) and (21) follow naturally.6
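The construction of $z_{2t}^*$ in (21) can be sketched as follows, again on simulated data. The dimensions ($T = 62$, $n = 38$, $k_1 = k_2 = 5$) are chosen so that the degrees of freedom come out as (5, 90); the data, parameter values, and function names are hypothetical.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def exact_forecast_model_encompassing_f(y, Z1, Z2, T):
    """F-statistic for gamma = 0 in (21), estimated over the full sample [1, T+n]."""
    N, k1 = Z1.shape
    k2 = Z2.shape[1]
    Z2_star = Z2.copy()
    Z2_star[:T] = 0.0                        # zero over [1,T]; equal to z2 over [T+1,T+n]
    rss_r = rss(y, Z1)                       # gamma = 0 imposed
    rss_u = rss(y, np.hstack([Z1, Z2_star])) # delta1 and gamma estimated jointly
    df = N - k1 - k2
    return ((rss_r - rss_u) / k2) / (rss_u / df), (k2, df)

# Hypothetical simulated data with M1 as the DGP.
rng = np.random.default_rng(0)
T, n, k1, k2 = 62, 38, 5, 5
Z2 = rng.normal(size=(T + n, k2))
Pi = rng.normal(size=(k1, k2))
Z1 = Z2 @ Pi.T + rng.normal(size=(T + n, k1))
y = Z1 @ np.ones(k1) + rng.normal(size=T + n)

F, dof = exact_forecast_model_encompassing_f(y, Z1, Z2, T)
print(f"F{dof} = {F:.3f}")
```

Estimating $\delta_1$ and $\gamma$ in one regression is what removes the dependence on a first-stage estimate and gives the exact distribution described in the text.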

Forecast encompassing by (11a) of (11b) may be tested either using (19c) directly or by testing for the insignificance of the forecast values (17b) in explaining the forecast errors from (17a). Thus, $H_b^*$ can be tested by estimating $\alpha$ in:

$$ y_j - \hat{y}_{1j} = \alpha \hat{y}_{2j} + \nu_{1j} , \qquad j = T+1,\ldots,T+n, \tag{22} $$

and testing that $\alpha = 0$. That is equivalent to testing that $MSFE_2 = MSFE_1 + \delta_1'\Omega^*\delta_1$, following the logic used for variance encompassing. Noting that $\hat{y}_{2j} = \hat{\delta}_2' z_{2j}$, (22) is similar to (16), the principal difference being that the time period is [T+1,T+n] rather than [1,T]. Chong and Hendry (1986) have shown that, for large T and n, the t-statistic on $\alpha$ is N(0,1).7

The logic of the hypotheses in (19) is as follows: MSFE dominance is necessary, but not sufficient, for forecast encompassing, which in turn is necessary, but not sufficient, for forecast-model encompassing. Conversely, if $\gamma = 0$ in (20b), then $\alpha$ [in (22)] must be zero because $\hat{y}_{2j} = \hat{\delta}_2' z_{2j}$. If $\alpha = 0$, then $MSFE_1$ is less than $MSFE_2$ because $z_{1j}$ is not an exact linear transformation of $z_{2j}$.
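The forecast-encompassing regression (22) can be sketched on simulated stationary data as follows. The design, the sample sizes (chosen large so the asymptotic behavior is visible), and all names are assumptions for illustration.

```python
import numpy as np

def forecast_encompassing_t(y_f, yhat1_f, yhat2_f):
    """t-statistic on alpha in (22): M1's forecast errors regressed on M2's forecasts."""
    e1 = y_f - yhat1_f
    x = yhat2_f
    alpha = (x @ e1) / (x @ x)               # OLS slope, no constant term
    resid = e1 - alpha * x
    s2 = (resid @ resid) / (len(e1) - 1)
    return alpha / np.sqrt(s2 / (x @ x))

def run_case(y, Z1, Z2, T):
    """Estimate both models over [1,T], forecast over [T+1,T+n], and test (22)."""
    d1, *_ = np.linalg.lstsq(Z1[:T], y[:T], rcond=None)
    d2, *_ = np.linalg.lstsq(Z2[:T], y[:T], rcond=None)
    return forecast_encompassing_t(y[T:], Z1[T:] @ d1, Z2[T:] @ d2)

# Hypothetical design: z1 = Pi z2 + eps, as in (12).
rng = np.random.default_rng(0)
T, n = 200, 200
Z2 = rng.normal(size=(T + n, 2))
Z1 = Z2 @ np.array([[0.8, 0.3]]).T + rng.normal(size=(T + n, 1))

y_m1 = 1.5 * Z1[:, 0] + rng.normal(size=T + n)              # M1 is the DGP
y_m2 = Z2 @ np.array([1.0, -1.0]) + rng.normal(size=T + n)  # M2 is the DGP

t_m1 = run_case(y_m1, Z1, Z2, T)
t_m2 = run_case(y_m2, Z1, Z2, T)
print(f"t on alpha, M1 true: {t_m1:.2f};  M2 true: {t_m2:.2f}")
```

With stationary data of this kind, the test has power when $M_1$ is mis-specified; the integrated case is a separate matter.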

Forecast-type encompassing and parameter constancy. As illustrated in Section 2, even if the "structural" relationship has constant parameters (e.g., $(\gamma_1, \sigma_1^2)$ in (10a), or $(\delta_1, \sigma_1^2)$ in (11a)), nonconstant population data moments have implications for the empirical constancy (or lack thereof) of mis-specified models. Nonconstant population data moments also have implications for forecast-type encompassing tests. For instance, if the (reduced-form) variance matrix $\Omega$ changes, $MSFE_2$ in $H_b^*$ will alter as that new matrix (i.e., $\Omega^*$ in (19c)) does. Likewise, if the $\Pi$ matrix changes, because $M_2$ (falsely) assumes $\delta_2$ is constant, $M_2$ will have systematic forecast errors which are a function of the changing $\Pi$ matrix. These "predictions" about model behavior suggest a more general encompassing strategy, including predicting problems in alternative models of which their proponents are unaware. Corroborating such phenomena adds credibility to the claim that the successful model reasonably represents the data generation process, whereas disconfirmation clarifies that it does not.

6 The structure of (21) also leads to two classes of forecast-based encompassing tests, one which assumes constancy between the estimation and forecast samples, and one which does not.

7 An extensive literature has developed on the combination or "pooling" of forecasts, i.e., where some (usually linear) combination of forecasts from different models is taken to obtain a new forecast. In comparison to any of the individual model forecasts, that new forecast may have better properties, usually being a smaller MSFE. Given the discussion in the text above, finding such a pooled forecast is prima facie evidence of all individual models being misspecified, and may well indicate that a single model can be constructed which has a smaller MSFE than even the pooled forecast. See Clemen (1989) for a review and bibliography on combining forecasts, and Granger (1989, pp. 187-191) and Diebold (1989) for recent analyses.

5. Forecasting with Integrated and Cointegrated Variables

This section describes a certain lack of invariance present in the forecast-encompassing statistics, and illustrates that lack of invariance with a cointegrated process. The forecast-encompassing statistic is then modified to produce an "invariant" test statistic, which tests "forecast-differential encompassing". See Lu and Mizon (1991) for a related discussion.

The forecast errors $y_j - \hat{y}_{1j}$ and $y_j - \hat{y}_{2j}$ are invariant to nonsingular linear transformations of the corresponding models' data, $(y_j, z_{1j}')'$ and $(y_j, z_{2j}')'$. However, the forecasts themselves are not invariant to such transformations, and so neither is the forecast-encompassing test statistic from (22). Specifically, suppose that both $z_{1j}$ and $z_{2j}$ include the lagged dependent variable $y_{j-1}$, in which case models $M_1$ and $M_2$ may be written without loss of generality with either $y_j$ or $\Delta y_j$ as the dependent variable, where $\Delta$ is the first-difference operator. In the first case, with $y_j$, the auxiliary equation for the forecast-encompassing statistic is (22) as written. In the second case, $\Delta\hat{y}_{2j}$ replaces $\hat{y}_{2j}$ as the right-hand side variable in (22). In both cases, the t-statistic on $\alpha$ is asymptotically N(0,1) under the null hypothesis of $M_1$ being correctly specified. However, the two t-statistics are not necessarily equivalent under mis-specification of $M_1$. This is most apparent when $y_j$ is an integrated process.

To illustrate, suppose that $y_j$, $z_{1j}$, and $z_{2j}$ are each I(1) processes, and that each model ($M_1$ and $M_2$) represents a cointegrating relationship. This could arise if (e.g.) $z_{1j}$ and $z_{2j}$ involved different lag structures of the same underlying variables. From Granger (1986), the forecast errors $y_j - \hat{y}_{1j}$ and $y_j - \hat{y}_{2j}$ are each I(0), whereas the forecasts $\hat{y}_{1j}$ and $\hat{y}_{2j}$ are I(1). Thus, in order for (22) to be "balanced" in terms of orders of integration, $\alpha$ must be zero. Surprisingly, $\alpha$ must be zero even if $M_1$ is mis-specified and $M_2$ is the correct model. That is, the forecast-encompassing test may have no power when the dependent variable is I(1).8,9

For $y_j$, $z_{1j}$, and $z_{2j}$ with the properties specified, both $M_1$ and $M_2$ have error-correction representations. In the error-correction representation, the dependent variable is $\Delta y_j$, rather than $y_j$. The corresponding forecast errors remain unchanged numerically, but the right-hand side variable in (22) becomes $\Delta\hat{y}_{2j}$, which is an I(0) variable, in contrast to $\hat{y}_{2j}$, which is I(1). Balance is unaffected by the value of $\alpha$ in the regression with $\Delta\hat{y}_{2j}$, so the corresponding forecast-encompassing test appears more promising for good power properties than the test based on (22) with $\hat{y}_{2j}$ on the right-hand side. This feature supports formulating forecast models "in I(0) space" as error-correction models, rather than "in I(1) space" in terms of the original [I(1)] levels variables.

It may be desirable to have a forecast-encompassing test which is invariant to nonsingular linear transformations of the data. Such a test may be constructed as follows. As (22) stands, the coefficient on $\hat{y}_{1j}$ is constrained to be unity, while the coefficient on $\hat{y}_{2j}$ is estimated unrestrictedly. Instead, both coefficients could be estimated, with their sum constrained to be unity. The resulting equation can be written as:

$$ y_j - \hat{y}_{1j} = \alpha^* (\hat{y}_{2j} - \hat{y}_{1j}) + \nu_{1j} , \qquad j = T+1,\ldots,T+n, \tag{23} $$

where $\alpha^*$ is estimated unrestrictedly, and $\alpha^* = 0$ is tested. This equation would parallel Davidson and MacKinnon's (1981) J statistic if the coefficients in $\hat{y}_{1j}$ were estimated jointly with $\alpha^*$.

The test from (23) has two important features. First, because the right-hand side variable is the differential between the two forecasts $(\hat{y}_{2j} - \hat{y}_{1j})$ rather than either forecast alone, the right-hand side variable is unaffected by nonsingular linear transformations of the models' data. Thus, the test of $\alpha^* = 0$ is invariant to such transformations. Second, for integrated $y_j$, $z_{1j}$, and $z_{2j}$ with both models cointegrated, the right-hand side variable $(\hat{y}_{2j} - \hat{y}_{1j})$ is I(0), preserving balance. This follows because $\hat{y}_{1j}$ and $\hat{y}_{2j}$ must each cointegrate with $y_j$ (with unit coefficients), and so $\hat{y}_{1j}$ cointegrates with $\hat{y}_{2j}$ (also with unit coefficients). The t-statistic on $\alpha^*$ will be called the forecast-differential encompassing statistic, noting the form of the right-hand side variable.

8 I am grateful to Stephen Hall for bringing to my attention (via David Hendry) the apparently low power empirically of Chong and Hendry's forecast-encompassing test with I(1) forecasted variables. Also, see Hendry (1989, pp. 95-97) on implications of nonsingular linear transformations of a linear model's data.

9 Either or both of models $M_1$ and $M_2$ might lack cointegration, in which case the distributions of the forecast-based test statistics may change. We do not consider such cases here.

As a practical matter, any of a model's forecast errors, its forecasts, or the forecast differential $(\hat{y}_{2j} - \hat{y}_{1j})$ may have a nonzero mean. Thus, the power of the tests from (22) and (23) may be affected by the inclusion of a constant term in the auxiliary regression. Under the null of correct specification, the constant term should have a zero coefficient, so it is appropriate to test that $\alpha$ (or $\alpha^*$) and the constant term's coefficient are jointly zero.
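The invariance property of (23), and the lack of it in (22), can be checked numerically with a small sketch. The forecasts below are hypothetical one-step-ahead forecasts of an I(1) series; the data, coefficients, and names are assumptions for illustration, and no constant term is included in the auxiliary regressions.

```python
import numpy as np

def t_no_const(e, x):
    """t-statistic on the slope in a regression of e on x without a constant."""
    b = (x @ e) / (x @ x)
    resid = e - b * x
    s2 = (resid @ resid) / (len(e) - 1)
    return b / np.sqrt(s2 / (x @ x))

rng = np.random.default_rng(0)
n = 38
x = rng.normal(size=n)                       # hypothetical I(0) regressor used by M1
w = rng.normal(size=n)                       # hypothetical I(0) regressor used by M2
dy = 0.5 * x + rng.normal(scale=0.5, size=n)
y = 10.0 + np.cumsum(dy)                     # I(1) level of the forecast variable
y_lag = np.r_[10.0, y[:-1]]

yhat1 = y_lag + 0.5 * x                      # M1's forecasts of the level
yhat2 = y_lag + 0.3 * w                      # M2's forecasts of the level
e1 = y - yhat1                               # same errors for levels or differences

# (22): regressor is M2's forecast of the level, or of the first difference.
t22_levels = t_no_const(e1, yhat2)
t22_changes = t_no_const(e1, yhat2 - y_lag)

# (23): regressor is the forecast differential in either representation.
t23_levels = t_no_const(e1, yhat2 - yhat1)
t23_changes = t_no_const(e1, (yhat2 - y_lag) - (yhat1 - y_lag))

print(f"(22): levels {t22_levels:.3f}, differences {t22_changes:.3f}")
print(f"(23): levels {t23_levels:.3f}, differences {t23_changes:.3f}")
```

The lagged level cancels in the forecast differential, so the two versions of the statistic from (23) coincide, whereas the two versions from (22) generally do not.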

Before applying several of the above tests to a pair of empirical models, we briefly consider the class of models with time-varying coefficients.

6. Forecasting and Models with Time-varying Coefficients

Time-varying coefficient (TVC) models have been proposed as a means of improving forecast performance. The results above can clarify when that may (or may not) be so.

First, if the data are generated with time-varying coefficients and a correctly specified TVC model is estimated and used for forecasting, then the TVC forecasts will minimize MSFE, and (in general) fixed-coefficient models will have a higher MSFE. Evidence that the TVC model satisfied the evaluation criteria listed on Table 2 would be necessary for the model to be credible as representing the data process.

Second, sometimes an estimated TVC model is recognized as being mis-specified, but it is claimed that the TVC model will forecast better than fixed-coefficient models because the former accounts (in part, at least) for observed parameter nonconstancy in the latter, cf. Chow (1984). However, for general parameter nonconstancy, a TVC model need not minimize MSFE relative to a fixed-coefficient model, even asymptotically and even if the TVC model nests the fixed-coefficient model. An example suffices.

Suppose that the data are generated as:

$$ y_t = \delta_{1t}' z_{1t} + \nu_{1t} , \qquad \nu_{1t} \sim NID(0, \sigma_1^2) , \tag{24} $$

where $z_{1t}$ is stationary, distributed as $N(0, \Psi_{11})$; and $\delta_{1t} = \delta - \theta$ ($\theta \neq 0$) for the first half of the estimation period, $\delta_{1t} = \delta + \theta$ for the second half of the estimation period, and $\delta_{1t} = \delta$ for the forecast period, which is the single observation T+1 (for expository convenience). The fixed-coefficient model is (24) but assumes that $\delta_{1t}$ is constant. The TVC model specifies (e.g.) that $(\delta_{1t} - \delta) = \phi(\delta_{1,t-1} - \delta) + \xi_t$, where $\xi_t$ is assumed to be (e.g.) white noise, $|\phi| < 1$, and $\delta$ is the unconditional mean of $\delta_{1t}$; cf. Swamy and Tinsley (1980) and Chow (1984).

The fixed-coefficient model, although manifesting parameter nonconstancy in sample, has $\mathcal{E}(\hat{\delta}_1) = \delta$, and so has a MSFE of approximately $\sigma_1^2$. The TVC estimate $\hat{\delta}_{1T}$ is approximately $\delta + \theta$ because the TVC estimator places more weight on recent data than on older data. Here, the TVC model does so by obtaining estimates of $\delta$ and $\phi$ which are approximately $\delta$ and unity respectively. For the forecast observation T+1, the TVC model uses the prediction $\hat{\delta}_{1,T+1|T}' z_{1,T+1}$, which is approximately $\hat{\delta}_{1T}' z_{1,T+1}$ or $(\delta + \theta)' z_{1,T+1}$. Thus, the TVC model has MSFE of approximately $\sigma_1^2 + \theta'\Psi_{11}\theta$, which is greater than $\sigma_1^2$.
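The arithmetic of this example can be checked with a small Monte Carlo sketch for a scalar regressor. Note the stand-in used here: instead of estimating a TVC model, the sketch uses OLS on the second half of the sample as a crude "recent-data" estimator whose estimate is close to $\delta + \theta$. That substitution, and all parameter values, are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
delta, theta, sigma, psi = 1.0, 0.5, 1.0, 1.0   # hypothetical scalar values
T, reps = 200, 4000
half = T // 2

# Coefficient path in (24): delta - theta, then delta + theta, then delta at T+1.
coef = np.r_[np.full(half, delta - theta), np.full(T - half, delta + theta), delta]

err_fixed = np.empty(reps)
err_recent = np.empty(reps)
for r in range(reps):
    z = rng.normal(scale=np.sqrt(psi), size=T + 1)
    y = coef * z + rng.normal(scale=sigma, size=T + 1)
    # Fixed-coefficient model: OLS over the whole estimation sample.
    d_fixed = (z[:T] @ y[:T]) / (z[:T] @ z[:T])
    # Stand-in for the TVC estimate: OLS over the second half only.
    d_recent = (z[half:T] @ y[half:T]) / (z[half:T] @ z[half:T])
    err_fixed[r] = y[T] - d_fixed * z[T]
    err_recent[r] = y[T] - d_recent * z[T]

msfe_fixed = float(np.mean(err_fixed**2))
msfe_recent = float(np.mean(err_recent**2))
print(f"fixed-coefficient MSFE ~ {msfe_fixed:.3f}  (sigma^2 = {sigma**2:.3f})")
print(f"recent-data MSFE       ~ {msfe_recent:.3f}  "
      f"(sigma^2 + theta^2 psi = {sigma**2 + theta**2 * psi:.3f})")
```

Under these assumed values the two simulated MSFEs should lie near $\sigma_1^2 = 1$ and $\sigma_1^2 + \theta'\Psi_{11}\theta = 1.25$ respectively, up to sampling and estimation error.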

7. An Empirical Illustration: Models of Narrow Money Demand in the United Kingdom

This section calculates the MSFE and forecast-type encompassing statistics for two models of U.K. money demand from Hendry and Ericsson (1991). The first model is an error correction model, the second is a partial adjustment model, and the forecast period is the 1980s.

Hendry and Ericsson (1991) develop a constant, parsimonious, error correction model of narrow money in the United Kingdom for the period 1964(3)-1989(2). Estimating their equation through 1979 and forecasting over the 1980s obtains the following.

$$ \Delta(m-p)_t = \ldots - \underset{(0.10)}{0.63}\,R^*_t - \underset{(0.013)}{0.102}\,(m-p-y)_{t-1} + \underset{(0.007)}{0.021} \tag{25} $$

T = 62 [1964(3)-1979(4)] + 38 forecasts    $R^2 = 0.69$    $\hat{\sigma} = 1.389\%$

The data are nominal $M_1$ ($M$), 1985 price total final expenditure ($Y$), the corresponding deflator ($P$), and a (learning-adjusted) net interest rate ($R^*$). Lower case denotes logarithms, and standard errors are in parentheses. Hendry and Ericsson (1991) describe the data in their appendix and discuss the statistical and economic merits of (25) in their Section 4.

Hendry and Ericsson (1991) also estimate a partial adjustment model for real narrow money with an autoregressive error, in the spirit of Goldfeld (1973). Contrasting with results on U.S. data, the partial adjustment model appears reasonably constant during the missing money period: the Chow statistic is F[12, 29] = 1.89 [p-value = 0.079] for forecasts over 1973(1)-1975(4). When estimated through 1979 and forecast over the 1980s, the estimates for the partial adjustment model are as follows.

$$ (m-p)_t = \underset{(0.024)}{0.955}\,(m-p)_{t-1} + \underset{(0.020)}{0.087}\,y_t - \underset{(0.07)}{0.78}\,R^*_t - \underset{(0.42)}{0.43} - \underset{(0.13)}{0.31}\,\hat{u}_{t-1} \tag{26} $$

T = 62 [1964(3)-1979(4)] + 38 forecasts    $\hat{\sigma} = 1.572\%$

The coefficient on $\hat{u}_{t-1}$ is the estimated parameter of the (modeled) first-order autoregressive disturbance.

For money-demand equations, the 1980s are of particular interest to forecast, as Goldfeld and Sichel (1990, p. 300) note: "..., in the 1980s, U.S. money demand functions, whether or not fixed up to explain the 1970s, generally exhibited extended periods of underprediction as observed velocity fell markedly." In the United Kingdom, velocity fell by twice the percentage drop in the United States. Somewhat surprisingly, neither model fails the Chow test, as seen in Table 3.

Figure 2 graphs the forecast errors from (25) and (26). Visually, the errors from (25) appear uncorrelated with near zero mean, whereas those from (26) are highly autocorrelated, trending from large and positive in 1980 to large and negative in 1989. These series are the dependent variables in the auxiliary regressions for calculating the forecast-type encompassing test statistics. The particular type of test determines the "independent" variables in those auxiliary regressions. Figures 3 and 4 present those variables for the forecast-encompassing tests. Figure 3 graphs the forecasts of $\Delta(m-p)_t$ from (25) and (26), while Figure 4 graphs the forecasts of $(m-p)_t$ from (25) and (26). For reference, Figures 3 and 4 also include the series being forecast. The initial underprediction and subsequent overprediction by (26) is clear in both graphs although, from the Chow statistic, these deviations are not statistically detectable as a structural break. Even so, the forecast-type encompassing test statistics detect the systematic (and hence predictable) nature of (26)'s forecast errors.

Table 3. Chow, encompassing, and related statistics

Null hypothesis (i.e., hypothesized encompassing model) (notes 1, 2)

Statistic                                  Error correction: (25)                             Partial adjustment: (26)
                                           no constant             constant                   no constant              constant

Chow statistic                             0.73 [0.84] F[38,57]                               0.81 [0.75] F[38,57]
sigma-hat                                  1.389%                                             1.572%
Root MSFE                                  1.232%                                             2.507%

Forecast encompassing (note 3)
  Variable forecast is Δ(m-p)_t            0.00 [0.96] F[1,37]     0.70 [0.50] F[2,36]        4.22 [0.047] F[1,37]     3.78 [0.032] F[2,36]
  Variable forecast is (m-p)_t             1.12 [0.30] F[1,37]     1.21 [0.31] F[2,36]        0.09 [0.77] F[1,37]      63.14 [0.000] F[2,36]

Forecast-differential encompassing         2.27 [0.14] F[1,37]     1.65 [0.21] F[2,36]        125.54 [0.000] F[1,37]   63.35 [0.000] F[2,36]

Forecast-model encompassing                0.85 [0.53] F[5,33]                                29.38 [0.000] F[5,33]
Forecast-model encompassing                0.37 [0.87] F[5,90]                                3.27 [0.009] F[5,90]
  ("exact") (note 4)

Notes:

1. The three entries for a given statistic and equation are: the value of the statistic, the right-hand tail probability associated with that statistic, and the statistic's distribution under the null hypothesis of that equation being correctly specified. The estimation period is 1964(3)-1979(4) [T=62]; the forecast period is 1980(1)-1989(2) [n=38].

2. The phrases "constant" and "no constant" denote whether or not a constant term is included in the auxiliary regression, i.e., in (22) or (23).

3. Quantitatively similar results obtain for the following pairs of variables forecast: $\Delta m_t$ and $m_t$, and $\Delta(m-p-y)_t$ and $(m-p-y)_t$.

4. The full-sample (T=100) values of $\hat{u}_{t-1}$ in (26) are used to calculate the "exact" forecast-model encompassing test statistic.

Figure 2. Forecast errors from two models of narrow money demand in the United Kingdom: equation (25) [the error correction model] and equation (26) [the partial adjustment model].

Figure 3. Actual values of the growth rate of real money in the United Kingdom [$\Delta(m-p)_t$], and the forecast values thereof from equations (25) and (26).

Figure 4. Actual values of the logarithm of real money in the United Kingdom [$(m-p)_t$], and the forecast values thereof from equations (25) and (26).

Specifically, as shown in Table 3, equation (26) fails all forecast-type encompassing tests, except for the forecast-encompassing test with $(m-p)_t$ as the forecast variable and no constant term included in (22). That single lack of failure is due to the approximately zero mean of (26)'s forecast errors and the large nonzero mean of $(m-p)_t$: see Figures 2 and 4 respectively. Inclusion of a constant term results in rejection at the 0.1% level, with the (upwardly) trending $(m-p)_t$ "explaining" the (downwardly) trending forecast errors.

Equation (25) dominates (26) substantially in terms of MSFE. Additionally, (25) encompasses (26) according to all forecast-based encompassing tests.
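The extent of that MSFE dominance can be read off the figures reported in Table 3 with a short calculation. The sketch below only rearranges those reported numbers; the variable names are illustrative.

```python
# Root MSFE and in-sample equation standard error (percent), as reported in Table 3.
root_msfe = {"(25)": 1.232, "(26)": 2.507}
sigma_hat = {"(25)": 1.389, "(26)": 1.572}

# Ratio of MSFEs: squared ratio of the root MSFEs.
msfe_ratio = (root_msfe["(26)"] / root_msfe["(25)"]) ** 2

# Forecast-period root MSFE relative to the estimation-period standard error.
forecast_vs_fit = {eq: root_msfe[eq] / sigma_hat[eq] for eq in root_msfe}

print(f"MSFE of (26) relative to (25): {msfe_ratio:.2f}")
for eq, ratio in forecast_vs_fit.items():
    print(f"root MSFE / sigma-hat for {eq}: {ratio:.2f}")
```

On these figures the MSFE of (26) is roughly four times that of (25), and only (26) forecasts noticeably worse than it fits in sample.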

The results for (25) and (26) show how two models may be empirically constant, yet (at least) one may be inadequate for forecasting. This parallels Proposition 1. In evaluating models of the U.S. trade balance, Marquez and Ericsson (1991) find an empirically nonconstant model that obtains the minimum MSFE with respect to all other models considered. That parallels Proposition 2.

8. Summary

Parameter constancy and minimizing the mean square forecast error are sensible criteria that evaluate empirical models against two different information sets, the data of one's own model and the data of alternative models. MSFE dominance is a necessary condition for two more general criteria for evaluating forecast performance: forecast encompassing and forecast-model encompassing. Parameter constancy, MSFE dominance, and the two types of forecast encompassing fit into a general taxonomy of model evaluation criteria. Satisfying all those evaluation criteria (and not just those of parameter constancy and MSFE dominance) is in general necessary for obtaining an adequate forecasting model. Two models for money demand in the United Kingdom help illustrate the concepts developed.

APPENDIX. Distributions of Statistics for Testing Forecast-Model Encompassing

The basis for the forecast-model encompassing test statistic is the auxiliary regression in (20b):

$$ y_j - \hat{y}_{1j} = \gamma' z_{2j} + \nu_{1j} , \qquad j = T+1,\ldots,T+n. \tag{A1} $$

Under $M_1$, for large T, fixed $z_{ij}$'s, and normal $\nu_{1j}$'s, the dependent variable in (A1) is $\nu_{1j}$ and $\gamma$ is zero, so the standard F-statistic testing $\gamma = 0$ is distributed as $F(k_2, n-k_2)$. For stochastic (weakly exogenous) $z_{ij}$'s, it is distributed as $F(k_2, n-k_2)$ for large n. See Hendry (1979) and Kiviet (1986).

The distribution of the modified forecast-model encompassing test statistic from the auxiliary regression in (21) follows directly from (e.g.) Johnston (1963, Chapter 4).

References

Box, G.E.P. and D.A. Pierce (1970) "Distribution of Residual Autocorrelations in Autoregressive-integrated Moving Average Time Series Models", Journal of the American Statistical Association, 65, 332, 1509-1526.

Brown, R.L., J. Durbin, and J.M. Evans (1975) "Techniques for Testing the Constancy of Regression Relationships over Time", Journal of the Royal Statistical Society, Series B, 37, 2, 149-192 (with discussion).

Campos, J. and N.R. Ericsson (1988) "Econometric Modeling of Consumers’ Expenditure in Venezuela", International Finance Discussion Paper No. 325, Board of Governors of the Federal Reserve System, Washington, D.C.

Chong, Y.Y. and D.F. Hendry (1986) "Econometric Evaluation of Linear Macro-economic Models", Review of Economic Studies, 53, 4, 671-690.

Chow, G.C. (1960) "Tests of Equality Between Sets of Coefficients in Two Linear Regressions", Econometrica, 28, 3, 591-605.

Chow, G.C. (1984) "Random and Changing Coefficient Models", Chapter 21 in Z. Griliches and M.D. Intriligator (eds.) Handbook of Econometrics, Amsterdam, North-Holland, Volume 2, 1213-1245.

Clemen, R.T. (1989) "Combining Forecasts: A Review and Annotated Bibliography", International Journal of Forecasting, 5, 4, 559-583.

Cox, D.R. (1961) "Tests of Separate Families of Hypotheses" in J. Neyman (ed.) Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, Volume 1, 105-123.

Cox, D.R. (1962) "Further Results on Tests of Separate Families of Hypotheses", Journal of the Royal Statistical Society, Series B, 24, 2, 406-424.

Davidson, R. and J.G. MacKinnon (1981) "Several Tests for Model Specification in the Presence of Alternative Hypotheses", Econometrica, 49, 3, 781-793.

Diebold, F.X. (1989) "Forecast Combination and Encompassing: Reconciling Two Divergent Literatures", International Journal of Forecasting, 5, 4, 589-592.

Dufour, J.-M. (1980) "Dummy Variables and Predictive Tests for Structural Change", Economics Letters, 6, 3, 241-247.

Dufour, J.-M. (1982) "Recursive Stability Analysis of Linear Regression Relationships: An Exploratory Methodology", Journal of Econometrics, 19, 1, 31-76.

Durbin, J. and G.S. Watson (1950) "Testing for Serial Correlation in Least Squares Regression. I", Biometrika, 37, 3 and 4, 409-428.

Durbin, J. and G.S. Watson (1951) "Testing for Serial Correlation in Least Squares Regression. II", Biometrika, 38, 1 and 2, 159-178.

Edison, H.J. (1985) "The Rise and Fall of Sterling: Testing Alternative Models of Exchange Rate Determination", Applied Economics, 17, 1003-1021.

Edison, H.J. (1991) "Forecast Performance of Exchange Rate Models Revisited", Applied Economics, 23, 187-196.

Edison, H.J. and J.T. Klovland (1987) "A Quantitative Reassessment of the Purchasing Power Parity Hypothesis: Evidence from Norway and the United Kingdom", Journal of Applied Econometrics, 2, 309-333.

Engle, R.F. (1982) "Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation", Econometrica, 50, 4, 987-1007.

Engle, R.F. and C.W.J. Granger (1987) "Co-integration and Error Correction: Representation, Estimation, and Testing", Econometrica, 55, 2, 251-276.

Engle, R.F., D.F. Hendry, and J.-F. Richard (1983) "Exogeneity", Econometrica, 51, 2, 277-304.

Ericsson, N.R., J. Campos, and H.-A. Tran (1991) "PC-GIVE and David Hendry's Econometric Methodology", International Finance Discussion Paper No. 406, Board of Governors of the Federal Reserve System, Washington, D.C.; forthcoming in the Revista de Econometria.

Ericsson, N.R. and D.F. Hendry (1985) "Conditional Econometric Modeling: An Application to New House Prices in the United Kingdom", Chapter 11 in A.C. Atkinson and S.E. Fienberg (eds.) A Celebration of Statistics: The ISI Centenary Volume, New York, Springer-Verlag, 251-285.

Fair, R.C. (1986) "Evaluating the Predictive Accuracy of Models", Chapter 33 in Z. Griliches and M.D. Intriligator (eds.) Handbook of Econometrics, Amsterdam, North-Holland, Volume 3, 1979-1995.

Fisher, F.M. (1970) "Tests of Equality Between Sets of Coefficients in Two Linear Regressions: An Expository Note", Econometrica, 38, 2, 361-366.

Fisher, R.A. (1922) "The Goodness of Fit of Regression Formulae, and the Distribution of Regression Coefficients", Journal of the Royal Statistical Society, 85, 4, 597-612.

Goldfeld, S.M. (1973) "The Demand for Money Revisited", Brookings Papers on Economic Activity, 1973:3, 577-646 (with discussion).

Goldfeld, S.M. and D.E. Sichel (1990) "The Demand for Money", Chapter 8 in B.M. Friedman and F.H. Hahn (eds.) Handbook of Monetary Economics, Amsterdam, North-Holland, Volume 1, 299-356.

Godfrey, L.G. (1978) "Testing Against General Autoregressive and Moving Average Error Models When the Regressors Include Lagged Dependent Variables", Econometrica, 46, 6, 1293-1301.

Granger, C.W.J. (1986) "Developments in the Study of Cointegrated Economic Variables", Oxford Bulletin of Economics and Statistics, 48, 3, 213-228.

Granger, C.W.J. (1989) Forecasting in Business and Economics, Boston, Academic Press, Second Edition.

Granger, C.W.J. and M. Deutsch (1991) "Comments on the Evaluation of Policy Models", Journal of Policy Modeling, forthcoming (this issue).

Harvey, A.C. (1981) The Econometric Analysis of Time Series, Oxford, Philip Allan.

Hendry, D.F. (1979) "Predictive Failure and Econometric Modelling in Macroeconomics: The Transactions Demand for Money", Chapter 9 in P. Ormerod (ed.) Economic Modelling, London, Heinemann Education Books, 217-242.

Hendry, D.F. (1983) "Comment", Econometric Reviews, 2, 1, 111-114.

Hendry, D.F. (1986) "The Role of Prediction in Evaluating Econometric Models", Proceedings of the Royal Society of London, Series A, 407, 25-34 (with discussion).

Hendry, D.F. (1987) "Econometric Methodology: A Personal Perspective", Chapter 10 in T.F. Bewley (ed.) Advances in Econometrics, Cambridge, Cambridge University Press, Volume 2, 29-48.

Hendry, D.F. (1988) "The Encompassing Implications of Feedback versus Feedforward Mechanisms in Econometrics", Oxford Economic Papers, 40, 1, 132-149.

Hendry, D.F. (1989) PC-GIVE: An Interactive Econometric Modelling System, Version 6.0/6.01, Oxford, University of Oxford, Institute of Economics and Statistics and Nuffield College.

Hendry, D.F. and N.R. Ericsson (1991) "Modeling the Demand for Narrow Money in the United Kingdom and the United States", European Economic Review, 35, 4, 833-881.

Hendry, D.F. and A.J. Neale (1991) "A Monte Carlo Study of the Effects of Structural Breaks on Tests for Unit Roots", Chapter 8 in P. Hackl and A.H. Westlund (eds.) Economic Structural Change: Analysis and Forecasting, Berlin, Springer-Verlag, 95-119.

Hendry, D.F. and J.-F. Richard (1982) "On the Formulation of Empirical Models in Dynamic Econometrics", Journal of Econometrics, 20, 1, 3-33.

Jarque, C.M. and A.K. Bera (1980) "Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals", Economics Letters, 6, 3, 255-259.

Johnston, J. (1963) Econometric Methods, New York, McGraw-Hill.

Kiviet, J.F. (1986) "On the Rigour of Some Misspecification Tests for Modelling Dynamic Relationships", Review of Economic Studies, 53, 2, 241-261.

Lu, M. and G.E. Mizon (1991) "Forecast Encompassing and Model Evaluation", Chapter 9 in P. Hackl and A.H. Westlund (eds.) Economic Structural Change: Analysis and Forecasting, Berlin, Springer-Verlag, 123-138.

Marquez, J. and N.R. Ericsson (1991) "Evaluating Forecasts of the U.S. Trade Balance", forthcoming in R. Bryant, P. Hooper, C. Mann, and R. Tryon (eds.) Empirical Evaluation of Alternative Policy Regimes, Washington, D.C., Brookings Institution.

Meese, R.A. and K. Rogoff (1983) "Empirical Exchange Rate Models of the Seventies: Do They Fit Out of Sample?", Journal of International Economics, 14, 1/2, 3-24.

Mizon, G.E. and J.-F. Richard (1986) "The Encompassing Principle and Its Application to Testing Non-nested Hypotheses", Econometrica, 54, 3, 657-678.

Nicholls, D.F. and A.R. Pagan (1983) "Heteroscedasticity in Models with Lagged Dependent Variables", Econometrica, 51, 4, 1233-1242.

Pagan, A. (1989) "On the Role of Simulation in the Statistical Evaluation of Econometric Models", Journal of Econometrics, 40, 1, 125-139.

Pesaran, M.H. (1974) "On the General Problem of Model Selection", Review of Economic Studies, 41, 2, 153-171.

Ramsey, J.B. (1969) "Tests for Specification Errors in Classical Linear Least-squares Regression Analysis", Journal of the Royal Statistical Society, Series B, 31, 2, 350-371.

Sargan, J.D. (1958) "The Estimation of Economic Relationships Using Instrumental Variables", Econometrica, 26, 3, 393-415.

Sargan, J.D. (1980) "Some Approximations to the Distribution of Econometric Criteria Which Are Asymptotically Distributed as Chi-squared", Econometrica, 48, 5, 1107-1138.

Schinasi, G.J. and P.A.V.B. Swamy (1989) "The Out-of-Sample Forecasting Performance of Exchange Rate Models When Coefficients Are Allowed to Change", Journal of International Money and Finance, 8, 3, 375-390.

Spanos, A. (1986) Statistical Foundations of Econometric Modelling, Cambridge, Cambridge University Press.

Swamy, P.A.V.B. and G.J. Schinasi (1989) "Should Fixed Coefficients Be Re-estimated Every Period for Extrapolation?", Journal of Forecasting, 8, 1-17.

Swamy, P.A.V.B. and P.A. Tinsley (1980) "Linear Prediction and Estimation Methods for Regression Models with Stationary Stochastic Coefficients", Journal of Econometrics, 12, 2, 103-142.

White, H. (1980) "A Heteroskedasticity-consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity", Econometrica, 48, 4, 817-838.

White, H. (1989) Specification Analysis in Econometrics, Cambridge, Cambridge University Press, forthcoming.

Wilson, A.L. (1978) "When Is the Chow Test UMP?", American Statistician, 32, 2, 66-68.


Cite this document

APA

Neil R. Ericsson (1991). Parameter Constancy, Mean Square Forecast Errors, and Measuring Forecast Performance: An Exposition, Extensions, and Illustration (IFDP 1991-412). Board of Governors of the Federal Reserve System, International Finance Discussion Papers. https://whenthefedspeaks.com/doc/ifdp_1991-412

BibTeX

@techreport{wtfs_ifdp_1991_412,
  author = {Neil R. Ericsson},
  title = {Parameter Constancy, Mean Square Forecast Errors, and Measuring Forecast Performance: An Exposition, Extensions, and Illustration},
  type = {International Finance Discussion Papers},
  number = {1991-412},
  institution = {Board of Governors of the Federal Reserve System},
  year = {1991},
  url = {https://whenthefedspeaks.com/doc/ifdp_1991-412},
  abstract = {Parameter constancy and a model's mean square forecast error are two commonly used measures of forecast performance. By explicit consideration of the information sets involved, this paper clarifies the roles that each plays in analyzing a model's forecast accuracy. Both criteria are necessary for "good" forecast performance, but neither (nor both) is sufficient. Further, these criteria fit into a general taxonomy of model evaluation statistics, and the information set corresponding to a model's mean square forecast error leads to a new test statistic, forecast-model encompassing. Two models of U.K. money demand illustrate the various measures of forecast accuracy.},
}