ifdp · September 30, 1991

Comments on the Evaluation of Policy Models

Abstract

This paper examines the evaluation of models claimed to be relevant for policy making purposes. A number of tests are proposed to determine the usefulness of such models in the policy making process. These tests are applied to three empirical examples.

Board of Governors of the Federal Reserve System International Finance Discussion Papers Number 413 October 1991

COMMENTS ON THE EVALUATION OF POLICY MODELS Clive W.J. Granger and Melinda Deutsch

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors.


Key words and phrases: evaluation, forecasts, forecast encompassing, mean square forecast error, policy models.


§1. Introduction

Applied economic research produces many empirical models of various parts of the economy. The models are evaluated in a variety of ways: some will report the specification search used to reach the final model; some will employ a battery of specification tests looking at missing variables, parameter consistency or heterogeneity; some will use cross-validation or post-sample evaluation techniques; and so forth. Discussion of these procedures and some difficulties that arise can be found in the book of readings, Granger (1990). Many applied papers, as well as some theoretical ones, will end with a section on the "policy implications" of the model. These sections rarely emphasize that, strictly, the policy implications only follow if the model is correct and is the actual data generating mechanism, which is an unreasonably strong assumption. It is also an assumption that cannot be true if one has two competing policy models. Of course, models are built for a variety of purposes and some are fairly easy to evaluate or to compare. For example, if two models are built to provide forecasts, they can be run in real time, after the date of the construction, and the forecasts compared using some pre-agreed cost function or criterion. This approach is less easy to apply to a policy model. A complete evaluation would require some organization, such as the Federal Reserve, to use the model to decide on policy and then to see how well the policy thus achieved actually performs. It is unlikely that most models can be evaluated in such a manner, and so less ambitious methods are required.

* The first author is a Professor of Economics at the University of California, San Diego and the second author is a graduate student at the same institution. This paper was revised while the first author was a Visiting Scholar at the Federal Reserve Board. This paper represents the views of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or members of its staff. Partially supported by NSF grant SES 89-02950.

In this paper we start with a given model that is claimed to have been built for policy purposes. We consider the implications of this model being used by a policy maker to try to keep a single variable of interest near to a series of target values. To do this, a policy variable has its value chosen by use of the model. As background variables change and as targets alter, so the policy variable will take a series of values. It can be argued that each different policy value represents a new "policy regime", but we would prefer to keep this phrase to indicate more momentous, less frequent events such as a change in the variable of interest (from unemployment to inflation), or of the policy variable (from interest rates to money supply), or of the policy model being utilized. In these cases the Lucas critique becomes relevant and tests of super-exogeneity can be employed, as in Engle and Hendry (1989). This is an important aspect of policy model evaluation which we will not consider here, but see Hoover and Sheffrin (1990). We will also not consider the argument from the rational expectations literature that policy cannot be successful, although that argument could be used to give an interpretation to some of our results.

It has been suggested that modern policy makers do not consider specific targets and so the type of control mechanism considered here is unrealistic. As a counterexample, it may be noted that the finance ministers of the G-7 countries meet twice a year to give targets for ten or so indicators of their economies, for use in international policy coordination, as discussed by Frankel (1990). Other examples are also available. It should be noted that the G-7 targets are not made public, and so cannot be used by most economists in evaluation exercises.

In what follows we will assume that targets exist but are not known. It is also assumed that policy makers are optimizing rather than satisficing. A good discussion of alternative approaches to policy is given by van Velthoven (1990).

§2. The Control Mechanism

Suppose that the proposed policy model takes the form

$$Y_t = a + bC_t + kX_t + e_t \qquad (2.1)$$

where $Y_t$ is the variable that the decision maker is trying to influence, called the variable of interest, such as unemployment; $C_t$ is the variable that the decision maker has available as a policy variable, such as money supply; $X_t$ is a vector of other variables that influence $Y_t$; and $e_t$ is the residual, taken to be zero mean white noise. (2.1) is often called the plant equation in the control literature. For the moment it will be assumed that the coefficients $a$, $b$, $k$ are constant, and in particular do not change as $C_t$ changes. Let $T_t$ denote a target series, which gives the desired values of $Y_t$. Denote by $ce_t = Y_t - T_t$ the control error and let $S(ce)$ be the cost function of the decision maker, representing the cost of an error $ce$.

The objective of the policy maker will be taken to be to manipulate $C_{t+1}$ so that $Y_{t+1}$ is as close as possible to $T_{t+1}$ when using a one-step horizon. More precisely, the policy maker will choose $C_{t+1}$ to minimize $E_t[S(ce_{t+1})]$. It should be noted that it is generally not possible to equate $Y_{t+1}$ with $T_{t+1}$ because $X_{t+1}$ and $e_{t+1}$ are random variables that are not perfectly forecastable or controllable. The timing is also important, as $C_{t+1}$ and $T_{t+1}$ are determined at time $t$ but in a sense do not become operative until time $t+1$. Replacing $t$ by $t+1$ in (2.1) and using a least-squares cost function suggests that the forecast of $Y_{t+1}$ made at time $t$ is

$$f^Y_{t+1} = a + bC_{t+1} + k f^X_{t+1} \qquad (2.2)$$

if $C_{t+1}$ is known, as it is potentially for the decision maker. Here $f^X_{t+1}$ is the optimum one-step forecast of $X_{t+1}$ using information available at time $t$ plus $C_{t+1}$. Requiring this forecast to be equal to the target gives

$$C_{t+1} = b^{-1}\left[T_{t+1} - a - k f^X_{t+1}\right] \qquad (2.3)$$

and then the forecast error will be

$$ec_{t+1} = e_{t+1} + k\left(X_{t+1} - f^X_{t+1}\right).$$

Thus if the decision maker were using (2.1) to determine the policy, the values for the policy variable would be given by (2.3). Given a satisfactory method of forecasting $X_{t+1}$, everything else in (2.3) can be taken as known.
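As an illustration of the control rule, the following Python sketch solves the requirement "forecast equals target" for the policy setting, $C_{t+1} = (T_{t+1} - a - k f^X_{t+1})/b$, and then inverts the same relation to recover the target implied by an observed setting. The function names and all numerical values are hypothetical, and $X$ is treated as a scalar purely for simplicity.

```python
def policy_setting(target, a, b, k, x_forecast):
    """Solve a + b*C + k*f_x = target for the policy variable C."""
    if b == 0:
        raise ValueError("b must be nonzero for C to influence Y")
    return (target - a - k * x_forecast) / b


def implied_target(c_observed, a, b, k, x_forecast):
    """Target implied by an observed policy setting under the same model."""
    return a + b * c_observed + k * x_forecast


# Hypothetical coefficients and forecast, chosen only for illustration.
a, b, k = 1.0, -0.5, 0.25
c_next = policy_setting(target=5.0, a=a, b=b, k=k, x_forecast=4.0)
recovered = implied_target(c_next, a, b, k, x_forecast=4.0)
```

Here the round trip returns the original target, which is simply the statement that the decision maker's own conditional forecast coincides with the target when the policy variable is set by the model.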

To an outsider things are rather different, as $C_{t+1}$ will not be known at time $t$ and neither will be $T_{t+1}$. The first can be forecast, but there is often little direct information about targets.

To an outsider, the best forecast of $Y_{t+1}$ will be

$$g^Y_{t+1} = a + b g^C_{t+1} + k g^X_{t+1},$$

according to the model (2.1), which is being investigated, where $g^X_{t+1}$ is the optimum one-step forecast of $X_{t+1}$ using just information available at time $t$, not including $C_{t+1}$. The forecast error will now be

$$\epsilon_{t+1} = Y_{t+1} - g^Y_{t+1} = ec_{t+1} + b\left(C_{t+1} - g^C_{t+1}\right) + k\left(f^X_{t+1} - g^X_{t+1}\right).$$

As $ec_{t+1}$ and the remaining terms will be uncorrelated if (2.1) is correct (as otherwise the size of $ec_{t+1}$ could be reduced), one gets

$$E\left[\epsilon^2_{t+1}\right] \ge E\left[ec^2_{t+1}\right] \qquad (2.4)$$

in general. This suggests the following test for evaluating a policy model. The conditional forecast of $Y_{t+1}$, using information in $C_{t+1}$, should on average be superior to the unconditional forecast, using just information available at time $t$ to an outsider. Note that both forecasts are constructed directly from the model that is being evaluated. Of course, without this constraint, the conditional forecast should always be better than the unconditional forecast, on average, as it is based on more information. It should be noted that equality holds in (2.4) only if $C_{t+1} = g^C_{t+1}$, that is, the control variable $C$ is perfectly forecastable from the information set $I_t$. In this case the decision maker cannot influence $C$ and so this variable is irrelevant as a control. If the model (2.1) is correct, it follows from (2.3) that if $C$ is perfectly forecastable, so will be the target series $T_t$, and also there is no instantaneous causation from $C_t$ to $X_t$. In these circumstances, the model (2.1) or any other model would not be relevant for control purposes. Thus non-rejection of the null hypothesis of equality in (2.4) is equivalent to rejection of (2.1) as a control model. As this hypothesis can be rejected in many ways, it does not follow that (2.1) is useful for policy if the hypothesis is rejected. This evaluation technique is clearly relevant for a model that claims to be a policy model and is illustrated in the next section. It is also briefly discussed in Chong and Hendry (1986).

Nevertheless, (2.3) can be used to estimate the target series as

$$\hat{T}_{t+1} = bC_{t+1} + a + k f^X_{t+1} \qquad (2.5)$$

using the observed value of $C_{t+1}$ and the forecast of $X_{t+1}$. If some of the estimated target values are unrealistic, such as negative unemployment or very high inflation rates, this would clearly imply that the model is inadequate. A more technical test can be used when $Y_t$ is I(1), so that the change series $\Delta Y_t$ is stationary, say. It was pointed out in Granger (1988) that if a policy control is somewhat successful, then at the very least $Y_t$ and $T_t$ must be cointegrated. It follows that $Y_t$ and $\hat{T}_t$ will be cointegrated, which implies that $Y_t - bC_t - kX_t - a$ is I(0) or stationary, which can be tested using standard unit root tests. This evaluation technique is discussed in section 4.

To summarize this discussion, there are several evaluation tests that can be applied to a model that is claimed to be relevant for policy selection, assumed to be of form (2.1).

Test 1

From the model two (sets of) forecasts can be constructed. The "unconditional" one-step forecast of $Y_{t+1}$ made at time $t$ is

$$g^Y_{t+1} = a + b g^C_{t+1} + k g^X_{t+1},$$

where $g^C_{t+1}$ is the optimal one-step forecast of $C_{t+1}$ made at time $t$ using the information set $I_t : X_{t-j}, C_{t-j}, j \ge 1$, and similarly for $X$. "Conditional" one-step forecasts are now made of the form

$$f^Y_{t+1} = a + bC_{t+1} + k f^X_{t+1}$$

where now $f^Y$ and $f^X$ are the optimal one-step forecasts based on the larger information set $J_t : I_t, C_{t+1}$, acting as though $C_{t+1}$ is known at time $t$, which is correct for the decision maker in this framework. Forecast errors will result from the two forecasts:

$$\epsilon_{t+1} = Y_{t+1} - g^Y_{t+1}$$

and

$$ec_{t+1} = Y_{t+1} - f^Y_{t+1},$$

and the null hypothesis tested is equality of the mean squared errors of $\epsilon$ and $ec$. Assuming that these errors have zero means, the null hypothesis is easily tested using the procedure discussed in chapter 9 of Granger and Newbold (1987). Define

$$D_t = \epsilon_t - ec_t$$

$$S_t = \epsilon_t + ec_t$$

and the null hypothesis is then equivalent to testing correlation$(D_t, S_t) = 0$, which follows by noting that $\operatorname{cov}(D_t, S_t) = E[\epsilon_t^2 - ec_t^2]$. The test employed in the next section is to regress $S$ on a constant and $D$. If the coefficient of $D$ is significantly different from zero, conclude that the SSE of the conditional forecasts is significantly different from the SSE of the unconditional forecasts; otherwise conclude they are not. This is the test that is emphasized in the following section. Non-rejection of the null hypothesis is equivalent to a rejection of the usefulness of the model (2.1) for policy purposes. As the forecast horizon used by the policy maker is unknown, the test was repeated for horizons 1, 2, 3, and 4 of the observational time unit.
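The sum-and-difference regression of Test 1 can be sketched in a few lines of Python. The function below forms $D_t$ and $S_t$ from two error series, regresses $S$ on a constant and $D$ by ordinary least squares, and returns the slope with its conventional t-ratio. It is a minimal sketch only: the function name and the numbers are hypothetical, and the standard error assumes serially uncorrelated regression residuals, an assumption that may well fail in practice.

```python
import math


def sum_difference_test(eps, ec):
    """Regress S = eps + ec on a constant and D = eps - ec.

    Returns (slope, t_ratio). The t-ratio uses the textbook OLS standard
    error, which assumes uncorrelated, homoskedastic residuals.
    """
    if len(eps) != len(ec):
        raise ValueError("error series must have the same length")
    n = len(eps)
    if n < 3:
        raise ValueError("need at least three observations")
    d = [u - c for u, c in zip(eps, ec)]
    s = [u + c for u, c in zip(eps, ec)]
    d_bar = sum(d) / n
    s_bar = sum(s) / n
    sxx = sum((x - d_bar) ** 2 for x in d)
    if sxx == 0:
        raise ValueError("D has no variation; the two forecasts coincide")
    sxy = sum((x - d_bar) * (y - s_bar) for x, y in zip(d, s))
    slope = sxy / sxx
    intercept = s_bar - slope * d_bar
    ssr = sum((y - intercept - slope * x) ** 2 for x, y in zip(d, s))
    se = math.sqrt(ssr / (n - 2) / sxx)
    t_ratio = slope / se if se > 0 else math.inf
    return slope, t_ratio


# Hypothetical unconditional (eps) and conditional (ec) forecast errors.
slope, t_ratio = sum_difference_test([2.0, -2.0, 1.0, -1.0],
                                     [1.0, -1.0, 1.0, -1.0])
```

A positive and significant slope would point to a larger mean squared error for the unconditional forecast, and a negative one to the reverse.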

Test 2

If the model (2.1) is correctly specified for the control variable, the conditional forecast should forecast encompass the unconditional forecast, as discussed in Chong and Hendry (1986). This means that the poorer unconditional forecast contains no information that is helpful in improving the quality of the conditional forecast. This has to be true in theory if the forecasts are not being constructed from a given model. The test uses post-sample data to form the regression

$$Y_{t+1} = \alpha_1 + \alpha_2 f^Y_{t+1} + \alpha_3 g^Y_{t+1} + \text{residual}$$

and forecast encompassing occurs if $\alpha_2 = 1$, $\alpha_1 = \alpha_3 = 0$ and the residual is white noise. Similar regressions can be formed for different horizon forecasts but are not reported below.
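A minimal Python sketch of the forecast-encompassing regression follows. It estimates the three coefficients by least squares and computes the Durbin-Watson statistic of the residuals as a rough check on the white-noise requirement. The helper names and the data are hypothetical, and a formal joint test of $\alpha_2 = 1$, $\alpha_1 = \alpha_3 = 0$ is deliberately left out.

```python
def ols(y, regressors):
    """Least-squares coefficients of y on a constant plus the regressors."""
    n = len(y)
    rows = [[1.0] + [series[i] for series in regressors] for i in range(n)]
    p = len(rows[0])
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(row[r] * row[s] for row in rows) for s in range(p)]
           for r in range(p)]
    xty = [sum(row[r] * yi for row, yi in zip(rows, y)) for r in range(p)]
    # Gaussian elimination with partial pivoting.
    for j in range(p):
        pivot = max(range(j, p), key=lambda r: abs(xtx[r][j]))
        if abs(xtx[pivot][j]) < 1e-12:
            raise ValueError("regressors are (nearly) collinear")
        xtx[j], xtx[pivot] = xtx[pivot], xtx[j]
        xty[j], xty[pivot] = xty[pivot], xty[j]
        for r in range(j + 1, p):
            factor = xtx[r][j] / xtx[j][j]
            for s in range(j, p):
                xtx[r][s] -= factor * xtx[j][s]
            xty[r] -= factor * xty[j]
    # Back substitution.
    beta = [0.0] * p
    for r in range(p - 1, -1, -1):
        tail = sum(xtx[r][s] * beta[s] for s in range(r + 1, p))
        beta[r] = (xty[r] - tail) / xtx[r][r]
    return beta


def encompassing_regression(y, f, g):
    """Return (alpha1, alpha2, alpha3) and the residuals of y on 1, f, g."""
    a1, a2, a3 = ols(y, [f, g])
    resid = [yi - a1 - a2 * fi - a3 * gi for yi, fi, gi in zip(y, f, g)]
    return (a1, a2, a3), resid


def durbin_watson(resid):
    """Durbin-Watson statistic; values near 2 suggest no AR(1) correlation."""
    num = sum((resid[i] - resid[i - 1]) ** 2 for i in range(1, len(resid)))
    return num / sum(e * e for e in resid)


# Hypothetical data in which the conditional forecast f is exactly right,
# so f should encompass g.
f = [1.0, 2.0, 3.0, 4.0, 5.0]
g = [2.0, 1.0, 4.0, 3.0, 6.0]
alphas, _ = encompassing_regression(f, f, g)
```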

Test 3

From (2.5), under the assumption that the model (2.1) is correctly specified, the underlying target series can be estimated. The reasonableness of these estimates can be judged by the evaluator. Clearly, this is not a formal test.

Test 4

If the target series is I(1), then $Y_t$ will be I(1), and this can be tested by standard techniques such as augmented Dickey-Fuller tests. It will then follow that $z_t = Y_t - bC_t - kX_t - a$ is I(0), or stationary, with zero mean. Again $z_t$ can be tested for the null of I(1) and, if this null is rejected and $z_t$ has declining autocorrelations, one may accept its stationarity. Note that this is equivalent to testing for cointegration between $Y_t$, $C_t$, and $X_t$, but with given coefficients, as determined by the model. If these coefficients were unconstrained and $X_t$ included lagged $Y_t$, the test would be of little interest as $z_t$ should then always be I(0).
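The mechanics of Test 4 can be sketched as follows, assuming a scalar $X$ and using a Dickey-Fuller regression without augmentation lags (a simplification of the augmented test used in the empirical sections). The function names and the series are hypothetical. The resulting t-ratio is compared with Dickey-Fuller critical values rather than the usual t tables.

```python
import math


def model_residual(y, c, x, a, b, k):
    """z_t = Y_t - b*C_t - k*X_t - a with the coefficients fixed by the model."""
    return [yt - b * ct - k * xt - a for yt, ct, xt in zip(y, c, x)]


def dickey_fuller_t(z):
    """t-ratio on z_{t-1} in the regression of (z_t - z_{t-1}) on a constant
    and z_{t-1}. No lagged differences are included (an assumption made to
    keep the sketch short)."""
    dz = [z[i] - z[i - 1] for i in range(1, len(z))]
    lag = z[:-1]
    n = len(dz)
    if n < 3:
        raise ValueError("series too short")
    lag_bar = sum(lag) / n
    dz_bar = sum(dz) / n
    sxx = sum((v - lag_bar) ** 2 for v in lag)
    if sxx == 0:
        raise ValueError("lagged series has no variation")
    sxy = sum((v - lag_bar) * (w - dz_bar) for v, w in zip(lag, dz))
    rho = sxy / sxx
    intercept = dz_bar - rho * lag_bar
    ssr = sum((w - intercept - rho * v) ** 2 for v, w in zip(lag, dz))
    if ssr == 0:
        raise ValueError("perfect fit; t-ratio undefined")
    return rho / math.sqrt(ssr / (n - 2) / sxx)


# Hypothetical, strongly mean-reverting series.
z = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0]
t_stat = dickey_fuller_t(z)
# Standard Dickey-Fuller tables give an asymptotic 5% critical value of
# roughly -2.86 for the constant-only case; more negative values reject I(1).
```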

Test 5

In the above test, it is assumed that the target series $T_t$ is unobserved, which is usually the case. If the targets are available, $C_{t+1}$ could be estimated from (2.3), giving $\hat{C}_{t+1}$, and then a regression run of the form

$$C_{t+1} = \alpha + \beta \hat{C}_{t+1} + \text{error}$$

and the null hypothesis that the model is correct is tested for the joint requirements that $\alpha = 0$, $\beta = 1$, and the error is white noise. This test is not considered in the following section.

Test 6

A further test that can be performed, but which is also not considered in the empirical sections, is to ask if some other policy variable (or its unanticipated component) is missing from the model. Thus, for example, in a model relating money supply to unemployment, unanticipated government expenditure can be considered as a possible missing variable using a Lagrange Multiplier test. If such a potential control variable is missing, the original model cannot be used as a successful policy model.

It should be noted that these tests are not necessarily strong ones and represent only necessary conditions that a policy model should possess. The tests are for a single model and may not be helpful in comparing models. It is also assumed that $C_t$, or a component of it, is a potentially controllable variable. The tests are not inter-related in a simple way as they concentrate on different types of possible mis-specification of (2.1).

To perform tests 1 and 2 it is necessary to form unconditional forecasts of the control variable $C_t$. This will usually come from a reduced form regression of $C_t$ on various explanatory variables, including lagged $C_t$. This regression can be thought of as an approximation to a reaction function for $C_t$. Clearly, if this equation is badly specified, then this could be a reason why the null hypotheses considered in the two tests are rejected. In the empirical model considered in the next section, the control variable is taken to be the unanticipated change in money (denoted $DMR$), and thus its unconditional forecast will just be zero for all horizons.

There is relatively little discussion in econometrics about how to evaluate a policy model as compared to models designed for forecasting or hypothesis testing. In other disciplines there is consideration given to policy evaluation, but it is usually nontechnical; see, for example, Nagel (1990).

§3. An Application to a Model of the Unemployment Rate.

The first empirical example examines a model of the unemployment rate proposed by Barro and Rush (1980), which is based on the Natural Rate/Rational Expectations hypothesis. Although this model was not formulated for policy making purposes, it will be used to illustrate some of the concepts presented in this paper. The model of the unemployment rate consists of two equations: the money growth equation and the unemployment rate equation. The money growth equation is given by:

$$DM_t = a_0 + \sum_{j=1}^{6} a_{1j} DM_{t-j} + a_2 FEDV_t + \sum_{j=1}^{3} a_{3j} UN_{t-j} + DMR_t$$

where $M$ is a quarterly average of M1, $DM_t = \log M_t - \log M_{t-1}$, $FEDV_t$ is an estimate of the deviation from the normal value of the real expenditure of the federal government, $U$ is the quarterly average unemployment rate, and $UN_t = \log\left(U_t/(1-U_t)\right)$. The residual, $DMR_t$, is the unanticipated change in the money supply, and is taken to be the control variable. Thus, for this example, the above equation is used only to estimate the control variable since it is unknown to the investigator.

The equation explaining movements in the unemployment rate is given by

$$UN_t = b_0 + \sum_{j=0}^{10} b_{1j} DMR_{t-j} + b_2 MIL_t + b_3 UN_{t-1} + b_4 UN_{t-2} + \text{residual}$$

where $MIL_t$ = (military personnel / male population aged 15-44).

This equation is the plant equation. Note that the future values of $MIL_t$ are taken as known when forming the conditional and unconditional forecasts.

In addition, the following univariate model of the unemployment rate was used as a benchmark for comparison.

$$UN_t = a_0 + a_1 UN_{t-1} + a_2 UN_{t-2} + a_3 UN_{t-3} + \text{residual}.$$

The above equations were estimated using quarterly data. The estimated money growth equation was

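$$
\begin{aligned}
DM_t = {} & \underset{(3.05)}{.02} + \underset{(4.94)}{.42}\,DM_{t-1} + \underset{(.55)}{.05}\,DM_{t-2} + \underset{(1.36)}{.12}\,DM_{t-3} \\
& - \underset{(1.71)}{.15}\,DM_{t-4} + \underset{(3.21)}{.28}\,DM_{t-5} - \underset{(.31)}{.083}\,DM_{t-6} + \underset{(1.45)}{.01}\,FEDV_t \\
& + \underset{(1.15)}{[\ldots]}\,UN_{t-1} + \underset{(.21)}{[\ldots]}\,UN_{t-2} + \underset{(.87)}{[\ldots]}\,UN_{t-3}
\end{aligned}
$$

$T = 148$ [1952:1-1988:4], $R^2 = .51$, D.W. $= 2.00$, $\hat{\sigma} = .00710$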

(the moduli of the t-values are shown in parentheses). The estimate of Barro's unemployment equation was

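$$
\begin{aligned}
UN_t = {} & -\underset{(6.03)}{.50} - \underset{(2.87)}{2.02}\,DMR_t - \underset{(1.64)}{1.19}\,DMR_{t-1} - \underset{(3.42)}{2.46}\,DMR_{t-2} \\
& - \underset{(3.27)}{2.51}\,DMR_{t-3} - \underset{(2.06)}{1.67}\,DMR_{t-4} - \underset{(2.03)}{1.71}\,DMR_{t-5} \\
& - \underset{(2.53)}{2.14}\,DMR_{t-6} - \underset{(1.40)}{1.16}\,DMR_{t-7} - \underset{(1.24)}{1.00}\,DMR_{t-8} \\
& - \underset{(.01)}{.08}\,DMR_{t-9} + \underset{(.05)}{.04}\,DMR_{t-10} - \underset{(5.52)}{.76}\,MIL_t \\
& + \underset{(19.64)}{1.40}\,UN_{t-1} - \underset{(9.10)}{.64}\,UN_{t-2} + \text{residual}
\end{aligned}
$$

$T = 123$ [1954:3-1985:1], $R^2 = .98$, D.W. $= 2.06$, $\hat{\sigma} = .04869$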

The univariate model was estimated to be

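$$UN_t = -\underset{(2.45)}{.12} + \underset{(19.59)}{1.66}\,UN_{t-1} - \underset{(5.67)}{.85}\,UN_{t-2} + \underset{(1.65)}{.14}\,UN_{t-3} + \text{residual}$$

$T = 138$ [1950:4-1985:1], $R^2 = .96$, D.W. $= 2.01$, $\hat{\sigma} = .06545$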

In order to test the usefulness of this model for policy making purposes the following analysis was performed. First, the conditional and unconditional forecasts were computed for the model using data from 1985:1 to 1988:4. Recall that if the model is useful for policy purposes, then the conditional forecasts should outperform the unconditional forecasts. In order to determine whether this is the case, these forecasts were compared in a number of ways. First, for test 1, the root mean sum of squared forecast errors (RMSFE) was computed and is displayed in Table 1. Note that for all forecast horizons, the RMSFE of the unconditional forecasts was smaller than that of the conditional forecasts.
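The RMSFE comparison itself is straightforward; a minimal sketch, with a hypothetical function name and made-up numbers rather than the series used in Table 1, is:

```python
import math


def rmsfe(actual, forecast):
    """Root mean squared forecast error over the evaluation sample."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(y - f) ** 2 for y, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical post-sample outcomes and two competing sets of forecasts.
actual = [1.0, 2.0, 3.0, 4.0]
conditional = [1.5, 1.5, 3.5, 3.5]
unconditional = [1.0, 2.0, 3.0, 2.0]
rmsfe_conditional = rmsfe(actual, conditional)
rmsfe_unconditional = rmsfe(actual, unconditional)
```

In this made-up case the conditional forecasts have the smaller RMSFE, which is the ordering a useful policy model would be expected to show.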

While this suggests that the conditional forecasts do not outperform the unconditional forecasts, it is desirable to test whether the SSE are significantly different from each other.

Unfortunately, some of the assumptions necessary for the validity of this test appear to be violated in this example. In particular, the one-step ahead forecast errors appear to be autocorrelated, and this serial correlation is not completely corrected for by assuming that the residuals from the relevant regression are AR(1). A similar problem seems to hold for the 2, 3, and 4 step ahead forecast errors. In some of the cases, the regression results also indicate that the forecast errors are biased. If, for illustrative purposes, the apparent violation of these two key assumptions is ignored, the conclusion drawn from the test results for all forecast horizons is that the SSE of the conditional and unconditional forecasts are not significantly different.

For test 2, the one-step regression was completely unsatisfactory, with $f^Y_{t+1}$ having a negative coefficient (of -1.12) and the residuals containing strong serial correlation, with a Durbin-Watson statistic of .35. The results clearly indicate that the conditions required by test 2 for the model to be satisfactory for policy purposes are not found.

Test 4 suggests that an additional test of model adequacy should be employed if the variable that the decision maker is trying to influence, $Y_t$, is I(1): namely, the estimated target from the model under consideration should be cointegrated with $Y_t$ (see footnote 1). As a first step in implementing this test, the augmented Dickey-Fuller (ADF) test with two lags and a constant term (see Granger and Newbold (1989) for a discussion of this test) was used to test for stationarity of the unemployment series. The test statistic was 2.95, thus the results are inconclusive as to whether the unemployment series is I(0) or I(1). However, it shall be assumed, for illustrative purposes, that $U$ is I(1) so that the test for cointegration can be performed (see footnote 2). The cointegrating regression of unemployment on the estimated target from Barro's model and a constant was run, and the ADF test with two lags was used on the residuals from this regression to determine if they were I(0). The t-statistic from the Dickey-Fuller test with a constant term is 19.80, indicating that unemployment and the estimated target from Barro's model are cointegrated. Thus, the unemployment model does not fail this simple test.

For test 3, as a final test of the usefulness of the model for policy purposes, the estimated target series for the model was computed. It does not appear from Figure 1 that these values are unrealistic, so the model does not fail this simple test.

Footnote 1. For this example, it should be noted that the estimated target was approximated by the in-sample fitted values from the unemployment model since the number of out-of-sample values was too small to perform the unit root test. In addition, $UN$ and its associated target were transformed to levels since the unemployment models considered in this section estimate the I(0) variable $UN_t = \log\left(U_t/(1-U_t)\right)$, where $U_t$ is the quarterly average unemployment rate.

Footnote 2. It should also be noted that the coefficients of the lagged terms in the univariate model for $UN$ add almost to one (1.66 - .85 + .14 = .95).


The above results, taken together, strongly suggest that the conditional forecasts are not superior to the unconditional forecasts and thus, on this basis, we conclude that the model is not useful for policy purposes. The error autocorrelation, inter alia, points to model mis-specification, and the latter may be why the forecast tests are what they are.

§4. An Application to Two Models of the Demand for Borrowed Reserves.

The second example examines two models of the demand for borrowed reserves. These models are relevant to policy since the Federal Reserve targets intermediate reserves, and borrowing from the Discount Window has an obvious impact on the reserves. In addition, studies by Keir (1981) and others suggest that the Federal Reserve uses a version of these models. The first model was proposed by Goldfeld and Kane (1966). Briefly, a bank is assumed to have an exogenous reserve need. It can either borrow from the Federal Reserve or from an alternative source. The bank minimizes its total cost of borrowing by choosing the optimal amount of borrowing from each source. The model proposed by Goldfeld and Kane is

$$R^B_t = a_0 + a_1 K_t + a_2 R^B_{t-1} + a_3 \Delta R^{UB}_t + e_t$$

where $R^B_t$ is the level of borrowing from the Discount Window, $\Delta R^{UB}_t$ is the change in unborrowed reserves, $K = i_D - i_s$, $i_D$ is the discount rate, and $i_s$ is the interest rate of the alternative source.

Dutkowsky (1984) extended Goldfeld and Kane's model by arguing that a switching regression model gave a more accurate representation of the behavior of unborrowed reserves. The switching regression model proposed by Dutkowsky is

$$
R^B_t =
\begin{cases}
a^L_0 + a^L_1 K_t + a^L_2 R^B_{t-1} + a^L_3 \Delta R^{UB}_t + u^L_t & \text{if } K_t \le K^* \\
a^U_0 + a^U_1 \log K_t + a^U_2 R^B_{t-1} + a^U_3 \Delta R^{UB}_t + u^U_t & \text{if } K_t > K^*
\end{cases}
$$

where $K^*$ is an unobservable switching point that needs to be estimated. The discount rate $i_D$ is the control variable throughout this section.

The above models were estimated using seasonally adjusted monthly data. Due to the difference in the sample period and the seasonal adjustment of the data, an additional lag of $R^B$ was found to be significant and included in the regression results. The Goldfeld and Kane model was estimated to be

$$
\begin{aligned}
R^B_t = {} & \underset{(6.02)}{97.18} + \underset{(6.19)}{91.06}\,K_t + \underset{(22.93)}{1.03}\,R^B_{t-1} \\
& - \underset{(6.71)}{.24}\,R^B_{t-2} - \underset{(4.11)}{.12}\,\Delta R^{UB}_t + \text{residual}
\end{aligned}
$$

$T = 198$ [1959:7-1975:12], $R^2 = .95$, D.W. $= 1.89$, $\hat{\sigma} = 129.03$

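For the Dutkowsky model, the unobservable switchpoint, $K^*$, which maximized a likelihood function, was found using a grid search and estimated to be 0.15. Dutkowsky's estimated model was: for $K_t > .15$,

$$
\begin{aligned}
R^B_t = {} & \underset{(5.78)}{225.93} + \underset{(4.86)}{107.35}\,\log(K_t) + \underset{(21.70)}{1.06}\,R^B_{t-1} \\
& - \underset{(6.66)}{.27}\,R^B_{t-2} - \underset{(4.16)}{.15}\,\Delta R^{UB}_t + \text{residual}
\end{aligned}
$$

and, for $K_t < .15$,

$$
\begin{aligned}
R^B_t = {} & \underset{(2.87)}{77.45} + \underset{(1.92)}{64.76}\,K_t + \underset{(6.74)}{.81}\,R^B_{t-1} \\
& - \underset{(.83)}{.07}\,R^B_{t-2} - \underset{(.09)}{.005}\,\Delta R^{UB}_t + \text{residual}
\end{aligned}
$$

$T = 192$ [1960:1-1975:12], $R^2 = .95$, D.W. $= 1.91$, $\hat{\sigma} = 130.06$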
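To make the switching structure concrete, the sketch below evaluates the fitted value of borrowed reserves from the two estimated regimes reported above. It is illustrative only: the function name is hypothetical, the logarithm is assumed to be natural, the two lagged terms are read as $R^B_{t-1}$ and $R^B_{t-2}$, and an observation with $K_t$ exactly equal to the switchpoint is assigned to the lower regime by assumption.

```python
import math

K_STAR = 0.15  # estimated switchpoint reported in the text


def predict_borrowing(k, rb_lag1, rb_lag2, d_rub):
    """Fitted borrowed reserves from the estimated switching regression.

    k        : spread K_t
    rb_lag1  : borrowed reserves lagged one month (assumed lag)
    rb_lag2  : borrowed reserves lagged two months (assumed lag)
    d_rub    : change in unborrowed reserves
    """
    if k > K_STAR:
        # Upper regime: borrowing responds to log(K_t); natural log assumed.
        return (225.93 + 107.35 * math.log(k) + 1.06 * rb_lag1
                - 0.27 * rb_lag2 - 0.15 * d_rub)
    # Lower regime (K_t at or below the switchpoint, by assumption).
    return (77.45 + 64.76 * k + 0.81 * rb_lag1
            - 0.07 * rb_lag2 - 0.005 * d_rub)


# Hypothetical inputs, one in each regime.
upper = predict_borrowing(k=1.0, rb_lag1=100.0, rb_lag2=100.0, d_rub=0.0)
lower = predict_borrowing(k=0.1, rb_lag1=100.0, rb_lag2=100.0, d_rub=0.0)
```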


The univariate model was

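$$R^B_t = \underset{(2.86)}{48.44} + \underset{(18.32)}{1.28}\,R^B_{t-1} - \underset{(1.36)}{.16}\,R^B_{t-2} - \underset{(2.95)}{.21}\,R^B_{t-3} + \text{residual}$$

$T = 200$ [1959:5-1975:12], $R^2 = .93$, D.W. $= 2.01$, $\hat{\sigma} = 164.97$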

For test 1 unconditional forecasts of $i_D$ are required. As $i_D$ is I(1), an AR(3) model for $\Delta i_D$ was estimated as

$$\Delta i_{D,t} = .54\,\Delta i_{D,t-1} + .10\,\Delta i_{D,t-3} + \text{white noise}$$

Thus the unconditional forecast of $i_{D,t+1}$ is the unconditional forecast of $\Delta i_{D,t+1}$ plus $i_{D,t}$. Multiple step forecasts are easily formed.
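The recursion behind these multi-step forecasts can be sketched as follows, using the two estimated coefficients reported above (.54 on the first lagged difference and .10 on the third). The function name and the input series are hypothetical; future shocks are set to their expected value of zero.

```python
def forecast_discount_rate(levels, horizon, phi1=0.54, phi3=0.10):
    """Unconditional forecasts of the level of the discount rate.

    Forecast differences follow d(t+h) = phi1*d(t+h-1) + phi3*d(t+h-3),
    and each level forecast is the previous level plus the forecast change.
    """
    if len(levels) < 4:
        raise ValueError("need at least four observations of the level")
    # Last three observed differences, oldest first.
    diffs = [levels[i] - levels[i - 1]
             for i in range(len(levels) - 3, len(levels))]
    level = levels[-1]
    path = []
    for _ in range(horizon):
        d_next = phi1 * diffs[-1] + phi3 * diffs[-3]
        level += d_next
        path.append(level)
        diffs = diffs[1:] + [d_next]
    return path


# Hypothetical history: the rate was flat at 5 and then rose to 6.
path = forecast_discount_rate([5.0, 5.0, 5.0, 6.0], horizon=3)
```

With this history the forecast changes are .54, .2916 and .257464, so the forecast levels rise gradually from 6 rather than jumping.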

The interpretation of the results for this example proved to be less straightforward than for the first example and seemed to depend in part on the forecast horizon considered. Following the analysis above, the RMSFE of the conditional and unconditional forecasts was computed for both models for 1976:1 to 1978:12 and is displayed in Table 2. It can be seen that the conditional RMSFE was less than the unconditional RMSFE for some forecast horizons.

Further investigation was needed to determine whether these differences were statistically significant. In particular, the test involving the sum and differences of the forecast errors was employed and the results are displayed in Table 3.

The conclusion of test 1 obtained for the Goldfeld and Kane model is that the conditional and unconditional forecasts are not significantly different for any of the forecast horizons. For the Dutkowsky model, the forecast errors for steps one and four were found to be significantly different, suggesting, surprisingly, that the unconditional forecasts were superior to the conditional forecasts for those forecast horizons. The significantly smaller unconditional RMSFE is prima facie evidence of model mis-specification, and the latter may be why the forecast tests are what they are.

For test 2, as a further test of the superiority of the conditional forecasts, the conditional and unconditional forecasts were combined using regression (3.1). For a one-step horizon, the estimated parameters were as follows:

Goldfeld and Kane's Model:

$$R^B_{t+1} = \underset{(.78)}{46.34} - \underset{(1.50)}{9.41}\,f_{t+1} + \underset{(1.64)}{10.36}\,g_{t+1} + \text{residual}$$

$T = 36$ [1976:1-1978:12], $R^2 = .70$, D.W. $= 2.47$, $\hat{\sigma} = 228.39$

which is hardly interpretable, and

Dutkowsky's Model:

$$R^B_{t+1} = \underset{(1.37)}{84.00} + \underset{(.06)}{.12}\,f_{t+1} + \underset{(.09)}{.73}\,g_{t+1} + \text{residual}$$

$T = 36$ [1976:1-1978:12], $R^2 = .66$, D.W. $= 2.54$, $\hat{\sigma} = 244.31$

Neither of these applications of test 2 supports the usefulness of the models for policy purposes.

Lastly, for test 3, the estimated target series for each model was computed. It does not appear from Figures 2 and 3 that these values are unrealistic.

Taken together, the above results indicate that neither model is useful for policy purposes since the above analysis suggests that the conditional forecasts do not appear to be superior to the unconditional forecasts.


As a final observation, it is interesting to note that most of the tests described above can be used to examine the performance of a model during different regimes. For example, an investigator may believe that Dutkowsky's model is only useful for policy making purposes when $K > K^*$. Thus, the SSE of the conditional and unconditional forecasts for the two regimes may be calculated. The results are displayed in Table 4. Again, there is no clear-cut superiority of the conditional forecasts over the unconditional ones.

§5. An Application to a Model for the Demand for Narrow Money in the United Kingdom.

The final example examines a model of the demand for narrow money in the United Kingdom. The model proposed by Hendry and Ericsson (1991) is

$$\Delta(m-p)_t = a_0 + a_1 \Delta p_t + a_2 \Delta(m-p-y)_{t-1} + a_3 R^*_t + a_4 (m-p-y)_{t-1} + \epsilon_t$$

where $\Delta(m-p)_t$ is the growth rate of real money, $\Delta p_t$ is the inflation rate, $(m-p-y)_t$ is the inverse of velocity, and $R^*_t$ is the learning-adjusted net interest rate. (See Hendry and Ericsson (1991) for a detailed description of the variables.) For the sample period 1964:1-1985:2, the above model was estimated to be

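$$
\begin{aligned}
\Delta(m-p)_t = {} & \underset{(4.64)}{.02} - \underset{(5.27)}{.70}\,\Delta p_t - \underset{(2.98)}{.19}\,\Delta(m-p-y)_{t-1} \\
& - \underset{(8.24)}{.62}\,R^*_t - \underset{(9.94)}{.09}\,(m-p-y)_{t-1} + \text{residual}
\end{aligned}
$$

$T = 86$ [1964:1-1985:2], $R^2 = .69$, D.W. $= 2.15$, $\hat{\sigma} = .01344$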

The estimated univariate model was

$$\Delta(m-p)_t = \underset{(2.45)}{.26}\,\Delta(m-p)_{t-1} + \underset{(2.29)}{.25}\,\Delta(m-p)_{t-2} + \text{residual}$$

$T = 86$ [1964:1-1985:2], $R^2 = .16$, D.W. $= 1.96$, $\hat{\sigma} = .02170$

For test 1 unconditional forecasts of $R^*_t$ are required. A model for $\Delta R^*_t$ was estimated as

$$\Delta R^*_t = .01 - .13\,R^*_{t-1} + \text{white noise}$$

The unconditional forecast of $R^*_{t+1}$ is the unconditional forecast of $\Delta R^*_{t+1}$ plus $R^*_t$. Multiple step forecasts are easily formed.


Following the analysis in the first example, the RMSFE of the conditional and unconditional forecasts was computed for both models for 1985:3 to 1989:2 and is displayed in Table 5. It can be seen that the conditional RMSFE was less than the unconditional RMSFE for all forecast horizons.

Further investigation was needed to determine whether these differences were statistically significant. In particular, the test involving the sum and differences of the forecast errors was employed and the results are displayed in Table 5. The conclusion of test 1 obtained for the Hendry and Ericsson model is that the conditional and unconditional forecasts are not significantly different for steps one, two, and three but are significantly different for step four.

For test 2, as a further test of the superiority of the conditional forecasts, the conditional and unconditional forecasts were combined using regression (3.1). For a one-step horizon, the estimated parameters were as follows:

$$\Delta(m-p)_{t+1} = -\underset{(.26)}{.002} + \underset{(2.65)}{1.15}\,f_{t+1} - \underset{(.62)}{.31}\,g_{t+1} + \text{residual}$$

$T = 16$ [1985:3-1989:2], $R^2 = .56$, D.W. $= 2.32$, $\hat{\sigma} = .01316$

Lastly, for test 3, the estimated target series for each model was computed. The estimated target series and the observed A(m—p) are within 1 to 3 percentage points of each other, which is not unrealistic.

Taken together, the above results indicate that the usefulness of the above model for policy purposes may depend on the forecast horizon used by the policy maker, since the above analysis suggests that the conditional forecasts do not appear to be superior to the unconditional forecasts for steps one, two, and three but are superior for step four. However, it should be noted that one interpretation of the failure of the forecast tests to reject is that they lack power due to the small sample size of the forecasts.

§6 Conclusion.

Despite the importance of reliable models used in the policy making process, there has been little consideration given to the evaluation of policy models in the econometric literature. This paper discussed a number of tests that could be used for such a purpose. While these tests provide only necessary properties that a policy model should possess, they do aid the decision maker by excluding some inadequate models.


Appendix

M1 (1950:1-1958:12): Banking and Monetary Statistics, Board of Governors of the Federal Reserve System (1976)

M1 (1959:1-1988:12): Citibase Series FM1

Unemployment Rate, All Workers, Including Resident Armed Forces: Citibase Series LHURR

F, Total Federal Government Expenditures: Citibase Series GGFEX

DEFLAT, Implicit Price Deflator, Federal Government: Citibase Series GDGGF

MALE, Male Population Aged 15-44: Sum of Citibase Series PANM4, PANM5, PANM6, PANM7, PANM8, and PANM9

Civilian Population: Citibase Series POPCIV

Total Population Including Armed Forces Overseas: Citibase Series POP

Table 6. Data from the First Example


The above series were converted into quarterly values when appropriate and the following variables were formed.

List of Variables for Example 1

$$DM_t = \log M1_t - \log M1_{t-1}$$

$$UN_t = \log(\ldots)$$

$$FED_t = \frac{F_t}{DEFLAT_t}$$

$$FEDV_t = \log FED_t - [\log FED]^*_t, \qquad [\log FED]^*_t = .05\,[\log FED]_t + .95\,[\log FED]^*_{t-1}$$

$$MIL_t = \frac{POP_t - CIV_t}{MALE_t}$$

Table 7.
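The construction of the money growth and federal expenditure variables can be sketched as below. The sketch rests on several assumptions: that FED is F divided by DEFLAT, that the "normal" level of log FED is the recursion with weights .05 and .95 listed above, and that the recursion is started at the first observed value of log FED (the source does not state the starting value). The function name `build_example1_variables` and the sample numbers are hypothetical.

```python
import numpy as np


def build_example1_variables(m1, f, deflat, weight=0.05, start=None):
    """Form DM and FEDV from quarterly M1, F, and DEFLAT series.

    DM_t   = log M1_t - log M1_{t-1}             (one observation shorter)
    FED_t  = F_t / DEFLAT_t                      (assumed reading)
    FEDV_t = log FED_t - [log FED]*_t, with
    [log FED]*_t = weight * log FED_t + (1 - weight) * [log FED]*_{t-1}
    """
    m1 = np.asarray(m1, dtype=float)
    f = np.asarray(f, dtype=float)
    deflat = np.asarray(deflat, dtype=float)

    dm = np.diff(np.log(m1))
    log_fed = np.log(f / deflat)

    # Assumption: the recursion starts from the first observation
    # unless a starting value is supplied by the caller.
    prev = log_fed[0] if start is None else float(start)
    normal = np.empty_like(log_fed)
    for t, x in enumerate(log_fed):
        prev = weight * x + (1.0 - weight) * prev
        normal[t] = prev

    fedv = log_fed - normal
    return dm, fedv


if __name__ == "__main__":
    # Hypothetical quarterly values, for illustration only.
    dm, fedv = build_example1_variables(
        m1=[100.0, 101.0, 103.0, 104.0],
        f=[50.0, 52.0, 51.0, 55.0],
        deflat=[1.00, 1.01, 1.02, 1.03],
    )
    print("DM:  ", dm)
    print("FEDV:", fedv)
```

Because the weight on the current observation is small, the choice of starting value affects FEDV for many quarters, so in practice the recursion would be started well before the estimation sample.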

Total Borrowings at Reserve Banks: Citibase Series FECMB

Discount Rate, Federal Reserve Bank of New York: Citibase Series FYGD

Federal Funds Rate: Citibase Series FYFF

Table 8. Data from the Second Example

The above series were seasonally adjusted and the following variables were formed:

List of Variables for Example

1 2 ARY® = RY? — RUB

Table 9.


References

Barro, R.J., and Rush, M. (1980) Unanticipated Money and Economic Activity. In Rational Expectations and Economic Policy (Stanley Fischer, Ed.). University of Chicago Press (for Nat. Bur. Econ. Res.), Chicago.

Chong, Y.Y., and Hendry, D.F. (1986) Econometric Evaluation of Linear Macro-Economic Models, Review of Economic Studies, LIII: 671-690. (Also chapter 17 of Granger (1990).)

Dutkowsky, D. (1984) The Demand for Borrowed Reserves: A Switching Regression Model, The Journal of Finance, 39: No. 2, 407-424.

Engle, R.F., and Hendry, D.F. (1989) Testing Superexogeneity and Invariance. Discussion Paper, Economics Department, University of California, San Diego, Paper No. 89-51, San Diego, California.

Frankel, J.A. (1990) International Nominal Targeting: A Proposal for Overcoming Obstacles to Policy Coordination. Working Paper, Economics Department, University of California, Berkeley.

Goldfeld, S.M., and Kane, E.J. (1966) The Determinants of Member Bank Borrowing: An Econometric Study, Journal of Finance 21:499-514.

Granger, C.W.J., and Newbold, P. (1987) Forecasting Economic Time Series. Academic Press, New York.

Granger, C.W.J. (1988) Causality, Cointegration, and Control, Journal of Economic Dynamics and Control, 12:551-559.

Granger, C.W.J. (Ed.) (1990) Modelling Economic Time Series: Readings in Econometric Methodology, Oxford University Press.

Hendry, D. and Ericsson, N. (1990) Modeling the Demand for Narrow Money in the United Kingdom and the United States, European Economic Review 35, 4, 833-881.

Hoover, K.D. and Sheffrin, S.M. (1990) Causation, Spending, and Taxes: Sand in the Sandbox or Tax Collector for the Welfare State? Working Paper, Economics Department, University of California, Davis.

Keir, P. (1981) Impact of Discount Policy Procedures on the Effectiveness of Reserve Targeting. In New Monetary Control Procedures, Federal Reserve Staff Study, Board of Governors of the Federal Reserve System, Vol. 1, February.

Nagel, S.S. (1990) Policy Theory and Policy Evaluation. Greenwood Press, New York.

Van Velthoven, B.C.J. (1990) The Applicability of the Traditional Theory of Economic Policy, Journal of Economic Surveys 4:59-88.


[Table residue. Column headings: Steps Ahead; Univariate; Cond; Uncond; D (Dutkowsky) Coeff. and |t|; D (Goldfeld/Kane) Coeff. and |t|. Surviving entries: -3.35, 28, -1.57, 53, -.04; 1.84, 1.86; 2.55, 2.62, 2.95, 3.44, 4.03.]

Table 4.


[Table residue. Column headings: Steps Ahead; Univariate. Surviving entry: 8.0.]

Table 5.

Figure 1. Estimated Target from Barro's Model (series plotted: unemployment and Barro's target; horizontal axis: Date, 85:3 to 88:3)


Figure 2. Estimated Target from the Goldfeld/Kane Model (series plotted: borrowed reserves and the Goldfeld/Kane target; horizontal axis: Date, 76:1 to 78:12)

Figure 3. Estimated Target from the Dutkowsky Model (series plotted: borrowed reserves and the Dutkowsky target; horizontal axis: Date, ending 78:12)



Cite this document
APA
Clive W.J. Granger and Melinda Deutsch (1991). Comments on the Evaluation of Policy Models (IFDP 1991-413). Board of Governors of the Federal Reserve System, International Finance Discussion Papers. https://whenthefedspeaks.com/doc/ifdp_1991-413
BibTeX
@techreport{wtfs_ifdp_1991_413,
  author = {Clive W.J. Granger and Melinda Deutsch},
  title = {Comments on the Evaluation of Policy Models},
  type = {International Finance Discussion Papers},
  number = {1991-413},
  institution = {Board of Governors of the Federal Reserve System},
  year = {1991},
  url = {https://whenthefedspeaks.com/doc/ifdp_1991-413},
  abstract = {This paper examines the evaluation of models claimed to be relevant for policy making purposes. A number of tests are proposed to determine the usefulness of such models in the policy making process. These tests are applied to three empirical examples.},
}