ifdp · July 31, 1986

Should Fixed Coefficients be Reestimated Every Period for Extrapolation?

Abstract

This paper demonstrates that forecast accuracy is not necessarily improved when fixed coefficient models are sequentially reestimated, and used for prediction, after updating the database with the latest observation(s). This is at variance with the now popular method (see Meese and Rogoff (1983, 1985)) of sequentially reestimating fixed coefficient models for prediction as new data "rolls" in. It is argued that although "rolling" may minimize the variance of predictions for some classes of estimators, "rolling" does not necessarily yield accurate predictions (i.e., predictions that are close to actual data). Minimizing the mean squared prediction errors is a necessary condition for maximizing the probability that a given predictor is more accurate than other predictors. This minimization need not require, and may even exclude, the most recent data. A by-product of the demonstration is that for predictors based on the same sample size, a predictor with smaller variance need not be more accurate than another predictor with a larger variance.

International Finance Discussion Papers Number 287

August 1986

SHOULD FIXED COEFFICIENTS BE REESTIMATED EVERY PERIOD FOR EXTRAPOLATION?

by

P.A.V.B. Swamy and Garry J. Schinasi

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors.



P.A.V.B. Swamy and Garry J. Schinasi*

1. Introduction

In making forecasts of future variables, some econometricians reestimate their models using all past data prior to each forecast period. The process involves fixing the starting date and the initial size of the sample and enlarging the sample by adding successive observations for reestimation and prediction as new data become available (see, e.g., Fromm and Klein (1976, p. 9) and Meese and Rogoff (1983, 1985)). It has been suggested that such a procedure improves forecast accuracy for two reasons: first, because a larger sample reduces the variances of fixed coefficient estimators; and second, because sequential estimation or "rolling" captures any variation in coefficients.
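The reestimation procedure just described can be sketched in a few lines. This is only a minimal illustration, assuming a single-regressor model without intercept estimated by least squares; the function names `ols_slope` and `expanding_window_forecasts` and the toy data are hypothetical and are not taken from the studies cited.

```python
def ols_slope(xs, ys):
    """Least squares slope for y = pi * x + e (no intercept; an assumption of this sketch)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx


def expanding_window_forecasts(xs, ys, initial_size):
    """One-step-ahead forecasts with the sample start fixed and the end date advancing.

    For each forecast period t >= initial_size, the coefficient is reestimated
    on observations 0..t-1 and then used to predict ys[t] from xs[t].
    """
    forecasts = []
    for t in range(initial_size, len(ys)):
        pi_hat = ols_slope(xs[:t], ys[:t])
        forecasts.append(pi_hat * xs[t])
    return forecasts


# Toy data (hypothetical): y is exactly 2 * x, so every reestimate equals 2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
forecasts = expanding_window_forecasts(xs, ys, initial_size=3)
print(forecasts)
```

The sketch only fixes ideas about what is being reestimated and when; it says nothing about whether the resulting forecasts are accurate.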

The theory underlying the common impression that one should always use all the available observations in estimation and prediction is more ambiguous than may be generally realized.¹ Without proper theoretical justification, like an explicit risk minimizing motivation (favoring good predictions), a procedure that sequentially updates estimates of coefficients and predictions in a model assumed to have constant coefficients is meaningless. The primary purpose of the paper is to demonstrate that any sequential method of estimating constant coefficients does not necessarily yield accurate forecasts. A by-product of the demonstration is that a predictor with a smaller variance need not be better than another predictor with a larger variance, even though both predictors are based on the same sample.

* Views expressed in this paper are those of the authors and do not reflect the views of the Board of Governors or the staff of the Federal Reserve System. The comments by N. Ericsson, J. Marquez, P. von zur Muehlen and P. Tinsley are greatly appreciated.

1 This impression is also not fully supported by the asymptotic theory, some simple normal cases apart. For example, Lehmann (1983, pp. 352-388) analyzes various nonnormal situations showing that a parameter is more efficiently estimated even in large samples by discarding some sample observations than by using all the observations.

The logic of the basic argument of this paper is as follows. If the objective of estimation is forecast accuracy, then one should prefer predictions that are close to actual realizations to predictions that are close to some other quantities (such as the means of predictors). Consequently, one should select the predictor that has the highest probability of being close to actual realizations. Although it is difficult to derive estimators based on this general criterion, a necessary condition for this probability to be a maximum is for the mean square error of a predictor (i.e., the expected squared deviation of a predictor from realization) to be a minimum. "Rolling" (i.e., sequential estimation) may minimize the forecast variance for broad classes of estimators, but minimizing variance does not necessarily minimize mean square error.² Hence, for any given estimator, "rolling" may reduce the variance of a predictor, but it does not necessarily improve, and may even reduce, forecast accuracy. Note that this is a theoretical result about the forecasting properties of "rolling" regressions and it simply states that there is no reason to believe that "rolling" improves forecast accuracy. The logic of this argument extends to more general cases, where two or more predictors are compared for forecast accuracy using the same sample.

2 The constant coefficient approach is inherently incorrect if some or all of the regression slopes change over time, sample enlargement notwithstanding.

3 In practice, many of these optimal predictors involve unknown parameters and their operational versions based on some sample estimators of the parameters may not satisfy all the conditions for achieving minimum variance. This makes it difficult to recognize an operational predictor with minimum variance in small samples. Because of this difficulty, attention has been shifted to the asymptotically optimal predictors which have the smallest asymptotic variance within the class of asymptotically normal predictors. This does not eliminate the difficulty because some asymptotically optimal predictors do not possess finite variances in small samples and hence violate a necessary condition for maximizing the probability of obtaining forecasts within an interval around actual realizations.
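The gap between variance and mean square error can be made explicit with a standard decomposition. The symbols $\hat{y}$ (a predictor) and $y$ (the realization, treated here as a random variable) are generic and used only for this illustration; finite second moments are assumed.

```latex
E(\hat{y} - y)^2
  = \operatorname{Var}(\hat{y} - y) + \bigl[E(\hat{y} - y)\bigr]^2
  = \operatorname{Var}(\hat{y}) + \operatorname{Var}(y)
    - 2\operatorname{Cov}(\hat{y}, y) + \bigl[E(\hat{y}) - E(y)\bigr]^2 .
```

Under this decomposition, a reduction in $\operatorname{Var}(\hat{y})$ lowers the left-hand side only if the covariance and bias terms do not move against it, which is why a smaller predictor variance by itself does not settle the question of accuracy.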

Section 2 introduces and briefly discusses two forecast criteria that compare forecasts with realizations rather than the means of predictive distributions. We then discuss conventional estimators satisfying certain sampling properties in Section 3. Operationally, some versions of these estimators may not have finite variances. Using the realization-based criterion, Section 4 shows that the best predictor must have minimum mean square error, and that there is no reason why the method of sequential estimation achieves this. Conclusions are presented in Section 5.

2. Forecasting Criteria Based on Realizations

Suppose that we are interested in predicting the value $y_{T+s}$ that would be taken by the random variable $y^*_{T+s}$ in a future period $T+s$, where $T$ is the terminal period of the currently available sample observations on $y_t$.⁴

We now define two criteria for comparing predictors.

Criterion of Highest Concentration

The criterion of highest concentration compares predictions with actual realizations and is defined as follows: an operational predictor $\hat{y}_{T+s}$ of the actual value $y_{T+s}$ is better than any other operational predictor $\tilde{y}_{T+s}$ if the probabilities satisfy the condition

$$\Pr(y_{T+s} - \lambda_1 < \hat{y}_{T+s} < y_{T+s} + \lambda_2) \ge \Pr(y_{T+s} - \lambda_1 < \tilde{y}_{T+s} < y_{T+s} + \lambda_2) \qquad (1)$$

for all possible values of $\lambda_1$ and $\lambda_2$ in a chosen interval $(0, \lambda)$ and for all possible realizations $y_{T+s}$.

A necessary condition that (1) be satisfied for all $\lambda$ is

$$E(\hat{y}_{T+s} - y_{T+s})^2 \le E(\tilde{y}_{T+s} - y_{T+s})^2, \qquad (2)$$

that is, the mean square error of $\hat{y}_{T+s}$ about the actual realization $y_{T+s}$ is a minimum (see Rao (1973, p. 315)). This condition indicates, for example, when a predictor does not satisfy the so-called criterion of highest concentration in (1).

4 We distinguish a random variable from its value by an asterisk. For example, $y_t$ is the value taken by the random variable $y^*_t$ in period $t$.
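A small numerical check may help separate criterion (1) from condition (2). The sketch below assumes two hypothetical discrete distributions of prediction errors (predictor minus realization); the names `interval_prob`, `mse`, `errors_a` and `errors_b` are illustrative only. It shows a case in which one predictor has the smaller mean square error, so (2) holds for it, and yet (1) fails on part of the grid, consistent with (2) being necessary rather than sufficient.

```python
def interval_prob(dist, lam1, lam2):
    """Pr(-lam1 < error < lam2) for a discrete error distribution {value: probability}."""
    return sum(p for e, p in dist.items() if -lam1 < e < lam2)


def mse(dist):
    """Mean square error about the realization: E(error^2)."""
    return sum(p * e * e for e, p in dist.items())


# Hypothetical prediction-error distributions (predictor minus realization).
errors_a = {-1.0: 0.5, 1.0: 0.5}
errors_b = {-3.0: 0.1, 0.0: 0.8, 3.0: 0.1}

mse_a, mse_b = mse(errors_a), mse(errors_b)

# Check criterion (1) for predictor A against B on a grid of (lam1, lam2).
grid = [0.5, 1.5, 2.5, 3.5]
a_dominates = all(
    interval_prob(errors_a, l1, l2) >= interval_prob(errors_b, l1, l2)
    for l1 in grid
    for l2 in grid
)
print(mse_a, mse_b, a_dominates)
```

In this example predictor A has the smaller mean square error, but for narrow intervals around the realization predictor B is the more concentrated one, so A does not satisfy (1).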

Now sufficient conditions will be given for a criterion weaker than (1) to be satisfied. This weaker criterion is based on the concept of Pitman's nearness (PN) which is defined as follows:

Pitman's Nearness (PN)

A predictor $\hat{y}_{T+s}$ is nearer to the value $y_{T+s}$ than another predictor $\tilde{y}_{T+s}$ if $\Pr[L(\hat{y}_{T+s}, y_{T+s}) < L(\tilde{y}_{T+s}, y_{T+s})] > 1/2$, where $L(\hat{y}_{T+s}, y_{T+s})$ represents a loss in predicting $y_{T+s}$ by $\hat{y}_{T+s}$. We may consider two standard loss functions, namely,

$$L_1(\hat{y}_{T+s}, y_{T+s}) = |\hat{y}_{T+s} - y_{T+s}| \quad \text{and} \quad L_2(\hat{y}_{T+s}, y_{T+s}) = (\hat{y}_{T+s} - y_{T+s})^2.$$

With these assumptions, Peddada (1985) has proved that if

(i) $E L_i(\hat{y}_{T+s}, y_{T+s}) < E L_i(\tilde{y}_{T+s}, y_{T+s})$ for $i = 1, 2$; (3a)

(ii) $\bar{e}_i = E e_i = E L_i(\hat{y}_{T+s}, y_{T+s}) - E L_i(\tilde{y}_{T+s}, y_{T+s}) < -2.67$ for $i = 1, 2$; and (3b)

(iii) $E(e_i - \bar{e}_i)^j < j!$ for $i = 1, 2$; $j = 1, 2, \ldots$, (3c)

where $e_i = L_i(\hat{y}_{T+s}, y_{T+s}) - L_i(\tilde{y}_{T+s}, y_{T+s})$; then $\hat{y}_{T+s}$ is closer to $y_{T+s}$ than $\tilde{y}_{T+s}$ in the PN sense.
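The PN probability can be approximated by simulation when the joint distribution of the two predictors is specified. The following is a minimal sketch, assuming independent normal prediction errors with hypothetical standard deviations 0.5 and 2.0; the function `pitman_nearness` and the error arrays are illustrative names, and the check does not verify Peddada's conditions (3a)-(3c).

```python
import random


def pitman_nearness(errors_hat, errors_tilde, loss):
    """Share of draws in which the first predictor has the strictly smaller loss."""
    wins = sum(1 for a, b in zip(errors_hat, errors_tilde) if loss(a) < loss(b))
    return wins / len(errors_hat)


def abs_loss(error):      # L1: absolute prediction error
    return abs(error)


def sq_loss(error):       # L2: squared prediction error
    return error * error


rng = random.Random(0)    # fixed seed so the sketch is reproducible
n = 20000
# Hypothetical prediction errors (predictor minus realization), drawn independently.
errors_hat = [rng.gauss(0.0, 0.5) for _ in range(n)]
errors_tilde = [rng.gauss(0.0, 2.0) for _ in range(n)]

pn_abs = pitman_nearness(errors_hat, errors_tilde, abs_loss)
pn_sq = pitman_nearness(errors_hat, errors_tilde, sq_loss)
print(pn_abs, pn_sq)
```

Because the squared loss is a monotone transformation of the absolute loss, the two loss functions rank each pair of draws in the same way, so the two estimated probabilities should agree up to floating-point ties.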

3. Fixed Coefficients Approaches

With these desirable properties of predictors in hand, we next discuss conventional predictors and their statistical properties. The first step in any statistical method of generating predictions is to formulate a statistical model about the possible data generating process. It is usually postulated that the observations on $y_t$ are generated by the following reduced form equation with fixed coefficients:

$$y^* = X\pi + e^*, \qquad (4)$$

where

$y^*$ = the $(T \times 1)$ vector of variables $y^*_t$, $t = 1, 2, \ldots, T$;

$X$ = the $(T \times K)$ matrix of sample observations on $K$ "fixed" exogenous variables;

$\pi$ = the $(K \times 1)$ vector of reduced-form coefficients; and

$e^*$ = the $(T \times 1)$ vector of reduced-form disturbances.

It is usually assumed that

$$E(e^* \mid X) = E(e^*) = 0 \quad \text{and} \quad E(e^* e^{*\prime} \mid X) = \sigma^2 V. \qquad (5)$$

If model (4) holds for the post-sample periods $t = T+s$, $s = 1, 2, \ldots$, also, then for period $T+s$ the variable $y^*_{T+s}$ given the vector of regressors $x_{T+s}$ will be

$$y^*_{T+s} = x_{T+s}\pi + e^*_{T+s}, \qquad (6)$$

where

$y^*_{T+s}$ = the scalar regressand;

$x_{T+s}$ = the $(1 \times K)$ vector of prediction regressors; and

$e^*_{T+s}$ = the scalar prediction disturbance.

Goldberger (1962) points out that for model (4), an appealing set of assumptions is one which allows the prediction disturbance to be correlated with the sample disturbances. Therefore, we shall assume

$$E(e^*_{T+s} \mid x_{T+s}) = E(e^*_{T+s}) = 0, \qquad (7a)$$

$$E(e^{*2}_{T+s} \mid x_{T+s}) = \sigma_0^2, \quad \text{and} \qquad (7b)$$

$$E(e^*_{T+s} e^* \mid X, x_{T+s}) = w, \qquad (7c)$$

where $w$ is the $T \times 1$ vector of covariances of the prediction disturbance with the vector of sample disturbances.⁵ Thus, implicit in the use of model (4) for prediction is the assumption that the actual value of the regressand for period $T+s$ will be a drawing from a distribution with mean $x_{T+s}\pi$ and variance $\sigma_0^2$.

5 If equation (4) represents an autoregressive model, then $X$ consists of the lagged values of $y$ and $w$ can be equal to 0. Alternatively, the vector $w$ can be zero if equation (4) represents a regression model with serially uncorrelated error term.

The second step in any prediction method is to estimate $\pi$. The vector $\pi$ can be estimated by one or more of the following procedures:

Estimation Procedures

(1) The least squares procedure;

(2) The generalized least squares procedure based on an estimated error covariance matrix;

(3) A fully or partially restricted reduced form procedure that fully or partially accounts for the connection between $\pi$ and the coefficients of a structural model;

(4) A Bayes procedure;

(5) Bayes-like procedures; and

(6) Robust procedures.

Let $\hat{\pi}$ denote any one of these estimators.

The final step is to use an estimated $\pi$ in predicting the value $y_{T+s}$. Even though any or all of the estimation procedures (1)-(6) can be considered for sequential estimation, to save some computations we may, if possible, want to first find the most efficient of all the above six estimators and then use it in a sequential estimation. Furthermore, all the criteria of estimation used by the above estimation procedures may not be compatible with the criterion (1) or Pitman's nearness, as we show in the following section.

A Minimum Variance Predictor

It has been shown by Goldberger (1962) that the minimum variance linear "unbiased" predictor of $y_{T+s}$ is⁶

$$\hat{y}_{T+s} = x_{T+s}\hat{\pi} + \frac{1}{\sigma^2} w' V^{-1} (y^* - X\hat{\pi}) \qquad (8)$$

if $\hat{\pi} = (X'V^{-1}X)^{-1}X'V^{-1}y^*$ is the minimum variance linear unbiased estimator of $\pi$.⁷

It is also shown by Goldberger (1962) that

$$E(\hat{y}_{T+s} - y^*_{T+s})^2 \le E(x_{T+s}\hat{\pi} - y^*_{T+s})^2 \qquad (9)$$

with the equality holding when $w = 0$.
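To make the structure of predictor (8) concrete, the sketch below evaluates it in a deliberately simplified setting: a single regressor ($K = 1$) and a diagonal $V$, with $w$, $V$ and $\sigma^2$ treated as known. These simplifications, the function names `gls_slope` and `goldberger_predictor`, and the numbers are all assumptions made for illustration.

```python
def gls_slope(x, y, v):
    """GLS estimate of pi for y_t = pi * x_t + e_t when V = diag(v)."""
    num = sum(xt * yt / vt for xt, yt, vt in zip(x, y, v))
    den = sum(xt * xt / vt for xt, vt in zip(x, v))
    return num / den


def goldberger_predictor(x, y, v, w, sigma2, x_future):
    """Predictor of the form (8) for one regressor and diagonal V.

    x_future * pi_hat is corrected by (1 / sigma2) * w' V^{-1} (y - X pi_hat),
    which reduces to a weighted sum of residuals because V is diagonal here.
    """
    pi_hat = gls_slope(x, y, v)
    residuals = [yt - pi_hat * xt for xt, yt in zip(x, y)]
    correction = sum(wt * rt / vt for wt, rt, vt in zip(w, residuals, v)) / sigma2
    return x_future * pi_hat + correction


# Hypothetical inputs: w, v and sigma2 are treated as known.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
v = [1.0, 1.0, 2.0, 2.0]
sigma2 = 0.5
w_zero = [0.0, 0.0, 0.0, 0.0]
w_last = [0.0, 0.0, 0.0, 0.4]   # prediction disturbance correlated with the last sample disturbance

pi_hat = gls_slope(x, y, v)
pred_zero = goldberger_predictor(x, y, v, w_zero, sigma2, x_future=5.0)
pred_last = goldberger_predictor(x, y, v, w_last, sigma2, x_future=5.0)
print(pi_hat, pred_zero, pred_last)
```

With `w_zero` the correction term vanishes and the predictor collapses to $x_{T+s}\hat{\pi}$, which is the case in which (9) holds with equality; with `w_last` the last residual shifts the prediction.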

A difficulty with the minimum variance predictor (8) is that $w$ and $\sigma^2 V$ are usually unknown. The inequality (9) may be reversed if $\hat{y}_{T+s}$ represents an operational predictor of the form

$$x_{T+s}\hat{\pi} + \frac{1}{\hat{\sigma}^2} \hat{w}' \hat{V}^{-1} (y^* - X\hat{\pi}), \qquad (10)$$

based on some sample estimates of $w$, $\sigma^2$, and $V$, and $\hat{\pi}$ represents an operational estimator

$$(X'\hat{V}^{-1}X)^{-1}X'\hat{V}^{-1}y^* \qquad (11)$$

or a restricted estimator based on $\hat{V}$, provided the second-order moments of (10) and $\hat{\pi}$ are finite (see Rao (1967, 1975)).⁸ In other words, an operational version of (8) may not have the minimum variance (about $y_{T+s}$) and may even have infinite variance. The estimator (11) may also have infinite variances. The predictors with infinite variances violate condition (2). They may not even be good in the PN sense.

6 A predictor $\hat{y}_{T+s}$ is said to be "unbiased" if $E\hat{y}_{T+s} = Ey^*_{T+s}$.

7 The minimum variance linear unbiased predictor of an element of a vector variable following a vector autoregressive (VAR) model also has the same form. For example, if equation (4) is an autoregressive model imbedded in a VAR model, then $x_{T+s}$ includes the lagged values of $y^*_{T+s}$; $w$ represents the contemporaneous covariances between the dependent variable of equation (4) and the other dependent variables included in the VAR model, $\sigma^2 V$ represents the contemporaneous covariance matrix of these other variables, and $y^*$ and $X$ consist of the current and lagged values of the other variables respectively.

The same difficulty may arise if $\hat{\pi}$ represents a partially or fully restricted reduced-form estimator. The conditions for the finiteness of the moments of partially restricted reduced-form estimators in the normal case are given in Swamy and Mehta (1981, 1982), and Swamy, Mehta and Iyengar (1983). If the predictors of $y_{T+s}$ based on the estimation procedure (3) need to satisfy conditions (3a)-(3c), then Swamy and Mehta's modifications, that guarantee the existence of moments of all orders in small samples, are necessary. Using these modifications, whether the predictor (10) has a smaller variance about $y_{T+s}$ than the predictor $x_{T+s}\hat{\pi}$ depends on how far $E\hat{\pi}$ is from $\pi$ and on the precision of the estimators $\hat{w}$ and $\hat{V}^{-1}(1/\hat{\sigma}^2)$.

A result due to Rao (1967) shows that for a matrix $Z$ of maximum rank such that $X'Z = 0$, the estimator (11) will have bigger variances than the least squares estimator $(X'X)^{-1}X'y^*$ if and only if $X'VZ$ is sufficiently close to a null matrix. In this case the predictor $x_{T+s}(X'X)^{-1}X'y^*$ may have a smaller variance than the predictor (10) or the predictor $x_{T+s}\hat{\pi}$. Thus, it is not possible in practice to recognize an operational predictor with minimum variance in small samples.

8 This comment does not apply if we assume that $w = 0$ because in this case the second term in (10) will be zero.

Asymptotically Efficient Predictors

If the estimation period is sufficiently long, asymptotic theory may apply and predictors (10) and (8) may have the same limiting distributions. This can occur even when $\hat{\pi}$ represents a partially or fully restricted reduced-form estimator. It is possible that the asymptotic variance of the predictor (10) is smaller if $\hat{\pi}$ represents a robust estimator than if $\hat{\pi}$ represents the estimator (11) or a restricted reduced-form estimator. The comparisons of asymptotic variances are relevant for moderate sample sizes if the estimators of $\pi$ are convergent in the $r$th mean. Such estimators are developed in Swamy, Mehta and Iyengar (1983).

On the basis of either the exact finite sample distribution theory or the asymptotic theory, a universally preferred choice among the estimation procedures (1)-(6) is not possible.

4. Does Sequential Estimation Necessarily Improve Forecasting Accuracy?

Suppose that one of the estimation procedures (1)-(6) is used in sequential estimation and the corresponding predictors of $y_{T+s}$ are computed, and further assume that the corresponding predictors possess finite variances. Suppose also that these variances decrease as the estimation period increases. What is the correct interpretation of the predictors of $y_{T+s}$ obtained in this sequential estimation procedure?

We can answer this question by using the following standard result in probability theory. Let $\hat{y}_{T+s}$ and $\tilde{y}_{T+s}$ be two operational predictors of $y_{T+s}$ with finite means $\mu_1$ and $\mu_2$, finite variances $\sigma_1^2$ and $\sigma_2^2$, and distribution functions $F_1$ and $F_2$ respectively. Suppose that both $\hat{y}_{T+s}$ and $\tilde{y}_{T+s}$ are based on the same formula but $\hat{y}_{T+s}$ uses a bigger sample than $\tilde{y}_{T+s}$ so that $\sigma_1^2 < \sigma_2^2$. Then according to a standard result in probability theory,

$$F_1(\mu_1 + \lambda_2) - F_1(\mu_1 - \lambda_1) \ge F_2(\mu_2 + \lambda_2) - F_2(\mu_2 - \lambda_1) \qquad (12)$$

for each possible value $y_{T+s}$ implies that $\sigma_1^2 < \sigma_2^2$. However, the converse proposition is not true. The inequality $\sigma_1^2 < \sigma_2^2$ implies that inequality (12) is true for at least one value $y_{T+s}$ but not necessarily for all possible values of $y_{T+s}$ (see Rao (1973, p. 96)). Thus, if we have two operational predictors with finite second-order moments, then it is not necessarily true that the predictor with a smaller variance will take values around the actual value $y_{T+s}$ with a higher probability than the predictor with a larger variance.

These general arguments can now be applied to the specific choice of two sets of predictions, based on the same predictor using different samples. We can therefore conclude the following about "rolling": if we reestimate a model using all past data prior to each forecast period to obtain one-step ahead forecasts to reduce the variance, then the one-step ahead forecasts will not necessarily be closer to the realized values of the forecasted variable than the multi-step ahead forecasts, even though the former are based on more observations and hence may have smaller variances than the latter.

More formally, comparing the inequality (12) with (1) shows that the criterion of minimum variance ("unbiasedness") prediction only satisfies a necessary condition for maximizing the probabilities of intervals around the mean values of predictors (see (12)) while the criterion of minimum mean square error about actual realizations satisfies a necessary condition for maximizing the probabilities of intervals around actual realizations (see (1)). Therefore, predictors should satisfy condition (2). If predictors satisfying the criterion of highest concentration do not generally exist, then we should satisfy condition (2) (minimum mean square error) as closely as possible.
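The contrast between intervals around a predictor's mean and intervals around the realization can be illustrated numerically. The sketch below assumes two normally distributed predictors with hypothetical means and standard deviations and a hypothetical realization of 0; the helper `prob_near` and all numbers are illustrative and are not part of the original argument.

```python
from statistics import NormalDist


def prob_near(dist, target, lam):
    """Pr(target - lam < predictor < target + lam) for a normally distributed predictor."""
    return dist.cdf(target + lam) - dist.cdf(target - lam)


realization = 0.0                                # hypothetical actual value
pred_small_var = NormalDist(mu=1.0, sigma=0.5)   # smaller variance, mean away from the realization
pred_large_var = NormalDist(mu=0.0, sigma=1.0)   # larger variance, mean at the realization

lam = 0.5
# Intervals around each predictor's own mean: the smaller variance wins.
around_mean_small = prob_near(pred_small_var, pred_small_var.mean, lam)
around_mean_large = prob_near(pred_large_var, pred_large_var.mean, lam)
# Intervals around the realization: the ranking is reversed in this example.
around_real_small = prob_near(pred_small_var, realization, lam)
around_real_large = prob_near(pred_large_var, realization, lam)
print(around_mean_small, around_mean_large)
print(around_real_small, around_real_large)
```

In this constructed case the smaller-variance predictor is the more concentrated one around its own mean and the less concentrated one around the realization, which is the distinction the comparison of (12) with (1) turns on.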

How do we nearly satisfy condition (2)? As explained by Chipman (1976, pp. 612-613), the conditional expectation of the random variable $y^*_{T+s}$ given $y$ will have the minimum mean square error about $y^*_{T+s}$ within a wide class of functions of $y$. If we restrict this class to linear functions or if we assume that $y^*_{T+s}$ and $y^*$ are jointly normally distributed random variables satisfying equations (6) and (4) respectively, then the conditional expectation of $y^*_{T+s}$ given $y$ is the same as the predictor (8) with $\hat{\pi}$ replaced by $\pi$ (see Chipman (1976, pp. 603-604)). Thus, the conditional expectation of $y^*_{T+s}$ given $y$ nearly satisfies condition (2), and it can be evaluated fairly accurately, provided model (6) and assumptions (5), (7a)-(7c) are true.⁹ Another difficulty is that if we believe a priori that model (6) and assumptions (5), (7a)-(7c) are true, then the conditional mean of $y^*_{T+s}$ given $y$ involves $\pi$, $w$ and $\sigma^2 V$ which are usually unknown. The result (12) shows that any procedure of estimating these unknown quantities that attempts to reduce the variance of an estimator of $\pi$ does not necessarily lead to better predictors of $y_{T+s}$.

Minimum Mean Square Error Predictors

An appropriate way to overcome this difficulty is to estimate models using a smaller mean square error criterion. We can consider two cases: normal and non-normal distributions of the errors. For the case in which a multivariate normal mean under quadratic loss is to be estimated, a paper by Natarajan and Strawderman (1985) establishes the existence of two-stage sequential estimators that are better, both in risk (mean square error) and sample size, than the usual estimator of a given fixed sample size. When $e^*$ in equation (4) is normal, the problem of estimating $\pi$ under quadratic loss becomes identical to the case discussed by Natarajan and Strawderman (1985). Given any sample size $T$, we can find a two-stage sequential estimator of $\pi$ truncated at $T$, with a positive probability of stopping earlier and mean square error lower than that of an estimator of $\pi$ based on all the $T$ observations. The predictor of $y_{T+s}$ based on a Natarajan-Strawderman-like estimator of $\pi$ can have smaller mean square error than any of the predictors of $y_{T+s}$ based on an estimator of $\pi$ that uses all the observations available up to $T$ or $T+s-1$.

9 Unfortunately, it is not possible to establish the truth of any logically valid model (see Swamy, Conway and von zur Muehlen (1985)).

Where normality is not appropriate, the advantages of a median or a trimmed mean or a robust estimator relative to the mean are well-known (see Lehmann (1983, pp. 352-388)). The predictors of $y_{T+s}$ based on these estimators of $\pi$ may have smaller mean square errors than predictors of $y_{T+s}$ based on other estimators of $\pi$ that use all the sample observations. Thus, there is no theoretical justification for reestimating models of the type (4) using all past data prior to each forecast period either in the normal case or in a nonnormal case.¹⁰
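The non-normal case can be illustrated with a small Monte Carlo comparison of the sample mean and a trimmed mean as estimators of a location parameter. This is only a sketch under stated assumptions: the contaminated-normal error distribution, the sample size of 20, the trimming of two observations per side, and the names `trimmed_mean`, `simulated_mse` and `contaminated_sample` are all hypothetical choices made for illustration.

```python
import random
import statistics


def trimmed_mean(sample, trim_each_side):
    """Mean after discarding the trim_each_side smallest and largest observations."""
    ordered = sorted(sample)
    kept = ordered[trim_each_side:len(ordered) - trim_each_side]
    return statistics.fmean(kept)


def simulated_mse(estimator, draw_sample, true_location, replications):
    """Monte Carlo estimate of E(estimate - true_location)^2."""
    total = 0.0
    for _ in range(replications):
        total += (estimator(draw_sample()) - true_location) ** 2
    return total / replications


rng = random.Random(1)   # fixed seed so the sketch is reproducible


def contaminated_sample(size=20):
    # Hypothetical non-normal errors: N(0, 1) most of the time, N(0, 10^2) with probability 0.1.
    return [rng.gauss(0.0, 10.0 if rng.random() < 0.1 else 1.0) for _ in range(size)]


mse_mean = simulated_mse(statistics.fmean, contaminated_sample, 0.0, 5000)
mse_trim = simulated_mse(lambda s: trimmed_mean(s, 2), contaminated_sample, 0.0, 5000)
print(mse_mean, mse_trim)
```

Under this particular contamination the estimator that discards observations has the smaller simulated mean square error, in line with the point that using every available observation is not automatically the accurate choice.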

10 If, in fact, the slope coefficients of equation (4) change over time, then it is incorrect to stack the observations as

$$y^* = X\pi + e^*.$$

In other words, the estimators of fixed coefficients are inappropriate when the slope coefficients vary over time. Appropriate estimators for a time-varying parameters model are given in Swamy and Tinsley (1980) and two applications are given in Resler, Barth, Swamy and Davis (1985) and Swamy, Kennickell and von zur Muehlen (1986).


Rao, C.R., 1973, Linear statistical inference and its applications, 2nd ed. (Wiley, New York).

Rao, C.R., 1975, Simultaneous estimation of parameters in different linear models and applications to biometric problems, Biometrics 31, 545-554.

Resler, D.H., J.R. Barth, P.A.V.B. Swamy and W.D. Davis, 1985, Detecting and estimating changing economic relationships: The case of discount window borrowings, Applied Economics 17, 509-527.

Swamy, P.A.V.B. and J.S. Mehta, 1980, On the existence of moments of partially restricted reduced-form coefficients, Journal of Econometrics 14, 183-194.

Swamy, P.A.V.B. and J.S. Mehta, 1981, On the existence of moments of partially restricted reduced-form estimators: A comment, Journal of Econometrics 17, 389-392.

Swamy, P.A.V.B. and J.S. Mehta, 1983, The existence of moments of ridgelike k-class and partially restricted reduced-form estimators, Communications in Statistics 11, 2793-2799.

Swamy, P.A.V.B., J.S. Mehta and N.S. Iyengar, 1983, Convergence of the moments of the modified K-class estimators, Sankhya B 45, 398-412.

Swamy, P.A.V.B. and P.A. Tinsley, 1980, Linear prediction and estimation methods for regression models with stationary stochastic coefficients, Journal of Econometrics 12, 103-142.

Swamy, P.A.V.B., R.K. Conway and P. von zur Muehlen, 1985, The foundations of econometrics--Are there any? (with discussion), Econometric Reviews

Swamy, P.A.V.B., A.B. Kennickell and P. von zur Muehlen, 1985, Forecasting money demand with econometric models, Special Studies Paper, Federal Reserve Board, Washington, D.C.


Cite this document
APA
P.A.V.B. Swamy and Garry J. Schinasi (1986). Should Fixed Coefficients be Reestimated Every Period for Extrapolation? (IFDP 1986-287). Board of Governors of the Federal Reserve System, International Finance Discussion Papers. https://whenthefedspeaks.com/doc/ifdp_1986-287
BibTeX
@techreport{wtfs_ifdp_1986_287,
  author = {P.A.V.B. Swamy and Garry J. Schinasi},
  title = {Should Fixed Coefficients be Reestimated Every Period for Extrapolation?},
  type = {International Finance Discussion Papers},
  number = {1986-287},
  institution = {Board of Governors of the Federal Reserve System},
  year = {1986},
  url = {https://whenthefedspeaks.com/doc/ifdp_1986-287},
  abstract = {This paper demonstrates that forecast accuracy is not necessarily improved when fixed coefficient models are sequentially reestimated, and used for prediction, after updating the database with the latest observation(s). This is at variance with the now popular method (see Meese and Rogoff (1983, 1985)) of sequentially reestimating fixed coefficient models for prediction as new data "rolls" in. It is argued that although "rolling" may minimize the variance of predictions for some classes of estimators, "rolling" does not necessarily yield accurate predictions (i.e., predictions that are close to actual data). Minimizing the mean squared prediction errors is a necessary condition for maximizing the probability that a given predictor is more accurate than other predictors. This minimization need not require, and may even exclude, the most recent data. A by-product of the demonstration is that for predictors based on the same sample size, a predictor with smaller variance need not be more accurate than another predictor with a larger variance.},
}