ifdp · July 31, 1991

PC-GIVE and David Hendry's Econometric Methodology

Abstract

This paper summarizes David Hendry's empirical econometric methodology, unifying discussions in many of his and his co-authors' papers. Then, we describe how Hendry's suite of computer programs PC-GIVE helps users implement that methodology. Finally, we illustrate that methodology and the programs with three empirical examples: postwar narrow money demand in the United Kingdom, nominal income determination in the United Kingdom from Friedman and Schwartz (1982), and consumers' expenditure in Venezuela. These examples help clarify the methodology's central concepts, which include cointegration, error-correction, general-to-simple modeling, dynamic specification, model evaluation and testing, parameter constancy, and exogeneity.

Board of Governors of the Federal Reserve System International Finance Discussion Papers Number 406

August 1991


Neil R. Ericsson, Julia Campos, and Hong-Anh Tran

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors.


Key words and phrases: cointegration, conditional models, dynamic specification, encompassing, error-correction models, exogeneity, general-to-simple modeling, Hendry, model evaluation, parameter constancy, sequential reduction, testing.

PC-GIVE and David Hendry’s Econometric Methodology

Neil R. Ericsson, Julia Campos, and Hong-Anh Tran1

1. Introduction

In economics, it is common for researchers to develop an economic theory, find data which appear to correspond to the associated economic theoretic constructs, estimate the model with that data, and conduct inferences therefrom. While theory, data, and estimation are important issues in empirical modeling, they are not sufficient to ensure that inferences from empirical models are reliable. Several econometric methodologies attempt to address problems arising in this “textbook” approach to modeling: one of those methodologies is associated with David Hendry, its most vocal advocate and contributor.2 Hendry has embodied this methodology in a suite of computer programs called PC-GIVE, making the methodology easily accessible for practical use. This paper describes Hendry’s methodology and its implementation in PC-GIVE, and motivates the methodology and programs via several empirical examples.

Section 2 summarizes Hendry’s econometric methodology, focusing on the status of empirical models; their evaluation, design, and structure; modeling strategies; and estimation and testing.

1 Forthcoming in the journal Revista de Econometria. The first two authors are staff economists in the Division of International Finance, Federal Reserve Board, and at the Banco Central de Venezuela respectively; and the third author is a research assistant in the Division of International Finance, Federal Reserve Board. The views expressed in this paper are solely the responsibility of the authors and should not be interpreted as reflecting those of the Board of Governors of the Federal Reserve System, the Banco Central de Venezuela, or other members of their staffs. Specifically, this paper does not represent an endorsement of any software package by the Board of Governors of the Federal Reserve System, the Banco Central de Venezuela, or members of their staffs. Helpful discussions with and comments from Hali Edison, Dale Henderson, David Hendry, David Howard, Søren Johansen, Deb Lindner, Jaime Marquez, and Pedro Valls Pereira are gratefully acknowledged. We note that Hendry is a professional colleague and personal friend of the authors. In addition, Hendry was a Ph.D. adviser for Campos and Ericsson, and has co-authored papers with them. However, we believe that these associations do not cloud the objective analysis of this paper. All numerical results were obtained using PC-GIVE Version 6.01; cf. Hendry (1989b).

2 The “Hendry methodology” is also sometimes referred to as “British econometrics” or econometrics in the LSE (London School of Economics) tradition, reflecting the nationality and affiliation of major contributors to its formulation; see Gilbert (1989) for an historical perspective. While any list of contributors invariably excludes important participants, several contributors should be noted: James Davidson, Jim Durbin, Rob Engle, Clive Granger, Andrew Harvey, Søren Johansen, Grayham Mizon, Adrian Pagan, Bill Phillips, Peter Phillips, Jean-Francois Richard, Denis Sargan, Pravin Trivedi, Ken Wallis, and Hal White, not all of whom are British or have been associated with the LSE.

Section 3 discusses the structure of Hendry’s two primary empirical programs, PC-FIML and PC-GIVE, and includes a brief history of PC-GIVE.

Section 4 illustrates the major concepts involved with new empirical analyses of data from Hendry’s and our research. First, with Hendry and Ericsson’s (1991b) post-war quarterly data on narrow money demand in the United Kingdom, we examine issues of cointegration, error-correction, general-to-simple modeling, dynamic specification, and sequential reduction. Second, with Friedman and Schwartz’s (1982) phase-average data, we use test statistics as criteria to evaluate Friedman and Schwartz’s model of nominal income in the United Kingdom. Finally, with Campos and Ericsson’s (1988) annual data on consumers’ expenditure in Venezuela, we test for (and find) super exogeneity of prices and incomes in Campos and Ericsson’s conditional error-correction model of consumers’ expenditure.

The Appendix includes a complete chronological bibliography for David Hendry, and categorizes each paper by its focus, whether empirical, Monte Carlo, on computer programs, or on econometric theory. To economize on space, references authored or coauthored by Hendry are not included in the regular list of references.

Sections within this paper extensively cross-reference each other to highlight and clarify the links between methodology, software, and empirical practice. Further, the sections are reasonably modular, so (e.g.) a reader primarily interested in the application of the methodology can skip to Section 4 directly, referring to earlier sections only as needed.

2. Econometric Methodology

In this section, empirical econometric models are seen to be simplifications of the underlying data generation process. Related test statistics serve in both evaluation and design of the model, with the hypotheses being tested corresponding to statistical reductions of the data generation process to the model. This interpretation of models and test statistics leads naturally to a general-to-specific modeling strategy, and helps establish a taxonomy of model classes. The latter ties in closely with the recent literature on cointegration. Model estimation via maximum likelihood follows from the probability approach to models adopted at the outset, and recursive techniques are highly informative in the determination of parameter constancy. Thus, the initial overview immediately below serves to guide and motivate the more formal presentation on the status of models (Section 2.A), the uses of test statistics in practice (Section 2.B), model classes (Section 2.C), modeling strategies (Section 2.D), and estimation (Section 2.E). The final subsection (2.F) summarizes and ties together the various themes.

Modeling is considered as an attempt to characterize data properties in simple parametric relationships which remain reasonably constant over time, account for the findings of pre-existing models, and are interpretable in the light of the subject matter. The data arise from the economic activities of the relevant groups of agents, filtered by a measurement process which itself is often of human design (as with systems of national accounts). The joint effect of activity and measurement is referred to as the data generation process (abbreviated DGP). Existing theories to account for economic phenomena are generally part of a sequence, and are modified and extended by empirical evidence and new insights. The empirical econometric model seeks to characterize the empirical evidence in terms of existing and new insights. It also leaves open the possibility that the existing theory may not be the most useful way of summarizing the empirical evidence, and that may be discovered by testing.

If the DGP were known (as in a Monte Carlo experiment where it is formulated by the investigator), then the population outcome of any combination of prior model specification and estimation method could be deduced analytically. Alternatively expressed, the DGP entails the empirical model. The required analytical derivation would involve transforming and reducing the DGP till the model resulted. Thus, empirical econometric models are implicitly being derived from the economic DGP by sequences of transformations and reductions. Typical transformations include aggregation (over space, time, agents, and goods) as well as the standard mathematical operations of division, logarithms, etc. Typical reductions comprise (i) eliminating unwanted variables (e.g., disaggregated information), usually referred to as marginalizing (from its origins in tabular presentations), and (ii) conditioning the analysis on other variables which are not to be explained (as in regression). Every transformation and reduction applied to the data series entails a corresponding transformation and reduction of the original parameters of the DGP. Thus, an empirical model is associated with a set of transformations and reductions of the DGP, with those transformations and reductions producing the (reduced) parameterization of the econometric model. An important implication of this is shown below: given the observed data and some formal model specification, then the model’s error process is also a derived function (rather than an autonomous innovation). By construction, that error must contain everything in the data which is not explicitly allowed for by the model.
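As a rough illustration of the point that a model's parameters and error are derived from the DGP, the following sketch sets up a small Monte Carlo experiment. The DGP, its coefficients, and the variable names (`q`, `z`, `y`) are purely hypothetical choices for this example and do not come from the paper. The model regresses `y` on `z` alone, thereby marginalizing with respect to `q`; its slope is then a function of the DGP's parameters, and its error contains whatever the model leaves out.

```python
import random


def ols_slope(x, y):
    """Slope from a least-squares regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def covariance(a, b):
    """Sample covariance of two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / n


def derived_error_demo(n=20000, seed=0):
    """Return (model slope, covariance of the model's residual with q)."""
    rng = random.Random(seed)
    # Hypothetical DGP: y depends on z and on q; z and q are correlated.
    q = [rng.gauss(0, 1) for _ in range(n)]
    z = [0.8 * qi + rng.gauss(0, 1) for qi in q]
    y = [1.0 * zi + 0.5 * qi + rng.gauss(0, 1) for zi, qi in zip(z, q)]
    # Empirical model: regress y on z only (q is marginalized out).
    b = ols_slope(z, y)
    my, mz = sum(y) / n, sum(z) / n
    resid = [(yi - my) - b * (zi - mz) for yi, zi in zip(y, z)]
    return b, covariance(resid, q)


if __name__ == "__main__":
    slope, cov_resid_q = derived_error_demo()
    # Population slope implied by the DGP: 1 + 0.5 * 0.8 / (0.8**2 + 1).
    print("model slope:", round(slope, 3))
    print("cov(residual, q):", round(cov_resid_q, 3))
```

Under these assumptions the estimated slope settles near 1 + 0.5(0.8)/1.64 rather than near the DGP coefficient of 1, and the residual remains correlated with the omitted `q`: the error is a by-product of the reduction, not an autonomous shock.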

Statistical tests aim to detect whether or not various reductions are valid, that is, whether or not a reduction entails a loss of information. Equally, models are open to design possibilities whereby undesirable features of either the error process or the parameterization can be eliminated by appropriate re-specification. Often, this is again an implicit rather than an explicit aspect of a modeling strategy, as when residual autocorrelation (manifested perhaps in a low Durbin-Watson statistic) is removed by a Cochrane-Orcutt transformation. In that instance, the design criterion is a Durbin-Watson value of around 2 and the procedure is to generate a new derived error process by extracting from the previous one any components thereof which could be predicted from their own past values. While this example may be optimal in some circumstances, generally it is predicated on a non sequitur, i.e., adopting the alternative against which the null was rejected. Rather obviously, a final Durbin-Watson value of 2 is no evidence that the adopted procedure was sensible since 2 is the stopping criterion. In technical terms, the “insignificance” of the Durbin-Watson statistic does not imply that valid common-factor restrictions have been imposed (see Sargan (1980c) and Hendry and Mizon (1978) for detailed analyses).
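To make the common-factor restriction concrete, the following sketch works through the simplest case: a static regression with first-order autoregressive errors. The symbols $\beta$, $\rho$, $a_1$, $b_0$, $b_1$, $u_t$, and $e_t$ are illustrative choices for this example and are not the paper's notation.

```latex
\begin{align*}
% Static regression with AR(1) errors (the Cochrane-Orcutt setting):
y_t &= \beta z_t + u_t, \qquad u_t = \rho u_{t-1} + e_t . \\
% Substituting u_t = y_t - \beta z_t yields a restricted dynamic model:
y_t &= \rho y_{t-1} + \beta z_t - \rho\beta z_{t-1} + e_t . \\
% The unrestricted dynamic model has three free coefficients:
y_t &= a_1 y_{t-1} + b_0 z_t + b_1 z_{t-1} + e_t , \\
% so the autoregressive-error model imposes the common-factor restriction
b_1 &= -a_1 b_0 .
\end{align*}
```

In this sketch, the autoregressive-error model is a special case of the unrestricted dynamic model. Obtaining white-noise residuals after the transformation shows only that some dynamic model fits, not that $b_1 = -a_1 b_0$ holds in the data; that restriction has to be tested directly.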

The discussion in this section is drawn from joint work with Hendry, especially several versions of a paper analyzing Friedman and Schwartz (1982) (i.e., Hendry and Ericsson (1983, 1985, 1987), published as Hendry and Ericsson (1991a) and Campos, Ericsson, and Hendry (1990)); a contributed paper on house prices (Ericsson and Hendry (1985)); and our own work (Campos (1988) and Campos and Ericsson (1988)). For a general exposition and bibliographic perspective on this methodology, see Hendry and Richard (1982, 1983), Hendry (1983b), Hendry and Wallis (1984a), Spanos (1986), Gilbert (1986, 1989), Hendry (1987a), Phillips (1988), the PC-GIVE manual (Hendry (1989b)), Hendry, Qin, and Favero (1990), and Hendry (1991a). The methodology is heavily based on Haavelmo (1944), as is apparent from Hendry, Spanos, and Ericsson’s (1989) summary of Haavelmo’s (1944) contributions.

While we have attempted to keep our notation as closely in line as possible with that in Hendry’s (and other authors’) papers, some minor changes have been necessary. Also, with such a broad review, some conflicts in notation invariably arise: we point them out where they occur, and the context of usage should avoid ambiguities.3

A. The Status of Empirical Models: Derived, not Autonomous

This subsection formally derives the relationship between the DGP and the empirical economic model. The various steps from one to the other are marginalization, sequential conditioning, distributional assumptions (e.g., normality), linearization, lag truncation, contemporaneous conditioning, and parameter constancy. We consider the effects of each of these in turn.

The DGP. The observed data $(w_1 \ldots w_T)$ are regarded as a realization from an unknown dynamic economic mechanism (the DGP) represented by the joint density function:

(1)  $F_W(w_1 \ldots w_T \mid W_0; \psi), \qquad \psi \in \Psi,$

where $F_W(\cdot \mid \cdot)$ denotes the density function for $\{w_t\}$, $T$ is the number of observations on the variable $w_t$, $W_0$ denotes the initial conditions, $\psi$ is the relevant parameterization (which could include transients and parameters dependent on time $t$), and $\Psi$ is the parameter space of $\psi$. The density $F_W(w_1 \ldots w_T \mid W_0; \psi)$ is a function of great complexity and high dimensionality, summarizing myriads of disparate transactions by economic agents and involving relatively heterogeneous commodities and prices, as well as different locations and time periods. Limitations in data, time, and knowledge preclude estimating the complete mechanism.

Marginalization. An empirical econometric model for a vector of observable variables $\{x_t\}$ can be conceptualized as arising by first transforming $w_t$ so that $x_t$ is a sub-vector of $w_t$, then (implicitly) marginalizing the joint density $F_W(w_1 \ldots w_T \mid W_0; \psi)$ with respect to all variables in $w_t$ other than $x_t$. Letting $w_t = (w_t^{*\prime}, x_t')'$ where the variables $w_t^*$ are not considered in the analysis, that marginalization eliminates $w_t^*$ from $F_W(w_1 \ldots w_T \mid W_0; \psi)$ to produce a reduced density:

(2)  $F_X(x_1 \ldots x_T \mid X_0; \theta) = \int F_W(w_1 \ldots w_T \mid W_0; \psi) \, d\{W_0^* \, w_1^* \ldots w_T^*\}, \qquad \theta \in \Theta,$

where $\theta$ is an induced function of $\psi$, and $\Theta$ is the (induced) parameter space for $\theta$.

3 The primary conflicts in notation are as follows. First, the parameters $\alpha$ and/or $\beta$ appear in standard regression analysis ($y_t = \beta' x_t + e_t$), the taxonomy of autoregressive distributed lag models ($\alpha$ as a constant, $\{\beta_i\}$ as coefficients), and Johansen’s analysis of cointegration ($\alpha$ as the weighting matrix, $\beta'$ as the matrix of cointegration vectors). Second, the variables $y$ and $x$ appear in standard regression analysis, and in a system’s structure (where $x_t' = (y_t' : z_t')$). Third, upper and lower case denote matrices and vectors respectively (e.g., $X_{t-1}$ and $x_t$ in the analysis of systems), but also levels and logarithms (e.g., $z = \ln(Z)$ in the error-correction model, and $m = \ln(M)$ in the analyses of money demand). Finally, a few symbols for economic variables in Section 4 conflict with notation in Sections 2 and 3 (e.g., $Y$, $y$, $p$). For the most part, the context in which each symbol is used should clarify which meaning it has.

Sequential conditioning. Given the sequential nature of economic transactions, the density $F_X(x_1 \ldots x_T \mid X_0; \theta)$ is sequentially factorized. Without loss of generality, each $x_t$ is conditioned on past observables $X_{t-1}$ to yield:

(3)  $F_X(X_T^1 \mid X_0; \theta) = \prod_{t=1}^{T} F_x(x_t \mid X_{t-1}; \lambda_t).$

In a convenient notation, $X_j^i = (x_i \ldots x_j)$ $(T \ge j \ge i \ge 1)$ so that $X_T^1$ is the complete sample of data, and $X_{t-1} = (X_0 \; x_1 \ldots x_{t-1})$ is a subsample that includes the initial conditions $X_0$. The vector $(\lambda_1' \ldots \lambda_T')' = \lambda = f(\theta)$ is the corresponding re-parameterization arising from the sequential factorization.
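As a small worked instance of the sequential factorization in (3), the case $T = 3$ can be written out term by term. This is only an illustration of the general product formula, using the same symbols as (3).

```latex
\begin{align*}
F_X(x_1, x_2, x_3 \mid X_0; \theta)
  &= F_x(x_1 \mid X_0; \lambda_1)\,
     F_x(x_2 \mid X_0, x_1; \lambda_2)\,
     F_x(x_3 \mid X_0, x_1, x_2; \lambda_3) \\
  &= \prod_{t=1}^{3} F_x(x_t \mid X_{t-1}; \lambda_t),
  \qquad X_{t-1} = (X_0 \; x_1 \ldots x_{t-1}).
\end{align*}
```

Each factor conditions only on information dated before $t$. This is why the factorization is without loss of generality: it simply applies the definition of a conditional density repeatedly.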

Distributional properties, functional form, and linearity. The distribution $F_x(x_t \mid X_{t-1}; \lambda_t)$ in (3) is related to the functional form of the moments of $x_t$ in terms of $X_{t-1}$, with transformations of $x_t$ to achieve a certain distribution for $F_x(\cdot \mid \cdot)$ directly affecting the functional form, and vice versa. Although the choice of functional form is of considerable importance, it depends intimately on the nature of the problem. Thus, for this general analysis, we assume that the time series $x_t$ has been appropriately transformed to make the assumption of a linear conditional expectation from a normal distribution reasonable, so $x_t$ may involve (e.g.) logarithms and ratios of the original variables. Under the assumption of linear conditional normality, $F_x(x_t \mid X_{t-1}; \lambda_t)$ in (3) can be rewritten as:

(4)  $x_t \mid X_{t-1} \sim N(\mu_t, \Omega_t),$

where the conditional mean and variance of $x_t$ are $\mu_t$ and $\Omega_t$ respectively. The parameter $\lambda_t$ in (3) comprises the non-redundant elements of $\mu_t$ and $\Omega_t$. Defining $\varepsilon_t$ as $x_t - \mu_t$, then $\{\varepsilon_t\}$ is a sequence of martingale differences, from which it follows that $\varepsilon_t$ is an innovation with respect to $X_{t-1}$, and hence is white noise. However, $\varepsilon_t$ need not be an innovation with respect to information outside $X_{t-1}$ (e.g., $W^*_{t-1}$): this is important for testing; see Section 2.B.
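The distinction between being an innovation with respect to a series' own past and with respect to wider information can be sketched with a small simulation. The DGP below, its coefficients, and the outside variable `q` are hypothetical and chosen only for illustration. Because `q` is serially independent, the conditional mean of `x` given its own past is 0.7 times its lag, so the derived error is white noise relative to the past of `x` yet predictable from lagged `q`.

```python
import random


def corr(a, b):
    """Sample correlation of two equal-length series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5


def innovation_demo(n=20000, seed=1):
    """Return correlations of the derived error with its own lag,
    with lagged x, and with lagged q (information outside the model)."""
    rng = random.Random(seed)
    q = [rng.gauss(0, 1) for _ in range(n)]
    x = [0.0]
    eps = []
    for t in range(1, n):
        # Hypothetical DGP: x depends on its own lag and on lagged q.
        x_t = 0.7 * x[t - 1] + 0.5 * q[t - 1] + rng.gauss(0, 1)
        # Error relative to the conditional mean given own past only.
        eps.append(x_t - 0.7 * x[t - 1])
        x.append(x_t)
    own_lag = corr(eps[1:], eps[:-1])
    lagged_x = corr(eps, x[:-1])
    lagged_q = corr(eps, q[:-1])
    return own_lag, lagged_x, lagged_q


if __name__ == "__main__":
    own_lag, lagged_x, lagged_q = innovation_demo()
    print("corr(eps_t, eps_{t-1}):", round(own_lag, 3))
    print("corr(eps_t, x_{t-1}):  ", round(lagged_x, 3))
    print("corr(eps_t, q_{t-1}):  ", round(lagged_q, 3))
```

In this setup the first two correlations are close to zero while the third is clearly positive, which is the sense in which an error can pass tests against its own information set and still fail against a rival's data.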

The assumption of conditional linearity entails that $\mu_t$ is linear in $X_{t-1}$. This assumption is easily relaxed to allow for nonlinear models, but doing so poses severe problems for empirical modeling unless the exact nonlinear functions and dynamics are well characterized a priori by theory. The assumption of conditional normality also is in part for convenience. However, non-normality has implications for inference, both asymptotically and in finite samples; and mis-specified non-normality for nonlinear models can imply inconsistency of some estimators; cf. Amemiya (1977) and Phillips (1982).

Lag truncation. For an empirically testable model to be formulated, it must be assumed that $X_{t-1}$ in (4) can be adequately approximated by $X_{t-1}^{t-\ell}$ (where $\ell$ is the fixed longest lag considered) without invalidating the innovation properties of the $\{\varepsilon_t\}$. Thus, we postulate:

(5)  $x_t = \sum_{i=1}^{\ell} \pi_i x_{t-i} + \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{IN}(0, \Omega_t), \qquad t = 1, \ldots, T,$

where $\mu_t$ is $\sum_{i=1}^{\ell} \pi_i x_{t-i}$. In practice, (5) may include a constant and dummies as well. Equation (5) is a vector autoregression (or VAR), and typically is the most unrestricted set of equations estimated in a modeling exercise, the comments above on reduction notwithstanding. Equation (5) is sometimes referred to as the unrestricted reduced form (Hendry (1976)), the observable system (as opposed to the model; Hendry (1989b, p. 64)), or the statistical model (Spanos (1986, p. 218)). Estimation of (5) provides baseline innovation standard errors for the $x_{it}$. Additionally, (5) is the basis for cointegration analysis; see Section 2.C for the background theory and Section 4.A for an empirical example. The length $\ell$ depends upon the economic problem being investigated and the frequency of observation, so it is difficult to specify $\ell$ a priori. Sargan (1980a, pp. 116-121) and Hendry, Pagan, and Sargan (1984) discuss the selection of $\ell$ in greater detail.4

Contemporaneous conditioning and exogeneity. If (5) is not rejected at the outset, one can turn to modeling $y_t$. There are many basic approaches to doing so, including causal chains, simultaneous systems, block recursive models, and various simplifications based directly on (5). An important potential problem with a VAR such as (5) being a derived representation is that not all of the $\{\pi_i\}$ may be constant; yet functions of them could be. This arises when a subset of interdependent behavioral equations are nonconstant so that interrelated reduced forms become nonconstant also. One approach to isolating the invariants is to partition $x_t$ into $(y_t', z_t')'$ and explain $y_t$ conditional on $z_t$, corresponding to the factorization:

(6a)  $F_x(x_t \mid X_{t-1}; \lambda_t) = F_{y|z}(y_t \mid z_t, X_{t-1}; \lambda_{1t}) \cdot F_z(z_t \mid X_{t-1}; \lambda_{2t}),$

where the induced parameterization is $(\lambda_{1t}' : \lambda_{2t}') = g(\lambda_t)'$. For ease of exposition, we employ the representation in (3) rather than (5): also, this choice is permitted by several equivalent reduction paths from the DGP to the empirical model. Equation (6a) factorizes the density of $x_t$ into the conditional density of $y_t$ given $z_t$ and the marginal density of $z_t$: this is without loss of generality. Equally, there is an alternative factorization with $z_t$ conditional on $y_t$:

(6b)  $F_x(x_t \mid X_{t-1}; \lambda_t) = F_{z|y}(z_t \mid y_t, X_{t-1}; \phi_{1t}) \cdot F_y(y_t \mid X_{t-1}; \phi_{2t}),$

where the parameterization is $(\phi_{1t}' : \phi_{2t}') = g^*(\lambda_t)'$, and so there is a one-to-one mapping between $(\lambda_{1t}' : \lambda_{2t}')$ and $(\phi_{1t}' : \phi_{2t}')$. While (6b) is not of interest for the most part, given our (assumed) interest in explaining $y_t$ conditional upon $z_t$, this unique mapping will be important in discussions of super exogeneity and the inversion of conditional equations below.

4 Also, from Monte Carlo and analytical results, Sargan (1980c, p. 880) suggests in a closely related problem that: “ ... the sample size should be greater than 25 times the number of variables [in a given equation, including lags] ... if the latent roots of the system are not too close to the unit circle.” For a given number of variables in $x_t$ and a given sample size $T$, that rule of thumb implies a maximum practical $\ell$.

To reiterate, the factorization (6a) is without loss of generality. By contrast, ignoring the marginal density for $z_t$ in (6a) and modeling the conditional density $F_{y|z}(y_t \mid z_t, X_{t-1}; \lambda_{1t})$ alone is with loss of generality. Doing so entails a corresponding loss of information except under certain conditions, which depend upon the purpose(s) of the modeling exercise. The nature of these conditions leads naturally (and necessarily) to a discussion of exogeneity.

Engle, Hendry, and Richard (1983) discuss four distinct concepts of exogeneity, namely weak, strong, super, and strict. These concepts correspond to different notions of being “determined outside the model under consideration” according to the purposes of the inferences being conducted, i.e., conditional inference, prediction, policy analysis, and forecasting, respectively.

The essential concept is weak exogeneity, which requires that $\lambda_{1t}$ and $\lambda_{2t}$ from the factorization in (6a) are variation-free, and that all the parameters of interest can be obtained from $\lambda_{1t}$ alone.5 If $z_t$ is weakly exogenous for $\lambda_{1t}$ (and so for the parameters of interest), $\lambda_{2t}$ provides no additional information on $\lambda_{1t}$, so only the conditional model $F_{y|z}(y_t \mid z_t, X_{t-1}; \lambda_{1t})$ needs to be analyzed for conditional inference (estimation and testing). That greatly simplifies modeling if there are many variables in $z_t$. Often, and relatedly, the factorization (6a) aims inter alia to isolate the nonconstancies of the full vector of parameters $\lambda_t$ as the sub-vector $\lambda_{2t}$ (the parameters of the density of the non-modeled variables). Section 4.D illustrates.

Valid prediction of $y_t$ from its conditional model requires more than weak exogeneity. With weak exogeneity alone, lagged $y_t$ still could influence $z_t$ via $X_{t-1}$ in the marginal model for $z_t$, in which case $z_t$ in the conditional model $F_{y|z}(y_t \mid z_t, X_{t-1}; \lambda_{1t})$ could not be treated as “fixed” for prediction. The requisite additional restriction is that $y_t$ does not Granger-cause $z_t$. Granger non-causality plus weak exogeneity generate the concept strong exogeneity. Unlike weak exogeneity, Granger causality by itself involves no assumptions about parameters of interest.
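One informal way to see what Granger non-causality requires is to estimate the marginal model for the conditioning variable and inspect the coefficient on the lagged dependent variable. The sketch below does this for a hypothetical bivariate system; the coefficients, the feedback parameter `gamma`, and the helper names are assumptions made for this example. A formal test would use a t- or F-statistic, whereas the sketch only compares point estimates.

```python
import random


def ols2(x1, x2, y):
    """OLS of y on two regressors without an intercept
    (adequate here because the simulated data have zero mean)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def feedback_coefficient(gamma, n=20000, seed=2):
    """Simulate the system and return the estimated coefficient on
    lagged y in the marginal equation for z."""
    rng = random.Random(seed)
    y_prev, z_prev = 0.0, 0.0
    z_lag, y_lag, z_now = [], [], []
    for _ in range(n):
        # Marginal model for z: gamma != 0 means y Granger-causes z.
        z = 0.6 * z_prev + gamma * y_prev + rng.gauss(0, 1)
        # Conditional model for y given z and the past.
        y = 0.5 * z + 0.3 * y_prev + rng.gauss(0, 1)
        z_lag.append(z_prev)
        y_lag.append(y_prev)
        z_now.append(z)
        y_prev, z_prev = y, z
    _, gamma_hat = ols2(z_lag, y_lag, z_now)
    return gamma_hat


if __name__ == "__main__":
    print("no feedback:  ", round(feedback_coefficient(0.0), 3))
    print("with feedback:", round(feedback_coefficient(0.4), 3))
```

With `gamma = 0` the estimated feedback coefficient is near zero, so `z` could be treated as given when predicting `y` several steps ahead; with `gamma = 0.4` it is not, even though the conditional equation for `y` is the same in both cases.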

Policy analysis (or counter-factual analysis) often involves changing the marginal process for $z_t$. For analysis of the conditional model to be valid under such changes, we require that the parameters $\lambda_{1t}$ be invariant to those changes (or class of interventions). The relevant concept is super exogeneity, whereby $z_t$ is weakly exogenous for the parameters of interest and $\lambda_{1t}$ is invariant to the class of interventions to $\lambda_{2t}$ under consideration. Section 4.D describes and implements tests for super exogeneity in a model of consumers’ expenditure.

5 The concept variation-free has the following meaning. Let $\Lambda_{1t}$ and $\Lambda_{2t}$ denote the spaces over which the parameters $\lambda_{1t}$ and $\lambda_{2t}$ range, i.e., $\lambda_{1t} \in \Lambda_{1t}$ and $\lambda_{2t} \in \Lambda_{2t}$. Then $\lambda_{1t}$ and $\lambda_{2t}$ are variation-free if the parameter space $\Lambda_{1t}$ is not a function of the parameter $\lambda_{2t}$, and the parameter space $\Lambda_{2t}$ is not a function of the parameter $\lambda_{1t}$. Being variation-free implies that $\lambda_{1t}$ and $\lambda_{2t}$ come from a product space, i.e., $(\lambda_{1t}, \lambda_{2t}) \in \Lambda_{1t} \times \Lambda_{2t}$. Expressed slightly differently, knowledge about the value of one parameter provides no information on the other parameter’s range of potential values.

A more stringent concept, invariance, is introduced in the context of super exogeneity (below). The parameter $\lambda_{1t}$ is invariant to a class of interventions (i.e., changes) in $\lambda_{2t}$ if $\lambda_{1t}$ is not a function of $\lambda_{2t}$ for that class of interventions. For invariance, lack of dependence between the parameters themselves matters, and not just lack of dependence between parameters and parameter spaces.

Super exogeneity has several important implications. First, the empirical presence of super exogeneity refutes the Lucas (1976) critique; see Hendry (1988c), Engle and Hendry (1989), Ericsson and Hendry (1989), and Favero and Hendry (1989). Suppose the conditional and marginal models represent agents’ and policy maker’s decision rules respectively. Then the agents’ parameter vector $\lambda_1$ is invariant to changes in the marginal process for $z_t$ (changes in policy maker’s rules), which is opposite to the implication of the Lucas critique. Second, if the factorization (6a) has constant conditional parameters $\lambda_{1t} = \lambda_1$ but the marginal parameters $\lambda_{2t}$ are changing over time, the reverse factorization (6b) entails that both $\phi_{1t}$ and $\phi_{2t}$ are nonconstant over time. That follows because each is a function of both $\lambda_1$ (constant) and $\lambda_{2t}$ (nonconstant). While (6b) is peculiar at first glance, it is precisely what occurs when (e.g.) estimated money-demand functions are “inverted” to obtain prices as a function of money (common among macro-economists) or to obtain interest rates as a function of money (common among macro-modelers). The resulting nonconstancy of the equation for $z_t$ conditional on $y_t$ can be demonstrated both analytically and empirically; cf. Hendry (1985) and Hendry and Ericsson (1991a, 1991b). Third, super exogeneity can identify parameters, in the sense of uniqueness. “[T]hat [constant conditional] relationship cannot be confounded with any shifting [marginal] relations” (Hendry (1987a, p. 40)) because any (nontrivial) combination of the conditional and marginal equations would be nonconstant. Finally, policy analysis does not require Granger non-causality on the part of $y_t$, and contrasts with a common approach to exogeneity. With super exogeneity, lagged $y_t$ may still influence current $z_t$ in the marginal equation for $z_t$. In other words, Granger non-causality is not the relevant concept of exogeneity for policy analysis.
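The second implication, that inverting a constant conditional equation yields a nonconstant one when the marginal process shifts, can be illustrated with a simple simulation. The setup is hypothetical: a static conditional model with slope 0.5 and unit error variance, and a marginal process for `z` whose standard deviation changes from 1 to 2 between two regimes. None of these values comes from the paper.

```python
import random


def slope(x, y):
    """Slope from a least-squares regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def regime_slopes(z_sd, beta=0.5, n=20000, seed=3):
    """Return (slope of y on z, slope of z on y) for one regime of the
    marginal process, whose only changing parameter is z_sd."""
    rng = random.Random(seed)
    z = [rng.gauss(0, z_sd) for _ in range(n)]
    # Conditional model with constant parameters across regimes.
    y = [beta * zi + rng.gauss(0, 1) for zi in z]
    return slope(z, y), slope(y, z)


if __name__ == "__main__":
    for z_sd in (1.0, 2.0):
        forward, inverted = regime_slopes(z_sd)
        print("sd(z) =", z_sd,
              "| y on z:", round(forward, 3),
              "| z on y:", round(inverted, 3))
```

Under these assumptions the population slope of the inverted regression is $\beta\sigma_z^2 / (\beta^2\sigma_z^2 + \sigma_\nu^2)$, which moves from 0.4 to 1.0 as the standard deviation of `z` doubles, while the slope of `y` on `z` stays near 0.5 in both regimes.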

These concepts of exogeneity are discussed more fully in Engle, Hendry, and Richard (1983) and Florens and Mouchart (1985a, 1985b), and build on the work of Koopmans (1950) and Barndorff-Nielsen (1978). Strict exogeneity is not of immediate interest in the present context because it is defined without reference to parameters of interest.

Before returning to the reduction process, two comments are in order about exogeneity. First, in no case is it legitimate to “make variables exogenous” simply by not modeling them. Whether a variable is exogenous or not depends upon the variables considered (and excluded) and the purpose of the analysis. Second, the “causality” of one variable $y_t$ for another $z_t$ has no necessary connection with their respective exogeneity status in that $z_t$ being “exogenous” or endogenous is neither necessary nor sufficient for it to influence $y_t$ (except for the trivial case that by definition strictly exogenous variables cannot be Granger-caused by endogenous variables). For a useful discussion of the concept of causality in econometrics, see Zellner (1979).

Parameter constancy. In the factorization (6a), we have allowed $\lambda_{1t}$ and $\lambda_{2t}$ to be time-dependent. To estimate either or both of these parameters and conduct inferences about them, some assumption must be made about that dependence. Commonly, we assume no time dependence, i.e., $\lambda_{it} = \lambda_i$ for all $t$. Whether or not this specification is reasonable depends upon the data transformations selected to obtain (4) with conditional linear normality. Relatedly, randomly varying coefficients models and the like do not introduce any greater generality here, noting that these models have “meta-parameters” which themselves are assumed constant (and which could be interpreted as $\psi$, say). Because of this assumption of constancy, and in order to simplify notation, parameters below will not be subscripted by $t$ unless explicitly required.

Distributional properties, functional form, and lag truncation revisited. Given the assumptions above on distributional properties, functional form, linearity, lag length, and constancy, the conditional model $F_{y|z}(y_t \mid z_t, X_{t-1}; \lambda_1)$ can be written as the sub-system:

(7)  $y_t = B_0 z_t + \sum_{i=1}^{\ell} B_i x_{t-i} + \nu_{1t}, \qquad \nu_{1t} \sim \mathrm{IN}(0, \Sigma_{11}), \qquad t = 1, \ldots, T.$

Here and below, matrices and vectors have been partitioned conformably with $x_t' = (y_t' : z_t')$. From properties of the normal distribution, it follows that $B_0 = \Omega_{12}\Omega_{22}^{-1}$, $B_i = \pi_{1i} - \Omega_{12}\Omega_{22}^{-1}\pi_{2i}$, $\nu_{1t} = \varepsilon_{1t} - \Omega_{12}\Omega_{22}^{-1}\varepsilon_{2t}$, and (so) $\Sigma_{11} = \Omega_{11} - \Omega_{12}\Omega_{22}^{-1}\Omega_{21}$; see Engle, Hendry, and Richard (1983, p. 297). Even so, there is no immediate determination as to whether the parameters of interest come from the parameters in (5) or from only those in (7). If the conditional model (7) is viewed as coming directly from (6a) without the distributional and other assumptions, then those assumptions are required at this stage to derive the empirically estimable conditional model, which is (7).
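The mapping from the VAR parameters in (5) to the conditional-model parameters in (7) can be sketched for the simplest case of scalar $y_t$ and scalar $z_t$, where $\Omega_{22}^{-1}$ reduces to division by a number. The function name, argument names, and numerical values below are illustrative assumptions and not part of PC-GIVE.

```python
def conditional_parameters(omega11, omega12, omega22, pi1, pi2):
    """Map VAR parameters to conditional-model parameters when y_t and
    z_t are both scalars.

    omega11, omega12, omega22: elements of the innovation covariance.
    pi1, pi2: lag coefficients in the y and z equations of the VAR,
              listed in the same order (e.g., on y_{t-1}, z_{t-1}).
    """
    b0 = omega12 / omega22                              # B_0
    b_lags = [p1 - b0 * p2 for p1, p2 in zip(pi1, pi2)]  # B_i
    sigma11 = omega11 - omega12 ** 2 / omega22           # Sigma_11
    return b0, b_lags, sigma11


if __name__ == "__main__":
    # Hypothetical VAR(1) values chosen only to exercise the formulas.
    b0, b_lags, sigma11 = conditional_parameters(
        omega11=2.0, omega12=0.6, omega22=1.5,
        pi1=[0.5, 0.1], pi2=[0.2, 0.3],
    )
    print("B_0:", b0)
    print("B_1:", b_lags)
    print("Sigma_11:", sigma11)
```

The sketch makes visible that every conditional parameter mixes parameters of the $y_t$ and $z_t$ equations with the covariance terms, so a change in the marginal process can alter the conditional model unless the combinations happen to stay fixed.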

Equation (7) is commonly referred to as an autoregressive distributed lag (AD). It is autoregressive because lagged $y_t$ enter via $x_{t-i}$, and it is a distributed lag because both current and lagged $z_t$ appear. Section 2.C below discusses the properties of AD models in greater detail.

In conditional analysis, (7) provides the general maintained model for $y_t$, while noting its relationship to the VAR (5) and given the same caveats that apply to the VAR. Specifically, the parameters $(B_0, \{B_i\}, \Sigma_{11})$ and the disturbance $\nu_{1t}$ are derived (and derivable) from the DGP, are not autonomous, and may depend upon time even though we have dropped the time subscript (for ease of reading). Further, numerous a priori assumptions have to be made in order to formalize (7), the usefulness of which will depend upon those assumptions. Many of those assumptions entail restrictions on the observables and hence have testable implications, but need not be valid empirically. Consequently, we now consider how to evaluate empirical econometric models.

B. Evaluation and Design of Empirical Models

Statistical inference in multivariate time-series processes is a hazardous and contentious issue. From the time of Hooker (1901) and especially Yule (1926) onwards, the enormous difficulties inherent in conducting valid inference in such processes have gradually become documented. Presently, econometricians are more aware of the pitfalls in analyzing economic time series than of methods which ensure, with any reasonable likelihood, that sensible and sustainable conclusions can be reached. An empirical “conclusion” is deemed sustainable only if it satisfies a range of criteria discussed in detail below. Most of those criteria are well-known and widely accepted, we consider all of them to be justifiable, and we contend that satisfying such criteria constitutes a minimal necessary condition for judging an empirical model to be credible.

The main criteria in question relate to goodness-of-fit, absence of residual autocorrelation and heteroscedasticity, valid exogeneity, predictive ability, parameter constancy, the statistical and economic interpretation of estimated coefficients, and the validity of a priori restrictions. Rather than discuss each of those issues in an ad hoc manner, below we adopt the taxonomy in Hendry and Richard (1982) in which design criteria are related to particular types of information available to the modeler. Further, we consider the relationship between these types of information and the reductions from the DGP entailed by a specified empirical model, and we motivate how the associated tests can be employed in both model evaluation and model design.

A taxonomy. As formulated, (7) entails restrictions relative to four distinct sources of information, which are summarized as:

(A) the data of one’s own model, conveniently partitioned into:
    (A1) the relative past of the $\{x_t\}$ process, denoted by $X_{t-1}$ (namely, only $X_{t-\ell}^{t-1}$ is relevant if (7) is valid);
    (A2) the relative present of $x_t$ (namely, it is valid to condition $y_t$ on $z_t$);
    (A3) the relative future of the $\{x_t\}$ process (namely, the parameters remain constant on $X_{t+1}^{T}$);
(B) the structure of the measurement system (e.g., definitional constraints must not be violated);
(C) the subject-matter theory (so that (7) is consistent with the available theory); and
(D) alternative models’ data, denoted $\{\xi_t\}$ (which should contain no additional information relevant to explaining $\{y_t\}$), which may be partitioned as in (A):
    (D1) the relative past of the $\{\xi_t\}$ process (namely, $\{\xi_1, \ldots, \xi_{t-1}\}$ is irrelevant in explaining $y_t$, conditional upon $X_t$);
    (D2) the relative present of $\xi_t$ (namely, it is valid to condition $y_t$ on $(z_t, \xi_t)$);
    (D3) the relative future of the $\{\xi_t\}$ process (namely, the forecast errors of the conditional model are innovations with respect to the information available in $\{\xi_t\}$ at the time of forecast).

That the past is immutable, the present is occurring, and the future is uncertain are among the basic tenets of economics, so it is unsurprising that procedures for model evaluation should focus separately on those various subsets, as they do in (A1)-(A3) and (D1)-(D3).

Model evaluation. Corresponding to that two-tier partition of the information, we have the following eight evaluation criteria: innovation errors, weak exogeneity, parameter constancy, data admissibility, theory consistency, parameter encompassing, exogeneity encompassing, and forecast-model encompassing. In statistical terms, each criterion yields a testable null hypothesis (subject to identification requirements), and above in (A)-(D) those criteria are stated in terms of their corresponding null hypotheses. Even so, departures from each null could take many forms. Table 1 lists the main test statistics reported herein. Most of these tests are applied in the Lagrange Multiplier spirit, with all the data used in the coefficient estimates quoted. For instance, the Chow statistic acts as a post-estimation diagnostic for predictive failure over a designated final subset of observations. Joint tests could be constructed, as in Jarque and Bera (1980); and, as Kiviet and Phillips (1986) also note, many of the statistics are asymptotically independent so that their $\chi^2$ forms could be added together to construct a “portmanteau” mis-specification statistic. Clearly, care should be taken to control for Type I errors over the set of tests.

Table 1. Evaluation/design Criteria

| Information Set | Null Hypothesis | Alternative Hypothesis | Sources |
|---|---|---|---|
| (A) own model's data | | | |
| (A1) relative past | innovation errors | first-order residual autocorrelation | Durbin and Watson (1950, 1951) |
| | | $j$th-order residual autocorrelation | Box and Pierce (1970); Godfrey (1978), Harvey (1981, p. 173) |
| | | invalid parameter restrictions | Johnston (1963, p. 126) |
| | | $j$th-order ARCH | Engle (1982) |
| | | heteroscedasticity quadratic in regressors | White (1980, p. 825), Nicholls and Pagan (1983) |
| | | $j$th-order RESET | Ramsey (1969) |
| | normality of the errors | skewness (SK) and excess kurtosis (EK) | Jarque and Bera (1980) |
| (A2) relative present | weakly exogenous regressors | invalid conditioning | Sargan (1958, 1980b), Engle, Hendry, and Richard (1983) |
| (A3) relative future | constant parameters, adequate forecasts | parameter nonconstancy, predictive failure | Fisher (1922), Chow (1960), Brown, Durbin, and Evans (1975), Hendry (1979b) |
| (B) measurement system | data admissibility | "impossible" predictions of observables | |
| (C) subject-matter theory | theory consistency; cointegration | "implausible" coefficients, predictions; no cointegration | Engle and Granger (1987) |
| (D) alternative models’ data | | | |
| (D1) relative past | variance dominance | relative poor fit | Hendry and Richard (1982) |
| | variance encompassing | inexplicable observed error variance | Cox (1961, 1962), Pesaran (1974), Hendry (1983a) |
| | parameter encompassing | significant additional variables | Johnston (1963, p. 126), Mizon and Richard (1986) |
| (D2) relative present | exogeneity encompassing | inexplicable conditioning properties | Hendry (1988c) |
| (D3) relative future | MSFE dominance | relative poor forecasts | Granger (1989, pp. 186-187) |
| | forecast encompassing | informative forecasts from alternative model | Chong and Hendry (1986) |
| | forecast-model encompassing | regressors from alternative model valuable for forecasting | Ericsson (1989) |
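As one concrete instance of the criteria in Table 1, the normality statistic of Jarque and Bera (1980) can be written in its usual textbook form as $T[SK^2/6 + EK^2/24]$, i.e., the sum of two asymptotically independent $\chi^2(1)$ components, which illustrates how $\chi^2$ forms can be added into a joint statistic. The sketch below is an illustration under that assumption; the function name `jarque_bera` and the simulated residuals are hypothetical, and the code is not taken from PC-GIVE.

```python
import numpy as np

def jarque_bera(resid):
    """Return (SK, EK, JB) for a vector of residuals."""
    e = np.asarray(resid, dtype=float)
    e = e - e.mean()
    t = e.size
    s2 = np.mean(e**2)
    sk = np.mean(e**3) / s2**1.5           # skewness
    ek = np.mean(e**4) / s2**2 - 3.0       # excess kurtosis
    jb = t * (sk**2 / 6.0 + ek**2 / 24.0)  # sum of two chi-square(1) components
    return sk, ek, jb

rng = np.random.default_rng(0)
normal_resid = rng.standard_normal(500)    # satisfies the null
skewed_resid = rng.exponential(size=500)   # strongly skewed, violates the null

jb_normal = jarque_bera(normal_resid)[2]
jb_skewed = jarque_bera(skewed_resid)[2]
print(jb_normal, jb_skewed)
```

Under the null of normality the statistic is asymptotically $\chi^2(2)$, so the skewed sample should produce a value far above conventional critical values.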

Statistical criteria and reductions. Each criterion in Table 1 matches some reduction in deriving the empirical model (7) from the DGP in (1). Conversely, each reduction corresponds to one or more criteria, depending upon the particular choice of information set being excluded. The following discussion of reductions should clarify these correspondences.

First, the usefulness of the marginalized process (2) depends on the actual irrelevance of the marginalized variables $\{w_t\}$, and on the suitability of the parameterization $\theta$. As the choice of $\{w_t\}$ (and so $\{x_t\}$) varies, so will $\theta$ and $\{\varepsilon_t\}$. Exclusion of an alternative model’s data (D) from analysis is this sort of reduction, with invalid exclusion characterized by the standard textbook analysis of omitted variables; cf. Johnston (1972, pp. 168-169). Also, and relatedly, marginalization could either lose or deliver constancy in $\theta$ (and so $\lambda$), irrespective of the constancy of the parameters of the DGP (1). Indeed, since a primary objective of most modeling exercises is obtaining a set of constant parameters, this consideration is a major influence on the choice of reductions and hence of parameterizations and model specifications to be adopted.

Second, sequential factorization is not a reduction per se, but is without loss of generality: the mapping from $\theta$ to $\lambda$ in (3) is one-to-one. However, the re-parameterization generating $\lambda$ may be economically important. For instance, all elements in $\theta$ might be nonconstant, but the re-parameterization could isolate nonconstancy to a very few elements of $\lambda$, leaving the remaining elements constant.

Third, the assumed distributional form of $F_X(X_T^1 \mid X_0; \theta)$ and (relatedly) the transformations applied to $x_t$ may be inappropriate. If so, the assumed conditional normality in (4) ignores information on higher moments of the data. Jarque and Bera’s (1980) test in (A1) aims to detect this loss of information. Further, an improper distributional assumption may violate data admissibility in (B). For example, a linear model of the level of prices with normal disturbances would predict negative prices with positive probability.
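The data-admissibility point about price levels can be made concrete with a one-line calculation. The following sketch assumes a hypothetical price level that is normally distributed with mean 2 and standard deviation 1; those numbers and the function name `prob_negative` are illustrative assumptions only.

```python
import math

def prob_negative(mean, sd):
    """P(X < 0) for X ~ N(mean, sd^2), using the standard normal CDF."""
    return 0.5 * (1.0 + math.erf((0.0 - mean) / (sd * math.sqrt(2.0))))

# Hypothetical linear model of the price level with normal disturbances.
p_level = prob_negative(mean=2.0, sd=1.0)

# By contrast, a model of the log of prices implies a price exp(.) that is
# positive for any finite draw of the disturbance.
price_from_log_model = math.exp(-5.0)

print(p_level, price_from_log_model)
```

However small, the probability of a negative price in the levels model is strictly positive, whereas modeling the logarithm of prices respects the measurement system by construction.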

Fourth, if the lag length $\ell$ is chosen too small, $\varepsilon_t$ will not be an innovation, nor need it be white noise. Several of the tests in (A1) aim at detecting precisely these phenomena.

Fifth, linearity is testable as zero restrictions on nonlinear terms, either generally (as in some of White’s tests) or for specific nonlinearity (as with RESET). Both correspond to the information set (A1).

Sixth, the validity of conditioning (A2) is testable, albeit often indirectly via showing super exogeneity. For the latter, both tests of constancy (next) and direct tests of invariance are helpful; cf. Hendry (1988c) and Engle and Hendry (1989). How $x_t$ is partitioned into $(y_t', z_t')$ clearly depends upon the relevant subject matter. Even so, data can provide some guidance on partitioning of $x_t$ via the empirical constancy or otherwise of $\lambda_1$. For instance, Hendry and Ericsson (1991a, 1991b) find a constant model of money conditional on prices, but show that a model of prices conditional on money is nonconstant.

Seventh, the assumption that $(B_0, \{B_i\}, \Sigma_{11})$ ($= \lambda_1$) in (7) is constant involves an implicit marginalization with respect to subsets of data. Specifically, given knowledge of $(\lambda_{1\tau}, \tau = 1, \ldots, t)$, the remaining parameters $(\lambda_{1\tau}, \tau = t+1, \ldots, T)$ are redundant. If not, predictive failure should be observable in subsamples.
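Predictive failure over a final subsample can be illustrated with the forecast form of the Chow (1960) statistic, $[(RSS_T - RSS_{T_1})/n_2]/[RSS_{T_1}/(T_1 - k)]$, which is distributed as $F(n_2, T_1 - k)$ under the null of constant parameters. The sketch below is a minimal illustration; the data-generating values, the break in the slope, and the function names are hypothetical.

```python
import numpy as np

def ols_rss(y, x):
    """Residual sum of squares from an OLS regression of y on x."""
    beta = np.linalg.lstsq(x, y, rcond=None)[0]
    resid = y - x @ beta
    return float(resid @ resid)

def chow_predictive(y, x, n_forecast):
    """Chow statistic for predictive failure over the last n_forecast observations."""
    t, k = x.shape
    t1 = t - n_forecast
    rss_full = ols_rss(y, x)
    rss_first = ols_rss(y[:t1], x[:t1])
    return ((rss_full - rss_first) / n_forecast) / (rss_first / (t1 - k))

rng = np.random.default_rng(1)
t, n2 = 120, 20
z = rng.standard_normal(t)
x = np.column_stack([np.ones(t), z])
e = 0.5 * rng.standard_normal(t)

y_const = 1.0 + 2.0 * z + e                          # constant parameters
slope = np.where(np.arange(t) < t - n2, 2.0, 0.5)    # slope shifts in last n2 obs
y_break = 1.0 + slope * z + e

f_const = chow_predictive(y_const, x, n2)
f_break = chow_predictive(y_break, x, n2)
print(f_const, f_break)
```

When the slope shifts in the final observations, the full-sample residual sum of squares rises sharply relative to that of the first subsample, and the statistic becomes large.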

Finally, certain values of coefficients may be economically “implausible”, in which case parametric restrictions (and so reductions) provide evidence on economic theory (C).

Empirically, Section 4.A tests for theory consistency via tests of cointegration. Sections 4.B and 4.C test for the validity of various marginalizations, assumed distributional form, lag length, linearity, parameter constancy, and economic plausibility (in the sense of coefficients with proper sign and magnitude). In Section 4.D, we test for the validity of conditioning via testing for super exogeneity. In Sections 4.A, 4.C, and 4.D, tests help evaluate the model at hand, whereas in Section 4.B, tests serve primarily to design a more parsimonious model, starting with an unrestricted AD model.

An example. The standard linear model estimated by least squares well illustrates the role of reduction in empirical modeling. At the outset, it is important to distinguish between the economic theory-model and the empirical model that the former serves to interpret. A theory model is freely created by the human imagination, e.g.,

(8) $\mathtt{y}_t = b'\mathtt{x}_t$,

where $b = \partial \mathtt{y}_t / \partial \mathtt{x}_t$, and $\mathtt{y}_t$ and $\mathtt{x}_t$ (in typed font) denote the economic theoretic variables. (In (8) and in (9) below, we have used a standard notation, but one in conflict with that used for the discussion of reduction above.)

A corresponding empirical model is anything but freely created. Rather, as the theory of reduction has shown, the properties of the empirical model are determined by the DGP. To demonstrate, consider the empirical model:

(9) $y_t = \beta' x_t + e_t$, $t = 1, \ldots, T$,

where $y_t$ and $x_t$ (in italic) are observed economic data, and we assume that the expectation $E(e_t \mid x_t)$ is zero. That assumption defines the parameter vector $\beta$ in terms of data properties:

(10) $E(y_t \mid x_t) = \beta' x_t$.

Likewise, the error $e_t$ is a function of the data:

(11) $e_t = y_t - E(y_t \mid x_t)$.

The properties of $\beta$ and $e_t$ vary with choice of $x_t$ and with the orthogonality assumption, or equivalently, with the (often implicit) choice of variables ignored and with the conditioning assumption. Thus, “wrong signs” may well be “wrong interpretations”, and can arise via improper reduction (such as omitted variables). The coefficients and errors of empirical models are derived, not “autonomous”.
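The dependence of $\beta$ on the choice of $x_t$ can be seen in a small simulation. In the sketch below, which uses a hypothetical data-generating process with two correlated regressors, the coefficient on the first regressor is positive when both regressors are included but negative when the second is omitted; both values are valid parameters of the respective conditional expectations in (10).

```python
import numpy as np

rng = np.random.default_rng(2)
t = 5000
x1 = rng.standard_normal(t)
x2 = 0.9 * x1 + 0.3 * rng.standard_normal(t)             # x2 is correlated with x1
y = 1.0 * x1 - 2.0 * x2 + 0.5 * rng.standard_normal(t)   # hypothetical DGP

def ols(y, x):
    """OLS coefficients from regressing y on the columns of x."""
    return np.linalg.lstsq(x, y, rcond=None)[0]

beta_full = ols(y, np.column_stack([x1, x2]))   # conditions on (x1, x2)
beta_short = ols(y, x1[:, None])                # marginalizes with respect to x2

# Population value when x2 is omitted: 1.0 - 2.0 * 0.9 = -0.8.
print(beta_full, beta_short)
```

The “wrong sign” in the short regression reflects the reduction (omitting the second regressor), not a failure of least squares.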

In such a situation, it is justified to ask whether or not we can conduct reliable inference. The answer lies in the validity (or otherwise) of the implied reductions, and the test statistics are constructed to evaluate those reductions. Also, those statistics may be used as criteria to design models that satisfy the statistics by construction. Such model design may be perfectly reasonable and desirable: Hendry’s analogies include “... model aircraft are designed to fly ...” (Hendry (1983b, p. 197)) and “engineers design bridges to withstand the loads of cars and lorries”. However, in the presentation of results, it is crucial to distinguish test statistics appearing as evaluation criteria from those appearing as (possibly implicit) design criteria.

Implicit and explicit model design. (A)-(C) generate reasonably conventional criteria for selection and evaluation of models. However, such criteria are minimal in that they often can be satisfied simply by designing empirical models appropriately. Within-sample “test statistics” become selection criteria, since “large” values on such tests would have induced a re-designed model. For example, a theory-based model imposed on data and with any residual serial correlation removed usually satisfies (A1) and (C); and so on.

Consequently, while these criteria are necessary, they are not sufficient to justify a given model for inference, forecasting, or policy analysis. Genuine tests of a data-based formulation occur only if new data, new forms of tests, or new rival models accrue. “New” can also mean “unused” in this context. Further, because modelers commonly neglect the value of data from existing rival models, tests based on (D) often can help evaluate a given model. That is, we would require evidence on the ability of a model to encompass rival hypotheses, demonstrating that the information in (D) is irrelevant, conditional on (A) and (C). (Here we assume common agreement about and satisfaction of (B).) Since encompassing is a relatively unfamiliar concept, we now discuss it before summarizing this sub-section.

Encompassing. Encompassing can be understood intuitively from the following example illustrating parameter encompassing. Suppose Model 1 predicts $\tilde{\alpha}$ as the value for the parameter $\alpha$ in Model 2, whilst Model 2 actually has the estimate $\hat{\alpha}$. Then we test the closeness of $\tilde{\alpha}$ to $\hat{\alpha}$, taking account of the uncertainty arising in estimation. Model 1 parameter-encompasses Model 2 if $\tilde{\alpha}$ is “statistically close” to $\hat{\alpha}$, so that Model 1 explains why Model 2 obtains the results it does.

For single equations estimated by least squares, a necessary (but not sufficient) condition for parameter encompassing is variance dominance, where one equation variance-dominates another if the former has a smaller variance.⁶ Thus, encompassing defines a partial ordering over models, an ordering related to that based on goodness-of-fit; however, encompassing is more demanding than having the “best” goodness-of-fit. Encompassing is also consistent with the concept of a progressive research strategy (e.g., see Lakatos (1970) and Section 2.D below), since an encompassing model is a kind of “sufficient representative” of previous empirical findings. Because any given model automatically encompasses all special cases of that model, encompassing can become vacuous by choosing a very large imbedding model. Thus, Florens, Hendry, and Richard (1987) introduce a more stringent concept, parsimonious encompassing. As an example, the sequential reduction in Section 4.B attempts to find a parsimoniously encompassing model. Hendry’s empirical studies generally emphasize both encompassing and parsimony; cf. Davidson, Hendry, Srba, and Yeo (1978), Hendry (1983b), Hendry (1988c), Hendry and Mizon (1989), Baba, Hendry, and Starr (1991), and Hendry and Ericsson (1991a, 1991b).

6 Formally, variance dominance refers to the underlying (and unknown) error variances. Without loss of clarity, we often will say a model variance-dominates another if the estimated residual variance of the former is smaller than that of the latter.

In general, an encompassing strategy suggests trying to anticipate problems in rival models of which their proponents may be unaware. For example, one model may correctly predict that the errors of another model are not innovations, or that the parameters of the other model are not constant over time. Corroborating such phenomena adds credibility to the claim that the successful model reasonably represents the data process, whereas disconfirmation clarifies that it does not. For comprehensive accounts of tests for encompassing and of related non-nested hypothesis tests, see Mizon and Richard (1986), Mizon (1984), MacKinnon (1983), and Pesaran (1982). Hendry (1988c) and Ericsson and Hendry (1989) consider encompassing implications when the two models are conditional and rational expectations respectively.
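For two rival least-squares equations, one simple way to examine parameter encompassing is the F-test for “significant additional variables” listed in Table 1: the rival model's regressors are added to one's own model and tested for joint significance. The sketch below is an illustration only, assuming zero-mean simulated data in which Model 1 (regressor `x1`) coincides with the hypothetical DGP and Model 2 (regressor `x2`) does not; all names and values are assumptions.

```python
import numpy as np

def rss(y, x):
    """Residual sum of squares from an OLS regression of y on x."""
    b = np.linalg.lstsq(x, y, rcond=None)[0]
    r = y - x @ b
    return float(r @ r)

def added_variable_f(y, x_own, x_rival):
    """F-statistic for adding the rival model's regressors to one's own model."""
    x_joint = np.column_stack([x_own, x_rival])
    t, k = x_joint.shape
    q = k - x_own.shape[1]                      # number of added regressors
    rss_own, rss_joint = rss(y, x_own), rss(y, x_joint)
    return ((rss_own - rss_joint) / q) / (rss_joint / (t - k))

rng = np.random.default_rng(3)
t = 400
x1 = rng.standard_normal(t)
x2 = 0.7 * x1 + 0.7 * rng.standard_normal(t)    # rival regressor, correlated with x1
y = x1 + 0.5 * rng.standard_normal(t)           # hypothetical DGP: Model 1 is valid
m1, m2 = x1[:, None], x2[:, None]

f_add_m2_to_m1 = added_variable_f(y, m1, m2)    # does Model 1 encompass Model 2?
f_add_m1_to_m2 = added_variable_f(y, m2, m1)    # does Model 2 encompass Model 1?
print(f_add_m2_to_m1, f_add_m1_to_m2, rss(y, m1) < rss(y, m2))
```

In this design, Model 1 variance-dominates Model 2, and adding Model 1's regressor to Model 2 yields a very large F-statistic, so Model 2 fails to encompass Model 1.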

The use of test statistics in both design and evaluation is similar in spirit to the data-based aspect of Box and Jenkins’s (1976) methods for univariate time-series modeling, but existing empirical models and available subject-matter theory play a larger role, while being subjected to a critical examination for their data coherency on (A)-(D). Further, Hendry’s methodology emphasizes the need to estimate the most general model under consideration to establish the innovation variance. Given that most general model, it is of interest to ask what simplifications to that model are available and what effects various simplifications have on the properties of the model.

C. Types of Empirical Models

From the autoregressive distributed lag relationship in (7), nine distinct model classes are derivable, and correspond to different parametric restrictions on the coefficients $(B_0, \{B_i\})$. For expositional simplicity, we consider only current and one-period lags of scalar variables $y_t$ and $z_t$ entering (7), albeit with an explicit constant term. Generalizations to longer lags and more variables follow immediately, and appear in Hendry, Pagan, and Sargan (1984). Furthermore, properties of (7) and the system from which it is derived (equation (5)) determine whether or not a long-run relationship exists between $y$ and $z$, and so whether or not $y_t$ and $z_t$ are cointegrated. Hence, this section divides neatly into model classes, the error-correction model in particular, and cointegration.

Model classes. With a slight change of notation, (7) is:

(12) $y_t = \alpha + \beta_0 z_t + \beta_1 z_{t-1} + \beta_2 y_{t-1} + \nu_t$, $\nu_t \sim IN(0, \sigma^2)$

in its simplified form. Without loss of generality, (12) can be re-arranged via data transformations to achieve:

(13) $\Delta y_t = \alpha + \beta \Delta z_t + \gamma (y_{t-1} - \delta z_{t-1}) + \nu_t$, $\nu_t \sim IN(0, \sigma^2)$

where $\beta = \beta_0$, $\gamma = \beta_2 - 1$, and $\delta = -(\beta_0 + \beta_1)/(\beta_2 - 1)$ (provided $\beta_2 \neq 1$).⁷ ⁸ For reasons that will be apparent shortly, we refer to (13) as an error-correction model (denoted ECM).⁹

7 With the lag operator $L$ defined as $L x_t = x_{t-1}$, we let the difference operator $\Delta$ be $(1 - L)$; hence $\Delta x_t = x_t - x_{t-1}$. More generally, $\Delta_j^i x_t = (1 - L^j)^i x_t$. If $i$ (or $j$) is undefined, it is taken to be unity.

8 If $\beta_2 = 1$, then (13) is: $\Delta y_t = \alpha + \beta \Delta z_t + (\beta_0 + \beta_1) z_{t-1} + \nu_t$.

9 At least two other distinct representations have been labeled as “error-correction”. First, Granger (1986, p. 216) and Engle and Granger (1987, p. 254) describe an error-correction form in which $\Delta y_t$ depends on $(y_{t-1} - \delta z_{t-1})$ and lagged values of $\Delta y_t$ and $\Delta z_t$, i.e., with no current-dated $\Delta z_t$. Engle and Granger’s equation is not a conditional model, but is either a marginal model or (equivalently) a single equation from the joint distribution of $x_t$ in (5). Given the role of weak exogeneity and contemporaneous conditioning in Hendry’s methodology, it is natural to work with an ECM including rather than excluding $\Delta z_t$.

Second, Phillips (1988, p. 355) describes a model in which $y_t$ depends on $z_t$, lags of $(y_t - \delta z_t)$, and current and lagged $\Delta z_t$. This equation is isomorphic to (13), or generalizations thereon. Even so, we will work with (13) and its generalizations rather than Phillips’s representation because inter alia the former often obtain a more orthogonal information set. For instance, the right-hand side variables of our generalized ECM ($\Delta z_t$, lagged $\Delta y_t$ and $\Delta z_t$, and the error-correction term $(y_{t-1} - \delta z_{t-1})$) usually are not highly intercorrelated; cf. Section 4.B. However, the term $(y_t - \delta z_t)$ does tend to be highly autocorrelated, in which case its different lags in Phillips’s representation will be highly correlated with each other. See Section 4.B on data transformations as well.

As summarized in Table 2, parametric restrictions on $(\beta_0, \beta_1, \beta_2)$ (and so on $(\beta, \gamma, \delta)$) can imply the following models: static regression, univariate time series, differenced data (i.e., growth rate), leading indicator, distributed lag, partial adjustment, common factor (i.e., models with autoregressive errors), homogeneous error-correction, and reduced form (i.e., dead start). Volumes have been written on the properties of these models, with a lucid summary in Hendry, Pagan, and Sargan (1984, pp. 1040-1049). Crucially, all of these models involve testable restrictions on the autoregressive distributed lag. In practice, these restrictions are often left untested, thereby imposing a possibly unwarranted reduction of the DGP. Even so, the autoregressive distributed lag model is almost invariably trivial to estimate, so there is little justification in omitting the corresponding test. Conversely, a methodology aiming to simplify the AD model into one of these model types has a statistical framework for doing so.

Table 2. Model Classes for the Autoregressive Distributed Lag ᵃ ᵇ

| Model type | Equation | Restrictions |
|---|---|---|
| Autoregressive distributed lag | $y_t = \beta_0 z_t + \beta_1 z_{t-1} + \beta_2 y_{t-1} + \nu_t$ | None |
| General error-correction ᶜ | $\Delta y_t = \beta_0 \Delta z_t + (\beta_2 - 1)(y - \delta z)_{t-1} + \nu_t$ | None |
| Static regression | $y_t = \beta_0 z_t + \nu_t$ | $\beta_1 = \beta_2 = 0$ |
| Univariate time series | $y_t = \beta_2 y_{t-1} + \nu_t$ | $\beta_0 = \beta_1 = 0$ |
| Differenced data (growth rate) | $\Delta y_t = \beta_0 \Delta z_t + \nu_t$ | $\beta_2 = 1$, $\beta_1 = -\beta_0$ |
| Leading indicator | $y_t = \beta_1 z_{t-1} + \nu_t$ | $\beta_0 = \beta_2 = 0$ |
| Distributed lag | $y_t = \beta_0 z_t + \beta_1 z_{t-1} + \nu_t$ | $\beta_2 = 0$ |
| Partial adjustment | $y_t = \beta_0 z_t + \beta_2 y_{t-1} + \nu_t$ | $\beta_1 = 0$ |
| Common factor (AR error) | $y_t = \beta_0 z_t + u_t$, $u_t = \beta_2 u_{t-1} + \nu_t$ | $\beta_1 = -\beta_0 \beta_2$ |
| Homogeneous error-correction | $\Delta y_t = \beta_0 \Delta z_t + (\beta_2 - 1)(y - z)_{t-1} + \nu_t$ | $\beta_0 + \beta_1 + \beta_2 = 1$ |
| Reduced form (dead start) | $y_t = \beta_1 z_{t-1} + \beta_2 y_{t-1} + \nu_t$ | $\beta_0 = 0$ |

a The typology is illustrated here with equation (12), a first-order autoregressive distributed lag AD(1,1). For generalizations, see Hendry, Pagan, and Sargan (1984, p. 1042).

b For ease of exposition, the constant term $\alpha$ (in (12)) is ignored throughout.

c The general error-correction model is isomorphic to the autoregressive distributed lag, with the parameter $\delta$ being $-(\beta_0 + \beta_1)/(\beta_2 - 1)$ (assuming $\beta_2 \neq 1$).

Comfac models. The nine model classes follow directly from the corresponding restrictions. To illustrate, we consider the relationship of one model type to its restriction on (12): models with a common factor or, equivalently, with an autoregressive error. This model type is particularly important because of its ubiquity in the empirical literature, its relation to the Engle-Granger two-step procedure (for cointegration), and the confusion over its logical status. Specifically on the last issue, models with autoregressive (AR) errors imply a restriction on a more general model, rather than a generalization from a more specific model. See Hendry and Mizon (1978) and Sargan (1980c) for detailed analyses.

Consider the model:

(14) $y_t = b z_t + u_t$

with autoregressive errors:

(15) $u_t = \rho u_{t-1} + \nu_t$.

By substitution of (15) into (14) and noting the definition of $u_{t-1}$ from lagging (14), we have:

(16) $y_t = b z_t + \rho u_{t-1} + \nu_t = b z_t + \rho (y_{t-1} - b z_{t-1}) + \nu_t = b z_t - \rho b z_{t-1} + \rho y_{t-1} + \nu_t$.

Thus, this model contains the restriction that $\beta_0 \beta_2 = -\beta_1$. It is referred to as the “common factor” or “comfac” restriction because (16) can be re-written with $y_t$ and $z_t$ being premultiplied by the common factor $(1 - \rho L)$:

(17) $(1 - \rho L) y_t = (1 - \rho L) b z_t + \nu_t$,

where L is the lag operator. Thus, to paraphrase Hendry and Mizon’s (1978) title, autoregressive errors are a testable and possibly convenient restriction, not a nuisance.

The comfac restriction is testable by Lagrange multiplier, likelihood ratio, and Wald procedures (see Section 2.E). The Wald statistic is the easiest to calculate, being based upon the unrestricted (e.g., OLS) estimates of $(\beta_0, \beta_1, \beta_2)$ in (12). The likelihood ratio statistic can be calculated from the likelihoods (or, often equivalently, the residual sums of squares) from (14)-(15) and (12). Although feasible, the Lagrange multiplier statistic is rarely used in this context because of the necessarily iterative techniques for estimating (14)-(15). See Sargan (1980c) and Sargan (1964) on testing for comfac restrictions with Wald and likelihood ratio statistics respectively.
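A Wald statistic for the comfac restriction $\beta_0 \beta_2 + \beta_1 = 0$ can be computed from the unrestricted OLS estimates of (12) by the usual delta-method approximation. The sketch below is one illustrative way of doing so and is not PC-GIVE's implementation; the function name `comfac_wald` and both simulated data sets (one satisfying the restriction, one a partial-adjustment model violating it) are assumptions.

```python
import numpy as np

def comfac_wald(y, z):
    """Wald statistic for beta0*beta2 + beta1 = 0 in the AD model (12)."""
    x = np.column_stack([np.ones(len(y) - 1), z[1:], z[:-1], y[:-1]])
    yy = y[1:]
    coef = np.linalg.lstsq(x, yy, rcond=None)[0]
    resid = yy - x @ coef
    s2 = resid @ resid / (len(yy) - x.shape[1])
    cov = s2 * np.linalg.inv(x.T @ x)            # OLS coefficient covariance
    _, b0, b1, b2 = coef
    g = b0 * b2 + b1                             # restriction function
    grad = np.array([0.0, b2, 1.0, b0])          # dg/d(alpha, beta0, beta1, beta2)
    return float(g**2 / (grad @ cov @ grad))     # asymptotically chi-square(1)

rng = np.random.default_rng(5)
t = 500
z = rng.standard_normal(t)
nu = 0.5 * rng.standard_normal(t)
u = np.zeros(t)
y_pa = np.zeros(t)
for s in range(1, t):
    u[s] = 0.7 * u[s - 1] + nu[s]                          # AR(1) error, as in (15)
    y_pa[s] = 0.5 * z[s] + 0.7 * y_pa[s - 1] + nu[s]       # partial adjustment
y_comfac = 1.0 * z + u                                     # static model with AR error

w_comfac = comfac_wald(y_comfac, z)   # restriction holds in this DGP
w_pa = comfac_wald(y_pa, z)           # restriction fails: beta0*beta2 + beta1 = 0.35
print(w_comfac, w_pa)
```

Because only the unrestricted regression is needed, no iterative estimation of (14)-(15) is required, which is the computational advantage of the Wald form noted in the text.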

Error-correction models. The properties of and intuition behind error-correction models are important for understanding Hendry’s approach, so we consider them in detail, taking a slightly circuitous route to reach (13). See Davidson, Hendry, Srba, and Yeo (1978, pp. 679-683) and Hendry, Pagan, and Sargan (1984) for additional discussion.

Consider a non-stochastic steady-state theory which implies proportionality between two variables $Y$ and $Z$ (e.g., consumption and income, money and nominal income, or wages and prices) so that $Y = KZ$ where $K$ is constant for a given growth rate of $Z$ (and so of $Y$). In logs, that theory becomes $y = \kappa + z$ with $\kappa = \ln(K)$. Without a precise, real-time economic theory of the dynamic relationship between the corresponding observable variables $y_t$ and $z_t$, a general autoregressive distributed lag relationship is postulated, with the parameters satisfying the restriction entailed by the steady-state solution. Alternatively, Nickell (1985) justifies error correction mechanisms as arising from the optimal response of economic agents in certain dynamic environments. Hendry and Ericsson (1991a) discuss how ECMs generalize conventional partial adjustment models and can be consistent with Ss-type adjustment by economic agents; cf. Baumol (1952) and Miller and Orr (1966). Developing on Campos and Ericsson (1988), Hendry and Ericsson (1991b) re-interpret ECMs as forward-looking, albeit with “data-based” rather than model-based formations of expectations.

Equation (12) is the general AD relationship with only current and one-period lags of $y_t$ and $z_t$ entering. Long-run homogeneity between $y$ and $z$ requires $\delta = 1$, or equivalently, $\beta_0 + \beta_1 + \beta_2 = 1$. Re-writing (12) with that restriction obtains:

(18) $\Delta y_t = \alpha + \beta \Delta z_t + \gamma (y_{t-1} - z_{t-1}) + \nu_t$, $\gamma \neq 0$,

where $\alpha$, $\beta$, and $\gamma$ are the corresponding unrestricted parameters. While equation (18) is often called an error-correction model, we will distinguish it from (13) by calling (18) a long-run homogeneous ECM, at least when ambiguity might arise. Equation (18) has numerous important properties.

First, equation (18) is representative of a large class of models belonging to (7): that class satisfies steady-state economic theoretic restrictions and allows for general dynamic responses. ECMs contrast with other model types, which typically impose restrictions on dynamic responses.

Second, equation (12), and hence (18), can be expressed as the conditional density for $y_t$ in (6a) with $\lambda_1 = (\alpha, \beta, \gamma, \sigma^2)$, $x_t' = (y_t \; z_t)$, and $\ell = 1$. Intuitively, the term $\beta \Delta z_t$ reflects the immediate impact that a change in $z_t$ has on $y_t$. The term $\gamma (y_{t-1} - z_{t-1})$ (with $\gamma$ negative for dynamic stability) is statistically equivalent to having $\gamma (y_{t-1} - \kappa - z_{t-1})$ instead in (18), and hence reflects the impact on $\Delta y_t$ of having $y_{t-1}$ out of line with $\kappa + z_{t-1}$. Such discrepancies could arise from errors in agents’ past decisions, with the presence of $\gamma (y_{t-1} - z_{t-1})$ reflecting their attempts to correct such errors: hence the name error-correction model.

Third, for a steady-state growth rate of $Z_t$ equal to $g$ (i.e., $g = \Delta z_t = \Delta y_t$) and $\nu_t = 0$, then, solving (18), we have:

(19) $Y_t = Z_t \cdot \exp\{[-\alpha + g(1 - \beta)]/\gamma\}$,

reproducing the assumption of proportionality between $Y_t$ and $Z_t$ entailed by the non-stochastic steady-state theory. Note also that $K = \exp\{[-\alpha + g(1 - \beta)]/\gamma\}$, which is independent of $g$ only if $\beta = 1$ or $\alpha$ depends on $g$ appropriately. See Kloek (1984) and Salmon (1982). Valid inferences about the “long-run parameter” $K$ require efficiently (and so consistently) estimating both the vector of parameters in (18) (including the parameters $\beta$ and $\gamma$ corresponding to short-run and dis-equilibrium effects) and that vector’s variance matrix.
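The steady-state solution can be verified by iterating (18) with $\nu_t = 0$ and $z_t$ growing at a constant rate $g$: setting $\Delta y_t = \Delta z_t = g$ in (18) gives $y - z = [-\alpha + g(1 - \beta)]/\gamma$. The following sketch uses hypothetical parameter values (with $\gamma$ negative for stability) chosen only for illustration.

```python
import math

# Hypothetical parameter values (assumptions): gamma < 0 for dynamic stability.
alpha, beta, gamma, g = 0.02, 0.5, -0.2, 0.01

z, y = 0.0, 0.3                      # logs of Z and Y; arbitrary starting values
for _ in range(500):
    dz = g                                       # steady growth in z
    dy = alpha + beta * dz + gamma * (y - z)     # equation (18) with nu_t = 0
    z, y = z + dz, y + dy

kappa_implied = (-alpha + g * (1.0 - beta)) / gamma   # log of K in (19)
K = math.exp(kappa_implied)
print(y - z, kappa_implied, K)
```

The simulated gap $y_t - z_t$ converges to the value implied by (19) regardless of the starting values, because the disequilibrium shrinks by the factor $(1 + \gamma)$ each period.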

This example of the simple one-lag, homogeneous error-correction model is readily extended to include non-proportionality (as in (13)), additional lags, and vector (rather than scalar) $y_t$ and $z_t$. See Hendry, Pagan, and Sargan (1984). Examples of ECMs appear in Sections 4.B and 4.D.
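The isomorphism between the AD model (12) and the general ECM (13) can also be checked numerically: estimating either by least squares gives the same fit, with $\beta = \beta_0$, $\gamma = \beta_2 - 1$, and $\delta = -(\beta_0 + \beta_1)/(\beta_2 - 1)$. The sketch below is a minimal illustration on simulated data; the DGP values are hypothetical, and the ECM is estimated in its linear form with $y_{t-1}$ and $z_{t-1}$ entered separately.

```python
import numpy as np

rng = np.random.default_rng(4)
t = 300
z = np.cumsum(rng.standard_normal(t))          # hypothetical I(1) regressor
y = np.zeros(t)
for s in range(1, t):
    # Hypothetical AD(1,1) process with long-run coefficient delta = 1.
    y[s] = 0.2 + 0.5 * z[s] - 0.1 * z[s - 1] + 0.6 * y[s - 1] \
        + 0.1 * rng.standard_normal()

ones = np.ones(t - 1)

# (12): y_t on constant, z_t, z_{t-1}, y_{t-1}
x_adl = np.column_stack([ones, z[1:], z[:-1], y[:-1]])
a, b0, b1, b2 = np.linalg.lstsq(x_adl, y[1:], rcond=None)[0]

# Linear form of (13): dy_t on constant, dz_t, y_{t-1}, z_{t-1}
x_ecm = np.column_stack([ones, np.diff(z), y[:-1], z[:-1]])
a_e, beta, gamma, c = np.linalg.lstsq(x_ecm, np.diff(y), rcond=None)[0]

delta_from_adl = -(b0 + b1) / (b2 - 1.0)
delta_from_ecm = -c / gamma                    # coefficient on z_{t-1} is -gamma*delta
print(beta - b0, gamma - (b2 - 1.0), delta_from_adl, delta_from_ecm)
```

The two regressions are linear transformations of one another, so the mapped coefficients agree up to rounding error; only the interpretation and the degree of collinearity among regressors differ.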

Cointegration. The ECM class arises naturally from considering the time-series properties of economic data, as is apparent on introducing the related concept cointegration. Typical macro-economic data series (e.g., money, income, prices, GNP, consumers’ expenditure, investment) appear non-stationary. Granger (1981) formalizes the concept of a series being integrated of order $n$ (denoted I($n$)) if its $n$th difference is stationary but its $(n - 1)$th difference is not. In practice, $n = 1$ often suffices. Thus, univariate autoregressive representations of the (scalar) $y_t$ and $z_t$ in our example could each have one root of unity, whereas $\Delta y_t$ and $\Delta z_t$ would then be stationary (or I(0)). For an arbitrary linear combination of $y_t$ and $z_t$ ($y_t - \delta z_t = u_t$, say), that linear combination $u_t$ is generally also I(1) and $\Delta u_t$ is I(0). However, there can exist a unique value of $\delta$ such that $u_t$ is I(0); if so, $y_t$ and $z_t$ are said to be cointegrated. For example, in (18), to ensure that all the regressors balance with the regressand being I(0), then $y_t$ and $z_t$ would have to be cointegrated with $\delta = 1$. More generally, this cointegrating coefficient $\delta$ is the same $\delta$ in (13), thus tying cointegration directly to ECMs. For the initial development of cointegration, see Granger (1981), Granger and Weiss (1983), the papers in Hendry (1986b), and Engle and Granger (1987). For recent summaries and extensions, see Hylleberg and Mizon (1989), Engle and Yoo (1989), Dolado, Jenkinson, and Sosvilla-Rivero (1990), Phillips (1991), and Phillips and Loretan (1991). Section 4.A analyzes Hendry and Ericsson’s (1991b) money-demand data for cointegration.
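The distinction between an arbitrary linear combination and the cointegrating combination can be illustrated by simulation. The sketch below generates a hypothetical I(1) series $z_t$ and sets $y_t = \delta z_t + u_t$ with stationary $u_t$; it then compares a rough persistence measure (the first-order autoregressive coefficient) for the two combinations. This is an informal illustration only, not a formal unit-root or cointegration test, since such tests have non-standard distributions.

```python
import numpy as np

def ar1_coef(u):
    """OLS slope of u_t on u_{t-1} after demeaning: a rough persistence measure."""
    u = u - u.mean()
    return float(u[1:] @ u[:-1] / (u[:-1] @ u[:-1]))

rng = np.random.default_rng(6)
t = 2000
z = np.cumsum(rng.standard_normal(t))     # I(1): a random walk
e = rng.standard_normal(t)
u = np.zeros(t)
for s in range(1, t):
    u[s] = 0.5 * u[s - 1] + e[s]          # I(0) disequilibrium term

delta = 1.0                               # hypothetical cointegrating coefficient
y = delta * z + u

rho_coint = ar1_coef(y - delta * z)       # cointegrating combination: stationary
rho_other = ar1_coef(y - 0.5 * z)         # arbitrary combination: still I(1)
print(rho_coint, rho_other)
```

Only the combination with the correct $\delta$ removes the stochastic trend; any other value leaves a multiple of the random walk in the residual, so its estimated autoregressive coefficient stays close to unity.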

To present a clearer overall picture of cointegration, we return to the unrestricted system for $(y_t', z_t')$, namely, equation (5), and follow the interpretation provided by Johansen (1988) and Johansen and Juselius (1990). Both $y_t$ and $z_t$ may be vectors, and that complicates the analysis: there may be more than one cointegrating vector.

By adding and subtracting various lags of $x_t$, (5) may be rewritten as:

(20) $\Delta x_t = \pi x_{t-1} + C_1 \Delta x_{t-1} + \cdots + C_{\ell-1} \Delta x_{t-\ell+1} + \varepsilon_t$

where the $\{C_i\}$ are

(21) $C_i = -(\pi_{i+1} + \cdots + \pi_\ell)$, $i = 1, \ldots, \ell - 1$

and

(22) $\pi = \left(\sum_{i=1}^{\ell} \pi_i\right) - I$.

The $\pi$ matrix defined in (22) contains information on the long-run properties of the $x_t$ process.¹⁰ We consider the special situation where $x_t$ is I(1) (at most) and so $\Delta x_t$ is I(0), in which case the rank of $\pi$ determines the cointegration properties of $x_t$. Denoting the dimension of $x_t$ as $p \times 1$ and the polynomial $\left(\sum_{i=1}^{\ell} \pi_i Z^i\right) - I$ as $\pi(Z)$, those properties are as follows.

(i) rank($\pi$) $= p$. For $\pi$ to have full rank, none of the roots of $|\pi(Z)| = 0$ can be unity. Provided $|\pi(Z)| = 0$ has all its $\ell \cdot p$ roots strictly outside the unit circle, $x_t$ is stationary because $\pi(L)$ can be inverted to give an infinite moving average (MA) representation of $x_t$.

(ii) rank($\pi$) $= 0$. This implies that $\pi = 0$, so (20) is an equation in differences only. Also, this means that each variable in $x_t$ is I(1).

(iii) $0 <$ rank($\pi$) $= r < p$. In this case, we can write $\pi$ as the outer product of two (full column rank) $p \times r$ matrices $\alpha$ and $\beta$:

(23) $\pi = \alpha \beta'$,

where $\beta'$ is the matrix of cointegrating vectors and $\alpha$ is the matrix of “weighting elements”.¹¹ That is, each $1 \times p$ row $\beta_i'$ in $\beta'$ is a cointegrating vector, as is required for “balance” to make the cointegrating relation $\beta_i' x_{t-1}$ an I(0) process in (20) when (23) is substituted into (20). And, each $1 \times r$ row $\alpha_j$ of $\alpha$ is the set of weights for the $r$ cointegrating terms appearing in the $j$th equation. Thus, the rank $r$ is also the number of cointegrating vectors in the system. While $\alpha$ and $\beta$ themselves are not unique, $\beta$ uniquely defines the cointegration space, and suitable normalizations for $\alpha$ and $\beta$ are available.

10 Johansen (1988) and Johansen and Juselius (1990) write (20) with the level of $x$ entering at the $\ell$th rather than the first lag. Doing so does not alter the coefficient on the lagged level (which is $\pi$) although it does change the coefficients on the lagged values of $\Delta x$. The analysis of cointegration concerns the properties of $\pi$ alone, so the choice of lag on $x$ is irrelevant in this context.

11 We use the notation of $\alpha$ and $\beta$ because they are standard in Johansen’s cointegration analysis. However, they are not to be confused with $\alpha$ and $\beta$ in (13) and (18), which are unrelated to the current usage.

In the bivariate example for the ECM, $p = 2$ and $r = 1$ so that there is a single cointegrating vector $\beta' = (1, -\delta)$. Also, we have normalized on the coefficient for $y_t$.
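The reduced-rank factorization in (23) can be illustrated numerically. The following is a minimal sketch, assuming a cointegrating vector of the form $(1, -\delta)$ and purely hypothetical values for `delta` and the weights in `alpha`; none of the numbers come from the paper.

```python
import numpy as np

# Assumed illustrative values; none of these numbers come from the paper.
delta = 0.8                           # hypothetical long-run coefficient
beta = np.array([[1.0], [-delta]])    # p x r cointegrating vector, normalized on y
alpha = np.array([[-0.25], [0.0]])    # p x r weights; zero weight in the z equation

pi = alpha @ beta.T                   # p x p matrix pi = alpha beta'
rank = np.linalg.matrix_rank(pi)      # reduced rank: 0 < r < p

# alpha and beta are not unique: rescaling one and un-scaling the other
# leaves pi (and the cointegration space) unchanged.
c = 2.0
pi_rescaled = (alpha * c) @ (beta / c).T

print(pi)
print("rank(pi) =", rank)
print("same pi after rescaling:", np.allclose(pi, pi_rescaled))
```

The zero second element of `alpha` is chosen so that the cointegrating term enters only the first equation, which is the configuration relevant to weak exogeneity.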

Numerous systems-based test procedures have been proposed, with the most straightforward being those of Johansen (1988) and Johansen and Juselius (1990). First, they develop maximum likelihood-based testing procedures for determining the value of $r$, and tabulate the (asymptotic) critical values of the likelihood ratio statistic as a function of $p - r$. A likelihood framework is particularly appealing in our context, given the initial specification of the DGP as a density function. Further, noting that rank($\pi$) is the number of nonzero eigenvalues in a determinantal equation closely related to estimating $\pi$, the LR test ties directly back to $\pi$ by testing how many of those eigenvalues are zero. Additionally, the cointegrating vectors in $\beta'$ are a subset of the associated eigenvectors. Two variants of the LR statistic exist, one using the maximal eigenvalue over a subset of smallest eigenvalues, the other using all eigenvalues in that subset. These tests and $\alpha$ and $\beta'$ are computed in Section 4.4 below. Second, Johansen and Juselius develop procedures for testing hypotheses about $\alpha$ and $\beta'$, such as zero restrictions.
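The two LR variants can be sketched as follows. The text does not state the formulas, so this sketch uses the standard forms from Johansen's analysis: the "trace" statistic $-T \sum_{i=r+1}^{p} \ln(1 - \lambda_i)$ and the "maximal eigenvalue" statistic $-T \ln(1 - \lambda_{r+1})$. The function name `johansen_lr_stats` and the eigenvalues are hypothetical, and critical values are not computed.

```python
import numpy as np

def johansen_lr_stats(eigenvalues, T, r):
    """LR statistics for the null of at most r cointegrating vectors.

    eigenvalues: the p estimated eigenvalues, each in [0, 1).
    Returns (trace statistic, maximal-eigenvalue statistic).
    """
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # largest first
    smallest = lam[r:]                                  # the p - r smallest eigenvalues
    trace_stat = -T * np.sum(np.log(1.0 - smallest))    # uses all of that subset
    max_stat = -T * np.log(1.0 - smallest[0])           # uses only its largest member
    return trace_stat, max_stat

# Hypothetical eigenvalues for p = 3 and T = 100 (illustrative only).
eigs = [0.30, 0.05, 0.01]
for r in range(3):
    tr, mx = johansen_lr_stats(eigs, T=100, r=r)
    print(f"H0: rank <= {r}: trace = {tr:.2f}, max-eigenvalue = {mx:.2f}")
```

When only one eigenvalue remains in the subset ($r = p - 1$), the two statistics coincide.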

For the conditional model (7), additional issues regarding cointegration arise. Normalizations aside, the cointegrating vectors $\beta$ are invariant to the particular contemporaneous factorization chosen, i.e., how $x_t$ is partitioned into $y_t$ and $z_t$. However, the weighting coefficients are not invariant because $\alpha$ is co-mingled with $\Omega_{12}$ and $\Omega_{22}$, which depend upon which variables appear in $y_t$ and in $z_t$. Still, the weighting coefficients are of interest because weak exogeneity is lost if a cointegration vector appears in both the conditional and marginal densities (i.e., it has nonzero weights in both). Johansen (1990) proposes an ingenious likelihood-based test of weak exogeneity pertaining to the cointegrating vectors. Our conditional analysis following from (7) above and also from (12) assumed weak exogeneity, and thereby excluded the same cointegration vector from appearing in both the conditional and marginal processes. If weak exogeneity is valid, cointegration analysis can proceed on the conditional model (7) without loss of information, and Johansen (1990) shows how to do so.

Equation (7) might simplify further, such as by having common factors. If it does, then testing for a unit root in $u_t$ amounts to testing whether any of the common factors has a unit root (or whether rank($\pi$) = 0). Somewhat surprisingly, even if common factors are invalid in (7) but are imposed in estimation, tests of a unit root in $u_t$ generally are consistent, but they do lack power relative to some other tests; cf. Kremers, Ericsson, and Dolado (1989).

Engle and Granger (1987) establish the consistency of and propose the use of unit-root tests in the context of cointegration. Cointegration of $y_t$ and $z_t$ corresponds to the roots of $u_t$ being within the unit circle, i.e., that $u_t$ is I(0) rather than I(1). Test statistics include the augmented Dickey-Fuller (1979, 1981) statistic ADF($i$) and the Durbin-Watson statistic $dw$, using the bounds in Sargan and Bhargava (1983) for the latter. Further, Engle and Granger (1987) establish an isomorphism between cointegration and error correction: models with valid ECMs entail cointegration and, conversely, cointegrated series imply an error-correction representation for the econometric model. (For an exposition and extension, see Granger (1986).) They also show that $\delta$ can be consistently estimated from the static regression of $y_t$ on $z_t$: the asymptotic distribution of that estimator is derived by Stock (1987), Phillips (1987), and Phillips and Durlauf (1986). Nevertheless, inference about $\delta$ depends upon nuisance parameters; and, as Banerjee, Dolado, Hendry, and Smith (1986) demonstrate, large finite-sample biases can result when estimating $\delta$ in this way. Finally, many hypotheses of interest relate to the complete conditional model specification (13), and concern speeds of adjustment and the constancy of $\delta$ over time.
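The static-regression approach can be sketched on simulated data. This is a minimal illustration under assumed values (`T`, `delta_true`, and the data process are hypothetical): the long-run coefficient is estimated by a static regression, and a Dickey-Fuller regression without lags is then run on the residuals. Critical values for the resulting statistic are nonstandard and are not computed here.

```python
import numpy as np

rng = np.random.default_rng(0)
T, delta_true = 500, 2.0                       # assumed values for the simulation

z = np.cumsum(rng.standard_normal(T))          # I(1) regressor (random walk)
u = rng.standard_normal(T)                     # I(0) disequilibrium
y = delta_true * z + u                         # y and z cointegrated by construction

# Step 1: static regression of y on a constant and z.
X = np.column_stack([np.ones(T), z])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
delta_hat = coef[1]
u_hat = y - X @ coef

# Step 2: Dickey-Fuller regression (no lags) on the residuals:
#   diff(u_hat)_t = rho * u_hat_{t-1} + e_t
du, lag = np.diff(u_hat), u_hat[:-1]
rho = (lag @ du) / (lag @ lag)
resid = du - rho * lag
se = np.sqrt((resid @ resid) / (len(du) - 1) / (lag @ lag))
df_t = rho / se                                # large negative values point to I(0) residuals

print(f"delta_hat = {delta_hat:.3f}, DF t-statistic = {df_t:.2f}")
```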

To summarize, cointegration is an important unifying concept. First, it ties long-run economic-theoretic relationships to a statistical time-series framework. Second, it provides the statistical basis for testing the existence of a long-run relationship via tests of unit roots. Third, it establishes a firmer statistical and economic basis for the empirically successful error-correction models. Fourth, it resolves the “spurious regressions” or “nonsense-correlations” problem associated with “trending” time-series data via the distributional theory of integrated processes; cf. Phillips (1986). Finally, it clarifies the relationship between Box-Jenkins-type time-series models and economically based levels (often static) models, with the former capturing dynamics but ignoring multivariate relationships and the latter emphasizing the multivariate nature of economic data but ignoring its dynamics. Equations (5) and (7) allow for both dynamics (via lag structure) and multivariate relationships (via cointegration).

D. Modeling Strategies

Throughout his various discussions on methodology, Hendry distinguishes between the roles of model discovery and model justification in empirical model-building. This subsection briefly discusses these concepts, the relative merits of general-to-simple and simple-to-general modeling, and the role of encompassing in a progressive research strategy.

In model justification, tests such as those in Section 2.B are fundamental as evaluation criteria. They help detect whether or not a model is well-specified, i.e., whether or not that model entails valid reductions against the various information sets considered. Evaluation is a routine, almost mechanical procedure.

Model discovery is anything but routine. While “failed” tests may indicate the sorts of mis-specification present, most tests have power against a wide range of alternatives other than the one for which the test was designed. Thus, “rejection of the null does not imply the alternative”, even though many economists interpret (e.g.) autocorrelation in the residuals as indicative of autoregressive errors, and re-estimate the mis-specified model with an AR error process. As seen above, that implies an untested and possibly false common factor restriction. Further, the autocorrelated residuals need not even be due to dynamic mis-specification of the empirical model, but could arise from omitted variables, mis-specified functional form, etc. Still, we should not ignore the information provided by failed test statistics; rather, we should be cautious in interpreting why the model failed those tests, and hence how best to improve the model.

Hendry (1987a, pp. 29-30) emphasizes the inherent dependence of a model’s discovery on its designer’s abilities, noting

... four golden prescriptions for those who seek to study data: I. Think brilliantly. ... II. Be infinitely creative. ... III. Be outstandingly lucky. ... [and, lacking any of these three characteristics, ...] IV. Stick to being a theorist. (italics in original)

While tongue in cheek, these prescriptions hold a substantial element of truth. Much of the empirical modeler’s “value added” is via discovery of a model, not via demonstration of the model’s empirical validity, even given the importance of the latter. Indeed, because we do not know the DGP, no route or method is excluded from finding a correct specification, nor can the route taken to obtain a model affect its intrinsic validity or otherwise. However, once we posit a model as well-specified, we have many tools for evaluating that claim, and/or checking that the model has been well-designed. All this said, there are some guidelines to modeling that are appealing theoretically and have often been successful in practice.

General-to-simple versus simple-to-general. In general-to-simple modeling, we start with the most general model feasible and simplify it to as parsimonious and economically interpretable a model as is statistically acceptable. The simplifications are reductions in themselves, and parallel the reductions described in Sections 2.A and 2.B on the status and evaluation of empirical models. Tests of those simplifications check whether or not the corresponding reductions are valid.

The “most general feasible model” usually is profligate with parameters, so many potential paths of simplification are available. Data transformations prior to simplification sometimes help identify a sensible path. For example, the autoregressive distributed lag in (12) is often easier to model when expressed as the equivalent (13). The latter is in differences (growth rates) and differentials, which tend to be nearly uncorrelated with each other, rather than levels and lagged levels, which are highly correlated. The nearly uncorrelated regressors of (13) are appealing for other reasons as well: measurement errors on one regressor have little effect on inference regarding another regressor, and the regressors may correspond to agents’ orthogonalization of their information set in making their decisions. Even so, more than one data-acceptable parsimonious model may be obtainable from the same data set, requiring additional data, tests, or outside information to differentiate between them.
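The equivalence between a levels form and a differences-and-differentials form can be checked numerically. The sketch below assumes a first-order autoregressive distributed lag in hypothetical notation, standing in for (12) and (13), whose exact forms are not reproduced here; the coefficient values and the data process are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 300
z = np.cumsum(rng.standard_normal(T))            # trending regressor
y = np.zeros(T)
for t in range(1, T):                            # assumed AD(1,1) data process
    y[t] = 0.5 * z[t] - 0.2 * z[t - 1] + 0.7 * y[t - 1] + rng.standard_normal()

def rss(X, v):
    """Residual sum of squares from least squares of v on X."""
    b, *_ = np.linalg.lstsq(X, v, rcond=None)
    e = v - X @ b
    return float(e @ e)

one = np.ones(T - 1)
# Levels form: y_t on (1, z_t, z_{t-1}, y_{t-1}).
X_levels = np.column_stack([one, z[1:], z[:-1], y[:-1]])
# Transformed form: diff(y)_t on (1, diff(z)_t, (y - z)_{t-1}, z_{t-1}).
X_ecm = np.column_stack([one, np.diff(z), (y - z)[:-1], z[:-1]])

rss_levels = rss(X_levels, y[1:])
rss_ecm = rss(X_ecm, np.diff(y))
print(rss_levels, rss_ecm)                       # same fit, different parameterization

# Sample correlation between two regressors in each form.
corr_levels = np.corrcoef(z[1:], z[:-1])[0, 1]
corr_ecm = np.corrcoef(np.diff(z), (y - z)[:-1])[0, 1]
print(corr_levels, corr_ecm)
```

Because the second regressor set is an exact linear transformation of the first, the residuals are identical; only the correlation structure of the regressors changes.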

Hendry’s (1979b) study of the U.K. demand for $M_1$ is a good example of the general-to-simple approach, where he starts with a fourth-order autoregressive distributed lag for money, total final expenditure (the scale variable), prices, and the interest rate, and simplifies to obtain a parsimonious (six-parameter), data-coherent model. From developments in Trundle (1982), Hendry (1985, 1988c), and Hendry and Ericsson (1991b), it is possible to obtain an even simpler (still data-coherent) model over a substantially longer data sample. Sections 4.A and 4.B consider these data in greater detail, with 4.B illustrating general-to-simple modeling.

While the simple-to-general approach (illustrated by the autocorrelation/untested common factors example) cannot be excluded as a route to finding a better model, generally it uses the data information available in an inefficient and potentially inconsistent manner. This approach always assumes that the existing (simple) model has valid associated reductions, except possibly for the one being tested (e.g., autoregressive errors). If any other reductions are invalid, inferences on the simple model become hazardous. Hendry (1980) and Hendry and Richard (1982) illustrate some of the pitfalls in the simple-to-general approach.

Encompassing. Any existing model cannot preclude the value of new insights. Encompassing ensures not only that a model based on those insights adds to the existing knowledge about the phenomenon being modeled, but also that it does not neglect existing knowledge. So, encompassing provides a basis for a progressive research strategy, wherein any new (encompassing) model both contributes something new to the explanation of the phenomenon modeled and accounts for previous models’ results.

Davidson, Hendry, Srba, and Yeo’s (1978) initial study of U.K. consumer expenditure and the subsequent study by Hendry and von Ungern-Sternberg (1981) both emphasize encompassing as a critical property of an empirically acceptable model, albeit before the word “encompassing” was in general use to describe what they were doing. See Hendry (1987a, p. 43) for the progression of studies on U.K. consumers’ expenditure and narrow money demand.

Hendry and Ericsson’s (1991a) model of annual U.K. money demand is an explicit outcome of a progressive research strategy. An initial model was formulated in Hendry and Ericsson (1983), which stimulated further studies, including an improved specification on the same information set by Longbottom and Holly (1985), and a nonlinear reformulation by Escribano (1985). Longbottom and Holly’s and Escribano’s models each encompassed the 1983 model, but the 1983 model could not encompass either of theirs. Still, neither of the improved models could encompass the other improved model, implying the potential for further improvement on the existing data set. To that end, the money-demand model published in Hendry and Ericsson (1991a) encompasses the 1983 model, and the models of Longbottom and Holly and of Escribano. Via encompassing, this most recent model represents a sort of “sufficient statistic” for modeling money demand on this data set, while recognizing that yet further improvements may be possible on this data set or by extending the data set (cf. Hendry and Richard (1989) on properties of encompassing models and Klovland (1987) on an extended data set).

Model design. Encompassing emphasizes the importance of explaining properties of other models, and so the validity of reductions on data that are in other models but not in the model being evaluated. As Section 2.B indicates, encompassing and other test statistics can be employed as model design criteria, with models explicitly constructed to satisfy these tests. White (1988, 1990) provides a statistical theoretic basis for doing so:

Thus, we have an m-testing model selection procedure which rejects sufficiently misspecified models and retains sufficiently correctly specified models with confidence approaching certainty as [$T$] $\to \infty$. Use of such procedures has the potential to remove some of the capriciousness associated with certain empirical work in economics and other fields. For this reason we wholeheartedly endorse progressive research strategies such as that of Hendry and Richard (1982) and Hendry (1987[a]) for arriving at sufficiently well specified characterizations of the DGP. We believe the m-testing framework set forth here can be a convenient vehicle for such strategies. White (1990, p. 381)

To summarize, Hendry’s modeling strategy is intimately linked to the status of empirical models themselves, the statistics for evaluating and designing them, and the classes of models generated by various reductions. The approach stresses a general-to-simple approach, starting with an autoregressive distributed lag and simplifying where statistically feasible, and in an economically interpretable manner. The mapping between AD models, ECMs, and cointegration aids economic interpretability; statistical tests guide the simplifications. Encompassing, along with that explicit model design, helps to ensure progressivity.

E. Model Estimation and Testing

As Sections 2.A and 2.B show, empirical models and test statistics derive from the probability density of the DGP, so it is unsurprising that model estimation and testing are based on the likelihood function. Relatedly, Hendry (1976) shows that virtually all existing estimators can be viewed as (and better understood as) numerical and statistical approximations to the maximum likelihood estimator (MLE). These approximations include instrumental variables (and so Hansen’s (1982) generalized method of moments, or GMM, estimator), 2SLS, 3SLS, k-class estimators, and autoregressive variants thereon. In Hendry’s (1976) framework, it is easy to determine which estimators are asymptotically (or even numerically) equivalent to MLE, which are not, and under what circumstances. In practice, MLE is obtained by maximizing the likelihood and/or by setting its first derivative (the “score”) to zero. Both the likelihood and the score play fundamental roles in testing as well. Tests and estimation are also linked via recursive updating algorithms for estimation, which also generate tests of parameter constancy. Thus, this subsection reviews the three principles for testing, sketches the analytics of recursive estimation, and describes the relationship of recursive procedures to Chow’s statistic and Brown, Durbin, and Evans’s CUSUM and CUSUMSQ statistics.

Testing. Test statistics fall into one of three categories: likelihood ratio (LR), Lagrange multiplier (LM), or Wald. For the first, the likelihood function is evaluated under both the maintained (unrestricted) and null (restricted) hypotheses. The negative of twice the difference of the log-likelihoods is asymptotically distributed as $\chi^2$ under the null hypothesis. For the LM statistic, the score of the maintained hypothesis is evaluated at the restricted parameter estimate. Deviations of that score from zero reflect the degree to which the null hypothesis is not close to the unrestricted estimate. To capture that notion statistically, the LM statistic is constructed as a quadratic form in the score, with weightings from the covariance matrix of the score itself. By contrast, the Wald statistic uses only unrestricted parameter estimates. The degree to which these unrestricted estimates satisfy the parametric restrictions of the null hypothesis is evaluated, and the Wald statistic is constructed as a quadratic form of the (suitably weighted) discrepancies from the restrictions. Typically, the LM, LR, and Wald statistics are asymptotically equivalent, so computational convenience and finite sample properties determine the choice of which one to use. See Engle (1984) for a comprehensive description of test procedures, and Buse (1982) for a graphical analysis of the relationship between the three test statistics.
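For a linear restriction in a normal linear regression, all three statistics can be written in terms of the restricted and unrestricted residual sums of squares. The sketch below uses those standard closed forms (with ML variance estimates) on simulated data; the data process and the restriction tested are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 200
X = np.column_stack([np.ones(T), rng.standard_normal((T, 2))])
y = X @ np.array([1.0, 0.5, 0.1]) + rng.standard_normal(T)   # assumed DGP

def rss(X, y):
    """Residual sum of squares from least squares of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return float(e @ e)

rss_u = rss(X, y)            # maintained (unrestricted) model
rss_r = rss(X[:, :2], y)     # null: the coefficient on the last regressor is zero

lr = T * np.log(rss_r / rss_u)           # -2 * (restricted - unrestricted log-likelihood)
wald = T * (rss_r - rss_u) / rss_u       # weights from the unrestricted estimates
lm = T * (rss_r - rss_u) / rss_r         # weights from the restricted estimates

print(f"Wald = {wald:.3f}, LR = {lr:.3f}, LM = {lm:.3f}")
```

In this setting the three values are ordered Wald, LR, LM from largest to smallest in any sample, even though they share the same asymptotic distribution under the null.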

Estimation. Hendry’s empirical models frequently involve contemporaneous conditioning of a single variable or subset of variables on several other variables, greatly simplifying ML estimation. Provided weak exogeneity is valid, MLE is often least squares. Recursive least squares provides evidence on parameter constancy, predictive accuracy, and the assumed weak exogeneity (via tests of super exogeneity), so we consider recursive methods in some detail.

Recursive methods. One dominant design criterion in Hendry’s methodology is parameter constancy. Recursive estimation of an equation provides an incisive tool for investigating parameter constancy, both through the sequence of estimated coefficient values and via the associated Chow statistics for constancy. Tests of forecast accuracy are intimately related to these tests of constancy; cf. Hendry (1979b) and Kiviet (1986). See Brown, Durbin, and Evans (1975), Harvey (1981, pp. 54-57, 148-159), and Dufour (1982) for excellent discussions of recursive techniques and their implications.

Using the standard linear regression model (9) to illustrate, the OLS estimator of $\beta$ over observations $[1, t]$ is:

(24) $\hat\beta_t = (X_t'X_t)^{-1}X_t'Y_t$,

where $X_t = (x_1 \ldots x_t)'$ and $Y_t = (y_1 \ldots y_t)'$. It is relatively simple (and computationally inexpensive) to obtain the entire sequence of OLS estimates $\{\hat\beta_t : t = h, \ldots, T\}$, starting with some initial number of observations $h$ ($h > k$ for $k$ regressors). The actual algorithm for recursive least squares (RLS) illuminates the properties of that sequence and the properties of test statistics based upon it, so we digress to discuss it.

Much of the actual computational time in calculating $\hat\beta_t$ is spent inverting $(X_t'X_t)$. RLS inverts only $(X_h'X_h)$ and avoids inversion of $(X_t'X_t)$ for $t > h$ by the following updating rule, based on a lemma using the formula for partitioned inversion of a matrix:¹²

(25) $(X_t'X_t)^{-1} = (X_{t-1}'X_{t-1})^{-1} - \lambda_t\lambda_t'/f_t, \qquad t = h+1, \ldots, T,$

where

(26) $\lambda_t = (X_{t-1}'X_{t-1})^{-1}x_t$

and

(27) $f_t = 1 + x_t'(X_{t-1}'X_{t-1})^{-1}x_t = 1 + x_t'\lambda_t$.

The vector $X_t'Y_t$ can be calculated (computationally trivially) as $X_{t-1}'Y_{t-1} + x_t y_t$, so:

(28) $\hat\beta_t = \hat\beta_{t-1} + \lambda_t\eta_t/f_t, \qquad t = h+1, \ldots, T,$

by substitution of (25) into (24), and where the innovation (equivalently, recursive residual) $\eta_t$ is:

(29) $\eta_t = y_t - \hat\beta_{t-1}'x_t, \qquad t = h+1, \ldots, T.$

The variances of $\eta_t$ and $\hat\beta_{t-1}$ are $\sigma^2 \cdot f_t$ and $\sigma^2 \cdot (X_{t-1}'X_{t-1})^{-1}$ respectively.

12 For the partitioned matrix $\begin{bmatrix} A & B \\ B' & -C \end{bmatrix}$, we have: $(A + BC^{-1}B')^{-1} = A^{-1} - A^{-1}B(C + B'A^{-1}B)^{-1}B'A^{-1}$. Equation (25) follows with $A = X_{t-1}'X_{t-1}$, $B = x_t$, and $C = 1$.

Intuitively, the updating formula (28) modifies $\hat\beta_{t-1}$ to the degree that the new information is not in line with the previous estimate (i.e., the extent to which $\eta_t$ is nonzero), weighted by the uncertainty in the estimate $\hat\beta_{t-1}$ relative to the uncertainty surrounding the news $\eta_t$. Likewise, equation (28) is a natural estimation analogue to the sequential factorization of the likelihood function in (3).¹³

Other recursive estimates follow immediately from these formulae. The residual sum of squares ($RSS_t$) is $RSS_{t-1} + \eta_t^2/f_t$, from which comes the recursive equation standard error $\hat\sigma_t$ ($= \sqrt{RSS_t/(t - k)}$). Recursive estimated standard errors for $\{\hat\beta_t\}$ follow from $\hat\sigma_t$ and (25).
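The recursions above can be sketched in a few lines. The function name `recursive_least_squares`, the simulated data, and the choice of `h` are assumptions made for illustration; the updating steps follow the rank-one update of the inverse moment matrix, the coefficient update driven by the innovation, and the recursive residual sum of squares.

```python
import numpy as np

def recursive_least_squares(X, y, h):
    """RLS from an initial sample of h observations.

    Returns the coefficient path, innovations, f_t, and recursive RSS.
    """
    T, k = X.shape
    P = np.linalg.inv(X[:h].T @ X[:h])        # the only explicit inversion
    b = P @ (X[:h].T @ y[:h])
    rss = float(np.sum((y[:h] - X[:h] @ b) ** 2))
    betas, eta, f, rss_path = [b.copy()], [], [], [rss]
    for t in range(h, T):                     # 0-based row t is observation t + 1
        x = X[t]
        lam = P @ x                           # lambda_t
        ft = 1.0 + x @ lam                    # f_t
        e = y[t] - b @ x                      # innovation, using the previous estimate
        b = b + lam * e / ft                  # coefficient update
        P = P - np.outer(lam, lam) / ft       # rank-one update of the inverse
        rss += e * e / ft                     # recursive residual sum of squares
        betas.append(b.copy())
        eta.append(e)
        f.append(ft)
        rss_path.append(rss)
    return np.array(betas), np.array(eta), np.array(f), np.array(rss_path)

rng = np.random.default_rng(3)
T, k, h = 60, 2, 10
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
y = X @ np.array([1.0, 2.0]) + rng.standard_normal(T)     # assumed constant-parameter DGP

betas, eta, f, rss_path = recursive_least_squares(X, y, h)
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print(betas[-1], b_ols)      # the final recursive estimate and full-sample OLS
```

Comparing the last recursive estimate with full-sample least squares is a convenient check that the updating rule has been coded correctly.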

The innovations $\{\eta_t\}$ also are the basis for two classes of parameter-constancy test statistics, which are proposed by Chow (1960) and augmented by Brown, Durbin, and Evans (1975). These statistics play crucial roles both for testing weak exogeneity indirectly through testing the conjunction of hypotheses embodied in super exogeneity, and (relatedly) for testing feedback versus feedforward empirical models; cf. Engle, Hendry, and Richard (1983), Hendry (1988c), and Engle and Hendry (1989). To discuss these statistics more easily, it is helpful to introduce the “quasi-innovation” $\eta_{t+j,t} = y_{t+j} - \hat\beta_t'x_{t+j}$ ($j > 0$): that is, the innovation for time $t + j$ based on $\hat\beta_t$. Above, we have used the simplified notation $\eta_t = \eta_{t,t-1}$.

Chow derives the covariance matrix for $\{\eta_{t+j,t},\ j = 1, \ldots, N\}$ under classical assumptions. From that, he constructs two statistics, one for testing that all of the quasi-innovations have zero expectation $\{E(\eta_{t+j,t}) = 0,\ j = 1, \ldots, N\}$, and the other for testing that the arithmetic mean of the quasi-innovations has zero expectation $\{E(\sum_{j=1}^{N} \eta_{t+j,t}/N) = 0\}$. The former is commonly referred to as the “Chow statistic”, and we will use that terminology, although noting that Fisher’s (1922) covariance statistic is also sometimes called the Chow statistic, particularly by North-American writers. We will refer to the statistic on the arithmetic mean as the Chow t-statistic because of its associated distribution under the null hypothesis. We now consider how the Chow statistic and the Chow t-statistic are calculated from the recursive estimates, and what the statistics’ variants are in terms of the forecast horizon.

13 Recursive instrumental variables estimation is also feasible although the algorithms are considerably more complicated; cf. Phillips (1977) and Hendry and Neale (1987).

The Chow statistic. The ($N$-step) Chow statistic [to test $\{E(\eta_{t+j,t}) = 0,\ j = 1, \ldots, N\}$] can be directly calculated as:

(30) $\mathrm{CHOW}(N, t-k) = \left[\dfrac{RSS_{t+N} - RSS_t}{RSS_t}\right] \cdot [(t-k)/N]$,

where $RSS_t$ and $RSS_{t+N}$ are obtained recursively, and are numerically equivalent to the residual sums of squares directly calculated from $\hat\beta_t$ and $\hat\beta_{t+N}$ respectively. Under the null hypothesis of correct specification, the statistic $\mathrm{CHOW}(N, t-k)$ is exactly distributed as an $F(N, t-k)$ for normal independent $\varepsilon_t$ with fixed regressors $x_t$, approximately so for dynamic regressors. The Chow statistic itself is approximately:

(31) $\mathrm{CHOW}(N, t-k) \approx \{\sum_{j=1}^{N} (y_{t+j} - \hat\beta_t'x_{t+j})^2/N\}/\hat\sigma_t^2 = \{\sum_{j=1}^{N} \eta_{t+j,t}^2/N\}/\hat\sigma_t^2$,

where the coefficient uncertainty (and so the correlations between $\eta_{t+i,t}$ and $\eta_{t+j,t}$, $i \neq j$) is ignored. In effect, the Chow statistic compares the mean square forecast error (the average in curly brackets in (31)) with the in-sample error variance $\hat\sigma_t^2$. Equally, the Chow statistic compares the observed quasi-innovations with their anticipated variance covariance matrix (approximately $\hat\sigma_t^2 \cdot I_N$) to measure whether or not the quasi-innovations’ means have deviated significantly from zero. Hendry’s (1979b) $\chi^2$ statistic for testing against predictive failure is $N$ times the right hand side of (31).
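The $N$-step Chow statistic can be computed directly from two residual sums of squares. The sketch below is an illustration on simulated data, with a hypothetical intercept shift after observation 60; the helper names `ols` and `chow` are not from the paper, and least squares is re-run on each sample rather than updated recursively.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients and residual sum of squares."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return b, float(e @ e)

def chow(X, y, t, N):
    """N-step Chow statistic from the RSS on [1, t] and on [1, t + N]."""
    k = X.shape[1]
    _, rss_t = ols(X[:t], y[:t])
    _, rss_tN = ols(X[:t + N], y[:t + N])
    return (rss_tN - rss_t) / rss_t * (t - k) / N

rng = np.random.default_rng(4)
T = 80
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
# Assumed DGP: the intercept shifts from 1 to 6 after observation 60.
beta = np.where(np.arange(T)[:, None] < 60, [1.0, 2.0], [6.0, 2.0])
y = np.sum(X * beta, axis=1) + rng.standard_normal(T)

print("forecast period before the shift:", chow(X, y, t=40, N=10))
print("forecast period after the shift: ", chow(X, y, t=60, N=10))
```

Each value would be compared with an $F(N, t-k)$ critical value; with a shift of this size the second statistic should be far larger than the first.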

Recursive estimation is sequential, so sequences of statistics for testing for parameter constancy are natural and convenient to evaluate. Common sequences are characterized by whether the “forecast horizon” N is fixed, decreasing, or increasing. Those sequences for the N-step Chow statistic are:

(i) fixed $N$-step, $\{\mathrm{CHOW}(N, t-k),\ t = h, \ldots, T-N\}$,

(ii) decreasing horizon ($N \downarrow$), $\{\mathrm{CHOW}(N, T-N-k),\ N = T-h, \ldots, 1\}$, and

(iii) increasing horizon ($N \uparrow$), $\{\mathrm{CHOW}(N, h-k),\ N = 1, \ldots, T-h\}$.

An important member of (i) is the sequence of one-step Chow statistics, i.e., $\{\eta_t^2/(\hat\sigma_{t-1}^2 f_t),\ t = h, \ldots, T-1\}$, where a typical statistic is testing $E(\eta_t) = 0$.

The sequences (i)-(iii) are portrayed in Figures 1a-1c, with the first figure being for the one-step ahead sequence (for ease of illustration). Further, in empirical work itself, graphs conveniently and succinctly present these sequences, the recursive estimates, and the one-step residuals described below, as shown in Figures 5-11 (Sections 4.C and 4.D). The sequences (ii) and (iii) sometimes are called “break-point” and “fixed-point” sequences, given the nature of the division between estimation and forecast periods in each. In practice, it is useful to have $N$, $T$, and/or $h$ chosen by the user, e.g., when attempting to detect a particular class of departures from constancy.

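The Chow t-statistic. The $N$-step Chow t-statistic [for testing $E(\sum_{j=1}^{N} \eta_{t+j,t}/N) = 0$] is:

(32) $t\text{-}\mathrm{CHOW}(N : t-k) = \{\sum_{j=1}^{N} (y_{t+j} - \hat\beta_t'x_{t+j})/N\}/\{\hat\sigma_t^2 \cdot \omega_{N,t}^2\}^{1/2} = \{\sum_{j=1}^{N} \eta_{t+j,t}/N\}/\{\hat\sigma_t^2 \cdot \omega_{N,t}^2\}^{1/2}$,

where

(33) $\omega_{N,t}^2 = N^{-1} + (\sum_{j=1}^{N} x_{t+j}/N)'(X_t'X_t)^{-1}(\sum_{j=1}^{N} x_{t+j}/N)$.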

Figure 1a. The sequence of one-step ahead Chow statistics: estimates and forecasts. (Horizontal axis: time, marked 0, ..., $h$, $h+1$, $h+2$, $h+3$, ..., $T-1$, $T$.)

Figure 1b. The sequence of $N\downarrow$-step ahead Chow statistics: estimates and forecasts. (Same time axis.)

Figure 1c. The sequence of $N\uparrow$-step ahead Chow statistics: estimates and forecasts. (Same time axis.)
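The Chow t-statistic can be sketched in the same way as the Chow statistic: average the quasi-innovations over the forecast horizon and scale by the estimated standard error of that average. The function name `chow_t` and the simulated data are assumptions for illustration only.

```python
import numpy as np

def chow_t(X, y, t, N):
    """N-step Chow t-statistic: mean quasi-innovation over its standard error."""
    k = X.shape[1]
    Xt, yt = X[:t], y[:t]
    XtX_inv = np.linalg.inv(Xt.T @ Xt)
    b = XtX_inv @ (Xt.T @ yt)
    sigma2 = np.sum((yt - Xt @ b) ** 2) / (t - k)      # in-sample error variance
    quasi = y[t:t + N] - X[t:t + N] @ b                # quasi-innovations based on [1, t]
    xbar = X[t:t + N].mean(axis=0)                     # average forecast-period regressor
    omega2 = 1.0 / N + xbar @ XtX_inv @ xbar           # variance factor for the average
    return quasi.mean() / np.sqrt(sigma2 * omega2)

rng = np.random.default_rng(6)
T = 60
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
y = X @ np.array([1.0, 2.0]) + rng.standard_normal(T)  # assumed constant-parameter DGP

print("8-step Chow t-statistic:", chow_t(X, y, t=30, N=8))
print("one-step Chow t-statistic, squared:", chow_t(X, y, t=30, N=1) ** 2)
```

For a one-step horizon, the squared value should coincide with the one-step Chow statistic computed from residual sums of squares, which gives a simple consistency check.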

Unlike (31), the equalities in (32) are exact. Under the null hypothesis of correct specification, the statistic $t\text{-}\mathrm{CHOW}(N : t-k)$ is exactly distributed as Student’s t with $(t-k)$ degrees of freedom for normal independent $\varepsilon_t$ with fixed regressors $x_t$, approximately so for dynamic regressors. The forecast horizon $N$ is included as an argument to clarify the number of quasi-innovations averaged.¹⁴ The Chow t-statistic also allows three classes of sequences:

(i′) fixed $N$-step, $\{t\text{-}\mathrm{CHOW}(N : t-k),\ t = h, \ldots, T-N\}$,

(ii′) decreasing horizon ($N \downarrow$), $\{t\text{-}\mathrm{CHOW}(N : T-N-k),\ N = T-h, \ldots, 1\}$, and

(iii′) increasing horizon ($N \uparrow$), $\{t\text{-}\mathrm{CHOW}(N : h-k),\ N = 1, \ldots, T-h\}$.

However, given the Chow statistics from (30), the one-step Chow t-statistic $t\text{-}\mathrm{CHOW}(1 : t-k)$ is uninteresting (apart from its sign) because its square is $\mathrm{CHOW}(1, t-k)$.

The CUSUM and CUSUMSQ statistics. Brown, Durbin, and Evans (1975) propose using the cumulative sum (CUSUM) and cumulative sum of squares (CUSUMSQ) of $\{\eta_t\}$ for testing against nonconstant $(\beta, \sigma^2)$. The relationship of those statistics to Chow’s is as follows.

The CUSUMSQ statistic is:

(34) $\mathrm{CUSUMSQ}_t = (\sum_{j=k+1}^{t} \eta_j^2/f_j)/(\sum_{j=k+1}^{T} \eta_j^2/f_j) = RSS_t/RSS_T$,

which is equivalent to the $(T-t)$-step $\mathrm{CHOW}(T-t, t-k)$ statistic via a nonlinear transformation, noting (30). Brown, Durbin, and Evans suggest plotting the sequence $\{\mathrm{CUSUMSQ}_t,\ t = h, \ldots, T\}$, equivalent to the sequence of decreasing-horizon Chow statistics (ii). Confidence bands for the sequence of CUSUMSQ are lines parallel to the 45° line. Brown, Durbin, and Evans’s CUSUM statistic is:

(35) $\mathrm{CUSUM}_t = \{\sum_{j=k+1}^{t} \eta_j/f_j^{1/2}\}/\{\sum_{j=k+1}^{T} (\eta_j^2/f_j)/(T-k)\}^{1/2}$,

and is closely related to the $t\text{-}\mathrm{CHOW}(T-t : t-k)$ in that both are designed to have particular power against a nonzero average of the forecasts’ means. However, the Chow t-statistic looks at the average of the quasi-innovations whereas the CUSUM statistic uses the average of the innovations, so their properties in finite samples may differ somewhat. The Chow t-statistic can be written in terms of a weighted average of the appropriate innovations, paralleling (35), but the weights are unequal. Confidence bands for sequences of Chow t-statistics are simple to calculate; those for the CUSUM statistic are less so (see Brown, Durbin, and Evans (1975)), although the modified CUSUM statistic $\mathrm{CUSUM}_t/(t-k)^{1/2}$ is approximately normal.

14 Chow (1960, p. 594) actually proposes the square of $t\text{-}\mathrm{CHOW}(N : t-k)$, distributed as $F(1, t-k)$, but the t-statistic is more useful because it includes the sign of the deviation from zero.

Various sequences of the CUSUM statistic are trivial to calculate from the sequence of innovations generated by RLS. Sequences of fixed $N$-step and decreasing-horizon Chow t-statistics require the sequence $\{(X_t'X_t)^{-1}\}$, making any one of them easy to calculate during estimation but more time-consuming to do so if requested after estimation when the sequence $\{(X_t'X_t)^{-1}\}$ is not stored.
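Both cumulative statistics can be sketched from the sequence of innovations. The sketch assumes, following Brown, Durbin, and Evans's construction, that each innovation is standardized by the square root of its variance factor before cumulating, so that the cumulated squares match the recursive residual sums of squares. The function name and the simulated data are hypothetical, and the innovations are computed by repeated least squares rather than by the updating rule, purely for brevity.

```python
import numpy as np

def standardized_innovations(X, y):
    """w_j = innovation_j / sqrt(f_j) for observations j = k+1, ..., T."""
    T, k = X.shape
    w = np.empty(T - k)
    for t in range(k, T):                      # 0-based row t is observation t + 1
        P = np.linalg.inv(X[:t].T @ X[:t])
        b = P @ (X[:t].T @ y[:t])              # estimate from the first t observations
        f = 1.0 + X[t] @ P @ X[t]              # variance factor of the innovation
        w[t - k] = (y[t] - X[t] @ b) / np.sqrt(f)
    return w

rng = np.random.default_rng(5)
T, k = 50, 2
X = np.column_stack([np.ones(T), rng.standard_normal(T)])
y = X @ np.array([1.0, 2.0]) + rng.standard_normal(T)   # assumed constant-parameter DGP

w = standardized_innovations(X, y)
sigma_hat = np.sqrt(np.sum(w ** 2) / (T - k))
cusum = np.cumsum(w) / sigma_hat                # one value per observation k+1, ..., T
cusumsq = np.cumsum(w ** 2) / np.sum(w ** 2)    # rises from near 0 to 1

print(cusum[-1], cusumsq[-1])
```

Plotting `cusum` and `cusumsq` against time, together with the confidence bands described by Brown, Durbin, and Evans, would give the usual graphical checks; the bands themselves are not computed here.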

These sequences of statistics and the sequences of coefficient estimates with their associated standard errors provide different, albeit related, views on empirical parameter constancy and are designed to detect various types of structural breaks which might occur. The sequence of the “one-step residual” $\{y_t - \hat\beta_t'x_t\}$ and a measure of its uncertainty $\{0 \pm 2\hat\sigma_t\}$ together provide a convenient set of statistics summarizing much from the other recursive estimates and statistics.

Recursive least squares and the corresponding test statistics generalize immediately to unrestricted systems of equations: $Y_t$ in (24) is reinterpreted as a matrix. Hence, this estimator is known as recursive multivariate least squares (RMLS). It is central to testing the constancy of the complete system (5), and to testing the constancy of (7) when (7) is a sub-system but not a single equation.

F. Summary and Perspective

Figure 2 summarizes the relationships between several principal concepts discussed above. Each concept has a statistical and an economic counterpart. Economic theory (in the upper right hand corner) is central to all concepts, and serves as a natural starting point from which to trace the various branches of the figure.

Cointegration and dynamic specification. At a minimum, economic theory suggests which variables ought to exhibit a long-run relationship. Cointegration links that economic notion with a statistical model of those variables. Cointegration also implies the existence of an error-correction representation of the relevant variables, leading to the short-run as well as long-run interactions, and hence dynamic specification.

Dynamic specification influences model design because of the statistical and economic importance of white-noise, innovation disturbances. Dynamics may appear in empirical models because the agent’s optimization explicitly dictates its presence, because agent behavior implies cointegrated variables and hence dynamics, because ceteris paribus conditions of the theory-model may not hold in fact, or because of any combination thereof. As a rule, dynamic mis-specification invalidates inference, so dynamics cannot be safely ignored, and a general specification (at the outset, at least) often is advisable.

Agents’ contingent plans. Weak exogeneity can occur when agents condition on information, e.g., in forming contingent plans. If they use the information efficiently, innovation errors are implied, raising the issue of dynamic specification. Weak exogeneity implies a statistical factorization of the data density. Weak exogeneity is testable, often as an implication of super exogeneity (and so of conditional models having constant parameters).

Weak exogeneity also ties back to cointegration. First, if a given cointegrating vector appeared in both the conditional and marginal models, weak exogeneity would be violated: the parameters of the conditional model ($\lambda_1$) would depend upon the parameters of the marginal process ($\lambda_2$). Second, the choice of normalization of the cointegrating vector is an unresolved issue, but both economics and the data can help. Economic theory may suggest which variables agents aim to control and on which ones they condition their plans, and parameter constancy in an empirical model is not invariant to normalization when the economy exhibits structural change.

Figure 2. Some Relationships between Econometric Concepts. (Nodes: parameter constancy (RLS); economic theory: parameters of interest; super exogeneity (policy); cointegration; weak exogeneity (inference); dynamic specification. Link labels: changing marginal processes; agent's contingent plans; long-run; statistical factorization; innovation errors; short-run (ECM); reliable inference.)

Parameter constancy. Economic theory focuses on the invariants of the economic process. The continuing debates on autonomy, “deep” or “structural” parameters, and the Lucas critique all reflect that. Thus, parameter constancy is at the heart of economic model design. Since economic systems are far from being constant, and the coefficients of derived (“non-structural” or “reduced form”) equations may alter when any of the underlying parameters or data correlations change, it is important to identify empirical models which have reasonably constant parameters and which remain interpretable when some change occurs. Empirical models with constant parameterizations in spite of “structural change” elsewhere in the economy exhibit super exogeneity, as required for policy analysis. Super exogeneity implies weak exogeneity, thereby sustaining valid statistical inference.

Parameter constancy is also a central concept from a statistical perspective. Most estimation techniques require parameter constancy for valid inference, and those that seem not to do so, still posit “meta-parameters” assumed constant over time. Recursive estimation of an equation provides an incisive tool for investigating parameter constancy, both through the sequence of estimated coefficient values and via the associated Chow statistics for constancy.
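As a rough illustration of recursive estimation, the following Python sketch re-estimates an equation by OLS over expanding samples and forms one-step Chow statistics from successive residual sums of squares. The function names, the simulated data, and the use of NumPy are assumptions made for illustration; PC-GIVE's own implementation is not reproduced here.

```python
import numpy as np

def recursive_ols(y, X, start):
    """OLS on the first t observations, for t = start, ..., T.

    Returns the sequence of coefficient vectors and the sequence of
    residual sums of squares (RSS_t)."""
    betas, rss = [], []
    for t in range(start, len(y) + 1):
        b = np.linalg.lstsq(X[:t], y[:t], rcond=None)[0]
        resid = y[:t] - X[:t] @ b
        betas.append(b)
        rss.append(float(resid @ resid))
    return np.array(betas), np.array(rss)

def one_step_chow(rss, start, k):
    """One-step Chow statistics, each F(1, t-k-1) under constancy:
    (RSS_t - RSS_{t-1}) * (t - k - 1) / RSS_{t-1}, for t = start+1, ..., T."""
    t = np.arange(start + 1, start + len(rss))
    return (rss[1:] - rss[:-1]) * (t - k - 1) / rss[:-1]

# Simulated example (hypothetical data with constant parameters).
rng = np.random.default_rng(0)
T, k = 60, 2
X = np.column_stack([np.ones(T), rng.normal(size=T)])
y = X @ np.array([1.0, 0.5]) + 0.1 * rng.normal(size=T)
betas, rss = recursive_ols(y, X, start=10)
chow = one_step_chow(rss, start=10, k=k)
```

Plotting `betas` and `chow` against the sample end-point gives a simple analogue of the recursive graphs discussed in the text; large Chow values at particular dates would point to possible non-constancy.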

Other approaches. In some common empirical practice, the data serve only for estimating parameters of the theory-model, with the theory imposed on the data. Short-run dynamics are non-existent, or are supposedly eliminated by “corrections” for AR residuals or by partial adjustment.

Hendry’s econometric approach contrasts with this practice in several respects. Economic theory is embedded in the empirical model such that that model satisfies the economic-theoretic constraints under the conditions that those constraints were derived (e.g., in the long run, or in steady state). Short-run dynamics are modeled jointly with long-run properties via the ECM. Unlike (e.g.) partial adjustment models, the ECM does not restrict the magnitude of (short-run) responses actually present. Rather, it allows for the possibility of general dynamics, so that their extent (or lack thereof) can be determined from the data. Estimation is an important issue, but, in itself, provides little guidance on the value or lack thereof of the empirical model obtained. An empirical model is unlikely to allow reliable statistical or economic inference, forecasting, or policy analysis unless it is “well-designed” in the sense that it does not violate either the assumptions made at the outset or the numerous testable implications of those assumptions.

See Gilbert (1986) for a non-technical discussion of the contrasts between these two approaches. See Hendry (1987a), Leamer (1987), and Sims (1987) for concise statements of their advocated methodologies, and Pagan (1987) and Phillips (1988) for critical appraisals.

Hendry, Leamer, and Poirier (1990) focus on the distinctions between Hendry’s and the Bayesian approaches.

3. The Structure of PC-GIVE Version 6.01

PC-GIVE is the menu-driven, DOS-based suite of computer programs, or “interactive modeling system”, that David Hendry has written to implement the methodology described in Section 2. Summary information for PC-GIVE appears in Table 3. The modeling system PC-GIVE contains two primary programs:

Table 3. Summary information for PC-GIVE Version 6.01ᵃ

Author: David F. Hendry

Distributor: Institute of Economics and Statistics, St. Cross Building, Manor Road, Oxford OX1 3UL, England; (0865) 271090 (Lucy Gibbins)

Format: Two high-density or seven double-density diskettes (either 5.25" or 3.5"), not copy protected

Computer: IBM PC, XT, AT, PS/2, or compatible; MS-DOS 2.0 or higher; twin floppy disk drive or hard disk drive; 488Kb RAM free to run PCGIVE, 476Kb RAM free to run PCFIML; math coprocessor strongly recommended; Hercules, CGA, EGA, VGA, and Paradise VGA graphics cards supported

Storage space on hard disk (if used): 2.0 Mb

Printers supported: Most dot-matrix and laser printers

Language: Microsoft FORTRAN and Assembler (source code not available to the user)

Computational accuracy: Acceptable, using Longley's data: see Teräsvirta (1988)

Documentation: 353 page users manual, on-line help facilities

ᵃFor further information, contact the distributor.

PCGIVE: for dynamic single-equation modeling, data analysis, and transformations; and

PCFIML: for estimation and testing of systems of equations;

where program names do not include a hyphen. This section aims to relate the econometric methodology described in Section 2 to the functioning of PCGIVE and PCFIML. Thus, we consider the structure of each program, and how elements of their structure are designed to implement the methodology. Sections 3.A and 3.B respectively describe PCFIML and PCGIVE, noting that PCFIML is the more general of the two programs while PCGIVE tends to be used most in practice. Within each section, discussion is organized around the primary menus of the respective program. To put the programs in perspective, the remainder of this introduction summarizes the context in which the programs were developed and the purposes motivating that development. Section 3.C briefly compares PC-GIVE with RATS, Micro-FIT, TSP, and GAUSS.

A brief history. During the late 1960’s, Hendry wrote several programs on the University of London mainframe computer as part of his thesis, Hendry (1970). Those programs included GIVE (an acronym for Generalized Instrumental Variables Estimation) and FIML (for Full Information Maximum Likelihood). GIVE calculated OLS and instrumental variables (IV) estimates and autoregressive variants thereon, and tested the common factor restrictions implied by autoregressive errors, following Sargan (1959, 1964). FIML calculated full information maximum likelihood estimates for sets of dynamic simultaneous equations.

While written in the context of his thesis, three aims soon dominated Hendry’s motivation for further developing GIVE, FIML, and others of his programs:

(i) to provide “best available methodology” for his own research,

(ii) to provide other researchers with that methodology (thereby both avoiding redundant programming efforts and encouraging best practice in the profession), and

(iii) to provide a tool for teaching that methodology.

Thus, these programs often provided new tests and techniques years in advance of commercially available packages, and not infrequently in advance of the publication of the articles in which the tests and techniques were developed. Examples include a numerically efficient algorithm for estimating equations with AR errors, LR and Wald tests of common factors, tests of predictive failure and parameter constancy, LM tests of AR errors, LM tests of ARCH errors, and White’s heteroscedasticity-consistent standard errors. Hendry and Srba (1980) summarize the state of GIVE, FIML, and related mainframe packages at the end of the 1970's.

When the IBM PC-XT was released in 1983, Hendry decided that it had just enough power to make a PC version of GIVE feasible. The mainframe program code was ported to the PC, recompiled and debugged, and the program PCGIVE was “born”. Not only was the XT powerful enough, but the interactive and graphical capabilities of a personal computer radically reshaped the program. Important resulting additions included a menu-driven user interface, the statistical analysis of sequential reduction, and the graphical analysis of recursive estimation and testing procedures. Later, with the increased power of the PC-AT, the more computationally intensive program FIML was ported to the PC as “PCFIML” and adapted to the menu structure that had been developed for PCGIVE.

Prompt coding of new tests and techniques notwithstanding, objectives (ii) and (iii) have concentrated much program development time on “error-trapping” (e.g., preventing users from selecting combinations of choices that don’t make sense) and user-friendliness. The latter has included development of a straightforward menu interface and an extensive help system.

A. The Structure of the Program PCFIML

Figure 3 sketches the structure of PCFIML according to its primary menus, which are named: data input, model, cointegration, R.M.L.S. Chow tests graph, dynamic (ex ante) forecasts, graphics, and estimator selection. Some additional, more minor menus will be mentioned in the course of describing PCFIML’s functioning. See Hendry, Neale, and Srba (1988) for a detailed description of the methodology, structure, and algorithms for PCFIML.

Data input. On entry to PCFIML, the user chooses the data to be analyzed, i.e., the vector $x_t$ in (5). Restrictions on the size of data set are as for PCGIVE; see Section 3.B.

Model. From the model menu, the user specifies and estimates the unrestricted reduced form (5) (i.e., a system in Hendry’s terminology), and (then) any variants of (5), e.g., a set of over-identified simultaneous equations. In practice, the unrestricted reduced form could be a conditional subsystem, as in (7), rather than the complete system in (5). The system’s specification includes designation of maximum lag length $\ell$, endogenous ($y_t$) versus strongly exogenous ($z_t$) variables, and what dummies (e.g., seasonals, constant, and trend) to include. Only weak exogeneity of $z_t$ is required for estimation and testing, but strong exogeneity is needed for dynamic forecasting. For simplicity of exposition, we assume that a complete system is analyzed.

Cointegration. Because establishing cointegration is critical to an economic interpretation of the empirical model, PCFIML begins with a cointegration analysis of the VAR (5). PCFIML estimates the coefficient matrices $\{\pi_i\}$ in (5), the covariance matrix $\Omega$, and the coefficient standard errors; derives the corresponding estimate of the long-run coefficient matrix $\pi$ in (22); calculates Johansen’s trace and maximal eigenvalue statistics for testing the rank of $\pi$; and solves for the normalized $\alpha$ and $\beta'$ matrices. At the cointegration menu, the user can graph each (possible) cointegrating combination $\beta_i' x_t$. Estimation at this stage is maximum likelihood assuming normally distributed $\varepsilon_t$, which is multivariate least squares because (5) is unconstrained. Additional output includes the value of the likelihood function and associated measures of goodness-of-fit.

R.M.L.S. Chow tests graph. In addition to the multivariate least squares estimate of the $\{\pi_i\}$, PCFIML calculates their recursive estimates (i.e., R.M.L.S.), the recursive estimates of $\Omega_{ii}$, and the corresponding (equation by equation) Chow statistics; see Section 2.E. The latter can be graphed and compared against one-off critical values. PCFIML also calculates F statistics for excluding a given regressor from all equations, which is useful for simplifying an overly parameterized reduced form.

Dynamic (ex ante) forecasts. Given the close relationship between predictive accuracy and parameter constancy, PCFIML graphs dynamic (ex ante multi-period) forecasts and their confidence intervals; see Chong and Hendry (1986).

Graphics. Finally (for the unrestricted reduced form), PCFIML calculates the (in-sample) dynamic simulation. The user may graph the simulated values, actual values,

[Figure 3 (flowchart not reproduced): the recoverable menu boxes are enter, data input, cointegration*, R.M.L.S. Chow tests graph*, dynamic forecasts*, graphics, and estimator selection.]

Figure 3. A Schematic of PCFIML

*Optional menu.

and/or fitted values. While Hendry and Richard (1982) have shown that dynamic simulation is an invalid technique for model comparison, having the technique programmed has been important for demonstrating precisely that, and for replicating other researchers’ results.

At this point, the user returns to the model menu to specify the model to be estimated. Through the auxiliary model modification menu, the user chooses which variables to include and exclude from each equation, preferably starting from (5) and following a general-to-simple approach. Once the model is specified, PCFIML estimates it by two-stage least-squares (2SLS), reporting structural coefficient estimates and standard errors (e.g., for $\{B_0, \{B_j\}, \Sigma_{11}\}$ and the parameters of the marginal equation for $z_t$), the restricted reduced form coefficients (i.e., the $\{\pi_i\}$ solved from the structural estimates), their standard errors, the reduced form covariance matrix, and the LR statistic for testing the over-identifying restrictions. Because the cointegration analysis and R.M.L.S. rely on (5) being unrestricted, model estimation skips past the cointegration and R.M.L.S. menus and proceeds directly to the dynamic forecasts menu and the graphics menu, with output as described before, but using the 2SLS estimates.

Estimator selection. Having started with 2SLS, PCFIML now allows a wide selection of simultaneous-equations estimators: three-stage least-squares (3SLS), limited information instrumental variables (LIVE), full information instrumental variables (FIVE), and full information maximum likelihood (FIML). Limited information maximum likelihood (LIML) can be calculated as FIML with all equations just identified except the equation of interest. If FIML is selected, the user chooses which algorithm via the optimization menu: Powell’s method (no derivatives), or a quasi-Newton method using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) update algorithm, either with analytical or numerical derivatives. Hendry (1976; 1989b, Chapter 4) describes the relationships between these estimators and the algorithms used. The 2SLS estimates are required to compute 3SLS, LIVE, and FIVE, and are often good starting values for FIML. Hence PCFIML always estimates a model first by 2SLS, with the choice of other estimators being optional. For FIML estimation, the user specifies convergence criteria and the maximal number of iterations, and may choose starting values other than 2SLS, if desired. Also, the likelihood function itself can be graphed. That can prove highly informative because multimodality may result in convergence to an inferior optimum and/or may indicate model mis-specification.
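Since 2SLS is the starting point for the other estimators, a minimal single-equation sketch in Python may help fix ideas. It is an illustration under stated assumptions, not PCFIML's system algorithm: the function name, the simulated data, and the use of NumPy are all hypothetical choices.

```python
import numpy as np

def two_stage_least_squares(y, X, Z):
    """2SLS for one equation y = X b + u with instrument matrix Z.

    Z should contain every exogenous column of X plus the extra instruments.
    Returns the coefficient estimates and the residuals."""
    # First stage: project the regressors onto the instruments.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress y on the fitted regressors.
    b = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    # Residuals are formed with the actual regressors, not the fitted ones.
    resid = y - X @ b
    return b, resid

# Simulated example (hypothetical): x is correlated with the error u.
rng = np.random.default_rng(1)
n = 200
z1, z2 = rng.normal(size=n), rng.normal(size=n)
v = rng.normal(size=n)
u = 0.8 * v + 0.2 * rng.normal(size=n)
x = 1.0 * z1 - 0.5 * z2 + v
const = np.ones(n)
y = 2.0 + 1.5 * x + u
X = np.column_stack([const, x])
Z = np.column_stack([const, z1, z2])
b_2sls, resid = two_stage_least_squares(y, X, Z)
```

With two instruments for one endogenous regressor, this equation is over-identified, which is the situation in which a test of over-identifying restrictions becomes available.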

Having obtained estimates using a given estimator, PCFIML generates output equivalent to that obtained for 2SLS. Upon completion of estimation and testing, the user returns to the model menu for further model modification, data transformation, etc., and eventually, to exit.

Modeling a system of equations. Empirical modeling of systems of equations qua systems is less developed than single-equation methodology, but examples exist. For instance, Hendry (1974) develops a small macro-model of the U.K. economy, Hendry and Anderson (1977) and Anderson and Hendry (1984) model the behavior of U.K. building societies, and Hendry and Mizon (1989) model the determination of U.K. money, income, prices, and interest rates.

Summarizing, PCFIML estimates the (unrestricted) system, evaluates it for cointegration, and provides a range of diagnostic tests. With that estimated system as a benchmark, sets of restricted, possibly simultaneous equations may be estimated and tested with several standard techniques. Methodological concepts dominant in the program’s structure are: cointegration, parameter constancy and forecasting, exogeneity, lag length (and so dynamic specification), and over-identifying restrictions (and so marginalization).

B. The Structure of the Program PCGIVE

Figure 4 sketches the structure of PCGIVE according to its primary menus, which are named: data input, model, estimation, reduced form estimates, equation estimates, recursive least squares graph options, graphics, diagnostic tests, and action.

Data input. On entry to PCGIVE, the user chooses the data set for analysis. Data may be read from ASCII files in free or fixed format, or from PC-GIVE’s own data files (see below). A given data set may have a maximum of 240 observations on 40 variables. PC-GIVE databank facilities help ameliorate this restriction.

Model. From the model menu, the user selects from the data set which variables are to be analyzed (i.e., $x_t$). These variables specify a single equation, either one from (5) or one from (7). As with PCFIML, the choice of lag length $\ell$ is separate from that of $x_t$.

Estimation. The estimation options are OLS and IV, the recursive variants thereon (denoted RLS and RIV), and rth-order autoregressive least squares (RALS). For OLS, RLS, and RALS, all variables other than the one normalized are assumed weakly exogenous for the parameters of interest. Later in the program, that assumption is testable via tests of super exogeneity. For IV and RIV, the user explicitly specifies the partitioning of $x_t$ into the endogenous $y_t$ and weakly exogenous $z_t$, and may select various lags of each as instruments. In addition, the user also specifies a subsample (if any) over which to forecast. The output and menus that follow depend upon both the choice of estimator and whether forecasts were specified.

Reduced form estimates. (IV and RIV only) PCGIVE reports estimates, coefficient standard errors, and equation standard errors for the unrestricted reduced form, i.e., for each endogenous variable in the vector $y_t$. The graphics menu is available for fitted and actual values, and the residuals. The reduced form equations are important because they provide a measure of how good the instruments are, and because the test of over-identifying restrictions is a test of the equation estimated by IV against the corresponding (and less restrictive) reduced form equation.

Equation estimates. PCGIVE reports the coefficient estimates, their standard errors, White’s heteroscedasticity-consistent standard errors (if OLS or RLS), the equation standard error, and several auxiliary statistics. For IV and RIV, the latter include Sargan’s (1958) test of over-identifying restrictions, also known as the test of the validity of the instruments.

If RALS is selected, PCGIVE reports two sets of equation estimates, the first without and the second with autoregressive errors estimated. PCGIVE calculates starting values for the second set of estimates from the first set of estimates in conjunction with the latter’s LM statistic for testing AR errors. The optimization menu allows choice of: these or other starting values, the order of the autoregressive process, the maximum number of function values calculated, convergence criteria, and a plot of the concentrated likelihood function. Upon convergence, the user obtains the second set of estimates.

[Figure 4 (flowchart not reproduced): the recoverable menu boxes include equation estimates, reduced form estimates*, and recursive L.S. graph options*.]

Figure 4. A Schematic of PCGIVE

*Optional menu.

Recursive least squares (or recursive instrumental variables) graph options. (RLS and RIV only) From this menu, the user may graph and/or store recursive sequences for: coefficient estimates ($\hat\beta_t$ in the notation of Section 2.E) and their standard errors; the corresponding t-values; the residual sum of squares ($RSS_t$); the standardized innovation ($\nu_t/f_t$); the one-step, increasing horizon ($N\uparrow$), and decreasing horizon ($N\downarrow$) Chow statistics; and the one-step residual ($\nu_{t,t}$) plus-or-minus twice its standard error ($0 \pm 2\hat\sigma_t$). From Section 2.E, all of these sequences are closely related, with the equation standard error $\hat\sigma_t$ and all the Chow statistics being derived from the residual sum of squares.

If forecasts were selected, an analysis of the (one-step) forecasts appears, with actual values, forecast values, forecast errors, standard errors of forecasts, and forecast error t-values. And, Chow’s (1960) and Hendry’s (1979b) statistics for parameter constancy (predictive failure) are reported. The forecast graphing menu is available for forecast and actual values, with a band of plus-or-minus twice the forecast standard error around each forecast.

Graphics. The user may plot actual and fitted values (with forecast values, if applicable), and residuals.

If the model estimated was an autoregressive distributed lag such as (12), PCGIVE solves for the estimate of the long-run coefficient $\delta$ on $z_t$, and its standard error. PCGIVE continues with an analysis of lag structure, including tests of the significance of each variable (i.e., at all lags), and tests of the significance of each lag (i.e., for all variables). At the lag weight graph menu, the user can plot the coefficients of (12) solved as $y_t$ on a distributed lag of $z_t$.
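The long-run solution and the lag weights can be sketched in Python as follows. The sketch assumes an autoregressive distributed lag of the generic form y_t = Σ a_i y_{t-i} + Σ b_j z_{t-j} + e_t, which may differ in detail from the paper's equation (12); the function names and coefficient values are hypothetical, and no standard error is computed.

```python
def long_run_coefficient(ar, dl):
    """Static long-run response of y to z for
    y_t = sum_i ar[i-1]*y_{t-i} + sum_j dl[j]*z_{t-j} + e_t."""
    return sum(dl) / (1.0 - sum(ar))

def lag_weights(ar, dl, horizon):
    """Coefficients w_0, ..., w_{horizon-1} when the equation is solved
    as y_t on a distributed lag of z alone."""
    w = []
    for j in range(horizon):
        wj = dl[j] if j < len(dl) else 0.0
        for i, a in enumerate(ar, start=1):
            if j - i >= 0:
                wj += a * w[j - i]
        w.append(wj)
    return w

# Hypothetical ADL(1,1): y_t = 0.5*y_{t-1} + 0.2*z_t + 0.3*z_{t-1} + e_t
ar, dl = [0.5], [0.2, 0.3]
delta = long_run_coefficient(ar, dl)
weights = lag_weights(ar, dl, horizon=50)
```

For a stable equation the lag weights sum (in the limit) to the long-run coefficient, which gives a quick consistency check on the two calculations.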

Diagnostic tests. Here, the user specifies statistics from Table 1, chooses the relevant degrees of freedom (e.g., the order of the AR process if testing for AR errors), and obtains the outcome. Tests are available for: AR errors, ARCH errors, non-normality, heteroscedasticity due to squares of the regressors, functional form mis-specification, omitted variables, common factors, and encompassing. Additionally, on entering the diagnostic tests menu, a batch mode is available for calculating test statistics with pre-defined degrees of freedom.

Action. At this point, the user chooses which variables to delete (or add) to the equation, whether to transform the equation’s variables, and what sample and estimator to use. Given those choices, the user generates new output starting from the estimation menu. The user also may return to the model menu for complete re-specification of the equation, access to a different data set, and eventually, to exit.

Modeling a single equation. In light of Section 2’s discussion on reductions and marginalization, it is desirable for the user to start with a general autoregressive distributed lag, and then simplify in light of the evidence. If the user does so, PCGIVE keeps track of the progress. At the Action menu, PCGIVE tabulates and/or graphs statistics on the sequential reductions for easy inspection and judgment of the reductions’ success. The statistics include all possible F statistics for the sequentially selected exclusions of variables, the residual sums of squares, the equation standard errors, and the Schwarz criterion. Section 4.B illustrates.
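The reduction statistics mentioned above can be computed from residual sums of squares alone. The Python sketch below uses textbook formulae; the function names and the numbers are hypothetical, and the Schwarz criterion is given in one common form that may differ from PC-GIVE's exact definition.

```python
import math

def reduction_f(rss_general, rss_simple, k_general, k_simple, T):
    """F statistic for reducing a model with k_general coefficients to a
    nested one with k_simple coefficients, both fitted to T observations."""
    q = k_general - k_simple
    return ((rss_simple - rss_general) / q) / (rss_general / (T - k_general))

def equation_standard_error(rss, k, T):
    """Degrees-of-freedom adjusted residual standard deviation."""
    return math.sqrt(rss / (T - k))

def schwarz_criterion(rss, k, T):
    """One common form: ln(RSS/T) + k*ln(T)/T (smaller is preferred)."""
    return math.log(rss / T) + k * math.log(T) / T

# Hypothetical reduction: 10 coefficients down to 6, with T = 100.
f_stat = reduction_f(rss_general=10.0, rss_simple=12.0,
                     k_general=10, k_simple=6, T=100)
sc_general = schwarz_criterion(10.0, 10, 100)
sc_simple = schwarz_criterion(12.0, 6, 100)
```

Under the null that the excluded coefficients are zero, `f_stat` would be compared with an F(4, 90) distribution in this example.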

User services. At numerous points, PCFIML and PCGIVE offer various user services, either via a menu directly or through a “line menu” at the bottom of the screen. These services include: transform, delete, graph, list, and save the data; change the sample period; change the screen color; review the results; load a new data set; access DOS; and access help facilities.

Input and output files. In order to identify certain sorts of information, PC-GIVE associates several DOS file extensions (“suffixes”) with particular input and output of its programs. The file extensions and their meanings are as follows.

Context    Extension   File content

PC-GIVE    [.INF       data information
            .BIN       binary data]
           .PIC        graphics (“picture”)

PCGIVE     [.EQN       summary output
            .OUT       full output]
           .BNK        databank
           .DAT        ASCII

PCFIML     .MDL        summary output
           .LST        full output
           .JOB        model specification

(with an .INF and .BIN file pair)

Brackets indicate where files exist as pairs. While the types of files are not critical to the methodology, understanding their functions is important to using the programs.

Summarizing, PCGIVE estimates a specified single equation and provides a range of diagnostic tests. If the user models from general to simple, PCGIVE summarizes the statistical reduction process. Methodological concepts dominant in the program’s structure are: cointegration (via single equation cointegration tests), parameter constancy and forecasting, dynamic specification and lag structure, exogeneity, and reduction.

C. Comparison of Computer Packages

Numerous articles formally review PC-GIVE: see Turner and Podivinsky (1987) on Version 4.1, Teräsvirta (1988) and Parks (1989) on Version 5.0, and Godfrey (1990) and Ericsson and Lyss (1991) on Version 6.0/6.01. (Also, see Hendry (1986g) on teaching with PC-GIVE.) “Competing” econometrics packages have been extensively reviewed in various issues of the American Statistician, Economic Journal, Journal of Applied Econometrics, and Journal of Economic Surveys. Thus, this section provides only a cursory comparison of PC-GIVE with a few other major PC-based econometrics packages (RATS, Micro-FIT, TSP, and GAUSS), and solely in the context of empirically implementing Hendry’s methodology.

RATS is good for analyzing systems if used with Johansen and Juselius’s “front-end”, available through the RATS electronic bulletin board. With that front-end, the user can test hypotheses about $\alpha$ and $\beta$. However, RATS is command- rather than menu-driven, and so is not very user-friendly. Also, RATS provides relatively few of the test statistics offered in PC-GIVE.

Micro-FIT (previously called DATA-Fit) is designed for single-equation modeling. It is easy-to-use (menu-driven), and provides many diagnostic test statistics. It does not offer any system analysis, so the Johansen procedure is not available. Also, Micro-FIT often lacks measures of significance. For instance, it provides no p-values of test statistics, and no standard errors for recursive estimates.

TSP can estimate both single equations and systems, but lacks any pre-programmed facilities for system cointegration analysis. TSP is command-driven, and offers relatively few diagnostic statistics.

GAUSS is a high-level language rather than an econometrics program. However, many economists are familiar with GAUSS. It can be programmed to perform all features in PC-GIVE, but requires substantial work on the part of the user to do so.

PC-GIVE was designed to implement Hendry’s methodology. Unsurprisingly, many of the tools required are present: Sections 3.A and 3.B discuss how the structure of PC-GIVE embodies that methodology, while Section 4 demonstrates PC-GIVE in practice. Of the packages discussed, only PC-GIVE calculates an extensive set of single equation and system diagnostics, including RLS, RIV, and RMLS. To be fair, neither RATS nor TSP was designed to implement this methodology. Micro-FIT was influenced by this methodology, and Micro-FIT and PC-GIVE are remarkably similar in several aspects of program structure and output content for single-equation analysis.

While we are long-time and enthusiastic users of PC-GIVE, we have several minor reservations about the package. These are: the size of a given PC-GIVE data matrix (from a pair of .inf and .bin files) can be limiting; test statistics are not calculated for hypotheses about Johansen’s $\alpha$ and $\beta$; standard errors are not included at the lag weight graph menu; PCFIML ex ante forecast standard errors exclude “coefficient uncertainty”; and no recursive t-Chow statistics are available. We understand some of these concerns are being addressed.

4. Some Empirical Examples of the Methodology

This section illustrates the principal methodological concepts via three datasets. Section 4.A examines cointegration for Hendry and Ericsson’s (1991b) quarterly data on U.K. narrow money demand. Section 4.B uses the same data set to demonstrate general-to-simple modeling, dynamic specification, and sequential reduction. Section 4.C illustrates diagnostic testing as model evaluation with a phase-average U.K. nominal income equation from Friedman and Schwartz (1982). Section 4.D tests for super exogeneity in Campos and Ericsson’s (1988) model of annual Venezuelan consumers’ expenditure via tests of constancy and omitted variables. To emphasize the generality of the methodological concepts and techniques, we consider models of various aspects of the economy, using data from both developed and developing countries and measured at different frequencies over different sample periods.

A. Cointegration: Narrow Money Demand in the United Kingdom

This subsection briefly sketches the static theory-model on which the analysis is based, describes the data, and discusses the cointegration results obtained using Johansen’s procedure.

In the standard theory of money demand, we have:

(36a)  $M^d/P = q(Y, R)$,

where $M^d$ is nominal money demanded, P is the price level, Y is a scale variable (e.g., income), and R is a vector of interest rates. The function q(·,·) is increasing in Y, decreasing in those elements of R for assets that are alternatives to money, and increasing in those elements of R for assets within the measure of money. One common specification of (36a) is log-linear in money, prices, and incomes, but linear in interest rates:

(36b)  $m^d - p = c \cdot y + dR$,

where variables in lower case are in logarithms. The elements of d are negative or positive, corresponding to the associated asset being excluded from or included in the selected monetary aggregate. The parameter c is one-half in Baumol’s (1952) and Tobin’s (1956) transactions demand theory and unity in Friedman’s (1956) quantity theory.
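As a worked illustration, one functional form for q(·,·) that delivers the log-linear specification is shown below. The exponential form is an assumption made here for illustration, with $M^d$ denoting nominal money demanded and $dR$ read as the sum of the elements of d times the corresponding interest rates; the paper does not state which form of q it has in mind.

```latex
\[
\frac{M^{d}}{P} = Y^{c}\,\exp(dR)
\quad\Longrightarrow\quad
\ln M^{d} - \ln P = c\,\ln Y + dR
\quad\Longleftrightarrow\quad
m^{d} - p = c\,y + dR .
\]
% Elasticity with respect to the scale variable:
\[
\frac{\partial (m^{d} - p)}{\partial y} = c ,
\qquad\text{so doubling } Y \text{ scales } M^{d}/P \text{ by } 2^{c}:
\quad 2^{1/2} \approx 1.41 \ (c = \tfrac12), \qquad 2^{1} = 2 \ (c = 1).
\]
```

Under this reading, c is the income elasticity of real money demand, while each element of d is a semi-elasticity with respect to the corresponding interest rate.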

The data for this subsection and Section 4.B are from Hendry and Ericsson’s (1991b) study of the demand for narrow money in the United Kingdom. Specifically, M, Y, and P are seasonally adjusted nominal M1, real total final expenditure (TFE) at 1985 prices, and the TFE deflator. There are two interest rates, the three-month local authority interest rate (R3) and the learning-adjusted retail sight-deposit interest rate (Rra). Finally, two derived variables are of interest: the rate of inflation (Δp), and the net interest rate or opportunity cost, defined as R3 − Rra and denoted R*. The data are quarterly, 1963(1)-1989(2). Allowing for lags and transformations, estimation is over 1964(3)-1989(2), which is 100 observations.
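The observation counts can be checked with a few lines of Python. The helper name and the (year, quarter) tuple representation are hypothetical conveniences, not anything used by PC-GIVE.

```python
def quarters_inclusive(start, end):
    """Number of quarterly observations from start to end, both inclusive.

    Dates are (year, quarter) pairs, e.g. (1964, 3) for 1964(3)."""
    (y0, q0), (y1, q1) = start, end
    return 4 * (y1 - y0) + (q1 - q0) + 1

full_sample = quarters_inclusive((1963, 1), (1989, 2))
estimation_sample = quarters_inclusive((1964, 3), (1989, 2))
lost = full_sample - estimation_sample
```

The full sample has 106 quarters and the estimation sample has 100, so six initial observations are set aside for lags and transformations.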

To analyze the cointegration properties of the series (m, p, y, R3, Rra), we apply the system-based procedures in Johansen (1988) and Johansen and Juselius (1990), as described in Section 2.C and implemented in PCFIML (Section 3.A). While the application of their ML procedure is computationally straightforward, three important issues which could affect inference arise with this data set: the maximal lag length $\ell$ in the vector autoregression (5), the order of integration of m and p (whether I(1) or I(2)), and whether to enter the interest rates separately or only via the opportunity cost R*. To gauge the sensitivity of the cointegration tests to these issues, we examine four systems with the following variables:

System I: m, p, y, R3, Rra;

System II: m, p, y, R*;

System III: m − p, Δp, y, R3, Rra; and

System IV: m − p, Δp, y, R*;

with each system estimated in PCFIML five times, varying $\ell$ from 2 through 6. A constant term is included in all cases. Tables 4 and 5 present cointegration results for $\ell = 5$; results for other values of $\ell$ are similar.

Table 4 lists the eigenvalues related to $\pi$, from smallest (i.e., closest to a unit root) to the largest (most stationary). Values of the maximal eigenvalue statistic and the eigenvalue trace statistic follow. From the rejections obtained, there is clearly at least one

Table 4. Cointegration Results: Eigenvalues and Related Test Statistics

                               Systemᵇ                        95%
Statisticᵃ              I        II       III       IV        critical value

Eigenvalues
  1                   0.005    0.002    0.022    0.009
  2                   0.097    0.115    0.050    0.050
  3                   0.166    0.166    0.112    0.128
  4                   0.246    0.375    0.226    0.386
  5                   0.406      -      0.417      -

Maximal eigenvalue statisticᶜ
  1                    0.47     0.15     2.26     0.95          8.08
  2                   10.25    12.20     5.14     5.14         14.60
  3                   18.18    18.15    11.84    13.74         21.28
  4                   28.28*   47.05*   25.62    48.76*        27.34
  5                   52.08*     -      53.94*     -           33.26

Eigenvalue trace statisticᶜ
  1                    0.47     0.15     2.26     0.95          8.08
  2                   10.72    12.35     7.39     6.09         17.84
  3                   28.90    30.49    19.24    19.83         31.26
  4                   57.18*   77.55*   44.85    68.59*        48.42
  5                  109.26*     -      98.79*     -           69.98

ᵃThe statistics are defined in Johansen (1988) and Johansen and Juselius (1990), and critical values are taken from the latter's Table A2. An asterisk denotes significant at the 95% critical value.

ᵇThe systems are fifth-order vector autoregressions of the following variables: I: m, p, y, R3, Rra; II: m, p, y, R*; III: m − p, Δp, y, R3, Rra; and IV: m − p, Δp, y, R*. The estimation period is 1964(3)-1989(2). See the text for definitions of the variables.

ᶜThe hypothesis being tested is that there are at least s unit roots in the system, where s is the number listed in the first column. Let the system have p roots total (p=5 for models I and III, p=4 for models II and IV). If we reject that there are at least s unit roots, then we infer that there are at least p−s+1 cointegrating vectors.

cointegrating vector; and there may be a second, given the tests on System I. However, that latter result may be due to nominal money and prices being possibly I(2), in which case the critical values used are not appropriate.!®
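The link between the eigenvalues and the two test statistics in Table 4 can be checked numerically. The sketch below assumes the standard Johansen formulas: the maximal eigenvalue statistic for an eigenvalue λ is −T ln(1 − λ), and the trace statistic is the running sum of those values starting from the smallest eigenvalue. It takes T = 100 from the estimation period. The function and variable names are illustrative only and are not part of PCFIML. Because the eigenvalues in Table 4 are rounded to three decimals, the computed statistics agree with the table only approximately.

```python
import math

T = 100  # observations in 1964(3)-1989(2)

# System I eigenvalues from Table 4, smallest first (rounded to 3 decimals).
eigenvalues = [0.005, 0.097, 0.166, 0.246, 0.406]
# 95% critical values from Table 4 for s = 1, ..., 5 unit roots.
crit_max = [8.08, 14.60, 21.28, 27.34, 33.26]
crit_trace = [8.08, 17.84, 31.26, 48.42, 69.98]


def johansen_statistics(lams, n_obs):
    """Return (max-eigenvalue, trace) statistics, smallest eigenvalue first."""
    max_stats = [-n_obs * math.log(1.0 - lam) for lam in lams]
    trace_stats = []
    running = 0.0
    for stat in max_stats:
        running += stat
        trace_stats.append(running)
    return max_stats, trace_stats


def implied_cointegrating_vectors(stats, crits):
    """Lower bound on the number of cointegrating vectors.

    Tests "at least s unit roots" for s = p, p-1, ..., 1 and stops at the
    first non-rejection; each rejection implies at least p - s + 1 vectors.
    """
    p = len(stats)
    r = 0
    for s in range(p, 0, -1):
        if stats[s - 1] > crits[s - 1]:
            r = p - s + 1
        else:
            break
    return r


max_stats, trace_stats = johansen_statistics(eigenvalues, T)
for s, (m, t) in enumerate(zip(max_stats, trace_stats), start=1):
    print(f"s={s}: max-eigenvalue {m:7.2f}, trace {t:7.2f}")
print("vectors implied by max-eigenvalue tests:",
      implied_cointegrating_vectors(max_stats, crit_max))
print("vectors implied by trace tests:",
      implied_cointegrating_vectors(trace_stats, crit_trace))
```

For System I both sequences of tests point to two cointegrating vectors, which is the "second" vector that the text treats with caution.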

For comparison, Hendry and Mizon (1989) analyze the data (m − p, Δp, y, R3, trend) over a subsample for which Rra is irrelevant. They obtain two cointegrating vectors, one corresponding to excess money demand and the other to the deviation of output from a trend, where the trend proxies for the level of technology. While acknowledging the importance of technological growth, we exclude a trend from our analysis because suitable critical values are not available and because the presence of a trend per se is problematic in a cointegrating relation. If output is cointegrated with technology, our exclusion of the latter reduces the number of cointegrating vectors estimable in our system by one. In support of this interpretation, the value of each penultimate eigenvalue statistic increases upon the addition of a trend to the system, and usually does so beyond its 95% critical value. Even so, Hendry and Mizon's first cointegrating vector is virtually identical to the cointegrating vector for System IV without a trend.

Table 5 lists the normalized estimated α and β′ for the four systems, where α and β′ are respectively the weighting matrix and the matrix of cointegrating vectors from (23). The first row in β′ is the first cointegrating vector, so (e.g.) that vector is (1 −1.04 −0.95 7.46) for the variables (m, p, y, R*) in System II. Likewise, the first column in α is the set of weighting coefficients across equations for the first cointegrating relation $\beta_1' x_{t-1}$: (−0.15 0.03 0.01 0.03)′ for System II. Thus, the coefficient of $\beta_1' x_{t-1}$ in the money equation for System II is −0.15, and in the price equation, 0.03.

The estimates of α and β′ have several striking features. First, they are remarkably similar across systems, noting the changes in variables. Second, the weight of the first cointegrating vector is approximately −0.2 in the money equation and zero for all the other equations. That is necessary for prices, incomes, and interest rates to be weakly exogenous for the parameters in the money equation, which would validate conditioning on those variables; cf. Johansen (1990). Third, the first cointegrating vector has coefficients closely in line with the quantity theory of money demand. When re-expressed with nominal money as a function of the other variables, the coefficients on prices and real TFE are approximately unity, and R3 and Rra have negative and positive estimated coefficients respectively. Those interest-rate coefficients are nearly equal in magnitude, implying that the interest rates matter only via the net interest rate R*, at least in the long run.16
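As a worked illustration of the re-expression just described, take the first row of β′ for System I in Table 5, namely (1 −1.00 −0.77 5.80 −7.77) for the variables (m, p, y, R3, Rra). Setting that linear combination to zero and solving for m gives the lines below. This ignores the constant and all short-run dynamics, and is offered only as an arithmetic sketch of the sign changes involved.

```latex
\begin{aligned}
m - 1.00\,p - 0.77\,y + 5.80\,R3 - 7.77\,Rra &= 0 \\
\Longrightarrow\quad
m &= 1.00\,p + 0.77\,y - 5.80\,R3 + 7.77\,Rra .
\end{aligned}
```

The same step applied to the System II vector (1 −1.04 −0.95 7.46) gives m = 1.04 p + 0.95 y − 7.46 R*.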

15 As part of the cointegration analysis, we would want to determine the order of integration of each series, e.g., with the augmented Dickey-Fuller statistic. Alternatively, Johansen (1991a) proposes a systems approach. For this data, Johansen (1991b) finds that m and p are quite possibly I(2) and cointegrate to an I(1) variable m − p. Real money m − p cointegrates with y, Δp, and R*, which each appear I(1). In light of this evidence, only Systems III and IV are interpretable in an I(1) framework. However, we include Systems I and II because it is unclear whether m and p are I(2) or whether they are I(1) but (e.g.) contain “structural breaks”; cf. Hendry and Neale (1991) and Hendry and Mizon (1989). The interest rate series raise similar issues.

16 Hall, Henry, and Wilcox (1990) use the three-month Treasury bill rate alone rather


Table 5. Cointegration Results: Normalized α and β′ Matrices (a, b)

Variable    α (weighting matrix)                    β′ (matrix of cointegrating vectors)

System I
m          -0.19  -0.00   0.07   0.02   0.00          1     -1.00   -0.77    5.80   -7.77
p           0.04   0.00   0.08   0.00  -0.00      -1.81      1       1.08   11.31   15.50
y           0.01  -0.01   0.02  -0.10   0.00      -0.08     -0.12    1       0.03   -1.25
R3          0.06  -0.02   0.04  -0.01  -0.01       0.33     -0.21   -0.30    1      -2.18
Rra         0.01  -0.01   0.01   0.02   0.00      -0.68      0.59    0.45   -0.41    1

System II
m          -0.15  -0.01  -0.01  -0.00                 1     -1.04   -0.95    7.46
p           0.03  -0.01   0.01   0.00              1.21      1      -9.80   -8.11
y           0.01   0.00   0.05  -0.00             -0.61      0.35    1      -3.06
R*          0.03   0.00   0.03   0.00              1.17     -1.05   -0.70    1

System III
m − p      -0.22   0.00   0.00  -0.01  -0.00          1      5.67   -0.77    5.82   -7.72
Δp          0.04  -0.02   0.01   0.03   0.00      -0.26      1      -0.47    1.98    3.40
y           0.00  -0.07  -0.02   0.04  -0.00       1.19    -12.87    1       6.24  -11.76
R3          0.07  -0.11  -0.00  -0.03   0.00       0.19     -1.85   -0.16    1      -0.65
Rra         0.01  -0.02   0.00  -0.01  -0.00       1.53     13.11   -0.04   -0.23    1

System IV
m − p      -0.18  -0.03   0.00  -0.00                 1      7.22   -1.08    7.16
Δp          0.02  -0.05  -0.00   0.00             -0.08      1      -0.04   -0.79
y          -0.00   0.23  -0.01  -0.00             -1.26     16.03    1      -7.00
R*          0.03   0.14   0.00   0.01              1.33      6.58   -0.12    1

(a) See the text for definitions of the models and variables.

(b) Entries have been rounded relative to PCFIML's output, with the sign of the estimate retained even if the rounded value is zero.

B. General to Simple: Narrow Money Demand in the United Kingdom

Starting with an unrestricted AD model for money, this subsection examines the issues of data transformations and sequential reduction, and, via the latter, general-to-simple modeling, dynamic specification, and error-correction models; cf. Sections 2.A, 2.C, 2.D, and 3.B. The data are those from the cointegration analysis above.

The unrestricted AD model. In light of the cointegration results, we now analyze the equation for money as a single-equation, conditional model in PCGIVE. We assume weak exogeneity of p, y, and the interest rates for the parameters of the conditional model. Johansen (1991b) tests and finds those variables to be weakly exogenous for the cointegrating vector. Hendry and Ericsson (1991b) test and find super exogeneity for the parameters of the conditional model, which implies weak exogeneity.

To match ℓ = 5 in the VAR, we begin with a fifth-order autoregressive distributed lag of nominal money, conditional on prices, the real TFE, and interest rates. That AD model corresponds to (7), and so to a generalization of (12). Table 6 lists estimates of the coefficients $\{\beta_0, \{\beta_i\}\}$, with their estimated standard errors in parentheses.

Several features of the AD model are of interest. First, the AD model establishes a baseline against which all reductions (i.e., simplifications) of it can be compared. The estimated equation standard error σ̂ is 1.318%; so, by the relation of variance dominance to encompassing, any other model purporting to explain the same data should not have a standard error significantly larger than that. The standard F statistic is appropriate for testing this encompassing implication.

Second, the validity of the AD model itself can be tested via diagnostic statistics. Table 6 reports statistics from Table 1, which test against various alternative hypotheses: residual autocorrelation (dw and AR 1-4), skewness and excess kurtosis (Normality), autoregressive conditional heteroscedasticity (ARCH 1-4), RESET (RESET), heteroscedasticity ($X_i^2$), and heteroscedasticity quadratic in the regressors (alternatively, functional form mis-specification) ($X_i X_j$). Pairs of numbers such as “1-4” denote the minimal and maximal lags; and the null distribution is designated by χ²[·] or F[·,·], where the degrees of freedom fill the square brackets. These statistics are used as diagnostic tests of the AD model, checking to see whether or not it is general enough to capture the salient features of the data. No statistics are significant at their 95% critical values, so we infer that the unrestricted AD model has approximately white noise, homoscedastic, normally distributed residuals, which are innovations with respect to the current and lagged variables in the regression.
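As a small illustration of what one of these residual-based diagnostics measures, the sketch below computes the Durbin-Watson statistic dw from a residual series. It assumes the textbook definition: the sum of squared first differences of the residuals, divided by their sum of squares. The function name and the toy residuals are hypothetical. This is not PCGIVE code, and the residuals underlying Table 6 are not reproduced here.

```python
def durbin_watson(residuals):
    """Textbook Durbin-Watson statistic for a sequence of residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Toy residuals (hypothetical): alternating signs imply negative first-order
# autocorrelation and a dw above 2; a constant run implies a dw of 0.
alternating = [1.0, -1.0, 1.0, -1.0]
smooth = [1.0, 1.0, 1.0, 1.0]
print(durbin_watson(alternating))  # 3.0
print(durbin_watson(smooth))       # 0.0
```

Values near 2, such as dw = 1.97 in Table 6, are consistent with the absence of first-order residual autocorrelation.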

than both the local authority interest rate (similar to the T-bill rate) and the sight deposit interest rate (via R*) in their cointegration analysis of M1. They find no evidence of cointegration between M1, TFE, the TFE deflator, and their interest rate. From Sections 4.A and 4.B and from Hendry and Ericsson (1991b), the own rate (as Rra) is economically and statistically important for explaining M1, both in the short run and in the long run. So, Hall, Henry, and Wilcox's (1990) failure to establish cointegration appears to arise from too narrow a choice of variables. Further supporting this interpretation, Hendry and Ericsson's ECM (reproduced in (39) below) substantially variance-dominates Hall, Henry, and Wilcox's ECM, even though the latter incorporates a variable for financial innovation to “obtain” cointegration.

Table 6. A General Autoregressive Distributed Lag for Nominal Money, Conditional on Prices, Incomes, and Interest Rates

                           lag i (or summation over lags)
Variable         0         1         2         3         4         5     sum, i = 0 to 5

m_{t-i}       -1         0.556     0.238    -0.253     0.125     0.175    -0.160
              (-)       (0.121)   (0.138)   (0.138)   (0.135)   (0.112)   (0.041)

p_{t-i}        0.259     0.220    -0.034    -0.433     0.030     0.111     0.154
              (0.237)   (0.388)   (0.379)   (0.366)   (0.361)   (0.212)   (0.037)

y_{t-i}       -0.057     0.314    -0.082    -0.247     0.130     0.127     0.186
              (0.120)   (0.136)   (0.142)   (0.144)   (0.142)   (0.126)   (0.054)

R3_{t-i}      -0.423    -0.298    -0.169    -0.054    -0.065    -0.066    -1.075
              (0.129)   (0.186)   (0.187)   (0.181)   (0.180)   (0.131)   (0.200)

Rra_{t-i}      0.344     0.040     1.456    -1.351     0.379     0.246     1.114
              (0.447)   (0.851)   (0.951)   (0.965)   (0.913)   (0.549)   (0.307)

constant      -0.265
              (0.521)

T = 100 [1964(3)-1989(2)]   R² = 0.9998   σ̂ = 1.318%   dw = 1.97
AR 1-4 F[4,66] = 0.48   Normality χ²[2] = 4.19   ARCH 1-4 F[4,62] = 0.34
X_i² F[39,30] = 0.66   X_iX_j F[35,34] = 0.30   RESET F[3,67] = 1.97

Third, some functions of the AD model’s estimated parameters are of interest. Specifically, we can obtain its long-run, static, non-stochastic solution:

$$
m = \underset{(0.08)}{0.96}\,p + \underset{(0.30)}{1.17}\,y - \underset{(1.5)}{6.7}\,R3 + \underset{(0.8)}{7.0}\,Rra - \underset{(3.4)}{1.7} , \tag{37}
$$

which corresponds to (36b) and is analogous to (19) with g = 0. The estimates in (37) closely match the system estimates of the first cointegrating vector. The estimates in (37) are derived from the sums of lag polynomial coefficients in the final column of Table 6, where an estimate in (37) is the corresponding sum in Table 6 divided by the negative of the sum for the polynomial in nominal money (e.g., 0.154/0.160 ≈ 0.96 for prices). Thus, nonzero sums of all polynomials are required for nonzero coefficients in the long-run solution, and so for the cointegration of the variables in (37). From the estimates and standard errors in the final column of Table 6, each sum appears to be statistically significantly different from zero, with no t-ratio being less than three in absolute value. PCGIVE presents these results both in a table like Table 6, and as t-ratios for the tests on the sums.
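The arithmetic behind the long-run solution can be sketched as follows, using only the rounded sums and standard errors from the final column of Table 6. The function names are hypothetical and this is not PCGIVE's routine. Because the inputs are rounded, the results match (37) only to about the reported precision. The standard errors in (37) are not reproduced, since they would require the covariances between the estimated sums, which the table does not report.

```python
# Sums of lag-polynomial coefficients and their standard errors,
# copied from the final column of Table 6.
lag_sums = {
    "m": (-0.160, 0.041),
    "p": (0.154, 0.037),
    "y": (0.186, 0.054),
    "R3": (-1.075, 0.200),
    "Rra": (1.114, 0.307),
}
constant = -0.265


def long_run_solution(sums, const, dependent="m"):
    """Static long-run coefficients: each sum divided by minus the sum
    on the dependent variable's own lag polynomial."""
    scale = -sums[dependent][0]
    solution = {name: value / scale
                for name, (value, _) in sums.items() if name != dependent}
    solution["constant"] = const / scale
    return solution


def t_ratios(sums):
    """t-ratios for the hypothesis that each lag-polynomial sum is zero."""
    return {name: value / se for name, (value, se) in sums.items()}


solution = long_run_solution(lag_sums, constant)
ratios = t_ratios(lag_sums)
for name, coef in solution.items():
    print(f"{name:8s} {coef:7.2f}")
print({name: round(t, 1) for name, t in ratios.items()})
```

Every t-ratio exceeds three in absolute value, in line with the statement in the text.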

Several of the estimated coefficients in Table 6 are “statistically significant”. However, the estimates are sensitive to the empirical model’s precise specification, with a high degree of “multicollinearity” present in the data correlation matrix: 151 of the 435 correlations between right-hand side variables are over 0.90. Even so, multicollinearity is not an inherent property of the model, but only of the particular parameterization chosen; cf. Hendry (1989b, pp. 95-97). Thus, we consider data transformations and associated reparameterizations which might resolve this “problem”, and which have other statistically and economically appealing features.

Data transformations and associated re-parameterizations. Two data transformations are particularly useful in dynamic multivariate models: differences and differentials.17 The isomorphic transformation of (12) into (13) illustrates both, so we begin by re-writing (12):

$$
y_t = \alpha + \beta_0 z_t + \beta_1 z_{t-1} + \beta_2 y_{t-1} + \varepsilon_t . \tag{12}
$$

Beginning with differences, consider the distributed lag for $z_t$, which is $\beta_0 z_t + \beta_1 z_{t-1}$. If $z_t$ is highly autoregressive, the estimates of $\beta_0$ and $\beta_1$ may be quite imprecise because $z_t$ and $z_{t-1}$ are highly correlated. However, that distributed lag is equivalent to $\beta_0 \Delta z_t + (\beta_0 + \beta_1) z_{t-1}$, so (12) is also:

$$
y_t = \alpha + \beta \Delta z_t + \delta^* z_{t-1} + \beta_2 y_{t-1} + \varepsilon_t , \tag{12a}
$$

where $\beta = \beta_0$ and $\delta^* = \beta_0 + \beta_1$. For highly positively autocorrelated $z$, the growth rate $\Delta z_t$ and the (lagged) level $z_{t-1}$ are nearly uncorrelated, with $\beta$ (or $\delta^*$) being possibly quite precisely estimated. Further, $\beta$ and $\delta^*$ are economically appealing coefficients to estimate: $\beta$ is the immediate, short-run response of $y$ to a change in $z$, whereas $\delta^*$ measures the long-run response, which is the sum of coefficients in the original distributed lag in levels. (We temporarily ignore the effect of $y_{t-1}$ on the long-run response.) By an equivalent transformation of $y_t$ and $y_{t-1}$, we have:

$$
\Delta y_t = \alpha + \beta \Delta z_t + \delta^* z_{t-1} + \gamma y_{t-1} + \varepsilon_t , \tag{12b}
$$

where $\gamma = \beta_2 - 1$.

17 Ratios, as opposed to differentials, are common when using levels of variables rather than their log-levels.

Differentials are similar to differences, but with the subtraction operator being applied to different variables rather than to different lags of the same variable. In (12b), $\delta^* z_{t-1} + \gamma y_{t-1}$ can be re-written as $\gamma (y_{t-1} - z_{t-1}) + (\delta^* + \gamma) z_{t-1}$, that is, in terms of the differential $(y_{t-1} - z_{t-1})$ and the level of one variable, in this case, $z_{t-1}$. The resulting equation is:

$$
\Delta y_t = \alpha + \beta \Delta z_t + \gamma (y_{t-1} - z_{t-1}) + \delta^{**} z_{t-1} + \varepsilon_t , \tag{12c}
$$

where $\delta^{**} = \delta^* + \gamma = \beta_0 + \beta_1 + \beta_2 - 1$, which captures the degree of non-homogeneity in the long run between $y$ and $z$. If $y$ and $z$ are homogeneous in the long run, $\delta^{**} = 0$ and so $\delta = 1$ (in (13)), and (12c) is a homogeneous ECM; cf. Table 2. Alternatively, the concept of differentials generalizes to “quasi-differentials”, with $\delta^* z_{t-1} + \gamma y_{t-1}$ in (12b) re-written as $\gamma (y_{t-1} - \delta z_{t-1})$, where $\delta = -\delta^*/\gamma = -(\beta_0 + \beta_1)/(\beta_2 - 1)$. That quasi-differential generates (13), repeated here for clarity:

$$
\Delta y_t = \alpha + \beta \Delta z_t + \gamma (y_{t-1} - \delta z_{t-1}) + \varepsilon_t . \tag{13}
$$
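That (12), (12c), and (13) are re-parameterizations of one another can be checked numerically. The sketch below maps the levels coefficients of (12) into the coefficients of the transformed equations and confirms that all three forms imply the same change in y at an arbitrary data point. The parameter values and data point are hypothetical, chosen only for illustration, and the function names are not from PCGIVE.

```python
def reparameterize(beta0, beta1, beta2):
    """Map the levels parameters of (12) into those of (12a)-(13)."""
    beta = beta0
    delta_star = beta0 + beta1
    gamma = beta2 - 1.0
    delta_2star = delta_star + gamma
    delta = -delta_star / gamma  # requires beta2 != 1
    return beta, delta_star, gamma, delta_2star, delta


def rhs_12(alpha, b0, b1, b2, z, z1, y1):
    """Systematic part of (12), which determines y_t."""
    return alpha + b0 * z + b1 * z1 + b2 * y1


def rhs_12c(alpha, b0, b1, b2, z, z1, y1):
    """Systematic part of (12c), which determines Delta y_t."""
    beta, _, gamma, delta_2star, _ = reparameterize(b0, b1, b2)
    return alpha + beta * (z - z1) + gamma * (y1 - z1) + delta_2star * z1


def rhs_13(alpha, b0, b1, b2, z, z1, y1):
    """Systematic part of (13), which determines Delta y_t."""
    beta, _, gamma, _, delta = reparameterize(b0, b1, b2)
    return alpha + beta * (z - z1) + gamma * (y1 - delta * z1)


# Hypothetical values: alpha, beta0, beta1, beta2, then z_t, z_{t-1}, y_{t-1}.
params = (0.1, 0.5, -0.1, 0.7)
z, z1, y1 = 2.0, 1.5, 1.0

change_from_levels = rhs_12(*params, z, z1, y1) - y1
change_from_12c = rhs_12c(*params, z, z1, y1)
change_from_13 = rhs_13(*params, z, z1, y1)
print(change_from_levels, change_from_12c, change_from_13)
```

The three printed numbers coincide up to floating-point rounding, which is what "isomorphic" means here: the transformations change the parameterization but not the model.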

As with differences, differentials often transform highly correlated data into less correlated data. Economically, the differential transformation is appealing. In (13), it generates the cointegrating relationship $y_{t-1} - \delta z_{t-1}$. Below, we will use it to transform the two interest rate series R3 and Rra into a spread (R3 − Rra) and a level (Rra). More generally, economic agents might themselves transform their information set into relatively orthogonal pieces of information, using data transformations such as those just described.
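The claim that such transformations reduce collinearity can be illustrated by simulation. The sketch below is not based on the paper's data: it generates a hypothetical first-order autoregressive series with coefficient 0.95 and compares the correlation between the two regressors of the levels form, $z_t$ and $z_{t-1}$, with the correlation between the two regressors of the transformed form, $\Delta z_t$ and $z_{t-1}$. All names and settings are assumptions made for the example.

```python
import random


def pearson(x, y):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_ar1(n, rho, seed=0):
    """Simulate z_t = rho * z_{t-1} + e_t with standard normal e_t."""
    rng = random.Random(seed)
    z = [0.0]
    for _ in range(n - 1):
        z.append(rho * z[-1] + rng.gauss(0.0, 1.0))
    return z


z = simulate_ar1(5000, rho=0.95)
z_t, z_lag = z[1:], z[:-1]
dz_t = [a - b for a, b in zip(z_t, z_lag)]

corr_levels = pearson(z_t, z_lag)  # regressors of the levels form (12)
corr_mixed = pearson(dz_t, z_lag)  # regressors of the transformed form (12a)
print(f"corr(z_t, z_t-1)       = {corr_levels:.2f}")
print(f"corr(Delta z_t, z_t-1) = {corr_mixed:.2f}")
```

For a stationary AR(1) process with coefficient ρ, the population correlation between $\Delta z_t$ and $z_{t-1}$ is $-\sqrt{(1-\rho)/2}$, about −0.16 when ρ = 0.95, against a correlation of 0.95 between the two levels.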

In the examples above, differencing and differentials have been applied without loss of generality because a suitable term in levels has always been retained, e.g., $z_{t-1}$ in (12a). This contrasts with a common approach, wherein an entire equation is differenced. That latter procedure entails a loss of generality, and corresponds to imposing restrictions on the lag structure. For instance, if (14) in Section 2.C is differenced, that implies a common factor restriction as in (17), with the root of that common factor being unity (ρ = 1); cf. Davidson, Hendry, Srba, and Yeo (1978) and Hendry and Mizon (1978). Alternatively, such differencing can be interpreted as excluding both $z_{t-1}$ and $y_{t-1}$ from (12b), or as excluding the error-correction term $\gamma (y_{t-1} - \delta z_{t-1})$ from (13). These exclusion restrictions are testable. Further, the interpretation of the restricted coefficients is quite different when viewed as a special case of (12b) (or of (13)) rather than as a filter applied to (14).

Having laid down these two principles, we consider several data transformations for the AD model of money. While many other transformations exist, these prove useful in the sequential reduction that follows.

(i) Nominal money m and prices p are transformed to real money m − p and prices p.

(ii) The interest rates R3 and Rra are transformed to the spread R* (= R3 − Rra) and a level Rra.

(iii) All variables (m − p, p, y, R*, Rra) are transformed to a single log-level (or level) and a set of current and lagged differences. For reasons that will be apparent later, the log-levels m − p, p, and y are at the first lag, whereas the levels of R* and Rra are current. So, for example, the fifth-order distributed lag for y is re-written in terms of $\Delta y_t$, $\Delta y_{t-1}$, $\Delta y_{t-2}$, $\Delta y_{t-3}$, $\Delta y_{t-4}$, and $y_{t-1}$.

(iv) The variables $(m-p)_{t-1}$ and $y_{t-1}$ are transformed to $(m-p-y)_{t-1}$ and $y_{t-1}$, where $(m-p-y)_{t-1}$ is the potential error-correction term.

Coefficient estimates and estimated standard errors for the resulting equation appear in Table 7.

That equation is analytically and numerically equivalent (isomorphic) to the equation estimated in Table 6. However, the equation is re-parameterized, and the newly estimated coefficients may exhibit quite different properties. For instance, only one of the 435 correlations between right-hand side variables in Table 7 exceeds 0.90 (that between $y_{t-1}$ and $p_{t-1}$), only eight of the 435 correlations exceed 0.70, and most correlations are quite small.
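The equivalence between the two parameterizations can be seen directly in the coefficients on the level terms. The sketch below applies transformations (i), (ii), and (iv) to the lag-polynomial sums in the final column of Table 6 and recovers the level-term coefficients reported in Table 7. The mapping follows from substituting m = (m − p − y) + p + y and R3 = R* + Rra into the long-run part of the equation. The function name and dictionary keys are hypothetical labels.

```python
# Sums of lag-polynomial coefficients from the final column of Table 6.
sum_m, sum_p, sum_y, sum_R3, sum_Rra = -0.160, 0.154, 0.186, -1.075, 1.114
coef_p_current = 0.259  # coefficient on current prices in Table 6


def levels_to_ecm(sum_m, sum_p, sum_y, sum_R3, sum_Rra, coef_p_current):
    """Level-term coefficients of the ECM form implied by the AD form."""
    return {
        "(m-p-y)[t-1]": sum_m,          # error-correction coefficient
        "p[t-1]": sum_m + sum_p,        # long-run price non-homogeneity
        "y[t-1]": sum_m + sum_y,        # long-run income non-homogeneity
        "R*[t]": sum_R3,                # spread R* = R3 - Rra
        "Rra[t]": sum_R3 + sum_Rra,     # remaining level effect of Rra
        "Dp[t]": coef_p_current - 1.0,  # real money is now the regressand
    }


ecm = levels_to_ecm(sum_m, sum_p, sum_y, sum_R3, sum_Rra, coef_p_current)
for name, coef in ecm.items():
    print(f"{name:14s} {coef:7.3f}")
```

The computed values (−0.160, −0.006, 0.026, −1.075, 0.039, and −0.741) match the corresponding entries in Table 7.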

Sequential reduction. Sequential reduction in PCGIVE generates F statistics over the complete reduction and over sub-sequences of reductions. These statistics are valuable for several reasons. First, the set of F statistics allows control of size. Second, a set of variables may be significant, even though the corresponding individual variables (or subsets of them) may not be. Third, the set of F statistics helps in spotting which (if any) reduction is invalid. If an invalid reduction is detected, sequential reduction can continue in a different direction, starting at the model previous to the invalid reduction.

To aid in the sequential reduction of the model in Table 7, we list several variables in Table 7 with highly statistically significant coefficients and which are economically reasonable to retain, as well as several sets of variables whose coefficients appear numerically and statistically insignificant. The following are highly significant. The error-correction term $(m-p-y)_{t-1}$ enters with a coefficient of −0.160, close to the first term in the α matrix for any of the four systems (Table 5). The current net interest rate $R^*_t$ and the current inflation rate $\Delta p_t$ each enter with large negative coefficients, interpretable as reflecting costs to holding money when other assets (or goods) yield a high return. And, the lagged dependent variable $\Delta(m-p)_{t-1}$ is statistically significant.

From Table 7 and various statistics reported in PCGIVE, the following do not appear either numerically or statistically significant:

(i) The fourth lag on Δ(m − p), Δp, Δy, ΔR*, and ΔRra;

(ii) The third lag on Δ(m − p), Δp, Δy, ΔR*, and ΔRra;

(iii) The second lag on Δ(m − p), Δp, Δy, ΔR*, and ΔRra;

(iv) The variables $p_{t-1}$ and $y_{t-1}$;

(v) $Rra_t$ and all current and lagged values of ΔRra; and

(vi) All current and lagged values of ΔR*.

We will entertain two additional sets of reductions, discussed below:

(vii) $\Delta y_t$ and $\Delta p_{t-1}$ have zero coefficients, and $\Delta y_{t-1}$ and $\Delta(m-p)_{t-1}$ have equal and opposite coefficients; and

(viii) $\Delta p_t$ has a zero coefficient.

Table 7. The Unrestricted Error-correction Model

                                      lag i
Variable              0         1         2         3         4

Δ(m-p)_{t-i}       -1        -0.284    -0.047    -0.300    -0.175
                   (-)       (0.109)   (0.118)   (0.117)   (0.112)

Δp_{t-i}           -0.741     0.041     0.245    -0.441    -0.286
                   (0.237)   (0.263)   (0.260)   (0.251)   (0.233)

Δy_{t-i}           -0.057     0.071    -0.010    -0.257    -0.127
                   (0.120)   (0.139)   (0.130)   (0.131)   (0.126)

ΔR*_{t-i}           0.653     0.354     0.185     0.131     0.066
                   (0.232)   (0.206)   (0.168)   (0.148)   (0.131)

ΔRra_{t-i}         -0.117    -0.376     0.910    -0.494    -0.180
                   (0.418)   (0.532)   (0.553)   (0.540)   (0.545)

(m-p-y)_{t-i}                -0.160
                             (0.041)

p_{t-i}                      -0.006
                             (0.013)

y_{t-i}                       0.026
                             (0.045)

R*_{t-i}           -1.075
                   (0.200)

Rra_{t-i}           0.039
                   (0.263)

constant           -0.265
                   (0.521)

T = 100 [1964(3)-1989(2)]   R² = 0.82   σ̂ = 1.318% (a)

(a) All residual-based statistics are identical to those in Table 6.

From these eight restrictions treated sequentially, we obtain the following nine models:

Model 1: The unrestricted ECM in Table 7 (equivalently, in Table 6);

Model 2: Model 1, excluding the fourth lag on Δ(m − p), Δp, Δy, ΔR*, and ΔRra;

Model 3: Model 2, excluding the third lag on Δ(m − p), Δp, Δy, ΔR*, and ΔRra;

Model 4: Model 3, excluding the second lag on Δ(m − p), Δp, Δy, ΔR*, and ΔRra;

Model 5: Model 4, excluding $p_{t-1}$ and $y_{t-1}$;

Model 6: Model 5, excluding $Rra_t$ and all remaining current and lagged values of ΔRra;

Model 7: Model 6, excluding all remaining current and lagged values of ΔR*;

Model 8: Model 7, excluding $\Delta p_{t-1}$, $\Delta y_t$, and $\Delta y_{t-1}$ [once $\Delta(m-p)_{t-1}$ and $\Delta y_{t-1}$ are transformed to $\Delta(m-p-y)_{t-1}$ and $\Delta y_{t-1}$]; and

Model 9: Model 8, excluding $\Delta p_t$ (i.e., the model is homogeneous in prices in the short run as well as in the long run).

So, for example, Model 2 is Model 1 plus reduction (i); Model 3 is Model 1 plus reductions (i)-(ii); and Model 3 is also Model 2 plus reduction (ii). When estimating these models sequentially, PCGIVE calculates statistics associated with the implied reductions, including those for all model pairs, and not only those for adjacent models. This facilitates assessing whether or not the sequence of reductions is valid, and if not, where not.

Table 8 reports this information from PCGIVE. The table includes σ̂ and the Schwarz criterion for each model, the F statistics for all model pairs, and the associated tail probability values.18 From Model 1 through Model 8, σ̂ remains relatively constant, the Schwarz criterion is always declining, and none of the reductions (i)-(vii) is statistically significant at the 5% level, whether considered individually or as sub-sequences.19 Other orderings of (i)-(vii) generate somewhat different statistics, but those resulting statistics are unlikely to be highly statistically significant because the reduction of (i)-(vii) as a whole appears valid, with F[25,70] = 0.97 and a p-value of 0.51. (Reduction (viii) and the resulting Model 9 will be considered shortly.)

18 The Schwarz criterion is ln(RSS/T) + k·(ln T)/T for k parameters, and so is in effect σ̂², adjusted for the degree of parsimony. A smaller Schwarz criterion indicates a better-fitting model for a given number of parameters, or a more parsimonious model for a given fit.

19 Reduction (ii) is the reduction closest to statistical significance. Conditional on the validity of reduction (i), the F statistic for (ii) is F[5,75] = 2.27 with a p-value of 0.06. Our use of seasonally adjusted data is a likely cause of such a large F statistic. Wallis (1974, p. 21) has shown that seasonal adjustment can introduce “... small positive autocorrelation coefficients at lags of 1-3, 5-7, ... quarters, and somewhat larger negative correlations between observations 4, 8, ... quarters apart.” Because reduction (ii) deletes all third lags of differenced series, inter alia it is deleting the fourth lag of the levels (or log-levels) of the original, seasonally adjusted data. The presence of nearly significant, negative, third-order autocorrelation in Model 8 supports this interpretation.

Table 8. F and Related Statistics for the Sequential Reduction from the Fifth-order AD Model in Table 6

Null hypothesis (a)               Maintained hypothesis (model number) (b)
Model      k    σ̂        SC       1         2         3         4         5         6         7         8

1          30   1.318%   -7.63    -

2 (i)      25   1.306%   -7.81    0.73
                                  [0.60]
                                  (5,70)

3 (ii)     20   1.357%   -7.90    1.48      2.27
                                  [0.17]    [0.06]
                                  (10,70)   (5,75)

4 (iii)    15   1.346%   -8.09    1.24      1.53      0.73
                                  [0.26]    [0.15]    [0.61]
                                  (15,70)   (10,75)   (5,80)

5 (iv)     13   1.340%   -8.17    1.17      1.38      0.69      0.62
                                  [0.31]    [0.19]    [0.68]    [0.54]
                                  (17,70)   (12,75)   (7,80)    (2,85)

6 (v)      10   1.325%   -8.29    1.05      1.18      0.59      0.46      0.36
                                  [0.42]    [0.30]    [0.82]    [0.80]    [0.78]
                                  (20,70)   (15,75)   (10,80)   (5,85)    (3,87)

7 (vi)     8    1.330%   -8.36    1.08      1.20      0.70      0.69      0.73      1.30
                                  [0.39]    [0.29]    [0.75]    [0.68]    [0.61]    [0.28]
                                  (22,70)   (17,75)   (12,80)   (7,85)    (5,87)    (2,90)

8 (vii)    5    1.313%   -8.49    0.97      1.05      0.60      0.54      0.53      0.64      0.20
                                  [0.51]    [0.42]    [0.87]    [0.85]    [0.83]    [0.67]    [0.89]
                                  (25,70)   (20,75)   (15,80)   (10,85)   (8,87)    (5,90)    (3,92)

9 (viii)   4    1.498%   -8.26    2.08      2.45      2.32      3.09      3.67      5.44      7.47      30.01
                                  [0.008]   [0.002]   [0.007]   [0.002]   [0.001]   [0.000]   [0.000]   [0.000]
                                  (26,70)   (21,75)   (16,80)   (11,85)   (9,87)    (6,90)    (4,92)    (1,95)

(a) The first four columns report the model number (with reduction), and for that model: the number of unrestricted parameters k, the estimated equation standard error σ̂, and the Schwarz criterion SC, defined as ln(RSS/T) + k·(ln T)/T. The text defines the models.

(b) The three entries within a given block of numbers are: the F statistic for testing the null hypothesis (designated by the model number to the left of the entry) against the maintained hypothesis (designated by the model number directly above the entry), the tail probability associated with that value of the F statistic (in square brackets), and the degrees of freedom for the F statistic (in parentheses).
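Several entries in Table 8 can be reproduced approximately from k and σ̂ alone. The sketch below assumes that σ̂² equals RSS/(T − k), so that RSS can be backed out from the reported σ̂. It then evaluates the Schwarz criterion ln(RSS/T) + k·(ln T)/T and the usual F statistic for a reduction. The function names are hypothetical and the inputs are the rounded values from the table, so the results are accurate only to roughly two decimals. Tail probabilities are omitted because they require an F distribution function that the Python standard library does not provide.

```python
import math

T = 100  # sample size

# (k, sigma_hat) for selected models in Table 8; sigma_hat as a fraction.
models = {1: (30, 0.01318), 2: (25, 0.01306), 8: (5, 0.01313), 9: (4, 0.01498)}


def rss(k, sigma_hat, n_obs=T):
    """Residual sum of squares implied by sigma_hat^2 = RSS / (T - k)."""
    return sigma_hat ** 2 * (n_obs - k)


def schwarz(k, sigma_hat, n_obs=T):
    """Schwarz criterion ln(RSS/T) + k * ln(T) / T."""
    return (math.log(rss(k, sigma_hat, n_obs) / n_obs)
            + k * math.log(n_obs) / n_obs)


def reduction_f(null, maintained, n_obs=T):
    """F statistic and degrees of freedom for `null` within `maintained`."""
    k0, s0 = models[null]
    k1, s1 = models[maintained]
    q, dof = k1 - k0, n_obs - k1
    rss0, rss1 = rss(k0, s0, n_obs), rss(k1, s1, n_obs)
    return ((rss0 - rss1) / q) / (rss1 / dof), (q, dof)


for number, (k, s) in models.items():
    print(f"Model {number}: SC = {schwarz(k, s):.2f}")
for null, maintained in [(2, 1), (8, 1), (9, 8)]:
    f_stat, dof = reduction_f(null, maintained)
    print(f"Model {null} against Model {maintained}: F{dof} = {f_stat:.2f}")
```

The large F statistic for Model 9 against Model 8 reflects the jump in σ̂ from 1.313% to 1.498% when the current inflation rate is deleted.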

The first six reductions are straightforward and intuitive. Model 7 results, which is:

$$
\begin{aligned}
\Delta(m-p)_t ={}& -\underset{[0.07]}{0.17}\,\Delta(m-p)_{t-1} - \underset{[0.21]}{0.77}\,\Delta p_t + \underset{[0.21]}{0.12}\,\Delta p_{t-1} + \underset{[0.11]}{0.02}\,\Delta y_t \\
& + \underset{[0.12]}{0.15}\,\Delta y_{t-1} - \underset{[0.059]}{0.632}\,R^*_t - \underset{[0.010]}{0.092}\,(m-p-y)_{t-1} + \underset{[0.005]}{0.023}
\end{aligned}
\tag{38}
$$

T = 100 [1964(3)-1989(2)]   R² = 0.76   σ̂ = 1.330%   dw = 2.18.

Values in square brackets [·] are White's (1980) heteroscedasticity-consistent estimated standard errors (H.C.S.E. in PCGIVE); see also Nicholls and Pagan (1983) and MacKinnon and White (1985).

Simplifications of (38) are possible, including the deletion of $\Delta p_{t-1}$, $\Delta y_t$, and $\Delta y_{t-1}$. However, while the coefficient on $\Delta y_{t-1}$ is not particularly statistically significant, it is opposite in sign and virtually equal in magnitude to the coefficient on $\Delta(m-p)_{t-1}$, suggesting a different (i.e., nonzero) restriction. That restriction of “equal magnitude, opposite sign” is not only statistically but economically appealing, as it results in a single variable $\Delta(m-p-y)_{t-1}$, which is the change in the error-correction term. Thus, current money balances are determined not only by past disequilibrium [via $(m-p-y)_{t-1}$], but by how fast that disequilibrium is changing [via $\Delta(m-p-y)_{t-1}$]. In Phillips's (1954, 1957) control-theoretic terminology, these two terms represent proportional and derivative control; cf. Salmon (1982). Reduction (vii) is the equal-opposite restriction, plus the deletion of $\Delta p_{t-1}$ and $\Delta y_t$.
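The algebra behind the equal-opposite restriction is short, and is sketched below using the two relevant coefficients from (38). The symbol θ is introduced here only for illustration (it is not part of the original notation) and denotes the common coefficient imposed by the restriction.

```latex
\begin{aligned}
-0.17\,\Delta(m-p)_{t-1} + 0.15\,\Delta y_{t-1}
  &\approx \theta\,\Delta(m-p)_{t-1} - \theta\,\Delta y_{t-1} \\
  &= \theta\,\bigl[\Delta(m-p)_{t-1} - \Delta y_{t-1}\bigr]
   = \theta\,\Delta(m-p-y)_{t-1} .
\end{aligned}
```

Imposing the restriction thus replaces two regressors with one, and θ is estimated as −0.17 in Model 8.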

The resulting model, Model 8, is:

$$
\begin{aligned}
\Delta(m-p)_t ={}& -\underset{[0.14]}{0.69}\,\Delta p_t - \underset{[0.06]}{0.17}\,\Delta(m-p-y)_{t-1} \\
& - \underset{[0.053]}{0.630}\,R^*_t - \underset{[0.008]}{0.093}\,(m-p-y)_{t-1} + \underset{[0.004]}{0.023}
\end{aligned}
\tag{39}
$$

T = 100 [1964(3)-1989(2)]   R² = 0.76   σ̂ = 1.313%   dw = 2.18
AR 1-4 F[4,91] = 1.94   Normality χ²[2] = 1.53   ARCH 1-4 F[4,87] = 0.74
RESET F[1,94] = 0.08   X_i² F[8,86] = 1.36   X_iX_j F[14,80] = 1.05.

This is equation (6) in Hendry and Ericsson (1991b), where its statistical and economic properties are described at length.

As a final restriction, we consider imposing short-run unit homogeneity in prices [reduction (viii)], which corresponds to deleting $\Delta p_t$ in (39). All the associated F statistics are significant at the 1% level, and several at the 0.1% or even 0.01% level. Both σ̂ and the Schwarz criterion increase sharply. Reduction (viii) clearly is invalid. No other reductions from Model 8 are apparent, so we stop.

This subsection has demonstrated that (39) is a valid reduction of the unrestricted fifth-order AD model in Table 6, and so that the residuals in (39) are innovations with respect to the information set in Table 6. Short-run unit homogeneity in prices is not a valid reduction, and its rejection reflects the high information content of the data.

C. Model Replication and Evaluation: Nominal Income in the United Kingdom

This subsection replicates an equation from Friedman and Schwartz (1982) for nominal income in the United Kingdom and, for that equation, calculates statistics from Table 1 to evaluate the equation’s empirical validity.

Marginalization and mis-specification. In turning to model evaluation, we briefly reconsider the effects of invalid marginalization. Direct tests of reduction are available for explicit marginalization (as in Section 4.B above), but much marginalization in empirical work is implicit, with the relevant set of “omitted variables” correspondingly difficult to specify. Even so, some implications of invalid reduction are testable. Specifically, Section 2.A shows that marginalization induces a re-parameterization of the model, so standard tests on those new parameters and on the implied model residuals may have power against invalid marginalization. Put somewhat differently, a test against a specific hypothesis (and so a specific reduction) also may have power against other alternative hypotheses (and so other reductions). Thus, we apply several statistics from Table 1 in this diagnostic, evaluative mode, virtually all of which are from the diagnostic and recursive least-squares graph options menus.

Model replication and evaluation. Replication and evaluation of existing empirical models are fundamental aspects of Hendry’s methodology. First, any new model would need to encompass existing models, and testing encompassing requires estimation of both models (at least implicitly). Second, the type and extent of mis-specification in existing models may suggest how to improve those models.

Friedman and Schwartz (1982, p. 349) present a “final” equation explaining the (log-) level of nominal income in the United Kingdom. Re-estimating that equation, we obtain:

$$
\begin{aligned}
\overline{(p+y)}_j ={}& \underset{(0.09)}{0.38} + \underset{(0.017)}{1.011}\,\bar{m}_j + \underset{(4.0)}{14.3}\,\overline{RN}_j + \underset{(0.29)}{0.54}\,\overline{G(p+y)}_j \\
& - \underset{(0.64)}{1.08}\,W_j - \underset{(3.5)}{19.1}\,S_j
\end{aligned}
\tag{40}
$$

J = 36 [spanning 1874-1973]   σ̂ = 6.11%   dw = 1.33   R² = 0.99996
AR 1-1 F[1,29] = 3.95   Normality χ²[2] = 2.41   ARCH 1-1 F[1,28] = 21.30
RESET F[1,29] = 1.66   X_i² F[12,17] = 1.44   INN; F[1,29] = 7.51.

The data are 36 phase-average observations derived from annual data spanning 1874-1973: the phases are expansions and contractions, dated by NBER reference cycles. Phase-averaged variables are denoted by a superscript bar, and indexed by j. The variables P, Y, and M are the price level, real net national income, and the broad money stock, with lower case denoting logarithms. RN is the differential between the short-term interest rate and the “own-yield” on money, G(p + y) is the growth rate of nominal income, W is a dummy for “postwar adjustment” (Friedman and Schwartz (1982, p. 228)), and S is a data-based dummy for “[a]n upward demand shift, produced by economic depression and war...” (Friedman and Schwartz (1982, p. 281)) during 1921-1955. Equation (40), along with an equation for prices, purports to show that real income does not depend upon money, but prices do, and so nominal income (in (40)) does depend upon money; cf. Friedman and Schwartz (1982, pp. 422, 351). See Friedman and Schwartz (1982) and Hendry and Ericsson (1991a) for further details.

Equation (40) closely replicates Friedman and Schwartz's (1982) coefficient estimates, equation standard error, and R², which are the only statistics that they provide on that specification. Thus, to evaluate their equation, we turn to various standard tests based on the model's residuals and on its estimated parameters.

The (residual-based) statistics listed with (40) detect substantial heteroscedasticity (ARCH) and possible error autocorrelation (dw and AR 1-1). Further, a trend is highly significant when added to (40) (INN;), indicative of omitted variables.

Recursive least squares is a natural method for evaluating the coefficient estimates themselves. Figure 5 graphs the recursive estimates and estimated standard errors of one coefficient, that on the interest rate RN. The recursive estimates correspond to the sequence of coefficient estimates from (24), and these empirical values are notable in their numerical and statistical nonconstancy. When estimated with data through only phase observation 16, the coefficient is negative and insignificant. With the whole sample, the coefficient is +14.3 with a t-ratio of 3.6. Further, the 95% confidence interval for the whole-sample estimate lies entirely outside the 95% confidence interval using the first 16 observations, showing how substantial the nonconstancy is, statistically speaking.
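The idea of recursive estimation can be sketched with a bivariate regression. The code below re-estimates an OLS slope on samples of increasing length. It does so by brute force rather than with the updating formulas that a recursive least-squares routine would normally use, and it is not PCGIVE's implementation. The data are hypothetical, constructed so that the slope shifts part-way through the sample, and the function names are illustrative only.

```python
def ols_simple(x, y):
    """OLS intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def recursive_slopes(x, y, first=3):
    """Slope estimates from samples 1..t for t = first, ..., len(x)."""
    return [ols_simple(x[:t], y[:t])[1] for t in range(first, len(x) + 1)]


# Hypothetical data: the slope shifts from 1 to 3 after the fifth observation.
x = [float(i) for i in range(1, 11)]
y = [xi if i < 5 else 3.0 * xi - 10.0 for i, xi in enumerate(x)]

slopes = recursive_slopes(x, y)
for t, b in enumerate(slopes, start=3):
    print(f"t={t:2d}  slope={b:.3f}")
```

With constant parameters the recursive estimates would settle down as the sample grows. Here they drift upward once the post-break observations enter, which is the kind of pattern that Figure 5 displays for the coefficient on RN.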

The equation standard error σ can be estimated recursively as well. Figure 6 plots the one-step residuals (defined in Section 2.E) and $0 \pm 2\hat{\sigma}_j$, from which it is visually apparent that the residuals are autocorrelated and $\hat{\sigma}_j$ is not constant.

Formally, nonconstancy can be tested by Chow statistics, and recursive estimation provides the basis for sequences of Chow statistics, as described in Section 2.E. Figures 7, 8, and 9 plot the sequences of one-step ahead, break-point (N↓), and fixed-point (N↑) Chow statistics; these parallel the schematics in Figures 1a, 1b, and 1c. “Breaks” are present in observations 25 (1944-1946), 28 (1952-1955), 32 (1962-1965), and possibly 22 (1932-1937). While the timing and size of coefficient breaks may be suggestive of the particular mis-specification present, such thoughts must remain conjectural until a well-specified model is obtained, which would be able to explain the mis-specification of (40) via encompassing. The several observed test rejections are not mutually independent, also clouding inferences on the cause(s) of rejection.
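The three Chow sequences can all be computed from the recursively estimated residual sums of squares. The sketch below assumes the usual F-form definitions, which may differ in detail from those in Section 2.E; `rss` is a hypothetical dictionary mapping the sample end-point t to the RSS from estimation over observations 1 to t, and `k` is the number of regressors.

```python
def one_step_chow(rss, k, t):
    """F(1, t-k-1): does observation t fit the model estimated on 1..t-1?"""
    return (rss[t] - rss[t - 1]) * (t - k - 1) / rss[t - 1]


def break_point_chow(rss, k, t, T):
    """F(T-t+1, t-k-1): forecasts of t..T from the shrinking base sample 1..t-1."""
    return ((rss[T] - rss[t - 1]) / (T - t + 1)) / (rss[t - 1] / (t - k - 1))


def fixed_point_chow(rss, k, M, t):
    """F(t-M+1, M-k-1): forecasts of M..t from the fixed base sample 1..M-1."""
    return ((rss[t] - rss[M - 1]) / (t - M + 1)) / (rss[M - 1] / (M - k - 1))


if __name__ == "__main__":
    # Hypothetical RSS values for a model with k = 2 regressors and T = 13.
    rss = {10: 2.0, 11: 2.2, 12: 2.5, 13: 4.0}
    k, T, M = 2, 13, 11
    for t in range(M, T + 1):
        print(t,
              round(one_step_chow(rss, k, t), 3),
              round(break_point_chow(rss, k, t, T), 3),
              round(fixed_point_chow(rss, k, M, t), 3))
```

Dividing each statistic by the 5% critical value of the corresponding F distribution would give scaled sequences of the kind plotted in Figures 7 to 9, where values above unity indicate rejection.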

Figure 5. Equation (40): recursive estimates of the coefficient on the interest rate RN for a model of phase-average nominal income in the United Kingdom, with ±2 estimated standard errors. [Horizontal axis: phase observation.]

Figure 6. Equation (40): one-step residuals and the corresponding calculated equation standard errors for a model of phase-average nominal income in the United Kingdom. [Horizontal axis: phase observation.]

Figure 7. Equation (40): the sequence of one-step ahead Chow statistics over phase observations 17-36 (1924-1973) for a model of phase-average nominal income in the United Kingdom, with the statistics scaled by their one-off 5% critical values. [Horizontal axis: phase observation.]

Figure 8. Equation (40): the sequence of break-point (N↓) Chow statistics over phase observations 17-36 (1924-1973) for a model of phase-average nominal income in the United Kingdom, with the statistics scaled by their one-off 5% critical values. [Horizontal axis: phase observation.]

Figure 9. Equation (40): the sequence of fixed-point (N↑) Chow statistics over phase observations 17-36 (1924-1973) for a model of phase-average nominal income in the United Kingdom, with the statistics scaled by their one-off 5% critical values. [Horizontal axis: phase observation.]

D. Super Exogeneity: Consumers’ Expenditure in Venezuela

In our final empirical example, we test for the super exogeneity of incomes and prices in Campos and Ericsson’s (1988) conditional model of consumers’ expenditure for Venezuela. We test for super exogeneity both indirectly via tests of constancy and directly via tests of invariance. We find super exogeneity, thereby refuting Lucas’s (1976) critique, and also Hall’s (1978) hypothesis.

The conditional ECM. Campos and Ericsson (1988) develop a constant, data-coherent error-correction model of consumers’ expenditure for Venezuela. In the model’s long-run static solution, the consumption-income ratio is a function of the liquidity-income ratio, as in Hendry and von Ungern-Sternberg’s (1981) model of consumers’ expenditure in the United Kingdom. The estimated equation for Venezuela is:

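$$
\Delta c_t = \underset{[0.050]}{0.457}\,(\Delta_2 y^*_t/2) \;-\; \underset{[0.031]}{0.270}\,(\Delta p + \Delta^2 p)_t \;-\; \underset{[0.014]}{0.142}\,\bigl[(c - y^*) - \tfrac{1}{2}(m - y^*)\bigr]_{t-1} \;+\; \underset{[0.004]}{0.019} \;+\; \underset{[0.002]}{0.026}\,D_{0t} \tag{41}
$$

$T = 16$ [1970-1985], $R^2 = 0.97$, $\hat{\sigma} = 0.916\%$.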

The series C, Y, and M are real (1968 Bolivares), per capita, annual values of consumers’ expenditure on non-durables and services, national disposable income, and end-of-period liquidity (M2). The price deflator (P) is for all consumers’ expenditure. To account for the “inflation tax” from holding liquid assets, the income measure $Y^*$ used in (41) is $Y_t - (\Delta p_t) M_{t-1}$; cf. Hendry and von Ungern-Sternberg (1981). $D_{0t}$ is a +1/−1 dummy for 1970-1971 to account for apparent measurement errors in consumers’ expenditure for those years. Lower case denotes logarithms. Campos and Ericsson (1988, Appendices A and B) give details of the data and sources.
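As a small worked illustration of the inflation-tax adjustment, the sketch below computes the adjusted income measure from levels of income, liquidity, and the price deflator. The function name and the numbers are hypothetical and are not the Venezuelan data.

```python
from math import exp, log


def adjusted_income(Y, M, P):
    """Return Y*_t = Y_t - (Delta log P_t) * M_{t-1} for t = 1..len(Y)-1.

    Y, M, P are equal-length lists of levels; the first observation is lost
    because both the inflation rate and lagged liquidity need period t-1.
    """
    out = []
    for t in range(1, len(Y)):
        dp = log(P[t]) - log(P[t - 1])   # inflation as the change in log prices
        out.append(Y[t] - dp * M[t - 1])
    return out


if __name__ == "__main__":
    # Hypothetical example: 10% (log) inflation erodes liquidity of 50.
    Y = [100.0, 110.0]
    M = [50.0, 55.0]
    P = [1.0, exp(0.1)]
    print(adjusted_income(Y, M, P))
```

With zero inflation the adjustment vanishes and the adjusted measure coincides with measured income.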

Testing super exogeneity. As discussed in Section 2.A, super exogeneity requires weak exogeneity and the invariance of the parameters of interest ($\psi$, which is a function of the conditional model’s parameters $\lambda_1$) to changes in the parameters of the marginal process ($\lambda_{2t}$). Thus, two common tests for super exogeneity are as follows.

(i) Establish the constancy of the parameters in the conditional model ($\lambda_{1t} = \lambda_1$) and the nonconstancy of those in the marginal model ($\lambda_{2t}$ varies over time). Because $\lambda_1$ is constant but $\lambda_{2t}$ is not, $\lambda_1$ is invariant to $\lambda_{2t}$, and so super exogeneity holds; cf. Hendry (1988c).

(ii) Having established (i), further develop the marginal model until it is empirically constant. For instance, by adding dummies and/or other variables, model the way in which $\lambda_{2t}$ varies over time. Then test for the significance of those dummies and/or other variables in the conditional model. Insignificance in the conditional model demonstrates invariance of the conditional model’s parameters $\lambda_1$ to the changes in the marginal process; cf. Engle and Hendry (1989).

Campos and Ericsson (1988) demonstrate the constancy and data-coherency of the conditional model (41). Thus, we take those properties as given, and turn to modeling the marginal processes of prices and incomes, which were conditioned upon (contemporaneously) in (41).

Marginal models for prices and incomes. Starting with univariate second-order autoregressive processes for $p_t$ and $y^*_t$, and simplifying, we obtain the following random-walk specifications.²⁰

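$$
p_t = p_{t-1} + \underset{(0.013)}{0.086} \tag{42}
$$

$T = 16$ [1970-1985], $R^2 = 0.99$, $\hat{\sigma} = 5.087\%$, $dw = 1.26$, INN $F[2,13] = 1.47$

$$
y^*_t = y^*_{t-1} + \underset{(0.025)}{0.014} \tag{43}
$$

$T = 16$ [1970-1985], $R^2 = 0.69$, $\hat{\sigma} = 9.982\%$, $dw = 1.46$, INN $F[2,13] = 2.52$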

The tests of reduction from the AR(2) processes (INN) are not rejected. However, both models are highly nonconstant, as plots of the recursively estimated equation standard errors and break-point Chow statistics reveal in Figures 10a-b and 11a-b. Major breaks appear in 1974, 1979-1980, and 1984 for prices, and in 1974 and 1982-1983 for income.
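With the unit coefficient on the lagged level imposed, each marginal model reduces to a regression of the first difference on a constant, so the drift estimate is the sample mean of the differences and its standard error is the equation standard error divided by the square root of T. The sketch below illustrates that arithmetic; the function names are hypothetical, and the only figures taken from the text are the reported equation standard errors and T = 16.

```python
from math import sqrt


def random_walk_with_drift(levels):
    """Estimate x_t = x_{t-1} + mu + e_t from a list of (log-)levels.

    Returns (mu_hat, se_mu, sigma_hat, T), where T is the number of
    first differences used in estimation.
    """
    d = [b - a for a, b in zip(levels, levels[1:])]
    T = len(d)
    mu = sum(d) / T
    sigma = sqrt(sum((v - mu) ** 2 for v in d) / (T - 1))
    return mu, sigma / sqrt(T), sigma, T


def drift_se(sigma_hat, T):
    """Standard error of the drift in a regression of differences on a constant."""
    return sigma_hat / sqrt(T)


if __name__ == "__main__":
    # Reported equation standard errors for (42) and (43), with T = 16.
    print(round(drift_se(0.05087, 16), 3))
    print(round(drift_se(0.09982, 16), 3))
```

Under these assumptions the implied standard errors round to 0.013 and 0.025, in line with the values reported beneath the drift terms in (42) and (43).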

Different approaches could be taken to develop more constant marginal processes, and knowledge of the history and institutions of the Venezuelan economy is clearly helpful. Campos and Ericsson (1988) include additional variables: U.S. prices in the price equation, and petroleum exports in the income equation. Here, we add dummies to (42) and (43) with the dummies proxying for the shifts in $\lambda_{2t}$ over time. The dummies are:

D7980: +1 in 1979 and 1980, for the rise in world inflation rates;

D84: +1 in 1984, for a particularly inflationary year in Venezuela at the beginning of the debt crisis;

D74/85: +1 for 1974 onwards, for the influence of OPEC;

D74: +1 for 1974 only, for the incredibly high (40% real, per capita) growth in income for Venezuela in this year and for higher inflation; and

D8283: +1 in 1982 and 1983, for the beginning of the debt crisis;

with observations other than those specified being zero. The dummies D7980, D84, and D74/85 were included in the price equation (42); and D74/85, D74, and D8283 in the incomes equation (43). With those dummies, both equations are empirically constant and have much smaller values of $\hat{\sigma}$ (1.559% and 3.012% respectively), the latter indicating the statistical importance of the dummies in the equations. With this information, we consider the super exogeneity of prices and incomes in the conditional model (41).
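The five dummies are fully determined by the years listed above and the 1970-1985 sample, so they can be constructed mechanically. The following sketch does so; the variable and function names are illustrative only.

```python
YEARS = list(range(1970, 1986))  # estimation sample, 1970-1985

# Years in which each dummy equals +1; it is zero otherwise.
DUMMY_YEARS = {
    "D7980": {1979, 1980},
    "D84": {1984},
    "D74/85": set(range(1974, 1986)),
    "D74": {1974},
    "D8283": {1982, 1983},
}

# Allocation of dummies to the two marginal models, as stated in the text.
PRICE_EQUATION = ["D7980", "D84", "D74/85"]
INCOME_EQUATION = ["D74/85", "D74", "D8283"]


def build_dummies(years=YEARS, spec=DUMMY_YEARS):
    """Return a dict mapping each dummy name to a 0/1 list aligned with years."""
    return {name: [1 if y in on else 0 for y in years] for name, on in spec.items()}


if __name__ == "__main__":
    dummies = build_dummies()
    for name in DUMMY_YEARS:
        print(name.ljust(7), "".join(str(v) for v in dummies[name]))
```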

Testing super exogeneity: (i). Super exogeneity follows from the constancy of the conditional model (41) and the nonconstancy of the marginal models (42) and (43).

20 Equations (42) and (43) appear in terms of the (log-) levels of prices and income to emphasize the equations’ origins from AR(2) models and to display the unit restrictions on the lagged dependent variables explicitly.

Figure 10a. Equation (42): one-step residuals and the corresponding calculated equation standard errors for a time-series model of prices in Venezuela. [Horizontal axis: years 1974-1986.]

Figure 10b. Equation (42): the sequence of break-point (N↓) Chow statistics over 1973-1985 for a time-series model of prices in Venezuela, with the statistics scaled by their one-off 5% critical values. [Horizontal axis: years 1974-1986.]

Figure 11a. Equation (43): one-step residuals and the corresponding calculated equation standard errors for a time-series model of income in Venezuela. [Horizontal axis: years 1974-1986.]

Figure 11b. Equation (43): the sequence of break-point (N↓) Chow statistics over 1973-1985 for a time-series model of income in Venezuela, with the statistics scaled by their one-off 5% critical values. [Horizontal axis: years 1974-1986.]

Testing super exogeneity: (ii). Adding the five new dummies to (41), we obtain F[5,6] = 1.08 with a p-value of 0.46. Further, no subset of dummies is significant at the 5% level.
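The variable-addition test is a standard F-test of the restricted model against the model augmented by the added regressors. A minimal sketch follows, assuming the usual F-form; the function name and the RSS values are hypothetical, while T = 16, the five regressors in (41), and the five added dummies are taken from the text.

```python
def variable_addition_f(rss_restricted, rss_unrestricted, T, k, q):
    """F statistic and degrees of freedom for adding q regressors.

    rss_restricted   -- RSS of the original model with k regressors
    rss_unrestricted -- RSS after adding the q extra regressors
    Returns (F, (q, T - k - q)).
    """
    df2 = T - k - q
    f = ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / df2)
    return f, (q, df2)


if __name__ == "__main__":
    # Equation (41): T = 16, k = 5 regressors (including the constant and the
    # dummy), q = 5 added dummies. The RSS values here are made up.
    f, df = variable_addition_f(10.0, 5.0, T=16, k=5, q=5)
    print(round(f, 3), df)
```

The degrees of freedom (5, 6) returned for these settings agree with the F[5,6] reported in the text.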

The power of these tests is also of interest. Following Engle and Hendry (1989), we mis-specify (41) and re-test, with rejection reflecting power. Because of the marginalization from mis-specification, the parameters in the mis-specified conditional model generally are not invariant to changes in $\lambda_{2t}$, even if the parameters $\lambda_1$ in the original conditional model are. Thus, tests of super exogeneity in the mis-specified model provide some measure of power.

We consider six mis-specifications, each of which is tested for super exogeneity via addition of the five new dummies. The first three mis-specifications are the deletion from (41) of current-dated income growth $\Delta y^*_t$, of current-dated inflation $\Delta p_t$, and of $\Delta p_t$ and $\Delta y^*_t$ jointly. To properly exclude these variables from (41), we note that $\Delta_2 y^*_t = \Delta y^*_t + \Delta y^*_{t-1}$ and $(\Delta p + \Delta^2 p)_t = 2\Delta p_t - \Delta p_{t-1}$. The corresponding F statistics and p-values are F[5,6] = 14.56 [0.003], F[5,6] = 5.79 [0.027], and F[5,6] = 7.77 [0.013]. For the remaining three mis-specifications, we delete from (41) either $\Delta_2 y^*_t/2$, or $(\Delta p + \Delta^2 p)_t$, or both $\Delta_2 y^*_t/2$ and $(\Delta p + \Delta^2 p)_t$. Their F statistics and p-values are F[5,7] = 4.76 [0.032], F[5,7] = 4.82 [0.031], and F[5,8] = 5.74 [0.015]. In all cases, rejection is strong.
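The two differencing identities used above, and the degrees of freedom of the six tests, can be checked directly. The sketch below verifies the identities on arbitrary series and derives the degrees of freedom from the number of regressors retained in each mis-specified model; the labels and the regressor counts are our reading of the text, not output from PC-GIVE.

```python
def d(x):
    """First difference: element i equals x[i+1] - x[i]."""
    return [b - a for a, b in zip(x, x[1:])]


def check_identities(y, p, tol=1e-12):
    """Check Delta_2 y_t = Delta y_t + Delta y_{t-1} and
    (Delta p + Delta^2 p)_t = 2 Delta p_t - Delta p_{t-1}."""
    dy, dp = d(y), d(p)
    d2y = [y[t] - y[t - 2] for t in range(2, len(y))]   # two-period difference
    ddp = d(dp)                                         # second difference
    income_ok = all(abs(d2y[i] - (dy[i + 1] + dy[i])) < tol
                    for i in range(len(d2y)))
    prices_ok = all(abs((dp[i + 1] + ddp[i]) - (2 * dp[i + 1] - dp[i])) < tol
                    for i in range(len(ddp)))
    return income_ok and prices_ok


T, Q = 16, 5  # sample size and number of added dummies

# Regressors retained in each mis-specified version of (41). Dropping only a
# current-dated term leaves its lagged counterpart in, so k stays at 5.
RETAINED = {
    "drop current income growth": 5,
    "drop current inflation": 5,
    "drop both current terms": 5,
    "drop two-period income growth": 4,
    "drop inflation term": 4,
    "drop both terms": 3,
}

DOF = {name: (Q, T - k - Q) for name, k in RETAINED.items()}

if __name__ == "__main__":
    print(check_identities([1.0, 1.5, 1.2, 2.0, 2.6], [0.0, 0.3, 0.35, 0.9, 1.0]))
    for name, dof in DOF.items():
        print(name, dof)
```

The resulting degrees of freedom, (5, 6) for the first three cases and (5, 7), (5, 7), and (5, 8) for the last three, match those reported with the F statistics.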

These results reflect the power of the tests and (relatedly) the high information content in the data. Even though the number of observations is small, the information per observation is high, with the product of the two being relevant for the information matrix, and so for powers of tests. These data properties appear to characterize data from many developing economies, which have been subject to substantial shocks over the last two decades. While the resulting high per-observation variability in the data may represent major hardships for these economies, the econometric implications are more promising. For example, Campos and Ericsson (1988) show that their short annual Venezuelan data set contains more information for the consumption function than forty years of comparable quarterly postwar U.S. data.

To summarize, the marginal processes for prices and income are nonconstant over the sample period, yet the empirical model of consumers’ expenditure conditional on observed prices and income has constant parameters. Thus, the parameters in (41) are invariant to the class of interventions which occurred in sample: prices and income are super exogenous for those parameters. Super exogeneity is also shown by “variable-addition” tests, with dummies proxying for the nonconstancies in the price and incomes equations being insignificant when added to the consumption function.

5. Summary and Conclusions

David Hendry’s contributions to econometric methodology have notably influenced current empirical practice. This paper summarizes and coalesces his methodological writings, describes the way in which Hendry has implemented his econometric methodology in PC-GIVE, and illustrates several central aspects of the methodology via new substantive empirical examples analyzed with PC-GIVE. A cornerstone of Hendry’s methodology is the derived nature of empirical models via data transformations and marginalizations, from which follow: the dual role of statistics in model evaluation and design, a range of model classes, model estimation, and modeling strategies. The structure of PC-GIVE reflects these issues.

Empirically, we find cointegration among narrow money, prices, total final expenditure, and interest rates in the United Kingdom, and obtain a data-coherent, parsimonious, economically interpretable model via a sequence of simplifications from an unrestricted autoregressive distributed lag model. We also evaluate Friedman and Schwartz’s (1982) model of nominal income in the United Kingdom with numerous diagnostic tests, and find their model wanting. Rejection by those tests indicates room for improved specification. Finally, applying new tests from Engle and Hendry (1989), we find income and prices to be super exogenous in Campos and Ericsson’s (1988) conditional model of consumers’ expenditure in Venezuela. The tests’ observed power with mis-specified models is promising evidence for modeling data from developing countries and discerning between “good” and “bad” models of those data.

Appendix A. Chronological Bibliography for David F. Hendry²¹

1966

E Hendry, D.F. (1966) “Survey of Student Income and Expenditure at Aberdeen University, 1963-64 and 1964-65”, Scottish Journal of Political Economy, 13, 3, 363-376.

1970

ET Hendry, D.F. (1970) The Estimation of Economic Models with Auto-regressive Errors, Ph.D. thesis, London: London School of Economics, University of London.

1971

ET Hendry, D.F. (1971) “Maximum Likelihood Estimation of Systems of Simultaneous Regression Equations with Errors Generated by a Vector Autoregressive Process”, International Economic Review, 12, 2, 257-272; and (1974) “Maximum Likelihood Estimation of Systems of Simultaneous Regression Equations with Errors Generated by a Vector Autoregressive Process: A Correction”, International Economic Review, 15, 1, 260.

1972

M Hendry, D.F. and P.K. Trivedi (1972) “Maximum Likelihood Estimation of Difference Equations with Moving Average Errors: A Simulation Study”, Review of Economic Studies, 39, 2, 117-145.

1973

M Hendry, D.F. (1973) “On Asymptotic Theory and Finite Sample Experiments”, Economica, 40, 158, 210-217.

21 This bibliography is chronological, but within each year articles are listed following standard journal citation format. For ease of use, that includes giving the original order of authors on papers with co-authors. No book reviews are included except Hendry (1973), but discussion papers appear if they are cited in the text or forthcoming in a publication.

The nomenclature in the left-hand column is as follows:

E Empirical Study
M Monte Carlo Experimentation
P Computer Program
T Econometric Theory (or Methodology),

and is used to denote the focus of each article as a guide to the reader. Often, Hendry illustrates methodology with an empirical example or develops new methodology in the course of empirical research, so these classifications are not categorical. Also, several publications emphasize teaching and numerical techniques, which do not fall neatly into any of these categories.

1974

Hendry, D.F. (1974) “Stochastic Specification in an Aggregate Demand Model of the United Kingdom”, Econometrica, 42, 3, 559-578.

Hendry, D.F. and R.W. Harrison (1974) “Monte Carlo Methodology and the Small Sample Behaviour of Ordinary and Two-stage Least Squares”, Journal of Econometrics, 2, 2, 151-174.

1975

Hendry, D.F. (1975) “The Consequences of Mis-specification of Dynamic Structure, Autocorrelation, and Simultaneity in a Simple Model with an Application to the Demand for Imports”, Chapter 11 in G.A. Renton (ed.) Modelling the Economy, London: Heinemann Educational Books, 286-322 (with discussion).

1976

Hendry, D.F. (1976) “The Structure of Simultaneous Equations Estimators”, Journal of Econometrics, 4, 1, 51-88.

Hendry, D.F. and A.R. Tremayne (1976) “Estimating Systems of Dynamic Reduced Form Equations with Vector Autoregressive Errors”, International Economic Review, 17, 2, 463-471.

1977

Hendry, D.F. (1977) “Comments on Granger-Newbold’s ‘Time Series Approach to Econometric Model Building’ and Sargent-Sims’ ‘Business Cycle Modeling Without Pretending to Have Too Much A Priori Economic Theory’” in C.A. Sims (ed.) New Methods in Business Cycle Research: Proceedings from a Conference, Minneapolis, Minnesota: Federal Reserve Bank of Minneapolis, 183-202.

Hendry, D.F. and G.J. Anderson (1977) “Testing Dynamic Specification in Small Simultaneous Systems: An Application to a Model of Building Society Behavior in the United Kingdom”, Chapter 8C in M.D. Intriligator (ed.) Frontiers of Quantitative Economics, Amsterdam: North-Holland, Volume 3A, 361-383.

Hendry, D.F. and F. Srba (1977) “The Properties of Autoregressive Instrumental Variables Estimators in Dynamic Systems”, Econometrica, 45, 4, 969-990.

1978

Davidson, J.E.H., D.F. Hendry, F. Srba, and S. Yeo (1978) “Econometric Modelling of the Aggregate Time-series Relationship between Consumers’ Expenditure and Income in the United Kingdom”, Economic Journal, 88, 352, 661-692.

Hendry, D.F. and G.E. Mizon (1978) “Serial Correlation as a Convenient Simplification, Not a Nuisance: A Comment on a Study of the Demand for Money by the Bank of England”, Economic Journal, 88, 351, 549-563.

1979

Hendry, D.F. (1979a) “The Behaviour of Inconsistent Instrumental Variables Estimators in Dynamic Systems with Autocorrelated Errors”, Journal of Econometrics, 9, 3, 295-314.

Hendry, D.F. (1979b) “Predictive Failure and Econometric Modelling in Macroeconomics: The Transactions Demand for Money”, Chapter 9 in P. Ormerod (ed.) Economic Modelling, London: Heinemann Educational Books, 217-242.

1980

Hendry, D.F. (1980) “Econometrics - Alchemy or Science?”, Economica, 47, 188, 387-406.

Hendry, D.F. and F. Srba (1980) “AUTOREG: A Computer Program Library for Dynamic Econometric Models with Autoregressive Errors”, Journal of Econometrics, 12, 1, 85-102.

Mizon, G.E. and D.F. Hendry (1980) “An Empirical Application and Monte Carlo Analysis of Tests of Dynamic Specification”, Review of Economic Studies, 47, 1, 21-45.

1981

Davidson, J.E.H. and D.F. Hendry (1981) “Interpreting Econometric Evidence: The Behaviour of Consumers’ Expenditure in the UK”, European Economic Review, 16, 1, 177-192.

Hendry, D.F. (1981a) “Econometric Evidence in the Appraisal of Monetary Policy”, Appendix 1 in Monetary Policy, Third Report from the Treasury and Civil Service Committee, Session 1980-81, House of Commons, London: Her Majesty’s Stationery Office, Volume 3, 1-21.

Hendry, D.F. (1981b) “Comment on HM Treasury’s Memorandum, ‘Background to the Government’s Economic Policy’”, Appendix 4 in Monetary Policy, Third Report from the Treasury and Civil Service Committee, Session 1980-81, House of Commons, London: Her Majesty’s Stationery Office, Volume 3, 94-96.

Hendry, D.F. and J.-F. Richard (1981) “Model Formulation to Simplify Selection When Specification Is Uncertain” (Abstract), Journal of Econometrics, 16, 1, 159.

Hendry, D.F. and T. von Ungern-Sternberg (1981) “Liquidity and Inflation Effects on Consumers’ Expenditure”, Chapter 9 in A.S. Deaton (ed.) Essays in the Theory and Measurement of Consumer Behaviour, Cambridge: Cambridge University Press, 237-260.

1982

Hendry, D.F. (1982a) “Comment: Whither Disequilibrium Econometrics?”, Econometric Reviews, 1, 1, 65-70.

Hendry, D.F. (1982b) “A Reply to Professors Maasoumi and Phillips”, Journal of Econometrics, 19, 2/3, 203-213.

Hendry, D.F. (1982c) “The Role of Econometrics in Macro-economic Analysis”, UK Economic Prospect, Autumn 1982, 26-38.

Hendry, D.F. and J.-F. Richard (1982) “On the Formulation of Empirical Models in Dynamic Econometrics”, Journal of Econometrics, 20, 1, 3-33; reprinted as Chapter 14 in C.W.J. Granger (ed.) (1990) Modelling Economic Series: Readings in Econometric Methodology, Oxford: Oxford University Press, 304-334.

1983

Engle, R.F., D.F. Hendry, and J.-F. Richard (1983) “Exogeneity”, Econometrica, 51, 2, 277-304.

Hendry, D.F. (1983a) “Comment”, Econometric Reviews, 2, 1, 111-114.

Hendry, D.F. (1983b) “Econometric Modelling: The ‘Consumption Function’ in Retrospect”, Scottish Journal of Political Economy, 30, 3, 193-220.

Hendry, D.F. (1983c) “On Keynesian Model Building and the Rational Expectations Critique: A Question of Methodology”, Cambridge Journal of Economics, 7, 1, 69-75.

Hendry, D.F. and N.R. Ericsson (1983) “Assertion without Empirical Basis: An Econometric Appraisal of ‘Monetary Trends in ... the United Kingdom’ by Milton Friedman and Anna Schwartz” in Monetary Trends in the United Kingdom, Bank of England Panel of Academic Consultants, Panel Paper No. 22, London: Bank of England, 45-101 (with additional references); substantially revised and published in Hendry and Ericsson (1991a), as cited below.

Hendry, D.F. and R.C. Marshall (1983) “On High and Low R² Contributions”, Oxford Bulletin of Economics and Statistics, 45, 3, 313-316.

Hendry, D.F. and J.-F. Richard (1983) “The Econometric Analysis of Economic Time Series”, International Statistical Review, 51, 2, 111-163 (with discussion).

1984

Anderson, G.J. and D.F. Hendry (1984) “An Econometric Model of United Kingdom Building Societies”, Oxford Bulletin of Economics and Statistics, 46, 3, 185-210.

Hendry, D.F. (1984a) “Econometric Modelling of House Prices in the United Kingdom”, Chapter 8 in D.F. Hendry and K.F. Wallis (eds.) Econometrics and Quantitative Economics, Oxford: Basil Blackwell, 211-252.

Hendry, D.F. (1984b) “Monte Carlo Experimentation in Econometrics”, Chapter 16 in Z. Griliches and M.D. Intriligator (eds.) Handbook of Econometrics, Amsterdam: North-Holland, Volume 2, 937-976.


Hendry, D.F. (1984c) “Present Position and Potential Developments: Some Personal Views [on] Time-series Econometrics”, Journal of the Royal Statistical Society, Series A, 147, 2, 327-339 (with discussion).

Hendry, D.F., A.R. Pagan, and J.D. Sargan (1984) “Dynamic Specification”, Chapter 18 in Z. Griliches and M.D. Intriligator (eds.) Handbook of Econometrics, Amsterdam: North-Holland, Volume 2, 1023-1100.

Hendry, D.F. and K.F. Wallis (eds.) (1984a) Econometrics and Quantitative Economics, Oxford: Basil Blackwell.

Hendry, D.F. and K.F. Wallis (1984b) “Editors’ Introduction”, Chapter 1 in D.F. Hendry and K.F. Wallis (eds.) Econometrics and Quantitative Economics, Oxford: Basil Blackwell, 1-12.

1985

Engle, R.F., D.F. Hendry, and D. Trumble (1985) “Small-sample Properties of ARCH Estimators and Tests”, Canadian Journal of Economics, 18, 1, 66-93.

Ericsson, N.R. and D.F. Hendry (1985) “Conditional Econometric Modeling: An Application to New House Prices in the United Kingdom”, Chapter 11 in A.C. Atkinson and S.E. Fienberg (eds.) A Celebration of Statistics: The ISI Centenary Volume, New York: Springer-Verlag, 251-285.

Hendry, D.F. (1985) “Monetary Economic Myth and Econometric Reality”, Oxford Review of Economic Policy, 1, 1, 72-84.

Hendry, D.F. and N.R. Ericsson (1985) “Assertion without Empirical Basis: An Econometric Appraisal of Monetary Trends in ... the United Kingdom by Milton Friedman and Anna J. Schwartz”, International Finance Discussion Paper No. 270, Washington, D.C.: Board of Governors of the Federal Reserve System.

1986

Banerjee, A., J.J. Dolado, D.F. Hendry, and G.W. Smith (1986) “Exploring Equilibrium Relationships in Econometrics through Static Models: Some Monte Carlo Evidence”, Oxford Bulletin of Economics and Statistics, 48, 3, 253-277.

Chong, Y.Y. and D.F. Hendry (1986) “Econometric Evaluation of Linear Macroeconomic Models”, Review of Economic Studies, 53, 4, 671-690; reprinted as Chapter 17 in C.W.J. Granger (ed.) (1990) Modelling Economic Series: Readings in Econometric Methodology, Oxford: Oxford University Press, 384-410.

Hendry, D.F. (1986a) “Econometric Modelling with Cointegrated Variables: An Overview”, Oxford Bulletin of Economics and Statistics, 48, 3, 201-212.

Hendry, D.F. (ed.) (1986b) Economic Modelling with Cointegrated Variables, Oxford Bulletin of Economics and Statistics, Special Issue, 48, 3.

Hendry, D.F. (1986c) “Empirical Modeling in Dynamic Econometrics”, Applied Mathematics and Computation, 20, 3/4, 201-236.

Hendry, D.F. (1986d) “An Excursion into Conditional Varianceland”, Econometric Reviews, 5, 1, 63-69.

Hendry, D.F. (1986e) “On the Credibility of Econometric Evidence”, Walras-Bowley Lecture, 1986 North American Econometric Society Meetings; Econometrica, forthcoming.

Hendry, D.F. (1986f) “The Role of Prediction in Evaluating Econometric Models”, Proceedings of the Royal Society of London, Series A, 407, 25-34.

Hendry, D.F. (1986g) “Using PC-GIVE in Econometrics Teaching”, Oxford Bulletin of Economics and Statistics, 48, 1, 87-98.

1987

Florens, J.-P., D.F. Hendry, and J.-F. Richard (1987) “Parsimonious Encompassing: An Application to Non-nested Hypotheses and Hausman Specification Tests”, Discussion Paper No. 87-09, Durham, North Carolina: Institute of Statistics and Decision Sciences, Duke University.

Hendry, D.F. (1987a) “Econometric Methodology: A Personal Perspective”, Chapter 10 in T.F. Bewley (ed.) Advances in Econometrics: Fifth World Congress, Cambridge: Cambridge University Press, Volume 2, 29-48.

Hendry, D.F. (1987b) “Econometrics in Action”, Empirica (Austrian Economic Papers), 2/87, 135-156.

Hendry, D.F. (1987c) PC-GIVE: An Interactive Menu-driven Econometric Modelling Program for IBM-compatible PC’s, Version 5.0, Oxford: Institute of Economics and Statistics and Nuffield College, University of Oxford.

Hendry, D.F. and N.R. Ericsson (1987) “Assertion without Empirical Basis: An Econometric Appraisal of Monetary Trends ... in the United Kingdom by Milton Friedman and Anna J. Schwartz”, Applied Economics Discussion Paper No. 25, Oxford: Institute of Economics and Statistics, University of Oxford.

Hendry, D.F. and A.J. Neale (1987) “Monte Carlo Experimentation using PC-NAIVE” in T.B. Fomby and G.F. Rhodes, Jr. (eds.) Advances in Econometrics, Greenwich, Connecticut: JAI Press, Volume 6, 91-125.

1988

Campos, J., N.R. Ericsson, and D.F. Hendry (1988) “Comment on Telser”, Journal of the American Statistical Association, 83, 402, 581.

Hendry, D.F. (1988a) “Assessing Empirical Evidence in Macro-econometrics with an Application to Consumers’ Expenditure in France”, mimeo, Oxford: Nuffield College; forthcoming in A. Vercelli and N. Dimitri (eds.) Macroeconomics: A Survey of Research Strategies, Oxford: Oxford University Press.

Hendry, D.F. (1988b) “Encompassing”, National Institute Economic Review, 3/88, 125, August, 88-92.

Hendry, D.F. (1988c) “The Encompassing Implications of Feedback versus Feedforward Mechanisms in Econometrics”, Oxford Economic Papers, 40, 1, 132-149.

Hendry, D.F. (1988d) “Some Foreign Observations on Macro-economic Model Evaluation Activities at INSEE-DP”, Chapter 6 in Groupes d'Etudes Macroeconometriques Concertées, Paris: INSEE, 71-106.

Hendry, D.F. and A.J. Neale (1988) “Interpreting Long-run Equilibrium Solutions in Conventional Macro Models: A Comment”, Economic Journal, 98, 392, 808-817.

Hendry, D.F., A.J. Neale, and F. Srba (1988) “Econometric Analysis of Small Linear Systems Using PC-FIML”, Journal of Econometrics, 38, 1/2, 203-226.

1989

Engle, R.F. and D.F. Hendry (1989) “Testing Super Exogeneity and Invariance”, Discussion Paper No. 89-51, San Diego, California: Department of Economics, University of California at San Diego.

Ericsson, N.R. and D.F. Hendry (1989) “Encompassing and Rational Expectations: How Sequential Corroboration Can Imply Refutation”, International Finance Discussion Paper No. 354, Washington, D.C.: Board of Governors of the Federal Reserve System.

Favero, C. and D.F. Hendry (1989) “Testing the Lucas Critique: A Review”, mimeo, Oxford: Nuffield College.

Hendry, D.F. (1989a) “Comment”, Econometric Reviews, 8, 1, 111-121.

Hendry, D.F. (1989b) PC-GIVE: An Interactive Econometric Modelling System, Version 6.0/6.01, Oxford: Institute of Economics and Statistics and Nuffield College, University of Oxford.

Hendry, D.F. and G.E. Mizon (1989) “Evaluating Dynamic Econometric Models by Encompassing the VAR”, mimeo, Oxford: Nuffield College; forthcoming in P.C.B. Phillips and V.B. Hall (eds.) Models, Methods and Applications of Econometrics.

Hendry, D.F. and M.S. Morgan (1989) “A Re-analysis of Confluence Analysis”, Oxford Economic Papers, 41, 1, 35-52.

Hendry, D.F. and J.-F. Richard (1989) “Recent Developments in the Theory of Encompassing”, Chapter 12 in B. Cornet and H. Tulkens (eds.) Contributions to Operations Research and Economics: The Twentieth Anniversary of CORE, Cambridge, Massachusetts: MIT Press, 393-440.

Hendry, D.F., A. Spanos, and N.R. Ericsson (1989) “The Contributions to Econometrics in Trygve Haavelmo’s The Probability Approach in Econometrics”, Sosialøkonomen, 43, 11, 12-17.

1990

Campos, J., N.R. Ericsson, and D.F. Hendry (1990) “An Analogue Model of Phase-averaging Procedures”, Journal of Econometrics, 43, 3, 275-292.

Hendry, D.F., E.E. Leamer, and D.J. Poirier (1990) “A Conversation on Econometric Methodology”, Econometric Theory, 6, 2, 171-261.

Hendry, D.F. and G.E. Mizon (1990) “Procrustean Econometrics: Or Stretching and Squeezing Data”, Chapter 7 in C.W.J. Granger (ed.) Modelling Economic Series: Readings in Econometric Methodology, Oxford: Oxford University Press, 121-136.

Hendry, D.F., J.N.J. Muellbauer, and A. Murphy (1990) “The Econometrics of DHSY”, Chapter 13 in J.D. Hey and D. Winch (eds.) A Century of Economics: 100 Years of the Royal Economic Society and the Economic Journal, Oxford: Basil Blackwell, 298-334.

Hendry, D.F. and A.J. Neale (1990) PC-NAIVE: An Interactive Program for Monte Carlo Experimentation in Econometrics, Version 6.01, Oxford: Institute of Economics and Statistics and Nuffield College, University of Oxford (documentation by D.F. Hendry, A.J. Neale, and N.R. Ericsson).

Hendry, D.F., with D. Qin and C. Favero (1990) Lectures on Econometric Methodology, Oxford: Oxford University Press, forthcoming.

Hendry, D.F. and J.-F. Richard (1990) “Likelihood Evaluation for Dynamic Latent Variables Models”, mimeo, Oxford: Nuffield College; forthcoming in H.M. Amman, D.A. Belsley, and C.S. Pau (eds.) Computation in Economics and Econometrics, Dordrecht, The Netherlands: Kluwer.

1991

Baba, Y., D.F. Hendry, and R.M. Starr (1991) “The Demand for M1 in the U.S.A., 1960-1988”, Review of Economic Studies, 58, in press.

Banerjee, A., J.J. Dolado, J.K. Galbraith, and D.F. Hendry (1991) Equilibrium, Error Correction and Cointegration in Econometrics, Oxford: Oxford University Press, forthcoming.

Hendry, D.F. (1991a) Econometrics: Alchemy or Science?: Essays in Econometric Methodology, Oxford: Basil Blackwell, in press.

Hendry, D.F. (1991b) “Using PC-NAIVE in Teaching Econometrics”, Oxford Bulletin of Economics and Statistics, 53, 2, 199-223.

Hendry, D.F. (1991c) “Comments: ‘The Response of Consumption to Income: A Cross-country Investigation’ by John Y. Campbell and N. Gregory Mankiw”, European Economic Review, 35, 4, 764-767.

Hendry, D.F. and N.R. Ericsson (1991a) “An Econometric Analysis of U.K. Money Demand in Monetary Trends in the United States and the United Kingdom by Milton Friedman and Anna J. Schwartz”, American Economic Review, 81, 1, 8-38.

E Hendry, D.F. and N.R. Ericsson (1991b) “Modeling the Demand for Narrow Money in the United Kingdom and the United States”, European Economic Review, 35, 833-881.

T Hendry, D.F. and M.S. Morgan (eds.) (1991) Classic Readings in the Foundations of Econometric Analysis, Cambridge: Cambridge University Press, forthcoming.

M Hendry, D.F. and A.J. Neale (1991) “A Monte Carlo Study of the Effects of Structural Breaks on Tests for Unit Roots”, Chapter 8 in P. Hackl and A.H. Westlund (eds.) Economic Structural Change: Analysis and Forecasting, Berlin: Springer-Verlag, 95-119.

References

Amemiya, T. (1977) “The Maximum Likelihood and the Nonlinear Three-stage Least Squares Estimator in the General Nonlinear Simultaneous Equation Model”, Econometrica, 45, 4, 955-968; and (1982) “Correction to a Lemma”, Econometrica, 50, 5, 1325-1328.

Barndorff-Nielsen, O. (1978) Information and Exponential Families: In Statistical Theory, Chichester, John Wiley.

Baumol, W.J. (1952) “The Transactions Demand for Cash: An Inventory Theoretic Approach”, Quarterly Journal of Economics, 66, 4, 545-556.

Box, G.E.P. and G.M. Jenkins (1976) Time Series Analysis: Forecasting and Control, San Francisco, Holden-Day, Second Edition.

Box, G.E.P. and D.A. Pierce (1970) “Distribution of Residual Autocorrelations in Autoregressive-integrated Moving Average Time Series Models”, Journal of the American Statistical Association, 65, 332, 1509-1526.

Brown, R.L., J. Durbin, and J.M. Evans (1975) “Techniques for Testing the Constancy of Regression Relationships over Time”, Journal of the Royal Statistical Society, Series B, 37, 2, 149-192 (with discussion).

Buse, A. (1982) “The Likelihood Ratio, Wald, and Lagrange Multiplier Tests: An Expository Note”, American Statistician, 36, 3, Part 1, 153-157.

Campos, J. (1988) “Econometric Modelling”, lecture notes, Caracas, Banco Central de Venezuela.

Campos, J. and N.R. Ericsson (1988) “Econometric Modeling of Consumers’ Expenditure in Venezuela”, International Finance Discussion Paper No. 325, Board of Governors of the Federal Reserve System, Washington, D.C.

Chow, G.C. (1960) “Tests of Equality between Sets of Coefficients in Two Linear Regressions”, Econometrica, 28, 3, 591-605.

Cox, D.R. (1961) “Tests of Separate Families of Hypotheses” in J. Neyman (ed.) Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, Volume I, 105-123.

Cox, D.R. (1962) “Further Results on Tests of Separate Families of Hypotheses”, Journal of the Royal Statistical Society, Series B, 24, 2, 406-424.

Dickey, D.A. and W.A. Fuller (1979) “Distribution of the Estimators for Autoregressive Time Series with a Unit Root”, Journal of the American Statistical Association, 74, 366, 427-431.

Dickey, D.A. and W.A. Fuller (1981) “Likelihood Ratio Statistics for Autoregressive Time Series with a Unit Root”, Econometrica, 49, 4, 1057-1072.

Dolado, J.J., T. Jenkinson, and S. Sosvilla-Rivero (1990) “Cointegration and Unit Roots”, Journal of Economic Surveys, 4, 3, 249-273.

Dufour, J.-M. (1982) “Recursive Stability Analysis of Linear Regression Relationships: An Exploratory Methodology”, Journal of Econometrics, 19, 1, 31-76.

Durbin, J. and G.S. Watson (1950) “Testing for Serial Correlation in Least Squares Regression. I”, Biometrika, 37, 3 and 4, 409-428.

Durbin, J. and G.S. Watson (1951) “Testing for Serial Correlation in Least Squares Regression. II”, Biometrika, 38, 1 and 2, 159-178.

Engle, R.F. (1982) “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation”, Econometrica, 50, 4, 987-1007.

Engle, R.F. (1984) “Wald, Likelihood Ratio, and Lagrange Multiplier Tests in Econometrics”, Chapter 13 in Z. Griliches and M.D. Intriligator (eds.) Handbook of Econometrics, Amsterdam, North-Holland, Volume 2, 775-826.

Engle, R.F. and C.W.J. Granger (1987) “Co-integration and Error Correction: Representation, Estimation, and Testing”, Econometrica, 55, 2, 251-276.

Engle, R.F. and B.S. Yoo (1989) “Cointegrated Economic Time Series: A Survey with New Results”, Department of Economics Discussion Paper 87-26R, University of California at San Diego, San Diego, California; forthcoming in R.F. Engle and C.W.J. Granger (eds.) Modelling Long-run Economic Relationships: Readings in Cointegration, Oxford, Oxford University Press.

Ericsson, N.R. (1989) “Parameter Constancy, Mean Squared Forecast Errors, and Measuring Forecast Performance: An Exposition and Extensions”, mimeo, Federal Reserve Board of Governors, Washington, D.C.

Ericsson, N.R. and H. Lyss (1991) “An Update to PC-GIVE: Version 6.01”, Journal of Applied Econometrics, 6, forthcoming.

Escribano, A. (1985) “Non-linear Error-correction: The Case of Money Demand in the U.K. (1878-1970)”, mimeo, University of California, San Diego.

Fisher, R.A. (1922) “The Goodness of Fit of Regression Formulae, and the Distribution of Regression Coefficients”, Journal of the Royal Statistical Society, 85, 4, 597-612.

Florens, J.-P. and M. Mouchart (1985a) “Conditioning in Dynamic Models”, Journal of Time Series Analysis, 6, 1, 15-34.

Florens, J.-P. and M. Mouchart (1985b) “A Linear Theory for Noncausality”, Econometrica, 53, 1, 157-175.

Friedman, M. (1956) “The Quantity Theory of Money: A Restatement”, in M. Friedman (ed.) Studies in the Quantity Theory of Money, Chicago, University of Chicago Press, 3-21.

Friedman, M. and A.J. Schwartz (1982) Monetary Trends in the United States and the United Kingdom: Their Relation to Income, Prices, and Interest Rates, 1867-1975, Chicago, University of Chicago Press.

Gilbert, C.L. (1986) “Professor Hendry’s Econometric Methodology”, Oxford Bulletin of Economics and Statistics, 48, 3, 283-307.

Gilbert, C.L. (1989) “LSE and the British Approach to Time Series Econometrics”, Oxford Economic Papers, 41, 1, 108-128.

Godfrey, L.G. (1978) “Testing Against General Autoregressive and Moving Average Error Models when the Regressors Include Lagged Dependent Variables”, Econometrica, 46, 6, 1293-1301.

Godfrey, L.G. (1990) “PC-GIVE: A Review”, Economic Journal, 100, 399, 303-307.

Granger, C.W.J. (1981) “Some Properties of Time Series Data and Their Use in Econometric Model Specification”, Journal of Econometrics, 16, 1, 121-130.

Granger, C.W.J. (1986) “Developments in the Study of Cointegrated Economic Variables”, Oxford Bulletin of Economics and Statistics, 48, 3, 213-228.

Granger, C.W.J. (1989) Forecasting in Business and Economics, Boston, Academic Press, Second Edition.

Granger, C.W.J. and A.A. Weiss (1983) “Time Series Analysis of Error-correction Models” in S. Karlin, T. Amemiya, and L.A. Goodman (eds.) Studies in Econometrics, Time Series, and Multivariate Statistics: In Honor of Theodore W. Anderson, New York, Academic Press, 255-278.

Haavelmo, T. (1944) “The Probability Approach in Econometrics”, Econometrica, 12, Supplement, i-viii, 1-118.

Hall, R.E. (1978) “Stochastic Implications of the Life Cycle-Permanent Income Hypothesis: Theory and Evidence”, Journal of Political Economy, 86, 6, 971-987.

Hall, S.G., S.G.B. Henry, and J.B. Wilcox (1990) “The Long-run Determination of the UK Monetary Aggregates”, Chapter 5 in S.G.B. Henry and K.D. Patterson (eds.) Economic Modelling at the Bank of England, London, Chapman and Hall, 127-166.

Hansen, L.P. (1982) “Large Sample Properties of Generalized Method of Moments Estimators”, Econometrica, 50, 4, 1029-1054.

Harvey, A.C. (1981) The Econometric Analysis of Time Series, Oxford, Philip Allan.

Herschel, J. (1830) A Preliminary Discourse on the Study of Natural Philosophy, London, Longman, Rees, Brown & Green and John Taylor.

Hooker, R.H. (1901) “Correlation of the Marriage-Rate with Trade”, Journal of the Royal Statistical Society, 64, 3, 485-492.

Hylleberg, S. and G.E. Mizon (1989) “Cointegration and Error Correction Mechanisms”, Economic Journal, 99, 395, Supplement, 113-125.

Jarque, C.M. and A.K. Bera (1980) “Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals”, Economics Letters, 6, 3, 255-259.

Johansen, S. (1988) “Statistical Analysis of Cointegration Vectors”, Journal of Economic Dynamics and Control, 12, 2/3, 231-254.

Johansen, S. (1990) “Cointegration in Partial Systems and the Efficiency of Single Equation Analysis”, mimeo, Institute of Mathematical Statistics, University of Copenhagen, Copenhagen, Denmark; forthcoming in the Journal of Econometrics.

Johansen, S. (1991a) “A Statistical Analysis of Cointegration for I(2) Variables”, Discussion Paper No. 77, Department of Statistics, University of Helsinki, Helsinki, Finland.

Johansen, S. (1991b) “Testing Weak Exogeneity and the Order of Cointegration in UK Money Demand Data”, forthcoming in the Journal of Policy Modeling.

Johansen, S. and K. Juselius (1990) “Maximum Likelihood Estimation and Inference on Cointegration: With Applications to the Demand for Money”, Oxford Bulletin of Economics and Statistics, 52, 2, 169-210.

Johnston, J. (1963) Econometric Methods, New York, McGraw-Hill.

Johnston, J. (1972) Econometric Methods, New York, McGraw-Hill, Second Edition.

Kiviet, J.F. (1986) “On the Rigour of Some Misspecification Tests for Modelling Dynamic Relationships”, Review of Economic Studies, 53, 2, 241-261.

Kiviet, J.F. and G.D.A. Phillips (1986) “Testing Strategies for Model Specification”, Applied Mathematics and Computation, 20, 237-269; reprinted as Chapter 2 in J.F. Kiviet (1987) Testing Linear Econometric Models, Amsterdam, Amsterdam University Press, 13-45.

Kloek, T. (1984) “Dynamic Adjustment when the Target is Nonstationary”, International Economic Review, 25, 2, 315-326.

Klovland, J.T. (1987) “The Demand for Money in the United Kingdom, 1875-1913”, Oxford Bulletin of Economics and Statistics, 49, 3, 251-271.

Koopmans, T.C. (1950) “When Is an Equation System Complete for Statistical Purposes?”, Chapter 17 in T.C. Koopmans (ed.) Statistical Inference in Dynamic Economic Models, New York, John Wiley, 393-409.

Kremers, J.J.K., N.R. Ericsson, and J.J. Dolado (1989) “The Power of Cointegration Tests”, mimeo, Federal Reserve Board of Governors, Washington, D.C.; presented at the World Congress of the Econometric Society, Barcelona, Spain, August 1990.

Lakatos, I. (1970) “Falsification and the Methodology of Scientific Research Programmes” in I. Lakatos and A. Musgrave (eds.) Criticism and the Growth of Knowledge, Cambridge, Cambridge University Press, 91-196.

Leamer, E.E. (1987) “Econometric Metaphors”, Chapter 9 in T.F. Bewley (ed.) Advances in Econometrics, Cambridge, Cambridge University Press, Volume 2, 1-28.

Longbottom, A. and S. Holly (1985) “Econometric Methodology and Monetarism: Professor Friedman and Professor Hendry on the Demand for Money”, Discussion Paper No. 131, London Business School.

Lucas, Jr., R.E. (1976) “Econometric Policy Evaluation: A Critique” in K. Brunner and A.H. Meltzer (eds.) Carnegie-Rochester Conferences on Public Policy, Volume 1, Journal of Monetary Economics, 19-46.

MacKinnon, J.G. (1983) “Model Specification Tests Against Non-nested Alternatives”, Econometric Reviews, 2, 1, 85-158 (with discussion).

MacKinnon, J.G. and H. White (1985) “Some Heteroskedasticity-consistent Covariance Matrix Estimators with Improved Finite Sample Properties”, Journal of Econometrics, 29, 3, 305-325.

Miller, M.H. and D. Orr (1966) “A Model of the Demand for Money by Firms”, Quarterly Journal of Economics, 80, 3, 413-435.

Mizon, G.E. (1984) “The Encompassing Approach in Econometrics”, Chapter 6 in D.F. Hendry and K.F. Wallis (eds.) Econometrics and Quantitative Economics, Oxford, Basil Blackwell, 135-172.

Mizon, G.E. and J.-F. Richard (1986) “The Encompassing Principle and Its Application to Testing Non-nested Hypotheses”, Econometrica, 54, 3, 657-678.

Nicholls, D.F. and A.R. Pagan (1983) “Heteroscedasticity in Models with Lagged Dependent Variables”, Econometrica, 51, 4, 1233-1242.

Nickell, S. (1985) “Error Correction, Partial Adjustment and All That: An Expository Note”, Oxford Bulletin of Economics and Statistics, 47, 2, 119-129.

Pagan, A.R. (1987) “Three Econometric Methodologies: A Critical Appraisal”, Journal of Economic Surveys, 1, 1, 3-24; reprinted as Chapter 6 in C.W.J. Granger (ed.) (1990) Modelling Economic Series: Readings in Econometric Methodology, Oxford, Oxford University Press, 97-120.

Parks, R.B. (1989) “PC Give 5.0”, American Statistician, 43, 1, 60-63.

Pesaran, M.H. (1974) “On the General Problem of Model Selection”, Review of Economic Studies, 41, 2, 153-171.

Pesaran, M.H. (1982) “Comparison of Local Power of Alternative Tests of Non-nested Regression Models”, Econometrica, 50, 5, 1287-1305.

Phillips, A.W. (1954) “Stabilisation Policy in a Closed Economy”, Economic Journal, 64, 254, 290-323.

Phillips, A.W. (1957) “Stabilisation Policy and the Time-Forms of Lagged Responses”, Economic Journal, 67, 266, 265-277.

Phillips, G.D.A. (1977) “Recursions for the Two-stage Least-squares Estimators”, Journal of Econometrics, 6, 1, 65-77.

Phillips, P.C.B. (1982) “On the Consistency of Nonlinear FIML”, Econometrica, 50, 5, 1307-1324.

Phillips, P.C.B. (1986) “Understanding Spurious Regressions in Econometrics”, Journal of Econometrics, 33, 3, 311-340.

Phillips, P.C.B. (1987) “Time Series Regression with a Unit Root”, Econometrica, 55, 2, 277-301.

Phillips, P.C.B. (1988) “Reflections on Econometric Methodology”, Economic Record, 64, 187, 344-359.

Phillips, P.C.B. (1991) “Optimal Inference in Cointegrated Systems”, Econometrica, 59, 2, 283-306.

Phillips, P.C.B. and S.N. Durlauf (1986) “Multiple Time Series Regression with Integrated Processes”, Review of Economic Studies, 53, 4, 473-495.

Phillips, P.C.B. and M. Loretan (1991) “Estimating Long-run Economic Equilibria”, Review of Economic Studies, 58, 3, 407-436.

Ramsey, J.B. (1969) “Tests for Specification Errors in Classical Linear Least-squares Regression Analysis”, Journal of the Royal Statistical Society, Series B, 31, 2, 350-371.

Salmon, M. (1982) “Error Correction Mechanisms”, Economic Journal, 92, 367, 615-629.

Sargan, J.D. (1958) “The Estimation of Economic Relationships Using Instrumental Variables”, Econometrica, 26, 3, 393-415.

Sargan, J.D. (1959) “The Estimation of Relationships with Autocorrelated Residuals by the Use of Instrumental Variables”, Journal of The Royal Statistical Society, Series B, 21, 1, 91-105.

Sargan, J.D. (1964) “Wages and Prices in the United Kingdom: A Study in Econometric Methodology” in P.E. Hart, G. Mills, and J.K. Whitaker (eds.) Econometric Analysis for National Economic Planning, Colston Papers, Volume 16, London, Butterworths, 25-63 (with discussion); reprinted in D.F. Hendry and K.F. Wallis (eds.) (1984) Econometrics and Quantitative Economics, Oxford, Basil Blackwell, 275-314.

Sargan, J.D. (1980a) “The Consumer Price Equation in the Post War British Economy: An Exercise in Equation Specification Testing”, Review of Economic Studies, 47, 1, 113-135.

Sargan, J.D. (1980b) “Some Approximations to the Distribution of Econometric Criteria which Are Asymptotically Distributed as Chi-squared”, Econometrica, 48, 5, 1107-1138.

Sargan, J.D. (1980c) “Some Tests of Dynamic Specification for a Single Equation”, Econometrica, 48, 4, 879-897.

Sargan, J.D. and A. Bhargava (1983) “Testing Residuals from Least Squares Regression for Being Generated by the Gaussian Random Walk”, Econometrica, 51, 1, 153-174.

Sims, C.A. (1987) “Making Economics Credible”, Chapter 11 in T.F. Bewley (ed.) Advances in Econometrics, Cambridge, Cambridge University Press, Volume 2, 49-60.

Spanos, A. (1986) Statistical Foundations of Econometric Modelling, Cambridge, Cambridge University Press.

Stock, J.H. (1987) “Asymptotic Properties of Least Squares Estimators of Cointegrating Vectors”, Econometrica, 55, 5, 1035-1056.

Terasvirta, T. (1988) “A Review of PC-GIVE: A Statistical Package for Econometric Modelling”, Journal of Applied Econometrics, 3, 4, 333-340.

Tobin, J. (1956) “The Interest-elasticity of Transactions Demand for Cash”, Review of Economics and Statistics, 38, 3, 241-247.

Trundle, J.M. (1982) “The Demand for M1 in the UK”, mimeo, Bank of England, London.

Turner, P. and J. Podivinsky (1987) “PC GIVE © David Hendry Version 4.1 July 1986”, Journal of Economic Surveys, 1, 1, 92-96.

Wallis, K.F. (1974) “Seasonal Adjustment and Relations Between Variables”, Journal of the American Statistical Association, 69, 345, 18-31.

White, H. (1980) “A Heteroskedasticity-consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity”, Econometrica, 48, 4, 817-838.

White, H. (1988) Estimation, Inference and Specification Analysis, Cambridge, Cambridge University Press, forthcoming.

White, H. (1990) “A Consistent Model Selection Procedure Based on m-testing”, Chapter 16 in C.W.J. Granger (ed.) Modelling Economic Series: Readings in Econometric Methodology, Oxford, Oxford University Press, 369-383.

Yule, G.U. (1926) “Why Do We Sometimes Get Nonsense-correlations between Time-series? A Study in Sampling and the Nature of Time-series”, Journal of the Royal Statistical Society, 89, 1, 1-69 (with discussion).

Zellner, A. (1979) “Causality and Econometrics” in K. Brunner and A.H. Meltzer (eds.) Three Aspects of Policy and Policymaking: Knowledge, Data, and Institutions, Amsterdam, North-Holland, 9-54.



Cite this document

APA

Neil R. Ericsson, Julia Campos, & Hong-Anh Tran (1991). PC-GIVE and David Hendry's Econometric Methodology (IFDP 1991-406). Board of Governors of the Federal Reserve System, International Finance Discussion Papers. https://whenthefedspeaks.com/doc/ifdp_1991-406

BibTeX

@techreport{wtfs_ifdp_1991_406,
  author = {Neil R. Ericsson and Julia Campos and Hong-Anh Tran},
  title = {PC-GIVE and David Hendry's Econometric Methodology},
  type = {International Finance Discussion Papers},
  number = {1991-406},
  institution = {Board of Governors of the Federal Reserve System},
  year = {1991},
  url = {https://whenthefedspeaks.com/doc/ifdp_1991-406},
  abstract = {This paper summarizes David Hendry's empirical econometric methodology, unifying discussions in many of his and his co-authors' papers. Then, we describe how Hendry's suite of computer programs PC-GIVE helps users implement that methodology. Finally, we illustrate that methodology and the programs with three empirical examples: postwar narrow money demand in the United Kingdom, nominal income determination in the United Kingdom from Friedman and Schwartz (1982), and consumers' expenditure in Venezuela. These examples help clarify the methodology's central concepts, which include cointegration, error-correction, general-to-simple modeling, dynamic specification, model evaluation and testing, parameter constancy, and exogeneity.},
}