ifdp · May 31, 1985

Conditional Econometric Modelling: An Application to New House Prices in the United Kingdom

Abstract

The statistical formulation of the econometric model is viewed as a sequence of marginalizing and conditioning operations which reduce the parameterization to manageable dimensions. Such operations entail that the "error" is a derived rather than an autonomous process, suggesting designing the model to satisfy data-based and theory criteria. The relevant concepts are explained and applied to data modelling of UK new house prices in the framework of an economic theory-model of house builders. The econometric model is compared with univariate time-series models and tested against a range of alternatives.

International Finance Discussion Papers

Number 254

Revised: June 1985


Neil R. Ericsson and David F. Hendry

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References in publications to International Finance Discussion Papers (other than an acknowledgment by a writer that he has had access to unpublished materials) should be cleared with the author or authors.


Neil R. Ericsson International Finance Division Federal Reserve Board Washington, D.C. 20551

and

David F. Hendry Nuffield College Oxford OX1 1NF England


*This paper was prepared at the invitation of the International Statistical Institute and appears in a slightly modified form in Ericsson and Hendry [1985]. It is based on work initially undertaken in Ericsson [1978] and extended for the U.K. Department of the Environment in Hendry [1980]. We are indebted to Frank Srba for invaluable help in carrying out the preliminary analyses; to Jon Faust for his excellent research assistance; and to Anthony Atkinson, Julia Campos, John Muellbauer, Andrew Rose, and two anonymous referees for their helpful comments. Recent research has been supported by grants from the Economic and Social Research Council to the MIME programme at the London School of Economics, and by E.S.R.C. grants HR8789 and B 00 220012 to Nuffield College. We are grateful for the financial assistance from the E.S.R.C., although the views expressed in this paper are solely the responsibility of the authors and should not be interpreted as reflecting those of the E.S.R.C., the Board of Governors of the Federal Reserve System, the Federal Reserve Banks, or other members of their staffs.

1. Introduction

The feature which distinguishes econometric modelling from time-series analysis is the integral role of economic theory in orienting the former. At one extreme, a univariate time-series model is inherently mechanistic and has little or no need for subject-matter knowledge. Often, the procedure for choosing a model can be automated so as to satisfy appropriate criteria, such as minimising a residual variance adjusted for degrees of freedom. Even a bivariate model needs little more than common sense in selecting a relevant covarying series. At the other extreme, prior to data analysis a formal intertemporal optimization model can be developed for the behaviour of rational economic agents who fully account for all relevant costs and available information. The data evidence is then used to calibrate the unknown parameters of the theory-model. Any required data transformations are derived from the theory (e.g., moving averages might represent “permanent” components, or residuals from auxiliary regressions might act as “transitory” or “surprise” effects).

When formulated as such, the “data-driven” and “theory-driven” approaches to modelling have been viewed as competitive rather than complementary (see, for example, Naylor et al. [1972] and Granger and Newbold [1977]). Confrontations between the rival strategies in terms of forecasting accuracy have not generally been kind to supporters of “theory-driven” modelling (see Nelson [1972]), although that is neither surprising nor definitive in view of the choice of a mean-squared-error criterion; moreover, such results depend on the length of the forecast horizon (see the discussion in Kmenta and Ramsey [1981]).

Each extreme also has severe drawbacks. The first approach is open to such difficulties as spurious correlations (witness Coen, Gomme, and Kendall [1969]) and often yields forecasts outside the estimation sample which are poorer than the data-based within-sample fit. “Theory-driven” models tend to manifest symptoms of dynamic mis-specification (e.g., residual autocorrelation afflicts many equations in large-scale econometric models) and often fit poorly due to being excessively restricted. Since, in principle, econometric specifications nest time-series formulations (see Zellner and Palm [1974] and Prothero and Wallis [1976]), the complementarity of information from data and theory bears stressing and argues for an integrated approach.

In practice, a complete spectrum of views exists concerning the "best" combinations of theory and data modelling, and most practitioners blend both elements to produce a mixture often labelled “iterative model building”. The statistical properties of such mixed approaches have proved hard to analyse, especially since the initial theory may be revised in the light of any anomalous data evidence. However, some Monte Carlo evidence is available (see Kiviet [1981, 1982]), highlighting the difficulties of selecting models from noisy data. Moreover, pre-test theory indicates severe inferential problems in simplifying models by using sample evidence (see Judge and Bock [1978]). Nevertheless, little empirical research in economics commences from fully pre-specified models which adequately represent all salient features of the data.

Consequently, by default, many important aspects of most models have to be selected from the observed sample, including the choices of alternative potential explanatory variables, lag reaction profiles, functional forms, error properties, seasonal variations, and even the evolution over time of parameters of interest.

In the present approach, the data analysis is strongly guided by prior economic theory. The theory suggests the form that the model class should have in order to satisfy a number of reasonable properties likely to obtain in any static-equilibrium state of the world. The functional form is specified to ensure invariance to a range of transformations, and the length of the longest lag in the maintained model is pre-assigned a value such that we would be surprised if even longer lags were needed to make the model adequately characterise the data. Such an approach produces a general maintained model, which is usually heavily overparameterised. Reduction of the general model by data-based sequential simplifications in the light of the prior theory yields a parsimonious summary which aims to be both data-coherent and theory-consistent, with interpretable parameters corresponding to nearly orthogonal variables (see Trivedi [1984] for a general discussion of this strategy for model selection).

At this stage, the model has been designed to satisfy a range of statistical and economic criteria. Since the criteria may conflict, some of the “art” apparent in modelling remains, perhaps necessitating appropriate compromises between tractability, coherency, and credibility. Moreover, no unique path for simplification exists, so the final model may vary with the investigator. However, by sharing a common initial model and subjecting selected simplifications to testing on later data and against rival models, the strategy has some in-built protection against choosing poor or non-constant representations. It is important to stress that the prototypes of the model presented below were first developed in 1978 and have altered rather little since 1980 despite several later tests. Section 2 presents a more extensive discussion of the empirical econometric modelling methodology to establish terminology and exposit the main concepts.

The topic of our application naturally determines the formulation of the theory-model. Casual observation suggests that a vast complex of social, economic, and demographic factors influence house prices. Here we are concerned with modelling the average price of newly completed private dwellings in the United Kingdom (from 1959-1982), denoted by Pn, and shown in Figure 1. A separate model of the price at which the existing stock of housing is transacted (denoted by Ph) is presented in Hendry [1984]. Heuristically, we consider the joint density of (Pn, Ph) as being factorised into a conditional density for Pn, given Ph, and a marginal density for Ph, with our models for Pn and Ph corresponding to the latter two densities. The economic justifications for the resulting parameters of the conditional density being of interest are presented in Section 3 in the context of a theory-model of the decisions of the construction industry. Section 3 also briefly considers alternative approaches to modelling Pn based on construction costs and on (marginal) models derived from the supply of and demand for new dwellings.

In fact, the general formulation of earlier models of new house prices has been of an interaction between supply and demand, with prices implicitly determined by the level which equates supply and demand for new housing. The validity of this market-clearing paradigm is not obvious in the United Kingdom, where the volume of new housing is small compared to the total volume of transactions in existing houses. Also, there is clear evidence of large changes over time in the stock of completed but unsold houses, indicative of a non-clearing market. That aspect therefore requires appraisal, so the available data are described in Section 3.

Section 4 reports the various empirical results obtained and discusses the light they throw on understanding the determination of housing prices. Section 5 briefly concludes the study.

[Figure 1: line plot of the price indices Pn, Ph, and P against time, 1960 to 1980; vertical axis labelled PRICES, with ticks from 20 to 220.]

Figure 1. Selected United Kingdom price indices. Dates on this figure and Figures 2-5 mark the first quarters of the respective years.

2. Econometric Modelling

In this section, we discuss relevant aspects of our statistical approach for modelling new house prices: see Hendry and Richard [1982, 1983] for more detailed discussion and bibliographical information. Modelling is viewed here as an attempt to characterise data properties in simple parametric relationships which remain reasonably constant over time, account for the findings of pre-existing models, and are interpretable in the light of the subject matter. The observed data $(w_1, \ldots, w_T)$ are regarded as a realisation from an unknown dynamic economic mechanism represented by the joint density function:

(2.1) $D(w_1, \ldots, w_T \mid W_0; \psi)$,

where $T$ is the number of observations on $w_t$, $W_0$ denotes the initial conditions, and $\psi$ is the relevant parameterisation. $D(\cdot)$ is a function of great complexity and high dimensionality, summarising myriads of disparate transactions by economic agents and involving relatively heterogeneous commodities and prices, as well as different locations and time periods (aggregated here to quarters of a year). Limitations in data and knowledge preclude estimating the complete mechanism.

A model for a vector of observable variables $\{x_t\}$ can be conceptualised as arising by first transforming $w_t$ so that $x_t$ is a sub-vector, then (implicitly) marginalising the joint density $D(\cdot)$ with respect to all variables in $w_t$ other than $x_t$ (i.e., with respect to those variables not considered in the analysis). That produces the reduced density $F(x_1, \ldots, x_T \mid X_0; \theta)$, where $\theta$ is the induced function of $\psi$. Next, one sequentially conditions each $x_t$ on past observables to yield:

(2.2) $F(X_T^1 \mid X_0; \theta) = \prod_{t=1}^{T} F(x_t \mid X_{t-1}; \theta)$,

where $X_{t-1} = (X_0, x_1, \ldots, x_{t-1})$ and $X_j^i = (x_i, \ldots, x_j)$ for $T \ge j \ge i \ge 1$. The usefulness of (2.2) depends on the actual irrelevance of the variables excluded from $D(\cdot)$, on the suitability of the parameterisation $\theta$ (which may include “transients” relevant to sub-periods only), and on the adequacy of the density $F(\cdot \mid \theta)$. Aspects of those conditions are open to direct testing against the observed data. Although the choice of functional form is of considerable importance, it depends intimately on the nature of the problem. Thus, for this general analysis, we assume that the time series $x_t$ has been appropriately transformed so that only linear models need to be considered (so $x_t$ may involve logarithms, ratios, etc. of the original variables). Finally, $x_t$ is partitioned into $(y_t', z_t')'$, where $y_t$ is to be explained conditional on $z_t$, corresponding to the claimed factorisation:

(2.3) $F(x_t \mid X_{t-1}; \theta) = F(y_t \mid z_t, X_{t-1}; \phi_1) \cdot F(z_t \mid X_{t-1}; \phi_2)$,

where $\phi' = g(\theta)' = (\phi_1' : \phi_2') \in \Phi_1 \times \Phi_2$, and all the parameters of interest can be obtained from $\phi_1$ alone. If such conditions are fulfilled, then $z_t$ is said to be weakly exogenous for $\phi_1$ and only the conditional model $F(y_t \mid z_t, X_{t-1}; \phi_1)$ needs to be analysed, greatly simplifying the modelling exercise if there are many variables in $z_t$. This formulation is discussed more fully in Engle et al. [1983] and Florens and Mouchart [1980, 1985], and builds on the work of Koopmans [1950] and Barndorff-Nielsen [1978].

Adding the further assumption that the maximum lag length of dependence in (2.2) is fixed at $\ell < T$ periods, then the conditional linear model can be written as

(2.4) $y_t = \beta_0 z_t + \sum_{i=1}^{\ell} \beta_i x_{t-i} + \varepsilon_t$, $t = 1, \ldots, T$,

ignoring transients to simplify notation. That provides the general maintained model and is estimable by a variety of methods. Rather clearly, however, many drastic a priori assumptions have been made in order to formalise (2.4), and such assumptions need not be valid empirically. Consequently, we now consider how to evaluate such models.
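As an illustration only, the following Python sketch estimates a scalar version of the maintained model (2.4) by ordinary least squares on simulated data. The data generation process, the coefficient values, the choice of two lags, and the helper names `build_adl_design` and `fit_ols` are all assumptions introduced for the example; none of them comes from the housing data analysed later.

```python
import numpy as np

def build_adl_design(y, z, max_lag):
    """Regressors for a scalar (2.4): z_t plus lags 1..max_lag of x = (y, z)."""
    T = len(y)
    cols = [z[max_lag:]]                      # contemporaneous z_t (beta_0 term)
    for i in range(1, max_lag + 1):           # lagged x_{t-i} = (y_{t-i}, z_{t-i})
        cols.append(y[max_lag - i:T - i])
        cols.append(z[max_lag - i:T - i])
    return np.column_stack(cols), y[max_lag:]

def fit_ols(X, y):
    """Least-squares coefficients, residuals, and df-adjusted residual variance."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return coef, resid, sigma2

# Hypothetical data generation process (an assumption for this sketch):
# the true lag length is 1, while the maintained model allows 2 lags.
rng = np.random.default_rng(0)
T = 2000
z = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    z[t] = 0.7 * z[t - 1] + rng.normal()
    y[t] = 0.5 * z[t] + 0.6 * y[t - 1] - 0.2 * z[t - 1] + rng.normal(scale=0.5)

X, y_dep = build_adl_design(y, z, max_lag=2)
coef, resid, sigma2 = fit_ols(X, y_dep)
print("coefficients on z_t, y_{t-1}, z_{t-1}, y_{t-2}, z_{t-2}:", coef.round(3))
print("residual variance:", round(sigma2, 3))
```

Because the maintained lag length exceeds the one used to generate the data, the coefficients on the second lags should be close to zero, which is the kind of evidence a sequential simplification would exploit.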

Any postulated model can be evaluated by comparing its claimed properties with its actual behaviour. As formulated, (2.4) entails restrictions relative to six different sources of information, which are summarised as:

(A) the history of the $\{x_t\}$ process, denoted by $X_{t-1}$ (namely, only $X_{t-1}^{t-\ell}$ is relevant if (2.4) is valid);

(B) the current value of $x_t$ (namely, it is valid to condition $y_t$ on $z_t$);

(C) the future of the $\{x_t\}$ process (namely, the parameters remain constant on $X_T^{t+1}$);

(D) the subject-matter theory (so that (2.4) is consistent with the available theory);

(E) the structure of the measurement system (e.g., definitional constraints must not be violated); and

(F) rival models (which should not contain additional information relevant to explaining $\{y_t\}$).

We now consider empirical model selection criteria derived from each of those information sources.

(A): A crucial aspect of evaluating the empirical validity of (2.4) concerns the properties of $\{\varepsilon_t\}$. If the assumptions underlying (2.4) are a good approximation, then $\varepsilon_t = y_t - E(y_t \mid L_t)$ where $L_t = (z_t, X_{t-1}^{t-\ell})$. Thus $\{\varepsilon_t\}$ is a derived (rather than an autonomous) process, which by construction is uncorrelated with $L_t$ and hence is an innovation relative to $L_t$. One set of tests of (2.4) seeks to evaluate the extent to which the calculated residuals are consistent with $\{\varepsilon_t\}$ being such an innovation process.

Several particular hypotheses can be investigated as follows. Firstly, defining white noise by the second-order property that, for $E(\varepsilon_t) = 0$, $E(\varepsilon_t \varepsilon_{t-k}') = 0$ for all $k \ne 0$ (i.e., $\varepsilon_t$ is unpredictable from its own past alone), one could test for residual autocorrelation. For example, suppose an investigator postulated a model with a maximal lag length $n$. If $n$ were less than $\ell$ and/or the elements of $\{\beta_i, i = 0, \ldots, \ell\}$ were inappropriately restricted, then the residuals might manifest serial correlation. Hence, criterion (i) of model adequacy is that the residual process (i.e., that which is left unexplained after modelling is ended) should be empirical white noise (see Granger [1983]). Note that an autocorrelated error can be “explained” in part (e.g., by Box-Jenkins methods). Also, $\{\varepsilon_t\}$ need not be homoscedastic in (2.4), so that inferences may have to allow for potential heteroscedasticity. Fortunately, heteroscedastic-consistent covariance matrices can be constructed with ease (see White [1980], Domowitz and White [1982], and Messer and White [1984]), and a variety of tests for residual heteroscedasticity is available.
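Criterion (i) can be illustrated with a small simulation. The sketch below compares the residual autocorrelation of a model that includes the relevant lag with one that omits it, using the Ljung-Box portmanteau statistic. That statistic is chosen here purely for illustration and is not named in the text; the simulated process and the function names `ljung_box` and `ols_resid` are likewise assumptions. With a lagged dependent variable among the regressors, the usual chi-squared reference distribution is only approximate, so the numbers are best read as a rough diagnostic.

```python
import numpy as np

def ljung_box(resid, max_lag):
    """Ljung-Box Q statistic built from residual autocorrelations 1..max_lag."""
    e = resid - resid.mean()
    n = len(e)
    denom = e @ e
    q = 0.0
    for k in range(1, max_lag + 1):
        r_k = (e[k:] @ e[:-k]) / denom        # lag-k sample autocorrelation
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q

def ols_resid(X, y):
    """Residuals from a least-squares regression of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

# Hypothetical process with one lag of y (an assumption for this sketch).
rng = np.random.default_rng(1)
T = 1000
z = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * z[t] + 0.8 * y[t - 1] + rng.normal()

X_dynamic = np.column_stack([z[1:], y[:-1]])  # lag length matches the process
X_static = z[1:, None]                        # lag length too short (n < l)

q_dynamic = ljung_box(ols_resid(X_dynamic, y[1:]), max_lag=4)
q_static = ljung_box(ols_resid(X_static, y[1:]), max_lag=4)
print("Q, dynamic model:", round(q_dynamic, 2))
print("Q, static model: ", round(q_static, 2))
```

The static model leaves the omitted dynamics in its residuals, so its statistic should be far larger than that of the dynamic model.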

Next, the assertion that $\varepsilon_t$ is an innovation relative to $\{z_t, X_{t-1}^{t-\ell}\}$ entails that lags longer than $\ell$ are redundant in (2.4) and that selecting $n < \ell$ is invalid. Thus, the residuals in (2.4) should have the smallest generalised variance (adjusted for degrees of freedom) in this class of constructed error processes. That property is called parsimonious variance dominance and provides criterion (ii). If a model did not have white-noise residuals, it could be variance-dominated by a corresponding model which also “mopped up” the residual autocorrelation parsimoniously. Thus, (i) is a necessary condition for (ii), but is not sufficient, emphasising that white-noise residuals are a minimal requirement for model adequacy, whether or not modified to account for parsimony (e.g., see Schwarz [1978]). Models derived by sequentially simplifying unrestricted representations such as (2.4) tend to have innovation errors. Conversely, the previously noted drawback of “theory-driven” modelling (that the associated errors are not innovations) is easily understood if the theory is not sufficiently general to posit the “correct” value $\ell$ of $n$ a priori.

(B): The validity of the assertion of weak exogeneity is criterion (iii). Unfortunately, weak exogeneity per se is not easily tested in a class of models like (2.4); and to do so may require modelling $\{z_t\}$, thereby defeating the main purpose of the conditioning assumption. However, if the data generation process of $\{z_t\}$ does not stay constant over the sample, yet $\phi_1$ is constant in (2.3), then this enhances the credibility of the weak exogeneity assertions underlying (2.3). When $\phi_1$ is invariant to changes in $\phi_2$ and (2.3) is valid, then $z_t$ is said to be super exogenous for $\phi_1$.

(C): Parameter constancy (after duly incorporating all relevant transients) provides criterion (iv). The formulation in (2.4) explicitly defines certain parameters $(\beta_0, \ldots, \beta_\ell)$, changes in which would invalidate the model. It seems natural to seek models with constant parameters, whatever the purpose of the modelling exercise, and to test assertions of constancy as a check on the usefulness of the model.
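One familiar way of checking criterion (iv) is a split-sample F statistic of the Chow type, sketched below in Python. The text does not prescribe a particular constancy test at this point, so the choice of statistic, the simulated series, the break date, and the names `rss` and `chow_f` are all assumptions made for illustration.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from a least-squares regression of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return r @ r

def chow_f(X, y, split):
    """Split-sample F statistic: pooled fit against separate fits on each sub-sample."""
    T, k = X.shape
    rss_pooled = rss(X, y)
    rss_split = rss(X[:split], y[:split]) + rss(X[split:], y[split:])
    return ((rss_pooled - rss_split) / k) / (rss_split / (T - 2 * k))

# Hypothetical data (assumptions): one series with constant parameters,
# one whose slope shifts from 0.5 to 2.0 halfway through the sample.
rng = np.random.default_rng(4)
T = 400
X = np.column_stack([np.ones(T), rng.normal(size=T)])
eps = rng.normal(size=T)
y_const = X @ np.array([1.0, 0.5]) + eps
first_half = np.arange(T)[:, None] < T // 2
beta_t = np.where(first_half, [1.0, 0.5], [1.0, 2.0])  # T x 2 array of time-varying betas
y_break = (X * beta_t).sum(axis=1) + eps

f_const = chow_f(X, y_const, T // 2)
f_break = chow_f(X, y_break, T // 2)
print("F, constant parameters:", round(f_const, 2))
print("F, shifted slope:      ", round(f_break, 2))
```

Under constancy the statistic is referred to an F distribution with k and T - 2k degrees of freedom; a value as large as the second one would lead an investigator to re-design the model.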

Summarising, (i) + (ii), (iii), and (iv) respectively relate to the validity of assumptions concerning lagged, contemporaneous, and leading data relative to any given observation at time $t$. In econometrics, (i)-(iv) are reasonably conventional criteria for selection and evaluation of models. A model which satisfies (i)-(iv) will be useful for forecasting $y_t$ if $L_t$ is available when forecasts are made; however, if $z_t$ is contemporaneous with $y_t$, then $L_t$ will rarely be known at time $t-1$. Also, $L_t$ need not be a “good” information set for explaining $y_t$, nor need the $\{\beta_i\}$ in (2.4) bear sensible economic interpretations or be constants across different states of the world. Consequently, while these data criteria are necessary, they are not sufficient to justify a given model for inference, forecasting, or policy analysis. Indeed, three further criteria are of equal importance.

(D): The first of these, criterion (v), is theory consistency, which is also standard in econometrics and requires that an empirical model should reproduce the theory from which it is ostensibly derived under the hypothetical conditions relevant to that theory. That may sound weak, but some published equations violate (v), and finding a model form which is theory-consistent in several different but relevant hypothetical states of the world can be non-trivial.

(E): Next, data admissibility, criterion (vi), entails that a model should be unable to predict data values which violate definitional constraints. For example, that prices are always non-negative or that houses not started cannot be completed are data requirements and so should be satisfied automatically (i.e., with probability one) by the model. Clearly, (vi) is closely related to the choice of functional form.

(F): Finally, and perhaps most importantly, encompassing (labelled criterion (vii)) requires that any model $F(\cdot)$ claimed to adequately represent the data generation process $D(\cdot)$ should be able to account for the results obtained by other models of that process. That follows because if one knew the mechanism generating all the data (as in a Monte Carlo study, say), which here would be $D(w_1, \ldots, w_T \mid W_0; \psi)$, then by formal reductions equivalent to those which produced $F(\cdot)$, one could deduce what parameter values should be found in other models of the mechanism (at least in large samples). Consequently, if the selected model is claimed to characterise the data process adequately, it too should satisfy that requirement and allow the results of rival models to be derived. Should the estimated parameters of rival models which are in fact obtained differ significantly from those derived using the selected model, then that would contradict the assertion that the selected model adequately described the data generation process. Thus, encompassing requires that any model $F(\cdot)$ of the mechanism generating $\{x_t\}$ should mimic that property of $D(\cdot)$ and be able to account for the empirical results reported by rival models of $\{x_t\}$.

Before concluding this section, we consider the implications of this concept and its relation to testing non-nested hypotheses, which can be seen most easily for two rival linear models:

(2.5) $H_1: E(y_t \mid z_{1t}) = \delta_1' z_{1t}$

and

(2.6) $H_2: E(y_t \mid z_{2t}) = \delta_2' z_{2t}$,

where each hypothesis separately asserts $v_{it} = (y_t - \delta_i' z_{it}) \sim IN(0, \sigma_{ii})$. In (2.5) and (2.6), $\delta_1$ and $\delta_2$ are $k_1 \times 1$ and $k_2 \times 1$ vectors of unknown parameters, and $z_{1t}$ and $z_{2t}$ (generic symbols for sets of regressors) have (at least) some variables which are not in common. For simplicity, we assume they have none in common. Formally, the joint density of $y_t$, $z_{1t}$, and $z_{2t}$ can be factorised as $F(y_t \mid z_{1t}, z_{2t}; \cdot) \cdot F(z_{1t} \mid z_{2t}; \cdot) \cdot F(z_{2t}; \cdot)$. Note that, given the joint density, both (2.5) and (2.6) must be derived representations; hence, while separate, they are also inter-related. Here, (2.5) entails the conditional irrelevance of $z_{2t}$ in explaining $y_t$ given $z_{1t}$. Under joint normality, $z_{1t}$ and $z_{2t}$ can be linked using:

(2.7) $z_{1t} = \Pi z_{2t} + \xi_t$,

where $\Pi$ is defined by $E(z_{1t} \mid z_{2t}) = \Pi z_{2t}$ (so $E(z_{2t} \xi_t') = 0$), and (again for expositional simplicity) we assume $E(v_{1t} \xi_t) = 0$. From (2.5) and (2.7),

(2.8) $y_t = \delta_1' z_{1t} + v_{1t} = \delta_1' (\Pi z_{2t} + \xi_t) + v_{1t} = \delta_1' \Pi z_{2t} + (v_{1t} + \delta_1' \xi_t)$.

Consequently, (2.8) is what the model (2.5) predicts the model (2.6) should find, so that if (2.5) is to encompass (2.6), it must be the case that

(2.9a) $H_a: \delta_2 = \Pi' \delta_1$

and

(2.9b) $H_b: \sigma_{22} = \sigma_{11} + \delta_1' \Omega \delta_1$, where $\Omega = E(\xi_t \xi_t')$.

The hypothesis in (2.9a) is called parameter encompassing, and that in (2.9b) variance encompassing, where a least-squares notion of encompassing is being employed. In passing, there seems little point in testing (2.5) against (2.6) or vice versa unless both models do satisfy their claimed formulations, which first requires evaluating both on criteria (i)-(iv) at least. If so, then (in large samples) the non-negative definiteness of $\Omega$ entails that $H_b$ cannot hold unless $\sigma_{11} \le \sigma_{22}$, i.e., $H_1$ variance-dominates $H_2$. Thus, variance dominance is necessary, but not sufficient, for variance encompassing. That in turn entails that encompassing is asymmetric in the present context: if (2.5) encompasses (2.6), the converse is false. Also, as $H_a$ is sufficient for $H_b$, it is readily established that least-squares encompassing is transitive. Neither of (2.5) or (2.6) may encompass the other, in which case a more general model is necessary. Thus, encompassing defines a partial ordering over models, an ordering related to that based on goodness-of-fit; however, encompassing is more demanding. It is also consistent with the concept of a progressive research strategy (e.g., see Lakatos [1970]), since an encompassing model is a kind of “sufficient representative” of previous empirical findings.
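The parameter-encompassing and variance-encompassing relations can be checked numerically. In the Python sketch below, data are simulated so that the first model is the valid one, and the rival regression on the second set of regressors is compared with what the first model predicts it should find. The dimensions, the numerical values of the parameters, and the helper name `ols` are assumptions introduced for the example.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients and residuals (y may have several columns)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

# Hypothetical design (all values are assumptions for this sketch).
rng = np.random.default_rng(2)
T, k1, k2 = 5000, 2, 3
Pi = np.array([[0.5, -0.3, 0.2],
               [0.1, 0.4, -0.6]])             # k1 x k2 link between z1 and z2
delta1 = np.array([1.0, -0.5])
Z2 = rng.normal(size=(T, k2))
Z1 = Z2 @ Pi.T + rng.normal(scale=0.8, size=(T, k1))   # z1 = Pi z2 + xi
y = Z1 @ delta1 + rng.normal(scale=0.5, size=T)        # model 1 generates y

d1_hat, v1 = ols(Z1, y)                       # model 1: y on z1
d2_hat, v2 = ols(Z2, y)                       # rival model 2: y on z2
PiT_hat, xi_hat = ols(Z2, Z1)                 # auxiliary regression, estimates Pi'

s11 = v1 @ v1 / (T - k1)
s22 = v2 @ v2 / (T - k2)
Omega_hat = xi_hat.T @ xi_hat / (T - k2)

implied_d2 = PiT_hat @ d1_hat                 # what model 1 predicts for delta_2
implied_s22 = s11 + d1_hat @ Omega_hat @ d1_hat

print("delta_2 estimated:", d2_hat.round(3), " implied:", implied_d2.round(3))
print("sigma_22 estimated:", round(s22, 3), " implied:", round(implied_s22, 3))
```

When the first model is valid, the estimated and implied quantities should agree up to sampling error, and the first model should have the smaller residual variance.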

More generally, an encompassing strategy suggests trying to anticipate problems in rival models of which their proponents may be unaware. For example, (2.5) may correctly predict that $\{v_{2t}\}$ is not white noise, or that $\delta_2$ is not constant over $t = 1, \ldots, T$ (e.g., if $\delta_1$ is constant but $\Pi$ varies as $t$ does, then (2.5) predicts that $\delta_2$ should vary with $t$). Corroborating such phenomena adds credibility to the claim that the successful model reasonably represents the data process, whereas disconfirmation clarifies that it does not.

For the specific class of linear models, the propositions in (2.9) are testable by a large range of tests. Of these, perhaps the best known belong to the class of one-degree-of-freedom tests proposed by Cox [1961, 1962] using a modified likelihood-ratio statistic and implemented by Pesaran [1974] for models like (2.5) versus (2.6). That class seems to test $H_b$, which is necessary but not sufficient for $H_a$ when $k_2 > 1$. (See the discussion following the survey by MacKinnon [1983].) Mizon and Richard [1983] and Mizon [1984] present equations for generating a very large class of tests of either $H_a$ or $H_b$, or other functions of parameters for which encompassing is deemed relevant. Under $H_1$, $E(y_t \mid z_{1t}, z_{2t}) = z_{1t}' \delta_1$, so that $E(v_{1t} \mid z_{2t}) = 0$. Consequently, if $\delta_1$, $\delta_2$, and $\Pi$ are separately estimated under their own assumptions and used to construct an estimate of $\alpha = \delta_2 - \Pi' \delta_1$ (so $H_a$ becomes $\alpha = 0$), then a minor transformation of the Wald test of $\alpha = 0$ yields the conventional F-test on the marginal significance of adding (the non-redundant elements of) $z_{2t}$ to (2.5) (see Atkinson [1970] and Dastoor [1983]). It is unsurprising that in the present linear context there should be no sharp dichotomy between nested and non-nested approaches to testing (2.5) against (2.6). However, the union of (2.5) and (2.6) must always encompass both, so parsimonious encompassing is essential to avoid vacuous formulations. For example, if adding $z_{2t}$ to (2.5) produces an insignificant improvement in fit, then (2.5) parsimoniously encompasses the model embedding (2.5) and (2.6). This aspect of simplicity, therefore, remains important in establishing credible models. For an empirical attempt at encompassing a range of disparate models using an embedding strategy, see Davidson et al. [1978] and the follow-up in Davidson and Hendry [1981]; conversely, Bean [1981] investigates encompassing using the Cox approach. Note, however, from (2.9b), that only one direction of testing is really worthwhile and that all models in Bean's study that are variance-dominated are rejected.
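The conventional F-test on adding the rival regressors to the first model can be sketched as follows. The Python code is illustrative only: the simulated regressors, the coefficient values, and the function names `rss` and `added_variable_f` are assumptions, and the statistic is reported without a p-value to keep the example dependent on NumPy alone.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares from a least-squares regression of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return r @ r

def added_variable_f(Z1, Z2, y):
    """F statistic for the marginal significance of adding Z2 to a regression on Z1."""
    T = len(y)
    k1, k2 = Z1.shape[1], Z2.shape[1]
    rss_restricted = rss(Z1, y)
    rss_union = rss(np.column_stack([Z1, Z2]), y)
    return ((rss_restricted - rss_union) / k2) / (rss_union / (T - k1 - k2))

# Hypothetical regressors (assumptions): z1 and z2 are correlated but distinct.
rng = np.random.default_rng(3)
T = 500
Z2 = rng.normal(size=(T, 2))
Z1 = 0.6 * Z2 + rng.normal(size=(T, 2))
noise = rng.normal(size=T)

y_valid = Z1 @ np.array([1.0, -0.5]) + noise          # model 1 is adequate
y_rival = y_valid + Z2 @ np.array([0.8, 0.0])         # z2 carries extra information

f_valid = added_variable_f(Z1, Z2, y_valid)
f_rival = added_variable_f(Z1, Z2, y_rival)
print("F when model 1 is adequate:   ", round(f_valid, 2))
print("F when z2 adds information:   ", round(f_rival, 2))
```

A small statistic in the first case is consistent with the first model parsimoniously encompassing the union of the two models, whereas the large statistic in the second case indicates that it does not.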

Given the best available theoretical formulation of any problem, it seems sensible to design models to satisfy (i)-(vii) as far as possible, recognising the possibility of conflict between criteria for any limited class of models under consideration. In particular, data admissibility is remarkably difficult to achieve in practice without simply asserting that errors are drawn from truncated distributions with bounds which conveniently vary over time; and theory consistency and variance dominance also may clash, especially for parsimonious formulations. How a compromise is achieved must depend on the objectives of the analysis (e.g., forecasting, policy advice, testing economic theories, etc.) as well as on creative insights which effect a resolution.

Since (i)-(vii) are to be satisfied by appropriate choice of the model given the data, relevant “test statistics” are little more than selection criteria, since “large” values on such tests would have induced a re-designed model. Genuine tests of a data-based formulation then occur only if new data, new forms of tests, or new rival models accrue. Such an approach is similar in spirit to the data-based aspect of Box and Jenkins's [1976] methods for univariate time-series modelling, but emphasises the need to estimate the most general model under consideration to establish the innovation variance. Moreover, existing empirical models and available subject-matter theory play a larger role, while being subjected to a critical examination for their data coherency on (i)-(vii).

Indeed, econometric analysis always has involved a close blend of economic theory and statistical method (e.g., see Schumpeter [1933]).

That economic analysis should be used is unsurprising, but the role of statistics has proved more problematical in terms of a complete integration of the economic and statistical aspects of model formulation, even though Haavelmo [1944] stressed the necessity of carefully formulating the statistical basis of an economic theory-model. He also showed the dangers of simply “adding on" disturbance terms to otherwise deterministic equations and asserting convenient properties to justify (say) least-squares estimation.

Conversely, massive difficulties confront any purely data-based method, since the interdependence of economic variables entails a vast array of potential relationships for characterising their behaviour. That aspect is discussed more fully in Hendry et al. [1984], but the theory-model in Section 3 highlights the existence of many derived equations from a small set of “autonomous” relationships. Since economic systems are far from being constant, and the coefficients of derived equations may alter when any of the underlying parameters or data correlations change, it is important to identify models which have reasonably constant parameters and which remain interpretable when some change occurs. That puts a premium on good theory.

While our paper remains far from resolving these fundamental issues, it seeks to link the two aspects by using considerations from both Sections 2 and 3 in formulating the empirical equations of Section 4.

3. The Economic Theory-Model

As noted above, it is important to distinguish autonomous from derived relationships in a non-constant world; yet in practice it is exceptionally difficult to do so. The main basis for any asserted status of an equation must be its correspondence (or otherwise) to a theoretical relationship. Thus, to guide the data analysis, we present a suggestive, if somewhat simplistic, theory-model which highlights what dependencies between variables might be anticipated.

Most house builders are small in terms of the fractions of the markets they supply (housing) and from which they demand inputs (labour, capital, land, materials and fuel). In the longer run, competitive forces might be expected to operate in such conditions so that only normal profits are earned (the going rate of return on capital), with builders who fail to minimise costs eventually being eliminated. Consequently, despite its artificiality, insight can be gained by analysing the decision processes of a single builder who produces homogeneous units, faces given costs, uses best practices, and seeks to optimise his expected long-run return. Those assumptions would allow one to formulate an “optimal-control” model yielding linear, intertemporal decision rules which maximise the expected value of the postulated objective function conditional on costs and demand, by using the certainty-equivalence principle (e.g., see Theil [1964, pp. 52ff.]). An analytical solution can be obtained only by postulating known and constant stochastic processes for the uncontrolled variables. Such an assumption in effect removes the uncertainty from the problem, and will be interpreted here as narrowing the applicability of the resulting theory to an equilibrium world: that is, one which is stationary, essentially certain, and devoid of problems like evolving seasonality, adverse weather, changes in legislation or tastes, and so on. Nevertheless, the resulting equations help to constrain the equilibrium solutions of the empirical model as well as to indicate relevant variables and parameterisations of interest.

Builders, as location-specific suppliers of new dwellings, have some element of monopolistic power and can influence sales somewhat by (say) advertising. In the medium term, they can determine the volume or the price of their new construction (or possibly some combination thereof); usually, their supply schedule reflects a willingness to supply more houses with higher profitability of construction. Conversely, final purchasers demand more housing as its price falls relative to that of (e.g.) goods and services, and choose between new and second-hand units on the basis of their costs (other factors being constant). In a schematic formulation which deliberately abstracts from dynamics, completions of new houses in period t (denoted by $C_t$) are produced from a stock of uncompleted dwellings ($U_{t-1}$), with variations in the rate of completions depending on changes in new house prices ($Pn$) and in construction costs ($CC$). We use a log-linear representation

(3.1) $c^s_t = \beta_0 + \beta_1 u_{t-1} - \beta_2 cc_t + \beta_3 pn_t$  ($\beta_i > 0$, $i = 1,2,3$),

where lowercase variables denote logarithms of the corresponding capitalized variables and $c^s_t$ denotes the planned supply of completions.

Letting $S_t$ denote starts of new dwellings, then

(3.2) $U_t = U_{t-1} + S_t - C_t$.

Thus, $U_t$ is the (end-of-period) integral of past starts less completions, and stock-flow ratios (e.g., $U_t/C_t$) are crude measures of the average lag between starting and ending construction. Since adjustment costs (hiring and firing workers, paying overtime or idle time, etc.) suggest that change is costly and costs are incurred by maintaining the inventory $U_t$, a builder minimising costs should aim for a constant rate of production.$^{4}$ Thus, $\beta_1 = 1$ seems likely in (3.1), corresponding to $C = KU$ in equilibrium ($K$ a positive constant).

Since (3.1) is conditional on the pre-existing stock of work in progress, the role of $cc_t$ and $pn_t$ is to alter the mean lag around $\exp(-\beta_0)$, with the main impact of changes in long-run profitability being via the level of $u_{t-1}$. Thus, it seems reasonable to expect $\beta_2 = \beta_3$, with both being relatively small.

On the demand side, population, income, interest rates, and the relative price of new to second-hand housing are the main determinants of purchases of completions. Again we use a log-linear equation

(3.3) $c^d_t = \gamma_0 + \gamma_1 (y-n)_t + \gamma_2 n_t - \gamma_3 (pn-ph)_t - \gamma_4 R_t$,

where $c^d_t$ is the demand for completions, $N_t$ is the total number of families in the relevant geographical region,$^{5}$ $Y_t$ is total real personal disposable income, and $R_t$ is the interest rate. It is unclear how significantly demographic factors should influence the relative price of new to existing housing (since much of their effect will be reflected in the conditioning variable $Ph_t$). So, to a first approximation, we assume $\gamma_1 \approx \gamma_2 \approx 1$; hence $n_t$ can be dropped, leaving $y_t$ to capture both scale changes (e.g., via population size) and changes in real personal disposable income per capita. Since this abstract analysis assumes homogeneous housing units,$^{6}$ a very large value of $\gamma_3$ might be anticipated, reflecting a willingness to switch freely between otherwise identical new and almost new dwellings, depending on their respective prices.

The overall demand for housing relates to the national stock, $H_t$, and, as $C_t$ is a small fraction of that stock, the average second-hand house price $Ph_t$ is determined primarily by the demand for housing in relation to the pre-existing stock, $H_{t-1}$. Given $Ph_t$, (3.3) then determines the demand for new houses, which is confronted with a supply of $c^s_t$ dwellings. In general, $c^d_t$ and $c^s_t$ will not be equal, and builders will either experience unsatisfied demand or end up holding unsold houses. Either way, they must adjust by changing output or price. However, if $\gamma_3$ is large, disequilibrium will persist until Pn is fully adjusted to Ph.

Simultaneously, $H_t$ must be altering, given the equation

(3.4) $H_t = (1-\delta_t) H_{t-1} + C_t + O_t$,

where $\delta_t$ is the rate of destruction of houses and $O_t$ is other net sources of housing supply (e.g., from the governmental or rental sectors). The whole stock-flow system evolves until an appropriate combination of stock and price results, with flows in balance.
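The stock-flow adjustment implied by (3.4) can be illustrated numerically. The following is a minimal sketch, assuming a constant destruction rate, a constant flow of completions, and no other sources of supply; the function name and all numerical values are hypothetical and are not taken from the paper.

```python
def simulate_stock(h0, completions, delta, other=0.0, periods=2000):
    """Iterate H_t = (1 - delta) * H_{t-1} + C_t + O_t with constant flows."""
    stock = h0
    for _ in range(periods):
        stock = (1.0 - delta) * stock + completions + other
    return stock


# Hypothetical values chosen only for illustration (not estimates from the paper).
DELTA = 0.02         # assumed constant rate of destruction per period
COMPLETIONS = 50.0   # assumed constant flow of completions per period

final_stock = simulate_stock(h0=1000.0, completions=COMPLETIONS, delta=DELTA)

# With flows in balance, completions just replace destroyed houses: C = delta * H.
print(final_stock, COMPLETIONS / DELTA)
```

Under these assumptions the stock settles where completions equal destructions, which in logarithms is the static condition used for the equilibrium analysis of this section.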

If we consider a static equilibrium defined by $c^d = c^s$ and all change ceasing, then, from (3.4),

(3.5) $c^s = \ln\delta + h$,

using log-linear equations where $\delta_t = \delta$ and $O_t = 0$ for simplicity. Further, we assume that the function for the total demand for housing can be written as

(3.6) $h = \lambda_0 + \lambda_1 y - \lambda_2 (ph-p) - \lambda_3 R + \lambda_4 z$,

and the function for the volume of work in progress is

(3.7) $u = \kappa_0 + \kappa_1 (pn-cc) + \kappa_2 z$,

where z denotes other exogenous influences (such as technology in (3.7)) and P is the overall price level of goods and services. Together with (3.1) and (3.3), we obtain five equations to determine the equilibrium values of c, h, u, pn, and ph, given y, $\delta$, p, R, cc, and z. Consequently, in such an equilibrium pn is determined by cc and the factors affecting profitability. Since the system evolves so long as a disequilibrium persists, $pn_t$ will reflect current and past values of construction costs. Nevertheless, "explaining" $pn_t$ by conditioning on $\{cc_{t-i},\ i = 0, \ldots, \ell\}$ alone would not necessarily produce a useful model.$^{7}$ One way to see such an argument is to consider (3.1) and (3.3) being equated instantaneously by $Pn_t$ adjusting so that $c^s_t = c^d_t$. Then

(3.8) $pn_t = (\gamma_3+\beta_3)^{-1}\,[(\gamma_0-\beta_0) + \gamma_3 ph_t + \beta_2 cc_t + \gamma_1 y_t - \beta_1 u_{t-1} - \gamma_4 R_t]$.

For large values of $\gamma_3$, we have $pn_t \approx ph_t$ and the influence of $cc_t$ becomes small.$^{8}$ A generalisation of (3.8) is the basis for the empirical model presented below, which turns out to yield estimates consistent with the view that $\gamma_3$ is indeed large and $\gamma_1 \approx 1$. However, there is evidence that $c^s_t \neq c^d_t$ in general, using data on a series for the stock of unsold completions, $US_t$ (which was collected only up until 1978). From the identity

(3.9) $US_t = US_{t-1} + C_t - S\ell_t$,

where $S\ell_t$ denotes sales of completions, $\Delta_1 us_t$ reflects changes in net demand relative to the outstanding stock.$^{9}$ As shown in Figure 2, $US_t$ has fluctuated substantially. A model for $US_t$ is developed at the end of Section 4.
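The market-clearing argument behind (3.8) can be sketched in code. The example below solves the log-linear supply schedule (3.1) and the demand schedule (3.3), with the population term dropped, for the price of new houses, and shows how that price approaches the second-hand price as the relative-price elasticity $\gamma_3$ grows. All parameter values and log-levels are hypothetical and serve only as an illustration.

```python
def supply(pn, cc, u_lag, beta):
    """Planned supply of completions, equation (3.1), in logs."""
    b0, b1, b2, b3 = beta
    return b0 + b1 * u_lag - b2 * cc + b3 * pn


def demand(pn, ph, y, r, gamma):
    """Demand for completions, equation (3.3) with n dropped, in logs."""
    g0, g1, g3, g4 = gamma
    return g0 + g1 * y - g3 * (pn - ph) - g4 * r


def market_clearing_pn(ph, cc, y, u_lag, r, beta, gamma):
    """Price that equates supply and demand, i.e. equation (3.8)."""
    b0, b1, b2, b3 = beta
    g0, g1, g3, g4 = gamma
    numerator = (g0 - b0) + g3 * ph + b2 * cc + g1 * y - b1 * u_lag - g4 * r
    return numerator / (g3 + b3)


# Hypothetical log-levels and parameters, for illustration only.
PH, CC, Y, U_LAG, R = 1.0, 0.5, 2.0, 1.5, 0.1
BETA = (0.0, 1.0, 0.2, 0.2)          # (beta0, beta1, beta2, beta3)

gaps = []
for gamma3 in (1.0, 10.0, 100.0):
    gamma = (0.0, 1.0, gamma3, 0.5)  # (gamma0, gamma1, gamma3, gamma4)
    pn = market_clearing_pn(PH, CC, Y, U_LAG, R, BETA, gamma)
    gaps.append(abs(pn - PH))
    print(f"gamma3={gamma3:6.1f}  pn={pn:.4f}  |pn - ph|={gaps[-1]:.4f}")
```

In this sketch the weight on construction costs is $\beta_2/(\gamma_3+\beta_3)$, so it shrinks as $\gamma_3$ increases, which is the sense in which conditioning on construction costs alone need not give a useful model.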

As a consequence of these considerations, our model class was required to reproduce (3.8) only under equilibrium assumptions, but otherwise was determined empirically by commencing from an unrestricted autoregressive-distributed lag equation in which all variables entered with up to four lags. Additive seasonal intercepts were included, since none of the data series was seasonally adjusted, but these did not prove significant. The log-linear specification was retained because it ensured positive predictions of prices, yielded parameters which were elasticities and so could in principle be constant over time, and allowed freedom to switch between any of a number of sensible alternatives for the dependent variable [such as pn, (pn-ph), (pn-cc), (pn-p), (pn-p-y), and changes in all of these] without altering the specification of the model. Moreover, a constant percentage residual standard error also seemed a reasonable requirement.

Figure 2. Logarithm of the stock of unsold completions ($us_t$).

The basic difficulties inherent in econometric modelling have now been introduced, and prior to empirical implementation it is worth considering: why bother? A straightforward answer is that anything less than a properly specified, autonomous relationship may "break down" whenever the statistical properties of any of the actually relevant variables alter. In practice, such events occur with monotonous regularity. Seen from this perspective, time-series models are simply very special cases of econometric equations in which all (for univariate models) or most (for "transfer function" models) covariates are ignored. Consequently, they too should regularly "break down" or mispredict; and, in practice, they do indeed do so. However, tests of predictive failure of time-series models may have low power because the models themselves fit poorly, in which case large mispredictions are needed to obtain "significant" outcomes. Econometric equations also often badly mispredict, revealing their inadequacy; but they remain susceptible to progressive improvement using the approach discussed in Section 2.

Nevertheless, much of the reduction in the error variance may derive from only a few additional factors, as the residual standard deviations $\hat\sigma$ in Table 1 illustrate.$^{10}$ There, k is the total number of regressors in each model, the sample size is 94 (1959(i)-1982(ii)), and the mean and standard deviation of $\{\Delta_1 pn_t\}$ (the regressand in every case) are 2.59% and 2.38% respectively. As can be seen, three variables effect a 35% reduction in $\hat\sigma$ over the unconditional standard deviation, whereas it takes twelve to effect a 60% decrease.

Table 1. A Comparison of $\hat\sigma$ for Different Models of $\Delta_1 pn_t$

                  Univariate        Bivariate      Econometric
                  autoregression    model          model
                  Eq. (4.1)         Eq. (4.2)      Eq. (4.4)

$\hat\sigma$      1.55%             1.29%          0.94%
k                 3                 6              12
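The percentage reductions quoted for Table 1 follow directly from the reported standard deviations. A minimal sketch of the arithmetic is given below; the dictionary keys and the function name are labels chosen here for illustration.

```python
# Figures as reported in the text and in Table 1 (all in percent).
UNCONDITIONAL_SD = 2.38
SIGMA_HAT = {
    "univariate (4.1)": 1.55,
    "bivariate (4.2)": 1.29,
    "econometric (4.4)": 0.94,
}


def percent_reduction(sigma, baseline=UNCONDITIONAL_SD):
    """Percentage fall in the residual standard deviation relative to baseline."""
    return 100.0 * (1.0 - sigma / baseline)


reductions = {name: percent_reduction(s) for name, s in SIGMA_HAT.items()}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}% below the unconditional standard deviation")
```

The first and last entries correspond to the roughly 35% and 60% reductions mentioned in the text.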

A stronger justification than goodness-of-fit is needed for the larger size of the econometric specification. The two natural arguments are the resulting understanding of how the market functions (of obvious importance for predicting the complicated indirect effects flowing from changing government policies) and the feedback to improve economic analysis of markets in general (so that theories can commence from corroborated models, rather than from a priori assertions). The following results should help in judging the realism of such justifications.

4. Empirical Findings

As an illustration of the methodology described in Sections 2 and 3, we model the determinants of new house prices and present evidence for market disequilibria using the series on unsold completions. Before doing so, we first develop simple time-series models of $pn$: such models establish a useful baseline against which to evaluate our econometric model of $pn$. Further, those models will allow insight into the role of economic theory in econometric modelling.

The historical quarterly time series for the rates of change of pn and (pn-p) (i.e., nominal and real new house prices respectively) are shown in Figures 3 and 4. The series are highly volatile, and two large "booms" in 1971-1974 and 1977-1979 are evident. Also, $\Delta_1 pn_t$ has tended to be positive (in almost every quarter), whereas $\Delta_1 (pn-p)_t$ often has been negative. Precise data definitions are recorded in Appendix B. Next, the relative price $(pn-ph)_t$ is shown in Figure 5 and reveals substantial swings: generally, (pn-ph) falls (rises) when ph is rising (falling) most rapidly, suggestive of pn adjusting to lagged (and possibly current) ph.

A fifth-order autoregression for pn suggested the following simplified model:

(4.1) A,pa. = 0.50 Ajpn._, ~ 0.24 Arpn,_, + 0.006 Mt ¢06) 2&2 toy F 83.003)

T = 94, $R^2$ = 0.59, $\hat\sigma$ = 1.55%, $\eta_1(6,85)$ = 0.7, $\eta_2(6,85)$ = 0.15, $\eta_3(2,89)$ = 1.0, $\eta_5(4,87)$ = 0.7, $\xi_6(1)$ = 0.04, $\xi_7(2)$ = 27.9.

In (4.1), coefficient standard errors are shown in parentheses, T denotes the sample size, $R^2$ is the squared multiple correlation coefficient, and $\hat\sigma$ is the residual standard deviation. The $\eta_i(\cdot)$ are test statistics labelled as far as possible to correspond to the order of the criteria in Section 2, and the figures in parentheses are their degrees of freedom.

Figure 3. Rate of change of new house prices ($\Delta_1 pn_t$).

Figure 4. Rate of change of real new house prices [$\Delta_1 (pn-p)_t$].

Figure 5. Logarithm of the relative price of new to second-hand houses [$(pn-ph)_t$].

All statistics except $\eta_2(\cdot)$ are viewed as Lagrange-multiplier or efficient score statistics (see Rao [1948]). Under the relevant null, $\eta_i$ is distributed in large samples as a central F; similarly, $\xi_i(j)$ is asymptotically $\chi^2(j,0)$. The statistics are

$\eta_1(\cdot)$ = Lagrange-multiplier statistic for testing against residual autocorrelation (see Godfrey [1978], Harvey [1981, p. 173]),

$\eta_2(\cdot)$ = Wald statistic for testing against the relevant unrestricted maintained model (e.g., in (4.1), a fifth-order autoregression with seasonally shifting intercepts),

$\eta_3(\cdot)$ = Chow's [1960, pp. 594-595] statistic for testing parameter constancy,

$\eta_5(\cdot)$ = White's [1980] statistic for testing against residual heteroscedasticity,

$\xi_6(\cdot)$ = Engle's [1982] ARCH statistic (i.e., for testing against first-order autoregressive conditional heteroscedasticity),

$\xi_7(\cdot)$ = Jarque and Bera's [1980] statistic for testing against non-normality in the residuals (based on skewness and excess kurtosis).

From $\eta_1(\cdot)$, (4.1) has white-noise residuals, and $\eta_2(\cdot)$ confirms it is an acceptable simplification of the autoregression. The parameters are not significantly non-constant over the last two observations (despite measurement problems noted below) or indeed over several longer test samples.$^{11}$ There is no evidence of residual heteroscedasticity, but the residuals are highly non-normal, reflecting the marked failure of (4.1) to predict the large changes in pn observed during the boom periods.
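To make the normality statistic concrete, the sketch below computes a statistic from the sample skewness and excess kurtosis of a set of residuals, using the common textbook form $n\,(S^2/6 + K^2/24)$. That form is an assumption here: the paper does not spell out which variant (for instance, with degrees-of-freedom corrections) was used, and the residuals below are made up for illustration.

```python
def jarque_bera(residuals):
    """Normality statistic n * (S^2 / 6 + K^2 / 24) from skewness S and excess kurtosis K."""
    n = len(residuals)
    mean = sum(residuals) / n
    # Central sample moments of orders two, three, and four.
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return n * (skewness ** 2 / 6.0 + excess_kurtosis ** 2 / 24.0)


# Hypothetical residuals: a symmetric set, and the same set with one large outlier.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
with_outlier = symmetric + [15.0]

jb_symmetric = jarque_bera(symmetric)
jb_outlier = jarque_bera(with_outlier)
print(jb_symmetric, jb_outlier)
```

A few large residuals, such as those arising in boom periods, raise both the skewness and the kurtosis terms and hence the statistic.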

An indirect check on the usefulness of the theory-model of Section 3 is that lagged values of ph should be informative about present pn, since it is assumed that the market for the stock of houses nearly clears each period whereas that for the flow adjusts more slowly while builders adapt to disequilibria. Thus, a bivariate model of pn on lagged values of pn and ph should perform better than (4.1). Simplifying from a bivariate model with up to fifth-order lags on pn and ph yielded the following equation:

(4.2) ipa, = 0.55 Ayph,_, + 0-29 Aypn,_, ~ 0.004 [.08] [.07] [.004]

+ 0.012 Q,. + 0.012 Q,. + 0.008 Q t.004) ** f.o04) 2% [.005) 3

T = 94, $R^2$ = 0.72, $\hat\sigma$ = 1.29%, $\eta_1(6,82)$ = 1.2, $\eta_2(8,80)$ = 0.6, $\eta_3(12,76)$ = 2.3, $\eta_5(13,75)$ = 4.0, $\xi_6(1)$ = 0.02, $\xi_7(2)$ = 6.0.

$[\cdot]$ denotes White's [1980] heteroscedasticity-consistent estimate of the standard error. Manifestly, $\Delta_1 ph_{t-1}$ is a highly significant predictor of $\Delta_1 pn_t$: thus, (4.2) is as usable for one-step-ahead predictions as (4.1), but has a significantly smaller residual variance. Although (4.1) and (4.2) are not nested, (4.1) is a special case of the unrestricted version of (4.2) which the latter parsimoniously encompasses. Directly testing the significance of lags of ph in the unrestricted version of (4.2) yields $\eta_2(5,80)$ = 8.6, so that the white-noise residual of (4.1) is far from being an innovation on the joint information set generated by (pn, ph). Conversely, the errors on (4.2) are accepted as being an innovation process on that information set. However, $\eta_5(\cdot)$ reveals residual heteroscedasticity in (4.2), and $\eta_3(\cdot)$ indicates the possibility of nonconstant parameters.

That last problem is probably due primarily to measurement errors. During late 1981, commercial banks began to expand rapidly their loans for house purchases in competition with building societies.$^{12}$ Banks lent mainly against more expensive dwellings, and in larger than average loans. Thus, they attracted a distinctly biased sample of house purchasers. However, the data series are based on returns for the average prices of houses sold with a mortgage from a building society. Consequently, the series were distorted for a period until building societies attracted back a representative selection of house purchasers. For second-hand house prices, the main biases appear to have been in 1981(iv) and 1982(i) and (ii) (the termination of our data period). Here, the model is conditional on lagged ph; and as the relative distortion between the price series for new and existing house purchases is not known, it is difficult to assert any precise pattern for the residuals consequent on the measurement problem. For example, if the bias in observing $\Delta_1 ph_t$ is $d_t$ and that in observing $\Delta_1 pn_t$ is $e_t$, then

Ta Patol r~ (4.3) @)pn, = 0.55A,ph,_, + 0.294)pn,_, + je, - 0-29e,_5 - 0.554, _j} + &;

where the innovation error is denoted by $\varepsilon_t$ and we assume (4.2) holds for the correctly measured series. (Seasonal factors are ignored for simplicity.) The restriction that $e_t = d_t$, with the latter measured as -2.1%, -2.1%, and +4.2% in the three relevant quarters (from Hendry [1984]), could be rejected. This finding is consistent with the observed predictions from (4.1), which suggest a smaller but more prolonged distortion, but is also interpretable as evidence against the hypothesis that the large value of $\eta_3(\cdot)$ here (or the corresponding test statistic for the ph model) is mainly due to mismeasurement. Until later data become available to clarify the issue, some doubt must remain concerning how distorted the series are over the last four observations. Below, however, we will continue to act as if the hypothesis were valid and, from the patterns of the residuals in (4.1) and (4.2), construct a dummy variable D with the values (1, 1, 0, -1, -1) from 1981(iii) to 1982(iii).$^{13}$

The unrestricted fourth-order autoregressive-distributed lag representation with (3.8) as its static-equilibrium solution is shown in Table 2, in a reparameterisation intended to aid interpretability of this highly over-parameterised equation.$^{14}$ Unsurprisingly, few of the

Table 2: An Unrestricted Model for $\Delta_1 pn_t$ $^{a}$

Variable                   j = 0          1              2              3              4
$\Delta_1 pn_{t-j}$        -1             .05 (.16)      .25 (.16)      .04 (.15)      -.13 (.11)
$\Delta_1 ph_{t-j}$        .35 (.09)      -.05 (.13)     -.06 (.13)     .04 (.12)      -.08 (.12)
$\Delta_1 u_{t-j}$         -              .00 (.07)      -.18 (.08)     .01 (.08)      -.11 (.08)
$\Delta_1 y_{t-j}$         .05 (.08)      -.21 (.14)     .21 (.12)      .15 (.12)      .05 (.10)
$\Delta_1 (cc-p)_{t-j}$    .14 (.11)      .11 (.12)      .14 (.11)      .05 (.11)      -.03 (.10)
$\Delta_1 p_{t-j}$         .07 (.18)      .20 (.19)      .12 (.19)      .12 (.18)      .04 (.17)
$Q_{jt}$                   .41 (1.21)     .003 (.008)    .002 (.010)    .001 (.008)    -

$ph_{t-j}$                 -              -.09 (.07)     -              -              -
$(u-y)_{t-j}$              -              -.01 (.02)     -              -              -
$y_{t-j}$                  -              -.05 (.13)     -              -              -
$(cc-p)_{t-j}$             -              -              -              -              .18 (.10)
$p_{t-j}$                  -              -              -              -              .11 (.05)
$(pn-ph)_{t-j}$            -              -.48 (.15)     -              -              -

$^{a}$ T = 94, $R^2$ = 0.87, $\hat\sigma$ = 1.092%, $\xi_1(12)$ = 9.7, $\eta_3(11,45)$ = 1.8, $\xi_6(1)$ = 0.00. $\{Q_{jt};\ j = 0, \ldots, 3\}$ denote a constant and three seasonal shift dummy variables. $\xi_1(12)$ is Box and Pierce's [1970] statistic based on the residual correlogram.

regressors have "t-statistics" in excess of 2, although that should not be interpreted as entailing their irrelevance, given the generally high correlations between successive lags of economic variables. Notwithstanding their standard errors, many of the coefficients are negligibly small. However, three of the variables in levels omitted from both (4.1) and (4.2) have large coefficients, highlighting the role of the "equilibrating" mechanisms postulated in Section 3.

To an economist, both (4.1) and (4.2) have fatal flaws as claimed autonomous relationships, even though neither is an unacceptable data description on (i)-(iv) of Section 2. Concerning (4.1), the rate of inflation of new house prices is modelled as independent of inflation of goods and services; and, if $\Delta_1 pn_t$ became constant at $\mu$, the model would restrict $\mu$ to the single value 2.6% (the sample mean). Both implications are implausible, although only prolonged changes from the sample behaviour would reveal the empirical inadequacy of the model. Similar difficulties afflict (4.2) even though $\Delta_1 ph_{t-1}$ is used: for example, under a constant growth rate $\mu^*$ in second-hand house prices (and averaging over the seasonals), $\mu \approx 0.8\mu^* + 0.005$. Thus, the prices Pn and Ph would diverge indefinitely unless $\mu^* = 2.6\%$.
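The non-divergence condition can be worked through as a short derivation. It takes as given the steady-state relation $\mu \approx 0.8\mu^* + 0.005$ between the growth rate of new house prices ($\mu$) and that of second-hand house prices ($\mu^*$), and uses only those rounded coefficients, so the result is approximate.

```latex
\begin{align*}
\mu = \mu^* \quad &\Longrightarrow \quad \mu^* \approx 0.8\,\mu^* + 0.005 \\
&\Longrightarrow \quad 0.2\,\mu^* \approx 0.005 \\
&\Longrightarrow \quad \mu^* \approx 0.025 \quad \text{(about 2.5\% per quarter)}.
\end{align*}
```

With the rounded coefficients this gives roughly 2.5%, close to the 2.6% quoted in the text; the small gap is presumably due to rounding. At any other growth rate of second-hand prices, $\mu \neq \mu^*$ and the two price series drift apart.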

The significance of (pn-ph) in Table 2 can now be seen in perspective: any divergence of the house prices alters the conditional growth rate of pn relative to ph so as to bring the relative price (Pn/Ph) into line (see Granger and Weiss [1983] for a discussion on the relationship between time-series models, error-correction models like the one in (4.4), and the existence of long-run relationships between variables). Consistent with (3.8), the long-run relative price varies with real construction costs. That evidence is sufficiently favourable to the theory in Section 3 to merit parsimonious modelling by forming interpretable functions of the regressors in Table 2 and marginalising with respect to all the other potential variables. En route, a distributed lag in the mortgage interest rate $Rm_{t-j}$ ($j = 1, \ldots, 4$) also was allowed for, and that is reflected in the finally selected specification. Since it illustrates the relative roles of all the potential explanatory factors, the following equation was selected as being the most interesting for present purposes:

(4.4) Kon

2 = 0.67 A, (A, ph.) + 0.19 A Pn. + 0.10 A3(cc Pp) 0.16 Aju

et 1.06] [.05] 2.04] t 1.03) 2 &? + 0.09 Ajy,_, - 1.9 D° - 0.25 (pn-ph),_, - 0.027 (ph-p),_ r.o4) 7&7) 7.5} [.06] tl 1.015] t-1

+ 0.13-(ce-p),_, — 0.022 (u-y),_, - 0.23 {Rm(1-1t)},_, - 0.073 [.03] t-4 5.012] tl .13] tt 1.049]

T = 94, $R^2$ = 0.86, $\hat\sigma$ = 0.944%, $\eta_1(6,76)$ = 0.9, $\eta_3(2,80)$ = 0.9, $\xi_6(1)$ = 0.2, $\xi_7(2)$ = 1.65,

where $A_1(x_t) = 0.1 \sum_{j=0}^{3} (4-j)\, x_{t-j}$, as a normalised linearly declining distributed lag; $\tau$ is the standard tax rate, so $Rm(1-\tau)$ is the after-tax interest rate; and $D^{\circ} = D/100$, so that its coefficient represents 1.9%. The first six regressors represent disequilibrium or growth factors, all of which vanish in a static state; and the last six represent levels which persist in the equilibrium solution. Within those sets, variables are organised by influences from house prices, costs, net demand or supply, and other factors.
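The normalised linearly declining distributed lag used in (4.4) can be sketched as a small function. It assumes the weights 0.4, 0.3, 0.2, and 0.1 on the current and three preceding values, as implied by the definition above; the function name and the example series are illustrative only.

```python
def linearly_declining_lag(series, t):
    """Return 0.1 * sum_{j=0}^{3} (4 - j) * x_{t-j} for the value at index t."""
    if t < 3:
        raise ValueError("need at least three earlier observations")
    return 0.1 * sum((4 - j) * series[t - j] for j in range(4))


# Hypothetical quarterly growth rates (in percent), most recent value last.
growth = [1.0, 2.0, 3.0, 4.0, 2.0]

smoothed = linearly_declining_lag(growth, t=4)       # 0.4*2 + 0.3*4 + 0.2*3 + 0.1*2
constant_case = linearly_declining_lag([2.5] * 4, t=3)
print(smoothed, constant_case)
```

Because the weights sum to one, a constant growth rate passes through unchanged, and the weight of 0.4 on the current value means that a coefficient of 0.67 on the lag function corresponds to an impact coefficient of about 0.27.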

To analyse (4.4), we first derive its static solution by setting all growth rates equal to zero:

(4.5) $pn - ph = -0.10\,(ph-p) + 0.52\,(cc-p) - 0.09\,(u-y) - 0.92\,Rm(1-\tau) - 0.29$
      standard errors, in order: (.05), (.09), (.05), (.71), (.18)

The quoted standard errors are asymptotic approximations for non-linear functions of estimated parameters, based on the covariance matrix of the estimates in (4.4). The coefficients in (4.5) can be interpreted in the light of (3.8) (assuming $\gamma_1 = \beta_1 = 1$). That entails $\beta_2 = 6.0$, $\beta_3 = 1.2$, $\gamma_3 = 10.2$. The last two of those three coefficients are of the anticipated size and sign, but the first is so much larger than expected as to be somewhat implausible (suggesting that supply is very elastic in response to changes in real costs, rather than in response to profitability). However, in (4.4) the coefficients of $(ph-p)_{t-1}$ and $(cc-p)_{t-4}$ would have to be equal in magnitude and opposite in sign for $\beta_2$ to equal $\beta_3$, when (4.5) is interpreted as (3.8). For comparison, the static solution implied by the unrestricted model in Table 2 is

(4.6) $pn - ph = -0.19\,(ph-p) + 0.37\,(cc-p) - 0.01\,(u-y) - 0.11\,y + 0.03\,p$.

Given the uncertainty inherent in the unrestricted model, the two derived equilibria are acceptably similar.

Of course, direct estimation of (3.8) (i.e., omitting all dynamics) is not overly enlightening; but, for completeness, we record such results.$^{15}$

(4.7) $(pn-ph)_t = -0.35\,ph_t - 0.02\,u_t + 0.23\,y_t + 0.42\,(cc-p)_t + 0.32\,p_t + 0.81\,\{Rm(1-\tau)\}_t + 0.021\,Q_{1t} + 0.012\,Q_{2t} + 0.004\,Q_{3t} - 2.09$

T = 94, $\hat\sigma$ = 1.75%, $\eta_1(6,78)$ = 9.6.

While (4.7) does not even fit as well as (4.1) (which phenomenon does not entail that equilibrium economic theory is vacuous), a coherent pattern of estimates emerges across the empirical counterparts of (3.8), with the last coming close to implying that $\beta_2 \approx \beta_3$ (but $\gamma_1 \neq 1$).
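The static solution (4.5) and the structural parameters read off from it can be approximately reproduced from the level coefficients reported in (4.4). The sketch below divides each level coefficient by the negative of the feedback coefficient on $(pn-ph)_{t-1}$, then maps the result into (3.8) under the paper's assumption $\gamma_1 = \beta_1 = 1$. The variable names are labels chosen here, and because the inputs are rounded the implied parameters differ slightly from the values quoted in the text.

```python
# Coefficients as reported in (4.4); dictionary keys are labels chosen here.
FEEDBACK = -0.25  # coefficient of (pn-ph)_{t-1}
LEVELS = {
    "ph-p": -0.027,
    "cc-p": 0.13,
    "u-y": -0.022,
    "Rm(1-tau)": -0.23,
    "constant": -0.073,
}


def static_solution(levels, feedback):
    """Long-run coefficients for pn-ph: each level coefficient over minus the feedback."""
    return {name: coef / -feedback for name, coef in levels.items()}


solution = static_solution(LEVELS, FEEDBACK)

# Reading (4.5) as (3.8) with gamma1 = beta1 = 1: the (u-y) coefficient is
# -1 / (gamma3 + beta3), the (cc-p) coefficient is beta2 / (gamma3 + beta3),
# and the (ph-p) coefficient is -beta3 / (gamma3 + beta3).
scale = -1.0 / solution["u-y"]          # gamma3 + beta3
beta2 = solution["cc-p"] * scale
beta3 = -solution["ph-p"] * scale
gamma3 = scale - beta3

print(solution)
print(round(beta2, 1), round(beta3, 1), round(gamma3, 1))
```

The long-run coefficients match (4.5) to two decimals, and the implied parameters come out close to the quoted values of 6.0, 1.2, and 10.2.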

The test statistics of (4.4) suggest white-noise errors (with (4.4) parsimoniously dominating the model in Table 2 in terms of the value of $\hat\sigma$), constant parameters (rejected if $D^{\circ}$ is omitted), and approximately normal errors. Thus, the equation offers a reasonable data description and is consistent with the economic analysis of Section 3. Figure 6 shows the track of $\Delta_1 \widehat{pn}_t$ from (4.4) against that of $\Delta_1 pn_t$.

Next, we consider the dynamics of (4.4). The reaction of $pn_t$ to a change in $ph_t$ is quite rapid initially (e.g., $Pn_t$ would change by over 0.6% by the end of six months in response to a 1% change in $Ph_t$), followed by an oscillatory convergence. For construction costs, however, a much slower adjustment pattern is observed, the main feedback in (4.4) being lagged by one year. Such lag responses are also consistent with the theory-model of Section 3 and show the need to model Pn given Ph and solve for the long-run role of construction costs rather than to model Pn directly, given CC (omitting Ph). Note that, because Table 2 includes the "construction cost" hypothesis as a special case, the present approach automatically encompasses that rival model (although Appendix A should clarify that several different interpretations are possible). The dynamic impacts of changes in u and y have appropriate signs and seem more important quantitatively than their equilibrium impacts. The interest-rate coefficient is small and not well determined.

The above interpretations of the individual coefficients of (4.4) rest upon an implicit assumption of relatively orthogonal regressors. Table 3 reports the matrix of correlations for the whole sample. The figures in brackets are the partial correlations from estimating (4.4), and it is noteworthy that five of these have the opposite sign to the corresponding simple correlation, highlighting the difficulty of interpreting simple correlations directly when in a multivariate context. Of the regressor intercorrelations, 36 are smaller than 0.5 in absolute value and only two are larger than 0.75. Since both involve the term $\{Rm(1-\tau)\}_{t-1}$ and since that variable plays a small role in the model, it would seem sensible to further simplify the model by omitting interest rates altogether. Doing so, $\hat\sigma$ increases to 0.95% and $(u-y)_{t-1}$ ceases to be significant, whereas $(ph-p)_{t-1}$ becomes better determined, with most of the remaining coefficients being unaltered.

Figure 6. Rate of change of new house prices: actual and fitted from equation (4.4).

Table 3. Data correlation matrix for the variables in (4.4). Figures in brackets are the partial correlations from estimating (4.4).

Since the data behave very differently before and after 1971, and since the sample is large enough to produce sensible estimates from each half, we tested the model by fitting it separately to the two sub-samples. That is a demanding test in the present context, since "success" (i.e., non-rejection of parameter constancy) would imply an ability to track turbulent data from estimates based on a quiescent period. Conversely, rejection would imply a need to revise the specification, though without clarifying precisely how. In the event, the sub-sample estimates of $\sigma$ (in models without interest rates) were 0.80% and 0.90% respectively, against the whole-period figure of 0.95%. An F-test of constancy across sub-samples yielded $\eta(10,73)$ = 3.1, thus rejecting the null. Most of the estimates were in fact fairly similar between the sub-periods and the whole sample, but those for $(u-y)_{t-1}$ changed sign, as did that for $(cc-p)_{t-4}$ in the first sub-period. Otherwise, the second-period estimates were similar to those for the whole period, with rather more rapid adjustment, consistent with the need for builders to respond more rapidly to large disequilibria and/or substantial changes.
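The constancy test across sub-samples can be approximately reconstructed from the reported residual standard deviations. The sketch below rests on several assumptions that are not stated in the text: that the split gives 48 and 46 observations, that the model without interest rates has 11 regressors over the whole sample, and that the dummy D enters only the second sub-sample (so the sub-sample fits use 10 and 11 regressors). These choices were made so that the degrees of freedom match the reported (10, 73).

```python
def constancy_f(sigma_whole, df_whole, sub_samples):
    """F statistic comparing one whole-sample fit with separate sub-sample fits.

    sub_samples is a list of (sigma, degrees_of_freedom) pairs.
    """
    rss_whole = df_whole * sigma_whole ** 2
    rss_sub = sum(df * sigma ** 2 for sigma, df in sub_samples)
    df_sub = sum(df for _, df in sub_samples)
    restrictions = df_whole - df_sub
    f_stat = ((rss_whole - rss_sub) / restrictions) / (rss_sub / df_sub)
    return f_stat, restrictions, df_sub


# Reported standard deviations (percent); degrees of freedom are assumptions (see above).
f_stat, q, df = constancy_f(
    sigma_whole=0.95, df_whole=94 - 11,
    sub_samples=[(0.80, 48 - 10), (0.90, 46 - 11)],
)
print(round(f_stat, 2), q, df)
```

Under these assumptions the statistic is close to the reported value of 3.1, which suggests the reconstruction is at least consistent with the figures in the text.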

The last stage of the analysis is to examine the evidence for market disequilibria using the short series on unsold completions ($us_t$) for 1967-1978. As before, we first record the unrestricted log-linear representation in which $us_t$ is explained by up to two lags of y, r, (pn-p), and (ph-pn) (see Table 4). There, $\Sigma$ denotes the sum over j of coefficient estimates for a given variable, from which a long-run solution analogous to equation (4.6) above can be derived.

Table 4: An Unrestricted Model for $us_t$ $^{a}$

Variable              j = 0            1               2               $\Sigma$
$us_{t-j}$            -1 (-)           0.74 (0.21)     -0.15 (0.16)    -0.41
$y_{t-j}$             -0.71 (1.12)     -1.98 (0.93)    -0.30 (1.23)    -2.99
$(pn-p)_{t-j}$        1.10 (1.65)      1.50 (2.52)     -0.90 (1.42)    1.70
$(ph-pn)_{t-j}$       -1.81 (1.36)     0.09 (1.70)     -0.17 (1.92)    -1.89
$Q_{j+1,t}$           -0.13 (0.08)     -0.14 (0.06)    -0.07 (0.08)    -
Constant              32.4 (10.8)      -               -               -

$^{a}$ T = 45, $R^2$ = 0.988, $\hat\sigma$ = 9.0%, $\eta_1(6,21)$ = 2.01, $\eta_3(6,21)$ = 2.23, $\xi_6(1)$ = 2.3.

The mean and standard deviation of $us_t$ and $\Delta_1 us_t$ are (2.1, 63%) and (-1%, 21%) respectively; so, even if $\Delta_1 us_t$ were the dependent variable in Table 4, $R^2$ would still exceed 80% despite the large value of $\hat\sigma$. All of the effects of r, y, (pn-p), and (ph-pn) are sensibly signed, the last of these yielding a static relative price elasticity in excess of 4.5 (though some care is required in interpreting these magnitudes, as $us_t$ is conditioned on $pn_t$).
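Two of the figures quoted for Table 4 can be checked with simple arithmetic. The sketch below derives a long-run coefficient from the summed lag coefficients and approximates the $R^2$ that would result if the change in $us_t$ were the dependent variable. The second calculation ignores degrees-of-freedom corrections, which is an assumption made here for simplicity; the function names are illustrative.

```python
def long_run_coefficient(sum_regressor, sum_own):
    """Static-solution coefficient: solve 0 = sum_own * us + sum_regressor * x for us."""
    return -sum_regressor / sum_own


def r_squared_for_changes(sigma_hat, sd_of_changes):
    """Approximate R^2 when the regressand is the change (no df correction)."""
    return 1.0 - (sigma_hat / sd_of_changes) ** 2


# Sums over j as reported in Table 4 (the own-lag sum includes the -1 normalisation).
SUM_OWN = -0.41
SUM_RELATIVE_PRICE = -1.89   # (ph-pn)

elasticity = long_run_coefficient(SUM_RELATIVE_PRICE, SUM_OWN)
r2_changes = r_squared_for_changes(sigma_hat=9.0, sd_of_changes=21.0)
print(round(elasticity, 2), round(r2_changes, 2))
```

The magnitude of the long-run relative price coefficient exceeds 4.5, and the approximate $R^2$ for changes is a little above 80%, in line with the statements in the text.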

A simplified representation of the model in Table 4 is given below:

- : (4.8) A,us, = 0.51lA,r, + 1.36 A, (pn-p) 4 + 0.93 r 27 3.07 Yee

- 0.09 Q,. - 0.15 Q,. - 0.08 Q r.04) ‘* roa 2® = ¢.057 7¢

T = 45, $R^2$ = 0.87, $\hat\sigma$ = 8.6%, $\eta_1(6,28)$ = 1.6, $\eta_2(7,27)$ = 0.6, $\eta_3(6,28)$ = 1.6, $\xi_6(1)$ = 1.9, $\xi_7(2)$ = 0.3.

All of the test criteria are acceptable, but, although both $\eta_1(\cdot)$ and $\eta_3(\cdot)$ are smaller than in Table 4, neither is greatly favourable to the model. Unsold completions seem to be extremely sensitive to all of the demand factors and to adjust fairly rapidly, but still to reveal that disequilibria persist for around three quarters to a year on average.

5. Conclusion

A complete statistical analysis of the model-building procedures applied above is bound to indicate that the finally selected model is subject to considerable uncertainty in its specification. Most Monte Carlo studies hold the model specification fixed as the sample size varies and still yield large uncertainty regions. Moreover, in dynamic equations, nominal and actual test sizes can depart radically for relevant sample sizes, and test powers often are unimpressive. When the equation formulation is itself data-based, models can have but a tentative status. In Monte Carlo terms, large variability seems likely to arise from the selection process.

Some protection against "spurious" estimates is provided by having a pre-defined maintained hypothesis embodying subject-matter knowledge; Sections 2 and 3 above discussed the principles underlying the model of Table 2. However, the reparameterised equation (4.4) which summarises the salient features of that table reflects a larger element of judgment; the reported coefficient standard errors are much smaller in (4.4) than in Table 2, primarily because of the reduced collinearity, but also partly because the imposed restrictions are acceptable through being apparent in the unrestricted model. Different samples and/or investigators could produce simplifications different from (4.4). The alternative of not basing the selection of the model upon the data is even less appealing.

Partial protection is offered by testing the selected model for its ability to encompass rival hypotheses as well as by checking parameter constancy on later data. Models which are encompassing and remain constant over time are useful tools for later applications. The present equation for new house prices fits substantially better than any pre-existing models of house prices (e.g., see the estimates discussed in Nellis and Longbottom [1981]). However, it is an explanation conditional on contemporaneous second-hand house prices $Ph_t$: marginalising with respect to $Ph_t$ would increase the residual standard deviation from $\hat\sigma$ to $\sigma = (\hat\sigma^2 + \hat\lambda^2 \hat\omega^2)^{1/2}$, where $\hat\lambda$ is the estimated coefficient of $ph_t$ in (4.4) (i.e., 0.27) and $\hat\omega^2$ is the estimated error variance of the model for $ph_t$ (e.g., $\hat\omega \approx 1.43\%$ in Hendry [1984, eq. (18)]). Thus, $\sigma \approx 1.02\%$ is implied, which remains smaller than in existing models which do not include $Ph_t$ in modelling $Pn_t$. Nevertheless, we intend to conduct direct tests between the various marginal models of pn in due course, and hope to account for the sub-sample variation in $\hat\sigma$.
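The implied standard deviation after marginalising with respect to second-hand house prices can be checked directly from the quoted figures: a conditional residual standard deviation of 0.944%, a coefficient of 0.27 on second-hand prices, and a standard deviation of 1.43% for the second-hand price equation. The sketch below simply evaluates the square root of the sum of the conditional variance and the squared coefficient times the marginal variance; the function and argument names are chosen here for illustration.

```python
import math


def marginal_sd(conditional_sd, coefficient, conditioning_sd):
    """Standard deviation after marginalising out a conditioning variable."""
    return math.sqrt(conditional_sd ** 2 + (coefficient * conditioning_sd) ** 2)


# Figures as quoted in the text (standard deviations in percent).
implied = marginal_sd(conditional_sd=0.944, coefficient=0.27, conditioning_sd=1.43)
print(round(implied, 2))
```

The result is about 1.02%, matching the figure given in the text.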

A mixed outcome, in which some developments have been implemented while others remain to be carried out, is fairly typical of empirical econometric research. Viewed as part of a research strategy in which anomalies point towards new research areas, remaining problems become a future stimulus rather than a major drawback. They are also a caution to the limitations of a model rather than a definitive rejection, and a sign that stringent evaluation criteria are being demanded rather than that econometric modelling is not worth undertaking.

Appendix A. A Simple Theoretical Model of the Market for New Housing

Consider a builder constructing dwellings subject to a Cobb-Douglas production function of the form

(A.1) $C = K_0 M^{\alpha} L^{\beta}$, $\alpha + \beta < 1$,

where $\alpha$, $\beta$, and $K_0$ are positive constants; $M$ denotes direct inputs (workers and materials, with a price index per unit dwelling given by $CC$); and $L$ denotes land (with a price per plot of $P\ell$). Ignoring any fixed costs, profits are given by

(A.2) $\pi = Pn \cdot C - CC \cdot M - P\ell \cdot L$,

and the objective is to maximise $\pi$ subject to (A.1), where $CC$, $P\ell$, and $Ph$ are taken as given (i.e., the builder is small relative to the whole market). The assumption that $\alpha + \beta < 1$ is made to rule out increasing returns as the scale of production grows. In practice, there are elements of local monopolistic power, so the demand function facing the builder is postulated to be

(A.3) $C = K_1 (Pn/Ph)^{\gamma}$, $\gamma < 0$,

where $K_1$ is a positive constant.

The algebra of maximising $\pi$ subject to (A.1) and (A.3) is tedious but well-known (for an excellent introduction, see Smith [1982]) and yields the solution that $pn$ is a weighted average of $ph$, $cc$, and $p\ell$, with the individual weights being dependent on $\alpha$, $\beta$, and $\gamma$, and where (e.g.) doubling all nominal prices would leave decisions about quantities unaltered. In the important special case that $\alpha + \beta = 1$ (so constant returns prevail), $pn$ only depends on $cc$ and $p\ell$ with weights $\alpha$ and $1-\alpha$. However, $Ph$ would then proxy $P\ell$, since any increase in land prices would be reflected in increased prices for existing housing. Thus, an alternative representation would be $pn = \lambda\, cc + (1-\lambda)\, ph$ (+ demand factors), which is closely similar to (3.8). Note that the functional form of the model is log-linear, given the postulated production and demand functions. Also, if (A.3) is “inverted” to express $Pn/Ph$ as a function of $C$, the latter can be eliminated from the schedule for the supply of new housing, as in the main text. Conversely, since $u$ in (3.8) is endogenous to the construction sector, a formulation dependent only on “outside” influences would necessitate substituting in its determinants.
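The omitted algebra can be sketched as follows. This is a reconstruction under the stated Cobb-Douglas technology and constant-elasticity demand, not the authors' own derivation. The symbols $K_0$ and $K_1$ for the two scale constants, the multiplier $\mu$ (marginal cost), the shorthand $\theta = 1 - \alpha - \beta$, and the use of lower-case letters for logarithms are notation assumed here. The sketch also assumes $\gamma < -1$, so that marginal revenue is positive; "const" collects terms that do not involve prices and differs from line to line.

```latex
% Technology and demand, as in (A.1) and (A.3):
%   C = K_0 M^{\alpha} L^{\beta},   C = K_1 (Pn/Ph)^{\gamma}
\begin{align*}
\text{Cost minimisation:}\quad
  & CC = \mu\,\alpha\,\frac{C}{M}, \qquad P\ell = \mu\,\beta\,\frac{C}{L}, \\
\text{so that}\quad
  & \mu^{\alpha+\beta}
    = \frac{C^{\theta}\, CC^{\alpha}\, P\ell^{\beta}}{K_0\,\alpha^{\alpha}\beta^{\beta}},
    \qquad \theta \equiv 1-\alpha-\beta . \\
\text{Marginal revenue:}\quad
  & \frac{d(Pn \cdot C)}{dC} = \Bigl(1+\tfrac{1}{\gamma}\Bigr)\, Pn . \\
\text{Equating the two, in logs:}\quad
  & (\alpha+\beta)\, pn = \theta\, c + \alpha\, cc + \beta\, p\ell + \text{const} . \\
\text{Using } c = k_1 + \gamma\,(pn - ph):\quad
  & pn = \frac{-\theta\gamma\; ph + \alpha\; cc + \beta\; p\ell}
              {(\alpha+\beta) - \theta\gamma} + \text{const} .
\end{align*}
```

The three weights are positive (since $\gamma < 0$ and $\theta > 0$) and sum to one, which gives the homogeneity property noted in the text. Setting $\theta = 0$ removes $ph$ and leaves weights $\alpha$ and $1-\alpha$ on $cc$ and $p\ell$.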

Appendix B. Data Definitions

VARIABLE(a)

DEFINITION

SOURCE(b)

Private sector housing completions (GB)

Index of the cost of new construction (1975 = 100)

Dummy variable for 1981(iii)-1982(iii)

General index of retail prices (1975 = 100)

Index of prices of comparable dwellings (second-hand houses) on which transactions were completed

Average price of new dwellings on which new building-society mortgages were completed (1975 = 100)

Bank of England's minimum lending rate to the market (the base rate from 1981(iii))

Building-society interest rate on new mortgages

Private sector housing starts (GB)

Standard rate of income tax

Uncompleted houses (GB): constructed from the identity $U_t = U_{t-1} + S_t - C_t$, with several benchmark surveys as checks

Stock of unsold completions ("dwellings completed not sold")

Real personal disposable income (1975 prices)

E.T., M.D.S.

H.C.S., M.D.S.

See text

E.T.

D.O.E., B.S.A.

E.T., B.S.A.

E.T., M.D.S.

B.S.A., F.S.

E.T., M.D.S.

A.A.S.

H.C.S.

E.T., M.D.S.


(a) All variables are quarterly, seasonally unadjusted.

(b) A.A.S., Annual Abstract of Statistics, H.M.S.O.; B.S.A., Building Societies Association Bulletins and Compendium of Statistics; D.O.E., Department of the Environment; E.T., Economic Trends, Annual Supplements, 1980-1985, H.M.S.O.; F.S., Financial Statistics, H.M.S.O.; H.C.S., Housing and Construction Statistics, H.M.S.O.; M.D.S., Monthly Digest of Statistics, H.M.S.O.
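The series of uncompleted houses is built from a stock-flow identity: the stock at the end of a quarter equals the previous stock plus starts minus completions. A minimal sketch of that recursion follows. The function name and the numbers are made up for illustration, and the benchmark-survey checks mentioned in the definition are not reproduced.

```python
def uncompleted_stock(u_initial, starts, completions):
    """Accumulate U_t = U_{t-1} + S_t - C_t from an initial stock.

    Returns the list [U_1, ..., U_T]; u_initial plays the role of U_0.
    """
    if len(starts) != len(completions):
        raise ValueError("starts and completions must have equal length")
    stock = []
    u_prev = u_initial
    for s_t, c_t in zip(starts, completions):
        u_prev = u_prev + s_t - c_t
        stock.append(u_prev)
    return stock


# Hypothetical quarterly figures (thousands of dwellings), not from the paper.
starts = [40, 45, 38, 30]
completions = [35, 42, 41, 36]
u = uncompleted_stock(200, starts, completions)
print(u)
```

Because the identity only accumulates flows, any measurement error in starts or completions persists in the constructed stock, which is presumably why independent benchmark surveys are useful as checks.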

Footnotes

1. Note that few agents buy a weighted average of new and second-hand housing, hence our desire to model both variables rather than an overall "house price index". In any case, the relative price of new to existing dwellings is crucial to the construction industry.

2. Typical models of completions have them determined as a distributed lag on starts. Even if the weights on the lags sum to unity, any (e.g.) positive stochastic disturbance to the equation implies houses completed which were not started.

3. Formally, variance dominance refers to the underlying (and unknown) error variances. In practice, we often say a model variance-dominates another if the estimated residual variance of the former is smaller than that of the latter.

4. Note that we have abstracted from weather effects, etc.

5. Families already housed may nevertheless wish to switch to a new house.

6. In practice, it would be desirable to allow for changes in their attributes and composition.

7. In particular, being a derived relationship, its parameters need not be very constant.

8. Appendix A sketches an alternative theoretical derivation for a profit-maximising builder which yields a solution similar to that in (3.8).

9. Defining the lag operator $L$ as $L x_t = x_{t-1}$, then we let $\Delta x_t = (1-L) x_t$. More generally, $\Delta_j^i x_t = (1-L^j)^i x_t$. If $i$ and/or $j$ is undefined, it is taken to be unity.

10. For comparison with Tables 1 and 2 and equation (4.4), Ericsson [1978] obtains $\hat{\sigma} = 1.14\%$ (1958(i)-1974(iv)), and Hendry [1980, p. 31], $\hat{\sigma} = 0.93\%$ (1958(iv)-1976(iii)).

11. It should be noted that $\eta_3(\cdot)$ will usually reflect changes in $\sigma$ as well as in the regression coefficients.

12. These last are friendly (non-profit-making) societies whose primary function is to act as financial intermediaries between savers and potential house owners seeking mortgages. For econometric analyses of their behaviour, see Hendry and Anderson [1977] and Anderson and Hendry [1984].

13. The last observation lies outside our data period and hence entails a testable prediction of the model.

14. Computer-program limitations precluded additional lags or the inclusion of further variables, so $Ro_t$ and its lags were omitted.

15. To be comparable with (4.5), $p$ is included and the interest rate is after tax. The substantial residual autocorrelation in (4.7) precludes calculation of sensible estimates of standard errors.
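The differencing notation defined in the footnotes, in which the operator (1 - L^j) is applied i times with both i and j defaulting to unity, can be illustrated with a short sketch. The function name and the sample series are hypothetical; the code simply applies a j-period difference i times and drops the observations lost at the start of the sample.

```python
def difference(x, i=1, j=1):
    """Apply (1 - L**j)**i to the sequence x.

    Each pass replaces x_t by x_t - x_{t-j}, so the result is i * j
    observations shorter than the input.
    """
    if i < 0 or j < 1:
        raise ValueError("i must be non-negative and j must be at least 1")
    out = list(x)
    for _ in range(i):
        out = [out[t] - out[t - j] for t in range(j, len(out))]
    return out


x = [1, 4, 9, 16, 25, 36]           # illustrative series (squares)
first_diff = difference(x)           # i = j = 1: ordinary first difference
second_diff = difference(x, i=2)     # first difference applied twice
four_period = difference(x, j=4)     # x_t - x_{t-4}, as for quarterly data

print(first_diff, second_diff, four_period)
```

With quarterly, seasonally unadjusted data, the four-period difference compares each quarter with the same quarter a year earlier.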

Bibliography

Anderson, G.J. and D.F. Hendry (1984) "An econometric model of United Kingdom building societies", Oxford Bulletin of Economics and Statistics, 46, 3, 185-210.

Atkinson, A.C. (1970) "A method for discriminating between models", Journal of the Royal Statistical Society, Series B, 32, 3, 323-353 (with discussion).

Barndorff-Nielsen, O. (1978) Information and Exponential Families in Statistical Theory. New York: John Wiley and Sons.

Bean, C.R. (1981) "An econometric model of manufacturing investment in the UK", Economic Journal, 91, 361, 106-121.

Box, G.E.P. and G.M. Jenkins (1976) Time Series Analysis: Forecasting and Control (revised edition). San Francisco: Holden-Day.

Box, G.E.P. and D.A. Pierce (1970) "Distribution of residual autocorrelations in autoregressive-integrated moving average time series models", Journal of the American Statistical Association, 65, 332, 1509-1526.

Chow, G.C. (1960) "Tests of equality between sets of coefficients in two linear regressions", Econometrica, 28, 3, 591-605.

Coen, P.J., E.D. Gomme, and M.G. Kendall (1969) "Lagged relationships in economic forecasting", Journal of the Royal Statistical Society, Series A, 132, 2, 133-163 (with discussion).

Cox, D.R. (1961) "Tests of separate families of hypotheses", in J. Neyman (ed.) Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, volume 1. Berkeley: University of California Press, 105-123.

Cox, D.R. (1962) "Further results on tests of separate families of hypotheses", Journal of the Royal Statistical Society, Series B, 24, 2, 406-424.

Dastoor, N.K. (1983) "Some aspects of testing non-nested hypotheses", Journal of Econometrics, 21, 2, 213-228.

Davidson, J.E.H., D.F. Hendry, F. Srba, and S. Yeo (1978) "Econometric modelling of the aggregate time-series relationship between consumers’ expenditure and income in the United Kingdom", Economic Journal, 88, 352, 661-692.

Davidson, J.E.H. and D.F. Hendry (1981) "Interpreting econometric evidence: the behaviour of consumers’ expenditure in the UK", European Economic Review, 16, 177-192.

Domowitz, I. and H. White (1982) "Misspecified models with dependent observations", Journal of Econometrics, 20, 1, 35-58.

Engle, R.F. (1982) "Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation", Econometrica, 50, 4, 987-1007.

Engle, R.F., D.F. Hendry, and J.-F. Richard (1983) "Exogeneity", Econometrica, 51, 2, 277-304.

Ericsson, N.R. (1978) Modelling the Market for Owner-Occupied Housing in the United Kingdom: An Exercise in Econometric Analysis. M.Sc. thesis, London School of Economics.

Ericsson, N.R. and D.F. Hendry (1985) "Conditional econometric modeling: an application to new house prices in the United Kingdom", Chapter 11 in A.C. Atkinson and S.E. Fienberg (eds.) A Celebration of Statistics: The ISI Centenary Volume. New York: Springer-Verlag, 251-285.

Florens, J.-P. and M. Mouchart (1980) "Initial and sequential reduction of Bayesian experiments", CORE discussion paper 8015, Université Catholique de Louvain, Louvain-la-Neuve, Belgium.

Florens, J.-P. and M. Mouchart (1985) "Conditioning in dynamic models", Journal of Time Series Analysis, 6, 1, 15-34.

Godfrey, L.G. (1978) "Testing against general autoregressive and moving average error models when the regressors include lagged dependent variables", Econometrica, 46, 6, 1293-1301.

Granger, C.W.J. (1983) "Forecasting white noise" in A. Zellner (ed.), Applied Time Series Analysis of Economic Data. Washington: U.S. Bureau of the Census, 308-326 (with discussion).

Granger, C.W.J. and P. Newbold (1977) "The time series approach to econometric model building" in C.A. Sims (ed.) New Methods in Business Cycle Research. Minneapolis: Federal Reserve Bank of Minneapolis,

Granger, C.W.J. and A.A. Weiss (1983) "Time series analysis of error-correction models" in S. Karlin, T. Amemiya, and L.A. Goodman (eds.) Studies in Econometrics, Time Series, and Multivariate Statistics. New York: Academic Press, 255-278.

Haavelmo, T. (1944) "The probability approach in econometrics", Econometrica, 12, supplement, i-viii, 1-118.

Harvey, A.C. (1981) The Econometric Analysis of Time Series. Oxford: Philip Allan.

Hendry, D.F. (1980) An econometric model of the UK housing market. London: Economists Advisory Group.

Hendry, D.F. (1984) "Econometric modelling of house prices in the United Kingdom", Chapter 8 in D.F. Hendry and K.F. Wallis (eds.) Econometrics and Quantitative Economics. Oxford: Basil Blackwell, 211-252.

Hendry, D.F. and G.J. Anderson (1977) "Testing dynamic specification in small simultaneous systems: An application to a model of building society behavior in the United Kingdom", Chapter 8c in M.D. Intriligator (ed.) Frontiers in Quantitative Economics, volume IIIA. Amsterdam: North-Holland Publishing Co., 361-383.

Hendry, D.F., A.R. Pagan, and J.D. Sargan (1984) "Dynamic specification", Chapter 18 in Z. Griliches and M.D. Intriligator (eds.), Handbook of Econometrics, volume II. Amsterdam: North-Holland Publishing Co., 1023-1100.

Hendry, D.F. and J.-F. Richard (1982) "On the formulation of empirical models in dynamic econometrics", Journal of Econometrics, 20, 1, 3-33.

Hendry, D.F. and J.-F. Richard (1983) "The econometric analysis of economic time series", International Statistical Review, 51, 2, 111-163 (with discussion).

Jarque, C.M. and A.K. Bera (1980) "Efficient tests for normality, homoscedasticity and serial independence of regression residuals", Economics Letters, 6, 3, 255-259.

Judge, G.G. and M.E. Bock (1978) The Statistical Implications of Pre-test and Stein-Rule Estimators in Econometrics. Amsterdam: North-Holland Publishing Co.

Kiviet, J.F. (1981) "On the rigour of some specification tests for modelling dynamic relationships", paper presented at the Amsterdam Conference of the Econometric Society.

Kiviet, J.F. (1982) "Size, power and interdependence of tests in sequential procedures for modelling dynamic relationships", Discussion paper, University of Amsterdam.

Kmenta, J. and J.B. Ramsey (eds.) (1981) Large-scale Macro-econometric Models. Amsterdam: North-Holland Publishing Co.

Koopmans, T.C. (1950) "When is an equation system complete for statistical purposes?", Chapter 17 in T.C. Koopmans (ed.) Statistical Inference in Dynamic Economic Models, Cowles Commission Monograph 10. New York: John Wiley and Sons, 393-409.

Lakatos, I. (1970) "Falsification and the methodology of scientific research programmes" in I. Lakatos and A. Musgrave (eds.), Criticism and the Growth of Knowledge. Cambridge: Cambridge University Press,

MacKinnon, J.G. (1983) "Model specification tests against non-nested alternatives", Econometric Reviews, 2, 1, 85-158 (with discussion).

Messer, K. and H. White (1984) "A note on computing the heteroskedasticity consistent covariance matrix using instrumental variable techniques", Oxford Bulletin of Economics and Statistics, 46, 2, 181-184.

Mizon, G.E. (1984) "The encompassing approach in econometrics", Chapter 6 in D.F. Hendry and K.F. Wallis (eds.), Econometrics and Quantitative Economics. Oxford: Basil Blackwell, 135-172.

Mizon, G.E. and J.-F. Richard (1983) "The encompassing principle and its application to testing non-nested hypotheses", CORE discussion paper 8330, Université Catholique de Louvain, Louvain-la-Neuve, Belgium, forthcoming in Econometrica.

Naylor, Th.H., T.G. Seaks, and D.W. Wichern (1972) "Box-Jenkins methods: An alternative to econometric models", International Statistical Review, 40, 2, 123-137.

Nellis, J.G. and J.A. Longbottom (1981) "An empirical analysis of the determination of house prices in the United Kingdom", Urban Studies, 18, 9-21.

Nelson, C.R. (1972) "The prediction performance of the FRB-MIT-PENN model of the U.S. economy", American Economic Review, 62, 5, 902-917.

Pesaran, M.H. (1974) "On the general problem of model selection", Review of Economic Studies, 41, 2, 153-171.

Prothero, D.L. and K.F. Wallis (1976) "Modelling macroeconomic time series", Journal of the Royal Statistical Society, Series A, 139, 4, 468-500 (with discussion).

Rao, C.R. (1948) "Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation", Proceedings of the Cambridge Philosophical Society: Mathematical and Physical Sciences, 44, 50-57.

Schumpeter, J. (1933) "The common sense of econometrics", Econometrica, 1, 1, 5-12.

Schwarz, G. (1978) "Estimating the dimension of a model", Annals of Statistics, 6, 2, 461-464.

Smith, A. (1982) A Mathematical Introduction to Economics. Oxford: Basil Blackwell.

Theil, H. (1964) Optimal Decision Rules for Government and Industry. Amsterdam: North-Holland Publishing Co.

Trivedi, P.K. (1984) "Uncertain prior information and distributed lag analysis", Chapter 7 in D.F. Hendry and K.F. Wallis (eds.), Econometrics and Quantitative Economics. Oxford: Basil Blackwell, 173-210.

White, H. (1980) "A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity", Econometrica, 48, 4, 817-838.

Zellner, A. and F. Palm (1974) "Time series analysis and simultaneous equation econometric models", Journal of Econometrics, 2, 1, 17-54.
