feds · August 31, 2007

Averaging Forecasts from VARs with Uncertain Instabilities

Abstract

A body of recent work suggests commonly-used VAR models of output, inflation, and interest rates may be prone to instabilities. In the face of such instabilities, a variety of estimation or forecasting methods might be used to improve the accuracy of forecasts from a VAR. These methods include using different approaches to lag selection, different observation windows for estimation, (over-) differencing, intercept correction, stochastically time-varying parameters, break dating, discounted least squares, Bayesian shrinkage, and detrending of inflation and interest rates. Although each individual method could be useful, the uncertainty inherent in any single representation of instability could mean that combining forecasts from the entire range of VAR estimates will further improve forecast accuracy. Focusing on models of U.S. output, prices, and interest rates, this paper examines the effectiveness of combination in improving VAR forecasts made with real-time data. The combinations include simple averages, medians, trimmed means, and a number of weighted combinations, based on: Bates-Granger regressions, factor model estimates, regressions involving forecast quartiles, Bayesian model averaging, and predictive least squares-based weighting. Our goal is to identify those approaches that, in real time, yield the most accurate forecasts of these variables. We use forecasts from simple univariate time series models as benchmarks.

Finance and Economics Discussion Series Divisions of Research & Statistics and Monetary Affairs Federal Reserve Board, Washington, D.C. Averaging Forecasts from VARs with Uncertain Instabilities Todd E. Clark and Michael W. McCracken 2007-42 NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Todd E. Clark, Federal Reserve Bank of Kansas City
Michael W. McCracken, Board of Governors of the Federal Reserve System

October 2006

JEL Nos.: C53, E37, C32
Keywords: Forecast combination, real-time data, prediction, structural change

Clark (corresponding author): Economic Research Dept.; Federal Reserve Bank of Kansas City; 925 Grand; Kansas City, MO 64198; todd.e.clark@kc.frb.org. McCracken: Board of Governors of the Federal Reserve System; 20th and Constitution N.W.; Mail Stop #61; Washington, D.C. 20551; michael.w.mccracken@frb.gov. This paper was written for a Reserve Bank of New Zealand conference, Macroeconometrics and Model Uncertainty, held in June 2006. We gratefully acknowledge helpful conversations with Simon Potter and Shaun Vahey and helpful comments from Christie Smith and other conference participants. The views expressed herein are solely those of the authors and do not necessarily reflect the views of the Federal Reserve Bank of Kansas City, Board of Governors, Federal Reserve System, or any of its staff.

1 Introduction

Small-scale VARs are now widely used in macroeconomics and central bank forecasting. Examples of VARs used to forecast output, prices, and interest rates include Sims (1980), Doan et al. (1984), Litterman (1986), Brayton et al. (1997), Jacobson et al. (2001), Robertson and Tallman (2001), Del Negro and Schorfheide (2004), and Favero and Marcellino (2005). However, there is an increasing body of evidence suggesting that these VARs may be prone to instabilities.[1] Examples include Webb (1995), Kozicki and Tinsley (2001b, 2002), Cogley and Sargent (2001, 2005), Boivin (2006), and Beyer and Farmer (2006). Although many different structural forces could lead to instabilities in macroeconomic VARs (e.g., Rogoff (2003) and others have suggested that globalization has altered inflation dynamics), much of the aforementioned literature has focused on shifts potentially attributable to changes in the behavior of monetary policy.

[1] Admittedly, while the evidence of instabilities in the relationships incorporated in small macroeconomic VARs seems to be growing, the evidence is not necessarily conclusive. Rudebusch and Svensson (1999) apply stability tests to the full set of coefficients of an inflation-output gap model and are unable to reject stability. Rudebusch (2005) finds that historical shifts in the behavior of monetary policy haven't been enough to make reduced form macro VARs unstable. Estrella and Fuhrer (2003) find little evidence of instability in joint tests of a Phillips curve relating inflation to the output gap and an IS model of output. Similarly, detailed test results reported in Stock and Watson (2003) show inflation-output gap models to be largely stable.

Accordingly, in previous work (Clark and McCracken, 2006a) we considered the performance of various methods for improving the forecast accuracy of VARs in the presence of structural change. For trivariate VARs in a range of measures of output, inflation, and a short-term interest rate, these methods include: sequentially updating lag orders, using various observation windows for estimation, working in differences rather than levels, making intercept corrections (as in Clements and Hendry (1996)), allowing stochastic time variation in model parameters, allowing discrete breaks in parameters, discounted least squares estimation, Bayesian shrinkage, and detrending of inflation and interest rates. While some of these methods performed well at various times, various forecast horizons, and for some variables, simple averages (across the various methods just described) were consistently among the best performers. One interpretation of this result is that it is crucial to have an understanding of the form of instability when constructing good forecasts. Another, and the one we prefer, is that in practice it is very difficult to know the form of structural instability, and model averaging provides an effective method for forecasting in the face of such uncertainty.

As summarized by Timmermann (2006), competing models will differ in their sensitivity to structural breaks. Depending on the size and nature of structural breaks, models that quickly pick up changes in parameters may or may not be more accurate than models that do not. For instance, in the case of a small, recent break, a model with constant parameters may forecast more accurately than a model that allows a break in coefficients, due to the additional noise introduced by the estimation of post-break coefficients (see, for example, Clark and McCracken (2005b) and Pesaran and Timmermann (2006)). However, in the case of a large break well in the past, a model that correctly picks up the associated change in coefficients will likely forecast more accurately than models with constant or slowly changing parameters. Accordingly, Timmermann (2006) and Pesaran and Timmermann (2006) suggest that combinations of forecasts from models with varying degrees of adaptability to uncertain (especially in real time) structural breaks will be more accurate than forecasts from individual models.

In this paper we provide empirical evidence on the ability of various forms of forecast averaging to improve the real-time forecast accuracy of small-scale macroeconomic VARs in the presence of uncertain forms of model instabilities. Focusing on six distinct trivariate models incorporating different measures of output and inflation and a common interest rate measure, we consider a wide range of approaches to averaging forecasts obtained with a variety of the aforementioned primitive methods for managing model instability. The average forecasts include: equally weighted averages with and without trimming, medians, common factor-based forecasts, Bates-Granger combinations estimated with ridge regression, MSE-weighted averages, lowest MSE forecasts (predictive least squares forecasts), Bayesian model averages, and combinations based on quartile average forecasts (as suggested by Aiolfi and Timmermann (2006)).
For each of these forms of forecast or model averaging we construct real time forecasts of each variable using real-time data. We compare our results to those from simple baseline univariate models and selected baseline VAR models. Our results indicate that while some of the primitive forms of managing structural instability sometimes provide the largest gains in terms of forecast accuracy (notably those models with some form of Bayesian shrinkage), model averaging is a more consistent method for improving forecast accuracy. Not surprisingly, the best type of averaging often varies with the variable being forecast, but several patterns do emerge. After aggregating across all models, horizons, and variables being forecast, it is clear that the simplest forms of model averaging (such as those that use equal weights across all models or those that average a univariate model with a particular VAR, such as a VAR(4) using detrended inflation and interest rates) consistently perform among the best methods. At the other extreme, forecasts based on OLS-type combination and factor model-based combination rank among the worst.

The remainder of the paper proceeds as follows. Section 2 describes the real-time data and samples. Section 3 provides a synopsis of the forms of model averaging used to forecast in the presence of uncertain forms of structural change. Section 4 presents our results on forecast accuracy, including root mean square errors of the methods used. Section 5 concludes.

2 Data

We consider the real-time forecast performance of models with three different measures of output (y), two measures of inflation (π), and a short-term interest rate (i). The output measures are GDP or GNP (depending on data vintage) growth, an output gap computed with the method described in Hallman, Porter, and Small (1991), and an output gap estimated with the Hodrick and Prescott (1997) filter. The first output gap measure (hereafter, the HPS gap), based on a method the Federal Reserve Board once used to estimate potential output for the nonfarm business sector, is entirely one-sided but turns out to be highly correlated with an output gap based on the Congressional Budget Office's (CBO's) estimate of potential output. The HP filter of course has the advantage of being widely used and easy to implement. We follow Orphanides and van Norden (2005) in our real time application of the filter: for forecasting starting in period t, the gap is computed using the conventional filter and data available through period t-1. The inflation measures include the GDP or GNP deflator or price index (depending on data vintage) and the CPI price index. The short-term interest rate is measured with the 3-month Treasury bill rate; using the federal funds rate yields qualitatively similar results.
Note, finally, that growth and inflation rates are measured as annualized log changes (from t-1 to t). Output gaps are measured in percentages (100 times the log of output relative to trend). Interest rates are expressed in annualized percentage points.

The raw quarterly data on output, prices, and interest rates are taken from the Federal Reserve Bank of Philadelphia's Real-Time Data Set for Macroeconomists (RTDSM), the Board of Governors' FAME database, and the website of the Bureau of Labor Statistics (BLS). Real-time data on GDP or GNP and the GDP or GNP price series are from the RTDSM. For simplicity, hereafter we simply use the notation "GDP" and "GDP price index" to refer to the output and price series, even though the measures are based on GNP and a fixed weight deflator for much of the sample. In the case of the CPI and the interest rates, for which real time revisions are small to essentially non-existent, we simply abstract from real time aspects of the data. For the CPI, we follow the advice of Kozicki and Hoffman (2004) for avoiding choppiness in inflation rates for the 1960s and 1970s due to changes in index bases, and use a 1967 base year series taken from the BLS website in late 2005.[2] For the T-bill rate, we use a series obtained from FAME.

The full forecast evaluation period runs from 1970:Q1 through 2005; as detailed in section 3, forecasts from 1965:Q4 through 1969:Q4 are used as initial values in the combination forecasts that require historical forecasts. Accordingly, we use real time data vintages from 1965:Q4 through 2005:Q4. As described in Croushore and Stark (2001), the vintages of the RTDSM are dated to reflect the information available around the middle of each quarter. Normally, in a given vintage t, the available NIPA data run through period t-1.[3] The start dates of the raw data available in each vintage vary over time, ranging from 1947:Q1 to 1959:Q3, reflecting changes in the published samples of the historical data. For each forecast origin t in 1965:Q4 through 2005:Q3, we use the real time data vintage t to estimate output gaps, estimate the forecast models, and then construct forecasts for periods t and beyond. The starting point of the model estimation sample is the maximum of (i) 1955:Q1 and (ii) the earliest quarter in which all of the data included in a given model are available, plus five quarters to allow for four lags and differencing or detrending.
[2] The BLS only provides the 1967 base year CPI on a not seasonally adjusted basis. We seasonally adjusted the series with the X-11 filter.

[3] In the case of the 1996:Q1 vintage, with which the BEA published a benchmark revision, the data run through 1995:Q3 instead of 1995:Q4.

We present forecast accuracy results for forecast horizons of the current quarter (h = 0Q), the next quarter (h = 1Q), four quarters ahead (h = 1Y), and eight quarters ahead (h = 2Y). In light of the time t-1 information actually incorporated in the VARs used for forecasting at t, the current quarter (t) forecast is really a 1-quarter ahead forecast, while the next quarter (t+1) forecast is really a 2-step ahead forecast. What are referred to as 1-year ahead and 2-year ahead forecasts are really 5- and 9-step ahead forecasts. In keeping with common central bank practice, the 1- and 2-year ahead forecasts for GDP/GNP growth and inflation are four-quarter rates of change (the 1-year ahead forecast is the percent change from period t+1 through t+4; the 2-year ahead forecast is the percent change from period t+5 through t+8). The 1- and 2-year ahead forecasts for output gaps and interest rates are quarterly levels in periods t+4 and t+8, respectively. For computational simplicity in our extensive real-time analysis, all of the multi-step forecasts are obtained by iterating the 1-step ahead models.

As discussed in such sources as Romer and Romer (2000), Sims (2002), and Croushore (2006), evaluating the accuracy of real time forecasts requires a difficult decision on what to take as the actual data in calculating forecast errors. We follow Romer and Romer (2000) and use the second available estimates of GDP/GNP and the GDP/GNP deflator as actuals in evaluating forecast accuracy. In the case of h-step ahead (for h = 0Q, 1Q, 1Y, and 2Y) forecasts made for period t+h with vintage t data ending in period t-1, the second available estimate is normally taken from the vintage t+h+2 data set. In light of our abstraction from real time revisions in CPI inflation and interest rates, for these series the real time data correspond to the final vintage data.

3 Forecast methods

The forecasts of interest in this paper are combinations of forecasts from a wide range of approaches to allowing for structural change in trivariate VARs: sequentially updating lag orders, using various observation windows for estimation, working in differences rather than levels, making intercept corrections (as in Clements and Hendry (1996)), allowing stochastic time variation in model parameters, allowing discrete breaks in parameters identified with break tests, discounted least squares estimation, Bayesian shrinkage, and detrending of inflation and interest rates. Table 1 lists the set of individual VAR forecast methods considered in this paper, along with some detail on forecast construction.
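The timing conventions of section 2 can be made concrete with a small sketch. It is only illustrative: the integer quarter index, the function names, and the log-change approximation to the four-quarter percent change are assumptions introduced here, not part of the paper.

```python
# Illustrative sketch of the forecast timing conventions (names are hypothetical).
# Quarters are stored as integers: index = 4 * year + (quarter - 1).

HORIZON_OFFSETS = {"0Q": 0, "1Q": 1, "1Y": 4, "2Y": 8}  # h, in quarters


def quarter_index(year: int, quarter: int) -> int:
    return 4 * year + (quarter - 1)


def quarter_label(index: int) -> str:
    year, q0 = divmod(index, 4)
    return f"{year}:Q{q0 + 1}"


def forecast_timing(origin: int, horizon: str) -> dict:
    """Timing implied by vintage-t data that end in period t-1."""
    h = HORIZON_OFFSETS[horizon]
    return {
        "last_data_quarter": quarter_label(origin - 1),
        "target_quarter": quarter_label(origin + h),
        "steps_ahead": h + 1,  # e.g. the "1Y" forecast is a 5-step forecast
        "actuals_vintage": quarter_label(origin + h + 2),  # second available estimate
    }


def four_quarter_rate(annualized_quarterly_rates) -> float:
    """Four-quarter rate of change from four annualized quarterly log changes.

    Assumption: percent changes are approximated by log changes, so the
    four-quarter rate is the average of the four annualized quarterly rates.
    """
    rates = list(annualized_quarterly_rates)
    if len(rates) != 4:
        raise ValueError("expected exactly four quarterly rates")
    return sum(rates) / 4.0
```

For a forecast origin of 1970:Q1 and the 1Y horizon, this sketch gives a target quarter of 1971:Q1, a 5-step ahead forecast, and actuals taken from the 1971:Q3 vintage.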
To be precise, for each model (defined as being a baseline VAR in one measure of output (y), one measure of inflation (π), and one short-term interest rate (i)) we apply each of the estimation and forecasting methods listed in Table 1. Note that, although we simply refer to all the underlying forecasts as VAR forecasts, in fact the list of individual models includes a univariate specification for each of output, inflation, and the interest rate. For output the univariate model is an AR(2). In the case of inflation, we follow Stock and Watson (2006) and use an MA(1) process for the change in inflation (∆π), estimated with a rolling window of 40 observations. For simplicity, in light of some general similarities in the time series properties of inflation and short-term interest rates and the IMA(1) rationale for inflation described by Stock and Watson, the univariate model for the short-term interest rate is also specified as an MA(1) in the first difference of the series (∆i).[4]

Table 2 provides a comprehensive list, with some detail, of the approaches we use to combining forecasts from these underlying models. The remainder of this section explains the averaging methods.

3.1 Equally weighted averages

We begin with eight distinct, simple forms of model averaging, in each case using what could loosely be described as equal weights. The first is an equally weighted average of all the VAR forecasts in Table 1, for a given triplet of variables. More specifically, for a given combination of measures of output, inflation, and the interest rate (for example, for the combination GDP growth, GDP inflation, and the T-bill rate), we average forecasts from the 50 VARs listed in Table 1. With an eye towards making this model average robust to individual forecasts that might be considered outliers, we also consider the median forecast and both 10 and 20 percent trimmed means.

We include a fifth average forecast approach motivated by the results of Clark and McCracken (2005b), who show that forecast accuracy can be improved by combining forecasts from models estimated with recursive (all available data) and rolling samples. For a given VAR(4), we form an equally weighted average of the model forecasts constructed using parameters estimated (i) recursively (with all of the available data) and (ii) with a rolling window of the past 60 observations.

Three other averages are motivated by the Clark and McCracken (2005a) finding that combining forecasts from nested models can improve forecast accuracy.
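As a rough illustration of the equal-weight combinations of section 3.1, the sketch below computes the simple average, the median, and trimmed means of a set of forecasts, plus a two-forecast average. The function names are hypothetical, and the trimming rule (dropping the stated fraction of forecasts from each tail) is an assumption; the text here does not say how the 10 and 20 percent trimming is split between tails.

```python
import numpy as np


def trimmed_mean(forecasts, trim_fraction):
    """Mean after dropping a fraction of the forecasts from EACH tail.

    Assumption: symmetric trimming per tail, with the count rounded down.
    """
    f = np.sort(np.asarray(forecasts, dtype=float))
    k = int(trim_fraction * f.size)  # number of forecasts dropped per tail
    if 2 * k >= f.size:
        raise ValueError("trim_fraction removes every forecast")
    return float(f[k:f.size - k].mean())


def simple_combinations(forecasts):
    """Equal-weight style combinations of one period's model forecasts."""
    f = np.asarray(forecasts, dtype=float)
    return {
        "mean": float(f.mean()),
        "median": float(np.median(f)),
        "trim10": trimmed_mean(f, 0.10),
        "trim20": trimmed_mean(f, 0.20),
    }


def pairwise_average(forecast_a, forecast_b):
    """Equally weighted average of two forecasts, e.g. recursive and rolling
    estimates of one VAR, or a univariate forecast and a VAR forecast."""
    return 0.5 * (forecast_a + forecast_b)
```

With one large outlier among the forecasts, the median and trimmed means stay close to the bulk of the forecasts while the simple mean is pulled toward the outlier, which is the robustness motive given in the text.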
We consider an average of the univariate forecast with the VAR(4) forecast, an average of the univariate forecast with the DVAR(4) forecast, and an average of the univariate forecast with a forecast from a VAR(4) in output, detrended inflation, and the detrended interest rate (Table 1 and section 3.7 provide more information on the VAR with detrending).

While these pairwise average forecasts may seem ad hoc from a Bayesian model averaging perspective, our aforementioned results, based on frequentist methods, suggest they may be effective, especially in the face of considerable parameter estimation noise associated with VARs. As an example, suppose that, in truth, output, inflation, and the interest rate can be modeled as a VAR(4). The frequentist results in our prior work (theory, Monte Carlo experiments, and empirics in Clark and McCracken (2005a)) imply that, unless the VAR(4) is estimated with great precision, combining forecasts from the VAR(4) with forecasts from univariate models will likely improve forecast accuracy. Similar arguments suggest averaging a DVAR(4) (or a VAR(4) in detrended data) with univariate forecasts and averaging a rolling estimate of the VAR with forecasts based on recursive estimates. In each case, combination improves forecast accuracy by shrinking the larger model forecast with relatively high sampling error and arguably less bias to a smaller model forecast with less sampling error but greater bias.

[4] After completing the results and analysis presented below, we went back and compared the IMA(1) for the interest rate to various AR alternatives. The IMA(1) generally dominated these alternatives.

3.2 Combinations based on Bates-Granger/ridge regression

We also consider a large number of average forecasts based on historical forecast performance, one such approach being forecast combination based on Bates and Granger (1969) regression. For these methods, we need an initial sample of forecasts preceding the sample to be used in our formal forecast evaluation. With the formal forecast evaluation sample beginning with 1970:Q1, we use an initial sample of forecasts from 1965:Q4 (the starting point of the RTDSM) through 1969:Q4. Therefore, in the case of current quarter forecasts constructed in 1970:Q1, we have an initial sample of 17 forecasts to use in estimating combination regressions, forming MSE weights, etc. Note also that these performance-based combinations are based on real time forecast accuracy. That is, in period t, in deciding how best to combine forecasts based on historical performance, we use the historical real time forecasts compared to real time data in determining the combinations.
To obtain combinations based on the Bates-Granger approach, for each of output, inflation, and the interest rate we use the actual data that would have been available to a forecaster in real time to estimate a generalized ridge regression of the actual data on the 50 VAR forecasts, shrinking the coefficients toward equal weights. Our implementation follows that of Stock and Watson (1999): letting $Z_{t+h|t}$ denote the vector of 50 forecasts of variable $z_{t+h}$ made in period t and $\beta^{equal}$ denote a 50 × 1 vector filled with 1/50, the combination coefficient vector estimate is

$$\hat{\beta} = \Big( c I_{50} + \sum_t Z_{t+h|t} Z_{t+h|t}' \Big)^{-1} \Big( c \beta^{equal} + \sum_t Z_{t+h|t} z_{t+h} \Big), \qquad (1)$$

where $c = k \times \mathrm{trace}\big( 50^{-1} \sum_t Z_{t+h|t} Z_{t+h|t}' \big)$. We consider three different forecasts, based on different values of the shrinkage coefficient k: .001, .25, and 1. A smaller (larger) value of k implies less (more) shrinkage. Following Stock and Watson (1999), we use a value .001 to approximate the OLS combination of Bates and Granger (1969). For each value of k, we consider forecasts based on both a recursive estimate of the combination regression (using all available forecasts) and a 10-year rolling sample estimate (using just the most recent 10 years of forecasts, or all available if less than 10 years are available).

3.3 Common factor combinations

Stock and Watson (1999, 2004) develop another approach to combining information from individual model forecasts: estimating a common factor from the forecasts, regressing actual data on the common factor, and then using the fitted regression to forecast into the future. Therefore, using the real time forecasts available through the forecast origin t, we estimate (by principal components) one common factor from the set of 50 VAR forecasts for each of output, inflation, and the interest rate (estimating one factor for output, another for inflation, etc.). We then regress the actual data available in real time as of t on a constant and the factor. The factor-based forecast is then obtained from the estimated regression, using the factor observation for period t. As in the case of the ridge regressions, we compute factor-combination forecasts on both a recursive (using all available forecasts) and 10-year rolling (using just the most recent 10 years of forecasts, or all available if less than 10 years are available) basis.

3.4 MSE-weighted and PLS forecasts

We also consider several average forecasts based on inverse MSE weights.
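Returning to the ridge combination of section 3.2, equation (1) can be sketched numerically as follows. This is a minimal illustration rather than the authors' code: the function name and array layout are assumptions, and the number of forecasts is taken from the input instead of being fixed at 50.

```python
import numpy as np


def ridge_combination_weights(Z, z, k):
    """Generalized ridge combination weights shrunk toward equal weights.

    Z : (T, N) array, row t holding the N model forecasts of z[t]
    z : (T,) array of realized values available in real time
    k : shrinkage coefficient (larger k pulls the weights toward 1/N)
    """
    Z = np.asarray(Z, dtype=float)
    z = np.asarray(z, dtype=float)
    n_models = Z.shape[1]
    beta_equal = np.full(n_models, 1.0 / n_models)
    ZtZ = Z.T @ Z  # sum over t of Z_t Z_t'
    Ztz = Z.T @ z  # sum over t of Z_t z_t
    c = k * np.trace(ZtZ) / n_models
    # Solve (c I + Z'Z) beta = c beta_equal + Z'z without forming an inverse.
    return np.linalg.solve(c * np.eye(n_models) + ZtZ, c * beta_equal + Ztz)


def combine(weights, new_forecasts):
    """Combination forecast for one period from the estimated weights."""
    return float(np.dot(weights, np.asarray(new_forecasts, dtype=float)))
```

A 10-year rolling estimate would pass only the most recent 40 quarterly rows of `Z` and `z`; a recursive estimate passes all available rows. As k grows the weights approach 1/N, and as k shrinks toward zero they approach the OLS combination weights.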
At each forecast origin t, historical MSEs of the 50 VAR forecasts of each of output, inflation, and the interest rate are calculated with the available forecasts and actual data, and each forecast i of the given variable is given a weight of $MSE_i^{-1} / \sum_i MSE_i^{-1}$. In addition, following Stock and Watson (2004), we consider a forecast based on a discounted mean square forecast error (in which, from a forecast origin of t, the squared error in the earlier period s is discounted by a factor $\delta^{t-s}$). We use a discount rate of δ = .95.

We also consider a forecast based on the model(s) with lowest historical MSE (i.e., based on predictive least squares (PLS)). At each forecast origin t, we identify the model forecast with the lowest historical MSE, and then use that single model to forecast into the future. We compute alternative MSE-weighted and PLS forecasts with several different samples of historical forecasts: all available forecasts (recursive), a 10 year rolling window of forecasts, and a 5 year rolling window of forecasts.

3.5 Quartile forecasts

Aiolfi and Timmermann (2006) develop alternative approaches to forecast combination that take into account persistence in forecast performance: the possibility that some models may be consistently good while others may be consistently bad. Their simplest forecast is an equally weighted average of the forecasts in the top quartile of forecast accuracy (that is, the forecasts with historical MSEs in the lowest quartile of MSEs). More sophisticated forecasts involve measuring performance persistence as forecasting moves forward in time, sorting the forecasts into clusters based on past performance, and estimating combination regressions with a number of clusters determined by the degree of persistence. For tractability in our extensive real-time forecast evaluation, we consider simple versions of the Aiolfi-Timmermann methods, based on just the first and second quartiles. Specifically, we consider a simple average of the forecasts in the top quartile of historical forecast accuracy. We also consider a forecast based on an OLS-estimated combination regression including a constant, the average of the first quartile forecasts, and the average of the second quartile forecasts. We compute these quartile-based forecasts with several different samples of historical forecasts: all available forecasts (recursive), a 10 year rolling window of forecasts, and a 5 year rolling window of forecasts.

3.6 Bayesian model averages

Following Wright (2003) and Koop and Potter (2004), among others, we also consider forecasts obtained by Bayesian model averaging (BMA).
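The performance-based schemes of sections 3.4 and 3.5 (inverse MSE weights, discounted MSE weights, the PLS choice, and the top-quartile average) can be sketched as below. The function names, the array layout, and the rule that the top quartile contains N // 4 models are assumptions made for illustration.

```python
import numpy as np


def historical_mse(errors, discount=1.0):
    """(Discounted) MSE per model from a (T, N) array of past forecast errors.

    Rows run from oldest to most recent. The row that is j periods older than
    the latest row is weighted by discount**j; the normalisation below means
    any common factor in the discounting cancels out of the weights.
    """
    e = np.asarray(errors, dtype=float)
    powers = np.arange(e.shape[0] - 1, -1, -1)
    d = discount ** powers
    return (d[:, None] * e ** 2).sum(axis=0) / d.sum()


def inverse_mse_weights(errors, discount=1.0):
    """Weights proportional to 1 / MSE_i, normalised to sum to one."""
    inv = 1.0 / historical_mse(errors, discount)  # assumes no MSE is exactly zero
    return inv / inv.sum()


def pls_choice(errors):
    """Index of the single model with the lowest historical MSE (PLS)."""
    return int(np.argmin(historical_mse(errors)))


def top_quartile_average(errors, forecasts):
    """Equal-weight average of the forecasts whose historical MSE is lowest.

    Assumption: the top quartile holds N // 4 models (at least one).
    """
    mse = historical_mse(errors)
    n_top = max(1, mse.size // 4)
    best = np.argsort(mse)[:n_top]
    return float(np.asarray(forecasts, dtype=float)[best].mean())
```

Rolling versions follow by slicing the error history before the call, for example `errors[-40:]` for a 10-year window or `errors[-20:]` for a 5-year window of quarterly forecasts; `discount=0.95` gives the discounted weights described in the text.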
At each forecast origin t, for each equation of the 50 models listed in Table 1, we calculate a posterior probability using the conventional formula

$$\mathrm{Prob}(M_i \mid \mathrm{data}) = \frac{\mathrm{Prob}(\mathrm{data} \mid M_i) \times \mathrm{Prob}(M_i)}{\sum_i \mathrm{Prob}(\mathrm{data} \mid M_i) \times \mathrm{Prob}(M_i)}, \qquad (2)$$

where

$\mathrm{Prob}(M_i) \equiv$ prior probability on model i = 1/50,
$\mathrm{Prob}(\mathrm{data} \mid M_i) \equiv$ marginal likelihood for model i.
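Equation (2) can be sketched in a few lines. The sketch takes log marginal likelihoods as inputs and works in logs for numerical stability; that choice, and the function name, are assumptions of this illustration rather than something the paper specifies.

```python
import numpy as np


def posterior_model_probs(log_marginal_likelihoods, priors=None):
    """Posterior model probabilities from log marginal likelihoods.

    With priors=None every model receives the same prior weight 1/N,
    matching the equal prior used in the text (1/50 for 50 models).
    """
    ll = np.asarray(log_marginal_likelihoods, dtype=float)
    n = ll.size
    prior = np.full(n, 1.0 / n) if priors is None else np.asarray(priors, dtype=float)
    log_post = ll + np.log(prior)
    log_post = log_post - log_post.max()  # guard against underflow in exp
    post = np.exp(log_post)
    return post / post.sum()


def bma_forecast(model_forecasts, probs):
    """Model-averaged forecast: probability-weighted sum of model forecasts."""
    return float(np.dot(np.asarray(probs, dtype=float),
                        np.asarray(model_forecasts, dtype=float)))
```

With equal priors the prior terms cancel, so the posterior probabilities are simply the normalised marginal likelihoods.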

We consider several different measures of the marginal likelihood, each of which yields a different BMA forecast. The three measures are the AIC, BIC, and Phillips' (1996) PIC.[5] The BIC is well known to be proportional to the marginal likelihood of models estimated by OLS or, equivalently, diffuse priors. BMA applications such as Koop, Potter, and Strachan (2005) and Garratt, Koop, and Vahey (2006) have also used BIC to estimate the marginal likelihood and in turn average models. The AIC can be viewed as another measure of the marginal likelihood for models estimated by OLS. Phillips (1996) develops another criterion, PIC, as a measure of marginal likelihood appropriate for comparing VARs in levels, VARs in differences, and VARs estimated with informative priors (BVARs). Specifically, at each forecast origin t, for each of the model estimates listed in Table 1, we compute the AIC, BIC, and PIC for each equation of the model.[6] For each criterion, we then form a BMA forecast using -.5T times the information criterion value as the marginal likelihood of each equation.

In our application, calculating the information criteria requires some decisions on how to deal with some of the important differences in estimation approaches (e.g., rolling versus recursive estimation) for the 50 underlying model forecasts. In the case of models estimated with a rolling sample of data, we calculate the AIC, BIC, and PIC based on a model that allows a discrete break in all the model coefficients at the point of the beginning of the rolling sample. For models estimated by discounted least squares (DLS), we calculate the information criteria using residuals defined as actual data less fitted values based on the DLS coefficient estimates.
In the case of the AIC and BIC applied to BVAR models, for simplicity we abstract from the prior and calculate the criteria based on the residual sums of squares and simple parameter count (PIC is calculated for VARs and BVARs, to take account of priors, as describedinPhillips(1996)).7 AsPhillips(1996)notes,thepriorisasymptoticallyirrelevant in the sense that, as the sample grows, sample information dominates the prior. For marginal likelihood measures other than PIC, taking (proper Bayesian) account of the 5Note that our BMA forecasts are numerically equivalent (with equal prior weights on each model) to those that would be obtained under the information criteria–weighting approach developed in Kapetanios, Labhard, and Price (2005). These authors, however, suggest a frequentist, rather than Bayesian, interpretation of the information criterion–weighted forecast. 6IncalculatingPICfortheunivariateIMAmodelsforinflationandinterestrates,wesimplyapproximate theMAfitswithAR(1)modelsestimatedfor∆π and∆i(estimatingseparatemodelsfortherollingsample and the earlier sample), and calculate PIC values using these AR(1) approximations. 7For BVARs with TVP, at each point in time t we calculate the model residuals as a function of the period t coefficients and use these residuals to compute the residual sums of squares. 10

finite–sample role of the Bayesian prior in combining forecasts from models estimated with different priors would require Monte Carlo integration, which is intractable in our large– scale, real–time forecast evaluation.8 3.7 Benchmark forecasts To evaluate the practical value of all the averaging methods described above, we compare the accuracy of the above combination or average forecasts against various benchmarks. In light of common practice in forecasting research, we use forecasts from the univariate time series models as one set of benchmarks.9 We also include for comparison forecasts from selected VAR methods that are either of general interest in light of common usage or performed relatively well in our prior work: a VAR(4); DVAR(4) (a VAR with inflation and the interest rate differenced); BVAR(4) with conventional Minnesota priors; BVAR(4) with stochastically time–varying (random walk) parameters; and a BVAR(4) in output, detrended inflation, and the interest rate less the inflation trend. The BVAR(4) with inflation detrending draws on the work of Kozicki and Tinsley(2001a,b,2002)onmodelswithlearningaboutanunobservedtime–varyinginflation target of the central bank. For tractability in real time forecasting, we follow Cogley (2002) in estimating the inflation target or trend with exponential smoothing.10 Table 1 provides additional detail on all of these model specifications. 4 Results In evaluating the performance of the forecasting methods described above, we follow Stock and Watson (1996, 2003, 2005), among others, in using squared error to evaluate accuracy and considering forecast performance over multiple samples. Specifically, we measure accuracy with root mean square error (RMSE). In light of the evidence in Stock and Watson 8As Koop and Potter (2004) note, BMA allows for two types of shrinkage: (1) through priors on parameters imposed in parameter estimation and (2) through the model priors in the calculation of the BMA weights. 
Accordingly, in practice, there is some interchangeability between the two types of shrinkage. Asymptotically, the first form becomes irrelevant. Our simple approach with AIC and BIC corresponds to focusing entirely on the second form of shrinkage.

9 Of course, the choice of benchmarks today is influenced by the results of previous studies of forecasting methods. Although a forecaster today might be expected to know that an IMA(1) is a good univariate model for inflation, the same may not be said of a forecaster operating in 1970. For example, Nelson (1972) used as benchmarks AR(1) processes in the change in GNP and the change in the GNP deflator (both in levels rather than logs). Nelson and Schwert (1977) first proposed an IMA(1) for inflation.

10 As noted in Clark and McCracken (2006a), the resulting trend estimate is quite similar to measures of long–run inflation expectations.

(2003) and others of instabilities in forecast performance over time, we examine accuracy over forecast samples of 1970-84 and 1985-2005, to ensure our general results are robust across sample periods.11

To be able to provide broad, robust results, in total we consider a large number of models and methods, too many to be able to present all details of the results. In the interest of brevity, we present more detailed results on forecasts of GDP growth and inflation than forecasts of the output gap measures or interest rates. We also focus on a few forecast horizons (those for h = 0Q, h = 1Q, and h = 1Y) but do present results for the h = 2Y horizon.

Tables 3 and 4 report forecast accuracy (RMSE) results for GDP growth and either GDP price index-based or CPI-based inflation using 38 forecast methods. In each case we use the 3-month T–bill as the interest rate, and present results for horizons h = 0Q, h = 1Q, and h = 1Y. Table 5 reports the same but for the h = 2Y horizon. In Table 6 we report forecast accuracy results for the T-bill rate at all horizons, from models using GDP growth and GDP inflation. In every case, the first row of the table provides the RMSE associated with the baseline univariate model, while the others report ratios of the corresponding RMSE to that for the benchmark univariate model. Hence numbers less than one denote an improvement over the univariate baseline while numbers greater than one denote otherwise.

In Table 7 we take another approach to broadly determining which methods tend to perform better than the benchmark. Across each variable, model and forecast horizon, we compute the average rank of the methods included in Tables 3-6.
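The two summary statistics just described (RMSE ratios relative to the univariate benchmark, and average ranks across table columns) can be illustrated with a minimal Python sketch. The function names, the dictionary layout, and the lack of any tie handling are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def rmse(errors):
    """Root mean square error of a 1-D sequence of forecast errors."""
    errors = np.asarray(errors, dtype=float)
    return float(np.sqrt(np.mean(errors ** 2)))

def rmse_ratios(errors_by_method, benchmark):
    """RMSE of each method divided by the RMSE of the benchmark method.

    errors_by_method: dict mapping a (hypothetical) method name to its errors.
    A ratio below one means the method beat the benchmark in this sample.
    """
    base = rmse(errors_by_method[benchmark])
    return {name: rmse(e) / base for name, e in errors_by_method.items()}

def average_ranks(ratio_columns):
    """Average rank of each method across columns (rank 1 = lowest ratio).

    ratio_columns: list of dicts (method name -> RMSE ratio), one per
    variable/horizon/sample column. Ties are broken arbitrarily by name
    order, which is a simplification.
    """
    names = sorted(ratio_columns[0])
    ranks = {name: [] for name in names}
    for column in ratio_columns:
        ordered = sorted(names, key=lambda n: column[n])
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    return {name: float(np.mean(r)) for name, r in ranks.items()}
```

A method with an average rank well below that of the benchmark tends to sit near the top of many columns, even if it is not the single best forecast in any one of them.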
We present average rankings for every method we consider across each variable, forecast horizon, and the 1970-84 and 1985-05 samples (spanning all columns of Tables 3-6 plus unreported results for forecasts from models using an output gap as well as forecasts of the T-bill rate from models using our various measures of output and inflation).

To determine the statistical significance of differences in forecast accuracy, we use a non–parametric bootstrap patterned after White’s (2000) to calculate p–values for each RMSE ratio in Tables 3-6. The individual p–values represent a pairwise comparison of each VAR or average forecast to the univariate forecast. RMSE ratios that are significantly less than 1 at the 10 percent significance level are indicated with a slanted font. To determine whether

11 With forecasts dated by the end period of the forecast horizon h = 0,1,4, the VAR forecast samples are, respectively, 1970:Q1+h to 1984:Q4 and 1985:Q1 to 2005:Q3-h.

a best forecast in each column of the tables is significantly better than the benchmark once the data snooping or search involved in selecting a best forecast is taken into account, we apply Hansen’s (2005) (bootstrap) SPA test to differences in MSEs (for each model relative to the benchmark). Hansen shows that, if the variance of the forecast loss differential of interest differs widely across models, his SPA test will typically have much greater power than White’s (2000) reality check test. For each column, if the SPA test yields a p–value of 10 percent or less, we report the associated RMSE ratio in bold font. Because the SPA test is based on t–statistics for equal MSE instead of just differences in MSE (that is, takes MSE variability into account), the forecast identified as being significantly best by SPA may not be the forecast with the lowest RMSE ratio.12

We implement the bootstrap procedures by sampling from the time series of forecast errors underlying the entries in Tables 3-6. For simplicity, we use the moving block method of Kunsch (1989) and Liu and Singh (1992) rather than the stationary bootstrap actually used by White (2000) and Hansen (2005); the moving block is also asymptotically valid. The bootstrap is applied separately for each forecast horizon, using a block size of 1 for the h = 0Q forecasts, 2 for h = 1Q, 5 for h = 1Y, and 9 for h = 2Y.13 In addition, in light of the potential for changes over time in forecast error variances, the bootstrap is applied separately for each subperiod. Note, however, that the bootstrap sampling preserves the correlations of forecast errors across forecast methods.

4.1 Declining volatility

While there are many nuances in the detailed results, some clear patterns emerge. The univariate RMSEs clearly show the reduced volatility of the economy since the early 1980s, particularly for output. For each horizon, the benchmark univariate RMSEs of GDP growth declined by roughly two-thirds across the 1970-84 and 1985-05 samples (Tables 3-5).
The reduced volatility continues to be evident for the inflation measures (Tables 3-5). At the shorter horizons, h = 0Q and h = 1Q, the benchmark RMSEs fell by roughly half, but at the longer h = 1Y and h = 2Y horizons the variability declined by nearly two-thirds. The reverse is true for the interest rate forecasts (Table 6). At the shorter horizons the benchmark RMSEs fell by roughly two-thirds but at the longer horizons the variability declined by less

12 For multi–step forecasts, we compute the variance entering the t–test using the Newey and West (1987) estimator with a lag length of 1.5 ∗ h, where h denotes the number of forecast periods.

13 For a forecast horizon of τ periods, forecast errors from a properly specified model will follow an MA(τ − 1) process. Accordingly, we use a moving block size of τ for a forecast horizon of τ.
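The moving block resampling described in Section 4, with the block length set to the horizon τ as in footnote 13, might be sketched as follows in Python. This is only an illustration of the resampling step under stated assumptions: the function names, the (T, M) array layout, and the number of draws are hypothetical, and the sketch does not reproduce the recentering needed to turn the draws into White-style p–values or the SPA statistic.

```python
import numpy as np

def moving_block_indices(n_obs, block_size, rng):
    """Draw n_obs time indices by concatenating random overlapping blocks."""
    n_blocks = -(-n_obs // block_size)  # ceiling division
    starts = rng.integers(0, n_obs - block_size + 1, size=n_blocks)
    idx = np.concatenate([np.arange(s, s + block_size) for s in starts])
    return idx[:n_obs]  # trim the last block so the sample length matches

def bootstrap_rmse_ratios(errors, benchmark_col, block_size, n_draws=999, seed=0):
    """Bootstrap draws of RMSE ratios relative to a benchmark column.

    errors: (T, M) array of forecast errors, one column per method.
    Whole rows are resampled, so the correlation of errors across
    methods is preserved within each draw.
    """
    errors = np.asarray(errors, dtype=float)
    rng = np.random.default_rng(seed)
    n_obs, n_methods = errors.shape
    draws = np.empty((n_draws, n_methods))
    for b in range(n_draws):
        sample = errors[moving_block_indices(n_obs, block_size, rng)]
        rmse = np.sqrt(np.mean(sample ** 2, axis=0))
        draws[b] = rmse / rmse[benchmark_col]
    return draws
```

In keeping with the text, such a routine would be called separately for each horizon (block sizes 1, 2, 5, and 9) and for each subperiod.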

than half.

4.2 Declining predictability

Consistent with the results in Campbell (2005), D’Agostino, et al. (2005), Stock and Watson (2006), and Tulip (2005), there are some clear signs of a decline in the predictability of both output and inflation: it has become harder to beat the accuracy of a univariate forecast. For example, at forecast horizons of h = 1Y or less, most methods or models beat the accuracy of the univariate forecast of GDP growth during the 1970-84 period (Tables 3 and 4). In fact, many do so at a level that is statistically significant; at each horizon Hansen’s (2005) SPA test identifies a statistically significant best performer. But over the 1985-2005 period, for h = 0Q and h = 1Q forecasts only the BVAR(4)-TVP models are more accurate than the benchmark, and that improvement fails to be statistically significant. At the h = 1Y horizon a handful of the methods continue to outperform the benchmark univariate, but very few of the gains are statistically significant. Interestingly, at the longest h = 2Y horizon (Table 5), it appears that it has become modestly easier to predict GDP growth, though again, few of the gains are statistically significant.

The predictability of inflation has also declined, although less dramatically than for output. For example, in models with GDP growth and GDP inflation (Table 3), the best 1–year ahead forecasts of inflation improve upon the univariate benchmark RMSE by more than 10 percent in the 1970-84 period but only about 5 percent in 1985-05. The evidence of a decline in inflation predictability is perhaps most striking for CPI forecasts at the h = 0Q horizon. In Table 4, most of the models convincingly outperform the univariate benchmark during the 1970-84 period, with statistically significant maximal gains of roughly 20 percent. But in the following period, fewer methods outperform the benchmark, with gains typically about 4 percent.14

Predictability of the T-bill rate has not so much declined as it has shifted to a longer horizon.
In Table 6 we see that at the h = 0Q horizon far fewer methods outperform the univariate benchmark as we move from the 1970-84 period to the 1985-05 period. However, the decline in relative predictability starts to weaken as the forecast horizon increases. At the h = 1Q horizon some methods continue to beat the benchmark, although with maximal gains of only about 5 percent. But at the h = 1Y and h = 2Y horizons, not only do a

14 Some of the change in CPI predictability at the h = 0Q horizon could be due to the 1983 change in the CPI’s treatment of housing. Prior to 1983, changes in mortgage interest costs were included in the CPI.

larger number of methods improve upon the benchmark, they do so with maximal gains that are substantial and statistically significant, at about 12 percent.

4.3 Averaging methods that typically outperform the benchmark

The sharp decline of predictability makes it difficult to identify models or averaging methods that consistently beat the accuracy of the univariate benchmarks. The considerable sampling error inherent in small sample forecast comparisons further compounds the difficulty of finding a method that always or nearly always beats the univariate benchmark. Suppose, for example, that there exists a model or average forecast that, in population, is somewhat more accurate (by 10 percent, say) than the univariate benchmark. For forecast samples of roughly 15 years, there is a good chance that, in a given sample, the univariate forecast will actually be more accurate (see, e.g., Clark and McCracken’s (2006b) results for Phillips curve forecasts of inflation). The sampling uncertainty grows with the forecast horizon. As a result, we probably shouldn’t expect to be able to identify a particular forecast model or method that beats the univariate benchmark for every variable, horizon, and sample period. Instead, we might judge a model or method a success if it beats the univariate benchmark most of the time (with some consistency across the 1970-84 and 1985-05 samples) and, when it fails to do so, is not dramatically worse than the univariate benchmark.

With these considerations in mind, the best forecast would appear to come from the pairwise averaging class: the single best forecast is an average of the univariate forecast with the forecast from a VAR(4) with inflation detrending (a VAR(4) in $y$, $\pi - \pi^{*}_{-1}$, and $i - \pi^{*}_{-1}$, motivated by the work of Kozicki and Tinsley (2001a,b, 2002)).
More so than any other forecast, the forecast based on an average of the univariate and inflation detrended VAR(4) projections beats the univariate benchmark a very high percentage of the time and, when it fails to do so, is generally comparable to the univariate forecast. For example, in the case of forecasts of GDP growth and GDP inflation from models in these variables and the T-bill rate (Table 3), this pairwise average’s RMSE ratio is less than 1 for all samples and horizons, with the exception of h = 0Q and h = 1Q forecasts of GDP growth for 1985-05, in which cases the RMSE ratio is only slightly above 1. For 1–year ahead forecasts of GDP growth, the RMSE of this average forecast is about 15 percent below the univariate benchmark for 1970-84 and 9 percent below for 1985-05; the corresponding figures for GDP inflation are each roughly 3 percent.

While not quite as good as the average of the univariate and inflation detrended VAR

forecasts, some other averages also seem to perform well, beating the accuracy of the univariate benchmark with sufficient consistency as to be considered superior. In particular, two of the other pairwise forecasts (the VAR(4) with univariate and DVAR(4) with univariate averages) are often, although not always, more accurate than the univariate benchmarks. For instance, in forecasts of GDP growth and CPI inflation (Table 4), these pairwise averages’ RMSE ratios are less than 1 in 8 of 12 columns, and only slightly to modestly above 1 in the exceptions. The VAR(4)–univariate average tends to have a more consistent advantage in 1985-05 forecasts. In addition, among the inflation forecasts, the three pairwise combinations (univariate with inflation detrended VAR(4), VAR(4) and DVAR(4)) are the most consistent out-performers of the univariate benchmark across both the 1970-84 and 1985-05 subsamples.

The rankings in Table 7 confirm that, from a broad perspective, the best forecasts are simple averages. In these rankings, the single best forecast is the average of the forecasts from the univariate and inflation detrended VAR(4). Across all variables, horizons, and samples, this forecast has an average ranking of 6.4; the next–best forecast, the average of the univariate and VAR(4) forecasts, has an average ranking of 12.0. While the univariate–inflation detrended VAR(4) average is, in relative terms, especially good for forecasting the T-bill rate (see column 5), this forecast retains its top rank even when interest rate forecasts are dropped from the calculations (column 2). This average forecast also performs relatively well for forecasting both output (column 3 shows it ranks a close second to the BVAR(4) with inflation detrending) and inflation (column 4 shows it ranks first). As to sample stability, the univariate–inflation detrended VAR(4) average is best in each of the 1970-84 and 1985-05 samples (columns 6-7).
4.4 Averaging methods that sometimes outperform the benchmark

Among other forecasts, it is difficult to identify any methods that might be seen as consistently equaling or materially beating the univariate benchmark. Take, for instance, the simple equally weighted average of all forecasts, applied to a model in GDP growth, GDP inflation, and the T-bill rate (Table 3). This averaging approach is consistent in beating the univariate benchmark in the 1970-84 sample, but in most cases fails to beat the benchmark in the 1985-05 sample. Similarly, in the case of T-bill forecasts from the same model (Table 6, left half), the all–model average loses out to the univariate benchmark for two of the six combinations of horizon and sample, while the generally best–performing method

of averaging the univariate and inflation detrended VAR(4) forecasts beats the univariate benchmark in all cases. A number of the other averaging methods perform quite comparably to the simple average — and thus, by extension, fail to consistently equal or beat (materially) the univariate benchmark. Among the broad average forecasts, from the results in Tables 3-6 there seems to be no advantage of a median or trimmed mean forecast over the simple average. The accuracy of these forecasts tends to be quite similar. For example, in the case of 1-year ahead forecasts of GDP growth and GDP inflation for 1985-05, the 20 percent trimmed mean forecast’s RMSE ratios are .972 (growth) and 1.023 (inflation), compared to the simple average’s ratios of, respectively, .962 and 1.036 (Table 3). Similarly, MSE–weighted forecasts are comparable to simple average forecasts, in terms of RMSE accuracy.15 To use the same example of 1-year ahead forecasts of GDP growth and GDP inflation for 1985-05, the recursively MSE–weighted forecast’s RMSE ratios are .957 (growth) and 1.028 (inflation), compared to the simple average’s ratios of, respectively, .962 and 1.036 (Table 3). In 1-year ahead forecasts of CPI inflation (Table 4), the RMSE ratio of the recursively MSE–weighted forecast is .951 for 1970-84 and 1.055 for 1985-05, compared to the simple average forecast’s RMSE ratios of .950 and 1.066, respectively. Using the best–quartile forecast yields mixed results: the best quartile forecasts are sometimes more accurate and other times less accurate than the simple average and univariate forecasts. For example, in Table 4’s results for 1-year ahead forecasts of GDP growth, the best quartile forecast based on a 10 year rolling sample has a RMSE ratio of .780 for 1970-84 and 1.017 for 1985-05, compared to the simple average forecast’s RMSE ratios of, respectively, .839 and .997. 
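To make the comparison among the simple average, median, trimmed mean, and MSE–weighted combinations concrete, the Python sketch below computes each of them for one period's set of model forecasts. It is illustrative only: the function names are hypothetical, the trimming rule (a total share split evenly between the two tails) is an assumption, and the weights proportional to inverse past MSE are one common reading of "MSE–weighted" rather than a confirmed description of the paper's weighting formula.

```python
import numpy as np

def trimmed_mean(forecasts, trim_fraction=0.2):
    """Mean after dropping the lowest and highest forecasts.

    Assumption: trim_fraction is the total share removed, split evenly
    between the lower and upper tails.
    """
    x = np.sort(np.asarray(forecasts, dtype=float))
    k = int(len(x) * trim_fraction / 2)  # number dropped from each tail
    return float(np.mean(x[k:len(x) - k]))

def inverse_mse_weights(past_errors):
    """Weights proportional to 1/MSE of each model's past forecast errors.

    past_errors: (T, M) array; returns M weights that sum to one.
    """
    mse = np.mean(np.asarray(past_errors, dtype=float) ** 2, axis=0)
    inv = 1.0 / mse
    return inv / inv.sum()

def combine(forecasts, past_errors):
    """Return several combinations of one period's M model forecasts."""
    f = np.asarray(forecasts, dtype=float)
    return {
        "mean": float(np.mean(f)),
        "median": float(np.median(f)),
        "trimmed_mean": trimmed_mean(f),
        "mse_weighted": float(inverse_mse_weights(past_errors) @ f),
    }
```

When the models' past MSEs are similar, the inverse-MSE weights are close to equal, which is one way to see why such weighted forecasts can end up close to the simple average.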
Similarly, for Table 4’s CPI inflation forecasts, the 10 year rolling best quartile approach yields a forecast that is more accurate than the simple average for 1970-84 and less accurate for 1985-05. Where the best quartile forecast seems to have a consistent advantage over a simple average is in output forecasts for 1970-84.

The rankings in Table 7 confirm the broad similarity of the above methods: the simple average, MSE–weighted averages, and best quartile forecasts. For example, the simple average forecast has an overall average ranking of 14.5, compared to rankings of 12.0 for the recursive MSE–weighted forecast and 12.6 for the recursive best quartile forecast. By comparison, the best forecast, the univariate–inflation detrended VAR(4) average, has an

15 However, in the case of forecasts of the HP output gap, the MSE–weighted averages are consistently slightly better than the simple averages.

overall ranking of 6.4. In a very broad sense, most of the aforementioned average forecasts are better than the univariate benchmarks in that they all have better (numerically lower) average rankings than the univariate’s average ranking of 17.3 (column 1). Note, however, that most of their advantage comes in the 1970-84 sample; in the later sample, the univariate forecast generally ranks better. For instance, for 1970-84 output and inflation forecasts, the all–model average has an average accuracy rank of 13.4, compared to the univariate ranking of 21.8 (column 6). But for 1985-05 forecasts, the all–model average has an average accuracy rank of 16.6, compared to the univariate ranking of 13.9 (column 7).

4.5 Averaging methods that rarely outperform the benchmark

Many of the other averaging or combination methods are clearly dominated by univariate benchmarks (and, in turn, other average forecasts). OLS combinations or ridge combinations that approximate OLS often fare especially poorly. The OLS–approximating ridge regression combination (the one with k = .001) consistently yields poor forecasts. For example, in the case of 1985-05 1–year ahead forecasts of CPI inflation from models with GDP growth (Table 4), the RMSE ratio of the recursively estimated ridge regression with shrinkage parameter of .001 is 1.458. In other instances, the RMSE of the OLS–approximating ridge combination is about twice as large as that of the univariate benchmark. Similarly, the forecasts based on OLS combination regression using the first and second quartile average forecasts (especially those using rolling samples) are generally dominated by other average forecasts. In the same example, the RMSE ratios of the forecasts based on rolling OLS combinations of the top two quartile forecasts are 1.125 (10 year rolling) and 1.110 (5 year rolling), respectively, compared to the all–model average forecast’s RMSE ratio of 1.066.
While using more shrinkage improves the accuracy of forecast combinations estimated with generalized ridge regression, even the combinations based on ridge regression with non–trivial shrinkage are generally less accurate than the univariate benchmarks and simple average forecasts. For example, in 1985-05 forecasts of GDP growth from models using the GDP inflation measure (Table 3), the RMSE ratios of the k = 1 recursive ridge regression forecast are all above those of the simple average forecast. While the ridge forecasts are more commonly beaten by the simple average, there are, to be sure, a number of instances (as in the same example, but with a forecast sample of 1970-84) in which ridge forecasts are more accurate. On balance, though, the ridge combinations seem to be inferior to alternatives such as the simple average forecast.
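The role of the shrinkage parameter k can be illustrated with a small Python sketch of a ridge combination regression that shrinks the weights toward equal weighting. The specific form used here (penalty c = k × trace(F'F)/M and a target weight of 1/M on each of the M forecasts) is an assumption chosen so that small k approximates OLS and large k approaches the simple average; the function name is hypothetical and the paper's exact specification may differ.

```python
import numpy as np

def ridge_combination_weights(forecasts, actuals, k):
    """Combination weights from a ridge regression shrunk toward equal weights.

    forecasts: (T, M) array of past forecasts; actuals: length-T outcomes.
    Assumption: penalty c = k * trace(F'F) / M, shrinkage target 1/M.
    """
    F = np.asarray(forecasts, dtype=float)
    y = np.asarray(actuals, dtype=float)
    n_models = F.shape[1]
    FtF = F.T @ F
    c = k * np.trace(FtF) / n_models          # scale-free shrinkage strength
    target = np.full(n_models, 1.0 / n_models)  # equal-weight target
    # Solve (F'F + cI) w = F'y + c * target for the weight vector w.
    return np.linalg.solve(FtF + c * np.eye(n_models), F.T @ y + c * target)
```

With a short estimation sample and many forecasts, weights estimated with very small k inherit the sampling noise of OLS, which is consistent with the poor performance of the k = .001 combination reported above.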

Forecasts based on using factor model methods to obtain a combination are also generally less accurate than alternatives such as the univariate and simple average forecasts. For example, in the case of 1-year ahead forecasts of GDP growth and GDP inflation for 1985-05, the recursively estimated factor combination forecast’s RMSE ratios are 1.021 (growth) and 1.536 (inflation), compared to the simple average’s ratios of, respectively, .962 and 1.036 (Table 3). The same is true for the PLS forecasts: although PLS forecasts are sometimes more accurate than the simple average, they are often worse. In the same example, the recursive PLS forecast’s RMSE ratios are 1.108 and 1.011, respectively.

The BMA forecasts are also generally, although not universally, dominated by the simple average. For example, in Table 6’s forecasts of the T-bill rate, the RMSE ratios of the BMA: BIC forecast are consistently above the ratios of the simple average forecast. However, in Table 3’s results for GDP growth and GDP inflation, the accuracy of the BMA: BIC forecast is generally comparable to that of the simple average forecast. Among the alternative BMA forecasts, there are times when those using AIC or PIC to measure the marginal likelihood are more accurate than those using BIC. But more typically, the BMA: BIC forecast is more accurate than the BMA: AIC and BMA: PIC forecasts; the pattern is especially clear in 1985-05 forecasts.

The rankings in Table 7 provide a clear and convenient listing of the forecast methods that are generally dominated by the univariate benchmark and alternatives such as the best–performing pairwise average forecast and the all–model simple average. As previously mentioned, generalized ridge forecasts with little shrinkage (k = .001, so as to approximate OLS–based combination) typically perform among the worst forecasts for all horizons, variables and periods, with average ranks consistently in the low- to mid-30s.
OLS combinations of quartile forecasts also fare quite poorly when based on rolling samples, with ranks generally in the mid-20s to low 30s. The factor–based combination forecasts are also consistently ranked in the bottom tier, with average rankings generally in the mid-20s. While not necessarily in the bottom tier, the BMA forecasts are generally dominated by the simple average forecast. The overall rankings of the BMA: BIC, BMA: PIC, and BMA: AIC forecasts are 22.0, 25.4, and 29.0, respectively, compared with the simple average forecast’s ranking of 14.5 (first column). The average ranks of the PLS forecasts are consistently around 20 (or much worse in the 5 year rolling case). The ridge–based combination forecasts with the highest degree of shrinkage (k = 1) fare much better than the OLS–approximating ridge

combinations, but consistently rank below the simple average forecast. For example, as shown in the first column, the 10–year rolling ridge regression with k = 1 has an average ranking of 16.9.

4.6 Single VAR methods

Among the single VAR forecasts included for comparison, the BVAR(4) with inflation detrending is generally best. While shrinkage in the form of averaging forecasts from an inflation detrended VAR(4) with univariate forecasts is better than estimating the inflation detrended VAR(4) by Bayesian methods, the latter at least performs comparably to the simple average forecast. For example, as shown in Table 3, forecasts of GDP growth from the BVAR(4) with inflation detrending are often at least as accurate as the simple average forecasts (as, for example, with 1-year ahead forecasts for 1985-05). However, forecasts of GDP inflation from the same model are generally less accurate than the simple average (see, for example, the 1-year ahead forecasts for 1985-05). These examples reflect a pattern evident throughout Tables 3-4: while inflation detrending might be expected to most improve inflation forecasts, it instead most improves output forecasts.

Although the accuracy of the other individual VAR models is more variable, overall these models are more clearly dominated by the univariate benchmark and others such as the simple average forecast. For example, in the case of the BVAR(4) using GDP growth and GDP inflation (and the T-bill rate), the simple average forecasts are generally more accurate than the BVAR(4) forecasts of growth over 1970-84, inflation over 1970-84, and inflation over 1985-05 (Table 3).

Consistent with these examples, in general, forecasts from single models are dominated by average forecasts. The pattern is clearly evident in the average rankings of Table 7. Across all variables, horizons, and samples, the best–ranked single model is the BVAR(4) with inflation detrending, which is out–ranked by 4 different average forecasts.
The other single models rank well below the BVAR(4) with inflation detrending.

While averages are broadly more accurate than single model forecasts, it is less clear that they are consistently more accurate across sample periods. To check consistency, we calculated the correlation of the ranks of all 32 average forecasts and all 50 single model forecasts across the 1970-84 and 1985-05 periods, based on the inflation and output results covered in columns 6-7 of Table 7 (using rankings including T-bill rate forecasts yields essentially the same correlations). The correlation of single model forecast rankings is 53 percent; the correlation of the average forecast rankings is 92 percent. The implication is

that not only is the typical average forecast more accurate than the typical single model forecast, it is also consistently so across the two periods.

4.7 Interpretation

Why might simple averages in general and the pairwise average of univariate and inflation–detrended VAR(4) forecasts be more accurate than any single model? As noted in the introduction, in practice it is very difficult to know the form of structural instability, and competing models will differ in their sensitivity to structural change. In such an environment, averages across models are likely to be superior to any single forecast. In line with prior research on combining a range of forecasts that incorporate information from different variables (such as Stock and Watson (1999, 2004) and Smith and Wallis (2005)), simple equally weighted averages are typically at least as good as averages based on weights tied to historical forecast accuracy. The limitations of weighted averages relative to simple averages are commonly attributed to difficulties in estimating potentially optimal weights in finite samples, especially when the cross–section dimension is large relative to the time dimension.

As to the particular success of forecasts using inflation detrending, one interpretation is that removing a smooth inflation trend (a trend that matches up well with long–term inflation expectations) from both inflation and the interest rate does a reasonable job of capturing non–stationarities in inflation and interest rates. Kozicki and Tinsley (2001a,b, 2002) have developed such VARs from models with learning about an unobserved, time–varying inflation target of the central bank. However, such a single representation is surely not the true model, and noise in estimating the many parameters of the model likely has an adverse effect on forecast accuracy. Therefore, a better forecast can be obtained by applying some form of shrinkage.
One approach, which primarily addresses parameter estimation noise, is to use Bayesian shrinkage in estimating the VAR with inflation detrending. Another approach is to combine forecasts from the inflation detrended VAR with forecasts from an alternative model, in our case the univariate benchmark (note that the IMA(1) benchmarks for inflation and the T-bill rate imply random walk trends).16 Koop and Potter (2004) note that such model averaging can be viewed as a form of shrinkage for addressing both parameter estimation noise and

16 As discussed in Stock and Watson (2006), suppose inflation is equal to the sum of a trend component and a cycle component. Moreover, suppose the trend is a random walk and the cycle is just white noise. The change in inflation is then equal to the sum of the trend innovation and the change in the cycle component, which is an MA(1) process.
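The argument in footnote 16 can be written out in a few lines. The notation below ($\mu_t$ for the trend, $c_t$ for the cycle, $\eta_t$ and $\varepsilon_t$ for the innovations) is introduced here for illustration, and the derivation assumes the two innovations are white noise and uncorrelated with each other at all leads and lags.

```latex
\begin{align*}
\pi_t &= \mu_t + c_t, \qquad \mu_t = \mu_{t-1} + \eta_t, \qquad c_t = \varepsilon_t, \\
\Delta\pi_t &= \eta_t + \varepsilon_t - \varepsilon_{t-1}, \\
\gamma_0 &= \sigma_\eta^2 + 2\sigma_\varepsilon^2, \qquad
\gamma_1 = -\sigma_\varepsilon^2, \qquad
\gamma_j = 0 \ \text{ for } j \ge 2 .
\end{align*}
```

Because the autocovariances $\gamma_j$ of $\Delta\pi_t$ vanish beyond the first lag, $\Delta\pi_t$ has an MA(1) representation, so the level of inflation is IMA(1) with a random walk trend. Under these assumptions the first-order autocorrelation $\gamma_1/\gamma_0$ lies between $-1/2$ and $0$.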

model uncertainty. The superiority of this average forecast can be interpreted as highlighting the value of inflation detrending, shrinkage of parameter noise, and shrinkage to deal with model uncertainty.17

5 Conclusion

In this paper we provide empirical evidence on the ability of several forms of forecast averaging to improve the real–time forecast accuracy of small-scale macroeconomic VARs in the presence of uncertain forms of model instability. Focusing on six distinct trivariate models incorporating different measures of output and inflation (but a common interest rate measure), we consider a wide range of approaches to averaging forecasts obtained with a variety of primitive methods for managing model instability. These primitive methods include sequentially updating lag orders, using various observation windows for estimation, working in differences rather than levels, making intercept corrections (as in Clements and Hendry (1996)), allowing stochastic time variation in model parameters, allowing discrete breaks in parameters identified with break tests, discounted least squares estimation, Bayesian shrinkage, and detrending of inflation and interest rates. The forecast averages include: equally weighted averages with and without trimming, medians, common factor–based combinations, combinations estimated with ridge regression, MSE–weighted averages, lowest MSE forecasts (predictive least squares forecasts), Bayesian model averages, and combinations based on quartile average forecasts.

Our results indicate that some forms of model averaging do consistently improve forecast accuracy in terms of root mean square errors. Not surprisingly, the best method often varies with the variable being forecasted, but several patterns do emerge.
After aggregating across all models, horizons and variables being forecasted, it is clear that the simplest forms of model averaging (such as those that use equal weights across all models or those that average a univariate model with a particular VAR, such as a VAR(4) with inflation detrending) consistently perform among the best methods. At the other extreme, forecasts based on OLS–type combination and factor model–based combination rank among the worst.

17 The results of Clark and McCracken (2005b) can be used to make a frequentist case for averaging the inflation detrended VAR with the univariate benchmark, based entirely on parameter estimation error.

References

Aiolfi, Marco and Allan Timmermann (2006), “Persistence in Forecasting Performance and Conditional Combination Strategies,” Journal of Econometrics 135, 31-54.
Bates, J.M. and Clive W.J. Granger (1969), “The Combination of Forecasts,” Operations Research Quarterly 20, 451-468.
Beyer, Andreas and Roger E.A. Farmer (2006), “Natural Rate Doubts,” Journal of Economic Dynamics and Control, forthcoming.
Boivin, Jean (2005), “Has U.S. Monetary Policy Changed? Evidence from Drifting Coefficients and Real-Time Data,” Journal of Money, Credit and Banking 38, 1149-1173.
Campbell, Sean D. (2005), “Stock Market Volatility and the Great Moderation,” Federal Reserve Board FEDS Working Paper No. 2005-47.
Clark, Todd E. and Michael W. McCracken (2005a), “Combining Forecasts from Nested Models,” manuscript, Federal Reserve Bank of Kansas City.
Clark, Todd E. and Michael W. McCracken (2005b), “Improving Forecast Accuracy by Combining Recursive and Rolling Forecasts,” manuscript, Federal Reserve Bank of Kansas City.
Clark, Todd E. and Michael W. McCracken (2006a), “Forecasting with Small Macroeconomic VARs in the Presence of Instability,” manuscript, Federal Reserve Bank of Kansas City.
Clark, Todd E. and Michael W. McCracken (2006b), “The Predictive Content of the Output Gap for Inflation: Resolving In–Sample and Out–of–Sample Evidence,” Journal of Money, Credit, and Banking 38, 1127-1148.
Clements, Michael P. and David F. Hendry (1996), “Intercept corrections and structural change,” Journal of Applied Econometrics 11, 475-494.
Cogley, Timothy (2002), “A Simple Adaptive Measure of Core Inflation,” Journal of Money, Credit, and Banking 34, 94-113.
Cogley, Timothy and Thomas J. Sargent (2001), “Evolving Post World War II U.S. Inflation Dynamics,” NBER Macroeconomics Annual 16, 331-373.
Cogley, Timothy and Thomas J. Sargent (2005), “Drifts and Volatilities: Monetary Policies and Outcomes in the Post World War II U.S.,” Review of Economic Dynamics 8, 262-302.
Croushore, Dean (2006), “Forecasting with Real–Time Macroeconomic Data,” Handbook of Forecasting, G. Elliott, C.W.J. Granger, and A. Timmermann, eds., North Holland. Croushore, Dean and Tom Stark (2001), “A Real-Time Data Set for Macroeconomists,” Journal of Econometrics 105, 111-30. D’Agostino, Antonello, Domenico Giannone and Paolo Surico (2005), “(Un)Predictability and Macroeconomic Stability,” manuscript, ECARES. 23

Del Negro, Marco and Frank Schorfheide (2004), “Priors from General Equilibrium Models for VARs,” International Economic Review 45, 643-673.

Doan, Thomas, Robert Litterman and Christopher Sims (1984), “Forecasting and Conditional Prediction Using Realistic Prior Distributions,” Econometric Reviews 3, 1-100.

Estrella, Arturo and Jeffrey C. Fuhrer (2003), “Monetary Policy Shifts and the Stability of Monetary Policy Models,” Review of Economics and Statistics 85, 94-104.

Favero, Carlo and Massimiliano Marcellino (2005), “Modelling and Forecasting Fiscal Variables for the Euro Area,” Oxford Bulletin of Economics and Statistics, forthcoming.

Garratt, Anthony, Gary Koop and Shaun P. Vahey (2006), “Forecasting Substantial Data Revisions in the Presence of Model Uncertainty,” Reserve Bank of New Zealand Discussion Paper 2006/02.

Jacobson, Tor, Per Jansson, Anders Vredin and Anders Warne (2001), “Monetary Policy Analysis and Inflation Targeting in a Small Open Economy: a VAR Approach,” Journal of Applied Econometrics 16, 487-520.

Hallman, Jeffrey J., Richard D. Porter and David H. Small (1991), “Is the Price Level Tied to the M2 Monetary Aggregate in the Long Run?” American Economic Review 81, 841-858.

Hansen, Peter R. (2005), “A Test for Superior Predictive Ability,” Journal of Business and Economic Statistics 23, 365-380.

Hodrick, Robert and Edward C. Prescott (1997), “Post-War U.S. Business Cycles: A Descriptive Empirical Investigation,” Journal of Money, Credit, and Banking 29, 1-16.

Kapetanios, George, Vincent Labhard and Simon Price (2005), “Forecasting Using Bayesian and Information Theoretic Model Averaging: An Application to UK Inflation,” Bank of England Working Paper no. 268.

Koop, Gary and Simon Potter (2004), “Forecasting in Dynamic Factor Models Using Bayesian Model Averaging,” Econometrics Journal 7, 550-565.

Koop, Gary, Simon M. Potter and Rodney W. Strachan (2005), “Reexamining the Consumption-Wealth Relationship: the Role of Model Uncertainty,” Federal Reserve Bank of New York Staff Report no. 202.

Kozicki, Sharon and Barak Hoffman (2004), “Rounding Error: A Distorting Influence on Index Data,” Journal of Money, Credit, and Banking 36, 319-38.

Kozicki, Sharon and Peter A. Tinsley (2001a), “Shifting Endpoints in the Term Structure of Interest Rates,” Journal of Monetary Economics 47, 613-652.

Kozicki, Sharon and Peter A. Tinsley (2001b), “Term Structure Views of Monetary Policy under Alternative Models of Agent Expectations,” Journal of Economic Dynamics and Control 25, 149-84.

Kozicki, Sharon and Peter A. Tinsley (2002), “Alternative Sources of the Lag Dynamics of Inflation,” in Price Adjustment and Monetary Policy, Bank of Canada Conference Proceedings, 3-47.

Kunsch, Hans R. (1989), “The Jackknife and the Bootstrap for General Stationary Observations,” Annals of Statistics 17, 1217-1241.

Litterman, Robert B. (1986), “Forecasting with Bayesian Vector Autoregressions: Five Years of Experience,” Journal of Business and Economic Statistics 4, 25-38.

Liu, Regina Y. and Kesar Singh (1992), “Moving Blocks Jackknife and Bootstrap Capture Weak Dependence,” in R. Lepage and L. Billiard, eds., Exploring the Limits of Bootstrap, New York: Wiley, 22-148.

Nelson, Charles R. (1972), “The Predictive Performance of the FRB-MIT-PENN Model of the U.S. Economy,” American Economic Review 62, 902-917.

Nelson, Charles R. and G. William Schwert (1977), “Short-Term Interest Rates as Predictors of Inflation: On Testing the Hypothesis that the Real Rate of Interest is Constant,” American Economic Review 67, 478-486.

Newey, Whitney K. and Kenneth D. West (1987), “A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix,” Econometrica 55, 703-708.

Orphanides, Athanasios and Simon van Norden (2005), “The Reliability of Inflation Forecasts Based on Output Gap Estimates in Real Time,” Journal of Money, Credit, and Banking 37, 583-601.

Pesaran, M. Hashem, Davide Pettenuzzo and Allan Timmermann (2006), “Forecasting Time Series Subject to Multiple Structural Breaks,” Review of Economic Studies 73, 1057-1084.

Pesaran, M. Hashem, and Allan Timmermann (2006), “Selection of Estimation Window in the Presence of Breaks,” Journal of Econometrics, forthcoming.

Phillips, Peter C.B. (1996), “Econometric Model Determination,” Econometrica 64, 763-812.

Robertson, John and Ellis Tallman (2001), “Improving Federal-Funds Rate Forecasts in VAR Models Used for Policy Analysis,” Journal of Business and Economic Statistics 19, 324-330.

Rogoff, Kenneth (2003), “Globalization and Global Disinflation,” in Monetary Policy and Uncertainty: Adapting to a Changing Economy, Federal Reserve Bank of Kansas City.

Romer, Christina D. and David H. Romer (2000), “Federal Reserve Information and the Behavior of Interest Rates,” American Economic Review 90, 429-457.

Rudebusch, Glenn D. (2005), “Assessing the Lucas Critique in Monetary Policy Models,” Journal of Money, Credit, and Banking 37, 245-272.

Rudebusch, Glenn D. and Lars E.O. Svensson (1999), “Policy Rules for Inflation Targeting,” in J. Taylor, ed., Monetary Policy Rules, University of Chicago Press: Chicago, 203-246.

Sims, Christopher A. (1980), “Macroeconomics and Reality,” Econometrica 48, 1-48.

Sims, Christopher A. (2002), “The Role of Models and Probabilities in the Monetary Policy Process,” Brookings Papers on Economic Activity 2, 1-40.

Smith, Jeremy and Kenneth F. Wallis (2005), “Combining Point Forecasts: The Simple Average Rules, OK?” manuscript, University of Warwick.

Stock, James H. and Mark W. Watson (1996), “Evidence on Structural Stability in Macroeconomic Time Series Relations,” Journal of Business and Economic Statistics 14, 11-30.

Stock, James H. and Mark W. Watson (1999), “A Dynamic Factor Model Framework for Forecast Combination,” Spanish Economic Review 1, 91-121.

Stock, James H. and Mark W. Watson (2003), “Forecasting Output and Inflation: The Role of Asset Prices,” Journal of Economic Literature 41, 788-829.

Stock, James H. and Mark W. Watson (2004), “Combination Forecasts of Output Growth in a Seven-Country Data Set,” Journal of Forecasting 23, 405-430.

Stock, James H. and Mark W. Watson (2006), “Why Has U.S. Inflation Become Harder to Forecast?” Journal of Money, Credit, and Banking, forthcoming.

Timmermann, Allan (2006), “Forecast Combinations,” Handbook of Forecasting, G. Elliott, C.W.J. Granger, and A. Timmermann, eds., North Holland.

Tulip, Peter (2005), “Has Output Become More Predictable? Changes in Greenbook Forecast Accuracy,” Federal Reserve Board FEDS Working Paper No. 2005-31.

Webb, Roy H. (1995), “Forecasts of Inflation from VAR Models,” Journal of Forecasting 14, 267-285.

White, Halbert (2000), “A Reality Check for Data Snooping,” Econometrica 68, 1097-1126.

Table 1: VAR forecasting methods

| method | details |
| --- | --- |
| VAR(4) | VAR in y, π, i with fixed lag of 4 |
| VAR(2) | same as above with fixed lag of 2 |
| VAR(AIC) | VAR with system lag determined at each t by AIC |
| VAR(BIC) | VAR with system lag determined at each t by BIC |
| VAR(AIC, by eq.&var.) | VAR in y, π, i allowing different, AIC-chosen lags for each variable in each equation |
| VAR(BIC, by eq.&var.) | same as above, with BIC-determined lags |
| DVAR(4) | VAR in y, ∆π, ∆i with fixed lag of 4 |
| DVAR(2) | same as above with fixed lag of 2 |
| DVAR(AIC) | VAR in y, ∆π, ∆i with system lag determined at each t by AIC |
| DVAR(BIC) | VAR in y, ∆π, ∆i with system lag determined at each t by BIC |
| DVAR(AIC, by eq.&var.) | VAR in y, ∆π, ∆i allowing different, AIC-chosen lags for each variable in each equation |
| DVAR(BIC, by eq.&var.) | same as above, with BIC-determined lags |
| BVAR(4) | VAR(4) in y, π, i, est. with Minnesota priors, using λ1=.2, λ2=.5, λ3=1, λ4=1000 |
| BDVAR(4) | VAR(4) in y, ∆π, ∆i, est. with Minnesota priors, using λ1=.2, λ2=.5, λ3=1, λ4=1000 |
| VAR(4), rolling | VAR in y, π, i with fixed lag of 4, estimated with a rolling sample |
| VAR(2), rolling | same as above with fixed lag of 2 |
| VAR(AIC), rolling | same as above with AIC-determined lag |
| VAR(BIC), rolling | same as above with BIC-determined lag |
| VAR(AIC, by eq.&var.), rolling | same as above with AIC-determined lags for each var. in each eq. |
| VAR(BIC, by eq.&var.), rolling | same as above with BIC-determined lags for each var. in each eq. |
| DVAR(4), rolling | VAR in y, ∆π, ∆i with fixed lag of 4, estimated with a rolling sample |
| DVAR(2), rolling | same as above with fixed lag of 2 |
| DVAR(AIC), rolling | same as above with AIC-determined lag |
| DVAR(BIC), rolling | same as above with BIC-determined lag |
| DVAR(AIC, by eq.&var.), rolling | same as above with AIC-determined lags for each var. in each eq. |
| DVAR(BIC, by eq.&var.), rolling | same as above with BIC-determined lags for each var. in each eq. |
| BVAR(4), rolling | BVAR(4) in y, π, i with λ1=.2, λ2=.5, λ3=1, λ4=1000, est. with a rolling sample |
| BDVAR(4), rolling | BVAR(4) in y, ∆π, ∆i with λ1=.2, λ2=.5, λ3=1, λ4=1000, est. with a rolling sample |
| DLS, VAR(4) | VAR(4) in y, π, i, est. by DLS, using dis. rates of .01 for y eq. and .05 for π and i eq. |
| DLS, VAR(2) | same as above with fixed lag of 2 |
| DLS, VAR(AIC) | same as above with lag determined from AIC applied to OLS estimates of system |
| DLS, DVAR(4) | VAR(4) in y, ∆π, ∆i, est. by DLS using dis. rates of .01 for y eq. and .05 for ∆π and ∆i eq. |
| DLS, DVAR(2) | same as above with fixed lag of 2 |
| DLS, DVAR(AIC) | same as above with lag determined from AIC applied to OLS estimates of system |
| VAR(AIC), AIC intercept breaks | VAR(AIC lags) in y, π, i, with intercept breaks (up to 2) chosen to minimize the AIC |
| VAR(AIC), BIC intercept breaks | same as above, using the BIC to determine the number of intercept breaks |
| VAR(4), intercept correction | VAR(4) forecasts adjusted by the average value of the last four OLS residuals |
| VAR(AIC), intercept correction | VAR(AIC lag) forecasts adjusted by the average value of the last four OLS residuals |
| VAR(4), inflation detrending | VAR(4) in y, $\pi - \pi^*_{-1}$, and $i - \pi^*_{-1}$, where $\pi^* = \pi^*_{-1} + .05(\pi - \pi^*_{-1})$ |
| VAR(2), inflation detrending | same as above with fixed lag of 2 |
| VAR(AIC), inflation detrending | same as above with AIC-determined lag for the system in y, $\pi - \pi^*_{-1}$, and $i - \pi^*_{-1}$ |
| VAR(BIC), inflation detrending | same as above with BIC-determined lag for the system in y, $\pi - \pi^*_{-1}$, and $i - \pi^*_{-1}$ |
| BVAR(4), inflation detrending | BVAR(4) in y, $\pi - \pi^*_{-1}$, and $i - \pi^*_{-1}$, using λ1=.2, λ2=.5, λ3=1, λ4=1000 |
| BVAR(4) with TVP | TVP BVAR(4) in y, π, i with λ1=.2, λ2=.5, λ3=1, λ4=.1, λ=.0005 |
| BVAR(4) with TVP, λ4=.5, λ=.0025 | TVP BVAR(4) in y, π, i with λ1=.2, λ2=.5, λ3=1, λ4=.5, λ=.0025 |
| BVAR(4) with TVP, λ4=1000, λ=.005 | TVP BVAR(4) in y, π, i with λ1=.2, λ2=.5, λ3=1, λ4=1000, λ=.005 |
| BVAR(4) with TVP, λ4=1000, λ=.0001 | TVP BVAR(4) in y, π, i with λ1=.2, λ2=.5, λ3=1, λ4=1000, λ=.0001 |
| BVAR(4) with intercept TVP | BVAR(4) in y, π, i, TVP in intercepts, λ1=.2, λ2=.5, λ3=1, λ4=.1, λ=.0005 |
| BVAR(4) with intercept TVP, λ4=.5, λ=.0025 | BVAR(4) in y, π, i, TVP in intercepts, λ1=.2, λ2=.5, λ3=1, λ4=.5, λ=.0025 |
| univariate | AR(2) for y, rolling MA(1) for ∆π, rolling MA(1) for ∆i |

Notes:

1. The variables y, π, and i refer to, respectively, output (GDP growth, the HPS gap, or the HP gap), inflation (GDP or CPI inflation), and the 3-month T-bill rate.

2. Unless otherwise noted, all models are estimated recursively, using all data (starting in 1955 or later) available up to the forecasting date. The rolling estimates of the univariate models for ∆π and ∆i use 40 observations. The rolling estimates of the VAR models use 60 observations.

3. The AIC and BIC lag orders range from 0 (the minimum allowed) to 4 (the maximum allowed).

4. The intercept correction approach takes the form of equation (40) in Clements and Hendry (1996).

5. In BVAR estimates, prior variances take the “Minnesota” style described in Litterman (1986). The prior variances are determined by hyperparameters λ1 (general tightness), λ2 (tightness of lags of other variables compared to lags of the dependent variable), λ3 (tightness of longer lags compared to shorter lags), and λ4 (tightness of intercept). The prior standard deviation of the coefficient on lag k of variable j in equation j is set to $\lambda_1 / k^{\lambda_3}$. The prior standard deviation of the coefficient on lag k of variable m in equation j is $\frac{\lambda_1 \lambda_2}{k^{\lambda_3}} \frac{\sigma_j}{\sigma_m}$, where $\sigma_j$ and $\sigma_m$ denote the residual standard deviations of univariate autoregressions estimated for variables j and m. The prior standard deviation of the intercept in equation j is set to $\lambda_4 \sigma_j$. In fixed parameter BVARs, we use generally conventional hyperparameter settings of λ1=.2, λ2=.5, λ3=1, and λ4=1000. The prior means for all coefficients are generally set at 0, with the following exceptions: (a) prior means for own first lags of π and i are set at 1 (in models with levels of inflation and interest rates); (b) prior means for own first lags of y are set at 0.8 in models with an output gap; and (c) prior means for the intercept of GDP growth equations are set to the historical average of growth in BVAR estimates that impose informative priors (λ4 = .1 or .5) on the constant term.

6. The time variation in the coefficients of the TVP BVARs takes a random walk form. The variance matrix of the coefficient innovations is set to λ times the Minnesota prior variance matrix. In time-varying BVARs with flat priors on the intercepts (λ4 = 1000), the variation of the innovation in the intercept is set at λ times the prior variance of the coefficient on the own first lag instead of the prior variance of the constant.
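The inflation trend used in the detrended VARs and the Minnesota prior standard deviations in note 5 are both simple closed-form rules, so a short illustration may help. The following is a minimal Python sketch, not the authors' code: the function names are hypothetical, and the choice to start the trend at the first observation is an assumption, since the table does not state the initial condition.

```python
def inflation_trend(pi, gain=0.05, start=None):
    """Exponentially smoothed trend: trend_t = trend_{t-1} + gain * (pi_t - trend_{t-1})."""
    # Assumption: the trend starts at the first observation unless `start` is given.
    prev = pi[0] if start is None else start
    trend = []
    for value in pi:
        prev = prev + gain * (value - prev)
        trend.append(prev)
    return trend


def detrend(series, trend):
    """Return series_t - trend_{t-1}; the first observation has no lagged trend and is dropped."""
    return [series[t] - trend[t - 1] for t in range(1, len(series))]


def minnesota_prior_sd(lag, own_variable, sigma_j=1.0, sigma_m=1.0,
                       lam1=0.2, lam2=0.5, lam3=1.0):
    """Prior standard deviation of the coefficient on `lag` of variable m in equation j."""
    tightness = lam1 / lag ** lam3          # own-lag case: lam1 / k^lam3
    if own_variable:
        return tightness
    # Cross-variable case: (lam1 * lam2 / k^lam3) * (sigma_j / sigma_m)
    return tightness * lam2 * sigma_j / sigma_m
```

With the conventional settings in note 5, `minnesota_prior_sd(2, True)` gives the own-lag standard deviation at lag 2, and passing `own_variable=False` applies the additional λ2 shrinkage and the scale ratio.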

Table 2: Forecast averaging methods

| method | details |
| --- | --- |
| avg. of VAR(4), univariate | average of forecasts from univariate model and VAR(4) in y, π, and i |
| avg. of DVAR(4), univariate | average of forecasts from univariate model and VAR(4) in y, ∆π, and ∆i |
| avg. of infl.detr. VAR(4), univariate | average of forecasts from univariate model and VAR(4) in y, $\pi - \pi^*_{-1}$, and $i - \pi^*_{-1}$ |
| avg. of VAR(4), rolling VAR(4) | average of forecasts from recursive and rolling estimates of VAR(4) in y, π, and i |
| average of all forecasts | simple average of forecasts from models listed in Table 1 |
| median | median of model forecasts |
| trimmed mean, 10% | average of model forecasts, excluding 3 highest and 3 lowest |
| trimmed mean, 20% | average of model forecasts, excluding 5 highest and 5 lowest |
| ridge: recursive, .001 | combination of model forecasts, est. with ridge regression (1), k = .001 |
| ridge: recursive, .25 | same as above, using k = .25 |
| ridge: recursive, 1. | same as above, using k = 1 |
| ridge: 10y rolling, .001 | same as above, using k = .001 and a rolling window of 40 forecasts |
| ridge: 10y rolling, .25 | same as above, using k = .25 and a rolling window of 40 forecasts |
| ridge: 10y rolling, 1. | same as above, using k = 1 and a rolling window of 40 forecasts |
| factor, recursive | forecast from regression on common factor in model forecasts |
| factor, 10y rolling | same as above, using rolling window of 40 forecasts |
| MSE weighting, recursive | inverse MSE-weighted average of model forecasts |
| MSE weighting, 10y rolling | same as above, using a rolling window of 40 forecasts |
| MSE weighting, 5y rolling | same as above, using a rolling window of 20 forecasts |
| MSE weighting, discounted | inverse discounted MSE-weighted average of model forecasts, with discount rate of .95 |
| PLS, recursive | forecast from model with lowest historical MSE |
| PLS, 10y rolling | same as above, using a rolling window of 40 forecasts |
| PLS, 5y rolling | same as above, using a rolling window of 20 forecasts |
| best quartile, recursive | simple average of model forecasts in the top quartile of historical (MSE) accuracy |
| best quartile, 10y rolling | same as above, using a rolling window of 40 forecasts |
| best quartile, 5y rolling | same as above, using a rolling window of 20 forecasts |
| OLS comb. of quartiles, recursive | forecast from (OLS) regression on the avg. forecasts from the 1st and 2nd quartiles |
| OLS comb. of quartiles, 10y rolling | same as above, using a rolling window of 40 forecasts |
| OLS comb. of quartiles, 5y rolling | same as above, using a rolling window of 20 forecasts |
| BMA: AIC | BMA of model forecasts, using AIC as measure of marginal likelihood |
| BMA: BIC | BMA of model forecasts, using BIC as measure of marginal likelihood |
| BMA: PIC | BMA of model forecasts, using Phillips’ (1996) PIC as measure of marginal likelihood |

Notes:

1. All averages are based on the 50 forecast models listed in Table 1, for a given combination of measures of output, inflation, and the short-term interest rate.

2. See the notes to Table 1.
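Several of the simpler combination schemes in Table 2 can be stated in a few lines. The sketch below is illustrative Python rather than the authors' implementation: the function names are hypothetical, the discounted MSE is assumed to weight the squared error from j periods ago by `discount ** j`, and the size of the top quartile is assumed to be the number of models divided by four, rounded down.

```python
from statistics import mean, median


def trimmed_mean(forecasts, n_trim):
    """Average of model forecasts after dropping the n_trim highest and n_trim lowest."""
    ordered = sorted(forecasts)
    return mean(ordered[n_trim:len(ordered) - n_trim])


def inverse_mse_weights(past_errors, discount=1.0):
    """Weights proportional to 1 / (discounted) MSE; past_errors[m] lists model m's errors, oldest first."""
    inverse = []
    for errors in past_errors:
        last = len(errors) - 1
        # Assumption: the most recent squared error gets weight 1, older ones discount ** age.
        dmse = sum(discount ** (last - t) * e * e for t, e in enumerate(errors))
        inverse.append(1.0 / dmse)   # fails if a model has all-zero errors; ignored in this sketch
    total = sum(inverse)
    return [w / total for w in inverse]


def combine(forecasts, weights):
    """Weighted average of model forecasts."""
    return sum(w * f for w, f in zip(weights, forecasts))


def best_quartile_average(forecasts, past_errors):
    """Simple average of forecasts from the models in the top quartile of historical MSE accuracy."""
    mses = [mean(e * e for e in errors) for errors in past_errors]
    ranked = sorted(range(len(forecasts)), key=lambda m: mses[m])
    n_top = max(1, len(forecasts) // 4)  # assumption: quartile size rounds down
    return mean(forecasts[m] for m in ranked[:n_top])
```

With the 50 models of Table 1, `trimmed_mean(forecasts, 3)` and `trimmed_mean(forecasts, 5)` correspond to the 10% and 20% trimmed means as the table defines them, `median(forecasts)` gives the median combination, and `inverse_mse_weights(past_errors, discount=0.95)` corresponds to the discounted variant under the stated assumption. Rolling versions would pass only the last 40 or 20 errors per model.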

Table 3: Real-time RMSE results for GDP growth and GDP inflation (RMSEs in first row, RMSE ratios in all others)

| forecast method | GDP growth, 1970-84, h=0Q | GDP growth, 1970-84, h=1Q | GDP growth, 1970-84, h=1Y | GDP growth, 1985-2005, h=0Q | GDP growth, 1985-2005, h=1Q | GDP growth, 1985-2005, h=1Y | GDP inflation, 1970-84, h=0Q | GDP inflation, 1970-84, h=1Q | GDP inflation, 1970-84, h=1Y | GDP inflation, 1985-2005, h=0Q | GDP inflation, 1985-2005, h=1Q | GDP inflation, 1985-2005, h=1Y |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| univariate | 4.550 | 5.023 | 3.633 | 1.755 | 1.826 | 1.367 | 1.911 | 2.242 | 2.466 | .989 | 1.052 | .743 |
| VAR(4) | 1.063 | .949 | .940 | 1.115 | 1.128 | 1.051 | .994 | 1.014 | 1.066 | 1.000 | .944 | .936 |
| DVAR(4) | 1.035 | .928 | .761 | 1.219 | 1.252 | 1.086 | .998 | .941 | .901 | .996 | .956 | 1.011 |
| BVAR(4) | .958 | .921 | .975 | 1.030 | 1.034 | .979 | .956 | 1.049 | 1.096 | 1.016 | 1.012 | 1.146 |
| BVAR(4) with TVP | .957 | .930 | .970 | .995 | .987 | .919 | .963 | 1.053 | 1.106 | 1.003 | .977 | 1.003 |
| BVAR(4), inflation detrending | .883 | .830 | .811 | 1.067 | 1.060 | .937 | 1.001 | 1.063 | 1.073 | 1.010 | 1.027 | 1.287 |
| avg. of VAR(4), univariate | .986 | .922 | .893 | 1.024 | 1.021 | .956 | .956 | .976 | .999 | .981 | .957 | .931 |
| avg. of DVAR(4), univariate | .955 | .889 | .795 | 1.071 | 1.080 | .996 | .965 | .952 | .934 | .979 | .961 | .969 |
| avg. of infl.detr. VAR(4), univariate | .958 | .894 | .856 | 1.017 | 1.016 | .918 | .961 | .972 | .975 | .986 | .967 | .970 |
| avg. of VAR(4), rolling VAR(4) | 1.111 | 1.005 | .979 | 1.110 | 1.155 | 1.134 | .983 | 1.029 | 1.064 | 1.040 | .997 | 1.123 |
| average of all forecasts | .944 | .877 | .836 | 1.049 | 1.060 | .962 | .933 | .992 | .979 | 1.023 | .998 | 1.036 |
| median | .937 | .879 | .867 | 1.042 | 1.076 | .991 | .954 | 1.012 | 1.005 | 1.017 | .993 | 1.016 |
| trimmed mean, 10% | .946 | .881 | .844 | 1.050 | 1.061 | .967 | .940 | .994 | .985 | 1.022 | .993 | 1.027 |
| trimmed mean, 20% | .948 | .884 | .848 | 1.050 | 1.062 | .972 | .943 | .996 | .989 | 1.021 | .992 | 1.023 |
| ridge: recursive, .001 | 2.169 | 1.590 | 1.589 | 1.295 | 1.353 | 1.588 | 1.254 | 1.532 | 1.867 | 1.147 | 1.101 | 1.755 |
| ridge: recursive, .25 | .957 | .876 | .773 | 1.077 | 1.132 | 1.015 | .968 | 1.044 | 1.091 | 1.020 | .988 | 1.039 |
| ridge: recursive, 1. | .939 | .859 | .773 | 1.073 | 1.125 | .994 | .963 | 1.039 | 1.087 | 1.018 | .984 | .981 |
| ridge: 10y rolling, .001 | 2.332 | 1.871 | 1.693 | 1.563 | 1.762 | 1.722 | 1.219 | 1.517 | 1.889 | 1.177 | 1.174 | 1.713 |
| ridge: 10y rolling, .25 | .968 | .881 | .791 | 1.080 | 1.126 | 1.081 | .967 | 1.036 | 1.077 | 1.027 | .979 | 1.050 |
| ridge: 10y rolling, 1. | .944 | .861 | .784 | 1.077 | 1.117 | 1.021 | .962 | 1.032 | 1.075 | 1.023 | .983 | .997 |
| factor, recursive | .987 | .930 | .929 | 1.095 | 1.111 | 1.021 | 1.006 | 1.087 | .992 | 1.031 | 1.073 | 1.536 |
| factor, 10y rolling | .993 | .936 | .935 | 1.112 | 1.108 | 1.093 | 1.021 | 1.124 | 1.039 | 1.018 | 1.011 | 1.419 |
| MSE weighting, recursive | .943 | .876 | .824 | 1.046 | 1.058 | .957 | .935 | .992 | .984 | 1.022 | .996 | 1.028 |
| MSE weighting, 10y rolling | .943 | .877 | .825 | 1.046 | 1.057 | .952 | .935 | .992 | .984 | 1.022 | .994 | 1.029 |
| MSE weighting, 5y rolling | .941 | .875 | .831 | 1.044 | 1.056 | .957 | .934 | .989 | .974 | 1.023 | .998 | 1.026 |
| MSE weighting, discounted | .943 | .877 | .829 | 1.045 | 1.057 | .958 | .934 | .990 | .981 | 1.022 | .995 | 1.023 |
| PLS, recursive | .973 | .915 | .806 | 1.106 | 1.094 | 1.108 | .993 | 1.071 | 1.203 | .980 | .978 | 1.011 |
| PLS, 10y rolling | .973 | .915 | .810 | 1.092 | 1.151 | 1.040 | 1.043 | 1.048 | 1.253 | 1.036 | .963 | 1.092 |
| PLS, 5y rolling | 1.023 | .933 | .828 | 1.103 | 1.121 | 1.074 | 1.052 | 1.086 | 1.301 | 1.113 | 1.009 | 1.132 |
| best quartile, recursive | .943 | .879 | .778 | 1.028 | 1.063 | .940 | .946 | .994 | 1.002 | 1.015 | .982 | 1.059 |
| best quartile, 10y rolling | .951 | .884 | .784 | 1.046 | 1.061 | .962 | .945 | .988 | 1.011 | 1.011 | .977 | 1.019 |
| best quartile, 5y rolling | .935 | .878 | .823 | 1.042 | 1.088 | 1.020 | .938 | .977 | .970 | 1.030 | .984 | 1.018 |
| OLS comb. of quartiles, recursive | .978 | .917 | .910 | 1.169 | 1.150 | .940 | 1.004 | 1.039 | 1.095 | 1.034 | .992 | 1.396 |
| OLS comb. of quartiles, 10y rolling | .976 | .928 | .927 | 1.090 | 1.114 | 1.026 | 1.022 | 1.081 | 1.163 | 1.020 | .939 | 1.328 |
| OLS comb. of quartiles, 5y rolling | .968 | .914 | .951 | 1.268 | 1.413 | 1.557 | 1.014 | 1.135 | 1.370 | 1.083 | 1.116 | 1.025 |
| BMA: AIC | 1.008 | .966 | .901 | 1.120 | 1.106 | .978 | .991 | 1.023 | 1.053 | 1.113 | 1.126 | 1.502 |
| BMA: BIC | .946 | .909 | .964 | 1.047 | 1.038 | .898 | .972 | 1.004 | .991 | 1.014 | 1.004 | 1.005 |
| BMA: PIC | .902 | .837 | .849 | 1.100 | 1.087 | .926 | .962 | 1.046 | 1.075 | 1.080 | 1.082 | 1.128 |

Notes:

1. The entries in the first row are RMSEs, for variables defined in annualized percentage points. All other entries are RMSE ratios, for the indicated specification relative to the corresponding univariate specification.

2. Individual RMSE ratios that are significantly below 1 according to bootstrap p-values are indicated by a slanted font. In each column, if a forecast is significantly better (in MSE) than the benchmark according to data snooping-robust p-values (bootstrapped as in Hansen (2005)), the associated RMSE ratio appears in a bold font.

3. In each quarter t from 1970:Q1 through 2005:Q4, vintage t data (which generally end in t − 1) are used to form forecasts for periods t (h = 0Q), t+1 (h = 1Q), and t+4 (h = 1Y). The forecasts of GDP growth and inflation for the h = 1Y horizon correspond to annual percent changes: average growth and average inflation from t+1 through t+4.

4. Tables 1 and 2 provide further detail on the forecast methods.
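As a reading aid for Tables 3 through 6, the entries below the first row are ratios of a method's RMSE to the RMSE of the univariate benchmark, so values below 1 indicate a more accurate forecast than the benchmark. A minimal Python sketch of that calculation follows; the function names are hypothetical and the inputs are plain lists of forecasts and realized values over one evaluation sample.

```python
from math import sqrt


def rmse(forecasts, actuals):
    """Root mean squared error of a sequence of forecasts."""
    errors = [f - a for f, a in zip(forecasts, actuals)]
    return sqrt(sum(e * e for e in errors) / len(errors))


def rmse_ratio(model_forecasts, benchmark_forecasts, actuals):
    """RMSE of the indicated method relative to the benchmark; below 1 favours the method."""
    return rmse(model_forecasts, actuals) / rmse(benchmark_forecasts, actuals)
```

This covers only the point estimates; the bootstrap p-values mentioned in note 2 are outside the scope of the sketch.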

Table 4: Real-time RMSE results for GDP growth and CPI inflation (RMSEs in first row, RMSE ratios in all others)

| forecast method | GDP growth, 1970-84, h=0Q | GDP growth, 1970-84, h=1Q | GDP growth, 1970-84, h=1Y | GDP growth, 1985-2005, h=0Q | GDP growth, 1985-2005, h=1Q | GDP growth, 1985-2005, h=1Y | CPI inflation, 1970-84, h=0Q | CPI inflation, 1970-84, h=1Q | CPI inflation, 1970-84, h=1Y | CPI inflation, 1985-2005, h=0Q | CPI inflation, 1985-2005, h=1Q | CPI inflation, 1985-2005, h=1Y |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| univariate | 4.550 | 5.023 | 3.633 | 1.755 | 1.826 | 1.367 | 2.117 | 2.733 | 2.970 | 1.340 | 1.460 | 1.254 |
| VAR(4) | 1.072 | .976 | .934 | 1.110 | 1.119 | 1.064 | .857 | .949 | .993 | .986 | 1.037 | 1.077 |
| DVAR(4) | 1.079 | .957 | .768 | 1.211 | 1.232 | 1.099 | .847 | .888 | .854 | .963 | 1.012 | 1.095 |
| BVAR(4) | .955 | .914 | .937 | 1.027 | 1.027 | .965 | .925 | 1.033 | 1.106 | .997 | .987 | 1.001 |
| BVAR(4) with TVP | .953 | .925 | .943 | .993 | .991 | .935 | .914 | 1.015 | 1.086 | .986 | .966 | .936 |
| BVAR(4), inflation detrending | .869 | .810 | .777 | 1.152 | 1.151 | 1.087 | .944 | 1.013 | 1.033 | .975 | .986 | 1.061 |
| avg. of VAR(4), univariate | .993 | .934 | .875 | 1.022 | 1.016 | .964 | .863 | .912 | .916 | .965 | .993 | .996 |
| avg. of DVAR(4), univariate | .980 | .915 | .806 | 1.068 | 1.071 | 1.003 | .862 | .898 | .894 | .951 | .983 | 1.013 |
| avg. of infl.detr. VAR(4), univariate | .960 | .895 | .818 | 1.038 | 1.038 | .983 | .857 | .895 | .863 | .968 | .999 | 1.019 |
| avg. of VAR(4), rolling VAR(4) | 1.106 | 1.025 | .972 | 1.162 | 1.174 | 1.137 | .859 | .979 | 1.035 | 1.021 | 1.083 | 1.143 |
| average of all forecasts | .947 | .903 | .839 | 1.070 | 1.076 | .997 | .826 | .927 | .950 | .984 | 1.012 | 1.066 |
| median | .945 | .895 | .871 | 1.059 | 1.074 | 1.018 | .863 | .914 | .940 | .983 | 1.000 | 1.039 |
| trimmed mean, 10% | .947 | .904 | .850 | 1.070 | 1.075 | .999 | .834 | .925 | .945 | .981 | 1.011 | 1.053 |
| trimmed mean, 20% | .947 | .905 | .855 | 1.070 | 1.075 | 1.002 | .840 | .923 | .944 | .979 | 1.007 | 1.048 |
| ridge: recursive, .001 | 1.888 | 1.337 | 1.335 | 1.229 | 1.477 | 1.301 | 1.103 | 1.374 | 2.013 | 1.113 | 1.353 | 1.458 |
| ridge: recursive, .25 | .951 | .852 | .780 | 1.055 | 1.096 | .977 | .845 | .971 | 1.081 | .986 | .997 | 1.071 |
| ridge: recursive, 1. | .946 | .871 | .799 | 1.059 | 1.085 | .969 | .845 | .966 | 1.074 | .978 | .992 | 1.012 |
| ridge: 10y rolling, .001 | 1.923 | 1.331 | 1.422 | 1.311 | 1.665 | 1.483 | 1.056 | 1.253 | 1.929 | 1.195 | 1.593 | 1.659 |
| ridge: 10y rolling, .25 | .965 | .862 | .796 | 1.028 | 1.090 | 1.047 | .840 | .966 | 1.067 | .990 | .999 | 1.014 |
| ridge: 10y rolling, 1. | .949 | .874 | .803 | 1.053 | 1.093 | 1.013 | .840 | .959 | 1.063 | .982 | .993 | .964 |
| factor, recursive | 1.003 | .981 | .913 | 1.089 | 1.093 | 1.023 | .833 | .945 | .951 | .991 | 1.061 | 1.249 |
| factor, 10y rolling | 1.011 | .991 | .925 | 1.101 | 1.093 | 1.087 | .851 | .977 | .989 | .957 | 1.005 | 1.134 |
| MSE weighting, recursive | .947 | .901 | .830 | 1.065 | 1.073 | .995 | .830 | .929 | .951 | .982 | 1.007 | 1.055 |
| MSE weighting, 10y rolling | .948 | .902 | .831 | 1.064 | 1.074 | .992 | .830 | .929 | .950 | .982 | 1.007 | 1.055 |
| MSE weighting, 5y rolling | .946 | .901 | .845 | 1.060 | 1.073 | .991 | .828 | .928 | .949 | .980 | 1.005 | 1.050 |
| MSE weighting, discounted | .948 | .903 | .836 | 1.064 | 1.073 | .998 | .828 | .930 | .953 | .982 | 1.006 | 1.051 |
| PLS, recursive | .902 | .850 | .766 | 1.191 | 1.210 | 1.092 | .894 | 1.081 | .908 | 1.005 | 1.016 | 1.183 |
| PLS, 10y rolling | .902 | .842 | .766 | 1.141 | 1.202 | 1.092 | .859 | 1.077 | .910 | 1.005 | 1.037 | 1.078 |
| PLS, 5y rolling | .884 | .849 | .924 | 1.131 | 1.202 | 1.113 | .923 | 1.106 | .901 | 1.020 | 1.049 | 1.066 |
| best quartile, recursive | .938 | .898 | .773 | 1.038 | 1.069 | .995 | .868 | .933 | .922 | .974 | 1.010 | 1.079 |
| best quartile, 10y rolling | .943 | .896 | .780 | 1.043 | 1.064 | 1.017 | .867 | .941 | .928 | .976 | 1.004 | 1.078 |
| best quartile, 5y rolling | .966 | .907 | .816 | 1.058 | 1.085 | 1.013 | .838 | .938 | .961 | .975 | 1.014 | 1.036 |
| OLS comb. of quartiles, recursive | .949 | .942 | .778 | 1.186 | 1.275 | .904 | .895 | .960 | .945 | .968 | 1.019 | 1.071 |
| OLS comb. of quartiles, 10y rolling | .963 | .955 | .814 | 1.041 | 1.125 | 1.074 | .876 | 1.032 | .970 | .964 | 1.026 | 1.125 |
| OLS comb. of quartiles, 5y rolling | 1.094 | .966 | .893 | 1.309 | 1.350 | 1.709 | .779 | 1.059 | 2.331 | .995 | 1.058 | 1.110 |
| BMA: AIC | .921 | .879 | .914 | 1.154 | 1.190 | 1.147 | .898 | 1.014 | 1.056 | .991 | 1.057 | 1.154 |
| BMA: BIC | .959 | .867 | .887 | 1.055 | 1.077 | 1.024 | .878 | .962 | 1.003 | .978 | 1.027 | 1.103 |
| BMA: PIC | .873 | .796 | .812 | 1.120 | 1.148 | 1.132 | .910 | 1.076 | 1.157 | 1.007 | 1.010 | 1.030 |

Notes:

1. The variables in each multivariate model are GDP growth, CPI inflation, and the T-bill rate.

2. See the notes to Table 3.

Table 5: Real-time RMSE results for 2-year ahead forecasts of GDP growth and inflation (RMSEs in first row, RMSE ratios in all others)

The first four columns of figures are from models with GDP growth and GDP inflation; the last four are from models with GDP growth and CPI inflation.

| forecast method | GDP growth, 1970-84 | GDP growth, 1985-2005 | GDP inflation, 1970-84 | GDP inflation, 1985-2005 | GDP growth, 1970-84 | GDP growth, 1985-2005 | CPI inflation, 1970-84 | CPI inflation, 1985-2005 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| univariate | 3.679 | 1.390 | 3.677 | .981 | 3.679 | 1.390 | 4.635 | 1.452 |
| VAR(4) | 1.268 | 1.010 | 1.145 | 1.054 | 1.081 | .997 | 1.132 | 1.246 |
| DVAR(4) | 1.042 | .975 | .900 | 1.099 | 1.045 | .991 | .882 | 1.095 |
| BVAR(4) | 1.238 | .968 | 1.175 | 1.351 | 1.188 | .924 | 1.178 | 1.038 |
| BVAR(4) with TVP | 1.170 | .954 | 1.237 | 1.065 | 1.134 | .952 | 1.215 | .894 |
| BVAR(4), inflation detrending | 1.077 | .918 | 1.015 | 1.580 | 1.030 | .984 | .972 | 1.212 |
| avg. of VAR(4), univariate | 1.090 | .971 | 1.044 | .985 | .998 | .967 | .983 | 1.048 |
| avg. of DVAR(4), univariate | 1.004 | .984 | .941 | 1.033 | 1.005 | .992 | .926 | 1.006 |
| avg. of infl.detr. VAR(4), univariate | 1.050 | .947 | .966 | 1.075 | .965 | 1.001 | .863 | 1.020 |
| avg. of VAR(4), rolling VAR(4) | 1.299 | 1.048 | 1.138 | 1.331 | 1.162 | 1.032 | 1.179 | 1.299 |
| average of all forecasts | 1.095 | .964 | 1.040 | 1.134 | 1.055 | .998 | 1.053 | 1.057 |
| median | 1.120 | .966 | 1.040 | 1.114 | 1.066 | .995 | 1.012 | 1.012 |
| trimmed mean, 10% | 1.091 | .962 | 1.040 | 1.126 | 1.046 | .993 | 1.030 | 1.035 |
| trimmed mean, 20% | 1.091 | .962 | 1.039 | 1.123 | 1.044 | .991 | 1.021 | 1.031 |
| ridge: recursive, .001 | 2.029 | 2.517 | 1.772 | 3.027 | 2.150 | 1.546 | 3.553 | 4.169 |
| ridge: recursive, .25 | 1.189 | 1.235 | 1.175 | 1.247 | 1.378 | 1.079 | 1.539 | 1.135 |
| ridge: recursive, 1. | 1.064 | 1.146 | 1.114 | 1.060 | 1.143 | 1.040 | 1.159 | .985 |
| ridge: 10y rolling, .001 | 1.969 | 2.334 | 1.744 | 3.149 | 1.945 | 1.973 | 2.785 | 3.426 |
| ridge: 10y rolling, .25 | 1.218 | 1.149 | 1.161 | 1.243 | 1.405 | 1.025 | 1.518 | 1.121 |
| ridge: 10y rolling, 1. | 1.075 | 1.032 | 1.096 | 1.067 | 1.158 | .993 | 1.141 | .948 |
| factor, recursive | 1.335 | 1.029 | .843 | 2.266 | 1.341 | 1.011 | .856 | 1.723 |
| factor, 10y rolling | 1.338 | 1.047 | .864 | 2.161 | 1.337 | 1.019 | .878 | 1.645 |
| MSE weighting, recursive | 1.067 | .955 | 1.031 | 1.115 | 1.037 | .984 | 1.022 | 1.038 |
| MSE weighting, 10y rolling | 1.065 | .950 | 1.032 | 1.109 | 1.036 | .984 | 1.021 | 1.042 |
| MSE weighting, 5y rolling | 1.057 | .953 | 1.029 | 1.116 | 1.034 | .986 | 1.018 | 1.054 |
| MSE weighting, discounted | 1.072 | .954 | 1.030 | 1.107 | 1.041 | .985 | 1.020 | 1.041 |
| PLS, recursive | 1.119 | .976 | 1.069 | 1.295 | 1.058 | 1.058 | .889 | 1.353 |
| PLS, 10y rolling | 1.118 | 1.006 | 1.086 | 1.266 | 1.053 | 1.031 | .846 | 1.251 |
| PLS, 5y rolling | 1.132 | 1.030 | 1.069 | 1.362 | 1.089 | 1.068 | .819 | 1.314 |
| best quartile, recursive | 1.042 | .991 | .971 | 1.202 | 1.040 | .986 | .942 | 1.085 |
| best quartile, 10y rolling | 1.040 | .955 | .981 | 1.155 | 1.034 | .984 | .940 | 1.128 |
| best quartile, 5y rolling | 1.044 | .953 | .982 | 1.143 | 1.047 | .983 | .928 | 1.115 |
| OLS comb. of quartiles, recursive | 1.654 | 1.142 | .934 | 1.943 | 1.851 | 1.055 | .842 | 1.689 |
| OLS comb. of quartiles, 10 year rolling | 1.667 | 1.053 | .943 | 2.234 | 1.858 | 1.052 | .916 | 1.706 |
| OLS comb. of quartiles, 5 year rolling | 1.839 | 1.562 | 1.369 | 1.863 | 1.883 | 1.662 | 1.205 | 2.329 |
| BMA: AIC | 1.029 | .943 | 1.100 | 1.775 | 1.112 | 1.113 | 1.154 | 1.155 |
| BMA: BIC | 1.194 | .895 | 1.059 | 1.031 | 1.176 | 1.009 | 1.151 | 1.165 |
| BMA: PIC | 1.139 | .881 | 1.105 | 1.164 | 1.186 | 1.030 | 1.282 | 1.082 |

Notes:

1. In the results in the left half of the table, the model variables are GDP growth, GDP inflation, and the T-bill rate. In the results in the right half of the table, the model variables are GDP growth, CPI inflation, and the T-bill rate.

2. In each quarter t from 1970:Q1 through 2005:Q4, vintage t data (which generally end in t − 1) are used to form forecasts for periods t through t+8 (h = 2Y). The forecasts of GDP growth and inflation for the h = 2Y horizon correspond to annual percent changes: average growth and average inflation from t+5 through t+8.

3. See the notes to Table 3.

Table 6: Real-time RMSE results for the T-bill rate, from models with GDP growth and GDP inflation (RMSEs in first row, RMSE ratios in all others)

| forecast method | 1970-84, h=0Q | 1970-84, h=1Q | 1970-84, h=1Y | 1970-84, h=2Y | 1985-2005, h=0Q | 1985-2005, h=1Q | 1985-2005, h=1Y | 1985-2005, h=2Y |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| univariate | 1.305 | 2.098 | 2.821 | 3.678 | .378 | .778 | 1.625 | 2.327 |
| VAR(4) | .940 | .954 | 1.108 | 1.120 | 1.084 | 1.027 | .892 | .852 |
| DVAR(4) | .933 | .917 | .981 | .939 | 1.137 | 1.131 | 1.083 | 1.108 |
| BVAR(4) | .949 | .926 | 1.027 | 1.085 | 1.067 | .986 | .907 | .852 |
| BVAR(4) with TVP | .949 | .933 | 1.054 | 1.187 | 1.078 | 1.006 | .944 | .912 |
| BVAR(4), inflation detrending | .930 | .860 | .908 | .856 | 1.150 | 1.022 | .892 | .777 |
| avg. of VAR(4), univariate | .934 | .930 | 1.030 | 1.036 | .988 | .966 | .925 | .901 |
| avg. of DVAR(4), univariate | .931 | .909 | .966 | .951 | 1.027 | 1.036 | 1.032 | 1.045 |
| avg. of infl.detr. VAR(4), univariate | .924 | .910 | .964 | .908 | .982 | .957 | .908 | .862 |
| avg. of VAR(4), rolling VAR(4) | .967 | .995 | 1.198 | 1.232 | 1.182 | 1.098 | .934 | .897 |
| average of all forecasts | .911 | .927 | 1.018 | 1.048 | 1.034 | .989 | .944 | .928 |
| median | .928 | .942 | .994 | .990 | 1.033 | .989 | .946 | .922 |
| trimmed mean, 10% | .911 | .924 | 1.007 | 1.020 | 1.031 | .986 | .947 | .929 |
| trimmed mean, 20% | .916 | .924 | 1.006 | 1.013 | 1.030 | .986 | .948 | .930 |
| ridge: recursive, .001 | .981 | .821 | 1.821 | 1.227 | 1.114 | 1.188 | 1.151 | 1.493 |
| ridge: recursive, .25 | .955 | .966 | 1.128 | 1.078 | 1.031 | .985 | .936 | .939 |
| ridge: recursive, 1. | .942 | .964 | 1.136 | 1.129 | 1.037 | .991 | .932 | .907 |
| ridge: 10y rolling, .001 | 1.124 | .954 | 1.713 | 1.344 | 1.037 | 1.118 | 1.126 | 1.211 |
| ridge: 10y rolling, .25 | .970 | .988 | 1.168 | 1.078 | 1.028 | .977 | .903 | .925 |
| ridge: 10y rolling, 1. | .952 | .979 | 1.165 | 1.119 | 1.033 | .978 | .889 | .883 |
| factor, recursive | .960 | .957 | 1.182 | 1.210 | 1.139 | 1.129 | 1.108 | 1.085 |
| factor, 10y rolling | .964 | .957 | 1.250 | 1.320 | 1.213 | 1.207 | 1.308 | 1.183 |
| MSE weighting, recursive | .917 | .926 | 1.008 | 1.020 | 1.036 | .990 | .942 | .916 |
| MSE weighting, 10y rolling | .919 | .928 | 1.009 | 1.018 | 1.030 | .988 | .944 | .909 |
| MSE weighting, 5y rolling | .922 | .931 | 1.011 | 1.027 | 1.026 | .986 | .949 | .906 |
| MSE weighting, discounted | .921 | .929 | 1.013 | 1.026 | 1.032 | .989 | .943 | .907 |
| PLS, recursive | .962 | .985 | .980 | .964 | 1.104 | .984 | .911 | .780 |
| PLS, 10y rolling | 1.010 | .955 | 1.028 | .957 | 1.080 | .942 | .925 | .720 |
| PLS, 5y rolling | .998 | 1.057 | 1.310 | .955 | 1.249 | 1.086 | 1.060 | .891 |
| best quartile, recursive | .957 | .940 | .981 | .923 | 1.052 | 1.002 | .950 | .923 |
| best quartile, 10y rolling | .955 | .948 | .993 | .931 | 1.024 | .979 | .944 | .908 |
| best quartile, 5y rolling | .959 | .977 | 1.011 | .939 | 1.062 | 1.021 | .984 | .918 |
| OLS comb. of quartiles, recursive | 1.032 | .993 | 1.292 | 1.263 | 1.137 | 1.168 | 1.110 | 1.095 |
| OLS comb. of quartiles, 10y rolling | 1.089 | 1.080 | 1.404 | 1.255 | 1.232 | 1.205 | 1.444 | 1.248 |
| OLS comb. of quartiles, 5y rolling | 1.110 | 1.134 | 1.255 | 1.310 | 1.362 | 1.537 | 1.883 | 1.671 |
| BMA: AIC | 1.007 | 1.054 | 1.259 | 1.386 | 1.434 | 1.332 | 1.138 | 1.060 |
| BMA: BIC | .958 | 1.003 | 1.038 | 1.032 | 1.316 | 1.194 | .995 | .958 |
| BMA: PIC | 1.042 | 1.053 | 1.139 | 1.227 | 1.332 | 1.250 | 1.067 | 1.001 |

Notes:

1. The model variables are GDP growth, GDP inflation, and the T-bill rate.

2. See the notes to Tables 3 and 5.

Table 7: Average RMSE rankings in real-time forecasts

                                             all all y,π   all y   all π   all i all y,π all y,π all y,π all y,π all y,π all y,π
                                                                                   70-84   85-05    h=0Q    h=1Q    h=1Y    h=2Y
avg. of infl. detr. VAR(4), univar.          6.4     8.0     8.9     7.1     3.3    10.4     5.5    11.3     6.2     7.0     7.5
avg. of VAR(4), univariate                  12.0    13.6    15.8    11.4     8.9    17.6     9.5    14.8    11.4    13.8    14.3
MSE weighting, recursive                    12.0    12.4    11.7    13.1    11.3    11.8    13.0    13.2    12.6    12.2    11.5
MSE weighting, 10y rolling                  12.3    12.8    12.6    12.9    11.3    11.8    13.7    14.0    13.2    12.4    11.5
BVAR(4), inflation detrending               12.5    13.8     8.2    19.4     9.8    12.8    14.8    16.2    14.1    13.8    10.9
best quartile, recursive                    12.6    11.8    10.4    13.1    14.3    11.0    12.6    11.6    11.7    11.8    12.0
avg. of DVAR(4), univariate                 12.7    12.1    14.5     9.6    13.9    11.6    12.5    15.5    10.0    11.0    11.7
MSE weighting, 5y rolling                   12.7    12.5    12.8    12.3    13.1    10.7    14.4    12.8    12.9    12.8    11.6
MSE weighting, discounted                   12.8    13.0    13.3    12.7    12.4    12.5    13.5    13.4    13.6    13.1    11.9
best quartile, 10y rolling                  13.2    12.9    13.2    12.6    13.9    11.8    14.0    14.0    13.0    12.2    12.4
trimmed mean, 20%                           13.8    14.8    15.5    14.0    11.9    15.0    14.6    15.5    14.6    14.7    14.3
trimmed mean, 10%                           13.9    14.8    15.0    14.6    12.2    14.2    15.4    15.9    14.6    14.4    14.3
average of all forecasts                    14.5    15.0    14.8    15.2    13.4    13.4    16.6    15.4    14.8    14.1    15.7
median                                      15.2    15.7    17.0    14.4    14.1    16.3    15.1    16.2    14.9    16.0    15.7
best quartile, 5y rolling                   15.9    14.5    15.8    13.1    18.9    12.3    16.6    14.4    15.1    14.8    13.5
ridge: 10y rolling, 1.                      16.9    16.8    17.9    15.8    17.0    18.4    15.2    15.6    14.9    15.2    21.5
univariate                                  17.3    17.8    16.7    18.9    16.1    21.8    13.9    18.0    18.1    20.1    15.2
BVAR(4) with TVP                            18.2    18.7    15.2    22.1    17.2    26.0    11.3    16.9    17.4    20.6    19.8
ridge: recursive, 1.                        18.3    18.3    17.8    18.7    18.3    18.6    17.9    16.0    17.5    16.3    23.2
PLS, recursive                              18.5    20.1    20.1    20.1    15.4    18.7    21.6    20.0    21.6    19.7    19.1
BVAR(4)                                     18.6    22.0    20.8    23.2    11.8    27.0    17.0    20.3    19.6    23.9    24.1
DVAR(4)                                     19.4    18.2    21.4    15.0    21.8    14.4    22.1    24.9    21.1    14.5    12.4
PLS, 10y rolling                            19.6    20.9    19.9    21.9    17.1    19.1    22.6    25.1    20.6    19.4    18.5
ridge: 10y rolling, .25                     20.0    20.7    21.3    20.1    18.7    22.1    19.3    17.6    15.5    21.1    28.5
ridge: recursive, .25                       20.9    22.0    20.3    23.8    18.7    21.7    22.4    18.3    18.4    21.8    29.6
BMA: BIC                                    22.0    19.0    17.5    20.5    27.9    20.3    17.8    17.0    18.1    20.6    20.4
VAR(4)                                      23.0    24.6    29.2    20.0    19.8    28.1    21.1    25.8    24.0    24.4    24.3
PLS, 5y rolling                             25.0    24.4    21.7    27.2    26.2    22.1    26.7    26.9    26.1    23.7    21.1
OLS comb. of quartiles, recursive           25.2    22.9    22.3    23.5    29.8    23.1    22.7    20.2    24.9    22.1    24.5
BMA: PIC                                    25.4    21.9    20.4    23.5    32.3    18.8    25.0    24.4    23.7    20.9    18.7
factor, recursive                           25.9    25.2    25.1    25.3    27.2    22.0    28.4    23.3    28.3    26.9    22.1
factor, 10y rolling                         26.7    25.2    27.1    23.2    29.8    24.6    25.8    20.3    28.1    28.8    23.6
OLS comb. of quartiles, 10 year rolling     27.8    25.2    26.1    24.3    33.2    25.3    25.1    20.0    26.2    26.2    28.4
BMA: AIC                                    29.0    26.3    26.3    26.3    34.3    23.6    29.0    27.9    28.6    26.6    22.0
avg. of VAR(4), rolling VAR(4)              29.2    29.8    31.8    27.9    28.0    30.0    29.7    30.3    29.9    30.4    28.8
OLS comb. of quartiles, 5 year rolling      33.0    31.3    32.2    30.5    36.3    30.0    32.7    29.4    32.0    31.1    32.8
ridge: recursive, .001                      33.7    35.6    34.3    36.9    29.9    35.9    35.3    32.6    36.3    36.4    37.1
ridge: 10y rolling, .001                    34.8    36.4    36.1    36.7    31.7    36.1    36.8    35.9    37.4    36.0    36.4
# of ranking observations                    144      96      48      48      48      48      48      24      24      24      24

Notes:
1. The table reports average RMSE rankings of the full set of forecast methods or models included in Tables 3-6. The average rankings in the first column of figures are calculated, for each forecast method, across a total of 144 forecasts of output (3 measures: GDP growth, HPS gap, HP gap), inflation (2 measures: GDP inflation, CPI inflation), and interest rates (1 measure: T-bill rate) at horizons (3) of h = 0Q, h = 1Q, and h = 2Y and sample periods (2) of 1970-84 and 1985-05. The average rankings in remaining columns are based on forecasts with models that include particular variables or forecasts of a particular variable, etc. For example, the average rankings in the second column are based on 96 forecasts of just output and inflation, with forecasts of interest rates omitted from the average ranking calculation.
2. See the notes to Table 3.
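
The averaging described in the notes to Table 7 can be illustrated with a minimal sketch. It assumes that, for each forecast target (a variable, horizon, and sample combination), methods are ranked from 1 (lowest RMSE) upward and that the reported figure is the simple mean of those ranks over the targets in a given column. The function name `average_rankings`, the tie-handling rule, and all RMSE values below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def average_rankings(rmse, subset=None):
    """Rank methods by RMSE within each target, then average the ranks.

    rmse:   dict mapping method name -> dict mapping target key -> RMSE.
    subset: optional predicate on a target key; only matching targets are
            averaged (e.g. output and inflation only).
    Assumption: rank 1 is the lowest RMSE. Ties are broken by the order in
    which methods appear in `rmse`, since the notes do not specify a rule.
    """
    methods = list(rmse)
    targets = [t for t in rmse[methods[0]] if subset is None or subset(t)]
    ranks = {m: [] for m in methods}
    for t in targets:
        ordered = sorted(methods, key=lambda m: rmse[m][t])  # stable sort
        for position, m in enumerate(ordered, start=1):
            ranks[m].append(position)
    return {m: sum(r) / len(r) for m, r in ranks.items()}


# Made-up RMSEs for three methods and three targets (variable, horizon).
rmse = {
    "univariate": {("y", "h=0Q"): 1.00, ("pi", "h=0Q"): 1.00, ("i", "h=0Q"): 1.00},
    "VAR(4)": {("y", "h=0Q"): 1.10, ("pi", "h=0Q"): 0.95, ("i", "h=0Q"): 1.20},
    "average of all forecasts": {
        ("y", "h=0Q"): 0.97, ("pi", "h=0Q"): 0.98, ("i", "h=0Q"): 1.05,
    },
}

# Analogue of the "all" column: every target enters the average.
all_ranks = average_rankings(rmse)

# Analogue of the "all y,π" column: interest rate forecasts are omitted.
y_pi_ranks = average_rankings(rmse, subset=lambda t: t[0] in ("y", "pi"))

for name in rmse:
    print(f"{name:26s} all={all_ranks[name]:.2f}  y,pi={y_pi_ranks[name]:.2f}")
```

Passing a different predicate as `subset` (for example, one that selects a single horizon or sample period) would give the analogue of the remaining columns.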

Cite this document

APA

Todd E. Clark and Michael W. McCracken (2007). Averaging Forecasts from VARs with Uncertain Instabilities (FEDS 2007-42). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2007-42

BibTeX

@techreport{wtfs_feds_2007_42,
  author = {Todd E. Clark and Michael W. McCracken},
  title = {Averaging Forecasts from VARs with Uncertain Instabilities},
  type = {Finance and Economics Discussion Series},
  number = {2007-42},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2007},
  url = {https://whenthefedspeaks.com/doc/feds_2007-42},
  abstract = {A body of recent work suggests commonly-used VAR models of output, inflation, and interest rates may be prone to instabilities. In the face of such instabilities, a variety of estimation or forecasting methods might be used to improve the accuracy of forecasts from a VAR. These methods include using different approaches to lag selection, different observation windows for estimation, (over-) differencing, intercept correction, stochastically time-varying parameters, break dating, discounted least squares, Bayesian shrinkage, and detrending of inflation and interest rates. Although each individual method could be useful, the uncertainty inherent in any single representation of instability could mean that combining forecasts from the entire range of VAR estimates will further improve forecast accuracy. Focusing on models of U.S. output, prices, and interest rates, this paper examines the effectiveness of combination in improving VAR forecasts made with real-time data. The combinations include simple averages, medians, trimmed means, and a number of weighted combinations, based on: Bates-Granger regressions, factor model estimates, regressions involving forecast quartiles, Bayesian model averaging, and predictive least squares-based weighting. Our goal is to identify those approaches that, in real time, yield the most accurate forecasts of these variables. We use forecasts from simple univariate time series models as benchmarks.},
}