Two Practical Algorithms for Solving Rational Expectations Models
Abstract
This paper describes the E-Newton and E-QNewton algorithms for solving rational expectations (RE) models. Both algorithms treat a model's RE terms as exogenous variables whose values are iteratively updated until they (hopefully) satisfy the RE requirement. In E-Newton, the updates are based on Newton's method; E-QNewton uses an efficient form of Broyden's quasi-Newton method. The paper shows that the algorithms are reliable, fast enough for practical use on a mid-range PC, and simple enough that their implementation does not require highly specialized software. The evaluation of the algorithms is based on experiments with three well-known macro models--the Smets-Wouters (SW) model, EDO, and FRB/US--using code written in EViews, a general-purpose, easy-to-use software package. The models are either linear (SW and EDO) or mildly nonlinear (FRB/US). A test of the robustness of the algorithms in the presence of substantial nonlinearity is based on modified versions of each model that include a smoothed form of the constraint that the short-term rate of interest cannot fall below zero. In two single-simulation experiments with the standard and modified versions of the models, E-QNewton is found to be faster than E-Newton, except for solutions of small-to-medium sized linear models. In a multi-simulation experiment using the standard versions of the models, E-Newton dominates E-QNewton.
Finance and Economics Discussion Series Divisions of Research & Statistics and Monetary Affairs Federal Reserve Board, Washington, D.C. Two Practical Algorithms for Solving Rational Expectations Models Flint Brayton 2011-44 NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.
Two Practical Algorithms for Solving Rational Expectations Models ∗ Flint Brayton Federal Reserve Board October 11, 2011 Abstract This paper describes the E-Newton and E-QNewton algorithms for solving rational expectations (RE) models. Both algorithms treat a model’s RE terms as exogenous variables whose values are iteratively updated until they (hopefully) satisfy the RE requirement. In E-Newton, the updates are based on Newton’s method; E-QNewton uses an efficient form of Broyden’s quasi- Newton method. The paper shows that the algorithms are reliable, fast enough for practical use on a mid-range PC, and simple enough that their implementation does not require highly specialized software. The evaluation of the algorithms is based on experiments with three well-known macro models—the Smets-Wouters (SW) model, EDO, and FRB/US—using code written in EViews, a general-purpose, easy-to-use software package. The models are either linear (SW and EDO) or mildly nonlinear (FRB/US). A test of the robustness of the algorithms in the presence of substantial nonlinearity is based on modified versions of each model that include a smoothed formoftheconstraintthattheshort-terminterestratecannotfallbelowzero. Intwosingle-simulationexperimentswiththestandardandmodifiedversions of the models, E-QNewton is found to be faster than E-Newton, except for solutions of small-to-medium sized linear models. In a multi-simulation experiment using the standard versions of the models, E-Newton dominates E-QNewton. ∗ email: Flint.Brayton@frb.gov. I thank John Roberts for helpful comments and Michael Kiley and Peter Chen for assistance with creating the EViews version of EDO. Theviewsexpressedinthispaperarethoseoftheauthoranddonotnecessarilyrepresent the opinions of the Federal Reserve Board or its staff.
2 1. Introduction This paper describes the E-Newton and E-QNewton algorithms for solving rational expectations (RE) models. Like the Fair-Taylor (FT) procedure (Fair and Taylor, 1983), both algorithms treat a model’s RE terms as exogenous variables whose values are iteratively updated until they (hopefully) satisfy the RE requirement.1 To improve solution speed and reliability, the E-Newton algorithm replaces FT’s rudimentary updating mechanism with one based on Newton’s method. In this respect, E-Newton is related to the Newton-based stacked-time procedures that are now more widely used than FT for solving nonlinear RE models.2 The expectations updates in the E-QNewton algorithm use a quasi-Newton method. This paper shows that the E-Newton and E-QNewton algorithms are not only practical and reliable, but also that they are relatively simple to program and thus do not require highly specialized software. For several years, solutions of the RE versions of the Federal Reserve Board’s nonlinear FRB/US macroeconomic model have been computed using code for the E-Newton algorithm written in EViews, a general-purpose, easy-to-use software package. The E-QNewton algorithm, which has been developed more recently, is also written in EViews. The term rational expectations is used here to describe the condition in which expectations of future values are identical to a model’s future solutions in the absence of unexpected shocks. That is, there are no expectations errors. For the algorithms described in this paper, the RE solution of a model with m expectations variables for T periods involves finding the mT expectationsvaluesthatsetalikenumberofexpectationserrorsarezero. The FT algorithm tackles this task with a simple mechanism that is equivalent 1The “E-” prefix attached to the names of the algorithms signifies the central role of the expectations estimates in the iteration process. 2Stacked-timealgorithms,whichimposetheRErequirementbysolvingalltimeperiods in a simulation simultaneously, iterate on the values of all endogenous variables. Newton versions of stacked time are described in Hollinger (1996) and Juillard (1996).
3 to updating at each iteration the estimate of each expected future value in isolation, using only its own expectations error. In contrast, E-Newton and E-QNewton update all mT expectations simultaneously, based on all mT expectations errors and information on the relationship between the expectations estimates and errors. In E-Newton this information takes the form of the Jacobian matrix of first derivatives of the expectations errors with respect to the expectations estimates. E-QNewton uses an approximate Jacobian that is more easily calculated but less accurate than the Newton Jacobian. The relative speed of the two algorithms in solving a particular RE model depends on the model-specific cost trade-off between the accuracy of the Jacobian and the number of iterations needed to achieve an RE solution. E-Newton runs a set of impulse-response simulations (dynamic multipliers) to compute the values of the derivatives that enter the Jacobian and then inverts this matrix. These operations can be quite costly if the size of the Jacobian, mT, is large, especially because the time required to invert a matrix depends on the cube of its size. In fact, when an algorithm similar to E-Newton was developed by Holly and Zarrop (1983) many years ago, these costs were judged to be prohibitive for most RE models.3 Although the vast advances in computation power over the past three decades have greatly diminished the importance of this judgment, the Jacobian costs in E-Newton can nonetheless be substantial for larger RE models. In such cases, faster solutions may be obtained with a quasi-Newton method, because it does not require the explicit calculation of derivatives and can be set up to run without inverting matrices. The E-QNewton algorithm uses a version of the quasi-Newton method of Broyden (1965) that is described by Kelley (1995). In E-QNewton, execution time is more sensitive to the number of solution iterations than to the size of the Jacobian. 3Hall(1985)andFisher(1992)discusstheefficiencyandpracticalityoftheHolly-Zarrop “penalty function” algorithm.
4 Newton and quasi-Newton methods are used to solve systems of equations that arise in many scientific disciplines. As this paper will show, the efficiency of these methods when applied to solving RE models can be improvedbytakingadvantageofsomecharacteristicsthatarecommontomany models of this type. The cost of computing the E-Newton Jacobian can be greatly reduced for many linear and mildly nonlinear RE models through the use of shortcuts that take account of repetitive patterns in their derivatives. In the E-QNewton algorithm, the accuracy of the approximate Jacobian can be improved for most RE models by choosing a very specific initial approximation that is easily calculated. Section 2 describes the E-Newton and E-QNewton algorithms. Section 3 briefly discusses thethree well-known macro models that are used to evaluate the performance of the algorithms: the Smets-Wouters (SW) model, EDO, and FRB/US. SW and EDO are linear DSGE models; FRB/US is mildly nonlinear. Each model is of medium size or larger. To test the algorithms with models that are substantially nonlinear, an alternative version of each model is created that includes a nonlinear but smooth approximation to the constraint that the short-term interest rate cannot fall below zero. The evaluation of the algorithms in section 4 is based on three simulation experiments. The first is a one-time shock to each model’s interest rate policy rule. The second is a shock that reduces output sufficiently to cause the response of the short-term interest rate to be constrained by the zero bound. The third experiment, which requires the execution of a set of RE simulations,assumesthatmonetarypolicyrespondstoashockbyminimizing a loss function. The first two experiments suggest several conclusions. First, for single simulations of linear RE models, E-Newton is likely to be faster for models of small-to-medium size (mT < 2000) and E-QNewton is likely to be faster for larger models. Second, for nonlinear models, E-Newton tends to
5 be penalized relative to E-QNewton to the extent that the nonlinearity eliminates exploitable patterns among the derivatives in the Newton Jacobian. In the specific case of the single-simulation experiment with the zero bound, E-QNewton solves all of the models more rapidly than E-Newton. Third, actual solution times support the claim that the algorithms are sufficiently fast for practical use in many applications. With a mid-range PC, the fastest execution time for each model in each experiment ranges from 0.3 seconds for a stripped-down version of SW in the interest-rate experiment to 13.4 seconds for the version of FRB/US in which all 27 of its expectations variables have RE solutions in the zero-bound experiment. The E-Newton algorithm has a substantial advantage over E-QNewton on experiments that involve a large number of RE solutions, as long as the same Jacobian can be used for each E-Newton solution. This Jacobian condition is satisfied by each of the models in the third experiment, which is designed so that the zero bound is not an issue. This optimal-policy simulation, which entails 42 RE solutions for the linear models (SW and EDO) and 131 RE solutions for FRB/US, executes between 2.2 and 18 times faster with E-Newton than with E-QNewton, depending on the model. For purposes of comparison, solutions of SW and EDO for the first two experiments were also executed with the stacked-time algorithm in Dynare, a specializedsoftwarelanguageforREmodels. TheOctave-basedDynaresolution times are similar to the best of each model’s E-Newton and E-QNewton runs. Had they been run in Matlab, the Dynare solutions would likely have been somewhat faster than the E-Newton and E-QNewton runs but not dramaticallyso. Thissuggeststhatmanyusersmayfindtheaddedsolutiontime of using the algorithms described here in a software language like EViews to be more than offset by the many features available in such a general-purpose environment that facilitate its use.
6 2. The Algorithms This section of the paper, which may be skipped by the reader who is less interested in technical details, has four parts. The first part summarizes the algorithms. The second part examines in more detail the E-Newton Jacobian and the shortcuts that can be used to speed up its construction. The third part describes the E-QNewton approximate Jacobian and a pair of transformations that enable it to be efficiently used. The last part discusses convergenceproperties. Additionalinformationonthealgorithmsisprovided in the Appendix. 2a. Overview Consider a model of n equations in which f = [f ,...f ]′ is a vector- 1 n valued function of n endogenous variables, y = [y ,...y ]′. t 1,t n,t f (y ,y ,y ) 1 t−1 t t+1 . f(y t−1 ,y t ,y t+1 ) = . . = 0 (1) f n (y t−1 ,y t ,y t+1 ) For expositional simplicity, the model contains only a single lag and single lead of y and exogenous variables are not shown. The lead term captures the model’s RE expectations variables. The E-Newton and E-QNewton algorithms (and FT, as well) replace y with an exogenous estimate of ext+1 pectations, denoted by x , t f(y ,y ,x ) = 0. (2) t−1 t t The algorithms execute one or more iterations, each of which starts with a solution of (2) for a given set of expectations estimates. The resulting values of the RE expectations errors, e , t
7 e = x −y . (3) t t t+1 are then computed, taking as given the value of y obtained in solving (2). t+1 Depending on the size of the errors, the solution of equations 2-3 may be repeated with a different step length, in which the expectations estimates are moved either back toward or farther from their values in the previous iteration. Lastly, a set of revised expectations estimates is constructed for use in the next iteration. Some additional notation is needed to describe how the expectations estimates are revised in the last step of each iteration. Assume a solution for t = 1,...,T, and stack the vectors of the expectations estimates and expectations errors for all T time periods in x˜ and e˜, respectively. If there are m distinct expectations variables, e˜ and x˜ are vectors of length mT. The relationship between these two stacked vectors can be written in general terms as e˜= F(x˜), (4) where the function F is a compact representation of the model’s structure, also in stacked form.4 Using this representation, the condition that expectations are rational, F(x˜) = 0, (5) is the solution of a system of (possibly) nonlinear equations. Each evaluation of F requires a simulation of the model given by equations 2-3. 4F also depends on the initial and terminal values of y.
8 The field of numerical analysis contains many algorithms for solving systems of nonlinear equations. Some of these algorithms characterize the adjustmentofx˜atiterationiastheproductofasteplength, λ , andadirection, i ˜ d , i ˜ ∆x˜ = λ d , (6) i i i where the direction in turn depends on the product of an updating matrix U and the value of F in the previous iteration, d = −U F(x˜ ). (7) i i−1 i−1 The iterations of E-Newton and E-QNewton can be expressed in the form of equations 6-7, differing only in how U is defined. The FT algorithm is the special case in which U is the identity matrix. When Newton’s method is used to solve a system of equations, the updating matrix is the inverse of the Jacobian matrix of first derivatives (J = (∂F/∂x˜)). In E-Newton specifically, J is the matrix of derivatives of the expectations errors with respect to the expectations estimates.5 When quasi-Newton methods are used to solve systems of equations, the updating matrix is the inverse of an approximate Jacobian (B) whose elements are based only on the implicit derivative information observed each iteration in the movements of F and x˜. Quasi-Newton methods do not calculate derivatives explicitly. Rather, starting from some initial estimate (B ), 0 they revise the approximate Jacobian each iteration so that it matches the slope of F revealed in the previous iteration in the particular direction that 5NotethedistinctionbetweentheREJacobian,J,whichisassociatedwithequation5, and the structural Jacobian of the original RE model in equation 1. The iterations of the Newton-based stacked-time algorithm make use of a stacked form of the structural Jacobian.
9 x˜ was adjusted. Many matrix updating formulas satisfy this condition. The E-QNewtonalgorithmisbasedonthewidelyusedmethodofBroyden(1965), whose updating formula is the one that makes the smallest possible change to the approximate Jacobian each iteration.6 2b. The E-Newton Jacobian The most important and costly part of the E-Newton algorithm is the computation and inversion of the Jacobian (J). The algorithm executes impulse-response simulations (dynamic multipliers) of the model formed by equations 2-3 to compute J. Although mT impulse-response simulations may be needed for this purpose at each iteration (one for each expectations variableineachsimulationperiod), modelsthatarelinearormildlynonlinear require many fewer simulations. The Jacobian of a linear RE model has a repetitive structure. This is illustrated with a single-equation RE model in which the variable y depends only on its first lead and first lag, y = αy +βy +ǫ . (8) t t−1 t+1 t Define the operational model using x to represent the expectations estimate t and e the expectations error. t y = αy +βx +ǫ (9) t t−1 t t e = x −y (10) t t t+1 Repeated substitution of (9) into (10) yields the following equation, 6ThechangeintheapproximateJacobianisdefinedintermsofitsnorm: ||Bi−Bi−1||.
10 x − t−1 αi+1(βx +ǫ )−αt+1y t < T e = t i=−1 t−i t−i 0 (11) t x T −Py T+1 t = T from which the stacked Jacobian is easily obtained. 1−αβ −β 0 ... ... ... 0 −αβ 1−αβ −β ... 0 −α2β −αβ 1−αβ ... ... 0 J = −α3β −α2β −αβ ... ... ... 0 (12) . . . . . . . . . ... ... ... 0 −αT−2β −αT−3β −αT−4β ... −αβ 1−αβ −β 0 0 0 ... 0 0 1 Element (j,k) of J is the derivative of e with respect to x . Each northwestj k southeastdiagonalcontainsasinglerepeatedvalue,exceptinthebottomrow. For the nonzero diagonals, this pattern is a consequence of the property of a linear model that the effect on an endogenous variable at period j of an exogenous shock at period k depends only on number of intervening periods, k−j. Thezerodiagonalscorrespondtoexpectationserrorsthataredatedtoo far in the future to be affected by the particular expectations estimate. The zeros in most elements of the bottom row appear because the expectations errorassociated with thatrow, e , dependson theunchangingterminalvalue T y . Given these patterns, all of the elements of this Jacobian can be T+1 determined from the values in the first and last columns and the bottom row. These values in turn can be generated by two impulse-response simulations, one that perturbs x and one that perturbs x , and priori knowledge of the 1 T bottom row.
11 A simple generalization of this single-equation Jacobian holds for all linear RE models in which expectations are formed at time t for outcomes at t + 1. The Jacobians of such models, which consist of m2 blocks, each of which is similar to (12), can be calculated with only 2m impulse response simulations and priori knowledge of the bottom row of each Jacobian block. For qualifying models, this shortcut greatly speeds up the computation of the Jacobian.7 The cost of computing the Jacobian can also be reduced for models that are mildly nonlinear. Although the Jacobian blocks of such models do not have constant diagonals, the values along each diagonal may shift relatively smoothly. Useful approximations to these Jacobian can usually be constructed by calculating a subset of the columns with impulse-response simulations and estimating the values in the other columns by linear interpolation along the diagonals. 2c. The E-QNewton approximate Jacobian Define the step s as the change from the previous iteration of x˜ and i z as the resulting change of F. In the RE context, s is the change in i i the expectations estimates and F is the change in the expectations errors. Withthesedefinitions, therevisiontotheapproximateJacobianinBroyden’s method can be expressed as, z −B s B = B + i i−1 i s′. (13) i i−1 s′s i i i Some intuition about this formula is obtained by noting that the revision to 7Some linear RE models contain expectations of outcomes dated more than one period in the future or form expectations only on the basis of lagged information. These modifications alter the repetitive structure of the Jacobian blocks in specific ways. With the first modification, more than one diagonal above the main diagonal contains non-zero entries, and more than one row at the bottom has entries that differ from the pattern in the rows above. The assumption of lagged expectations modifies the right-hand columns of the Jacobian blocks.
12 B depends on the vector (z −B s ), which measures how much the change i i−1 i in F differs from what would have occurred if B had in fact been the true i−1 Jacobian. Although it avoids the cost of computing explicit derivatives, the Broyden formula given by (13) has practical drawbacks in applications whose approximate Jacobians are large and that require many iterations to solve: To compute the adjustment direction each iteration, B needs to be inverted andthenmultipliedbyF(x˜). However, twotransformationsleadtoaformula thatexecutesmuchmorequicklyinthesecases. Thefirsttransformationuses the Sherman-Morrison formula to convert (13) into an expression for directly updating the inverse of the approximate Jacobian. s −B−1z B−1 = B−1 + i i−1 i (s′B−1). (14) i i−1 s′B−1z i i−1 i i−1 i The second transformation recognizes that the expression B−1F(x˜ ) can be i i expressed as a function of the product of the inverse of the initial approximate Jacobian and F(x˜ ), and the iteration histories of revisions to the i expectations estimates and step lengths. B−1F(x˜) = G(s ,s ,...,s ,λ ,λ ,...,λ )B−1F(x˜) (15) i i 1 2 i 1 2 i 0 i As long as i is not too large relative to mT and B−1 has a simple form, it 0 can be considerably faster to compute the adjustment direction using (15) than by the two-step process of forming B−1 from (14) and multiplying the i result by F(x˜ ).8 i 8The transformation embodied in (15) is taken from Kelley (1995, chapters 7-8). The specific form of the function G is presented in the Appendix. As shown in Appendix table A1, the second transformation reduces solution time between 32 and 97 percent, depending on the model and the simulation experiment.
13 Before it starts, the E-QNewton algorithm requires an initial approximate Jacobian, B . The conventional recommendation is to use the identity 0 matrix, a choice that, because it has the effect of eliminating all matrices from (15), causes each iteration to execute very quickly. As a general matter, the choice of B has to balance two considerations: A simpler estimate 0 speeds up each iteration, while a more accurate estimate reduces the number of iterations required for convergence. For most of the models examined here, a simplified, block-diagonal estimate of the Jacobian, calculated from m impulse response simulations (one per RE variable), is usually a better choice than the identity matrix. Because this matrix, Bbd, has an inverse 0 that can be easily computed and stored compactly in m TxT matrices, its use increases computation time per iteration only modestly relative to that for B = I. At the same time, the improvement in the accuracy of the ini- 0 tial approximate Jacobian is such that the benefit of fewer iterations greatly outweighs the higher cost per iteration in most cases. 2d. Convergence and the step length The adjustment of the expectations estimates each iteration is the product of the step direction (d) and the step length (λ). Thus far, the detailed discussion of the two algorithms has concentrated on the E-Newton Jacobian and the E-QNewton approximate Jacobian, matrices that enter the computation of the direction. With regard to the step length, algorithms that use the Newton or quasi-Newton directions always converge to the solutions of linear models when λ=1.0.9 With this step length, E-Newton solutions require a single iteration, as long as the Jacobian is exactly computed, and E-QNewton solutions require at most 2mT iterations.10 9This statement requires several qualifications. The Newton Jacobian or the quasi- Newton approximate Jacobian must be non-singular. The statement does not apply to the E-Newton algorithm when it uses an inexact Jacobian. For an RE model that does notsatisfytheBlanchard-Kahnconditions,thesolutionobtainedmaynotbeeconomically meaningful. 10The E-QNewton convergence results are based on Gay (1979), who proves that Broy-
14 A step length of 1.0 does not guarantee the convergence of E-Newton or E-QNewton to the solution of a nonlinear RE model, unless the starting point is close to the solution. For nonlinear models, the convergence properties of the pair of algorithms is improved if line-search procedures are used to examine alternative step lengths, whenever the default length (λ = 1, typically) yields an insufficient movement toward the RE solution. The ratio of the sum of squared expectations errors in successive iterations, e˜′e˜ g(x˜ ) = i i , (16) i e˜′ e˜ i−1 i−1 provides a scalar measure of the iteration-by-iteration progress. E-NewtonusestherelativelysimpleArmijoline-searchprocedureinwhich thesteplengthstartsat1.0andthenisrepeatedlyshortened,ifnecessary,until either g(x˜ ) < (1−λ γ) or the maximum number of step-length iterations i i is reached.11 In E-Newton, γ = .01, the step length shortening parameter is 0.5, and the maximum number of line-search iterations is ten. For quasi-Newton methods, the decision of how intensively to search for the best step length each iteration is more complex than in the Newton case, because even an iteration that fails to make much or any progress toward the RE solution is still likely to improve the accuracy of the approximate Jacobian to be used in the next iteration. In particular, it may be more efficient to move on to the next iteration than to try to refine the step length in the current iteration, when the direction in which x˜ is being adjusted is likely to make little progress. For this reason, E-QNewton uses the nonmonotone step-length procedure of La Cruz, Martinez, and Raydan (LMR, den’s method always converges to the solution of a linear system of equations when the step length is 1.0, and that the required number of iterations never exceeds twice the number of equations. 11Kelley(1995,pp. 138-141)describestheconditionsunderwhichtheArmijoprocedure is globally convergent.
15 2005). This procedure tolerates temporary, limited increases in the sum of squared expectations errors, g(x˜ ) > 1, without searching for a better step i length, especially in the initial solution iterations, when the accuracy of the approximate Jacobian and the adjustment direction is likely to be worst. In addition, the LMR procedure tests both positive and negative step lengths, a feature that is useful when the direction is so poor that no positive step length reduces the sum of squared expectations errors. BothE-NewtonandE-QNewtonhavemuchbetterconvergenceproperties thanFT.AllofmodelsconsideredhereconvergetotheirREsolutionsinallof the experiments for at least one of the E-Newton Jacobian options and for at leastoneoftheE-QNewtonoptionsforinitializingtheapproximateJacobian, usingtheline-searchalgorithmsjustdiscussedwhenappropriate. Incontrast, the FT algorithm does not have a high rate of successful convergence for these specific models, a result which can be shown using a simple condition provided by this paper’s RE solution framework.12 For a linear model, the evolution of the expectations errors depends on the Jacobian and the expectations estimates. e˜ = e˜ +J [x˜ −x˜ ] (17) i i−1 i i−1 = (I −λJ)e˜ (18) i−1 The derivation of (18) uses the version of (7) that holds under FT to substitute for the change in the expectations estimates. FT convergence of e˜to zero thus requires that the eigenvalues of (I −λJ) lie inside the unit circle. 12For a summary of convergence problems that have been encountered when using FT to solve other RE macro models, see Juillard et. al. (1999), who report that nine of 13 macromodelerssurveyedhaddifficultieswithFTandsimilaralgorithmsincaseswherethe stacked-time algorithm worked. Because they are both Newton methods, the convergence properties of E-Newton and stacked-time should be similar.
16 Related to this is a second condition that requires all of the eigenvalues of (I−J) to be less than 1.0 for there to exist some λ > 0 that satisfies the first condition.13 Of the linear models used in this paper, EDO satisfies the second FT convergence condition but SW does not. For nonlinear models, FT convergence requires that the second condition hold in some neighborhood of the solution. This is satisfied by the simpler version of FRB/US but not by the one in which all 27 of its expectations variables have RE solutions. 3. The Models Three models are used to evaluate the E-Newton and E-QNewton algorithms. One model is the well-known New-Keynesian DSGE of Smets and Wouters (2007). The other two models are ones developed and used at the Federal Reserve Board: EDO and FRB/US. EDO is a New-Keynesian DSGE thatisconsiderablylargerthanSmets-Wouters(SW).14 FRB/US,whichisan even larger model, also has New-Keynesian characteristics, but its structure is estimated with fewer constraints than a DSGE framework would impose.15 Several versions of each model are employed. A small version of SW (SWsmall) eliminates the flexible-price concept of potential output, a change that permits several equations to be dropped but requires that the interest-rate equation be modified to depend on the log level of output rather than on the output gap. One version of FRB/US assumes that all of its expectations variables have RE solutions (FRB/US-all); in a second version, only those expectations that appear in asset-pricing equations have RE solutions (FRB/US-mcap).16 13ForotherrepresentationsofthefirstFTconvergencecondition,seeFisherandHughes Hallett (1988) and Armstrong, et. al. (1998). The relationship between convergence conditions of the first and second type is described in Fisher and Hughes Hallett (1988). 14EDO is described in Chung, et.al. (2010) and Edge, et.al. (2008). 15FRB/US is described in Brayton and Tinsley (1996) and Brayton, et.al. (1997a). 16MCAP: Model-Consistent Asset Pricing.
17 Table 1: Measures of Model Size Model Equations m T mT SW-small 22 7 100 700 SW-full 33 12 100 1200 EDO 70 33 160 5280 FRB/US-mcap 393 7 160 1120 FRB/US-all 393 28 160 4480 Notes: misthenumberofREterms. T isthenumber of simulation periods. For each model version described thus far, table 1 presents two measures of size: the number of equations and the product of the number of distinct RE variables (m) and the number of simulation periods (T). The number of equations varies from 22 (SW-small) to 393 (FRB/US). The second measure of size (mT) is especially important in the E-Newton algorithm, because it equalsthedimension oftheJacobian, whosecomputation andinversion takes the largest component of that algorithm’s execution time. The Jacobian dimension varies from 700 (SW-small) to 5280 (EDO). Because the time required to invert a matrix depends on the cube of its size, inversion of the EDO Jacobian takes about 430 times longer than inversion of the SW-small Jacobian. The number of simulation periods is set separately for each model so that the simulated outcomes in the first part of each experiment are little affected if the simulation length is increased. SW and EDO are linear; FRB/US is mildly nonlinear. In order to expand the set of models to include ones that are more nonlinear than FRB/US, a modified form of each model version is created in which a very nonlinear but continuously differentiable equation is added to constrain the short-term interest rate (rs) from falling below zero. t
18 Figure 1: Zero Bound Constraint .6 .5 .4 .3 .2 .1 .0 -.5 -.4 -.3 -.2 -.1 .0 .1 .2 .3 .4 .5 Unconstrained rate of interest ru +.10e−5r t u if ru ≥ 0 rs = t t (19) t .10e5r t u if ru < 0 t In (19) ru is the unconstrained rate of interest. The dashed line in figure 1 t plots the constrained rate of interest for values of the unconstrained rate of interest between -0.5 and 0.5. Unlike the other models, the design of FRB/US permits each of its expectations terms to have either a backward-looking VAR-based solution, which isbasedonafixedformula, oranREsolution.17 InFRB/US-mcap, theseven expectations that appear in the equations for asset prices have RE solutions and the other 21 have VAR solutions. The VAR formulas themselves are a type of RE expectation, but one that is based on a smaller information set than the one embodied in the full FRB/US model. Each VAR formula is 17The expectations structure of FRB/US is described in Brayton et. al. (1997b)
19 derived as part of the sector-by-sector estimation of FRB/US, which relies on small models that combine one or a few structural equations with several VAR relationships.18 4. Simulation Experiments One of the purposes of this paper is to show that the E-Newton and E-QNewton algorithms are fast enough to be of practical use for solving a variety RE models. The evidence is based on a set of simulation experiments that are run on a mid-range PC, one whose characteristics are more likely to represent the computing resources available to a broad range of people who solve, orwouldliketosolve, REmacromodels, thanwouldthecharacteristics of a state-of-the-art PC. 19 All simulations were run in EViews 7.2.20 In the simulations, convergence to an RE solution occurs when the absolute value of every expectations error is less than 1e-05. In the E-Newton runs that require more than one iteration, the Jacobian is computed initially and then recomputed only after those iterations in which the sum of squared expectations errors falls by less than 50 percent. In all runs that permit the step length to be adjusted from its default value (1.0, typically), alternative step lengths are evaluated only during those iterations in which the sum of 18SeveralaspectsofexpectationsstructureofFRB/US,includingtheVARexpectations option, require an RE solution approach that is more general than the one presented in equations 2-3. In this more general approach, the expectations estimates have separate endogenousandexogenouscomponents, andtheiterativeupdatingprocedureoperateson thelatter, notontheexpectationsestimatesthemselves. Thisandrelatedtechnicalissues are discussed in section A1 of the Appendix. 19The PC used for all simulation experiments contains an Intel Core 2 Duo E6750 2.67 GHz chip and 4 GB of RAM. Some of the experiments were also executed on a PC with a newer Intel I7-2600 3.40 GHz chip. The experiments run about twice as fast on this second machine than on the first. 20BecausethematrixinversioncodeinEViews7isabletodistributethistasktomultiple processingthreads, theE-Newton algorithm runssubstantially fasteronPCs, like theone used in these simulations, that can run more than one process simultaneously, especially when the Jacobian is large. The advantage of having more than two processing threads does not seem to be significant, however.
20 squared expectations errors falls by less than 10 percent at the default step length. Reported execution times do not include the time needed to load each model and its data. The first simulation experiment (table 2) is a one-time, 100-basis-point shock to the equation for the short-term rate of interest. Execution times are reported for three E-Newton options and two E-QNewton options. In the E-Newton solutions, all derivatives are directly computed in constructing the Jacobian under the “every” option; every 12th derivative is directly computed and the others are interpolated under the “interp(12)” shortcut; and only two derivatives are directly computed for each RE variable under the “linear” shortcut. For the three RE model versions that are linear, the “linear” and “every” options both compute the same (exact) Jacobian. The “interp(12)” Jacobian is always an approximation, even for linear models.21 The advantage of the “linear” shortcut for these models is substantial. A comparison of columns 1 and 3 reveals reductions in execution time that range from 56 percent in EDO to 91 percent in SW-full. To a large extent, the variation in these percentages reflects variation in the share of execution time devoted to inverting the Jacobian, an operation that is not affected by any of the options. A model whose mT is high (EDO) devotes proportionally more execution time to matrix inversion than does a model whose mT is low (SW). In the interest-rate simulations of the two FRB/US versions, the nonlinearity of the model, while mild, is nonetheless substantial enough that the resulting inaccuracy of the “linear” Jacobian causes E-Newton to fail to converge. For this model, the “interp(12)” option is a powerful shortcut, one that reduces execution time in this experiment by at least 85 percent. The interest-rate simulations that use the E-QNewton algorithm reveal 21Unlike the “linear” shortcut, the “interp(12)” option does not take account of the irregular derivative patterns in the bottom row of each Jacobian block.
21 Table 2: Interest-Rate Experiment Simulation Time in Seconds (number of iterations in parentheses) Model E-Newton E-QNewton linear interp(12) every B = Bbd B = I 0 0 0 SW-small .3 .5 2.2 .5 2.3 (1,1) (4,1) (1,1) (61) (251) SW-full .3 .6 3.4 .7 2.7 (1,1) (4,1) (1,1) (61) (251) EDO 34.2 42.7 79.4 5.4 23.0 (1,1) (11,1) (1,1) (101) (409) FRB/US-mcap nc 10.9 99.7 2.9 2.1 (4,1) (3,1) (17) (18) FRB/US-all nc 65.5 432.4 9.0 27.7 (5,1) (3,1) (41) (217) Notes: nc: No convergence. The second number in parentheses in the E-Newton columns is the number of iterations in which the Jacobian is computed. Both E-Newton and E-QNewton use a fixed step length (λ=1). theadvantageofsettingtheinitialapproximateJacobiantotheblock-diagonal matrix, B = Bbd, rather than to the identity matrix, B = I. For four of 0 0 0 the five model versions, the block-diagonal initial approximation reduces the number of iterations by at least 75 percent and the solution time by nearly as much. The only exception is FRB/US-mcap which, unlike the other model versions, expresses all of its RE variables in a way that makes the true Jacobian close enough to an identity matrix that the benefit of using Bbd from 0 fewer solution iterationsis minor and doesnot outweigh the cost of theadded computations per iteration. Turning to the question of which algorithm is faster given that its most efficient option is chosen, E-Newton outperforms E-QNewton in the interestrate simulations when mT is not too large (<2000) and the model is linear;
22 E-QNewton is faster otherwise. In general, the E-Newton simulations exhibit much more variation in solution time than do the the E-QNewton simulations. This is clearest in the results for the three linear model versions. In the simulations with the E-Newton “linear” option, the ratio of the longest solution time to the shortest solution time is 114; the corresponding ratio in the simulations with the E-QNewton B = Bbd option is only 11. This 0 0 difference is explained by time required to invert the E-Newton Jacobian, which varies with the cube of mT. The E-QNewton solution times appear to be not far from proportional to mT. The second simulation experiment (table 3) focuses on how well the two algorithms solve RE models that are substantially nonlinear. For this purpose, a smooth approximation to the constraint that the short-term interest rate cannot fall below zero is added to each model, and shocks are introduced that reduce output by an amount sufficient to cause this nonlinearity to have a large impact on the solution. The magnitude of the shocks is chosen so that in each model the cumulative loss of output over the first 20 periods is roughly twice as large as the increase in output that would occur if the signs of the shocks were reversed. In the simulations, the number of periods that the rate of interest, if unconstrained, would be less than zero ranges from seven periods (SW-small and SW-full) to 25 periods (FRB/US-mcap). The E-QNewton algorithm completes all of the zero-bound simulations much more rapidly than does E-Newton. For each model version, the best E-QNewtonsolutiontakesonlyabout20percentaslongasthebestE-Newton solution. Even though the shocks are not the same, a rough gauge of the effects on each algorithm of the substantial nonlinearity introduced in this experiment can be made by comparing the execution times in table 3 with those in table 2. On this basis, the zero-bound nonlinearity tends to increase the E-QNewton execution times modestly and relatively uniformly for each model version. The effects in the E-Newton runs are more variable and de-
23 Table 3: Zero-Bound Experiment Simulation Time in Seconds (number of iterations in parentheses) Model E-Newton E-QNewton linear interp(12) every B = Bbd B = I 0 0 0 SW-small nc nc 6.2 1.1 7.6 (16,2) (88) (392) SW-full nc nc 8.5 1.3 9.1 (16,2) (88) (392) EDO 44.2 51.9 97.7 9.0 34.7 (35,1) (38,1) (36,1) (121) (500) FRB/US-mcap nc 15.2 130.7 3.9 2.8 (11,1) (12,1) (19) (20) FRB/US-all nc 75.8 524.1 13.4 nc (10,1) (6,1) (57) Notes: nc: No convergence. The second number in parentheses in the E-NewtoncolumnsisthenumberofiterationsinwhichtheJacobianiscomputed. E-Newton uses the Armijo step-length procedure. E-QNewton uses theLMRstep-lengthprocedure. ConvergenceoftheFRB/US-allB0 =B 0 bd simulation requires λ≤ 0.5. pend on whether each model’s best Jacobian shortcut in table 2 continues to work. For EDO and the two versions of FRB/US, the best shortcuts remain effective and solution times in the zero-bound experiment are modestly longer than those in the interest-rate experiment. In contrast, neither shortcut now converges for either version of SW, and the best E-Newton solution times are more than 20 times longer in the zero-bound simulation than in the interest-rate simulation. Before turning to the third, more complex simulation experiment, it is useful to address the claim made at the start of this section concerning the practicality of the E-Newton and E-QNewton algorithms. Taking the best result for each model version in each of the first two experiments, the exe-
24 Table 4: Comparison with Dynare Simulation Time in Seconds Model Experiment 1 Experiment 2 Best N/QN Dynare Best N/QN Dynare SW-small .3 .7 1.1 2.0 SW-full .3 .9 1.3 2.4 EDO 5.4 3.4 9.0 8.6 Note: BestN/QN:FastestE-Newton orE-QNewtontime. cution times range from as little as 0.3 seconds for SW-small and SW-full in the interest-rate experiment to as long as 13.4 seconds for FRB/US-all in the zero-bound experiment. These times are clearly fast enough for practical use. Although the focus of this section is on the absolute amount of time that E-Newton and E-QNewton take to solve various RE models, the speed of the two algorithms relative to other RE algorithms is obviously of some interest. Table 4 addresses this issue in a limited way. For purposes of comparison, the SW and EDO simulations of the first two experiments were repeated using the stacked-time algorithm in Dynare.22 Compared with the best of each model’s E-Newton and E-QNewton runs, Dynare solution times are moderately slower for the two SW versions but somewhat faster for EDO. TheDynaresolutionswerecomputedinOctave.23 Eventakingaccountofthe fact that the Dynare solutions would run more rapidly in Matlab, the results indicate that any loss of solution efficiency from using a general-purpose 22DynareversionsofSWandEDOareeitheravailableoreasilycreated. Thisisnotthe case for FRB/US, due to its large size and complex structure. 23The Dynare solutions wre executed on the same PC as the other experiments, using Dynare 4.2.1 and Octave 3.2.4. The standard linear versions of SW and EDO could also be solved using any of the eigenvalue-splitting methods that originated with Blanchard and Kahn (1980), including AIM (Anderson and Moore, 1985) and Gensys (Sims, 2001). The software that is available for these methods typically is not designed for ease of use.
25 software platform to run the algorithms described here, rather than using specialized software designed for RE models, is not that large, and suggest thatformanyusersthiscostmaybemorethanoffsetbythefeaturesavailable in the general-purpose environment that facilitate its use. The final experiment assumes that each model’s monetary policymaker, rather than following a simple policy rule, sets the short-term rate of interest to minimize a loss function. For illustrative purposes, the loss function is a weighted, discounted sum of squared differences of output and inflation from their baseline values and of the rate of interest from its value in the previous period. The loss function is minimized over the first 40 simulation periods. Thesolutionofthisexperimentrequiresasequenceof“outer”iterations, each of which consists of 40 RE derivative solutions that compute the effects on the arguments of the loss function of perturbations to the policy instrument in each optimization period, followed by one or more RE solutions in which the policy variable is set according to these derivatives and the values of the loss function arguments. The experiment repeats the output shocks of the second experiment, but reverses their signs so that output expands initially and the zero bound is not an issue. Because the Jacobian of a linear model is invariant to where it is calculated, a single “linear” Jacobian is clearly sufficient to run all of the necessary E-Newton RE simulations for the linear model versions. In addition, the optimal-policy solution for these models requires a single outer iteration. Although FRB/US is nonlinear, the nonlinearity is mild enough that a single “interp(12)” Jacobian proves to be accurate enough to complete all of the needed E-Newton RE simulations. The nonlinearity does, however, increase to three the number of outer iterations needed for convergence to the optimal-policy solution. That is, the derivatives of the arguments of the loss function with respect to the policy instrument in each period have to be computed three times.
26 Table 5: Optimal-Policy Experiment Model E-Newton E-QNewton Number of option time option time RE instr. sims derivs. SW-small linear .8 Bbd 14.0 42 1 o SW-full linear 1.0 Bbd 17.8 42 1 o EDO linear 48.5 Bbd 153.6 42 1 o FRB/US-mcap interp(12) 84.9 I 202.4 131 3 FRB/US-all interp(12) 241.0 Bbd 521.8 131 3 o The E-Newton algorithm has a substantial advantage in this experiment, because the Jacobian is a pure fixed cost for each model version. The fixed costs associated with E-QNewton are relatively minor. The E-Newton solutions are from 2.2 to 18 times faster than the corresponding E-QNewton solutions (table 5). The E-Newton execution times themselves range from about a second (the two SW versions) to four minutes (FRB/US-all). The latterexecutiontimedoesnotseemexcessive, giventhatthesolutionrequires 131 RE simulations of a large (mildly) nonlinear model. 5. Conclusion More than 30 years ago, Holly and Zarrop (1983) developed an algorithm for solving RE models that used Newton’s method in an iterative framework that, although expressed as an optimal control problem, is broadly similar to the E-Newton algorithm presented here.24 They demonstrated their “penalty-function” algorithm with a model containing two RE variables. Use of the Holly-Zarrop approach seems to have ceased after a few years. S.G. Hall (1985, pp. 157-158) noted: 24The E-Newton algorithm imposes RE by solving a system of equations. When there are asmany instruments asobjectives, asis the casefor REsolutions, the optimal control approach used by Holly and Zarrop is identical to solving a system of equations.
27 The real drawback of [the Holly-Zarrop] technique is ... that it severely limits the number of expectations terms that can appear in the model. Optimal control problems quickly become intractable as the number of control variables rises and it would be impossible to use this technique to solve a model with 20 or more expectations variables. Later, Fisher (1992, p. 56) concluded: ... it is clear that [Holly-Zarrop] methods are not efficient ... These observations reflect the limited capabilities of the computer hardware that was available at the time. Computer hardware is now cheap by the standards of the past, a development that greatly extends the range of RE models for which Newton algorithms like Holly-Zarrop and E-Newton are feasible and practical. Nonetheless, if execution speed were the only criteria for comparing solution procedures, neither E-Newton nor E-QNewton would come out on top. Relative to faster procedures, an important advantage of E-Newton and E-QNewton lies in their simplicity, which enables them to be coded in software that is easy to use. This advantage would not have been very important 20 or 30 years ago, when the time penalty for choosing a user-friendly software product might have translated into many extra minutes or even hours of execution time.25 Today, theavailabilityofmuchfasterhardwaremeansthattheadded execution time is likely to be measured in seconds and thus less important. This paper demonstrates the ability of a Newton algorithm that iterates on a set of expectations estimates to solve several well-known RE models. 25Hollinger (1996) reports solution times using the stacked-time algorithm in Troll for several macro models, of which the 466-equation Multimod is closest in size to FRB/US. A 150-period solution of Multimod required 21 minutes on a relatively fast computer.
28 Compared with the Newton procedure developed by Holly and Zarrop, the E-Newtonalgorithmpresentedherecontainsoptionsthatacceleratethecomputation of the Jacobians of linear and mildly nonlinear RE models. This paper also develops an RE solution algorithm based on the quasi- Newton method of Broyden, an approach that appears not to have been used before on this type of problem. The E-QNewton algorithm combines a block-diagonal initial estimate of the approximate Jacobian, an economical approach to updating the approximate Jacobian, and the option to use the LMR step-length procedure with nonlinear models. E-QNewton solves the reported single-simulation experiments more rapidly than E-Newton, except whenthemodelislinearandofsmall-to-mediumsize. E-Newtonoutperforms E-QNewton on the experiment that involves a set of RE solutions in which the Newton Jacobian is a one-time fixed cost.
29 Appendix A.1 Setting up the RE problem The rational expectations model f(y ,y ,y ) = 0 is split into two t−1 t t+1 models, one that has no leads of endogenous variables and one that has no lags of endogenous variables. The former can be easily solved forward in time and the latter can be easily solved backward in time. The model without leads takes the original model and replaces the vector y with a vector of expectations estimates, χ , that is based on information t+1 t available at t and has two components: the function k(y ,y ) of the model’s t−1 t endogenous variables and the exogenous vector x . t f(y ,y ,χ ) = 0 t−1 t t (A1) χ = k(y ,y )+x t t−1 t t The model without lags contains the RE formulas for the expectations estimates, whose solutions in the second model are denoted by χˆ, and the expectations errors, e. χˆ = nx µ χˆ + ny ν y t i=1 i t+i i=0 i t+i (A2) e t = χPt −χˆ t P Note that y is endogenous in (A1) but exogenous in (A2). The pair of models simplifies to equations 2-3 when the equation for χ in the first model does t not have an endogenous component and the equation for χˆ in the second t model simplifies to χˆ = y . t t+1 The more general form of A1-A2 is necessary to accomodate two characteristicsoftheexpectationsstructureofFRB/USthatareabsentintheother models. First, the VAR-based expectations framework in FRB/US, although designed as an alternative to rational expectations, has proved to be a good starting point for the RE expectations estimates. The VAR-based formulas
30 are the endogenous expectations component in the RE versions of FRB/US. Second, many FRB/US equations are written in such a way that their expectations terms correspond to infinite weighted sums of future values of endogenous variables, not to a single future value. This form appears in the model’s present value relationships for financial variables and in its decisionrule representations of some nonfinancial equations. The latter, which are derived from Euler equations, substitute for one or a few future values of the dependent variable with infinite future sums of the other variables that appear in the equation. Because the coefficients of the infinite sums decline either geometrically or based on a mixture of several geometric parameters, the sums collapse to the form shown in the lower line of A2. For financial expectations, the upper summation limits, nx and ny, are one and zero, respectively. For nonfinancial expectations, the limits are typically greater than one but never larger than three. The RE solution of equations A1-A2, which obtains when x is chosen t such that e = 0 for t = 1,...,T, is written compactly as t F(x˜) = 0. (A3) The stacked vector of expectations constants, x˜, has mT elements, where m is the number of RE variables and T the number of simulation periods. ConvergencetotheREsolutionoccursatiterationiifthemaximumabsolute expectations error, c(x˜), is less than γ . c c(x˜ ) = max|F(x˜)| = max|e˜| < γ , (A4) i i i c The ratio of the sum of squared expectations errors in the current and previous iterations, e˜′e˜ g(x˜ ) = i i , (A5) i e˜′ e˜ i−1 i−1
31 provides a scalar measure of the progress made toward the RE solution each iteration. A.2 The E-Newton algorithm The organization of the E-Newton algorithm is sketched out below. J is i the Jacobian of F at x , λ is the step length, and d is the step direction. i i i Parameter values are γ =0.5, γ =0.9, and γ =1e-05. j ls c 1. Initialize: (a) set i = 0; choose y , y , x˜ 0 T+1 0 (b) compute J 0 (c) d = −J−1F(x˜ ) 1 0 0 2. For i = 1,maxit (a) i = i + 1 (b) x˜ = x˜ +d i i−1 i (c) exit if c(x˜ ) < γ i c (d) for nonlinear models, invoke Armijo line search if g(x˜ ) > γ i ls (1) search for satisfactory λ 6= 1 i (2) x˜ = x˜ +λ d i i−1 i i (3) exit if c(x˜ ) < γ i c (e) compute J if g(x˜ ) > γ ; else J = J i i j i i−1 (f) d = −J−1F(x˜ ) i+1 i i A.2a The E-Newton Jacobian The E-Newton RE Jacobian (J), which is an mTxmT matrix, is composed of m2 mxm submatrices. J ··· J 11 1m J = . . . ... . . . (A6) J m1 ··· J mm
32 Submatrix J contains all of the derivatives that involve expectations conjp stant j and expectations error p. ∂ep,1 ··· ∂ep,1 ∂xj,1 ∂xj,T J jp = . . . ... . . . (A7) ∂ep,T ··· ∂ep,T ∂xj,1 ∂xj,T The term ∂ep,l denotes the derivative of expectations error p in period l with ∂xj,k respect to expectations constant j in period k. Impulse-response (IR) simulations are used to calculate the Jacobian. Each IR simulation, which perturbs a particular x , provides the informaj,k tion needed to fill a single column of J. The number of IR simulations executed each time the Jacobian is calculated depends on the setting of the “jmeth” option. When “jmeth=every”, all mT IR simulations are run. The setting “jmeth=linear” runs the 2m IR simulations needed to construct the Jacobian of a standard, linear RE model. For use with mildly nonlinear models, the setting “jmeth=interp(r)” computes an approximate Jacobian from m(p+1) IR simulations, where p is the integer part of (T +1)/r. The “jmeth=linear” option takes into account the repetitive structure of the Jacobian of a standard linear RE model. When RE expectations are based on information at time t for variables at t+1, each Jacobian submatrix J has the following simple form. jp Q (T−1)xT J = (A8) jp 0 φ 1x(T−1) 1x1 Q is a matrix in which every element along each diagonal has the same value, 0isvectorofzeros,andφequals1.0ifj = pandzerootherwise. TheJacobian shown in equation 12 is a concrete example of this simple form. Only two perturbation simulations are needed per RE variable to construct all the
33 Jacobian submatrices. To see this, recall that each simulation fills a matrix column. Thus, a pair of simulations in which an expectations constant is separately shocked in periods 1 and T is sufficient to determine all elements of Q. The “jmeth=interp(r)” option takes advantage of the property of mildly nonlinear RE models that their Jacobian submatrices have diagonals whose elements tend to move relatively smoothly across time from northwest to southeast. This permits an approximate Jacobian to be constructed by interpolating many of its elements. Let the columns of the submatrix J be jp denoted by J1 ,J2 ,...,JT. The “interp(r)” option runs impulse-response jp jp jp simulations to calculate J1 ,J1+r,J1+2r,..., and calculates the other columns jp jp jp by linear interpolation along the diagonals. Because the interpolation direction is diagonal, there are triangles of elements at the top and bottom of the interpolated columns for which only one adjacent impulse-response column is available. For these triangular regions, the single available impulse-response value is repeated along the diagonal. A.3 The E-QNewton algorithm Broyden’s quasi-Newton method uses an approximation of the Jacobian, B, which is updated along with the tentative solution each iteration. x˜ = x˜ −λ B−1F(x˜ ) (A9) i i−1 i i−1 i−1 z −B s B = B + i i−1 i s′ (A10) i i−1 s′s i i i In (A10), s = x˜ − x˜ denotes the change to the tentative solution for x˜ i i i−1 and z = F(x˜ ) − F(x˜ ) the resulting change in F(x˜). The revision to B i i i−1 depends on the “surprise” to z as given by (z −B s ), a term that would be i i−1 i zero if B were the true Jacobian in the neighborhood of x˜ . It is easily i−1 i−1
34 verified that replacing B with the updated Jacobian approximation, B , i−1 i would have set the ”surprise” to zero. A.3a Derivation of The E-QNewton formulas TheE-QNewtonalgorithmreducescomputationalcoststhroughtwomodifications to (A9) and (A10). First, given that the solution for x˜ in (A9) depends on the inverse of the approximate Jacobian, each iteration can be made more efficient by applying the Sherman-Morrison formula to rewrite (A10) in terms of this inverse. B−1 = [I −w v′]B−1 (A11) i i i i−1 Vectors u, v, and w are defined as follows. z −B s i i−1 i u = (A12) i ||s || i s i v = (A13) i ||s || i B−1u w = i−1 i (A14) i 1+v′B−1u i i−1 i The step norm is defined as ||s || = s′s . i i i The second modification, which isqtaken from Kelley (1995, chapters 7-8), expresses the product B−1F(x˜ ) as a sequence of vector operations applied i i to B−1F(x˜ ). The derivation of this modification starts by (tediously) trans- 0 i forming the expression for w in (A14) into one in which w depends only on i i the step s and step length λ and not on the approximate Jacobian. λ w = i (−s /λ +(λ−1 −1)s ) (A15) i ||s || i+1 i+1 i i i
35 Now, take the definition of the Broyden direction (A16), substitute for B−1 i using (A11) to get (A17), and substitute for w using (A15), substitute for i v using (A13), and rearrange terms to get (A18). i d = −B−1F(x˜ ) (A16) i+1 i i = −[I −w v′]B−1F(x˜ ) (A17) i i i−1 i ||s ||2q −(1−λ )(s′q )s = − i i i i i i (A18) ||s ||2 −λ s′q i i i i The vector q in (A18) is defined as follows: i q = −B−1F(x˜ ) (A19) i i−1 i = −Πi−1 I −w v′ B−1F(x˜ ) (A20) j=1 j j 0 i h i Finally, simplify the matrix operations in (A20) by computing q with the i following loop, after inserting the definitions of w and v . j j 1. Initialize : q = −B−1F(x˜ ) i 0 i (A21) 2. For j = 2,i q = q +( λj−1s +(λ −1)s )(s′ q )/||s ||2 i i λj j j−1 j−1 j−1 i j−1 The key E-QNewton formulas are given by (A18) and (A21). Their use requires only the inverse of the initial approximate Jacobian (B−1) and the 0 sequence of steps {s } and step lengths {λ } from prior iterations. The dei i crease in simulation time obtained when solutions are based on these formulas, rather than on (A11) and (A16), can be substantial. The E-QNewton
36 Table A1: Comparison of Alternative Broyden Formulas Simulation Time in Seconds Model Experiment 1 Experiment 2 E-QNewton Broyden E-QNewton Broyden SW-small .5 1.9 1.1 3.1 SW-full .7 2.1 1.3 3.3 EDO 5.4 179.3 9.0 211.4 FRB/US-mcap 2.1 3.4 2.8 4.1 FRB/US-all 9.0 68.9 13.4 95.8 simulation times in table A1 are taken from the B = Bbd columns of ta- 0 0 bles 2 and 3, except for FRB/US-mcap, whose times are from the B = I 0 columns. The simulation times in the “Broyden” columns are from comparable experiments in which the solutions are based on code that uses (A11) and (A16). A.3b The E-QNewton estimate of B 0 The standard recommendation is to use the identity matrix as the initial approximate Jacobian, B , as this further simplifies the loop of commands 0 (A21) that computes q . For most of the models studied here, however, a i better choice is a block-diagonal matrix, Bbd, that adds modestly to the cost 0 of computing q each iteration but speeds up convergence by substantially i reducing the number of solution iterations. ¯ J 11 Bbd = ... (A22) 0 ¯ J mm ¯ Each (mxm) diagonal block, J , in turn contains only two non-zero values, ii which are repeated along the main diagonal and first super-diagonal. These
37 are the elements of the RE Jacobian that are usually largest in absolute value. տ տ տ δ(1) kk J ¯ = δ(0) ց (A23) kk kk ց The estimates of δ(0) and δ(1) are taken from an impulse-response simkk kk ulation that perturbs the expectations constant k at the date closest to the midpoint of the simulation period (i.e., T/2 if T is an even number, and (T − 1)/2 otherwise). Construction of the block-diagonal matrix Bbd re- 0 quires m impulse response simulations (one per MC variable). Its inverse can be easily computed and stored in m (TxT) matrices. A.3c The E-QNewton step length The non-monotone line-search procedure of La Cruz, Martinez, and Raydan (LMR, 2005) is used to choose the step length when the E-QNewton algorithm is used to solve nonlinear RE models.26 Define n(x˜ ) as the sum i of squared expectations errors and recall that g(x˜ ) is the ratio of this sum i of squares in two successive iterations. In the LMR procedure, line search is not started, or terminates if already started, when g(x˜ ) is less than the sum i of three terms. n(x˜ ) n(x˜ )/i2 g(x˜ ) < (1−γ λ2)+ max i−j −1 + 0 (A24) i ls i 1≤j≤M"n(x˜ i−1 ) # n(x˜ i−1 ) The two parameter in this expression are set as follows: M=4, γ =1e-04. ls 26The LMR procedure is a key part of the DF-SANE equation-solving algorithm, an approach that is not based on Newton or quasi-Newton methods and does not start with a step length of one. For use with the E-QNewton algorithm, the LMR formulas are modified to incorporate an initial step length of one.
38 The first term on the right hand side of (A24), which is always less than 1.0 at the selected value of γ , by itself leads the procedure to try to find ls a step length that reduces the sum of squared expectations errors from one iteration to the next. The remaining terms act to moderate this requirement. The second term allows the {n(x˜ )} sequence to be locally increasing without i invoking line search. The third term, whose value goes to zero in the limit, preventsrisingvaluesofn(x˜ )fromstartinglinesearchintheinitialiterations i of a solution. The E-QNewton algorithm starts each iteration with λ = 1 and x˜ = i i d +x˜ . The expression for d is given in (A18). If condition (A24) fails to i i−1 i hold, the LMR procedure starts and computes a smaller positive step length (λ+). i τ λ+ if α < τ λ+ min i i min i λ+ = τ λ+ if α > τ λ+ (A25) i max i i max i α otherwise i where λ+2 n(x˜ ) α = i i−1 (A26) i n(x˜ +λ+d )+(2λ+ −1)n(x˜ ) i−1 i i i i−1 When λ+ appears on the right hand side of either (A25) or (A26), it denotes i thevaluethatwasassignedtothatparameterpriortothecurrentstep-length iteration, whichis1.0initially. Ifcondition(A24)issatisfiedatthenewvalue of λ+, line search terminates. If the condition is not satisfied, a negative step i length (λ−) is computed using a set of analogous formulas. In this case, the i value of λ− that appears on the right hand side of the formulas is initially i set to -1.0. If condition (A24) is satisfied at the new value of λ−, line search i terminates. If the condition is not satisfied, line search continues with ever smaller values of λ+ and λ− until the condition is satisfied or the maximum i i
39 number of line-search iterations is reached. The simulation experiments set τ =.5 and τ =.1. max min
40 References Armstrong, John, RichardBlack, DouglasLaxton, andDavidRose, 1998. “A Robust Method for Simulating Forward-Looking Models,” Journal of Economic Dynamics and Control, 22, 489-501. Anderson, Gary and George Moore. 1985, “A Linear Algebraic Procedurefor Solving Linear Perfect Foresight Models,” Economics Letters, 17, 247-252. Blanchard, Olivier and Charles Kahn. 1980, “The Solution of Linear Difference Models Under Rational Expectations,” Econometrica, 48: 1305-1311. Broyden,C.G.1965,“AClassofMethodsforSolvingNonlinearSimultaneous Equations,” Mathematics of Computation, 19: 577-593. Brayton, Flint, Andrew Levin, Ralph Tryon, and John C. Williams. 1997a, “The Evolution of Macro Models at the Federal Reserve Board,” Carnegie- Rochester Conference Series on Public Policy, 47: 43-81. Brayton, Flint, Eileen Mauskopf, Dave Reifschneider, Peter Tinsley, and John Williams. 1997b, “The Role of Expectations in the FRB/US Macroeconomic Model,” Federal Reserve Bulletin, April: 227-245. (http://www.federalreserve.gov/pubs/bulletin/1997/199704lead.pdf) Brayton, Flint and Peter Tinsley. 1996, “A Guide to FRB/US: A Macroeconomic Model of the United States,” Finance and Economics Discussion Series 1996-42, Federal Reserve Board, Washington, DC. (http://www.federalreserve.gov/pubs/feds/1996/199642/199642pap.pdf) Chung, Hess, Michael Kiley, and Jean-Phillipe Laforte. 2010, “Documentation of the Estimated, Dynamic, Optimization-Based (EDO) Model of the U.S. Economy: 2010 Version,” Finance and Economics Discussion Series 2010-29, Federal Reserve Board, Washington, DC. (http://www.federalreserve.gov/pubs/feds/2010/201029/201029pap.pdf) Edge, Rochelle, Michael Kiley and Jean-Phillipe Laforte. 2008, “An Estimated DSGE Model of the U.S. Economy with an Application to Natural Rate Measures,” Journal of Economic Dynamics and Control, 32(8): 2512- 2535.
41 Fair, Ray and John Taylor. 1983, “Solution and Maximum Likelihood Estimation of Dynamic Rational Expectations Models,” Econometrica, 51(4): 1169-1185. Fisher, Paul. 1992, Rational Expectations in Macroeconomic Models, Kluwer Academic Publishers, London. Fisher, P.G. and A.J. Hughes Hallett. 1988, “Efficient Solution Techniques for Linear and Non-Linear Rational Expectations Models,” Journal of Economic Dynamics and Control, 12: 635-657. Gay, David. 1979, “Some Convergence Properties of Broyden’s Method,” SIAM Journal of Numerical Analysis, 16: 623-630. Hall, S.G. 1985, “On the Solution of Large Economic Models with Consistent Expectations,” Bulletin of Economic Research, 37: 157-161. Hollinger, Peter. 1996, “The Stacked-Time Simulator in TROLL: A Robust Algorithm for Solving Forward-Looking Models,” Second International Conference on Computing in Economics and Finance, Society of Computational Economics. (http://www.unige.ch/ce/ce96/ps/hollinge.eps) Holly, S. and M.B. Zarrop. 1983, “On Optimality and Time Consistency When Expectations are Rational,” European Economic Review, 20: 23-40. Juillard, Michel. 1996, “DYNARE: A Program for the Resolution and Simulation of Dynamic Models with Forward Variables Through the Use of a Relaxation Algorithm,” CEPREMAP Working Paper no. 9602, Paris, France. Juillard, Michel, Douglas Laxton, Peter McAdam, and Hope Pioro. 1999, “SolutionMethodsandNon-LinearForward-LookingModels,”inAnalyses in Macroeconomic Modelling, eds., Andrew Hughes Hallett and Peter McAdam, Kluwer. Kelley, C.T. 1995, Iterative Methods for Linear and Nonlinear Equations, SIAM Publications, Philadelphia. La Cruz, W., J.M. Martinez, and M. Raydan. 2005, “Spectral Residual Method Without Gradient Information for Solving Large-Scale Nonlinear Systems of Equations,” Mathematics of Computation, 75: 1429-1448.
42 Sims, Christopher. 2001, “Solving Linear Rational Expectations Models,” Computational Economics, 20: 1-20. Smets, Frank and Rafael Wouters. 2007, “Shocks and Frictions in US Business Cycles: A Bayesian DSGE Approach,” American Economic Review, 97(3): 586-606.
Cite this document
Flint Brayton (2011). Two Practical Algorithms for Solving Rational Expectations Models (FEDS 2011-44). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2011-44
@techreport{wtfs_feds_2011_44,
author = {Flint Brayton},
title = {Two Practical Algorithms for Solving Rational Expectations Models},
type = {Finance and Economics Discussion Series},
number = {2011-44},
institution = {Board of Governors of the Federal Reserve System},
year = {2011},
url = {https://whenthefedspeaks.com/doc/feds_2011-44},
abstract = {This paper describes the E-Newton and E-QNewton algorithms for solving rational expectations (RE) models. Both algorithms treat a model's RE terms as exogenous variables whose values are iteratively updated until they (hopefully) satisfy the RE requirement. In E-Newton, the updates are based on Newton's method; E-QNewton uses an efficient form of Broyden's quasi-Newton method. The paper shows that the algorithms are reliable, fast enough for practical use on a mid-range PC, and simple enough that their implementation does not require highly specialized software. The evaluation of the algorithms is based on experiments with three well-known macro models--the Smets-Wouters (SW) model, EDO, and FRB/US--using code written in EViews, a general-purpose, easy-to-use software package. The models are either linear (SW and EDO) or mildly nonlinear (FRB/US). A test of the robustness of the algorithms in the presence of substantial nonlinearity is based on modified versions of each model that include a smoothed form of the constraint that the short-term rate of interest cannot fall below zero. In two single-simulation experiments with the standard and modified versions of the models, E-QNewton is found to be faster than E-Newton, except for solutions of small-to-medium sized linear models. In a multi-simulation experiment using the standard versions of the models, E-Newton dominates E-QNewton.},
}