ifdp · May 7, 2019

Asset Price Learning and Optimal Monetary Policy

Abstract

We characterize optimal monetary policy when agents learn about endogenous asset prices. Learning leads to inefficient asset price fluctuations and distortions in consumption and investment decisions. We find that the policy-relevant natural real interest rate increases with subjective asset price beliefs. Optimal monetary policy therefore raises interest rates when expected capital gains are high. When the asset is not in fixed supply, optimal policy also "leans against the wind". In a simple calibration of the model, a positive response to capital gains in simple interest rate rules is beneficial. Our results are robust to alternative belief specifications.

K.7 Asset Price Learning and Optimal Monetary Policy Caines, Colin and Fabian Winkler Please cite paper as: Caines, Colin and Fabian Winkler (2019). Asset Price Learning and Optimal Monetary Policy. International Finance Discussion Papers 1236r. https://doi.org/10.17016/IFDP.2018.1236r International Finance Discussion Papers Board of Governors of the Federal Reserve System Number 1236r May 2019

Board of Governors of the Federal Reserve System
International Finance Discussion Papers
Number 1236r
August 2018 (revised: April 2019)

Asset Price Learning and Optimal Monetary Policy
Colin Caines and Fabian Winkler

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors. Recent IFDPs are available on the Web at www.federalreserve.gov/pubs/ifdp/. This paper can be downloaded without charge from Social Science Research Network electronic library at www.ssrn.com.

Asset Price Learning And Optimal Monetary Policy

Colin Caines∗ & Fabian Winkler† §

Keywords: Optimal Monetary Policy, Asset Prices, Natural Real Interest Rate, Learning, Leaning Against The Wind

JEL classifications: E44, E52

∗ The author is a staff economist in the Division of International Finance, Board of Governors of the Federal Reserve System, Washington, D.C. 20551 U.S.A. The email address of the author is colin.c.caines@frb.gov.

† The author is a staff economist in the Division of Monetary Affairs, Board of Governors of the Federal Reserve System, Washington, D.C. 20551 U.S.A. The email address of the author is fabian.winkler@frb.gov.

§ The views in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or of any other person associated with the Federal Reserve System.
The authors would like to thank Klaus Adam, Paul Beaudry, Bill Branch, Chris Gust, Damjan Pfajfar, Chris Erceg, Kevin Lansing, Thomas Mertens, Robert Tetlow, and seminar participants at the Chicago Fed, the San Francisco Fed, Drexel University, UBC, UC Irvine, the 2018 Midwest Macro conference, the 2018 Canadian Economics Association Meetings, the 2018 Econometric Society Summer Meeting, the 2018 conference on “Expectations in Dynamic Macroeconomic Models” in Birmingham, and the EABCN conference on “Asset Prices and the Macroeconomy” in Mannheim for helpful comments.

1 Introduction

The question of how, if at all, monetary policy should react to asset prices remains controversial. Some argue that asset price misalignments can pose significant risks to macroeconomic and financial stability, and that monetary policy should raise interest rates when asset prices are high; others argue that monetary policy should not pay attention at all to asset prices, or at most in order to improve forecasts of inflation and economic activity.1

Any answer to this question depends on what is assumed about the sources of asset price fluctuations. Are financial markets pricing assets efficiently, and if not, what is the nature of price misalignments? Standard macroeconomic models, including workhorse New-Keynesian models used for monetary policy analysis, embody the efficient market hypothesis that rules out asset price misalignments by design. Gali (2014, 2017) has added rational bubbles to these models, arguing for a negative reaction of interest rates to asset prices because rational bubbles grow more slowly at lower interest rates. Yet rational bubbles are not the only way through which asset prices can deviate from their “fundamental value”. An alternative narrative holds that investors’ expectations suffer from extrapolative bias and can suffer from bouts of over- and underconfidence which affect prices. This narrative can be formalized through models of learning, which are a plausible explanation for many well-known asset price characteristics (Fuster et al. 2012; Collin-Dufresne et al. 2013; Adam et al. 2015; Barberis et al. 2015 for stock prices; Adam et al. 2012; Caines 2016; Glaeser and Nathanson 2017 for house prices). Importantly, these models can explain the systematic bias in return expectations observed in survey data, which is inconsistent with any rational expectations model (Greenwood and Shleifer, 2014). However, the effect of asset price learning on the conduct of optimal monetary policy has not been studied previously.
In this paper, we analytically solve for optimal monetary policy in a model with subjective, extrapolative beliefs about endogenous asset prices. The model is a simple New-Keynesian model, to which we add a long-term asset and learning about the equilibrium asset price. The learning process implies extrapolative expectations and endogenous boom-bust cycles in equilibrium price dynamics. We keep expectations close to rational by restricting them to be model-consistent conditional on subjective asset price beliefs (Winkler, forthcoming).2 Agents remain forward-looking and correctly forecast the fundamental state of the economy.

1 See, for example, Bernanke and Gertler (2001), Gilchrist and Leahy (2002), Christiano et al. (2010), Filardo and Rungcharoenkitkul (2016) and Svensson (2017).

2 Conditionally model-consistent expectations require that beliefs be consistent with all equilibrium conditions other than asset market clearing. The concept has also been applied in Caines (2016) and Gandré (2017).

They also understand the monetary transmission mechanism and the policy strategy followed by the central bank, except for its effects on asset prices. Despite this restriction, fluctuations in subjective asset price expectations have real effects because they distort intertemporal choices of consumption and investment: Optimistic expectations create excess aggregate demand by increasing subjective wealth, and create excess asset production by increasing subjective returns on investment.

We show that the policy-relevant natural real rate of interest is not simply a function of technology and preferences, but depends positively on subjective asset price beliefs. The intuition for this result is simple: When agents expect larger capital gains on the asset, then the return on bonds must also rise for the bond market to clear, even if the capital gains expectations aren’t rational. Because realized asset prices in our model are also increasing in subjective beliefs, the model gives rise to a positive relationship between the level of asset prices and the natural real interest rate. In order to follow this natural rate, the central bank needs to raise interest rates when asset prices are high.

In terms of target criteria, flexible inflation targeting remains optimal under learning when the asset is in fixed supply. When we allow for production of the asset, the optimal policy instead “leans against the wind”: The central bank is willing to tolerate low inflation and output when agents are overly optimistic about asset prices, so as to mitigate investment distortions arising from subjective belief fluctuations. By contrast, flexible inflation targeting remains optimal under rational expectations.

Finally, we numerically evaluate simple Taylor-type interest rate rules with a reaction to asset prices. We find that a positive reaction to capital gains mitigates the distortions from non-rational beliefs and stabilizes asset price fluctuations.
Raising interest rates when subjective beliefs are overly optimistic reduces the price of our long-term asset, thereby correcting expectations downwards. This finding is in contrast to the rational bubble literature (Gali, 2014; Dong et al., 2017), where raising rates when bubbles are large makes them larger still.

Our baseline model assumes a particular process of expectation formation in which agents think that asset prices are a random walk with a small time-varying drift. This process has been shown to fit the data well (Adam et al., 2017), but our results are robust to alternative choices. We show how our results carry over to a more general class of subjective beliefs, including “natural expectations” (Fuster et al., 2012) and “diagnostic expectations” (Bordalo et al., 2018).

Another assumption we make is that expectations are model-consistent conditionally on asset price beliefs. This restriction greatly reduces the degrees of freedom for boundedly rational expectations. However, we do not mean to imply that belief distortions or learning

about other aspects of the economy, such as inflation or even monetary policy itself, do not matter. Rather, we see this assumption as a modeling device that allows us to isolate the effects of asset price learning from other such distortions. When we analyze optimal policy, we assume that the central bank has complete information and maximizes welfare under the equilibrium probability distribution of our model, which is distinct from the subjective distribution of boundedly rational agents. Effectively, the central bank makes its choices given its own view of the economy, not that of the private sector. In practice, central banks’ views on asset prices do indeed diverge at least sometimes from those of financial markets, as exemplified in Alan Greenspan’s “irrational exuberance” comments (Greenspan, 1996). If the central bank instead had the same beliefs as the private sector, the policy problem in our model would revert to that of a relatively standard, rational expectations New-Keynesian model, which has been extensively studied in the literature. In our model, subjective asset price beliefs have real effects because they affect consumption and investment through changes in subjective wealth and expected returns. We abstract from more complex transmission channels of asset prices, such as credit constraints and balance sheet effects. The advantage of this simplification is that we are able to obtain closed form solutions for optimal monetary policy in the presence of learning. Previous studies have argued for a monetary policy reaction to asset prices in environments without rational expectations, for example Dupor (2005) and Mertens (2011). In these studies, beliefs about the exogenous fundamentals of the economy are distorted, and affect welfare through an investment channel. To our knowledge, our analysis is the first in which beliefs about the asset price itself—an endogenous variable—are distorted. 
We find departures from the optimal policy under rational expectations even when there is no investment channel.

Our analysis is related to Adam and Woodford (2018), who study “robustly optimal policy” in a New-Keynesian model with housing quite similar to ours.3

3 One important difference is that utility is linear in asset holdings in their model. This effectively implies that wealth does not enter the model as a state variable. In our learning model, changes in subjective wealth are the key distortion induced by non-rational asset price expectations, and hence we cannot make the same assumption. Instead, we solve for optimal policy carrying endogenous state variables.

Like the papers previously mentioned, Adam and Woodford study distortions to beliefs about the exogenous fundamentals of the economy, while our agents learn about the endogenous asset price. But they also set themselves a different policy problem, in which the class of possible belief distortions is large and the policymaker does not know which of these distortions

is realized, thus limiting the policymaker’s ability to exploit distorted expectations to its advantage. In our paper, the central bank is certain of the belief distortions in the private sector. We also limit the degree to which it can exploit them, by ruling out that the central bank pursues a different policy targeting rule from what the private sector believes the targeting rule to be. Another difference is that Adam and Woodford find benefits of leaning against the wind only if the steady state is distorted in a particular direction. In contrast, our results are obtained around the fully efficient steady state.

We also complement a growing literature studying monetary policy prescriptions in models with learning. Fully optimal policy has recently been studied in a two-equation model with learning by Molnar and Santoro (2014) and Eusepi and Preston (2018). Eusepi et al. (2018) introduce drift in long-run expectations to a New Keynesian model and show that such beliefs introduce a policy tradeoff between stabilizing current inflation and anchoring long-horizon beliefs. In these papers, asset prices are absent and it is mostly learning about the inflation process that drives the dynamics of the model.4 Instead, we focus on non-rational asset price beliefs, while endowing agents with conditionally consistent beliefs about the rest of the economy.

Our analysis shares some common ground with previous studies on asset prices and monetary policy that stay within the paradigm of rational expectations. Christiano et al. (2010) study the optimal policy reaction to news shocks about future productivity. Good news raises both asset prices and the natural real rate of interest, so that monetary policy should optimally respond by raising interest rates. Our natural real rate is similarly increasing in subjective beliefs, but these beliefs evolve endogenously and do not rely on exogenous news shocks.
Gilchrist and Saito (2009) analyze simple interest rate rules in a credit friction model in which the private sector has limited information about the trend growth rate of technology. They find that a reaction to the growth rate of asset prices in the policy rule is beneficial, while a reaction to the level of asset prices is not. Even though we abstract from credit frictions, the same result obtains in our model, too. The reason is that a reaction to the growth rate of asset prices approximates return expectations that enter the natural real rate of interest.

4 Airaudo (2016) augments the standard New Keynesian model with a stock market and infinite-horizon learning as in Preston (2006) to study conditions under which the rational expectations equilibrium is learnable, but stops short of characterizing optimal policy.

The remainder of this paper is structured as follows. We begin by describing the baseline model in Section 2, and our notion of a learning equilibrium in Section 3. We characterize the linearized equilibrium under rational expectations and learning in Section 4. Optimal policy is analyzed in Section 5, while Section 6 discusses how well certain simple interest rate rules approximate the optimal policy. Sections 7 and 8

discuss extensions to more general beliefs and asset production, respectively. Section 9 concludes.

2 Model description

Our model is an otherwise standard New-Keynesian model in which the representative household holds a stock of a long-term asset that yields utility. In the baseline version of the model, the supply of the asset is fixed. One can think of this asset as a stock of housing, but we will refer to it generically as a long-term asset.

A representative household supplies labor and owns firms. It can also hold nominal bonds promising a nominal return $i_t$. In addition, the household owns the long-term asset. The household’s problem is

\[
\max \; E^{\mathcal{P}} \sum_{t=0}^{\infty} \beta^t \left( \frac{C_t^{1-\gamma}}{1-\gamma} - \frac{N_t^{1+\phi}}{1+\phi} + \chi \frac{H_t^{1-\theta}}{1-\theta} \right)
\]
\[
\text{s.t.} \quad C_t = W_t N_t + \Pi_t + T_t - Q_t (H_t - H_{t-1}) + B_t - \frac{1+i_{t-1}}{1+\pi_t} B_{t-1}.
\]

Here, $C_t$ is the household’s consumption of final goods, $N_t$ is the household’s labor supply, and $T_t$ are lump-sum taxes. $\Pi_t$ are the profits received from firms. The quantity of the asset owned by the household is denoted $H_t$ and trades at the price $Q_t$. $B_t$ are government bonds, which are in zero net supply. The price level is $P_t$ and $\pi_t = P_t / P_{t-1} - 1$ is the inflation rate. The expectational operator $E^{\mathcal{P}}$ has a superscript indicating that agents’ expectations are evaluated under a subjective probability measure $\mathcal{P}$.

The first order conditions are:

\[
W_t = C_t^{\gamma} N_t^{\phi}
\]
\[
1 = \beta E_t^{\mathcal{P}} \left( \frac{C_t}{C_{t+1}} \right)^{\gamma} \frac{1+i_t}{1+\pi_{t+1}}
\]
\[
Q_t = \chi \frac{C_t^{\gamma}}{H_t^{\theta}} + \beta E_t^{\mathcal{P}} \left( \frac{C_t}{C_{t+1}} \right)^{\gamma} Q_{t+1}.
\]

On the production side, a representative intermediate goods producer transforms household labor into

intermediate goods using the decreasing returns to scale technology

\[
Y_t = A_t N_t^{\alpha}.
\]

It has to hire labor at the real wage rate $W_t$ and sells its goods at the real price $M_t$. Its first-order condition is

\[
W_t = \alpha M_t A_t N_t^{\alpha-1}.
\]

Intermediate goods are bought by wholesale firms indexed by $i \in [0,1]$, who transform them into differentiated wholesale goods using a one-for-one technology. They face a standard Dixit-Stiglitz demand function and a Calvo price setting friction. When producer $i$ is able to set a price $P_{it}$ for its output $Y_{it}$, it solves:

\[
\max_{P_{it}} \; E_t^{\mathcal{P}} \sum_{s=0}^{\infty} \left( \prod_{\tau=1}^{s} \xi \Lambda_{t,t+\tau} \right) \left( (1+\tau_t) P_{it} - M_{t+s} P_{t+s} \right) Y_{it+s}
\]
\[
\text{s.t.} \quad Y_{it} = \left( \frac{P_{it}}{P_t} \right)^{-\sigma} \tilde{Y}_t,
\]

where $\sigma$ is the demand elasticity of substitution between varieties, $\Lambda_{t,t+\tau} = \beta^{\tau} C_t^{\gamma} C_{t+\tau}^{-\gamma}$ is the household discount factor between times $t$ and $t+\tau$, and $\xi$ is the probability of not being able to adjust the price in the future. Any profits are distributed to households. The first-order conditions are standard and give rise to the New-Keynesian Phillips curve.

The term $\tau_t$ is a government subsidy to revenue. Its steady state value $\bar{\tau} = (\sigma-1)/\sigma$ is set so as to induce a zero steady-state markup, thus rendering the model’s steady state fully efficient. Time-varying shocks to the subsidy $\tau_t$ act as cost-push shocks that affect markups and inflation but leave the first-best allocation unchanged.

A representative retailer buys differentiated wholesale goods at prices $(P_{it})_{i \in [0,1]}$ and transforms them back into a homogeneous final consumption good. The final good sells at price $P_t$ and is produced according to the technology

\[
\tilde{Y}_t = \left( \int_0^1 (Y_{it})^{\frac{\sigma-1}{\sigma}} \, di \right)^{\frac{\sigma}{\sigma-1}}.
\]

The first order condition gives rise to the CES demand function above. The price level can be expressed as

\[
P_t = \int_0^1 P_{it} Y_{it} \, di \, / \, \tilde{Y}_t.
\]
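The consistency between the CES aggregator, the demand function, and the expenditure-based price level can be checked numerically. The following is a minimal sketch, not part of the paper: it approximates the unit interval of firms by a finite sample, and it assumes the standard CES price index $P_t = (\int_0^1 P_{it}^{1-\sigma} di)^{1/(1-\sigma)}$, which the text does not spell out. The function name `ces_check` and all numerical values are hypothetical.

```python
import numpy as np

def ces_check(prices, sigma, y_tilde):
    """Return (CES price index, aggregated output, expenditure-based price level).

    Integrals over i in [0, 1] are approximated by sample means over firms.
    """
    prices = np.asarray(prices, dtype=float)
    # Standard CES price index (assumption: not stated explicitly in the text).
    p_index = np.mean(prices ** (1.0 - sigma)) ** (1.0 / (1.0 - sigma))
    # Dixit-Stiglitz demand: Y_it = (P_it / P_t)^(-sigma) * Y~_t.
    y_i = (prices / p_index) ** (-sigma) * y_tilde
    # CES aggregator applied to the individual quantities.
    y_agg = np.mean(y_i ** ((sigma - 1.0) / sigma)) ** (sigma / (sigma - 1.0))
    # Expenditure-based price level: P_t = int P_it Y_it di / Y~_t.
    p_exp = np.mean(prices * y_i) / y_tilde
    return p_index, y_agg, p_exp

# Hypothetical cross-section of wholesale prices.
rng = np.random.default_rng(0)
prices = rng.uniform(0.9, 1.1, size=1000)
p_index, y_agg, p_exp = ces_check(prices, sigma=6.0, y_tilde=2.0)
print(p_index, y_agg, p_exp)
```

Under these assumptions the aggregated quantities reproduce $\tilde{Y}_t$, and the expenditure-based price level coincides with the CES index, which is why the two definitions of $P_t$ can be used interchangeably.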

The government transfers a lump-sum real amount to households,

\[
T_t = \tau_t \int_0^1 P_{it} Y_{it} \, di,
\]

to finance the subsidies to final goods producers and offset the tax on stock holdings. Profits and government transfers sum up to $\Pi_t + T_t = Y_t - W_t N_t$. Finally, the central bank sets the nominal interest rate, to be specified later.5

5 Throughout the paper, we will assume that the central bank specifies monetary policy to guarantee uniqueness and determinacy of the equilibrium. We also abstract from the zero lower bound on nominal interest rates.

Aggregate fluctuations are caused by productivity and, potentially, cost-push shocks, which follow first-order autoregressive processes:

\[
\log A_t = (1-\rho_a) \log \bar{A} + \rho_a \log A_{t-1} + \varepsilon_{at}
\]
\[
\tau_t = (1-\rho_\tau) \bar{\tau} + \rho_\tau \tau_{t-1} + \varepsilon_{\tau t}
\]

The innovations are independent white noise with variances $\sigma_A^2$ and $\sigma_\tau^2$.

Market clearing in the final goods market requires $\tilde{Y}_t = C_t$. Bonds are in zero net supply and the market clearing condition is therefore $B_t = 0$. Finally, the supply of the long-term asset is fixed at unity, so that asset market clearing requires $H_t = 1$.

3 Definition of equilibrium

Let us first recall the formal definition of a rational expectations (RE) equilibrium. Let $y_t \in \mathbb{R}^N$ denote the collection of all endogenous model variables (including prices, allocations, and strategies) and $u_t \in \mathbb{R}^M$ the collection of all exogenous model variables, i.e. the technology and the cost-push shock, which we call “fundamentals”. Stochastic processes for $y_t$ and $u_t$ are defined on the spaces $\Omega_y = \prod_{t=0}^{\infty} \mathbb{R}^N$ and $\Omega_u = \prod_{t=0}^{\infty} \mathbb{R}^M$, respectively. Further, denote by $\Omega_u^{(t)}$ the set of all possible histories of exogenous variables up to period $t$, and its elements by $u^{(t)} \in \Omega_u^{(t)}$. Finally, let $P_u$ denote the true probability measure for the exogenous variables defined on $(\Omega_u, S(\Omega_u))$, where $S(\cdot)$ is the Borel sigma algebra on a metric space. The topological support of $P_u$ is denoted by $\mathrm{supp}(P_u)$.

Definition 1. A rational expectations equilibrium is a sequence of mappings $g_t : \Omega_u^{(t)} \ni u^{(t)} \mapsto y_t \in \mathbb{R}^N$, $t = 0, 1, 2, \ldots$ such that, for all $t$ and $u^{(t)} \in \mathrm{supp}(P_u)$:

1. the choices contained in $y_t$ solve the time-$t$ decision problem of each agent in the economy, conditional on decision-relevant6 past and current outcomes contained in $u^{(t)}$ and $y^{(t)} = \left( g_0(u^{(0)}), \ldots, g_t(u^{(t)}) \right)$, and evaluating the probability of future external decision-relevant outcomes under the probability measure $P_{RE}$ implied by $P_u$ and the mappings $(g_t)_{t=0}^{\infty}$;

2. the allocations contained in $y_t = g_t(u^{(t)})$ clear all markets.

6 A variable is decision-relevant if it enters the agents’ decision problem, and a decision-relevant variable is external if its value is taken as given by the agent, while it is internal if the variable is part of the solution of the agents’ decision problem. For example, wholesalers need to get information on current and future aggregate demand $Y_t$ (decision-relevant and external) to set prices $P_{it}$ (decision-relevant and internal), while they do not need to forecast wages since their only production input is the intermediate good.

Under learning, agents are not endowed with knowledge of the equilibrium asset price process, i.e. the mapping of a history of fundamentals $u^{(t)}$ to prices $Q_t$. Instead, they use a simple subjective model to forecast asset prices. As we show in Section 7, this subjective belief system can be made quite general, but here we confine ourselves to our preferred specification, which follows Adam et al. (2017). Agents think that the asset price follows a simple random walk with a time-varying drift:

\[
\Delta \log Q_t = \hat{\mu}_{t-1} + z_t \tag{1}
\]
\[
\hat{\mu}_t = \rho_\mu \hat{\mu}_{t-1} + g z_t \tag{2}
\]

where $\hat{\mu}_t$ is the perceived trend price growth, $g$ is the learning gain, and $z_t$ is the subjective forecast error. Under $\mathcal{P}$, $z_t$ is normally distributed white noise with variance $\sigma_z^2$, independent of the other exogenous shocks. The belief $\hat{\mu}_t$ is updated in the direction of the last forecast error: When agents see asset prices rising faster than they expected, they will also expect them to rise by more in the future.7 In order to avoid complications arising from simultaneity in the determination of outcomes and beliefs, we follow Adam et al. (2012) and Caines (2016) and assume that in period $t$ agents make choices conditional on $\hat{\mu}_{t-1}$, and update their beliefs according to (1) at the end of the period.

7 The belief system above is equivalent to a belief that asset price growth is the sum of a temporary and a permanent component, both of which are unobserved. Bayesian updating of this belief leads to the equations above, with the learning gain $g$ representing the perceived ratio of the standard deviation of the permanent relative to the temporary component.

In order to determine expectations about the remaining variables of the model, we follow Winkler (forthcoming) in assuming that agents have so-called “conditionally model-consistent expectations”. This is a restriction on expectations that effectively allows us to isolate the effects of asset price learning from other

potential sources of learning in the economy. Let $(\Omega_z, S(\Omega_z), P_z)$ be the probability space that defines the subjective beliefs for $z_t$ (i.e., that the $z_t$ are iid normally distributed with mean zero and variance $\sigma_z^2$). Agents’ subjective beliefs depend on this perceived stochastic forecast error even though in equilibrium, model outcomes are a function only of fundamentals $u_t$. The subjective probability measure $\mathcal{P}$ is defined by a mapping from fundamentals $u_t$ and the subjective forecast error $z_t$ to model outcomes $y_t$.

Definition 2. Conditionally model-consistent expectations (CMCE) are a sequence of mappings $h_t : \Omega_u^{(t)} \times \Omega_z^{(t)} \ni (u^{(t)}, z^{(t)}) \mapsto y_t \in \mathbb{R}^N$, $t = 0, 1, 2, \ldots$ such that, for all $t$ and $(u^{(t)}, z^{(t)}) \in \mathrm{supp}(P_{u,z})$:

1. the choices contained in $y_t$ solve the time-$t$ decision problem of each agent in the economy, conditional on decision-relevant past and current outcomes contained in $u^{(t)}$ and $y^{(t)} = \left( h_0(u^{(0)}, z^{(0)}), \ldots, h_t(u^{(t)}, z^{(t)}) \right)$, and evaluating the probability of future decision-relevant outcomes under the probability measure $\mathcal{P}$ implied by $P_u \otimes P_z$ and the mappings $(h_t)_{t=0}^{\infty}$;

2. the allocations contained in $y_t = h_t(u^{(t)}, z^{(t)})$ clear all markets except the markets for assets and final consumption goods;

3. asset prices under $\mathcal{P}$ follow the law of motion given by (1)–(2).

The definition of the mappings $h_t$ is almost identical to that of a rational expectations equilibrium, except that asset market equilibrium is not part of the conditions, and instead the price $Q_t$ evolves according to subjective beliefs.8 Conditional model consistency restricts the subjective belief $\mathcal{P}$ to have the maximum degree of consistency with the model given agents’ misspecified beliefs about asset prices. We will call the mappings $h_t$ the subjective or perceived law of motion.

8 In order for asset market equilibrium to not enter beliefs, Walras’ law requires that at least two market clearing conditions be absent from agents’ information set. Consequently, in Definition 2 we impose that allocations under conditionally model-consistent beliefs do not clear the goods market. In Appendix D we discuss alternatives to these assumptions.

While demand for the long-term asset does not have to be equal to supply under $\mathcal{P}$, the market still has to clear in equilibrium:

Definition 3. An equilibrium with conditionally model-consistent expectations is a sequence of mappings $r_t : \Omega_u^{(t)} \ni u^{(t)} \mapsto z_t \in \mathbb{R}$ and $g_t : \Omega_u^{(t)} \ni u^{(t)} \mapsto y_t^* \in \mathbb{R}^N$, $t = 0, 1, 2, \ldots$ such that, for all $t$ and $u^{(t)} \in \mathrm{supp}(P_u)$:

1. $g_t(u^{(t)}) = h_t\left( u^{(t)}, \left( r_0(u^{(0)}), \ldots, r_t(u^{(t)}) \right) \right)$;

2. the allocations contained in $y_t^* = g_t(u^{(t)})$ clear the asset market.

The probability measure implied by $P_u$ and the mappings $(g_t)_{t=0}^{\infty}$ is denoted by $\mathbb{P}$.

Market clearing is brought about by finding the right value of the price $Q_t$ that clears the asset market. We will call the mappings $g_t$ the equilibrium or actual law of motion. To avoid confusion, we will use asterisks to denote the equilibrium stochastic processes $y_t^*$ defined by the mappings $g_t$, as opposed to the perceived processes $y_t$ defined by the mappings $h_t$. The equilibrium implies a particular path for the subjective asset price forecast error. In equilibrium, $z_t^*$ is a function of the states and the shocks of the model, while under $\mathcal{P}$, $z_t$ is perceived as an additional unforecastable exogenous disturbance. Because of this discrepancy, the subjective distribution $\mathcal{P}$ and the equilibrium distribution $\mathbb{P}$ are mutually singular ($\mathcal{P} \perp \mathbb{P}$).

Agents endowed with conditionally model-consistent expectations may not know the equilibrium mapping from fundamentals to asset prices, but their beliefs about the economy are correct conditional on their subjective view about the evolution of asset prices. This way of setting up expectations is very tractable and also allows us to transparently solve a linearized version of the model.

4 Linearized equilibrium

The analysis in this paper will focus entirely on a linearization of the model around its efficient steady state.

4.1 Rational expectations equilibrium

Under rational expectations, a standard log-linearization of the model yields:

\[
y_t = a_t + \alpha n_t \tag{3}
\]
\[
w_t = m_t + a_t - (1-\alpha) n_t \tag{4}
\]
\[
\pi_t = \beta E_t^{\mathcal{P}} \pi_{t+1} + \frac{(1-\xi)(1-\beta\xi)}{\xi} m_t + \eta_t \tag{5}
\]
\[
w_t = \gamma c_t + \phi n_t \tag{6}
\]
\[
i_t = \gamma \left( E_t^{\mathcal{P}} c_{t+1} - c_t \right) + E_t^{\mathcal{P}} \pi_{t+1} \tag{7}
\]
\[
q_t = \gamma c_t - (1-\beta) \theta h_t - \beta \gamma E_t^{\mathcal{P}} c_{t+1} + \beta E_t^{\mathcal{P}} q_{t+1} \tag{8}
\]

\[
c_t = y_t - \frac{\bar{Q}\bar{H}}{\bar{Y}} \Delta h_t \tag{9}
\]
\[
y_t = c_t \tag{10}
\]

Here, lower-case variables denote log-linearizations around the (zero-inflation) steady state, except for $i_t$, which is the difference of the nominal interest rate from its steady-state level, and $\eta_t = \frac{(1-\xi)(1-\beta\xi)}{\xi} (\bar{\tau} - \tau_t)$ is the cost-push shock process. The model still has to be closed with an equation describing monetary policy. The model is simply the textbook New-Keynesian model with an extra equation for the asset price $q_t$ in (8). However, the asset price is redundant for the model dynamics because the asset is in fixed supply.

An important special case of the model obtains when prices are fully flexible and there are no cost-push shocks ($\xi = 0$ and $\eta_t = 0$). In this case, the allocation under rational expectations equilibrium is first-best efficient everywhere regardless of monetary policy. Output and the real interest rate are given by:

\[
y_t^{n,RE} = \kappa_0 a_t \tag{11}
\]
\[
r_t^{n,RE} = -\gamma \kappa_0 (1-\rho_a) a_t \tag{12}
\]

where $\kappa_0 = (1+\phi)/(1+\phi-\alpha+\alpha\gamma)$. These quantities are called the natural level of output and the natural real rate, respectively.

The equilibrium with sticky prices can be expressed in terms of the deviation from this efficient equilibrium. To this end, denote the output gap by $\hat{y}_t = y_t - y_t^{n,RE}$. The sticky price equilibrium can be summarized with a Phillips curve, an IS curve, a relation between marginal costs and the output gap, and an asset pricing equation:

\[
\pi_t = \beta E_t \pi_{t+1} + \frac{(1-\xi)(1-\beta\xi)}{\xi} m_t + \eta_t \tag{13}
\]
\[
m_t = \frac{1+\phi-\alpha+\alpha\gamma}{\alpha} \hat{y}_t \tag{14}
\]
\[
E_t \hat{y}_{t+1} - \hat{y}_t = \frac{1}{\gamma} \left( i_t - E_t \pi_{t+1} - r_t^{n,RE} \right). \tag{15}
\]

4.2 Learning equilibrium

Following Section 3, we compute the learning equilibrium in two steps. First, we solve for agents’ subjective law of motion given their beliefs $\mathcal{P}$. To do this, we take the system of equations (3)–(10), but replace the

goods market clearing condition (10) with the subjective law of motion for asset prices from (1)–(2):

\[
\Delta q_t = \hat{\mu}_{t-1} + z_t \tag{16}
\]
\[
\hat{\mu}_t = \rho_\mu \hat{\mu}_{t-1} + g z_t. \tag{17}
\]

This subjective law of motion is a forward-looking model that is straightforward to solve. However, there is now an additional shock, the asset price forecast error $z_t$, that is absent under RE. This forecast error will be predictable in equilibrium, but under $\mathcal{P}$ agents believe it to be unforecastable. Just as under RE, one still needs to add an equation describing monetary policy.

In the second step, we compute the equilibrium. To avoid confusion, we will denote with asterisks the equilibrium law of motion. We impose market clearing in the asset market:

\[
h_t^* = 0.
\]

This equation implicitly defines the equilibrium law of motion of the asset price $q_t^*$, of the forecast error $z_t^*$, as well as of all other equilibrium outcomes. We solve separately for the flexible- and the sticky-price equilibrium.

4.2.1 Flexible prices

We begin with the flexible price equilibrium ($\xi = 0$ and $\eta_t = 0$). We first find the subjective law of motion by solving (3)–(10). The learning model has two additional state variables compared to its RE counterpart, $q_t$ and $\hat{\mu}_{t-1}$. We guess and verify that the asset demand function has the following form:

\[
h_t^n = k_a a_t + k_h h_{t-1}^n - k_q q_t + k_\mu \hat{\mu}_{t-1} \tag{18}
\]

where the coefficients satisfy $k_h \in (0,1)$ and $k_a, k_q, k_\mu > 0$. Exact expressions are in the appendix. Asset demand under learning is increasing in productivity, decreasing in the asset price, and increasing in expectations of future capital gains.

We can also solve for the values of output and the real interest rate under flexible prices. We write

output and the real rate in deviation from their RE counterpart:

$$y^n_t = y^{n,RE}_t + \frac{\alpha\gamma\kappa_1}{1+\phi-\alpha}\left(k_a a_t - (1-k_h)h^n_{t-1} - k_q q_t + k_\mu\hat{\mu}_{t-1}\right) \tag{19}$$

$$r^n_t = r^{n,RE}_t + \gamma\kappa_1\left(k_a(2-\rho_a-k_h)a_t - (1-k_h)^2 h^n_{t-1} - k_q(1-k_h)q_t\right) + \gamma\kappa_1\left((2-\rho_\mu-k_h)k_\mu + k_q\right)\hat{\mu}_{t-1} \tag{20}$$

where $\kappa_1 = \frac{1+\phi-\alpha}{1+\phi-\alpha(1-\gamma)}\frac{\bar{Q}\bar{H}}{\bar{Y}} > 0$. The natural rate is increasing in the price growth belief $\hat{\mu}_{t-1}$.

Agents’ subjective expectations about output under flexible prices are affected by the choice of asset holdings (which are not constant in agents’ minds). An increase in expected capital gains will increase asset demand, and households will increase their labor supply in order to finance their purchase of the asset, thereby increasing the level of output.

The natural real rate under subjective expectations can be understood by the arbitrage relationship between the return on the long-term asset and the return on bonds. Combining the two asset pricing equations (7) and (8), we obtain:

$$r^n_t = \frac{1-\beta}{\beta}\left(\gamma c_t - \theta h_t - q_t\right) + E^{\mathcal{P}}_t\Delta q_{t+1}$$

Up to first order, the expected return on the two assets has to be equal. An increase in expected capital gains $E^{\mathcal{P}}_t\Delta q_{t+1} = \hat{\mu}_{t-1}$ increases the expected return to the asset, and the real interest rate on bonds therefore has to rise as well.

To find the flexible-price equilibrium under learning, i.e. the actual law of motion, one has to impose $h^*_t = 0$. From the asset demand function (18), one can then immediately solve for the equilibrium asset price and the realization of the subjective forecast error:

$$q^*_t = \frac{k_a a_t + k_\mu\hat{\mu}^*_{t-1}}{k_q}. \tag{21}$$

That is, the equilibrium asset price is increasing in both productivity and capital gains expectations. This is intuitive. The demand function (18) is downward-sloping, and so an increase in demand due to either higher productivity (i.e. higher income) or higher expected capital gains has to be met with an increase in the price to bring about equilibrium in the asset market.
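The self-referential mechanism can be illustrated with a short simulation. The sketch below iterates the equilibrium price (21) together with the belief recursion (16)–(17) after a one-time productivity shock. It is only an illustration under stated assumptions: the coefficient values for `k_a`, `k_mu`, `k_q`, `rho_mu` and `g` are hypothetical placeholders (the exact expressions are in the appendix and are not reproduced here), productivity is assumed to decay geometrically at rate `rho_a`, and the economy is assumed to start in steady state.

```python
def simulate_flex_price_learning(k_a=0.5, k_mu=2.0, k_q=1.0, rho_a=0.8,
                                 rho_mu=0.99, g=0.05, periods=40, shock=1.0):
    """Iterate q*_t and mu_hat*_t after a productivity shock at t = 0.

    All default coefficient values are hypothetical placeholders chosen
    for illustration; they are not the calibrated values of the paper.
    """
    q_path, mu_path = [], []
    q_prev, mu_prev = 0.0, 0.0  # assumed steady-state starting point
    for t in range(periods):
        a_t = shock * rho_a ** t                    # assumed geometric decay of a_t
        q_t = (k_a * a_t + k_mu * mu_prev) / k_q    # equilibrium price, eq. (21)
        z_t = (q_t - q_prev) - mu_prev              # realized forecast error, from (16)
        mu_t = rho_mu * mu_prev + g * z_t           # belief update, eq. (17)
        q_path.append(q_t)
        mu_path.append(mu_t)
        q_prev, mu_prev = q_t, mu_t
    return q_path, mu_path


if __name__ == "__main__":
    q_learn, mu_learn = simulate_flex_price_learning()
    q_fixed, _ = simulate_flex_price_learning(g=0.0)  # beliefs never move
    for t in range(5):
        print(t, round(q_learn[t], 4), round(q_fixed[t], 4), round(mu_learn[t], 4))
```

Setting `g=0.0` switches off belief updating, which gives a simple benchmark against which the extra propagation through beliefs can be compared. This benchmark is not the rational expectations solution, whose price function is not given in this section.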

Substituting the equilibrium price (21) into Equations (19) and (20), we obtain the realized level of output and the real rate under flexible prices:

$$y^*_t = y^{n,RE}_t \tag{22}$$

$$r^*_t = r^{n,RE}_t + \gamma\kappa_1\left((1-\rho_a)k_a a_t + \left((1-\rho_\mu)k_\mu + k_q\right)\hat{\mu}^*_{t-1}\right). \tag{23}$$

Under learning and flexible prices, the equilibrium level of output is the same as under RE. This coincidence arises because, under flexible prices, output is determined entirely by intratemporal conditions that are independent of expectations. Nonetheless, the real interest rate does depend on expectations, and its natural level under learning is therefore different from rational expectations. In particular, it is increasing in subjectively expected asset price growth.

4.2.2 Sticky prices

With sticky prices, the subjective law of motion can be expressed in deviation from the flexible price allocation, just as under rational expectations. We will use tildes to denote “perceived” gaps, e.g. $\tilde{h}_t = h_t - h^n_t$ denotes the difference of asset holdings from their flexible price level under $\mathcal{P}$. The sticky price perceived law of motion (PLM) can be summarized with the equations:

$$\pi_t = \beta E^{\mathcal{P}}_t\pi_{t+1} + \frac{(1-\xi)(1-\beta\xi)}{\xi}m_t + \eta_t \tag{24}$$

$$m_t = \frac{1+\phi-\alpha+\alpha\gamma}{\alpha}\left(\tilde{c}_t + \kappa_1\Delta\tilde{h}_t\right) \tag{25}$$

$$\tilde{c}_t = E^{\mathcal{P}}_t\tilde{c}_{t+1} - \frac{1}{\gamma}\left(i_t - E^{\mathcal{P}}_t\pi_{t+1} - r^n_t\right) \tag{26}$$

$$\tilde{h}_t = \frac{\gamma}{\theta(1-\beta)}\left(\tilde{c}_t - \beta E^{\mathcal{P}}_t\tilde{c}_{t+1}\right). \tag{27}$$

The first equation is the familiar Phillips curve, and the second equation relates marginal costs to gaps in consumption $\tilde{c}_t$ and in asset investment $\Delta\tilde{h}_t$. The investment gap appears because, for a given level of consumption, higher asset purchases have to be financed out of additional income from production, and marginal costs are increasing in the level of production. The third equation is an IS equation written in terms of the consumption gap (which under subjective expectations does not equal the output gap). The last equation is the Euler equation for asset demand, rewritten in gap form.

Notice that the asset price $q_t$ itself does not appear in its own Euler equation when it is written in terms of gaps, because agents perceive $q_t$ as an exogenous process, independent of the degree of price stickiness. The asset price still implicitly enters equation (27) through the natural rate $r^n_t$.

To find the actual law of motion under sticky prices, one imposes $h^*_t = 0$ and solves for $q^*_t$. The equilibrium depends crucially on the conduct of monetary policy, which we have not yet specified.

4.3 Numerical illustration

We illustrate the properties of the learning model using a simple calibration in which we interpret the long-term asset as housing. We set the labor share in output equal to $\alpha = 0.7$ and the discount factor $\beta$ equal to 0.995. The coefficient of relative risk aversion is set to $\gamma = 1.39$ (Gandelman and Hernández-Murillo, 2014) and the inverse Frisch elasticity of labor supply is set to $\phi = 0.33$. The utility scaling parameter $\chi$ is set to 0.01005 in order to achieve a steady state ratio of asset wealth to output of $\bar{Q}\bar{H}/\bar{Y} = 2.01$, which corresponds to the US ratio of real estate holdings over GDP in 2016. The degree of price stickiness is set to a standard value of $\xi = 0.75$, implying an average price duration of four quarters. The elasticity of substitution $\sigma$ does not affect the first-order dynamics of the model and we set it such that the relative weight on inflation in the loss function (29) below equals $\lambda_\pi = 1$. We follow Billi (2017) and set the autocorrelation of both the technology and cost-push shocks to 0.8. Finally, we calibrate the remaining four parameters $(\sigma_A, \sigma_\tau, \theta, g)$ to jointly match the volatilities of output growth $\sigma(\Delta y_t) = 0.64\%$, inflation $\sigma(\pi_t) = 0.82\%$, house price growth $\sigma(\Delta q_t) = 1.51\%$ and real wage growth $\sigma(\Delta w_t) = 0.10\%$, under the assumption that monetary policy follows the commonly used Taylor rule

$$i_t = 1.5\pi_t + 0.125\left(y_t - y^{n,RE}_t\right). \tag{28}$$

The resulting parameter values are $\sigma_A = 0.83\%$, $\sigma_\tau = 0.75\%$, $\theta = 0.0068$ and $g = 0.0041$.

We first document the effect of learning under flexible prices. Learning has no effect on equilibrium allocations relative to rational expectations, but manifests itself in the asset price and the natural real interest rate. Figure 1 plots the response of these two variables to a technology shock under rational expectations and learning for a range of values of the learning gain $g$.

Figure 1: Effect of learning under flexible prices. Panels: response of $q$ to $\epsilon_a$; response of $r$ to $\epsilon_a$ (horizontal axis: periods, 0 to 40). Series: Learning, $g = 0.008$; Learning, calibration; Learning, $g = 0.002$; Rational Expectations. Note: Response to a one standard deviation positive technology shock $\varepsilon_{At}$. Log percentage points. Flexible prices and zero inflation. For learning cases, outcomes are plotted for the equilibrium law of motion.

The effect of learning on $q_t$ is typical for self-referential asset price learning models. Initially, the asset price rises on impact because higher wage income raises asset demand, as under rational expectations. But the initial increase now causes a subsequent revision in beliefs $\hat{\mu}_t$ through the learning mechanism. The household believes that the shock has some long-run impact on capital gains and responds by increasing its asset demand above the RE demand. This response drives a further increase in $q_t$ in the next period and the shock continues to propagate through belief updating thereafter. At some point, expected price growth has risen so much that it outstrips realized price growth. At this point, beliefs $\hat{\mu}_t$ decrease, bringing about a reduction in asset demand and therefore in equilibrium asset prices, so that the process eventually reverts back to steady state. The strength of these dynamics depends crucially on the learning gain $g$.

The differing response of the real interest rate between the learning and RE environments directly shows the effect of expected capital gains on the natural rate of interest. Even though the impulse response of realized consumption is exactly identical under learning and RE, what matters for the interest rate is expected consumption growth. Under learning, increases in expected capital gains in the periods following the shock also increase expected consumption growth, and therefore real interest rates, as agents anticipate selling some of their asset holdings in the future to profit from the capital gains.
Optimistic capital gains expectations thus imply a higher real rate of interest than under rational expectations.
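As a worked arithmetic check, the calibrated values stated in this section can be plugged into the closed-form coefficients of the rational expectations benchmark: $\kappa_0$ from (11)–(12), the Phillips curve slope on marginal cost from (13), and the marginal cost loading on the output gap from (14). The sketch below is illustrative only; the function name and dictionary keys are hypothetical, and $\rho_a = 0.8$ is taken from the stated autocorrelation of the technology shock.

```python
def rational_expectations_coefficients(alpha=0.7, beta=0.995, gamma=1.39,
                                       phi=0.33, xi=0.75, rho_a=0.8):
    """Evaluate closed-form coefficients from (11)-(14) at the calibration."""
    denom = 1 + phi - alpha + alpha * gamma
    kappa_0 = (1 + phi) / denom                     # natural output loading on a_t, eq. (11)
    rn_re_loading = -gamma * kappa_0 * (1 - rho_a)  # natural rate loading on a_t, eq. (12)
    pc_slope_mc = (1 - xi) * (1 - beta * xi) / xi   # slope on marginal cost, eq. (13)
    mc_loading = denom / alpha                      # marginal cost per unit of output gap, eq. (14)
    return {
        "kappa_0": kappa_0,
        "rn_re_loading": rn_re_loading,
        "pc_slope_mc": pc_slope_mc,
        "pc_slope_gap": pc_slope_mc * mc_loading,   # implied slope on the output gap
        "avg_price_duration": 1 / (1 - xi),         # in quarters
    }


if __name__ == "__main__":
    for name, value in rational_expectations_coefficients().items():
        print(f"{name}: {value:.4f}")
```

The negative natural rate loading reflects (12): with mean-reverting productivity, a positive technology shock lowers the natural real rate under rational expectations.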

5 Optimal Policy

5.1 Welfare function

We provide second-order approximations to the expected discounted sum of utility in our model. Under learning, we will assume that the policymaker maximizes welfare under the equilibrium law of motion, applying the loss function $L_t$. If, instead, welfare were maximized under the subjective law of motion of the agents, then the policy problem could be solved as a standard rational expectations problem, which has been studied extensively in the literature.

Welfare under the equilibrium law of motion is proportional to $-\sum_{t=0}^{\infty} E_0 L_t$ up to second order and terms independent of policy. The period loss function is given by

$$L_t = \lambda_\pi(\pi^*_t)^2 + (\hat{y}^*_t)^2 \tag{29}$$

where $\hat{y}_t = y_t - y^{n,RE}_t$ is the deviation of equilibrium output from its flexible-price level, and $\lambda_\pi > 0$ is a function of the structural model parameters (see the appendix). This loss function is the same as in the standard, rational expectations New Keynesian model. It penalizes deviations of inflation from zero as well as deviations of output from its natural level under rational expectations (11). This natural level of output is first-best efficient.

5.2 Optimal policy without cost-push shocks

We now solve for the optimal monetary policy for the case in which there are no cost-push shocks. For exposition, we start by reviewing the optimal policy under rational expectations. As the flexible price equilibrium under RE is first-best efficient, monetary policy is optimal if it manages to replicate the flexible price allocation in the presence of nominal rigidities. This amounts to closing the output gap and completely stabilizing inflation at the same time, as can be seen from the loss function (29). Without cost-push shocks, the “divine coincidence” holds and complete stabilization is achievable: The optimal policy implements $\pi_t = 0$ and the Phillips curve (13) then immediately implies $\hat{y}_t = 0$. The optimal policy can be implemented with the following rule:

$$i_t = r^{n,RE}_t + \phi_\pi\pi_t$$

where $\phi_\pi > 1$. The interest rate has to track the natural real rate and react more than one-for-one to inflation, i.e. satisfy the Taylor principle.

Under learning, we will see that the first-best allocation is also feasible by pursuing strict inflation targeting, which is therefore the optimal policy.

Proposition 1. It is optimal for monetary policy under learning to implement $\pi_t = 0$, regardless of whether welfare is evaluated under the actual or the perceived law of motion. The optimal policy can be implemented with the rule

$$i_t = r^{n,PLM}_t + \phi_\pi\pi_t$$

where $\phi_\pi > 1$.

Proof. See the appendix.

The intuition for this result is relatively simple: When the central bank tracks the appropriate natural real rate of interest and there are no cost-push shocks, it implements the flexible price allocation. But we have just seen that the equilibrium under flexible prices is identical to that under RE, and is therefore first-best. Moreover, the flexible-price allocation is also first-best under the subjective law of motion of agents. Therefore, for the case without cost-push shocks and a fixed asset supply, there is no tension between the central bank’s optimal policy and what agents would perceive to be optimal from a subjective perspective.

It might seem at first that the optimal policy prescriptions are unchanged by the presence of learning, because the prescription of strict inflation targeting is unchanged. But the implementation of this target requires a different reaction function under learning. The nominal interest rate has to track the subjective natural real interest rate $r^n_t$, which is different from the $r^{n,RE}_t$ under rational expectations.⁹ Whereas $r^{n,RE}_t$ is a function of productivity $a_t$ only, $r^n_t$ depends additionally on beliefs $\hat{\mu}_t$, prices $q_t$ and the asset holdings $h_{t-1}$. In particular, the real rate rises when expected price growth $\hat{\mu}_t$ increases. In equilibrium, the asset price $q_t$ depends positively on expected price growth, and therefore the central bank has to set higher interest rates when asset prices, or subjective capital gains expectations, are high. Under rational expectations, such a reaction is not necessary.

⁹ The process $r^n_t$ is also different, from the perspective of agents, from the equilibrium realization $r^*_t$ in (23). The appendix contains a discussion of this difference and how it matters for the setting of interest rates.

5.3 Optimal policy with cost-push shocks

The presence of cost-push shocks breaks the “divine coincidence” under rational expectations, so that the first-best allocation is not feasible. Under learning, there exists in principle a non-linear policy that, by exploiting agents’ misperceptions, still achieves the first-best allocation:

Proposition 2. With cost-push shocks and learning, it is possible to close the inflation and the output gaps ($\pi^*_t = \hat{y}^*_t = 0$) with the non-linear targeting rule $\pi_t = -\eta_t/(\beta\rho_\eta) + b_t z_t$, where $b_t$ is a state-dependent coefficient.

Proof. See the appendix.

This result relies on the manipulation of beliefs by the central bank.¹⁰ From the Phillips curve (24), it is clear that inflation and the output gap can only be zero if expected future inflation and the cost-push shock offset each other: $\beta E^{\mathcal{P}}_t\pi_{t+1} + \eta_t = 0$. Under rational expectations, this is infeasible because zero inflation will also imply zero expected inflation. But under learning, this need not be the case. In fact, the central bank can separately control inflation and inflation expectations, if it is able to make the private sector believe that part of its actions are random when in fact they are not. This is what the policy in the proposition above accomplishes: The term $b_t$ is set such that agents perceive it as independent from $z_t$, so that $b_t z_t$ appears as a random, zero-mean shock to inflation to them, while in equilibrium it will be set in a highly systematic manner so as to set equilibrium inflation to zero.

Because it implements the first-best allocation, this policy is obviously optimal under learning. But it uses an extreme degree of belief manipulation that is particularly vulnerable to the Lucas critique. Why should agents continue to forever expect inflation when all they ever observe is complete price stability?
It is plausible that the very fact that the central bank is trying to exploit a certain expectational bias will change the nature of this bias or even eliminate it. One way out of this problem is to explicitly model interactions between the nature of agents’ belief distortions and the central bank’s policy, as in Woodford (2010). Instead, we will stick with the particular belief distortion we study, but limit the degree to which it can be exploited by the central bank.

¹⁰ The possibility of such an outcome was anticipated by Woodford (2010) who speculated that one “might even conclude that the optimal policy under learning achieves an outcome better than any possible rational-expectations equilibrium, by inducing systematic forecasting errors of a kind that happen to serve the central bank’s stabilization objectives”.

We limit the class of policies considered to those in which the amount of the central bank’s manipulation of beliefs is limited, in the following sense: The target criterion for inflation that the central bank pursues

in equilibrium has to be the same as the criterion that agents perceive to be pursued under their subjective beliefs. Formally, we require that $\pi_t$ be a function only of the fundamental shocks $u^{(t)}$, but not of the perceived asset price forecast errors $z^{(t)}$; or, equivalently:

$$\pi_t = \pi^*_t \quad \mathcal{P}\text{-almost surely.} \tag{30}$$

This condition means that agents effectively have rational expectations for inflation.¹¹ This condition is not a restriction on the behavior of agents, who form beliefs under the same learning scheme as before, but a restriction on the set of policies that the central bank can pursue. Policies for which the private sector’s beliefs of inflation are not fully model-consistent, i.e. different from the actual inflation target of the central bank, are ruled out.

The class of policies satisfying condition (30) is large, and includes all possible target criteria under rational expectations, including the RE-optimal discretionary and commitment criteria. It also includes target criteria that are made contingent on equilibrium asset price realizations $q^*_t$. However, policies with a target that depends on outcomes contemplated by agents, but never realized in equilibrium, are not part of this class. In particular, then, the class excludes the policy described in Proposition 2.

We now solve for the optimal monetary policy within our restricted class of policies. We find that the best that monetary policy can achieve is to replicate the optimal outcomes under rational expectations:

Proposition 3. Consider the class of policies satisfying condition (30). Within this class:

1. The optimal target criterion under commitment (from the timeless perspective) is given by $\zeta p_t = -\Delta m_t$ for some $\zeta > 0$, where $p_t$ is the price level. The optimal commitment policy achieves the same equilibrium allocation as the commitment solution under rational expectations.

2. Similarly, the optimal target criterion under discretion is given by $\zeta\pi_t = -m_t$. The optimal discretionary policy achieves the same equilibrium allocation as the discretionary solution under rational expectations.

3. In both cases above, optimal policy can be implemented with an interest rule of the form $i_t = r^n_t + A(L)\eta_t + \phi_\pi\pi_t$, where $A(L)\eta_t$ is a lag polynomial in the cost-push shock $\eta_t$ and $\phi_\pi > 1$.

¹¹ Note that by definition of the equilibrium, it is always the case that $\pi_t = \pi^*_t$ a.s. (almost surely) under the equilibrium measure $P$, but the subjective measure $\mathcal{P}$ is not absolutely continuous with respect to $P$ (see Section 3).

Proof. See the appendix.

In sum, monetary policy can do no better under learning than under RE, and flexible inflation targeting remains the optimal policy even when asset price expectations are not rational. However, implementing flexible inflation targeting requires the central bank to track the subjective natural real rate $r^n_t$ (as in the previous case without cost-push shocks), which depends positively on subjective expectations of capital gains. Therefore, the optimal nominal interest rate has to react positively to asset prices.

Figure 2 compares the optimal policy response in the calibrated model to a cost-push shock under RE and learning. Comparing either optimal discretionary or commitment policies, the outcomes for inflation and the output gap under learning and under rational expectations are the same. However, the implementation of these outcomes requires different paths for the nominal interest rate. In all cases displayed, the cost-push shock reduces asset prices, as lower income reduces asset demand. The drop in asset prices is magnified under learning as subjective expectations become pessimistic. The natural rate under learning falls with subjective expectations. As a consequence, the optimal discretionary nominal interest rate is lower than under rational expectations. The same holds true under commitment after the first two periods, when the difference of the asset price response becomes sufficiently large.

6 Simple rules

Implementing optimal policy in the learning environment requires knowledge of the natural rate of interest under learning, which depends on subjective asset price beliefs. This requires that the central bank either directly observe these beliefs or infer them from the difference between the realized level and the efficient level of asset prices.
Either way, measurement of the relevant quantities is fraught with difficulty, which is in fact one of the most frequent arguments made against incorporating reactions to asset prices into monetary policy. In this section, we consider simple interest rate rules with a potential reaction to asset prices. These rules can be implemented without knowledge of subjective beliefs or asset price gaps, since they only depend on realized asset prices that are easy to measure and observable with high frequency.

We show that for our calibrated model, incorporating a positive reaction to asset price growth is desirable in terms of welfare. This is not a straightforward consequence of the optimal policy analysis in the previous section, because simple rules can be quite far from optimal. Broadly speaking, a rule reaction to asset prices will tend to be beneficial if periods of elevated asset prices coincide with excess aggregate demand under that particular rule. For our calibrated model and Taylor-type interest rate rules, that turns out to be the case.

Figure 2: Optimal policy and alternatives after a cost-push shock. Panels: response of $\pi$ to $\epsilon_\tau$; response of $\hat{y}$ to $\epsilon_\tau$; response of $q$ to $\epsilon_\tau$; response of $i$ to $\epsilon_\tau$ (horizontal axis: periods, 0 to 40). Series: Learning, Optimal Policy with Discretion; Learning, Optimal Policy with Commitment; RE, Optimal Policy with Discretion; RE, Optimal Policy with Commitment. Note: Response to a unit standard deviation positive cost-push shock $\varepsilon_{\tau t}$. Log percentage points. For learning cases, outcomes are plotted for the equilibrium law of motion. The elasticity of substitution $\sigma$ is chosen such that the welfare weight on inflation in the loss function $L$ equals $\lambda_\pi = 1$.

We re-compute the model under the assumption that the monetary authority is following a Taylor-type rule of the form:

$$i_t = \rho_i i_{t-1} + (1-\rho_i)\left(\phi_\pi\pi_t + \phi_y\hat{y}_t + \phi_q\sum_{s=0}^{\infty}\tilde{\omega}^s\Delta q_{t-s}\right). \tag{31}$$

The rule depends on inflation and the output gap, and has an additional term for asset prices: a moving average of past price changes, with a weight on past observations that decays at the rate $\tilde{\omega} \in (0,1)$.
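The infinite sum in rule (31) does not need to be stored explicitly: the geometrically weighted moving average of asset price growth satisfies a one-step recursion, so the rule can be evaluated from realized data alone. The sketch below illustrates this; the function name and argument names are hypothetical, and it assumes that asset price changes before the start of the sample are zero and that the initial interest rate deviation is `i_init`.

```python
def simple_rule_path(pi, y_gap, dq, rho_i, phi_pi, phi_y, phi_q, omega, i_init=0.0):
    """Interest rate path implied by rule (31) for given data series.

    pi, y_gap, dq are equal-length sequences of inflation, the output gap
    and asset price growth. Pre-sample asset price changes are assumed zero.
    """
    i_prev = i_init
    moving_avg = 0.0
    path = []
    for pi_t, y_t, dq_t in zip(pi, y_gap, dq):
        # m_t = dq_t + omega * m_{t-1} equals sum_{s>=0} omega^s * dq_{t-s}
        moving_avg = dq_t + omega * moving_avg
        target = phi_pi * pi_t + phi_y * y_t + phi_q * moving_avg
        i_t = rho_i * i_prev + (1 - rho_i) * target
        path.append(i_t)
        i_prev = i_t
    return path


if __name__ == "__main__":
    # A single unit increase in asset price growth, no inflation or output gap.
    print(simple_rule_path(pi=[0, 0, 0], y_gap=[0, 0, 0], dq=[1, 0, 0],
                           rho_i=0.0, phi_pi=1.5, phi_y=0.0, phi_q=1.0, omega=0.5))
```

With a weight close to one, the moving average puts substantial weight on asset price changes far in the past, which is why only the recursion, not the full history, is carried forward.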
In what follows, we keep the coefficient on inflation at $\phi_\pi = 1.5$ and find parameter combinations $(\rho_i, \phi_y, \phi_q, \tilde{\omega})$ that minimize the loss function (29).¹² We impose the constraint $0 \le \tilde{\omega} \le 0.999$. Table 1 summarizes the results.

¹² If one also optimizes over the coefficient on inflation, then the optimal policy under RE is given by $\phi_\pi \to \infty$ and $\phi_y/\phi_\pi \to \zeta > 0$ (Boehm and House, 2014). The outcomes of this limit policy are also attainable under learning with a similar policy that also responds infinitely strongly to inflation and the output gap. In this section, we rule out infinite rule coefficients by keeping the inflation coefficient fixed, and focus only on the tradeoff of reacting to the output gap and asset prices.

Table 1: Performance of optimized simple rules.

| Row | Policy | Optimized coefficients | $\sigma(\pi_t)$ | $\sigma(\hat{y}_t)$ | $\sigma(\Delta q_t)$ | $E[L]$ |
|---|---|---|---|---|---|---|
| | Rational Expectations | | | | | |
| (1) | Optimal Policy, Discretion | | 0.436 | 0.085 | 1.011 | 0.962 |
| (2) | Optimal Policy, Commitment | | 0.205 | 0.192 | 1.010 | 0.385 |
| (3) | $i_t = 1.5\pi_t + 0.125\hat{y}_t$ | | 0.349 | 0.451 | 0.801 | 1.583 |
| (4) | $i_t = \rho^*_i i_{t-1} + (1-\rho^*_i)(1.5\pi_t + \phi^*_y\hat{y}_t)$ | $\{\rho^*_i, \phi^*_y\} = \{0.681, 1.611\}$ | 0.362 | 0.221 | 0.776 | 0.838 |
| | Learning | | | | | |
| (5) | Optimal Policy, Discretion | | 0.436 | 0.085 | 1.724 | 0.962 |
| (6) | Optimal Policy, Commitment | | 0.204 | 0.193 | 1.724 | 0.385 |
| (7) | $i_t = 1.5\pi_t + 0.125\hat{y}_t$ | | 0.148 | 0.428 | 1.503 | 1.000 |
| (8) | $i_t = \rho^*_i i_{t-1} + (1-\rho^*_i)(1.5\pi_t + \phi^*_y\hat{y}_t)$ | $\{\rho^*_i, \phi^*_y\} = \{0.147, 0.106\}$ | 0.133 | 0.428 | 1.608 | 0.940 |
| (9) | $i_t = \rho^*_i i_{t-1} + (1-\rho^*_i)(1.5\pi_t + \phi^*_y\hat{y}_t + \phi^*_q\sum_{s=0}^{\infty}\tilde{\omega}^{*s}\Delta q_{t-s})$ | $\{\rho^*_i, \phi^*_y, \phi^*_q, \tilde{\omega}^*\} = \{0.302, 0.133, 0.025, 0.999\}$ | 0.088 | 0.418 | 1.541 | 0.850 |
| (10) | FM (2007) w/ asset price: $i_t = \rho^*_i i_{t-1} + (1-\rho^*_i)(1.5\pi_t + \phi^*_y\hat{y}_t + \phi^*_q q_t)$ | $\{\rho^*_i, \phi^*_y, \phi^*_q\} = \{0.301, 0.136, 0.025\}$ | 0.088 | 0.417 | 1.540 | 0.850 |
| (11) | $i_t = \rho^*_i i_{t-1} + (1-\rho^*_i)1.5\pi_t$ | $\rho^*_i = 0$ | 0.014 | 0.541 | 1.911 | 1.393 |
| (12) | BG (1999) w/ asset price: $i_t = \rho^*_i i_{t-1} + (1-\rho^*_i)(1.5\pi_t + \phi^*_q q_{t-1})$ | $\{\rho^*_i, \phi^*_q\} = \{0, -0.005\}$ | 0.050 | 0.536 | 1.842 | 1.380 |
| (13) | BG (2001) w/ asset price: $i_t = \rho^*_i i_{t-1} + (1-\rho^*_i)(r_{ss} + 1.5\pi_t + \phi^*_q q_t)$ | $\{\rho^*_i, \phi^*_q\} = \{0, 0.001\}$ | 0.012 | 0.541 | 1.911 | 1.393 |
| (14) | $i_t = \rho^*_i i_{t-1} + (1-\rho^*_i)(r_{ss} + 1.5\pi_t + \phi^*_q\sum_{s=0}^{\infty}\tilde{\omega}^{*s}\Delta q_{t-s})$ | $\{\rho^*_i, \phi^*_q, \tilde{\omega}^*\} = \{0, 0.024, 0.701\}$ | 0.078 | 0.518 | 1.707 | 1.310 |

Note: Losses are evaluated as the unconditional expectation of the loss function $L$. The welfare weight on inflation is set to $\lambda_\pi = 1$ and the losses are normalized to 1 for row (7).

The first four rows show results under rational expectations. Rows (1) and (2) show the optimal policy outcomes under discretion and commitment. Row (3) shows our baseline policy rule used to calibrate the model. Row (4) shows the optimized values of the coefficients on inertia $\rho_i$ and the output gap $\phi_y$, holding constant the inflation coefficient $\phi_\pi$. This rule is more aggressive than the standard Taylor rule and leads to welfare gains from output gap stabilization. Allowing for a non-zero asset price response in the optimization leads to an optimal coefficient of $\phi^*_q = 0$: There is no benefit from leaning against the wind under rational

expectations.

Under learning, the picture is quite different. Rows (5) and (6) show optimal policy outcomes, which differ from their RE counterparts only by a higher asset price volatility. Row (7) shows the baseline policy rule used for the calibration. Row (8) optimizes the output gap and inertia coefficients in rule (31), while Row (9) also optimizes the asset price reaction parameters $\phi_q$ and $\tilde{\omega}$. The optimal asset price response $\phi^*_q$ is positive. The optimal coefficient on the output gap is positive as well, and the optimal $\tilde{\omega}^*$ is set very close to one. With this value, the moving average of asset prices closely tracks the subjective belief $\hat{\mu}_t$, which itself is a moving average of past price changes. The reaction to the asset price stabilizes both inflation and the output gap, resulting in a 15 percent reduction in expected losses relative to the baseline rule in Row (7).

To get a better idea of the effects of monetary policy reactions to asset prices and the output gap under learning, we compute loss function values as well as the volatilities of inflation, the output gap and asset prices over a range of parameters for the rule in (31). We fix the moving average weight to $\tilde{\omega} = 0.9$, the interest rate inertia to $\rho_i = 0.147$ as in Row (8) of Table 1, and vary the magnitude of the response coefficients $\phi_y$ and $\phi_q$ on the output gap and asset price growth.¹³ Figure 3 contains the results as surface plots.

The effects of changes in the output gap coefficient are as expected: They lower the volatility of the output gap itself, but increase the volatility of inflation. This trade-off arises because the model has cost-push shocks in it. A reaction to the output gap also lowers asset price volatility in this model.

But the asset price coefficient also plays an important role. The volatility of asset prices is decreasing in the asset price response $\phi_q$. The volatility of the output gap is affected little by the asset price response, but the volatility of inflation is reduced significantly with $\phi_q > 0$. Therefore, the loss function is minimized at a strictly interior point at which the central bank reacts to both the output gap and asset price growth.

Importantly, a reaction to asset prices always decreases asset price volatility (regardless of the value of $\tilde{\omega}$). This is in stark contrast to the rational bubbles of Gali (2014, 2017). Rational bubbles grow at the rate of interest, and so raising rates when a bubble is growing makes it grow even faster, causing more volatility. By contrast, raising rates in our learning model has the effect of lowering the asset price today: A higher real rate requires a higher expected asset return. For a given expected capital gain $\hat{\mu}_t$, a higher return needs to be brought about by a lower price today. The reduction in the asset price today then reduces optimism about future price growth.

¹³ Our results are qualitatively robust to changes in the moving average weight $\tilde{\omega}$. In particular, a positive reaction to asset prices $\phi_q > 0$ always reduces asset price volatility.

Figure 3: Loss values and volatilities for different output gap and asset price coefficients. Panels: (a) Inflation volatility; (b) Output gap volatility; (c) Asset price volatility; (d) Loss function. Axes: $\phi_y$ and $\phi_q$. Note: Unconditional standard deviation of inflation $\pi^*_t$, asset price growth $\Delta q^*_t$ and output gap $\hat{y}^*_t$ under the equilibrium law of motion, and loss function $L$, as a function of $\phi_y$ and $\phi_q$. The parameter $\rho_i$ is kept at 0.147 as in Row (8) of Table 1 and $\tilde{\omega} = 0.9$ throughout. The welfare weight on inflation is set to $\lambda_\pi = 1$ and the losses are normalized as in Table 1. Red lines denote contour lines at the optimal coefficient $\phi_y$ with $\phi_q = 0$, i.e. the rule in Row (8) of Table 1. Black dots denote values attained under the optimal coefficients.

Table 1 also evaluates different forms of asset price reactions that have been proposed in the literature. In Row (10), we evaluate a rule which reacts to the level of asset prices.
Here, too, we obtain a positive optimal response coefficient on asset prices, in contrast to Faia and Monacelli (2007) who evaluated this type of rule in a rational expectations model with credit constraints. Rows (11) to (14) evaluate rules without an output gap reaction. Including a reaction to the level of asset prices, as proposed in Bernanke and Gertler (1999) and evaluated in Rows (12) and (13), does not enable the central bank to improve materially on the outcomes of a rule without such a reaction in Row (11). However, a positive reaction to a moving average of past asset price growth in Row (14) yields a small welfare gain.

7 Extension: General Asset Price Beliefs

We can show that our results derived so far are robust to a wide range of alternative specifications for agents’ subjective asset price beliefs. We replace the subjective law of motion for asset prices in (1)–(2) with a general belief of the form:

$$q_t = A(L)z_t + B(L)u_t$$

where $A$ and $B$ are arbitrary lag polynomials. Subjective beliefs can depend in an arbitrary way on the fundamental shocks $u_t$ (i.e. productivity and cost-push shocks) as well as a subjective forecast error $z_t$. This general form of beliefs encompasses rational expectations (for which $A = 0$ and $B$ represents the equilibrium process for asset prices), our baseline belief process (for which $B = 0$ and $A$ represents the subjective law of motion (1)–(2)), but also other behavioral expectations, such as the “natural expectations” of Fuster et al. (2012), the “diagnostic expectations” of Bordalo et al. (2018), general forms of extrapolation or attenuation bias, and more.

The only assumptions we do retain are that i) expectations are conditionally model-consistent in the sense of Definition 2; and ii) the subjective law of motion for $q_t$ is independent of policy. This second assumption is somewhat limiting, as it would be interesting to study how changes in policy can change the form of subjective asset price beliefs. However, we conjecture that an environment in which agents think that monetary policy is more powerful in shaping subjective asset price beliefs will provide an even stronger rationale for reacting to asset prices than the one we lay out here.

The appendix shows that with this general belief system, Propositions 1 through 3 continue to hold: The optimal policy under learning replicates the outcomes from the optimal policy under rational expectations, but has to be implemented by following the perceived natural real interest rate which is increasing in asset price expectations.
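To see how the baseline belief process fits the general form with $B = 0$, one can work out the moving-average coefficients of $A(L)$ implied by the recursion $\Delta q_t = \hat{\mu}_{t-1} + z_t$, $\hat{\mu}_t = \rho_\mu\hat{\mu}_{t-1} + g z_t$. Iterating backwards gives $A_0 = 1$ and $A_j = 1 + g\sum_{i=1}^{j}\rho_\mu^{i-1}$ for $j \ge 1$. This is a derivation sketch rather than a result quoted from the paper, and the function names and parameter values below are hypothetical; the code checks the closed form against a direct simulation of a unit forecast error.

```python
def baseline_belief_ma_coefficients(rho_mu, g, n):
    """First n coefficients A_0, ..., A_{n-1} of q_t = sum_j A_j z_{t-j}."""
    coeffs = [1.0]
    cumulative = 0.0
    for j in range(1, n):
        cumulative += rho_mu ** (j - 1)   # running sum of rho_mu^(i-1), i = 1..j
        coeffs.append(1.0 + g * cumulative)
    return coeffs


def simulate_unit_forecast_error(rho_mu, g, n):
    """Path of q_t under the subjective law of motion after z_0 = 1."""
    q, mu_prev, path = 0.0, 0.0, []
    for t in range(n):
        z_t = 1.0 if t == 0 else 0.0
        q += mu_prev + z_t                 # delta q_t = mu_hat_{t-1} + z_t
        mu_prev = rho_mu * mu_prev + g * z_t
        path.append(q)
    return path


if __name__ == "__main__":
    print(baseline_belief_ma_coefficients(rho_mu=0.5, g=0.1, n=4))
    print(simulate_unit_forecast_error(rho_mu=0.5, g=0.1, n=4))
```

The coefficients rise above one for $g > 0$, which expresses in moving-average form that a positive price surprise is extrapolated into further expected price increases.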
8 Extension: Asset Production

In the baseline model discussed up until now, learning causes distortions solely through wealth effects affecting aggregate consumption. Here, we extend the model to allow for the long-term asset to be produced instead of being in fixed supply. In this extension, asset price misalignments also distort investment decisions in addition to aggregate demand. This complicates the monetary policy tradeoff because learning now causes two distortions, one through aggregate demand and one through the misallocation of resources along the consumption-investment margin. We show that, in this case, the optimal policy target criterion under learning is no longer the same as under rational expectations. Instead, the optimal policy “leans against the wind” as defined by Svensson (2017): the central bank should tolerate low inflation at times when current or future expected asset prices are inefficiently high.

Relative to the baseline model, we now assume that the stock of the asset depreciates at the rate $\delta$. The representative household owns firms that can produce an amount $X_t$ of the asset from $K_t$ consumption goods. Their production function has decreasing returns to scale:

\[ X_t = A_h K_t^{\omega}. \]

Production takes place within one period. The profits of the investment firms (which accrue to households) are

\[ \Pi_t = Q_t X_t - K_t, \]

and profit maximization leads to the first-order condition

\[ X_t = A_h \left(\omega Q_t A_h\right)^{\frac{\omega}{1-\omega}}. \]

The budget constraint of the household now takes into account profits from asset producers and depreciation of the asset:

\[ C_t + Q_t\left(H_t - (1-\delta)H_{t-1}\right) + \frac{1+i_{t-1}}{1+\pi_t} B_{t-1} = W_t N_t + \Pi_t + T_t + B_t. \]

Market clearing in the asset market now requires

\[ H_t = (1-\delta)H_{t-1} + X_t. \tag{32} \]

The equilibrium is defined analogously to Section 3. Agents do not know the market clearing condition (32), but instead hold subjective beliefs that the asset price follows equations (1)–(2). Beliefs about the hidden state $\mu_t$ are updated using the Kalman filter as before, and expectations about the remaining equilibrium objects satisfy conditional model consistency as defined in Section 3.
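The asset producers' problem described above can be checked numerically. The sketch below solves $\max_K \; Q A_h K^{\omega} - K$ in closed form, which gives the input choice $K = (\omega Q A_h)^{1/(1-\omega)}$ and therefore a supply elasticity of $\omega/(1-\omega)$ with respect to the asset price. The function name and the values of $Q$ and $A_h$ are assumptions for illustration; $\omega = 0.547$ is the calibrated value reported in Section 8.4.

```python
def asset_supply(Q, A_h, omega):
    """Profit-maximizing input K, asset output X and profit for
    max_K  Q * A_h * K**omega - K   with 0 < omega < 1."""
    K = (omega * Q * A_h) ** (1.0 / (1.0 - omega))  # from the first-order condition
    X = A_h * K ** omega                            # production function
    profit = Q * X - K
    return K, X, profit


# Hypothetical price and productivity levels; omega from the calibration.
omega = 0.547
K1, X1, profit1 = asset_supply(Q=1.0, A_h=1.0, omega=omega)
K2, X2, profit2 = asset_supply(Q=1.1, A_h=1.0, omega=omega)

# Ratio of supplies equals the price ratio raised to omega / (1 - omega).
supply_ratio = X2 / X1
implied_ratio = 1.1 ** (omega / (1.0 - omega))
```

Because the elasticity $\omega/(1-\omega)$ exceeds one at the calibrated $\omega$, a given percentage asset price misalignment translates into a more than proportional misalignment in asset production.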

8.1 Flexible price equilibrium

The linearization of the model and the derivation of the flexible-price equilibrium under learning and RE are very similar to those in our baseline model and are relegated to the appendix. The only added complication is that the model now has one additional endogenous state variable, the asset quantity $h_{t-1}$.

The allocation under rational expectations and flexible prices is still first-best. In particular, the asset investment choice $x_t^{n,RE}$ is efficient. However, the learning equilibrium is not, because subjective asset price beliefs now distort investment decisions, so that $x_t^n \neq x_t^{n,RE}$. Consumption and output distortions under flexible prices are functions of the investment distortion:

\[
\begin{aligned}
c_t^n - c_t^{n,RE} &= -\kappa_1\left(x_t^n - x_t^{n,RE}\right) \\
y_t^n - y_t^{n,RE} &= \frac{\alpha\gamma}{1+\phi-\alpha}\,\kappa_1\left(x_t^n - x_t^{n,RE}\right)
\end{aligned}
\]

where the constant $\kappa_1$ is now defined as

\[ \kappa_1 = \frac{1+\phi-\alpha}{\frac{\bar C}{\bar Y}(1+\phi-\alpha)+\alpha\gamma}\,\frac{\delta \bar Q \bar H}{\bar Y}. \]

The real interest rate under learning and flexible prices can be expressed in deviation from its counterpart under rational expectations in the form

\[ r_t^n = r_t^{n,RE} - b_{h1} h_{t-1}^n - b_{h2}\left(h_{t-1}^n - h_{t-1}^{n,RE}\right) + b_a a_t - b_q q_t + b_\mu \hat\mu_{t-1}, \tag{33} \]

where $b_{h1}, b_{h2}, b_q, b_\mu > 0$. Closed-form expressions for the coefficients are given in the appendix. Most important for our analysis is that $b_\mu > 0$: the natural real rate of interest continues to be increasing in the asset price belief $\hat\mu_t$.

8.2 Welfare function

To evaluate different policies, we derive a quadratic approximation of the welfare function under the equilibrium probability measure $P$. Welfare is proportional (up to second order and terms independent of policy) to $-\sum_{t=0}^{\infty} E_0 L_t$, where the period loss function is given by

\[ L_t = \lambda_\pi (\pi_t^*)^2 + \left(\hat c_t^* + \kappa_1 \hat x_t^*\right)^2 + \lambda_h \left(\hat h_t^*\right)^2 + \lambda_x (\hat x_t^*)^2, \tag{34} \]

where $\lambda_\pi, \lambda_h, \lambda_x > 0$ are functions of the structural model parameters (see the appendix), and $\hat c_t = c_t - c_t^{n,RE}$, $\hat h_t = h_t - h_t^{n,RE}$, and $\hat x_t = x_t - x_t^{n,RE}$. As before, asterisks denote the process for a variable under the equilibrium law of motion, as opposed to the subjective law of motion under which agents make their decisions. Compared to the loss function in the baseline model, we now have to take into account variation in the asset stock $h_t$ that the household owns, as well as variations in asset investment $x_t$.

8.3 Optimal policy

As in the baseline model, we can write the perceived law of motion (PLM) under sticky prices in deviation from the flexible-price PLM:

\[
\begin{aligned}
\pi_t &= \beta E_t^{\mathcal{P}} \pi_{t+1} + \frac{(1-\xi)(1-\beta\xi)}{\xi}\, m_t + \eta_t && (35) \\
m_t &= \frac{\bar C(1+\phi-\alpha)+\bar Y\alpha\gamma}{\bar Y \alpha}\left(\tilde c_t + \kappa_1 \tilde x_t\right) && (36) \\
i_t &= \gamma\left(E_t^{\mathcal{P}} \tilde c_{t+1} - \tilde c_t\right) + E_t^{\mathcal{P}} \pi_{t+1} + r_t^n && (37) \\
\theta\left(1-\beta(1-\delta)\right)\tilde h_t &= \gamma \tilde c_t - \beta(1-\delta)\gamma E_t^{\mathcal{P}} \tilde c_{t+1}. && (38)
\end{aligned}
\]

Note that the asset price $q_t$ does not enter this system save through the dependence of the natural real rate $r_t^n$ in the IS curve (37). In fact, the system has the same form as in the baseline model with slightly different coefficients. However, our results on optimal policy are changed in this version with production, due to the fact that the flexible-price allocation under learning is no longer efficient.

As before, we restrict ourselves to the class of policies for which the target criterion for inflation is robust to the Lucas critique, in the sense that beliefs about the inflation objective coincide with the central bank’s actual inflation objective, $\pi_t = \pi_t^*$. Again, this class includes the optimal discretionary and commitment policies under rational expectations. Within this class, we are able to analyze the policy problem under learning as a recursive linear-quadratic problem.
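The source of the policy tradeoff can be seen directly from the period loss function (34). The sketch below evaluates the loss for given gaps; the function name, argument names and numerical inputs are hypothetical and chosen only for illustration. It shows that even when inflation is zero and the combined gap $\hat c_t + \kappa_1 \hat x_t$ (which is proportional to marginal cost) is closed, a nonzero investment gap still generates a positive loss.

```python
def period_loss(pi, c_gap, h_gap, x_gap, kappa_1, lam_pi=1.0, lam_h=1.0, lam_x=1.0):
    """Period loss (34):
    lam_pi*pi^2 + (c_gap + kappa_1*x_gap)^2 + lam_h*h_gap^2 + lam_x*x_gap^2.
    All gaps are deviations from the RE flexible-price allocation."""
    return (lam_pi * pi ** 2
            + (c_gap + kappa_1 * x_gap) ** 2
            + lam_h * h_gap ** 2
            + lam_x * x_gap ** 2)


# Hypothetical numbers: an investment gap of 2 with consumption adjusting so
# that c_gap + kappa_1 * x_gap = 0, zero inflation and a zero asset stock gap.
kappa_1 = 0.5
x_gap = 2.0
loss = period_loss(pi=0.0, c_gap=-kappa_1 * x_gap, h_gap=0.0, x_gap=x_gap, kappa_1=kappa_1)
```

With equal weights the remaining loss in this example comes entirely from the term $\lambda_x \hat x_t^2$, which is the distortion that strict inflation targeting cannot remove when the asset is produced.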
We find that, unlike in the baseline model, the optimal monetary policy under learning “leans against the wind” in the following sense:

Proposition 4. Consider the class of policies for which $\pi_t = \pi_t^*$ $P$-almost surely. Within this class, the optimal commitment policy takes the form

\[
\begin{aligned}
\zeta p_t &= -\left(m_t + f_t - (1-\delta) f_{t-1}\right) \\
f_t &= k_h f_{t-1} + c_{-1}\hat h_{t-1}^* + \sum_{s=0}^{\infty} c_s E_t \hat q_{t+s}^*
\end{aligned}
\]

for some $\zeta > 0$, where $p_t$ is the price level. If the gain parameter $g$ is sufficiently small, then the coefficients $c_{-1}$ and $(c_s)_{s=0}^{\infty}$ are all strictly positive.

Proof. See the appendix.

The proposition establishes that the optimal commitment¹⁴ policy “leans against the wind”: even when production is at its efficient level, i.e. the marginal cost deviation $m_t$ is zero, and the inherited asset stock $h_{t-1}$ is efficient, the welfare-maximizing central bank still wants to set inflation lower than its target if current or future expected asset prices are inefficiently high. Inefficiently high asset prices imply real distortions because they lead to over-investment. Low inflation mitigates over-investment because it induces lower output and higher real interest rates, both of which reduce asset demand.

The proposition comes with the qualification that the learning gain $g$ must be sufficiently small. For large values of $g$, we cannot ensure that the central bank always wants to lean against the wind. The reason is that large values of $g$ lead to oscillatory patterns in the response of asset prices to subjective return surprises, including those induced by monetary policy. While it will always be the case that tighter monetary policy will lead to lower asset prices and investment today, the endogenous belief dynamics of the model can lead to higher asset prices and investment in the future, rendering the effects of policy ambiguous. However, for low values of the gain $g$, tighter monetary policy will not increase asset prices at any future horizon.

8.4 Numerical illustration

We illustrate our optimal policy results with a simple calibration.
We take the same parameters as in the baseline model, and calibrate $\{\theta, g, \omega, \delta, \sigma_A, \sigma_p\}$ so as to match the volatilities of output growth, inflation, house price growth, real wage growth, residential investment, as well as the mean output share of residential investment, under the assumption that monetary policy follows the Taylor rule in (28). The resulting parameter values are $\theta = 0.081$, $g = 0.027$, $\omega = 0.547$, $\delta = 0.021$, $\sigma_A = 0.72\%$, and $\sigma_p = 0.90\%$. As in the baseline model, we set the elasticity of substitution $\sigma$ such that the welfare weight on inflation in the loss function $L$ equals $\lambda_\pi = 1$.

¹⁴ We can also numerically compute the optimal discretionary policy with the LQ-formulation given in the proof of the proposition, but the presence of several endogenous state variables prevents us from providing an analytical characterization.

In Figure 4, we compare the optimal commitment policies under RE and learning, as derived in Proposition 4.¹⁵ The figure shows impulse responses to a productivity shock.

Figure 4: Optimal commitment policy with asset production.
[Six panels showing the responses of $\pi$, $\hat c$, $\hat x$, $q$, $y$ and $i$ to $\epsilon_a$ over 40 periods, each for four cases: Learning, Std. Weights; Learning, Equal Weights; RE, Std. Weights; RE, Equal Weights.]

Note: Responses to a unit standard deviation positive technology shock $\varepsilon_{At}$ under sticky prices and with asset production. Log percentage points. In all cases, the elasticity of substitution $\sigma$ is chosen such that the welfare weight on inflation in the loss function $L$ equals $\lambda_\pi = 1$. For cases labeled “Equal Weights”, all loss function weights are set to $\lambda_\pi = \lambda_x = \lambda_h = 1$.
The optimal policy under rational expectations is simply to fully stabilize inflation with respect to the technology shock. This policy simultaneously closes all welfare-relevant gaps. Under learning, this outcome is infeasible because eliminating nominal rigidities does not also eliminate investment distortions from asset price learning.

The impulse responses under learning in Figure 4 illustrate how the central bank leans against the wind. The extent to which the central bank departs from inflation targeting depends on the parameters of the model. With our calibration and welfare-theoretic weights in the loss function, inflation barely deviates from zero in the top left panel of Figure 4. Consumption (top middle panel) is lower, and investment (top right panel) is higher than under RE because of the strong asset price response (bottom left panel) caused by subjective optimism. Higher investment also leads to higher output (bottom middle panel). The optimal nominal rate (bottom right panel) is set somewhat higher under learning than under RE, but this reaction does not eliminate the investment boom caused by optimistic asset price expectations.

¹⁵ The impulse response functions under discretion are qualitatively similar.

When we change the weights in the loss function to equal weights for inflation, consumption, asset holdings and investment, then leaning against the wind becomes much more pronounced. Inflation drops by a sizable amount and opens up a negative consumption gap. The investment gap stays positive but is much more muted. Likewise, the asset price response gets endogenously dampened by the optimal policy response. The central bank accepts lower output in exchange for preventing an investment boom. The real interest rate (not shown) is above the level under RE throughout the time period shown, although the nominal interest rate initially falls below the level under RE due to the fall in inflation.

In Table 2, we switch off cost-push shocks and compare strict inflation targeting to the optimal commitment policy. Under rational expectations (top panel), strict inflation targeting is optimal and all gaps are closed. Under learning, this is no longer the case. With standard loss function weights (middle panel), optimal policy reduces asset price and investment fluctuations somewhat at the expense of price stability, though the effect is quantitatively small. With equal weights (bottom panel), the central bank is willing to tolerate a much larger amount of variability in inflation in order to stabilize the economy along the asset investment margin.

Table 2: “Leaning against the wind” with asset production, no cost-push shocks.

                                σ(π_t)   σ(ŷ_t)   σ(ĥ_t)   σ(x̂_t)   σ(Δq_t)   E[L]
Rational Expectations
  π_t = 0                       0.000    0.000    0.000    0.000    0.858     0
Learning, standard weighting
  π_t = 0                       0.000    0.052    0.627    3.071    1.609     1.000*
  Opt. Commitment Policy        0.004    0.058    0.621    3.034    1.598     0.993*
Learning, equal weighting
  π_t = 0                       0.000    0.052    0.627    3.071    1.609     1.000†
  Opt. Commitment Policy        0.423    0.594    0.087    0.241    0.890     0.062†

*,† For each learning case (loss function with standard weights and loss function with equal weights), the loss is normalized to 1 for π_t = 0.

In Table 3, we compare outcomes for optimal policies under RE and learning alongside simple rules. The first three rows describe the optimal discretionary and commitment policies as well as the standard Taylor rule used in our calibration. Rows (4) and (5), which describe the corresponding optimal policies under learning, confirm that the central bank cannot attain the same outcomes, as the values of the loss function are higher than under RE. This stems from the fact that the larger asset price volatility translates into inefficient investment fluctuations that cannot be undone by monetary policy without hurting inflation and aggregate demand outcomes at the same time.

Table 3: Performance of optimized simple rules, with asset production.

Rational Expectations                σ(π_t)   σ(ĉ_t)   σ(ĥ_t)   σ(x̂_t)   σ(Δq_t)   E[L]
(1) Optimal Policy, Discretion       0.528    0.098    0.048    0.163    0.862     1.098
(2) Optimal Policy, Commitment       0.247    0.225    0.154    0.369    0.857     0.432
(3) i_t = 1.5π_t + 0.125ŷ_t          0.340    0.469    0.226    0.779    0.724     1.316

Learning                             σ(π_t)   σ(ĉ_t)   σ(ĥ_t)   σ(x̂_t)   σ(Δq_t)   E[L]
(4) Optimal Policy, Discretion       0.529    0.111    0.630    3.086    1.611     1.126
(5) Optimal Policy, Commitment       0.248    0.221    0.701    3.212    1.598     0.459
(6) i_t = 1.5π_t + 0.125ŷ_t          0.154    0.472    0.662    3.069    1.515     1.000
(7) Extended rule (see below)        0.156    0.431    0.509    2.456    1.366     0.765

Row (7): $i_t = \rho_i^* i_{t-1} + (1-\rho_i^*)\left(1.5\pi_t + \phi_y^* \hat y_t + \phi_q^* \sum_{s=0}^{\infty} (\omega^*)^s \Delta q_{t-s}\right)$ with $\{\rho_i^*, \phi_y^*, \phi_q^*, \omega^*\} = \{0.370, 0.232, 0.036, 0.999\}$.

Note: Losses are evaluated as the unconditional expectation of the loss function $L$. The welfare weight on inflation is set to $\lambda_\pi = 1$ and the losses are normalized to 1 for row (6).

We also evaluate simple interest rate rules, as in the baseline model. When we optimize coefficients on an extended interest rate rule of the form in (31), we find a positive coefficient on asset price growth in Row (7) of Table 3. All other results from Section 6 also carry over.

9 Conclusion

In this paper, we have characterized optimal monetary policy in a model in which agents are learning about asset prices. Our model is the standard New-Keynesian model with a long-term asset. Agents form expectations about asset prices in an extrapolative fashion. However, their expectations remain model-consistent, conditional on their beliefs about asset prices, which allows us to isolate the effects of learning about asset prices from the many other ways in which distorted beliefs can affect the economy.
Learning amplifies asset price fluctuations in the model, and leads to perceived wealth effects that create inefficient fluctuations in consumption, saving, and investment decisions.

We have given an analytical solution to the optimal policy with learning. Our central insight is that the natural real rate of interest under learning depends positively on asset price expectations and realized asset prices. In our baseline model, flexible inflation targeting remains the optimal target criterion for monetary policy, but it requires a very different implementation: the interest rate has to increase with asset prices and subjective expectations of future capital gains. When we extend the model to allow for asset production, it becomes beneficial to “lean against the wind”, i.e. tolerate low inflation when asset prices are high, in order to mitigate inefficient investment fluctuations. Our results are robust to a wide range of alternative belief specifications.

Our model is highly stylized, which allows us to derive many results analytically. However, the method for computing optimal policy that we present here is readily applicable to larger linear models with learning. Future work could evaluate our findings from a quantitative perspective.

References

Adam, K., Kuang, P. and Marcet, A. (2012). House price booms and the current account. NBER Macroeconomics Annual, 26 (1), 77–122.
Adam, K., Marcet, A. and Beutel, J. (2017). Stock price booms and expected capital gains. American Economic Review, 107 (8), 2352–2408.
Adam, K., Marcet, A. and Nicolini, J. P. (2015). Stock market volatility and learning. Journal of Finance.
Adam, K. and Woodford, M. (2018). Leaning Against Housing Prices as Robustly Optimal Monetary Policy. Working Paper 24629, NBER.
Airaudo, M. (2016). Monetary Policy and Asset Prices with Infinite-Horizon Learning. Tech. rep.
Barberis, N., Greenwood, R., Jin, L. and Shleifer, A. (2015). X-CAPM: An extrapolative capital asset pricing model. Journal of Financial Economics, 115 (1), 1–24.
Bernanke, B. S. and Gertler, M. (1999). Monetary policy and asset price volatility. Economic Review, Federal Reserve Bank of Kansas City, Q IV, 17–62.
Bernanke, B. S. and Gertler, M. (2001). Should central banks respond to movements in asset prices? American Economic Review, 91 (2), 253–257.
Billi, R. M. (2017). Price Level Targeting and Risk Management. Working Paper Series 302, Sveriges Riksbank.
Boehm, C. E. and House, C. L. (2014). Optimal Taylor Rules in New Keynesian Models. Working Paper 20237, NBER.
Bordalo, P., Gennaioli, N. and Shleifer, A. (2018). Diagnostic expectations and credit cycles. Journal of Finance, 73 (1), 199–227.
Caines, C. (2016). Can Learning Explain Boom-Bust Cycles In Asset Prices? An Application to the US Housing Boom. International Finance Discussion Papers 1181, Board of Governors of the Federal Reserve System (U.S.).
Christiano, L., Ilut, C., Motto, R. and Rostagno, M. (2010). Monetary policy and stock market booms. Proceedings, Jackson Hole Economic Policy Symposium, pp. 85–145.
Collin-Dufresne, P., Johannes, M. and Lochstoer, L. A. (2013). Parameter Learning in General Equilibrium: The Asset Pricing Implications. NBER Working Papers 19705, National Bureau of Economic Research, Inc.
Dong, F., Miao, J. and Wang, P. (2017). Asset Bubbles and Monetary Policy. Working paper.
Dupor, B. (2005). Stabilizing non-fundamental asset price movements under discretion and limited information. Journal of Monetary Economics, 52 (4), 727–747.
Eusepi, S., Giannoni, M. and Preston, B. (2018). Some implications of learning for price stability. European Economic Review, 106 (1), 1–20.
Eusepi, S. and Preston, B. (2018). The science of monetary policy: an imperfect knowledge perspective. Journal of Economic Literature, 56 (1), 3–59.
Faia, E. and Monacelli, T. (2007). Optimal interest rate rules, asset prices, and credit frictions. Journal of Economic Dynamics and Control, 31 (10), 3228–3254.
Filardo, A. and Rungcharoenkitkul, P. (2016). A quantitative case for leaning against the wind. BIS Working Papers 594, Bank for International Settlements.
Fuster, A., Hebert, B. and Laibson, D. (2012). Natural expectations, macroeconomic dynamics, and asset pricing. In D. Acemoglu and M. Woodford (eds.), NBER Macroeconomics Annual 2011, NBER Chapters, vol. 26, National Bureau of Economic Research, pp. 1–48.
Gali, J. (2014). Monetary policy and rational asset price bubbles. American Economic Review, 104 (3), 721–752.
Gali, J. (2017). Monetary Policy and Bubbles in a New Keynesian Model with Overlapping Generations. Working Paper 959, Barcelona GSE.
Gandelman, N. and Hernández-Murillo, R. (2014). Risk Aversion at the Country Level. Working paper series, Federal Reserve Bank of St. Louis.
Gandré, P. (2017). Learning, house prices and macro-financial linkages. Working paper.
Gilchrist, S. and Leahy, J. V. (2002). Monetary policy and asset prices. Journal of Monetary Economics, 49 (1), 75–97.
Gilchrist, S. and Saito, M. (2009). Expectations, asset prices, and monetary policy: The role of learning. In J. Y. Campbell (ed.), Asset Prices and Monetary Policy, University of Chicago Press.
Glaeser, E. L. and Nathanson, C. G. (2017). An extrapolative model of house price dynamics. Journal of Financial Economics.
Greenspan, A. (1996). Remarks at the Annual Dinner and Francis Boyer Lecture of the American Enterprise Institute for Public Policy Research, Washington, D.C.
Greenwood, R. and Shleifer, A. (2014). Expectations of returns and expected returns. Review of Financial Studies, 27 (3), 714–746.
Lubik, T. A. and Marzo, M. (2007). An inventory of simple monetary policy rules in a New Keynesian macroeconomic model. International Review of Economics & Finance, 16 (1), 15–36.
Mertens, T. M. (2011). Volatile Stock Markets: Equilibrium Computation and Policy Analysis. Working paper, Federal Reserve Bank of San Francisco.
Molnar, K. and Santoro, S. (2014). Optimal monetary policy when agents are learning. European Economic Review, 66, 39–62.
Preston, B. (2006). Adaptive learning, forecast-based instrument rules and monetary policy. Journal of Monetary Economics, 53 (3), 507–535.
Svensson, L. E. (2017). Cost-benefit analysis of leaning against the wind. Journal of Monetary Economics, 90, 193–213.
Winkler, F. (forthcoming). The role of learning for asset prices and business cycles. Journal of Monetary Economics.
Woodford, M. (2003). Interest and prices: Foundations of a theory of monetary policy. Princeton University Press.
Woodford, M. (2010). Robustly optimal monetary policy with near-rational expectations. American Economic Review, 100 (1), 274–303.

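A Details on the derivations

A.1 Asset demand in the model with fixed asset supply

We reduce the system (3)–(10) to the equations

\[
\begin{aligned}
q_t &= \gamma c_t - (1-\beta)\theta h_t - \beta\gamma E_t^{\mathcal{P}} c_{t+1} + \beta E_t^{\mathcal{P}} q_{t+1} \\
c_t &= \frac{1+\phi}{1+\phi-\alpha}\, a_t - \frac{\alpha\gamma}{1+\phi-\alpha}\, n_t - \frac{\bar Q \bar H}{\bar Y}\left(h_t - h_{t-1}\right) \\
q_t &= q_{t-1} + \hat\mu_{t-1} + z_t \\
\hat\mu_t &= \rho_\mu \hat\mu_{t-1} + g z_t
\end{aligned}
\]

and conjecture a solution for $h_t$ of the form in (18). Substituting the guess and solving for the coefficients yields:

\[
\begin{aligned}
k_h &= \frac{1}{2\beta}\left[1+\beta+\frac{\theta}{\gamma\kappa_1}(1-\beta) - \sqrt{\left(1+\beta+\frac{\theta}{\gamma\kappa_1}(1-\beta)\right)^2 - 4\beta}\,\right] \in (0,1) \\
k_a &= \frac{\kappa_0\left(1-\beta\rho_a\right)}{\kappa_1 + (1-\beta)\frac{\theta}{\gamma} + \beta\kappa_1\left(1-k_h-\rho_a\right)} > 0 \\
k_q &= \frac{1}{\gamma}\,\frac{1-\beta}{(1-\beta)\frac{\theta}{\gamma} + \kappa_1\left(1-\beta k_h\right)} > 0 \\
k_\mu &= \frac{1}{\gamma}\,\frac{\beta}{(1-\beta)\frac{\theta}{\gamma} + \kappa_1\left(1-\beta\rho_\mu\right) + \kappa_1\beta\left(1-k_h\right)}\,\frac{(1-\beta)\frac{\theta}{\gamma} + \kappa_1\beta\left(1-k_h\right)}{(1-\beta)\frac{\theta}{\gamma} + \kappa_1\left(1-\beta k_h\right)} > 0.
\end{aligned}
\]

A.2 Asset demand with extended beliefs

We reduce the system (3)–(10) to

\[
\begin{aligned}
q_t &= \gamma c_t - (1-\beta)\theta h_t - \beta\gamma E_t^{\mathcal{P}} c_{t+1} + \beta E_t^{\mathcal{P}} q_{t+1} \\
c_t &= \frac{1+\phi}{1+\phi-\alpha}\, a_t - \frac{\alpha\gamma}{1+\phi-\alpha}\, n_t - \frac{\bar Q \bar H}{\bar Y}\left(h_t - h_{t-1}\right)
\end{aligned}
\]

and directly conjecture a solution for $h_t$ of the form in (86). Substituting the guess and solving for the coefficients yields the same coefficients $k_h, k_a, k_q$ as in the baseline model, and $\tilde k_\mu$ reads as follows:

\[
\tilde k_\mu = \frac{\beta/\gamma}{(1-\beta)\frac{\theta}{\gamma} + \kappa_1\left(1+\beta-\beta k_h\right)}\,\frac{(1-\beta)\frac{\theta}{\gamma} + \kappa_1\beta\left(1-k_h\right)}{(1-\beta)\frac{\theta}{\gamma} + \kappa_1\left(1-\beta k_h\right)} > 0.
\]

The natural real rate of interest under $\mathcal{P}$ is now given by

\[
\begin{aligned}
\frac{r_t^n - r_t^{n,RE}}{\gamma\kappa_1} ={}& k_a\left(2-\rho_a-k_h\right)a_t - \left(1-k_h\right)^2 h_{t-1}^n - k_q\left(1-k_h\right)q_t \\
&+ \left(\left(2-k_h\right)\tilde k_\mu + k_q\right)E_t^{\mathcal{P}}\Delta q_{t+1} - \tilde k_\mu\left(1-\beta+\beta\left(1-k_h\right)^2\right)\sum_{s=0}^{\infty}\left(\beta k_h\right)^s E_t^{\mathcal{P}}\Delta q_{t+s+2}.
\end{aligned}
\tag{39}
\]

A.3 Linearized equilibrium conditions and natural rate in the model with asset production

Under RE, the following set of equations describes the linearized equilibrium (up to a monetary policy rule):

\[
\begin{aligned}
y_t &= a_t + \alpha n_t && (40) \\
w_t &= m_t + a_t - (1-\alpha) n_t && (41) \\
\pi_t &= \beta E_t \pi_{t+1} + \frac{(1-\xi)(1-\beta\xi)}{\xi}\, m_t + \eta_t && (42) \\
w_t &= \gamma c_t + \phi n_t && (43) \\
\bar Y y_t &= \bar C c_t + \bar Q \bar H \delta x_t && (44) \\
\delta x_t &= h_t - (1-\delta) h_{t-1} && (45) \\
x_t &= \frac{\omega}{1-\omega}\, q_t && (46) \\
i_t &= \gamma\left(E_t c_{t+1} - c_t\right) + E_t \pi_{t+1} && (47) \\
q_t + \left(1-\tilde\beta\right)\theta h_t &= \gamma c_t + \tilde\beta E_t\left(q_{t+1} - \gamma c_{t+1}\right), && (48)
\end{aligned}
\]

where $\tilde\beta = \beta(1-\delta)$. The flexible price equilibrium under RE is characterized as: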

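\[
\begin{aligned}
h_t^{n,RE} &= k_h^{RE} h_{t-1}^{n,RE} + k_a^{RE} a_t, \qquad k_h^{RE} \in (0, 1-\delta), \quad k_a^{RE} > 0 && (49) \\
x_t^{RE} &= -\frac{1-\delta-k_h^{RE}}{\delta}\, h_{t-1}^{n,RE} + \frac{k_a^{RE}}{\delta}\, a_t && (50) \\
y_t^{n,RE} &= \frac{\bar C}{\bar Y}\kappa_0 a_t + \frac{\alpha\gamma}{1+\phi-\alpha}\,\kappa_1 x_t && (51) \\
c_t^{n,RE} &= \kappa_0 a_t - \kappa_1 x_t && (52) \\
r_t^{n,RE} &= \gamma\left(\frac{\kappa_1}{\delta}\left(1-\rho_a+1-\delta-k_h^{RE}\right)k_a^{RE} - \kappa_0\left(1-\rho_a\right)\right)a_t - \gamma\frac{\kappa_1}{\delta}\left(1-\delta-k_h^{RE}\right)\left(1-k_h^{RE}\right)h_{t-1}^{RE} && (53)
\end{aligned}
\]

where the coefficients $k_h^{RE}$ and $k_a^{RE}$ are given by:

\[
\begin{aligned}
k_h^{RE} &= \frac{1}{2\tilde\beta}\left[1+\tilde\beta(1-\delta)+\frac{\theta}{\gamma}\frac{\left(1-\tilde\beta\right)\delta}{\kappa_1+\frac{1-\omega}{\gamma\omega}} - \sqrt{\left(1+\tilde\beta(1-\delta)+\frac{\theta}{\gamma}\frac{\left(1-\tilde\beta\right)\delta}{\kappa_1+\frac{1-\omega}{\gamma\omega}}\right)^2 - 4\tilde\beta(1-\delta)}\,\right] \in (0, 1-\delta) \\
k_a^{RE} &= \frac{\kappa_0\left(1-\tilde\beta\rho_a\right)}{\left(1-\tilde\beta\right)\frac{\theta}{\gamma} + \delta^{-1}\left(\kappa_1+\frac{1-\omega}{\gamma\omega}\right)\left(1+\tilde\beta\left(1-\delta-k_h^{RE}-\rho_a\right)\right)} > 0.
\end{aligned}
\]

The constant $\kappa_0$ is given by

\[ \kappa_0 = \frac{1+\phi}{\frac{\bar C}{\bar Y}(1+\phi-\alpha)+\alpha\gamma}. \]

The sticky-price RE equilibrium is characterized in deviation from the flexible price equilibrium through the equations:

\[
\begin{aligned}
\pi_t &= \beta E_t^{\mathcal{P}} \pi_{t+1} + \frac{(1-\xi)(1-\beta\xi)}{\xi}\, m_t + \eta_t && (54) \\
m_t &= \frac{\bar C(1+\phi-\alpha)+\bar Y\alpha\gamma}{\bar Y\alpha}\left(\tilde c_t + \kappa_1\tilde x_t\right) && (55) \\
i_t &= \gamma\left(E_t \hat c_{t+1} - \hat c_t\right) + E_t \pi_{t+1} + r_t^n && (56) \\
\theta\left(1-\tilde\beta\right)\hat h_t &= \tilde\beta E_t\left(\frac{1-\omega}{\omega}\hat x_{t+1} - \gamma\hat c_{t+1}\right) - \left(\frac{1-\omega}{\omega}\hat x_t - \gamma\hat c_t\right). && (57)
\end{aligned}
\]

Under learning, we can do a similar exercise. We first tackle the PLM. All we do is to replace the market clearing condition (46) with the subjective law of motion for asset prices:

\[
\begin{aligned}
y_t &= a_t + \alpha n_t && (58) \\
w_t &= m_t + a_t - (1-\alpha) n_t && (59) \\
\pi_t &= \beta E_t^{\mathcal{P}} \pi_{t+1} + \frac{(1-\xi)(1-\beta\xi)}{\xi}\, m_t + \eta_t && (60) \\
w_t &= \gamma c_t + \phi n_t && (61) \\
\bar Y y_t &= \bar C c_t + \bar Q \bar H\left(h_t - (1-\delta)h_{t-1}\right) && (62) \\
i_t &= \gamma\left(E_t^{\mathcal{P}} c_{t+1} - c_t\right) + E_t^{\mathcal{P}} \pi_{t+1} && (63) \\
q_t &= \gamma c_t - \left(1-\tilde\beta\right)\theta h_t - \tilde\beta\gamma E_t^{\mathcal{P}} c_{t+1} + \tilde\beta E_t^{\mathcal{P}} q_{t+1} && (64) \\
q_t &= q_{t-1} + \hat\mu_{t-1} + z_t && (65) \\
\hat\mu_t &= \rho_\mu\hat\mu_{t-1} + g z_t. && (66)
\end{aligned}
\]

Under learning and flexible prices, we can reduce the system to these two equations to solve for the PLM:

\[
\begin{aligned}
0 &= \gamma c_t + \left(\frac{1+\phi-\alpha}{\alpha}\right)\left(\frac{\bar C}{\bar Y} c_t + \frac{\bar Q \bar H}{\bar Y}\tilde h_t\right) - \frac{1+\phi}{\alpha}\, a_t && (67) \\
q_t &= \gamma c_t - \left(1-\tilde\beta\right)\theta h_t - \tilde\beta\gamma E_t^{\mathcal{P}} c_{t+1} + \tilde\beta E_t^{\mathcal{P}} q_{t+1}. && (68)
\end{aligned}
\]

Guess and verify

\[ h_t^n = k_a a_t + k_h h_{t-1}^n - k_q q_t + k_\mu\hat\mu_{t-1} \tag{69} \]

where the coefficients are given by:

\[
\begin{aligned}
k_h &= \frac{1}{2\tilde\beta}\left[1+\tilde\beta(1-\delta)+\frac{\theta\delta}{\gamma\kappa_1}\left(1-\tilde\beta\right) - \sqrt{\left(1+\tilde\beta(1-\delta)+\frac{\theta\delta}{\gamma\kappa_1}\left(1-\tilde\beta\right)\right)^2 - 4\tilde\beta(1-\delta)}\,\right] \in \left(k_h^{RE}, 1-\delta\right) \\
k_a &= \kappa_0\,\frac{1-\tilde\beta\rho_a}{\left(1-\tilde\beta\right)\frac{\theta}{\gamma} + \kappa_1\delta^{-1}\left(1+\tilde\beta\left(1-\delta-k_h-\rho_a\right)\right)} > k_a^{RE} \\
k_q &= \frac{1}{\gamma}\,\frac{1-\tilde\beta}{\left(1-\tilde\beta\right)\frac{\theta}{\gamma} + \kappa_1\delta^{-1}\left(1-\tilde\beta k_h-\tilde\beta\delta\right)} > 0 \\
k_\mu &= \frac{1}{\gamma}\,\frac{\tilde\beta}{\left(1-\tilde\beta\right)\frac{\theta}{\gamma} + \kappa_1\delta^{-1}\left(1-\tilde\beta\rho_\mu+\tilde\beta\left(1-\delta-k_h\right)\right)}\,\frac{\left(1-\tilde\beta\right)\frac{\theta}{\gamma} + \kappa_1\delta^{-1}\tilde\beta\left(1-\delta-k_h\right)}{\left(1-\tilde\beta\right)\frac{\theta}{\gamma} + \kappa_1\delta^{-1}\left(1-\tilde\beta k_h-\tilde\beta\delta\right)} > 0.
\end{aligned}
\]

We can characterize the flexible-price PLM investment, consumption, output and interest rates:

\[
\begin{aligned}
x_t^n ={}& x_t^{n,RE} + \frac{k_h-k_h^{RE}}{\delta}\, h_{t-1}^n + \frac{k_h^{RE}-1+\delta}{\delta}\left(h_{t-1}^n - h_{t-1}^{n,RE}\right) \\
&+ \frac{k_a-k_a^{RE}}{\delta}\, a_t - \frac{k_q}{\delta}\, q_t + \frac{k_\mu}{\delta}\,\hat\mu_{t-1} && (70) \\
c_t^n ={}& c_t^{n,RE} - \kappa_1\left(x_t^n - x_t^{n,RE}\right) && (71) \\
y_t^n ={}& y_t^{n,RE} + \frac{\alpha\gamma}{1+\phi-\alpha}\,\kappa_1\left(x_t^n - x_t^{n,RE}\right) && (72) \\
r_t^n ={}& r_t^{n,RE} - \frac{\gamma\kappa_1}{\delta}\left(\left(1-\delta-k_h\right)\left(1-k_h\right) - \left(1-\delta-k_h^{RE}\right)\left(1-k_h^{RE}\right)\right)h_{t-1}^n \\
&- \frac{\gamma\kappa_1}{\delta}\left(1-\delta-k_h^{RE}\right)\left(1-k_h^{RE}\right)\left(h_{t-1}^n - h_{t-1}^{n,RE}\right) - \frac{\gamma\kappa_1}{\delta}\, k_q\left(1-\delta-k_h\right)q_t \\
&+ \frac{\gamma\kappa_1}{\delta}\left(\left(1-\rho_a+1-\delta\right)\left(k_a-k_a^{RE}\right) - k_h k_a + k_h^{RE}k_a^{RE}\right)a_t \\
&+ \frac{\gamma\kappa_1}{\delta}\left(k_\mu\left(1-\rho_\mu+1-\delta-k_h\right) + k_q\right)\hat\mu_{t-1}. && (73)
\end{aligned}
\]

In order to find the ALM under flexible prices, we impose market clearing for the asset and obtain:

\[
\begin{aligned}
\frac{\omega}{1-\omega}\,\delta q_t^* &= k_a a_t + k_h h_{t-1}^* - k_q q_t^* + k_\mu\hat\mu_{t-1}^* - (1-\delta)h_{t-1}^* \\
\Leftrightarrow\quad q_t^* &= \frac{1}{\frac{\omega}{1-\omega}\delta + k_q}\left(k_a a_t + k_\mu\hat\mu_{t-1}^* - \left(1-\delta-k_h\right)h_{t-1}^*\right).
\end{aligned}
\tag{74}
\]

The equilibrium price is increasing in productivity, increasing in asset price beliefs, and decreasing in the existing asset stock.

When the equilibrium asset price $q_t$ from equation (74) is substituted into the expression for the natural rate $r_t^n$, the natural real rate in the ALM $r_t^{n*}$ is increasing in asset price expectations $\hat\mu_{t-1}^*$, just as in the baseline model. The signs of the other coefficients are ambiguous and depend on the parameterization.

A.4 Welfare approximations

For the model with fixed supply, the approximation of welfare is standard. Following e.g. Woodford (2003), welfare is approximated by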

ξ α L = σ (π∗)2+(yˆ∗)2. (1−ξ)(1−βξ)1+φ−α+αγ t t Note that this loss function approximates welfare under the equilibrium law of motion P, which takes intoaccountthattheassetsupplyisfixed. WelfareunderthesubjectivelawofmotionP takesonadifferent form. For the model with asset production, we can derive an approximation of the loss function under P as: ξ 1 (cid:18) 1+φ−αC¯(cid:19) C¯ L = σ (π∗)2+ γ + (cˆ +κ xˆ )2 t 1−ξ1−βξ t α Y¯ Y¯ t 1 t +(1−β(1−δ))θ Q¯H¯ hˆ2+ Q¯H¯ (cid:0) (1−ω)δ2+γδκ (cid:1) xˆ2. Y¯ t Y¯ 1 t B Proofs Proof of Proposition 1. Suppose that the central bank implemented π = 0. The Phillips curve (24) then t reduces to the relationship γα Q¯H¯ y˜ = ∆h˜ . t 1+φ−α(1−γ) Y¯ t Substituting into the asset demand equation (27), we obtain a second-order difference equation of the form θ (cid:16) (cid:17) (1−β) h˜ = −κ ∆h˜ −βEP∆h˜ . γ t 1 t t t+1 It is easily verified that the only solution to this equation is h˜ = 0. But this implies that we implement t the flexible price allocation under the subjective law of motion. From the perspective of agents, the flexible price allocation is first-best efficient, and L = 0. Moreover, the actual equilibrium in this economy has t π∗ = π = 0 and y∗ = yn,RE, as was shown in the last section. This allocation is also first-best efficient t t t t under model-consistent expectations. To show that the rule i = rn,PLM +φ π implements π = 0, we first substitute it into the IS curve t t π t t (26)and note that the system (24)–(27) admits the solution π = 0. We then need to verify that the rule t also ensures determinacy of the equilibrium for φ > 1. Following e.g. Lubik and Marzo (2007), we express π 43

the system (24)–(27) in the form =:B =:C (cid:122) (cid:125)(cid:124) (cid:123) (cid:122) (cid:125)(cid:124) (cid:123)     β 0 κκ −1 κ −κκ 1 1          1 γ 0 x t+1 + −φ −γ 0 x t = 0         0 −βγ −θ(1−β) 0 γ 0 t+1 (cid:16) (cid:17)(cid:48) where x = π ,c˜,h˜ and t t t t−1 (1−ξ)(1−βξ)1+φ−α+αγ κ = . ξ α Because we have two terminal and one initial condition (i.e. two forward- and one backward-looking variable), we then need to verify that the matrix −B−1C has exactly two eigenvalues outside the unit circle. The characteristic polynomial det (cid:0) B−1C −λI (cid:1) is of the form A λ3+A λ2+A λ+A = 0, where 3 2 1 0 the coefficients are given by: A = βγ(κκ +θ(1−β)) > 0 3 1 A = −γκκ (1+β+βφ)−(1−β)θ(κ+γ +γβ) < 0 2 1 A = γκκ (1+φ+βφ)+(1−β)θ(γ +κφ) > 0 1 1 A = −γκκ φ < 0. 0 1 We then verify the following sufficient conditions for two roots outside the unit circle: A +A +A +A > 0 3 2 1 0 −A +A −A +A < 0 3 2 1 0 A (A −A )+A (A −A ) > 0. 0 0 2 3 1 3 The first inequality follows from A +A +A +A = (1−β)θκ(φ−1), which is positive when φ > 1. 3 2 1 0 The second inequality follows directly from the signs of the coefficients. To establish the third inequality, we collect powers of φ to express A (A −A )+A (A −A ) 0 0 2 3 1 3 44

$$
\begin{aligned}
={}&(\gamma\kappa\kappa_1)^2(1-\beta)\phi^2\\
&+\left[A_3(1-\beta)\theta\kappa+\gamma\kappa\kappa_1\big((1+\beta)(A_3+(1-\beta)\theta\gamma+\kappa\kappa_1)+(1-\beta)\theta\kappa\big)\right]\phi\\
&+A_3^2\left(\frac{1-\beta}{\beta}\right).
\end{aligned}
$$

This is a quadratic polynomial in $\phi$ for which all coefficients are positive, implying that it takes positive values for all $\phi\ge 0$.

Proof of Proposition 2. Substituting the policy $\pi_t=-(\beta\rho_\eta)^{-1}\eta_t+b_tz_t$ into the Phillips curve (24), we obtain:

$$
\pi_t=-(\beta\rho_\eta)^{-1}\eta_t+b_tz_t=\kappa\left(\tilde c_t+\kappa_1\Delta\tilde h_t\right)
$$

with $\kappa$ defined as in the proof of Proposition 1. We will find $b_t$ such that $\pi_t^*=0$ $\mathbb{P}$-a.s., i.e. under the equilibrium law of motion. In this case, $\tilde c_t^*+\kappa_1\Delta\tilde h_t^*=y_t^*-y_t^{RE}=0$ under the ALM as well, and the first-best allocation is attained.

In order to solve for $b_t$, then, we need to compute the equilibrium value of $z_t$, which amounts to solving for the equilibrium price $q_t^*$. To do this, we first need to derive the demand function for the asset under the subjective law of motion. Let $x_t=\tilde c_t+\kappa_1\Delta\tilde h_t$. In analogy to the computations of the flexible-price equilibrium, we can compute the asset demand function as the solution to the following system of equations:

$$
\begin{aligned}
(1-\beta)\theta h_t &= \gamma c_t-q_t-\beta E_t^{\mathcal{P}}\left[\gamma c_{t+1}-q_{t+1}\right]\\
(1+\phi-\alpha+\alpha\gamma)c_t &= \alpha x_t+(1-\alpha)a_t-(1+\phi-\alpha)\frac{\bar Q\bar H}{\bar Y}\Delta h_t.
\end{aligned}
$$

With $x_t=\left(b_tz_t-(\beta\rho_\eta)^{-1}\eta_t\right)/\kappa$, we can solve:

$$
h_t=k_hh_{t-1}+k_aa_t-k_qq_t+k_\mu\hat\mu_{t-1}+k_{bz}b_tz_t+k_\eta\eta_t.
$$

Here, the coefficients $k_h$, $k_a$, $k_q$ and $k_\mu$ are the same as in the flexible-price demand function (18). The coefficients $k_\eta$ and $k_{bz}$ are given by:

$$
\begin{aligned}
k_\eta &= -\frac{k_h}{\beta\rho_\eta\kappa}\,\frac{1-\beta\rho_\eta}{1-k_h\beta\rho_\eta}\,\frac{\bar Y}{\bar Q\bar H}\,\frac{\alpha}{1+\phi-\alpha}\\
k_{bz} &= -\frac{k_h}{\kappa}\,\frac{\bar Y}{\bar Q\bar H}\,\frac{\alpha}{1+\phi-\alpha}.
\end{aligned}
$$

Now the equilibrium is found by imposing $h_t^*=0$, and this condition leads to the following expression for $z_t^*$:

$$
z_t^*=\frac{k_q}{k_q-k_{bz}b_t}\left(\frac{1}{k_q}\left(k_aa_t+k_\mu\hat\mu_{t-1}^*+k_\eta\eta_t\right)-q_{t-1}^*-\hat\mu_{t-1}^*\right).
$$

Imposing $\beta\rho_\eta b_tz_t^*=\eta_t$ and solving for $b_t$ then leads to:

$$
b_t=\left(\frac{k_{bz}}{k_q}\eta_t+\frac{\beta\rho_\eta}{k_q}\left(k_aa_t+k_\mu\hat\mu_{t-1}^*+k_\eta\eta_t\right)-\beta\rho_\eta\left(q_{t-1}^*+\hat\mu_{t-1}^*\right)\right)^{-1}\eta_t.
$$

Proof of Proposition 3. When $\pi_t=\pi_t^*$ $\mathcal{P}$-a.s., we can write $E_t^{\mathcal{P}}\pi_{t+1}=E_t\pi_{t+1}$ in the Phillips curve (24). The Phillips curve then implies that $m_t=m_t^*$ $\mathcal{P}$-a.s. as well. Furthermore, combining the static equilibrium conditions for labor supply and demand, the production function and the household budget constraint, we obtain:

$$
\frac{1+\phi-\alpha+\alpha\gamma}{\alpha}y_t=m_t+\frac{1+\phi}{\alpha}a_t+\gamma\frac{\bar Q\bar H}{\bar Y}\Delta h_t.
$$

This equation has to hold regardless of expectations. Furthermore, under the equilibrium law of motion we have $\Delta h_t^*=0$, while under RE and flexible prices, we have $\Delta h_t^{n,RE}=0$ and $m_t^{n,RE}=0$. Together, these conditions imply that equilibrium output under learning equals $y_t^*=y_t^{n,RE}+\alpha/(1+\phi-\alpha+\alpha\gamma)\,m_t$. We can therefore write the policy problem under learning as follows:

$$
\max\ \sum_{t=0}^{\infty}\beta^t\left(\lambda\pi_t^2+(\hat y_t^*)^2\right)
$$

$$
\begin{aligned}
\text{s.t.}\quad \pi_t &= \beta E_t\pi_{t+1}+\frac{(1-\xi)(1-\beta\xi)}{\xi}m_t+\eta_t\\
\hat y_t^* &= \frac{\alpha}{1+\phi-\alpha+\alpha\gamma}m_t.
\end{aligned}
$$

This problem is identical to the policy problem under rational expectations with standard solutions to the commitment and discretion policies. This establishes parts (1) and (2) of the proposition. To prove part (3), we write the IS equation (26) as

$$
i_t=r_t^n+E_t^{\mathcal{P}}\pi_{t+1}+\gamma E_t^{\mathcal{P}}\Delta\tilde c_{t+1}.
$$

Let $\phi_\pi>1$. The optimality conditions for $\pi_t$ in cases (1) and (2) combined with (24) and (27) imply that $(\pi_t)_{t=0}^{\infty}$ and $(\tilde c_t)_{t=0}^{\infty}$ are linear processes of the cost-push shock $\eta_t$ only. Therefore, we can evaluate $E_t^{\mathcal{P}}\pi_{t+1}+\gamma E_t^{\mathcal{P}}\Delta\tilde c_{t+1}-\phi_\pi\pi_t=A(L)\eta_t$ for some lag polynomial $A(L)$. The rule satisfies determinacy by the same argument made in the proof of Proposition 1.

We start by noting from the Phillips curve that $c_t+\kappa_1x_t=\alpha/\left(\frac{\bar C}{\bar Y}(1+\phi-\alpha)+\alpha\gamma\right)m_t$ and $m_t=0$ under flexible prices. Therefore, we can rewrite

$$
\hat c_t+\kappa_1\hat x_t=\alpha/\left(\frac{\bar C}{\bar Y}(1+\phi-\alpha)+\alpha\gamma\right)m_t^*.
$$

Dividing through by the (positive) coefficient on $m_t^*$, we can bring the loss function (34) into the form

$$
L_t=\ell_\pi(\pi_t^*)^2+(m_t^*)^2+\ell_x(\hat x_t^*)^2+\ell_h\left(\hat h_t^*\right)^2,
$$

where $\ell_\pi,\ell_x,\ell_h>0$.

Next, we note that $\pi_t$ has the same distribution under the subjective belief measure $\mathcal{P}$ as under the actual belief measure $\mathbb{P}$. We can therefore omit the asterisk notation and simply write $\pi_t^*=\pi_t$ in the planner's problem. Equations (35) and (38) then imply that the same holds true for $\tilde c_t$, $\tilde h_t$ and $m_t$. To evaluate the asset and investment gaps in the loss function, we note that

$$
\hat h_t^*=\left(h_t-h_t^{n,RE}\right)^*=\tilde h_t+h_t^{n*}-h_t^{n,RE}. \qquad (75)
$$

The first equality is the definition of the asset gap, and the second equality follows from the fact that $\tilde h_t=\tilde h_t^*$ and $h_t^{n,RE}$ is independent of the learning process. To find $h_t^{n*}$, we can make use of the closed-form expression for $h_t^n$ derived in (69).

The policymaker's commitment problem can then be described as follows (Lagrange multipliers to the constraints are in parentheses):

$$
\begin{aligned}
\max\quad & \frac{1}{2}\sum_{t=0}^{\infty}\beta^t\left(\ell_\pi\pi_t^2+m_t^2+\ell_x(\hat x_t^*)^2+\ell_h\left(\hat h_t^*\right)^2\right)\\
\text{s.t.}\quad & \pi_t=\beta E_t\pi_{t+1}+\frac{(1-\xi)(1-\beta\xi)}{\xi}m_t && (\mu_t)\\
& \theta\left(1-\tilde\beta\right)\tilde h_t=\frac{\alpha\gamma}{\frac{\bar C}{\bar Y}(1+\phi-\alpha)+\alpha\gamma}\left(m_t-\tilde\beta E_tm_{t+1}\right)-\gamma\kappa_1\left(\tilde x_t-\tilde\beta E_t^{\mathcal{P}}\tilde x_{t+1}\right) && (\chi_t)\\
& \hat h_t^*=\tilde h_t+h_t^{n*}-h_t^{n,RE} && (\hat\psi_t)\\
& h_t^{n*}=k_aa_t+k_hh_{t-1}^{n*}-k_qq_t^*+k_\mu\hat\mu_{t-1}^* && (\psi_t)\\
& \frac{\omega}{1-\omega}q_t^*=x_t^{n*}+\tilde x_t && (\Omega_{qt})\\
& \hat\mu_t^*=\rho_\mu\hat\mu_{t-1}^*+g\left(\Delta q_t^*-\hat\mu_{t-1}^*\right). && (\Omega_{\mu t})
\end{aligned}
$$

The first two constraints are the Phillips curve and the asset Euler equation in gap form (where we have substituted out $\tilde c_t$), which only involve variables that are measurable under $\mathcal{P}$. Next, the policymaker needs to evaluate the housing and investment gaps, for which we make use of the identity (75) and the law of motion (69) for $h_t^n$. We impose equilibrium through the market-clearing condition ($\Omega_{qt}$), and finally have to take into account the effect of prices $q_t$ on beliefs $\hat\mu_t$, for which we can combine equations (65)–(66).

This problem takes the form of a recursive linear-quadratic dynamic programming problem, for which commitment and discretionary solutions are well understood. The discretionary policy is difficult to evaluate analytically, but easy to compute numerically (following e.g. REF). For the commitment solution, the first-order conditions of the planner are:

$$
\begin{aligned}
\ell_\pi\pi_t &= \Delta\mu_t && (76)\\
\frac{(1-\xi)(1-\beta\xi)}{\xi}\mu_t &= -m_t-\frac{\bar Y\alpha}{\bar C(1+\phi-\alpha)+\bar Y\alpha\gamma}\left(\chi_t-(1-\delta)\chi_{t-1}\right) && (77)
\end{aligned}
$$

$$
\begin{aligned}
\theta\left(1-\tilde\beta\right)\chi_t &= \gamma\frac{\kappa_1}{\delta}\left(\tilde\beta E_t\left(\chi_{t+1}-(1-\delta)\chi_t\right)-\chi_t+(1-\delta)\chi_{t-1}\right)+\hat\psi_t+\frac{\Omega_{qt}-\tilde\beta E_t\Omega_{qt+1}}{\delta} && (78)\\
\hat\psi_t &= \ell_h\hat h_t^*+\ell_x\frac{\hat x_t^*-\tilde\beta E_t\hat x_{t+1}^*}{\delta} && (79)\\
\psi_t &= \hat\psi_t+\beta k_hE_t\psi_{t+1}+\frac{\Omega_{qt}-\tilde\beta E_t\Omega_{qt+1}}{\delta} && (80)\\
\frac{\omega}{1-\omega}\Omega_{qt} &= g\left(\Omega_{\mu t}-\beta E_t\Omega_{\mu t+1}\right)-k_q\psi_t && (81)\\
\Omega_{\mu t} &= \beta k_\mu E_t\psi_{t+1}+\beta(\rho_\mu-g)E_t\Omega_{\mu t+1} && (82)
\end{aligned}
$$

We can combine optimality conditions (76) and (77) into

$$
\frac{(1-\xi)(1-\beta\xi)}{\xi}\ell_\pi p_t=-m_t-\left(f_t-(1-\delta)f_{t-1}\right) \qquad (83)
$$

with $f_t=\frac{\bar Y\alpha}{\bar C(1+\phi-\alpha)+\bar Y\alpha\gamma}\chi_t$. We conjecture the solution

$$
\chi_t=k_h\chi_{t-1}+\sum_{s=0}^{\infty}b_sE_t\hat\psi_{t+s} \qquad (84)
$$

where $0<k_h<1-\delta$, $b_0>0$ and $b_s>\tilde\beta b_{s-1}$ for all $s\ge 1$. This conjecture implies that

$$
\begin{aligned}
\sum_{s=0}^{\infty}b_sE_t\hat\psi_{t+s}
&=\sum_{s=0}^{\infty}b_sE_t\left(\ell_h\hat h_{t+s}^*+\ell_x\frac{\hat x_{t+s}^*-\tilde\beta E_t\hat x_{t+s+1}^*}{\delta}\right)\\
&=\frac{\ell_x}{\delta}\sum_{s=0}^{\infty}b_sE_t\hat x_{t+s}^*-\frac{\ell_x}{\delta}\tilde\beta\sum_{s=0}^{\infty}b_sE_t\hat x_{t+s+1}^*+\ell_h\sum_{s=0}^{\infty}b_sE_t\hat h_{t+s}^*\\
&=\frac{\ell_x}{\delta}b_0\hat x_t^*+\frac{\ell_x}{\delta}\sum_{s=1}^{\infty}\left(b_s-\tilde\beta b_{s-1}\right)E_t\hat x_{t+s}^*\\
&\quad+\ell_h\left(\sum_{s=0}^{\infty}b_s(1-\delta)^{s+1}\right)\hat h_{t-1}^*+\ell_h\sum_{s=0}^{\infty}\left(\sum_{\tau=s}^{\infty}(1-\delta)^{\tau}b_s\right)\hat x_{t+s}^*,
\end{aligned}
$$

so that the coefficients on $\hat h_{t-1}^*$ and $E_t\hat x_{t+s}^*$, $s\ge 0$, are all strictly positive. Moreover, we can appeal to the market clearing condition (46) to write $E_t\hat x_{t+s}^*=\frac{\omega}{1-\omega}E_t\hat q_{t+s}^*$.

What is left is to verify conjecture (84). Equation (78) is a second-order difference equation in $\chi_t$ when $\hat\psi_t$ and $\Omega_{qt}$ are treated as given. The homogeneous form of this second-order difference equation is the same as that for the natural level of asset holdings $h_t^n$ derived earlier. This establishes that $k_h$ is indeed in $(0,1-\delta)$. Furthermore, equations (80)–(82) form a linear forward-looking system in $(\Omega_{qt},\Omega_{\mu t},\psi_t)$ when $\hat\psi_t$ is treated as given. This implies that the solution in (84) will indeed depend only on forward-looking terms in $\hat\psi_t$.

Finally, we need to show that $b_s>0$ for all $s\ge 0$. We prove this property for $g=0$ and then make a continuity argument to extend it to $g$ small enough. When $g=0$, equation (81) reduces to $\Omega_{qt}=-\frac{1-\omega}{\omega}k_q\psi_t$. Substituting into (80), we obtain

$$
\begin{aligned}
\psi_t &= \hat\psi_t+\beta k_hE_t\psi_{t+1}-\frac{1-\omega}{\omega\delta}k_q\left(\psi_t-\tilde\beta E_t\psi_{t+1}\right)\\
&= \frac{\hat\psi_t}{1+\frac{1-\omega}{\omega\delta}k_q}+\tilde\beta\,\frac{\frac{k_h}{1-\delta}+\frac{1-\omega}{\omega\delta}k_q}{1+\frac{1-\omega}{\omega\delta}k_q}E_t\psi_{t+1}.
\end{aligned}
$$

Now, through equation (80) we also have that

$$
\hat\psi_t+\frac{\Omega_{qt}-\tilde\beta E_t\Omega_{qt+1}}{\delta}=\psi_t-\beta k_hE_t\psi_{t+1}=\sum_{s=0}^{\infty}a_sE_t\hat\psi_{t+s}. \qquad (85)
$$

Because $k_h<1-\delta$, we have

$$
\tilde\beta\,\frac{\frac{k_h}{1-\delta}+\frac{1-\omega}{\omega\delta}k_q}{1+\frac{1-\omega}{\omega\delta}k_q}
=\beta k_h\,\frac{1+\frac{1-\delta}{k_h}\frac{1-\omega}{\omega\delta}k_q}{1+\frac{1-\omega}{\omega\delta}k_q}>\beta k_h
$$

and therefore $a_s>0$ for all $s\ge 0$.

We now substitute (85) and (84) into the first-order condition (78) to solve for the coefficients $b_s$. We obtain:

$$
\begin{aligned}
&\left(\frac{\delta}{\gamma\kappa_1}\theta\left(1-\tilde\beta\right)+1+\tilde\beta(1-\delta-k_h)\right)\left(k_h\chi_{t-1}+\sum_{s=0}^{\infty}b_sE_t\hat\psi_{t+s}\right)\\
&\qquad=\tilde\beta\sum_{s=0}^{\infty}b_sE_t\hat\psi_{t+s+1}+(1-\delta)\chi_{t-1}+\frac{\delta}{\gamma\kappa_1}\sum_{s=0}^{\infty}a_sE_t\hat\psi_{t+s}.
\end{aligned}
$$

Comparing coefficients, we get

$$
s=0:\quad\left(\frac{\delta}{\gamma\kappa_1}\theta\left(1-\tilde\beta\right)+1+\tilde\beta(1-\delta-k_h)\right)b_0=\frac{\delta}{\gamma\kappa_1}a_0
$$

$$
s\ge 1:\quad\left(\frac{\delta}{\gamma\kappa_1}\theta\left(1-\tilde\beta\right)+1+\tilde\beta(1-\delta-k_h)\right)b_s=\tilde\beta b_{s-1}+\frac{\delta}{\gamma\kappa_1}a_s.
$$

By an induction argument, these expressions establish that $b_0>0$ and $b_s>\tilde\beta b_{s-1}$ for all $s\ge 1$.

C Extension to general beliefs

As before, we start by considering the flexible-price allocation. The appendix shows that the asset demand function in the subjective law of motion (i.e. under $\mathcal{P}$), which previously was given by (18), is replaced by:

$$
h_t^n=k_aa_t+k_hh_{t-1}^n-k_qq_t+\tilde k_\mu\sum_{s=0}^{\infty}(\beta k_h)^sE_t^{\mathcal{P}}\Delta q_{t+s+1}. \qquad (86)
$$

Instead of a single state variable representing subjective beliefs, asset demand now depends on the whole time profile of subjective expected capital gains in the future, but otherwise it has the same form. The coefficients $k_a$, $k_h$ and $k_q$ are the same as in the baseline model, and $\tilde k_\mu>0$.

The expression for the flexible-price real interest rate under the equilibrium measure, previously given by (23), is replaced by:

$$
\begin{aligned}
r_t^* ={}& r_t^{n,RE}+\gamma\kappa_1k_a(1-\rho_a)a_t\\
&+\gamma\kappa_1\tilde k_\mu\left(\left(1+\frac{k_q}{\tilde k_\mu}\right)E_t^{\mathcal{P}}\Delta q_{t+1}-(1-\beta k_h)\sum_{s=0}^{\infty}(\beta k_h)^sE_t^{\mathcal{P}}\Delta q_{t+s+2}\right). \qquad (87)
\end{aligned}
$$

The real interest rate continues to be increasing in expectations of next period's capital gains $\Delta q_{t+1}$. The real rate also turns out to be decreasing in expectations of capital gains further in the future, $\Delta q_{t+s}$, $s\ge 2$. This relationship is most easily understood for the case $\Delta q_{t+2}>0$: Here, investors' desire to invest in the asset out of consumption is highest in period $t+1$, immediately before the realization of high expected capital gains in $t+2$. As a result, expected consumption $c_{t+1}$ is lower than current consumption $c_t$, lowering the required real interest rate $r_t^n=\gamma E_t^{\mathcal{P}}\Delta c_{t+1}$ today. The real rate will still positively depend on asset price expectations as long as

$$
\begin{aligned}
&E_t^{\mathcal{P}}\Delta q_{t+s}\le\left(\frac{1}{\beta k_h}\right)^{s-1}E_t^{\mathcal{P}}\Delta q_{t+1}\quad\text{if } E_t^{\mathcal{P}}\Delta q_{t+1}>0\\
\text{or}\quad &E_t^{\mathcal{P}}\Delta q_{t+s}\ge\left(\frac{1}{\beta k_h}\right)^{s-1}E_t^{\mathcal{P}}\Delta q_{t+1}\quad\text{if } E_t^{\mathcal{P}}\Delta q_{t+1}<0.
\end{aligned}
$$

As $1/(\beta k_h)>1$, we only need that capital gains expectations do not increase too much further at future horizons when they are positive for the immediate future, which is likely satisfied for all but the most extreme forms of extrapolative bias.

The results on policy we derive in Section 5 continue to hold for the generalized beliefs with the modified expression for the natural real rate above. Because allocations are determined only by intratemporal labor demand and supply conditions, the level of output under flexible prices is again unaffected by the presence of learning ($y_t^*=y_t^{n,RE}$). Since the asset price $q_t$ remains independent of policy under the PLM, it drops out of the equations describing the dynamics of the sticky price equilibrium relative to flexible prices, and so equations (24)–(27) continue to hold. Propositions 1 through 3 continue to hold: The optimal policy under learning replicates the outcomes from the optimal policy under rational expectations, but has to be implemented by following the perceived natural real interest rate, which is increasing in asset price expectations.

D CMCE with alternative assumptions

In this appendix we analyze the model with alternative assumptions about which market clearing conditions expectations are consistent with under CMCE. As noted in Section 3, in order to sustain an equilibrium with CMCE, Walras' law requires that two market clearing conditions be absent from agents' information set. The first of these is of course the market clearing condition for the asset in question, but the second one is in principle a free choice. Throughout the paper, we also remove the final goods market clearing condition. But one could alternatively consider removing either (i) the bond market clearing condition or (ii) the labor market clearing condition.

To model case (i), we modify the system (3)–(10) defining the subjective law of motion by replacing the household budget constraint (9) with

$$
\begin{aligned}
y_t &= c_t\\
B_t+(1+\bar r)B_t &= \bar Q\bar H\Delta h_t.
\end{aligned}
$$

The first of these equations is the market clearing condition of final consumption goods, and the second equation is the modified budget constraint that takes into account that agents do not know the bond market clearing condition $B_t=0$. It is easy to show that the resulting equilibrium is identical to the RE equilibrium regardless of price stickiness, up to the asset price $q_t$. Intuitively, agents now think that they can exchange the asset for bonds. Bonds are useful to agents only insofar as they can be exchanged into physical goods or labor services; but agents do not expect to be able to do so at any time in the future, because they understand the supply and demand in all goods and labor markets.

To model case (ii), we now have to distinguish between labor services produced by agents and labor demanded by firms in the subjective law of motion, because agents think they can buy additional labor services by selling their asset holdings and sell those services to firms. Let $n_t$ denote the labor used by firms, and $n_t^s$ the labor produced by agents. We modify the system (3)–(10) defining the subjective law of motion by replacing the labor supply condition (6) and the budget constraint (9) with

$$
\begin{aligned}
y_t &= c_t\\
w_t &= \gamma c_t+\phi n_t^s\\
y_t &= c_t+\frac{\bar Q\bar H}{\bar Y}\Delta h_t+\alpha(n_t-n_t^s).
\end{aligned}
$$

The first of these equations is the market clearing condition of final consumption goods, and the last equation is the modified budget constraint that takes into account that agents think they can buy labor services freely in a spot market.

We can show that with this specification, the natural rate of interest in the model is still of the form in (18), with different coefficients but still respecting $k_h\in(0,1)$ and $k_a,k_q,k_\mu>0$. All our results from Sections 5 and 6 continue to hold. As before, under flexible prices the learning model implements the first-best allocation. As a result, the new informational assumptions only affect the asset price and the natural rate of interest.

Figure 5 plots impulse responses under sticky prices, assuming a Taylor-type rule. In this case a positive technology shock pushes down inflation. This is a consequence of the impact of the changed information assumption on expected future marginal costs. When agents' information set does not include the labor market clearing condition, firms do not internalize labor supply effects when forecasting marginal costs. More specifically, when firms anticipate an increase in future labor demand following the shock, they fail to account for the need for future wages to increase so as to clear the labor market. This consequently pushes down the inflation response relative to the baseline case. Notwithstanding the impact on the inflation process, the learning model retains the dependence of the

Figure 5: IRFs to a technology shock under sticky prices, alternative CMCE assumptions. [Four panels: response of $\pi$, $q$, $y$ and $\hat y$ to $\epsilon_a$, plotted over 40 periods; series: Learning, Rational Expectations.]
Note: Response to a one standard deviation positive technology shock $\varepsilon_{At}$. Log percentage points. Model simulated under the assumption that agents' information set does not include asset market clearing or labor market clearing. Sticky prices, Taylor rule.

Table 4: Performance of optimized simple rules, alternative CMCE assumptions.

| | $\sigma(\pi_t)$ | $\sigma(\hat y_t)$ | $\sigma(\Delta q_t)$ | $E[L]$ |
|---|---|---|---|---|
| Rational Expectations | | | | |
| (1) Optimal Policy, Discretion | 0.436 | 0.085 | 1.011 | 0.526 |
| (2) Optimal Policy, Commitment | 0.205 | 0.192 | 1.010 | 0.210 |
| (3) $i_t=1.5\pi_t+0.125\hat y_t$ | 0.349 | 0.451 | 0.801 | 0.866 |
| Learning | | | | |
| (4) Optimal Policy, Discretion | 0.436 | 0.085 | 2.134 | 0.527 |
| (5) Optimal Policy, Commitment | 0.204 | 0.193 | 2.135 | 0.210 |
| (6) $i_t=1.5\pi_t+0.125\hat y_t$ | 0.195 | 0.580 | 2.702 | 1.000 |
| (7) $i_t=\rho_i^*i_{t-1}+(1-\rho_i^*)\left(1.5\pi_t+\phi_y^*\hat y_t+\phi_q^*\sum_{s=0}^{\infty}\omega^{*s}\Delta q_{t-s}\right)$ | 0.093 | 0.514 | 2.093 | 0.713 |

Row (7) uses $\{\rho_i^*,\phi_y^*,\phi_q^*,\omega^*\}=\{0.167,0.019,0.018,0.699\}$.
Note: Model simulated under the assumption that agents' information set does not include asset market clearing or labor market clearing.
Note: Losses are evaluated as the unconditional expectation of the loss function $L$. The welfare weight on inflation is set to $\lambda_\pi=1$ and the losses are normalized to 1 for row (6).

natural rate of interest on expected asset prices, regardless of whether the goods market or the labor market clearing condition is omitted from agents' information set.

Table 4 shows optimized simple rule coefficients and their performance, as in Section 6. Relative to the baseline case, the welfare gains to including asset prices in the policy reaction function are smaller. However, the optimal simple rule continues to react positively to asset prices. When a weighted average of past price growth observations is included in the policy rule (Row 8), welfare is increased by 1.7 percent, with the optimized coefficients placing a small positive weight on the asset price term (as was the case with the baseline model in Section 6).

E Tracking the “right” natural real rate

The equilibrium realization of the nominal rate under the optimal policy is the expression $r_t^*$ derived in (23). However, an instrument rule that prescribes $i_t=r_t^*+\phi_\pi\pi_t$ would fail to implement the optimal policy. The equilibrium natural rate $r_t^{n*}$ only coincides with $r_t^n$ when $h_t=0$. While this must be the case in equilibrium, agents under $\mathcal{P}$ contemplate other possible realizations of the house price for which they plan on choosing $h_t\neq 0$. These off-equilibrium states of the world enter into agents' expectations of future marginal costs. Therefore, the central bank must promise to stabilize inflation even in these off-equilibrium states. Tracking only the equilibrium natural rate is insufficient: It must track the perceived natural rate.

As an illustration, Figure 6 shows impulse responses for the learning model with three interest rate equations:

$$
\begin{aligned}
i_t &= r_t^n+1.05\pi_t && (88)\\
i_t &= r_t^{n*}+1.05\pi_t && (89)\\
i_t &= r_{ss}+1.5\pi_t+0.125\hat y_t. && (90)
\end{aligned}
$$

The first equation (88) implements strict inflation targeting as per Proposition 1. The only difference in the second equation (89) is that the monetary authority reacts to the equilibrium process of the natural rate instead of the perceived process. Figure 6 shows that using the ALM natural rate of interest in the policy rule does not yield a zero inflation outcome. As discussed in the last section, the central bank must promise to stabilize inflation even in those states that are never reached in equilibrium (that is, when the housing market does not clear) but are contemplated by agents under their subjective expectations. Using the ALM natural rate in the policy rule fails to do so. Due to their beliefs about the process governing $Q_t$, agents under the PLM do not account for the effect of the technology shock on future asset price growth. Consequently, the initial response of consumption is smaller than under rational expectations. From the standpoint of an agent under the flex price ALM, on the other hand, the technology shock has an anticipated positive impact on the path of $Q_t$ due to expected asset demand. As a result, the initial consumption response and subsequent consumption decline will be greater. The ALM natural rate of interest declines more upon the impact of the shock than does the PLM natural rate of interest. When a monetary authority uses the ALM natural rate in its policy rule as in (89), then, the nominal interest rate does not increase sufficiently to prevent an inflationary response.

Finally, the third equation is a standard Taylor rule. Figure 6 shows that this rule performs somewhat better in terms of outcomes, but is still far from the optimal policy. It is worth noting that the nominal interest rate is more volatile under the Taylor rule than under the optimal rule (88), which reacts to asset prices. The reason is that the stabilization benefits of reacting to asset prices make equilibrium nominal rates more stable as well.
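As a rough illustration of how the three rules differ mechanically, the sketch below evaluates (88)–(90) for given inputs. The function names and the numerical inputs are hypothetical placeholders rather than model output; only the coefficients 1.05, 1.5 and 0.125 are taken from the equations above.

```python
def rule_perceived(rn_perceived, pi, phi_pi=1.05):
    """Rule (88): track the perceived (PLM) natural rate."""
    return rn_perceived + phi_pi * pi


def rule_equilibrium(rn_equilibrium, pi, phi_pi=1.05):
    """Rule (89): track the equilibrium (ALM) natural rate."""
    return rn_equilibrium + phi_pi * pi


def rule_taylor(r_ss, pi, y_gap, phi_pi=1.5, phi_y=0.125):
    """Rule (90): standard Taylor rule around the steady-state rate."""
    return r_ss + phi_pi * pi + phi_y * y_gap


# Hypothetical impact values (assumptions for illustration only): the ALM
# natural rate falls by more than the perceived natural rate after the shock.
rn_plm, rn_alm, inflation = -0.125, -0.375, 0.0

# At the same inflation rate, rule (89) sets a lower nominal rate than (88).
shortfall = rule_perceived(rn_plm, inflation) - rule_equilibrium(rn_alm, inflation)
```

In this toy example the nominal rate under (89) is lower than under (88) by exactly the gap between the two natural rate measures, which is the sense in which tracking the ALM natural rate leaves policy too loose.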

Figure 6: Tracking the “right” natural real rate. [Four panels: response of $\pi$, $\hat y$, $q$ and $i$ to $\epsilon_a$, plotted over 40 periods; series: Learning with $r^*=r^*_{PLM}$, Learning with $r^*=r^*_{ALM}$, Learning with Taylor (1993).]
Note: Response to a unit standard deviation positive technology shock $\varepsilon_{At}$ under sticky prices. Log percentage points. The interest rate rules used are given in Equations (88)–(90).
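The strict inflation targeting rule (88) rests on the determinacy argument in the proof of Proposition 1. As a rough numerical companion to that proof, the sketch below computes the coefficients $A_3,\dots,A_0$ of the characteristic polynomial and evaluates the three sufficient conditions. The parameter values and function names are illustrative assumptions, not the paper's calibration; `phi` stands for the inflation response coefficient in the rule.

```python
def poly_coefficients(beta, gamma, theta, kappa, kappa1, phi):
    """Return (A3, A2, A1, A0) as defined in the proof of Proposition 1."""
    a = gamma * kappa * kappa1      # recurring term gamma * kappa * kappa_1
    b = (1.0 - beta) * theta        # recurring term (1 - beta) * theta
    A3 = beta * gamma * (kappa * kappa1 + theta * (1.0 - beta))
    A2 = -a * (1.0 + beta + beta * phi) - b * (kappa + gamma + gamma * beta)
    A1 = a * (1.0 + phi + beta * phi) + b * (gamma + kappa * phi)
    A0 = -a * phi
    return A3, A2, A1, A0


def sufficient_conditions(A3, A2, A1, A0):
    """Evaluate the three sufficient conditions for two roots outside the unit circle."""
    return (
        A3 + A2 + A1 + A0 > 0,
        -A3 + A2 - A1 + A0 < 0,
        A0 * (A0 - A2) + A3 * (A1 - A3) > 0,
    )


# Placeholder parameters (assumed for illustration, not a calibration).
params = dict(beta=0.99, gamma=1.0, theta=1.0, kappa=0.1, kappa1=0.5)

active = sufficient_conditions(*poly_coefficients(phi=1.5, **params))
passive = sufficient_conditions(*poly_coefficients(phi=0.9, **params))
```

With these placeholder values, all three conditions hold for `phi = 1.5`, while the first condition fails for `phi = 0.9`, in line with the requirement that the inflation coefficient exceed one.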

Cite this document
APA
Colin C. Caines and Fabian Winkler (2019). Asset Price Learning and Optimal Monetary Policy (IFDP 2018-1236). Board of Governors of the Federal Reserve System, International Finance Discussion Papers. https://whenthefedspeaks.com/doc/ifdp_2018-1236
BibTeX
@techreport{wtfs_ifdp_2018_1236,
  author = {Colin C. Caines and Fabian Winkler},
  title = {Asset Price Learning and Optimal Monetary Policy},
  type = {International Finance Discussion Papers},
  number = {2018-1236},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2019},
  url = {https://whenthefedspeaks.com/doc/ifdp_2018-1236},
  abstract = {We characterize optimal monetary policy when agents learn about endogenous asset prices. Learning leads to inefficient asset price fluctuations and distortions in consumption and investment decisions. We find that the policy-relevant natural real interest rate increases with subjective asset price beliefs. Optimal monetary policy therefore raises interest rates when expected capital gains are high. When the asset is not in fixed supply, optimal policy also "leans against the wind". In a simple calibration of the model, a positive response to capital gains in simple interest rate rules is beneficial. Our results are robust to alternative belief specifications.},
}