feds · December 15, 2022

Understanding Uncertainty Shocks and the Role of Black Swans

Abstract

Economic uncertainty is a powerful force in the modern economy. Research shows that surges in uncertainty can trigger business cycles, bank runs and asset price fluctuations. But where do sudden surges in uncertainty come from? This paper provides a data-disciplined theory of belief formation that explains large fluctuations in uncertainty. It argues that people do not know the true distribution of macroeconomic outcomes. Like Bayesian econometricians, they estimate a distribution. Our main contribution is to explain why real-time estimation of distributions with non-normal tails is prone to large uncertainty fluctuations. We use theory and data to show how small changes in estimated skewness whip around probabilities of unobserved tail events (black swans). Our estimates, based on real-time GDP data, reveal that revisions in the estimates of black swan risk explain most of the fluctuations in uncertainty.

Finance and Economics Discussion Series
Federal Reserve Board, Washington, D.C.
ISSN 1936-2854 (Print)
ISSN 2767-3898 (Online)

Anna Orlik and Laura Veldkamp
2022-083

Please cite this paper as:
Orlik, Anna, and Laura Veldkamp (2022). “Understanding Uncertainty Shocks and the Role of Black Swans,” Finance and Economics Discussion Series 2022-083. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2022.083.

NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Anna Orlik and Laura Veldkamp 1
November 26, 2022

1 Please send comments to anna.a.orlik@frb.gov and lv2405@columbia.edu. We are grateful for comments from NBER EF&G meetings, Wharton, Yale SOM, Banque de France, HEC, Bank of England, Toulouse School of Economics, U. Quebec at Montreal, Conference on Macroeconomic Uncertainty at Dallas Federal Reserve, the NBER Summer Institute forecasting group, NYU Alumni Conference, North American and European meetings of the Econometric Society, AEA meetings, Becker-Friedman Institute Policy Uncertainty Workshop, SED meetings, NBER Universities Research Conference on The Macroeconomic Consequences of Risk and Uncertainty, Barcelona GSE Summer Forum, CEF meetings, UCL Uncertainty and Economic Forecasting Workshop, Federal Reserve Bank of Boston, the NYU macro lunch, and also to Nick Bloom, Etienne Gagnon, Michele Lenza, Ignacio Presno, Thomas Sargent, Matthew Smith, Tom Stark, our EF&G discussant Jennifer La’O, our NBER Universities Research Conference discussant, Rudi Bachmann, our AEA discussant, Lars Hansen, and our PSE-SCOR Annual Conference discussant, Laurent Ferrara. Many thanks to Edward Atkinson, Isaac Baley, David Johnson, Callum Jones, Nic Kozeniauskas, Pau Roldan, Jake Scott and Ethan Rahman for outstanding research assistance. We acknowledge grant assistance from the Stern Center for Global Economy and Business. The views expressed herein are those of the authors and do not necessarily reflect the position of the Board of Governors of the Federal Reserve or the Federal Reserve System.


Economic uncertainty is a powerful force in the modern economy. Recent work shows that surges in uncertainty can trigger business cycles, bank runs and asset price fluctuations.1 But the way uncertainty shocks are typically modelled is that one day, every agent suddenly knows that future outcomes will be less predictable than in the past (i.e., uncertainty shocks are modelled as shocks to the second moment of some fundamental shock, e.g. productivity). These mysterious belief shocks are not disciplined by data, making the theories hard to test.

This paper provides a data-disciplined theory of belief formation that explains large fluctuations in uncertainty. It starts from the premise that people do not know what the true distribution of economic outcomes is, when it changes, or by how much. They observe economic information and, conditional on that information, estimate the probabilities of alternative outcomes. Much of their uncertainty comes from not knowing if their estimates are correct. Because everyday occurrences are observed frequently, their probabilities are easy to learn. After a short period, new data does not significantly alter those estimates. In contrast, the tails of a distribution are rarely observed; so their size and shape is difficult to assess. When people use observed data to infer the probabilities of unobserved tail events, new data can “wag the tail” of the distribution: It causes large revisions in tail probabilities. Since variance is expected squared distance from the mean, changes in the probabilities of events far from the mean have outsized effects on conditional variance and thus uncertainty. Thus, everyday fluctuations in a data series can produce large fluctuations in conditional variance for an agent who is constantly re-estimating the tails of the distribution.

We use real-time data to measure the uncertainty (conditional standard deviation) that arises from not knowing the true model, model uncertainty.
Then, we use a combination of data and probability theory results to explain why uncertainty varies so much. These results reveal that it is the combination of parameter uncertainty and tail risk that makes uncertainty more variable and more counter-cyclical than stochastic volatility alone. We learn why the greatest contribution to uncertainty fluctuations comes not from changes in the variance of the data, but rather from the time-varying risk of the unobserved adverse tail events, the black swans.

1 See, e.g., Bloom, Floetotto, Jaimovich, Sapora-Eksten, and Terry (2012), Fajgelbaum, Schaal, and Taschereau-Dumouchel (2014), or Bacchetta, Tille, and van Wincoop (2012).

To explore uncertainty, we use a forecasting model with two key features: First, outcomes are not conditionally normally distributed, and second, agents use real-time data to re-estimate parameters that govern the distribution’s higher moments, such as skewness. For each quarter, we use the vintage of U.S. (real) GDP growth data that was available at that date to estimate the forecasting model, update the forecast, and compute uncertainty.

We define macroeconomic uncertainty as the standard deviation of next-period GDP growth $y_{t+1}$, conditional on all information observed through time $t$: $Std[y_{t+1} \mid \mathcal{I}_t]$. We use this definition because in most models this is the theoretically-relevant moment: When there is an option value of waiting, forecasts with a higher conditional variance (larger expected forecast error) raise the value of waiting to observe additional information. In order to study how uncertainty changes and why, we feed GDP data into our forecasting model and compute this standard deviation.

This conceptually simple measurement exercise makes three contributions. (1) It provides a unified framework to explore the origins of and connections between uncertainty shocks, news shocks (changes in the forecasts of future outcomes) and disaster risk. These strands of the literature have evolved separately and have all suffered from the criticism that the right beliefs can rationalize almost any economic outcome. Allowing all three shocks to arise from observed macro outcomes offers the prospect of a unified information-based macro theory and a way to discipline the shocks to beliefs. (2) The results teach us that when agents do not know the distribution of shocks, re-estimating beliefs can amplify changes. It is not obvious that parameter learning would amplify shocks. Because key macro data are published only infrequently and are highly persistent, parameter learning is a slow, gradual process. Thus, one might think that learning would make uncertainty shocks smoother than changes in volatility. Instead, we find that the opposite is true.
This finding complements models that rely on large, counter-cyclical shocks to uncertainty to generate interesting economic and financial effects. (3) The results are consistent with the observed forecast data, in particular with the puzzling systematic forecast bias observed in the Survey of Professional Forecasters’ (SPF) forecasts of GDP growth.2 Our theoretical results use a change-of-measure argument to prove that the combination of parameter uncertainty and skewness produces such a bias. When the estimated model matches the degree of skewness observed in the GDP growth data, it also matches the size of the forecast bias. The finding resolves a puzzle in the forecasting literature. It also produces beliefs that look similar to what an ambiguity-averse agent might report. But, just as importantly, this evidence suggests that the model accurately describes how people form beliefs.

2 It is quite remarkable how well a simple skewed forecasting model of GDP growth re-estimated on real-time GDP growth data only (and/or, in addition to, “signals”, which are meant to summarize all other information professional forecasters use to form their forecasts) fits the average SPF forecast, and, hence, the forecast bias, the variation of the mean forecast over time, and the average forecast error.

Using data and a simple model to infer beliefs is a key strength of our approach. We do not aim to measure uncertainty in the most sophisticated possible way. Rather, we use a simple framework to describe a theoretical mechanism, supported by data, to explain why uncertainty, beliefs and tail risk vary.3 The key assumption of the mechanism is that agents use everyday events to revise their beliefs about probabilities over the entire state space. This is what allows small changes in data to trigger large changes in black swan probabilities and sizeable fluctuations in uncertainty. The idea that data in normal times would change how we assess tail risk might strike one as implausible. But there is an abundance of evidence that perceptions of tail risks can vary on a daily basis.4 If we think that tail risks fluctuate in times when no extreme events occur, then either beliefs are random and irrational, or there is some information in the everyday data that agents use to update their beliefs.5

In section 2, we build our forecasting model. Using a change-of-measure technique, we amend a standard class of models where GDP growth is assumed to be conditionally normally distributed (whether with homoscedastic or heteroskedastic innovations) by adding an exponential twist, with parameters that regulate the conditional skewness of outcomes. Each period $t$, our forecaster uses the history of GDP data as seen at time $t$ and Bayes’ law to estimate her model and forecast GDP growth in $t+1$.
Initially, we hold the volatility of the innovations fixed so that we can isolate the changes in uncertainty that come from parameter learning. Even when the forecaster is certain that the variance of innovations is constant, we find large changes in the conditional variance of forecasts: big uncertainty shocks.

3 Still, our simple framework delivers an uncertainty measure which correlates and comoves with complex state-of-the-art uncertainty measures documented in the literature (constructed, possibly, from hundreds of time series), for example Jurado, Ludvigson, and Ng (2015).

4 See Kelly and Jiang (2014) for evidence based on firm-level asset prices or Gao and Song (2018) for evidence based on index options.

5 Of course, it is possible that the everyday data that is informative about tail outcomes is not GDP data. But the same principles apply to other series. One could apply the same framework and estimate tail risk from some other series to amplify its effect on uncertainty.

Section 3 discusses the sources of these large fluctuations in uncertainty. In particular, we ask how much of these fluctuations comes from skewness, how much comes from parameter updating and how much from their interaction. To tease this out, we turn off parameter learning and skewness, one-by-one. We find that skewness alone generates a tiny fraction of changes in conditional variance. Parameter learning alone accounts for about one-third of our result. Most of the changes in conditional variance come from the interaction of skewness and parameter updating.

Our results reveal that the main source of uncertainty fluctuations is something we call “black swan risk,” which is the conditional probability of a rare event, in this case an extremely low growth realization. When the forecasting model implies a normal distribution of outcomes, the probability of an n-standard-deviation event is constant. But when we allow our forecaster to estimate a non-normal model, the probability of negative outliers can fluctuate. A new piece of data can lead the forecaster to estimate more negative skewness, which makes extreme negative outcomes more likely and raises uncertainty. When we apply this model to GDP data, we find that between 60% and 80% (depending on the definition of “extreme”) of the variation in uncertainty can be explained by changes in the estimated probability of black swans.

Section 5 puts our model-based uncertainty series in the context of commonly-used uncertainty proxies. In particular, we derive explicit conditions that need to be imposed on the information structure so that either mean-squared error (MSE) or forecast dispersion are equivalent to uncertainty. We also document properties of the Baker, Bloom, and Davis (2015) policy uncertainty index, the price of a volatility option (VIX), and the Jurado, Ludvigson, and Ng (2015) macro uncertainty index.
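The contrast drawn above, between a normal model where the probability of an n-standard-deviation event is fixed and a skewed model where it moves with the estimated skewness, can be illustrated with a small numerical sketch. It assumes the exponential transform $y = g^{-1}[\exp(gX) - 1]$ with normal $X$ that the paper adopts in Section 2. The function names, the threshold of -10, and the values of `mu`, `sigma` and `g` are hypothetical choices made for illustration, not the paper's estimates, and `mu` and `sigma` are held fixed while `g` changes, which a re-estimated model would not do.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution, used for its cdf


def normal_tail_prob(n_sd: float) -> float:
    """P(y < mean - n_sd * sd) under a normal model.

    The answer depends only on n_sd, not on the estimated mean or sd.
    """
    return PHI.cdf(-n_sd)


def skewed_tail_prob(threshold: float, g: float, mu: float, sigma: float) -> float:
    """P(y < threshold) when y = (exp(g*X) - 1) / g and X ~ N(mu, sigma^2).

    The transform is increasing in X, so {y < threshold} is the same event
    as {X < log(1 + g*threshold) / g} whenever 1 + g*threshold > 0.
    """
    arg = 1.0 + g * threshold
    if arg <= 0.0:
        # Threshold is beyond the bounded end of the support of y:
        # above the upper bound when g < 0, below the lower bound when g > 0.
        return 1.0 if g < 0 else 0.0
    x_cut = math.log(arg) / g
    return PHI.cdf((x_cut - mu) / sigma)


if __name__ == "__main__":
    # Hypothetical values (annualized percent growth), for illustration only.
    mu, sigma, threshold = 2.5, 3.0, -10.0
    print("normal 3-sd tail:", normal_tail_prob(3.0))
    for g in (-0.03, -0.05):
        print("g =", g, "P(y < -10) =", skewed_tail_prob(threshold, g, mu, sigma))
```

In this sketch a modest move in `g` changes the probability of the same extreme outcome by a large proportion, while the normal tail probability stays put.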
Our message is that understanding the sources of economic uncertainty requires relaxing the full-information assumptions of the rational expectations hypothesis. In such a full-information world, agents are assumed to know what the true distribution of economic outcomes is. Their only uncertainty is about what realization will be drawn from a known distribution. To measure the uncertainty of such a forecaster, it makes sense to estimate a model on as much data as possible, take the parameters as given, and estimate the conditional standard deviation of model innovations. This is what stochastic volatility estimates typically are (Born and Pfeifer, 2012). But, in reality, the macroeconomy is not governed by a simple, known model and we surely do not know its parameters. Instead, our forecast data from the Survey of Professional Forecasters (SPF) suggests that forecasters estimate simple models to approximate complex processes and constantly use new data to update their beliefs. Forecasters are not irrational. They simply do not know the economy’s true data-generating process. In such a setting, uncertainty and volatility can behave quite differently. Our findings teach us that learning about the distribution of economic outcomes may itself generate fluctuations.

Related Literature

A new and growing literature uses uncertainty shocks as a driving process to explain business cycles (e.g., Bloom, Floetotto, Jaimovich, Sapora-Eksten, and Terry (2012), Basu and Bundick (2017), Christiano, Motto, and Rostagno (2014), Ilut and Schneider (2014), Bidder and Smith (2012)), investment dynamics (Bachmann and Bayer, 2014), price-setting (Baley and Blanco, 2015), asset prices (e.g., Bansal and Shaliastovich (2010), Pastor and Veronesi (2012)), or to explain banking panics (Bruno and Shin, 2015). A related literature uses tail risk to explain asset pricing puzzles (e.g., Rietz (1988), Barro (2006), and Wachter (2013)) and business cycle fluctuations (Gourio (2012)). These theories are complementary to ours. We explain where uncertainty shocks come from, while these papers trace out the many economic and financial consequences of these shocks.

A growing literature in macroeconomics and finance explores how agents use information to form beliefs, with tools such as rational inattention (e.g., Maćkowiak and Wiederholt (2009), Matejka and McKay (2015), Kacperczyk, Nosal, and Stevens (2019)), inattentiveness (Reis, 2006), sentiments (Angeletos and La’O, 2013) or information diffusion (Amador and Weill, 2010). Our mechanism is not inconsistent with any of these frictions, all of which have Bayesian updating as a foundation.
Instead, our paper shows how enriching the set of variables updated, to include parameters that govern tail risk, can link these dynamics to fluctuations in uncertainty as well. An advantage of our approach is that the belief formation process that we postulate is strictly disciplined by and consistent with the data.

A small subset of these theories explains why uncertainty fluctuates using nonlinearities in a production economy (Van Nieuwerburgh and Veldkamp (2006), Fajgelbaum, Schaal, and Taschereau-Dumouchel (2014), Jovanovic (2006)), active experimentation (Bachmann and Moscarini (2012)) or multiple equilibria (Bacchetta, Tille, and van Wincoop (2012)).

Bachmann and Bayer (2013) support this endogenous uncertainty approach by arguing that uncertainty Granger-causes recessions, but not the other way around. In Nimark (2014), the key assumption is that only extreme events are reported. Thus, the publication of a signal reveals that the true event is extreme, which raises uncertainty. Our model differs because it does not depend on an economic environment, only on a forecasting procedure. In addition, our paper contributes a framework that connects uncertainty with disaster risk and news shocks, unifying the literature on the role of beliefs in macroeconomics.

Our exercise also connects with a set of papers that measure uncertainty shocks in various ways. Bloom (2009), Baker, Bloom, and Davis (2015), Giglio, Kelly, and Pruitt (2015), Stock and Watson (2012), Jurado, Ludvigson, and Ng (2015), Justiniano and Primiceri (2008), Born and Pfeifer (2014) document the properties of uncertainty shocks in the U.S. and in emerging economies, while Bachmann, Elstner, and Sims (2013) use forecaster data to measure ex-ante and ex-post uncertainty in Germany. While our paper also engages in a measurement exercise, we primarily contribute a theory of why such shocks arise.

Our methodological approach is motivated by Hansen (2007) and Chen, Dou, and Kogan (2022), which critique models that give agents knowledge of parameters that econometricians cannot identify. We were also inspired by two preceding papers that estimate Bayesian forecasting models to describe agents’ beliefs. Cogley and Sargent (2005) use such a model to understand the behavior of monetary policy, while Johannes, Lochstoer, and Mou (forthcoming) estimate a model of consumption growth to capture properties of asset prices. While the concept is similar, our use of a model with skewness is what allows non-extreme data to whip tail risk estimates around.
When the model is normal or discrete-state (as in Collin-Dufresne, Johannes, and Lochstoer (2016)), only potential disasters affect beliefs about tail probabilities. Furthermore, disaster states cannot be too extreme. Otherwise, agents will never believe they might be in the disaster. This severely limits the size of uncertainty fluctuations that result. In our model, the probability of every tail event, no matter how extreme, fluctuates when new data is observed.

Our work further draws on tools and ideas in finance models with learning and non-normal distributions, such as Breon-Drish (2015), Straub and Ulbricht (2021) and Chabakauri, Zachariadis, and Yuan (2021). In our model, agents learn about both parameters and states. We also draw on ideas in the economic forecasting literature about model comparisons, e.g., Giacomini and Rossi (2013) and in the Bayesian estimation literature in macroeconomics (e.g., Del Negro and Schorfheide (2011)). Finally, the black swan metaphor and its relation to tail risk is of course borrowed from Taleb (2010).

1 Definitions and Data Description

A model, denoted $\mathcal{M}$, has a vector of parameters $\theta$. Together, $\mathcal{M}$ and $\theta$ determine a probability distribution over a sequence of outcomes. Let $y^t \equiv \{y_\tau\}_{\tau=1}^{t}$ denote a series of data (in our exercises, the GDP growth rates) available to the forecaster at time $t$. Agent $i$’s information set $\mathcal{I}_{it}$ will include the model $\mathcal{M}$ and the history $y^t$ of observations up to and including time $t$. The state $S_t$, innovations, and the parameters $\theta$ are never observed.

The agent, whom we call a forecaster and index by $i$, is not faced with any economic choices. She simply uses Bayes’ law to forecast future outcomes. Specifically, at each date $t$, the agent conditions on her information set $\mathcal{I}_{it}$ and forms beliefs about the distribution of $y_{t+1}$. We call the expected value $E(y_{t+1} \mid \mathcal{I}_{it})$ an agent $i$’s forecast and the square root of the conditional variance $Var(y_{t+1} \mid \mathcal{I}_{it})$ is what we call uncertainty.6 Forecasters’ forecasts will differ from the realized growth rate. This difference is what we call a forecast error.

Definition 1. An agent $i$’s forecast error is the distance, in absolute value, between the forecast and the realized growth rate: $FE_{i,t+1} = \left| y_{t+1} - E[y_{t+1} \mid \mathcal{I}_{it}] \right|$.

We date the forecast error $t+1$ because it depends on a variable $y_{t+1}$ that is not observed at time $t$. Similarly, if there are $N_t$ forecasters at date $t$, an average forecast error is

$$\overline{FE}_{t+1} = \frac{1}{N_t} \sum_{i=1}^{N_t} FE_{i,t+1}.$$

We define forecast errors and uncertainty over one-period-ahead forecasts because that is the horizon we focus on in this paper. But future work could use these same tools to measure uncertainty at any horizon.
6 In the SPF, each forecaster is asked to submit GDP growth forecasts for different time horizons, or, formally, $E^h(y_{t+1} \mid \mathcal{I}_{it})$ for $h = 1, 2, \ldots, 6$, with $h = 1$ denoting a current-quarter GDP level (sometimes referred to as a nowcast), in USD bln. In this paper we focus our attention on horizons $h = 1, 2$ to compute the GDP growth rate for the most immediate horizon (variable “drgdp2” in the SPF). To the extent that uncertainty typically rises with the forecast horizon, our results can then be seen as providing the lower bound for the amount of uncertainty faced by the professional forecasters.
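Definition 1 and the cross-sectional average that follows it translate directly into a few lines of code. The sketch below is illustrative only: the function names and the three forecasts are hypothetical, and the last function computes the root of the average realized squared error, an ex-post quantity that is distinct from the expected squared forecast error.

```python
import math
from statistics import mean
from typing import Sequence


def forecast_error(forecast: float, realized: float) -> float:
    """Definition 1: absolute distance between forecast and realized growth."""
    return abs(realized - forecast)


def average_forecast_error(forecasts: Sequence[float], realized: float) -> float:
    """Average of individual forecast errors across the N_t forecasters at one date."""
    if not forecasts:
        raise ValueError("need at least one forecast")
    return mean(forecast_error(f, realized) for f in forecasts)


def realized_rmse(forecasts: Sequence[float], realized: float) -> float:
    """Square root of the cross-sectional average of realized squared forecast errors."""
    if not forecasts:
        raise ValueError("need at least one forecast")
    return math.sqrt(mean(forecast_error(f, realized) ** 2 for f in forecasts))


if __name__ == "__main__":
    # Hypothetical one-quarter-ahead growth forecasts (annualized percent).
    forecasts = [2.0, 2.5, 3.0]
    realized = 1.0
    print(average_forecast_error(forecasts, realized))
    print(realized_rmse(forecasts, realized))
```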

Definition 2. Uncertainty is the standard deviation of the time-$(t+1)$ GDP growth, conditional on an agent’s time-$t$ information:

$$U_{it} = \sqrt{E\left[ \left( y_{t+1} - E[y_{t+1} \mid \mathcal{I}_{it}] \right)^2 \,\middle|\, \mathcal{I}_{it} \right]}.$$

Volatility is the same standard deviation as before, but now conditional on the history $y^t$, the model $\mathcal{M}$ and the parameters $\theta$:

Definition 3. Volatility is the standard deviation of the unexpected innovations in $y_{t+1}$, taking the model and its parameters as given:

$$V_t = \sqrt{E\left[ \left( y_{t+1} - E[y_{t+1} \mid y^t, \theta, \mathcal{M}] \right)^2 \,\middle|\, y^t, \mathcal{M}, \theta \right]}.$$

If an agent knew the parameters (i.e., if $\mathcal{I}_{it} = \{y^t, \mathcal{M}, \theta\}$), then uncertainty and volatility would be identical. The only source of uncertainty shocks would be volatility shocks.

Many papers equate volatility, uncertainty and squared forecast errors. The definitions above allow us to study the conditions under which these are equivalent. Volatility and uncertainty are both ex-ante measures because they are time-$t$ expectations of $t+1$ outcomes (time-$t$ measurable). However, forecast error is an ex-post measure because it is not measurable at the time when the forecast is made. Combining definition 1 and definition 2 reveals that $U_{it} = \sqrt{E[FE_{i,t+1}^2 \mid \mathcal{I}_{it}]}$. So, uncertainty squared is the same as the expected squared forecast error.7

There are two pieces of data that we use to estimate and to evaluate our forecasting models. The first is real-time GDP data from the Philadelphia Federal Reserve. The variable we denote $y_t$ is the growth rate of GDP. Specifically, it is the log-difference of the real GDP series, times 400, so that it can be interpreted as an annualized percentage change. We use real-time data because we want to accurately assess what agents know at each date.
Allowing them to observe final GDP estimates, which are not known until much later, is not consistent with the goal.8 Therefore, $y_t$ represents the estimate of GDP growth between the end of quarter $t-1$ and quarter $t$, based on the GDP estimates available at time $t$. Similarly, $y^t$ is the history of GDP growth up to and including period $t$, based on the data available at time $t$.

7 Of course, what people measure with forecast errors is typically not the expected squared forecast error. It is an average of realized squared forecast errors: $\sqrt{\frac{1}{N_t} \sum_i FE_{i,t+1}^2}$.

8 Naturally, forecasters may use other information in conjunction with past GDP growth realizations to compute their forecasts. We explore a model with additional signals in Kozeniauskas, Orlik, and Veldkamp (2018). Another approach would be to take many series, extract a principal component or predictive quantile factor as in Jurado, Ludvigson, and Ng (2015) or Giglio, Kelly, and Pruitt (2015) and apply this Bayesian methodology to that factor. While this would likely produce a higher-precision forecast, the complexity would obscure the main message, which is about how the uncertainty shocks arise. We entertain a version of such an exercise, which we call “model with a public signal” in the appendix.

We use the second set of data, professional GDP forecasts, to evaluate our forecasting models. We describe below the four key moments that we use to make that assessment. The data come from the SPF released by the Philadelphia Federal Reserve. The data are a set of individual forecaster predictions from quarterly surveys from 1968Q4 to 2022Q2. The number of forecasters varies from quarter to quarter, with an average of 40.5 forecasts per quarter.9

2 A Skewed Forecasting Model with Parameter and State Uncertainty

The purpose of the paper is to explain why relaxing rational expectations and assuming that agents do not know the true distribution of outcomes opens up an additional source of uncertainty shocks. The key ingredients for our mechanism to operate are parameter uncertainty and skewness in the distribution. To isolate this new mechanism, we consider as simple a model as possible that has these two ingredients. We set the model up with stochastic volatility so that we can eventually explore the interaction and relative magnitudes of volatility and uncertainty fluctuations. But our results begin by shutting down the stochastic volatility so that we can see what comes from parameter updating alone. Later, we turn stochastic volatility back on to get a more complete picture of the sources of uncertainty shocks.

We consider a forecaster who observes real-time GDP growth data in every quarter, and forecasts the next period’s growth.
The agent contemplates a simple hidden state model as a true data generating process for GDP growth, but does not know the parameters of this model.10 Each period, she starts with prior beliefs about these parameters and the current state, observes the new GDP data and the new revisions of past GDP data, and updates her beliefs using Bayes’ law.11

9 While our forecasting model can be thought of as representing a mean quarterly prediction for real GDP growth, in the appendix we explain how our baseline model can be extended to account for the average forecast dispersion (across forecasters and time) without augmenting our main results quantitatively.

10 Formally, parameter uncertainty constitutes a form of model uncertainty (families of models can be indexed by parameters; a parameter could even be an indicator for one of two non-nested models).

A key question is which forecasting model the agent should use. Once we move away from a linear-normal model, there is an infinite set of possibilities. We narrow this set by focusing on a simple (non-normal but relatively easy to estimate) distribution with skewness. In the real GDP (1968:Q4-2022:Q1) data we use for our forecasting model estimation, the skewness of GDP growth is strong: -2.37. Skewness is also a feature of many models. Models where workers lose jobs quickly and find jobs gradually, or models where borrowing constraints amplify downturns are just a couple of examples of models that generate booms that are more gradual than crashes. Finally, we find that a model with skewness both does a better job of matching features of forecast data and generates much larger uncertainty shocks.

Estimating the parameter uncertainty in skewed distributions typically requires particle filtering, which is possible, but typically burdensome. We make this problem tractable by using a change of measure of a normal variable to introduce skewness.12 The Radon-Nikodym theorem tells us that, for any measure $g$ that is absolutely continuous with respect to a measure induced by a normal distribution, we can find a change-of-measure function $f$ such that $g(x) = \int f(x)\, d\Phi(x)$, where $\Phi$ is a normal cdf. If we estimate such an $f$ function, we can use $f^{-1}$ to take skewed data and transform it into normal data, so that we can then use standard tools from Kalman filtering and Bayesian econometrics to estimate the model parameters. Concave functions of normal variables will produce negatively skewed variables and convex functions of normal variables will produce positively skewed variables.

Thus, we consider the following general forecasting model that is a standard linear hidden state model with a functional operator $f$ that can be non-linear to capture skewness.
$$y_t = f(X_t) \qquad (1)$$

$$X_t = x(S_t) + \sigma(S_t)\,\epsilon_t$$

where $\epsilon_t \sim N(0,1)$ is an i.i.d. random variable. We explore linear and non-linear transformations $f$ that induce either conditionally normal or skewed distributions for $y_t$.

11 In the baseline model we estimate, we let the forecasters use real-time data vintages of a fixed length (of 18 years). When we estimate the model on full available histories of observations, the qualitative results on how uncertainty shocks emerge prevail but there is a stronger trend in the time series of uncertainty due to Bayesian learning (and the fact that in 2022Q1, a forecaster would have access to about 230 quarterly observations).

12 One other possibility would be to consider a model with Student-t innovations. Unfortunately, the estimation of such models may be cumbersome, and some moments of the posterior distribution may cease to exist depending on prior assumptions (Geweke (1993)).

Of course, allowing a forecaster to explore the whole function space of non-linear $f$’s is not viable. Instead, we use an approximating function. We focus the problem by considering a function $f$ whose log is a linear approximation to many functions that would fit the data. If this approximate function generates large uncertainty shocks, it tells us that the set of functions that $f$ approximates likely does as well.

Following textbook Bayesian statistics practices (e.g., Headrick (2010), Hoaglin, Mosteller, and Tukey (1985)), we use an exponential $f$ function to approximate the class of skewed distributions.13 Exponential models are used because they have three desirable properties: (1) The domain is the real line (so it can take a normal variable as an argument); (2) it is monotone; and (3) it can be either globally concave or globally convex, depending on the estimated parameters. For our purposes, the simplicity allows us to better understand why the combination of skewness and parameter uncertainty generates large, counter-cyclical uncertainty shocks, even though the underlying process that we estimate is homoscedastic. Thus, our baseline skewed forecasting model is (1) with the following specific assumptions.

Baseline Skewed Model

$$y_t = g^{-1}\left[ \exp(g X_t) - 1 \right]$$

$$X_t = S_t + \sigma \epsilon_t$$

$$S_t = \alpha + \beta S_{t-1} + \sigma_S \varepsilon_t$$

where the following restriction applies to parameter $g$: $g \in \left( -\frac{1}{\max y^t};\ -\frac{1}{\min y^t} \right)$, and where $\varepsilon_t \sim N(0,1)$ is an i.i.d. random variable, also independent of $\epsilon_t$.

This is a simplified representation, a model, of how an agent forms beliefs. Specifically, note that shocks have no time-varying volatility (constant $\sigma$ and $\sigma_S$).
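The baseline skewed model above is easy to simulate, which gives a quick check on the claim that a concave transform of a normal variable produces negative skewness. The sketch below draws from the three model equations with constant `sigma` and `sigma_s`. All parameter values, the function names and the choice to start the state at its unconditional mean are assumptions made for illustration; they are not the paper's estimates.

```python
import math
import random
from typing import List


def simulate_skewed_growth(n: int, g: float, alpha: float, beta: float,
                           sigma: float, sigma_s: float, seed: int = 0) -> List[float]:
    """Simulate y_t = (exp(g*X_t) - 1)/g, X_t = S_t + sigma*eps_t,
    S_t = alpha + beta*S_{t-1} + sigma_s*e_t, with independent N(0,1) shocks."""
    if g == 0.0 or abs(beta) >= 1.0:
        raise ValueError("sketch requires g != 0 and |beta| < 1")
    rng = random.Random(seed)
    state = alpha / (1.0 - beta)  # assumed start: unconditional mean of S
    ys = []
    for _ in range(n):
        state = alpha + beta * state + sigma_s * rng.gauss(0.0, 1.0)
        x = state + sigma * rng.gauss(0.0, 1.0)
        ys.append((math.exp(g * x) - 1.0) / g)
    return ys


def sample_skewness(values: List[float]) -> float:
    """Third central moment divided by the 1.5 power of the second."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to make the pattern visible.
    for g in (-0.1, 0.1):
        ys = simulate_skewed_growth(20000, g, alpha=1.5, beta=0.4,
                                    sigma=2.0, sigma_s=2.0, seed=1)
        print("g =", g, "sample skewness =", sample_skewness(ys))
```

With negative `g` the transform is concave and bounded above by `-1/g`, so the simulated series has a long left tail; flipping the sign of `g` flips the sign of the skewness.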
We want to understand the fluctuations in conditional variance that come from skewness and parameter estimation alone. This assumption allows us to see how uncertainty and the skewness of $y_t$ (2) depend sensitively on the parameter values in (1).

To isolate the role of parameter uncertainty relative to skewness for the evolution of uncertainty in our baseline model, we conduct two exercises. First, we compare the results from the estimation of this model with results when parameter uncertainty is ignored and the forecaster fixes model parameters. This exercise allows us to contrast uncertainty (Definition 2) with volatility (Definition 3). Next, to isolate the nonlinear transformation (skewness) effect, we compare these results to those from a model where $f$ is linear (resulting in a standard normal-Gaussian model). In the appendix we present two extensions of this baseline model: one with stochastic volatility and another one where forecasters receive additional information in the form of private and public signals.

13 What we are doing is estimating a probability density from a set of discrete data. A typical approach is to use a kernel density estimator. But we want to account for parameter uncertainty. Standard kernel densities have too many parameters to feasibly estimate their joint distribution. Therefore, Bayesian statisticians use the g-and-h family to estimate distributions with skewness, using a small number of parameters. Our transformation is a simple, limiting case of this g-and-h transformation where h=0.

Information sets and updating of beliefs in the baseline skewed model

Each forecaster has an identical information set, $\mathcal{I}_{it} = \{y^t, \mathcal{M}\}$, $\forall i$. The state $S_t$ and the parameters $\theta = [g, \alpha, \beta, \sigma, \sigma_S]'$ are never observed.

The two most important objects in question, the (conditional) GDP growth forecast and the uncertainty around that forecast, can be written using the law of iterated expectations as

$$E\left(y_{t+1} \mid y^t\right) = \int\!\!\int\!\!\int\!\!\int y_{t+1}\, p\left(y_{t+1} \mid S_{t+1}, S_t, \theta, y^t\right) p\left(S_{t+1} \mid S_t, \theta, y^t\right) p\left(S_t \mid \theta, y^t\right) p\left(\theta \mid y^t\right) d\theta\, dS_t\, dS_{t+1}\, dy_{t+1} \tag{2}$$

In similar fashion, we also define $E\left(y_{t+1}^2 \mid y^t\right)$ by integrating out over unknown states and parameters for every history of observations. Applying the variance formula $Var\left(y_{t+1} \mid y^t\right) = E\left(y_{t+1}^2 \mid y^t\right) - E\left(y_{t+1} \mid y^t\right)^2$, and taking the square root yields uncertainty: $U_t = \sqrt{Var(y_{t+1} \mid y^t)}$.

The difficulty in characterizing growth forecasts in (2) lies in the joint estimation and filtering within this nonlinear model and, specifically, in characterizing the joint conditional distribution $p\left(S_t, \theta \mid y^t\right)$. The estimation procedure we implemented relies on change-of-measure techniques to convert our nonlinear model into a linear conditional Gaussian model. Notice that for a given $g$, we can map the observed GDP growth rate data, $y_t$, into $X_t$ as $g^{-1}\log(1 + g y_t) = X_t$. The latter variable is conveniently conditionally normally distributed and we can adopt a (slightly modified) version of sequential Bayesian learning techniques known as exact particle filtering and parameter learning (Johannes and Polson

(2006)).14 In general, particle filtering simply consists of an algorithm for generating new particles $(S_{t+1}, \theta)^{(i)}$ given existing particles and a new observation, $y_{t+1}$. Exact particle filtering is a recursive algorithm for generating direct and exact particles $(v_{t+1}, S_{t+1}, \theta)^{(i)}$ where $v_t$ is a vector of conditional sufficient statistics for the posterior distribution of the parameters (hyperparameters of the parameter distribution). For the detailed description of the estimation algorithm and prior assumptions see the appendix.

As can be seen in Figure 1, the skewed forecasting model appears to be a plausible model of belief formation because it matches the evolution of forecasts in the SPF. In particular, it allows us to reproduce the forecast bias: the average forecast in the period 1968Q4-2022Q1 is 0.3 percentage point lower than the average GDP growth realization over that period (and the forecast bias is twice as large in the period 1968Q4-2008Q1). This bias has been a puzzle in the forecasting literature because an unbiased forecaster with a linear model and more than sixty years of data should not make such large systematic errors. We offer a new explanation for this forecasting puzzle: forecast bias arises from rational Bayesian belief updating when forecasters believe outcomes have negative skewness and are uncertain about model parameters. While this bias might prompt one to use another estimation procedure, keep in mind that the objective in this paper is to describe a belief-formation process. The fact that our model has forecasts that are just as biased as professional forecasts suggests that Bayesian estimation might offer a good approximation to human behavior.

3 Results: Black Swan Risk and Uncertainty Fluctuations

The estimated uncertainty associated with GDP growth forecasts in the SPF is plotted in Figure 2. The resulting time series of uncertainty varies substantially over time despite the fact that the fundamental shocks are homoscedastic.
In this section we isolate the mechanisms at play and discuss how uncertainty shocks arise. Constant volatility may or may not be a realistic feature of the data. But it is a helpful starting point because it allows us to isolate the fluctuations in uncertainty that come from skewness and parameter learning. We begin by showing that neither parameter updating nor skewness alone produces the large uncertainty fluctuations. Instead, most of the effect arises from the interaction of these two forces. Then, we proceed to explain how this interaction effect works.

Figure 1: SPF forecasts and model forecasts.

Figure 2: Uncertainty and volatility implied by the skewed forecasting model.

14 Another way to obtain draws from the joint distribution over parameters and hidden states is to use the Random Walk Metropolis-Hastings algorithm. The procedure produces samples of parameter vectors from the posterior parameter distribution while filtering the hidden state for each parameter vector using the Kalman filter. One referee pointed out in the past that such an algorithm potentially adds "numerical uncertainty" to our measure of estimated uncertainty. We have decided on a different procedure since but would like to report that both estimation techniques produce results which are largely similar.
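The Kalman-filter route mentioned in footnote 14 rests on the mapping from $y_t$ to the conditionally normal $X_t$. The sketch below shows that building block for one fixed, hypothetical parameter vector: it maps growth rates into $X_t$, runs a scalar Kalman filter for the state, and accumulates the log-likelihood with the Jacobian of the transformation. It is not the exact particle filter with parameter learning used in the paper; the function names, the stationary initialization and the numbers are assumptions.

```python
import math

def to_latent(y, g):
    """Map observed growth y into X = log(1 + g*y)/g (requires 1 + g*y > 0)."""
    return math.log1p(g * y) / g

def kalman_loglik(ys, g, alpha, beta, sigma, sigma_s):
    """Log-likelihood of ys and the one-step-ahead state belief, parameters fixed."""
    m = alpha / (1.0 - beta)              # prior mean of S_t: stationary mean
    p = sigma_s ** 2 / (1.0 - beta ** 2)  # prior variance of S_t: stationary variance
    loglik = 0.0
    for y in ys:
        x = to_latent(y, g)
        f = p + sigma ** 2                # predictive variance of X_t
        v = x - m                         # forecast error in X-space
        # Gaussian log-density of X_t plus the Jacobian term log|dX/dy| = -log(1 + g*y)
        loglik += -0.5 * (math.log(2.0 * math.pi * f) + v * v / f) - math.log1p(g * y)
        k = p / f                         # Kalman gain
        m_filt, p_filt = m + k * v, (1.0 - k) * p
        m = alpha + beta * m_filt         # predicted mean of S_{t+1}
        p = beta ** 2 * p_filt + sigma_s ** 2  # predicted variance of S_{t+1}
    return loglik, m, p

# Hypothetical growth rates (percent) and parameters, for illustration only.
growth = [3.1, 2.4, -1.0, 4.2, -8.0, 1.5]
ll, s_mean, s_var = kalman_loglik(growth, g=-0.03, alpha=0.5, beta=0.8,
                                  sigma=3.0, sigma_s=1.5)
```

A Metropolis-Hastings sampler of the kind the footnote describes would evaluate a likelihood like this one at each proposed parameter vector; that outer loop is omitted here.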

Time series properties
  Mean               $U_t$   3.15%
                     $V_t$   2.65%
  Std deviation      $U_t$   0.76%
                     $V_t$   0.02%
  Autocorrelation    $U_t$   0.95
                     $V_t$   0.36
Cyclical properties
  Corr($U_t$, $E_t[y_{t+1}]$)   -0.18
  Corr($V_t$, $E_t[y_{t+1}]$)   -0.98
Forecast properties     SPF data    Model
  Mean forecast         2.31%       2.2%
  Mean |F Err|          1.85%       1.84%
  Std forecast          3.37%       2.65%
  Std |F Err|           1.58%       1.98%

Table 1: Properties of model uncertainty series. Forecasts are computed using equation (2). Forecast error is (forecast - final GDP growth). Uncertainty, denoted $U_t$, is computed as in Definition 2. Volatilities, denoted $V_t$, are computed as in Definition 3 assuming that the parameters $\theta$ are known and equal to the mean posterior beliefs at the end of the sample for the parameter learning models.

What effects can parameter estimation explain? The upper panel of Table 1 reveals that our skewed model delivers large uncertainty shocks (stdev($U_t$) = 0.76%). When a forecaster sees an outlier observation, he revises the estimated variance of one or both innovations as well as his estimate of skewness and/or the remaining parameters of the model. (We look into these outlier observations and the resulting parameter revisions in detail below.) To isolate the effects of parameter revisions alone, we have also estimated a linear-normal forecasting model. The uncertainty shocks in this model are about half the size of the shocks in our baseline skewed model. Importantly, the linear-normal model cannot account for the SPF forecast bias: the average forecast in this model (2.7%) is even slightly higher than the average GDP growth realization that the forecaster sees (2.6%) in the period 1968Q4-2022Q1.

What part of the results can skewness explain? One reason that uncertainty varies so little with a normal forecasting model is that the normal distribution has the unusual

property that the conditional variance is the same irrespective of the conditional mean. An n-standard-deviation event is always equally unlikely. Since uncertainty is a conditional variance, the normal distribution shuts down much scope for changes in uncertainty. The skewed forecasting model does have a conditional variance that depends on the mean. Even when parameters of the model are known, changes in the estimated state move the conditional standard deviation of the forecast. This raises the question of whether most of our variation in uncertainty comes from skewness alone.

In Table 1, upper panel, the rows labelled $V_t$ report the moments of the model without parameter uncertainty or parameter revisions. Indeed, even without the parameter revisions, uncertainty (volatility) does vary. But that effect is tiny. Updating beliefs about the skewness of the GDP growth distribution has a large effect on uncertainty. Such learning increases the average level of uncertainty by about half a percentage point. And it amplifies uncertainty shocks. One can interpret the magnitude of this standard deviation (0.76%) relative to the mean. A 1-standard deviation shock to uncertainty raises uncertainty about 25% above its mean. That is quite a volatile process and offers a stark contrast to the relatively modest changes in volatility typically measured.

Keep in mind that there is still no stochastic volatility in this model. To the extent that we believe that there are volatility shocks to GDP, this would create additional shocks to uncertainty, above and beyond those we have already measured. (This is an extension we contemplate in the appendix.) In addition, uncertainty is very persistent and countercyclical.

3.1 Skewness and Time-Varying Black Swan Risk

To understand why uncertainty varies so much, it is helpful to look at the probability of tail events. Since our estimated probability distribution is negatively skewed (i.e.
the mean/median posterior estimate of parameter $g$ is negative most of the time, see Figure 8), negative outliers are more likely than positive ones. For a concrete example, let us consider the probability of a particular negative growth event. The historical mean of GDP growth is 2.63%, while its standard deviation is 4.56%. If GDP growth were normally distributed, then $y_{t+1} \le -10\%$ would be about a 1-in-100-year event (Pr = 0.0028 quarterly). Let us

call this rare event a black swan.

$$\text{BlackSwanRisk}_t = \text{Prob}\left[y_{t+1} \le -10\% \mid \mathcal{I}_t\right]. \tag{3}$$

Figure 3 plots the estimated time series of Black Swan Risk. The correlation of this time series with our time series of uncertainty is high. This illustrates that uncertainty shocks arise in times when the estimated probabilities of extreme events change. Our model suggests that uncertainty builds up gradually over time as more and more unusual observations are realized. When the distribution of GDP growth is non-normal and states and parameter estimates change over time, the probability of this black swan event fluctuates.

Figure 3: Black Swan Risk is defined in (3).

The black swan risk varies considerably. For example, leading up to the 2008 financial crisis, the black swan probability rose from 0.04% in 2007:Q1 to over 18% in 2009:Q3. It reached its historical high during the pandemic.

These results teach us that when we include parameter uncertainty in our notion of economic uncertainty, and we consider a model with skewed outcomes, then most changes in uncertainty coincide with changes in the estimated probability of rare events. Most of these uncertainty shocks were not present when we did not allow the forecaster to update his skewness belief. When we allow for learning about skewness, new pieces of data

cause changes in the skewness estimates. Tail event probabilities are very sensitive to this skewness parameter. When the probability of extreme events is high, uncertainty is high as well.

This explanation raises the question: What types of data realizations make estimated skewness more negative, increase black swan risk, and thereby generate uncertainty shocks? We find two types of episodes that set up large uncertainty shocks. The first is simply a large negative GDP growth realization. When a negative outlier is observed, the forecaster revises skewness to be more negative and increases the estimated variance of shocks, both of which cause the probability of a black swan event and uncertainty to rise. This is what happens in 2008 and in the early 1980s.

But there is a second, more subtle cause of uncertainty shocks that comes from a sequence of mild positive GDP growth realizations in a row followed by a mildly negative observation. These observations cause the forecaster to increase the estimated mean of the distribution. When the mean increases, the existing negative outlier data points become further from the mean. Because the previously-observed negative realizations are more extreme, the magnitude of estimated skewness rises and the probability of rare negative events can rise as well. This is what happens in the early 1970s, as can be seen in Figure 4. A sequence of positive growth realizations causes a rise and then a fall in uncertainty. But the persistence of the high estimated skewness sets the stage for the large rise in uncertainty in the second half of the 1970s. This mechanism provides one explanation for why uncertainty seems to rise particularly at the end of long spells of consistently positive growth.

3.2 Negative Skewness as a Force for Counter-Cyclical Uncertainty

One way of understanding the cyclical effect skewness has on uncertainty is by thinking about the skewed distribution as a non-linear transformation of a normal distribution.
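Viewing $y$ as the transformation $y = g^{-1}[\exp(gX) - 1]$ of a normal $X \sim N(m, s^2)$ gives closed forms for both the conditional standard deviation of $y$ and the probability of a tail event. The sketch below, using hypothetical values of $m$, $s$ and $g$ held fixed in $X$-space, illustrates two points from the text: the standard deviation of $y$ falls as the state mean rises, and the probability of $y \le -10\%$ is sensitive to $g$. The function names are assumptions for this example.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def skewed_sd(m, s, g):
    """Std. dev. of y = (exp(g*X) - 1)/g when X ~ N(m, s^2) (lognormal formula)."""
    return (math.sqrt(math.exp(g * g * s * s) - 1.0)
            * math.exp(g * m + 0.5 * g * g * s * s) / abs(g))

def tail_prob(c, m, s, g):
    """Pr(y <= c): y is increasing in X, so y <= c iff X <= log(1 + g*c)/g."""
    return normal_cdf((math.log1p(g * c) / g - m) / s)

# Normal benchmark from the text: mean 2.63%, std 4.56%, threshold -10%.
p_normal = normal_cdf((-10.0 - 2.63) / 4.56)

# Hypothetical state means (0 vs 4), spread s = 4 and skewness parameters.
sd_low, sd_high = skewed_sd(0.0, 4.0, -0.03), skewed_sd(4.0, 4.0, -0.03)
p_mild = tail_prob(-10.0, 3.0, 4.0, -0.01)
p_strong = tail_prob(-10.0, 3.0, 4.0, -0.05)
print(f"normal benchmark: {p_normal:.4f}")
print(f"sd at low vs high state: {sd_low:.2f} vs {sd_high:.2f}")
print(f"tail prob, g=-0.01 vs g=-0.05: {p_mild:.4f} vs {p_strong:.4f}")
```

The normal benchmark comes out close to the quarterly probability of 0.0028 quoted in Section 3.1; the other numbers depend entirely on the illustrative parameter choices.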
The transformation has no economic interpretation. It does not represent a utility function, production function or anything other than an estimated change-of-measure function that regulates the skewness of outcomes.15 But since many problems in economics use normal shocks and compute means and variances of concave functions of these shocks, we can leverage that intuition here to understand the role of skewness. (See Albagli, Hellwig, and Tsyvinski (2015) for a similar approach.)

15 Although this paper does not try to explain the negative skewness of outcomes, many other theories do. When the economy is functioning very well, then improving its efficiency results in a small increase in GDP. But if there is a high degree of dysfunction or inefficiency, then the economy can fall into depression. Many models generate exactly this type of effect through borrowing or collateral constraints, other financial accelerator mechanisms, matching frictions, or information frictions. Even a simple diminishing returns story could explain such skewness.

[Figure 4, left panel: frequency histogram of growth rates (%), marking observations in both dates, 71Q2 only and 73Q1 only, with the 71Q2 and 73Q1 means. Right panel: estimated densities over growth rates (%) for 71Q2 and 73Q1. Reported moments, 71Q2 vs 73Q1: mean 3.62 vs 4.02; std. 3.40 vs 3.41; skewness -0.46 vs -0.67. Densities at -13.4% (a -5σ event): 71Q2 0.46%, 73Q1 0.93%.]

Figure 4: An example of a positive growth episode that increased the estimated mean, skewness and black swan probability.

The following result shows that a concave transformation of a variable with a normal probability density results in a variable whose distribution has negative skewness. For proof see Appendix A.

Lemma 1. Suppose that $y$ is a random variable with a probability density function $\phi(g^{-1}(y))$, where $\phi$ is a standard normal density and $g$ is an increasing, concave function. Then, $E[(y - E[y])^3] < 0$.

The unconditional distribution of GDP growth rates is negatively skewed. Therefore, when we estimate the change of measure function that maps a normal variable $X_t$ into GDP growth, we consistently find that the coefficient $g$ is negative, meaning that the transformation is increasing and concave. A concave transformation of a normal variable puts more weight on very low realizations and makes very high realizations extremely unlikely. In other words, the concave transformation creates a negatively-skewed variable.

Breaking the probability density into a normal and a concave function is helpful because it allows us to understand where counter-cyclical uncertainty comes from. We can use the

Radon-Nikodym theorem to characterize the conditional variance of a skewed variable as the conditional variance of a normal variable, times a Radon-Nikodym derivative.

$$Var[y_{t+1} \mid y^t] = \int \left(y_{t+1} - E[y_{t+1} \mid y^t]\right)^2 f(y_{t+1} \mid y^t)\, dy_{t+1}$$

If $f(y_{t+1} \mid y^t) = f(g(X_{t+1}) \mid y^t) = \phi(X_{t+1} \mid X^t)$, then by the Radon-Nikodym theorem,

$$Var[y_{t+1} \mid y^t] = \int \left(X_{t+1} - E[X_{t+1} \mid X^t]\right)^2 \frac{dg}{dX}\, \phi(X_{t+1} \mid X^t)\, dX_{t+1}$$

$$Var[y_{t+1} \mid y^t] = E\left[\left.\frac{dg(X_{t+1})}{dX}\,\right|\, X^t\right] Var[X_{t+1} \mid X^t] + cov\left(\frac{dg}{dX},\ \left(X_{t+1} - E[X_{t+1} \mid X^t]\right)^2\right)$$

The conditional variance of the normal variable $X_{t+1}$ obviously depends on its history $X^t$, but it is not affected by what the expected value of $X_{t+1}$ is. Normal variables have the property that their conditional variance is the same throughout the state-space. Conditional variance is not mean-dependent. That is not true of the skewed variable $y$. Because $g$ is an increasing, concave function, $dg/dX$ is largest when $X$ is low and falls as $X$ rises. This tells us that $Var[y_{t+1} \mid y^t]$ is largest when $E[y_{t+1} \mid y^t]$ is low and falls as the expected GDP growth rate rises. This is the origin of counter-cyclical uncertainty. It arises naturally if a variable has a negatively-skewed distribution that can be characterized as a concave transformation of a normal variable.

[Figure 5: State (x) on the horizontal axis, GDP Growth (y) on the vertical axis; bands labelled "x uncertainty" and "y uncertainty".]

Figure 5: Nonlinear change of measure and counter-cyclical uncertainty. A given amount of uncertainty about x creates more uncertainty about y when x is low than it does when x is high.

Figure 5 illustrates why uncertainty is counter-cyclical. The concave line is a mapping

from $x$ into GDP growth, $y$. The slope of this curve is a Radon-Nikodym derivative. A given amount of uncertainty is like a band of possible $x$'s. If $x$ were uniform, the band would represent the positive-probability set and the width of the band would measure uncertainty about $x$. If that band is projected on to the $y$-space, the implied amount of uncertainty about $y$ depends on the state $x$. When $x$ is high, the mapping is flat, and the resulting width of the band projected on the $y$-axis ($y$ uncertainty) is small. When $x$ is low, the band projected on the $y$-axis is larger and uncertainty is high.

This mechanism for generating counter-cyclical uncertainty is related to Straub and Ulbricht (2021), except that in their model, the concave function arises from assumptions about an economic environment. In this paper, the concave function is estimated and captures only the fact that GDP growth data is negatively skewed.

Learning about skewness causes this concave curve to shift over time. When a negative outlier is observed, the estimated state falls and estimated skewness becomes more negative. More skewness translates into more curvature in the change of measure function. Combined with a low estimated state, this generates even more uncertainty. Thus, bad events trigger larger increases in uncertainty. This is reflected in the more negative correlation between forecasts and uncertainty in our baseline skewed model in Table 1.

3.3 Why Skewness and Parameter Uncertainty Lower Forecasts

Aside from generating larger uncertainty shocks, the model with skewness also explains the low GDP growth forecasts in the professional forecaster data. The average forecast is 2.31% in the forecaster (SPF) data.16 These forecasts are puzzling because the average GDP growth rate is 2.63%. It cannot be that over 70 years of post-war history, forecasters have not figured out that the sample mean is 0.3% higher than their forecasts on average.
Our next result shows that these low forecasts are entirely rational for a Bayesian who believes that outcomes are negatively skewed and faces parameter uncertainty. This is an application of the Box (1971) result that Bayesian estimates of parameters in non-linear functions are typically biased.

16 Various studies prior to ours document a downward (pessimistic) forecast bias. Elliott and Timmermann (2008) argue that stock analysts over-estimate earnings growth and the Federal Reserve under-estimates GDP growth. Wieland and Wolters (2013) document the bias in the Greenbook forecasts of the Federal Reserve both for the GDP growth and inflation forecasts.
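A small numerical sketch of this claim, under simplifying assumptions: $y = g^{-1}[\exp(gX) - 1]$ with $X \sim N(\mu, \sigma^2)$, and a forecaster who knows $\sigma$ but holds a belief $\mu' \sim N(\mu, \tau^2)$ centred on the true $\mu$. The values of $g$, $\mu$, $\sigma$ and $\tau$ are hypothetical, and uncertainty about $\sigma$ is left out for brevity. The lognormal mean formula then gives both the true mean and the forecast in closed form, and a Monte Carlo draw cross-checks the forecast.

```python
import math
import random

def mean_growth(mu, var, g):
    """E[(exp(g*X) - 1)/g] for X ~ N(mu, var), from the lognormal mean."""
    return (math.exp(g * mu + 0.5 * g * g * var) - 1.0) / g

# Hypothetical values; tau is the std. dev. of the forecaster's belief about mu.
g, mu, sigma, tau = -0.05, 3.0, 4.0, 2.0

y_bar = mean_growth(mu, sigma ** 2, g)             # mean when parameters are known
y_hat = mean_growth(mu, sigma ** 2 + tau ** 2, g)  # forecast integrating over mu'

# Monte Carlo check of y_hat: draw mu' first, then X given mu'.
rng = random.Random(1)
draws = [(math.exp(g * rng.gauss(rng.gauss(mu, tau), sigma)) - 1.0) / g
         for _ in range(100000)]
y_hat_mc = sum(draws) / len(draws)

print(f"true mean {y_bar:.3f}, forecast {y_hat:.3f}, Monte Carlo forecast {y_hat_mc:.3f}")
```

Because $g < 0$, the extra variance $\tau^2$ from not knowing $\mu$ lowers the forecast below the true mean, even though the belief about $\mu$ is itself unbiased.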

Lemma 2. Suppose that $y$ is a random variable with a probability density function $f$ that can be expressed as $f(y; \mu, \sigma) = \phi((g^{-1}(y) - \mu)/\sigma)$ where $\phi$ is a standard normal density and $g$ is a concave function. Let the mean of $y$ be $\bar{y} \equiv \int y f(y; \mu, \sigma)\, dy$. A forecaster does not know the true parameters $\mu$ and $\sigma$, but estimates probability densities $h(\mu' \mid \sigma')$ and $k(\sigma')$, with means $\mu$ and $\sigma$. The forecaster uses these parameter densities to construct a forecast: $\hat{y} \equiv \int\!\int\!\int y f(y \mid \mu', \sigma')\, h(\mu' \mid \sigma')\, k(\sigma')\, dy\, d\mu'\, d\sigma'$. Then $\hat{y} < \bar{y}$.

The logic of the result is the following: If GDP growth is a concave transformation of a normal underlying variable, Jensen's inequality tells us that expected values will be systematically lower than the mean realization. But by itself, Jensen's inequality does not explain the forecast bias because the expected GDP growth and the mean GDP growth should both be lowered by the concave transformation (see Figure 6, left panel). It must be that there is some additional uncertainty in expectations, making the Jensen inequality effect larger for forecasts than it is for the unconditional mean of the true distribution (see Figure 6, right panel). This would explain why our results tell us that most of the time the sample mean is greater than the average forecast. If the agent knew the true parameters, he would have less uncertainty about $y_{t+1}$. Less uncertainty would make the Jensen effect smaller and raise his estimate of $y_{t+1}$, on average. Thus, it is the combination of parameter uncertainty and a skewed distribution that can explain the forecast bias.

[Figure 6, both panels: State (x) on the horizontal axis, GDP Growth (y) on the vertical axis. Left panel: "Jensen effect", with $E[X_{t+1} \mid X^t, \mathcal{M}, \theta]$ and $E[y_{t+1} \mid y^t, \mathcal{M}, \theta]$. Right panel: "Additional Jensen effect from model uncertainty", with the density the forecaster believes, $f(X_{t+1} \mid X^t)$, $E[X_{t+1} \mid X^t]$, $E[y_{t+1} \mid y^t, \mathcal{M}, \theta]$ and $E[y_{t+1} \mid y^t]$.]

Figure 6: Explaining why average forecasts are lower than mean GDP growth. The result has two key ingredients: The forecaster faces more uncertainty than he would if he knew the true distribution of outcomes, and a Jensen inequality effect from the concave change of measure.

This downward bias in beliefs is the kind of bias that is typically only seen in models of ambiguity aversion or robust control. Those models use a particular form of preferences

towards uncertainty to make agents act as if they believed that systematically bad outcomes would arise. Such models with non-linear transformations of preferences are typically solved as if they had simple preferences with twisted probabilities. Our framework generates similar beliefs because the non-linear functions of normal variables that we introduce to capture skewness are similar to the non-linear functions robustness/ambiguity solution methods employ to "twist" their probabilities. This parallel is useful because it suggests that results from ambiguity aversion theories could be reproduced in Bayesian settings with standard preferences. We could replace the min-max preferences of ambiguity with a skewed distribution of outcomes and agents who are imperfectly informed about the distribution's parameters. This could be a useful step forward for this literature simply because the data disciplines econometric estimates of probability distributions more precisely than it does preference specifications.

3.4 Convergence and the Downward Trend in Uncertainty

Since the parameters in this model are constant, eventually agents will learn them if the model is correctly specified. Even in our 54-year sample, there is evidence of convergence. There is a downward trend in uncertainty, some of which comes from the decline in the uncertainty about the parameter values. Between 1970 and 2007, uncertainty falls from 3.8% to 2.2%. Does this decline imply that all parameter uncertainty should be resolved in the near future and these effects will disappear?

There are three reasons why parameter uncertainty would persist. First, our forecasting model is clearly not a complete description of the macroeconomy. Our simple specification represents the idea that people use simple models to understand complex economic processes. Bayesian learning converges when the model is correctly specified.
But when the estimated model and the true data-generating process differ, there is no guarantee that parameter beliefs will converge to the truth. Even as the data sample becomes large, parameter beliefs can continue to fluctuate, generating uncertainty shocks.

Second, much of the trend decline in uncertainty comes from lower estimated volatility. The mean estimate of the transitory shock variance ($\sigma^2$) falls by about 30% between 1970:Q1 and 2007:Q1. The mean estimate of variances declines simply because GDP growth becomes less volatile in the second half of the sample and agents react to that by revising

down their estimates of the variance parameters. Lower innovation variance also reduces uncertainty.

Finally, simply adding time-varying parameters can prevent convergence. If we assume that some or all of the parameters drift over time, then beliefs about these parameters will continue to change over time. One example of a model with time-varying parameters is a stochastic volatility model. We discuss this extension of our model in the appendix.

4 Data Used to Proxy for Uncertainty

Our model generates a measure of economic uncertainty. In this section, we describe the commonly used proxies of uncertainty, analyze their theoretical relationship with conditional variance and then compare their statistical properties to those of our measure. The purpose of this analysis is certainly not a race between different measures but rather to shed light on the informational assumptions researchers implicitly make when they proxy for uncertainty with different measures.

Forecast Dispersion. Some authors use forecast dispersion as a measure of uncertainty,17 often because it is regarded as "model-free." It turns out that dispersion is only equivalent to uncertainty in models with uncorrelated signal noise and no parameter uncertainty.

17 See e.g. Diether, Malloy, and Scherbina (2002), and Johnson (2004).

Note that any unbiased forecast can be written as the difference between the true variable being forecast and some forecast noise that is orthogonal to the forecast:

$$y_{t+1} = E[y_{t+1} \mid \mathcal{I}_{it}] + \eta_t + e_{it} \tag{4}$$

where the forecast error ($\eta_t + e_{it}$) is mean-zero and orthogonal to the forecast. We can further decompose any forecast error into a component that is common to all forecasters, $\eta_t$, and a component that is the idiosyncratic error $e_{it}$ of forecaster $i$.

Dispersion $D_t$ is the average squared difference of each forecast from the average forecast. We can write each forecast as $y_{t+1} - \eta_t - e_{it}$. Then, with a large number of forecasters, we can apply the law of large numbers, set the average $e_{it}$ to 0 and write the average forecast

as $\bar{E}[y_{t+1}] = y_{t+1} - \eta_t$. Thus,

$$D_t \equiv \frac{1}{N}\sum_i \left(E[y_{t+1} \mid \mathcal{I}_{it}] - \bar{E}[y_{t+1}]\right)^2 = \frac{1}{N}\sum_i e_{it}^2 \tag{5}$$

Note that dispersion reflects only private noise $e_{it}$, not public noise $\eta_t$. Uncertainty is the conditional standard deviation of the forecast error, which is $\sqrt{E[(\eta_t + e_{it})^2 \mid \mathcal{I}_{it}]}$ and depends on both sources of noise. Thus, whether dispersion accurately reflects uncertainty depends on the private or public nature of information.

Mean-Squared Forecast Error. A measure that captures both private and common forecast errors is the forecast mean-squared error. A mean-squared error ($MSE_{t+1}$) of a forecast of $y_{t+1}$ made in quarter $t$ is the square root of the average squared distance between the forecast and the realized value

$$MSE_{t+1} = \sqrt{\frac{\sum_{i \in I_t} \left(E[y_{t+1} \mid \mathcal{I}_{it}] - y_{t+1}\right)^2}{N_t}}. \tag{6}$$

If forecast errors were completely idiosyncratic, with no common component, then dispersion in forecasts and mean-squared forecasting errors would be equal.18 We use this insight to measure how much variation in mean-squared errors (MSE) comes from changes in the accuracy of average forecasts and how much comes from changes in dispersion. Using SPF data, we regress $MSE_t^2$ on $(\bar{E}_t[y_{t+1}] - y_{t+1})^2$. We find that the $R^2$ of this regression is 80%. The remaining variation is due to changes in forecast dispersion. This teaches us that most of the fluctuations in MSE come from changes in average forecast errors. It implies that using forecast dispersion as a proxy for uncertainty will miss an important source of variation.

18 To see this, note that $FE_{jt}^2 = (E[y_{t+1} \mid \mathcal{I}_{jt}] - y_{t+1})^2$. We can split up $FE_{jt}^2$ into the sum $((E[y_{t+1} \mid \mathcal{I}_{jt}] - \bar{E}_t[y_{t+1}]) + (\bar{E}_t[y_{t+1}] - y_{t+1}))^2$, where $\bar{E}_t[y_{t+1}] = \int_j E[y_{t+1} \mid \mathcal{I}_{jt}]$ is the average forecast. If the first term in parentheses is orthogonal to the second, $1/N \sum_j FE_{jt}^2 = MSE_t^2$ is simply the sum of forecast dispersion and the squared error in the average forecast: $(E[y_{t+1} \mid \mathcal{I}_{jt}] - \bar{E}_t[y_{t+1}])^2 + (\bar{E}_t[y_{t+1}] - y_{t+1})^2$.

Volatility and Confidence Measures. Jurado, Ludvigson, and Ng (2015) offer a state-of-the-art macro volatility measure. It uses a rich set of time series, computes conditional

volatility of the unforecastable component of the future value of each of these series, and then aggregates these individual conditional volatilities into a macro uncertainty index.

Other proxy variables for uncertainty are informative, but have a less clear connection to a conditional variance definition of uncertainty. The market volatility index (VIX) is a traded blend of options that measures expected percentage changes of the S&P500 in the next 30 days. It captures expected volatility of equity prices. It would require a complex model to link macroeconomic uncertainty to the VIX. Nevertheless, we compare its statistical properties to those of our uncertainty measure in Figure 7.

Another commonly cited measure of uncertainty is business or consumer confidence. The confidence survey asks respondents whether their outlook on future business or employment conditions is "positive, negative or neutral." Likewise, the index of consumer sentiment asks respondents whether future business conditions and personal finances will be "better, worse or about the same." These questions are about the direction of future changes and not about any variance or uncertainty. They may be correlated with uncertainty because uncertainty is counter-cyclical.

Finally, Baker, Bloom, and Davis (2015) use newspaper text analysis, the number of expiring tax laws, and forecast dispersion to create a policy uncertainty index. While the qualitative nature of the data precludes any theoretical comparison, we include it for comparison as an influential alternative.

Figure 7: Selected proxies of uncertainty. See Table 2 for definitions and sources.
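Returning to the mean-squared-error discussion earlier in this section (equations (5) and (6) and footnote 18), the decomposition of squared MSE into forecast dispersion plus the squared error of the average forecast can be checked on a single cross-section. The sketch below uses made-up forecasts and a made-up realization; `decompose_mse` is a name chosen for this example.

```python
import math

def decompose_mse(forecasts, realized):
    """Return (dispersion, squared error of the average forecast, MSE) for one
    cross-section of forecasts, following the definitions in (5) and (6)."""
    n = len(forecasts)
    avg = sum(forecasts) / n
    dispersion = sum((f - avg) ** 2 for f in forecasts) / n
    avg_error_sq = (avg - realized) ** 2
    mse = math.sqrt(sum((f - realized) ** 2 for f in forecasts) / n)
    return dispersion, avg_error_sq, mse

# Hypothetical forecasts of growth (percent) and a hypothetical realization.
forecasts = [2.1, 2.6, 1.8, 3.0, 2.5]
d, e2, mse = decompose_mse(forecasts, realized=0.5)
print(f"dispersion {d:.3f} + avg error^2 {e2:.3f} = MSE^2 {mse ** 2:.3f}")
```

Within one cross-section the identity holds exactly, because deviations from the average forecast sum to zero. In this made-up example, where forecasters agree closely but all miss the realization, nearly all of the squared MSE comes from the error of the average forecast, the component that dispersion cannot see.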

                        Mean      Standard     autocorr   correlation        correlation
                                  deviation               with $y_{t+1}$     with $U_t$
JLN index               64.83     10.65        0.91       -0.41              33.7%
forecast MSE            2.96%     3.41%        0.48       -0.36              27.4%
forecast dispersion     1.45%     1.17%        0.58       -0.34              63.0%
VIX                     19.59     7.13         0.73       -0.31              8.42%
BBD index               108.17    51.51        0.70       -0.38              37.6%

Table 2: Properties of uncertainty measures used in the literature. JLN index is the 3 month horizon uncertainty measure from Jurado, Ludvigson, and Ng (2015). Forecast MSE and dispersion are defined in (6) and (5) and use data from 1960:Q3-2021:Q4. Growth forecast is constructed as $\ln(E_t(GDP_t)) - \ln(E_t(GDP_{t-1}))$. VIX is the Chicago Board Options Exchange Volatility Index closing price on the last day of quarter $t$, from 1990:Q1-2022:Q2. BBD index is the uncertainty measure from Baker, Bloom, and Davis (2015). $U_t$ is uncertainty from our baseline skewed model.

Comparing Uncertainty Proxies to Model-Generated Uncertainty. Figure 7 plots selected uncertainty proxies. There is considerable comovement, but also substantial variation in the dynamics of each process. These are clearly not measures of the same stochastic process, each with independent observation noise. Furthermore, they have properties that are somewhat different from our model-implied uncertainty metric. Table 2 shows that our uncertainty metric is positively correlated with all the proxies of uncertainty listed, albeit to a differing extent.

Inferring Uncertainty From Probability Forecasts. One way to infer the uncertainty of an economic forecaster is to ask them about the probabilities of various events. The SPF asks about the probability that GDP growth exceeds 6%, is between 5-5.9%, between 4-4.9%, ..., and below -2%. The survey averages across all forecasters and reports a single average probability for each bin. Since this data does not completely describe a conditional distribution, computing the conditional variance requires approximation, particularly for the tails of that distribution.
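To see why the open-ended tail bins matter for this calculation, here is a minimal sketch. It assumes that each bin's probability is placed on a single representative growth rate (midpoints for interior bins, 6.5% and -2.5% for the open-ended ones) and uses hypothetical probabilities, not SPF data.

```python
# Representative growth rates (percent) for ten bins, from "below -2%" to "above 6%".
MIDPOINTS = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]

def binned_variance(probs, points=MIDPOINTS):
    """Variance of a discrete distribution that puts probs[i] on points[i]."""
    mean = sum(p * y for p, y in zip(probs, points))
    return sum(p * (y - mean) ** 2 for p, y in zip(probs, points))

# Hypothetical calm period: mass spread around 2.5% growth.
calm = [0.0, 0.0, 0.0, 0.05, 0.20, 0.50, 0.20, 0.05, 0.0, 0.0]
# Hypothetical crisis: 90% of the mass in the open-ended bottom bin.
crisis = [0.90, 0.05, 0.05, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

var_calm = binned_variance(calm)
var_crisis = binned_variance(crisis)
print(f"calm variance {var_calm:.3f}, crisis variance {var_crisis:.3f}")
```

In this made-up example the computed variance is lower in the crisis scenario, because all outcomes below -2% are collapsed onto one point and the distribution looks concentrated.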
The most frequently used approximation is to assume that these are probabilities of ten discrete growth rates, each corresponding to the mid-point of a bin.[19]

[19] For example, when agents assign a probability to 1-2% GDP growth, we treat this as if that is the probability placed on the outcome of 1.5% GDP growth. When the agent says that there is probability $p_{6.5}$ of growth above 6%, we treat this as probability $p_{6.5}$ placed on the outcome $y_{t+1} = 6.5\%$. And if the agent reports probability $p_{-2.5}$ of growth below -2%, we place probability $p_{-2.5}$ on $y_{t+1} = -2.5\%$.
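The midpoint approximation can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `binned_moments` and both probability vectors are hypothetical, and the numbers are made up rather than taken from the SPF. It computes the mean and variance implied by ten bin probabilities, and shows that piling most of the mass into the bottom-coded bin produces a small computed variance.

```python
# Midpoint approximation to SPF-style binned growth probabilities.
# All probabilities below are made-up illustrations, not survey data.

# Midpoints for the ten bins: below -2%, -2 to -1%, ..., 5 to 6%, above 6%.
MIDPOINTS = [-2.5 + k for k in range(10)]  # -2.5, -1.5, ..., 6.5


def binned_moments(probs):
    """Return (mean, variance) when each bin's mass sits on its midpoint."""
    if len(probs) != len(MIDPOINTS):
        raise ValueError("expected one probability per bin")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    mean = sum(p * m for p, m in zip(probs, MIDPOINTS))
    var = sum(p * (m - mean) ** 2 for p, m in zip(probs, MIDPOINTS))
    return mean, var


# Hypothetical "calm" forecast: mass spread around 2-3% growth.
calm = [0.00, 0.01, 0.04, 0.10, 0.25, 0.30, 0.20, 0.07, 0.02, 0.01]
# Hypothetical "crisis" forecast: most mass piled into the bottom bin.
crisis = [0.85, 0.10, 0.03, 0.02, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00]

calm_mean, calm_var = binned_moments(calm)
crisis_mean, crisis_var = binned_moments(crisis)
print(f"calm:   mean={calm_mean:.2f}, variance={calm_var:.2f}")
print(f"crisis: mean={crisis_mean:.2f}, variance={crisis_var:.2f}")
```

In this made-up example the "crisis" vector yields the lower variance, because everything below -2% is treated as the single outcome -2.5%.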

The resulting conditional variance series is not very informative. It hardly varies (range is [0.0072, 0.0099]). It does not spike in the financial crisis. In fact, the SPF-implied variance suggests that uncertainty in 2008 was roughly the same as it was in 2003. The problem is that the growth rates are top- and bottom-coded. All extremely bad GDP events are grouped in the bin “growth less than -2%.” If there is a very high probability of growth below -2%, then since most of the probability is concentrated in one bin, variance and, thus, uncertainty is low. The main point of our paper is that most uncertainty shocks come from changes in the probabilities of extreme events. This survey truncates extremes and, therefore, fails to capture most changes in uncertainty.

5 Conclusions

Theories based on news shocks, uncertainty shocks, higher-order uncertainty shocks, tail risk shocks, and belief shocks generally have been influential in macroeconomics. But they leave unanswered the fundamental question: Why do beliefs fluctuate in this way? Just like output arises from feeding inputs into a technology, beliefs arise from feeding information sets into a belief-formation procedure. Just like a complete theory explains why the inputs and output change, it should also tell us why beliefs change.

In this paper, we consider a Bayesian belief-formation mechanism that allows for estimation of tail risk. We feed in an information set that is simply the real-time available GDP history and a reference forecasting model that the forecaster estimates in real time, just like an econometrician would. We find that these simple ingredients produce large, countercyclical fluctuations in tail risk and uncertainty. Furthermore, without any preference assumptions, they produce a downward bias in mean beliefs that resembles the bias of forecasts in the SPF.

This theory of the origins of belief shocks suggests a change in our approach to measurement.
[19, continued] Then, the expected rate of GDP growth is $\bar{y} = \sum_{m \in M} p_m m$ for $M = \{-2.5, -1.5, \ldots, 6.5\}$. Finally, the conditional variance of beliefs about GDP growth is $var[y \mid I] = \sum_{m \in M} p_m (m - \bar{y})^2$.

Most economic uncertainty measures ignore parameter estimation uncertainty. Sometimes referred to as “rational expectations econometrics,” the traditional approach entails estimating a model on the full sample of data and then treating the estimated

parameters as truth to infer what the volatility of innovations was in each period in the past. In equating volatility with uncertainty, the econometrician assumes that the uncertain agent knows the true distribution of outcomes at every moment in time and is only uncertain about which outcome will be chosen from this distribution. Assuming such precise knowledge of the economic model rules out most uncertainty and ignores many sources of uncertainty shocks.

We explore the uncertainty shocks that arise when an agent is not endowed with knowledge of the true economic model and needs to estimate it, just like an econometrician. The conditional variance of this agent’s forecast, his uncertainty, is much higher and varies more than volatility does. When the agent considers skewed distributions of outcomes, new data or real-time revisions to existing data can change his beliefs about the skewness of the distribution, and thus the probability of extreme events. Small changes in the estimated skewness can increase or decrease the probability of these tail events many-fold. Because tail events are far from the mean outcome, changes in their probability have a large effect on conditional variance, which translates into large shocks to uncertainty. Thus, our message is that beliefs about black swans, extreme events that are almost never observed but whose probability is inferred from a forecasting model, are responsible for many of the shocks to macroeconomic uncertainty.

This paper has focused on the belief formation process. In our data-disciplined approach we uncovered the mechanisms that make uncertainty fluctuate over time. As such, this paper is a foundation on which other theories can build. Kozeniauskas, Orlik, and Veldkamp (2018) show how a similar mechanism can be embedded in a production economy with heterogeneous information, forecast dispersion and heterogeneous firm outputs. Our mechanism could also be used to model default risk. Since “black swan” probabilities could be interpreted as default probabilities, the model would then tell us what kinds of data realizations trigger high default premia and debt crises. In another project, our mechanism could be embedded in a consumption-based asset pricing model. We know that a well-engineered stochastic process for time-varying rare event probabilities can match many features of equity returns. Our tools could be used to estimate these rare event probabilities and assess whether the estimates explain asset return puzzles.
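The sensitivity of tail probabilities to the skewness parameter can be illustrated with a small Python sketch. It assumes the transform that Appendix B uses, $X = g^{-1}\log(1 + g y)$ with $X$ normal, so that $y = (\exp(gX) - 1)/g$. The function names, the threshold and all parameter values are hypothetical and chosen only for illustration. The sketch holds the mean and standard deviation of $X$ fixed while $g$ moves, which also shifts the mean and variance of $y$, so it is an illustration of the mechanism rather than a replication of the paper's estimates.

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tail_probability(threshold, g, mu, sigma):
    """P(y < threshold) when y = (exp(g*X) - 1) / g and X ~ N(mu, sigma^2).

    The map from X to y is increasing, so the event y < threshold is the
    event X < log(1 + g*threshold) / g (valid while 1 + g*threshold > 0).
    """
    arg = 1.0 + g * threshold
    if arg <= 0.0:
        raise ValueError("threshold lies outside the support of y")
    x_cut = math.log(arg) / g
    return normal_cdf((x_cut - mu) / sigma)


# Hypothetical values, chosen only for illustration (percent growth units).
MU, SIGMA, THRESHOLD = 2.5, 2.0, -6.0
for g in (-0.05, -0.10, -0.15):
    p = tail_probability(THRESHOLD, g, MU, SIGMA)
    print(f"g = {g:+.2f}: P(y < {THRESHOLD}) = {p:.2e}")
```

With these made-up numbers, moving $g$ from -0.05 to -0.15 multiplies the probability of the extreme outcome several times over, even though every such outcome remains very unlikely.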

References

Albagli, E., C. Hellwig, and A. Tsyvinski (2015): “A Theory of Asset Pricing Based on Heterogeneous Information,” Yale University working paper.
Amador, M., and P.-O. Weill (2010): “Learning from prices: Public communication and welfare,” Journal of Political Economy, 118, 866–907.
Angeletos, G., and J. La’O (2013): “Sentiments,” Econometrica, 81(2), 739–780.
Bacchetta, P., C. Tille, and E. van Wincoop (2012): “Self-Fulfilling Risk Panics,” American Economic Review, 102(7), 3674–3700.
Bachmann, R., and C. Bayer (2013): “Wait-and-See Business Cycles?,” Journal of Monetary Economics, 60(6), 704–719.
Bachmann, R., and C. Bayer (2014): “Investment Dispersion and the Business Cycle,” American Economic Review, 104(4), 1392–1416.
Bachmann, R., S. Elstner, and E. Sims (2013): “Uncertainty and Economic Activity: Evidence from Business Survey Data,” American Economic Journal: Macroeconomics, 5(2), 217–249.
Bachmann, R., and G. Moscarini (2012): “Business Cycles and Endogenous Uncertainty,” Yale University working paper.
Baker, S., N. Bloom, and S. Davis (2015): “Measuring Economic Policy Uncertainty,” Stanford University working paper.
Baley, I., and A. Blanco (2015): “Menu Costs, Uncertainty Cycles and the Propagation of Nominal Shocks,” New York University working paper.
Bansal, R., and I. Shaliastovich (2010): “Confidence Risk and Asset Prices,” American Economic Review, 100(2), 537–41.
Barro, R. (2006): “Rare Disasters and Asset Markets in the Twentieth Century,” The Quarterly Journal of Economics, 121(3), 823–866.
Basu, S., and B. Bundick (2017): “Uncertainty Shocks in a Model of Effective Demand,” Econometrica, 85, 937–958.
Bidder, R., and M. Smith (2012): “Robust Animal Spirits,” Journal of Monetary Economics, 59(8), 738–750.
Bloom, N. (2009): “The Impact of Uncertainty Shocks,” Econometrica, 77(3), 623–685.
Bloom, N., M. Floetotto, N. Jaimovich, I. Saporta-Eksten, and S. Terry (2012): “Really Uncertain Business Cycles,” NBER working paper 13385.
Born, B., and J. Pfeifer (2012): “A practical guide to volatility forecasting through calm and storm,” The Journal of Risk, 3-22, 14.
Born, B., and J. Pfeifer (2014): “Policy risk and the business cycle,” Journal of Monetary Economics, 68, 68–85.
Box, M. (1971): “Bias in Nonlinear Estimation,” The Journal of the Royal Statistical Society, 171-201, 33(2).
Breon-Drish, B. (2015): “On Existence and Uniqueness of Equilibrium in a Class of Noisy Rational Expectations Models,” Review of Economic Studies, forthcoming.
Bruno, V., and H. Shin (2015): “Capital Flows, Cross-Border Banking and Global Liquidity,” Journal of Monetary Economics, 71, 119–132.
Chabakauri, G., K. Zachariadis, and K. Yuan (2021): “Multi-Asset Noisy Rational Expectations Equilibrium with Contingent Claims,” LSE working paper.
Chen, H., W. Dou, and L. Kogan (2022): “Measuring the Dark Matter in Asset Pricing Models,” Journal of Finance, forthcoming.
Christiano, L., R. Motto, and M. Rostagno (2014): “Risk Shocks,” American Economic Review, 104(1), 27–65.
Cogley, T., and T. Sargent (2005): “The Conquest of US Inflation: Learning and Robustness to Model Uncertainty,” Review of Economic Dynamics, 8, 528–563.
Collin-Dufresne, P., M. Johannes, and L. Lochstoer (2016): “Parameter Learning in General Equilibrium: The Asset Pricing Implications,” American Economic Review, 106, 664–98.
Del Negro, M., and F. Schorfheide (2011): “Bayesian Macroeconometrics,” in The Oxford Handbook of Bayesian Econometrics, ed. by J. Geweke, G. Koop, and H. van Dijk. Oxford University Press.
Diether, K., C. Malloy, and A. Scherbina (2002): “Differences of Opinion and the Cross-Section of Stock Returns,” Journal of Finance, 57, 2113–2141.
Elliott, G., and A. Timmermann (2008): “Economic Forecasting,” Journal of Economic Literature, 46(1), 3–56.
Fajgelbaum, P., E. Schaal, and M. Taschereau-Dumouchel (2014): “Uncertainty Traps,” NBER Working Paper No. 19973.
Gao, G., and Z. Song (2018): “Tail Risk Concerns Everywhere,” Management Science, 65, 3111–3130.
Geweke, J. (1993): “Bayesian Treatment of the Independent Student-t Linear Model,” Journal of Applied Econometrics, 8, S19–S40.
Giacomini, R., and B. Rossi (2013): “Forecasting in Macroeconomics,” in Handbook of Research Methods and Applications on Empirical Macroeconomics, ed. by N. Hashimzade, and M. Thornton. Edward Elgar.
Giglio, S., B. Kelly, and S. Pruitt (2015): “Systemic Risk and the Macroeconomy: An Empirical Evaluation,” Journal of Financial Economics, forthcoming.
Gourio, F. (2012): “Disaster Risk and Business Cycles,” American Economic Review, 102(6), 2734–66.
Hansen, L. (2007): “Beliefs, Doubts and Learning: Valuing Macroeconomic Risk,” American Economic Review, 97(2), 1–30.
Headrick, T. (2010): Statistical Simulation: Power Method Polynomials and Other Transformations. CRC Press: Carbondale, IL, first edn.
Hoaglin, D., F. Mosteller, and J. Tukey (1985): Exploring Data Tables, Trends and Shapes. John Wiley & Sons: New York, NY, first edn.
Ilut, C., and M. Schneider (2014): “Ambiguous Business Cycles,” American Economic Review, 104(8), 2368–2399.
Johannes, M., L. Lochstoer, and Y. Mou (forthcoming): “Learning About Consumption Dynamics,” Journal of Finance.
Johannes, M., and N. Polson (2006): “Exact Particle Filtering and Parameter Learning,” Columbia University working paper.
Johnson, T. (2004): “Forecast Dispersion and the Cross Section of Expected Returns,” Journal of Finance, 59, 1957–1978.
Jovanovic, B. (2006): “Asymmetric Cycles,” Review of Economic Studies, 73, 145–162.
Jurado, K., S. Ludvigson, and S. Ng (2015): “Measuring Uncertainty,” American Economic Review, 105(3), 1177–1216.
Justiniano, A., and G. Primiceri (2008): “Time-Varying Volatility of Macroeconomic Fluctuations,” American Economic Review, 98(3), 604–641.
Kacperczyk, M., J. Nosal, and L. Stevens (2019): “Investor Sophistication and Capital Income Inequality,” Journal of Monetary Economics, 107, 18–31.
Kelly, B., and H. Jiang (2014): “Tail Risk and Asset Prices,” Review of Financial Studies, forthcoming.
Kozeniauskas, N., A. Orlik, and L. Veldkamp (2018): “What are uncertainty shocks,” Journal of Monetary Economics, 100, 1–15.
Maćkowiak, B., and M. Wiederholt (2009): “Optimal sticky prices under rational inattention,” American Economic Review, 99(3), 769–803.
Matejka, F., and A. McKay (2015): “Rational Inattention to Discrete Choices: A New Foundation for the Multinomial Logit Model,” American Economic Review, 105(1), 272–98.
Nimark, K. (2014): “Man-Bites-Dog Business Cycles,” American Economic Review, 104(8), 2320–2367.
Pastor, L., and P. Veronesi (2012): “Uncertainty about Government Policy and Stock Prices,” Journal of Finance, 67(4), 1219–1264.
Reis, R. (2006): “Inattentive producers,” Review of Economic Studies, 73(3), 793–821.
Rietz, T. (1988): “The Equity Risk Premium: A Solution,” Journal of Monetary Economics, 22(1), 117–131.
Stock, J., and M. Watson (2012): “Disentangling the Channels of the 2007-2009 Recession,” Brookings Papers on Economic Activity, pp. 81–135.
Straub, L., and R. Ulbricht (2021): “Endogenous Uncertainty and Credit Crunches,” Harvard University working paper.
Taleb, N. N. (2010): The Black Swan: The Impact of the Highly Improbable. Random House.
Van Nieuwerburgh, S., and L. Veldkamp (2006): “Learning Asymmetries in Real Business Cycles,” Journal of Monetary Economics, 53(4), 753–772.
Wachter, J. (2013): “Can Time-Varying Risk of Rare Disasters Explain Aggregate Stock Market Volatility?,” Journal of Finance, 68(3), 987–1035.
Wieland, V., and M. H. Wolters (2013): “Forecasting and policy making,” in Handbook of economic forecasting, ed. by G. Elliott, and A. Timmermann. North-Holland Amsterdam.

A Proofs

Lemma 1: Skewness and the concave change of measure

We can write the skewness of y (times the variance, which is always positive) as

$$E[(y - E[y])^3] = \int (y - E[y])^3 \, \phi(g^{-1}(y)) \, dy \qquad (7)$$

where $\phi(g^{-1}(y))$ is the probability density of y, by assumption. Using the change of variable rule, we can replace y with g(x):

$$E[(g(x) - E[g(x)])^3] = \int (g(x) - E[g(x)])^3 \, \frac{\partial g}{\partial x} \, \phi(x) \, dx \qquad (8)$$

Note that we replaced $\phi(g^{-1}(g(x))) = \phi(x)$, meaning that x is a standard normal variable. Because g is increasing and concave, $\partial g/\partial x$ is positive and decreasing in x. If $\partial g/\partial x$ were a constant, then (8) would be the skewness of a normal variable, which is zero. Thus,

$$-\int_{-\infty}^{0} (g(x) - E[g(x)])^3 \, \phi(x) \, dx = \int_{0}^{\infty} (g(x) - E[g(x)])^3 \, \phi(x) \, dx$$

Since $\partial g/\partial x$ is positive and decreasing, it is higher for any $x < 0$ than it is for any $x > 0$, and since both sides of the inequality are positive,

$$-\int_{-\infty}^{0} (g(x) - E[g(x)])^3 \, \frac{\partial g}{\partial x} \, \phi(x) \, dx > \int_{0}^{\infty} (g(x) - E[g(x)])^3 \, \frac{\partial g}{\partial x} \, \phi(x) \, dx$$

Adding the negative of the left side to both sides of the inequality reveals that

$$E[(g(x) - E[g(x)])^3] = \int (g(x) - E[g(x)])^3 \, \frac{\partial g}{\partial x} \, \phi(x) \, dx < 0.$$

Lemma 2: Forecast bias

In the forecast

$$\hat{y} \equiv \int\!\!\int\!\!\int y \, f(y \mid \mu', \sigma') \, g(\mu' \mid \sigma') \, h(\sigma') \, dy \, d\mu' \, d\sigma',$$

we can substitute g(x) for y and substitute $x = g^{-1}(y)$ into $\phi((g^{-1}(y) - \mu)/\sigma) = f(y)$ to get

$$\hat{y} = \int\!\!\int\!\!\int g(x) \, \phi((x - \mu')/\sigma') \, g(\mu' \mid \sigma') \, h(\sigma') \, dg(x) \, d\mu' \, d\sigma'$$

Then, we can define $\tilde{x} = (x - \mu)/\sigma$ and substitute it in for x:

$$\hat{y} = \int\!\!\int\!\!\int g(\mu' + \sigma' \tilde{x}) \, \phi(\tilde{x}) \, g(\mu' \mid \sigma') \, h(\sigma') \, dg(x) \, d\mu' \, d\sigma'$$

Note that the inside integral evaluated at $\mu' = \mu$ and $\sigma' = \sigma$ is the true mean of y: $\bar{y} \equiv \int y \, f(y \mid \mu, \sigma) \, dy = \int g(\mu + \sigma \tilde{x}) \, \phi(\tilde{x}) \, dg(x)$. Let us use the notation $\tilde{y}(\mu', \sigma') = \int g(\mu' + \sigma' \tilde{x}) \, \phi(\tilde{x}) \, dg(x)$ to denote the mean of y, given any mean and variance parameters $\mu'$ and $\sigma'$. Notice that since g is assumed to be a concave function, $\tilde{y}$ is concave in the parameters $\mu'$ and $\sigma'$. Then, by Jensen’s inequality, we know that for any concave function $\tilde{y}$, $E[\tilde{y}(\mu, \sigma)] < \tilde{y}(\mu, \sigma)$. Note by inspection that $E[\tilde{y}(\mu, \sigma)] = \hat{y}$ and $\tilde{y}(\mu, \sigma) = \bar{y}$, and the result follows.
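Both lemmas can be checked numerically for one particular concave transform. The sketch below is only an illustration under stated assumptions: the transform $g(x) = (\exp(\gamma x) - 1)/\gamma$ with $\gamma < 0$, the helper `normal_expectation`, and all parameter values are hypothetical choices, and parameter uncertainty is placed on $\mu'$ alone (with $\sigma'$ held fixed), which is a simplification of the lemma. Expectations are approximated with a simple midpoint rule.

```python
import math

GAMMA = -0.2  # hypothetical curvature; any negative value gives a concave map


def g(x):
    """Increasing, concave transform (an assumed example, not the paper's fit)."""
    return (math.exp(GAMMA * x) - 1.0) / GAMMA


def normal_expectation(f, mean=0.0, sd=1.0, n=20001, width=10.0):
    """Approximate E[f(X)], X ~ N(mean, sd^2), by a midpoint rule on +-width sd."""
    step = 2.0 * width / n
    total = 0.0
    for k in range(n):
        z = -width + (k + 0.5) * step
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += f(mean + sd * z) * density * step
    return total


MU, SIGMA = 1.0, 2.0  # hypothetical "true" parameters of the normal variable

# Lemma 1: the third central moment of y = g(X) is negative.
y_bar = normal_expectation(g, MU, SIGMA)
third_moment = normal_expectation(lambda x: (g(x) - y_bar) ** 3, MU, SIGMA)

# Lemma 2: averaging the conditional mean over an uncertain mu' lowers the forecast.
TAU = 0.5  # hypothetical posterior standard deviation of mu'


def y_tilde(mu_prime):
    """Mean of y given location parameter mu_prime (sigma held at SIGMA)."""
    return normal_expectation(g, mu_prime, SIGMA, n=2001)


y_hat = normal_expectation(y_tilde, MU, TAU, n=401)

print(f"third central moment: {third_moment:.4f}")
print(f"forecast y_hat = {y_hat:.4f}, true mean y_bar = {y_bar:.4f}")
```

For this transform the two results can also be verified by hand, since $\exp(\gamma X)$ is lognormal: its third central moment is positive and dividing by $\gamma^3 < 0$ flips the sign, while extra variance in $\mu'$ raises $E[\exp(\gamma \mu')]$ and therefore lowers the mean of y.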

B Estimating the model

The algorithm is a modified version of exact particle filtering and parameter learning (Johannes and Polson, 2006) which relies heavily on the change-of-measure induced by two probabilistic distributions. In particular, notice that, for every history of observations $\{y_t\}_{t=1}^{T}$, the data density, $p(y^{t}) = \prod_{t=0}^{T} p(y_t \mid y^{t-1}, g)$, can be calculated as

$$p(y_t \mid y^{t-1}, g) = \left| \frac{1}{g Y_t + 1} \right| p(X_t \mid X^{t-1}), \quad \text{where } g^{-1} \log(1 + g y_t) = X_t.$$

That is, for a given parameter g, the exercise amounts to estimating parameters of a linear-normal model for $X_t$. Since parameter g is to be estimated too, together with the remaining parameters and states, we will apply the following particle filtering and parameter learning algorithm, where the sets of particles describe unknown parameters, states, and sufficient statistics (of the posterior distributions).

Algorithm: Exact particle filtering and parameter learning in a skewed model

Step 0. Simulate initial samples $\{(v_t, S_t, \theta)^{(i)}\}$ from the prior distributions. In particular, assume that the parameter prior distributions are given by

$$p(\alpha, \beta \mid \sigma_S^2) \sim \mathcal{NIG}(c_t, \sigma_S^2 C_t^{-1})$$
$$p(\sigma^2) \sim \mathcal{IG}(a_t, A_t)$$
$$p(\sigma_S^2) \sim \mathcal{IG}(b_t, B_t)$$
$$p(g) \sim \mathcal{U}(g_{min}, g_{max})$$

Also, assume that the initial state distribution is given by $p(S_0) \sim \mathcal{N}(\mu_{S_0}, \sigma_{S_0}^2)$.

For every i, compute the following adjusted weights:

$$w\left[(S_t, \theta)^{(i)}\right] \propto \left| \frac{1}{g^{i} Y_{t+1} + 1} \right| \frac{1}{\sqrt{(\sigma_S^2)^{(i)} + (\sigma^2)^{(i)}}} \exp\left( -\frac{1}{2} \frac{\left( X_{t+1} - \alpha^{i} - \beta^{i} S_t^{i} \right)^2}{(\sigma_S^2)^{(i)} + (\sigma^2)^{(i)}} \right)$$

For every t = 1, ..., T:

Step 1. Draw $\left\{ (v_{t+1}, S_{t+1}, \theta)^{(i)} \right\} \sim \text{Multi}_N \left[ \left\{ w\left[(S_t, \theta)^{(i)}\right] \right\}_{i=1}^{N} \right]$.

Step 2. Draw $S_{t+1}^{(i)} \sim p\left(S_{t+1} \mid S_t^{(i)}, \theta^{(i)}, X_{t+1}\right)$, where the updated state distribution is given by

$$p\left(S_{t+1} \mid S_t^{(i)}, \theta^{(i)}, X_{t+1}\right) \propto p\left(X_{t+1} \mid S_{t+1}^{(i)}, \theta^{(i)}\right) p\left(S_{t+1} \mid S_t^{(i)}\right) \sim \mathcal{N}\left(\mu_{S_{t+1}}^{i}, (\sigma_{S_{t+1}}^2)^{(i)}\right)$$

with

$$\frac{\mu_{S_{t+1}}^{i}}{(\sigma_{S_{t+1}}^2)^{(i)}} = \frac{X_{t+1}}{(\sigma^2)^{(i)}} + \frac{\alpha^{i} + \beta^{i} S_t^{i}}{(\sigma_S^2)^{(i)}}, \qquad \frac{1}{(\sigma_{S_{t+1}}^2)^{(i)}} = \frac{1}{(\sigma^2)^{(i)}} + \frac{1}{(\sigma_S^2)^{(i)}}$$

Step 3. Update sufficient statistics $v_{t+1}^{(i)} = \mathcal{V}\left(v_t^{(i)}, S_{t+1}^{(i)}, X_{t+1}\right)$ using the following recursions:

$$A_{t+1}^{i} = \frac{\left(X_{t+1} - S_{t+1}^{(i)}\right)^2}{2} + A_t^{i}$$
$$C_{t+1}^{i} = C_t^{i} + [1; S_t^{(i)}] * [1 \;\; S_t^{(i)}]$$
$$c_{t+1}^{i} = \left(C_{t+1}^{i}\right)^{-1} \left( C_t^{i} c_t^{i} + S_{t+1}^{i} * [1; S_t^{i}] \right)$$
$$B_{t+1}^{i} = B_t^{i} + \left( S_{t+1}^{i} - \left(c_{t+1}^{i}\right)' * [1; S_t^{(i)}] \right) \frac{S_{t+1}^{i}}{2} + \left( c_t^{i} - c_{t+1}^{i} \right)' C_t^{i} c_t^{i} / 2$$

The deterministic hyperparameters are updated as $a_{t+1} = a_t + .5$ and $b_{t+1} = b_t + .5$.

Step 4. Draw new parameter samples (as in step 0 but with new sufficient statistics).

Once we have particles $\left(\alpha^{i}, \beta^{i}, \sigma^{i}, \sigma_S^{i}, g^{i}, S_t^{i}\right)$ that summarize the joint posterior parameter and state distribution, we compute the model forecasts and uncertainty in the following way. Let $E_t \equiv E[X_t \mid X^{t-1}]$ and $Var_t \equiv Var[X_t \mid X^{t-1}]$. Also, let $E_t^{g} = g E_t + \log(|1/g|)$ and $Var_t^{g} = g^2 Var_t$. Then,

$$E_t^{y,g} \equiv E[y_t \mid y^{t-1}] = \begin{cases} -\frac{1}{g} - \exp(E_t^{g} + .5 \, Var_t^{g}) & \text{if } g < 0 \\ -\frac{1}{g} + \exp(E_t^{g} + .5 \, Var_t^{g}) & \text{if } g > 0 \end{cases}$$

$$Var_t^{y,g} \equiv Var[y_t \mid y^{t-1}] = \left(\exp(Var_t^{g}) - 1\right) \exp(2 E_t^{g} + Var_t^{g})$$

Then average over all particles g (which entails an intermediate step of computing $E_t^{y^2,g} \equiv E[y_t^2 \mid y^{t-1}]$ so as not to average over variances).

Estimating Prior Beliefs

The use of the algorithm described above requires conjugate priors for parameter distribution (knowledge of the functional form of the posterior distribution of the parameters conditional on the entire history of latent states and the data). To discipline the priors, we use historical data, i.e. the vintage of the data as of 1968:Q3 (1947:Q2-1968:Q2). We use uniform priors on all the parameters, and estimate respective models using Bayesian techniques described above.
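The forecast mean and variance formulas stated after Step 4 of the algorithm, and the averaging over particles through second moments, can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the function names, the equal particle weights, and the three example particles are assumptions introduced here.

```python
import math


def forecast_moments(e_x, var_x, g):
    """Mean and variance of y = (exp(g*X) - 1) / g when X ~ N(e_x, var_x)."""
    if g == 0.0:
        raise ValueError("g must be nonzero")
    e_g = g * e_x + math.log(abs(1.0 / g))
    var_g = g * g * var_x
    lognormal_mean = math.exp(e_g + 0.5 * var_g)
    mean = -1.0 / g - lognormal_mean if g < 0 else -1.0 / g + lognormal_mean
    variance = (math.exp(var_g) - 1.0) * math.exp(2.0 * e_g + var_g)
    return mean, variance


def average_over_particles(particles):
    """Combine equally weighted particles by averaging first and second moments.

    Each particle is a tuple (e_x, var_x, g). Averaging E[y] and E[y^2] and
    then forming the variance keeps the spread of means across particles.
    """
    n = len(particles)
    first = 0.0
    second = 0.0
    for e_x, var_x, g in particles:
        mean, variance = forecast_moments(e_x, var_x, g)
        first += mean / n
        second += (variance + mean * mean) / n
    return first, second - first * first


# Hypothetical particles (e_x, var_x, g); the numbers are illustrative only.
particles = [(2.0, 4.0, -0.10), (2.2, 3.5, -0.12), (1.8, 4.5, -0.08)]
mean, variance = average_over_particles(particles)
print(f"forecast mean = {mean:.3f}, conditional std = {math.sqrt(variance):.3f}")
```

Averaging second moments rather than variances matters because the combined variance then includes the disagreement between particle means, which a simple average of per-particle variances would leave out.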
The mean and standard deviations of the posterior parameter distributions as of 1968:Q3 become the moments of the prior distributions for respective parameters that will be used in the real-time estimation from 1968:Q4 onwards. The results for the respective models are reported in the tables below. To compute volatility in these models, we fix parameters at the estimated means of these prior distributions. Figure 8 plots the priors and the evolution of parameter beliefs over the sample.

| Parameter | Normal: Mean | Normal: Stdev | Skewed: Mean | Skewed: Stdev |
|---|---|---|---|---|
| $c$ | 2.35 | 0.68 | 41.27 | 6.97 |
| $\rho$ | 0.47 | 0.12 | 0.05 | 0.07 |
| $\bar{\sigma}^2$ | 4.89 | 3.45 | 0.02 | 0.01 |
| $\sigma_s^2$ | 15.92 | 4.47 | 0.005 | 0.007 |

Table 3: Moments of the prior distributions in the linear-normal and skewed models.

Posterior Parameter Estimates

See Figure 8.

C Extension I: Uncertainty Shocks with Stochastic Volatility

So far, we have explored homoskedastic models, in order to isolate the uncertainty shocks that come from parameter learning. But both changes in volatility and in parameter estimates can contribute to uncertainty shocks. To quantify the contribution of each, we estimate a model with stochastic volatility and parameter learning. The result is an uncertainty series that is a bit more volatile than before, but without the downward trend in uncertainty and with a larger spike in uncertainty around the time of the financial crisis.

The extension of our baseline skewed model with stochastic volatility has

$$\sigma(S_t) \in \{\sigma(H), \sigma(L)\}$$
$$x(S_t) = \bar{x} \quad \forall t$$
$$P(S_t = H \mid S_{t-1} = H) = \pi_{HH}, \qquad P(S_t = L \mid S_{t-1} = L) = \pi_{LL}$$

In this model, our forecaster estimates the Markov transition probabilities $\pi_{HH}$ and $\pi_{LL}$ that govern changes in variance, together with parameter g, instead of the $\alpha$, $\beta$ and $\sigma_S$ parameters that governed the hidden AR(1) process in the previous model. The variance is itself a hidden state that can take on one of two values, $\sigma(S_t) \in \{\sigma(H), \sigma(L)\}$.
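To see how beliefs about a two-state hidden variance can move, here is a minimal Python sketch of one predict-update step for the probability of the high-volatility state. It is a simplification under loud assumptions: the transition probabilities, the two standard deviations and the mean are treated as known and fixed, whereas the forecaster in the paper estimates them, and every number and function name below is hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def update_high_vol_prob(p_high, x, x_bar, sd_high, sd_low, pi_hh, pi_ll):
    """One predict-update step for P(S_t = H | data up to t)."""
    # Predict: push last period's probability through the Markov chain.
    prior_high = p_high * pi_hh + (1.0 - p_high) * (1.0 - pi_ll)
    # Update: reweight each regime by the likelihood of the new observation.
    like_high = prior_high * normal_pdf(x, x_bar, sd_high)
    like_low = (1.0 - prior_high) * normal_pdf(x, x_bar, sd_low)
    return like_high / (like_high + like_low)


# Hypothetical parameter values and data, for illustration only.
X_BAR, SD_HIGH, SD_LOW = 0.0, 3.0, 1.0
PI_HH, PI_LL = 0.90, 0.95

p_high = 0.90  # start out fairly sure the high-volatility regime is active
path = [p_high]
for x in [0.2, -0.3, 0.1, 0.0]:  # a run of quiet observations
    p_high = update_high_vol_prob(p_high, x, X_BAR, SD_HIGH, SD_LOW, PI_HH, PI_LL)
    path.append(p_high)
print([round(p, 3) for p in path])
```

In this toy example a handful of quiet observations is enough to shift most of the probability to the low-volatility state, which is the kind of fast belief revision a discrete regime allows.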

Figure 8: Skewed Model ($\mathcal{M}_1$) Parameters: Posterior Means, Medians, and 95% Credible Sets.

Figure 9 plots the uncertainty that results with parameter learning and stochastic volatility in the skewed model. This plot is not detrended, and yet we see no downward trend in uncertainty after 1990. The forecaster with the homoskedastic model needs to accumulate lots of low-volatility observations to revise down her estimate of the fundamental volatility over time. The forecaster with the stochastic volatility model revises her beliefs by increasing the probability of being in the low-volatility state, and in doing so lowers her uncertainty within a few quarters.

Figure 9: Uncertainty $U_t$ and volatility $V_t$ in the skewed model with stochastic volatility.

Allowing volatility to be stochastic does make uncertainty fluctuate more. The standard deviation of $U_t$ rises to about 2.0% with stochastic volatility. But adding stochastic volatility has only a small effect on the correlation of uncertainty with GDP growth (-0.72).

The main lessons from combining the stochastic volatility view with the parameter learning view of uncertainty shocks are that (1) both channels contribute to our understanding of uncertainty shocks; and (2) incorporating stochastic volatility helps to avoid the downward trend in uncertainty that arises with a homoskedastic model. It prevents uncertainty from converging to a constant level. The more realistic version of this effect is that all parameters of the model can change or drift over time. Such a model would keep learning active and might be a better description of reality. But such a rich model is obviously difficult to estimate. The hope is that this simple first step in that direction might give us some insight about how time-varying parameters and parameter learning might interact more generally.

D Extension II: Uncertainty Shocks in a Model with Signals

Our baseline forecasting model comes very close to matching the actual behavior of the forecasts in the Survey of Professional Forecasters (Table 1, bottom panel). It actually overexplains the bias by a bit. And the variation of the forecasts over time is slightly lower than in the SPF. However, this is likely because the only source of information in our model is actual realizations of GDP growth. The forecasts in the model are based only on prior GDP releases. In reality, forecasters have access to other sources of data that improve the accuracy of their forecasts. We can remedy both of these issues and reproduce the average (over time and across forecasters) forecast dispersion in the SPF without changing the main messages of our paper.

Suppose that each period, each forecaster i observes an additional signal $z_{it}$ that is the next period’s GDP growth, with common signal noise and idiosyncratic signal noise:

$$z_{it} = y_{t+1} + \eta_t + \epsilon_{it} \qquad (9)$$

where $\eta_t \sim N(0, \sigma_\eta^2)$ is common to all forecasters and $\epsilon_{it} \sim N(0, \sigma_\epsilon^2)$ is i.i.d. across forecasters. The two signal noise variances $\sigma_\eta^2$ and $\sigma_\epsilon^2$ can be calibrated to match the average dispersion of forecasts and the standard deviation of mean forecast over time.
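The link between the idiosyncratic noise variance and forecast dispersion can be illustrated with a small simulation. The sketch below makes assumptions that go beyond the text: it gives every forecaster the same normal prior over next-period growth (in the skewed model beliefs are not normal), uses the textbook normal-normal updating rule, and picks all numbers arbitrarily. The function names are hypothetical.

```python
import random
import statistics


def posterior_mean(prior_mean, prior_var, signal, noise_var):
    """Normal-normal update: shrink the signal toward the prior mean."""
    weight = prior_var / (prior_var + noise_var)
    return prior_mean + weight * (signal - prior_mean)


def simulate_forecasts(y_next, prior_mean, prior_var, sd_eta, sd_eps, n, seed=0):
    """Forecasts of n forecasters who each see z_i = y_next + eta + eps_i."""
    rng = random.Random(seed)
    eta = rng.gauss(0.0, sd_eta)  # common noise, one draw per period
    noise_var = sd_eta ** 2 + sd_eps ** 2
    forecasts = []
    for _ in range(n):
        z_i = y_next + eta + rng.gauss(0.0, sd_eps)  # idiosyncratic noise
        forecasts.append(posterior_mean(prior_mean, prior_var, z_i, noise_var))
    return forecasts


# Hypothetical values in percent growth units, for illustration only.
PRIOR_MEAN, PRIOR_VAR = 2.5, 4.0
SD_ETA, SD_EPS = 1.0, 1.5
forecasts = simulate_forecasts(1.0, PRIOR_MEAN, PRIOR_VAR, SD_ETA, SD_EPS, n=5000)

weight = PRIOR_VAR / (PRIOR_VAR + SD_ETA ** 2 + SD_EPS ** 2)
print(f"simulated dispersion: {statistics.pstdev(forecasts):.3f}")
print(f"implied by weight * sd_eps: {weight * SD_EPS:.3f}")
```

Under these assumptions the cross-sectional dispersion of forecasts equals the updating weight times the idiosyncratic noise standard deviation, while the common noise moves all forecasts together; this is why the two variances can be pinned down by two separate moments.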
