feds · May 19, 2025

Scenario Synthesis and Macroeconomic Risk

Abstract

We introduce methodology to bridge scenario analysis and model-based risk forecasting, leveraging their respective strengths in policy settings. Our Bayesian framework addresses the fundamental challenge of reconciling judgmental narrative approaches with statistical forecasting. Analysis evaluates explicit measures of concordance of scenarios with a reference forecasting model, delivers Bayesian predictive synthesis of the scenarios to best match that reference, and addresses scenario set incompleteness. This underlies systematic evaluation and integration of risks from different scenarios, and quantifies relative support for scenarios modulo the defined reference forecasts. The framework offers advances in forecasting in policy institutions that support clear and rigorous communication of evolving risks. We also discuss broader questions of integrating judgmental information with statistical model-based forecasts in the face of unexpected circumstances.

Finance and Economics Discussion Series Federal Reserve Board, Washington, D.C. ISSN 1936-2854 (Print) ISSN 2767-3898 (Online) Scenario Synthesis and Macroeconomic Risk Tobias Adrian, Domenico Giannone, Matteo Luciani, and Mike West 2025-036 Please cite this paper as: Tobias Adrian, Domenico Giannone, Matteo Luciani, and Mike West (2025). “Scenario Synthesis and Macroeconomic Risk,” Finance and Economics Discussion Series 2025-036. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2025.036. NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Tobias Adrian,1 Domenico Giannone,2 Matteo Luciani,3 Mike West4

May 9, 2025

Keywords: Macroeconomic Forecasting, Mixtures of Scenarios, Misclassification Rates, Entropic Tilting, Bayesian Predictive Synthesis, Judgmental Forecasting, Forecast Risk Assessment

1 Tobias Adrian, Director of the Monetary and Capital Markets Department, International Monetary Fund, 700 19th Street NW, Washington, DC 20431, U.S.A. tadrian@imf.org
2 Domenico Giannone, Assistant Director, International Monetary Fund, 700 19th Street NW, Washington, DC 20431, U.S.A. dgiannon2@gmail.com
3 Matteo Luciani, Principal Economist, Board of Governors of the Federal Reserve System, 20th Street and Constitution Avenue NW, Washington, DC 20551, U.S.A. matteo.luciani@frb.gov
4 Mike West, The Arts & Sciences Distinguished Professor Emeritus of Statistics & Decision Sciences, Duke University, Durham, NC 27708, U.S.A. mike.west@duke.edu

Disclaimer: The views expressed in this paper are those of the authors and do not necessarily reflect the views and policies of the Board of Governors, the Federal Reserve System, or the International Monetary Fund, its Management, or its Executive Directors.

1 Introduction

Macroeconomic policy institutions such as central banks rely heavily on forecasting methods. Monetary policymakers are regularly briefed on the economic outlook, alternative policy paths, and the balance of risks around the central forecast. Central bank staff rely on a combination of structural macroeconomic models, reduced-form empirical models, and judgmental approaches to prepare such monetary policy briefings. The central forecast is used as a basis for alternative policy path discussions, and the balance of risks is discussed more loosely based on scenario analysis.

The Bank of England pioneered communication of risk with fan charts in 1993; the Inflation Reports show central projections of inflation with charts that reflect uncertainty. Uncertainty intervals are derived from judgmental assessments of risk around the baseline (Britton et al., 1998). Since 1995, the U.S. Federal Reserve’s Tealbook (TB) has presented scenarios as perturbations around baseline forecasts. Most major central banks now use some variant of these approaches.

Fan charts and scenario analysis pose practical challenges as they require frequent updating and quantification of risks based on judgment. Hence central banks are relying more often on statistical methods to forecast macroeconomic risk. The density forecasting approach of “Growth-at-Risk” (GaR: Adrian et al., 2016, 2019; Plagborg-Møller et al., 2020; Adrian et al., 2022) is increasingly popular. The Tealbook has included GaR measures together with scenarios since 2017; other central banks have also implemented GaR approaches in addition to the more judgmental scenario methods (e.g., Figueres and Jarociński, 2020; Lenza et al., 2023; Aikman et al., 2019; Eguren-Martin et al., 2024; Anesti et al., 2023; Jondeau et al., 2022; Alessandri et al., 2019).

Our focus here is on a formal statistical approach to integrating scenario-based balance of risk discussions with statistical forecasts.
The methodology defines a synthesis of the baseline and scenarios that best matches the statistical reference forecast distribution, the latter typically from GaR and/or a quantile regression model. The scenario synthesis assigns weights to each scenario, quantifying their relative concordance with the reference and so explaining why a certain set of scenarios is particularly relevant. The analysis also incorporates a synthetic “backstop” scenario designed to address potential incompleteness of the defined scenario set.

In practice, uncertainty measures are usually published only for the baseline; alternative scenarios are typically represented only in terms of point forecasts. We use extensions of the Bayesian decision analysis method of entropic tilting (e.g. Robertson et al., 2005; Tallman and West, 2022) to define full scenario forecast distributions as perturbations of the baseline. Analysis further addresses scenario information beyond a single point forecast, specifically to use scenario tail percentiles that reflect measures of scenario risk. This links to the desirability of scenario hypotheses that represent more radical perturbations of the baseline than has been typical (e.g., Justiniano and Primiceri, 2008; Fernández-Villaverde et al., 2011; Adrian and Boyarchenko, 2012; He and Krishnamurthy, 2012; Brunnermeier and Sannikov, 2014; Fernández-Villaverde et al., 2023, 2024 with structural models, and Adrian et al., 2021; Caldara et al., 2021; Carriero et al., 2024 with reduced-form models). That scenarios considered by policy institutions often represent only modest perturbations of the baseline is also partly addressed by our use of the synthetic backstop scenario. This can serve as a “red flag” when the scenario set fails to account for risks, especially tail risks, supported under the statistical reference.

Relating baseline and scenario forecasts to the statistical reference exploits Bayesian predictive synthesis (e.g. McAlinn and West, 2019; Johnson and West, 2025) to motivate a discrete mixture (linear pool) of baseline and scenario distributions as a proxy for scenario-based forecasting.

The “match” of such a mixture with the statistical reference distribution uses a central measure of concordance between distributions, namely the expected misclassification rate (EMR). Identifying mixture weights to optimize EMR is then a formal Bayesian decision problem. Relative scenario probabilities based on this scenario:reference optimization guide evaluation and interpretation of the roles of scenarios. The analysis includes explicit statistical measures of scenario set incompleteness reflecting aspects of lack of concordance of the scenario synthesis with the reference. This aids in the policy setting on the question of whether the baseline and chosen scenarios adequately reflect all the risks captured by the reference.

The case study draws on published versions of the TB. We use data from reports prepared for the December FOMC meetings in 2007 and 2018, giving predictions for 2008 and 2019, respectively. Following the TB, we focus on risks to real growth. Reanalysis incorporating other variables, such as inflation and unemployment at risk (Adams et al., 2021; Kiley, 2022; López-Salido and Loria, 2024), is straightforward but beyond our main scope here. Our detailed examples highlight the generation of scenario weights reflecting aspects of concordance with the reference, and also the questions of scenario set incompleteness. One example of the latter highlights the lack of a very negative “downside risk scenario” in both the 2007 and 2018 TB. This relates to the particular interest in our analysis when economic uncertainty is high so that defining an adequate baseline forecast is challenging. Then, listing and discussing a range of plausible scenarios, each with an assigned probability derived from the reference match, offers a richer perspective on informed decision-making under uncertainty.

Our analysis takes baseline and scenarios (as well as the reference) as given.
In policy practice, of course, the back-and-forth between changes to statistical forecast distributions and the evolving narrative of scenarios provides a rich ground to rigorously examine shifts in the balance of risks. This was noted by Bernanke (2023) and is germane to the TB, where scenario-based approaches to the balance of risk and statistical forecast distributions are discussed separately. As Federal Reserve Chair Jerome Powell noted during the Press Conference following the January 2025 FOMC meeting, “One of the things our staff does is they look at a range of possible outcomes. [...] There’ll be baseline, and then they’ll show six or seven alternative scenarios, including really good ones and not so good ones. And what those do is they spark [...] the policymakers to sort of think and understand about [...] the uncertainties that surround us.”

Our methodology provides formal cross-talk that can aid macroeconomic staff in policy institutions: it combines the communicative strength of narrative scenarios with the statistical rigor of predictive models, identifying the most relevant risks with easy-to-understand stories and quantifying the relevance of these stories.

Section 2 discusses foundations and overviews methodology. Section 3 addresses partial scenario information. Section 4 introduces expected misclassification rates as distributional concordance metrics, with foundational insights. Section 5 develops the embedding of scenario analysis in a fully Bayesian framework, with core theoretical summaries and aspects of computational implementation. Section 6 summarizes key aspects of the detailed case study. Section 7 links to broader questions of combining judgmental information with statistical model-based forecasts. The Appendix adds technical and methodological details. Summary comments define Section 8.

2 Setting, Foundations and Perspective

2.1 Context and Goals

Interest lies in forecasting a vector outcome $y$, such as a path of several macroeconomic indicators over multiple future time periods, based on the following ingredients.

• A policy-based analysis produces a predictive density $p_0(y)$, referred to as the baseline density.
• Relative to the baseline, the policy analysis considers each of a set of alternative scenarios; scenario $j$, labeled $S_j$, generates a predictive density $p_j(y)$. These are regarded as hypothetical scenarios to be assessed relative to the baseline.
• The baseline is a given forecast distribution in the policy setting, so not an hypothetical scenario; that understood, we use $S_0$ and the index $j = 0$ to designate the baseline.
• Separately, a statistical model (e.g., the statistical GaR analysis) produces a full predictive density $p(y)$, referred to as the reference predictive density.

The over-arching goal is to identify “closeness” of each scenario to the reference $p(y)$, and rank them relative to that assessment. The methodology we introduce addresses this, building on foundational statistical concepts and model developments now discussed.

2.2 Scenario Mixtures and Bayesian Predictive Synthesis

A Bayesian decision-maker in the policy setting can regard the set of scenario p.d.f.s $p_j(y)$ as “information” to use in forming a policy-relevant overall forecast. This involves some form of pooling of the predictions across the baseline and alternative scenarios. Here the foundational theory of Bayesian predictive synthesis (BPS), and the specific class of “mixture BPS” models (McAlinn and West, 2019, section 2.2; Johnson and West, 2025), applies. Under BPS, a valid Bayesian predictive analysis can be based on a scenario mixture, i.e., a distribution with p.d.f. $f(y|\alpha)$ that is a linear pool of the $p_j(y)$ with respect to probability weights $\alpha_j$ in a vector $\alpha$, namely $f(y|\alpha) \propto \sum_j \alpha_j p_j(y)$.

A key theoretical aspect of mixture BPS is that it can address the broad question of “scenario set (in-)completeness”.
That is, a setting in which the baseline $S_0$ and all of the alternative scenarios $S_j$ considered are discordant with the reference $p(y)$. This relates to the “model set incompleteness” issue widely discussed in Bayesian econometrics. BPS theory addresses this by requiring an additional p.d.f. to extend the initial set and to use in the mixture. This has been exploited in BPS applications, and in its generalization to decision-guided settings (BPDS: Tallman and West, 2023; Chernis et al., 2024), by structuring the additional p.d.f. as a backstop that can be expected to be supported by future data that is not so well-predicted by the initial model set. Examples in the above studies use an over-dispersed average of the initial mixture of model p.d.f.s, and this strategy can be adopted for scenario analysis. The specific construction of such a backstop scenario in the case study in Section 6 provides an example of this modelling strategy.

Index the alternative scenarios from the policy setting by $j = 1{:}J-1$ with the baseline $j = 0$ and now with $j = J$ for the chosen backstop p.d.f. The latter is labeled $S_J$ though it is a purely synthetic scenario chosen for the above purposes. Then the overall scenario mixture p.d.f. is

$$f(y|\alpha) = \sum_{j=0:J} \alpha_j p_j(y). \tag{1}$$
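As an illustration of the mixture in eqn. (1), the following is a minimal sketch in Python for a univariate outcome. The normal component densities, their parameters, the weights, and the helper names `normal_pdf` and `mixture_pdf` are all hypothetical choices made for this example; the backstop is simply taken as a wide density centered at the baseline, which is one possible reading of an "over-dispersed" backstop rather than the construction used in the case study.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a univariate normal distribution."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(y, alpha, pdfs):
    """Scenario mixture f(y | alpha) = sum_j alpha_j * p_j(y), as in eqn. (1)."""
    if abs(sum(alpha) - 1.0) > 1e-12 or min(alpha) < 0.0:
        raise ValueError("alpha must be a probability vector")
    return sum(a * p(y) for a, p in zip(alpha, pdfs))

# Hypothetical example: baseline (j = 0), two alternative scenarios, and an
# over-dispersed backstop (j = J = 3). All numbers are illustrative only.
pdfs = [
    lambda y: normal_pdf(y, 2.0, 1.0),   # S_0: baseline
    lambda y: normal_pdf(y, 0.5, 1.0),   # S_1: weaker growth
    lambda y: normal_pdf(y, 3.0, 1.0),   # S_2: stronger growth
    lambda y: normal_pdf(y, 2.0, 3.0),   # S_3: wide synthetic backstop
]
alpha = [0.5, 0.2, 0.2, 0.1]
f_at_2 = mixture_pdf(2.0, alpha, pdfs)   # mixture density at y = 2
```

Because the weights form a probability vector and each component is a proper density, the pooled function is itself a proper density.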

2.3 Incomplete Specification of Scenario Forecast Distributions

Scenario p.d.f.s $p_j(y)$ are typically only partially specified. A common setting is that $S_j$ defines point forecasts such as means or medians, with or without uncertainty measures such as a few other percentiles. The foundational concept is that the alternative scenarios represent economically relevant “what-if?” perturbations of the baseline. Hence receiving such partial information on $S_j$ indicates a modification of $p_0(y)$ to match that partial information. Our approach aims to identify $p_j(y)$ that is “closest to” the baseline $p_0(y)$ subject to being consistent with that partial scenario information. The theoretical basis for methodology to do this, detailed in Section 3.2, is that of entropic tilting (ET: Tallman and West, 2022). Since its introduction by Robertson et al. (2005), ET-based methodology has seen increasing use in forecasting in econometrics, finance and related areas (e.g. Krüger et al., 2017; Metaxoglou et al., 2018; Koop et al., 2019; Antolín-Díaz et al., 2021; Clark et al., 2022; West, 2024; Crump et al., 2025). The current setting is different, though use here of ET is close in spirit and goals to its original use in imposing constraints on a given forecast distribution, here the baseline.

2.4 Scenario-Reference Concordance

The goal of measuring concordance of scenarios with the statistical reference is now that of relating $f(y|\alpha)$ in eqn. (1) to the reference p.d.f. $p(y)$. This is addressed by identifying the probability vector $\alpha = (\alpha_0, \ldots, \alpha_J)'$ such that the scenario mixture is “closest to” $p(y)$. This requires specification of a utility function to characterize and quantify “close” in comparing densities, and then the resulting methodology to evaluate $\alpha$ and thus define both scenario-specific weights and the overall mixture synthesis. Section 4.1 introduces a foundational metric for this, based on a measure of concordance of $f(y|\alpha)$ and $p(y)$ from traditional statistical classification.
With some new and relevant theoretical results and motivating examples, this underlies its use in scenario synthesis.

3 Partial Scenario Information and Entropic Tilting

3.1 Partial Scenario Information

As noted in Section 2.3, the common setting is that for each scenario only partial information relative to the fully specified baseline is provided. In many examples, the partial information can be represented as expectations of functions of $y$, and this is the setting we adopt. Often, only the perturbed central tendency is reported. If taken as a mean, it would be a constraint on the expected value directly. If taken as the median, then it is formally defined as the expectation of an indicator function. Similar reasoning applies to other percentiles. Our analysis below addresses multiple scenario features simultaneously, such as a set of percentiles.

Suppose that $S_j$ provides partial information on $p_j(y)$ in terms of $m_j = E[s_j(y)|S_j]$ where $s_j(y)$ is a $q_j$-vector of scenario scores; call the given vector $m_j$ the target score for $S_j$. In general, the definition of scores can be scenario-specific, but here we assume that $q_j = q$ and $s_j(y) = s(y)$ for all $j = 1{:}J$. A main case of interest has elements of $s(y)$ as indicator functions in one or more of the univariate dimensions; then $m_j$ is a given vector of percentiles of $p_j(y)$ in those dimensions. With $S_j$ regarded as a perturbation of the baseline, methodology aims at identifying that $p_j(y)$ closest to the baseline $p_0(y)$ subject to being consistent with the forecast information $m_j$. Entropic tilting (ET) results if we choose to define “close to” in a Kullback-Leibler (KL) sense.
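The representation of a percentile as the expectation of an indicator score, described above, can be made concrete with a small sketch. The sample values, the thresholds, and the helper names `expected_score` and `below` are hypothetical and serve only to illustrate the idea.

```python
def expected_score(sample, weights, score):
    """Weighted Monte Carlo estimate of E[s(y)] on a discrete sample."""
    total = sum(weights)
    return sum(w * score(y) for y, w in zip(sample, weights)) / total

def below(c):
    """Indicator score s(y) = 1 if y <= c, else 0."""
    return lambda y: 1.0 if y <= c else 0.0

# Hypothetical equally weighted baseline sample of annual growth rates (%).
sample = [-1.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
weights = [1.0] * len(sample)

# Under this baseline sample, P(y <= 2.0) = 5/9, so 2.0 is roughly the median.
p_base = expected_score(sample, weights, below(2.0))

# A scenario stating "median growth is 1.0" corresponds to the target score
# m_j = 0.5 for below(1.0); the baseline assigns only 3/9 to that event.
p_scen = expected_score(sample, weights, below(1.0))
```

The gap between the baseline value of the score expectation and the scenario target is what a tilting step would then have to close.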

3.2 Entropic Tilting and Scenario-Baseline ET Weights

ET-based methodology, recently exploited in new ways in Bayesian predictive decision synthesis (e.g. Chernis et al., 2024; Tallman and West, 2023, 2025), was originally used in imposing constraints on forecast distributions; that is the context here. In our setting, ET aims to identify $p_j(y)$ to minimize the KL divergence of the baseline $p_0(y)$ from $p_j(y)$ subject to $m_j = E[s_j(y)|S_j] = \int_y s_j(y) p_j(y)\,dy$. ET theory (Tallman and West, 2022) yields

$$p_j(y) = k_j e^{\tau_j' s_j(y)} p_0(y) \quad \text{where} \quad k_j^{-1} = \int_y e^{\tau_j' s_j(y)} p_0(y)\,dy, \tag{2}$$

in which $\tau_j$ is the (provably unique) tilting vector such that the expectation constraint is satisfied. The implied identity $0 = \int_y \{s_j(y) - m_j\} \exp\{\tau_j' s_j(y)\} p_0(y)\,dy$ is typically efficiently solved for $\tau_j$ using simple Newton-Raphson.

In practice, it is typical that the baseline is represented in terms of a Monte Carlo (MC) sample, i.e., defined as a discrete distribution $\{y^i, w_0^i\}_{i=1:n}$ with support points $y^i$ having weight (probability) $w_0^i$. This is particularly key in our setting as we will later use importance sampling to evaluate $p_0(y)$ relative to the statistical reference $p(y)$. Then expectations defining the ET tilting vectors $\tau_j$ are trivially evaluated via simple Monte Carlo integration.

ET analysis can be regarded as using $p_0(y)$ as an importance sampling proposal with respect to a target p.d.f. $p_j(y)$. This was recognized by Robertson et al. (2005) and provides useful numerical checks on consistency of the scenario-specific moment constraints with the baseline. On sample values $y^i$, the implied normalized IS weights for MC integration in eqn. (2) are $w_j^i \propto u_j^i w_0^i$ where $u_j^i \propto p_j(y^i)/p_0(y^i) \propto \exp\{\tau_j' s_j(y^i)\}$. The $u_j^i$ are called ET weights. The standard effective sample size (ESS) can be evaluated on the $u_j^i$. ESS, the reciprocal of the sum of squared normalized $u_j^i$ over $i = 1{:}n$, provides an overall assessment of concordance of the $S_j$ constraints with the baseline (Tallman and West, 2022, sect. 1.6).
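The tilting step on a Monte Carlo baseline sample can be sketched as follows. This is a minimal illustration under simplifying assumptions that are ours, not the paper's: a scalar outcome, a single scalar score $s(y) = y$ (a mean constraint), an equally weighted baseline sample, and hypothetical function names `tilted_weights` and `entropic_tilt`. The Newton-Raphson update uses the fact that the derivative of the tilted mean with respect to $\tau$ is the tilted variance.

```python
import math
import random

def tilted_weights(scores, base_w, tau):
    """Normalized weights proportional to base_w[i] * exp(tau * scores[i])."""
    shift = max(tau * s for s in scores)  # subtract the largest exponent for stability
    u = [w * math.exp(tau * s - shift) for w, s in zip(base_w, scores)]
    total = sum(u)
    return [ui / total for ui in u]

def entropic_tilt(scores, base_w, target, tol=1e-10, max_iter=100):
    """Newton-Raphson for the scalar tau whose tilted mean score equals target."""
    tau = 0.0
    for _ in range(max_iter):
        w = tilted_weights(scores, base_w, tau)
        mean = sum(wi * s for wi, s in zip(w, scores))
        if abs(mean - target) < tol:
            break
        var = sum(wi * (s - mean) ** 2 for wi, s in zip(w, scores))
        tau += (target - mean) / var  # derivative of the tilted mean is its variance
    return tau, tilted_weights(scores, base_w, tau)

random.seed(0)
n = 5000
y = [random.gauss(2.0, 1.0) for _ in range(n)]     # hypothetical baseline draws
w0 = [1.0 / n] * n                                 # equal baseline weights
tau, w1 = entropic_tilt(y, w0, target=1.0)         # scenario: mean growth of 1.0
tilted_mean = sum(wi * yi for wi, yi in zip(w1, y))
ess_pct = 100.0 / (n * sum(wi * wi for wi in w1))  # % effective sample size
```

For a percentile constraint, the same routine would be applied with the scores replaced by indicator values $1(y^i \le c)$ and the target by the stated probability. A low `ess_pct` signals a scenario constraint that sits far from the bulk of the baseline sample.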
ESS relates closely to the minimized KL divergence (e.g., Gruber and West, 2016, sect. 3.3; Gruber and West, 2017, sect. 5.4) but is on an interpretable scale.

4 Predictive Concordance

4.1 Predictive Concordance and Misclassification Rates

Predictive concordance mooted in Section 2.4 is presented here in a general setting comparing two density functions $p(y)$ and $f(y)$. The scenario mixture setting then arises with $f(y)$ replaced by $f(y|\alpha)$ of eqn. (1) for any given $\alpha$.

Assume that $p(y)$ and $f(y)$ have the same support. Suppose a random draw $y$ is made from either $f(y)$ or $p(y)$ with equal probabilities. It is not disclosed which distribution generates the outcome $y$. Write $H_p$ for the hypothesis that $y \sim p(y)$, and $H_f$ for the hypothesis that $y \sim f(y)$. Since the choice is made with $\Pr(H_p) = \Pr(H_f) = 0.5$, the resulting posterior probabilities conditional on the observed $y$ are $\Pr(H_p|y) = p(y)/\{p(y)+f(y)\}$ and $\Pr(H_f|y) = 1 - \Pr(H_p|y)$.

Now assume that $y$ is actually a draw from $p(y)$, i.e., condition on $H_p$. Before learning $y$, the expected posterior probability on $H_f$ is then

$$\pi_{pf} \equiv E[\Pr(H_f|y)|H_p] = \int_y \Pr(H_f|y)\, p(y)\,dy = \int_y \frac{f(y)p(y)}{\{f(y)+p(y)\}}\,dy. \tag{3}$$
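The integral in eqn. (3) is easy to evaluate numerically in one dimension. The sketch below uses a plain Riemann sum on a grid with hypothetical normal densities; the function name `emr`, the grid limits, and the example parameters are assumptions made for illustration, and the grid quadrature is our simplification rather than the paper's Monte Carlo evaluation.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a univariate normal distribution."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def emr(p, f, lo, hi, n=20001):
    """Expected misclassification rate of eqn. (3) by a Riemann sum on [lo, hi]."""
    dy = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        y = lo + i * dy
        py, fy = p(y), f(y)
        if py + fy > 0.0:          # skip points where both densities underflow
            total += fy * py / (fy + py)
    return total * dy

p = lambda y: normal_pdf(y, 0.0, 1.0)            # hypothetical reference
emr_same = emr(p, p, -15.0, 15.0)                # identical densities
emr_shift = emr(p, lambda y: normal_pdf(y, 1.0, 1.0), -15.0, 15.0)
emr_far = emr(p, lambda y: normal_pdf(y, 4.0, 1.0), -15.0, 15.0)
```

Identical densities return a value of one half up to quadrature error, and the value falls toward zero as the two densities separate.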

By symmetry, if $y$ is actually from $H_f$, the expected posterior probability $\pi_{fp} = E[\Pr(H_p|y)|H_f]$ is obviously the same, $\pi_{fp} = \pi_{pf}$.

Predictive concordance of $f(y)$ with $p(y)$ is inherently measured by the expected misclassification rate (EMR) $\pi_{pf}$. Higher values indicate that it is difficult to discriminate $f(y)$ from $p(y)$, so that draws from $f(y)$ are more likely to be misclassified as coming from $p(y)$, and vice-versa. This is a natural, interpretable metric to assess concordance, or discordance, of the two distributions.

In traditional classification in statistics and machine learning, the optimal Bayesian classifier judges $y$ as coming from $f(y)$ with probability $\Pr(H_f|y)$. Averaging across $y \sim p(\cdot)$, and using standard terminology, $1 - \pi_{pf}$ is then both the population sensitivity and (due to the comparison of just two distributions and the implied symmetry) the population specificity of the optimal Bayesian classifier. It follows that $1 - \pi_{pf}$ is the traditional overall accuracy of the test comparing $f(\cdot)$ and $p(\cdot)$, and so EMR $\pi_{pf} = 1 - \text{accuracy}$ is the traditional error rate. Increasing EMR indicates decreased discrimination of $f(\cdot)$ from $p(\cdot)$. Judging $f(\cdot)$ to be “close to” $p(\cdot)$ at higher values of $\pi_{pf}$ is thus theoretically fundamental and practically interpretable.

It is immediate that $\pi_{pf} \le 0.5$ with equality only when $f(\cdot) \equiv p(\cdot)$, defining the absolute scale for assessment of concordance. To prove this, note that $\pi_{pf} = E[r(y)/\{1+r(y)\}|H_p]$ where $r(y) = f(y)/p(y)$ with $E[r(y)|H_p] = 1$. Now, $r/(1+r)$ is concave on $r > 0$ so that, by Jensen's inequality, $\pi_{pf} \le E[r(y)|H_p]/\{1 + E[r(y)|H_p]\} = 1/2$. The upper bound is achieved when $f(y) \equiv p(y)$, i.e., $r(y) = 1$ for all $y$.

Now consider a decision setting where $f(\cdot)$ is to be chosen to be “close to” $p(y)$, and when $y \sim p(\cdot)$. Choosing $f(\cdot)$ to maximize $\pi_{pf}$ subject to relevant constraints is the optimal decision with respect to the implied constrained version of utility function $\Pr(H_f|y)$. This defines the Bayesian foundation of use of EMR in the scenario synthesis development in Section 5.
4.2 Relationships to Kullback-Leibler Divergence

Note that $\pi_{pf} = E[1/[1+\exp\{k(y)\}]|H_p]$ where $k(y) = \log\{p(y)/f(y)\}$. Under $H_p$, the scalar random quantity $k(y)$ has expectation $KL(p\|f) \equiv E[k(y)|H_p] = \int_y \log\{p(y)/f(y)\}\, p(y)\,dy$, the Kullback-Leibler divergence of $f(\cdot)$ from $p(\cdot)$. Assuming this expectation is finite, the delta approximation yields $\pi_{pf} \approx 1/[1+\exp\{KL(p\|f)\}]$; thus choosing $f(\cdot)$ to maximize $\pi_{pf}$ is approximately the KL divergence minimizing solution. In many cases of practical relevance, this also provides a strict lower bound on $\pi_{pf}$, i.e., $\pi_{pf} \ge 1/[1+\exp\{KL(p\|f)\}]$; see Appendix A.1. Both the direct approximation and the lower bound are accurate in cases of higher concordance.

Then, the symmetry of EMR in $f(\cdot)$ and $p(\cdot)$ implies that the same results hold with the two densities exchanged. With $KL(f\|p)$ the divergence of $p(\cdot)$ from $f(\cdot)$, this immediately refines the lower bound to $\pi_{pf} \ge 1/[1+\exp(\kappa_{pf})]$ where $\kappa_{pf} = \min\{KL(p\|f), KL(f\|p)\}$, with the direct delta approximation giving approximate equality. KL divergence always raises the question of directional definition. This does not arise in using $\pi_{pf}$ due to its symmetry, and this link to KL indicates the relevant “symmetrization” of KL as $\kappa_{pf}$. In cases of relatively good concordance, the two directional measures will also be close. Additional aspects of the relationship are discussed and exemplified in Appendix A.

EMR is fundamental for reasons discussed above; we have presented these connections to KL as it is a well-known measure. A major caveat is that it assumes KL measures are finite. There are important practical contexts where this is not so. An example has $f(y)$ Gaussian and $p(y)$ log T with any degrees of freedom; then $p(y)$ has no moments at all (e.g. West, 2024, Supplementary Material, Appendix B) and $KL(p\|f)$ is infinite. In contrast, $\pi_{pf} \in (0, 0.5]$ always.
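The relationship between EMR and the KL-based bound can be checked numerically in a simple case. The sketch below compares $\pi_{pf}$ with $1/[1+\exp(\kappa_{pf})]$ for two unit-variance normal densities at several mean shifts, a case in which $k(y)$ is symmetric about a positive mean so the lower bound applies. The choice of normals, the shifts, the grid quadrature, and the function names are assumptions made for illustration only.

```python
import math

def normal_pdf(y, mean, sd):
    """Density of a univariate normal distribution."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def emr(p, f, lo, hi, n=20001):
    """Expected misclassification rate by a Riemann sum on [lo, hi]."""
    dy = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        y = lo + i * dy
        py, fy = p(y), f(y)
        if py + fy > 0.0:
            total += fy * py / (fy + py)
    return total * dy

def kl_normal(m1, s1, m2, s2):
    """Closed-form KL( N(m1, s1^2) || N(m2, s2^2) )."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2.0 * s2 ** 2) - 0.5

results = []  # tuples of (shift, EMR, KL-based lower bound)
for shift in (0.5, 1.0, 2.0):
    p = lambda y: normal_pdf(y, 0.0, 1.0)
    f = lambda y, m=shift: normal_pdf(y, m, 1.0)
    kappa = min(kl_normal(0.0, 1.0, shift, 1.0), kl_normal(shift, 1.0, 0.0, 1.0))
    bound = 1.0 / (1.0 + math.exp(kappa))
    results.append((shift, emr(p, f, -15.0, 17.0), bound))
```

In this example the bound sits just below the EMR for the smallest shift and the gap widens as the densities move apart, in line with the remark that the approximation is accurate in cases of higher concordance.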

5 Scenario Synthesis

5.1 EMR and Optimizing Scenario Mixture Probabilities

The predictive concordance concept applies to the scenario mixture setting with $f(y)$ replaced by $f(y|\alpha) = \sum_{j=0:J} \alpha_j p_j(y)$ at any chosen probability vector $\alpha = (\alpha_0, \ldots, \alpha_J)'$. Making dependence on $\alpha$ explicit, eqn. (3) is now

$$\pi_{pf}(\alpha) = \int_y \frac{f(y|\alpha)p(y)}{\{f(y|\alpha)+p(y)\}}\,dy. \tag{4}$$

Values of $\alpha$ yielding high values of $\pi_{pf}(\alpha)$ define mixtures of baseline and scenarios “close to” the reference $p(y)$ in terms of probabilistic concordance. Suppose $\hat\alpha$ maximizes $\pi_{pf}(\alpha)$ with maximum value $\hat\pi_{pf} = \pi_{pf}(\hat\alpha)$. Each element $\hat\alpha_j$ of $\hat\alpha$ quantifies the extent to which $S_j$ is concordant with the reference relative to the other $S_i$ for $i \ne j$. The summary $\hat\pi_{pf}$ is a concrete measure of the concordance of the set of predictions from the baseline and the scenarios combined. A low value of $\hat\pi_{pf}$ indicates that none of the $S_j$ nor their mixture are really concordant with the reference, relating to the scenario set incompleteness discussion of Section 2.2. Thus $\hat\pi_{pf}$ measures how “discordant” the scenario set is with the reference statistical predictions. The weight $\hat\alpha_J$ on the backstop provides additional information.

The framework addresses selection of $\alpha$ as a decision problem that maximizes $\pi_{pf}(\alpha)$ with regularization to penalize very small $\alpha_j$. This is based on deeper foundational and theoretical development in the next subsection, and leads to choosing $\alpha^*$ to maximize the objective function

$$\lambda(\alpha) = \log\{\pi_{pf}(\alpha)\} + \epsilon \sum_{j=0:J} \log(\alpha_j), \quad \text{subject to } \alpha_j > 0 \ (j = 0{:}J) \text{ and } \sum_{j=0:J} \alpha_j = 1, \tag{5}$$

where $\epsilon > 0$ is a very small regularization parameter. As we now show, eqn. (5) is in fact the log of a formal posterior distribution so that the optimization seeks the posterior mode.
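To make the optimization in eqn. (5) concrete, the following sketch maximizes $\lambda(\alpha)$ for a small univariate example. Several choices here are ours and should be read as assumptions: the integral in eqn. (4) is approximated by a Riemann sum on a grid rather than by Monte Carlo, the densities are hypothetical normals, and the optimizer is a simple exponentiated-gradient ascent with step halving, chosen because it keeps $\alpha$ on the simplex automatically. It is not the algorithm of the paper's Appendix C.

```python
import math

def normal_pdf(y, mean, sd):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def objective(alpha, P, p, dy, eps):
    """lambda(alpha) of eqn. (5), with eqn. (4) evaluated by a Riemann sum."""
    if min(alpha) <= 0.0:
        return float("-inf")
    emr = 0.0
    for row, pk in zip(P, p):
        fk = sum(a * x for a, x in zip(alpha, row))
        emr += fk * pk / (fk + pk) * dy
    return math.log(emr) + eps * sum(math.log(a) for a in alpha)

def gradient(alpha, P, p, dy, eps):
    """Gradient of lambda(alpha) with respect to each alpha_j."""
    emr, demr = 0.0, [0.0] * len(alpha)
    for row, pk in zip(P, p):
        fk = sum(a * x for a, x in zip(alpha, row))
        emr += fk * pk / (fk + pk) * dy
        c = (pk / (fk + pk)) ** 2 * dy      # d/df of f p / (f + p), times dy
        for j, x in enumerate(row):
            demr[j] += c * x
    return [d / emr + eps / a for d, a in zip(demr, alpha)]

def fit_weights(P, p, dy, eps, iters=200):
    """Exponentiated-gradient ascent with step halving (an illustrative choice)."""
    k = len(P[0])
    alpha = [1.0 / k] * k
    best, step = objective(alpha, P, p, dy, eps), 1.0
    for _ in range(iters):
        g = gradient(alpha, P, p, dy, eps)
        improved = False
        while step > 1e-12:
            top = max(step * gj for gj in g)
            trial = [a * math.exp(step * gj - top) for a, gj in zip(alpha, g)]
            total = sum(trial)
            trial = [t / total for t in trial]
            value = objective(trial, P, p, dy, eps)
            if value > best:
                alpha, best, improved = trial, value, True
                step *= 2.0
                break
            step *= 0.5
        if not improved:
            break
    return alpha, best

dy = 0.05
grid = [-10.0 + dy * i for i in range(481)]             # covers -10 to 14
p = [normal_pdf(y, 1.0, 1.5) for y in grid]             # hypothetical reference
P = [[normal_pdf(y, 2.0, 1.0), normal_pdf(y, 0.0, 1.0),  # baseline, S_1
      normal_pdf(y, 3.5, 1.0), normal_pdf(y, 2.0, 3.0)]  # S_2, backstop S_3
     for y in grid]
J = 3
eps = 0.005 / (J + 1)
alpha_star, lam_star = fit_weights(P, p, dy, eps)
lam_uniform = objective([0.25] * 4, P, p, dy, eps)
```

Starting from uniform weights, each accepted step raises $\lambda(\alpha)$, so the returned value is at least the uniform-weight value; the penalty term keeps every element of `alpha_star` strictly positive.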
5.2 Bayesian Foundation 5.2.1 EMR is a Likelihood Function Supposethattheeconomicrealityyisgeneratedfromthereferencep(y)andconsideranhypothetical/synthetic binary outcome z generated from the Bernoulli distribution with success probability Pr(z = 1|y,α) = f(y|α)/{p(y)+f(y|α)}. Then p(z = 1,y|α) = Pr(z = 1|y,α)p(y|α) = Pr(z = 1|y,α)p(y) = f(y|α)p(y)/{p(y)+f(y|α)}. Now suppose you observe z = 1 but not y; EMR emerges via expectations over the “missing data” y,viz.,p(z = 1|α) = π (α).Thus,π (α)isinfactalikelihoodfunctionfortheparameterαbased pf pf onanhypotheticalobservationz = 1thatclassifiesarandomdrawfromp(y)ascomingfromf(y) under a 50:50 prior. The connection with the foundation of EMR in Section 4.1 is immediate. It follows that EMR-maximizer α is a maximum likelihood estimate (MLE). Evaluating α is (cid:98) (cid:98) probabilitysimplexconstrainedconvexoptimizationproblemwithauniquesolution,theconvexity and hence uniqueness being shown here in Appendix C. The solution will typically be a sparse mixture of scenarios, with some zeros in α. This follows from general results of optimization of (cid:98) 7

convex functions over the probability simplex (e.g. Boyd and Vandenberghe, 2004). For some integer $k \in \{0{:}J\}$, a subset of $k$ of the $\hat\alpha_j$ can be zero. There are cases when $k = 0$, but $k > 0$ (defining a sparse optimizing vector) is more usual, especially with larger $J$ and diversity among the $p_j(y)$. This relates to general features of optimization over the simplex; simplex constraints operate to shrink weights to the boundaries, effectively as $\ell_1$ shrinkage for sparsity (e.g. Brodie et al., 2009). This underlies the notion of scalability of the analysis to larger numbers of scenarios.

However, sparsity in $\hat\alpha$ is unstable since it is not a genuine feature but is induced by the implicit prior $\ell_1$ penalty; its values are typically very sensitive to small changes in the input scenario and reference p.d.f.s. This pathology of sparsity inducing penalties was identified and documented in the context of forecasting by Giannone et al. (2021). In the current setting, take an example with two very similar scenario p.d.f.s; one of these scenarios will have a zero value in $\hat\alpha$, the other non-zero. Then, a very small change in either of the p.d.f.s, or of the reference p.d.f., will flip the zero/non-zero pattern. At each of these extremes, and for ranges of the $\alpha_j$ on these two scenarios bridging the extremes, the resulting scenario mixture $f(y|\hat\alpha)$ will be almost unchanged. This sensitivity is undesirable; it is desirable to have similar probabilities on the two scenarios. The key point is that a uniform prior on $\alpha$ favors overly sparse models when the likelihood function has modes at the simplex boundaries. This can be addressed by imposing additional constraints or, more foundationally, with a minimally informative “regularizing” prior over $\alpha$.

5.2.2 Priors and Penalties

The natural priors are Dirichlet, $\alpha \sim \mathrm{Dir}(\mathbf{a})$ having p.d.f. $p(\alpha) \propto \prod_{j=0:J} \alpha_j^{a_j - 1}$ over the simplex. Here $a_j > 0$ for all $j$ and, with precision $a = \sum_{j=0:J} a_j$, the means are $a_j/a$ and the prior joint mode has elements $\max\{0, (a_j - 1)/(a - J - 1)\}$. A prior with each $a_j = 1 + \epsilon$ for a very small $\epsilon > 0$ is “minimally informative” subject to the joint prior mode being positive on each scenario. Modifications to $a_j = 1 + \epsilon_j$ to differentially favor scenarios a priori are obviously of interest, but for this paper the symmetric prior is adopted. For given $\epsilon$, the prior joint mode and mean are then each $1/(J + 1)$, i.e., favoring a uniform set of scenario probabilities though with high uncertainty since $\epsilon$ is taken as very small.

Under this prior $\alpha \sim \mathrm{Dir}(\mathbf{1}(1+\epsilon))$, the log posterior is $\lambda(\alpha)$ in eqn. (5), up to an additive constant. The prior is zero at simplex boundaries, hence so is the resulting unimodal posterior. The posterior mode, denoted by $\alpha^*$, maximizes EMR modified by the prior-based penalty that explicitly acts to move from the boundary zero MLE values in $\hat\alpha$ to small but non-zero values. This leads to more stable and robust results and addresses the issues discussed in the previous section.

Analysis requires choice of a (small) value of the regularizing hyper-parameter $\epsilon$. Based on theory in Appendix B, the default recommendation is $\epsilon = c/(J+1)$, where $c = 0.005$. The value of $c$ can be modified somewhat up/down with minimal impact, while the scaling with number of scenarios is important in more heavily penalizing the MLE-based analysis in higher dimensions. Given $\epsilon > 0$, evaluation of the posterior mode $\alpha^*$ to maximize eqn. (5) trivially modifies the probability simplex constrained convex optimization problem with a unique solution. See Appendix C.

It is also of interest to consider analyses with additional constraints on $\alpha$. A key example is to require $\alpha_0 \ge \alpha_j$ for $j = 1{:}J$, consistent with the view that the baseline is the “modal” scenario. In general it is of interest to run comparative analyses with and without such constraints. Such a constraint simply modifies the Dirichlet prior by the indicator of the constraint; this does not impact the convexity of the optimization problem and is trivially implemented.
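The effect of the symmetric regularizing prior can be illustrated with a minimal sketch. It uses only the quantities stated in the text (the default $\epsilon = c/(J+1)$, the Dirichlet mode formula, and the log prior up to a constant); the function names and the two example weight vectors are hypothetical, and the EMR term of eqn. (5) is deliberately left out.

```python
import math

def default_epsilon(J, c=0.005):
    """Default regularizer eps = c / (J + 1) for scenarios indexed j = 0:J."""
    return c / (J + 1)

def dirichlet_mode(a):
    """Joint mode of Dir(a): elements max{0, (a_j - 1) / (a - J - 1)}, with len(a) = J + 1.
    Assumes sum(a) > len(a), which holds when every a_j = 1 + eps with eps > 0."""
    total = sum(a)
    return [max(0.0, (a_j - 1.0) / (total - len(a))) for a_j in a]

def log_prior_penalty(alpha, eps):
    """Log p.d.f. of Dir(1 + eps) up to an additive constant: eps * sum_j log(alpha_j)."""
    if any(x <= 0.0 for x in alpha):
        return -math.inf  # the prior density is zero on the simplex boundary
    return eps * sum(math.log(x) for x in alpha)

J = 7                                   # e.g. baseline, six alternatives and a backstop
eps = default_epsilon(J)                # 0.005 / 8
mode = dirichlet_mode([1.0 + eps] * (J + 1))

# Hypothetical weight vectors: one on the simplex boundary, one in the interior
sparse = [0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0]
dense = [0.4, 0.02, 0.02, 0.02, 0.4, 0.02, 0.02, 0.10]

print("eps =", eps)
print("prior mode =", mode)
print("penalty (sparse) =", log_prior_penalty(sparse, eps))
print("penalty (dense)  =", log_prior_penalty(dense, eps))
```

Any weight vector with an exact zero receives a penalty of minus infinity, which is why the posterior mode cannot sit on the boundary, while the penalty on interior points stays small because $\epsilon$ is small.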

5.3 Monte Carlo Importance Sampling

Analysis prima facie relies on evaluating the p.d.f.s $p(y)$ and each $p_j(y)$, and then performing the integration in eqn. (4). Analytic approximations to the integral may be explored. Specific approximations relate to measures of discriminatory information in classification using mixtures (e.g. Lin et al., 2016). In practice, however, and as already noted in Section 3.2, forecasts will typically be “available” in terms of Monte Carlo (MC) samples, so direct evaluation of $\pi$ by MC integration is a priority (and avoids concerns of assessing the quality of analytic approximations).

The analysis is implemented with the values of the p.d.f.s $p(y)$ and the $p_j(y)$ available only on a (large) sample of MC draws from the reference $p(y)$, a reference random sample. The random sample $y^i$, $(i = 1{:}n)$, is drawn from $p(y)$ and at the first step this defines an importance sample (IS) for the baseline $p_0(y)$ with normalized IS weights $w_0^i \propto p_0(y^i)/p(y^i)$. The discrete distribution $\{y^i, w_0^i\}_{i=1:n}$ defines the MC approximation to the baseline for evaluation of expectations in the downstream analysis. A proviso is that $p(y)$ is a relevant importance sampling proposal; in particular, it should be heavier-tailed than $p_0(y)$. As in all applications of IS, monitoring efficiency measures such as the % effective sample size $\mathrm{ESS} = 100\, n^{-1} / \sum_{i=1:n} (w_0^i)^2$ provides guidance; initial analysis generating a relatively low ESS guides choice of a larger sample size.

This IS analysis then underlies evaluation of scenario-specific ET parameters as in Section 3.2, yielding ET weights $u_j^i \propto \exp\{\tau_j' s_j(y^i)\}$ on sample $y^i$ defining $p_j(\cdot)$ relative to the baseline. Scenario-specific ESS measures using the $u_j^i$ weights are then relevant. In (rare) cases of a scenario that is really discordant with the baseline, a very low ESS indicates such.
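The weighting steps above can be sketched on a toy example. This is an illustration under stated assumptions, not the authors' implementation: the reference and baseline are hypothetical normal densities (the reference simply wider than the baseline), the scenario is specified by a single median, and the ET score is taken to be $s(y) = 1[y \le q] - 0.5$, for which the tilting parameter has a closed form. The product of the ET and IS weights gives compound weights on the reference sample.

```python
import math
import random

def normal_pdf(y, mean, sd):
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def ess_percent(weights):
    """% effective sample size, 100 / (n * sum_i w_i^2), for normalized weights."""
    return 100.0 / (len(weights) * sum(w * w for w in weights))

def tilt_to_percentile(sample, base_weights, q, prob):
    """ET weights u_i proportional to exp{tau * s(y_i)}, with s(y) = 1[y <= q] - prob,
    chosen so the tilted distribution puts mass `prob` at or below q.
    With one constraint, tau solves B*e^tau / (B*e^tau + 1 - B) = prob, B = base mass below q."""
    below = sum(w for v, w in zip(sample, base_weights) if v <= q)
    tau = math.log(prob * (1.0 - below) / ((1.0 - prob) * below))
    u = [math.exp(tau * ((1.0 if v <= q else 0.0) - prob)) for v in sample]
    return tau, u

random.seed(1)
n = 20000
ref_mean, ref_sd = 1.5, 3.0      # hypothetical wide reference p(y)
base_mean, base_sd = 1.3, 1.2    # hypothetical baseline p_0(y)
scenario_median = -0.4           # hypothetical scenario point forecast

# Reference sample and normalized IS weights w_0^i proportional to p_0(y^i) / p(y^i)
y = [random.gauss(ref_mean, ref_sd) for _ in range(n)]
w0 = normalize([normal_pdf(v, base_mean, base_sd) / normal_pdf(v, ref_mean, ref_sd) for v in y])

# ET weights for the scenario, then compound weights proportional to u^i * w_0^i
tau, u = tilt_to_percentile(y, w0, q=scenario_median, prob=0.5)
wj = normalize([ui * wi for ui, wi in zip(u, w0)])

mass_below = sum(w for v, w in zip(y, wj) if v <= scenario_median)
print(f"baseline IS ESS: {ess_percent(w0):.1f}%")
print(f"scenario compound ESS: {ess_percent(wj):.1f}% (tau = {tau:.2f})")
print(f"scenario mass at or below its median: {mass_below:.4f}")
```

With several percentile constraints the tilting vector generally has no closed form and would be found numerically; the single-median case is used here only because it can be checked by hand.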
Refined but much more computationally demanding adaptive IS methods may be considered, but are outside our current scope. In any case, encountering such discordance would indicate that such a scenario might better be considered separately and its full distribution directly assessed.

This ET analysis leads to compound weights $w_j^i \propto u_j^i w_0^i$ relating $S_j$ to the reference; these are called the ET-IS weights. ESS measures can now also be evaluated on the $w_j^i$ to provide direct overall assessment of each $p_j(\cdot)$ relative to the reference $p(\cdot)$. Note that there can be cases where a scenario is more concordant with the reference than the baseline, as some of our examples show. The reference sample and compound ET-IS weights are then ingredients in the direct evaluation of eqn. (4) via MC integration.

6 Case Study

The case study draws from the Risk and Uncertainty analyses in the December 2007 and 2018 Tealbooks (Federal Reserve Board, 2007, 2018). The scenarios specify point forecasts for GDP growth, inflation, the unemployment rate, and other variables. The methodology applies to multiple variables and horizons, but this first application restricts attention to one-year ahead GDP growth, namely $\mathbf{y} = y$, now scalar. Analysis follows the processes discussed in the previous sections; Appendix D gives a summary of the flow of analytic and computational details.

6.1 Reference Distribution

Among recent statistical approaches to risk assessment, Adrian et al. (2019) develop quantile regression models of conditional predictive distributions and show that financial markets provide useful risk information. This approach has influenced practice, being adopted for conditional one-year ahead forecasts of GDP growth, unemployment rate, and inflation, for example, by the

Federal Reserve Board (Engstrom and Gonzalez-Astudillo, 2017) in the “Time-Varying Macroeconomic Risk” exhibit in the Risk and Uncertainty section of Tealbook A, and the New York Federal Reserve (Boyarchenko et al., 2023) in Outlook-at-Risk. Other central banks and international financial institutions have also adopted GaR approaches (examples include Figueres and Jarociński, 2020; Lenza et al., 2023; IMF, 2017; Eguren-Martin et al., 2024; Anesti et al., 2023; Jondeau et al., 2022; Alessandri et al., 2019; Pujadas et al., 2022; Hafemann, 2023).

The Federal Reserve Board started producing the Tealbook Time-Varying Macroeconomic Risk forecasts in 2017 but do not provide past values. The NY Fed started producing the Outlook-at-Risk forecasts only in 2023 but provides past values starting in 1989. As a result, our case study constructs and compares reference distributions from both the NY Fed and the Fed Board. In practice, both the NY Fed and the Tealbook give five predictive percentiles (Table 1). To construct the reference distribution, we fit skew-t distributions (Azzalini and Capitanio, 2003) on these percentiles. Following Adrian et al. (2019), the four parameters of the skew-t minimize the squared distance between the given reference quantiles and those of the resulting skew-t. Table 3 reports resulting parameters.

Table 1: Reference percentiles

        NY Fed           Tealbook
        2007    2018             2018
P10:    −1.7     0.0     P5       0.7
P25:     0.2     1.1     P15      1.3
P50:     1.8     2.1     P50      2.5
P75:     3.3     3.0     P85      3.6
P90:     4.8     4.0     P95      4.3

The NY Fed Outlook-at-Risk gives point forecasts for one-year ahead GDP growth. The Tealbook Time-Varying Macroeconomic Risk gives one-year ahead Tealbook forecast errors in the December 2018 Tealbook, which provides GDP growth point forecasts once the baseline forecast is added.

6.2 Baseline Distribution

Table 2 shows baseline and alternative scenario projections. The Tealbook provides the point forecast and 70% intervals for the baseline.
To construct the baseline distribution, we take the point forecast as the median and extremes of the 70% confidence interval as the 15th and 85th percentiles, respectively. We fit a skew-t with 50 degrees of freedom to these percentiles; the choice of degrees of freedom allows some modest tail-weight beyond normal but effectively represents a “close to normal” distribution.

Table 2: Tealbook Baseline and Scenarios

DECEMBER 2007
j   Scenario S_j                          P50    P15    P85
0   Baseline                              1.3    0.1    2.5
1   Greater housing correction            1.0
2   Credit crunch                        −0.4
3   Stronger domestic demand              1.7
4   With better export performance        1.9
5   Greater cost pressure                 1.2
6   Market-based federal funds rate       1.6

DECEMBER 2018
j   Scenario S_j                          P50    P15    P85
0   Baseline                              2.4    1.2    3.9
1   Financial-based recession            −0.7
2   Stronger supply side                  3.1
3   Greater interest rate sensitivity     1.5
4   Foreign slowdown                      1.6

December 2007 TB: The baseline projection for 2008 is on page I-21, while the alternative scenarios are on page I-17; the scenarios for 2008 are obtained by averaging the values for 2008:H1 and 2008:H2. December 2018 TB: The baseline projection for 2019 is on page 88, while the alternative scenarios are on page 84.
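The quantile-matching idea used for the reference and baseline can be sketched in a few lines. As a loudly labeled simplification, the sketch swaps the skew-t for a plain normal distribution, so that minimizing the squared distance between given and fitted quantiles reduces to a linear least-squares problem with a closed-form solution; a skew-t fit as in the text would instead need numerical optimization over its parameters. The function name is hypothetical, and the inputs are the percentiles reported in Tables 1 and 2.

```python
from statistics import NormalDist

def fit_normal_to_percentiles(probs, values):
    """Least-squares quantile matching for a location-scale normal:
    minimize sum_k (values[k] - loc - scale * z_k)^2, with z_k the standard normal quantiles.
    This is simple linear regression of the given percentiles on z_k."""
    z = [NormalDist().inv_cdf(p) for p in probs]
    z_bar = sum(z) / len(z)
    v_bar = sum(values) / len(values)
    scale = (sum((zk - z_bar) * (vk - v_bar) for zk, vk in zip(z, values))
             / sum((zk - z_bar) ** 2 for zk in z))
    loc = v_bar - scale * z_bar
    return loc, scale

# December 2007 baseline (Table 2): P15, P50, P85
base_loc, base_scale = fit_normal_to_percentiles([0.15, 0.50, 0.85], [0.1, 1.3, 2.5])

# December 2007 NY Fed reference (Table 1): P10, P25, P50, P75, P90
ref_loc, ref_scale = fit_normal_to_percentiles(
    [0.10, 0.25, 0.50, 0.75, 0.90], [-1.7, 0.2, 1.8, 3.3, 4.8])

print(f"baseline:  loc = {base_loc:.2f}, scale = {base_scale:.2f}")
print(f"reference: loc = {ref_loc:.2f}, scale = {ref_scale:.2f}")
```

Because the normal stand-in has no skewness or tail parameter, it can only reproduce the location and spread of the percentiles; it is meant to show the fitting criterion, not to replace the skew-t fits.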

Figure 1: Reference and baseline with scenario point forecasts. [Panels: 2007; 2018.] Solid black line is baseline p.d.f., and dashed black line is reference p.d.f. estimated from the NY Fed Outlook-at-Risk; solid gray line is the reference estimated from the Tealbook Time-Varying Macroeconomic Risk. Diamonds show point forecasts from scenarios.

Table 3: Reference and baseline skew-t parameters

Year   Distribution Type       lc     sc     sk      df
2007   Reference NY Fed        2.7    2.2   −0.5     3.4
2007   Baseline                1.3    1.1    0.0    50.0
2018   Reference NY Fed        2.5    1.3   −0.3     3.0
2018   Reference Tealbook      2.1    1.1    0.5    50.0
2018   Baseline                1.2    1.9    2.1    50.0

The skew-t parameters are those for location (lc), scale (sc), skewness (sk) and degrees of freedom (df).

Table 3 shows parameters of the baseline and reference skew-t distributions; Figure 1 shows the p.d.f.s. The baseline is much more precise than the NY Fed reference, with a lower scale and higher degrees of freedom. This raises questions for economic forecasting and policy design. In a world with a known “true model” of the economy, the baseline and reference would be identical; as they differ in practice, interpreting the scenarios is the challenge.

By comparison, the baseline is roughly as precise as the Tealbook reference, the latter having 50 degrees of freedom as a result of the optimization process in fitting the skew-t. This suggests that the Tealbook reference may underestimate risk. As Federal Reserve Chair Jerome Powell said during the Press Conference following the January 2025 FOMC meeting, we should not be surprised as “it is human nature, apparently, to underestimate [...] how fat the tails are. [...] We think of things in a normal distribution.
And in the economy, it’s not a normal distribution.”

Given reference and baseline distributions, our analysis proceeds based on Monte Carlo sampling from the reference. In developments below, the MC sample size is $10^6$ and resulting MC analysis summaries are stable and robust across reanalyses with such a large sample size.

6.3 Scenarios

TB scenarios in Table 2 provide only point forecasts. In the Dec. 2007 (2018) TB, we have 6 (4) alternative scenarios.¹ Figure 1 shows scenarios on the baseline, indicating their concentration around the baseline median but with some indication of downside risk in the left tail.

¹ We exclude the “More room to grow” scenario in 2007, and both the “Supply constraints” and “Lower oil prices” scenarios in 2018. These had GDP point forecasts identical to either the baseline or another scenario. If included in our synthesis, they would receive equal weight with either the baseline or one other scenario.

Our first analysis treats the scenario point forecasts as medians² of the $p_j(y)$, and the ET construction maps the baseline to each scenario p.d.f. constrained to its specified median only; see the P50s in Table 2. While the TB provides no measures of uncertainty around the scenario projections, such information could be useful and available in other applications. Section 6.5 explores analyses with P15 and P85 constructed for each scenario.

As discussed in Section 2.2, we augment the scenario set with a backstop located at the center of the scenarios while being relatively over-dispersed. Since the scenario information here is restricted to the median point forecasts, we first construct $p_j(y)$, $j = 1,\ldots,J$, using ET as in Section 3, then use the implied percentiles to define those of the backstop. Specifically, the backstop has $\mathrm{P50}_B = \mathrm{median}_{j=1,\ldots,J}\,\mathrm{P50}_j$, $\mathrm{P15}_B = \min_{j=1,\ldots,J}\mathrm{P15}_j$, and $\mathrm{P85}_B = \max_{j=1,\ldots,J}\mathrm{P85}_j$, respectively. Numerical details are in Table 4.

² TB and other point forecasts might alternatively be treated as modes of scenario distributions. Our analysis can address that, based on new theoretical results (not reported here) showing why and how entropic tilting can be applied when point forecasts are modes. We use medians, however, based on fundamental concern for deeper representation of the probability distributions of scenarios, and embedding in more detailed analyses with multiple percentiles.

Figure 2: Examples of Tilted Distributions of Alternative Scenarios, December 2007 Tealbook. [Four panels, each plotting the baseline p.d.f. $p_0(y)$ and one scenario p.d.f. against $y$: $S_1$: Greater housing correction, $p_1(y)$; $S_2$: Credit crunch, $p_2(y)$; $S_5$: Greater cost pressure, $p_5(y)$; $S_7$: Backstop, $p_7(y)$.]

Table 4: Synthesis based on P50 information: NY Fed reference

TB          j   Scenario S_j                         P15_j   P50_j   P85_j   ET_j %   IS_j %   $\pi^*_j$   $\hat\alpha_j$   $\alpha^*_j$
Dec. 2007   0   Baseline                               0.1     1.3     2.5    100.0     62.6     0.41        0.31             0.27
            1   Greater housing correction            −0.1     0.9     2.3     94.3     57.4     0.40        0.00             0.02
            2   Credit crunch                         −1.0    −0.4     2.0     28.7     30.9     0.36        0.08             0.08
            3   Stronger domestic demand               0.3     1.7     2.7     92.6     65.4     0.42        0.00             0.04
            4   Better export performance              0.4     1.9     2.9     84.3     65.2     0.42        0.31             0.27
            5   Greater cost pressure                  0.0     1.2     2.5     99.5     61.3     0.41        0.00             0.02
            6   Market-based Fed Funds rate            0.2     1.6     2.6     97.1     64.8     0.41        0.00             0.03
            7   Backstop                              −1.0     1.4     2.9     56.4     67.2     0.43        0.31             0.27
                $f(y|\hat\alpha)$                     −0.2     1.4     2.7              71.5     0.43
                $f(y|\alpha^*)$                       −0.2     1.4     2.7              71.2     0.43
Dec. 2018   0   Baseline                               1.2     2.4     3.9    100.0     88.5     0.47        0.74             0.64
            1   Financial-based recession             −1.0    −0.6     3.1      0.6      8.4     0.35        0.04             0.04
            2   Stronger supply side                   1.4     3.1     4.4     84.6     67.5     0.45        0.00             0.04
            3   Greater interest rate sensitivity      0.7     1.5     3.4     69.4     70.3     0.45        0.13             0.11
            4   Foreign slowdown                       0.8     1.6     3.5     75.3     74.5     0.46        0.02             0.10
            5   Backstop                              −1.0     1.6     4.4      2.1     37.9     0.43        0.07             0.08
                $f(y|\hat\alpha)$                      0.9     2.2     3.8              91.4     0.48
                $f(y|\alpha^*)$                        0.9     2.2     3.8              90.9     0.48

P15 and P85 for $S_j$, $j = 1{:}J-1$, are for the ET-based $p_j(y)$; the other percentiles are inputs to the analysis. $\mathrm{ET}_j$ is the ET-based ESS of $p_j(y)$ relative to the baseline $p_0(y)$ while $\mathrm{IS}_j$ denotes that for ET-IS implied relative to the reference $p(y)$. $\pi^*_j$ is the EMR of the reference and scenario $j$ alone. $\alpha^*_j$ and $\tilde\alpha^*$ are the probabilities on $S_j$ in the optimal mixture synthesis in analyses with and without the backstop scenario, respectively. The optimization constraints $\alpha_0 \ge \alpha_j$, $j = 1{:}J$, apply in both cases.

Figure 2 shows the resulting $p_j(y)$ for four of the 2007 TB scenarios. Table 4 shows corresponding ET-based ESS measures. When a scenario is close to the baseline (e.g., $S_1$ and $S_5$), the tilted distribution remains close and slightly asymmetric; otherwise, the tilted distribution can exhibit skewness and multimodality due to the mismatch between the scenario forecasts and baseline. The emergence of interesting shapes and multimodality in scenario distributions indicates the hypothesized state of the economy in the scenario is in regions poorly supported by the baseline. This is also related to the concept of modest policy intervention of Leeper and Zha (2003), i.e.
that we can analyze policy effects using the baseline as long as entertained policy interventions are small enough that economic agents would not change their behavior in response to the intervention. Related considerations are those of formally down-weighting extreme conditional assumptions in conditional forecasting (e.g. Antolín-Díaz et al., 2021; Chernis et al., 2024, sect. 3).

6.4 Scenario Synthesis based on Medians

Table 4 shows optimal scenario mixture weights and other summaries from analyses. There is strong concordance between the mixture synthesis and the reference in the case of the 2018 TB (EMR=0.48), while the concordance is weaker in the 2007 Tealbook (EMR=0.43). Figure 3 provides insights via comparison of p.d.f.s. For TB 2007, the mixture synthesis is light-tailed relative to the reference, as all scenarios lack probability on downside GDP ranges that the reference meaningfully supports. For the 2018 example, the scenario mixture supports positive GDP values, partly as the baseline is already right skewed, but assigns relatively limited support to negative GDP growth.

In the 2007 TB analysis, the baseline, $S_4$ and backstop are roughly equally weighted, followed by $S_2$. The two more extreme scenarios $S_2$ and $S_4$ get weight as they help to capture the spread of the reference, while the other scenarios get small weights: they sit in the center of the reference and are very similar to the baseline, as shown by the ET ESS measures. In contrast, in the 2018 TB example the reference is more precise and the baseline is more heavily weighted than other scenarios. Here the EMR of the baseline alone is quite high, so the alternative scenarios add small but limited value in contributing to approximating the reference.

Figure 3: p.d.f. and c.d.f. of scenario synthesis and NY Fed reference. [Panels: December 2007; December 2018.] Upper: p.d.f.s of the reference and scenario synthesis. Lower: c.d.f.s of the reference, baseline and scenario synthesis, including synthesis c.d.f.s based on both $\hat\alpha$ and $\alpha^*$; note that the latter are effectively indistinguishable.

A key feature of analysis is that 100 − ESS for the synthesis is an absolute measure of scenario set incompleteness relative to the reference. In TB 2007, the ESS of $f(y|\alpha^*)$ is about 71-72%; we can say that the scenario set is about 28-29% incomplete. In contrast, in TB 2018 the ESS of $f(y|\alpha^*)$ is about 91%, which indicates that the TB scenarios much more adequately represent the risk and uncertainty in the economy as defined by the reference than in 2007. A hint of the inability of TB 2007 scenarios to properly capture the risks in the reference comes also from the substantial weight on the backstop; in contrast, in TB 2018, the backstop receives low weight. As a general point looking ahead, a backstop p.d.f. that is over-dispersed has general benefits, but in application other choices are possible and may be preferred.
These examples highlight this, indicating that scenarios reflecting increased support in the upper and lower tails of the (anticipated) reference distribution are most relevant. Future applications might address this.
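Two of the bookkeeping steps in this case study can be sketched directly: the backstop percentiles (median, minimum and maximum across the alternative scenarios) and the incompleteness measure 100 − ESS for a mixture represented by weights on the reference sample. The backstop inputs below are the ET-based percentiles of the six alternative 2007 scenarios in Table 4; the two weight vectors in the second part are made up purely for illustration, and all function names are hypothetical.

```python
from statistics import median

def backstop_percentiles(p15, p50, p85):
    """Backstop percentiles from the alternatives:
    P15_B = min_j P15_j, P50_B = median_j P50_j, P85_B = max_j P85_j."""
    return min(p15), median(p50), max(p85)

def mixture_weights(alpha, scenario_weights):
    """Weights of the mixture f(y|alpha) on the reference sample: sum_j alpha_j * w_j^i,
    where each scenario's weights w_j^i are already normalized to sum to one."""
    n = len(scenario_weights[0])
    return [sum(a * w[i] for a, w in zip(alpha, scenario_weights)) for i in range(n)]

def ess_percent(weights):
    """% effective sample size, 100 / (n * sum_i w_i^2), for normalized weights."""
    return 100.0 / (len(weights) * sum(w * w for w in weights))

def incompleteness_percent(weights):
    """Scenario set incompleteness relative to the reference: 100 - ESS."""
    return 100.0 - ess_percent(weights)

# ET-based percentiles of the six alternative 2007 scenarios (Table 4, j = 1:6)
p15 = [-0.1, -1.0, 0.3, 0.4, 0.0, 0.2]
p50 = [0.9, -0.4, 1.7, 1.9, 1.2, 1.6]
p85 = [2.3, 2.0, 2.7, 2.9, 2.5, 2.6]
b15, b50, b85 = backstop_percentiles(p15, p50, p85)

# Made-up normalized weights for two scenarios on a four-point reference sample
w_a = [0.7, 0.1, 0.1, 0.1]
w_b = [0.1, 0.1, 0.1, 0.7]
w_mix = mixture_weights([0.5, 0.5], [w_a, w_b])

print("backstop P15, P50, P85:", b15, b50, b85)
print(f"ESS of scenario a alone: {ess_percent(w_a):.1f}%")
print(f"ESS of mixture: {ess_percent(w_mix):.1f}%")
print(f"incompleteness of mixture: {incompleteness_percent(w_mix):.1f}%")
```

In the toy example the two scenarios cover different parts of the sample, so their mixture has a higher ESS than either alone, which is the sense in which adding complementary scenarios reduces incompleteness.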

Table 5: Synthesis based on P50 information: TB 2018, Tealbook reference

j   Scenario S_j                         P15_j   P50_j   P85_j   ET_j %   IS_j %   $\pi^*_j$   $\hat\alpha_j$   $\alpha^*_j$
0   Baseline                               1.2     2.4     3.9    100.0     77.7     0.49        1.00             0.89
1   Financial-based recession             −1.0    −0.6     3.1      0.6      0.8     0.33        0.00             0.01
2   Stronger supply side                   1.4     3.1     4.4     84.7     51.0     0.47        0.00             0.05
3   Greater interest rate sensitivity      0.7     1.5     3.4     69.6     56.3     0.44        0.00             0.02
4   Foreign slowdown                       0.8     1.6     3.5     75.4     61.3     0.45        0.00             0.02
5   Backstop                              −1.0     1.6     4.4      2.1      3.2     0.40        0.00             0.01
    $f(y|\hat\alpha)$                      1.2     2.4     3.9              77.7     0.49
    $f(y|\alpha^*)$                        1.2     2.4     3.9              76.1     0.49

Details as in Table 4.

Table 5 summarizes 2018 analysis using the reference distribution based on percentiles from the Tealbook Time-Varying Macroeconomic Risk to compare with the NY Fed-based details above. Again the synthesis here uses only the scenario medians. In this case, the baseline is as precise as the reference and has similar tail-weight in terms of the skew-t degrees of freedom. The main point, however, is that the baseline dominates the scenarios, indicating that the hypothesized median shifts they represent add little to no value in predictive discrimination relative to the reference.

On modeling strategy, consider an example of “normal” economic times as represented by 2018. In such settings, (i) the reference can be expected to be relatively light-tailed, (ii) scenarios can be expected to be modest in terms of varying from backstop, and (iii) the resulting scenario synthesis will be close to the reference. There is limited scenario set incompleteness and the backstop will play a limited role. Contrast this with periods of higher uncertainty, such as in the 2007 context here where the reference distribution should have appropriately fatter tails. Then the synthesis of the baseline and the scenarios can substantially under-represent the reference unless the scenario set includes more extreme considerations.
This mandates admitting extreme scenario considerations as a rational response to increased uncertainty in very uncertain economic times.

6.5 Scenario Synthesis based on P15, P50 and P85

As earlier noted, the methodology admits specification of multiple features of the scenarios so long as they can be represented as expectations under implicit scenario distributions. The case of multiple percentiles is practically key, and we visit this setting with summaries of further analysis in the TB 2007 context.

Figure 4 and Table 6 report the results when scenario p.d.f.s are tilted versions of the baseline that match the scenario-specific P15, P50, and P85. Here the scenario P15 and P85 are computed assuming distances of tail percentiles from median agree with the baseline: $\mathrm{P15}_j = \mathrm{P50}_j - (\mathrm{P50}_0 - \mathrm{P15}_0)$ and $\mathrm{P85}_j = \mathrm{P50}_j + (\mathrm{P85}_0 - \mathrm{P50}_0)$. The P15 and P85 for the scenarios in Table 6 are similar to those in Table 4 when the scenario is close to the baseline; they are naturally more discordant with increasing departure of the scenario percentiles from those of baseline (e.g., $S_2$). Then, optimal weights and EMR results are quite similar to those in Table 4. Other specifications of the P15 and P85 uncertainties (specifications that are founded in scenario considerations) may be quite different than the synthetic choices here, of course, and can be expected to lead to different results. A main point is that the methodology is open to, and trivially applied to, scenario specifications in terms of multiple percentiles, with negligible analytic and computational burden. Such specifications are increasingly common in application, and to be encouraged in policy research moving forward.

Figure 4: p.d.f. and c.d.f. of scenario synthesis and NY Fed reference. Tilted scenario distributions obtained using three percentiles, December 2007 Tealbook. Left: p.d.f. of the reference and scenario synthesis. Right: c.d.f. of the reference, baseline and scenario synthesis, the latter comparing results using $\hat\alpha$ and $\alpha^*$ for the synthesis; again, the latter are effectively indistinguishable.

Table 6: Synthesis based on P15, P50 and P85 information: TB 2007, NY Fed reference

j   Scenario S_j                    P15_j   P50_j   P85_j   ET_j %   IS_j %   $\pi^*_j$   $\hat\alpha_j$   $\alpha^*_j$
0   Baseline                          0.1     1.3     2.5    100.0     62.6     0.41        0.30             0.26
1   Greater housing correction       −0.2     1.0     2.2     92.3     57.2     0.39        0.00             0.01
2   Credit crunch                    −1.5    −0.3     0.9     20.0     31.6     0.32        0.10             0.11
3   Stronger domestic demand          0.5     1.7     2.9     90.1     65.9     0.42        0.00             0.07
4   Better export performance         0.7     1.9     3.1     79.3     66.1     0.42        0.30             0.26
5   Greater cost pressure             0.0     1.2     2.4     99.4     61.2     0.40        0.00             0.01
6   Market-based Fed Funds rate       0.4     1.6     2.8     96.0     65.0     0.42        0.00             0.03
7   Backstop                         −1.5     1.4     3.1     27.7     62.4     0.43        0.30             0.26
    $f(y|\hat\alpha)$                −0.3     1.4     2.8              73.0     0.44
    $f(y|\alpha^*)$                  −0.3     1.4     2.8              72.7     0.44

Details as in Table 4.

7 Distributional Forecasts and Judgment

At a general level, our focus is on use and reconciliation of information from statistical models and judgmental sources. Analysis is directional in that the specific goals are to assess judgmentally derived scenarios against a statistical reference distribution. In the broader context, the reverse is also of interest; that is, investigation of how a statistical forecast distribution may be “tilted” in a direction deemed important from a judgmental point of view.
The latter can often be a proxy for information external to that underlying the statistical model.

A key context is that of unique, unexpected events and shifts in the structure of the economy that go well beyond existing model structure and assumptions. While structural economic models may provide a formal basis for longer-term adaptation, fully modeling the implications of regime shifts on the structure of the economy takes time. Short-term, judgmental adjustments can be most valuable for real-time decision making. Indeed, judgment plays a dominant role in decision making in other areas, such as among investment professionals in macroeconomic trading. In monetary policy settings, quantitative macroeconomic models provide a firmer basis, but undesirable policy recommendations from models are often attributed to persistent forecast errors. A “good” policy maker would intervene to input judgment to address this.

Subjective information sources include surveys, market intelligence, and stress test outputs. Some comments on each are germane.

i. Surveys. Perhaps the key example is the US Survey of Professional Forecasters (SPF), a well-established source of community-wide forecast information. SPF now collates probabilistic forecasts on predefined bins for outcomes (Croushore, 1993; Del Negro et al., 2023). Similar regular surveys are conducted by ECB, the Bank of England, and other institutions.

ii. Market Intelligence. Large and detailed information sets are commonly collected by central banks in order to inform policy makers on many dimensions of economic and financial market developments outside the scope of well-adopted structural macroeconomic models. Such models, aiming to reduce complexity and lead to openness and interpretation in economic terms, inevitably lack the ability to reflect the full complexity of structural changes or nonlinear dynamics that become practically relevant in more unusual circumstances. More complex economic and financial market intelligence, in the form of summary external forecast information and judgment, can then add real value, if recognized and appropriately integrated with the model-based forecasts.

iii. Stress testing.
Initially developed by the IMF in the 1990s to assess financial system resilience, stress testing was later broadly adopted to assess macroeconomic risks from banking distress following the Great Financial Crisis (Adrian et al., 2020). All major financial regulators now design and publish macroeconomic stress scenarios and assess financial stability relative to those scenarios. This typically includes scenarios of major downturns in macroeconomic aggregates such as real activity, inflation, and financial conditions. Scenario design emphasizes extreme economic and financial circumstances; the resulting outputs can provide a basis for judgmental modification of forecasts from the established reference econometric models that are not designed or customized to quickly and easily address such circumstances.

The overall question is that of intervention in the statistical model to incorporate such external information. The concept is long recognized and much methodology exists and is used in other areas of forecasting, such as commercial and financial applications (e.g., West and Harrison, 1986; West and Harrison, 1989; Black and Litterman, 1991; West and Harrison, 1997, chap. 11; West, 2024). However, emphasizing and formalizing the question with respect to policy applications is highlighted and of renewed interest here.

Beyond contextual connections with the main theme of the current paper, key technical features of our scenario synthesis methodology relate directly to these complementary interests. From Section 5 and generalizing the notation there, each of the constructed scenario p.d.f.s has the form $p_j(y) \propto w_j(y)\,p(y)$ based on the statistical reference p.d.f. $p(y)$ and scenario-specific weight, or tilting, function $w_j(y)$. The latter is $w_j(y) \propto w_0(y)\exp\{\tau_j' s_j(y)\}$ involving: (i) the ET term $\exp\{\tau_j' s_j(y)\}$ used to define $p_j(\cdot)$ using the baseline and partial scenario information; and (ii) the baseline-reference IS weight function $w_0(y) \propto p_0(y)/p(y)$. The Monte Carlo methodology uses the discrete versions over the reference random samples $y^i$, i.e., $w_j^i = w_j(y^i)$ for $j = 0{:}J$.

The normalized scenario p.d.f.s are then $p_j(y) = c_j w_j(y)\,p(y)$ where the $c_j$ are just normalizing constants. Thus, explicitly, the partial judgmental information that $S_j$ encodes leads to a weighted modification of the statistical reference; on $S_j$ alone, this can be regarded as the scenario-tilted version of the model-based forecast. This indicates that the overall question of conditioning a model-based forecast on what may be quite distinct forms of judgmental information is intimately addressed within our framework. It also follows that the BPS-justified scenario mixture p.d.f. is $f(y|\alpha) = w(y|\alpha)\,p(y)$ where $w(y|\alpha) = \sum_{j=0:J} \alpha_j c_j w_j(y)$. This is true for any $\alpha$, not just the EMR-optimized value central to our scenario synthesis goals. In other contexts, such as the above settings of modifying the reference $p(y)$ with judgment-based information summaries, this allows for context-specific specification of relative scenario weightings. Importantly, scenarios can address multiple aspects of the forecast distribution, including location shifts, scale and skewness perturbations, both within any one scenario and with diversity across a scenario set. This may be particularly important to extension and evaluation of this approach in areas such as stress testing.

8 Summary Comments

The formal assessment and integration of partial scenario information with statistical forecast distributions is of interest in a range of policymaking settings. Our approach has been motivated by the monetary policy process, where policy decisions are firmly rooted in macroeconomic forecasts that involve not only the baseline forecast, but also alternative risk scenarios. The methodology offers a concrete and straightforward approach to evaluating baseline and judgmental scenario assessments, with their intuitive and easily communicated bases, against more formal statistical density forecasts of risk. Applications to the monetary policy process are highlighted, and offer a new frontier for practical yet rigorous policy making.
We believe the methodology will have broad appeal in other applications. For example, the IMF continuously monitors global financial stability, and publishes a formal biannual GaR-based global financial stability assessment in its Global Financial Stability Report. The statistical approach can be compared directly, and in a quantitatively meaningful manner, with the scenario-based risk assessment of the IMF’s World Economic Outlook, published on the same schedule as the GFSR. The approach can also be readily applied to other areas of institutional risk management and contexts such as portfolio choice applications. In risk management, as in financial institution supervisory stress testing, both scenario-based approaches and more statistical approaches are commonly deployed. Our framework and methodology offer a novel way to evaluate and integrate those two avenues in a concrete fashion. In portfolio allocation decision making, the role of priors is fundamental and features commonly in allocation decisions, yet the bridge between intuitive scenario based approaches and statistical modeling of return forecasts has received little attention. Again our framework offers steps ahead in this regard. Beyond these areas, there are also opportunities in commercial revenue and supply chain forecasting where ranges of forms of external/subjective information are often assessed in the context of formal models with a view to eventual decisions.

Forms of scenario information that may feed into new applications are information sets with more than a few candidate percentiles of forecast distributions under any assumed scenario. If a scenario is specified in terms of a larger number of percentiles, then analysis begins to approximate that given a fully-specified scenario p.d.f. $p_j(y)$. This is certainly of methodological interest, and may represent applied interests in settings where the scenarios are effectively replaced by forecast distributions from alternative/competing models.
This latter setting is closer to the existing setting of BPS where multiple predictive distributions are considered by a Bayesian decision maker, and define analogue information to condition an initial reference forecast distribution. Some of the technical developments here are rather different from, and complementary to, the general setting of BPS, but open up new questions for potential development and exploitation in forecast model comparison and synthesis. There are also questions of extension of the technical approach to address synthesis based on other forms of scenario information, e.g., point forecasts that are regarded as subjectively assessed modes or means rather than medians, and uncertainty in scenario information, e.g., percentiles provided with some notational ± uncertainties. The general ET framework in principle applies to such contexts, though details are to be developed for exploitation in any new applied context in which such scenario information sets arise.

We have noted that the methodology is applicable with $y$ in several dimensions. Elements of $y$ can include multiple economic indicators (real growth, inflation, unemployment, etc.) as well as multiple time periods ahead, such as the coming eight quarters, anchored at a current time period such as the end of the current quarter. Scenario information to define (uncertain) constraints on forecasts of the state of the macroeconomy over multiple future time periods can then generate a range of scenarios. Technically and computationally, the methodology here extends immediately. We have experience with such extensions, and recognize questions that arise due to increasing dimension of $y$. In technical essentials, the main questions there are not new, but have to do with scalability of importance sampling methodology with dimension, and of its close technical ally entropic tilting with increasing dimension of the underlying Bayesian decision-analytic utility (a.k.a. score) functions. These questions are addressed in all applications of these general approaches, and will need to be addressed in context, in specific applied settings of scenario synthesis.
Acknowledgments

The authors thank colleagues at the Federal Reserve and the IMF, including Gianni Amisano, Matthias Paustian, Sheheryar Malik, Jason Wu, and Pierre-Olivier Gourinchas, for multiple discussions and feedback on the general area and a range of specific topics. We also thank Todd Clark of the Center for Financial Economics at Johns Hopkins University for his detailed, thoughtful comments on a first draft of our paper. The views expressed in this paper are those of the authors and do not necessarily reflect the views of the Federal Reserve System, the Federal Open Market Committee, or the International Monetary Fund, its Management, or its Executive Directors.

References

Adams, P. A., T. Adrian, N. Boyarchenko, and D. Giannone (2021). Forecasting macroeconomic risks. International Journal of Forecasting 37(3), 1173–1191.
Adrian, T. and N. Boyarchenko (2012). Intermediary leverage cycles and financial stability. Staff Report 567, Federal Reserve Bank of New York.
Adrian, T., N. Boyarchenko, and D. Giannone (2016). Vulnerable growth. Staff Report 794, Federal Reserve Bank of New York.
Adrian, T., N. Boyarchenko, and D. Giannone (2019). Vulnerable growth. American Economic Review 109(4), 1263–1289.
Adrian, T., N. Boyarchenko, and D. Giannone (2021). Multimodality in macrofinancial dynamics. International Economic Review 62(2), 861–886.
Adrian, T., F. Grinberg, N. Liang, S. Malik, and J. Yu (2022). The term structure of Growth-at-Risk. American Economic Journal: Macroeconomics 14(3), 283–323.
Adrian, T., J. Morsink, and L. B. Schumacher (2020). Stress testing at the IMF: A framework for macroprudential analysis. Departmental Paper 2020/016, International Monetary Fund.
Aikman, D., J. Bridges, S. H. Hoke, C. O’Neill, and A. Raja (2019). Credit, capital and crises: a GDP-at-Risk approach. Staff Working Paper 824, Bank of England.
Alessandri, P., L. D. Vecchio, and A. Miglietta (2019). Financial conditions and ‘Growth at Risk’ in Italy. Economic Working Paper 1242, Bank of Italy, Economic Research and International Relations Area.
Andrews, D. F. and C. L. Mallows (1974). Scale mixtures of normal distributions. Journal of the Royal Statistical Society (Ser. B) 36(1), 99–102.
Anesti, N., M. Garofalo, S. Lloyd, E. Manuel, and J. Reynolds (2023). Unknown measures: Assessing uncertainty around UK inflation using a new Inflation-at-Risk model. Bank Underground, Bank of England.
Antolín-Díaz, J., I. Petrella, and J. F. Rubio-Ramírez (2021). Structural scenario analysis with SVARs. Journal of Monetary Economics 117(C), 798–815.
Azzalini, A. and A. Capitanio (2003). Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. Journal of the Royal Statistical Society (Ser. B) 65(2), 367–389.
Azzalini, A. and A. Capitanio (2013). The Skew-Normal and Related Families. Institute of Mathematical Statistics Monographs. Cambridge University Press.
Bernanke, B. (2023). Forecasting for monetary policy making and communication at the Bank of England: A review. Report to the BoE Independent Evaluation Office, Bank of England.
Black, F. and R. B. Litterman (1991). Asset allocation: Combining investor views with market equilibrium. The Journal of Fixed Income 1(2), 7–18.
Boyarchenko, N., R. K. Crump, L. Elias, and I. L. Gaffney (2023). Look out for Outlook-at-Risk. Liberty Street Economics 20230517, Federal Reserve Bank of New York.
Boyd, S. and L. Vandenberghe (2004). Convex Optimization (additional exercises). Cambridge University Press.

Britton, E., P. Fisher, and J. Whitley (1998). The inflation report projections: Understanding the fan chart. Quarterly Bulletin Q1, Bank of England.
Brodie, J., I. Daubechies, C. De Mol, D. Giannone, and I. Loris (2009). Sparse and stable Markowitz portfolios. Proceedings of the National Academy of Sciences 106(30), 12267–12272.
Brunnermeier, M. K. and Y. Sannikov (2014). A macroeconomic model with a financial sector. American Economic Review 104(2), 379–421.
Caldara, D., D. Cascaldi-Garcia, P. Cuba-Borda, and F. Loria (2021). Understanding Growth-at-Risk: A Markov switching approach. SSRN Electronic Journal. doi:10.2139/ssrn.3992793.
Carriero, A., T. E. Clark, and M. Marcellino (2024). Capturing macro-economic tail risks with Bayesian vector autoregressions. Journal of Money, Credit and Banking 56(5), 1099–1127.
Chernis, T., G. Koop, E. Tallman, and M. West (2024). Decision synthesis in monetary policy. Bank of Canada, Staff Working Paper 2024-30. arXiv:2406.03321.
Clark, T. E., G. Ganics, and E. Mertens (2022). What is the predictive value of SPF point and density forecasts? Working paper no. 22-37, Federal Reserve Bank of Cleveland. doi:10.26509/frbc-wp-202237.
Conflitti, C., C. De Mol, and D. Giannone (2015). Optimal combination of survey forecasts. International Journal of Forecasting 31(4), 1096–1103.
Croushore, D. (1993). Introducing: The survey of professional forecasters. Business Review 3/1993, Federal Reserve Bank of Philadelphia.
Crump, R. K., S. Eusepi, D. Giannone, E. Qian, and A. M. Sbordone (2025). A large Bayesian VAR of the United States economy. International Journal of Central Banking -, –.
Crump, R. K., M. Everaert, D. Giannone, and C. S. Hundtofte (2024). Changing risk-return profiles. In M. Barigozzi, S. Hörmann, and D. Paindaveine (Eds.), Recent Advances in Econometrics and Statistics: Festschrift in Honour of Marc Hallin, pp. 283–302. Springer.
De Mol, C. (2024). Multiplicative algorithms for density combination and deconvolution. In Recent Advances in Econometrics and Statistics: Festschrift in Honour of Marc Hallin, pp. 493–510. Springer.
Del Negro, M., F. Bassetti, and R. Casarin (2023). Inference on probabilistic surveys in macroeconomics with an application to the evolution of uncertainty in the Survey of Professional Forecasters during the COVID pandemic. In W. van der Klaauw, G. Topa, and R. Bachmann (Eds.), Handbook of Economic Expectations. Elsevier.
Diebold, F. X., M. Shin, and B. Zhang (2023). On the aggregation of probability assessments: Regularized mixtures of predictive densities for Eurozone inflation and real interest rates. Journal of Econometrics 237(2), 105321.
Eguren-Martin, F., S. Kösem, G. Maia, and A. Sokol (2024). Targeted financial conditions indices and Growth-at-Risk. Staff Working Paper 1084, Bank of England.
Engstrom, E. and M. Gonzalez-Astudillo (2017). Time variation in upside and downside risks to the staff baseline forecast. Staff Memo to the Federal Open Market Committee, Board of Governors of the Federal Reserve System.
Federal Reserve Board (2007). Report to the FOMC on Economic Conditions and Monetary Policy. Part 1 – Current Economic and Financial Conditions: Summary and Outlook. December 5, 2007, Board of Governors of the Federal Reserve System.

Federal Reserve Board (2018). Report to the FOMC on Economic Conditions and Monetary Policy. Book A – Economic and Financial Conditions: Outlook, Risks, and Policy Strategies. December 7, 2018, Board of Governors of the Federal Reserve System.
Fernández-Villaverde, J., P. Guerrón-Quintana, J. F. Rubio-Ramírez, and M. Uribe (2011). Risk matters: The real effects of volatility shocks. American Economic Review 101(6), 2530–2561.
Fernández-Villaverde, J., S. Hurtado, and G. Nuno (2023). Financial frictions and the wealth distribution. Econometrica 91(3), 869–901.
Fernández-Villaverde, J., F. Mandelman, Y. Yu, and F. Zanetti (2024). Search complementarities, aggregate fluctuations, and fiscal policy. Review of Economic Studies. rdae053.
Figueres, J. M. and M. Jarociński (2020). Vulnerable growth in the Euro area: Measuring the financial conditions. Economics Letters 191(C), 109–126.
Giannone, D., M. Lenza, and G. E. Primiceri (2021). Economic predictions with big data: The illusion of sparsity. Econometrica 89(5), 2409–2437.
Gruber, L. F. and M. West (2016). GPU-accelerated Bayesian learning and forecasting in simultaneous graphical dynamic linear models. Bayesian Analysis 11(1), 125–149.
Gruber, L. F. and M. West (2017). Bayesian forecasting and scalable multivariate volatility analysis using simultaneous graphical dynamic linear models. Econometrics and Statistics 3(C), 3–22.
Hafemann, L. (2023). House prices at risk: A framework for assessing vulnerabilities in the German housing market. Technical Paper 7/2023, Deutsche Bundesbank.
He, Z. and A. Krishnamurthy (2012). A model of capital and crises. Review of Economic Studies 79(2), 735–777.
IMF (2017). Global financial stability report: Is growth at risk? International Monetary Fund.
Johnson, M. C. and M. West (2025). Bayesian predictive synthesis with outcome-dependent pools. Statistical Science 40(1), 109–127.
Jondeau, E., P. Poncet, and C. Rebillard (2022). Are financial variables useful to complement GDP nowcasting? Eco Notepad, Banque de France.
Justiniano, A. and G. E. Primiceri (2008). The time-varying volatility of macroeconomic fluctuations. American Economic Review 98(3), 604–641.
Kiley, M. T. (2022). Unemployment risk. Journal of Money, Credit and Banking 54(5), 1407–1424.
Koop, G., S. McIntyre, and J. Mitchell (2019). UK regional nowcasting using a mixed frequency vector autoregressive model with entropic tilting. Journal of the Royal Statistical Society (Ser. A) 183(1), 91–119.
Krüger, F., T. E. Clark, and F. Ravazzolo (2017). Using entropic tilting to combine BVAR forecasts with external nowcasts. Journal of Business and Economic Statistics 35(3), 470–485.
Leeper, E. M. and T. Zha (2003). Modest policy interventions. Journal of Monetary Economics 50(8), 1673–1700.
Lenza, M., I. Moutachaker, and J. Paredes (2023). Density forecasts of inflation: A quantile regression forest approach. Working Paper Series 2830, European Central Bank.

Lin, L., C. Chan, and M. West (2016). Discriminative variable subsets in Bayesian classification with mixture models, with application in flow cytometry studies. Biostatistics 17(1), 40–53.
López-Salido, D. and F. Loria (2024). Inflation at risk. Journal of Monetary Economics 145(S), 103570.
Matlab (2024). Matlab Optimization Toolbox (version 24.1, r2024a). https://www.mathworks.com.
McAlinn, K. and M. West (2019). Dynamic Bayesian predictive synthesis in time series forecasting. Journal of Econometrics 210(1), 155–169.
Metaxoglou, K., D. Pettenuzzo, and A. Smith (2018). Option-implied equity premium predictions via entropic tilting. Journal of Financial Econometrics 17(4), 559–586.
Plagborg-Moller, M., L. Reichlin, G. Ricco, and T. Hasenzagl (2020). When is growth at risk? Brookings Papers on Economic Activity 51(1), 167–229.
Pujadas, A. M., L. Hospido, and J. M. Montero (2022). House prices at risk: An empirical approach to downside risks in the Spanish housing market. Working Paper 2244, Banco de España.
Robertson, J. C., E. W. Tallman, and C. H. Whiteman (2005). Forecasting using relative entropy. Journal of Money, Credit, and Banking 37(3), 383–401.
Tallman, E. and M. West (2022). On entropic tilting and predictive conditioning. Supporting material for Tallman and West (2023). arXiv:2207.10013.
Tallman, E. and M. West (2023). Bayesian predictive decision synthesis. Journal of the Royal Statistical Society (Ser. B) 86(2), 340–363.
Tallman, E. and M. West (2025). Predictive decision synthesis for portfolios: Betting on better models. In S. Mazur and P. Österhol (Eds.), Recent Developments in Bayesian Econometrics and Their Applications. Springer. arXiv:2405.01598.
West, M. (1987). On scale mixtures of normal distributions. Biometrika 74(3), 646–648.
West, M. (2024). Perspectives on constrained forecasting. Bayesian Analysis 19(4), 1013–1039.
West, M. and P. J. Harrison (1986). Monitoring and adaptation in Bayesian forecasting models. Journal of the American Statistical Association 81(395), 741–750.
West, M. and P. J. Harrison (1989). Subjective intervention in formal models. Journal of Forecasting 8(1), 33–53.
West, M. and P. J. Harrison (1997). Bayesian Forecasting and Dynamic Models (2nd ed.). Springer.

Appendix A Relating KL Divergences to EMR

A.1 Lower Bounds on EMR

As noted in Section 4.2, the bound $\pi_{pf} \ge 1/[1+\exp\{KL(p\|f)\}]$ has empirical support in specific examples. This is, of course, not true in general, though it seems suggested under certain, practically relevant conditions on $y \sim p(y)$. The implied distribution of $k(y) = \log\{p(y)/f(y)\}$ has mean $E[k(y) \mid H_p] = KL(p\|f) \ge 0$ with equality only when $k(y) = 0$ for all $y$. The bound is conjectured to hold when the distribution of $k(y)$ has finite, positive mean, is unimodal with $\Pr[k(y) > 0] > 0.5$, and has p.d.f. tail decay on $k(y) < 0$ no heavier than that on $k(y) > 0$. An exact proof for more restricted cases is available, as follows.

Simplifying notation, the real-valued quantity $k$ replaces $k(y)$. The focus is on $\pi_{pf} = E_k[\pi(k)]$ where $\pi(k) = 1/\{1+\exp(k)\}$ and $k$ has some p.d.f. $g(k)$ with finite mean $m > 0$. Here $E_k[\cdot]$ denotes expectation with respect to $k \sim g(k)$. Since $m = KL(p\|f)$, the bound to be shown is $E_k[\pi(k)] \ge \pi(m)$.

The following theory draws on standard results concerning scale mixtures of normals (Andrews and Mallows, 1974; West, 1987). Suppose that $g(k)$ is continuous, symmetric and unimodal with mode and finite mean $m$. Then $g(k)$ is a normal scale mixture: $g(k) = E_v[v^{-1}\phi\{v^{-1}(k-m)\}]$ where $\phi(\cdot)$ is the standard normal p.d.f., $v$ is a random scale parameter and $E_v[\cdot]$ denotes expectation with respect to its distribution.

Then, recognize that $\pi(k) = 1/\{1+\exp(k)\}$ is the survival function of the standard univariate logistic distribution for real-valued $k$. The logistic distribution is also a normal scale mixture, so $\pi(k) = 1 - E_u[\Phi(u^{-1}k)]$ where $\Phi(\cdot)$ is the standard normal c.d.f., $u$ is the random scale parameter and $E_u[\cdot]$ denotes expectation with respect to its distribution. Thus $\pi(m) = 1 - E_u[\Phi(u^{-1}m)]$.

The above theory of the normal scale mixture structure of $g(k)$ and $\pi(k)$ further implies that $\pi_{pf} = 1 - E_{v,u}\left[\int_k \Phi(u^{-1}k)\, v^{-1}\phi\{v^{-1}(k-m)\}\, dk\right]$ with expectation over $v, u$ (in which, implicitly, $u \perp\!\!\!\perp v$). Routine normal theory yields $\pi_{pf} = 1 - E_{v,u}[\Phi(w^{-1}m)]$ where $w = \sqrt{v^2+u^2}$. Hence $E_k[\pi(k)] - \pi(m) = E_{v,u}[\Phi(u^{-1}m) - \Phi(w^{-1}m)]$. Now, $m \ge 0$ and $w > u$ so that $\Phi(u^{-1}m) - \Phi(w^{-1}m) \ge 0$, implying that $E_k[\pi(k)] \ge \pi(m)$, as required. This inequality is strict unless $k = v = 0$.

The analysis above may extend to more general cases when $g(k)$ is not symmetric. Suppose, for example, that $g(k)$ is a scale mixture of skew-normal distributions (Azzalini and Capitanio, 2013). This is a rich class of unimodal distributions with ranges of asymmetries; it includes the above symmetric distributions as special cases. Convolutions of skew-normals with normals are skew-normals, so it is reasonable to ask if the above development generalizes. There may be broader generalization so long as $g(k)$ is unimodal with $m > 0$ and/or $\Pr(k \ge 0) > 0.5$. This is an aside and beyond current scope, but suggests further theoretical study. Importantly, the above discussion does not extend at all to cases, including many practical cases, when the expectation of $k$ (defining the KL divergence) does not exist and when the distribution of $k$ is less regular and even multimodal.

As another aside, this also provides interpretation of KL on the probabilistic concordance scale in cases when the bound is assured; in such cases, rearranging the bound gives $\kappa_{pf}, \kappa_{fp} \ge \log\{(1-\pi_{pf})/\pi_{pf}\}$.

A.2 Approximation and Bounds

The link of EMR to KL is further illuminated in the first-order Taylor series approximation of the function $1/\{1+\exp(k)\}$ at $k = 0$, namely $1/\{1+\exp(k)\} \approx (2-k)/4$. This is an exact lower bound on $1/\{1+\exp(k)\}$ for $k \ge 0$ and an exact upper bound for $k \le 0$. See this as follows. First, $\pi(k) - (2-k)/4$ is positive on $k > 0$ if, and only if, $g(k) > 0$ where $g(k) = k + 2 - (2-k)\exp(k)$. Calculus shows that $g(k)$ is strictly increasing for all $k$, and, of course, $g(0) = 0$, hence the lower bound arises for $k > 0$. Second, $\pi(k) - (2-k)/4$ is negative on $k < 0$ if, and only if, $g(k) < 0$, so the upper bound is implied on that range.

This approximation is very accurate over $|k| \le 0.5$ where $1/\{1+\exp(k)\} \ge 0.38$, i.e., in cases of relatively close concordance; the absolute error relative to the exact value is less than 0.68% on $|k| \le 0.5$. Hence, if $y \sim p(y)$ and the implied distribution of $k(y)$ heavily favors such ranges, then $\pi_{pf} \approx \{2 - \kappa_{pf}\}/4$. On this basis, choosing $f(\cdot)$ to maximize $\pi_{pf}$ is again approximately the (symmetrized) KL divergence minimizing solution. The following example highlights values of $\pi_{pf} \ge 0.4$ as practically relevant, as lower values are strong indications of lack of concordance.

A.3 A Simple Example

A simple example relates the range of $\pi_{pf}$ to the familiar, interpretable measure of effective sample size (ESS) from Monte Carlo (MC) analysis using importance sampling (IS), in addition to KL. Take $y$ to be scalar with $p(y) = N(0,1)$, standard normal, and $f(y) = N(a,1)$ for some mean $a \ge 0$. IS with proposal $p(y)$ and target $f(y)$ leads to MC integration based on the resulting weighted average approximations. A random sample $y_i \sim p(y)$, $(i = 1{:}n)$, leads to IS weights $w_i \propto f(y_i)/p(y_i)$ subject to summing to 1. The resulting MC approximation to EMR is $\pi_{pf} \approx \sum_{i=1:n} w_i/(1 + n w_i)$. The IS effective sample size as a percentage is $\mathrm{ESS} = 100\, n^{-1} / \sum_{i=1:n} w_i^2$. Also, in this simple example $KL(p\|f) = KL(f\|p) = \kappa_{pf} = a^2/2$.

Figure 5 comes from an example with $n = 10^6$ and across a range of values of $a > 0$. For $a \le 1$, $\mathrm{ESS} \ge 40\%$, $\pi_{pf} \ge 0.40$ and is roughly linear in ESS up to its maximum of 0.5. For practical purposes and extrapolating from this interpretable example, also supported by other empirical examples, $\pi_{pf} \ge 0.4$ or so is expected unless $f(\cdot)$ and $p(\cdot)$ are quite substantially discordant. From Section 4.2, $\pi_{pf} \ge 0.40$ links to the accurate approximation $\pi_{pf} \approx 1/\{1+\exp(\kappa_{pf})\}$ with $\kappa_{pf} \le 0.4$.

Figure 5: Predictive Concordance Example: EMR, ESS and KL-based lower bound when $p(y) = N(0,1)$ and $f(y) = N(a,1)$ for a range of values of $a$.
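The A.3 example is simple enough to reproduce numerically. The Python sketch below follows the formulas stated there for the IS weights, the MC approximation to EMR, the ESS percentage and the KL value a²/2. The function name concordance_summary, the smaller sample size and the seed are illustrative assumptions, and the sketch is not the code behind Figure 5.

```python
import numpy as np

def concordance_summary(a, n=200_000, seed=1):
    """MC estimates of EMR and ESS (%) for p = N(0,1) and f = N(a,1)."""
    rng = np.random.default_rng(seed)
    y = rng.standard_normal(n)                 # y_i ~ p(y)
    log_ratio = a * y - 0.5 * a**2             # log f(y_i) - log p(y_i)
    w = np.exp(log_ratio - log_ratio.max())
    w /= w.sum()                               # IS weights summing to 1
    emr = np.sum(w / (1.0 + n * w))            # MC approximation to pi_pf
    ess_pct = 100.0 / (n * np.sum(w**2))       # ESS as a percentage of n
    kl = 0.5 * a**2                            # KL(p||f) = KL(f||p)
    kl_bound = 1.0 / (1.0 + np.exp(kl))        # lower bound of Section A.1
    return emr, ess_pct, kl, kl_bound

for a in (0.0, 0.25, 0.5, 1.0, 2.0):
    emr, ess, kl, bound = concordance_summary(a)
    print(f"a={a:4.2f}  EMR={emr:.3f}  ESS={ess:5.1f}%  KL={kl:.3f}  bound={bound:.3f}")
```

At a = 0 the two densities coincide, so every weight equals 1/n and the EMR estimate is exactly 0.5. For a > 0 the printed EMR can be compared directly with the KL-based lower bound.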
Appendix B Prior/Regularization Parameter Specification

We have prior $\alpha \sim \mathrm{Dir}(\mathbf{1}(1+\epsilon))$ and aim to calibrate the choice of small $\epsilon$. We discuss this by analogy with the canonical setting for Dirichlet prior/posterior distributions, i.e., multinomial sampling. With $\alpha$ representing scenario probabilities, the least informative multinomial sample is just one draw from one of the scenarios; the implied Dirichlet posterior then has parameter updated by $+1$ in one element only. With no loss of generality, suppose a single outcome is known to come from the baseline; this single draw posterior is then $\mathrm{Dir}(\mathbf{1}(1+\epsilon) + \mathbf{e})$ where $\mathbf{e} = (1, 0, \ldots, 0)'$. Now, to reflect a minimally informative setting, suppose the posterior is modified to $\mathrm{Dir}(\mathbf{1}(1+\epsilon) + x\mathbf{e})$ for some very small, positive $x$. This can be regarded as the posterior under an imaginary fractional observation; for example, $x = 0.01$ says the information content of the posterior relative to the prior is 1% of that arising on observing a single multinomial draw.

Under this posterior with specified $x$, the prior mode $1/(J+1)$ increases to posterior mode $\alpha_0^* = (\epsilon + x)/\{(J+1)\epsilon + x\}$ on $S_0$, and decreases to $\alpha_j^* = \epsilon/\{(J+1)\epsilon + x\}$ on the other scenarios $j > 0$. In this minimal information context it is rational to limit this latter “shrinkage towards zero” and we reflect this by asking that $\alpha_j^* \ge p/(J+1)$ for some fractional reduction $p \in (0,1)$. This implies $\epsilon \ge cx/(J+1)$ where $c = p/(1-p)$. Here $c$ is explicitly a lower bound on the reduction from prior to posterior odds on $S_j$ for $j > 0$ given the minimal information of a single outcome under $S_0$. For example, the choice $c = 0.5$ limits this odds reduction to no more than 50%. The choices $x = 0.01$ and $c = 0.5$ imply $\epsilon \ge 0.005/(J+1)$, and this value is recommended as a default.

Appendix C Optimization of Scenario Mixtures

For any $y$ define the $(J+1)$-vector $\mathbf{p}(y) = [p_0(y), \ldots, p_J(y)]'$. It is then easily shown that derivatives of EMR in eqn. (4) are

$$h(\alpha) \equiv \frac{\partial \pi_{pf}(\alpha)}{\partial \alpha} = \int_y \mathbf{p}(y)\, h(y|\alpha)\, dy \quad \text{and} \quad H(\alpha) \equiv \frac{\partial^2 \pi_{pf}(\alpha)}{\partial \alpha\, \partial \alpha'} = -2 \int_y \mathbf{p}(y)\mathbf{p}(y)'\, H(y|\alpha)\, dy$$

where $h(y|\alpha) = p(y)^2/\{p(y) + f(y|\alpha)\}^2$ and $H(y|\alpha) = h(y|\alpha)/\{p(y) + f(y|\alpha)\}$. At any $\alpha$ the Hessian matrix is $H(\alpha) = -2E[\mathbf{p}(y)\mathbf{p}(y)'\, a(y|\alpha)]$ where $a(y|\alpha) = p(y)/\{p(y) + f(y|\alpha)\}^3$ and the expectation is with respect to $y \sim p(\cdot)$.
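The derivative expressions can be checked in a few lines. The sketch below assumes, consistent with the sampled EMR values in Appendix D, that eqn. (4) has the form $\pi_{pf}(\alpha) = \int p\, f/(p+f)\, dy$ with mixture $f(y|\alpha) = \alpha'\mathbf{p}(y)$. Eqn. (4) itself is not reproduced in this appendix, so that form is an assumption here. Arguments $y$ and $\alpha$ are suppressed on the right-hand sides.

```latex
\begin{align*}
\pi_{pf}(\alpha) &= \int p(y)\,\frac{f(y|\alpha)}{p(y)+f(y|\alpha)}\,dy,
  \qquad f(y|\alpha) = \alpha'\mathbf{p}(y), \\
\frac{\partial}{\partial\alpha}\left[\frac{p\,f}{p+f}\right]
  &= \frac{p\,(p+f) - p\,f}{(p+f)^2}\,\frac{\partial f}{\partial\alpha}
   = \frac{p^2}{(p+f)^2}\,\mathbf{p}(y), \\
\frac{\partial}{\partial\alpha'}\left[\frac{p^2}{(p+f)^2}\,\mathbf{p}(y)\right]
  &= -2\,\frac{p^2}{(p+f)^3}\,\mathbf{p}(y)\,\mathbf{p}(y)'.
\end{align*}
```

Integrating the second and third lines over $y$ gives $h(\alpha)$ and $H(\alpha)$, and writing $p^2/(p+f)^3 = p \cdot a(y|\alpha)$ gives the expectation form of the Hessian under $y \sim p(\cdot)$.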
Since $a(y|\alpha) > 0$ for all $y, \alpha$, the expectation is a positively weighted average of rank-one matrices $\mathbf{p}(y)\mathbf{p}(y)'$. Whether $y$ is continuous or discrete (the latter with at least $J+2$ support points), $H(\alpha)$ is full rank and strictly negative definite for all $\alpha$.

It follows that maximizing $\pi_{pf}(\alpha)$ over the simplex is a convex optimization problem with a unique maximizing value $\hat{\alpha}$; standard constrained optimization algorithms then apply. Further, the modification to add the prior penalty and define the log posterior objective function in eqn. (5) maintains convexity and ensures a unique posterior mode $\alpha^*$ given any specified value of $\epsilon$. Then, standard constrained, non-linear optimization methods (e.g., the default interior-point algorithm in the fmincon function in Matlab, 2024) apply and are fast and efficient.

As an aside but of some broader interest, the same approach shows that minimizing $KL(p\|f)$ or $KL(f\|p)$ (when finite) with respect to $\alpha$ are also convex optimizations with unique solutions and similar characteristics. This also applies with any value of $\epsilon$ in the fully Bayesian version that adds the prior penalty based on a very diffuse but proper Dirichlet prior that supports non-zero $\alpha_j$ with probability one. This links to an existing literature on sparsity and stability of KL-optimal mixtures in the context of forecast combination (e.g., Conflitti et al., 2015; Diebold et al., 2023; Crump et al., 2024; De Mol, 2024). Then, relative to EMR, the KL analysis typically leads to more zeros among the optimizing values of $\alpha$, i.e., a sparser mixture more aggressively favoring just one or a small number of scenarios. This arises since KL involves expectations of the unbounded function $\log\{p(y)/f(y|\alpha)\}$ and is very dependent on behaviour of the tails of the two p.d.f.s. This also relates to the caveat that, as noted in Section 4.2, KL divergence may simply be undefined in important practical contexts depending on the relative tail behavior of $p(y)$ and $f(y|\alpha)$.

In contrast, EMR is more conservative (and numerically more robust) in discounting scenarios that are less concordant with the reference though not extremely so; this arises as EMR is based on expectations of the bounded function $1/\{1 + p(y)/f(y|\alpha)\}$ in eqn. (4). That said, the full shrinkage to boundaries of the simplex still arises and requires modest regularization as provided by the penalty induced under the minimally informative prior in the foundational Bayesian setting of Section 5.2.

Appendix D Summary of Computational Flow

1. Generate a large random sample $y^i$, $(i = 1{:}n)$, from the reference $p(y)$, to define an importance sample for Monte Carlo evaluation of the baseline $p_0(y)$ and the scenarios $p_j(y)$.
2. Evaluate baseline IS weights $w_0^i \propto p_0(y^i)/p(y^i)$, subject to normalization.
3. Evaluate the scenario p.d.f.s $p_j(y)$ for $j > 0$.
   (a) If the scenario p.d.f.s are completely specified and can be evaluated, this is as in Step 2 above now applied to scenario p.d.f.s $p_j(y)$ instead of the baseline $p_0(y)$. For each $S_j$ this delivers normalized IS weights $w_j^i$ on the reference sample values.
   (b) If the scenarios are only partially specified, proceed as follows.
       i. Compute the scenario distributions using a random sample $x^i$, $(i = 1{:}n)$, drawn from the baseline $p_0(x)$. Use this for MC evaluations of the integrals required to deliver tilting parameters for each $S_j$ to minimally distort the baseline to match specified scenario medians, percentiles, etc.
       ii. Compute the implied tilting weights $u_j^i$ on each of the reference sampled values $y^i$.
       iii. Compute the implied ET-IS weights for each $p_j(y^i)$ at the reference random sample values $y^i$, namely the normalized weights $w_j^i \propto w_0^i u_j^i$, $(i = 1{:}n)$, for each $S_j$.
4. Compute synthesis weights by optimizing eqn. (5) over $\alpha$; this is modified with constraint $\alpha_0 \ge \max_{j=1:J} \alpha_j$ if the baseline is required to be the modal scenario as used in our examples. At each value of $\alpha$ in iterations of a numerical optimization routine, direct MC integration evaluates EMR in eqn. (4). This is just the average over $i = 1{:}n$ of sampled EMR values $w_f^i(\alpha)/\{w_f^i(\alpha) + w_p^i\}$ with implied scenario synthesis IS weights $w_f^i(\alpha) = \sum_{j=0:J} \alpha_j w_j^i$ and uniform reference weights $w_p^i = 1/n$ for all $i = 1{:}n$.
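Steps 1, 2, 3(a) and 4 can be sketched in a few dozen lines of Python for fully specified scenarios. Everything specific below is an assumption made for illustration: the normal scenario densities, the reference chosen as a known mixture of them (so the answer is known in advance), the function names, and the penalized objective log EMR(α) + ε Σ log α_j, which stands in for eqn. (5) because that equation is not reproduced here. Plain gradient ascent on a softmax parameterization replaces the interior-point solver cited in Appendix C, and the optional constraint that the baseline be modal is omitted.

```python
import numpy as np

def normal_pdf(y, mean, sd=1.0):
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def is_weights(scenario_pdf_vals, reference_pdf_vals):
    """Steps 2 and 3(a): normalized IS weights on the reference sample."""
    w = scenario_pdf_vals / reference_pdf_vals
    return w / w.sum()

def emr(alpha, W):
    """Step 4: MC estimate of EMR from the n x (J+1) weight matrix W."""
    n = W.shape[0]
    w_f = W @ alpha                      # synthesis weights w_f^i(alpha)
    w_p = 1.0 / n                        # uniform reference weights
    return np.mean(w_f / (w_f + w_p))

def synthesize(W, eps, steps=3000, lr=0.5):
    """Maximize log EMR(alpha) + eps * sum(log alpha) over the simplex.

    ASSUMPTION: this objective is a stand-in for eqn. (5). The map
    alpha = softmax(theta) keeps every iterate inside the simplex.
    """
    n, k = W.shape
    R = n * W                            # approximates p_j(y^i) / p(y^i)
    theta = np.zeros(k)
    for _ in range(steps):
        alpha = np.exp(theta - theta.max())
        alpha /= alpha.sum()
        s = R @ alpha
        value = np.mean(s / (s + 1.0))   # same quantity as emr(alpha, W)
        grad_alpha = (R / (s[:, None] + 1.0) ** 2).mean(axis=0) / value
        grad_alpha += eps / alpha        # gradient of the Dirichlet penalty
        # chain rule through the softmax map
        theta += lr * alpha * (grad_alpha - alpha @ grad_alpha)
    alpha = np.exp(theta - theta.max())
    return alpha / alpha.sum()

rng = np.random.default_rng(0)
n = 20_000
true_mix = np.array([0.6, 0.3, 0.1])     # hypothetical reference mixture
means = np.array([0.0, -2.0, 2.0])       # baseline S_0 and scenarios S_1, S_2

# Step 1: sample from the reference p(y)
comp = rng.choice(3, size=n, p=true_mix)
y = rng.normal(means[comp], 1.0)
p_ref = sum(a * normal_pdf(y, m) for a, m in zip(true_mix, means))

# Steps 2 and 3(a): IS weights for the baseline and the scenarios
W = np.column_stack([is_weights(normal_pdf(y, m), p_ref) for m in means])

# Step 4: synthesis weights using the default eps of Appendix B
eps = 0.005 / W.shape[1]
alpha_star = synthesize(W, eps)
print(np.round(alpha_star, 3), round(emr(alpha_star, W), 3))
```

Because the reference here is exactly a mixture of the scenario densities, the synthesis weights should land close to the mixing proportions with EMR near its maximum of 0.5. In an applied setting the reference would come from a forecasting model, and partially specified scenarios would require the entropic tilting steps 3(b), which this sketch does not cover.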

Cite this document

APA

Tobias Adrian, Domenico Giannone, Matteo Luciani, & Mike West (2025). Scenario Synthesis and Macroeconomic Risk (FEDS 2025-036). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2025-036

BibTeX

@techreport{wtfs_feds_2025_036,
  author = {Tobias Adrian and Domenico Giannone and Matteo Luciani and Mike West},
  title = {Scenario Synthesis and Macroeconomic Risk},
  type = {Finance and Economics Discussion Series},
  number = {2025-036},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2025},
  url = {https://whenthefedspeaks.com/doc/feds_2025-036},
  abstract = {We introduce methodology to bridge scenario analysis and model-based risk forecasting, leveraging their respective strengths in policy settings. Our Bayesian framework addresses the fundamental challenge of reconciling judgmental narrative approaches with statistical forecasting. Analysis evaluates explicit measures of concordance of scenarios with a reference forecasting model, delivers Bayesian predictive synthesis of the scenarios to best match that reference, and addresses scenario set incompleteness. This underlies systematic evaluation and integration of risks from different scenarios, and quantifies relative support for scenarios modulo the defined reference forecasts. The framework offers advances in forecasting in policy institutions that supports clear and rigorous communication of evolving risks. We also discuss broader questions of integrating judgmental information with statistical model-based forecasts in the face of unexpected circumstances.},
}