feds · October 31, 2015

Measuring Ambiguity Aversion

Abstract

We confront the generalized recursive smooth ambiguity aversion preferences of Klibanoff, Marinacci, and Mukerji (2005, 2009) with data using Bayesian methods introduced by Gallant and McCulloch (2009) to close two existing gaps in the literature. First, we use macroeconomic and financial data to estimate the size of ambiguity aversion as well as other structural parameters in a representative-agent consumption-based asset pricing model. Second, we use estimated structural parameters to investigate asset pricing implications of ambiguity aversion. Our structural parameter estimates are comparable with those from existing calibration studies, demonstrate sensitivity to sampling frequencies, and suggest ample scope for ambiguity aversion.

Finance and Economics Discussion Series
Divisions of Research & Statistics and Monetary Affairs
Federal Reserve Board, Washington, D.C.

Measuring Ambiguity Aversion
A. Ronald Gallant, Mohammad Jahan-Parvar, and Hening Liu
2015-105

Please cite this paper as: Gallant, A. Ronald, Mohammad Jahan-Parvar, and Hening Liu (2015). “Measuring Ambiguity Aversion,” Finance and Economics Discussion Series 2015-105. Washington: Board of Governors of the Federal Reserve System, http://dx.doi.org/10.17016/FEDS.2015.105.

NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

A. Ronald Gallant, Penn State University∗
Mohammad R. Jahan-Parvar, Federal Reserve Board†
Hening Liu, University of Manchester‡§

Original Draft: October 2014
This Draft: October 2015

JEL Classification: C61; D81; G11; G12.
Keywords: Ambiguity aversion, Bayesian estimation, Equity premium puzzle, Markov switching.

∗ Corresponding Author, Department of Economics, The Pennsylvania State University, 511 Kern Graduate Building, University Park, PA 16802 U.S.A. e-mail: aronaldg@gmail.com.
† Office of Financial Stability Policy and Research, Board of Governors of the Federal Reserve System, 20th St. and Constitution Ave. NW, Washington, DC 20551 U.S.A. e-mail: Mohammad.Jahan-Parvar@frb.gov.
‡ Accounting and Finance Group, Manchester Business School, University of Manchester, Booth Street West, Manchester M15 6PB, UK. e-mail: Hening.Liu@mbs.ac.uk.
§ We thank two anonymous referees, Toni Whited (the editor), Geert Bekaert, Dan Cao, Yoosoon Chang, Marco Del Negro, Ana Fostel, Luca Guerrieri, Michael T. Kiley, Nour Meddahi, James Nason, Joon Y. Park, Eric Renault, Jay Shambaugh, Chao Wei, and seminar participants at the Federal Reserve Board, George Washington University, Georgetown University, Indiana University, North Carolina State University, the Midwest Econometric Group meeting 2014, CFE 2014 (Pisa), the SoFiE annual conference 2015 (Aarhus University), the China International Conference of Finance 2015, and the Econometric Society World Congress 2015 (Montreal) for helpful comments and discussions. The analysis and the conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. Any remaining errors are ours.

1 Introduction

In this paper, we confront the smooth ambiguity aversion model of Klibanoff, Marinacci, and Mukerji (2005, 2009) (henceforth, KMM), in its generalized form advanced by Hayashi and Miao (2011) and Ju and Miao (2012), with data to close two existing gaps in the literature. First, we use macroeconomic and financial data to estimate the size of ambiguity aversion together with other structural parameters in a representative-agent consumption-based asset pricing model with smooth ambiguity aversion preferences. Second, based on the estimated model, we investigate the asset pricing implications of smooth ambiguity aversion. Given the rising popularity of smooth ambiguity preferences in economics and finance, it is important to characterize this model’s empirical strengths as well as its shortcomings.

One crucial feature of smooth ambiguity aversion is the separation of ambiguity and ambiguity aversion, where the former is a characteristic of the representative agent’s subjective beliefs, while the latter derives from the agent’s tastes. This study provides a fully market data-based estimation of a dynamic asset pricing model with these preferences. Our structural estimation results suggest that ambiguity aversion is important in matching salient features of asset returns in the U.S. data. Our study shows that ignoring ambiguity aversion in the estimation of structural models of financial data leads to an inadequate characterization of market dynamics.

The benchmark asset pricing model that we adopt in the estimation is the model developed by Ju and Miao (2012). In this model, aggregate consumption growth follows a Markov switching process with an unobservable state. Mean growth rates of consumption depend on the state. The agent can learn about the state by observing past consumption data. Ambiguity arises in that the agent may find it difficult to form forecasts of the mean growth rate.
Because the underlying state evolves according to a Markov chain, learning cannot resolve this ambiguity over time. The agent is not only risk averse in the usual sense but also ambiguity averse in that he dislikes a mean-preserving spread in the continuation value implied by the agent’s belief about the unobservable state. As a result, compared with a risk-averse agent, the ambiguity-averse agent effectively assigns more weight to bad states that are associated with lower continuation value. Ju and Miao show that the utility function that permits a three-way separation among risk aversion, ambiguity aversion, and the intertemporal elasticity of substitution (IES) is successful in matching moments of asset returns in the U.S. data. Throughout the paper, we call the model

without ambiguity aversion the “alternative” model. In this alternative model, the representative agent is endowed with Epstein and Zin (1989) recursive utility preferences.

Similar to other macro-finance applications, we face sparsity of data. As has become standard in the macro-finance empirical literature, we use prior information and a Bayesian estimation methodology to overcome data sparsity. Specifically, we use the “General Scientific Models” (henceforth, GSM) Bayesian estimation method developed by Gallant and McCulloch (2009). GSM is the Bayesian counterpart to the classical “indirect inference” and “efficient method of moments” (hereafter, EMM) methods introduced by Gouriéroux, Monfort, and Renault (1993) and Gallant and Tauchen (1996, 1998, 2010). These are simulation-based inference methods that rely on an auxiliary model for implementation. GSM follows the logic of the EMM variant of indirect inference and relies on the theoretical results of Gallant and Long (1997) in its construction of a likelihood. A comparison of Aldrich and Gallant (2011) with Bansal, Gallant, and Tauchen (2007) displays the advantages of a Bayesian EMM approach relative to a frequentist EMM approach, particularly for the purpose of model comparison.

An indirect inference approach is an appropriate estimation methodology in the context of this study because the estimated equilibrium model is highly nonlinear and does not admit analytically tractable solutions, which severely inhibits accurate numerical construction of a likelihood by means other than GSM. GSM uses a sieve (see Section 4) specially tailored to macroeconomic and financial time-series applications as the auxiliary model.
When a suitable sieve is used as the auxiliary model, as in this study, the GSM method synthesizes the exact likelihood implied by the model.1 In this instance, the synthesized likelihood departs significantly from a normal-errors likelihood, which suggests that alternative econometric methods based on normal approximations will give biased results. In particular, in addition to GARCH and leverage effects, the three-dimensional error distribution implied by the smooth ambiguity aversion model is skewed in all three components and has fat tails for consumption growth and stock returns and thin tails for bond returns.

1 Gallant and McCulloch (2009) use the terms “scientific model” and “statistical model” instead of the terms “structural model” and “auxiliary model” used in the indirect inference econometric literature. We will follow the conventions of the econometric literature. The structural models here are the benchmark and alternative models.

Our GSM Bayesian estimation suggests that estimates of the ambiguity aversion parameter are large and statistically significant. Ambiguity aversion in the estimated benchmark model, to a great extent, explains the high market price of risk implied by equity returns data and generates a high equity premium. Ignoring ambiguity aversion leads to biased estimates of the risk aversion

parameter. In addition, our estimates for the IES parameter are significantly greater than one. Our estimates for the IES parameter provide support for one of the main predictions of the long-run risk theory. According to the long-run risk literature, a high IES together with a moderate risk aversion coefficient implies that the agent prefers earlier resolution of uncertainty. We find that this demand for early resolution of uncertainty is robust to the inclusion of ambiguity aversion, different model specifications, and data samples.

Apart from estimating preference parameters, our GSM Bayesian estimation of the asset pricing model with learning indicates two distinct regimes for the mean growth rate of aggregate consumption, where the good regime is persistent while the bad regime is transitory. This result is consistent with many calibration studies using Markov switching models for consumption growth; see, for example, Veronesi (1999) and Cecchetti, Lam, and Mark (2000).

Related Literature: Two types of ambiguity preferences garner considerable attention in the literature: the smooth ambiguity utility of KMM and the multiple priors utility of Chen and Epstein (2002) (henceforth, MPU). In the multiple priors framework, the set of priors, which characterizes ambiguity (uncertain beliefs), also determines the degree of ambiguity aversion. Smooth ambiguity preferences, by contrast, separate ambiguity from ambiguity aversion. Thus, it is feasible to do comparative statics analysis by holding the set of relevant probability distributions constant while varying the degree of ambiguity aversion. Furthermore, asset pricing models with MPU are generally difficult to solve with refined processes of fundamentals because MPU features kinked preferences. In comparison with MPU, models with smooth ambiguity preferences are tractable in a wide range of applications.2 Klibanoff et al. (2005, 2009) first introduced smooth ambiguity preferences.
Hayashi and Miao (2011) generalized the preferences by disentangling risk aversion and the IES. Applications include endowment economy asset pricing (Ju and Miao (2012), Ruffino (2013), and Collard, Mukerji, Sheppard, and Tallon (2015)), production-based asset pricing (Jahan-Parvar and Liu (2014) and Backus, Ferriere, and Zin (2015)), and portfolio choice (Gollier (2011), Maccheroni, Marinacci, and Ruffino (2013), Chen, Ju, and Miao (2014), and Guidolin and Liu (2014)), among others. These studies typically rely on calibration to examine the impact of ambiguity aversion.

2 Strzalecki (2013) provides a rigorous and comprehensive discussion of ambiguity preferences.

Popular calibration methods include the “detection-error probability” method of Anderson, Hansen, and Sargent (2003) and Hansen (2007) (see Jahan-Parvar and Liu (2014) for an application to smooth

ambiguity utility) and “thought experiments” similar to Halevy (2007) (see Ju and Miao (2012) and Chen et al. (2014) for applications).

Structural estimation of dynamic models with ambiguity is still rare in the literature. To the best of our knowledge, our paper is the first to fully estimate a structural asset pricing model with smooth ambiguity utility. A number of studies are closely related to ours. Jeong, Kim, and Park (2015) estimate an equilibrium asset pricing model where a representative agent has recursive MPU. Their estimation results suggest that fear of ambiguity about the true probability law governing fundamentals carries a premium. The ambiguity aversion parameter, which measures the size of the set of priors in the MPU framework, is both economically and statistically significant and remains stable across alternative specifications. Our paper differs from Jeong, Kim, and Park (2015) in two dimensions. First, we study smooth ambiguity utility, which enables us to obtain an estimate of ambiguity aversion as a preference parameter that clearly describes the agent’s tastes, rather than beliefs. Second, our GSM method allows us to estimate preference parameters and the parameters of the processes of fundamentals jointly. Jeong et al. employ a two-stage econometric methodology that first extracts the volatilities of market returns, consumption growth, and labor income growth as latent factors and then estimates preference parameters and the magnitude of the set of priors.

Ilut and Schneider (2014) estimate a dynamic stochastic general equilibrium (DSGE) model where agents have MPU. Their estimation results suggest that time-varying confidence in future total factor productivity explains a significant fraction of business cycle fluctuations. Bianchi, Ilut, and Schneider (2014) estimate a DSGE model with endogenous financial asset supply and ambiguity-averse agents.
They show that time-varying uncertainty about corporate profits explains the high equity premium and excess volatility of equity prices observed in the U.S. data. Their estimated model can also replicate the joint dynamics of asset prices and real economic activity in the postwar data.

Empirical studies on reduced-form estimation of models with ambiguity aversion include Anderson, Ghysels, and Juergens (2009), Viale, Garcia-Feijoo, and Giannetti (2014), and Thimme and Völkert (2015). These papers show that ambiguity aversion is priced in the cross-section of expected returns. Anderson, Ghysels, and Juergens (2009) use surveys of professional forecasts to construct the uncertainty measure and test model implications in the robust control framework. Viale, Garcia-Feijoo, and Giannetti (2014) rely on relative entropy to construct the ambiguity measure in the multiple priors setting. Fixing the IES at the calibrated value, Thimme and Völkert

(2015) use the generalized method of moments (GMM) to estimate the ambiguity aversion parameter. Both Viale, Garcia-Feijoo, and Giannetti (2014) and Thimme and Völkert (2015) formulate the stochastic discount factor (SDF) under ambiguity using reduced-form regression methods. Ahn, Choi, Gale, and Kariv (2014) use experimental data to estimate ambiguity aversion in static portfolio choice settings.

The rest of the paper proceeds as follows. Section 2 describes the data used for estimation. Section 3 presents the consumption-based asset pricing model with generalized recursive smooth ambiguity preferences developed by Ju and Miao (2012). Section 4 discusses the estimation methodology and presents our empirical findings. Section 5 presents model comparison results, forecasts, and asset pricing implications. Section 6 concludes.

2 Data

Throughout this paper, lower case denotes the logarithm of an upper case quantity; e.g., $c_t = \ln(C_t)$, where $C_t$ is the observed consumption in period $t$, and $d_t = \ln(D_t)$, where $D_t$ is dividends paid in period $t$. Similarly, we use the logarithmic risk-free interest rate ($r^f_t$) and the aggregate equity market return inclusive of dividends ($r^e_t = \ln(P^e_t + D_t) - \ln P^e_{t-1}$) in the analysis, where $P^e_t$ is the stock price in period $t$.

For inference, we use real annual data from 1929 to 2013 and real quarterly data from the second quarter of 1947 to the second quarter of 2014. For the annual (quarterly) sample, we use the sample period 1929–1949 (1947:Q2–1955:Q2) to provide initial lags for the recursive parts of our estimation and the sample period 1950–2013 (1955:Q3–2014:Q2) for estimation and diagnostics. Our measure of the risk-free rate is the one-year U.S. Treasury Bill rate for annual data and the three-month U.S. Treasury Bill rate for quarterly data. Our proxy for risky asset returns is the value-weighted return on the CRSP-Compustat stock universe.
We use the sum of nondurable and services consumption from the Bureau of Economic Analysis (BEA) and deflate the series using the appropriate price deflator (also provided by the BEA). We use mid-year population data to obtain per capita consumption values. As noted in Garner, Janini, Passero, Paszkiewicz, and Vendemia (2006) and Andreski, Li, Samancioglu, and Schoeni (2014), there are notable discrepancies among measures of consumption, depending on the collection methods used and on the releasing agency. Thus, throughout the paper,

we assume a 5% measurement error in the level of real per capita consumption.3 We assume a linear error structure. That is, $C_t = C_t^* + u_t$, where $C_t$ is the observed value, $C_t^*$ is the true value, and $u_t$ is the measurement error term. We have $c_t = \ln(C_t) = \ln(C_t^* + u_t)$ and $\Delta c_t = \ln(C_t/C_{t-1})$.

Table 1 presents the summary statistics of the samples. The p-values of the Jarque and Bera (1980) test of normality imply that the assumption of normality is rejected for the risk-free rate and log consumption growth series, but it cannot be rejected for aggregate market returns and excess returns at the annual frequency. Annual data plots are shown in Figure 1.

3 The Model

The intuitive notions behind any consumption-based asset pricing model are that agents receive income (wages, interest, and dividends), which they use to purchase consumption goods. Agents reallocate their consumption over time by trading stocks that pay random dividends and bonds that pay interest with certainty. This is done to smooth consumption over time (for example, insurance against unemployment or saving for retirement). The budget constraint implies that the purchase of consumption, bonds, and stocks cannot exceed income in any period. Agents are endowed with a utility function that depends on the entire consumption path. The first-order conditions of their utility maximization deliver an intertemporal relation between the prices of stocks and bonds.

We consider the representative-agent model of Ju and Miao (2012) as our benchmark model. Among all tradable assets, we focus on the risky asset that pays aggregate dividends $D_t$ and the one-period risk-free bond with zero net supply. Aggregate consumption follows the process

$$\Delta c_{t+1} \equiv \ln\left(\frac{C_{t+1}}{C_t}\right) = \kappa_{z_{t+1}} + \sigma_c \epsilon_{t+1}, \tag{1}$$

where $\epsilon_t$ is an i.i.d. standard normal random variable, and $z_{t+1}$ follows a two-state Markov chain with state 1 being the good state and state 2 being the bad state ($\kappa_1 > \kappa_2$).
3 We also experimented with 1% and 10% error levels. Our estimation results are robust to the level of measurement errors.

The transition matrix is given by

$$P = \begin{pmatrix} p_{11} & 1-p_{11} \\ 1-p_{22} & p_{22} \end{pmatrix},$$

where $p_{ij}$ denotes the probability of switching from state $i$ to state $j$.

Because aggregate dividends are more volatile than aggregate consumption, we model dividends and consumption separately; see Bansal and Yaron (2004). The dividend growth process consists of a component proportional to consumption growth and an idiosyncratic component,

$$\Delta d_{t+1} \equiv \ln\left(\frac{D_{t+1}}{D_t}\right) = \lambda \Delta c_{t+1} + g_d + \sigma_d \varepsilon_{d,t+1}, \tag{2}$$

where $\varepsilon_{d,t+1}$ is an i.i.d. standard normal random variable that is independent of all other shocks in the model. The parameter $\lambda$ can be interpreted as the leverage ratio (see Abel (1999)). The parameters $g_d$ and $\sigma_d$ can be pinned down by calibrating the process to the mean and volatility of dividend growth. We set the mean dividend growth rate to the unconditional mean of consumption growth implied by the Markov-switching model. In addition, we denote the volatility of dividend growth by $\sigma_d$ and estimate this parameter using historical data on consumption growth and returns on assets and the GSM Bayesian method.

The agent cannot observe the regimes of expected consumption growth but can learn about the state ($z_t$) by observing past consumption data. The agent also knows the parameters of the model, namely, $\{\kappa_1, \kappa_2, p_{11}, p_{22}, \sigma_c, \lambda, g_d, \sigma_d\}$. The agent updates beliefs $\mu_t = \Pr(z_{t+1} = 1 \mid \Omega_t)$ according to Bayes’ rule:

$$\mu_{t+1} = \frac{p_{11} f(\Delta c_{t+1} \mid 1)\,\mu_t + (1-p_{22})\, f(\Delta c_{t+1} \mid 2)\,(1-\mu_t)}{f(\Delta c_{t+1} \mid 1)\,\mu_t + f(\Delta c_{t+1} \mid 2)\,(1-\mu_t)}, \tag{3}$$

where $f(\Delta c_{t+1} \mid i)$, $i = 1, 2$, is the normal density function of consumption growth conditional on state $i$.
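The processes in Equations (1) and (2) and the belief update in Equation (3) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the numerical values in `ILLUSTRATIVE_PARAMS` are hypothetical placeholders chosen only so that the example runs, not calibrated or estimated values from the paper.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def update_belief(mu, dc, kappa1, kappa2, sigma_c, p11, p22):
    """Bayes' rule of Equation (3): updated belief that the next state is state 1."""
    f1 = normal_pdf(dc, kappa1, sigma_c)  # f(dc | state 1)
    f2 = normal_pdf(dc, kappa2, sigma_c)  # f(dc | state 2)
    numerator = p11 * f1 * mu + (1.0 - p22) * f2 * (1.0 - mu)
    denominator = f1 * mu + f2 * (1.0 - mu)
    return numerator / denominator


def simulate(T, par, mu0=0.5, seed=0):
    """Simulate states, consumption growth, dividend growth, and beliefs."""
    rng = random.Random(seed)
    z, mu, path = 1, mu0, []
    for _ in range(T):
        # Two-state Markov chain: stay with probability p11 (state 1) or p22 (state 2).
        stay = par["p11"] if z == 1 else par["p22"]
        if rng.random() >= stay:
            z = 3 - z
        kappa = par["kappa1"] if z == 1 else par["kappa2"]
        dc = kappa + par["sigma_c"] * rng.gauss(0.0, 1.0)                         # Equation (1)
        dd = par["lam"] * dc + par["g_d"] + par["sigma_d"] * rng.gauss(0.0, 1.0)  # Equation (2)
        mu = update_belief(mu, dc, par["kappa1"], par["kappa2"],
                           par["sigma_c"], par["p11"], par["p22"])                # Equation (3)
        path.append((z, dc, dd, mu))
    return path


# Hypothetical values for illustration only (not taken from the paper).
ILLUSTRATIVE_PARAMS = {
    "kappa1": 0.02, "kappa2": -0.05, "sigma_c": 0.03,
    "p11": 0.95, "p22": 0.50, "lam": 2.5, "g_d": -0.03, "sigma_d": 0.10,
}
```

Because Equation (3) is a weighted average of $p_{11}$ and $1 - p_{22}$, the updated belief always lies between these two values, which gives a quick sanity check on any implementation.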
The agent’s preferences are represented by the generalized recursive smooth ambiguity utility function,

$$V_t(C) = \left[(1-\beta)\, C_t^{1-1/\psi} + \beta \left\{R_t\left(V_{t+1}(C)\right)\right\}^{1-1/\psi}\right]^{\frac{1}{1-1/\psi}}, \tag{4}$$

$$R_t\left(V_{t+1}(C)\right) = \left(E_{\mu_t}\left[\left(E_{z_{t+1},t}\left[V_{t+1}^{1-\gamma}(C)\right]\right)^{\frac{1-\eta}{1-\gamma}}\right]\right)^{\frac{1}{1-\eta}}, \tag{5}$$

where $\beta \in (0,1)$ is the subjective discount factor, $\psi$ is the IES parameter, $\gamma$ is the coefficient of relative risk aversion, and $\eta$ is the ambiguity aversion parameter, which must satisfy $\eta > \gamma$ to maintain ambiguity aversion in the utility function. Equation (5) characterizes the certainty equivalent of

future continuation value, which is the key ingredient that distinguishes this utility function from Epstein and Zin (1989) recursive utility. In Equation (5), the expectation operator $E_{z_{t+1},t}[\cdot]$ is with respect to the distribution of consumption conditioning on the next period’s state $z_{t+1}$, and the expectation operator $E_{\mu_t}$ is with respect to the posterior beliefs about the unobservable state.

Under this utility function, the SDF is given by (see Hayashi and Miao (2011) for a derivation)

$$M_{z_{t+1},t+1} = \beta \left(\frac{C_{t+1}}{C_t}\right)^{-1/\psi} \left(\frac{V_{t+1}}{R_t(V_{t+1})}\right)^{1/\psi-\gamma} \left(\frac{\left(E_{z_{t+1},t}\left[V_{t+1}^{1-\gamma}\right]\right)^{\frac{1}{1-\gamma}}}{R_t(V_{t+1})}\right)^{-(\eta-\gamma)}. \tag{6}$$

The last multiplicative term in Equation (6) is due to ambiguity aversion. It makes the SDF more countercyclical than in the case with Epstein-Zin recursive preferences. Numerically, we can show that $M_{z_{t+1},t+1}$ tends to be higher if $z_{t+1}$ appears to be state 2 (the bad state). In addition, the last term in Equation (6) induces additional variation in the SDF (compared with the Epstein-Zin SDF) and leads to a high market price of risk, defined as $\sigma(M)/E(M)$.

Stock returns, defined by $R^e_{t+1} = \frac{P^e_{t+1} + D_{t+1}}{P^e_t}$, satisfy the Euler equation

$$E_{\mu_t,t}\left[M_{z_{t+1},t+1} R^e_{t+1}\right] = 1. \tag{7}$$

The risk-free rate, $R^f_t$, is the reciprocal of the expectation of the SDF:

$$R^f_t = \frac{1}{E_{\mu_t,t}\left[M_{z_{t+1},t+1}\right]}.$$

We can rewrite the Euler equation as

$$0 = \tilde{\mu}_t\, E_{1,t}\left[M^{EZ}_{z_{t+1},t+1}\left(R^e_{t+1} - R^f_t\right)\right] + (1-\tilde{\mu}_t)\, E_{2,t}\left[M^{EZ}_{z_{t+1},t+1}\left(R^e_{t+1} - R^f_t\right)\right],$$

where $M^{EZ}_{z_{t+1},t+1}$ can be interpreted as the SDF under Epstein-Zin recursive utility:

$$M^{EZ}_{z_{t+1},t+1} = \beta \left(\frac{C_{t+1}}{C_t}\right)^{-\frac{1}{\psi}} \left(\frac{V_{t+1}}{R_t(V_{t+1})}\right)^{\frac{1}{\psi}-\gamma},$$

and $\tilde{\mu}_t$ can be interpreted as ambiguity-distorted beliefs, given by

$$\tilde{\mu}_t = \frac{\mu_t \left(E_{1,t}\left[V_{t+1}^{1-\gamma}\right]\right)^{-\frac{\eta-\gamma}{1-\gamma}}}{\mu_t \left(E_{1,t}\left[V_{t+1}^{1-\gamma}\right]\right)^{-\frac{\eta-\gamma}{1-\gamma}} + (1-\mu_t)\left(E_{2,t}\left[V_{t+1}^{1-\gamma}\right]\right)^{-\frac{\eta-\gamma}{1-\gamma}}}. \tag{8}$$

As long as $\eta > \gamma$, distorted beliefs are not equivalent to Bayesian beliefs. The distortion driven by ambiguity aversion is an equilibrium outcome and implies pessimistic beliefs; see Section 5.3.

We follow Ju and Miao (2012) and use the projection method with Chebyshev polynomials to solve the model. The model has to be solved for each set of parameter values simulated in the GSM method. We experimented with solving the model for a number of combinations of parameter values and found that the solution method is robust. Specifically, homogeneity in utility preferences implies $V_t(C) = G(\mu_t) C_t$, and $G(\mu_t)$ satisfies the following functional equation:

$$G(\mu_t) = \left[(1-\beta) + \beta\left(E_{\mu_t}\left[\left(E_{z_{t+1},t}\left[G(\mu_{t+1})^{1-\gamma}\left(\frac{C_{t+1}}{C_t}\right)^{1-\gamma}\right]\right)^{\frac{1-\eta}{1-\gamma}}\right]\right)^{\frac{1-1/\psi}{1-\eta}}\right]^{\frac{1}{1-1/\psi}}.$$

To solve for the value function, we approximate $G(\mu_t)$ using Chebyshev polynomials in the state variable $\mu_t$. The approximation takes the form

$$G(\mu) \simeq \sum_{j=0}^{p} \phi_j T_j(c(\mu)),$$

where $p$ is the order of the Chebyshev polynomials, $T_j$ with $j = 0, \ldots, p$ are Chebyshev polynomials, and $c(\mu)$ maps the state variable $\mu$ onto the interval $[-1, 1]$. We then choose a set of collocation points for $\mu$ and solve for the coefficients $\{\phi_j\}_{j=0,\ldots,p}$ using a nonlinear equations solver. The expectation $E_{z_{t+1},t}[\cdot]$ is approximated using Gauss-Hermite quadrature.

The equilibrium price-dividend ratio is a functional of the state variable, $\frac{P^e_t}{D_t} = \varphi(\mu_t)$. To solve for the price-dividend ratio, we rewrite the Euler equation as

$$\varphi(\mu_t) = E_t\left[M_{z_{t+1},t+1}\left(1 + \varphi(\mu_{t+1})\right)\frac{D_{t+1}}{D_t}\right].$$

The price-dividend ratio can also be approximated using Chebyshev polynomials in $\mu_t$. Since the
Since the t SDF M can be easily written as a functional of G(µ ) and consumption growth ∆c = zt+1,t+1 t+1 t 9

$\ln(C_{t+1}/C_t)$, we can solve for the price-dividend ratio in the same way as we solve for the value function. We simulate logarithmic values of consumption growth, stock returns, and risk-free rates, $\left\{\Delta c_{t+1}, r^e_{t+1}, r^f_{t+1}\right\}_{t=1}^{T}$.

If $\eta = \gamma$, then the agent is ambiguity neutral and has the familiar Kreps and Porteus (1978) and Epstein and Zin (1989) preferences:

$$V_t(C) = \left[(1-\beta)\, C_t^{1-1/\psi} + \beta \left\{R_t\left(V_{t+1}(C)\right)\right\}^{1-1/\psi}\right]^{\frac{1}{1-1/\psi}},$$

$$R_t\left(V_{t+1}(C)\right) = \left(E_t\left[V_{t+1}^{1-\gamma}(C)\right]\right)^{\frac{1}{1-\gamma}}.$$

We consider this model as the alternative model for estimation. The model is solved and simulated using the projection method described above.

4 Estimation of Model Parameters

To estimate model parameters we use a Bayesian method proposed by Gallant and McCulloch (2009), abbreviated GM hereafter, that they termed General Scientific Models (GSM). The GSM methodology was refined in Aldrich and Gallant (2011), abbreviated AG hereafter.4 The discussion here incorporates those refinements and is to a considerable extent a paraphrase of AG. The symbols $\zeta$, $\theta$, etc. that appear in this section are general vectors of statistical parameters and are not instances of the model parameters of Section 3.

Let the transition density of a structural model be denoted by

$$p(y_t \mid x_{t-1}, \theta), \quad \theta \in \Theta, \tag{9}$$

where $x_{t-1} = (y_{t-1}, \ldots, y_{t-L})$ if Markovian and $x_{t-1} = (y_{t-1}, \ldots, y_1)$ if not. As a result, $x_{t-1}$ serves as a shorthand for lag lengths that are generally greater than 1. Thus, transition densities may depend on $L$ lags of the data (if Markovian) or the entire history of observations (if non-Markovian). There are two structural models under consideration in this application: the benchmark model and the alternative model, described in Section 3.

We presume that there is no straightforward algorithm for computing the likelihood but that we can simulate data from $p(\cdot \mid \cdot, \theta)$ for a given $\theta$.
4 Code implementing the method with AG refinements, together with a User’s Guide, is in the public domain and available at www.aronaldg.org/webfiles/gsm.

We presume that simulations from the structural

model are ergodic. We assume that there is a transition density

$$f(y_t \mid x_{t-1}, \zeta), \quad \zeta \in Z, \tag{10}$$

and a map

$$g : \theta \mapsto \zeta \tag{11}$$

such that

$$p(y_t \mid x_{t-1}, \theta) = f(y_t \mid x_{t-1}, g(\theta)), \quad \theta \in \Theta. \tag{12}$$

We assume that $f(y_t \mid x_{t-1}, \zeta)$ and its gradient $(\partial/\partial\zeta) f(y_t \mid x_{t-1}, \zeta)$ are easy to evaluate. $f$ is called the auxiliary model and $g$ is called the implied map. When Equation (12) holds, $f$ is said to nest $p$. Whenever we need the likelihood $\prod_{t=1}^{n} p(y_t \mid x_{t-1}, \theta)$, we use

$$L(\theta) = \prod_{t=1}^{n} f(y_t \mid x_{t-1}, g(\theta)), \tag{13}$$

where $\{y_t, x_{t-1}\}_{t=1}^{n}$ are the data and $n$ is the sample size. After substituting $L(\theta)$ for $\prod_{t=1}^{n} p(y_t \mid x_{t-1}, \theta)$, standard Bayesian MCMC methods become applicable. That is, we have a likelihood $L(\theta)$ from Equation (13) and a prior $\pi(\theta)$ from Subsection 4.4 and need nothing beyond that to implement Bayesian methods by means of MCMC. A good introduction to these methods is Gamerman and Lopes (2006).

The difficulty in implementing GM’s proposal is to compute the implied map $g$ accurately enough that the accept/reject decision in an MCMC chain (Step 5 in the algorithm below) is correct when $f$ is a nonlinear model. The algorithm proposed by AG to address this difficulty is described next.

Given $\theta$, $\zeta = g(\theta)$ is computed by minimizing the Kullback-Leibler divergence

$$d(f,p) = \int\!\!\int \left[\log p(y \mid x, \theta) - \log f(y \mid x, \zeta)\right] p(y \mid x, \theta)\, dy\; p(x \mid \theta)\, dx$$

with respect to $\zeta$. The advantage of the Kullback-Leibler divergence over other distance measures is that the part that depends on the unknown $p(\cdot \mid \cdot, \theta)$, namely $\int\!\!\int \log p(y \mid x, \theta)\, p(y \mid x, \theta)\, dy\; p(x \mid \theta)\, dx$, does not have to be computed to solve the minimization problem. We approximate the integral that does

have to be computed by

$$\int\!\!\int \log f(y \mid x, \zeta)\, p(y \mid x, \theta)\, dy\; p(x \mid \theta)\, dx \approx \frac{1}{N} \sum_{t=1}^{N} \log f(\hat{y}_t \mid \hat{x}_{t-1}, \zeta),$$

where $\{\hat{y}_t, \hat{x}_{t-1}\}_{t=1}^{N}$ is a simulation of length $N$ from $p(\cdot \mid \cdot, \theta)$. Upon dropping the division by $N$, the implied map is computed as

$$g : \theta \mapsto \underset{\zeta}{\operatorname{argmax}} \sum_{t=1}^{N} \log f(\hat{y}_t \mid \hat{x}_{t-1}, \zeta). \tag{14}$$

We use $N = 1000$ in the results reported below. Results (posterior mean, posterior standard deviation, etc.) are not sensitive to $N$; doubling $N$ makes no difference other than doubling computational time. It is essential that the same seed be used to start these simulations so that the same $\theta$ always produces the same simulation.

GM run a Markov chain $\{\zeta_t\}_{t=1}^{K}$ of length $K$ to compute the $\hat{\zeta}$ that solves expression (14). There are two other Markov chains discussed below so, to help distinguish among them, this chain is called the $\zeta$-subchain. While the $\zeta$-subchain must be run to provide the scaling for the model assessment method that GM propose, the $\hat{\zeta}$ that corresponds to the maximum of $\sum_{t=1}^{N} \log f(\hat{y}_t \mid \hat{x}_{t-1}, \zeta)$ over the $\zeta$-subchain is not a sufficiently accurate evaluation of $g(\theta)$ for our auxiliary model. This is mainly because our auxiliary model uses a multivariate specification of the generalized autoregressive conditional heteroscedasticity (GARCH) of Bollerslev (1986) that Engle and Kroner (1995) call BEKK. Likelihoods incorporating BEKK are notoriously difficult to optimize. AG use $\hat{\zeta}$ as a starting value and maximize the expression (14) using the BFGS algorithm; see Fletcher (1987). This also is not a sufficiently accurate evaluation of $g(\theta)$. A second refinement is necessary.

The second refinement is embedded within the MCMC chain $\{\theta_t\}_{t=1}^{R}$ of length $R$ that is used to compute the posterior distribution of $\theta$. It is called the $\theta$-chain. Its computation proceeds as follows.

The $\theta$-chain is generated using the Metropolis algorithm. The Metropolis algorithm is an iterative scheme that generates a Markov chain whose stationary distribution is the posterior of $\theta$.
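The implied map in expression (14) and the synthesized likelihood in Equation (13) can be illustrated with a deliberately simple Python sketch. Everything in it is an assumption made for illustration: the "structural" model is a toy AR(1) with uniform shocks, and the auxiliary model is a Gaussian AR(1) whose maximizer is available in closed form. The paper's auxiliary model is a sieve with BEKK-GARCH that has no closed-form maximizer and is optimized numerically (an MCMC subchain followed by BFGS), so this closed form is a stand-in, not the authors' procedure.

```python
import math
import random

SEED = 12345  # fixed so that the same theta always produces the same simulation


def simulate_structural(theta, n, seed=SEED):
    """Toy 'structural' model: AR(1) with unit-variance uniform shocks."""
    rng = random.Random(seed)
    half_range = math.sqrt(3.0)  # Uniform(-sqrt(3), sqrt(3)) has variance 1
    y, path = 0.0, [0.0]
    for _ in range(n):
        y = theta * y + rng.uniform(-half_range, half_range)
        path.append(y)
    return path


def aux_loglik(path, zeta):
    """Sum over t of log f(y_t | x_{t-1}, zeta) for a Gaussian AR(1) auxiliary model."""
    b, s = zeta
    total = 0.0
    for x_prev, y in zip(path[:-1], path[1:]):
        e = (y - b * x_prev) / s
        total += -0.5 * e * e - math.log(s) - 0.5 * math.log(2.0 * math.pi)
    return total


def implied_map(theta, n=1000):
    """g(theta): maximizer of the simulated auxiliary log-likelihood, as in (14)."""
    path = simulate_structural(theta, n)
    x, y = path[:-1], path[1:]
    # For this toy auxiliary model the argmax is the least-squares solution.
    b = sum(a * c for a, c in zip(x, y)) / sum(a * a for a in x)
    s2 = sum((c - b * a) ** 2 for a, c in zip(x, y)) / n
    return b, math.sqrt(s2)


def synthesized_loglik(data, theta):
    """log L(theta) as in Equation (13): auxiliary log-likelihood at zeta = g(theta)."""
    return aux_loglik(data, implied_map(theta))
```

Holding the seed fixed makes `implied_map` a deterministic function of `theta`, which is what allows the synthesized likelihood to be compared across proposals in a Metropolis accept/reject step.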
To implement it, we require a likelihood, a prior, and a transition density in $\theta$ called the proposal density. The likelihood is Equation (13) and the prior, $\pi(\theta)$, is described in Section 4.4.

The prior may require quantities computed from the simulation $\{\hat{y}_t, \hat{x}_{t-1}\}_{t=1}^{N}$ that are used in computing Equation (13). In particular, quantities computed in this fashion can be viewed as the

evaluation of a functional of the structural model of the form $p(\cdot \mid \cdot, \theta) \mapsto \varrho$, where $\varrho \in P$. Thus, the prior is a function of the form $\pi(\theta, \varrho)$. But since the functional $\varrho$ is a composite function with $\theta \mapsto p(\cdot \mid \cdot, \theta) \mapsto \varrho$, $\pi(\theta, \varrho)$ is essentially a function of $\theta$ alone. Thus, we only use the $\pi(\theta, \varrho)$ notation when attention to the subsidiary computation $p(\cdot \mid \cdot, \theta) \mapsto \varrho$ is required.

Let $q$ denote the proposal density. For a given $\theta$, $q(\theta, \theta^*)$ defines a distribution of potential new values $\theta^*$. We use a move-one-at-a-time, random-walk proposal density that puts its mass on discrete, separated points, proportional to a normal. Two aspects of the proposal scheme are worth noting. The first is that the wider the separation between the points in the support of $q$, the less accurately $g(\theta)$ needs to be computed for $\alpha$ at step 5 of the algorithm below to be correct. A practical constraint is that the separation cannot be much more than a standard deviation of the proposal density or the chain will eventually stick at some value of $\theta$. Our separations are typically 1/2 of a standard deviation of the proposal density. In turn, the standard deviations of the proposal density are typically no more than the standard deviations in Table 2 and no less than one order of magnitude smaller. The second aspect worth noting is that the prior is putting mass on these discrete points in proportion to $\pi(\theta)$. Because we never need to normalize $\pi(\theta)$, this does not matter. The same holds for the joint distribution $f(y \mid x, g(\theta))\,\pi(\theta)$ considered as a function of $\theta$. However, $f(y \mid x, \zeta)$ must be normalized such that $\int f(y \mid x, \zeta)\, dy = 1$ to ensure that the implied map expressed in (14) is computed correctly.

The algorithm for the $\theta$-chain is as follows. Given a current $\theta^o$ and the corresponding $\zeta^o = g(\theta^o)$, obtain the next pair $(\theta', \zeta')$ as follows:

1. Draw $\theta^*$ according to $q(\theta^o, \theta^*)$.
2. Draw $\{\hat{y}_t, \hat{x}_{t-1}\}_{t=1}^{N}$ according to $p(y_t \mid x_{t-1}, \theta^*)$.
3. Compute $\zeta^* = g(\theta^*)$ and the functional $\varrho^*$ from the simulation $\{\hat{y}_t, \hat{x}_{t-1}\}_{t=1}^{N}$.
4. Compute $\alpha = \min\left(1, \dfrac{L(\theta^*)\,\pi(\theta^*, \varrho^*)\,q(\theta^*, \theta^o)}{L(\theta^o)\,\pi(\theta^o, \varrho^o)\,q(\theta^o, \theta^*)}\right)$.
5. With probability $\alpha$, set $(\theta', \zeta') = (\theta^*, \zeta^*)$, otherwise set $(\theta', \zeta') = (\theta^o, \zeta^o)$.

It is at step 3 that AG made an important modification to the algorithm proposed by GM. At that point one has putative pairs $(\theta^*, \zeta^*)$ and $(\theta^o, \zeta^o)$ and corresponding simulations $\{\hat{y}^*_t, \hat{x}^*_{t-1}\}_{t=1}^{N}$ and $\{\hat{y}^o_t, \hat{x}^o_{t-1}\}_{t=1}^{N}$. AG use $\zeta^*$ as a start and recompute $\zeta^o$ using the BFGS algorithm, obtaining $\hat{\zeta}^o$.

If

\[
\sum_{t=1}^{N} \log f(\hat{y}^o_t \mid \hat{x}^o_{t-1}, \hat{\zeta}^o) \;>\; \sum_{t=1}^{N} \log f(\hat{y}^o_t \mid \hat{x}^o_{t-1}, \zeta^o),
\]

then $\hat{\zeta}^o$ replaces $\zeta^o$. In the same fashion, $\zeta^*$ is recomputed using $\zeta^o$ as a start. Once computed, a $(\theta, \zeta)$ pair is never discarded. Neither are the corresponding $L(\theta)$ and $\pi(\theta, \varrho)$. Because the support of the proposal density is discrete, points in the $\theta$-chain will often recur, in which case $g(\theta)$, $L(\theta)$, and $\pi(\theta, \varrho)$ are retrieved from storage rather than computed afresh. If the modification just described results in an improved $(\theta^o, \zeta^o)$, that pair and the corresponding $L(\theta^o)$ and $\pi(\theta^o, \varrho^o)$ replace the values in storage; similarly for $(\theta^*, \zeta^*)$. The upshot is that the values for $g(\theta)$ used at step 4 will be optima computed from many different random starts after the chain has run awhile.

4.1 Relative Model Comparison

Relative model comparison is standard Bayesian inference. The posterior probabilities of the models with and without ambiguity aversion are computed using the Newton and Raftery (1994) $\hat{p}_4$ method for computing the marginal likelihood from an MCMC chain when assigning equal prior probability to each model. The advantage of that method is that knowledge of the normalizing constants of the likelihood $L(\theta)$ and the prior $\pi(\theta)$ is not required. We do not know these normalizing constants due to the imposition of support conditions. It is important, however, that the auxiliary model be the same for both models. Otherwise the normalizing constant of $L(\theta)$ would be required. One divides the marginal density for each model by the sum for both models to get the probabilities for relative model assessment. Or, because we are only comparing two models, one can equally well use the ratio of the two probabilities, i.e., the odds ratio.

4.2 Forecasts

A forecast is a functional $\Upsilon : f(\cdot \mid \cdot, \zeta) \mapsto \upsilon$ of the auxiliary model that can be computed from $f(\cdot \mid \cdot, \zeta)$ either analytically or by simulation. Due to the map $\zeta = g(\theta)$, we view such a forecast as both a forecast from the structural model and as a function of $\theta$. Viewing it as a function of $\theta$, we can compute $\upsilon$ at each draw in the posterior MCMC chain for $\theta$, which results in an MCMC chain for $\upsilon$. From the latter chain the mean and standard deviation of $\upsilon$ can be computed. The same quantities can be computed for draws from the prior. Examples are in Figure 6.
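
As a rough illustration of how the $\theta$-chain of Section 4 and the forecast summaries of this subsection fit together, the following Python sketch runs a random-walk Metropolis chain on a discrete grid, keeps every evaluated point in storage for reuse, and then summarizes a functional of $\theta$ over the draws. The target `log_posterior`, the functional `forecast`, the grid spacing, and the proposal scale are hypothetical stand-ins chosen for illustration; they are not the GSM likelihood, the prior, or any quantity estimated in this paper. The sketch also stores a single log posterior value per point, whereas the procedure described above stores $g(\theta)$, $L(\theta)$, and $\pi(\theta, \varrho)$ and refines them with BFGS.

```python
import math
import random
import statistics

STEP = 0.2         # spacing of the discrete proposal support (illustrative)
PROPOSAL_SD = 0.4  # proposal standard deviation, so STEP is half of it
OFFSETS = [k for k in range(-4, 5) if k != 0]
# Mass on separated grid points, proportional to a normal density.
WEIGHTS = [math.exp(-0.5 * (k * STEP / PROPOSAL_SD) ** 2) for k in OFFSETS]


def log_posterior(theta):
    """Hypothetical stand-in for log L(theta) + log pi(theta)."""
    return -0.5 * ((theta - 1.0) / 0.5) ** 2


def forecast(theta):
    """Hypothetical forecast functional upsilon(theta)."""
    return theta ** 2


def run_theta_chain(n_draws, seed=0):
    """Random-walk Metropolis on a grid, caching every evaluated point."""
    rng = random.Random(seed)  # fixed seed: same input, same chain
    cache = {}                 # grid index -> log posterior, never discarded

    def cached(idx):
        if idx not in cache:
            cache[idx] = log_posterior(idx * STEP)
        return cache[idx]

    idx, draws = 0, []
    for _ in range(n_draws):
        proposal = idx + rng.choices(OFFSETS, weights=WEIGHTS)[0]
        # The proposal is symmetric, so q cancels in the acceptance ratio.
        alpha = math.exp(min(0.0, cached(proposal) - cached(idx)))
        if rng.random() < alpha:
            idx = proposal
        draws.append(idx * STEP)
    return draws, cache


draws, cache = run_theta_chain(20000, seed=1)
upsilon = [forecast(theta) for theta in draws]
print("distinct theta values evaluated:", len(cache))
print("posterior mean of upsilon:", statistics.fmean(upsilon))
print("posterior sd of upsilon:", statistics.pstdev(upsilon))
```

Because the proposal support is discrete, the number of distinct points evaluated is far smaller than the chain length, which is the reason storage pays off when each evaluation requires a simulation and an optimization.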

4.3 The Auxiliary Model

The observed data are $y_t$ for $t = 1, \dots, n$, where $y_t$ is a vector of dimension three in our application. We use the notation $x_{t-1} = \{y_{t-1}, \dots, y_{t-L}\}$ if the auxiliary model is Markovian, and $x_{t-1} = \{y_{t-1}, \dots, y_1\}$ if it is not.5 Either way, $x_{t-1}$ serves as a shorthand for lagged values of the $y_t$ vector. In this application, the auxiliary model is not Markovian due to the recursion in expression (17).

5 Refer to Gallant and Long (1997) for the properties of estimators of the form used in this section when the model is not Markovian.

The data are modeled as

\[
y_t = \mu_{x_{t-1}} + R_{x_{t-1}} z_t,
\]

where

\[
\mu_{x_{t-1}} = b_0 + B y_{t-1}, \tag{15}
\]

which is the location function of a one-lag vector autoregressive (VAR) specification, and $R_{x_{t-1}}$ is the Cholesky factor of

\begin{align*}
\Sigma_{x_{t-1}} &= R_0 R_0' \tag{16}\\
&\quad + Q\, \Sigma_{x_{t-2}}\, Q' \tag{17}\\
&\quad + P (y_{t-1} - \mu_{x_{t-2}})(y_{t-1} - \mu_{x_{t-2}})' P' \tag{18}\\
&\quad + \max[0, V(y_{t-1} - \mu_{x_{t-2}})]\, \max[0, V(y_{t-1} - \mu_{x_{t-2}})]'. \tag{19}
\end{align*}

In computations, $\max(0, x)$ in expression (19), which is applied element-wise, is replaced by a twice differentiable cubic spline approximation that plots slightly above $\max(0, x)$ over $(0.00, 0.10)$ and coincides elsewhere.

The density $h(z)$ of the i.i.d. $z_t$ is the square of a Hermite polynomial times a normal density, the idea being that the class of such $h$ is dense in Hellinger norm and can therefore approximate a density to within arbitrary accuracy in Kullback-Leibler distance, see Gallant and Nychka (1987). Such approximations are often called sieves; Gallant and Nychka term this particular sieve the seminonparametric maximum likelihood estimator, or SNP. The density $h(z)$ is the normal when the degree of the Hermite polynomial is zero. In addition, the constant term of the Hermite polynomial can be a linear function of $x_{t-1}$. This has the effect of adding a nonlinear term to the location function (15) and the variance function (16). It also causes the higher moments of $h(z)$ to depend

on $x_{t-1}$ as well. The SNP auxiliary model is determined statistically by adding terms as indicated by the Bayesian information criterion (BIC) protocol for selecting the terms that comprise a sieve, see Schwarz (1978).

In our specification, $R_0$ is an upper triangular matrix, $P$ and $V$ are diagonal matrices, and $Q$ is scalar. The degree of the SNP $h(z)$ density is four. The constant term of the SNP density does not depend on the past. The auxiliary model chosen for our analysis, based on BIC, has 1 lag in the conditional mean component and 1 lag in each of the autoregressive conditional heteroscedasticity (ARCH) and generalized autoregressive conditional heteroscedasticity (GARCH) terms. The model admits a leverage effect in the ARCH term. The auxiliary model has 37 estimated parameters.

The implied error distributions in GSM estimation can differ significantly from the error shocks used for solving the structural model. For example, we numerically solve the two structural models in Section 3 assuming normal distributions for the error terms in Equations (1) and (2). This is a simplifying assumption to ease numerical solutions. The error distributions of simulations from these models are non-Gaussian. For example, in addition to GARCH and leverage effects, the three-dimensional error distribution implied by the benchmark smooth ambiguity aversion model is skewed in all three components and has fat tails for consumption growth and stock returns and thin tails for bond returns.

The auxiliary model is determined from simulations of the structural model, so issues of data sparsity do not arise; one can make the simulation length $N$ as large as necessary to determine the parameters of the auxiliary model accurately. As stated above, we used $N = 1{,}000$ and found that using larger values of $N$ did not change results other than to increase run times.

4.4 The Prior and Its Support

Both the benchmark and alternative models are richly parameterized. The benchmark model has 11 structural parameters, given by

\[
\theta = (\beta, \gamma, \psi, \eta, p_{11}, p_{22}, \kappa_1, \kappa_2, \lambda, \sigma_c, \sigma_d).
\]

The alternative model has 10 parameters with $\gamma = \eta$. The prior is the combination of the product of independent normal density functions and support conditions. The product of independent normal

density functions is given by

\[
\pi(\theta) = \prod_{i=1}^{n} N\!\left[\theta_i \mid \left(\theta_i^*, \sigma_{\theta_i}^2\right)\right],
\]

where $n$ denotes the number of parameters. For annual data, the prior location parameters of the benchmark model are

\[
\theta^* = (0.9750, 2.00, 1.50, 8.8640, 0.9780, 0.5160, -0.06785, 0.02251, 2.7400, 0.03127, 0.1200).
\]

These values are selected based on the calibration results of Ju and Miao (2012). With these parameter values, the calibrated model can roughly reproduce the means and volatilities of the risk-free rate and stock returns observed in the U.S. data. The scale parameters, i.e., standard deviations, are $\sigma_{\theta_i} = (0.90/1.96)\,\theta_i^*$. The implication of this choice of standard deviation is that the prior probability satisfies $P(|\theta_i - \theta_i^*| / |\theta_i^*| < 0.90) = 0.95$, i.e., the probability of $\theta_i$ being within 90 percent of $\theta_i^*$ is 0.95. This is a loose prior, so the major determinants of the prior are the support conditions described next. Imposition of a loose prior and mild support conditions provides room for the equilibrium model to contribute to the identification of estimated parameters. Due to the support conditions, the effective prior is not an independence prior.

For some values of $\theta^*$ proposed in Step 1 of the $\theta$-chain described in Section 4, a model solution at Step 2 will not exist. In such cases, $\alpha$ at Step 5 is set to zero. We constrain the subjective discount factor $\beta$ to be between 0.00 and 1.00. We bound the coefficient of risk aversion $\gamma$ to be above 0.00 and below 15.00, in line with the recommendation of Mehra and Prescott (1985) that risk aversion should be moderate. Fully parameterized Kreps and Porteus (1978) and Epstein and Zin (1989) preferences imply a separation between risk aversion and the IES; therefore we impose $\gamma \neq 1/\psi$.6 Following the long-run risk literature (e.g., Bansal and Yaron (2004)), we impose $\psi > 1.00$ such that persistent movements in expected consumption growth are significantly priced in equity returns. The upper bound for $\psi$ is set as $\psi < 5.00$. Relaxing this bound has little impact on our estimation results.

6 If $\gamma = 1/\psi$, Kreps-Porteus and Epstein-Zin preferences collapse to power utility.

We bound $\eta$ between 2.00 and 100.00. We impose $\eta > \gamma$ in the estimation of the benchmark model. Hayashi and Miao (2011) and Ju and Miao (2012) furnish detailed discussions of this requirement. Briefly, with $\eta = \gamma$, compound predictive probability distributions are reduced to an ordinary predictive probability distribution, removing ambiguity from the model. When estimating

the alternative model, we impose the restriction $\eta = \gamma$ to obtain ambiguity neutrality. Following consumption-based asset pricing models (e.g., Abel (1999) and Bansal and Yaron (2004)) and based on empirical findings of Aldrich and Gallant (2011), we require positive leverage in the model. To this end, we impose $\lambda > 1.00 \times 10^{-7}$ for the leverage parameter. For parameters in the consumption growth process, we impose $0.93960 < p_{11} < 0.99962$, $0.2514 < p_{22} < 0.7806$, $0.01596 < \kappa_1 < 0.02906$, $-0.1055 < \kappa_2 < -0.0302$, and $0.02646 < \sigma_c < 0.03608$. These bounds are adopted based on the parameter estimates and the associated standard errors in the Markov switching model for consumption growth, which are reported in Ju and Miao (2012) and Cecchetti, Lam, and Mark (2000). The bound for the volatility of dividend growth is $0.06542 < \sigma_d < 0.1746$, set according to the estimate and standard error provided by Bansal and Yaron (2004). The prior information for annual estimation is summarized in the first three columns of Table 2.

For estimation based on quarterly data, we appropriately rescale the prior from the annual estimation (the location and scale parameters, and the support conditions as well). The prior specifications for the preference parameters $\beta$, $\gamma$ and $\psi$ and for the leverage parameter $\lambda$ remain unchanged. The location parameter for ambiguity aversion, $\eta$, is adjusted to yield a sizable equity premium when the benchmark model is evaluated at the rescaled location parameters. The prior information for quarterly estimation is summarized in the first three columns of Table 3.

Our prior specification and support conditions help the GSM Bayesian estimation identify parameter estimates of both benchmark and alternative models. As will be discussed below, even after combining our loose priors with support conditions, the estimation procedure and data are important for the identification of key parameters.
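
The annual prior of Section 4.4 can be written down compactly, and the following Python sketch does so under several assumptions that should be read as ours rather than the authors'. The location vector as printed lists $-0.06785$ before $0.02251$, while the support conditions place $\kappa_1$ in a positive interval and $\kappa_2$ in a negative one; the sketch pairs each location with the interval it satisfies. It also takes the absolute value of $\theta_i^*$ when forming the standard deviation, treats every bound as a strict inequality, and uses hypothetical names (`LOCATION`, `BOUNDS`, `log_prior`). The nonexistence of a model solution, which also restricts the support, is not modeled here.

```python
import math

NAMES = ["beta", "gamma", "psi", "eta", "p11", "p22",
         "kappa1", "kappa2", "lam", "sigma_c", "sigma_d"]

# Annual prior locations; kappa1/kappa2 are paired with the support
# interval each value satisfies (an assumption, see the text above).
LOCATION = dict(zip(NAMES, [0.9750, 2.00, 1.50, 8.8640, 0.9780, 0.5160,
                            0.02251, -0.06785, 2.7400, 0.03127, 0.1200]))

# Standard deviations (0.90 / 1.96) * |location|; abs() is an assumption
# so that a negative location still yields a positive scale.
SCALE = {k: (0.90 / 1.96) * abs(v) for k, v in LOCATION.items()}

# Support intervals, treated as open intervals (an assumption).
BOUNDS = {
    "beta": (0.00, 1.00), "gamma": (0.00, 15.00), "psi": (1.00, 5.00),
    "eta": (2.00, 100.00), "p11": (0.93960, 0.99962),
    "p22": (0.2514, 0.7806), "kappa1": (0.01596, 0.02906),
    "kappa2": (-0.1055, -0.0302), "lam": (1.00e-7, math.inf),
    "sigma_c": (0.02646, 0.03608), "sigma_d": (0.06542, 0.1746),
}


def in_support(theta):
    """Check the interval bounds plus eta > gamma and gamma != 1/psi."""
    for name, (low, high) in BOUNDS.items():
        if not (low < theta[name] < high):
            return False
    if theta["eta"] <= theta["gamma"]:
        return False
    return theta["gamma"] != 1.0 / theta["psi"]


def log_prior(theta):
    """Log of the product of independent normals, -inf off the support."""
    if not in_support(theta):
        return -math.inf
    total = 0.0
    for name in NAMES:
        z = (theta[name] - LOCATION[name]) / SCALE[name]
        total += -0.5 * z * z - math.log(SCALE[name]) - 0.5 * math.log(2.0 * math.pi)
    return total


print("log prior at the location vector:", log_prior(LOCATION))
# The 0.90/1.96 scaling puts 95 percent of the normal mass within
# 90 percent of the location: P(|Z| < 1.96) for a standard normal Z.
print("coverage:", math.erf(1.96 / math.sqrt(2.0)))
```

A proposal that falls outside the support receives a log prior of minus infinity, which plays the same role as setting $\alpha$ to zero in the $\theta$-chain.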
4.5 Empirical Results

We plot the prior and posterior densities of the structural parameters of the benchmark and alternative models in Figures 2–5. The plots show considerable shifts in both location and scale, suggesting that the estimation procedure and data have a strong influence on our estimation results. This observation is reassuring because an important concern in Bayesian estimation is identification of parameter estimates. In other words, one wants to know the relative contribution of priors and support conditions versus the contribution of the data. It is clear from Figures 2–5 that for almost all of the estimated parameters, the posterior densities shift significantly compared to the prior densities. Moreover, this observation is true for estimations of the benchmark and alternative models at both annual and quarterly frequencies. Our discussion below focuses on posterior densities in the annual estimation. The reader can apply the same logic and line of reasoning to quarterly estimation results.

Figure 2 reveals that the identification of the ambiguity aversion parameter $\eta$, which is the key preference parameter in the model, is strong in the annual estimation. Both the location and scale change dramatically as a consequence of the estimation procedure. In addition, it can be seen from Figures 2 and 3 that the identification of other preference parameters including $\beta$, $\gamma$ and $\psi$ also appears to be notable for both the benchmark and alternative models, though the posterior density of the $\gamma$ estimate is moderately more dispersed in the benchmark model estimation.

Posterior densities of model parameters governing the dynamics of consumption and dividends also indicate that our estimation procedure has an important impact on the identification of those parameters, with the help of the priors and support conditions described in Section 4.4. The posterior densities of the estimated transition probabilities are more concentrated in the benchmark model estimation than in the alternative model estimation. We note that in Figure 2, the low mean growth rate of consumption, $\kappa_2$, has a very tight posterior density. This result is due to the inclusion of ambiguity aversion in the benchmark model. The ambiguity-averse agent distorts beliefs toward the bad regime. As a result, the low mean growth rate largely determines the impact of ambiguity aversion on the SDF and therefore on the equity premium. This feature of the model is manifested in the estimation results in that both the identification of $\eta$ and that of $\kappa_2$ are strong in the benchmark model estimation. By contrast, the posterior density of the high mean growth rate, $\kappa_1$, is tight in the estimation of the alternative model, as shown in Figure 3. Other parameters including the leverage parameter ($\lambda$), the volatility of the consumption growth innovation ($\sigma_c$) and the volatility of dividend growth ($\sigma_d$) have posterior densities significantly different from the corresponding prior densities.

The estimated moments of the model parameters are summarized in Table 2 and Table 3, for the annual and quarterly samples, respectively. We report means, modes and standard deviations of the parameters in the benchmark model featuring ambiguity aversion and in the alternative model with recursive utility. The posterior mean and mode estimates of the subjective discount factor $\beta$ are stable across the benchmark and alternative models for the annual and quarterly samples. The estimates are below 1 in all cases, which is consistent with values adopted by many calibration studies. Moreover, they are reasonably close to estimates reported by Aldrich and Gallant (2011)

and Bansal et al. (2007) and also to the GMM estimate of Yogo (2006). Thus, they do not cause any concern for us and imply precise estimation of the target parameter.

In contrast to the discount factor parameter, estimates of the risk aversion parameter, $\gamma$, are sensitive to the presence of ambiguity aversion. For the annual estimation, the posterior mean and mode of $\gamma$ in the alternative model are significantly larger than the corresponding estimates in the benchmark model. For instance, the posterior mean of $\gamma$ is 1.62 in the benchmark model, as opposed to 6.32 in the alternative model. When quarterly data are used for estimation, the posterior mean and mode of $\gamma$ in the alternative model are an order of magnitude larger than those for the benchmark model. This result is plausible given the calibration studies of Ju and Miao (2012) and Collard, Mukerji, Sheppard, and Tallon (2015), who show that with smooth ambiguity aversion, low risk aversion is required to account for a high and time-varying equity premium. The result is also related to the findings of Jeong et al. (2015) for their estimation of the recursive utility model and the multiple priors model, where aggregate wealth consists of financial wealth only. Jeong et al. (2015) report estimates of $\gamma$ ranging between 0.20 and 2.90 in the multiple priors model, while the $\gamma$ estimate is 4.90 in the recursive utility model. In comparison with Aldrich and Gallant (2011), the estimates of $\gamma$ in the benchmark model with ambiguity aversion are smaller than their estimates for both habit formation and long-run risk models, but similar to their prospect theory-based results.

The posterior mode and mean of the ambiguity aversion parameter $\eta$ are 29.80 and 30.33 for the annual sample and 57.08 and 55.93 for the quarterly sample.7 The standard deviation of the posterior distribution of $\eta$ is consistently low.

7 Thimme and Völkert (2015) use quarterly data to estimate the ambiguity aversion parameter in the smooth ambiguity utility function adopted in our study. Their GMM estimation relies on fixed values for the IES parameter and a reduced-form, linearized SDF. They obtain estimates of $\eta$ ranging from 24 to 62, which are comparable to our structural estimation results.

Taken together, tightly estimated values for $\eta$ and the impact of modeling ambiguity aversion on estimation of the risk aversion parameter, $\gamma$, strongly imply that ambiguity aversion does explain important features of asset returns in the data, namely low risk-free rates, a high equity premium and volatile equity returns. Since all of the model parameters are estimated simultaneously by the GSM Bayesian estimation methodology, the posterior estimates of the ambiguity aversion parameter depend on the estimation results for other model parameters, especially primitive parameters in the consumption growth process. Using the post-war data, our quarterly estimation generates parameter estimates in the consumption growth process that are quite different from the results in the annual estimation. This observation explains the difference between annual and quarterly estimates of the ambiguity aversion parameter. Typical values used

in the calibration studies, for example, $\eta = 8.86$ in Ju and Miao (2012) and $\eta = 19$ in Jahan-Parvar and Liu (2014), provide a lower bound for our estimates.8

Ahn et al. (2014) conduct an experimental study on estimating smooth ambiguity aversion. Based on a static formulation, they report values of an ambiguity aversion parameter ranging between 0.00 and 2.00, with a mean value of 0.207. Their estimates of the ambiguity aversion parameter are statistically insignificant and are at least an order of magnitude smaller than our dynamic model-based estimates. We believe that ignoring intertemporal choice under ambiguity explains these differences in estimates of ambiguity aversion.9

8 Ju and Miao (2012) calibrate their consumption-based model to a century-long data sample starting from the late 19th century. Jahan-Parvar and Liu (2014) calibrate their model to match features of data on both the business cycle and asset returns based on 19230–2010 data.

9 We find that the difference in the magnitude of these estimates is similar to the difference between static estimates of the Gul (1991) disappointment aversion parameter reported by Choi, Fisman, Gale, and Kariv (2007) and dynamic estimates reported by Feunou, Jahan-Parvar, and Tédongap (2013). Thus, the difference is more likely to be an outcome of the static setting used than of the different estimation methods, such as the GSM Bayesian methodology in our case and a frequentist method in the case of Feunou et al.

There is an ongoing debate about the value of the IES parameter $\psi$ in the asset pricing literature. This parameter is crucial for equilibrium asset pricing models to match macroeconomic and financial moments in the data. In the empirical literature, some studies (e.g., Hall (1988) and Ludvigson (1999)) find that the IES estimates are close to zero as implied by aggregate consumption data. Other studies find higher values using cohort- or household-level data (e.g., Attanasio and Weber (1993) and Vissing-Jorgensen (2002)). Attanasio and Vissing-Jorgensen (2003) find that the IES estimate for stockholders is typically above 1. Bansal and Yaron (2004) point out that the IES will be underestimated unless heteroscedasticity in aggregate consumption growth and asset returns is taken into account.

Our estimation strongly suggests an IES greater than unity, as advocated by the long-run risk literature. Tables 2 and 3 present the posterior mode and mean of the IES parameter, $\psi$, estimated in the annual and quarterly samples respectively. The posterior mode and mean estimates range from 2.50 to 5.00 across different models with small standard deviations. These estimates are larger than those reported by Aldrich and Gallant (2011), which are in the neighborhood of 1.50.

Risk aversion and the IES both determine the representative agent's preference for the timing of resolution of uncertainty. If $\gamma > 1/\psi$, the agent prefers earlier resolution of uncertainty; see Epstein and Zin (1989) and Bansal and Yaron (2004). Given the high estimates of $\psi$, both benchmark and alternative models point to a representative agent who prefers an earlier resolution of uncertainty. Adding ambiguity aversion attenuates this preference moderately: Once ambiguity aversion is taken

into account, the estimates of $\psi$ are around 2.62–2.96 in annual estimates and 4.28–4.55 for quarterly data. The GSM estimation delivers stable estimates of the IES parameter. This is in contrast to the results of Jeong et al. (2015), where the $\psi$ estimates range from 0.00 to $\infty$. In particular, when only financial wealth is used to proxy total wealth and ambiguity is represented by multiple priors, Jeong et al. (2015) obtain estimates of $\psi$ that are equal to 0.68 with time-varying volatility and 11.16 with nonlinear stochastic volatility.

Table 2 and Table 3 present posterior mean and mode estimates of the primitive parameters in the consumption growth process for the annual and quarterly estimations respectively. The results indicate that the GSM estimation method can successfully identify two distinct regimes of consumption growth for both benchmark and alternative models. The difference between the $\kappa_1$ and $\kappa_2$ estimates is sizable. The transition probability estimate of $p_{11}$ is above 0.90 in all cases, while the estimate of $p_{22}$ is about 0.30–0.40 in the annual estimation and about 0.70–0.80 in the quarterly estimation. This result suggests that the good regime is very persistent while the bad regime is transitory. All these estimates, together with the estimates for the volatility of the growth innovation, $\sigma_c$, have low standard deviations. Compared with empirical estimates reported by Cecchetti, Lam, and Mark (2000), differences in several parameter estimates are noticeable. However, this is not surprising. Cecchetti, Lam, and Mark (2000) fit a Markov switching model to consumption data only. Our GSM estimation uses both consumption data and asset returns data to estimate the model. Besides, we use different sample periods.

Dividend growth, $\Delta d_t$, is a latent variable in our estimation. In the benchmark annual estimation, the posterior estimates of the leverage parameter $\lambda$ are moderately greater than 1. This is consistent with the argument of Abel (1999) that aggregate dividends are a levered claim on aggregate consumption. However, the estimates of $\lambda$ are lower than the value used in the calibration of Ju and Miao (2012), where $\lambda = 2.74$. In the alternative model estimation at the annual frequency, estimates of $\lambda$ shown in Table 2 are closer to this value. In the quarterly estimation, the posterior mean estimates of $\lambda$ are between 1 and 2. The volatility estimate of dividend growth is stable across different models and samples. Our estimates of $\lambda$ and $\sigma_d$ are not directly comparable to results of Aldrich and Gallant (2011) or Bansal et al. (2007) due to different specifications for modeling dividend growth. Specifically, Aldrich and Gallant (2011) and Bansal et al. (2007) estimate the long-run risk model featuring time variation in the volatility of fundamentals, while we rely on Markov-switching mean growth rates and learning to generate time-varying volatility

of equity returns. However, our estimates of $\sigma_d$ are close in magnitude to that in the estimated prospect theory model reported by Aldrich and Gallant (2011). Aldrich and Gallant posit constant volatility for the dividend growth process in the prospect theory model.

In summary, apart from estimates for the risk aversion parameter and ambiguity aversion parameter, estimates of other structural parameters in our study are remarkably stable and are generally comparable in magnitude to values reported by other empirical asset pricing studies. Thus, it is reasonable to believe that parameter estimates other than the risk aversion and ambiguity aversion estimates have a small influence on identification and model comparison when it comes to the model featuring smooth ambiguity aversion. In addition, ignoring ambiguity aversion can lead to biased estimates of the risk aversion parameter.

5 Model Comparison and Implications

5.1 Relative Model Comparison

Relative model comparison is standard Bayesian inference as described in Subsection 4.1. The computed odds ratio is $1/(6.09 \times 10^{-85})$ for the annual estimation and $1/(1.18 \times 10^{-36})$ for the quarterly estimation, which strongly favors the benchmark model over the alternative model. This ratio implies that our benchmark model provides a better description of the available data. Given the log posterior values evaluated at the mode for the benchmark and alternative models reported in Tables 2 and 3, it is also clear that the benchmark model is the preferred model. One can gain a rough appreciation for what these odds ratios indicate from a frequentist perspective by disregarding the effects of the prior and support conditions and comparing the log posteriors shown in Tables 2 and 3 as if they were log likelihoods. For the annual comparison, minus twice the log likelihood ratio gives a $\chi^2$-statistic equal to 260.00 on one degree of freedom, and for the quarterly data 82.70 on one degree of freedom. The p-value for either is less than 0.0001.
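
The arithmetic behind these statements can be reproduced in a few lines. The Python sketch below is illustrative only: it converts a pair of relative marginal likelihoods into posterior model probabilities under equal prior probabilities, as described in Subsection 4.1, and computes the upper-tail probability of a $\chi^2$ variate with one degree of freedom from the identity $P(Z^2 > x) = \operatorname{erfc}(\sqrt{x/2})$ for a standard normal $Z$. The function names are hypothetical, and the marginal likelihoods are entered only up to a common scale, using the reported odds ratios.

```python
import math


def posterior_probabilities(marginal_benchmark, marginal_alternative):
    """Posterior model probabilities under equal prior probabilities."""
    total = marginal_benchmark + marginal_alternative
    return marginal_benchmark / total, marginal_alternative / total


def chi2_upper_tail_1df(statistic):
    """P(X > statistic) for a chi-square variate with one degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Reported odds of benchmark to alternative: 1 / 6.09e-85 (annual) and
# 1 / 1.18e-36 (quarterly). Only the ratio matters, so the benchmark's
# marginal likelihood is normalized to 1.0 here.
for label, relative_alt in [("annual", 6.09e-85), ("quarterly", 1.18e-36)]:
    p_bench, p_alt = posterior_probabilities(1.0, relative_alt)
    print(label, "P(benchmark) =", p_bench, "P(alternative) =", p_alt)

for label, stat in [("annual", 260.00), ("quarterly", 82.70)]:
    print(label, "chi-square p-value =", chi2_upper_tail_1df(stat))
```

Both p-values fall far below the 0.0001 threshold quoted above, and the posterior probability of the alternative model is numerically indistinguishable from zero in double precision.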
5.2 Forecasts

Forecasts are constructed as described in Subsection 4.2. Prior forecasts (not shown) do not differ much between pre- and post-Great Recession periods. There are, however, differences between prior forecasts based on the benchmark model and the alternative model. The main difference is the disparity in the level of benchmark and alternative model-based forecasts of the short rate. The

benchmark model forecasts a higher level for the short rate and a wider standard deviation than the alternative model does. The second difference is the slight increase in the consumption growth path forecasted by the benchmark model, against the drop forecasted by the alternative model.

Prior forecasts are not a measure of a model's success in predicting the data dynamics. For that purpose, we rely on posterior forecasts shown in Figure 6.10 As Figure 6 shows, the posterior forecasts of consumption growth differ across the pre- and post-Great Recession episodes. Both benchmark and alternative models forecast a drop in consumption growth for the pre-recession period and a slight increase for the post-recession period based on information available by the end of 2011. The posterior forecast paths generated by both models are on average similar. However, the benchmark model implies slightly more variation in consumption growth forecasts.

10 We find a rather dramatic change in the standard errors between the prior and posterior forecasts. This result suggests that the data are quite informative for the forecasts. The impact can also be confirmed by comparing the prior and posterior estimates of model parameters reported in Tables 2 and 3 and the prior and posterior densities of the estimated parameters shown in Figures 2 to 5.

For the pre-recession period, the benchmark model forecasts a steeper drop in stock returns compared to the alternative model. For the post-recession period, the benchmark model yields posterior return forecasts that are lower than those generated by the alternative model. These results reflect the pessimism inherent in our benchmark model with ambiguity aversion. Based on data up to the Great Recession period, Aldrich and Gallant (2011) report posterior forecasts of stock returns generated from the long-run risk model. Their return forecasts are roughly 6% for the period 2009–2013. Given the recent past experience, this level appears to be high.

It is clear from Figure 6 that the benchmark model predicts overall lower short rates compared to the alternative model's forecasts. Moreover, the benchmark model predicts a drop in the short rate for both pre- and post-recession periods while the alternative model implies the opposite results.11 These results about the posterior forecasts echo the mechanism of earlier models on ambiguity (or uncertainty) that the induced higher precautionary savings motive tends to reduce the risk-free rate. Given recent announcements by various practitioners, academics and former policy makers about the likelihood of interest rates reverting back to "old normal" levels, the posterior forecasts generated by our benchmark model seem reasonable.12

11 While this observation is in line with the zero-lower bound environment since the Great Recession, they should not be viewed as synonymous. We are forecasting real risk-free rates. They are not influenced by fiscal or monetary policy and are endogenously determined in the model.

12 For example, on May 16, 2014 the former Federal Reserve Chairman Ben Bernanke opined that the low interest rate environment is likely to continue beyond many then-current forecasts. (Source: Reuters)

Given that Bayesian model comparison prefers the benchmark model over the alternative model,

our results of posterior forecasts merit attention. The two models lead to very different dynamics for consumption growth and asset returns. If we indeed live in a world populated by ambiguityaverseagents, thenpolicyanddecisionmakersneedtobeawareoftheinherentlydifferentattitudes and hence, market behavior of agents endowed with ambiguity aversion preferences, as opposed to those assumed in standard rational expectation models. 5.3 Asset Pricing Implications In this section, we study the asset pricing implications of the benchmark model, using the estimated model parameters. Unlike calibration studies, our focus here is not to match unconditional moments of asset returns in the data as closely as possible. Instead, we want to assess the impact of ambiguity aversion on equity premium and the price of risk based on our estimated model rather than independently chosen parameter values. If the estimated benchmark model is reasonably successful in reproducing high price of risk and unconditional equity premium that are not explicitly targeted in our estimation, we view this outcome as confirmation that the dynamics of asset prices implied by our estimation are reasonably close to the underlying data generating process (DGP). Table 4 presents key financial moments generated by both benchmark and alternative models when model parameters are set to their posterior mean values reported in Table 2.13 Although matchingfinancialmomentsisnotsetanexplicittargetinourestimation,theestimatedbenchmark modelimpliesmomentsofassetreturnsclosetothedata. Allmomentsreportedin4areannualized. Weobservethefollowing. First, underthebenchmarkmodel, therisk-freeratehasameanofabout 1percentandlowvolatility. Lowvolatilityoftherisk-freerateisduetothehighestimateoftheIES parameter, which implies strong intertemporal substitution effect, see Bansal and Yaron (2004). The mean risk-free rate implied by the alternative model is 1.44 percent and is much higher than the data, at 1.07 percent. 
Second, while both the benchmark and alternative models generate volatility of equity returns close to the data, the two models differ dramatically in their ability to produce a high equity premium. The mean equity premium implied by the benchmark model estimation is 7.31 percent, close to the 7.47 percent in the annual sample. In contrast, the alternative model implies a mean equity premium of 1.36 percent. As shown by Bansal and Yaron (2004), without high risk aversion or time-varying uncertainty, the long-run risk model with Epstein and Zin preferences has difficulty in matching the mean equity premium. For that reason, Bansal and Yaron consider long-run risk in the mean of consumption growth and also set γ = 10 to match the mean equity premium. Since the estimated γ for the alternative model is smaller (E(γ) = 6.32 and σ(γ) = 0.26) and the model abstracts from stochastic volatility, the mean equity premium implied by the alternative model is too low.

¹³ The benchmark model estimated using the quarterly sample can also produce a high price of risk and equity premium. For the sake of brevity, the unconditional financial moments are not reported for this case. Results are available upon request from the authors.

Third, the market price of risk, defined by σ(M)/E(M), is closely related to moments of equity returns via the Hansen-Jagannathan bound

$$\left| \frac{E\left(R^e_t - R^f_t\right)}{\sigma\left(R^e_t - R^f_t\right)} \right| \le \frac{\sigma(M_t)}{E(M_t)}.$$

A reasonable model that explains asset price data well should deliver an SDF that satisfies the bound. The price of risk under the alternative model is 0.28, whereas the Sharpe ratio is 0.37 in the annual sample. The estimated alternative model therefore violates the Hansen-Jagannathan bound. The estimated benchmark model generates $\sigma(M_t)/E(M_t) = 2.63$ and thus satisfies the bound. In addition, the Sharpe ratio implied by the benchmark model is 0.42, close to the 0.37 in the data.

Figure 7 plots the conditional equity premium, equity volatility, price of risk, and price-dividend ratio as functions of the state belief $\mu_t$, the posterior probability of the high mean regime of consumption growth. The results are similar to the conditional moments plotted by Ju and Miao (2012), except that our results are based on estimated model parameters. Under the alternative model with Epstein and Zin's preferences, the conditional equity premium, equity volatility, and price of risk display humped shapes. The maximum of these conditional moments is attained when $\mu_t$ is close to 0.50, due to high uncertainty induced by Bayesian learning.
Ambiguity aversion increases the conditional equity premium and price of risk significantly for values of $\mu_t$ near its steady-state level implied by the estimated Markov-switching model. The intuition is that the agent distorts her beliefs pessimistically in the face of a shock to consumption growth and thus demands a high risk premium. Because the high mean growth regime is persistent, the distribution of $\mu_t$ is highly skewed toward 1.00. Thus, the impact of ambiguity aversion on the conditional price of risk and equity premium is strong when $\mu_t$ is close to its steady-state value. The pessimistic distortion also yields lower price-dividend ratios in the benchmark model, as shown in Panel D of Figure 7. Similar to Ju and Miao (2012), our estimated benchmark model can also reproduce the countercyclical pattern of the equity premium and equity volatility. The simulation results are plotted in Figure 8. We observe that the distorted belief puts more probability weight on the bad regime. When shocks to consumption growth are large in magnitude, the distorted belief becomes even more pessimistic, and the conditional equity premium and equity volatility rise significantly. Thus, the model can reproduce volatility clustering, which is also captured by the auxiliary model used in our estimation.

Finally, an important question is: do our structural estimates imply reasonable magnitudes for ambiguity aversion? To address this question, we use detection-error probabilities to assess the room for ambiguity aversion based on our estimation results. This exercise is meaningful because our estimation is grounded in the data and thus is informative about the behavior of economic agents and the dynamics of economic variables.

Detection-error probabilities are an approach developed by Anderson, Hansen, and Sargent (2003) and Hansen and Sargent (2010) to assess the likelihood of making errors in selecting between statistically "close" (in terms of relative entropy) data generating processes (DGPs). In this study, the reference DGP is the Markov-switching model specified in Equation (1). Without ambiguity aversion, the transition probabilities of the Markov chain are defined by $p_{11}$ and $p_{22}$. Ambiguity aversion implies a distortion of the transition probabilities and thus gives rise to the distorted DGP. The Appendix shows that the reference DGP and the distorted DGP differ only in terms of transition probabilities. We adapt the approach of computing detection-error probabilities in Jahan-Parvar and Liu (2014) to the endowment economy in our study. This approach enables us to simulate artificial data from the reference and distorted DGPs and then evaluate the likelihood explicitly.
Details of the algorithm are available in the Appendix. A sizable detection-error probability p(η) associated with a given value of the ambiguity aversion parameter η implies a large chance of making mistakes in distinguishing the reference DGP from the distorted DGP, and thus ample room for ambiguity aversion. Based on the estimated parameters of the benchmark model, the detection-error probability is 10.22% for the annual estimation and 13% for the quarterly estimation. Anderson, Hansen, and Sargent (2003) advocate that a detection-error probability of about 10% suggests a plausible extent of ambiguity. Thus, our estimated model parameters admit reasonably large scope for ambiguity aversion.

6 Conclusion

Smooth ambiguity preferences of Klibanoff et al. (2005, 2009) have gained considerable popularity in recent years. This popularity is due to the clear separation between ambiguity, which is a characteristic of the representative agent's subjective beliefs, and ambiguity aversion, which derives from the agent's tastes. In this paper, we estimate the endowment equilibrium asset pricing model with smooth ambiguity preferences proposed by Ju and Miao (2012) using U.S. data and the GSM Bayesian estimation methodology of Gallant and McCulloch (2009) to (1) investigate the empirical properties of such an asset pricing model as an adequate characterization of the returns and consumption growth data and (2) provide an empirical estimate of the ambiguity aversion parameter and its relationship with other structural parameters in the model.

Our study contributes to the existing literature by providing a formal empirical investigation of the adequacy of this class of preferences for economic modeling and by presenting estimates of the structural parameters of this model. The estimated structural parameters are in line with theoretical expectations and are comparable with estimated parameters in related studies. With respect to the measurement of ambiguity aversion, our results show a marked improvement over the existing literature. The existing empirical literature provides either measures of ambiguity (usually the size of the set of priors in the MPU framework) instead of the ambiguity aversion of the agent, or implausible estimates (economically or statistically) of the smooth ambiguity aversion parameter. Our study addresses both shortcomings in the extant literature.

We find that Bayesian model comparison strongly favors the benchmark model, featuring a representative agent endowed with smooth ambiguity preferences, over the alternative model featuring Epstein-Zin's recursive preferences. Our estimates of the ambiguity aversion parameter are large and have important asset pricing implications for the market price of risk and equity premium. Detection-error probabilities computed using the estimated parameters imply ample scope for ambiguity aversion. Structural estimations ignoring ambiguity aversion may lead to biased estimates of the risk aversion parameter and are unable to explain the high market price of risk implied by financial data. Our estimates of the IES parameter are significantly greater than 1 and suggest a strong preference for earlier resolution of uncertainty, consistent with the long-run risk literature.

References

Abel, A. B., 1999. Risk premia and term premia in general equilibrium. Journal of Monetary Economics 43 (1), 3–33.
Ahn, D., Choi, S., Gale, D., Kariv, S., 2014. Estimating ambiguity aversion in a portfolio choice experiment. Quantitative Economics 5 (2), 195–223.
Aldrich, E. M., Gallant, A. R., 2011. Habit, long-run risks, prospect? A statistical inquiry. Journal of Financial Econometrics 9, 589–618.
Anderson, E. W., Ghysels, E., Juergens, J. L., 2009. The impact of risk and uncertainty on expected returns. Journal of Financial Economics 94, 233–263.
Anderson, E. W., Hansen, L. P., Sargent, T. J., 2003. A quartet of semigroups for model specification, robustness, price of risk, and model detection. Journal of the European Economic Association 1, 68–123.
Andreski, P., Li, G., Samancioglu, M. Z., Schoeni, R., 2014. Estimates of annual consumption expenditures and its major components in the PSID in comparison to the CE. American Economic Review: Papers and Proceedings 104 (5), 132–135.
Attanasio, O. P., Vissing-Jorgensen, A., 2003. Stock-market participation, intertemporal substitution and risk aversion. American Economic Review: Papers and Proceedings 93, 383–391.
Attanasio, O. P., Weber, G., 1993. Consumption growth, the interest rate and aggregation. Review of Economic Studies 60, 631–649.
Backus, D., Ferriere, A., Zin, S., 2015. Risk and ambiguity in models of business cycles. Journal of Monetary Economics 69, 42–63.
Bansal, R., Gallant, R., Tauchen, G., 2007. Rational pessimism and exuberance. Review of Economic Studies 74, 1005–1033.
Bansal, R., Yaron, A., 2004. Risks for the long run: A potential resolution of asset pricing puzzles. Journal of Finance 59, 1481–1509.
Bianchi, F., Ilut, C., Schneider, M., 2014. Uncertainty shocks, asset supply and pricing over the business cycle. Working Paper.
Bollerslev, T., 1986. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 307–327.
Cecchetti, S. G., Lam, P.-S., Mark, N. C., 2000. Asset pricing with distorted beliefs: Are equity returns too good to be true? American Economic Review 90, 787–805.
Chen, H., Ju, N., Miao, J., 2014. Dynamic asset allocation with ambiguous return predictability. Review of Economic Dynamics 17, 799–823.
Chen, Z., Epstein, L., 2002. Ambiguity, risk, and asset returns in continuous time. Econometrica 70 (4), 1403–1443.
Choi, S., Fisman, R., Gale, D., Kariv, S., 2007. Consistency, heterogeneity, and granularity of individual behavior under uncertainty. American Economic Review 97 (5), 1921–1938.
Collard, F., Mukerji, S., Sheppard, K., Tallon, J.-M., 2015. Ambiguity and the historical equity premium. Working Paper, University of Oxford.

Engle, R. F., Kroner, K. F., 1995. Multivariate simultaneous generalized ARCH. Econometric Theory 11 (1), 122–150.
Epstein, L. G., Zin, S. E., 1989. Substitution, risk aversion, and the temporal behavior of consumption and asset returns: A theoretical framework. Econometrica 57 (4), 937–969.
Feunou, B., Jahan-Parvar, M. R., Tédongap, R., 2013. Modeling market downside volatility. Review of Finance 17 (1), 443–481.
Fletcher, R., 1987. Practical Methods of Optimization, 2nd Edition. John Wiley and Sons, New York, NY.
Gallant, A. R., Long, J. R., 1997. Estimating stochastic differential equations efficiently by minimum chi-squared. Biometrika 84, 125–141.
Gallant, A. R., McCulloch, R. E., 2009. On the determination of general scientific models with application to asset pricing. Journal of the American Statistical Association 104, 117–131.
Gallant, A. R., Nychka, D. W., 1987. Seminonparametric maximum likelihood estimation. Econometrica 55, 363–390.
Gallant, A. R., Tauchen, G., 1996. Which moments to match? Econometric Theory 12, 657–681.
Gallant, A. R., Tauchen, G., 1998. Reprojecting partially observed systems with application to interest rate diffusions. Journal of the American Statistical Association 93, 10–24.
Gallant, A. R., Tauchen, G., 2010. Handbook of Financial Econometrics. Vol. 1 Tools and Techniques. Elsevier/North-Holland, Amsterdam, Ch. Simulated Score Methods and Indirect Inference for Continuous-time Models, pp. 427–478.
Gamerman, D., Lopes, H. F., 2006. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference, 2nd Edition. No. 86 in CRC Texts in Statistical Science. Chapman & Hall, New York, NY.
Garner, T. I., Janini, G., Passero, W., Paszkiewicz, L., Vendemia, M., September 2006. The CE and the PCE: A Comparison. Monthly Labor Review, Bureau of Labor Statistics (09).
Gollier, C., 2011. Portfolio choices and asset prices: The comparative statics of ambiguity aversion. Review of Economic Studies 78, 1329–1344.
Gouriéroux, C., Monfort, A., Renault, E. M., 1993. Indirect inference. Journal of Applied Econometrics 8, S85–S118.
Guidolin, M., Liu, H., 2014. Ambiguity aversion and under-diversification. Journal of Financial and Quantitative Analysis, forthcoming.
Gul, F., 1991. A theory of disappointment aversion. Econometrica 59 (3), 667–686.
Halevy, Y., 2007. Ellsberg revisited: An experimental study. Econometrica 75, 503–536.
Hall, R., 1988. Intertemporal substitution in consumption. Journal of Political Economy 96, 339–357.
Hansen, L. P., 2007. Beliefs, doubts and learning: The valuation of macroeconomic risk. American Economic Review 97, 1–30.

Hansen, L. P., Sargent, T. J., 2010. Fragile beliefs and the price of uncertainty. Quantitative Economics 1, 129–162.
Hayashi, T., Miao, J., 2011. Intertemporal substitution and recursive smooth ambiguity preferences. Theoretical Economics 6.
Ilut, C., Schneider, M., 2014. Ambiguous business cycles. American Economic Review 104, 2368–2399.
Jahan-Parvar, M. R., Liu, H., 2014. Ambiguity aversion and asset prices in production economies. Review of Financial Studies 27 (10), 3060–3097.
Jarque, C. M., Bera, A. K., 1980. Efficient tests for normality, homoscedasticity and serial independence of regression residuals. Economics Letters 6 (3), 255–259.
Jeong, D., Kim, H., Park, J. Y., 2015. Does ambiguity matter? Estimating asset pricing models with a multiple-priors recursive utility. Journal of Financial Economics 115 (2), 361–382.
Ju, N., Miao, J., 2012. Ambiguity, learning, and asset returns. Econometrica 80, 559–591.
Klibanoff, P., Marinacci, M., Mukerji, S., 2005. A smooth model of decision making under ambiguity. Econometrica 73 (6), 1849–1892.
Klibanoff, P., Marinacci, M., Mukerji, S., 2009. Recursive smooth ambiguity preferences. Journal of Economic Theory 144 (3), 930–976.
Kreps, D. M., Porteus, E. L., 1978. Temporal resolution of uncertainty and dynamic choice. Econometrica 46, 185–200.
Ludvigson, S. C., 1999. Consumption and credit: A model of time-varying liquidity constraints. Review of Economics and Statistics 81, 434–447.
Maccheroni, F., Marinacci, M., Ruffino, D., 2013. Alpha as ambiguity: Robust mean-variance portfolio analysis. Econometrica 81 (3), 1075–1113.
Mehra, R., Prescott, E. C., 1985. The equity premium: A puzzle. Journal of Monetary Economics 15, 145–161.
Newton, M. A., Raftery, A. E., 1994. Approximate Bayesian inference with the weighted likelihood bootstrap. Journal of the Royal Statistical Society, Series B 56, 3–48.
Ruffino, D., 2013. A robust capital asset pricing model. Working Paper, Federal Reserve Board.
Schwarz, G., 1978. Estimating the dimension of a model. Annals of Statistics 6, 461–464.
Strzalecki, T., 2013. Temporal resolution of uncertainty and recursive models of ambiguity aversion. Econometrica 81 (3), 1039–1074.
Thimme, J., Völkert, C., 2015. Ambiguity in the cross-section of expected returns: An empirical assessment. Journal of Business & Economic Statistics 33 (3), 418–429.
Veronesi, P., 1999. Stock market overreaction to bad news in good times: A rational expectations equilibrium model. Review of Financial Studies 12, 975–1007.
Viale, A. M., Garcia-Feijoo, L., Giannetti, A., 2014. Safety first, learning under ambiguity, and the cross-section of stock returns. Review of Asset Pricing Studies 4 (1), 118–159.

Vissing-Jorgensen, A., 2002. Limited asset market participation and the elasticity of intertemporal substitution. Journal of Political Economy 110, 825–853.
Yogo, M., 2006. A consumption-based explanation of expected stock returns. Journal of Finance 61, 539–580.

7 Appendix: Detection-error Probabilities

• In constructing distorted transition probabilities, we consider a "full information model", where the agent is ambiguity averse but the state $z_t$ is observable. In this case, the Euler equation is

$$0 = p_{11} E_{1,t}\left[ M_{z_{t+1},t+1} \left( R^e_{t+1} - R^f_t \right) \right] + (1 - p_{11}) E_{2,t}\left[ M_{z_{t+1},t+1} \left( R^e_{t+1} - R^f_t \right) \right]$$

for $z_t = 1$ and

$$0 = (1 - p_{22}) E_{1,t}\left[ M_{z_{t+1},t+1} \left( R^e_{t+1} - R^f_t \right) \right] + p_{22} E_{2,t}\left[ M_{z_{t+1},t+1} \left( R^e_{t+1} - R^f_t \right) \right]$$

for $z_t = 2$. The Euler equation can be rewritten as

$$0 = \tilde{p}_{11} E_{1,t}\left[ M^{EZ}_{z_{t+1},t+1} \left( R^e_{t+1} - R^f_t \right) \right] + (1 - \tilde{p}_{11}) E_{2,t}\left[ M^{EZ}_{z_{t+1},t+1} \left( R^e_{t+1} - R^f_t \right) \right]$$

$$0 = (1 - \tilde{p}_{22}) E_{1,t}\left[ M^{EZ}_{z_{t+1},t+1} \left( R^e_{t+1} - R^f_t \right) \right] + \tilde{p}_{22} E_{2,t}\left[ M^{EZ}_{z_{t+1},t+1} \left( R^e_{t+1} - R^f_t \right) \right]$$

where $M^{EZ}_{z_{t+1},t+1}$ is the SDF under recursive utility without ambiguity aversion, and $\tilde{p}_{11}$ and $\tilde{p}_{22}$ are distorted transition probabilities given by

$$\tilde{p}_{11} = \frac{p_{11}}{p_{11} + (1 - p_{11}) \left( \dfrac{E_2\left[ V^{1-\gamma}_{z_{t+1},t+1} \right]}{E_1\left[ V^{1-\gamma}_{z_{t+1},t+1} \right]} \right)^{-\frac{\eta - \gamma}{1 - \gamma}}}, \tag{20}$$

$$\tilde{p}_{22} = \frac{p_{22}}{(1 - p_{22}) \left( \dfrac{E_1\left[ V^{1-\gamma}_{z_{t+1},t+1} \right]}{E_2\left[ V^{1-\gamma}_{z_{t+1},t+1} \right]} \right)^{-\frac{\eta - \gamma}{1 - \gamma}} + p_{22}}, \tag{21}$$

where $V_{z_t,t}$ ($z_t = 1, 2$) are solutions to the following value function under full information:

$$V_{z_t,t}(C) = \left[ (1 - \beta) C_t^{1 - \frac{1}{\psi}} + \beta \left\{ \mathcal{R}_{z_t}\left( V_{z_{t+1},t+1}(C) \right) \right\}^{1 - \frac{1}{\psi}} \right]^{\frac{1}{1 - \frac{1}{\psi}}},$$

$$\mathcal{R}_{z_t}\left( V_{z_{t+1},t+1}(C) \right) = \left( E_{z_t}\left[ \left( E_{z_{t+1},t}\left[ V^{1-\gamma}_{z_{t+1},t+1}(C) \right] \right)^{\frac{1 - \eta}{1 - \gamma}} \right] \right)^{\frac{1}{1 - \eta}}.$$

• The numerical algorithm for calculating detection-error probabilities takes the following steps:

1. Repeatedly draw $\{\Delta c_t\}_{t=1}^{T}$ under the reference data generating process (DGP), which is the two-state Markov switching model with transition probabilities $p_{11}$ and $p_{22}$.

2. Evaluate the log likelihood function under the reference DGP by computing

$$\ln L^r_T = \sum_{t=1}^{T} \ln \left\{ \sum_{z_t=1}^{2} f(\Delta c_t \mid z_t) \Pr(z_t \mid \Omega_{t-1}) \right\}$$

where $\pi_{t-1} = \Pr(z_t = 1 \mid \Omega_{t-1})$ are filtered probabilities implied by the Markov switching model.

3. Evaluate the log likelihood function under the distorted DGP by computing

$$\ln L^d_T = \sum_{t=1}^{T} \ln \left\{ \sum_{z_t=1}^{2} f(\Delta c_t \mid z_t) \widetilde{\Pr}(z_t \mid \Omega_{t-1}) \right\}$$

where $\widetilde{\Pr}(z_t \mid \Omega_{t-1})$ are the filtered probabilities obtained by applying the distorted transition probabilities $\tilde{p}_{11,t}$ and $\tilde{p}_{22,t}$ (in place of the constant transition probabilities $p_{11}$ and $p_{22}$) to the Markov switching model's filter.

4. Compute the fraction of simulations for which $\ln\left( L^d_T / L^r_T \right) > 0$ and denote it as $p_r$. The fraction approximates the probability that the econometrician believes that the distorted DGP generated the data, while the data are actually generated by the reference DGP.

5. Do a symmetrical computation and simulate $\{\Delta c_t\}_{t=1}^{T}$ under the distorted DGP. Compute the fraction of simulations for which $\ln\left( L^r_T / L^d_T \right) > 0$ and denote it as $p_d$. This fraction approximates the probability that the econometrician believes that the reference DGP generated the data, while the data are actually generated by the distorted DGP.

Assuming an equal prior on the reference and the distorted DGP, the detection-error probability is defined by (see Anderson et al. (2003)):

$$p(\eta) = \frac{1}{2} (p_r + p_d). \tag{22}$$

In the approximation, we set T = 100 years and simulate 20,000 samples of artificial data.
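The five steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are ours, the ratio of expected continuation values that enters equations (20) and (21) is a hypothetical placeholder (computing it would require solving the value function), the distorted probabilities are held constant over time, and each filter is initialized at the stationary distribution of its own chain. Regime means, volatility, and preference parameters are the posterior means from Table 2, Panel A.

```python
import numpy as np

def distort(p_stay, ratio, eta, gamma):
    """Equations (20)-(21): distorted probability of staying in a regime.
    ratio = E_other[V^(1-gamma)] / E_own[V^(1-gamma)], a hypothetical input here."""
    weight = ratio ** (-(eta - gamma) / (1.0 - gamma))
    return p_stay / (p_stay + (1.0 - p_stay) * weight)

def simulate(n_sim, T, p11, p22, k1, k2, sigma, rng):
    """Step 1: draw n_sim paths of consumption growth from the two-state chain."""
    pi1 = (1 - p22) / (2 - p11 - p22)          # stationary Pr(regime 1), our assumption
    z = np.empty((n_sim, T), dtype=int)        # 0 = regime 1, 1 = regime 2
    z[:, 0] = rng.random(n_sim) >= pi1
    for t in range(1, T):
        stay = np.where(z[:, t - 1] == 0, p11, p22)
        z[:, t] = np.where(rng.random(n_sim) < stay, z[:, t - 1], 1 - z[:, t - 1])
    return np.where(z == 0, k1, k2) + sigma * rng.standard_normal((n_sim, T))

def loglik(dc, p11, p22, k1, k2, sigma):
    """Steps 2-3: ln L_T = sum_t ln sum_z f(dc_t | z) Pr(z | Omega_{t-1})."""
    P = np.array([[p11, 1 - p11], [1 - p22, p22]])
    pi1 = (1 - p22) / (2 - p11 - p22)
    pred = np.tile([pi1, 1 - pi1], (dc.shape[0], 1))   # Pr(z_1 | Omega_0)
    mu = np.array([k1, k2])
    ll = np.zeros(dc.shape[0])
    for t in range(dc.shape[1]):
        dens = np.exp(-0.5 * ((dc[:, [t]] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        joint = dens * pred
        lik = joint.sum(axis=1, keepdims=True)
        ll += np.log(lik[:, 0])
        pred = (joint / lik) @ P                       # Pr(z_{t+1} | Omega_t)
    return ll

def detection_error(ref, dist, common, T=100, n_sim=20000, seed=0):
    """Steps 4-5 and equation (22): p(eta) = (p_r + p_d) / 2."""
    rng = np.random.default_rng(seed)
    dc = simulate(n_sim, T, *ref, *common, rng=rng)    # data from the reference DGP
    p_r = np.mean(loglik(dc, *dist, *common) > loglik(dc, *ref, *common))
    dc = simulate(n_sim, T, *dist, *common, rng=rng)   # data from the distorted DGP
    p_d = np.mean(loglik(dc, *ref, *common) > loglik(dc, *dist, *common))
    return 0.5 * (p_r + p_d), p_r, p_d

# Posterior means from Table 2, Panel A (annual benchmark model).
p11, p22, k1, k2, sigma_c = 0.9411, 0.2733, 0.0201, -0.0662, 0.0268
eta, gamma = 30.3285, 1.6264
ratio = 1.01  # HYPOTHETICAL value of E_2[V^(1-gamma)] / E_1[V^(1-gamma)]
dist = (distort(p11, ratio, eta, gamma), distort(p22, 1 / ratio, eta, gamma))
p, p_r, p_d = detection_error((p11, p22), dist, (k1, k2, sigma_c), n_sim=2000)
print(f"distorted p11={dist[0]:.4f}, p22={dist[1]:.4f}, p(eta)={p:.3f}")
```

Because the value-function ratio is a placeholder, the printed probability is not expected to match the 10.22% reported in the text. The sketch only shows how the likelihood comparison is organized. The filter is vectorized across simulated paths so that the 20,000 samples used in the paper remain cheap to evaluate.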

Table 1: Summary Statistics of the Data

| | $r^e_t$ | $r^f_t$ | $r^e_t - r^f_t$ | $\Delta c_t$ |
|---|---|---|---|---|
| 1929-2013 | | | | |
| Mean | 8.54 | 1.07 | 7.47 | 1.85 |
| St Dev | 20.35 | 0.06 | 20.35 | 2.15 |
| Skewness | -0.29 | 0.60 | -0.29 | -1.49 |
| Kurtosis | -0.72 | 1.32 | -0.72 | 5.01 |
| J-B test | 0.3938 | 0.0133 | 0.4012 | 0.0001 |
| 1947:Q2-2014:Q2 | | | | |
| Mean | 8.76 | 1.05 | 7.71 | 1.91 |
| St Dev | 16.43 | 0.02 | 16.44 | 1.02 |
| Skewness | -0.57 | 0.99 | -0.57 | -0.42 |
| Kurtosis | 1.90 | 1.27 | 1.90 | 1.11 |
| J-B test | 0.0013 | 0.0001 | 0.0013 | 0.0017 |

This table reports summary statistics for annual (1929-2013) and quarterly (1947:Q2-2014:Q2) U.S. data. The 1-year Treasury Bill rate ($r^f_t$), aggregate equity returns ($r^e_t$), excess returns ($r^e_t - r^f_t$), and real, per capita, log consumption growth ($\Delta c_t$) are expressed in percentages. Mean and standard deviation of quarterly data are annualized. The row titled "J-B test" reports the p-values of the Jarque and Bera (1980) test of normality.

Table 2: GSM Annual Estimation Results

Panel A: Benchmark model

| Parameter | Prior mode | Prior mean | Prior std. dev. | Posterior mode | Posterior mean | Posterior std. dev. |
|---|---|---|---|---|---|---|
| β | 0.9766 | 0.6267 | 0.2340 | 0.9428 | 0.9470 | 0.0034 |
| γ | 1.8750 | 1.9808 | 0.8972 | 1.6172 | 1.6264 | 0.6817 |
| ψ | 1.3750 | 1.7765 | 0.5147 | 2.6172 | 2.9646 | 0.4978 |
| η | 10.0000 | 9.3388 | 3.6839 | 29.7969 | 30.3285 | 1.3914 |
| $p_{11}$ | 0.9805 | 0.9682 | 0.0179 | 0.9399 | 0.9411 | 0.0013 |
| $p_{22}$ | 0.4688 | 0.5133 | 0.1393 | 0.2753 | 0.2733 | 0.0120 |
| $\kappa_1$ | 0.0217 | 0.0225 | 0.0036 | 0.0231 | 0.0201 | 0.0023 |
| $\kappa_2$ | -0.0615 | -0.0662 | 0.0199 | -0.0683 | -0.0662 | 0.0012 |
| λ | 2.8750 | 2.5191 | 0.8536 | 1.0781 | 1.2497 | 0.1930 |
| $\sigma_c$ | 0.0308 | 0.0310 | 0.0028 | 0.0267 | 0.0268 | 0.0003 |
| $\sigma_d$ | 0.1133 | 0.1263 | 0.0267 | 0.1742 | 0.1705 | 0.0039 |

Log. Post.: -392.2910

Panel B: Alternative model

| Parameter | Prior mode | Prior mean | Prior std. dev. | Posterior mode | Posterior mean | Posterior std. dev. |
|---|---|---|---|---|---|---|
| β | 0.9551 | 0.5635 | 0.2373 | 0.9793 | 0.9819 | 0.0052 |
| γ | 5.0625 | 4.9505 | 2.4958 | 6.4570 | 6.3259 | 0.2561 |
| ψ | 1.6875 | 1.7740 | 0.5108 | 4.3828 | 4.0195 | 0.5753 |
| η | N/A | N/A | N/A | N/A | N/A | N/A |
| $p_{11}$ | 0.9902 | 0.9706 | 0.0170 | 0.9435 | 0.9482 | 0.0090 |
| $p_{22}$ | 0.5156 | 0.5107 | 0.1411 | 0.4470 | 0.3665 | 0.0841 |
| $\kappa_1$ | 0.0229 | 0.0222 | 0.0038 | 0.0160 | 0.0166 | 0.0008 |
| $\kappa_2$ | -0.0684 | -0.0672 | 0.0193 | -0.0472 | -0.0428 | 0.0136 |
| λ | 2.8750 | 2.5877 | 0.8896 | 3.3438 | 2.5988 | 0.8080 |
| $\sigma_c$ | 0.0349 | 0.0312 | 0.0028 | 0.0344 | 0.0342 | 0.0022 |
| $\sigma_d$ | 0.1133 | 0.1284 | 0.0271 | 0.1726 | 0.1661 | 0.0082 |

Log. Post.: -522.2885

This table presents priors and posteriors on mode, mean, and standard deviation of model parameters for the benchmark model featuring ambiguity aversion and the alternative model with Epstein-Zin's recursive utility. We impose η = γ for the alternative model estimation. Preference parameters (β, γ, ψ, and η) represent the subjective discount factor and the coefficients of risk aversion, intertemporal elasticity of substitution, and ambiguity aversion, respectively. $p_{11}$ and $p_{22}$ are transition probabilities in the Markov-switching model for consumption growth. $\kappa_1$ and $\kappa_2$ are good and bad state mean consumption growth rates, respectively. λ is the leverage parameter, and $\sigma_c$ and $\sigma_d$ are volatilities for consumption and dividend growth, respectively. "Log. Post." represents the log posterior evaluated at the mode for the benchmark and alternative models. Estimation results are for annual data 1929–2013. In our GSM Bayesian estimation, we use the 1929–1949 data to prime the estimation procedure, and the 1950–2013 data to obtain the estimated parameters.
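As a rough check on the regime persistence invoked in Section 5.3, the posterior means of $p_{11}$ and $p_{22}$ in Panel A can be plugged into the standard stationary-distribution and expected-duration formulas for a two-state Markov chain. This calculation is ours rather than something reported by the authors, and it treats the posterior means as point values.

```latex
\begin{align*}
\pi_1 &= \frac{1 - p_{22}}{2 - p_{11} - p_{22}}
       = \frac{1 - 0.2733}{2 - 0.9411 - 0.2733}
       = \frac{0.7267}{0.7856} \approx 0.925, \\
\text{expected duration of regime 1} &= \frac{1}{1 - p_{11}} = \frac{1}{0.0589} \approx 17.0 \text{ years}, \\
\text{expected duration of regime 2} &= \frac{1}{1 - p_{22}} = \frac{1}{0.7267} \approx 1.4 \text{ years}.
\end{align*}
```

Under these values the chain spends roughly 92 percent of the time in the high-growth regime, which is in line with the statement that the distribution of the state belief is skewed toward 1.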

Table 3: GSM Quarterly Estimation Results

Panel A: Benchmark model

| Parameter | Prior mode | Prior mean | Prior std. dev. | Posterior mode | Posterior mean | Posterior std. dev. |
|---|---|---|---|---|---|---|
| β | 0.9883 | 0.7063 | 0.2462 | 0.9892 | 0.9897 | 0.0006 |
| γ | 0.8750 | 0.8422 | 0.3662 | 1.3672 | 1.3237 | 0.4015 |
| ψ | 4.0625 | 3.2063 | 1.0600 | 4.5547 | 4.2770 | 0.4572 |
| η | 66.0000 | 53.3760 | 22.3194 | 57.0781 | 55.9348 | 4.2912 |
| $p_{11}$ | 0.9727 | 0.9563 | 0.0244 | 0.9994 | 0.9993 | 0.0003 |
| $p_{22}$ | 0.6719 | 0.7029 | 0.0442 | 0.7677 | 0.7287 | 0.0370 |
| $\kappa_1$ | 0.0056 | 0.0057 | 0.0021 | 0.0081 | 0.0077 | 0.0010 |
| $\kappa_2$ | -0.0146 | -0.0155 | 0.0056 | -0.0198 | -0.0183 | 0.0012 |
| λ | 1.6250 | 1.8377 | 0.5345 | 2.1719 | 1.6436 | 0.7118 |
| $\sigma_c$ | 0.0132 | 0.0131 | 0.0039 | 0.0169 | 0.0150 | 0.0028 |
| $\sigma_d$ | 0.1211 | 0.0883 | 0.0223 | 0.0905 | 0.0843 | 0.0038 |

Log. Post.: -1658.8368

Panel B: Alternative model

| Parameter | Prior mode | Prior mean | Prior std. dev. | Posterior mode | Posterior mean | Posterior std. dev. |
|---|---|---|---|---|---|---|
| β | 0.9102 | 0.6552 | 0.2423 | 0.9869 | 0.9875 | 0.0007 |
| γ | 27.8750 | 19.3914 | 10.7661 | 13.5234 | 15.0289 | 0.9939 |
| ψ | 3.5625 | 3.2937 | 0.9976 | 4.9141 | 4.9031 | 0.0950 |
| η | N/A | N/A | N/A | N/A | N/A | N/A |
| $p_{11}$ | 0.9258 | 0.9584 | 0.0247 | 0.9997 | 0.9998 | 0.0001 |
| $p_{22}$ | 0.6719 | 0.7038 | 0.0441 | 0.7743 | 0.7741 | 0.0019 |
| $\kappa_1$ | 0.0051 | 0.0057 | 0.0020 | 0.0148 | 0.0147 | 0.0003 |
| $\kappa_2$ | -0.0146 | -0.0153 | 0.0056 | -0.0250 | -0.0262 | 0.0007 |
| λ | 1.3750 | 1.8407 | 0.5327 | 1.0469 | 1.1008 | 0.0979 |
| $\sigma_c$ | 0.0103 | 0.0132 | 0.0039 | 0.0188 | 0.0193 | 0.0011 |
| $\sigma_d$ | 0.1133 | 0.0878 | 0.0224 | 0.0890 | 0.0850 | 0.0035 |

Log. Post.: -1700.1807

This table presents priors and posteriors on mode, mean, and standard deviation of model parameters for the benchmark model featuring ambiguity aversion and the alternative model with Epstein-Zin's recursive utility. We impose η = γ for the alternative model estimation. Preference parameters (β, γ, ψ, and η) represent the subjective discount factor and the coefficients of risk aversion, intertemporal elasticity of substitution, and ambiguity aversion, respectively. $p_{11}$ and $p_{22}$ are transition probabilities in the Markov-switching model for consumption growth. $\kappa_1$ and $\kappa_2$ are good and bad state mean consumption growth rates, respectively. λ is the leverage parameter, and $\sigma_c$ and $\sigma_d$ are volatilities for consumption and dividend growth, respectively. "Log. Post." represents the log posterior evaluated at the mode for the benchmark and alternative models. Estimation results are for quarterly data 1947:Q2–2014:Q2. In our GSM Bayesian estimation, we use the 1947:Q2–1955:Q2 data to prime the estimation procedure, and the 1955:Q3–2014:Q2 data to obtain the estimated parameters.

Table 4: Financial Moments

| | Data (1929–2013) | Benchmark model (p(η) = 10.22%) | Alternative model |
|---|---|---|---|
| $E(r^f_t)$ | 1.07 | 0.98 | 1.44 |
| $\sigma(r^f_t)$ | 0.06 | 0.09 | 0.16 |
| $E(r^e_t - r^f_t)$ | 7.47 | 7.31 | 1.36 |
| $\sigma(r^e_t - r^f_t)$ | 20.35 | 17.41 | 17.66 |
| Sharpe ratio | 0.37 | 0.42 | 0.08 |
| $\sigma(M_t)/E(M_t)$ | N/A | 2.63 | 0.28 |

This table presents unconditional financial moments generated from the estimated benchmark and alternative models using annual data. Model parameters are set at their posterior mean values reported in Table 2. $E(r^f_t)$ and $E(r^e_t - r^f_t)$ are the mean risk-free rate and mean equity premium, respectively, in percentage. $\sigma(r^f_t)$ and $\sigma(r^e_t - r^f_t)$ are volatilities of risk-free rates and excess returns, respectively, in percentage. $\sigma(M_t)/E(M_t)$ is the market price of risk.
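The Hansen-Jagannathan comparison discussed in Section 5.3 can be reproduced from the entries of Table 4. The short Python sketch below recomputes the Sharpe ratios from the reported means and volatilities and checks each model's price of risk against the Sharpe ratio in the data. The dictionary layout and function names are ours, and the log-return moments in the table are used as a stand-in for the gross-return moments that appear in the bound.

```python
# Annualized moments copied from Table 4 (returns in percent).
table4 = {
    "data":        {"premium": 7.47, "vol": 20.35, "price_of_risk": None},
    "benchmark":   {"premium": 7.31, "vol": 17.41, "price_of_risk": 2.63},
    "alternative": {"premium": 1.36, "vol": 17.66, "price_of_risk": 0.28},
}

def sharpe(row):
    """Sharpe ratio |E(r^e - r^f)| / sigma(r^e - r^f)."""
    return abs(row["premium"]) / row["vol"]

def satisfies_hj(price_of_risk, sharpe_ratio):
    """Hansen-Jagannathan bound: Sharpe ratio <= sigma(M) / E(M)."""
    return sharpe_ratio <= price_of_risk

data_sharpe = sharpe(table4["data"])
print("data Sharpe ratio:", round(data_sharpe, 2))
for name in ("benchmark", "alternative"):
    row = table4[name]
    print(name,
          "model Sharpe:", round(sharpe(row), 2),
          "satisfies bound vs. data:", satisfies_hj(row["price_of_risk"], data_sharpe))
```

The recomputed Sharpe ratios round to the 0.37, 0.42, and 0.08 shown in the table. The benchmark model's price of risk of 2.63 clears the data Sharpe ratio, while the alternative model's 0.28 does not.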

Figure 1: Risk Free Rate, Aggregate Equity Returns, Excess Returns, and Consumption Growth

The figure shows, from top to bottom, annual returns of the CRSP-Compustat value-weighted index, 1-year Treasury Bill rates, excess returns over 1-year T-Bill rates, and annual real per-capita log consumption growth for the 1929–2013 period. Shaded areas represent NBER recessions.

Figure 2: Prior and Posterior Densities of Estimated Parameters of the Benchmark Model, Annual Data

[Figure omitted. One density panel per parameter: β, γ, ψ, η, $p_{1,1}$, $p_{2,2}$, $\kappa_2$, $\kappa_1$, λ, $\sigma_{\Delta c}$, $\sigma_{\Delta d}$.]

This figure plots prior and posterior densities of the benchmark model parameters. The solid lines depict posterior densities and dotted lines depict prior densities. The results are based on 1929–2013 annual data.

Figure 3: Prior and Posterior Densities of Estimated Parameters of the Alternative Model, Annual Data

[Figure omitted. One density panel per parameter: β, γ, ψ, $p_{1,1}$, $p_{2,2}$, $\kappa_2$, $\kappa_1$, λ, $\sigma_{\Delta c}$, $\sigma_{\Delta d}$.]

This figure plots prior and posterior densities of the alternative model parameters, where the restriction η = γ is imposed. The solid lines depict posterior densities and dotted lines depict prior densities. The results are based on 1929–2013 annual data.

Figure 4: Prior and Posterior Densities of Estimated Parameters of the Benchmark Model, Quarterly Data

[Figure omitted. One density panel per parameter: β, γ, ψ, η, $p_{1,1}$, $p_{2,2}$, $\kappa_2$, $\kappa_1$, λ, $\sigma_{\Delta c}$, $\sigma_{\Delta d}$.]

This figure plots prior and posterior densities of the benchmark model parameters. The solid lines depict posterior densities and dotted lines depict prior densities. The results are based on 1947–2014 quarterly data.

Figure 5: Prior and Posterior Densities of Estimated Parameters of the Alternative Model, Quarterly Data

[Figure omitted. One density panel per parameter: β, γ, ψ, $p_{1,1}$, $p_{2,2}$, $\kappa_2$, $\kappa_1$, λ, $\sigma_{\Delta c}$, $\sigma_{\Delta d}$.]

This figure plots prior and posterior densities of the alternative model parameters, where the restriction η = γ is imposed. The solid lines depict posterior densities and dotted lines depict prior densities. The results are based on 1947–2014 quarterly data.

Figure 6: Posterior Forecasts

[Figure omitted. Forecast blocks: Pre-Great Recession Forecasts (horizontal axis 2006–2016) and Post-Great Recession Forecasts (horizontal axis 2012–2020), each shown for the Benchmark Model and the Alternative Model. Panels in each block: consumption growth, stock returns, short rate.]

This figure shows posterior forecasts for consumption growth, stock returns, and the short rate for the pre- and post-Great Recession periods. The plots are based on annual estimations of the benchmark and alternative models. The dashed lines are the ±1.96 posterior standard deviations.

Figure 7: Conditional Financial Moments

[Figure omitted. Panel A: Equity premium. Panel B: Equity volatility. Panel C: Price of risk. Panel D: P/D. Legend: AA, EZ.]

This figure plots conditional financial moments implied by the benchmark and alternative models as functions of the state belief $\mu_t$, i.e., the perceived probability of high mean consumption growth under Bayesian learning. The results are based on the GSM Bayesian estimation applied to annual data (1929–2013). Model parameters are set at their posterior mean estimates reported in Table 2.

Figure 8: Quantitative Implications of Ambiguity Aversion

[Figure omitted. Panel A: Filtered probabilities and distorted probabilities (legend: Bayesian, Distorted). Panel B: Conditional equity premium. Panel C: Conditional equity volatility.]

This figure plots simulated series of Bayesian-filtered and distorted state beliefs ($\mu_t$ and $\tilde{\mu}_t$), the conditional equity premium, and equity volatility for the benchmark model with ambiguity aversion. The benchmark model parameters are set at their posterior mean estimates reported in Table 2.

Cite this document

APA

A. Ronald Gallant, Mohammad Jahan-Parvar, & Hening Liu (2015). Measuring Ambiguity Aversion (FEDS 2015-105). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2015-105

BibTeX

@techreport{wtfs_feds_2015_105,
  author = {A. Ronald Gallant and Mohammad Jahan-Parvar and Hening Liu},
  title = {Measuring Ambiguity Aversion},
  type = {Finance and Economics Discussion Series},
  number = {2015-105},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2015},
  url = {https://whenthefedspeaks.com/doc/feds_2015-105},
  abstract = {We confront the generalized recursive smooth ambiguity aversion preferences of Klibanoff, Marinacci, and Mukerji (2005, 2009) with data using Bayesian methods introduced by Gallant and McCulloch (2009) to close two existing gaps in the literature. First, we use macroeconomic and financial data to estimate the size of ambiguity aversion as well as other structural parameters in a representative-agent consumption-based asset pricing model. Second, we use estimated structural parameters to investigate asset pricing implications of ambiguity aversion. Our structural parameter estimates are comparable with those from existing calibration studies, demonstrate sensitivity to sampling frequencies, and suggest ample scope for ambiguity aversion.},
}