feds · August 31, 2016

From Which Consumption-Based Asset Pricing Models Can Investors Profit? Evidence from Model-Based Priors

Abstract

This paper compares consumption-based asset pricing models based on the forecasting performance of investors who use economic constraints derived from the models to predict the equity premium. Three prominent asset pricing models are considered: Habit Formation, Long Run Risk, and Prospect Theory. I propose a simple Bayesian framework through which the investors impose the economic constraints as model-based priors on the parameters of their predictive regressions. An investor whose prior beliefs are rooted in the Long Run Risk model achieves more accurate forecasts overall. The greatest difference in performance occurs during the bull market of the late 1990s. During this period, the weak predictability of the equity premium implied by the Long Run Risk model helps the investor to not prematurely anticipate falling stock prices.

Finance and Economics Discussion Series Divisions of Research & Statistics and Monetary Affairs Federal Reserve Board, Washington, D.C. From Which Consumption-Based Asset Pricing Models Can Investors Profit? Evidence from Model-Based Priors Mathias S. Kruttli 2016-027 Please cite this paper as: Kruttli, Mathias S. (2016). “From Which Consumption-Based Asset Pricing Models Can Investors Profit? Evidence from Model-Based Priors,” Finance and Economics Discussion Series 2016-027. Washington: Board of Governors of the Federal Reserve System, http://dx.doi.org/10.17016/FEDS.2016.027r1. NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

From Which Consumption-Based Asset Pricing Models Can Investors Profit? Evidence from Model-Based Priors∗

Mathias S. Kruttli†

First draft: February 11, 2014
This draft: September 26, 2016

JEL classification: G11, G12, G17
Keywords: Bayesian econometrics, consumption-based asset pricing, return predictability

∗ I thank Nicholas Barberis, John Campbell, Lubos Pastor, Andrew Patton, Tarun Ramadorai, Kevin Sheppard, Dimitri Vayanos, Jessica Wachter, Missaka Warusawitharana, Sumudu Watugala, Ivo Welch, Mungo Wilson, and seminar participants at the Federal Reserve Board of Governors, NBER Summer Institute 2016, Oxford-Man Institute of Quantitative Finance, Saïd Business School, and Swiss Economists Abroad Conference 2015, for useful comments. The analysis and conclusions set forth are those of the author and do not indicate concurrence by other members of the Board of Governors of the Federal Reserve System. Part of this paper was completed when Mathias S. Kruttli was at the University of Oxford Economics Department. An earlier version of this paper was named “Model-Based Priors for Predicting the Equity Premium” (2014).
† Mathias S. Kruttli is at the Board of Governors of the Federal Reserve System and the Oxford-Man Institute of Quantitative Finance. Email: mathias.s.kruttli@frb.gov.

1 Introduction

Predicting aggregate stock returns has been of great interest to academics and investors alike. For academics, the predictability of stock returns is important for testing market efficiency. For investors, knowing whether the equity premium is predictable is crucial for portfolio allocation decisions. An extensive literature uses a variety of variables to explain the time-variation of returns (see, for example, Campbell (1987); Campbell and Shiller (1988); Fama and French (1988 and 1989); Baker and Wurgler (2000); Lettau and Ludvigson (2001); Polk, Thompson, and Vuolteenaho (2006); Welch and Goyal (2008); Li, Ng, and Swaminathan (2013); Kruttli, Patton, and Ramadorai (2015)). Valuation ratios were initially found to have predictive power when forecasting the equity premium, but the set of forecasting variables has since been extended with variables such as corporate payout, fluctuations in the consumption-wealth ratio, implied cost of capital, and yields on bonds and Treasury securities. Welch and Goyal (2008) provide a comprehensive analysis of the in-sample and out-of-sample (OOS) predictive power of the major variables and question whether an investor could have used these predictors to forecast the equity premium OOS. Campbell and Thompson (2008) further investigate these findings by imposing economic constraints when estimating the single-variable predictive regressions. They apply sign restrictions on the parameter estimates of the predictive regression and a non-negativity restriction on the forecast of the equity premium. These restrictions help the investor to reduce uncertainty about the regression parameters. Campbell and Thompson (2008) find that through these restrictions, a real-time investor could profitably forecast the equity premium.

This paper imposes novel economic constraints derived from consumption-based asset pricing models on the parameters of single-variable predictive regressions that are typically used in the equity premium prediction literature. I propose a simple Bayesian econometric framework to implement these economic constraints as prior

distributions on the parameters. These prior distributions are named model-based priors. My approach relates to the macroeconometric literature, in which prior distributions from dynamic stochastic general equilibrium (DSGE) models are imposed on vector autoregressions (VAR) (see, for example, Ingram and Whiteman (1994) and Del Negro and Schorfheide (2011)). Whereas macroeconomists use priors from DSGE models to improve their forecasts of macroeconomic variables, I consider investors who use model-based priors from consumption-based asset pricing models to forecast the equity premium. The three consumption-based asset pricing models that act as sources for the model-based priors are the Habit Formation (HF) model of Campbell and Cochrane (1999), the Prospect Theory (PT) model of Barberis, Huang, and Santos (2001), and the Long Run Risk (LRR) model of Bansal and Yaron (2004). I chose these three models because they propose different theories that can explain the equity premium puzzle (Mehra and Prescott (1985)). Also, the respective authors emphasize the equity premium predictability implied by their models and calibrate their models with similar US data.¹

The model-based priors allow me to assess whether an investor could have profited from knowing the theories and the theories’ implications for the predictability of the equity premium inherent in these consumption-based asset pricing models. I assume that an investor who forecasts the equity premium with valuation ratios has a prior belief about the parameter estimates of the predictive regression that stems from one of the asset pricing models. The investor then updates her beliefs about the predictive regression parameters with empirical data and forecasts the equity premium OOS based on the posterior parameter estimates. To my knowledge, prior distributions derived from asset pricing models have not been previously explored for the purpose of forecasting returns OOS.

¹ Barro (2006) explains the equity premium puzzle through a disaster risk model, but unlike the three asset pricing models used in this paper, the calibration is based on international data on large economic declines, and the model-implied predictability of the equity premium is not analyzed.

Unlike other papers in the equity premium prediction literature, the focus of this paper is to compare the performances of the

model-based priors from the three asset pricing models with each other. Comparing the accuracy of the forecasts provides an assessment of how useful the asset pricing models’ descriptions of the macro-finance world are for a finance practitioner who attempts to time her investments in the aggregate stock market. This novel way of comparing consumption-based asset pricing models leads to insights that are not obtained when matching empirical data moments with model-based moments from Monte Carlo simulations, as is generally done (see, for example, Bansal, Gallant, and Tauchen (2007) for a comparison of the HF and the LRR models and Ludvigson (2012) for a survey of the literature).²

Several other papers in this growing literature also make use of economically motivated parameter constraints for predicting the equity premium and implement them through a type of Bayesian framework on the predictive regressions. Pastor and Stambaugh (2009) employ a prior that implies a negative correlation between expected and unexpected return shocks. Shanken and Tamayo (2012) consider prior beliefs on the risk-return tradeoff and on the extent to which mispricing drives predictability. Pettenuzzo, Timmermann, and Valkanov (2014) propose a Bayesian methodology that imposes a non-negative equity premium and bounds on the conditional Sharpe ratio. Their constraints lead to forecasts of the equity premium that are substantially more accurate. Wachter and Warusawitharana (2009) model skepticism of an investor over the predictability of the equity premium as an informative prior over the $R^2$ and show that a skeptical investor achieves better forecasts. Wachter and Warusawitharana (2015) analyze whether an investor who is initially skeptical about the existence of equity premium predictability would update her prior and conclude that the equity premium is predictable when being confronted with historical data.

² The analytical solutions and the empirically observable state variables of the LRR model allow for additional model evaluations: the in-sample estimation proposed by Bansal, Kiku, and Yaron (2010) and Constantinides and Ghosh (2012) and the OOS fit proposed by Ferson, Nallareddy, and Xie (2013).

Other Bayesian studies consider uncertainty about the predictive regression parameters through uninformative priors (see, for example,

Stambaugh (1999); Barberis (2000); Brandt, Goyal, Santa-Clara, and Stroud (2005); Penasse (2016)) or investigate how parameter uncertainty affects the long run predictive variance (see, for example, Pastor and Stambaugh (2012) and Avramov, Cederburg, and Lucivjanska (2016)).

My sample comprises data from 1926 to 2014. I compare the predictive accuracy of hypothetical investors who had access to the three asset pricing models from 1926 onward and use the model-based priors to reduce the uncertainty about the parameters of the predictive regression. The investors try to time the market by forecasting the equity premium with either the dividend-price ratio or the dividend yield.³ For my benchmark analysis, the calibration of the asset pricing models is the same as presented by the authors in the respective publications of the models.⁴ However, the results are robust to recalibrating the asset pricing models over a time period that has no overlap with the OOS period.

I find a sharp distinction between the performances of the LRR model-based priors and the model-based priors derived from the HF and PT models. The LRR model-based priors perform particularly well from 1980 onward. The HF and PT model-based priors result in more accurate forecasts up to the 1980s. Over the whole data sample, an investor armed with the knowledge of the LRR model would have generally outperformed investors whose prior beliefs about the predictability of the equity premium were rooted in the HF or PT model. The differences in performance hold when comparing both the accuracy of the forecasts and the utility gains achieved by the investors. The key to the strong performance of the LRR prior over the total sample period is the bull market of the late 1990s, when low valuation ratios predicted negative stock returns that did not materialize for several years (see, for example, Lettau and Ludvigson (2005)).

³ As in Welch and Goyal (2008) and Pettenuzzo et al. (2014), the dividend-price ratio is defined as dividends divided by price, and the dividend yield is defined as dividends divided by price lagged by one period.
⁴ Because the authors of the asset pricing models use almost identical data sets for the calibration of their respective models, the comparison of the model-based priors’ forecasting performances should not be distorted.

The LRR model implies a

lower predictive power of valuation ratios than the other two asset pricing models. Hence, an investor who uses the LRR model as guidance for her investment choices is more reluctant to conclude that low valuation ratios imply an immediate decline in stock prices. This skepticism improves her forecast performance during the late 1990s, and this effect dominates the less accurate forecasts of the LRR priors during episodes when the predictive power of valuation ratios was stronger, as, for example, in the 1970s.

The differences in forecast accuracy between the three asset pricing models are economically significant. I find that an investor with mean-variance preferences who allocates her portfolio based on equity premium forecasts would on average be willing to pay 26 basis points per year to have access to the LRR model-based priors instead of the HF model-based priors. Relative to the PT model-based priors, the investor would on average be willing to pay 67 basis points per year to have access to the LRR model-based priors.

The implied predictability of the equity premium differs across the asset pricing models due to the model-specific mechanisms that lead to time-variation in valuation ratios. In the HF and PT models, changes in the valuation ratios are driven by time-varying discount rates, and this mechanism leads to a predictable equity premium. In the LRR model, the discount rate channel similarly leads to fluctuations in the valuation ratios and return predictability. However, the LRR model also incorporates a predictable component in the dividend growth rate, the long-run risk component, which drives valuation ratios and mitigates their predictive power for the equity premium. There exists considerable debate about whether the empirically observed changes in valuation ratios are driven by time-variation of discount rates or time-variation in the forecasts of dividend growth (see, for example, Campbell and Shiller (1988); Lettau and Ludvigson (2005); Bansal, Kiku, and Yaron (2007 and 2012); Cochrane (2008)). This paper shows that from the perspective of an investor who tries to time the market OOS over the 1926-2014 sample, model-based

priors derived from an asset pricing model that accounts for changes in valuation ratios due to time-varying dividend growth forecasts are preferred because of the lower equity premium predictability that the model implies.

The rest of this paper is organized as follows. Section 2 explains the Bayesian methodology used to impose the model-based priors. Section 3 reports the data used and the results. Section 4 discusses the utility gains that an investor with mean-variance preferences achieves when implementing the model-based priors. Section 5 analyzes the robustness of the results. Section 6 concludes the paper.

2 Methodology

This section describes how I impose economic constraints on the single-variable predictive regressions through priors derived from consumption-based asset pricing models and how these models are simulated to obtain the priors.

2.1 Equity premium prediction model

The log equity premium at time $t+1$ is denoted by $r_{t+1}$ and is defined as the rate of return on the stock market in excess of the prevailing short-term interest rate. As is common in the equity premium prediction literature, $r_{t+1}$ is regressed on a constant and a predictor, $x_t$, which is lagged by one period:

$$r_{t+1} = \beta_0 + \beta_1 x_t + \epsilon_{t+1}, \quad \text{where } \epsilon_{t+1} \sim N(0, \sigma_\epsilon^2). \tag{1}$$

The OOS predictions of the equity premium are generated through recursive forecasts (see, for example, Campbell and Thompson (2008), Welch and Goyal (2008), and Pettenuzzo et al. (2014)). Hence, all available observations up to period $t$ are used to estimate the model in equation (1). Based on the resulting estimates of the parameters $\beta = [\beta_0, \beta_1]'$ and $\sigma_\epsilon^2$, and by observing $x_t$, one can forecast the equity premium in $t+1$. The predicted equity premium is denoted by $\hat{r}_{t+1}$. Because

observations after $t+1$ are not used to estimate $\beta$, a real-time investor who forecasts the equity premium can implement this procedure. If no model-based priors are imposed, the parameters can be estimated via ordinary least squares (OLS). A common benchmark for a predictor in the equity premium literature is the historical average model, which forecasts that the equity premium will be next period what it has been on average in the past ($\beta_1$ in equation (1) is set to zero).

2.2 Model-based priors

An investor who wants to make use of the theoretical insights of a consumption-based asset pricing model can impose economic constraints derived from the asset pricing model on $\beta$ and $\sigma_\epsilon^2$. These model-based constraints reduce uncertainty about the predictive regression parameters and are best imposed via Bayesian techniques. I assume that the investor’s prior belief is that $\beta$ and $\sigma_\epsilon^2$ take the values implied by the asset pricing model. She then updates her belief through empirical data.

The prior distribution of the parameters in equation (1), that is, $\beta$ and $\sigma_\epsilon^2$, is assumed to be Gamma-Normal (see, for example, Koop (2003) and Pettenuzzo et al. (2014)), which has the advantage of being a tractable prior distribution. The prior distribution is given by

$$\beta \sim N\left(\underline{\beta}, \underline{V}\right), \qquad \sigma_\epsilon^{-2} \sim G\left(\sigma_\epsilon^{*-2}, v(t-1)\right). \tag{2}$$

The mean and the variance of the Normal prior distribution are specified as

$$\underline{\beta} = \begin{bmatrix} \beta_0^* \\ \beta_1^* \end{bmatrix}, \qquad \underline{V} = \begin{bmatrix} \lambda \sigma_{r,t}^2 & 0 \\ 0 & \lambda \sigma_{r,t}^2 / \sigma_{x,t}^2 \end{bmatrix}, \tag{3}$$

where $\beta_0^*$ and $\beta_1^*$ are the coefficient values implied by the consumption-based asset pricing model. The parameter $\lambda$ is exogenously chosen and is weakly positive. If $\lambda$ is large, the prior is loose. If $\lambda$ is equal to zero, the prior is dogmatic. I set

$\lambda = 1$ for the benchmark analysis. Section 5 reports results for different values of $\lambda$ and shows that these are in line with the benchmark case. The sample moments $\sigma_{r,t}^2$ and $\sigma_{x,t}^2$ are scaling factors, which ensure that the results are comparable for different predictors and forecast frequencies. Such scaling factors are commonly used in Bayesian macroeconometrics and date back to Litterman (1986). The sample moments are given by

$$\sigma_{r,t}^2 = \frac{1}{t-2} \sum_{\tau=2}^{t} (r_\tau - \bar{r}_t)^2, \qquad \bar{r}_t = \frac{1}{t-1} \sum_{\tau=2}^{t} r_\tau \tag{4}$$

and

$$\sigma_{x,t}^2 = \frac{1}{t-2} \sum_{\tau=1}^{t-1} (x_\tau - \bar{x}_t)^2, \qquad \bar{x}_t = \frac{1}{t-1} \sum_{\tau=1}^{t-1} x_\tau. \tag{5}$$

The Gamma distribution parametrization follows Koop (2003) by specifying the distribution with mean $\sigma_\epsilon^{*-2}$ and degrees of freedom $v(t-1)$, where $\sigma_\epsilon^{*-2}$ is derived from the consumption-based asset pricing model. The tightness of the prior is controlled by $v$, which is strictly positive. A large $v$ corresponds to a tight prior, and a small $v$ corresponds to a diffuse prior. The benchmark case sets $v$ to 0.1, but my results are robust to a tighter or a more diffuse prior on $\sigma_\epsilon^{-2}$ (see Section 5).

2.3 Posterior distribution

The model-based prior distributions yield conditional posterior distributions for $\beta$ and $\sigma_\epsilon^{-2}$. I draw from these two conditional distributions through a Gibbs sampler. The conditional posterior distribution for $\beta$ is

$$\beta \mid \sigma_\epsilon^{-2}, I_t \sim N\left(\bar{\beta}, \bar{V}\right), \tag{6}$$

where

$$\bar{V} = \left(\underline{V}^{-1} + \sigma_\epsilon^{*-2} X'X\right)^{-1}, \qquad \bar{\beta} = \bar{V}\left(\underline{V}^{-1}\underline{\beta} + \sigma_\epsilon^{*-2} X'R\right), \tag{7}$$

$X$ is a $(t-1) \times 2$ matrix with rows $[1 \;\; x_\tau]$ for $\tau = 1, \ldots, t-1$, and $R$ is a $(t-1) \times 1$ vector with elements $r_\tau$ for $\tau = 2, \ldots, t$. The information set at time $t$ is denoted by $I_t$. The conditional posterior distribution for $\sigma_\epsilon^{-2}$ takes the form

$$\sigma_\epsilon^{-2} \mid \beta, I_t \sim G\left(\bar{s}^{-2}, \bar{v}\right), \tag{8}$$

where

$$\bar{v} = v + (t-1), \quad \text{and} \quad \bar{s}^2 = \frac{\sum_{\tau=2}^{t} (r_\tau - \beta_0 - \beta_1 x_{\tau-1})^2 + \sigma_\epsilon^{*2}\, v(t-1)}{\bar{v}}. \tag{9}$$

Through the Gibbs sampling algorithm with $J$ iterations, we obtain a series of draws for each of the parameters denoted by $\{\beta^j\}$ and $\{\sigma_\epsilon^{-2,j}\}$ for $j = 1, \ldots, J$. These simulated series can then be used to draw from the predictive return distribution

$$p(r_{t+1} \mid I_t) = \int_{\beta, \sigma_\epsilon^{-2}} p(r_{t+1} \mid \beta, \sigma_\epsilon^{-2}, I_t)\, p(\beta, \sigma_\epsilon^{-2} \mid I_t)\, d\beta\, d\sigma_\epsilon^{-2}, \tag{10}$$

which yields $\{r_{t+1}^j\}$ for $j = 1, \ldots, J$. The point forecast for the equity premium in period $t+1$ is given by the mean of the sampled distribution

$$\hat{r}_{t+1}^{m} = \frac{1}{J} \sum_{j=1}^{J} r_{t+1}^j. \tag{11}$$

2.4 Deriving priors from asset pricing models

I next describe how the prior means $\beta^* = [\beta_0^*, \beta_1^*]'$ and $\sigma_\epsilon^{*-2}$ are derived from the three consumption-based asset pricing models: HF, LRR, and PT. All three models specify a log consumption and a log dividend growth process. These two processes drive the state variables of the models, and the state variables determine the dividend-price ratio. By simulating random shocks, time series of consumption growth and dividend growth are generated, based on which I solve the models in each period for the log equity premium, the log dividend-price ratio, and the log

dividend yield. The log dividend-price ratio is the difference between the log of dividends and the log of prices, and the log dividend yield is the difference between the log of dividends and the log of prices lagged by one period.⁵ (A more detailed description of the models and how to solve them is provided in Appendix A.) I denote the simulated log equity premium in period $t+1$ as $r_{M,t+1}$. I can then estimate the single-variable predictive regression that is generally used in the equity premium prediction literature and given in equation (1) with simulated data, where the simulated predictor $x_{M,t}$ is either the log dividend-price ratio or the log dividend yield:

$$r_{M,t+1} = \beta_{M,0} + \beta_{M,1} x_{M,t} + \epsilon_{M,t+1}, \quad \text{where } \epsilon_{M,t+1} \sim N(0, \sigma_{M,\epsilon}^2). \tag{12}$$

The OLS estimates of $\beta_M = [\beta_{M,0}, \beta_{M,1}]'$ and $\sigma_{M,\epsilon}^{-2}$ are denoted by $\beta^*$ and $\sigma_\epsilon^{*-2}$, which act as the prior means of the Gamma-Normal distribution described in Section 2.2. The predictive regression with simulated data in equation (12) is also estimated by the respective authors of the consumption-based asset pricing models, that is, Campbell and Cochrane (1999), Barberis et al. (2001), and Bansal and Yaron (2004), to assess the predictability of the equity premium implied by their proposed theories.

The model-based priors for my benchmark results are based on data simulated from the asset pricing models when using the same calibration as proposed by the authors in the respective published papers. The authors use almost identical calibration data sets, and thus, the comparison of the model-based priors’ performances should not be distorted. I assume that the investors have no uncertainty about the parameters of the asset pricing models, since the focus of this paper (as in, for example, Campbell and Thompson (2008) and Pettenuzzo et al. (2014)) is the investors’ uncertainty about the parameters of the predictive regression given in equation (1).
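As an illustration of Sections 2.2 to 2.4 taken together, the following Python sketch first obtains the prior means by OLS on a simulated series (equation (12)) and then runs a Gibbs sampler over the two conditional posteriors to produce the point forecast in equation (11). It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the burn-in length, and the number of draws are hypothetical; the $\beta$ step conditions on the current draw of $\sigma_\epsilon^{-2}$, and the posterior degrees of freedom are taken as the prior degrees of freedom $v(t-1)$ plus the sample size $t-1$, which is the textbook update in Koop (2003) and an assumed reading of equations (7) and (9). The simulated inputs must come from an asset pricing model solved elsewhere; none is implemented here.

```python
import numpy as np


def prior_means_from_simulation(r_sim, x_sim):
    """OLS of r_{M,t+1} on [1, x_{M,t}], as in equation (12).

    r_sim[k] and x_sim[k] are the simulated premium and predictor in period k.
    Returns (beta_star, prec_star): prior means for beta and sigma_eps^{-2}.
    """
    r_sim, x_sim = np.asarray(r_sim, float), np.asarray(x_sim, float)
    X = np.column_stack([np.ones(len(x_sim) - 1), x_sim[:-1]])
    y = r_sim[1:]
    beta_star, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_star
    return beta_star, (len(y) - 2) / (resid @ resid)


def gibbs_point_forecast(r, x, beta_star, prec_star, lam=1.0, v=0.1,
                         n_draws=2000, burn_in=500, seed=0):
    """Point forecast of r_{t+1} under the Gamma-Normal model-based prior.

    r holds r_2..r_t (n = t-1 values); x holds x_1..x_t (n+1 values).
    n_draws, burn_in and seed are illustrative choices (assumptions).
    """
    r, x = np.asarray(r, float), np.asarray(x, float)
    beta_star = np.asarray(beta_star, float)
    n = len(r)
    assert len(x) == n + 1, "x needs one more observation than r"
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(n), x[:n]])
    s2_r = r.var(ddof=1)          # equation (4): divides by t-2
    s2_x = x[:n].var(ddof=1)      # equation (5)
    # Inverse of the prior variance in equation (3)
    V0_inv = np.diag([1.0 / (lam * s2_r), s2_x / (lam * s2_r)])
    nu0 = v * n                   # prior degrees of freedom v(t-1)
    nu_bar = nu0 + n              # ASSUMPTION: Koop (2003) update
    XtX, XtR = X.T @ X, X.T @ r
    prec = prec_star              # start the chain at the prior mean
    draws = np.empty(n_draws)
    for j in range(burn_in + n_draws):
        # beta | sigma^{-2}: Normal, using the current precision draw
        V_bar = np.linalg.inv(V0_inv + prec * XtX)
        b_bar = V_bar @ (V0_inv @ beta_star + prec * XtR)
        beta = b_bar + np.linalg.cholesky(V_bar) @ rng.standard_normal(2)
        # sigma^{-2} | beta: Gamma with mean 1/s2_bar and nu_bar d.o.f.
        resid = r - X @ beta
        s2_bar = (resid @ resid + nu0 / prec_star) / nu_bar
        prec = rng.gamma(shape=nu_bar / 2.0, scale=2.0 / (nu_bar * s2_bar))
        if j >= burn_in:
            # One draw from the predictive distribution in equation (10)
            draws[j - burn_in] = (beta[0] + beta[1] * x[n]
                                  + rng.normal(0.0, prec ** -0.5))
    return draws.mean()           # equation (11)
```

Calling `gibbs_point_forecast` once per period on an expanding sample would give the recursive OOS forecasts of Section 2.1. Shrinking `lam` toward zero pulls the forecast toward the model-implied line $\beta_0^* + \beta_1^* x_t$, while a large `lam` lets the data dominate.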
⁵ The dividend-price ratio and the dividend yield are the only two predictors from the equity premium prediction literature that can be simulated from the three asset pricing models. Also, valuation ratios are the most prominent predictors of the equity premium prediction literature.

There is a concern that the results are affected by an overlap of the OOS period

with the sample period used by the authors to calibrate the asset pricing models. To address this concern, I show the robustness of the results in Section 5, by recalibrating the models with an empirical data sample that has no overlap with the OOS period.

The calibration of the models is described in Appendix A. For the HF model, the simulation is at a monthly frequency, and the quarterly and annual data are constructed by time-averaging the monthly data. The same procedure is used by Campbell and Cochrane (1999). The log equity premium is summed across the quarter (year). For the dividend-price ratio and the dividend yield, consumption and dividends are summed across the quarter (year) and the end-of-quarter (year) price is used. I simulate 120,000 months, estimate $\beta^*$ and $\sigma_\epsilon^{*2}$, and average the estimates over 10 iterations. The HF model has two specifications, and I use both to generate priors. The first specification (HF 1) assumes a perfect positive correlation between the log consumption and log dividend growth, and the second specification (HF 2) assumes that the correlation is imperfect and positive.

Similar to the HF model, the PT model is specified by Barberis et al. (2001) with perfect positive correlation between the log consumption and log dividend growth processes and with imperfect positive correlation between the two processes. I only use the latter specification, as it more successfully matches the empirical data moments. The authors calibrate the model with a range of parameter values for the investor’s sensitivity to financial wealth fluctuations ($b_0$) and the effect of prior losses on risk aversion ($k$). I generate priors from the parameterizations that set $b_0$ equal to 100 and $k$ equal to 3 (PT 1) and 8 (PT 2). Of the specifications proposed by Barberis et al. (2001), setting $b_0$ equal to 100 and $k$ equal to 8 generates a log equity premium that is closest to the empirical data moment. For $b_0$ equal to 100 and $k$ equal to 3, the generated log equity premium is lower, but the average loss aversion of the agent is 2.25, which is in line with experimental evidence. Following Barberis et al. (2001), I simulate the model at monthly, quarterly, and annual frequencies by

adjusting the model parameters accordingly.⁶

The LRR model, like the HF model, is simulated at a monthly frequency, and the quarterly and annual values are time-averaged.⁷ Bansal and Yaron (2004) use the same procedure to generate simulated data. Again, 120,000 months are simulated to estimate $\beta^*$ and $\sigma_\epsilon^{*2}$, and the estimates are averaged across 10 iterations. Bansal and Yaron (2004) present two specifications of their model: with and without time-varying volatility of consumption growth. Because the specification that accounts for time-varying volatility of consumption growth is substantially more successful at matching the empirical data moments, I generate priors only from this specification. However, as in Bansal and Yaron (2004), I consider two calibrations for the agent’s risk aversion to simulate the model: a risk aversion of 7.5 (LRR 1) and a risk aversion of 10 (LRR 2).

Panels A and B of Table 1 show $\beta^*$ and $\sigma_\epsilon^{*-2}$ estimated from simulated data of the three consumption-based asset pricing models. The table also reports the empirical estimates over the total sample from 1926 to 2014 for comparison.⁸ For all three asset pricing models, $\beta_1^*$ is positive for the dividend-price ratio and the dividend yield. Thus, high valuation ratios predict higher subsequent returns, which is in line with the empirical estimates. For both predictors and across all return frequencies, the coefficients of the LRR model are substantially lower than for the HF and PT models. The implication is that in the LRR model, the predictive power of valuation ratios is weak. Of the three models, the PT model generates the highest $\beta_0^*$ and $\beta_1^*$ for the dividend-price ratio.

⁶ For the monthly, quarterly, and annual frequencies, I simulate 120,000, 40,000, and 10,000 periods, and average the $\beta^*$ and $\sigma_\epsilon^{*2}$ estimates over 10 iterations.
⁷ I simulate the model based on the analytical solutions as done by Bansal et al. (2010 and 2012) and Beeler and Campbell (2012).
⁸ Adjusting the $\beta_1$ empirical estimates for the bias discussed by Stambaugh (1999) leads to annual, quarterly, and monthly estimates of 0.034, 0.011, and 0.002, respectively, for the log dividend-price ratio. For the dividend yield, the adjusted annual, quarterly, and monthly estimates are 0.079, 0.020, and 0.006, respectively. For the model-implied parameter estimates, a bias adjustment is not necessary, because the data is simulated, and thus, finite sample issues do not apply.

For the dividend yield, the $\beta_0^*$ and $\beta_1^*$ of the HF model are greater than the estimates of the other two models and the empirical

estimates. The pattern for $\sigma_\epsilon^{*-2}$ is more mixed. However, the values implied by the asset pricing models are close to the empirical values.

The weak implied predictability of the LRR model can also be seen in Panel C. Panel C reports the $R^2$ for the single-variable predictive regression in equation (12). The $R^2$ values for the LRR model are lower than for the HF and PT models and the empirical data. The predictability of the equity premium is strongest for the HF model, for which the $R^2$ is higher than for the empirical data across all frequencies and both predictors. For the PT model, the dividend-price ratio has considerable predictive power, but the $R^2$ values for the dividend yield are lower, consistent with the higher $\beta_1^*$ for the dividend-price ratio in Panel A. While the $R^2$ positively correlates with the magnitude of $\beta_1^*$, the $\beta_1^*$ of the HF model is lower than for the PT model for the dividend-price ratio, despite the $R^2$ being higher for the HF model. The reason for the smaller $\beta_1^*$ of the HF model is the more volatile simulated dividend-price ratio (shown in Appendix A).

2.5 Implied predictability of the asset pricing models

The reason for the weak implied predictability of the LRR prior relative to the HF and the PT priors lies in the different mechanisms of the three asset pricing models. In the HF model, time-variation of the dividend-price ratio is driven by a surplus consumption ratio that increases (decreases) with positive (negative) shocks to consumption. A positive shock to consumption makes the agent less risk averse, which causes asset prices to rise. The increase in the asset prices results in a lower dividend-price ratio, which predicts lower expected returns as the agent requires less compensation for risk. Hence, time-variation in the dividend-price ratio is driven by changes in the risk aversion of the agent, and these changes also affect expected returns. However, the expected dividend growth remains constant and does not affect the time-variation of the dividend-price ratio.

Similar to the HF model, the PT model generates a time-varying dividend-price

ratio through time-varying risk aversion and not changes in expected cash flows. Barberis et al. (2001) incorporate utility from fluctuations in financial wealth into a standard power utility function. Gains (losses) in financial wealth make the agent less (more) risk averse. Thus, a positive shock to dividends will lower the risk aversion of the agent, which results in a higher asset price and a lower dividend-price ratio. As the expected dividend growth remains constant, the price increase leads to lower expected returns. Dividend-price ratios and future returns are therefore positively related. In the LRR model, the agent is concerned about economic growth prospects and economic uncertainty. The key difference in terms of predictability between the LRR and the other two models is that the time-variation of the dividend-price ratio is partly driven by changes to expected dividend growth prospects. A positive shock to expected dividend growth leads to a lower dividend-price ratio that is followed by higher cash flows. This mechanism mitigates the predictive power of the dividend-price ratio that is generated by the economic uncertainty channel of the LRR model: because of the Epstein-Zin (see Epstein-Zin (1989)) preferences of the agents, a negative shock to time-varying economic uncertainty results in higher asset prices and lower dividend-price ratios, which reduces subsequent returns. Whether the changes in valuation ratios are driven by time-variation in the forecasts of dividend growth or time-variation in discount rates is a source of considerable debate (see, for example, Lettau and Ludvigson (2005), Bansal et al. (2007 and 2012), and Cochrane (2008)). The former leads to weak predictability of returns, while the latter implies strong predictability of returns. 
This paper uses the asset pricing models and the implied mechanisms for the time-variation in valuation ratios dogmatically and investigates which mechanism leads to more accurate forecasts through the model-based priors. The investors who use the model-based priors to forecast the equity premium have no uncertainty about the asset pricing parameters. The investors' uncertainty is about the parameters of the predictive regression in equation (1).

3 Results

In this section, I describe the data and report the OOS results when imposing model-based priors derived from asset pricing models on the single-variable predictive regressions.

3.1 Data

The empirical data on the equity premium and the predictors at a monthly, quarterly, and annual frequency are available on Amit Goyal's website (http://www.hec.unil.ch/agoyal/). The equity premium is computed as the log return on the S&P 500 index minus the log three-month U.S. Treasury bill rate. I set the start date of the time series at 1926, as high-quality return data on the S&P 500 from the Center for Research in Security Prices became available in 1926. The time series ends in 2014.

The availability of predictor variables that can be used to assess the performances of the model-based priors is restricted by the three asset pricing models. The predictor variables that can be simulated from the three models are the dividend-price ratio and the dividend yield. Dividends on the S&P 500 index are 12-month moving sums from 1926 to 2014. As for the data simulated from the asset pricing models, the dividend-price ratio is defined as the difference between log dividends and log prices, and the dividend yield is defined as the difference between log dividends and log prices lagged by one period.
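The variable definitions above translate directly into a few lines of arithmetic. The following Python sketch is illustrative only: the function names and the sample numbers are hypothetical, and it assumes that the index return and the Treasury bill rate are supplied as simple per-period rates, which the text does not specify.

```python
import math


def trailing_12m_sum(monthly_dividends):
    """12-month moving sum of dividends; None until 12 months are available."""
    sums = []
    for i in range(len(monthly_dividends)):
        if i < 11:
            sums.append(None)
        else:
            sums.append(sum(monthly_dividends[i - 11:i + 1]))
    return sums


def log_equity_premium(index_return, tbill_rate):
    """Log index return minus log risk-free rate.

    Assumption: both inputs are simple (not log) rates for the same period.
    """
    return math.log1p(index_return) - math.log1p(tbill_rate)


def log_dp_ratio(dividend_sum, price):
    """Log dividend-price ratio: log dividends minus log current price."""
    return math.log(dividend_sum) - math.log(price)


def log_dividend_yield(dividend_sum, lagged_price):
    """Log dividend yield: log dividends minus log price lagged one period."""
    return math.log(dividend_sum) - math.log(lagged_price)


# Hypothetical numbers, for illustration only.
dividends = [0.5] * 13
lagged_price, price = 100.0, 104.0
d12 = trailing_12m_sum(dividends)[-1]
dp = log_dp_ratio(d12, price)
dy = log_dividend_yield(d12, lagged_price)
premium = log_equity_premium(0.04, 0.01)
```

The two predictors differ only in the price used in the denominator, so they move closely together except after large one-period price changes.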

Table 1: Model-implied parameters

Panel A reports the coefficient estimates of the single-variable predictive regression of the log equity premium given in equation (1) for two predictors: the log dividend-price ratio and the log dividend yield. The empirical estimates are for the data sample from 1926 to 2014. The model-based estimates from the three asset pricing models HF, LRR, and PT are obtained through the Monte Carlo simulation procedure described in Section 2.4. These coefficient estimates are used as means of the Normal prior in equation (2). Panel B shows the inverse of the variance of the return innovation used for the Gamma prior in equation (2). Panel C reports the R2 (in percent) of the single-variable predictive regression.

Panel A: Coefficients (β)

                         Empirical          HF 1          HF 2         LRR 1         LRR 2          PT 1          PT 2
                         β0     β1    β0*    β1*    β0*    β1*    β0*    β1*    β0*    β1*    β0*    β1*    β0*    β1*
Log DP ratio
  Annual returns      0.282  0.066  0.635  0.195  0.614  0.186  0.040  0.004  0.073  0.011  0.738  0.246  0.869  0.321
  Quarterly returns   0.085  0.021  0.166  0.051  0.169  0.052  0.011  0.001  0.016  0.002  0.406  0.171  0.469  0.220
  Monthly returns     0.023  0.005  0.057  0.018  0.057  0.018  0.004  0.000  0.007  0.001  0.255  0.132  0.270  0.158
Log DY
  Annual returns      0.313  0.077  0.555  0.169  0.516  0.154  0.039  0.004  0.069  0.009  0.167  0.046  0.216  0.063
  Quarterly returns   0.080  0.019  0.093  0.051  0.092  0.049  0.009  0.000  0.016  0.002  0.078  0.025  0.135  0.054
  Monthly returns     0.027  0.007  0.056  0.017  0.056  0.017  0.003  0.000  0.007  0.001  0.022  0.005  0.046  0.019

Panel B: Inverse variance of return innovation ($\sigma_\varepsilon^{-2}$)

                     Empirical      HF 1      HF 2     LRR 1     LRR 2      PT 1      PT 2
Log DP ratio
  Annual returns        25.632    50.231    27.667    35.668    36.917    25.284    18.733
  Quarterly returns     89.139   173.088    99.737   143.934   147.964   125.520    88.708
  Monthly returns      333.634   507.683   293.951   429.903   441.194   499.265   394.022
Log DY
  Annual returns        25.777    48.400    26.656    35.588    36.566    24.481    17.607
  Quarterly returns     89.019   172.678   100.382   143.023   147.459   123.579    86.093
  Monthly returns      333.969   502.700   294.971   432.105   440.879   498.053   391.489

Panel C: Predictability (R2 in %)

                     Empirical      HF 1      HF 2     LRR 1     LRR 2      PT 1      PT 2
Log DP ratio
  Annual                 2.408    12.219     7.561     0.014     0.026     3.194     6.068
  Quarterly              0.819     3.244     2.142     0.004     0.007     1.174     2.787
  Monthly                0.200     1.100     0.733     0.001     0.001     0.423     0.925
Log DY
  Annual                 2.956     8.900     5.196     0.006     0.012     0.158     0.369
  Quarterly              0.686     3.053     1.962     0.001     0.003     0.052     0.268
  Monthly                0.300     1.062     0.703     0.000     0.002     0.002     0.038

3.2 Measuring forecast accuracy

I assess the performances of the model-based priors via the OOS $R^2$ (see, for example, Campbell and Thompson (2008)):

\[
R^2_{OOS} = 1 - \frac{\sum_{\tau=\underline{t}}^{T} (r_\tau - \hat{r}^m_\tau)^2}{\sum_{\tau=\underline{t}}^{T} (r_\tau - \hat{r}^h_\tau)^2}, \tag{13}
\]

where $\hat{r}^m_\tau$ is the equity premium forecast when imposing the model-based prior as given in equation (11); $\hat{r}^h_\tau$ is the prediction of the historical average model; and $\underline{t}$ and $T$ are the start and end dates, respectively, of the OOS forecast period. Thus, the $R^2_{OOS}$ assesses the forecast performance of the model-based prior relative to the non-predictability model, which assumes that the best forecast of the equity premium is its historical average, that is, $\beta_1$ being set equal to zero in equation (1). A positive $R^2_{OOS}$ means that the predictive regression has a lower sum of squared forecast errors than the historical average model.

3.3 Forecasting

I consider four sample periods for the OOS predictability exercise. First, I use the full sample from 1926 to 2014 and start the recursive OOS forecast in 1947. This starting point guarantees that a sufficient number of data points are available to estimate the predictive regression. Next, I analyze the subsample stability by splitting the 1947-2014 OOS forecast period in half and consider forecasts up to 1980 and forecasts starting in 1981. Last, I only use the postwar sample from 1947 to 2014, and the forecasts start in 1968.

Figure 1 shows the quarterly OOS forecasts of the log equity premium from 1947 to 2014 in the top panel, when the predictive regression in equation (1) is estimated via OLS. The valuation ratios predict a substantial time variation of the equity premium. The lower panel depicts the corresponding OLS coefficient estimates. Both predictors lost predictive power during the dot-com boom in the late 1990s, which led to the sharp drop in the coefficient estimates.

Figure 1: Empirical out-of-sample forecasts
[Two panels. Top panel: "Equity premium forecasts" (y-axis: equity premium; series: log DP ratio, log DY). Lower panel: "Coefficient estimates" (y-axis: intercept and slope estimates; series: β0 and β1 for the log DP ratio and the log DY). The x-axis is the date.]
The top panel shows the OOS quarterly log equity premium forecasts for two predictors: the log dividend-price ratio and the log dividend yield. The predictive regression in equation (1) is estimated recursively via OLS. The data sample starts in 1926 and the OOS period is from 1947 to 2014. The lower panel depicts the corresponding OLS coefficient estimates.

Table 2 shows the $R^2_{OOS}$ (in percent) results for all model-based priors for three return frequencies. Following the equity premium OOS forecasting literature, monthly, quarterly, and annual return frequencies are used. The “no prior” column reports the $R^2_{OOS}$ for the case in which the single-variable predictive regression in equation (1) is estimated via OLS. If the model-based prior leads to an increase in the $R^2_{OOS}$ relative to the OLS estimate, then the figure is in bold. The last column of the table shows the best-performing prior for the respective frequency, predictor, and time period. Whether the differences in forecast errors between the predictive regression, estimated via OLS or the model-based priors, and the historical average model are significant is tested with a Diebold-Mariano test (see Diebold and Mariano (1995)).

Overall, the model-based priors help to improve the forecast accuracy of the single-variable predictive regression relative to the OLS estimates. The gains in $R^2_{OOS}$ are considerable compared with the literature (see, for example, Campbell and Thompson (2008)). Out of the three asset pricing models, the priors derived

from the LRR model perform consistently better than the priors derived from the HF and PT models for three out of the four OOS periods. The LRR model-based priors yield less accurate forecasts than the other model-based priors only for the 1947-1980 OOS period. In most cases, the LRR model-based priors outperform the historical average model, which is shown by the positive $R^2_{OOS}$ values. The exception is the 1981-2014 OOS period, for which the LRR model-based priors improve the forecast accuracy the most relative to the OLS estimates but fail to beat the historical average model. The Diebold-Mariano test leads to statistically significant results only for the 1947-1980 OOS period, for which we can reject the historical average model at the quarterly and annual frequencies. The difficulty of statistically rejecting the historical average model when predicting the equity premium OOS is emphasized by Welch and Goyal (2008).

Table 3 compares the model-based priors by assessing their forecast errors against each other instead of comparing them to the forecast errors of the historical average model. The differences in the $R^2_{OOS}$ (in percent) between the best-performing prior and the other priors are reported for every return frequency, predictor, and sample period. To test whether the difference in forecast errors is statistically significant, I use a one-sided Diebold-Mariano test. Despite the difficulty of statistically rejecting OOS forecasting models of the equity premium (see, for example, Campbell and Thompson (2008) and Welch and Goyal (2008)), the differences are statistically significant in several cases. For the log dividend-price ratio, the hypothesis of equal predictive power of the model estimated with the LRR priors and the PT priors can be rejected for the majority of data samples. The differences between the $R^2_{OOS}$ of the LRR priors and the HF priors are generally smaller and, thus, significant in fewer cases. When the log dividend yield acts as the predictor, the results are not as pronounced as for the log dividend-price ratio, but the hypothesis of equal predictive power can be rejected particularly at the monthly frequency, where more data points are available and the power of the test is increased.
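The evaluation tools of Sections 3.2 and 3.3 can be sketched in a few lines of Python. The sketch below computes recursive one-step-ahead OLS forecasts, the $R^2_{OOS}$ of equation (13), and a one-sided Diebold-Mariano statistic. It is a simplified illustration, not the paper's implementation: the function names and the test data are hypothetical, the predictor is assumed to be already lagged so that x[t] is known when r[t] is forecast, and the Diebold-Mariano variance uses no autocorrelation correction, which is an assumption made here for one-step-ahead forecasts.

```python
import math


def ols_fit(x, y):
    """Intercept and slope of a single-variable OLS regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def recursive_forecasts(x, r, first):
    """Forecast r[t] for t >= first using only observations before t."""
    model, hist, actual = [], [], []
    for t in range(first, len(r)):
        b0, b1 = ols_fit(x[:t], r[:t])   # expanding estimation window
        model.append(b0 + b1 * x[t])     # predictive regression forecast
        hist.append(sum(r[:t]) / t)      # historical average forecast
        actual.append(r[t])
    return model, hist, actual


def r2_oos(actual, model, hist):
    """Equation (13): one minus the ratio of sums of squared errors."""
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, model))
    sse_hist = sum((a - f) ** 2 for a, f in zip(actual, hist))
    return 1.0 - sse_model / sse_hist


def dm_one_sided(actual, better, worse):
    """DM statistic and p-value for H1: 'better' has lower squared error.

    Assumption: no autocorrelation correction of the loss differential.
    """
    d = [(a - w) ** 2 - (a - b) ** 2 for a, b, w in zip(actual, better, worse)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((di - mean_d) ** 2 for di in d) / (n - 1)
    stat = mean_d / math.sqrt(var_d / n)
    p_value = 0.5 * math.erfc(stat / math.sqrt(2.0))  # upper normal tail
    return stat, p_value


# Hypothetical data: the premium is exactly linear in the lagged predictor.
x = [math.sin(i) for i in range(40)]
r = [0.01 + 0.5 * xi for xi in x]
model, hist, actual = recursive_forecasts(x, r, first=20)
r2 = r2_oos(actual, model, hist)
stat, p_value = dm_one_sided(actual, better=model, worse=hist)
```

In this artificial example the predictive regression is exactly right, so the $R^2_{OOS}$ is close to one. With actual equity premium data the values are much smaller, as Table 2 shows.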

Table 2: Model-based prior forecast performance

Panels A, B, and C show the OOS performance of the model-based priors derived from the three consumption-based asset pricing models: HF, LRR, and PT. The priors are imposed on the single-variable predictive regression given in equation (1). Reported is the $R^2_{OOS}$ (in percent) from equation (13), which measures the accuracy of OOS log equity premium forecasts of the single-variable predictive regression relative to the historical average model. The predictors are the log dividend-price ratio and the log dividend yield. If the model-based prior leads to an increase in the $R^2_{OOS}$ relative to the OLS forecast in the “no prior” column, the figure is in bold. The last column denotes which model-based prior leads to the greatest improvement in forecast accuracy. The statistical significance of the difference between the forecast errors of the historical average model and the predictive regression is tested with a Diebold-Mariano test (see Diebold and Mariano (1995)). A significant test statistic is denoted by ** at the 5 percent level and by * at the 10 percent level.

Panel A: Annual returns

Sample  OOS
start   period     Predictor     No prior      HF 1      HF 2     LRR 1     LRR 2      PT 1      PT 2  Best prior
1926    1947-2014  Log DP ratio     0.396    -1.209    -1.292     1.137     0.607    -4.045    -7.579  LRR 1
                   Log DY         -16.280    -4.923    -3.060     0.309     0.443     0.837     0.788  PT 1
                   Average         -7.942    -3.066    -2.176     0.723     0.525    -1.604    -3.396  LRR 1
1926    1947-1980  Log DP ratio    11.895*   15.342*   14.727**   3.927*    5.083*   16.355*   16.564  PT 2
                   Log DY         -10.255    11.488    11.530     4.417     5.449     6.698     8.389  HF 2
                   Average          0.820    13.415    13.129     4.172     5.266    11.526    12.476  HF 1
1926    1981-2014  Log DP ratio   -12.242   -18.486   -18.681    -3.027    -2.689   -25.224   -34.249  LRR 2
                   Log DY         -22.902   -21.777   -19.763    -4.760    -4.257    -6.115    -8.073  LRR 2
                   Average        -17.572   -20.132   -19.222    -3.893    -3.473   -15.670   -21.161  LRR 2
1947    1968-2014  Log DP ratio    -3.521    -1.831    -1.606     2.237     2.313    -4.978    -9.168  LRR 2
                   Log DY          -1.334    -0.069    -0.001     0.785     1.070     1.899     2.345  PT 2
                   Average         -2.428    -0.950    -0.803     1.511     1.691    -1.539    -3.411  LRR 2

Panel B: Quarterly returns

Sample  OOS
start   period     Predictor     No prior      HF 1      HF 2     LRR 1     LRR 2      PT 1      PT 2  Best prior
1926    1947-2014  Log DP ratio    -0.815    -0.607    -1.020     0.246     0.052    -5.032    -7.551  LRR 1
                   Log DY           0.340     0.877     0.642     0.834     0.859     0.740     0.416  HF 1
                   Average         -0.238     0.135    -0.189     0.540     0.456    -2.146    -3.567  LRR 1
1926    1947-1980  Log DP ratio     3.753     4.110     3.680     3.334     3.375     1.752     0.306  HF 1
                   Log DY           4.740*    4.889**   4.253*    3.133**   3.621**   4.458**   5.035* PT 2
                   Average          4.247     4.499     3.967     3.233     3.498     3.105     2.670  HF 1
1926    1981-2014  Log DP ratio    -4.727    -4.995    -5.074    -2.882    -2.448   -11.334   -14.032* LRR 2
                   Log DY          -3.429    -3.031    -2.646    -1.706    -1.892    -2.442    -3.225  LRR 1
                   Average         -4.078    -4.013    -3.860    -2.294    -2.170    -6.888    -8.628  LRR 2
1947    1968-2014  Log DP ratio    -0.391    -0.541    -0.542     0.640     0.619    -5.060    -7.135  LRR 1
                   Log DY          -0.297     0.495     0.439     0.702     0.830     0.606     0.075  LRR 2
                   Average         -0.344    -0.023    -0.051     0.671     0.724    -2.227    -3.530  LRR 2

Panel C: Monthly returns

Sample  OOS
start   period     Predictor     No prior      HF 1      HF 2     LRR 1     LRR 2      PT 1      PT 2  Best prior
1926    1947-2014  Log DP ratio    -0.061    -0.254    -0.075     0.092     0.063    -1.680    -1.656  LRR 1
                   Log DY          -0.393    -0.300    -0.310    -0.211    -0.267    -0.356    -0.435  LRR 1
                   Average         -0.227    -0.277    -0.192    -0.059    -0.102    -1.018    -1.046  LRR 1
1926    1947-1980  Log DP ratio     1.264     1.109     1.204     1.080     0.777     0.740     0.182  HF 2
                   Log DY           1.175     1.149     1.238     1.014     1.046     1.026     1.217  HF 2
                   Average          1.219     1.129     1.221     1.047     0.911     0.883     0.699  HF 2
1926    1981-2014  Log DP ratio    -1.142    -1.369    -0.996    -0.937    -1.121    -3.268    -3.450  LRR 1
                   Log DY          -1.672    -1.944    -1.810    -1.236    -1.272    -1.401    -1.791  LRR 1
                   Average         -1.407    -1.657    -1.403    -1.087    -1.197    -2.335    -2.620  LRR 1
1947    1968-2014  Log DP ratio    -0.237    -0.357    -0.093     0.126     0.014    -2.483    -2.784  LRR 1
                   Log DY          -0.166    -0.253    -0.129    -0.087     0.110     0.153    -0.038  PT 1
                   Average         -0.202    -0.305    -0.111     0.019     0.062    -1.165    -1.411  LRR 2

Section 4 shows that even small and statistically insignificant differences in $R^2_{OOS}$ can lead to substantial utility gains for an investor with mean-variance preferences.

The strong performance of the LRR prior can be explained by the low model-implied predictability. In Table 1, $\beta_0^*$ and $\beta_1^*$ are lower for all three frequencies and both predictors compared with the empirical estimates and with $\beta_0^*$ and $\beta_1^*$ of the HF and PT models. Thus, imposing the LRR prior pushes the posterior estimates of $\beta_0$ and $\beta_1$ down. Figure 2 shows the OLS estimates (that is, no prior is imposed on the predictive regression) and the posterior estimates for the log dividend-price ratio and quarterly returns for the 1968-2014 OOS period. The LRR 1 posterior estimates are substantially lower than the OLS estimates and the posterior estimates of the HF 1 and PT 1 models. However, the model-based priors derived from the HF 1 model lead to posterior estimates that are similar to the OLS estimates. The model-based priors from the PT 1 model push the posterior estimates for both coefficients higher than they are when ignoring any prior and simply relying on OLS estimates.

Figure 3 shows that the lower posterior estimates achieved through the LRR 1 prior are beneficial for an investor. The top panel depicts the difference between the cumulative sum of squared errors (SSE) of the historical average model and the single-variable predictive regression estimated via OLS or via model-based priors. I subtract the cumulative SSE of the predictive regression from the cumulative SSE of the historical average model. Hence, a positive value implies that the predictive regression outperforms the historical average model. Until the beginning of the 1990s, the predictive regression performs better than the historical average model regardless of the estimation method.
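As a small illustration of the quantity plotted in the top panel of Figure 3, the following Python sketch accumulates the difference in squared errors described above. The function name and the three data points are hypothetical.

```python
def cumulative_sse_difference(actual, model, hist):
    """Running sum of squared historical-average errors minus model errors.

    A positive value means the predictive regression has outperformed the
    historical average model up to that date.
    """
    path, total = [], 0.0
    for a, m, h in zip(actual, model, hist):
        total += (a - h) ** 2 - (a - m) ** 2
        path.append(total)
    return path


# Hypothetical realized premia and forecasts, for illustration only.
actual = [0.02, -0.01, 0.03]
model = [0.01, 0.00, -0.02]    # the last forecast is strongly negative
hist = [0.015, 0.015, 0.015]
path = cumulative_sse_difference(actual, model, hist)
```

In this example the path is positive after the second period and turns negative after the third, when a strongly negative forecast meets a positive realized premium. The last element of the path, divided by the total SSE of the historical average model, equals the $R^2_{OOS}$ of equation (13).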
The highest cumulative SSE difference is achieved for an investor who relies on the priors of the PT 1 model, which is due to the strong predictive power of the log dividend-price ratio implied by the PT 1 model. In the 1970s, valuation ratios had strong predictive power, and the PT 1 model makes the investor rely on this predictive power to a higher degree than an investor who uses the HF 1 or LRR 1 model to form her priors. The LRR 1 prior leads to the

Table 3: Comparison of the model-based prior forecast performances

Reported are the differences in the $R^2_{OOS}$ (in percent), where the $R^2_{OOS}$ is given in equation (13) and measures the accuracy of OOS log equity premium forecasts of a single-variable predictive regression, given in equation (1), relative to the historical average model. The predictive regression is estimated via model-based priors derived from the three consumption-based asset pricing models: HF, LRR, and PT. The predictors are the log dividend-price ratio and the log dividend yield. For each frequency, predictor, and sample period, the $R^2_{OOS}$ of each prior is subtracted from the $R^2_{OOS}$ of the best-performing prior. The statistical significance of the difference in forecast errors is tested with a one-sided Diebold-Mariano test (see Diebold and Mariano (1995)). A significant test statistic is denoted by ** at the 5 percent level and by * at the 10 percent level. The best-performing prior is reported in the fourth column.

Panel A: Annual returns

Sample  OOS
start   period     Predictor     Best prior      HF 1      HF 2     LRR 1     LRR 2      PT 1      PT 2
1926    1947-2014  Log DP ratio  LRR 1          2.346     2.429               0.530     5.181     8.716
                   Log DY        PT 1           5.760     3.897     0.528     0.394               0.048
1926    1947-1980  Log DP ratio  PT 2           1.222     1.837    12.637*   11.480*    0.209
                   Log DY        HF 2           0.042               7.113     6.082     4.833     3.141
1926    1981-2014  Log DP ratio  LRR 2         15.797    15.992     0.338              22.535    31.560*
                   Log DY        LRR 2         17.521*   15.507*    0.503               1.859     3.817
1947    1968-2014  Log DP ratio  LRR 2          4.144     3.918     0.076               7.291    11.480
                   Log DY        PT 2           2.415     2.346     1.560     1.275     0.446

Panel B: Quarterly returns

Sample  OOS
start   period     Predictor     Best prior      HF 1      HF 2     LRR 1     LRR 2      PT 1      PT 2
1926    1947-2014  Log DP ratio  LRR 1          0.853     1.266               0.194     5.278**   7.797**
                   Log DY        HF 1                     0.235     0.331     0.436     0.137     0.461
1926    1947-1980  Log DP ratio  HF 1                     0.430     0.776     0.735     2.358     3.804
                   Log DY        PT 2           0.146     0.782*    1.902*    1.414     0.577
1926    1981-2014  Log DP ratio  LRR 2          2.547**   2.626**   0.434               8.886**  11.584**
                   Log DY        LRR 1          1.325*    0.940               0.186     0.736     1.518
1947    1968-2014  Log DP ratio  LRR 1          1.181     1.182               0.021     5.700*    7.774*
                   Log DY        LRR 2          0.334     0.391     0.127               0.224     0.755

Panel C: Monthly returns

Sample  OOS
start   period     Predictor     Best prior      HF 1      HF 2     LRR 1     LRR 2      PT 1      PT 2
1926    1947-2014  Log DP ratio  LRR 1          0.347*    0.167               0.029     1.773**   1.749**
                   Log DY        LRR 1          0.089     0.099               0.057     0.146     0.224
1926    1947-1980  Log DP ratio  HF 2           0.096               0.124     0.427     0.464     1.023
                   Log DY        HF 2           0.088               0.224     0.192     0.212     0.021
1926    1981-2014  Log DP ratio  LRR 1          0.432*    0.059               0.184     2.331**   2.512**
                   Log DY        LRR 1          0.707**   0.574**             0.036     0.165     0.555**
1947    1968-2014  Log DP ratio  LRR 1          0.483**   0.218               0.112     2.608**   2.910**
                   Log DY        PT 1           0.406**   0.283*    0.240*    0.043               0.191*

Figure 2: Posterior out-of-sample coefficient estimates
[Three panels: "Coefficient estimates: Habit Formation", "Coefficient estimates: Long Run Risk", and "Coefficient estimates: Prospect Theory". In each panel the y-axis is the intercept/slope estimate and the x-axis is the date; the series are β0 and β1 with no prior and the corresponding posterior estimates under the HF 1, LRR 1, or PT 1 prior.]
This figure shows the OLS estimates and the posterior estimates of $\beta_0$ and $\beta_1$ for the OOS period from 1968 to 2014 for the HF 1, LRR 1, and PT 1 priors. The predictor is the log dividend-price ratio, and the predictive regression in equation (1) is estimated with quarterly data.

lowest cumulative SSE value until 1994. However, during the dot-com boom from 1994 to 1999, the predictive power of the log dividend-price ratio collapses and the cumulative SSE difference of the predictive regression turns negative for all four estimation methods. The investor armed with the LRR 1 model is able to limit poor forecasts, as her belief in the predictive ability of valuation ratios is qualified because of her prior.

The lower panel of Figure 3 provides further detail. The equity premium forecasts for the OOS period from 1968 to 2014 are depicted. The posterior point forecasts given in equation (11) of the LRR 1 model are close to zero during the dot-com boom. The other two model-based priors and the OLS estimates result in strongly negative forecasts. Hence, an investor relying on these forecasts to time the market suffers losses during this bull market period.

Some papers in the equity premium prediction literature restrict the model estimates to yield only non-negative predictions of the equity premium (see, for example, Campbell and Thompson (2008) and Pettenuzzo et al. (2014)). Such a restriction will lead to a result that is similar to imposing the LRR prior. However, an investor who derives a prior belief about the predictability of the equity premium through data simulated from the asset pricing models would conclude that valuation ratios can forecast a negative equity premium. Particularly for the HF and PT models, negative forecasts occur frequently when estimating the predictive regression solely with simulated data (as in equation (12)). For simulated data from the LRR model, negative forecasts of the equity premium are not as frequent, as less weight is placed on the predictor variable: $\beta_1$ is small.

Figure 4 shows the simulated posterior density of the quarterly log equity premium prediction given in equation (10) for the third quarter in 1998. The predictor is the log dividend-price ratio, and the model-based priors are the same as in Figures 2 and 3. The densities are simulated with 10,000 draws. For all three model-based priors, the posterior densities are similarly shaped and approximate a Normal distribution. Hence, there are no substantial differences in terms of the risk that the

Figure 3: Forecasts of model-based priors
[Two panels. Top panel: "Cumulative sum of squared errors differences" (y-axis: cumulative SSE difference). Lower panel: "Equity premium forecasts" (y-axis: equity premium). In both panels the x-axis is the date and the series are: no prior, HF 1 prior, LRR 1 prior, and PT 1 prior.]
The top panel shows the differences in the cumulative sum of squared errors (SSE) between the log equity premium forecasts of the historical average model and the predictive regression given in equation (1) estimated via OLS or model-based prior. The model-based priors are derived from the HF 1, LRR 1, and PT 1 models. The cumulative SSE of the predictive regression is subtracted from the cumulative SSE of the historical average model. The OOS period is from 1968 to 2014 and the forecasts are at a quarterly frequency. The predictor is the log dividend-price ratio. The lower panel depicts the point forecasts of the equity premium of the predictive regression for the OLS estimates and the model-based priors. For the model-based priors, the posterior point forecasts are given in equation (11).

predictive densities imply. However, when imposing the LRR 1 prior, the density is furthest to the right, corresponding to an equity premium forecast that is greater than the forecast of the other two model-based priors. These posterior densities are in line with the predictions during the dot-com boom shown in the lower panel of Figure 3. Figure 5 shows the corresponding posterior densities of $\beta_0$ and $\beta_1$ given in equation (6). Like the predictive density of the log equity premium, the coefficient densities approximate Normal distributions. For both coefficients, the LRR 1 prior results in posterior densities that are centered to the left of those of the HF 1 and PT 1 priors, consistent with the higher posterior mean of the equity premium predictive density shown in Figure 4. Hence, in the third quarter of 1998 at the height of the dot-com boom, when the dividend-price ratio was low, an investor who believes in the HF 1 or PT 1 model expects a negative equity premium to materialize in the next period. However, an investor whose prior beliefs are in line with the LRR 1 model is more hesitant to draw this conclusion. This finding is related to Wachter and Warusawitharana (2009), who show that an investor who is skeptical about the predictive power of the dividend-price ratio and the yield spread performs better when forecasting the equity premium OOS.

My paper focuses on how model-based priors from different consumption-based asset pricing models perform without investigating the reasons behind empirical fluctuations in valuation ratios. However, several papers analyze drivers of the high equity prices during the 1990s. Some researchers propose that an increase in stock market participation and diversification is at least partially responsible for the higher equity prices (see, for example, Heaton and Lucas (1999)). Lettau, Ludvigson, and Wachter (2008) estimate through a regime-switching model that the low dividend-price ratios during the 1990s are driven by a shift to substantially lower consumption volatility. This decrease in macroeconomic risk led to a lower expected equity premium. Lettau and Van Nieuwerburgh (2008) allow for shifts in the steady state of the economy and find a structural break in the 1990s for the dividend-price

ratio. The authors show that while this structural break can be detected in-sample, an investor could not have exploited it OOS. Based on a VAR framework, Campbell, Giglio, and Polk (2013) find that during the dot-com boom, the discount rates of investors were at a historically low level, but the boom preceding the financial crisis of 2007-2009 was caused by positive cash flow news.

Figure 4: Predictive posterior density of the equity premium
[One panel. The x-axis is the equity premium prediction and the y-axis is the probability density; the series are: HF 1 prior, LRR 1 prior, and PT 1 prior.]
This figure shows the simulated posterior density of the quarterly log equity premium prediction given in equation (10) for the third quarter in 1998 for three model-based priors: HF 1, LRR 1, and PT 1. The predictor is the log dividend-price ratio. Data from the first quarter in 1947 to the second quarter in 1998 are used to estimate the predictive regression. The densities are simulated with 10,000 draws.

4 Utility of an investor

So far, I have analyzed how priors derived from the three consumption-based asset pricing models affect the forecast accuracy of single-variable predictive regressions. However, investors are ultimately concerned about utility, and thus, we need to compute differences in utility gains when comparing the model-based priors. Further, comparing utility gains takes the investors' risk aversion into account.

Figure 5: Posterior density of coefficients
[Two panels: "Simulated posterior densities of β0" and "Simulated posterior densities of β1". In each panel the x-axis is the coefficient value and the y-axis is the probability density; the series are: HF 1 prior, LRR 1 prior, and PT 1 prior.]
This figure shows the simulated posterior density of the coefficients $\beta_0$ and $\beta_1$ given in equation (6) for the third quarter in 1998 for three model-based priors: HF 1, LRR 1, and PT 1. The predictor is the log dividend-price ratio. Quarterly data from the first quarter in 1947 to the second quarter in 1998 are used to estimate the predictive regression. The densities are simulated with 10,000 draws.

The Bayesian technique that I use to impose the economic constraints provides the full predictive density of the equity premium. Based on the mean and the variance of the predictive density, I can compute the portfolio allocation and utility gains of an investor with mean-variance preferences (see, for example, Campbell and Thompson (2008) and Wachter and Warusawitharana (2009)). The utility gains of an investor achieved through the model-based priors will also give us an estimate of how much an investor would be willing to pay to know the theory of one consumption-based asset pricing model over another.

4.1 Asset allocation

An investor is assumed to have mean-variance preferences, and she chooses portfolio weights for a risky asset and a risk-free asset. The return on the risky asset is the log

equity premium $r_{t+1}$ plus the log risk-free return $r_{f,t}$, and the risk-free asset yields $r_{f,t}$. At time $t$, the investor solves the following maximization problem

\[
\max_{\alpha_t} \; E_t[W_{t+1}] - \gamma \frac{1}{2} \mathrm{Var}_t[W_{t+1}], \tag{14}
\]

subject to

\[
W_{t+1} = \alpha_t \exp(r_{t+1} + r_{f,t}) + (1 - \alpha_t) \exp(r_{f,t}), \tag{15}
\]

where $\alpha_t$ is the portfolio share of the risky asset, and $\gamma$ is the risk aversion of the investor. The solution to the maximization problem is

\[
\alpha_t^* = \frac{E_t[\exp(r_{t+1} + r_{f,t}) - \exp(r_{f,t})]}{\gamma \, \mathrm{Var}_t[\exp(r_{t+1})]}. \tag{16}
\]

For an investor who imposes model-based priors on the predictive regression to forecast the equity premium, we can use the mean and the variance of the sampled predictive density of $r_{t+1}$ given in equation (10) to approximate $\alpha_t^*$. The optimal risky asset portfolio share based on the model-based prior forecasts is denoted $\hat{\alpha}^*_{t,m}$. Based on $\hat{\alpha}^*_{t,m}$ and the realized equity premium, the realized wealth can be computed:

\[
\widehat{W}_{t+1,m} = \hat{\alpha}^*_{t,m} \exp(r_{t+1} + r_{f,t}) + (1 - \hat{\alpha}^*_{t,m}) \exp(r_{f,t}). \tag{17}
\]

Solving for $\hat{\alpha}^*_{t,m}$ for $t = \underline{t} - 1, \ldots, T - 1$ results in a sequence $\{\widehat{W}_{t,m}\}_{t=\underline{t}}^{T}$. The realized utility over the total OOS sample period is then given by

\[
\widehat{U}_m = \overline{W}_m - \gamma \frac{1}{2} \frac{1}{T - \underline{t}} \sum_{\tau=\underline{t}}^{T} (\widehat{W}_{\tau,m} - \overline{W}_m)^2, \tag{18}
\]

where $\overline{W}_m = \frac{1}{T - (\underline{t} - 1)} \sum_{\tau=\underline{t}}^{T} \widehat{W}_{\tau,m}$.

When estimating the realized utility of portfolios N and A, a certainty equivalent return (CER) can be computed. The CER is defined as a constant return that, when added to the portfolio return of portfolio N, equates the realized utility of portfolios

N and A. The CER is given by

\[
CER = \widehat{U}_A - \widehat{U}_N. \tag{19}
\]

A more intuitive interpretation of the CER is a transaction cost or a management fee that the investor is willing to pay each period to have access to the equity premium forecasts used for portfolio A. For example, when portfolio N uses the model-based prior from the HF model and portfolio A uses the model-based prior from the LRR model, then the CER tells us how much the investor would be willing to pay each period to have access to the LRR model instead of the HF model.

4.2 Utility results

I compute the CER given in equation (19) for each return frequency, predictor, and OOS period. The share of the risky asset for portfolio A is computed based on the predictions of the predictive regression when imposing the model-based prior which results in the highest utility for the investor. The share of the risky asset for portfolio N is computed based on the predictions when imposing one of the remaining model-based priors, respectively. The results are shown in Table 4, which is structured like Table 3 but with the $R^2_{OOS}$ figures replaced with the annualized CERs. The risk aversion parameter $\gamma$ is set equal to 5.

The CER results are even more favorable to the LRR prior than the $R^2_{OOS}$ results reported in Tables 2 and 3: an investor who derives her prior belief about the predictability of the equity premium from the LRR model performs consistently the best for three out of the four sample periods across all frequencies and both predictors. The only OOS period during which the HF and PT priors dominate is from 1947 to 1980. The CERs are economically significant with the maximum value being 4.46%.

Panel D averages the CER for each prior pair across all frequencies, predictors, and sample periods. These values show how much on average an investor would be willing to pay to have access to the model-based priors derived from the consumption-based asset pricing model in the top row instead of any of the remaining five model-based priors. For the LRR 1 prior, all the CER values are positive, which implies that an investor who uses the LRR 1 prior to predict the equity premium and allocates her portfolio according to these predictions achieves the highest average utility. The LRR 2 model is a close second with positive average CER values against all model-based priors except the LRR 1 prior. The average CER values are economically meaningful. Investors who rely on the HF 1 or HF 2 priors would pay between 20 and 30 basis points per year to have access to the LRR priors. The investors who derive their prior beliefs about the predictability of the equity premium from the PT model would on average need an additional 60 to 75 basis points per year to achieve the utility level of the investor who uses the LRR priors.

Figure 6 shows the risky asset share of the portfolio given in equation (16) for the HF 2, LRR 2, and PT 2 priors. The forecasts are at an annual frequency, and the OOS period is from 1947 to 2014. The top panel shows the risky asset share when the log dividend-price ratio is used as the predictor. For the bottom panel, the log dividend yield is the predictor. Generally, the LRR prior leads to a more stable portfolio share of the risky asset, which is due to the low predictability implied by the LRR model. The greatest difference between the priors is again during the bull market of the late 1990s. For the dividend-price ratio, the HF 2 and the PT 2 investors short the risky asset during this period, because they expect low valuation ratios to predict strongly negative returns.
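The portfolio shares in Figure 6 and the CERs in Table 4 follow from equations (16) to (19). A minimal Python sketch of those steps is given below. It is an illustration under stated assumptions rather than the paper's code: the function names and numbers are hypothetical, the moments in equation (16) are approximated by sample moments of predictive draws, and the variance term is taken as printed in equation (16).

```python
import math


def risky_share(premium_draws, rf, gamma):
    """Approximate the risky share of equation (16) from predictive draws."""
    n = len(premium_draws)
    gross = [math.exp(r) for r in premium_draws]
    mean_gross = sum(gross) / n
    var_gross = sum((g - mean_gross) ** 2 for g in gross) / (n - 1)
    # E[exp(r + rf) - exp(rf)] = exp(rf) * (E[exp(r)] - 1)
    expected_excess = math.exp(rf) * (mean_gross - 1.0)
    # Denominator uses Var[exp(r)], as printed in equation (16).
    return expected_excess / (gamma * var_gross)


def realized_wealth(alpha, premium, rf):
    """Equation (17): end-of-period wealth per unit invested."""
    return alpha * math.exp(premium + rf) + (1.0 - alpha) * math.exp(rf)


def realized_utility(wealth, gamma):
    """Equation (18): mean wealth minus gamma/2 times its sample variance."""
    n = len(wealth)
    mean_w = sum(wealth) / n
    var_w = sum((w - mean_w) ** 2 for w in wealth) / (n - 1)
    return mean_w - 0.5 * gamma * var_w


def cer(wealth_a, wealth_n, gamma):
    """Equation (19): utility of portfolio A minus utility of portfolio N."""
    return realized_utility(wealth_a, gamma) - realized_utility(wealth_n, gamma)


# Hypothetical inputs, for illustration only.
draws = [-0.06, 0.00, 0.04, 0.10]       # predictive draws of the log premium
alpha = risky_share(draws, rf=0.01, gamma=5.0)
realized = (0.03, -0.02, 0.05)          # realized log equity premia
wealth_a = [realized_wealth(0.8, r, 0.01) for r in realized]   # long investor
wealth_n = [realized_wealth(-0.2, r, 0.01) for r in realized]  # short investor
gain = cer(wealth_a, wealth_n, gamma=5.0)
```

In this made-up example the investor who stays long in a period of mostly positive realized premia ends up with the higher realized utility, so the CER of portfolio A over portfolio N is positive.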
However, the investor with prior beliefs derived from the LRR 2 model is skeptical about the predictive power of the low valuation ratios and maintains a positive weight on the risky asset. The bottom panel is similar to the top panel with the difference being that the PT 2 investor is more bullish during the bull market of the late 1990s when predicting with the dividend yield. This difference is explained by the prior means of the

Table 4: Economic performance of the model-based priors

Panels A, B, and C report the annualized CER (in percent) given in equation (19). The CER can be interpreted as a management fee that an investor with mean-variance utility and a risk aversion of 5 is willing to pay each year to have access to the equity premium forecasts which result from imposing the best-performing prior on the single-variable predictive regression given in equation (1) instead of the equity premium forecasts based on one of the remaining five priors. The best-performing prior is reported in the “best prior” column; its own column is left empty. The predictors are the dividend-price ratio and the dividend yield. Panel D reports the average CER across all frequencies, predictors, and sample periods, for each prior pair.

Panel A: Annual returns

| Sample start | OOS period | Predictor | Best prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|---|---|---|
| 1926 | 1947-2014 | Log DP ratio | LRR 1 | 0.375 | 0.349 | | 0.011 | 0.536 | 0.768 |
| 1926 | 1947-2014 | Log DY | LRR 2 | 0.402 | 0.361 | 0.012 | | 0.078 | 0.122 |
| 1926 | 1947-1980 | Log DP ratio | HF 1 | | 0.049 | 0.286 | 0.219 | 0.001 | 0.019 |
| 1926 | 1947-1980 | Log DY | HF 2 | 0.022 | | 0.242 | 0.223 | 0.176 | 0.173 |
| 1926 | 1981-2014 | Log DP ratio | LRR 1 | 1.028 | 0.937 | | 0.062 | 1.315 | 1.716 |
| 1926 | 1981-2014 | Log DY | LRR 1 | 1.144 | 0.902 | | 0.059 | 0.175 | 0.255 |
| 1947 | 1968-2014 | Log DP ratio | LRR 2 | 0.857 | 0.818 | 0.012 | | 1.235 | 1.675 |
| 1947 | 1968-2014 | Log DY | LRR 2 | 0.465 | 0.338 | 0.079 | | 0.020 | 0.026 |

Panel B: Quarterly returns

| Sample start | OOS period | Predictor | Best prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|---|---|---|
| 1926 | 1947-2014 | Log DP ratio | LRR 1 | 0.205 | 0.172 | | 0.001 | 0.671 | 0.862 |
| 1926 | 1947-2014 | Log DY | LRR 1 | 0.134 | 0.096 | | 0.051 | 0.051 | 0.129 |
| 1926 | 1947-1980 | Log DP ratio | PT 1 | 0.015 | 0.049 | 0.177 | 0.196 | | 0.044 |
| 1926 | 1947-1980 | Log DY | HF 1 | | 0.102 | 0.216 | 0.171 | 0.063 | 0.053 |
| 1926 | 1981-2014 | Log DP ratio | LRR 2 | 0.474 | 0.485 | 0.027 | | 1.538 | 1.803 |
| 1926 | 1981-2014 | Log DY | LRR 2 | 0.187 | 0.221 | 0.011 | | 0.030 | 0.281 |
| 1947 | 1968-2014 | Log DP ratio | LRR 1 | 0.702 | 0.592 | | 0.110 | 3.170 | 3.791 |
| 1947 | 1968-2014 | Log DY | LRR 1 | 0.427 | 0.203 | | 0.025 | 0.221 | 0.325 |

Panel C: Monthly returns

| Sample start | OOS period | Predictor | Best prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|---|---|---|
| 1926 | 1947-2014 | Log DP ratio | LRR 1 | 0.138 | 0.097 | | 0.027 | 0.619 | 0.771 |
| 1926 | 1947-2014 | Log DY | LRR 1 | 0.153 | 0.166 | | 0.159 | 0.074 | 0.101 |
| 1926 | 1947-1980 | Log DP ratio | PT 1 | 0.104 | 0.051 | 0.084 | 0.212 | | 0.034 |
| 1926 | 1947-1980 | Log DY | PT 2 | 0.016 | 0.121 | 0.070 | 0.170 | 0.015 | |
| 1926 | 1981-2014 | Log DP ratio | LRR 2 | 0.297 | 0.188 | 0.159 | | 1.374 | 1.382 |
| 1926 | 1981-2014 | Log DY | LRR 2 | 0.150 | 0.410 | 0.075 | | 0.197 | 0.374 |
| 1947 | 1968-2014 | Log DP ratio | LRR 1 | 0.477 | 0.136 | | 0.072 | 4.443 | 4.459 |
| 1947 | 1968-2014 | Log DY | PT 2 | 0.616 | 0.419 | 0.044 | 0.003 | 0.266 | |

Panel D: Average CER

| Comparison prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|
| HF 1 | | 0.047 | 0.287 | 0.275 | -0.323 | -0.443 |
| HF 2 | -0.047 | | 0.240 | 0.229 | -0.370 | -0.490 |
| LRR 1 | -0.287 | -0.240 | | -0.012 | -0.611 | -0.731 |
| LRR 2 | -0.275 | -0.229 | 0.012 | | -0.599 | -0.719 |
| PT 1 | 0.323 | 0.370 | 0.611 | 0.599 | | -0.120 |
| PT 2 | 0.443 | 0.490 | 0.731 | 0.719 | 0.120 | |

predictive regression coefficients reported in Panel A of Table 1. The model-implied parameters of PT 2 for the dividend yield are smaller than for the dividend-price ratio, which makes the investor more hesitant to believe that the low dividend yield will predict an immediate downturn of the stock market.

5 Robustness

For the benchmark results shown previously, the consumption-based asset pricing models used to derive the priors are calibrated as proposed by the respective authors, that is, Campbell and Cochrane (1999), Barberis et al. (2001), and Bansal and Yaron (2004). To test whether the results are robust to calibrating the asset pricing models with data from a time period that has no overlap with the OOS period, I calibrate the parameters of the asset pricing models with data from 1926 to 1967. In the OOS forecast exercise above, 1926 is the beginning of the return sample, and 1967 is the end of the burn-in period for the postwar data sample. All three asset pricing models are calibrated with annual data. Hence, using a shorter sample for the calibration would make the task of matching empirical moments too challenging for the models, as the empirical moments would likely be distorted by outliers. I follow the calibration methodology proposed by the respective authors: some parameters are set equal to their empirical counterparts, and others are chosen such that the model-simulated moments match the empirical moments, for example, the mean and standard deviation of the dividend-price ratio or the equity premium. Details regarding the calibration of the models can be found in Appendix A.

Table 5 reports the results for the priors derived from the asset pricing models calibrated with data from 1926 to 1967. The OOS forecasts start in 1968. Panel A shows the $R^2_{OOS}$ for each prior, predictor, and return frequency. The priors from the LRR model perform consistently the best and improve the $R^2_{OOS}$ relative to the “no prior” forecast in every case.
In Panel B, the difference between the $R^2_{OOS}$ of

Figure 6: Portfolio share of risky asset

[Figure: two panels titled “Risky asset share: Dividend-price ratio” (top) and “Risky asset share: Dividend yield” (bottom). Vertical axis: share of risky asset. Horizontal axis: year. Lines: HF 2 prior, LRR 2 prior, PT 2 prior.]

This figure shows the risky asset share of the portfolio given in equation (16) for mean-variance investors with γ equal to 5 who rely on the HF 2, LRR 2, and PT 2 priors, respectively. In the top (bottom) panel, the return on the risky asset is forecast with the log dividend-price ratio (log dividend yield). The returns are at an annual frequency, and the OOS period is from 1947 to 2014.

the respective prior and the best-performing prior is shown. The significance of the difference in forecast errors is tested with a Diebold-Mariano test (see Diebold and Mariano (1995)). The differences in forecast performance between the LRR priors and the priors derived from the HF and the PT models are generally significant at a monthly and a quarterly frequency, where the higher number of observations leads to more power compared to the annual returns.

For the benchmark analysis, the tightness parameters of the Gamma-Normal prior, λ and v, are set equal to 1 and 0.1, respectively, as described in Section 2.2. However, the results and the conclusions drawn in this paper are robust to tightening or loosening the model-based priors. Table 6 reports the results for the total sample, that is, the 1947-2014 OOS period. Relative to the benchmark, the model-based priors are tightened and loosened by factors of 2 and 4. Tightening the priors further does not alter the conclusion, and loosening the priors by more than a factor of 4 leads to forecast results that are not substantially different from the OLS estimates. Tightening (loosening) the prior by a factor of 4 results in λ = 0.25 (λ = 4) and v = 0.4 (v = 0.025). The LRR model-based priors yield the most accurate forecasts across the range of λ and v values. For the dividend-price ratio, an LRR model-based prior is the best-performing prior for all the hyperparameter values and return frequencies, with the exception of the monthly return frequency when λ = 4 and v = 0.025. For the dividend yield, the LRR model-based priors outperform the other priors in half of the cases. As in Table 3, the differences in OOS forecast errors are statistically significant based on the Diebold-Mariano test (see Diebold and Mariano (1995)) in several cases at a monthly and quarterly frequency.
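The mapping from a tightening or loosening factor to the hyperparameter pair (λ, v) is simple scaling. The sketch below is only an illustration: it assumes that tightening by a factor k divides λ by k and multiplies v by k (and the reverse for loosening), which reproduces the pairs quoted in the text. The function name `scaled_tightness` is hypothetical and does not come from the paper.

```python
BENCHMARK_LAMBDA = 1.0  # benchmark value of λ (Section 2.2)
BENCHMARK_V = 0.1       # benchmark value of v (Section 2.2)


def scaled_tightness(factor, tighten):
    """Return (λ, v) after tightening or loosening the prior by `factor`.

    Assumption: tightening divides λ by the factor and multiplies v by it;
    loosening does the reverse.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if tighten:
        return BENCHMARK_LAMBDA / factor, BENCHMARK_V * factor
    return BENCHMARK_LAMBDA * factor, BENCHMARK_V / factor


# The four non-benchmark settings: tighten by 4 and 2, loosen by 2 and 4.
grid = [
    scaled_tightness(4, tighten=True),
    scaled_tightness(2, tighten=True),
    scaled_tightness(2, tighten=False),
    scaled_tightness(4, tighten=False),
]
```

Under this assumption the grid contains (0.25, 0.4), (0.5, 0.2), (2, 0.05), and (4, 0.025), so each value of λ is paired with exactly one value of v.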
The sensitivity of the economic performance of a mean-variance investor to the prior tightness is shown in Table 7, which is structured similarly to Table 4 and reports the CER given in equation (19) for each return predictor, prior tightness, and return frequency, for the total sample OOS period, that is, from 1947 to 2014. An

investor who forecasts the equity premium with an LRR model-based prior achieves the highest utility in 23 out of the 24 cases. Panel D shows the average CER for each model-based prior pair. Compared to an investor who times the market based on forecasts from HF priors, an investor using the LRR priors generates a CER of close to 20 basis points a year. The average CER of the LRR priors compared to the PT priors is around 50 basis points per year. These results confirm that the strong forecasting performance of an investor who derives her prior beliefs about the equity premium predictability from the LRR model is robust to changes in the prior tightness.

6 Conclusion

Different theories have been proposed to resolve the equity premium puzzle of Mehra and Prescott (1985). Three prominent consumption-based asset pricing models that provide different explanations for the existence of the equity premium puzzle are the Habit Formation (HF), the Long Run Risk (LRR), and the Prospect Theory (PT) models. I compare these asset pricing models based on whether they can profitably guide the investment decisions of investors who try to time the stock market. I propose a simple Bayesian framework in which investors reduce the uncertainty about predictive regression parameters by imposing economic constraints derived from the three asset pricing models. The predictor of the single-variable predictive regression is a valuation ratio, that is, the log dividend-price ratio or the log dividend yield.

The priors derived from the LRR model perform particularly well during the dot-com boom in the late 1990s. During that period, low valuation ratios predicted negative returns that failed to materialize for several years. The key to the strong performance of the LRR priors is the weak implied predictive power of valuation ratios for the equity premium. The weak predictive power is caused by the LRR

Table 5: Model-based priors from asset pricing models calibrated with 1926-1967 data

Panel A shows the OOS performance of the model-based priors derived from the three consumption-based asset pricing models: HF, LRR, and PT. The consumption-based asset pricing models are calibrated with annual data from 1926 to 1967. Appendix A contains details of the calibration. The priors are imposed on the single-variable predictive regression given in equation (1). Reported is the $R^2_{OOS}$ (in percent) from equation (13), which measures the accuracy of OOS log equity premium forecasts of the single-variable predictive regression relative to the historical average model. The predictors are the log dividend-price ratio and the log dividend yield. If the model-based prior leads to an increase in the $R^2_{OOS}$ relative to the OLS forecast in the “no prior” column, the figure is in bold. Panel B shows the difference between the $R^2_{OOS}$ of the best-performing prior and each of the remaining five priors. The best-performing prior is reported in the second column; its own column is left empty. In both panels, the statistical significance of the difference in forecast errors is tested with a Diebold-Mariano test (see Diebold and Mariano (1995)). The test is one-sided for Panel B. A significant test statistic is denoted by ** at the 5 percent level and by * at the 10 percent level.

Panel A: Forecast performance (Sample start: 1926 / OOS period: 1968-2014)

| Frequency | Predictor | No prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|---|---|
| Annual returns | Log dividend-price ratio | 1.158 | -11.836 | -11.181 | 1.247 | 1.472 | -9.665 | -23.733 |
| Annual returns | Log dividend yield | 5.528 | -10.248 | -10.558 | 0.370 | 0.809 | -0.066 | 0.550 |
| Quarterly returns | Log dividend-price ratio | 0.899 | -2.220 | -1.902 | 0.149 | 0.024 | -5.778 | -10.389 |
| Quarterly returns | Log dividend yield | 0.163 | -0.128 | -0.227 | 0.214 | 0.243 | -0.171 | -0.927 |
| Monthly returns | Log dividend-price ratio | 0.165 | -0.455 | -0.244 | 0.119 | -0.082 | -1.289 | -2.028 |
| Monthly returns | Log dividend yield | 0.379 | -0.284 | -0.588 | -0.106 | -0.251 | -0.395 | -0.698 |

Panel B: Prior comparison (Sample start: 1926 / OOS period: 1968-2014)

| Frequency | Predictor | Best prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|---|---|
| Annual returns | Log dividend-price ratio | LRR 2 | 13.309 | 12.654 | 0.225 | | 11.137 | 25.205 |
| Annual returns | Log dividend yield | LRR 2 | 11.057 | 11.368 | 0.439 | | 0.875 | 0.260 |
| Quarterly returns | Log dividend-price ratio | LRR 1 | 2.370* | 2.052 | | 0.125 | 5.928** | 10.539** |
| Quarterly returns | Log dividend yield | LRR 2 | 0.370 | 0.470 | 0.029 | | 0.413 | 1.170 |
| Monthly returns | Log dividend-price ratio | LRR 1 | 0.573** | 0.363 | | 0.201 | 1.408* | 2.147** |
| Monthly returns | Log dividend yield | LRR 1 | 0.177 | 0.481* | | 0.145 | 0.288* | 0.591** |

model’s time-variation in the dividend growth forecasts, the long-run risk component. Hence, an investor who uses the LRR model to guide her investment choices is hesitant to conclude that low valuation ratios result in an immediate fall in stock prices. The stronger predictability implied by the HF and PT models helps to improve the forecast accuracy up to the 1980s. However, the performance deteriorates quickly during the dot-com boom, as the investors who believe in the strong predictive power of valuation ratios anticipate a sharp price decline much earlier than it materializes. Because the performance during the dot-com boom dominates over the total sample period, that is, from 1926 to 2014, an investor whose prior beliefs are anchored in the LRR model would have outperformed investors whose prior beliefs stem from the HF and PT models. These differences in forecast accuracy are not only shown by the $R^2_{OOS}$, but also translate into considerable utility gains for an OOS investor with mean-variance preferences.

By imposing model-based priors derived from consumption-based asset pricing models on predictive regressions and showing how the forecast performances of these priors differ, this paper makes a novel contribution to the equity premium prediction literature. This paper also adds to our understanding of consumption-based asset pricing models. The paper shows that over the 1926-2014 sample, an investor whose beliefs had been rooted in an asset pricing model that implies weak equity premium predictability would have outperformed investors who relied on priors from models in which the equity premium is strongly predictable.

Table 6: Hyperparameter sensitivity of the forecast performance

Reported are the differences in the $R^2_{OOS}$ (in percent), where the $R^2_{OOS}$ is given in equation (13) and measures the accuracy of OOS log equity premium forecasts of a single-variable predictive regression, given in equation (1), relative to the historical average model. The predictive regression is estimated via model-based priors derived from the three consumption-based asset pricing models: HF, LRR, and PT. The prior tightness is varied, and the respective values for the hyperparameters λ and v are shown in the first two columns. The predictors are the log dividend-price ratio and the log dividend yield. For each frequency, predictor, and prior tightness, the $R^2_{OOS}$ of each prior is subtracted from the $R^2_{OOS}$ of the best-performing prior. The statistical significance of the difference in forecast errors is tested with a one-sided Diebold-Mariano test (see Diebold and Mariano (1995)). A significant test statistic is denoted by ** at the 5 percent level and by * at the 10 percent level. The best-performing prior is reported in the fourth column; its own column is left empty.

Panel A: Annual returns (Sample start: 1926 / OOS period: 1947-2014)

| λ | v | Predictor | Best prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|---|---|---|
| 0.25 | 0.4 | Log DP ratio | LRR 2 | 6.538 | 5.067 | 1.819 | | 12.914 | 24.394 |
| 0.25 | 0.4 | Log DY | PT 2 | 7.871 | 5.621 | 2.849 | 2.590 | 1.493 | |
| 0.5 | 0.2 | Log DP ratio | LRR 2 | 4.141 | 4.460 | 0.403 | | 9.955 | 18.746 |
| 0.5 | 0.2 | Log DY | PT 2 | 7.181 | 6.049 | 2.033 | 1.838 | 1.120 | |
| 2 | 0.05 | Log DP ratio | LRR 1 | 1.171 | 1.095 | | 0.561 | 1.780 | 2.815 |
| 2 | 0.05 | Log DY | LRR 1 | 5.688* | 4.559 | | 0.053 | 1.420 | 1.244 |
| 4 | 0.025 | Log DP ratio | LRR 1 | 0.631 | 0.268 | | 0.618** | 0.319 | 0.742 |
| 4 | 0.025 | Log DY | LRR 2 | 3.605** | 2.878** | 0.731** | | 0.501 | 1.226** |

Panel B: Quarterly returns (Sample start: 1926 / OOS period: 1947-2014)

| λ | v | Predictor | Best prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|---|---|---|
| 0.25 | 0.4 | Log DP ratio | LRR 2 | 2.338 | 2.591 | 0.374 | | 30.207** | 43.069** |
| 0.25 | 0.4 | Log DY | PT 1 | 0.263 | 0.080 | 1.387 | 1.028 | | 1.258 |
| 0.5 | 0.2 | Log DP ratio | LRR 1 | 1.888 | 1.778 | | 0.009 | 16.058** | 23.426** |
| 0.5 | 0.2 | Log DY | PT 1 | 0.338 | 0.198 | 0.906 | 0.612 | | 1.030 |
| 2 | 0.05 | Log DP ratio | LRR 2 | 0.538 | 0.313 | 0.085 | | 1.526* | 1.919* |
| 2 | 0.05 | Log DY | HF 1 | | 0.068 | 0.248 | 0.458* | 0.119 | 0.159 |
| 4 | 0.025 | Log DP ratio | LRR 2 | 0.864** | 0.555** | 0.330* | | 0.708** | 0.616** |
| 4 | 0.025 | Log DY | HF 1 | | 0.268 | 0.343 | 0.012 | 0.403* | 0.338 |

Panel C: Monthly returns (Sample start: 1926 / OOS period: 1947-2014)

| λ | v | Predictor | Best prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|---|---|---|
| 0.25 | 0.4 | Log DP ratio | LRR 2 | 0.845 | 0.868 | 0.124 | | 31.993** | 37.229** |
| 0.25 | 0.4 | Log DY | LRR 2 | 1.184 | 1.163 | 0.384 | | 0.161 | 0.856 |
| 0.5 | 0.2 | Log DP ratio | LRR 2 | 0.499 | 0.484 | 0.237* | | 9.283** | 10.978** |
| 0.5 | 0.2 | Log DY | LRR 1 | 0.360 | 0.793* | | 0.171 | 0.259 | 0.226 |
| 2 | 0.05 | Log DP ratio | LRR 2 | 0.047 | 0.212* | 0.145 | | 0.205 | 0.215 |
| 2 | 0.05 | Log DY | LRR 2 | 0.106 | 0.305** | 0.355** | | 0.305** | 0.218 |
| 4 | 0.025 | Log DP ratio | HF 2 | 0.037 | | 0.122 | 0.083 | 0.054 | 0.238* |
| 4 | 0.025 | Log DY | LRR 1 | 0.390** | 0.183 | | 0.295** | 0.303** | 0.167 |
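The accuracy measure and the significance test behind Tables 5 and 6 can be sketched in a few lines. This is a minimal illustration rather than the paper's code. It assumes the usual out-of-sample R² definition for equation (13), that is, one minus the ratio of the squared forecast errors of the predictive regression to those of the historical-average benchmark. For the Diebold-Mariano statistic it assumes squared-error loss, one-step-ahead forecasts (so no autocorrelation correction), and a normal approximation for the one-sided p-value. The function names are hypothetical.

```python
import math


def r2_oos(actual, forecast, benchmark):
    """Out-of-sample R^2 in percent (assumed form of equation (13))."""
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    return 100.0 * (1.0 - sse_model / sse_bench)


def diebold_mariano_one_sided(errors_other, errors_best):
    """One-sided Diebold-Mariano test with squared-error loss.

    Tests whether the 'best' forecast has a smaller expected squared error
    than the 'other' forecast. Returns (statistic, p-value), where the
    p-value is P(Z > statistic) under a standard normal approximation.
    """
    # Loss differential: positive when the other forecast is worse.
    d = [eo ** 2 - eb ** 2 for eo, eb in zip(errors_other, errors_best)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((x - d_bar) ** 2 for x in d) / (n - 1)
    stat = d_bar / math.sqrt(var_d / n)
    p_value = 0.5 * math.erfc(stat / math.sqrt(2.0))
    return stat, p_value
```

With multi-step or overlapping forecasts, the variance of the loss differential would need a long-run (autocorrelation-robust) estimator instead of the simple sample variance used here.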

Table 7: Hyperparameter sensitivity of the economic performance

Panels A, B, and C report the annualized CER (in percent) given in equation (19). The CER can be interpreted as a management fee that an investor with mean-variance utility and a risk aversion of 5 is willing to pay each year to have access to the equity premium forecasts that result from imposing the best-performing prior on the single-variable predictive regression, given in equation (1), instead of one of the remaining five priors. The best-performing prior is reported in the “best prior” column; its own column is left empty. The prior tightness is varied, and the respective values for the hyperparameters λ and v are shown in the first two columns. The predictors are the log dividend-price ratio and the log dividend yield. Panel D reports the average CER across all frequencies, hyperparameters, and predictors, for each prior pair.

Panel A: Annual returns (Sample start: 1926 / OOS period: 1947-2014)

| λ | v | Predictor | Best prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|---|---|---|
| 0.25 | 0.4 | Log DP ratio | LRR 2 | 0.462 | 0.448 | 0.012 | | 0.636 | 0.909 |
| 0.25 | 0.4 | Log DY | LRR 2 | 0.357 | 0.303 | 0.004 | | 0.021 | 0.118 |
| 0.5 | 0.2 | Log DP ratio | LRR 2 | 0.471 | 0.436 | 0.035 | | 0.649 | 0.935 |
| 0.5 | 0.2 | Log DY | LRR 2 | 0.463 | 0.433 | 0.049 | | 0.107 | 0.175 |
| 2 | 0.05 | Log DP ratio | LRR 2 | 0.295 | 0.267 | 0.007 | | 0.355 | 0.494 |
| 2 | 0.05 | Log DY | LRR 1 | 0.335 | 0.329 | | 0.021 | 0.072 | 0.103 |
| 4 | 0.025 | Log DP ratio | LRR 1 | 0.164 | 0.164 | | 0.010 | 0.214 | 0.275 |
| 4 | 0.025 | Log DY | LRR 1 | 0.257 | 0.235 | | 0.006 | 0.015 | 0.076 |

Panel B: Quarterly returns (Sample start: 1926 / OOS period: 1947-2014)

| λ | v | Predictor | Best prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|---|---|---|
| 0.25 | 0.4 | Log DP ratio | LRR 2 | 0.356 | 0.369 | 0.075 | | 1.719 | 2.056 |
| 0.25 | 0.4 | Log DY | LRR 1 | 0.096 | 0.099 | | 0.002 | 0.031 | 0.207 |
| 0.5 | 0.2 | Log DP ratio | LRR 1 | 0.278 | 0.205 | | 0.017 | 1.157 | 1.384 |
| 0.5 | 0.2 | Log DY | LRR 1 | 0.114 | 0.091 | | 0.026 | 0.016 | 0.151 |
| 2 | 0.05 | Log DP ratio | LRR 1 | 0.151 | 0.098 | | 0.004 | 0.369 | 0.394 |
| 2 | 0.05 | Log DY | LRR 2 | 0.024 | 0.047 | 0.026 | | 0.007 | 0.008 |
| 4 | 0.025 | Log DP ratio | LRR 2 | 0.001 | 0.019 | 0.020 | | 0.192 | 0.241 |
| 4 | 0.025 | Log DY | LRR 2 | 0.052 | 0.122 | 0.072 | | 0.110 | 0.100 |

Panel C: Monthly returns (Sample start: 1926 / OOS period: 1947-2014)

| λ | v | Predictor | Best prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|---|---|---|
| 0.25 | 0.4 | Log DP ratio | LRR 1 | 0.268 | 0.395 | | 0.089 | 2.786 | 3.083 |
| 0.25 | 0.4 | Log DY | LRR 2 | 0.281 | 0.193 | 0.009 | | 0.123 | 0.223 |
| 0.5 | 0.2 | Log DP ratio | LRR 2 | 0.047 | 0.140 | 0.082 | | 1.534 | 1.520 |
| 0.5 | 0.2 | Log DY | LRR 1 | 0.173 | 0.230 | | 0.097 | 0.126 | 0.091 |
| 2 | 0.05 | Log DP ratio | HF 2 | 0.002 | | 0.011 | 0.010 | 0.349 | 0.408 |
| 2 | 0.05 | Log DY | LRR 2 | 0.107 | 0.032 | 0.080 | | 0.055 | 0.127 |
| 4 | 0.025 | Log DP ratio | LRR 2 | 0.101 | 0.074 | 0.065 | | 0.180 | 0.275 |
| 4 | 0.025 | Log DY | LRR 1 | 0.150 | 0.120 | | 0.143 | 0.038 | 0.183 |

Panel D: Average CER

| Comparison prior | HF 1 | HF 2 | LRR 1 | LRR 2 | PT 1 | PT 2 |
|---|---|---|---|---|---|---|
| HF 1 | | 0.007 | 0.186 | 0.191 | -0.242 | -0.353 |
| HF 2 | -0.007 | | 0.179 | 0.184 | -0.248 | -0.360 |
| LRR 1 | -0.186 | -0.179 | | 0.005 | -0.428 | -0.539 |
| LRR 2 | -0.191 | -0.184 | -0.005 | | -0.433 | -0.544 |
| PT 1 | 0.242 | 0.248 | 0.428 | 0.433 | | -0.111 |
| PT 2 | 0.353 | 0.360 | 0.539 | 0.544 | 0.111 | |
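The CER reported in Tables 4 and 7 is the utility difference in equation (19). The sketch below shows one way to compute it from two realized portfolio return series. It is an illustration under stated assumptions, not the paper's implementation: realized utility is taken to be the standard mean-variance form (sample mean minus γ/2 times sample variance), and annualization is done by multiplying by the number of periods per year. The function names are hypothetical.

```python
def realized_utility(portfolio_returns, gamma=5.0):
    """Realized mean-variance utility: mean - (gamma / 2) * variance.

    Assumption: this standard form stands in for the paper's utility
    estimate; the sample variance uses the n - 1 denominator.
    """
    n = len(portfolio_returns)
    mean = sum(portfolio_returns) / n
    var = sum((r - mean) ** 2 for r in portfolio_returns) / (n - 1)
    return mean - 0.5 * gamma * var


def annualized_cer_percent(returns_a, returns_n, periods_per_year, gamma=5.0):
    """CER = U_A - U_N as in equation (19), annualized and in percent."""
    cer = realized_utility(returns_a, gamma) - realized_utility(returns_n, gamma)
    return 100.0 * periods_per_year * cer
```

For example, if portfolio A earns 0.1 percentage points more than portfolio N in every quarter with identical volatility, `annualized_cer_percent(returns_a, returns_n, periods_per_year=4)` is approximately 0.4, that is, a fee of about 40 basis points per year.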

Appendix A Asset pricing models

A.1 By force of habit: A consumption-based explanation of aggregate stock market behavior

Campbell and Cochrane (1999) use a standard representative-agent consumption-based asset pricing model but add a slow-moving habit to the basic power utility function. This slow-moving habit leads to a time-varying risk premium that is higher at business cycle troughs than at peaks. The agents are identical and maximize their utility given by

$$E\left[\sum_{t=0}^{\infty} \delta^t \frac{(C_t - X_t)^{1-\gamma} - 1}{1-\gamma}\right], \tag{A.1}$$

where $C_t$ is the consumption level, $X_t$ is the level of habit, $\delta$ is the time discount factor, and $\gamma$ is the risk aversion. A surplus consumption ratio $S_t \equiv (C_t - X_t)/C_t$ is defined; a small value of $S_t$ indicates that the economy is in a bad state. The local curvature of this utility function is given by

$$\eta_t \equiv -\frac{C_t u_{cc}(C_t, X_t)}{u_c(C_t, X_t)} = \frac{\gamma}{S_t}. \tag{A.2}$$

A process is specified for $s_t = \ln(S_t)$, which ensures that $C_t$ is always above $X_t$:

$$s_{t+1} = (1-\phi)\bar{s} + \phi s_t + \lambda(s_t)(c_{t+1} - c_t - g), \tag{A.3}$$

with $\phi$ reflecting habit persistence. The function $\lambda(s_t)$ takes the form

$$\lambda(s_t) = \begin{cases} \dfrac{1}{\bar{S}}\sqrt{1 - 2(s_t - \bar{s})} - 1, & s_t \le s_{\max} \\ 0, & s_t > s_{\max}, \end{cases} \tag{A.4}$$

with the parameter $s_{\max}$ set equal to $\bar{s} + \frac{1}{2}(1 - \bar{S}^2)$. The steady state value $\bar{s}$ is given by $\ln\left(\sigma\sqrt{\gamma/(1-\phi)}\right)$. The evolution of $s_{t+1}$ is based on consumption growth being an i.i.d. lognormal process

$$\Delta c_{t+1} = g + v_{t+1}, \quad \text{where } v_{t+1} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2). \tag{A.5}$$

Stocks represent a claim to the consumption stream. The price-consumption ratio for a consumption claim satisfies

$$\frac{P_t}{C_t}(s_t) = E_t\left[M_{t+1}\frac{C_{t+1}}{C_t}\left[1 + \frac{P_{t+1}}{C_{t+1}}(s_{t+1})\right]\right]. \tag{A.6}$$

The underlying assumption is that dividend growth is perfectly correlated with consumption growth in equation (A.5). Above, I denote this specification the HF 1 model. [Footnote 10: The solution for the model specification which assumes imperfectly correlated consumption and dividend processes (HF 2) is given in Campbell and Cochrane (1999).] The intertemporal marginal rate of substitution (IMRS) $M_{t+1}$ takes the form

$$M_{t+1} \equiv \delta\left(\frac{S_{t+1}}{S_t}\frac{C_{t+1}}{C_t}\right)^{-\gamma}. \tag{A.7}$$

Because the term $(S_{t+1}/S_t)^{-\gamma}$ correlates positively with asset returns, the HF model generates a higher equity premium compared with the standard power utility model. The log risk-free rate is given by

$$r^f_t = -\ln(\delta) + \gamma g - \gamma(1-\phi)(s_t - \bar{s}) - \frac{\gamma^2\sigma^2}{2}\left[1 + \lambda(s_t)\right]^2. \tag{A.8}$$

The price-consumption ratio is correlated with the business cycles, as it depends on $s_t$. The ratio is high at business cycle peaks and low at troughs. Why is the price-consumption ratio procyclical? Suppose there is a positive shock to consumption in period $t$. Higher consumption raises $s_t$ and consequently $E_t[M_{t+1}]$, which results in a higher asset price and price-consumption ratio. (Equation (A.2) shows how an increase in $s_t$ lowers the local curvature of the utility function and makes

the agent less risk averse.) Because expected future cash flows remain constant, the higher asset prices will lead to lower expected returns. Hence, the price-consumption ratio and subsequent returns are inversely correlated.

A.1.1 Calibration and simulation of model

For my benchmark analysis, priors from the HF model are based on the parameter values proposed by Campbell and Cochrane (1999). These parameter values are reported in Table A.1 in the “original value” column. My recalibration of the model with data from 1926 to 1967 results in the parameter values reported in the “1926-1967 value” column. For the recalibration, I follow the methodology of Campbell and Cochrane (1999). Consumption data are real per capita consumption of nondurables and services from the Bureau of Economic Analysis (BEA). The standard deviation of log consumption growth is chosen such that the standard deviation of annual log consumption growth simulated from the model matches the empirical counterpart of 3.02%. The risk-free rate time series is from Amit Goyal’s website and deflated with inflation data from Federal Reserve Economic Data. Dividends are computed using CRSP New York Stock Exchange (NYSE) data. The persistence of the log price-dividend ratio is 0.82. Following Campbell and Cochrane (1999), I choose γ to match the NYSE equity premium Sharpe ratio, which is 0.33 for the 1926-1967 period, with the HF 1 specification. The discount factor δ is selected such that the annualized log risk-free rate matches the empirical value of 0.31%.

I apply the fixed-point method to solve for the price-consumption and the price-dividend ratio (see Wachter (2005)). The model is simulated at a monthly frequency and time-aggregated to lower frequencies. Summary statistics of the simulation for the model specification with perfectly (HF 1) and imperfectly (HF 2) correlated log consumption and log dividend growth are given in Panel A of Table A.2.
The simulated moments match the moments obtained by Campbell and Cochrane (1999) and Wachter (2005). The simulated moments based on the model recalibrated with

data from 1926-1967 can be found in Panel B of Table A.2.

A.2 Prospect theory and asset prices

In the model of Barberis et al. (2001), the agent not only derives utility from consumption but also from financial wealth fluctuations. There are two important aspects in the way financial wealth fluctuations affect the utility of an economic agent. First, the agent is loss averse. Second, the degree of loss aversion depends on prior investment outcomes. Prior gains lead to less loss aversion, and prior losses lead to more loss aversion. Hence, the risk aversion of the agent varies over time.

Aggregate consumption growth and dividend growth follow the i.i.d. lognormal processes given by

$$\Delta c_{t+1} = g_c + \sigma_c \epsilon_{c,t+1}, \quad \text{where } \epsilon_{c,t+1} \overset{\text{i.i.d.}}{\sim} N(0,1) \tag{A.9}$$

and

$$\Delta d_{t+1} = g_d + \sigma_d \epsilon_{d,t+1}, \quad \text{where } \epsilon_{d,t+1} \overset{\text{i.i.d.}}{\sim} N(0,1), \tag{A.10}$$

with the correlation between $\epsilon_{c,t+1}$ and $\epsilon_{d,t+1}$ being denoted by $\omega$. [Footnote 11: Barberis et al. (2001) consider two different specifications: Economy I, in which dividends equal consumption, and Economy II, in which consumption and dividends follow separate but positively correlated processes. The simulated moments of Economy II are much more successful in matching the empirical moments; hence, I do not consider Economy I.]

The agent’s maximization problem is set up as

$$E\left[\sum_{t=0}^{\infty}\left(\rho^t \frac{C_t^{1-\gamma}}{1-\gamma} + b_0 \bar{C}_t^{-\gamma}\rho^{t+1} v(X_{t+1}, S_t, z_t)\right)\right]. \tag{A.11}$$

The second term captures the fact that the agent’s utility is affected by fluctuations in financial wealth. The variable $X_{t+1}$ denotes the change of the financial wealth between time $t$ and $t+1$ and is defined as

$$X_{t+1} \equiv S_t R_{t+1} - S_t R_{f,t}. \tag{A.12}$$

Table A.1: Habit Formation model parameter values

The parameter values from Campbell and Cochrane (1999) are reported in the “original value” column. The parameter values chosen for the calibration of the model based on data from 1926 to 1967 are reported in the “1926-1967 value” column. A * denotes that the value is annualized.

| Description | Variable | Original value | 1926-1967 value |
|---|---|---|---|
| Mean log consumption growth* | $g$ | 1.89% | 1.77% |
| Std. dev. log consumption growth* | $\sigma$ | 1.50% | 3.75% |
| Log risk-free rate* | $r^f$ | 0.94% | 0.31% |
| Persistence coefficient* | $\phi$ | 0.87 | 0.82 |
| Utility curvature | $\gamma$ | 2.00 | 1.00 |
| Std. dev. log dividend growth* | $\sigma_\omega$ | 11.2% | 14.3% |
| Corr. log cons. and log div. growth | $\rho$ | 0.20 | 0.57 |
| Subjective discount factor* | $\delta$ | 0.89 | 0.92 |

Table A.2: Habit Formation model simulated moments

Simulated moments at monthly, quarterly, and annual frequencies are reported for the specifications of the HF model that assume perfect (HF 1) and imperfect (HF 2) correlation between log consumption and log dividend growth. For Panel A, the parameter values of Campbell and Cochrane (1999) are used. For Panel B, the parameter values are calibrated based on a sample with data from 1926 to 1967. The price-dividend ratio moments are annualized.

Panel A: Based on original parameter values

| Model | Freq. | P/D mean | Log P/D std. dev. | Log equity prem. mean | Log equity prem. std. dev. | Log Sharpe ratio |
|---|---|---|---|---|---|---|
| HF 1 | Annual | 18.55 | 0.27 | 6.60% | 15.06% | 0.44 |
| HF 2 | Annual | 19.00 | 0.30 | 6.52% | 19.91% | 0.33 |
| HF 1 | Quarterly | 18.43 | 0.27 | 1.65% | 7.73% | 0.21 |
| HF 2 | Quarterly | 18.92 | 0.28 | 1.63% | 10.08% | 0.16 |
| HF 1 | Monthly | 18.39 | 0.27 | 0.55% | 4.49% | 0.12 |
| HF 2 | Monthly | 18.89 | 0.28 | 0.54% | 5.84% | 0.09 |

Panel B: Based on 1926-1967 parameter values

| Model | Freq. | P/D mean | Log P/D std. dev. | Log equity prem. mean | Log equity prem. std. dev. | Log Sharpe ratio |
|---|---|---|---|---|---|---|
| HF 1 | Annual | 17.32 | 0.35 | 7.76% | 23.71% | 0.33 |
| HF 2 | Annual | 17.14 | 0.40 | 7.92% | 31.59% | 0.25 |
| HF 1 | Quarterly | 17.20 | 0.35 | 1.95% | 12.26% | 0.16 |
| HF 2 | Quarterly | 17.01 | 0.37 | 1.97% | 16.17% | 0.12 |
| HF 1 | Monthly | 17.12 | 0.34 | 0.65% | 7.12% | 0.09 |
| HF 2 | Monthly | 17.03 | 0.37 | 0.66% | 9.38% | 0.07 |
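The surplus consumption dynamics of the HF model in equations (A.2) to (A.4) can be sketched as follows. This is a minimal illustration, not the simulation code behind Table A.2: it plugs the annualized “original value” parameters from Table A.1 directly into the formulas as if they were per-period values, whereas the paper simulates the model at a monthly frequency. The function names are hypothetical.

```python
import math

# Annualized "original value" parameters from Table A.1, used here as
# per-period values purely for illustration.
GAMMA = 2.0     # utility curvature
PHI = 0.87      # habit persistence
SIGMA = 0.015   # std. dev. of log consumption growth
G = 0.0189      # mean log consumption growth

# Steady-state surplus consumption ratio and its log.
S_BAR = SIGMA * math.sqrt(GAMMA / (1.0 - PHI))
s_bar = math.log(S_BAR)
# Upper bound on the log surplus consumption ratio.
s_max = s_bar + 0.5 * (1.0 - S_BAR ** 2)


def sensitivity(s):
    """Sensitivity function lambda(s_t) from equation (A.4)."""
    if s > s_max:
        return 0.0
    # max(...) guards against a tiny negative argument from rounding.
    return math.sqrt(max(0.0, 1.0 - 2.0 * (s - s_bar))) / S_BAR - 1.0


def next_log_surplus(s, consumption_growth):
    """One step of the log surplus consumption process, equation (A.3)."""
    return (1.0 - PHI) * s_bar + PHI * s + sensitivity(s) * (consumption_growth - G)


def local_curvature(s):
    """Local curvature eta_t = gamma / S_t from equation (A.2)."""
    return GAMMA / math.exp(s)
```

Two properties are easy to check with this sketch: the sensitivity function falls continuously to zero at `s_max`, and a consumption growth realization equal to its mean leaves the steady state unchanged.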

The variable $S_t$ measures the value of the agent’s risky assets at time $t$. The variable $z_t$ accounts for prior gains and losses up to time $t$ and is defined as $Z_t/S_t$, where $Z_t$ is a historical benchmark level for the value of the risky asset. If $z_t$ is smaller than one, the agent has prior gains; if $z_t$ is greater than one, the agent faces prior losses. The time discount factor is $\rho$, and $b_0 \bar{C}_t^{-\gamma}$ is a scaling term, with $\gamma$ being the risk aversion over consumption. The form of the utility function over financial wealth $v(\cdot)$ is different conditional on prior gains or prior losses. The dynamics of $z_t$ are given by the process

$$z_{t+1} = \eta\left(z_t \frac{\bar{R}}{R_{t+1}}\right) + (1 - \eta). \tag{A.13}$$

This process ensures that the benchmark level $Z_t$ reacts sluggishly to changes in the stock price. The parameter $\bar{R}$ is chosen such that the median value of $z_t$ is around one. The price-dividend ratio is assumed to be a function of the state variable $z_t$:

$$f_t \equiv P_t/D_t = f(z_t). \tag{A.14}$$

The real stock returns are thus given as

$$R_{t+1} = \frac{1 + f(z_{t+1})}{f(z_t)} e^{g_d + \sigma_d \epsilon_{d,t+1}}. \tag{A.15}$$

Barberis et al. (2001) show that the equilibrium is characterized by a constant real risk-free rate,

$$R_f = \delta^{-1} e^{\gamma g_c - \gamma^2\sigma_c^2/2}, \tag{A.16}$$

and a price-dividend ratio determined by the equation

$$1 = \delta e^{g_d - \gamma g_c + \gamma^2\sigma_c^2(1-\omega^2)/2}\, E_t\left[\frac{1 + f(z_{t+1})}{f(z_t)} e^{(\sigma_d - \gamma\omega\sigma_c)\epsilon_{d,t+1}}\right] + b_0 \delta E_t\left[\hat{v}\left(\frac{1 + f(z_{t+1})}{f(z_t)} e^{g_d + \sigma_d \epsilon_{d,t+1}}, z_t\right)\right], \tag{A.17}$$

where the utility function $\hat{v}(R_{t+1}, z_t)$ is equal to $v(X_{t+1}, S_t, z_t)/S_t$ and specified for $z_t \le 1$ as

$$\hat{v}(R_{t+1}, z_t) = \begin{cases} R_{t+1} - R_{f,t}, & R_{t+1} \ge z_t R_{f,t} \\ (z_t R_{f,t} - R_{f,t}) + \lambda(R_{t+1} - z_t R_{f,t}), & R_{t+1} < z_t R_{f,t} \end{cases} \tag{A.18}$$

and for $z_t > 1$ as

$$\hat{v}(R_{t+1}, z_t) = \begin{cases} R_{t+1} - R_{f,t}, & R_{t+1} \ge R_{f,t} \\ \lambda(z_t)(R_{t+1} - R_{f,t}), & R_{t+1} < R_{f,t}, \end{cases} \tag{A.19}$$

where $\lambda(z_t) = \lambda + k(z_t - 1)$ with $k > 0$.

The PT model generates an equity premium that is predictable by the dividend-price ratio. The mechanism works through time-varying risk aversion. A positive period $t$ shock to dividends in equation (A.10) increases the return of the asset and leads to a lower $z_t$ through equation (A.13). A lower $z_t$ implies that the agent is less loss averse, as shown in equations (A.18) and (A.19). Hence, the price of the asset will increase, which reduces the agent’s loss aversion further, leading to a higher price-dividend ratio. Because of the higher prices and unchanged cash flow expectations, the expected returns are lower. Price-dividend ratios and future returns are therefore negatively related.
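The piecewise utility over financial wealth in equations (A.18) and (A.19) and the state dynamics in equation (A.13) can be written down directly. The sketch below is an illustration only: the default values for λ, k, and η are the PT 1 “original value” parameters, the function names are hypothetical, and the benchmark return `r_bar` must be supplied by the caller because its calibrated value is not given here.

```python
def loss_aversion(z, lam=2.25, k=3.0):
    """lambda(z_t) = lambda + k * (z_t - 1), applied when z_t > 1."""
    return lam + k * (z - 1.0)


def v_hat(r_next, z, r_f, lam=2.25, k=3.0):
    """Utility over financial wealth per unit of risky asset value.

    Implements equation (A.18) for z <= 1 (prior gains) and
    equation (A.19) for z > 1 (prior losses).
    """
    if z <= 1.0:
        if r_next >= z * r_f:
            return r_next - r_f
        # Losses that only eat into the cushion are not penalized;
        # losses beyond the cushion are scaled by lam.
        return (z * r_f - r_f) + lam * (r_next - z * r_f)
    if r_next >= r_f:
        return r_next - r_f
    return loss_aversion(z, lam, k) * (r_next - r_f)


def next_z(z, r_next, r_bar, eta=0.9):
    """State dynamics from equation (A.13)."""
    return eta * (z * r_bar / r_next) + (1.0 - eta)
```

For instance, with a gross risk-free rate of 1.0 and a gross return of 0.9, an agent with prior losses (`z = 1.1`) experiences a utility of about -0.255, while an agent with prior gains (`z = 0.95`) experiences about -0.1625, which illustrates how prior outcomes change the pain of a given loss.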

A.2.1 Calibration and simulation of model

The parameter values from Barberis et al. (2001) are reported in Table A.3 in the “original value” column. My recalibration of the model with data from 1926 to 1967 uses the parameter values in the “1926-1967 value” column. For some parameters, two values are given. In these cases, the first value corresponds to the PT 1 model. The recalibration follows the methodology of Barberis et al. (2001). The PT 1 model is calibrated such that the average effective loss aversion of the model is 2.25. [Footnote 12: This value is chosen by Barberis et al. (2001) based on experimental evidence.] The second value corresponds to the PT 2 model, which is calibrated such that the log equity premium of the model matches the empirical moment.

When calibrating the model with the 1926-1967 data sample, I use the same consumption, dividend, and return data as for the calibration of the HF model, described previously. [Footnote 13: I set $\sigma_d$ equal to 12% for the 1926-1967 parameter values, as in Barberis et al. (2001), instead of 14.2% as in the HF model, as a convergence of the numerical solution was not achieved with a more volatile log dividend growth process.] The parameters γ, ρ, and δ are chosen to bring the risk-free rate close to the empirical value of 0.31%. The prior outcome parameter k and the time discount factor ρ are set to 4 and 0.98, respectively, for the PT 1 model such that the annual average effective loss aversion is 2.25. For the PT 2 model, the parameter values are chosen to be 18 and 0.99, respectively, to bring the annual simulated equity premium close to 7.42%. The persistence parameter η is set such that the persistence of the log price-dividend ratio is close to the empirical value of 0.82. The remaining parameters are not estimated with empirical data and are set equal to the values of Barberis et al. (2001).

I solve the model by following the process laid out by Barberis et al. (2001). The moments in Panel A of Table A.4 are generated by simulating the model with the parameter values proposed by the authors, particularly b0 = 100 and k = 3 for PT 1 and b0 = 100 and k = 8 for PT 2. The moments match the moments obtained by Barberis et al. (2001). Panel B reports the simulated moments based on my

recalibration of the parameter values with data from 1926 to 1967. A.3 Risks for the long run: a potential resolution of asset pricing puzzles Bansal and Yaron (2004) propose a solution to the equity premium puzzle through a consumption-based asset pricing model with Epstein and Zin (1989) preferences. Their model differs from other consumption-based asset pricing models in two ways. First, they include a small persistent expected growth rate component in the consumption and dividend growth rate processes. This component causes consumption and the return on the market portfolio to covary positively, and hence, the economic agents require a higher risk premium. Second, they allow for time-varying volatility, which accounts for fluctuating economic uncertainty, in both processes: this additional source of systematic risk increases the risk premium further. The asset pricing restriction for the real return on the market portfolio R , m,t+1 according to the Epstein and Zin (1989) preferences, is (cid:20) (cid:21) E δθG − ψ θ R−(1−θ)R = E [M R ] = 1, (A.20) t c,t+1 c,t+1 m,t+1 t t+1 m,t+1 where G is the aggregate gross growth rate of consumption, R denotes the c,t+1 c,t+1 real return on an asset that pays aggregate consumption as dividends, δ is the time discountfactor, and M is the IMRS.The parameterθ is defined as(1−γ)/(1−1), t+1 ψ where γ is the risk aversion parameter, and ψ accounts for the intertemporal elasticity of substitution (IES). To derive the real returns, the authors use the standard approximation of Campbell and Shiller (1988). The real log return for the claim to aggregate consumption is r = κ +κ z −z +g , (A.21) c,t+1 0 1 t+1 t c,t+1 55

Table A.3: Prospect Theory model parameter values

The parameter values from Barberis et al. (2001) are reported in column “original value”. The parameter values chosen for the calibration of the model based on data from 1926 to 1967 are reported in the “1926-1967 value” column. When two values are given for the same parameter, then the first value stands for the PT 1 model and the second value for the PT 2 model. All values are annualized.

| Description | Variable | Original value | 1926-1967 value |
|---|---|---|---|
| Mean log consumption growth | $g_c$ | 1.84% | 1.77% |
| Mean log dividend growth | $g_d$ | 1.89% | 1.77% |
| Std. dev. log consumption growth | $\sigma_c$ | 3.79% | 3.02% |
| Std. dev. log dividend growth | $\sigma_d$ | 12.0% | 12.0% |
| Corr. log cons. and log div. growth | $\omega$ | 0.15 | 0.57 |
| Utility curvature | $\gamma$ | 1.00 | 1.00 |
| Time discount factor | $\rho$ | 0.98 | 0.98 / 0.99 |
| Loss aversion | $\lambda$ | 2.25 | 2.25 |
| Prior outcome parameter | $k$ | 3 / 8 | 4 / 18 |
| Prospect utility weight | $b_0$ | 100 | 100 |
| Persistence factor | $\eta$ | 0.90 | 0.90 |

Table A.4: Prospect Theory model simulated moments

Simulated moments at monthly, quarterly, and annual frequencies are reported. In Panel A, the parameter values of Barberis et al. (2001) are used, particularly $b_0 = 100$ and $k = 3$ for the PT 1 specification and $b_0 = 100$ and $k = 8$ for the PT 2 specification. For Panel B, the parameter values are estimated based on a sample with data from 1926 to 1967, particularly $b_0 = 100$ and $k = 4$ for the PT 1 specification and $b_0 = 100$ and $k = 18$ for the PT 2 specification. The price-dividend ratio moments are annualized.

Panel A: Based on original parameter values

| Model | Freq. | Price-dividend ratio, mean | Price-dividend ratio, std. dev. | Log equity prem., mean | Log equity prem., std. dev. | Log Sharpe ratio |
|---|---|---|---|---|---|---|
| PT 1 | Annual | 17.30 | 2.38 | 3.74% | 20.23% | 0.19 |
| PT 2 | Annual | 12.73 | 2.21 | 5.87% | 23.87% | 0.25 |
| PT 1 | Quarterly | 9.46 | 0.54 | 2.13% | 9.00% | 0.24 |
| PT 2 | Quarterly | 7.45 | 0.60 | 2.84% | 10.79% | 0.26 |
| PT 1 | Monthly | 6.30 | 0.14 | 1.15% | 4.48% | 0.26 |
| PT 2 | Monthly | 5.05 | 0.16 | 1.47% | 5.05% | 0.29 |

Panel B: Based on 1926-1967 parameter values

| Model | Freq. | Price-dividend ratio, mean | Price-dividend ratio, std. dev. | Log equity prem., mean | Log equity prem., std. dev. | Log Sharpe ratio |
|---|---|---|---|---|---|---|
| PT 1 | Annual | 16.99 | 2.48 | 3.90% | 21.18% | 0.18 |
| PT 2 | Annual | 12.30 | 2.54 | 7.47% | 28.54% | 0.26 |
| PT 1 | Quarterly | 9.45 | 0.57 | 2.12% | 9.34% | 0.23 |
| PT 2 | Quarterly | 6.73 | 0.67 | 3.51% | 12.99% | 0.27 |
| PT 1 | Monthly | 6.32 | 0.16 | 1.15% | 4.65% | 0.25 |
| PT 2 | Monthly | 4.35 | 0.18 | 1.85% | 5.85% | 0.32 |

where $g_{c,t+1}$ is the log consumption growth, and $z_t$ denotes the log price-consumption ratio. The specification for the real log return on the market portfolio is

$$
r_{m,t+1} = \kappa_{0,m} + \kappa_{1,m} z_{m,t+1} - z_{m,t} + g_{d,t+1}, \tag{A.22}
$$

where $g_{d,t+1}$ is the log dividend growth rate, and $z_{m,t}$ denotes the log price-dividend ratio. The values for $\kappa_0$, $\kappa_{0,m}$, $\kappa_1$, and $\kappa_{1,m}$ are constants that are derived through the approximation of Campbell and Shiller (1988).[14]

The dynamics of log consumption growth and log dividend growth, which incorporate a small persistent predictable component $x_t$ (the long run risk component) and a time-varying volatility component $\sigma_t$ reflecting fluctuating economic uncertainty, are

$$
\begin{aligned}
x_{t+1} &= \rho x_t + \varphi_e \sigma_t e_{t+1} \\
g_{c,t+1} &= \mu_c + x_t + \sigma_t \eta_{t+1} \\
g_{d,t+1} &= \mu_d + \phi x_t + \varphi_d \sigma_t u_{t+1} \\
\sigma_{t+1}^2 &= \sigma^2 + v_1 (\sigma_t^2 - \sigma^2) + \sigma_w w_{t+1},
\end{aligned}
\tag{A.23}
$$

with $e_{t+1}$, $u_{t+1}$, $\eta_{t+1}$, and $w_{t+1}$ having i.i.d. standard Normal distributions.[15] The state variables, which determine the price-consumption and price-dividend ratios, are $x_t$ and $\sigma_t$. The solutions for $z_t$ and $z_{m,t}$ are

$$
\begin{aligned}
z_t &= A_0 + A_1 x_t + A_2 \sigma_t^2 \\
z_{m,t} &= A_{0,m} + A_{1,m} x_t + A_{2,m} \sigma_t^2.
\end{aligned}
\tag{A.24}
$$

The derivation of $A$ and $A_m$ can be found in Bansal and Yaron (2004) and Bansal et al. (2010 and 2012).

[14] Bansal et al. (2010) show that $\kappa_1$ is equal to $\exp(\bar{z})/(1+\exp(\bar{z}))$, and $\kappa_0$ is equal to $\ln(1+\exp(\bar{z})) - \kappa_1 \bar{z}$, where $\bar{z}$ is the mean log price-consumption ratio. Accordingly, $\kappa_{1,m}$ is given by $\exp(\bar{z}_m)/(1+\exp(\bar{z}_m))$, and $\kappa_{0,m}$ is equal to $\ln(1+\exp(\bar{z}_m)) - \kappa_{1,m} \bar{z}_m$, with $\bar{z}_m$ being the mean log price-dividend ratio.

[15] Bansal and Yaron (2004) also simulate a version of their model without time-varying volatility of consumption growth, which is less successful in matching empirical data moments.
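The constants in footnote 14 and the preference parameter θ defined with equation (A.20) are simple closed-form expressions, which the following sketch evaluates. The function names are hypothetical, and the value `z_bar = 0.0` in the usage note is chosen only because it makes the arithmetic transparent; it is not a calibrated mean log price-consumption ratio.

```python
import math


def epstein_zin_theta(gamma, psi):
    """theta = (1 - gamma) / (1 - 1/psi), with risk aversion gamma and IES psi."""
    return (1.0 - gamma) / (1.0 - 1.0 / psi)


def campbell_shiller_constants(z_bar):
    """Return (kappa_0, kappa_1) for a mean log valuation ratio z_bar.

    kappa_1 = exp(z_bar) / (1 + exp(z_bar))
    kappa_0 = ln(1 + exp(z_bar)) - kappa_1 * z_bar
    """
    kappa1 = math.exp(z_bar) / (1.0 + math.exp(z_bar))
    kappa0 = math.log(1.0 + math.exp(z_bar)) - kappa1 * z_bar
    return kappa0, kappa1
```

For example, with ψ = 1.5 the two risk aversion values γ = 7.5 and γ = 10 give θ of about -19.5 and -27, and `campbell_shiller_constants(0.0)` returns κ₀ = ln 2 and κ₁ = 0.5. The same function applies to the market-portfolio constants when the mean log price-dividend ratio is passed instead.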

The model generates excess returns that are predictable by the price-dividend ratio, but the predictability is weak. The predictability is affected by the two state variables $\sigma_t^2$ and $x_t$. A negative shock to $\sigma_t^2$ results in a lower $E_t[R_{c,t+1}]$, which causes $E_t[M_{t+1}]$ to increase. Consequently, asset prices and price-dividend ratios are both higher. The higher prices cause a decrease in expected returns, and thus, a negative correlation between the price-dividend ratio and future returns. A positive shock to $x_t$ also causes an increase in $E_t[M_{t+1}]$ as $E_t[G_{t+1}]$ goes up: asset prices and price-dividend ratios increase. However, dividends in subsequent periods will be higher because of the positive shock to the growth rate. Thus, high price-dividend ratios are followed by higher cash flows, which weakens the negative correlation of price-dividend ratios and subsequent returns.

A.3.1 Calibration and simulation of model

The parameter values used by Bansal and Yaron (2004) are reported in Table A.5 in the “original value” column. My calibration of the model over the 1926-1967 sample uses the parameter values in the “1926-1967 value” column. For the risk aversion parameter $\gamma$, two values are given. The first value corresponds to the LRR 1 model. The LRR 1 model yields a simulated price-dividend ratio that is close to the empirical moment. The second value corresponds to the LRR 2 model, which matches the empirical log equity premium closely.

For the calibration with the 1926-1967 sample, I use the same consumption, dividend, and return data as for the calibration of the HF model, described previously.[16] Following Bansal and Yaron (2004), the parameters $\mu$, $\mu_d$, $\rho$, $\varphi_e$, $\phi$, $\varphi_d$, and $\sigma$ are chosen such that the model can replicate the log consumption growth and log dividend growth dynamics of the annual empirical data, as well as producing a price-dividend ratio (LRR 1) and an equity premium (LRR 2) that are close to their empirical counterparts of 22.34 and 7.42%, respectively. For the 1926-1967 sample, log consumption growth has a mean of 1.80% and a standard deviation of 3.08% with an autocorrelation of 0.32. The variance ratios at the 2, 5, and 10 year horizon are 1.35, 1.32, and 1.37, respectively. The log dividend growth has a standard deviation of 14.27% and an autocorrelation of -0.03. The correlation between log consumption and log dividend growth is 0.57. The parameters of the economic uncertainty process $v_1$ and $\sigma_w$ are selected such that predictable variation of consumption volatility with the log price-dividend ratio is 3% as in the empirical data.

[16] Bansal and Yaron (2004) assume consumption takes place at the end of a period. I assume the same timing convention.

Panel A of Table A.6 reports the moments of the simulated data from the LRR model for $\gamma = 7.5$ (LRR 1) and $\gamma = 10$ (LRR 2) when the Bansal and Yaron (2004) parameter values are used. The simulation is based on the analytical solutions of the model. The analytical solutions are considered more reliable than the numerical solutions (see, for example, Bansal et al. (2010 and 2012) and Beeler and Campbell (2012)). The model is simulated at a monthly frequency and time-aggregated to lower frequencies. The obtained data moments match the data moments in Bansal and Yaron (2004) and Beeler and Campbell (2012). Panel B of Table A.6 reports the simulated moments based on my recalibration of the model with data from 1926 to 1967.
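The time-aggregation step mentioned above is not spelled out in this appendix, so the following is only a sketch under an assumption: monthly log returns (or log growth rates) are aggregated to quarterly or annual values by summing non-overlapping blocks. Flow variables measured in levels may require a different treatment. The function names are hypothetical. The second helper assumes that the log Sharpe ratio is the mean log equity premium divided by its standard deviation, which is consistent with the reported table entries (for instance 2.70% / 16.75% is roughly 0.16).

```python
import numpy as np


def aggregate_log_series(monthly, months_per_period=12):
    """Sum monthly log values over non-overlapping blocks (12 = annual, 3 = quarterly).

    Trailing months that do not fill a complete block are dropped.
    """
    monthly = np.asarray(monthly, dtype=float)
    n_periods = monthly.size // months_per_period
    trimmed = monthly[: n_periods * months_per_period]
    return trimmed.reshape(n_periods, months_per_period).sum(axis=1)


def log_sharpe_ratio(log_excess_returns):
    """Mean of the log excess returns divided by their sample standard deviation."""
    r = np.asarray(log_excess_returns, dtype=float)
    return r.mean() / r.std(ddof=1)
```

Calling `aggregate_log_series(r_monthly, 3)` on a simulated monthly series would give the quarterly series, and passing the result to `log_sharpe_ratio` gives the corresponding ratio at that frequency.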

Table A.5: Long Run Risk model parameter values

The parameter values from Bansal and Yaron (2004) are reported in the “original value” column. The parameter values chosen for the calibration of the model based on data from 1926 to 1967 are reported in the “1926-1967 value” column. When two values are given for the same parameter, then the first value stands for the LRR 1 model and the second value for the LRR 2 model. A * denotes that the value is at a monthly frequency.

| Description | Variable | Original value | 1926-1967 value |
|---|---|---|---|
| Mean log consumption growth* | $\mu_c$ | 0.0015 | 0.0015 |
| Mean log dividend growth* | $\mu_d$ | 0.0015 | 0.0015 |
| Persistence of $x_t$* | $\rho$ | 0.979 | 0.977 |
| Volatility multiple of $x_t$* | $\varphi_e$ | 0.044 | 0.049 |
| Dividend leverage* | $\phi$ | 3.00 | 3.70 |
| Dividend volatility multiple* | $\varphi_d$ | 4.50 | 4.80 |
| Unconditional mean of $\sigma_t$* | $\sigma$ | 0.0078 | 0.0083 |
| Persistence of $\sigma_t$* | $v_1$ | 0.987 | 0.987 |
| Baseline volatility* | $\sigma_w$ | 0.23×10⁻⁵ | 0.23×10⁻⁵ |
| Risk aversion | $\gamma$ | 7.5 / 10 | 7.5 / 10 |
| IES | $\psi$ | 1.50 | 1.50 |
| Time discount factor* | $\delta$ | 0.9880 | 0.9885 |

Table A.6: Long Run Risk model simulated moments

Simulated moments at monthly, quarterly, and annual frequencies are reported for the specifications of the LRR model with $\gamma = 7.5$ (LRR 1) and $\gamma = 10$ (LRR 2). For Panel A, the parameter values of Bansal and Yaron (2004) are used. For Panel B, the parameter values are estimated based on a sample with data from 1926 to 1967. The price-dividend ratio moments are annualized.

Panel A: Based on original parameter values

| Model | Freq. | P/D, mean | Log P/D, std. dev. | Log equity prem., mean | Log equity prem., std. dev. | Log Sharpe ratio |
|---|---|---|---|---|---|---|
| LRR 1 | Annual | 26.86 | 0.20 | 2.70% | 16.75% | 0.16 |
| LRR 2 | Annual | 20.61 | 0.20 | 4.08% | 16.46% | 0.25 |
| LRR 1 | Quarterly | 26.68 | 0.17 | 0.67% | 8.32% | 0.08 |
| LRR 2 | Quarterly | 20.43 | 0.17 | 1.03% | 8.22% | 0.13 |
| LRR 1 | Monthly | 26.65 | 0.16 | 0.23% | 4.81% | 0.05 |
| LRR 2 | Monthly | 20.44 | 0.16 | 0.35% | 4.76% | 0.07 |

Panel B: Based on 1926-1967 parameter values

| Model | Freq. | P/D, mean | Log P/D, std. dev. | Log equity prem., mean | Log equity prem., std. dev. | Log Sharpe ratio |
|---|---|---|---|---|---|---|
| LRR 1 | Annual | 23.10 | 0.27 | 4.13% | 21.10% | 0.20 |
| LRR 2 | Annual | 16.46 | 0.26 | 6.17% | 20.50% | 0.30 |
| LRR 1 | Quarterly | 22.79 | 0.23 | 1.04% | 10.56% | 0.10 |
| LRR 2 | Quarterly | 16.31 | 0.22 | 1.58% | 10.30% | 0.15 |
| LRR 1 | Monthly | 22.72 | 0.22 | 0.35% | 6.09% | 0.06 |
| LRR 2 | Monthly | 16.27 | 0.21 | 0.52% | 5.95% | 0.09 |
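The monthly dynamics in equation (A.23) can be simulated directly from the values in Table A.5. The sketch below uses the “original value” column as defaults and produces only the log consumption and log dividend growth series, not prices or returns. The function name `simulate_lrr` and the small positive floor on the variance are assumptions of this sketch; the appendix does not state how draws that would make the variance negative are handled.

```python
import numpy as np


def simulate_lrr(n_months, seed=0, mu_c=0.0015, mu_d=0.0015, rho=0.979,
                 phi_e=0.044, phi=3.0, phi_d=4.5, sigma=0.0078,
                 v1=0.987, sigma_w=0.23e-5):
    """Simulate monthly log consumption and log dividend growth from (A.23)."""
    rng = np.random.default_rng(seed)
    # Four independent standard Normal shock series: e, eta, u, w.
    e, eta, u, w = rng.standard_normal((4, n_months))
    x = 0.0                 # long run risk component x_t, started at its mean
    sig2 = sigma ** 2       # conditional variance, started at its mean
    g_c = np.empty(n_months)
    g_d = np.empty(n_months)
    for t in range(n_months):
        sig = np.sqrt(sig2)
        # Growth rates for t+1 use the time-t states x_t and sigma_t.
        g_c[t] = mu_c + x + sig * eta[t]
        g_d[t] = mu_d + phi * x + phi_d * sig * u[t]
        # Update the two state variables.
        x = rho * x + phi_e * sig * e[t]
        sig2 = sigma ** 2 + v1 * (sig2 - sigma ** 2) + sigma_w * w[t]
        sig2 = max(sig2, 1e-12)  # assumed floor to keep the variance positive
    return g_c, g_d
```

A call such as `g_c, g_d = simulate_lrr(12 * 1000)` yields 1,000 years of monthly data; switching to the “1926-1967 value” column only requires passing the corresponding keyword arguments.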

References

Avramov, D., S. Cederburg, and K. Lucivjanska. 2016. Are stocks riskier over the long run? Taking cues from economic theory. Unpublished working paper.

Baker, M., and J. Wurgler. 2000. The equity share in new issues and aggregate stock returns. Journal of Financial Economics 55:2219-2257.

Bansal, R., A. R. Gallant, and G. Tauchen. 2007. Rational pessimism, rational exuberance, and asset pricing models. Review of Economic Studies 74:1005-1033.

Bansal, R., D. Kiku, and A. Yaron. 2007. A note on the economics and statistics of predictability: A long run risks perspective. Unpublished working paper.

Bansal, R., D. Kiku, and A. Yaron. 2010. Risks for the long run: Estimation and inference. Unpublished working paper.

Bansal, R., D. Kiku, and A. Yaron. 2012. An empirical evaluation of the long-run risks model for asset prices. Critical Finance Review 1:183-221.

Bansal, R., and A. Yaron. 2004. Risks for the long run: A potential resolution of asset pricing puzzles. Journal of Finance 59:1481-1509.

Barberis, N. 2000. Investing for the long run when returns are predictable. Journal of Finance 55:225-264.

Barberis, N., M. Huang, and T. Santos. 2001. Prospect theory and asset prices. Quarterly Journal of Economics 116:1-53.

Barro, R. J. 2006. Rare disasters and asset markets in the twentieth century. Quarterly Journal of Economics 121:823-866.

Beeler, J., and J. Y. Campbell. 2012. The long run risks model and aggregate asset prices: An empirical assessment. Critical Finance Review 1:141-182.

Brandt, M. W., A. Goyal, P. Santa-Clara, and J. R. Stroud. 2005. A simulation approach to dynamic portfolio choice with an application to learning about return predictability. Review of Financial Studies 18:831-873.

Campbell, J. Y. 1987. Stock returns and the term structure. Journal of Financial Economics 18:373-399.

Campbell, J. Y., and J. H. Cochrane. 1998. By force of habit: A consumption-based explanation of aggregate stock market behavior. Center for Research in Security Prices Working Paper No. 412.

Campbell, J. Y., and J. H. Cochrane. 1999. By force of habit: A consumption-based explanation of aggregate stock market behavior. Journal of Political Economy 107:205-251.

Campbell, J. Y., S. Giglio, and C. Polk. 2013. Hard times. Review of Asset Pricing Studies 3:95-132.

Campbell, J. Y., and R. J. Shiller. 1988. The dividend-price ratio and expectations of future dividends and discount factors. Review of Financial Studies 1:195-228.

Campbell, J. Y., and S. B. Thompson. 2008. Predicting excess stock returns out of sample: Can anything beat the historical average? Review of Financial Studies 21:1509-1531.

Cochrane, J. H. 2008. The dog that did not bark: A defense of return predictability. Review of Financial Studies 21:1534-1575.

Constantinides, G. M., and A. Ghosh. 2012. Asset pricing tests with long-run risks in consumption growth. Review of Asset Pricing Studies 1:96-136.

Del Negro, M., and F. Schorfheide. 2011. Bayesian macroeconometrics. The Oxford Handbook of Bayesian Econometrics.

Epstein, L., and S. Zin. 1989. Substitution, risk aversion and the temporal behavior of consumption and asset returns: A theoretical framework. Econometrica 57:937-969.

Fama, E. F., and K. French. 1988. Dividend yields and expected stock returns. Journal of Financial Economics 22:3-25.

Fama, E. F., and K. French. 1989. Business conditions and expected returns on stocks and bonds. Journal of Financial Economics 25:23-49.

Ferson, W., S. Nallareddy, and B. Xie. 2013. The “out-of-sample” performance of long run risk models. Journal of Financial Economics 107:537-556.

Heaton, J., and D. Lucas. 1999. Stock prices and fundamentals, in O. J. Blanchard, and S. Fischer (eds), NBER Macroeconomics Annual: 1999. Cambridge, MA: MIT Press.

Ingram, B. F., and C. H. Whiteman. 1994. Supplanting the ‘Minnesota’ prior: Forecasting macroeconomic time series using real business cycle model priors. Journal of Monetary Economics 34:497-510.

Koop, G. 2003. Bayesian econometrics. John Wiley & Sons, Ltd, England.

Kruttli, M. S., A. J. Patton, and T. Ramadorai. 2015. The impact of hedge funds on asset markets. Review of Asset Pricing Studies 5:185-226.

Lettau, M., and S. C. Ludvigson. 2001. Consumption, aggregate wealth, and expected stock returns. Journal of Finance 56:815-849.

Lettau, M., and S. C. Ludvigson. 2005. Expected returns and expected dividend growth. Journal of Financial Economics 76:583-626.

Lettau, M., S. C. Ludvigson, and J. A. Wachter. 2008. The declining equity premium: What role does macroeconomic risk play? Review of Financial Studies 21:1653-1687.

Lettau, M., and S. Van Nieuwerburgh. 2008. Reconciling the return predictability evidence. Review of Financial Studies 21:1607-1652.

Li, Y., D. T. Ng, and B. Swaminathan. 2013. Predicting market returns using aggregate implied cost of capital. Journal of Financial Economics 110:419-436.

Litterman, R. B. 1986. Forecasting with Bayesian vector autoregressions: Five years of experience. Journal of Business & Economic Statistics 4:25-38.

Ludvigson, S. C. 2013. Advances in consumption-based asset pricing: Empirical tests, in G. M. Constantinides, M. Harris, and R. M. Stulz (eds), Handbook of the Economics of Finance: 2013. Netherlands, Amsterdam: Elsevier B.V.

Mehra, R., and E. C. Prescott. 1985. The equity premium: A puzzle. Journal of Monetary Economics 15:145-161.

Pastor, L., and R. F. Stambaugh. 2009. Predictive systems: Living with imperfect predictors. Journal of Finance 64:1583-1628.

Pastor, L., and R. F. Stambaugh. 2012. Are stocks really less volatile in the long run? Journal of Finance 67:431-478.

Penasse, J. 2016. Return predictability: Learning from the cross-section. Unpublished working paper.

Pettenuzzo, D., A. Timmermann, and R. Valkanov. 2014. Forecasting stock returns under economic constraints. Journal of Financial Economics 114:517-553.

Polk, C., S. Thompson, and T. Vuolteenaho. 2006. Cross-sectional forecasts of the equity premium. Journal of Financial Economics 81:101-141.

Shanken, J. A., and A. Tamayo. 2012. Payout yield, risk, and mispricing: A Bayesian analysis. Journal of Financial Economics 105:131-152.

Stambaugh, R. F. 1999. Predictive regressions. Journal of Financial Economics 54:375-421.

Wachter, J. A. 2005. Solving models with external habit. NBER Working Paper No. 11559.

Wachter, J. A., and M. Warusawitharana. 2009. Predictable returns and asset allocation: Should a skeptical investor time the market? Journal of Econometrics 148:162-178.

Wachter, J. A., and M. Warusawitharana. 2015. What is the chance that the equity premium varies over time? Evidence from regressions on the dividend-price ratio. Journal of Econometrics 186:74-93.

Welch, I., and A. Goyal. 2008. A comprehensive look at the empirical performance of equity premium prediction. Review of Financial Studies 21:1455-1508.
