ifdp · August 31, 2010

The Information Content of High-Frequency Data for Estimating Equity Return Models and Forecasting Risk

Abstract

We demonstrate that the parameters controlling skewness and kurtosis in popular equity return models estimated at daily frequency can be obtained almost as precisely as if volatility is observable by simply incorporating the strong information content of realized volatility measures extracted from high-frequency data. For this purpose, we introduce asymptotically exact volatility measurement equations in state space form and propose a Bayesian estimation approach. Our highly efficient estimates lead in turn to substantial gains for forecasting various risk measures at horizons ranging from a few days to a few months ahead when taking also into account parameter uncertainty. As a practical rule of thumb, we find that two years of high frequency data often suffice to obtain the same level of precision as twenty years of daily data, thereby making our approach particularly useful in finance applications where only short data samples are available or economically meaningful to use. Moreover, we find that compared to model inference without high-frequency data, our approach largely eliminates underestimation of risk during bad times or overestimation of risk during good times. We assess the attainable improvements in VaR forecast accuracy on simulated data and provide an empirical illustration on stock returns during the financial crisis of 2007-2008.

Board of Governors of the Federal Reserve System International Finance Discussion Papers Number 1005 September 2010 The Information Content of High-Frequency Data for Estimating Equity Return Models and Forecasting Risk Dobrislav Dobrev Pawel Szerszen NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors. Recent IFDPs are available on the Web at www.federalreserve.gov/pubs/ifdp/. This paper can be downloaded without charge from Social Science Research Network electronic library at www.ssrn.com.

Dobrislav Dobrev, Pawel Szerszen ∗

First version: December 30, 2009. This version: August 25, 2010.

JEL classification: C11; C13; C14; C15; C22; C53; C80; G17

Keywords: Equity return models; Parameter uncertainty; Bayesian estimation; MCMC; High-frequency data; Jump-robust volatility measures; Value at Risk; Forecasting.
∗ Dobrislav Dobrev: Federal Reserve Board of Governors, Dobrislav.P.Dobrev@frb.gov. Pawel Szerszen: Federal Reserve Board of Governors, Pawel.J.Szerszen@frb.gov. We are grateful to Torben Andersen, Federico Bandi, Luca Benzoni, Neil Ericsson, Michael Gordy, Erik Hjalmarsson, Michael Johannes, Matthew Pritsker, and Viktor Todorov for helpful discussions and comments. We also thank conference participants at the 2010 International Risk Management Conference, the 30th International Symposium on Forecasting, the 16th International Conference on Computing in Economics and Finance, as well as seminar participants at the Federal Reserve Board, Johns Hopkins University, the Office of the Comptroller of the Currency, and University of Maryland for their feedback. Excellent research assistance has been provided by Patrick Mason as well as Erica Reisman and Raymond Zhong. Any errors or omissions are our sole responsibility. The views in this paper are solely those of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or of any other person associated with the Federal Reserve System.

1 Introduction

Modeling equity returns is central to risk management, derivatives pricing, portfolio choice, and asset pricing in general. Continuous time jump-diffusion models succeeding those pioneered by Merton (1969) and Black and Scholes (1973) are now commonplace. Typically, the inherent time-varying and stochastic nature of continuous market activity is represented by a combination of persistent and non-persistent latent stochastic volatility factors. The pronounced asymmetric return-volatility relation in equities, known also as leverage or volatility feedback effects, is captured by correlated return and volatility innovations. Sudden price revisions due to news and other market surprises give rise to jumps in returns, while the often abrupt changes in the level of market activity and risk have justified the introduction of jumps in volatility. The latent nature of volatility in such rich models, however, poses serious challenges for reliable inference based solely on daily or monthly return series, even the longest existing ones. It is thus critical to develop estimation methods exploiting relevant additional information that could help reduce the severe parameter and volatility estimation uncertainty.

Two different approaches have emerged to improve estimation efficiency in this regard. The first approach relies on the cross section of option prices over time.$^{1}$ However, as pointed out by Eraker, Johannes, and Polson (2003), it is unclear whether the inclusion of option price data decreases or increases parameter uncertainty, given that the risk premia embedded in option prices introduce additional parameters, which are typically difficult to estimate.
The second and seemingly more viable approach avoids such complications by exclusively relying on daily realized volatility measures extracted from nowadays ubiquitous high-frequency intraday return data.$^{2,3}$ Our paper contributes to this second line of research by utilizing high-frequency realized volatility measures within a standard Bayesian Markov Chain Monte Carlo (MCMC) estimation framework of popular equity return models. In particular, we take explicitly into account the resulting substantial reduction in parameter uncertainty and are able to show sizeable economic gains when forecasting risk.

The studies most closely related to our work, such as Alizadeh, Brandt, and Diebold (2002), Barndorff-Nielsen and Shephard (2002), Bollerslev and Zhou (2002), Corradi and Distaso (2006), and Todorov (2009) among others, have used classical rather than Bayesian estimation methods and have focused on using high-frequency volatility measures for assessing the goodness of fit of alternative model specifications without explicitly analyzing the economic value of reducing parameter uncertainty. These studies have largely ruled out the simplest known single factor stochastic volatility models with Poisson jumps in returns in favor of more complex specifications including one or more extra features such as a second stochastic volatility factor, pronounced non-linearities, jumps both in returns and volatility (possibly even of infinite activity).$^{4}$ But rather than reconciling or refining such findings, our main goal is to go a step beyond specification testing and clearly demonstrate the economic gains from harnessing the information content of high-frequency volatility measures regardless of the underlying model.

To this end, we exploit recent advances in jump robust volatility estimation from high frequency data such as Andersen, Dobrev, and Schaumburg (2009), Barndorff-Nielsen, Shephard, and Winkel (2006), Podolskij and Vetter (2009) and references therein to formally introduce an asymptotically precise volatility measurement equation directly within the standard state-space representation of popular equity return models estimated at daily or lower frequency. Then we adopt a standard Bayesian MCMC estimation framework allowing us to exploit the strong information content of such volatility measurement equation across a wide range of models featuring stochastic volatility, leverage effects, and jumps in returns and volatility.

In terms of efficiency, our approach considerably improves on Bayesian estimation methods based on an identical state-space representation at a daily or lower frequency but without a volatility measurement equation such as Eraker, Johannes, and Polson (2003) and Jacquier, Polson, and Rossi (2004), among others.$^{5}$ In terms of generality, we overcome major limitations of the quasi-maximum likelihood estimation methods for state-space formulations with volatility measurement equation pursued by Barndorff-Nielsen and Shephard (2002), who consider non jump-robust realized volatility measures, and Alizadeh, Brandt, and Diebold (2002), who consider non jump-robust and less efficient range-based volatility measures. In particular, our approach incorporates leverage effects and jumps, necessary for modeling equity returns, as well as possibly two (one persistent and one non-persistent) stochastic volatility factors.

We also offer an attractive alternative to existing moment-based estimation approaches such as Bollerslev and Zhou (2002), Corradi and Distaso (2006) and Todorov (2009) in terms of more fully exploiting the information content of high frequency volatility measures in various model settings via their state-space formulations. In particular, unlike these studies, the Bayesian estimation approach we propose allows us to easily account for parameter uncertainty and demonstrate the economic gains from using high-frequency volatility measures for model estimation and risk forecasting across a range of popular equity return models.$^{6}$

$^{1}$ See for example, Chernov and Ghysels (2000), Pan (2002), and Eraker (2004) among others.

$^{2}$ Recent surveys of the realized volatility literature include Andersen, Bollerslev, and Diebold (2009), Bandi and Russell (2007), Barndorff-Nielsen and Shephard (2007), McAleer and Medeiros (2008).

$^{3}$ Thorough empirical evidence pointing more broadly towards the value of realized volatility for modeling equity returns can be found in Andersen, Bollerslev, Diebold, and Ebens (2001).

$^{4}$ Results in the same spirit have been obtained also in studies based solely on daily returns or in combination with options data such as Broadie, Chernov, and Johannes (2007). Other non-parametric studies based on high-frequency data include Andersen, Bollerslev, and Dobrev (2007), Bandi and Reno (2009).

$^{5}$ Cf. Andersen, Benzoni, and Lund (2002) and Chernov, Gallant, Ghysels, and Tauchen (2003).

$^{6}$ As such, our results add to the growing body of evidence showing the economic value of high-frequency realized volatility measures in finance applications, e.g. Fleming, Kirby, and Ostdiek (2003) among others.

Our main contributions can be summarized as follows. First, we demonstrate theoretically and empirically that the parameters controlling skewness and kurtosis in popular equity return models estimated at daily and monthly frequency can be obtained almost as precisely as if volatility is observable by incorporating the strong information content of realized volatility measures extracted from high-frequency data. In particular, we extend the empirical findings in Alizadeh, Brandt, and Diebold (2002) by showing that not only the parameters controlling volatility of volatility but also those controlling leverage effects can be estimated several times more precisely by exploiting high-frequency volatility measures. Second, we show that our highly efficient estimates lead in turn to substantial gains for forecasting various risk measures at horizons ranging from a few days to a few months ahead when taking also into account parameter uncertainty. In fact, our approach not only reduces the root mean square prediction error but also shrinks and almost eliminates the forecast bias, which inevitably arises from the pronounced nonlinearities in the involved transformation of parameter and volatility estimates. As a practical rule of thumb we find that two years of high frequency data often suffice to obtain the same level of precision as twenty years of daily data, thereby making our approach particularly useful in finance applications where only short data samples are available or economically meaningful to use. Third, and most important in risk management applications, our simulation results reveal that risk forecasts stemming from traditional model inference on daily data tend to be overly conservative in good times (e.g. overestimating risk by as much as 30%) but they are not conservative enough in bad times (e.g. underestimating risk by as much as 10%). 
By contrast, risk forecasts based on our approach to exploiting high-frequency data are considerably closer to the truth in both bad and good times. Thanks to incorporating the strong information content of high-frequency volatility measures, we are able to better curb risk taking exactly when needed the most, i.e. early on in times of crisis, while avoiding unnecessary overstatement of risk in normal times. Finally, our findings are robust both across different models and jump-robust volatility measures on high frequency data that we analyze. This allows us to remain largely agnostic about the best suited ones, while making a strong case for the potentially large economic value of our approach to using high-frequency volatility measures in model estimation and risk forecasting or other closely related finance applications such as derivatives pricing.

The rest of the paper is organized as follows. Section 2 introduces our volatility measurement equations in detail. Section 3 incorporates such equations within the state space formulation of popular equity return models and develops appropriate Bayesian estimation methods. Section 4 documents the resulting gains in estimation efficiency and risk forecasting accuracy. Section 5 provides an empirical comparison of Value-at-Risk forecasts on S&P 500 and Google returns during the financial crisis of 2007-2008. Section 6 concludes.

2 Volatility measurement equations

Jumps in returns have been recognized as an important feature for continuous-time modeling of equity returns within the standard no-arbitrage semimartingale setting. Moreover, recent progress in non-parametric volatility measurement based on high-frequency intraday data has made it possible to separate ex-post the daily continuous part of the volatility process from the daily return variation induced by discontinuities or jumps. Originally pioneered by Barndorff-Nielsen and Shephard (2004), jump-robust volatility estimators with different asymptotic and finite sample properties have been proposed by Andersen, Dobrev, and Schaumburg (2009), Barndorff-Nielsen, Shephard, and Winkel (2006), Podolskij and Vetter (2009) among others. A common feature among these and other high-frequency volatility estimators is that as the intraday sampling frequency increases, the arising measurement error shrinks to zero and converges to a known mixed normal asymptotic distribution.$^{7}$

For our purposes, suitable asymptotic results of this kind directly imply asymptotically precise measurement equations that formally capture the extent to which the continuous and jump parts of daily total variance become ex-post nearly observable when high frequency intraday data is available. Such separation of the continuous and jump components of volatility can be directly utilized in state space form. In this section, we formally introduce a general form of the jump-robust volatility measurement equations that play a key role in our approach to estimating models in state space form and allow us to tackle considerably more general settings than those considered by Alizadeh, Brandt, and Diebold (2002) and Barndorff-Nielsen and Shephard (2002) in the absence of jump-robust volatility measures nearly a decade ago.
2.1 Jump-robust estimators of diffusive volatility

On a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \geq 0}, P)$ we consider an adapted process $Y = \{Y_t\}_{t \geq 0}$, providing the following jump-diffusion representation of the evolution of the logarithmic price of an asset in continuous time:

$$dY_t = \mu_t\,dt + \sigma_t\,dB_t + dJ_t \tag{1}$$

Here $\mu$ is a locally bounded and predictable process, $\sigma$ is cadlag and bounded away from zero almost surely, while $J$ is a jump process so that $dJ_t$, whenever different from zero, represents the size of a jump at time $t$. Without loss of generality, we restrict attention to finite activity jumps.$^{8}$

For a day of unit length with $M+1$ discrete observations of the logarithmic price process $\{Y_t\}_{0 \leq t \leq 1}$ on $0 \leq t_0 < t_1 < \cdots < t_M \leq 1$ we denote the intraday time intervals and corresponding returns as $\Delta t_i = t_i - t_{i-1}$ and $\Delta Y_i = Y_{t_i} - Y_{t_{i-1}}$, $i = 1, \ldots, M$. In what follows, we consider standard continuous record in-fill asymptotics where the time intervals characterizing the intraday sampling scheme uniformly shrink towards zero as the sampling frequency $M$ increases.

In this setting, the daily quadratic variation (QV) of the observed process consists of the sum of its continuous and jump parts, $QV = \int_0^1 \sigma_u^2\,du + \sum_{0 \leq u \leq 1} (dJ_u)^2$, and is estimated consistently by the well established realized volatility (RV) measure:$^{9}$

$$RV_M = \sum_{i=1}^{M} (\Delta Y_i)^2 . \tag{2}$$

Our main object of interest, though, is the diffusive part of the quadratic variation defined as the integrated variance (IV), $IV = \int_0^1 \sigma_u^2\,du$. It can be conveniently estimated by various multipower variation measures developed by Barndorff-Nielsen and Shephard (2004) and Barndorff-Nielsen, Shephard, and Winkel (2006) or more recent analogous measures based on nearest neighbor truncation developed by Andersen, Dobrev, and Schaumburg (2009).$^{10}$ In the case of finite activity jumps, the most efficient multipower variation measure that allows for an asymptotic mixed normal limit theory is the realized tripower variation (TV) based on the product of triplets of adjacent absolute returns:$^{11}$

$$TV_M = \mu_{2/3}^{-3} \left( \frac{M}{M-2} \right) \sum_{i=2}^{M-1} |\Delta Y_{i-1}|^{2/3}\, |\Delta Y_i|^{2/3}\, |\Delta Y_{i+1}|^{2/3} \tag{3}$$

The TV estimator is only marginally less efficient than the corresponding MedRV estimator based on (two-sided) nearest neighbor truncation, taking the median instead of the product of triplets of adjacent absolute returns:$^{12}$

$$MedRV_M = \frac{\pi}{6 - 4\sqrt{3} + \pi} \left( \frac{M}{M-2} \right) \sum_{i=2}^{M-1} \mathrm{med}\left( |\Delta Y_{i-1}|, |\Delta Y_i|, |\Delta Y_{i+1}| \right)^2 \tag{4}$$

Hence, in our empirical analysis we rely on both TV and MedRV, allowing us to conclude that our main results are not sensitive to the particular jump-robust volatility measures that we use to derive volatility measurement equations. By presenting these equations below in generic form, we are able to abstract from the chosen jump-robust estimators that we are going to utilize in the state space formulation of various models for the sake of reducing parameter and volatility estimation uncertainty.

2.2 Generic asymptotic results and volatility measurement equations

Let $\widehat{IV}_M$ be some jump-robust volatility estimator applicable in the considered setting such as TV and MedRV defined above. Then a central limit theorem (CLT) of the following generic form holds:

$$\sqrt{M} \left( \widehat{IV}_M - IV \right) \xrightarrow{D} N\left( 0,\ \nu \int_0^1 \sigma_u^4\,du \right), \tag{5}$$

where $\nu$ is a known asymptotic variance factor depending on the particular estimator (e.g. 3.06 for TV and 2.96 for MedRV), while $IQ = \int_0^1 \sigma_u^4\,du$ is the integrated quarticity controlling the precision of all such estimators. Moreover, since the convergence in (5) is stable, it is possible to apply the delta method to derive feasible asymptotic results based on any consistent jump-robust estimator $\widehat{IQ}_M$ of $IQ$.$^{13}$ In particular,

$$\sqrt{M}\, \frac{\widehat{IV}_M - IV}{\sqrt{\nu\, \widehat{IQ}_M}} \xrightarrow{D} N(0, 1), \tag{6}$$

and

$$\sqrt{M}\, \frac{\log(\widehat{IV}_M) - \log(IV)}{\sqrt{\nu\, \widehat{IQ}_M / \widehat{IV}_M^{2}}} \xrightarrow{D} N(0, 1). \tag{7}$$

$^{7}$ The rate of convergence is typically square-root. It is slower for high-frequency volatility measures that are robust also to market microstructure noise, empirically found to matter at sample frequencies higher than a few minutes.

$^{8}$ Our subsequent analysis remains valid to the extent that the utilized asymptotic results are unaffected by jumps of possibly infinite activity (but still finite variation).

$^{9}$ For recent surveys of the realized volatility literature see, e.g., Andersen, Bollerslev, and Diebold (2009), Bandi and Russell (2007), Barndorff-Nielsen and Shephard (2007), McAleer and Medeiros (2008).

$^{10}$ Other approaches that involve potentially delicate threshold or bandwidth choices include the truncated RV of Mancini (2006) and Aït-Sahalia and Jacod (2007), the truncated bipower variation of Corsi, Pirino, and Renò (2008), as well as the quantile RV estimator of Christensen, Oomen, and Podolskij (2008).

$^{11}$ The scaling factor $\mu_p$ is defined as $\mu_p = E|U|^p = 2^{p/2}\, \Gamma((p+1)/2) / \Gamma(1/2)$, $U \sim N(0,1)$.

$^{12}$ The asymptotic variance factor for TV is 3.06 as opposed to 2.96 for MedRV. Also by design, MedRV is somewhat more robust than TV not only to jumps but also to the occurrence of “zero” returns in finite samples.

$^{13}$ Without loss of generality, in our empirical analysis we focus on the popular realized quad-power quarticity estimator $QQ_M = \frac{\pi^2}{4} M \left( \frac{M}{M-3} \right) \sum_{i=1}^{M-3} |\Delta Y_i|\, |\Delta Y_{i+1}|\, |\Delta Y_{i+2}|\, |\Delta Y_{i+3}|$ of Barndorff-Nielsen and Shephard (2004) as well as the slightly more efficient (and robust to both jumps and zero returns) median realized quarticity estimator $MedRQ_M = \frac{3\pi M}{9\pi + 72 - 52\sqrt{3}} \left( \frac{M}{M-2} \right) \sum_{i=2}^{M-1} \mathrm{med}\left( |\Delta Y_{i-1}|, |\Delta Y_i|, |\Delta Y_{i+1}| \right)^4$ of Andersen, Dobrev, and Schaumburg (2009).
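The estimators in equations (2)-(4) and the median realized quarticity of footnote 13 are simple functions of one day's intraday returns. The following is a minimal Python sketch of them, not the authors' code: the function names (`mu_p`, `rv`, `tv`, `med_rv`, `med_rq`) and the use of NumPy are our own illustrative choices, and the input `r` is assumed to be an array holding the $M$ intraday log returns $\Delta Y_1, \ldots, \Delta Y_M$ of a single day.

```python
import math
import numpy as np


def mu_p(p):
    """Absolute moment E|U|^p of U ~ N(0,1), as defined in footnote 11."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.gamma(0.5)


def _triplet_median(abs_r):
    """Median of each triplet of adjacent absolute returns (i-1, i, i+1)."""
    return np.median(np.stack([abs_r[:-2], abs_r[1:-1], abs_r[2:]]), axis=0)


def rv(r):
    """Realized volatility, equation (2): estimates QV, jumps included."""
    r = np.asarray(r, dtype=float)
    return float(np.sum(r ** 2))


def tv(r):
    """Realized tripower variation, equation (3): jump-robust estimate of IV."""
    a = np.abs(np.asarray(r, dtype=float)) ** (2.0 / 3.0)
    m = a.size
    return float(mu_p(2.0 / 3.0) ** -3 * m / (m - 2) * np.sum(a[:-2] * a[1:-1] * a[2:]))


def med_rv(r):
    """MedRV, equation (4): jump-robust estimate of IV from triplet medians."""
    a = np.abs(np.asarray(r, dtype=float))
    m = a.size
    scale = math.pi / (6.0 - 4.0 * math.sqrt(3.0) + math.pi)
    return float(scale * m / (m - 2) * np.sum(_triplet_median(a) ** 2))


def med_rq(r):
    """MedRQ, footnote 13: jump-robust estimate of the integrated quarticity IQ."""
    a = np.abs(np.asarray(r, dtype=float))
    m = a.size
    scale = 3.0 * math.pi * m / (9.0 * math.pi + 72.0 - 52.0 * math.sqrt(3.0))
    return float(scale * m / (m - 2) * np.sum(_triplet_median(a) ** 4))
```

On a simulated day with constant volatility and one added jump, `rv` picks up the squared jump while `med_rv` stays close to the integrated variance, which is the separation of continuous and jump variation described above.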

The log transformation in (7) results in better finite sample approximation than (6), as already noted by Barndorff-Nielsen and Shephard (2005) and Huang and Tauchen (2005). This is especially useful for our purposes as we will focus our subsequent analysis exactly on logarithmic SV models.

In what follows, we denote the feasible estimate of the asymptotic variance of $\log(\widehat{IV}_M)$ implied by (7) as $\widehat{\Omega}_M = \nu\, \widehat{IQ}_M / \widehat{IV}_M^{2}$ to obtain the following logarithmic volatility measurement equation that we are going to utilize in the state space representation of various logarithmic SV models (with leverage effects and jumps) to improve estimation efficiency:

$$\log(\widehat{IV}_M) \approx \log(IV) + \sqrt{\tfrac{1}{M}\, \widehat{\Omega}_M}\ \varepsilon_t , \tag{8}$$

where $\varepsilon_t \sim N(0,1)$ is independent of the underlying process and the measurement error vanishes as the intraday sampling frequency $M$ increases. More generally, to make explicit distinction between different days, we rewrite this key equation as:

$$\log(\widehat{IV}_{t,t+1;M}) \approx \log(IV_{t,t+1}) + \sqrt{\tfrac{1}{M}\, \widehat{\Omega}_{t,t+1;M}}\ \varepsilon_t , \tag{9}$$

where $\varepsilon_t \sim N(0,1)$ as above, while $\log(IV_{t,t+1})$, $\log(\widehat{IV}_{t,t+1;M})$, and $\widehat{\Omega}_{t,t+1;M}$ stand, respectively, for the logarithm of the true daily diffusive variance, the logarithm of its available jump-robust estimate at any sample frequency $M$, and the corresponding asymptotic variance on a given day of unit length represented by the interval $(t, t+1]$.

We restrict attention to moderate sample frequencies such as two or five minutes (e.g. $M = 195$ or $M = 78$ over a typical trading day of six and a half hours) in order to avoid complications arising from various market microstructure effects that cannot be safely ignored.$^{14}$ Alternatively, for jump-robust volatility estimation at higher frequencies one can resort to noise-reduction techniques such as pre-averaging, introduced in the context of multipower variations by Podolskij and Vetter (2009).$^{15}$

Quite similarly to the way we obtained equation (9) above, it is possible to single out also the jump part of volatility by using available asymptotic results for the difference between non jump-robust and jump-robust high frequency volatility measures, such as those exploited for moment-based estimation by Todorov (2009). What is important to keep in mind is that any such volatility measurement equations based on high-frequency data similar to (9) do not require knowledge of the exact intraday dynamics of the logarithmic price process. This observation is crucial for our analysis as it allows us to largely abstract from modeling complications due to non-trivial intraday market microstructure effects. Thus, in the next section we focus entirely on the estimation of popular parametric models for equity returns at daily or lower frequencies by directly bringing our generic daily volatility measurement equations based on high-frequency intraday data to the state space form of each model.

$^{14}$ Volatility measures obtained at higher frequencies can incur biases due to market imperfections such as bid-ask bounce effects, stale quotes, price discreteness, and intraday patterns.

$^{15}$ The extra robustness comes at the cost of lower convergence rate that can be easily accommodated by our generic volatility measurement equation (9) by changing the power of $M$ accordingly.

3 Equity return models and estimation

With this extra machinery at hand, our goal is to demonstrate the ease and importance of utilizing high-frequency data for more efficient estimation of a broad range of commonly used equity return models. On one side of the spectrum we consider a basic continuous-time diffusion model similar to the setting of Jacquier, Polson, and Rossi (2004) with log-volatility specification, leverage effect and no jumps. On the other side of the spectrum we also study a two-factor logarithmic SV model with leverage effects and compound Poisson jumps in returns. It offers a less restrictive setting than the two-factor models studied by Alizadeh, Brandt, and Diebold (2002) and Bollerslev and Zhou (2002) thanks to incorporating both leverage effects and jumps. Moreover, like the single-factor model, it can still be successfully fitted using information on daily data only, which we use as a natural benchmark for gauging the attainable efficiency gains from our approach to incorporating high-frequency data. Formally, by relying on Bayesian estimation methods, we are able to fully exploit the information content of high frequency volatility measures within the standard state-space form of the models. Hence, we can obtain a clean measure of the incremental value of high-frequency data compared to estimation based on daily data only.
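To give a rough sense of how informative the daily measurement equation (9) of Section 2.2 is, the sketch below computes the feasible asymptotic variance $\widehat{\Omega} = \nu\, \widehat{IQ} / \widehat{IV}^2$ and the implied standard deviation $\sqrt{\widehat{\Omega}/M}$ of the measurement error in $\log(\widehat{IV})$. It is an illustration only: the function name `log_iv_measurement` is hypothetical, and the numerical inputs assume a day with constant volatility (so that $IQ = IV^2$), which is our simplifying assumption and not a setting taken from the paper.

```python
import math


def log_iv_measurement(iv_hat, iq_hat, m, nu):
    """Return (log IV-hat, measurement-error std) for equation (9).

    omega_hat = nu * iq_hat / iv_hat**2 is the feasible asymptotic variance
    of log(IV-hat); the daily measurement error has std sqrt(omega_hat / m).
    """
    omega_hat = nu * iq_hat / iv_hat ** 2
    return math.log(iv_hat), math.sqrt(omega_hat / m)


# Hypothetical day with constant volatility, so that IQ = IV^2 and omega_hat = nu.
iv_hat = 1e-4            # assumed daily integrated variance (1% daily volatility)
for m in (78, 195):      # five-minute and two-minute sampling over 6.5 hours
    log_iv, std = log_iv_measurement(iv_hat, iv_hat ** 2, m, nu=2.96)  # MedRV factor
    print(m, round(log_iv, 3), round(std, 3))
```

Under this constant-volatility assumption the measurement error standard deviation of the log integrated variance is about 0.19 at five-minute sampling and about 0.12 at two-minute sampling when MedRV ($\nu = 2.96$) is used.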
As shown in Das and Sundaram (1999) among others, models with stochastic volatility, leverage effects and jumps allow for skewness and excess kurtosis of returns and make it possible to closely match stylized facts of empirical asset return distributions that have been extensively studied under both physical and risk-neutral measures. For example, Andersen, Benzoni, and Lund (2002) find that adding jumps in returns to single-factor stochastic volatility models can help better fit stock return skewness and kurtosis and better reproduce volatility smiles in option prices. Eraker, Johannes, and Polson (2003) further extend single-factor jump-diffusion models by adding jumps not only in returns but also in volatility, which Broadie, Chernov, and Johannes (2007) show to be important for fitting volatility skewness and kurtosis. Studies of stochastic volatility models with similar findings under risk-neutral measure include Bakshi, Cao, and Chen (1997), Bates (2000), among others. A unified approach using both returns and options data, pursued by Chernov and Ghysels (2000), Eraker (2004) and Jones (2003), has also stressed the importance of properly fitting the conditional skewness and kurtosis of return distributions at various return horizons.

From such broader modeling point of view, it is not our goal to use high-frequency data for the sake of improved specification testing as done, for example, by Alizadeh, Brandt, and Diebold (2002), Bollerslev and Zhou (2002), Corradi and Distaso (2006) and Todorov (2009). Instead, we go a step beyond specification testing and attempt to clearly demonstrate first the efficiency and then the economic gains from harnessing the information content of high-frequency volatility measures regardless of the underlying model.

As our workhorse for analysis, in this section we develop appropriate Bayesian estimation methods that allow us to easily incorporate high frequency volatility measurement equations (such as those presented in the previous section) directly in state space form of any model. In this regard, our estimation approach is closest to Barndorff-Nielsen and Shephard (2002), although they do not allow for jumps and use quasi-maximum likelihood rather than Bayesian estimation methods. By following a Bayesian Markov Chain Monte Carlo (MCMC) approach to estimation, we are able to easily take parameter uncertainty into account and demonstrate that high-frequency information helps greatly increase precision in the parameter estimates governing skewness and kurtosis of returns, which in turn leads to considerably more precise and less biased Value-at-Risk forecasts for multi-day returns. Thus, our study contributes directly to the growing body of evidence that high frequency returns are an important source of information in asset pricing and risk management.

Without loss of generality, here we restrict our exposition to one and two-factor models on opposite sides of the spectrum in terms of complexity.
We impose a logarithmic specification for the stochastic volatility components in our models directly in line with Andersen, Bollerslev, Christoffersen, and Diebold (2007) who point out that lognormal/normal mixture models show great appeal in financial risk management in view of the empirically observed near lognormality of realized volatility coupled with the near normality of daily returns standardized by realized volatility.

3.1 One-factor log-SV model with leverage effects

We consider a standard one-factor log-SV model that provides a high level of simplicity and transparency, while it is still rich enough to allow for both skewness and excess kurtosis of asset returns. Our contribution consists in extending the equations of the model in state space form with our extra volatility measurement equation derived in generic form in Section 2.2, which is the only difference compared to the standard specification in Jacquier, Polson, and Rossi (2004). It is worth noting that Jones (2003) has studied a similar system of equations with extra measurement equation coming from option implied volatilities. In our model we use high-frequency volatility measures as extra information, which in contrast to implied volatilities allows us to theoretically derive the variance of measurement noise and does not require estimation of risk-premia related parameters.

In order to facilitate the exposition, we first present the part of our single-factor model identical with Jacquier, Polson, and Rossi (2004). After standard first order Euler discretization as in Kloeden and Platen (1992), or cast directly as a discrete-time model, the system of equations takes the following form:

$$Y_{t+\Delta} - Y_t = \mu \Delta + \exp\left( \frac{h_t}{2} \right) \sqrt{\Delta}\ \varepsilon^{(1)}_{t+\Delta} \tag{10}$$

$$h_{t+\Delta} = h_t + \kappa_h (\theta_h - h_t) \Delta + \sigma_h \sqrt{\Delta} \left( \rho_h\, \varepsilon^{(1)}_{t+\Delta} + \sqrt{1 - \rho_h^2}\ \varepsilon^{(2)}_{t+\Delta} \right) \tag{11}$$

where $t = 0, \Delta, 2\Delta, \ldots, T\Delta$ is a sequence of discrete times, $\{\varepsilon^{(j)}_t\}_{t \geq 0}$, $j = 1, 2$ are sequences of jointly independent i.i.d. $N(0,1)$ random variables, $\{Y_t\}_{t \geq 0}$ denotes the logarithmic asset price or index level at time $t$, $\mu \in \mathbb{R}$ is the drift part of the return process, $\kappa_h \in (0, 2)$ defines the speed of mean reversion$^{16}$ of the log-volatility process $h_t$ towards its mean $\theta_h \in \mathbb{R}$, $\sigma_h > 0$ defines the volatility of volatility parameter, $\rho_h \in (-1, 1)$ defines the typically negative correlation between returns and volatility increments known as leverage effect, and finally $\Delta > 0$ is a discretization parameter. In this paper we consider dynamics at a daily frequency and fix accordingly $\Delta = 1$.$^{17}$

We next consider a version of the model new to the literature, where the discretized system of equations (10)-(11) is augmented by our additional daily volatility measurement equation based on high-frequency data, given by (9) above for this model as:

$$\log(\widehat{IV}_{t,t+\Delta;M}) \approx \alpha_0 + h_t + \sqrt{\tfrac{1}{M}\, \widehat{\Omega}_{t,t+\Delta;M}}\ \varepsilon^{(IV)}_{t+\Delta} , \tag{12}$$

where $\{\varepsilon^{(IV)}_t\}_{t \geq 0}$ is a sequence of i.i.d. $N(0,1)$ random variables independent of $\{\varepsilon^{(j)}_t\}_{t \geq 0}$ for $j = 1, 2$, while $\{\widehat{IV}_{t,t+\Delta;M}\}_{t \geq 0}$ is some integrated variance measure such as MedRV or TV with measurement error determined by the sampling frequency $M$ and efficiency $\{\widehat{\Omega}_{t,t+\Delta;M}\}_{t \geq 0}$ as described in Section 2.2. Note that both $\{\widehat{IV}_{t,t+\Delta;M}\}_{t \geq 0}$ and $\{\widehat{\Omega}_{t,t+\Delta;M}\}_{t \geq 0}$ are treated as daily observations and are directly calculated as functions of the available high frequency intraday returns at any suitable sample frequency $M$.
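The state space system (10)-(12) with $\Delta = 1$ can be simulated in a few lines, which may help in seeing what data each version of the model observes. The sketch below is a hedged illustration rather than the authors' simulation design: the function name `simulate_log_sv`, the choice to start $h$ at $\theta_h$, and all parameter values in the usage example are hypothetical, and $\widehat{\Omega}_{t,t+\Delta;M}$ is replaced by the constant $\nu$ (its value when intraday volatility is constant within the day, since then $IQ = IV^2$).

```python
import numpy as np


def simulate_log_sv(T, mu, kappa_h, theta_h, sigma_h, rho_h, M, nu,
                    alpha0=0.0, seed=0):
    """Simulate T days of the one-factor log-SV system (10)-(12), Delta = 1.

    Returns daily returns (length T), log-variances h (length T + 1), and the
    noisy daily observations of log integrated variance (length T).
    """
    rng = np.random.default_rng(seed)
    eps1 = rng.standard_normal(T)      # return innovations, eps^(1)
    eps2 = rng.standard_normal(T)      # independent volatility innovations, eps^(2)
    eps_iv = rng.standard_normal(T)    # measurement noise, eps^(IV)

    h = np.empty(T + 1)
    h[0] = theta_h                     # assumption: start at the long-run mean
    ret = np.empty(T)
    for t in range(T):
        # Equation (10): daily return with volatility exp(h_t / 2).
        ret[t] = mu + np.exp(h[t] / 2.0) * eps1[t]
        # Equation (11): mean-reverting log-variance with leverage rho_h.
        h[t + 1] = (h[t] + kappa_h * (theta_h - h[t])
                    + sigma_h * (rho_h * eps1[t]
                                 + np.sqrt(1.0 - rho_h ** 2) * eps2[t]))
    # Equation (12) with Omega-hat set to nu (constant intraday volatility).
    log_iv_hat = alpha0 + h[:-1] + np.sqrt(nu / M) * eps_iv
    return ret, h, log_iv_hat


# Hypothetical parameter values, chosen only for illustration.
ret, h, log_iv_hat = simulate_log_sv(
    T=500, mu=0.0005, kappa_h=0.02, theta_h=np.log(1e-4),
    sigma_h=0.15, rho_h=-0.5, M=78, nu=2.96)
```

The "daily only" version of the model would use `ret` alone, while the version with high-frequency information would use both `ret` and `log_iv_hat`, with `h` latent in both cases.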
As part of the volatility measurement equation (12) we also introduce an optional auxiliary parameter $\alpha_0$, which serves the purpose of correcting for the discrepancy between the log integrated variance measures $\log(\widehat{IV}_{t,t+\Delta;M})$ calculated using open-to-close intraday data and the corresponding log-variances of close-to-close daily returns represented by $h_t$.[18] Note that such a correction is not required if we use open-to-close data for the daily returns, in which case we simply impose $\alpha_0 = 0$. To complete the probabilistic set-up of the one-factor model, we assume that all random variables are constructed on a probability space $(\Omega, \mathcal{F}, P)$ with a given filtration $\{\mathcal{F}_t\}_{t\ge 0}$ and that all processes are adapted to the filtration.

[16] As usual, we restrict $\kappa_h$ to satisfy standard stationarity conditions for $h_t$.
[17] As noted by Eraker, Johannes, and Polson (2003), the discretization bias for daily data is not significant.
[18] This is a standard correction of realized volatility measures as given in more detail, for example, by Hansen and Lunde (2005).

We keep the daily dynamics given by (10) and (11) at the center of our analysis, while (12) serves the sole purpose of incorporating the information content of high-frequency data without incurring modeling complications due to market microstructure effects and other features of intraday data not relevant for modeling daily returns, as discussed in Section 2.2. Thus, the use of non-parametric high-frequency volatility measures $\{\widehat{IV}_{t,t+\Delta;M}\}_{t\ge 0}$ designed to be robust to known irregularities of intraday data gives us an additional degree of freedom to implicitly allow high-frequency returns to follow possibly different dynamics from those of daily returns.

In order to find the contribution of high-frequency information, we consider the above two versions of the model in state space form: (i) the one with daily returns only; (ii) the one including both daily returns and a daily volatility measurement equation from high-frequency intraday data. The former is given by the system of equations (10)-(11), while the latter consists of all equations from the “daily only” model augmented by our additional volatility measurement equation (12).

3.2 Two-factor log-SV model with leverage effects and jumps

Alizadeh, Brandt, and Diebold (2002) and Bollerslev and Zhou (2002) provide strong support in favor of two-factor models of foreign exchange rates by utilizing high-frequency data as part of non-Bayesian estimation procedures for specifications without leverage effects. Their first factor mimics the long-memory component in volatility, while the second factor has a considerably smaller degree of persistence. Bollerslev and Zhou (2002) further find that even in the presence of a second short-memory stochastic volatility factor, it is still important to also include a jump component in the model. Therefore, we consider a two-factor log-SV model with compound Poisson jumps in returns.
Moreover, we extend the specification by incorporating leverage effects, which allows us to also model the negative correlation between return and volatility innovations typical for equity returns. Thus, our two-factor logarithmic stochastic volatility model with Poisson jumps in returns represents a very general setting in the current literature. It can still be estimated successfully using only daily data, which we do for the sake of comparison with our approach based on an extra volatility measurement equation. Similarly to our one-factor specification above, the discretized version of our two-factor model is given by the following set of equations in state space form, where the probabilistic setup and notation are analogous to those of our one-factor model:

$$Y_{t+\Delta} - Y_t = \mu\Delta + \exp\left(\frac{h_t + f_t}{2}\right)\sqrt{\Delta}\,\varepsilon^{(1)}_{t+\Delta} + q_{t+\Delta}\cdot J_{t+\Delta} \tag{13}$$

$$h_{t+\Delta} = h_t + \kappa_h(\theta_h - h_t)\Delta + \sigma_h\sqrt{\Delta}\left(\rho_h\,\varepsilon^{(1)}_{t+\Delta} + \sqrt{1-\rho_h^2}\,\varepsilon^{(2)}_{t+\Delta}\right) \tag{14}$$

$$f_{t+\Delta} = f_t + \kappa_f(\theta_f - f_t)\Delta + \sigma_f\sqrt{\Delta}\left(\rho_f\,\varepsilon^{(1)}_{t+\Delta} + \sqrt{1-\rho_f^2}\,\varepsilon^{(3)}_{t+\Delta}\right) \tag{15}$$

$$\log(\widehat{IV}_{t,t+\Delta;M}) \approx \alpha_0 + h_t + f_t + \sqrt{\frac{1}{M}\,\widehat{\Omega}_{t,t+\Delta;M}}\;\varepsilon^{(IV)}_{t+\Delta} \tag{16}$$

We assume without loss of generality that $\kappa_h < \kappa_f$ and denote the persistent and non-persistent volatility factors as $h_t$ and $f_t$ respectively. Other than that, the parameters $\kappa_f$ and $\sigma_f$ governing the short-memory factor $f_t$ have a similar domain and interpretation as their counterparts $\kappa_h$ and $\sigma_h$ for the long-memory factor $h_t$. We further assume for identification purposes $\theta_f = 0$, since only the total (unconditional) mean log-volatility is identified in the model. Also by construction, $\{\varepsilon^{(j)}_t\}_{t\ge 0}$, $j = 1, 2, 3$, and $\{\varepsilon^{(IV)}_t\}_{t\ge 0}$ are sequences of jointly independent i.i.d. $N(0,1)$ random variables. Thus, we allow for leverage effects in both factors, which is seen more explicitly by defining the innovations specific to $h_t$ and $f_t$ as:

$$\varepsilon^{(h)}_{t+\Delta} = \rho_h\,\varepsilon^{(1)}_{t+\Delta} + \sqrt{1-\rho_h^2}\,\varepsilon^{(2)}_{t+\Delta} \tag{17}$$

$$\varepsilon^{(f)}_{t+\Delta} = \rho_f\,\varepsilon^{(1)}_{t+\Delta} + \sqrt{1-\rho_f^2}\,\varepsilon^{(3)}_{t+\Delta} \tag{18}$$

In particular, the instantaneous covariance matrix between return and volatility innovations is given by:

$$\Sigma_{t+1|t} \equiv E\left(\begin{pmatrix}\varepsilon^{(1)}_{t+1}\\ \varepsilon^{(h)}_{t+1}\\ \varepsilon^{(f)}_{t+1}\end{pmatrix}\begin{pmatrix}\varepsilon^{(1)}_{t+1}\\ \varepsilon^{(h)}_{t+1}\\ \varepsilon^{(f)}_{t+1}\end{pmatrix}'\right) = \begin{pmatrix}1 & \rho_h & \rho_f\\ \rho_h & 1 & 0\\ \rho_f & 0 & 1\end{pmatrix},$$

where we impose the positive definite restriction $1 - \rho_h^2 - \rho_f^2 > 0$.

Our compound Poisson jump specification with normally distributed jump sizes draws on Andersen, Benzoni, and Lund (2002), Eraker, Johannes, and Polson (2003), and Johannes and Polson (2002). In particular, we assume a maximum of one jump per day. The jump increments in the interval $(t, t+\Delta]$ follow the law of $q_{t+\Delta}\cdot J_{t+\Delta}$, where the jump indicators $\{q_t\}_{t\ge 0}$ are i.i.d. Bernoulli($\lambda$) and the jump sizes $\{J_t\}_{t\ge 0}$ are i.i.d. $N(\mu_J, \sigma_J^2)$.
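As a rough sketch of how equations (13)-(16) and the Bernoulli jump specification fit together, the simulation below generates one sample path with $\Delta = 1$ and $\theta_f = 0$. The function name, the dictionary keys (with `lam`, `mu_J` and `sigma_J` standing for the jump intensity, mean and standard deviation of jump sizes), and the constant `omega` standing in for $\widehat{\Omega}_{t,t+\Delta;M}$ are hypothetical choices made for illustration only.

```python
import numpy as np

def simulate_two_factor(T, p, M=78, omega=3.0, seed=0):
    """Simulate the two-factor log-SV model with jumps, equations (13)-(16).

    `p` is a dict of parameters; `omega` is a hypothetical constant used in
    place of the day-specific Omega-hat computed from intraday data.
    """
    rng = np.random.default_rng(seed)
    e = rng.standard_normal((T, 3))          # eps^(1), eps^(2), eps^(3)
    e_iv = rng.standard_normal(T)            # measurement noise eps^(IV)
    q = rng.random(T) < p["lam"]             # Bernoulli(lambda) jump indicators
    J = rng.normal(p["mu_J"], p["sigma_J"], T)  # normally distributed jump sizes
    # Factor-specific innovations, equations (17)-(18)
    eps_h = p["rho_h"] * e[:, 0] + np.sqrt(1.0 - p["rho_h"] ** 2) * e[:, 1]
    eps_f = p["rho_f"] * e[:, 0] + np.sqrt(1.0 - p["rho_f"] ** 2) * e[:, 2]
    h = np.empty(T + 1)
    f = np.empty(T + 1)
    h[0], f[0] = p["theta_h"], 0.0
    for t in range(T):
        # Equations (14)-(15); theta_f = 0 for identification
        h[t + 1] = h[t] + p["kappa_h"] * (p["theta_h"] - h[t]) + p["sigma_h"] * eps_h[t]
        f[t + 1] = f[t] + p["kappa_f"] * (0.0 - f[t]) + p["sigma_f"] * eps_f[t]
    # Equation (13): diffusive return plus at most one jump per day
    r = p["mu"] + np.exp((h[:-1] + f[:-1]) / 2.0) * e[:, 0] + q * J
    # Equation (16): noisy measurement of the total log-variance h_t + f_t
    log_iv = p["alpha0"] + h[:-1] + f[:-1] + np.sqrt(omega / M) * e_iv
    return r, h, f, q, log_iv
```

Note that the measurement `log_iv` is informative only about the sum `h + f`, so separating the two factors still relies on their different persistence.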
The parameters $\lambda > 0$, $\mu_J \in \mathbb{R}$ and $\sigma_J > 0$ denote respectively the jump intensity, mean and standard deviation of jump sizes. Since at a daily frequency the jump intensity parameter $\lambda$ is close to zero, our assumption of a maximum of one jump per day is not binding.

Most importantly, we extend the state-space form of the model with our volatility measurement equation (16), which is a direct counterpart to equation (12) in the one-factor model and specializes equation (9) given in general form in Section 2.2. Here the high-frequency measure of log integrated variance $\log(\widehat{IV}_{t,t+\Delta;M})$ is an estimate of $h_t + f_t$ as the total diffusive variance in the two-factor model. The extra parameter $\alpha_0$ serves the same purpose as in the one-factor model. It provides a standard correction for the discrepancy between log integrated variance measures $\log(\widehat{IV}_{t,t+\Delta;M})$ calculated using open-to-close intraday data and the log variance of close-to-close daily returns modeled by $h_t + f_t$. For modeling the log variance of open-to-close daily returns we simply restrict $\alpha_0 = 0$.

In order to find the contribution of high-frequency information, similarly to our one-factor model, we consider two versions of the two-factor model: (i) the one with only daily returns; (ii) the one including both daily returns and a daily volatility measurement equation from high-frequency intraday data. The former is given by the system of equations (13)-(15), while the latter consists of all equations from the “daily” model augmented by our additional volatility measurement equation (16).

3.3 Estimation

3.3.1 Markov chain Monte Carlo methods

We first briefly describe the general principles of Markov chain Monte Carlo (MCMC) methods, with a more detailed exposition in Chib and Greenberg (1996), Johannes and Polson (2002) and Jones (1998). Let $Y$ denote the vector of observations, $X$ the vector of latent state variables and $\Theta$ the vector of model parameters. In Bayesian inference we utilize the prior information on the parameters to derive the joint posterior distribution for both parameters and state variables. By Bayes' rule, we have:

$$p(\Theta, X \mid Y) \propto p(Y \mid X, \Theta)\cdot p(X \mid \Theta)\cdot p(\Theta),$$

where $p(Y \mid X, \Theta)$ is the likelihood function of the model, $p(X \mid \Theta)$ is the probability distribution of the state variables conditional on the parameters and $p(\Theta)$ is the prior probability distribution on the parameters of the model. Ideally we would like to know the analytical properties of the joint posterior distribution of $X$ and $\Theta$; however, this is hardly feasible.
The high-dimensional joint posterior distribution is very often analytically intractable, and hence even direct simulation from it is hard to perform. In the sequel we base our exposition on Jones (2003). The idea behind MCMC methods is to break the high-dimensional vectors of latent variables $X$ and parameters $\Theta$ into smaller pieces. The Gibbs sampler developed in Geman and Geman (1984) considers a partitioning of $X$ and $\Theta$ into respectively $I_X$ and $I_\Theta$ subvectors $X^{(1)}, X^{(2)}, \ldots, X^{(I_X)}$ and $\Theta^{(1)}, \Theta^{(2)}, \ldots, \Theta^{(I_\Theta)}$. Then the Markov chain is constructed by first defining starting values of the chain $X_0$ and $\Theta_0$ and then iteratively forming the chain

$$(X_n, \Theta_n) = (X_n^{(1)}, X_n^{(2)}, \ldots, X_n^{(I_X)}, \Theta_n^{(1)}, \Theta_n^{(2)}, \ldots, \Theta_n^{(I_\Theta)}).$$

The draws of $(X_n, \Theta_n)$ are performed for each $i = 1, \ldots, I_X$ and each $j = 1, \ldots, I_\Theta$ by drawing from the following transition densities:

$$p(X_n^{(i)} \mid X_n^{(-i)}, \Theta_{n-1}, Y), \quad i = 1, 2, \ldots, I_X \tag{19}$$

$$p(\Theta_n^{(j)} \mid \Theta_n^{(-j)}, X_n, Y), \quad j = 1, 2, \ldots, I_\Theta \tag{20}$$

where $X_n^{(-i)} \equiv (X_n^{(k)}; k < i) \cup (X_{n-1}^{(k)}; k > i)$ and $\Theta_n^{(-j)} \equiv (\Theta_n^{(k)}; k < j) \cup (\Theta_{n-1}^{(k)}; k > j)$. It can be shown that under mild conditions the chain $(X_n, \Theta_n)$ converges to its invariant distribution $p(\Theta, X \mid Y)$, which is by construction the joint posterior distribution of the model under consideration. The proof of convergence of the Gibbs sampler to the invariant distribution, sufficient conditions and some applications can be found in Chib and Greenberg (1996).

The Gibbs sampler algorithm provides a tractable method to draw from multidimensional and complicated distributions only if one can draw from all complete conditional distributions in equations (19) and (20). However, even one-dimensional complete conditional distributions can in practice be difficult if not impossible to draw from. In this case we replace the particular Gibbs sampler step by a Metropolis-Hastings (MH) step as in Metropolis, Rosenbluth and Rosenbluth (1953). Chib and Greenberg (1996) provide further details about the MH algorithm. The main building block of our estimation method is the Gibbs sampler algorithm with some blocks replaced by MH steps.

After discarding a “burn-in” period of the first $N$ draws, the discrete approximation $\{(X_n, \Theta_n)\}_{n>N}$ of the joint posterior density $p(\Theta, X \mid Y)$ allows one to compute various statistics. For example, the sample mean of the posterior draws can be taken to obtain parameter estimates for our models.
Likewise, one can estimate statistics of particular interest in applications, such as moment and quantile forecasts for multi-horizon returns, associated risk measures such as Value-at-Risk (VaR), or any other function of the conditional multi-horizon return density such as the price of a derivative contract. Moreover, parameter uncertainty is automatically taken into account by integrating over the entire joint posterior distribution of parameters and state variables. This important property of MCMC estimation methods is especially valuable for our purposes, as it allows us to show how increasing the precision of parameter and volatility state estimation (by including our volatility measurement equations (12) and (16)) translates into more accurate conditional return density forecasts, and into more accurate moments and quantiles in particular.
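The combination of Gibbs steps, MH steps and burn-in described above can be sketched on a toy problem that is deliberately much simpler than the log-SV models of this paper. The example below is an assumption-laden illustration, not the authors' sampler: it infers the mean `m` and log-variance `s` of i.i.d. Gaussian data, drawing `m` from its exact conditional (a Gibbs step under a flat prior) and updating `s` by a random-walk MH step under a hypothetical $N(0, 10^2)$ prior. All names and tuning values are illustrative.

```python
import numpy as np

def log_cond_s(s, m, y):
    """Log of p(s | m, y) up to a constant, with a N(0, 10^2) prior on s."""
    return (-0.5 * len(y) * s
            - 0.5 * np.sum((y - m) ** 2) / np.exp(s)
            - 0.5 * (s / 10.0) ** 2)

def metropolis_within_gibbs(y, n_draws=5000, burn_in=1000, step=0.1, seed=0):
    """Toy Gibbs sampler with one block replaced by a random-walk MH step."""
    rng = np.random.default_rng(seed)
    m, s = 0.0, 0.0                      # starting values of the chain
    draws = np.empty((n_draws, 2))
    for n in range(n_draws):
        # Gibbs step: m | s, y is Gaussian under a flat prior on m
        m = rng.normal(y.mean(), np.sqrt(np.exp(s) / len(y)))
        # MH step: propose s, accept with the usual log-ratio rule
        s_prop = s + step * rng.standard_normal()
        log_ratio = log_cond_s(s_prop, m, y) - log_cond_s(s, m, y)
        if np.log(rng.random()) < log_ratio:
            s = s_prop
        draws[n] = m, s
    return draws[burn_in:]               # discard the burn-in period
```

Posterior means are then plain averages over the retained draws, for example `draws[:, 0].mean()` for `m`; any function of the parameters can be averaged in the same way, which is how parameter uncertainty enters forecasts.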

3.3.2 Bayesian MCMC inference for models with high frequency volatility measurement equations

We limit our exposition to describing our MCMC estimation procedure for the two-factor stochastic volatility models from Section 3.2.[19] We put special emphasis on how to estimate models including our high-frequency measurement equations by offering a straightforward extension of estimation methods based only on daily returns. Following the notation from the previous section, we need to specify the vector of observations $Y$, the vector of latent state variables $X$ and the vector of parameters $\Theta$ along with their appropriate subdivision in line with the construction of the Gibbs sampler algorithm. In particular, we define the following vectors, where “Daily” stands for estimation based only on daily returns (equations (13)-(15)) and “HF” stands for estimation also incorporating volatility measures based on high-frequency intraday data (equations (13)-(16)):

$$Y^{(Daily)} = \{\{Y_t\}_{t=1,\ldots,T}\}$$

$$Y^{(HF)} = \{\{Y_t\}_{t=1,\ldots,T},\ \{\widehat{IV}_{t,t+1;M}\}_{t=1,\ldots,T-1},\ \{\widehat{\Omega}_{t,t+1;M}\}_{t=1,\ldots,T-1}\}$$

$$X = \{\{h_t\}_{t=1,\ldots,T},\ \{f_t\}_{t=1,\ldots,T},\ \{q_t\}_{t=2,\ldots,T},\ \{J_t\}_{t=2,\ldots,T}\}$$

$$\Theta = \{\mu, \kappa_h, \theta_h, (\sigma_h, \rho_h), \kappa_f, (\sigma_f, \rho_f), \lambda, \mu_J, \sigma_J, \alpha_0\}.$$

The partitions of $\Theta$ and $X$ are given by $\Theta^{(1)} = \mu$, $\Theta^{(2)} = \kappa_h$, $\Theta^{(3)} = \theta_h$, $\Theta^{(4)} = (\sigma_h, \rho_h)$, $\Theta^{(5)} = \kappa_f$, $\Theta^{(6)} = (\sigma_f, \rho_f)$, $\Theta^{(7)} = \lambda$, $\Theta^{(8)} = \mu_J$, $\Theta^{(9)} = \sigma_J$ and $X^{(i)} = h_i$, $X^{(i+T)} = f_i$, $X^{(j+2T)} = q_{j+1}$, $X^{(j+(3T-1))} = J_{j+1}$, where $i = 1, 2, \ldots, T$ and $j = 1, \ldots, T-1$. Thus, we treat each element of the state vector $X$ as a single block. For the vector of parameters $\Theta$, each element is likewise treated as its own block, with the exception of $(\sigma_h, \rho_h)$ and $(\sigma_f, \rho_f)$. These parameters are drawn jointly as in Jacquier, Polson, and Rossi (2004).
Finally, the extra parameter $\Theta^{(10)} = \alpha_0$ in equation (16) appears only in the “HF” model including high-frequency information. It is either estimated along with the rest of the parameters or specified exogenously following standard approaches in the realized volatility literature for obtaining variances for the whole day, such as Hansen and Lunde (2005). It is set to zero when modeling open-to-close daily returns.

Having defined above all blocks for the latent state variables $X$ and parameters $\Theta$, we apply the MCMC algorithm based on the Gibbs sampler presented in Section 3.3.1. Since draws of all parameters and jump-related latent variables are standard in the literature, we directly refer to Szerszen (2009) for the imposed prior distributions on the model parameters $\Theta$ and all other details.

[19] One-factor models can be viewed as a special case by restricting $f_t = 0$ for all $t$, omitting the parameters $\kappa_f, \theta_f, \sigma_f, \rho_f$ for the $f$ factor and imposing the constraint $\rho_f = 0$ in the instantaneous correlation matrix.

Here we focus on addressing the fundamental difference between estimation of the standard “Daily” and our “HF” version of the model, which differ just by the additional volatility measurement equation (16) based on high-frequency data. The information provided by this extra equation affects only the complete conditional posteriors of the volatility states $h_t$ and $f_t$. In particular, the MCMC update for $h_t$ is given by

$$p(h_t \mid \{f_t\}, h_{t+1}, h_{t-1}, q, J, \Theta, Y) \propto p(Y_{t+1} \mid Y_t, \{f_t\}, \{h_t\}, \Theta, q, J)\cdot p(Y_t \mid Y_{t-1}, \{f_t\}, \{h_t\}, \Theta, q, J)\cdot p(h_{t+1} \mid h_t, \Theta)\cdot p(h_t \mid h_{t-1}, \Theta)\cdot p(h_t \mid \widehat{IV}_{t,t+1;M}, \widehat{\Omega}_{t,t+1;M}, f_t, \Theta)$$

for $t = 1, 2, \ldots, T$, where the second and fourth kernels on the right-hand side are omitted for $t = 1$, while the first, third and last kernels are omitted for $t = T$. The MCMC update for the second factor $f_t$ is performed analogously.

Thus, an inspection of the above update expression reveals that the only kernel affected by the high-frequency information with $Y = Y^{(HF)}$ is the last one, $p(h_t \mid \widehat{IV}_{t,t+1;M}, \widehat{\Omega}_{t,t+1;M}, f_t)$ for the $h$ factor and, similarly, $p(f_t \mid \widehat{IV}_{t,t+1;M}, \widehat{\Omega}_{t,t+1;M}, h_t)$ for the $f$ factor. The rest of the kernels are exactly those coming from inference based on daily returns only, i.e. with $Y = Y^{(Daily)}$, which also appear with $Y = Y^{(HF)}$. This is of key importance for understanding how the extra information provided by high-frequency data improves estimation efficiency in our “HF” versus “Daily” approaches. The extra kernels $p(h_t \mid \widehat{IV}_{t,t+1;M}, \widehat{\Omega}_{t,t+1;M}, f_t)$ and $p(f_t \mid \widehat{IV}_{t,t+1;M}, \widehat{\Omega}_{t,t+1;M}, h_t)$ in the MCMC updates of $h$ and $f$, respectively, are sharply peaked around the mode for dates with low values of $\frac{1}{M}\widehat{\Omega}_{t,t+1;M}$ in the volatility measurement equation (16) and, hence, they are very informative about the latent volatility states. The attainable precision improvements increase with the sampling frequency $M$ and depend also on $\widehat{\Omega}_{t,t+1;M}$, which is a function of the underlying volatility paths and the chosen high-frequency integrated variance and quarticity measures as detailed in Section 2.2.
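The sharpening effect of the extra kernel can be illustrated with a deliberately simplified calculation. The sketch below keeps only the two state-transition kernels and the measurement kernel of the update for $h_t$, drops the two return kernels, and sets $\rho_h = 0$ and $\Delta = 1$, so that every remaining kernel is Gaussian in $h_t$ and can be combined by precision weighting. This is an illustrative assumption, not the update actually used for estimation; the function and argument names are hypothetical.

```python
import numpy as np

def h_conditional(h_prev, h_next, log_iv, f_t, omega_over_M,
                  kappa, theta, sigma, alpha0=0.0):
    """Gaussian conditional of h_t from three kernels (simplified sketch).

    Kernels used: p(h_t | h_{t-1}), p(h_{t+1} | h_t) and the measurement
    kernel from equation (16). Return kernels and leverage are ignored.
    """
    phi = 1.0 - kappa  # AR(1) coefficient implied by equation (14)
    means = np.array([
        kappa * theta + phi * h_prev,      # from p(h_t | h_{t-1})
        (h_next - kappa * theta) / phi,    # from p(h_{t+1} | h_t), solved for h_t
        log_iv - alpha0 - f_t,             # from the measurement equation
    ])
    precisions = np.array([
        1.0 / sigma ** 2,
        phi ** 2 / sigma ** 2,
        1.0 / omega_over_M,                # zero when omega_over_M is infinite
    ])
    post_prec = precisions.sum()
    post_mean = (precisions * means).sum() / post_prec
    return post_mean, 1.0 / post_prec
```

Passing `omega_over_M=float("inf")` switches the measurement kernel off and reproduces the conditional implied by the transition kernels alone, while small values of `omega_over_M` pull the conditional mean toward `log_iv - alpha0 - f_t` and shrink its variance.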
By contrast, the use of only daily data is equivalent to artificially setting $\frac{1}{M}\widehat{\Omega}_{t,t+1;M}$ to infinity, which suppresses the strong information content of high-frequency data provided by our volatility measurement equation. In what follows, we analyze the gains in estimation efficiency and risk forecasting accuracy from our “HF” estimation, using traditional “Daily” estimation as a natural benchmark for comparison.

4 Estimation efficiency and risk forecasting accuracy

The ability to estimate parameters and volatility states more efficiently directly translates into more accurate risk forecasts. Moreover, the highly non-linear nature of the underlying transformation from noisy parameter and volatility estimates to risk forecasts implies a reduction not only in the variance but also in the bias of the prediction errors. Our analysis in this section is designed to study the interplay between longer sample size and higher intraday frequency as an additional source of information introduced by our volatility measurement equation for the purpose of reducing estimation uncertainty. We document that even for the longest sample lengths encountered in practice there is a substantial efficiency gain from incorporating the extra information provided by high-frequency volatility measures. Moreover, for key model parameters controlling skewness and kurtosis we find that two or five years of high-frequency data would suffice to obtain the same level of precision as twenty years of daily data. This suggests that our approach can be particularly useful in finance applications where only short data samples are available or economically meaningful to use.

It is possible to derive analytical results along these lines in certain more restrictive settings. An instructive example for a canonical log-SV model is given in the appendix. Monte Carlo analysis is the only viable option, though, for models that are not analytically tractable. Hence, we take a Monte Carlo approach to study estimation efficiency and the impact of parameter uncertainty on risk forecasting accuracy. We conduct considerably more thorough and extensive simulations than usual in order to properly document the substantial efficiency gains and improved precision of risk forecasts at horizons of up to a few months ahead, regardless of the chosen model, when high-frequency information is included in the model. Perhaps the most important of our findings is that there is considerable asymmetry between bad and good times when it comes to the attainable improvements in risk forecasting accuracy: in good times we are able to largely eliminate overstatement of risk, while in bad times our approach helps avoid understatement of risk. From a practical point of view, this implies imposing an appropriately larger risk cushion exactly when it is needed the most, e.g. early on in times of crisis (rather than with a delay), while at the same time avoiding excessive risk cushion requirements in normal times.
In this sense, our main purpose in what follows is to document both the efficiency gains for model estimation and forecasting and the implied potentially large economic value of our approach to incorporating the information content of high-frequency volatility measures for model estimation and risk forecasting.

4.1 Monte Carlo setup

In order to set the stage for the Monte Carlo analysis we first describe how to draw sample paths consistent with the data generating process implied by our model specifications. Daily dynamics of both returns and volatility are based on equations (10)-(11) and (13)-(15) respectively for the one-factor and two-factor log-SV models that we consider. The intraday dynamics are based on a Brownian bridge connecting consecutive daily sample points and producing valid integrated variance measures $\{\widehat{IV}_{t,t+1;M}\}_{t\ge 0}$ and corresponding scaled integrated quarticity measures $\{\widehat{\Omega}_{t,t+1;M}\}_{t\ge 0}$ that govern our additional volatility measurement equations in (12) and (16) as described in Section 2.2. In this way, we allow for potentially richer intraday dynamics than those at the daily frequency, possibly also including realistic intraday market-microstructure effects that many novel high-frequency volatility measures are designed to be robust to when sampled at two to five minute frequency.[20]

We draw 1,000 sample paths for each of the considered one- and two-factor log-SV models. For each sample path we estimate the underlying model parameters using different information sets: (i) daily data only; (ii) daily data with additional high-frequency volatility measurements based on 5-minute or 2-minute intraday returns; (iii) the “infeasible” case of perfectly observed volatility.[21] In order to study the interplay between additional information coming from more high-frequency data and longer sample size in terms of number of days, we consider three sample windows of 2, 5 and 20 years. This gives a total of twelve one-factor and twelve two-factor specifications for the information sets used for model estimation.

We estimate all specifications using the Bayesian MCMC methods described in Section 3 with 250,000 draws, where the first 50,000 draws are discarded as the burn-in sample. For the purposes of forecasting conditional return moments and quantiles, based on the obtained 200,000 draws of the posterior distribution of parameters and volatility states, we approximate multi-period conditional density forecasts by a cloud of 25,000,000 points. We then compare moments and quantiles of the obtained conditional density forecasts for the two different estimation procedures that we consider, depending on whether a daily volatility measurement equation based on high-frequency data is used or not.

4.2 Efficiency gains in parameter and volatility estimation

In Tables 1 and 2 we report parameter estimates, bias and root mean squared error (RMSE) of volatility-related parameters governing equations (11) and (14)-(15) for our one-factor and two-factor specifications respectively. The true parameter values in each table represent our estimates on S&P 500 daily futures returns for the period October 2, 1985 - February 26, 2009.
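For reference, bias and RMSE over Monte Carlo replications are computed from the point estimates obtained on each simulated sample path and the known true value. The helper below is a minimal sketch of these standard definitions; the function names and the relative RMSE reduction measure are illustrative assumptions and are not taken from the paper's tables.

```python
import numpy as np

def bias_and_rmse(estimates, true_value):
    """Bias and root mean squared error across Monte Carlo replications."""
    errors = np.asarray(estimates, dtype=float) - true_value
    bias = errors.mean()
    rmse = np.sqrt(np.mean(errors ** 2))
    return bias, rmse

def rmse_reduction(rmse_daily, rmse_hf):
    """Relative RMSE reduction from using HF instead of daily-only estimation."""
    return 1.0 - rmse_hf / rmse_daily
```

With these definitions, an RMSE that falls from 1.0 under daily-only estimation to 0.3 with high-frequency information corresponds to a reduction of 70%.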
For the one-factor model (Table 1) we attain up to a few times better precision when using high-frequency data compared to only daily data for estimating the parameters governing skewness and kurtosis. This translates into an RMSE reduction of as much as 70%. In particular, we find that the information content of high-frequency volatility measures most improves the estimation efficiency of the volatility of volatility parameter $\sigma_h$ and the leverage effect parameter $\rho_h$ in the model.[22] Moreover, the gains are consistent across different sample lengths, even for the longest ones typically encountered in practice such as 20 years, when daily estimation is more likely to produce satisfactory results. As a practical rule of thumb we find that two years of high-frequency data often suffice to obtain the same level of precision for these parameters as twenty years of daily data. At the same time, a comparison between the attainable improvement by switching from daily to 5-minute estimation and any further increase in the intraday sampling frequency from 5 to 2 minutes and beyond (up to the infeasible case of perfectly known volatility) reveals a rapid decrease in the additional efficiency gains that can be obtained. We also observe a substantial RMSE reduction for the parameter governing persistence of volatility $\kappa_h$ for the shortest sample sizes, while the “HF” approach still dominates the estimation efficiency with only daily data across all sample sizes.

[20] In particular, in our analysis we focus on the MedRV estimator of Andersen, Dobrev, and Schaumburg (2009) and the tri-power variation measure of Barndorff-Nielsen, Shephard, and Winkel (2006). We report results only for the former as the results obtained for the latter are in the same spirit.
[21] In the one-factor model the case of perfectly observed volatility can be viewed as the limiting case of our volatility measurement equation when the intraday sampling frequency grows to infinity. In the two-factor model, though, the volatility measurement equation provides information only about the sum of the two volatility factors without separating them as in the infeasible case of full observability.
[22] In this sense, our results extend those obtained by Alizadeh, Brandt, and Diebold (2002) in a considerably more restrictive range-based analysis of a model without leverage effects.

For the richer two-factor log-SV model (Table 2) these substantial efficiency gains from incorporating high-frequency volatility measures naturally get even larger. Moreover, a somewhat larger part of the gains is due to bias reduction. It is important to note that here skewness and kurtosis are driven not only by a persistent volatility factor but also by a second non-persistent factor. For the non-persistent factor we find that the gains from incorporating high-frequency information are more pronounced than those from increasing the sample length in years. We do not find such evidence for the persistent factor, where both sources of information play an important role in parameter estimation. This implies bigger efficiency gains from incorporating high-frequency information for the parameters $\rho_f$ and $\sigma_f$ governing skewness and kurtosis arising from the non-persistent factor $f$. The reduction of parameter uncertainty for the persistent factor $h$ is somewhat smaller but still very visible.

The quality of risk forecasts depends not only on the degree of parameter uncertainty but also on the degree of volatility estimation uncertainty.
In particular, it is important to assess the impact of incorporating additional high-frequency information on the accuracy of estimation of terminal volatility states, as they play an important role in forecasting risk. In Table 3 we report mean estimates, bias and RMSE for the terminal volatility states $h_T$ and $f_T$ of the two-factor log-SV model. We conclude that our volatility measurement equation helps in better estimating not only model parameters but also latent volatility states. Considerable efficiency gains are obtained mainly for the persistent volatility factor, while for the non-persistent factor we still observe slight improvements. Similarly to parameter estimates, our findings for volatility states are consistent across all considered sample sizes. Moreover, the biggest efficiency gains take place when moving from estimation based only on daily data to estimation incorporating our volatility measurement equation based on 5-minute returns. A further increase of the intraday sampling frequency from 5 minutes to 2 minutes leads to additional efficiency gains of much smaller magnitude. Overall, for the estimation of volatility states, adding high-frequency information is somewhat more important than increasing the sample length in years. This plays a major role especially for short-term risk forecasting.

4.3 Precision improvements in risk forecasting accuracy

The documented substantial decrease in parameter and volatility estimation uncertainty implies non-trivial improvements in the accuracy of forecasts of conditional return moments and quantiles. We compare forecasts resulting from inference based on daily data to those utilizing 5-minute high-frequency volatility measures. We restrict attention to the 5-minute frequency in accordance with our finding that it offers essentially the bulk of the attainable improvements based on our volatility measurement equation. We perform our analysis incorporating parameter and volatility estimation uncertainty for all three considered sample lengths of 2, 5 and 20 years.

In Tables 4 and 5 we report forecasts of conditional return moments respectively for one-factor and two-factor models. In Tables 6 and 7 we also report forecasts of conditional return quantiles. The considered forecast horizons are 1, 5, 10 and 20 days ahead and are presented in separate panels in each table. These forecast horizons are of primary interest in many finance applications.

Our main finding is that our more efficient parameter and state estimates incorporating the strong information content of high-frequency volatility measures translate into correspondingly better conditional return density forecasts, not only in terms of RMSE but also in terms of bias. The bias reduction is due to the pronounced non-linearities in the underlying transformation of parameters and state variables. The main message from our analysis summarized in Tables 4-7 is that for any model, any estimation sample length, and across all forecast horizons of interest, the forecasts incorporating the extra information from our volatility measurement equation clearly dominate those based only on daily data.
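One way to picture a conditional quantile forecast that takes parameter and volatility state uncertainty into account is to simulate forward from every posterior draw and pool the resulting multi-horizon returns into a single cloud. The sketch below does this for the one-factor model of equations (10)-(11) with $\Delta = 1$ and no jumps. It is a simplified illustration under assumptions: the column layout of `draws`, the function name and `paths_per_draw` are hypothetical and do not reproduce the exact procedure behind Tables 4-7.

```python
import numpy as np

def var_forecast(draws, horizon, level=0.01, paths_per_draw=10, seed=0):
    """Pooled-simulation quantile forecast of the cumulative horizon return.

    `draws` has one row per posterior draw with (hypothetical) columns
    (mu, kappa_h, theta_h, sigma_h, rho_h, h_T).
    """
    rng = np.random.default_rng(seed)
    # Replicate each posterior draw so every draw seeds several paths
    d = np.repeat(np.asarray(draws, dtype=float), paths_per_draw, axis=0)
    mu, kappa, theta, sigma, rho, h = d.T
    cum_return = np.zeros(len(d))
    for _ in range(horizon):
        e1 = rng.standard_normal(len(d))
        e2 = rng.standard_normal(len(d))
        # Equation (10): one-day return given the current log-variance
        cum_return += mu + np.exp(h / 2.0) * e1
        # Equation (11): propagate the log-variance with leverage effect
        h = h + kappa * (theta - h) + sigma * (rho * e1 + np.sqrt(1.0 - rho ** 2) * e2)
    # Quantile of the pooled cloud integrates over parameter and state uncertainty
    return np.quantile(cum_return, level)
```

Because each path starts from its own draw of the parameters and of the terminal state `h_T`, more dispersed posterior draws widen the pooled cloud, so the precision of the estimates feeds directly into the quantile forecast.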
Moreover, these results strengthen our rule of thumb that model specifications estimated with two years of high-frequency data perform at least as well as the same model specifications estimated with twenty years of daily data, which in turn are considerably outperformed if estimated on twenty years of high-frequency data.

4.4 Forecast error reductions in good versus bad times

From a risk management perspective it is important to know how the improvements in risk forecasting accuracy vary across good and bad times. To this end, in Table 8 we report relative errors of forecasts of the 0.01 and 0.05 conditional return quantiles at horizons of one (panel A), five (panel B), ten (panel C) and twenty (panel D) days ahead. The reported relative errors are calculated across 1,000 Monte Carlo replications as the mean of the percentage difference between a forecast based on parameter and state estimates and the forecast based on the corresponding true values. The results are sorted by the rank order of the true quantile forecasts from low (representing bad times) to high (representing good times), as indicated in the first column. In our model this is equivalent to sorting by terminal volatility state from high (representing bad times) to low (representing good times). Each group of three rows reported for ranks 1 (low) to 5 (high) of the true quantile forecasts contains results for three different sample lengths $T$ equal to 2 years, 5 years and 20 years (as given in the second column), taking parameter and volatility estimation uncertainty into account. For each quantile and forecast horizon we report results for the two alternative Bayesian estimation procedures in adjacent column pairs: either with (right column denoted “HF 5-min”) or without (left column denoted “Daily only”) augmenting the underlying state-space formulation with our daily volatility measurement equation based on high-frequency intraday data.

As a graphical summary of the results reported in Table 8, Figure 1 plots the one-percent VaR (top graph) and five-percent VaR (bottom graph) relative forecast errors at a five-day horizon as a function of the rank order of the underlying true forecasts from low (representing bad times) to high (representing good times). The resulting VaR forecast errors without utilizing our high-frequency volatility measures are plotted as a solid line (denoted “Daily”), while those incorporating the information content of intraday data for the latent daily volatility are plotted as a dashed line (denoted “HF 5-min”). The reported relative errors of conditional return quantile forecasts can also be interpreted as the percentage overestimation or underestimation of the implied capital charge for market risk based on one-percent (quantile 0.01) and five-percent (quantile 0.05) VaR.

Both Table 8 and Figure 1 reveal that risk forecasts stemming from traditional model inference on daily data tend to be overly conservative in good times (e.g.
overestimatingrisk byasmuchas30%)buttheyarenotconservativeenoughinbadtimes(e.g. underestimating risk by as much as 10%). By contrast, risk forecasts based on our approach to exploiting high-frequency data are considerably closer to the truth in both bad and good times. Leavingthereportedmagnitudesaside,thisresultisveryintuitiveastheuseofvolatility measures based on high frequency data allows for considerably faster and more precise incorporation of major changes in the current volatility level compared to daily data alone. For example, in bad times when volatility goes up it should take a longer sequence of daily returns alone than in conjunction with high-frequency volatility measures to deliver volatility state estimates that are not downward biased. Similarly, in good times when volatility goes down it should take longer for daily data alone than in conjunction with high-frequency volatility measures to produce volatility state estimates that are not upward biased. Thus, the observed differences between the risk forecast errors in bad versus good times(Table8andFigure1)arecompletelyinlinewiththeasymmetricincreaseinvolatility state uncertainty, coupled also with higher parameter uncertainty (see Section 4.2 above), characterizingtraditionaldailyestimationincomparisontotheproposedapproachutilizing 21

also high-frequency data. In sum, thanks to incorporating the strong information content of high-frequency volatility measures, we are able to better curb risk taking exactly when needed the most, i.e. early on in times of crisis, while avoiding unnecessary overstatement of risk in normal times. 5 Empirical Illustration Conditional return quantile forecasts play important role in risk management as they represent value-at-risk (VaR) forecasts. A key testable implication from our analysis in the previous section is that during bad times, e.g. early on in times of crisis, VaR forecast timeseries based on our approach to exploiting high-frequency data will tend to “cross from above” the VaR forecast time-series stemming from traditional model inference on daily data. This is because, as explained above, the daily-based VaR forecasts are downward biased in bad times (when risk is elevated) and upward biased in good times (when risk is minimal), while our HF-based VaR forecasts are considerably closer to the truth in both bad and good times. In order to test the empirical validity of this important risk management implication, we study the dynamics of five-day ahead VaR forecasts for S&P 500 and Google returns throughout the financial crisis of 2007-2008. Our goal is to illustrate the potentially large economic value from the proposed approach to incorporating the information content of high-frequency volatility measures. It is beyond the scope of this paper, though, to run a horse race between many viable alternative VaR forecasting techniques. We limit ourselves strictly to evaluating the empirical validity of our main testable implication with regard to HF-based versus daily-based VaR forecasts in the context of popular equity return models such as the fairly general two-factor log-SV model with jumps analyzed in the previous sections. 
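As a rough illustration of how a conditional return quantile turns into a VaR forecast, the following Python sketch simulates five-day cumulative returns from a one-factor log-SV model with leverage and reads off the one-percent and five-percent quantiles. It is a simplification under stated assumptions, not the estimation procedure of this paper: the parameter values are hypothetical, jumps and the second volatility factor are omitted, and parameter and volatility state uncertainty are ignored by conditioning on a known terminal log-variance h0.

```python
import math
import random


def var_forecast(h0, horizon=5, n_paths=20000, kappa=0.02, theta=-9.4,
                 sigma=0.16, rho=-0.67, mu=0.0, seed=1):
    """Simulated return quantiles at the given horizon (all defaults are hypothetical)."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_paths):
        h, total = h0, 0.0
        for _ in range(horizon):
            e1 = rng.gauss(0.0, 1.0)  # return shock
            e2 = rng.gauss(0.0, 1.0)  # independent part of the volatility shock
            total += mu + math.exp(h / 2.0) * e1
            # Mean-reverting log-variance with leverage (correlation rho with e1).
            h += kappa * (theta - h) + sigma * (rho * e1 + math.sqrt(1.0 - rho ** 2) * e2)
        totals.append(total)
    totals.sort()
    # Empirical quantiles of the simulated cumulative returns.
    return {q: totals[int(q * n_paths)] for q in (0.01, 0.05)}


if __name__ == "__main__":
    print("good times:", var_forecast(h0=-10.4))
    print("bad times: ", var_forecast(h0=-8.4))
```

Starting from a higher terminal log-variance (bad times) yields more negative quantiles than starting from a lower one (good times), which is why a biased volatility state estimate translates directly into a biased VaR forecast.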
5.1 Data and estimation

In our empirical illustration we consider S&P 500 daily futures returns for the period October 2, 1985 - February 26, 2009 and Google daily equity returns for the period August 30, 2004 - July 31, 2009.[23] We exclude from each series holidays and shortened trading days. Our high-frequency measurement equation is constructed from five-minute intraday returns following the procedures given in section 2.2, while model estimation and forecasting is conducted as detailed in sections 3 and 4. We study the dynamics of five-day ahead VaR forecasts for the last 120 business weeks in each sample, both of which cover the financial crisis of 2007-2008. To produce each forecast we re-estimate our two-factor log-SV model with all available data going back to the beginning of each sample. Thus, the sample for S&P 500 roughly corresponds to 20 years of data in our Monte Carlo study (Section 4). The sample for Google, on the other hand, represents 2 to 5 years of data and cannot be extended further back as it starts ten days after Google’s IPO.

[23] The data for S&P 500 is provided by Tick Data, while the data for Google is from NYSE TAQ.

5.2 Forecasting risk throughout the 2007-2008 financial crisis

In Figures 2 and 3 we plot non-overlapping one-percent (top graph) and five-percent (bottom graph) VaR forecasts at a five-day horizon for S&P 500 futures returns (Figure 2) and Google equity returns (Figure 3) based on a two-factor log-SV model with jumps in returns. The model is estimated at a daily discretization interval by Bayesian MCMC methods either without or with augmenting the underlying state-space formulation with our daily volatility measurement equation based on high frequency intraday data. The resulting VaR forecasts without utilizing high-frequency volatility measures are plotted as a solid line (denoted “VaR with daily data”), those incorporating the information content of intraday data for the latent daily volatility are plotted as a dashed line (denoted “VaR with HF 5-min data”), while the corresponding actual observed returns are plotted as vertical bars (denoted “Return realizations”).

As clearly seen from the graphs, the VaR forecasts with HF 5-min data seemingly correctly predict more risk and “cross from above” the VaR forecasts with daily data exactly around major turmoil events during the financial crisis of 2007-2008. These include the Bear Stearns turmoil in July 2007, the Countrywide turmoil in January 2008, the Fannie Mae and Freddie Mac turmoil in July 2008, and most notably, the Lehman Brothers collapse followed by the TARP Legislation turmoil in October 2008. The gap between the two alternative VaR forecasts around these events implies sizeable underestimation of risk by the traditional approach based on daily data. This is more pronounced for Google in line with the fact that individual stocks tend to be more risky than stock indices.
At the same time, before the summer of 2007 and on many occasions afterwards the VaR forecasts with HF 5-min data predict a bit less risk than the VaR forecasts with daily data. Nonetheless, the number of incurred violations (given by the number of times the return realizations, plotted as vertical bars, go below the VaR forecasts) remains completely in line with the expected number of violations at the 1% and 5% VaR levels across 120 (non-overlapping) forecasts.

Overall, the observed dynamics of VaR forecasts for S&P 500 and Google returns throughout the financial crisis of 2007-2008 are in striking agreement with the key testable implication from our analysis in the previous sections. We obtain strong empirical support that not only in theory but also in important real-world examples our approach to incorporating the information content of high frequency volatility measures can help better curb risk taking exactly when needed the most, i.e. early on in times of crisis, while avoiding unnecessary overstatement of risk in normal times.
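The expected violation counts referred to above follow from simple binomial arithmetic. The sketch below assumes that violations are independent across the 120 non-overlapping forecasts and occur with a constant probability equal to the VaR level; this independence assumption is ours and is made only for illustration.

```python
from math import comb


def expected_violations(n_forecasts, level):
    """Expected number of VaR violations for a correctly calibrated forecast."""
    return n_forecasts * level


def prob_at_most(k, n_forecasts, level):
    """Binomial probability of observing at most k violations."""
    return sum(
        comb(n_forecasts, j) * level ** j * (1.0 - level) ** (n_forecasts - j)
        for j in range(k + 1)
    )


if __name__ == "__main__":
    for level in (0.01, 0.05):
        # Roughly 1.2 expected violations at the 1% level and 6 at the 5% level.
        print(level, expected_violations(120, level), prob_at_most(3, 120, level))
```

Comparing the observed count with these binomial probabilities gives a quick sense of whether a given number of violations is plausible for a well-calibrated forecast.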

6 Conclusion

In this paper, we have developed a method for estimating popular equity return models relying not only on daily returns but also on the now ubiquitous high-frequency intraday return data. The essence of our approach is to borrow asymptotic results from the growing realized volatility literature and cast them as precise volatility measurement equations directly within the standard state-space representation of popular equity return models estimated at daily frequency. In this way, we avoid specifying explicitly the intraday return dynamics, while considerably improving estimation efficiency of such models at daily or monthly frequency.

In particular, we utilize daily returns along with high-frequency jump-robust realized volatility measures within a standard Bayesian MCMC estimation framework. This allows us to take explicitly into account the resulting substantial reduction in parameter uncertainty. Thus, we are able to show sizeable economic gains when forecasting risk, compared to inference based on the more limited information provided by daily returns alone. In this way, we depart from previous studies geared primarily towards specification testing that have focused on the use of such high-frequency volatility measures in classical rather than Bayesian estimation procedures. Instead, we demonstrate that across a variety of equity return models estimated at daily frequency the parameters controlling skewness and kurtosis can be obtained almost as precisely as if volatility were observable by incorporating the strong information content of realized volatility measures extracted from high-frequency data. In particular, we show that not only the parameters controlling volatility of volatility but also those controlling leverage effects can be estimated several times more precisely by exploiting high-frequency volatility measures.
Furthermore, we show that our highly efficient estimates lead in turn to substantial gains for forecasting various risk measures at horizons ranging from a few days to a few months ahead when taking also into account parameter uncertainty. In fact, our approach not only reduces the root mean square prediction error but also shrinks and almost eliminates the forecast bias, which inevitably arises from the pronounced nonlinearities in the involved transformation of parameter and volatility estimates. As a practical rule of thumb we find that two years of high frequency data often suffice to obtain the same level of precision as twenty years of daily data, thereby making our approach particularly useful in finance applications where only short data samples are available or economically meaningful to use.

Last, but perhaps most important in risk management applications, we find that risk forecasts based on our approach to exploiting high-frequency data are considerably closer to the truth in both bad and good times relative to those stemming from traditional model inference on daily data, which we find can overestimate risk by as much as 30% in good times or underestimate it by as much as 10% in bad times. We support our findings both with extensive simulations and an empirical illustration on VaR forecasts for S&P 500 and Google returns during the financial crisis of 2007-2008. Thanks to incorporating the strong information content of high-frequency volatility measures, we are able to better curb risk taking exactly when needed the most, i.e. early on in times of crisis (rather than with a delay), while avoiding unnecessary overstatement of risk in normal times. Qualitatively, our findings are robust both across different models and jump-robust volatility measures on high frequency data that we analyze.

In view of the documented substantial precision gains in forecasting risk of equity returns, the estimation approach we propose can directly add value in different areas of risk management and asset pricing. Beyond equity returns, the method can be applied also to other financial data such as foreign exchange rates, bonds and interest rates. It can be easily geared also towards model specification testing. More generally, we establish a promising and tractable way to incorporate additional sources of information, such as alternative high frequency volatility measures, into models in state space form.

References

Aït-Sahalia, Y. and J. Jacod (2007). Volatility estimators for discretely sampled Lévy processes. Ann. Stat. 35(1), 355–392.

Alizadeh, S., M. Brandt, and F. Diebold (2002). Range-based estimation of stochastic volatility models. The Journal of Finance 57(3).

Andersen, T. G., L. Benzoni, and J. Lund (2002). An empirical investigation of continuous-time equity return models. Journal of Finance 57(3), 1239–1284.

Andersen, T. G., T. Bollerslev, P. Christoffersen, and F. X. Diebold (2007). Practical volatility and correlation modeling for financial market risk management. In The Risks of Financial Institutions, NBER Chapters, pp. 513–548. National Bureau of Economic Research, Inc.

Andersen, T. G., T. Bollerslev, and F. X. Diebold (2009). Parametric and nonparametric volatility measurement. North Holland. In Handbook of Financial Econometrics, Yacine Aït-Sahalia, Lars P. Hansen, and Jose A. Scheinkman (Eds.).

Andersen, T. G., T. Bollerslev, F. X. Diebold, and H. Ebens (2001). The distribution of realized stock return volatility. Journal of Financial Economics 61(1), 43–76.

Andersen, T. G., T. Bollerslev, and D. Dobrev (2007). No-arbitrage semi-martingale restrictions for continuous-time volatility models subject to leverage effects, jumps and i.i.d. noise: Theory and testable distributional implications. Journal of Econometrics 138(1), 125–180.

Andersen, T. G., D. P. Dobrev, and E. Schaumburg (2009). Jump-robust volatility estimation using nearest neighbor truncation. NBER Working Paper (15533).

Bakshi, G., C. Cao, and Z. Chen (1997). Empirical performance of alternative option pricing models. The Journal of Finance 52(5), 2003–2049.

Bandi, F. M. and R. Renò (2009). Nonparametric stochastic volatility. Global COE Hi-Stat Discussion Paper Series gd08-035, Institute of Economic Research, Hitotsubashi University.

Bandi, F. M. and J. R. Russell (2007). Volatility. Elsevier Science, New York. In Handbook of Financial Engineering, V. Linetsky and J. Birge (Eds.).

Barndorff-Nielsen, O. E. and N. Shephard (2002). Econometric analysis of realised volatility and its use in estimating stochastic volatility models. Journal of the Royal Statistical Society, Series B 64, 253–280.

Barndorff-Nielsen, O. E. and N. Shephard (2004). Power and bipower variation with stochastic volatility and jumps. Journal of Financial Econometrics 2(1), 1–37.

Barndorff-Nielsen, O. E. and N. Shephard (2005). How accurate is the asymptotic approximation to the distribution of realised volatility? In D. W. K. Andrews and J. H. Stock (Eds.), Identification and Inference for Econometric Models. A Festschrift in Honour of T.J. Rothenberg, pp. 306–331. Cambridge: Cambridge University Press.

Barndorff-Nielsen, O. E. and N. Shephard (2007). Variation, Jumps, Market Frictions and High Frequency Data in Financial Econometrics. Cambridge University Press. In Advances in Economics and Econometrics. Theory and Applications, Ninth World Congress, R. Blundell, T. Persson, and W. Newey (Eds.).

Barndorff-Nielsen, O. E., N. Shephard, and M. Winkel (2006). Limit theorems for multipower variation in the presence of jumps. Stochastic Processes and Their Applications 116, 796–806.

Bates, D. S. (2000). Post-’87 crash fears in the S&P 500 futures option market. Journal of Econometrics 94(1-2), 181–238.

Black, F. and M. Scholes (1973). The pricing of options and corporate liabilities. Journal of Political Economy 81, 637–659.

Bollerslev, T. and H. Zhou (2002). Estimating stochastic volatility diffusion using conditional moments of integrated volatility. Journal of Econometrics 109, 33–65.

Broadie, M., M. Chernov, and M. Johannes (2007). Model specification and risk premia: Evidence from futures options. Journal of Finance 62(3), 1453–1490.

Chernov, M., A. Gallant, E. Ghysels, and G. Tauchen (2003). Alternative models for stock price dynamics. Journal of Econometrics 116, 225–257.

Chernov, M. and E. Ghysels (2000). A study towards a unified approach to the joint estimation of objective and risk neutral measures for the purpose of option valuation. Journal of Financial Economics 56, 407–458.

Chib, S. and E. Greenberg (1996). Markov chain Monte Carlo simulation methods in econometrics. Econometric Theory 12(3), 409–431.

Christensen, K., R. Oomen, and M. Podolskij (2008). Realized quantile-based estimation of integrated variance. Working paper.

Corradi, V. and W. Distaso (2006). Semiparametric comparison of stochastic volatility models via realized measures. Review of Economic Studies 73, 635–667.

Corsi, F., D. Pirino, and R. Renò (2008). Volatility forecasting: The jumps do matter. SSRN Working Paper.

Das, S. R. and R. K. Sundaram (1999). Of smiles and smirks: A term structure perspective. The Journal of Financial and Quantitative Analysis 34(2), 211–239.

Eraker, B. (2004). Do stock prices and volatility jump? Reconciling evidence from spot and option prices. The Journal of Finance 59(3), 1367–1403.

Eraker, B., M. Johannes, and N. Polson (2003). The impact of jumps in volatility and returns. The Journal of Finance 58(3), 1269–1300.

Fleming, J., C. Kirby, and B. Ostdiek (2003, March). The economic value of volatility timing using "realized" volatility. Journal of Financial Economics 67(3), 473–509.

Hansen, P. R. and A. Lunde (2005). A realized variance for the whole day based on intermittent high-frequency data. Journal of Financial Econometrics 3(4), 525–554.

Huang, X. and G. Tauchen (2005). The relative contribution of jumps to total price variance. Journal of Financial Econometrics 3(4), 456–499.

Jacquier, E., N. Polson, and P. Rossi (2004). Bayesian analysis of stochastic volatility models with fat-tails and correlated errors. Journal of Econometrics 122, 185–212.

Johannes, M. and N. Polson (2002). MCMC methods for financial econometrics. In Handbook of Financial Econometrics. North-Holland. Forthcoming.

Jones, C. (1998). Bayesian estimation of continuous-time finance models. Working Paper.

Jones, C. (2003). The dynamics of stochastic volatility: evidence from underlying and options market. Journal of Econometrics 116, 181–224.

Mancini, C. (2006). Estimating the integrated volatility in stochastic volatility models with Lévy type jumps. Working paper, University of Firenze.

McAleer, M. and M. C. Medeiros (2008). Realized volatility: A review. Econometric Reviews 27(1-3), 10–45.

Merton, R. C. (1969). Lifetime portfolio selection under uncertainty: The continuous-time case. The Review of Economics and Statistics 51(3), 247–257.

Pan, J. (2002). The jump-risk premia implicit in options: evidence from an integrated time-series study. Journal of Financial Economics 63(1), 3–50.

Podolskij, M. and M. Vetter (2009). Bipower-type estimation in a noisy diffusion setting. Stochastic Processes Appl. 119(9), 2803–2831.

Szerszen, P. (2009). Bayesian analysis of stochastic volatility models with Lévy jumps: application to risk analysis. Board of Governors of the Federal Reserve System Working Paper, 40.

Todorov, V. (2009). Estimation of continuous-time stochastic volatility models with jumps using high-frequency data. Journal of Econometrics 148(2), 131–148.

A Analytical Results in a Toy Model: A Motivating Example

In this paper we introduce asymptotically exact volatility measurement equations in state space form and propose a Bayesian MCMC estimation approach that we use to demonstrate the efficiency gains when estimating key parameters in various popular SV models. Although our expressions for the complete conditional posteriors given in section 3.3.2 provide an intuitive explanation of where the documented efficiency gains are coming from, it is useful to provide some extra analytical support for the obtained results also by using classical estimation methods in a suitable “toy model”.

To this end, here we restrict attention to estimating the kurtosis parameter $\sigma_h$ in the following simplified log-SV model in state space form augmented with a volatility measurement equation based on high-frequency data:

\begin{align}
r_{t+1} &= \exp\left(\frac{h_t}{2}\right)\varepsilon^{(r)}_{t+1} \tag{21}\\
h_{t+1} &= \beta_h h_t + \sigma_h\,\varepsilon^{(h)}_{t+1} \tag{22}\\
\log\big(\widehat{IV}_{t+1;M}\big) &= h_t + \sqrt{\frac{\nu}{M}}\,\varepsilon^{(IV)}_{t+1} \tag{23}
\end{align}

where all error terms are i.i.d. Gaussian. Note that equations (21)-(22) represent a canonical log-SV model for daily returns (and log-variances) extensively studied in the literature, see e.g. Taylor (1986), Nelson (1988), Harvey, Ruiz, and Shephard (1994), Ruiz (1994), Andersen and Sorensen (1996), Francq and Zakoïan (2006), among others.[24] Without loss of generality, the log-variance process is zero mean with persistence controlled by $\beta_h = 1-\kappa_h$. Equation (23) represents our volatility measurement equation based on high-frequency intraday data in its simplest form (for implicitly assumed Brownian intraday dynamics), where $M \gg 1$ is the intraday sample frequency and $\nu$ is an efficiency factor depending on the chosen volatility measure $\widehat{IV}_{t+1;M}$ as detailed in section 2.2.

As usual, it is convenient to substitute the return measurement equation in this canonical log-SV model with the one obtained after taking the logarithm of squared returns (without incurring any information loss when the distribution of $\varepsilon^{(r)}_t$ is symmetric):

\begin{align}
\log\big(r_{t+1}^2\big) &= h_t + \log\big((\varepsilon^{(r)}_{t+1})^2\big) \tag{24}\\
h_{t+1} &= \beta_h h_t + \sigma_h\,\varepsilon^{(h)}_{t+1} \tag{25}\\
\log\big(\widehat{IV}_{t+1;M}\big) &= h_t + \sqrt{\frac{\nu}{M}}\,\varepsilon^{(IV)}_{t+1} \tag{26}
\end{align}

It is convenient to further simplify notation by redefining the measurements and their errors as $x_t = \log(r_t^2) - E[\log((\varepsilon^{(r)}_t)^2)]$, $\varepsilon^{(x)}_t = \log((\varepsilon^{(r)}_t)^2) - E[\log((\varepsilon^{(r)}_t)^2)]$, $y_t = \log(\widehat{IV}_{t;M})$, $\varepsilon^{(y)}_t = \sqrt{\nu}\,\varepsilon^{(IV)}_t$. This yields the following representation of the model in state space form:

\begin{align}
x_{t+1} &= h_t + \varepsilon^{(x)}_{t+1} \tag{27}\\
h_{t+1} &= \beta_h h_t + \sigma_h\,\varepsilon^{(h)}_{t+1} \tag{28}\\
y_{t+1} &= h_t + \frac{1}{\sqrt{M}}\,\varepsilon^{(y)}_{t+1} \tag{29}
\end{align}

[24] Comprehensive surveys of the literature on SV models and estimation include Andersen, Bollerslev, and Diebold (2009), Ghysels, Harvey and Renault (1996), Shephard (1996), Taylor (1994), among others.
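To make the intuition behind the augmented state-space form (27)-(29) concrete, the following Python sketch simulates the toy model and compares how well the latent log-variance is tracked with and without the high-frequency measurement. It is an illustration under stated assumptions rather than the Bayesian MCMC procedure of this paper: it uses a linear Kalman filter that treats the log chi-square error in (27) as if it were Gaussian, the parameter values (beta, sigma, nu, M) are hypothetical, and for simplicity each measurement is dated at the same time index as the state it observes.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329
MEAN_LOG_CHI2 = -(EULER_GAMMA + math.log(2.0))  # E[log eps^2] for eps ~ N(0, 1)
VAR_LOG_CHI2 = math.pi ** 2 / 2.0               # Var[log eps^2] for eps ~ N(0, 1)


def simulate_toy(T, beta, sigma, nu, M, seed=0):
    """Simulate states h and the two measurements x (daily) and y (high-frequency)."""
    rng = random.Random(seed)
    h = rng.gauss(0.0, sigma / math.sqrt(1.0 - beta ** 2))  # stationary start
    hs, xs, ys = [], [], []
    for _ in range(T):
        eps_r = rng.gauss(0.0, 1.0)
        x = h + math.log(max(eps_r ** 2, 1e-300)) - MEAN_LOG_CHI2
        y = h + math.sqrt(nu / M) * rng.gauss(0.0, 1.0)
        hs.append(h)
        xs.append(x)
        ys.append(y)
        h = beta * h + sigma * rng.gauss(0.0, 1.0)
    return hs, xs, ys


def filtered_rmse(hs, xs, ys, beta, sigma, nu, M, use_hf):
    """RMSE of the Kalman-filtered state, with or without the HF measurement."""
    mean, var = 0.0, sigma ** 2 / (1.0 - beta ** 2)
    sq_err = 0.0
    for h, x, y in zip(hs, xs, ys):
        gain = var / (var + VAR_LOG_CHI2)            # update with the daily measurement
        mean, var = mean + gain * (x - mean), (1.0 - gain) * var
        if use_hf:
            gain = var / (var + nu / M)              # update with the HF measurement
            mean, var = mean + gain * (y - mean), (1.0 - gain) * var
        sq_err += (mean - h) ** 2
        mean, var = beta * mean, beta ** 2 * var + sigma ** 2  # predict next state
    return math.sqrt(sq_err / len(hs))


if __name__ == "__main__":
    args = dict(beta=0.95, sigma=0.26, nu=2.0, M=78)  # hypothetical values
    hs, xs, ys = simulate_toy(5000, **args)
    print("daily only:", filtered_rmse(hs, xs, ys, use_hf=False, **args))
    print("daily + HF:", filtered_rmse(hs, xs, ys, use_hf=True, **args))
```

Because the two measurement errors are independent, the second measurement can be processed as a sequential scalar update, which keeps the sketch free of matrix algebra.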

In what follows, we study the efficiency of estimating $\sigma_h$ taking $\beta_h$ as given in the following two specifications: (i) Standard “Daily” given by the first two equations (27)-(28); (ii) Augmented “Daily + HF” given by the full system (27)-(29). Intuitively, it is clear that the relative difference in the precision of the two measurement equations (27) and (29) determines the attainable efficiency gains from using both measurement equations in the proposed “Daily + HF” specification as opposed to using only the first measurement equation in the standard “Daily” specification. Clearly, increasing the sample frequency $M$ improves the precision of the additional volatility measurement equation (29) and in the limit as $M \to \infty$ it yields perfect measurements of the volatility states. This means that the maximum attainable efficiency in the case of perfectly observed volatility states can be closely achieved by increasing the sample frequency $M$ sufficiently.

A GMM estimation approach provides a straightforward formalization of these intuitive observations. Let $m_q^{\varepsilon(x)} = E[(\varepsilon^{(x)}_t)^q]$, $q = 2,4$ and $m_q^{\varepsilon(y)} = E[(\varepsilon^{(y)}_t)^q]$, $q = 2,4$ denote the known second and fourth unconditional moments of the two measurement error terms. Consider the following two moment conditions:

\begin{align}
g_1(\sigma_h,\beta_h) &= x_t^2 - \frac{\sigma_h^2}{1-\beta_h^2} - m_2^{\varepsilon(x)} \tag{30}\\
g_2(\sigma_h,\beta_h) &= y_t^2 - \frac{\sigma_h^2}{1-\beta_h^2} - \frac{1}{M}\,m_2^{\varepsilon(y)} \tag{31}
\end{align}

It is easy to confirm that these are valid moments:

\begin{align}
E[g_1] &= 0 \tag{32}\\
E[g_2] &= 0 \tag{33}
\end{align}

The corresponding variance of each moment and the covariance between them is given by:

\begin{align}
V[g_1] &= \frac{2\sigma_h^4}{(1-\beta_h^2)^2} + \frac{4\sigma_h^2}{1-\beta_h^2}\,m_2^{\varepsilon(x)} + m_4^{\varepsilon(x)} - \big(m_2^{\varepsilon(x)}\big)^2 \tag{34}\\
V[g_2] &= \frac{2\sigma_h^4}{(1-\beta_h^2)^2} + \frac{4\sigma_h^2}{1-\beta_h^2}\,\frac{m_2^{\varepsilon(y)}}{M} + \frac{m_4^{\varepsilon(y)} - \big(m_2^{\varepsilon(y)}\big)^2}{M^2} \tag{35}\\
C[g_1,g_2] &= \frac{2\sigma_h^4}{(1-\beta_h^2)^2} \tag{36}
\end{align}

Note that the unconditional second moment of the log-variance process is given by $m_2^h = E[h_t^2] = \frac{\sigma_h^2}{1-\beta_h^2}$. Hence, the above variance and covariance expressions take the following form:

\begin{align}
V[g_1] &= 2\big(m_2^h\big)^2 + 4\,m_2^h\,m_2^{\varepsilon(x)} + m_4^{\varepsilon(x)} - \big(m_2^{\varepsilon(x)}\big)^2 \tag{37}\\
V[g_2] &= 2\big(m_2^h\big)^2 + 4\,m_2^h\,\frac{m_2^{\varepsilon(y)}}{M} + \frac{m_4^{\varepsilon(y)} - \big(m_2^{\varepsilon(y)}\big)^2}{M^2} \tag{38}\\
C[g_1,g_2] &= 2\big(m_2^h\big)^2 \tag{39}
\end{align}

The resulting optimal GMM weighting matrix is given by:

\begin{equation}
\begin{pmatrix} V[g_1] & C[g_1,g_2] \\ C[g_1,g_2] & V[g_2] \end{pmatrix}^{-1}
= \frac{1}{V[g_1]V[g_2] - C[g_1,g_2]^2}
\begin{pmatrix} V[g_2] & -C[g_1,g_2] \\ -C[g_1,g_2] & V[g_1] \end{pmatrix} \tag{40}
\end{equation}

It follows that the ratio between the variance of an estimator of $\sigma_h$ based on the first moment condition (“Daily” specification) and the variance of the optimal GMM estimator of $\sigma_h$ combining both moment conditions (“Daily + HF” specification) is given by:

\begin{align}
\frac{V[g_1]}{\left(\dfrac{V[g_1]+V[g_2]-2C[g_1,g_2]}{V[g_1]V[g_2]-C[g_1,g_2]^2}\right)^{-1}}
&= 1 + \frac{\left(\dfrac{V[g_1]}{C[g_1,g_2]} - 1\right)^2}{\dfrac{V[g_1]}{C[g_1,g_2]}\,\dfrac{V[g_2]}{C[g_1,g_2]} - 1} \tag{41}\\
&= 1 + \frac{\left(2\,\dfrac{m_2^{\varepsilon(x)}}{m_2^h} + \dfrac{1}{2}\left[\dfrac{m_4^{\varepsilon(x)}}{(m_2^h)^2} - \left(\dfrac{m_2^{\varepsilon(x)}}{m_2^h}\right)^2\right]\right)^2}{\left(1 + \dfrac{2}{M}\,\dfrac{m_2^{\varepsilon(y)}}{m_2^h} + \dfrac{1}{2M^2}\left[\dfrac{m_4^{\varepsilon(y)}}{(m_2^h)^2} - \left(\dfrac{m_2^{\varepsilon(y)}}{m_2^h}\right)^2\right]\right)\left(1 + 2\,\dfrac{m_2^{\varepsilon(x)}}{m_2^h} + \dfrac{1}{2}\left[\dfrac{m_4^{\varepsilon(x)}}{(m_2^h)^2} - \left(\dfrac{m_2^{\varepsilon(x)}}{m_2^h}\right)^2\right]\right) - 1} \tag{42}
\end{align}

Expressed in this form, the variance reduction factor is a function of the variance of each measurement error relative to the variance of the state variable and, hence, the intraday sample frequency $M$ affecting the precision of the second measurement equation based on high-frequency intraday data. Two important conclusions follow. First, as $M \to \infty$ this variance reduction factor approaches

\begin{equation}
1 + 2\,\frac{m_2^{\varepsilon(x)}}{m_2^h} + \frac{1}{2}\left[\frac{m_4^{\varepsilon(x)}}{(m_2^h)^2} - \left(\frac{m_2^{\varepsilon(x)}}{m_2^h}\right)^2\right]
= \frac{V[g_1]}{C[g_1,g_2]} \equiv \frac{V[g_1]}{\lim_{M\to\infty} V[g_2]}, \tag{43}
\end{equation}

which means that the Hausman principle applies in the limit, in the sense that when volatility is perfectly observed in the second measurement equation then it alone achieves minimum variance, i.e. maximum efficiency of the estimator. Second, for values of $M$ typically used in empirical work such as $M = 78$ (five-minute returns) and $M = 195$ (two-minute returns), the above variance reduction factor (42) is very close to its limiting value (43) for $M \to \infty$ since the first factor in the denominator of (42) would be close to unity. This proves analytically in the considered simplified setting that augmenting the state space form of the model with a volatility measurement equation based on high frequency data yields an estimator with several times smaller variance compared to the one without a volatility measurement equation.
For typical values of M in the order of 100 the variance reduction factor is fairly close to its limiting value (43). In particular, based on the derived formulas it is easy to see that parameter values in the neighborhood of those used in prior studies of the same model (see for example Ruiz (1994) or Andersen and Sorensen (1997)) imply a variance reduction factor somewhere in the range of 5 to 30, which roughly translates into a 2 to 5 times smaller standard deviation. This is quite in line with the RMSE reduction documented in our Monte Carlo study for popular non-analytically tractable models for which we propose Bayesian MCMC estimation methods with the added benefit of more fully exploiting information via the model state space form.
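The variance reduction factor (42) and its limit (43) are straightforward to evaluate numerically. The Python sketch below transcribes the formulas of this appendix under assumptions that are ours: Gaussian return shocks, so that the centered log chi-square error has second moment π²/2 and fourth moment 7π⁴/4; a Gaussian error in the volatility measurement with a hypothetical efficiency factor ν = 2; and an arbitrary illustrative value for the state variance m2h. None of these inputs is taken from the studies cited above.

```python
import math


def moment_variances(m2h, m2x, m4x, m2y, m4y, M):
    """Variances and covariance of the two moment conditions, equations (37)-(39)."""
    v1 = 2.0 * m2h ** 2 + 4.0 * m2h * m2x + m4x - m2x ** 2
    v2 = 2.0 * m2h ** 2 + 4.0 * m2h * m2y / M + (m4y - m2y ** 2) / M ** 2
    c = 2.0 * m2h ** 2
    return v1, v2, c


def variance_reduction_factor(m2h, m2x, m4x, m2y, m4y, M):
    """Ratio of 'Daily' to 'Daily + HF' estimator variance, equation (41)."""
    v1, v2, c = moment_variances(m2h, m2x, m4x, m2y, m4y, M)
    a, b = v1 / c, v2 / c
    return 1.0 + (a - 1.0) ** 2 / (a * b - 1.0)


def limiting_factor(m2h, m2x, m4x):
    """Limit of the variance reduction factor as M grows, equation (43)."""
    return 1.0 + 2.0 * m2x / m2h + 0.5 * (m4x / m2h ** 2 - (m2x / m2h) ** 2)


if __name__ == "__main__":
    nu = 2.0                              # hypothetical efficiency factor
    m2x = math.pi ** 2 / 2.0              # Var of centered log chi-square(1)
    m4x = 7.0 * math.pi ** 4 / 4.0        # its fourth central moment
    m2y, m4y = nu, 3.0 * nu ** 2          # moments of sqrt(nu) * N(0, 1)
    m2h = 2.0                             # arbitrary illustrative state variance
    for M in (78, 195):
        print(M, variance_reduction_factor(m2h, m2x, m4x, m2y, m4y, M))
    print("limit", limiting_factor(m2h, m2x, m4x))
```

The factor increases with M and approaches the limit from below, since a larger M only shrinks the ratio V[g2]/C[g1,g2] towards one.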

B Figures and Tables

Table 1: Parameter estimates for a one-factor log-SV model with leverage effects. For select model parameters we report the mean, bias, and RMSE of the estimates obtained across 1,000 Monte Carlo replications. The state-space form of the model is as follows:

\begin{align*}
Y_{t+1} - Y_t &= \mu + \exp\left(\frac{h_t}{2}\right)\varepsilon^{(1)}_{t+1}\\
h_{t+1} &= h_t + \kappa_h(\theta_h - h_t) + \sigma_h\left(\rho_h\,\varepsilon^{(1)}_{t+1} + \sqrt{1-\rho_h^2}\,\varepsilon^{(2)}_{t+1}\right)\\
\log\big(\widehat{IV}_{t,t+1;M}\big) &\approx h_t + \sqrt{\frac{1}{M}\,\widehat{\Omega}_{t,t+1;M}}\;\varepsilon^{(IV)}_{t+1}
\end{align*}

Columns represent results for alternative estimation procedures depending on whether our volatility measurement equation based on high-frequency log integrated variance measures $\log(\widehat{IV}_{t,t+1;M})$ is used (HF 5-min with M=78; HF 2-min with M=195) or not (daily only), as well as for the infeasible case of perfect observability (known volatility). The rows in each block contain results for different yearly sample lengths (2, 5, or 20 years).

                     κ_h = 0.0163                                σ_h = 0.1648
       T         Daily     HF        HF        Known         Daily     HF        HF        Known
       (years)   Only      5-min     2-min     Volatility    Only      5-min     2-min     Volatility
MEAN   2         0.0249    0.0176    0.0174    0.0173        0.1739    0.1655    0.1647    0.1649
MEAN   5         0.0188    0.0171    0.0171    0.0171        0.1694    0.1649    0.1650    0.1649
MEAN   20        0.0168    0.0165    0.0165    0.0165        0.1653    0.1647    0.1647    0.1647
BIAS   2         0.0086    0.0014    0.0011    0.0010        0.0091    0.0007    0.0000    0.0001
BIAS   5         0.0025    0.0009    0.0009    0.0008        0.0046    0.0002    0.0002    0.0001
BIAS   20        0.0006    0.0002    0.0002    0.0002        0.0005   -0.0001    0.0000    0.0000
RMSE   2         0.0309    0.0092    0.0090    0.0089        0.0260    0.0087    0.0069    0.0047
RMSE   5         0.0071    0.0050    0.0049    0.0048        0.0183    0.0058    0.0045    0.0030
RMSE   20        0.0028    0.0021    0.0021    0.0021        0.0100    0.0028    0.0023    0.0015

                     θ_h = -9.4243                               ρ_h = -0.6716
       T         Daily     HF        HF        Known         Daily     HF        HF        Known
       (years)   Only      5-min     2-min     Volatility    Only      5-min     2-min     Volatility
MEAN   2        -9.4024   -9.4837   -9.5006   -9.5067       -0.5840   -0.6618   -0.6651   -0.6679
MEAN   5        -9.4481   -9.4468   -9.4478   -9.4467       -0.6360   -0.6668   -0.6683   -0.6704
MEAN   20       -9.4308   -9.4299   -9.4297   -9.4295       -0.6599   -0.6709   -0.6703   -0.6706
BIAS   2         0.0219   -0.0594   -0.0764   -0.0825        0.0875    0.0098    0.0065    0.0037
BIAS   5        -0.0239   -0.0225   -0.0235   -0.0224        0.0355    0.0047    0.0032    0.0012
BIAS   20       -0.0065   -0.0057   -0.0054   -0.0053        0.0116    0.0006    0.0012    0.0010
RMSE   2         0.9659    0.8943    0.8897    0.8720        0.1383    0.0395    0.0321    0.0210
RMSE   5         0.3026    0.2906    0.2917    0.2911        0.0776    0.0247    0.0214    0.0140
RMSE   20        0.1416    0.1390    0.1395    0.1398        0.0392    0.0132    0.0110    0.0073

Table 2: Parameter estimates for a two-factor log-SV model with leverage effects and jumps. For select model parameters we report the mean, bias, and RMSE of the estimates obtained across 1,000 Monte Carlo replications. The state-space form of the model is as follows:

\begin{align*}
Y_{t+1} - Y_t &= \mu + \exp\left(\frac{h_t + f_t}{2}\right)\varepsilon^{(1)}_{t+1} + q_{t+1}\cdot J_{t+1}\\
h_{t+1} &= h_t + \kappa_h(\theta_h - h_t) + \sigma_h\left(\rho_h\,\varepsilon^{(1)}_{t+1} + \sqrt{1-\rho_h^2}\,\varepsilon^{(2)}_{t+1}\right)\\
f_{t+1} &= f_t + \kappa_f(\theta_f - f_t) + \sigma_f\left(\rho_f\,\varepsilon^{(1)}_{t+1} + \sqrt{1-\rho_f^2}\,\varepsilon^{(3)}_{t+1}\right)\\
\log\big(\widehat{IV}_{t,t+1;M}\big) &\approx h_t + f_t + \sqrt{\frac{1}{M}\,\widehat{\Omega}_{t,t+1;M}}\;\varepsilon^{(IV)}_{t+1}
\end{align*}

Columns represent results for alternative estimation procedures depending on whether our volatility measurement equation based on high-frequency log integrated variance measures $\log(\widehat{IV}_{t,t+1;M})$ is used (HF 5-min with M=78; HF 2-min with M=195) or not (daily only), as well as for the infeasible case of perfect observability (known volatility). The rows in each block contain results for different yearly sample lengths (2, 5, or 20 years).

                     κ_h = 0.0130                                κ_f = 0.6724
       T         Daily     HF        HF        Known         Daily     HF        HF        Known
       (years)   Only      5-min     2-min     Volatility    Only      5-min     2-min     Volatility
MEAN   2         0.0254    0.0203    0.0202    0.0154        0.9868    0.7011    0.6930    0.6750
MEAN   5         0.0195    0.0157    0.0156    0.0142        0.9146    0.6870    0.6774    0.6719
MEAN   20        0.0150    0.0136    0.0136    0.0134        0.7303    0.6810    0.6742    0.6721
BIAS   2         0.0124    0.0073    0.0072    0.0024        0.3144    0.0287    0.0206    0.0026
BIAS   5         0.0065    0.0027    0.0026    0.0012        0.2422    0.0146    0.0050   -0.0005
BIAS   20        0.0020    0.0006    0.0006    0.0004        0.0579    0.0086    0.0018   -0.0003
RMSE   2         0.0206    0.0156    0.0156    0.0101        0.3794    0.0866    0.0775    0.0424
RMSE   5         0.0121    0.0079    0.0079    0.0059        0.3400    0.0512    0.0444    0.0275
RMSE   20        0.0042    0.0027    0.0027    0.0025        0.2444    0.0260    0.0231    0.0131

                     σ_h = 0.1320                                σ_f = 0.3876
       T         Daily     HF        HF        Known         Daily     HF        HF        Known
       (years)   Only      5-min     2-min     Volatility    Only      5-min     2-min     Volatility
MEAN   2         0.1553    0.1426    0.1421    0.1321        0.1685    0.3818    0.3795    0.3862
MEAN   5         0.1507    0.1367    0.1362    0.1322        0.1752    0.3883    0.3849    0.3864
MEAN   20        0.1400    0.1333    0.1329    0.1320        0.2643    0.3919    0.3879    0.3872
BIAS   2         0.0233    0.0106    0.0101    0.0001       -0.2191   -0.0058   -0.0081   -0.0014
BIAS   5         0.0187    0.0047    0.0042    0.0002       -0.2124    0.0007   -0.0027   -0.0012
BIAS   20        0.0080    0.0013    0.0009    0.0000       -0.1233    0.0043    0.0003   -0.0004
RMSE   2         0.0338    0.0208    0.0205    0.0042        0.2203    0.0196    0.0184    0.0121
RMSE   5         0.0285    0.0143    0.0139    0.0026        0.2159    0.0118    0.0117    0.0081
RMSE   20        0.0154    0.0072    0.0070    0.0013        0.1423    0.0075    0.0056    0.0039

                     ρ_h = -0.2831                               ρ_f = -0.1109
       T         Daily     HF        HF        Known         Daily     HF        HF        Known
       (years)   Only      5-min     2-min     Volatility    Only      5-min     2-min     Volatility
MEAN   2        -0.2225   -0.2615   -0.2610   -0.2802       -0.1443   -0.1130   -0.1172   -0.1095
MEAN   5        -0.2509   -0.2769   -0.2755   -0.2815       -0.1739   -0.1096   -0.1116   -0.1105
MEAN   20       -0.2669   -0.2849   -0.2824   -0.2824       -0.1712   -0.1104   -0.1112   -0.1098
BIAS   2         0.0606    0.0216    0.0221    0.0029       -0.0334   -0.0021   -0.0063    0.0014
BIAS   5         0.0322    0.0062    0.0076    0.0016       -0.0630    0.0013   -0.0007    0.0004
BIAS   20        0.0162   -0.0018    0.0007    0.0007       -0.0603    0.0005   -0.0003    0.0011
RMSE   2         0.1965    0.1316    0.1276    0.0411        0.2543    0.0628    0.0613    0.0439
RMSE   5         0.1376    0.0864    0.0857    0.0256        0.2305    0.0399    0.0375    0.0292
RMSE   20        0.0774    0.0461    0.0453    0.0121        0.1185    0.0190    0.0178    0.0143

Table 3: Volatility state estimates for a two-factor log-SV model with leverage effects and jumps. We report the mean, bias, and RMSE of the terminal volatility state estimates obtained across 1,000 Monte Carlo replications. The state-space form of the model is as follows:

\begin{align*}
Y_{t+1} - Y_t &= \mu + \exp\left(\frac{h_t + f_t}{2}\right)\varepsilon^{(1)}_{t+1} + q_{t+1}\cdot J_{t+1}\\
h_{t+1} &= h_t + \kappa_h(\theta_h - h_t) + \sigma_h\left(\rho_h\,\varepsilon^{(1)}_{t+1} + \sqrt{1-\rho_h^2}\,\varepsilon^{(2)}_{t+1}\right)\\
f_{t+1} &= f_t + \kappa_f(\theta_f - f_t) + \sigma_f\left(\rho_f\,\varepsilon^{(1)}_{t+1} + \sqrt{1-\rho_f^2}\,\varepsilon^{(3)}_{t+1}\right)\\
\log\big(\widehat{IV}_{t,t+1;M}\big) &\approx h_t + f_t + \sqrt{\frac{1}{M}\,\widehat{\Omega}_{t,t+1;M}}\;\varepsilon^{(IV)}_{t+1}
\end{align*}

Columns represent results for alternative estimation procedures depending on whether our volatility measurement equation based on high-frequency log integrated variance measures $\log(\widehat{IV}_{t,t+1;M})$ is used (HF 5-min with M=78; HF 2-min with M=195) or not (daily only), as well as for the infeasible case of perfect observability (known volatility). The rows in each block contain results for different yearly sample lengths (2, 5, or 20 years).

                     h_T : E[h_T] = -9.8998                      f_T : E[f_T] = 0
       T         Daily     HF        HF        Known         Daily     HF        HF        Known
       (years)   Only      5-min     2-min     Volatility    Only      5-min     2-min     Volatility
MEAN   2        -9.8259   -9.8616   -9.8560   -9.9200        0.0017    0.0014   -0.0013   -0.0065
MEAN   5        -9.8631   -9.9379   -9.9238   -9.8467       -0.0028   -0.0170   -0.0172   -0.0084
MEAN   20       -9.9611   -9.9978   -9.9893   -9.9037       -0.0042   -0.0028   -0.0029    0.0013
BIAS   2         0.0030   -0.0326   -0.0270    0.0000        0.0207    0.0204    0.0177    0.0000
BIAS   5         0.0520   -0.0229   -0.0087    0.0000        0.0299    0.0157    0.0155    0.0000
BIAS   20        0.0255   -0.0112   -0.0027    0.0000        0.0184    0.0199    0.0197    0.0000
RMSE   2         0.4748    0.3049    0.2958    0.0000        0.4207    0.4075    0.4070    0.0000
RMSE   5         0.4687    0.3116    0.2997    0.0000        0.4343    0.4197    0.4189    0.0000
RMSE   20        0.4491    0.2736    0.2662    0.0000        0.3980    0.3813    0.3800    0.0000

Table 4: Forecasts of conditional return moments at horizons of one, five, ten and twenty days ahead for a one-factor log-SV model with leverage effects. We report the mean, bias and RMSE of forecasts of the conditional return mean, standard deviation, skewness and kurtosis at horizons of one (panel A), five (panel B), ten (panel C) and twenty (panel D) days ahead, obtained across 1,000 Monte Carlo replications. Each three rows reported for the mean, bias, and RMSE of these moment forecasts contain results for three different sample lengths T equal to 2, 5 and 20 years (as indicated in the second column), taking parameter and volatility estimation uncertainty into account. For each moment and forecast horizon we report results for two alternative Bayesian estimation procedures in adjacent column pairs: either without (left column denoted “Daily only”) or with (right column denoted “HF 5-min”) augmenting the underlying state-space formulation with a daily volatility measurement equation based on high frequency intraday data, as proposed in this paper.

Panel A: 1-day horizon

| Statistic | T (years) | Mean (Daily only) | Mean (HF 5-min) | StdDev (Daily only) | StdDev (HF 5-min) | Skewness (Daily only) | Skewness (HF 5-min) | Kurtosis (Daily only) | Kurtosis (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | 0.0001 | 0.0001 | 0.0089 | 0.0085 | 0.0004 | 0.0001 | 3.6083 | 3.0767 |
| MEAN | 5 | 0.0001 | 0.0001 | 0.0090 | 0.0087 | 0.0000 | 0.0000 | 3.5487 | 3.0757 |
| MEAN | 20 | 0.0001 | 0.0001 | 0.0088 | 0.0087 | 0.0000 | 0.0000 | 3.5070 | 3.0752 |
| BIAS | 2 | 0.0000 | 0.0000 | 0.0004 | 0.0000 | 0.0003 | 0.0001 | 0.6141 | 0.0826 |
| BIAS | 5 | 0.0000 | 0.0000 | 0.0003 | 0.0000 | 0.0000 | 0.0001 | 0.5546 | 0.0815 |
| BIAS | 20 | 0.0000 | 0.0000 | 0.0001 | -0.0001 | 0.0000 | 0.0000 | 0.5129 | 0.0811 |
| RMSE | 2 | 0.0003 | 0.0003 | 0.0019 | 0.0008 | 0.0023 | 0.0008 | 0.6347 | 0.0831 |
| RMSE | 5 | 0.0002 | 0.0002 | 0.0018 | 0.0008 | 0.0011 | 0.0007 | 0.5671 | 0.0818 |
| RMSE | 20 | 0.0001 | 0.0001 | 0.0018 | 0.0008 | 0.0009 | 0.0007 | 0.5214 | 0.0812 |

Panel B: 5-day horizon

| Statistic | T (years) | Mean (Daily only) | Mean (HF 5-min) | StdDev (Daily only) | StdDev (HF 5-min) | Skewness (Daily only) | Skewness (HF 5-min) | Kurtosis (Daily only) | Kurtosis (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | 0.0009 | 0.0008 | 0.0238 | 0.0228 | -0.0747 | -0.0728 | 3.3648 | 3.1203 |
| MEAN | 5 | 0.0009 | 0.0009 | 0.0239 | 0.0232 | -0.0756 | -0.0729 | 3.3232 | 3.1185 |
| MEAN | 20 | 0.0010 | 0.0010 | 0.0235 | 0.0231 | -0.0761 | -0.0733 | 3.2938 | 3.1175 |
| BIAS | 2 | -0.0001 | -0.0002 | 0.0011 | 0.0001 | -0.0020 | -0.0001 | 0.2857 | 0.0411 |
| BIAS | 5 | -0.0001 | -0.0001 | 0.0008 | 0.0001 | -0.0030 | -0.0003 | 0.2442 | 0.0396 |
| BIAS | 20 | 0.0000 | 0.0000 | 0.0002 | -0.0002 | -0.0036 | -0.0008 | 0.2149 | 0.0386 |
| RMSE | 2 | 0.0024 | 0.0022 | 0.0050 | 0.0021 | 0.0352 | 0.0108 | 0.3032 | 0.0441 |
| RMSE | 5 | 0.0015 | 0.0014 | 0.0047 | 0.0020 | 0.0234 | 0.0072 | 0.2534 | 0.0408 |
| RMSE | 20 | 0.0007 | 0.0007 | 0.0047 | 0.0020 | 0.0125 | 0.0037 | 0.2204 | 0.0390 |

Panel C: 10-day horizon

| Statistic | T (years) | Mean (Daily only) | Mean (HF 5-min) | StdDev (Daily only) | StdDev (HF 5-min) | Skewness (Daily only) | Skewness (HF 5-min) | Kurtosis (Daily only) | Kurtosis (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | 0.0017 | 0.0015 | 0.0327 | 0.0313 | -0.0953 | -0.0929 | 3.3719 | 3.1644 |
| MEAN | 5 | 0.0017 | 0.0016 | 0.0328 | 0.0318 | -0.0957 | -0.0926 | 3.3232 | 3.1614 |
| MEAN | 20 | 0.0018 | 0.0018 | 0.0322 | 0.0317 | -0.0959 | -0.0929 | 3.2909 | 3.1598 |
| BIAS | 2 | -0.0002 | -0.0004 | 0.0016 | 0.0002 | -0.0032 | -0.0008 | 0.2449 | 0.0374 |
| BIAS | 5 | -0.0001 | -0.0003 | 0.0012 | 0.0002 | -0.0038 | -0.0007 | 0.1966 | 0.0349 |
| BIAS | 20 | 0.0000 | 0.0000 | 0.0003 | -0.0002 | -0.0041 | -0.0011 | 0.1643 | 0.0332 |
| RMSE | 2 | 0.0045 | 0.0042 | 0.0067 | 0.0029 | 0.0453 | 0.0141 | 0.2717 | 0.0457 |
| RMSE | 5 | 0.0027 | 0.0026 | 0.0063 | 0.0027 | 0.0297 | 0.0093 | 0.2092 | 0.0384 |
| RMSE | 20 | 0.0013 | 0.0013 | 0.0062 | 0.0027 | 0.0154 | 0.0047 | 0.1708 | 0.0343 |

Panel D: 20-day horizon

| Statistic | T (years) | Mean (Daily only) | Mean (HF 5-min) | StdDev (Daily only) | StdDev (HF 5-min) | Skewness (Daily only) | Skewness (HF 5-min) | Kurtosis (Daily only) | Kurtosis (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | 0.0031 | 0.0027 | 0.0451 | 0.0433 | -0.1375 | -0.1347 | 3.5016 | 3.2777 |
| MEAN | 5 | 0.0031 | 0.0029 | 0.0451 | 0.0438 | -0.1370 | -0.1335 | 3.4183 | 3.2694 |
| MEAN | 20 | 0.0034 | 0.0034 | 0.0441 | 0.0434 | -0.1364 | -0.1334 | 3.3711 | 3.2646 |
| BIAS | 2 | -0.0003 | -0.0007 | 0.0024 | 0.0006 | -0.0051 | -0.0022 | 0.2698 | 0.0460 |
| BIAS | 5 | -0.0003 | -0.0005 | 0.0017 | 0.0004 | -0.0049 | -0.0014 | 0.1879 | 0.0390 |
| BIAS | 20 | 0.0000 | 0.0000 | 0.0004 | -0.0002 | -0.0045 | -0.0015 | 0.1410 | 0.0345 |
| RMSE | 2 | 0.0082 | 0.0077 | 0.0091 | 0.0042 | 0.0669 | 0.0219 | 0.3366 | 0.0760 |
| RMSE | 5 | 0.0050 | 0.0048 | 0.0084 | 0.0037 | 0.0429 | 0.0142 | 0.2169 | 0.0539 |
| RMSE | 20 | 0.0025 | 0.0023 | 0.0079 | 0.0035 | 0.0214 | 0.0071 | 0.1525 | 0.0387 |
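As a reading aid for the MEAN, BIAS and RMSE blocks, the following minimal Python sketch shows one way such summaries could be computed across Monte Carlo replications. The function name `mc_summary` and the convention that each replication carries its own true value are assumptions for illustration; the paper's exact implementation is not shown in this section.

```python
import numpy as np

def mc_summary(forecasts, truths):
    """Summarize forecasts across replications (hypothetical helper).

    forecasts[i] is the forecast from replication i based on estimates,
    truths[i] is the corresponding forecast based on the true values.
    """
    forecasts = np.asarray(forecasts, dtype=float)
    truths = np.asarray(truths, dtype=float)
    errors = forecasts - truths
    return {
        "mean": forecasts.mean(),                 # average forecast
        "bias": errors.mean(),                    # average signed error
        "rmse": np.sqrt((errors ** 2).mean()),    # root mean squared error
    }

summary = mc_summary([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

Under this convention the mean minus the bias recovers the average true value, which is a quick consistency check when reading the tables: in Panel A at T = 2, the kurtosis entries give 3.6083 - 0.6141 and 3.0767 - 0.0826, both about 2.994.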

Table 5: Forecasts of conditional return moments at horizons of one, five, ten and twenty days ahead for a two-factor log-SV model with leverage effects and jumps. We report the mean, bias and RMSE of forecasts of the conditional return mean, standard deviation, skewness and kurtosis at horizons of one (panel A), five (panel B), ten (panel C) and twenty (panel D) days ahead, obtained across 1,000 Monte Carlo replications. Each three rows reported for the mean, bias, and RMSE of these moment forecasts contain results for three different sample lengths T equal to 2, 5 and 20 years (as indicated in the second column), taking parameter and volatility estimation uncertainty into account. For each moment and forecast horizon we report results for two alternative Bayesian estimation procedures in adjacent column pairs: either without (left column denoted “Daily only”) or with (right column denoted “HF 5-min”) augmenting the underlying state-space formulation with a daily volatility measurement equation based on high frequency intraday data, as proposed in this paper.

Panel A: 1-day horizon

| Statistic | T (years) | Mean (Daily only) | Mean (HF 5-min) | StdDev (Daily only) | StdDev (HF 5-min) | Skewness (Daily only) | Skewness (HF 5-min) | Kurtosis (Daily only) | Kurtosis (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | 0.0005 | 0.0005 | 0.0099 | 0.0098 | -0.5831 | -0.6173 | 92.9059 | 96.2450 |
| MEAN | 5 | 0.0006 | 0.0005 | 0.0097 | 0.0095 | -0.5923 | -0.6520 | 90.2971 | 102.7384 |
| MEAN | 20 | 0.0006 | 0.0006 | 0.0095 | 0.0094 | -0.6382 | -0.6956 | 98.1776 | 110.3568 |
| BIAS | 2 | 0.0000 | 0.0000 | 0.0000 | -0.0001 | 0.0501 | 0.0158 | -6.8173 | -3.4781 |
| BIAS | 5 | 0.0000 | 0.0000 | 0.0002 | 0.0000 | 0.0824 | 0.0226 | -17.8110 | -5.3698 |
| BIAS | 20 | 0.0000 | 0.0000 | 0.0001 | 0.0000 | 0.0871 | 0.0297 | -19.4314 | -7.2521 |
| RMSE | 2 | 0.0003 | 0.0003 | 0.0020 | 0.0012 | 0.3128 | 0.2007 | 114.2795 | 36.3367 |
| RMSE | 5 | 0.0002 | 0.0002 | 0.0017 | 0.0012 | 0.3042 | 0.1981 | 60.8291 | 39.1525 |
| RMSE | 20 | 0.0001 | 0.0001 | 0.0015 | 0.0010 | 0.3085 | 0.1924 | 64.5270 | 37.8032 |

Panel B: 5-day horizon

| Statistic | T (years) | Mean (Daily only) | Mean (HF 5-min) | StdDev (Daily only) | StdDev (HF 5-min) | Skewness (Daily only) | Skewness (HF 5-min) | Kurtosis (Daily only) | Kurtosis (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | 0.0038 | 0.0038 | 0.0264 | 0.0261 | -0.2473 | -0.2272 | 17.1562 | 15.8092 |
| MEAN | 5 | 0.0039 | 0.0038 | 0.0258 | 0.0253 | -0.2617 | -0.2840 | 15.0475 | 16.6968 |
| MEAN | 20 | 0.0040 | 0.0039 | 0.0251 | 0.0249 | -0.2800 | -0.2973 | 16.1293 | 17.7012 |
| BIAS | 2 | -0.0001 | -0.0001 | 0.0002 | 0.0000 | 0.0286 | 0.0036 | 0.9160 | -0.4310 |
| BIAS | 5 | -0.0001 | -0.0001 | 0.0005 | -0.0001 | 0.0287 | 0.0066 | -2.3048 | -0.6776 |
| BIAS | 20 | 0.0000 | -0.0001 | 0.0002 | 0.0000 | 0.0257 | 0.0084 | -2.5075 | -0.9356 |
| RMSE | 2 | 0.0022 | 0.0020 | 0.0052 | 0.0031 | 0.1766 | 0.0621 | 51.3547 | 4.9173 |
| RMSE | 5 | 0.0013 | 0.0012 | 0.0044 | 0.0030 | 0.1009 | 0.0642 | 8.1963 | 5.2467 |
| RMSE | 20 | 0.0006 | 0.0006 | 0.0039 | 0.0025 | 0.1001 | 0.0589 | 8.6529 | 5.0507 |

Panel C: 10-day horizon

| Statistic | T (years) | Mean (Daily only) | Mean (HF 5-min) | StdDev (Daily only) | StdDev (HF 5-min) | Skewness (Daily only) | Skewness (HF 5-min) | Kurtosis (Daily only) | Kurtosis (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | 0.0071 | 0.0071 | 0.0362 | 0.0358 | -0.1987 | -0.2149 | 10.8792 | 9.6985 |
| MEAN | 5 | 0.0072 | 0.0071 | 0.0353 | 0.0346 | -0.2070 | -0.2220 | 9.3137 | 10.1363 |
| MEAN | 20 | 0.0073 | 0.0072 | 0.0344 | 0.0341 | -0.2198 | -0.2306 | 9.8631 | 10.6367 |
| BIAS | 2 | -0.0002 | -0.0003 | 0.0005 | 0.0001 | 0.0169 | 0.0007 | 1.0048 | -0.1759 |
| BIAS | 5 | -0.0001 | -0.0003 | 0.0007 | 0.0000 | 0.0184 | 0.0036 | -1.1204 | -0.3090 |
| BIAS | 20 | 0.0000 | -0.0001 | 0.0003 | 0.0000 | 0.0158 | 0.0050 | -1.2186 | -0.4450 |
| RMSE | 2 | 0.0040 | 0.0037 | 0.0069 | 0.0042 | 0.0986 | 0.0443 | 35.4541 | 2.5309 |
| RMSE | 5 | 0.0024 | 0.0022 | 0.0059 | 0.0040 | 0.0710 | 0.0436 | 4.1702 | 2.6744 |
| RMSE | 20 | 0.0012 | 0.0011 | 0.0051 | 0.0032 | 0.0679 | 0.0395 | 4.3685 | 2.5495 |

Panel D: 20-day horizon

| Statistic | T (years) | Mean (Daily only) | Mean (HF 5-min) | StdDev (Daily only) | StdDev (HF 5-min) | Skewness (Daily only) | Skewness (HF 5-min) | Kurtosis (Daily only) | Kurtosis (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | 0.0132 | 0.0130 | 0.0498 | 0.0492 | -0.1718 | -0.1811 | 7.0727 | 6.5231 |
| MEAN | 5 | 0.0133 | 0.0130 | 0.0484 | 0.0474 | -0.1765 | -0.1848 | 6.3465 | 6.7111 |
| MEAN | 20 | 0.0136 | 0.0133 | 0.0470 | 0.0466 | -0.1842 | -0.1892 | 6.5902 | 6.9316 |
| BIAS | 2 | -0.0004 | -0.0005 | 0.0010 | 0.0004 | 0.0073 | -0.0021 | 0.5331 | -0.0166 |
| BIAS | 5 | -0.0002 | -0.0005 | 0.0010 | 0.0001 | 0.0084 | 0.0003 | -0.4614 | -0.1021 |
| BIAS | 20 | 0.0000 | -0.0002 | 0.0004 | 0.0000 | 0.0070 | 0.0019 | -0.5289 | -0.1875 |
| RMSE | 2 | 0.0074 | 0.0067 | 0.0092 | 0.0058 | 0.0729 | 0.0353 | 14.6681 | 1.2715 |
| RMSE | 5 | 0.0044 | 0.0041 | 0.0077 | 0.0052 | 0.0515 | 0.0298 | 2.0322 | 1.3205 |
| RMSE | 20 | 0.0022 | 0.0020 | 0.0065 | 0.0041 | 0.0432 | 0.0252 | 2.0915 | 1.2267 |

Table 6: Forecasts of conditional return quantiles at horizons of one, five, ten and twenty days ahead for a one-factor log-SV model with leverage effects. We report the mean, bias and RMSE of forecasts of the 0.01, 0.05, 0.95 and 0.99 conditional return quantiles at horizons of one (panel A), five (panel B), ten (panel C) and twenty (panel D) days ahead, obtained across 1,000 Monte Carlo replications. Each three rows reported for the mean, bias, and RMSE of these quantile forecasts contain results for three different sample lengths T equal to 2, 5 and 20 years (as indicated in the second column), taking parameter and volatility estimation uncertainty into account. For each quantile and forecast horizon we report results for two alternative Bayesian estimation procedures in adjacent column pairs: either without (left column denoted “Daily only”) or with (right column denoted “HF 5-min”) augmenting the underlying state-space formulation with a daily volatility measurement equation based on high frequency intraday data, as proposed in this paper.

Panel A: 1-day horizon

| Statistic | T (years) | Quantile 0.01 (Daily only) | Quantile 0.01 (HF 5-min) | Quantile 0.05 (Daily only) | Quantile 0.05 (HF 5-min) | Quantile 0.95 (Daily only) | Quantile 0.95 (HF 5-min) | Quantile 0.99 (Daily only) | Quantile 0.99 (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | -0.0215 | -0.0199 | -0.0144 | -0.0139 | 0.0146 | 0.0141 | 0.0218 | 0.0201 |
| MEAN | 5 | -0.0217 | -0.0202 | -0.0146 | -0.0142 | 0.0148 | 0.0144 | 0.0220 | 0.0205 |
| MEAN | 20 | -0.0213 | -0.0202 | -0.0143 | -0.0141 | 0.0146 | 0.0144 | 0.0215 | 0.0205 |
| BIAS | 2 | -0.0018 | -0.0002 | -0.0005 | 0.0000 | 0.0005 | -0.0001 | 0.0018 | 0.0001 |
| BIAS | 5 | -0.0017 | -0.0002 | -0.0004 | 0.0000 | 0.0004 | 0.0000 | 0.0016 | 0.0002 |
| BIAS | 20 | -0.0011 | 0.0000 | -0.0001 | 0.0001 | 0.0001 | -0.0001 | 0.0011 | 0.0000 |
| RMSE | 2 | 0.0048 | 0.0019 | 0.0032 | 0.0014 | 0.0031 | 0.0014 | 0.0048 | 0.0019 |
| RMSE | 5 | 0.0046 | 0.0018 | 0.0030 | 0.0013 | 0.0030 | 0.0013 | 0.0046 | 0.0018 |
| RMSE | 20 | 0.0044 | 0.0018 | 0.0030 | 0.0013 | 0.0030 | 0.0013 | 0.0044 | 0.0018 |

Panel B: 5-day horizon

| Statistic | T (years) | Quantile 0.01 (Daily only) | Quantile 0.01 (HF 5-min) | Quantile 0.05 (Daily only) | Quantile 0.05 (HF 5-min) | Quantile 0.95 (Daily only) | Quantile 0.95 (HF 5-min) | Quantile 0.99 (Daily only) | Quantile 0.99 (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | -0.0571 | -0.0539 | -0.0384 | -0.0371 | 0.0394 | 0.0378 | 0.0566 | 0.0532 |
| MEAN | 5 | -0.0574 | -0.0548 | -0.0387 | -0.0376 | 0.0397 | 0.0385 | 0.0568 | 0.0541 |
| MEAN | 20 | -0.0562 | -0.0545 | -0.0379 | -0.0374 | 0.0390 | 0.0385 | 0.0558 | 0.0541 |
| BIAS | 2 | -0.0038 | -0.0006 | -0.0017 | -0.0003 | 0.0015 | -0.0001 | 0.0036 | 0.0002 |
| BIAS | 5 | -0.0032 | -0.0005 | -0.0013 | -0.0002 | 0.0012 | 0.0000 | 0.0030 | 0.0002 |
| BIAS | 20 | -0.0015 | 0.0002 | -0.0003 | 0.0003 | 0.0003 | -0.0003 | 0.0015 | -0.0002 |
| RMSE | 2 | 0.0128 | 0.0056 | 0.0088 | 0.0042 | 0.0082 | 0.0040 | 0.0119 | 0.0052 |
| RMSE | 5 | 0.0118 | 0.0050 | 0.0080 | 0.0036 | 0.0078 | 0.0035 | 0.0113 | 0.0047 |
| RMSE | 20 | 0.0113 | 0.0049 | 0.0078 | 0.0035 | 0.0076 | 0.0034 | 0.0108 | 0.0047 |

Panel C: 10-day horizon

| Statistic | T (years) | Quantile 0.01 (Daily only) | Quantile 0.01 (HF 5-min) | Quantile 0.05 (Daily only) | Quantile 0.05 (HF 5-min) | Quantile 0.95 (Daily only) | Quantile 0.95 (HF 5-min) | Quantile 0.99 (Daily only) | Quantile 0.99 (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | -0.0785 | -0.0745 | -0.0525 | -0.0507 | 0.0544 | 0.0522 | 0.0777 | 0.0733 |
| MEAN | 5 | -0.0787 | -0.0754 | -0.0528 | -0.0514 | 0.0547 | 0.0530 | 0.0779 | 0.0745 |
| MEAN | 20 | -0.0768 | -0.0748 | -0.0516 | -0.0509 | 0.0538 | 0.0530 | 0.0763 | 0.0744 |
| BIAS | 2 | -0.0053 | -0.0012 | -0.0026 | -0.0007 | 0.0022 | 0.0000 | 0.0048 | 0.0004 |
| BIAS | 5 | -0.0041 | -0.0008 | -0.0019 | -0.0005 | 0.0016 | 0.0000 | 0.0037 | 0.0003 |
| BIAS | 20 | -0.0017 | 0.0002 | -0.0004 | 0.0003 | 0.0004 | -0.0004 | 0.0017 | -0.0003 |
| RMSE | 2 | 0.0179 | 0.0085 | 0.0124 | 0.0066 | 0.0113 | 0.0059 | 0.0162 | 0.0075 |
| RMSE | 5 | 0.0160 | 0.0072 | 0.0109 | 0.0053 | 0.0106 | 0.0050 | 0.0151 | 0.0066 |
| RMSE | 20 | 0.0151 | 0.0066 | 0.0104 | 0.0047 | 0.0100 | 0.0045 | 0.0142 | 0.0062 |

Panel D: 20-day horizon

| Statistic | T (years) | Quantile 0.01 (Daily only) | Quantile 0.01 (HF 5-min) | Quantile 0.05 (Daily only) | Quantile 0.05 (HF 5-min) | Quantile 0.95 (Daily only) | Quantile 0.95 (HF 5-min) | Quantile 0.99 (Daily only) | Quantile 0.99 (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | -0.1092 | -0.1041 | -0.0720 | -0.0697 | 0.0754 | 0.0723 | 0.1075 | 0.1016 |
| MEAN | 5 | -0.1089 | -0.1048 | -0.0720 | -0.0702 | 0.0754 | 0.0732 | 0.1070 | 0.1028 |
| MEAN | 20 | -0.1058 | -0.1035 | -0.0701 | -0.0692 | 0.0741 | 0.0731 | 0.1047 | 0.1024 |
| BIAS | 2 | -0.0078 | -0.0026 | -0.0040 | -0.0017 | 0.0034 | 0.0003 | 0.0069 | 0.0011 |
| BIAS | 5 | -0.0057 | -0.0016 | -0.0028 | -0.0010 | 0.0023 | 0.0001 | 0.0049 | 0.0006 |
| BIAS | 20 | -0.0020 | 0.0003 | -0.0005 | 0.0004 | 0.0005 | -0.0004 | 0.0019 | -0.0003 |
| RMSE | 2 | 0.0258 | 0.0140 | 0.0182 | 0.0111 | 0.0159 | 0.0094 | 0.0223 | 0.0114 |
| RMSE | 5 | 0.0219 | 0.0109 | 0.0150 | 0.0082 | 0.0143 | 0.0073 | 0.0202 | 0.0094 |
| RMSE | 20 | 0.0197 | 0.0089 | 0.0136 | 0.0064 | 0.0128 | 0.0059 | 0.0181 | 0.0081 |
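To make concrete what a conditional return quantile forecast that takes parameter and volatility estimation uncertainty into account might look like, the sketch below pools one simulated return path per posterior draw and reads off empirical quantiles of the cumulative return. It assumes the one-factor analogue of the Table 3 state equations (no second factor, no jumps). The function name, the dictionary keys and the pooling scheme are hypothetical choices for illustration, not the paper's documented procedure.

```python
import numpy as np

def predictive_quantiles(draws, horizon, probs=(0.01, 0.05, 0.95, 0.99), seed=0):
    """Empirical quantiles of the horizon-day cumulative return.

    draws: dict of equal-length 1-D arrays, one entry per posterior draw, with
    hypothetical keys mu, kappa, theta, sigma, rho and h_T (terminal log-variance).
    """
    rng = np.random.default_rng(seed)
    mu, kappa, theta, sigma, rho = (
        np.asarray(draws[k], dtype=float)
        for k in ("mu", "kappa", "theta", "sigma", "rho"))
    h = np.asarray(draws["h_T"], dtype=float).copy()
    cum_ret = np.zeros_like(h)
    for _ in range(horizon):
        e1 = rng.standard_normal(h.shape)
        e2 = rng.standard_normal(h.shape)
        cum_ret += mu + np.exp(h / 2.0) * e1          # daily return given current h
        # Log-variance update with leverage: e1 is shared with the return shock.
        h = h + kappa * (theta - h) + sigma * (rho * e1 + np.sqrt(1.0 - rho**2) * e2)
    return np.quantile(cum_ret, probs)

# Usage with made-up "posterior draws" that are all identical (no uncertainty).
n = 10_000
draws = {"mu": np.zeros(n), "kappa": np.full(n, 0.02), "theta": np.full(n, -9.9),
         "sigma": np.full(n, 0.1), "rho": np.full(n, -0.5), "h_T": np.full(n, -9.9)}
q = predictive_quantiles(draws, horizon=5)
```

Because each path starts from its own draw of the parameters and of the terminal state, dispersion in the posterior feeds directly into the tails of the pooled distribution, so a more precisely estimated terminal state tends to give tighter quantile forecasts in this sketch.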

Table 7: Forecasts of conditional return quantiles at horizons of one, five, ten and twenty days ahead for a two-factor log-SV model with leverage effects and jumps. We report the mean, bias and RMSE of forecasts of the 0.01, 0.05, 0.95 and 0.99 conditional return quantiles at horizons of one (panel A), five (panel B), ten (panel C) and twenty (panel D) days ahead, obtained across 1,000 Monte Carlo replications. Each three rows reported for the mean, bias, and RMSE of these quantile forecasts contain results for three different sample lengths T equal to 2, 5 and 20 years (as indicated in the second column), taking parameter and volatility estimation uncertainty into account. For each quantile and forecast horizon we report results for two alternative Bayesian estimation procedures in adjacent column pairs: either without (left column denoted “Daily only”) or with (right column denoted “HF 5-min”) augmenting the underlying state-space formulation with a daily volatility measurement equation based on high frequency intraday data, as proposed in this paper.

Panel A: 1-day horizon

| Statistic | T (years) | Quantile 0.01 (Daily only) | Quantile 0.01 (HF 5-min) | Quantile 0.05 (Daily only) | Quantile 0.05 (HF 5-min) | Quantile 0.95 (Daily only) | Quantile 0.95 (HF 5-min) | Quantile 0.99 (Daily only) | Quantile 0.99 (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | -0.0211 | -0.0207 | -0.0133 | -0.0131 | 0.0144 | 0.0142 | 0.0221 | 0.0217 |
| MEAN | 5 | -0.0205 | -0.0198 | -0.0129 | -0.0125 | 0.0141 | 0.0136 | 0.0216 | 0.0208 |
| MEAN | 20 | -0.0199 | -0.0193 | -0.0124 | -0.0122 | 0.0136 | 0.0134 | 0.0210 | 0.0204 |
| BIAS | 2 | -0.0008 | -0.0003 | -0.0001 | 0.0001 | 0.0001 | -0.0001 | 0.0007 | 0.0003 |
| BIAS | 5 | -0.0011 | -0.0004 | -0.0004 | 0.0000 | 0.0003 | -0.0001 | 0.0011 | 0.0003 |
| BIAS | 20 | -0.0011 | -0.0005 | -0.0002 | 0.0000 | 0.0002 | 0.0000 | 0.0011 | 0.0005 |
| RMSE | 2 | 0.0058 | 0.0034 | 0.0037 | 0.0022 | 0.0037 | 0.0022 | 0.0058 | 0.0034 |
| RMSE | 5 | 0.0050 | 0.0033 | 0.0032 | 0.0022 | 0.0032 | 0.0022 | 0.0050 | 0.0034 |
| RMSE | 20 | 0.0045 | 0.0028 | 0.0029 | 0.0018 | 0.0029 | 0.0018 | 0.0045 | 0.0028 |

Panel B: 5-day horizon

| Statistic | T (years) | Quantile 0.01 (Daily only) | Quantile 0.01 (HF 5-min) | Quantile 0.05 (Daily only) | Quantile 0.05 (HF 5-min) | Quantile 0.95 (Daily only) | Quantile 0.95 (HF 5-min) | Quantile 0.99 (Daily only) | Quantile 0.99 (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | -0.0618 | -0.0605 | -0.0352 | -0.0348 | 0.0420 | 0.0414 | 0.0650 | 0.0633 |
| MEAN | 5 | -0.0603 | -0.0582 | -0.0342 | -0.0332 | 0.0410 | 0.0398 | 0.0635 | 0.0611 |
| MEAN | 20 | -0.0585 | -0.0570 | -0.0328 | -0.0323 | 0.0397 | 0.0391 | 0.0617 | 0.0600 |
| BIAS | 2 | -0.0023 | -0.0010 | -0.0005 | -0.0001 | 0.0005 | -0.0002 | 0.0024 | 0.0006 |
| BIAS | 5 | -0.0029 | -0.0009 | -0.0011 | -0.0001 | 0.0010 | -0.0002 | 0.0030 | 0.0006 |
| BIAS | 20 | -0.0024 | -0.0009 | -0.0007 | -0.0002 | 0.0007 | 0.0000 | 0.0025 | 0.0008 |
| RMSE | 2 | 0.0139 | 0.0082 | 0.0098 | 0.0062 | 0.0098 | 0.0061 | 0.0138 | 0.0082 |
| RMSE | 5 | 0.0120 | 0.0078 | 0.0086 | 0.0058 | 0.0084 | 0.0059 | 0.0118 | 0.0079 |
| RMSE | 20 | 0.0104 | 0.0064 | 0.0076 | 0.0048 | 0.0074 | 0.0047 | 0.0104 | 0.0065 |

Panel C: 10-day horizon

| Statistic | T (years) | Quantile 0.01 (Daily only) | Quantile 0.01 (HF 5-min) | Quantile 0.05 (Daily only) | Quantile 0.05 (HF 5-min) | Quantile 0.95 (Daily only) | Quantile 0.95 (HF 5-min) | Quantile 0.99 (Daily only) | Quantile 0.99 (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | -0.0897 | -0.0885 | -0.0481 | -0.0475 | 0.0608 | 0.0598 | 0.0963 | 0.0945 |
| MEAN | 5 | -0.0875 | -0.0857 | -0.0466 | -0.0453 | 0.0593 | 0.0576 | 0.0942 | 0.0918 |
| MEAN | 20 | -0.0854 | -0.0845 | -0.0446 | -0.0439 | 0.0576 | 0.0566 | 0.0921 | 0.0907 |
| BIAS | 2 | -0.0023 | -0.0011 | -0.0011 | -0.0005 | 0.0009 | -0.0001 | 0.0024 | 0.0006 |
| BIAS | 5 | -0.0024 | -0.0006 | -0.0017 | -0.0003 | 0.0015 | -0.0002 | 0.0026 | 0.0001 |
| BIAS | 20 | -0.0013 | -0.0004 | -0.0010 | -0.0003 | 0.0010 | 0.0000 | 0.0016 | 0.0003 |
| RMSE | 2 | 0.0165 | 0.0100 | 0.0132 | 0.0087 | 0.0132 | 0.0085 | 0.0166 | 0.0101 |
| RMSE | 5 | 0.0133 | 0.0088 | 0.0115 | 0.0077 | 0.0112 | 0.0079 | 0.0133 | 0.0093 |
| RMSE | 20 | 0.0108 | 0.0069 | 0.0100 | 0.0063 | 0.0098 | 0.0062 | 0.0111 | 0.0072 |

Panel D: 20-day horizon

| Statistic | T (years) | Quantile 0.01 (Daily only) | Quantile 0.01 (HF 5-min) | Quantile 0.05 (Daily only) | Quantile 0.05 (HF 5-min) | Quantile 0.95 (Daily only) | Quantile 0.95 (HF 5-min) | Quantile 0.99 (Daily only) | Quantile 0.99 (HF 5-min) |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 2 | -0.1224 | -0.1208 | -0.0656 | -0.0647 | 0.0889 | 0.0875 | 0.1388 | 0.1366 |
| MEAN | 5 | -0.1189 | -0.1169 | -0.0632 | -0.0616 | 0.0867 | 0.0844 | 0.1355 | 0.1327 |
| MEAN | 20 | -0.1158 | -0.1151 | -0.0605 | -0.0597 | 0.0844 | 0.0831 | 0.1327 | 0.1314 |
| BIAS | 2 | -0.0039 | -0.0023 | -0.0022 | -0.0013 | 0.0017 | 0.0003 | 0.0034 | 0.0012 |
| BIAS | 5 | -0.0030 | -0.0010 | -0.0024 | -0.0007 | 0.0020 | -0.0003 | 0.0027 | -0.0001 |
| BIAS | 20 | -0.0012 | -0.0004 | -0.0013 | -0.0005 | 0.0013 | 0.0000 | 0.0012 | -0.0001 |
| RMSE | 2 | 0.0219 | 0.0143 | 0.0179 | 0.0125 | 0.0178 | 0.0122 | 0.0214 | 0.0138 |
| RMSE | 5 | 0.0169 | 0.0114 | 0.0149 | 0.0101 | 0.0145 | 0.0105 | 0.0162 | 0.0117 |
| RMSE | 20 | 0.0130 | 0.0084 | 0.0126 | 0.0079 | 0.0123 | 0.0079 | 0.0128 | 0.0085 |

Table 8: Relative errors of conditional return quantile forecasts sorted by rank order of the underlying true forecasts at horizons of one, five, ten and twenty days ahead for a two-factor log-SV model with leverage effects and jumps. We report relative errors of forecasts of the 0.01 and 0.05 conditional return quantiles at horizons of one (panel A), five (panel B), ten (panel C) and twenty (panel D) days ahead, calculated across 1,000 Monte Carlo replications as the mean of the percentage difference between a forecast based on parameter and state estimates and the forecast based on the corresponding true values. The results are sorted by the rank order of the true quantile forecasts from low (representing bad times) to high (representing good times), as indicated in the first column. Each three rows reported for ranks 1 (low) to 5 (high) of the true quantile forecasts contain results for three different sample lengths T equal to 2 years, 5 years and 20 years (as given in the second column), taking parameter and volatility estimation uncertainty into account. For each quantile and forecast horizon we report results for two alternative Bayesian estimation procedures in adjacent column pairs: either without (left column denoted “Daily only”) or with (right column denoted “HF 5-min”) augmenting the underlying state-space formulation with a daily volatility measurement equation based on high frequency intraday data, as proposed in this paper. The reported relative errors of conditional return quantile forecasts can also be interpreted as the percentage overestimation or underestimation of the implied capital charge for market risk based on one-percent (quantile 0.01) and five-percent (quantile 0.05) VaR.

Panel A: 1-day horizon

| True forecast rank (quintiles) | Sample size T (years) | Quantile 0.01 forecast error (Daily only) | Quantile 0.01 forecast error (HF 5-min) | Quantile 0.05 forecast error (Daily only) | Quantile 0.05 forecast error (HF 5-min) |
|---|---|---|---|---|---|
| 1st (Low) | 2 | -6.49% | -2.45% | -9.18% | -4.75% |
| 1st (Low) | 5 | -6.67% | -4.03% | -9.31% | -6.27% |
| 1st (Low) | 20 | -4.14% | -1.49% | -7.56% | -3.79% |
| 2nd | 2 | -1.83% | -1.09% | -4.84% | -3.26% |
| 2nd | 5 | 0.81% | 1.01% | -1.82% | -1.20% |
| 2nd | 20 | -0.25% | -0.58% | -4.17% | -2.95% |
| 3rd | 2 | 7.88% | 4.67% | 5.02% | 2.53% |
| 3rd | 5 | 11.19% | 4.27% | 8.29% | 1.93% |
| 3rd | 20 | 7.61% | 4.67% | 2.64% | 2.24% |
| 4th | 2 | 15.20% | 6.75% | 12.15% | 4.37% |
| 4th | 5 | 20.04% | 9.40% | 17.02% | 7.05% |
| 4th | 20 | 17.81% | 8.47% | 14.69% | 6.22% |
| 5th (High) | 2 | 30.80% | 11.24% | 27.24% | 8.81% |
| 5th (High) | 5 | 30.95% | 11.75% | 27.66% | 9.26% |
| 5th (High) | 20 | 37.23% | 12.91% | 32.52% | 10.56% |

Panel B: 5-day horizon

| True forecast rank (quintiles) | Sample size T (years) | Quantile 0.01 forecast error (Daily only) | Quantile 0.01 forecast error (HF 5-min) | Quantile 0.05 forecast error (Daily only) | Quantile 0.05 forecast error (HF 5-min) |
|---|---|---|---|---|---|
| 1st (Low) | 2 | -4.58% | -1.64% | -7.91% | -3.30% |
| 1st (Low) | 5 | -5.69% | -3.58% | -9.00% | -5.33% |
| 1st (Low) | 20 | -4.26% | -1.93% | -7.57% | -3.61% |
| 2nd | 2 | -0.57% | -0.61% | -4.01% | -2.35% |
| 2nd | 5 | 1.38% | 0.87% | -1.24% | -0.35% |
| 2nd | 20 | -0.48% | -0.88% | -3.98% | -2.66% |
| 3rd | 2 | 6.88% | 3.87% | 5.33% | 3.23% |
| 3rd | 5 | 9.31% | 3.23% | 8.51% | 2.40% |
| 3rd | 20 | 5.97% | 3.29% | 3.71% | 2.71% |
| 4th | 2 | 11.87% | 5.25% | 12.19% | 4.98% |
| 4th | 5 | 15.53% | 6.58% | 17.36% | 7.36% |
| 4th | 20 | 13.82% | 6.36% | 14.73% | 6.80% |
| 5th (High) | 2 | 19.70% | 6.51% | 28.48% | 8.99% |
| 5th (High) | 5 | 20.28% | 7.60% | 29.71% | 9.75% |
| 5th (High) | 20 | 22.04% | 7.29% | 35.70% | 11.92% |

Panel C: 10-day horizon

| True forecast rank (quintiles) | Sample size T (years) | Quantile 0.01 forecast error (Daily only) | Quantile 0.01 forecast error (HF 5-min) | Quantile 0.05 forecast error (Daily only) | Quantile 0.05 forecast error (HF 5-min) |
|---|---|---|---|---|---|
| 1st (Low) | 2 | -2.96% | -0.56% | -6.45% | -2.02% |
| 1st (Low) | 5 | -4.74% | -2.81% | -8.37% | -4.51% |
| 1st (Low) | 20 | -3.95% | -1.79% | -7.42% | -3.50% |
| 2nd | 2 | 0.84% | 0.13% | -3.01% | -1.66% |
| 2nd | 5 | 1.82% | 1.09% | -0.68% | 0.12% |
| 2nd | 20 | -0.22% | -0.57% | -3.76% | -2.53% |
| 3rd | 2 | 6.09% | 3.46% | 5.56% | 3.56% |
| 3rd | 5 | 7.42% | 2.71% | 8.40% | 2.49% |
| 3rd | 20 | 4.68% | 2.52% | 3.56% | 2.77% |
| 4th | 2 | 8.17% | 3.47% | 11.88% | 5.05% |
| 4th | 5 | 9.53% | 3.85% | 16.39% | 6.53% |
| 4th | 20 | 6.75% | 2.87% | 14.47% | 6.74% |
| 5th (High) | 2 | 6.55% | 1.76% | 27.54% | 8.12% |
| 5th (High) | 5 | 5.51% | 1.45% | 29.46% | 9.63% |
| 5th (High) | 20 | 5.45% | 1.38% | 35.05% | 11.81% |

Panel D: 20-day horizon

| True forecast rank (quintiles) | Sample size T (years) | Quantile 0.01 forecast error (Daily only) | Quantile 0.01 forecast error (HF 5-min) | Quantile 0.05 forecast error (Daily only) | Quantile 0.05 forecast error (HF 5-min) |
|---|---|---|---|---|---|
| 1st (Low) | 2 | -0.46% | 1.66% | -3.80% | 0.39% |
| 1st (Low) | 5 | -3.70% | -1.61% | -7.22% | -3.08% |
| 1st (Low) | 20 | -3.99% | -1.76% | -7.19% | -3.28% |
| 2nd | 2 | 2.16% | 0.97% | -1.30% | -0.45% |
| 2nd | 5 | 2.06% | 1.35% | 0.20% | 0.87% |
| 2nd | 20 | -0.46% | -0.57% | -3.52% | -2.26% |
| 3rd | 2 | 5.82% | 3.51% | 5.79% | 4.00% |
| 3rd | 5 | 6.38% | 2.37% | 7.95% | 2.56% |
| 3rd | 20 | 3.46% | 2.03% | 3.06% | 2.41% |
| 4th | 2 | 6.58% | 2.80% | 10.86% | 4.78% |
| 4th | 5 | 7.37% | 2.75% | 14.84% | 6.01% |
| 4th | 20 | 5.21% | 2.28% | 13.27% | 6.41% |
| 5th (High) | 2 | 5.33% | 1.12% | 23.39% | 5.83% |
| 5th (High) | 5 | 4.59% | 1.12% | 25.11% | 7.22% |
| 5th (High) | 20 | 4.52% | 1.20% | 29.98% | 10.18% |
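The rank-sorted relative errors described for Table 8 can be sketched as follows. The function name and the use of five equal-sized groups formed by sorting on the true forecast are assumptions made for illustration; only the definition of the error, as the mean percentage difference between the estimate-based and true-value-based forecasts, comes from the caption.

```python
import numpy as np

def relative_error_by_quintile(estimated, true):
    """Mean percentage error of quantile forecasts within quintiles of the true forecast.

    estimated[i]: forecast based on parameter and state estimates in replication i.
    true[i]: forecast based on the corresponding true values.
    Returns five means ordered from the lowest (bad times) to the highest true forecast.
    """
    estimated = np.asarray(estimated, dtype=float)
    true = np.asarray(true, dtype=float)
    rel_err = 100.0 * (estimated - true) / true
    order = np.argsort(true)                 # most negative true quantile first
    groups = np.array_split(order, 5)        # five (near-)equal rank groups
    return np.array([rel_err[g].mean() for g in groups])

# Made-up example: the estimate overstates the magnitude of every quantile by 10%.
true_q = -np.linspace(0.02, 0.10, 50)
errors = relative_error_by_quintile(1.1 * true_q, true_q)
```

Because lower-tail quantiles are negative, a positive relative error here means the estimated quantile is larger in magnitude than the true one, which corresponds to overestimating risk, and a negative value corresponds to underestimating it.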

Figure 1: Relative error plots for five-day VaR forecasts against the rank order of the underlying true forecasts for a two-factor log-SV model with leverage effects and jumps. We plot one-percent VaR (top graph) and five-percent VaR (bottom graph) relative forecast errors at a five-day horizon as a function of the rank order of the underlying true forecasts from low (representing bad times) to high (representing good times). The errors are calculated as the mean of the percentage difference between a forecast based on parameter and state estimates and the forecast based on the corresponding true values across 1,000 Monte Carlo replications. The model is estimated at a daily discretization interval by Bayesian MCMC methods either without or with augmenting the underlying state-space formulation with a daily volatility measurement equation based on high frequency intraday data, as proposed in this paper. The resulting VaR forecast errors without utilizing high-frequency volatility measures are plotted as a solid line (denoted “Daily”), while those incorporating the information content of intraday data for the latent daily volatility are plotted as a dashed line (denoted “HF 5-min”).

[Plots not reproduced. Panels: One-percent VaR (top), Five-percent VaR (bottom). Horizontal axis: True forecast rank (quintiles), from 1st (Low) to 5th (High). Vertical axis: Relative forecast error (%), from -10% to 40%. Series: Daily, HF 5-min.]

Figure 2: One-percent and five-percent VaR forecasts for S&P 500 returns during the financial crisis of 2008. We plot one-percent (top graph) and five-percent (bottom graph) VaR forecasts at five-day horizon without overlapping for S&P 500 futures returns based on a two-factor log-SV model with jumps in returns. The model is estimated at a daily discretization interval by Bayesian MCMC methods either without or with augmenting the underlying state-space formulation with a daily volatility measurement equation based on high frequency intraday data, as proposed in this paper. The resulting VaR forecasts without utilizing high-frequency volatility measures are plotted as a solid line (denoted “VaR with daily data”), those incorporating the information content of intraday data for the latent daily volatility are plotted as a dashed line (denoted “VaR with HF 5-min data”), while the corresponding actual observed returns are plotted as vertical bars (denoted “Return realizations”). The VaR analysis is for the period July 6, 2006 - February 19, 2009 and involves re-estimating the model on each date with all past data going back to October 2, 1985.

[Plots not reproduced. Panels: “S&P500: One-Percent VaR Forecasts vs Return Realizations” (top), “S&P500: Five-Percent VaR Forecasts vs Return Realizations” (bottom). Horizontal axis: dates over the analysis period. Vertical axis: Returns and Risk, from -40% to 10%. Series: Return Realizations, VaR with Daily Data, VaR with HF 5-min Data. Annotated events: Bear Stearns Turmoil, Countrywide Turmoil, Fannie Mae / Freddie Mac Turmoil, TARP Legislation Turmoil.]

Figure 3: One-percent and five-percent VaR forecasts for Google returns during the financial crisis of 2008. We plot one-percent (top graph) and five-percent (bottom graph) VaR forecasts at five-day horizon without overlapping for Google returns based on a two-factor log-SV model with jumps in returns. The model is estimated at a daily discretization interval by Bayesian MCMC methods either without or with augmenting the underlying state-space formulation with a daily volatility measurement equation based on high frequency intraday data, as proposed in this paper. The resulting VaR forecasts without utilizing high-frequency volatility measures are plotted as a solid line (denoted “VaR with daily data”), those incorporating the information content of intraday data for the latent daily volatility are plotted as a dashed line (denoted “VaR with HF 5-min data”), while the corresponding actual observed returns are plotted as vertical bars (denoted “Return realizations”). The VaR analysis is for the period December 8, 2006 - July 24, 2009 and involves re-estimating the model on each date with all past data going back to August 30, 2004 (ten days after Google’s IPO).

[Plots not reproduced. Panels: “Google: One-Percent VaR Forecasts vs Return Realizations” (top), “Google: Five-Percent VaR Forecasts vs Return Realizations” (bottom). Horizontal axis: dates over the analysis period. Vertical axis: Returns and Risk, from -40% to 10%. Series: Return Realizations, VaR with Daily Data, VaR with HF 5-min Data. Annotated events: Bear Stearns Turmoil, Countrywide Turmoil, Fannie Mae / Freddie Mac Turmoil, TARP Legislation Turmoil.]

Cite this document

APA

Dobrev, D., & Szerszen, P. (2010). The Information Content of High-Frequency Data for Estimating Equity Return Models and Forecasting Risk (IFDP 2010-1005). Board of Governors of the Federal Reserve System, International Finance Discussion Papers. https://whenthefedspeaks.com/doc/ifdp_2010-1005

BibTeX

@techreport{wtfs_ifdp_2010_1005,
  author = {Dobrislav Dobrev and Pawel Szerszen},
  title = {The Information Content of High-Frequency Data for Estimating Equity Return Models and Forecasting Risk},
  type = {International Finance Discussion Papers},
  number = {2010-1005},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2010},
  url = {https://whenthefedspeaks.com/doc/ifdp_2010-1005},
  abstract = {We demonstrate that the parameters controlling skewness and kurtosis in popular equity return models estimated at daily frequency can be obtained almost as precisely as if volatility is observable by simply incorporating the strong information content of realized volatility measures extracted from high-frequency data. For this purpose, we introduce asymptotically exact volatility measurement equations in state space form and propose a Bayesian estimation approach. Our highly efficient estimates lead in turn to substantial gains for forecasting various risk measures at horizons ranging from a few days to a few months ahead when taking also into account parameter uncertainty. As a practical rule of thumb, we find that two years of high frequency data often suffice to obtain the same level of precision as twenty years of daily data, thereby making our approach particularly useful in finance applications where only short data samples are available or economically meaningful to use. Moreover, we find that compared to model inference without high-frequency data, our approach largely eliminates underestimation of risk during bad times or overestimation of risk during good times. We assess the attainable improvements in VaR forecast accuracy on simulated data and provide an empirical illustration on stock returns during the financial crisis of 2007-2008.},
}