feds · May 2, 2019

When Simplicity Offers a Benefit, Not a Cost: Closed-Form Estimation of the GARCH(1,1) Model that Enhances the Efficiency of Quasi-Maximum Likelihood

Abstract

Simple, multi-step estimators are developed for the popular GARCH(1,1) model, where these estimators are either available entirely in closed form or dependent upon a preliminary estimate from, for example, quasi-maximum likelihood. Identification sources to asymmetry in the model's innovations, casting skewness as an instrument in a linear, two-stage least squares estimator. Properties of regular variation coupled with point process theory establish the distributional limits of these estimators as stable, though highly non-Gaussian, with slow convergence rates relative to the $\sqrt{n}$-case. Moment existence criteria necessary for these results are consistent with the heavy-tailed features of many financial returns. In light-tailed cases that support asymptotic normality for these simple estimators, conditions are discovered where the simple estimators can enhance the asymptotic efficiency of quasi-maximum likelihood estimation. In small samples, extensive Monte Carlo experiments reveal these efficiency enhancements to be available for (very) heavy-tailed cases. Consequently, the proposed simple estimators are members of the class of multi-step estimators aimed at improving the efficiency of the quasi-maximum likelihood estimator.

Finance and Economics Discussion Series
Divisions of Research & Statistics and Monetary Affairs
Federal Reserve Board, Washington, D.C.

Todd Prono

2019-030

Please cite this paper as: Prono, Todd (2019). "When Simplicity Offers a Benefit, Not a Cost: Closed-Form Estimation of the GARCH(1,1) Model that Enhances the Efficiency of Quasi-Maximum Likelihood," Finance and Economics Discussion Series 2019-030. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2019.030.

NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Todd Prono, Federal Reserve Board. (202) 973-6955, todd.a.prono@frb.gov.
This Version: February 2019

Keywords: GARCH models, closed form estimation, heavy tails, instrumental variables, regular variation.
JEL codes: C13, C22, C58.

The views expressed in this paper are those of the author and do not necessarily reflect those of the Federal Reserve Board.

1.1. Introduction

The linear GARCH(1,1) model of Bollerslev (1986) is a workhorse of conditional volatility forecasting in financial economics, its applications spanning portfolio formation, derivative pricing, and risk management. Despite its parsimony, this model is shown to outperform (in terms of out-of-sample forecasting) more complicated alternative specifications (see, e.g., Hansen and Lunde, 2005). The most common estimator for this model is quasi-maximum likelihood (QML), which is based on a Gaussian likelihood function. Pioneering works by Lee and Hansen (1994) and Lumsdaine (1996) establish the QMLE as consistent and asymptotically normal under a variety of (unknown) densities for the model's innovations. Berkes, Horváth, and Kokoszka (2003) and Francq and Zakoïan (2004) extend this result to the GARCH(p,q) model under milder conditions, including a well-defined fourth moment of the model's innovations. Hall and Yao (2003) establish the distributional limit of the QMLE in cases when the fourth moment of these innovations is ill-defined.

The first aim of this paper is to temporarily part ways with QMLE to propose simple, moment-based alternatives for GARCH(1,1) model estimation. The definition of a simple estimator heralds from Lewbel (2004) and is subsequently applied in Dong and Lewbel (2015).

DEFINITION. A simple estimator closely resembles (or consists of steps that each resemble) estimators that are already in common use and involves few or no numerical searches or numerical optimizations.

Consistent with this definition, the estimators developed herein are available in closed form and, therefore, comparable to those proposed by Kristensen and Linton (2006), although under milder conditions. Collectively, these simple estimators are instrumental variables (IV) estimators that apply separately to the model's ARCH and GARCH parameters and are implemented via applications of linear, two-stage least squares. These simple estimators are shown to be strongly consistent and to weakly converge to stable, though highly non-Gaussian, limits in empirically-relevant cases.
Specifically, these results require (slightly stronger than) third moment existence for the raw return sequence being modeled and a well-defined $i$th moment of the GARCH innovations, where $i \in (3, 6)$. Convergence rates for these simple estimators tend to be (much) slower than $\sqrt{n}$ and depend on the tail-thickness of the raw returns being modeled.

Simple estimators tend to be associated with inefficient estimators. Indeed, relative to the case of maximum likelihood estimation (MLE), which relies upon knowledge of the true innovation density, this association is (many times) justified. QMLE, on the other hand, while still consistent, can also be considerably inefficient in cases where the true (and unknown) innovation density deviates from normality. While one possible fix for this inefficiency loss is to specify a heavier-tailed density for the model's innovations, like the Student-t (see, e.g., Baillie and Bollerslev, 1989), consistency is lost if the true innovation density happens to reside outside of the Student-t family. Consequently, a literature on GARCH estimation has emerged aimed at defining multi-step estimators that improve upon the QMLE, as implemented in a preliminary first step, but also maintain robustness in terms of consistency (see, e.g., Drost and Klaassen, 1997, Francq and Zakoïan, 2011, Fan, Qi and Xiu, 2014, and Preminger and Storti, 2017).³ Collectively, by better targeting the scale of the true (and unknown) innovation density, these estimators offer efficiency enhancements over QMLE.

Identification of the simple estimators proposed herein relies on non-zero skewness in the raw returns being modeled. Essentially, skewness is the instrument upon which these estimators are based. In a linear GARCH context, skewness in the raw returns necessarily sources to skewness in the true (and unknown) innovation density. If that skewness represents a prominent feature of the innovation density, explicitly targeting it may very well provide efficiency gains, just as (better) targeting scale does. Additionally, the simple estimators proposed herein are also multi-step estimators reliant upon preliminary estimates from a first step.

The second aim of this paper, then, is to investigate the advantages of sourcing the requisite preliminary estimates for the proposed simple estimators to QMLE. From that investigation, it is found that for raw return processes characterized by no more than a well-defined third moment, the proposed simple estimators are asymptotically more efficient with QMLE-based preliminary estimates than closed-form, moments-based alternatives with slower convergence rates.
In thin-tailed cases that support asymptotic normality for the simple estimators, conditions are found under which the simple estimators are actually asymptotically more efficient than QMLE.⁴ Lastly (and, perhaps, most surprisingly) it is also found that in small samples, the simple estimators can enhance the efficiency of QMLE even when the model's innovations are (very) heavy tailed. Explaining this enhancement in heavy-tailed cases is the same factor at work under asymptotic normality; namely, skewness in the model's innovations. Consequently, the simple estimators proposed herein are, in fact, comparable to the aforementioned class of multi-step estimators aimed at enhancing the efficiency of QMLE, with the added benefit of not requiring any numerical optimization in their final step.

³ Specifically, Drost and Klaassen (1997) investigate the possibility for adaptive GARCH estimation; that is, a semiparametric estimator matching the efficiency of MLE. Recognizing adaptive estimation to be, generally, infeasible, the more recent cited works look to improve upon the efficiency of QMLE, thereby narrowing (but not eliminating) the efficiency loss relative to MLE.
⁴ Monte Carlo studies find that these conditions can be satisfied for (at least) certain regions of the parameter space for the GARCH(1,1) model.

1.2. Background and Motivation

For the linear GARCH(1,1) model of

$$Y_t = \sigma_t \epsilon_t, \qquad \sigma_t^2 = \omega + \alpha Y_{t-1}^2 + \beta \sigma_{t-1}^2,$$

where $\epsilon_t \sim \text{i.i.d. } D(0, 1)$ and $D$ is unknown, it is well known that

$$Y_t^2 = \omega + \phi Y_{t-1}^2 - \beta W_{t-1} + W_t, \qquad \phi = \alpha + \beta, \qquad W_t = \sigma_t^2\left(\epsilon_t^2 - 1\right), \tag{1}$$

where $\{W_t\}$ is a martingale difference sequence (MDS); that is, the GARCH(1,1) model implies an ARMA(1,1) model for the second-order sequence $\{Y_t^2\}$. When thinking about simple estimators for this second-order ARMA(1,1) model, two sets of possible instruments spring to mind:

$$Z_{t-1}^{(i)} = \left(Y_{t-1}^i, \ldots, Y_{t-h}^i\right), \qquad i = 1, 2.$$

The case where $i = 2$ covers the estimators proposed by Kristensen and Linton (2006) and Giraitis and Robinson (2000). This paper investigates the (up until this point) overlooked case of $i = 1$. For $Z_{t-1}^{(1)}$ to serve as a valid instrument for $Y_{t-1}^2$, both $E(Y_t^3) < \infty$ and $E(Y_t^3) \neq 0$ are required.

STYLIZED FACT. Many financial returns seem to be characterized by heavy-tailed processes for which the fourth moment is not well-defined... (see Figure 1 and, e.g., Hill and Renault, 2012).

Plotted in Panel A of Figure 1 are Hill (1975) tail index estimates together with 95% confidence bands (the latter coming from Hill, 2010, Theorem 4) for daily S&P 500 Index log returns. Recalling that a tail index $\kappa > 0$ for a regularly varying random variable is a moment supremum (i.e., if $Y_t$ is regularly varying, then $E|Y_t|^p < \infty$ if and only if $p < \kappa$), empirical evidence does not (strongly) support well-defined fourth moments for these returns. In fact, in many instances, even the upper bounds of the 95% confidence intervals fall below 4. This lack of support for $E(Y_t^4) < \infty$ is problematic if $Z_{t-1}^{(2)}$ is to serve as instruments for (1), since identification and, hence, consistency hinge upon this criterion.

STYLIZED FACT. "There is now good evidence that on short time scales, and using long time series, the tail index for stocks is around 3 on several markets (U.S., Japan, Germany)"... Bouchaud and Potters (2003)

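The tail index estimates referred to above can be illustrated with a minimal sketch of the Hill (1975) estimator, which averages the log-excesses of the $k$ largest absolute observations over the next-largest one. The function name `hill_tail_index`, the choice of $k$, and the Student-t test data are assumptions made for illustration only; the confidence bands of Hill (2010) are not reproduced here.

```python
import numpy as np

def hill_tail_index(x, k):
    """Hill (1975) estimate of the tail index from the k largest |x| values."""
    a = np.sort(np.abs(np.asarray(x, dtype=float)))[::-1]  # descending order statistics
    if not 0 < k < a.size:
        raise ValueError("k must satisfy 0 < k < len(x)")
    if a[k] <= 0:
        raise ValueError("the (k+1)-th largest |x| must be positive")
    # Mean log-excess of the k largest observations over the (k+1)-th largest.
    gamma = np.mean(np.log(a[:k] / a[k]))
    return 1.0 / gamma

rng = np.random.default_rng(0)
# Illustrative data only: a Student-t with 3 degrees of freedom has tail index 3.
sample = rng.standard_t(df=3, size=100_000)
kappa_hat = hill_tail_index(sample, k=1_000)
print(f"Hill estimate with k=1000: {kappa_hat:.2f}")
```

In practice the estimate is sensitive to $k$, which is why such estimates are usually plotted over a range of $k$ rather than reported at a single value.
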
Panel A of Figure 1 is much more supportive of the claim that $E(Y_t^3) < \infty$. In addition, the skewness statistic for the returns is $-0.26$, which is highly significant against a null of normality, given the sample size.⁵ Table 1 illustrates additional instances where (very high frequency) financial returns evidence very significant skewness statistics that are also quite large in absolute terms. Collectively then, empirical evidence seems to support $Z_{t-1}^{(1)}$ as a viable set of instruments for (1).

The empirical evidence from the previous paragraph also illustrates the impracticality of simple estimators for (1) based even on $Z_{t-1}^{(1)}$ being asymptotically normal: the well-defined, higher moments necessary for such a result simply aren't supported empirically. Fortunately, these simple estimators can be shown to weakly converge in distribution to a heavy-tailed mixture of stable random variables using results from Davis and Hsing (1995) that are also applied in, for example, Davis and Mikosch (1998) and Mikosch and Stărică (2000).⁶ Applicability of these results depends on $\{Y_t\}$ being regularly varying in the case where $\epsilon_t$ is drawn from a skewed distribution. In addition, as mentioned in the introduction, a second requirement is for $E|\epsilon_t|^i < \infty$, where $i \in (3, 6)$. From Panel B of Figure 1 (which depicts tail index estimates for the innovations to a GARCH(1,1) model applied to daily S&P 500 Index log returns), this second requirement also enjoys empirical support.⁷

2.1. Simple Estimation of the GARCH(1,1) Model

Consider the model

$$Y_t = \sigma_t \epsilon_t, \qquad \epsilon_t \sim \text{i.i.d. } D(0, 1), \tag{2}$$

where

$$\begin{aligned} \sigma_t^2 &= \omega_0 + \alpha_0 Y_{t-1}^2 + \beta_0 \sigma_{t-1}^2 \\ &= \omega_0 + \sigma_{t-1}^2\left(\alpha_0 \epsilon_{t-1}^2 + \beta_0\right) \\ &= \omega_0 + \sigma_{t-1}^2 A_t. \end{aligned} \tag{3}$$

Here, $\omega_0$ denotes the true value, $\omega$ any one of a set of possible values, $\hat{\omega}$ an estimate, and parallel definitions hold for all other parameter values.
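
The recursion in (2) and (3) can be sketched as a short simulation. Everything specific here is an assumption made for illustration: the function name `simulate_garch11`, the parameter values, the burn-in length, and the choice of a negated, standardized gamma draw for the skewed innovation density $D$ (mean 0, variance 1, skewness $-2/\sqrt{\text{shape}}$). The source does not prescribe any of these.

```python
import numpy as np

def simulate_garch11(n, omega, alpha, beta, shape=4.0, burn=1_000, seed=0):
    """Simulate Y_t = sigma_t * eps_t with sigma_t^2 = omega + alpha*Y_{t-1}^2 + beta*sigma_{t-1}^2."""
    if alpha + beta >= 1.0:
        raise ValueError("this sketch starts at the unconditional variance and needs alpha + beta < 1")
    rng = np.random.default_rng(seed)
    total = n + burn
    # Assumed innovation density: negated, standardized gamma (left-skewed, unit variance).
    eps = -(rng.gamma(shape, 1.0, size=total) - shape) / np.sqrt(shape)
    y = np.empty(total)
    sigma2 = omega / (1.0 - alpha - beta)  # unconditional variance as the starting value
    for t in range(total):
        y[t] = sigma2 ** 0.5 * eps[t]
        sigma2 = omega + alpha * y[t] ** 2 + beta * sigma2  # equation (3)
    return y[burn:]  # drop the burn-in so the start-up value has little influence

y = simulate_garch11(50_000, omega=0.3, alpha=0.1, beta=0.6, seed=1)
print("sample mean of Y^2:", np.mean(y ** 2))
print("sample mean of Y^3:", np.mean(y ** 3))
```

With these hypothetical parameter values, the sample mean of $Y_t^2$ should sit near $\omega/(1 - \alpha - \beta) = 1$, and the sample third moment should be negative, reflecting the left-skewed innovations.
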
The model of (2) and (3) describes a strong GARCH process (see Drost and Nijman, 1993). Consistency of the simple estimators studied in this paper holds if $\{\epsilon_t\}$ is, instead, a weakly dependent MDS (see, e.g., Prono, 2014). The distributional limits for these simple estimators, however, require $\{\epsilon_t\}$ to be i.i.d. (Mikosch and Straumann, 2002, 2006, and Vaynman and Beare, 2014, impose this same requirement).

⁵ In fact, the skewness statistic for daily S&P 500 Index log returns can be as high as $-1.02$, depending on the length of the data sample used.
⁶ This mixture of stable random variables has an ill-defined variance.
⁷ Specifically, the linear GARCH(1,1) model applied also includes an AR(1) component in the conditional mean.

Mikosch and Stărică (2000) establish (3) as a stochastic recurrence equation (SRE). Most linear GARCH processes are afforded this characterization (see, e.g., Basrak, Davis and Mikosch, 2002), which is important for establishing them as regularly varying. For instance, conditional on (3) being an SRE, both $\{\sigma_t^2\}$ and $\{Y_t\}$ are regularly varying sequences (see Lemmas 3 and 5, respectively, in the Supplemental Appendix). Specifically, for $0 \leq h < \infty$, consider $\mathbf{Y}_t = (Y_t, \ldots, Y_{t+h})$, or $\mathbf{Y} = \mathbf{Y}_0 = (Y_0, \ldots, Y_h)$ for short. $\mathbf{Y}$ is regularly varying in $\mathbb{R}^{h+1}$ with tail index $\kappa_0$, meaning there exists a sequence of constants $\{a_n\}$ such that

$$n P\left(|\mathbf{Y}| > a_n\right) \longrightarrow 1, \qquad n \to \infty,$$

where $|\mathbf{Y}| = \max_{m = 0, \ldots, h} |Y_m|$,

$$a_n = n^{1/\kappa_0} L(n),$$

and $L(\cdot)$ is slowly varying at $\infty$. Mikosch and Stărică (2000, Theorem 2.3) demonstrate $\mathbf{Y}$ to be regularly varying in the case where $D$ is symmetric. Lemma 5 in the Supplemental Appendix is a more general result that establishes $\mathbf{Y}$ as regularly varying regardless of whether $D$ is symmetric or skewed by combining certain elements from the proofs of Mikosch and Stărică (2000, Theorem 2.3) and Basrak et al. (2002, Corollary 3.5(B)), respectively (see Remark 6 in the Supplemental Appendix). This generalization is important because a necessary condition for identifying the simple estimators in this paper is $E(\epsilon_t^3) \neq 0$, and given (2) and (3), this condition implies $E(Y_t^3) \neq 0$.

ASSUMPTION A1: The distribution $D$ has an unbounded support. In addition, for some $\delta > 0$, $E|\epsilon_t|^{i+\delta} < \infty$, where $3 \leq i < h < \infty$, while $E|\epsilon_t|^h = \infty$, and for $j \leq i$, $E|\epsilon_t|^j = c_j$.

Under A1, $\{\epsilon_t\}$ is lighter-tailed than $\{\sigma_t\}$. This distinction is important because it limits the heavy-tailed features of $\{Y_t\}$ to stem from $\{\sigma_t\}$, which, in turn, enables $\{Y_t\}$ to be established as regularly varying.

ASSUMPTION A2: The parameter space is given by

$$\Theta = \left\{\theta = (\omega, \alpha, \beta)' \in \mathbb{R}^3 \;\middle|\; \omega \geq \underline{\omega},\ \alpha > 0,\ \beta \geq 0\right\},$$

for some $\underline{\omega} > 0$.

The strictly positive lower bound on $\omega$ heralds from Kristensen and Rahbek (2005). Notice as well that $\Theta$ is non-compact.

ASSUMPTION A3: $E(\epsilon_t^3) = c_{*3} \neq 0$.

A3 passes skewness onto the unconditional distribution of $Y_t$. The direction of skewness is unconstrained.⁸ Skewness in (high frequency) returns is considered a stylized fact. This fact is exogenous to the model under consideration, yet (as will be shown) can be harnessed to identify the model. Examples where an asymmetric $D$ is used to account for skewness in returns include Hansen (1994), Harvey and Siddique (1999), and Jondeau and Rockinger (2003).

⁸ For equity returns, as an example, skewness can be of either sign for single names and tends to be negative for portfolios.

ASSUMPTION A4: $E(A^{3/2}) < 1$.

A4 is sufficient for $\{Y_t\}$ to have a strictly stationary solution (see, e.g., Mikosch, 1999, Corollary 1.4.38 and Remark 1.4.39). Throughout this and the remaining sections, assume that this strictly stationary solution is the one being observed.

From (2) and (3) follows that

$$Y_t^2 = \sigma_t^2 + W_t, \tag{4}$$

where $\{W_t\}$ is an MDS. Let $X_t \equiv Y_t^2 - \gamma_0$, where

$$\gamma_0 \equiv E(Y_t^2) = \frac{\omega_0}{1 - \phi_0}, \qquad \phi_0 = \alpha_0 + \beta_0,$$

and $\phi_0 < 1$, given A4. Then from (4) follows that

$$\begin{aligned} X_t &= \phi_0 X_{t-1} - \beta_0 W_{t-1} + W_t \\ &= \phi_0 X_{t-1} + V_t, \end{aligned} \tag{5}$$

which relates the GARCH(1,1) model to an ARMA(1,1) model of the (centered) second-order sequence $\{Y_t^2\}$. Also given A4, $E(Y_t^3) = E(\sigma_t^3) \times c_{*3}$ is well defined (see Prono, 2018, Lemma 1). Consequently, given the law of iterated expectations, multiplying both sides of (5) by $Y_{t-m}$ for any $m \geq 1$ and taking expectations produces

$$E(X_t Y_{t-m}) = \alpha_0 \phi_0^{m-1} E(Y_t^3),$$

in which case,

$$E(X_t Y_{t-1}) = \alpha_0 E(Y_t^3), \tag{6}$$

and

$$E(X_t Y_{t-m}) = \phi_0 E(X_t Y_{t-m+1}), \qquad m \geq 2. \tag{7}$$

From (6), an exactly identified estimator for $\alpha_0$ in (5) is

$$\hat{\alpha}_{IV} = \hat{F}\left(n^{-1}\sum_t \hat{X}_t Y_{t-1}\right), \qquad \hat{F} = \left(n^{-1}\sum_t \hat{X}_{t-1} Y_{t-1}\right)^{-1}, \tag{8}$$

where

$$\hat{X}_t = Y_t^2 - \hat{\gamma}, \qquad \hat{\gamma} = n^{-1}\sum_t Y_t^2. \tag{9}$$

Notice that (8) is a linear TSLS estimator applied to the feasible version of (5) (see (25) in the Appendix) using $Y_{t-1}$ as an instrument for $\hat{X}_{t-1}$. Notice as well that $Y_{t-1}$ is not a proper instrument, since

$$E(W_{t-1} Y_{t-1}) = E(Y_{t-1}^3) \neq 0.$$

Nonetheless, $Y_{t-1}$ is sufficient for identifying $\alpha_0$ from (5), as the following theorem demonstrates.

Theorem 1. Consider the estimator in (8). Let $F_0 = E(X_{t-1} Y_{t-1})^{-1}$, and let Assumptions A1–A4 hold. Then

$$\hat{\alpha}_{IV} \xrightarrow{a.s.} \alpha_0,$$

and

$$n a_n^{-3}\left(\hat{\alpha}_{IV} - \alpha_0\right) \xrightarrow{d} \alpha_0^{-1} F_0 \left(V_{2,y} - \beta_0 V_{1,y}\right), \tag{10}$$

where $\kappa_0 \in (3, 6)$, "$\xrightarrow{d}$" denotes weak convergence, and the limiting random variables $(V_{i,y})_{i=1,2}$ defined in Lemma 11 are jointly $(\kappa_0/3)$-stable. If $\kappa_0 \in (6, \infty)$, in which case $E(Y_t^6) < \infty$, then

$$\sqrt{n}\left(\hat{\alpha}_{IV} - \alpha_0\right) \xrightarrow{d} N\left(0,\ E(Y_t^3)^{-2}\, \Sigma_{VY_{-1}}\right), \tag{11}$$

where

$$\Sigma_{VY_{-1}} = E\left((V_t Y_{t-1})^2\right) + 2\sum_{s=1}^{1} E\left(V_t Y_{t-1} V_{t-s} Y_{t-1-s}\right).$$

Proof. See the Appendix for proofs of all theorems and corollaries stated here in the main text. See the Supplemental Appendix for statements and proofs of all supporting lemmas as well as additional theorems and corollaries.

Remark 2. Asymptotically, $\hat{\gamma}$ does not affect the limiting distribution of $\hat{\alpha}_{IV}$. Also, consistency of $\hat{\alpha}_{IV}$ does not require consistency of $\hat{\gamma}$ (see (26) in the Appendix). In thin-tailed cases where $E(Y_t^6) < \infty$ (which is equivalent to $E(A^3) < 1$), there is an inverse relationship between the required asymmetry in the distribution of $Y_t$ and the asymptotic variance of $\hat{\alpha}_{IV}$. Specifically, as $|E(Y_t^3)| \to 0$, the asymptotic variance of $\hat{\alpha}_{IV}$ increases without bound. The limiting case where $E(Y_t^3) = 0$ corresponds to the case where this asymptotic variance is ill-defined, rendering $\hat{\alpha}_{IV}$ unidentified. Analogously, in heavy-tailed cases where $\kappa_0 \in (3, 6)$ and, consequently, $E(Y_t^6) = \infty$, the stable limit is ill-defined when $E(Y_t^3) = 0$. In addition, away from symmetric innovations, the rate of convergence in (10) is $n^{(\kappa_0 - 3)/\kappa_0}$, which is quite a bit slower than $\sqrt{n}$ in (11), especially for empirically-relevant values of $\kappa_0$.

Remark 3. The distributional limit in (10) depends on $i \in (3, 6)$ in A1. This requirement is consistent both with existing limit theory for alternative GARCH estimators like the QMLE and with the empirical features of many GARCH processes (see, e.g., Figure 1 as well as Hill and Renault, 2012). In contrast, a requirement more analogous to one employed in related works (see, e.g., Mikosch and Stărică, 2000, and Kristensen and Linton, 2006) would be for $i \geq 6$, which is both much stronger and not as well supported by empirical evidence.⁹

Remark 4. In the special case where $\beta_0 = 0$ (i.e., the ARCH(1) case), the distributional limits in (10) and (11) reduce to those in Prono (2018a, Theorem 1) with a single, lagged instrument.

Owing to its dependence on $\kappa_0$, the exact convergence rate in (10) is unknown.
This feature complicates bootstrapping a confidence interval for $\hat{\alpha}_{IV}$.¹⁰ Consider, then, the estimator $\hat{\tau}_n^2 = n^{-1}\sum_t Y_t^6$. Given this estimator,

$$n a_n^{-6}\, \hat{\tau}_n^2 = a_n^{-6}\sum_t Y_t^6 \xrightarrow{d} V_{0,y}^{(2)},$$

where $V_{0,y}^{(2)}$ is $(\kappa_0/6)$-stable following the method of proof given for Davis and Hsing (1995, Theorem 3.1(i)). Since $\left(V_{2,y} - \beta_0 V_{1,y},\ V_{0,y}^{(2)}\right)$ is jointly stable following the arguments given for Vaynman and Beare (2014, Theorem 4),

$$\sqrt{n}\left(\frac{\hat{\alpha}_{IV} - \alpha_0}{\hat{\tau}_n}\right) \xrightarrow{d} \alpha_0^{-1} F_0 \left(\frac{V_{2,y} - \beta_0 V_{1,y}}{\left(V_{0,y}^{(2)}\right)^{1/2}}\right),$$

by the continuous mapping theorem. Advantages of this normalization are twofold. First, the bootstrap method of Hall and Yao (2003) can now be applied.¹¹ Second, a bridge is provided for understanding the transition from (10) to (standard) asymptotic normality in (11). Specifically, if $\kappa_0 \in (6, \infty)$, the limit of $\hat{\tau}_n$ is degenerate, and the linear combination of $V_{1,y}$ and $V_{2,y}$ is Gaussian.

⁹ In each of these two referenced cases, second-order autocovariances are considered; i.e., $E(X_t X_{t-m})$ for $m \geq 1$, in which case, the analogous condition is $i = 8$.
¹⁰ The distributional limit in (10) has an awkward characteristic function that does not readily admit the construction of confidence intervals.
¹¹ This type of normalization applies to all of the simple estimators discussed in this paper and enables the calculation of confidence intervals for the respective parameter estimates, even in heavy-tailed cases.

Next, let

$$R_t = X_t - \phi_0 X_{t-1}$$

so that, given (5),

$$R_t = -\beta_0 W_{t-1} + W_t, \tag{12}$$

making $R_t$ an MA(1) process. Recursive substitution into (12) produces

$$R_t = -\beta_0 R_{t-1} + U_t, \qquad U_t = W_t - \beta_0^2 W_{t-2}. \tag{13}$$

From (13), an exactly identified IV estimator for $\beta_0$ is

$$\hat{\beta}_{IV}(\hat{\phi}) = -\hat{G}\left(n^{-1}\sum_t \hat{R}_t Y_{t-1}\right), \qquad \hat{G} = \left(n^{-1}\sum_t \hat{R}_{t-1} Y_{t-1}\right)^{-1}, \tag{14}$$

where

$$\hat{R}_t = \hat{X}_t - \hat{\phi} \hat{X}_{t-1}.$$

The estimator in (14) is highly analogous to the one in (8), since

$$E(R_{t-1} Y_{t-1}) = E(X_{t-1} Y_{t-1}) = E(Y_{t-1}^3),$$

thereby linking instrument strength directly to the level of skewness in $D$. Differentiating (14) from (8) is

1. $Y_{t-1}$ being a proper instrument for $R_{t-1}$ (it is an improper instrument for $X_{t-1}$ as previously discussed)
2. the estimator depending on $\hat{\phi}$

This second differentiating feature likens (14) to the closed-form estimator proposed by Kristensen and Linton (2006) and necessitates, in turn, a consistent estimator for $\phi_0$. In order to satisfy this condition and preserve (14) being entirely closed form, consider

$$\hat{\phi}_{IV} = \hat{F}\left(n^{-1}\sum_t \hat{X}_t Z_{t-2}\right), \qquad \hat{F} = \frac{\left(n^{-1}\sum_t \hat{X}_{t-1} Z_{t-2}\right)' \hat{\Lambda}}{\left(n^{-1}\sum_t \hat{X}_{t-1} Z_{t-2}\right)' \hat{\Lambda} \left(n^{-1}\sum_t \hat{X}_{t-1} Z_{t-2}\right)}, \tag{15}$$

with the instrument vector

$$Z_{t-2} = \left(Y_{t-2}, \ldots, Y_{t-h}\right)'.$$

Given A3, (7), and

$$E(W_{t-i} Z_{t-2}) = 0, \qquad i = 0, 1,$$

which follows by the law of iterated expectations, $Z_{t-2}$ is a proper set of instruments for $X_{t-1}$ in (5).

ASSUMPTION A5: $\hat{\Lambda} \xrightarrow{a.s.} \Lambda_0$, a positive definite matrix.

If $\hat{\Lambda} = \left(n^{-1}\sum_t Z_{t-2} Z_{t-2}'\right)^{-1}$, then $\hat{\phi}_{IV}$ is a TSLS estimator. The advantage of this choice of a weighting matrix is that

$$\hat{\Lambda} = \left(n^{-1}\sum_t Z_{t-2} Z_{t-2}'\right)^{-1} \xrightarrow{a.s.} \gamma_0^{-1} I_h,$$

where $I_h$ is an $h \times h$ identity matrix, given Assumptions A1–A4. Alternatively, if

$$\hat{\Lambda} = \left(n^{-1}\sum_t \left(\hat{X}_t - \tilde{\phi} \hat{X}_{t-1}\right)^2 Z_{t-2} Z_{t-2}'\right)^{-1},$$

where $\tilde{\phi}$ is a preliminary estimator, then $\hat{\phi}_{IV}$ is a two-step GMM estimator. While this second choice of a weighting matrix is certainly preferable on efficiency grounds, $\hat{\Lambda} \xrightarrow{a.s.} \Lambda_0$ now requires $E(A^3) < 1$, which is a rather tall order given the empirical features of many financial return series (see, e.g., Figure 1). Consequently, this paper focuses on the TSLS interpretation of (15).
Theorem 13 in the Supplemental Appendix shows that $\hat{\phi}_{IV} \xrightarrow{a.s.} \phi_0$ and that $\hat{\phi}_{IV}$ converges (weakly) in distribution to a limit that is qualitatively similar to (10) when $\kappa_0 \in (3, 6)$, with the same rate of convergence. The following theorem depends on these results.
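
The closed-form estimators in (8), (14), and (15) reduce to ratios of sample moments, which the following sketch makes concrete. The function names (`centered_squares`, `alpha_iv`, `phi_iv_tsls`, `beta_iv`), the default $h = 5$, and the way the first few observations are dropped to align lags are assumptions made for illustration; the sketch uses the TSLS weighting matrix $\hat{\Lambda} = (n^{-1}\sum_t Z_{t-2} Z_{t-2}')^{-1}$ and does not reproduce the author's own code.

```python
import numpy as np

def centered_squares(y):
    """X_hat_t = Y_t^2 - gamma_hat, with gamma_hat the sample mean of Y_t^2, as in (9)."""
    y2 = np.asarray(y, dtype=float) ** 2
    return y2 - y2.mean()

def alpha_iv(y):
    """Equation (8): sample moment of X_hat_t * Y_{t-1} over that of X_hat_{t-1} * Y_{t-1}."""
    y = np.asarray(y, dtype=float)
    x = centered_squares(y)
    num = np.mean(x[1:] * y[:-1])    # average of X_hat_t * Y_{t-1}
    den = np.mean(x[:-1] * y[:-1])   # average of X_hat_{t-1} * Y_{t-1}
    return num / den

def phi_iv_tsls(y, h=5):
    """Equation (15) with the TSLS weighting matrix; instruments Z_{t-2} = (Y_{t-2}, ..., Y_{t-h})'."""
    if h < 2:
        raise ValueError("h must be at least 2")
    y = np.asarray(y, dtype=float)
    x = centered_squares(y)
    t = np.arange(h, y.size)                                   # usable time indices
    Z = np.column_stack([y[t - j] for j in range(2, h + 1)])   # row for time t is Z_{t-2}'
    m1 = Z.T @ x[t - 1] / t.size                               # average of X_hat_{t-1} * Z_{t-2}
    m0 = Z.T @ x[t] / t.size                                   # average of X_hat_t * Z_{t-2}
    lam_m1 = np.linalg.solve(Z.T @ Z / t.size, m1)             # Lambda_hat applied to m1
    return (lam_m1 @ m0) / (lam_m1 @ m1)

def beta_iv(y, phi_hat):
    """Equation (14): minus the ratio of sample moments of R_hat_t*Y_{t-1} and R_hat_{t-1}*Y_{t-1}."""
    y = np.asarray(y, dtype=float)
    x = centered_squares(y)
    r = x[1:] - phi_hat * x[:-1]        # r[i] holds R_hat at time i + 1
    num = np.mean(r[1:] * y[1:-1])      # average of R_hat_t * Y_{t-1}
    den = np.mean(r[:-1] * y[1:-1])     # average of R_hat_{t-1} * Y_{t-1}
    return -num / den
```

A typical call sequence, given a return series `y`, would be `phi_hat = phi_iv_tsls(y)` followed by `beta_iv(y, phi_hat)`; any other consistent preliminary estimate of $\phi_0$ can be passed to `beta_iv` in the same way. No step involves a numerical search.
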

Theorem 5. Consider the estimator in (14) with $\hat{\phi} = \hat{\phi}_{IV}$ as defined in (15). Let

$$A_0 = \Lambda_0 E(X_{t-1} Z_{t-2}), \qquad B_0 = E(X_{t-1} Z_{t-2})' A_0, \qquad G_0 = E(R_{t-1} Y_{t-1})^{-1},$$

and let Assumptions A1–A5 hold. Then

$$\hat{\beta}_{IV}(\hat{\phi}_{IV}) \xrightarrow{a.s.} \beta_0,$$

and

$$n a_n^{-3}\left(\hat{\beta}_{IV}(\hat{\phi}_{IV}) - \beta_0\right) \xrightarrow{d} \alpha_0^{-1}\left(F_0 S + G_0\left(\alpha_0 \phi_0 V_{0,y} - \left(V_{2,y} - \beta_0 V_{1,y}\right)\right)\right), \tag{16}$$

where $\kappa_0 \in (3, 6)$, "$\xrightarrow{d}$" denotes weak convergence, $V_{0,y}$ is defined in Lemma 12, $\left(V_{2,y} - \beta_0 V_{1,y}\right)$ heralds from Theorem 1, and $F_0 S$ is defined in Theorem 13 of the Supplemental Appendix. This (weak) distributional limit is also $(\kappa_0/3)$-stable. If $\kappa_0 \in (6, \infty)$, in which case $E(Y_t^6) < \infty$, then

$$\sqrt{n}\left(\hat{\beta}_{IV}(\hat{\phi}_{IV}) - \beta_0\right) \xrightarrow{d} N\left(0,\ \Sigma_\beta\right),$$
where

$$\Sigma_\beta = E(Y_t^3)^{-2}\left[B_0^2\, \Sigma_{UY_{-1}} + E(Y_t^3)^2 A_0' \Sigma_{VZ_{-2}} A_0 - 2 B_0 \Sigma_{UVY_{-1}Z_{-2}}' A_0\right],$$

$A_0$, $B_0$, and $\Sigma_{VZ_{-2}}$ are defined in Theorem 13,

$$\Sigma_{UY_{-1}} = E\left((U_t Y_{t-1})^2\right) + 2E\left(U_t U_{t-1} Y_{t-1} Y_{t-2}\right),$$

$$\Sigma_{VZ_{-2}} = E\left(V_t^2 Z_{t-2} Z_{t-2}'\right) + 2E\left(V_t V_{t-1} Z_{t-2} Z_{t-3}'\right),$$

and

$$\Sigma_{UVY_{-1}Z_{-2}} = E\left(U_t V_t Y_{t-1} Z_{t-2}\right) + 2E\left(U_t V_{t-1} Y_{t-1} Z_{t-3}\right).$$

Remark 6. As is true in Theorem 1, (i) $\hat{\gamma}$ impacts neither consistency nor the asymptotic variance of $\hat{\beta}_{IV}$, (ii) a necessary condition for the limiting result in (16) is that $i \in (3, 6)$ in A1, and (iii) in order for both the heavy-tailed and thin-tailed distributional limits to be stable, $E(Y_t^3) \neq 0$; otherwise, $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ is not identified.

Remark 7. The limiting result in (16) is a linear combination of the results in (10) and (28) in the Supplemental Appendix; in which case, the limiting distribution of $\hat{\phi}_{IV}$ impacts the asymptotic distribution of $\hat{\beta}_{IV}$ as

it does in the Kristensen and Linton (2006) estimator.

Corollary 8. Consider the estimator in (14). Let Assumptions A1–A4 hold. In addition, assume there exists a $\hat{\phi}$ such that $\hat{\phi} \xrightarrow{a.s.} \phi_0$ with a rate of convergence to a stable limiting distribution of $n^l$, where $l > \frac{\kappa_0 - 3}{\kappa_0}$. Then,

$$\hat{\beta}_{IV}(\hat{\phi}) \xrightarrow{a.s.} \beta_0$$

and

$$n a_n^{-3}\left(\hat{\beta}_{IV}(\hat{\phi}) - \beta_0\right) \xrightarrow{d} G_0\left(\phi_0 V_{0,Y} - \alpha_0^{-1}\left(V_{2,y} - \beta_0 V_{1,y}\right)\right). \tag{17}$$

Remark 9. Relative to (16), (17) has a (substantial) source of variation in the asymptotic limit removed. The implications of this result are threefold. First, for any consistent $\hat{\phi}$ that converges faster than $\hat{\phi}_{IV}$ (see Theorem 13 in the Supplemental Appendix), asymptotically there is no difference between using this estimate and the true value $\phi_0$. Second, in this case, $\hat{\beta}_{IV}(\hat{\phi})$ is asymptotically more efficient than $\hat{\beta}_{IV}(\hat{\phi}_{IV})$. Third, it is natural to consider $\hat{\phi} = \hat{\phi}_{QMLE}$, since $\hat{\phi}_{QMLE}$ is both consistent and $n^l$ asymptotically normal under conditions supported by the Corollary (see, e.g., Berkes, Horváth, and Kokoszka, 2003, Hall and Yao, 2003, and Francq and Zakoïan, 2004).

Given $\hat{\beta}_{IV}(\hat{\phi})$ in Corollary 8, consider the alternative estimator for $\alpha_0$

$$\hat{\alpha}_{IV}(\hat{\phi}) = \hat{\phi} - \hat{\beta}_{IV}(\hat{\phi}). \tag{18}$$

Corollary 10. Consider the estimator in (18). Let Assumptions A1–A5 hold. Then for both the $\hat{\phi}$ defined in Corollary 8 and $\hat{\phi} = \hat{\phi}_{IV}$,

$$\hat{\alpha}_{IV}(\hat{\phi}) \xrightarrow{a.s.} \alpha_0,$$

and

$$n a_n^{-3}\left(\hat{\alpha}_{IV}(\hat{\phi}) - \alpha_0\right) \xrightarrow{d} -G_0\left(\phi_0 V_{0,Y} - \alpha_0^{-1}\left(V_{2,y} - \beta_0 V_{1,y}\right)\right). \tag{19}$$

From Theorem 1 and Corollary 10, if

$$n a_n^{-3}\left(\hat{\alpha}_{IV} - \alpha_0\right) \xrightarrow{d} X,$$

then

$$n a_n^{-3}\left(\hat{\alpha}_{IV}(\hat{\phi}) - \alpha_0\right) \xrightarrow{d} X - Y.$$

Drawing upon what is (generally) known for two-step estimators, necessary for $\hat{\alpha}_{IV}(\hat{\phi})$ to display asymptotic efficiency gains over $\hat{\alpha}_{IV}$ is a (strong) positive association between $X$ and $Y$ (see, e.g., Newey and McFadden, 1994). Specifically, given (10) and (19), there needs to be a (strong) and positive association between $V_{0,Y}$ and $\left(V_{2,y} - \beta_0 V_{1,y}\right)$. From Lemma 12 in the Supplemental Appendix, $V_{0,Y}$ is the limit to $n^{-1}\sum_t Y_t^3$, while $V_{m,Y}$ is the limit to $n^{-1}\sum_t Y_t Y_{t+m}^2$ for $m = 1, 2$ (see Lemma 11). Given (6) and (7), supporting such a positive association in large samples and thin-tailed cases where $\kappa_0 \in (6, \infty)$ is then

$$\begin{aligned} & Cov\left(n^{-1}\sum_t Y_t^3,\ n^{-1}\sum_t Y_t Y_{t+2}^2 - \beta_0 n^{-1}\sum_t Y_t Y_{t+1}^2\right) \\ &\quad = Cov\left(n^{-1}\sum_t Y_t^3,\ n^{-1}\sum_t Y_t Y_{t+2}^2\right) - \beta_0\, Cov\left(n^{-1}\sum_t Y_t^3,\ n^{-1}\sum_t Y_t Y_{t+1}^2\right) \\ &\quad \approx Cov\left(n^{-1}\sum_t Y_t^3,\ \alpha_0 \phi_0 n^{-1}\sum_t Y_t^3\right) - \beta_0\, Cov\left(n^{-1}\sum_t Y_t^3,\ \alpha_0 n^{-1}\sum_t Y_t^3\right) \\ &\quad = \alpha_0^2\, Var\left(n^{-1}\sum_t Y_t^3\right) \\ &\quad > 0. \end{aligned}$$

It can be anticipated, however, that any efficiency gains in $\hat{\alpha}_{IV}(\hat{\phi})$ over $\hat{\alpha}_{IV}$ will be muted relative to the gains in $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ over $\hat{\beta}_{IV}(\hat{\phi}_{IV})$.

2.2. Potential for Efficiency Gains

Consider the estimator $\hat{\theta}$.

ASSUMPTION A6: $\hat{\theta} \xrightarrow{a.s.} \theta_0$ and $\sqrt{n}\left(\hat{\theta} - \theta_0\right) \xrightarrow{d} N(0, \Sigma_\theta)$, where

$$\Sigma_\theta = \begin{pmatrix} \Sigma_\omega & \Sigma_{\omega,\alpha} & \Sigma_{\omega,\beta} \\ \Sigma_{\omega,\alpha} & \Sigma_\alpha & \Sigma_{\alpha,\beta} \\ \Sigma_{\omega,\beta} & \Sigma_{\alpha,\beta} & \Sigma_\beta \end{pmatrix}.$$

In addition,

$$\Sigma_\alpha + 2\Sigma_{\alpha,\beta} < 0. \tag{20}$$
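
Condition (20) can be read through the linear map $\phi = \alpha + \beta$: under A6 the asymptotic variance of $\hat{\phi} = \hat{\alpha} + \hat{\beta}$ is $c' \Sigma_\theta c$ with $c = (0, 1, 1)'$. The sketch below evaluates that quadratic form and checks (20) for a covariance matrix whose entries are entirely made up for illustration; the function names are likewise hypothetical.

```python
import numpy as np

def phi_variance(sigma_theta):
    """Asymptotic variance of phi_hat = alpha_hat + beta_hat: c' Sigma_theta c with c = (0, 1, 1)'."""
    c = np.array([0.0, 1.0, 1.0])
    return float(c @ np.asarray(sigma_theta, dtype=float) @ c)

def satisfies_condition_20(sigma_theta):
    """Condition (20): Sigma_alpha + 2 * Sigma_{alpha,beta} < 0."""
    s = np.asarray(sigma_theta, dtype=float)
    return bool(s[1, 1] + 2.0 * s[1, 2] < 0.0)

# Hypothetical covariance matrix, ordered (omega, alpha, beta); the values are made up.
Sigma_theta = np.array([[ 4.0,  1.0, -2.0],
                        [ 1.0,  2.0, -1.5],
                        [-2.0, -1.5,  3.0]])

print("variance of phi_hat:", phi_variance(Sigma_theta))
print("variance of beta_hat:", Sigma_theta[2, 2])
print("condition (20) holds:", satisfies_condition_20(Sigma_theta))
```

In this made-up example the negative covariance between $\hat{\alpha}$ and $\hat{\beta}$ is large enough that the sum $\hat{\phi}$ is estimated more precisely than $\hat{\beta}$ alone, which is the situation (20) describes.
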

Since $(0, 1, 1)(\hat{\theta} - \theta_0) = \hat{\phi} - \phi_0$, from A6 follows that

$$\sqrt{n}\,(\hat{\phi} - \phi_0) \xrightarrow{d} N\left(0,\; \Sigma_\beta + \Sigma_\alpha + 2\Sigma_{\alpha,\beta}\right).$$

Consequently, given (20), $\Sigma_\phi < \Sigma_\beta$. Moreover, since $\Sigma_\alpha > 0$, necessary for (20) is $\Sigma_{\alpha,\beta} < 0$. Given the discussion that follows Corollary 8, a natural candidate for $\hat{\theta}$ is $\hat{\theta}_{QMLE}$. The next section considers a wide range of Monte Carlo simulation designs for the model in (2) and (3). Under all of these designs (without exception), including ones conducted using (very) large sample sizes, (20) holds when $\hat{\theta} = \hat{\theta}_{QMLE}$. Consequently, because the Monte Carlo designs conform both with the magnitudes of ARCH and GARCH parameters and the stylized facts of GARCH innovations encountered empirically, A6 appears to (at least) enjoy broad empirical support when applied to the QMLE.

Consider next the estimator in (8). Given Theorem 1,

$$\sqrt{n}\,(\hat{\alpha}_{IV} - \alpha_0) \xrightarrow{d} N(0, \Sigma_{*\alpha})$$

in thin-tailed cases where $\kappa_0 \in (6, \infty)$. Further consider

$$\Sigma_{*\alpha,\beta} = nE\left((\hat{\alpha}_{IV} - \alpha_0)(\hat{\beta} - \beta_0)\right), \qquad \Omega_{*\alpha} = nE\left((\hat{\alpha}_{IV} - \alpha_0)(\hat{\alpha} - \alpha_0)\right).$$

ASSUMPTION A7:

$$\Sigma_{*\alpha} - 2\left(\Omega_{*\alpha} + \Sigma_{*\alpha,\beta}\right) < 0. \qquad (21)$$

In all the Monte Carlo experiments considered in the next section, $\Omega_{*\alpha} + \Sigma_{*\alpha,\beta} > 0$ when $\hat{\theta} = \hat{\theta}_{QMLE}$. Since $\Sigma_{*\alpha}$ is a variance and therefore positive, the relative size of $\Sigma_{*\alpha}$ is important for determining whether (21) holds. Recall from (11) that as $\left|E(Y_t^3)\right|$ increases, $\Sigma_{*\alpha}$ decreases. As a result, the likelihood of (21) holding increases along with the magnitude of skewness in $Y_t$. Empirical evidence supports elevated skewness levels in (very) high frequency returns (see Table 1). These elevated levels, however, correspond with heavy-tailed return processes (see Figure 1). An interesting question, then, is whether there exist thin-tailed GARCH processes in the context of Theorem 1 with sufficient skewness to satisfy (21). Evidencing why this question is interesting is the following theorem.
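As a numerical aside before the theorem is stated, conditions (20) and (21) can be checked for any candidate set of asymptotic variances and covariances. The sketch below uses purely hypothetical numbers (they are not taken from the paper's tables or simulations), and the function names are illustrative. It assumes that the asymptotic variance of $\hat{\beta}_{IV}(\hat{\phi})$ equals $\Sigma_\beta$ plus the left-hand sides of (20) and (21), which is one reading of how A6 and A7 combine, obtained by treating $\hat{\beta}_{IV}(\hat{\phi}) - \beta_0$ as $(\hat{\phi} - \phi_0) - (\hat{\alpha}_{IV} - \alpha_0)$ to first order.

```python
def lhs_a6(sigma_alpha, sigma_alpha_beta):
    """Left-hand side of (20): Sigma_alpha + 2 * Sigma_{alpha,beta}."""
    return sigma_alpha + 2.0 * sigma_alpha_beta


def lhs_a7(sigma_star_alpha, omega_star_alpha, sigma_star_alpha_beta):
    """Left-hand side of (21): Sigma_{*alpha} - 2 * (Omega_{*alpha} + Sigma_{*alpha,beta})."""
    return sigma_star_alpha - 2.0 * (omega_star_alpha + sigma_star_alpha_beta)


def var_phi(sigma_alpha, sigma_beta, sigma_alpha_beta):
    """Asymptotic variance of phi_hat = alpha_hat + beta_hat (variance of a sum)."""
    return sigma_beta + sigma_alpha + 2.0 * sigma_alpha_beta


# Hypothetical inputs, chosen only so that the implied joint covariance
# matrix of (alpha_hat, beta_hat, alpha_hat_IV) is positive definite.
sigma_alpha, sigma_beta, sigma_alpha_beta = 1.0, 4.0, -1.5
sigma_star_alpha, omega_star_alpha, sigma_star_alpha_beta = 1.5, 0.6, 0.4

a6 = lhs_a6(sigma_alpha, sigma_alpha_beta)
a7 = lhs_a7(sigma_star_alpha, omega_star_alpha, sigma_star_alpha_beta)
sigma_phi = var_phi(sigma_alpha, sigma_beta, sigma_alpha_beta)

# Assumed decomposition (see the lead-in): Sigma_{*beta} = Sigma_beta + (20) + (21).
sigma_star_beta = sigma_beta + a6 + a7

print("A6 holds:", a6 < 0, "| A7 holds:", a7 < 0)
print("Sigma_phi < Sigma_beta:", sigma_phi < sigma_beta)
print("Sigma_*beta < Sigma_beta:", sigma_star_beta < sigma_beta)
```

With these inputs both conditions hold, so the implied variance falls below $\Sigma_\beta$; flipping the sign of either left-hand side is an easy way to see which assumption drives the comparison.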

Theorem 11 Consider the estimator in (14) under thin-tailed cases where $\kappa_0 \in (6, \infty)$. Let Assumptions A1–A4 and A6–A7 hold. Then, $\hat{\beta}_{IV}(\hat{\phi}) \xrightarrow{a.s.} \beta_0$,

$$\sqrt{n}\left(\hat{\beta}_{IV}(\hat{\phi}) - \beta_0\right) \xrightarrow{d} N\left(0, \Sigma_{*\beta}\right),$$

and $\Sigma_{*\beta} < \Sigma_\beta$.

Remark 12 From Theorem 11, there exist conditions under which $\hat{\beta}_{IV}(\hat{\phi})$ has a smaller asymptotic variance than $\hat{\beta}$. Under these conditions, $\hat{\beta}_{IV}(\hat{\phi})$ can be interpreted as enhancing the efficiency of $\hat{\beta}$, generally, and $\hat{\beta}_{QMLE}$, specifically, in an analogous way that the estimators of Francq et al. (2011) and Fan et al. (2014) enhance the efficiency of $\hat{\beta}_{QMLE}$. These latter estimators achieve efficiency gains (i.e., smaller asymptotic variances) by targeting the scale of the unknown innovation density $D$. Rather than targeting scale, $\hat{\beta}_{IV}(\hat{\phi})$ targets the skewness of $D$. To the extent that this skewness is pronounced (i.e., a prevalent feature of $D$), efficiency gains should result, especially if the initial estimator $\hat{\theta}$ ignores this skewness, as is the case, generally, for many GARCH estimators, and, certainly, $\hat{\theta}_{QMLE}$, specifically.

Corollary 13 Consider the estimator in (18) under thin-tailed cases where $\kappa_0 \in (6, \infty)$. Let Assumptions A1–A4 hold. Then

$$\sqrt{n}\left(\hat{\alpha}_{IV}(\hat{\phi}) - \alpha_0\right) \xrightarrow{d} N(0, \Sigma_{*\alpha}),$$

the same distributional limit as $\hat{\alpha}_{IV}$ in (11).

Remark 14 In thin-tailed cases where $\kappa_0 \in (6, \infty)$, $\hat{\alpha}_{IV}(\hat{\phi})$ and $\hat{\alpha}_{IV}$ share the same distributional limit. Consequently, if $\Sigma_{*\alpha} < \Sigma_\alpha$, which likely sources to heavy skewness in $D$, $\hat{\alpha}_{IV}(\hat{\phi})$ offers no improvement over $\hat{\alpha}_{IV}$, a result which, owing to simulation evidence, stands in contrast to the improvement $\hat{\beta}_{IV}(\hat{\phi})$ affords over $\hat{\beta}_{IV}(\hat{\phi}_{IV})$.

3.1. Monte Carlo Design

This section considers the GARCH(1,1) model from Section 2.1, where $\{\epsilon_t\}$ is drawn from the skewed student's-t density of Hansen (1994). This density has two parameters, $\lambda$ and $\eta$, where the former governs skewness, the latter governs the tails, and up to the $\eta$th moment of the distribution is well defined. Values

for these parameters considered in the Monte Carlo experiments are

$$\lambda = -0.20,\ -0.40,\ -0.80,\ -0.90,\ -0.99; \qquad \eta = 64.5,\ 8.1,\ 4.5,\ 3.5. \qquad (22)$$

As $\lambda$ increases (in absolute terms), skewness increases, while as $\eta$ decreases, tail thickness increases.

The Monte Carlo experiments summarized in this section involve the GARCH specification of

$$\omega_0 = 0.005, \qquad \alpha_0 = 0.10, \qquad \beta_0 = 0.80. \qquad (23)$$

For robustness, other specifications are also considered, the results for which are summarized in the Supplemental Appendix. The estimators under study are the simple IV estimators from Section 2.1, with both the closed-form Kristensen and Linton (2006) estimator (KL) and the QMLE serving as benchmarks. Recalling that $m$ denotes the number of lags used as instruments, for $\hat{\alpha}_{IV}$, $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ and $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$, $m = 1$. For $\hat{\beta}_{IV}(\hat{\phi}_{IV})$, $m = 20, 10, 5$, so as to investigate the effects of the number of instruments on the performance of the estimator. Table 2 summarizes the skewness statistics and tail index estimates for $\{Y_t\}_{t=1}^{T}$, given the GARCH specification in (23) and the different values for $\lambda$ and $\eta$ in (22). Notice that skewness levels in the simulations are consistent with skewness levels encountered empirically (see Table 1). When $\eta = 64.5$, $\hat{\alpha}_{IV}$, $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$, $\hat{\alpha}_{QMLE}$, $\hat{\beta}_{IV}(\hat{\phi}_{IV})$, $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$, and $\hat{\beta}_{QMLE}$ are all asymptotically normal, while $\hat{\alpha}_{KL}$ and $\hat{\beta}_{KL}$ (likely) are not.$^{12}$ When $\eta = 8.1$ and $\eta = 4.5$, only $\hat{\alpha}_{QMLE}$ and $\hat{\beta}_{QMLE}$ are asymptotically normal. When $\eta = 3.5$, none of the estimators are asymptotically normal.$^{13}$ When $\eta = 4.5$ and $\eta = 3.5$, $\hat{\alpha}_{KL}$ and $\hat{\beta}_{KL}$ are not consistent; in which case, they are not considered in the experiments.$^{14}$

Sample sizes for the simulations are 100,000 and 500 observations, respectively, with the former investigating the large-sample properties of the simple IV estimators (given their slow convergence rates), and the latter investigating their small-sample properties. All simulations are conducted over 10,000 trials, with the first 200 observations dropped to avoid initialization effects. Summary statistics for the simulations are the root mean squared error, mean absolute error, and median absolute error (each measured relative to the true parameter value) divided by the corresponding efficiency measure for the QMLE. These ratios are termed "efficiency ratios," and benchmark the performance of the simple IV estimators against the QMLE. Additional details on the simulations are contained in the notes to the relevant Tables.

$^{12}$ Necessary for $\hat{\alpha}_{KL}$ and $\hat{\beta}_{KL}$ to be asymptotically normal is $E(Y_t^8) < \infty$ (see Kristensen and Linton, 2006), which does not appear to be true, given the results in Table 2.

$^{13}$ A necessary condition for $\hat{\alpha}_{QMLE}$ and $\hat{\beta}_{QMLE}$ to be asymptotically normal is $E(\epsilon_t^4) < \infty$ (see, e.g., Hall and Yao, 2003).

$^{14}$ Necessary for consistency of $\hat{\alpha}_{KL}$ and $\hat{\beta}_{KL}$ is $E(Y_t^4) < \infty$, which (very likely) does not hold, given the results in Table 2.
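The efficiency ratios described above can be sketched in a few lines. The snippet below is a minimal illustration rather than the paper's code: the function names are hypothetical, the input arrays are made-up numbers, and errors are measured as deviations from the true parameter value (any common rescaling by the true value would cancel in the ratio).

```python
import math
import statistics


def error_measures(estimates, true_value):
    """RMSE, mean absolute error, and median absolute error around true_value."""
    errors = [est - true_value for est in estimates]
    abs_errors = [abs(e) for e in errors]
    return {
        "rmse": math.sqrt(statistics.fmean([e * e for e in errors])),
        "mae": statistics.fmean(abs_errors),
        "medae": statistics.median(abs_errors),
    }


def efficiency_ratios(simple_estimates, qmle_estimates, true_value):
    """Each error measure of the simple estimator divided by the QMLE's."""
    simple = error_measures(simple_estimates, true_value)
    qmle = error_measures(qmle_estimates, true_value)
    return {name: simple[name] / qmle[name] for name in simple}


# Hypothetical estimates across trials (illustrative values only).
ratios = efficiency_ratios(
    simple_estimates=[0.10, 0.30],
    qmle_estimates=[0.15, 0.25],
    true_value=0.20,
)
print(ratios)
```

Under this convention a ratio below one favors the simple estimator and a ratio above one favors the QMLE.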

3.2. Results

Tables 3–4 report large-sample results for the simple estimators proposed herein at varying levels of tail-thickness for the GARCH(1,1) model's innovation density. Beginning with Table 3, in the thin-tailed case of $\eta = 64.5$, the relative performance of all simple estimators improves as $\lambda$ increases (in absolute terms). The opposite is true for the KL estimator, which sees its performance meaningfully degrade with increasing levels of skewness. At low skewness levels, the KL estimator is more efficient than the simple estimators. At high skewness levels, this tendency is reversed, with the KL estimator notably lagging the simple estimators. Amongst the simple estimators, $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ performs the best. However, all simple estimators, including the KL estimator, notably lag the QMLE.

Moving to the heavier-tailed case of $\eta = 8.1$ (still Table 3), the same trends mentioned above continue to hold. There are, however, some notable points of departure from these trends. Specifically, the relative performance of $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ materially degrades in this heavier-tailed case. Also, its performance now appears to worsen as $\lambda$ increases (in absolute terms). In contrast, the relative performance of $\hat{\alpha}_{IV}$, $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$, and $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ remains much more stable across the two cases.

In the heavy-tailed case of $\eta = 4.5$ (now Table 4), a substantial relative performance drop is, again, evidenced for $\hat{\beta}_{IV}(\hat{\phi}_{IV})$. Consequently, $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ now notably lags in performance relative to the other simple estimators at all skewness levels considered. Overall, there is a general tendency for the relative performance of the other simple estimators to also decline between the cases of $\eta = 8.1$ and $\eta = 4.5$; however, this decline is decidedly more modest. In the heaviest-tailed case of $\eta = 3.5$, the relative performance of all simple estimators notably improves, likely owing to the fact that while under the cases of $\eta = 8.1$ and $\eta = 4.5$, QMLE is asymptotically normal, under the case of $\eta = 3.5$, the distributional limit of QMLE is qualitatively (much) more similar to that of the simple estimators.

Table 5 summarizes results of an investigation into whether Assumptions A6 and A7 of Theorem 11 can hold in thin-tailed cases that support asymptotic normality for both $\hat{\beta}_{QMLE}$ and $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$. The design of this investigation is as follows. For different ARCH and GARCH parameter values (listed in the Table) and the highest possible $\lambda$ values (in absolute terms), determine the lowest possible $\eta$ value that still supports asymptotic normality for $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$. In each of these cases, measure the efficiency of $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ relative to $\hat{\beta}_{QMLE}$. Across all of the cases summarized in Table 5 (and, in fact, all large and small sample cases considered in this section), A6 holds.$^{15}$ Consequently, as hypothesized in Section 2.2, the prediction of Theorem 11 critically hinges upon the validity of A7. Table 5 demonstrates that A7 can, in fact, hold, in which case, $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ is more efficient than $\hat{\beta}_{QMLE}$.

$^{15}$ This finding, though not explicitly evident in the Table, can be seen through performance comparisons of $\hat{\phi}_{QMLE}$ and $\hat{\beta}_{QMLE}$. These comparisons are available upon request.

It is tempting to conclude from the results in Table 5 that efficiency gains of $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ over $\hat{\beta}_{QMLE}$ are limited to GARCH processes with low persistence. Such a conclusion under-weights the fact that cases where $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ is more efficient than $\hat{\beta}_{QMLE}$ correspond with the highest skewness levels in $\{Y_t\}_{t=1}^{T}$. Evidently, lower GARCH persistence levels are associated with higher skewness levels in the Table. However, to the extent that it is possible to generate skewness levels at least as large (in absolute terms) as the highest levels observed in Table 5 for more persistent GARCH processes, it seems likely that $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ will exceed the performance of $\hat{\beta}_{QMLE}$ in these cases as well.

Tables 3–5 evidence that the simple estimators record their best performance relative to QMLE in cases where skewness is high. In Table 5, the level of skewness achievable is limited by the constraint to only consider thin-tailed densities. Relaxing this constraint allows for materially higher skewness levels (see Table 2), the effects of which are demonstrated in Tables 3–4 for (very) large samples. From Section 2.1, the convergence rate of the simple estimators is slow, implying that their distributional limits as proxied for in the large sample results might offer poor predictions for how the simple estimators behave in small samples under the same heavy-tailed and skewed data generating processes. It is also generally known that asymptotic normality offers a poor proxy for QMLE in small samples of non-normally distributed data. Consequently, Table 6 summarizes small sample results from the cases of $\eta = 4.5$ and $\eta = 3.5$. The results are striking. In particular, $\hat{\alpha}_{IV}$, $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$, and $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ are now shown to outperform their QMLE counterparts when skewness is high.
Moreover, this outperformance can be (very) substantial.

3.3. Interpretation

$\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ and $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ are analogous to the multi-step estimators of Preminger and Storti (2017) and Fan, Qi and Xiu (2014) (see also Francq, Lepage and Zakoïan, 2011), the latter of which are intended to improve upon the efficiency of $\hat{\theta}_{QMLE}$ by using $\hat{\theta}_{QMLE}$ as first-step inputs. The relative performance of these simple estimators against the QMLE strengthens this analogy. Simulation evidence in Preminger and Storti (2017) for their least squares estimator (LSE) benchmarked against the Fan et al. (2014) non-Gaussian quasi-maximum likelihood estimator (NGQMLE) for the same sample size and $(\alpha_0, \beta_0)$ values considered here shows $\hat{\alpha}_{NGQMLE}$ to best $\hat{\alpha}_{QMLE}$ (in terms of root mean squared error) by less than $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ bests $\hat{\alpha}_{QMLE}$ and $\hat{\beta}_{LSE}$ to best $\hat{\beta}_{QMLE}$ to a comparable degree as $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ bests $\hat{\beta}_{QMLE}$.

To understand why $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ and $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ can improve upon their QMLE inputs, begin by noting that both LSE and NGQMLE attempt such improvements by better targeting the scale of the unknown GARCH(1,1) innovation density. The simple estimators $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ and $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ also target a particular feature of this innovation density; namely, its skewness. To the extent that skewness figures prominently in this density (as it does in the high $\lambda$ cases), it is certainly possible that estimators focused on this feature can outperform alternatives that ignore it (like QMLE), even if those alternatives possess better large sample properties, so long as those large sample properties have yet to apply. Case-in-point, the small sample distributions of both $\hat{\alpha}_{QMLE}$ and $\hat{\beta}_{QMLE}$ evidenced in Table 6 are characterized by elevated levels of both skewness and excess kurtosis, making normality a (very) poor approximation for these distributions and, consequently, rendering the large-sample properties of $\hat{\alpha}_{QMLE}$ and $\hat{\beta}_{QMLE}$ uninformative of their respective small-sample behavior. Noting further that the target of both LSE and NGQMLE is also insensitive to skewness, it is, perhaps, less surprising that $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ and $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ would (at least) perform comparably to these alternatives, in cases where skewness is a feature worth targeting.

4. Empirical Application

Estimators for the GARCH(1,1) model from Section 3 are applied to intra-day Japanese Yen log returns measured against the USD at 15-, 10-, 5-, and 1-minute sampling frequencies over the time period January 1, 2015 through July 1, 2015.$^{16}$ Using the approach in Hecq, Laurent, and Palm (2012, Eq. 4.1), all log returns are pre-filtered for the U-shaped intra-day periodicity noted by Andersen and Bollerslev (1997). The QMLE serves to benchmark the simple IV estimators.

Tables 7–8 summarize the estimation results.
With the exception of the 1-minute frequency, $\hat{\alpha}_{IV}$ and $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ lie inside the 95% confidence intervals for $\hat{\alpha}_{QMLE}$.$^{17}$ In contrast, and also with the exception of the 1-minute frequency, $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ lies well outside the 95% confidence intervals for $\hat{\beta}_{QMLE}$, while $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ lies inside them. These estimation results conform with the Monte Carlo experiments in that $\hat{\alpha}_{IV}$ tends to perform better than $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ at all sample sizes and across all degrees of tail thickness. Specifically, $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ tends to be (severely) biased, where the source of the bias is $\hat{\phi}_{IV}$. The estimation results confirm this finding, since $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ tends to be (well) inside the 95% confidence interval for $\hat{\beta}_{QMLE}$. Notice, however, that at the 1-minute frequency (which corresponds with the largest data sample), $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ is also well inside the 95% confidence interval for $\hat{\beta}_{QMLE}$. Consequently, in this case, $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ enjoys a sizable advantage over $\hat{\beta}_{QMLE}$ in terms of computation time (the former is faster to compute by orders-of-magnitude over the latter), with no seeming cost in terms of sacrificed precision.

$^{16}$ Japanese Yen log returns are selected because of the forecast comparisons conducted by Hansen and Lunde (2005), which show that the GARCH(1,1) model provides the best volatility forecast for these returns over more complicated GARCH specifications.

$^{17}$ These confidence intervals are based on asymptotic normality. Given the respective tail index estimates, asymptotic normality for the QMLE is suspect (see, e.g., Hall and Yao, 2003). Consequently, the true confidence intervals are likely to be wider.

5. Conclusion

This paper proposes simple estimators for the popular GARCH(1,1) model and studies their properties. Simple, in this context, means available in closed form. Consequently, all such estimators are linear TSLS estimators. In some cases, the instruments upon which these estimators depend, in turn, depend on preliminary parameter estimates from the GARCH(1,1) model that may, or may not, be available in closed form. An example of the latter case is preliminary QML estimates; in which case, the linear TSLS estimators based on these estimates are shown to improve upon the efficiency (either asymptotically, in thin-tailed cases, or when applied to small samples, in heavy-tailed cases) of QMLE. As a result, these simple estimators are members of the (growing) class of multi-step estimators aimed at improving the performance of QMLE by better aligning the parameter estimates with certain aspects of the unknown GARCH(1,1) innovation density.

Established in this paper are the desirable properties of the simple estimators over the QMLE alternative. These desirable properties relate to in-sample fit. It would be interesting to investigate the degree to which these desirable properties translate into improved out-of-sample volatility forecasts. That is, to what extent (and under what conditions) can the simple estimators beat the QMLE in out-of-sample forecasting? This investigation is left for future research.

References

[1] Amado, C. & T. Teräsvirta (2013) Modelling volatility by variance decomposition. Journal of Econometrics 175, 142-153.

[2] Amado, C. & T. Teräsvirta (2014) Modelling changes in the unconditional variance of long stock return series. Journal of Empirical Finance 25, 15-35.
[3] Andersen, T.G. & T. Bollerslev (1997) Intraday periodicity and volatility persistence in financial markets. Journal of Empirical Finance 4, 115-158.

[4] Baillie, R.T. & T. Bollerslev (1989) The message in daily exchange rates: a conditional-variance tale. Journal of Business and Economic Statistics 7, 297-305.

[5] Berkes, I., L. Horváth & P. Kokoszka (2003) GARCH processes: structure and estimation. Bernoulli 9, 201-227.

[6] Bollerslev, T. (1986) Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics 31, 307-327.

[7] Bouchaud, J.P. & M. Potters (2003) Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management (Second Edition). Cambridge University Press.

[8] Davis, R.A. & T. Hsing (1995) Point process and partial sum convergence for weakly dependent random variables with infinite variance. The Annals of Probability 23, 879-917.

[9] Dong, Y. & A. Lewbel (2015) A simple estimator for binary choice models with endogenous regressors. Econometric Reviews 34, 82-105.

[10] Drost, F.C. & C.A.J. Klaassen (1997) Efficient estimation in semiparametric garch models. Journal of Econometrics 81, 193-221.

[11] Drost, F.C. & T.E. Nijman (1993) Temporal aggregation of garch processes. Econometrica 61, 909-927.

[12] Engle, R.F. & J.G. Rangel (2008) The Spline-GARCH model for low-frequency volatility and its global macroeconomic causes. The Review of Financial Studies 21, 1187-1222.

[13] Fan, J., L. Qi & D. Xiu (2014) Quasi-maximum likelihood estimation of garch models with heavy-tailed likelihoods. Journal of Business and Economic Statistics 32, 178-191.

[14] Francq, C. & J.-M. Zakoïan (2004) Maximum likelihood estimation of pure garch and arma-garch processes. Bernoulli 10, 605-637.

[15] Francq, C., G. Lepage & J.-M. Zakoïan (2011) Two-stage non gaussian qml estimation of garch models and testing the efficiency of the gaussian qmle. Journal of Econometrics 165, 246-257.

[16] Giraitis, L. & P.M. Robinson (2001) Whittle estimation of arch models. Econometric Theory 17, 608-631.

[17] Glosten, L.R., R. Jagannathan & D.E. Runkle (1993) On the relation between expected value and the volatility of the nominal excess return on stocks. Journal of Finance 48, 1779-1801.

[18] Hall, P. & Q. Yao (2003) Inference in arch and garch models with heavy-tailed errors. Econometrica 71, 285-317.

[19] Hansen, B.E. (1994) Autoregressive conditional density estimation. International Economic Review 35, 705-730.

[20] Hansen, P.R. & A. Lunde (2005) A forecast comparison of volatility models: does anything beat a garch(1,1)? Journal of Applied Econometrics 20, 873-889.

[21] Harvey, C.R. & A. Siddique (1999) Autoregressive conditional skewness. Journal of Financial and Quantitative Analysis 34, 465-487.

[22] Hecq, A., S. Laurent & F.C. Palm (2012) Common intraday periodicity. Journal of Financial Econometrics 10, 325-353.

[23] Hill, B.M. (1975) A simple general approach to inference about the tail of a distribution. Annals of Statistics 3, 1163-1174.

[24] Hill, J.B. (2010) On tail index estimation for dependent, heterogeneous data. Econometric Theory 26, 1398-1436.

[25] Hill, J.B. & E. Renault (2012) Variance targeting for heavy tailed time series. Unpublished manuscript.

[26] Jondeau, E. & M. Rockinger (2003) Conditional volatility, skewness, and kurtosis: existence, persistence, and comovements. Journal of Economic Dynamics and Control 27, 1699-1737.

[27] Kristensen, D. & O. Linton (2006) A closed form estimator for the garch(1,1)-model. Econometric Theory 22, 323-337.

[28] Kristensen, D. & A. Rahbek (2005) Asymptotics of the qmle for a class of arch(q) models. Econometric Theory 21, 946-961.

[29] Lee, S.W. & B.E. Hansen (1994) Asymptotic theory for the garch(1,1) quasi-maximum likelihood estimator. Econometric Theory 10, 29-52.

[30] Lewbel, A. (2004) Simple estimators for hard problems: endogeneity in discrete choice related models. Unpublished manuscript.

[31] Lumsdaine, R.L. (1996) Consistency and asymptotic normality of the quasi-maximum likelihood estimator in igarch(1,1) and covariance stationary garch(1,1) models. Econometrica 64, 575-596.

[32] Mikosch, T. (1999) Regular variation, subexponentiality and their applications in probability theory. Lecture notes for the workshop "Heavy Tails and Queues," EURANDOM, Eindhoven, Netherlands.

[33] Mikosch, T. & C. Stărică (2000) Limit theory for the sample autocorrelations and extremes of a garch(1,1) process. The Annals of Statistics 28, 1427-1451.

[34] Mikosch, T. & D. Straumann (2002) Whittle estimation in a heavy-tailed garch(1,1) model. Stochastic Processes and their Applications 100, 187-222.

[35] Mikosch, T. & D. Straumann (2006) Quasi-maximum-likelihood estimation in conditionally heteroskedastic time series: a stochastic recurrence equations approach. The Annals of Statistics 34, 2449-2495.

[36] Newey, W.K. & D. McFadden (1994) Large sample estimation and hypothesis testing. In R.F. Engle and D. McFadden (eds), Handbook of Econometrics Vol. 4: Amsterdam North Holland, chapter 36, 2111-2245.

[37] Preminger, A. & G. Storti (2017) Least-squares estimation of garch(1,1) models with heavy-tailed errors. Econometrics Journal 20, 221-258.

[38] Prono, T. (2014) Simple estimators for the GARCH(1,1) model. Available at SSRN: http://ssrn.com/abstract=1511720.

[39] Prono, T. (2018a) Closed-form estimators for finite-order arch models as simple and competitive alternatives to qmle. Forthcoming in Studies in Nonlinear Dynamics and Econometrics.

[40] Prono, T. (2018b) Closed-form estimators for finite-order arch models as simple and competitive alternatives to qmle: supplemental appendix. Forthcoming in Studies in Nonlinear Dynamics and Econometrics.

[41] Resnick, S.I. (1987) Extreme Values, Regular Variation, and Point Processes. New York: Springer-Verlag.

[42] Samorodnitsky, G. & M.S. Taqqu (1994) Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. Stochastic Modeling. New York: Chapman and Hall.

[43] Shao, Y. & M. Zhou (2010) A characterization of multivariate normality through univariate projections. Journal of Multivariate Analysis 101, 2637-2640.

[44] Vaynman, I. & B.K. Beare (2014) Stable limit theory for the variance targeting estimator, in Y. Chang, T.B. Fomby & J.Y. Park (eds), Essays in Honor of Peter C.B. Phillips, vol. 33 of Advances in Econometrics: Emerald Group Publishing Limited, chapter 24, 639-672.

Appendix

All Lemmas, upon which the proofs of the Theorems rely, are both stated and proved in the Supplemental Appendix.

Proof of Theorem 1: To begin, note that

$$\hat{X}_t = X_t - (\hat{\gamma} - \gamma_0), \qquad (24)$$

so that given (5),

$$\hat{X}_t = \hat{c}_0 + \phi_0 \hat{X}_{t-1} - \beta_0 W_{t-1} + W_t, \qquad \hat{c}_0 = (\hat{\gamma} - \gamma_0) \times (\phi_0 - 1). \qquad (25)$$

Since by Carrasco and Chen (2002, Corollary 6), $\{Y_t\}$ is strongly mixing,

$$
\begin{aligned}
n^{-1}\sum_t \hat{X}_{t-1} Y_{t-1} &= n^{-1}\sum_t X_{t-1} Y_{t-1} - (\hat{\gamma} - \gamma_0)\, n^{-1}\sum_t Y_{t-1} \\
&\xrightarrow{a.s.} E\left(X_{t-1} Y_{t-1}\right)
\end{aligned}
\qquad (26)
$$

by the Ergodic Theorem, and, given (25) and (26),

$$
\begin{aligned}
n^{-1}\sum_t \hat{X}_t Y_{t-1} &= \hat{c}_0\, n^{-1}\sum_t Y_{t-1} + \phi_0\, n^{-1}\sum_t \hat{X}_{t-1} Y_{t-1} - \beta_0\, n^{-1}\sum_t W_{t-1} Y_{t-1} + n^{-1}\sum_t W_t Y_{t-1} \\
&\xrightarrow{a.s.} \phi_0 E\left(X_{t-1} Y_{t-1}\right) - \beta_0 E\left(W_{t-1} Y_{t-1}\right) \\
&= \alpha_0 E\left(X_{t-1} Y_{t-1}\right),
\end{aligned}
$$

where the final equality follows since

$$
\begin{aligned}
E\left(X_{t-1} Y_{t-1}\right) &= E\left(\left(\sigma_{t-1}^2 + W_{t-1}\right) Y_{t-1}\right) \\
&= E\left(W_{t-1} Y_{t-1}\right).
\end{aligned}
\qquad (27)
$$

Next,

$$
\begin{aligned}
n a_n^{-3}\left(\hat{\alpha}_{IV} - \alpha_0\right) &= F_0\, a_n^{-3}\sum_t \left(\hat{X}_t Y_{t-1} - E\left(X_t Y_{t-1}\right)\right) + o_p(1) \\
&= F_0 \left( a_n^{-3}\sum_t \left(Y_t^2 Y_{t-1} - E\left(Y_t^2 Y_{t-1}\right)\right) - \gamma_0\, a_n^{-3}\sum_t Y_{t-1} \right) + o_p(1) \\
&= F_0\, a_n^{-3}\sum_t \left(Y_t^2 Y_{t-1} - E\left(Y_t^2 Y_{t-1}\right)\right) + o_p(1) \\
&\xrightarrow{d} \alpha_0^{-1} F_0 \left(V_{2,y} - \beta_0 V_{1,y}\right),
\end{aligned}
\qquad (28)
$$

where "$\xrightarrow{d}$" follows from Lemma 11, which itself depends on Lemmas 9–10, the CLT in Lemma 7, and Lemma 5, and the final equality follows since

$$a_n^{-1}\sum_t Y_{t-1} \xrightarrow{d} V_0, \qquad (29)$$

with $V_0$ being $\kappa_0$-stable (see the proof of Lemma 10). Finally, if $\kappa_0 \in (6, \infty)$ so that $E(Y_t^6) < \infty$, then given (5), (9), and (24),

$$
\begin{aligned}
\sqrt{n}\left(\hat{\alpha}_{IV} - \alpha_0\right) &= \sqrt{n}\left( \frac{n^{-1}\sum_t \hat{X}_t Y_{t-1}}{n^{-1}\sum_t \hat{X}_{t-1} Y_{t-1}} - \alpha_0 \right) + o_p(1) \\
&= \sqrt{n}\left( \beta_0 + \frac{n^{-1}\sum_t V_t Y_{t-1}}{E\left(X_{t-1} Y_{t-1}\right)} \right) + o_p(1) \\
&= \left(E\left(Y_t^3\right)\right)^{-1} \sqrt{n}\left( n^{-1}\sum_t V_t Y_{t-1} - E\left(V_t Y_{t-1}\right) \right) + o_p(1) \\
&\xrightarrow{d} N\left(0,\; \left(E\left(Y_t^3\right)\right)^{-2} \Sigma_{VY_{-1}}\right),
\end{aligned}
$$

where the limiting result follows from Ibragimov and Linnik (1971, Theorem 18.5.3). ■
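The ratio of sample moments in the final display of the proof suggests a simple numerical sketch of $\hat{\alpha}_{IV}$ in the just-identified case. The code below is a minimal illustration under stated assumptions, not the paper's implementation: $\hat{\gamma}$ is taken to be the sample mean of $Y_t^2$, a single instrument $Y_{t-1}$ is used, and the innovations are a centered exponential draw (mean zero, unit variance, negatively skewed) standing in for Hansen's skewed student's-t density. The names `simulate_garch11` and `alpha_iv` are hypothetical.

```python
import math
import random


def simulate_garch11(n, omega, alpha, beta, seed=0, burn_in=200):
    """Simulate Y_t = sigma_t * eps_t with sigma_t^2 = omega + alpha*Y_{t-1}^2 + beta*sigma_{t-1}^2.

    Assumption: eps_t = 1 - Exp(1), a stand-in skewed innovation with mean 0
    and variance 1 (not the Hansen (1994) density used in the paper).
    """
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    y = []
    for _ in range(n + burn_in):
        eps = 1.0 - rng.expovariate(1.0)
        y_t = math.sqrt(sigma2) * eps
        y.append(y_t)
        sigma2 = omega + alpha * y_t * y_t + beta * sigma2
    return y[burn_in:]  # drop initial observations to limit start-up effects


def alpha_iv(y):
    """Ratio sum(X_hat_t * Y_{t-1}) / sum(X_hat_{t-1} * Y_{t-1}), with X_hat_t = Y_t^2 - gamma_hat."""
    gamma_hat = sum(v * v for v in y) / len(y)  # assumed estimator of gamma_0
    x_hat = [v * v - gamma_hat for v in y]
    num = sum(x_hat[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(x_hat[t - 1] * y[t - 1] for t in range(1, len(y)))
    return num / den


# Illustrative run using the parameter values in (23).
y = simulate_garch11(n=5000, omega=0.005, alpha=0.10, beta=0.80, seed=1)
estimate = alpha_iv(y)
print("alpha_IV estimate:", estimate)
```

Because the denominator estimates $E(Y_t^3)$, the ratio is only informative when the innovations are skewed; with symmetric innovations the denominator is centered at zero and the estimate becomes erratic.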

Proof of Theorem 5: For notational ease, let $\hat{\beta}_{IV} = \hat{\beta}_{IV}(\hat{\phi})$. Given (24),

$$\hat{R}_t = R_t - (\hat{\phi} - \phi_0) X_{t-1} - (\hat{\gamma} - \gamma_0)(1 - \hat{\phi}). \qquad (30)$$

If $\hat{\phi} \xrightarrow{a.s.} \phi_0$ (as is the case when $\hat{\phi} = \hat{\phi}_{IV}$, see Theorem 13 in the Supplemental Appendix), then, given that $\{Y_t\}$ is strongly mixing,

$$
\begin{aligned}
n^{-1}\sum_t \hat{R}_{t-1} Y_{t-1} &= n^{-1}\sum_t R_{t-1} Y_{t-1} - (\hat{\phi} - \phi_0)\, n^{-1}\sum_t X_{t-2} Y_{t-1} - (\hat{\gamma} - \gamma_0)(1 - \hat{\phi})\, n^{-1}\sum_t Y_{t-1} \\
&\xrightarrow{a.s.} E\left(W_{t-1} Y_{t-1}\right)
\end{aligned}
$$

and

$$
\begin{aligned}
n^{-1}\sum_t \hat{R}_t Y_{t-1} &= n^{-1}\sum_t R_t Y_{t-1} - (\hat{\phi} - \phi_0)\, n^{-1}\sum_t X_{t-1} Y_{t-1} - (\hat{\gamma} - \gamma_0)(1 - \hat{\phi})\, n^{-1}\sum_t Y_{t-1} \\
&\xrightarrow{a.s.} -\beta_0 E\left(W_{t-1} Y_{t-1}\right)
\end{aligned}
$$

by the Ergodic Theorem, since also given (27),

$$
\begin{aligned}
E\left(R_{t-1} Y_{t-1}\right) &= E\left(X_{t-1} Y_{t-1}\right) - \phi_0 E\left(X_{t-2} Y_{t-1}\right) \\
&= E\left(X_{t-1} Y_{t-1}\right)
\end{aligned}
$$

and

$$
\begin{aligned}
E\left(R_t Y_{t-1}\right) &= E\left(X_t Y_{t-1}\right) - \phi_0 E\left(X_{t-1} Y_{t-1}\right) \\
&= E\left(Y_t^2 Y_{t-1}\right) - \phi_0 E\left(Y_{t-1}^2 Y_{t-1}\right) \\
&= E\left(\sigma_t^2 Y_{t-1}\right) - \phi_0 E\left(W_{t-1} Y_{t-1}\right) \\
&= \alpha_0 E\left(Y_{t-1}^2 Y_{t-1}\right) - \phi_0 E\left(W_{t-1} Y_{t-1}\right).
\end{aligned}
$$

Next, since

$$\hat{\beta}_{IV} = -\hat{G}\left( n^{-1}\sum_t R_t Y_{t-1} - (\hat{\phi} - \phi_0)\, n^{-1}\sum_t X_{t-1} Y_{t-1} + o_p(1) \right),$$

then (cid:12) (cid:12) = G n 1 R Y E R Y + (cid:30) (cid:30) G n 1 X Y IV 0 (cid:0) t t 1 t t 1 0 (cid:0) t 1 t 1 (cid:0) (cid:0) (cid:18) t (cid:0) (cid:0) (cid:0) (cid:19) (cid:0) (cid:18) t (cid:0) (cid:0) (cid:19) (cid:16) (cid:17) P (cid:0) (cid:1) P b bG G E R Y b b 0 t t 1 (cid:0) (cid:0) (cid:0) (cid:16) (cid:17) (cid:0) (cid:1) = (cid:30) b(cid:30) G n 1 X Y E X Y (cid:30) n 1 X Y E X Y +o (1) 0 (cid:0) t t 1 t t 1 0 (cid:0) t 1 t 1 t 1 t 1 p (cid:0) (cid:0) (cid:18) t (cid:0) (cid:0) (cid:0) (cid:0) t (cid:0) (cid:0) (cid:0) (cid:0) (cid:0) (cid:19) (cid:16) (cid:17) P (cid:0) (cid:1) P (cid:0) (cid:1) b b suchthat na 3 (cid:12) (cid:12) = na 3 (cid:30) (cid:30) (31) (cid:0)n IV 0 (cid:0)n 0 (cid:0) (cid:0) (cid:16) (cid:17) (cid:16) (cid:17) b G ab3 Y2Y E Y2Y (cid:30) a 3 Y3 E Y3 +o (1) (cid:0)n t t 1 t t 1 0 (cid:0)n t 1 t 1 p (cid:0) (cid:18) t (cid:0) (cid:0) (cid:0) (cid:0) t (cid:0) (cid:0) (cid:0) (cid:19) d b (cid:11) 1 F P S+G (cid:11) (cid:30) V (cid:0) V (cid:1) (cid:12) V P ; (cid:0) (cid:1) (cid:0)! (cid:0)0 0 0 0 0 0;Y (cid:0) 2;y (cid:0) 0 1;y (cid:0) (cid:0) (cid:0) (cid:1)(cid:1)(cid:1) d wherethislastequalityfollowsgiven(29),and" "followsgivenLemmas11and12. Thatthislimit (cid:0)! isjointly((cid:20) =3) stablefollowsfromSamorodnitskyandTaqqu(1994,Theorem2.1.5(c)). Finally, 0 (cid:0) n 1 U Y pn (cid:12) (cid:12) = pn (cid:0) (cid:0) t t t (cid:0) 1 + (cid:30) (cid:30) E X t 1 Y t 1 +o (1) IV (cid:0) 0 0 E RPY (cid:0) 0 E R (cid:0) Y (cid:0) p 1 t t 1 (cid:0) t t 1 (cid:1) (cid:16) (cid:17) (cid:0) (cid:16) (cid:17) (cid:0) b @ (cid:0) (cid:1) b (cid:0) (cid:1) A = pn E Y t 3 (cid:0) 1 n (cid:0) 1 U t Y t 1 + (cid:30) (cid:30) 0 +o p (1) (cid:18) (cid:0) (cid:18) (cid:0) t (cid:0) (cid:19) (cid:0) (cid:19) (cid:16) (cid:17) d (cid:0) (cid:1) P N 0; (cid:6) ; b (cid:12) (cid:0)! (cid:0) (cid:1) d where " " follows using the CLT from the proof of Theorem 1 as well as Shao and Zhou (2010, (cid:0)! 
Theorem 1). ∎

Proof of Corollary 8

Almost sure convergence is established in the proof of Theorem 5. From (31),

$$n a_n^{-3}\left(\hat{\beta}_{IV}(\hat{\phi}) - \beta_0\right) = n a_n^{-3}\left(\hat{\phi} - \phi_0\right) + O_p\left(n a_n^{-3}\right),$$

where the second term on the right-hand side of the equality follows from Lemmas 11 and 12. For the first term,

$$n a_n^{-3}\left(\hat{\phi} - \phi_0\right) = n^{\frac{\kappa_0 - 3}{\kappa_0} - l}\; n^{l}\left(\hat{\phi} - \phi_0\right) = o_p(1). \tag{32}$$

Lemmas 11 and 12, again, then establish (17). ∎

Proof of Corollary 10

Almost sure convergence follows from the proof of Theorem 5 here and the proof of Theorem 13 in the Supplemental Appendix. Given (18),

$$n a_n^{-3}\left(\hat{\alpha}_{IV}(\hat{\phi}) - \alpha_0\right) = n a_n^{-3}\left(\hat{\phi} - \phi_0\right) - n a_n^{-3}\left(\hat{\beta}_{IV}(\hat{\phi}) - \beta_0\right). \tag{33}$$

If $\hat{\phi} = \hat{\phi}_{IV}$, then (19) is established by Theorem 2 here and Theorem 13 in the Supplemental Appendix, noting that the shared (asymptotic) dependence of $\hat{\phi}_{IV}$ and $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ on $S$ cancels out. Lastly, from (33) and given (32), $\hat{\alpha}_{IV}(\hat{\phi})$ shares the same distributional limit (excluding a sign change) with $\hat{\beta}_{IV}(\hat{\phi})$. ∎

Proof of Theorem 11

Almost sure convergence follows from the proof of Theorem 5. Given (30) and (13),

$$\begin{aligned}
\hat{\beta}_{IV}(\hat{\phi}) &= -\hat{G}\left(n^{-1}\sum_t \hat{R}_t Y_{t-1}\right) \qquad (34) \\
&= \beta_0 - \hat{G}\left(n^{-1}\sum_t U_t Y_{t-1}\right) + \left(\hat{\phi} - \phi_0\right)\hat{G}\left(n^{-1}\sum_t X_{t-1}Y_{t-1}\right) + o_p(1).
\end{aligned}$$

Recalling also that $R_t = X_t - \phi_0 X_{t-1}$, from the proof of Theorem 5,

$$\begin{aligned}
n^{-1}\sum_t \hat{R}_{t-1}Y_{t-1} &= n^{-1}\sum_t X_{t-1}Y_{t-1} - \hat{\phi}\, n^{-1}\sum_t X_{t-2}Y_{t-1} - (\hat{\gamma} - \gamma_0)(1 - \hat{\phi})\, n^{-1}\sum_t Y_{t-1} \\
&= n^{-1}\sum_t X_{t-1}Y_{t-1} + o_p(1),
\end{aligned}$$

in which case,

$$\hat{G}\left(n^{-1}\sum_t X_{t-1}Y_{t-1}\right) = 1, \qquad n \to \infty.$$

Next, given the definitions of $V_t$ and $U_t$ in (5) and (13), respectively,

$$U_t = -\beta_0^2 W_{t-2} + \beta_0 W_{t-1} + V_t.$$

Consequently,

$$\begin{aligned}
\hat{G}\left(n^{-1}\sum_t U_t Y_{t-1}\right) &= \frac{n^{-1}\sum_t V_t Y_{t-1} + \beta_0\, n^{-1}\sum_t W_{t-1}Y_{t-1} + o_p(1)}{n^{-1}\sum_t X_{t-1}Y_{t-1} + o_p(1)} \\
&= \frac{n^{-1}\sum_t V_t Y_{t-1}}{n^{-1}\sum_t X_{t-1}Y_{t-1}} + \beta_0 + o_p(1) \\
&= \left(n^{-1}\sum_t X_{t-1}Y_{t-1}\right)^{-1} \Big\{ n^{-1}\sum_t V_t Y_{t-1} - E(V_t Y_{t-1}) + \beta_0\Big(n^{-1}\sum_t X_{t-1}Y_{t-1} - E(W_{t-1}Y_{t-1})\Big) \Big\} \\
&= E\left(Y_t^3\right)^{-1}\left(n^{-1}\sum_t V_t Y_{t-1} - E(V_t Y_{t-1})\right) + o_p(1) \\
&= \left(\hat{\alpha}_{IV} - \alpha_0\right),
\end{aligned}$$

where given (5), the third equality relies on $E(V_t Y_{t-1}) = -\beta_0 E(W_{t-1}Y_{t-1})$ and $E(X_{t-1}Y_{t-1}) = E(W_{t-1}Y_{t-1})$, and the fourth equality follows from the proof of Theorem 1. Then, given (34),

$$\begin{aligned}
\sqrt{n}\left(\hat{\beta}_{IV}(\hat{\phi}) - \beta_0\right) &= \sqrt{n}\left(\hat{\phi} - \phi_0\right) - \sqrt{n}\left(\hat{\alpha}_{IV} - \alpha_0\right) + o_p(1) \qquad (35) \\
&= \sqrt{n}\left(\hat{\beta} - \beta_0\right) + \sqrt{n}\left(\hat{\alpha} - \alpha_0\right) - \sqrt{n}\left(\hat{\alpha}_{IV} - \alpha_0\right) + o_p(1).
\end{aligned}$$

Since $\sqrt{n}(\hat{\theta} - \theta_0) \xrightarrow{d} N(0, \Sigma_\theta)$ by assumption, and $\sqrt{n}(\hat{\alpha}_{IV} - \alpha_0) \xrightarrow{d} N(0, \Sigma_{*\alpha})$ by Theorem 1,

$$\sqrt{n}\left(\hat{\beta}_{IV}(\hat{\phi}) - \beta_0\right) \xrightarrow{d} N\left(0, \Sigma_{*\beta}\right),$$

where

$$\begin{aligned}
\Sigma_{*\beta} &= \Sigma_\beta + \Sigma_\alpha + 2\Sigma_{\alpha,\beta} + \Sigma_{*\alpha} - 2\left(\Omega_{*\alpha} + \Sigma_{*\alpha,\beta}\right) \\
&< \Sigma_\beta + \Sigma_{*\alpha} - 2\left(\Omega_{*\alpha} + \Sigma_{*\alpha,\beta}\right) \\
&< \Sigma_\beta. \qquad ∎
\end{aligned}$$

Proof of Corollary 13

From (18),

$$\begin{aligned}
\sqrt{n}\left(\hat{\alpha}_{IV}(\hat{\phi}) - \alpha_0\right) &= \sqrt{n}\left(\hat{\phi} - \hat{\beta}_{IV}(\hat{\phi}) - \alpha_0\right) \\
&= \sqrt{n}\left(\hat{\phi} - \hat{\beta}_{IV}(\hat{\phi}) - \alpha_0 - \beta_0 + \beta_0\right) \\
&= \sqrt{n}\left(\hat{\phi} - \phi_0\right) - \sqrt{n}\left(\hat{\beta}_{IV}(\hat{\phi}) - \beta_0\right) \\
&= \sqrt{n}\left(\hat{\alpha}_{IV} - \alpha_0\right) + o_p(1),
\end{aligned}$$

where the final equality follows from (35). Then from Theorem 1,

$$\sqrt{n}\left(\hat{\alpha}_{IV}(\hat{\phi}) - \alpha_0\right) \xrightarrow{d} N\left(0, \Sigma_{*\alpha}\right),$$

which completes the proof. ∎
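The moment calculations used in these proofs imply $E(Y_t^2 Y_{t-1}) = \alpha_0 E(Y_{t-1}^3)$ when the innovations are skewed, so a ratio of the two sample cross-moments gives a closed-form estimate of $\alpha_0$. The Python sketch below only illustrates that ratio under stated assumptions: the function name `alpha_iv_sketch` is hypothetical, $Y_t^2$ is centered at its sample mean, and a single lag serves as the instrument. It is not claimed to reproduce the paper's estimator in every detail.

```python
def alpha_iv_sketch(y):
    """Sample analogue of E(X_t Y_{t-1}) / E(X_{t-1} Y_{t-1}).

    Assumption (not from the source): X_t is Y_t^2 centered at its sample mean.
    """
    n = len(y)
    if n < 3:
        raise ValueError("need at least three observations")
    mean_sq = sum(v * v for v in y) / n
    x = [v * v - mean_sq for v in y]
    # Numerator pairs X_t with the lagged level Y_{t-1} (the instrument).
    num = sum(x[t] * y[t - 1] for t in range(1, n))
    # Denominator pairs X_{t-1} with the same instrument Y_{t-1}.
    den = sum(x[t - 1] * y[t - 1] for t in range(1, n))
    return num / den


# Tiny hand-checkable example: X = [-1, 2, -1], num = 4, den = -5.
print(alpha_iv_sketch([1.0, -2.0, 1.0]))  # -0.8
```

The denominator is a sample third moment, so the ratio is only informative when the return series is skewed; with symmetric data it is close to a division by zero.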

TABLE 1: Skewness Estimates

| freq. | JPY Returns obs. | JPY Returns skew. | SPX Returns obs. | SPX Returns skew. | DJIA Returns obs. | DJIA Returns skew. |
|---|---|---|---|---|---|---|
| 1-min | 174,997 | -2.68 (0.01) | 46,551 | -1.75 (0.01) | 46,557 | -1.25 (0.01) |
| 5-min | 35,028 | -1.94 (0.01) | 9,312 | -3.17 (0.03) | 9,315 | -2.68 (0.03) |
| 10-min | 17,523 | -1.51 (0.02) | | | | |
| 15-min | 11,685 | -3.10 (0.02) | | | | |
| 20-min | 8,766 | -2.10 (0.03) | | | | |

Notes to Table 1. The data source is Bloomberg Finance LP. JPY is the Yen/USD exchange rate. SPX and DJIA are the S&P 500 and Dow Jones Industrial Average, respectively. The date range for all return series is 7/19/2015–12/31/2015. Skew is the standard estimate of the (unconditionally) standardized third moment. Standard errors for the skewness, measured against the null of normality, are in parentheses.

TABLE 2: Simulation Designs

| $\lambda$ | skew. ($\eta = 64.5$) | $\kappa$ ($\eta = 64.5$) | skew. ($\eta = 8.1$) | $\kappa$ ($\eta = 8.1$) | skew. ($\eta = 4.5$) | $\kappa$ ($\eta = 4.5$) | skew. ($\eta = 3.5$) | $\kappa$ ($\eta = 3.5$) |
|---|---|---|---|---|---|---|---|---|
| -0.20 | -0.34 | 7.25 | -0.56 | 5.15 | -1.10 | 3.76 | -1.92 | 3.14 |
| -0.40 | -0.65 | 6.60 | -1.06 | 4.73 | -2.05 | 3.53 | -3.50 | 2.98 |
| -0.80 | -1.03 | 6.08 | -1.67 | 4.37 | -3.18 | 3.29 | -5.26 | 2.81 |

Notes to Table 2. Reported are the skewness statistics and tail index values for the Monte Carlo simulation designs that study the linear GARCH(1,1) model when $\omega = 0.005$, $\alpha = 0.10$, and $\beta = 0.80$. The rescaled errors from this model follow the skewed Student's t density of Hansen (1994), normalized so that $E(\epsilon_t) = 0$ and $E(\epsilon_t^2) = 1$. This density has two parameters, $\lambda$ and $\eta$, with the former governing skewness, the latter governing the tails, and up to the $\eta$th moment being well defined. Both the skewness statistics and tail index values $\kappa$ for the simulated raw returns are themselves determined through simulation as the mean estimate from $\{Y_t\}_{t=1}^{10{,}000}$ across 10,000 trials for the given design. The estimator for $\kappa$ is Hill (1975) with a constant threshold of 0.5%.
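The two summary statistics reported in Tables 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the author's code: the function names are hypothetical, the skewness standard error uses the textbook value $\sqrt{6/n}$ that holds under normality (which reproduces the parenthesized values in Table 1 after rounding), and the Hill (1975) estimator is applied to absolute returns with the top 0.5% of ranked observations, since the source does not say which tail is used.

```python
import math


def skewness(y):
    """Standardized third sample moment."""
    n = len(y)
    mean = sum(y) / n
    m2 = sum((v - mean) ** 2 for v in y) / n
    m3 = sum((v - mean) ** 3 for v in y) / n
    return m3 / m2 ** 1.5


def skewness_se_under_normality(n):
    """Asymptotic standard error of sample skewness when the data are normal."""
    return math.sqrt(6.0 / n)


def hill_tail_index(y, tail_fraction=0.005):
    """Hill estimate of the tail index from the k largest absolute values.

    Assumption (not from the source): absolute values are used, and
    k = floor(tail_fraction * n) upper order statistics enter the estimate.
    """
    ranked = sorted((abs(v) for v in y), reverse=True)
    k = int(tail_fraction * len(ranked))
    if k < 1:
        raise ValueError("sample too small for the chosen tail fraction")
    threshold = ranked[k]  # the (k+1)-th largest observation
    mean_log_excess = sum(math.log(v / threshold) for v in ranked[:k]) / k
    return 1.0 / mean_log_excess
```

For example, `skewness_se_under_normality(9312)` rounds to 0.03, matching the 5-min SPX entry in Table 1.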

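TABLE 3: Large Sample Results I

All entries are efficiency ratios.

I. $\eta = 64.5$

| est. | m | rmse ($\lambda = -0.20$) | mae ($\lambda = -0.20$) | mdae ($\lambda = -0.20$) | rmse ($\lambda = -0.40$) | mae ($\lambda = -0.40$) | mdae ($\lambda = -0.40$) | rmse ($\lambda = -0.80$) | mae ($\lambda = -0.80$) | mdae ($\lambda = -0.80$) |
|---|---|---|---|---|---|---|---|---|---|---|
| $\hat{\alpha}_{IV}$ | 1 | 7.35 | 7.27 | 7.22 | 4.23 | 4.17 | 4.09 | 3.15 | 3.05 | 2.99 |
| $\hat{\alpha}_{KL}$ | 4 | 3.00 | 2.85 | 2.72 | 3.46 | 3.19 | 2.98 | 4.23 | 3.76 | 3.41 |
| $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 8.16 | 8.12 | 8.03 | 4.27 | 4.24 | 4.22 | 2.71 | 2.68 | 2.65 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 20 | 4.96 | 4.87 | 4.78 | 4.04 | 3.84 | 3.66 | 4.66 | 4.13 | 3.71 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 10 | 4.80 | 4.72 | 4.64 | 3.83 | 3.59 | 3.41 | 4.36 | 3.73 | 3.33 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 5 | 5.11 | 5.05 | 4.98 | 4.19 | 4.03 | 3.85 | 4.67 | 4.21 | 3.88 |
| $\hat{\beta}_{KL}$ | 4 | 4.49 | 4.39 | 4.25 | 5.10 | 4.92 | 4.71 | 6.08 | 5.77 | 5.40 |
| $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 3.91 | 3.90 | 3.88 | 2.18 | 2.17 | 2.16 | 1.54 | 1.53 | 1.50 |

II. $\eta = 8.1$

| est. | m | rmse ($\lambda = -0.20$) | mae ($\lambda = -0.20$) | mdae ($\lambda = -0.20$) | rmse ($\lambda = -0.40$) | mae ($\lambda = -0.40$) | mdae ($\lambda = -0.40$) | rmse ($\lambda = -0.80$) | mae ($\lambda = -0.80$) | mdae ($\lambda = -0.80$) |
|---|---|---|---|---|---|---|---|---|---|---|
| $\hat{\alpha}_{IV}$ | 1 | 7.14 | 6.82 | 6.61 | 4.74 | 4.31 | 4.02 | 4.24 | 3.53 | 3.25 |
| $\hat{\alpha}_{KL}$ | 4 | 5.65 | 4.96 | 4.33 | 6.21 | 5.43 | 4.68 | 6.59 | 5.67 | 5.06 |
| $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 7.89 | 7.52 | 7.25 | 4.63 | 4.25 | 3.95 | 3.46 | 3.03 | 2.74 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 20 | 8.94 | 7.40 | 6.10 | 10.69 | 8.30 | 6.17 | 13.42 | 10.94 | 8.32 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 10 | 8.21 | 6.76 | 5.64 | 9.64 | 7.20 | 5.06 | 11.96 | 9.16 | 6.23 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 5 | 8.49 | 7.18 | 6.28 | 9.32 | 7.27 | 5.61 | 11.13 | 8.55 | 6.16 |
| $\hat{\beta}_{KL}$ | 4 | 7.86 | 7.40 | 6.78 | 8.50 | 8.00 | 7.33 | 9.06 | 8.50 | 8.02 |
| $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 4.07 | 3.85 | 3.65 | 2.49 | 2.29 | 2.12 | 2.02 | 1.79 | 1.64 |

TABLE 4: Large Sample Results II

All entries are efficiency ratios.

III. $\eta = 4.5$

| est. | m | rmse ($\lambda = -0.20$) | mae ($\lambda = -0.20$) | mdae ($\lambda = -0.20$) | rmse ($\lambda = -0.40$) | mae ($\lambda = -0.40$) | mdae ($\lambda = -0.40$) | rmse ($\lambda = -0.80$) | mae ($\lambda = -0.80$) | mdae ($\lambda = -0.80$) |
|---|---|---|---|---|---|---|---|---|---|---|
| $\hat{\alpha}_{IV}$ | 1 | 7.83 | 6.92 | 6.34 | 5.18 | 4.65 | 4.36 | 3.86 | 3.84 | 3.76 |
| $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 8.86 | 7.83 | 7.13 | 5.20 | 4.71 | 4.25 | 3.32 | 3.36 | 3.09 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 20 | 17.60 | 16.65 | 13.91 | 17.41 | 17.34 | 15.08 | 16.97 | 17.89 | 16.99 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 10 | 15.98 | 14.36 | 10.88 | 15.60 | 14.64 | 11.80 | 15.11 | 15.04 | 13.29 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 5 | 14.62 | 12.79 | 9.56 | 14.00 | 12.56 | 9.21 | 13.29 | 12.46 | 9.68 |
| $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 5.25 | 4.61 | 4.03 | 3.27 | 2.83 | 2.42 | 2.34 | 2.16 | 1.91 |

IV. $\eta = 3.5$

| est. | m | rmse ($\lambda = -0.20$) | mae ($\lambda = -0.20$) | mdae ($\lambda = -0.20$) | rmse ($\lambda = -0.40$) | mae ($\lambda = -0.40$) | mdae ($\lambda = -0.40$) | rmse ($\lambda = -0.80$) | mae ($\lambda = -0.80$) | mdae ($\lambda = -0.80$) |
|---|---|---|---|---|---|---|---|---|---|---|
| $\hat{\alpha}_{IV}$ | 1 | 3.87 | 5.30 | 5.21 | 2.57 | 3.51 | 3.87 | 1.83 | 2.82 | 3.45 |
| $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 4.29 | 5.94 | 5.71 | 2.63 | 3.45 | 3.58 | 1.52 | 2.33 | 2.59 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 20 | 13.97 | 16.72 | 18.18 | 12.73 | 15.80 | 17.85 | 11.65 | 14.82 | 17.50 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 10 | 12.63 | 14.38 | 14.58 | 11.45 | 13.49 | 14.40 | 10.49 | 12.69 | 14.21 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 5 | 11.23 | 12.15 | 10.93 | 10.11 | 11.20 | 10.44 | 9.12 | 10.33 | 10.20 |
| $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 4.16 | 4.07 | 3.75 | 2.43 | 2.45 | 2.28 | 1.69 | 1.80 | 1.76 |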

Notes to Tables 3–4. Simulations are conducted on sample sizes of $T = 100{,}000$ across 10,000 trials, where, within each trial, the first 200 observations are dropped to avoid initialization effects. The linear GARCH(1,1) model under study is parameterized as

$$\omega_0 = 0.005, \quad \alpha_0 = 0.10, \quad \beta_0 = 0.80.$$

The simple IV estimators are considered, along with the Kristensen and Linton (2006) estimator (KL) and the quasi-maximum likelihood estimator (QMLE), both of which serve as benchmarks. For the simple IV estimators, m is the number of lagged instruments used. The innovations from the GARCH(1,1) model follow Hansen's (1994) skewed Student's-t density, where

$$\lambda = -0.20, -0.40, -0.80; \quad \eta = 64.5, 8.1, 4.5, 3.5.$$

Larger absolute values of $\lambda$ correspond with more skewness (in absolute terms) in $\{Y_t\}_{t=1}^{T}$, while lower values of $\eta$ correspond with heavier tails in the simulated return sample (see the tail index values in Table 2). In the thin-tailed case of $\eta = 64.5$, $\hat{\alpha}_{IV}$, $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$, $\hat{\alpha}_{QMLE}$, $\hat{\beta}_{IV}(\hat{\phi}_{IV})$, $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$, and $\hat{\beta}_{QMLE}$ are all asymptotically normal, while $\hat{\alpha}_{KL}$ and $\hat{\beta}_{KL}$ (likely) are not.¹⁸ In the heavy-tailed cases of $\eta = 8.1, 4.5$, only $\hat{\alpha}_{QMLE}$ and $\hat{\beta}_{QMLE}$ are asymptotically normal. In the (very) heavy-tailed case of $\eta = 3.5$, none of the estimators are asymptotically normal.¹⁹ In the heavy-tailed cases of $\eta = 4.5, 3.5$, $\hat{\alpha}_{KL}$ and $\hat{\beta}_{KL}$ are not consistent; consequently, they are not considered in those cases.²⁰ Summary statistics for the simulations are the root mean squared error, mean absolute error, and median absolute error (each measured relative to the true parameter value) divided by the corresponding efficiency measure for the QMLE. These ratios are termed "efficiency ratios."

18 Necessary for $\hat{\alpha}_{KL}$ and $\hat{\beta}_{KL}$ to be asymptotically normal is $E(Y_t^8) < \infty$ (see Kristensen and Linton, 2006), which does not appear to be true, given the results in Table 2.

19 A necessary condition for $\hat{\alpha}_{QMLE}$ and $\hat{\beta}_{QMLE}$ to be asymptotically normal is $E(\epsilon_t^4) < \infty$ (see, e.g., Hall and Yao, 2003).

20 Necessary for consistency of $\hat{\alpha}_{KL}$ and $\hat{\beta}_{KL}$ is $E(Y_t^4) < \infty$, which (very likely) does not hold, given the results in Table 2.
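The efficiency ratios defined in these notes can be computed as follows. This is a small Python sketch with hypothetical function names; it assumes the inputs are the per-trial estimates of one parameter from a candidate estimator and from the QMLE benchmark. A ratio above 1 means the candidate's error measure is larger than the benchmark's.

```python
import math
import statistics


def error_summaries(estimates, true_value):
    """RMSE, MAE, and median absolute error around the true parameter value."""
    abs_errors = [abs(e - true_value) for e in estimates]
    n = len(abs_errors)
    return {
        "rmse": math.sqrt(sum(a * a for a in abs_errors) / n),
        "mae": sum(abs_errors) / n,
        "mdae": statistics.median(abs_errors),
    }


def efficiency_ratios(candidate, benchmark, true_value):
    """Each candidate error measure divided by the benchmark's counterpart."""
    cand = error_summaries(candidate, true_value)
    bench = error_summaries(benchmark, true_value)
    return {name: cand[name] / bench[name] for name in cand}


# Toy inputs: the candidate's errors are twice the benchmark's,
# so every ratio is (up to floating-point rounding) equal to 2.
ratios = efficiency_ratios([0.12, 0.08], [0.11, 0.09], true_value=0.10)
```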

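TABLE 5: Thin-Tailed Efficiency Comparisons (Large Sample)

The last three columns are efficiency ratios.

| $\lambda$ | $\eta$ | skew. | $\kappa$ | $\alpha$ | $\beta$ | $\phi$ | rmse | mae | mdae |
|---|---|---|---|---|---|---|---|---|---|
| -0.90 | 16.5 | -1.21 | 6.03 | 0.05 | 0.50 | 0.55 | 1.00 | 0.99 | 0.99 |
| -0.99 | 16.5 | -1.24 | 5.99 | 0.05 | 0.50 | 0.55 | 1.00 | 0.99 | 0.99 |
| -0.90 | 20.5 | -1.17 | 6.04 | 0.10 | 0.50 | 0.60 | 1.01 | 1.00 | 1.00 |
| -0.99 | 20.5 | -1.21 | 6.00 | 0.10 | 0.50 | 0.60 | 1.01 | 1.00 | 1.00 |
| -0.90 | 16.5 | -1.21 | 6.01 | 0.05 | 0.60 | 0.65 | 0.99 | 0.99 | 0.98 |
| -0.99 | 16.5 | -1.25 | 5.97 | 0.05 | 0.60 | 0.65 | 0.99 | 0.99 | 0.98 |
| -0.90 | 21.5 | -1.17 | 6.01 | 0.10 | 0.60 | 0.70 | 1.03 | 1.01 | 1.00 |
| -0.99 | 21.5 | -1.20 | 5.97 | 0.10 | 0.60 | 0.70 | 1.03 | 1.01 | 1.00 |
| -0.90 | 18.5 | -1.19 | 6.07 | 0.05 | 0.80 | 0.85 | 1.07 | 1.01 | 1.01 |
| -0.99 | 18.5 | -1.22 | 6.03 | 0.05 | 0.80 | 0.85 | 1.07 | 1.01 | 1.01 |
| -0.90 | 64.5 | -1.08 | 6.01 | 0.10 | 0.80 | 0.90 | 1.50 | 1.49 | 1.49 |
| -0.99 | 64.5 | -1.11 | 5.96 | 0.10 | 0.80 | 0.90 | 1.49 | 1.48 | 1.47 |
| -0.90 | 24.5 | -1.15 | 6.04 | 0.05 | 0.90 | 0.95 | 1.68 | 1.68 | 1.71 |
| -0.99 | 24.5 | -1.19 | 6.00 | 0.05 | 0.90 | 0.95 | 1.65 | 1.66 | 1.67 |

TABLE 6: Small Sample Results

All entries are efficiency ratios.

III. $\eta = 4.5$

| est. | m | rmse ($\lambda = -0.20$) | mae ($\lambda = -0.20$) | mdae ($\lambda = -0.20$) | rmse ($\lambda = -0.40$) | mae ($\lambda = -0.40$) | mdae ($\lambda = -0.40$) | rmse ($\lambda = -0.80$) | mae ($\lambda = -0.80$) | mdae ($\lambda = -0.80$) |
|---|---|---|---|---|---|---|---|---|---|---|
| $\hat{\alpha}_{IV}$ | 1 | 2.41 | 2.49 | 2.05 | 1.17 | 1.31 | 1.48 | 0.68 | 0.85 | 1.15 |
| $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 2.34 | 2.52 | 2.18 | 1.18 | 1.34 | 1.50 | 0.61 | 0.77 | 1.03 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 20 | 2.69 | 3.95 | 7.06 | 2.77 | 4.14 | 7.35 | 2.66 | 3.95 | 6.95 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 10 | 2.50 | 3.60 | 6.38 | 2.54 | 3.73 | 6.62 | 2.44 | 3.55 | 6.28 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 5 | 2.22 | 3.07 | 5.30 | 2.25 | 3.15 | 5.56 | 2.15 | 3.00 | 5.33 |
| $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 1.29 | 1.47 | 1.74 | 1.06 | 1.15 | 1.37 | 0.97 | 0.99 | 1.09 |

IV. $\eta = 3.5$

| est. | m | rmse ($\lambda = -0.20$) | mae ($\lambda = -0.20$) | mdae ($\lambda = -0.20$) | rmse ($\lambda = -0.40$) | mae ($\lambda = -0.40$) | mdae ($\lambda = -0.40$) | rmse ($\lambda = -0.80$) | mae ($\lambda = -0.80$) | mdae ($\lambda = -0.80$) |
|---|---|---|---|---|---|---|---|---|---|---|
| $\hat{\alpha}_{IV}$ | 1 | 1.63 | 1.73 | 1.62 | 0.86 | 0.98 | 1.20 | 0.54 | 0.68 | 1.01 |
| $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 1.60 | 1.79 | 1.76 | 0.82 | 0.97 | 1.19 | 0.45 | 0.58 | 0.85 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 20 | 2.51 | 3.61 | 6.61 | 2.55 | 3.74 | 6.70 | 2.43 | 3.54 | 6.29 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 10 | 2.35 | 3.33 | 6.07 | 2.37 | 3.42 | 6.16 | 2.25 | 3.24 | 5.80 |
| $\hat{\beta}_{IV}(\hat{\phi}_{IV})$ | 5 | 2.11 | 2.87 | 5.17 | 2.11 | 2.91 | 5.28 | 2.01 | 2.78 | 5.04 |
| $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ | 1 | 1.20 | 1.33 | 1.55 | 1.02 | 1.08 | 1.24 | 0.95 | 0.95 | 1.00 |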

Notes to Table 5. Simulations are conducted on sample sizes of $T = 100{,}000$ across 10,000 trials, where, within each trial, the first 200 observations are dropped to avoid initialization effects. The linear GARCH(1,1) model is studied under different parameter values for $\alpha_0$ and $\beta_0$ and different specifications of Hansen's (1994) skewed Student's-t density for the model's innovations (see the Table).²¹ The estimators considered are $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ and $\hat{\beta}_{QMLE}$. Different specifications of Hansen's (1994) skewed Student's-t density are selected to maximize the amount (in absolute terms) of skewness in $\{Y_t\}_{t=1}^{T}$, while maintaining asymptotic normality for $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$. "skew." and $\kappa$ are the skewness and tail index value of $\{Y_t\}_{t=1}^{T}$, respectively, under the given simulation design, while $\phi = \alpha + \beta$. Summary statistics for the simulations are the root mean squared error, mean absolute error, and median absolute error (each measured relative to the true parameter value) for $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$ divided by the corresponding efficiency measure for $\hat{\beta}_{QMLE}$, with these ratios being termed "efficiency ratios," as in Tables 3 and 4.

Notes to Table 6. Simulations are conducted on sample sizes of $T = 500$ across 10,000 trials, where, within each trial, the first 200 observations are dropped to avoid initialization effects. The linear GARCH(1,1) model under study is parameterized as

$$\omega_0 = 0.005, \quad \alpha_0 = 0.10, \quad \beta_0 = 0.80.$$

The simple IV estimators are considered (where m is the number of lagged instruments used to construct the estimator), along with the quasi-maximum likelihood estimator (QMLE), which serves as a benchmark. The innovations from the GARCH(1,1) model follow Hansen's (1994) skewed Student's-t density, where

$$\lambda = -0.20, -0.40, -0.80; \quad \eta = 4.5, 3.5.$$

Larger absolute values of $\lambda$ correspond with more skewness (in absolute terms) in $\{Y_t\}_{t=1}^{T}$, while lower values of $\eta$ correspond with heavier tails in the simulated return sample. In these small-sample experiments, only (very) heavy-tailed innovation densities are considered. Summary statistics for the simulations are the root mean squared error, mean absolute error, and median absolute error (each measured relative to the true parameter value) divided by the corresponding efficiency measure for the QMLE. These ratios are termed "efficiency ratios."

21 In all cases, $\omega_0 = 0.005$.
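The simulation design described in these notes (a linear GARCH(1,1) recursion with an initial stretch of 200 observations dropped) can be sketched in Python as below. Several choices are assumptions made only for illustration: the function name is hypothetical, the recursion starts from the unconditional variance $\omega_0/(1-\alpha_0-\beta_0)$, `n_obs` is read as the sample size retained after the burn-in, and the innovations are a centered, negated unit exponential (mean 0, variance 1, skewness −2) standing in for Hansen's (1994) skewed Student's-t density, which is not implemented here.

```python
import math
import random


def simulate_garch11(n_obs, omega=0.005, alpha=0.10, beta=0.80,
                     burn_in=200, seed=0):
    """Simulate Y_t = sigma_t * eps_t with sigma_t^2 = omega + alpha*Y_{t-1}^2 + beta*sigma_{t-1}^2."""
    rng = random.Random(seed)

    def draw_innovation():
        # Stand-in skewed innovation (assumption): 1 - Exp(1) has mean 0,
        # variance 1, and negative skewness; it is NOT Hansen's density.
        return 1.0 - rng.expovariate(1.0)

    sigma2 = omega / (1.0 - alpha - beta)  # unconditional variance as start value
    y = math.sqrt(sigma2) * draw_innovation()
    path = []
    for _ in range(n_obs + burn_in):
        path.append(y)
        sigma2 = omega + alpha * y * y + beta * sigma2
        y = math.sqrt(sigma2) * draw_innovation()
    # Drop the first burn_in observations to limit initialization effects.
    return path[burn_in:]


sample = simulate_garch11(500)
```

Repeating the call with different seeds and feeding each path to an estimator gives the per-trial estimates from which efficiency ratios are formed.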

TABLE 7: GARCH Model Estimates I

| freq. | obs. | skew. | tail index | para. | TSLS, m = 5 | TSLS, m = 10 | TSLS, $\hat{\phi}_{QMLE}$ | QMLE |
|---|---|---|---|---|---|---|---|---|
| 15-min | 12,680 | -1.27 (0.02) | 2.93 | $\alpha$ | 0.15 | 0.15 | 0.19 | 0.20 (0.06, 0.34) |
| | | | | $\phi$ | 0.32 | 0.25 | 0.90 | 0.94 |
| | | | | $\beta$ | 0.17 | 0.10 | 0.71 | 0.74 (0.58, 0.89) |
| 10-min | 19,021 | -1.33 (0.02) | 2.83 | $\alpha$ | 0.19 | 0.19 | 0.17 | 0.12 (0.05, 0.20) |
| | | | | $\phi$ | 0.53 | 0.53 | 0.94 | 0.95 |
| | | | | $\beta$ | 0.35 | 0.35 | 0.77 | 0.83 (0.73, 0.93) |

TABLE 8: GARCH Model Estimates II

| freq. | obs. | skew. | tail index | para. | TSLS, m = 5 | TSLS, m = 10 | TSLS, $\hat{\phi}_{QMLE}$ | QMLE |
|---|---|---|---|---|---|---|---|---|
| 5-min | 38,035 | -1.40 (0.01) | 2.83 | $\alpha$ | 0.09 | 0.09 | 0.15 | 0.12 (0.08, 0.17) |
| | | | | $\phi$ | 0.77 | 0.78 | 0.96 | 0.99 |
| | | | | $\beta$ | 0.64 | 0.65 | 0.82 | 0.87 (0.82, 0.91) |
| 1-min | 190,058 | -1.81 (0.01) | 3.05 | $\alpha$ | 0.03 | 0.03 | 0.03 | 0.13 (0.08, 0.18) |
| | | | | $\phi$ | 0.84 | 0.91 | 1.04 | 1.00 |
| | | | | $\beta$ | 0.86 | 0.94 | 1.01 | 0.87 (0.82, 0.91) |

Notes to Tables 7–8. All data ranges from January 1, 2015 through July 1, 2015 and sources to Bloomberg Finance LP. Log return data is intra-daily at the stated frequency measured from traded Japanese Yen exchange rates relative to the USD. Using the approach in Hecq, Laurent, and Palm (2012, Eq. 4.1), all log returns are pre-filtered for the U-shaped intra-day periodicity noted by Anderson and Bollerslev (1997). Skew is the unconditional skewness of the log returns. Moving from left to right in the columns under the TSLS heading, the first two columns show $\hat{\alpha}_{IV}$, $\hat{\phi}_{IV}$, and $\hat{\beta}_{IV}(\hat{\phi}_{IV})$, respectively. For $\hat{\alpha}_{IV}$, it is always the case that m = 1 (where m denotes the number of lags used as instruments). For $\hat{\phi}_{IV}$ and $\hat{\beta}_{IV}(\hat{\phi}_{IV})$, it is either the case that m = 5 or m = 10. The third column shows $\hat{\alpha}_{IV}(\hat{\phi}_{QMLE})$ and $\hat{\beta}_{IV}(\hat{\phi}_{QMLE})$, where it is also always the case that m = 1, and, additionally, $\hat{\phi} = \hat{\alpha}_{IV}(\hat{\phi}_{QMLE}) + \hat{\beta}_{IV}(\hat{\phi}_{QMLE})$. The final column of the Table shows estimates from the QMLE, together with the lower and upper bounds of their associated 95% confidence interval.

FIGURE 1. Panel A: S&P 500 Raw Return Tail Index Estimates. Panel B: S&P 500 GARCH Innovation Tail Index Estimates. Date range (both panels): Jan. 2, 1990 to Aug. 20, 2018. Horizontal axis (both panels): number of tail observations, with ticks from 58 to 707; vertical axis ticks run from 1.00 to 7.50.

Notes to Figure 1: Hill (1975) tail index estimates at varying thresholds are depicted for S&P 500 Index log returns and the innovations from a GARCH(1,1) model applied to these returns. The thresholds are determined as proportions of the ranked data ranging from 1% to 10%. The underlying price data is daily and sources to Bloomberg Finance LP.

Cite this document
APA
Todd Prono (2019). When Simplicity Offers a Benefit, Not a Cost: Closed-Form Estimation of the GARCH(1,1) Model that Enhances the Efficiency of Quasi-Maximum Likelihood (FEDS 2019-030). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2019-030
BibTeX
@techreport{wtfs_feds_2019_030,
  author = {Todd Prono},
  title = {When Simplicity Offers a Benefit, Not a Cost: Closed-Form Estimation of the GARCH(1,1) Model that Enhances the Efficiency of Quasi-Maximum Likelihood},
  type = {Finance and Economics Discussion Series},
  number = {2019-030},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2019},
  url = {https://whenthefedspeaks.com/doc/feds_2019-030},
  abstract = {Simple, multi-step estimators are developed for the popular GARCH(1,1) model, where these estimators are either available entirely in closed form or dependent upon a preliminary estimate from, for example, quasi-maximum likelihood. Identification sources to asymmetry in the model's innovations, casting skewness as an instrument in a linear, two-stage least squares estimator. Properties of regular variation coupled with point process theory establish the distributional limits of these estimators as stable, though highly non-Gaussian, with slow convergence rates relative to the √ n -case. Moment existence criteria necessary for these results are consistent with the heavy-tailed features of many financial returns. In light-tailed cases that support asymptotic normality for these simple estimators, conditions are discovered where the simple estimators can enhance the asymptotic efficiency of quasi-maximum likelihood estimation. In small samples, extensive Monte Carlo experiments reveal these efficiency enhancements to be available for (very) heavy tailed cases. Consequently, the proposed simple estimators are members of the class of multi-step estimators aimed at improving the efficiency of the quasi-maximum likelihood estimator. Accessible materials (.zip) | Appendix (PDF)},
}