A Coherent Framework for Predicting Emerging Market Credit Spreads with Support Vector Regression
Abstract
We propose a coherent framework using support vector regression (SVR) for generating and ranking a set of high quality models for predicting emerging market sovereign credit spreads. Our framework adapts a global optimization algorithm employing an hv-block cross-validation metric, pertinent for models with serially correlated economic variables, to produce robust sets of tuning parameters for SVR kernel functions. In contrast to previous approaches identifying a single "best" tuning parameter setting, a task that is pragmatically improbable to achieve in many applications, we proceed with a collection of tuning parameter candidates, employing the Model Confidence Set test to select the most accurate models from the collection of promising candidates. Using bond credit spread data for three large emerging market economies and an array of input variables motivated by economic theory, we apply our framework to identify relatively small sets of SVR models with superior out-of-sample forecasting performance. Benchmarking our SVR forecasts against random walk and conventional linear model forecasts provides evidence for the notably superior forecasting accuracy of SVR-based models. In contrast to routinely used linear model benchmarks, the SVR-based models can generate accurate forecasts using only a small set of input variables limited to the country-specific credit-spread-curve factors, lending some support to the rational expectation theory of the term structure in the context of emerging market credit spreads. Consequently, our evidence indicates a better ability of highly flexible SVR to capture investor expectations about future spreads reflected in today's credit spread curve.
Finance and Economics Discussion Series
Divisions of Research & Statistics and Monetary Affairs
Federal Reserve Board, Washington, D.C.

Gary Anderson and Alena Audzeyeva
2019-074

Please cite this paper as: Anderson, Gary, and Alena Audzeyeva (2019). “A Coherent Framework for Predicting Emerging Market Credit Spreads with Support Vector Regression,” Finance and Economics Discussion Series 2019-074. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2019.074.

NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.
Gary Anderson
Board of Governors of the Federal Reserve System, USA
Email: cemar.llc@gmail.com

Alena Audzeyeva
Keele Management School, Keele University, UK
Email: a.audzeyeva@keele.ac.uk

October 15, 2019

Correspondence to: Alena Audzeyeva, Keele University. The analysis and conclusions in this paper are those of the authors and do not indicate concurrence by members of the research staff or the Federal Reserve Board of Governors.
Keywords: Machine Learning; Support Vector Machine Regressions; Sovereign credit spreads; Emerging Markets; Out-of-sample predictability; Model Confidence Set.
JEL Classifications: G17; G15; C53.
Contents
1 Introduction
2 Data variables
  2.1 Emerging market credit spreads and spread curve factors
  2.2 Input variable selection
3 Forecasting framework
  3.1 Support vector regression
  3.2 Setting tuning parameters
    3.2.1 hv-Block cross-validation
    3.2.2 Multi-Sequential Number-Theoretic Optimization
  3.3 Model selection and evaluation
4 Empirical forecasting results
  4.1 Predictive ability of SVR models
  4.2 Further empirical observations
5 Conclusions
Appendix A Data variables: Summary statistics
Appendix B MSNTO: Implementation details
1 Introduction

The emerging economies bond market has become a core global asset class in the past decade, playing an increasingly important role in portfolio allocation and risk management decisions of international investors. Moreover, measures of sovereign credit spreads serve as benchmarks for pricing other domestic assets such as corporate bonds and credit derivatives. Yet, very little is known about their predictability in real-time, in sharp contrast to the voluminous literature on riskless bond yields that has evolved in recent years; some prominent examples include Fama and Bliss (1987), Diebold and Li (2006), Ang and Piazzesi (2003), and Ludvigson and Ng (2009). Furthermore, a key role played by emerging debt markets in global financial stability provides additional important motivation for developing accurate predictions of emerging market sovereign credit spreads.

However, recent empirical studies of sovereign emerging market credit spreads tend to focus on their predictability in-sample; see, for example, Longstaff et al. (2011) and Comelli (2012). Such evidence does not directly extend to predictions in real time. Furthermore, the few studies that attempt such forecasts document that credit spreads are difficult to predict out-of-sample (OOS). In particular, in a recent study Audzeyeva and Fuertes (2018) show that predictive models employing country-specific credit-spread-curve factors, namely, the level, slope and curvature, known to contain useful information for future yields in the context of riskless debt, cannot beat a random walk for emerging market bonds. They find that employing additional global and country-specific predictors improves the model predictive ability, but that even with these additional predictors the model-based forecasts cannot always outperform a random walk.

However, these forecasting results rely on assuming that expectations about future credit spreads are a linear function of the level, slope and curvature factors of today’s credit spread curve.
While this is a plausible assumption, also advocated, for example, in Diebold and Li (2006) for U.S. Treasury yields, it represents only one of many possible expectation mechanisms that, in fact, is likely to be more complex in nature for globally-traded emerging market bonds. In particular, it is well known that emerging market bond prices tend to be noisy and that their distributional properties change over time, triggered, for example, by a domestic or external credit event, or a global economic or financial market crisis. This complexity motivates the application of SVR for modeling and forecasting of such series. The key advantages of SVRs, which are data-driven, non-parametric models, are that (a) unlike linear models, they do not require strong a-priori assumptions about the relationship between the target variable and predictors, and (b) they are, by design, better able to allay the issue of over-fitting inherent in the standard multivariate linear regression techniques. Furthermore, SVRs have demonstrated superior performance in time series prediction relative to both conventional modeling approaches and alternative machine learning techniques such as neural networks; see Cao and Tay (2001) and Stasinakis et al. (2016) for comprehensive overviews of SVR financial modeling applications.

This study contributes, first, to the sparse literature on sovereign credit spread prediction by applying SVR for the OOS forecasting of credit spreads of large emerging market borrowers. To the best of our knowledge, this is the first study that applies the SVR methodology in the context of emerging bond markets. In doing so, we extend the analysis of linear predictive models underpinned by the assumption of linearly formed expectations in extant emerging market bond studies by permitting more complex expectation mechanisms afforded by highly-flexible SVR model specifications. Furthermore, we go one step beyond many SVR forecasting studies of financial market time series that, similar to technical analysis traders, tend to rely on predictive content in the historical data of the target variable alone for constructing input variables. Examples are Law and Shawe-Taylor (2017) who forecast the U.K. and U.S. based stock market indices, commodity futures, government bond yields and corporate CDS, Stasinakis et al. (2016) and Sermpinis et al. (2017b) who focus on predicting U.S. based commodity exchange traded funds (ETF) and European stock market ETFs, respectively.
In contrast, our predictive models are motivated by economic theory and as such are aligned with investment strategies of more sophisticated fundamental traders, employing as input variables an expanded array of predictors containing both global and domestic fundamentals.

Second, we contribute to the forecasting methodology by proposing a coherent framework using SVR for generating and ranking a robust set of high quality forecasting models. This contrasts with extant studies advocating the selection of one “best” predictive model, a task that is pragmatically improbable to achieve in many applications. To illustrate the issue, consider the methodology for developing an SVR model that entails two important stages. During the first stage, the modeler selects specific SVR kernels and must set a small number of tuning parameters that determine how well the model produced by the subsequent SVR optimization stage will characterize the data (Cao and Tay, 2001). While the methodology for the second, optimization stage is theoretically sound and relatively straightforward to implement, Stasinakis et al. (2016), Law and Shawe-Taylor (2017), Sermpinis et al. (2017a) and Sermpinis et al. (2017b) among others emphasize that SVR forecasting performance is highly sensitive to the kernel, or tuning, parameters selected at the first stage, pointing out that there is little formal guidance in the literature on how to set these parameters.

Lacking compelling theoretical guidance for setting tuning parameters, modelers tend to resort to applying an optimization technique, with grid search being the most frequent choice, in conjunction with a metric characterizing the goodness of fit, typically based on the forecast RMSE; see, for example, Min and Lee (2005), Ding et al. (2008) and Gunduz and Uhrig-Homburg (2011). Among the few studies employing global optimization techniques with their measure of fitness, Stasinakis et al. (2016) apply the Krill-Herd meta-heuristic optimization method to introduce Krill-Herd SVR whereas Sermpinis et al. (2017a) incorporate opposition based optimization to develop the reverse adaptive Krill global search method in a search for the best tuning parameter setting. Law and Shawe-Taylor (2017) apply a Bayesian approach that follows Gao (2002) in assuming the existence of a most likely SVR prediction function generated by a Gaussian process. However, the mapping from the data to the tuning parameters is highly non-linear and extremely challenging to compute. To simplify the problem, the authors employ a Taylor-series approximation for the relationship, performing global quasi-optimization to find the most likely values of tuning parameters.

A common assumption in all of these approaches is that they can identify with some degree of certainty one “best” set of tuning parameter values producing a single best SVR forecasting model for a given kernel.
However, the complexity of the objective function and the existence of many local optima make it unattainable for even global optimization techniques to distinguish with certainty between a number of potentially “best” solutions. The problem is further exacerbated by typically limited data sets available to the modelers. Law and Shawe-Taylor (2017) highlight the issue, proposing to calculate error bounds for their parameter values as a way to resolve it. However, their error bound calculations depend on a number of approximations, with the resulting accuracy requiring further analysis. As a result, there is no persuasive evidence that extant approaches can successfully identify a unique global solution among perhaps many plausible local alternatives.

Another issue faced by the modeler is that the standard SVR forecasting methodology may be unsuitable for applications involving serially correlated data series that tend to generate serially correlated forecast errors as is the case in our bond market application. In particular, Brabanter et al. (2011) show that standard cross-validation schemes can interpret the serial correlation as a high frequency relationship with small variance, leading to spurious parameter choices.[1]

To address the first methodological issue, contrary to previous SVR literature requiring the modeler to select one best parameter setting, our forecasting framework permits that a chosen SVR kernel can generate multiple profitable forecasts. The multiplicity of forecasts arises because an SVR kernel can accommodate a set of viable choices for tuning parameters. Next, we propose a coherent 3-step strategy for managing such multiplicity. Our approach first uses a robust global optimization algorithm, namely, multi-sequential number-theoretic optimization (MSNTO) put forward by Xu et al. (2005), in conjunction with hv-block cross validation of Racine (2000) to generate a set of promising model candidates. The advantage of the global optimization routine is that it is more robust than alternative methods such as a grid search against choosing bad local optima and does not require much guidance on what would be a good initial guess. Our choice of hv-block cross validation as a measure of fitness is motivated by its robustness for serially correlated series and forecast errors that are common in many economic and financial time series applications, addressing the second methodological issue in the SVR forecasting literature caused by the presence of serial correlation in the data. In step two, predictor models from step one are estimated using the standard SVR optimization methodology. In step three, we select a robust set of most accurate models from the collection of promising candidates from the previous step by applying the Model Confidence Set test of Hansen et al. (2011), known to be robust to data snooping, that permits identifying a subset of a group of models with the best forecast accuracy.
We apply our framework to a quarter-ahead OOS forecasting of credit spreads for three large and relatively mature emerging markets of Brazil, Mexico and Turkey. All three sovereign borrowers are major emerging market economies and members of the G20. In our predictive analysis, we employ the data set of Audzeyeva and Fuertes (2018). Our analysis employs SVR with linear, sigmoid, RBF and polynomial kernels to generate credit spread forecasts using sets of input variables that are motivated by economic theory. We find that our framework delivers a relatively small robust set of high-quality SVR forecasting models for each sovereign borrower. The model sets are characterized by country-specific preferred kernel functions. Our results provide evidence of notably superior forecasting accuracy of the highly-ranked SVR models relative to both alternative SVR specifications and standard benchmarks. We further find that our SVR models can deliver accurate credit spread forecasts with only a small set of input variables limited to the credit-spread-curve factors for Mexico and Turkey and the credit-spread-curve factors augmented by global yield-curve factors for Brazil, performing as well as or better than both other SVR and benchmark models using extended input sets.

[1] Among few studies aiming to address the issue of serial correlation, Bergmeir et al. (2018) argue that k-fold cross-validation that they apply to both generated and real-life commodity data sets can be adequate for some applications using non-parametric estimation that satisfy certain conditions. However, we find that our bond market application does not meet their requirements.

2 Data variables

2.1 Emerging market credit spreads and spread curve factors

The target variable in our predictability analysis is the credit spread on sovereign bonds of an emerging market country $c$

\[ y_{c,t}(\tau) \equiv Y_{c,t}(\tau) - Y_{US,t}(\tau), \tag{1} \]

where $Y_{c,t}(\tau)$ and $Y_{US,t}(\tau)$ are the time-$t$ yield to maturity on $\tau$-maturity zero-coupon bonds of emerging-market country $c$ and the U.S. Treasury, respectively. In our predictive analysis we set $\tau = 5$ years. The weekly frequency data set of Audzeyeva and Fuertes (2018) contains the yields to maturity on zero-coupon U.S. Treasuries and U.S. dollar-denominated Eurobonds of each country we used to calculate the country-specific credit spreads.[2]

[2] Audzeyeva and Fuertes (2018) extract yields on risky and riskless bonds from cross-sections of bond market prices for a given emerging market country and U.S. Treasury bonds, respectively, by following an established methodology that builds on seminal work of Fama and Bliss (1987), Svensson (1994) and Diebold and Li (2006).

Adopting the Nelson and Siegel (1987) representation permits expressing the credit spread on $\tau$-maturity zero-coupon bond of the emerging-market sovereign $c$ as a parsimonious function of its credit-spread-curve factors:

\[ y_{c,t}(\tau) = \beta_{c0,t} + \beta_{c1,t}\left(\frac{1 - e^{-\lambda_{c,t}\tau}}{\lambda_{c,t}\tau}\right) + \beta_{c2,t}\left(\frac{1 - e^{-\lambda_{c,t}\tau}}{\lambda_{c,t}\tau} - e^{-\lambda_{c,t}\tau}\right) \tag{2} \]

where $\beta_{c0,t}$, $\beta_{c1,t}$ and $\beta_{c2,t}$ are the level, slope and curvature factors, respectively. The country credit-spread-curve factors are available at a weekly frequency in our data set.[3]

Figure 1 plots the evolution of credit spread curves for Brazil, Mexico and Turkey during our December 2, 2008 to December 29, 2015 sample period while Table A1 in Appendix A gives summary statistics for the target variable, 5-year credit spreads, and the credit-spread-curve factors.[4] Credit spreads and credit-spread-curve factors of all countries exhibit stylized persistence.

Figure 1 further shows that although the dynamics of credit spreads exhibit common trends across the three emerging market economies, driven by various global market factors, country-specific variations are, nevertheless, apparent. Such variations primarily reflect differences in country-specific creditworthiness. In particular, in December 2008 Brazil and Mexico, both investment-grade BBB-rated by the S&P credit rating agency, exhibited lower credit spreads than Turkey that was rated as speculative-grade BB throughout the data sample period. While Mexico’s credit rating remained stable, with its spreads experiencing a long-term downward trend, Brazil’s spreads rose sharply with the S&P issuing a negative rating outlook for Brazil on June 6, 2013, followed by several negative rating changes that led to Brazil’s downgrade to speculative-grade rating BB on September 9, 2015. In Table A1, these differences in creditworthiness are reflected in the lower mean credit spread and spread volatility at 139 and 65 basis points, respectively, for Mexico, with the corresponding figures reaching 159 and 81 basis points for Brazil and even higher, 253 and 93 basis points, for Turkey.

2.2 Input variable selection

The selection of predictive variables for the input vector in our analysis is based on prior research on predictability of sovereign emerging market spreads. We employ predictive variables, motivated by economic theory, that have been previously analyzed in Audzeyeva and Fuertes (2018).
Accordingly, our Baseline model is rooted in the expectations theory of the term structure of interest rates of Sargent (1972) and Roll (1970). The main idea is that investor expectations about future credit spreads, reflected in today’s forward spreads, exploit all the available information. Consequently, today’s credit spread curve, which embeds forward credit spreads, ought to contain all relevant information for predicting future spreads. In line with this theory, the Baseline model employs as input variables level, slope and curvature of the country-specific credit spread curve, known to summarize the information content in today’s spread curve. Furthermore, as in Audzeyeva and Fuertes (2018), we test two alternative model specifications that augment the input vector in the Baseline model by (a) global macroeconomic variables, model G, and (b) both global and domestic macroeconomic variables, model GEM. Table 1 lists input variables in each predictive model and Table A1 in Appendix A provides summary statistics of the data variables.

[3] Audzeyeva and Fuertes (2018) extract the credit curve level, slope and curvature factors from weekly cross-sections of country-specific bond prices; see their paper for details.
[4] We use the data set constructed for Audzeyeva and Fuertes (2018) for Figure 1, Table A1 and all our training and validation calculations.

Figure 1: Emerging market credit spreads. Panels: (a) Brazil, (b) Mexico, (c) Turkey.

Table 1: Input variables in the forecasting models

Input variables                                           Baseline   G     GEM
Country spread curve factors (level, slope, curvature)       ✓       ✓      ✓
Global predictors
  US yield curve factors (level, slope, curvature)           -       ✓      ✓
  Volatility of US short-term rate                           -       ✓      ✓
Domestic predictors
  Country risk rating                                        -       -      ✓
  Trade balance level                                        -       -      ✓
  Trade balance volatility                                   -       -      ✓
  Terms-of-trade growth level                                -       -      ✓
  Terms-of-trade growth volatility                           -       -      ✓

3 Forecasting framework

Our forecasting framework entails a three-step process. Step one involves specifying predictive models by selecting SVR kernel functions and setting tuning parameters. In step two, predictive models obtained in step one are estimated using the standard SVR methodology, and their OOS forecasting accuracy is subsequently assessed, permitting the selection of a subset of best performing models in step three. For the OOS predictability analysis, we allocate for training the first $\tfrac{2}{3}N_{obs}$ consecutive weeks of the $N_{obs} = 367$ weekly observations available in the data sample period, setting aside the remaining $\tfrac{1}{3}N_{obs}$ weeks for OOS forecast evaluation (Hansen et al., 2011). The forecasts are generated using rolling regressions.

In what follows we first introduce the standard SVR methodology (step two) for which kernel specifications serve as inputs, subsequently outlining our methodology for setting SVR tuning parameters (step one) and evaluating model forecasts for model selection (step three).

3.1 Support vector regression

SVR implements structural risk minimization, an important tenet of statistical learning theory, with the purpose of constructing models with reliable OOS performance (Vapnik, 1995). Instead of empirical risk minimization, which minimizes the error on observed data, as in linear regression and most other conventional estimation methodologies, SVR seeks to minimize an upper bound on the generalization error. As a consequence of its documented superior OOS predictive performance, the technique has found wide acceptance in financial series forecasting; see, for example, Cao and Tay (2001), Min and Lee (2005), Stasinakis et al. (2016), Law and Shawe-Taylor (2017), Sermpinis et al. (2017a), and Sermpinis et al. (2017b).

Using a training sample containing $I$ observations of a scalar target variable $y$ and a vector of predictor variables $x \in \mathbb{R}^Q$, SVR constructs predictive models for $q$-step-ahead forecasts of $y$ of the form:

\[ y_{t+q} = f(x_t) + u_t = w_0 + \sum_{i=1}^{I} w_i\, k(x_i, x_t; \Upsilon) + u_t \tag{3} \]

We set $q = 13$ weeks, or one quarter, in our predictive analysis, as in Audzeyeva and Fuertes (2018). Kernel function $k(x_i, x; \Upsilon)$ in Eq. 3 effectively maps the input data vector $x$ into a higher dimensional space representing a $w_i$-weighted linear sum of terms that can better predict the target variable $y$ due to its superior flexibility. The vector of tuning parameters $\Upsilon$ provides a way for varying aspects of the nonlinear mapping of the data. This non-linear mapping, often referred to as the “kernel trick”, provides a variety of parsimoniously parameterized nonlinear functional forms (Hofmann et al., 2008). Our analysis employs four kernel functions described in Table 2: linear, polynomial, radial basis and sigmoid functions.

SVR chooses a vector of weights $w_i$ that minimizes the regularized empirical risk function:

\[
\begin{aligned}
&\min \; \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{I}(\xi_i + \xi_i^*), \quad \text{such that} \\
&y_{i+q} - f(x_i) \le \epsilon + \xi_i, \qquad f(x_i) - y_{i+q} \le \epsilon + \xi_i^*, \\
&\xi_i, \xi_i^* \ge 0 \quad \text{for all } i.
\end{aligned}
\tag{4}
\]

Here the bandwidth parameter, $\epsilon$, determines an $\epsilon$-insensitive region for characterizing empirical risk, and the regularization parameter, $C$, determines the trade-off between a measure of flatness of the function, $\|w\|$, and the level of empirical risk.

The predictions of the parameterized function $f(x_i)$ can violate the constraint, but at a cost proportional to $C$. With this so called “double-hinged” loss function, the loss will be zero when $|y_{i+q} - f(x_i)| \le \epsilon$, and increase linearly at the rate $C$ for points where the predicted value falls outside the $\epsilon$-insensitive region.
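As a minimal sketch of Eq. 3 and of the loss underlying Eq. 4, the Python fragment below evaluates a kernel-weighted prediction and the $\epsilon$-insensitive penalty for given weights. The function names, the toy numbers and the choice of a linear kernel are illustrative assumptions; the fragment does not solve the optimization problem and is not the implementation used for the results in this paper.

```python
# Illustrative sketch only: all names and numbers below are hypothetical.

def linear_kernel(xi, x):
    """Linear kernel: dot product of two input vectors."""
    return sum(a * b for a, b in zip(xi, x))


def predict(x, train_x, weights, w0, kernel=linear_kernel):
    """Eq. 3 without the error term: f(x) = w0 + sum_i w_i * k(x_i, x)."""
    return w0 + sum(w * kernel(xi, x) for w, xi in zip(weights, train_x))


def eps_insensitive_loss(y, f, eps):
    """Zero inside the eps-insensitive region, linear in the excess outside it."""
    return max(0.0, abs(y - f) - eps)


def slack_penalty(y_true, y_pred, C, eps):
    """Second term of Eq. 4: C times the total slack implied by the predictions."""
    return C * sum(eps_insensitive_loss(y, f, eps) for y, f in zip(y_true, y_pred))


# Toy example with two training inputs and made-up weights.
train_x = [(1.0, 0.0), (0.0, 2.0)]
weights = [0.5, -0.25]
f_new = predict((2.0, 1.0), train_x, weights, w0=0.1)

# A target inside the eps-region costs nothing; one outside is penalized.
print(eps_insensitive_loss(1.0, f_new, eps=0.5))  # 0.0
print(slack_penalty([1.0, 1.5], [f_new, f_new], C=10.0, eps=0.5))
```

Raising `C` in this sketch scales the penalty on points outside the region, while widening `eps` moves more points inside it, which is the trade-off the two basic tuning parameters control.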
Based on statistical learning theory, this $\epsilon$-SVR formulation explicitly provides robustness against parameter-driven model over-fitting through the judicious choice of the regularization and bandwidth parameters (Smola and Schölkopf, 2004).[5]

[5] In an alternative $\nu$-SVR formulation, the modeler specifies parameter $\nu$ determining an upper bound on the fraction of observations which can fall outside the $\epsilon$-insensitive region and a lower bound of the fraction of support vectors (Schölkopf et al., 2000). Then, $\nu$ is used to determine $C$ and $\epsilon$. Since there is no strong prior available regarding an appropriate value for $\nu$ in our application, we have applied $\epsilon$-SVR.

Table 2: Kernel functions

Name                              Kernel function $k(x_i, x)$                   Kernel-specific parameters (note 1)
Linear                            $x_i^T x$                                     none
Radial basis functions (note 2)   $e^{-\frac{1}{\psi^2}\|x_i - x\|^2}$          $\psi$
Sigmoid (note 3)                  $\tanh(\gamma (x_i^T x) + s)$                 $s, \gamma$
Polynomial (note 4)               $(\gamma (x_i^T x) + s)^g$                    $s, g, \gamma$

Notes:
1. Each kernel function requires a choice of the regularization parameter, $C$, and the bandwidth parameter, $\epsilon$, along with any kernel-specific parameters, altogether comprising the vector of tuning parameters $\Upsilon$.
2. Parameter $\psi$ controls the radius of influence of individual observations: larger $\psi$ reduces the radius of influence.
3. The sigmoid kernel retains some of the properties of a logistic curve but here the values range between $\pm 1$. Reducing $\gamma$ makes for a more gradual transition between the extreme values and for the response to appear more linear for intermediate values. The parameter $s$ determines the location of the point where the kernel function value crosses zero.
4. The polynomial kernel generalizes the linear kernel function by providing a nonlinear response to the dot product value: the larger the $g$, the more nonlinear the response. Here $\gamma$ moderates the sensitivity to the nonlinear interaction term while $s$ determines the location of the zero response.

The loss function, Eq. 4, has a dual Lagrangian of the form

\[
\begin{aligned}
\mathcal{L} = {} & \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{I}(\xi_i + \xi_i^*) - \sum_{i=1}^{I}(\eta_i \xi_i + \eta_i^* \xi_i^*) \\
& - \sum_{i=1}^{I} \upsilon_i\left(\epsilon + \xi_i - y_i + k(x_i, x; \Upsilon) + w_0\right) \\
& - \sum_{i=1}^{I} \upsilon_i^*\left(\epsilon + \xi_i^* + y_i - k(x_i, x; \Upsilon) - w_0\right)
\end{aligned}
\tag{5}
\]

Eq. 5 constitutes a quadratic programming problem (QPP) with well known properties and solution algorithms. The optimization produces a set of nonzero weights $\upsilon_i, \upsilon_i^*$ which identifies a collection of “support vectors”, influential observations that determine the optimal set of weights; deleting the other observations and again solving the QPP produces the same set of optimal weights.[6]

Note that in minimizing Eq. 4, we are only free to adjust $w$, $\xi_i$ and $\xi_i^*$. The aim is to obtain $w$ needed for constructing predictive models of credit spreads, Eq. 3. Two basic tuning parameters $C$ and $\epsilon$ and any other parameters in $\Upsilon$ defining the kernel function $k(x_i, x; \Upsilon)$ serve here as inputs. Consequently, the SVR model forecasting performance will depend upon the choice of the kernel function, parameters $C$ and $\epsilon$, and additional tuning parameters, if any, in $\Upsilon$. Given a kernel function and a set of tuning parameters, the SVR approach will produce a single forecasting model. However, changing tuning parameters would typically produce a different forecasting model. In Sections 3.2 and 3.3 we show how to accommodate this multiplicity.

3.2 Setting tuning parameters

This section outlines the procedure we propose for selecting tuning parameter candidates, $\Upsilon$, that serve as input into Eq. 3. This procedure is nontrivial for two reasons. First, there is little theoretical guidance for guessing appropriate tuning parameter values. Second, as in our bond market application, when searching for appropriate tuning parameters, the modeler is likely to encounter many local minima that are difficult to rank with any degree of certainty. Consequently, to mitigate the risk of neglecting favorable tuning parameters we employ MSNTO, a robust global optimization technique put forward by Xu et al. (2005). Furthermore, as predictive models in our bond market application generate serially correlated forecast errors, we apply the hv-block cross-validation algorithm of Racine (2000) together with MSNTO, as an appropriate metric for evaluating the predictive performance of a given set of tuning parameters in this context.

We do not use any of the validation set for tuning parameter selection, employing the training data set alone at this stage. The resulting tuning parameter values are then fixed during our subsequent OOS forecasting exercise.
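The separation between the training sample used for tuning and the weeks reserved for OOS evaluation can be sketched as a simple chronological split. The helper name below is hypothetical, and rounding the two-thirds share down is an assumption, chosen because it reproduces the 244 training observations reported for the 367-week sample.

```python
# Illustrative sketch only: helper name and rounding rule are assumptions.

def chronological_split(n_obs, num=2, den=3):
    """Split week indices 0..n_obs-1 into an early training block and a later OOS block."""
    n_train = (num * n_obs) // den  # round down (assumption)
    return range(0, n_train), range(n_train, n_obs)


train_idx, oos_idx = chronological_split(367)
print(len(train_idx), len(oos_idx))  # 244 123

# Tuning-parameter search should only ever see indices from train_idx.
assert max(train_idx) < min(oos_idx)
```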
We will first describe the hv-block cross-validation metric and then move on to describe the global optimization routine.

[6] For our calculations, we have used the “C++” version of libsvm, a widely used open source SVR implementation.
3.2.1 hv-Block cross-validation

Since SVR has been developed primarily for OOS prediction, researchers have routinely used cross-validation procedures, estimating the magnitude of expected OOS forecast errors, to ensure the robustness of forecasts. However, the presence of serial correlation in the data and forecast errors can complicate the estimation process, making routinely used cross-validation techniques like k-fold cross validation potentially unsuitable. This is because the shape and smoothness of the fitted SVR functions depend critically upon the kernel bandwidth $\epsilon$. Many cross-validation techniques interpret the serial correlation as a high frequency relationship with small variance which can result in setting $\epsilon$ too low for tracking the function variation (Brabanter et al., 2011). In other words, serial correlation can cause the algorithm to favor narrower than appropriate bandwidths in order to more effectively limit smoothing of the fitted function, erroneously taking the correlated observations as part of variation in the function value instead of part of the error.

The hv-block cross-validation approach put forward by Racine (2000) provides a way to estimate the accuracy of forecasts in the presence of autocorrelated forecast errors. Consider training data set $\mathcal{X} = (y, x)$ pictured in Figure 2.[7] To apply hv-block cross-validation, for each observation $i$ we collect $v$ observations on either side of the observation $\mathcal{X}_i = (y_i, x_i)$ to construct a local validation set for time $t = i$ of size $n_v = 2v + 1$. Then $h$ observations are removed from either side of the local validation set, with the remaining $n_c = n - 2v - 2h - 1$ observations forming the local training set for time $i$.[8] The algorithm then trains on each local training set of size $n_c$, and computes errors on each local validation set of size $n_v$. Here $h$ limits the impact of autocorrelation, maintaining “near” independence between validation and training sets by h-blocking, whereas $v$ ensures consistency of the estimated forecast error variance.
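The block construction described above can be sketched as follows. This is an illustrative index-splitting helper with a hypothetical name and 0-based indexing, not the code used for the paper's calculations; clipping the blocks at the sample boundaries is an assumption about how the edges are treated.

```python
# Illustrative sketch only: function name and edge handling are assumptions.

def hv_block_split(n, i, h, v):
    """Return (validation, training) index lists for centre observation i (0-based).

    Validation block: i-v .. i+v. A further h observations on each side are
    held out and used for neither purpose. Everything else is training data.
    Blocks are clipped at the sample boundaries.
    """
    val_lo, val_hi = max(0, i - v), min(n - 1, i + v)
    hold_lo, hold_hi = max(0, i - v - h), min(n - 1, i + v + h)
    validation = list(range(val_lo, val_hi + 1))
    training = list(range(0, hold_lo)) + list(range(hold_hi + 1, n))
    return validation, training


# Interior observation with n = 244, h = 56, v = 28.
validation, training = hv_block_split(n=244, i=122, h=56, v=28)
print(len(validation), len(training))  # 57 75
```

For an interior observation the sizes match $n_v = 2v + 1 = 57$ and $n_c = n - 2v - 2h - 1 = 75$; near either end of the sample one holdout block is truncated, so more observations remain for training.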
7 Note that all of the observations in Figure 2 come from the original training data set, so that the cross-validation only uses data from within this data set.
8 The actual count remaining will increase when the observation is near the beginning or end of the available data, as one of the holdout sections shrinks.
Denote the validation data subset, designated "For Validation" in Figure 2, by $\mathcal{X}_{(i:v)} =$
$(y_{(i:v)}, x_{(i:v)})$. The hv-block cross-validation estimate of the forecast error variance is given by:
$$\Phi(\Upsilon) = \frac{1}{(n - 2v)\, n_v} \sum_{i=v+1}^{n-v} \left\| y_{(i:v)} - \widehat{y}_{(i:v)}(\Upsilon) \right\|^2 \qquad (6)$$
The guidelines in Racine (2000) support $h = 56$, $v = 28$ for our training data set of 244 observations.
[Figure 2: hv-block cross-validation. Panel labels: h, v, "For Estimation", "Holdout", "For Validation", i, X.]
3.2.2 Multi-Sequential Number-Theoretic Optimization
When searching for optimal tuning parameter values, the modeler faces an optimization problem that can have many local minima. To illustrate the problem, Figure 3 plots the $-\Phi(\Upsilon)$-surface given by Eq. 6 for Mexico, as an example, using the Baseline forecasting model with a linear kernel. The figure shows many widely dispersed peaks representing local minima of the objective function, with no clear dominating, or "best", point among the parameter values. Furthermore, we do not have a way to guess, with any precision, the regions where good parameter values might lie. Consequently, this type of problem calls for a robust global optimization routine like MSNTO that can search broadly yet will not be confounded by bad potential parameter choices.9
9 Employing derivatives seems impractical due to excessive computational complexity in the optimization routine for obtaining cross-validation values.
Consider a multivariate domain containing all of the parameter values we care to entertain, $\pi = [a, b] \subset \mathbb{R}^l$, $a_\iota \le \Upsilon_\iota \le b_\iota$ for all $\iota \in \{1, \dots, l\}$, and a function, $\Phi(\Upsilon)$, continuous on $\pi$. We
seek to find $\Upsilon^* \in \pi$ such that
$$\Phi(\Upsilon^*) = \min_{\Upsilon \in \pi} \Phi(\Upsilon).$$
[Figure 3: $-\Phi(\Upsilon)$-surface, Eq. 6, for Mexico: Baseline model with a linear kernel]
We choose a set $\boldsymbol{\Upsilon} = \{\Upsilon_\iota, \iota = 1, \dots, \Lambda\}$ widely dispersed on $\pi$, approximating $\Upsilon^*$ by $\widehat{\Upsilon}^*_\Lambda \in \boldsymbol{\Upsilon}$ such that
$$\Phi(\widehat{\Upsilon}^*_\Lambda) = \min_{\iota \in \{1, \dots, \Lambda\}} \Phi(\Upsilon_\iota).$$
This simple strategy is known to converge slowly. Recursively contracting the search region produces dramatic speed improvements. However, this sequential best choice scheme might find only a local optimum. To avoid getting stuck at a local optimum, various authors propose implementing multiple local searches based on clustering. We employ the MSNTO algorithm that generates sample points using a number-theoretic sequence that maximizes dispersion. Then, at each step the algorithm retains a small fraction, $\Pi$, of these points for clustering. Since the exact dispersion evaluation would be costly, the algorithm uses an approximation for dispersion,
$$\widehat{B}(\rho, \pi) = \max_{1 \le \iota \le \Lambda} \; \min_{1 \le \theta \le \Lambda,\, \iota \ne \theta} \left\| \Upsilon_\iota - \Upsilon_\theta \right\|,$$
and calculates $\rho = \rho_\Phi \widehat{B}(\rho, \pi)$, a small multiple of the estimate of dispersion. Here $\rho$ determines the range of points that should be included in a single cluster. The algorithm applies a user supplied
parameter to determine the maximum amount of contraction toward the local minimum to apply to each of the cluster regions.
[Figure 4: MSNTO ($-\Phi(C, \epsilon)$) evaluation point clustering and contraction: An example from an SVR with a linear kernel]
These new cluster regions replace the old search regions during the next recursion. Xu et al. (2005) show that starting out using a number-theoretic sequence with a very large set of points, $\Lambda_1$, and continuing the recursion with a somewhat smaller set of points in the sequence, $\Lambda_2$, improves the algorithm's performance. We used $\Lambda_1 = 100$, $\Lambda_2 = 15$, $\Pi = 0.1$, $\rho_\Phi = 1.2$, $\sigma = 0.5$ and generated points using a Niederreiter sequence (Niederreiter et al., 1983). Table B1 in Appendix B gives the search ranges for the parameters of each kernel. Figure 4 further illustrates this process for a two-dimensional $\Phi(C, \epsilon)$ parameter search using an SVR with a linear kernel as an example. The algorithm constructs clusters of points around each local minimum. The first set of points is widely dispersed, covering a large region of the search space. The second iteration has identified two clusters and retains the points associated with the two clusters. The third iteration further refines the search to clusters that encompass a smaller region, thereby increasing the precision of the approximation for the location of the two local minima.
For a given kernel/model combination, applying the MSNTO algorithm to minimize the cross-validation error produces a collection of candidate tuning parameter settings. Table B2 in Appendix B shows our tally of distinct promising parameter settings for each kernel/model combination.
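The sample-then-contract idea behind the search can be sketched as follows. This is a deliberately simplified, single-region version and not the MSNTO algorithm of Xu et al. (2005): it omits the clustering step that lets MSNTO follow several local minima at once, and it uses a Halton sequence as a stand-in for the Niederreiter sequence. The names `halton`, `halton_points`, `snto_minimize` and the toy objective are assumptions made for illustration; in practice the objective would be the cross-validation error $\Phi(\Upsilon)$.

```python
PRIMES = [2, 3, 5, 7, 11, 13]  # one base per dimension (up to six dimensions)


def halton(index, base):
    """Return element `index` (index >= 1) of the van der Corput sequence."""
    result, f = 0.0, 1.0 / base
    while index > 0:
        result += f * (index % base)
        index //= base
        f /= base
    return result


def halton_points(count, lower, upper):
    """Return `count` low-discrepancy points scaled to the box [lower, upper]."""
    dims = len(lower)
    return [
        [lower[d] + (upper[d] - lower[d]) * halton(k, PRIMES[d]) for d in range(dims)]
        for k in range(1, count + 1)
    ]


def snto_minimize(objective, lower, upper, n_first=100, n_next=15,
                  contraction=0.5, iterations=20):
    """Sequentially sample a box and contract it around the best point so far.

    Single-region sketch only: no clustering, so it can settle on one local
    minimum. The first pass uses `n_first` points, later passes `n_next`.
    """
    lo, hi = list(lower), list(upper)
    best_point, best_value = None, float("inf")
    count = n_first
    for _ in range(iterations):
        for point in halton_points(count, lo, hi):
            value = objective(point)
            if value < best_value:
                best_point, best_value = point, value
        # Shrink each side of the box by `contraction`, centred on the best
        # point and clipped to the original search domain.
        for d in range(len(lo)):
            half = contraction * (hi[d] - lo[d]) / 2.0
            lo[d] = max(lower[d], best_point[d] - half)
            hi[d] = min(upper[d], best_point[d] + half)
        count = n_next
    return best_point, best_value


if __name__ == "__main__":
    # Hypothetical smooth surrogate over (C, epsilon); not a real CV surface.
    def toy(p):
        return (p[0] - 50.0) ** 2 + (100.0 * (p[1] - 0.03)) ** 2

    print(snto_minimize(toy, [1.0, 0.0001], [200.0, 0.1]))
```

Keeping the best point across iterations means the reported value never worsens as the box contracts; the clustering step in the full algorithm addresses the remaining risk that a single contracting box locks onto one of many local minima.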
3.3 Model selection and evaluation
We employ the expected value of the squared error loss to evaluate the predictive ability of the quarter-ahead OOS forecasts generated by various SVR candidate models. However, when choosing among a collection of forecasting models using a fixed data set, the "data snooping" issue arises: models that appear to outperform may do so only by luck. To overcome this issue, we evaluate the statistical significance of gains in predictive accuracy by means of the Model Confidence Set (MCS) test of Hansen et al. (2011). The procedure uses forecast errors to identify a subset of a group of models whose members likely have the best forecasting accuracy.
This subset of models, which the authors call a model confidence set, is constructed to contain all the superior models with a specified level of confidence. The MCS test has a number of important advantages over widely used alternative tests put forward by Diebold and Mariano (1995), White (2000) and Hansen (2005). First, the MCS test provides additional information that is useful to the modeler: a measure of uncertainty surrounding model selection. Second, the MCS test is sensitive to the information content of the available data: informative data produce a small collection of good models, whereas uninformative data generate large model confidence sets. Third, unlike alternative tests of pair-wise model comparisons, the MCS test does not require a benchmark, facilitating direct comparisons of forecasting accuracy among multiple competing model candidates, which is particularly useful in our case.
Let the squared error loss function for the model $j'$ prediction $\hat{y}_{j',t}$ of $y_t$ be given by $L_{j',t} = L(y_t, \hat{y}_{j',t}) = (\hat{y}_{j',t} - y_t)^2$. Define the measure of relative model performance as $\mu_{j'j''} \equiv E(L_{j',t} - L_{j'',t})$. Thus, model $j'$ is preferred to model $j''$ when $\mu_{j'j''} < 0$.
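As a small illustration of these definitions, the sample analogue of the relative performance measure can be computed directly from forecast series. The sketch below is not the MCS test itself (it has no bootstrap, test statistic, or p-values); it only shows the squared error losses and the pairwise mean loss differentials that the test builds on. The function names and the toy forecasts are hypothetical.

```python
def squared_error_losses(y_true, forecasts):
    """Map each model name to its series of squared forecast errors."""
    return {
        name: [(f - y) ** 2 for f, y in zip(preds, y_true)]
        for name, preds in forecasts.items()
    }


def relative_performance(losses):
    """Sample mean loss differential for every ordered pair of models.

    A negative entry for (a, b) means model a had the lower average loss,
    i.e. a is preferred to b in the sample.
    """
    mean_loss = {name: sum(vals) / len(vals) for name, vals in losses.items()}
    return {
        (a, b): mean_loss[a] - mean_loss[b]
        for a in mean_loss for b in mean_loss if a != b
    }


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not taken from the paper.
    y = [1.0, 2.0, 3.0]
    forecasts = {"model_a": [1.0, 2.0, 3.0], "model_b": [2.0, 3.0, 4.0]}
    mu_hat = relative_performance(squared_error_losses(y, forecasts))
    print(mu_hat[("model_a", "model_b")])
```

The MCS procedure then asks whether such differentials are statistically distinguishable from zero, which requires the sampling distribution of the loss differentials and is beyond this sketch.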
The authors assume that $\mu_{j'j''}$ is finite and independent of $t$.
To define the MCS test procedure, consider a finite initial collection of forecasting models, $\mathcal{M}^0$. Let $\mathcal{M}^*$ denote the set of best models for a specified metric of model assessment:
$$\mathcal{M}^* \equiv \{ j' \in \mathcal{M}^0 : \mu_{j'j''} \le 0 \ \text{for all} \ j'' \in \mathcal{M}^0 \}.$$
A model confidence set at level $\alpha$, $\mathcal{M}^*_{1-\alpha}$, is then a subset of $\mathcal{M}^0$ containing all of $\mathcal{M}^*$ with probability $(1 - \alpha)$. For a given loss function and confidence level, the test uses sample information about each model's relative OOS performance to sequentially eliminate the poorest performing
models, producing p-values for each model in $\mathcal{M}^0$. Small p-values indicate low probability that the model is actually among the best. The test procedure estimates $\widehat{\mathcal{M}}^*_{1-\alpha}$ via a sequence of significance tests with null hypothesis $H_{0,\mathcal{M}}: \mu_{j'j''} = 0$ for all $j', j'' \in \mathcal{M}$, $\mathcal{M} \subset \mathcal{M}^0$, and alternative $H_{A,\mathcal{M}}: \mu_{j'j''} \ne 0$ for some $j', j'' \in \mathcal{M}$. The sequential elimination from the model confidence set continues until doing so reduces the coverage ratio, $(1 - \alpha)$, below the specified confidence level.
The authors show that, since the procedure uses the same significance level in all tests, all models with p-values greater than $\alpha$ are in $\widehat{\mathcal{M}}^*_{1-\alpha}$. When the test assigns relatively high p-values to only one or a few models, this serves as compelling evidence of their superior predictive accuracy relative to the competitor models. Alternatively, when the evidence does not support a few strong candidates, there may be many models with similarly high p-values. We employ the Hansen et al. (2011) maximum t-statistic test for assessing model forecasting accuracy using OOS forecast errors from rolling regression estimation.10
To contrast the accuracy of the SVR forecasts with that of standard benchmarks from the credit spread forecasting literature, we add a variety of benchmark forecasts to the list of competing forecasts in our MCS test. A random walk (RW) model makes the first natural benchmark, as it is documented to be difficult to beat in Audzeyeva and Fuertes (2018). To represent another set of benchmarks, the commonly utilized linear regression models, we employ three time-series OLS regression models from Audzeyeva and Fuertes (2018) that use the same sets of input variables as our SVR forecasts.
The first model, OLS-Baseline, generates forecasts that utilize the predictive content in the credit-spread-curve factors:
$$y_{c,t+q} = \psi_c + \kappa_{c0}\hat{\beta}_{c0,t} + \kappa_{c1}\hat{\beta}_{c1,t} + \kappa_{c2}\hat{\beta}_{c2,t} + \nu_{c,t+q} \qquad (7)$$
The second OLS benchmark, OLS-G, augments OLS-Baseline with the vector of global macroeconomic input variables, $G_t$:
$$y_{c,t+q} = \psi_c + \kappa_{c0}\hat{\beta}_{c0,t} + \kappa_{c1}\hat{\beta}_{c1,t} + \kappa_{c2}\hat{\beta}_{c2,t} + \theta^{G}_{c} G_t + \nu_{c,t+q} \qquad (8)$$
10 We have used the ForecastEval package, a Julia implementation of the MCS.
In the third OLS model, OLS-GEM, the input set is further augmented with the vector of domestic macroeconomic variables, $EM_{c,t}$:
$$y_{c,t+q} = \psi_c + \kappa_{c0}\hat{\beta}_{c0,t} + \kappa_{c1}\hat{\beta}_{c1,t} + \kappa_{c2}\hat{\beta}_{c2,t} + \theta^{G}_{c} G_t + \theta^{EM}_{c} EM_{c,t} + \nu_{c,t+q} \qquad (9)$$
The predictive horizon here is $q = 13$ weeks, as before. The OLS benchmark predictions are obtained using rolling regressions, based on the same training and OOS evaluation windows as the respective SVR predictions.
4 Empirical forecasting results
4.1 Predictive ability of SVR models
We evaluate the accuracy of the quarter-ahead OOS SVR forecasts, generated using models with various sets of input variables (Baseline, G, and GEM) and employing Linear, RBF, Sigmoid and Polynomial SVR kernels, by running a horse race among competing models. Table 3 reports the results for twenty SVR-based forecasts that have the lowest RMSE for a given country, contrasting them with the forecast RMSE of benchmarks utilized in the literature. To gauge the statistical significance of gains in forecast accuracy, we report Model Confidence Set (MCS) p-values, identifying model forecasts in $\widehat{\mathcal{M}}^*_{75\%}$, $\widehat{\mathcal{M}}^*_{50\%}$ and $\widehat{\mathcal{M}}^*_{25\%}$.11,12 The interpretation of the MCS test confidence level is analogous to that of a confidence interval for a parameter, where MCS identifies from a collection of model candidates a subset of models that contains the best model with a given level of confidence. Model forecasts are ordered by MCS p-value, with those more likely to generate the most accurate forecasts listed first. The wide range of reported p-values across models provides evidence of high information utility in each country's data, unambiguously identifying a sub-group of most accurate forecasts.
The OOS forecasting evidence confidently identifies superior predictive accuracy of the SVR-based forecasts over the benchmarks. In particular, ten SVR-based forecasts but none of the
11 We report levels of confidence that are more conservative than the 90% and 75% reported in the forecasting exercise by Hansen et al. (2011).
12 Given the size of the data sample, increasing the size of the model set beyond 25 may affect the reliability of the MCS test results (Hansen et al., 2011).
Table 3 Predictive ability of the SVR-based models a, b

Brazil                           Mexico                                     Turkey
Model          RMSE   P_MCS      Model                    RMSE   P_MCS      Model                    RMSE   P_MCS
SVR-RBF-G-10   0.467  1.000***   SVR-Linear-Baseline-5    0.183  1.000***   SVR-Poly-Baseline-50     0.347  1.000***
SVR-RBF-G-6    0.468  0.367*     SVR-Linear-Baseline-59   0.186  0.909***   SVR-Poly-Baseline-63     0.347  0.787***
SVR-RBF-G-102  0.471  0.336*     SVR-Linear-Baseline-52   0.187  0.909***   SVR-RBF-GEM-6            0.349  0.787***
SVR-RBF-G-229  0.475  0.336*     SVR-Poly-Baseline-4      0.187  0.909***   SVR-Sigmoid-Baseline-4   0.348  0.787***
SVR-RBF-G-115  0.472  0.336*     SVR-Linear-Baseline-47   0.187  0.909***   SVR-RBF-GEM-2            0.349  0.137
SVR-RBF-G-107  0.473  0.336*     SVR-Linear-Baseline-70   0.187  0.909***   SVR-RBF-GEM-3            0.349  0.137
SVR-RBF-G-113  0.473  0.336*     SVR-Poly-Baseline-66     0.188  0.909***   SVR-Poly-Baseline-61     0.349  0.119
SVR-RBF-G-241  0.476  0.336*     SVR-Sigmoid-Baseline-22  0.189  0.909***   SVR-RBF-GEM-7            0.349  0.119
SVR-RBF-G-7    0.474  0.332*     SVR-Poly-Baseline-21     0.189  0.909***   SVR-Poly-Baseline-62     0.349  0.119
SVR-RBF-G-199  0.476  0.332*     SVR-Linear-Baseline-42   0.188  0.791***   SVR-RBF-GEM-8            0.349  0.119
SVR-RBF-G-95   0.474  0.189      SVR-Linear-Baseline-9    0.189  0.791***   SVR-RBF-GEM-5            0.349  0.119
SVR-RBF-G-99   0.474  0.189      SVR-Poly-Baseline-26     0.190  0.725**    SVR-RBF-GEM-4            0.349  0.119
SVR-RBF-G-4    0.474  0.189      SVR-Poly-Baseline-22     0.190  0.725**    SVR-Poly-Baseline-49     0.349  0.119
SVR-RBF-G-220  0.477  0.189      SVR-Linear-Baseline-56   0.191  0.621**    SVR-RBF-GEM-9            0.349  0.119
SVR-RBF-G-105  0.475  0.189      SVR-Linear-Baseline-55   0.190  0.621**    SVR-RBF-GEM-1            0.350  0.119
SVR-RBF-G-56   0.475  0.189      SVR-Poly-Baseline-27     0.191  0.621**    SVR-Poly-Baseline-56     0.350  0.119
SVR-RBF-G-134  0.476  0.189      SVR-Linear-Baseline-10   0.191  0.621**    SVR-Poly-Baseline-57     0.351  0.119
SVR-RBF-G-108  0.476  0.189      SVR-Linear-Baseline-15   0.191  0.166      SVR-Poly-Baseline-55     0.351  0.119
SVR-RBF-G-168  0.476  0.189      SVR-Linear-Baseline-34   0.191  0.166      SVR-Poly-Baseline-52     0.352  0.119
SVR-RBF-G-133  0.476  0.189      SVR-Linear-Baseline-72   0.191  0.166      SVR-Sigmoid-G-4          0.353  0.119
SVR-RBF-G-57   0.475  0.189      SVR-Linear-Baseline-50   0.191  0.166      SVR-Sigmoid-G-3          0.357  0.119
SVR-RBF-G-170  0.477  0.189      SVR-Linear-Baseline-7    0.191  0.166      SVR-Poly-Baseline-53     0.355  0.119
SVR-RBF-G-112  0.476  0.189      SVR-Linear-Baseline-16   0.192  0.166      SVR-Linear-Baseline-5    0.355  0.119
SVR-RBF-G-202  0.477  0.189      SVR-Linear-Baseline-48   0.192  0.166      SVR-Sigmoid-Baseline-10  0.358  0.119
SVR-RBF-G-111  0.476  0.189      SVR-Poly-Baseline-25     0.192  0.166      SVR-Sigmoid-Baseline-1   0.357  0.119
OLS-GEM        0.496  0.189      OLS-GEM                  0.215  0.166      OLS-Baseline             0.416  0.119
OLS-Baseline   0.546  0.189      OLS-G                    0.227  0.166      RW                       0.504  0.119
OLS-G          0.520  0.189      RW                       0.225  0.048      OLS-GEM                  0.462  0.116
RW             0.630  0.189      OLS-Baseline             0.225  0.048      OLS-G                    0.483  0.039

a For each country, the first column gives the model name, the second column reports forecast RMSE and the third column reports MCS p-values for a model at hand, with *, **, and *** identifying the forecasts in $\widehat{\mathcal{M}}^*_{75\%}$, $\widehat{\mathcal{M}}^*_{50\%}$ and $\widehat{\mathcal{M}}^*_{25\%}$, respectively.
b Model names characterize the estimation method (SVR or OLS), the kernel (Linear, RBF, Sigmoid, or Poly) and the vector of economic input variables (Baseline, G or GEM). The trailing number provides a unique identifier for each tuning parameter setting generated in the MSNTO optimization process.
benchmarks enter $\widehat{\mathcal{M}}^*_{75\%}$ for Brazil. Gains in forecast accuracy are equally sizable in economic terms, as borne out by a substantial reduction in forecast errors afforded by the SVR models: the ten $\widehat{\mathcal{M}}^*_{75\%}$ models, all G-specifications of SVR, deliver, on average, a 9.1% and 4.7% reduction in RMSE $(1 - \mathrm{RMSE}_{SVR}/\mathrm{RMSE}_{OLS})$ over the benchmark employing the same set of input variables, OLS-G, and the best performing benchmark, OLS-GEM, respectively. The evidence is even more striking for Mexico and Turkey: eleven SVR-based forecasts for Mexico and four SVR-based forecasts for Turkey enter $\widehat{\mathcal{M}}^*_{25\%}$; the superior model set is dominated by Baseline SVR specifications for both countries. At the same time, similar to Brazil, all benchmark forecasts exhibit relatively low p-values, clearly signaling their inferior predictive accuracy to SVR-based forecasts for both countries. Evidence of substantive economic gains confirms this result: the reductions in forecast RMSE afforded by the $\widehat{\mathcal{M}}^*_{25\%}$ SVR models are 16.8% relative to OLS-Baseline and 12.9% relative to the best performing benchmark, OLS-GEM, for Mexico. The respective gains are equally sizable at 16.4% for Turkey, with OLS-Baseline being the same-input-set benchmark and the best performing benchmark at the same time.
4.2 Further empirical observations
Table 3 further shows that there is no persuasive evidence for singling out a kernel function that may be most suited for modeling credit spreads of various countries. Nevertheless, country-specific evidence suggests that some kernel functions may be particularly well suited for modeling credit spreads of a given country. For example, there is a clear preferred kernel, RBF, for Brazil, with all models in $\widehat{\mathcal{M}}^*_{75\%}$ being RBF-based. In contrast, Linear is the preferred kernel, used in 7 out of 11 best performing models, for Mexico and Poly, featuring in 2 out of 4 best models, is the preferred kernel for Turkey.
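Returning briefly to the RMSE reductions quoted in Section 4.1, the Brazil figures can be checked by hand from the rounded Table 3 entries. The short sketch below does this for the ten Brazilian SVR-RBF-G forecasts marked with at least one asterisk; the RMSE values are as read from Table 3, while the variable and function names are introduced here for illustration. Because the inputs are rounded to three decimals, small differences from the reported percentages are possible for other countries.

```python
# RMSE of the ten Brazilian SVR forecasts in the 75% model confidence set,
# as read from Table 3 (rounded to three decimals there).
brazil_svr_rmse = [0.467, 0.468, 0.471, 0.475, 0.472,
                   0.473, 0.473, 0.476, 0.474, 0.476]
ols_g_rmse = 0.520     # benchmark with the same input set (OLS-G)
ols_gem_rmse = 0.496   # best performing benchmark for Brazil (OLS-GEM)


def rmse_reduction(rmse_model, rmse_benchmark):
    """Relative RMSE reduction, 1 - RMSE_model / RMSE_benchmark."""
    return 1.0 - rmse_model / rmse_benchmark


mean_svr_rmse = sum(brazil_svr_rmse) / len(brazil_svr_rmse)
reduction_vs_ols_g = 100.0 * rmse_reduction(mean_svr_rmse, ols_g_rmse)
reduction_vs_ols_gem = 100.0 * rmse_reduction(mean_svr_rmse, ols_gem_rmse)

if __name__ == "__main__":
    print(round(reduction_vs_ols_g, 1), round(reduction_vs_ols_gem, 1))
```

Since the reduction is linear in the model RMSE, averaging the ten individual reductions gives the same result as computing the reduction of the average RMSE.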
Interestingly, Poly and Linear-based SVR appear among top performers when employed in conjunction with a small input set such as Baseline, whereas RBF-based SVR perform well in conjunction with extended input sets like G for Mexico and GEM in Turkey's case. However, this observation requires further, more conclusive evidence.
Furthermore, the results reveal that SVR-based models require only a relatively small set of input variables to deliver accurate forecasts across all three countries. In particular, the best performing SVR employ the Baseline input set for Mexico and Turkey and G, an extension with global but not domestic variables, for Brazil. This finding contrasts with the results for benchmark
models where the benchmark using the largest set of input variables, GEM, containing both global and domestic fundamentals, generates the lowest RMSE for Brazil and Mexico, with Baseline delivering the lowest RMSE only for Turkey. Thus, our findings provide evidence of the SVR models' ability to deliver accurate forecasts when employing even smaller input sets than those used by the benchmarks. Furthermore, adding global variables (G) or both global and domestic variables (GEM) to the input set does not deliver improvements in forecast accuracy over SVR Baseline, which exploits predictive content only in the credit spread curve, for Mexico and Turkey. This SVR-based finding contrasts with evidence for linear-model-based forecasts in our study and also those reported in Audzeyeva and Fuertes (2018), suggesting that the Baseline OLS specification cannot always outperform the random walk and that its forecasting accuracy can be improved by additional predictors. Only for Brazil does adding global (but not domestic) macroeconomic variables to the Baseline input set help improve forecast accuracy.13
Taken together, the SVR-based evidence lends some support to the rational expectation theory for the term structure of credit spreads when the credit-spread-curve factors are not bound to a strictly linear relationship with future credit spreads as in the previous literature. A further investigation into the factors that produce supportive evidence for some countries (Mexico and Turkey) but not for others (Brazil) presents a fruitful direction for further research.
5 Conclusions
This paper proposes a coherent framework for producing a set of highly accurate SVR models for forecasting credit spreads of emerging markets. In our main methodological contribution, we put forward a systematic approach for setting robust parameter values for SVR kernel functions, addressing a gap in the SVR literature.
In contrast to previous studies aiming to select one "best" kernel setting that serves as input into the "best" SVR predictive model, our approach generates a robust set of viable tuning parameter values feeding into a set of SVR model candidates. We manage this model multiplicity by adopting the MCS test to select a subset of most accurate models. Furthermore, our approach accommodates novel economic and financial market applications characterized by serially correlated data.
13 The significance of predictive content in global factors for Brazil is likely to be related to a sharp deterioration in Brazil's creditworthiness, forcing Brazil out of the investment-grade BBB and into the speculative-grade BB rating category in September 2015, while credit ratings of Mexico and Turkey remained relatively stable. A consequent steep rise in Brazil's exposure to global risks may have been reflected in its current credit spread curve with some delay, enabling global factors to add predictive content beyond the spread curve for future credit spreads.
In the empirical analysis, we evaluate the quarter-ahead OOS performance of the SVR forecasts that our approach generates for three large sovereign emerging market borrowers, using various kernel functions and sets of input variables motivated by economic theory. The evaluation provides evidence that our approach identifies a relatively small set of SVR models with a notably superior OOS forecasting ability, in economic and statistical terms, relative to both other SVR specifications and standard benchmarks utilized in the credit spreads literature. Moreover, our evidence confirms a finding in Sermpinis et al. (2017b) for European stock market ETFs, disproving a widely held belief that the RBF kernel is the optimal choice for modeling financial market series and indicating that the kernel choice may be country-specific for emerging market credit spreads.
Our results further suggest that our SVR approach can deliver accurate credit spread forecasts with a small set of predictors limited to the credit curve level, slope and curvature factors, outperforming the random walk and linear-regression-based benchmarks using even larger predictor sets and performing at least as well as SVR forecasts using extended sets of predictors. Consequently, our findings for SVR-based credit spread forecasts lend support to the rational expectation theory of the term structure in the context of emerging market credit spreads, which has been previously rejected for linear-model-based forecasts of emerging-market sovereign credit spreads in Audzeyeva and Fuertes (2018) and mature-market (U.S.) corporate credit spreads in Krishnan et al. (2010). Hence, our results provide indirect evidence that highly flexible SVR models may be better suited than linear models, routinely employed in the literature, for capturing investor expectations about future credit spreads on emerging market bonds.
Further direct tests will constitute a fruitful avenue for future research.
References
Ang, A., Piazzesi, M., 2003. A no-arbitrage vector autoregression of term-structure dynamics with macroeconomic and latent variables. Journal of Monetary Economics 50, 745–787.
Audzeyeva, A., Fuertes, A.M., 2018. On the predictability of emerging market sovereign credit spreads. Journal of International Money and Finance 88, 140–157.
Bergmeir, C., Hyndman, R.J., Koo, B., 2018. A note on the validity of cross-validation for evaluating autoregressive time series prediction. Computational Statistics & Data Analysis 120, 70–83.
Brabanter, K.D., Brabanter, J.D., Suykens, J.A.K., Moor, B.D., 2011. Kernel regression in the presence of correlated errors. Journal of Machine Learning Research 12, 1955–1976.
Cao, L., Tay, F., 2001. Financial forecasting using support vector machines. Neural Computing and Applications 10, 184–192.
Comelli, F., 2012. Emerging market sovereign bond spreads: Estimation and backtesting. IMF Working Paper 12/2012, International Monetary Fund, Washington D.C.
Diebold, F., Li, C., 2006. Forecasting the term-structure of government bond yields. Journal of Econometrics 130, 337–364.
Diebold, F.X., Mariano, R.S., 1995. Comparing Predictive Accuracy. Journal of Business & Economic Statistics 13, 253–263.
Ding, Y.S., Song, X.P., Zen, Y.M., 2008. Forecasting financial condition of Chinese listed companies based on support vector machine. Expert Systems with Applications 34, 3081–3089.
Fama, E., Bliss, R., 1987. The information in long-maturity forward rates. American Economic Review 77, 680–692.
Gunduz, Y., Uhrig-Homburg, M., 2011. Predicting credit default swap prices with financial and pure data-driven approaches. Quantitative Finance 11, 1709–1727.
Hansen, P.R., 2005. A test for superior predictive ability. Journal of Business & Economic Statistics 23, 365–380.
Hansen, P.R., Lunde, A., Nason, J.M., 2011. The model confidence set. Econometrica 79, 453–497.
Hofmann, T., Schölkopf, B., Smola, A.J., 2008. Kernel methods in machine learning. Annals of Statistics 36, 1171–1220.
Law, T., Shawe-Taylor, J., 2017. Practical Bayesian support vector regression for financial time series prediction and market condition change detection. Quantitative Finance 17, 1403–1416.
Longstaff, F., Pan, J., Pedersen, L., Singleton, K., 2011. How sovereign is sovereign risk? American Economic Journal: Macroeconomics 3, 75–103.
Ludvigson, S., Ng, S., 2009. Macro factors in bond risk premia. Review of Financial Studies 22, 5027–5067.
Min, J.H., Lee, Y.C., 2005. Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters. Expert Systems with Applications 28, 603–614.
Nelson, C., Siegel, A., 1987. Parsimonious modeling of yield curves. Journal of Business 60, 473–489.
Niederreiter, H., Alpár, L., Halász, G., Sárközy, A., 1983. Studies in Pure Mathematics: To the Memory of Paul Turán. Birkhäuser Basel, Basel. chapter 7. pp. 523–529. "A quasi-Monte Carlo method for the approximate computation of the extreme values of a function".
Racine, J., 2000. Consistent cross-validatory model-selection for dependent data: hv-block cross-validation. Journal of Econometrics 99, 39–61.
Roll, R., 1970. The Behavior of Interest Rates: An Application of the Efficient Market Model to U.S. Treasury Bills. Basic Books, Inc., New York.
Sargent, T., 1972. Rational expectations and the term structure of interest rates. Journal of Money, Credit and Banking 4, 74–97.
Scholkopf, B., Smola, A.J., Williamson, R.C., Bartlett, P.L., 2000. New support vector algorithms. Neural Computation 12, 1207–1245.
Sermpinis, G., Stasinakis, C., Hassanniakalager, A., 2017a. Reverse adaptive krill herd locally weighted support vector regression for forecasting and trading exchange traded funds. European Journal of Operational Research 263, 540–558.
Sermpinis, G., Stasinakis, C., Rosillo, R., de la Fuente, D., 2017b. European exchange trading funds trading with locally weighted support vector regression. European Journal of Operational Research 258, 372–384.
Smola, A.J., Schölkopf, B., 2004. A tutorial on support vector regression. Statistics and Computing 14, 199–222.
Stasinakis, C., Sermpinis, G., Psaradellis, I., Verousis, T., 2016. Krill-herd support vector regression and heterogenous autoregressive leverage: Evidence from forecasting and trading commodities. Quantitative Finance 16, 1901–1915.
Svensson, L., 1994. Estimating and interpreting forward interest rates: Sweden 1992-1994. Technical Report 48871. National Bureau of Economic Research Working Paper.
Vapnik, V.N., 1995. The nature of statistical learning theory. Springer-Verlag New York, Inc.
White, H., 2000. A reality check for data snooping. Econometrica 68, 1097–1126.
Xu, Q.s., Liang, Y.z., Hou, Z.t., 2005. A multi-sequential number-theoretic optimization algorithm using clustering methods. Journal of Central South University of Technology 12, 283–293.
Appendix A Data variables: Summary statistics
Table A1 Summary statistics of emerging market sovereign credit spreads and input variables

Country / Variable                      Mean     StDev    Min       Max      AR(1)
US
  Yield curve level                     0.043    0.008    0.027     0.058    0.986
  Yield curve slope                     -0.028   0.013    -0.053    -0.004   0.981
  Yield curve curvature                 -0.079   0.025    -0.139    -0.022   0.978
  Short-term rate volatility            0.000    0.001    0.000     0.004    0.977
Brazil
  Credit spread                         0.016    0.008    0.004     0.050    0.973
  Credit-curve-spread curve level       0.025    0.010    0.012     0.061    0.974
  Credit-curve-spread curve slope       -0.011   0.020    -0.061    0.062    0.887
  Credit-curve-spread curve curvature   -0.025   0.040    -0.154    0.069    0.925
  Country risk rating                   39.196   3.146    32.500    45.500   0.978
  Trade balance                         0.067    0.081    -0.159    0.379    0.978
  Trade balance volatility              0.054    0.027    0.021     0.157    0.990
  Terms-of-trade growth                 0.052    8.790    -13.276   19.361   0.997
  Terms-of-trade growth volatility      2.747    1.560    0.389     8.899    0.991
Mexico
  Credit spread                         0.014    0.006    0.006     0.047    0.972
  Credit-curve-spread curve level       0.021    0.006    0.013     0.044    0.930
  Credit-curve-spread curve slope       -0.009   0.014    -0.044    0.024    0.874
  Credit-curve-spread curve curvature   -0.021   0.032    -0.086    0.063    0.930
  Country risk rating                   40.289   1.908    35.500    43.000   0.978
  Trade balance                         -0.033   0.071    -0.272    0.085    0.958
  Trade balance volatility              0.066    0.032    0.024     0.135    0.992
  Terms-of-trade growth                 -2.778   9.850    -22.243   19.394   0.998
  Terms-of-trade growth volatility      3.609    2.973    0.356     14.099   0.994
Turkey
  Credit spread                         0.025    0.009    0.013     0.073    0.964
  Credit-curve-spread curve level       0.029    0.008    0.013     0.063    0.961
  Credit-curve-spread curve slope       -0.007   0.017    -0.055    0.061    0.797
  Credit-curve-spread curve curvature   -0.006   0.030    -0.124    0.127    0.796
  Country risk rating                   33.740   2.550    27.000    37.500   0.981
  Trade balance                         -0.816   0.227    -1.253    -0.202   0.993
  Trade balance volatility              0.110    0.041    0.042     0.209    0.989
  Terms-of-trade growth                 0.857    4.350    -8.864    9.667    0.995
  Terms-of-trade growth volatility      1.732    1.183    0.241     5.320    0.993
Appendix B MSNTO: Implementation details
Table B1 MSNTO tuning parameter search ranges

Kernel       Parameter Value Search Regions
Linear       $C \in (1, 200)$, $\epsilon \in (0.0001, 0.1)$
RBF          $C \in (1, 200)$, $\epsilon \in (0.0001, 0.1)$, $\psi \in (0.0001, 40)$
Sigmoid      $C \in (1, 200)$, $\epsilon \in (0.0001, 0.1)$, $\gamma \in (0.0001, 40)$, $s \in (-13, 13)$
Polynomial   $C \in (1, 200)$, $\epsilon \in (0.0001, 0.1)$, $\gamma \in (-13, 13)$, $s \in (0.01, 20)$, $g \in (1, 6)$

Table B2 Number of unique tuning parameter values

                      MODELS
Country   Kernel      Baseline   G      GEM
Brazil    Linear      112        100    3
          RBF         88         341    84
          Sigmoid     190        291    45
          Poly        160        246    10
Mexico    Linear      110        63     8
          RBF         44         254    100
          Sigmoid     34         90     80
          Poly        66         199    152
Turkey    Linear      21         63     31
          RBF         215        1292   38
          Sigmoid     37         18     20
          Poly        65         256    19
Cite this document
Gary Anderson and Alena Audzeyeva (2019). A Coherent Framework for Predicting Emerging Market Credit Spreads with Support Vector Regression (FEDS 2019-074). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2019-074
@techreport{wtfs_feds_2019_074,
author = {Gary Anderson and Alena Audzeyeva},
title = {A Coherent Framework for Predicting Emerging Market Credit Spreads with Support Vector Regression},
type = {Finance and Economics Discussion Series},
number = {2019-074},
institution = {Board of Governors of the Federal Reserve System},
year = {2019},
url = {https://whenthefedspeaks.com/doc/feds_2019-074},
abstract = {We propose a coherent framework using support vector regression (SVR) for generating and ranking a set of high quality models for predicting emerging market sovereign credit spreads. Our framework adapts a global optimization algorithm employing an hv-block cross-validation metric, pertinent for models with serially correlated economic variables, to produce robust sets of tuning parameters for SVR kernel functions. In contrast to previous approaches identifying a single "best" tuning parameter setting, a task that is pragmatically improbable to achieve in many applications, we proceed with a collection of tuning parameter candidates, employing the Model Confidence Set test to select the most accurate models from the collection of promising candidates. Using bond credit spread data for three large emerging market economies and an array of input variables motivated by economic theory, we apply our framework to identify relatively small sets of SVR models with superior out-of-sample forecasting performance. Benchmarking our SVR forecasts against random walk and conventional linear model forecasts provides evidence for the notably superior forecasting accuracy of SVR-based models. In contrast to routinely used linear model benchmarks, the SVR-based models can generate accurate forecasts using only a small set of input variables limited to the country-specific credit-spread-curve factors, lending some support to the rational expectation theory of the term structure in the context of emerging market credit spreads. Consequently, our evidence indicates a better ability of highly flexible SVR to capture investor expectations about future spreads reflected in today's credit spread curve.},
}