ifdp · February 28, 1999

Pitfalls in Tests for Changes in Correlations

Abstract

Correlations are crucial for pricing and hedging derivatives whose payoff depends on more than one asset. Typically, correlations computed separately for ordinary and stressful market conditions differ considerably, a pattern widely termed "correlation breakdown." As a result, risk managers worry that their hedges will be useless when they are most needed, namely during "stressful" market situations.

Board of Governors of the Federal Reserve System
International Finance Discussion Papers, Number 597
First Version, December 1997; Revised, March 1999

Brian H. Boyer, Michael S. Gibson and Mico Loretan

NOTE: International Finance Discussion Papers are preliminary materials circulated to stimulate discussion and critical comment. References to International Finance Discussion Papers (other than an acknowledgment that the writer has had access to unpublished material) should be cleared with the author or authors. Recent IFDPs are available on the Web at http://www.bog.frb.fed.us.

Abstract: Correlations are crucial for pricing and hedging derivatives whose payoff depends on more than one asset. Typically, correlations computed separately for ordinary and stressful market conditions differ considerably, a pattern widely termed "correlation breakdown." As a result, risk managers worry that their hedges will be useless when they are most needed, namely during "stressful" market situations. We show that such worries may not be justified, since "correlation breakdowns" can easily be generated by data whose distribution is stationary and, in particular, whose correlation coefficient is constant. We make this point analytically, by way of several numerical examples, and via an empirical illustration. But risk managers should not necessarily relax. Although "correlation breakdown" can be an artifact of poor data analysis, other evidence suggests that correlations do in fact change over time.

Keywords: risk management, risk measurement, hedging, derivatives, correlation, conditional correlation, normal distribution, foreign exchange

The first author is a Ph.D. student at the University of Michigan Business School. The second and third authors are staff economists in the Division of International Finance, Board of Governors of the Federal Reserve System. We thank Matt Pritsker and seminar participants at the European Central Bank for comments. The views in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or of any other person associated with the Federal Reserve System. Email addresses for correspondence are boyerbh@hotmail.com, gibsonm@frb.gov and loretanm@frb.gov, respectively.

1 Introduction

The correlations between financial time series are crucial to risk management and the pricing of portfolios of assets.[1] Of particular interest is whether the correlations are constant over time: unstable correlations make it difficult (or impossible) to hedge exposure to one risk factor with an offsetting position in another asset.

In discussions of the importance of correlation for risk management and hedging, one often encounters statements to the effect that one has to allow for "correlation breakdown," i.e., the empirical regularity that the correlations between the series differ between "quiet" (or "ordinary") periods and "hectic" (or "unusual") periods. To quote the global risk manager for a major securities trading firm (Bookstaber, 1997): "During major market events, correlations change dramatically."

Correlation breakdowns, if they occur, call into question the usefulness of hedging operations based on correlations estimated from long time series of historical data, since they may be inaccurate precisely when they may be needed the most.

The purpose of our paper is to show that testing for changes in correlations is not as straightforward as one might think. Specifically, we demonstrate that splitting a sample of data according to the ex post realizations of a series, say between "large" and "small" values of one of the series, can yield very misleading results, because such a procedure is likely to suggest correlation breakdown regardless of whether the correlation coefficients have changed. We make this point analytically, by way of several numerical examples, and via an empirical illustration. Although it may not be obvious at first, our results are a direct consequence of selection bias, a phenomenon familiar to statisticians and econometricians.

In Section 2, we introduce some notation, provide an analytical expression for the conditional correlation between two independently and identically distributed (i.i.d.) bivariate normal random variables, and compute the value of the conditional correlations for a variety of methods of splitting a dataset into subperiods of interest. In each case, looking at the resulting conditional correlations would indicate "correlation breakdown" even though the data are, by construction, i.i.d.

In Section 3, we provide several empirical illustrations of how "correlation breakdown" may be generated by innocuous-seeming conditioning on events of interest. We examine a simple bivariate dataset of daily changes in German and Japanese exchange rates (vs. the U.S. dollar). We find that the patterns of conditional correlations calculated from this dataset are very similar to what we would find if the data were known to be i.i.d. bivariate normal. Our empirical illustrations confirm that it would be improper to conclude that the (population) correlation between two series varies across observations based on sample-splitting exercises alone. In Section 4, we discuss what our findings imply for proper testing for changes in correlations.

[1] See, for example, Wilson (1993), Sullivan (1995), and Campa and Chang (1997).

2 The relationship between conditional and unconditional correlations

To test whether the correlation between two time series is constant or is changing over time, one could consider comparing sampling correlations between the two series calculated from subsets of the data.[2] If these conditional correlations are found to be statistically different from each other, one might be tempted to conclude that the population correlation is not constant. In this section we demonstrate both analytically and numerically that this intuitively attractive approach to testing for correlation breakdowns can be very misleading.

[2] The focus on correlations and hence on linear dependence is entirely appropriate when the joint distribution of the data is multivariate normal or, more generally, multivariate elliptic. In empirical practice, the joint distributions of many asset price changes are frequently found to be fairly close to being multivariate elliptic.

2.1 Analytical derivation for bivariate normal random variables

We begin by considering a pair of bivariate normal random variables $x$ and $y$ with (unconditional) correlation coefficient $\rho$. We are interested in studying the effect that various forms of conditioning events (placing restrictions on the support of the distribution of $(x, y)$) have on the correlation between $x$ and $y$. Empirical practice frequently proceeds by restricting only one of the two variables. Events of interest in this paper are therefore mostly of the form "$x \in A$", where $A \subset \mathbb{R}$.[3] When there is no chance of confusion, we shall denote the events simply by "$A$". We only consider events of nontrivial ($\neq 0, 1$) probability.

[3] We do not treat events of the type "$x \in A \subset \mathbb{R}$ and $y \in B \subset \mathbb{R}$" explicitly in the present paper. Our aim is to demonstrate the ways in which (intentional or inadvertent) conditioning on events may affect the correlation between two variables, rather than provide a general analysis of all types of conditioning. Explicit formulae are straightforward to derive along the lines suggested by Theorem 1 in this section.

The following theorem, which is proven in Appendix A, states the relationship between conditional and unconditional correlations for bivariate normal random variables when conditioning places restrictions on one of the two variables:

Theorem 1. Consider a pair of bivariate normal random variables $x$ and $y$ with variances $\sigma_x^2$ and $\sigma_y^2$, respectively, and covariance $\sigma_{xy}$. Put $\rho = \sigma_{xy} / (\sigma_x \sigma_y)$, the unconditional correlation between $x$ and $y$. Consider any event $x \in A$, where $A \subset \mathbb{R}$ such that $0 < \Pr(A) < 1$. The conditional correlation $\rho_A$ between $x$ and $y$, conditional on the event $x \in A$, is equal to

$$\rho_A = \rho \left[ \rho^2 + (1 - \rho^2) \frac{\mathrm{Var}(x)}{\mathrm{Var}(x \mid x \in A)} \right]^{-1/2}. \tag{1}$$

Inspecting equation (1), we make the following observations:

1. $\mathrm{sign}(\rho_A) = \mathrm{sign}(\rho)$. Conditioning, by itself, does not affect the sign of the correlation coefficient.

2. $\rho_A = \rho$ if $\rho = 0$, i.e., if $x$ and $y$ are independent, or if $|\rho| = 1$, i.e., if the bivariate distribution is degenerate (a case which we do not consider further).

3. For $\rho \neq 0$, $|\rho_A| \gtrless |\rho|$ if $\mathrm{Var}(x \mid x \in A) \gtrless \mathrm{Var}(x)$. This is the result that is of primary interest to us.

The dependence of $\rho_A$ on the ratio $\mathrm{Var}(x \mid A) / \mathrm{Var}(x)$ is illustrated graphically in Figure 1 for several values of $\rho$.

In practice, the parameters $\rho$ and $\mathrm{Var}(x)$, whether conditional or unconditional, are usually estimated from time series data. In keeping with common usage, one would equate the full-sample estimates of moments with the corresponding "unconditional" moments, and sub-sample estimates (where the subsamples are selected based on the conditioning criterion) with the "conditional" moments.
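The three observations that follow from equation (1) can be checked numerically. The short Python sketch below is only an illustration: the function name `conditional_corr` and the sample variance ratios are assumptions introduced here, not part of the paper. It evaluates equation (1) directly as a function of $\rho$ and the ratio $\mathrm{Var}(x \mid x \in A) / \mathrm{Var}(x)$.

```python
import math


def conditional_corr(rho: float, var_ratio: float) -> float:
    """Equation (1): correlation of x and y conditional on the event x in A.

    rho       -- unconditional correlation between x and y
    var_ratio -- Var(x | x in A) / Var(x), which must be positive
    (Both the function name and its argument names are hypothetical.)
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    if var_ratio <= 0.0:
        raise ValueError("var_ratio must be positive")
    return rho / math.sqrt(rho**2 + (1.0 - rho**2) / var_ratio)


if __name__ == "__main__":
    # Illustrative variance ratios: below, equal to, and above 1.
    for rho in (-0.5, 0.2, 0.5, 0.95):
        values = [round(conditional_corr(rho, r), 3) for r in (0.25, 1.0, 4.0)]
        print(f"rho = {rho:+.2f}: {values}")
```

For each value of $\rho$, the conditional correlation keeps the sign of $\rho$, equals $\rho$ when the variance ratio is 1, and moves away from zero as the ratio rises above 1, which is the pattern described for Figure 1.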
To extend the logic of Theorem 1 to the case of time series observations, consider the bivariate time series $\{x_t, y_t\}$, $t = 1, 2, \ldots, n$, which has support $\mathbb{R}^2 \times \mathbb{R}^2 \times \cdots \times \mathbb{R}^2$ ($n$ copies) or, equivalently, $\mathbb{R}^n \times \mathbb{R}^n$. Formally, conditioning consists of restricting attention to those observations for which the $x_t$'s fall into a subset of $\mathbb{R}^n$, i.e., to require $(x_1, x_2, \ldots, x_n)' \in A$, where $A \subset \mathbb{R}^n$. As an example of a sample-based conditioning criterion, we could study the correlation between those $x_t$'s and $y_t$'s for which $x_t$ falls into the first or fourth quartile of the sampling distribution of $x$.

When the random variables $x_t$ and $y_t$ are i.i.d. bivariate normal with contemporaneous correlation coefficient $\rho$, equation (1) holds exactly for conditioning events $A$ defined over $(x_1, x_2, \ldots, x_n)'$. If the sequence $\{x_t, y_t\}$, $t = 1, 2, \ldots, n$, is not i.i.d., but is assumed to satisfy certain stationarity and weak dependence conditions, a more general version of equation (1) would apply. Both statements are proven in Appendix C.

[Figure 1: Dependence of $\rho_A$ on $\mathrm{Var}(x \mid A) / \mathrm{Var}(x)$]

2.2 Numerically calculated conditional correlations

From the preceding discussion, we note that knowledge of $\mathrm{Var}(x \mid A)$ lets us determine whether $\rho_A$ is less than or greater than $\rho$. In Appendix B, we show how one may compute the conditional variance of a normally distributed random variable $x$ for various types of conditioning events of the form $x \in A \subset \mathbb{R}$. We now provide three numerical illustrations of how much conditional correlations can differ from their unconditional counterparts.

As a first illustration of the dependence of the conditional correlation on the nature of the conditioning event, let $x$ and $y$ be i.i.d. mean-zero, unit-variance normal random variables. The only free parameter is $\rho$, the contemporaneous correlation coefficient. Let $D_1, D_2, \ldots, D_{10}$ represent the deciles of the (marginal) distribution of $x$. The conditioning events $A$, in this case, are of the form "$x \in D_i$" for the deciles $i$, $i = 1, \ldots, 10$. We consider two values of $\rho$, one moderate ($\rho = 0.50$) and one high ($\rho = 0.95$).

We should expect the variance of $(x \mid x \in D_i)$ to be larger for those deciles that fall into the tails of the distribution, simply because the tail deciles are wider than the central deciles. Therefore, by inspection of equation (1), we would also expect the conditional correlation between $x$ and $y$ to be higher when $x$ is in the tail of its distribution, irrespective of the value of $\rho$.

These expectations are confirmed by Table 1, where the conditional variance of $x$ and the conditional correlation between $x$ and $y$ are given for each of the ten deciles.[4] As is shown in the center column of the table, the conditional variance of $x$ strongly depends on the chosen decile: $\mathrm{Var}(x \mid x \in D_1)$ exceeds $\mathrm{Var}(x \mid x \in D_5)$ by a factor of more than 30. As a result, the relationship between deciles and conditional correlations is distinctly "U-shaped" for both values of $\rho$ we consider, with the conditional correlations being largest in $D_1$ and $D_{10}$.
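The decile calculation can be sketched in a few lines of Python. The sketch does not use the paper's equation (B.2), which is not reproduced in this section; instead it assumes the standard closed-form moments of a truncated standard normal distribution, and all function names are hypothetical. It then applies equation (1) with $\mathrm{Var}(x) = 1$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, variance 1


def truncated_variance(a: float, b: float) -> float:
    """Variance of a standard normal x conditional on a < x <= b.

    Uses the textbook truncated-normal moment formulas (an assumption;
    the paper's own derivation is in its Appendix B).
    """
    cdf_a = 0.0 if a == -math.inf else STD_NORMAL.cdf(a)
    cdf_b = 1.0 if b == math.inf else STD_NORMAL.cdf(b)
    pdf_a = 0.0 if math.isinf(a) else STD_NORMAL.pdf(a)
    pdf_b = 0.0 if math.isinf(b) else STD_NORMAL.pdf(b)
    # t * pdf(t) tends to 0 as t tends to plus or minus infinity.
    a_pdf_a = 0.0 if math.isinf(a) else a * pdf_a
    b_pdf_b = 0.0 if math.isinf(b) else b * pdf_b
    mass = cdf_b - cdf_a
    mean = (pdf_a - pdf_b) / mass
    return 1.0 + (a_pdf_a - b_pdf_b) / mass - mean**2


def conditional_corr(rho: float, cond_var: float, var: float = 1.0) -> float:
    """Equation (1) with Var(x) = var and Var(x | x in A) = cond_var."""
    return rho / math.sqrt(rho**2 + (1.0 - rho**2) * var / cond_var)


def decile_table(rho: float):
    """Return (decile, Var(x | D_i), Corr(x, y | D_i)) for i = 1..10."""
    cuts = [STD_NORMAL.inv_cdf(i / 10) for i in range(1, 10)]
    edges = [-math.inf] + cuts + [math.inf]
    rows = []
    for i in range(10):
        cond_var = truncated_variance(edges[i], edges[i + 1])
        rows.append((i + 1, cond_var, conditional_corr(rho, cond_var)))
    return rows


if __name__ == "__main__":
    for decile, cond_var, corr in decile_table(0.50):
        print(f"D{decile:<2d} Var = {cond_var:.5f}  Corr = {corr:.4f}")
```

Under these assumptions the computed values agree with the entries reported in Table 1 to the precision shown there, including the U-shaped pattern across deciles.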
Clearly, then, a U-shaped pattern need not indicate correlation breakdown, but may instead merely be a consequence of the ex post partitioning of the data, in this case into deciles.

Table 1: Conditional variances and correlations, decile delimited events, bivariate normal random variables
$x$ and $y$ are bivariate normal with zero mean and unit variance, and unconditional correlation $\rho$.

| Decile | Interval | $\mathrm{Var}(x \mid x \in D_i)$ | $\mathrm{Corr}(x, y \mid x \in D_i)$, $\rho = 0.50$ | $\mathrm{Corr}(x, y \mid x \in D_i)$, $\rho = 0.95$ |
|---|---|---|---|---|
| 1 | $-\infty < x \le -1.282$ | .169 | .231 | .781 |
| 2 | $-1.282 < x \le -.842$ | .0159 | .0725 | .358 |
| 3 | $-.842 < x \le -.524$ | .00834 | .0526 | .268 |
| 4 | $-.524 < x \le -.253$ | .00610 | .0451 | .231 |
| 5 | $-.253 < x \le 0.00$ | .00534 | .0421 | .217 |
| 6 | $0.00 < x \le .253$ | .00534 | .0421 | .217 |
| 7 | $.253 < x \le .524$ | .00610 | .0451 | .231 |
| 8 | $.524 < x \le .842$ | .00834 | .0526 | .268 |
| 9 | $.842 < x \le 1.282$ | .0159 | .0725 | .358 |
| 10 | $1.282 < x < \infty$ | .169 | .231 | .781 |

[4] $\mathrm{Var}(x \mid x \in D_i)$ is calculated with equation (B.2); $\mathrm{Corr}(x, y \mid x \in D_i)$ is calculated with equation (1).

The preceding illustration studied events which consist of single intervals of the data. In practice, we are often interested in two-sided events, such as "$x$ is more than (less than) two/three/four standard deviations away from its mean." In Table 2, we present the relationship between two-sided tail events $A$ (and their complements, $A^c$) and the resulting conditional correlation coefficients, again for random variables that are i.i.d. bivariate normally distributed. We consider four values for the unconditional correlation (.20, .50, .80, and .95) and two-sided events with probabilities of 50%, 10%, 5%, and 1%. (Each side has one half of the total probability of the event.) The case $\Pr(A) = 50\%$ signifies that the event of interest consists of $x$ falling into either the lowermost or uppermost quartile of its distribution. This corresponds to the case where we split the sample into subsamples of equal size, according to whether $x$ is far away from its median or not, and wish to test whether the subsample correlations differ from each other. Cases of $\Pr(A) = 10\%$ or less serve to compare the correlation between "tail" observations of the data with the correlation among "ordinary" observations.

For two-sided events we find $|\rho_A| > |\rho| > |\rho_{A^c}|$, since $\mathrm{Var}(x \mid A) > \mathrm{Var}(x) > \mathrm{Var}(x \mid A^c)$. We note that the pairs of conditional correlations are often far apart from each other, especially when the (population) correlation coefficient is relatively small.
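The two-sided case can be sketched the same way. The Python sketch below is an illustration under stated assumptions: for a standard normal $x$ and the symmetric event $A = \{|x| > c\}$ with $\Pr(A) = p$, it uses the standard result $\mathrm{E}[x^2 \mathbf{1}(|x| > c)] = p + 2c\,\phi(c)$, where $\phi$ is the standard normal density, so that both conditional variances follow directly. The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, variance 1


def two_sided_variances(prob: float):
    """Return (Var(x | A), Var(x | A^c)) for A = {|x| > c}, Pr(A) = prob.

    By symmetry the conditional means are zero, so each conditional
    variance equals the conditional second moment.
    """
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    c = STD_NORMAL.inv_cdf(1.0 - prob / 2.0)   # tail cutoff
    tail_term = 2.0 * c * STD_NORMAL.pdf(c)
    var_tail = 1.0 + tail_term / prob
    var_center = 1.0 - tail_term / (1.0 - prob)
    return var_tail, var_center


def two_sided_conditional_corrs(rho: float, prob: float):
    """Apply equation (1) to the tail event A and to its complement."""
    def eq1(cond_var: float) -> float:
        return rho / math.sqrt(rho**2 + (1.0 - rho**2) / cond_var)

    var_tail, var_center = two_sided_variances(prob)
    return eq1(var_tail), eq1(var_center)


if __name__ == "__main__":
    for rho in (0.20, 0.50, 0.80, 0.95):
        cells = []
        for prob in (0.50, 0.10, 0.05, 0.01):
            rho_a, rho_ac = two_sided_conditional_corrs(rho, prob)
            cells.append(f"{rho_a:.3f}/{rho_ac:.3f}")
        print(f"rho = {rho:.2f}: " + "  ".join(cells))
```

Because the variance of $x$ in the tails always exceeds its unconditional variance, and the variance in the center always falls short of it, the ordering $|\rho_A| > |\rho| > |\rho_{A^c}|$ holds for every choice of $p$ in this sketch.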
For example, if $\rho = 0.5$ and $\Pr(A) = 10\%$, then $\rho_A = 0.771$ whereas $\rho_{A^c} = 0.415$, a difference of close to 100%! Surely an unsuspecting analyst might feel tempted to conclude that this discrepancy typifies a clear instance of "correlation breakdown" between ordinary (quiet) and unusual (hectic) data observations. But, once more, the differences between the conditional correlations are caused by the choice of subsamples alone and not by any change in the parameters of the data generating process.

Table 2: Conditional correlations, two-sided events, bivariate normal random variables
$x$ and $y$ are bivariate normal with zero mean and unit variance. Events $A$ are two-sided "tail" events of the marginal distribution of $x$. Column headings give the two-sided event probability $\Pr(A)$.

| $\mathrm{Corr}(x, y)$ | 50%: $\rho_A$ | 50%: $\rho_{A^c}$ | 10%: $\rho_A$ | 10%: $\rho_{A^c}$ | 5%: $\rho_A$ | 5%: $\rho_{A^c}$ | 1%: $\rho_A$ | 1%: $\rho_{A^c}$ |
|---|---|---|---|---|---|---|---|---|
| .20 | .268 | .077 | .393 | .159 | .434 | .175 | .510 | .193 |
| .50 | .618 | .213 | .771 | .415 | .806 | .449 | .859 | .485 |
| .80 | .876 | .450 | .942 | .725 | .953 | .758 | .968 | .789 |
| .95 | .972 | .754 | .988 | .923 | .990 | .936 | .994 | .946 |

A third form of conditioning is to look at subsamples of the data that are characterized by "high volatility." Let a sample of $n$ draws from $(x, y)$ be divided into $n_m$ equally sized subsamples ("months"; $m = 1, 2, \ldots, n_m$). The subset of "high-volatility months" can be defined as the set of months in which the ratio of the (within-month) variance of $x$ to the overall (i.e., unconditional) variance of $x$ exceeds some threshold $k$ ($k \ge 1$):

$$\mathrm{HVM} = \left\{ m : \frac{\mathrm{Var}(x_t \mid t \in \text{month } m)}{\mathrm{Var}(x_t)} \ge k;\ m = 1, \ldots, n_m \right\}$$

Using this definition, the conditioning event $A$ can be defined as

$$A = \{ (x_t, y_t) : t \in \text{month } m;\ m \in \mathrm{HVM} \}$$

Since the conditional variance of $x$ is chosen directly in this case, the application of equation (1) is direct when $x_t$ and $y_t$ are assumed to be i.i.d. bivariate normal random variables. Conditioning on "high volatility months" will cause the conditional correlation $\rho_A$ to be greater in absolute value than the unconditional correlation $\rho$.

Table 3 illustrates the relationship between the conditional correlation and conditioning events of this type. It assumes that $x_t$ and $y_t$ are distributed i.i.d. bivariate normal with zero means, variances equal to unity, and correlation $\rho = 0.5$. We further assume each "month" has 20 observations. Because the $x$'s are independent standard normal variables, the within-month conditional variance of $x$, which is a function of a sum of terms involving $x^2$, is distributed proportional to a $\chi^2$ random variable with 20 degrees of freedom.

Table 3: Effects of conditioning on "high volatility months"
$x_t$ and $y_t$ are i.i.d. bivariate normal with means equal to zero, variances equal to unity, and correlation 0.5. "High volatility months" are defined as months where $\mathrm{Var}(x)$ within the month is greater than or equal to a threshold $k$. A month is assumed to have 20 observations.

| Variance threshold $k$ | Fraction of months identified as "high volatility months" | $\mathrm{Corr}(x_t, y_t \mid t \in$ "high volatility month"$)$ |
|---|---|---|
| 1.8 | .01 | 0.615 |
| 1.7 | .02 | 0.610 |
| 1.6 | .04 | 0.603 |
| 1.5 | .06 | 0.595 |
| 1.4 | .10 | 0.585 |
| 1.3 | .16 | 0.575 |
| 1.2 | .24 | 0.564 |
| 1.1 | .33 | 0.553 |
| 1.0 | .45 | 0.541 |

The first column of the table tabulates different values of the threshold $k$. The second column uses the cumulative distribution function for a $\chi^2(20)$ random variable to predict the fraction of months that will have a conditional variance greater than a certain threshold $k$. The third column shows what the conditional correlation between $x$ and $y$ is for those months.[5] Again the pattern emerges that conditioning events characterized by high volatility of $x$ are associated with high conditional correlation between $x$ and $y$, even though the underlying data are i.i.d.

2.3 Conditional correlations for data that are not multivariate normal

The central argument of our paper is that computing correlations conditional on realizations of one variable, and observing that those correlations are different for different conditioning events, gives you no basis to conclude that the "true" correlation of the data-generating process is changing over time. Theorem 1 and Tables 1, 2, and 3 each illustrate this point. They are derived for the case of i.i.d. normally distributed random variables because the multivariate normal distribution is analytically tractable. However, Appendix C shows that our argument can be extended to treat the case of non-i.i.d. random variables, although no simple analytic expressions are available.

To illustrate this point, we reproduced the numerical illustrations in Tables 1, 2, and 3 for the case of $x_t$ and $y_t$ distributed as bivariate GARCH(1,1) with constant contemporaneous correlation, a model introduced by Bollerslev (1990). GARCH random variables are neither normal nor identically distributed, as their variance changes over time. However, like the case of i.i.d. bivariate normal random variables we used above, the GARCH model of Bollerslev (1990) does feature a constant contemporaneous correlation coefficient.[6]

The results are omitted to save space and because they are nearly identical to the results in Tables 1, 2, and 3.[7] "Correlation breakdown" is still observed when conditioning on deciles, two-sided events, or high variance months, even when the true data generating process has constant contemporaneous correlation and time-varying volatility.

[5] The numbers in the third column are $\int_{z \ge k} \rho_A(z)\, dG(z)$, where $\rho_A(z)$ is from equation (1) with $\mathrm{Var}(x \mid A) = z$ and $G(z)$ is the cumulative distribution function of $V/20$, where $V$ is a $\chi^2(20)$ random variable.

[6] Because no analytic formula is available for the conditional variance of a bivariate GARCH(1,1) process (the $\Sigma_{xx \mid A}$ from Appendix C), the results for GARCH random variables were produced using Monte Carlo simulations of GARCH(1,1) data with constant contemporaneous correlation $\rho$.

[7] The tables are available from the authors on request.
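The "high volatility month" conditioning of Section 2.2 can also be illustrated by simulation. The Python sketch below is not the calculation behind Table 3, which integrates equation (1) against the $\chi^2(20)$ distribution; it is a hypothetical Monte Carlo variant that draws i.i.d. bivariate normal data with $\rho = 0.5$, flags months whose within-month variance of $x$ is at least $k$, and computes the correlation pooled over the flagged months. To match the $\chi^2(20)$ assumption, the within-month variance is taken around the known mean of zero. Function names, the number of simulated months, and the seed are all assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_high_vol_corr(k, rho=0.5, n_months=5000, days=20, seed=0):
    """Return (fraction of high-volatility months, pooled correlation).

    Data are i.i.d. bivariate normal with unit variances, so Var(x) = 1
    and the threshold k applies to the within-month variance directly.
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho**2)
    xs_kept, ys_kept = [], []
    n_high = 0
    for _ in range(n_months):
        xs = [rng.gauss(0.0, 1.0) for _ in range(days)]
        # y = rho * x + sqrt(1 - rho^2) * v, with v independent of x.
        ys = [rho * x + scale * rng.gauss(0.0, 1.0) for x in xs]
        within_var = sum(x * x for x in xs) / days  # known mean of zero
        if within_var >= k:
            n_high += 1
            xs_kept.extend(xs)
            ys_kept.extend(ys)
    return n_high / n_months, pearson(xs_kept, ys_kept)


if __name__ == "__main__":
    for k in (1.4, 1.2, 1.0):
        frac, corr = simulate_high_vol_corr(k)
        print(f"k = {k:.1f}: fraction = {frac:.3f}, pooled corr = {corr:.3f}")
```

The pooled correlation in the flagged months should exceed 0.5 and rise with the threshold $k$, even though every simulated observation comes from the same distribution. The pooled figure need not coincide exactly with the third column of Table 3, since that column averages equation (1) over months rather than pooling observations.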

3 Empirical illustrations

We now present conditional correlations for the three conditioning events discussed above using actual financial time series. The dataset comprises daily log changes in the German and Japanese spot exchange rates versus the U.S. dollar, from January 2, 1991 through December 31, 1998 (henceforth, German and Japanese exchange rates).[8] The full sample correlation between the two series is 0.504. A scatterplot of the data is provided in Figure 2. The data points are clustered around an upward-sloping line. However, the clustering appears to be slightly less tight in the tails of the data than in the central portion. It is therefore of interest to examine how the correlations differ between the subsets of the data.

[Figure 2: German and Japanese exchange rates vs. U.S. dollar: Scatterplot of daily log changes $\times$ 100, January 1991 to December 1998]

The empirical values for the conditional correlations between the two series, by empirical decile, are presented in Table 4. The theoretical conditional correlations that would apply if the data were drawn from an i.i.d. bivariate normal distribution can be seen in Table 1, in the column labelled "$\rho = 0.50$." Rather than reproduce these numbers in Table 4, we provide a 90% confidence interval for the theoretical conditional correlation.[9] We observe that the empirical and theoretical conditional correlations follow virtually the same U-shaped pattern. The empirical conditional correlation is outside the 90% confidence interval for the theoretical conditional correlation only once, in decile 3. Hence, the U-shaped pattern of correlations present in the data cannot be used, by itself, to determine whether actual correlations differ across hectic and quiet subperiods.

Table 4: Empirical and theoretical conditional correlations, decile delimited events, exchange rate data
Data: Daily log changes of German ($x$) and Japanese ($y$) exchange rates relative to U.S. dollar, scaled by 100%. Sample period: January 2, 1991 to December 31, 1998. Full-sample correlation $\rho = 0.504$.

| Decile $D_i$ of $x$ | Range of decile $D_i$ | Empirical cond. corr. | 90% conf. interval for theoretical cond. corr. |
|---|---|---|---|
| 1 | $-2.896\% < x \le -.782\%$ | .322 | (.117, .342) |
| 2 | $-.782\% < x \le -.482\%$ | -.026 | (-.040, .192) |
| 3 | $-.482\% < x \le -.285\%$ | .198 | (-.064, .168) |
| 4 | $-.285\% < x \le -.133\%$ | .067 | (-.071, .159) |
| 5 | $-.133\% < x \le .006\%$ | .038 | (-.076, .158) |
| 6 | $.006\% < x \le .150\%$ | -.009 | (-.076, .158) |
| 7 | $.150\% < x \le .286\%$ | .050 | (-.071, .159) |
| 8 | $.286\% < x \le .485\%$ | -.026 | (-.064, .168) |
| 9 | $.485\% < x \le .797\%$ | .040 | (-.040, .192) |
| 10 | $.797\% < x < 3.103\%$ | .285 | (.117, .342) |

Conditional correlation coefficients for two-sided tail events ("hectic periods") were also calculated for the exchange rate data and are presented in Table 5. Again, the theoretical conditional correlations which would apply if the data were i.i.d. and bivariate normal are nearly the same as those shown in Table 2 in the row "$\rho = 0.5$," so we again show 90% confidence intervals for the theoretical conditional correlations under the assumption of bivariate normality. We observe, yet again, that the empirical and theoretical conditional correlations are quite similar. The empirical conditional correlations are never outside the 90% confidence interval for normally distributed data. We can therefore not conclude that the true correlation between the two empirical series is different in the "tails" of the distribution. Instead, we should conclude that the question of correlation breakdown for these series cannot be decided on the basis of this sample-splitting exercise.

Table 5: Empirical and theoretical conditional correlations, two-sided events, exchange rate data
Data: Daily log changes of German ($x$) and Japanese ($y$) exchange rates relative to U.S. dollar. Sample period: January 2, 1991 to December 31, 1998. Full-sample correlation $\rho = 0.504$. Events $A$ are two-sided "tail" events of the marginal distribution of $x$. Column headings give the two-sided event probability $\Pr(A)$.

| Conditional correlations | 50%: $\rho_A$ | 50%: $\rho_{A^c}$ | 10%: $\rho_A$ | 10%: $\rho_{A^c}$ | 5%: $\rho_A$ | 5%: $\rho_{A^c}$ | 1%: $\rho_A$ | 1%: $\rho_{A^c}$ |
|---|---|---|---|---|---|---|---|---|
| Empirical conditional correlation | .589 | .224 | .726 | .386 | .746 | .448 | .837 | .477 |
| 90% conf. interval for theoretical cond. corr. | (.580, .665) | (.143, .286) | (.714, .829) | (.372, .464) | (.739, .871) | (.410, .497) | (.734, .954) | (.450, .530) |

Our third illustration is to compute the empirical correlation between the two exchange rates conditional on being in a month where the dollar-mark exchange rate exhibited "high volatility." Our data sample has 96 months. A scatterplot of the ratio of the within-month variance of daily dollar-mark returns (on the horizontal axis) against the within-month correlation between dollar-mark and dollar-yen returns (on the vertical axis) is presented in Figure 3. In addition to the scatterplot of empirical data, a curve representing the theoretical conditional correlation under the assumption of bivariate normality is plotted along with pointwise 90% confidence intervals (the vertical bars). The figure shows that the empirical within-month correlations are positively related to the within-month variance, as they would be if the data generating process were bivariate normal. Comparing the correlation coefficients in "high volatility" months and "low volatility" months could give the illusion of "correlation breakdown," but again the same pattern would emerge if the data generating process were i.i.d.[10]

[Figure 3: Relation between conditional variance and correlation, within-month data, daily-frequency German and Japanese exchange rate returns. Scatterplot of within-month variance of German exchange rate ($\mathrm{Var}(x \mid x \in A)$) against within-month correlation between German and Japanese exchange rate changes vs. the U.S. dollar ($\rho_{\text{month}}$), January 1991 to December 1998 (96 months). Solid curve: theoretical conditional correlation as a function of conditional variance of $x$. Error bars: 90% confidence intervals for sample size of 21.]

Table 6 shows how the empirical conditional correlation varies when conditioning on various definitions of a "high volatility month."[11] The table also shows the 90% confidence interval for the theoretical conditional correlations under bivariate normality. Evidence of fat tails in the distribution of the dollar-mark exchange rate is clear in the second column, which shows a wider spread of the distribution of within-month variance in Table 6 compared with the statistics in Table 3 based on the bivariate normal distribution. Still, the theoretical and empirical conditional correlations have the same basic pattern: highest when using a more extreme cutoff for the definition of a "high volatility month" and declining as the cutoff is reduced. For the thresholds $k = 1.2$ and $k = 1.1$, the empirical conditional correlation is outside the 90% confidence interval for bivariate normal data. For the remaining thresholds, it is not. Once again, although the correlation in "high volatility months" is greater than the unconditional correlation, that is no basis to conclude that the data generating process exhibits non-constant correlation.

Table 6: Empirical and theoretical conditional correlations, "high volatility month" events, exchange rate data
Data: Daily log changes of German ($x$) and Japanese ($y$) exchange rates relative to U.S. dollar. Sample period: January 2, 1991 to December 31, 1998. Full-sample correlation $\rho = 0.504$. "High volatility months" are defined as months where $\mathrm{Var}(x \mid \text{month}) / \mathrm{Var}(x)$ is greater than or equal to a threshold $k$.

| Variance threshold $k$ | Fraction of months identified as "high volatility months" | Empirical $\mathrm{Corr}(x_t, y_t \mid t \in$ "high volatility month"$)$ | Theoretical 90% confidence interval |
|---|---|---|---|
| 1.8 | .13 | .662 | (.425, .791) |
| 1.7 | .14 | .660 | (.427, .777) |
| 1.6 | .14 | .660 | (.442, .749) |
| 1.5 | .19 | .639 | (.467, .717) |
| 1.4 | .20 | .638 | (.485, .683) |
| 1.3 | .21 | .644 | (.496, .658) |
| 1.2 | .22 | .642 | (.498, .635) |
| 1.1 | .28 | .622 | (.498, .616) |
| 1.0 | .34 | .562 | (.494, .601) |

[8] The exchange rates were collected by the Federal Reserve Bank of New York, at noon of each U.S. business day.

[9] All confidence intervals in this section were generated with a Monte Carlo simulation on simulated bivariate normal data with $\rho = 0.504$ and sample size $n = 2{,}000$.

[10] The figure does show that the empirical data depart from bivariate normality. There is a clustering of points outside the 90% confidence interval in the "northwest" corner of the figure. This clustering would be unlikely to occur if the data were truly bivariate normal.

[11] Note that Table 6's conditioning events are of the form $\mathrm{Var}(x \mid \text{month}) / \mathrm{Var}(x) \ge k$ while Figure 3's are of the form $\mathrm{Var}(x \mid \text{month}) / \mathrm{Var}(x) = k$.

4 Conclusion

We have shown that changes in correlations over time or across "regimes" cannot be detected reliably by splitting a sample according to the realized values of the data. The lack of reliability stems from the finding that correlation breakdowns will be "uncovered" by this method, irrespective of the actual stationarity properties of the data. This result is a direct consequence of the (implicit) selection bias that occurs when a sample is split according to the realized or observed values alone.

What valid alternative methods exist to detect changing correlations? In order to carry out a valid test, we argue that it is necessary that the researcher begin with a data-coherent model of the data generating process that builds in the possibility of structural changes, estimate the model's parameters, and only then decide whether the estimated parameters imply changing correlations (and possibly other structural breaks). For example, if the data were generated according to a Markov regime switching model with separate parameters for "quiet" and "hectic" time periods, one could estimate the model's parameters and then test whether the estimated correlations differ significantly between regimes. We are undertaking further research along these lines. For other valid approaches to testing the constancy of correlations, see Bera and Kim (1996), Karolyi and Stulz (1996), and Longin and Solnik (1995).
We caution that, in empirical practice, one must guard against subtle influences of data mining: The choice of model to represent the data generating process must be based on considerations that go beyond prior knowledge as to which model may "fit" the data best. Relying on such knowledge may reintroduce the problem of splitting the data by ex post criteria, and hence possibly invalidate the formal test of constancy of correlations across regimes.
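The Monte Carlo confidence intervals used as a benchmark in Section 3 can be sketched as follows. The paper states only that they were generated from simulated bivariate normal data with $\rho = 0.504$ and $n = 2{,}000$; everything else in this Python sketch (the number of replications, the use of simple empirical quantiles, the seed, and all function names) is an assumption made for illustration. It draws constant-correlation samples, splits each one into deciles of $x$ after the fact, and records the band within which 90% of the decile correlations fall.

```python
import math
import random


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


def decile_corrs(sample):
    """Correlation of x and y within each sample decile of x."""
    ordered = sorted(sample)          # sorts by x (the first element)
    size = len(ordered) // 10
    return [pearson(ordered[i * size:(i + 1) * size]) for i in range(10)]


def monte_carlo_bands(rho=0.504, n=2000, reps=200, level=0.90, seed=0):
    """Empirical (lower, upper) bands for each decile correlation.

    Every simulated sample has the SAME constant correlation rho, so the
    bands show what ex post splitting produces with no breakdown at all.
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho**2)
    draws = [[] for _ in range(10)]
    for _ in range(reps):
        sample = []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            sample.append((x, rho * x + scale * rng.gauss(0.0, 1.0)))
        for d, corr in enumerate(decile_corrs(sample)):
            draws[d].append(corr)
    tail = round((1.0 - level) / 2.0 * reps)  # draws cut off on each side
    bands = []
    for values in draws:
        values.sort()
        bands.append((values[tail], values[reps - 1 - tail]))
    return bands


if __name__ == "__main__":
    for decile, (lo, hi) in enumerate(monte_carlo_bands(), start=1):
        print(f"D{decile:<2d} 90% band: ({lo:+.3f}, {hi:+.3f})")
```

An empirical decile correlation that falls inside the corresponding band is consistent with a constant-correlation data generating process, which is why a U-shaped pattern alone cannot establish correlation breakdown.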

Appendix A. Proof of Theorem 1

The proof relies on the well-known property of bivariate normal random variables that each element may be expressed as a weighted sum of the other element and an independent component which is also normally distributed.¹² Let $u$ and $v$ be two independent standard normal random variables. To create two random variables $x$ and $y$ such that $x$ and $y$ are distributed bivariate normal with means $\mu_x$ and $\mu_y$ and variances $\sigma_x^2$ and $\sigma_y^2$, respectively, and correlation coefficient $\rho = \sigma_{xy}/(\sigma_x \sigma_y)$, $|\rho| \le 1$, the following operations can be performed on $u$ and $v$:

$$x = \mu_x + \sigma_x u \tag{A.1}$$

$$\begin{aligned}
y &= \mu_y + \rho\,\sigma_y u + \sqrt{1-\rho^2}\,\sigma_y v \\
  &= \mu_y + (\rho\,\sigma_y/\sigma_x)(x - \mu_x) + \sqrt{1-\rho^2}\,\sigma_y v
\end{aligned} \tag{A.2}$$

Without loss of generality for what follows, we may assume that $\mu_x = 0$ and $\mu_y = 0$. Consider any event $A$ such that $0 < \Pr(A) < 1$. The conditional correlation coefficient $\rho_A$ between $x$ and $y$, by definition, can be expressed as

$$\rho_A = \frac{\operatorname{Cov}(x, y \mid A)}{\sqrt{\operatorname{Var}(x \mid A)}\,\sqrt{\operatorname{Var}(y \mid A)}}. \tag{A.3}$$

We now replace both occurrences of $y$ in equation (A.3) with the expression given in equation (A.2). The numerator of equation (A.3) may be rewritten as

$$\operatorname{Cov}(x, y \mid A) = \operatorname{Cov}\!\left(x,\ (\rho\,\sigma_y/\sigma_x)\,x + \sqrt{1-\rho^2}\,\sigma_y v \,\middle|\, A\right), \tag{A.4}$$

or equivalently,

$$\operatorname{Cov}(x, y \mid A) = \operatorname{Cov}\!\left(x,\ (\rho\,\sigma_y/\sigma_x)\,x \,\middle|\, A\right) + \operatorname{Cov}\!\left(x,\ \sqrt{1-\rho^2}\,\sigma_y v \,\middle|\, A\right). \tag{A.5}$$

¹² See, e.g., Goldberger (1991), p. 75.

But since $x$ and $v$ are independent, the covariance between $x$ and $y$ conditional on the event $A$ simplifies to

$$\operatorname{Cov}(x, y \mid A) = (\rho\,\sigma_y/\sigma_x)\operatorname{Var}(x \mid A). \tag{A.6}$$

Substituting equation (A.2) into the second part of the denominator of equation (A.3) and recalling that $x$ and $v$ are independent and $v$ has unit variance, we obtain

$$\begin{aligned}
\operatorname{Var}(y \mid A) &= \operatorname{Var}\!\left((\rho\,\sigma_y/\sigma_x)\,x + \sqrt{1-\rho^2}\,\sigma_y v \,\middle|\, A\right) \\
&= (\rho^2\sigma_y^2/\sigma_x^2)\operatorname{Var}(x \mid A) + (1-\rho^2)\,\sigma_y^2 \operatorname{Var}(v \mid A) \\
&= (\rho^2\sigma_y^2/\sigma_x^2)\operatorname{Var}(x \mid A) + (1-\rho^2)\,\sigma_y^2.
\end{aligned} \tag{A.7}$$

By substituting equations (A.6) and (A.7) back into equation (A.3), we may express the conditional correlation between $x$ and $y$ as

$$\rho_A = \frac{(\rho\,\sigma_y/\sigma_x)\operatorname{Var}(x \mid A)}{\sqrt{\operatorname{Var}(x \mid A)}\,\sqrt{(\rho^2\sigma_y^2/\sigma_x^2)\operatorname{Var}(x \mid A) + (1-\rho^2)\,\sigma_y^2}}. \tag{A.8}$$

Finally, as was to be shown in this proof, we may simplify this expression to

$$\rho_A = \rho \left(\rho^2 + (1-\rho^2)\,\frac{\operatorname{Var}(x)}{\operatorname{Var}(x \mid A)}\right)^{-1/2}.$$

Incidentally, this result also demonstrates that, at least for the case when the joint distribution of $x$ and $y$ is normal, the conditional correlation coefficient does not depend on the variance of $y$ directly.

Appendix B. Analytical calculation of conditional correlation coefficients

In this appendix, we provide some technical information on how one may calculate conditional correlations analytically when the conditioning events are of the form $x \in A$, where $A \subset \mathbb{R}$. Since we exclude zero-probability events from our analysis (cf. Theorem 1), events of interest cannot contain isolated points on the real number line. Candidate events $A$ must therefore consist of either an interval or a set of non-overlapping intervals.

Suppose first that $A$ is a single interval, i.e., $A = [a, b]$ with $a < b$.¹³ In this case, the conditional variance of $x$ is

$$\begin{aligned}
\operatorname{Var}\!\left(x \mid x \in [a,b]\right) &= E\!\left(x^2 \mid x \in [a,b]\right) - \left[E\!\left(x \mid x \in [a,b]\right)\right]^2 \\
&= \frac{\int_a^b x^2 f(x)\,dx}{\int_a^b f(x)\,dx} - \left[\frac{\int_a^b x f(x)\,dx}{\int_a^b f(x)\,dx}\right]^2.
\end{aligned} \tag{B.1}$$

If $x$ and $y$ are bivariate normal and $x$ has unit variance (and, as assumed without loss of generality in Appendix A, zero mean), equation (B.1) can be rewritten as, putting $\Pr(A) = \int_a^b f(x)\,dx$,

$$\operatorname{Var}\!\left(x \mid x \in [a,b]\right) = \frac{-b\,e^{-b^2/2} + a\,e^{-a^2/2}}{\sqrt{2\pi}\,\Pr(A)} + 1 - \left[\frac{-e^{-b^2/2} + e^{-a^2/2}}{\sqrt{2\pi}\,\Pr(A)}\right]^2. \tag{B.2}$$

Alternatively, $A$ may consist of a collection of mutually exclusive intervals, i.e., $A = \bigcup_i A_i$, $i = 1, \ldots, \ell$, with $A_i \cap A_j = \emptyset$ for all $i \ne j$. The conditional variance of $x$ easily follows from straightforward modifications to equations (B.1) and (B.2). Equation (B.1), for instance, becomes

$$\operatorname{Var}(x \mid A) = \frac{\int_A x^2 f(x)\,dx}{\Pr(A)} - \left[\frac{\int_A x f(x)\,dx}{\Pr(A)}\right]^2, \tag{B.1$'$}$$

where $\int_A$ signifies integration over the intervals $A_i = [a_i, b_i]$ that comprise the set $A$.

Thus, by substituting the applicable analytical expression for the conditional variance of $x$ into equation (1), it is possible to calculate the conditional correlation between $x$ and $y$ numerically, for any conditioning set $A$ of interest, as long as $\Pr(A) > 0$.

¹³ It is immaterial here whether this interval is open or closed since we are working with normal distributions, which have continuous probability density functions.
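The calculation described in this appendix can be sketched in a few lines of code. The example below assumes that x is standard normal, evaluates the closed form (B.2) interval by interval as in (B.1′), and then applies the formula of Theorem 1 as derived in Appendix A. The function names are hypothetical, and infinite endpoints are handled by treating their boundary terms as zero, which is an implementation choice rather than something stated in the text.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def _exp_term(z):
    """exp(-z^2 / 2), with infinite endpoints contributing zero."""
    return 0.0 if math.isinf(z) else math.exp(-0.5 * z * z)


def _z_exp_term(z):
    """z * exp(-z^2 / 2), with infinite endpoints contributing zero."""
    return 0.0 if math.isinf(z) else z * math.exp(-0.5 * z * z)


def conditional_variance(intervals):
    """Var(x | x in A) for standard normal x, A a union of disjoint [a, b]."""
    prob = first = second = 0.0
    for a, b in intervals:
        p = std_normal_cdf(b) - std_normal_cdf(a)
        prob += p
        # Integral of x f(x) over [a, b].
        first += (_exp_term(a) - _exp_term(b)) / SQRT_2PI
        # Integral of x^2 f(x) over [a, b].
        second += (_z_exp_term(a) - _z_exp_term(b)) / SQRT_2PI + p
    return second / prob - (first / prob) ** 2


def conditional_correlation(rho, var_x, var_x_given_a):
    """Theorem 1: rho_A = rho * (rho^2 + (1 - rho^2) Var(x)/Var(x|A))^(-1/2)."""
    return rho / math.sqrt(rho ** 2 + (1.0 - rho ** 2) * var_x / var_x_given_a)


calm_var = conditional_variance([(-1.0, 1.0)])
hectic_var = conditional_variance([(-math.inf, -1.0), (1.0, math.inf)])
print(conditional_correlation(0.5, 1.0, calm_var))
print(conditional_correlation(0.5, 1.0, hectic_var))
```

Conditioning on the central interval lowers the conditional variance of x below one and therefore lowers the conditional correlation, while conditioning on the two tails has the opposite effect.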

Appendix C. Extension of Theorem 1 to the multivariate, non-i.i.d. case

In this appendix, we show how the main result of Section 2, Theorem 1, may be extended to the case of time-series observations. Instead of assuming that $(x, y)$ are a pair of bivariate normal random variables, we now assume that $x \equiv (x_1, x_2, \ldots, x_n)'$ and $y \equiv (y_1, y_2, \ldots, y_n)'$ are random vectors which satisfy:

$$\underset{(n \times 1)}{x} \sim N(\mu_x, \Sigma_{xx}), \tag{C.1}$$

$$\underset{(n \times 1)}{y} \sim N(\mu_y, \Sigma_{yy}), \tag{C.2}$$

$$\text{and}\quad \begin{pmatrix} x \\ y \end{pmatrix} \sim N\!\left(\begin{bmatrix} \mu_x \\ \mu_y \end{bmatrix}, \begin{bmatrix} \Sigma_{xx} & \Sigma_{xy} \\ \Sigma_{xy}' & \Sigma_{yy} \end{bmatrix}\right), \tag{C.3}$$

where $\Sigma_{xy} \equiv \operatorname{Cov}(x, y) = E\!\left((x - \mu_x)(y - \mu_y)'\right)$. If $x$ and $y$ are covariance stationary, the vectors $\mu_x$ and $\mu_y$ are vectors of constants, and $\Sigma_{xx}$, $\Sigma_{yy}$, and $\Sigma_{xy}$ are banded matrices. More generally, we may allow some heterogeneity in both time series, as long as the heterogeneity and serial dependence satisfy conditions that permit the application of a weak law of large numbers (WLLN) such as the one proved by Andrews (1988).

The average correlation between $x$ and $y$ may be defined as

$$\rho = \frac{\operatorname{tr}(\Sigma_{xy})}{\sqrt{\operatorname{tr}(\Sigma_{xx}) \cdot \operatorname{tr}(\Sigma_{yy})}}, \tag{C.4}$$

where $\operatorname{tr}(\cdot)$ is the trace operator. An estimator of $\rho$ is defined as the full-sample correlation between $x$ and $y$:

$$\hat\rho = \frac{\frac{1}{n}\sum_t (x_t - \bar{x}_n)(y_t - \bar{y}_n)}{\sqrt{\frac{1}{n}\sum_t (x_t - \bar{x}_n)^2 \cdot \frac{1}{n}\sum_t (y_t - \bar{y}_n)^2}}. \tag{C.5}$$

Since a suitable WLLN is assumed to hold, we observe that $\operatorname{plim} \hat\rho = \lim_{n \to \infty} \rho$, and may therefore write $\hat\rho = \rho + \text{sampling error}$. (Again, if $x$ and $y$ are covariance stationary, $\rho$ is a constant in any finite sample.)
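The trace-based definition (C.4) can be made concrete with a short sketch. The code below is illustrative only: matrices are represented as nested lists, the helper names are hypothetical, and the numerical values are made up for the example. It first checks that (C.4) returns the usual correlation coefficient when the covariance matrices are scalar multiples of the identity, and then evaluates it for a two-period case in which the variances and covariances differ across periods.

```python
import math


def trace(matrix):
    """Sum of the diagonal elements of a square nested-list matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def average_correlation(sigma_xx, sigma_yy, sigma_xy):
    """Average correlation as defined in (C.4)."""
    return trace(sigma_xy) / math.sqrt(trace(sigma_xx) * trace(sigma_yy))


def scaled_identity(scale, n):
    """The n x n matrix scale * I_n."""
    return [[scale if i == j else 0.0 for j in range(n)] for i in range(n)]


# i.i.d. case: sigma_x = 2, sigma_y = 3, rho = 0.5, so sigma_xy = 3.
rho_iid = average_correlation(
    scaled_identity(4.0, 5), scaled_identity(9.0, 5), scaled_identity(3.0, 5)
)

# Heterogeneous case (assumed values): per-period correlations of 0.2 and 0.8.
sigma_xx = [[1.0, 0.0], [0.0, 4.0]]
sigma_yy = [[1.0, 0.0], [0.0, 1.0]]
sigma_xy = [[0.2, 0.0], [0.0, 1.6]]
rho_het = average_correlation(sigma_xx, sigma_yy, sigma_xy)

print(rho_iid, rho_het)
```

Because only the diagonal elements enter the traces, the second value is a variance-weighted summary of the per-period correlations and need not equal their simple mean.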

Since the random vectors $x$ and $y$ are multivariate normal, they may also be written as linear combinations of independent standard-normal random vectors $U$ and $V$, as follows:

$$x = \mu_x + \Sigma_{xx}^{1/2}\,U, \tag{C.6}$$

$$\begin{aligned}
y &= \mu_y + \Sigma_{xy}\Sigma_{xx}^{-1}(x - \mu_x) + \left(\Sigma_{yy} - \Sigma_{xy}\Sigma_{xx}^{-1}\Sigma_{xy}'\right)^{1/2} V \\
  &= \mu_y + \Sigma_{xy}\Sigma_{xx}^{-1/2}\,U + \left(\Sigma_{yy} - \Sigma_{xy}\Sigma_{xx}^{-1}\Sigma_{xy}'\right)^{1/2} V.
\end{aligned} \tag{C.7}$$

As in the discussion in Section 2, we are interested in the effects conditioning exerts on the correlation between $x_t$ and $y_t$. We assume that conditioning consists of restricting the sample space to $(x \in A,\ y \in \mathbb{R}^n)$, where $A \subset \mathbb{R}^n$ such that $0 < \Pr(A) < 1$. The random vectors $(x \mid x \in A)$ and $(y \mid x \in A)$ are, in general, not multivariate normal. Setting their first and second moments equal to $\mu_{x|A}$, $\mu_{y|A}$, $\Sigma_{xx|A}$ and $\Sigma_{yy|A}$, respectively, we define the conditional covariance matrix as

$$\operatorname{Cov}(x, y \mid A) = E\!\left((x - \mu_{x|A})(y - \mu_{y|A})' \mid A\right) = \Sigma_{xy|A}.$$

Setting $n_A$ as the number of sample points for which $x_t \in A$, the quantities $\rho_A$ and $\hat\rho_A$ may be defined similarly to the corresponding unconditional moments (C.4) and (C.5):

$$\rho_A = \frac{\operatorname{tr}(\Sigma_{xy|A})}{\sqrt{\operatorname{tr}(\Sigma_{xx|A}) \cdot \operatorname{tr}(\Sigma_{yy|A})}}, \tag{C.8}$$

$$\hat\rho_A = \frac{\frac{1}{n_A}\sum_{t \in A} (x_t - \bar{x}_n^A)(y_t - \bar{y}_n^A)}{\sqrt{\frac{1}{n_A}\sum_{t \in A} (x_t - \bar{x}_n^A)^2 \cdot \frac{1}{n_A}\sum_{t \in A} (y_t - \bar{y}_n^A)^2}}. \tag{C.9}$$

As in the unconditional case, we assume that a suitable WLLN applies to let the sample conditional moments converge to (finite) constants as $n \to \infty$ (and, of course, that $\rho_A$ converges to the same limit).

Some algebra shows that

$$\begin{aligned}
\Sigma_{yy|A} &= \operatorname{Var}(y \mid A) \\
&= \operatorname{Var}\!\left(\Sigma_{xy}\Sigma_{xx}^{-1}(x - \mu_x) + \left(\Sigma_{yy} - \Sigma_{xy}\Sigma_{xx}^{-1}\Sigma_{xy}'\right)^{1/2} V \,\middle|\, A\right) \\
&= \operatorname{Var}\!\left(\Sigma_{xy}\Sigma_{xx}^{-1}(x - \mu_x) \,\middle|\, A\right) + \left(\Sigma_{yy} - \Sigma_{xy}\Sigma_{xx}^{-1}\Sigma_{xy}'\right)^{1/2} \operatorname{Var}(V \mid A) \left(\Sigma_{yy} - \Sigma_{xy}\Sigma_{xx}^{-1}\Sigma_{xy}'\right)^{1/2} \\
&= \Sigma_{xy}\Sigma_{xx}^{-1}\operatorname{Var}(x \mid A)\,\Sigma_{xx}^{-1}\Sigma_{xy}' + \left(\Sigma_{yy} - \Sigma_{xy}\Sigma_{xx}^{-1}\Sigma_{xy}'\right) \\
&= \Sigma_{xy}\Sigma_{xx}^{-1}\Sigma_{xx|A}\Sigma_{xx}^{-1}\Sigma_{xy}' + \Sigma_{yy} - \Sigma_{xy}\Sigma_{xx}^{-1}\Sigma_{xy}'.
\end{aligned} \tag{C.10}$$

Similarly,

$$\begin{aligned}
\Sigma_{xy|A} &= \operatorname{Cov}(x, y \mid A) \\
&= \operatorname{Cov}\!\left(x,\ \Sigma_{xy}\Sigma_{xx}^{-1}x + \left(\Sigma_{yy} - \Sigma_{xy}\Sigma_{xx}^{-1}\Sigma_{xy}'\right)^{1/2} V \,\middle|\, A\right) \\
&= \operatorname{Cov}\!\left(x,\ \Sigma_{xy}\Sigma_{xx}^{-1}x \,\middle|\, A\right) + \operatorname{Cov}\!\left(x,\ \left(\Sigma_{yy} - \Sigma_{xy}\Sigma_{xx}^{-1}\Sigma_{xy}'\right)^{1/2} V \,\middle|\, A\right) \\
&= \Sigma_{xy}\Sigma_{xx}^{-1}\operatorname{Var}(x \mid A) \\
&= \Sigma_{xy}\Sigma_{xx}^{-1}\Sigma_{xx|A}.
\end{aligned} \tag{C.11}$$

The relationship between $\rho_A$ and $\rho$, defined in (C.8) and (C.4) above, cannot, in general, be expressed in simple terms such as (1). However, in the case that $(x_t, y_t)$ is i.i.d., we may simplify these equations, since $\Sigma_{xx} = \sigma_x^2 \cdot I_n$, $\Sigma_{yy} = \sigma_y^2 \cdot I_n$, and $\Sigma_{xy} = \sigma_{xy} \cdot I_n$. Setting $\rho = \sigma_{xy}/(\sigma_x \sigma_y)$, we may write:

$$\Sigma_{xx|A} = \sigma_{x|A}^2 \cdot I_n \tag{C.12}$$

$$\begin{aligned}
\Sigma_{yy|A} &= \rho\,\sigma_x\sigma_y\,(\sigma_x^2)^{-1}\,\Sigma_{xx|A}\,(\sigma_x^2)^{-1}\,\rho\,\sigma_x\sigma_y + \sigma_y^2 I_n - \frac{\rho^2\sigma_x^2\sigma_y^2}{\sigma_x^2}\,I_n \\
&= \sigma_y^2\left(\rho^2\,\frac{\sigma_{x|A}^2}{\sigma_x^2} + 1 - \rho^2\right) I_n
\end{aligned} \tag{C.13}$$

$$\begin{aligned}
\Sigma_{xy|A} &= \frac{\rho\,\sigma_x\sigma_y}{\sigma_x^2}\,\sigma_{x|A}^2\,I_n \\
&= \frac{\rho\,\sigma_y}{\sigma_x}\,\sigma_{x|A}^2\,I_n.
\end{aligned} \tag{C.14}$$

We thus obtain

$$\begin{aligned}
\rho_A &= \frac{\dfrac{\rho\,\sigma_y}{\sigma_x}\,\sigma_{x|A}^2}{\sqrt{\sigma_{x|A}^2 \cdot \sigma_y^2\left(\rho^2\,\dfrac{\sigma_{x|A}^2}{\sigma_x^2} + 1 - \rho^2\right)}} \\
&= \frac{\rho}{\sqrt{\dfrac{\sigma_x^2}{\sigma_{x|A}^2}\left(\rho^2\,\dfrac{\sigma_{x|A}^2}{\sigma_x^2} + 1 - \rho^2\right)}} \\
&= \frac{\rho}{\sqrt{\rho^2 + \dfrac{\sigma_x^2}{\sigma_{x|A}^2}\,(1 - \rho^2)}},
\end{aligned} \tag{C.15}$$

which is exactly equal to (1) in the bivariate case.

References

Andrews, Donald W.K., "Laws of Large Numbers for Dependent Non-identically Distributed Random Variables," Econometric Theory 4 (1988), 458–467.

Bera, Anil K. and Sangwhan Kim, "Testing Constancy of Correlation with an Application to International Equity Returns," CIBER Working Paper 96-107 (Champaign, IL: University of Illinois, 1996).

Bollerslev, Tim, "Modelling the Coherence in Short-run Nominal Exchange Rates: A Multivariate Generalized ARCH Model," Review of Economics and Statistics 72:3 (1990), 498–505.

Bookstaber, Richard, "Global Risk Management: Are We Missing the Point?," Journal of Portfolio Management 23:3 (1997), 102–107.

Campa, José M. and P. H. Kevin Chang, "The Forecasting Ability of Correlations Implied in Foreign Exchange Options," Journal of International Money and Finance 17:6 (1998), 855–880.

Goldberger, Arthur S., A Course in Econometrics (Cambridge, MA: Harvard University Press, 1991).

Karolyi, G. Andrew and René M. Stulz, "Why Do Markets Move Together? An Investigation of U.S.–Japan Return Comovements," Journal of Finance 51:3 (1996), 951–986.

Longin, François and Bruno Solnik, "Is the Correlation in International Equity Returns Constant: 1960–1990?," Journal of International Money and Finance 14:1 (1995), 3–26.

Sullivan, Greg, "Correlation Counts," Risk 8 (August 1995), 36–38.

Wilson, Thomas, "Infinite Wisdom," Risk 6 (June 1993), 37–45.

Cite this document
APA
Brian H. Boyer, Michael S. Gibson, & Mico Loretan (1999). Pitfalls in Tests for Changes in Correlations (IFDP 1997-597). Board of Governors of the Federal Reserve System, International Finance Discussion Papers. https://whenthefedspeaks.com/doc/ifdp_1997-597
BibTeX
@techreport{wtfs_ifdp_1997_597,
  author = {Brian H. Boyer and Michael S. Gibson and Mico Loretan},
  title = {Pitfalls in Tests for Changes in Correlations},
  type = {International Finance Discussion Papers},
  number = {1997-597},
  institution = {Board of Governors of the Federal Reserve System},
  year = {1999},
  url = {https://whenthefedspeaks.com/doc/ifdp_1997-597},
  abstract = {Correlations are crucial for pricing and hedging derivatives whose payoff depends on more than one asset. Typically, correlations computed separately for ordinary and stressful market conditions differ considerably, a pattern widely termed "correlation breakdown." As a result, risk managers worry that their hedges will be useless when they are most needed, namely during "stressful" market situations.},
}