feds · May 5, 2022

Dynamic Factor Copula Models with Estimated Cluster Assignments

Abstract

This paper proposes a dynamic multi-factor copula for use in high dimensional time series applications. A novel feature of our model is that the assignment of individual variables to groups is estimated from the data, rather than being pre-assigned using SIC industry codes, market capitalization ranks, or other ad hoc methods. We adapt the k-means clustering algorithm for use in our application and show that it has excellent finite-sample properties. Applying the new model to returns on 110 US equities, we find around 20 clusters to be optimal. In out-of-sample forecasts, we find that a model with as few as five estimated clusters significantly outperforms an otherwise identical model with 21 clusters formed using two-digit SIC codes.

Finance and Economics Discussion Series Federal Reserve Board, Washington, D.C. ISSN 1936-2854 (Print) ISSN 2767-3898 (Online) Dynamic Factor Copula Models with Estimated Cluster Assignments Dong Hwan Oh and Andrew J. Patton 2021-029 Please cite this paper as: Oh, Dong Hwan, and Andrew J. Patton (2022). “Dynamic Factor Copula Models with Estimated Cluster Assignments,” Finance and Economics Discussion Series 2021-029r1. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2021.029r1. NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Dong Hwan Oh (Federal Reserve Board) and Andrew J. Patton (Duke University)

First draft: 3 November 2020. This draft: 14 April 2022.

Keywords: high-dimensional models, risk management, multivariate density forecasting.

J.E.L. codes: C32, C38, C58.

* We thank participants at the 2020 EC2 conference (Paris), Econometrics and Business Analytics conference (St. Petersburg), and the International Workshop on Financial Econometrics (Federal University of Rio Grande do Sul, Brazil). The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. We are grateful to Forrest Denson for his excellent computing support. Email: donghwan.oh@frb.gov, andrew.patton@duke.edu. A Matlab toolbox to implement the methods proposed in this paper will be available at http://www.econ.duke.edu/sap172.

1 Introduction

Models for the dependence structure of a large collection of variables play an important role in risk management and regulation, yet there is a relative paucity of such models. A key impediment is that these models need to be parsimonious enough to deal with the inevitable curse of dimensionality that arises in high-dimensional applications, yet flexible enough to capture the time-varying and potentially asymmetric nature of the dependence between economic variables.

We propose a multi-factor, high-dimensional, copula model where the assignment of individual variables to groups or clusters is estimated from the data. Existing approaches for similar problems (see Creal and Tsay, 2015, Bester and Hansen, 2016, and Opschoor et al., 2020, for example) use pre-specified cluster assignments, based on SIC industry codes, or market capitalization deciles, or similar. In the absence of a computationally feasible data-driven alternative such approaches are reasonable, however it is not obvious that such assignments are optimal empirically. We propose a method based on k-means clustering (see, e.g., Hastie et al., 2009) to estimate the optimal assignments of variables to clusters, and we model dynamics in the conditional copula using a “generalized autoregressive score” (GAS) model (Creal et al., 2013; Harvey, 2013).

The estimation of the optimal cluster assignments for a high-dimensional dynamic copula model requires us to overcome two computational hurdles. Firstly, rather than the simulation-based factor copula model of Oh and Patton (2017), we adopt and extend the model of Opschoor et al. (2020), which has a closed-form likelihood and is thus much faster to estimate. Our extension enables us to capture asymmetric dependencies which can be important for equity returns, see Ang and Chen (2002), Hong et al. (2007) and Patton (2013) amongst many others.

Secondly, we exploit the fact that the presence of clusters in the dynamic model implies the presence of clusters in the (misspecified) static version of the model. The static version of the model is naturally much faster to estimate than the dynamic version. These two techniques, combined with extensive use of parallel processing, make the estimation of optimal cluster assignments feasible. We prove the consistency of the estimated cluster assignments under very mild conditions, and we find in realistically-designed simulations that our estimation method is remarkably accurate.

We apply the new model to daily returns on 110 U.S. equities over the period 2010-2019, and consider a range of choices for the number of clusters in the model. We find that the BIC optimal number of clusters is around 20, and moreover find that a model with just five estimated clusters outperforms an otherwise identical model based on 21 clusters formed using two-digit SIC groupings. In out-of-sample forecast comparisons, we find that the model with estimated cluster assignments significantly outperforms one with clusters formed using two-digit SIC codes.

This paper bridges two lines in the extant literature. Most directly, this paper is related to the literature on high-dimensional methods for financial risk measurement. Early work focused on improved methods for estimating large covariance matrices. For example, Fan et al. (2008, 2013) propose using a factor model where the number of factors grows with the number of variables, with the latter of these papers also accommodating approximate factor models. Tao et al. (2011) consider high-dimensional covariance matrix estimation based on a combination of high- and low-frequency data, also using a factor model. Hautsch et al. (2012) propose a method to estimate covariance matrices using high frequency data from assets with varying degrees of liquidity. More recent work in this area has included a focus on copula-based models, such as Creal and Tsay (2015) who proposed a high-dimensional stochastic copula with a factor structure, and Oh and Patton (2018) and Opschoor et al. (2020) who consider factor copulas with dynamics driven by a GAS specification. Christoffersen et al. (2018) propose a high-dimensional dynamic copula model with DCC (Engle, 2002) type dynamics. As far as we know, our paper is the first to consider a high-dimensional copula model with estimated group assignments.

This paper is also related to the fast-growing area of clustering and classification methods in economics and finance. Lin and Ng (2012) and Bonhomme and Manresa (2015) consider linear panel models with unknown group assignments which are estimated using k-means clustering. Su et al. (2016, 2019) consider panel models with group assignments estimated using a new type of LASSO estimator. The latter of these papers allows the parameters of the panel model to vary nonparametrically with time. Vogt and Linton (2020) also consider nonparametric regression for a panel of data with unknown group assignments. Francis et al. (2017) cluster countries by their business cycle patterns, and Patton and Weller (2019) consider clustering stocks by the risk premia

they generate. This research area is very active and this review is surely incomplete already.

The remainder of the paper is structured as follows. In Section 2 we present the dynamic copula models considered in this paper, and in Section 3 we discuss how we can optimally assign variables to clusters. Section 4 presents the results of a simulation study of the finite-sample performance of the proposed model and estimation method. Section 5 applies the new methods to a collection of 110 stock returns. Section 6 concludes, and the appendix contains proofs and technical details. A web appendix contains additional analyses and material.

2 A dynamic skewed t factor copula model

A copula is an N-dimensional distribution function with Unif(0,1) margins, and even when N is only moderately-sized the curse of dimensionality arises. A common approach to overcome this in other contexts is to impose some sort of factor structure, and recent work on high-dimensional copula models has moved in this direction, see Oh and Patton (2017, 2018), Creal and Tsay (2015) and Opschoor et al. (2020). An attractive feature of the latter two papers is that the copula likelihood is available in closed form. Motivated by previous work showing that equity returns exhibit asymmetric dependence (see, e.g., Ang and Chen, 2002, Hong et al., 2007, and Patton, 2013), we consider an extension of the model proposed by Opschoor et al. (2020) to allow for asymmetric dependence, namely a skewed t factor copula:

$$u_{i,t} = T_{skew}(x_{i,t}; \nu, \zeta), \quad i = 1, \dots, N, \tag{1}$$

$$x_{i,t} = \sqrt{W_t}\left(\tilde{\lambda}_{i,t}' z_t + \sigma_{i,t}\varepsilon_{i,t}\right) + \zeta W_t, \tag{2}$$

where

$$z_t \sim iid\; N(0, I_k), \quad \varepsilon_{i,t} \sim iid\; N(0, 1), \tag{3}$$

$$W_t \sim iid\; IG\left(\frac{\nu}{2}, \frac{\nu}{2}\right), \quad W_t \perp z_t \perp \varepsilon_{i,t}, \tag{4}$$

where $T_{skew}(\cdot; \nu, \zeta)$ denotes the univariate skewed t CDF of $x_{i,t}$, with degrees of freedom parameter $\nu \in (2, \infty]$ and asymmetry parameter $\zeta \in [-1, 1]$.¹ $\tilde{\lambda}_{i,t}$ is a vector of scaled factor loadings, $z_t$ is a vector of common latent factors and $\varepsilon_{i,t}$ is an idiosyncratic shock, both Normally distributed, and $W_t$ is an inverse gamma variable. We define the vector $\tilde{\lambda}_{i,t}$ and scalar $\sigma_{i,t}$ as

$$\tilde{\lambda}_{i,t} = \frac{\lambda_{i,t}}{\sqrt{1 + \lambda_{i,t}'\lambda_{i,t}}}, \quad \sigma_{i,t}^2 = \frac{1}{1 + \lambda_{i,t}'\lambda_{i,t}} \tag{5}$$

for a factor loading $\lambda_{i,t}$ to maintain the unit variance of $\tilde{\lambda}_{i,t}' z_t + \sigma_{i,t}\varepsilon_{i,t}$. The skewed t copula nests the Student's t copula when $\zeta = 0$, and the Gaussian copula when $\zeta = 0$ and $\nu \to \infty$. Given this structure, and since each element of the vector $[\tilde{\lambda}_{1,t}' z_t + \sigma_{1,t}\varepsilon_{1,t}, \dots, \tilde{\lambda}_{N,t}' z_t + \sigma_{N,t}\varepsilon_{N,t}]$ has unit variance, its covariance matrix is a correlation matrix, $R_t$, with the form:

$$R_t = \tilde{L}_t'\tilde{L}_t + D_t \tag{6}$$

where $\tilde{L}_t = [\tilde{\lambda}_{1,t}, \dots, \tilde{\lambda}_{N,t}]$ and $D_t = \mathrm{diag}(\sigma_{1,t}^2, \dots, \sigma_{N,t}^2)$. The skewed t copula then contains time-varying factor loadings $[\lambda_{1,t}', \dots, \lambda_{N,t}']$ and static shape parameters $(\nu, \zeta)$. Creal and Tsay (2015) show that a factor copula structure of the sort in equation (2) facilitates the evaluation of the copula density even for high dimensions since the inverse and determinant of $R_t$ are available in closed form and require only lower-dimension inversions and determinant calculations:

$$R_t^{-1} = D_t^{-1} - D_t^{-1}\tilde{L}_t'\left(I_k + \tilde{L}_t D_t^{-1}\tilde{L}_t'\right)^{-1}\tilde{L}_t D_t^{-1}$$

$$|R_t| = \left|I_k + \tilde{L}_t D_t^{-1}\tilde{L}_t'\right| \cdot |D_t|.$$

We consider a factor structure determined by a $(G+1)$ vector $z_t$ of common latent factors and a loading matrix $\tilde{L}_t$. Specifically, we allow for one common factor, shared by all variables, and G cluster-specific factors, shared only by members of that cluster.²

¹ Creal and Tsay (2010) describe this copula but do not implement it or present results on its likelihood and scores. As that paper notes, when $\zeta \neq 0$ the function $T_{skew}(\cdot; \nu, \zeta)$ is not available in closed form, and Creal and Tsay (2010) omit it from their analysis. The presence of this parameter raises no theoretical difficulties, only a computational one. In Appendix A.1 we describe a simple and computationally tractable method to overcome this difficulty, making the likelihood of this copula quasi-closed form (up to a simple one-dimensional numerical integral).

² This corresponds to the “multi-factor” (MF) model in Opschoor et al. (2020), which is the second-most flexible factor structure considered in that paper. The more flexible “lower-triangular MF” model is less amenable to the estimation of group assignments, which is the key focus of this paper, and we do not consider that structure here.

For example, assuming there are G groups and each group has only two members, $z_t$ and $\tilde{L}_t$ are determined by:

$$z_t \sim N(0, I_{G+1}), \qquad \tilde{L}_t' = \begin{pmatrix} \tilde{\lambda}^M_{1,t} & \tilde{\lambda}^C_{1,t} & 0 & 0 & \cdots & 0 \\ \tilde{\lambda}^M_{2,t} & 0 & \tilde{\lambda}^C_{2,t} & 0 & \cdots & 0 \\ \tilde{\lambda}^M_{3,t} & 0 & 0 & \tilde{\lambda}^C_{3,t} & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ \tilde{\lambda}^M_{G,t} & 0 & 0 & 0 & \cdots & \tilde{\lambda}^C_{G,t} \end{pmatrix} \otimes \begin{pmatrix} 1 \\ 1 \end{pmatrix} \tag{7}$$

where $\otimes$ denotes the Kronecker product. Note that the loadings on the common factor and the cluster-specific factor can only be separately identified if each group has at least two members; we impose this condition when estimating the model.

Next, we formulate the dynamics of 2G distinct factor loadings based on the generalized autoregressive score model proposed by Creal et al. (2013) and Harvey (2013). Specifically, we model those dynamics by:

$$\lambda^M_{g,t+1} = \omega^M_g + \alpha^M \frac{\partial \log c_{Skewt,t}(x_t; R_t, \nu, \zeta)}{\partial \lambda^M_{g,t}} + \beta^M \lambda^M_{g,t}, \quad \text{for } g = 1, \dots, G \tag{8}$$

$$\lambda^C_{g,t+1} = \omega^C_g + \alpha^C \frac{\partial \log c_{Skewt,t}(x_t; R_t, \nu, \zeta)}{\partial \lambda^C_{g,t}} + \beta^C \lambda^C_{g,t}, \quad \text{for } g = 1, \dots, G$$

where $x_t = T_{skew}^{-1}(u_t; \nu, \zeta)$, $c_{Skewt,t}(\cdot; R_t, \nu, \zeta)$ is the conditional skewed t copula density and $[\omega^M_1, \dots, \omega^M_G, \omega^C_1, \dots, \omega^C_G, \alpha^M, \beta^M, \alpha^C, \beta^C]'$ is the vector of parameters determining the dynamics of time varying factor loadings.³ Obviously the key component is the score of the conditional copula $\partial \log c_{Skewt,t}(x_t; R_t, \nu, \zeta)/\partial \eta_t$ where $\eta_t$ is a $(2G \times 1)$ vector of all dynamic factor loadings:

$$\eta_t = [\lambda^M_{1,t}, \dots, \lambda^M_{G,t}, \lambda^C_{1,t}, \dots, \lambda^C_{G,t}]'. \tag{9}$$

The skewed t copula density and the analytical derivation of its score are given in Appendix A.1 and A.2, respectively.

³ As in Opschoor et al. (2020) and Oh and Patton (2018), we use a unit scaling of the score in equation (8), rather than the inverse Hessian or its square-root, to reduce the computational burden of estimating the model.

While our model has factor loadings that vary across time, we assume that the group assignments are stable. Custodio João et al. (2022) and Lumsdaine et al. (2022) consider models with time-varying group assignments, and generalizing our framework to allow this is an interesting extension for future research.

3 Clustering and factor copulas

3.1 Clustering via a misspecified model

While the closed-form density and GAS equations presented in equation (8) greatly reduce the computational burden of estimating a dynamic high-dimensional copula model, this model is still too costly to use when combined with an EM algorithm to estimate group assignments from the data. In this section we show that the structure of our model is such that we can estimate group assignments based on a simpler, misspecified, model, overcoming this hurdle.

Firstly, consider a static skew t factor copula. The factor loading vectors $(\lambda_i)$ obey a cluster structure, in that all variables in the same cluster have the same loading vector. From equation (5) above, given the factor loadings we can obtain the normalized loadings and idiosyncratic variances, $\tilde{\lambda}_{i,t}$ and $\sigma^2_{i,t}$, and from those we obtain the correlation matrix:

$$R = \tilde{L}'\tilde{L} + D$$

where $\tilde{L} = [\tilde{\lambda}_1, \dots, \tilde{\lambda}_N]$ and $D = \mathrm{diag}(\sigma_1^2, \dots, \sigma_N^2)$. The cluster structure embedded in $\lambda_i$ implies that R exhibits a block structure, which, as discussed above, can be used to speed up matrix inverse and determinant calculations. Further, we note that the block structure in R holds regardless of the shape parameters $(\nu, \zeta)$. Thus a Normal factor copula, where the shape parameters are incorrectly fixed at $(\nu, \zeta) = (\infty, 0)$, will exhibit the same cluster structure as the more complicated skew t factor copula.
This means that the cluster assignments implied by the Normal factor copula are identical to those implied by the skew t factor copula, permitting us to use the simpler model to estimate cluster assignments, with the usual caveat that these estimates are likely less precise than those based on the true model.
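The block structure of R and the low-dimensional inverse and determinant formulas from Section 2 can be checked numerically. The Python sketch below is illustrative only: the function names, the loading values, and the choice of G = 3 clusters with two members each are assumptions made for this example, and it is not drawn from the authors' Matlab toolbox.

```python
import numpy as np


def scaled_loadings(lam):
    """Equation (5): scale raw loadings so each variable has unit variance.

    lam has shape (k, N), one column of raw loadings per variable.
    Returns the scaled loadings (k, N) and idiosyncratic variances (N,).
    """
    denom = 1.0 + np.sum(lam ** 2, axis=0)
    return lam / np.sqrt(denom), 1.0 / denom


def factor_inverse_and_logdet(L_tilde, sigma2):
    """Inverse and log-determinant of R = L~'L~ + D using only k x k algebra."""
    k = L_tilde.shape[0]
    LD = L_tilde / sigma2                      # L~ D^{-1}, shape (k, N)
    core = np.eye(k) + LD @ L_tilde.T          # I_k + L~ D^{-1} L~'
    R_inv = np.diag(1.0 / sigma2) - LD.T @ np.linalg.solve(core, LD)
    logdet = np.linalg.slogdet(core)[1] + np.sum(np.log(sigma2))
    return R_inv, logdet


# Hypothetical example: G = 3 clusters with two members each (N = 6, k = G + 1).
G = 3
lam_M = np.array([0.5, 1.0, 1.5])              # illustrative market loadings
lam_C = np.array([1.5, 1.0, 0.5])              # illustrative cluster loadings
block = np.hstack([lam_M[:, None], np.diag(lam_C)])   # G x (G + 1), as in eq. (7)
lam = np.kron(block, np.ones((2, 1))).T        # (G + 1) x N raw loading matrix
L_tilde, sigma2 = scaled_loadings(lam)
R = L_tilde.T @ L_tilde + np.diag(sigma2)      # equation (6)
R_inv, logdet = factor_inverse_and_logdet(L_tilde, sigma2)
```

In this example every pair of variables drawn from the same two clusters shares one correlation value, which is the block structure that the clustering step exploits, and only a (G+1) x (G+1) system is solved regardless of N.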

Next consider a time-varying skew t factor copula. In this case the time-varying correlation matrix $R_t = \tilde{L}_t'\tilde{L}_t + D_t$ obeys a block structure, and while the values taken by the elements of $R_t$ vary over time, the block structure is constant due to the maintained assumption that group assignments are stable. The conditional marginal copula of any pair $(u_{i,t}, u_{j,t})$ is determined completely by $(R_{i,j,t}, \nu, \zeta)$, and any pair of variables $(i, j)$ belonging to groups $(g_1, g_2)$ will have the same distribution as any other pair $(i', j')$ belonging to the same two groups. The unconditional marginal copula is just an integral of the conditional marginal copula, and so the unconditional rank correlation matrix, $\bar{\varrho} \equiv Corr[u_t]$, exhibits the same cluster structure as the conditional correlation matrix $R_t$, opening up the possibility of using a constant Normal factor copula to estimate group assignments for a dynamic skew t factor copula.

One complication arises when using a static copula to determine group assignments for a dynamic DGP: since we are taking time series averages, it is possible that the unconditional rank correlation matrix $\bar{\varrho}$ is more homogeneous than the conditional correlation matrix $R_t$, making it harder to identify group assignments. That is, clusters may not be as well separated in the approximating model as in the true model. The concept of “well separatedness” is a finite-sample issue, and we examine it in detail in our simulation study. To preview our findings, our simulations indicate that this is not a significant concern here.

3.2 Estimation of cluster assignments and copula parameters

The main advantage of using a factor copula comes from the dimension reduction enabled by classifying variables into a relatively small number of clusters and assuming identical factor loadings within each cluster.
In the existing literature, variables are clustered according to observable characteristics, such as SIC industry classifications. Given those cluster assignments, the factor copula can be estimated via maximum likelihood under standard conditions, however, the ex ante assignments of variables to clusters may not provide the best fit to the data.

We propose an iterative method which estimates cluster assignments, and copula parameters, directly from the data, exploiting an expectation-maximization (EM) algorithm. This algorithm cycles between (1) estimating copula parameters given cluster assignments and (2) estimating cluster assignments given the estimated copula parameters. Let $\Gamma = [\gamma_1, \dots, \gamma_N]$, where $\gamma_i \in \{1, \dots, G\}$ for $i = 1, \dots, N$, denote the vector of cluster assignments, and let $\theta = [\lambda^M_1, \dots, \lambda^M_G, \lambda^C_1, \dots, \lambda^C_G]$ be the vector of market and cluster-specific factor loadings used to obtain the correlation matrix parameter for the static Gaussian factor copula, with log-likelihood denoted $\log c(\cdot)$. Given an estimate of the cluster assignment vector, $\hat{\Gamma}^{(s)}$, the log-likelihood of the copula model is maximized over the copula parameters $\theta$ to yield:

$$\hat{\theta}^{(s+1)} = \arg\max_{\theta} \hat{Q}_T\left(\theta, \hat{\Gamma}^{(s)}\right) \tag{10}$$

$$\text{where } \hat{Q}_T(\theta, \Gamma) \equiv \sum_{t=1}^{T} \log c(u_t; \theta, \Gamma). \tag{11}$$

Then, given copula parameter $\hat{\theta}^{(s+1)}$, the log-likelihood is maximized over cluster assignments $\gamma_i$ for $i = 1, \dots, N$:

$$\hat{\gamma}_i^{(s+1)} = \arg\max_{g \in \{1, \dots, G\}} \hat{Q}_T\left(\hat{\theta}^{(s+1)}, \tilde{\Gamma}^{(s)}_{i,g}\right) \tag{12}$$

where $\tilde{\Gamma}^{(s)}_{i,g}$ is equal to $\hat{\Gamma}^{(s)}$ except that the $i$th element is set equal to $g$.

The copula parameter in equation (10) is estimated through a typical gradient-based optimization. We update each variable's cluster assignment (equation 12) by re-optimizing the cluster assignments one variable at a time, motivated by the method underlying k-means clustering. This latter step requires only $G \times N$ likelihood evaluations, making cluster assignment estimation feasible and fast.⁴ The iteration between equation (10) and equation (12) continues until convergence. Convergence to a local optimum is guaranteed, and we use 10 randomly-chosen starting values to improve the accuracy of the estimator. Our simulation study below confirms this to be a sufficient number of starting values.
Denote the resulting estimates as $(\hat{\theta}_T, \hat{\Gamma}_T)$.

⁴ Other estimation algorithms for k-means type problems have been proposed in the computer science/machine learning literature. Given the very good finite-sample performance we find for the algorithm described here, when a sufficient number of starting values is used, we did not consider any alternatives.

We next provide conditions under which the estimated cluster assignments, $\hat{\Gamma}_T$, are consistent for the true cluster assignments, $\Gamma_0$. This is a non-standard estimation problem as the parameter $\Gamma_0$ is discrete: each of its N elements can take one of only G values. Let $\mathcal{G}$ denote the parameter space for $\Gamma$.⁵ Since the labels attached to clusters are arbitrary (i.e., the objective function is invariant to relabeling the clusters), there is a set of correct cluster labels, rather than just a singleton; let $\mathcal{G}_0$ denote this set. To state the assumptions we define the following:

$$\tilde{\theta}^*(\Gamma) = \arg\max_{\theta \in \Theta} E[\log c(u_t; \theta, \Gamma)] \tag{13}$$

$$\theta^* = \arg\max_{\theta \in \Theta} E[\log c(u_t; \theta, \Gamma_0)] \tag{14}$$

Note that the parameter $\theta^*$ is a pseudo-true parameter: it is the optimal parameter for the misspecified static Gaussian copula model. We obtain this parameter as a by-product of estimating the cluster assignments, but we have no subsequent use for it.

⁵ Recall that for identification of our model we require all clusters to have at least two members. We restrict $\mathcal{G}$ to impose this condition.

Assumption 1: $\{u_t\}$ is a stationary ergodic sequence.

Assumption 2: For each $\Gamma \in \mathcal{G}$, (a) $|\log c(u_t; \theta, \Gamma)|_1 < \infty \;\; \forall\, \theta \in \Theta$; (b) $\|\nabla_\theta \log c(u_t; \tilde{\theta}^*(\Gamma), \Gamma)\|_1 < \infty$; and (c) $\|\nabla_{\theta\theta} \log c(u_t; \theta, \Gamma)\|_1 < \infty \;\; \forall\, \theta \in \Theta$.

Assumption 3: (a) For each $\Gamma \in \mathcal{G}$, $\limsup_{T \to \infty} \left[\hat{Q}_T(\tilde{\theta}^*(\Gamma), \Gamma) - \hat{Q}_T(\theta, \Gamma)\right] > 0 \;\; \forall\, \theta \in \Theta \setminus \eta_T(\varepsilon)$, where $\eta_T(\varepsilon)$ is an $\varepsilon$-neighborhood of $\tilde{\theta}^*(\Gamma)$; and (b) $\limsup_{T \to \infty} \left[\hat{Q}_T(\theta^*, \Gamma_0) - \hat{Q}_T(\theta, \Gamma)\right] > 0 \;\; \forall\, (\theta, \Gamma) \in \{\Theta \setminus \eta_T(\varepsilon)\} \times \{\mathcal{G} \setminus \mathcal{G}_0\}$.

Assumption 1 allows for general forms of serial dependence in the data (e.g., mixing). Importantly, given that we expect the static Gaussian copula model to be misspecified, it does not require correct specification of the conditional copula. Assumption 2, combined with Assumption 1, ensures that the log-likelihood and its first and second derivatives each obey a law of large numbers. Assumption 3 is a standard “identifiable uniqueness” assumption required for estimation, see Definition 3.3 of White (1994). In our application, it requires that the clusters are “well separated.” If the clusters are too close together, then identification of the clusters breaks down. A similar assumption is made in, e.g., Hahn and Moon (2010) and Bonhomme and Manresa (2015). The proof of the following theorem is in Appendix A.4.

Theorem 1 Under Assumptions 1-3 we have $\Pr[\hat{\Gamma}_T \in \mathcal{G}_0] \xrightarrow{p} 1$ as $T \to \infty$.

Results from related contexts suggest that if the series $\{u_t\}$ generated by equation (1) satisfies certain mixing properties, a large deviations principle may be applied (e.g., see Hahn and Moon, 2010, Choirat and Seri, 2012, and Bonhomme and Manresa, 2015). This enables obtaining a rate result, refining the consistency result in Theorem 1. Specifically, estimated cluster assignments have been shown in some applications to be superconsistent, with estimation errors taking the form:

$$\Pr[\hat{\Gamma}_T \notin \mathcal{G}_0] \leq C_1 \exp\{-C_2 T^{\kappa}\} \tag{15}$$

for some constants $C_1, C_2, \kappa > 0$.⁶ The simulation results presented in the next section reveal that cluster assignments are indeed estimated extremely well, in line with a superconsistent rate of convergence, though, unfortunately, general results on the mixing properties of GAS processes are not yet available in the literature, and so we do not pursue a theoretical result of this nature here.⁷ A result of the form in equation (15) implies that estimation error in estimated cluster assignments vanishes faster than the usual $\sqrt{T}$ rate, and standard errors on the remaining model parameters can be computed as though group assignments were known. If errors in estimated group assignments are of the same asymptotic order as those in the remaining model parameters, then standard errors on the remaining parameters need to be adjusted.⁸ The primary research questions of this paper do not require us to take a stand on the rate of convergence of the estimated group assignments.

⁶ For example, Hahn and Moon (2010) provide conditions under which alpha mixing implies $\kappa = 1/2$, and phi mixing implies $\kappa = 1$. The constants $C_1, C_2$ vary with the specifics of the application.

⁷ Related to the GAS context considered here, Carrasco and Chen (2002) and Hafner and Preminger (2009) show that univariate and multivariate GARCH processes, respectively, are beta mixing. Some results on the stationarity and ergodicity of univariate GAS processes are presented in Blasques et al. (2014).

⁸ The supplemental appendix of Bonhomme and Manresa (2015) discusses a bootstrap approach for their application, however they note that it is borderline computationally prohibitive even in their linear, non-dynamic, model. To the best of our knowledge, the literature does not yet contain results allowing for the theoretical analysis of a bootstrap method for the dynamic time series applications considered here.

With the estimated cluster assignments $\hat{\Gamma}_T$ in hand, we estimate the parameters of the skewed t copula with GAS dynamics:

$$\hat{\psi}_T = \arg\max_{\psi} \sum_{t=1}^{T} \log c_{Skewt,t}\left(u_t; \psi \mid \hat{\Gamma}_T\right)$$

where $\psi = [\omega^M_1, \dots, \omega^M_G, \omega^C_1, \dots, \omega^C_G, \alpha^M, \beta^M, \alpha^C, \beta^C, \nu, \zeta]'$. As the parameter vector $\psi$ is large, we adopt a “variance targeting” approach to separately estimate the intercept parameters $[\omega^M_1, \dots, \omega^C_G]$, leaving us with only six parameters that require difficult numerical optimization. Details on this method are described in Appendix A.3. Once $\hat{\psi}_T$ is obtained, the time series of factor loadings, $\lambda_t$, can be computed using equation (8).

4 Simulation study

We investigate the finite-sample performance of the estimation method proposed above in a simulation study designed to match the key features of our empirical application below. We consider a sample size of T = 1000 and a collection of N = 100 variables, and three different factor copulas: a Gaussian factor copula, a t factor copula, and a skew t factor copula, corresponding to $[\nu, \zeta] = [\infty, 0]$, $[5, 0]$, $[5, -0.1]$ respectively. For illustration, a sample of bivariate data from these three copulas, as well as a skew Normal copula which we omit from the simulation study, is presented in Figure 1. In all cases the linear correlation is 0.5, and to aid the interpretation we transform draws from these copulas using the inverse Normal CDF, and so these four distributions all have N(0,1) marginal distributions. In the upper-left panel of Figure 1, we see the familiar bivariate Normal distribution, with low dependence in the tails and displaying radial symmetry. The upper-right panel displays the Student's t copula, which is also radially symmetric but exhibits tail dependence, which manifests in this figure as realizations that lie close to the main diagonal in the upper and lower joint tails. The lower two panels present asymmetric copulas, with dependence being stronger in the lower tail than the upper tail, particularly for the skew t copula which exhibits non-zero tail dependence.
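Bivariate draws of the kind described for Figure 1 can be generated directly from the stochastic representation in equations (2) to (4). The Python sketch below is illustrative and rests on several assumptions: the function name and sample size are hypothetical, a single common factor is used with scaled loading chosen so that the correlation parameter is 0.5, and the skewed t CDF (which the paper evaluates by numerical integration) is approximated here by the empirical CDF of each simulated margin.

```python
import numpy as np
from scipy.stats import norm


def simulate_normal_scores(n, nu, zeta, rho=0.5, seed=0):
    """Draw n bivariate observations from a one-factor skew t copula and map
    them to N(0,1) margins via the inverse Normal CDF."""
    rng = np.random.default_rng(seed)
    lam_tilde, sigma = np.sqrt(rho), np.sqrt(1.0 - rho)   # unit variance, corr = rho
    z = rng.standard_normal((n, 1))                       # common factor
    eps = rng.standard_normal((n, 2))                     # idiosyncratic shocks
    if np.isinf(nu):
        W = np.ones((n, 1))                               # Gaussian limit, W = 1
    else:
        # W ~ IG(nu/2, nu/2)  <=>  1/W ~ Gamma(shape nu/2, scale 2/nu)
        W = 1.0 / rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=(n, 1))
    x = np.sqrt(W) * (lam_tilde * z + sigma * eps) + zeta * W   # equation (2)
    # Assumption: approximate T_skew by the empirical CDF (ranks) of each margin.
    ranks = x.argsort(axis=0).argsort(axis=0) + 1
    u = ranks / (n + 1.0)
    return norm.ppf(u)


# The three shape settings used in the simulation study.
designs = {"Gaussian": (np.inf, 0.0), "t": (5.0, 0.0), "skew t": (5.0, -0.1)}
samples = {name: simulate_normal_scores(5000, nu, zeta)
           for name, (nu, zeta) in designs.items()}
```

Scatter plots of the two columns of each entry in `samples` give pictures comparable in spirit to the Gaussian, t and skew t panels; the rank-based transform is a convenience for plotting and would not be appropriate for likelihood evaluation.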
[ INSERT FIGURE 1 ABOUT HERE ]

We consider two cases for the dynamics of the copula: the benchmark static case, in which the conditional copula is constant, and the case of empirical interest, where the parameters of the copula evolve according to the GAS model introduced in Section 2. We set the number of clusters, G, to be 10 or 20, with an equal number of variables allocated to each cluster. In the static case, we assume that the loadings across groups on the market factor range from 0.25 to 2.50 in increments of 0.25, while the loadings on the group specific factors range from 2.5 to 0.25 in increments of -0.25. This implies that some groups are more influenced by the common market factor than their group factor, while the reverse is true for other groups, roughly mimicking the differences between industries like manufacturing and mining/construction. Naturally, in this case the GAS dynamic parameters $(\alpha^M, \beta^M, \alpha^C, \beta^C)$ are all zero.

In the dynamic case, we set the intercept parameters $(\omega^M_g, \omega^C_g)$ equal to 0.04 for all groups, which, combined with the common values for the GAS dynamic parameters $(\alpha^M, \beta^M, \alpha^C, \beta^C) = (0.02, 0.9, 0.02, 0.9)$, means that all groups have the same average loading on the market factor and on their group-specific factor. This homogeneity of loadings makes the estimation problem more difficult than if the loadings had different long-run averages, and is designed to further interrogate the ability of our clustering method to correctly assign variables to groups.

Table 1 presents the results for the static copula case with G = 10. In Panel A we see that the estimated parameters are centered on the true values, for all three copulas, and the standard errors on the factor loadings increase slightly (on average) as we move from Gaussian to t to skew t copulas. Panel B of Table 1 reports the striking result that in 100% of the simulations there were zero variables assigned to an incorrect group. That is, in every simulation the clustering algorithm was able to correctly allocate variables to their groups.⁹ In the Gaussian case, the clustering step is done using the correct model (a static Gaussian copula) while in the other two cases the model used in the clustering step is misspecified.
Panel B reveals that this misspecification leads to no errors in the classification of these variables.¹⁰ This is consistent with the exponential convergence rate (see equation 15) found in other contexts for cluster assignment estimators.

⁹ Recall that groups are identified only up to a re-labelling; we account for this when computing the accuracy of the estimated group assignments.

¹⁰ The clustering algorithm is not, of course, infallible: its accuracy depends on the structure of the DGP and the data available. In situations where the clusters are close together relative to sampling variation, estimated cluster assignments will inevitably contain errors. In our realistically-calibrated simulation design, the clusters appear to be sufficiently well separated that cluster assignments can be very accurately estimated.

Panel C of Table 1 reports the average estimation time (using a machine with an Intel Xeon Gold 6132 processor, with ten cores and a clock speed of 2.60 GHz) and the number of EM iterations required for convergence, and reveals no large differences in the difficulty of estimation across these models.

[ INSERT TABLE 1 ABOUT HERE ]

Table 2 presents the results for the dynamic copula case with $G = 10$. We again see that the estimated parameters are centered on the true values, and in Panel B we see the remarkable result that the clustering algorithm described above is able to correctly assign every variable to its group in 100% of simulations. Recall that the estimated cluster assignments are based on a static Gaussian copula model, which is misspecified in all three cases considered in Table 2. Table 2 shows that this model is rich enough to reveal the true clusters in the data even though it is misspecified, confirming the discussion in Section 3.1. Panel C of Table 2 shows that the clustering step for the dynamic model is almost as fast as for the static case (where one source of model misspecification is removed), while the copula parameter estimation step is naturally slower.

Table S1 in the Supplemental Appendix considers a design analogous to that for Table 2, except that we set the number of clusters to be 20 rather than 10. This is a more challenging estimation problem, and the time required for the cluster assignment estimation is greater (around 22 minutes compared with around 9 minutes for the $G = 10$ case), as is the time required for the copula estimation (around 60 minutes compared with around 40 minutes). In this design we also observe one, but only one, case where a variable is misclassified in the cluster assignment step, for the skew t copula DGP.

Overall, the results in Tables 1, 2 and S1 provide strong reassurance that the models and estimation methods proposed in Section 3 work well in finite samples, enabling us to take these to real data in the next section.

[ INSERT TABLE 2 ABOUT HERE ]
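The simulation results above count misclassified variables after allowing for the fact that groups are identified only up to a re-labelling. A minimal sketch of one way such a label-invariant error count could be computed is given below. The function name and the brute-force search over label permutations are illustrative assumptions, not the authors' implementation; the search is only practical for a handful of groups.

```python
from itertools import permutations


def misclassified_count(true_labels, est_labels, n_groups):
    """Smallest number of label mismatches over all relabellings of the estimate."""
    if len(true_labels) != len(est_labels):
        raise ValueError("label vectors must have the same length")
    best = len(true_labels)
    # Brute force over permutations of the group labels 0..n_groups-1.
    # Fine for a few groups, far too slow for G = 10 or 20.
    for perm in permutations(range(n_groups)):
        errors = sum(1 for t, e in zip(true_labels, est_labels) if perm[e] != t)
        best = min(best, errors)
    return best


true_groups = [0, 0, 1, 1, 2, 2]
relabelled = [2, 2, 0, 0, 1, 1]   # same partition, different label names
one_error = [2, 2, 0, 1, 1, 1]    # one variable placed in the wrong group

print(misclassified_count(true_groups, relabelled, 3))
print(misclassified_count(true_groups, one_error, 3))
```

For designs with 10 or 20 groups, the same minimum could instead be found by treating the relabelling as an assignment problem on the table of label co-occurrences, which avoids enumerating every permutation.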

5 Empirical application

5.1 Data and summary statistics

We study daily equity returns over the period from January 4, 2010 to December 31, 2019, a total of $T = 2516$ trade days. Every stock that was ever a constituent of the S&P 100 index during this sample, and which traded for the full sample period, is included in the data set, yielding a total of $N = 110$ firms. A list of those firms, including their names, ticker symbols, and two-digit Standard Industrial Classification (SIC) codes, is provided in Table S2 in the supplemental appendix. All data is obtained from Center for Research in Security Prices, CRSP US Stock Database, Wharton Research Data Services, http://www.whartonwrds.com/datasets/crsp/.

Table 3 presents summary statistics of the data and parameter estimates for the mean, variance and marginal distribution models. Panel A presents unconditional sample moments of the daily returns for each stock, and these moments are comparable to those observed in other studies. Given the skewness and kurtosis estimates reported in Panel A, our marginal distribution model combines an AR(1) for the conditional mean, GJR-GARCH(1,1) for the conditional variance, and a skewed t for the marginal distribution of the standardized residuals:

\[
\begin{aligned}
r_{i,t} &= \phi_{0i} + \phi_{1i} r_{i,t-1} + \epsilon_{i,t} \\
h_{i,t} &= \varpi_i + \beta_i h_{i,t-1} + \alpha_i \epsilon_{i,t-1}^2 + \kappa_i \epsilon_{i,t-1}^2 \mathbf{1}\{\epsilon_{i,t-1} \le 0\} \\
\frac{\epsilon_{i,t}}{\sqrt{h_{i,t}}} &\sim \text{iid Skew } t(\xi_i, \psi_i)
\end{aligned}
\]

where $h_{i,t}$ is the conditional variance at time $t$ for firm $i$ and Skew t is the univariate skewed t distribution of Hansen (1994) with the tail parameter $\xi_i$ and the asymmetry parameter $\psi_i$. Using quasi-maximum likelihood, we estimate the conditional mean and variance models; then, given the estimated standardized residuals, we estimate the skewed t parameters.
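To make the marginal model concrete, the sketch below filters a return series through the AR(1) mean and GJR-GARCH(1,1) variance recursions described above, for given parameter values. It is a minimal illustration, not the authors' code: the function name, the parameter values, and the choice to start the recursions at the sample mean and sample variance are all assumptions, and no estimation is performed.

```python
import math


def ar1_gjr_garch_filter(returns, phi0, phi1, varpi, beta, alpha, kappa):
    """Return (residuals, variances, standardized residuals) for given parameters."""
    # Assumption: initialise the recursions at the sample mean and sample variance.
    mean_r = sum(returns) / len(returns)
    h_prev = sum((r - mean_r) ** 2 for r in returns) / len(returns)
    r_prev, eps_prev = mean_r, 0.0
    eps, h, z = [], [], []
    for r in returns:
        e_t = r - phi0 - phi1 * r_prev                   # AR(1) residual
        leverage = kappa if eps_prev <= 0 else 0.0       # indicator 1{eps_{t-1} <= 0}
        h_t = varpi + beta * h_prev + (alpha + leverage) * eps_prev ** 2
        eps.append(e_t)
        h.append(h_t)
        z.append(e_t / math.sqrt(h_t))                   # standardized residual
        r_prev, eps_prev, h_prev = r, e_t, h_t
    return eps, h, z


# Toy returns and parameter values, for illustration only.
returns = [0.5, -1.2, 0.3, 0.8, -0.4]
eps, h, z = ar1_gjr_garch_filter(
    returns, phi0=0.0, phi1=0.0, varpi=0.05, beta=0.9, alpha=0.02, kappa=0.06
)
print(h)
```

In this toy run the variance rises most after the negative return, because the indicator term adds `kappa` to the weight on the lagged squared residual. The standardized residuals `z` are the inputs to which a skewed t distribution would then be fitted.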
Panel B of Table 3 provides the estimation results of the marginal distribution model, and the values there are consistent with those reported in the empirical finance literature (see, e.g., Bollerslev, Engle, and Nelson 1994). The standardized residuals still indicate substantial skewness ($\hat{\psi} = -0.027$ on average) and kurtosis ($\hat{\xi} = 5.089$ on average). Given the marginal model parameters we obtain the probability integral transforms, $u_{it}$, used in the estimation of the copula.

Panel C of Table 3 presents Pearson’s linear correlations and Spearman’s rank correlations between those standardized residuals. The 5% and 95% quantiles of these correlations are 0.17 and 0.49 for Pearson’s measure and 0.20 and 0.53 for Spearman’s, indicating heterogeneous pairwise dependence and motivating our flexible factor copula specification presented in Section 2.

[ INSERT TABLE 3 ABOUT HERE ]

5.2 Estimated cluster assignments

We first use the method described in Section 3 to estimate the group assignments for each variable. To determine the optimal number of groups, we use the BIC for the fitted static Gaussian copula model.¹¹ The value of the BIC for each choice of $G$ is plotted in Figure 2, along with the values of the BIC obtained when using one-digit or two-digit SIC codes to determine group assignments. In our sample there are seven one-digit SIC groups and 21 two-digit SIC groups.¹² Figure 2 reveals that the BIC from a model using only four estimated groups dominates the seven one-digit SIC groups, and a model with just five estimated groups beats the 21-group model based on two-digit SIC codes. These rankings reveal the gains available from a data-driven assignment of stocks to groups, rather than assignments based on SIC codes.¹³

The optimal number of estimated groups, according to the BIC, is 21, which is coincidentally the same as the number of two-digit SIC groups.¹⁴ We note that the BIC curve is relatively flat near the optimum, indicating that choosing $G$ between 20 and 25 leads to approximately the same fit; i.e., there is some robustness to the specific choice of $G$.

¹¹ The BIC is computed as $\mathrm{BIC}(G) = -2 \sum_{t=1}^{T} \log c\left(u_t; \hat{\theta}_T^{(G)}, \hat{\Gamma}_T^{(G)}\right) + 2G \log(T)$, where $G$ denotes the number of clusters, leading to $2G$ parameters to be estimated. We use the notation $(\hat{\theta}_T^{(G)}, \hat{\Gamma}_T^{(G)})$ to emphasize that the parameters of the copula vary with $G$.

¹² Our model cannot accommodate groups with only one member, and when estimating with SIC-based clusters we address this by moving stocks that are a singleton in their group to the SIC group with which they have the highest correlation. Specifically, in the one-digit clustering model, Weyerhaeuser (WY) is the only stock in the one-digit SIC group 0, and we move it to SIC group 3. In the two-digit clustering model, FCX (10), NKE (30), WY (08), FDX (45) and V (61) are all singletons, and those are moved into the two-digit SIC groups 13, 37, 37, 42, and 60, respectively.

¹³ Opschoor et al. (2020) compare cluster assignments based on SIC codes with those based on some other common characteristics: market capitalization (size), the book-to-market ratio (value), and past returns (momentum). They find that SIC-based assignments easily dominate these alternatives.

¹⁴ We used a set of 100 random starting values for $\Gamma$, the cluster assignment vector, in estimation, and did not use information from SIC codes at all in the EM-based model.

[ INSERT FIGURE 2 ABOUT HERE ]

Table 4 presents the estimated group assignments for the 110 stocks in our sample, along with each stock’s SIC code. Some of the estimated groups line up closely with a two-digit SIC group. For example, the largest group (Group 1) comprises 13 stocks, ten of which have SIC code 28 (“Chemical & Allied Products” manufacturing). The three other stocks (Baxter, Medtronic and United Health) have different SIC codes, but are clearly broadly in the same category as the rest of this group. Group 5, as another example, looks clearly like a “Tech” group, and all but two members have SIC code 73 (“Business Services”). The two listed with other codes are Apple (listed as 35, “Industrial Machinery & Equipment” manufacturing) and Netflix (listed as 78, “Motion Pictures”). Despite the different SIC codes, most investors would agree that Apple and Netflix fit neatly in a cluster containing Google, Amazon and eBay. Among the smaller clusters, we see some obvious pairs of stocks grouped together: AT&T and Verizon; Lowe’s and Home Depot; Mastercard and Visa; McDonald’s and Starbucks.

Overall, the group assignments in Table 4 look economically plausible, in addition to representing a much better statistical fit according to the BIC. In Section 5.4 we conduct formal out-of-sample forecast comparison tests to determine whether the improved in-sample fit leads to significantly better out-of-sample forecasts, and in Section 5.5 we study the economic environments in which this feature of the model is most helpful.
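The BIC-based choice of the number of groups used in this section follows the formula in footnote 11: minus twice the fitted copula log-likelihood plus a penalty of $2G \log(T)$. A minimal sketch of that selection step is shown below. The function names are hypothetical and the log-likelihood values are made up for illustration; they are not results from the paper.

```python
import math


def copula_bic(log_lik, n_groups, n_obs):
    """BIC(G) = -2 * log-likelihood + 2G * log(T), as in footnote 11."""
    return -2.0 * log_lik + 2 * n_groups * math.log(n_obs)


def select_n_groups(log_lik_by_g, n_obs):
    """Return the G with the smallest BIC, plus the BIC for every candidate G."""
    bic = {g: copula_bic(ll, g, n_obs) for g, ll in log_lik_by_g.items()}
    return min(bic, key=bic.get), bic


# Made-up log-likelihood values, for illustration only (not results from the paper).
toy_log_lik = {3: 1000.0, 4: 1040.0, 5: 1050.0, 6: 1052.0}
best_g, bic_values = select_n_groups(toy_log_lik, n_obs=2516)
print(best_g)
```

In this toy example the log-likelihood keeps rising with $G$, but the gain from five to six groups is smaller than the extra penalty, so the BIC stops at five. A flat BIC curve near the minimum, as reported for Figure 2, corresponds to several neighbouring values of $G$ having nearly equal entries in `bic_values`.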
[ INSERT TABLE 4 ABOUT HERE ]

5.3 Estimated dependence time series

We now compare the fitted dependence time series from the two-digit SIC factor copula model and the factor copula model with estimated group assignments. We use rank correlations as a summary measure for the strength and direction of the dependence between assets implied by these models. With a fully-specified copula model such as the ones employed here, it is also possible to extract other dependence measures, such as tail dependence or probabilities of joint tail events, see e.g. the measures in Giesecke and Kim (2011) and Oh and Patton (2018).

The complete rank correlation matrix is $110 \times 110$, and even just focusing on the blocks implied by the factor structure embedded in the model the matrix is $21 \times 21$. As an initial summary measure, we firstly consider the conditional rank correlation for pairs in the same group. Figure 3 plots these for three groups, along with the two-digit SIC group that best matches the estimated group.¹⁵,¹⁶ The top panel compares estimated group 3 with SIC group 13. We observe that the two conditional rank correlation paths track each other quite closely, but the rank correlations based on estimated group assignments appear to adjust more quickly to news, and the SIC-based estimates look somewhat like a rolling average of the path from the model with estimated group assignments. A similar picture arises in the middle panel, comparing estimated group 7 with SIC group 36. It appears that by getting group assignments that better match the data, the model is more quickly able to react to information that suggests dependence has gone up or down.

The lower panel of Figure 3 compares estimated group 9 and SIC group 49, and represents a particularly interesting comparison. Group 9 contains six members, and all of them are from SIC group 49 (“Electric, Gas, & Sanitary Services,” in the “Transportation & Public Utilities” group). There is just one other SIC group 49 stock in our sample (Williams, ticker WMB), and this stock was estimated to belong to group 3, which is dominated by SIC group 13 members (SIC 13 is “Oil & Gas Extraction” in the “Mining” group).
From the firm’s description on its website, it conducts a mix of activities captured by these SIC labels, and it turns out that our cluster assignment algorithm estimates it to be a better match with mining firms than with utilities firms. The lower panel of Figure 3 shows that by removing just this one stock the within-group rank correlation rises from around 0.55 to around 0.68. Moreover, we again see that the conditional rank correlations are more dynamic in the model with estimated group assignments.

[ INSERT FIGURE 3 ABOUT HERE ]

¹⁵ Figures S1-S2 in the supplemental appendix present other comparisons of fitted rank correlations from the two models.

¹⁶ For example, estimated Group 3 has eleven members, including all eight of the SIC group 13 stocks. Estimated Group 7 has seven members including all five members of SIC group 36. Estimated Group 9 has six members and all of them belong to SIC group 49; the single other SIC group 49 member was estimated to belong to Group 3.

The plots of conditional rank correlations in Figure 3 allow us to see differences in pairwise dependence implied by the two models. For a more complete depiction of the differences between the models, in the upper panel of Figure 4 we plot the QLIKE distance measure between the full $110 \times 110$ rank correlation matrices implied by the two models.¹⁷ When this measure is lower, the rank correlation matrices are more similar. We see that the difference is largest in mid-2011, and also large in late 2015, while it was relatively low in 2012. The middle panel of Figure 4 presents the normalized sum of the first 22 eigenvalues of the model-implied rank correlation matrices. Both of the models are based on a 22-factor model (one common factor and 21 group-specific factors), and the sum of the first 22 eigenvalues provides a summary of how informative the factors are.¹⁸ We see that the sum is uniformly greater for the model with estimated group assignments than for the model based on SIC group assignments. Note that the period when the two sums are furthest apart corresponds to the period when the QLIKE distance is also the greatest, indicating that this is one reason for the increased QLIKE distance. The lower panel of Figure 4 plots cross-sectional dispersion in pairwise rank correlations. We see that this dispersion has been broadly increasing over the sample period, and that periods when the two models differ most in the degree of dispersion also correspond to times when the QLIKE distance is larger.

[ INSERT FIGURE 4 ABOUT HERE ]

5.4 Out-of-sample forecast performance

We next compare the out-of-sample (OOS) forecasts of the factor copula models using SIC-based group assignments with those using estimated group assignments. To do so, we split our sample period in half, using data from 2010 to 2014 to estimate the models, and data from 2015 to 2019 to evaluate the models.
Given the computational complexity of the models, we estimate the models only once, on the last day of the in-sample period, and retain those parameters throughout the OOS period.

¹⁷ The QLIKE distance between two $(N \times N)$ matrices is $\mathrm{QLIKE}(A, B) = \mathrm{tr}\left(A^{-1}B\right) - \log\left|A^{-1}B\right| - N$.

¹⁸ Figure S3 in the supplemental appendix presents corresponding results using just the largest eigenvalue, or the sum of the first three eigenvalues. The largest eigenvalues from each of the models are roughly equal, although, similar to the pairwise rank correlation plots, the time series from the model with estimated group assignments appears more dynamic. The plot of the sum of the largest three eigenvalues reveals not only more dynamics, but a slight gap in the level, though it is not as large or as uniform as it is for the sum of the first 22 eigenvalues.

In Table 5 we use OOS forecast performance to determine the optimal shape of the copula (Gaussian, t, or skew t), as well as the optimal choice of dynamics (static vs. GAS).¹⁹ We do this for a range of choices for the number of groups, to determine the robustness of the conclusions, and also for the two SIC-based group assignments. In all cases we compare the models using their out-of-sample log-likelihoods; the log-likelihood is a proper scoring rule for ranking density forecasts, see Gneiting and Raftery (2007). We test for the significance of the differences in OOS likelihoods using a Diebold and Mariano (1995) test with a Newey-West (1987) estimator of the standard error based on 10 lags.

The left panel of Table 5 clearly indicates that including GAS dynamics in the model improves the fit: in all cases the t-statistic is positive, and the smallest t-statistic across all configurations is 6.5, indicating strong evidence in favor of the GAS model over the static model. The right panel of Table 5 uses GAS dynamics in all cases, and we compare the choice of copula shape across various choices of the number of groups. We find in all cases that the t and skew t models outperform the Gaussian factor copula, with t-statistics all greater than 7.7. This is consistent with previous findings in the literature (see, e.g., Patton, 2004, 2013, and Amengual and Sentana, 2020) that the Normal copula is not a good description of equity return dependence. In the last column of Table 5 we compare the t and skew t copulas, and we find that the t-statistics are all negative, and generally significant, indicating that the estimation of the additional skewness parameter in the skew t copula leads to worse OOS performance than the symmetric t factor copula.
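The model comparisons in Table 5 rest on a t-statistic for the mean difference in OOS log-likelihoods, with a Newey-West standard error based on 10 lags. The sketch below shows one standard way to compute such a statistic. It is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the Bartlett weights $1 - j/(L+1)$ are the usual Newey-West choice, and the data are simulated.

```python
import math
import random


def newey_west_long_run_variance(x, n_lags):
    """Bartlett-kernel (Newey-West) estimate of the long-run variance of x."""
    n = len(x)
    mean_x = sum(x) / n
    dev = [v - mean_x for v in x]
    lrv = sum(d * d for d in dev) / n
    for lag in range(1, n_lags + 1):
        gamma = sum(dev[t] * dev[t - lag] for t in range(lag, n)) / n
        lrv += 2.0 * (1.0 - lag / (n_lags + 1)) * gamma
    return lrv


def diebold_mariano_tstat(loglik_a, loglik_b, n_lags=10):
    """t-statistic for equal expected log-likelihood; positive values favour model A."""
    d = [a - b for a, b in zip(loglik_a, loglik_b)]
    n = len(d)
    mean_d = sum(d) / n
    return mean_d / math.sqrt(newey_west_long_run_variance(d, n_lags) / n)


# Simulated daily log-likelihoods in which model A is better on average.
rng = random.Random(0)
loglik_b = [rng.gauss(0.0, 1.0) for _ in range(500)]
loglik_a = [b + 0.5 + rng.gauss(0.0, 0.5) for b in loglik_b]
t_stat = diebold_mariano_tstat(loglik_a, loglik_b)
print(round(t_stat, 2))
```

With this sign convention, a positive t-statistic means the first model has the higher average OOS log-likelihood, matching the reading of Table 5 in which positive values favour the more flexible specification.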
The worse OOS performance of the skew t copula is in contrast with the in-sample parameter estimates (presented in Tables S3 and S4 in the supplemental appendix), where the copula asymmetry parameter is significantly negative.²⁰ These conflicting results can be reconciled by the fact that OOS forecast comparisons tend to carry a strong implicit penalty for estimation error, and so unless the new parameter is far from zero and precisely estimated, better forecasts may be obtained by setting it to zero.

¹⁹ In addition to being economically interesting in their own right, OOS forecast comparisons allow us to conduct formal statistical tests without having to make assumptions about the error rate in the estimated group assignments (see Section 3.2) that cannot be verified.

²⁰ These two tables present parameter estimates and standard errors for models using one-digit SIC codes (seven clusters) or using estimated cluster assignments (using the BIC-optimal number of clusters, 21). In both cases we see that the copula asymmetry parameter is significantly negative. Standard errors in these tables are computed assuming that estimation error from cluster assignments is negligible, as discussed in Section 3.2.

[ INSERT TABLE 5 ABOUT HERE ]

In Table 6 we compare the OOS performance of t factor copulas with GAS dynamics that use different numbers of groups. Consistent with the BIC rankings of models presented in Figure 2, the model based on one-digit SIC groupings is beaten by every other model except the estimated group assignment model with only 3 groups. Similarly, the model based on two-digit SIC groupings is beaten by every other model except the one-digit SIC model and the estimated group assignment models with only 3 or 4 groups. Amongst the models with estimated group assignments, the model with $G = 20$ groups performs the best in terms of OOS likelihoods, significantly beating every other model, including the $G = 21$ model which was selected as being optimal over the full sample according to the BIC. That the optimal model for OOS forecasting is smaller than the optimal model for in-sample fitting is consistent with the abovementioned predilection of OOS forecasts for parsimonious models.

[ INSERT TABLE 6 ABOUT HERE ]

5.5 Economic determinants of forecast performance

We next seek to understand the economic environments in which specific features of the model lead to meaningful gains in forecast performance. We focus on three key model features: the model for dynamics in the conditional copula, the model for the shape of the copula, and the method for assigning stocks to groups in the factor structure.

To summarize the economic environment we use three conditioning variables. Firstly, we use the Chicago Board Options Exchange’s volatility index or “VIX” (see Whaley, 2009), which is a widely-used measure of market volatility.
Next we use the cross-sectional standard deviation of stock returns on each day, which is a common measure of the degree of idiosyncratic risk (see Goyal and Santa-Clara, 2003, for example).²¹ Finally, we use the absolute value of the cross-sectional average of the difference between realized returns and the return predicted by the capital asset pricing model (CAPM), known as “alpha.”²² This measure is interpretable as a measure of the degree of mispricing relative to the CAPM on a given day, or, more robustly, as a measure of how important non-market factors were for determining stock returns on a given day. These three measures each provide a different view of the economic environment on a given day.

²¹ We also considered the idiosyncratic risk relative to the CAPM for these stocks. That series had correlation of 0.99 with simple cross-sectional dispersion, and so we present results only using the latter.

We use two complementary methods in this analysis. Firstly, we use the parametric approach of Giacomini and White (2005) (GW), where differences in out-of-sample forecast performance are regressed on one or more of the conditioning variables summarizing the economic environment. Secondly, we use the nonparametric “conditional superior predictive ability” (CSPA) test of Li et al. (2021). The GW test allows us to determine whether the conditioning variable(s) help explain the relative performance of the competing models, while the CSPA test determines whether one model has expected performance above/below the other model for all values of the conditioning variable. It is possible for neither, one, or both of these tests to reject their respective null hypotheses, and thus they provide complementary information about relative performance.

Table 7 presents the results for the three conditional model comparisons. In the GW tests, we consider the variables separately and jointly. We de-mean all regressors so that the intercept is interpretable as the expected difference in log-likelihoods when the regressors equal their average value. In all cases the intercepts are positive and strongly significant (consistent with Tables 5 and 6), indicating that in each comparison the more flexible model is preferred to the simpler model. Our interest in this analysis is primarily the slope coefficients, which afford us insight into whether there are specific economic environments where one model outperforms the other.
The penultimate and antepenultimate rows show the p-values from joint tests of all coefficients in the regression, or only the slope coefficients.

²² We estimate the CAPM beta for each stock just once, using the full sample of data.

In the left panel of Table 7 we compare a model with no dynamics in the conditional copula with a model that has GAS dynamics, and we see that the most useful variable is dispersion, which has a t-statistic of over three. The positive sign of this coefficient reveals that the gains from allowing for dynamics in the conditional copula are particularly great when stock returns exhibit higher idiosyncratic risk.

The middle panel of Table 7 presents results comparing a Gaussian copula with a Student’s t copula, both with GAS dynamics. Here we see that VIX is the most useful explanatory variable, with a t-statistic of 2.8. The positive slope coefficient implies that when volatility is high the gains from allowing for joint fat tails, as provided by the Student’s t copula, are larger than in low-volatility times. We note that dispersion and absolute alpha also have positive and significant coefficients in this panel, indicating that joint fat tails are also important when cross-sectional dispersion is high, and when non-market factors are particularly important for the cross-section of realized returns.²³

The right panel of Table 7 examines when the gains from using estimated group assignments rather than two-digit SIC groups are greatest. In this panel the most significant individual variable is absolute alpha, revealing that data-driven groupings are particularly valuable when non-market factors drive the cross-section of realized returns.

[ INSERT TABLE 7 ABOUT HERE ]

The bottom row of Table 7 presents the p-value from the nonparametric CSPA test for each of these comparisons.²⁴ We find p-values of less than 0.005 in all comparisons, consistent with (but generalizing) the strongly significant intercepts in the GW regressions, indicating that the more flexible model in each comparison dominates the more restrictive specification uniformly on the support of the conditioning variable.
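The GW regressions in Table 7 regress differences in OOS log-likelihoods on de-meaned conditioning variables, so that the intercept equals the average difference. A minimal single-regressor sketch of the point estimates is given below. The function name and the toy data are assumptions for illustration; the standard errors and test statistics reported in Table 7 are not reproduced here.

```python
def gw_regression(loss_diff, cond_var):
    """OLS of loss_diff on a constant and the de-meaned conditioning variable."""
    n = len(loss_diff)
    x_bar = sum(cond_var) / n
    x = [v - x_bar for v in cond_var]            # de-meaned regressor, sums to zero
    # Because the regressor sums to zero, the OLS intercept is the sample mean.
    intercept = sum(loss_diff) / n
    slope = sum(xi * yi for xi, yi in zip(x, loss_diff)) / sum(xi * xi for xi in x)
    return intercept, slope


# Toy data: the gain from the more flexible model rises with the conditioning variable.
cond_var = [10.0, 12.0, 15.0, 20.0, 30.0, 25.0]
loss_diff = [0.1 + 0.02 * v for v in cond_var]
intercept, slope = gw_regression(loss_diff, cond_var)
print(intercept, slope)
```

Here the intercept is the expected gain when the conditioning variable is at its average value, and a positive slope indicates that the gain is larger when the variable is high, which is how the dispersion, VIX and absolute alpha coefficients are read in the text.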
To understand these nonparametric relationships better, Figure 5 presents simple nonparametric kernel-smoothed estimates of the differences in forecast performance as a function of each of the conditioning variables, for values between the 0.01 and 0.99 sample quantiles of each variable.²⁵ We see that the gains from allowing for dynamics in the conditional copula (top row) are roughly flat in VIX (consistent with the GW regression), while they are generally increasing in dispersion and absolute alpha. In particular, they increase almost linearly as dispersion increases from its first percentile (0.6) to about 1.8 (corresponding to its 88th percentile), and then flatten beyond that. The gains from allowing for joint fat tails in the copula (second row of figures) are roughly flat when the three conditioning variables are in the lower half of their support, and then sharply increase beyond that, indicating that in times of high volatility, high idiosyncratic risk, or high importance of non-market factors, joint fat tails are particularly useful in the model. The gains from estimating group assignments (bottom row) are effectively unrelated to volatility, as measured by VIX, but they are strongly increasing in dispersion and absolute alpha. It is noteworthy that even when all three of these conditioning variables are at their first percentile, that is, when markets are calm, not dispersed, and the market factor is dominant, the gains from allowing for dynamics, joint fat tails, and estimated group assignments are each significantly larger than zero. This is strong support for the importance of these features of the conditional copula.

²³ In the joint regression none of the three slope coefficients is individually significant, though they are strongly jointly significant (p-value less than 0.005), consistent with the presence of multicollinearity. The correlations between the regressors range from 0.21 to 0.50, indicating moderate multicollinearity.

²⁴ Being a nonparametric test, the CSPA test suffers from the curse of dimensionality. For this reason, we implement it only for each variable separately, not for all variables jointly.

²⁵ The estimate and confidence intervals are computed using Theorem 2.2 of Li and Racine (2007).

[ INSERT FIGURE 5 ABOUT HERE ]

6 Conclusion

This paper proposes a new dynamic factor copula model for use in high dimensional time series applications. Our model does not require variables to be grouped according to ex ante information, like SIC industry codes or similar; instead we estimate the optimal assignment of variables to groups from the data using a k-means type approach. Our clustering method exploits the fact that clusters can be consistently estimated from a static version of the copula model, rather than the more computationally-challenging dynamic version that is of primary interest, making the clustering problem feasible. We show via an extensive simulation study that group assignments can be accurately estimated in finite samples.
In an application to 110 U.S. equity returns over the period 2010-2019 we (cid:133)nd evidence that a model with estimated group assignments signi(cid:133)cantly outperformsanotherwiseidenticalmodelwithgroupassignmentsdeterminedusingSICcodes. The improvement in (cid:133)t appears to come from a better assignment of stocks that are labeled with one SIC code but comove more like stocks from a di⁄erent SIC code, which allows the dynamic model 24

to react to new information more quickly. The methods in this paper suggest at least two directions for future research. First, one could consider how to allow for time variation in estimated cluster assignments. For example, Lumsdaine et al. (2022) considers estimated group assignments that are subject to structural breaks, while Custodio Joªo et al. (2022) model group assignments as evolving according to a hidden Markov model. Adapting these ideas to models of the form considered in this paper, which are already computationally intensive, is an interesting and challenging problem. A second direction is to consider extensions to (cid:147)vast(cid:148)dimensional data sets. Our analysis considers a cross-section of 110 stocks, which is large in absolute terms but small relative to our time series of around 2500 observations. Applicationsinvolvingcross-sectionscomparableinsizetothetimeseriesmayrequire some new methods, e.g., adapting methods from random matrix theory (as in Fan et al., 2013, for example) as well as alternative, faster, methods for estimating group assignments, e.g., hierarchical clustering (see, e.g., Hastie, et al., 2009). We leave these interesting extensions for future work. Appendix A.1 The probability density of the skewed t copula We adopt the skewed t copula discussed in Demarta and McNeil (2005) and Christo⁄ersen et al. (2012). Speci(cid:133)cally, it is the copula embedded in the multivariate skewed t distribution of X , t where: X = W Z +(cid:16)W (16) t t t t p where (cid:16) is a N 1 asymmetry parameter vector (cid:133)lled with an identical scalar (cid:16), W is an inverse t (cid:2) gamma variable W IG (cid:23); (cid:23) , Z is a N 1 normal variable Z (0;R ), and W and Z are t (cid:24) 2 2 t (cid:2) t (cid:24) N t t t (cid:0) (cid:1) 25

independent. The probability density function of the skewed t copula is given by 2 ((cid:23) (cid:0) 2) 2 (N (cid:0) 1) K (cid:23)+N (cid:23) +x 0t R (cid:0)t 1x t (cid:16) 0 R (cid:0)t 1(cid:16) exp x 0t R (cid:0)t 1(cid:16) 2 c(x t ; R t ;(cid:23);(cid:16)) = (cid:18)q (cid:0) (cid:1) (cid:23)+N (cid:19) (cid:0) (cid:1) (17) (cid:0)(cid:127) (cid:23) 2 1 (cid:0) N j R t j 1 2 (cid:23) +x 0t R (cid:0)t 1x t (cid:16) 0 R (cid:0)t 1(cid:16) (cid:0) 2 1+ (cid:23) 1x 0t R (cid:0)t 1x t (cid:23)+ 2 N (cid:18)q (cid:19) (cid:0) (cid:1) (cid:0) (cid:23)+1 (cid:1) (cid:23)+1 (cid:0) (cid:1) N (cid:23) +x2 (cid:16)2 (cid:0) 2 1+ x2 it 2 it (cid:23) (cid:2) Y i=1 (cid:16) K q (cid:23) (cid:0) +1 (cid:1)(cid:23) + (cid:17) x2 it (cid:16)2 (cid:16) exp(x (cid:17) it (cid:16)) 2 (cid:16)q (cid:17) (cid:0) (cid:1) where (cid:0)(cid:127) is the Gamma function, K( ) is the modi(cid:133)ed Bessel function of the third kind (also (cid:1) called the modi(cid:133)ed Bessel function of the second kind, or the modi(cid:133)ed Hankel function), and x = t [x 1;t ; (cid:1)(cid:1)(cid:1) ;x N;t ] 0 = T s(cid:0)k 1 ew (u 1;t ;(cid:23);(cid:16)); (cid:1)(cid:1)(cid:1) ;T s(cid:0)k 1 ew (u N;t ;(cid:23);(cid:16)) 0 are obtained by applying the inverse of the univariate skew(cid:2)ed t distribution from equation (16) d(cid:3)e(cid:133)ned by 21 0:5((cid:23)+1)K ((cid:23) +x2)(cid:16)2 exp(x(cid:16)) y (cid:0) (cid:23)+1 2 T skew (y;(cid:23);(cid:16)) = (cid:18)q (cid:23)+1 (cid:19) dx: (cid:23)+1 Z(cid:0)1 (cid:0)(cid:127) (cid:23) p(cid:25)(cid:23) ((cid:23) +x2)(cid:16)2 (cid:0) 2 1+ x2 2 2 (cid:23) (cid:18)q (cid:19) (cid:16) (cid:17) (cid:0) (cid:1) Since T 1 ( ;(cid:23);(cid:16)) is not available in closed form, we generate 1,000,000 random draws from equas(cid:0)kew (cid:1) tion(16)anduselinearinterpolationtoapproximateT 1 ( ;(cid:23);(cid:16))on(0;1). 
Note that $T_{skew}(\cdot;\nu,\zeta)$ is identical across the cross-sectional dimension and also over time because the shape parameters of this distribution are assumed constant. This means that we can approximate $T_{skew}^{-1}$ just once per parameter set $[\nu,\zeta]$ and apply it to all copula inputs $\{u_t\}_{t=1}^{T}$, making the likelihood evaluation of the skewed $t$ copula very fast. An alternative to this simulation-based approach for finding an inverse CDF is to use quadrature-based methods, though in our initial analyses of such methods we found them too slow for use in the computationally intensive estimation problem considered here.

A.2 Derivation of the score

From equation (17), the log-likelihood of the skewed $t$ copula is given by

$$\log c_{Skewt,t}(x_t;R_t,\nu,\zeta) = -\frac{1}{2}\log|R_t| - \frac{\nu+N}{2}\log\!\left(1+\frac{1}{\nu}x_t'R_t^{-1}x_t\right) + \log K_{\frac{\nu+N}{2}}\!\left(\sqrt{\left(\nu+x_t'R_t^{-1}x_t\right)\zeta'R_t^{-1}\zeta}\right) + x_t'R_t^{-1}\zeta + \frac{\nu+N}{2}\log\!\left(\sqrt{\left(\nu+x_t'R_t^{-1}x_t\right)\zeta'R_t^{-1}\zeta}\right) + const(\nu,\zeta) \tag{18}$$

where $const(\nu,\zeta)$ contains any components that do not depend on $R_t$, and recall that

$$R_t = \tilde L_t'\tilde L_t + D_t, \qquad \tilde L_t = \left[\tilde\lambda_{1,t},\ldots,\tilde\lambda_{N,t}\right], \qquad D_t = diag\left(\sigma^2_{1,t},\ldots,\sigma^2_{N,t}\right)$$

where

$$\tilde\lambda_{i,t} = \frac{\lambda_{i,t}}{\sqrt{1+\lambda_{i,t}'\lambda_{i,t}}}, \qquad \sigma^2_{i,t} = \frac{1}{1+\lambda_{i,t}'\lambda_{i,t}}.$$

To derive the score, we first define $L_t = (\lambda_{1,t},\ldots,\lambda_{N,t}) \in \mathbb{R}^{k\times N}$ where $\lambda_{i,t}$ is a $k\times 1$ vector of factor loadings for variable $i$. In the example of equation (7), the number of factors denoted by $k$ is $G+1$ and if the variable $i$ belongs to Group 3, then $\lambda_{i,t} = [\lambda_{M,3,t},0,0,\lambda_{3,3,t},0,\ldots,0]'$. By the chain rule, the derivative of equation (18) with respect to $\eta_t$ (given in equation 9) can be written as the product of three factors

$$\underbrace{\frac{\partial \log c_{Skewt,t}(x_t;R_t,\nu,\zeta)}{\partial \eta_t'}}_{(1\times 2G)} = \underbrace{\frac{\partial \log c_{Skewt,t}(x_t;R_t,\nu,\zeta)}{\partial vec(R_t)'}}_{(1\times N^2)} \cdot \underbrace{\frac{\partial vec(R_t)}{\partial vec(L_t)'}}_{(N^2\times (G+1)N)} \cdot \underbrace{\frac{\partial vec(L_t)}{\partial \eta_t'}}_{((G+1)N\times 2G)} \tag{19}$$

where $vec(\cdot)$ stacks the columns of the matrix on top of one another to form a vector. The first factor of equation (19) can be written as

$$\frac{\partial \log c_{Skewt,t}(x_t;R_t,\nu,\zeta)}{\partial vec(R_t)'} = \left(vec\!\left(\frac{\partial \log c_{Skewt,t}(x_t;R_t,\nu,\zeta)}{\partial R_t}\right)\right)'$$

so we focus on

$$\frac{\partial \log c_{Skewt,t}(x_t;R_t,\nu,\zeta)}{\partial R_t} = -\frac{1}{2}\cdot\frac{\partial \log|R_t|}{\partial R_t} - \frac{\nu+N}{2}\cdot\frac{\partial \log\!\left(1+\nu^{-1}x_t'R_t^{-1}x_t\right)}{\partial R_t} + \frac{\partial \log K_{\frac{\nu+N}{2}}\!\left(\sqrt{\left(\nu+x_t'R_t^{-1}x_t\right)\zeta'R_t^{-1}\zeta}\right)}{\partial R_t} + \frac{\partial\, x_t'R_t^{-1}\zeta}{\partial R_t} + \frac{\nu+N}{2}\cdot\frac{\partial \log\!\left(\sqrt{\left(\nu+x_t'R_t^{-1}x_t\right)\zeta'R_t^{-1}\zeta}\right)}{\partial R_t}$$

Two useful formulas from matrix differential calculus are

$$\frac{d\left(v'M^{-1}w\right)}{dM} = -\left(M^{-1}\right)'vw'\left(M^{-1}\right)' \qquad \text{and} \qquad \frac{d\log|M|}{dM} = M^{-1}$$

where $M$ is a symmetric non-singular matrix, and $v$ and $w$ are vectors conformable with $M$. With those formulas we calculate each component separately to obtain

$$\frac{\partial \log c_{Skewt,t}(x_t;R_t,\nu,\zeta)}{\partial R_t} = -\frac{1}{2}R_t^{-1} + \frac{\nu+N}{2}\cdot\frac{R_t^{-1}x_tx_t'R_t^{-1}}{\nu+x_t'R_t^{-1}x_t} + \frac{K'_{\frac{\nu+N}{2}}(A_t)}{K_{\frac{\nu+N}{2}}(A_t)}\cdot\frac{B_t}{2A_t} - R_t^{-1}x_t\zeta'R_t^{-1} + \frac{\nu+N}{4}\cdot\frac{B_t}{A_t^2} \tag{20}$$

where

$$A_t = \sqrt{\left(\nu+x_t'R_t^{-1}x_t\right)\zeta'R_t^{-1}\zeta}$$

$$B_t = -\left(\nu+x_t'R_t^{-1}x_t\right)\left(R_t^{-1}\zeta\zeta'R_t^{-1}\right) - \left(R_t^{-1}x_tx_t'R_t^{-1}\right)\left(\zeta'R_t^{-1}\zeta\right)$$

$$K'_{\frac{\nu+N}{2}}(\cdot) = -\frac{1}{2}\left[K_{\frac{\nu+N}{2}-1}(\cdot) + K_{\frac{\nu+N}{2}+1}(\cdot)\right].$$

As $\zeta \to 0$, equation (20) boils down to the derivative of the log density of the Student $t$ copula,

$$\frac{\partial \log c_{Student,t}(x_t;R_t,\nu)}{\partial R_t} = -\frac{1}{2}R_t^{-1} + \frac{\nu+N}{2}\cdot\frac{R_t^{-1}x_tx_t'R_t^{-1}}{\nu+x_t'R_t^{-1}x_t}$$

and in addition, as $\nu \to \infty$, it becomes that of the Gaussian copula,

$$\frac{\partial \log c_{Gaussian,t}(x_t;R_t)}{\partial R_t} = -\frac{1}{2}R_t^{-1} + \frac{1}{2}R_t^{-1}x_tx_t'R_t^{-1}.$$

To derive the second and third factors in equation (19) we closely follow the logic and notation of Opschoor et al. (2020). The second factor is re-written as

$$\frac{\partial vec(R_t)}{\partial vec(L_t)'} = \frac{\partial vec\left(\tilde L_t'\tilde L_t\right)}{\partial vec\left(\tilde L_t\right)'}\cdot\frac{\partial vec\left(\tilde L_t\right)}{\partial vec(L_t)'} + \frac{\partial vec(D_t)}{\partial vec(L_t)'} = \left(I_{N^2}+J_{N,N}\right)\left(I_N\otimes\tilde L_t'\right)\cdot\frac{\partial vec\left(\tilde L_t\right)}{\partial vec(L_t)'} + \frac{\partial vec(D_t)}{\partial vec(L_t)'} \tag{21}$$

with another useful formula

$$\frac{\partial vec(M'M)}{\partial vec(M)'} = \left(I_{n^2}+J_{n,n}\right)\left(I_n\otimes M'\right)$$

where $M \in \mathbb{R}^{m\times n}$, and $J_{m,n} \in \mathbb{R}^{mn\times mn}$ is the vectorized transpose matrix, i.e. $vec(M') = J_{m,n}\,vec(M)$. Furthermore, we have

$$\frac{\partial vec\left(\tilde L_t\right)}{\partial vec(L_t)'} = \begin{pmatrix} Q_{1,t} & 0 & \cdots & 0\\ 0 & Q_{2,t} & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & Q_{N,t}\end{pmatrix}, \qquad Q_{i,t} = \frac{I_k}{\sqrt{1+\lambda_{i,t}'\lambda_{i,t}}} - \frac{\lambda_{i,t}\lambda_{i,t}'}{\left(1+\lambda_{i,t}'\lambda_{i,t}\right)^{3/2}}$$

for $i = 1,\ldots,N$. (The denominator on the final term in the above equation corrects a typo in Opschoor et al., 2020.) The sparsity of $\left(I_N\otimes\tilde L_t'\right)$ and $\partial vec\left(\tilde L_t\right)/\partial vec(L_t)'$ simplifies the product of those two factors to

$$\left(I_N\otimes\tilde L_t'\right)\cdot\frac{\partial vec\left(\tilde L_t\right)}{\partial vec(L_t)'} = \begin{pmatrix}\tilde L_t'Q_{1,t} & \cdots & 0\\ \vdots & \ddots & \vdots\\ 0 & \cdots & \tilde L_t'Q_{N,t}\end{pmatrix}. \tag{22}$$
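The two Jacobian formulas used for the second factor can be checked numerically. The Python sketch below is illustrative only: the helper names (`commutation_matrix`, `jac_gram`, `q_matrix`, `numerical_jacobian`) and the central finite-difference check are assumptions of this example. It builds the vectorized transpose matrix $J_{m,n}$, the analytical derivative $(I_{n^2}+J_{n,n})(I_n\otimes M')$ of $vec(M'M)$, and the block $Q_{i,t}$, so that each can be compared with a numerical derivative.

```python
import numpy as np


def vec(M):
    """Stack the columns of M into a vector (column-major order)."""
    return M.reshape(-1, order="F")


def commutation_matrix(m, n):
    """Return J with J @ vec(M) == vec(M.T) for any m-by-n matrix M."""
    J = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # M[i, j] sits at position i + j*m in vec(M) and j + i*n in vec(M.T).
            J[j + i * n, i + j * m] = 1.0
    return J


def jac_gram(M):
    """Analytical d vec(M'M) / d vec(M)' = (I + J_{n,n}) (I_n kron M')."""
    n = M.shape[1]
    return (np.eye(n * n) + commutation_matrix(n, n)) @ np.kron(np.eye(n), M.T)


def q_matrix(lam):
    """Jacobian of lam / sqrt(1 + lam'lam) with respect to lam (the block Q_{i,t})."""
    s = 1.0 + lam @ lam
    return np.eye(lam.size) / np.sqrt(s) - np.outer(lam, lam) / s ** 1.5


def numerical_jacobian(f, x, h=1e-6):
    """Central finite-difference Jacobian of a vector-valued function f at x."""
    f0 = f(x)
    jac = np.zeros((f0.size, x.size))
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = h
        jac[:, k] = (f(x + step) - f(x - step)) / (2.0 * h)
    return jac


rng = np.random.default_rng(0)
lam = rng.standard_normal(3)
gap = np.max(np.abs(numerical_jacobian(lambda l: l / np.sqrt(1.0 + l @ l), lam)
                    - q_matrix(lam)))
```

In this sketch `gap` measures the largest discrepancy between the finite-difference Jacobian of the rescaled loading and the closed form with the $(1+\lambda'\lambda)^{3/2}$ denominator.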

We define $T_{diag}$ as an $N^2\times N$ transformation matrix such that $vec(A) = T_{diag}\cdot a$ where $A$ is an $N\times N$ diagonal matrix with an $N\times 1$ vector $a$ on the diagonal. Then,

$$\frac{\partial vec(D_t)}{\partial vec(L_t)'} = T_{diag}\cdot\frac{\partial\left[\sigma^2_{1,t},\ldots,\sigma^2_{N,t}\right]'}{\partial vec(L_t)'} = T_{diag}\cdot\begin{pmatrix}\dfrac{-2\lambda_{1,t}'}{\left(1+\lambda_{1,t}'\lambda_{1,t}\right)^2} & \cdots & 0\\ \vdots & \ddots & \vdots\\ 0 & \cdots & \dfrac{-2\lambda_{N,t}'}{\left(1+\lambda_{N,t}'\lambda_{N,t}\right)^2}\end{pmatrix}. \tag{23}$$

For the last factor in equation (19), recall that $\eta_t$ in equation (9) is a vector of distinct factor loadings and $\eta_{t,i}$ denotes the $i$-th element of $\eta_t$. $L_t$ is written as

$$L_t = \sum_{i=1}^{2G}\eta_{t,i}\cdot S_i'$$

$$S_i = \begin{pmatrix}\delta_{1,i}\,\iota_{N_1} & \delta_{G+1,i}\,\iota_{N_1} & 0 & \cdots & 0\\ \delta_{2,i}\,\iota_{N_2} & 0 & \delta_{G+2,i}\,\iota_{N_2} & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ \delta_{G,i}\,\iota_{N_G} & 0 & 0 & \cdots & \delta_{2G,i}\,\iota_{N_G}\end{pmatrix} \in \mathbb{R}^{N\times(G+1)}$$

where $\iota_p$ is a $p\times 1$ vector filled with ones, $\delta_{i,j} = 1$ if $i = j$ and zero otherwise, and $N_g$ for $g = 1,\ldots,G$ is the number of members in group $g$ such that $N = \sum_{g=1}^{G}N_g$. Then

$$\frac{\partial vec(L_t)}{\partial \eta_t'} = \left(vec(S_1'),\ldots,vec(S_G'),vec\left(S_{G+1}'\right),\ldots,vec(S_{2G}')\right) \in \mathbb{R}^{(G+1)N\times 2G}. \tag{24}$$

Thus, the score expressed in equation (19) is obtained by combining equations (20), (21), (22), (23), and (24).

A.3 Variance targeting

The number of parameters to estimate in the proposed skewed $t$ copula with GAS dynamics is $2G+6$, and when $G$ is large estimating all the parameters at once is not feasible. We adopt a two-step approach, so-called "variance targeting," to eliminate the need to numerically optimize over the intercept parameters $\left[\omega^M_1,\ldots,\omega^M_G,\omega^C_1,\ldots,\omega^C_G\right]$. Specifically, under the stationarity assumption, the unconditional expectation of all distinct factor loadings in equation (8), $\eta_t = [\lambda_{M,1,t},\ldots,\lambda_{M,G,t},\lambda_{1,1,t},\ldots,\lambda_{G,G,t}]'$, is

$$\bar\eta = \omega + B\cdot\bar\eta$$

where $\bar\eta \equiv E[\eta_t]$, $\omega \equiv \left[\omega^M_1,\ldots,\omega^M_G,\omega^C_1,\ldots,\omega^C_G\right]'$ and $B = diag\left(\beta^M,\cdots,\beta^M,\beta^C,\cdots,\beta^C\right) \in \mathbb{R}^{2G\times 2G}$, so if $\bar\eta$ is estimated from data in a first step, $\omega$ can be replaced with $(I_{2G}-B)\hat{\bar\eta}$ and only $\left[\alpha^M,\beta^M,\alpha^C,\beta^C,\nu,\zeta\right]$ are left to be estimated numerically in a second step.

Define a $(G+1)\times G$ matrix $\bar L$ filled with elements of $\bar\eta$ as

$$\bar L = \begin{pmatrix}\dfrac{E[\lambda_{M,1,t}]}{v_1} & \dfrac{E[\lambda_{M,2,t}]}{v_2} & \cdots & \dfrac{E[\lambda_{M,G,t}]}{v_G}\\ \dfrac{E[\lambda_{1,1,t}]}{v_1} & 0 & \cdots & 0\\ 0 & \dfrac{E[\lambda_{2,2,t}]}{v_2} & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \dfrac{E[\lambda_{G,G,t}]}{v_G}\end{pmatrix}$$

where $v_i \equiv \sqrt{1+E[\lambda_{M,i,t}]^2+E[\lambda_{i,i,t}]^2}$. The model-implied matrix of within- and across-group correlations is then given by $\bar L'\bar L$. The corresponding unconditional correlation matrix based on samples $x_{it} = \Phi^{-1}(u_{it})$ is denoted by $\hat\Omega \in \mathbb{R}^{G\times G}$, where the $g$-th diagonal element is the average correlation of any pair of variables belonging to group $g$ and the $(i,j)$ element is the average correlation of any pair of variables belonging to group $i$ and group $j$. Then $\bar\eta$ is estimated by minimizing the difference between the sample ($\hat\Omega$) and model-implied correlation matrices:

$$\hat{\bar\eta} = \arg\min_{\bar\eta}\left[vech\left(\hat\Omega-\bar L'\bar L\right)\right]'vech\left(\hat\Omega-\bar L'\bar L\right)$$

In most variance targeting applications, the estimation of the intercept can be done analytically. Here it requires numerical optimization; however, it is extremely fast.

A.4 Proof of Theorem 1

Firstly, define the profile estimator

$$\tilde\theta_T(\Gamma) = \arg\max_{\theta}\frac{1}{T}\sum_{t=1}^{T}\log c(u_t;\theta,\Gamma) \tag{25}$$

Assumptions 1, 2(a) and 3(a) are sufficient for $\tilde\theta_T(\Gamma) \xrightarrow{p} \tilde\theta^*(\Gamma)$ for each $\Gamma \in \mathcal{G}$; see White (1994, Theorem 3.5) for example. Next define the sample and population profile likelihoods as:

$$\bar Q_T(\Gamma) = \frac{1}{T}\sum_{t=1}^{T}\log c\left(u_t;\tilde\theta_T(\Gamma),\Gamma\right)$$

$$Q^*(\Gamma) \equiv E\left[\log c\left(u_t;\tilde\theta^*(\Gamma),\Gamma\right)\right]$$

Define the infeasible version of the sample likelihood using the population copula parameter as

$$\dot Q_T(\Gamma) \equiv \frac{1}{T}\sum_{t=1}^{T}\log c\left(u_t;\tilde\theta^*(\Gamma),\Gamma\right)$$

Consider a mean-value expansion of the sample objective function:

$$\log c\left(u_t;\tilde\theta_T(\Gamma),\Gamma\right) = \log c\left(u_t;\tilde\theta^*(\Gamma),\Gamma\right) + \nabla_\theta\log c\left(u_t;\tilde\theta^*(\Gamma),\Gamma\right)'\left(\tilde\theta_T(\Gamma)-\tilde\theta^*(\Gamma)\right) + \frac{1}{2}\left(\tilde\theta_T(\Gamma)-\tilde\theta^*(\Gamma)\right)'\nabla_{\theta\theta}\log c\left(u_t;\ddot\theta^*,\Gamma\right)\left(\tilde\theta_T(\Gamma)-\tilde\theta^*(\Gamma)\right) \tag{26}$$

where $\ddot\theta^* = \lambda\tilde\theta_T(\Gamma)+(1-\lambda)\tilde\theta^*(\Gamma)$ for some $\lambda \in [0,1]$.

Then averaging the equation above over $t = 1,\ldots,T$ we have

$$\bar Q_T(\Gamma)-\dot Q_T(\Gamma) = \left(\frac{1}{T}\sum_{t=1}^{T}\nabla_\theta\log c\left(u_t;\tilde\theta^*(\Gamma),\Gamma\right)\right)'\left(\tilde\theta_T(\Gamma)-\tilde\theta^*(\Gamma)\right) + \frac{1}{2}\left(\tilde\theta_T(\Gamma)-\tilde\theta^*(\Gamma)\right)'\left(\frac{1}{T}\sum_{t=1}^{T}\nabla_{\theta\theta}\log c\left(u_t;\ddot\theta^*,\Gamma\right)\right)\left(\tilde\theta_T(\Gamma)-\tilde\theta^*(\Gamma)\right) \tag{27}$$

Assumptions 2(b) and 3(a) imply that $\frac{1}{T}\sum_{t=1}^{T}\nabla_\theta\log c\left(u_t;\tilde\theta^*(\Gamma),\Gamma\right) \xrightarrow{p} 0$, as usual for M-estimation. Assumption 3(c) ensures the Hessian term has a finite limit. Thus we have $\bar Q_T(\Gamma)-\dot Q_T(\Gamma) = o_p(1)$. Further, by Assumptions 1 and 2(a) we also have $\dot Q_T(\Gamma)-Q^*(\Gamma) = o_p(1)$,

and so we have $\bar Q_T(\Gamma)-Q^*(\Gamma) = o_p(1)$. Thus the sample objective function is pointwise (in $\Gamma$) consistent for the population objective function. Since the parameter space is discrete, uniform convergence simplifies to pointwise convergence (see, e.g., Choirat and Seri, 2012). This implies the estimator obtained by maximizing the sample objective function is consistent for the parameter that maximizes the population objective function. Since the population objective function is unaffected by re-labeling of clusters, any element of $\mathcal{G}_0$ maximizes the population objective function. We thus have $\Pr\left[\hat\Gamma_T \in \mathcal{G}_0\right] \to 1$ as $T \to \infty$, completing the proof.

References

[1] Amengual, D. and E. Sentana, 2020, Is a Normal Copula the Right Copula?, Journal of Business & Economic Statistics, 38(2), 350-366.
[2] Ang, A. and J. Chen, 2002, Asymmetric Correlations of Equity Portfolios, Journal of Financial Economics, 63, 443-494.
[3] Bester, A. and C. Hansen, 2016, Grouped Effects Estimators in Fixed Effects Models, Journal of Econometrics, 190(1), 197-208.
[4] Blasques, F., A. Lucas and S.J. Koopman, 2014, Stationarity and Ergodicity Conditions for Generalized Autoregressive Score Processes, Electronic Journal of Statistics, 8, 1088-1112.
[5] Bollerslev, T., R.F. Engle, and D.B. Nelson, 1994, Chapter 49 ARCH Models, Handbook of Econometrics, Vol 4, Elsevier.
[6] Bonhomme, S. and E. Manresa, 2015, Grouped patterns of heterogeneity in panel data, Econometrica, 83(3), 1147-1184.
[7] Carrasco, M. and X. Chen, 2002, Mixing and Moment Properties of Various GARCH and Stochastic Volatility Models, Econometric Theory, 18(1), 17-39.
[8] Choirat, C. and R. Seri, 2012, Estimation in Discrete Parameter Models, Statistical Science, 27(2), 278-293.
[9] Christoffersen, P., K. Jacobs, X. Jin and H. Langlois, 2018, Dynamic Dependence and Diversification in Corporate Credit, Review of Finance, 22(2), 521-560.
[10] Creal, D.D. and R. Tsay, 2015, High-dimensional Dynamic Stochastic Copula Models, Journal of Econometrics, 189(2), 335-345.
[11] Creal, D.D., S.J. Koopman, and A. Lucas, 2013, Generalized Autoregressive Score Models with Applications, Journal of Applied Econometrics, 28(5), 777-795.

[12] Custodio João, I., A. Lucas, J. Schaumburg, and B. Schwaab, 2022, Dynamic Clustering of Multivariate Panel Data, Journal of Econometrics, forthcoming.
[13] Demarta, S. and A.J. McNeil, 2005, The t Copula and Related Copulas, International Statistical Review, 73(1), 111-129.
[14] Diebold, F.X. and R.S. Mariano, 1995, Comparing Predictive Accuracy, Journal of Business & Economic Statistics, 13(3), 253-263.
[15] Engle, R.F., 2002, Dynamic conditional correlation: A simple class of multivariate GARCH models, Journal of Business & Economic Statistics, 20, 339-350.
[16] Fan, J., Y. Fan and J. Lv, 2008, High dimensional covariance matrix estimation using a factor model, Journal of Econometrics, 147, 186-197.
[17] Fan, J., Y. Liao and M. Mincheva, 2013, Large Covariance Estimation by Thresholding Principal Orthogonal Complements, Journal of the Royal Statistical Society Series B, 75, 603-680.
[18] Francis, N., M.T. Owyang, and Ö. Savascin, 2017, An endogenously clustered factor approach to international business cycles, Journal of Applied Econometrics, 32, 1261-1276.
[19] Giacomini, R. and H. White, 2006, Tests of conditional predictive ability, Econometrica, 74, 1545-1578.
[20] Giesecke, K. and B. Kim, 2011, Systemic Risk: What Defaults are Telling Us, Management Science, 57, 1387-1405.
[21] Goyal, A. and P. Santa-Clara, 2003, Idiosyncratic risk matters!, Journal of Finance, 58, 975-1008.
[22] Gneiting, T. and A.E. Raftery, 2007, Strictly Proper Scoring Rules, Prediction and Estimation, Journal of the American Statistical Association, 102, 358-378.
[23] Hafner, C.M. and A. Preminger, 2009, On Asymptotic Theory for Multivariate GARCH Models, Journal of Multivariate Analysis, 100, 2044-2054.
[24] Hahn, J. and R. Moon, 2010, Panel Data Models with Finite Number of Equilibria, Econometric Theory, 26(3), 863-881.
[25] Hansen, B.E., 1994, Autoregressive Conditional Density Estimation, International Economic Review, 35, 705-730.
[26] Harvey, A.C., 2013, Dynamic Models for Volatility and Heavy Tails, Econometric Society Monograph 52, Cambridge University Press, Cambridge.
[27] Hastie, T., R. Tibshirani and J. Friedman, 2009, The Elements of Statistical Learning: Data Mining, Inference and Prediction, Second Edition, Springer, New York.
[28] Hautsch, N., L.M. Kyj and R.C.A. Oomen, 2012, A blocking and regularization approach to high dimensional realized covariance estimation, Journal of Applied Econometrics, 27, 625-645.

[29] Hong, Y., J. Tu and G. Zhou, 2007, Asymmetries in Stock Returns: Statistical Tests and Economic Evaluation, Review of Financial Studies, 20, 1547-1581.
[30] Li, J., Z. Liao and R. Quaedvlieg, 2021, Conditional superior predictive ability, Review of Economic Studies, forthcoming.
[31] Li, Q. and J.S. Racine, 2007, Nonparametric Econometrics, Princeton University Press, Princeton.
[32] Lin, C.-C. and S. Ng, 2012, Estimation of Panel Data Models with Parameter Heterogeneity when Group Membership is Unknown, Journal of Econometric Methods, 1(1), 42-55.
[33] Lumsdaine, R.L., R. Okui and W. Wang, 2022, Estimation of panel group structure models with structural breaks in group memberships and coefficients, Journal of Econometrics, forthcoming.
[34] Newey, W.K. and D. McFadden, 1993, Large Sample Estimation and Hypothesis Testing, in R.F. Engle and D. McFadden (eds.), Handbook of Econometrics, Vol. IV, Elsevier.
[35] Newey, W.K. and K.D. West, 1987, A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix, Econometrica, 55(3), 703-708.
[36] Oh, D.H. and A.J. Patton, 2017, Modelling Dependence in High Dimensions with Factor Copulas, Journal of Business & Economic Statistics, 35(1), 139-154.
[37] Oh, D.H. and A.J. Patton, 2018, Time-Varying Systemic Risk: Evidence from a Dynamic Copula Model of CDS Spreads, Journal of Business & Economic Statistics, 36(2), 181-195.
[38] Opschoor, A., A. Lucas, I. Barra and D. van Dijk, 2020, Closed-form multi-factor copula models with observation-driven dynamic factor loadings, Journal of Business & Economic Statistics, forthcoming.
[39] Patton, A.J., 2004, On the Out-of-Sample Importance of Skewness and Asymmetric Dependence for Asset Allocation, Journal of Financial Econometrics, 2(1), 130-168.
[40] Patton, A.J., 2013, Copula Methods for Forecasting Multivariate Time Series, in G. Elliott and A. Timmermann (eds.), Handbook of Economic Forecasting, Vol 2, Springer Verlag.
[41] Patton, A.J. and B.M. Weller, 2019, Risk price variation: The missing half of empirical asset pricing, working paper, Duke University.
[42] Su, L., Z. Shi, and P.C.B. Phillips, 2016, Identifying Latent Structures in Panel Data, Econometrica, 84(6), 2215-2264.
[43] Su, L., X. Wang and S. Jin, 2019, Sieve Estimation of Time-Varying Panel Data Models With Latent Structures, Journal of Business & Economic Statistics, 37(2), 334-349.
[44] Tao, M., Y. Wang, Q. Yao and J. Zou, 2011, Large Volatility Matrix Inference via Combining Low-Frequency and High-Frequency Approaches, Journal of the American Statistical Association, 106(495), 1025-1040.

[45] Vogt, M. and O. Linton, 2020, Multiscale Clustering of Nonparametric Regression Curves, Journal of Econometrics, 216(1), 305-325.
[46] Whaley, R.E., 2009, Understanding the VIX, Journal of Portfolio Management, 35(3), 98-105.
[47] White, H., 1994, Estimation, Inference and Specification Analysis, Econometric Society Monographs No. 22, Cambridge University Press, USA.

Table 1: Simulation results for static copulas

Panel A: Parameter estimation accuracy

| Parameter | True | Gaussian: Mean | Gaussian: Std Dev | t: Mean | t: Std Dev | skew t: Mean | skew t: Std Dev |
|---|---|---|---|---|---|---|---|
| $\beta^M_1$ | 0.25 | 0.245 | 0.076 | 0.253 | 0.089 | 0.276 | 0.062 |
| $\beta^M_2$ | 0.50 | 0.508 | 0.071 | 0.509 | 0.075 | 0.473 | 0.068 |
| $\beta^M_3$ | 0.75 | 0.755 | 0.053 | 0.744 | 0.056 | 0.740 | 0.066 |
| $\beta^M_4$ | 1.00 | 1.000 | 0.052 | 1.002 | 0.048 | 0.954 | 0.189 |
| $\beta^M_5$ | 1.25 | 1.246 | 0.033 | 1.243 | 0.035 | 1.246 | 0.040 |
| $\beta^M_6$ | 1.50 | 1.500 | 0.021 | 1.498 | 0.034 | 1.501 | 0.035 |
| $\beta^M_7$ | 1.75 | 1.747 | 0.020 | 1.752 | 0.027 | 1.755 | 0.036 |
| $\beta^M_8$ | 2.00 | 2.001 | 0.021 | 1.998 | 0.027 | 1.996 | 0.030 |
| $\beta^M_9$ | 2.25 | 2.253 | 0.019 | 2.252 | 0.028 | 2.255 | 0.031 |
| $\beta^M_{10}$ | 2.50 | 2.496 | 0.020 | 2.502 | 0.029 | 2.502 | 0.030 |
| $\beta^C_1$ | 2.50 | 2.499 | 0.026 | 2.495 | 0.030 | 2.503 | 0.031 |
| $\beta^C_2$ | 2.25 | 2.248 | 0.023 | 2.246 | 0.031 | 2.251 | 0.036 |
| $\beta^C_3$ | 2.00 | 2.001 | 0.031 | 2.003 | 0.030 | 1.998 | 0.038 |
| $\beta^C_4$ | 1.75 | 1.749 | 0.030 | 1.747 | 0.034 | 1.771 | 0.065 |
| $\beta^C_5$ | 1.50 | 1.497 | 0.030 | 1.501 | 0.029 | 1.497 | 0.031 |
| $\beta^C_6$ | 1.25 | 1.246 | 0.028 | 1.252 | 0.026 | 1.252 | 0.030 |
| $\beta^C_7$ | 1.00 | 1.005 | 0.023 | 0.999 | 0.023 | 1.003 | 0.019 |
| $\beta^C_8$ | 0.75 | 0.756 | 0.022 | 0.749 | 0.023 | 0.743 | 0.018 |
| $\beta^C_9$ | 0.50 | 0.504 | 0.019 | 0.499 | 0.023 | 0.499 | 0.026 |
| $\beta^C_{10}$ | 0.25 | 0.241 | 0.039 | 0.234 | 0.048 | 0.230 | 0.050 |
| $\nu$ | 5.00 | | | 5.007 | 0.069 | 4.858 | 0.369 |
| $\zeta$ | -0.10 | | | | | -0.095 | 0.015 |

Panel B: Group assignment estimation accuracy

| Number incorrect | Gaussian | t | skew t |
|---|---|---|---|
| 0 | 100 | 100 | 100 |
| $\geq 1$ | 0 | 0 | 0 |

Panel C: Estimation details

| | Gaussian: Clustering | Gaussian: Copula | t: Clustering | t: Copula | skew t: Clustering | skew t: Copula |
|---|---|---|---|---|---|---|
| Time (min) | 8.7 | 0.47 | 8.9 | 6.37 | 8.8 | 7.13 |
| EM (iter) | 79.54 | - | 80.12 | - | 80.56 | - |

Notes: This table presents results from 100 simulations from static Gaussian, t, and skew t factor copulas with 10 groups. Panel A presents results on estimation accuracy of the copula parameters, Panel B presents results on estimation accuracy of the group assignments, and Panel C presents average estimation time (for the two stages of estimation) and EM iterations using a machine with an Intel Xeon processor, with ten cores and clock speed of 2.60GHz.

Table 2: Simulation results for time-varying copulas

Panel A: Parameter estimation accuracy

| Parameter | True | Gaussian: Mean | Gaussian: Std Dev | t: Mean | t: Std Dev | skew t: Mean | skew t: Std Dev |
|---|---|---|---|---|---|---|---|
| $\omega^M_1$ | 0.04 | 0.042 | 0.007 | 0.042 | 0.007 | 0.044 | 0.008 |
| $\omega^M_2$ | 0.04 | 0.042 | 0.007 | 0.042 | 0.007 | 0.044 | 0.008 |
| $\omega^M_3$ | 0.04 | 0.042 | 0.007 | 0.042 | 0.007 | 0.043 | 0.007 |
| $\omega^M_4$ | 0.04 | 0.042 | 0.007 | 0.042 | 0.007 | 0.044 | 0.008 |
| $\omega^M_5$ | 0.04 | 0.042 | 0.007 | 0.042 | 0.007 | 0.043 | 0.007 |
| $\omega^M_6$ | 0.04 | 0.042 | 0.007 | 0.042 | 0.007 | 0.044 | 0.008 |
| $\omega^M_7$ | 0.04 | 0.042 | 0.006 | 0.042 | 0.007 | 0.043 | 0.008 |
| $\omega^M_8$ | 0.04 | 0.041 | 0.007 | 0.041 | 0.007 | 0.044 | 0.008 |
| $\omega^M_9$ | 0.04 | 0.042 | 0.007 | 0.042 | 0.007 | 0.044 | 0.008 |
| $\omega^M_{10}$ | 0.04 | 0.042 | 0.007 | 0.042 | 0.007 | 0.044 | 0.008 |
| $\omega^C_1$ | 0.04 | 0.043 | 0.007 | 0.043 | 0.007 | 0.042 | 0.007 |
| $\omega^C_2$ | 0.04 | 0.043 | 0.007 | 0.043 | 0.008 | 0.042 | 0.007 |
| $\omega^C_3$ | 0.04 | 0.042 | 0.007 | 0.043 | 0.007 | 0.042 | 0.007 |
| $\omega^C_4$ | 0.04 | 0.043 | 0.007 | 0.044 | 0.008 | 0.042 | 0.007 |
| $\omega^C_5$ | 0.04 | 0.043 | 0.007 | 0.043 | 0.008 | 0.041 | 0.007 |
| $\omega^C_6$ | 0.04 | 0.043 | 0.007 | 0.043 | 0.007 | 0.042 | 0.007 |
| $\omega^C_7$ | 0.04 | 0.043 | 0.008 | 0.043 | 0.007 | 0.041 | 0.006 |
| $\omega^C_8$ | 0.04 | 0.043 | 0.007 | 0.043 | 0.008 | 0.042 | 0.006 |
| $\omega^C_9$ | 0.04 | 0.043 | 0.008 | 0.043 | 0.007 | 0.042 | 0.007 |
| $\omega^C_{10}$ | 0.04 | 0.043 | 0.007 | 0.043 | 0.006 | 0.041 | 0.007 |
| $\alpha^M$ | 0.02 | 0.020 | 0.002 | 0.020 | 0.002 | 0.020 | 0.002 |
| $\beta^M$ | 0.90 | 0.894 | 0.015 | 0.894 | 0.014 | 0.893 | 0.017 |
| $\alpha^C$ | 0.02 | 0.020 | 0.002 | 0.020 | 0.002 | 0.020 | 0.002 |
| $\beta^C$ | 0.90 | 0.896 | 0.016 | 0.894 | 0.016 | 0.898 | 0.014 |
| $\nu$ | 5.00 | | | 5.014 | 0.071 | 5.016 | 0.108 |
| $\zeta$ | -0.10 | | | | | -0.100 | 0.007 |

Panel B: Group assignment estimation accuracy

| Number incorrect | Gaussian | t | skew t |
|---|---|---|---|
| 0 | 100 | 100 | 100 |
| $\geq 1$ | 0 | 0 | 0 |

Panel C: Estimation details

| | Gaussian: Clustering | Gaussian: Copula | t: Clustering | t: Copula | skew t: Clustering | skew t: Copula |
|---|---|---|---|---|---|---|
| Time (min) | 8.8 | 21.6 | 9.1 | 37.4 | 9.1 | 41.7 |
| EM (iter) | 91.32 | - | 91.78 | - | 90.83 | - |

Notes: This table presents results from 100 simulations from Gaussian, t, and skew t factor copulas with 10 groups and GAS dynamics. Panel A presents results on estimation accuracy of the copula parameters, Panel B presents results on estimation accuracy of the group assignments, and Panel C presents average estimation time (for the two stages of estimation) and EM iterations using a machine with an Intel Xeon processor, with ten cores and clock speed of 2.60GHz.

Table 3: Summary statistics

Cross-sectional distribution

| | Mean | 5% | 25% | Median | 75% | 95% |
|---|---|---|---|---|---|---|
| Panel A: Marginal moments | | | | | | |
| Mean | 0.001 | 0.000 | 0.001 | 0.001 | 0.001 | 0.001 |
| Std | 0.016 | 0.010 | 0.012 | 0.015 | 0.018 | 0.023 |
| Skewness | -0.081 | -0.748 | -0.310 | -0.091 | 0.092 | 0.648 |
| Kurtosis | 9.939 | 5.154 | 6.411 | 8.087 | 10.923 | 22.803 |
| Panel B: Marginal model parameters | | | | | | |
| Constant | 0.001 | 0.000 | 0.001 | 0.001 | 0.001 | 0.001 |
| AR(1) | -0.019 | -0.068 | -0.041 | -0.017 | 0.000 | 0.031 |
| $\varpi \times 10^4$ | 0.009 | 0.002 | 0.003 | 0.006 | 0.011 | 0.025 |
| $\alpha$ | 0.025 | 0.000 | 0.009 | 0.019 | 0.033 | 0.077 |
| $\kappa$ | 0.099 | 0.029 | 0.064 | 0.095 | 0.131 | 0.179 |
| $\beta$ | 0.885 | 0.756 | 0.864 | 0.904 | 0.932 | 0.958 |
| $\xi$ | 5.089 | 3.401 | 4.234 | 4.846 | 5.798 | 7.256 |
| (skew t asymmetry parameter) | -0.027 | -0.087 | -0.051 | -0.025 | -0.004 | 0.020 |
| Panel C: Correlations of standardized residuals | | | | | | |
| Pearson | 0.322 | 0.170 | 0.256 | 0.314 | 0.378 | 0.492 |
| Spearman | 0.360 | 0.197 | 0.295 | 0.356 | 0.418 | 0.531 |

Notes: This table presents summary statistics on the 110 daily equity return series used in this paper. The sample period is January 2010 to December 2019. Panel A presents a summary of the cross-sectional distribution of the first four moments of these returns, Panel B presents a summary of the estimated AR(1)-GJR GARCH(1,1)-skew t model used for the marginal distributions, and Panel C presents a summary of the 5,995 pairwise correlations of the standardized residuals.

Table 4: Estimated group assignments

Each member is listed as ticker, name, and two-digit SIC code in parentheses.

| Group | Members: Ticker Name (SIC) |
|---|---|
| 1 | ABT Abbott Lab. (28); AGN Actavis (28); AMGN Amgen (28); BAX Baxter (38); BIIB Biogen (28); BMY Bristol-Myers (28); GILD Gilead (28); JNJ Johnson & J (28); LLY Lilly Eli (28); MDT Medtronic (38); MRK Merck (28); PFE Pfizer (28); UNH Unitedhealth (63) |
| 2 | BAC Bank Of Am (60); BK Bank Of NY (60); C Citigroup Inc (60); COF Capital One (60); GS Goldman Sachs (62); JPM Jpmorgan (60); MET Metlife (63); MS Morgan Stanley (60); RF Regions Fin (60); USB U S Bancorp (60); WFC Wells Fargo (60) |
| 3 | APA Apache (13); BHI Baker Hughes (35); COP Conocophillips (13); CVX Chevron (13); DVN Devon (13); HAL Halliburton (13); NOV Nat. Oilwell (35); OXY Occidental (13); SLB Schlumberger (13); WMB Williams Co (49); XOM Exxon Mobil (13) |
| 4 | CAT Caterpillar (35); EMR Emerson Ele (35); FDX Fedex (45); HON Honeywell Int (37); MMM 3M (38); NSC Norfolk South (40); UNP Union Pacific (40); UPS United Parcel (42) |
| 5 | AAPL Apple (35); ADBE Adobe (73); AMZN Amazon (73); CRM Salesforce (73); EBAY Ebay (73); GOOGL Google (73); NFLX Netflix (78); PCLN Priceline (73) |
| 6 | CL Colgate Palmo (28); CPB Campbell Soup (20); KO Coca Cola (20); MDLZ Mondelez (20); MO Altria (21); PEP Pepsico (20); PG Procter Gamble (28); PM Philip Morris (21) |
| 7 | CSCO Cisco Sys (36); HPQ Hewlett Pac (35); INTC Intel (36); MSFT Microsoft (73); NVDA Nvidia (36); QCOM Qualcomm (36); TXN Texas Instru (36) |
| 8 | AIG Ame Inter Group (63); ALL Allstate (63); CMCSA Comcast (48); DIS Disney Walt (48); F Ford (37); GE Gen Electric (35); XRX Xerox (35) |
| 9 | AEP Ame Elec Pow (49); DUK Duke Energy (49); ETR Entergy Corp (49); EXC Exelon (49); NEE Nextera Energy (49); SO Southern Co (49) |
| 10 | COST Costco (53); CVS C V S Health (59); TGT Target (53); WBA Walgreens (59); WMT Walmart (53) |
| 11 | GD Gen Dynamics (37); LMT Lockheed Martin (37); RTN Raytheon (38) |
| 12 | AMT American Tower (48); SPG Simon Property (67); WY Weyerhaeuser (8) |
| 13 | BA Boeing (37); FCX Freeport Mcmo (10); NKE Nike (30) |
| 14 | ACN Accenture (67); IBM IBM (35); ORCL Oracle (73) |
| 15 | AXP Amex (60); BLK Blackrock (62) |
| 16 | DHR Danaher (38); TMO Thermo Fisher (38) |
| 17 | T A T & T (48); VZ Verizon (48) |
| 18 | AVP Avon Products (28); SNS Steak N Shake (58) |
| 19 | MA Mastercard (73); V Visa (61) |
| 20 | MCD Mcdonalds (58); SBUX Starbucks (58) |
| 21 | HD Home Depot (52); LOW Lowes (52) |

Notes: This table presents the estimated group assignments based on the BIC-optimal number of groups, G=21. The groups are ordered by the number of members.

Table 5: Comparing different copula specifications

| | Static vs. GAS: Gaussian | Static vs. GAS: t | Static vs. GAS: skew t | Copula shape: G vs. t | Copula shape: G vs. skew t | Copula shape: t vs. skew t |
|---|---|---|---|---|---|---|
| SIC 1 digit | 7.861 | 12.067 | 11.552 | 9.288 | 8.596 | -2.975 |
| SIC 2 digit | 9.887 | 15.730 | 16.501 | 8.938 | 8.465 | -2.849 |
| 3 groups | 6.528 | 6.636 | 6.761 | 8.339 | 7.711 | -2.257 |
| 4 groups | 7.681 | 10.059 | 9.412 | 9.121 | 8.381 | -2.351 |
| 5 groups | 7.548 | 10.804 | 10.913 | 9.236 | 9.088 | -1.717 |
| 18 groups | 10.571 | 16.052 | 15.711 | 9.295 | 7.945 | -3.550 |
| 19 groups | 9.806 | 15.457 | 15.259 | 9.553 | 8.847 | -2.065 |
| 20 groups | 10.908 | 16.193 | 14.741 | 9.426 | 7.995 | -3.797 |
| 21 groups | 10.916 | 16.626 | 17.321 | 9.474 | 8.670 | -2.366 |
| 22 groups | 10.732 | 16.891 | 16.802 | 9.688 | 8.962 | -2.714 |
| 25 groups | 11.817 | 19.001 | 19.140 | 9.475 | 8.569 | -2.355 |
| 27 groups | 10.725 | 15.836 | 15.794 | 9.591 | 8.951 | -1.676 |
| 30 groups | 10.917 | 17.169 | 15.917 | 9.697 | 8.717 | -2.962 |

Notes: This table presents Diebold-Mariano t-statistics on pairwise comparisons of models using their out-of-sample log-likelihood. The left panel compares models assuming no dynamics with those using GAS dynamics, for three different copula shapes (Gaussian, t, and skew t) and for a variety of choices for the number of groups. The right panel compares the different copula shapes, using GAS dynamics in all cases, across a variety of choices for the number of groups. In a comparison labeled "A vs B," a positive t-statistic indicates that B is preferred; a negative t-statistic indicates that A is preferred. Note that there are 7 groups of firms using the 1-digit SIC, and 21 groups using the 2-digit SIC.
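As a rough illustration of how a Diebold-Mariano t-statistic of the kind reported in Table 5 can be computed from two series of out-of-sample log-likelihoods, consider the Python sketch below. It is a minimal example under stated assumptions, not the authors' code: the function name `diebold_mariano_tstat`, the Bartlett-kernel (Newey-West) long-run variance, and the lag choice `n_lags` are all choices made for this example. The sign convention follows the table notes, so a positive value favours model B.

```python
import numpy as np


def diebold_mariano_tstat(loglik_a, loglik_b, n_lags=0):
    """t-statistic for H0: E[loglik_b - loglik_a] = 0; positive values favour model B."""
    d = np.asarray(loglik_b, dtype=float) - np.asarray(loglik_a, dtype=float)
    n_obs = d.size
    d_bar = d.mean()
    e = d - d_bar
    # Long-run variance of the loss differential with Bartlett weights
    # (assumed here; n_lags=0 reduces to the usual sample variance with 1/T scaling).
    lrv = e @ e / n_obs
    for lag in range(1, n_lags + 1):
        weight = 1.0 - lag / (n_lags + 1.0)
        lrv += 2.0 * weight * (e[lag:] @ e[:-lag]) / n_obs
    return d_bar / np.sqrt(lrv / n_obs)


# Hypothetical daily log-likelihoods for two models over three days.
t_stat = diebold_mariano_tstat([0.0, 0.0, 0.0], [1.0, 2.0, 3.0])
```

Swapping the two arguments flips the sign of the statistic, which is why the pairwise comparisons in such tables are antisymmetric.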

Table 6: Comparing different numbers of clusters

Panel A: t-statistics from pair-wise comparisons

| | SIC-1 | SIC-2 | 3 groups | 4 groups | 5 groups | 18 groups | 19 groups | 20 groups | 21 groups | 22 groups | 30 groups |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SIC-1 | | 26.174 | -4.783 | 14.180 | 21.898 | 33.495 | 32.074 | 33.127 | 33.292 | 32.973 | 29.975 |
| SIC-2 | -26.174 | | -21.579 | -8.472 | 2.530 | 23.186 | 21.202 | 23.288 | 23.359 | 21.255 | 17.075 |
| 3 groups | 4.783 | 21.579 | | 18.347 | 24.175 | 31.398 | 30.848 | 30.926 | 31.247 | 31.529 | 28.722 |
| 4 groups | -14.180 | 8.472 | -18.347 | | 12.922 | 27.064 | 25.292 | 26.950 | 26.957 | 26.511 | 21.562 |
| 5 groups | -21.898 | -2.530 | -24.175 | -12.922 | | 23.175 | 20.562 | 22.535 | 22.689 | 20.069 | 15.187 |
| 18 groups | -33.495 | -23.186 | -31.398 | -27.064 | -23.175 | | -6.779 | 2.377 | 0.184 | -6.754 | -12.345 |
| 19 groups | -32.074 | -21.202 | -30.848 | -25.292 | -20.562 | 6.779 | | 7.596 | 6.545 | -3.270 | -9.828 |
| 20 groups | -33.127 | -23.288 | -30.926 | -26.950 | -22.535 | -2.377 | -7.596 | | -2.915 | -7.892 | -13.062 |
| 21 groups | -33.292 | -23.359 | -31.247 | -26.957 | -22.689 | -0.184 | -6.545 | 2.915 | | -7.315 | -12.798 |
| 22 groups | -32.973 | -21.255 | -31.529 | -26.511 | -20.069 | 6.754 | 3.270 | 7.892 | 7.315 | | -7.736 |
| 30 groups | -29.975 | -17.075 | -28.722 | -21.562 | -15.187 | 12.345 | 9.828 | 13.062 | 12.798 | 7.736 | |

Panel B: Out-of-sample log-likelihood values

| | SIC-1 | SIC-2 | 3 groups | 4 groups | 5 groups | 18 groups | 19 groups | 20 groups | 21 groups | 22 groups | 30 groups |
|---|---|---|---|---|---|---|---|---|---|---|---|
| log L | 34074.5 | 37887.6 | 33175.8 | 36353.3 | 38303.6 | 42041.2 | 41564.0 | 42145.5 | 42050.1 | 41276.9 | 40559.2 |

Notes: This table presents Diebold-Mariano t-statistics on pairwise comparisons of models using their out-of-sample log-likelihood. In all cases we use a t copula with GAS dynamics. A positive t-statistic indicates that the column model is preferred to the row model; a negative t-statistic indicates the opposite. Note that there are 7 groups of firms using the 1-digit SIC, and 21 groups using the 2-digit SIC.

Table 7: Economic determinants of forecast performance

| | Static vs. GAS dynamics | | | | Gaussian vs. Student's t shape | | | | SIC vs. Estimated clusters | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Intercept | 0.511 | 0.511 | 0.511 | 0.511 | 1.493 | 1.493 | 1.493 | 1.493 | 4.114 | 4.114 | 4.114 | 4.114 |
| (s.e.) | (0.066) | (0.066) | (0.066) | (0.066) | (0.145) | (0.146) | (0.147) | (0.020) | (0.171) | (0.177) | (0.17) | (0.020) |
| [t-stat] | [7.741] | [7.717] | [7.791] | [7.777] | [10.316] | [10.259] | [10.142] | [10.323] | [24.133] | [23.262] | [24.218] | [23.414] |
| VIX | 0.023 | | | 0.007 | 0.106 | | | 0.063 | 0.024 | | | -0.061 |
| (s.e.) | (0.018) | | | (0.020) | (0.038) | | | (0.020) | (0.039) | | | (0.020) |
| [t-stat] | [1.265] | | | [0.357] | [2.820] | | | [1.217] | [0.609] | | | [-1.300] |
| Dispersion | | 0.474 | | 0.393 | | 1.354 | | 0.760 | | 2.243 | | 2.146 |
| (s.e.) | | (0.157) | | (0.186) | | (0.578) | | (0.020) | | (0.565) | | (0.020) |
| [t-stat] | | [3.018] | | [2.108] | | [2.342] | | [1.073] | | [3.969] | | [3.232] |
| Abs. alpha | | | 1.312 | 0.479 | | | 5.256 | 3.256 | | | 6.098 | 2.333 |
| (s.e.) | | | (0.498) | (0.605) | | | (2.144) | (0.020) | | | (1.395) | (0.020) |
| [t-stat] | | | [2.635] | [0.793] | | | [2.451] | [1.396] | | | [4.371] | [1.462] |
| R2 (%) | 0.300 | 1.400 | 0.667 | 1.497 | 1.095 | 1.951 | 1.823 | 2.874 | 0.038 | 3.733 | 1.710 | 4.127 |
| GW p-value (ALL) | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| GW p-value (SLOPES) | 0.206 | 0.003 | 0.008 | 0.001 | 0.005 | 0.019 | 0.014 | 0.000 | 0.543 | 0.000 | 0.000 | 0.000 |
| CSPA p-value | 0.000 | 0.000 | 0.000 | – | 0.000 | 0.000 | 0.000 | – | 0.000 | 0.000 | 0.000 | – |

Notes: This table presents the results of Giacomini-White (2005) tests and Li et al. (2021) conditional superior predictive ability (CSPA) tests, comparing three pairs of models. The baseline model uses a Student's t copula with GAS dynamics and estimated assignments into 20 clusters, and is compared to a model that: imposes a static conditional copula (left panel); uses a Gaussian copula (middle panel); uses two-digit SIC codes to form clusters (right panel). The bottom row presents the CSPA p-value for the null hypothesis that the simpler model in each comparison is uniformly weakly better against the alternative that the more flexible model is uniformly strictly better. The rows labeled "GW p-value" test that all parameters in the regression, or only the slope coefficients, are equal to zero. The conditioning variables are the VIX index, the cross-sectional standard deviation of returns ("dispersion"), and the absolute value of the cross-sectional average CAPM alpha.
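A single-regressor column of Table 7 can be illustrated with a minimal sketch: regress the out-of-sample log-likelihood difference on one conditioning variable and report the slope with a heteroskedasticity- and autocorrelation-robust t-statistic. The function name `slope_tstat`, the Bartlett-kernel weights, and the `lags` argument are illustrative assumptions; the timing of the conditioning variable (known before the forecast is evaluated) is likewise assumed here.

```python
import math

def slope_tstat(d, x, lags=0):
    """OLS slope of d on x (with an intercept) and its HAC t-statistic.

    d: log-likelihood differences; x: conditioning variable.
    `lags` is the Bartlett-kernel truncation (an assumption).
    """
    n = len(d)
    x_bar = sum(x) / n
    d_bar = sum(d) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d)) / sxx
    intercept = d_bar - slope * x_bar
    # Slope score contributions: demeaned regressor times residual.
    g = [(xi - x_bar) * (di - intercept - slope * xi) for xi, di in zip(x, d)]
    s = sum(gi * gi for gi in g)
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1)
        s += 2.0 * weight * sum(g[t] * g[t - lag] for t in range(lag, n))
    std_err = math.sqrt(s) / sxx
    return slope, slope / std_err

# Toy inputs (made up for illustration, not data from the paper).
slope, tstat = slope_tstat([0.0, 1.0, 1.0, 2.0], [0.0, 1.0, 2.0, 3.0])
print(slope, tstat)
```

The sandwich variance for the slope reduces to the long-run variance of the demeaned-regressor-times-residual series divided by the squared sum of squares of the regressor, which is what the function computes.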

[Figure: four scatter-plot panels: (a) Gaussian, (b) Student t, (c) Skew Normal, (d) Skew t]

Figure 1: This figure presents random draws from four joint distributions, all with standard Normal margins. Panel (a) uses a Gaussian copula, Panel (b) uses a Student's t copula, Panel (c) uses a skew Normal copula, Panel (d) uses a skew t copula. For all four copulas the correlation parameter is set to 0.5. For both t copulas the degrees of freedom parameter is set to 5. For both skewed copulas the skewness parameter is set to -0.1.

[Figure: Bayesian information criterion across values of G. x-axis: Number of groups (G); y-axis: BIC value; series: 1 digit SIC (7 groups), 2 digit SIC (21 groups), EM]

Figure 2: Plot of BIC value as a function of the number of groups (G) for the EM-estimated model. The BIC values for the 1-digit and 2-digit SIC-based groups are also reported for comparison; these models have 7 and 21 groups respectively. As usual, lower BIC values are preferred. (Note the y-axis has been scaled by 10^{-4} for ease of presentation.)

[Figure: Within group rank correlations for EM (thin line) and SIC (thick line) groups; three time-series panels, 2011 to 2019]

Figure 3: Time series plots of model-implied within-group rank correlations. The upper panel presents estimated group 3 and SIC group 13; the middle panel presents estimated group 7 and SIC group 36; the lower panel presents estimated group 9 and SIC group 49.

[Figure: three time-series panels, 2011 to 2019. Upper: Distance between EM and SIC based rank correlation matrices (y-axis: QLIKE). Middle: Normalized sum of 22 largest eigenvalues (y-axis: Sum of eigs; series: EM, SIC). Lower: 90-10% spread in pairwise rank correlations (y-axis: Dispersion in correls; series: EM, SIC)]

Figure 4: The upper panel presents the QLIKE distance between the conditional rank correlation matrices implied by the 2-digit SIC-based model and the optimal EM-based factor copula model, both of which have a total of 22 factors. The middle panel presents the sum of the 22 largest eigenvalues of the conditional rank correlation matrices, divided by 110, the number of assets. The lower panel presents the difference between the 90% and 10% cross-sectional quantiles of all 5,995 pairwise rank correlations.
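The lower-panel measure of Figure 4 can be sketched as follows, assuming a rank correlation matrix is available as a list of lists for a given date. The function name `pairwise_spread` and the "inclusive" quantile interpolation are illustrative assumptions; the source does not state which quantile convention is used. The sketch also confirms the pair count: 110 assets give 110 × 109 / 2 = 5,995 distinct pairs.

```python
import statistics

def pairwise_spread(corr):
    """Return (number of distinct pairs, 90% minus 10% quantile).

    corr: square correlation matrix as a list of lists.
    The quantile interpolation method is an assumption.
    """
    n = len(corr)
    # Upper triangle, excluding the diagonal of ones.
    pairs = [corr[i][j] for i in range(n) for j in range(i + 1, n)]
    deciles = statistics.quantiles(pairs, n=10, method="inclusive")
    return len(pairs), deciles[-1] - deciles[0]

# Toy equicorrelation matrix (made up for illustration) with 110 assets.
n_assets = 110
toy = [[1.0 if i == j else 0.5 for j in range(n_assets)] for i in range(n_assets)]
n_pairs, spread = pairwise_spread(toy)
print(n_pairs, spread)
```

With every off-diagonal entry equal, the spread is zero; a wider spread indicates more heterogeneity across pairs.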

[Figure: Conditional model comparisons: Differences in expected log likelihoods. Rows: Static vs GAS, Gaussian vs Stud t, SIC vs Clustered; columns: VIX, Dispersion, Abs alpha; legend: Est., 95% CI]

Figure 5: This figure presents estimates of the expected difference in out-of-sample log likelihoods between the models listed in the y-axis label, conditioning on the variable given in the x-axis label. Positive values indicate that the second model in the comparison is preferred. Also presented are pointwise 95% confidence intervals for the estimated difference.

Supplemental Appendix for
Dynamic Factor Copula Models with Estimated Cluster Assignments
by Dong Hwan Oh and Andrew J. Patton
14 April 2022

Table S1: Simulation results for time-varying copulas, G=20

Panel A: Parameter estimation accuracy

| Parameter | True | Gaussian Mean | Gaussian Std Dev | t Mean | t Std Dev | skew t Mean | skew t Std Dev |
|---|---|---|---|---|---|---|---|
| ω^M_1 | 0.04 | 0.043 | 0.007 | 0.042 | 0.007 | 0.044 | 0.008 |
| ω^M_2 | 0.04 | 0.043 | 0.006 | 0.041 | 0.008 | 0.043 | 0.008 |
| ω^M_3 | 0.04 | 0.043 | 0.006 | 0.042 | 0.007 | 0.043 | 0.008 |
| ω^M_4 | 0.04 | 0.043 | 0.007 | 0.042 | 0.007 | 0.044 | 0.008 |
| ω^M_5 | 0.04 | 0.044 | 0.007 | 0.042 | 0.007 | 0.043 | 0.007 |
| ω^M_6 | 0.04 | 0.043 | 0.007 | 0.042 | 0.006 | 0.043 | 0.007 |
| ω^M_7 | 0.04 | 0.043 | 0.007 | 0.042 | 0.007 | 0.043 | 0.008 |
| ω^M_8 | 0.04 | 0.043 | 0.007 | 0.042 | 0.007 | 0.043 | 0.008 |
| ω^M_9 | 0.04 | 0.043 | 0.007 | 0.042 | 0.007 | 0.043 | 0.008 |
| ω^M_10 | 0.04 | 0.043 | 0.007 | 0.042 | 0.007 | 0.043 | 0.008 |
| ω^M_11 | 0.04 | 0.043 | 0.007 | 0.042 | 0.007 | 0.043 | 0.008 |
| ω^M_12 | 0.04 | 0.043 | 0.006 | 0.042 | 0.007 | 0.043 | 0.008 |
| ω^M_13 | 0.04 | 0.043 | 0.007 | 0.042 | 0.006 | 0.044 | 0.007 |
| ω^M_14 | 0.04 | 0.044 | 0.007 | 0.042 | 0.007 | 0.043 | 0.008 |
| ω^M_15 | 0.04 | 0.043 | 0.007 | 0.043 | 0.007 | 0.044 | 0.008 |
| ω^M_16 | 0.04 | 0.043 | 0.007 | 0.042 | 0.007 | 0.043 | 0.007 |
| ω^M_17 | 0.04 | 0.043 | 0.007 | 0.042 | 0.007 | 0.043 | 0.008 |
| ω^M_18 | 0.04 | 0.043 | 0.006 | 0.041 | 0.007 | 0.043 | 0.008 |
| ω^M_19 | 0.04 | 0.043 | 0.008 | 0.042 | 0.006 | 0.044 | 0.007 |
| ω^M_20 | 0.04 | 0.043 | 0.007 | 0.042 | 0.007 | 0.043 | 0.008 |
| ω^C_1 | 0.04 | 0.044 | 0.009 | 0.045 | 0.009 | 0.049 | 0.042 |
| ω^C_2 | 0.04 | 0.045 | 0.009 | 0.045 | 0.010 | 0.049 | 0.032 |
| ω^C_3 | 0.04 | 0.045 | 0.008 | 0.045 | 0.010 | 0.049 | 0.036 |
| ω^C_4 | 0.04 | 0.045 | 0.009 | 0.045 | 0.009 | 0.049 | 0.031 |
| ω^C_5 | 0.04 | 0.044 | 0.008 | 0.045 | 0.009 | 0.049 | 0.036 |
| ω^C_6 | 0.04 | 0.045 | 0.008 | 0.045 | 0.010 | 0.049 | 0.043 |
| ω^C_7 | 0.04 | 0.044 | 0.007 | 0.045 | 0.010 | 0.048 | 0.034 |
| ω^C_8 | 0.04 | 0.044 | 0.008 | 0.045 | 0.010 | 0.049 | 0.038 |
| ω^C_9 | 0.04 | 0.044 | 0.007 | 0.045 | 0.010 | 0.049 | 0.039 |
| ω^C_10 | 0.04 | 0.044 | 0.008 | 0.045 | 0.010 | 0.049 | 0.039 |
| ω^C_11 | 0.04 | 0.044 | 0.007 | 0.045 | 0.010 | 0.049 | 0.035 |
| ω^C_12 | 0.04 | 0.045 | 0.009 | 0.045 | 0.010 | 0.049 | 0.039 |
| ω^C_13 | 0.04 | 0.045 | 0.008 | 0.045 | 0.009 | 0.049 | 0.037 |
| ω^C_14 | 0.04 | 0.044 | 0.008 | 0.045 | 0.009 | 0.048 | 0.035 |
| ω^C_15 | 0.04 | 0.045 | 0.008 | 0.045 | 0.009 | 0.049 | 0.037 |
| ω^C_16 | 0.04 | 0.045 | 0.009 | 0.045 | 0.010 | 0.049 | 0.043 |
| ω^C_17 | 0.04 | 0.044 | 0.009 | 0.045 | 0.010 | 0.049 | 0.036 |
| ω^C_18 | 0.04 | 0.045 | 0.009 | 0.045 | 0.010 | 0.049 | 0.037 |
| ω^C_19 | 0.04 | 0.045 | 0.008 | 0.045 | 0.010 | 0.049 | 0.040 |
| ω^C_20 | 0.04 | 0.045 | 0.009 | 0.045 | 0.010 | 0.048 | 0.036 |
| α^M | 0.02 | 0.020 | 0.002 | 0.020 | 0.002 | 0.020 | 0.002 |
| β^M | 0.90 | 0.891 | 0.014 | 0.894 | 0.015 | 0.895 | 0.017 |
| α^C | 0.02 | 0.020 | 0.002 | 0.020 | 0.003 | 0.020 | 0.003 |
| β^C | 0.90 | 0.890 | 0.018 | 0.889 | 0.022 | 0.879 | 0.091 |
| ν | 5.00 | | | 5.001 | 0.068 | 5.017 | 0.105 |
| ζ | -0.10 | | | | | -0.100 | 0.007 |

Panel B: Group assignment estimation accuracy

| Number incorrect | Gaussian | t | skew t |
|---|---|---|---|
| 0 | 100 | 100 | 99 |
| 1 | 0 | 0 | 1 |
| ≥ 2 | 0 | 0 | 0 |

Panel C: Estimation details

| | Gaussian Clustering | Gaussian Copula | t Clustering | t Copula | skew t Clustering | skew t Copula |
|---|---|---|---|---|---|---|
| Time (min) | 21.64 | 42.6 | 21.59 | 56.4 | 22.85 | 81.6 |
| EM (iter) | 85.5 | - | 85.3 | - | 85.6 | - |

Notes: This table presents results from 100 simulations from Gaussian, t, and skew t factor copulas with 20 groups and GAS dynamics. Panel A presents results on estimation accuracy of the copula parameters, Panel B presents results on estimation accuracy of the group assignments, and Panel C presents average estimation time (for the two stages of estimation) and EM iterations using a machine with an Intel Xeon processor, with ten cores and clock speed of 2.60GHz.

Table S2: List of firms used in the empirical analysis

| Ticker | Name | SIC | Ticker | Name | SIC | Ticker | Name | SIC |
|---|---|---|---|---|---|---|---|---|
| AAPL | Apple | 35 | DVN | Devon | 13 | NKE | Nike | 30 |
| ABT | Abbott Lab. | 28 | EBAY | Ebay | 73 | NOV | Nat. Oilwell | 35 |
| ACN | Accenture | 67 | EMR | Emerson Ele | 35 | NSC | Norfolk South | 40 |
| ADBE | Adobe | 73 | ETR | Entergy Corp | 49 | NVDA | Nvidia | 36 |
| AEP | Ame Elec Pow | 49 | EXC | Exelon | 49 | ORCL | Oracle | 73 |
| AGN | Actavis | 28 | F | Ford | 37 | OXY | Occidental | 13 |
| AIG | Ame Inter Group | 63 | FCX | Freeport Mcmo | 10 | PCLN | Priceline | 73 |
| ALL | Allstate | 63 | FDX | Fedex | 45 | PEP | Pepsico | 20 |
| AMGN | Amgen | 28 | GD | Gen Dynamics | 37 | PFE | Pfizer | 28 |
| AMT | American Tower | 48 | GE | Gen Electric | 35 | PG | Procter Gamble | 28 |
| AMZN | Amazon | 73 | GILD | Gilead | 28 | PM | Philip Morris | 21 |
| APA | Apache | 13 | GOOGL | Google | 73 | QCOM | Qualcomm | 36 |
| AVP | Avon Products | 28 | GS | Goldman Sachs | 62 | RF | Regions Fin | 60 |
| AXP | Amex | 60 | HAL | Halliburton | 13 | RTN | Raytheon | 38 |
| BA | Boeing | 37 | HD | Home Depot | 52 | SBUX | Starbucks | 58 |
| BAC | Bank Of Am | 60 | HON | Honeywell Int | 37 | SLB | Schlumberger | 13 |
| BAX | Baxter | 38 | HPQ | Hewlett Pac | 35 | SNS | Steak N Shake | 58 |
| BHI | Baker Hughes | 35 | IBM | IBM | 35 | SO | Southern Co | 49 |
| BIIB | Biogen | 28 | INTC | Intel | 36 | SPG | Simon Property | 67 |
| BK | Bank Of NY | 60 | JNJ | Johnson & J | 28 | T | A T & T | 48 |
| BLK | Blackrock | 62 | JPM | Jpmorgan | 60 | TGT | Target | 53 |
| BMY | Bristol-Myers | 28 | KO | Coca Cola | 20 | TMO | Thermo Fisher | 38 |
| C | Citigroup Inc | 60 | LLY | Lilly Eli | 28 | TXN | Texas Instru | 36 |
| CAT | Caterpillar | 35 | LMT | Lockheed Mar | 37 | UNH | Unitedhealth | 63 |
| CL | Colgate Palmo | 28 | LOW | Lowes | 52 | UNP | Union Pacific | 40 |
| CMCSA | Comcast | 48 | MA | Mastercard | 73 | UPS | United Parcel | 42 |
| COF | Capital One | 60 | MCD | Mcdonalds | 58 | USB | U S Bancorp | 60 |
| COP | Conocophillips | 13 | MDLZ | Mondelez | 20 | V | Visa | 61 |
| COST | Costco | 53 | MDT | Medtronic | 38 | VZ | Verizon | 48 |
| CPB | Campbell Soup | 20 | MET | Metlife | 63 | WBA | Walgreens | 59 |
| CRM | Salesforce | 73 | MMM | 3M | 38 | WFC | Wells Fargo | 60 |
| CSCO | Cisco Sys | 36 | MO | Altria | 21 | WMB | Williams Co | 49 |
| CVS | C V S Health | 59 | MRK | Merck | 28 | WMT | Walmart | 53 |
| CVX | Chevron | 13 | MS | Morgan Stanley | 60 | WY | Weyerhaeuser | 08 |
| DHR | Danaher | 38 | MSFT | Microsoft | 73 | XOM | Exxon Mobil | 13 |
| DIS | Disney Walt | 48 | NEE | Nextera Energy | 49 | XRX | Xerox | 35 |
| DUK | Duke Energy | 49 | NFLX | Netflix | 78 | | | |

| 1-digit SIC | Description | Num |
|---|---|---|
| SIC 0 | Forestry, Agri. | 1 |
| SIC 1 | Mining, construct. | 9 |
| SIC 2 | Manuf: food, furn. | 19 |
| SIC 3 | Manuf: elec, mach | 26 |
| SIC 4 | Transprt, comm's | 16 |
| SIC 5 | Trade | 10 |
| SIC 6 | Finance, Ins | 19 |
| SIC 7 | Services | 10 |
| Total | | 110 |

Table S3: Estimation results for the 1-digit SIC model

Panel A: Parameter estimation accuracy

| Parameter | Gaussian Est. | Gaussian Std Dev | t Est. | t Std Dev | skew t Est. | skew t Std Dev |
|---|---|---|---|---|---|---|
| ω^M_1 | 0.105 | 0.015 | 0.020 | 0.007 | 0.020 | 0.007 |
| ω^M_2 | 0.074 | 0.012 | 0.014 | 0.005 | 0.014 | 0.005 |
| ω^M_3 | 0.095 | 0.015 | 0.018 | 0.007 | 0.018 | 0.007 |
| ω^M_4 | 0.073 | 0.013 | 0.014 | 0.005 | 0.014 | 0.005 |
| ω^M_5 | 0.074 | 0.012 | 0.014 | 0.005 | 0.014 | 0.005 |
| ω^M_6 | 0.107 | 0.017 | 0.021 | 0.008 | 0.020 | 0.008 |
| ω^M_7 | 0.086 | 0.013 | 0.017 | 0.006 | 0.016 | 0.006 |
| ω^C_1 | 0.006 | 0.002 | 0.006 | 0.002 | 0.006 | 0.002 |
| ω^C_2 | 0.003 | 0.001 | 0.003 | 0.001 | 0.003 | 0.001 |
| ω^C_3 | 0.001 | 0.000 | 0.001 | 0.000 | 0.001 | 0.000 |
| ω^C_4 | 0.003 | 0.001 | 0.003 | 0.001 | 0.003 | 0.001 |
| ω^C_5 | 0.002 | 0.001 | 0.003 | 0.001 | 0.003 | 0.001 |
| ω^C_6 | 0.003 | 0.001 | 0.004 | 0.001 | 0.004 | 0.001 |
| ω^C_7 | 0.003 | 0.001 | 0.003 | 0.001 | 0.004 | 0.001 |
| α^M | 0.019 | 0.001 | 0.008 | 0.002 | 0.008 | 0.002 |
| β^M | 0.890 | 0.017 | 0.979 | 0.008 | 0.979 | 0.008 |
| α^C | 0.003 | 0.001 | 0.003 | 0.001 | 0.003 | 0.001 |
| β^C | 0.993 | 0.002 | 0.993 | 0.002 | 0.993 | 0.002 |
| ν | | | 32.170 | 1.390 | 30.913 | 2.464 |
| ζ | | | | | -0.227 | 0.004 |

Panel B: Estimation details

| | Gaussian | t | skew t |
|---|---|---|---|
| log L | 71237.13 | 74912.38 | 74997.21 |
| AIC | -142438 | -149787 | -149954 |
| BIC | -142333 | -149676 | -149838 |
| Time (hours) | 0.82 | 0.96 | 1.42 |

Notes: This table presents the estimated parameters and standard errors for a factor copula when the group assignments are based on one-digit SIC codes, leading to seven groups.

Table S4: Estimation results for the optimal 21 group model

Panel A: Parameter estimation accuracy

| Parameter | Gaussian Est. | Gaussian Std Dev | t Est. | t Std Dev | skew t Est. | skew t Std Dev |
|---|---|---|---|---|---|---|
| ω^M_1 | 0.086 | 0.007 | 0.010 | 0.004 | 0.010 | 0.003 |
| ω^M_2 | 0.143 | 0.007 | 0.017 | 0.005 | 0.017 | 0.005 |
| ω^M_3 | 0.105 | 0.007 | 0.012 | 0.004 | 0.012 | 0.004 |
| ω^M_4 | 0.129 | 0.008 | 0.015 | 0.005 | 0.015 | 0.005 |
| ω^M_5 | 0.086 | 0.006 | 0.010 | 0.003 | 0.010 | 0.003 |
| ω^M_6 | 0.087 | 0.006 | 0.010 | 0.004 | 0.010 | 0.004 |
| ω^M_7 | 0.100 | 0.005 | 0.012 | 0.004 | 0.012 | 0.004 |
| ω^M_8 | 0.101 | 0.006 | 0.012 | 0.004 | 0.012 | 0.004 |
| ω^M_9 | 0.081 | 0.006 | 0.010 | 0.004 | 0.010 | 0.003 |
| ω^M_10 | 0.082 | 0.006 | 0.010 | 0.003 | 0.010 | 0.003 |
| ω^M_11 | 0.129 | 0.007 | 0.015 | 0.005 | 0.015 | 0.005 |
| ω^M_12 | 0.086 | 0.006 | 0.010 | 0.004 | 0.010 | 0.004 |
| ω^M_13 | 0.087 | 0.005 | 0.010 | 0.004 | 0.010 | 0.003 |
| ω^M_14 | 0.117 | 0.007 | 0.014 | 0.005 | 0.014 | 0.005 |
| ω^M_15 | 0.130 | 0.006 | 0.015 | 0.005 | 0.015 | 0.005 |
| ω^M_16 | 0.135 | 0.009 | 0.016 | 0.005 | 0.016 | 0.005 |
| ω^M_17 | 0.108 | 0.007 | 0.013 | 0.004 | 0.013 | 0.004 |
| ω^M_18 | 0.059 | 0.005 | 0.007 | 0.003 | 0.007 | 0.002 |
| ω^M_19 | 0.165 | 0.009 | 0.019 | 0.006 | 0.020 | 0.006 |
| ω^M_20 | 0.090 | 0.006 | 0.011 | 0.004 | 0.011 | 0.004 |
| ω^M_21 | 0.141 | 0.009 | 0.017 | 0.006 | 0.017 | 0.006 |
| ω^C_1 | 0.003 | 0.002 | 0.003 | 0.001 | 0.003 | 0.001 |
| ω^C_2 | 0.006 | 0.003 | 0.005 | 0.001 | 0.005 | 0.001 |
| ω^C_3 | 0.006 | 0.003 | 0.005 | 0.001 | 0.005 | 0.001 |
| ω^C_4 | 0.003 | 0.002 | 0.003 | 0.000 | 0.003 | 0.000 |
| ω^C_5 | 0.003 | 0.002 | 0.003 | 0.001 | 0.003 | 0.001 |
| ω^C_6 | 0.005 | 0.002 | 0.004 | 0.001 | 0.004 | 0.001 |
| ω^C_7 | 0.003 | 0.002 | 0.003 | 0.000 | 0.003 | 0.000 |
| ω^C_8 | 0.002 | 0.001 | 0.001 | 0.000 | 0.001 | 0.000 |
| ω^C_9 | 0.009 | 0.004 | 0.007 | 0.001 | 0.007 | 0.001 |
| ω^C_10 | 0.003 | 0.002 | 0.003 | 0.000 | 0.003 | 0.000 |
| ω^C_11 | 0.006 | 0.003 | 0.005 | 0.001 | 0.005 | 0.001 |
| ω^C_12 | 0.003 | 0.002 | 0.003 | 0.000 | 0.003 | 0.000 |
| ω^C_13 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| ω^C_14 | 0.003 | 0.002 | 0.002 | 0.000 | 0.002 | 0.000 |
| ω^C_15 | 0.002 | 0.001 | 0.002 | 0.000 | 0.002 | 0.000 |
| ω^C_16 | 0.004 | 0.002 | 0.003 | 0.000 | 0.003 | 0.000 |
| ω^C_17 | 0.007 | 0.004 | 0.006 | 0.001 | 0.006 | 0.001 |
| ω^C_18 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 |
| ω^C_19 | 0.008 | 0.004 | 0.007 | 0.001 | 0.007 | 0.001 |
| ω^C_20 | 0.004 | 0.002 | 0.003 | 0.000 | 0.003 | 0.000 |
| ω^C_21 | 0.008 | 0.004 | 0.006 | 0.001 | 0.006 | 0.001 |
| α^M | 0.038 | 0.002 | 0.013 | 0.002 | 0.012 | 0.002 |
| β^M | 0.885 | 0.007 | 0.986 | 0.005 | 0.986 | 0.005 |
| α^C | 0.007 | 0.001 | 0.008 | 0.001 | 0.008 | 0.001 |
| β^C | 0.993 | 0.003 | 0.995 | 0.001 | 0.995 | 0.001 |
| ν | | | 33.965 | 1.321 | 33.788 | 1.573 |
| ζ | | | | | -0.394 | 0.041 |

Panel B: Estimation details

| | Gaussian | t | skew t |
|---|---|---|---|
| log L | 86056.86 | 89506.49 | 89624.77 |
| AIC | -172022 | -178919 | -179154 |
| BIC | -171754 | -178645 | -178874 |
| Time (clustering) (hrs) | 8.63 | 8.63 | 8.63 |
| Time (copula) (hrs) | 1.37 | 1.38 | 1.43 |
| EM iterations | 98.7 | 98.7 | 98.7 |

Notes: This table presents the estimated parameters and standard errors for a factor copula when the group assignments are estimated from the data. The number of groups is set to 21 and 100 randomly-chosen starting values for group assignment estimation are used. The average estimation time and EM iterations presented in Panel B are based on a machine with 28 cores.
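The AIC values reported in Panel B of Tables S3 and S4 can be cross-checked with a short sketch using the standard definition AIC = 2k - 2 log L. The parameter count k is an assumption inferred by counting the rows of Panel A: G market loadings, G cluster loadings, four GAS parameters, plus ν for the t copula and ν and ζ for the skew t copula. The helper names `n_params` and `aic` are illustrative.

```python
def n_params(groups, copula):
    """Parameter count inferred from the rows of Panel A (an assumption)."""
    extra = {"gaussian": 0, "t": 1, "skew_t": 2}[copula]
    return 2 * groups + 4 + extra

def aic(log_lik, k):
    """Akaike information criterion, standard definition."""
    return 2 * k - 2 * log_lik

# (groups, copula) -> (log L, reported AIC), copied from Tables S3 and S4.
reported = {
    (7, "gaussian"): (71237.13, -142438),
    (7, "t"): (74912.38, -149787),
    (7, "skew_t"): (74997.21, -149954),
    (21, "gaussian"): (86056.86, -172022),
    (21, "t"): (89506.49, -178919),
    (21, "skew_t"): (89624.77, -179154),
}

for (groups, copula), (log_lik, aic_reported) in reported.items():
    value = aic(log_lik, n_params(groups, copula))
    print(groups, copula, round(value), aic_reported)
```

Under this parameter count the rounded AIC matches the reported value in all six cases, which supports reading k as 2G + 4 plus the shape parameters.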

[Figure: Within group rank correlations for EM (thin line) and SIC (thick line) groups; three time-series panels, 2011 to 2019]

Figure S1: Time series plots of model-implied within-group rank correlations. The upper panel presents estimated group 1 and SIC group 28; the middle panel presents estimated group 2 and SIC group 60; the lower panel presents estimated group 4 and SIC group 40.

[Figure: Within group rank correlations for EM (thin line) and SIC (thick line) groups; three time-series panels, 2011 to 2019]

Figure S2: Time series plots of model-implied within-group rank correlations. The upper panel presents estimated group 5 and SIC group 73; the middle panel presents estimated group 6 and SIC group 20; the lower panel presents estimated group 8 and SIC group 63.

[Figure: two time-series panels, 2011 to 2019. Upper: Normalized largest eigenvalue (series: EM, SIC). Lower: Normalized sum of 3 largest eigenvalues. y-axis: Sum of eigs]

Figure S3: The upper panel presents the largest eigenvalues of the conditional rank correlation matrices, divided by 110, the number of assets, for the 2-digit SIC-based model and the optimal EM-based factor copula model, both of which have a total of 22 factors. The lower panel presents the sum of the 3 largest eigenvalues of the conditional rank correlation matrices, also divided by 110.

Cite this document

APA

Dong Hwan Oh and Andrew J. Patton (2022). Dynamic Factor Copula Models with Estimated Cluster Assignments (FEDS 2021-029). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2021-029

BibTeX

@techreport{wtfs_feds_2021_029,
  author = {Dong Hwan Oh and Andrew J. Patton},
  title = {Dynamic Factor Copula Models with Estimated Cluster Assignments},
  type = {Finance and Economics Discussion Series},
  number = {2021-029},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2022},
  url = {https://whenthefedspeaks.com/doc/feds_2021-029},
  abstract = {This paper proposes a dynamic multi-factor copula for use in high dimensional time series applications. A novel feature of our model is that the assignment of individual variables to groups is estimated from the data, rather than being pre-assigned using SIC industry codes, market capitalization ranks, or other ad hoc methods. We adapt the k-means clustering algorithm for use in our application and show that it has excellent finite-sample properties. Applying the new model to returns on 110 US equities, we find around 20 clusters to be optimal. In out-of-sample forecasts, we find that a model with as few as five estimated clusters significantly outperforms an otherwise identical model with 21 clusters formed using two-digit SIC codes.},
}