feds · March 27, 2024

Linear Factor Models and the Estimation of Expected Returns

Abstract

This paper analyzes the properties of expected return estimators on individual assets implied by the linear factor models of asset pricing, i.e., the product of β and λ. We provide the asymptotic properties of factor-model-based expected return estimators, which yield the standard errors for risk premium estimators for individual assets. We show that using factor-model-based risk premium estimates leads to sizable precision gains compared to using historical averages. Finally, inference about expected returns does not suffer from a small-beta bias when factors are traded. The more precise factor-model-based estimates of expected returns translate into sizable improvements in out-of-sample performance of optimal portfolios.

Finance and Economics Discussion Series Federal Reserve Board, Washington, D.C. ISSN 1936-2854 (Print) ISSN 2767-3898 (Online) Linear Factor Models and the Estimation of Expected Returns Cisil Sarisoy, Peter de Goeij, and Bas J.M. Werker 2024-014 Please cite this paper as: Sarisoy, Cisil, Peter de Goeij, and Bas J.M. Werker (2024). “Linear Factor Models and the Estimation of Expected Returns,” Finance and Economics Discussion Series 2024-014. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2024.014. NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Linear Factor Models and the Estimation of Expected Returns ∗

Cisil Sarisoy (Federal Reserve Board), Peter de Goeij (Tilburg University), Bas J.M. Werker (Tilburg University)

January 2024

Keywords: Cross Section of Expected Returns, Risk Premium, Small β's.

∗ We thank Torben G. Andersen, Bertille Antoine, Svetlana Bryzgalova, Frank de Jong, Joost Driessen, Stefano Giglio, Bryan Kelly, Frank Kleibergen, Yinying Li, Paulo Maio, Adam McCloskey, Dino Palazzo, Andrew Patton, Eric Renault, Enrique Sentana, George Tauchen, Viktor Todorov, Brian Weller, Dacheng Xiu, and Guofu Zhou for helpful comments and discussions as well as seminar and conference participants at BlackRock, Federal Reserve Board, Northwestern University Kellogg School of Management, Erasmus University Rotterdam, Tilburg University, and CIREQ Montreal Econometrics Conference in honor of Eric Renault. We also thank Chazz Edington for his excellent research assistance. The views expressed are solely those of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System, or of any other person associated with the Federal Reserve System. Corresponding author: Cisil Sarisoy, Federal Reserve Board, Washington, D.C. 20551 U.S.A. E-mail: cisil.sarisoy@frb.gov.

1 Introduction

Estimating expected returns on individual assets or portfolios is perhaps one of the longest-standing challenges in asset pricing. One standard approach is to use historical averages. However, these estimates are known to be very noisy in general. Even using daily data does not help much, if at all. There is a long history of papers trying to improve estimates of expected returns by using asset pricing models, in which expected excess returns on individual assets are linear in their exposures to the imposed risk factors (β). The coefficients in this linear relationship are the prices of risk for the factors (λ). Examples include Sharpe (1964)'s CAPM, Merton (1973)'s ICAPM, Breeden (1979)'s CCAPM, Ross (1976)'s APT, and Lettau and Ludvigson (2001)'s conditional CCAPM, among many others.

The literature on inference based on factor models mainly concentrates, in a frequentist setting, on the econometric properties of the prices of risk, λ, and on evaluating the ability of the models to explain the cross section of expected returns. In this paper, the focus is different: we analyze the estimation of the expected (excess) returns on individual assets or portfolios based on linear factor models, i.e., the product of exposures β and risk prices λ. In order to have an estimate of the expected (excess) return on an individual asset, both β and λ have to be estimated, and the dependence between these estimators introduces a nontrivial noise structure in the standard errors of the expected (excess) return estimators.

Jorion (1991) compares CAPM-based estimators with classical sample averages of past returns, finding that the former outperform the latter in estimating expected stock returns. Pástor and Stambaugh (1999) investigate, in a Bayesian setting, the impact of prior uncertainty about mispricing in a factor model on the posterior estimates of the cost of equity.
Similarly, Pástor (2000) develops Bayesian approaches to examine the role of prior mispricing in portfolio allocation decisions. Our paper complements this earlier work by providing the first asymptotic analysis for the expected (excess) return estimators for several often-used factor models. Such limiting distributions yield the frequentist standard errors and, accordingly, confidence bounds for the expected (excess) return of individual assets or portfolios. Moreover, we evaluate the implications of weakly correlated factors for the estimation of expected (excess) returns. We examine inference under various settings: the factors are traded, the factors are non-traded, or their mimicking portfolios are used in the estimation.

First, we derive the asymptotic properties of expected (excess) return, i.e., risk-premium, estimators based on factor models. These limiting distributions yield the standard errors for individual assets or portfolios. We thereby assess the precision gains from using factor-model-based risk-premium estimators vis-à-vis the historical averages approach. In particular, we provide closed-form asymptotic expressions for these precision gains. We show in Theorems 4.2, 4.3, and 4.4 that exploiting the linear relationship implied by linear factor models indeed leads to more precise estimates of risk premiums as compared to historical averages. In an empirical analysis of the estimation of risk premiums on 25 Fama and French (1992) size and book-to-market sorted portfolios, we document reductions in estimated variances of up to 24% for individual portfolios.

Second, we analyze the estimation of risk premiums in the presence of weakly correlated and spurious factors. When factors are weakly correlated with assets, i.e., β's are small, the standard confidence intervals of the price of risk estimates are known to be erroneous (see, e.g., Kleibergen, 2009). This effect may be severe in empirical research, as these confidence intervals may be unbounded, as documented for the case of the consumption CAPM of Lettau and Ludvigson (2001).¹ This is a relevant issue in practice because macroeconomic variables are typically weakly related to individual asset/portfolio returns.
We demonstrate that such issues do not exist if the object of interest is the risk premiums on individual assets, but only in case factors are traded. In that case, the limiting variances of the risk-premium estimators are not affected by the β's being small; see Corollary 5.1-2. Monte Carlo simulation results document that those limiting variances provide reliable approximations of the finite-sample variances of the factor-model-based estimators of risk premiums.

Third, we explore the implications of the precision gains from using factor-model-based estimates of expected returns in Markowitz (1952)'s setting. The implementation of the mean-variance framework of Markowitz (1952) in practice requires the estimation of the first two moments of asset returns. Constructing optimal portfolios with imprecise estimates of expected returns (historical averages) and the sample covariance matrix generally leads to poor out-of-sample performance.² At the extreme, this has led to simply abandoning the application of theoretically optimal decisions and using naive techniques such as the 1/N strategy or the global minimum variance (GMV) portfolio, as these are not subject to estimation risk on expected returns (DeMiguel, Garlappi, and Uppal, 2009).³ Our Monte Carlo simulations document strong improvements in the out-of-sample Sharpe ratios of optimal portfolios when constructed with factor-model-based estimates of risk premiums as compared to when constructed with the historical averages. Moreover, optimal portfolios constructed with the factor-model-based risk-premium estimates perform better than both the GMV portfolio and the 1/N strategy portfolio.

The remainder of the paper is organized as follows. Section 2 introduces our set-up and presents the linear factor model with the assumptions that form the basis of our statistical analysis. Next, we introduce factor-mimicking portfolios and clarify the link between the expected returns obtained with non-traded factors and with factor-mimicking portfolios. Section 3 discusses in detail the standard GMM estimators we consider. In particular, we recall the different sets of moment conditions for various cases such as all factors being traded or using factor-mimicking portfolios. Section 4 derives the asymptotic properties of these induced GMM estimators, and we derive the efficiency gains over and above the risk-premium estimator based on historical averages. Section 5 presents the analysis for small β's. Section 6 reports results from a Monte Carlo simulation experiment to study the finite-sample properties of the factor-model-based estimators of expected (excess) returns. Section 7 presents our simulation analysis for portfolio optimization, and Section 8 concludes. All proofs are gathered in the appendix.

¹ See also Kan and Zhang (1999), Gospodinov, Kan, and Robotti (2014), Bryzgalova (2015), Burnside (2015), Gospodinov, Kan, and Robotti (2017, 2019), and Giglio, Xiu, and Zhang (2021) on the role of spurious or weakly identified factors for inference about the prices of risk.

² See, for example, Frost and Savarino (1988), Michaud (1989), Jobson and Korkie (1980), and Best and Grauer (1991).

³ Several studies provide solutions for improving the covariance matrix estimates (see, e.g., Ledoit and Wolf, 2003, DeMiguel et al., 2009, among others). However, the estimation error in asset return means is more severe than the error in covariance estimates (see Merton, 1980, Chopra and Ziemba, 1993), and the imprecision in estimates of the expected returns affects the optimal portfolio weights more drastically than the imprecision in covariance estimates (see DeMiguel et al., 2009).

2 Model and Assumptions

Let M be a candidate stochastic discount factor such that, for any traded asset i = 1, 2, ..., N with excess return $R^e_i$,

$$E[M R^e_i] = 0. \tag{2.1}$$

Linear factor models additionally specify $M = a + b'F$, where $F = (F_1, \ldots, F_K)'$ is a vector of K factors. Note that (2.1) can be written in matrix notation using the vector of excess returns $R^e = (R^e_1, \ldots, R^e_N)'$. Throughout we impose the following.

Assumption 1. The N-vector of excess asset returns $R^e$ and the K-vector of factors F with K < N satisfy the following conditions:
1. The covariance matrix of excess returns $\Sigma_{R^eR^e}$ has full rank N,
2. The covariance matrix of factors $\Sigma_{FF}$ has full rank K,
3. The covariance between excess returns and factors, $\mathrm{Cov}[R^e, F']$, has full rank K.
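The step from the pricing condition (2.1) to a representation that is linear in factor exposures can be sketched in a few lines. The derivation below is only a sketch and assumes $E[M] \neq 0$ (an assumption made here, not stated explicitly above); the labels β and λ under the braces simply name the two groupings.

```latex
\begin{align*}
0 = E[M R^e]
  &= E[M]\,E[R^e] + \mathrm{Cov}[R^e, M] \\
  &= E[M]\,E[R^e] + \mathrm{Cov}[R^e, F']\,b
     && \text{since } M = a + b'F, \\
\Longrightarrow\quad
E[R^e]
  &= -\frac{1}{E[M]}\,\mathrm{Cov}[R^e, F']\,b \\
  &= \underbrace{\mathrm{Cov}[R^e, F']\,\Sigma_{FF}^{-1}}_{\beta}\;
     \underbrace{\Bigl(-\frac{1}{E[M]}\,\Sigma_{FF}\,b\Bigr)}_{\lambda}.
\end{align*}
```

Inserting $\Sigma_{FF}^{-1}\Sigma_{FF}$ in the last line is possible because $\Sigma_{FF}$ has full rank under Assumption 1.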

The linear asset-pricing model can alternatively be expressed using the beta representation

$$E[R^e] = \beta\lambda, \tag{2.2}$$

where

$$\beta = \mathrm{Cov}[R^e, F']\,\Sigma_{FF}^{-1}, \qquad \lambda = -\frac{1}{E[M]}\,\Sigma_{FF}\,b.$$

Thus, (2.2) specifies a linear relationship between risk premiums on individual assets, $E[R^e]$, and their exposures β to the risk factors F. The vector λ denotes the so-called prices of risk of the factors.⁴ The primary focus of our analysis is on inference about (2.2). For our main results, the following assumptions are needed.

Assumption 2. Assume that $[R^{e\prime}_t, F'_t]'$ is a jointly stationary and ergodic process with a finite fourth moment.

Assumption 3. Let $\varepsilon_t = R^e_t - \alpha - \beta F_t$. Assume that $E[\varepsilon_t \mid F_t] = 0$ and $\mathrm{Var}[\varepsilon_t \mid F_t] = \Sigma_{\varepsilon\varepsilon}$.

Assumption 2 provides primitive conditions for central limit theorem approximations for returns and factors. This assumption is sufficient to obtain limiting distributions for the GMM estimators that we focus on in this paper. But to obtain explicit limiting results, we make further assumptions on the data. Assumption 3 is made for that purpose and can further be relaxed at the cost of more cumbersome notation.

⁴ We focus on constant-parameter factor models. Gagliardini, Ossola, and Scaillet (2016) and Kelly, Pruitt, and Su (2019) allow for time-varying risk exposures and time-varying factor risk premia by incorporating information from stock characteristics and macroeconomic variables. Our results can be extended to such a setting, at the cost of additional assumptions. As constant-parameter models are still widely used in empirical applications, we have decided to focus on that setting.

2.1 Factor-Mimicking Portfolios

A large number of studies in the asset pricing literature suggest “macroeconomic” factors that capture systematic risk. Examples include the C-CAPM of Breeden (1979), the I-CAPM of Merton (1973), and the conditional C-CAPM of Lettau and Ludvigson (2001). In order to assess whether macroeconomic risk factors are priced, it has been suggested to refer to alternative formulations of such factor models, replacing the factors by their projections on the linear span of the excess returns. This is commonly referred to as factor-mimicking portfolios, and early references go back to Huberman, Kandel, and Stambaugh (1987) (see also, e.g., Fama, 1998, Lamont, 2001, and Balduzzi and Robotti, 2008). In this paper, we analyze the effect of such formulations on the estimation of risk premiums and we show, in Section 4, that there are efficiency gains from the information in mimicking portfolios when estimating risk premiums.

It is important to understand that the prices of risk of (non-traded) factors generally differ from the risk premiums on their factor-mimicking portfolios. However, using factor-mimicking portfolios leads to identical risk premiums on individual assets. This is shown in Theorem 2.1 below.

To be precise, we project the factors $F_t$ onto the space of excess asset returns, augmented with a constant. In particular, given Assumption 3, there exist a K-vector $\Phi_0$ and a K × N matrix Φ of constants and a K-vector of random variables $u_t$ satisfying

$$F_t = \Phi_0 + \Phi R^e_t + u_t, \tag{2.3}$$

$$E[u_t] = 0_{K\times 1}, \quad \text{and} \quad E[u_t R^{e\prime}_t] = 0_{K\times N}.$$

We then define the factor-mimicking portfolios by

$$F^m_t = \Phi R^e_t. \tag{2.4}$$

Now, we obtain an alternative formulation of the linear factor model by replacing the original factors by their factor-mimicking portfolios:

$$R^e_t = \alpha^m + \beta^m F^m_t + \varepsilon^m_t, \qquad t = 1, 2, \ldots, T. \tag{2.5}$$
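The projection in (2.3)-(2.5) and the claim that mimicking portfolios leave individual risk premiums unchanged can be checked numerically at the population level. The following is a minimal sketch, not the paper's estimation procedure: it assumes a single factor (K = 1), takes $\Sigma_{R^eR^e} = \beta\,\mathrm{Var}[F]\,\beta' + \Sigma_{\varepsilon\varepsilon}$ as implied by Assumption 3, and uses the standard projection coefficient $\Phi = \mathrm{Cov}[F, R^e]\,\Sigma_{R^eR^e}^{-1}$. The function names and all numerical values are hypothetical.

```python
# Sketch only: one non-traded factor (K = 1); all numbers are hypothetical.


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mimicking_premiums(beta, lam, var_f, sigma_eps):
    """Return (beta*lam, beta_m*lam_m) for a one-factor model with known moments."""
    n = len(beta)
    # Covariance of excess returns implied by R = alpha + beta F + eps.
    sigma_rr = [[beta[i] * beta[j] * var_f + sigma_eps[i][j] for j in range(n)]
                for i in range(n)]
    mu = [b * lam for b in beta]                  # E[R^e] = beta * lambda, eq. (2.2)
    cov_rf = [b * var_f for b in beta]            # Cov[R^e, F]
    phi = solve(sigma_rr, cov_rf)                 # Phi' = Sigma_RR^{-1} Cov[R^e, F]
    lam_m = sum(p * m_i for p, m_i in zip(phi, mu))   # lambda^m = E[F^m] = Phi E[R^e]
    cov_r_fm = [sum(sigma_rr[i][j] * phi[j] for j in range(n)) for i in range(n)]
    var_fm = sum(phi[i] * cov_r_fm[i] for i in range(n))  # Var[F^m] = Phi Sigma_RR Phi'
    beta_m = [c / var_fm for c in cov_r_fm]       # loadings on the mimicking portfolio
    return mu, [b * lam_m for b in beta_m]


premiums, premiums_mimicking = mimicking_premiums(
    beta=[0.8, 1.0, 1.3],
    lam=0.5,
    var_f=20.0,
    sigma_eps=[[4.0, 1.0, 0.0], [1.0, 9.0, 2.0], [0.0, 2.0, 16.0]],
)
print(premiums)
print(premiums_mimicking)
```

The two printed vectors should agree up to floating-point error, even though the loadings and the price of risk of the mimicking portfolio each differ from β and λ.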

Recall that, using the projection results, we have

$$\Phi = \Sigma_{FF}\,\beta'\,\Sigma_{R^eR^e}^{-1}, \quad \text{and} \quad \beta^m = \beta\left(\beta'\Sigma_{R^eR^e}^{-1}\beta\right)^{-1}\Sigma_{FF}^{-1}. \tag{2.6}$$

The following theorem recalls that, while factor loadings and prices of risk change when using factor-mimicking portfolios, expected (excess) returns on individual assets, i.e., their product, are not affected. For completeness we provide a proof in the appendix.

Theorem 2.1. Under Assumption 1, we have $\beta\lambda = \beta^m\lambda^m$ with $\lambda^m = E[F^m_t]$.

Note that since the factor-mimicking portfolio is an excess return itself, asset pricing theory implies that the price of risk attached to it, $\lambda^m$, equals its expectation. This additional information can be imposed in the estimation of expected (excess) returns, and one may hope that the expected (excess) return estimators obtained with factor-mimicking portfolios are more precise than the expected (excess) return estimators obtained with the non-traded factors themselves. We study this question in Section 4.

3 Estimation

We concentrate on Hansen (1982)'s GMM estimation technique. The GMM approach is particularly useful in our paper as it avoids the use of two-step estimators and the resulting “errors-in-variables” problem when calculating limiting distributions. In addition, we immediately obtain the joint limiting distribution of estimates for β and λ, which is needed as we are interested in their product.

In the following sections, we study the asymptotics of the expected (excess) return estimators by specifying different sets of moment conditions. In Section 3.1, we study a set of moment conditions which generally holds, i.e., both when factors are traded and when they are non-traded. In Section 3.2, we study the case where all factors are traded. We then incorporate the moment condition that factor prices equal expected factor values.

In Section 3.3, we consider expected (excess) return estimates based on factor-mimicking portfolios.

3.1 Moment Conditions - General Case

We first provide the moment conditions for a general case, i.e., where factors may represent excess returns themselves, but not necessarily. In that case, the standard moment conditions to estimate both factor loadings β and factor prices λ are

$$E[h_t(\alpha,\beta,\lambda)] = E\begin{bmatrix} \begin{bmatrix} 1 \\ F_t \end{bmatrix} \otimes \left[R^e_t - \alpha - \beta F_t\right] \\[4pt] R^e_t - \beta\lambda \end{bmatrix} = 0. \tag{3.1}$$

The first set of moment conditions identifies α and β as regression coefficients, while the last set of conditions represents the pricing restrictions. Note that there are N(1+K+1) moment conditions but only N(1+K)+K parameters, which implies that the system is overidentified. We set a linear combination of the given moment conditions to zero, that is, we set $A\,E[h_t(\alpha,\beta,\lambda)] = 0$ with

$$A = \begin{bmatrix} I_{N(1+K)} & 0_{N(1+K)\times N} \\ 0_{K\times N(1+K)} & \Theta_{K\times N} \end{bmatrix}.$$

Note that the matrix A specified above combines the last N moment conditions into K moment conditions so that the system becomes exactly identified. We take $\Theta = \beta'\Sigma_{\varepsilon\varepsilon}^{-1}$. The advantage of this particular choice is that the resulting λ estimates coincide with the GLS cross-sectional estimates.
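To make the counting argument and the role of Θ concrete, the sketch below counts moment conditions and parameters in (3.1) and solves the K combined pricing conditions $\Theta(\bar R^e - \beta\lambda) = 0$ with $\Theta = \beta'\Sigma_{\varepsilon\varepsilon}^{-1}$ for λ. It is an illustration under simplifying assumptions that are not in the text: one factor, a diagonal $\Sigma_{\varepsilon\varepsilon}$, and β and $\Sigma_{\varepsilon\varepsilon}$ treated as known rather than estimated jointly. Function names and inputs are hypothetical.

```python
# Sketch only: K = 1, diagonal Sigma_eps, beta and Sigma_eps treated as known.


def moment_counts(n_assets, n_factors):
    """Number of moment conditions in (3.1) and number of parameters (alpha, beta, lambda)."""
    n_moments = n_assets * (1 + n_factors + 1)
    n_params = n_assets * (1 + n_factors) + n_factors
    return n_moments, n_params


def gls_lambda_one_factor(mean_ret, beta, idio_var):
    """Solve Theta (mean_ret - beta * lam) = 0 for lam, with Theta = beta' Sigma_eps^{-1}.

    mean_ret : average excess returns of the N test assets
    beta     : factor loadings of the N test assets
    idio_var : diagonal of Sigma_eps (idiosyncratic variances)
    """
    num = sum(b * r / v for b, r, v in zip(beta, mean_ret, idio_var))
    den = sum(b * b / v for b, v in zip(beta, idio_var))
    return num / den  # (beta' S^{-1} beta)^{-1} beta' S^{-1} mean_ret


n_moments, n_params = moment_counts(n_assets=25, n_factors=3)
print(n_moments, n_params, n_moments - n_params)
print(gls_lambda_one_factor([0.45, 0.50, 0.60], [0.8, 1.0, 1.3], [4.0, 9.0, 16.0]))
```

The weighting by inverse idiosyncratic variance is what makes the solution the GLS cross-sectional coefficient; with $\Theta = \beta'$ instead, the same algebra would give the OLS cross-sectional coefficient.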

3.2 Moment Conditions - Traded Factor Case

Asset pricing theory provides an additional restriction on the prices of risk when factors are traded, meaning that they are excess returns themselves. If a factor is an excess return, we have $\lambda = E[F_t]$. For example, the price of market risk is equal to the expected excess market return, and the prices of size and book-to-market risks, as captured by Fama-French's SMB and HML portfolio movements, are equal to the expected SMB and HML excess returns. Note that we use the term “excess return” for any difference of gross returns, that is, not only in excess of the risk-free rate. Prices of excess returns are zero, i.e., excess returns are zero-investment portfolios.

The standard two-pass estimation procedure commonly found in the finance literature may not give reliable estimates of risk prices when factors are traded. Hou and Kimmel (2006) provide an interesting example to point out this issue. They generate standard two-pass expected (excess) return estimates (both OLS and GLS) in the three-factor Fama-French model by using 25 size and book-to-market portfolios as test assets. As shown in their Table 1, both OLS and GLS risk price estimates of the market are significantly different from the sample average of the excess market return. It is important to point out that the two-pass procedure ignores the fact that the Fama-French factors are traded factors and treats them in the same way as non-traded factors.

Consequently, when factors are traded we usually replace the second set of moment conditions with the condition that the expectation of the vector of factors equals λ. Then, the relevant moment conditions are given by

$$E[h_t(\alpha,\beta,\lambda)] = E\begin{bmatrix} \begin{bmatrix} 1 \\ F_t \end{bmatrix} \otimes \left[R^e_t - \alpha - \beta F_t\right] \\[4pt] F^e_t - \lambda \end{bmatrix} = 0, \tag{3.2}$$

where $F_t$ is the K × 1 vector of factor (excess) returns.

In this case, estimates are obtained from an exactly identified system, i.e., the number of parameters equals the number of moment conditions. Note that if the factor is traded, but we do not add the moment condition that the factor averages equal λ, then the results are just those of the non-traded case in Section 3.1.

Alternatively, we could incorporate the theoretical restriction on factor prices into the estimation by adding the factor portfolios as test assets in the linear pricing equation, $R^e - \beta\lambda$. This set of moment conditions would be similar to the general case, with the only difference being that the linear pricing restriction incorporates the factors as test assets in addition to the original set of test assets, i.e., we define $R^F_t = \begin{bmatrix} R^e_t \\ F_t \end{bmatrix}$. Under this setting, the moment conditions would be given by

$$E[h_t(\alpha,\beta,\lambda)] = E\begin{bmatrix} \begin{bmatrix} 1 \\ F_t \end{bmatrix} \otimes \left[R^e_t - \alpha - \beta F_t\right] \\[4pt] R^F_t - \beta_{F,R}\,\lambda \end{bmatrix} = 0, \tag{3.3}$$

with $\beta_{F,R} = \begin{bmatrix} \beta \\ I_K \end{bmatrix}$. Following the same procedure as in Section 3.1, we specify the A matrix and set $\Theta = \beta_{F,R}'\,\Sigma_{R^FR^F}^{-1}$. Because we find that GMM based on (3.3) leads to the same asymptotic variance-covariance matrices for risk premiums as GMM based on (3.2), we omit the conditions (3.3) in the rest of the paper.
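Because the system (3.2) is exactly identified, its sample analogue can be solved in closed form: the first block gives the time-series OLS estimates of α and β, and the last block sets λ equal to the sample mean of the factor. The sketch below illustrates this for one asset and one traded factor; the function name and the short data series are hypothetical and serve only to show the mechanics.

```python
# Sketch only: one test asset, one traded factor; the data are made up.
from statistics import fmean


def traded_factor_premium(returns, factor):
    """Solve the sample analogue of (3.2) for one asset and one traded factor.

    returns, factor : equal-length lists of excess returns.
    Returns (alpha_hat, beta_hat, lambda_hat, beta_hat * lambda_hat).
    """
    r_bar, f_bar = fmean(returns), fmean(factor)
    s_ff = sum((f - f_bar) ** 2 for f in factor)
    s_rf = sum((r - r_bar) * (f - f_bar) for r, f in zip(returns, factor))
    beta_hat = s_rf / s_ff                 # OLS slope from the first block of (3.2)
    alpha_hat = r_bar - beta_hat * f_bar   # OLS intercept
    lambda_hat = f_bar                     # last block of (3.2): lambda = mean factor return
    return alpha_hat, beta_hat, lambda_hat, beta_hat * lambda_hat


r = [1.0, -0.5, 2.0, 0.3, -1.1, 0.9]
f = [0.8, -0.2, 1.5, 0.1, -0.9, 0.6]
alpha_hat, beta_hat, lambda_hat, premium = traded_factor_premium(r, f)
print(premium, fmean(r), alpha_hat)
```

By construction the historical average equals `alpha_hat + premium`, so in this setting the factor-model-based estimate differs from the historical average exactly by the estimated intercept.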

3.3 Moment Conditions - Factor-Mimicking Portfolios

Following Balduzzi and Robotti (2008), we also consider the case where risk prices are equal to expected returns of factor-mimicking portfolios. Then, the moment conditions used are

$$E[h_t(\alpha^m,\beta^m,\Phi_0,\Phi,\lambda^m)] = E\begin{bmatrix} \begin{bmatrix} 1 \\ R^e_t \end{bmatrix} \otimes \left[F_t - \Phi_0 - \Phi R^e_t\right] \\[4pt] \begin{bmatrix} 1 \\ F^m_t \end{bmatrix} \otimes \left[R^e_t - \alpha^m - \beta^m F^m_t\right] \\[4pt] \Phi R^e_t - \lambda^m \end{bmatrix} = 0, \tag{3.4}$$

with $F^m_t = \Phi R^e_t$. In this case, there are K(1+N)+N(1+K)+K moment conditions and parameters, which makes the system exactly identified.

4 Precision of Risk-Premium Estimators

As mentioned in the introduction, our focus is on estimating risk premiums of individual assets or portfolios. Much of the literature on multi-factor asset pricing models has primarily focused on the issue of a factor being priced or not. Formally, this is a test on (a component of) λ being zero or not and, accordingly, the properties of risk-price estimates for λ have been studied and compared.⁵

In this paper, since our focus is on analyzing the possible efficiency gains based on linear factor models in estimating expected (excess) returns, we first derive the joint distribution of estimates for β and λ for the three cases discussed in Sections 3.1 to 3.3.⁶ Then, we derive the asymptotic distributions of the implied expected (excess) return estimators given by the product $\hat\beta\hat\lambda$. Moreover, we illustrate the empirical relevance of our asymptotic results using the Fama-French three-factor model with 25 Fama-French size and book-to-market portfolios as test assets. In particular, we provide the (asymptotic) variances of the various risk-premium estimators with empirically reasonable parameter values and evaluate the benefits of using linear factor models in estimating risk premiums; see Table 1 below.

⁵ Examples include Shanken (1992), Jagannathan and Wang (1998), Kleibergen (2009), Lewellen, Nagel, and Shanken (2010), Kan and Robotti (2011), and Kan, Robotti, and Shanken (2013).

⁶ We also provide results for inference on risk premiums when the model is misspecified in the appendix.

The asset data used in this paper consist of 25 portfolios formed by Fama-French (1992, 1993), downloaded from Kenneth French's website. The factors are the three factors of Fama and French (1992) (market, book-to-market, and size). Our analysis is based on monthly data from January 1963 until August 2020, i.e., we have 692 time-series observations.

The following theorem provides the limiting distribution of the historical averages estimator. It is classical and provided for comparison with the three GMM-based estimators in Sections 3.1-3.3.

Theorem 4.1. Suppose that Assumptions 1 and 2 hold and that $R^e_t - E[R^e]$ forms a martingale difference sequence.⁷ Then, we have

$$\sqrt{T}\left(\bar R^e - E[R^e]\right) \xrightarrow{d} N\left(0, \Sigma_{R^eR^e}\right).$$

⁷ We throughout assume that standard assumptions for martingale CLTs hold.

Note that Theorem 4.1 assumes no factor structure. We will, next, provide the asymptotic distributions of expected (excess) return estimators given the linear factor structure implied by asset pricing models. Note that the joint distributions of λ and β are different for each set of moment conditions, which leads to different asymptotic distributions for the risk premiums βλ as well. Hence, we derive the asymptotic distributions of expected (excess) return estimators for the three sets of moment conditions introduced in Sections 3.1, 3.2, and 3.3 separately.

4.1 Precision with General Moment Conditions

The following theorem provides the asymptotic variances of the risk-premium estimators based on the general moment conditions in Section 3.1. This result is valid for both traded and non-traded factors.

Theorem 4.2. Suppose that Assumptions 1-3 hold and that $h_t(\alpha,\beta,\lambda)$ forms a martingale difference sequence. Consider the moment conditions (3.1). Then, the limiting variance of the expected (excess) return estimator $\hat\beta\hat\lambda$ is given by

$$\Sigma_{R^eR^e} - \left(1 - \lambda'\Sigma_{FF}^{-1}\lambda\right)\left(\Sigma_{\varepsilon\varepsilon} - \beta\left(\beta'\Sigma_{\varepsilon\varepsilon}^{-1}\beta\right)^{-1}\beta'\right). \tag{4.1}$$

The proof is provided in the appendix. Theorem 4.2 provides the asymptotic covariance matrix of the factor-model-based risk-premium estimators with the general moment conditions as in Section 3.1. This formula is useful mainly for two reasons. First, it can be used to compute the standard errors of these risk-premium estimates and, accordingly, the related t-statistics and p-values can be obtained. Second, it allows us to study the precision gains for estimating the risk premiums from incorporating information about the factor model.

In case of a one-factor model and a single test asset, the (asymptotic) variances of both the naive risk-premium estimator and the factor-model-based risk-premium estimator are the same. When more assets/portfolios are available (N > 1), observe that the magnitude of the asymptotic variances of the risk-premium estimators depends on the prices of risk λ, the exposures β, and the idiosyncratic variances $\Sigma_{\varepsilon\varepsilon}$. Note that the difference between the asymptotic covariance matrix of the naive estimator, $\bar R^e$, and the factor-based risk-premium estimator is $\left(1 - \lambda'\Sigma_{FF}^{-1}\lambda\right)\left(\Sigma_{\varepsilon\varepsilon} - \beta\left(\beta'\Sigma_{\varepsilon\varepsilon}^{-1}\beta\right)^{-1}\beta'\right)$. The following corollary formalizes this relation.

Corollary 4.1. Suppose that Assumptions 1-3 hold and that $h_t(\alpha,\beta,\lambda)$ forms a martingale difference sequence. If $\lambda'\Sigma_{FF}^{-1}\lambda < 1$, then the limiting variance of the expected (excess) return estimator $\hat\beta\hat\lambda$ is at most $\Sigma_{R^eR^e}$.

Corollary 4.1 shows that there may be precision gains for estimating risk premiums from the added information about the factor model if $\lambda'\Sigma_{FF}^{-1}\lambda$ is smaller than one. Observe that, although $\lambda'\Sigma_{FF}^{-1}\lambda$ can be larger than one mathematically, in the one-factor case with a traded factor, $\lambda'\Sigma_{FF}^{-1}\lambda$ is the squared Sharpe ratio of that factor. This squared Sharpe ratio is, for stocks and stock portfolios, generally (much) smaller than 1. Moreover, plugging in the estimates from the Fama-French three-factor model (based on GMM with moment conditions (3.1)) gives $\lambda'\Sigma_{FF}^{-1}\lambda = 0.034$. Note that the smaller the value for $\lambda'\Sigma_{FF}^{-1}\lambda$, the larger the efficiency gains from imposing a factor model. It may be surprising that, theoretically, the GMM estimator can behave worse than historical averages. Note, however, that we are, in line with the literature, not using optimal GMM weights.

We calculate the (asymptotic) variances of the factor-model-based risk-premium estimates for all 25 FF portfolios by plugging the parameter estimates into (4.1). Table 1 presents the results. Comparing the asymptotic variances of the factor-model-based risk-premium estimators to those of the naive estimators, we see that the factor-model-based risk-premium estimators are more precise than the naive estimators for all 25 Fama-French portfolios. In particular, using the three-factor model in estimating risk premiums of 25 FF portfolios leads to considerable gains in variances of up to 23%. Note that a 23% gain in variances means that the same (statistical) precision can be obtained with 23% fewer observations.

4.2 Precision with Moment Conditions for Traded Factors

When the risk factors are traded, meaning that the factor itself is an excess return, additional restrictions on the prices of risk can be incorporated into the estimation. With the availability of such information, one could again expect efficiency gains in estimating both the prices of risk and the expected (excess) returns.
In this section, we consider this case; Theorem 4.3 gives the asymptotic variances of the expected (excess) return estimators with the moment conditions (3.2) when all factors are traded.
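Before turning to the traded-factor result, the precision gain implied by expression (4.1) of Section 4.1 can be illustrated numerically. The sketch below evaluates, asset by asset, the share of the historical-average variance that (4.1) removes. It is an illustration under assumptions not made in the text: one factor, a diagonal $\Sigma_{\varepsilon\varepsilon}$, and $\Sigma_{R^eR^e} = \beta\Sigma_{FF}\beta' + \Sigma_{\varepsilon\varepsilon}$ as implied by Assumption 3. The function name and the parameter values are hypothetical and are not the estimates behind Table 1.

```python
# Sketch only: K = 1 and diagonal Sigma_eps; parameter values are hypothetical.


def variance_gain_general(beta, lam, var_f, idio_var):
    """Relative reduction in asymptotic variance from (4.1) versus historical averages.

    Returns one number per asset: (naive variance - variance from (4.1)) / naive variance,
    using only the diagonal elements of the two covariance matrices.
    """
    s = lam * lam / var_f                                # lambda' Sigma_FF^{-1} lambda
    q = sum(b * b / v for b, v in zip(beta, idio_var))   # beta' Sigma_eps^{-1} beta
    gains = []
    for b, v in zip(beta, idio_var):
        naive = b * b * var_f + v                        # diagonal of Sigma_ReRe
        reduction = (1.0 - s) * (v - b * b / q)          # diagonal of the term subtracted in (4.1)
        gains.append(reduction / naive)
    return gains


# With a single test asset the subtracted term vanishes, so there is no gain.
print(variance_gain_general([1.0], 0.6, 20.0, [9.0]))
# With several test assets the cross-sectional restriction reduces the variance.
print(variance_gain_general([0.8, 1.0, 1.3], 0.6, 20.0, [4.0, 9.0, 16.0]))
```

The first call reproduces the single-asset case discussed in Section 4.1, where the two estimators have the same asymptotic variance; the second shows that the gain grows with the share of idiosyncratic variance in total variance.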

Theorem 4.3. Suppose that all factors are excess returns. In addition, suppose that Assumptions 1-3 hold and that $h_t(\alpha,\beta,\lambda)$ forms a martingale difference sequence. Consider the moment conditions (3.2). Then, the limiting variance of the expected (excess) return estimator $\hat\beta\hat\lambda$ is given by

$$\Sigma_{R^eR^e} - \left(1 - \lambda'\Sigma_{FF}^{-1}\lambda\right)\Sigma_{\varepsilon\varepsilon}. \tag{4.2}$$

Theorem 4.3 allows us to study the efficiency gains for estimating risk premiums from a model where the factors are traded compared to historical averages. Comparing the asymptotic covariance matrix of the factor-based risk-premium estimators from GMM based on (3.2) to that of the naive estimator, we observe that the difference is given by $\left(1 - \lambda'\Sigma_{FF}^{-1}\lambda\right)\Sigma_{\varepsilon\varepsilon}$. As before, the factor $1 - \lambda'\Sigma_{FF}^{-1}\lambda$ is empirically generally found to be positive, implying an efficiency gain. Moreover, observe that the asymptotic covariance matrix of the risk-premium estimator based on GMM with (3.2) can differ from that of the risk-premium estimator based on GMM with (3.1). This indicates that there may be efficiency gains even within the GMM framework from the information that the factors are traded. The following corollary formalizes these issues.

Corollary 4.2. Suppose that all factors are traded. In addition, suppose that Assumptions 1-3 hold and that $h_t(\alpha,\beta,\lambda)$ forms a martingale difference sequence. Consider the GMM estimator based on the moment conditions (3.2). Then, we have the following.
1. If $\lambda'\Sigma_{FF}^{-1}\lambda < 1$, then the limiting variance of the expected (excess) return estimator $\hat\beta\hat\lambda$ is at most $\Sigma_{R^eR^e}$.
2. The limiting variance of this expected (excess) return estimator is at most the limiting variance of the estimator based on the moment conditions (3.1).

Plugging in the parameter estimates from the analysis of the Fama-French model gives $\lambda'\Sigma_{FF}^{-1}\lambda = 0.033$. Comparing the variances of the risk-premium estimates based on GMM with (3.2) to those of the naive estimators in Table 1, we see that the risk-premium estimates based on GMM with (3.2) have smaller asymptotic variances than the naive estimators. In particular, the magnitude of efficiency gains goes up to 24%. Moreover, consistent with Corollary 4.2, the asymptotic variances of the risk-premium estimates based on GMM with (3.1) exceed those of the risk-premium estimators based on GMM with (3.2).

Overall, these precision gains from estimating risk premiums based on factor models stem from two sources. First, the linear relation implied by asset pricing models is valuable information in the estimation of risk premiums. Second, when the factors are traded, the additional information that the prices of risk equal the expected factor returns leads to higher precision of risk-premium estimates.

4.3 Precision with Moment Conditions Using Factor-Mimicking Portfolios

One may hope that replacing factors by factor-mimicking portfolios may also bring efficiency gains compared to (4.1), since the additional restriction on the price of the factor risk can be incorporated into the estimation. In this section, we derive the asymptotic variances of expected (excess) return estimators obtained with factor-mimicking portfolios.

Theorem 4.4. Suppose that Assumptions 1-3 hold and that $h_t(\alpha^m,\beta^m,\Phi_0,\Phi,\lambda^m)$ forms a martingale difference sequence. Consider the GMM estimator based on the moment conditions (3.4). Then, the limiting variance of the expected (excess) return estimator, $\hat\beta^m\hat\lambda^m$, is given by

$$\begin{aligned}
\Sigma_{R^eR^e} &- \Bigl(\mu_{R^e}'\Bigl\{\Sigma_{R^eR^e}^{-1} - \Sigma_{R^eR^e}^{-1}\beta\left(\beta'\Sigma_{R^eR^e}^{-1}\beta\right)^{-1}\beta'\Sigma_{R^eR^e}^{-1}\Bigr\}\mu_{R^e}\Bigr) \\
&\quad\times \Bigl(\Sigma_{R^eR^e} - \beta\left(\beta'\Sigma_{R^eR^e}^{-1}\beta\right)^{-1}\Sigma_{FF}^{-1}\left(\beta'\Sigma_{R^eR^e}^{-1}\beta\right)^{-1}\beta'\Bigr) \\
&- \left(1 - \mu_{R^e}'\Sigma_{R^eR^e}^{-1}\mu_{R^e}\right)\Bigl(\Sigma_{R^eR^e} - \beta\left(\beta'\Sigma_{R^eR^e}^{-1}\beta\right)^{-1}\beta'\Bigr),
\end{aligned} \tag{4.3}$$

with µ = E[Re]. Re t Theorem 4.4 enables us to study the efficiency gains in risk premiums using factor– mimicking portfolios. Observe that the difference between the asymptotic covariance matrices of the naive estimator and the factor–model based GMM risk–premium estimator with (3.4) is given by (cid:110) (cid:111) µ(cid:48) Σ−1 −Σ−1 β (cid:0) β(cid:48)Σ−1 β (cid:1)−1 β(cid:48)Σ−1 µ (4.4) Re ReRe ReRe ReRe ReRe Re (cid:16) (cid:17) × Σ −β (cid:0) β(cid:48)Σ−1 β (cid:1)−1 Σ−1 (cid:0) βΣ−1 β (cid:1)−1 β(cid:48) ReRe ReRe FF ReRe (cid:16) (cid:17) + (cid:0) 1−µ(cid:48) Σ−1 µ (cid:1) Σ −β (cid:0) β(cid:48)Σ−1 β (cid:1)−1 β(cid:48) . Re ReRe Re ReRe ReRe Efficiency gains with respect to the historical averages estimator are dependent on (4.4) beingpositivesemi–definiteornot. TheresultsfromourempiricalanalysiswithFF-3factor model illustrates that there are considerable efficiency gains over the naive estimation for all 25 Fama–French 25 portfolios (see Table 1). In particular, estimating risk premiums with GMM (3.4) leads to, of up to 22%, smaller variances than estimating them with the naive estimator. Moreover, we find that estimating risk premiums by making use of the mimickingportfoliosleadstosmallefficiencylossesovertheestimationbasedonthegeneral case, i.e, GMM based on (3.1) for all assets. Note that one important difference between Theorem 4.2 and Theorem 4.4 may potentially come from the estimation of the mimicking portfolio weights. The estimation of the weights of the factor–mimicking portfolio potentially leads to different (intuitively higher) asymptotic variances for the betas of the mimicking factors as well as for the mimicking factor prices of risk, and the risk premiums, which are essentially a multiplication of βm and λm. Such issue is similar to errors–in–variables type of corrections in two step Fama–Macbeth estimation, i.e., the Shanken (1992) correction in asymptotic variances for generated regressors. 
We recall here that GMM standard errors automatically account for such effects as the system of moment conditions is solved simultaneously. In particular, in our setting with moment conditions (3.4), GMM treats the moments producing $\Phi$ simultaneously with the moments generating $\beta^m$ and $\lambda^m$. Hence, the long–run covariance matrix captures the effects of the estimation of $\Phi$ on the standard errors of $\beta^m$ and $\lambda^m$, and hence of the risk premiums.

If we consider the Fama–French three factor model with the 25 FF–portfolios, we can also intuitively gain insights about the difference between the inferences about risk premiums based on GMM with the two sets of moment conditions (3.2) and (3.4). In fact, since the factors are traded, meaning that they are excess returns themselves, we can estimate the risk premiums via the second set of moment conditions (3.2). Moreover, we can also estimate such a system via the third set of moment conditions (3.4), which has the additional burden of estimating the coefficients for the construction of the mimicking portfolio. Accordingly, GMM estimation via the second set and the third set of moment conditions may lead to different precisions for the risk–premium estimates. The last column in Table 1 documents the efficiency comparisons in estimating risk premiums of 25 FF portfolios employing factor mimicking portfolios over risk premium estimation with moment conditions (3.2). Efficiency losses are present for all 25 Fama–French portfolios, meaning that risk–premium estimates employing factor mimicking portfolios, i.e., based on (3.4), are less precise than risk–premium estimates based on (3.2).

5 Inference about Risk Premiums when the β’s are small

Several studies document inference issues regarding the prices of risk when the factors are weakly correlated with the asset returns (see, e.g., Kleibergen (2009), Gospodinov et al. (2014), Bryzgalova (2015), Burnside (2015), Kleibergen and Zhan (2015), and Gospodinov

et al. (2017, 2019)). When β’s are close to zero and/or when the β matrix is almost of reduced rank, the confidence intervals of the prices of risk estimates, $\hat{\lambda}$, are erroneous, which leads to unreliable statistical inference. The effects may be severe in empirical research, as the confidence intervals of the risk price estimates may even be unbounded, as documented in Kleibergen (2009) for the case of the conditional consumption CAPM of Lettau and Ludvigson (2001).8 Accordingly, Kleibergen (2009) provides identification-robust statistics and confidence sets for the risk price estimates when the β’s are small.

However, once the interest is in estimating risk premiums on individual assets or portfolios, a natural question is whether similar issues exist in the presence of weakly correlated factors. Recall that the focus of this study is on risk premiums, the product of β and λ, rather than on λ alone. In the rest of this section, we will focus on the specification where β has small but non-zero values. Following the literature on weak instruments (see, e.g., Staiger and Stock (1997)), and Kleibergen (2009), we consider a sequence of β’s getting smaller as the sample size increases.9

Corollary 5.1. Suppose Assumption 2 and Assumption 3 hold and consider the small β case where $\beta = \frac{1}{\sqrt{T}}B$ for a fixed full rank $N \times K$ matrix $B$. Then, the behavior of the risk premium estimators in large samples can be characterized as follows.

1. Consider the moment conditions (3.1), then

$$
\begin{aligned}
\left(\hat{\beta}\hat{\lambda} - \beta\lambda\right) \xrightarrow{d}\; & \left(B + Y\Sigma_{FF}^{-1}\right)\left[\left(B + Y\Sigma_{FF}^{-1}\right)'\Sigma_{\varepsilon\varepsilon}^{-1}\left(B + Y\Sigma_{FF}^{-1}\right)\right]^{-1} \\
& \times \left(B + Y\Sigma_{FF}^{-1}\right)'\Sigma_{\varepsilon\varepsilon}^{-1}\left[B\lambda + U\right] - \beta\lambda,
\end{aligned} \tag{5.5}
$$

where $U \sim N(0,\Sigma_{\varepsilon\varepsilon})$ and $\mathrm{vec}(Y) \sim N(0,\Sigma_{FF} \otimes \Sigma_{\varepsilon\varepsilon})$ are independently distributed.10

2. Consider moment conditions (3.2), then

$$\left(\hat{\beta}\hat{\lambda} - \beta\lambda\right) \xrightarrow{d} N\left(0, \lambda'\Sigma_{FF}^{-1}\lambda\,\Sigma_{\varepsilon\varepsilon}\right). \tag{5.6}$$

The proof is provided in the appendix.

Corollary 5.1 documents several important findings of our analysis regarding the issue of small but non-zero factor loadings. First, if the parameters of the linear factor model are estimated with GMM based on the moment conditions (3.1), the limiting behaviour of the risk premium estimator is non-standard and differs from normality. This result is in line with the literature documenting unreliable statistical inference about the prices of risk based on the Fama–MacBeth and GLS two-pass estimation and their unbounded confidence sets. Part 2 of Corollary 5.1 sheds light on the issue of small β’s when all factors in the linear factor model of interest are traded. In this case, if one estimates the parameters of the model with GMM based on (3.2), then the limiting variances of the risk–premium estimators are not affected by whether β has a small value or not. Observe that the limiting variance in part 2 of Corollary 5.1 is equal to (4.2) under β = 0.

8 Kleibergen (2009) documents that 95 percent confidence bounds of the prices of risk on the scaled consumption growth coincide with the whole real line.
9 The results in this section have been revised in the current version of the paper.
10 The limiting variance of the risk price estimator in the presence of weak factors is unbounded for Fama and MacBeth (1973) and generalized least squares (GLS) regression methods (see, e.g., Kleibergen (2009) and Kleibergen and Zhan (2015)).

6 Monte Carlo Simulations

In this section, we conduct a Monte Carlo experiment to study the finite-sample properties of the various factor–model–based estimators of expected returns. We consider correctly specified models, and models with spurious and weak factors. We calibrate the parameters of the true models by using the monthly returns on 25 Fama–French portfolios from January 1963 until August 2020 (available on Kenneth French’s website). We use the nominal 1–month Treasury bill rate as a proxy for the risk–free rate. In our simulations we consider three different time-series sample sizes: T = 240,

T = 480, and T = 960. These choices cover the sample sizes typically encountered in empirical research with quarterly and monthly data. Varying T is useful for understanding the small-sample behaviour of the estimators and we use T = 960 to assess the quality of our asymptotic approximations. All results are based on 20,000 replications.

6.1 Under the null that asset pricing restrictions hold

In this subsection, we assume that the asset pricing restrictions are true, i.e., Eqn. (2.2) holds for returns of N = 25 test assets. We present results for a one-factor model setting.11 In this case, the model is calibrated to mimic the CAPM, estimated over the sample period from January 1963 until August 2020. All factors and the error terms are generated from a multivariate normal distribution.12

Figure 1 reports the simulation results for the standard errors of risk premium estimates on 25 test assets. Panel (a), Panel (b), and Panel (c) show the results for T = 240, T = 480, and T = 960, respectively. The top exhibits provide the root–mean–square errors (RMSE) for risk premium estimators across 25 test assets (x-axis values). The middle exhibits illustrate the simulation averages of the estimated asymptotic standard errors (AEST). The lower exhibits present the percentage errors (PE) of AESTs as compared to the RMSEs.

There are several important observations to be made about the standard errors. First, AESTs for all 25 assets are very close to their corresponding RMSEs for all three factor–model based estimators of expected returns (GMM–Gen, GMM–Tr, GMM–Mim). In particular, absolute percentage errors across all assets are less than 0.5% for T = 960, and they are less than 1% for T = 480 and T = 240. Second, the RMSEs of all three factor–model based estimators of expected returns (GMM–Gen, GMM–Tr, GMM–Mim) are smaller than the RMSEs of the historical averages estimator, which holds true for all 25 individual assets and for all sample sizes T = 240, T = 480, and T = 960. We observe a similar pattern for the estimated asymptotic standard errors. Those patterns are manifestations of the efficiency gains from using the factor–model based risk premium estimates.

These simulation results show that the limiting variances for factor–model based estimators of risk premiums that we provide in our paper provide accurate approximations to the finite sample behavior of standard errors of the estimates. Furthermore, comparing the RMSEs (and AESTs) of factor–model–based estimators of risk premiums to RMSEs (and AESTs) of the historical averages, we observe that the efficiency gains from using the factor–model based estimators of risk premiums are present for all individual assets.

11 Results for the three-factor model are available upon request.
12 In the three-factor model setting, the parameters of the model are calibrated to mimic the Fama–French three-factor model.

6.2 Presence of weak factors

In this section, we consider the case of a weak factor. The parameters are calibrated by using the monthly data from January 1963 until August 2020 on 25 Fama–French (1992) portfolios sorted by size and book–to–market and the corresponding market factor, $R^e_{m,t}$. The data generating process is given by

$$R^e_t = \alpha + \frac{B}{\sqrt{T}}R^e_{m,t} + \varepsilon_t, \qquad t = 1,2,\ldots,T, \tag{6.7}$$

with normally distributed $R^e_{m,t}$ and multivariate normal $\varepsilon_t$ under the null of $E[R^e_t] = \frac{B}{\sqrt{T}}\lambda$. We use 20,000 replications for each estimator considered.

Figure 2 presents the simulation results for the standard errors of the risk premium estimates in the presence of a weak factor. Panel (a), Panel (b), and Panel (c) show the results for T = 240, T = 480, and T = 960, respectively. For all three factor–model based risk premium estimators, the estimated asymptotic standard errors are close to their corresponding RMSEs, with absolute percentage errors being lower than 3% for all assets for T = 240, 480, 960. Moreover, comparing the RMSEs (and the standard errors of the historical averages) to the RMSEs (and the standard errors of the factor–model–based

risk–premium estimators), we observe that there are substantial efficiency gains from using the factor–model–based risk–premium estimates. These efficiency gains are present for all individual assets.

To sum up, if one is interested in making econometric inference about the prices of risk λ, small but non-zero β’s may cause spurious inference. However, once the interest is in estimating risk premiums, i.e., expected (excess) returns on individual assets or portfolios, the limiting variances based on GMM with (3.1), (3.2) or (3.4) provide good approximations to the finite–sample variances of the factor model based estimators when the β’s are weak.

7 Portfolio Choice with Parameter Uncertainty

In the previous sections, we provided an asymptotic analysis of the three factor–model based risk–premium estimators and analyzed the efficiency gains with respect to naive historical averages. In this section, we analyze the economic significance of these gains in a portfolio allocation problem à la Markowitz (1952).

The implementation of the mean–variance framework requires the estimation of the first two moments of the asset returns. Although, in this setting, the optimal portfolios are supposed to achieve the best performance, in practice, the estimation error in the estimated moments leads to a large deterioration of the out–of–sample performance of the optimal portfolios (see, e.g., DeMiguel et al. (2009)). In this section, we analyze the out–of–sample performances of the optimal portfolios based on factor–based risk–premium estimates as well as the historical averages, the 1/N portfolio and the global minimum variance portfolio in a simulation analysis.

We consider the following well known optimization problem: Suppose a risk–free asset exists and $w$ is the vector of relative portfolio allocations of wealth to N risky assets. The investor has preferences that are characterized by the expected return and variance of her selected portfolio. The investor maximizes her expected utility by choosing the vector of portfolio weights $w$ such that

$$E[U] = w'\mu^e - \frac{\gamma}{2}w'\Sigma_{RR}w \tag{7.8}$$

is maximized, where $\gamma$ measures the investor’s risk aversion level and $\mu^e$ and $\Sigma_{RR}$ denote the expected excess returns on the assets and the covariance matrix of returns. The solution to this maximization problem is well known and given by

$$w_{opt} = \frac{1}{\gamma}\Sigma_{RR}^{-1}\mu^e. \tag{7.9}$$

In the optimization problem above, since the true risk–premium vector, $\mu^e$, and the true covariance matrix of asset returns, $\Sigma_{RR}$, are unknown, in empirical work, one needs to estimate them. We consider four portfolios constructed with different risk–premium estimators: the optimal portfolio constructed with historical averages, and the optimal portfolios constructed with the three factor model–based GMM risk–premium estimates with moment conditions (3.1), (3.2) and (3.4), respectively. Note that the covariance matrix is estimated using the traditional sample counterpart.13

We also consider the global minimum variance (GMV hereafter) portfolio to which we compare the performance of the portfolios based on the risk–premium estimates. Note that the implementation of this portfolio only requires estimation of the covariance matrix, for which we again use the sample counterpart, and completely ignores the estimation of expected returns.14 Moreover, we analyze the performance of the 1/N portfolio. We compare performances of the portfolios by using their out-of-sample Sharpe ratios.15

13 $\frac{1}{T-1}\sum_{t=1}^{T}(R_t - \bar{R}_t)(R_t - \bar{R}_t)'$, where $\bar{R}_t$ is the sample average of returns.
14 This portfolio is obtained by minimizing the portfolio variance with respect to the weights with the only constraint that weights sum up to 1, and the N–vector of portfolio weights is given by $w_{gmv} = \dfrac{\Sigma_{RR}^{-1}\iota_N}{\iota_N'\Sigma_{RR}^{-1}\iota_N}$.
15 See Peñaranda and Sentana (2011) for an analysis examining the improvements in the estimation of in–sample mean variance frontiers based on asset pricing model restrictions, tangency or spanning constraints.
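The closed-form weights above lend themselves to a short numerical sketch. The Python snippet below is illustrative only: the function names (`mean_variance_weights`, `gmv_weights`, `sharpe_ratio`) and the three-asset inputs are hypothetical and are not taken from the paper's simulation design. It evaluates the mean–variance solution (7.9), the GMV weights (the inverse covariance matrix applied to a vector of ones, rescaled to sum to one), the 1/N weights, and each portfolio's Sharpe ratio under the moments supplied.

```python
import numpy as np

def mean_variance_weights(mu_e, sigma, gamma):
    """Mean-variance weights w = (1/gamma) * Sigma^{-1} mu_e, as in (7.9)."""
    return np.linalg.solve(sigma, mu_e) / gamma

def gmv_weights(sigma):
    """Global minimum variance weights: Sigma^{-1} iota / (iota' Sigma^{-1} iota)."""
    iota = np.ones(sigma.shape[0])
    x = np.linalg.solve(sigma, iota)
    return x / x.sum()

def sharpe_ratio(w, mu_e, sigma):
    """Sharpe ratio of weights w given expected excess returns and covariance."""
    return (w @ mu_e) / np.sqrt(w @ sigma @ w)

# Hypothetical monthly inputs for three assets (assumed, for illustration only).
mu_e = np.array([0.005, 0.007, 0.006])
sigma = np.array([[0.0025, 0.0010, 0.0008],
                  [0.0010, 0.0036, 0.0012],
                  [0.0008, 0.0012, 0.0030]])
gamma = 5.0

w_opt = mean_variance_weights(mu_e, sigma, gamma)
w_gmv = gmv_weights(sigma)
w_ew = np.full(3, 1.0 / 3.0)  # the 1/N portfolio

for name, w in [("optimal", w_opt), ("GMV", w_gmv), ("1/N", w_ew)]:
    print(name, np.round(w, 4), round(sharpe_ratio(w, mu_e, sigma), 4))
```

When the same moments are used to build and to evaluate the portfolios, as here, the mean–variance weights attain the highest Sharpe ratio by construction. The out–of–sample comparison in the text differs because the weights are built from estimated moments and evaluated against the true ones.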

We set the initial window length at 120 data points, corresponding to 10 years of monthly data. The parameters for the return–generating process are calibrated to mimic the Fama–French 3–factor model with 25 FF portfolios. All factors and the error terms are generated from a multivariate normal distribution. We simulate independent sets of Z = 20,000 return samples.

Table 2 provides the simulation results for out–of–sample Sharpe ratios of different portfolios. For each portfolio, we present the average estimate over simulations, $\overline{SR}$ (first line), the bias as the percentage of the population Sharpe ratio, $(\overline{SR} - SR)/SR$ (second line), and the root–mean–square error (RMSE), the square root of $\sum_{s=1}^{Z}(\widehat{SR}_s - SR)^2/Z$ (third line), where Z = 20,000.

In order to isolate the effect of the error in risk–premium estimates, we present our results with both true and estimated $\Sigma_{RR}$. First, note that the true Sharpe ratio of the optimal portfolio is superior to the portfolios based on estimated risk premiums or covariance matrix of asset returns. Comparing the average Sharpe ratio of the optimal portfolio based on historical averages to the true Sharpe ratio of the optimal portfolio for enlarging samples, we see that the bias is strikingly large and negative with −42% and −43%, depending on whether the covariance matrix of asset returns is the true one or the estimated one. However, using the factor models to estimate the risk premiums reduces the bias in Sharpe ratios substantially to about −12% when the true covariance matrix is used, and to about −9% when the covariance matrix is estimated. In particular, with GMM–Gen estimates, the average Sharpe ratio of the optimal portfolio is 0.157 in case of the true covariance matrix (with an improvement of about 50% over the average Sharpe ratios with the historical averages) and 0.161 in case of an estimated covariance matrix (with an improvement of about 60% over the average Sharpe ratios with the historical averages).
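The three summary statistics reported per portfolio (average, relative bias, RMSE) can be computed from a vector of simulated Sharpe ratios as in the sketch below. This is a minimal illustration, assuming the RMSE is the square root of the mean squared deviation from the population Sharpe ratio; the function name `summarize_sharpe` and the four toy draws are hypothetical and do not reproduce Table 2.

```python
import numpy as np

def summarize_sharpe(sr_hat, sr_true):
    """Return (average, relative bias, RMSE) of simulated Sharpe ratios.

    sr_hat  : out-of-sample Sharpe ratios, one per simulation run
    sr_true : population Sharpe ratio of the true optimal portfolio
    """
    sr_hat = np.asarray(sr_hat, dtype=float)
    avg = sr_hat.mean()
    rel_bias = (avg - sr_true) / sr_true              # bias as a fraction of the truth
    rmse = np.sqrt(np.mean((sr_hat - sr_true) ** 2))  # root-mean-square error
    return avg, rel_bias, rmse

# Toy draws (assumed values, for illustration only).
avg, rel_bias, rmse = summarize_sharpe([0.10, 0.20, 0.15, 0.15], sr_true=0.20)
print(avg, rel_bias, rmse)
```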
Among the optimal portfolios constructed with factor–model based risk–premium estimates, the portfolio based on GMM–Tr estimates performs best with 0.162. However, the differences in biases are minimal for all optimal portfolios constructed with factor–model based risk–premium estimators.

Next, we analyse the RMSEs of the various portfolios. The out–of–sample Sharpe ratio of the optimal portfolios based on historical averages is extremely volatile across simulations. That is, for the case of enlarging samples, it has an RMSE of 0.09 (given the average estimate 0.10) if the covariance matrix is estimated. However, using factor–model based risk–premium estimators decreases the RMSEs substantially. The differences in RMSEs are minor for all optimal portfolios constructed with factor–model based risk–premium estimators.

Comparing the average Sharpe ratios of the optimal portfolios with those of GMV and 1/N, we see that the optimal portfolio based on the naive estimator performs worse than both the 1/N strategy and the GMV portfolio. Moreover, both GMV and 1/N have substantially lower RMSEs. This result is consistent with the findings in the literature that the GMV portfolio as well as the 1/N strategy has better out–of–sample performance than the optimal portfolios based on sample moments.16 However, the average Sharpe ratios for all optimal portfolios based on factor model–based risk–premium estimates are larger than those of both the GMV and 1/N portfolios. Moreover, their out–of–sample Sharpe ratios across simulations are almost as stable as those of the GMV portfolio and the 1/N strategy.

Overall, using the factor–model based risk–premium estimators improves the performance of optimal portfolios substantially over the optimal portfolios based on the plug-in estimates of historical averages in terms of both bias and RMSEs. Moreover, in contrast to the optimal portfolios with historical averages, these portfolios perform considerably better than the global minimum variance portfolio. In practice, while there is little hope in knowing the universally true factor model, we know that particular factor models work well empirically in explaining expected returns of particular test assets (e.g., the Fama–French three factor model explains the returns of 25 size/book-to-market portfolios). Our results suggest that using factor-model-based expected return estimates likely improves the performance of optimal portfolios over the portfolios based on the historical averages estimators of expected returns, the minimum variance portfolio, and the 1/N portfolio.

16 See, e.g., DeMiguel et al. (2009), Jagannathan and Ma (2003), and Jorion (1985, 1991).

8 Conclusions

One traditional technique in the literature is to use average historical returns as estimates of expected excess returns, that is, risk premiums, on individual assets or portfolios. These estimators are usually noisy. This translates into the need for very long, in practice mostly infeasible, samples of data in order to obtain some precision. However, the finance literature provides a wide variety of risk–return models which imply a linear relationship between the expected excess returns and their exposures. In this paper, we show that, when correctly specified, such parametric specifications on the functional form of risk premiums lead to significant inference gains for estimating expected (excess) returns. In the standard Fama–French three factor model using MKT, SMB, HML as factors with 25 FF portfolios, the efficiency gains are sizeable and go up to about 25% for individual portfolios. For applications, this translates into the benefit of using only about 75% of the data with factor–model based risk–premium estimates to obtain the same precision as with the historical averages estimator. Moreover, we show that, in the presence of weakly identified factors, the confidence bounds of factor model based risk–premium estimators are not affected in the case of traded factors. Finally, we show that out–of–sample performance of optimal portfolios significantly improves if factor–model based estimates of risk premiums are used in portfolio weights instead of the classical historical averages.
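As a back-of-the-envelope check on the sample-size statement above, assume (an illustrative simplification, not a result stated in the paper) that the finite-sample variance of each estimator is approximately its limiting variance divided by the sample size. The symbols $V_{hist}$, $V_{fm}$, $T_{hist}$ and $T_{fm}$ are introduced here only for this sketch and denote the limiting variances and sample sizes of the historical-average and factor–model based estimators. An efficiency gain of 25% then maps into the 75% figure as follows:

```latex
\frac{V_{fm}}{T_{fm}} = \frac{V_{hist}}{T_{hist}},
\qquad
V_{fm} = (1 - 0.25)\,V_{hist}
\quad\Longrightarrow\quad
T_{fm} = 0.75\,T_{hist}.
```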

References

Balduzzi, P., and Robotti, C. (2008). Mimicking portfolios, economic risk premia, and tests of multi-beta models. Journal of Business & Economic Statistics, 26(3), 354–368.
Best, M. J., and Grauer, R. R. (1991). On the sensitivity of mean-variance-efficient portfolios to changes in asset means: Some analytical and computational results. The Review of Financial Studies, 4(2), 315–342.
Boyd, S., and Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press.
Breeden, D. T. (1979). An intertemporal asset pricing model with stochastic consumption and investment opportunities. Journal of Financial Economics, 7(3), 265–296.
Bryzgalova, S. (2015). Spurious factors in linear asset pricing models. Working Paper.
Burnside, C. (2015). Identification and inference in linear stochastic discount factor models with excess returns. Journal of Financial Econometrics, 14(2), 295–330.
Chopra, V. K., and Ziemba, W. T. (1993). The effect of errors in means, variances, and covariances on optimal portfolio choice. The Journal of Portfolio Management, 19(2), 6–11.
DeMiguel, V., Garlappi, L., and Uppal, R. (2009). Optimal versus naive diversification: How inefficient is the 1/N portfolio strategy? The Review of Financial Studies, 22(5), 1915–1953.
Fama, E. F. (1998). Determining the number of priced state variables in the ICAPM. The Journal of Financial and Quantitative Analysis, 33(2), 217–231.
Fama, E. F., and French, K. R. (1992). The cross-section of expected stock returns. The Journal of Finance, 47(2), 427–465.
Fama, E. F., and French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1), 3–56.
Fama, E. F., and MacBeth, J. D. (1973). Risk, return, and equilibrium: Empirical tests. Journal of Political Economy, 81(3), 607–636.
Frost, P. A., and Savarino, J. E. (1988). For better performance. The Journal of Portfolio Management, 15(1), 29–34.
Gagliardini, P., Ossola, E., and Scaillet, O. (2016). Time-varying risk premium in large cross-sectional equity data sets. Econometrica, 84(3), 985–1046.
Giglio, S., Xiu, D., and Zhang, D. (2021). Test assets and weak factors. Chicago Booth Research Paper, 21(4).
Gospodinov, N., Kan, R., and Robotti, C. (2014). Misspecification-robust inference in linear asset-pricing models with irrelevant risk factors. The Review of Financial Studies, 27(7), 2139–2170.
Gospodinov, N., Kan, R., and Robotti, C. (2017). Spurious inference in reduced-rank asset-pricing models. Econometrica, 85(5), 1613–1628.
Gospodinov, N., Kan, R., and Robotti, C. (2019). Too good to be true? Fallacies in evaluating risk factor models. Journal of Financial Economics, 132(2), 451–471.
Hall, A. (2005). Generalized Method of Moments. Advanced Texts in Econometrics. Oxford University Press.
Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica: Journal of the Econometric Society, (pp. 1029–1054).
Hou, K., and Kimmel, R. (2006). On the estimation of risk premia in linear factor models. Unpublished working paper. Ohio State University.
Huberman, G., Kandel, S., and Stambaugh, R. F. (1987). Mimicking portfolios and exact arbitrage pricing. The Journal of Finance, 42(1), 1–9.
Jagannathan, R., and Ma, T. (2003). Risk reduction in large portfolios: Why imposing the wrong constraints helps. The Journal of Finance, 58(4), 1651–1683.
Jagannathan, R., and Wang, Z. (1998). An asymptotic theory for estimating beta-pricing models using cross-sectional regression. The Journal of Finance, 53(4), 1285–1309.
Jobson, J. D., and Korkie, B. (1980). Estimation for Markowitz efficient portfolios. Journal of the American Statistical Association, 75(371), 544–554.
Jorion, P. (1985). International portfolio diversification with estimation risk. The Journal of Business, 58(3), 259–278.
Jorion, P. (1991). Bayesian and CAPM estimators of the means: Implications for portfolio selection. Journal of Banking & Finance, 15(3), 717–727.
Kan, R., and Robotti, C. (2011). On the estimation of asset pricing models using univariate betas. Economics Letters, 110(2), 117–121.
Kan, R., Robotti, C., and Shanken, J. (2013). Pricing model performance and the two-pass cross-sectional regression methodology. The Journal of Finance, 68(6), 2617–2649.
Kan, R., and Zhang, C. (1999). Two-pass tests of asset pricing models with useless factors. The Journal of Finance, 54(1), 203–235.
Kelly, B. T., Pruitt, S., and Su, Y. (2019). Characteristics are covariances: A unified model of risk and return. Journal of Financial Economics, 134(3), 501–524.
Kleibergen, F. (2009). Tests of risk premia in linear factor models. Journal of Econometrics, 149(2), 149–173.
Kleibergen, F., and Zhan, Z. (2015). Unexplained factors and their effects on second pass R-squared’s. Journal of Econometrics, 189(1), 101–116.
Lamont, O. A. (2001). Economic tracking portfolios. Journal of Econometrics, 105(1), 161–184.
Ledoit, O., and Wolf, M. (2003). Improved estimation of the covariance matrix of stock returns with an application to portfolio selection. Journal of Empirical Finance, 10(5), 603–621.
Lettau, M., and Ludvigson, S. (2001). Resurrecting the (C)CAPM: A cross-sectional test when risk premia are time-varying. Journal of Political Economy, 109(6), 1238–1287.
Lewellen, J., Nagel, S., and Shanken, J. (2010). A skeptical appraisal of asset pricing tests. Journal of Financial Economics, 96(2), 175–194.
Markowitz, H. (1952). Portfolio selection. The Journal of Finance, 7(1), 77–91.
Merton, R. C. (1973). An intertemporal capital asset pricing model. Econometrica: Journal of the Econometric Society, (pp. 867–887).
Merton, R. C. (1980). On estimating the expected return on the market: An exploratory investigation. Journal of Financial Economics, 8(4), 323–361.
Michaud, R. O. (1989). The Markowitz optimization enigma: Is ‘optimized’ optimal? Financial Analysts Journal, 45(1), 31–42.
Pástor, L. (2000). Portfolio selection and asset pricing models. The Journal of Finance, 55(1), 179–223.
Pástor, L., and Stambaugh, R. F. (1999). Costs of equity capital and model mispricing. The Journal of Finance, 54(1), 67–121.
Peñaranda, F., and Sentana, E. (2011). Inferences about portfolio and stochastic discount factor mean variance frontiers. Working paper.
Ross, S. A. (1976). The arbitrage theory of capital asset pricing. Journal of Economic Theory, 13(3), 341–360.
Shanken, J. (1992). On the estimation of beta-pricing models. The Review of Financial Studies, 5(1), 1–33.
Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. The Journal of Finance, 19(3), 425–442.
Staiger, D., and Stock, J. H. (1997). Instrumental variables regression with weak instruments. Econometrica: Journal of the Econometric Society, (pp. 557–586).

A Proofs

In the rest of the paper, the covariance matrix of the factor–mimicking portfolios is denoted by $\Sigma_{F^mF^m}$.

A.1 Equivalence of factor pricing using mimicking portfolios

Proof of Theorem 2.1. Define $M^m$ as the projection of $M$ onto the augmented span of excess returns,

$$M^m = P(M \mid 1, R^e) \tag{A.1}$$

so that

$$E[M] = E[M^m], \tag{A.2}$$
$$\mathrm{Cov}[M,R^e] = \mathrm{Cov}[M^m,R^e]. \tag{A.3}$$

Thus, we have

$$
\begin{aligned}
\beta\lambda &= \mathrm{Cov}[R^e,F']\,\Sigma_{FF}^{-1}\left(-\frac{1}{E[M]}\Sigma_{FF}\,b\right) \\
&= -\frac{1}{E[M]}\mathrm{Cov}[R^e,F']\,b \\
&= -\frac{1}{E[M^m]}\mathrm{Cov}\left[R^e,F^{m\prime}\right]b \\
&= -\frac{1}{E[M^m]}\mathrm{Cov}\left[R^e,F^{m\prime}\right]\Sigma_{F^mF^m}^{-1}\Sigma_{F^mF^m}\,b \\
&= \beta^m\lambda^m,
\end{aligned} \tag{A.4}
$$

which completes the proof.

A.2 Precision of Parameter Estimators Given a Factor Model

This section provides the proofs for the asymptotic properties of the parameter estimators under the specified linear factor model. Lemma A.1 below gives the asymptotic distribution of the GMM estimators for a given set of moment conditions, provided that a pre–specified matrix A, which essentially determines the weights of the overidentifying moments, is introduced. Thereafter, these results will be used to calculate the variance covariance matrix for the moment conditions (3.1), (3.2) and (3.4), respectively.

Under appropriate regularity conditions, see, e.g., Hall (2005), Chapter 3.4, we have the following result. Lemma A.1. Let θ ∈ Rp be a vector of parameters and the moment conditions are given by E[h (θ)] = 0 where h (θ) ∈ Rq, stationary and ergodic process with finite fourth moment. t t Given a prespecified matrix A ∈ Rp×q, its consistent estimator A ˆ and A ˆ1 (cid:80)T h (θ ˆ ) = 0, T t=1 t √ T(θ ˆ −θ) → d N (cid:0) 0,[AJ]−1ASA(cid:48)[J(cid:48)A(cid:48)]−1 (cid:1) , (A.5) where, (cid:20) (cid:21) ∂h (θ) t J = E , (A.6) ∂θ(cid:48) S = E[h (θ)h (θ)(cid:48)]. (A.7) t t The above lemma presents the asymptotic distribution of the parameters in a general GMM context. In the subsequent lemmas, we provide the limiting distributions for the parameter estimators based on the moment conditions (3.1), (3.2) and (3.4), respectively. Lemma A.2. Under Assumptions 1-3 and the moment conditions (3.1) with parameter vector θ = (α(cid:48),vec(β)(cid:48),λ(cid:48))(cid:48), we have √ ˆ d T(θ−θ) → N(0,V), (A.8) with     1+µ(cid:48) Σ−1µ −µ(cid:48) Σ−1   F FF F F FF ⊗Σ V     εε c   −Σ−1µ Σ−1  V =  FF F FF          V(cid:48) (1+λ(cid:48)Σ−1λ)(β(cid:48)Σ−1β)−1 +Σ c FF εε FF 33

  1+µ(cid:48) Σ−1λ where µ = E[F ] and V =  F FF ⊗β(β(cid:48)Σ−1β)−1. F t c   εε −Σ−1λ FF Proof. The proof follows from plugging the appropriate matrices for the moment conditions provided in Section 3.1 into the variance covariance formula in (A.5) and performing the matrix multiplications. Below, we provide the limiting variance covariance matrix (S) and the Jacobian (J) for this specific set of moment conditions,   Σ µ(cid:48) ⊗Σ Σ εε F εε εε     S =  µ ⊗Σ [Σ +µ µ(cid:48) ]⊗Σ µ ⊗Σ .  F εε FF F F εε F εε    Σ µ(cid:48) ⊗Σ βΣ β(cid:48) +Σ εε F εε FF εε     1 µ(cid:48) (cid:20) ∂h (θ) (cid:21)   −  F   ⊗I N 0 N(K+1)×K   J(θ) = E t =  µ Σ +µ µ(cid:48) . ∂θ(cid:48)  F FF F F   (cid:20) (cid:21)    0 −λ(cid:48) ⊗I −β N×N N Furthermore   I 0 N(K+1) N(K+1)×N A =  .   0 β(cid:48)Σ−1 K×N(K+1) εε so that the limiting variance of GMM estimator for θ is obtained by performing the matrix multiplications [AJ]−1ASA(cid:48)[J(cid:48)A(cid:48)]−1. Lemma A.3. Suppose that all factors are traded. Then, under Assumptions 1-3 and the moment conditions (3.2) with parameter vector θ = (α(cid:48),vec(β)(cid:48),λ(cid:48))(cid:48), we have √ ˆ d T(θ−θ) → N(0,V), (A.9) 34

with
\[
V = \begin{pmatrix}
\begin{pmatrix} 1 + \mu_F' \Sigma_{FF}^{-1} \mu_F & -\mu_F' \Sigma_{FF}^{-1} \\ -\Sigma_{FF}^{-1} \mu_F & \Sigma_{FF}^{-1} \end{pmatrix} \otimes \Sigma_{\varepsilon\varepsilon} & 0_{N(K+1) \times K} \\
0_{K \times N(K+1)} & \Sigma_{FF}
\end{pmatrix}.
\]

Proof. The proof follows from plugging the appropriate matrices for the moment conditions (3.2) into the variance covariance formula in (A.5) and performing the matrix multiplications. Below, we provide the limiting variance covariance matrix ($S$) and the Jacobian ($J$) for this specific set of moment conditions. In this case,
\[
S = \begin{pmatrix}
\Sigma_{\varepsilon\varepsilon} & \mu_F' \otimes \Sigma_{\varepsilon\varepsilon} & 0_{N \times K} \\
\mu_F \otimes \Sigma_{\varepsilon\varepsilon} & [\Sigma_{FF} + \mu_F \mu_F'] \otimes \Sigma_{\varepsilon\varepsilon} & 0_{NK \times K} \\
0_{K \times N} & 0_{K \times NK} & \Sigma_{FF}
\end{pmatrix},
\]
and
\[
J(\theta) = \begin{pmatrix}
-\begin{pmatrix} 1 & \mu_F' \\ \mu_F & \Sigma_{FF} + \mu_F \mu_F' \end{pmatrix} \otimes I_N & 0_{N(K+1) \times K} \\
0_{K \times N(K+1)} & I_K
\end{pmatrix}.
\]
Thus, the limiting variance of the GMM estimator for $\theta$ is obtained by performing the matrix multiplications $J^{-1} S [J']^{-1}$, since $A = I_{N(K+1)+K}$.

The next lemma provides the asymptotic properties of the GMM estimator with factor-mimicking portfolios.

Lemma A.4. Given that Assumptions 1-3 are satisfied and that (2.3) holds, then under the moment conditions (3.4), for $\theta = (\mathrm{vec}(\beta^m)', \lambda^{m\prime})'$, we have
\[
\sqrt{T}(\hat{\theta} - \theta) \xrightarrow{d} N(0, V), \tag{A.10}
\]

with
\[
V = \begin{pmatrix}
\Sigma_{F^mF^m}^{-1} \Phi \Sigma_{R^eR^e} \Phi' \Sigma_{F^mF^m}^{-1} \otimes \beta^m \Sigma_{uu} \beta^{m\prime} + \Sigma_{F^mF^m}^{-1} \otimes \Sigma_{\varepsilon^m\varepsilon^m} & -\Sigma_{F^mF^m}^{-1} \mu_{F^m} \otimes \beta^m \Sigma_{uu} \\
-\mu_{F^m}' \Sigma_{F^mF^m}^{-1} \otimes \Sigma_{uu} \beta^{m\prime} & \mu_{R^e}' \Sigma_{R^eR^e}^{-1} \mu_{R^e} \Sigma_{uu} + \Sigma_{F^mF^m}
\end{pmatrix}.
\]

Proof. The proof follows again from plugging the appropriate matrices for the moment conditions (3.4) into the variance covariance formula in (A.5) and performing the matrix multiplications. Now, observe that from (A.7), we have
\[
S = \begin{pmatrix}
\begin{pmatrix} 1 & \mu_{R^e}' \\ \mu_{R^e} & \Sigma_{R^eR^e} + \mu_{R^e} \mu_{R^e}' \end{pmatrix} \otimes \Sigma_{uu} & 0_{K(1+N) \times N(K+1)} & 0_{K(1+N) \times K} \\
0_{N(K+1) \times K(1+N)} & \begin{pmatrix} 1 & \mu_{F^m}' \\ \mu_{F^m} & \Sigma_{F^mF^m} + \mu_{F^m} \mu_{F^m}' \end{pmatrix} \otimes \Sigma_{\varepsilon^m\varepsilon^m} & 0_{N(K+1) \times K} \\
0_{K \times K(1+N)} & 0_{K \times N(K+1)} & \Sigma_{F^mF^m}
\end{pmatrix},
\]
and from (A.6), we have
\[
J(\theta) = E\begin{pmatrix}
-\begin{pmatrix} 1 & R_t^{e\prime} \\ R_t^e & R_t^e R_t^{e\prime} \end{pmatrix} \otimes I_K & 0_{K(1+N) \times N(K+1)} & 0_{K(1+N) \times K} \\
-\begin{pmatrix} 0 & R_t^{e\prime} \\ 0_{K \times 1} & \Phi (R_t^e R_t^{e\prime}) \end{pmatrix} \otimes \beta^m & -\begin{pmatrix} 1 & F_t^{m\prime} \\ F_t^m & F_t^m F_t^{m\prime} \end{pmatrix} \otimes I_N & 0_{N(K+1) \times K} \\
\begin{pmatrix} 0_K & R_t^{e\prime} \otimes I_K \end{pmatrix} & 0_{K \times N(K+1)} & -I_K
\end{pmatrix},
\]
with $A = I_{K(1+N)+N(K+1)+K}$. Thus, the limiting variance of the GMM estimator for $\theta = (\mathrm{vec}(\beta^m)', \lambda^{m\prime})'$ is obtained by performing the matrix multiplications $J^{-1} S [J']^{-1}$. Here, it is worth stressing that the limiting variance covariance matrix obtained by performing the matrix multiplications corresponds to the parameter vector $(\Phi_0', \mathrm{vec}(\Phi)', \alpha^{m\prime}, \mathrm{vec}(\beta^m)', \lambda^{m\prime})'$.

Therefore, the asymptotic variance covariance matrix for $\theta = (\mathrm{vec}(\beta^m)', \lambda^{m\prime})'$ is the lower-right $(KN + K) \times (KN + K)$ sub-matrix of the larger variance covariance matrix.

Lemmas A.2-A.4 allow us to study the asymptotic properties of the obtained risk premium estimators. It is worth mentioning that the lower-right $(NK + K)$-dimensional square blocks of the variance covariance matrices in Lemmas A.2 and A.3 give the variance covariance matrices corresponding to the parameters $(\mathrm{vec}(\beta)', \lambda')'$, since these are the trailing elements of $\theta = (\alpha', \mathrm{vec}(\beta)', \lambda')'$. We will use these results to derive the variance covariance matrices of risk premium estimators in the following section.

Proof of Theorem 4.1. This follows from a direct application of the Central Limit Theorem.

Proofs of Theorems 4.2 and 4.3. We are interested in the asymptotic distribution of $g(\beta, \lambda) = \beta\lambda$. Given
\[
\sqrt{T}\left( (\mathrm{vec}(\hat{\beta})', \hat{\lambda}')' - (\mathrm{vec}(\beta)', \lambda')' \right) \xrightarrow{d} N(0, V_{\beta,\lambda}),
\]
we have, by applying the delta method, that
\[
\sqrt{T}\left( g(\hat{\beta}, \hat{\lambda}) - g(\beta, \lambda) \right) \xrightarrow{d} N(0, \dot{g}' V_{\beta,\lambda} \dot{g}),
\]
with
\[
\dot{g} = \begin{bmatrix} \lambda' \otimes I_N & \beta \end{bmatrix}.
\]
Remember that Lemmas A.2 and A.3 give the asymptotic distributions of $\sqrt{T}(\hat{\theta} - \theta)$, where $\theta = (\alpha', \mathrm{vec}(\beta)', \lambda')'$, for the moment conditions (3.1) and (3.2). Observe that $V_{\beta,\lambda}$ is the lower-right $(NK + K)$-dimensional diagonal block of the variance covariance matrices provided in Lemmas A.2 and A.3. Hence, the asymptotic variances of the risk premium estimators in Theorems 4.2 and 4.3 follow from plugging in the limiting variance covariance matrices of $(\mathrm{vec}(\beta)', \lambda')'$ and calculating $\dot{g}' V_{\beta,\lambda} \dot{g}$.

Proof of Theorem 4.4. We are interested in $g(\beta^m, \lambda^m) = \beta^m \lambda^m$. Given
\[
\sqrt{T}\left( (\mathrm{vec}(\hat{\beta}^m)', \hat{\lambda}^{m\prime})' - (\mathrm{vec}(\beta^m)', \lambda^{m\prime})' \right) \xrightarrow{d} N(0, V_{\beta^m,\lambda^m}),
\]

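Then, by applying the delta method, we have
\[
\sqrt{T}\left( g(\hat{\beta}^m, \hat{\lambda}^m) - g(\beta^m, \lambda^m) \right) \xrightarrow{d} N(0, \dot{g}' V_{\beta^m,\lambda^m} \dot{g}),
\]
and note that here
\[
\dot{g} = \begin{bmatrix} \lambda^{m\prime} \otimes I_N & \beta^m \end{bmatrix}.
\]
Then, we have
\[
\dot{g}' V_{\beta^m,\lambda^m} \dot{g} = \lambda^{m\prime} \Sigma_{F^mF^m}^{-1} \lambda^m \, \Sigma_{\varepsilon^m\varepsilon^m} + \beta^m \Sigma_{F^mF^m} \beta^{m\prime} + \left( \mu_{R^e}' \Sigma_{RR}^{-1} \mu_{R^e} - \lambda^{m\prime} \Sigma_{F^mF^m}^{-1} \lambda^m \right) \beta^m \Sigma_{uu} \beta^{m\prime}. \tag{A.11}
\]
The result follows from plugging $\beta^m$ and $\Phi$, respectively, into the above equation via (2.6).

The following lemma follows from the Schur complement condition; see Boyd and Vandenberghe (2004).

Lemma A.5. Let
\[
K = \begin{pmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{pmatrix}
\]
be a symmetric matrix and assume that $K_{22}^{-1}$ exists. Then $K \geq 0$ is equivalent to $K_{22} \geq 0$ and $K_{11} - K_{12} K_{22}^{-1} K_{21} \geq 0$.

Proof of Corollary 4.1. Suppose $\lambda' \Sigma_{FF}^{-1} \lambda < 1$. We need to study the difference between the limiting variance of the historical averages and the limiting variance of the expected (excess) return estimator based on (3.1). In particular, we need to study
\[
\Sigma_{R^eR^e} - \left( \Sigma_{R^eR^e} - \left(1 - \lambda' \Sigma_{FF}^{-1} \lambda\right) \left[ \Sigma_{\varepsilon\varepsilon} - \beta (\beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta)^{-1} \beta' \right] \right)
= \left(1 - \lambda' \Sigma_{FF}^{-1} \lambda\right) \left[ \Sigma_{\varepsilon\varepsilon} - \beta (\beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta)^{-1} \beta' \right].
\]
In order to show that $\Sigma_{\varepsilon\varepsilon} - \beta (\beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta)^{-1} \beta'$ is positive semi-definite, we will use Lemma A.5.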

Now, let $K_1 = \Sigma_{\varepsilon\varepsilon}^{1/2}$ and $K_2 = \beta' \Sigma_{\varepsilon\varepsilon}^{-1/2}$. Then,
\[
K = \begin{pmatrix} K_1 \\ K_2 \end{pmatrix} \begin{bmatrix} K_1' & K_2' \end{bmatrix} = \begin{pmatrix} K_1 K_1' & K_1 K_2' \\ K_2 K_1' & K_2 K_2' \end{pmatrix},
\]
so that
\[
K = \begin{pmatrix} \Sigma_{\varepsilon\varepsilon} & \beta \\ \beta' & \beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta \end{pmatrix}.
\]
Then, Lemma A.5 yields that $\Sigma_{\varepsilon\varepsilon} - \beta (\beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta)^{-1} \beta' \geq 0$.

Proof of Corollary 4.2. Suppose $\lambda' \Sigma_{FF}^{-1} \lambda < 1$.

In order to prove Corollary 4.2-1, we need to study the difference between the limiting variance of the historical averages and the limiting variance of the expected (excess) return estimator based on (3.2). In particular, we need to show that
\[
\Sigma_{R^eR^e} - \left( \Sigma_{R^eR^e} - \left(1 - \lambda' \Sigma_{FF}^{-1} \lambda\right) \Sigma_{\varepsilon\varepsilon} \right) = \left(1 - \lambda' \Sigma_{FF}^{-1} \lambda\right) \Sigma_{\varepsilon\varepsilon}
\]
is positive semi-definite. Since $\Sigma_{\varepsilon\varepsilon}$ is positive semi-definite, Corollary 4.2-1 follows.

In order to prove Corollary 4.2-2, we need to study the difference between the limiting variance of the expected (excess) return estimator based on (3.1) and the limiting variance of the expected (excess) return estimator based on (3.2). In particular, we need to show

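that
\[
\left( \Sigma_{R^eR^e} - \left(1 - \lambda' \Sigma_{FF}^{-1} \lambda\right) \left[ \Sigma_{\varepsilon\varepsilon} - \beta (\beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta)^{-1} \beta' \right] \right) - \left( \Sigma_{R^eR^e} - \left(1 - \lambda' \Sigma_{FF}^{-1} \lambda\right) \Sigma_{\varepsilon\varepsilon} \right)
= \left(1 - \lambda' \Sigma_{FF}^{-1} \lambda\right) \beta (\beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta)^{-1} \beta'
\]
is positive semi-definite. This follows immediately from $\Sigma_{\varepsilon\varepsilon}$ being positive semi-definite.

Following the literature on weak instruments (see, e.g., Staiger and Stock (1997) and Kleibergen (2009)), we consider a sequence of $\beta$'s getting smaller as the sample size increases.

Proof of Corollary 5.1. Suppose Assumption 2 and Assumption 3 hold and consider the small $\beta$ case where $\beta = \frac{1}{\sqrt{T}} B$ for a fixed full rank $N \times K$ matrix $B$. Then

1. Consider the GMM estimator based on the moment conditions (3.1). We want to study
\[
\hat{\beta}\hat{\lambda} - \beta\lambda = \hat{\Sigma}_{R^eF} \hat{\Sigma}_{FF}^{-1} \left( \hat{\beta}' \hat{\Sigma}_{\varepsilon\varepsilon}^{-1} \hat{\beta} \right)^{-1} \left( \hat{\beta}' \hat{\Sigma}_{\varepsilon\varepsilon}^{-1} \bar{R}^e \right) - \beta\lambda. \tag{A.12}
\]
Note, using $R_t^e = \alpha + \beta F_t + \varepsilon_t$,
\[
\hat{\Sigma}_{R^eF} = \frac{1}{T} \sum_{t=1}^{T} \left( R_t^e - \bar{R}^e \right) \left( F_t - \bar{F} \right)' \tag{A.13}
\]
\[
= \frac{1}{T} \sum_{t=1}^{T} \left[ \beta \left( F_t - \bar{F} \right) + (\varepsilon_t - \bar{\varepsilon}) \right] \left( F_t - \bar{F} \right)'
= \beta \hat{\Sigma}_{FF} + \hat{\Sigma}_{\varepsilon F},
\]
so that
\[
\hat{\beta} = \beta + \hat{\Sigma}_{\varepsilon F} \hat{\Sigma}_{FF}^{-1}. \tag{A.14}
\]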

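Moreover, note
\[
\hat{\beta}' \hat{\Sigma}_{\varepsilon\varepsilon}^{-1} \bar{R}^e = \left( \beta + \hat{\Sigma}_{\varepsilon F} \hat{\Sigma}_{FF}^{-1} \right)' \hat{\Sigma}_{\varepsilon\varepsilon}^{-1} \left( \alpha + \beta \bar{F} + \bar{\varepsilon} \right), \tag{A.15}
\]
and
\[
\hat{\beta}' \hat{\Sigma}_{\varepsilon\varepsilon}^{-1} \hat{\beta} = \left( \beta + \hat{\Sigma}_{\varepsilon F} \hat{\Sigma}_{FF}^{-1} \right)' \hat{\Sigma}_{\varepsilon\varepsilon}^{-1} \left( \beta + \hat{\Sigma}_{\varepsilon F} \hat{\Sigma}_{FF}^{-1} \right). \tag{A.16}
\]
Considering $\beta = B/\sqrt{T}$ and assuming that $\sqrt{T}\,\hat{\Sigma}_{\varepsilon F} \xrightarrow{d} Y$ and $\sqrt{T}\,\bar{\varepsilon} \xrightarrow{d} U$ for some random $U$ and $Y$, from (A.16), we get
\[
T \hat{\beta}' \hat{\Sigma}_{\varepsilon\varepsilon}^{-1} \hat{\beta} \xrightarrow{d} \left( B + Y \Sigma_{FF}^{-1} \right)' \Sigma_{\varepsilon\varepsilon}^{-1} \left( B + Y \Sigma_{FF}^{-1} \right). \tag{A.17}
\]
Furthermore, note
\[
E[R_t^e] = \beta\lambda = \alpha + \beta E[F], \tag{A.18}
\]
so that
\[
\alpha = \beta (\lambda - E[F]) = \frac{B}{\sqrt{T}} (\lambda - E[F]). \tag{A.19}
\]
As a result, (A.15) leads to
\[
T \hat{\beta}' \hat{\Sigma}_{\varepsilon\varepsilon}^{-1} \bar{R}^e \xrightarrow{d} \left( B + Y \Sigma_{FF}^{-1} \right)' \Sigma_{\varepsilon\varepsilon}^{-1} (B\lambda + U). \tag{A.20}
\]
Hence, (A.12) implies
\[
\hat{\beta}\hat{\lambda} - \beta\lambda \xrightarrow{d} \left( B + Y \Sigma_{FF}^{-1} \right) \left[ \left( B + Y \Sigma_{FF}^{-1} \right)' \Sigma_{\varepsilon\varepsilon}^{-1} \left( B + Y \Sigma_{FF}^{-1} \right) \right]^{-1} \left( B + Y \Sigma_{FF}^{-1} \right)' \Sigma_{\varepsilon\varepsilon}^{-1} [B\lambda + U] - \beta\lambda.
\]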

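Under the assumptions in the paper, $U$ is independent of $Y$, $U \sim N(0, \Sigma_{\varepsilon\varepsilon})$ and $\mathrm{vec}(Y) \sim N(0, \Sigma_{FF} \otimes \Sigma_{\varepsilon\varepsilon})$.

2. Suppose that factors are traded, and consider the GMM estimator based on the moment conditions (3.2). Then,
\[
\hat{\beta}\hat{\lambda} = \hat{\Sigma}_{R^eF} \hat{\Sigma}_{FF}^{-1} \bar{F} = \left( \beta + \hat{\Sigma}_{\varepsilon F} \hat{\Sigma}_{FF}^{-1} \right) \bar{F}, \tag{A.21}
\]
so that
\[
\sqrt{T} \left( \hat{\beta}\hat{\lambda} - \beta\lambda \right) = \beta \sqrt{T} \left( \bar{F} - E[F] \right) + \sqrt{T}\, \hat{\Sigma}_{\varepsilon F} \hat{\Sigma}_{FF}^{-1} \bar{F}. \tag{A.22}
\]
Now, taking $\beta = B/\sqrt{T}$,
\[
\sqrt{T} \left( \hat{\beta}\hat{\lambda} - \beta\lambda \right) \xrightarrow{d} Y \Sigma_{FF}^{-1} \lambda. \tag{A.23}
\]
Using $\mathrm{vec}(Y) \sim N(0, \Sigma_{FF} \otimes \Sigma_{\varepsilon\varepsilon})$,
\[
\begin{aligned}
Y \Sigma_{FF}^{-1} \lambda &= \mathrm{vec}\left( Y \Sigma_{FF}^{-1} \lambda \right) \\
&= \left( \lambda^{\top} \Sigma_{FF}^{-1} \otimes I_N \right) \mathrm{vec}(Y) \\
&\sim N\left( 0, \left( \lambda^{\top} \Sigma_{FF}^{-1} \otimes I_N \right) (\Sigma_{FF} \otimes \Sigma_{\varepsilon\varepsilon}) \left( \Sigma_{FF}^{-1} \lambda \otimes I_N \right) \right) \\
&= N\left( 0, \lambda^{\top} \Sigma_{FF}^{-1} \Sigma_{FF} \Sigma_{FF}^{-1} \lambda \otimes \Sigma_{\varepsilon\varepsilon} \right) \\
&= N\left( 0, \lambda^{\top} \Sigma_{FF}^{-1} \lambda \, \Sigma_{\varepsilon\varepsilon} \right).
\end{aligned} \tag{A.24}
\]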
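The Kronecker-product step in (A.24) can be checked numerically. The following is a minimal sketch, assuming arbitrary positive definite matrices for $\Sigma_{FF}$ and $\Sigma_{\varepsilon\varepsilon}$ and an arbitrary $\lambda$; the function name `small_beta_limit_cov` and all numerical inputs are hypothetical and are not taken from the paper.

```python
import numpy as np

def small_beta_limit_cov(lam, sigma_ff, sigma_ee):
    """Covariance of Y Sigma_FF^{-1} lam when vec(Y) ~ N(0, Sigma_FF kron Sigma_ee)."""
    n = sigma_ee.shape[0]
    a = np.linalg.solve(sigma_ff, lam)        # Sigma_FF^{-1} lam, shape (K,)
    left = np.kron(a[None, :], np.eye(n))     # (lam' Sigma_FF^{-1}) kron I_N, shape (N, N*K)
    return left @ np.kron(sigma_ff, sigma_ee) @ left.T

# Hypothetical inputs: random positive definite matrices, not calibrated values.
rng = np.random.default_rng(0)
K, N = 3, 4
A = rng.normal(size=(K, K))
sigma_ff = A @ A.T + K * np.eye(K)
B = rng.normal(size=(N, N))
sigma_ee = B @ B.T + N * np.eye(N)
lam = rng.normal(size=K)

# Left-hand side: the sandwich form in the third line of (A.24).
lhs = small_beta_limit_cov(lam, sigma_ff, sigma_ee)
# Right-hand side: the simplified form (lam' Sigma_FF^{-1} lam) * Sigma_ee.
rhs = (lam @ np.linalg.solve(sigma_ff, lam)) * sigma_ee
print(np.allclose(lhs, rhs))
```

The sketch only verifies the algebraic identity behind the last two lines of (A.24); it does not simulate the weak-beta sequence itself.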

B Inference about Risk–Premiums with Omitted Factors

The asymptotic results in the main paper are based on the assumption that the pricing model is correctly specified. The researcher is assumed to know the true factor model that explains expected excess returns on the assets. In that case, the risk–premium estimators are consistent, certainly under our maintained assumption of stationary and ergodic returns. However, the pricing model may be misspecified and this might induce inconsistent risk–premium estimates. We investigate this issue and its solution in this section.

We consider model misspecification due to omitted factors. An example of this type of misspecification would be to use the Fama–French three factor model if the true pricing model is the four factor Fama–French–Carhart model. Formally, assume that excess returns are generated by a factor model with two distinct sets of factors, $F$ and $G$, such that
\[
R^e = \alpha^* + \beta^* F + \delta^* G + \varepsilon^*, \tag{B.1}
\]
where $\varepsilon^*$ is a vector of residuals with mean zero and $E[F \varepsilon^{*\prime}] = 0$ and $E[G \varepsilon^{*\prime}] = 0$. Note that the sets of factors $F$ and $G$ perfectly explain the expected excess returns of the test assets, i.e.,
\[
E[R^e] = \beta^* \lambda_F + \delta^* \lambda_G.
\]
However, a researcher may forget about the presence of the factors $G$ and thus estimate the model only with factors $F$. Then, the estimated model is
\[
R^e = \alpha + \beta F + \varepsilon, \tag{B.2}
\]
with zero–mean $\varepsilon$ and $E[F \varepsilon'] = 0$. As the researcher might not know the underlying factor model exactly, she allows for misspecification by adding an $N$-vector of constant terms, $\alpha$, in the estimation as in Fama and French (1993).

The biases in the parameter estimates for $\alpha$, $\beta$ and $\lambda$ are presented in the following

theorem:

Theorem B.1. Assume that returns are generated by (B.1) but $\alpha$, $\beta$, and $\lambda$ are estimated from (B.2) with GMM based on (3.1). Then,

1. $\hat{\alpha}$ converges to $\alpha^* + (\beta^* - \beta) E[F] + \delta^* E[G]$,
2. $\hat{\beta}$ converges to $\beta^* + \delta^* \mathrm{Cov}[G, F^{\top}] \Sigma_{FF}^{-1}$,
3. $\hat{\lambda}$ converges to $\lambda_F + (\beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta)^{-1} \beta' \Sigma_{\varepsilon\varepsilon}^{-1} [(\beta^* - \beta) \lambda_F + \delta^* \lambda_G]$,

in probability.

Theorem B.1 shows that, if a researcher ignores some risk factors $G$, then the risk price estimators associated with the factors $F$ are inconsistent if and only if
\[
\beta' \Sigma_{\varepsilon\varepsilon}^{-1} [(\beta^* - \beta) \lambda_F + \delta^* \lambda_G] \neq 0.
\]
It is important to note that the inconsistency of the estimates of risk prices may be caused not only by the risk prices $\lambda$ of the omitted factors but also by the bias in betas of the factors $F$. This result has an important implication: even if the ignored factors have zero price of risk, the cross–sectional estimates of the prices of risk on the true factors included in the estimation ($F$) can still be asymptotically biased. This happens in case $F$ and $G$ are correlated.

Next, we analyse the asymptotic bias in the parameter estimates for $\alpha$, $\beta$ and $\lambda$ in case the factors are traded and the estimation is based on GMM with the moment conditions (3.2) of Section 3.2.

Theorem B.2. Assume that returns are generated by (B.1) but $\alpha$, $\beta$, and $\lambda$ are estimated from (B.2) with GMM based on (3.2). Then,

1. $\hat{\alpha}$ converges to $\alpha^* + (\beta^* - \beta) \lambda_F + \delta^* \lambda_G$,
2. $\hat{\beta}$ converges to $\beta^* + \delta^* \mathrm{Cov}[G, F^{\top}] \Sigma_{FF}^{-1}$,

3. $\hat{\lambda}$ converges to $\lambda_F$,

in probability.

Theorem B.2 illustrates that, even if the researcher forgets some risk factors, risk price estimators will still be asymptotically unbiased. Notice that this is in contrast with the estimator based on GMM with moment conditions (3.1) of Section 3.1. It is important to note that, if the forgotten factors $G$ are uncorrelated with the factors $F$, then the bias in $\beta$ disappears. Moreover, if the ignored factors are associated with zero prices of risk and are uncorrelated with $F$, then $\hat{\alpha}$ will converge to $\alpha$.

This raises the question of what happens to the risk–premium estimators on individual assets or portfolios if some true factors are ignored in the estimation. The following corollary provides consistency conditions for risk–premium estimators of individual assets or portfolios.

Corollary B.1. If the returns are generated by (B.1) and

1. the model (B.2) is estimated with GMM based on (3.1), then the vector of resulting risk–premium estimators $\hat{\beta}\hat{\lambda}$ converges to $E[R^e]$ if and only if $[I_N - \beta (\beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta)^{-1} \beta' \Sigma_{\varepsilon\varepsilon}^{-1}] E[R^e] = 0$.
2. all factors are traded and the model (B.2) is estimated with GMM based on (3.2), then the vector of resulting risk–premium estimators $\hat{\beta}\hat{\lambda}$ converges to $E[R^e]$ if and only if $(\beta^* - \beta) \lambda_F + \delta^* \lambda_G = 0$.

In view of the results above, if the factors are traded and the estimation is via GMM with moment conditions (3.2), then the risk–premium estimator is unbiased when the omitted factors are uncorrelated with the factors $F$ and the omitted factors are associated with zero prices of risk. In order to capture misspecification, it is a common approach to add an $N$–vector of constant terms, $\alpha$, to the model as in (B.2). In the following theorem, we will show that in case of traded factors, it is possible to achieve the

consistency for estimating risk premiums; however, this comes at the cost of losing all efficiency gains.

Theorem B.3. Assume that all factors in $F$ are traded. If the returns are generated by (B.1) but the model (B.2) is estimated with GMM based on (3.2), where the risk price estimates are given by the factor averages, then the estimator $\hat{\alpha} + \hat{\beta}\hat{\lambda}$ is consistent for $E[R^e]$. The asymptotic variance of such an estimator equals $\Sigma_{R^eR^e}$.

It is important to note that adding $\hat{\alpha}$ to $\hat{\beta}\hat{\lambda}$ does not solve the inconsistency problem if the system is estimated via GMM with (3.1). If some factors are non–traded and the parameters are estimated via GMM with (3.1), adding the $\hat{\alpha}$ capturing the misspecification to $\hat{\beta}\hat{\lambda}$ does not lead to consistent estimates of $E[R^e]$. In particular, $\hat{\alpha} + \hat{\beta}\hat{\lambda}$ converges to $E[R^e] - \beta(\lambda - E[F])$, and $\lambda - E[F]$ is generally nonzero.

Table A1 documents empirical values of the bias caused by estimating the CAPM model on 25 Fama French portfolios. There is a slight bias in ignoring the other factors, SMB and HML, about 1% on average across 25 Fama French portfolios for the general case and mimicking factor case, and about 1.5% for the traded factor case. Observe that adding $\hat{\alpha}$ to $\hat{\beta}\hat{\lambda}$ corrects for the bias for traded factors whereas it does not correct for the bias for the general case.

C Proofs for Section “Inference about Risk–Premiums with Omitted Factors”

Proof of Theorem B.1. Note that $\hat{\beta}$ converges to $\beta$ and $\hat{\alpha}$ converges to $\alpha$ in probability, where
\[
\begin{aligned}
\beta &= \mathrm{Cov}[R^e, F^{\top}] \Sigma_{FF}^{-1} \\
&= \mathrm{Cov}[\alpha^* + \beta^* F + \delta^* G + \varepsilon^*, F^{\top}] \Sigma_{FF}^{-1} \\
&= \beta^* + \delta^* \mathrm{Cov}[G, F^{\top}] \Sigma_{FF}^{-1},
\end{aligned} \tag{C.3}
\]

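and
\[
\begin{aligned}
\alpha &= E[R^e] - \beta E[F] \\
&= \alpha^* + \beta^* E[F] + \delta^* E[G] - \beta E[F] \\
&= \alpha^* + (\beta^* - \beta) E[F] + \delta^* E[G].
\end{aligned} \tag{C.4}
\]
Furthermore, for $\hat{\lambda}$, first notice that
\[
\hat{\lambda} = \left( \hat{\beta}' \hat{\Sigma}_{\varepsilon\varepsilon}^{-1} \hat{\beta} \right)^{-1} \hat{\beta}' \hat{\Sigma}_{\varepsilon\varepsilon}^{-1} \bar{R}^e. \tag{C.5}
\]
The probability limit of $\hat{\lambda}$ from GMM (3.1) is thus given by
\[
\begin{aligned}
\lambda &= \left( \beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta \right)^{-1} \beta' \Sigma_{\varepsilon\varepsilon}^{-1} [\beta^* \lambda_F + \delta^* \lambda_G] \\
&= \lambda_F + \left( \beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta \right)^{-1} \beta' \Sigma_{\varepsilon\varepsilon}^{-1} [(\beta^* - \beta) \lambda_F + \delta^* \lambda_G].
\end{aligned} \tag{C.6}
\]

Proof of Theorem B.2. Note that $\hat{\beta}$ converges to $\beta$ and $\hat{\alpha}$ converges to $\alpha$ in probability, with
\[
\begin{aligned}
\beta &= \mathrm{Cov}[R^e, F^{\top}] \Sigma_{FF}^{-1} \\
&= \mathrm{Cov}[\alpha^* + \beta^* F + \delta^* G + \varepsilon^*, F^{\top}] \Sigma_{FF}^{-1} \\
&= \beta^* + \delta^* \mathrm{Cov}[G, F^{\top}] \Sigma_{FF}^{-1},
\end{aligned} \tag{C.7}
\]
and
\[
\begin{aligned}
\alpha &= E[R^e] - \beta E[F] \\
&= \alpha^* + \beta^* \lambda_F + \delta^* \lambda_G - \beta \lambda_F \\
&= \alpha^* + (\beta^* - \beta) \lambda_F + \delta^* \lambda_G.
\end{aligned} \tag{C.8}
\]
Furthermore, for $\hat{\lambda}_F$, notice that $\hat{\lambda}_F = \bar{F}$, which converges to $\lambda_F = E[F]$ in probability.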
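The probability limits in (C.3)/(C.7) and (C.4) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: one included factor $F$, one omitted factor $G$, jointly normal and correlated, with i.i.d. standard normal residuals. All parameter values below are hypothetical and chosen for illustration only; they are not calibrated to the paper's data.

```python
import numpy as np

rng = np.random.default_rng(42)
T, N = 200_000, 3

# Hypothetical parameters (illustrative only).
mean_fg = np.array([0.5, 0.3])                 # E[F], E[G]
cov_fg = np.array([[1.0, 0.5],
                   [0.5, 1.0]])                # F and G are correlated
alpha_star = np.zeros(N)
beta_star = np.array([0.8, 1.0, 1.2])
delta_star = np.array([0.5, -0.3, 0.2])

# Simulate returns from the true two-factor model (B.1).
FG = rng.multivariate_normal(mean_fg, cov_fg, size=T)
F, G = FG[:, 0], FG[:, 1]
eps = rng.normal(size=(T, N))
R = alpha_star + np.outer(F, beta_star) + np.outer(G, delta_star) + eps

# Estimate the misspecified model (B.2): regress R on a constant and F only.
X = np.column_stack([np.ones(T), F])
coef, *_ = np.linalg.lstsq(X, R, rcond=None)
alpha_hat, beta_hat = coef[0], coef[1]

# Probability limits: beta* + delta* Cov[G, F] / Var[F], and
# alpha* + (beta* - beta) E[F] + delta* E[G].
beta_limit = beta_star + delta_star * cov_fg[1, 0] / cov_fg[0, 0]
alpha_limit = alpha_star + (beta_star - beta_limit) * mean_fg[0] + delta_star * mean_fg[1]

print(np.round(beta_hat - beta_limit, 3))
print(np.round(alpha_hat - alpha_limit, 3))
```

With the omitted factor correlated with $F$, the estimated betas settle near the biased limit rather than near `beta_star`; setting the off-diagonal of `cov_fg` to zero removes the bias in beta, in line with the discussion following Theorem B.2.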

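Proof of Corollary B.1. For the first part of the corollary, note that, with $\lambda$ the probability limit in (C.6),
\[
\beta\lambda = \beta \left( \beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta \right)^{-1} \beta' \Sigma_{\varepsilon\varepsilon}^{-1} E[R^e]. \tag{C.9}
\]
Hence, $\hat{\beta}\hat{\lambda}$ is consistent for $E[R^e]$ if and only if $E[R^e] = \beta (\beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta)^{-1} \beta' \Sigma_{\varepsilon\varepsilon}^{-1} E[R^e]$. This is, in turn, equivalent to
\[
\left[ I_N - \beta \left( \beta' \Sigma_{\varepsilon\varepsilon}^{-1} \beta \right)^{-1} \beta' \Sigma_{\varepsilon\varepsilon}^{-1} \right] E[R^e] = 0. \tag{C.10}
\]
To prove the second part of the corollary, note that $\hat{\beta}\hat{\lambda}$ converges to $\beta\lambda_F$. Using (C.7) and $\lambda_F = E[F]$, we have
\[
\begin{aligned}
\beta \lambda_F &= \left( \beta^* + \delta^* \mathrm{Cov}[G, F^{\top}] \Sigma_{FF}^{-1} \right) \lambda_F \\
&= E[R^e] - \left( (\beta^* - \beta) \lambda_F + \delta^* \lambda_G \right).
\end{aligned} \tag{C.11}
\]

Proof of Theorem B.3. Consistency of $\hat{\alpha} + \hat{\beta}\hat{\lambda}_F$ is straightforward. The asymptotic variance is given by the delta method using $g(\alpha, \beta, \lambda_F) = \alpha + \beta\lambda_F$. The asymptotic covariance matrix of $\alpha$, $\beta$, and $\lambda$ is given in Lemma A.3 (denoted by $V$). Thus,
\[
\sqrt{T} \left( g(\hat{\alpha}, \hat{\beta}, \hat{\lambda}) - g(\alpha, \beta, \lambda) \right) \xrightarrow{d} N(0, \dot{g}' V_{\alpha,\beta,\lambda} \dot{g}), \tag{C.12}
\]
with
\[
\dot{g} = \begin{bmatrix} [1 \;\; \lambda'] \otimes I_N & \beta \end{bmatrix}.
\]
Carrying out the matrix multiplication $\dot{g}' V_{\alpha,\beta,\lambda} \dot{g}$ gives $\Sigma_{R^eR^e}$.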
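The final matrix multiplication in the proof of Theorem B.3 can be verified numerically. The sketch below builds $V$ as stated in Lemma A.3 and the Jacobian $\dot{g}$ from (C.12), then compares the result with $\beta \Sigma_{FF} \beta' + \Sigma_{\varepsilon\varepsilon}$. It assumes traded factors ($\lambda = \mu_F$) and that $\Sigma_{R^eR^e} = \beta \Sigma_{FF} \beta' + \Sigma_{\varepsilon\varepsilon}$, which holds when the residuals are uncorrelated with the factors. All inputs are hypothetical random matrices, and $\dot{g}$ is stored as an $N \times (N(K+1)+K)$ array, so the quadratic form is computed as `g_dot @ V @ g_dot.T`.

```python
import numpy as np

# Hypothetical inputs (random, for illustration only).
rng = np.random.default_rng(1)
N, K = 4, 2
A = rng.normal(size=(K, K))
sigma_ff = A @ A.T + np.eye(K)
B = rng.normal(size=(N, N))
sigma_ee = B @ B.T + np.eye(N)
beta = rng.normal(size=(N, K))
mu_f = rng.normal(size=K)
lam = mu_f                      # traded factors: lambda = E[F]

# Asymptotic covariance V of (alpha', vec(beta)', lambda')' from Lemma A.3.
sff_inv = np.linalg.inv(sigma_ff)
M = np.block([
    [np.array([[1.0 + mu_f @ sff_inv @ mu_f]]), -(mu_f @ sff_inv)[None, :]],
    [-(sff_inv @ mu_f)[:, None], sff_inv],
])
V = np.block([
    [np.kron(M, sigma_ee), np.zeros((N * (K + 1), K))],
    [np.zeros((K, N * (K + 1))), sigma_ff],
])

# Jacobian of g(alpha, beta, lambda) = alpha + beta @ lambda.
one_lam = np.concatenate(([1.0], lam))[None, :]          # [1  lambda'], shape (1, K+1)
g_dot = np.hstack([np.kron(one_lam, np.eye(N)), beta])   # shape (N, N(K+1)+K)

avar = g_dot @ V @ g_dot.T
sigma_rere = beta @ sigma_ff @ beta.T + sigma_ee
print(np.allclose(avar, sigma_rere))
```

The check works because $[1 \;\; \lambda'] M [1 \;\; \lambda']'$ collapses to 1 when $\lambda = \mu_F$, leaving $\Sigma_{\varepsilon\varepsilon} + \beta \Sigma_{FF} \beta'$.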

Table A1: Omitted Factors

This table provides the average risk premium estimates on average across assets, $\hat{\beta}\hat{\lambda}$ (first line), its bias, the average risk premium estimates augmented with an alpha on average across assets, $\hat{\alpha} + \hat{\beta}\hat{\lambda}$, and its bias. The test assets are the 25 portfolios formed by Fama–French (1992, 1993). The factor is the market factor in a standard CAPM. The results are based on monthly data from January 1963 until August 2020, i.e., 692 observations for each portfolio. The first column presents the improvements for the factor–model based risk–premium estimates based on GMM with (3.1) over the naive estimate of historical averages. The second and the third columns present the gains of factor–model based risk–premium estimates based on GMM with (3.2) and with (3.4) over naive estimates, respectively.

                                             General Case   Traded Factors   Mimicking Factors
$\hat{\beta}\hat{\lambda}$                       7.73            7.38              7.73
Bias                                            -1.05           -1.39             -1.05
$\hat{\alpha} + \hat{\beta}\hat{\lambda}$        9.12            8.77              8.77
Bias                                             0.35            0.00              0.00

Figure 1: Factor-Model-Based Standard Error Estimates of Expected Returns

Panels: (a) T=240, (b) T=480, (c) T=960

This figure reports the standard errors of the various expected return estimators for the 25 Fama-French portfolios from 20000 simulations with sample size T = 240 (panel a), T = 480 (panel b), and T = 960 (panel c). The first exhibit shows the root-mean-square errors (RMSE) of the expected return estimates based on naive, GMM-Gen, GMM-Tr and GMM-Mim. The second exhibit illustrates the simulation average of the estimated asymptotic standard errors (AEST), and the third exhibit reports the percentage error (PE) of the AEST as compared to the RMSE. Returns are generated from a one-factor CAPM with a normally distributed market factor and error terms. The moments of the market factor and the error terms and the parameters of the data generating process are calibrated to CAPM model with 25 FF portfolios with data from 1963:1-2020:8.

Figure 2: Factor-Model-Based Standard Error Estimates of Expected Returns: Weak Identification

Panels: (a) T=240, (b) T=480, (c) T=960

This figure reports the standard errors of the various expected return estimators for the 25 Fama-French portfolios in a one-factor model with a weak factor. The results are from 20000 simulations with sample sizes T = 240 (panel a), T = 480 (panel b), and T = 960 (panel c). The first exhibit shows the root-mean-square errors (RMSE) of the expected return estimates based on naive (historic), GMM-Gen (General), and GMM-Tr (Traded). The second exhibit illustrates the simulation average of the estimated asymptotic standard errors (AEST), and the third exhibit reports the percentage error (PE) of the AEST as compared to the RMSE. Returns are generated from a 1–factor CAPM with
\[
R_t^e = \alpha + \frac{B}{\sqrt{T}} R_{m,t}^e + \varepsilon_t, \qquad t = 1, 2, \ldots, T,
\]
with normally distributed $F_t$ and $\varepsilon_t$ and the asset pricing restriction $E[R_t^e] = \frac{B}{\sqrt{T}} \lambda$, with data from 1963:1-2020:8.

Table 1: Improvements in Efficiency for the 25 Fama–French Portfolios (in percentage)

This table illustrates the gains in variances (in percentage) for the various risk–premium estimates for the 25 portfolios formed by Fama and French (1992, 1993). The factors are the three factors from Fama and French (1992): market, size and book–to–market. The results are based on monthly data from January 1963 until August 2020, i.e., 692 observations for each portfolio. The first column ($RP_{GMM}$–Gen over Naive) presents the improvements for the factor–model based risk–premium estimates based on GMM with (3.1) over the naive estimate of historical averages. The second ($RP_{GMM}$–Tr over Naive) and the third ($RP_{GMM}$–Mim over Naive) columns present the gains of factor–model based risk–premium estimates based on GMM with (3.2) or (3.4) over naive estimates, respectively. The fourth column ($RP_{GMM}$–Tr over $RP_{GMM}$–Gen) corresponds to the precision gains from estimating the risk premiums based on GMM using the moment conditions (3.2) over the case based on GMM with (3.1). The last column ($RP_{GMM}$–Mim over $RP_{GMM}$–Gen) presents the gains from making use of mimicking portfolios using (3.4) over estimation based on (3.1).

In the column headers below, Gen, Tr and Mim stand for $RP_{GMM}$–Gen, $RP_{GMM}$–Tr and $RP_{GMM}$–Mim.

Assets   Gen over   Tr over   Mim over   Tr over   Mim over   Mim over
         Naive      Naive     Naive      Gen       Gen        Tr
1          8.9        9.6       8.8        0.6      -0.1       -0.7
2          6.9        7.5       6.9        0.5      -0.1       -0.6
3          4.3        4.8       4.3        0.5      -0.1       -0.5
4          4.7        5.2       4.6        0.5      -0.1       -0.5
5          4.8        5.4       4.8        0.5      -0.1       -0.6
6          4.8        5.5       4.7        0.7      -0.1       -0.8
7          5.0        5.5       4.9        0.5      -0.1       -0.6
8          7.0        7.6       6.9        0.5      -0.1       -0.6
9          4.9        5.5       4.8        0.6      -0.1       -0.7
10         4.2        4.8       4.1        0.6      -0.1       -0.7
11         4.6        5.4       4.5        0.7      -0.1       -0.8
12         7.6        8.2       7.5        0.6      -0.1       -0.6
13         9.9       10.6       9.8        0.7      -0.1       -0.7
14         8.2        9.0       8.1        0.7      -0.1       -0.8
15        10.1       11.0      10.0        0.8      -0.1       -0.9
16         6.1        6.9       6.0        0.8      -0.1       -0.8
17        10.8       11.6      10.7        0.7      -0.1       -0.8
18        12.3       13.3      12.2        0.8      -0.1       -0.9
19        11.3       12.3      11.2        0.9      -0.1       -1.0
20        12.2       13.3      12.1        0.9      -0.1       -1.1
21         4.8        5.8       4.7        0.9      -0.1       -1.0
22         9.5       10.5       9.3        0.9      -0.1       -1.0
23        15.1       16.3      15.0        1.1      -0.1       -1.2
24        10.3       11.8      10.1        1.3      -0.2       -1.5
25        22.7       24.3      22.5        1.3      -0.2       -1.5

Table 2: Out–of–Sample Sharpe Ratios based on various risk–premium estimates

This table provides the average out–of–sample Sharpe ratio over simulations, $\overline{SR}$ (first line), its percentage error compared to the true Sharpe ratio, $(\overline{SR} - SR)/SR$ (second line), and the root–mean–squared errors, the square root of $\sum_{s=1}^{Z} (\widehat{SR}_s - SR)^2 / Z$ (third line), over 20,000 simulated data sets for optimal portfolios constructed with various risk premium estimates. The data generating process is
\[
R_t^e = \alpha + \beta F_t + \varepsilon_t, \qquad t = 1, 2, \ldots, T,
\]
with normally distributed $F_t$ and $\varepsilon_t$ and the asset pricing restriction $E[R_t^e] = \beta\lambda$. $R_t^e$ is the $N$–vector of asset returns at period $t$, $F_t$ is the $K$–vector of factors and $T$ is the number of periods. The moments of factors and residuals and the parameters of the data generating process are obtained from a calibration of the Fama–French 3 factor model from January 1963 to August 2020. The risk premium estimates are based on naive, GMM with moment conditions (3.1) ($RP_{GMM}$–Gen), (3.2) ($RP_{GMM}$–Tr) and (3.4) ($RP_{GMM}$–Mim). The variance–covariance matrix is estimated by the sample variance covariance matrix. $T$ is assumed to be 692 and risk aversion is 5.

$\gamma = 5$, Enlarging Samples. In the column headers below, Gen, Tr and Mim stand for $RP_{GMM}$–Gen, $RP_{GMM}$–Tr and $RP_{GMM}$–Mim.

                          True $\mu^e$   Naive    Gen      Tr       Mim      GMV     1/N     Theoretical
True $\Sigma_{RR}$
  Average SR                 0.181       0.104    0.157    0.157    0.157    0.11    0.147   0.178
  Percentage error           0.015      -0.419   -0.118   -0.117   -0.118
  RMSE                       0.003       0.091    0.054    0.054    0.054    0.047   0.047
$\hat{\Sigma}_{RR}$
  Average SR                 0.175       0.101    0.161    0.162    0.161    0.109   0.147   0.178
  Percentage error          -0.016      -0.433   -0.095   -0.093   -0.095
  RMSE                       0.003       0.093    0.053    0.053    0.053    0.048   0.047
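The out–of–sample Sharpe ratio reported in Table 2 can be sketched as follows. This is an illustration under assumptions that are not confirmed by the text above: the portfolio is taken to be the unconstrained mean–variance portfolio $w = \gamma^{-1} \hat{\Sigma}^{-1} \hat{\mu}$, and the out–of–sample Sharpe ratio is evaluated at the true moments as $w'\mu / \sqrt{w' \Sigma w}$. The function names and all numerical inputs are hypothetical.

```python
import numpy as np

def mean_variance_weights(mu_hat, sigma_hat, gamma=5.0):
    """Unconstrained mean-variance weights w = (1/gamma) * Sigma^{-1} mu."""
    return np.linalg.solve(sigma_hat, mu_hat) / gamma

def out_of_sample_sharpe(mu_hat, sigma_hat, mu_true, sigma_true, gamma=5.0):
    """Sharpe ratio of the estimated portfolio, evaluated at the true moments."""
    w = mean_variance_weights(mu_hat, sigma_hat, gamma)
    return (w @ mu_true) / np.sqrt(w @ sigma_true @ w)

# Hypothetical true moments (random, for illustration only).
rng = np.random.default_rng(7)
N = 5
C = rng.normal(size=(N, N))
sigma_true = C @ C.T + N * np.eye(N)
mu_true = rng.uniform(0.2, 1.0, size=N)

# Maximum attainable Sharpe ratio: sqrt(mu' Sigma^{-1} mu).
sr_max = np.sqrt(mu_true @ np.linalg.solve(sigma_true, mu_true))

# Using the true expected returns attains the maximum; a noisy estimate cannot exceed it.
sr_true_inputs = out_of_sample_sharpe(mu_true, sigma_true, mu_true, sigma_true)
mu_noisy = mu_true + rng.normal(scale=0.5, size=N)
sr_noisy = out_of_sample_sharpe(mu_noisy, sigma_true, mu_true, sigma_true)
print(sr_true_inputs, sr_noisy, sr_max)
```

Under these assumptions, estimation noise in the expected returns can only lower the out–of–sample Sharpe ratio relative to the theoretical value, which is the mechanism by which more precise risk–premium estimates improve portfolio performance.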
