feds · April 12, 2017

Measuring Transaction Costs in the Absence of Timestamps

Abstract

This paper develops measures of transaction costs in the absence of transaction timestamps and information about who initiates transactions, which are data limitations that often arise in studies of over-the-counter markets. I propose new measures of the effective spread, study the performance of all estimators analytically and in simulations, and present an empirical illustration with small-cap stocks for the 2005-2014 period. My theoretical, simulation, and empirical results provide new insights into measuring transaction costs and may help guide future empirical work.

Finance and Economics Discussion Series
Divisions of Research & Statistics and Monetary Affairs
Federal Reserve Board, Washington, D.C.

Measuring Transaction Costs in the Absence of Timestamps

Filip Zikes

2017-045

Please cite this paper as:
Zikes, Filip (2017). “Measuring Transaction Costs in the Absence of Timestamps,” Finance and Economics Discussion Series 2017-045. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2017.045.

NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Filip Zikes∗
Federal Reserve Board
April 6, 2017

JEL Classification: C14, C15, G20

Keywords: transaction costs, effective spread, simulated method of moments, time-varying estimation

∗ Board of Governors of the Federal Reserve System, Division of Financial Stability, 20th Street and Constitution Avenue N.W., Washington, D.C. 20551, United States. Phone: +1-202-475-6617. Email: filip.zikes@frb.gov. I am grateful to Evangelos Benos, Dobrislav Dobrev, Erik Hjalmarsson, Ivan Ivanov, John Schindler, Michalis Vasios, and Yang-Ho Park for valuable comments, and to Eric Parolin and Margaret Yellen for excellent research assistance. The results reported in the paper were generated using programs written in SAS and in the Ox language of Doornik (2007); the programs are available on request. The views expressed in this paper are the sole responsibility of the author and should not be interpreted as representing the views of the Federal Reserve Board or any other person associated with the Federal Reserve System.

1 Introduction

This paper develops measures of transaction costs that do not require observing intraday transaction times or knowing who initiates trades (buyers vs. sellers). A recent and growing literature on large and previously opaque over-the-counter (OTC) markets employs transaction data that suffer from these limitations. Examples include studies of the credit default swap market (Chen et al., 2011; Benos, Wetherilt, and Zikes, 2013; Biswas, Nikolova, and Stahel, 2014), the interest rate swap market (Chen et al., 2012; Benos, Payne, and Vasios, 2016), the U.K. sovereign bond market (Benos and Zikes, 2016), and the U.S. corporate bond market (Bessembinder et al., 2006). But to the best of my knowledge, the literature has not yet formally tackled the problem of estimating transaction costs when timestamps and trade direction are missing.

The contribution of this paper is to fill this gap. I propose three consistent estimators of the effective spread and study their sampling properties. The first one develops the idea of Benos and Zikes (2016), who suggest inferring the effective spread from the dispersion of the transaction prices around (1) some benchmark or reference price (e.g., the end-of-day composite quote) and (2) the average transaction price. The estimator is available in closed form, which allows me to establish its finite-sample properties analytically and compare them to some well-known (infeasible) measures, explicitly quantifying the loss of information due to the missing timestamps, trade direction, or both.

The other two measures I propose are also moment-based and combine the ideas of Corwin and Schultz (2012) and Benos and Zikes (2016). The first one is based on the daily range, which is the difference between the daily high and low prices, together with the sample variance of the transaction prices. The advantage of this measure is that it is based solely on transaction prices and does not require a daily benchmark or reference price, which may be difficult to obtain for some illiquid assets. At the same time, it utilizes all available data (transaction prices), unlike the Corwin and Schultz (2012) measure, which only uses the daily high and low prices. If a benchmark price is available, however, it is, of course, optimal to use all three moment conditions (the two dispersion metrics and the daily range), and this is how I construct my third estimator. Because I do not assume that the number of transactions is large, I have to resort to the simulated method of moments (SMM). Despite having to approximate the expected range by simulation, the estimator turns out to be computationally cheap and easy to implement in practice. I provide simple yet accurate small-sample approximations to the unknown moment that significantly speed up computations.

To summarize my theoretical results, I find that the absence of timestamps or trade direction leads to a reduced convergence rate of the effective spread estimators. While in the case of full information the effective spread can be estimated $n$-consistently as the number of intraday transactions ($n$) increases, when timestamps or trade direction are missing, only $\sqrt{n}$-consistency can be achieved, and when both are missing, the effective spread cannot be estimated consistently from intraday data alone; averaging over an increasing number of days ($T$) is necessary. Thus, accurate estimates can only be obtained from weeks' or months' worth of transaction data.

In practice, however, transaction costs may vary at a higher frequency. Is it possible, then, to estimate transaction costs that vary, say, every day when timestamps and trade direction are missing? The answer is yes, provided the transaction costs vary sufficiently smoothly. Employing the recent advances in time-varying estimation by Giraitis, Kapetanios, and Yates (2013), I propose kernel-based estimators of the smoothly varying effective spread and establish their asymptotic properties. Doing so allows me to uncover smooth changes in transaction costs over time without relying on an increasing number of intraday observations.

To corroborate the theoretical findings and to study how the various estimators perform in small samples, I run Monte Carlo simulations. I also provide an empirical illustration with small-cap stocks listed on the New York Stock Exchange (NYSE) using data from the Trade and Quote Database (TAQ). To summarize, the findings of these exercises show that the loss of information due to missing timestamps is large. The measures I propose in this paper deliver accurate estimates of transaction costs in situations where the transaction costs are high relative to the volatility of the efficient price. They may therefore be suitable for relatively illiquid, infrequently traded assets that exhibit relatively low fundamental volatility, such as some corporate and municipal bonds. They should not be applied, however, to highly liquid assets, such as listed equities, which trade with a tight spread and tend to be quite volatile. Fortunately, for these assets, high-quality time-stamped transaction data are typically available.

In OTC markets, timestamps are often inaccurate or outright missing for various reasons. In the credit default swap and interest rate swap data mentioned previously, transaction times are simply not reported, and the trade reporting time does not necessarily correspond to the actual trade time, making it impossible to chronologically order transactions. In the U.K. sovereign bond market data used by Benos and Zikes (2016), the timestamps are not accurate in the sense that two parties to the same transaction report widely different transaction times. More generally, though, the trading protocol in OTC markets often involves negotiation that may stretch over a period of time, and so the exact timing of the trade may be ambiguous; consider, for example, the “workup” protocol recently studied in Duffie and Zhu (forthcoming).

Trade direction cannot be easily inferred because trades cannot be aligned with intraday quotes when timestamps are missing, making it impossible to use trade-signing algorithms such as that of Lee and Ready (1991). Researchers often assume that clients initiate trades with dealers, motivated by the fact that dealers are the main liquidity providers in these markets (Bessembinder, Maxwell, and Venkataraman, 2006; Edwards, Harris, and Piwowar, 2007). However, as recently shown by Choi and Huh (2017) for the U.S. corporate bond market, dealers often initiate trades with clients as well, implying potentially serious misclassification issues associated with this identification method. Moreover, in some markets, interdealer trades account for more than two-thirds of all transactions (Benos et al., 2013), and so the vast majority of transactions cannot be signed in this way. Thus, standard methods cannot be used to measure transaction costs in these large and important financial markets.

Apart from the literature on measuring transaction costs (see Harris, 2015, and the references therein), my paper is also related to the recent literature on measuring volatility using high-frequency data starting with Andersen and Bollerslev (1998) and Barndorff-Nielsen and Shephard (2002); see Aït-Sahalia and Jacod (2014) for a recent textbook treatment. Some of my estimators employ the range (that is, the difference between intraday high and low prices), and here I draw on the ideas of Christensen and Podolskij (2007) and Christensen, Podolskij, and Vetter (2009). Although the data-generating process I assume is very similar to that in many papers in this literature, the problem studied in my paper is different in three important ways. First, my goal is to estimate the effective spread. Thus, what the realized volatility literature (e.g., Aït-Sahalia, Mykland, and Zhang, 2005; Zhang, Mykland, and Aït-Sahalia, 2005; Hansen and Lunde, 2006; Christensen, Podolskij, and Vetter, 2009) treats as microstructure noise is precisely my object of interest, and what that literature is interested in estimating, the variation of the efficient price, is a source of noise in my framework. Second, in-fill asymptotics do not always apply; that is, increasing the number of intraday observations ($n$) does not generally improve the precision of the effective spread estimator. I have to rely on an increasing number of days ($T$) and employ large-$T$ asymptotics or double asymptotics (both $n \to \infty$ and $T \to \infty$). Finally, my estimation framework is model based, unlike the estimation methods in the realized volatility literature that operate in a model-free environment.

The rest of the paper is organized as follows. In Section 2, I set out my theoretical framework. In Section 3, I propose a closed-form measure of the effective spread that does not require either timestamps or trade direction and study its properties analytically, explicitly quantifying the loss of information associated with missing timestamps, trade direction, or both. In Section 4, I introduce range-based estimators of the effective spread and propose simple computational methods to implement the estimators in practice. Section 5 reports Monte Carlo simulations. In Section 6, I propose kernel-based time-varying estimation of the effective spread. In Section 7, I present an empirical application to small-cap equities, and Section 8 concludes. Proofs are collected in the Appendix.

2 Framework

The effective spread is defined as two times the difference between the actual transaction price ($P$) and the prevailing mid-quote or some proxy for the true value of the asset (efficient price) ($M$) at the time of the transaction. It can be expressed in absolute terms, that is, $2|P - M|$, or in relative terms, that is, $2|P - M|/M$ or $2|\log(P) - \log(M)|$. Like the bid-offer spread, the effective spread measures round-trip transaction costs, but it is based on actual transaction prices rather than on quoted prices. The effective spread can also be seen as a measure of the price impact of a trade, and because the price impact and transaction costs tend to vary inversely with liquidity, it is frequently used as a measure of liquidity (Foucault, Pagano, and Roell, 2013).

My theoretical framework is essentially borrowed from Roll (1984), and it can be easily cast in continuous time as in Christensen, Podolskij, and Vetter (2009). Suppose we have a sample of $T$ days and divide each day into $n$ subintervals of equal length. I assume that a transaction arrives at the beginning of each of these subintervals and that the associated logarithmic transaction prices, $p_{i,t}$, $i = 1,\ldots,n$, $t = 1,\ldots,T$, are related to the logarithmic efficient price, $m_{i,t}$, by

$$p_{i,t} = m_{i,t} + \frac{s}{2}\, q_{i,t}, \tag{1}$$

where $s$ is the proportional effective spread and $q_{i,t}$ is a binary variable indicating whether the $i$-th transaction on day $t$ is buyer initiated ($q_{i,t} = 1$) or seller initiated ($q_{i,t} = -1$). I initially assume that the efficient price is observable at the end of the day, that is, at the end of the last subinterval $n$. I later relax this assumption and propose estimators that do not require observing $m$ at all.

Following Roll (1984) and Benos and Zikes (2016), I assume that the logarithmic efficient price $m$ follows a random walk with independently and identically distributed (iid) increments:

$$m_{i+1,t} = m_{i,t} + \epsilon_{i+1,t}, \tag{2}$$

where $E(\epsilon_{i,t}) = 0$ and $E(\epsilon_{i,t}^2) = \sigma^2/n$. Thus, the daily integrated variance of the efficient price equals $\sigma^2$ for any $n$ and $t$. Finally, I assume that $q_{i,t}$ is uncorrelated with $m_{j,s}$ for all $i, j, t, s$ and that $q_{i,t}$ is serially uncorrelated with $P(q_{i,t} = 1) = \frac{1}{2}$; that is, there is the same number of buyer- and seller-initiated trades on average. I make no assumptions on the overnight return of the efficient price, $m_{0,t} - m_{n,t-1}$.

3 Baseline estimator

Inspired by Jankowitsch, Nashikkar, and Subrahmanyam (2011), Benos and Zikes (2016) rely on the dispersion of transaction prices around some benchmark price, but they recognize that the dispersion metric is affected by the intraday volatility of the benchmark price in a nontrivial way. They suggest using two dispersion metrics,

$$\hat d_t^2 = \frac{1}{n}\sum_{i=1}^{n} (p_{i,t} - m_{0,t})^2, \qquad \tilde d_t^2 = \frac{1}{n-1}\sum_{i=1}^{n} (p_{i,t} - \bar p_t)^2, \tag{3}$$

where $\bar p_t = \frac{1}{n}\sum_{i=1}^{n} p_{i,t}$ is the average transaction price on day $t$, and show that under the assumptions stated in the previous section, the two metrics satisfy

$$E(\hat d_t^2) = \frac{s^2}{4} + \frac{\sigma^2}{2}\left(\frac{n+1}{n}\right), \qquad E(\tilde d_t^2) = \frac{s^2}{4} + \frac{\sigma^2}{6}\left(\frac{n+1}{n}\right). \tag{4}$$

Since $3E(\tilde d_t^2) - E(\hat d_t^2) = s^2/2$, solving for $s^2$, censoring at zero, and taking the square root yield the relative effective spread measure:

$$ES_t = \sqrt{\max\{2(3\tilde d_t^2 - \hat d_t^2),\, 0\}}. \tag{5}$$

My baseline estimator develops the idea of Benos and Zikes (2016). I define $\hat s_t^2 = 2(3\tilde d_t^2 - \hat d_t^2)$, where $\hat d_t^2$ and $\tilde d_t^2$ are given in equation (3), and start by deriving the variance of $\hat s_t^2$, as it will be invoked repeatedly in the paper.

Proposition 1 Provided that the fourth moment of $\epsilon_{1,1}$ exists,

$$\mathrm{Var}(\hat s_t^2) = \frac{9s^4}{2n(n-1)} + \frac{2s^2\sigma^2(2n^2+3n+1)}{n^2(n-1)} + \frac{2(2n\sigma^4 + \sigma^4 + 2\kappa)(2n^3+7n^2+7n+2)}{15n^3(n-1)}, \tag{6}$$

where $\kappa = E(\epsilon_{1,1}^4) - 3\sigma^4$ is the excess kurtosis of $\epsilon_{1,1}$.

Equation (6) implies that although $\hat s_t^2$ is an unbiased estimator of $s^2$, it is not consistent as the number of intraday transactions increases because $\mathrm{Var}(\hat s_t^2) = \frac{8}{15}\sigma^4 + O(n^{-1})$ as $n \to \infty$. This result is due to the fact that we are averaging random walks in levels (prices) as opposed to first differences (returns), which cannot be constructed due to missing timestamps.

To derive a consistent estimator of $s$ based on $\hat s_t^2$, we need to average $\hat s_t^2$ over an increasing number of days before censoring at zero and taking the square root as in equation (5). The resulting estimator, which I denote by $ES_T^{(1)}$, is thus given by

$$ES_T^{(1)} = \sqrt{\max\left\{\frac{1}{T}\sum_{t=1}^{T} 2(3\tilde d_t^2 - \hat d_t^2),\, 0\right\}}. \tag{7}$$
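The construction in equations (3), (5), and (7) can be sketched in a few lines of Python. This is only an illustrative sketch, not the SAS/Ox code used for the paper: the function names (`simulate_day`, `s2_hat`, `es1`), the parameter values, and the use of normally distributed innovations are assumptions made for the example. Data are simulated from equations (1) and (2) with $m_{0,t}$ normalized to zero.

```python
import math
import random


def simulate_day(n, s, sigma, rng):
    """Draw one day from equations (1)-(2); returns (m0, list of log prices)."""
    m0 = 0.0  # benchmark log price, normalized to zero (illustrative choice)
    m = m0
    prices = []
    for _ in range(n):
        m += rng.gauss(0.0, sigma / math.sqrt(n))  # efficient-price increment
        q = 1 if rng.random() < 0.5 else -1        # trade sign (unobserved in practice)
        prices.append(m + 0.5 * s * q)
    return m0, prices


def s2_hat(prices, m0):
    """Daily estimate of s^2: 2 * (3 * d_tilde^2 - d_hat^2), see equations (3)-(5)."""
    n = len(prices)
    p_bar = sum(prices) / n
    d_hat2 = sum((p - m0) ** 2 for p in prices) / n
    d_tilde2 = sum((p - p_bar) ** 2 for p in prices) / (n - 1)
    return 2.0 * (3.0 * d_tilde2 - d_hat2)


def es1(days):
    """Equation (7): average the daily estimates, censor at zero, take the root."""
    mean_s2 = sum(s2_hat(prices, m0) for m0, prices in days) / len(days)
    return math.sqrt(max(mean_s2, 0.0))


rng = random.Random(0)  # fixed seed so the example is reproducible
days = [simulate_day(n=20, s=0.02, sigma=0.005, rng=rng) for _ in range(400)]
estimate = es1(days)
print("ES(1) estimate:", estimate, "true s: 0.02")
```

Note that `s2_hat` never uses the order of the prices within the day or the trade signs, which is exactly why the measure remains feasible without timestamps and trade direction.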

Given the nonlinear nature of the estimator, $E(ES_T^{(1)})$ and $\mathrm{Var}(ES_T^{(1)})$ are not available in closed form. I employ a Taylor series expansion of $ES_T^{(1)}$ around $s$, $s > 0$, together with equation (6), to establish the leading terms (as $T \to \infty$). The leading term of the bias reads

$$\lim_{T\to\infty} E[T(ES_T^{(1)} - s)] = -\frac{1}{15}\frac{\sigma^4}{s^3} - \left(\frac{1}{3}\frac{\sigma^4}{s^3} + \frac{1}{15}\frac{\kappa}{s^3} + \frac{1}{2}\frac{\sigma^2}{s}\right)\frac{1}{n} + O\left(\frac{1}{n^2}\right), \tag{8}$$

implying that $ES_T^{(1)}$ tends to underestimate the true effective spread. The limiting variance reads

$$\lim_{T\to\infty} \mathrm{Var}[\sqrt{T}(ES_T^{(1)} - s)] = \frac{2}{15}\frac{\sigma^4}{s^2} + \left(\frac{2}{3}\frac{\sigma^4}{s^2} + \frac{2}{15}\frac{\kappa}{s^2} + \sigma^2\right)\frac{1}{n} + O\left(\frac{1}{n^2}\right). \tag{9}$$

As expected, the absolute bias and variance decrease with the signal-to-noise ratio (SNR) $s/\sigma$, so it is more difficult to estimate the effective spread when it is small relative to the volatility of the efficient price. The absolute bias and variance of $ES_T^{(1)}$ also increase with excess kurtosis, but this only matters when the number of transactions is small; the contribution of $\kappa$ vanishes as $n \to \infty$. The second terms in the expansions also show that for sufficiently large $n$, the absolute bias and variance of $ES_T^{(1)}$ decrease with the number of transactions, as the coefficients on the $n^{-1}$ terms in equations (8) and (9) are always positive.

It follows from the assumptions stated in Section 2 and standard limit theorems that as $T \to \infty$, $ES_T^{(1)} \to_p s$ and, if $s > 0$ and $\kappa < \infty$, $\sqrt{T}(ES_T^{(1)} - s) \to_d N(0, \omega^2)$, where $\omega^2 = \frac{1}{4s^2}\mathrm{Var}(\hat s_t^2)$. The limiting variance of $ES_T^{(1)}$ has a particularly simple form if we consider the asymptotics where both $T, n \to \infty$, which may be appropriate in situations where the number of daily transactions is large. Then, from equation (9), we obtain $\sqrt{T}(ES_T^{(1)} - s) \to_d N\left(0, \frac{2\sigma^4}{15s^2}\right)$. Feasible inference can be obtained by replacing the unknown $s$ and $\sigma^2$ in the limiting variance with their sample counterparts, $ES_T^{(1)}$ and $\hat\sigma_T^2$, respectively, where

$$\hat\sigma_T^2 = \max\left\{\frac{1}{T}\sum_{t=1}^{T} 3(\hat d_t^2 - \tilde d_t^2),\, 0\right\} \tag{10}$$

is a consistent estimator of $\sigma^2$ as $T, n \to \infty$.

3.1 Comparison with estimators that use timestamps or trade direction

In this section, I compare the baseline estimator with several well-known measures of the effective spread that require timestamps or trade direction, or both. The goal is to assess how serious the loss of information associated with these data limitations is.¹

¹ In the rest of this section, all analytical results are presented without proof to save space. The derivations follow similar steps as the derivation of equation (6) and can be obtained upon request.

3.1.1 Observable timestamps

Should timestamps be available, one would typically use the Roll (1984) estimator, which is equal to minus 4 times the sample first-order autocovariance of intraday returns:

$$\hat\gamma_t^2 = -\frac{4}{n-2}\sum_{i=3}^{n} (p_{i,t} - p_{i-1,t})(p_{i-1,t} - p_{i-2,t}). \tag{11}$$

It is easy to show that in my setup the estimator is unbiased for $s^2$, and its variance reads

$$\mathrm{Var}(\hat\gamma_t^2) = \left(\frac{16\sigma^4}{n^2} + \frac{16s^2\sigma^2}{n} + \frac{2s^4(n-3)}{n-2} + 3s^4\right)\frac{1}{n-2}. \tag{12}$$

Clearly, $\mathrm{Var}(\hat\gamma_t^2) = \frac{5s^4}{n} + O(n^{-2})$, so $\hat\gamma_t^2$ is a consistent estimator of $s^2$ as $n \to \infty$. Similar to $ES_T^{(1)}$, $\hat\gamma_t^2$ can be averaged over $T$ days and censored at zero to obtain a nonnegative estimator of $s$:

$$Roll_T = \sqrt{\max\left\{\frac{1}{T}\sum_{t=1}^{T} \hat\gamma_t^2,\, 0\right\}}. \tag{13}$$

Unlike $ES_T^{(1)}$, $Roll_T$ is consistent for $s$ as $n \to \infty$ for any $T$.

But the Roll estimator is not the only $\sqrt{n}$-consistent measure of $s$. Christensen, Podolskij, and Vetter (2009) propose an estimator based on realized volatility, which can be tailored to my framework as follows:

$$\hat\omega_t^2 = \frac{2}{n-1}\sum_{i=1}^{n-1} (p_{i+1,t} - p_{i,t})^2. \tag{14}$$

It is straightforward to show that

$$E(\hat\omega_t^2) = s^2 + \frac{2\sigma^2}{n} \quad \text{and} \quad \mathrm{Var}(\hat\omega_t^2) = \frac{s^4}{n-1} + \frac{8s^2\sigma^2}{n(n-1)} + \frac{4(\kappa - \sigma^4)}{n^2(n-1)}, \tag{15}$$

where $\kappa = E(\epsilon_{1,1}^4)$, which implies that $\hat\omega_t^2$ is asymptotically unbiased and its variance satisfies $\mathrm{Var}(\hat\omega_t^2) = \frac{s^4}{n} + O(n^{-2})$. The limiting variance is five times smaller than that of the Roll estimator $\hat\gamma_t^2$, but the finite-sample bias can be large when $\sigma^2$ is large. Unlike $\hat s_t^2$ or $\hat\gamma_t^2$, the estimator $\hat\omega_t^2$ is nonnegative by construction, and hence there is no need for censoring when constructing a consistent estimator of $s$ based on $T$ days' worth of data:

$$RV_T^{all} = \sqrt{\frac{1}{T}\sum_{t=1}^{T} \hat\omega_t^2}. \tag{16}$$

Similar to $Roll_T$, and unlike $ES_T^{(1)}$, $RV_T^{all}$ is consistent for $s$ as $n \to \infty$ regardless of $T$.
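For comparison, the two timestamp-based measures in equations (11) to (16) can be sketched as follows. Both require the prices in chronological order, which is what missing timestamps rule out. The function names, the simulated data, and the parameter values are illustrative assumptions, not part of the paper.

```python
import math
import random


def roll_gamma2(prices):
    """Equation (11): minus 4 times the first-order autocovariance of returns."""
    n = len(prices)
    r = [prices[i] - prices[i - 1] for i in range(1, n)]    # n - 1 returns
    cross = sum(r[k] * r[k - 1] for k in range(1, len(r)))  # n - 2 products
    return -4.0 * cross / (n - 2)


def rv_omega2(prices):
    """Equation (14): twice the average squared intraday return."""
    n = len(prices)
    return 2.0 * sum((prices[i] - prices[i - 1]) ** 2 for i in range(1, n)) / (n - 1)


def roll_T(days):
    """Equation (13): average over days, censor at zero, take the square root."""
    return math.sqrt(max(sum(roll_gamma2(p) for p in days) / len(days), 0.0))


def rv_all_T(days):
    """Equation (16): no censoring needed because every term is nonnegative."""
    return math.sqrt(sum(rv_omega2(p) for p in days) / len(days))


def simulate_prices(n, s, sigma, rng):
    """Chronologically ordered log prices from equations (1)-(2)."""
    m, out = 0.0, []
    for _ in range(n):
        m += rng.gauss(0.0, sigma / math.sqrt(n))
        out.append(m + 0.5 * s * rng.choice((-1, 1)))
    return out


rng = random.Random(1)
days = [simulate_prices(50, 0.02, 0.005, rng) for _ in range(200)]
roll_est, rv_est = roll_T(days), rv_all_T(days)
print("Roll:", roll_est, "RV-based:", rv_est, "true s: 0.02")
```

Shuffling the prices within each day would leave the baseline estimator of Section 3 unchanged but would break both estimators above, since they are built from consecutive returns.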

3.1.2 Observable trade direction

As suggested by Warga (1991) and Schultz (2001), when the trade direction is observable one can simply regress the difference between the transaction price and some benchmark price on the trade indicator. Here I continue assuming that the benchmark price equals the end-of-day mid-quote and suggest running the following OLS regression:

$$p_{i,t} - m_{0,t} = \beta_t^{(1)} q_{i,t} + u_{i,t}^{(1)}. \tag{17}$$

If the daily benchmark prices are not available, one can use the average transaction price instead and run the regression

$$p_{i,t} - \bar p_t = \beta_t^{(2)} q_{i,t} + u_{i,t}^{(2)}. \tag{18}$$

If model (1) is the data-generating process, the regression innovations are given by $u_{i,t}^{(1)} = m_{i,t} - m_{0,t}$ and $u_{i,t}^{(2)} = m_{i,t} - \bar m_t - (s/2)\bar q_t$, respectively. It is easy to show that the ordinary least squares (OLS) estimators of $\beta_t^{(1)}$ and $\beta_t^{(2)}$ in regressions (17) and (18) satisfy, under my assumptions:

$$E(2\hat\beta_t^{(1)}) = s, \qquad \mathrm{Var}(2\hat\beta_t^{(1)}) = \frac{2\sigma^2(n+1)}{n^2}, \tag{19}$$

$$E(2\hat\beta_t^{(2)}) = s - \frac{s}{n}, \qquad \mathrm{Var}(2\hat\beta_t^{(2)}) = \frac{2\sigma^2(n+1)(n-1)}{3n^3} + 2s^2\left(\frac{1}{n^2} - \frac{1}{n^3}\right). \tag{20}$$

Replacing $m_{i,t}$ with $m_{0,t}$ or $\bar p_t$ therefore does not render the regression-based estimator inconsistent as it did for $\hat d_t^2$ and $\tilde d_t^2$ in Section 3. The OLS estimators $2\hat\beta_t^{(i)}$, $i = 1, 2$, will converge in probability to $s$ at rate $\sqrt{n}$ as $n \to \infty$, as did the Roll and RV-based measures in Section 3.1.1.

Similar to the effective spread measures in the previous section, none of these OLS estimators are guaranteed to be nonnegative. Thus, I censor them at zero and denote the regression-based estimators of the effective spread $s$ for day $t$ by $RS_t^{(i)} = \max\{2\hat\beta_t^{(i)}, 0\}$, $i = 1, 2$. When estimating the effective spread using the full sample of $T$ days with $n$ transactions each, one simply runs the regressions (17) and (18) using all $nT$ observations. I denote these estimators by $RS_T^{(i)}$, $i = 1, 2$. Note that in practice, $n$ does not have to be the same for all days in the sample; I only make this assumption here to simplify derivations.

3.1.3 Observable timestamps and trade direction

Finally, I investigate the loss of efficiency associated with both missing timestamps and trade direction. If both were available, one could run the regression of $p$ on $q$ in first differences:

$$\Delta p_{i,t} = \beta_t^{(3)} \Delta q_{i,t} + u_{i,t}^{(3)}. \tag{21}$$

The gain in efficiency compared with the regressions in levels is simply due to the fact that $u_{i,t}^{(3)} = \epsilon_{i,t}$, which has much smaller variance than either $u_{i,t}^{(1)}$ or $u_{i,t}^{(2)}$ and is serially uncorrelated. A complication with the standard OLS estimator $\hat\beta_t^{(3)}$ in regression (21) is that it is not always well defined. We have $\hat\beta_t^{(3)} = \sum_{i=2}^{n} \Delta p_{i,t}\Delta q_{i,t} / \sum_{i=2}^{n} (\Delta q_{i,t})^2$, and it is not difficult to show that $P\left(\sum_{i=2}^{n} (\Delta q_{i,t})^2 = 0\right) = (1/2)^{n-1}$. But because $\frac{1}{n-1}\sum_{i=2}^{n} (\Delta q_{i,t})^2$ converges in probability to 2 as $n \to \infty$, an asymptotically equivalent, well-defined estimator can be obtained by simply setting $\sum_{i=2}^{n} (\Delta q_{i,t})^2$ equal to $2(n-1)$ in $\hat\beta_t^{(3)}$ whenever $\sum_{i=2}^{n} (\Delta q_{i,t})^2 = 0$; that is, I define

$$\tilde\beta_t^{(3)} = \frac{\sum_{i=2}^{n} \Delta p_{i,t}\Delta q_{i,t}}{1\left\{\sum_{i=2}^{n} (\Delta q_{i,t})^2 = 0\right\} 2(n-1) + \sum_{i=2}^{n} (\Delta q_{i,t})^2}. \tag{22}$$

Clearly, $E(2\tilde\beta_t^{(3)}) = s$, so $2\tilde\beta_t^{(3)}$ is an unbiased estimator of $s$. It is difficult to derive the exact variance of $\tilde\beta_t^{(3)}$, but it is easy to show that the limiting variance satisfies $\lim_{n\to\infty} n^2 \mathrm{Var}(2\tilde\beta_t^{(3)}) = 2\sigma^2$ and that $2\tilde\beta_t^{(3)}$ is an $n$-consistent estimator of $s$.
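The three regression-based quantities above reduce to simple ratios of sums because the regressions have a single regressor and no intercept. The sketch below illustrates this with hypothetical helper names (`two_beta_levels`, `two_beta_diff`, `censor`); the noise-free example at the end sets the efficient price to a constant so that the slopes can be checked by hand.

```python
def two_beta_levels(prices, q, benchmark=None):
    """2 * OLS slope in regression (17); uses the mean price, as in (18), if benchmark is None."""
    n = len(prices)
    center = sum(prices) / n if benchmark is None else benchmark
    num = sum(qi * (p - center) for p, qi in zip(prices, q))
    return 2.0 * num / sum(qi * qi for qi in q)


def two_beta_diff(prices, q):
    """2 * beta-tilde from equation (22): regression in first differences."""
    n = len(prices)
    dp = [prices[i] - prices[i - 1] for i in range(1, n)]
    dq = [q[i] - q[i - 1] for i in range(1, n)]
    num = sum(a * b for a, b in zip(dp, dq))
    den = sum(b * b for b in dq)
    if den == 0:           # all trades on the same side: plain OLS is undefined
        den = 2 * (n - 1)  # replacement value from equation (22)
    return 2.0 * num / den


def censor(x):
    """Censoring at zero, as in RS = max{2 * beta, 0}."""
    return max(x, 0.0)


# Noise-free illustration: constant efficient price (normalized to 0), s = 0.02.
s = 0.02
q = [1, -1, -1, 1]
prices = [0.5 * s * qi for qi in q]
rs1 = censor(two_beta_levels(prices, q, benchmark=0.0))
rs2 = censor(two_beta_levels(prices, q))
rs3 = censor(two_beta_diff(prices, q))
print(rs1, rs2, rs3)
```

With an unbalanced day such as `q = [1, 1, 1, -1]`, the mean-price version returns $s(1 - \bar q_t^2)$ in the noise-free case, which illustrates the source of the downward bias of order $s/n$ in equation (20).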

Thus, observing both timestamps and trade direction at the same time improves the convergence rate further: recall that the RMSE of the estimators $Roll_T$, $RV_T^{all}$, and $RS_T^{(i)}$, $i = 1, 2$, only decays at rate $n^{-1/2}$. In small samples, $\tilde\beta_t^{(3)}$ can be negative with positive probability, so I define $RS_t^{(3)} = \max\{2\tilde\beta_t^{(3)}, 0\}$ as an estimator of $s$ for day $t$ and $RS_T^{(3)} = \max\{2\tilde\beta_T^{(3)}, 0\}$, where $\tilde\beta_T^{(3)}$ is obtained by running regression (21) using all $nT$ observations ($T$ days with $n$ transactions each).

3.1.4 Summary

The analytical comparison reveals that the absence of timestamps and/or trade direction reduces the convergence rates of the effective spread estimators. In the full-information case, one can achieve $n$-consistency, while in the absence of either timestamps or trade direction, only $\sqrt{n}$-consistency is possible. In the absence of timestamps, the limiting RMSE only depends on $s$ (equations (12) and (15)), while in the case of missing trade direction, it is solely driven by $\sigma^2$ (equation (19)). Finally, when both are missing, consistency cannot be achieved by increasing the number of intraday observations, and averaging over an increasing number of days is necessary. The limiting variance of the effective spread estimator then depends on the ratio of $\sigma^2$ and $s$; see equation (9).

In practice, this means that the relative performance of the various estimators depends on the parameter configuration and the number of intraday observations. To illustrate, I plot in Figure 1 the RMSE as a function of $n$ on a log-log scale. Clearly, when the noise-to-signal ratio $\sigma^2/s$ is high, the absence of timestamps and trade direction leads to a significant deterioration in RMSE for any $n$. But when $\sigma^2/s$ is small (left panel), there exists a wide range of $n$ where the infeasible estimators do not really improve much upon the estimator that does not require either timestamps or trade direction. This is a useful result because in practice it is precisely illiquid, infrequently traded assets for which these data limitations occur.
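To get a rough feel for the comparison, the closed-form variances in equations (6), (12), and (19) can be evaluated directly. The sketch below is an illustration only and does not reproduce Figure 1: the function names and parameter values are assumptions, $\kappa$ is set to zero, and the standard deviation of the square-root estimators is approximated by the delta method, $\mathrm{sd}(\hat s) \approx \sqrt{\mathrm{Var}(\hat s_t^2)/T}/(2s)$, ignoring censoring.

```python
import math


def var_s2_baseline(s, sigma, n, kappa=0.0):
    """Equation (6): variance of the daily baseline estimate of s^2."""
    s2, v = s * s, sigma * sigma
    return (9 * s2 * s2 / (2 * n * (n - 1))
            + 2 * s2 * v * (2 * n**2 + 3 * n + 1) / (n**2 * (n - 1))
            + 2 * (2 * n * v * v + v * v + 2 * kappa)
            * (2 * n**3 + 7 * n**2 + 7 * n + 2) / (15 * n**3 * (n - 1)))


def var_s2_roll(s, sigma, n):
    """Equation (12): variance of the daily Roll estimate of s^2."""
    s4, v = s**4, sigma * sigma
    return (16 * v * v / n**2 + 16 * s * s * v / n
            + 2 * s4 * (n - 3) / (n - 2) + 3 * s4) / (n - 2)


def var_2beta_levels(sigma, n):
    """Equation (19): variance of the daily regression estimate 2 * beta-hat(1)."""
    return 2 * sigma**2 * (n + 1) / n**2


def approx_sd_of_s(var_s2, s, T):
    """Delta-method approximation for the root of a T-day average of s^2 estimates."""
    return math.sqrt(var_s2 / T) / (2 * s)


s, sigma, T = 0.02, 0.005, 250  # illustrative values only
rows = []
for n in (10, 100, 1000):
    rows.append((n,
                 approx_sd_of_s(var_s2_baseline(s, sigma, n), s, T),
                 approx_sd_of_s(var_s2_roll(s, sigma, n), s, T),
                 math.sqrt(var_2beta_levels(sigma, n) / T)))
for row in rows:
    print("n=%d  baseline=%.2e  Roll=%.2e  regression=%.2e" % row)
```

As $n$ grows, the baseline column levels off (its variance tends to $\frac{8}{15}\sigma^4$ per day), whereas the other two columns keep shrinking, which is the loss of information described above.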

4 Range-based estimators

The baseline estimator is simple to compute, but it requires observing the benchmark price $m_{0,t}$. When these prices or mid-quotes are not available, $\hat d_t$ cannot be calculated and we need an alternative moment condition. Inspired by Corwin and Schultz (2012), I use the daily range:

$$\hat r_t^2 = \left(\max_j p_{j,t} - \min_j p_{j,t}\right)^2. \tag{23}$$

The range has a long tradition in financial econometrics, dating back to Parkinson (1980), and has been widely used for estimating variance from intraday data (Brandt and Diebold, 2006; Christensen and Podolskij, 2007; Christensen, Podolskij, and Vetter, 2009; Dobrev, 2007). It is clear that $\hat r_t^2$ is expected to depend on both $s$ and $\sigma$, as do $\hat d_t^2$ and $\tilde d_t^2$. Corwin and Schultz (2012) combine equation (23) with a second moment condition based on the squared range over two consecutive days, that is, $(\max_j p_{j,t:t+1} - \min_j p_{j,t:t+1})^2$, where $p_{j,t:t+1}$ denotes the $j$-th transaction price in a two-day window starting on day $t$. To derive their estimator, Corwin and Schultz (2012) make two strong assumptions that I do not want to make here: continuous intraday trading and zero overnight returns.² Moreover, relying solely on the one-day and two-day range means throwing away a lot of data, so I develop measures that use all available transaction data. My range-based estimators therefore use only the daily range in equation (23) and work for any finite $n$.

² Corwin and Schultz (2012) suggest a simple correction for non-zero overnight return, but the correction does not eliminate the overnight return problem completely and the estimator remains generally biased and inconsistent.

I continue with the assumptions stated in Section 2 and additionally assume that the innovations of the efficient price are normally distributed. The expectation of the squared range can then be approximated by simulation for any finite $n$, and the SMM can be employed to consistently estimate $s$. I proceed as follows. Let $\theta = (s, \sigma^2)'$ and let $p_s^* = (p_{1s}^*, p_{2s}^*, \ldots, p_{ns}^*)'$ denote a random draw from model (1) given $\theta$. Taking $S$ independent draws, I approximate the expectation of $\hat r_t^2$ by

$$m_S(\theta, n) = \frac{1}{S}\sum_{s=1}^{S} \left(\max_j p_{js}^* - \min_j p_{js}^*\right)^2. \tag{24}$$

The SMM estimator is then obtained by

$$\hat\theta_T = \arg\min_{\theta \in \mathbb{R}^2_{++}} g_T' g_T, \tag{25}$$

where $g_T = \frac{1}{T}\sum_{t=1}^{T} g_t$, $g_t = (g_{1t}, g_{2t})'$, $g_{1t}(\theta, n) = \tilde d_t^2 - E(\tilde d_t^2)$, and $g_{2t}(\theta, n) = \hat r_t^2 - m_S(\theta, n)$. The objective function $g_T' g_T$ must be minimized numerically under the restrictions that $s$ and $\sigma$ are nonnegative. The range-based estimator of $s$, which I denote by $ES_T^{(2)}$, is then given by $ES_T^{(2)} = \hat\theta_{1,T}$. It follows that if $T/S \to 0$ as $T \to \infty$, $ES_T^{(2)} \to_p s$ and the SMM estimator is asymptotically equivalent to the generalized method of moments (GMM) (see chapter 2 in Gourieroux and Monfort, 1996), and the usual GMM inference applies.

My final estimator follows naturally from the previous two. If the benchmark prices are observable, it is clearly desirable to use all three moment conditions at the same time. Formally, define $g_{3t}(\theta, n) = \hat d_t^2 - E(\hat d_t^2)$ and $g_t = (g_{1t}, g_{2t}, g_{3t})'$, where $g_{1t}$ and $g_{2t}$ are previously given. The over-identified SMM estimator of $\theta$ is given by

$$\tilde\theta_T = \arg\min_{\theta \in \mathbb{R}^2_{++}} g_T' W_T g_T \tag{26}$$

for some positive definite matrix $W_T$. I follow the standard two-stage approach, whereby I first use $W_T = I$ to obtain a preliminary estimate $\hat\theta_T$ and then use the optimal $\hat W_T$ (the inverse of the sample variance of $g_t$ evaluated at $\hat\theta_T$) in the second stage to obtain $\tilde\theta_T$. My third estimator of $s$ is then given by $ES_T^{(3)} = \tilde\theta_{1,T}$. Again, if $T/S \to 0$ as $T \to \infty$, $ES_T^{(3)} \to_p s$, and we obtain asymptotic equivalence with GMM.
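A toy version of the just-identified estimator in equations (24) and (25) is sketched below. It is deliberately simplified and should not be read as the paper's implementation: a coarse grid search (parameterized in $s$ and $\sigma$ rather than $\sigma^2$) stands in for a numerical optimizer, the grids are placed around the true values, and $S$ is only twice $T$ even though the theory calls for $T/S \to 0$. All names and parameter values are assumptions for the example. The simulated shocks are drawn once and reused for every candidate $\theta$, so the objective is a smooth function of $\theta$.

```python
import math
import random


def draw_shocks(n, rng):
    """One standardized path: cumulative N(0, 1/n) sums paired with trade signs."""
    z, path = 0.0, []
    for _ in range(n):
        z += rng.gauss(0.0, 1.0 / math.sqrt(n))
        path.append((z, rng.choice((-1, 1))))
    return path


def sq_range(path, s, sigma):
    """Squared range of p = sigma * z + (s / 2) * q along one path, as in (23)."""
    p = [sigma * z + 0.5 * s * q for z, q in path]
    return (max(p) - min(p)) ** 2


def sample_var(path, s, sigma):
    """d_tilde^2 from equation (3) along one path."""
    p = [sigma * z + 0.5 * s * q for z, q in path]
    p_bar = sum(p) / len(p)
    return sum((x - p_bar) ** 2 for x in p) / (len(p) - 1)


def smm_es2(d_tilde2_bar, r2_bar, n, sim_paths, s_grid, sigma_grid):
    """Minimize g'g from equation (25) over a grid (identity weighting)."""
    best = None
    for s in s_grid:
        for sigma in sigma_grid:
            # Moment 1 uses the closed-form E(d_tilde^2) from equation (4).
            g1 = d_tilde2_bar - (s * s / 4 + sigma**2 * (n + 1) / (6 * n))
            # Moment 2 uses the simulated expectation m_S from equation (24).
            m_S = sum(sq_range(path, s, sigma) for path in sim_paths) / len(sim_paths)
            g2 = r2_bar - m_S
            obj = g1 * g1 + g2 * g2
            if best is None or obj < best[0]:
                best = (obj, s, sigma)
    return best[1], best[2]


rng = random.Random(2)
n, T, S = 20, 100, 200
true_s, true_sigma = 0.02, 0.005
data = [draw_shocks(n, rng) for _ in range(T)]       # stands in for observed days
d_bar = sum(sample_var(p, true_s, true_sigma) for p in data) / T
r_bar = sum(sq_range(p, true_s, true_sigma) for p in data) / T
sim_paths = [draw_shocks(n, rng) for _ in range(S)]  # reused for every theta
s_grid = [0.010 + 0.001 * k for k in range(21)]      # 0.010, ..., 0.030
sigma_grid = [0.00025 * k for k in range(41)]        # 0, ..., 0.010
s_hat, sigma_hat = smm_es2(d_bar, r_bar, n, sim_paths, s_grid, sigma_grid)
print("ES(2) estimate:", s_hat, "sigma estimate:", sigma_hat)
```

Adding the third moment $g_{3t}$ and a weighting matrix would turn the same loop into a sketch of the over-identified estimator in equation (26).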

4.1 Computational aspects When the number of transactions is large, the previously described simulationbasedestimationmaybeslow. Theminimizationmustbedonenumericallyandthe evaluation of the objective function can be costly. Fortunately, simulation-based or analytical approximations for E(r2) can be devised that significantly speed up t computations. Observethatthepriceprocessinequation(2)canbeapproximatedbyaprocess σW(t)+ sq(t), where W(t) is standard Brownian motion and q(t) is a continuous- 2 time process such that for any t, q(t) = 1 with a probability of 1/2, and q(t) = −1 otherwise. Now due to continuity of Brownian motion, the range of σW(t)+ sq(t) 2 equals the range of σW(u) plus s. Thus, for large n, we can approximate the range of p by the range of m plus s: i,t i,t E[(maxp −minp )2] ≈ E[(σ(maxz −minz )+s)2], (27) j,t j,t j j j j j j where z , j = 0,...,n, is a discretized Brownian motion on [0,1]. All that has to be j simulated, then, is the expectation of the range and squared range of a discretized Brownian motion. This simulation needs to be done only once, before the SMM estimation begins, and not every time the objective function is evaluated. When n is large, this approximation leads to significant gains in computational speed. The approximation can be further improved by using the decomposition of Christensen, Podolskij, and Vetter (2009), Lemma A.1, where the maximum of the efficient price is only taken over buyer-initiated transactions and the minimum over seller-intiated transactions when calculating the range of z in (27). Formally, let b , i = 0,...,1 be an iid binary process independent of z, where b = 1 with a i i probabilityof1/2andb = −1otherwise. Givenasamplepathofbandz, therange i of z is calculated over the set I = {(i,j)|b = 1,b = −1]}. The approximation i j 17

then becomes

\[ E\Big[\big(\max_j p_{j,t} - \min_j p_{j,t}\big)^2\Big] \approx E\Big[\big(\sigma \max_{(i,j)\in I}(z_i - z_j) + s\big)^2\Big]. \tag{28} \]

As before, the range on the right-hand side of equation (28) needs to be simulated only once and not every time the objective function is evaluated.

But the simulation can be avoided altogether because accurate analytical approximations for the range of discretized Brownian motion in equation (27) exist. Using Lemma A.8 in Andersen, Dobrev, and Schaumburg (2013) together with equation (27) leads to the approximation

\[ E\Big[\big(\max_j p_{j,t} - \min_j p_{j,t}\big)^2\Big] \approx (4\log 2)\sigma^2 + 2\sqrt{\frac{8}{\pi}}\,\sigma s + s^2 + \frac{\zeta(1/2)}{\sqrt{2\pi}}\left(\sqrt{\frac{8}{\pi}}\,\sigma^2 + \sigma s\right)\frac{4}{\sqrt{n}}, \tag{29} \]

where $\zeta(1/2)/\sqrt{2\pi} \approx -0.5826$.

To see how these approximations work, I plot expressions (27), (28), and (29) together with the true value $E[(\max_j p_{j,t} - \min_j p_{j,t})^2]$ in Figure 2 for different values of $n$. I find that all approximations are generally quite close to the true value for $n > 1000$. Interestingly, there is virtually no difference between the approximations in equations (29) and (27); clearly, the first-order correction in Andersen, Dobrev, and Schaumburg (2013) works very well, even for small $n$. But both of these approximations are significantly upward biased when $n$ is small. Fortunately, the approximation in equation (28), based on the idea of Christensen, Podolskij, and Vetter (2009), is significantly more accurate for all values of $n$ and is very close to the true value when $n > 100$. Thus, it seems that in practice one should simulate $E[(\max_j p_{j,t} - \min_j p_{j,t})^2]$ when $n$ is small (say, less than 100) and use the approximation in equation (28) to speed up computations when $n > 100$. For very large values of $n$, one can avoid simulations altogether and use equation (29).
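As a rough illustration of the two shortcuts, the sketch below evaluates the right-hand side of equation (29) and compares it with equation (27), where the first two moments of the range of a discretized Brownian motion are simulated once and then reused for any $(s, \sigma)$. This is only a sketch under the stated definitions; the function names are hypothetical, and the constant $-0.5826$ is the rounded value quoted in the text.

```python
import math
import random

ZETA_TERM = -0.5826  # zeta(1/2) / sqrt(2*pi), rounded value quoted in the text


def squared_range_analytical(s, sigma, n):
    """Right-hand side of equation (29)."""
    c = math.sqrt(8.0 / math.pi)
    leading = 4.0 * math.log(2.0) * sigma ** 2 + 2.0 * c * sigma * s + s ** 2
    correction = ZETA_TERM * (c * sigma ** 2 + sigma * s) * 4.0 / math.sqrt(n)
    return leading + correction


def brownian_range_moments(n, draws, seed=0):
    """Simulate E[R] and E[R^2], where R is the range of z_0, ..., z_n,
    a Brownian motion on [0, 1] observed at n equally spaced steps."""
    rng = random.Random(seed)
    sd = 1.0 / math.sqrt(n)
    sum_r = 0.0
    sum_r2 = 0.0
    for _ in range(draws):
        z = lo = hi = 0.0
        for _ in range(n):
            z += rng.gauss(0.0, sd)
            lo = min(lo, z)
            hi = max(hi, z)
        r = hi - lo
        sum_r += r
        sum_r2 += r * r
    return sum_r / draws, sum_r2 / draws


def squared_range_simulated(s, sigma, er, er2):
    """Right-hand side of (27), expanded as
    sigma^2 E[R^2] + 2 sigma s E[R] + s^2."""
    return sigma ** 2 * er2 + 2.0 * sigma * s * er + s ** 2


if __name__ == "__main__":
    er, er2 = brownian_range_moments(n=400, draws=2000)  # done once
    for s in (0.05, 0.20, 0.50):
        print(s, squared_range_simulated(s, 0.35, er, er2),
              squared_range_analytical(s, 0.35, 400))
```

Because `er` and `er2` do not depend on $s$ or $\sigma$, they can be computed before the optimization starts, which is the source of the speed gain described above.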

Another issue that obviously arises in practice is that the number of transactions is not the same every day. This issue poses no problems for my estimators. All one needs to do is to replace $n$ in equations (3) and (29) with $n_t$, where $n_t$ denotes the number of transactions on day $t$. Similarly, in the SMM estimation, one would simply simulate the squared range with the appropriate $n_t$ for each $t$.

5 Simulations

To assess the performance of my estimators $ES^{(i)}_T$, $i = 1,2,3$, in a controlled environment, I run a Monte Carlo experiment. I set the daily integrated volatility of the efficient price ($\sigma$) to 35 basis points, which is approximately equal to the daily volatility of the 10-year Treasury futures price, and let the efficient price innovations follow a normal distribution. I vary the true effective spread ($s$) between 5 and 50 basis points, the number of daily transactions ($n$) between 10 and 250, and the number of days ($T$) in the sample between 25 and 250. Recall that the absolute values of $s$ and $\sigma$ are not that important; what matters are their relative values. Each simulation is based on 10,000 Monte Carlo replications.

Table 1 reports the average effective spread obtained in the simulation together with the associated RMSE. Starting with the results for the baseline estimator $ES^{(1)}_T$, which are reported in the top two rows of each panel, I find that the bias of the estimator can be either positive or negative in small samples depending on the true effective spread. But as predicted by theory (see equation (8)), the bias does become negative for sufficiently large $T$ before eventually converging to zero as $T \to \infty$. The RMSE of the estimator approaches zero at a rate that is broadly in line with $\sqrt{T}$ consistency.

Turning to the just-identified range-based estimator, $ES^{(2)}_T$, reported in rows 3 and 4 of each panel in Table 1, I find that the estimator exhibits a bias that can be either positive or negative depending on $n$, $T$, and $s$, but both the bias and

the RMSE decline as $T \to \infty$, as expected. Comparing the performance of $ES^{(2)}_T$ with the baseline estimator $ES^{(1)}_T$, I find that the two estimators can perform quite differently. On the one hand, $ES^{(1)}_T$ does well when $s$ is large and $T$ is small; for example, when $s = 50$, $n = 50$, and $T = 50$, the RMSE of $ES^{(1)}_T$ is around three times smaller than that of $ES^{(2)}_T$. On the other hand, $ES^{(2)}_T$ works relatively well when $s$ is small and $T$ is large; for example, when $s = 5$, $n = 250$, and $T = 250$, the RMSE of $ES^{(1)}_T$ is more than three times larger than that of $ES^{(2)}_T$.

It is therefore not surprising that the over-identified estimator, $ES^{(3)}_T$, which combines the moment conditions underlying $ES^{(1)}_T$ and $ES^{(2)}_T$ using the optimal weighting matrix, generally performs the best. The results reported in rows 5 and 6 of Table 1 show that the estimator is typically the most precise in terms of RMSE, except when $T$ is very small.

6 Time-varying effective spread

In this section, I allow the effective spread to vary over time. As shown previously, due to the lack of transaction timestamps and trade direction, I am not able to estimate the effective spread consistently from data spanning a fixed time period such as a day or week, even if the number of transactions within that period increases without bound. Transaction costs do fluctuate over time, however, and so it is worth exploring the conditions under which one can recover the path of the time-varying effective spread using all available data. It turns out that this is possible if the effective spread process is sufficiently smooth.

For any given period $t$ (for example, a day), I assume that the effective spread within the period is constant and equal to $s_t$. I follow the recent advances by Giraitis, Kapetanios, and Yates (2013) and Giraitis et al. (2016) and adopt a nonparametric approach whereby the law of motion of the parameters is left unspecified, up to a class of processes, and the parameters are estimated by local

averaging. The processes I consider for $s_t$ are bounded stochastic and/or deterministic processes satisfying the smoothness condition:

\[ \sup_{j:|j-t|<h} \|s_t - s_j\|^2 = O_p(h/t) \tag{30} \]

as $t \to \infty$, $h \to \infty$, and $h = o(t)$. Examples of processes that belong to this class include $s_t = t^{-1}x_t$, $s_t = x_t/\max_{j\le T}|x_j|$, and $s_t = g(t/n)$, where $\{x_t\}$ is a unit root process with stationary increments and $g$ is a smooth deterministic function on the unit interval. These processes are bounded in probability and are smoother than random walks.

To estimate $s_t$, I take some weights $w_{j,t} = \tilde{w}_{j,t}/\sum_j \tilde{w}_{j,t}$, where $\tilde{w}_{j,t} = K((j-t)/H)$ for some kernel function $K$ and bandwidth parameter $H$. I then define the kernel estimator of the effective spread at time $t$ as

\[ ES^{(1)}_{t,T} = \sqrt{\max\Big\{2\sum_{j=1}^{T} w_{j,t}\big(3\tilde{d}_j^2 - \hat{d}_j^2\big),\, 0\Big\}}. \tag{31} \]

The kernel can be uniform, leading to simple rolling estimation with a window of size $H$, or have unbounded support, such as the Gaussian kernel. For an unbounded kernel, I require that $K(x) \ge 0$, $x \in \mathbb{R}$, is a continuous bounded function with a bounded first derivative such that $\int K(x)\,dx = 1$, $K(x) = O(e^{-cx^2})$ for some $c > 0$, and $|K'(x)| = O(|x|^{-2})$ as $x \to \infty$. Several popular kernel functions satisfy these conditions, for example, the Gaussian and quartic kernels. Similar to the time-varying spread $s_t$, I assume that the volatility of the efficient price is constant intraday but varies smoothly over days, that is, $\sigma_{i,t} = \sigma_t$ for all $i$, and that $\sigma_t$ is a bounded stochastic and/or deterministic process satisfying the same smoothness condition (30). This assumption, of course, contains constant volatility as a special case.

The key to achieving consistency of the time-varying estimator $ES^{(1)}_{t,T}$ is the

choice of $H$ relative to $T$. Define $\bar{H} = H$ if the kernel has bounded support and $\bar{H} = H\log^{1/2}H$ for an unbounded kernel. Then we have the following.

Proposition 2 For any $t = [\tau T]$, $0 < \tau < 1$, if $H \to \infty$, $\bar{H}/T \to 0$ as $T \to \infty$, $ES^{(1)}_{t,T} \to_p s_t$.

The proposition shows that the bandwidth parameter has to grow with $T$ but not as fast as $T$. When choosing $H$, one faces the familiar tradeoff between bias and variance: Smaller (larger) $H$ produces less (more) biased and more (less) volatile estimates. There is currently no data-driven method for choosing $H$, but taking $H = \sqrt{T}$ seems to work well in existing applications (Giraitis, Kapetanios, and Yates, 2013; Giraitis et al., 2016).

The advantage of the $ES^{(1)}_{t,T}$ estimator is that it is available in closed form and easy to calculate. However, the simulation results reported in the previous section show that the range-based estimators can perform better. It is therefore desirable to adapt those estimators to the time-varying parameter setting as well. This adaptation turns out to be more involved because the expectation of the range is not known in closed form. Some theoretical results are nonetheless possible to obtain under the assumption that both $n \to \infty$ and $T \to \infty$.

To construct the kernel analog to the estimators $ES^{(2)}_T$ and $ES^{(3)}_T$, define

\[ g^{(1)}_{t,T} = \sum_{j=1}^{T} w_{jt}\left(\hat{d}_j^2 - \frac{s^2}{4} - \frac{\sigma^2}{2}\right), \tag{32} \]

\[ g^{(2)}_{t,T} = \sum_{j=1}^{T} w_{jt}\left(\tilde{d}_j^2 - \frac{s^2}{4} - \frac{\sigma^2}{6}\right), \tag{33} \]

\[ g^{(3)}_{t,T} = \sum_{j=1}^{T} w_{jt}\left(\hat{r}_j^2 - 4\log 2\,\sigma^2 - 2\sqrt{\frac{8}{\pi}}\,s\sigma - s^2\right), \tag{34} \]

and $g_{t,T} = (g^{(1)}_{t,T}, g^{(2)}_{t,T}, g^{(3)}_{t,T})'$. Then the kernel estimator of $\theta_t = (s_t, \sigma_t^2)'$ is given by

\[ \tilde{\theta}_{t,T} = \arg\min_{\theta\in\mathbb{R}_{++}} g_{t,T}' W_{t,T}\, g_{t,T} \tag{35} \]

for some positive definite matrix $W_{t,T}$ and $ES^{(3)}_{t,T} = \tilde{\theta}^{(1)}_{t,T}$. When using only the second and third moment conditions, we obtain $ES^{(2)}_{t,T}$. The following proposition establishes the consistency of these estimators.

Proposition 3 For any $t = [\tau T]$, $0 < \tau < 1$, if $n \to \infty$, $H \to \infty$, $\bar{H}/T \to 0$ as $T \to \infty$, $ES^{(i)}_{t,T} \to_p s_t$, $i = 2$ and 3.

7 Empirical illustration

Having explored the behavior of the various effective spread estimators analytically and through simulations, I now turn to an empirical illustration. The purpose of this exercise is to assess the performance of my measures against (1) the true effective spread one would simply calculate if $m$, $q$, and timestamps were observable, and (2) the measures that require either timestamps or trade direction, or both, that I discussed previously. In other words, I want to examine the performance of my estimators against those that require progressively more information. This approach is similar to Goyenko, Holden, and Trzcinka (2009), who compare low-frequency liquidity measures with their high-frequency counterparts for selected U.S. stocks.

Now, ideally, I would like to employ data from an OTC market, since that is where I expect my measures would naturally find applications, but to the best of my knowledge, no time-stamped OTC trade and quote data are available that would allow me to do this exercise. I therefore employ the widely used TAQ data for selected NYSE-listed stocks; the TAQ data are time stamped to the second and contain information about $m$ and $q$, so that the true effective spread can be readily computed.
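To make the benchmark concrete, the sketch below computes an observed effective spread from transaction prices, prevailing mid-quotes, and trade directions. It is an illustration only and assumes the usual definition implied by the Roll-type relation $p_i = m_i + (s/2)q_i$, namely $s_i = 2q_i(p_i - m_i)$ per trade, with the proportional version divided by the mid-quote. The exact averaging convention used for the tables is not stated in this section, so the simple equal-weighted mean here is an assumption, and the function names are hypothetical.

```python
def effective_spreads(prices, mids, directions):
    """Per-trade effective spreads s_i = 2 * q_i * (p_i - m_i).

    prices:     transaction prices p_i
    mids:       prevailing mid-quotes m_i
    directions: trade indicators q_i (+1 buyer-initiated, -1 seller-initiated)
    """
    return [2.0 * q * (p - m) for p, m, q in zip(prices, mids, directions)]


def mean_proportional_spread_bp(prices, mids, directions):
    """Equal-weighted mean of 2 * q_i * (p_i - m_i) / m_i, in basis points."""
    ratios = [2.0 * q * (p - m) / m
              for p, m, q in zip(prices, mids, directions)]
    return 1e4 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    p = [100.05, 99.95, 100.07]
    m = [100.00, 100.00, 100.02]
    q = [1, -1, 1]
    print(effective_spreads(p, m, q))
    print(mean_proportional_spread_bp(p, m, q))
```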

7.1 Data and descriptive statistics

The universe of NYSE-listed stocks is too broad to consider here in full. Because my measures of effective spread would typically be applied to OTC-traded contracts, which tend to be less liquid and trade less frequently than exchange-traded instruments, I focus on stocks with small market capitalization. In particular, I take all stocks in the TAQ database that satisfy two criteria. The first is that the stock was included in the S&P Small-Cap 600 Index for the entire period between January 2, 2005, and December 31, 2014. The second is that there are trade and quote data available for this stock in the TAQ database for every trading day in this period. These criteria select 147 stocks.

For each stock and day in my sample, I download from the Wharton Research Data Services (WRDS) the WRDS-derived trades files (WCT data sets), which contain trades matched with the prevailing National Best Bid and Offer quotes. I then filter the data and retain only those trades with trade times between 9:35 a.m. and 4:00 p.m., positive transaction price, positive prevailing mid-quote, and positive quoted spread. In addition, I drop all trades where the prevailing quoted spread is greater than 50 times the median quoted spread for the same day, and where the proportional effective spread is greater than 50 times the median proportional effective spread for the same day; these rules are similar to those proposed by Barndorff-Nielsen et al. (2008).

Table 2 reports some descriptive statistics for the data, separately for five two-year periods that span my sample. The average daily number of trades varies between 1,000 and 2,000 for a typical stock day. The average effective spread varies between 12 and 16 basis points, while the average daily realized volatility varies between 150 and 250 basis points. The average SNR, which I define here as the ratio of effective spread and realized volatility ($s/\sigma$), fluctuates between 6 and 9 percent.
Thus, despite being small cap, the typical stock in my sample traded relatively frequently and with a fairly tight spread during my sample period. At

the same time, there are stocks and trading days with relatively little trading and fairly large effective spreads as indicated by the 5th and 95th percentiles reported in the table.

7.2 Results

My main empirical results are summarized in Panel A of Table 3. I run the effective spread estimations separately for each stock month, stock quarter, stock half-year, and stock year in the sample period and compare the estimates with the actual effective spreads observed for a given stock in a given time period. Specifically, I calculate the bias and RMSE associated with each estimator and the correlation of the estimated spread with the actual effective spreads calculated from TAQ. I consider my estimators $ES^{(i)}_T$, $i = 1,2,3$, and the Roll, RV-based, and regression-based estimators for comparison. Recall that the latter three are infeasible in the absence of timestamps.

I find that estimating the effective spread without timestamps is very challenging. The infeasible estimators are significantly less biased and an order of magnitude more accurate in terms of RMSE than my estimators; they are also much more closely correlated with the actual effective spreads. In line with theory, the over-identified estimator $ES^{(3)}_T$ is generally most accurate in terms of RMSE out of my three estimators, although $ES^{(2)}_T$ tends to be more closely correlated with the actual spread. The results do not improve as the number of observations used for estimation increases. As expected, all infeasible estimators deliver RMSE that is an order of magnitude lower than that of my feasible estimators. The estimator based on realized volatility performs remarkably well, exhibiting almost no bias and having significantly lower RMSE than the Roll measure.

The relatively poor performance of the feasible estimators should not come as a surprise: The average SNR for the 147 stocks in my sample is very small, and the simulation results discussed previously clearly indicate that in such circumstances

all feasible estimators struggle. To shed more light on the relationship between the SNR and RMSE, I perform the following experiment. Rather than using the original transaction prices when computing effective spread estimates, I construct a new set of transaction prices $\tilde{p}$ where I set $\tilde{p}_{i,t} = m_{i,t} + 10(p_{i,t} - m_{i,t})$ for all $i$ and $t$; that is, I artificially inflate the actual effective spread by a factor of 10. This procedure leaves the intraday volatility and time-series dynamics of the mid-quote unchanged, but it increases the SNR tenfold. I then reproduce the results reported in Panel A of Table 3 using the artificial transaction prices $\tilde{p}$ in place of the actual transaction prices $p$.

The results are reported in Panel B. I find that the relative performance of my estimators improves significantly. Although they are still upward biased, their RMSE is now much smaller relative to the actual spread. Also, while the two estimators that utilize timestamps (Roll and RVall) still outperform my estimators, the differences in terms of RMSE have become smaller. The regression-based estimators that require trade direction perform better than either Roll or RVall. Finally, the correlation between my estimates and the true spreads has increased significantly.

In addition to the experiment with the SNR, I study how the intraday number of transactions ($n$) affects performance. I do this by sampling sparsely from the set of transaction prices, retaining only every 10th observation on a given day for a given stock, and re-run all estimations on the sparsely sampled data. The results are reported in Panels C and D of Table 3; the former reports results based on the original data, while the latter shows results based on the artificial transaction prices previously described (inflated true effective spreads). Starting with Panel C, I find that the performance of my estimators is largely unaffected by sparse sampling.
This finding is in line with the theoretical result that $n$ has only a second-order effect on the RMSE of these estimators. In contrast, the infeasible Roll and RV-based estimators exhibit a significant deterioration in

precision as $n$ decreases. Notably, the RV-based estimator now exhibits a significant upward bias and a substantial increase in RMSE. When I artificially inflate the actual effective spreads by a factor of 10 (Panel D), the differences between the RMSE of my estimators and the infeasible ones become even smaller.

In summary, the empirical results based on the 147 stocks in my sample are largely consistent with the theoretical and simulation results reported previously. The key driver of the performance of my estimators is the SNR. To illustrate this finding graphically, Figure 3 plots the RMSE expressed as a fraction of the true spread separately for stock-months sorted into deciles by their SNR. The figure is based on the same data as Panel D in Table 3. Clearly, as the SNR increases, the performance improves and gradually approaches the performance of the infeasible estimators. This is very much in line with the behavior of the theoretical RMSE shown in Figure 1 in Section 3.1.

7.3 Time-varying estimation

I now turn to the time-varying kernel estimation proposed in Section 6. I use the Gaussian kernel and set the bandwidth parameter according to $H = \sqrt{T}$. Figure 4 summarizes the estimation results. Because it is impossible to show the time-varying estimates for all 147 stocks here, I simply report the cross-sectional averages over time. Panel A shows results based on the original data, while Panel B reports results based on the (artificial) transaction prices obtained by inflating the actual effective spreads by a factor of 10.

I find that my time-varying estimators perform well when the SNR is not very low, which is hardly surprising given the findings of the previous subsection. When the SNR is low, the estimators are significantly upward biased and do not capture the dynamics of the actual effective spread accurately. In particular, $ES^{(1)}_{t,T}$ drops substantially during the financial turmoil of 2008 even though the actual effective spread increased.
While the range-based estimators do not suffer from this problem

and are generally more closely correlated with the true spreads, they tend to peak a few months late. When I inflate the true effective spreads, the performance of all estimators improves considerably, both in terms of bias and correlation with the true spread.

8 Conclusion

In this paper, I have studied the problem of estimating transaction costs in the absence of timestamps. Building on insights from the previous literature, I proposed several measures of the effective spread and studied their sampling properties within the simple framework of Roll (1984). I corroborated my theoretical findings using a Monte Carlo simulation and assessed the performance of my estimators in an empirical application to selected NYSE-listed small-cap stocks.

The theoretical, simulation-based, and empirical results show that the loss of information due to missing timestamps is large. My estimators are suitable for measuring transaction costs in illiquid OTC markets, where effective spreads tend to be wide relative to the fundamental volatility, but not necessarily in highly liquid exchange-based markets such as those for equities and futures contracts, where the opposite is generally true. But in those cases, accurate transaction timestamps are typically available, so my estimators would not be necessary.

Throughout the paper, I have worked in the widely used framework of Roll (1984). Whereas the simplicity of this framework allows for straightforward analytical derivations, future work may consider more elaborate microstructure models, such as those by Huang and Stoll (1997); Madhavan, Richardson, and Roomans (1997); Bessembinder, Maxwell, and Venkataraman (2006); and Edwards, Harris, and Piwowar (2007). These models allow for a richer relationship between order flow and returns, and they relax some of the arguably restrictive assumptions of the Roll (1984) model. It would be interesting to explore whether these models can

be reliably estimated when timestamps are missing. Last but not least, it would be interesting to study whether the effective spread can be estimated consistently in the presence of stochastic volatility of the efficient price.

References

Aït-Sahalia, Y., P. Mykland, and L. Zhang, 2005, How Often to Sample a Continuous-Time Process in the Presence of Market Microstructure Noise, Review of Financial Studies, 18, 351–416.

Aït-Sahalia, Y., and J. Jacod, 2014, High-frequency financial econometrics, Princeton University Press.

Andersen, T. G., and T. Bollerslev, 1998, Answering the sceptics: Yes, standard volatility models do provide accurate volatility forecasts, International Economic Review, 39(4), 885–905.

Andersen, T. G., D. Dobrev, and E. Schaumburg, 2013, Duration-based volatility estimation, working paper, Northwestern University.

Barndorff-Nielsen, O. E., and N. Shephard, 2002, Econometric analysis of realized volatility and its use in estimating stochastic volatility models, Journal of the Royal Statistical Society, Series B, 64(2), 253–280.

Barndorff-Nielsen, O. E., P. R. Hansen, A. Lunde, and N. Shephard, 2008, Realized kernels in practice: Trades and quotes, Econometrics Journal, 4(1), 1–32.

Benos, E., A. Wetherilt, and F. Zikes, 2013, The structure and dynamics of the UK credit default swap market, Financial Stability Paper No. 25, Bank of England.

Benos, E., R. Payne, and M. Vasios, 2016, Centralized trading, transparency and interest rate swap market liquidity: Evidence from the implementation of the Dodd-Frank Act, working paper, Bank of England.

Benos, E., and F. Zikes, 2016, Liquidity determinants in the UK gilt market, Working Paper No. 600, Bank of England.

Bessembinder, H., 2003, Issues in assessing trade execution costs, Journal of Financial Economics, 6, 233–257.

Bessembinder, H., W. Maxwell, and K. Venkataraman, 2006, Market transparency, liquidity externalities, and institutional trading costs in corporate bonds, Journal of Financial Economics, 82, 251–288.

Biswas, G., S. Nikolova, and C. W. Stahel, 2014, The transaction costs of trading corporate credit, working paper, Securities and Exchange Commission.

Brandt, M. W., and F. X. Diebold, 2006, A no-arbitrage approach to range-based estimation of return covariances and correlations, Journal of Business, 79, 61–74.

Chen, K., M. Fleming, J. Jackson, A. Li, and A. Sarkar, 2011, An analysis of CDS transactions: Implications for public reporting, Staff Report No. 517, Federal Reserve Bank of New York.

Chen, K., M. Fleming, J. Jackson, A. Li, and A. Sarkar, 2012, An analysis of OTC interest rate derivatives transactions: Implications for public reporting, Staff Report No. 557, Federal Reserve Bank of New York.

Choi, J., and Y. Huh, 2016, Customer liquidity provision in corporate bond markets, working paper.

Christensen, K., and M. Podolskij, 2007, Realized range-based estimation of integrated variance, Journal of Econometrics, 141, 323–349.

Christensen, K., M. Podolskij, and M. Vetter, 2009, Bias-correcting the realized range-based variance in the presence of microstructure noise, Finance and Stochastics, 13, 239–268.

Corwin, S., and P. Schultz, 2012, A simple way to estimate bid-ask spreads from daily high and low prices, Journal of Finance, 67(2), 719–759.

Dobrev, D., 2007, Capturing volatility from large price moves: Generalized range theory and applications, working paper, Northwestern University.

Doornik, J. A., 2007, Object-oriented matrix programming using Ox, 3rd ed. London: Timberlake Consultants Press and Oxford: www.doornik.com.

Duffie, D., and H. Zhu, forthcoming, Size discovery, Review of Financial Studies.

Edwards, A. K., L. E. Harris, and M. S. Piwowar, 2007, Corporate bond market transaction costs and transparency, Journal of Finance, 62(3), 1421–1451.

Foucault, T., M. Pagano, and A. Roell, 2013, Market liquidity: Theory, evidence and policy, Oxford University Press.

Giraitis, L., G. Kapetanios, and T. Yates, 2013, Inference on stochastic time-varying coefficient models, Journal of Econometrics, 179, 46–65.

Giraitis, L., G. Kapetanios, A. Wetherilt, and F. Zikes, 2016, Estimating the dynamics and persistence of financial networks, with an application to the sterling money market, Journal of Applied Econometrics, 31(1), 58–84.

Gourieroux, C., and A. Monfort, 1996, Simulation-based econometric methods, Oxford University Press.

Goyenko, R. Y., C. W. Holden, and C. A. Trzcinka, 2009, Do liquidity measures measure liquidity?, Journal of Financial Economics, 92, 153–181.

Greene, W. H., 2008, Econometric analysis, Sixth Edition, Pearson Prentice Hall.

Hansen, P. R., and A. Lunde, 2006, Realized variance and market microstructure noise (with discussion), Journal of Business and Economic Statistics, 24, 127–218.

Harris, L., 2015, Transaction costs, trade throughs, and riskless principle trading in corporate bond markets, working paper, USC Marshall School of Business.

Hong, G., and A. Warga, 2000, An empirical study of bond market transactions, Financial Analysts Journal, 56, 32–46.

Huang, R., and H. Stoll, 1997, The components of the bid-ask spread: A general approach, Review of Financial Studies, 10, 995–1034.

Jankowitsch, R., A. Nashikkar, and M. G. Subrahmanyam, 2011, Price dispersion in OTC markets: A new measure of liquidity, Journal of Banking and Finance, 35, 343–357.

Lee, C. M. C., and M. J. Ready, 1991, Inferring trade direction from intraday data, Journal of Finance, 46(2), 733–746.

Madhavan, A., M. Richardson, and M. Roomans, 1997, Why do securities prices change: A transactions level analysis of NYSE-listed stocks, Review of Financial Studies, 10, 1035–1064.

Parkinson, M., 1980, The extreme value method for estimating the variance of the rate of return, Journal of Business, 53, 61–65.

Roll, R., 1984, A simple implicit measure of the effective bid-ask spread in an efficient market, Journal of Finance, 39, 1127–1139.

Schultz, P., 2001, Corporate bond trading costs: A peek behind the curtain, Journal of Finance, 56(2), 677–698.

Warga, A., 1991, Corporate bond price discrepancies in the dealer and exchange markets, Journal of Fixed Income, 1, 7–16.

Zhang, L., P. Mykland, and Y. Aït-Sahalia, 2005, Tale of two time scales: Determining integrated volatility with noisy high-frequency data, Journal of the American Statistical Association, 100, 1394–1411.

A Proofs

Proof of Proposition 1. Dropping the subscript $t$ to simplify notation, we have

\[ \hat{s}^2 - s^2 = 2(3\tilde{d}^2 - \hat{d}^2) - s^2, \tag{36} \]

\[ = \frac{3s^2}{2}\left[\left(\frac{1}{n-1}\sum_{i=1}^{n}(q_i - \bar{q})^2\right) - 1\right] + 2s\left[\frac{3}{n-1}\sum_{i=1}^{n}(m_i - \bar{m})(q_i - \bar{q}) - \frac{1}{n}\sum_{i=1}^{n}(m_i - m_0)q_i\right] + 2\left[\frac{3}{n-1}\sum_{i=1}^{n}(m_i - \bar{m})^2 - \frac{1}{n}\sum_{i=1}^{n}(m_i - m_0)^2\right] \tag{37} \]

\[ =: A_n + B_n + C_n. \tag{38} \]

By construction, $E(A_n) = E(B_n) = E(C_n) = 0$, and it is easy to show that $E(A_nB_n) = E(A_nC_n) = E(B_nC_n) = 0$ because $E(m_iq_j) = 0$ for all $i$ and $j$. Thus, $\mathrm{Var}(\hat{s}^2) = E(A_n^2) + E(B_n^2) + E(C_n^2)$. It is clear from the equation above that $\hat{s}^2$ does not depend on $m_0$, so we will set it to zero to simplify notation.

Starting with $E(A_n^2)$, write

\[ A_n^2 = \frac{9s^4}{4}\left[\frac{1}{(n-1)^2}\sum_{i=1}^{n}\sum_{j=1}^{n}(q_i - \bar{q})^2(q_j - \bar{q})^2 - \frac{2}{n-1}\sum_{i=1}^{n}(q_i - \bar{q})^2 + 1\right]. \tag{39} \]

Because $E(q_iq_j) = 0$ if $i \ne j$ and $q_i^2 \equiv 1$, we have

\[ E\left[\sum_{i=1}^{n}\sum_{j=1}^{n}(q_i - \bar{q})^2(q_j - \bar{q})^2\right] = n^2 - 2E\left[\sum_{i=1}^{n}\sum_{j=1}^{n}q_iq_j\right] + \frac{1}{n^2}E\left[\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n}q_iq_jq_kq_l\right], \tag{40} \]

\[ = n^2 - 2n + 3 - \frac{2}{n}. \tag{41} \]

This, together with $E\left(\sum_{i=1}^{n}(q_i - \bar{q})^2\right) = n - 1$, gives after some algebra

\[ E(A_n^2) = \frac{9s^4}{2n(n-1)}. \tag{42} \]

Turning to $E(B_n^2)$, write

\[ B_n^2 = 4s^2\left[\frac{9}{(n-1)^2} - \frac{6}{n(n-1)} + \frac{1}{n^2}\right]\sum_{i=1}^{n}\sum_{j=1}^{n}m_im_jq_iq_j + 4s^2\left[\frac{6}{n^2(n-1)} - \frac{18}{n(n-1)^2}\right]\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}m_im_jq_jq_k + \frac{36s^2}{n^2(n-1)^2}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n}m_im_jq_kq_l. \tag{43} \]

Because $E(\epsilon_i\epsilon_j) = 0$ if $i \ne j$,

\[ E\left[\sum_{i=1}^{n}\sum_{j=1}^{n}m_im_jq_iq_j\right] = \sum_{i=1}^{n}\sum_{j=1}^{n}E(m_im_j)E(q_iq_j) = \sum_{i=1}^{n}E(m_i^2)q_i^2 = \sum_{i=1}^{n}\sum_{p=1}^{i}E(\epsilon_p^2) = \frac{\sigma^2}{2}(n+1). \tag{44} \]

Similarly, $E\left(\sum_i\sum_j\sum_k m_im_jq_jq_k\right) = \sum_i\sum_j E(m_im_j)$ and $E\left(\sum_i\sum_j\sum_k\sum_l m_im_jq_kq_l\right) = n\sum_i\sum_j E(m_im_j)$. Thus, it remains to derive $\sum_i\sum_j E(m_im_j)$. The case $i = j$ follows from above, so we focus on the case when $i \ne j$:

\[ \sum_{i=1}^{n}\sum_{\substack{j=1\\ i\ne j}}^{n}E(m_im_j) = 2\sum_{i=1}^{n}\sum_{j=i+1}^{n}E(m_im_j), \tag{45} \]

\[ = 2\sum_{i=1}^{n}\sum_{j=i+1}^{n}E\left[\sum_{p=1}^{i}\left(\sum_{r=1}^{i} + \sum_{r=i+1}^{j}\right)\epsilon_p\epsilon_r\right], \tag{46} \]

\[ = 2\sum_{i=1}^{n}\sum_{j=i+1}^{n}\sum_{p=1}^{i}E(\epsilon_p^2), \tag{47} \]

\[ = \frac{2\sigma^2}{n}\sum_{i=1}^{n}\sum_{j=i+1}^{n}i, \tag{48} \]

\[ = \sigma^2n(n+1) - \frac{1}{3}\sigma^2(n+1)(2n+1). \tag{49} \]

Plugging (44) and (49) into the expectation of (43) and simplifying gives

\[ E(B_n^2) = \frac{2s^2\sigma^2(2n^2 + 3n + 1)}{n^2(n-1)}. \tag{50} \]

Finally, we derive $E(C_n^2)$. Write

\[ C_n^2 = 4\left[\frac{9}{(n-1)^2} - \frac{6}{n(n-1)} + \frac{1}{n^2}\right]\sum_{i=1}^{n}\sum_{j=1}^{n}m_i^2m_j^2 + 4\left[\frac{6}{n^2(n-1)} - \frac{18}{n(n-1)^2}\right]\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}m_im_jm_k^2 + \frac{36}{n^2(n-1)^2}\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n}m_im_jm_km_l. \tag{51} \]

We focus on the last term because the other two terms follow from the derivation of this term. Observe that

\[ \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n}\sum_{l=1}^{n}m_im_jm_km_l = \sum_{i=1}^{n}m_i^4 + 3\sum_{i\ne j}m_i^2m_j^2 + 4\sum_{i\ne j}m_i^3m_j + 12\sum_{i<j<k}m_i^2m_jm_k + 12\sum_{i<j<k}m_im_j^2m_k + 12\sum_{i<j<k}m_im_jm_k^2 + 24\sum_{i<j<k<l}m_im_jm_km_l. \tag{52} \]

To save space, we derive here only the expectation of the last term; the other terms follow using the same approach:

\[ E\left[\sum_{i<j<k<l}m_im_jm_km_l\right] = \sum_{i=1}^{n}\sum_{j=i+1}^{n}\sum_{k=j+1}^{n}\sum_{l=k+1}^{n}E\left[\sum_{p=1}^{i}\sum_{r=1}^{j}\sum_{s=1}^{k}\sum_{t=1}^{l}\epsilon_p\epsilon_r\epsilon_s\epsilon_t\right], \tag{53} \]

\[ = \sum_{i=1}^{n}\sum_{j=i+1}^{n}\sum_{k=j+1}^{n}\sum_{l=k+1}^{n}E\left[\sum_{p=1}^{i}\left(\sum_{r=1}^{i} + \sum_{r=i+1}^{j}\right)\left(\sum_{s=1}^{i} + \sum_{s=i+1}^{j} + \sum_{s=j+1}^{k}\right)\left(\sum_{t=1}^{i} + \sum_{t=i+1}^{j} + \sum_{t=j+1}^{k} + \sum_{t=k+1}^{l}\right)\epsilon_p\epsilon_r\epsilon_s\epsilon_t\right], \tag{54} \]

\[ = \sum_{i=1}^{n}\sum_{j=i+1}^{n}\sum_{k=j+1}^{n}\sum_{l=k+1}^{n}\left\{E\left[\sum_{p=1}^{i}\sum_{r=1}^{i}\sum_{s=1}^{i}\sum_{t=1}^{i}\epsilon_p\epsilon_r\epsilon_s\epsilon_t\right] + 3E\left[\sum_{p=1}^{i}\sum_{r=1}^{i}\sum_{s=i+1}^{j}\sum_{t=i+1}^{j}\epsilon_p\epsilon_r\epsilon_s\epsilon_t\right] + E\left[\sum_{p=1}^{i}\sum_{r=1}^{i}\sum_{s=j+1}^{k}\sum_{t=j+1}^{k}\epsilon_p\epsilon_r\epsilon_s\epsilon_t\right]\right\}, \tag{55} \]

\[ = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=i+1}^{n}\sum_{k=j+1}^{n}\sum_{l=k+1}^{n}\left[3\sigma^4i^2 + \kappa i + 3\sigma^4i(j-i) + \sigma^4i(k-j)\right], \tag{56} \]

\[ = \frac{\sigma^4}{3}n^4 + \left(\sigma^4 + \frac{\kappa}{5}\right)n^3 + \left(\frac{13\sigma^4}{12} + \frac{\kappa}{2}\right)n^2 + \left(\frac{\sigma^4}{2} + \frac{\kappa}{3}\right)n + \frac{\sigma^4}{12} - \frac{\kappa}{30n}, \tag{57} \]

where $\kappa = E(\epsilon^4) - 3\sigma^4$. Above, we use the fact that

\[ E\left[\sum_{p=1}^{i}\sum_{r=1}^{i}\sum_{s=1}^{i}\sum_{t=1}^{i}\epsilon_p\epsilon_r\epsilon_s\epsilon_t\right] = \sum_{p=1}^{i}\sum_{r=1}^{i}\sum_{s=1}^{i}\sum_{t=1}^{i}E(\epsilon_p\epsilon_r\epsilon_s\epsilon_t), \tag{58} \]

\[ = 3\sum_{p=1}^{i}\sum_{\substack{r=1\\ p\ne r}}^{i}E(\epsilon_p^2\epsilon_r^2) + \sum_{p=1}^{i}E(\epsilon_p^4), \tag{59} \]

\[ = \frac{1}{n^2}\left(3\sigma^4i^2 + \kappa i\right), \tag{60} \]

and

\begin{align*}
E\left[\sum_{p=1}^{i}\sum_{r=1}^{i}\sum_{s=i+1}^{j}\sum_{t=i+1}^{j}\epsilon_p\epsilon_r\epsilon_s\epsilon_t\right]
&= E\left[\sum_{p=1}^{i}\sum_{r=1}^{i}\epsilon_p\epsilon_r\right] E\left[\sum_{s=i+1}^{j}\sum_{t=i+1}^{j}\epsilon_s\epsilon_t\right], \tag{61}\\
&= \left(\sum_{p=1}^{i} E(\epsilon_p^2)\right)\left(\sum_{r=i+1}^{j} E(\epsilon_r^2)\right), \tag{62}\\
&= \frac{1}{n^2}\sigma^4 i(j-i). \tag{63}
\end{align*}

The expectation of the other terms in $C_n^2$ can be obtained analogously. We obtain

$$E(C_n^2) = \frac{2(2\sigma^4 n+\sigma^4+2\kappa)(2n^3+7n^2+7n+2)}{15n^3(n-1)}. \tag{64}$$

The variance of $\hat{s}^2$ then follows after some algebra. ∎

Proof of Proposition 2. Write

\begin{align*}
\sum_{j=1}^{T} w_{jt}\left[2(3\tilde{d}_j^2-\hat{d}_j^2)-s_t^2\right]
={}& \sum_{j=1}^{T} w_{jt}\left(\frac{1}{2}\left[\frac{1}{n-1}\sum_{i=1}^{n}3(q_{i,j}-\bar{q}_j)^2-1\right]s_j^2-s_t^2\right) \tag{65}\\
&+ 2\sum_{j=1}^{T} w_{jt}s_j\left[\frac{3}{n-1}\sum_{i=1}^{n}(m_{i,j}-\bar{m}_j)(q_{i,j}-\bar{q}_j)-\frac{1}{n}\sum_{i=1}^{n}(m_{i,j}-m_{0,j})q_{i,j}\right] \\
&+ 2\sum_{j=1}^{T} w_{jt}\left[\frac{3}{n-1}\sum_{i=1}^{n}(m_{i,j}-\bar{m}_j)^2-\frac{1}{n}\sum_{i=1}^{n}(m_{i,j}-m_{0,j})^2\right], \tag{66}\\
=:{}& A_{t,T}+B_{t,T}+C_{t,T}. \tag{67}
\end{align*}

Define $Q_{j,n} = \frac{1}{2}\left[\frac{1}{n-1}\sum_{i=1}^{n}3(q_{i,j}-\bar{q}_j)^2-1\right]$ and write

$$A_{t,T} = \sum_{j=1}^{T} w_{jt}(s_j^2-s_t^2)+\sum_{j=1}^{T} w_{jt}(Q_{j,n}-1)s_j^2. \tag{68}$$

Now, because $Q_{j,n}$ and $s_j$ are uncorrelated and $E(Q_{j,n}) = 1$, the expectation of the

second term equals zero, and its variance is given by

\begin{align*}
E\left[\left(\sum_{j=1}^{T} w_{jt}(Q_{j,n}-1)s_j^2\right)^2\right]
&= E\left[\sum_{j=1}^{T} w_{jt}^2(Q_{j,n}-1)^2 s_j^4\right], \tag{69}\\
&\leq C\,E(Q_{j,n}-1)^2\sum_{j=1}^{T} w_{jt}^2, \tag{70}\\
&= O\left((n^2H)^{-1}\right). \tag{71}
\end{align*}

Turning to the first term in (68), write

$$\sum_{j=1}^{T} w_{jt}(s_j^2-s_t^2) = \sum_{j:|j-t|<h} w_{jt}(s_j^2-s_t^2)+\sum_{j:|j-t|\geq h} w_{jt}(s_j^2-s_t^2). \tag{72}$$

Following Giraitis et al. (2016), we take $h = bH\log^{1/2}H$ for some positive constant $b$. Then the second term can be ignored, as it is of smaller order than the first term. Now

$$\left|\sum_{j:|j-t|<h} w_{jt}(s_j^2-s_t^2)\right| \leq C\sup_{j:|j-t|<h}|s_j-s_t|\sum_{j:|j-t|<h} w_{jt} = O_p\left(\left(\frac{\bar{H}}{T}\right)^{1/2}\right), \tag{73}$$

because $|s_j-s_t| \leq \left[\sup_{j:|j-t|<h}(s_j-s_t)^2\right]^{1/2} = O_p((\bar{H}/T)^{1/2})$ and $\sum_{j:|j-t|<h} w_{jt} = O(1)$. Thus, $A_{t,T} = O_p\left((\bar{H}/T)^{1/2}\right)+O_p\left((n^2H)^{-1/2}\right)$.

Turning to $B_{t,T}$, define $P_{j,n} := \frac{3}{n-1}\sum_{i=1}^{n}(m_{i,j}-\bar{m}_j)(q_{i,j}-\bar{q}_j)-\frac{1}{n}\sum_{i=1}^{n}(m_{i,j}-m_{0,j})q_{i,j}$ and note that $E(P_{j,n}) = 0$ and $E(P_{j,n}P_{k,n}) = 0$ unless $j = k$. Thus, $E(B_{t,T}) = 0$. Its variance satisfies

$$E(B_{t,T}^2) = E\left(4\sum_{j=1}^{T} w_{jt}^2 s_j^2 P_{j,n}^2\right) \leq C\,E(P_{j,n}^2)\sum_{j=1}^{T} w_{jt}^2 = C\,O(n^{-1})\,O(H^{-1}) = O\left((nH)^{-1}\right), \tag{74}$$

as $E(P_{j,n}^2) = O(n^{-1})$. Thus, $B_{t,T} = O_p((nH)^{-1/2})$. Finally, by a similar argument, we obtain $C_{t,T} = O_p(H^{-1/2})$, which completes the proof. ∎

Proof of Proposition 3. For the sake of brevity, I present a proof for $W = I$. In particular, I show that $g_{t,T}'g_{t,T}$ converges in probability, at appropriate rates and

uniformly in $\theta$, to

\begin{align*}
&\left[\frac{1}{4}(s_t^2-s^2)+\frac{1}{2}(\sigma_t^2-\sigma^2)\right]^2+\left[\frac{1}{4}(s_t^2-s^2)+\frac{1}{6}(\sigma_t^2-\sigma^2)\right]^2 \\
&\quad+\left[(4\log 2)(\sigma_t^2-\sigma^2)+2\sqrt{\frac{8}{\pi}}\,(s_t\sigma_t-s\sigma)+(s_t^2-s^2)\right]^2, \tag{75}
\end{align*}

which is clearly minimized at $(s_t,\sigma_t)$. Starting with $g^{(1)}_{t,T}$, write

\begin{align*}
\left(g^{(1)}_{t,T}\right)^2 ={}& \left(\sum_{j=1}^{T} w_{jt}\hat{d}_j^2-\frac{s_t^2}{4}-\frac{\sigma_t^2}{2}\right)^2+2\left(\sum_{j=1}^{T} w_{jt}\hat{d}_j^2-\frac{s_t^2}{4}-\frac{\sigma_t^2}{2}\right) \\
&\times\left(\frac{1}{4}(s_t^2-s^2)+\frac{1}{2}(\sigma_t^2-\sigma^2)\right)+\left(\frac{1}{4}(s_t^2-s^2)+\frac{1}{2}(\sigma_t^2-\sigma^2)\right)^2. \tag{76}
\end{align*}

Now

$$\left|\sum_{j=1}^{T} w_{jt}\hat{d}_j^2-\frac{s_t^2}{4}-\frac{\sigma_t^2}{2}\right| \leq \left|\sum_{j=1}^{T} w_{jt}\left(\frac{1}{n}\sum_{i=1}^{n}(m_{i,j}-m_{0,j})^2-\frac{\sigma_t^2}{2}\right)\right|+\left|\sum_{j=1}^{T} w_{jt}\left(\frac{1}{n}\sum_{i=1}^{n}(m_{i,j}-m_{0,j})q_{i,j}\right)\right|. \tag{77}$$

Define $M_{j,n} = \frac{1}{n}\sum_{i=1}^{n}(m_{i,j}-m_{0,j})^2$ and write

$$\sum_{j=1}^{T} w_{jt}\left(M_{j,n}-\frac{\sigma_t^2}{2}\right) = \sum_{j=1}^{T} w_{jt}\left(M_{j,n}-\frac{\sigma_j^2}{2}\right)+\frac{1}{2}\sum_{j=1}^{T} w_{jt}\left(\sigma_j^2-\sigma_t^2\right). \tag{78}$$

By the same argument as in (72)–(73), the second term in (78) is $O_p((\bar{H}/T)^{1/2})$. The first term in (78) can be written as

$$\sum_{j=1}^{T} w_{jt}\left(M_{j,n}-\frac{\sigma_j^2}{2}\right) = \sum_{j=1}^{T} w_{jt}\left(M_{j,n}-\frac{\sigma_j^2}{2}\,\frac{n+1}{n}\right)+o_p(1), \tag{79}$$

where $E\left(M_{j,n}-\frac{\sigma_j^2}{2}\frac{n+1}{n}\right) = 0$ and $\mathrm{Var}\left(M_{j,n}-\frac{\sigma_j^2}{2}\frac{n+1}{n}\right) = O(1)$, which can be shown using the law of iterated expectations and the results in Section 4 of the Supplementary Appendix. Thus, $\left|\sum_{j=1}^{T} w_{jt}(M_{j,n}-\sigma_j^2/2)\right| = O_p(H^{-1/2})$. Turning to the second term in (77), we apply similar arguments to arrive at $\left|\sum_{j=1}^{T} w_{jt}\left(\frac{1}{n}\sum_{i=1}^{n}(m_{i,j}-m_{0,j})q_{i,j}\right)\right| =$

$O_p((nH)^{-1/2})$. Because $s_t$ and $\sigma_t^2$ are bounded, we have

$$\left(g^{(1)}_{t,T}\right)^2 = \left(\frac{1}{4}(s_t^2-s^2)+\frac{1}{2}(\sigma_t^2-\sigma^2)\right)^2+O_p((\bar{H}/T)^{1/2})+O_p(H^{-1/2}) \tag{80}$$

uniformly in $s$ and $\sigma$. $(g^{(2)}_{t,T})^2$ can be treated analogously. Turning to $(g^{(3)}_{t,T})^2$, write

\begin{align*}
\left(g^{(3)}_{t,T}\right)^2 ={}& \left(\sum_{j=1}^{T} w_{jt}\hat{r}_j^2-k_2\sigma_t^2-k_1 s_t\sigma_t-s_t^2\right)^2 \\
&+2\left(\sum_{j=1}^{T} w_{jt}\hat{r}_j^2-k_2\sigma_t^2-k_1 s_t\sigma_t-s_t^2\right)\left(k_2(\sigma_t^2-\sigma^2)+k_1(s_t\sigma_t-s\sigma)+s_t^2-s^2\right) \\
&+\left(k_2(\sigma_t^2-\sigma^2)+k_1(s_t\sigma_t-s\sigma)+s_t^2-s^2\right)^2, \tag{81}
\end{align*}

where $k_1 = 2\sqrt{8/\pi}$ and $k_2 = 4\log 2$. Define $I_j = \{(i,k)\,|\,q_{i,j} = 1, q_{k,j} = -1\}$, $j = 1,\ldots,T$, and observe that for every $j$, with probability $1-2^{1-n}$ (that is, whenever $I_j \neq \emptyset$),

$$\max_{(i,k)\in I_j}(m_{i,j}-m_{k,j})+s_j \leq \max_i p_{i,j}-\min_i p_{i,j} \leq \max_i m_{i,j}-\min_i m_{i,j}+s_j. \tag{82}$$

Thus, we can proceed by conditioning on the event $\{I_j \neq \emptyset \text{ for all } j\}$. Define $z_{i,j} = m_{i,j}/\sigma_j$, $j = 1,\ldots,T$, and write

\begin{align*}
&\left|\sum_{j=1}^{T} w_{jt}\left[\sigma_j(\max_i z_{i,j}-\min_i z_{i,j})+s_j\right]^2-\sum_{j=1}^{T} w_{jt}\hat{r}_j^2\right| \\
&\quad\leq \left|\sum_{j=1}^{T} w_{jt}\left[\sigma_j(\max_i z_{i,j}-\min_i z_{i,j})+s_j\right]^2-\sum_{j=1}^{T} w_{jt}\left[\sigma_j\max_{(i,k)\in I_j}(z_{i,j}-z_{k,j})+s_j\right]^2\right| \tag{83}\\
&\quad\leq C\left|\sum_{j=1}^{T} w_{jt}\left[(\max_i z_{i,j}-\min_i z_{i,j})^2-\left(\max_{(i,k)\in I_j}(z_{i,j}-z_{k,j})\right)^2\right]\right| \\
&\qquad+C\left|\sum_{j=1}^{T} w_{jt}\left[(\max_i z_{i,j}-\min_i z_{i,j})-\max_{(i,k)\in I_j}(z_{i,j}-z_{k,j})\right]\right|. \tag{84}
\end{align*}

By the Markov inequality,

$$P\left(\left|\sum_{j=1}^{T} w_{jt}\left[(\max_i z_{i,j}-\min_i z_{i,j})^2-\left(\max_{(i,k)\in I_j}(z_{i,j}-z_{k,j})\right)^2\right]\right| > \epsilon\right) \leq \frac{1}{\epsilon}\left(\lambda_{2,n}-\tilde{\lambda}_{2,n}\right), \tag{85}$$

where $\lambda_{2,n} = E([\max_i z_{i,j}-\min_i z_{i,j}]^2)$ and $\tilde{\lambda}_{2,n} = E([\max_{(i,k)\in I_j}(z_{i,j}-z_{k,j})]^2)$. Because $\{z_{i,j}\}$ is a scaled symmetric Gaussian random walk, it follows that $\lim_{n\to\infty}\lambda_{2,n} = \lim_{n\to\infty}\tilde{\lambda}_{2,n} = k_2$ (Christensen, Podolskij, and Vetter, 2009). The second term in (84) can be handled similarly. Thus, (83) is $o_p(1)$ as $n \to \infty$, and it suffices to focus on

\begin{align*}
&\left|\sum_{j=1}^{T} w_{jt}(\sigma_j R_{jn}+s_j)^2-k_2\sigma_t^2-k_1 s_t\sigma_t-s_t^2\right| \tag{86}\\
&\quad\leq \left|\sum_{j=1}^{T} w_{jt}(\sigma_j^2R_{jn}^2-k_2\sigma_t^2)\right|+\left|\sum_{j=1}^{T} w_{jt}(2s_j\sigma_jR_{jn}-k_1 s_t\sigma_t)\right|+\left|\sum_{j=1}^{T} w_{jt}(s_j^2-s_t^2)\right| \tag{87}\\
&\quad=: D_{t,T}+E_{t,T}+F_{t,T}, \tag{88}
\end{align*}

where $R_{jn} = \max_i z_{i,j}-\min_i z_{i,j}$, $j = 1,\ldots,T$. Now $F_{t,T} = O_p((\bar{H}/T)^{1/2})$ by (72)–(73), and for $D_{t,T}$ we have

$$D_{t,T} \leq C\left|\sum_{j=1}^{T} w_{jt}\left[R_{jn}^2-E(R_{jn}^2)\right]\right|+C\left|E(R_{jn}^2)-k_2\right|+k_2\left|\sum_{j=1}^{T} w_{jt}(\sigma_j^2-\sigma_t^2)\right|. \tag{89}$$

By the central limit theorem, the first term on the right-hand side of (89) is $O_p(H^{-1/2})$, the second term is $o(1)$ as $n \to \infty$, and the third term is $O_p((\bar{H}/T)^{1/2})$ by the same argument as in (72)–(73). Thus, $D_{t,T} = O_p(H^{-1/2})+O_p((\bar{H}/T)^{1/2})$. Following the same steps, and noting that $\|s_t\sigma_t-s_j\sigma_j\|^2 \leq C\|s_t-s_j\|^2+C\|\sigma_t-\sigma_j\|^2+C|s_t-s_j||\sigma_t-\sigma_j|$ and hence $\sup_{|t-j|\leq h}\|s_t\sigma_t-s_j\sigma_j\|^2 = O_p(h/t)$, we find that $E_{t,T}$ is of the same order as $D_{t,T}$, which completes the proof. ∎
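The limits used in this proof, an expected squared range of $k_2 = 4\log 2$ and an expected range of $k_1/2 = \sqrt{8/\pi}$ for a scaled Gaussian random walk, can be illustrated by simulation. The sketch below is only a rough check under stated assumptions: the walk starts at zero and the starting point is included in the range, the number of steps and replications are arbitrary, and the function name `simulate_range_moments` is hypothetical. With a finite number of steps the simulated moments sit somewhat below their limits.

```python
import math
import random


def simulate_range_moments(n_steps=400, n_reps=2000, seed=0):
    """Monte Carlo estimates of E[R] and E[R^2] for a Gaussian random walk
    with n_steps increments of variance 1/n_steps (illustrative helper)."""
    rng = random.Random(seed)
    step_sd = 1.0 / math.sqrt(n_steps)
    sum_r = 0.0
    sum_r2 = 0.0
    for _ in range(n_reps):
        z = hi = lo = 0.0  # assumption: the starting point 0 counts toward the range
        for _ in range(n_steps):
            z += rng.gauss(0.0, step_sd)
            hi = max(hi, z)
            lo = min(lo, z)
        r = hi - lo
        sum_r += r
        sum_r2 += r * r
    return sum_r / n_reps, sum_r2 / n_reps


mean_range, mean_sq_range = simulate_range_moments()
print("E[R]   simulated:", round(mean_range, 3), " limit:", round(math.sqrt(8 / math.pi), 3))
print("E[R^2] simulated:", round(mean_sq_range, 3), " limit:", round(4 * math.log(2), 3))
```

Increasing `n_steps` moves both simulated moments closer to the limiting constants, at the cost of a longer run time.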

B Figures and Tables

[Figure 1 plot omitted: three panels on log-log axes; series shown are ES1, Roll, RVall, RS1, RS2, and RS3.]

Figure 1: Simulated RMSE of alternative estimators of the effective spread as a function of n on a log-log scale. The parameter values are σ = 35 bps, κ = 0, and s = 120 bps (left panel), 50 bps (middle panel), and 10 bps (right panel), and T = 1.

[Figure 2 plot omitted: panel (a) σ = 35 bps, s = 10 bps; panel (b) σ = 35 bps, s = 5 bps; series shown are True, Simulated approx. (28), Simulated approx. (27), and Simulated approx. (29).]

Figure 2: Expected squared range of p and its approximations as a function of $\log_{10} n$. The line labeled “True” shows the true expectation $E[(\max_j p_{j,t}-\min_j p_{j,t})^2]$, “Simulated approx. (28)” shows the right-hand side of (28), “Simulated approx. (27)” shows the right-hand side of (27), and “Analytical approx. (29)” shows the right-hand side of (29). The various expectations are approximated by simulation with 100,000 replications. The efficient price innovations are normally distributed with a volatility of 35 bps.

[Figure 3 plot omitted: horizontal axis shows SNR deciles 1 to 10; series shown are SNR, ES1, ES2, ES3, RS1, RS2, RS3, Roll, and RVall.]

Figure 3: Average RMSE expressed as a fraction of the true effective spread for deciles based on the signal-to-noise ratio (SNR). Every month, the stocks in the sample are sorted into deciles by their SNR. The RMSE for each decile is then calculated by averaging across all stock-month observations in the decile.

[Figure 4 plot omitted: two panels covering 2006 to 2014; series shown are True, ES1, ES2, and ES3.]

Figure 4: Time-varying effective spread estimates averaged across the 147 stocks in the sample. The left panel shows results for the original data, and the right panel shows results based on transaction prices with artificially inflated effective spreads. All time-varying estimations are performed using the Gaussian kernel, and the bandwidth is set equal to the square root of the sample size.
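The caption of Figure 4 mentions a Gaussian kernel with bandwidth equal to the square root of the sample size. A minimal sketch of such kernel weights is given below. It assumes that the weights take the form exp(-((j - t)/H)^2 / 2), that they are normalized to sum to one over j = 1, ..., T, and that H = sqrt(T); the function names are hypothetical and the paper's exact weighting scheme may differ.

```python
import math


def gaussian_kernel_weights(t, T, H=None):
    """Normalized Gaussian kernel weights w_jt for j = 1..T, centered at t.
    Assumption: bandwidth H defaults to sqrt(T), as in the Figure 4 caption."""
    if H is None:
        H = math.sqrt(T)
    raw = [math.exp(-0.5 * ((j - t) / H) ** 2) for j in range(1, T + 1)]
    total = sum(raw)
    return [x / total for x in raw]


def local_average(values, t):
    """Kernel-weighted average of a daily series around day t (1-indexed)."""
    weights = gaussian_kernel_weights(t, len(values))
    return sum(w * v for w, v in zip(weights, values))


weights = gaussian_kernel_weights(t=50, T=100)
print(round(sum(weights), 12))           # weights sum to one
print(weights.index(max(weights)) + 1)   # largest weight is at day t
```

A time-varying estimate at day t would then replace a full-sample average of daily quantities with `local_average` evaluated at t.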

Table 1: Simulation results for effective spread estimators under assumptions of Section 2. The table reports the mean effective spread obtained in the simulation and the RMSE in brackets. The efficient price innovations are normally distributed, and the daily integrated volatility of the efficient price is set to 35 bps. The results are based on 10,000 Monte Carlo replications.

                       s = 50 bps                      s = 20 bps                      s = 10 bps                      s = 5 bps
  n  Estimator   T=25    T=50    T=100   T=250   T=25    T=50    T=100   T=250   T=25    T=50    T=100   T=250   T=25    T=50    T=100   T=250
 10  ES_T^(1)    49.94   49.90   49.96   50.00   18.88   19.38   19.68   19.93    9.46    9.27    9.01    9.27    7.07    6.26    5.53    4.87
                [3.59]  [2.57]  [1.79]  [1.13]  [7.38]  [5.18]  [3.40]  [2.03]  [7.88]  [6.71]  [5.72]  [4.28]  [7.63]  [6.40]  [5.42]  [4.41]
     ES_T^(2)    49.67   49.89   49.95   49.99   21.33   21.24   21.18   21.11    9.33    9.45    9.66    9.90    6.66    5.96    5.40    4.78
                [3.22]  [2.13]  [1.50]  [0.94]  [5.69]  [4.79]  [3.95]  [3.04]  [7.02]  [6.35]  [5.51]  [3.98]  [6.75]  [5.94]  [5.10]  [3.88]
     ES_T^(3)    50.03   50.04   50.02   50.02   19.33   19.62   19.81   19.91   10.58   10.10    9.96    9.96    7.17    6.01    5.30    4.81
                [2.92]  [2.00]  [1.43]  [0.90]  [6.18]  [4.44]  [3.12]  [1.98]  [6.70]  [5.37]  [4.05]  [2.47]  [6.78]  [5.31]  [4.20]  [3.19]
 50  ES_T^(1)    49.93   49.98   49.98   49.99   19.21   19.60   19.79   19.93    9.23    9.07    9.12    9.44    6.51    5.90    5.22    4.64
                [2.16]  [1.53]  [1.07]  [0.68]  [5.77]  [3.83]  [2.57]  [1.54]  [7.14]  [6.12]  [5.07]  [3.59]  [6.88]  [5.80]  [4.90]  [4.04]
     ES_T^(2)    48.79   49.65   49.92   49.98   19.67   20.02   19.70   19.49   10.43   10.11   10.46   10.43    5.50    4.64    4.84    5.00
                [7.64]  [4.58]  [2.11]  [0.44]  [3.99]  [3.40]  [2.80]  [2.25]  [4.04]  [3.33]  [2.67]  [1.93]  [4.08]  [3.15]  [2.30]  [1.42]
     ES_T^(3)    49.95   50.04   50.01   49.99   19.76   19.71   19.74   19.85   11.55   10.71   10.57   10.28    6.34    5.11    5.05    5.08
                [1.35]  [0.94]  [0.67]  [0.42]  [4.02]  [3.29]  [2.42]  [1.53]  [4.70]  [3.54]  [2.49]  [1.39]  [4.81]  [3.31]  [2.13]  [1.26]
250  ES_T^(1)    49.98   49.96   49.98   49.99   19.35   19.66   19.87   19.94    9.22    9.12    9.09    9.45    6.53    5.74    5.16    4.73
                [1.86]  [1.32]  [0.94]  [0.59]  [5.33]  [3.62]  [2.35]  [1.47]  [7.04]  [5.98]  [4.95]  [3.40]  [6.71]  [5.68]  [4.83]  [3.98]
     ES_T^(2)    49.19   49.73   49.99   50.01   18.94   18.99   18.97   19.12   10.30   10.48   10.68   10.78    5.75    5.40    5.17    5.05
                [6.36]  [3.77]  [1.22]  [0.31]  [4.58]  [3.98]  [3.54]  [3.00]  [3.67]  [3.25]  [2.97]  [2.70]  [3.82]  [2.93]  [1.98]  [1.20]
     ES_T^(3)    49.88   49.92   49.94   49.95   19.47   19.50   19.62   19.74   11.70   11.27   10.89   10.54    6.82    6.01    5.51    5.29
                [0.98]  [0.69]  [0.48]  [0.31]  [3.94]  [3.09]  [2.24]  [1.38]  [4.54]  [3.53]  [2.62]  [1.66]  [4.76]  [3.24]  [1.88]  [0.98]

                              2005–6    2007–8    2009–10   2011–12   2013–14
A. Number of transactions
   Mean                          975      2098      1947      1697      1757
   Std. dev.                    1023      2001      2635      2019      1836
   5th percentile                174       409       283       228       268
   95th percentile              2740      5831      5867      5248      4874
B. Effective spread (bps)
   Mean                         12.6      13.1      15.6      13.2      13.0
   Std. dev.                     7.3       9.0      12.1       9.0       8.7
   5th percentile                5.8       5.3       5.4       4.8       4.8
   95th percentile              25.9      27.7      36.2      30.0      31.0
C. Realized volatility (bps)
   Mean                        168.8     246.0     252.3     192.0     156.3
   Std. dev.                    67.7     151.8     133.4      92.3      64.4
   5th percentile               86.6      89.7     100.6      84.6      80.9
   95th percentile             293.4     559.1     498.0     362.4     272.5
D. Signal-to-noise ratio
   Mean                        0.078     0.058     0.063     0.071     0.084
   Std. dev.                   0.046     0.025     0.030     0.036     0.047
   5th percentile              0.039     0.030     0.031     0.033     0.040
   95th percentile             0.144     0.103     0.116     0.136     0.169

Table 2: The descriptive statistics are calculated over all stock days in a given two-year period. The effective spread and realized volatility were winsorized at the 99.5% level, separately for each stock, before pooling and calculating the stock-day descriptive statistics. The sample consists of 147 small-cap stocks over the period from January 2005 to December 2014, spanning 2,517 business days.
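The winsorization step mentioned in the note to Table 2 can be sketched as follows. This is an illustration under assumptions rather than the procedure used in the paper: it caps only the upper tail, it uses a nearest-rank percentile, and the function name `winsorize_upper` is hypothetical. It would be applied separately to each stock's series before pooling.

```python
import math


def winsorize_upper(values, level=0.995):
    """Cap values above the `level` quantile (nearest-rank definition).
    Assumption: only the upper tail is winsorized."""
    ordered = sorted(values)
    # round() guards against floating-point noise in level * n before ceil().
    rank = max(1, math.ceil(round(level * len(ordered), 9)))
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values]


# One stock's daily effective spreads (made-up numbers with a single outlier).
spreads = [12.0] * 199 + [400.0]
capped = winsorize_upper(spreads)
print(max(spreads), max(capped))
```

A two-sided version would apply the same capping to the lower tail with the complementary quantile.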

Table 3 (first part; panels C and D follow)

                       Monthly                          Quarterly                        Semiannual                       Annual
             Mean    Bias    RMSE    Corr      Mean    Bias    RMSE    Corr      Mean    Bias    RMSE    Corr      Mean    Bias    RMSE    Corr
A. All transactions, original spreads
True         13.52   -       -       -         13.52   -       -       -         13.52   -       -       -         13.52   -       -       -
ES_T^(1)     91.34   77.82   95.01   0.342     92.61   79.09   92.39   0.364     92.12   78.60   90.87   0.378     91.51   77.99   89.29   0.410
ES_T^(2)     79.14   65.62   85.59   0.403     86.31   72.79   97.08   0.472     86.56   73.04   95.88   0.456     88.57   75.05   94.96   0.434
ES_T^(3)     81.38   67.86   86.61   0.334     79.83   66.30   85.61   0.389     82.58   69.06   85.69   0.390     85.44   71.92   86.09   0.395
Roll         8.731   -4.789  6.007   0.955     8.813   -4.710  5.785   0.967     8.874   -4.646  5.599   0.969     8.988   -4.533  5.362   0.969
RV_all       13.16   -0.357  2.345   0.965     13.25   -0.270  2.154   0.969     13.34   -0.181  2.093   0.968     13.49   -0.027  2.046   0.967
RS_T^(1)     7.833   -5.687  10.73   0.455     7.106   -6.417  10.07   0.501     6.872   -6.648  9.688   0.542     6.770   -6.752  9.165   0.621
RS_T^(2)     11.10   -2.426  6.444   0.768     11.00   -2.520  5.317   0.841     10.93   -2.591  4.863   0.872     10.91   -2.617  4.507   0.893
RS_T^(3)     9.429   -4.092  5.095   0.974     9.407   -4.116  5.082   0.977     9.352   -4.168  5.156   0.975     9.318   -4.204  5.171   0.970
B. All transactions, spreads multiplied by 10
True         135.3   -       -       -         135.3   -       -       -         135.3   -       -       -         135.3   -       -       -
ES_T^(1)     207.4   72.13   89.04   0.937     209.5   74.24   87.35   0.957     210.8   75.51   87.60   0.964     212.9   77.63   89.68   0.960
ES_T^(2)     210.7   75.44   101.2   0.732     213.2   77.87   100.2   0.755     214.9   79.63   100.6   0.763     218.3   82.99   102.0   0.769
ES_T^(3)     256.4   121.1   136.6   0.775     243.6   108.3   122.1   0.808     238.2   102.9   114.5   0.837     232.0   96.69   106.0   0.866
Roll         107.6   -27.65  39.87   0.960     108.4   -26.88  37.79   0.965     109.1   -26.15  36.07   0.965     110.4   -24.88  33.84   0.965
RV_all       128.9   -6.359  22.04   0.971     129.8   -5.470  20.15   0.975     130.6   -4.612  19.63   0.974     132.2   -3.065  19.30   0.971
RS_T^(1)     128.9   -6.375  15.07   0.989     128.7   -6.627  13.41   0.994     128.0   -7.220  15.05   0.992     127.5   -7.762  16.45   0.988
RS_T^(2)     129.5   -5.793  13.82   0.993     129.3   -5.995  13.52   0.994     128.7   -6.570  15.25   0.992     128.2   -7.026  16.76   0.988
RS_T^(3)     119.9   -15.41  20.62   0.995     119.5   -15.76  21.13   0.995     118.8   -16.41  23.14   0.992     118.5   -16.82  24.48   0.986

Table 3 (continued)

                       Monthly                          Quarterly                        Semiannual                       Annual
             Mean    Bias    RMSE    Corr      Mean    Bias    RMSE    Corr      Mean    Bias    RMSE    Corr      Mean    Bias    RMSE    Corr
C. Every 10th transaction, original spreads
True         13.52   -       -       -         13.52   -       -       -         13.52   -       -       -         13.52   -       -       -
ES_T^(1)     89.34   76.44   93.50   0.339     90.63   77.72   90.95   0.365     90.03   77.13   89.39   0.376     89.46   76.55   87.85   0.406
ES_T^(2)     75.89   62.98   81.96   0.453     85.92   73.01   97.05   0.482     86.90   74.00   97.06   0.477     88.96   76.06   96.90   0.446
ES_T^(3)     81.84   68.94   85.59   0.379     81.57   68.66   85.21   0.412     84.25   71.35   85.46   0.424     87.23   74.33   85.91   0.444
Roll         6.901   -6.003  9.607   0.621     6.820   -6.087  9.040   0.655     6.835   -6.068  8.702   0.675     6.911   -5.991  8.366   0.679
RV_all       31.28   18.37   22.59   0.870     31.52   18.61   22.63   0.877     31.73   18.82   22.72   0.877     32.08   19.18   22.87   0.874
RS_T^(1)     8.069   -5.295  11.18   0.422     7.079   -6.285  10.27   0.450     6.728   -6.633  9.868   0.480     6.473   -6.892  9.477   0.539
RS_T^(2)     11.07   -2.292  7.187   0.715     10.93   -2.440  5.619   0.807     10.84   -2.518  4.975   0.849     10.68   -2.681  4.672   0.873
RS_T^(3)     8.597   -4.767  6.067   0.925     8.562   -4.802  5.905   0.952     8.505   -4.856  5.922   0.960     8.470   -4.895  5.928   0.957
D. Every 10th transaction, spreads multiplied by 10
True         135.3   -       -       -         135.3   -       -       -         135.3   -       -       -         135.3   -       -       -
ES_T^(1)     206.3   72.66   91.13   0.933     208.6   74.88   88.97   0.956     209.8   76.12   89.04   0.964     211.9   78.21   90.94   0.962
ES_T^(2)     183.3   49.64   76.69   0.762     182.9   49.24   73.47   0.782     183.6   49.97   71.87   0.792     186.7   53.03   74.19   0.777
ES_T^(3)     209.7   76.05   90.98   0.855     202.7   69.01   82.66   0.874     199.7   66.08   77.21   0.899     197.2   63.53   71.75   0.926
Roll         161.5   27.84   41.81   0.968     162.9   29.23   40.13   0.978     164.0   30.33   41.24   0.978     166.0   32.36   43.53   0.975
RV_all       170.5   36.81   48.75   0.978     171.8   38.14   48.52   0.984     172.9   39.24   50.00   0.983     175.0   41.33   52.45   0.979
RS_T^(1)     127.7   -6.016  16.35   0.985     127.5   -6.225  13.40   0.992     126.8   -6.818  14.73   0.991     124.7   -9.049  18.19   0.987
RS_T^(2)     128.2   -5.454  13.54   0.992     128.1   -5.598  12.59   0.994     127.5   -6.155  14.26   0.992     125.5   -8.222  18.21   0.986
RS_T^(3)     126.4   -7.264  12.29   0.995     126.2   -7.523  11.67   0.996     125.5   -8.174  13.96   0.994     125.1   -8.558  16.28   0.988

Table 3: Empirical results for the various effective spread estimators applied to 147 small-cap stocks between January 2005 and December 2014. The table reports the mean effective spread estimates (Mean), the bias (Bias) with respect to the actual effective spreads observed in the TAQ data (True), the RMSE, and the correlation between estimated and actual values of the effective spread (Corr). The results are obtained by pooling across stocks.

Cite this document

APA

Filip Zikes (2017). Measuring Transaction Costs in the Absence of Timestamps (FEDS 2017-045). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2017-045

BibTeX

@techreport{wtfs_feds_2017_045,
  author = {Filip Zikes},
  title = {Measuring Transaction Costs in the Absence of Timestamps},
  type = {Finance and Economics Discussion Series},
  number = {2017-045},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2017},
  url = {https://whenthefedspeaks.com/doc/feds_2017-045},
  abstract = {This paper develops measures of transaction costs in the absence of transaction timestamps and information about who initiates transactions, which are data limitations that often arise in studies of over-the-counter markets. I propose new measures of the effective spread and study the performance of all estimators analytically, in simulations, and present an empirical illustration with small-cap stocks for the 2005-2014 period. My theoretical, simulation, and empirical results provide new insights into measuring transaction costs and may help guide future empirical work.},
}