feds · December 19, 2024

A Market Interpretation of Treatment Effects

Abstract

Markets, likened to an invisible hand, often appear to contradict econometric assumptions that rule out spillovers of one person’s treatment on another’s outcomes. This paper provides a simple statistical framework highlighting that controls are indirectly affected by the treatment through the market. Further, the effect of the treatment on the treated reveals only part of the consequence for the treated of treating the entire market. When combined with economic theory, our framework leads to a new application of Marshall’s Laws of Derived Demand that relates econometric estimates of treatment effects in the marketplace to the substitution and scale effects of demand theory. We show how treatment-effect estimators can diverge – both in magnitude and direction – from the causal effects of treatment on the treated or counterfactual policies treating all market participants. The framework shows how the consequences of targeted treatments reveal the effects of marketwide treatments, and the role of market frictions in that inference. Examples from labor, public finance, economic geography, development, and the macro literature on the “missing intercept” are provided.

Finance and Economics Discussion Series
Federal Reserve Board, Washington, D.C.
ISSN 1936-2854 (Print) ISSN 2767-3898 (Online)

A Market Interpretation of Treatment Effects
Robert Minton; Casey B. Mulligan
2024-096

Please cite this paper as: Minton, Robert, and Casey B. Mulligan (2024). “A Market Interpretation of Treatment Effects,” Finance and Economics Discussion Series 2024-096. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2024.096.

NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

A Market Interpretation of Treatment Effects* by Robert Minton and Casey B. Mulligan† November 2024

* This paper stems from the earlier “Difference-in-Differences in the Marketplace” piece (NBER wp 32111, FEDS 2024.008) that was written for a price theory audience and centered around an industry-equilibrium model. We thank seminar participants at the Federal Reserve Board, the University of Chicago and the participants of the 2024 Price Theory Summer Camp. We also appreciate comments from Jim Heckman, Jack MountJoy, Evan Munro, Allen Sanderson, Alex Torgovitsky, Giuseppe Forte, Josh Gross, João Pugliese, Alex Tordjman, Harald Uhlig.

† Affiliations and contact: Federal Reserve Board, Robert.j.minton@frb.gov and University of Chicago, c-mulligan@uchicago.edu, respectively.
The views in this paper are those of the authors and do not represent those of the Board of Governors of the Federal Reserve System or the Federal Reserve System.

I. Introduction

Markets, likened to an invisible hand, often appear to contradict econometric assumptions that rule out spillovers of one person’s treatment on another’s outcomes. Beyond coordinating activities across the supply and demand sides of the market, the invisible hand also coordinates activities within the supply and demand sides of the market, even when participants do not directly interact. Spillovers are thus a hallmark of markets.

Our paper provides a model of market spillovers to investigate the relationship between treatment-effect estimators and counterfactual outcomes resulting from more broadly applied treatments. We show that traditional estimators do not measure the effect of scaling up treatments, even when adjusted for the spillover of the treatment on controls. Although not relying on maximization, the framework has the Hicks-Marshall Laws of derived demand as a special case in which price treatments have scale and substitution effects on quantity outcomes. We then formalize how effects of scaling up the treatment can be extrapolated from treatment-control comparisons in the presence of market frictions.

We begin by illustrating why the market spillovers resulting from a marketwide treatment are likely significant even when the spillovers from a targeted treatment are negligible by some metrics. In this example, market demand is price inelastic. Some suppliers experience a productivity treatment, while others are untreated. The additional output from the treated drives down the market price that all suppliers receive. The revenue growth of treated relative to untreated suppliers is positive, but the revenue effect of treating all suppliers is negative. In this sense, the treatment effect has the opposite sign, besides having the wrong magnitude.

We then develop a simple statistical framework that allows for market spillovers and distinguishes targeted treatments from marketwide treatments.
Treatment-control comparisons, which we term “difference-in-differences” (hereafter, DiD), differ from meaningful effects such as the treatment effect on the treated (hereafter, ToT) and the effect of a marketwide treatment (hereafter, the “scale effect”). Indeed, the difference between scale and DiD can serve as a definition of market spillovers. We establish that knowledge of ToT and DiD is insufficient to compute the scale effect. ToT and DiD differ by the spillover effect of the targeted treatment on the control group. The scale effect and the ToT differ by the counterfactual spillover effect on the treated of additionally treating all the controls. The former spillover effect can differ substantially from the latter, particularly when the treated group is “small” relative to the control group. Formally, ToT is a weighted average of the scale effect and DiD. The weight is unknown, although in many applications it is expected to be related to the fraction of the market that is treated. ToT and DiD become less informative about the scale effect as the share of treated units falls to zero. In the limit of a zero treated share, ToT and DiD coincide, differing from the scale effect by an arbitrary degree. Research designs with a small treated share are underpowered for the purposes of estimating market spillovers.
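The weighted-average statement can be made concrete with a small numerical sketch. The function names, the symbol `weight`, and all numbers below are illustrative assumptions rather than anything taken from the paper. The sketch only shows that the same observed ToT and DiD are consistent with different scale effects, depending on the unknown weight, and that ToT collapses toward DiD as the weight shrinks.

```python
def tot_from(scale, did, weight):
    """ToT written as a weighted average of the scale effect and DiD."""
    return weight * scale + (1.0 - weight) * did


def implied_scale(tot, did, weight):
    """Invert the weighted average; impossible when the weight is zero."""
    if weight == 0:
        raise ValueError("the scale effect cannot be recovered when the weight is zero")
    return (tot - (1.0 - weight) * did) / weight


# Hypothetical observed values of ToT and DiD.
tot, did = 0.9, 1.0

# The same (ToT, DiD) pair implies different scale effects under different weights.
scale_if_half = implied_scale(tot, did, 0.5)
scale_if_tenth = implied_scale(tot, did, 0.1)

# With a tiny weight, ToT is close to DiD even when the scale effect is far away.
gap_small_share = abs(tot_from(-2.0, did, 0.001) - did)

print(scale_if_half, scale_if_tenth, gap_small_share)
```

Because the weight multiplies the scale effect, any sampling error in ToT or DiD is divided by the weight when solving for the scale effect, which is one way to read the remark that designs with a small treated share are underpowered.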

We then use demand theory to derive additional restrictions on the relationship between DiD, ToT, and scale. Prices are the treatments and quantities are outcomes. Market spillovers become cross-price effects on demand. A first result is that, although the treated goods are potentially heterogeneous, Hicks’s Composite Commodity theorem says that their aggregate can be treated as a single good due to the commonality of their treatment. This holds regardless of whether treated and controls are the same in the baseline, as is the goal of a randomized experiment, or have systematic baseline differences as in natural experiments. Second, DiD measures the degree of substitution in preferences between treatments and controls, regardless of the fraction of the market that is treated or the magnitude of market spillovers. Put differently, DiD contains no information about the scale effect, which is the degree of substitution with goods outside the market where treated and controls participate. Third, the Hicks-Marshall Laws of derived demand show ToT to be an expenditure-share weighted average of the scale effect and DiD.

Although we focus on market spillovers within the demand or supply side of the market, another role of prices in equilibrium models is to equalize quantities supplied and demanded. This function is left implicit in our analysis because it is already familiar in econometrics, particularly regarding the simultaneous feedback between supply and demand schedules. However, we show that our results generalize to a simultaneous equilibrium analysis: the scale and DiD effects we identify are both eigenvalues of a matrix of direct and indirect treatment effects on the same side of the market. Therefore, any arithmetic operations on treatment-effects matrices – such as combining demand and supply effects in a simultaneous system – translate into the same scalar operations on their respective scale effects and DiDs.
Neither the statistical nor demand-theoretic frameworks by themselves indicate how the treatment effect of a targeted treatment provides any information about the consequences of a marketwide treatment. We show how they are related in models of market frictions. The first case completely segregates treatment and controls in unrelated markets. DiD and ToT are equivalent because control outcomes are unaffected by spillovers. However, if there are any untreated in the market with the treated, DiD and ToT are different from the scale effect, understood as the effect of treating the entire market where the targeted treatment was administered. Even so, the scale effect can be inferred from the outcomes of a targeted treatment if, additionally, outcomes are measured for some of the untreated participants in the market. We analyze a second case that lacks any rigid barrier between markets, but in which spillovers diminish with economic distance from the treated. By considering comparison groups at varying distances from the treated, DiD, ToT, and scale can each be inferred from the outcomes of a targeted treatment. In both cases, the scale effect is outside the range spanned by DiD and ToT.

Finally, we provide several additional examples in which acknowledging equilibrium effects profoundly changes the interpretation of DiD estimates. These include a variety of models with time and region fixed effects, a model of the effects of targeted wage subsidies, and a discussion of why the “missing intercept” literature in macroeconomics relates to the difference between ToT and DiD rather than their differences from the scale effect.

Section II covers our illustration of market spillovers. Section III covers our statistical framework and its economic interpretation in terms of Marshall’s laws of derived demand. Section IV covers our guidance on how to recover scale effects. Section V covers additional examples. Section VI concludes.

I.A Related Literature

Econometric results on causal research designs, along with recent extensions in the literature, often rely on the assumption of “no spillovers.”1 “Spillovers” and “peer effects” are treated in microeconometrics as advanced, albeit interesting, topics that primarily arise when there are “externalities” (Angrist & Pischke, 2008; Athey & Imbens, 2017). Attempts to relax this assumption entail structure on how treatment spills over to the controls (Manski, 1993), a structure that could be economic or statistical. The statistical approach reviewed in Huber (2023) might allow spillovers from observations within predetermined clusters but not from observations outside those clusters (Sobel, 2006; Hong & Raudenbush, 2006; Hudgens & Halloran, 2008). It might also allow spillovers that are decreasing with network distance (Viviano et al. 2024) or geographical distance (Butts 2023).

Our contribution aligns with the economic approach, which we view as lacking in the general frameworks more recently available in statistics. We provide closed-form results for interpreting quantity or price comparisons, showing how these estimates relate to broader treatment effects on the entire market. Our approach focuses on spillovers mediated by market forces as opposed to spillovers through externalities (such as the urban knowledge spillovers in Jacobs (1969) or spillovers of medical treatment in Miguel and Kremer (2004)).

The analysis of specific market-based spillovers is extensive and spans many fields. In urban economics, for example, Glaeser and Gottlieb (2009) assess the benefits of easy labor mobility across firms within cities. See also Banzhaf (2021). In labor economics, Monte, Redding, and Rossi-Hansberg (2018) find evidence that commuting is an important adjustment mechanism for localized labor demand shocks. Crépon et al.
(2013) find that gains to unemployed job seekers of job placement assistance can be offset by displacement effects for those who did not receive the program. Cautioning against “inattention to the market consequences of the [programs evaluated],” Heckman, Lochner, and Taber (1999) provide an equilibrium model for evaluating both behavior and welfare effects of tuition subsidies and other public policies. Heckman, LaLonde, and Smith (1999) conclude that “the costs of ignoring indirect [equilibrium] effects may be substantial.”2 In development economics, Egger et al. (2022) find that transfer payments in one village can affect outcomes in nearby villages. Cunha, De Giorgi, and Jayachandran (2015) and Muralidharan, Niehaus, and Sukhtankar (2017) also find that market spillovers are important. As discussed in our Section V, public economics acknowledges that the introduction of state-specific cigarette taxes may affect the wholesale price of cigarettes faced by all states, a broader market response not captured by analyses comparing retail price changes in different states.

1 See, for example, de Chaisemartin and d’Haultfoeuille (2020); Goldsmith-Pinkham, Sorkin and Swift (2020); and Borusyak, Hull and Jaravel (2022). Sometimes “no spillovers” is called “no interference” or is wrapped into the broader notion of the “stable unit treatment value assumption.”

2 See also Heckman and Pinto (2024), who cite the “simultaneous causality” inherent in market clearing as a reason why the “treatment-control paradigm” is too narrow.

Munro et al.’s (2021) observation that “the interference pattern produced by marketplace price effects is dense and simultaneously affects all units, so cluster- or sparsity-based methods are not applicable” is consistent with the view of market equilibrium taken in this paper. They refer to a “Global Treatment Effect (GTE)” which we link to the “scale effect” from price theory. Both our paper and theirs treat this as “a meaningful policy-relevant counterfactual of treating all individuals in the [market] compared to treating no individuals in the [market].” Their “average direct effect” is analogous to what we call the “difference-in-differences” estimator. Our paper differs from Munro et al. by emphasizing the economic interpretation of treatment effects and enabling statistical analyses in observational settings of market spillovers when randomized experiments are unavailable. Tools from economic theory point to several results such as our eigenvalue characterization of various treatment-control comparisons as built on just two “fundamentals:” scale and substitution effects. They also point to the role of market frictions in using information from a targeted treatment to estimate a counterfactual scale effect. The economic-theory approach emphasizes that market spillovers are not a nuisance, but rather of intrinsic interest. Muralidharan and Niehaus (2017) highlight the dual nature of spillovers, saying that “comparing outcomes for (randomly) treated and untreated neighbors will yield a doubly biased estimate of the average impacts of treating both, since it ‘nets out’ spillovers from the treated to the untreated and also fails to capture the effects of spillovers that would have occurred from the untreated to the treated had the former been treated as well” (p. 109). Spillovers are a reason that experimenters should consider conducting their experiments on a large scale, they say. 
Our findings call that advice into question if the spillovers operate through a frictionless market. Our approach agrees that the effect on the treated of a widespread treatment is closer to the scale effect than the effect of a targeted treatment. The problem is that the effects on controls also grow with the size of the treatment group. Without market frictions, treatment-effect estimators can be worse than “biased” – they may contain no information about the scale effect, regardless of the size of the experiment.3

3 Muralidharan and Niehaus also point out that experimental results may reveal features of the effectiveness of the experimental organization. Our focus is on connections between treatment effects and the tastes and technology in the market being studied.

Sraer and Thesmar (2023) revisit the literature on the misallocation of production factors. In that context, they acknowledge that firm-specific treatments affect untreated firms through market prices. They define three potentially distinct effects that, in our terminology, correspond to DiD, ToT, and the effect of treating all firms. At the same time, they note that some meaningful outcomes are themselves differences among firms. With the commonly assumed log-linear policy functions, their firm differences are free of any indirect effects operating through the market. DiD, ToT, and scale coincide for such outcome measures. An estimate of one is a valid estimate of the other two.

A macroeconomics literature recognizes that cross-sectional outcome comparisons do not reveal the full effect of fiscal policies. For example, households not receiving a transfer are affected by the transfers going to other households. Wolf (2023) refers to the absence of market spillover from cross-sectional comparisons as a “missing intercept” problem. He uses a general equilibrium model (that is, an equilibrium consists of multiple prices and quantities) to explore solutions. The point of our Section III is that equilibrium spillovers are relevant in microeconomics too, even in partial equilibrium (only one price equilibrating). We also note that the missing intercept relates to the difference between DiD and ToT as defined in microeconometrics. Our setup further highlights the distinction between these two effects and the scale effect.

Drawing from market equilibrium analysis, we emphasize that the treated and untreated experience scale and substitution effects in different combinations.4 This analytical approach has parallels with Heckman and Vytlacil’s (2005) expression of estimators as combinations of “marginal treatment effects,” each of which refers to a specific type of individual. To focus on the price theoretic components, this paper considers only limited heterogeneity, namely treated versus untreated and in-market versus out-of-market. It emphasizes market connections, with counterfactual treatment regimes understood as additional distinct combinations of scale and substitution effects.

II. An illustration of equilibrium spillovers

The industry in our illustration has many suppliers of a homogeneous product. The demand curve is $Q = D(P)$, where $P$ is the output price and $Q$ is industry-aggregate quantity. With suppliers producing a homogeneous product, consumers choose the lowest price, which is matched by any supplier that intends to sell any output. All production inputs are held fixed firm-by-firm, while a “treated” subset of the firms experiences an increase in productivity by a factor of $e^t > 1$. The remaining firms in the industry are referred to as “controls” because they are not “exposed” to the productivity treatment, except indirectly through the market.
Although our setting is simplified for clarity, the illustration is not far from important examples such as farming, where capital and fertilizer improvements (i.e., productivity in this example) enable some farmers to produce more crops from a given amount of land than others. Industry-level effects are illustrated in Figure 1a while firm-level effects are shown in Figure 1b. For the purposes of illustration, we assume that the treated and untreated have the same baseline output and revenue, which is why the first vertical line in Figure 1b simultaneously shows the supply of any firm in the baseline and of untreated firms (if any) that have treated competitors. The second vertical line is the supply of any treated firm.

4 Much quantitative work in both micro- and macro-economics treats scale and substitution parameters as constants.

Treating suppliers alters the market equilibrium. Figure 1a marks the baseline equilibrium and the equilibrium when a subset of suppliers is treated. Because market supply has shifted outward, the market price is lower in the latter.

Imagine using a DiD analysis at the firm level to study the effects of the productivity increase on revenues. The gap between the revenues of treated and untreated is represented by the areas $R_4 + R_6$ in Figure 1b. This treatment effect is also DiD (the difference between the average revenue change for treated suppliers and the average change for those untreated) because the two types of suppliers have the same revenue in the baseline. DiD is different from ToT, understood as the amount of revenue that the treated gain relative to the baseline, because ToT includes the revenue loss ($R_3$ in magnitude) from the reduced price. Although the DiD for revenue must be positive, the ToT for revenue would be negative if price falls enough.

Figure 1a also marks the equilibrium when the entire market is treated. Of course, that equilibrium has a lower price than the partial-treatment equilibrium does because the productivity treatment is more widespread. The scale effect $\varepsilon$ for revenue is defined as the difference between revenue under the marketwide treatment and revenue in the baseline. In Figure 1b, that is represented as the area difference $R_6 - R_1 - R_3$. $\varepsilon$ is less than ToT and therefore even more likely to have the opposite sign as DiD. More precisely, $\varepsilon$ and DiD have opposite signs if and only if market demand is price inelastic.

Through market competition, the price charged by any one supplier is largely determined by the productivity of competing suppliers. To put it another way, DiD “correctly” shows that supplier-specific productivity growth has little supplier-specific effect on price, but without clearly indicating the much larger price effects, and more negative revenue effects, of industry-wide productivity growth. A statistician might say that the “control group is contaminated” because the productivity growth of the treated suppliers is “spilling over” to the untreated through competition for consumers. However, the divergence between DiD and $\varepsilon$ is not mitigated by reducing the share treated.

The empirical challenge is that equilibrium spillovers, such as the price effects illustrated in Figure 1a, are of intrinsic interest – they are part of the scale effect $\varepsilon$ – but are differenced out by DiD and treatment-effect estimators. Equilibrium effects can exceed the direct effects, as they must in Figure 1a if demand is price inelastic and revenue is the outcome metric.
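The sign reversal in this illustration can be reproduced numerically. The following is a minimal sketch under assumptions that are not in the paper: demand has a constant price elasticity (the parameter `elasticity`), every firm produces one unit at a price of one in the baseline, and the function names and numbers are hypothetical.

```python
import math


def price(total_output, n_firms, elasticity):
    """Inverse constant-elasticity demand, normalized so the baseline price is 1."""
    return (total_output / n_firms) ** (-1.0 / elasticity)


def revenue_effects(share_treated, t, elasticity, n_firms=100):
    """Return (DiD, ToT, scale effect) for firm revenue, relative to baseline revenue of 1."""
    gain = math.exp(t)  # productivity factor e^t > 1 for the treated

    # Targeted treatment: only a share of firms is more productive.
    q_targeted = n_firms * (share_treated * gain + (1.0 - share_treated))
    p_targeted = price(q_targeted, n_firms, elasticity)
    rev_treated = p_targeted * gain
    rev_control = p_targeted

    # Marketwide treatment: every firm is more productive.
    p_marketwide = price(n_firms * gain, n_firms, elasticity)

    did = (rev_treated - 1.0) - (rev_control - 1.0)
    tot = rev_treated - 1.0
    scale = p_marketwide * gain - 1.0
    return did, tot, scale


# Inelastic demand (elasticity below one): DiD is positive, the scale effect is negative.
did, tot, scale = revenue_effects(share_treated=0.2, t=0.1, elasticity=0.5)
print(did, tot, scale)

# Shrinking the treated share does not bring DiD closer to the scale effect.
did_small, tot_small, scale_small = revenue_effects(share_treated=0.01, t=0.1, elasticity=0.5)
print(did_small, tot_small, scale_small)
```

In this sketch the scale effect does not depend on the treated share at all, while DiD stays positive for any share, which is the sense in which reducing the share treated does not mitigate the divergence.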
Our purpose here is not to discourage treatment-control comparisons, even those with contaminated control groups, but rather to use price theory to help understand what they measure and how the findings can be applied to other settings.

III. Treatments and controls according to Marshall’s laws

III.A. A vector representation of market spillovers

To begin the formal analysis, we consider a population of agents that are designated either as treatments or controls. The treated make up a given share of the population and the controls make up the complementary share. Their outcomes are denoted $T$ and $K$, respectively. The treatment, which directly affects the treated but not the controls, is denoted $t$. We let $k$ denote a comparable shock that directly affects the controls but not the treated. The mappings from the two treatments to outcomes are denoted $T(t,k;\cdot)$ and $K(t,k;\cdot)$, where the final argument of each mapping collects the other factors that influence outcomes for treated and controls, respectively.

Figure 2 illustrates, in the time dimension, the effects of a treatment $dt = 1$. The familiar parts of the diagram are that (i) other factors influence both the $T$ and $K$ outcomes over time and (ii) the effect of the treatment on the treated (ToT) is the difference between the final outcome for the treated and what that outcome would be without treatment. Given our emphasis on market connections between $T$ and $K$, our Figure 2 also allows for an effect of $dt = 1$ on the outcome for the controls.

To focus on equilibrium interpretations of DiD estimates, we maintain the parallel trends assumption that the other factors change by the same amount for the treated and the controls and have the same marginal effects on each of $K$ and $T$. We refer to this assumption as “parallel trends with respect to omitted variables” (PTOV).5 In other words, under PTOV and without any treatments ($dt = dk = 0$), the treated and control groups experience the same outcome changes $dT = dK$. PTOV requires that the heavy dashed lines in Figure 2 be parallel.

With PTOV, the relevant four first partial derivatives of the outcome mapping are represented as a two-by-two matrix $S$:

$$S = \begin{pmatrix} s_{Tt} & s_{Tk} \\ s_{Kt} & s_{Kk} \end{pmatrix} = \begin{pmatrix} \partial T/\partial t & \partial T/\partial k \\ \partial K/\partial t & \partial K/\partial k \end{pmatrix} \tag{1}$$

The first entry in $S$ is the effect $s_{Tt}$ of a unit treatment $t$ on the treated, which is commonly known as ToT and is labeled as such in Figures 1 and 2. The final entry $s_{Kk}$ is the analog of ToT for the controls. The off-diagonal elements reflect spillovers, sometimes known as indirect effects of treatments. The spillover effect shown in Figure 2 is $s_{Kt}$.

$S$’s first column difference and first row sum are central to our interpretation of difference-in-differences. We therefore establish the following definitions:

$$\mathit{DiD} \equiv s_{Tt} - s_{Kt} \tag{2}$$

$$\varepsilon \equiv s_{Tt} + s_{Tk} \tag{3}$$

5 Up to sampling error, PTOV is guaranteed if treatments are randomly assigned.

Definition (2) is our representation of difference-in-differences (more literally, a difference in treatment derivatives) under the aforementioned parallel-trends assumption. DiD subtracts the effect, measured per unit $t$, of the treatment $t$ on controls from its effect on the treated. The definition (3) refers to the “scale effect,” which is the effect on the treated group of applying the treatment uniformly across the entire population, or what we call “the entire market.” The scale effect $\varepsilon$ is often the parameter of interest.

The case of no spillovers has $S$ as a diagonal matrix ($s_{Tk} = 0 = s_{Kt}$), with no difference between DiD and $\varepsilon$ or DiD and ToT. While not ruling out the zero-spillover case, the purpose of this paper is to link the off-diagonal elements to the diagonal elements and to results from price theory. More generally, the difference between the scale effect and DiD is the sum of the spillover elements of $S$: $\varepsilon - \mathit{DiD} = s_{Tk} + s_{Kt}$.6

Another restriction on the $S$ matrix also resembles parallel trends and drives many of our results. Specifically, administering the treatment uniformly to both the treated group and control group should not affect the difference between their outcomes:

$$s_{Tt} + s_{Tk} = s_{Kt} + s_{Kk} \tag{4}$$

We refer to assumption (4) as “Parallel Trends for Parallel Treatments” (PTPT). At this point, PTOV is distinct from PTPT. Proposition 1 establishes that the familiar procedure of dividing differential outcome changes by a treatment differential yields DiD if and only if the PTPT assumption (4) holds.

PROPOSITION 1 (Differential and parallel treatments). Assume parallel trends for omitted variables (PTOV) and that $dk$ is neither 0 nor equal to $dt$. Then the PTPT assumption (4) is equivalent to (5):

$$\mathit{DiD} = \frac{dT - dK}{dt - dk} \tag{5}$$

Proof. To obtain an expression for the numerator in (5), totally differentiate $T(t,k;\cdot) - K(t,k;\cdot)$. With PTOV eliminating the terms for the other factors, the numerator is $\mathit{DiD}\,(dt - dk)$ plus the product of $dk$ and the difference between the LHS and RHS of (4). With $dk \neq 0$, the RHS of (5) equals DiD if and only if equation (4) is satisfied. QED

A corollary to Proposition 1 is that, with PTPT, equation (5) corresponds to the DiD defined in (2) regardless of whether treatments are solely for the treatment group ($dt \neq 0 = dk$) or solely for the control group ($dt = 0 \neq dk$). If treatment and control groups are identical in the baseline, as is the design of randomized controlled trials, then equation (5) becomes the ratio of the outcome gap $T - K$ to the treatment gap $t - k$.

6 See also Munro et al.’s (2021) expression of a “global treatment effect” as the sum of “direct” and “indirect” treatment effects.
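Proposition 1 can be checked numerically with a short sketch. The matrices, the treatment changes, and the function names below are hypothetical choices made for illustration. The code applies the linear mapping implied by (1), forms the ratio in (5), and compares it with the definition (2) for one matrix that satisfies (4) and one that does not.

```python
def outcome_changes(S, dt, dk):
    """First-order outcome changes (dT, dK) implied by the matrix S in (1)."""
    (s_Tt, s_Tk), (s_Kt, s_Kk) = S
    return s_Tt * dt + s_Tk * dk, s_Kt * dt + s_Kk * dk


def did_ratio(S, dt, dk):
    """Right-hand side of (5): differential outcome change per unit of treatment differential."""
    dT, dK = outcome_changes(S, dt, dk)
    return (dT - dK) / (dt - dk)


def did_definition(S):
    """Definition (2): first-column difference of S."""
    return S[0][0] - S[1][0]


def ptpt_gap(S):
    """LHS minus RHS of (4); zero when PTPT holds."""
    return (S[0][0] + S[0][1]) - (S[1][0] + S[1][1])


# Hypothetical matrices: both rows of S_ptpt sum to 0.5, so (4) holds; S_not violates it.
S_ptpt = ((0.8, -0.3), (0.2, 0.3))
S_not = ((0.8, -0.3), (0.2, 0.5))
dt, dk = 1.0, 0.25  # dk is neither zero nor equal to dt

ratio_ptpt = did_ratio(S_ptpt, dt, dk)
ratio_not = did_ratio(S_not, dt, dk)

# The proof's decomposition: the ratio equals DiD plus dk * (PTPT gap) / (dt - dk).
predicted_not = did_definition(S_not) + dk * ptpt_gap(S_not) / (dt - dk)

print(ratio_ptpt, did_definition(S_ptpt))
print(ratio_not, did_definition(S_not), predicted_not)
```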

Figure 3 illustrates the model (1)-(4) for the case that $\varepsilon > \mathit{DiD}$, showing all four elements of $S$.7 The axes measure outcomes for controls and treatments. The square point represents the baseline, showing outcomes absent any treatment. The green vector is the first column of $S$, showing the effects on both groups of treating only the treated. As shown, that vector is not vertical but has a slope greater than 45 degrees, which indicates that $t$ has a spillover effect, although one that is less than the direct effect on the treated group. Unsurprisingly, the DiD (red segment) measures the distance between the treatment effect and the 45-degree line. The black vector, which is the second column of $S$, shows the effect of subsequently treating the rest of the market. The sum of the two arrows follows the 45-degree line if and only if the PTPT assumption (4) holds. The vertical and horizontal dimension of their sum is $\varepsilon$.

Figure 4 illustrates a case with the same scale effect as Figure 3, but with DiD closer to zero. In contrast, Figure 5’s case has the same DiD as Figure 3 but no scale effect (treating the controls “undoes” the effects of $t$).8 As such, it also shows an instance of $\varepsilon < \mathit{DiD}$.

7 It also shows $\mathit{DiD} > 0$, although for what follows the sign of each of $\varepsilon$ and DiD is less important than the sign of their difference.

8 The area of the triangle shown in Figures 2-4 is half of the magnitude of $\mathit{DiD} \cdot \varepsilon$.

Let $\lambda$ denote the share of the between-group spillover effects of a full market treatment $dt = dk$ that would be experienced by the control group.

$$\lambda \equiv \frac{s_{Kt}}{s_{Kt} + s_{Tk}} = \frac{s_{Kt}}{\varepsilon - \mathit{DiD}} \tag{6}$$

The symmetry of our discussion of treated and controls suggests that $\lambda$ would be closely related to the treated share of the population, if not identical to it, because a larger treatment group is expected to have a greater effect on the controls than treating a comparatively small group would. However, until we say more about the units of $K$, $T$, $k$, and $t$ (see subsection III.B and following), a precise relationship between $\lambda$ and the treated share cannot be specified. Regardless, the common intuition that small-scale treatments (a treated share near zero) have near-zero spillover effects on the controls can be represented by assuming that $\lambda$ is near zero.

PROPOSITION 2 (Treatment effects decomposition). If the PTPT assumption (4) holds, then the treatment effects matrix $S$ can be written in terms of DiD, $\varepsilon$, and $\lambda$, as defined in (2), (3), and (6):

$$S = \begin{pmatrix} \lambda\varepsilon + (1-\lambda)\mathit{DiD} & (1-\lambda)(\varepsilon - \mathit{DiD}) \\ \lambda(\varepsilon - \mathit{DiD}) & (1-\lambda)\varepsilon + \lambda\,\mathit{DiD} \end{pmatrix} \tag{7}$$

Proof. The share $\lambda$ defined by (6) distributes the sum of the spillover terms, already established to be $\varepsilon - \mathit{DiD}$, between its two components as shown in (7). From (2), the $s_{Tt}$ term is the sum $\mathit{DiD} + s_{Kt}$ and therefore what is shown in (7). PTPT requires that $s_{Kk} = \mathit{DiD} + s_{Tk}$, which is the result shown in (7). QED

The definitions and axiom (2)-(6) allow the diagonal of $S$ to be expressed entirely in terms of weighted averages of DiD and $\varepsilon$, using $\lambda$ and $1-\lambda$ as weights. The off-diagonal “spillover” elements are the difference between the scale effect and DiD, scaled by either $\lambda$ or $1-\lambda$. The direction of the spillover effects can therefore be understood as a comparison between the scale effect and DiD.

The expression (7) and the intuition about signing market spillovers are familiar from price theory, where they are known as Marshall’s Laws of Derived Demand. This connection is made explicit in section III.B. Expression (7) also formalizes that, when $\lambda$ is interpreted as the treated share of the market and is small, the ToT is approximately equivalent to DiD and is uninformative about the scale effect $\varepsilon$.
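As a check on the decomposition in Proposition 2, the sketch below builds the matrix in (7) from hypothetical values of DiD, the scale effect, and the spillover share, then recovers those three quantities from the matrix using definitions (2), (3), and (6). The function name `s_matrix` and the numbers are assumptions made for illustration.

```python
def s_matrix(did, scale, lam):
    """Treatment-effects matrix in the form of expression (7), rows (T, K), columns (t, k)."""
    spill = scale - did  # total spillover, equal to s_Tk + s_Kt
    return (
        (lam * scale + (1.0 - lam) * did, (1.0 - lam) * spill),
        (lam * spill, (1.0 - lam) * scale + lam * did),
    )


# Hypothetical inputs: positive DiD, negative scale effect, spillover share of one quarter.
S = s_matrix(did=1.0, scale=-0.5, lam=0.25)
(s_Tt, s_Tk), (s_Kt, s_Kk) = S

did_back = s_Tt - s_Kt            # definition (2)
scale_back = s_Tt + s_Tk          # definition (3)
lam_back = s_Kt / (s_Kt + s_Tk)   # definition (6)
ptpt_gap = (s_Tt + s_Tk) - (s_Kt + s_Kk)  # zero under assumption (4)

# With a small spillover share, ToT (the first entry) is close to DiD, whatever the scale effect.
tot_small_share = s_matrix(did=1.0, scale=-0.5, lam=0.01)[0][0]

print(S)
print(did_back, scale_back, lam_back, ptpt_gap, tot_small_share)
```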
The eigenvalues of S are simple, of intrinsic interest, and useful for establishing additional results.

COROLLARY. Under the PTPT assumption (4), the eigenvalues of S are DiD and $\varepsilon$. The matrix sum (product) of two matrices each of the form (7) itself has the form (7), with one eigenvalue that is the sum (product) of the two component DiDs and another eigenvalue that is the sum (product) of the two $\varepsilon$s, respectively.

Proof. Use (7), which requires (4), to calculate the eigenvalues. QED
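The decomposition (7) and the corollary can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the numerical values of DiD, $\varepsilon$, and $\lambda$ are hypothetical, chosen only for illustration. It builds S from (7), confirms that each row sums to the scale effect and each column difference equals DiD, and computes the eigenvalues of S and of a product of two such matrices.

```python
def treatment_matrix(did, eps, lam):
    """Return S = [[s_Tt, s_Tk], [s_Kt, s_Kk]] following equation (7)."""
    return [
        [lam * eps + (1 - lam) * did, (1 - lam) * (eps - did)],
        [lam * (eps - did), (1 - lam) * eps + lam * did],
    ]


def eigenvalues_2x2(m):
    """Eigenvalues of a real 2x2 matrix from its trace and determinant."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    # For matrices of the form (7) the discriminant is (eps - DiD)^2 >= 0;
    # max() guards against tiny negative values from rounding.
    root = max(trace * trace - 4 * det, 0.0) ** 0.5
    return sorted([(trace - root) / 2, (trace + root) / 2])


def matmul_2x2(a, b):
    """Plain 2x2 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Hypothetical values: DiD = -1.5, scale effect = -0.4, spillover share = 0.25.
S = treatment_matrix(did=-1.5, eps=-0.4, lam=0.25)
row_sums = [sum(row) for row in S]                  # both equal eps (PTPT)
col_diffs = [S[0][0] - S[1][0], S[1][1] - S[0][1]]  # both equal DiD
eig_S = eigenvalues_2x2(S)                          # DiD and eps

# A second matrix of the form (7) with a different spillover share.
S2 = treatment_matrix(did=-2.0, eps=-0.5, lam=0.6)
eig_product = eigenvalues_2x2(matmul_2x2(S, S2))    # DiD*DiD2 and eps*eps2
```

In this sketch the eigenvalues do not move when `lam` is changed, which is the sense in which they are independent of the spillover share.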

Even though (so far) the matrix S has three degrees of freedom, its eigenvalues are independent of the spillover share $\lambda$. Each eigenvalue is of intrinsic interest because DiD is commonly measured while $\varepsilon$ represents a meaningful counterfactual. As a result, any arithmetic operations on treatment-effects matrices (such as combining two different S matrices, one for each of the demand and supply effects in a simultaneous system) translate into the same operations on their respective scale effects and DiDs. The inverse operation on S is particularly informative because the spillover-share parameter is retained:

$$\begin{pmatrix} \lambda\varepsilon + (1-\lambda)DiD & (1-\lambda)(\varepsilon - DiD) \\ \lambda(\varepsilon - DiD) & (1-\lambda)\varepsilon + \lambda DiD \end{pmatrix}^{-1} = \begin{pmatrix} \dfrac{\lambda}{\varepsilon} + \dfrac{1-\lambda}{DiD} & (1-\lambda)\left(\dfrac{1}{\varepsilon} - \dfrac{1}{DiD}\right) \\ \lambda\left(\dfrac{1}{\varepsilon} - \dfrac{1}{DiD}\right) & \dfrac{1-\lambda}{\varepsilon} + \dfrac{\lambda}{DiD} \end{pmatrix} \tag{8}$$

Suppose we have results for quantity outcomes of price treatments. Equation (8) translates those into results for price outcomes of quantity treatments.

III.B. Derived-demand interpretations of treatment effects

When the treatments are prices and outcomes are quantities, or vice versa, demand theory is another lens through which to compare DiD and scale effects as defined above. In contrast to the illustration featured in Figure 1, the demand analysis that follows allows for (but does not require) heterogeneous treatment effects at the individual level and imperfect substitution in demand among the various quantities.

Our first result is that treated outcomes and control outcomes can be interpreted as composite commodities. This interpretation allows the DiD and compensated scale effects to be precisely linked to Allen elasticities of substitution in preferences. The magnitude of DiD proves to be a shadow elasticity of substitution. A final result is that the compensated scale effect summarizes everything about substitution that cannot be discovered with DiD, and vice versa.

Consider an empirical procedure that identifies a group of G suppliers, selecting a fraction of them to be “treated” and using the remainder as a comparison group.
The outcome of interest is equilibrium group-average quantity, expressed in logs. The components of each average are denoted as vectors $\hat{T}$ and $\hat{K}$. T and K represent the per-capita log average quantities, respectively, supplied by the two groups and ultimately consumed by downstream consumers. The selection procedure could be random, although our main results do not require that T be representative of G overall.

The quantities T and K could be labor employed in two different states, as in the minimum wage literature. They could be capital in two different industries. In other applications, T and K might represent sales of distinct retail products, sales or employment of firms in the same industry that differ by size or location, or different sectors of the economy. The model is flexible in accommodating these cases.

The purpose of the comparison is to infer something about the characteristics of market demand. We refer to the “treatments” t and k as log prices on the demand side of the market and

refer to this interpretation as demand-price treatments with quantity outcomes. The dt and dk notation can be understood as a shorthand for a simultaneous-equation analysis of exactly how the T and K suppliers must be treated to result in demand price changes dt and dk. We expand those details later after we establish results connecting $\varepsilon$ and DiD to consumer preferences.

Final consumers have convex preferences $\hat{u}(\hat{T},\hat{K},\hat{X})$ over G+N consumption choices, where N is the dimension of the vector $\hat{X}$ of the “outside” goods supplied by neither the treated nor the controls. The t treatment increases prices of the components of $\hat{T}$ by the common factor $e^{t}$. The k treatment applies to the components of $\hat{K}$, with a price factor of $e^{k}$. At this point, we do not necessarily assume that treated and controls are weakly separable from each other or the outside goods, which is why $\hat{u}$ has G+N distinct arguments.

Composite Commodity Theorem. The only thing that the components of $\hat{T}$ necessarily have in common with each other is the treatment t, and similarly for the components of $\hat{K}$. Nevertheless, that is enough to represent the treated with a composite commodity and the controls with a second composite commodity. As Sir John Hicks (1939/1975, p. 50) put it, “when the relative prices of a group of commodities can be assumed to remain unchanged, they can be treated as a single commodity.” Formally, market responses to the treatment can be understood in terms of a three-dimensional utility function:

$$u(z_T,z_K,z_X;\hat{P}_T,\hat{P}_K,\hat{P}_X) \equiv \max_{\{\hat{T},\hat{K},\hat{X}\}} \hat{u}(\hat{T},\hat{K},\hat{X}) \quad \text{s.t.} \quad \hat{P}_T\cdot\hat{T} \le z_T \;\wedge\; \hat{P}_K\cdot\hat{K} \le z_K \;\wedge\; \hat{P}_X\cdot\hat{X} \le z_X \tag{9}$$

where the scalars $z_T$, $z_K$ and $z_X$ in (9) are weighted sums of T, K, and X quantities, respectively, that would be demanded without any treatment.$^{9}$ Note that the marginal rate of substitution between components of $\hat{T}$, between components of $\hat{K}$, or across those groups can depend on the value of $\hat{X}$.
We include the price vectors $\{\hat{P}_T,\hat{P}_K,\hat{P}_X\}$ in the definition of u to be clear that those must be held constant in what follows, otherwise u becomes a different function of the three aggregates.

The scalars $z_T$ and $z_K$ represent weighted sums, whereas $e^{T}$ and $e^{K}$ are unweighted averages. According to demand theory, summing various quantities is of little economic meaning unless they have the same prices. Equivalently, the quantities of various goods are more meaningfully summed after each has units set so that they have the same price per unit. This is the principle of quantity indexes. Henceforth, we show results for the weighted sums and leave it as an exercise to the reader to determine how treatments affect $(\ln z_T - T)$ and $(\ln z_K - K)$ in specific applications.

9 If there were more than the two treatment values t and k, then application of the composite commodity theorem results in more than two aggregates. Namely, a supplier would be aggregated with all others receiving the same treatment.

Assuming that the treatments have no effect on relative prices among the treated and no effect on relative prices among the controls, the definition (9) can characterize market equilibrium responses to the treatments as shown in (10):

$$\max_{\{z_T,z_K,z_X\}} u(z_T,z_K,z_X;\hat{P}_T,\hat{P}_K,\hat{P}_X) \quad \text{s.t.} \quad e^{t}z_T + e^{k}z_K + z_X \le M \tag{10}$$

The model (10) describes a prototype three-dimensional demand system. Its solution is three Marshallian demand functions for the three quantity indexes $z_T$, $z_K$ and $z_X$, each as a function of income M and three prices, with the usual properties.$^{10}$ It also defines three Hicksian demand functions that are related to the Marshallian system by the usual Slutsky correspondence.

In this setting, PTPT requires that (a) both z aggregates respond in the same proportion to any common treatment dk = dt, and (b) each aggregate’s response is equally composed between substitution and income effects. These conditions are satisfied if, for example, utility over $(z_T,z_K,z_X)$ can equivalently be written as utility over $z_X$ and a homothetic aggregator of $z_T$ and $z_K$, a common weak-separability assumption. These require that both aggregates have the same income elasticity. By definition, for any nonzero dt, PTPT requires Hicksian and Marshallian demand functions to satisfy:

$$\left.\frac{d\ln z_T}{dt}\right|_{dk=dt,\;d\hat{P}=0} = \left.\frac{d\ln z_K}{dt}\right|_{dk=dt,\;d\hat{P}=0} \tag{11}$$

where $d\hat{P} = 0$ is our shorthand for holding constant all three $\hat{P}$ vectors.$^{11}$ In summary, this framework restricts preferences with PTPT (equation (11)) while restricting price treatments to be common among the treated and common among the controls.

Let $\eta^{H}_{TK}$ denote the compensated elasticity of $z_T$ with respect to the price of $z_K$, and likewise for any other pair of {T,K,X}. PTPT (equation (11)) requires the substitution matrix to satisfy $\eta^{H}_{TT} + \eta^{H}_{TK} = \eta^{H}_{KT} + \eta^{H}_{KK}$ in addition to the usual homogeneity and, in levels or substitution elasticities, symmetry conditions. Let $\omega_T$, $\omega_K$, and $\omega_X$ denote the expenditure shares corresponding to each of the three expenditure terms in the model (10)’s budget constraint, respectively. The shares sum to one across T, K, and X.

The off-diagonal terms in equation (1) are the spillovers.
For the demand model (10) and (11), those terms have the economic interpretation of cross-price elasticities $\eta^{H}_{TK}$ and $\eta^{H}_{KT}$, respectively. As already noted, the weight of scale in ToT is the share $\lambda$ of the spillovers that is contamination of the controls. In the demand model that share is $\lambda = \eta^{H}_{KT}/(\eta^{H}_{TK}+\eta^{H}_{KT})$. By Hicksian symmetry, that share can also be written as $\omega_T/(\omega_T+\omega_K)$. In words, a Hicksian ToT is the expenditure-share weighted average of scale $\varepsilon$ and DiD, whose determinants are further revealed in the propositions that follow:

10 The same solution could be obtained by maximizing $\hat{u}(\hat{T},\hat{K},\hat{X})$ with respect to the G+N quantities subject to the budget constraint $e^{t}\hat{P}_T\cdot\hat{T} + e^{k}\hat{P}_K\cdot\hat{K} + \hat{P}_X\cdot\hat{X} \le M$, and then using the price vectors $\{\hat{P}_T,\hat{P}_K,\hat{P}_X\}$ to form the quantity indexes. See also Deaton and Muellbauer (1980, p. 121), although our paper presents much work in the treatment-control paradigm as counterexamples to their assumption that “the usefulness of this theorem in constructing commodity groupings for empirical analysis is likely to be somewhat limited.”

11 Recall that the price vector for the T (K) goods is $e^{t}\hat{P}_T$ ($e^{k}\hat{P}_K$), respectively.

$$ToT = \frac{\omega_T}{\omega_T+\omega_K}\,\varepsilon + \frac{\omega_K}{\omega_T+\omega_K}\,DiD \tag{12}$$

The components of $\hat{T}$ can have different income elasticities, although Engel aggregation of (9) requires that the share-weighted average of their income elasticities coincides with the common income elasticity of $z_T$ and $z_K$. The treatment effect for treated good i is therefore the product of $\partial\ln z_T/\partial t$ and the allocation of $z_T$ among its components based on their relative income elasticities and substitution patterns with the outside good. In other words, the model (9)-(10) allows for rich heterogeneity in treatment effects, although such heterogeneity is not required. The treated goods can also differ in terms of the rate at which they substitute for various control-group goods or with outside goods because these differences are, beyond the income elasticity ratio already noted, irrelevant for predicting responses to t or k.

Let $AES_{TK}$ denote the Allen elasticity of substitution between $z_T$ and $z_K$, which is $\eta^{H}_{TK}/\omega_K$. Defining likewise for all other pairs of {T,K,X}, we have notation for the symmetric 3×3 matrix of Allen elasticities of substitution (AES). The shadow elasticity of substitution between $z_T$ and $z_K$, as defined by McFadden (1963), is expressed in terms of the AES and expenditure shares and denoted $\sigma_{TK}$:

$$\sigma_{TK} \equiv -\frac{\omega_T\,\omega_K}{\omega_T+\omega_K}\left[(AES_{TT}-AES_{TK}) + (AES_{KK}-AES_{TK})\right] > 0 \tag{13}$$

Proposition 3 links the treatment and scale effects to these two concepts of substitution from demand theory:

PROPOSITION 3 (Treatment and scale effects linked to Allen substitution). With PTPT, the comparative statics for the Hicksian demand functions can be expressed in terms of expenditure shares and Allen elasticities of substitution (AES):$^{12}$

$$\left.\frac{d\ln (z_T/z_K)}{dt-dk}\right|_{d\hat{P}=0\neq dt-dk} = (AES_{TT}-AES_{TK})\,\omega_T = -\sigma_{TK} < 0 \tag{14}$$

$$\varepsilon^{H} \equiv \left.\frac{d\ln z_T}{dt}\right|_{d\hat{P}=0=dt-dk=du} = \frac{\omega_X^{2}}{\omega_T+\omega_K}\,AES_{XX} < 0 \tag{15}$$

Proof.
In elasticity form, the differentials of the Hicksian demand functions are:

$$d\ln z_T = \omega_T\,AES_{TT}\,dt + \omega_K\,AES_{TK}\,dk \;\wedge\; d\ln z_K = \omega_T\,AES_{TK}\,dt + \omega_K\,AES_{KK}\,dk \tag{16}$$

12 With $\hat{P}_X$ held constant, $\hat{P}_X\cdot\hat{X}$ is itself a composite commodity with well-defined own- and cross-price Allen elasticities of substitution. Recall that Allen elasticities of substitution can be defined from price derivatives of the cost function corresponding to u.

where we hold constant the utility level and all three $\hat{P}$ vectors and use the symmetry of the Allen elasticity of substitution. PTPT requires that the difference between the two dt coefficients in (16) has the same magnitude, and opposite sign, as the difference between the two dk coefficients. The former difference is therefore the DiD featured in the proposition. The aforementioned PTPT restriction on the coefficients also implies that the former difference is $-\sigma_{TK}$. The scale effect shown in (15) is the sum of the first two coefficients in (16). Homogeneity and symmetry of the compensated demand for $z_T$ allows that sum to be rewritten as $-\omega_X AES_{XT}$. Homogeneity and symmetry for the other goods, together with PTPT, equate $-\omega_X AES_{XT}$ to the expression shown in (15). QED

COROLLARY (Treatment effect linked to shadow substitution). The difference-in-differences estimator is negative, with magnitude equal to the shadow elasticity of substitution $\sigma_{TK}$ between the T and K goods in u().

Equation (14) is an expression for the DiD of the logs of the quantity indexes $z_T$ and $z_K$. The $\varepsilon^{H}$ defined in Proposition 3 is a compensated scale effect in the sense that it describes the compensated responses of both $\ln z_T$ and $\ln z_K$ to a common treatment. If the scale effect in equation (12) is interpreted as a compensated scale effect, then the equation describes a compensated ToT. By the Slutsky equation, the uncompensated scale effect is $\varepsilon^{M} = \varepsilon^{H} - (\omega_T+\omega_K)\eta$, where $\eta$ is the income elasticity of demand common to $z_T$ and $z_K$.$^{13}$ If the scale effect in equation (12) is interpreted as an uncompensated scale effect, then the equation describes an uncompensated ToT.$^{14}$

A comparison of (14) and (15) is the starkest demonstration that the treatment-control comparison $-\sigma_{TK}$ reveals something entirely different about the structure of demand than a common treatment dt = dk does. They refer to different entries in the AES matrix.
Although the matrix is restricted by symmetry and homogeneity, those restrictions impose no relationship between $AES_{XX}$ and the Allen elasticities that define $\sigma_{TK}$. Especially, the $AES_{XX}$ term highlights that the compensated scale effect $\varepsilon^{H}$ is all about substitution to the outside goods $\hat{X}$ whereas treatment-control comparisons reveal only substitution between treatment and controls.

Proposition 4 establishes that the compensated scale effect $\varepsilon^{H}$ summarizes everything about substitution that cannot be discovered with the treatment-control comparison $-\sigma_{TK}$. Moreover, none of the nine Allen substitution terms can be inferred from shares and $\sigma_{TK}$ alone. Each of them also requires $\varepsilon^{H}$.

13 Note that the gap between $\varepsilon^{M}$ and $\varepsilon^{H}$ is small to the extent that a large majority of consumer spending is on the outside goods. The DiD is not characterized as either Hicksian or Marshallian because T and K have a common income effect that differences out.

14 The difference between the compensated ToT and the uncompensated ToT is $\omega_T\eta$.

PROPOSITION 4 (Recovering the substitution matrix from DiD and scale). With PTPT, homogeneity, symmetry, and values for $\sigma_{TK}$, $\varepsilon^{H}$, and two expenditure shares, the entire AES matrix can be constructed as:

$$\begin{pmatrix} AES_{TT} & AES_{TK} & AES_{TX} \\ AES_{KT} & AES_{KK} & AES_{KX} \\ AES_{XT} & AES_{XK} & AES_{XX} \end{pmatrix} = \begin{pmatrix} \dfrac{\varepsilon^{H} - \frac{\omega_K}{\omega_T}\sigma_{TK}}{\omega_T+\omega_K} & \dfrac{\varepsilon^{H}+\sigma_{TK}}{\omega_T+\omega_K} & -\dfrac{\varepsilon^{H}}{\omega_X} \\ \dfrac{\varepsilon^{H}+\sigma_{TK}}{\omega_T+\omega_K} & \dfrac{\varepsilon^{H} - \frac{\omega_T}{\omega_K}\sigma_{TK}}{\omega_T+\omega_K} & -\dfrac{\varepsilon^{H}}{\omega_X} \\ -\dfrac{\varepsilon^{H}}{\omega_X} & -\dfrac{\varepsilon^{H}}{\omega_X} & \dfrac{\omega_T+\omega_K}{\omega_X}\,\dfrac{\varepsilon^{H}}{\omega_X} \end{pmatrix} \tag{17}$$

Holding constant $\sigma_{TK}$ and the shares, each of the nine entries varies with $\varepsilon^{H}$.

Proof. The $AES_{XX}$ entry derives directly from (15). $\omega_X = 1 - \omega_T - \omega_K$. Homogeneity, symmetry, and PTPT require that $AES_{XT} = AES_{XK} = AES_{KX} = AES_{TX}$. Homogeneity of X demand together with the expression for $AES_{XX}$ requires those four Allen elasticities to each be $-\varepsilon^{H}/\omega_X$. The remaining four Allen elasticities then follow from symmetry, homogeneity, and the definition (14) of the shadow elasticity $\sigma_{TK}$. QED

Equation (17) shows that none of the nine substitution terms can be inferred from shares and treatment-control comparisons alone. Each of them also requires the scale effect $\varepsilon^{H}$. By Proposition 3, DiD only told us about $\sigma_{TK}$.

Marshall’s Laws of Derived Demand were formulated by Marshall (1895) and Hicks (1936) under the more restrictive assumption that $z_T$ and $z_K$ are weakly separable from the X goods.$^{15}$ In this case, the shadow elasticity $\sigma_{TK}$ in u is itself the elasticity of substitution between T and K in the function that aggregates them in preferences.$^{16}$ With this interpretation of DiD and $\sigma_{TK}$, equation (12) becomes Marshall’s Laws expressed in elasticity form. Equations (12) and (14) show the more general case where −DiD is a shadow elasticity of substitution in u.

IV. Bias correction

The straightforward case for interpreting a DiD estimate is when we are interested in ToT (the effect on T of dt > 0 = dk) rather than the effect of a hypothetical aggregate treatment. Proposition 2 shows that ToT = $\lambda\varepsilon + (1-\lambda)DiD$, which we expect to be similar to the DiD estimate when the share treated is close to zero.
For example, one could use a DiD estimate of Canada’s policy experience as an estimate of what would happen to another small (compared to the world economy) and otherwise similar country that adopts the same policy because both effects would be DiD dt.

15 With randomly assigned treatments, $z_T$ and $z_K$ would, up to sampling error, have the same income elasticity and cross-price elasticities with each element of $\hat{X}$. These are the conditions for weak separability with a homothetic aggregator.

16 The concept of shadow elasticity of substitution was not yet invented when Marshall, Hicks, and Allen were writing about these issues. Allen and shadow elasticities of substitution coincide when there are only two goods.
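Returning to Proposition 4, the construction in (17) can be verified numerically. The sketch below is illustrative only: the function name and the values chosen for $\sigma_{TK}$, $\varepsilon^{H}$, and the expenditure shares are hypothetical. It builds the 3×3 AES matrix from those four inputs, then checks that every row satisfies homogeneity (the share-weighted row sum is zero) and that (14) and (15) return the DiD and the compensated scale effect that were used as inputs.

```python
def aes_matrix(sigma_tk, eps_h, w_t, w_k):
    """Build the Allen-elasticity matrix of equation (17).

    Rows and columns are ordered T, K, X. Inputs are the shadow elasticity
    sigma_TK (= -DiD), the compensated scale effect eps^H, and the
    expenditure shares of the T and K composites.
    """
    w_x = 1.0 - w_t - w_k      # shares sum to one
    w_tk = w_t + w_k
    a_tt = (eps_h - (w_k / w_t) * sigma_tk) / w_tk
    a_kk = (eps_h - (w_t / w_k) * sigma_tk) / w_tk
    a_tk = (eps_h + sigma_tk) / w_tk
    a_out = -eps_h / w_x       # AES of T or K with the outside goods
    a_xx = (w_tk / w_x) * (eps_h / w_x)
    return [
        [a_tt, a_tk, a_out],
        [a_tk, a_kk, a_out],
        [a_out, a_out, a_xx],
    ]


# Hypothetical inputs for illustration.
sigma_tk, eps_h, w_t, w_k = 1.2, -0.3, 0.1, 0.2
shares = [w_t, w_k, 1.0 - w_t - w_k]
aes = aes_matrix(sigma_tk, eps_h, w_t, w_k)

# Homogeneity: sum_j share_j * AES_ij = 0 for each row i.
homogeneity = [sum(s * a for s, a in zip(shares, row)) for row in aes]

# Equation (14): DiD = (AES_TT - AES_TK) * w_T = -sigma_TK.
did_implied = (aes[0][0] - aes[0][1]) * w_t

# Equation (15): eps^H = w_X^2 / (w_T + w_K) * AES_XX.
eps_implied = shares[2] ** 2 / (w_t + w_k) * aes[2][2]
```

Changing `eps_h` while holding `sigma_tk` and the shares fixed moves all nine entries, which is the point made after (17).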

Even when the parameter of interest involves the effect $\varepsilon$ of treating the entire market, price theory shows how DiDs can be used to obtain a reliable estimate of $\varepsilon$. Two instances follow, both where a scale effect can be inferred entirely from the results of a targeted treatment. The result is possible due to the presence of market frictions that create known heterogeneity in the spillover among the untreated. In IV.A, we describe how out-of-market controls in the cross section, combined with DiD estimates, can be used to recover the scale effect. Subsection IV.B provides a setting, familiar from industrial organization and spatial economics, where no control is fully beyond the reach of the invisible hand. Nevertheless, with potential controls differing in terms of distance from the treatment, targeted treatments provide enough information to estimate a scale effect from two DiDs. Both examples have the scale effect outside the range spanned by the two DiDs.

IV.A. Outside- and within-market control groups

Suppose that a fraction n of the controls were beyond the reach of the invisible hand. That is, the spillover of a targeted treatment dt onto the in-market controls is $s_{Kt}\,dt$, as compared to zero for the out-of-market controls. The average spillover among the controls is therefore $(1-n)s_{Kt}\,dt$. The scale effect is defined as before, $\varepsilon \equiv s_{Tt} + s_{Tk}$. For a targeted treatment of dt = 1, we denote the difference between the treated outcome and the average control outcome as $DiD(\lambda n)$:

$$DiD(\lambda n) \equiv s_{Tt} - (1-n)s_{Kt} = \lambda n\,\varepsilon + (1-\lambda n)\,DiD(0) \tag{18}$$

where the second equality follows from the element-by-element equations (7). DiD(0) refers to the difference between ToT and the spillover effect $s_{Kt}$ on the in-market controls. Unlike DiD(0), $DiD(\lambda n)$ puts some weight on the scale effect. The scale effect coefficient is less than one both because only part of the market is treated ($\lambda < 1$) and because only some of the controls are outside the market (n < 1).
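Equation (18) can be illustrated with a short numerical sketch. The function names and parameter values below are hypothetical and serve only as an example. The first function computes the mixed-control DiD directly from the matrix elements in (7). The second inverts two such DiDs, taken from markets that share the same $\varepsilon$ and DiD(0) but differ in $\lambda n$, to recover the scale effect using the weight given in (19) and (20).

```python
def did_mixed_controls(lam, n, eps, did0):
    """DiD when a fraction n of the controls is outside the market.

    Uses the elements of (7): s_Tt is the ToT and s_Kt is the spillover
    onto in-market controls; out-of-market controls have zero spillover.
    """
    s_tt = lam * eps + (1 - lam) * did0
    s_kt = lam * (eps - did0)
    return s_tt - (1 - n) * s_kt


def scale_from_two_markets(did1, a1, did2, a2):
    """Recover the common scale effect from two DiDs with a_i = lambda_i * n_i."""
    delta = (1 - a2) / (a1 - a2)   # market-1 weight; lies outside [0, 1]
    return delta * did1 + (1 - delta) * did2


# Hypothetical common parameters and two markets with different lambda * n.
eps, did0 = -0.4, -1.5
lam1, n1 = 0.30, 0.50
lam2, n2 = 0.20, 0.25

did1 = did_mixed_controls(lam1, n1, eps, did0)
did2 = did_mixed_controls(lam2, n2, eps, did0)

# Closed form on the right-hand side of (18) for market 1.
did1_closed_form = lam1 * n1 * eps + (1 - lam1 * n1) * did0

eps_recovered = scale_from_two_markets(did1, lam1 * n1, did2, lam2 * n2)
```

With these substitutes-case values both DiDs are larger in magnitude than the scale effect, so the recovered $\varepsilon$ lies outside the range they span.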
Still, $DiD(\lambda n)$ over- or under-estimates the magnitude of the scale effect according to whether T and K are substitutes or complements, respectively. If none of the controls were in the market (n = 1), $DiD(\lambda n)$ would be the ToT but still differ from the scale effect because it does not include the effect on the treated of applying treatments to the untreated in their market.

Having at least some of the controls out of the market raises the possibility of recovering the scale effect from a meta-analysis. Specifically, assume that two DiDs are available from distinct markets with the same $\varepsilon$ and DiD(0) but different shares for the out-of-market controls (n) or different treatment shares as reflected in $\lambda$. Letting subscripts denote markets, the common scale effect can be written in the two-market case as a weighted average of DiD from each market:

$$\varepsilon = \delta\,DiD(\lambda_1 n_1) + (1-\delta)\,DiD(\lambda_2 n_2) \tag{19}$$

$$\delta \equiv \frac{1-\lambda_2 n_2}{\lambda_1 n_1 - \lambda_2 n_2} \tag{20}$$

One, but not both, of the markets could have n = 0. Note that the market-1 weight $\delta$ must be outside the unit interval and therefore put negative weight on one of the DiDs and a coefficient

greater than one on the other. According to (19), the scale effect is outside the range spanned by the two DiDs. The scale effect falls below (above) that range when treatments and controls are complements (substitutes), respectively.

Alternatively, a single targeted treatment permits recovery of the scale effect when the in-market controls can be distinguished from the out-of-market controls. Their outcomes each serve as a distinct comparison for the treated outcome. The DiD using in-market controls is DiD(0), while the DiD using out-of-market controls is $DiD(\lambda)$ = ToT. This is the special case of equation (19) that reduces to the ToT expression in equation (7). All of this is possible by having at least some controls that do not experience the full effect of the invisible hand.

IV.B. Market spillovers that diminish with distance

So far, we have considered controls as either in the same market as the treated, or out of the market entirely. Here we allow for a gray area, namely a model in which the magnitude of market spillovers diminishes with “distance” from the treatment. The model is sometimes known as the circle-city or Salop (1979) model. A key result is a simple formula for inferring a scale effect on prices entirely from the results of a targeted treatment.

An integer number N of producers are evenly spaced around the circle, whose circumference is equal to one. Distance along the circle can be interpreted literally, as for an urban economics application. Alternatively, gaps between producers on the circle can represent other product-attribute differences. We assume that each producer $i \in \{1,2,\ldots,N\}$ sets its price $p_i$ as a weighted sum of its own cost $c_i$ and the prices of its immediate neighbors:

$$p_i = \rho_0 + \rho_1 c_i + \rho_2\,\frac{p_{i-1}+p_{i+1}}{2} \tag{21}$$

with the constants $\rho_1 > 0$ and $\rho_2 \in (-1,1)$. Note that $p_0 = p_N$ and $p_{N+1} = p_1$, which appear in equation (21) when evaluated for producers $i \in \{1,N\}$, because of the model’s circular setup.
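Because equation (21) is linear in the N prices, its comparative statics can be computed directly. The sketch below is a hypothetical illustration rather than the paper's own procedure: the function name, the fixed-point solver, and the parameter values are assumptions. It solves the differenced system for a cost shock to producer 1 only and for a shock to all producers, then compares the marketwide price effect with the combination of ToT and adjacent-producer DiD that the text derives as equation (24).

```python
def price_response(n_firms, rho1, rho2, dcost, iters=2000):
    """Price changes implied by the total derivative of equation (21).

    Solves dp_i = rho1 * dc_i + rho2 * (dp_{i-1} + dp_{i+1}) / 2 on a circle
    by fixed-point iteration, which converges because |rho2| < 1 (slowly if
    |rho2| is close to one, in which case more iterations are needed).
    Producer 1 in the text is index 0 here.
    """
    dp = [0.0] * n_firms
    for _ in range(iters):
        dp = [
            rho1 * dcost[i] + rho2 * (dp[i - 1] + dp[(i + 1) % n_firms]) / 2
            for i in range(n_firms)
        ]
    return dp


# Hypothetical parameters: ten producers, rho1 = rho2 = 1/2.
n_firms, rho1, rho2 = 10, 0.5, 0.5

targeted = price_response(n_firms, rho1, rho2, [1.0] + [0.0] * (n_firms - 1))
marketwide = price_response(n_firms, rho1, rho2, [1.0] * n_firms)

tot = targeted[0]                         # effect on the treated producer
did_adjacent = targeted[0] - targeted[1]  # treated minus adjacent producer
scale = marketwide[0]                     # common marketwide price effect

# Scale effect inferred from the targeted treatment alone.
scale_from_targeted = tot + rho2 / (1 - rho2) * did_adjacent
```

In this setup the marketwide effect also has the closed form rho1 / (1 - rho2), which gives a second check on the solver.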
Our Appendix I shows how the special case of equation (21) with $\rho_1 = \rho_2 = ½$ can be derived from profit-maximizing and utility-maximizing behavior. Our purpose here is to show how it restricts the pattern of market spillovers.

Let $\{dp_1,\ldots,dp_N\}$ denote the price effects of a targeted treatment that increases the costs only of producer 1, i.e. $dc_1 > 0 = dc_2 = \ldots = dc_N$, and let $\{dP_1,\ldots,dP_N\}$ denote the price effects of treating all producers (the entire market) with the same-magnitude cost shock as producer 1 experiences in the targeted treatment. Differencing the total derivative of equation (21) between the marketwide and targeted outcomes,

$$dP_1 - dp_1 = \rho_2\,\frac{dP_N + dP_2}{2} - \rho_2\,\frac{dp_N + dp_2}{2} \tag{22}$$

The symmetry of the model and the two treatments requires that $dp_2 = dp_N$ and $dP_1 = dP_2 = dP_N$.

$$dP_1 - dp_1 = \rho_2\,(dP_2 - dp_2) \tag{23}$$

By definition, the scale effect on prices is $dP_1 = dP_2 = \varepsilon$, the DiD for the targeted treatment is $dp_1 - dp_2$, and the ToT for the targeted treatment is $dp_1$. Equation (23) can be rewritten as

$$\varepsilon = ToT + \frac{\rho_2}{1-\rho_2}\,DiD \tag{24}$$

Note that the DiD in equation (24) refers to the treated-adjacent producers as the controls. To a close approximation, the ToT is the difference $dp_1 - dp_{N/2+1}$ between the treated firm’s price and the price charged on the opposite side of the circle.$^{17}$ Therefore, equation (24) allows the scale effect on prices to be inferred entirely from the results of a targeted treatment. With $\rho_2 = ½$, the scale effect is simply the sum of ToT and DiD. Appendix II further illustrates a broader principle that market frictions allow scale effects to be recovered from a targeted treatment’s ToT and DiD.

An analysis related to (24) is possible for quantities. As with prices, the effect on quantities of a marketwide cost change is outside the interval bracketed by the ToT and adjacent-producer DiD observed from a targeted cost change. However, unlike the scale effect on prices, the scale effect on quantities is closer to zero than either ToT or DiD.

V. Further examples of difference-in-differences in the marketplace

V.A. Models with time and region fixed effects

Without price theory as a guide, difference-in-differences estimates can easily be misinterpreted in geographical contexts. One case is an early set of studies attempting to detect imperfect competition in cigarette manufacturing in the form of “over-shifting” cigarette excise taxes (Sumner, 1981).
Over-shifting means a $1 per pack tax would increase the retail price of cigarettes by more than $1 per pack, whereas “one-for-one passthrough” refers to a dollar-for-dollar correspondence between excise taxes and retail prices. These studies were executed with essentially a difference-in-differences framework by comparing states with large tax increases to states with little or no increase.

17 In this way, both (19) and (24) show the scale effect as a weighted sum of two different DiDs, except that the sum of (24)’s weights exceeds one.

DiD pass-through studies found nearly one-for-one passthrough, but overlooked the possibility that retail prices in the control states were increased by the tax rates in the treatment

states, which can occur through wholesale prices.$^{18}$ If the control states were affected in this way, nationwide increases in excise taxes would be over-shifted even though the state DiD shows one-for-one pass-through. If we interpret T as retail prices in the states with tax increases and K as retail prices in the other states, that is the situation illustrated in Figures 2 and 3. The blue arrow represents the retail-price effects of a nationwide tax increase. A national tax would increase prices more than a geographically concentrated tax increase (green arrow), even in the states targeted by those taxes.

Another example is related to Jaffe, Minton, Mulligan, & Murphy (2019, Chapter 17), which concludes that business taxes reduce wages in the long run because the taxes reduce productivity. Nevertheless, an increase in business taxes in a particular locality may not reduce wages in that locality relative to the rest of the nation because workers have a choice of where to live and work. In effect, the wage in any locality is influenced by business taxes throughout the country, or even throughout the world. By failing to account for this, a DiD approach might not show any wage effect of business taxes.

If geographic differences in business taxes result in little or no geographic differences in wages, they might result in especially large geographic differences in employment. This is another case in which the geographic-specific effect is different from the aggregate effect, but this time with the former effect being greater.

Another policy question is the employment effect of public projects such as building a sports stadium or hosting a major event such as the Olympics. Early studies used something like a DiD approach and found a “multiplier”: that total employment in the vicinity of the stadium increased more than the number directly employed by the sports enterprise (Wanhill, 1983; Johnson, Obermiller, & Radtke, 1989).
For example, complementary businesses such as restaurants, lodging, and parking were opened nearby. But later studies found that most, if not all, of the additional employment was pulled in from other localities (Dwyer & Forsyth, 2009).

Development economics includes experiments that encourage healthcare providers in treatment villages to supply more healthcare. Others incentivize more instructional effort by teachers in the treatment villages. Such experiments can be analogous to the sports-stadium studies. Namely, through factor markets the experiment reallocates resources from control villages to treatment villages. The per-capita effect of treating all villages would be different unless resources are moved with equal ease (or difficulty) between villages as from outside the village economy as a whole. In our notation, that condition is $\varepsilon$ = DiD.

18 See also Miravete, Seim and Thurk (2018). Suppose that, for example, cigarette manufacturers set one nationwide wholesale price because of concerns that regional wholesale price inequality would result in unauthorized wholesale orders and shipments in the low-price regions on behalf of the high-price regions. Such manufacturers would respond to an increase in one state’s excise rate by adjusting their nationwide wholesale price, and through that mechanism indirectly adjust retail prices throughout the nation. Later studies acknowledged this market mechanism's effect on state differences (Keeler, et al. 1996; Evans, Ringel and Stech 1999; Adhikari 2004); see also Tennant (1950). Harris (1987) emphasizes the results of a federal tax change. Our Appendix II provides a model of such price setting, expressing its results in the format (1). The eigenvalues of S prove to be the national passthrough coefficient (NPTC) and a weighted harmonic mean of one and that same NPTC. The weights depend on the trans-shipping costs, with very little weight on NPTC.
That is, due to trans-shipping, difference-in-differences estimates tend to show a one-for-one “effect” of tax on retail price, regardless of the NPTC.

Regional DiD studies often include many regions that are treated at different times and by different amounts. The average region has a small share; $\lambda$ is close to zero in our notation. Neither the ToT nor the DiD reflects much of the scale effect. For instance, with cigarette excise taxes, the DiD indicates what would happen to retail prices in a state that increases its tax rate, while the national effect of a national increase would be different. More generally, the DiD in a many-region study closely approximates the effect on the treated of treating yet another region of similar size, even though it does not indicate the aggregate effect of treating all regions.

Newer studies have proposed to estimate the ToT by selecting control observations that are located at a given radius from treated observations. Butts’ (2023) reexamination of the Tennessee Valley Authority analysis in Kline and Moretti (2014) is an example. The economic interpretation of such procedures depends on the outcome of interest and the structure of market spillovers. For example, for quantity-price relationships satisfying Hicksian symmetry, as in our Section III.B, a targeted treatment by itself does not provide enough information to infer scale effects. In contrast, the scale effect on prices of a cost treatment in the Salop model can be estimated as the sum of the ToT and adjacent-neighbor DiD observed from a geographically targeted treatment, as shown in our Section IV.B.

V.B. Welfare effects of random treatments

Let’s examine treatment effects in a setting where the substitution effect exceeds the scale effect. A large pool of ex ante identical workers supplies hours on the intensive margin. Their population is normalized to one. From employers’ perspective, any worker’s hours are perfect substitutes for another’s. In the baseline, each worker is paid the same hourly wage w and supplies the same hours. The aggregate demand for their hours is D(w), with $D'(w) < 0$.
The per capita supply of labor is L(w), with L(w) > 0. An experiment selects a fraction  of the workers for a wage subsidy t  0.19 Their hours are denoted T per treated and T in total. The untreated “controls” supply K per control and (1−)K in aggregate. To highlight the analogy with the model (1), our notation also includes k as a subsidy for the controls, although it is not emphasized here. Given values for , t and k, an equilibrium is a list {w,T,K} of wage and hours satisfying: 𝐾 = 𝐿(𝑤+𝑘) 𝑇 = 𝐿(𝑤 +𝑡) (1−𝜏)𝐾 +𝜏𝑇 = 𝐷(𝑤) 19 This is a simplified version of Heckman, LaLonde, and Smith (1999) that focuses on incidence rather than employment effects. [also connect to the example at the beginning of the paper] 23
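The equilibrium conditions above can be explored numerically. The following is a minimal sketch, not the authors' code: it assumes linear supply and demand curves, and the parameter values and the names `LinearMarket` and `treatment_effects` are hypothetical choices made only for illustration. It computes the wage response, the effect on the treated, the spillover onto controls, the DiD, and the effect of subsidizing everyone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearMarket:
    """Hypothetical linear curves: L(w) = a + b*w and D(w) = A - B*w."""
    a: float = 0.0    # supply intercept (illustrative assumption)
    b: float = 1.0    # L'(w) > 0
    A: float = 10.0   # demand intercept (illustrative assumption)
    B: float = 0.5    # -D'(w) > 0

    def equilibrium(self, tau, t=0.0, k=0.0):
        """Solve (1 - tau)*L(w + k) + tau*L(w + t) = D(w); return (w, T, K)."""
        w = (self.A - self.a - self.b * ((1 - tau) * k + tau * t)) / (self.b + self.B)
        T = self.a + self.b * (w + t)   # hours per treated worker
        K = self.a + self.b * (w + k)   # hours per control worker
        return w, T, K


def treatment_effects(market, tau, dt=1e-3):
    """Finite-difference effects of a small subsidy dt around t = k = 0."""
    w0, T0, K0 = market.equilibrium(tau)
    w1, T1, K1 = market.equilibrium(tau, t=dt)          # subsidize a share tau
    _, T_all, _ = market.equilibrium(tau, t=dt, k=dt)   # subsidize everyone
    tot = (T1 - T0) / dt
    spillover = (K1 - K0) / dt
    return {
        "dw_dt": (w1 - w0) / dt,
        "ToT": tot,
        "spillover": spillover,
        "DiD": tot - spillover,
        "scale": (T_all - T0) / dt,
    }


if __name__ == "__main__":
    for name, value in treatment_effects(LinearMarket(), tau=0.1).items():
        print(f"{name:>9}: {value: .4f}")
```

Under these assumed linear forms the DiD equals the supply slope regardless of the treated share, while the scale effect is smaller in magnitude because the wage falls when everyone is subsidized.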

Unsurprisingly, dK/dt < 0 < dT/dt and dw/dt < 0.²⁰ The subsidy benefits the treated and employers (even those who do not employ any treated) and harms the controls.²¹ Regardless of the share treated, the first magnitude can easily be less than the combined magnitudes of the other two. Take the case when labor demand is wage inelastic, with an elasticity magnitude in (0,1), and the subsidy is small.²² The treated benefit from the subsidy, but employers benefit even more because they pay less for both treated workers and untreated workers. Shrinking the treated share does not change this result. If the treated are to be the primary beneficiary of the subsidy, demand needs to be wage elastic enough or supply inelastic enough.²³ Clearly, quantifying the scale effect is essential for understanding the relationship between the welfare effect on the treated and welfare effects more broadly.

This example also distinguishes the DiD estimator dT/dt − dK/dt from the effect of treating all workers. The DiD estimate is L′(w) because the subsidy moves treated and controls in opposite directions along the supply curve. The equilibrium quantity effect of subsidizing all workers, −L′(w)D′(w)/[L′(w) − D′(w)], is a parameter of interest and results from shifting the supply curve downward by dt. As expected from the general substitutes case, this scale effect is closer to zero than the DiD estimate.

²⁰ For this example, the matrix elements corresponding to (1) are

$$s_{Tt} = L'(w)\,\frac{(1-\tau)L'(w) - D'(w)}{L'(w) - D'(w)}, \qquad s_{Tk} = -(1-\tau)L'(w)\,\frac{L'(w)}{L'(w) - D'(w)},$$

$$s_{Kt} = -\tau L'(w)\,\frac{L'(w)}{L'(w) - D'(w)}, \qquad s_{Kk} = L'(w)\,\frac{\tau L'(w) - D'(w)}{L'(w) - D'(w)}.$$

These satisfy a ToT weight equal to the treated share τ and the parallel trends assumption (4).

²¹ The expressions for aggregate effects on surplus for treated, controls, and employers are (dw+dt)τT, (1−τ)K dw, and −D(w)dw, respectively.

²² A “small subsidy” refers to the comparative static dt > 0 in the neighborhood of t = 0, holding k constant at zero.

²³ For non-zero supply and demand elasticities, the aggregate benefit for the treated as a ratio to the aggregate employer benefit is 1 − τ − D′(w)/L′(w).

V.C. The missing intercept

Equilibrium spillovers have been acknowledged in a macroeconomics literature that uses cross-sectional comparisons to infer effects of national policies, such as cash-transfer programs intended to affect consumption. It recognizes that the consumption of households not receiving a transfer will be affected by the transfers going to other households. Wolf (2023) characterizes the effect as the “missing intercept” problem.

Our paper features a treatment that is uniform among a single treatment group. In that case, the intercept perspective can be seen by writing the treatment outcome dT from the first column of the effects matrix (7):

$$dT = dK + \mathit{DiD} \cdot dt \tag{25}$$

The first term on the RHS of equation (25) is the spillover onto the controls. This “intercept” term would be differenced out if the version of (25) for controls were subtracted. Alternatively, the intercept might be captured by other variables such as time effects added to the analysis for the purpose of holding constant determinants of the outcome that are distinct from the treatment.

Now consider the relation between outcomes and treatments when there are multiple treatment groups g = 1, …, N, each receiving a different treatment. Outcomes for controls and each treated group could be calculated sequentially by applying one treatment at a time. If DiD were a constant – not varying with the baseline – then the resulting relation would be linear as in (26):

$$dT_g = dK + \mathit{DiD} \cdot dt_g \tag{26}$$

dK is the outcome change for the untreated, which reflects the market spillovers of all N of the treatments given. In Wolf (2023), the outcome is household consumption, with households differing in terms of the size of the treatment (stimulus check) they received. What we call DiD is estimated in his work (and others) with a panel regression at the household-by-quarter level including time fixed effects and estimated in first differences.

The market featured in our Figure 1 provides another example. DiD is constant at one for a log revenue outcome regardless of the functional form of market demand. Comparisons of log revenue to the productivity treatment in a cross-section of suppliers would show a linear relationship with treatment slope one.

If the missing intercept were identified, the econometrician would possess an estimate of the ToT, which, as we have discussed before, still differs from the scale effect of a uniform treatment on the entire market. At that point, however, estimates of both ToT and DiD can be entered into a variant of equation (12) to determine the scale effect ε. Therefore, Wolf (2023) can be interpreted as using a version of the solution described in our Section IV.A within a setting covered by our Section III.

VI. Summary and conclusions

Markets are ubiquitous. Consumers and businesses do not live or work in isolation, even approximately so.
Perhaps one reaction among those engaged in measurement is to actively attempt to isolate members of the treatment group. Clinical drug trials, for example, do try to prevent trial participants from trading with each other, that is, sharing or exchanging the treatments with others. Some clinical trials even discourage participants from communicating specifics about their trial experiences to prevent (what the investigators view as) potential bias or cross-contamination of results.

We take a different approach in this paper, which is to acknowledge trade and keep it at the center of the analysis. In our framework, parallel trends require the treatment and control outcomes to be weakly separable in utility, production, or cost from all other outcomes. Marshall’s Laws of Derived Demand are thereby vehicles for several analytical results. One is that a DiD estimator measures the degree of substitution between treatments and controls, regardless of the fraction of the market that is treated and the magnitude of market spillovers (Proposition 3). In contrast, the effect of treating the entire market is a “scale effect,” which is the degree of substitution with goods outside the market where treated and controls participate. The effect of the treatment on the treated (ToT) is a weighted average of the scale effect and the DiD, whereas the market spillovers are proportional to their difference (Proposition 2).

Proposition 2 also establishes that, assuming “parallel trends for parallel treatments” (PTPT), the eigenvalues of the treatment-effects matrix are the scale effect and the DiD. As a result, any arithmetic operations on treatment-effects matrices translate into the same operations on their respective scale effects and DiDs. This correspondence appeared in a few of our examples where treatment effects on demand were inverted, combined with an identity matrix, or combined with treatment effects on supply.

The presumption that market-equilibrium responses are typically dampened relative to experimental evidence (an example of which is provided in Banerjee and Duflo (2009)) may refer to the market-level feedback between supply and demand, which we have left only implicit in this paper. For the usual incidence reasons, for example, the market-level reduced form for the quantity elasticity of tax changes is −sd/(s−d). This incidence coefficient reflects equilibrium dampening in the sense that it is less than both the demand elasticity magnitude −d and the supply elasticity s.²⁴

However, there is more to market equilibrium than supply-demand feedback. In particular, the actions of market participants (even those on just one side of the market) are coordinated by prices. Either the controls are affected by the treatment, the treated would be further affected by treating the rest of the market, or some combination thereof. Market spillovers drive a wedge between DiD and the scale effect. The market-level demand elasticity d, the market-level supply elasticity s and the market-level incidence coefficient sd/(s−d) are each examples of a scale effect. The DiD from an experiment with control and treatment in the same market recovers substitution effects instead of scale effects. Treatment-control comparisons by themselves do not even partially identify d, s, or sd/(s−d).

Complementarity is the case when the treated are affected more by a full-market treatment than by receiving the same treatment while others in the market are untreated. Note that complementarity requires neither increasing returns nor externalities. It does not require that the treated and controls ever meet each other to trade. It does not require Leontief preferences or technology. Complementarity in this sense only means that the scale effect exceeds the substitution effect. Our Section IV provides guidance for estimating the degree of complementarity – or recovering the scale effect – from multiple DiD estimates.

What econometricians sometimes call “spillover” effects are not well described as externalities – missing markets – because markets also transmit treatment effects to the untreated through prices. Analogizing spillover effects with externalities may give the wrong impression that such effects are rare or beyond basic economic training.

²⁴ The incidence coefficient can be derived in the usual way as the equilibrium quantity effect of a one price-unit wedge between market supply and market demand.
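As a worked version of the derivation mentioned in footnote 24, the following sketch assumes log-linear market supply and demand with constant elasticities s > 0 and d < 0. The symbols Q, P_s, P_d and the proportional wedge θ are notation introduced here for illustration only, and the wedge is expressed in proportional rather than price-unit terms so that the result is an elasticity.

```latex
\begin{align*}
\Delta \ln Q &= s\,\Delta \ln P_s, \qquad
\Delta \ln Q = d\,\Delta \ln P_d, \qquad
\Delta \ln P_d = \Delta \ln P_s + \Delta\theta \\[4pt]
\Delta \ln Q &= d\left(\frac{\Delta \ln Q}{s} + \Delta\theta\right)
\;\Longrightarrow\;
\Delta \ln Q = \frac{s\,d}{s-d}\,\Delta\theta \\[4pt]
\left|\frac{s\,d}{s-d}\right| &= (-d)\,\frac{s}{s-d} = s\,\frac{-d}{s-d}
< \min\{-d,\; s\}
\end{align*}
```

The last line uses s > 0 > d, so that both s/(s−d) and −d/(s−d) lie strictly between zero and one; this is the sense in which the incidence coefficient is dampened relative to each elasticity on its own.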

Per capita market spillover effects tend to decrease as the size of the treatment group goes to zero, but so does the aggregate treatment effect. Small-scale treatments thereby come with two disadvantages. First, scale effects are especially obscured by substitution effects. Second, and surprisingly, the spillover effect is comparatively large in the aggregate.
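The last point can be illustrated with the wage-subsidy example of Section V.B. The sketch below differentiates that example's equilibrium conditions at t = k = 0; the shorthand L′ and D′ for L′(w) and D′(w) is ours, and the comparison is specific to that example rather than a general result.

```latex
\begin{align*}
\frac{dK}{dt} &= -\tau\,\frac{(L')^{2}}{L'-D'}, \qquad
\frac{dT}{dt} = L'\,\frac{(1-\tau)L'-D'}{L'-D'} \\[4pt]
\frac{(1-\tau)\,dK/dt}{\tau\,dT/dt}
&= -\frac{(1-\tau)L'}{(1-\tau)L'-D'}
\;\xrightarrow[\;\tau\to 0\;]{}\; -\frac{L'}{L'-D'}
\end{align*}
```

The per capita spillover dK/dt vanishes with τ, yet the aggregate spillover onto controls, measured relative to the aggregate effect on the treated, grows in magnitude as τ falls and is largest in the small-treatment limit.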

Appendix I: Derivation of the circle-city pricing equation

A continuum of consumers is uniformly distributed around the same circle as the N producers cited in the main text. Consumers have a choice to consume only the outside good, or to travel to a producer and purchase one unit. If the purchase is from a producer charging p_i, the purchase results in a consumer surplus of v − p_i minus a travel cost that is h times the distance traveled. If the number of producers N or the value v is too low, or the cost parameters h and {c_i} too high, then the consumers furthest from producers will not make a purchase. Each producer would thereby operate in its own market, with no reaction to the pricing of other producers. We rule out this case. We also assume that prices and costs are sufficiently similar across producers that no consumer would benefit by leapfrogging a producer to make a purchase. That is, the choice for the consumer is either to travel left or right, and then stop at the first producer encountered.

Therefore, there are N distinct locations around the circle where the consumers are indifferent between purchasing from the two nearest producers. For the consumers indifferent between producer i and i+1, their location is at distance x from the former and 1/N − x from the latter:

$$x = \frac{1}{2N} + \frac{p_{i+1} - p_i}{2h} \tag{27}$$

The demand facing producer i is the sum of the consumers traveling from the producer’s left and those traveling from the producer’s right:

$$D\!\left(p_i;\ \frac{p_{i-1} + p_{i+1}}{2}\right) = \frac{1}{N} + \frac{\dfrac{p_{i-1} + p_{i+1}}{2} - p_i}{h} \tag{28}$$

This demand function is linear in the producer’s own price and the average of the prices of adjacent competitors. A producer pricing exactly at that average receives 1/N of the city-wide demand. The profit-maximizing price for a producer i taking as given the prices of competitors is:

$$p_i = \frac{c_i + \dfrac{p_{i-1} + p_{i+1}}{2} + \dfrac{h}{N}}{2} \tag{29}$$

The condition (29) must hold for all integers i from 1 to N. It is a special case of equation (21) from the main text, with the coefficient subscripted 0 equal to h/(2N) and the coefficients subscripted 1 and 2 each equal to ½.
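The system of N conditions (29) can be solved numerically. The sketch below is one illustrative way to do so, by iterating each producer's best response until prices stop changing; the function name `salop_prices`, the parameter values, and the choice of iteration over a direct linear solve are our assumptions, not part of the paper. It requires at least three producers so that each has two distinct neighbors.

```python
def salop_prices(costs, h, tol=1e-12, max_iter=10_000):
    """Iterate the best responses in (29) to a fixed point.

    costs: list of marginal costs c_i, ordered around the circle (len >= 3).
    h:     travel cost per unit of distance.
    """
    n = len(costs)
    prices = list(costs)  # arbitrary starting guess
    for _ in range(max_iter):
        # prices[i - 1] wraps around the circle when i == 0
        new = [
            0.5 * (costs[i] + 0.5 * (prices[i - 1] + prices[(i + 1) % n]) + h / n)
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, prices)) < tol:
            return new
        prices = new
    raise RuntimeError("best-response iteration did not converge")


def price_responses(n=8, c=1.0, h=1.6, dc=0.1):
    """Price change per unit of cost change when only producer 0 is treated."""
    base = salop_prices([c] * n, h)
    treated = salop_prices([c + dc] + [c] * (n - 1), h)
    return [(p1 - p0) / dc for p0, p1 in zip(base, treated)]


if __name__ == "__main__":
    for i, r in enumerate(price_responses()):
        print(f"producer {i}: dp/dc_0 = {r:.4f}")
```

In this linear system the iteration is a contraction because each price places weight one half on its neighbors' average, so convergence does not depend on the starting guess. With identical costs the fixed point is p = c + h/N, and the responses to a single producer's cost change sum to one across producers.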

Appendix II: National and local pass-through of excise taxes

The outcomes are retail prices T and K, expressed in levels (as in the literature). The treatments are excise tax rates t and k, respectively. Consumers are not mobile between the treatment and control areas, which have populations τ and 1−τ, respectively. Per-capita consumer demand in each area is a function D(·) of the area’s retail price. Supply prices are T−t and K−k, respectively. A national manufacturer sets retail prices in each area to maximize profits:

$$\big[T - t - c\big((T-t)-(K-k)\big)\big]\,\tau D(T) + \big[K - k - c\big((T-t)-(K-k)\big)\big]\,(1-\tau) D(K) \tag{30}$$

where the average and marginal cost c(·) depends on the gap in supply prices between areas. It is a convex function and minimized when the two areas have the same supply price. These assumptions reflect incentives for trans-shipping by area wholesalers, who reimburse the supply price to the manufacturer and handle excise tax payments. In particular, supply-price gaps incentivize wholesalers in the low-price areas to acquire more quantity than needed for their area and (before excise tax is determined) sell the excess to wholesalers in other areas.

The first-order conditions for maximizing profit are:

$$\begin{aligned}
\tau\big\{D(T) + \big[T - t - c\big((T-t)-(K-k)\big)\big] D'(T)\big\}
&= \big[\tau D(T) + (1-\tau) D(K)\big]\, c'\big((T-t)-(K-k)\big) \\
&= -(1-\tau)\big\{D(K) + \big[K - k - c\big((T-t)-(K-k)\big)\big] D'(K)\big\}
\end{aligned} \tag{31}$$

Totally differentiating the two equations (31) in the neighborhood of t = k while holding τ constant yields expressions for dT and dK as functions of dt and dk. We write them in the matrix form (1) (not shown), and simplify them with two definitions:

$$\rho_d(T) \equiv \left[2 - \frac{D''(T)/D'(T)}{D'(T)/D(T)}\right]^{-1} > 0 \quad\text{and}\quad \eta(T) \equiv \frac{T\,D'(T)}{D(T)} < 0 \tag{32}$$

The first is the pass-through coefficient ρ_d defined in the usual way as a transformation of the demand function. The second is the price elasticity of demand. Both depend on the retail price because they are not necessarily constant along the demand curve. With the S matrix so derived, its eigenvalues are:

$$\varepsilon = \rho_d \quad\text{and}\quad \mathit{DiD} = \left[\omega + (1-\omega)\frac{1}{\rho_d}\right]^{-1} \tag{33}$$

DiD is a weighted harmonic mean of 1 and ρ_d, where the weight is

$$\omega \equiv \frac{T c''(0)}{T c''(0) - (1-\tau)\tau\eta} \in (0,1).$$

None of the weight is on ρ_d as the trans-shipping cost function becomes more convex or as the treatment share τ approaches zero. The weight for calculating ToT from the two eigenvalues is, in this application, the population share τ.
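The formulas in (33) are easy to evaluate. The following sketch computes the weight ω, the DiD, and the ToT for illustrative inputs; the function names and all numerical values are hypothetical, and the ToT line assumes the weight τ falls on the scale effect ρ_d with 1−τ on the DiD, which is our reading of the text rather than something stated in this appendix.

```python
def did_weight(T, c2, tau, eta):
    """omega = T*c''(0) / (T*c''(0) - (1 - tau)*tau*eta), with eta < 0.

    c2 stands for c''(0), the convexity of the trans-shipping cost function.
    """
    return T * c2 / (T * c2 - (1.0 - tau) * tau * eta)


def did_pass_through(rho_d, omega):
    """Weighted harmonic mean of 1 and rho_d, with weight omega on 1, as in (33)."""
    return 1.0 / (omega + (1.0 - omega) / rho_d)


def tot_pass_through(rho_d, omega, tau):
    """ToT as tau * scale effect + (1 - tau) * DiD (assumed weighting, see lead-in)."""
    return tau * rho_d + (1.0 - tau) * did_pass_through(rho_d, omega)


if __name__ == "__main__":
    rho_d, T, eta, tau = 0.5, 5.0, -0.4, 0.3   # illustrative values only
    for c2 in (0.01, 1.0, 100.0):
        omega = did_weight(T, c2, tau, eta)
        print(
            f"c''(0)={c2:>6}: omega={omega:.4f}, "
            f"DiD={did_pass_through(rho_d, omega):.4f}, "
            f"ToT={tot_pass_through(rho_d, omega, tau):.4f}"
        )
```

As the assumed convexity c″(0) rises, or as τ shrinks, ω approaches one and the DiD approaches one-for-one pass-through regardless of ρ_d, which matches the limiting cases described above.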

References

Adhikari, Deergha Raj. 2004. "Measuring market power of the US cigarette industry." Applied Economics Letters 11: 957–959.
Angrist, Joshua D., and Jörn-Steffen Pischke. 2008. Mostly harmless econometrics: An empiricist's companion. Princeton, NJ: Princeton University Press.
Athey, Susan, and Guido W. Imbens. 2017. "The state of applied econometrics: Causality and policy evaluation." Journal of Economic Perspectives 31: 3–32.
Banerjee, Abhijit V., and Esther Duflo. 2009. "The experimental approach to development economics." Annu. Rev. Econ. 1: 151–178.
Banzhaf, H. Spencer. 2021. "Difference-in-differences hedonics." Journal of Political Economy 129: 2385–2414.
Borusyak, Kirill, Peter Hull, and Xavier Jaravel. 2022. "Quasi-experimental shift-share research designs." The Review of Economic Studies 89: 181–213.
Butts, Kyle. 2023. "Difference-in-Differences Estimation with Spatial Spillovers." arxiv.org. https://arxiv.org/abs/2105.03737.
Crépon, Bruno, Esther Duflo, Marc Gurgand, Roland Rathelot, and Philippe Zamora. 2013. "Do labor market policies have displacement effects? Evidence from a clustered randomized experiment." The quarterly journal of economics 128: 531–580.
Cunha, Jesse M., Giacomo De Giorgi, and Seema Jayachandran. 2015. "The Impact of In-kind and Cash Transfers on Local Prices." Economic Journal 6: 195–230.
De Chaisemartin, Clément, and Xavier d’Haultfoeuille. 2020. "Two-way fixed effects estimators with heterogeneous treatment effects." American Economic Review 110: 2964–2996.
Deaton, Angus, and John Muellbauer. 1980. Economics and consumer behavior. Cambridge University Press.
Dwyer, Larry, and Peter Forsyth. 2009. "Public sector support for special events." Eastern Economic Journal 35: 481–499.
Egger, Dennis, Johannes Haushofer, Edward Miguel, Paul Niehaus, and Michael Walker. 2022. "General equilibrium effects of cash transfers: experimental evidence from Kenya." Econometrica 90: 2603–2643.
Evans, William N., Jeanne S. Ringel, and Diana Stech. 1999. "Tobacco taxes and public policy to discourage smoking." Tax policy and the economy 13: 1–55.
Glaeser, Edward L., and Joshua D. Gottlieb. 2009. "The wealth of cities: Agglomeration economies and spatial equilibrium in the United States." Journal of economic literature 47: 983–1028.
Goldsmith-Pinkham, Paul, Isaac Sorkin, and Henry Swift. 2020. "Bartik instruments: What, when, why, and how." American Economic Review 110: 2586–2624.
Harris, Jeffrey E. 1987. "The 1983 increase in the federal cigarette excise tax." Tax policy and the economy 1: 87–111.
Heckman, James J., and Edward Vytlacil. 2005. "Structural equations, treatment effects, and econometric policy evaluation 1." Econometrica 73: 669–738.
Heckman, James J., Lance Lochner, and Christopher Taber. 1999. "Human capital formation and general equilibrium treatment effects: a study of tax and tuition policy." Fiscal Studies 20: 25–40.
Heckman, James J., Robert J. LaLonde, and Jeffrey A. Smith. 1999. The economics and econometrics of active labor market programs. Vol. 3, in Handbook of labor economics, 1865–2097. Elsevier.
Heckman, James, and Rodrigo Pinto. 2024. "Econometric causality: The central role of thought experiments." Journal of Econometrics 243 (1-2).
Hicks, John. 1936. The Theory of Wages. London: Macmillan.
Hicks, John. 1939/1975. Value and Capital. Oxford: Oxford University Press.
Hong, Guanglei, and Stephen W. Raudenbush. 2006. "Evaluating kindergarten retention policy: A case study of causal inference for multilevel observational data." Journal of the American Statistical Association 101: 901–910.
Huber, Martin. 2023. Causal Analysis. Cambridge: The MIT Press.
Hudgens, Michael G., and M. Elizabeth Halloran. 2008. "Toward causal inference with interference." Journal of the American Statistical Association 103: 832–842.
Jacobs, Jane. 1969. "Strategies for helping cities." The American Economic Review 59: 652–656.
Jaffe, Sonia, Robert Minton, Casey B. Mulligan, and Kevin M. Murphy. 2019. Chicago Price Theory. Princeton University Press (ChicagoPriceTheory.com).
Johnson, Rebecca L., Fred Obermiller, and Hans Radtke. 1989. "The economic impact of tourism sales." Journal of Leisure Research 21: 140–154.
Keeler, Theodore E., Teh-wei Hu, Paul G. Barnett, Willard G. Manning, and Hai-Yen Sung. 1996. "Do cigarette producers price-discriminate by state? An empirical analysis of local cigarette pricing and taxation." Journal of health economics 15: 499–512.
Kline, Patrick, and Enrico Moretti. 2014. "Local Economic Development, Agglomeration Economies, and the Big Push: 100 Years of Evidence from the Tennessee Valley Authority." Quarterly Journal of Economics 129 (1): 275–331.
Manski, Charles F. 1993. "Identification of endogenous social effects: The reflection problem." The review of economic studies 60: 531–542.
Marshall, Alfred. 1895. Principles of Economics. London: MacMillan and Co.
McFadden, Daniel. 1963. "Constant elasticity of substitution production functions." The Review of Economic Studies 30: 73–83.
Miguel, Edward, and Michael Kremer. 2004. "Worms: identifying impacts on education and health in the presence of treatment externalities." Econometrica 72: 159–217.
Miravete, Eugenio J., Katja Seim, and Jeff Thurk. 2018. "Market Power and the Laffer Curve." Econometrica 86: 1651–1687. http://www.jstor.org/stable/44955255.
Monte, Ferdinando, Stephen J. Redding, and Esteban Rossi-Hansberg. 2018. "Commuting, migration, and local employment elasticities." American Economic Review 108: 3855–3890.
Munro, Evan, Stefan Wager, and Kuang Xu. 2021. "Treatment effects in market equilibrium." arXiv preprint arXiv:2109.11647 (arXiv).
Muralidharan, Karthik, and Paul Niehaus. 2017. "Experimentation at scale." Journal of Economic Perspectives 31: 103–124.
Muralidharan, Karthik, Paul Niehaus, and Sandip Sukhtankar. 2017. "Direct Benefits Transfer in Food." UC San Diego.
Salop, Steven C. 1979. "Monopolistic Competition with Outside Goods." Bell Journal of Economics 10 (1): 141–56. http://www.jstor.org/stable/3003323.
Sobel, Michael E. 2006. "What do randomized studies of housing mobility demonstrate? Causal inference in the face of interference." Journal of the American Statistical Association 101: 1398–1407.
Sraer, David, and David Thesmar. 2023. "How to use natural experiments to estimate misallocation." American Economic Review 113: 906–938.
Sumner, Daniel A. 1981. "Measurement of monopoly behavior: an application to the cigarette industry." Journal of Political Economy 89: 1010–1019.
Tennant, Richard B. 1950. "The American cigarette industry: a study in economic analysis and public policy."
Viviano, Davide. 2024. "Policy targeting under network interference." Review of Economic Studies rdae041.
Wanhill, Stephen R. C. 1983. "Measuring the economic impact of tourism." The Service Industries Journal 3: 9–20.
Wolf, Christian K. 2023. "The Missing Intercept: A Demand Equivalence Approach." American Economic Review 113 (8): 2232–69.
