feds · April 22, 2025

Agglomeration and sorting in U.S. manufacturing

Abstract

Using data on U.S. manufacturing plants, I estimate a production function model that includes agglomeration intensity as a component of total factor productivity and allows agglomeration benefits to vary across establishments, which can lead to sorting. I find that agglomeration benefits decline with unobserved establishment-level raw productivity.

Finance and Economics Discussion Series Federal Reserve Board, Washington, D.C. ISSN 1936-2854 (Print) ISSN 2767-3898 (Online) Agglomeration and sorting in U.S. manufacturing Andrea Stella 2025-031 Please cite this paper as: Stella, Andrea (2025). “Agglomeration and sorting in U.S. manufacturing,” Finance and Economics Discussion Series 2025-031. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2025.031. NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Andrea Stella*
Federal Reserve Board
April 8, 2025

JEL classifications: D22, D24, E24, L11, R11, R32
Keywords: Agglomeration, Sorting, Census of Manufactures.

*Email address: andrea.stella@frb.gov. I thank Gianni Amisano, David A. Benson, Ed Herbst, Illenin O. Kondo, Logan T. Lewis, Amil Petrin, Santiago Pinto, Devesh Raval, and seminar participants at the 2024 System Committee on Regional Analysis for helpful comments and suggestions. The views expressed here should not be interpreted as reflecting the views of the Federal Reserve Board of Governors or any other person associated with the Federal Reserve System. Any views expressed are those of the author and not those of the U.S. Census Bureau. The Census Bureau has reviewed this data product to ensure appropriate access, use, and disclosure avoidance protection of the confidential source data used to produce this product. This research was performed at a Federal Statistical Research Data Center under FSRDC Project Number 2427. (CBDRB-FY23-0489, CBDRB-FY24-P2427-R10962, CBDRB-FY24-P2427-R11365, and CBDRB-FY25-P2427-R12216)

Most importantly, however, cities differ in productivity: large cities produce more output per capita than small cities do. This urban productivity premium may occur because of locational fundamentals, because of agglomeration economies, because more talented individuals sort into large cities, or because large cities select the most productive entrepreneurs and firms.
Behrens and Robert-Nicoud, Handbook of Regional and Urban Economics, 2015

1 Introduction

It is striking that roughly 80% of the U.S. population lives on just 5% of the country’s land. Such extreme spatial concentration of people and industry suggests that cities confer significant advantages. A natural explanation is that agglomeration – the clustering of firms and workers – boosts productivity. Indeed, dense urban regions are observed to be more productive, on average, than less dense ones, but why this “urban productivity premium” exists has been a subject of debate. Economists have proposed several mechanisms: true agglomeration economies, whereby proximity yields external benefits such as labor pooling, knowledge sharing, and quicker, cheaper access to intermediate goods and services (Combes and Gobillon 2015); selection effects, where only high-productivity firms survive the tough competition of cities (Combes et al. 2012); and sorting, in which inherently more-productive firms and workers choose to locate in big cities (Combes et al. 2012). Disentangling these forces is crucial for understanding the benefits of city size. Are big cities productive because they make firms more efficient or because they attract the most efficient firms? This paper tackles that question with new evidence from U.S. manufacturing.

While agglomeration economies and selection effects have been extensively studied in the urban literature, the role of firm sorting has received far less empirical attention.
If high-productivity firms systematically sort into dense regions, then the observed productivity advantage of cities might be a composition effect rather than a pure agglomeration effect. Conversely, if lower-productivity firms benefit more from being in clusters, it would mean that agglomeration acts to narrow productivity gaps, casting doubt on sorting as the primary explanation. This paper fills that gap in the literature by examining whether agglomeration benefits vary across establishments in a way that could induce sorting.

To answer this question, I develop an empirical framework that allows agglomeration’s effect on productivity to differ for each establishment. Specifically, I propose a
model of establishment productivity that includes an interaction between local agglomeration intensity and the firm’s unobserved productivity. In essence, the production function is specified such that agglomeration economies can depend on a plant’s individual efficiency. This flexible, heterogeneous approach permits me to identify whether more-efficient firms experience larger or smaller productivity boosts from being in a dense manufacturing region. For example, if agglomeration benefits were higher for more-productive establishments, those firms would have a stronger incentive to locate in areas with greater industry density – a pattern I would interpret as evidence of positive sorting. On the other hand, if the benefits of agglomeration are greater for less-productive plants, it would imply an opposite sorting pattern (or perhaps no sorting), where high-productivity firms are not disproportionately drawn to clusters by productivity advantages.

Identifying sorting effects is challenging because firms’ underlying productivity is not directly observable to the researcher. I tackle this challenge by extending the control function approach of Olley and Pakes (1996) to my setting. In practice, this means I use a production function estimation method that controls for unobserved productivity shocks while allowing those shocks to interact with agglomeration intensity. By doing so, I correct for the fact that more-productive firms might endogenously choose higher inputs and favorable locations. This approach allows me to recover each establishment’s intrinsic productivity and to estimate the elasticity of output with respect to local agglomeration for firms at different points in the productivity distribution. In short, my empirical strategy disentangles genuine agglomeration economies from the sorting of firms by productivity, using panel data on manufacturing plants and modern econometric techniques to address bias.

Using confidential microdata on U.S.
manufacturing establishments from the U.S. Census Bureau, I find clear evidence of heterogeneous agglomeration benefits. In my estimates, agglomeration boosts productivity more for low-productivity plants than for high-productivity plants. Quantitatively, when moving from a plant at the 25th percentile of the productivity distribution to one at the 75th percentile, the output elasticity with respect to local agglomeration declines by about 10 to 40 percentage points. In other words, a relatively less-efficient factory sees a much larger productivity gain from being located in a dense manufacturing cluster than a highly efficient factory does. I interpret this finding as evidence against strong positive sorting being the driver of the urban productivity premium. If anything, the pattern suggests that density levels the playing field: It helps weaker firms catch up more than it helps the superstar firms. This casts doubt on firm sorting as the primary explanation for the positive correlation
between density and productivity observed in U.S. manufacturing.

This paper contributes to and expands the small but growing literature on firm sorting and agglomeration. There are only a few empirical papers that directly examine sorting, and their findings have differed. Forslid and Okubo (2014), for example, develop a theoretical model of spatial sorting with heterogeneous firms and two regions, and they find a non-monotonic sorting pattern: Firms with very high productivity (and high capital intensity) and firms with very low productivity (low capital intensity) tend to relocate to the larger region. In contrast, Gaubert (2018) estimates a richer general equilibrium model using French firm-level data and finds evidence of a positive interaction between agglomeration and firm productivity, meaning that more-productive firms gain more from locating in big cities. My results for U.S. manufacturing suggest the opposite interaction – high-productivity firms appear to gain less – thereby providing a new perspective on this question. Methodologically, my approach, which is based on a partial equilibrium production function model, requires fewer structural assumptions than the general equilibrium models used in prior work. In particular, I am able to let the data speak on how inputs and productivity jointly determine output in clustered versus non-clustered settings rather than imposing calibrated parameters. Unlike Gaubert (2018), who computes total factor productivity (TFP) as a residual and calibrates certain elasticities before assessing agglomeration effects, I jointly estimate productivity and agglomeration elasticities within a unified framework. This approach avoids potential biases from misspecification and provides a direct, data-driven estimate of the extent of sorting.

The findings have important implications for urban economic policy and industrial location decisions.
Policymakers often promote local “cluster” development or offer incentives for firms to locate in their city, aiming to harness agglomeration spillovers to boost regional growth. My results suggest that such place-based policies should consider the types of firms they target. If smaller or less-productive manufacturers reap the largest productivity gains from clustering, then policies that support the formation of industrial hubs – for instance, providing shared infrastructure, facilitating supplier networks, or establishing industrial parks – could disproportionately raise the productivity of those firms and help lagging regions. On the other hand, if highly productive firms experience little additional boost from being in an already dense area, expensive tax breaks to attract a superstar firm might yield fewer spillover benefits than expected. Understanding the sorting dynamic also guides individual firms’ location decisions: A moderately productive plant might see substantial gains from moving into a major industrial center, whereas a very productive plant may find that it can operate nearly as
efficiently in a smaller city or peripheral area without losing much in terms of agglomeration benefits. In summary, recognizing who benefits most from agglomeration can lead to more targeted and efficient urban economic policies, ensuring that efforts to stimulate local manufacturing yield the maximum possible improvement in productivity and competitive advantage.

The remainder of the paper is organized as follows. Section 2 describes the empirical model and identification strategy. Section 3 introduces the data used in the estimation. Finally, Section 4 presents the empirical results and discusses their implications for agglomeration and sorting in manufacturing.

2 Empirical model

The production function of an establishment takes the following form in logs:

$$v_{it} = \beta_0 + \beta_L l_{it} + \beta_K k_{it} + \tau_{it} + \epsilon_{it}, \tag{1}$$

where $i$ refers to the establishment, $t$ is time, $v_{it}$ is value added, $l_{it}$ is labor, $k_{it}$ is capital, $\tau_{it}$ is TFP, and $\epsilon_{it}$ is an i.i.d. shock not observed by the establishment before making input decisions in time $t$ and never observed by the econometrician. I assume that TFP is a function of the agglomeration intensity of the location where the establishment is based, $a_{it}$, and establishment-specific raw efficiency, $\omega_{it}$. More specifically, I assume the following: $\tau_{it} = \beta_a a_{it} + \omega_{it} + \gamma a_{it}\omega_{it}$. Here, $\omega_{it}$ represents the current and past productivity shocks that the establishment observes before making input decisions in $t$; the econometrician never observes $\omega_{it}$.¹

I measure agglomeration intensity as the number of employees of other establishments in the same sector that are within 10 kilometers from establishment $i$; more precisely, $a_{it}$ is the logarithm of agglomeration intensity plus one.² There is evidence in the literature that agglomeration delivers economically and statistically significant benefits to productivity. I am interested in testing whether such agglomeration benefits vary with the establishment’s raw efficiency.
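To make the agglomeration measure concrete, the following Python sketch computes the logarithm of one plus the same-sector employment of other establishments within a given radius. It is only an illustration under stated assumptions, not the code used for the estimates: the function name `agglomeration_intensity`, the planar coordinates in kilometers, the inclusive distance cutoff, and the toy data are all hypothetical.

```python
import math


def agglomeration_intensity(coords_km, employment, sector, radius_km=10.0):
    """Illustrative a_i = log(1 + employment of other same-sector plants within radius_km).

    coords_km  : list of (x, y) planar coordinates in kilometers (assumed already projected)
    employment : list of employment counts, one per establishment
    sector     : list of sector codes, one per establishment
    """
    n = len(coords_km)
    a = []
    for i in range(n):
        nearby = 0.0
        for j in range(n):
            if j == i or sector[j] != sector[i]:
                continue  # skip the plant itself and plants in other sectors
            dist = math.hypot(coords_km[i][0] - coords_km[j][0],
                              coords_km[i][1] - coords_km[j][1])
            if dist <= radius_km:  # inclusive cutoff is an assumption
                nearby += employment[j]
        a.append(math.log1p(nearby))  # logarithm of agglomeration intensity plus one
    return a


# Toy data (hypothetical): three plants in sector 3111 and one in sector 3118.
coords = [(0.0, 0.0), (3.0, 4.0), (30.0, 40.0), (0.0, 1.0)]
emp = [100, 50, 200, 10]
sec = ["3111", "3111", "3111", "3118"]
a10 = agglomeration_intensity(coords, emp, sec)                  # 10 km radius
a60 = agglomeration_intensity(coords, emp, sec, radius_km=60.0)  # wider radius
```

In the toy data, the first two plants are 5 kilometers apart and count each other's employment, while the third plant and the lone sector-3118 plant have no same-sector neighbors within 10 kilometers, so their measure is zero. The double loop is quadratic in the number of establishments; a real application would need a spatial index.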
¹ The assumption that agglomeration benefits are Hicks neutral is consistent with the literature; see Rosenthal and Strange (2004).
² Since taking the logarithm of a variable plus one has been shown to create bias in several estimation models, as a robustness check I estimated an alternative production function model that is a simple Cobb–Douglas when agglomeration intensity is zero and is the baseline model, equation (1), when agglomeration intensity is strictly higher than zero; in this alternative robustness model, $a_{it}$ is equal to the logarithm of agglomeration intensity instead of the logarithm of agglomeration intensity plus one, as in the baseline model. The results are qualitatively and quantitatively very similar.

As shown by the algebra in (2), $\beta_0$ and $\beta_a$ are not separately identified from the level of $\omega$. I replace $\omega$ with $\bar{\omega} - c$, where $\bar{\omega} = \omega + c$, and I obtain an identical equation with $\bar{\omega}$, $\bar{\beta}_0$, and $\bar{\beta}_a$ replacing $\omega$, $\beta_0$, and $\beta_a$, where $\bar{\beta}_0 = \beta_0 - c$ and $\bar{\beta}_a = \beta_a - c\gamma$.

$$\begin{aligned}
v_{it} &= \beta_0 + \beta_L l_{it} + \beta_K k_{it} + \beta_a a_{it} + \omega_{it} + \gamma a_{it}\omega_{it} + \epsilon_{it} \\
&= \beta_0 + \beta_L l_{it} + \beta_K k_{it} + \beta_a a_{it} + (\omega_{it} + c - c) + \gamma a_{it}(\omega_{it} + c - c) + \epsilon_{it} \\
&= \beta_0 - c + \beta_L l_{it} + \beta_K k_{it} + (\beta_a - c\gamma) a_{it} + (\omega_{it} + c) + \gamma a_{it}(\omega_{it} + c) + \epsilon_{it} \\
&= \bar{\beta}_0 + \beta_L l_{it} + \beta_K k_{it} + \bar{\beta}_a a_{it} + \bar{\omega}_{it} + \gamma a_{it}\bar{\omega}_{it} + \epsilon_{it}
\end{aligned} \tag{2}$$

Since $\beta_0$, $\beta_a$, and the level of $\omega_{it}$ are not separately identified, I replace $\omega_{it}$ with $\tilde{\omega}_{it} = \omega_{it} + \beta_0$ and $\beta_a$ with $\tilde{\beta}_a = (\beta_a - \gamma\beta_0)$. The production function I estimate is then

$$v_{it} = \beta_L l_{it} + \beta_K k_{it} + \tilde{\beta}_a a_{it} + \tilde{\omega}_{it} + \gamma a_{it}\tilde{\omega}_{it} + \epsilon_{it}. \tag{3}$$

Before going into the details of the estimation, it is useful to provide the intuition behind the estimation approach and why it is needed. Estimating the production function with an agglomeration term – equation (3) – poses a classic identification challenge: More-productive firms may choose higher inputs and potentially different locations, which can bias naive estimates of agglomeration benefits. To address this, I employ a control function approach in the spirit of Olley and Pakes (1996) and Levinsohn and Petrin (2003). This semiparametric estimator uses a firm’s decision on intermediate input purchases as a proxy for its unobserved productivity shocks. The intuition is that, given certain regularity conditions, a more-productive establishment will employ more intermediate inputs, so observed intermediate input use can reveal the establishment’s current productivity level to the econometrician. I extend the traditional control function method to allow unobserved productivity to interact with an observed agglomeration measure.
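As a side check on the normalization in equations (2) and (3), the short Python sketch below verifies numerically that shifting raw efficiency by a constant, or absorbing the intercept into it, leaves value added unchanged. It also checks that the agglomeration elasticity, the coefficient on agglomeration plus γ times raw efficiency, is the same in the original and normalized parameterizations. All parameter values, the shift `c`, and the simulated draws are hypothetical and chosen only for illustration.

```python
import random

random.seed(0)

# Hypothetical parameter values, chosen only for illustration.
beta_0, beta_L, beta_K, beta_a, gamma = 1.5, 0.3, 0.6, 0.05, -0.4
c = 0.7  # arbitrary shift in the level of raw efficiency


def log_value_added(b0, ba, l, k, a, omega, eps):
    """Right-hand side of equation (2) for one observation."""
    return b0 + beta_L * l + beta_K * k + ba * a + omega + gamma * a * omega + eps


max_gap_shift = 0.0  # original parameters vs. parameters shifted by c
max_gap_norm = 0.0   # original parameters vs. the normalized form in equation (3)
max_gap_elast = 0.0  # elasticity in original vs. normalized parameters

for _ in range(1000):
    l, k, omega, eps = (random.gauss(0.0, 1.0) for _ in range(4))
    a = random.uniform(0.0, 8.0)

    v = log_value_added(beta_0, beta_a, l, k, a, omega, eps)
    v_shift = log_value_added(beta_0 - c, beta_a - c * gamma, l, k, a, omega + c, eps)

    omega_tilde = omega + beta_0            # normalized raw efficiency
    beta_a_tilde = beta_a - gamma * beta_0  # normalized agglomeration coefficient
    v_norm = log_value_added(0.0, beta_a_tilde, l, k, a, omega_tilde, eps)

    max_gap_shift = max(max_gap_shift, abs(v - v_shift))
    max_gap_norm = max(max_gap_norm, abs(v - v_norm))
    max_gap_elast = max(
        max_gap_elast,
        abs((beta_a + gamma * omega) - (beta_a_tilde + gamma * omega_tilde)),
    )
```

All three gaps should be zero up to floating-point rounding, which is the sense in which only the normalized coefficient and normalized raw efficiency can be recovered from the data.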
Practically, this means that total productivity – including unobserved productivity, local agglomeration intensity, and their interaction – is assumed to be a function of intermediate input choices. By doing so, I account for the possibility that firms may invest differently in intermediate inputs, depending on the external economies they experience in their location. Under the reasonable assumptions I discuss next, I can recover the unobserved productivity term as a function of observable inputs and agglomeration intensity. This yields an estimated production function that includes the following: (i) conventional input elasticities for labor and capital, (ii) an agglomeration elasticity term, and (iii) an interaction term capturing how agglomeration’s effect varies with firm productivity (the
parameter $\gamma$ in my model). I verify that the conditions for identification discussed in the literature hold in my setting, lending credibility to my estimation of the interaction effect.

I will now go into the specifics of the estimation procedure, starting with the assumptions I make.

Assumption 1 Information Set: The establishment’s information set at $t$ – that is, $I_{it}$ – includes current and past raw efficiency $\{\omega_{i\tau}\}_{\tau=0}^{t}$ but does not include future raw efficiency $\{\omega_{i\tau}\}_{\tau=t+1}^{\infty}$. The transitory shocks $\epsilon_{it}$ satisfy $E[\epsilon_{it} \mid I_{it}] = 0$.

Assumption 2 First-Order Markov: Raw efficiency evolves according to a first-order Markov process: $p(\omega_{it+1} \mid I_{it}) = p(\omega_{it+1} \mid \omega_{it})$. This distribution is known to firms and is stochastically increasing in raw efficiency.

Assumption 3 Timing of Input Choices: Firms accumulate capital according to $k_{it} = \kappa(k_{it-1} \mid i_{it-1})$, where investment $i_{it-1}$ is chosen in period $t-1$. Labor input is flexible and chosen in period $t$.

Assumption 4 Timing of Location Choices: Location choices by establishments are made when the establishment is set up. Establishments do not move. Establishments enter a location as startups and exit when they shut down. Location choices need to be made a period in advance and are therefore based on the previous-period information set $I_{it-1}$.

Assumptions 1 through 3 are standard in the literature. Assumption 4 on location choices is driven by the data I use; in the Census’ Longitudinal Business Database (LBD), the establishment identity is defined by its geographical location, so establishments do not change location.³ Gaubert (2018) only relies on data from one year for the structural estimation – specifically, 2000; consequently, she also does not exploit location changes resulting from firms endogenously relocating to identify the sorting parameter.

³ As I identify the location of an establishment with its ZIP Code, it is important to note that ZIP Code boundaries can change over time. Therefore, while changes in location within the data are possible, they should not be driven by the establishment’s endogenous choices.

Assumption 5 Firms’ intermediate input demand is given by $m_{it} = f(k_{it}, l_{it}, \tau_{it})$.

Assumption 6 Strict Monotonicity: $f(k_{it}, l_{it}, \tau_{it})$ is strictly increasing in $\tau_{it}$.

I assume that establishments decide what level of intermediate input to use based on their capital and productivity levels and that the intermediate input demand is strictly increasing in productivity. It should be noted that the non-separability between raw efficiency $\omega$ and agglomeration $a$ implies that the intermediate input choice in this model
is not necessarily strictly increasing in $\omega$, as assumed in the previous literature. However, it seems reasonable to assume that it is strictly increasing in overall productivity $\tau$; in other words, I am assuming, like Levinsohn and Petrin (2003), that more-productive firms, conditional on capital, use more intermediate inputs. The conditions for Assumption 6 are the same as in Levinsohn and Petrin (2003), with $\tau$ instead of $\omega$.

Following Assumption 6, I can invert $f$ and express productivity $\tau$ as a function of capital and intermediate input use:

$$v_{it} = \beta_0 + \beta_l l_{it} + \beta_k k_{it} + f^{-1}(k_{it}, l_{it}, m_{it}) + \epsilon_{it} \tag{4}$$

$$\phantom{v_{it}} = \Phi(k_{it}, l_{it}, m_{it}) + \epsilon_{it}, \tag{5}$$

where

$$\Phi(k_{it}, l_{it}, m_{it}) = \beta_l l_{it} + \beta_k k_{it} + \tilde{\beta}_a a_{it} + \tilde{\omega}_{it} + \gamma a_{it}\tilde{\omega}_{it}. \tag{6}$$

I can estimate $\Phi(k_{it}, l_{it}, m_{it})$ in (5) by replacing $\Phi(k_{it}, l_{it}, m_{it})$ with a third-order polynomial in $k_{it}$, $l_{it}$, and $m_{it}$. To avoid the identification issues raised by Ackerberg et al. (2015) and Gandhi et al. (2020), I take advantage of the firm’s first-order condition with respect to labor to identify $\beta_l$. It can be shown that

$$\beta_l = \frac{W_{it}L_{it}\,e^{\epsilon_{it}}}{P_{it}Q_{it}\,E[e^{\epsilon_{it}}]}, \tag{7}$$

where $\frac{W_{it}L_{it}}{P_{it}Q_{it}}$ is the labor share. Estimating (5) produces an estimate of $\epsilon_{it}$, which I use to estimate $\hat{\beta}_l$ using (7). Raw efficiency can then be derived from (6) as

$$\tilde{\omega}_{it} = \frac{\Phi(k_{it}, l_{it}, m_{it}) - \hat{\beta}_l l_{it} - \beta_k k_{it} - \tilde{\beta}_a a_{it}}{1 + \gamma a_{it}}. \tag{8}$$

Assumptions 2 and 3 imply that I can decompose raw efficiency into its conditional expectation at time $t-1$ and an innovation term $\eta$:

$$\tilde{\omega}_{it} = E(\tilde{\omega}_{it} \mid I_{t-1}) + \eta_{it}. \tag{9}$$

$E(\tilde{\omega}_{it} \mid I_{t-1})$ in (9) can be estimated non-parametrically with a third-order polynomial to retrieve the innovation term $\eta_{it}$. $\epsilon_{it}$ and $\eta_{it}$ are orthogonal to the information set in period $t-1$, $I_{t-1}$: $E[(\eta_{it} + \epsilon_{it}) \mid I_{t-1}] = 0$. This can be used to derive moment conditions and
estimate the model parameters with GMM. I use as instruments variables determined in $t-1$: capital and agglomeration from period $t$, labor and materials from period $t-1$, and interactions between them. The moment conditions are then as follows:

$$E\left[\left(\hat{\eta}_{it}(\beta_k, \tilde{\beta}_a, \gamma) + \hat{\epsilon}_{it}(\beta_k, \tilde{\beta}_a, \gamma)\right) \otimes
\begin{pmatrix}
1 \\ k_{it} \\ a_{it} \\ m_{it-1} \\ l_{it-1} \\
k_{it}a_{it} \\ k_{it}l_{it-1} \\ k_{it}m_{it-1} \\ a_{it}l_{it-1} \\ a_{it}m_{it-1} \\ l_{it-1}m_{it-1} \\
k_{it}a_{it}m_{it-1} \\ k_{it}a_{it}l_{it-1} \\ k_{it}m_{it-1}l_{it-1} \\ a_{it}m_{it-1}l_{it-1} \\
k_{it}a_{it}m_{it-1}l_{it-1}
\end{pmatrix}\right] = 0. \tag{10}$$

In Appendix A, I perform a brief Monte Carlo study, using a similar setup to Ackerberg et al. (2015), and find that my algorithm performs well.

3 Data description

The main data sources are the U.S. Census Bureau’s Annual Survey of Manufactures (ASM) and Census of Manufactures (CM). The model is estimated separately on seven four-digit sectors: 3111 (Animal Food Manufacturing), 3113 (Sugar and Confectionery Product Manufacturing), 3116 (Animal Slaughtering and Processing), 3118 (Bakeries and Tortilla Manufacturing), 3141 (Textile Furnishings Mills), 3152 (Cut and Sew Apparel Manufacturing), and 3162 (Footwear Manufacturing).⁴ The sample period is 1991 to 2019. I include in my sample only establishments that were included in the ASM with certainty during at least one survey year. Moreover, I drop observations with labor share below the 2.5th percentile and above the 97.5th percentile. After the cleaning, the resulting samples cover most of the value added in the original ASM samples.

⁴ Since the estimation is very time consuming, I ran the model only on four-digit sectors within the NAICS 31 sector, with the exception of sectors ending in 9, which are residual groups that combine different types of establishments. I was not able to achieve identification for the sectors not included in Table 2, as some bootstrap simulations, or in a few cases the point estimates themselves, were economically nonsensical and the confidence intervals too large to make a precise inference possible. For sectors 3116 and 3162, I had to re-run a few bootstrap simulations to achieve identification.

I follow Foster et al. (2016) and Cunningham et al. (2023) in the computation of value added and the production function inputs: Nominal output is measured as total shipments and deflated using an industry-level measure from the NBER–CES Manufacturing Industry Database, capital is computed with a perpetual inventory method, labor is measured as total hours, materials are measured separately for physical materials and energy and each is deflated by an industry-level deflator, and, finally, value added is computed as output minus materials.

I measure agglomeration intensity as the number of employees in the same four-digit sector within 10 kilometers. I merged the ASM–CM sample with the LBD to compute agglomeration intensity, as the LBD provides the location of all establishments in the United States during my sample period. The distances among establishments are computed as Euclidean distances among the ZIP Code centroids.

Table 1: Preliminary evidence

| | My dataset | My dataset | 2012 CM | 2012 CM |
|---|---|---|---|---|
| Log density | 0.025 (0.002) | | 0.024 (0.002) | |
| Log total employment | | 0.016 (0.002) | | 0.025 (0.003) |
| N. Obs. | 28,500 | 28,500 | 2,900 | 2,900 |
| R² | 0.005 | 0.002 | 0.036 | 0.032 |

Notes: Standard errors are reported in parentheses. N. Obs. is the rounded number of observations.

⁵ TFP is computed as in Foster et al. (2016) and Cunningham et al. (2023).

To get a first taste of the data, I ran regressions relating the density of an area to its productivity. Such regressions are popular in the literature; see Combes and Gobillon (2015). More precisely, I ran a regression of the mean log TFP of the establishments in a ZIP Code on either the log density of the ZIP Code or its log total manufacturing employment.⁵ I ran the regressions on the dataset described above (including all seven four-digit sectors) and on the 2012 CM sample. As shown in Table 1, the agglomeration elasticity is estimated to be around 2.5%, on the lower end of the range of estimates in the literature. It must be noted that the R² statistics are low compared to similar models
in other studies.

4 Empirical results and discussion

4.1 Main findings: heterogeneity in agglomeration benefits

Table 2 presents parameter estimates for two models: a simple Cobb–Douglas production function with only capital and labor (Column “CB”), and the full model (1) (Column “Full”). The table displays for each four-digit sector the coefficients on capital, $\beta_k$, labor, $\beta_l$, and the interaction between unobserved raw productivity and agglomeration intensity, $\gamma$.

Table 2: Estimates

Panel A

| | 3111 CB | 3111 Full | 3113 CB | 3113 Full | 3116 CB | 3116 Full | 3118 CB | 3118 Full |
|---|---|---|---|---|---|---|---|---|
| β_k | 0.661 (0.029) | 0.648 (0.207) | 0.682 (0.022) | 0.787 (0.072) | 0.591 (0.016) | 0.705 (0.037) | 0.651 (0.010) | 0.675 (0.037) |
| β_l | 0.203 (0.004) | 0.203 (0.004) | 0.248 (0.006) | 0.248 (0.006) | 0.299 (0.004) | 0.299 (0.004) | 0.303 (0.004) | 0.303 (0.004) |
| γ | | -0.146 (0.121) | | -0.473 (0.183) | | -0.476 (0.358) | | -0.667 (0.128) |
| N. Obs. | 11,500 | | 4,800 | | 17,000 | | 16,000 | |
| N. Estab. | 1,400 | | 550 | | 1,700 | | 2,400 | |

Panel B

| | 3141 CB | 3141 Full | 3152 CB | 3152 Full | 3162 CB | 3162 Full |
|---|---|---|---|---|---|---|
| β_k | 0.535 (0.026) | 0.490 (0.055) | 0.433 (0.016) | 0.534 (0.049) | 0.542 (0.049) | 0.987 (0.220) |
| β_l | 0.345 (0.007) | 0.345 (0.007) | 0.434 (0.005) | 0.434 (0.005) | 0.415 (0.012) | 0.415 (0.012) |
| γ | | -0.470 (0.124) | | -0.618 (0.124) | | -0.409 (0.205) |
| N. Obs. | 4,800 | | 14,000 | | 1,800 | |
| N. Estab. | 700 | | 3,700 | | 250 | |

Notes: The table shows the parameter estimates for 7 four-digit industries. CB stands for Cobb–Douglas, and Full refers to model (1). Bootstrapped standard errors are reported in parentheses. N. Obs. is the rounded number of observations. N. Estab. is the rounded number of unique establishments. N. Obs. and N. Estab. are reported once per sector.

As shown in Table 2, the estimates of capital elasticity are a touch higher in the
full model than in the Cobb–Douglas model for most sectors. However, the confidence intervals are wide enough that the difference is not statistically significant for most sectors. The elasticity of labor is, by construction, identical under the two models and is around 30%. The estimated interaction term ($\gamma$) between local agglomeration intensity and firm-level productivity is negative in virtually all specifications. This means that higher-productivity plants experience a smaller marginal gain from agglomeration than lower-productivity plants.⁶

⁶ Since interpreting $\tilde{\beta}_a$ is challenging because of the identification issue discussed in Section 2, I did not include it in Table 2.

To analyze the agglomeration side of the production function, I combine the estimates of $\tilde{\beta}_a$ and $\gamma$ with the distribution of raw productivity $\tilde{\omega}$. This analysis is presented in Table 3, where I calculate the average output elasticity with respect to agglomeration, Avg. $= E(\beta_a + \gamma\omega)$, and its interquartile range, IQR $= \gamma(\omega_{75} - \omega_{25})$. Notably, the terms involving $\beta_0$ cancel out when computing both Avg. and IQR, allowing me to express them as functions of $\beta_a$ and $\omega$ instead of their normalized counterparts $\tilde{\beta}_a$ and $\tilde{\omega}$.

The average agglomeration benefits (“Avg.” in Table 3) appear to be quite small, as they are close to zero and statistically insignificant for most sectors. This contrasts with findings from Table 1 and the existing literature, which reports agglomeration externalities ranging from 1% to 12%; see Rosenthal and Strange (2004) and Combes and Gobillon (2015) on urbanization economies and Greenstone et al. (2010) for a similar production function setting. Even though the average agglomeration benefits appear modest, the IQR estimates in Table 3 reveal a broad variation, indicating that many firms derive substantial advantages from agglomeration. These findings serve as a caution against estimating agglomeration benefits without carefully identifying productivity and considering the interactions between productivity and agglomeration intensity.

Table 3: Agglomeration elasticity

| | 3111 | 3113 | 3116 | 3118 | 3141 | 3152 | 3162 |
|---|---|---|---|---|---|---|---|
| Avg. | -0.652 (0.201) | 0.003 (0.027) | -0.002 (0.024) | -0.012 (0.089) | 0.000 (0.026) | 0.000 (0.036) | 0.013 (0.107) |
| IQR | -0.419 (0.107) | -0.155 (0.060) | -0.112 (0.024) | -0.086 (0.039) | -0.184 (0.022) | -0.130 (0.050) | -0.322 (0.196) |

Notes: Avg. is equal to $E(\beta_a + \gamma\omega)$. IQR is equal to $\gamma(\omega_{75} - \omega_{25})$. Bootstrapped standard errors are reported in parentheses.

The interquartile range of the elasticity with respect to agglomeration intensity (“IQR” in Table 3) is always negative and statistically significant. The data strongly supports a
negative interaction between establishment-level raw productivity and agglomeration benefits. Quantitatively, moving from a relatively low-productivity establishment (at the 25th percentile of the productivity distribution) to a high-productivity establishment (the 75th percentile) is associated with a decline in the agglomeration elasticity of output on the order of 10 to 40 percentage points. In other words, the establishments benefiting the most from a dense industrial environment tend to be the least productive ones, whereas the most-productive firms enjoy more modest gains from local agglomeration. This finding is illustrated in Table 3 by the consistently negative estimates I obtain for the IQR difference in agglomeration elasticity across productivity levels. The negative and statistically significant IQR effect confirms that the production function is not log-supermodular in density and productivity, contrary to what one would expect if only the most productive firms reaped the largest external gains, as in Gaubert (2018).

From an economic standpoint, this result is somewhat surprising: It suggests that smaller or less-efficient manufacturers rely more heavily on external economies of scale (e.g., shared suppliers and local knowledge) to improve their performance, whereas the industry leaders do so to a lesser extent. A possible explanation is that highly productive firms may already be operating near the technological frontier or have better internal processes, leaving less scope for external factors to further raise their productivity. In contrast, firms with lower innate productivity have more to gain from the networking, learning, and input-sharing opportunities that agglomeration provides. As a result, the productivity boost from the local industrial environment exhibits diminishing returns as a firm’s own efficiency rises.
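The two summary statistics reported in Table 3 can be illustrated with a short Python sketch. It takes an agglomeration coefficient, the interaction parameter γ, and a vector of raw productivities, and returns the average elasticity and the difference in elasticity between the 75th and 25th percentiles of productivity. This is a hedged illustration only: the function name, the percentile interpolation rule (the `inclusive` method of `statistics.quantiles`), and the toy inputs are assumptions, and the numbers are not estimates from the paper.

```python
import statistics


def agglomeration_elasticity_summary(beta_a, gamma, omega):
    """Return (Avg., IQR) of the elasticity beta_a + gamma * omega across establishments.

    Avg. = E(beta_a + gamma * omega)
    IQR  = gamma * (omega_75 - omega_25)
    """
    avg = beta_a + gamma * statistics.fmean(omega)
    # Quartiles of raw productivity; the interpolation rule is an assumption.
    q25, _, q75 = statistics.quantiles(omega, n=4, method="inclusive")
    iqr = gamma * (q75 - q25)
    return avg, iqr


# Toy inputs (hypothetical values, not estimates from the paper).
omega = [0.0, 1.0, 2.0, 3.0, 4.0]
avg, iqr = agglomeration_elasticity_summary(beta_a=1.0, gamma=-0.5, omega=omega)
# Here avg is 0.0 and iqr is -1.0: the average benefit is zero even though the
# elasticity falls by one full point between the 25th and 75th percentiles.
```

The toy case mirrors the qualitative pattern discussed above: an average elasticity near zero can coexist with a sizable negative interquartile difference when γ is negative.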
It is important to note that my evidence casts doubt on firm sorting by productivity as the primary explanation for the well-documented positive correlation between density and productivity. If the most productive firms were endogenously sorting into dense regions because they gain the most from those environments, I would expect to find a positive interaction term (higher agglomeration elasticity for higher-productivity firms). I find the opposite. Thus, while more-productive firms are indeed found in denser areas, on average, my results imply this pattern is not because those firms derive outsized productivity benefits from agglomeration. Other mechanisms must be at play to explain why high-productivity firms tend to be in cities.

4.2 Robustness checks

I now consider the robustness of these findings. I subject the core result – that agglomeration’s benefits decline with firm productivity – to a variety of robustness checks.
These checks verify that my findings are not an artifact of a particular measurement of “agglomeration” or driven by specific modeling assumptions. First, I alter the geographic scope used to define agglomeration intensity. My baseline measure considers employment in the same four-digit NAICS industry within a 10-kilometer radius. I reestimate the models using a narrower radius of 5 kilometers and a much wider radius of 50 kilometers. Intuitively, a 5-kilometer radius captures very localized clusters, while a 50-kilometer radius extends to a broader regional scale, where spillovers might dissipate. Second, I change the industry definition of the agglomeration measure: Instead of counting only same-four-digit NAICS neighbors, I use a coarser three-digit NAICS classification (grouping more-related industries together) to define nearby activity. This addresses whether my results are sensitive to using a tight industry definition (localization economies) versus a slightly broader industry grouping (which might incorporate some cross-industry externalities).

Table 4: Robustness on IQR estimates

                      3111   3113   3116   3118   3141   3152   3162
Five Kilometers       -*     -      -      -**    -**    +      -
Fifty Kilometers      -      +      +      -*     -**    +      -*
Three-digit agglom.   -      n.c.   -*     -*     -**    -**    -

Notes: ** denotes statistical significance at the 1% level, and * denotes statistical significance at the 5% level. “n.c.” stands for not converged.

Table 4 reports the results of these robustness exercises. Because of data confidentiality constraints, I report only the sign and significance of some estimates rather than the exact magnitudes. Across these alternative specifications, the pattern remains consistent. The estimated interaction effect (expressed again as the difference in agglomeration elasticity between higher- and lower-productivity plants) stays negative in almost all cases. In the majority of industries and alternative definitions, the less-productive establishments continue to show higher returns to local density than their more-productive counterparts.
The only notable deviations occur under the 50-kilometer radius definition, where the interaction effect is positive, albeit not statistically significant, in three sectors. This attenuation is expected – a very wide radius likely dilutes the true local externalities by including relatively distant activity that offers limited direct interaction or spillovers. In contrast, using the broader three-digit industry aggregation yields the same sign and significance of the productivity–agglomeration interaction as in the baseline specification.
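The alternative agglomeration measures used in these checks can be illustrated with a minimal Python sketch. It is not the code used in the paper: the function names, the use of great-circle distances between latitude and longitude pairs, and the exclusion of a plant's own employment are all assumptions made for illustration. The `radius_km` and `digits` arguments correspond to the radius (5, 10, or 50 kilometers) and the NAICS aggregation level (four or three digits) varied in Table 4.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers; inputs are in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(h))


def agglomeration_intensity(lat, lon, naics, employment, radius_km=10.0, digits=4):
    """Employment at other plants in the same NAICS group within radius_km.

    Hypothetical helper: excluding own employment is an assumption of this
    sketch, not something stated in the text.
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    employment = np.asarray(employment, dtype=float)
    # Truncate NAICS codes to the requested number of digits.
    group = np.array([str(code)[:digits] for code in naics])
    # Pairwise distances between all plants (fine for small illustrative samples).
    dist = haversine_km(lat[:, None], lon[:, None], lat[None, :], lon[None, :])
    neighbor = (dist <= radius_km) & (group[:, None] == group[None, :])
    np.fill_diagonal(neighbor, False)  # drop own employment (assumption)
    return neighbor.astype(float) @ employment


# Toy data: four hypothetical plants on the same meridian.
toy_lat = [40.0, 40.05, 40.3, 40.3]
toy_lon = [-75.0, -75.0, -75.0, -75.0]
toy_naics = ["3111", "3111", "3113", "3111"]
toy_emp = [100.0, 50.0, 30.0, 20.0]

for radius, digits in [(10.0, 4), (5.0, 4), (50.0, 4), (10.0, 3)]:
    values = agglomeration_intensity(toy_lat, toy_lon, toy_naics, toy_emp, radius, digits)
    print(f"radius={radius:>4} km, NAICS digits={digits}: {values}")
```

The dense pairwise distance matrix keeps the sketch short; with a large number of plants a spatial index would be a more practical choice.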

4.3 Implications for theory and policy

These empirical findings carry important implications for theories of urban agglomeration and the interpretation of past results. The evidence suggests that the production function for manufacturing establishments is not supermodular in firm productivity and local agglomeration intensity. In theoretical terms, I do not observe the complementarity that would make high-productivity firms benefit disproportionately from being in a dense cluster. Such complementarity is a key assumption in models where “the best firms end up in the best places” through endogenous sorting. By rejecting this assumption, my results indicate that one should be cautious in attributing the productivity advantages of cities to a matching of inherently more-efficient firms with richer external environments.

It is useful to contrast my findings with those of Gaubert (2018), one of the few other empirical studies on firm sorting. Gaubert (2018) develops a general equilibrium model of heterogeneous firms across cities and, using French data, finds evidence of a positive interaction between firm productivity and agglomeration benefits. In her framework, more-productive firms see larger gains from locating in big cities, which in turn reinforces their incentive to sort into those locations. By contrast, my results for U.S. manufacturing show a negative interaction – implying that, in my data, highly productive firms do not enjoy larger marginal external benefits. How can we reconcile these differences? One explanation may be methodological. Gaubert’s approach entails calibrating certain parameters (e.g., factor elasticities), computing productivity as a residual, and then matching moments in a structural model. If productivity is mismeasured or if input adjustments are not fully accounted for, one might mistakenly infer a positive sorting effect.
My estimation, by directly controlling for input choices and jointly estimating productivity, avoids relying on pre-calculated TFP measures. This could lead to a more accurate (and in this case, lower) estimate of the productivity–density interaction. Another reason could be country differences: The industrial structure and spatial configuration in the United States might differ from France in ways that affect agglomeration economies.

The finding that agglomeration benefits decline with firm productivity has practical implications for regional economic policy, especially place-based development programs. As shown by Bartik (2020), governments at the federal, state, and local levels increasingly invest in place-based policies – spending on the order of $60 billion per year – to spur job creation and productivity growth in specific areas. Examples include enterprise zones, relocation incentives, cluster development grants, and infrastructure investments targeted at lagging regions. A common rationale behind these policies is the expectation of agglomeration spillovers: By attracting firms (ideally high-performing ones) to a region, policymakers hope to create self-reinforcing cycles of growth through clustering effects.

My results offer a more nuanced perspective that can improve the design and evaluation of such policies by underscoring the importance of considering firm heterogeneity in policy implementation. If policymakers aim to maximize local productivity spillovers, they should recognize that attracting the most-productive firms may not yield the largest external benefits per firm. Instead, policies that support a mix of firms, especially those that can gain the most from density, may be more effective in fostering overall economic growth. A superstar manufacturing plant, while certainly beneficial for many reasons, might not experience a large productivity jump just from being in a given location if it is already very efficient. In contrast, a moderately productive plant could substantially improve its performance when placed in a dense industrial cluster by leveraging knowledge transfers, skilled labor pools, or supplier networks that it did not have access to before. This suggests that policies that are purely focused on luring marquee firms (for example, offering large tax breaks to big multinationals to locate in a depressed area) might not fully capitalize on agglomeration economies – their presence might not diffuse as much additional efficiency to themselves or others as one would hope. Meanwhile, supporting a critical mass of small and medium enterprises or slightly less-productive firms to co-locate and interact could potentially generate more substantial productivity improvements collectively because these firms are more responsive to the agglomeration stimulus.

References

Ackerberg, Daniel A, Kevin Caves, and Garth Frazer, “Identification properties of recent production function estimators,” Econometrica, 2015, 83 (6), 2411–2451.
Bartik, Timothy J, “Using place-based jobs policies to help distressed communities,” Journal of Economic Perspectives, 2020, 34 (3), 99–127.

Behrens, Kristian and Frédéric Robert-Nicoud, “Agglomeration theory with heterogeneous agents,” Handbook of Regional and Urban Economics, 2015, 5, 171–245.

Biesebroeck, Johannes Van, “Robustness of productivity estimates,” The Journal of Industrial Economics, 2007, 55 (3), 529–569.

Combes, Pierre-Philippe and Laurent Gobillon, “Chapter 5 - The Empirics of Agglomeration Economies,” in Gilles Duranton, J. Vernon Henderson, and William C. Strange, eds., Handbook of Regional and Urban Economics, Vol. 5, Elsevier, January 2015, pp. 247–348.

Combes, Pierre-Philippe, Gilles Duranton, Laurent Gobillon, Diego Puga, and Sébastien Roux, “The Productivity Advantages of Large Cities: Distinguishing Agglomeration From Firm Selection,” Econometrica, November 2012, 80 (6), 2543–2594.

Cunningham, Cindy, Lucia Foster, Cheryl Grim, John Haltiwanger, Sabrina Wulff Pabilonia, Jay Stewart, and Zoltan Wolf, “Dispersion in Dispersion: Measuring Establishment-Level Differences in Productivity,” Review of Income and Wealth, 2023, 69 (4), 999–1032.

Foster, Lucia, Cheryl Grim, and John Haltiwanger, “Reallocation in the great recession: cleansing or not?,” Journal of Labor Economics, 2016, 34 (S1), S293–S331.

Gandhi, Amit, Salvador Navarro, and David A. Rivers, “On the identification of gross output production functions,” Journal of Political Economy, 2020, 128 (8), 2973–3016.

Gaubert, Cecile, “Firm sorting and agglomeration,” American Economic Review, 2018, 108 (11), 3117–3153.

Greenstone, Michael, Richard Hornbeck, and Enrico Moretti, “Identifying agglomeration spillovers: Evidence from winners and losers of large plant openings,” Journal of Political Economy, 2010, 118 (3), 536–598.

Levinsohn, James and Amil Petrin, “Estimating production functions using inputs to control for unobservables,” The Review of Economic Studies, 2003, 70 (2), 317–341.

Olley, G Steven and Ariel Pakes, “The Dynamics of Productivity in the Telecommunications Equipment Industry,” Econometrica, 1996, 64 (6), 1263–1297.

Rosenthal, Stuart S and William C Strange, “Evidence on the nature and sources of agglomeration economies,” in “Handbook of Regional and Urban Economics,” Vol. 4, 2004, pp. 2119–2171.

A Online Appendix: Monte Carlo simulations

I now posit a data generating process to simulate data and test the estimation algorithm described in Section 2.

A.1 Data generating process

It is well outside the scope of this paper to develop a general equilibrium model of firm location choices with dynamic capital investment. Thus, I will replace agglomeration with an AR(1) process. Following Ackerberg et al. (2015), the production function is Leontief in materials:

$$Q_{it} = \min\left\{ \beta_0 K_{it}^{\beta_K} L_{it}^{\beta_L} e^{\tau_{it}},\ \beta_M M_{it} \right\} e^{\epsilon_{it}} \tag{11}$$

where $i$ refers to the establishment, $t$ to time, $L$ is labor, $K$ is capital, $M$ is materials, $\epsilon_{it}$ is an unobservable shock to production, and $\tau_{it}$ is productivity. I assume $\epsilon_{it} \sim N(0, \sigma_\epsilon^2)$ and constant returns to scale, $\beta_K + \beta_L = 1$. I assume that productivity is a function of both agglomeration, $a_{it}$, and idiosyncratic establishment-specific raw efficiency, $\omega_{it}$. More specifically, I assume $\tau_{it} = \beta_a a_{it} + \omega_{it} + \gamma a_{it}\omega_{it}$. The firm observes productivity at time $t$ before making labor decisions. Capital is not flexible and evolves according to $K_{it+1} = (1-\delta)K_{it} + I_{it}$. I also assume that firms’ wages follow the process .

The establishment chooses labor to maximize revenues minus labor costs, where the price of output is set to 1:

$$\max_{L_{it}} E_t\left\{ Q_{it} - L_{it} W_{it} \right\} \tag{12}$$

The first order condition is

$$E_t\left( \beta_0 \beta_L K_{it}^{\beta_K} L_{it}^{\beta_L - 1} e^{\tau_{it}} e^{\epsilon_{it}} \right) - W_{it} = 0 \tag{13}$$

which provides the optimal labor choice:

$$L_{it} = \beta_0^{\frac{1}{1-\beta_L}} \left( \frac{\beta_L}{W_{it}} \right)^{\frac{1}{1-\beta_L}} K_{it}^{\frac{\beta_K}{1-\beta_L}} e^{\frac{\tau_{it}}{1-\beta_L}} e^{0.5 \frac{\sigma_\epsilon^2}{(1-\beta_L)^2}} \tag{14}$$

where I use that $E(e^{\alpha\epsilon}) = e^{0.5\alpha^2\sigma_\epsilon^2}$.

The investment choice is an intertemporal choice. The establishment maximizes the discounted stream of profits. The discount factor is $\beta$ and there are investment adjustment costs equal to $\frac{\phi_i}{2} I_{it}^2$.

$$\max_{I_{it}} E_t \sum_{t=0}^{\infty} \beta^t \left\{ Q_{it} - \frac{\phi_i}{2} I_{it}^2 \right\} \tag{15}$$

subject to the production function and the capital accumulation dynamics. Following Van Biesebroeck (2007), the first order conditions are as follows:

$$-\phi_i I_{it} + E_t\, \beta \left[ \frac{\partial Q_{it+1}}{\partial I_{it}} - \phi_i I_{it+1} \frac{\partial I_{it+1}}{\partial I_{it}} \right] = 0 \tag{16}$$

$$\phi_i I_{it} = \beta E_t\left[ \beta_0^{\frac{1}{1-\beta_L}} \left( \frac{\beta_L}{W_{it+1}} \right)^{\frac{\beta_L}{1-\beta_L}} \beta_K K_{it+1}^{\frac{\beta_K}{1-\beta_L} - 1} e^{\frac{\tau_{it+1}}{1-\beta_L}} e^{0.5 \frac{\beta_L \sigma_\epsilon^2}{(1-\beta_L)^2}} e^{\epsilon_{it+1}} \right] - \phi_i \beta E_t\left[ -(1-\delta) I_{it+1} \right] \tag{17}$$

$$\phi_i I_{it} = \beta \beta_0^{\frac{1}{1-\beta_L}} \beta_K \beta_L^{\frac{\beta_L}{1-\beta_L}} E_t\left[ e^{\frac{\tau_{it+1}}{1-\beta_L} - \frac{\beta_L}{1-\beta_L} \ln W_{it+1} + 0.5 \frac{\beta_L \sigma_\epsilon^2}{(1-\beta_L)^2} + \epsilon_{it+1}} \right] + (1-\delta) \beta E_t\left[ \phi_i I_{it+1} \right] \tag{18}$$

Forward substituting the optimal investment choice in equation (18), I get

$$I_{it} = \frac{\beta \beta_K}{\phi_i} \beta_0^{\frac{1}{1-\beta_L}} \beta_L^{\frac{\beta_L}{1-\beta_L}} E_t \sum_{s=0}^{\infty} \left( (1-\delta)\beta \right)^s \left[ e^{\frac{\tau_{it+1+s}}{1-\beta_L} - \frac{\beta_L}{1-\beta_L} \ln W_{it+1+s} + 0.5 \frac{\beta_L \sigma_\epsilon^2}{(1-\beta_L)^2} + \epsilon_{it+1+s}} \right] \tag{19}$$

Since $\tau_{it}$ does not have a simple dynamic progression, the expectation in (19) will have to be numerically computed.

I will consider two cases. In Case A, $a_{it}$ and $\omega_{it}$ evolve according to unrelated autoregressive processes:

$$\omega_{it} = \rho_\omega \omega_{it-1} + \xi^\omega_{it} \tag{20}$$

$$a_{it} = \rho_a a_{it-1} + \xi^a_{it} \tag{21}$$

where $\xi^\omega_{it} \sim N(0, \sigma^2_{\xi^\omega})$ and $\xi^a_{it} \sim N(0, \sigma^2_{\xi^a})$. In Case B, $a_{it}$ is influenced by the previous period shock to raw productivity:

$$\omega_{it} = \rho_\omega \omega_{it-1} + \xi^\omega_{it} \tag{22}$$

$$a_{it} = \rho_a a_{it-1} + \alpha_a \xi^\omega_{it-1} + \xi^a_{it} \tag{23}$$

where $\alpha_a$ measures the responsiveness of $a_{it}$ to the productivity shock.
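The two cases for the exogenous processes can be sketched in Python as follows. This is a minimal illustration, not the paper's simulation code: it covers only equations (20) to (23) and the TFP index $\tau_{it}$, and it omits labor, capital, and wages. The function name and its return layout are hypothetical, and the default parameter values are taken from the calibration reported in Section A.2, including the 90-period burn-in.

```python
import numpy as np


def simulate_processes(n_firms=20_000, n_keep=10, n_burn=90, case="A",
                       rho_omega=0.7, sigma_omega=0.2142,
                       rho_a=0.3, sigma_a=0.7, alpha_a=0.5,
                       beta_a=0.3, gamma=0.2, seed=0):
    """Simulate omega, a, and tau for Case A or Case B (hypothetical helper)."""
    if case not in ("A", "B"):
        raise ValueError("case must be 'A' or 'B'")
    rng = np.random.default_rng(seed)
    n_total = n_burn + n_keep

    # Innovations to raw productivity and to agglomeration.
    xi_omega = rng.normal(0.0, sigma_omega, size=(n_firms, n_total))
    xi_a = rng.normal(0.0, sigma_a, size=(n_firms, n_total))

    # In Case B, agglomeration responds to last period's productivity shock.
    spill = alpha_a if case == "B" else 0.0

    omega = np.zeros((n_firms, n_total))
    a = np.zeros((n_firms, n_total))
    for t in range(1, n_total):
        omega[:, t] = rho_omega * omega[:, t - 1] + xi_omega[:, t]
        a[:, t] = rho_a * a[:, t - 1] + spill * xi_omega[:, t - 1] + xi_a[:, t]

    # TFP index: tau = beta_a * a + omega + gamma * a * omega.
    tau = beta_a * a + omega + gamma * a * omega

    keep = slice(n_burn, None)  # drop the burn-in periods
    return {"omega": omega[:, keep], "a": a[:, keep],
            "tau": tau[:, keep], "xi_omega": xi_omega[:, keep]}


if __name__ == "__main__":
    for case in ("A", "B"):
        sim = simulate_processes(n_firms=2_000, case=case)
        # Correlation between a_t and the lagged productivity shock.
        corr = np.corrcoef(sim["a"][:, 1:].ravel(),
                           sim["xi_omega"][:, :-1].ravel())[0, 1]
        print(f"Case {case}: corr(a_t, xi_omega_(t-1)) = {corr:.3f}")
```

In Case A the printed correlation should be close to zero, while in Case B it should be positive, reflecting the dependence of agglomeration on past productivity shocks in equation (23).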

A.2 Simulation results

I simulate 20,000 firms over 10 periods for 500 times.⁷ The parameters of the model are calibrated similarly to Ackerberg et al. (2015): $\beta_0 = 2$, $\beta_L = 0.6$, $\beta_K = 0.4$, $\beta_a = 0.3$, and $\gamma = 0.2$. The depreciation rate is $\delta = 0.2$, the discount factor is $\beta = 0.95$, the standard deviation of the optimization error in labor is $\sigma_{\xi^l} = 0.37$, the variation in the capital adjustment cost $\phi_i$ is such that $\mathrm{Std}(\phi_i) = 0.6$, the standard deviation of $\epsilon$ is 0.1, and the AR parameters for raw productivity $\omega$ are $\rho_\omega = 0.7$ and $\sigma_{\xi^\omega} = 0.2142$. In both Case A and Case B, $\rho_a = 0.3$ and $\sigma_{\xi^a} = 0.7$, and in Case B $\alpha_a = 0.5$.

Table 5: Monte Carlo simulation results

DGP      Statistic   $\beta_L$   $\beta_K$   $\gamma$
Case A   Mean        0.6000      0.4003      0.2005
         St. Dev     0.0002      0.0028      0.0269
Case B   Mean        0.6000      0.4000      0.1990
         St. Dev     0.0002      0.0032      0.0344
Truth                0.6         0.4         0.2

Notes: The table shows the mean and standard deviation of the parameter estimates obtained simulating 20,000 firms over 10 periods for 500 times.

Table 5 shows the mean and standard deviation of the parameter estimates in the 500 Monte Carlo simulations; $\beta_a$ is not shown because it cannot be identified, as explained in Section 2. In both Case A and Case B, the estimation procedure is able to correctly and precisely estimate the elasticities $\beta_L$ and $\beta_K$, and the agglomeration parameter $\gamma$.

Table 6: $\gamma$ with equation (25)

DGP      Statistic   $\gamma$
Case A   Mean        0.2005
         St. Dev     0.0270
Case B   Mean        0.1989
         St. Dev     0.0344

Notes: See Table 5.

⁷ The firms are simulated for 100 periods and then the first 90 periods are removed as burn-in.

Since $\tau_{it} = \beta_a a_{it} + \omega_{it} + \gamma a_{it}\omega_{it}$, the distribution of TFP, $\tau$, conditional on agglomeration, $a$, depends on the parameters $\beta_a$ and $\gamma$. If $\beta_a$ is different from zero, the conditional distribution will be shifted depending on the level of agglomeration, but its variance will not be influenced. If $\gamma$ is different from zero, both the mean and variance of the conditional distribution of TFP will change depending on agglomeration. The way agglomeration impacts the conditional distribution of TFP can help understand the identification of the two agglomeration parameters. For instance, taking the variance of TFP conditional on agglomeration, I obtain

$$V(\tau_{it} \mid a) = (1 + \gamma a)^2\, V(\omega_{it} \mid a) \tag{24}$$

Equation (24) can be used to obtain $\gamma$ as

$$\gamma = \frac{\mathrm{Std}(\tau_{it} \mid a) - \mathrm{Std}(\omega_{it} \mid a)}{a \times \mathrm{Std}(\omega_{it} \mid a)} \tag{25}$$

Table 6 shows the $\gamma$ estimates obtained with equation (25) after estimating $\tau$ and $\omega$ with the moment condition in (10); the agglomeration parameter $\gamma$ is identified by its role in determining the distribution of TFP conditional on agglomeration. This is not an alternative way to estimate the agglomeration parameters, as the GMM estimation to obtain $\tau$ and $\omega$ also delivers the agglomeration parameters, but I find it useful to understand how the agglomeration parameters are identified within the model.
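The identification argument in equations (24) and (25) can be checked numerically with a short Python sketch. It makes several simplifying assumptions that are not in the paper: it uses the true $\omega$ rather than GMM estimates, it conditions on a few fixed agglomeration levels instead of simulated $a_{it}$, and it draws $\omega$ from the stationary distribution implied by the AR(1) calibration. The function name `gamma_eq25` is hypothetical, and the formula presumes $1 + \gamma a > 0$ and $a \neq 0$.

```python
import numpy as np


def gamma_eq25(tau, omega, a_level):
    """Recover gamma from conditional standard deviations, as in equation (25).

    tau and omega are draws conditional on agglomeration equal to a_level.
    Assumes 1 + gamma * a_level > 0 and a_level != 0.
    """
    std_tau = np.std(tau)
    std_omega = np.std(omega)
    return (std_tau - std_omega) / (a_level * std_omega)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    beta_a, gamma = 0.3, 0.2
    # Stationary standard deviation of omega for rho = 0.7, sigma_xi = 0.2142.
    std_omega = 0.2142 / np.sqrt(1.0 - 0.7 ** 2)

    for a_level in (-1.0, 0.5, 1.0, 2.0):
        omega = rng.normal(0.0, std_omega, size=50_000)
        tau = beta_a * a_level + omega + gamma * a_level * omega
        # beta_a shifts the conditional mean; gamma rescales the conditional spread.
        print(f"a = {a_level:+.1f}: mean(tau) = {tau.mean():+.3f}, "
              f"gamma from (25) = {gamma_eq25(tau, omega, a_level):.4f}")
```

With the true $\omega$ the recovered value equals $\gamma$ up to floating-point error, because $\tau$ is an exact affine function of $\omega$ once $a$ is fixed; with estimated $\tau$ and $\omega$, as in Table 6, sampling error would enter.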
