Business Dynamics in the National Establishment Time Series (NETS)
Abstract
Business microdata have proven useful in a number of fields, but the main sources of comprehensive microdata are subject to significant confidentiality restrictions. A growing number of papers instead use a private data source seeking to cover the universe of U.S. business establishments, the National Establishment Time Series (NETS). Previous research documents the representativeness of NETS in terms of the distribution of employment and establishment counts across industry, geography, and establishment size. But there exists considerable need among researchers for microdata suitable for studying business dynamics: birth, growth, decline, and death. We evaluate NETS in terms of its ability to corroborate key insights from the business dynamics literature, with a particular focus on the behavior of new and young firms. We find that NETS microdata exhibit patterns of business dynamics that are markedly different from official administrative sources, limiting the usefulness of NETS for studying these topics.
Finance and Economics Discussion Series Divisions of Research & Statistics and Monetary Affairs Federal Reserve Board, Washington, D.C. Business Dynamics in the National Establishment Time Series (NETS) Leland D. Crane and Ryan A. Decker 2019-034 Please cite this paper as: Crane, Leland D., and Ryan A. Decker (2019). “Business Dynamics in the National Establishment Time Series (NETS),” Finance and Economics Discussion Series 2019-034. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2019.034. NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.
Leland D. Crane and Ryan A. Decker
May 1, 2019
Both authors are at the Federal Reserve Board of Governors. We thank Joonkyu Choi, Sabrina Howell, Erick Sager, and conference and seminar participants at SGE 2019 and the Federal Reserve Board for helpful comments. We also thank Gray Kimbrough for making his graphing schema available. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors.
1 Introduction

Research based on business microdata has become increasingly important in economics in recent years. Such research can be difficult, however. The most easily available sources of business microdata, such as Compustat, do not cover the universe of private businesses (Compustat only covers publicly traded firms) and may be subject to significant selection problems (Davis et al. (2007)). The availability of comprehensive U.S. business microdata has increased significantly during the last decade due to efforts by statistical agencies; but access to these data is still costly, and prudent (and legally mandated) confidentiality restrictions limit the scope of research that can be conducted with such data. A prominent private sector data source has emerged, however, with nominal coverage of a significant fraction of the universe of U.S. business activity and without onerous publication scope restrictions. The National Establishment Time Series (NETS), a product of Walls & Associates, consists of longitudinally linked Dun & Bradstreet establishment-level data (with firm linkages) on business employment, industry, and location, as well as other variables of potential interest to researchers and policymakers.

In a companion paper, Barnatchez et al. (2017) explore the representativeness of NETS in the cross section, comparing the data to the Census Bureau’s County Business Patterns (CBP) and Nonemployer Statistics (NES) and the Bureau of Labor Statistics’ (BLS) Quarterly Census of Employment and Wages (QCEW). Static distributions of NETS data can be made reasonably comparable to official sources, on average, in terms of establishment size, industry, and geography cells, subject to important limitations arising from divergent counts of small establishments and, to a lesser extent, very large establishments. These differences may be due to imputation in NETS, difficulties with the firm/establishment distinction in the Dun & Bradstreet data, and the mismeasurement of employment at non-employer businesses. Barnatchez et al. (2017) conclude that NETS can be useful to cautious researchers who apply appropriate sample restrictions and investigate questions about static distributions of economic activity.
One finding of Barnatchez et al. (2017), however, is that the NETS data miss three key dynamic developments in the U.S. in recent years: the shale oil and gas boom that began in the mid-2000s, the post-2007 construction contraction associated with the reversal of the housing boom, and the swift decline in manufacturing employment after 2000. The failure of NETS to capture these dynamic industry events hints at potentially broader limitations of NETS along the time series dimension.

In the present paper, we investigate establishment growth and firm lifecycle patterns in NETS, including higher moments of growth distributions, motivated by key insights from the firm dynamics literature. We find that NETS data cannot replicate key empirical patterns of establishment and firm growth documented in comprehensive, official administrative data: the Census Bureau’s Longitudinal Business Database (LBD) and associated public use product, the Business Dynamics Statistics (BDS).¹

A significant, though not sole, reason for these limitations appears to be the prevalence of data imputation in NETS, the effects of which are magnified in a dynamic setting. In 2014, employment is imputed for more than two thirds of establishments with fewer than five employees, while the imputation rate is more than one third for establishments with between five and nine employees. Even larger establishment classes have employment imputation rates close to 10 percent. Imputation is particularly prevalent among young firms: about 90 percent of firms aged zero or one have imputed employment data. Imputation can be particularly consequential in dynamic settings where multiple years of data must be relied upon for a single observation. We find that in 2014, the employment data for 10 percent of firms had been imputed for seven or more consecutive years. Imputation of sales data is even more prevalent, at rates around 80 percent among small firms and 95 percent among large firms. Nearly all of the sales data for establishments of multi-establishment firms are imputed.
¹ We did not access LBD microdata for this project; rather, we rely on published results from the LBD as well as our own analysis of the publicly available BDS. All previously published LBD results we describe have undergone appropriate Census Bureau disclosure avoidance processes to ensure that no confidential information is disclosed.
More broadly, business-level employment data are surprisingly non-volatile in NETS. The distribution of firm employment growth rates is far less dispersed and skewed in NETS than in official data. Young firm growth, which existing literature shows is characterized by substantial dispersion and skewness, is particularly poorly captured in such a setting. NETS appears ill suited for the study of labor market flows, firm entry and exit, and business lifecycle dynamics, though careful use of NETS in case study settings may still be productive.

The paper proceeds as follows. In Section 2, we briefly describe the NETS data, related literature, and our data preparation methods. Given the prevalence of data imputation in NETS, in Section 3 we describe patterns of imputation with a focus on implications for business dynamics measurement. In Section 4 we compare NETS data with official sources in terms of aggregate patterns of firm dynamics, the geographic and industrial composition of firm growth, and the lifecycle behavior of firms. Section 5 is an argument for preferring official data to NETS when discrepancies between the two arise; some readers may prefer to start there. Section 6 concludes.

2 Data background and preparation

2.1 NETS background

Barnatchez et al. (2017) explain NETS data in detail; we refer the interested reader to that paper for more details while we provide a short summary here. For many years, Dun & Bradstreet (D&B) has actively sought to maintain a database of all business establishments in the U.S., which the firm uses in its business of selling marketing and other information. D&B collects these data from state secretaries of state, Yellow Pages, court records, credit inquiries, and direct telephone contact. Each year, D&B provides a snapshot of the establishment cross section to Walls & Associates, which creates longitudinal links and cleans the data for use by researchers and others. The finished data include annual establishment-level information on detailed industry, employment, sales, and other variables, with longitudinal establishment linkages and firm identifiers to link the establishments of multi-unit firms. The Census Bureau’s LBD, widely used by academic and government researchers, is similar in structure and aspiration to NETS, except that NETS seeks to track nonemployer businesses while the LBD is limited to employers (i.e., businesses with at least one paid employee). The NETS product to which we have access covers the years 1990-2014.

Barnatchez et al. (2017) review recent papers using NETS data for a variety of research questions.² Given our focus on business dynamics, here we describe just two key references. First, Neumark et al. (2005) evaluate the California sample of NETS through comparisons to QCEW. The authors recommend dropping establishments with one employee to approximate the employer universe; we adopt a modified (firm-level) version of this rule in our work. Neumark et al. (2005) also highlight the prevalence of imputation in the data and note that frequent imputation causes a low frequency of employment change at the establishment level. Most relevant to our purposes, the authors calculate employment growth at the county-by-industry level and study the correlation of employment growth between NETS and QCEW. Annual employment growth is weakly correlated between the two sources (0.528), so the authors study 3-year employment growth, which shows a correlation of 0.864. These aggregate exercises are useful and suggestive; we differ from Neumark et al. (2005) in focusing on the nationwide NETS sample and on a wider range of measures of business dynamics.
² Some additional examples of recent work, not reviewed there, are Heider and Ljungqvist (2015), Faccio and Hsu (2017), Chava et al. (2018), and Rossi-Hansberg et al. (2018). Cho et al. (2019) match NETS to other establishment-level databases, as well as public use Economic Census files.

Separately, Echeverri-Carroll and Feldman (2017) evaluate the usefulness of NETS for studying entrepreneurship and firm entry by focusing on two specific case studies: the Austin-Round Rock (Texas) metropolitan statistical area and the North Carolina “Research Triangle.” The authors compare NETS to Texas and North Carolina secretary of state (SOS) data (compiled by Guzman and Stern (2016)) and recommend restricting the data as follows: exclude known sole proprietorships (which do not appear in secretary of state data) and firms with nonprofit components, focus on headquarters establishments, and omit single-employee firms (as we do in the present paper and related work). With these restrictions, NETS data match secretary of state data for the two cities reasonably well, though there still exist significant discrepancies, particularly in recent years of data. Importantly, the authors show that successive NETS vintages revise heavily for several recent years, so NETS reliability improves over time yet should be expected to be weak for the most recent years in the data (particularly the most recent four years).

A particularly notable contribution of Echeverri-Carroll and Feldman (2017) is that they match NETS microdata with SOS data for software startups in Austin, a painstaking process with large benefits for our questions here. They first exclude recent years of data to avoid the vintage problems discussed above. They then seek to match about 3,500 NETS firms to the SOS data, first focusing on name and zip code matches, then relaxing to name matches only, using standard name generalization techniques. They successfully match about 40 percent of NETS firms to SOS firms. Among those matched, only 50 percent report the same entry year in NETS as in SOS data. About 75 percent have NETS and SOS entry years within two years of each other, and about 80 percent are within 3 years. The authors discuss reasons for the low match rate, which include missing legal form of organization data in NETS. The implications of this exercise are mixed, but the SOS data provide a degree of validation of NETS and suggest usefulness in limited exercises, particularly in case study settings similar to Echeverri-Carroll and Feldman (2017).
While Echeverri-Carroll and Feldman (2017) focus on carefully matched microdata comparisons within two specific case studies (Austin and the Research Triangle), we focus more broadly on comparisons using known average patterns of firm dynamics across the U.S. We will argue that NETS data are of limited usefulness for studying broad patterns of firm dynamics, leaving the Echeverri-Carroll and Feldman (2017) case study approach as the better (though more tedious and time intensive) use case for NETS.
2.2 LBD background

The Longitudinal Business Database (LBD), housed at the U.S. Census Bureau, covers the near-universe of private nonfarm employer business establishments in the U.S. starting in the mid-1970s. It is constructed from the Census Bureau’s Business Register, the same source data as the County Business Patterns (CBP).³ The ultimate source data for the LBD, first described by Jarmin and Miranda (2002), draw from federal business tax records (both IRS and Social Security Administration), a variety of Census Bureau surveys, and the semidecadal Economic Censuses (conducted in years ending in 2 and 7). Importantly, the source data for the LBD include, by construction, all in-scope employer businesses in the U.S. that are known to tax authorities. The actual data consist of longitudinally linked establishment records with employment (as of March 12 of a given year), detailed industry, location, and other establishment characteristics. Establishment records also include firm identifiers that effectively group establishments under common operational control.

The LBD has become a critical resource for the study of firm dynamics. For example, Davis et al. (2007) first documented multi-decade declines in measures of firm-level employment volatility and gross job flows in the U.S. private sector using the LBD; the authors also linked the LBD to Compustat, a widely used dataset of publicly traded firms, and documented key differences in the behavior of publicly traded and privately held businesses. Haltiwanger et al. (2013) used the LBD to show that the job creation contribution that is widely attributed to small firms is more appropriately attributed to young firms. Decker et al. (2014) described key characteristics of young firms in the LBD, including “up-or-out” dynamics and high growth rate dispersion and skewness. Alon et al. (2018) show that cohort productivity growth declines with age and that high productivity growth of young firms is primarily a selection phenomenon.
A further large literature exploits the LBD for studies of international trade, labor market flows, and a wide range of other topics.

While the LBD has become the primary resource for research on firm dynamics, it is subject to strong confidentiality requirements and is therefore only accessible to sworn researchers with approved projects working in the Census Bureau or a Federal Statistical Research Data Center (FSRDC). Researchers using the LBD in FSRDCs must carefully follow rules to comply with federal law and prudent confidentiality concerns, and publishing results from statistical work on the LBD requires a lengthy process for disclosure avoidance. The process is generally costly and time consuming. Given the importance of the data, therefore, the Census Bureau publishes the publicly available Business Dynamics Statistics (BDS), which consists of aggregates of the LBD designed to track business dynamics at the level of sectors, firm age and size groups, and establishment locations. Research using the BDS has made considerable contributions to the literature. However, there are many questions that cannot be answered with the BDS, particularly questions about higher moments of the firm distribution and firm dynamics, that require microdata.

The limitations of the BDS and the tradeoffs involved with LBD access and use create demand for a public use source of business microdata like NETS. It is therefore important that researchers understand the strengths and limitations of NETS. The main purpose of this paper is to compare NETS with the LBD, with the latter serving as the benchmark against which any employer business microdata should be judged given its well-defined and near-universal scope and its wide use in the literature (see Section 5 for more discussion of official versus private data sources). We do not present any original results from LBD microdata; rather, we compare our original NETS calculations to existing LBD and BDS calculations from the literature.

³ Barnatchez et al. (2017) describe features of the Business Register that are relevant for comparisons with NETS. DeSalvo et al. (2016) describe the Business Register in exhaustive detail.

2.3 NETS data preparation

2.3.1 Sample restriction

We first implement sample restrictions described in detail by Barnatchez et al. (2017). First, since NETS aspires to include both the nonemployer and employer universe, and since coverage beyond the employer universe is evident in the data, we restrict the sample to our best
guess of the employer universe by subtracting one employee from the employment of each firm headquarters establishment, then dropping establishments with zero (post-subtraction) employment. This is a modified version of the sample restriction recommended by Neumark et al. (2005) and follows from the notion that owners are likely to be counted as employees in NETS though they may not be in official sources, where employment has a strict definition based on paycheck issuance. We restrict NETS to the employer universe to be comparable with the datasets to which we will make comparisons (the LBD and the BDS), which are both employer datasets. We then restrict the NETS sample to match the industry scope of the LBD and BDS, which is the same as the scope of CBP (see Barnatchez et al. (2017) for a specific industry scope list).

2.3.2 Establishment identifiers

Studying business dynamics is more complicated than studying cross-sectional snapshots of microdata. In particular, questions of business dynamics require careful attention to longitudinal linkages of business identifiers. Data problems (such as broken linkages) and real-world events like mergers and acquisitions generate challenges to longitudinal concepts and require researchers to make judgments. Given our goal of assessing the NETS data relative to official data, we attempt to treat the NETS data in a way that makes them most comparable to the LBD and the empirical firm dynamics literature based on the LBD.

The basic unit of observation in NETS is the dunsnumber. D&B views the dunsnumber as a line of business; but with respect to official sources, it is most similar to the concept of an establishment. In the LBD, an establishment is a single business operating location (identified by lbdnum in the LBD). In NETS, though, a single business operating location can have multiple dunsnumbers. This can be the case, for example, when the production operations and sales operations of a business are co-located but counted separately by D&B. Barnatchez et al. (2017) aggregate dunsnumbers to the establishment level to be consistent with CBP and QCEW establishment definitions; to do this, they identify dunsnumbers that have the same reported firm headquarters (hqduns), 5-digit zip code, and first five street address characters (i.e., roughly speaking, same street and building number). This approach is designed to identify lines of business operating in the same location and falling under the same firm. They then sum the employment of the matched lines of business and assign the merged establishment a new identifier (termed the locduns) and the industry code of the largest line of business (in terms of employment). Since establishments in official data are assigned industry codes to reflect their principal activity, this method of merging D&B lines of business should roughly approximate the official concept. In practice, the line of business vs. establishment distinction seems to matter mostly for a small number of headquarters establishments.

We follow the approach of Barnatchez et al. (2017) for constructing establishment microdata, but we introduce additional procedures for ensuring the longitudinal integrity of the resulting merged locduns establishment identifiers. A naive application of the Barnatchez et al. (2017) method could result in spurious changes in locduns establishment identifiers that reflect changes in the composition of establishment employment rather than the death of one establishment and birth of another. We first identify a locduns establishment as a continuer (i.e., not a birth or death) if there is a year-to-year overlap in at least one original line-of-business dunsnumber; that is, if a locduns disappears from the data we only assume the establishment has exited if all its associated line-of-business dunsnumbers cease to exist. We create a new identifier, the netsnum, that does not change from year to year even if a merged establishment’s locduns changes due to changing employment composition of lines of business.
In the (rare) case that lines of business that exist in the same location but belong to different firms (i.e., have different hqduns) in year t−1 move into the same firm (i.e., take on the same hqduns) in year t, we assign the year t combined entity the netsnum of the year t−1 locduns establishment that contributed the most employment (in terms of lines of business) to the new entity.⁴

⁴ This is a rare occurrence because it suggests that two separate firms with establishments in the same building engaged in a merger or acquisition.
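The sample restriction and the aggregation of lines of business into establishments described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the record layout (dictionaries with keys `dunsnumber`, `hqduns`, `zip5`, `address`, `emp`, `industry`) is hypothetical, a headquarters record is assumed to be one whose dunsnumber equals its own hqduns, and the restriction is assumed to be applied before aggregation.

```python
from collections import defaultdict


def restrict_to_employers(lines):
    """Subtract one employee from each headquarters record, then drop
    records left with zero employment (treated as nonemployers)."""
    kept = []
    for rec in lines:
        # Assumption: a headquarters record reports itself as its own hqduns.
        is_hq = rec["dunsnumber"] == rec["hqduns"]
        emp = rec["emp"] - 1 if is_hq else rec["emp"]
        if emp > 0:
            kept.append({**rec, "emp": emp})
    return kept


def aggregate_to_establishments(lines):
    """Merge lines of business that share hqduns, 5-digit zip code, and the
    first five street-address characters into a single establishment."""
    groups = defaultdict(list)
    for rec in lines:
        key = (rec["hqduns"], rec["zip5"], rec["address"][:5])
        groups[key].append(rec)

    establishments = []
    for (hqduns, zip5, _), members in groups.items():
        largest = max(members, key=lambda r: r["emp"])
        establishments.append({
            "hqduns": hqduns,
            "zip5": zip5,
            "emp": sum(r["emp"] for r in members),
            # Industry code of the largest line of business by employment.
            "industry": largest["industry"],
            "dunsnumbers": frozenset(r["dunsnumber"] for r in members),
        })
    return establishments


def is_continuer(prev_est, curr_est):
    """An establishment continues from one year to the next if at least one
    original line-of-business dunsnumber appears in both years."""
    return bool(prev_est["dunsnumbers"] & curr_est["dunsnumbers"])
```

Keeping the set of underlying dunsnumbers on each merged establishment is what makes the continuer check possible; a stable identifier in the spirit of the netsnum could then be carried forward whenever `is_continuer` holds.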
The resulting netsnum is a longitudinal identifier that is close in concept and spirit to the longitudinal establishment identifier in the LBD (lbdnum). We next focus on longitudinal firm identifiers.

2.3.3 Firm identifiers

A firm is a collection of establishments. The LBD defines the firm based on common operational control. The NETS firm concept is based on a common headquarters establishment (hqduns), where the hqduns refers to the dunsnumber of the headquarters establishment. NETS apparently allows for multiple levels of headquarters (perhaps including both regional and national headquarters) because we observe some cases in which an establishment record has a dunsnumber that is equal to other establishments’ hqduns, but that itself refers to a different hqduns.⁵ That is, there are cases in which an establishment appears to be a headquarters for other establishments but does not refer to itself as its own headquarters. We attempt to unite all establishments that are related through headquarters, either directly or indirectly, under a single firm identifier by “rolling up” hqduns identifiers. That is, we assign all related establishments the hqduns of the highest level headquarters, which necessarily reports itself as its own hqduns (or, in rare cases, reports an hqduns that does not appear as a dunsnumber anywhere else).⁶

The firm identifier setup in NETS also presents the longitudinal challenge of determining which groups of establishments are successors to each other over time. As with the LBD’s firm identifier (firmid), the hqduns can change for many reasons, including merger and acquisition activity but also simple data problems. Unlike in the LBD, in NETS the firm identifier automatically changes if the firm moves its headquarters from one establishment to another. We reassign hqduns firm identifiers as follows. For a given firm x in year t−1, we identify all firms in year t that control at least some of firm x’s t−1 establishments. We select x’s candidate successor as the firm which controls the plurality of employment at these continuing establishments. Very often, this firm has the same hqduns and essentially the same establishments as x, and there is no ambiguity. But when a firm “fractures” into several separate entities, it is sensible to match the source firm to the largest continuing fragment.

One more step is necessary to have consistent firm linkages. According to the rule above, it is possible for a single period t firm to be the candidate successor for two distinct period t−1 firms. For example, a firm z in year t could include the largest continuing fragments of both firms x and y from year t−1. This would be the case for an acquisition or a merger. In such a case we assume that z is the successor to whichever of x and y accounts for the largest share of employment at the new firm. The successor firm is assigned the same hqduns number as the source firm. Firms which lack a successor are assumed to have ceased to exist. This process is repeated year by year for the whole sample. This treatment of mergers has a number of limitations, though LBD firm identifiers are also not immune to merger problems and we accordingly follow best practice from the literature when we define firm age and growth rates.

We construct firm age to be consistent with the widely used convention from the literature (e.g., Haltiwanger et al. (2013)). At the first appearance of a new firm identifier (hqduns) in the data, we assign the firm the age of its oldest establishment (where establishment age is given by years since the first appearance of the establishment’s netsnum, which is described above). Thereafter, the firm ages naturally. This approach abstracts from problems associated with spurious changes in the firm’s headquarters identifier and is consistent with the convention used in the LBD-based literature to which we will compare NETS data.

⁵ There is some discussion of this in NETS marketing materials.
⁶ In extremely rare cases, we observe headquarters linkages that are “cyclical”; for example, dunsnumber A reports dunsnumber B as its headquarters, while B reports A as its headquarters. In those cases, we arbitrarily assign an hqduns to apply to all related establishments.

2.3.4 Growth rate concepts

In various places below we report statistics based on firm or establishment employment growth rates. We employ the widely used growth rate concept of Davis et al.
(1996) (the so-called “DHS growth rate”). Let $E_{e,t}$ be employment in year $t$ for establishment $e$. Then the establishment-level DHS growth rate is given by:

$$g_{e,t} = \frac{E_{e,t} - E_{e,t-1}}{0.5 \cdot (E_{e,t} + E_{e,t-1})}. \tag{1}$$

The DHS growth rate differs from standard growth rates by using average two-year employment in the denominator instead of simply employment in year $t-1$. This growth rate concept has been widely used in the literature because it can accommodate entry (in which case $E_{e,t-1} = 0$, $E_{e,t} > 0$, and $g_{e,t} = 2$) and exit ($E_{e,t-1} > 0$, $E_{e,t} = 0$, and $g_{e,t} = -2$).

While calculating establishment-level DHS growth rates is straightforward, calculating firm-level growth rates is more complicated due to the possibilities of mergers, acquisitions, and divestitures, which can generate changes in firm-level employment that do not reflect “organic” growth. Following Haltiwanger et al. (2013) and related literature, we focus on an “organic” growth concept that abstracts from such reorganizations. The firm-level organic growth rate for firm $J$ is given by:

$$g^{f}_{J,t} = \frac{\sum_{e \in J} (E_{e,t} - E_{e,t-1})}{\sum_{e \in J} 0.5 \cdot (E_{e,t} + E_{e,t-1})}. \tag{2}$$

The summation subscript $e \in J$ limits the set of establishments being included to those that belong to firm $J$ in year $t$, regardless of whether they belonged to firm $J$ in year $t-1$. That is, the firm growth rate is constructed as if all of the firm’s establishments in year $t$ belonged to the firm in year $t-1$ (even if they did not in reality belong to firm $J$ in year $t-1$), and any establishments controlled by firm $J$ in year $t-1$ that were divested to a different firm between $t-1$ and $t$ do not affect the growth rate of firm $J$. Establishments controlled by firm $J$ in year $t-1$ that exited (i.e., failed or closed) between $t-1$ and $t$ do contribute to measured growth, with $E_{e,t-1} > 0$ and $E_{e,t} = 0$ as mentioned above.⁷

⁷ It is straightforward to show that $g^{f}_{J,t}$ is equivalent to the employment-weighted average of the growth rates of all of the firm’s year-$t$ establishments (and exiters), where the employment weight is defined as average two-year employment as in the DHS denominator above.
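The two growth rate concepts above translate directly into code. The following Python sketch is illustrative only: the function names and the representation of a firm as a list of (previous employment, current employment) pairs are assumptions made for this example. The list is meant to hold the firm's year-t establishments plus its own exiters, with divested establishments left out, as in the text.

```python
def dhs_growth(emp_prev, emp_curr):
    """Establishment-level DHS growth rate, as in equation (1)."""
    denom = 0.5 * (emp_curr + emp_prev)
    if denom == 0:
        raise ValueError("undefined when employment is zero in both years")
    return (emp_curr - emp_prev) / denom


def firm_organic_growth(pairs):
    """Firm-level organic growth rate, as in equation (2).

    `pairs` holds one (emp_prev, emp_curr) tuple per establishment that
    belongs to the firm in year t, plus the firm's own exiters.
    """
    pairs = list(pairs)
    numer = sum(curr - prev for prev, curr in pairs)
    denom = sum(0.5 * (curr + prev) for prev, curr in pairs)
    if denom == 0:
        raise ValueError("undefined when the firm has no employment in either year")
    return numer / denom


# A firm with one continuing establishment, one entrant, and one exiter.
example_firm = [(10, 14), (0, 4), (6, 0)]
example_growth = firm_organic_growth(example_firm)  # 2 / 17
```

Entrants and exiters take the boundary values 2 and −2 at the establishment level, and the firm-level rate equals the average of the establishment rates weighted by average two-year employment, which is the equivalence noted in the footnote to equation (2).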
3 Imputation

3.1 Employment imputation

NETS data include an imputation flag (empc) that takes on four possible values: (0) actual figure, (1) bottom of range, (2) D&B estimate, and (3) Walls & Associates estimate. The first two categories (empc ∈ {0,1}) indicate values reported to D&B by survey respondents, with the “bottom of range” category indicating that the respondent reported a range rather than a specific count. D&B uses proprietary cross-sectional imputation methods in cases of non-reporters (empc = 2). Walls & Associates estimates employment for all non-reporters in each year using a longitudinal imputation method; in cases where this longitudinally imputed estimate differs from the D&B cross-sectionally imputed estimate, Walls & Associates inserts their own estimate and sets empc = 3. The Walls & Associates method uses regressions based on the time series of sales and employment for the establishment and its industry.⁸ We consider all values of empc except empc = 0 to be imputed, where the imputation can be done by the respondent (empc = 1), D&B (empc = 2), or Walls & Associates (empc = 3).

Barnatchez et al. (2017) report establishment imputation rates by establishment size. Imputation is prevalent, particularly among small establishments. For the year 2014, establishments without exact reported employment values comprise more than two thirds of establishments with fewer than 5 employees, more than one third of establishments with 5 to 9 employees, and more than 15 percent of establishments with 10 to 19 employees. Imputation rates are around 10 percent for all remaining size classes except establishments with 1000 or more employees, of which about 15 percent lack exact employment values. Barnatchez et al. (2017) conclude that the imputation problem is nontrivial but can be managed by omitting small establishments, which is also where NETS differs most markedly from official data.
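As a rough illustration of the imputation definition used here, the sketch below flags an establishment as imputed whenever empc is not 0 and tabulates imputation rates by employment size class. It is a minimal example, not the procedure used to produce the reported rates: the input format (pairs of employment and empc) is hypothetical, and the default size-class edges use only the cutoffs mentioned in the text (5, 10, 20, and 1000), so every establishment with 20 to 999 employees falls into a single class.

```python
from bisect import bisect_right
from collections import Counter

# empc codes as described in the text.
EMPC_LABELS = {
    0: "actual figure",
    1: "bottom of range",
    2: "D&B estimate",
    3: "Walls & Associates estimate",
}


def is_imputed(empc):
    """Treat every empc value except 0 as imputed."""
    if empc not in EMPC_LABELS:
        raise ValueError(f"unknown empc code: {empc}")
    return empc != 0


def imputation_rates_by_size(records, edges=(5, 10, 20, 1000)):
    """Share of imputed establishments in each size class.

    `records` is an iterable of (employment, empc) pairs. Class 0 is
    employment below edges[0], class 1 is edges[0] up to edges[1] - 1,
    and so on; the last class is employment at or above edges[-1].
    """
    totals, imputed = Counter(), Counter()
    for emp, empc in records:
        size_class = bisect_right(edges, emp)
        totals[size_class] += 1
        imputed[size_class] += is_imputed(empc)
    return {cls: imputed[cls] / totals[cls] for cls in totals}
```

Passing a finer tuple of `edges` would reproduce a more detailed size breakdown.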
That said, we also find evidence that data reporters implicitly impute some data by rounding their reported employment figures, leading to a potential understatement of true imputation rates in NETS. Figure 1 reports the distribution of last digits of reported employment numbers among non-imputed (empc = 0) Walmart establishments in the year 2000.⁹ In that year, 88 percent of Walmart establishments’ employment data are reported as not being imputed; that is, they are coded with empc = 0. Yet we observe overwhelming prevalence of employment figures ending in 0 or 5, suggesting that respondents are rounding. This kind of rounding by respondents is a well-known issue in the survey methodology literature. We see more reasonable last-digit distributions among establishments generally, yet within this large firm there appears to be significant rounding. This kind of rounding may have little cost in static or cross-sectional settings, but below we make the case that the cost is higher in dynamic research.

[Figure 1: Final employment digit, non-imputed Walmart establishments. Vertical axis: percent of establishments; horizontal axis: final digit (0 to 9). Note: Year 2000. 88 percent of Walmart establishments are classified as non-imputed.]

⁸ NETS imputation details are provided with NETS marketing materials, Understanding Data in the NETS Database (2009).
⁹ Our graphing schemes are based on Kimbrough (2018).

Given our focus on firm dynamics, we also explore firm imputation rates. Figure 2 reports the share of firms with imputed employment data, where the presence of any imputed
establishments within a firm results in the firm counting as imputed (and establishments countasimputedifempc (cid:54)= 0). Thesolidbluelinereportstheshareoffirmsthatcountasimputed,whilethedashedgreenlinereportstheemployment-weightedimputationrate(that is, the total employment—imputed or not—of imputed firms divided by total NETS employment). Inearlyyears,abouthalfofNETSfirmsareimputed,butthissharerisesabove twothirdsbytheendofthesample. Weightedimputationrates—theshareofemployment that is at imputed firms—are more steady, suggesting that the recent rise in unweighted imputationisprimarilydrivenbysmallerfirms. 100 90 80 70 Unweighted 60 Weighted 50 40 30 20 10 0 tnecreP Firm employment imputation rates, NETS 1992 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 Imputed firms are those with any imputed establishments. Weighted rate is share of employment imputed; unweighted rate is share of firms imputed. Figure2 Figure3usesa morerestrictive(andNETS-friendly)definitionof“imputed”: we count firmsasimputedifandonlyifatleast10percentoftheiremploymentisatimputedestablishments. This has no visible effect on the unweighted imputation rate, but the weighted rate moves down. The relaxation of the imputation definition has little effect on the un- 15
100 90 80 70 Unweighted 60 50 Weighted 40 30 20 10 0 tnecreP Firm employment imputation rates, NETS (10% cutoff) 1992 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 Imputed firms are those with at least 10% of employment at imputed establishments. Weighted rate is share of employment imputed; unweighted rate is share of firms imputed. Figure3 weighted imputation rate because it primarily affects a relatively small number of large firms. Figure 4 further relaxes the imputation standard, defining as imputed only those firms with at least half of their employment at imputed establishments. Figure 5 goes to the extreme, counting as imputed only firms with 90 percent or more of their employment at imputed establishments. Even with this excessively liberal definition, about one fifth of employmentisatimputedfirms. Firm imputation is therefore nontrivial. Imputation may cause only limited problems for cross-sectional studies, but there are several reasons imputation is much more costly in research on dynamics. First, the longitudinal imputation method of Walls & Associates necessarily uses data on the establishment time series, implicitly assuming that past and futurebehaviorisindicativeofpresentbehaviorandtherebydampeningdynamicvolatility. Moreover, Walls & Associates rely on industry and other data that may serve to minimize 16
100 90 80 70 Unweighted 60 50 40 30 Weighted 20 10 0 tnecreP Firm employment imputation rates, NETS (50% cutoff) 1992 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 Imputed firms are those with at least 50% of employment at imputed establishments. Weighted rate is share of employment imputed; unweighted rate is share of firms imputed. Figure4 the dispersion of measured outcomes. Second, measures of dynamics depend on multiple consecutive data observations such that imputation is magnified. Concretely, employment growthfromyeart−1toyeartdependsonemploymentlevelsinyearst−1andt;ifeither year’s employment value is imputed, the overall employment growth value is necessarily imputed. Third, in the case of firm (rather than establishment) dynamics, imputation of any establishments within a multi-unit firm implies that the overall firm employment value is necessarily imputed. We find that this problem is particularly salient among firms with manyestablishments. A striking way to see the longitudinal costs of imputation is to consider imputation spells. Wedefinetheimputationspellasthenumberofconsecutiveyearsthatafirmcounts as imputed. For example, suppose a firm first counts as imputed in 1995. Then in 1995, the firm’s imputation spell is equal to 1. If the firm is again imputed in 1996, then in that 17
100 90 80 70 Unweighted 60 50 40 30 Weighted 20 10 0 tnecreP Firm employment imputation rates, NETS (90% cutoff) 1992 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 Imputed firms are those with at least 90% of employment at imputed establishments. Weighted rate is share of employment imputed; unweighted rate is share of firms imputed. Figure5 yearitsimputationspellisequalto2. Ifthefirmisnotimputedin1997,thenitsimputation spell in that year resets to 0. Figure 6 characterizes the distribution of imputation spells, wherewecountafirmasimputedifanyofitsestablishmentsareimputed. Thesolidgreen line (the highest line) reports the 90th percentile imputation spell. For example, in 1998, the 90th percentile firm had an imputation spell of 8, meaning that 10 percent of firms had beenimputedfor8ormoreconsecutiveyears. Themedianfirmhadanimputationspellof zero for most years in the sample, but by the end of the sample the median had risen to 2 years. Figure 7 reports the same exercise but restricts the sample to imputed firms in any givenyear;thatis,thefigurereportsthedistributionofimputationspellsconditionalonfirms being imputed, rather than including non-imputed firms. Among imputed firms, even the 25thpercentilereflectsmultipleconsecutiveyearsofimputationinmanyyears,themedian firm bounces between 2-year and 4-year imputation spells, and the 75th percentile shows 18
24 22 20 18 16 14 12 10 8 6 4 2 0 sraeY Length of imputation spell, unconditional distribution (unweighted) 90th percentile 75th percentile Median 25th percentile 1992 1994 1996 1998 2000 2002 2004 2006 2008 2010 2012 2014 Imputed firms are those with any imputed establishments Unweighted percentiles among all observations. Figure6 imputationspellsbetween4and6years. The problem of consecutive imputation is particularly pronounced amount large firms. Figure 8 reports the employment-weighted distribution of imputation spells, again including all firms (i.e., not just conditional on imputation). The 90th percentile of the weighted distribution has the maximum possible imputation spell throughout most of the sample (i.e.,aspellofimputationbeginningattheoriginofthesample),asdoesthe75thpercentile. Thismeansthat25percentofoverallemploymentisatfirmsthathavebeenimputedforthe maximum possible number of consecutive years. Figure 9 reports the same weighted, unconditional distribution with a more relaxed definition of firm imputation in which a firm counts as imputed if 25 percent or more of its employment is at imputed establishments. Thepictureimprovessomehere,yetstill10percentofemploymentisatfirmswith7years ormoreofconsecutiveimputation. 19
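The firm-level imputation flag and the imputation spell defined above can be sketched as follows. This is a minimal illustration rather than the code behind Figures 2 through 9: the function names and input layouts are hypothetical, and the treatment of gaps in a firm's history (a missing year restarts the count) is an assumption that the text does not address.

```python
def firm_is_imputed(establishments, cutoff=None):
    """Flag one firm-year as imputed.

    establishments: list of (employment, empc) pairs for one firm in one year.
    cutoff=None: imputed if any establishment has empc != 0.
    cutoff=0.10: imputed if at least 10 percent of firm employment is at
    establishments with empc != 0 (other cutoffs work the same way).
    """
    if cutoff is None:
        return any(empc != 0 for _, empc in establishments)
    total = sum(emp for emp, _ in establishments)
    imputed = sum(emp for emp, empc in establishments if empc != 0)
    return total > 0 and imputed / total >= cutoff


def imputation_spells(imputed_by_year):
    """Running count of consecutive imputed years for one firm.

    imputed_by_year: dict mapping year -> bool (True if the firm counts as imputed).
    Assumption: a year missing from the dict breaks the run.
    """
    spells = {}
    run = 0
    previous_year = None
    for year in sorted(imputed_by_year):
        if previous_year is not None and year != previous_year + 1:
            run = 0  # gap in the firm's history
        run = run + 1 if imputed_by_year[year] else 0
        spells[year] = run
        previous_year = year
    return spells


# The example from the text: imputed in 1995 and 1996, not imputed in 1997.
example = imputation_spells({1995: True, 1996: True, 1997: False})
```

Percentiles of the resulting spell lengths, computed across firms within each year (optionally weighting by employment), would then give distributions of the kind the figures summarize.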
[Figure 7: Length of imputation spell, conditional distribution (unweighted). Series: 90th percentile, 75th percentile, median, 25th percentile. Horizontal axis: year (1992 to 2014); vertical axis: years. Imputed firms are those with any imputed establishments. Unweighted percentiles among imputed firms.]

[Figure 8: Length of imputation spell, unconditional distribution (weighted). Series: 90th percentile, 75th percentile, median, 25th percentile. Horizontal axis: year (1992 to 2014); vertical axis: years. Imputed firms are those with any imputed establishments. Employment-weighted percentiles among all observations.]

[Figure 9: Length of imputation spell, unconditional distribution (weighted). Series: 90th percentile, 75th percentile, median, 25th percentile. Horizontal axis: year (1992 to 2014); vertical axis: years. Imputed firms are those with at least 25% of employment at imputed establishments. Employment-weighted percentiles among all observations.]

Needless to say, the longitudinal integrity of data in which substantial shares of activity reflect firms whose data have been imputed for multiple consecutive years is limited.

We develop one other imputation measure to track longitudinal imputation on a year-to-year basis. For the rest of the paper, we define a firm as being longitudinally imputed in year t if it counts as imputed in either year t or year t−1. This definition is highly relevant when studying year-to-year firm-level growth or dynamics; as noted above, in a dynamic setting imputation binds in two consecutive years even if only one of the years has imputed data. We find that longitudinal imputation varies by firm age in critical ways. For example, Figure 10 reports longitudinal imputation rates by firm age for two snapshot years, 2003 and 2014. As noted elsewhere and in existing literature, the most recent year of data sees particularly acute imputation problems. But even in revised data, imputation is extremely prevalent among young firms, with rates above two thirds prior to age 3. These high imputation rates among young firms will prove to be problematic in the exercises below.

[Figure 10: Longitudinal imputation rates by firm age. Series: 2003, 2014. Horizontal axis: firm age (0 to 11+); vertical axis: percent. Share of firms with any imputed establishment employment in year t or t-1.]

3.2 Sales imputation

Recent literature in firm dynamics relates firm employment growth to firm productivity (Decker et al. (2018), Alon et al. (2018)) by calculating real sales per worker at the firm level.[10] While we focus primarily on employment dynamics in this paper, it is useful to briefly mention sales imputation.

The sales variable in NETS is somewhat more complicated than the employment variable.[11] While a respondent may be able to report current point-in-time employment to D&B surveyors at any time, a respondent is not likely to know current-year sales at the time of contact. NETS documentation suggests that respondents are likely to report some combination of the prior year’s sales and an estimate of the current year’s sales. Moreover, establishment-level sales is a more complicated object than firm-level sales (indeed, Census Bureau researchers who bring sales data from the Business Register to the LBD study sales at the firm level). Therefore, reported establishment sales figures are estimates at best. For non-reported sales figures, D&B and Walls & Associates rely on imputation methods that are similar to those used for employment (described above), including reliance on firm or industry sales/employment ratios, with some additions. In particular, for multi-establishment firms, when firm-level sales are known (e.g., in the case of publicly traded firms), sales are allocated among establishments using employment shares. Note that sales figures are attributed even to establishments that do not sell products or services but instead produce inputs used by other establishments within the same firm; in such cases, the establishment sales data provide no marginal information beyond the establishment employment data.

[10] This literature follows the construction of the Revenue-Enhanced LBD (RE-LBD) by Haltiwanger et al. (2017), which linked firm revenue data from the Census Bureau’s Business Register to the LBD.

[11] This paragraph draws heavily on Walls (2008).

NETS does include an imputation flag for the sales variable, salesc, with the same coding as the employment imputation variable (empc described above). That is, salesc can take on the following values: 0 (actual figure or estimate provided by respondent), 1 (bottom of range reported by respondent), 2 (D&B estimate), or 3 (Walls & Associates estimate).[12] Imputation is common. In both the years 2000 and 2014, just under 20 percent of establishments report salesc = 0, indicating that the sales figure is a true reported value or respondent estimate. This likely overstates the accuracy of the figures, however, for the reasons above: even reported sales figures may be rough estimates. In any case, these establishments account for only about 10 percent of total (imputed and non-imputed) employment and total sales (imputed and non-imputed) in both years, indicating that imputation is more common among larger establishments. Remaining establishments are imputed, almost entirely reflecting imputation by Walls & Associates (salesc = 3).

[12] We also observe an extremely small number of establishments with missing sales data and sales imputation flags.

Firm size (employees)    2000    2014
1 to 4                     80      80
5 to 9                     78      85
10 to 19                   77      82
20 to 49                   79      84
50 to 99                   85      88
100 to 249                 89      91
250 to 499                 93      94
500 to 999                 94      94
1,000 to 2,499             93      93
2,500 to 4,999             95      92
5,000 to 9,999             95      94
10,000+                    96      94

Source: NETS
Notes: Percent of firms with imputed establishment sales data.
Table 1: Establishment sales imputation rates

Sales imputation varies widely by firm size. Table 1 reports establishment imputation rates by firm size bins for the years 2000 and 2014. Small firms have establishment imputation rates around 80 percent, while around 95 percent of establishments of large firms have imputed data. The high imputation rates among large firms appear to be driven by firms with multiple establishments; in results not shown, we find that close to 95 percent of establishments of multi-establishment firms have imputed sales data, compared with about 80 percent among single-establishment firms. The interpretation of these imputation rates is not entirely clear. For example, there may be cases (particularly among publicly traded firms) where D&B receives accurate firm-level sales data, but establishment-level sales data must be imputed. Since our NETS data do not provide firm-level sales information, if we require firm sales figures we must construct them by summing across establishments within firms. So it is possible that the imputation rates we report for establishments of large firms overstate the rate of imputation of firm-level sales among large firms; that is, there may be cases where a firm’s establishments have sales data imputed from total firm sales such that summing across establishments results in true firm sales figures. However, the number of firms for which D&B receives true sales data is probably small (for example, there are fewer than 5,000 publicly traded firms in the U.S.), so if there is some overstatement, it is likely to be minimal. Moreover, the research for which establishment-level microdata like NETS would be most useful is likely to require geographic breakdowns of activity, in which case establishment imputation is the most relevant. In any case, establishment imputation rates are high across the firm size distribution, even among small firms that are likely to have only one establishment.

Sales data would be particularly useful for the study of productivity; however, we find large discrepancies between NETS and official data on this topic. For example, using the LBD, Decker et al. (2018) find that the within-industry dispersion (standard deviation) of sales per worker has risen in recent decades; in NETS, we find the opposite pattern. Moreover, the level of labor productivity dispersion is much lower in NETS than in the LBD, likely owing to the industry average rules of thumb used for NETS sales imputation. For example, Decker et al. (2018) find that among young (age less than five) high-tech firms, a firm that is one standard deviation more productive than its corresponding industry-by-year mean is about 2.5 times as productive as the mean in 1996 (the first year in which LBD sales data are available) and 3.0 times as productive in 2012. In NETS, this ratio is about 1.8 in 1996 and 1.7 in 2012.

The prevalence of sales imputation, which is more common than employment imputation, and the reliance of the imputation methods on employment data imply that the marginal value of the sales data is very low.
Moreover, popular business dynamics topics such as productivity dispersion, decompositions of aggregate productivity growth, or the relationship between business-level productivity and growth cannot be studied with NETS.
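To see why sales allocated by employment shares add nothing beyond the employment data, consider the following sketch. It assumes a pure proportional allocation rule, which is the simplest reading of the description in Section 3.2; the actual D&B and Walls & Associates procedures include additions that are not specified here. The function name, establishment names, and numbers are hypothetical.

```python
def allocate_firm_sales(firm_sales, employment_by_establishment):
    """Split known firm-level sales across establishments in proportion to employment."""
    total_employment = sum(employment_by_establishment.values())
    if total_employment <= 0:
        raise ValueError("firm must have positive total employment")
    return {
        name: firm_sales * employment / total_employment
        for name, employment in employment_by_establishment.items()
    }


# Made-up firm with two establishments and known firm-level sales of 3,000.
employment = {"plant_a": 5, "plant_b": 10}
sales = allocate_firm_sales(3000.0, employment)

# Sales per worker is identical at every establishment by construction.
sales_per_worker = {name: sales[name] / employment[name] for name in employment}
```

Under this rule every establishment of a firm has the same sales per worker, so within-firm productivity dispersion is zero by construction, while summing the allocated values across establishments recovers the firm total.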
4 Business dynamics in NETS and official data

4.1 Aggregate patterns

We first characterize the NETS data in terms of aggregate measures that are well known in the business dynamics literature. Figure 11 reports the share of firms that have age zero (often referred to as the startup rate or entry rate). The dashed green line reports the entry rate from NETS, while the blue solid line reports the entry rate from the BDS. The NETS entry rate is more volatile than the official data, though in many years the NETS rate bounces around the BDS rate. NETS sees an excess surge in entry in 2002 then again in 2011, consistent with the finding of Barnatchez et al. (2017) that NETS sees its establishment count surge above the levels of official data after 2000, which we believe likely reflects an expansion of D&B scope or coverage rather than true entry.

[Figure 11: Entrants as share of all firms, NETS vs. BDS. Series: BDS, NETS. Horizontal axis: year (1992 to 2014); vertical axis: percent. Entrants are employer firms with age zero.]
Figure 12 broadens our study of young firm behavior to include all firms of age less than five, a cutoff that is common in the literature. Here we report the young firm employment share. The surge in new businesses appearing in NETS but not in the official data is readily apparent here, with a divergence starting in 2008 and the cumulative effects of differing coverage becoming notable by the end of the sample. Importantly, the well-documented decline in young firm formation and activity described in a large and growing literature (Decker et al. (2014)) is reversed in NETS data due to this late-2000s divergence. While official data show young firm shares moving below 10 percent by 2010, young firm shares in NETS exceed 16 percent in 2012 and 2013, a level not seen in official data since the 1980s. In short, while a large and growing literature explores the puzzling decline of young firm activity in official data, NETS data tell the opposite story. We believe this is likely due to spurious measured entry brought on by an apparent scope expansion.[13] Barnatchez et al. (2017) plot total employment in the NETS employer universe against total employment in County Business Patterns (see their Figure 1); the shape of the gap between total NETS employment and total CBP employment documented by Barnatchez et al. (2017) closely mimics the shape of the gap between NETS young firm shares and BDS young firm shares shown on Figure 12.

We next study patterns of gross job flows; first, we define “job creation” as the number of jobs created by entering or expanding establishments, and we define “job destruction” as the number of jobs destroyed by exiting and downsizing establishments (these definitions are consistent with the literature). We express each of these as a rate by dividing by total employment, averaged over years t and t−1 in usual DHS fashion. Figure 13 reports the job creation rate from the BDS (solid blue line), NETS (dashed green line), and NETS omitting firms with longitudinal imputation (dot-dashed purple line).
Figure 14 reports the job destruction rate.

[13] NETS marketing materials point out the rise in entry and argue that this reflects growth of self employment or gig economy work brought on by changes in the nature of entrepreneurship and the weak labor markets of the Great Recession and aftermath. As noted above, we drop firms with only one reported employee, which should roughly eliminate true nonemployers from the data. Thus, the discrepancy we observe reflects apparent differences in measured employment at employer businesses.

[Figure 12: Young firm employment as share of total, NETS vs. BDS. Series: NETS, BDS. Horizontal axis: year (1996 to 2014); vertical axis: percent. Young firms have age less than five.]

[Figure 13: Gross job creation rate, NETS vs. BDS. Series: BDS, NETS, non-imputed. Horizontal axis: year (1992 to 2014); vertical axis: percent. DHS denominator. Non-imputed series omits firms with imputation in either year.]

[Figure 14: Gross job destruction rate, NETS vs. BDS. Series: BDS, NETS, non-imputed. Horizontal axis: year (1992 to 2014); vertical axis: percent. DHS denominator. Non-imputed series omits firms with imputation in either year.]

In general, NETS exhibits much lower rates of gross job flows than the official data, as one might expect given the foregoing discussion of imputation and rounding. But it is somewhat surprising that the non-imputed NETS series are sometimes even lower than the overall NETS series, suggesting that imputation alone does not explain the low volatility of NETS firms. One likely reason is that, as shown on Figure 10, imputation is most prevalent among young firms. Dropping imputed firms means shifting the firm distribution heavily toward more mature firms that tend to have lower job creation rates. Other problems arise from the simple fact that dropping imputed firms significantly reduces the sample, and likely in a non-random way, so any statistics calculated on the residual sample are biased. The job creation rate patterns of the late years in the sample, when the NETS and NETS-without-imputation series diverge, likely reflect the surge in firm entry seen in NETS since 2000; entrants mechanically contribute to gross job creation, and the omission of imputed entrants should mechanically reduce the overall rate.

In any case, the patterns of gross job flows in NETS are substantially different from the BDS, both in terms of levels and in terms of time series behavior, and imputation alone does not account for these discrepancies.

4.2 Cell-based comparisons

We can drill down further by comparing detailed “cells” in the BDS and NETS. We focus on two disaggregations available in the public BDS files: firm size by firm age by state, and firm size by firm age by industry. Comparing individual cells along these dimensions allows for a more complete picture of the two data sources. We focus on three critical measures of business dynamics: job creation rates, job destruction rates, and net employment growth rates. We also study simple employment levels measured as the DHS denominator (i.e., two-year employment averages).

Firm size bins, in terms of employees (based on DHS denominator), are defined as follows: 1-4, 5-9, 10-19, 20-49, 50-99, 100-249, 250-499, 500-999, 1,000-2,499, 2,500-4,999, 5,000-9,999, and 10,000 or above; these are the narrowest size bins available in the BDS. Firm age bins are defined as follows: 0, 1, 2, 3, 4, 5, 6-10, and 11 or above. The BDS provides more age detail in the 11 and above category (11-15, 16-20, 21-25, and beyond), but given the shorter time series available in NETS, we combine the 11+ categories for better coverage. All states plus the District of Columbia are used, as are all SIC sectors available in the BDS: agricultural services, forestry, and fishing (SIC 7); mining (SIC 10); construction (SIC 15); manufacturing (SIC 20); transportation and public utilities (SIC 40); wholesale trade (SIC 50); retail trade (SIC 52); finance, insurance, and real estate (SIC 60); and services (SIC 70).
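The bin definitions above can be encoded as simple lookup functions, and the number of possible cells follows from multiplying the bin counts. The sketch below is illustrative only: the function names are hypothetical, and the handling of fractional DHS employment (for example, treating 4.5 as belonging to the 1-4 bin, and values below 1 as belonging to the smallest bin) is an assumption that the text does not specify.

```python
from bisect import bisect_right

# Lower bounds and labels of the twelve firm size bins listed in the text.
SIZE_BIN_LOWER_BOUNDS = [1, 5, 10, 20, 50, 100, 250, 500, 1000, 2500, 5000, 10000]
SIZE_BIN_LABELS = [
    "1-4", "5-9", "10-19", "20-49", "50-99", "100-249", "250-499",
    "500-999", "1,000-2,499", "2,500-4,999", "5,000-9,999", "10,000+",
]
# The eight firm age bins listed in the text.
AGE_BIN_LABELS = ["0", "1", "2", "3", "4", "5", "6-10", "11+"]


def size_bin(dhs_employment):
    """Assign a firm to a size bin from its DHS (two-year average) employment.

    Assumption: a value at or above a bin's lower bound, and below the next
    bound, falls in that bin; positive values below 1 go to the smallest bin.
    """
    if dhs_employment <= 0:
        raise ValueError("DHS employment must be positive")
    index = max(bisect_right(SIZE_BIN_LOWER_BOUNDS, dhs_employment) - 1, 0)
    return SIZE_BIN_LABELS[index]


def age_bin(age):
    """Assign an integer firm age (in years) to an age bin."""
    if age < 0:
        raise ValueError("firm age cannot be negative")
    if age <= 5:
        return str(age)
    return "6-10" if age <= 10 else "11+"


N_STATES = 51   # 50 states plus the District of Columbia
N_SECTORS = 9   # SIC sectors listed in the text

state_cells = len(SIZE_BIN_LABELS) * len(AGE_BIN_LABELS) * N_STATES
industry_cells = len(SIZE_BIN_LABELS) * len(AGE_BIN_LABELS) * N_SECTORS
```

With 12 size bins and 8 age bins, the products are 12 × 8 × 51 = 4,896 state cells and 12 × 8 × 9 = 864 industry cells.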
Therefore, the size by age by state disaggregation has potential for up to 4,896 cells, and the size by age by industry disaggregation has potential for up to 864 cells. When a cell in one data source is missing but that cell is not missing in the other data source, we populate each of job creation, job destruction, and DHS employment as zero in the missing source.[14]

Both location and industry are establishment characteristics, and multi-establishment firms can operate in multiple states and industries. When creating cell aggregates we implement BDS methodology, as follows.[15] A single firm’s activity can appear in any industry or state cell in which that firm has establishments, but only the establishments that belong to a given cell contribute data to that cell aggregate. However, firm characteristics are firm-wide and apply to all of a firm’s establishments. That is, firm size and firm age information are the same for all establishments of a given firm. For example, consider a firm with two establishments, one in New York (first opened in 2000) and the other in New Jersey (first opened in 2002). Suppose we observe this firm in the year 2003 and find that the New York establishment has five employees and the New Jersey establishment has ten employees. Then in 2003, the firm has firm age of three and firm size of fifteen. The employment and job flows of the New York establishment will appear in the cell defined as firm size of 10-19 employees, firm age of 3, and New York state. The employment and job flows of the New Jersey establishment will appear in the cell defined as firm size of 10-19 employees, firm age of 3, and New Jersey state. That is, the New York and New Jersey establishments appear in the same firm size and age bins since they belong to the same firm, but they appear in different states. Industry is treated in the same manner.

Table 2 reports simple cell-based correlations between the BDS and NETS in terms of job creation, job destruction, net employment growth, and the DHS employment level; these correlations refer to actual levels (i.e., number of jobs created). For brevity we focus on two snapshot years, 2003 and 2014.
We choose 2003 because this is the first year in which NETS is available given our firm age scheme, and we choose 2014 because it is the latest year in our data. The first two rows of Table 2 refer to the size-age-state cells. As can be seen from the first row, in 2003 the levels of job creation and job destruction were highly correlated between BDS and NETS, though net growth is less correlated. These correlations generally deteriorate by 2014. The correlation of employment levels, in the last column, remains extremely high throughout. The size-age-industry cell scheme shows similar results. The correlations for job creation, job destruction, and employment levels are reassuring, at least for 2003, and suggest that NETS broadly tracks the BDS along the studied dimensions. It may be that future revisions to 2014 data will improve the coverage.

[14] If a cell is missing in both sources, we do not generate an empty cell to populate in both sources.

[15] We confirmed the BDS methodology through correspondence with Census Bureau staff.

Cells               Year    Job Creation    Job Destruction    Net      Denominator
Size-Age-State      2003    0.891           0.937              0.651    0.984
Size-Age-State      2014    0.756           0.567              0.554    0.968
Size-Age-Industry   2003    0.893           0.910              0.685    0.971
Size-Age-Industry   2014    0.735           0.671              0.598    0.966

Source: NETS, BDS
Notes: Cross-cell, unweighted Pearson correlations of BDS and NETS levels. “Denominator” is the average of employment in years t−1 and t.
Table 2: Cell Correlations: Levels

However, these correlations hide an underlying divergence. Figure 15 plots job creation in BDS cells against NETS cells in 2003, and Figure 16 similarly plots job destruction. The job creation pattern illustrates how correlations can overstate the correspondence between the two data sources; a tight linear relationship is apparent, resulting in a high correlation, but the slope of the relationship is clearly steeper than the 45-degree line (dashed red line) that would indicate perfect correspondence. That is, NETS tends to understate job creation in 2003, relative to the BDS. The job destruction pattern has a less clear story but suggests that NETS may overstate job destruction relative to BDS, particularly in cells with higher job destruction levels.
The high correlations shown on Table 2, therefore, partly reflect the fact that NETS and BDS show roughly similar magnitudes in an ordinal sense without always matching well in actual levels. These divergences in levels also help explain the lower correlations for net job creation seen on Table 2, since net job creation is the difference between creation and destruction.

[Figure 15: Job creation (thousands), BDS vs. NETS, size-age-state cells (2003). Horizontal axis: NETS; vertical axis: BDS.]

These results on levels of job flows may be of limited importance, however, since much research focuses on rates of job flows. We calculate cell-level job creation rates, job destruction rates, and net employment growth rates by dividing each level by overall DHS employment for the cell. We drop all firm age zero cells since, in both sources, these have job creation and employment growth rates of 200 percent and job destruction rates of 0 percent by construction. Table 3 reports these cell correlations, again for the two different disaggregation schemes and for the years 2003 and 2014. These correlations are generally quite low, again suggesting that the level correlations mostly reflect common employment scale effects, and that once flows are normalized by employment the rates lack a close relationship across the data sources.
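The cell construction and rate calculation described above can be sketched as follows. This is a minimal illustration and not the BDS or NETS production code: the function names and record layout are hypothetical, the bin labels are passed in rather than computed, and the prior-year employment values in the example are made up (the text gives only the 2003 values for the New York and New Jersey establishments).

```python
from collections import defaultdict


def dhs_flows(emp_prev, emp_curr):
    """Job creation, job destruction, and DHS denominator for one establishment."""
    change = emp_curr - emp_prev
    return max(change, 0), max(-change, 0), 0.5 * (emp_prev + emp_curr)


def cell_rates(establishments):
    """Aggregate establishments into cells and compute DHS rates.

    Each establishment is a dict with keys 'size_bin' and 'age_bin' (firm-wide
    characteristics), 'state' (an establishment characteristic), 'emp_prev',
    and 'emp_curr'. Returns {cell: (creation rate, destruction rate, net rate)}.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0.0])  # cell -> [JC, JD, denominator]
    for est in establishments:
        cell = (est["size_bin"], est["age_bin"], est["state"])
        jc, jd, denom = dhs_flows(est["emp_prev"], est["emp_curr"])
        totals[cell][0] += jc
        totals[cell][1] += jd
        totals[cell][2] += denom
    rates = {}
    for cell, (jc, jd, denom) in totals.items():
        if denom > 0:
            rates[cell] = (jc / denom, jd / denom, (jc - jd) / denom)
    return rates


establishments = [
    # The two-establishment firm from the text in 2003; prior-year values are made up.
    {"size_bin": "10-19", "age_bin": "3", "state": "NY", "emp_prev": 4, "emp_curr": 5},
    {"size_bin": "10-19", "age_bin": "3", "state": "NJ", "emp_prev": 12, "emp_curr": 10},
    # A hypothetical entrant: no employment last year, eight employees this year.
    {"size_bin": "1-4", "age_bin": "0", "state": "NY", "emp_prev": 0, "emp_curr": 8},
]
rates = cell_rates(establishments)
```

The entrant's cell has a creation rate and net growth rate of 2.0 (200 percent) and a destruction rate of zero, which is why age-zero cells carry no information for rate comparisons and are dropped.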
[Figure 16: Job destruction (thousands), BDS vs. NETS, size-age-state cells (2003). Horizontal axis: NETS; vertical axis: BDS.]

The cell-based comparisons generally support the concerns suggested by the aggregate analysis. NETS appears to have dampened rates of business dynamics compared with the BDS, and cell-level job flow rates are not strongly correlated between the two sources.

Cells               Year    Job Creation    Job Destruction    Net
Size-Age-State      2003    0.000           0.233              0.139
Size-Age-State      2014    0.078           0.158              0.095
Size-Age-Industry   2003    -0.081          0.312              0.181
Size-Age-Industry   2014    0.134           0.070              0.045

Source: NETS, BDS
Notes: Cross-cell, unweighted Pearson correlations of BDS and NETS rates.
Table 3: Cell Correlations: Rates
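The contrast between Tables 2 and 3 can be illustrated with a small numerical sketch. All numbers below are made up for illustration and are not NETS or BDS values; the two "sources" are hypothetical and are assumed to share the same cell employment. The point is only that level correlations can be high when cells differ greatly in size, even if the underlying rates disagree, and that a correlation of one does not imply agreement in levels.

```python
from math import sqrt


def pearson(x, y):
    """Unweighted Pearson correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Made-up cells spanning several orders of magnitude in employment.
cell_employment = [100, 1000, 10000, 100000]

# Made-up job creation rates in two hypothetical sources; they move in opposite directions.
rates_a = [0.10, 0.20, 0.10, 0.20]
rates_b = [0.20, 0.10, 0.20, 0.10]

# Levels are rates times cell employment, so cell size dominates their variation.
levels_a = [r * e for r, e in zip(rates_a, cell_employment)]
levels_b = [r * e for r, e in zip(rates_b, cell_employment)]

level_corr = pearson(levels_a, levels_b)  # high, driven by common scale
rate_corr = pearson(rates_a, rates_b)     # negative once scale is removed

# A source that is 50 percent higher in every cell is still perfectly correlated.
scaled_corr = pearson(levels_a, [1.5 * v for v in levels_a])
```

In this constructed case the level correlation is close to one while the rate correlation is negative, and the rescaled series has a correlation of one despite lying well off the 45-degree line.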
4.3 Lifecycledynamics Many questions in firm dynamics focus on the firm lifecycle. Indeed, firm age and the behaviorofyoungfirmsareatthecenterofmanykeyfirmdynamicsquestionsbecauseyoung firmsplayadisproportionateroleinaggregatejobgrowth(Haltiwangeretal.(2013))andaggregateproductivitygrowth(Alonetal.(2018);Deckeretal.(2014)andreferencestherein). Assuch,accuratemeasurementofentryandyoungfirmbehavioriscriticalforanydataset used to study firm dynamics. In this section, we proceed by using NETS to investigate criticalfirmlifecyclepatternsthathavebeendocumentedbyLBD-basedliterature. 4.3.1 Averagegrowth A highly cited study in empirical firm dynamics is Haltiwanger et al. (2013). Using the LBD,theauthorsshowthatthewidelyheldviewthatsmallbusinessesaretheprimaryjob creators—aviewreinforcedbyNETS-basedevidence(Neumarketal.(2011))—wasclouded by data limitations. Rather, Haltiwanger et al. (2013) show that young firms are the key jobcreators; whilesmallbusinessesdocreatejobsdisproportionately,oncetheeconometrician controls for firm age, the small firm advantage diminishes. The empirical regularity ofsmallfirmsdisproportionatelycreatingjobsarisesbecauseyoungfirmstendtobesmall. The evidence that young firms are critical for job creation has motivated a wide literature seekingtobetterunderstandyoungfirms. Herewedonotreplicatethespecificexercisesof Haltiwangeretal.(2013)butinsteadillustratetheconceptwithasimplerexercise. Figure17,whichreliesonBDSdatafor1992-2014,reportsnetfirmemploymentgrowth rates by firm size bin, where size bins are set using initial firm employment and growth rates are averaged over the years in the sample.16 Exiting firms are included (which have growthof-200percent). ThelightbluebarsuseallfirmsintheBDSandillustratetheview thatwascommonpriortoHaltiwangeretal.(2013): firmgrowthratesdeclinewithfirmsize (at least among the smaller size bins) then hover near zero for larger sizes. The dark green 16Initialfirmsizemeanssizeint−1,wheregrowthiscalculatedfromt−1tot. 35
16 12 8 4 0 -4 -8 -12 -16 tnecreP Net employment growth by firm size, BDS All firms Incumbent firms 1 - 4 5 - 9 10 - 19 20 - 49 50 - 99 100 - 249 250 - 499 500 - 999 1000 - 2499 2500 - 4999 5000 - 9999 10000+ Initial firm size (employment) DHS denominator; annual averages from BDS for 1992-2014. Figure17 bars feature incumbent firms only (that is, new entrants are omitted). A starkly different picture emerges. The smallest size bin still has some growth advantage, though it is much diminished compared to the all-firm sample. Aside from the smallest class, all size classes below500employeesactuallyseenegativenetgrowthonaverage. Thefigureillustratesthe notionthatthesmall-firmgrowthadvantageisdrivenalmostentirelybynewentrants. Figure 17 illustrates a critical stylized fact about the firm size and age distribution, so it is important that NETS data exhibit similar properties. Figure 18 reports the same exercise with NETS data. Rather reassuringly, NETS results are qualitatively (though not quantitatively)similartothoseseenintheBDS.Figure19repeatsthesameexcerciseomittingfirms inwhichatleast10percentofemploymentisatestablishmentswithlongitudinallyimputed employment figures. The result is starkly different and suggests that, oddly, the ability of NETSqualitativelytoreplicateFigure17isheavilydependentonimputedobservations. In 36
16 12 8 4 0 -4 -8 -12 -16 tnecreP Net employment growth by firm size, NETS All firms Incumbent firms 1 - 4 5 - 9 10 - 19 20 - 49 50 - 99 100 - 249 250 - 499 500 - 999 1000 - 2499 2500 - 4999 5000 - 9999 10000+ Initial firm size (employment) DHS denominator; annual averages from NETS for 1992-2014. Figure18 particular, it appears that much of entrants’ contribution to the employment growth of the smallfirmbinsreflectsimputedemploymentdataassignedtonewfirms. Indeed,asshown onFigure10,closeto90percentofnewentrants(age0)haveimputedemploymentdata. In 2014, of the new firms with imputed employment data, less than 1 percent reflect respondentimputation(i.e.,“bottomofrange”),whileD&BandWalls&Associatesestimateseach compriseabouthalfofimputations. 4.3.2 Skewnessandchurn Haltiwangeretal.(2013)showedthatyoungfirmsaccountforthehighaveragegrowthrates of small firms. Decker et al. (2014) explore higher moments of the growth rate distribution over the lifecycle, documenting two key characteristics of young firm growth: skewness and churn. The growth outcomes of young firms are highly skewed, with a small number 37
of extreme growth events. And young firms undergo considerable “churn”: the growth outcomes of young firms are highly dispersed, with a large amount of both very positive and very negative growth events, and young firms exhibit strong “up-or-out” dynamics as high incidence of failure among some young firms coexists with rapid growth of many survivors. These characteristics of young firms are not captured by average growth statistics but instead require study of the full distribution of growth outcomes, including outcomes of survivors and the prevalence of firm exit.

[Figure 19: Net employment growth by firm size, NETS. Series: all firms; incumbent firms. Horizontal axis: initial firm size (employment), in bins from 1-4 to 10000+. Vertical axis: percent. DHS denominator; annual averages from NETS for 1992-2014. Firms with at least 10% of employment imputed are omitted.]

Figure 20, which is taken from the Decker et al. (2014) exercises on LBD data, reports the growth rate distribution of surviving firms (i.e., those that do not exit) by age, averaged over the years 1992-2011 (footnote 17). The solid line with dots is the median of the employment-weighted growth rate distribution for the corresponding age bin; that is, for each age bin, half of all employment is at firms with growth rates at or below the black line. The top of the dark blue bars indicates the 90th percentile of the employment-weighted growth rate distribution, while the bottom of the light green bars indicates the 10th percentile of the employment-weighted growth rate distribution. Each statistic is calculated for every year in the sample, then averaged across years (footnote 18).

Footnote 17: Decker et al. (2014) report 16 age bins, with the top bin including all firms age 16 and above. Given the shorter time series of NETS, to improve the comparison we report only 11 age bins. Since our project lacks access to the LBD microdata, in our reproduction of the Decker et al. (2014) figure we collapse age bins 11 and higher using simple averages of the reported percentiles.

Footnote 18: The population of firms included in Figure 20 differs from the population included in Figure 17 in that Figure 17 potentially includes all firms or all incumbents (including firms that exit, with a growth rate of -200 percent), but Figure 20 includes only surviving firms. That is, in Figure 20, the bars corresponding with firm age 1 include firms that survived to reach age 1, omitting those that exited between ages 0 and 1.

[Figure 20: Distribution of net employment growth rates for surviving firms, LBD. Series: median; 90th percentile; 10th percentile. Horizontal axis: firm age (1 through 11+). Vertical axis: percent. Source: Decker et al. (2014). DHS denominator. Average across years 1992-2011.]

A few key patterns are evident from Figure 20 (see Decker et al. (2014) for more discussion). First, median employment growth is only positive among young firms; the typical mature firm has zero employment growth, consistent with the age profiles described above. Second, growth outcomes are highly dispersed among young firms, with dispersion declining as firm cohorts age. This fact illustrates the high pace of churn among young firms, with many outcomes of both extreme growth and extreme decline. Third, the growth rate distribution of young firms is characterized by skewness, shown as the distance from the 90th percentile to the median compared with the distance from the 10th percentile to the median; this skewness illustrates that the substantial job growth contribution of young firms reflects not widespread growth but in fact a few firms with extremely high growth. Skewness disappears entirely by age five, a reason that much of the literature studies young firms with an age cutoff around five. High growth is a characteristic of (some) young firms.

[Figure 21: Distribution of net employment growth rates for surviving firms, NETS. Series: median; 90th percentile; 10th percentile. Horizontal axis: firm age (1 through 11+). Vertical axis: percent. DHS denominator. Average across years 1992-2011.]

As with the data on firm growth by size and age, the patterns of dispersion and skewness over the (surviving) firm lifecycle evident in Figure 20 are critical stylized facts about the behavior of young firms and the sources of aggregate employment growth. We evaluate the ability of NETS to exhibit these patterns on Figure 21, which mimics Figure 20.
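The employment-weighted percentiles plotted in Figures 20 and 21 can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy firm data, and the choice to weight each firm by its DHS denominator (average employment across the two years) are assumptions made for illustration only.

```python
from typing import Sequence


def dhs_growth(emp_prev: float, emp_curr: float) -> float:
    """DHS growth rate: employment change divided by two-period average employment."""
    denom = 0.5 * (emp_prev + emp_curr)  # the "DHS denominator"
    if denom <= 0:
        raise ValueError("firm needs positive employment in at least one period")
    return (emp_curr - emp_prev) / denom


def weighted_percentile(values: Sequence[float],
                        weights: Sequence[float],
                        q: float) -> float:
    """Smallest value v such that at least share q of total weight lies at or below v."""
    if len(values) != len(weights) or len(values) == 0:
        raise ValueError("values and weights must be non-empty and equally long")
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= q * total:
            return value
    return pairs[-1][0]  # guard against floating-point shortfall


# Hypothetical surviving firms in one age bin: (employment last year, employment this year).
firms = [(10, 30), (100, 100), (50, 40), (200, 210)]
growth = [dhs_growth(prev, curr) for prev, curr in firms]
# Assumption: each firm is weighted by its DHS denominator.
weights = [0.5 * (prev + curr) for prev, curr in firms]
p10, p50, p90 = (weighted_percentile(growth, weights, q) for q in (0.10, 0.50, 0.90))
```

Under this definition an exiting firm has a growth rate of -2 (that is, -200 percent), which is why a figure restricted to survivors excludes such observations. In a full replication, the three percentiles would be computed for each age bin and year and then averaged across years.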
The difference between the figures is very concerning: while LBD data in Figure 20 exhibit significant growth rate dispersion among firms of all ages (and particularly young firms), very little dispersion is evident in the NETS data shown on Figure 21. Since these are employment-weighted distributions, the latter figure indicates that 90 percent of surviving firm employment is at firms with a growth rate around zero percent or higher for almost all age groups, while in the LBD we observe very young firms with growth approaching negative 50 percent and even many mature firms with growth around negative 25 percent. And while negative growth is nearly absent from NETS data, positive growth is almost as rare. For example, in the LBD we observe young firms that account for around 10 percent of employment growing at a rate of 50 percent or more, but no firm age group in NETS reports a 90th percentile growth rate beyond 20 percent. The median growth rate in NETS, shown by the black line with dots, is close to zero for firms of all ages, in contrast to the positive growth rates seen by young firms in the LBD. NETS data therefore miss virtually the entire distribution of firm growth outcomes, whatever their performance tracking average growth patterns. This is a significant limitation of NETS generally and is particularly problematic for the study of young firms, which (as shown on Figure 20) are especially characterized by wide dispersion and high skewness of firm growth rates. In unreported results we find that omitting imputed observations from NETS does not materially alter Figure 21.

Decker et al. (2014) also document the “up-or-out” nature of the young firm environment by contrasting exit and survival. Figure 22, which uses BDS data to replicate Decker et al. (2014), reports the experiences of firm cohorts as follows. The light blue bars report jobs destroyed (over one year) by firms that exit just before reaching a given age; that is, the blue bar for age 1 reflects exits of firms between age 0 and age 1, the blue bar for age 2 reflects exits of firms between age 1 and age 2, and so on.
The dark green bars report net job creation (over one year) among firms that survive to a given age; that is, the green bar for age 1 reports jobs created by firms continuing from age 0 to 1, the green bar for age 2 reports jobs created by firms continuing from age 1 to 2, and so on. All figures are scaled by the
DHS employment denominator for the entire cohort, and rates are calculated by year then averaged over all years 1992-2011 (footnote 19).

[Figure 22: Up or out dynamics for young firms, LBD. Series: job destruction from exiting firms; net job creation of continuing firms. Horizontal axis: firm age (1 through 11+). Vertical axis: percent. Source: Decker et al. (2014). DHS denominator; annual averages from LBD for 1992-2011.]

Footnote 19: As with the previous set of figures, we collapse all age categories above 10 into a single “11+” category, which is simple in this exercise since we rely on BDS data. We do this for comparability with NETS data but make a note of it because it differs slightly from the setup in Decker et al. (2014).

Figure 22 illustrates three key points about the firm lifecycle. First, both job destruction from exits and job creation from entrants are high among young firms and decline monotonically with firm age, consistent with evidence above that young firm outcomes are volatile and highly dispersed. Second, an “up-or-out” pattern is evident in the sense that, while many jobs are destroyed by exiting firms, surviving firms have high average growth rates. Third, job creation from survivors is more than offset by job destruction from exiting firms for all age groups. Note that, by construction, age zero firms (not shown on Figure 22) only create (i.e., do not destroy) jobs, so creation far offsets destruction upon entry. A reasonable characterization of young firm dynamics, then, is that each new cohort creates a large number of jobs upon entry, but firms immediately begin failing, destroying many jobs as firms age but with continued growth among surviving firms that partially offsets the job destruction (footnote 20).

Footnote 20: Decker et al. (2014) note that the post-entry job destruction of exiting firms is still not enough to completely offset the jobs created upon entry: five years after entry, the employment of the typical cohort is still equal to 80 percent of the cohort’s entry employment, such that new cohorts of firms make permanent contributions to aggregate employment despite high failure rates in early years.

[Figure 23: Up or out dynamics for young firms, NETS. Series: job destruction from exiting firms; net job creation of continuing firms. Horizontal axis: firm age (1 through 11+). Vertical axis: percent. DHS denominator; annual averages from NETS for 1992-2011.]

Figure 23 mimics Figure 22 using NETS data to assess the presence of “up-or-out” dynamics in NETS. The performance of NETS in this exercise is not as weak as in the previous skewness and dispersion exercise: the lifecycle pattern of exit-driven destruction and creation of continuers is not quite monotonic but is qualitatively similar to BDS data in that destruction from exit outpaces creation among continuers for all age groups. Moreover, among age groups above 5 the magnitudes of job destruction and creation appear reasonably accurate. However, the magnitudes illustrated by the figure indicate particularly poor measurement of young firm dynamics. The differences between young and mature firms, in terms of both job destruction and creation, are much smaller in NETS than in the BDS, and the monotonicity-by-age is wrong for the youngest age groups.

[Figure 24: Up or out dynamics for young firms, NETS. Series: job destruction from exiting firms; net job creation of continuing firms. Horizontal axis: firm age (1 through 11+). Vertical axis: percent. DHS denominator; annual averages from NETS for 1992-2011. Firms with at least 10% of employment imputed are omitted.]

Figure 24 documents the same exercise in NETS omitting firms that have longitudinally imputed data comprising at least 10 percent of their employment; interestingly, NETS’ inability to track the pattern of exit-driven job destruction among young firms shown in Figure 23 appears to be due to imputed observations; job destruction rates across the lifecycle look reasonably accurate among non-imputed firms. However, job creation rates among young firms appear little better among the non-imputed observations than among NETS firms generally. Again we observe that NETS is particularly limited in its measurement of young firm dynamics, and young firm dynamics are a critical component of the overall firm dynamics
literature.

5 Discussion

The foregoing comparisons reveal serious discrepancies between NETS and official administrative data. NETS displays markedly different patterns of young firm activity in terms of both aggregate activity shares and the micro behavior of young firms. NETS businesses generally exhibit patterns of business dynamics that are far less volatile than those seen in official sources. A key driver of these discrepancies is the high rate of imputation in NETS, particularly among young firms, most of which lack fresh data observations. But restricting the sample to omit imputed data is no panacea, as imputation is extremely prevalent among young businesses and restricting the sample to non-imputed observations creates composition effects that do not usually mitigate the discrepancies. These limitations of NETS are serious and oblige researchers using NETS to exercise caution. Topics including reallocation, entrepreneurship, firm growth and exit, and inaction are highly vulnerable to the limitations of NETS.

One potential response to the NETS/LBD discrepancies we document here is that government data sources are also imperfect and may not have a claim on being the benchmark against which private data sources should be judged. We readily acknowledge that official sources have many limitations. For example, users of the LBD encounter problems with firm identifier longitudinal linkages, staleness of industry codes and firm organization details between census years, and lack of easily integrated coverage of the nonemployer universe. Indeed, methods of defining firm age and organic employment growth, now widely used in empirical literature but pioneered by Haltiwanger et al. (2013) and Davis et al. (2007), are designed to minimize the errors introduced by these limitations. Barnatchez et al. (2017) discuss limitations of official data more broadly, including the Census Bureau’s County Business Patterns (which uses the same source data as the LBD), and show discrepancies between Census and Bureau of Labor Statistics (BLS) sources even when restricted to common industry scope (though these discrepancies are smaller than those between NETS and either official source).

However, there are at least two reasons to treat the official sources as authoritative. First, official data collection efforts are characterized by intense focus on consistency and measurement best practices. For example, the employment data on which the LBD is based always report employment as of March 12 of a given year; in contrast, the Dun & Bradstreet employment figures could be recorded at any point during the year, rendering them vulnerable to seasonal fluctuations. Other LBD variables are continually updated with information from administrative and survey sources, such as the annual Report of Organization survey, and Census surveys are conducted scientifically (footnote 21). More broadly, the U.S. statistical agencies employ large staffs of experts tasked with ensuring data quality as well as active researchers exploring and performing research and development on data products (footnote 22). These efforts are supplemented by robust exchanges between statistical agency staff and outside experts, such as those facilitated by the Federal Economic Statistics Advisory Committee (FESAC). For the purposes of D&B, scientific best practice is likely to be both excessively costly and unnecessary; for example, an estimated or imputed employment observation is often good enough for the needs of D&B clients while being much less useful for researchers of business dynamics.

Second, official sources are based in part on administrative data that are accurate by construction. The LBD source data are ultimately tax records, so the LBD represents the universe of in-scope employer businesses that are known to U.S.
tax authorities, a clear and reasonable definition of business activity that contrasts with D&B’s looser goal of covering a less-defined employer and nonemployer business universe with large potential for undercoverage relative to its goal (as appears to have been the case prior to the likely scope improvement of the 2000s documented by Barnatchez et al. (2017)). To the extent that NETS differs from the combined Census employer and nonemployer universe (i.e., CBP and NES) in terms of establishment coverage, it must be that NETS is either including businesses defined in some other way than taxable entities or omitting taxable businesses. Likewise, annual employment snapshots in the LBD represent data that are routinely used for administrative purposes by the IRS and the Social Security Administration, limiting the scope for inaccuracy and imputation. While some LBD establishments only receive industry code and company organization updates after the semidecennial Economic Censuses, employment data come from administrative sources. NETS, by contrast, exhibits high rates of imputation of employment data, which is particularly problematic for the study of business dynamics.

Footnote 21: See https://www.census.gov/programs-surveys/cos/about.html for details about the Report of Organization survey, also known as the Company Organization Survey.

Footnote 22: For example, the Census Bureau’s Center for Economic Studies employs many social scientists who actively evaluate the research uses and limitations of the LBD and other Census data products. Other Census offices have similar features, and additional quality control is performed by authorized outside researchers using the Federal Statistical Research Data Centers.

Weaknesses and limitations of the official sources notwithstanding, then, the LBD and corresponding BDS are, in our view, best treated as more authoritative than NETS. The discrepancies between the sources are therefore cause for concern about the usefulness of NETS for business dynamics research.

For the study of firm dynamics, we conclude that the most promising use of NETS is not for broad studies of entrepreneurship or firm dynamics but instead for more detailed, narrower investigations of specific case studies in which the microdata can be evaluated against outside sources prior to analysis. Echeverri-Carroll and Feldman (2017), discussed above, is one such example, though it is limited to two cities (albeit ones with important recent entrepreneurship patterns, Austin, TX and the North Carolina “Research Triangle”). Those authors carefully describe ways to make the NETS data most reliable, including appropriate sample restrictions and relaxation of firm entry timing to windows broader than one year.
Researchers with questions for which such restrictions and timing conventions are appropriate may find similar success, though we argue that validation against independent data sources will continue to be necessary before proceeding with case studies of other cities.
6 Conclusion

Barnatchez et al. (2017) argue that NETS can be useful for the study of static business distributions, provided that researchers exercise appropriate caution and pay careful attention to problems with coverage of certain kinds of establishments. In this paper, we obtain less optimistic results when using NETS to study business dynamics in a microdata approach. We show that one particular concept of high interest to firm dynamics researchers (the lifecycle dynamics of young firms) appears poorly measured in NETS data, as are broader concepts like gross and net job flows among firms generally. This is a considerable limitation given the importance of these concepts for studies of business and labor market dynamics. Popular topics including entrepreneurship, job reallocation, high-growth firms, and firm failure may be difficult to study with high confidence in NETS, a finding consistent with the more limited investigation performed by Haltiwanger et al. (2013). Through painstaking firm-level comparison work researchers may find specific settings in which NETS can be reliable, such as Echeverri-Carroll and Feldman (2017), but in general we urge caution.

While we view our results as compelling, there are many aspects of NETS that we do not investigate. NETS includes a wealth of information on variables other than employment, industry, and location, such as sales, credit information, and legal form of organization. We leave investigation of these and other variables for future research.

References

Alon, Titan, David Berger, Robert Dent, and Benjamin Pugsley, “Older and Slower: The Startup Deficit’s Lasting Effects on Aggregate Productivity Growth,” Journal of Monetary Economics, 2018, 93, 68–85.

Barnatchez, Keith, Leland D. Crane, and Ryan A. Decker, “An Assessment of the National Establishment Time Series (NETS) Database,” Finance and Economics Discussion Series 2017-110, Federal Reserve Board 2017.
Chava, Sudheer, Alex Oettl, Manpreet Singh, and Linghang Zeng, “The Dark Side of Technological Progress? Impact of E-Commerce on Employees at Brick-and-Mortar Retailers,” 01 2018.

Cho, Clare, Patrick Mclaughlin, Eliana Zeballos, Jessica Kent, and Chris Decken, “Capturing the Complete Food Environment With Commercial Data: A Comparison of TDLinx, ReCount, and NETS Databases,” Technical Bulletin 1953, USDA Economic Research Service March 2019.

Davis, Steven J., John C. Haltiwanger, and Scott Schuh, Job Creation and Destruction, The MIT Press, 1996.

Davis, Steven J., John Haltiwanger, Ron S. Jarmin, and Javier Miranda, in Daron Acemoglu, Kenneth Rogoff, and Michael W, eds., NBER Macroeconomics Annual 2006, Volume 21, The MIT, 2007, chapter Volatility and Dispersion in Business Growth Rates: Publicly Traded versus Privately Held Firms, pp. 107–180.

Decker, Ryan A., John Haltiwanger, Ron S. Jarmin, and Javier Miranda, “Changing Business Dynamism and Productivity: Shocks vs. Responsiveness,” NBER working paper no. 24236, 2018.

Decker, Ryan, John Haltiwanger, Ron S. Jarmin, and Javier Miranda, “The Role of Entrepreneurship in US Job Creation and Economic Dynamism,” Journal of Economic Perspectives, 2014, 28 (3), 3–24.

DeSalvo, Bethany, Frank Limehouse, and Shawn D. Klimek, “Documenting the Business Register and Related Economic Business Data,” Technical Report, Center for Economic Studies working paper CES-WP-16-17 2016.

Echeverri-Carroll, Elsie and Maryann Feldman, “Chasing Entrepreneurial Firms,” Technical Report 2017.
Faccio, Mara and Hung-Chia Hsu, “Politically Connected Private Equity and Employment,” The Journal of Finance, 2017, 72 (2), 539–574.

Guzman, Jorge and Scott Stern, “The State of American Entrepreneurship: New Estimates of the Quantity and Quality of Entrepreneurship for 15 US States, 1988-2014,” Working Paper 22095, National Bureau of Economic Research March 2016.

Haltiwanger, John, Ron S. Jarmin, and Javier Miranda, “Who Creates Jobs? Small versus Large versus Young,” The Review of Economics and Statistics, 2013, 95 (2), 347–361.

Haltiwanger, John, Ron S. Jarmin, Robert Kulick, and Javier Miranda, “High growth young firms: Contribution to job growth, revenue growth, and productivity,” in “Measuring entrepreneurial businesses: Current knowledge and challenges,” University of Chicago Press, 2017.

Heider, Florian and Alexander Ljungqvist, “As certain as debt and taxes: Estimating the tax sensitivity of leverage from state tax changes,” Journal of Financial Economics, 2015, 118 (3), 684–712. NBER Symposium on New perspectives on corporate capital structures.

Jarmin, Ron S. and Javier Miranda, “The Longitudinal Business Database,” Technical Report, Center for Economic Studies 2002.

Kimbrough, Gray, “Uncluttered Stata Graphs,” 2018. Github repository. At https://github.com/graykimbrough/uncluttered-stata-graphs. Accessed April 29, 2019.

Neumark, David, Brandon Wall, and Junfu Zhang, “Do Small Businesses Create More Jobs? New Evidence for the United States from the National Establishment Time Series,” The Review of Economics and Statistics, 2011, 93 (1), 16–29.

Neumark, David, Junfu Zhang, and Brandon Wall, “Employment Dynamics and Business Relocation: New Evidence from the National Establishment Time Series,” Working Paper 11647, National Bureau of Economic Research October 2005.
Rossi-Hansberg, Esteban, Pierre-Daniel Sarte, and Nicholas Trachter, “Diverging Trends in National and Local Concentration,” Working Paper 25066, National Bureau of Economic Research September 2018.

Understanding Data in the NETS Database, Technical Report, Walls & Associates NETS marketing materials 2009.

Walls, Donald, “Understanding Data in the NETS Database,” NETS Documentation, 2008.
Cite this document
Leland D. Crane and Ryan A. Decker (2019). Business Dynamics in the National Establishment Time Series (NETS) (FEDS 2019-034). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2019-034
@techreport{wtfs_feds_2019_034,
author = {Leland D. Crane and Ryan A. Decker},
title = {Business Dynamics in the National Establishment Time Series (NETS)},
type = {Finance and Economics Discussion Series},
number = {2019-034},
institution = {Board of Governors of the Federal Reserve System},
year = {2019},
url = {https://whenthefedspeaks.com/doc/feds_2019-034},
abstract = {Business microdata have proven useful in a number of fields, but the main sources of comprehensive microdata are subject to significant confidentiality restrictions. A growing number of papers instead use a private data source seeking to cover the universe of U.S. business establishments, the National Establishment Time Series (NETS). Previous research documents the representativeness of NETS in terms of the distribution of employment and establishment counts across industry, geography, and establishment size. But there exists considerable need among researchers for microdata suitable for studying business dynamics--birth, growth, decline, and death. We evaluate NETS in terms of its ability to corroborate key insights from the business dynamics literature with a particular focus on the behavior of new and young firms. We find that NETS microdata exhibit patterns of business dynamics that are markedly different from official administrative sources, limiting the usefulness of NETS for studying these topics.},
}