feds · November 2, 2017

An Assessment of the National Establishment Time Series (NETS) Database

Abstract

The National Establishment Time Series (NETS) is a private sector source of U.S. business microdata. Researchers have used state-specific NETS extracts for many years, but relatively little is known about the accuracy and representativeness of the nationwide NETS sample. We explore the properties of NETS as compared to official U.S. data on business activity: The Census Bureau's County Business Patterns (CBP) and Nonemployer Statistics (NES) and the Bureau of Labor Statistics' Quarterly Census of Employment and Wages (QCEW). We find that the NETS universe does not cover the entirety of the Census-based employer and nonemployer universes, but given certain restrictions NETS can be made to mimic official employer datasets with reasonable precision. The largest differences between NETS employer data and official sources are among small establishments, where imputation is prevalent in NETS. The most stringent of our proposed sample restrictions still allows scope that covers about three quarters of U.S. private sector employment. We conclude that NETS microdata can be useful and convenient for studying static business activity in high detail.

Finance and Economics Discussion Series Divisions of Research & Statistics and Monetary Affairs Federal Reserve Board, Washington, D.C. An Assessment of the National Establishment Time Series (NETS) Database Keith Barnatchez, Leland D. Crane, and Ryan A. Decker 2017-110 Please cite this paper as: Barnatchez, Keith, Leland D. Crane, and Ryan A. Decker (2017). “An Assessment of the National Establishment Time Series (NETS) Database,” Finance and Economics Discussion Series 2017-110. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2017.110. NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

An Assessment of the National Establishment Time Series (NETS) Database ∗

Keith Barnatchez, Leland D. Crane, and Ryan A. Decker

October 27, 2017

∗ Barnatchez is a student at Colby College. Crane and Decker are economists at the Federal Reserve Board. Bo Yeon Jang provided excellent research assistance. Without implication, we thank Maria Tito for helpful conversations. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the Federal Reserve research staff or the Board of Governors.

1 Introduction

We explore the representativeness of the National Establishment Time Series (NETS), a private sector source of business microdata, relative to official U.S. business universe data
sources: The Census Bureau’s County Business Patterns (CBP) and Nonemployer Statistics (NES) and the Bureau of Labor Statistics’ (BLS) Quarterly Census of Employment and Wages (QCEW). A key purpose of the present note is to provide background for the exercises described in Crane et al. (2017), which uses 2014 NETS data to estimate flood exposure during and after Hurricane Harvey, but our analysis will be useful for potential NETS users more broadly.

NETS consists of establishment-level longitudinal microdata covering, in principle, the universe of U.S. businesses. Though costly, NETS can be accessed without extensive proposal and security clearance processes and can be used outside of secure government facilities, potentially providing a highly efficient way to conduct research on topics that require business-level microdata. However, NETS data are not generated by the rigorous processes that characterize official data collection activities of U.S. statistical agencies, and comparisons of NETS to official sources have raised questions about the specific business universe covered by the data as well as the quality of annual-frequency information on establishment-level entry, exit, and employment (see, e.g., Haltiwanger et al. (2013) and Neumark et al. (2005), which we discuss further below).

A number of studies have documented properties of NETS in limited, single-state samples. Fewer studies have used the full national NETS file, and these studies typically restrict the national file to a subset that can be matched to external data samples. Using national NETS data for 1992-2014, we highlight previous concerns about the precise nature of the NETS business universe, document limitations in terms of imputation and other data artifacts, and outline specific sample restriction criteria that render NETS reasonably comparable to official sources for the purpose of studying static business distributions. In related work in progress not reported here, we find that NETS is much more limited in its value for studying business dynamics.

After applying appropriate sample selection criteria, we find that the correlation of NETS employment counts and CBP employment counts across U.S. counties can be in excess of 0.99. Correlations across state-industry-size class cells are somewhat lower but are still above 0.9 on restricted samples. Zip code-level correlations are also remarkably high. That said, we find several discrepancies between NETS and the other data sources. NETS heavily over-represents establishments with fewer than 10 employees relative to official employer business data, possibly due to the imputation of positive employment to nonemployers in NETS. We also find smaller but significant discrepancies among the largest establishments, likely due in part to large, public educational establishments that are difficult to identify as government owned, as well as establishments that mistakenly report firm employment numbers in NETS. Finally, post-2000 developments in U.S. mining, construction, and manufacturing employment do not appear to be well captured in NETS, though these discrepancies may be due in part to industry labeling differences. When we omit very small establishments, NETS agrees reasonably well with CBP and QCEW, both in terms of trends over time and in terms of the cross section, and we find better agreement still when we omit the largest establishments and educational establishments. This most restrictive sample accounts for 73 percent of U.S. employment in the QCEW. We propose that any NETS results be checked on both the unrestricted sample and the restricted sample to ensure robustness.

The paper proceeds as follows. Section 2 describes NETS and the related literature. Section 3 contains the main results, and Section 4 concludes. Additional tables and figures, as well as details regarding CBP and QCEW, are found in the appendices.

2 NETS background

NETS is a product of Walls & Associates. The source data for NETS are collected by Dun & Bradstreet (D&B) for the Duns Marketing Information file (DMI; see Walls (2008)). D&B uses and sells the data for, among other things, marketing and credit scoring.
While there is no legal obligation for establishments to participate or report truthfully, D&B has strong profit-based incentives to compile accurate data, and individual businesses’ access to credit and other business relationships may depend on the quality of the information they provide. D&B attempts to collect information on all U.S. businesses from secretaries of state, Yellow Pages, court records, and credit inquiries, as well as other sources. They also contact businesses directly by telephone. Each establishment is assigned a unique dunsnumber, which is constant over time and follows the business when it moves or when it is acquired by another firm. D&B also attempts to link establishments to firms; NETS files include an hqduns number for each establishment, which is the dunsnumber of the ultimate domestic headquarters.1

While D&B concepts are not as rigidly defined as concepts used by statistical agencies, a reasonable assessment is that D&B attempts to catalog every business establishment in the U.S., where “business” is broadly defined to include private for-profit and nonprofit organizations as well as government agencies. In NETS, an establishment is a specific line of business at a specific location (see below), and employment includes all workers at an establishment, potentially including proprietors, independent contractors, and temporary workers supplied by outside organizations. While this broad scope suits D&B business purposes well, it can be difficult to reconcile with well-defined universe concepts employed by U.S. statistical agencies; as we show below, though, NETS data are sufficiently detailed to allow approximation of standard business scope definitions. For example, the NETS line of business concept is a subset of standard establishment concepts in official sources, in which an establishment includes all workers at a specific business location (and, hence, is the sum of all lines of business in operation at that location).2

An additional difference between NETS and official sources is that official sources record establishment existence and employment on specified, uniform dates (for example, the pay period including March 12).
NETS records are annual, but information is collected throughout the year, and the timing of measurement for each establishment is not reported in the data. This is not only a source of micro-level measurement error but also a likely cause of discrepancies between NETS aggregates and official data sources.

1 There is no information in our files on the intermediate layers of the firm. Hqduns points to the highest-level headquarters within the U.S., never to, e.g., the regional headquarters of the firm.

2 In our baseline results, we merge NETS lines of business to better approximate the CBP/QCEW establishment concept. All of our qualitative conclusions are unchanged if instead we use the separate lines of business.

Besides firm linkages, D&B collects and constructs numerous establishment characteristics; for our purposes, the most important are employment, industry, and location (address). Notably, establishment employment generally includes the firm owner(s). Employment information usually comes from direct inquiries by D&B, unless it is imputed (employment imputation codes are included in NETS files, and we discuss imputation rates below). Industry is generally either self reported by the business or drawn from Secretaries of State data. Industry is available as 8-digit SIC or 6-digit NAICS codes, though there are no imputation codes for industry information. Address information may be self reported or collected from administrative-type records such as Yellow Pages and includes street address, zip code, state and county, as well as latitude and longitude.

2.1 Related Literature

NETS data have been used in a number of studies. A key reference is Neumark et al. (2005), which provides a detailed discussion of the history of D&B data collection, and then compares California NETS data to several official sources. Neumark et al. (2005) appears to be the source of the rules of thumb that (a) the employer universe can be approximated by subtracting 1 from all NETS establishments’ employment, and (b) business dynamics are best studied at 3-year frequency instead of annually (in a companion paper, we further explore business dynamics in NETS). A number of subsequent studies adopt these conventions. In their California sample, Neumark et al. (2005) find that total NETS employment is not dramatically different from the sum of UI-based employer establishment count measures and the Census Bureau’s nonemployer counts (as we detail below, though, in the national sample we find that total NETS employment is consistently between the size of the employer universe and the union of employer and nonemployer universes, a concern noted by Haltiwanger et al.
(2013)). Moreover, the authors report high correlations of employment levels between NETS and official employer datasets at the county-by-industry level and, to a somewhat lesser extent, the industry-by-size level. Consistent with the notion that NETS includes a large number of nonemployer businesses, the authors show that differences between NETS and employer datasets are heavily concentrated in small establishment size classes, particularly the 1-4 employee class. Importantly, Neumark et al. (2005) find considerable evidence of rounding (to the nearest 10 or 5) in NETS employment numbers; moreover, the authors document a significant amount of employment imputation, particularly in establishments’ early years in the data, and they show that annual changes in industry-by-county employment are only weakly correlated between QCEW and NETS (correlation 0.53).3 A handful of subsequent studies by Neumark and coauthors rely on the California NETS sample (e.g., Kolko and Neumark (2010)). Neumark et al. (2011) study business growth in the national sample of NETS but refer to the previous California-based results for background on representativeness.

3 Three-year changes in industry-by-county employment have a correlation of 0.86. Neumark et al. (2005) further study the business dynamics properties of NETS by obtaining founding date data on two sets of businesses (San Francisco phone listings and BioAbility biotech records) and comparing these external data sources with NETS. The authors find that NETS founding dates are accurate about three quarters of the time and are within two years of accuracy about 90 percent of the time; they do not investigate firm size at birth.

Another key reference for understanding representativeness in NETS is Choi et al. (2013) (and the follow-up paper Choi et al. (2017)), which uses the Georgia NETS extract. The authors focus primarily on comparisons with CBP and, importantly, use industry codes, establishment size, and legal status criteria to create a NETS universe that is roughly consistent with CBP scope. We follow a similar approach in the present study, creating a NETS sample that is restricted to match CBP scope as closely as possible. Another significant investigation is Echeverri-Carroll and Feldman (2017), who use secretary of state data for Austin, Texas and the North Carolina Research Triangle to validate NETS founding dates; after extensive efforts using automated and hand matching, the authors find that about half of NETS founding dates match secretary of state founding dates (which reflect business formation applications), with 75 percent of matches being accurate within two years and 80 percent of matches being accurate within three years. These results confirm the concerns of Neumark et al. (2005) and Haltiwanger et al. (2013) about recognition of founding dates and annual-frequency dynamics.4 A particularly important contribution of Echeverri-Carroll and Feldman (2017) is the direct comparison of two different NETS vintages, allowing for a study of revision history; the authors find significant establishment additions between the 2013 and 2014 NETS vintages, with large revisions extending back more than four years from the end of the time series. This suggests the need for caution in interpreting data near the end of a NETS sample (the authors argue for dropping the last two years of coverage).5

Exploration of the national NETS sample has been more limited. Mach and Wolken (2012) use NETS along with the Survey of Small Business Finance (SSBF), which uses the DMI as its sampling frame; the authors restrict attention to NETS records matched to the 2004 SSBF sample, and they note that there are some employment discrepancies between NETS and D&B-based SSBF data. Amezcua (2010) uses the national NETS file but restricts it to records that can be identified as startup incubators (relying in part on external data). Acs et al. (2008) use the DMI files on which NETS is based, constructing their own longitudinal version of the data and studying high-growth firms. In principle this independently constructed longitudinal file should be similar to NETS, though no direct comparison has been undertaken. These authors rely heavily on business dynamics data in the DMI, raising some concerns based on high-frequency limitations of the data found in other studies. Greenstone and Mas (2012) use an extract of the national NETS file, providing some limited comparisons with official sources, though they switch to the LBD in subsequent revisions (Greenstone et al. (2014)). Aside from the above, we are not aware of any significant attempts to benchmark the national NETS file to official data sources.
4 It is important to note, though, that founding date is a difficult concept more generally given movement between employer and nonemployer universes (see, e.g., Davis et al. (2009)).

5 Several other studies use state-specific NETS extracts. For example, Cromwell (2015) studies Miami-Dade county living wage contracts using the Florida NETS extract. Currie et al. (2010) and Groizard et al. (2015) use the California extract to study fast food locations and manufacturing establishments, respectively. Donegan (2014) studies biomedical businesses in the North Carolina Research Triangle.

3 Methodology and Results

We compare NETS to various official data sources and try to understand, to the extent possible, the sources of discrepancies. We focus on finding ways to align NETS with CBP and QCEW. Error is present in all data sources, but CBP and QCEW have important advantages relative to NETS. These programs are based on consistent, well-documented methodology, with the explicit goal of producing representative data. While the coverage and methods may not always be ideal, it is invaluable to have data generated by a well-understood collection process. In contrast, the source data for NETS are collected as a consequence of D&B’s other business processes. As such, the data collection methodology will likely never be as transparent or longitudinally consistent as CBP or QCEW.

That said, NETS is certainly an impressive effort by the private sector to construct a research-ready database, and we will argue that it can complement official data sources with some caveats. NETS provides geographic detail and other features that are not available in the most easily accessible government datasets. To the extent that NETS is shown to agree with official data on common scope concepts, we can feel confident extending analyses using the unique features of NETS. For example, CBP has been used at the county level to estimate storm effects (e.g., Bayard et al. (2017)); these kinds of analyses can be extended to sub-county geographies using NETS, as in Crane et al. (2017). This extension is validated by the fact that NETS and CBP are largely in agreement with respect to the geographic distribution of businesses.

3.1 Official data sources

We use three official sources: County Business Patterns (CBP), Nonemployer Statistics (NES), and the Quarterly Census of Employment and Wages (QCEW). All of these data sources are freely available to the public, and these comprise the main official sources of information about the business universe in the U.S. Note that fine aggregations of public use CBP and QCEW files are occasionally censored to protect confidentiality. In these cases, we impute employment numbers using the national average for the relevant establishment size cell. This censoring occurs only for employment, so establishment counts are unaffected.

As noted above, NETS uses different employment and establishment concepts and timing than official data sources. The most well-known government data sources, including CBP and QCEW, cover nearly all workers who receive a regular paycheck but exclude many business owners, the self employed, and independent contractors. NETS, in principle, includes all of these groups, making its employment and establishment universes proper supersets of CBP and QCEW. In addition, CBP excludes most government employment, and many QCEW tabulations exclude all government employment, whereas NETS includes government. Though there is no explicit ownership code that distinguishes government from private establishments in NETS, most government establishments can be flagged in NETS by using NAICS codes and firm linkages, as we discuss below. Finally, annual data from official sources reflect snapshots at a certain time of the year (March 12 in CBP, a date that can also be observed in the monthly QCEW data). NETS data can be collected at any time of the year, and the timing is not disclosed in the data.

3.1.1 County Business Patterns (CBP)

The Census Bureau’s CBP program provides (publicly available) annual tabulations of establishment counts, employment counts, and payroll by geography, industry and establishment size class based on mid-March snapshots (i.e., employment information reflects the payroll period including March 12). We focus on post-1997 CBP data, after the program switched from SIC industry classification to NAICS.
The source data for CBP is the Census Bureau’s Business Register, which is in turn built from federal business tax records, surveys, and the Economic Census (conducted in years ending in 2 and 7).6 Importantly, the Business Register is based on IRS and Social Security Administration (SSA) lists of all known businesses in the U.S., and employment data are derived from these federal sources.7 Single-establishment firms are efficiently covered with tax records supplemented at times with industry and location information from survey and census data. Information on multi-establishment firms also comes from tax records supplemented by census and survey data. The Economic Census and the annual Company Organization Survey are particularly critical for tracking multi-unit status and distributing employment across establishments.

6 The Business Register is also the source for the Longitudinal Business Database (LBD) and the Business Dynamics Statistics (BDS), the workhorse datasets for the study of business dynamics in the U.S. (see Jarmin and Miranda (2002)). Access to LBD microdata (as well as Business Register files) requires an approved research proposal and special sworn status, while the BDS (which consists of various aggregations of LBD data) is publicly available. BDS data are constructed to match the scope of CBP. See DeSalvo et al. (2016) for details on the construction of the Business Register.

CBP covers nearly all non-government employer businesses; the exact coverage exceptions are listed in Section A.1 in the appendix. Employment includes all wage and salary workers, both full- and part-time, and excludes proprietors, partners, independent contractors, and temporary help service workers employed by outside establishments (the latter of which are included in the establishment that issues their paycheck rather than the establishment where they work). In addition to excluding self-employed individuals, the CBP excludes private households, railroads, agricultural production employees, and government employees. A popular misconception is that CBP excludes sole proprietorships; some sole proprietorships have payroll employees and therefore can appear in the CBP. The files we use are limited to the 50 states and the District of Columbia.

3.1.2 Nonemployer Statistics (NES)

The Census Bureau’s NES shares the same industry scope as the employer statistics used in the CBP and is thus the nonemployer counterpart to CBP. All entities with taxable business income but no employees comprise the total potential set of nonemployers, but Census procedures remove those nonemployer tax entities that can be connected to multi-unit employer businesses as well as regulated investment companies such as mutual funds.
Further removals are based on revenue thresholds that vary by legal form; businesses with less than $1,000 in revenue are dropped, as are businesses above certain thresholds ($1 million for non-services corporations and partnerships, $2 million for services corporations and partnerships, and industry-dependent cutoffs for sole proprietorships).8 In principle, these thresholds could give rise to discrepancies between Census data and NETS to the extent that the latter captures nonemployers with extremely low or high revenue.

NES is available in county-by-industry aggregations. Note that the union of the CBP and the NES, roughly speaking, comprises the universe of U.S. businesses as known to the IRS, with the exception of businesses in out-of-scope industries.9

7 The Business Register’s reliance on federal tax data for employment information contrasts with the BLS employer universe files, which rely on separate state unemployment insurance data (as we detail below). However, the Business Register is supplemented with some geographic and industry information from state-based records provided via BLS.

8 Changes in nonemployer screening mechanisms were implemented in 2009 such that there is a modest break in the time series at that time. The new screening mechanisms have not been retroactively applied to previous years with the exception of 2008, in which the changes affected about 0.2 percent of firms.

9 The source data for the NES is also used to construct the confidential Integrated Longitudinal Business Database (ILBD), in which nonemployer businesses are linked longitudinally and combined with the employer universe longitudinal file, the LBD (see Davis et al. (2009), who describe the ILBD and document considerable movement of businesses between the nonemployer and employer universes).

3.1.3 Quarterly Census of Employment and Wages (QCEW)

The QCEW is based on the BLS’ independent counterpart to the Census Bureau’s Business Register. The BLS data are derived from state unemployment insurance (UI) records supplied by State Workforce Agencies; BLS collects monthly data on employment (collected quarterly but covering the pay period including the 12th of each month) from these state sources. The BLS supplements UI records with frequent surveys of multi-establishment firms as well as a rotating survey on industry and geographic information. Inclusion in the QCEW is based primarily on whether an organization is part of the UI system; this is mandatory for most for-profit businesses but optional for some nonprofits.10 Employees in QCEW, as in CBP, are wage and salary workers (both production and supervisory, with few exceptions); proprietors and other self-employed individuals (who have no employees), independent contractors, some farm and household workers, and externally supplied temporary workers are not included (though the latter are counted as employees of the agencies that supply them). QCEW includes some non-employers in establishment counts, since UI accounts that had paid employees in previous quarters are sometimes retained in the state databases after becoming nonemployers. As is the case for CBP, it is a misconception that QCEW excludes all sole proprietorships; those who hire payroll employees are typically subject to the UI system and are therefore likely to appear in QCEW. High-level QCEW tabulations include government employment, though in our work we always restrict to the private sector. We also restrict the QCEW sample to the annual observation covering March 12th to be consistent with CBP timing. As we note above, record timing in NETS is unknown and can vary by establishment and year.

10 Note that the UI source data underlying the QCEW are also used to construct the publicly available BLS Business Employment Dynamics data as well as the Census Bureau Longitudinal Employer-Household Dynamics data and their public-use aggregate, the Quarterly Workforce Indicators.

In terms of industry coverage, QCEW is neither a superset nor a subset of CBP. A detailed list of QCEW industry coverage is in Section A.2. Becker et al. (2005) provide further discussion of differences between QCEW and CBP and show that there remain small but nontrivial discrepancies between the two sources (in terms of both establishment and employment counts) even after harmonizing industry scope. An advantage of the U.S. statistical system is the availability of both QCEW and CBP, both of which serve as universe files (subject to minor scope restrictions) yet provide almost entirely independently generated information about U.S. business activity.

Figure 5 in the appendix reports aggregate employment in CBP and QCEW in samples that restrict both data sources to industries in the intersection of their respective industry scopes. These scope restrictions include complete omission of industries that are partially out of scope since determining the exact overlap between the two sources is impossible in such industries. We see in Figure 5 that CBP has higher employment than QCEW, and this does not change markedly over time.
Tables 6 and 7 in the appendix show these differences by size class, where it is apparent that employment discrepancies between CBP and QCEW reflect, primarily, measurement of large establishments and, secondarily, measurement of very small establishments. Multi-establishment firms in the QCEW are measured more frequently and with different survey designs than in CBP, and precise measurement of small establishments is difficult in any source due to the grey area between employer and nonemployer status. In addition, it is known that QCEW includes some zero-payroll establishments, which are entirely excluded from CBP. The differences between CBP and QCEW are fairly sensitive to moderate changes to scope rules. Further investigation of these differences is beyond the scope of this paper; see Fairman et al. (2008) for more detailed discussion. While these differences are significant and warrant further research, we will see that the differences between NETS and either CBP or QCEW are an order of magnitude larger.

3.2 Analysis samples

To address the differences in coverage and definitions outlined above, we construct several analysis samples from the full NETS files.

Unrestricted

This sample includes all establishments and workers in NETS. Our only modification to the data is to merge certain NETS establishments to make the establishment concept closer to that of CBP and QCEW by locating NETS “lines of business”, which represent different portions of an enterprise that operate at the same location. We identify dunsnumbers (i.e., lines of business) that have the same hqduns (headquarters identifier), 5-digit zip code, and first five street address characters (this latter item amounts to the building number and the first one or two characters of the street name). The purpose of this filter is to identify dunsnumbers that are in the same building and the same firm. We want to abstract from slight variations in the address (e.g. “ST” vs “STREET”) and differences in suite numbers. Thus, we rely on the zip code and the truncated address to make the match, and the matching firm criterion precludes the spurious matching of independent businesses that operate in the same building. We merge (sum) the employment of appropriately matched lines of business, treating them as a single establishment and assigning the merged establishment the NAICS code of the largest line of business (in terms of employment). This should roughly correspond with the principal activity industry concept observed in official data.

Baseline-CBP and Baseline-QCEW

Beginning with the unrestricted sample (in which lines of business have already been merged), we first identify and exclude government establishments as those with NAICS 92, those with a headquarters that has NAICS 92, or those in a firm where more than half of establishments have NAICS 92. Again, this is motivated by the fact that CBP and most QCEW statistics exclude government establishments and employment. We then restrict the industry scope of NETS to match each respective official data source, resulting in -CBP and -QCEW variants of the baseline sample (the coverage details are in Section A of the appendix).

In addition, it is worth noting that several industries are partially covered by QCEW. The extent of undercoverage is not documented in detail, so we completely exclude these industries from both QCEW and the baseline-QCEW NETS sample. See Section A of the appendix for details.

Unmerged Lines of Business (Unmerged LoB-CBP and Unmerged LoB-QCEW)

These samples are identical to the baseline samples (including CBP or QCEW industry scope restrictions) but retain all NETS lines of business as separate establishments. As with the baseline samples, we construct two variants that mimic the coverage of CBP and QCEW respectively.

Excluding NAICS 61 (Ex. 61-CBP and Ex. 61-QCEW)

As we will discuss, it appears that many large public universities and school systems that are outside the scope of official data are not captured by our government filter. It is difficult to distinguish public educational institutions from private ones, so we instead drop all NAICS 61 establishments from the baseline samples and make comparisons against similarly restricted QCEW and CBP samples. As with the baseline samples, there are two variants that
mimicthecoverageofCBPandQCEWrespectively. 3.3 Treatmentofemployment As has been noted, NETS appears to include business owners in their employment counts. Neumarketal.(2005)proposesubtracting1fromtheemploymentofeachestablishmentto align NETS employment with the standard wage-and-salary employment concept. When the establishment only reports a single worker, it is presumed to be a nonemployer and should not be counted in the employer universe. Thus we can estimate the count of employerestablishmentsbydroppingallNETSestablishmentsreportingasingleworker. This isnotaperfectrule, astherelikelyexistmanytrueemployerbusinesseswithonlyoneemployee;however,asweshowbelow,thesignificantportionofthedifferencebetweenNETS andofficialsourcesisfoundamongthesmallestestablishments,andofficialsourcesindicate that the nonemployer universe is large compared to the small-employer universe. As such we believe that identifying nonemployers in this way implies less measurement error than treatingthemasemployers. We modify the Neumark et al. (2005) rule slightly, subtracting 1 from the employment of each headquarters establishment rather than all establishments. This is motivated by the presumption that the non-payroll owner would work only at the firm headquarters. In practice, the outcomes of using our rule are nearly indistinguishable from those using the broaderNeumarketal.(2005)rulesincemostfirmshaveonlyoneestablishment.11 This adjustment can be implemented in any of the samples listed above, so for each of thefoursamplesweobtaintwoestablishmentcountsandtwoemploymentcounts. Werefer totheadjustednumbersaspayrollemploymentandpayrollestablishmentssincetheyapproximatethepayrollworkerconceptofofficialsources. TherawNETScountsarereferred toasrawemploymentandrawestablishments. 11Itisalsopossibleforfirmstohavemultipleowners,thoughtheprevalenceofsoleproprietorshipssuggests thatfurtherrefinementsofourmethodologywouldnotsubstantiallychangetheresults. 15
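The line-of-business merge (Section 3.2) and the owner adjustment (Section 3.3) can be illustrated with a minimal pandas sketch. This is not the authors' code. Only `dunsnumber` and `hqduns` are named in the text; the columns `zip5`, `address`, `naics`, and `emp` are hypothetical stand-ins, and the sketch assumes that a headquarters record is one whose `dunsnumber` equals its `hqduns`, which the paper does not state.

```python
import pandas as pd


def merge_lines_of_business(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse lines of business sharing a firm, zip code, and truncated address."""
    df = df.copy()
    # First five address characters: building number plus the start of the street name.
    df["addr5"] = df["address"].str[:5]
    # Assumption (not from the paper): headquarters records have dunsnumber == hqduns.
    df["is_hq"] = df["dunsnumber"] == df["hqduns"]
    # Put the largest line of business first in each group so "first" picks its NAICS.
    df = df.sort_values("emp", ascending=False, kind="mergesort")
    return (
        df.groupby(["hqduns", "zip5", "addr5"], sort=False)
        .agg(naics=("naics", "first"), emp=("emp", "sum"), is_hq=("is_hq", "max"))
        .reset_index()
    )


def payroll_adjust(estabs: pd.DataFrame) -> pd.DataFrame:
    """Subtract one presumed owner at each headquarters and drop nonemployers."""
    out = estabs.copy()
    out["payroll_emp"] = out["emp"] - out["is_hq"].astype(int)
    return out[out["payroll_emp"] > 0].reset_index(drop=True)


# Toy records, invented for illustration only.
toy = pd.DataFrame({
    "dunsnumber": ["A1", "A2", "B1", "C1"],
    "hqduns":     ["A1", "A1", "B1", "A1"],
    "zip5":       ["20001", "20001", "20001", "20002"],
    "address":    ["100 MAIN ST", "100 MAIN STREET STE 2", "100 MAIN ST", "55 OAK AVE"],
    "naics":      ["5411", "5412", "7225", "5411"],
    "emp":        [10, 30, 1, 1],
})
merged = merge_lines_of_business(toy)
payroll = payroll_adjust(merged)
```

In this toy input, A1 and A2 collapse into one 40-worker establishment that takes the NAICS code of the larger line of business. B1 shares the building but belongs to a different firm, so it stays separate and is then dropped as a presumed nonemployer. C1 keeps its single worker because it is not a headquarters.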

[Figure 1: Aggregate establishment counts. Line chart by year, 1998 to 2014; vertical axis in millions of establishments. Series: NETS raw establishments, NETS payroll establishments, Census all establishments, Census payroll establishments. Source: NETS database, County Business Patterns, Census Nonemployer Statistics. Note: NETS sample restricted to CBP industry scope.]

3.4 Aggregate activity in NETS and Census data

We first focus on comparing NETS with Census data sources. The advantage of focusing first on Census sources is that Census provides data on both the employer universe (CBP) and the nonemployer universe (NES), the union of which is (in principle) the universe targeted by D&B.

Figure 1 plots establishment counts for NETS and for Census data, excluding government. The NETS sample is the baseline-CBP sample, with the thick red line showing raw establishment counts and the thin black line showing payroll establishment counts. “Census payroll establishments” (dashed, thin black line) reflects the payroll employer universe of CBP, and “Census all establishments” (dashed, thick red line) reflects the union of the CBP payroll employer universe and the NES nonemployer universe. In principle, the latter union is conceptually equivalent to the NETS universe.

Comparing the dashed lines, it is evident that the number of nonemployer establishments dwarfs the number of employers in official statistics. The count of raw NETS establishments falls somewhere between the two Census totals, trending closer to the total Census universe in recent years. This suggests that NETS covers more than the employer universe but fails to cover the union of the employer and nonemployer universes, a point noted by Haltiwanger et al. (2013).

In the early years of the sample, the total NETS universe was somewhat too high to match the Census employer universe but far too low to match the total Census business universe. Notably, though, NETS coverage appears to have expanded in recent years. It is unclear whether this reflects changes in targeted scope or truly improved coverage, though we can likely rule out the possibility that the actual U.S. business universe expanded at the rate indicated by NETS over that period. Figure 1 should caution researchers against interpreting rising NETS establishment counts since 2000 as reflecting a surge in business entry.

It is also puzzling that NETS counts many more payroll employer establishments (thin, solid black line) than CBP. CBP records are based on IRS and SSA tax data, so there is limited room for mismeasurement. The recent gap implies that about 8 million establishments reported positive employment to D&B but had no wage or salary employees for IRS purposes. We will return to this issue in our discussion of imputation.

Figure 2 plots the employment levels corresponding to the establishment counts in Figure 1. The “Census payroll employment” line is total CBP employment, and the “Census all employment” number sums payroll employment from CBP with the number of business owners in CBP and NES, under the assumption that there is one (non-payroll) owner per establishment.12 Consistent with Figure 1, NETS payroll employment is significantly higher than Census payroll employment.
The raw employment numbers (thick red lines) are somewhat closer, but the NETS values are still higher. It is not entirely clear why raw NETS establishment counts are lower than total Census universe establishment counts (see Figure 1), while the corresponding NETS employment counts are above the Census counterparts. It appears that while NETS does not cover all the establishments that Census does, NETS must be measuring or imputing higher employment to those establishments it does cover.

12 This differs slightly from our owner adjustment in the NETS data, where we subtract one owner from each firm. With CBP and NES alone we cannot implement this adjustment because there are no firm data. In principle we could supplement CBP with firm counts from the BDS, but the difference between the adjustments is very small since most firms only operate a single establishment.

[Figure 2: Aggregate employment. Line chart by year, 1998 to 2014; vertical axis in millions. Series: NETS raw employment, NETS payroll employment, Census all employment, Census payroll employment. Source: NETS database, County Business Patterns, Census Nonemployer Statistics. Note: NETS sample restricted to CBP scope.]

To summarize, NETS establishment counts fall between the Census employer count and the Census count of all business establishments. The NETS counts begin near the Census employer level in the 1990s but are now closer to the Census employer-nonemployer union. In terms of employment, NETS is consistently above Census, though the raw employment totals are not dramatically different. Taken together, these findings imply that the average payroll establishment size in NETS is smaller than in Census. We will turn to establishment size in the next section.

3.4.1 Discussion

There are many possible explanations for these patterns. In the appendix, Figures 7 and 8 show that adding government establishments back into NETS and treating lines of business as separate establishments make almost no difference for the establishment count discrepancies, and Figures 9 and 10 show that the patterns for employment are similarly robust.

Another possibility is that the difference between NETS and CBP payroll establishment counts reflects nonemployer establishments that make use of informal workers (such as family), independent contractors, or externally supplied temporary workers; however, this would not explain why total (employer and nonemployer) NETS establishment counts are below the Census business universe. Businesses making use of informal workers may be reluctant to report them to D&B, suggesting that informality is not likely to be quantitatively significant. Instead, below we document high rates of employment imputation among small establishments in NETS, indicating that D&B is not actually receiving employment data from many small businesses. It is therefore unlikely that NETS provides a more accurate count of formal and informal employment.

The problem of independent contractors and temporary workers may be more salient, however, as such workers might be double counted in NETS (i.e., they may be counted both in the establishment in which they work and the establishment that pays them). In unreported results, we construct NETS versus CBP comparisons in which we omit NAICS 56 (which includes temporary help services as well as other commonly contracted business services such as landscaping, janitorial, and security services); the omission of this sector does not materially affect the gap between NETS and CBP.

In what follows, we focus on trying to match NETS payroll employment and payroll establishment totals to official data, rather than raw employment and raw establishment counts, since it is apparent from Figure 1 that NETS does not cover the employer-nonemployer union. That is not surprising given that D&B does not have access to comprehensive administrative data; larger establishments are likely easier for D&B to locate and perhaps more willing to answer its questions. Therefore, we choose to focus on matching the payroll statistics, dropping presumed nonemployers from the analysis.

We also considered the possibility that the Census nonemployer figures are incorrect. Measuring nonemployers can be difficult, and one has to distinguish between “true” nonemployers and tax entities that may be part of larger firms or financial vehicles.13 It is conceivable that, for some definition of nonemployer firms, Census counts too many and the NETS figures are closer to truth. We are not aware of any corroborating evidence for this hypothesis, and with the data in hand we cannot test it; however, we note that Census does make serious attempts to restrict the nonemployer universe with linking attempts and revenue cutoffs. Moreover, if Census mismeasurement explains the gap, it must be time-varying mismeasurement since the gap has narrowed. Regardless, the nonemployer universe is a difficult set of businesses to understand, so we feel comfortable restricting attention to payroll employers.

3.5 Size distributions in NETS and CBP

To get a better sense of where NETS differs from official data, Table 1 compares payroll employment and payroll establishment counts by size class for 2000, 2007, and 2014. For each year, we report the difference between NETS counts and CBP counts as a percent of CBP counts; for example, the top left cell shows that in 2000, NETS had 82 percent more employment than CBP among establishments with fewer than 5 employees.14 Several patterns stand out. First, consistent with the discussion in the previous section, NETS employer activity has generally grown over time compared with CBP.
The “Aggregate” row shows that in 2000 NETS had 16 percent more employment and 33 percent more establishments than CBP. By 2014, NETS had 20 percent more employment and more than twice as many establishments as CBP.

Table 1: NETS versus CBP by establishment size (percent difference)

Size class (employees)             2000              2007              2014
                                 Emp.   Estab.     Emp.   Estab.     Emp.   Estab.
1 to 4                          82.20    53.85   113.28   130.21   195.87   241.87
5 to 9                          15.89    15.24    19.47    18.30    44.21    41.04
10 to 19                         4.68     2.94     5.80     4.15     1.29     0.43
20 to 49                         8.23     5.93     8.82     6.11     1.85    −0.53
50 to 99                         9.98     9.38     8.36     8.07    11.59     9.77
100 to 249                      −0.50     0.66    −2.29    −0.27     2.59     4.45
250 to 499                      −0.21     0.21    −0.97    −0.07    −3.75    −2.60
500 to 999                       9.16     9.30    −1.23     0.42    −5.90    −4.88
1000+                           49.36    39.18    13.01     9.80    18.12     3.69
Aggregate                       16.23    33.18    13.08    75.66    19.49   139.79
Aggregate ex. less than 10      11.93     4.61     5.10     4.87     4.75     1.33
  employees
Aggregate ex. less than 10       5.03     4.48     3.63     4.86     2.20     1.32
  and greater than 1000
  employees

Source: NETS and CBP.
Note: Difference between NETS and CBP employment as percent of CBP employment. NETS sample restricted to CBP scope.

The table shows that these differences are primarily driven by small establishments. In 2014 NETS had more than three times as many 1-to-4-employee establishments as CBP, with almost three times as many employees.15 In the same year NETS had about 40 percent more activity in the next establishment size up, those with 5 to 9 employees. More generally, small establishments account for the bulk of the difference between NETS and CBP, and these small establishments also appear to account for much of the expansion of the discrepancy since 2000. This can be seen on the row marked “Aggregate ex. less than 10 employees”, which reports the comparison excluding establishments with fewer than 10 employees from both datasets. In this sample, the differences between NETS and CBP are minor and have not worsened over time. The largest class of establishments, those with 1,000 or more employees, also exhibits some discrepancy. The final row of the table shows that omitting these largest establishments in addition to the smallest ones improves the match modestly, particularly in 2000.

There are a number of possible explanations for the wide disparity among small establishments. We believe that our construction of NETS payroll establishment and employment counts depends on reasonable assumptions about proprietors, but it does introduce minor errors in cases of multiple working (but non-payroll) owners, cases of absentee owners, and cases of on-the-payroll owners (i.e., paycheck-receiving owners who should not be subtracted from employment counts). These errors are less problematic for employment-based comparisons than for establishment count-based comparisons, however, and the divergence from official sources is evident even in the employment numbers. That said, measuring very small businesses (particularly very new businesses, which tend to be small) is a difficult challenge even for official statistical agencies, since there is a fair amount of movement between the employer and nonemployer universes (Davis et al. (2009)) and small businesses may be more likely to have periods of inactivity.16 Excluding the small size classes results in a match between NETS and CBP that is not substantially worse than the match between CBP and QCEW (shown on Tables 6 and 7 in the appendix), and the same holds for the match between NETS and QCEW (which we report on Table 8 in the appendix).

The most likely explanation for the divergence among small establishments is the prevalence of imputation in these size classes. Direct contact with the business is an important source of D&B’s employment data for these smallest establishments. If the business cannot be contacted or does not answer questions, D&B can be forced to impute missing employment values using cross-sectional information (e.g., establishment location or industry). Walls & Associates reviews these imputations and adjusts them where longitudinal establishment linkages provide information missing from the cross-sectional imputation.

Table 2 presents employment imputation rates.17 Employment is often imputed for the establishments in the smallest size classes (and, of course, size class itself is a function of the potentially imputed employment count).18 In 2014, more than two-thirds of the employment values for the smallest size class are imputed, and more than one-third of the values for the 5-9 employee class are imputed. Imputation rates in the smallest size class have risen dramatically since 2000, the same period over which excess NETS employment and establishment counts (versus CBP) have risen. The obvious conjecture is that nonemployers are being imputed with positive payroll employment (NETS employment greater than 1), which causes us to treat them as employers. Under these conditions, the simple fix of subtracting 1 from firm employment is not sufficient for satisfactorily reconciling NETS with official employer numbers. Researchers should be cognizant of large measurement error among very small establishments generally, particularly when studying the post-2000 rise of small-establishment activity in NETS.

Table 2: NETS imputation rates by establishment size (percent)

Size class          2000    2007    2014
1 to 4             41.66   54.85   71.88
5 to 9             21.56   20.17   37.82
10 to 19           19.46   16.82   16.98
20 to 49           17.94   13.83    9.19
50 to 99           16.82   13.75    7.37
100 to 249         14.70   12.89    7.19
250 to 499         18.65   14.91    9.84
500 to 999         18.48   17.77   11.61
1000+              23.74   21.68   15.67

Source: NETS.
Note: Establishments with imputed NETS employment as a percent of total NETS establishments by establishment size. NETS sample is restricted to CBP scope but does not merge lines of business.

13 See https://www.census.gov/programs-surveys/nonemployer-statistics/technicaldocumentation/methodology.html for the methodology Census uses.
14 Table 8 in the appendix reports similar exercises but for QCEW instead of CBP.
15 Note again that we have already subtracted one employee from every NETS firm and dropped establishments whose resulting employment is zero.
16 As noted above and shown on Tables 6 and 7, nontrivial discrepancies exist even between CBP and QCEW in the smallest size classes.
17 These imputation rates necessarily refer to the NETS sample in which lines of business are not merged.
18 In related work in progress, we find that a significant number of establishments see multiple consecutive years of imputation.
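The two statistics reported in Tables 1 and 2 can be sketched as follows. The code is illustrative only and is not the authors' implementation. The column names `payroll_emp` and `emp_imputed` (a boolean imputation flag) are hypothetical, and binning establishments by payroll employment is an assumption, since the text does not say which employment measure defines the size classes in Table 2.

```python
import pandas as pd

# Size-class edges matching the rows of Tables 1 and 2 (left-closed bins).
SIZE_EDGES = [1, 5, 10, 20, 50, 100, 250, 500, 1000, float("inf")]
SIZE_LABELS = ["1 to 4", "5 to 9", "10 to 19", "20 to 49", "50 to 99",
               "100 to 249", "250 to 499", "500 to 999", "1000+"]


def size_class(emp: pd.Series) -> pd.Series:
    """Assign each establishment to a size class based on its employment."""
    return pd.cut(emp, bins=SIZE_EDGES, labels=SIZE_LABELS, right=False)


def pct_diff(nets: pd.Series, cbp: pd.Series) -> pd.Series:
    """NETS minus CBP, expressed as a percent of the CBP value."""
    return 100.0 * (nets - cbp) / cbp


def imputation_rate(estabs: pd.DataFrame) -> pd.Series:
    """Percent of establishments in each size class with imputed employment."""
    classes = size_class(estabs["payroll_emp"])
    return 100.0 * estabs["emp_imputed"].groupby(classes, observed=True).mean()


# Toy inputs (invented numbers, not NETS or CBP data).
toy = pd.DataFrame({
    "payroll_emp": [1, 2, 3, 4, 7, 8],
    "emp_imputed": [True, True, True, False, False, True],
})
rates = imputation_rate(toy)
gaps = pct_diff(pd.Series({"1 to 4": 300.0, "5 to 9": 141.0}),
                pd.Series({"1 to 4": 100.0, "5 to 9": 100.0}))
```

With these invented inputs, a NETS count three times the CBP count yields a percent difference of 200, which is how the 2014 entry of roughly 242 for the smallest class corresponds to "more than three times as many" establishments.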

3.6 Industry distributions in NETS and CBP

To shed further light on differences between NETS and CBP, Table 3 reports employment comparisons by 2-digit NAICS sector. As in our size-based comparisons, we report the difference between NETS and CBP as a percent of CBP levels. For each of 2000, 2007, and 2014, we report the differences both for the full baseline samples (“All”) and for the sample excluding establishments with fewer than 10 workers (“Excl <10”). Within industries, omitting small establishments improves the match in about half of cases.

Table 3: NETS versus CBP by sector (percent difference)

Industry                             2000              2007              2014
                               Excl <10      All Excl <10      All Excl <10      All
11 Ag., For., Fish., Hunt            74       68       67       66       71       73
21 Mining                            70       69       −2        3       −8       −5
22 Utilities                        −40      −37      −39      −35      −46      −40
23 Construction                      −8        3       −5        3        7       23
31-33 Manufacturing                  32       34       39       43       50       54
42 Wholesale Trade                    7       15        7       17       −4        8
44-45 Retail Trade                  −11       −2      −11       −1       −3        4
48-49 Trans., Warehous.              15       19       −6        0      −19       −7
51 Information                       15       20       15       21       10       18
52 Finance, Insurance                25       23        5       10        7       12
53 Real Est., Rent., Leas.           70       70       66       68       75       86
54 Prof., Sci., Tech. Svcs           35       39       12       17        5       17
55 Management                       −91      −90      −91      −89      −86      −83
56 Admin., Waste Mgmt               −34      −26      −39      −18      −47       −8
61 Education Svcs                   287      278      252      244      261      260
62 Health, Social Asst.               5        7       −8       −2       −9        1
71 Arts, Entertain., Rec.            33       44       15       27        2       22
72 Accom., Food Svcs                 −1        2       −8       −1      −14       −2
81 Other Svcs                        39       41       10       25        6       29

Source: NETS and CBP.
Note: Difference between NETS and CBP employment as percent of CBP employment by NAICS sector. NETS sample restricted to CBP scope.

Educational services (NAICS 61) is the worst-fitting sector, with NETS consistently reporting more than three times as many employees as CBP. This discrepancy has changed little over time and is likely due to our difficulty identifying large government-owned educational institutions. In QCEW-based comparisons below we find that omitting educational services results in dramatic improvements in aggregate comparisons.

The next significant (though less egregious) discrepancy exists in management of companies and enterprises (NAICS 55); according to the NETS documentation, D&B attempts to avoid using parts of this industry category and instead locates establishments in specific industry areas, resulting in much lower (by more than 80 percent) NETS employment in this sector. Other significant discrepancies exist in agriculture, forestry, fishing and hunting (NAICS 11), in which NETS shows higher employment even after our attempts to limit the sample to CBP scope (which omits much of NAICS 11); this discrepancy has varied only modestly over time.

NETS appears to poorly capture three noteworthy sectoral reallocations of the last 15 years. First, the shale oil and gas boom drove dramatic gains in U.S. mining (NAICS 21) activity starting in the mid-2000s. NETS significantly over-covers this sector in 2000 (by about 70 percent) then approximately matches it in 2007 (the year often chosen to mark the ramp-up of the shale boom) and 2014. While NETS mining coverage may appear to improve over the relevant years, it is far from clear that this coverage expansion actually reflects improved measurement since NETS coverage was so dramatically overstated at the beginning of the sample (and, importantly, this initial overstatement was not driven by small establishments).19

19 Oil and gas industries may be particularly sensitive to industry classification errors. Spot-checking exercises reveal that a number of oil and gas establishments are classified in media-related sectors, perhaps due to the word “production” appearing in their business names; we thank Maria Tito for sharing this discovery with us. This is likely a nontrivial problem but unlikely to be the main driver of the pattern of mining discrepancies. Additionally, since NAICS broad (2-digit) sector classifications often divide similar activities in ways that are effectively arbitrary relative to the set of activities engaged in by businesses, in unreported exercises we created an ad hoc “oil and gas” sector consisting of several relevant narrow industries in mining, manufacturing, transportation, professional services, and construction (designed to encompass oil and gas exploration, production, transportation, refining, and other processing) and compared NETS with QCEW in this specially designed sector. We find that the discrepancies observed in NAICS mining are not significantly altered by the development of this “oil and gas” sector; that is, NETS does appear to truly miss the shale oil and gas boom.

Second, NETS appears to miss much of the post-2007 construction “bust” that accompanied the housing crisis (NAICS 23). NETS coverage of the sector is reasonably good in 2000 and 2007, but by 2014 NETS overstates construction employment by just over 20 percent. This overstatement is largely accounted for by small establishments. Weak housing-related coverage may also be seen in finance and insurance (NAICS 52) and real estate and rental and leasing (NAICS 53), in which cases small establishments do not appear to be the main culprit.

Third, NETS seemingly misses some portion of the post-2000 drop in U.S. manufacturing employment (NAICS 31-33). NETS overstatement of the sector rises from about 35 percent in 2000 to about 40 percent in 2007 then to at least 50 percent in 2014. Small establishments account for only a modest portion of the overall overstatement and roughly none of the change over time.

It is important to note that industry classification is a notoriously difficult endeavor. Even in official sources, industry assignment is much less objective than geographic assignment (and even size assignment). As such, some differences in industry coverage between NETS and official sources may partly reflect subjective differences in industry assignment methods rather than substantive measurement error in NETS. While we are aware of little research on this topic, a notable exception is Isenberg et al. (2013). They compare Census Business Register establishment industry codes (i.e., the CBP source data) to the industry in which employees claim to work in the American Community Survey (ACS). Using links between the worker identifier and the establishment identifier, they find that the worker-reported industry matches the establishment-reported industry 75 percent of the time (at the 2-digit NAICS level). These results suggest that we should expect significant variation in industry assignment across data sources, even at the broadest levels of industry aggregation.
The previous section shows that NETS and CBP are reasonably aligned when the smallest size classes are omitted; the industry discrepancies described in this section may therefore partly reflect offsetting differences in subjective industry assignment. That said, the specific discrepancies mentioned here, particularly in mining, construction, and manufacturing, are cause for concern (while the differences in education likely reflect true scope differences).

3.7 Aggregate activity in NETS and QCEW

In this subsection we compare NETS time series to official data and show how excluding certain classes of establishments can improve agreement between the two. We began this section by focusing on comparisons between NETS and Census data primarily because of the availability of Census nonemployer data, which facilitate full-universe comparison. We now turn to QCEW comparisons. A key advantage of QCEW is that it is available in NAICS format beginning in 1992, so we can easily compare longer time series than with CBP. Moreover, we find that NETS tends to align with QCEW slightly better than with CBP. Recall that we create separate NETS samples meant to mimic the scope of CBP and QCEW, respectively. The better alignment of the NETS baseline-QCEW sample with QCEW suggests that the QCEW-covered industries may be covered more accurately in NETS than the CBP-covered industries. In any case, using CBP instead of QCEW leads to the same qualitative conclusions, and CBP-based versions of the figures in this subsection can be found in the Appendix.

For these comparisons, we restrict attention to employer establishments (i.e., we have already subtracted 1 from all NETS firms’ employment and dropped resulting nonemployers) because there is no nonemployer counterpart in QCEW. We focus on the baseline-QCEW sample of NETS that is consistent with QCEW industry scope.

Figure 3 shows time series comparisons of QCEW and NETS in terms of the total number of (employer) establishments and employees. The top row of Figure 3 shows the totals. Consistent with our CBP-based comparisons, NETS reports significantly more establishments and workers than QCEW and is therefore too large to represent the employer universe. Again, the gap between NETS and official establishment counts grows over time. Table 8 in the appendix shows that, as is the case with CBP, the key source of the discrepancy between NETS and QCEW is small establishments, with additional significant discrepancies among very large establishments. Therefore, the second row of Figure 3 restricts the data to establishments with more than 9 employees, and the third row restricts the data to establishments with more than 9 but fewer than 1000 employees. These restrictions reduce the discrepancy considerably; in particular, small establishments account for the dramatic widening of the discrepancy in recent years. However, under these restrictions NETS still counts more employees and more establishments than QCEW.

[Figure 3: NETS versus QCEW. Six panels by year (axis ticks 1990 to 2015), in millions, each plotting NETS and QCEW: number of employees and number of establishments (top row), the same excluding small establishments (second row), and the same excluding small and large establishments (third row). Source: NETS, QCEW. Notes: NETS sample is restricted to QCEW scope (sample Baseline-QCEW). "Small" establishments are those with fewer than 10 employees. "Large" establishments are those with 1000 or more employees.]

Table 9 in the appendix reports differences between NETS and QCEW by NAICS sector; similarly to Table 3 (which compares NETS with CBP), we find significant variation across industries and over time in terms of the discrepancy between NETS and QCEW. As noted with the CBP sector comparisons, a noteworthy sector is educational services (NAICS 61). This sector includes public and private schools as well as martial arts academies and related establishments. Many primary and secondary schools and public universities are not classified as government establishments according to our rule from the initial cleaning process. Thus many large educational establishments are included in our QCEW-comparable NETS sample while they would be excluded from the QCEW private employment figures. Figure 4 repeats Figure 3 but omits education services establishments (in both NETS and QCEW). As the bottom two rows of the figure show, large establishments in the education sector account for a significant portion of the discrepancy between NETS and QCEW, and when we omit both from NETS we are able to replicate QCEW establishment and employment counts reasonably well.

[Figure 4: NETS versus QCEW, education services (NAICS 61) excluded. Same six-panel layout as Figure 3. Source: NETS, QCEW. Notes: NETS sample is restricted to QCEW scope (sample Ex. 61-QCEW). "Small" establishments are those with fewer than 10 employees. "Large" establishments are those with 1000 or more employees.]

Figures 11 and 12 in the appendix replicate Figures 3 and 4 but for CBP data (with CBP scope), with similar results across the years that are common to both QCEW and CBP.

3.8 Comparing narrow cells

It is not enough for NETS, or a subset of NETS, to have the same employment levels and establishment counts as official sources. To have confidence that inferences from one source will carry over to the other, we should have more detailed agreement. To this end, we partition the NETS establishments into various cells and calculate the correlation of cell-level employment and establishment counts with official data. Table 4 presents the results for 2000, 2007, and 2014. The first column notes the official dataset being compared with NETS, the second column notes the level of cell aggregation, and the third column notes sample restrictions (in each case, NETS is restricted to match the scope of the official data to which it is being compared). Remaining columns report simple correlation coefficients.20 Empty correlation cells on the table indicate that the specified official dataset does not permit suitably accurate correlations at that level due to confidentiality censoring.21

The first and fourth rows of the table, corresponding to state-size-sector cells with all size classes in QCEW and CBP respectively, illustrate the problems of mapping the full NETS sample to official sources. Some of these correlations are around 0.5 or 0.6, even though our geography (state) is quite broad. As seen in previous exercises, the relationship between NETS and official sources weakens over time when all sizes are included. The second and fifth rows show, however, that exclusion of small establishments (fewer than 10 workers) both dramatically improves the correlation and significantly attenuates time variation in the match quality. Simply omitting small establishments boosts the correlation between NETS and official sources above 95 percent for establishment counts and above 80 percent for employment. Dropping large establishments (rows three and six) boosts employment correlations to around 90 percent.

20 For example, let $n_i^{\mathrm{NETS}}$ be the number of establishments that NETS counts in county-size-sector cell $i$, and let $n_i^{\mathrm{CBP}}$ be the CBP count of establishments in cell $i$. We report the correlation coefficient

\[
\rho = \frac{\sum_i \left(n_i^{\mathrm{NETS}} - \bar{n}^{\mathrm{NETS}}\right)\left(n_i^{\mathrm{CBP}} - \bar{n}^{\mathrm{CBP}}\right)}{\left[\sum_i \left(n_i^{\mathrm{NETS}} - \bar{n}^{\mathrm{NETS}}\right)^2 \times \sum_i \left(n_i^{\mathrm{CBP}} - \bar{n}^{\mathrm{CBP}}\right)^2\right]^{1/2}}
\]

where the summations and averages are taken over all county-size-sector cells $i$, and horizontal bars indicate averages. Employment correlations replace cell establishment counts with cell employment counts.
21 CBP does not report employment by size class at the county or zip code level, instead reporting employment for the geography-industry cell as a whole. Thus we do not calculate employment correlations for size-by-county or zip comparisons.
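The correlation in footnote 20 can be computed directly from two cell-level tables. The sketch below is a hedged illustration rather than the authors' code. The function name, the column names, and the decision to keep only cells present in both sources (an inner join) are assumptions; the paper does not say how unmatched cells are handled. Size-class exclusions would be applied to both inputs before calling the function.

```python
import numpy as np
import pandas as pd


def cell_correlation(nets, official, cell_keys, value):
    """Pearson correlation of a cell-level count between NETS and an official source."""
    # Assumption: only cells observed in both sources enter the correlation.
    cells = nets[cell_keys + [value]].merge(
        official[cell_keys + [value]], on=cell_keys, how="inner",
        suffixes=("_nets", "_official"),
    )
    x = cells[value + "_nets"].to_numpy(dtype=float)
    y = cells[value + "_official"].to_numpy(dtype=float)
    # Deviations from the cross-cell means, as in the footnote 20 formula.
    dx, dy = x - x.mean(), y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


# Toy state-size-sector cells (invented counts, not NETS or CBP data).
keys = ["state", "size_class", "sector"]
nets_cells = pd.DataFrame({
    "state": ["CA", "CA", "TX", "TX"],
    "size_class": ["10 to 19", "20 to 49", "10 to 19", "20 to 49"],
    "sector": ["44-45"] * 4,
    "estab": [120, 60, 90, 40],
})
cbp_cells = nets_cells.assign(estab=[100, 55, 95, 30])
rho = cell_correlation(nets_cells, cbp_cells, keys, "estab")
```

Because the statistic is invariant to rescaling either series, a high value says that the two sources rank and space cells similarly; it does not by itself say that levels agree, which is why the level comparisons in Tables 1 and 3 are reported separately.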

Theseventhrownarrowsthecomparisontocounty-basedaggregates;county-levelemployment is nearly perfectly correlated between NETS and CBP. To our knowledge, D&B does not attempt to match county aggregates, so these correlations are strikingly high.22 This is likely due in part to the fact that business activity is highly correlated with population,soanydistinctattemptstomeasurebusinessactivityshouldbehighlycorrelatedacross geography; but these results do lend credibility to NETS. As rows 8-10 show, however, the high county-level correlations apparently reflect offsetting misses in narrower (sector-bysize) cells, where correlation is somewhat lower (at this level of detail, only establishment counts are available). As with the state-based correlations, the county-size-sector correlations degrade over time when all size classes are included. However, when small establishments are omitted, the correlations are above 95 percent. The last three rows of Table 4 exploit zip code-level cells from CBP. Cells omitting small establishments see respectable, butnotoverwhelming,zip-size-sectorcorrelationsabove80percent.23 The fact that county aggregates are more correlated than county-size-sector aggregates isnoteworthy. Atfirstglance,offsettingmissesacrosssize-by-industrycellswithincounties may be cause for concern. However, as noted in previous exercises, industry is difficult to measure consistently. Unlike geography, industry categorization is necessarily subjective, and caution is always warranted when using industry codes in any microdata. High correlation across geography with somewhat lower correlation across industry likely reflects in part (perhaps large part) differences in industry assignment processes. 
Statistical agencies may be more consistent and rule based when assigning industry codes to individual businesses than is D&B; yet even the best industry measurement processes are subject to considerable ambiguity and error, and D&B does at least have profit incentives to identify businesses in an accurate, useful way. Size measurement is certainly less ambiguous than industry measurement, yet size is not always easy to pin down given the employer versus nonemployer distinctions discussed above. Moreover, seasonal business fluctuations can easily result in the movement of establishments across size bins, and NETS measurement timing is heavily vulnerable to seasonality. In any case, we do not interpret the correlation gap between county aggregates and county-size-industry aggregates as necessarily indicating quality problems in NETS (aside from our seasonality concern); a significant portion of this discrepancy may be attributable to benign differences in labeling.

22 We confirmed this in correspondence with Don Walls of Walls & Associates. Similar correlations have been noted by Neumark et al. (2005) and others.

23 These correlations are likely lower bounds since we have 5-digit zip codes from NETS while CBP provides tabulations by zip code tabulation areas (ZCTAs). ZCTAs are often the same as zip codes, but sometimes smaller zip codes are included under larger ZCTA identifiers. This mismatch in identifiers likely makes the measured correlations lower.

Table 4: Cell-based correlations

| Comparison | Cell | Exclusions | 2000 Emp. | 2000 Estab. | 2007 Emp. | 2007 Estab. | 2014 Emp. | 2014 Estab. |
|---|---|---|---|---|---|---|---|---|
| QCEW | State-Size-Sector | None | 0.82 | 0.79 | 0.88 | 0.61 | 0.81 | 0.49 |
| QCEW | State-Size-Sector | <10 | 0.81 | 0.97 | 0.89 | 0.97 | 0.86 | 0.96 |
| QCEW | State-Size-Sector | <10, >1000 | 0.91 | 0.97 | 0.91 | 0.97 | 0.87 | 0.96 |
| CBP | State-Size-Sector | None | 0.88 | 0.96 | 0.83 | 0.86 | 0.77 | 0.71 |
| CBP | State-Size-Sector | <10 | 0.88 | 0.97 | 0.84 | 0.97 | 0.82 | 0.96 |
| CBP | State-Size-Sector | <10, >1000 | 0.92 | 0.97 | 0.91 | 0.97 | 0.87 | 0.96 |
| CBP | County | None | 0.99 | 0.98 | 1.00 | 0.99 | 1.00 | 0.99 |
| CBP | Cty.-Size-Sector | None | | 0.94 | | 0.87 | | 0.73 |
| CBP | Cty.-Size-Sector | <10 | | 0.96 | | 0.97 | | 0.96 |
| CBP | Cty.-Size-Sector | <10, >1000 | | 0.96 | | 0.97 | | 0.96 |
| CBP | Zip-Size-Sector | None | | 0.89 | | 0.81 | | 0.65 |
| CBP | Zip-Size-Sector | <10 | | 0.80 | | 0.83 | | 0.84 |
| CBP | Zip-Size-Sector | <10, >1000 | | 0.80 | | 0.83 | | 0.83 |

Source: NETS, CBP, and QCEW.
Notes: Simple correlations of cell-level employment and establishment counts, NETS and official sources. Exclusions refer to excluded establishment size classes. QCEW comparisons use NETS sample restricted to QCEW scope. CBP comparisons use NETS sample restricted to CBP scope.

Table 4 suggests that NETS is in general agreement with official sources in terms of the distribution of economic activity across states, counties, establishment sizes, and industries when it is restricted to exclude the smallest establishments.
Considerable mismeasurement of small establishments is a definite problem in NETS (indeed, a much bigger problem than the well-known challenge of measuring small establishments in official sources). But our broad conclusion is that NETS can be made reasonably representative of the U.S. economy in terms of location, industry, and establishment size. This lends credibility to NETS-based studies of static business-level activity (in related work in progress, we find dynamic comparisons to be less appropriate). We therefore recommend that researchers ensure that NETS-based results are robust to sample restrictions that omit small establishments, and perhaps the largest establishments and the education sector as well.

An important question is how costly these sample restrictions are in terms of coverage. Table 5 reports the fraction of QCEW establishments and employment that fall within our restricted samples. Sample 1, which excludes small establishments and closes most of the gap between NETS and QCEW, still covers 85 percent of QCEW employment. Even the most restrictive sample, which excludes small and large establishments and the education sector, still covers 73 percent of workers. The coverage of establishments is, of course, much lower, since small establishments are much more common than large ones. Low establishment coverage may be a significant problem in some applications, but employment is typically the more important target of study given both its use as a measure of economic activity and the problems with counting establishments even in official data. The high employment share covered by our restricted samples is reassuring about the usefulness of NETS.24

Table 5: QCEW coverage of industry scope restrictions

| Sample | Share covered: Emp. | Share covered: Estab. |
|---|---|---|
| Sample 1: ex. less than 10 | 0.848 | 0.223 |
| Sample 2: ex. less than 10 & greater than 1000 | 0.743 | 0.223 |
| Sample 3: ex. less than 10, greater than 1000 & NAICS 61 | 0.727 | 0.219 |
| Memo: Private QCEW levels | 113,326,720 | 8,994,650 |

Source: QCEW.
Note: Share of QCEW employment accounted for by scope restrictions. 2014 data.

24 For comparison, another commonly used source of publicly available business microdata is Compustat, which reports detailed data on the universe of publicly traded businesses; the firms covered by Compustat account for well less than half of U.S. private sector employment (Davis et al. (2007)).
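The coverage shares reported in Table 5 are ratios of employment (or establishment counts) retained under a restriction to the unrestricted total. The sketch below shows that arithmetic on a toy list of establishments. The function `coverage_share`, the record layout, and all toy values are hypothetical; the size cutoffs follow the paper's definitions of small (less than 10 employees) and large (1000 or more employees).

```python
def coverage_share(establishments, min_emp=10, max_emp=None, excluded_sectors=()):
    """Return (employment share, establishment share) kept by a restriction.

    Each establishment is a dict with `emp` (employment) and `naics2`
    (two-digit sector as a string). Establishments with emp < min_emp,
    emp >= max_emp, or a sector in `excluded_sectors` are dropped.
    """
    total_emp = sum(e["emp"] for e in establishments)
    total_n = len(establishments)
    if total_emp == 0 or total_n == 0:
        raise ValueError("empty universe")
    kept = [
        e for e in establishments
        if e["emp"] >= min_emp
        and (max_emp is None or e["emp"] < max_emp)
        and e["naics2"] not in excluded_sectors
    ]
    kept_emp = sum(e["emp"] for e in kept)
    return kept_emp / total_emp, len(kept) / total_n


# Toy universe (hypothetical values, not QCEW data).
toy = [
    {"emp": 3, "naics2": "44"},
    {"emp": 5, "naics2": "72"},
    {"emp": 20, "naics2": "23"},
    {"emp": 200, "naics2": "61"},
    {"emp": 1500, "naics2": "31"},
]
sample1 = coverage_share(toy)                                            # ex. small
sample2 = coverage_share(toy, max_emp=1000)                              # ex. small and large
sample3 = coverage_share(toy, max_emp=1000, excluded_sectors=("61",))    # also ex. NAICS 61
```

In the toy data, as in Table 5, dropping small establishments removes most establishments but little employment.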

3.9 Data timing

As noted above, official sources record data in mid-March, while D&B data are collected continually throughout the year. The NETS sample for year t reflects a snapshot of the database in January of t. Therefore, for example, an employment number recorded by D&B in February of 2005 will appear in NETS as the 2006 observation. For this reason, some researchers have rolled NETS data back one year when comparing against official sources (that is, setting t* = t − 1).

We have not modified the NETS year data in our reported analysis for two reasons. First, the appropriate timing for comparison is actually not obvious. For example, if D&B collects an establishment's employment data in November 2005, then NETS will count these data as the 2006 observation. Our current approach would compare this November 2005 employment number in NETS to the March 2006 observations in official sources, for a timing difference of four months. Those researchers who roll NETS data back one year will compare the November 2005 NETS observation to the March 2005 observations in official sources, for a timing difference of eight months. It is true, however, that if D&B recording is uniformly distributed across the year, our method of leaving NETS year data unaltered will result in somewhat more error (on average) than a method that rolls NETS years back. But given potential lags in NETS recording and reporting, we do not see this difference as likely to be significant.

Second, in unreported results we performed all analyses in this paper with NETS data rolled back by one year (this means that instead of comparing NETS observations for 2000, 2007, and 2014 to the same years' data in official sources, we compare NETS observations that are reported as covering 2000, 2007, and 2014 to the years 1999, 2006, and 2013, respectively, in official sources). In a majority of comparisons, the relationship between NETS and official sources is closer when we leave NETS years unaltered than when we roll NETS years back. Comparisons by establishment size are almost uniformly better with unaltered NETS years.
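The timing arithmetic in the November 2005 example can be checked with a short sketch. The helper names below are hypothetical. The sketch assumes, for illustration only, that records are spread evenly over February through December 2005 and that all of them land in the 2006 NETS observation; it treats the official reference date as March of the comparison year.

```python
def month_index(year, month):
    """Place a (year, month) pair on a single monthly axis."""
    return year * 12 + month


def gap_months(rec_year, rec_month, nets_year, rollback=False):
    """Months between a D&B record date and the March official reference date.

    With rollback=True the NETS year is compared with the prior year's
    official data (t* = t - 1); otherwise it is compared at face value.
    """
    compare_year = nets_year - 1 if rollback else nets_year
    return abs(month_index(compare_year, 3) - month_index(rec_year, rec_month))


# The paper's example: a record made in November 2005, NETS year 2006.
nov_unaltered = gap_months(2005, 11, 2006)               # vs. March 2006
nov_rolled = gap_months(2005, 11, 2006, rollback=True)   # vs. March 2005

# Assumed uniform recording over February-December 2005 (illustration only).
months = range(2, 13)
unaltered = [gap_months(2005, m, 2006) for m in months]
rolled = [gap_months(2005, m, 2006, rollback=True) for m in months]
mean_unaltered = sum(unaltered) / len(unaltered)
mean_rolled = sum(rolled) / len(rolled)
```

Under this uniform-recording assumption the unaltered convention has the larger average gap, in line with the caveat in the text, even though it is closer for late-year records such as the November example.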
Comparisons by industry are more mixed, with a modest majority favoring unaltered years, and in many cases the unaltered data are better by wide margins (while the margins for cases in which altered data are better tend to be small). Narrow-cell correlations in the unaltered-year data are higher in a large majority of cases. A few charts look visually better in the altered-year data, but most are visually similar.

We therefore prefer leaving NETS year data unaltered and interpreting NETS years at face value when comparing with official sources. However, other researchers may have needs for which the alternative approach is more appropriate.

4 Conclusion

We document a number of limitations with NETS. Coverage of very small establishments diverges markedly from official data sources, with significant apparent mismeasurement due to imputation. The small-establishment problem worsens significantly during the 2000s, and researchers must be careful to not interpret rising establishment counts in NETS over the last 20 years as indicative of robust business formation activity.25 The NETS universe began the early 2000s being moderately too large to match the official employer universe, and by the early 2010s the NETS universe is far larger than the official employer universe but still smaller than the total employer and nonemployer universe by Census (and IRS) reckoning. In other words, differences between NETS and official sources do not simply reflect the notion that D&B has a more comprehensive and complete universe of business activity.

25 To the contrary, overwhelming evidence exists indicating that employer business formation has declined in recent decades; see, e.g., Decker et al. (2014).

These differences between NETS and official sources are not reason to reject the use of NETS in research, however. We first show that the NETS universe can be restricted to approximate the scope of official sources in terms of industry and wage-and-salary employment. We then show that remaining discrepancies between NETS and official sources are largely driven by differences among small establishments, where imputation is prevalent.

Further, we find large differences in the education sector, where restricting the NETS sample to match official sources is difficult (note that this is not really a strike against NETS but rather a labeling difficulty. More involved efforts to determine whether individual establishments are public or private would likely close this gap). While we do show that NETS fails to adequately capture some key recent developments in mining, construction, and manufacturing, the extent to which these problems reflect industry labeling is unclear. Correlations of county-level aggregates between NETS and Census data are strikingly high. Moreover, at the narrow cell level (geography by industry by establishment size), we find correlations between NETS and official sources that are reassuringly strong in appropriately restricted samples. Our recommendation is that NETS users ensure results are robust to these sample restrictions.

In related work in progress, we find stronger reason for caution when studying dynamic elements of NETS, consistent with previous state-level research. Our view is that static study is likely to be much safer than study of business dynamics. Note that we have not investigated the revenue data in NETS, which in principle could also be done by comparing NETS with public Census sources.

It is also worth noting that the high-quality microdata underpinning CBP, QCEW, and other official datasets are available through the continually expanding Federal Statistical Research Data Center (FSRDC) network and the BLS's visiting researcher program. These programs allow approved researchers to access data at least as detailed as what are available in NETS. Participating researchers must first submit proposals describing their application of the data, travel to a location where the data can be accessed securely, and subject all results to a disclosure process to ensure no sensitive information is released.
Much productive research has been published through this process.26 NETS has the advantage that the only barrier to access is the subscription fee; it can be used freely on any machine, and output does not have to undergo a review process; in this sense NETS is an appealing option.

We tend to favor the use of Census and BLS data when possible. Additionally, when aggregated data are appropriate, official sources remain the gold standard.27 However, when microdata are needed and speed or flexibility of analysis are requirements, NETS can be a tremendously useful resource.

26 For examples, see the Center for Economic Studies working paper series at https://ideas.repec.org/s/cen/wpaper.html, and the many papers written using confidential BLS data.

27 As noted previously, collectors and producers of official data face considerable challenges as well, and a large literature (much of it written by expert employees of statistical agencies) documents shortcomings of official sources. Yet these sources are simply unrivaled in terms of conceptual consistency and scientifically driven measurement due to the explicit measurement goals of their data collection processes and to the professional and trained staff undertaking the data collection and production; in addition, the statistical agencies are continually engaged in interactions with the business and academic communities to improve the measurement process. Moreover, as we show in the appendix, the U.S. is fortunate to have two independently constructed business registers that, while not in complete harmony, are remarkably consistent and indeed more consistent than is NETS with either, supporting the case for their accuracy. In this respect, we view NETS as a complement (rather than a substitute) to official sources.

References

Acs, Zoltan, William Parsons, and Spencer Tracy, “High-Impact Firms: Gazelles Revisited,” Technical Report, Small Business Administration, Office of Advocacy 2008.

Amezcua, Alejandro S., “Boon or Boondoggle? Business Incubation as Entrepreneurship Policy.” PhD dissertation, Syracuse University 2010.

Bayard, Kimberly, Ryan A. Decker, and Charles Gilbert, “Natural Disasters and the Measurement of Industrial Production: Hurricane Harvey, a Case Study,” FEDS Notes, 2017.

Becker, Randy, Joel Elvery, Lucia Foster, C. J. Krizan, Sang Nguyen, and David Talan, “A Comparison of the Business Registers Used by the Bureau of Labor Statistics and the Bureau of the Census,” Technical Report 2005.

Choi, Taelim, Anil Rupasingha, John C. Robertson, and Nancey Leigh, “The Effects of High Growth on New Business Survival,” The Review of Regional Studies, January 2017, 47, 1–23.

Choi, Taelim, John C. Robertson, and Anil Rupasingha, “High-growth firms in Georgia,” FRB Atlanta Working Paper 2013-20, Federal Reserve Bank of Atlanta December 2013.

Crane, Leland D., Ryan A. Decker, and Bo Yeon Jang, “Flood Effects,” Unpublished Draft, Board of Governors of the Federal Reserve System 2017.

Cromwell, Erich W., “Topics on Labor and Public Policy.” PhD dissertation, Florida State University 2015.

Currie, Janet, Stefano DellaVigna, Enrico Moretti, and Vikram Pathania, “The Effect of Fast Food Restaurants on Obesity and Weight Gain,” American Economic Journal: Economic Policy, August 2010, 2 (3), 32–63.

Davis, Steven J., John Haltiwanger, Ron Jarmin, and Javier Miranda, “Volatility and Dispersion in Business Growth Rates: Publicly Traded versus Privately Held Firms,” in “NBER Macroeconomics Annual 2006, Volume 21,” NBER Chapters, National Bureau of Economic Research, Inc, 2007, pp. 107–180.

Davis, Steven J., John Haltiwanger, Ron S. Jarmin, C. J. Krizan, Javier Miranda, Alfred Nucci, and Kristin Sandusky, “Measuring the Dynamics of Young and Small Businesses: Integrating the Employer and Nonemployer Universes,” in “Producer Dynamics: New Evidence from Micro Data,” National Bureau of Economic Research, 2009, pp. 329–366.

Decker, Ryan, John Haltiwanger, Ron S. Jarmin, and Javier Miranda, “The Role of Entrepreneurship in US Job Creation and Economic Dynamism,” Journal of Economic Perspectives, 2014, 28 (3), 3–24.

DeSalvo, Bethany, Frank Limehouse, and Shawn D. Klimek, “Documenting the Business Register and Related Economic Business Data,” Technical Report, Center for Economic Studies working paper CES-WP-16-17 2016.

Donegan, Mary, “Inside the Triangle: Does Database Selection Alter our Understanding of Urban Industrial Systems?,” Technical Report 2014.

Echeverri-Carroll, Elsie and Maryann Feldman, “Chasing Entrepreneurial Firms,” Technical Report 2017.

Fairman, Kristin, Lucia Foster, C. J. Krizan, and Ian Rucker, “An Analysis of Key Differences in Micro Data: Results from the Business List Comparison Project,” Technical Report, Center for Economic Studies working paper CES 08-28 2008.

Greenstone, Michael, Alexandre Mas, and Hoai-Luu Nguyen, “Do Credit Market Shocks affect the Real Economy? Quasi-Experimental Evidence from the Great Recession and 'Normal' Economic Times,” Working Paper 20704, National Bureau of Economic Research November 2014.

Greenstone, Michael and Alexandre Mas, “Do Credit Market Shocks affect the Real Economy? Quasi-Experimental Evidence from the Great Recession and 'Normal' Economic Times,” 2012.

Groizard, Jose L., Priya Ranjan, and Antonio Rodriguez-Lopez, “Trade Costs and Job Flows: Evidence from Establishment-Level Data,” Economic Inquiry, 2015.

Haltiwanger, John, Ron S. Jarmin, and Javier Miranda, “Who Creates Jobs? Small versus Large versus Young,” The Review of Economics and Statistics, 2013, 95 (2), 347–361.

Isenberg, Emily, Liana Christin Landivar, and Esther Mezey, “A Comparison Of Person-Reported Industry To Employer-Reported Industry In Survey And Administrative Data,” Working Papers 13-47, Center for Economic Studies, U.S. Census Bureau September 2013.

Jarmin, Ron S. and Javier Miranda, “The Longitudinal Business Database,” Technical Report, Center for Economic Studies 2002.

Kolko, Jed and David Neumark, “Do Enterprise Zones Create Jobs? Evidence from California's Enterprise Zone Program,” Journal of Urban Economics, 2010, 68 (1).

Mach, Traci L. and John D. Wolken, “Examining the Impact of Credit Access on Small Firm Survivability,” in “Small Businesses in the Aftermath of the Crisis: International Analyses and Policies,” Heidelberg: Physica-Verlag HD, 2012, pp. 189–210.

Neumark, David, Brandon Wall, and Junfu Zhang, “Do Small Businesses Create More Jobs? New Evidence for the United States from the National Establishment Time Series,” The Review of Economics and Statistics, 2011, 93 (1), 16–29.

Neumark, David, Junfu Zhang, and Brandon Wall, “Employment Dynamics and Business Relocation: New Evidence from the National Establishment Time Series,” Working Paper 11647, National Bureau of Economic Research October 2005.

Walls & Associates, “National Establishment Time-Series (NETS) Database,” 2014.

Walls, Donald, “Understanding Data in the NETS Database,” NETS Documentation, 2008.

A Appendix: CBP and QCEW coverage

A.1 CBP coverage

CBP covers private business establishments, with the following exclusions:28

• NAICS 111 and 112 (Crop and animal production)
• NAICS 482 (Rail transportation)
• NAICS 491 (Postal service)
• NAICS 525110, 525120 and 525190 (Health, welfare and vacation funds)
• NAICS 525920 (Trusts, estates and agency accounts)
• NAICS 814 (Private households)
• NAICS 92 (Public administration)

CBP also includes government-run establishments in the following industries:

• NAICS 4248 (Government sponsored wholesale liquor establishments)
• NAICS 44531 (Retail liquor stores)
• NAICS 511130 (Book publishers)
• NAICS 522120 (Federally-chartered savings institutions)
• NAICS 522130 (Federally-chartered credit unions)
• NAICS 622 (Hospitals)

28 The material in this section is from https://www.census.gov/programs-surveys/cbp/technical-documentation/methodology.html

A.2 QCEW coverage

We use QCEW tables that exclude federal, state, and local government establishments. The following industries are partially or entirely excluded from QCEW:29

• NAICS 11 (Agriculture etc.)
• NAICS 482 (Railroads)
• NAICS 813 (Religious groups)
• NAICS 814 (Domestic workers)

When we compare QCEW to CBP, we exclude all 4-digit NAICS which are completely or partially excluded from either dataset.

29 Based on Table A of https://www.bls.gov/cew/cewbultn15.htm#Employment
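The scope intersection described in this appendix amounts to dropping any establishment whose NAICS code begins with an excluded prefix. A minimal sketch follows. The prefixes are the ones listed in the notes to the CBP versus QCEW figures and tables in Appendix B; the function name `in_common_scope` and the example codes are hypothetical, and the sketch assumes NAICS codes are stored as digit strings.

```python
# NAICS prefixes that are (partially) out of scope for either CBP or QCEW,
# as listed in the notes to the CBP-versus-QCEW comparisons in Appendix B.
EXCLUDED_PREFIXES = ("11", "814", "4248", "4453", "5111", "5221", "622", "4821", "8131")


def in_common_scope(naics, excluded=EXCLUDED_PREFIXES):
    """Return True if a NAICS code is inside the CBP/QCEW scope intersection."""
    code = str(naics).strip()
    # str.startswith accepts a tuple and matches if any prefix matches.
    return not code.startswith(tuple(excluded))


# Hypothetical six-digit codes used only to exercise the filter.
examples = ["622110", "722511", "111110", "481111", "482111"]
kept = [code for code in examples if in_common_scope(code)]
```

Matching on prefixes lets a single list handle exclusions stated at different levels of NAICS detail, from the two-digit sector 11 to four-digit industry groups.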

B Additional Tables and Figures

[Figure 5: Aggregate employment, CBP versus QCEW (intersection of scopes). Plot omitted; x-axis: Year, 1998 to 2014; y-axis: Millions; series: CBP, QCEW.]
Source: CBP, QCEW.
Note: Starting from the CBP and the private-sector QCEW, we exclude the following NAICS, which are (partially) out of scope for either CBP or QCEW: 11, 814, 4248, 4453, 5111, 5221, 622, 4821, 8131.

[Figure 6: Aggregate establishment counts, CBP versus QCEW (intersection of scopes). Plot omitted; x-axis: Year, 1998 to 2014; y-axis: Millions; series: CBP, QCEW.]
Source: CBP, QCEW.
Note: Starting from the CBP and the private-sector QCEW, we exclude the following NAICS, which are (partially) out of scope for either CBP or QCEW: 11, 814, 4248, 4453, 5111, 5221, 622, 4821, 8131.

[Figure 7: Aggregate establishment counts (including NETS government establishments). Plot omitted; x-axis: Year, 1998 to 2014; y-axis: Millions of Establishments; series: NETS raw establishments, NETS payroll establishments, Census all establishments, Census payroll establishments.]
Source: NETS database, County Business Patterns, Census Nonemployer Statistics.
Note: NETS sample not restricted to CBP scope.

[Figure 8: Aggregate establishment counts (separate NETS lines of business). Plot omitted; x-axis: Year, 1998 to 2014; y-axis: Millions of Establishments; series: NETS raw establishments, NETS payroll establishments, Census all establishments, Census payroll establishments.]
Source: NETS database, County Business Patterns, Census Nonemployer Statistics.
Note: NETS sample restricted to CBP scope.

[Figure 9: Aggregate employment (including NETS government establishments). Plot omitted; x-axis: Year, 1998 to 2014; y-axis: Millions; series: NETS raw employment, NETS payroll employment, Census all employment, Census payroll employment.]
Source: NETS database, County Business Patterns, Census Nonemployer Statistics.
Note: NETS sample not restricted to CBP scope.

[Figure 10: Aggregate employment (separate NETS lines of business). Plot omitted; x-axis: Year, 1998 to 2014; y-axis: Millions; series: NETS raw employment, NETS payroll employment, Census all employment, Census payroll employment.]
Source: NETS database, County Business Patterns, Census Nonemployer Statistics.
Note: NETS sample restricted to CBP scope.

[Figure 11: NETS versus CBP. Plots omitted; six panels with x-axis: Year, 1995 to 2015, and y-axis: Millions; series: NETS, CBP. Panels: Number of Employees; Number of Employees Excluding small establishments; Number of Employees Excluding small and large establishments; Number of Establishments; Number of Establishments Excluding small establishments; Number of Establishments Excluding small and large establishments.]
Source: NETS, CBP.
Notes: NETS sample is restricted to CBP scope (NET sample baseline-CBP). "Small" establishments are those with less than 10 employees. "Large" establishments are those with 1000 or more employees.

[Figure 12: NETS versus CBP excluding educational services. Plots omitted; six panels with x-axis: Year, 1995 to 2015, and y-axis: Millions; series: NETS, CBP. Panels: Number of Employees; Number of Employees Excluding small establishments; Number of Employees Excluding small and large establishments; Number of Establishments; Number of Establishments Excluding small establishments; Number of Establishments Excluding small and large establishments.]
Source: NETS, CBP.
Notes: NETS sample is restricted to CBP scope, and NAICS 61 is excluded from both datasets (NET sample ex. 61-CBP). "Small" establishments are those with less than 10 employees. "Large" establishments are those with 1000 or more employees.

Table 6: CBP versus QCEW, establishment counts by size class

| Size class | 2000 QCEW | 2000 CBP | 2000 Pct. diff. | 2007 QCEW | 2007 CBP | 2007 Pct. diff. | 2014 QCEW | 2014 CBP | 2014 Pct. diff. |
|---|---|---|---|---|---|---|---|---|---|
| 1 to 4 | 3439.5 | 3471.0 | −0.91 | 4438.3 | 3929.7 | 12.94 | 5162.8 | 3849.4 | 34.12 |
| 5 to 9 | 1185.4 | 1242.9 | −4.63 | 1305.5 | 1327.9 | −1.69 | 1283.1 | 1273.4 | 0.76 |
| 10 to 19 | 791.9 | 825.1 | −4.03 | 872.8 | 905.5 | −3.61 | 901.4 | 895.6 | 0.65 |
| 20 to 49 | 559.1 | 574.6 | −2.69 | 616.7 | 626.7 | −1.59 | 626.3 | 632.0 | −0.90 |
| 50 to 99 | 195.8 | 200.3 | −2.25 | 210.7 | 215.8 | −2.39 | 212.9 | 214.8 | −0.88 |
| 100 to 249 | 111.2 | 117.0 | −4.96 | 118.7 | 122.3 | −2.94 | 118.4 | 122.0 | −3.02 |
| 250 to 499 | 27.1 | 29.8 | −9.13 | 28.4 | 30.1 | −5.51 | 28.6 | 30.3 | −5.86 |
| 500 to 999 | 8.4 | 10.2 | −17.85 | 9.7 | 10.5 | −7.38 | 9.3 | 10.5 | −11.01 |
| 1000+ | 3.8 | 5.0 | −24.27 | 3.9 | 5.1 | −23.18 | 4.0 | 5.4 | −26.35 |

Source: CBP, QCEW.
Notes: Levels are in thousands. Percent differences are QCEW less CBP divided by CBP establishment counts (the sign convention implied by the reported values). Both CBP and QCEW are restricted to the intersection of their scopes. Starting from the CBP and the private-sector QCEW, we exclude the following NAICS, which are (partially) out of scope for either CBP or QCEW: 11, 814, 4248, 4453, 5111, 5221, 622, 4821, 8131.

Table 7: CBP versus QCEW, employment by size class

| Size class | 2000 QCEW | 2000 CBP | 2000 Pct. diff. | 2007 QCEW | 2007 CBP | 2007 Pct. diff. | 2014 QCEW | 2014 CBP | 2014 Pct. diff. |
|---|---|---|---|---|---|---|---|---|---|
| 1 to 4 | 5.5 | 5.9 | −7.07 | 6.8 | 7.0 | −3.12 | 7.4 | 6.9 | 6.85 |
| 5 to 9 | 7.8 | 8.2 | −4.58 | 8.6 | 8.8 | −1.84 | 8.5 | 8.5 | 0.67 |
| 10 to 19 | 10.7 | 11.1 | −3.93 | 11.8 | 12.2 | −3.38 | 12.2 | 12.1 | 0.82 |
| 20 to 49 | 16.9 | 17.4 | −2.57 | 18.6 | 18.9 | −1.39 | 18.9 | 19.1 | −0.97 |
| 50 to 99 | 13.5 | 13.8 | −2.16 | 14.5 | 14.9 | −2.57 | 14.6 | 14.7 | −0.85 |
| 100 to 249 | 16.7 | 17.6 | −5.35 | 17.8 | 18.2 | −2.38 | 17.6 | 18.2 | −3.00 |
| 250 to 499 | 9.2 | 10.1 | −9.23 | 9.7 | 10.2 | −5.14 | 9.7 | 10.4 | −6.67 |
| 500 to 999 | 5.7 | 6.9 | −17.69 | 6.6 | 7.1 | −7.48 | 6.3 | 7.1 | −11.54 |
| 1000+ | 8.0 | 11.1 | −28.13 | 8.0 | 11.5 | −30.35 | 8.0 | 12.6 | −36.34 |

Source: CBP, QCEW.
Notes: Levels are in millions. Percent differences are QCEW less CBP divided by CBP employment levels (the sign convention implied by the reported values). Both CBP and QCEW are restricted to the intersection of their scopes. Starting from the CBP and the private-sector QCEW, we exclude the following NAICS, which are (partially) out of scope for either CBP or QCEW: 11, 814, 4248, 4453, 5111, 5221, 622, 4821, 8131.
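As a check on the sign convention in Tables 6 and 7, the reported percent differences can be reproduced from the reported levels. The sketch below uses three cells from Table 6; the function name `pct_diff` is hypothetical. The reported values match QCEW less CBP, divided by CBP. Because the published levels are rounded to one decimal place, recomputed values for the largest size classes can differ from the reported ones by a few tenths of a percentage point, so only small-class cells are shown.

```python
def pct_diff(qcew, cbp):
    """Percent difference of QCEW relative to CBP: 100 * (QCEW - CBP) / CBP."""
    return 100.0 * (qcew - cbp) / cbp


# (size class, year) -> (QCEW level, CBP level, reported pct. diff.), from Table 6.
table6_cells = {
    ("1 to 4", 2000): (3439.5, 3471.0, -0.91),
    ("5 to 9", 2000): (1185.4, 1242.9, -4.63),
    ("1 to 4", 2014): (5162.8, 3849.4, 34.12),
}

# Recompute each cell and record the gap from the reported value.
recomputed = {
    cell: pct_diff(qcew, cbp) for cell, (qcew, cbp, _) in table6_cells.items()
}
max_gap = max(
    abs(recomputed[cell] - reported)
    for cell, (_, _, reported) in table6_cells.items()
)
```

The positive 2014 value for the 1 to 4 class therefore indicates that QCEW counts more of the smallest establishments than CBP does in that year.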

Table 8: NETS versus QCEW by establishment size

Percent difference

| Size class (Employees) | 2000 Emp. | 2000 Estab. | 2007 Emp. | 2007 Estab. | 2014 Emp. | 2014 Estab. |
|---|---|---|---|---|---|---|
| 1 to 4 | 99.10 | 59.02 | 123.77 | 105.30 | 174.77 | 156.24 |
| 5 to 9 | 23.67 | 23.07 | 22.90 | 21.62 | 45.11 | 42.17 |
| 10 to 19 | 10.14 | 8.52 | 9.48 | 8.20 | 0.27 | −0.29 |
| 20 to 49 | 11.25 | 9.18 | 9.92 | 7.46 | 2.15 | −0.26 |
| 50 to 99 | 12.13 | 11.81 | 10.96 | 10.55 | 12.62 | 10.76 |
| 100 to 249 | 4.42 | 5.73 | 1.15 | 3.10 | 6.81 | 8.10 |
| 250 to 499 | 9.02 | 9.72 | 4.31 | 5.54 | 3.74 | 4.01 |
| 500 to 999 | 32.51 | 33.14 | 7.62 | 8.88 | 6.77 | 7.42 |
| 1000+ | 100.59 | 78.82 | 56.40 | 39.39 | 64.18 | 35.30 |
| Aggregate | 26.85 | 38.74 | 20.99 | 66.61 | 26.33 | 102.76 |
| Aggregate ex. less than 10 employees | 22.48 | 9.29 | 13.20 | 7.95 | 12.69 | 1.68 |
| Aggregate ex. less than 10 and greater than 1000 employees | 11.25 | 9.09 | 7.18 | 7.86 | 5.38 | 1.58 |

Source: NETS, QCEW.
Note: Difference between NETS and QCEW employment as percent of QCEW employment. NETS sample restricted to QCEW scope.

Table 9: NETS versus QCEW by sector

Percent difference

| Industry | 2000 Excl <10 | 2000 All | 2007 Excl <10 | 2007 All | 2014 Excl <10 | 2014 All |
|---|---|---|---|---|---|---|
| 11 Ag., For., Fish., Hunt | −99 | −99 | −98 | −99 | −99 | −99 |
| 21 Mining | 54 | 54 | 6 | 10 | −14 | −12 |
| 22 Utilities | −34 | −32 | −30 | −26 | −36 | −31 |
| 23 Construction | −3 | 6 | −5 | 4 | 7 | 23 |
| 31-33 Manufacturing | 25 | 27 | 34 | 37 | 43 | 47 |
| 42 Wholesale Trade | 19 | 24 | 13 | 18 | 6 | 13 |
| 44-45 Retail Trade | −13 | −3 | −9 | 2 | −1 | 7 |
| 48-49 Trans., Warehous. | 2 | 6 | −6 | 1 | −18 | −6 |
| 51 Information | 15 | 20 | 30 | 37 | 39 | 47 |
| 52 Finance, Insurance | 35 | 34 | 15 | 20 | 18 | 23 |
| 53 Real Est., Rent., Leas. | 65 | 67 | 76 | 82 | 74 | 96 |
| 54 Prof., Sci., Tech. Svcs | 39 | 41 | 24 | 29 | 14 | 25 |
| 55 Management | −85 | −84 | −84 | −81 | −79 | −74 |
| 56 Admin., Waste Mgmt | −21 | −14 | −24 | −0 | −29 | 20 |
| 61 Education Svcs | 441 | 422 | 364 | 347 | 379 | 371 |
| 62 Health, Social Asst. | 15 | 18 | 2 | 8 | −8 | 0 |
| 71 Arts, Entertain., Rec. | 38 | 50 | 29 | 44 | 16 | 38 |
| 72 Accom., Food Svcs | −1 | 2 | −5 | 3 | −10 | 3 |
| 81 Other Svcs | −30 | −19 | −34 | −23 | −42 | −26 |

Source: NETS and QCEW.
Notes: Difference between NETS and QCEW employment as percent of QCEW employment by NAICS sector. NETS sample restricted to QCEW scope.

Cite this document
APA
Keith Barnatchez, Leland D. Crane, & Ryan A. Decker (2017). An Assessment of the National Establishment Time Series (NETS) Database (FEDS 2017-110). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2017-110
BibTeX
@techreport{wtfs_feds_2017_110,
  author = {Keith Barnatchez and Leland D. Crane and Ryan A. Decker},
  title = {An Assessment of the National Establishment Time Series (NETS) Database},
  type = {Finance and Economics Discussion Series},
  number = {2017-110},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2017},
  url = {https://whenthefedspeaks.com/doc/feds_2017-110},
  abstract = {The National Establishment Time Series (NETS) is a private sector source of U.S. business microdata. Researchers have used state-specific NETS extracts for many years, but relatively little is known about the accuracy and representativeness of the nationwide NETS sample. We explore the properties of NETS as compared to official U.S. data on business activity: The Census Bureau's County Business Patterns (CBP) and Nonemployer Statistics (NES) and the Bureau of Labor Statistics' Quarterly Census of Employment and Wages (QCEW). We find that the NETS universe does not cover the entirety of the Census-based employer and nonemployer universes, but given certain restrictions NETS can be made to mimic official employer datasets with reasonable precision. The largest differences between NETS employer data and official sources are among small establishments, where imputation is prevalent in NETS. The most stringent of our proposed sample restrictions still allows scope that covers about three quarters of U.S. private sector employment. We conclude that NETS microdata can be useful and convenient for studying static business activity in high detail.},
}