A Concordance Between Ten-Digit U.S. Harmonized System Codes and SIC/NAICS Product Classes and Industries
Abstract
While the relationship between international trade and domestic economic activity is an important topic in economics, research in this area has been slowed due to data limitations. In this paper we provide tools that improve the existing data in two ways. First, we develop an algorithm that yields concordances between the ten-digit Harmonized System (HS) codes used to classify products in U.S. international trade and the SIC and NAICS industry codes used to classify domestic economic activity. These concordances then yield novel time series of industry-level international trade data for the years 1989 to 2009. Second, we provide concordances between HS codes and the SIC and NAICS product classes used to classify U.S. manufacturing production, allowing for matching at a more disaggregated level than was previously available.
Finance and Economics Discussion Series
Divisions of Research & Statistics and Monetary Affairs
Federal Reserve Board, Washington, D.C.

A Concordance Between Ten-Digit U.S. Harmonized System Codes and SIC/NAICS Product Classes and Industries
Justin R. Pierce and Peter K. Schott
2012-15

NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.
Justin R. Pierce†
Board of Governors of the Federal Reserve System

Peter K. Schott‡
Yale School of Management & NBER

December 2011

Keywords: International trade; industry classification
JEL classification: F1

* We thank Julie Linden of the Yale University Social Sciences Library for generous help in securing the publicly available U.S. trade data. We thank Kitjawat Tacharoen and Matt Flagge for research assistance. We thank Alvin Venning, Carol Ann Aristone, James Kristoff and Mendel Gayle of the U.S. Census Bureau for many enlightening conversations. Schott thanks the National Science Foundation (SES-0241474 and SES-0550190) for research support. The analysis and conclusions set forth in this paper are those of the authors and do not indicate concurrence by the Board of Governors, other members of the research staff or the National Science Foundation.
† Correspondence: 20th & C ST NW, Washington, DC 20551; email: justin.r.pierce@frb.gov; telephone: 202-452-2980; fax: 202-736-1937.
‡ 135 Prospect Street, New Haven, CT 06520, tel: (203) 436-4260, fax: (203) 432-6974, email: peter.schott@yale.edu.
1. Introduction

Empirical researchers in the fields of international trade and industrial organization are increasingly focused on examining the relationship between international trade and domestic economic activity. This research agenda was first pursued with industry-level data, as in Revenga (1992) and Sachs and Shatz (1994). More recently, demand for linked trade and production data has increased along with the massive growth of research using highly disaggregated plant- and firm-level data, as in Bernard, Jensen and Schott (2009) and Bernard, Redding and Schott (2010). Applied research in these fields has been slowed, however, by an inability to create long time series of industry-level international trade and production data or to match trade data to detailed product-level domestic data.

In the U.S., international trade data have been classified since 1989 based on the World Customs Organization's Harmonized System (HS). In contrast, domestic economic activity has been classified using the North American Industry Classification System (NAICS) beginning with the 1997 economic census, and the Standard Industrial Classification (SIC) prior to the 1997 economic census.1 This creates two potential difficulties when linking trade and production data. First, the HS classifies products solely on physical characteristics, while SIC and NAICS classify products based on physical characteristics and the type of economic activity. Second, the switch from SIC to NAICS beginning with the 1997 economic census means that it has been difficult to construct a time series linking trade and production data for the entire period from 1989 to the present.

This paper improves on currently available data in two ways. First, we provide an algorithm that generates concordances linking the ten-digit HS codes used by the United States to track international trade with the four-digit SIC and six-digit NAICS industry codes used to characterize domestic economic activity. These concordances are assembled from published U.S. Census Bureau ("Census") data, which provide a mapping of HS to SIC and NAICS industries from 1989 to 2001 and 2000 to 2009, respectively. Our contribution here is to extend these mappings to match HS codes with SIC industries after 2001, and to match HS codes with NAICS industries before 2000. As a result, applied economists will be able to create, for the first time, linked datasets of trade and domestic production in both SIC and NAICS over a long time series: 1989-2009 for NAICS and 1989-2006 for SIC.2

Second, we provide a set of concordances linking ten-digit import and export HS codes to one or more five-digit SIC (SIC5) or seven-digit NAICS (NAICS7) product classes. These concordances are constructed using bridge codes known as "basecodes," which are created by Census. In each year of an economic census, Census constructs two mappings: one linking HS codes to basecodes and one linking basecodes to SIC5 or NAICS7 product classes. We combine the two mappings to directly link HS codes to product classes. This set of concordances then allows researchers to match international trade and domestic production data at a more disaggregated level than has previously been available. Each of the contributions in this paper improves the ability of empirical researchers to calculate measures of trade and domestic economic activity that are more directly comparable and hence more accurate for research purposes.

Finally, we briefly discuss how these concordances might be applied in current empirical international trade research. In particular, we provide background information useful for linking the firm-product-class domestic production data in the U.S. Census of Manufactures (CM) to the firm-product import and export data in the Longitudinal Firm Trade Transaction Database (LFTTD). For more detail on the former, see Bernard, Redding and Schott (2010). For more detail on the latter, see Bernard, Jensen and Schott (2009).

The remainder of the paper is organized as follows. Section 2 provides a description of the HS, SIC and NAICS classification systems. Section 3 describes the HS to SIC4/NAICS6 industry concordance, while Section 4 describes the HS to SIC5/NAICS7 product-class concordance. Section 5 discusses how the latter can be used to link Census production and trade data. Section 6 concludes. Appendices provide the Stata code used to implement our algorithm and generate the concordances discussed in the paper, and describe the key files used to construct the concordances.

1 Each of these product classification systems is described in more detail in Section 2.
2 The reason for the shorter time period for HS-SIC4 mappings is discussed in Section 3 below.

2. A Description of the HS, SIC and NAICS Classification Systems

2.1. Classifying Products in U.S. International Trade - The Harmonized System

International trade data in all major trading countries, including the U.S., are classified based on the Harmonized System developed by the World Customs Organization (WCO). The WCO begins by assigning products into 99 broad 2-digit categories, known as chapters, such as chapter 72, "Iron and Steel." These chapters are then further broken out into 6-digit HS codes for categories of goods such as heading 851670, which is defined in the 2007 HS as "Coffee or tea makers." Individual countries are then free to maintain more disaggregated classifications beyond the 6-digit level. The U.S.
maintains separate HS classifications for imports and exports and classifies products at the ten-digit level. Import codes are provided in the Harmonized Tariff Schedule and maintained by the U.S. International Trade Commission (ITC). Export codes, formally known as "Schedule B" codes, are maintained by the Foreign Trade Division (FTD) of the U.S. Census Bureau. In this paper we refer to import and export codes generically as HS codes. For import HS codes, the ITC further aggregates the 99 chapters into 22 broad "sections," which are listed in Table 1. The full listing of HS chapters and 10-digit HS import and export codes is available at the websites of the ITC and FTD, respectively.

Table 1: Import HS Sections and Chapters

HS Section  Name                                                 Chapters
1           Live Animals; Animal Products                        1-5
2           Vegetable Products                                   6-14
3           Animal or Vegetable Fats and Oils                    15
4           Prepared Foodstuffs; Beverages, Spirits, Tobacco     16-24
5           Mineral Products                                     25-27
6           Products of the Chemical or Allied Industries        28-38
7           Plastics, Rubber and Articles Thereof                39-40
8           Raw Hides, Skins, Leather                            41-43
9           Wood and Articles of Wood                            44-46
10          Pulp of Wood, Paper                                  47-49
11          Textile and Textile Articles                         50-63
12          Footwear, Headgear, etc.                             64-67
13          Articles of Stone, Plaster, Cement, Ceramics, Glass  68-70
14          Pearls, precious stones, precious metals             71
15          Base Metals and Articles of Base Metal               72-83
16          Machinery, Appliances, Electrical Equipment          84-85
17          Vehicles, Aircraft, Vessels                          86-89
18          Precision Instruments                                90-92
19          Arms and Ammunitions                                 93
20          Misc. Manufactured Articles                          94-96
21          Works of Art                                         97
22          Special Classification Provisions                    98-99

Notes: This table displays the sections and chapters of U.S. import HS codes. Section names have been shortened for brevity. See the website of the U.S. International Trade Commission for full section names.

2.2. Classifying U.S. Domestic Economic Activity - SIC and NAICS

In contrast to the HS, which classifies products based solely on their physical characteristics, SIC and NAICS are classifications of business activities that incorporate product characteristics as well as the type of economic activity. SIC codes were used to classify U.S. economic activity until the Census Bureau's 1997 economic census, with major revisions of the SIC occurring in 1972 and 1987. Starting with the 1997 census, U.S. economic activity is classified according to the NAICS, which is standardized for the first five digits across the U.S., Canada and Mexico.

Census refers to the first four digits of an SIC code, and the first six digits of a NAICS code, as an industry. It reserves the terms product class and product for the first five and seven digits of an SIC code, and the first seven and ten digits of a NAICS code, respectively. While the set of official U.S. industries is defined outside the Census Bureau, Census generally has discretion in defining product classes and products within these industries. The primary economic activity classifications for both SIC and NAICS are provided in Table 2.

Table 2: Import NAICS and SIC Categories

NAICS Categories  Description
11                Agriculture, Forestry, Fishing and Hunting
21                Mining, Quarrying, and Oil and Gas Extraction
22                Utilities
23                Construction
31-33             Manufacturing
42                Wholesale Trade
44-45             Retail Trade
48-49             Transportation and Warehousing
51                Information
52                Finance and Insurance
53                Real Estate and Rental and Leasing
54                Professional, Scientific, and Technical Services
55                Management of Companies and Enterprises
56                Administrative and Support and Waste Management and Remediation Services
61                Educational Services
62                Health Care and Social Assistance
71                Arts, Entertainment, and Recreation
72                Accommodation and Food Services
81                Other Services (except Public Administration)
92                Public Administration

SIC Categories    Description
01-09             Agriculture, Forestry, Fisheries
10-14             Mineral Industries
15-17             Construction Industries
20-39             Manufacturing
41-49             Transportation, Communication, Utilities
50-51             Wholesale Trade
52-59             Retail Trade
60-67             Finance, Insurance and Real Estate
70-89             Service Industries
91-97             Public Administration

Notes: Table displays the primary categories of economic activity in the NAICS and SIC classification systems. Source: U.S. Census Bureau.

There are a number of differences between SIC and NAICS. First, NAICS provides more granular industry definitions than SIC, with the number of industries rising from 1,004 in SIC to 1,170 in NAICS in 1997. Second, some activities were completely reclassified in the switch from SIC to NAICS, such as publishing, which was reclassified from manufacturing (within SIC 27, printing and publishing) to information (NAICS 51).

2.3. Some Complications Associated With Mapping HS to SIC/NAICS

As mentioned above, the HS and SIC/NAICS systems are fundamentally different in that the HS classifies products based solely on physical characteristics, while SIC and NAICS incorporate physical product characteristics as well as the type of economic activity. This difference between the two systems can perhaps be most easily seen through a specific example. In the 1992 Schedule B codes used to classify U.S. exports, HS code 7215200000 tracks exports of "other bars and rods of iron or nonalloy steel, cold-formed or cold-finished, less than 0.25 percent carbon." While this definition is based solely on physical characteristics, the SIC/NAICS product classes to which it matches also take into account the method of production. In particular, this HS10 maps to two separate SIC5 product classes: 33128, "cold-finished steel bars/bar shapes (made in mills)," and 33168, "cold-finished steel bars/bar shapes (not made in mills)."

The switch from SIC to NAICS for classifying domestic production also complicates matters. Because international trade data are reported in SIC format only for the years 1989-2001 and in NAICS format only for the years 2000 to 2009, researchers have been unable to construct a long time series spanning SIC and NAICS years. The concordances provided in this paper allow applied economists to construct these long time series for the years 1989-2009 for NAICS and 1989-2006 for SIC.
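As a rough illustration of how such an industry-level time series might be assembled, the sketch below sums HS10-level trade values to industry-year totals through a year-specific concordance. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function name `aggregate_to_industry`, the tuple layout of the trade records and the dollar values are hypothetical. The only mapping taken from the text is HS 7215200000 to SIC industry 3312 in 1992; the second HS code is a placeholder with no match.

```python
from collections import defaultdict


def aggregate_to_industry(trade_rows, concordance):
    """Sum HS10-level trade values to (year, industry) totals.

    trade_rows  : iterable of (year, hs10, value) tuples (assumed layout).
    concordance : dict mapping (year, hs10) -> industry code (SIC4 or NAICS6).
    Returns (totals, unmatched_value); unmatched_value is the trade value
    whose HS code has no industry assignment in that year.
    """
    totals = defaultdict(float)
    unmatched_value = 0.0
    for year, hs10, value in trade_rows:
        industry = concordance.get((year, hs10))
        if industry is None:
            unmatched_value += value
        else:
            totals[(year, industry)] += value
    return dict(totals), unmatched_value


# Toy inputs: values are illustrative, not actual trade data.
toy_concordance = {(1992, "7215200000"): "3312"}
toy_trade = [
    (1992, "7215200000", 100.0),
    (1992, "7215200000", 50.0),
    (1992, "0000000000", 7.0),  # placeholder code with no concordance entry
]

industry_totals, unmatched = aggregate_to_industry(toy_trade, toy_concordance)
```

Reporting the unmatched value separately makes it easy to check what share of trade falls outside the concordance in a given year.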
Lastly, HS codes are continually revised over time. Changes to the U.S. import or export codes occur via three routes: changes by the World Customs Organization (WCO) to the official list of international six-digit prefixes; U.S. legislation that affects U.S. eight-digit codes (imports only); or changes by the Committee for Statistical Annotation of Tariff Schedules (known as the "484(f) Committee") to statistical ten-digit codes.3 For more information on changes in HS codes over time, including a concordance tracking these changes, see Pierce and Schott (forthcoming).

3 See http://www.census.gov/foreign-trade/aip/comb_seminar_pres.ppt, and www.census.gov/foreigntrade/faq/sb/sb0008.html for more detail.
3. Concording HS to SIC4/NAICS6 Industries

As described above, empirical researchers have been hampered by an inability to generate long time series of industry-level international trade data and domestic production spanning SIC and NAICS years. This section describes an algorithm and concordances that we create, which link international trade and domestic economic activity data for the years 1989-2009. The concordances can be used to construct comparable datasets of international trade and domestic production data for longer time series than have previously been available.

The source data for the concordances are found in the monthly trade data published in CD format by Census's Foreign Trade Division.4 Each of the monthly CDs for imports and exports contains a dBase-formatted file (called concord.dbf) that separately matches the ten-digit import and export HS codes used in the month to four-digit SIC and/or six-digit NAICS codes. We refer to these four-digit SIC and six-digit NAICS codes as "baseroots" for reasons discussed in the next section, but they are almost always proper industries.5 Note that the December CD for each year contains annual, as well as monthly, totals.

From 1989 to 2001, the mappings provided by Census match ten-digit HS codes to four-digit SIC baseroots. From 2000 to the present, they match ten-digit HS codes to six-digit NAICS baseroots. But for certain applications, it might be useful to extend each set of mappings beyond the years for which these official concordances are available. That is, it may be useful to have an HS-NAICS6 concordance for years prior to 2000 or an HS-SIC4 concordance for years after 2001.

We extend the HS-NAICS6 mappings to cover the period from 1989-2009 and the HS-SIC4 mappings for the years 1989-2006 using a three-step algorithm based on the procedures used previously in Feenstra et al. (2002). The algorithm is implemented on a "master list" of concordances assembled by appending the HS-baseroot mappings contained in the annual December trade CDs for the years 1989-2009. Note that we do not provide HS-SIC4 mappings for years after 2006 because the number of SIC4 codes that need to be assigned by hand (step 3 in the algorithm) rises to a level that makes the mapping less reliable, in our view. The Stata code for steps 1 and 2, and for incorporating the results of step 3, is available in Appendix 1 under filename schott_algorithm_20.do.6 The Stata code for the algorithm was created using Intercooled Stata, version 9.2 on a 2.0 GHz T2700 Intel Core 2 CPU. The steps of the algorithm are described immediately below.

1. Step 1 (Mechanical Match 1): Examine all ten-digit HS within a nine-digit category. If all assigned ten-digit HS within this category have the same NAICS6 (SIC4) assignment, assign that NAICS6 (SIC4) to any unassigned ten-digit HS within that nine-digit category. Repeat for eight-, seven-, etc. digit HS categories.
2. Step 2 (Mechanical Match 2): Sort list by ten-digit HS code. Examine "gaps" consisting of HS codes, or groups of consecutive codes, that have not been matched to a baseroot. If a gap is preceded and succeeded by the same NAICS6 (SIC4) code, use that NAICS6 (SIC4) code for all unassigned ten-digit HS codes in the gap.
3. Step 3 (Hand Matching): Hand match remaining unmatched HS codes where possible. Note that any remaining unmatched ten-digit HS codes account for a very small fraction of U.S. imports or exports.

Tables 3 and 4 summarize the number of HS codes assigned using this procedure with SIC4 codes for years after 2001 and NAICS6 codes for years before 2000, respectively. The descriptions in the "source" column match those provided by the variable "matchtype" in the files described in Appendix 2 below.

Table 3: Extending the HS-SIC4 Concordance

Source                        2002    2003    2004    2005    2006
Import HS Codes
From Census                 16,043  15,989  15,915  15,854  15,805
From Mechanical Match 1      1,101   1,184   1,289   1,377   1,469
From Mechanical Match 2        209     215     216     218     235
From Hand Match                293     300     308     308     355
Export HS Codes
From Census                  7,912   7,886   7,883   7,856   7,853
From Mechanical Match 1        752     768     773     839     843
From Mechanical Match 2        132     132     134     134     134
From Hand Match                151     151     150     150     150

Notes: This table displays the method used to assign SIC codes to HS codes for years after 2001, when Census stopped reporting HS-SIC matches.

Table 4: Extending the HS-NAICS6 Concordance

Source                        1989    1990    1991    1992    1993    1994    1995    1996    1997    1998    1999
Import HS Codes
From Census                 10,464  11,293  11,630  11,874  12,057  12,834  14,964  16,345  16,853  16,926  17,110
From Mechanical Match 1      2,991   2,981   2,862   2,711   2,599   2,407   1,237     394     354     132      66
From Mechanical Match 2        340     317     297     263     258     220     137      68      58      23       2
From Hand Match                607     623     625     582     588     519     292      75      80      18       1
Export HS Codes
From Census                  6,750   6,874   7,117   7,117   7,200   7,338   7,468   8,250   8,478   8,592   8,620
From Mechanical Match 1        827     769     676     673     658     624     592     228      69      14       6
From Mechanical Match 2        143     139     137     137     137     120      99      54      19       9       0
From Hand Match                188     189     180     180     172     157     149      61      43       5       0

Notes: This table displays the method used to assign NAICS codes to HS codes for years before 2000, the year in which Census began reporting HS-NAICS matches.

By aggregating the HS-baseroot mappings for all available years and extending them for the full period in which the HS was in existence, we create HS to SIC4 and HS to NAICS6 concordances for both imports and exports, for the period from 1989 to 2006 and 1989 to 2009, respectively. See Appendix 2 for a full description of the final concordance files available in the electronic appendix to this paper.

4 CDs are available starting in December 1989 for exports and January 1989 for imports. The CDs are available for purchase from Census and are often also available in university libraries. The copies used here are provided generously by the Yale University Social Sciences Library.
5 Of the 461 NAICS baseroots in the HS-NAICS6 import concordance and 455 NAICS baseroots in the HS-NAICS6 export concordance, 10 are not real industries as defined in the NAICS. They are 11211X, 1123XX, 31131X, 31181X, 31511X, 33631X, 910000, 920000, 980000, 990000. Of the 471 SIC baseroots in the HS-SIC import concordance, 5 are not real industries as defined in the SIC. They are 314X, 9100, 9200, 9800, 9900. Of the 470 SIC baseroots in the HS-SIC export concordance, 7 are not real industries. They are 314X, 3XXX, 9000, 9100, 9200, 9800, 9900.
6 The file is also available electronically on Schott's website: http://www.som.yale.edu/faculty/pks4/sub_international.htm.
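The two mechanical steps of the algorithm in this section can be sketched in a few lines of Python. This is an illustrative reading of Steps 1 and 2, not a translation of the Stata code in Appendix 1. The function names, the dictionary layout (HS10 code mapped to an industry code, or `None` if unassigned), the toy codes and the `min_prefix_len` parameter are all assumptions; in particular, the text says only to repeat for "eight-, seven-, etc." digit categories, so the stopping point at two digits is a guess.

```python
def mechanical_match_1(assign, min_prefix_len=2):
    """Step 1: fill unassigned codes from siblings sharing an HS prefix.

    assign maps HS10 code -> industry code, or None if unassigned.
    For prefix lengths 9, 8, ..., min_prefix_len: if every assigned code in
    a prefix group carries the same industry, give it to the unassigned ones.
    """
    result = dict(assign)
    for plen in range(9, min_prefix_len - 1, -1):
        groups = {}
        for hs in result:
            groups.setdefault(hs[:plen], []).append(hs)
        for members in groups.values():
            industries = {result[hs] for hs in members if result[hs] is not None}
            if len(industries) == 1:
                (only,) = industries
                for hs in members:
                    if result[hs] is None:
                        result[hs] = only
    return result


def mechanical_match_2(assign):
    """Step 2: fill gaps bracketed by the same industry in sorted HS order."""
    result = dict(assign)
    codes = sorted(result)
    i = 0
    while i < len(codes):
        if result[codes[i]] is not None:
            i += 1
            continue
        j = i
        while j < len(codes) and result[codes[j]] is None:
            j += 1  # codes[i:j] is one gap of consecutive unassigned codes
        if i > 0 and j < len(codes) and result[codes[i - 1]] == result[codes[j]]:
            for hs in codes[i:j]:
                result[hs] = result[codes[i - 1]]
        i = j
    return result


# Hypothetical HS10 codes and industry labels, for illustration only.
toy = {
    "1111111111": "A", "1111111112": None,                      # filled in step 1
    "2222222210": "B", "2222222220": "C", "2222222230": None,   # left for hand matching
    "3000000010": "D", "3500000000": None, "3900000090": "D",   # filled in step 2
}
after_step_1 = mechanical_match_1(toy)
after_step_2 = mechanical_match_2(after_step_1)
```

Codes still unassigned after both functions correspond to the candidates for hand matching in Step 3. Recording which step produced each assignment, as the "matchtype" variable does in the published files, would be a natural extension.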
4. Concording HS to SIC5/NAICS7 Product Classes

4.1. Census's Procedure for Mapping HS to SIC and NAICS

Researchers in international trade and industrial organization have recently begun studying the role of changes in product mix in plant- and firm-level performance, as well as examining how exposure to international trade can affect firms' product mix. Examples of this research include Bernard, Redding and Schott (2010), Pierce (2011), Bernard, Redding and Schott (2011) and Goldberg, Khandelwal, Pavcnik and Topalova (Forthcoming). With this growing interest in product-level data, it is increasingly important to be able to match international trade and domestic production data at a highly disaggregated level. This section describes the construction of concordances that match ten-digit HS codes to five-digit SIC and seven-digit NAICS product classes, a more disaggregated level than has previously been available to researchers.

The primary bridge between HS and SIC (NAICS) product classes is a code referred to by the Census Bureau as a "SIC-base" ("NAICS-base"), which we refer to generically as "basecodes."7 Basecodes are eight-digit alphanumeric codes that can generally be thought of as describing product characteristics. The first four (six) digits of the SIC (NAICS) basecode represent the "root" industry of the basecode. We refer to basecode roots here as "baseroots" and use them in constructing the industry concordances, as described in the preceding section. The remaining digits are internal identifiers for whether the basecode encompasses one or more product classes, and, in the latter instance, whether those product classes are from different industries. For Census year 1992, enough data are available for us to also construct an HS to SIC5/NAICS7 concordance based on basecodes.
Forthe 1997, 2002 and 2007 HS to SIC5/NAICS7 concordances, however, we are restricted by data limitations to matching HS and SIC/NAICS product codes through baseroots only. The di⁄erences between constructing concordances using basecodes versus baseroots are discussed in detail in the next sub-section. We match HS product codes to SIC5/NAICS7 product classes via baseroots using two complementary mappings produced by Census. The (cid:133)rst mapping, which we refer to here as an (cid:147)HS-baseroot(cid:148)concordance, assigns a single baseroot to each HS code. As noted above, these mappings are published in Census(cid:146)s monthly releases of U.S. trade data on CDs. The secondmappingisknownastheprincipledi⁄erences(PD)(cid:133)le, whichisconstructedforevery economic census in years ending in 2 and 7. The PD (cid:133)le assigns a single baseroot to each product class inthe SICorNAICS. HSproduct codes canthenbe matchedtoSIC5/NAICS7 product classes through their baseroots. The HS-baseroot and PD mappings are discussed in detail in Appendixes 4 and 5 below, respectively. Atthispoint,anexamplemaybeusefulfor(cid:133)xingideas. In1992,HScode7215200000was used to track exports of (cid:147)other bars and rods of iron or nonalloy steel, cold-formed or cold- (cid:133)nished, less than 0.25 percent carbon.(cid:148) According to the 1992 HS-baseroot concordance, this HS code (cid:150)and 222 others (cid:150)maps into SIC baseroot 3312. This baseroot, in turn, maps into 11 di⁄erent SIC product class codes from 3 di⁄erent four-digit SIC industries in the 1992 PD (cid:133)le: 33121, 33122, 33123, 33124, 33126, 33127, 33128, 3312C, 33167, 33168 and 7A more detailed discussion of Census(cid:146) SIC and NAICS concordance methods is available at www.census.gov/epcd/www/intronet.html.
HS to SIC and NAICS 9 33170.8 We note that in the o¢ cial Census ten-digit HS to four-digit SIC mapping discussed in Section 3, HS code 7215200000 maps uniquely to SIC industry 3312. This example highlights the (cid:147)many-to-many(cid:148)nature of the HS-product-class concordances. While each HS code maps to a single baseroot, many HS codes (223 in this example) can map to a single baseroot. Similarly, while each (cid:133)ve-digit SIC product class maps to a single baseroot, many product classes (from three di⁄erent industries, in this example) may map to a single baseroot. As discussed in Section 5 below, the HS-baseroot and PD (cid:133)les can be used to match the product classes U.S. manufacturing (cid:133)rms produce in each CMF year to the products they import and export in those years. 4.2. Matching on Basecodes Versus Baseroots Matching on baseroots is appealing because HS-baseroot mappings are available in all years, allowing us to create concordances for every economic census year since 1992. As notedabove, however, wehaveaccesstomoredisaggregatedHS-basecodeandSIC5-basecode mappings for 1992. The primary advantage of concordances using basecodes is a (cid:147)more precise(cid:148)mapping between HS and SIC5. To illustrate what we mean by (cid:147)more precise(cid:148), consider once again HS code 7215200000, which we used to illustrate matching through baseroots in the previous sub-section. As mentionedabove, HScode72150200000and222otherHScodesmatchedto11di⁄erentSIC5 product classes through baseroot 3312. 
When HS and SIC5 codes are matched through full basecodes, rather than baseroots, however, we (cid:133)nd that 7215200000 is one of only 10 HS codes that map to only two SIC5s (cid:150)33128 and 33168, de(cid:133)ned under basecode 33128B00 (cid:150) described as (cid:147)cold-(cid:133)nished steel bars/bar shapes (made in mills)(cid:148)and (cid:147)cold-(cid:133)nished steel bars/bar shapes (not made in mills).(cid:148)9 Because HS code 7215200000 is described as (cid:147)other bars and rods of iron or nonalloy steel, cold-formed or cold-(cid:133)nished, less than 0.25 percent carbon(cid:148), it appears that assigning SIC5s based on a full basecode, rather than a baseroot, has provided a better match, by dropping unrelated SIC5 products like sheet and strip, pipe and tube and rails. HS code 7215200000 was matched to 9 additional SIC5s when we matched HS codes to SIC5 codes with baseroots versus basecodes. This matching of HS codes to additional SIC5s when matching with baseroots is not uncommon, as illustrated in the following analysis of 1992, the only year for which we can do both types of mappings. Of the 16,022 import HS codes in use in 1992, 9,289 are matched to additional SIC5s when using baseroot matching. The mean number of additional SIC5s matched to each import HS is 2.35. 
Similarly, of the 8,054 export HS codes in use in 1992, 5,396 are matched to extra SIC5s under baseroot 8The product descriptions for these SIC5 product-classes are as follows: 33121 - Coke over and blast furnace products; 33122 - Steel ingot and semi(cid:133)nished shapes; 33123 - Hot-rolled sheet and strip including tin-milled products; 33124 - Hot-rolled bars and bar shapes, plates, structural; 33126 - Steel pipe and tubes (made in steel mills); 33127 - Cold-rolled steel sheet and strip (made in mills); 33128 - Cold-(cid:133)nished steel bars/bar shapes (made in mills); 3312C - Other steel mill products, including steel rails; 33167 - Cold-rolled steelsheetandstrip(notmadeinmills);33168-Cold-(cid:133)nishedsteelbarsandbarshapes(notmadeinmills); 33170 - Steel pipe and tubes. 9NotethatthisexampleprovidesagoodillustrationofhowHScodesmaymatchtomorethanoneSIC5, since the SIC considers the method of production when assigning product classi(cid:133)cations. The di⁄erence between these product classes is whether or not they are made in steel mills.
HS to SIC and NAICS 10 matching. The mean number of additional SIC5s matched to each export HS is 2.72. Table 5 displays the number of extra SIC5s associated with HS10 import and export codes for 1992. Table 5: Additional SIC5s Associated With Each HS Under Baseroot Matching Exports Imports HS10 Additional SIC5 HS10 Additional SIC5 2,658 0 6,733 0 1,090 1 1,818 1 1,102 2 1,983 2 742 3 1,294 3 731 4 1,225 4 389 5 618 5 181 6 289 6 400 7 819 7 283 8 368 8 212 9 402 9 42 10 107 10 25 11 38 11 74 12 112 12 50 13 75 13 33 14 60 14 23 16 57 16 11 17 13 17 1 20 1 20 3 24 6 24 4 27 4 27 Notes: Table displays the number of "Additional" SIC5s associated with HS10 export and import codes in 1992. Additional SIC5s are SIC5 product codes that are associatedwithaparticularHS10whenaconcordanceis constructedwith4 digitbaseroots,ratherthanafull8 digit basecode. Forsome types of research, matching HSandSIC5/NAICS7 codes throughfull basecodes might be useful. Pierce (2011), for example, identi(cid:133)es U.S. manufacturing establishments that received antidumping protection by matching the HS10s used to classify products in antidumpinginvestigationstotheSIC5product-classesthatestablishmentsreportedproducing in the CMF. In this case, matching on baseroots, rather than full basecodes, would likely lead to some unprotected plants being incorrectly identi(cid:133)ed as recipients of antidumping protection. Unfortunately, Census published a full HS10-basecode mapping only for 1992. As a result, matching on basecodes can only be performed in a somewhat limited time period. In the electronic appendix, we provide HS10 to SIC5 concordances constructed with basecode matching for 1992 with (cid:133)lenames m_basecode_92.csv and x_basecode_92.csv for imports and exports, respectively. 
In the import concordance, 16,022 HS codes are matched to 1,564 SIC5 codes through 812 basecodes.¹⁰ In the export concordance, 8,053 HS codes are matched to 1,555 SIC5 codes through 806 basecodes.¹¹

¹⁰ Six hundred twenty-five import HS codes have basecodes with no SIC5 match. Two SIC5 codes have basecodes with no import HS match.
¹¹ Four hundred eight export HS codes have basecodes with no SIC5 match. Eleven SIC5 codes have basecodes with no export HS match.
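The difference between basecode and baseroot matching can be illustrated with a short Python sketch (the paper's own code is in Stata; see Appendix A). The HS codes, basecodes and SIC5 pairings below are invented placeholders rather than entries from the Census files, and the function names are hypothetical. Because a baseroot is the first four characters of a basecode, baseroot matching pools every SIC5 whose basecode shares that root, so it can only add product classes relative to full basecode matching.

```python
from collections import defaultdict

# Toy product-class file: (eight-character basecode, SIC5). Invented values.
pd_rows = [
    ("33120001", "33121"),
    ("33120002", "33122"),
    ("33120002", "33123"),
    ("20110001", "20111"),
]
# Toy HS10 -> basecode mapping. Invented values.
hs_to_basecode = {"7201100000": "33120001", "0201100000": "20110001"}


def match_on_basecode(hs_map, rows):
    """Map each HS10 to the SIC5s that share its full eight-character basecode."""
    by_base = defaultdict(set)
    for basecode, sic5 in rows:
        by_base[basecode].add(sic5)
    return {hs: set(by_base.get(base, set())) for hs, base in hs_map.items()}


def match_on_baseroot(hs_map, rows):
    """Map each HS10 to the SIC5s that share its four-character baseroot."""
    by_root = defaultdict(set)
    for basecode, sic5 in rows:
        by_root[basecode[:4]].add(sic5)
    return {hs: set(by_root.get(base[:4], set())) for hs, base in hs_map.items()}


def additional_sic5(hs_map, rows):
    """Count the SIC5s gained per HS10 when moving from basecode to baseroot matching."""
    full = match_on_basecode(hs_map, rows)
    root = match_on_baseroot(hs_map, rows)
    return {hs: len(root[hs] - full[hs]) for hs in hs_map}


extra = additional_sic5(hs_to_basecode, pd_rows)
```

In this toy example the first HS code picks up two additional SIC5s under baseroot matching and the second picks up none, which is the quantity tabulated in Table 5.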
Figure 1: Linking the LFTTD to the CMF at the Firm-Baseroot-Level

5. Linking the LFTTD and CM

As mentioned above, a large new literature has grown around examining changes in product mix at firms and plants, and especially how those changes are related to international trade. This brief section illustrates how the concordances generated above can be used to create a firm-baseroot-level dataset of trade and production. Firm-level trade data for every U.S. importer and exporter are located in Census’s Longitudinal Firm Trade Transactions Database (LFTTD), which is described in detail in Bernard, Jensen and Schott (2009). Firm-product-level domestic production data for every U.S. manufacturer are from the product trailer data of Census’s Census of Manufactures (CM). Once these datasets are merged, researchers will possess a firm-baseroot-level dataset recording production, imports and exports at the same level of aggregation (i.e., according to SIC or NAICS baseroots) for a particular census year. This merged dataset will then greatly increase researchers’ ability to understand changes in firms’ product mix over time.

The merged trade and production dataset can be constructed relatively simply, as follows. First, the international trade data in the LFTTD are merged with the trade concordance described in Section 3 by HS code and year, yielding data on the full set of baseroots imported and exported by U.S. firms. Then, the product trailer files of the CM, which contain data on output by product for every U.S. manufacturing establishment, are merged with the PD file for the appropriate year and aggregated to the firm-baseroot level. Lastly, these two datasets are merged by baseroot. The resulting dataset contains information on the value of shipments, imports and exports for every U.S. manufacturing firm in SIC or NAICS format. This process is illustrated in Figure 1.
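The three merge steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used inside Census: the firm identifiers, codes and values are invented, the record layouts are hypothetical, and the sketch assumes each product class maps to a single baseroot in the PD file.

```python
from collections import defaultdict

# All identifiers and values below are invented for illustration.
# Trade records: (firm, HS10, year, flow, value), with flow "imports" or "exports".
trade = [
    ("F1", "7208100000", 1992, "imports", 50.0),
    ("F1", "7208100000", 1992, "exports", 20.0),
    ("F2", "8471100000", 1992, "exports", 70.0),
]
# Trade concordance keyed by (HS10, year).
hs_to_baseroot = {("7208100000", 1992): "3312", ("8471100000", 1992): "3571"}
# Product trailer records: (firm, plant, product class, value of shipments).
product_trailer = [
    ("F1", "P1", "33123", 100.0),
    ("F1", "P2", "33124", 40.0),
    ("F2", "P3", "35711", 300.0),
]
# PD file reduced to product class -> baseroot (assumed one-to-one here).
pd_file = {"33123": "3312", "33124": "3312", "35711": "3571"}


def build_firm_baseroot(trade, hs_to_baseroot, product_trailer, pd_file):
    """Aggregate trade and production to (firm, baseroot) cells."""
    cells = defaultdict(lambda: {"shipments": 0.0, "imports": 0.0, "exports": 0.0})
    # Step 1: attach baseroots to trade flows by HS code and year.
    for firm, hs10, year, flow, value in trade:
        root = hs_to_baseroot.get((hs10, year))
        if root is not None:
            cells[(firm, root)][flow] += value
    # Step 2: attach baseroots to plant output and sum plants up to the firm.
    for firm, _plant, product_class, shipments in product_trailer:
        root = pd_file.get(product_class)
        if root is not None:
            cells[(firm, root)]["shipments"] += shipments
    # Step 3: both sources now share the (firm, baseroot) key.
    return dict(cells)


merged = build_firm_baseroot(trade, hs_to_baseroot, product_trailer, pd_file)
```

Keying both sources on the same (firm, baseroot) pair is what places production, imports and exports at a common level of aggregation.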
6. Conclusion

While empirical economists increasingly study the relationship between international trade and domestic economic activity, research has been slowed due to gaps in these datasets. This paper creates an algorithm and provides sets of concordances linking the ten-digit HS codes used by the United States to track international trade with the SIC and NAICS categories used to characterize domestic economic activity. Through the use of these concordances it is now possible to create linked datasets of trade and domestic production in both SIC and NAICS from 1989 to 2009 and to link trade and production data at a more disaggregated level than is typically available. In addition, we provide concordances linking ten-digit HS codes to five-digit SIC and seven-digit NAICS product classes. These concordances then allow researchers studying the product-switching behavior of U.S. firms to match trade and domestic production data at a more disaggregated level than was previously available.

References

A.B. Bernard, J.B. Jensen and P.K. Schott (2009), “Importers, Exporters and Multinationals: A Portrait of Firms in the U.S. that Trade Goods,” in T. Dunne, J.B. Jensen and M.J. Roberts (eds.), Producer Dynamics: New Evidence from Micro Data (University of Chicago Press).
A.B. Bernard, S.J. Redding and P.K. Schott (2010), “Multi-Product Firms and Product Switching,” American Economic Review, 100:70-97.
A.B. Bernard, S.J. Redding and P.K. Schott (2011), “Multi-Product Firms and Trade Liberalization,” Quarterly Journal of Economics, 126(3), 1271-1318.
R.C. Feenstra (1996), “U.S. Imports, 1972-1994: Data and Concordances,” NBER Working Paper 5515.
R.C. Feenstra, J. Romalis and P.K. Schott (2002), “U.S. Imports, Exports and Tariff Data, 1989 to 2001,” NBER Working Paper 9387.
P. Goldberg, A. Khandelwal, N. Pavcnik and P. Topalova (forthcoming), “Imported Intermediate Inputs and Domestic Product Growth: Evidence from India,” Quarterly Journal of Economics.
J.R. Pierce and P.K. Schott (forthcoming), “Concording U.S. Harmonized System and Schedule B Codes Over Time,” Journal of Official Statistics.
J.R. Pierce (2011), “Plant-Level Responses to Antidumping Duties: Evidence from U.S. Manufacturers,” Journal of International Economics, 85(2), 222-233.
A.L. Revenga (1992), “Exporting Jobs? The Impact of Import Competition on Employment and Wages in U.S. Manufacturing,” Quarterly Journal of Economics, 107(1), 255-284.
J.D. Sachs and H.J. Shatz (1994), “Trade and Jobs in U.S. Manufacturing,” Brookings Papers on Economic Activity, 1994(1), 1-69.
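For readers who do not use Stata, the two mechanical matching steps implemented in the do-file listed in Appendix A can be sketched in Python. This is a simplified illustration rather than a translation of the authors' code: HS codes are treated as zero-padded ten-character strings, the codes and industries are invented, and the function names are hypothetical. The first step assigns an unmatched HS code the industry of its already-matched relatives at the HS9 level if they are unanimous, and otherwise moves up one level at a time to HS2. The second step fills any remaining gap in the sorted list when the last known industry before the gap equals the first known industry after it.

```python
from collections import defaultdict


def first_mechanical_match(codes):
    """Fill missing industries from unanimous matches at HS9, then HS8, ..., HS2.

    codes maps a ten-character HS string to an industry code, or None if unmatched.
    Only the originally matched codes are used to judge unanimity.
    """
    known = {hs: ind for hs, ind in codes.items() if ind is not None}
    filled = dict(codes)
    for ndigits in range(9, 1, -1):
        groups = defaultdict(set)
        for hs, ind in known.items():
            groups[hs[:ndigits]].add(ind)
        for hs in list(filled):
            if filled[hs] is None:
                candidates = groups.get(hs[:ndigits], set())
                if len(candidates) == 1:
                    filled[hs] = next(iter(candidates))
    return filled


def second_mechanical_match(codes):
    """Fill runs of missing industries whose neighbours in sorted order agree."""
    ordered = sorted(codes)
    filled = dict(codes)
    i = 0
    while i < len(ordered):
        if codes[ordered[i]] is not None:
            i += 1
            continue
        j = i
        while j < len(ordered) and codes[ordered[j]] is None:
            j += 1
        before = codes[ordered[i - 1]] if i > 0 else None
        after = codes[ordered[j]] if j < len(ordered) else None
        if before is not None and before == after:
            for k in range(i, j):
                filled[ordered[k]] = before
        i = j
    return filled


# Invented example: None marks HS codes without a Census-provided industry.
raw = {
    "1001100000": "0111", "1001500000": None, "1001900000": "0111",
    "2000000000": "2033", "2100000000": None, "2200000000": "2033",
    "3001000000": "2834", "3002000000": "2836", "3003000000": None,
}
step1 = first_mechanical_match(raw)
step2 = second_mechanical_match(step1)
```

In this example the first step fills 1001500000 (its HS4 relatives agree), the second step fills 2100000000 (its neighbours agree), and 3003000000 remains unmatched, which corresponds to the cases the do-file leaves for hand matching.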
A Appendix 1: Stata Code

Contents of schott_algorithm_20.do:

**0 Prelim
clear
set more off
set mem 500m

**1 SIC Mapping
foreach zzz in exp imp {

**1.1 read in the hs-sic mappings provided by census in its monthly trade cd files
cd "C:\Users\pks4\Documents\MyDropbox\research\concordances\production\for_schott\"

*create list of mappings
use `zzz'_concord_89_106, clear
*keep latest year for which sic is available
keep if year==101
keep commodity sic
drop if sic==""
duplicates drop commodity, force
sort commodity
save temp0, replace

*read in the list of raw hs10 export codes
use `zzz'_concord_89_106, clear
*only need to match years in which sic data are not provided
keep if year>101
keep commodity
duplicates drop commodity, force
sort commodity
merge commodity using temp0, keep(sic)
tab _merge
drop _merge
destring commodity, force g(hs)
egen sic87=group(sic)
save `zzz'temp_01, replace

*save group-sic mapping for below
use `zzz'temp_01, clear
collapse (mean) sic87, by(sic)
rename sic87 sic87_new1
rename sic sic_new1
drop if sic_new1=="" | sic87_new1==.
sort sic87_new1
save temp1, replace

use `zzz'temp_01, clear
collapse (mean) sic87, by(sic)
rename sic87 sic87_new2
rename sic sic_new2
drop if sic_new2=="" | sic87_new2==.
sort sic87_new2
save temp2, replace

**1.2 First Mechanical Match
**Create new matches mechanically by looking to see what the already-matched sic look like.
**Look at all hs9 to see what sic87 the already-matched have; if unanimous, use that. If not,
**go up one level. and so on.
use `zzz'temp_01, clear
gen sic87_new1 = sic87
sum hs sic87*
quietly {
    foreach x in 9 8 7 6 5 4 3 2 {
        noisily display [`x']
        local y = 10-`x'
        gen hs`x' = int(hs/(10^`y'))
        egen t1 = mean(sic87), by(hs`x')
        egen t2 = sd(sic87), by(hs`x')
        egen t3 = count(sic87), by(hs`x')
        gen sic87_`x' = t1 if t2==0 | t3==1
        replace sic87_new1 = sic87_`x' if sic87==. & sic87_new1==.
        drop t1 t2 t3
        drop hs`x' sic87_`x'
    }
}
sum hs sic87 sic87_new1
sort hs
save `zzz'temp_02, replace

**1.3 Second Mechanical Match
**Look at gaps. If last known and next known are the same, use them to fill in.
use `zzz'temp_02, clear
gen sic87_new2 = sic87_new1
gen begin = 1 if sic87_new1==. & sic87_new1[_n-1]~=.
gen end = sic87_new1==. & sic87_new1[_n+1]~=.
gen bsum = sum(begin)
gen gap = sic87_new1==.
replace bsum=. if gap==0
gen sb = sic87_new1[_n-1]*begin
gen se = sic87_new1[_n+1]*end
egen tb = mean(sb), by(bsum)
egen te = mean(se), by(bsum)
gen match = tb==te
replace sic87_new2 = tb if match==1 & sic87_new1==.
sum hs sic87*
drop begin end bsum gap sb se tb te match
sort hs
save `zzz'temp_03, replace

*1.4 Recover groups from above
use `zzz'temp_03, clear
sort sic87_new1
merge sic87_new1 using temp1, keep(sic_new1)
tab _merge
drop _merge
sort sic87_new2
merge sic87_new2 using temp2, keep(sic_new2)
tab _merge
drop _merge
sort hs
gen t=sic87_new1~=.
tab t
drop t
drop sic87*
format hs %15.0g
drop if hs<100
save `zzz'_concord_89_106_sicfillin, replace
}

**2 naics
foreach zzz in exp imp {

**2.1 read in the hs-sic mappings provided by census in its monthly trade cd files
*create list of mappings
use `zzz'_concord_89_106, clear
*keep earliest year for which naics is available
keep if year==100
keep commodity naics
drop if naics==""
duplicates drop commodity, force
sort commodity
save temp0, replace

*read in the list of raw hs10 export codes
use `zzz'_concord_89_106, clear
*Only need years for which there is no naics
keep if year<100
keep commodity
duplicates drop commodity, force
sort commodity
merge commodity using temp0, keep(naics)
tab _merge
drop _merge
destring commodity, force g(hs)
egen naics87=group(naics)
save `zzz'temp_01, replace

*save group-naics mapping for below
use `zzz'temp_01, clear
collapse (mean) naics87, by(naics)
rename naics87 naics87_new1
rename naics naics_new1
drop if naics_new1=="" | naics87_new1==.
sort naics87_new1
save temp1, replace

use `zzz'temp_01, clear
collapse (mean) naics87, by(naics)
rename naics87 naics87_new2
rename naics naics_new2
drop if naics_new2=="" | naics87_new2==.
sort naics87_new2
save temp2, replace

**2.2 First Mechanical Match
**Create new matches mechanically by looking to see what the already-matched naics look like.
**Look at all hs9 to see what naics87 the already-matched have; if unanimous, use that. If not,
**go up one level. and so on.
use `zzz'temp_01, clear
gen naics87_new1 = naics87
sum hs naics87*
quietly {
    foreach x in 9 8 7 6 5 4 3 2 {
        noisily display [`x']
        local y = 10-`x'
        gen hs`x' = int(hs/(10^`y'))
        egen t1 = mean(naics87), by(hs`x')
        egen t2 = sd(naics87), by(hs`x')
        egen t3 = count(naics87), by(hs`x')
        gen naics87_`x' = t1 if t2==0 | t3==1
        replace naics87_new1 = naics87_`x' if naics87==. & naics87_new1==.
        drop t1 t2 t3
        drop hs`x' naics87_`x'
    }
}
sum hs naics87 naics87_new1
sort hs
save `zzz'temp_02, replace

**2.3 Second Mechanical Match
**Look at gaps. If last known and next known are the same, use them to fill in.
use `zzz'temp_02, clear
gen naics87_new2 = naics87_new1
gen begin = 1 if naics87_new1==. & naics87_new1[_n-1]~=.
gen end = naics87_new1==. & naics87_new1[_n+1]~=.
gen bsum = sum(begin)
gen gap = naics87_new1==.
replace bsum=. if gap==0
gen sb = naics87_new1[_n-1]*begin
gen se = naics87_new1[_n+1]*end
egen tb = mean(sb), by(bsum)
egen te = mean(se), by(bsum)
gen match = tb==te
replace naics87_new2 = tb if match==1 & naics87_new1==.
sum hs naics87*
drop begin end bsum gap sb se tb te match
sort hs
save `zzz'temp_03, replace

*2.4 recover groups from above
use `zzz'temp_03, clear
sort naics87_new1
merge naics87_new1 using temp1, keep(naics_new1)
tab _merge
drop _merge
sort naics87_new2
merge naics87_new2 using temp2, keep(naics_new2)
tab _merge
drop _merge
sort hs
gen t=naics87_new1~=.
tab t
drop t
drop naics87*
format hs %15.0g
drop if hs<100
save `zzz'_concord_89_106_naicsfillin, replace
}
**3 Add in hand matches to imports and exports, respectively, first for sic and then for naics
** Any missing matches after the last section were matched by hand by kitjawat. Add these
** hand matches into the data here and then also create a variable that identifies each
*** mapping according to whether it is from Census, mechanical match 1, mechanical match 2 or
** from kitjawat's hand matching.
**
** 2009.10.16 change sic 2612 to 2621 in kitjawat_handmatch_imports_sic_20080821 per Justin's email
** also add leading zero to sic's from handmatch and fix missing naics for 1605106000
**
use imp_concord_89_106_sicfillin, clear
sort hs
merge hs using kitjawat_handmatch_imports_sic_20080821
tab _merge
drop if _merge==2
replace kitjawat = 2621 if kitjawat==2612
drop _merge
gen sic_new3=sic_new2
tostring kitjawat, g(kitjawats)
replace kitjawats = "0"+kitjawats if kitjawat>=100 & kitjawat<=999
replace sic_new3=kitjawats if sic_new3=="" & kitjawats!=""
replace sic_new3="" if sic_new3=="."
sort hs
merge hs using sic_imp_jrp
tab _merge
replace sic_new3=sic_new4 if sic_new3=="" & sic_new4!=""
codebook sic_new3
gen id = "From Census"
gen newsic = sic
replace id = "From mechanical match 1" if sic==""
replace newsic = sic_new1 if sic==""
replace id = "From mechanical match 2" if newsic==""
replace newsic = sic_new2 if newsic==""
replace id = "From hand match" if newsic==""
replace newsic = kitjawats if newsic==""
label var id "SIC match type"
keep commodity hs newsic id
rename newsic sic
rename id sic_matchtype
rename sic new_sic
keep commodity new_sic sic_matchtype
order commodity new_sic sic_matchtype
sort commodity
save sic_imp_final, replace

use imp_concord_89_106_naicsfillin, clear
sort hs
merge hs using kitjawat_handmatch_imports_naics_20081016
tab _merge
drop if _merge==2
drop _merge
gen naics_new3=naics_new2
tostring kitjawat, g(kitjawats)
replace kitjawats = "311711" if commodity=="1605106000"
replace naics_new3=kitjawats if naics_new3=="" & kitjawats!=""
replace naics_new3="" if naics_new3=="."
sort hs
merge hs using naics_imp_jrp
tab _merge
replace naics_new3=naics_new4 if naics_new3=="" & naics_new4!=""
codebook naics_new3
gen id = "From Census"
gen newnaics = naics
replace id = "From mechanical match 1" if naics==""
replace newnaics = naics_new1 if naics==""
replace id = "From mechanical match 2" if newnaics==""
replace newnaics = naics_new2 if newnaics==""
replace id = "From hand match" if newnaics==""
replace newnaics = kitjawats if newnaics==""
label var id "NAICS match type"
drop naics
rename newnaics naics
rename id naics_matchtype
rename naics new_naics
keep commodity new_naics naics_matchtype
order commodity new_naics naics_matchtype
sort commodity
save naics_imp_final, replace

use exp_concord_89_106_sicfillin, clear
sort hs
merge hs using kitjawat_handmatch_exports_sic_20080821
tab _merge
drop if _merge==2
drop _merge
gen sic_new3=sic_new2
tostring kitjawat, g(kitjawats)
replace kitjawats = "0"+kitjawats if kitjawat>=100 & kitjawat<=999
replace sic_new3=kitjawats if sic_new3=="" & kitjawats!=""
replace sic_new3="" if sic_new3=="."
sort hs
merge hs using sic_exp_jrp
tab _merge
replace sic_new3=sic_new4 if sic_new3=="" & sic_new4!=""
codebook sic_new3
gen id = "From Census"
gen newsic = sic
replace id = "From mechanical match 1" if sic==""
replace newsic = sic_new1 if sic==""
replace id = "From mechanical match 2" if newsic==""
replace newsic = sic_new2 if newsic==""
replace id = "From hand match" if newsic==""
replace newsic = kitjawats if newsic==""
label var id "SIC match type"
drop sic
rename newsic sic
rename id sic_matchtype
rename sic new_sic
keep commodity new_sic sic_matchtype
order commodity new_sic sic_matchtype
sort commodity
save sic_exp_final, replace

use exp_concord_89_106_naicsfillin, clear
sort hs
merge hs using kitjawat_handmatch_exports_naics_20081016
tab _merge
drop if _merge==2
drop _merge
gen naics_new3=naics_new2
tostring kitjawat, g(kitjawats)
*replace kitjawats = "0"+kitjawats if kitjawat>=100 & kitjawat<=999
replace naics_new3=kitjawats if naics_new3=="" & kitjawats!=""
replace naics_new3="" if naics_new3=="."
sort hs
merge hs using naics_exp_jrp
tab _merge
replace naics_new3=naics_new4 if naics_new3=="" & naics_new4!=""
codebook naics_new3
gen id = "From Census"
gen newnaics = naics
replace id = "From mechanical match 1" if naics==""
replace newnaics = naics_new1 if naics==""
replace id = "From mechanical match 2" if newnaics==""
replace newnaics = naics_new2 if newnaics==""
replace id = "From hand match" if newnaics==""
replace newnaics = kitjawats if newnaics==""
label var id "NAICS match type"
drop naics
rename newnaics naics
rename id naics_matchtype
rename naics new_naics
keep commodity new_naics naics_matchtype
order commodity new_naics naics_matchtype
sort commodity
save naics_exp_final, replace

**4 Reassemble HS-SIC data for all years
*Imports
use imp_concord_89_106, clear
sort commodity
merge commodity using sic_imp_final
tab _merge
drop _merge
replace sic_matchtype="From Census" if sic!=""
replace sic=new_sic if sic=="" & new_sic!=""
sort commodity
merge commodity using naics_imp_final
tab _merge
drop _merge
replace naics_matchtype="From Census" if naics!=""
replace naics=new_naics if naics=="" & new_naics!=""
drop new* descrip*
destring commodity, g(hs) force
append using imp_107_concord
append using imp_108_concord
append using imp_109_concord
order commodity hs year sic sic_matchtype naics naics_matchtype
sort commodity year
save hs_sic_naics_imports_89_109_20111004, replace
outsheet using hs_sic_naics_imports_89_109_20111004.csv, replace

*Exports
use exp_concord_89_106, clear
sort commodity
merge commodity using sic_exp_final
tab _merge
drop _merge
replace sic_matchtype="From Census" if sic!=""
replace sic=new_sic if sic=="" & new_sic!=""
sort commodity
merge commodity using naics_exp_final
tab _merge
drop _merge
replace naics_matchtype="From Census" if naics!=""
replace naics=new_naics if naics=="" & new_naics!=""
drop new* descrip*
destring commodity, g(hs) force
*This drops several special classification codes for U.S. goods returned from Puerto Rico
drop if hs<10
append using exp_107_concord
append using exp_108_concord
append using exp_109_concord
order commodity hs year sic sic_matchtype naics naics_matchtype
sort commodity year
save hs_sic_naics_exports_89_109_20111004, replace
outsheet using hs_sic_naics_exports_89_109_20111004.csv, replace

Contents of hs_sic5_basecodes_02.do:

clear
capture log close
set more off
set mem 1000m
log using full_conc_92.log, replace

use appndxd, clear
keep sicbase92 pc5
rename pc5 sic5
drop if sic5=="N/A"
sort sicbase92
save t1, replace

use hs_sic_m_allsources_1989_2006, clear
keep hs sicbase92
drop if sicbase92==""
sort sicbase92
joinby sicbase92 using t1, unmatched(both)
tab _merge
keep if _merge==3
drop _merge
rename sicbase92 basecode
sort hs
save m_basecode_92, replace
outsheet using m_basecode_92.csv, replace

use hs_sic_x_allsources_1989_2006, clear
keep hs sicbase92
drop if sicbase92==""
sort sicbase92
joinby sicbase92 using t1, unmatched(both)
tab _merge
keep if _merge==3
drop _merge
rename sicbase92 basecode
sort hs
save x_basecode_92, replace
outsheet using x_basecode_92.csv, replace
capture log close

Contents of hs_sic5_naics7_baseroots_04.do:

clear
set more off
set mem 1000m
cd "C:\Users\Justin\Documents\RAWork\Jensen_Schott_Bernard\hs_sic_naics_concordance"
capture log close
log using baseroot_conc_create.log, replace

foreach x in imports exports {
    use hs_sic_naics_`x'_89_106_20091016, clear
    keep commodity sic sic_matchtype
    order commodity sic
    keep commodity sic
    rename commodity hs
    rename sic sicbaseroot
    sort sicbaseroot
    save hs_sic_`x', replace
}

foreach x in imports exports {
    use hs_sic_naics_`x'_89_106_20091016, clear
    keep commodity naics naics_matchtype
    order commodity naics
    keep commodity naics
    rename commodity hs
    rename naics naicsbaseroot
    sort naicsbaseroot
    save hs_naics_`x', replace
}

foreach x in imports exports {
    use pd92, clear
    keep sicbase92 pc5
    drop if pc5=="N/A"
    gen sicbaseroot=substr(sicbase92,1,4)
    rename pc5 sic5
    keep sicbaseroot sic5
    sort sicbaseroot
    joinby sicbaseroot using hs_sic_`x', unmatched(both)
    tab _merge
    keep if _merge==3
    drop _merge
    order hs sic5
    save hs_sic5_`x'_92, replace
    outsheet using hs_sic5_`x'_92.csv, replace
}

foreach y in 97 02 {
    foreach x in imports exports {
        noisily display "`x'`y'"
        use pd`y', clear
        keep baseroot pc7
        drop if baseroot=="N/A"
        drop if pc7=="N/A"
        rename pc7 naics7
        rename baseroot naicsbaseroot
        sort naicsbaseroot
        joinby naicsbaseroot using hs_naics_`x', unmatched(both)
        tab _merge
        keep if _merge==3
        drop _merge
        order hs naics7
        save hs_naics7_`x'_`y', replace
        outsheet using hs_naics7_`x'_`y'.csv, replace
    }
}
capture log close

B Appendix 2: Downloads

All files described here are available in a zip archive accompanying this paper on Schott’s website.¹²

¹² See http://www.som.yale.edu/faculty/pks4/sub_international.htm.

B1. HS-SIC4/NAICS6 Concordance Files

The HS-NAICS6 (SIC4) industry concordances for 1989 to 2009 (1989 to 2006) are available in two files for imports and exports, named, respectively, hs_sic_naics_imports_89_109_20111004.dta and hs_sic_naics_exports_89_109_20111004.dta, where 89 represents the beginning year of 1989, 109 represents the ending year of 2009 and 20111004 represents the version date.

1. HS: ten-digit HS import or export code
2. SIC: corresponding four-digit SIC code
3. NAICS: corresponding six-digit NAICS code
4. SIC_MATCHTYPE: description of match origin (see Table 3)
5. NAICS_MATCHTYPE: description of match origin (see Table 3)
6. COMMODITY: a string version of HS, with leading zeroes, where applicable

The Stata do-file used to create these concordances is also available in the electronic appendix with filename schott_algorithm_20.do.

B2. HS-SIC5 (1992, Using Full Basecodes)

The HS-SIC5 (basecode) concordances for 1992 are available in two files named m_basecode_92.csv for imports and x_basecode_92.csv for exports. These files contain the following variables:

1. HS: ten-digit HS import or export code
2. Basecode: eight-character basecode associated with HS
3. SIC5: The SIC5s associated with a particular HS and basecode. Note that there may be multiple entries for a single HS code when it matches to more than one SIC5.

The Stata do-file used to create these concordances is also available in the electronic appendix with filename hs_sic5_basecodes_02.do.

B3. HS-SIC5 (1992, Using Baseroots)

The HS-SIC5 (baseroot) concordances for 1992 are available in two files named hs_sic5_imports_92.csv for imports and hs_sic5_exports_92.csv for exports. These files contain the following variables:

1. HS: ten-digit HS import or export code
2. SICBASEROOT: four-character SIC baseroot associated with HS
3. SIC5: The SIC5s associated with a particular HS and baseroot. Note that there may be multiple entries for a single HS code when it matches to more than one SIC5.

The Stata do-file used to create these concordances is also available in the electronic appendix with filename hs_sic5_naics7_baseroots_04.do.

B4. HS-NAICS7 (1997 and 2002, Using Baseroots)

The HS-NAICS7 (baseroot) concordances for 1997 and 2002 are available in four files named hs_naics7_imports_yy.csv for imports and hs_naics7_exports_yy.csv for exports, where yy is the last two digits of the year. These files contain the following variables:

1. HS: ten-digit HS import or export code
2. NAICSBASEROOT: six-character NAICS baseroot associated with HS
3. NAICS7: The NAICS7 associated with a particular HS and baseroot. Note that there may be multiple entries for a single HS code when it matches to more than one NAICS7.

The Stata do-file used to create these concordances is also available in the electronic appendix with filename hs_sic5_naics7_baseroots_04.do.

B5. HS-SITC Concordance Files

Census’s mapping of HS and SITC codes from its published trade data is available in two files named hs_sitc_imports.csv for imports and hs_sitc_exports.csv for exports. These files contain the following variables:

1. HS: ten-digit HS import or export code
2. SITC: corresponding five-digit revision 3 SITC code

C Appendix 3: Other Concordances

This appendix discusses the relationship between the concordances developed above and two other HS-SIC/NAICS concordances that can be found on the web.

C1. The Feenstra (2002) Concordance

Feenstra et al. (2002) provide background for U.S. HS10-level trade data for 1989 to 2001. Those data have subsequently been extended to 2006 and are available on Feenstra’s website. Of the 26,277 ten-digit HS codes used to track U.S. imports (exports) in the Feenstra et al. (2002) 1989 to 2001 dataset, Census provided a baseroot concordance for all but 1,222. Of these 1,222 HS codes, 898 were assigned to a four-digit SIC category using an HS to 1987-revision MSIC concordance from Feenstra (1996). Though in principle MSIC codes differ from SIC codes, a number of MSIC codes map directly into regular SIC codes. The remaining 324 products were assigned to industries via an algorithm similar to that described in Section 3 above.

The set of HS codes found in the Feenstra et al. concordances differs slightly from that of the master list described in Section 3. Of the 25,329 (11,509) unique import (export) HS codes that result from merging Feenstra et al.’s concordances with our own, we find that 24,947 (11,472) are in common while 382 (37) are only in the Feenstra et al. (2002) concordance. We do not have an explanation for the codes unique to the Feenstra et al. (2002) dataset, though we suspect they may be due to Census’s periodic revisions of the trade data.

C2. The EIIT Concordance

A five-digit SIC to ten-digit HS concordance of unknown origin is posted to the EIIT website.¹³ This concordance does not distinguish between import and export HS categories, and it does not note the years to which its HS codes apply.

¹³ See www.macalester.edu/research/economics/page/haveman/Trade.Resources/Concordances/FromHS/10hs5sic87.txt.
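Comparisons between concordances of the kind reported in this appendix reduce to set operations on HS codes, plus a prefix comparison when two sources report industries at different levels of detail. The Python sketch below is illustrative only: the two dictionaries are invented placeholders, not extracts from any of the concordances discussed here, and the function names are hypothetical.

```python
def overlap_counts(ours, theirs):
    """Count HS codes in either list, in both, and in only one of them."""
    a, b = set(ours), set(theirs)
    return {
        "total": len(a | b),
        "common": len(a & b),
        "only_ours": len(a - b),
        "only_theirs": len(b - a),
    }


def same_root_count(ours, theirs, root_len=4):
    """Among HS codes in both sources, count those whose industry codes share a root."""
    common = set(ours) & set(theirs)
    agree = sum(1 for hs in common if ours[hs][:root_len] == theirs[hs][:root_len])
    return agree, len(common)


# Invented example: one source reports SIC4, the other SIC5.
ours = {"0101110010": "0272", "7208100000": "3312", "8471100000": "3571"}
theirs = {"7208100000": "33123", "8471100000": "35772", "9999999999": "39999"}

counts = overlap_counts(ours, theirs)
agree, n_common = same_root_count(ours, theirs)
```

Truncating both industry codes to their first four characters puts a five-digit product class and a four-digit industry on a common footing before they are compared.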
The EIIT concordance contains 17,436 HS codes and maps them to 805 five-digit SIC categories, 741 of which are in manufacturing. If collapsed to the four-digit SIC level, this list comprises 439 four-digit SIC codes, 386 of which are in manufacturing. This compares with the 1,440 five-digit and 459 four-digit manufacturing SIC codes contained in the 1987 revision of the SIC, and the 1,462 five-digit and 459 four-digit manufacturing SIC codes in the 1992 revision of the SIC. The 386 unique manufacturing codes in the EIIT concordance are similar to the 386 “super-sic” codes described in Feenstra et al. (2002).

The EIIT concordance appears to be a close cousin of the concordance described in Section 4. Of the 8,215 (15,120) export (import) HS codes that appear in both concordances, 6,058 (10,762) have the same four-digit root.

D Appendix 4: Census’s HS-Baseroot Concordances

Census produces an HS-basecode concordance only for the years in which there is an economic census. However, it provides more aggregate HS-baseroot concordances with its monthly published trade statistics. Census constructs the HS-to-basecode and HS-baseroot concordances so that the Foreign Trade Division can publish trade statistics using the same industry categories it uses to publish domestic production statistics. As alluded to above and as discussed in more detail at www.censusbureau.biz/epcd/oei/view/appenda.txt, the HS to basecode mappings often make more sense for exports than for imports:

“It is somewhat easier to find a reasonable statistical basis for comparing domestic output with exports than with imports. This is because there are substantial numbers of imported commodities which are not produced in the United States or are produced in very small quantities. On the other hand, the merchandise exported from the United States is ordinarily produced in this country and reflects items important to output.”

As discussed above, we assemble a “master list” of these mappings by appending the HS-baseroot concordances contained in the December trade CD-ROMs. The Stata files containing these lists are discussed in Section B. They are available on Schott’s website.

D1. HS-SIC

Census’s HS-baseroot concordances virtually always map HS codes to a single four-character SIC root. As noted above, these roots are the first four characters of an eight-character SIC basecode.¹⁴ For the most part, these baseroots are proper industries, but there are some (e.g., 3XXX) that reflect the difficulties noted in Sections 3 and 4 above.

¹⁴ Though the concordance files included with the monthly trade data do not include the full, internal-to-Census basecode, that mapping is available for 1992 at http://www.census.gov/epcd/www/intronet.html (see second paragraph).

We note the following:

• As indicated in Table 6, the number of unique HS export (import) codes in the master list that have SIC basecodes associated with them in at least one year ranges from 7,908 (14,402) in 1989 to 8,629 (17,183) in 2001. 2001 is the final year in which SIC codes appear in the concordance.
• The number of unique SIC codes to which these export (import) HS codes match ranges from 429 (443) in 1989 to 449 (450) in 2001.¹⁵
• Some of the SIC basecodes to which HS codes are assigned are incomplete (e.g., 23XX), while others are outside manufacturing (e.g., 0273). As noted in the third column of each panel in Table 6, the number of manufacturing SIC basecodes to which these export (import) codes match ranges from 371 (386) in 1989 to 391 (392) in 2001. The fact that there are fewer than the official number of 459 manufacturing SIC codes in the concordance files is consistent with the discussion in Sections 3 and 4 above.

Table 6: HS and Four-Digit SIC Codes in the “Master List”

               Exports                     Imports
Year    HS10   SIC4   Man SIC4      HS10   SIC4   Man SIC4
1989   7,908    429     371       14,402    443     386
1990   7,971    447     387       15,214    446     387
1991   8,110    448     387       15,414    446     387
1992   8,107    448     387       15,430    448     388
1993   8,167    449     391       15,502    447     389
1994   8,239    449     391       15,980    447     389
1995   8,308    449     391       16,630    447     389
1996   8,593    449     391       16,882    447     389
1997   8,609    449     391       17,345    447     389
1998   8,620    449     391       17,099    447     389
1999   8,626    449     391       17,179    450     392
2000   8,635    449     391       17,215    450     392
2001   8,629    449     391       17,183    450     392

Notes: Table displays number of ten digit HS codes, four digit SIC codes, and four digit manufacturing SIC codes appearing in the concordance files accompanying the U.S. monthly trade statistics sold by the U.S. Census Bureau.

D2. HS-NAICS

As with the SIC, Census’s concordance files virtually always map HS codes to a unique six-digit NAICS baseroot. For the most part, these baseroots are proper NAICS industries, but there are some that reflect the difficulties noted in Sections 3 and 4 above.
¹⁵ There are 459 “official” four-digit SIC manufacturing codes in the 1992 and 1997 economic censuses. For a complete list, see http://www.censusbureau.biz/epcd/oei/view/sic-sht2.txt.
¹⁶ There are 473 “official” six-digit NAICS manufacturing codes in the 2002 economic census. For a complete list of the six-digit codes, see http://www.census.gov/epcd/naics02/naico602.txt.

We note the following:

• As summarized in Table 7, the number of HS export (import) codes in the master list that have NAICS basecodes associated with them in at least one year ranges from 8,628 (16,897) in 2000 to 8,882 (17,745) in 2009. 2000 is the first year that NAICS codes appear in the concordance files.
• The number of NAICS basecodes to which these export (import) codes match ranges from 454 in 2000 to 456 in 2009 for imports and switches between 453 and 454 for exports.¹⁶
HS to SIC and NAICS 29 Some of the NAICS basecodes to which HS codes are assigned are incomplete, while (cid:15) others are outside manufacturing. As noted in the third column of each panel of Table 7, the number of manufacturing NAICS industry codes to which these export (import) codes match ranges from 387 (387) in 2000 to 386 (388) in 2009. As with the SIC, these numbers of manufacturing codes are lower than the 473 o¢ cial manufacturing industries in the NAICS. Table 7: HS and Six-Digit NAICS Codes in the (cid:147)Master List(cid:148) Exports Imports HS10 NAICS6 Man. NAICS HS10 NAICS6 Man. NAICS 2000 8,628 454 387 16,897 454 387 2001 8,622 453 386 16,910 453 386 2002 8,940 453 386 17,351 453 386 2003 8,930 454 386 17,390 454 386 2004 8,933 454 386 17,382 454 386 2005 8,971 453 386 17,717 453 386 2006 8,972 453 386 17,746 453 386 2007 8,878 453 385 17,665 455 387 2008 8,883 453 385 17,728 455 387 2009 8,882 454 386 17,745 456 388 Notes:Table displaysnumberof ten digitHS codes, six digit NAICScodes andsix digitmanufacturingNAICScodesappearingintheconcordancefiles accompanying the U.S. monthly trade statistics sold by the U.S. Census Bureau. E Appendix 5: Census(cid:146)s Principle Di⁄erences (Product Class-Basecode) Concordances This section summarizes Census(cid:146)PD (cid:133)les for 1992, 1997, 2002 and 2007. E1. 1992 Economic Census The 1992 PD(cid:133)le maps (cid:133)ve-digit SICproduct classes toeight-digit (SIC-based) basecodes and is available in the electronic appendix with (cid:133)lename pd92.csv. We note the following: 814 unique basecodes match to a product class (PC) in the 1992 PD (cid:133)le, 768 of which (cid:15) areinmanufacturing. Table8summarizesthedistributionofthesebasecodesaccording to the number of (cid:133)ve-digit SIC product classes into which they map. As a group, the eight-digit basecodes contain 418 unique four-character basecode roots, 391 of which are in manufacturing. 
Note that there are 459 unique four-digit SIC manufacturing industries in 1992.17 17The set of four-digit SIC manufacturing industries in 1992 is identical to the set used in 1987. See www.census.gov/prod/2/manmin/mc92-r-1.pdf.
Table 8: Number of Product Classes per Basecode and Basecode Root (1992)

Product Classes; Overall Basecodes; Overall Basecode Roots; Manufacturing Basecodes; Manufacturing Basecode Roots
1 549 117 520 111
2 109 60 103 54
3 59 76 53 69
4 36 50 32 46
5 20 39 19 37
6 17 25 17 24
7 9 13 9 13
8 2 7 2 7
9 2 9 2 8
10 2 5 2 5
11 1 2 1 2
12 2 4 2 4
14 3 3
15 1 1
16 1 1
18 1 3 1 3
20 1 1
21 1 1
22 1 1
23 1 1 1 1
25 1 1 1 1
28 1 1
Total 814 418 768 391

Notes: Table displays distribution of basecodes and basecode roots according to the number of product classes into which they map, overall and for manufacturing.

• 1,566 unique five-digit SIC product classes are matched to an eight-digit basecode in the 1992 PD file. The official list of SIC categories for the 1992 CMF encompasses 1,462 five-digit product classes for manufacturing.[18]

  - A merge of the unique five-digit SIC codes from the PD concordance into the official list from Census (1992) reveals that 1,400 codes match exactly and that they are all in manufacturing. The largest portion (24) of the 62 in the official list but not in the PD concordance end in "9", and their descriptions indicate they are generally receipts for contract work on the good categorized by the first four digits. Code 22579 in the PD file, for example, is "contract and commission receipts for knitting only or knitting and finishing weft (circular) knit fabrics". Code 22573, which appears in both the PD and the official list, by comparison, is "finished weft (circular) knit fabrics, excluding hosiery".

  - There are 166 five-digit SIC codes that are matched to HS codes in the PD concordance but do not appear in the official SIC list. Of the 166, 102 end in "0" and 95 are in manufacturing. We suspect that the 102 codes ending in "0" are used to facilitate the matching of SIC and basecodes by capturing a range of goods spread across five-digit codes with the same four-digit root. For example, 20220 is in the PD file but not on the official list, and is described as "cheese, natural and processed, not specified as to kind", versus 20223 and 20224, both of which are in both the PD and the official list but which break cheese down into natural and processed cheese, respectively. All three of these codes map into the same basecode, 20223B00, which maps to HS codes beginning with 0406, i.e., "cheese and curd".

[18] See Census (1992) at http://www.census.gov/prod/2/manmin/mc92-r-1.pdf.

E2. 1997 Economic Census

The 1997 PD file maps seven-digit NAICS product classes to eight-digit (NAICS-based) basecodes and is available in the electronic appendix with filename pd97.csv.[19] We note the following:

• 841 unique basecodes are matched to a product class (PC) in the 1997 PD file, 763 of which are in manufacturing (i.e., begin with a "3"). Table 9 summarizes the distribution of these basecodes according to the number of seven-digit NAICS product classes into which they map. As a group, the eight-digit basecodes contain 451 unique six-character basecode roots, 388 of which are in manufacturing.

Table 9: Number of Product Classes per Basecode and Basecode Root (1997)

Product Classes; Overall Basecodes; Overall Basecode Roots; Manufacturing Basecodes; Manufacturing Basecode Roots
1 576 143 518 105
2 130 91 120 80
3 55 71 50 65
4 24 44 21 40
5 25 38 23 35
6 7 16 7 15
7 6 12 6 12
8 5 7 5 7
9 4 6 4 6
10 3 4 3 4
11 4 4
12 4 4
13 1 1 1 1
14 2 2
16 1 1
17 1 1
18 2 2 2 2
19 1 1
24 1 1 1 1
30 1 1
36 1 1
44 1 1 1 1
Total 841 451 763 388

Notes: Table displays distribution of basecodes and basecode roots according to the number of product classes into which they map, overall and for manufacturing.

[19] We thank Alvin Venning of the U.S. Census Bureau for providing us with a copy of the 1997, 2002 and 2007 PD files.
• 1,559 unique seven-digit NAICS product classes are matched to an eight-digit basecode in the 1997 PD file, of which 1,418 are in manufacturing. The official list of NAICS categories for the 1997 CMF encompasses 1,469 seven-digit product classes in manufacturing.

E3. 2002 Economic Census

The 2002 PD file maps seven-digit NAICS product classes to eight-digit (NAICS-based) basecodes and is available in the electronic appendix with filename pd02.csv. We note the following:

• 832 unique basecodes are matched to a product class (PC) in the 2002 PD file, 754 of which are in manufacturing (i.e., begin with a "3"). Table 10 summarizes the distribution of these basecodes according to the number of seven-digit NAICS product classes into which they map. As a group, the eight-digit basecodes contain 450 unique six-character basecode roots, 388 of which are in manufacturing.

Table 10: Number of Product Classes per Basecode Root (2002)

Product Classes; Overall Basecodes; Overall Basecode Roots; Manufacturing Basecodes; Manufacturing Basecode Roots
1 567 143 509 105
2 132 90 122 80
3 53 73 48 67
4 25 42 22 39
5 25 39 23 36
6 7 19 7 17
7 6 11 6 11
8 4 4 4 4
9 4 6 4 6
10 2 5 2 5
11 2 2
12 4 4
13 2 2 2 2
14 1 1
15 2 2
17 2 2
18 1 1
19 1 2 1 2
23 1 1 1 1
31 1 1
37 1 1
43 1 1 1 1
Total 832 450 754 388

Notes: Table displays distribution of basecodes and basecode roots according to the number of product classes into which they map, overall and for manufacturing.

• 1,547 unique seven-digit NAICS product classes are matched to an eight-digit basecode in the 2002 PD file, of which 1,406 are in manufacturing. The official list of NAICS categories for the 2002 CMF encompasses 1,450 seven-digit product classes in manufacturing.
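The distributions in Tables 8 to 11 tabulate, for each basecode and each basecode root, how many distinct product classes map into it. A minimal sketch of that tabulation follows, under stated assumptions: the input is a list of (product class, basecode) pairs, the root is taken as the first six characters of a NAICS-based basecode (four for an SIC-based one), and manufacturing is flagged by a leading "3" as in the text. The pairs themselves are invented and do not come from the PD files.

```python
from collections import Counter, defaultdict

# Invented (product class, basecode) pairs in the style of a NAICS-based PD file.
pairs = [
    ("3115131", "31151300"),
    ("3115133", "31151300"),
    ("3115135", "3115130A"),
    ("3152111", "31521100"),
    ("1111101", "11111000"),
]

ROOT_LEN = 6  # assumed: six characters for NAICS-based basecodes, four for SIC-based


def class_count_distribution(rows, key):
    """Map k -> number of keys (basecodes or roots) with exactly k distinct product classes."""
    classes = defaultdict(set)
    for product_class, basecode in rows:
        classes[key(basecode)].add(product_class)
    return Counter(len(members) for members in classes.values())


by_basecode = class_count_distribution(pairs, key=lambda b: b)
by_root = class_count_distribution(pairs, key=lambda b: b[:ROOT_LEN])

# Manufacturing subset: basecodes beginning with "3".
manufacturing = [(pc, b) for pc, b in pairs if b.startswith("3")]
man_by_root = class_count_distribution(manufacturing, key=lambda b: b[:ROOT_LEN])

for k in sorted(by_root):
    print(k, by_basecode.get(k, 0), by_root[k], man_by_root.get(k, 0))
```

Because a root pools the product classes of all its basecodes, a root can never map to fewer product classes than any single basecode under it, which is why the root columns extend further down the tables than the basecode columns.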
E4. 2007 Economic Census

The 2007 PD file maps seven-digit NAICS product classes to eight-digit (NAICS-based) basecodes and is available in the electronic appendix with filename pd07.csv. We note the following:

• 799 unique basecodes are matched to a PC in the 2007 PD file, 724 of which are in manufacturing (i.e., begin with a "3"). Table 11 summarizes the distribution of these basecodes according to the number of seven-digit NAICS product classes into which they map. As a group, the eight-digit basecodes contain 454 unique six-character basecode roots, 390 of which are in manufacturing.

Table 11: Number of Product Classes per Basecode Root (2007)

Product Classes; Overall Basecodes; Overall Basecode Roots; Manufacturing Basecodes; Manufacturing Basecode Roots
1 558 170 498 123
2 115 82 107 75
3 48 67 46 65
4 23 42 19 37
5 25 34 24 32
6 8 19 8 18
7 7 9 7 9
8 5 5 5 5
9 3 5 3 5
10 1 4 1 4
11 3 3
12 1 5 1 5
13 1 1 1 1
14 2 2
15 1 1
18 1 1
20 1 1 1 1
22 1 1 1 1
29 1 1 1 1
30 1 1
36 1 1
Total 799 454 724 390

Notes: Table displays distribution of basecodes and basecode roots according to the number of product classes into which they map, overall and for manufacturing.

• 1,496 unique seven-digit NAICS product classes are matched to an eight-digit basecode in the 2007 PD file, of which 1,383 are in manufacturing. The official list of NAICS categories for the 2007 CMF encompasses 1,435 seven-digit product classes in manufacturing.
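The comparison described in E1 between the product classes in a PD file and the official Census list amounts to three set operations. The sketch below is a hedged illustration, not the authors' procedure: the code lists are tiny stand-ins, with 20220, 20223, 20224 and 22573 placed according to the memberships stated in E1 and "12349" an invented placeholder for an official-only code. The closing assertions check that the 1992 counts reported in E1 are internally consistent.

```python
# Five-digit SIC product classes. Memberships of 20220, 20223, 20224 and 22573
# follow the text of E1; "12349" is an invented placeholder.
pd_classes = {"20220", "20223", "20224", "22573"}
official_classes = {"20223", "20224", "22573", "12349"}

matched = pd_classes & official_classes        # in both lists
pd_only = pd_classes - official_classes        # in the PD file only
official_only = official_classes - pd_classes  # in the official list only

print("matched:", sorted(matched))
print("PD only:", sorted(pd_only))
print("official only:", sorted(official_only))

# Codes ending in "0" among the PD-only set, the pattern discussed in E1.
pd_only_ending_in_zero = sorted(code for code in pd_only if code.endswith("0"))
print("PD only, ending in 0:", pd_only_ending_in_zero)

# Consistency check on the 1992 counts reported in E1.
n_matched, n_official_only, n_pd_only = 1400, 62, 166
assert n_matched + n_official_only == 1462  # size of the official list
assert n_matched + n_pd_only == 1566        # product classes in the PD file
```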
Cite this document
Justin R. Pierce and Peter K. Schott (2011). A Concordance Between Ten-Digit U.S. Harmonized System Codes and SIC/NAICS Product Classes and Industries (FEDS 2012-15). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2012-15
@techreport{wtfs_feds_2012_15,
author = {Justin R. Pierce and Peter K. Schott},
title = {A Concordance Between Ten-Digit U.S. Harmonized System Codes and SIC/NAICS Product Classes and Industries},
type = {Finance and Economics Discussion Series},
number = {2012-15},
institution = {Board of Governors of the Federal Reserve System},
year = {2011},
url = {https://whenthefedspeaks.com/doc/feds_2012-15},
abstract = {While the relationship between international trade and domestic economic activity is an important topic in economics, research in this area has been slowed due to data limitations. In this paper we provide tools that improve the existing data in two ways. First, we develop an algorithm that yields concordances between the ten-digit Harmonized System (HS) codes used to classify products in U.S. international trade and the SIC and NAICS industry codes used to classify domestic economic activity. These concordances then yield novel time series of industry-level international trade data for the years 1989 to 2009. Second, we provide concordances between HS codes and the SIC and NAICS product classes used to classify U.S. manufacturing production, allowing for matching at a more disaggregated level than was previously available.},
}