How much has wealth concentration grown in the United States? A re-examination of data from 2001-2013
Abstract
Well-known research based on capitalized income tax data shows robust growth in wealth concentration in the late 2000s. We show that these robust growth estimates rely on an assumption (homogeneous rates of return across the wealth distribution) that is not supported by data. When the capitalization model incorporates heterogeneous rates of return (on just interest-bearing assets), wealth concentration estimates in 2011 fall from 40.5% to 33.9%. These estimates are consistent in levels and trend with other micro wealth data and show that wealth concentration increases until the Great Recession, then declines before increasing again.
Finance and Economics Discussion Series
Divisions of Research & Statistics and Monetary Affairs
Federal Reserve Board, Washington, D.C.
How much has wealth concentration grown in the United States? A re-examination of data from 2001-2013
Jesse Bricker, Alice Henriques, and Peter Hansen
2018-024
Please cite this paper as: Bricker, Jesse, Alice Henriques, and Peter Hansen (2018). “How much has wealth concentration grown in the United States? A re-examination of data from 2001-2013,” Finance and Economics Discussion Series 2018-024. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2018.024.
NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.
Jesse Bricker∗, Alice Henriques†, and Peter Hansen‡
Federal Reserve Board
Dated: March 30, 2018
Keywords: Household wealth, wealth concentration
JEL Classification: D31, D14, H0, E6
∗ email: jesse.bricker@frb.gov. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. We would like to thank our colleagues on the SCF project who helped make this research possible: Lisa Dettling, Joanne Hsu, Lindsay Jacobs, Elizabeth Llanes, Kevin Moore, Sarah Pack, Jeff Thompson, and Richard Windle. We would like to thank John Sabelhaus, Karen Pence, Emmanuel Saez, Gabriel Zucman, Steven Pedlow, Tom Crossley, Katharine Abraham, and participants at the Society of Economic Measurement, NBER Summer Institute, ESRA, and WID.world conferences for helpful comments. We also thank Emmanuel Saez for sharing a crosswalk between PUF and INSOLE data, and Arthur Kennickell for inherited knowledge of the SCF sampling process. We have also greatly benefited from coordination and discussions with the staff at the Statistics of Income, especially Barry Johnson, David Paris, Michael Parisi, Lori Hentz, and Lisa Russ.
† email: alice.henriques@frb.gov
‡ email: peter.hansen@frb.gov
1. Introduction
Income and wealth concentration is increasingly viewed as a potential source of political and macroeconomic instability (Piketty, 2014; Stiglitz, 2012). A flurry of research shows that income is increasingly concentrated, that the increase in income concentration arises from both permanent and transitory factors, and that the tax system can help alleviate income concentration (Piketty and Saez, 2003; DeBacker, Heim, Panousi, Ramnath, and Vidangos, 2012; Auten, Gee, and Turner, 2013; CBO, 2014). Much of the recent research uses micro-level income tax data because income tax filing is nearly universal at the top.
In the United States, though, there are no micro wealth data with comparable coverage.1 Wealth concentration estimates are instead based on either household survey data, “capitalized” income tax data, or estate tax data. Though the same forces that increase income concentration may also increase wealth concentration (for example, through saved income), there is a notable lack of agreement on the growth in wealth concentration in recent years.2 Previous research using capitalized income tax data (Saez and Zucman, 2016) has found that wealth concentration, defined here as the share of wealth held by the wealthiest 1 percent of families, has grown rapidly, from 32 percent in 2002 to 42 percent in 2012 (figure 1). But micro wealth data from the Survey of Consumer Finances (SCF) and from estate tax data show that wealth concentration has grown modestly over this time (Bricker, Henriques, Krimmel, and Sabelhaus, 2016; Kopczuk and Saez, 2004).
1 The only wealth tax that exists in the United States is an estate tax applied at death and to very few families (less than 0.5 percent of the population, currently).
2 Theory is also inconclusive. Wealth concentration can grow if wealthier families have higher returns on capital assets and can fall if they have lower returns. Inheritances also have a theoretically ambiguous effect on concentration. In the data, wealthier families tend to realize higher returns on assets (Fagareng, Guiso, Malacrino, and Pistaferri, 2016), which serves to increase concentration, and inheritances appear to reduce concentration (Elinder, Erixson, and Waldenstrom, 2016; Boserup, Kopczuk, and Kleiner, 2016).
However, we show in this paper that there is actually considerable agreement in level and trend in wealth concentration across measurement methods once the capitalization model is allowed to better reflect empirical data.3 A capitalization model relies on a set of assumptions about asset rates of return. In the baseline model, wealth is estimated from taxable income by assuming a homogeneous rate of return on assets, one that is common across families but varies across types of asset. We provide evidence that this assumption does not fit the data on interest-bearing assets, and estimate several alternative capitalization models that relax that assumption (for just interest-bearing assets). In these heterogeneous return models, wealth concentration increases until the period of the Great Recession and then plateaus at about 34 percent (figure 3).
These results are in contrast to the capitalized results reported in Saez and Zucman (2016), where wealth concentration increases rapidly during and after the Great Recession, reaching more than 40 percent by the end of the period. However, the results are consistent with estimates from other micro wealth data sources, namely the SCF (figure 6), and comparable to the trend in estate tax data (figure 7). With the exception of wealth capitalized from a homogeneous return model, then, the trend in U.S. wealth concentration estimated here is similar to the trend in U.S. income concentration (Piketty and Saez, 2003 and updates), and to wealth concentration in other countries (Alvaredo, Atkinson, Chancel, Piketty, Saez, and Zucman, 2016 and updates).4
We present evidence from three U.S. micro data sources that wealthier families realize higher rates of return on interest-bearing assets. The data come from matched estate tax-to-income tax data (originally shown in Saez and Zucman, 2016), from the SCF, and from a match of SCF and administrative income data. Each shows that wealthier families have higher rates of return on interest-bearing assets (figure 2).
This evidence is consistent with recent research with Norwegian registry data (Fagareng, Guiso, Malacrino, and Pistaferri, 2016).5
3 We do not include a detailed estate tax data analysis but point out the work of others that use these data. The threshold to file for the estate tax has increased over the years, leading to concerns about their representativeness (Saez and Zucman, 2016).
4 In addition to the U.S., the WID.world project reports the share of wealth held by the top 1% for France, the U.K., China, and Russia.
5 Wealthy families may have access to investment vehicles (such as private equity, hedge funds, and loaning money to a closely held business) that less wealthy families cannot typically access (Calvet et al, 2007; Cagetti and De Nardi, 2007). This may be especially true in interest income, where a typical family has access to savings accounts while wealthier families can invest in higher-yield bonds.
Why does incorporating heterogeneous returns on interest-bearing assets change the estimates so much? The modeling needed to transform income to wealth is sensitive to small changes in assumed rates of return, especially when the return is low, as has recently been the case for interest-bearing assets (Kopczuk, 2015).6 And we pay special attention to interest-bearing assets because the baseline capitalization model predicts that nearly all of the growth in wealth concentration since the mid-2000s is due to the growth in interest-bearing asset holdings of the wealthiest families.7
6 For example, in such a model, a 1 percent rate of return on assets means that we infer $100 of assets for every $1 of income. In this example, a 1 percentage point increase in the rate of return (to 2 percent) implies inferred wealth falls by half (to $50) from the same $1 of income. As rates of return on interest-bearing assets have fallen since the Great Recession, these assets are especially sensitive to small deviations in the capitalization model.
7 See Saez and Zucman, 2016, appendix tables B5 and B6. Interest-bearing assets are nearly 50% of the portfolios of the wealthiest families in the capitalization model in 2011, up from about 30% in the 2007 capitalization.
We also augment the SCF to include wealth measured in the capitalized estimates by adding estimates of DB pension wealth and Forbes 400 wealth (as in Bricker et al., 2016). Here, though, we include the Forbes 400 wealth through a new set of SCF weights, constructed and described in this paper, that take advantage of the overlap between the wealth distribution of SCF respondents and the wealth distribution of the Forbes 400.
In addition to comparing point estimates, we also calculate and compare cross-sectional variability of the estimates. We describe the “feasible region” of capitalized wealth estimates (the range of possible estimates under the permuted models; figures 8 and 9, in shaded red) and the range of SCF estimates under sampling and item non-response variability (figures 8 and 9, in shaded blue).8 These two ranges overlap throughout the sample period. Notably, the feasible region in the capitalization model is larger than in the SCF survey. The benefit of using income tax data to infer wealth is the nearly universal coverage at the top that the tax data provide. But in the results shown here, this benefit is overwhelmed by the variability resulting from the need to make model assumptions to predict wealth from income data.
8 Uncertainty can also arise from using annual income instead of permanent income to predict wealth (Bricker, Henriques, and Moore, 2017). For example, top income shares in income tax data are about 20 percent lower when using permanent income instead of annual income (Thompson, Parisi, and Bricker, 2018). Relative to modeling variability, though, we believe this is a small part of the total variability. See appendix A for more details.
2. Income Tax Data and Capitalization Models
Measuring and explaining wealth concentration has challenged economists at least since Pareto (1896). Unlike income, there is no administrative data system directly associated with measuring the cross-section of wealth at a point in time in the United States.9 In this section we describe efforts to infer wealth from these administrative income tax data by “capitalizing” taxable income.
9 The only official wealth record that exists in the U.S. comes from an estate tax applied at death.
2.1 Administrative tax data
The Individual and Sole Proprietor (INSOLE) data are a set of administrative records derived from income tax returns. The INSOLE file consists of a sample of individual and sole proprietorship tax filings from Internal Revenue Service (IRS) administrative tax data, and is statistically edited for quality by the Statistics of Income (SOI) at the IRS (see, for example, Statistics of Income, 2012a). These data describe the annual distribution of income and income sources, deductions, and taxes, and are used in Saez and Zucman (2016).
The Public Use File (PUF) is a modified version of the INSOLE file that is available for public use, either through NBER or directly from SOI.10 For public disclosure reasons, some records in the INSOLE file are subsampled for the PUF release, and some extreme values are also excluded from the PUF (SOI 2012b). This has the effect of lowering, somewhat, the income concentration in the PUF. However, Saez (2016) develops a set of annual aggregate records that top off the annual PUF data so that aggregate income, by type, in the PUF matches that in the INSOLE file. In our work, we use the PUF file for the years 2002 to 2012 (the last year the PUF data are available to us) and the Saez (2016) supplement so that aggregates and distributions from our PUF data will match those of the INSOLE.
10 Though these data are publicly available, they are restricted-use. There is a fee associated with obtaining the PUF files available directly from SOI, though the NBER version is available to associated researchers.
The files contain several hundred thousand observations, and they weight up to the full population of tax units. In 2011, for example, there were about 135 million tax units. There may be several tax units in a family, so there are more tax units than families. For comparison, there are about 110 million families measured in the 2011 March CPS.11
11 Note that using tax units leads to an upwardly biased estimate of income (Larrimore, Mortenson, and Splinter, 2017) and wealth (Bricker et al., 2016) concentration because of the way multiple tax unit families appear in the income distribution.
2.2 Wealth inference from capitalization models
One can predict wealth from income by “capitalizing” (or inflating) asset income by an asset-specific rate of return (Greenwood, 1983, Kennickell, 2001, Saez and Zucman, 2016). However, many forms of wealth do not appear on a tax form, and a capitalization model must also include an estimate of these non-financial sources of wealth. The general form of such a model is:
\[ \widehat{wealth}_i^{CAP} = \widehat{nonfinancial}_i + kg_i + \sum_{k=1}^{K} \left( income_i^k / r^k \right), \qquad (1) \]
where there are i = 1...N tax units and K types of income, \(r^k\) is the rate of return on the k-th type of income, and \(r^k\) typically lies in (0,1). Capital gains may or may not be included in such a model.
2.2.1 Saez and Zucman (2016) model
Saez and Zucman (2016) use a version of the gross capitalization model described above (equation 1) to predict the stock of wealth held by each tax filing unit. Their estimated rate of return results from (cleverly) distributing the household asset stock of the Financial Accounts of the United States (FA) based on the distribution of realized capital income in the INSOLE data.
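As a rough illustration of the gross capitalization model in equation 1, the sketch below divides each type of income by an assumed asset-specific rate of return and adds non-financial wealth and capital gains. The function name, the income categories, and the rate values are hypothetical and chosen only for illustration; they are not the rates or asset classes estimated in Saez and Zucman (2016).

```python
from typing import Mapping


def capitalized_wealth(
    income_by_type: Mapping[str, float],
    rate_by_type: Mapping[str, float],
    nonfinancial: float = 0.0,
    capital_gains: float = 0.0,
) -> float:
    """Equation 1 for one tax unit: nonfinancial + kg + sum_k income_k / r_k."""
    total = nonfinancial + capital_gains
    for asset_type, income in income_by_type.items():
        rate = rate_by_type[asset_type]
        # Equation 1 assumes each rate of return lies strictly between 0 and 1.
        if not 0.0 < rate < 1.0:
            raise ValueError(f"rate of return for {asset_type!r} must lie in (0, 1)")
        total += income / rate
    return total


# Illustrative inputs only (assumptions, not estimates from the paper).
incomes = {"taxable_interest": 1.0, "dividends": 2.0}
rates = {"taxable_interest": 0.01, "dividends": 0.04}
wealth = capitalized_wealth(incomes, rates)
```

With these illustrative inputs, $1 of interest income capitalized at a 1 percent return contributes about $100 of inferred assets, while the same $1 at a 2 percent return would contribute about $50, which is the sensitivity to low rates of return that the text describes.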
FA assets are organized into seven classes: (1) taxable interest-bearing assets, (2) nontaxable interest-bearing assets, (3) dividend-producing assets (e.g. from publicly traded companies), (4) pass-through business assets that produce Schedule E income, (5) corporate business assets that produce Schedule C income, (6) privately held pension assets (IRAs, 401(k)s), and (7) workplace defined benefit pensions. Income from the INSOLE data is mapped to these asset classes, and an asset-class rate of return is found by the ratio of INSOLE income to the FA asset stock for each of the seven types: \( ror^{k,SZ} = \left( \sum_{i=1}^{N} income_i^{INSOLE,k} \right) / FA^k \). Applying that rate of return to INSOLE income inflates that income into an estimate of wealth.
Other assets held by households and debts owed by households appear in a limited fashion on tax filings. Some families that own a home, for example, pay enough mortgage interest and property taxes to deduct these expenses on their tax filings. One can distribute the aggregate FA value of housing assets and mortgage debts by using these deduction expenses along with assumptions about the number of families that itemize deductions.
Other assets and debts do not appear on a tax form. Only a small amount of consumer credit debt, for example, is deductible, so the Saez and Zucman (2016) model uses the distribution of consumer credit by income in the SCF to back out how to distribute the FA amount. Defined benefit pensions, too, rarely appear on a tax form, so the model distributes this stock of wealth according to wage and pension income distributions found in the tax data with help from the SCF.
2.2.2 Heterogeneous rates of return
One of the biggest assumptions in the capitalization model described above is that all families realize identical rates of return by asset. Saez and Zucman (2016) use matched estate tax to income tax filings to suggest that the assumption is modest.
Implied rates of return for wealthy filers are often higher than the return found by the method described in section 2.2.1, but they are often not much higher. However, other evidence suggests that wealthier families get higher rates of return (Fagareng et al, 2016). Here, we consider heterogeneous rates of return on just one asset class: interest-bearing assets.
We note that the implied rate of return on interest-bearing assets in Saez and Zucman (2016) is much lower than market rates from the 10-year Treasury yield (table 1) or Moody’s Aaa corporate bond, the types of interest-bearing assets that are held by wealthy families (Bricker et al, 2016, Kopczuk, 2015). For example, in 2011 the rate of return on taxable interest was 1.15% in the Saez and Zucman (2016) model, while the 10-year Treasury yield averaged 2.78%. A lower rate of return would push up concentration estimated by this model.
The capitalization model can be modified to allow some families to get higher rates of return than others. The FA shows that households own about $10.9 trillion in taxable bonds, deposits and other fixed income assets in 2011. Taxable interest income of the families at the top end (say, the top 1 percent) can be capitalized at the 2.78% rate of the 10-year Treasury, and the remaining income is then capitalized at a lower rate, to keep total interest assets equal to the FA total of $10.9 trillion. In a capitalization model, when the rate of return for top-end families is higher than the average return, the share of wealth held by the top end will fall. This is especially true when rates of return are low (Kopczuk, 2015).
3. Survey of Consumer Finances data
The SCF is a cross-section survey, conducted every three years by NORC on behalf of the Federal Reserve Board (FRB) and with the cooperation of the Department of Treasury (SOI).12 The SCF provides the most comprehensive and highest quality survey microdata available on U.S. household wealth. SCF families respond to questions about financial and nonfinancial assets, debts, employment, income, and household demographics.
12 See Bricker, et al. (2017) for results from the most recent triennial SCF. A great degree of security is involved with this sampling procedure, and formal contracts govern the agreement between the FRB, NORC and SOI. The FRB selects the sample from an anonymized data file. The FRB sends the sampled list to SOI, which removes the famous families and passes along the list to NORC for contacting. NORC collects the survey information and sends it to the FRB. Thus, the FRB never knows any contacting information, SOI never knows any survey responses, and NORC never knows anything more than survey responses and location information.
As noted before, measuring income and wealth at the top using simple random sampling is not viable due to the concentrated nature of economic resources. Thin tails at the top lead to large sampling variability, and disproportional non-participation at the top biases down top share estimates. Both make measuring wealth concentration extremely difficult.
The Survey of Consumer Finances (SCF) overcomes both problems by oversampling at the top using administrative data derived from income tax records (the INSOLE file), and by verifying that the top is represented using targeted response rates in several high end strata. The list sample ensures that the SCF has adequate representation of the upper tail of the wealth distribution and adequate representation of sparsely held assets.
3.1 Wealth measurement: SCF versus FA
The wealth concentration estimates derived in Saez and Zucman (2016) use the household and non-profit aggregate wealth in the Financial Accounts (FA) of the United States as the stock of wealth, and distribute it according to income tax filings. The SCF asks families for their assessment of the family’s stock of wealth, and uses the same tax records to populate the upper tail of the wealth distribution.
Notably, the FA includes defined benefit (DB) pensions, while the baseline SCF wealth estimates do not.13 But the SCF collects detailed pension information, and the SCF estimates shown here will include an estimate of DB pension wealth held by US families.14
13 The two data sources are remarkably similar once they are put in comparable terms (Henriques and Hsu, 2013, and Dettling et al, 2016).
14 Detailed information on DB pensions, whether at a current job or a past job, is collected in the SCF. We use this information and life tables to allocate the DB wealth across families. Please see the appendix of Bricker et al (2016) for a detailed description of the DB pension estimate for SCF families.
3.2 SCF sampling
The SCF combines a geographically stratified and nationally representative area probability (AP) sample with a list sample (LS), an oversample of households that are likely to be wealthy. In the 2013 SCF, about 6,000 families were surveyed, of which about 1,500 are from the list sample.
The AP sample is drawn by NORC at the University of Chicago and provides a nationally representative sample of families.15 The list sample is drawn using a frame of statistical records derived from tax returns: the INSOLE data described above.16 The list sample frame data are typically based on income tax returns filed the year prior to the SCF survey year, meaning that the income was earned two years prior to the SCF survey year. In the 2013 SCF, for example, the frame data were derived from tax returns covering income earned in 2011. However, since 2001 the SCF list sample has been drawn using multiple years of income for the returns in the frame. In the 2013 SCF, for example, the 2011 income data in the frame were supplemented with 2010 and 2009 income.17
15 See Tourangeau, et al. (1993), O’Muircheartaigh et al. (2002) for more information about the NORC national samples.
16 The unit of observation in the INSOLE data is a tax unit while the SCF unit of observation is a family. In practice, there are millions more tax units than families because several members of a family can file distinct tax returns; without a correction, these multi-filer families would have a disproportionately large chance of being selected. To account for this in the SCF LS sampling process, the INSOLE sampling weight of tax units that filed married filing separately is divided in half. Further, all filers below the age of 18 are dropped (a family headed by someone less than age 18 is ineligible for the SCF). Still, to a certain extent, the discrepancy between tax units and families remains in the adjusted INSOLE sampling frame.
17 The INSOLE file is not designed to be a panel, though certainty sampling of high income families (among others) means that families with consistently high incomes are often in the sample year over year. Filers with total income of at least $5 million, filers with total income of less than negative $5 million, filers with $50 million of Schedule C receipts, and filers with at least $200,000 of AGI but zero tax liability are all sampled with certainty (Czajka, Sukasih, and Kirwan, 2014; Bryan, 2015). Filers with at least $2 million (or less than negative $2 million) in income are sampled at about a 50 percent rate. The file is also sampled in a Keyfitz method, meaning that there is a strong overlap between adjoining year files.
The SCF sampling strategy uses two methods of predicting wealth from income. The first is a capitalization model, and is similar to that described in equation 1 in the previous section. The second model uses the empirical correlation between wealth collected in the SCF and income from the administrative sampling data. The basis for this model is a regression of observed SCF wealth from the most recent SCF on the administrative income used to generate the SCF list sample for that survey year.18
A wealth prediction and wealth ranking can be inferred from each model. The rankings from both models are blended together, and the SCF list sample is sampled from this blended ranking.19 The records are organized into seven mutually exclusive strata of increasing wealth, and the top four strata cover the wealthiest top 1% of records. About 5,100 LS cases are selected and the majority are from these top strata.
18 Model details are provided in Bricker, Henriques, and Moore (2017).
19 Typically, the blend is a 50/50 split, although in recent years the split has favored the empirical correlation model, because the empirical correlation model has a higher correlation with SCF wealth ranking than does the capitalized model.
Response rates at the top end strata ranged from 33 percent to about 12 percent in the 2013 SCF. These response rates are targeted prior to the survey field period to ensure sufficient coverage, so the declining response rate at higher wealth levels does not necessarily imply a nonrandom set of respondents at the top. In fact, Bricker et al (2016) show that the income of the responding list sample families is similar to the income of non-responding families within each sampling stratum, and non-response is ignorable for the upper list sample. The SCF sample weights adjust for nonresponse.
4. Wealth concentration estimates
Baseline trends in wealth concentration for the capitalized income data and the SCF are shown in figure 1. The capitalized data are available on an annual basis while the SCF is conducted every three years. The baseline capitalized data here are from Saez and Zucman (2016) and assume homogeneous rates of return across families (as in section 2.2.1). Two sets of SCF estimates are presented: one without an estimate of defined-benefit (DB) pension wealth held by US families (the dashed line) and another that includes an estimate of DB pension wealth described in the previous section (the solid line).
Previous research shows that measurement differences between the two data sets (including DB pension wealth, but also unit of observation and population coverage) can explain some of the difference in levels of concentration in recent years (Bricker et al., 2016). Accordingly, for the rest of the paper we use SCF estimates that include DB pension wealth to be comparable to the capitalized wealth estimates (which include DB wealth from the Financial Accounts data).
But such changes do little to explain the divergence in the growth of wealth concentration that has emerged since the late 2000s. We build on this past work and are able to demonstrate that levels and growth in wealth concentration are remarkably similar once an assumption about rates of return on interest-bearing assets that underlies the capitalized estimates is changed to better fit the data.
4.1 Wealth concentration from capitalized wealth
Overall, the SCF estimates grow about 3 percentage points from 2001-2013, but concentration estimates from modeled income tax data grow almost 9 percentage points from 2001-2012. We begin by noting that most of the difference in growth between the two comes from the late 2000s period, when concentration estimates from the baseline capitalization model grow rapidly (from 34.8% to 40%) while the SCF estimates fall modestly. For much of this period, though, the SCF and capitalized wealth concentration estimates grow at the same pace (figure 1).
We also note that in the baseline capitalization model (the red line in figure 1) interest-bearing assets drive the increase in wealth concentration since the period of the Great Recession.
This model predicts that nearly half of the wealth of the wealthiest families is held in interest-bearing assets by 2011—up from 30% in 2007—and that nearly all of the growth in wealth concentration since the mid-2000s in this baseline model is due to the growth in interest-bearing asset holdings of the wealthiest families (Saez and Zucman, 2016, appendix figures B5 and B6).20 20In contrast, the SCF indicates that the wealthiest families hold about 20% of assets as interest-bearing assets, and that just 1/3rd of the growth in wealth concentration is due to interest-bearing assets (Bricker 12
Inthisbaselinemodel,thegrowthininterest-bearingassetsbythewealthiesthappenedat the same time that interest rates fell precipitously. The 10-year Treasury yield, for example, fell from 4.6% in 2007 to 1.8% in 2012, and the homogeneous rate of return used in Saez and Zucman (2016) fell from 2.9% to 1.0% during the same time. In a capitalization model, a fall in an asset rate of return implies a larger capitalization factor on income from that asset. And in these models, there is a non-linear increase in predicted wealth levels as rates of return fall toward zero (Kopczuk, 2015). For example, when the average rate of return is 3%, the model predicts that $1 of interest income is associated with $33 of wealth, and predicts $50 of wealth when the return falls by 1 percentage point to 2%. However, the inferred wealth doubles (to $100) when the return falls by another percentage point to 1%. In the baseline capitalization model, the fixed-income rate of return is quite low—just over one percent in the year 2011, for example. In reality, this rate of return is an average of higher yields on interest-bearing assets available to wealthy families (10-year Treasuries or Aaa corporate bonds), and the lower yields on those assets available to the other families (from traditional savings accounts, for example). One potential test of the robustness of the capitalization model, then, is to allow the top-end families to have a higher rate of return on fixed-income assets. This approach is begun in Saez and Zucman (2016) and will be expanded in the next section. 4.2 Heterogeneous rates of return on interest-bearing assets, by wealth In this section we test to see if the wealthiest families get a higher return of fixed-income assets. We use three U.S. data sources, and each dataset has excellent coverage of the wealthiest families. 
We use estate tax data (used in Saez and Zucman, 2016), the SCF, and a new source: a match of SCF assets and INSOLE income data for SCF list sample et al., 2016, figure 13). 13
respondents.21 All three data sources support a positive correlation between wealth and rates of return on interest-bearing assets—especially since the late 2000s. This evidence is consistent with recent research from Fagareng, Guiso, Malacrino, and Pistaferri (2016) with Norwegian registry data. 4.2.1 Evidence for heterogeneous rates of return on interest-bearing assets, by wealth The correlation between rates of return and wealth in the estate tax data has been previously described in Saez and Zucman (2016, appendix table C6b), and rates of return on interest-bearing assets from these data are reproduced in column 1 of table 1. These rates of return are inferred from a match between income tax records and estate tax filings and are generally larger than those estimated from the baseline capitalization model (table 6). We also estimate rates of return on interest-bearing assets across the wealth distribution in the SCF in the SCF in a similar fashion: by finding the ratio of the flow of interest income (captured in the set of income questions), and the sum of interest-bearing assets.22 The inferred rates of return for wealthy families are similar to those in the estate tax data, though the SCF estimates are typically a bit larger (table 1, column 2) and closer to yields from the 10-year Treasury. In both cases, the rate of return on interest-bearing assets for the wealthy is larger than that seen in the overall population. The dashed line in figure 2 plots the ratio of wealthy rates of return to overall rates of return in the estate tax data and the solid line plots the similar ratio in the SCF data. In both cases, the ratio is generally greater than one and has been steadily increasing in recent years. 
By the end of the time series of each data set, wealthy families realized a 40% higher rate of return on interest-bearing assets than the return estimated for all families.²³

²¹ Note, this match has traditionally been used in the SCF sampling process to support the empirical correlation model, which is described in more detail in Bricker, Henriques, and Moore (2017).
²² The sum of interest-bearing assets in the SCF is the sum of liquid deposit accounts, CDs, bonds (non-munis), government-bond mutual funds, general bond mutual funds, 1/2 of combo mutual funds, savings bonds, and the portion of trusts and managed investment accounts that are invested in interest-bearing assets. The rate of return is calculated as a ratio of means: the mean of income to the mean of assets, for those in the top 1 percent and those in the bottom 99 percent.
²³ Note that the all-families rate of return is calculated in Saez and Zucman (2016) and includes wealthy families.

Both data sources used so far have limitations. The estate tax data are a non-random set of decedents in a given year. The SCF represents the US population, but may underestimate interest income relative to the INSOLE data (Moore and Johnson, 2007; Saez and Zucman, 2016).²⁴ Thus, we also match the list sample of the SCF to their sampling income data (the INSOLE data), and generate a third rate of return estimate on interest-bearing assets (table 1, column 3).²⁵ These data, too, show that wealthy families have a return premium of about 50% relative to non-wealthy families on interest-bearing assets (figure 2).

²⁴ Total income is nearly identical between the two data sets, but the SCF typically has more business income and the INSOLE typically has more interest and dividends. A working hypothesis for the divergence is the misclassification of business income in the SCF: many households that own a business may be paid in the form of a dividend or as interest on a personal loan to the business.
²⁵ This estimate is calculated as \([0.5\,inc_{t-2} + 0.3\,inc_{t-3} + 0.2\,inc_{t-4}] / SCFassets_{t}\), where inc is INSOLE interest income from years t-4, t-3, and t-2, the years prior to the collection of SCF interest-bearing assets (see footnote 22) in year t. We also use a set of weights developed specifically for the list sample to weight up to population totals (including nonfilers), and post-stratify to wealth, financial income, and region.

An alternative to the data-driven rates of return is to use the 10-year Treasury yield as a plausible rate of return on interest-bearing assets, as in Saez and Zucman (2016). These yields are also described in table 1 and are comparable to the rates of return estimated for wealthy families in the estate tax-income tax data, the SCF, and the SCF-INSOLE linked data.

4.2.2 Wealth concentration under heterogeneous rates of return

In the previous subsection, we calculated three rates of return on interest-bearing assets for wealthy families from micro data. In this subsection, we will capitalize the interest-bearing assets for wealthy families using these three rates of return, as well as the 10-year Treasury yield.²⁶

²⁶ The rate of return for families outside of the top 1% is calculated as the ratio of the remaining PUF interest income to the remaining FA interest-bearing assets (after accounting for the top 1%).

It is important to note here that all other asset classes (non-taxable interest bearing assets, publicly-held equities, privately-held equities and businesses, housing, retirement accounts, and others) are assumed to have a homogeneous rate of return, as in the baseline model in Saez and Zucman (2016) and section 2.2.1.

In each of these four models, wealth concentration grows steadily from 2002 to around the Great Recession, peaking at 35 to 37 percent. From there, each of the models that incorporates heterogeneous returns on interest-bearing assets declines or levels out until 2011; wealth concentration is estimated to be between 33.9 and 37 percent at that time.

All four heterogeneous rate of return models and the homogeneous rate of return estimates grow modestly (5 to 15 percent of the 2002 value) from 2002 to the period around the Great Recession. It is only after the Great Recession, and during the low interest-rate environment in the U.S., that the homogeneous return estimates diverge from the heterogeneous return estimates.

In 2012, at the end of the time series, estimated wealth concentration spikes. Most of this increase appears to be due to transitory changes in tax law concerning the treatment of dividends and capital gains (recall that these wealth concentration estimates are based on income tax data). In appendix A, we use alternate administrative income data derived from tax returns that run through 2013 and show a similar spike in wealth concentration from 2011 to 2012, before falling back in 2013.²⁷

Figure 3 and table 1 also help show how sensitive the capitalized estimates are to small changes in fixed-income rates of return. For example, in 2010 the top 1 percent wealth share is 33.9% when wealthy families' interest income is capitalized with a 10-year Treasury yield (3.2%), 34.5% when capitalized with the SCF-INSOLE rate of return (2.8%), 36% when capitalized with the SCF-only rate of return (2.2%), and 36.9% when capitalized with the estate-tax rate of return (1.9%). The estimate is 40.2% in 2010 when a homogeneous rate of return on interest is used (1.4%), as in Saez and Zucman (2016).

The full time-series of these permuted capitalized wealth models shows a muted growth in wealth concentration relative to the baseline.
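The capitalization arithmetic behind this sensitivity can be sketched in a few lines. The sketch below is a simplified illustration under stated assumptions, not the authors' implementation: the function names and the aggregate totals are hypothetical, only one asset class (interest-bearing assets) is modeled, and the residual rate for families outside the top group is taken to be their remaining income over the remaining aggregate assets. Only the five rates of return in the loop are taken from the text.

```python
def capitalization_factor(rate_of_return):
    """Wealth inferred per dollar of income: the inverse of the rate of return."""
    if rate_of_return <= 0:
        raise ValueError("rate of return must be positive")
    return 1.0 / rate_of_return


def capitalize_with_top_rate(total_assets, total_income, top_income, top_rate):
    """Split an aggregate asset total between a top group and everyone else.

    The top group's interest income is capitalized at top_rate. The remaining
    aggregate assets are assigned to the other families, whose implied rate of
    return is their remaining income divided by those remaining assets.
    """
    top_assets = top_income * capitalization_factor(top_rate)
    rest_assets = total_assets - top_assets
    if rest_assets <= 0:
        raise ValueError("top group would hold more than the aggregate")
    rest_rate = (total_income - top_income) / rest_assets
    return top_assets, rest_assets, rest_rate


# Hypothetical aggregates (not from the paper): 10,000 of interest-bearing
# assets and 140 of interest income (a 1.4% average return), with half of
# the income earned by the top group.
TOTAL_ASSETS, TOTAL_INCOME, TOP_INCOME = 10_000.0, 140.0, 70.0

# Rates quoted in the text for 2010: homogeneous, estate tax, SCF only,
# SCF-INSOLE, and the 10-year Treasury yield.
for rate in (0.014, 0.019, 0.022, 0.028, 0.032):
    top_assets, _, rest_rate = capitalize_with_top_rate(
        TOTAL_ASSETS, TOTAL_INCOME, TOP_INCOME, rate
    )
    print(
        f"top rate {rate:.1%}: top share of interest-bearing assets "
        f"{top_assets / TOTAL_ASSETS:.1%}, implied rate for others {rest_rate:.2%}"
    )
```

With these made-up inputs, raising the top group's rate from 1.4% to 3.2% cuts its inferred holdings of interest-bearing assets from half of the aggregate to roughly 22 percent. The effect on the overall top wealth share is smaller because the other asset classes are unaffected.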
In the baseline model, the increase in wealth concentration is close to 9 percentage points from 2002 to 2011. When rates of return are correlated with wealth, the increase is much lower: 3.5 percentage points (from 30.4 percent in 2001 to 33.9 percent in 2011) when interest income is capitalized with the 10-year Treasury yield.

²⁷ As explained in appendix A, we do not base our main results on these INSOLE data because we have only intermittent years of data and have less detailed business income in our pull of the INSOLE data than in the PUF data.

Overall, then, the baseline capitalization model (one with homogeneous returns) is the only model presented here that shows a dramatic rise in wealth inequality. All of the available data support heterogeneous rates of return by wealth on interest-bearing assets. And all of the models with heterogeneous returns show much lower growth in wealth concentration from 2001-2011 with little to no growth since the late 2000s (when the current low interest-rate environment began).

4.2.3 Alternate specifications

Which “top-end” families should get a higher return on interest-bearing assets? One approach would find the top 1 percent by total income and capitalize their interest income with the higher return, while all other asset classes remain capitalized as described in Section 2 (see Saez and Zucman 2016). Under this approach, and capitalizing these top-end families with the 10-year Treasury yield, wealth concentration falls from about 40 percent to 38 percent in 2011 (figure 4).

The top 1 percent by total income, though, are often the “working rich”: families with high income that are still accumulating wealth. These families received about one-third of interest income in 2011. When the data are ranked by total interest income or total wealth, instead of total income, the top 1 percent of families in 2011 received more than two-thirds of interest income. Thus, when defining the “top end” by total income, more than half of the interest income of the wealthiest families (who are the families with the most interest income) is still assumed to be in low-yield investment vehicles, like savings and deposit accounts.
The approach taken here, then, is to capitalize the interest income of the wealthiest 1 percent with, for example, the 10-year Treasury yield; this is done in the previous section.

Ranking families by wealth and then applying a different capitalization factor to the wealthiest can be circular. In principle, we can re-rank families after applying the heterogeneous rate of return to see if they are still in the top 1 percent. Such a re-ranking makes a negligible difference in top wealth shares, as those that fall out of the top 1 percent are at the low end of the top (and are replaced by families with nearly as much wealth). The levels change only marginally in such a re-ranking (relative to the dashed-dotted line in figure 4), and the trend is the same.

We can also repeat the exercise by applying the higher rate of return to the top 1 percent by interest income. These estimates are nearly identical to the heterogeneous return by wealth exercise (figure 4).

4.2.4 Large and small balance liquid accounts in FA data

The Financial Accounts (FA) assets data provide a final piece of evidence supporting the idea of heterogeneous rates of return. If wealthy families realized a lower-than-market rate on interest-bearing assets (as the homogeneous model implies in the late 2000s-early 2010s), that would imply that wealthy families were holding considerable balances in savings or checking accounts, not higher-yielding bonds, and that these balances would be growing during the late 2000s-early 2010s.²⁸ But large-balance savings accounts measured in the FA did not grow during the late 2000s-early 2010s, making it improbable that these wealthy families were increasing their holdings of these low-yield assets.²⁹ All of the rapid growth in household ownership of non-bond fixed-income assets, then, is due to growth in small balance accounts.³⁰

²⁸ Recall that the 10-year Treasury yield in 2011 was 2.78% and the Moody’s Aaa average yield was 4.5% in 2011.
²⁹ Large balance is defined as accounts with $100,000 or more (see table L.205 of the Z.1 FA data).
³⁰ See table B.101 in the Z.1 release for the FA.

4.3 SCF wealth concentration estimates

The dashed line in figure 5 repeats the same line shown in figure 1: the share of wealth held by the wealthiest 1 percent of SCF households, including an estimate of DB pension wealth held by households. Here, we augment the SCF with Forbes 400 wealth data and put the SCF in tax unit level, to be comparable to the administrative income tax data in figures 3 and 4.

4.3.1 Combining Forbes and SCF information

Though the SCF is precluded from sampling from the Forbes 400, some SCF respondents are as wealthy as Forbes families. As described in the next section, we develop weights to incorporate the wealth of these omitted families into the SCF wealth totals. The effect of the weight adjustment on SCF wealth concentration is mostly a level shift up by about 1 percentage point (as seen in the movement from the dashed to the dotted blue line in figure 5).³¹ This is our preferred measure of SCF wealth.

However, as noted in section 3.1, the capitalized income data and SCF survey data are also measured in different units: the capitalized wealth estimates use “tax units” as the unit of analysis, while the SCF uses families. In the solid line of figure 5, we try to make the SCF data as comparable as possible to the capitalized estimates by changing the measure to reflect tax units, rather than families.³² Importantly, the augmented SCF trend (in blue) still rises through 2007, before dropping in 2010 and finally rising again in 2013.

³¹ Some wealthy families in the SCF have wealth comparable to some Forbes 400 families, so the SCF weights prior to this adjustment already in principle covered some of the wealth held by Forbes 400 families (even though these families are by definition excluded from the SCF). The 1 percentage point adjustment is comparable to (though slightly lower than) the increase in SCF top 1 percent wealth shares estimated from appending a Pareto distribution to proxy for the excluded Forbes wealth in appendix B, table 2.
³² These ideas are explored in detail in Bricker et al (2016). Here we provide just a summary treatment. For example, we account for the difference between tax units (in the tax data) and families (in the SCF). In 2012, for example, there are 161 million tax units but only 122 million families. Families in the bottom 99 percent are often split into multiple tax units, but a tax unit in the top 1 percent is almost always a family. Counting the top 1 percent (1.61 million) of tax units, then, effectively includes more families than counting the top 1 percent (1.22 million) of families in a survey, inducing an upward bias in concentration estimates relative to those at the family level. There are also some valuation differences between the surveys, though the wealth aggregates are fairly comparable in the SCF and the FA (Dettling et al 2015). Unlike Bricker et al (2016) we do not try to change the valuation of asset classes to match the FA data (used in the capitalized results). The FA values for housing are lower, on aggregate, than the SCF housing values, as are private equity values.

4.4 Overall wealth concentration trend

In figure 6, we present the SCF estimates (modified to be comparable with the tax data, as noted above) and the four capitalized income tax estimates that use heterogeneous rates of return on interest-bearing assets. Overall, the levels and trends in wealth concentration are remarkably consistent across measurement methods.

In terms of trend, wealth concentration in the SCF rises from 2001 to 2007, before falling a bit in 2010 (figure 6). Capitalized wealth estimates share the same general trend (increasing from the early 2000s, peaking in the late 2000s, then falling or leveling off through 2011) in all four parameterizations with a heterogeneous rate of return by wealth. The estimates are also consistent in levels, with each beginning the period below 35 percent, rising close to 35 percent by the late 2000s, and falling into the 2010-2011 period.

We note here, though, that our preferred set of concentration estimates are at the family level (the dotted line in figure 5) and do not measure wealth at the tax unit level. These preferred estimates have the same trend as both the capitalized estimates with heterogeneous returns and the SCF in tax unit level; the difference in levels with the preferred estimates, then, is due to differences in unit of observation (tax units versus families).

The SCF estimates rise again between 2010 and 2013, while it is unclear how the capitalized estimates evolve over this period. Most are flat from 2010 to 2011 and rise markedly in 2012. How much of this increase is transitory (due to the previously mentioned tax changes) is unknown. But appendix A indicates that the 2012 increase may be short-lived (figure 11).

Figure 7 reproduces figure 6, but includes the estimated wealth concentration from the estate tax data.³³ These data are measured at the individual level, so are not comparable in levels with the capitalized income tax and SCF estimates; however, they do not indicate growing wealth concentration. These estimates, though, should be taken with caution, as the data cover less and less of the U.S. population over time (Saez and Zucman, 2016).

³³ See appendix table C4 of Saez and Zucman (2016) for these updated wealth concentration estimates from Kopczuk and Saez (2004).
5. Variability in survey and administrative data

Another way of evaluating the trend in wealth concentration is to estimate how sensitive each point estimate is to the assumptions underlying it. Here, we estimate variability in the SCF point estimates through well-known techniques, and estimate a “feasible region” of capitalized wealth estimates: the range of possible estimates under the permuted models in figures 3 and 4.

Overall, we show that the “feasible region” of capitalized wealth estimates and the range of SCF estimates under sampling and item non-response variation overlap during the sample period. These results, again, show that trends and levels of wealth concentration are not different in this period.

The sources of variability in surveys are well-established (see, for example, Weisberg, 2005), and fall under two main headings: sampling error and non-sampling error. Each is described below; in this paper we can estimate sampling error, coverage error, unit nonresponse error, and item nonresponse error.

When wealth is estimated from income in the capitalization model, wealth shares will depend on the estimated rates of return for assets, on the income data, and on the FA asset data. We will call “modeling error” the changes in estimated wealth share due to deviations in the rates of return in the capitalization model. The main source of variability in wealth share estimates from the income tax data is modeling error, though we note here that the same sources of error in surveys (sampling and non-sampling error) can afflict the administrative income tax data, which are a sample of IRS administrative records.

5.1 Modeling variability

This is the main source of variability we find for predicting wealth from the income tax data. When income is capitalized into wealth (as in equation 1) there is one main set of parameters to the model (the rates of return) and one main input (income).
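One way to picture the “feasible region” introduced at the start of this section is as the year-by-year envelope of the top wealth shares across the permuted models. The sketch below assumes that reading (minimum and maximum across models in each year); the function names are hypothetical, the 2010 model values are those quoted in the text, and the SCF interval is a made-up placeholder rather than a reported result.

```python
def feasible_region(shares_by_model):
    """Return {year: (lowest share, highest share)} across all models.

    Only years covered by every model are included.
    """
    common_years = set.intersection(
        *(set(series) for series in shares_by_model.values())
    )
    region = {}
    for year in sorted(common_years):
        values = [series[year] for series in shares_by_model.values()]
        region[year] = (min(values), max(values))
    return region


def intervals_overlap(first, second):
    """True if two (low, high) intervals share at least one point."""
    return first[0] <= second[1] and second[0] <= first[1]


# Top 1 percent wealth shares (percent) in 2010, as quoted in the text.
top_shares = {
    "10-year Treasury": {2010: 33.9},
    "SCF-INSOLE": {2010: 34.5},
    "SCF only": {2010: 36.0},
    "estate tax": {2010: 36.9},
    "homogeneous (baseline)": {2010: 40.2},
}

# Hypothetical SCF interval for 2010 (placeholder, not from the paper).
scf_interval_2010 = (33.0, 38.0)

region = feasible_region(top_shares)
print(region[2010], intervals_overlap(region[2010], scf_interval_2010))
```

Dropping the homogeneous model from `top_shares` narrows the envelope to the four heterogeneous-return estimates.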
We can vary the rate of return, though, to see how wealth predictions vary in order to estimate modeling error and estimate a “feasible region” of capitalized wealth estimates: the range of possible estimates under the permuted models in figures.³⁴

5.2 Sampling error

Sampling error is the difference between the true population mean and the sample mean. Sampling variability can be estimated from a set of replicate weights (described in Kennickell and Woodburn, 1999).³⁵ The replicate weights are derived by resampling the SCF respondents along the dimensions of the SCF sample design; the resampling is done 999 times and a unique set of weights is calculated each time. The final result is a set of 999 “bootstrap replicate weights” from which 999 SCF point estimates can be computed. The SCF sampling variation is estimated from these 999 estimates.³⁶

5.3 Nonsampling error

Nonsampling error includes a variety of errors inherent in sampled data, including item nonresponse error, coverage error, measurement error, concept validity error, processing error, and adjustment error.³⁷

³⁴ One can also think about varying the income inputs into the model. Annual income, as in Saez and Zucman (2016), can be a mixture of permanent income and transitory income. Appendix A describes why a tax-unit panel of taxable income (to approximate permanent income) may be preferable to annual income of tax units. We vary annual and permanent income to see how wealth predictions vary, but find wealth concentration estimates do not vary substantially.
³⁵ The FRB provides bootstrap replicates for the public SCF data on the SCF website: https://www.federalreserve.gov/econres/scfindex.htm. Sampling theory (see, for example, Neyman, 1934) allows a household survey to sample and interview relatively few households but make inferences about the population of households. For example, in a random sample the sample mean (\(\hat{x}\)) is an unbiased estimate of the true population mean (\(x\)).
³⁶ The INSOLE and PUF files are a sample from the IRS administrative records (see https://www.irs.gov/pub/irs-soi/sampling.pdf). As such, there is sampling error associated with the use of these data. We do not have access to the full sampling strategy used by SOI, but in principle one could create bootstrapped standard errors, replicating the sampling strategy (and approximating sampling error in the data).
³⁷ Both the administrative tax data and the survey data have other errors that we cannot consider here (including measurement error, concept validity error, processing error, and adjustment error) because we do not have a good way of estimating the variability. The SOI data may have measurement error when families do not accurately file, while in a survey this occurs when a respondent gives a value that does not reflect the true condition of the family.³⁸ Concept validity error can occur in the tax data when a family is confused about where to report certain income or deductions, and in the SCF when the respondent does not understand the question, for example by confusing an IRA with a 401(k). Statistical editing, which happens for the INSOLE, PUF, and SCF alike, helps alleviate this source of error, but may introduce other “processing” errors. Finally, both data sources may suffer from adjustment error if weighting procedures go awry.

5.3.1 Unit nonresponse

Unit nonresponse occurs when a sampled family does not participate in the survey. The SCF wealth share estimates would be biased if the participants were different from the non-participants (i.e. biased down if the poorest list sample families participate but the wealthiest do not). We show in earlier work (Bricker et al 2016) that this is not the case. There are several sampling strata in the top 1%, ensuring that the SCF covers the top and bottom of the top 1%. Within each sampling stratum, we also show that the family income and capital income of respondents come from the same distribution as those of the non-respondents. So we argue that unit non-response is ignorable in the SCF top wealth share estimates.³⁹

³⁹ Unit nonresponse in the INSOLE and PUF files can occur, too. For example, a self-employed filer may work “off the books” for an employer.

5.3.2 Item nonresponse

Item nonresponse describes the situation where a responding SCF family refuses to (or cannot) answer all of the survey questions. Considering only the “complete cases” and ignoring the cases with item nonresponse will lead to selection bias if families of certain types are more likely to have item nonresponse. The SCF uses a multiple imputation technique to impute data to the questions with item nonresponse (Kennickell 1995). Five “implicates” are imputed for each missing value to acknowledge that any one imputation model can only imperfectly recover the distribution of the underlying missing data. The full SCF data, then, is actually five datasets put together, each identified by its implicate number. Because imputed data vary across implicates within a family, we can calculate the variance across the five implicate datasets (called the imputation variance).⁴⁰

⁴⁰ Item nonresponse may occur in the tax data when a family does not claim positive income on a type of income when the family does, in fact, have positive income. For example, a 1099-INT is automatically generated by a financial institution when interest income is greater than $10, but when interest income is less than $10 the family will not get this reminder to claim interest income on their tax filing. Appendix B of Bricker et al (2016) shows that the number of returns with positive interest income has fallen over the past decade while the number of families with interest-bearing accounts has stayed constant. A decline in interest rates on savings accounts may mean that fewer families will get an automatic 1099-INT reminder.

5.3.3 Coverage error

Coverage error occurs when the sampling frame cannot cover the entire population. For example, the PUF and INSOLE data cover the full population of tax filers, but not non-filers. To cover the entire population, we approximate the number of non-filers as in Saez and Zucman (2016).⁴¹

In the SCF, the AP sample is derived from an address-based sample and covers the entirety of the U.S.⁴² The list sample covers the upper tail of the wealth distribution, allowing the SCF to have coverage at the top. However, the SCF is not allowed to sample the Forbes 400 families; these missing 400 imply coverage error in the SCF.⁴³ Below we describe a method of augmenting the SCF sample weights to include the missing Forbes 400 wealth.⁴⁴

⁴¹ One realization of this number may be 20 million, but varying this estimate can approximate the coverage error inherent in these income tax data.
⁴² Save some very remote areas that are too hard to contact in person, but less than 0.1 percent of the population lives in these areas.
⁴³ These families are too easily identifiable to be released in a public dataset.
⁴⁴ In appendix B, we can estimate coverage error by assuming that the missing Forbes 400 wealth follows a Pareto distribution. The Pareto distribution has been used in other studies to augment European household survey data (Vermeulen, 2015, Dalitz, 2016, Eckerstorfer et al, 2016).

Weights correction

Our preferred treatment of coverage error involves adjusting the SCF sample weights at the top and including a weighted version of the Forbes 400 wealth. We do so in a “combining samples” weighting approach by leveraging the overlap between the Forbes wealth and the wealth of some SCF respondents (O’Muircheartaigh and Pedlow, 2002).⁴⁵ The Forbes list relies, in part, on public knowledge of wealth (through public filings for publicly traded companies, or through self-promotion). Privately held forms of wealth, for example, can evade such public knowledge.

We begin by creating three wealth bins ($1-$2 billion, $2-$5 billion, and $5 billion or more) and counting the number of SCF and Forbes cases, weighted and unweighted, in the bins.⁴⁶ In each bin (b), we find the relative frequency (RF) of SCF and Forbes cases by the formula

\[ RF_{b,t} = \frac{n_{b,t}/N_{b,t}}{(n_{b,SCF}/N_{b,SCF}) + (n_{b,Forbes}/N_{b,Forbes})} \qquad (2) \]

for t = {SCF, Forbes}, b = {$1-$2 bill., $2-$5 bill., $5+ bill.}, where \(n_{b,t}\) is an unweighted count in bin b, \(N_{b,t}\) is a weighted count in bin b, and \(RF_{b,t}\) is defined in [0,1]. The combined and adjusted weight is

\[ adjusted_{wgt} = RF_{b,SCF} \cdot SCF_{wgt} + RF_{b,Forbes} \cdot Forbes_{wgt}, \]

where RF depends on b. With this weight we can use wealth information in the SCF and Forbes, weighted properly for the overlap in the two datasets.⁴⁷ Using this weight allows us to treat coverage error in the SCF wealth data and still use bootstrap replication techniques (described above) to estimate sampling variability.

⁴⁵ We do so in a similar way to how the AP and list sample weights are woven together to create final weights for the SCF (Kennickell and Woodburn, 1999). See, for example, Vermeulen (2015) for a visual of the overlap in the 2010 SCF, and Kennickell (2001). This overlap exists in every survey year used in this analysis.
⁴⁶ We assume that Forbes families are self-representing with a weight of one, so the number of weighted cases is equal to the number of unweighted cases. We use the SCF survey weight when considering the SCF cases.
⁴⁷ When SCF families with wealth greater than the minimum Forbes wealth have a sample weight greater than one, they represent not just themselves but other families with their wealth level. These are presumably families in the Forbes list. Thus, the SCF sample weights prior to this weight correction represent some of the Forbes families.

5.4 Variability in SCF wealth concentration estimates

Overall, the variability in the SCF top 1 percent wealth share is about plus or minus 2 to 3 percentage points in each survey year, symmetric about the point estimate (figure 8 in shaded blue). This variability reflects sampling error and item non-response error. In other work, we show that unit non-response at the top-end in the SCF can be ignored (Bricker et al 2016, figure 3).

The SCF top 1 percent wealth share varies by 1 or 2 percentage points each year due to this variability.⁴⁸ In the SCF, oversampling wealthy families keeps sampling variability low (as much as 1/6th of the unweighted sample count is in the top 1 percent). Item non-response variance can be estimated by the variance across the five implicates of (multiply-imputed) SCF data.

⁴⁸ We can estimate the sampling variability by re-sampling our data through bootstrap techniques described in Section III. We re-sample the SCF data 999 times (replicating the sample design) and recalculate top wealth shares in each re-sample to compute an estimate of sampling error.

The confidence intervals shown in the paper describe an estimate of both the sampling and imputation variance of the SCF estimates. The combined standard error of the top 1 percent wealth share due to both sampling and imputation is described by the formula:

\[ SE_{overall} = \left( var_{sampling} + (6/5)\, var_{imputation} \right)^{0.5}. \]

The shaded band in figure 7 describes this variation.
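Two calculations from this section lend themselves to a short sketch: the relative frequencies in equation (2) and the combined standard error. The code below is illustrative only. The function names, the example counts, and the simulated estimates are hypothetical, and the use of the sample variance (`ddof=1`) is an assumption, since the text does not specify the variance estimator. The factor `1 + 1/m` equals the 6/5 in the formula when there are m = 5 implicates.

```python
import numpy as np


def relative_frequencies(n_scf, N_scf, n_forbes, N_forbes):
    """Equation (2) for one wealth bin: returns (RF for SCF, RF for Forbes).

    n_* are unweighted case counts and N_* are weighted case counts.
    """
    scf_term = n_scf / N_scf
    forbes_term = n_forbes / N_forbes
    total = scf_term + forbes_term
    return scf_term / total, forbes_term / total


def combined_standard_error(replicate_estimates, implicate_estimates):
    """SE_overall = sqrt(var_sampling + (1 + 1/m) * var_imputation)."""
    var_sampling = np.var(replicate_estimates, ddof=1)      # across bootstrap replicates
    var_imputation = np.var(implicate_estimates, ddof=1)    # across implicates
    m = len(implicate_estimates)
    return float(np.sqrt(var_sampling + (1.0 + 1.0 / m) * var_imputation))


# Hypothetical bin: 10 SCF respondents standing in for 200 families, and
# 150 Forbes families that each carry a weight of one.
rf_scf, rf_forbes = relative_frequencies(n_scf=10, N_scf=200, n_forbes=150, N_forbes=150)
print(f"RF_SCF = {rf_scf:.3f}, RF_Forbes = {rf_forbes:.3f}")

# Simulated top 1 percent shares (percent): 999 bootstrap replicates and
# 5 implicates. These are placeholders, not SCF estimates.
rng = np.random.default_rng(0)
replicates = rng.normal(loc=36.0, scale=1.0, size=999)
implicates = np.array([35.8, 36.1, 36.0, 35.9, 36.2])
print(f"combined SE = {combined_standard_error(replicates, implicates):.2f}")
```

Because Forbes families are treated as self-representing, their ratio of unweighted to weighted counts is one, so the Forbes relative frequency dominates in any bin where SCF respondents carry weights well above one.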
5.5 Wealth concentration trends: incorporating variability

The “feasible region” of capitalized wealth estimates (the range of possible estimates under the permuted models in figure 4) is presented in shaded red in figure 8 along with the range of SCF estimates under sampling and item non-response variability, corrected for Forbes 400 under-coverage (in shaded blue). The shaded regions of both estimates generally intersect during the entire sample period, indicating again that there is no real difference in the levels and growth of these two sets of estimates.

However, most of the upper end of the recent red area is the homogeneous rate of return estimates. Figure 9 repeats figure 8 but uses only the set of four heterogeneous (by wealth) rate of return capitalization estimates. The red and blue areas overlap during the entire sample period.

6. Conclusion

In well-known recent research, wealth concentration estimated from capitalized income tax data is considerably different in level and trend from the SCF and other measures of wealth (Saez and Zucman, 2016). The results described here, though, show a surprisingly consistent set of estimates once the assumptions of the capitalization model agree with the data.

In the SCF survey data and the capitalized income tax data, wealth concentration grows from the early 2000s to the late 2000s, then either plateaus or declines slightly through the period of the Great Recession. This time trend is consistent across three SCF wealth measures, too, including our preferred estimate that includes DB pension wealth and Forbes 400 wealth.

This general agreement across wealth measurements is not unexpected, as the datasets have considerable commonality between them. The top-end estimates in the SCF are based on identifying wealthy families from administrative income tax data, and then asking each family the value of its assets in a structured interview. The capitalized income tax data also identify a set of wealthy families from the same administrative income tax data, and predict a value for their wealth.

Income tax data have been revelatory in the measurement of income concentration, and have the potential to estimate wealth concentration, too, because of nearly universal coverage of wealthy families (Greenwood, 1983, Kennickell, 2001, Saez and Zucman, 2016). Overall, though, wealth concentration estimates from capitalized income tax data can be highly variable, even more so than the SCF. Strong coverage of wealthy families in the tax data is undermined by highly variable wealth modeling, and that modeling is needed with these data.

Recent research indicates that rates of return are higher for wealthy families in most asset classes, not just interest-bearing assets (Fagereng et al., 2016). Of note, though, is that heterogeneous rates of return in other asset classes should not have as much of an effect on wealth concentration point estimates because the average rates of return in other asset classes never got as low as interest rates of return. But understanding how rates of return vary across the wealth distribution in these other asset classes in the United States is an area for future research.

Another area for future research is the path of wealth concentration from 2010 to the present. The recent SCF data indicate that concentration has increased at a much more rapid pace than it did in the 2000s, as equity prices have increased and homeownership has fallen, especially for less well-off families. The SCF estimates increase from 2010 to 2013 (shown in figure 6, for example), but capitalized estimates may (figure 4) or may not (figure 11) have increased during this period. It is possible, then, that the capitalized and SCF wealth concentration estimates are in less agreement in recent years.
References

[1] Alvaredo, Facundo, Anthony Atkinson, Lucas Chancel, Thomas Piketty, Emmanuel Saez, and Gabriel Zucman (2016). “Distributional National Accounts (DINA) Guidelines: Concepts and Methods used in WID.world,” WID.world working paper series, No. 2016/1.
[2] Bricker, Jesse, Lisa J. Dettling, Alice Henriques, Joanne W. Hsu, Lindsay Jacobs, Kevin B. Moore, John Sabelhaus, Jeffrey Thompson, and Richard A. Windle (2017). “Changes in U.S. Family Finances from 2013 to 2016: Evidence from the Survey of Consumer Finances,” Federal Reserve Bulletin, vol. 103 (September), pp. 1–42.
[3] Bricker, Jesse, Alice Henriques, Jacob Krimmel, and John Sabelhaus (2016). “Measuring Income and Wealth at the Top Using Administrative and Survey Data,” Brookings Papers on Economic Activity, Spring.
[4] Bricker, Jesse, Alice Henriques, and Kevin Moore (2017). “Updates to the Sampling of Wealthy Families in the Survey of Consumer Finances,” FEDS paper 2017-114.
[5] Bryan, Justin (2015). “High-Income Tax Returns for 2012,” Statistics of Income Bulletin, Summer, pp. 1-60.
[6] Calvet, Laurent, John Campbell, and Paolo Sodini (2007). “Down or Out: Assessing the Welfare Costs of Household Investment Mistakes,” Journal of Political Economy, vol. 115, no. 5, pp. 707-747.
[7] Cagetti, Marco and Mariacristina De Nardi (2007). “Entrepreneurship, Frictions, and Wealth,” Journal of Political Economy, vol. 114, no. 5, pp. 835-870.
[8] Clauset, Aaron, Cosma Rohilla Shalizi, and M.E.J. Newman (2009). “Power-law Distribution in Empirical Data,” SIAM Review, vol. 54, no. 2, pp. 661-703.
[9] Congressional Budget Office (2014). “The Distribution of Household Income and Federal Taxes, 2011.” Washington. https://www.cbo.gov/publication/49440
[10] Czajka, John, Amang Sukasih, and Brendan Kirwan (2014). “An Assessment of the Need for a Redesign of the Statistics of Income Individual Tax Sample” mimeo.
[11] Dalitz, Christoph (2016). “Estimating Wealth Distribution: Top Tail and Inequality” Hochschule Niederrhein Technical Report 2016-01.
[12] Eckerstorfer, Paul, Johannes Halak, Jakob Keppler, Bernhard Schutz, Florian Springholz and Rafael Wildauer (2016). “Correcting for the Missing Rich: an Application to Wealth Survey Data.” The Review of Income and Wealth, vol. 62, no. 4, pp. 605–627.
[13] Debacker, Jason, Bradley Heim, Vasia Panousi, Shanthi Ramnath, and Ivan Vidangos (2013). “Rising Inequality: Transitory or Persistent? New Evidence from a Panel of U.S. Tax Returns.” Brookings Papers on Economic Activity, Spring: 67–122. [14] Dettling, Lisa J., Sebastian J. Devlin-Foltz, Jacob Krimmel, Sarah J. Pack, and Jeffrey P. Thompson (2015). “Comparing Micro and Macro Sources for Household Accounts in the United States: Evidence from the Survey of Consumer Finances.” Finance and Economics Discussion Series, no. 2015-086. Washington: Board of Governors of the Federal Reserve System. [15] Fagereng, Andrea, Luigi Guiso, Davide Malacrino, and Luigi Pistaferri (2016). “Heterogeneity and Persistence in Returns to Wealth,” NBER Working Paper 22822. [16] Greenwood, Daphne (1983). “An Estimation of U.S. Family Wealth and Its Distribution from Microdata, 1973.” Review of Income and Wealth vol. 29, no. 1, pp. 23–44. [17] Kennickell, Arthur (2011). “Ponds and Streams” mimeo. [18] Kennickell, Arthur (2007). “The Role of Over-sampling of the Wealthy in the Survey of Consumer Finances” mimeo. [19] Kennickell, Arthur B., and R. Louise Woodburn (1999). “Consistent Weight Design for the 1989, 1992, and 1995 SCFs, and the Distribution of Wealth.” Review of Income and Wealth, vol. 45, no. 2, pp. 193–215. [20] Kopczuk, Wojciech (2015). “What Do We Know about the Evolution of Top Wealth Shares in the United States?” Journal of Economic Perspectives 29, no. 1: 47–66. [21] Kuznets, Simon (1953). “Shares of Upper Income Groups in Income and Savings.” New York: National Bureau of Economic Research. [22] Larrimore, Jeff, Jacob Mortenson, and David Splinter (2017). “Household Incomes in Tax Data: Using Addresses to Move from Tax Unit to Household Income Distributions.” Finance and Economics Discussion Series, no. 2017-002. Washington: Board of Governors of the Federal Reserve System. [23] Neyman, Jerzy (1934). 
“On the Two Different Aspects of the Representative Method: The Method of Stratified Sampling and the Method of Purposive Selection.” Journal of the Royal Statistical Society 97, no. 4: 558-625. [24] O’Muircheartaigh, Colm, Stephanie Eckman, and Charlene Weiss (2002). “Traditional and Enhanced Field Listing for Probability Sampling.” In Proceedings of the Joint Statistical Meetings, Survey Research Methods Section. Alexandria: American Statistical Association. [25] O’Muircheartaigh, Colm, Steven Pedlow (2002). “Cumlulating Cases versus Combing Samples.” In Proceedings of the Joint Statistical Meetings, Survey Research Methods Section. Alexandria: American Statistical Association. 29
[26] Pareto, Vilfredo (1896). “Cours d’conomie Politique.” Geneva: Droz. [27] Piketty, Thomas (2014). “Capital in the Twenty-First Century.” Belknap Press. [28] Piketty, Thomas, and Emmanuel Saez (2003). “Income Inequality in the United States, 1913-1998.” Quarterly Journal of Economics, vol. 118, no. 1: 1–39. [29] Saez, Emmanuel (2016) “Improving the Individual Tax Return Public Use Tax Files for Tax and Distributional Analysis” mimeo. [30] Saez, Emmanuel, and Gabriel Zucman (2016) “Wealth Inequality in the United States since 1913: Evidence from Capitalized Income Tax Data,” Quarterly Journal of Economics, vol. 131, no. 2, pp. 519–578. [31] Statistics of Income (2012a). “Individual Income Tax Returns.” Washington, DC: Internal Revenue Service [32] Statistics of Income (2012b). “General Description Booklet for the 2008 Public Use Tax File.” Compiled by Victoria Bryant, Washington, DC: Internal Revenue Service [33] Stiglitz, Joseph E (2012). “The Price of Inequality: How Today’s Divided Society Endangers Our Future.” New York: W. W. Norton. [34] Thompson, Jeffrey, MichaelParisi, andJesseBricker(2018)“TopIncomeConcentration and Volatility,” FEDS paper 2018-010. [35] Vermeulen, Philip (2015). “How Fat Is the Top Tail of the Wealth Distribution?” Working Paper no. 1692. European Central Bank. 30
Table 1: Rate of return on interest-bearing assets of wealthiest families.

Year  (1) Estate tax  (2) SCF  (3) SCF-INSOLE  (4) 10-year Treasury  (5) Memo: SZ16
2001  4.0%            2.8%     na.             5.0%                  4.0%
2002  3.1%            na.      na.             4.6%                  3.0%
2003  2.8%            na.      na.             4.0%                  2.4%
2004  2.3%            2.3%     4.2%            4.3%                  1.9%
2005  2.8%            na.      na.             4.3%                  2.1%
2006  3.3%            na.      na.             4.8%                  2.7%
2007  3.4%            3.9%     3.4%            4.6%                  2.9%
2008  3.2%            na.      na.             3.7%                  2.3%
2009  2.1%            na.      na.             3.3%                  1.7%
2010  1.9%            2.4%     4.3%            3.2%                  1.4%
2011  1.7%            na.      na.             2.8%                  1.2%
2012  na.             na.      na.             1.8%                  1.0%
2013  na.             1.9%     2.3%            2.4%                  na.
2014  na.             na.      na.             2.5%                  na.
2015  na.             na.      na.             2.1%                  na.
2016  na.             1.6%     1.7%            1.8%                  na.

Note: Rates of return are inferred from ratio of income to assets. Column 1 is inferred from INSOLE income and estate tax assets, and is taken from appendix table C6b of Saez and Zucman, 2016. Column 2 is inferred from ratio of total SCF income to total SCF assets of families in top 1% of net worth. Column 3 is inferred from ratio of total INSOLE income to total SCF assets of families in top 1% of net worth. Column 4 is average annual 10-year Treasury yield. Column 5 shows the homogeneous rate of return from Saez and Zucman (2016) that is applied to the wealthiest families in those estimates. Rates of return from estate tax assets are a weighted average: return from estates valued $20 million or more (weight=0.5), return from estates valued $10-$20 million (weight=0.3), and $5-$10 million (weight=0.2). Estate tax rates of return are from appendix table C6b of Saez and Zucman, 2016.
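Because the capitalization method infers an asset stock by dividing observed income by an assumed rate of return, the choice among the columns of Table 1 matters most when rates are low. The short sketch below illustrates this with the 2011 row of the table. It is only an illustration: the $1,000 of interest income is a hypothetical figure, and the function and variable names are ours rather than anything used in the underlying estimates.

```python
def capitalize(income, rate_of_return):
    """Implied asset stock: income divided by the assumed rate of return."""
    if rate_of_return <= 0:
        raise ValueError("rate of return must be positive")
    return income / rate_of_return


# Rates of return on interest-bearing assets for 2011, taken from Table 1.
RATES_2011 = {
    "SZ16 homogeneous": 0.012,
    "estate tax": 0.017,
    "10-year Treasury": 0.028,
}

interest_income = 1_000.0  # hypothetical dollars of interest income

implied_assets = {name: capitalize(interest_income, r) for name, r in RATES_2011.items()}
for name, assets in implied_assets.items():
    print(f"{name:>18}: ${assets:,.0f}")

# Ratio of the assets implied by the homogeneous rate to those implied by the Treasury yield.
overstatement = implied_assets["SZ16 homogeneous"] / implied_assets["10-year Treasury"]
print(f"homogeneous / Treasury = {overstatement:.2f}")
```

With these rates, the same interest income implies roughly 2.3 times as much in interest-bearing assets under the homogeneous rate as under the 10-year Treasury yield.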
Figure 1: Share of wealth held by top 1 percent of families in SCF and capitalized from income tax data. Source: Federal Reserve Board, Survey of Consumer Finances, Saez and Zucman (2016).
[Plot not reproduced. Axes: year (2001 to 2013) by percent (30 to 50). Series: SCF bulletin; SCF + DB pension; SZ.]
Figure 2: Rate of return on interest-bearing assets, ratio of wealthiest to overall. See table 1 for estimates of rates of return on interest-bearing assets for wealthiest families across time. The ratio of rate of return of wealthiest to overall rate of return in population is plotted here. Source: Survey of Consumer Finances, SOI (INSOLE), Saez and Zucman (2016).
[Plot not reproduced. Axes: year (1998 to 2016) by ratio (1 to 1.5). Series: SCF assets, SCF income; SCF assets, admin income; Estate tax assets, admin income.]
Figure 3: Wealth concentration using several alternate heterogeneous rates of return by wealth. Share of wealth held by top 1 percent of tax filing units, 2002 to 2012. Solid red line is wealth share estimates from baseline capitalization model in Saez and Zucman (2016) with homogeneous rates of return on all assets. All black lines and the red dash-dotted line assume heterogeneous rates of return by wealth on interest-bearing assets. The red dash-dotted line assumes the wealthy top 1 percent have a return on interest assets equal to the 10-year Treasury yield. Black dashed line assumes the wealthy top 1 percent have the rate of return on interest assets found in matched estate tax-income tax data (see Saez and Zucman, 2016). Black dotted line assumes the wealthy top 1 percent have the same rate of return on interest assets as the wealthy top 1 pct. in SCF data. Black dash-dotted line assumes the wealthy top 1 percent have the same rate of return on interest assets as the wealthy top 1 pct. in matched SCF-INSOLE data. Note: in years where the SCF was not conducted, we use the rate of return from the nearest year. We also infer the estate tax rate of return in 2012 from table 1 and figure 2. Source: Public Use File (SOI), Saez (2016) top-up of PUF data, and authors’ calculations.
[Plot not reproduced. Axes: year (2003 to 2011) by percent (30 to 40). Series: Homogeneous rate of return (RoR) on all assets; Heterog. RoR on interest assets: estate tax; Heterog. RoR on interest assets: SCF; Heterog. RoR on interest assets: SCF-INSOLE; Heterog. RoR on interest assets: 10-year Treasury.]
Figure 4: Wealth concentration in capitalized income tax model, by rate of return assumptions. Note: Share of wealth held by top 1 percent of tax filing units, 2002 to 2012. Solid red line is wealth share estimates from baseline capitalization model in Saez and Zucman (2016) with homogeneous rates of return on all assets. Dashed red line is top 1 pct. wealth share assuming heterogeneous rates of return on interest-bearing assets where the interest income of top 1 pct. of total income is capitalized with 10-year Treasury yield (as in Saez and Zucman, 2016). Lower two red lines (dashed and dotted) show top 1 pct. wealth share assuming heterogeneous rates of return on interest-bearing assets where the interest income of top 1 pct. of total wealth or interest income is capitalized with 10-year Treasury yield. In these lines, the top 1 pct. wealth share peaks in 2008. Source: Public Use File (SOI), Saez (2016) top-up of PUF data, and authors’ calculations.
[Plot not reproduced. Axes: year (2003 to 2011) by percent (30 to 40). Series: Homogeneous rate of return (RoR) on all assets; Heterogeneous RoR on interest assets, by income; Heterogeneous RoR on interest assets, by int inc; Heterogeneous RoR on interest assets, by wealth.]
Figure 5: SCF wealth concentration estimates including Forbes wealth, tax units. The blue dashed line shows the share of wealth held by the top 1 percent of families in the SCF after an estimate of DB pension wealth is included. The blue dotted line shows the marginal increase in wealth concentration after Forbes 400 wealth is included. And the blue solid line is further augmented to “tax unit” form. This solid blue line, then, is the SCF wealth concentration estimate, augmented to include an estimate of household wealth derived from Defined Benefit (DB) pensions, an estimate of Forbes 400 wealth excluded from the SCF, and an estimate of the top 1 percent of tax units, as in income tax data. Source: Survey of Consumer Finances, authors’ calculations.
[Plot not reproduced. Axes: year (2001 to 2013) by percent (30 to 40). Series: SCF + DB; SCF + DB + Forbes; SCF + DB + Forbes, tax units.]
Figure 6: Wealth concentration in SCF data and capitalized income data. Red and black lines as in figure 3 and blue line as in figure 5. Blue line is SCF wealth concentration estimate, augmented to include an estimate of household wealth derived from Defined Benefit (DB) pensions, an estimate of Forbes 400 wealth excluded from the SCF, and to include estimate of top 1 percent of tax units, as in income tax data. Source: Survey of Consumer Finances, Public Use File (SOI), Saez (2016) top-up of PUF data, and authors’ calculations.
[Plot not reproduced. Axes: year (2001 to 2013) by percent (30 to 40). Series: SCF (incl. DB, Forbes, tax units); Homogeneous rate of return (RoR) on all assets; Heterog. RoR on interest assets: estate tax; Heterog. RoR on interest assets: SCF; Heterog. RoR on interest assets: SCF-INSOLE; Heterog. RoR on interest assets: 10-year Treasury.]
Figure 7: Wealth concentration in SCF data, capitalized income data, and estate tax data. Red, black, and blue lines as in figure 6. Brown line is top 1 percent wealth share derived from estate tax data (see Saez and Zucman, 2016, and Kopczuk and Saez, 2004). Source: Survey of Consumer Finances, Public Use File (SOI), Saez (2016) top-up of PUF data, Saez and Zucman (2016), and authors’ calculations.
[Plot not reproduced. Axes: year (2001 to 2013) by percent (20 to 40). Series: SCF (incl. DB, Forbes, tax units); Homogeneous rate of return (RoR) on all assets; Heterog. RoR on interest assets: estate tax; Heterog. RoR on interest assets: SCF; Heterog. RoR on interest assets: SCF-INSOLE; Heterog. RoR on interest assets: 10-year Treasury; Estate tax (individual level).]
Figure 8: Range of estimates of wealth concentration: SCF and capitalized income tax data. The upper bound of the red area is the baseline capitalization estimate (from figures 3, 4, 6, and 7). The blue shaded area is the 95% confidence interval from estimated sampling and non-sampling errors in SCF data. Source: Survey of Consumer Finances, Public Use File (SOI), Saez (2016) top-up of PUF data, and authors’ calculations.
[Plot not reproduced. Title: Variability in SCF, capitalized top 1 wealth shares. Axes: year (2001 to 2013) by percent (30 to 40). Shaded areas: Capitalized feasible set; SCF + DB + Forbes + tax units: 95% CI.]
Figure 9: Range of estimates of wealth concentration: SCF and heterogeneous capitalized income tax data. Solid red line is the homogeneous rate of return estimate from figure 4 and traces out the upper bound on figure 8. For each year, the upper bound of the red area is the maximal heterogeneous rate of return wealth concentration estimate from figure 3 (and repeated in figures 6 and 7). The blue shaded area is the 95% confidence interval from estimated sampling and non-sampling errors in SCF data. Source: Survey of Consumer Finances, Public Use File (SOI), Saez (2016) top-up of PUF data, and authors’ calculations.
[Plot not reproduced. Title: Variability in SCF, capitalized top 1 wealth shares. Axes: year (2001 to 2013) by percent (30 to 40). Shaded areas: Capitalized (heterogeneous RoR) feasible set; SCF + DB + Forbes + tax units: 95% CI.]
7. Appendix A: wealth concentration in INSOLE data

The administrative income data used in the main results of the paper are the Public Use File (PUF) of the Individual and Sole Proprietor (INSOLE) data. The INSOLE data are a set of administrative records derived from income tax returns and describe the distribution of income and income sources, deductions, and taxes paid in a tax year (Statistics of Income, 2012a). While some INSOLE datasets are available to us, we can access only some years of data, and the INSOLE data available to us do not have all of the variables used in Saez and Zucman (2016).

We use the PUF for the main results because we have a consistent time series of these data (2002-2012) and all of the variables used in Saez and Zucman (2016) are available in the PUF. Further, with the set of top-up data created by Saez (2016), our PUF data match those in the INSOLE file. However, the PUF time series ends in 2012, a year with a spike in estimated wealth concentration from most capitalization models. It was advantageous to realize income in tax year 2012, before the scheduled 2013 tax increases on capital income, so the spike in wealth concentration estimated from income tax data is not unexpected.

In this appendix we use the INSOLE data available to us to demonstrate several things. First, because our INSOLE data extend through 2013, we can show that wealth concentration in 2013 is approximately at 2011 levels (not the 2012 spike) for the heterogeneous return models. Second, we demonstrate that the results in figure 6, which are based on PUF data topped-up to match the INSOLE, are broadly consistent with INSOLE data. Third, though our INSOLE data are only available in scattered years, they are panel datasets so we can estimate how much permanent income matters to the wealth estimates capitalized from income. The estimates are not very different from those using annual income.49

49 In terms of income concentration, though, using permanent income matters quite a bit in terms of levels (Thompson, Parisi, and Bricker, 2018).
7.1 Estimates from annual INSOLE income data

The INSOLE data from the years 2005, 2008, 2011, and 2013 are available to us as the sampling frame for the SCF (Bricker, Henriques, and Moore, 2017). In addition, we have a 2-year panel of income preceding the INSOLE year for nearly all INSOLE filers. These data, though, come with a cost relative to the PUF data used in our main results. The INSOLE data available to us come with less detail than the PUF data, especially in business income. For example, in the main results we separate out S-corp income from partnership income, and can identify profits or losses from both. In our INSOLE data, we have only a summary measure of the net income from S-corps and partnerships. In this section, then, we capitalize the income available to us in the same manner as is reported in figure 6, for example, but with less income detail.50 The detail about financial income, though, is unchanged.

50 Thus, unlike the main results we will not be able to fully replicate the Saez and Zucman (2016) baseline results.

Figure 10 shows the estimated wealth concentration under the homogeneous rate of return assumption (dashed red line) and under heterogeneous returns on interest-bearing assets (by estate tax, SCF, SCF-INSOLE, and 10-year Treasury rates of return for top 1 percent by wealth). Wealth concentration estimated with the homogeneous return assumption increases slightly from 2011 to 2013, while concentration estimated with heterogeneous rates of return shows no growth between 2011 and 2013.

The spread between the homogeneous and heterogeneous return estimates is even wider in 2013 than in 2011 in figure 10, which is a function of the widening spread between the homogeneous rate of return and the alternate rates. The rate of return on interest-bearing assets in 2013 in the homogeneous return model is about 0.85%. For comparison, the 10-year Treasury yield is about 3 times that large in 2013, and the SCF and SCF-INSOLE based rates of return are about 2 and 2.5 times that large. In 2011, though, the spreads between the homogeneous return and the alternate rates of return are a bit smaller.

Figure 10: Share of wealth held by top 1 percent: SCF and capitalization model with INSOLE income data. Using 2005, 2008, 2011, and 2013 INSOLE data to replicate figure 6. Source: INSOLE, Statistics of Income, authors’ calculations.
[Plot not reproduced. Axes: year (2001 to 2013) by percent (30 to 40). Series: SCF + DB + Forbes, tax units; Homogeneous rate of return (RoR) on all assets; Heterog. RoR on interest assets: estate tax; Heterog. RoR on interest assets: SCF; Heterog. RoR on interest assets: SCF-INSOLE; Heterog. RoR on interest assets: 10-year Treasury.]

We can also use the 2012 income from the 2013 INSOLE income panel (that is used to sample for the SCF) and confirm that these data show a spike in 2012 for both the homogeneous and heterogeneous rate of return models, as in figure 6 (figure 11). The 2012 spike seems to be temporary, though, as wealth concentration in 2013 is approximately the same as the 2011 concentration estimates (figure 10). The fact that the 2012 income used here is a panel of the 2013 INSOLE filers should bias us toward finding no spike, as there is no guarantee that the set of filers based on 2013 income would have actually realized a lot of income in 2012.
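To see why a wider gap between the homogeneous rate and the alternate rates translates into a wider gap between top share estimates, consider a stylized calculation. This is a sketch under strong assumptions, not the capitalization model itself: aggregate wealth is held fixed, only the top group's interest income is re-capitalized, and every dollar amount is hypothetical. Only the two rates (0.85% and a rate three times as large) echo the 2013 comparison in the text.

```python
def top_share(top_interest_income, top_rate, top_other_wealth, aggregate_wealth):
    """Top-group wealth share when its interest income is capitalized at top_rate.

    Assumes aggregate wealth is fixed (hypothetical simplification), so a higher
    assumed rate for the top group only shrinks that group's implied assets.
    """
    top_interest_assets = top_interest_income / top_rate
    return (top_other_wealth + top_interest_assets) / aggregate_wealth


# Hypothetical economy: aggregate wealth of 100, of which the top group holds
# 30 in non-interest-bearing assets and earns 0.085 in interest income.
AGGREGATE, TOP_OTHER, TOP_INTEREST_INCOME = 100.0, 30.0, 0.085

share_homogeneous = top_share(TOP_INTEREST_INCOME, 0.0085, TOP_OTHER, AGGREGATE)
share_heterogeneous = top_share(TOP_INTEREST_INCOME, 3 * 0.0085, TOP_OTHER, AGGREGATE)

print(f"homogeneous rate (0.85%):   top share = {share_homogeneous:.1%}")
print(f"heterogeneous rate (2.55%): top share = {share_heterogeneous:.1%}")
```

In this toy setting, tripling the rate applied to the top group's interest income cuts its implied interest-bearing assets to one third, and the top share falls accordingly. The larger the ratio between the two rates, the larger the gap between the two share estimates.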
Figure 11: Share of wealth held by top 1 percent: SCF and capitalization model with INSOLE income data. As in figure 10, using 2005, 2008, 2011, and 2013 INSOLE data to replicate figure 6. In addition, in this figure we use the 2012 panel income from the 2013 INSOLE filers to investigate the spike in 2012 estimates in the 10-year Treasury and SCF models from figure 6. This spike is seen here for both, but concentration estimates for both the 10-year Treasury and SCF-based heterogeneous models fall back to 2011 levels in 2013. Source: INSOLE, Statistics of Income, authors’ calculations.
[Plot not reproduced. Axes: year (2001 to 2013) by percent (30 to 40). Series: SCF + DB + Forbes, tax units; Heterog. RoR on interest assets: estate tax; Heterog. RoR on interest assets: SCF; Heterog. RoR on interest assets: SCF-INSOLE; Heterog. RoR on interest assets: 10-year Treasury.]

7.2 Estimates from permanent income

For the years in which we have INSOLE data, we also have two years of panel data that we use to construct a proxy for permanent income: the mean of three years of income, by income type. Figure 12 shows the estimates of wealth concentration using annual income and the range of values allowed when using permanent income. Wealth concentration estimates can change by 1 to 2 percentage points per year when using permanent versus annual income, but the trends are generally the same.

Figure 12: Range of wealth concentration estimates when using permanent income and annual income. Using 2005, 2008, 2011, and 2013 INSOLE data as in figure 10. Range of estimates defined by estimates using permanent and annual income. Only heterogeneous models with 10-year Treasury and estate tax rates of return for top 1% are shown for space. Source: INSOLE, Statistics of Income, authors’ calculations.
[Plot not reproduced. Axes: year (2005, 2008, 2011, 2013) by percent (30 to 40). Series: Heterog. RoR on interest assets: estate tax; Heterog. RoR on interest assets: 10-year Treasury.]

7.2.1 Time series properties of income

The estimates shown in figure 12 are not very different from those in figure 10, but which household income should be used in these capitalized wealth predictions: annual or permanent?

If we model current income of family $i$ in year $t$ ($y_{it}$) as a function of observables ($z_{it}$) and an unexplained stochastic term ($u_{it}$), then $y_{it} = \beta z_{it} + u_{it}$. The unexplained term is often modeled as the sum of a persistent component ($p_{it}$) and a transitory innovation ($e_{it}$) to income: $u_{it} = p_{it} + e_{it}$. The transitory innovation has the form $e_{it} = \varepsilon_{it} + \delta_1 \varepsilon_{i,t-1} + \delta_2 \varepsilon_{i,t-2} + \dots$ and the persistent component has the form $p_{it} = \rho_1 p_{i,t-1} + \mu_{it}$, where $\mu_{it}$ is a white noise error term.

If $\rho_1 = 1$ then the stochastic term of income is said to have a unit root (or to follow a random walk) and changes to income over time are governed by white noise ($\mu_{it}$) and transitory changes from the lagged values of $\varepsilon_{it}$. If the correlation between current transitory innovations to income and past transitory innovations (the $\delta$ terms) dies out quickly then income can be described as having a unit root with a low order moving average (MA) component. Income changes are often modeled as a random walk with a low order MA component (Meghir and Pistaferri, 2004), and empirical support for this model with MA(2) has been found (Abowd and Card, 1989, MaCurdy, 1982).

If income follows a random walk then the most current set of income data is the most valuable, as these income data contain all the information needed to predict future income. In this case, averaging multiple years of income would not help in our wealth estimation, and moving the sampling data back in time would also cost us a lot. However, more recent estimates of the time series properties of income indicate that $\rho_1 < 1$, meaning that income does not follow a random walk (Baker, 1997; DeBacker, Panousi, Ramnath, and Vidangos, 2013; Altonji, Smith and Vidangos, 2014).

Long time series of income are needed to tease out these time series properties: with 10 years of data, earnings appear to follow a random walk (Abowd and Card, 1989, MaCurdy, 1982) but with 20 years of earnings data on the same people a random walk is not found (Baker, 1997). Our 4-year panel is likely not able to identify these time series properties. But DeBacker et al. (2013) use a 23-year time series of SOI household income and find estimates of $\rho_1 < 1$, indicating a lack of random walk in household income.

8. Appendix B: Pareto estimates

When empirical information on a distribution is sparse or otherwise flawed, knowledge of the functional form can contribute to the measurement of the distribution.
In the case of wealth, its distribution has long been modeled as a type of power law distribution called a Pareto distribution (Pareto, 1896). In the Pareto distribution, above a minimum point $x_{min}$ the distribution follows the form $(x_{min}/x)^{\alpha}$.52 We investigate the application of the Pareto distribution to improve wealth concentration estimates using SCF data and find that it reduces estimate precision without evidence of improved accuracy.

52 This is the functional form of $1 - F(x)$ for values above $x_{min}$, and the pdf is $p(x) = x^{-\alpha}$.

In the economics literature, income and wealth distributions are often assumed to be a mixture of a Pareto distribution at the top of the distribution and a lognormal in the lower part of the distribution (Gabaix, 2009, Atkinson 1978, 2005, Piketty, 2002, Benhabib and Moll, 2013, Klass et al 2009). Building on this idea, recent research has applied the Pareto distribution to household survey data, assuming that the survey can collect income and wealth data up to the point $x_{min}$, and assuming a Pareto distribution above $x_{min}$. Most notably, this has been applied to wealth data from the Household Finance and Consumption Survey (HFCS), a survey of families in European countries (Vermeulen, 2015, Dalitz, 2016, Bach et al 2015, Eckerstorfer et al 2016).

In the simplest case, the Pareto distribution would just fill in the wealth held by families necessarily omitted from a household survey: the Forbes 400, in the case of the SCF (see section 2). As described in these papers, though, the maximum value of wealth collected in HFCS data often falls short of the minimal wealth in so-called “rich lists”, that is, family wealth data from sources like the Forbes 400, or Manager magazin in Germany. In most countries, the gap between the survey data and rich list data is large (Vermeulen 2015). In these cases, the Pareto distribution will also have to fill in these gaps.

On the other hand, the SCF shows no such gaps; indeed, even though members of the Forbes 400 are excluded from the SCF, there is overlap between the richest survey respondents and the poorest Forbes members, even in the public SCF data (figure 13).53 Thus, the SCF is an example of the simplest case, whereby the Pareto distribution fills in the missing wealth held by families necessarily omitted (the Forbes 400).

53 The SCF is the only wealth survey in Vermeulen (2015) with this overlap, though the Spanish and French HFCS surveys have a large number of wealthy families. Both of these surveys use oversampling techniques similar to those of the SCF. The publicly-available version of the SCF goes through a disclosure review where high- and low-end estimates are imputed.
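To see what the tail form implies for the wealth that is filled in above $x_{min}$, the mean of a Pareto tail follows directly from the survival function $(x_{min}/x)^{\alpha}$. This is a standard textbook result rather than a calculation taken from the estimates in this paper, and it requires $\alpha > 1$:

```latex
\begin{align*}
E[x \mid x > x_{min}]
  &= x_{min} + \int_{x_{min}}^{\infty} \left(\frac{x_{min}}{x}\right)^{\alpha} dx \\
  &= x_{min} + \frac{x_{min}}{\alpha - 1}
   = \frac{\alpha}{\alpha - 1}\, x_{min}.
\end{align*}
```

For example, with $\alpha = 1.3$ average wealth above the cutoff is about 4.3 times $x_{min}$, while with $\alpha = 1.5$ it is 3 times $x_{min}$. A smaller $\alpha$ means a heavier tail, so the wealth assigned to families above the cutoff is sensitive to small changes in the estimated tail index.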
Figure 13: Top tail of public SCF and Forbes wealth data. Source: Vermeulen (2015)
[Plot not reproduced. The figure reproduces “Figure 1: Tail of the wealth distribution: USA” from the appendix of Vermeulen (2015), which shows the tail of the wealth distribution (starting at 1 million euro) on a log-log scale. Axes: wealth in million euro ($10^{0}$ to $10^{5}$) by $P(X \geq x)$ ($10^{-7}$ to $10^{0}$). Series: Empirical ccdf (Survey); Empirical ccdf (Forbes); Regression (survey); Regression (survey and Forbes); Pseudo Maxlik (survey). Vermeulen notes that for the US the slope of the regression line changes very little when Forbes data are added to the survey data.]

Modeling the missing wealth ($x$) with a Pareto and a known $x_{min}$ involves estimating only the $\alpha$ parameter in the distribution function $(x_{min}/x)^{\alpha}$. The $\alpha$ parameter can be estimated via maximum likelihood: $\hat{\alpha}_{ml} = \left[ \sum_{w_i = w_{min}}^{w_{max}} \frac{n(w_i)}{N(w_{min})} \ln(w_i / w_{min}) \right]^{-1}$ with standard error $\hat{\alpha}_{ml} \cdot 1/[N(w_{min}) - 0.5]$ (Gabaix 2009).

The estimates of $\alpha$ are shown in table 2 (below) along with the new estimated top 1 percent wealth share and the implied percentage point increase in the top 1 percent share. The increase in estimated concentration, around 1.5 percentage points each year, is similar to (but slightly larger than) the re-weighting method described in section 2 and shown in figure 5.54 Here we use the SCF data up to the maximal SCF value; hence, $x_{min}$ is suppressed in the table for disclosure reasons, but we can note that in each year in the table, the maximal SCF wealth value lies above the minimal Forbes value. The standard error around the percentage point increase in top share (last column) is fairly small (not shown).

54 Note that the exercise shown here is on the top 1% wealth share implied from the SCF bulletin data (excluding DB pension estimates), while that in figure 5 uses the SCF bulletin data including the DB pension estimate. The levels are different, but the change in top share estimates after incorporating Forbes wealth is similar.

Year  $\hat{\alpha}$ [95 pct. CI]  $\hat{x}_{min}$  SCF bull.  SCF bull. + Rich list
2001  1.30 [1.25,1.36]             $xxxb            32.6       33.9
2004  1.34 [1.26,1.41]             $xxxb            33.4       34.6
2007  1.49 [1.42,1.55]             $xxxb            33.8       35.3
2010  1.39 [1.34,1.45]             $xxxb            34.5       35.8
2013  1.32 [1.27,1.37]             $xxxb            36.3       38.2

Table 2: SCF + Rich list top shares 1998-2013

However, the above exercise assumes that $x_{min}$ is known. In the papers linking survey data and Pareto distributions (Vermeulen, 2015; Dalitz, 2016; Bach et al., 2015; Eckerstorfer et al., 2016), it is assumed that $x_{min}$ is unknown. Typically, these papers either select several possible points for $x_{min}$ (Vermeulen, 2015) and calculate top wealth shares based on the several resulting $\hat{\alpha}$, or apply a van der Wijk test, whereby $x_{min}$ is estimated as the lowest value $x$ beyond which the ratio of $\bar{x}/x$ is constant (Dalitz, 2016; Bach et al., 2015). A confidence interval could be inferred based on the range of concentration estimates from varying the $x_{min}$.

Clauset, Shalizi, and Newman (2009), though, propose a formal method that estimates $x_{min}$ and bootstraps the sample to estimate a confidence interval of top share estimates, resampling the dataset 999 times. The share of wealth held by the top 1% under this estimation is shown in the third column of table 3. The increase in top share estimates is generally comparable to those reported in figure 5 (after re-weighting) and in table 3 after picking $x_{min}$.55 The upper and lower bounds of the 95% confidence interval from this approach are presented in table 3. The wide range of potential top share estimates highlights the reduction in precision of top share estimates when implementing Pareto interpolation.

55 See footnote 54.
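The known-$x_{min}$ estimator of $\alpha$ and a resampling interval in the spirit of the bootstrap just described can be sketched in a few lines. This is a minimal illustration on simulated data, not the authors' code. The function names, the simulated sample, the equal weights, and the use of a percentile interval for $\alpha$ (rather than for the top share, and without re-estimating $x_{min}$ in each replication as Clauset et al. do) are all assumptions made for the example.

```python
import math
import random


def pareto_alpha_mle(wealth, weights, w_min):
    """Weighted ML estimate of the Pareto tail index for wealth >= w_min.

    Computes alpha = 1 / sum_i (n_i / N) * ln(w_i / w_min), where n_i is the
    weight of observation i and N is the total weight at or above w_min.
    """
    tail = [(w, n) for w, n in zip(wealth, weights) if w >= w_min]
    total_weight = sum(n for _, n in tail)
    mean_log_ratio = sum(n * math.log(w / w_min) for w, n in tail) / total_weight
    return 1.0 / mean_log_ratio


def bootstrap_alpha_ci(wealth, weights, w_min, reps=999, level=0.95, seed=0):
    """Percentile bootstrap interval for alpha, resampling observations with replacement."""
    rng = random.Random(seed)
    n_obs = len(wealth)
    draws = []
    for _ in range(reps):
        idx = [rng.randrange(n_obs) for _ in range(n_obs)]
        draws.append(pareto_alpha_mle([wealth[i] for i in idx],
                                      [weights[i] for i in idx], w_min))
    draws.sort()
    lower = draws[int((1 - level) / 2 * (reps - 1))]
    upper = draws[int((1 + level) / 2 * (reps - 1))]
    return lower, upper


# Hypothetical data: 5,000 draws from a Pareto tail with alpha = 1.4 and w_min = 1,
# generated by inverting the survival function (w_min / w) ** alpha.
rng = random.Random(42)
TRUE_ALPHA, W_MIN = 1.4, 1.0
sample = [W_MIN * (1.0 - rng.random()) ** (-1.0 / TRUE_ALPHA) for _ in range(5000)]
equal_weights = [1.0] * len(sample)

alpha_hat = pareto_alpha_mle(sample, equal_weights, W_MIN)
ci_low, ci_high = bootstrap_alpha_ci(sample, equal_weights, W_MIN, reps=199)
print(f"alpha_hat = {alpha_hat:.3f}, 95% bootstrap CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With survey data, `weights` would hold the sampling weights, so that heavily weighted observations contribute proportionally more to the average log ratio. The example uses 199 replications only to keep it quick to run.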
Year SCF bull. SCF bull. + Pareto interpolation LB of top share UB of top share 2001 32.6 33.0 30.6 43.7 2004 33.4 34.1 31.9 37.5 2007 33.8 34.7 32.2 37.7 2010 34.5 35.0 32.7 40.4 2013 36.3 37.5 34.3 47.0 Table 3: Top 1 percent wealth share in SCF with Pareto interpolation. The LB and UB are lower and upper bounds of 95% confidence interval from a bootstrap approach to estimating a confidence interval of Pareto-inferred top share estimates (resampling the dataset 999 times and implementing the method developed in Clauset et al., 2009). Source: Survey of Consumer Finances, authors’ calculations. However, these exercises—and the above-cited economics literature—also assume that a Pareto distribution is appropriate to use to model wealth at the top of the distribution. A method of testing the validity of modeling data as a Pareto distribution is proposed in Clauset et al. (2009) using the following simulation based approach. First, calculate of the maximum likelihood estimator of the slope parameter (αˆ) using each point in the dataset as the minimum point. Then, all of these Pareto distributions are compared based on the Kolmogorov-Smirnov statistic, with the x associated with the best score selected as the min cutoff. Next, synthesize new datasets where points below x are drawn from the existing min dataset, and points above are randomly generated from a Pareto distribution. Running the Clauset power law estimation method (described above) on these synthetic datasets, storing the Kolmogorov-Smirnov statistic of each final distribution. If the Kolmogorov-Smirnov statistic of the actual data is outside the bootstrapped 95% confidence interval, the null of a Pareto-distributed dataset is rejected. We ran this test on all iterations of the SCF from 1989-2013 augmented with Forbes data, and the null was rejected for all years.56 While rejecting this test does not mean that there is no value in thinking of the U.S. 
top-end wealth distribution as Pareto-distributed, it does cast doubt on the ability of Pareto interpolation to increase the accuracy of SCF wealth concentration estimates.

56 Clauset et al. (2009) also reject for Forbes 400 data alone.

9. Appendix C: Top 0.1% estimates

The evolution of the share of wealth held by the top 0.1% of families is similar to that of the top 1% in the main text (figure 14). The wealth share in all models (the homogeneous rate of return capitalization model and the four heterogeneous return models) and in the SCF increases from the early 2000s through the late 2000s. From 2007 to 2010, the SCF concentration dips (before increasing from 2010 to 2013), and the heterogeneous models level off from about 2007 to 2011. The wealth concentration implied by the homogeneous returns model, though, increases at a faster rate from 2007-2011 than in the 2002-2007 period.

The level of wealth concentrated at the top 0.1% in the heterogeneous capitalized models is a bit higher than in the SCF. In the early 2000s, the difference is about 1 to 2 percentage points, and it increases to 3 or 4 percentage points by the end of the data. However, most of this difference is due to the different valuation of assets in the two data sources. When the SCF is augmented to reflect the Financial Accounts valuations (which underlie capitalized estimates), the share of wealth held by the top 0.1% is nearly identical in the SCF and the heterogeneous return capitalization models (see the dash-dotted blue line in figure 14).57

Prior work has shown that the SCF oversample targets families from all parts of the top 1 percent, and that the income and predicted wealth of responding families is distributionally equivalent to nonresponding families (Bricker et al., 2016, figure 4). The difference, then, is not due to lack of representation at the top in the SCF.

57 Recall that the capitalized wealth estimates distribute the aggregate wealth in the Financial Accounts of the United States, and the SCF uses self-reported values of the respondents. The aggregate assets and debts are comparable in the two datasets, as are asset and debt classes (Dettling et al., 2015), with the exception of two notable asset classes: housing and privately held business values are both higher, in aggregate, in the SCF. Distributionally, these should act in countervailing directions, with housing pushing toward more equal wealth distribution and business pushing toward less equality. There are equally valid reasons to prefer one over the other, but we prefer the SCF valuations: the SCF collects the market value of a business while the FA collects the book value, and the SCF collects the owner’s valuation of a house rather than one derived from a house price index.

Figure 14: Share of wealth held by top 0.1 percent: SCF and capitalization model with PUF income data. [Line chart. Vertical axis: Percent (ticks at 10, 15, 20). Horizontal axis: Year (2001, 2004, 2007, 2010, 2013). Series: SCF (incl. DB, Forbes, tax units); SCF (incl. DB, Forbes, tax units, FA values); Homogeneous rate of return (RoR) on all assets; Heterog. RoR on interest assets: estate tax; Heterog. RoR on interest assets: SCF; Heterog. RoR on interest assets: SCF-INSOLE; Heterog. RoR on interest assets: 10-year Treasury.] Source: Survey of Consumer Finances, Public Use File (SOI), Saez (2016) top-up of PUF data, and authors’ calculations.
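As a supplementary illustration, the Pareto goodness-of-fit test described in the discussion of Table 3 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' code. The function names (`fit_pareto_tail`, `pareto_gof`) and the `min_tail` guard are hypothetical, the data are treated as unweighted (SCF survey weights are ignored), and the slope is estimated with the continuous-data maximum likelihood formula α̂ = 1 + n / Σ ln(x_i / x_min). The two-sided rejection rule follows the wording in the text; the sketch also reports the share of synthetic Kolmogorov-Smirnov statistics at least as large as the observed one, which is the p-value used in Clauset et al. (2009).

```python
import numpy as np


def fit_pareto_tail(x, min_tail=10):
    """Try each data point as x_min; keep the fit with the smallest KS distance.

    Returns (x_min, alpha_hat, ks). `min_tail` is an illustrative guard
    (an assumption) that avoids fitting a tail with very few observations.
    """
    x = np.sort(np.asarray(x, dtype=float))
    best = None
    # Exclude the maximum so every candidate tail has a value above x_min.
    for xmin in np.unique(x)[:-1]:
        tail = x[x >= xmin]
        n = tail.size
        if n < min_tail:
            break  # tails only shrink as x_min grows
        # Continuous-data MLE of the density exponent.
        alpha = 1.0 + n / np.sum(np.log(tail / xmin))
        model_cdf = 1.0 - (tail / xmin) ** (1.0 - alpha)
        # KS distance between the empirical and fitted tail CDFs.
        ecdf_hi = np.arange(1, n + 1) / n
        ecdf_lo = np.arange(0, n) / n
        ks = max(np.max(ecdf_hi - model_cdf), np.max(model_cdf - ecdf_lo))
        if best is None or ks < best[2]:
            best = (float(xmin), float(alpha), float(ks))
    return best


def pareto_gof(x, n_sims=200, seed=0):
    """Semi-parametric bootstrap of the KS statistic for a Pareto tail."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    xmin, alpha, ks_obs = fit_pareto_tail(x)
    body = x[x < xmin]
    p_tail = np.mean(x >= xmin)
    ks_sim = np.empty(n_sims)
    for s in range(n_sims):
        # Each synthetic point falls in the tail with the observed tail share.
        n_tail = int(np.sum(rng.random(x.size) < p_tail))
        n_body = x.size - n_tail
        # Tail points: inverse-CDF draws from the fitted Pareto distribution.
        u = 1.0 - rng.random(n_tail)  # values in (0, 1]
        tail_draw = xmin * u ** (-1.0 / (alpha - 1.0))
        # Body points: resampled from the observed data below x_min.
        body_draw = rng.choice(body, size=n_body) if n_body > 0 else np.empty(0)
        # Re-run the full estimation (including the x_min search).
        ks_sim[s] = fit_pareto_tail(np.concatenate([body_draw, tail_draw]))[2]
    lo, hi = np.quantile(ks_sim, [0.025, 0.975])
    return {
        "xmin": xmin,
        "alpha": alpha,
        "ks": ks_obs,
        "ci": (float(lo), float(hi)),
        "reject": not (lo <= ks_obs <= hi),  # rule as worded in the text
        "p_value": float(np.mean(ks_sim >= ks_obs)),
    }


if __name__ == "__main__":
    # Synthetic lognormal "wealth" values, used only to exercise the code.
    demo = np.random.default_rng(42).lognormal(mean=12.0, sigma=1.5, size=400)
    print(pareto_gof(demo, n_sims=50))
```

The search over x_min is quadratic in the sample size, so a sketch like this is practical only for samples of a few thousand observations. A weighted version would be needed before applying it to SCF microdata.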
Cite this document
Jesse Bricker, Alice Henriques, & Peter Hansen (2018). How much has wealth concentration grown in the United States? A re-examination of data from 2001-2013 (FEDS 2018-024). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2018-024
@techreport{wtfs_feds_2018_024,
author = {Jesse Bricker and Alice Henriques and Peter Hansen},
title = {How much has wealth concentration grown in the United States? A re-examination of data from 2001-2013},
type = {Finance and Economics Discussion Series},
number = {2018-024},
institution = {Board of Governors of the Federal Reserve System},
year = {2018},
url = {https://whenthefedspeaks.com/doc/feds_2018-024},
abstract = {Well known research based on capitalized income tax data shows robust growth in wealth concentration in the late 2000s. We show that these robust growth estimates rely on an assumption---homogeneous rates of return across the wealth distribution---that is not supported by data. When the capitalization model incorporates heterogeneous rates of return (on just interest-bearing assets), wealth concentration estimates in 2011 fall from 40.5% to 33.9%. These estimates are consistent in levels and trend with other micro wealth data and show that wealth concentration increases until the Great Recession, then declines before increasing again.},
}