feds · July 31, 2011

Taxation of Human Capital and Wage Inequality: A Cross-Country Analysis

Abstract

Wage inequality has been significantly higher in the United States than in continental European countries (CEU) since the 1970s. Moreover, this inequality gap has further widened during this period as the US has experienced a large increase in wage inequality, whereas the CEU has seen only modest changes. This paper studies the role of labor income tax policies for understanding these facts, focusing on male workers. We construct a life cycle model in which individuals decide each period whether to go to school, work, or stay non-employed. Individuals can accumulate skills either in school or while working. Wage inequality arises from differences across individuals in their ability to learn new skills as well as from idiosyncratic shocks. Progressive taxation compresses the (after-tax) wage structure, thereby distorting the incentives to accumulate human capital, in turn reducing the cross-sectional dispersion of (before-tax) wages. Consistent with the model, we empirically document that countries with more progressive labor income tax schedules have (i) significantly lower before-tax wage inequality at different points in time and (ii) experienced a smaller rise in wage inequality since the early 1980s. We then study the calibrated model and find that these policies can account for half of the difference between the US and the CEU in overall wage inequality and 84% of the difference in inequality at the upper end (log 90-50 differential). In a two-country comparison between the US and Germany, the combination of skill-biased technical change and changing progressivity of tax schedules explains all the difference between the evolution of inequality in these two countries since the early 1980s.

Finance and Economics Discussion Series Divisions of Research & Statistics and Monetary Affairs Federal Reserve Board, Washington, D.C. Taxation of Human Capital and Wage Inequality: A Cross-Country Analysis Fatih Guvenen, Burhanettin Kuruscu, and Serdar Ozkan 2013-20 NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Fatih Guvenen∗ Burhanettin Kuruscu† Serdar Ozkan‡

September 26, 2012

Keywords: Wage Inequality, Human Capital, Skill-Biased Technical Change, Tax Policies.
∗University of Minnesota and NBER; guvenen@umn.edu; https://sites.google.com/site/fatihguvenen/
†University of Toronto; burhan.kuruscu@utoronto.ca; http://sites.google.com/site/bkuruscu
‡Federal Reserve Board; serdar.ozkan@frb.gov; www.serdarozkan.me

1 Introduction

Why is wage inequality significantly higher in the United States than in continental European countries (CEU)? And why has this inequality gap between the US and the CEU widened substantially since the 1970s (see Table 1)? More broadly, what are the determinants of wage dispersion in modern economies? How do these determinants interact with technological progress and government policies? The goal of this paper is to shed light on these questions by studying the impact of labor market (tax) policies on the determination of wage inequality, focusing on male workers and using cross-country data.

We begin by documenting two new empirical relationships between wage inequality and tax policy. First, we show that countries with more progressive labor income tax schedules have significantly lower wage inequality at different points in time.1 The measure of wages we use is “gross before-tax wages” and can therefore be thought of as a proxy for the marginal product of workers.2 From this perspective, progressivity is associated with a more compressed productivity distribution across workers. Second, we show that countries with more progressive income taxes have also experienced a smaller rise in wage inequality over time, and this relationship is especially strong above the median of the wage distribution.

These findings reveal a close relationship between progressivity and wage inequality, which motivates the focus of this paper. However, on their own, these correlations fall short of providing a quantitative assessment of the importance of the tax structure (e.g., what fraction of cross-country differences in wage inequality can be attributed to tax policies?). For this purpose, we build a model.

Specifically, we construct a life cycle model that features some key determinants of wages, most notably human capital accumulation and idiosyncratic shocks. Here is an overview of the framework.
Individuals enter the economy with an initial stock of human capital and are able to accumulate more human capital over the life cycle using a Ben-Porath (1967) style technology (which essentially combines learning ability, time, and existing human capital for production). Individuals can choose to either invest in human capital on the job up to a certain fraction of their time or enroll in school where they can invest full time. We assume that skills are general and labor markets are competitive. As a result, the cost of on-the-job investment will be borne by the workers, and firms will adjust the wage rate downward by the fraction of time invested on the job. Therefore, the cost of human capital investment is the forgone earnings while individuals are learning new skills.

1 In contemporaneous work, Duncan and Peter (2008) also construct income tax schedules for a broad set of countries and empirically investigate the relation between progressivity and income inequality. Although their measure of progressivity and income is different from ours along important dimensions, they document a strong negative relationship between progressivity and income inequality, consistent with our findings here.
2 The precise definition of gross wages is given in footnote 12.

Table 1: Log Wage Differential Between the 90th and 10th Percentiles (Male Workers)

Country        1978-1982 average    2001-2005 average    Change
Denmark        –                    0.97                 –
Finland        0.89                 0.94                 0.05
France         1.22                 1.14                 -0.08
Germany        0.93                 1.06                 0.07
Netherlands    0.84                 1.05                 0.11
Sweden         0.73                 0.87                 0.14
CEU            0.92                 1.01                 0.06
UK             0.99                 1.28                 0.29
US             1.28                 1.60                 0.32

We introduce two main features into this framework. First, we assume that individuals differ in their learning ability. As a result, individuals differ systematically in the amount of investment they undertake and, consequently, in the growth rate of their wages over the life cycle. Thus, a key source of wage inequality in this model is the systematic fanning out of the wage profiles.3 Second, we allow for endogenous labor supply choice, which amplifies the effect of progressivity, a point that we return to shortly. Finally, for a comprehensive quantitative assessment, we also allow idiosyncratic shocks to workers’ labor efficiency and model differences in consumption taxes and pension systems, which vary greatly across these countries.

The model described here provides a central role for policies that compress the wage structure (such as progressive income taxes) because such policies hamper the incentives for human capital investment. This is because a progressive system reduces after-tax wages at the higher end of the wage distribution compared with the lower end. As a result, it reduces the marginal benefit of investment (the higher wages in the future) relative to the marginal cost (the current forgone earnings), thereby depressing investment. A key observation is that this distortion varies systematically with the ability level (specifically, it worsens with higher ability), which then compresses the before-tax wage distribution.
These effects of progressivity are compounded by endogenous labor supply and differences in average income tax rates: the higher taxes in the CEU reduce labor supply and, consequently, the benefit of human capital investment, further compressing the wage distribution.

3 Recent evidence from panel data on individual wages provides support for individual-specific growth rates in wage earnings (cf. Baker (1997), Guvenen (2007, 2009), Huggett et al. (forthcoming)).

The main quantitative exercise we conduct is the following. We consider the eight countries listed in Table 1, for which we have complete data for all variables of interest. We assume that all countries have the same innate ability distribution but allow each country to differ in the observable dimensions of its labor market structure, such as in labor income (and consumption) tax schedules and retirement pension system. We then calibrate the model-specific parameters to the US data and keep these parameters fixed across countries. The policy differences we consider explain about half of the observed gap in the log 90-10 wage differential between the US and the CEU in the 2000s and 84% of the wage inequality above the median (log 90-50 differential). The model explains only about 24% of the difference in the lower tail inequality between the US and the CEU, which is consistent with the idea that the human capital mechanism is likely to be more important for higher ability individuals and, therefore, above the median of the distribution. We also provide a decomposition that isolates the roles of (i) the progressivity of income taxes, (ii) average income tax rates, (iii) consumption taxes, and (iv) the pension system. We find that progressivity is by far the most important component, accounting for about 2/3 of the model’s explanatory power.

The second question we ask is whether the widening of the inequality gap between the US and the CEU since the late 1970s could also be explained by the same human capital channels discussed earlier. One challenge we face in trying to answer this question is that the country-specific tax schedules that we derive in this paper are only available for the years after 2001 (because the detailed information from OECD sources for taxes is only available after that date), whereas the tax structure has changed over time for several of the countries in our sample.
Fortunately, for two countries in our sample (the US and Germany) we are also able to derive tax schedules for 1983, which reveal significantly more flattening of tax schedules in the US compared with Germany from 1983 to 2003. When these changes in progressivity and skill-biased technical change (SBTC) are jointly taken into account, the (recalibrated) model generates a much larger rise in inequality in the US than in Germany, in fact slightly overestimating the actual widening of the inequality gap between these countries.

Finally, in Section 6, we test some key implications of our model for life cycle behavior using micro data. First, the model predicts that a country with a more progressive tax system should have a flatter age profile of average wages (by dampening human capital accumulation) compared with a less progressive one. Similarly, progressivity will imply a flatter profile of within-cohort wage inequality over the life cycle. We provide a comparison of the United States (using the Panel Study of Income Dynamics, PSID, data) and Germany (using the German Socio-Economic Panel, GSOEP) and find strong support for both predictions.

1.1 Related Literature

The negative relation between inequality and redistribution has also been studied in earlier papers. Among these, Benabou (2000), Moene and Wallerstein (2001), and Hassler et al. (2003) use a political economy framework to explain how countries with high inequality and low redistribution (e.g., the United States) can coexist with countries with low inequality and high redistribution (e.g., continental Europe). Hassler et al. (2003) emphasize the interaction between political economy and human capital investment: redistribution reduces human capital investment by the young, in turn reducing wages throughout the life cycle, and thus implying that a larger share of voters will benefit from redistributive politics. As a result, the model features multiple equilibria. An important implication of this environment is that an increase in pre-tax inequality strengthens the incentives for investment and reduces, ceteris paribus, the fraction of voters supporting redistribution. Benabou (2000) explores the effects of redistribution in the presence of imperfect credit markets. When inequality is very low, the benefit of redistribution comes mainly from higher output (due to the relaxation of credit constraints for some high productivity individuals). But when inequality is high, the wealthy do not want redistribution. Thus, support for redistribution decreases initially with higher inequality. Moene and Wallerstein (2001) consider the redistributive and insurance aspects of welfare benefits. In their framework, an increased gap between median and mean income increases political support for welfare benefits if benefits are targeted to the employed as redistribution, but decreases the political support if the benefits are targeted to the poor as insurance against income loss.
When the targeting of benefits is made endogenous, their model implies that political support for insurance against income risk still declines as the gap between the median and the mean increases. The channels explored in these papers are likely to be complementary to ours.

In terms of methodology, this paper is most closely related to the recent macroeconomics literature that has written fully specified models to address US-CEU differences in labor market outcomes. Prominent examples include Ljungqvist and Sargent (1998), Ljungqvist and Sargent (2008), and Hornstein et al. (2007), who focus on unemployment rates, and Prescott (2004), Ohanian et al. (2008), and Rogerson (2008), who study labor hours differences. Several of these papers rely on representative agent models and are, therefore, silent on wage inequality; and those that do allow for individual-level heterogeneity do not address differences in wage inequality. In terms of modeling choices, the closest framework to ours is Kitao et al. (2008), who study a rich life cycle framework with human capital accumulation and job search and model the benefits system. Their goal is to explain the different unemployment patterns over the life cycle in the US and Europe.

Finally, a number of recent papers share some common modeling elements with ours but address different questions. Important examples include Altig and Carlstrom (1999), Krebs (2003), Caucutt et al. (2006), and Huggett et al. (forthcoming). Altig and Carlstrom (1999) study the quantitative impact of the Tax Reform Act of 1986 on income inequality arising solely from behavioral responses associated with labor supply and saving decisions and find that distortions arising from marginal tax rate changes have sizable effects on income inequality. Krebs (2003) studies the impact of idiosyncratic shocks on human capital investment and shows that reducing income risk can increase growth, in contrast to the standard incomplete markets literature, which typically reaches the opposite conclusion. Caucutt et al. (2006) develop an endogenous growth model with heterogeneity in income. They show that a reduction in the progressivity of tax rates can have positive growth effects even in situations where changes in flat-rate taxes have no effect. Another important contribution is Huggett et al. (forthcoming), who study the distributional implications of the Ben-Porath model and estimate the sources of lifetime inequality using US earnings data. Finally, Erosa and Koreshkova (2007) investigate the effects of replacing the current U.S. progressive income tax system with a proportional one in a dynastic model. They find a large positive effect on steady state output, which comes at the expense of higher inequality. Although our paper has many useful points of contact with this body of work, to our knowledge, our combination of human capital accumulation, ability heterogeneity, progressive taxation, and endogenous labor supply is new, as is the attempt to explain cross-country inequality facts in such a framework.
The next section lays out the main model and explains the various channels through which tax policy affects wage inequality. Section 3 describes how the country-specific tax schedules are estimated and uses the estimates to document two new empirical relationships between taxes and inequality. Sections 4 and 5 discuss the parameterization and the main quantitative results. Section 6 examines a series of micro implications of the human capital mechanism proposed in this paper. Section 7 concludes.

2 The Model

We begin by describing the features of the human capital investment problem. Using this environment, we discuss the various channels through which tax policy affects wage inequality. We then enrich this framework by introducing empirically relevant features (such as idiosyncratic shocks and labor market institutions) that are necessary for a sound quantitative analysis.

2.1 Human Capital Accumulation

Consider an individual who derives utility from consumption and leisure and has access to borrowing and saving at a constant interest rate, $r$. Let $\beta$ be the subjective time discount factor and assume $\beta(1+r) = 1$. Each individual has one unit of time in each period, which he can allocate to three different uses: work, leisure, and human capital investment. If an individual chooses to work, he can allocate a fraction ($i$) of his working hours ($n$) to human capital investment. At age $s$, new human capital, $Q_s$, is produced according to a Ben-Porath technology:

$$Q_s = A^j (h_s i_s n_s)^{\alpha}, \tag{1}$$

where $h_s$ denotes the individual’s current human capital stock and $A^j$ is the learning ability of individual type $j$. We assume that skills are general and labor markets are competitive. As a result, the cost of human capital investment is completely borne by workers, and firms adjust the hourly wage rate, $w_s$, downward by the fraction of time invested on the job: $w_s = P_H h_s (1 - i_s)$, where $P_H$ is the price of human capital; labor income is simply $y_s = w_s n_s$. Finally, let $\bar{\tau}(y)$ and $\tau(y)$ denote, respectively, the average and marginal labor income tax functions. The problem of a type $j$ individual can be written as

$$\max_{c_s,\, a_{s+1},\, i_s} \; \sum_{s=1}^{S} \beta^{s-1} u(c_s, 1 - n_s)$$

subject to

$$c_s + a_{s+1} = (1 - \bar{\tau}(y_s))\, y_s + (1 + r)\, a_s$$

$$h_{s+1} = h_s + A^j (h_s i_s n_s)^{\alpha} \tag{2}$$

$$y_s = P_H h_s (1 - i_s)\, n_s. \tag{3}$$

The opportunity “cost of investment” (in human capital units) is equal to $h_s i_s n_s$ and, using equation (1), it can be written as $C_j(Q_s^j) = (Q_s^j / A^j)^{1/\alpha}$, which will play a key role in the optimality conditions that follow.

A key parameter in the Ben-Porath technology is $A^j$. Heterogeneity in $A^j$ implies that individuals will differ systematically in the amount of human capital they accumulate and, consequently, in the growth rate of their wages over the life cycle. This systematic fanning out of wage profiles is the major source of wage inequality in this model.
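To make the fanning-out mechanism concrete, the following is a minimal sketch of the Ben-Porath technology in equation (1), the law of motion in equation (2), and the cost function $C_j(Q)$. It is not the authors' solution method: investment time and hours are held fixed at hypothetical values rather than chosen optimally, and every parameter value below is an illustrative assumption.

```python
def new_human_capital(ability, h, i, n, alpha):
    """Equation (1): Q = A * (h * i * n) ** alpha."""
    return ability * (h * i * n) ** alpha


def investment_cost(q, ability, alpha):
    """Cost in human capital units: C(Q) = (Q / A) ** (1 / alpha)."""
    return (q / ability) ** (1.0 / alpha)


def human_capital_path(ability, h0, i, n, alpha, periods):
    """Iterate equation (2), h_{s+1} = h_s + Q_s, with fixed i and n.

    Holding i and n fixed is a simplifying assumption for illustration;
    in the model both are chosen by the individual.
    """
    path = [h0]
    for _ in range(periods):
        h = path[-1]
        path.append(h + new_human_capital(ability, h, i, n, alpha))
    return path


# Hypothetical parameter values, chosen only for illustration.
ALPHA, H0, I_FIXED, N_FIXED, PERIODS = 0.8, 1.0, 0.2, 0.4, 40

low = human_capital_path(0.10, H0, I_FIXED, N_FIXED, ALPHA, PERIODS)
high = human_capital_path(0.30, H0, I_FIXED, N_FIXED, ALPHA, PERIODS)

# Ratio of high- to low-ability human capital at entry and at the end.
initial_ratio = high[0] / low[0]
final_ratio = high[-1] / low[-1]
```

Both types start from the same stock, so any gap that opens up over the simulated life cycle comes from the difference in learning ability alone. With optimal investment the gap would presumably differ, since higher-ability types would also choose different investment paths.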

2.2 Inspecting the Mechanisms

We are now ready to discuss how taxation of human capital can affect wage inequality. To this end, it is useful to distinguish between two cases.

Inelastic Labor Supply. First, suppose that labor supply is inelastic. Assuming an interior solution, the optimality condition for human capital investment is

$$(1 - \tau(y_s))\, C_j'(Q_s^j) = \beta (1 - \tau(y_{s+1})) + \beta^2 (1 - \tau(y_{s+2})) + \dots + \beta^{S-s} (1 - \tau(y_S)), \tag{4}$$

which equates the after-tax marginal cost of investment on the left hand side to the after-tax marginal benefit on the right.4 To understand the effect of taxes, first consider the case where taxes are flat rate ($\tau'(y) = 0$, $\forall y$). In this case, all terms involving taxes cancel out:

$$C_j'(Q_s^j) = \beta + \beta^2 + \dots + \beta^{S-s}.$$

Thus, flat-rate taxes have no effect on human capital investment. This is a well-understood insight that goes back to at least Heckman (1976) and Boskin (1977).5

Now consider progressive taxes, i.e., $\tau'(y) > 0$. We rearrange equation (4) to get:

$$C_j'(Q_s^j) = \beta \frac{1 - \tau(y_{s+1})}{1 - \tau(y_s)} + \beta^2 \frac{1 - \tau(y_{s+2})}{1 - \tau(y_s)} + \dots + \beta^{S-s} \frac{1 - \tau(y_S)}{1 - \tau(y_s)}. \tag{5}$$

With progressivity, as long as the individual’s earnings grow over the life cycle, the tax ratios in (5) will be strictly less than one, depressing the marginal benefit of investment, which in turn dampens human capital accumulation. Thus, these tax ratios capture the reduction in the value of future wage earnings compared with the forgone wage earnings today. This observation motivates our first measure of progressivity, what we refer to as the progressivity wedge, defined as:

$$PW(y_s, y_{s+k}) \equiv 1 - \frac{1 - \tau(y_{s+k})}{1 - \tau(y_s)}, \tag{6}$$

between any two ages $s$ and $s+k$. A progressivity wedge of zero corresponds to flat taxes, and progressivity increases with the size of the wedge. In the next section, we empirically measure these wedges from the data.

To understand the effect of progressive taxes on wage inequality, note that the distortion created by progressivity differs systematically across ability levels. At the low end, individuals with very low ability whose optimal plan involves no human capital investment in the absence of taxes would experience no wage growth over the life cycle and, therefore, no distortion from progressive taxation. At the top end, individuals with high ability (whose optimal plan implies low wage earnings early in life and very high earnings later) face very large wedges, which depress their investment. Thus, progressivity reduces the cross-sectional dispersion of human capital and, consequently, wage inequality in an economy, even with inelastic labor supply.

4 Notice that $P_H$ (the price of human capital) does not appear in the optimality condition (4) and, thus, has no effect on human capital decision. For clarity we set $P_H = 1$ from here on.
5 With pecuniary costs of investment, flat taxes can affect human capital investment, as shown by King and Rebelo (1990) and Rebelo (1991). Similarly, Lucas (1990) shows that flat taxes can have a negative impact on human capital investment when labor supply is elastic.

Endogenous Labor Supply. Second, consider now the case with elastic labor supply. The first order condition can be shown (see Appendix A.1) to be:

$$C_j'(Q_s^j) = \beta \frac{1 - \tau(y_{s+1})}{1 - \tau(y_s)}\, n_{s+1} + \beta^2 \frac{1 - \tau(y_{s+2})}{1 - \tau(y_s)}\, n_{s+2} + \dots + \beta^{S-s} \frac{1 - \tau(y_S)}{1 - \tau(y_s)}\, n_S, \tag{7}$$

where now the marginal benefit accounts for the utilization rate of human capital, which depends on the labor supply choice. Our second measure of progressivity is precisely motivated by this first order condition subject to a normalization:

$$PW_i^{*}(y_s, y_{s+k}) = 1 - \left( \frac{1 - \tau(y_{s+k})}{1 - \tau(y_s)} \right) \frac{n_i}{n_{\mathrm{avg}}}, \tag{8}$$

where $n_i$ is the hours per person in country $i$ and $n_{\mathrm{avg}}$ is the average of $n_i$ across all countries in the sample.6

Now, once again, consider the effect of flat-rate taxes. The intra-temporal optimality condition for labor-leisure choice implies that labor supply depends negatively on the tax rate and positively on the level of human capital.
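As a brief numerical aside, the two wedge measures in equations (6) and (8) can be sketched as simple functions of marginal tax rates and hours. The tax rates and hours below are hypothetical inputs chosen for illustration, not values from the paper's data.

```python
def progressivity_wedge(tau_s, tau_s_plus_k):
    """Equation (6): PW = 1 - (1 - tau(y_{s+k})) / (1 - tau(y_s))."""
    return 1.0 - (1.0 - tau_s_plus_k) / (1.0 - tau_s)


def hours_adjusted_wedge(tau_s, tau_s_plus_k, hours_i, hours_avg):
    """Equation (8): PW* = 1 - [(1 - tau(y_{s+k})) / (1 - tau(y_s))] * (n_i / n_avg)."""
    ratio = (1.0 - tau_s_plus_k) / (1.0 - tau_s)
    return 1.0 - ratio * (hours_i / hours_avg)


# Hypothetical marginal tax rates at ages s and s+k (not from the paper).
flat = progressivity_wedge(0.30, 0.30)         # same rate at both ages
progressive = progressivity_wedge(0.30, 0.45)  # rate rises with earnings

# Low progressivity combined with above-average hours.
adjusted = hours_adjusted_wedge(0.30, 0.32, hours_i=1.2, hours_avg=1.0)
```

In this sketch the flat schedule gives a wedge of zero, the progressive schedule gives a positive wedge, and the hours-adjusted wedge turns negative when hours are sufficiently far above the sample average relative to the degree of progressivity.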
A higher tax rate depresses labor supply choice (as long as the income effect is not too large), which then reduces the marginal benefit of human capital investment, which reduces the optimal level of human capital. But labor supply in turn depends on the level of human capital, which further depresses labor supply, the level of human capital, and so on. Therefore, with endogenous labor supply, even a flat-rate tax has an effect on human capital investment, which can also be large because of the amplification described here.

In summary, the baseline model studied here implies that countries with more progressive tax systems will have lower wage inequality. As will become clear later, these countries will also experience a smaller change in wage inequality in response to technological changes (such as SBTC). In Section 3, we examine these predictions empirically.

6 Notice that because of the rescaling by $n_{\mathrm{avg}}$, if a country has sufficiently high labor hours and low progressivity, this wedge measure can become negative (e.g., the US). Therefore, this new measure is defined relative to a given sample of countries, but is still informative about the relative return to human capital within a group of countries, which is the focus of this paper.

2.3 Enriching the Basic Framework

As stated earlier, the main goal of this paper is to provide a quantitative assessment of the importance of the tax structure (e.g., what fraction of cross-country differences in wage inequality can be attributed to tax policies?). For this purpose, we introduce several empirically relevant features that are necessary for a sound quantitative analysis.

Upper Bound on On-the-Job Investment. We impose an upper bound on the fraction of time that can be devoted to on-the-job investment: $i \in [0, \chi]$, where $\chi < 1$. Such an upper bound would arise, for example, when firms incur fixed costs for employing each worker (administrative burden, cost of office space, etc.) or as a result of minimum wage laws. Individuals can invest full-time by attending school ($i = 1$) and enjoy leisure for the rest of the time. Thus, the choice set is $i \in [0, \chi] \cup \{1\}$, which is non-convex when $\chi < 1$. Finally, human capital depreciates every period at rate $\delta < 1$.

Idiosyncratic Shocks. It is difficult to talk about wage inequality without any sort of idiosyncratic shock. In a human capital model, these shocks would interact with investment choice and can greatly affect the quantitative conclusions we draw from the analysis. Thus, we introduce idiosyncratic shocks.
Specifically, when an individual devotes $(1 - i_s) n_s$ hours producing for his employer, his effective labor supply becomes $\epsilon\, n_s (1 - i_s)$, where $\epsilon$ is an idiosyncratic Markov shock with a stationary transition matrix $\Pi(\epsilon' \mid \epsilon)$ that is identical across agents and over the life cycle. Note that these shocks are not to the stock of human capital (as, for example, in Huggett et al. (forthcoming)). Instead, these can be viewed as shocks to the rental rate or to the efficiency of labor supply.

Market Structure. A full set of one-period Arrow securities is available for trade at every date and state, allowing markets to be dynamically complete. An Arrow security that promises to deliver one unit of consumption good in state $\epsilon'$ tomorrow costs $q(\epsilon' \mid \epsilon)$ in state $\epsilon$ today. Individuals completely insure themselves against consumption risk by trading these securities. Hence, all individuals of a given type $j$ will have the same (and constant) consumption over the life cycle. However, individuals will have different realized paths of investment, human capital, labor supply, and wages.

Pension Benefits. It is easy to see from the discussion above of equations (5) and (7) thattheexistenceofaredistributivepensionsystemwillhaveaneffectsimilartoprogressive taxation. In addition, the retirement pension system represents a major use of tax revenues collected by governments. Therefore, modeling pensions is important for capturing how funds are returned to households. During retirement, individuals receive constant pension payments every period. Essentially, the pension of a worker with ability level j depends on two variables: (i) the average lifetime earnings of workers with the same ability level (denoted by yj), and (ii) the total number of years the worker had Social Security eligible earnings by the time he retired, denoted by mS. The pension function is denoted as Ω(yj,mS).7 The Tax System and the Government Budget. The government imposes a flat-rate consumption tax, τ¯, in addition to the (potentially) progressive labor income tax, τ¯(y).8 c The collected revenues are used for two main purposes: (i) to finance the benefits system, and (ii) to finance government expenditure, G, that does not yield any direct utility to consumers (because of either corruption or waste). The residual budget surplus or deficit, Tr, is distributed in a lump-sum fashion to all households. 2.4 Individuals’ Dynamic Program Individuals solve the following problem (ability type j is suppressed for clarity): V(h,a,m;(cid:15),s) = max [u(c,n)+βE(V(h(cid:48),a(cid:48)((cid:15)(cid:48)),m(cid:48);(cid:15)(cid:48),s+1)|(cid:15))] (9) c,n,i,a(cid:48)((cid:15)(cid:48)) s.t. (cid:88) (1+τ¯)c+ q((cid:15)(cid:48) | (cid:15))a(cid:48)((cid:15)(cid:48)) = (1−τ¯(y))y +a+Tr, (10) c y = (cid:15)h(1−i)n, (11) h(cid:48) = (1−δ)h+A(hin)α, (12) m(cid:48) = m+1{i < 1 & n ≥ n }, (13) min i ∈ [0,χ]∪{1}, 7In reality, pension payments depend on the workers’ own earnings history, but modeling this explicitly also adds an extra state variable, which this structure avoids. 
8Notice that this baseline model does not have a capital income tax, which is challenging to introduce for at least two reasons. First, because capital is mobile across international borders (much more so than labor and consumption), it is not exactly clear how we should think about its taxation. Second, and more importantly, capital income taxation introduces significant complications into the numerical solution of the problem. For these two reasons, we abstract from it in the baseline model here. In Appendix D, we study a particular way of taxing capital income and find that, while it matters quantitatively, it does not alter the main conclusions of the paper. 11

for s = 1, 2, ..., S. Equation (13) shows how individuals accumulate years of service, m. Specifically, individuals get one more year of service credit if they are not in school (i < 1) and are employed more than a certain threshold number of hours: $n > n_{\min}$.

After retirement, individuals receive a pension and there is no human capital investment. Since there is no uncertainty during retirement, a riskless bond is sufficient for smoothing consumption. Therefore, the problem at age s = S+1, ..., T can be written as

\[
W^R(a, y^j, m_S; s) = \max_{c,\,a'} \left[ u(c,0) + \beta\, W^R(a', y^j, m_S; s+1) \right] \tag{14}
\]
\[
\text{s.t.} \quad (1+\bar\tau_c)\,c + q\,a' = \left(1-\bar\tau(y_s)\right) y_s + a + Tr, \qquad y_s = \Omega(y^j, m_S).
\]

The definition of a stationary recursive competitive equilibrium in this environment is standard, so the formal statement is relegated to Appendix A.

3 Progressivity and Inequality: Two Empirical Facts

This section has two purposes. First, we discuss the derivation of country-specific tax schedules that are used in the rest of the paper. Using these tax schedules, we construct empirical measures of the two progressivity wedges defined in (6) and (8) above. Second, with these wedges on hand, we go on to document two new empirical relationships between wage inequality and the progressivity of (labor income) tax policy that are consistent with the presented model and further motivate the quantitative analysis that follows.9

9 In a different context, earlier papers by Rodriguez (1998) and Moene and Wallerstein (2001) empirically documented a negative relation between inequality and redistributive policies other than taxes. In a regression analysis of eighteen advanced industrialized countries, Moene and Wallerstein (2001) find that greater inequality is associated with lower spending on programs to insure against income loss as a share of both GDP and total government spending. Rodriguez (1998) reaches a similar conclusion: using data from 20 OECD countries and controlling for national income, population, and the age distribution, he finds that pretax inequality has a significantly negative effect on every major category of social transfers as a fraction of GDP.

3.1 Deriving Country-Specific Tax Schedules

For each country, we follow the procedure described here. First, the OECD tax database provides estimates of the total labor income tax for all income levels from half of average wage earnings (hereafter, AW) to two times AW. The calculation takes into account several types of taxes (central government, local and state, social security contributions made by
the employee, and so on), as well as many types of deductions and cash benefits (dependent exemptions, deductions for taxes paid, social assistance, housing assistance, in-work benefits, etc.).10 Using these estimates, we calculate the average labor income tax rate, $\bar\tau(y)$, for 50%, 75%, 100%, 125%, 150%, 175%, and 200% of AW. However, tax rates beyond 200% of AW are also relevant when individuals solve their dynamic program. Fortunately, another piece of information is available from the OECD: the top marginal tax rate and the top bracket corresponding to it for each country. As described in more detail in Appendix B.1, we use this information to generate average tax rates at income levels beyond two times AW. Then, we fit the following smooth function to the available data points:11

\[
\bar\tau(y/AW) = a_0 + a_1\,(y/AW) + a_2\,(y/AW)^{\varphi}. \tag{15}
\]

The parameters of the estimated $\bar\tau(y)$ functions for all countries are reported in Appendix B.1, along with the R2 values. Although the assumed functional form allows for various possibilities, all fitted tax schedules turn out to be increasing and concave. The lowest R2 is 0.984 and the mean is 0.991, indicating a very good fit. In Figure 1, we plot the estimated functions for three countries: one of the two least progressive (United States), the most progressive (Finland), and one with intermediate progressivity (Germany).

Figure 1: Average Tax Rate Functions, Selected OECD Countries, 2003. [Panels: United States, Germany, Finland. Horizontal axis: multiples of average earnings (in respective country); vertical axis: average labor income tax rate. Each panel shows the data points and the fitted function.]

10 Non-wage income taxes (e.g., dividend income, property income, capital gains, interest earnings) and non-cash benefits (free school meals or free health care) are not included in this calculation.

11 We have also experimented with several other functional forms, including a popular specification proposed by Gouveia and Strauss (1994), commonly used in the quantitative public finance literature (cf. Castañeda et al. (2003), Conesa and Krueger (2006), and the references therein). However, we found that the functional form used here provides the best fit across the board for this relatively diverse set of countries, as seen from the high R2 values in Table A.1.
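The fitting step for equation (15) can be illustrated with a short sketch. This is only one possible way to fit the functional form, not necessarily the procedure used for the estimates in Appendix B.1: it searches over a grid of exponents and, for each exponent, solves an ordinary linear least-squares problem for (a_0, a_1, a_2). The function names, the grid, and the data points are all hypothetical; the "observations" are generated from made-up parameters purely to show that the routine recovers them.

```python
import numpy as np

def avg_tax(x, a0, a1, a2, phi):
    """Average tax rate at income x = y/AW, using the functional form of equation (15)."""
    x = np.asarray(x, dtype=float)
    return a0 + a1 * x + a2 * x ** phi

def fit_avg_tax(x, rates, phi_grid):
    """Fit (a0, a1, a2, phi): grid search over phi, linear least squares for the rest.

    Assumption: this estimation strategy is illustrative; the source does not
    state which fitting routine was used.
    """
    x = np.asarray(x, dtype=float)
    rates = np.asarray(rates, dtype=float)
    best = None
    for phi in phi_grid:
        # For a fixed exponent the model is linear in (a0, a1, a2).
        X = np.column_stack([np.ones_like(x), x, x ** phi])
        coef, *_ = np.linalg.lstsq(X, rates, rcond=None)
        sse = float(np.sum((X @ coef - rates) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(phi), coef)
    sse, phi, coef = best
    sst = float(np.sum((rates - rates.mean()) ** 2))
    return {"a0": coef[0], "a1": coef[1], "a2": coef[2], "phi": phi, "r2": 1.0 - sse / sst}

# Hypothetical income grid (multiples of AW) and made-up "true" parameters.
x_obs = np.array([0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 3.0, 4.0, 6.0, 8.0])
true_params = dict(a0=0.10, a1=-0.01, a2=0.15, phi=0.5)
rates_obs = avg_tax(x_obs, **true_params)

# The grid avoids phi = 0 and phi = 1, where the regressors become collinear.
fit = fit_avg_tax(x_obs, rates_obs, np.linspace(0.1, 0.9, 81))
print(fit)
```

With real data, `x_obs` and `rates_obs` would be replaced by the OECD-based average tax rates at each multiple of AW, and the reported `r2` would play the role of the R2 values mentioned in the text.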

Figure 2: Progressivity Wedges at Different Income Levels: $1 - \frac{1-\tau(k \times 0.5)}{1-\tau(0.5)}$ for k = 2, 3, ..., 6. [Horizontal axis: y/AW. One line per country: DEN, FIN, SWE, NET, GER, FRA, US, UK.]

Figure 2 plots the progressivity wedges computed from the estimated tax schedules for all countries in our sample. Specifically, each line plots PW(0.5, 0.5k) for k = 1, 2, ..., 6, which are essentially the wedges faced by an individual who starts life at half the average earnings in that country and looks toward an eventual wage level that is up to six times his initial wage. As seen in the figure, countries are ranked in terms of their progressivity. Consistent with what one could conjecture, the US and the UK have the least progressive tax systems, whereas Scandinavian countries have the most progressive ones, and larger continental European countries are scattered between these two extremes. The differences also appear quantitatively large (although a more precise evaluation needs to await the quantitative analysis in the next section): for example, the marginal benefit of investment for a young worker in the US who invests today when his wage is 0.5 × AW and expects to earn 2 × AW in the future is 13% lower than in a flat-tax system. The comparable loss is 27% in Denmark and Finland. These differences grow with the ambition level of the individual, dampening human capital investment, especially at the top of the distribution.

3.2 Taxes and Inequality: Cross-Country Empirical Facts

The wage inequality data come from the OECD's Labour Force Survey database and are derived from the gross (before-tax) wages of full-time, full-year (or equivalent) workers.12 This is the appropriate measure for the purposes of this paper, as it more closely corresponds to the marginal product of each worker (and, hence, his wage) in the model.

12 More precisely, wages are measured before taxes and employees' social security contributions and also include bonuses and overtime pay when applicable. Therefore, they represent a fairly good measure of the total monetary compensation of a worker.

The fact that

the inequality data pertain to before-tax wages is important to keep in mind; if the data were for after-tax wages, the correlation between progressivity and inequality would be mechanical and, thus, not surprising at all. Furthermore, we focus on male workers to avoid potential selection issues that may arise due to wide differences in female labor force participation rates across countries.

We normalize AW in each country to 1 and focus on PW(0.5, 2.5) as the measure of progressivity. Similarly, when we calculate PW* for a given country, we use the average hours per person in that country between 2001 and 2005 for $n_i$ in equation (8), and the average of the same variable across all countries for $n_{avg}$.13 Finally, for brevity, in the rest of the paper we will refer to the "log 90-10 wage differential" simply as "L90-10," and similarly for the other wage differentials.

Figure 3 plots the relationship between L90-10 and the progressivity wedge in the 2000s. Countries with a smaller wedge (meaning a less progressive tax system and, therefore, a smaller distortion in human capital investment) have higher wage inequality. The relationship is also quite strong, with a correlation of −0.82.14 (Repeating the same calculation using PW* yields the same correlation.) Both relationships are consistent with the human capital model with progressive taxes presented above.

Figure 3: Progressivity Wedge (PW(0.5, 2.5)) and the L90-10 in 2003. [Horizontal axis: progressivity wedge; vertical axis: log 90-10 wage differential. Scatter of US, UK, FRA, GER, NET, DEN, FIN, SWE with regression line; Corr = −0.82.]

13 The data on average hours per person for each country have been kindly provided to us by Richard Rogerson and are the same as those used in Ohanian et al. (2008).

14 This strong relationship is robust to using wedges calculated from different parts of the income distribution: for example, the correlations between L90-10 and PW(k, k+m), as k and m are varied between 0.5 and 2.5, range from −0.74 to −0.87.

We next turn to the change in inequality over time. Figure 4 plots PW* versus the
change in L90-50 (left panel) and L50-10 (right panel). Countries with a more progressive tax system in the 2000s have experienced a smaller rise in wage inequality since the 1980s. The relationship is especially strong at the top of the wage distribution and weaker at the bottom: the correlation between progressivity and the change in L90-50 is very strong (−0.91), whereas the correlation with L50-10 is much weaker (only −0.27); see Figure 4. This result is consistent with the idea that the distortion created by progressivity is likely to be especially strong at the upper end, where human capital accumulation is an important source of wage inequality, but less so at the lower end, where other factors, such as unionization, minimum wage laws, and so on, could be more important.

Figure 4: Progressivity Wedge* (PW*(0.5, 2.5)) and Change in L90-50 (Left) and L50-10 (Right): 1980 to 2003. [Horizontal axis: progressivity wedge*; vertical axes: change in the log 90-50 (left) and log 50-10 (right) wage differential, 1980-2003. Scatter of US, UK, SWE, NET, GER, FRA, FIN with regression line; Corr = −0.91 (left), Corr = −0.27 (right).]

Finally, Table 2 gives a more complete picture of the differences between the two definitions of wedges. The top panel reports the correlation of each wedge measure with log wage differentials, which reveals that the adjustment for utilization rates through labor hours makes little difference in the correlations in 2003. Turning to the change in inequality over time (bottom panel), the simple wedge measure has a somewhat lower correlation with log wage differentials. However, adjusting for average hours per person strengthens these correlations significantly: to −0.66 for L90-10 and to −0.91 for L90-50 (plotted in the left panel of Figure 4).

We conclude that progressivity is strongly correlated with inequality both in the cross-section and over time, especially above the median of the distribution. Overall, these findings reveal a close relationship between progressivity and wage inequality, which motivates the focus of this paper. However, on their own, these correlations fall short of providing a quantitative assessment of the importance of the tax structure. For this purpose, we now take the model to the data.
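The progressivity wedge used throughout this section can be sketched in a few lines. The sketch assumes the wedge takes the form shown in the caption of Figure 2, PW(y1, y2) = 1 − (1 − τ(y2))/(1 − τ(y1)), with τ read as a marginal rate derived from an average-rate schedule of the form (15); both readings are assumptions here, since equations (6) and (8) are not reproduced in this section. The function names and parameter values are hypothetical and do not correspond to any country's estimated schedule.

```python
def marginal_rate(x, a0, a1, a2, phi):
    """Marginal tax rate implied by an average-rate schedule of the form (15).

    Tax liability is T(x) = x * (a0 + a1*x + a2*x**phi), so
    T'(x) = a0 + 2*a1*x + a2*(1 + phi)*x**phi.
    """
    return a0 + 2.0 * a1 * x + a2 * (1.0 + phi) * x ** phi

def progressivity_wedge(x_low, x_high, tau):
    """PW(x_low, x_high) = 1 - (1 - tau(x_high)) / (1 - tau(x_low)).

    `tau` is any callable mapping income (in multiples of AW) to a tax rate.
    """
    return 1.0 - (1.0 - tau(x_high)) / (1.0 - tau(x_low))

# Hypothetical schedules: one progressive, one flat at 25%.
progressive = lambda x: marginal_rate(x, a0=0.10, a1=-0.01, a2=0.15, phi=0.5)
flat = lambda x: 0.25

# Wedges faced by someone starting at 0.5 x AW and looking toward 0.5k x AW.
wedges_progressive = [progressivity_wedge(0.5, 0.5 * k, progressive) for k in range(2, 7)]
wedges_flat = [progressivity_wedge(0.5, 0.5 * k, flat) for k in range(2, 7)]

for k, w in zip(range(2, 7), wedges_progressive):
    print(f"PW(0.5, {0.5 * k:.1f}) = {w:.3f}")
```

Under a flat tax the wedge is zero at every income level, which is the benchmark against which statements such as "13% lower than in a flat-tax system" are made; under a progressive schedule the wedge is positive and grows with the target income.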

Table 2: Correlation Between Progressivity Measures and Wage Dispersion

Measure of wedge:            PW(0.5, 2.5)    PW*(0.5, 2.5)

Log wage differentials, year 2003
  90-10                      −.82            −.82
  90-50                      −.84            −.67
  50-10                      −.70            −.91

Change from 1980 to 2003
  90-10                      −.35            −.66
  90-50                      −.58            −.91
  50-10                       .13            −.27

4 Parameter Choices

We now discuss the parameter choices for the model. We focus on male workers so as to avoid potential selection issues across countries related to different labor market participation rates for female workers. Our basic calibration strategy is to take the United States as a benchmark and pin down a number of parameter values by matching certain targets in the US data.15 We then assume that other countries share the same parameter values with the US along unobservable dimensions (such as the distribution of learning ability), but differ in the dimensions of their labor market policies that are feasible to model and calibrate (specifically, consumption and labor income tax schedules and the retirement pension system). We then examine the differences in economic outcomes, specifically in wage dispersion and labor supply, that are generated by these policy differences alone.

A model period corresponds to one year of calendar time. Individuals enter the economy at age 20 and retire at 65 (S = 45). Retirement lasts for 20 years and everybody dies at age 85. The net interest rate, r, is set equal to 2%, and the subjective time discount factor is set to β = 1/(1+r). The curvature of the human capital accumulation function, α, is set equal to 0.80, broadly consistent with the existing empirical evidence (see Browning et al. (1999, Table 2.3)). In Appendix D, we conduct sensitivity analyses with respect to α and consider cross-country variation in retirement age S.

15 Taking the US as the benchmark is motivated by the fact that its economy is subject to far fewer of the labor market rigidities present in the CEU, such as unionization and other distorting institutions. Because these institutions are not modeled in this paper, the US provides a better laboratory for determining the unobservable parameters than other countries where these distortions could be more important for wage determination.
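The timing conventions and preset values just described can be collected in a small configuration object, which makes the mapping between calendar ages and model periods explicit. This is only an illustrative sketch: the class name `BaselineTiming` and its field names are hypothetical, and only the numerical values come from the text above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BaselineTiming:
    """Preset (not calibrated) values from the start of Section 4; names are illustrative."""
    entry_age: int = 20         # age at which individuals enter the economy
    retirement_age: int = 65    # age at which working life ends
    retirement_years: int = 20  # duration of retirement, T - S
    r: float = 0.02             # net interest rate
    alpha: float = 0.80         # curvature of the human capital function

    @property
    def S(self) -> int:
        """Number of model periods (years) spent in the labor market."""
        return self.retirement_age - self.entry_age

    @property
    def T(self) -> int:
        """Total number of model periods, working life plus retirement."""
        return self.S + self.retirement_years

    @property
    def beta(self) -> float:
        """Discount factor implied by beta = 1 / (1 + r)."""
        return 1.0 / (1.0 + self.r)

    @property
    def death_age(self) -> int:
        """Calendar age at the end of the last model period."""
        return self.entry_age + self.T

params = BaselineTiming()
print(params.S, params.T, params.death_age, round(params.beta, 4))
```

Keeping S and T as derived quantities means that the sensitivity exercise with a different retirement age only requires changing `retirement_age`.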

Utility Function. Preferences over consumption, c, and leisure time, 1 − n, are given by this common separable form:

\[
u(c,n) = \log(c) + \psi\,\frac{(1-n)^{1-\phi}}{1-\phi}. \tag{16}
\]

This specification yields two parameters to calibrate: the curvature of leisure, ϕ, and the utility weight attached to leisure, ψ. These parameters are jointly chosen to pin down the average hours worked in the economy, as well as the average Frisch labor supply elasticity. In 2003, the average annual hours worked by American males were 1,890, or approximately 5.2 hours per day (Heathcote et al. (2010, figure 2)). Taking the discretionary time endowment of an individual to be 13 hours per day, we get n = 5.2/13 = 0.4.16 With power utility, the theoretical Frisch elasticity of labor supply is given by (1 − n)/(nϕ). Because in this model, labor supply, n, varies across individuals, there is a distribution of Frisch elasticities. We simply target the Frisch elasticity implied by the average labor hours, n. The empirical target we choose is 0.3, which is consistent with the estimates for male workers surveyed by Browning et al. (1999), which range from zero to 0.5.17 At n = 0.4, this target implies ϕ = (1 − 0.4)/(0.4 × 0.3) = 5. As will become clear later, a higher Frisch elasticity improves the performance of our model, so in our baseline case we choose the relatively conservative value of 0.3.

Distributions: Learning Ability, Initial Human Capital, and Shocks. Agents have two individual-specific attributes at the time they enter the economy: learning ability and initial human capital endowment. We assume that these two variables are jointly uniformly distributed in the population and are perfectly correlated with each other.18 Although the assumption of perfect correlation is made partly for simplicity, a strong positive correlation is plausible and can be motivated as follows. The present model is interpreted as applying to human capital accumulation after age 20 and, by that age, high-ability individuals will have invested more than those with low ability, leading to heterogeneity in human capital stocks at that age, which would then be very highly correlated with learning ability.

16 Most countries require a minimum number of days of work (or income earned) to qualify for pension benefits, which is captured with $n_{\min}$ in (13). We set $n_{\min}$ = 0.10, which does not bind for any country.

17 Although it is common to use higher elasticity values in representative agent macro studies (e.g., Prescott (2004) among many others), values of 0.5 or lower are more common in quantitative models with heterogeneous agents (cf. Heathcote et al. (2008), and Erosa et al. (2009)).

18 We prefer the uniform distribution over a Gaussian distribution because it has a bounded support, so initial human capital and ability can be easily ensured to be non-negative. Another choice would be a log normal distribution, but most empirical measures of ability find it more closely approximated by a symmetric distribution, unlike a log normal one. It will turn out, however, that the wage distribution generated by the model will be closer to log normal with a longer right tail (more consistent with the data), as a result of the convexity arising from the human capital production function.

Indeed,
Huggett et al. (forthcoming) estimate the parameters of the standard Ben-Porath model from individual-level wage data and find learning ability and human capital at age 20 to be strongly positively correlated (corr: 0.792). Making the slightly stronger assumption of perfect correlation allows us to collapse the two-dimensional heterogeneity in $A^j$ and $h_0^j$ into one, speeding up computation significantly.

Therefore, this jointly uniform distribution of $(A^j, h_0^j)$ yields four parameters to be calibrated. $E[h_0^j]$ is a scaling parameter and is simply set to a computationally convenient value, leaving three parameters: (i) the cross-sectional standard deviation of initial human capital, $\sigma(h_0^j)$, (ii) the mean learning ability, $E[A^j]$, and (iii) the dispersion of ability, $\sigma(A^j)$. The idiosyncratic shock process, $\epsilon$, is assumed to follow a first-order Markov process, with two possible values, {1 − γ, 1 + γ}, and a symmetric transition matrix with $\Pr(\epsilon' = x \mid \epsilon = x) = p$. This structure yields two more parameters, γ and p, to be calibrated, for a total of five parameters. The sixth and last parameter is χ (maximum investment allowed on the job). Finally, because there is measurement error in individual-level wage data, we add a zero mean i.i.d. disturbance to the wages generated by the model (which has no effect on individuals' optimal choices).

Table 3: Baseline Parametrization

Parameter                   Description                                     Value
ϕ                           Curvature of utility of leisure                 5.0 (Frisch = 0.3)
ψ                           Weight on utility of leisure                    0.20
α                           Curvature of human capital function             0.80
S                           Years spent in the labor market                 45
T − S                       Retirement duration (years)                     20
r                           Interest rate                                   0.02
β                           Time discount factor                            1/(1+r)
δ                           Depreciation rate of skills (annual)            1.5%
$E[h_0^j]$                  Average initial human capital (scaling)         4.95

Parameters calibrated to match data targets
$E[A^j]$                    Average ability                                 0.195
$\sigma(h_0^j)/E[h_0^j]$    Coeff. of variation of initial human capital    0.076
$\sigma[A^j]/E[A^j]$        Coeff. of variation of ability                  0.396
γ                           Dispersion of Markov shock                      0.23
p                           Transition probability for Markov shock         0.90
χ                           Maximum investment time on the job              0.50

Data Targets. Our calibration strategy is to require that the wages generated by the model be consistent with micro-econometric evidence on the dynamics of wages found
in panel data on US households. Specifically, these empirical studies begin by writing a stochastic process for log wages (or earnings) of the following general form:

\[
\log \tilde w_s^j = \underbrace{\left[a^j + b^j s\right]}_{\text{systematic comp.}} + \underbrace{z_s^j + \varepsilon_s^j}_{\text{stochastic comp.}}, \tag{17}
\]
\[
z_s^j = \rho z_{s-1}^j + \eta_s^j,
\]

where $\tilde w_s^j$ is the "wage residual" obtained by regressing raw wages on a polynomial in age; the terms in brackets, $[a^j + b^j s]$, capture the individual-specific systematic (or life cycle) component of wages that result from differential human capital investments undertaken by individuals with different ability levels, and $z_s^j$ is an AR(1) process with innovation $\eta_s^j$. Finally, $\varepsilon_s^j$ is an iid shock that could capture classical measurement error that is pervasive in micro data and/or purely transitory movements in wages. For concreteness, in the discussion that follows, we refer to the first two terms in brackets as the "systematic component" of wages and to the latter two terms as the "stochastic component."

We begin with $\varepsilon_s$ and assume that it corresponds to the measurement error in the wage data. This is consistent with the finding in Guvenen and Smith (2009) that the majority of transitory variation in wages is due to measurement error. Based on the results of the validation studies from the US wage data,19 we take the variance of the measurement error to be 10% of the true cross-sectional variance of wages in each country, which yields $\sigma_\varepsilon^2 = 0.034$ for the United States.

We then choose the following six moments from the US data to pin down the six parameters identified earlier:

1. the mean log wage growth over the life cycle (informative about $E(A^j)$),
2. the ratio of minimum to mean wage (informative about χ),
3. the cross-sectional dispersion of wage growth rates, $\sigma(b^j)$ (informative about $\sigma(A^j)$),
4. the cross-sectional variance of the stochastic component (informative about γ),
5. the average of the first three autocorrelation coefficients of the stochastic component of wages (informative about p), and
6. L90-10 in the population (which, together with the previous moments, is informative about $\sigma(h_0^j)$).

19 For an excellent survey of the available validation studies and other evidence on measurement error in wage and earnings data, see Bound et al. (2001).

The target value for the mean log wage growth over the life cycle (i.e., the cumulative growth between ages 20 and 55) is 45%. This number is roughly the middle point of the
figures found in studies that estimate lifecycle wage and income profiles from panel data sets, such as the Panel Study of Income Dynamics (PSID); see, for example, Gourinchas and Parker (2002) and Guvenen (2007). The second data moment is the legal minimum wage in the economy relative to the average wage of full-time workers, which, according to the OECD,20 was 0.29 for the US in the early 2000s. The third moment is the cross-sectional standard deviation of wage growth rates, $\sigma(b^j)$. The estimates of this parameter are quite consistent across different papers, regardless of whether one uses wages or earnings. We take our empirical target to be 2%, which represents an average of these available estimates (Baker (1997), Haider (2001), and Guvenen (2009)).

The next two moments capture key statistical properties of the stochastic component of wages in the data. These moments are (i) the unconditional variance of the stochastic component, $(z_s + \varepsilon_s)$, as well as (ii) the average of its first three autocorrelation coefficients. The empirical counterparts for these moments are taken from Haider (2001), the only study that estimates a process for hourly wages and allows for heterogeneous profiles. The figure for the unconditional variance can be calculated to be 0.109 and the average of autocorrelations is calculated to be 0.33, using the estimates in Table 1 of Haider's paper. Further details and justifications for these parameter choices are in Appendix C.21

Our sixth, and final, moment is L90-10 in 2003. Adding this moment ensures that the calibrated model is consistent with the overall wage inequality in the US in that year, which is the benchmark against which we measure all other countries. The empirical target value is 1.60 (from the OECD's Labour Force Survey).

Table 4 displays the empirical values of the six moments, as well as their counterparts generated by the calibrated model. As can be seen here, all moments are matched fairly well. One point to note is that the average of the first three autocorrelation coefficients is fairly low (0.33) because the stochastic component also includes measurement error, which is iid. The Markov shocks themselves have a first order annual autocorrelation of 0.80 (equal to 2p − 1 for a symmetric two-state chain, with p = 0.90 as shown in Table 3).

20 http://stats.oecd.org/Index.aspx?DataSetCode=RHMW

21 Our calibration produces wage dynamics that are also consistent with what some authors have called a RIP process. Basically, if we fit an AR(1) process plus an i.i.d. shock to the wage process generated by the model, we find a persistence parameter of 0.937, an innovation standard deviation of 19%, and an i.i.d. shock standard deviation of 18%. These are in line with recent estimates in the literature (see, e.g., Storesletten et al. (2004a)).

Benefits System and the Government Budget. A great deal of variation can be found across countries in the parameters that control the generosity, the duration, and the
insurance component of the benefits system.22 We provide the exact formulas for each country in Appendix B.4. Turning to the government budget, the calibration of G (the surplus wasted by the government) is challenging because of the difficulty of obtaining reliable estimates of its magnitude. In the baseline case, we assume G = 0. So, the government returns all the surplus to households in a lump-sum fashion (Tr). Relaxing this assumption and allowing for G > 0 has very little effect on the results (Appendix D).23

Consumption Taxes. The average tax rate on consumption is taken from McDaniel (2007), who provides estimates for 15 OECD countries for the period 1950 to 2003 by calculating the total tax revenue raised from different types of consumption expenditures and dividing this number by the total amount of corresponding expenditure. McDaniel (2007) does not provide an estimate for Denmark, so we set this country's consumption tax equal to that of Finland, which has a comparable value-added tax (VAT) rate.

Table 4: Empirical Moments Used for Calibrating Model Parameters

Moment                                                                    Data     Model
Mean log wage growth from age 20 to 55                                    0.45     0.44
Ratio of minimum to mean wage rate                                        0.29     0.30
Cross-sectional standard deviation of wage growth rates                   2.00%    2.03%
Cross-sectional variance of stochastic component                          0.109    0.106
Average of first three autocorrelation coeff. of stochastic component     0.33     0.34
L90-10 in 2003                                                            1.60     1.60

22 For example, retirees in Denmark and the Netherlands receive the largest pension payments, with the present value of average retirement wealth exceeding half a million US dollars (as of 2007). The US and the UK, however, have the lowest pension entitlements: less than six times the average annual earnings in each respective country (and less than half the wealth in Denmark and the Netherlands).

23 In the working paper version (Guvenen et al. (2009)), we also modeled an unemployment insurance system that mimics each country's actual system in place. It turned out that this additional feature made little difference (which can be seen by comparing the results in that draft to those reported below), but it came at significant cost to the exposition of the model. Thus, we decided to omit it in this version.

5 Quantitative Results

In this section, we begin by presenting the implications of the calibrated model for wage inequality differences across countries at a point in time. We then provide decompositions that quantify the separate effects of progressivity, average income tax rates, consumption taxes, and the pension system on these results. We next turn to the change in inequality over time and provide a comparison between the United States and Germany from 1983 to
2003. The model statistics below are computed from 10,000 simulated lifecycle paths for individuals drawn from the joint probability distribution of $(A^j, h_0^j)$.

5.1 Cross-Sectional Results: the 2000s

Figure 5 plots L90-10 for each country in the data against the value predicted by the calibrated model. The correlation between the simulated and actual data is 0.91 (and the countries line up nicely along the regression line), suggesting that the model is able to capture the relative ranking of these eight countries in terms of overall wage inequality observed in the data. To explore how the model fares at different parts of the wage distribution, the middle panel of Figure 5 repeats the same exercise for L90-50 and the bottom panel does the same for L50-10. In both cases, the model-data correlations are high: 0.85.

Figure 5: Wage Dispersion: Model versus Data. [Panels: (a) L90-10, Corr(Model, Data) = 0.91; (b) L90-50, Corr(Model, Data) = 0.85; (c) L50-10, Corr(Model, Data) = 0.85. Horizontal axis: model; vertical axis: data. Each panel shows US, UK, FRA, GER, NET, DEN, FIN, SWE with a regression line.]

In Table 5, we quantify the importance of taxes for cross-country differences in inequality. Columns (a) and (b) report L90-10 in the data for all countries, first in levels (column (a)) and then expressed as a deviation from the US, which is our benchmark country (column (b)). For example, in Denmark L90-10 is 0.97, which is 0.63 (i.e., 63 log points) lower than that in the US. Columns (c) and (d) display the corresponding statistics implied by the calibrated model. Again, for Denmark, the model generates an L90-10 that is 0.38 below what is implied by the model for the US. Therefore, the model accounts for 60% (= 38/63) of the difference in L90-10 between the US and Denmark, reported in column (e). Similar comparisons show that the model does quite well in explaining the level of wage inequality in Germany but poorly in explaining the UK. Within the CEU, the fraction explained by the model ranges from 35% for France to 60% for Denmark. Overall, the model accounts for 48% of the actual gap in inequality between the US and the CEU in 2003.

To see which part of the wage distribution is better captured by the model, the next
two columns display the same calculation performed in column (e), but now separately for L90-50 (f) and L50-10 (g). For all countries in the CEU, the model explains the upper tail inequality much better than the lower tail inequality. For example, for Denmark, the model explains 97% of L90-50 versus only 31% of L50-10. In fact, the model accounts for at least 65% of L90-50 for all countries in the CEU, averaging 84% across all countries, whereas it accounts for on average only 24% of L50-10.24 That our model does a better job at explaining inequality at the upper end (above the median) will be a recurring theme of this paper. This finding is consistent with the idea that progressive taxation affects the human capital investment of high-ability individuals more than others and, therefore, the mechanism is more effective above the median of the wage distribution. Finally, a notable exception to these generally strong findings is the UK, which is an important outlier: the model explains very little of the difference between the UK and US at the upper tail (6% to be exact) and only slightly more (13%) at the lower end.

Table 5: Measures of Wage Inequality: Benchmark Model versus Data

                                    L90-10                                  L90-50        L50-10
               Data                 Model                % explained        % explain.    % explain.
               Level   ∆ from US    Level   ∆ from US    (d)/(b)
               (a)     (b)          (c)     (d)          (e)                (f)           (g)
Denmark        0.97    0.63         1.22    0.38         0.60               0.97          0.31
Finland        0.94    0.66         1.27    0.33         0.49               0.78          0.25
France         1.14    0.46         1.44    0.16         0.35               1.23          0.12
Germany        1.06    0.54         1.29    0.30         0.56               0.90          0.28
Netherlands    1.05    0.55         1.36    0.24         0.43               0.65          0.23
Sweden         0.87    0.73         1.28    0.31         0.43               0.75          0.26
CEU            1.00    0.59         1.31    0.29         48%                84%           24%
UK             1.28    0.32         1.56    0.03         0.10               0.06          0.13
US             1.60    0.00         1.60    0.00

24 The model does poorly in explaining the small L50-10 in France (12%). One reason could be the legal minimum wage (not modeled here), which is equal to 62% of average earnings in France, the highest among the CEU and much higher than the 36% of average earnings in the U.S. If these differences were modeled, it might be possible to reconcile the model better with the very small lower tail wage inequality in France.

Decomposing the Effects of Different Policies. The baseline model incorporates several differences between the labor market policies of the US and those of the CEU countries. Here, we quantify the separate roles played by each of these components for the results presented in the previous section. We conduct three decompositions. First, we assume that countries in the CEU have the same retirement pension system as the US but
differ in all other dimensions considered in the baseline model. This experiment separates the role of the tax system for wage inequality from that of the pension system. Second, we also set the consumption taxes of each country equal to that in the US, but each country retains its own income tax schedule as in the baseline model. This experiment quantifies the explanatory power of the model that is coming from the income tax system alone. Third, we go one step further and assume that each country keeps the same progressivity of its income tax schedule but is identical in all other ways to the US, including the average income tax rate. This experiment isolates the role of progressivity alone. In each case, we adjust the lump-sum transfers to balance the government's budget. Table 6 reports the results. First, in column (2), we assume that all countries have the same pension system as the US. In panel A, the correlation between the data and model is close to that in the baseline case for all parts of the wage distribution. Turning to panel B, the fraction of the US-CEU difference explained by the model goes down, but only slightly, indicating that about 95% of the model's explanatory power is coming from taxes (both income and consumption taxes). Next, in column (3), we also eliminate the differences in consumption taxes across countries. The model-data correlations go further down but, again, somewhat modestly. In panel B, the explanatory power of the model that is attributable to income taxes alone ranges from 75% to 80% for the three measures of wage inequality. The difference between columns (2) and (3) provides a useful measure of the role of consumption taxes, which account for about 17% (= 96% − 79%) of the model's explanatory power for L90-10. Next, we investigate whether the power of income taxes comes from differences in the average rates across countries or from differences in the progressivity structure. 
In other words, if continental Europe differed from the US only in the progressivity of its labor income tax system, but had the same average tax rate on labor income, how much of the differences in wage inequality found in the baseline model would still remain? To answer this question, we proceed as follows. First, adjusting the average tax rate to the US level without affecting progressivity requires some care; we show in Appendix B.2 how this can be accomplished. Then, using these hypothetical tax schedules, we solve each country’s problem, assuming that all countries have identical labor market policies (set to the US benchmark) and that their tax schedules generate the same average tax rate as in the US when evaluated at the choices individuals make under the US income tax schedule. In panel B of column 4, we see that progressivity alone is responsible for 2/3 of the explanatory power of the model for L90-10. Notice that the decomposition we conducted here is not invariant to the order in which different features are eliminated. So, a valid question is whether this conclusion—that

average tax rate differences do not matter much—is robust to changing this order. To investigate this, we repeated the last experiment reported in column 4, but instead of eliminating average tax rate differences and keeping progressivity intact, we flipped the order (same progressivity as the US, but match each country’s average tax rate). In this case, the model only accounts for 14% of L90-10 differences, 20% of L90-50, and 10% of L50-10. This experiment confirms our previous conclusion that average tax rate differences are responsible for only a small fraction of the differences in wage inequality.

In summary, the pension system and consumption taxes together are responsible for about 20% of the model’s explanatory power. The more important finding concerns the role of progressivity, which, for all practical purposes, is the key component of the income tax structure for understanding wage inequality differences. Differences in the average income tax rate do not appear to be very important for inequality differences.

Table 6: Decomposing the Effects of Different Policies

                          Benchmark    All taxes      Lab. Inc. Tax    Progressivity
Diff. from Benchmark:        (1)          (2)             (3)              (4)
Progressivity                --           --              --               --
Average income taxes         --           --              --            set to US
Consumption tax              --           --           set to US        set to US
Benefits institutions        --        set to US       set to US        set to US

A. Correlation Between Data and Model
90-10                       0.91         0.90            0.85             0.88
90-50                       0.85         0.87            0.85             0.87
50-10                       0.85         0.84            0.78             0.81

B. Fraction of US-CEU Difference Explained by Model
90-10                       0.48       0.46 (96%)a     0.38 (79%)       0.32 (67%)
90-50                       0.84       0.79 (94%)      0.67 (80%)       0.55 (66%)
50-10                       0.24       0.23 (96%)      0.18 (75%)       0.16 (67%)

a The numbers in parentheses express the fraction explained by the model in each column as a percentage of the benchmark case reported in column (1).

The Role of Labor Supply Elasticity.
We now conduct two sensitivity analyses with respect to the value of labor supply elasticity: we consider (i) the case with a high Frisch elasticity of 0.5 and (ii) the case with only an extensive margin: n ∈ {0,0.40}. In each case, the model is recalibrated to match the same six targets in Table 4. (Appendix D contains further sensitivity analyses with respect to the values of α, δ, χ, G, as well as the treatment of capital income taxes.)

Table 7: Effect of Labor Supply Elasticity on Wage Inequality Differences

                     Frisch = 0.5              Discrete hours: n ∈ {0,0.40}
               L90-10   L90-50   L50-10        L90-10   L90-50   L50-10
                (a)      (b)      (c)           (d)      (e)      (f)
Denmark         0.69     1.07     0.40          0.34     0.53     0.21
Finland         0.57     0.88     0.31          0.29     0.43     0.17
France          0.39     1.32     0.16          0.17     0.56     0.07
Germany         0.68     1.01     0.40          0.29     0.42     0.17
Netherlands     0.48     0.70     0.27          0.27     0.38     0.17
Sweden          0.52     0.87     0.33          0.22     0.38     0.15
CEU             57%      94%      31%           26%      44%      16%
UK              13       6        17            2        -3       6

In the first experiment we set ϕ = 3.0, which implies a Frisch elasticity of 0.5. Table 7 reports the counterpart of the analysis we conducted for the benchmark model and reported in Table 5. Comparing the two tables makes it clear that a higher Frisch elasticity improves the model’s explanatory power across the board. Now the model can explain 57% of the US-CEU difference in L90-10 (compared with 48% in the benchmark case) and 94% of the upper tail inequality (from 84% before). However, the improvement in L50-10 is modest, going from 24% in the benchmark case up to 31%.

To better understand the role of the intensive margin of labor supply, we now examine another case where workers can only choose between full-time employment at fixed hours (n = 0.40) and nonemployment. The parameters of the utility function are the same as in the baseline case. The results are reported in the last three columns of Table 7. Without the amplification provided by an intensive margin (and the resulting dispersion in hours across countries), the explanatory power of the model falls and, in some cases, it falls significantly. For example, the model accounts for 26% of the difference in L90-10, down from 48% in the baseline. For the upper-end inequality, the difference is even larger: the model now explains 44%, about half of the baseline value of 84%, and also much lower than the 94% in the high Frisch case. Finally, the already low explanatory power at the lower tail falls further from 24% in the baseline case to 16%.
These findings underscore the importance of the interaction of endogenous labor supply choice (with an intensive margin) with progressive taxation for understanding wage inequality differences across countries, especially above the median of the distribution.
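The comparison across the three labor supply specifications can be summarized with a few lines of arithmetic. The sketch below uses only the percentages quoted in the text (benchmark) and in the CEU row of Table 7; the dictionary layout and function names are illustrative and are not part of the model code.

```python
# Share of the US-CEU wage inequality gap explained by the model (percent),
# as quoted in the text for the benchmark and in the CEU row of Table 7.
explained = {
    "benchmark": {"L90-10": 48, "L90-50": 84, "L50-10": 24},
    "frisch_0.5": {"L90-10": 57, "L90-50": 94, "L50-10": 31},
    "extensive_only": {"L90-10": 26, "L90-50": 44, "L50-10": 16},
}


def change_vs_benchmark(case):
    """Percentage-point change in explanatory power relative to the benchmark."""
    base = explained["benchmark"]
    return {m: explained[case][m] - base[m] for m in base}


def ratio_to_benchmark(case):
    """Explanatory power as a fraction of the benchmark value (2 decimals)."""
    base = explained["benchmark"]
    return {m: round(explained[case][m] / base[m], 2) for m in base}


frisch_change = change_vs_benchmark("frisch_0.5")
extensive_change = change_vs_benchmark("extensive_only")
extensive_ratio = ratio_to_benchmark("extensive_only")

print("Frisch = 0.5, change vs benchmark:", frisch_change)
print("Extensive margin only, change vs benchmark:", extensive_change)
print("Extensive margin only, ratio to benchmark:", extensive_ratio)
```

Removing the intensive margin lowers the explained share most at the upper tail, in both absolute and relative terms, which is the pattern emphasized above.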

5.2 Inequality Trends over Time: 1983–2003

We now turn from levels in 2003 to the change in wage inequality over time. As shown in Table 1, from the early 1980s to the early 2000s, wage inequality increased significantly more in the United States (by 32 log points) compared with the CEU (6 log points). Can the human capital mechanisms studied so far help us understand this “widening” of the inequality gap as well? One challenge we face in trying to answer this question is that the tax schedules we derived above are only available for the years after 2001, whereas the tax structure has changed over time for several of the countries in our sample. Fortunately, for two countries in our sample (the US and Germany) we are also able to derive tax schedules for 1983, which allows us to conduct a two-country comparison in this section.

How to Introduce SBTC? As noted earlier, in the standard Ben-Porath model studied so far, the price of human capital ($P_H$) was simply a scaling factor and had no effect on any implication of the model, which is why we normalized it to 1 above. This is an important shortcoming when the goal is to study the changes in human capital investment over time in response to changes in the value of human capital, due to, for example, SBTC. Guvenen and Kuruscu (2010) proposed a tractable way to extend the Ben-Porath model that overcomes this difficulty. This extension basically involves introducing a second factor of production, raw labor ($\ell$), in addition to human capital, $h$. The key assumption is that, unlike human capital, raw labor cannot be accumulated over the life cycle (it is fixed). Individuals supply both factors of production for a total hourly wage of $(P_H h_s + P_L \ell)(1 - i_s)$ at age $s$, where $P_L$ is now the price (wage) of raw labor. With this two-factor structure, a rise in $P_H$ does increase human capital investment. So SBTC could be modeled as a rise in $P_H$ over time with $P_L$ fixed.
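A small numerical sketch may help fix ideas about the two-factor wage. It holds human capital and investment time fixed, which the model itself does not do, and all names and numbers are illustrative assumptions rather than calibrated values. It shows that the price of human capital is a pure scaling factor for relative wages when there is no raw labor, but not once raw labor is present.

```python
import math


def hourly_wage(p_h, h, p_l, ell, invest_share):
    """Two-factor hourly wage (P_H * h + P_L * ell) * (1 - i); hypothetical helper."""
    return (p_h * h + p_l * ell) * (1.0 - invest_share)


def log_wage_gap(p_h, p_l, ell, h_high=2.0, h_low=1.0, invest_share=0.1):
    """Log wage gap between a high- and a low-human-capital worker (illustrative values)."""
    w_high = hourly_wage(p_h, h_high, p_l, ell, invest_share)
    w_low = hourly_wage(p_h, h_low, p_l, ell, invest_share)
    return math.log(w_high / w_low)


# One-factor case (ell = 0): doubling P_H leaves the log gap unchanged.
gap_one_factor_before = log_wage_gap(p_h=1.0, p_l=1.0, ell=0.0)
gap_one_factor_after = log_wage_gap(p_h=2.0, p_l=1.0, ell=0.0)

# Two-factor case (ell > 0): a higher P_H with P_L fixed widens the gap.
gap_two_factor_before = log_wage_gap(p_h=1.0, p_l=1.0, ell=1.0)
gap_two_factor_after = log_wage_gap(p_h=2.0, p_l=1.0, ell=1.0)

print(gap_one_factor_before, gap_one_factor_after)
print(gap_two_factor_before, gap_two_factor_after)
```

In the full model the effect is reinforced because individuals also respond by investing more, which this fixed-quantity sketch deliberately leaves out.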
The formal statement of this model along with the calibration of SBTC is presented in Appendix D.7. (All parameters other than $P_H$ remain essentially unchanged in calibration.)

Comparing the United States and Germany. The procedure for constructing the 1983 tax schedules is described in Appendix B.3 and the resulting progressivity wedges are shown in Figure 6. As seen here, in 1983 the progressivity of the tax structure was similar in the US and Germany up to about twice the average earnings level. Above this point, the US actually had the more progressive system. Over time, the US became much less progressive, whereas the change in Germany was more gradual, leaving the US tax schedule much flatter than that of Germany by the end of the period.

[Figure 6: Progressivity Wedges at Different Income Levels: US vs. Germany, 1983 and 2003. Horizontal axis: y/AW; series: US 1983, GER 1983, GER 2003, US 2003.]

Using these schedules, we conduct three experiments.25 In the first experiment, we assume that the tax schedules remained fixed throughout this period. We choose one parameter that controls the skill bias of technology, $P_H$, to match the 32 log points rise in L90-10 in the US during the period. Note from column (1) of Table 8 that, in the data, L90-10 rose by only 13 log points in Germany during the same period. Turning to the model and assuming that Germany has been subject to the same SBTC as the US, the model generates a rise of 19 log points in L90-10 for Germany. Thus, whereas the inequality gap widens in the data by 32 − 13 = 19 log points, the model predicts 32 − 19 = 13 log points, explaining 68% (13/19) of the observed difference in the data.

Second, in column (3), we consider the case where the only change over time is in the tax schedules. We do not recalibrate any parameter to match targets in 1983. In the US, L90-10 rises substantially (by 21 log points) with no SBTC. Hence, the flattening of the tax schedule alone accounts for a significant fraction (about 2/3) of the rise in US wage inequality during this time. To our knowledge, this result is new in the literature. In contrast to the US, wage inequality barely changes (by 1 log point) in Germany. This experiment suggests that the dramatic fall in progressivity in the US and the small change in Germany alone could explain almost all of the widening inequality gap!

Third, we now incorporate the change in tax schedules and re-calibrate SBTC such that we match the change in L90-10 for the US.26 Now, L90-10 rises by 9 log points in Germany. Thus, the model slightly over-explains the widening gap in the data, by 16% (= 0.22/0.19 − 1.0).

25 Because of the computational burden, these experiments only provide steady state comparisons. Although solving for the full transition path is beyond the scope of this paper, it could be important for the quantitative results, so future work on this issue is certainly warranted.

26 The required change in $\log(P_H/P_L)$ is 6.7 log points, which is about one-third of the value we used in the first experiment with fixed tax schedules.
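The gap arithmetic in the three experiments can be checked directly from the L90-10 entries of Table 8. The snippet below is a minimal sketch with illustrative variable names; for the third experiment it uses the gap of 0.22 reported in the table, which is not exactly the difference of the rounded US and German entries.

```python
# Changes in L90-10 between 1983 and 2003, in log points (Table 8, Panel A).
us_data, ger_data = 32, 13
ger_model_fixed_taxes = 19   # column (2): SBTC calibrated to the US, taxes fixed
us_model_taxes_only = 21     # column (3): tax schedules change, no SBTC

# Widening of the US-Germany gap in the data and in the fixed-tax experiment.
gap_data = us_data - ger_data                        # 19 log points
gap_model_fixed = us_data - ger_model_fixed_taxes    # 13 log points
share_explained_fixed = gap_model_fixed / gap_data   # about 0.68

# Share of the US rise generated by the change in tax schedules alone.
share_us_rise_from_taxes = us_model_taxes_only / us_data   # about 2/3

# Column (4): Table 8 reports a model gap of 22 log points against 19 in the data.
gap_model_both = 22
over_explained = gap_model_both / gap_data - 1.0     # about 0.16

print(f"fixed taxes: {share_explained_fixed:.0%} of the widening gap")
print(f"taxes only: {share_us_rise_from_taxes:.0%} of the US rise")
print(f"taxes and SBTC: over-explains by {over_explained:.0%}")
```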

Table 8: US vs Germany: Changing Tax Schedules and Changing Inequality

                  Data                          Model
                  (1)        (2)                (3)          (4)
Taxes:                       Fixed              Changing     Changing
SBTC:                        Calibrated to US   Fixed        Calibrated to US

Panel A: Change in L90-10
US                0.32       0.32a              0.21         0.32a
GER               0.13       0.19               0.01         0.09
∆(US-GER)         0.19       0.13               0.20         0.22

Panel B: Change in L90-50
US                0.22       0.23               0.15         0.23
GER               0.05       0.14               0.01         0.06
∆(US-GER)         0.17       0.09               0.14         0.17

Panel C: Change in L50-10
US                0.10       0.09               0.06         0.09
GER               0.07       0.05               0.00         0.03
∆(US-GER)         0.02       0.04               0.06         0.06

a SBTC ($P_H$) calibrated so that the model matches the rise in L90-10 for the US exactly.

Panels B and C of the table explore how much of the widening gap has occurred at the top and bottom of the distribution. In the data, the L90-50 gap between the US and Germany rose by 17 log points, whereas the L50-10 gap increased by only 2 log points. Therefore, a remarkable fact is that virtually all of the rise in the inequality gap occurred because top-end inequality increased much more in the US (by 0.22) than in Germany (by 0.05). This observation strongly indicates that to understand the widening inequality gap, one needs to understand the economic forces that operate above the median of the wage distribution, and the human capital channels studied here provide one important candidate. To quantify these human capital effects, we turn to column (4): the model generates the same 17 log points rise in the L90-50 gap as in the data, and overstates the L50-10 gap observed in the data by 4 log points.

While these results are encouraging, a caveat must be noted. Wage inequality in 1983 depends not only on the tax schedule in 1983, but also on the tax schedules that were in place several years prior, since the dispersion in human capital across individuals results from investments made in previous years. Clearly, the same comment applies to 2003. Although in our exercise we do not account for this fact, it is not clear which way this biases the results.
This is because the US tax system was even more progressive before the Economic Recovery Tax Act of 1981, whereas the progressivity change in the years

preceding 2003 (say, from 1990 to 2003) was more modest. Therefore, if we were to use a time average of tax schedules in our exercise (say, 1973 to 1983 and 1993 to 2003), we conjecture that the reduction in progressivity over time could be larger than we assumed in the experiment just described (which would attribute an even larger role to taxes). A more complete examination of this issue is an exciting topic for future research.

6 Microeconomic Evidence on the Mechanism

The model also makes predictions for how the lifecycle profile of wages and hours varies across countries. In particular, because progressivity dampens human capital investment, average wages should grow more slowly over the life cycle in the CEU. Similarly, because progressivity compresses the cross-sectional distribution of human capital investment, wage inequality should rise less over the life cycle in the CEU. Testing these two predictions requires panel data on wages (to disentangle the age profile from time or cohort effects), which is difficult to obtain on a comparable basis for the CEU countries in our sample.27 An exception is the German Socio-Economic Panel (GSOEP), which includes information on wages and hours of German individuals and is available to outside (non-European Union) researchers. In this section, we make use of this dataset and the PSID for the United States to provide a two-country comparison of lifecycle profiles.

[Figure 7: Lifecycle Profile of Mean Log Wages: US vs Germany. Horizontal axis: Age; vertical axis: Mean(log(Wage)); series: Germany (GSOEP), US (PSID).]

27 Although most of the countries in our sample have panel datasets on individuals, many of these datasets are either restricted to researchers that are citizens of that country or have documentation that is not translated into English, making it infeasible for us to use those datasets in our study. The German Socio-Economic Panel (GSOEP) is available to outside researchers upon the submission and approval of a research proposal.

6.1 Wages and Hours over the Lifecycle: US vs Germany

We focus on male workers who are between 25 and 55 years of age to minimize the effects of early retirement behavior and the consequent fall in employment rates at later ages. The PSID data cover 1968 to 1992 and the GSOEP data cover 1984 to 2007.

Wages. Figure 7 plots the lifecycle profile of mean log wages in the US and Germany. The profiles are extracted from panel data by cleaning cohort effects following the usual procedure in the literature; see Appendix E for details. As seen in the figure, from age 25 to 55 the average wage profile rises by 36 log points in the US, but by only 21 log points in Germany, consistent with the prediction of the model that a more progressive tax system generates a flatter average wage profile. Next, Figure 8 plots the lifecycle profile of wage inequality (again controlled for cohort effects) for the two countries. In the US, the variance of log wages rises by 26 log points, compared to 15 log points for Germany. Again, inequality rises more over the lifecycle in the less progressive country, consistent with the mechanism in the model.

Although in Figure 8 we normalized the intercept to zero (to help visual comparison), a relevant question is how much wage inequality there is at the time workers enter the labor market. To answer this question, we compute the variance of log wages for workers between ages 23 and 27 and find it to be very similar in both countries: 0.251 in the US and 0.260 in Germany.28 This implies that virtually all the difference in wage inequality between Germany and the United States documented in the previous section is generated by the faster rise of inequality over the lifecycle in the US compared to Germany and almost none is due to differences in initial inequality. (Incidentally, this finding is also reassuring, given that our model assumes identical inequality at age 20.)
Finally, instead of controlling for cohort effects as we did above, one can alternatively control for time effects. Using this approach, mean log wages rise by 0.37 in the US compared with 0.27 in Germany. Inequality rises by 0.12 in the US compared with only 0.02 in Germany. Thus, while the magnitudes change, the rankings of the two countries remain the same under this alternative approach.29

28 For this computation, we use data from 1984 to 1992, which is the period the two datasets overlap.

29 The model counterparts of these numbers are also of interest. In the model, the rise in the mean log wages (from age 25 to 55) in the US exceeds the same statistic in Germany by 0.16, compared with the 0.15 (= 0.36 − 0.21) figure in the data when cohort effects are controlled for and 0.10 (= 0.37 − 0.27) when time effects are controlled for. Similarly, in the model, the rise of wage inequality in the US exceeds that in Germany by 0.16, compared with 0.11 and 0.10 in the data when cohort and time effects are controlled for, respectively. By and large, the model is consistent with the signs and rough magnitudes of the differences seen in the data.
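The data-side gaps quoted in footnote 29 follow directly from the lifecycle rises reported in this section. The short check below recomputes them; the dictionary layout and names are illustrative, and the model-side figure of 0.16 is taken from the footnote rather than computed.

```python
# Rise from age 25 to 55 as reported in the text, under each way of
# controlling for effects ("cohort" or "time").
mean_wage_rise = {
    "cohort": {"US": 0.36, "GER": 0.21},
    "time": {"US": 0.37, "GER": 0.27},
}
variance_rise = {
    "cohort": {"US": 0.26, "GER": 0.15},
    "time": {"US": 0.12, "GER": 0.02},
}


def us_minus_germany(rises):
    """US-Germany difference in the lifecycle rise, rounded to 2 decimals."""
    return {k: round(v["US"] - v["GER"], 2) for k, v in rises.items()}


mean_gap = us_minus_germany(mean_wage_rise)
variance_gap = us_minus_germany(variance_rise)

print("mean log wage gap:", mean_gap)
print("variance of log wage gap:", variance_gap)
```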

[Figure 8: Within-Cohort Variance of Log Wages: US vs Germany. Horizontal axis: Age; vertical axis: Var(log(Wage)); series: US (PSID), Germany (GSOEP).]

A complementary piece of evidence is presented in Domeij and Floden (2010) from Sweden. These authors construct the analog of Figure 8 for Sweden and find that the rise in wage inequality over the life cycle is much smaller than in both the US and Germany.30 Given the high progressivity of income taxes in Sweden compared with the US and Germany, this outcome is exactly what is predicted by the present model.

Labor Hours. We begin with the dispersion in hours. In Germany (GSOEP), the standard deviation of log hours is 0.369 compared with 0.324 in the United States (PSID).31 It is a well-known fact that incomplete markets models without preference heterogeneity severely understate the level of hours inequality (cf. Erosa et al. (2009)) and our model is no exception. In the model, σ(log(n)) = 0.112 in the US and 0.128 in Germany.32 Despite missing on the levels, the model is consistent with the fact that hours inequality is somewhat higher in Germany than in the US.

At first blush, it may seem surprising that the model implies higher dispersion in the more progressive country. The reason has to do with lump-sum transfers, which happen to work in the opposite direction to progressivity in this two-country comparison. Specifically, the calibrated model implies that lump-sum transfers in Germany are more than twice as large as in the US. By their nature, these transfers create a larger wealth effect on low-income individuals (they are a larger fraction of their income) and, therefore, reduce their labor supply more than that of higher-income individuals. Thus, countries with higher lump-sum payments (or more redistributive government services), ceteris paribus, have higher hours inequality. To illustrate this point, we solve the model for Germany by fixing the lump-sum transfers to the same fraction as in the US and assume the rest of the budget surplus yields no utility. The implied standard deviation of log hours falls from 0.128 to 0.098, which is now lower than in the US. Therefore, the prediction of the model regarding hours inequality is in general ambiguous, being driven by both progressivity and the size of lump-sum transfers.

As for average hours, the prediction of the model is much clearer: countries with more progressive taxes should have lower average hours. Consistent with this prediction, it is well documented that Americans on average work much longer hours than Europeans (Prescott (2004), Ohanian et al. (2008)). Here we show that the same is true when we focus on male workers. For Germany, Wanger (2006, Table 3) reports that the average hours per (male) worker in 2003 was 1,557 hours. For the same year, Heathcote et al. (2010, figure 2) report that the average hours per (male) person in the US was 1,890 hours, or 21% higher than in Germany.33 Given that hours per worker must be higher than hours per person, this provides a lower bound on the gap between German and US males. This gap is even higher than what is predicted by the model (which is 12.3%). Overall, the lifecycle evidence on wages and hours documented in this section is in line with, and therefore provides further support to, the human capital mechanism that operates in our model.

30 In Sweden, from age 25 to 55, the variance of log wages rises by 0.08 when controlling for time effects and falls by 0.06 when controlling for cohort effects; see Domeij and Floden (2010, figs. 13 and 14).

31 These statistics are computed using data from 1984 to 1992, which is the period the datasets overlap.

32 The standard way to circumvent this problem is to introduce heterogeneity in work-leisure preferences, which is the route followed by, among others, Heathcote et al. (2007), Bils et al. (2009), and Kaplan (2010). Because hours inequality is not the main focus of this paper, we have not pursued this approach here.

6.2 Survey Measures of Human Capital Inequality

So far we have focused on the model’s implications for variables that are easily measured in the data, such as wages and hours.
However, the model also makes very clear predictions about how human capital dispersion should vary by country (or with the progressivity of the country’s tax system). We now test three such predictions in the data.

33 For the average hours statistics, we do not use the GSOEP and PSID because these data sets seem to overstate average hours. For example, Fuchs-Schündeln et al. (2010) document that average hours in GSOEP is about 300 hours higher than the NIPA counterpart (called IAB) in 1980 and this gap grows to more than 500 hours by year 2000. This gap seems to be largely attributable to the insufficient treatment in GSOEP of vacation and sick days and other factors that impact the number of work days per year. For consistency, and because similar issues are also relevant for the PSID, we do not use it either. Instead we rely on CPS for the US and IAB data for Germany.

To conduct this analysis, we need an empirical measure of human capital at the individual level for the countries in our sample. The data source we use is the International Adult Literacy Survey (IALS), which is a large-scale, international comparative assessment designed to measure a range of skills linked to the economic characteristics of the adult population (ages 16 to 65) within and across nations. The IALS has been extensively used as a measure of human capital of the working age population in the literature (see, among others, Nickell and Bell (1995); Devroye and Freeman (2000); Leuven et al. (2004) and the references therein). We use data from the 1998 survey (the latest available), which contains data from seven of the eight countries in our sample, the exception being France.

Table 9: Human Capital Dispersion

Cross-Country Correlation of Test Score Dispersion (Data) with:

Dispersion measure     Wage Dispersion (Data)     Human Capital Dispersion (Model)
L90-10                         0.88                           0.88
L90-50                         0.89                           0.78
L50-10                         0.77                           0.88

First, we investigate whether, in the data, higher wage dispersion in a given country is accompanied by larger human capital dispersion, as robustly predicted by our model. Column (1) of Table 9 reports the cross-country correlations between wage and human capital dispersions, the latter measured by the IALS quantitative literacy test score.34 Each correlation is computed using the same measure of dispersion for both variables (L90-10, L90-50, or L50-10). The correlations are strong regardless of the part of the distribution we focus on. Although not reported in the table, the test score dispersion also varies significantly across countries. For example, the country with by far the largest dispersion is the US, with a 90-10 percentile ratio of 2.26 (as measured by the quantitative score), followed by the UK with 1.83. At the other end lie the Scandinavian countries with a 90-10 percentile ratio of 1.45. (The prose and document literacy tests reveal even larger gaps.)

Second, we compare the human capital dispersion implied by the model to that found in the data across countries.
Column (2) of Table 9 reports the correlations between the human capital dispersion in the model and those measured by the IALS data. The correlation is robust, ranging from 0.78 to 0.88.

34 The IALS survey is composed of three tests: (i) quantitative literacy (measuring arithmetic and analytical skills used in typical work situations); (ii) prose literacy (the skills needed to understand and use information from texts, including editorials, news stories, poems, etc.); and (iii) document literacy (the skills required to locate and use information contained in various formats, including maps, tables, graphs, job applications, etc.). In Table 9 we report the results using the quantitative literacy test. We omit the other two measures for brevity because they give very similar results across the board.

Third, and as discussed earlier, our model predicts that countries with a more progressive tax system will have less dispersion in human capital

across individuals. Using P(0.5,2.5), the measure of wedge employed earlier, the correlation with the L90-10 measure of IALS human capital dispersion is -0.79. (Using other test results or alternative wedges (e.g., P(0.5,0.5k), k = 2,3,...,6) yields equally strong results.)

When these three empirical findings from survey data are put together with the evidence on the lifecycle profiles of wages from the US and Germany, they provide strong support to the human capital mechanism that is operational in our model.

7 Conclusions

In this paper, we have studied the effects of progressive labor income taxation on wage inequality when a major source of wage dispersion is differential rates of human capital accumulation. To understand the main mechanisms and their quantitative importance, we have examined differences in wage inequality between the United States and seven European countries, which differ significantly in their income tax structures as well as in other dimensions of their labor market institutions.

A common theme in our findings is that the model is significantly better at explaining inequality differences at the upper tail compared to the lower tail. Institutions, such as unionization, minimum wage laws (as in the case of France, discussed earlier), and centralized bargaining, are likely to be more important for the lower tail. However, since changes in the upper tail have been so important during this time (as we have documented), the mechanisms studied in this paper provide a promising direction for understanding US-CEU differences in wage inequality. We also found that the most important policy difference for wage inequality is the progressivity of the income tax system, which is responsible for about two-thirds of the model’s explanatory power.35

Finally, we turn to the changes in wage inequality over time.
In a two-country comparison, the model can account for all of the widening of the inequality gap between the US and Germany when the actual changes in the tax schedules are also incorporated.

35 In the working paper version (Guvenen et al. (2009)) we conducted the same analysis using data on all workers as opposed to male workers as we did here. The results of that analysis were remarkably similar to those found here. To us, this suggests that the mechanisms emphasized in this paper are likely to be as important for female workers as they are for males, despite large differences across countries in female labor force participation.

We have also explored the micro implications of the model, which provided further supporting evidence for the model. For example, the lifecycle profile of mean wages is flatter in Germany than in the United States, as implied by the higher progressivity in the former country. A similar result is found for within-cohort wage inequality in Germany and the US. Similarly, average hours for males are much lower in Germany than in the US.

These observations are consistent with the predictions of the model and provide further support to the empirical relevance of the human capital mechanisms explored in this paper.

An alternative mechanism that is also consistent with the US-Europe inequality gap was proposed by Becker (1985). In his framework, workers choose both hours of work in the market and effort per hour. High ability workers in the US put in more effort per hour (and are therefore more productive) than comparable workers in Europe because the return is relatively higher. Thus, wage inequality will be higher in the US than in Europe. An important difference between this mechanism and ours is that our model implies a widening of wage inequality over the life cycle in the US relative to Europe (as documented in Section 6.1), whereas Becker’s model implies that wage inequality would be constant over the lifecycle.

An alternative way of modeling skill acquisition would be through “learning by doing” (LBD), which differs from human capital models in some subtle ways. To understand this, notice that in an LBD model, human capital is acquired by working longer hours. The marginal cost of work is given by the marginal utility of leisure, which is independent of the current tax rate. The marginal benefit is the increase in utility due to higher after-tax earnings both in the current period (higher earnings from longer hours) and future periods (higher wages because of accumulated skills). So, for example, if current taxes are raised without affecting future taxes, this would increase human capital investment in Ben-Porath as we saw in Section 2.2 (because the cost of investment is the current after-tax wage, which is lower now). In contrast, in an LBD model, this will decrease current hours of work because part of the marginal benefit of work (current after-tax earnings) falls. But if there is less work, there is less skill acquisition in an LBD model.
This is one example where a change in taxes can increase investment in Ben-Porath while reducing it with learning by doing. However, note that this is a carefully selected example. There are many other cases where both models would have qualitatively the same implication (for example, if future taxes are raised without affecting current taxes).

Finally, we have made several assumptions to make the quantitative exercise computationally feasible.36 An important direction to extend the current framework would be by carefully modeling the differences between the US and the CEU in the financing of the education system as well as in the types of skills taught in schools in both places. This is a difficult but interesting question that is at the top of our future research agenda.

36 The numerical solution of the model requires care because the individuals’ dynamic problem has several sources of non-convexities. As a result, solving for the equilibrium takes about 14 hours for the US and UK, and as much as 30 hours for some countries like Denmark. This makes calibration very time consuming, which prevented us from extending the model in other directions.
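The Ben-Porath side of the tax comparison discussed above can be illustrated with the investment condition (equation (7)), whose right-hand side weights future hours by the ratio of future to current net-of-marginal-tax factors. The sketch below is only illustrative: the quadratic investment cost (so that C'(Q) = Q), the function name, and every number are assumptions made for this example, and hours are held fixed rather than chosen optimally.

```python
def investment_benefit(theta_h, beta, tau_now, tau_future, hours_future):
    """Right-hand side of the investment condition:
    theta_H * sum_k beta^k * (1 - tau_{s+k}) / (1 - tau_s) * n_{s+k}.
    Under the assumed cost C(Q) = Q^2 / 2 this equals the chosen investment Q."""
    total = 0.0
    for k, (tau_k, n_k) in enumerate(zip(tau_future, hours_future), start=1):
        total += beta**k * (1.0 - tau_k) / (1.0 - tau_now) * n_k
    return theta_h * total


# Hypothetical inputs: three remaining working periods at fixed hours.
theta_h, beta = 1.0, 0.96
hours = [0.4, 0.4, 0.4]

no_tax = investment_benefit(theta_h, beta, 0.0, [0.0, 0.0, 0.0], hours)
flat_tax = investment_benefit(theta_h, beta, 0.3, [0.3, 0.3, 0.3], hours)
# Marginal rates that rise as earnings grow (a progressive schedule).
progressive = investment_benefit(theta_h, beta, 0.3, [0.35, 0.40, 0.45], hours)
# Higher current marginal rate only, future rates unchanged.
current_hike = investment_benefit(theta_h, beta, 0.4, [0.3, 0.3, 0.3], hours)

print(no_tax, flat_tax, progressive, current_hike)
```

In this sketch a flat tax leaves investment unchanged, rising marginal rates lower it, and an increase in the current rate alone raises it, which is the Ben-Porath case described in the text.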

NOT FOR PUBLICATION SUPPLEMENTAL APPENDIX

A Theoretical Appendix: Derivations and Definitions

A.1 Derivation of the Optimal Investment Condition (eq. (7))

Here, we derive the optimal investment condition in the most general framework studied in this paper, described in Section 5.2. The optimality conditions presented earlier in the paper ((4), (5), and (7)) can all be obtained as special cases of this formulation. Under the assumptions stated in Section 5.2 (i.e., setting χ ≡ 1, eliminating pension payments (Ω ≡ 0), and setting idiosyncratic shocks to their mean value), the problem of the agent is given by

\[
V(h_s, a_s, s) = \max_{c_s, n_s, Q_s} \; u\big((1+r)a_s + y_s(1-\bar{\tau}(y_s)) - a_{s+1},\; 1-n_s\big) + \beta V(h_{s+1}, a_{s+1}, s+1)
\]
\[
\text{s.t.} \quad y_s = (\theta_L l + \theta_H h_s)\, n_s - C(Q_s).
\]

Note that the total tax liability of the agent is given by $y\bar{\tau}(y)$. The derivative of the tax liability with respect to $y$ gives the marginal tax rate. Thus, $\tau(y) = \bar{\tau}(y) + y\bar{\tau}'(y)$. Using this expression, we obtain the following FOCs for this problem:

\[
(n_s): \quad (\theta_L l + \theta_H h_s)(1-\tau(y_s))\, u_1(c_s, 1-n_s) = u_2(c_s, 1-n_s)
\]
\[
(a_{s+1}): \quad u_1(c_s, 1-n_s) = \beta V_2(h_{s+1}, a_{s+1}, s+1)
\]
\[
(Q_s): \quad C'(Q_s)(1-\tau(y_s))\, u_1(c_s, 1-n_s) = \beta V_1(h_{s+1}, a_{s+1}, s+1)
\]

The envelope conditions are:

\[
(a_s): \quad V_2(h_s, a_s, s) = (1+r)\, u_1(c_s, 1-n_s)
\]
\[
(h_s): \quad V_1(h_s, a_s, s) = \theta_H n_s (1-\tau(y_s))\, u_1(c_s, 1-n_s) + \beta V_1(h_{s+1}, a_{s+1}, s+1).
\]

Combining the envelope conditions with the FOCs yields

\[
C'(Q_s)(1-\tau(y_s)) = \theta_H n_{s+1}(1-\tau(y_{s+1})) \underbrace{\frac{\beta u_1(c_{s+1}, 1-n_{s+1})}{u_1(c_s, 1-n_s)}}_{\frac{1}{1+r}} + \theta_H n_{s+2}(1-\tau(y_{s+2})) \underbrace{\frac{\beta^2 u_1(c_{s+2}, 1-n_{s+2})}{u_1(c_s, 1-n_s)}}_{\frac{1}{(1+r)^2}} + \dots
\]

Rearranging this expression delivers equation (7):

\[
C_j'(Q_s^j) = \theta_H \left\{ \beta \frac{1-\tau(y_{s+1})}{1-\tau(y_s)}\, n_{s+1} + \beta^2 \frac{1-\tau(y_{s+2})}{1-\tau(y_s)}\, n_{s+2} + \dots + \beta^{S-s} \frac{1-\tau(y_S)}{1-\tau(y_s)}\, n_S \right\}.
\]

A.2 Equilibrium Definition

Definition 1 A stationary recursive competitive equilibrium for this economy is a set of equilibrium decision rules, $c(x)$, $n(x)$, $Q(x)$, $i(x)$, and $a'(\epsilon', x)$; value functions, $V(x)$ and $W^R(x)$, for working and retirement periods, respectively, where $x = (h, a, m; \epsilon, s, j)$ (notice the inclusion of $j$ into this vector); a pricing function for Arrow securities, $q(\epsilon' \mid \epsilon)$; and a measure $\Lambda(x)$ such that

1. Given the labor income tax function, $\bar{\tau}(y)$, consumption tax, $\bar{\tau}_c$, transfers, $Tr$, and the government's pension function $\Omega$, individuals' decision rules and value functions solve problems in (9) to (13) and in (14).

2. Asset markets clear: $\int_{x(:,\epsilon=\tilde{\epsilon})} a'(\epsilon', x)\, d\Lambda(x) = 0$ for all combinations of $(\tilde{\epsilon}, \epsilon')$.37

3. $\Lambda(x)$ is generated by individuals' optimal choices.

4. The government budget balances:

\[
\int_{x(:,s<S)} \bar{\tau}_n(y(x))\, y(x)\, d\Lambda(x) + \int_{x} \bar{\tau}_c\, c(x)\, d\Lambda(x) = G + Tr + \sum_{s=R}^{T} \int_{x(:,s=S-1)} \Omega(y^j, m^S(x))\, d\Lambda(x).
\]

The first term in the government's budget is the total tax revenue from labor income collected from all agents who are working and younger than retirement age. Similarly, the second term is the total tax revenue from the consumption tax, but it is collected from all agents including the retirees. On the right-hand side, the pension payments only depend on a worker's ability through $y^j$ and the number of years she worked until retirement ($m^S(x)$), which in turn depends on the full state vector $x$ at age $S-1$. Therefore, we integrate the pension payments over the full state vector $x$ conditioning on age $S-1$ and then sum the same amount over all ages greater than $S-1$ to find total pension payments.

B Country-Specific Labor Market Policies

B.1 Estimating Country-Specific Average Tax Schedules

Here we provide more details on the estimation of tax schedules described in Section 2.2. Define normalized income as $\tilde{y} \equiv y/AW$. For each country, denote the top marginal tax rate with $\tau_{TOP}$ and the top bracket with $\tilde{y}_{TOP}$.
The values for these variables are taken from the OECD tax database.38 As noted in the text, we already have average tax rates for all income levels below 2 (i.e., two times AW). For values above this number, we have to consider separately the cases where a country's top marginal tax rate bracket is lower and higher than 2. In the former case ($\tilde{y}_{TOP} < 2$), we know the average tax rate at $\tilde{y} = 2$, and each additional dollar above 2 is taxed at the rate $\tau_{TOP}$. Therefore, for $\tilde{y} > 2$,

\[
\bar{\tau}(\tilde{y}) = \big(\bar{\tau}(2) \times 2 + \tau_{TOP} \times (\tilde{y} - 2)\big) / \tilde{y}.
\]

If instead $\tilde{y}_{TOP} > 2$ (which is only the case for the US and France), we do not know the marginal tax rate between $\tilde{y} = 2$ and $\tilde{y}_{TOP}$. Thus, we first set $\tau(2) = (\bar{\tau}(2) \times 2 - \bar{\tau}(1.75) \times 1.75)/0.25$ and use linear interpolation between $\tau(2)$ and $\tau_{TOP}$. We have

\[
\tau(\tilde{y}) =
\begin{cases}
\tau(2) + \dfrac{\tau_{TOP} - \tau(2)}{\tilde{y}_{TOP} - 2}\,(\tilde{y} - 2) & \text{if } 2 < \tilde{y} < \tilde{y}_{TOP} \\[1ex]
\tau_{TOP} & \text{if } \tilde{y} > \tilde{y}_{TOP}.
\end{cases}
\]

Then the average tax rate function for $\tilde{y} > 2$ is

\[
\bar{\tau}(\tilde{y}) =
\begin{cases}
\big(\bar{\tau}(2) \times 2 + \tau(\tilde{y}) \times (\tilde{y} - 2)\big)/\tilde{y} & \text{if } 2 < \tilde{y} < \tilde{y}_{TOP} \\[1ex]
\Big(\bar{\tau}(2) \times 2 + \dfrac{\tau(2) + \tau_{TOP}}{2}\,(\tilde{y}_{TOP} - 2) + \tau_{TOP} \times (\tilde{y} - \tilde{y}_{TOP})\Big)/\tilde{y} & \text{if } \tilde{y} > \tilde{y}_{TOP}.
\end{cases}
\]

We use this expression to compute $\bar{\tau}$ for $\tilde{y} = 3, 4, \dots, 8$ (in addition to the original average tax rates from the OECD website). We then fit the functional form given in equation (8) to these 13 data points as explained in the text. The resulting coefficients are reported in Table A.1.

Table A.1: Tax Function Parameter Estimates, $\bar{\tau}(y/AW) = a_0 + a_1 (y/AW) + a_2 (y/AW)^{\phi}$

Country        a_0       a_1        a_2        φ          R^2
Denmark        1.4647    −.01747    −1.0107    −.15671    0.990
Finland        1.7837    −.01199    −1.4518    −.11063    0.999
France         0.5224     .00339    −.24249    −.41551    0.993
Germany        1.8018    −.01708    −1.3486    −.11833    0.992
Netherlands    3.1592    −.00790    −2.8274    −.03985    0.984
Sweden         9.1211    −.00762    −8.7763    −.01392    0.985
UK             0.5920    −.00390    −.32741    −.30907    0.989
US             1.2088    −.00942    −.94261    −.10259    0.993

37 The notation $x(:,\epsilon=\tilde{\epsilon})$ indicates that the integral is taken over the entire domain of variables in state vector $x$, except for $\epsilon$, which is set equal to $\tilde{\epsilon}$. Others below are defined analogously.

38 From Table I.7, available for download at www.oecd.org/ctp/taxdatabase.

B.2 Deriving Tax Schedules with Different Progressivity but Same Average Tax Rate

To change the average tax rates in Europe without changing progressivity, we apply the following procedure. Let $\tau_i(y)$ be the marginal tax rate in country $i$ for income level $y$.
We would like to obtain a new tax schedule $\tau_i^*(y)$ with the same progressivity but with a different level. Thus, we need to have (for all $y$ and $y'$)

\[
\frac{1-\tau_i^*(y')}{1-\tau_i^*(y)} = \frac{1-\tau_i(y')}{1-\tau_i(y)} \;\Rightarrow\; \frac{1-\tau_i^*(y')}{1-\tau_i(y')} = \frac{1-\tau_i^*(y)}{1-\tau_i(y)}.
\]

Letting this ratio be equal to a constant $k$, the new tax schedule $\tau_i^*$ is obtained by the following expression:

\[
1-\tau_i^*(y) = k\,(1-\tau_i(y)) \quad \text{for all } y. \tag{18}
\]

Let the average tax rate be

\[
\bar{\tau}_i(y) = a_0 + a_1 y + a_2 y^{\phi} \;\Rightarrow\; \tau_i(y) = a_0 + 2a_1 y + a_2(\phi+1)y^{\phi}.
\]

Plugging this last expression into (18) and solving for $\tau_i^*(y)$, we get

\[
\tau_i^*(y) = 1 - k + k\left[a_0 + 2a_1 y + a_2(\phi+1)y^{\phi}\right].
\]

Observing that $y\bar{\tau}_i(y) = \int_0^y \tau_i(x)\,dx$, we can solve for the average tax rate $\bar{\tau}_i^*(y)$ as

\[
\bar{\tau}_i^*(y) = 1 - k + k\left[a_0 + a_1 y + a_2 y^{\phi}\right] = 1 - k + k\bar{\tau}_i(y). \tag{19}
\]

The new schedule $\bar{\tau}_i^*(y)$ has the same progressivity as $\bar{\tau}_i(y)$ but can have any desired average tax rate. We choose $k$ so that the average labor income tax rate in country $i$ is equal to the average labor income tax rate in the US.

B.3 Constructing Tax Schedules for 1983

Here, we describe the formulas we use to calculate the average tax rate at different income levels for Germany and the United States in 1983. This information is obtained from the OECD (1986) (see pages 104–105 and 244–248 for the US and pages 74–75 and 149–154 for Germany). In all calculations for Germany, the monetary figures are in Deutsche Mark (DM). Gross income is denoted by GI.

B.3.1 Germany

Social Security Contributions. In 1983, the social security system in Germany had two brackets with their respective tax rates. Specifically, social security contributions (SSC) were given by:

\[
SSC = 0.1138 \times \min(GI, 64800) + 0.0588 \times \min(GI, 48600).
\]

Allowances. Each worker receives an allowance (tax exemption) of DM 1080 and an allowance of DM 564 for work-related expenses. The OECD considers other miscellaneous allowances in the amount of DM 1606. We treat this amount as fixed for all levels of income. Finally, workers are able to deduct part of their social security contributions, determined by this formula:

\[
\begin{aligned}
\text{SSC Allowance} = {} & \max\{6000 - 0.18\,GI,\, 0\} \\
& + \min\big(2340,\, \max\{SSC - \max\{6000 - 0.18\,GI,\, 0\},\, 0\}\big) \\
& + 0.5 \times \min\big(2340,\, \max\{SSC - \max\{6000 - 0.18\,GI,\, 0\} - 2340,\, 0\}\big).
\end{aligned}
\]

Total Tax. Putting together the taxes and allowances just described gives the taxable income of a worker:

Taxable Income = GI − SSC Allow. − Basic Allow. − Work-related and other Allow.

Now, we can calculate the tax liability of the household. The first step is to round the taxable income:

Rounded Taxable Income (RTI) = round(Taxable Income/54) × 54.

We calculate two variables $Y$ and $Z$ that will be used in the calculations that follow. They are defined as $Y = \frac{RTI - 18000}{10000}$ and $Z = \frac{RTI - 60000}{10000}$. To obtain the income tax for a worker, we need to apply Germany's tax schedule in 1983:

\[
\text{Income Tax} =
\begin{cases}
0 & \text{if } RTI \le 4212 \\
0.22 \times RTI - 926 & \text{if } 4213 < RTI \le 18035 \\
(((3.05Y - 73.76)Y + 695)Y + 2200) \times Y + 3034 & \text{if } 18036 < RTI \le 60047 \\
(((0.09Z - 5.45)Z + 88.13)Z + 5040) \times Z + 20018 & \text{if } 60048 < RTI \le 130031 \\
0.56 \times RTI - 14837 & \text{if } RTI > 130032
\end{cases}
\]

\[
\text{Average Tax Rate} = \frac{\text{Income Tax} + SSC}{\text{Gross Income}}.
\]

B.3.2 The United States

Social Security Contribution. In 1983, the employee social security contribution in the US was given by

\[
\text{SSC Employee} = 0.067 \times \min(\text{Gross Income}, 35700).
\]

The employer's social security contribution matches the employee's contribution of 6.7% on earnings up to $35700. Additionally, employers are required to pay an unemployment tax of 6.2% of earnings up to $7000 and a nationwide average for state-sponsored tax plans of 2.8% of earnings up to $7624:

\[
\text{SSC Employer} = 0.067 \times \min(GI, 35700) + 0.062 \times \min(GI, 7000) + 0.028 \times \min(GI, 7624).
\]

Allowances. The total combined allowances and exemptions amount to $2300 per worker.

Taxable Income = Gross Income − Basic Allowance − Tax Bracket Allowance.

Federal Income Tax. Now, we can calculate the tax liability for the household. We need to apply the US tax schedule in 1983. The first $2300 is not taxed, as discussed earlier. The tax rate is 11% when taxable income is in the range (2300, 3400); 13% in (3400, 4400); 15% in (4400, 8500); 17% in (8500, 10800); 19% in (10800, 12900); 21% in (12900, 15000); 24% in (15000, 18200); 28% in (18200, 23500); 32% in (23500, 28800); 36% in (28800, 34100); 40% in (34100, 41500); 45% in (41500, 55300); and 50% above $55,300.

State and Local Taxes. For the purposes of calculating local and state taxes, the OECD considers a worker who lives in Detroit, Michigan. Detroit allows an exemption of $600, then a flat 3% tax is applied:

\[
\text{Tax Detroit} = 0.03\,(GI - 600).
\]

The formula for Michigan's state income tax is given by

\[
\text{Tax Michigan} = 0.0635\,(GI - 1500) - 0.05 \max(\text{Tax Detroit} - 200,\, 0) + 27.5
\]

\[
\text{Total Local Tax} = \text{Tax Michigan} + \text{Tax Detroit}.
\]

Total Tax. The total tax liability is equal to the income tax plus the social security contribution and the local tax. Then, we have

\[
\text{Average Tax Rate} = \frac{\text{Total Tax Liability}}{\text{Gross Income}}.
\]

B.4 Pension Systems

The details of the pension benefits system for OECD countries used in this paper are taken from the OECD publication entitled “Pensions at a Glance: 2007.” The specific numbers used in this section are from Table I.2 and the unnumbered table on page 35 of that document. Further details of these pension systems, including the number of years required to qualify for full benefits, and so on, are described more fully on pages 26–35 of the same document.

Let $y^j$ be the lifetime average of net (after-tax) labor earnings of all individuals with ability level $j$, and let $\bar{y}$ be the same variable averaged across all ability levels. Finally, recall that $m^R$ is the total number of years a worker has been employed up to the retirement age, and let $\bar{m}$ be the maximum number of years of work for which an individual can accumulate retirement credits in a given country.
The net retirement earnings of an individual with ability $j$ are given by

\[
\Omega(y^j, m^R) = \min\left(1, \frac{m^R}{\bar{m}}\right)\left[a\bar{y} + b y^j\right].
\]

The first term approximates the credit accumulation process whereby individuals qualify for full retirement benefits after working a certain number of years and only qualify for partial pensions if they retire before that. We set $\bar{m}$ equal to 40 years for all countries. Countries differ mainly in the values of the coefficients $a$ and $b$. Broadly speaking, $a$ determines the “insurance” component of retirement income, because it is independent of the individual's own lifetime earnings, whereas $b$ captures the private returns to one's own lifetime earnings. In this sense, a retirement system with a high ratio of $a/b$ provides high insurance but low incentives for high earnings, and vice versa for a low ratio of $a/b$. Inspecting the coefficients in the table shows that there is a very wide range of variation across countries. Finally, some countries have a ceiling on pensionable income and entitlements, which is also reported in Table A.2.

Table A.2: Pension System Formulas

Country    a         b        Range                              Ceiling for Pensionable Income (as % of AW)
DEN        0.371     0.528    all                                -
FIN        0.011     0.695    all                                -
FRA        0.141     0.484    all                                300%
GER        -0.004    0.621    if $y^j \le 1.5\bar{y}$            150%
           0.927              if $y^j > 1.5\bar{y}$
NET        0.005     0.928    all                                -
SWE        -0.021    0.735    all                                367%
UK         0.257     0.154    if $y^j \le \bar{y}$               115%
           0.315     0.096    if $\bar{y} < y^j \le 1.5\bar{y}$
           0.396     0.042    if $y^j > 1.5\bar{y}$
US         0.168     0.355    all                                290%

C Further Details of Calibration

Dispersion of wage growth rates. Using male hourly earnings data, Haider (2001) estimates a value of $\sigma(b^j) = 2.07\%$, and using annual earnings data he estimates it to be 2.02%. Baker (1997, Table 4, rows 6 and 8) uses an annual earnings measure and estimates values of 1.76% and 1.97% in the two specifications most closely related to the present paper, whereas Guvenen (2009) finds a value of 1.94%, again using male annual earnings data. Finally, Guvenen and Smith (2009) estimate a process for household annual earnings and obtain a value of 1.87%.

Calibration of the stochastic component. Over the sample period, Haider estimates the average innovation variance to be 0.074, an AR coefficient of 0.761, and an MA coefficient of −0.42. Using these parameters, the unconditional variance is 0.109. We match the average of the first three autocorrelation coefficients because Haider (2001) estimates an ARMA(1,1) process, whereas in our model we employ a slightly more parsimonious structure (AR(1) + iid shock). This latter formulation is a common choice in calibrated macroeconomic models because it requires one fewer state variable while still capturing the dynamics of wages quite well. Because of this difference, however, it is not possible to exactly match each autocorrelation coefficient in the ARMA(1,1) specification, so we match the average of the first three. In the calibrated model, the first three autocorrelations are 0.48, 0.33, and 0.20, compared to 0.42, 0.32, and 0.24 in the data.
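As a rough check on the autocorrelation target discussed above, the textbook autocorrelations of a stationary ARMA(1,1) process can be computed directly from the AR and MA coefficients. The sketch below assumes the convention y_t = φ y_{t−1} + e_t + θ e_{t−1}; the sign convention for the MA coefficient and the function names are assumptions of this illustration, and it is not the authors' calibration code.

```python
def arma11_autocorrelations(phi, theta, max_lag):
    """Autocorrelations rho_1..rho_max_lag of y_t = phi*y_{t-1} + e_t + theta*e_{t-1}.

    Uses the standard stationary-ARMA(1,1) formulas:
    rho_1 = (1 + phi*theta)(phi + theta) / (1 + 2*phi*theta + theta^2),
    rho_k = phi * rho_{k-1} for k >= 2.
    """
    if not -1.0 < phi < 1.0:
        raise ValueError("stationarity requires |phi| < 1")
    rho1 = (1.0 + phi * theta) * (phi + theta) / (1.0 + 2.0 * phi * theta + theta ** 2)
    return [rho1 * phi ** (k - 1) for k in range(1, max_lag + 1)]


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


# AR and MA coefficients quoted in the text (sign convention assumed, see lead-in).
PHI, THETA = 0.761, -0.42

arma_rhos = arma11_autocorrelations(PHI, THETA, 3)
model_rhos = [0.48, 0.33, 0.20]  # first three autocorrelations reported for the calibrated model

arma_mean = mean(arma_rhos)
model_mean = mean(model_rhos)
print("ARMA(1,1) autocorrelations:", [round(r, 3) for r in arma_rhos])
print("average (ARMA) vs average (model):", round(arma_mean, 3), round(model_mean, 3))
```

Under this convention the two averages are both roughly 0.34, which is in line with targeting the average of the first three autocorrelations rather than each coefficient separately.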
D Further Sensitivity Analysis

In all of the following robustness exercises, we recalibrate our model to the empirical targets described in Section 4.

D.1 Taxing Capital Income

In our baseline model, we abstracted from taxation of capital income for two reasons. First, some plausible formulations of capital income taxation substantially complicate the numerical solution of the model by invalidating a relatively fast algorithm we were able to use in its absence. Second, the actual treatment of capital income is quite complex, certainly much more so than that of labor income. For example, some countries (e.g., the United States) tax certain forms of capital income as ordinary income (i.e., they tax “total” income), whereas some other countries (e.g., France, Finland, and Sweden) allow individuals to pay a lower flat-rate tax on certain types of capital income (such as interest income). See, for example, the discussion in Carey and Rabesona (2002, Table 22) and on pages 158–160. Modeling the complexities of this institutional detail is beyond the scope of this paper, so in the benchmark model studied in the main text we abstracted entirely from capital income taxes.

With these caveats in mind, here we attempt to quantify the effects of taxing capital income in a simple way. Basically, we assume that the government taxes total income (inclusive of capital income) subject to the tax schedules derived in this paper. To understand why taxing total income could matter for our results, first notice that there are essentially two types of assets in our economy: human capital and financial assets. When capital income is taxed at a flat rate, as in our benchmark analysis, progressivity reduces only the return on human capital, hindering investment in human capital relative to investment in financial assets. On the other hand, when the progressive tax is applied to total income, progressivity reduces the return on both human capital and financial assets. Thus, progressivity does not reduce investment in human capital relative to investment in financial assets as much as in the case where progressivity affects only labor income.
To conduct this exercise, we have to make some simplifying assumptions to our model and develop a new computational method. The reason is that our computational procedure for the benchmark model relies on the property that the return on savings is independent of the tax rate (which is no longer true in this experiment). This allowed us to compute the human capital investment and consumption-savings decisions separately and iteratively. When the progressive tax is applied to total income, however, we can no longer use this procedure because we need to compute total income at each age to compute the tax rate the agent is facing. Thus, we need to solve the human capital investment decision jointly with the consumption-savings decision. However, it then becomes very hard to solve this problem with value function methods, since an individual has to know his borrowing limit in a period to make his optimal choices, and this limit depends on his lifetime human capital and labor supply choices.

To circumvent these problems, we consider a benchmark without idiosyncratic shocks and set χ = 1. Since there are no shocks in this version of the model, our target moments reduce to average wage growth, the standard deviation of wage growth rates, and the variance of wages due to profile heterogeneity only. The latter two are obtained from Guvenen (2007). Notice that because (i) there are no shocks and (ii) individuals want to invest significantly early on, they would have a very strong incentive to borrow when utility is separable and hence they want constant consumption. This implies that wealth is negative for many individuals with standard power utility preferences. To mitigate this effect and allow consumption to rise over the lifecycle, we use preferences as in Greenwood et al. (1988) (often called GHH). With this structure, we are able to solve the model both when capital income is and is not taxed.

The main finding is the following. The new benchmark model with no capital income taxes can account for 69% of the L90-10 gap between the US and CEU in 2003. (This is up from 48% in the baseline model in the text with shocks and χ = 0.5.) Adding capital income taxes to this structure reduces the explanatory power to 52.8%, a fall of 23 percent (1 − 52.8/69). Thus, if all capital income were taxed at the same rate as labor income, the model's explanatory power would be about one quarter lower than in the baseline case.

Having said that, it should also be stressed that this exercise is likely to overstate the real effects of capital income taxation. This is because, as mentioned above, in certain CEU countries some capital income is taxed at a flat rate, which is not the case in the United States. Consequently, in those countries, progressivity affects only labor income, making investment in physical assets more attractive than investment in human capital, in turn further compressing the wage distribution. Hence, incorporating such differences would further lower inequality in the CEU and increase the explanatory power of the model. While we do not pursue this approach here, this is an important point to keep in mind.

D.2 Accounting for Cross-Country Variation in Retirement Age

Our baseline model does not allow for variation in retirement age across countries. However, such variation could have important implications for human capital investment by affecting the effective horizon of individuals. Although modeling endogenous retirement is beyond the scope of this paper, here we explore the effects of allowing for exogenous retirement age differences across countries. We estimate the average retirement age by computing the fraction of people who receive social security pensions and disability benefits at each age.39 We then solve each country's problem using the computed retirement age as an exogenous value for S.
With this adjustment, the explanatory power for L90-10 increases to 70%, because countries with more progressivity also turn out to have a lower retirement age than less progressive ones. So the two effects reinforce each other.

D.3 Maximum investment on the job χ

We experiment with two values of χ, 0.4 and 0.6, one on each side of our baseline choice of 0.5. When χ = 0.6, the model's explanatory power for L90-10 and L90-50 falls to 35% and 51%, respectively, whereas the explanatory power for L50-10 remains unchanged at 24%. It should be noted, however, that with this choice of χ the model implies a minimum to mean wage ratio of 0.24, which is quite a bit lower than the 0.29 value in the data (which was used to pin down the baseline choice of 0.50 for χ). When χ = 0.4, the model explains 61% of the L90-10 difference between the US and CEU, 116% of L90-50, and 24% of L50-10. In this case, the minimum to mean wage ratio is a more reasonable 0.30.

39 The data for the CEU countries are obtained from Erosa et al. (2011). We thank Gueorgui Kambourov for providing us with their detailed dataset. The data for the US are from Coile and Gruber (2004).

D.4 Wasteful Government Expenditures versus Transfers

In the baseline model, the surplus was returned to households in a lump-sum fashion, essentially assuming that government expenditures are perfect substitutes for private consumption. To examine if our results are sensitive to this assumption, we now assume that half of the government surplus is wasted: G = Tr, and each component equals half of the budget surplus (i.e., tax revenues minus benefits payments). This assumption is probably extreme, but it is useful in illustrating whether the results are sensitive to this scenario. From Table A.3, we see that, qualitatively, the explanatory power of the model is lower for some countries for L90-10 and L90-50 but higher for L50-10. Quantitatively, however, the effect is minimal across the board. In fact, in some cases, no difference is visible (because of rounding) compared to the benchmark case in Table 5.

Table A.3: Effect of Wasteful Government Spending on Wage Inequality Results (G = Tr = 0.5 × Gov't Surplus)

               L90-10    L90-50    L50-10
               (a)       (b)       (c)
Denmark        63        90        38
Finland        49        75        29
France         30        71        14
Germany        69        75        60
Netherlands    45        59        31
Sweden         42        67        23
CEU            49%       73%       29%
UK             21        0         49

D.5 Depreciation of human capital δ

To check the sensitivity of our results to the choice of the human capital depreciation rate, we have experimented with depreciation rates of 1% and 2%. The model's explanatory power goes down to 44% when δ = 0.01, and it increases slightly above 50% when δ = 0.02. An important point to note is that it is not possible to match two of our targets, mean wage growth and the variance of wage growth rates, jointly for depreciation rates below 1 percent. For very low values of the depreciation rate, when we match the increase in wage inequality over the lifecycle, wage growth turns out to be very high relative to the data. The reason is the following. First, note that learning ability cannot be negative, and as a result the lowest wage growth is bounded below by the negative of the depreciation rate.
For a given minimum ability level, we match the variance of β by adjusting the maximum ability level. However, when we increase the maximum ability to match the variance of β, average wage growth turns out to be very high compared to the data when we use a very low depreciation rate.

D.6 Elasticity of human capital production function α

When α is higher, there is less diminishing marginal productivity in human capital production. As a result, human capital investment responds more to changes in incentives due, for example, to changes in taxes. The model's explanatory power increases to 65% when we set α = 0.9, and it decreases to 28% when we set it to 0.65. Most recent estimates in the literature are above 0.9 (see, e.g., Heckman et al. (1998); Kuruscu (2006)). Thus, our choice of 0.8 is on the conservative side.

D.7 Results: US versus CEU with Fixed Tax Schedules

Extended Model with SBTC. Here is the formal statement of the model studied in Section 5.2:

\[
V(h, a, m; \epsilon, s) = \max_{c,\, n,\, i,\, a'(\epsilon')} \left[ u(c, n) + \beta E\big(V(h', a'(\epsilon'), m'; \epsilon', s+1) \mid \epsilon\big) \right] \tag{20}
\]
s.t.
\[
(1+\bar{\tau}_c)\,c + \sum_{\epsilon'} q(\epsilon' \mid \epsilon)\, a'(\epsilon') = (1-\bar{\tau}(y))\,y + a + Tr, \tag{21}
\]
\[
y = \epsilon \left[ P_L l^j + P_H h_s^j \right] n_s^j (1 - i_s^j), \tag{22}
\]
\[
h' = (1-\delta)h + A^j \left[ (\theta_L l^j + \theta_H h^j)\, i^j n^j \right]^{\alpha}, \tag{23}
\]
\[
m' = m + \mathbf{1}\{ i < 1 \;\&\; n \ge n_{\min} \}, \tag{24}
\]
\[
i \in [0, \chi] \cup \{1\}.
\]

Notice that the only changes are the introduction of raw labor into the labor earnings equation and the human capital accumulation function. The weights $\theta_H$ and $\theta_L$ in the production function in (23) capture the relative efficiency of human capital and raw labor in producing new human capital. As in Guvenen and Kuruscu (2010), we focus on the case where $P_H = \theta_H$ and $P_L = \theta_L$.

This extended model has some new parameters that need to be calibrated. Except for those discussed here, all parameter values are kept at the values given in Table 3. An important point to note is that for the cross-sectional analysis of the previous section, the two-factor model would have precisely the same implications as the one-factor Ben-Porath model used earlier. This is because $\theta_H$ and $\theta_L$ are constant at a point in time and their values can be normalized to generate exactly the same results as in the previous section. Thus, with proper choices of $\theta_H$, $\theta_L$, and the distribution of $l^j$, we do not need to recalibrate any other parameter and can still obtain the same results for year 2003 as before.
This is the route that we follow in this section.40

To examine the change in inequality over time, we choose $\Delta\log(\theta_H/\theta_L)$ to match the rise of 23 log points in L90-10 in the US from 1980 to 2003. The required value of $\Delta\log(\theta_H/\theta_L)$ is 0.236. With this calibration, wage inequality rises by 0.168 in the CEU during the same time, compared to a rise of 0.070 in the data (fourth column of Table A.4). These results imply that differences in labor market policies, even when they are fixed over time, can generate about 41% (= (0.232 − 0.168)/(0.230 − 0.070)) of the widening in the inequality gap between the US and the CEU during this time period.

40 More specifically, the two-factor model eliminates initial heterogeneity in human capital but instead introduces raw labor. We make the same assumptions for $l^j$ as we made earlier about $h_0^j$. That is, we assume that $l^j$ is uniformly distributed and is perfectly correlated with $A^j$. We also assume that $\theta_H = \theta_L = 1$ in 2003, which allows us to use the same mean value and coefficient of variation for $l^j$ as for $h_0^j$ in Table 1.
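The "share explained" figure quoted above is a ratio of two gaps: the model-implied widening of the US-CEU difference over the widening observed in the data. The minimal sketch below reproduces that arithmetic from the rounded numbers in the text; the function and variable names are illustrative and not taken from the paper. With the rounded inputs the ratio evaluates to 0.40, so the reported 41% presumably reflects unrounded model values (Table A.4 lists the model gap as 0.065).

```python
def share_explained(model_us, model_other, data_us, data_other):
    """Fraction of the data gap (US minus other) that the model gap reproduces."""
    data_gap = data_us - data_other
    if data_gap == 0:
        raise ValueError("the data gap is zero, so the share is undefined")
    return (model_us - model_other) / data_gap


# Rounded changes in L90-10 between 1980 and 2003, as quoted in the text.
MODEL_US, MODEL_CEU = 0.232, 0.168
DATA_US, DATA_CEU = 0.230, 0.070

share = share_explained(MODEL_US, MODEL_CEU, DATA_US, DATA_CEU)
print(f"share of the widening gap explained: {share:.1%}")
```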

Table A.4: Rise in Wage Inequality: Model versus Data, 1980–2003. The model is calibrated to match the rise of 23 log points in L90-10 for the US from 1980 to 2003.

Change in Log Wage Differentials

                                  L90-10  =  L90-50  +  L50-10
CEU           Data     Level      0.070      0.063      0.007
                       %                     91%        9%
              Model    Level      0.168      0.129      0.039
                       %                     77%        23%
US            Data     Level      0.230      0.160      0.070
                       %                     70%        30%
              Model    Level      0.232      0.184      0.048
                       %                     79%        21%
Difference    Data     Level      0.160      0.097      0.063
                       %                     61%        39%
              Model    Level      0.065      0.056      0.009
                       %                     87%        13%
% Explained                       41%        58%        14%

Another dimension of the rise in wage inequality is seen in the last two columns of Table A.4. Most of the rise in wage inequality in the CEU has been at the top: L90-50 is responsible for 91% of the total rise in L90-10, whereas only 9% of the rise took place at the lower end. A similar, somewhat less extreme, outcome is observed in the US, where 70% of the rise in L90-10 is due to L90-50. The model generates a similar picture: about 77% of the rise in the CEU and 79% in the US is due to L90-50. An alternative way to express these figures is that the model accounts for 58% of the increase in the inequality gap above the median between the US and the CEU but only 14% of the rising gap below the median. As is clear by now, this is a recurring theme in this paper: the model accounts for cross-country inequality facts at the upper tail quite well, but accounts for a smaller fraction at the lower tail.

E Data Appendix: GSOEP and PSID

E.1 Sample Selection and Data Preparation

The sample period for the German SOEP is 1984–2008 and for the PSID is 1968–1992. We keep only males between 25 and 60 years old, regardless of whether they are heads of household. If an individual does not report hours, wages, or income, he is dropped from the sample. To further trim earnings outliers, we exclude observations in which earnings grow by more than 500% or fall by more than 80%, in which earnings are below 100 Euros (2005) or 2 Dollars (1983) per hour, or in which they are top-coded.
To ensure consistency, we drop those who report zero hours but positive earnings or zero earnings but positive hours. We also drop individuals who report more than 80 hours per week for the entire year (4,160 hours), and flag individuals who work less than one quarter at 40 hours per week (520 hours). In the PSID, we also drop the SEO oversample.
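The drop and flag rules above can be sketched as a simple record filter. The example below is an illustration only: the record layout and field names (`sex`, `age`, `hours`, `earnings`, `earnings_growth`) are hypothetical, and the hourly-wage floor, top-coding, and SEO oversample rules are left out because they depend on survey-specific variables.

```python
# Thresholds taken from the text; all field names below are hypothetical.
MAX_ANNUAL_HOURS = 80 * 52           # 80 hours per week for the entire year = 4160
LOW_HOURS_FLAG = 40 * 13             # one quarter at 40 hours per week = 520
MAX_GROWTH, MIN_GROWTH = 5.0, -0.8   # +500% and -80% earnings growth


def keep_observation(obs):
    """Return True if a person-year record survives the drop rules sketched here."""
    if obs["sex"] != "male" or not 25 <= obs["age"] <= 60:
        return False
    hours, earnings = obs.get("hours"), obs.get("earnings")
    if hours is None or earnings is None:      # missing hours or earnings
        return False
    if (hours == 0) != (earnings == 0):        # zero hours with positive earnings, or the reverse
        return False
    if hours > MAX_ANNUAL_HOURS:               # more than 80 hours per week all year
        return False
    growth = obs.get("earnings_growth")        # None when no previous year is available
    if growth is not None and not MIN_GROWTH <= growth <= MAX_GROWTH:
        return False
    return True


def is_low_hours(obs):
    """Flag (rather than drop) records below one quarter of full-time work."""
    return obs["hours"] < LOW_HOURS_FLAG


sample = [
    {"sex": "male", "age": 40, "hours": 2000, "earnings": 30000.0, "earnings_growth": 0.05},
    {"sex": "male", "age": 24, "hours": 2000, "earnings": 20000.0, "earnings_growth": 0.02},
    {"sex": "male", "age": 35, "hours": 0, "earnings": 1000.0},
    {"sex": "male", "age": 50, "hours": 4500, "earnings": 50000.0},
    {"sex": "male", "age": 30, "hours": 400, "earnings": 5000.0, "earnings_growth": 6.0},
    {"sex": "male", "age": 45, "hours": 400, "earnings": 6000.0, "earnings_growth": None},
]
kept = [obs for obs in sample if keep_observation(obs)]
print(len(kept), "of", len(sample), "records kept")
```

Keeping the flag separate from the drop rule mirrors the text, which drops implausibly high hours but only flags low-hours records.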

In the PSID, we have to identify roles within households to pair the hours of the “wife” and the “head” of household with the corresponding individual. To do so, we use the pnum variable in 1967 (requiring that the “wife” is female) and the seqnum and relatehd variables in subsequent years. The household head gets seqnum = 1, and wives are seqnum = 2 and relatehd = 2 until 1982, when they become relatehd = 20. In a few cases each year, the hours reported at the household level and matched to the individual do not match individually reported hours, and we drop these. We also create a consistent age variable so that age increments by 1 each observation even when an individual is surveyed at different times in the year.

E.2 Calculations

E.2.1 Residual variables

The lifecycle profiles are based on residual log wages. To obtain residuals, we regress log wages on marital status, race (in the US case), and education level (i.e., dropout, high school, or college in the US; and dropout, vocational, high school, or college in Germany). In all regressions, the intercept corresponds to an unmarried, white, high school graduate. The regression is repeated for every year of the sample, so the dummy coefficients vary freely over time.

E.2.2 Age Profiles

We construct profiles in much the same way as Deaton and Paxson (1994) and Storesletten et al. (2004b). For each variable, we compute the mean and variance within an age-year bin, each defined by a calendar year and a 5-year window of ages. We label these bins by the year and the age in the center of the range. We calculate life-cycle profiles with time effects by using coefficients from regressing these bins on both age and year dummies, weighting by the number of individuals in the year-age bin. That is, for the mean or dispersion of wages within the age-year bin $(h, t)$, we estimate

\[
x_{h,t} = d_h^t + g_t + \epsilon_{h,t}.
\]

The coefficients on age, $d_h^t$, are stored as a profile relative to a base at the level or dispersion at age 25 in 1985, the group represented by the intercept term.
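The binning step described above can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes overlapping 5-year age windows centered on each age (the text does not say whether windows overlap), it uses the population variance, and the tuple layout `(age, year, value)` is hypothetical. The subsequent weighted regression on age and year dummies is not shown.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def age_year_bins(records, half_window=2):
    """Map (center_age, year) -> (count, mean, variance).

    Each record (age, year, value) contributes to every bin whose center age
    lies within `half_window` years of the record's age, so half_window=2
    gives a 5-year age window labeled by its center.
    """
    grouped = defaultdict(list)
    for age, year, value in records:
        for center in range(age - half_window, age + half_window + 1):
            grouped[(center, year)].append(value)
    return {
        key: (len(values), fmean(values), pvariance(values))
        for key, values in grouped.items()
    }


# Tiny made-up example: residual log wages for three person-year observations.
records = [(30, 1985, 1.0), (31, 1985, 3.0), (40, 1985, 5.0)]
bins = age_year_bins(records)
count, bin_mean, bin_var = bins[(30, 1985)]
print(count, bin_mean, bin_var)
```

The bin counts returned here are the natural weights for the regression on age and year dummies mentioned in the text.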
To calculate profiles with cohort effects, we follow the same procedure, using age coefficients from a regression on age and cohort dummies. Again, we use the same shift strategy so the average of the profile is the same, whether controlling for time effects or cohort effects.

References

Altig, D. and C. T. Carlstrom, “Marginal Tax Rates and Income Inequality in a Life-Cycle Model,” American Economic Review, 1999, 89, 1197–1215.

Baker, Michael, “Growth-Rate Heterogeneity and the Covariance Structure of Life-Cycle Earnings,” Journal of Labor Economics, 1997, 15 (2), 338–375.

Becker, Gary S, “Human Capital, Effort, and the Sexual Division of Labor,” Journal of Labor Economics, January 1985, 3 (1), S33–58.

Ben-Porath, Yoram, “The Production of Human Capital and the Life Cycle of Earnings,” Journal of Political Economy, 1967, 75 (4), 352–365.

Benabou, Roland, “Unequal Societies: Income Distribution and the Social Contract,” American Economic Review, 2000, 90 (1), 96–129.

Bils, Mark, Yongsung Chang, and Sun-Bin Kim, “Comparative Advantage and Unemployment,” RCER Working Paper 547, University of Rochester 2009.

Boskin, Michael J., “Notes on the Tax Treatment of Human Capital,” in “Conference on Tax Research 1975,” Washington: Dept. Treasury, 1977.

Bound, John, Charles Brown, and Nancy Mathiowetz, “Measurement error in survey data,” in J.J. Heckman and E.E. Leamer, eds., Handbook of Econometrics, Elsevier, 2001, chapter 59, pp. 3705–3843.

Browning, Martin, Lars Peter Hansen, and James J. Heckman, “Micro Data and General Equilibrium Models,” in J. B. Taylor and M. Woodford, eds., Handbook of Macroeconomics, 1999.

Carey, David and Josette Rabesona, “Tax Ratios on Labor and Capital Income and on Consumption,” in “OECD Economic Studies No 35,” OECD, 2002.

Castañeda, Ana, Javier Díaz-Giménez, and José-Víctor Ríos-Rull, “Accounting for the U.S. Earnings and Wealth Inequality,” The Journal of Political Economy, 2003, 111 (4), 818–857.

Caucutt, Elizabeth M., Selahattin Imrohoroglu, and Krishna B. Kumar, “Does the Progressivity of Income Taxes Matter for Human Capital and Growth?,” Journal of Public Economic Theory, 2006, 8 (1), 95–118.

Coile, Courtney and Jonathan Gruber, “The Effect of Social Security on Retirement in the United States,” in Jonathan Gruber and David A. Wise, eds., Social Security Programs and Retirement around the World: Micro-Estimation, The University of Chicago Press, 2004.

Conesa, Juan Carlos and Dirk Krueger, “On the optimal progressivity of the income tax code,” Journal of Monetary Economics, October 2006, 53 (7), 1425–1450.

Deaton, Angus and Christina Paxson, “Intertemporal Choice and Inequality,” Journal of Political Economy, June 1994, 102 (3), 437–67.

Devroye, Dan and Richard B. Freeman, “Does Inequality in Skills Explain Inequality of Earnings Across Countries?,” Technical Report, Harvard University 2000.

Domeij, David and Martin Floden, “Inequality Trends in Sweden 1978-2004,” Review of Economic Dynamics, 2010, 13 (1), 179–208.

Duncan, Denvil and Klara Sabirianova Peter, “Tax Progressivity and Income Inequality,” Working Paper, Georgia State University 2008.

Erosa, A., L. Fuster, and G. Kambourov, “The Heterogeneity and Dynamics of Individual Labor Supply over the Life Cycle: Facts and Theory,” Working Paper, University of Toronto 2009.

Erosa, Andres and Tatyana Koreshkova, “Progressive taxation in a dynastic model of human capital,” Journal of Monetary Economics, 2007, 54, 667–685.

Erosa, Andres, Luisa Fuster, and Gueorgui Kambourov, “A Theory of Labor Supply Late in the Life Cycle: Social Security and Disability Insurance,” Technical Report, University of Toronto 2011.

Fuchs-Schündeln, Nicola, Dirk Krueger, and Mathias Sommer, “Inequality Trends for Germany in the Last Two Decades: A Tale of Two Countries,” Review of Economic Dynamics, 2010, 13 (1), 103–132.

Gourinchas, Pierre-Olivier and Jonathan A. Parker, “Consumption over the Life Cycle,” Econometrica, 2002, 70 (1), 47–89.

Greenwood, Jeremy, Zvi Hercowitz, and Gregory W Huffman, “Investment, Capacity Utilization, and the Real Business Cycle,” American Economic Review, June 1988, 78 (3), 402–17.

Gouveia, Miguel and Robert P. Strauss, “Effective federal individual income tax functions: An exploratory empirical analysis,” National Tax Journal, 1994, 47 (2), 317–39.

Guvenen, Fatih, “Learning Your Earning: Are Labor Income Shocks Really Very Persistent?,” American Economic Review, June 2007, 97 (3), 687–712.

Guvenen, Fatih, “An Empirical Investigation of Labor Income Processes,” Review of Economic Dynamics, January 2009, 12 (1), 58–79.

Guvenen, Fatih and Anthony A Smith, “Inferring Labor Income Risk from Economic Choices: An Indirect Inference Approach,” Working Paper, University of Minnesota 2009.

Guvenen, Fatih and Burhanettin Kuruscu, “A Quantitative Analysis of the Evolution of the U.S.
Wage Distribution, 1970-2000,” NBER Macroeconomics Annual, 2010, 24 (1), 227–276. 28, 49 , , and Serdar Ozkan,“Taxation of Human Capital and Wage Inequality: A Cross- Country Analysis,”NBER Working Papers 15526 2009. 22, 36 53

Haider, Steven J.,“Earnings Instability and Earnings Inequality of Males in the United States: 1967-1991,”Journal of Labor Economics, 2001, 19 (4), 799–836. 21, 45 Hassler, John, Jose Mora, KjetilStoresletten, and FabrizioZilibotti,“TheSurvival of the Welfare State,”American Economic Review, 2003, 93 (1), 87–112. 5 Heathcote, Jonathan, Fabrizio Perri, and Giovanni L. Violante, “Unequal We Stand: An Empirical Analysis of Economic Inequality in the United States, 1967-2006,” Review of Economic Dynamics, 2010, 13 (1), 15–51. 18, 34 , Kjetil Storesletten, and Giovanni L Violante,“Consumption and Labour Supply with Partial Insurance: An Analytical Framework,” C.E.P.R. Discussion Papers 6280 2007. 33 , , and Giovanni L. Violante, “The Macroeconomic Implications of Rising Wage Inequality in the United States,”NBER Working Papers 14052 June 2008. 18 Heckman, James J, “A Life-Cycle Model of Earnings, Learning, and Consumption,” Journal of Political Economy, August 1976, 84 (4), S11–44. 8 Heckman, James, Lance Lochner, and Christopher Taber,“Explaining Rising Wage Inequality: Explanations With A Dynamic General Equilibrium Model of Labor Earnings With Heterogeneous Agents,”Review of Economic Dynamics, January 1998, 1 (1), 1–58. 49 Hornstein, A., P. Krusell, and G. Violante, “Technology-Policy Interaction in Frictional Labor-Markets,”Review of Economic Studies, 2007, 74 (4), 1089–1124. 5 Huggett, Mark, Gustavo Ventura, and Amir Yaron,“Sources of Lifetime Inequality,” American Economic Review, forthcoming. 3, 6, 10, 19 Jr., Robert Lucas, “Supply-Side Economics: An Analytical Review,” Oxford Economic Papers, April 1990, 42 (2), 293–316. 8 Kaplan, Greg,“Inequality and the Life Cycle,”Technical Report, University of Pennsylvania 2010. 33 King, Robert G and Sergio Rebelo,“Public Policy and Economic Growth: Developing Neoclassical Implications,”Journal of Political Economy, October 1990, 98 (5), S126–50. 8 Kitao, S., L. Ljungqvist, and T. 
Sargent, “A Life Cycle Model of Trans-Atlantic Employment Experiences,”Working Paper, USC and NYU 2008. 5 Krebs, T., “Human Capital Risk and Economic Growth*,” Quarterly Journal of Economics, 2003, 118 (2), 709–744. 6 54

Kuruscu, Burhanettin, “Training and Lifetime Income,” American Economic Review, 2006, 96 (3), 832–846. 49 Leuven, Edwin, Hessel Oosterbeek, and Hans van Ophem, “Explaining International Differences in Male Skill Wage Differentials by Differences in Demand and Supply of Skill,”Economic Journal, 2004, 114, 466–486. 35 Ljungqvist, Lars and Thomas J. Sargent,“The European Unemployment Dilemma,” Journal of Political Economy, June 1998, 106 (3), 514–550. 5 and , “Two Questions about European Unemployment,” Econometrica, 01 2008, 76 (1), 1–29. 5 McDaniel, Cara,“Average tax rates on consumption, investment, labor and capital in the OECD 1950-2003,”Working Paper, Arizona State University 2007. 22 Moene, Karl Ove and Michael Wallerstein,“Inequality, Social Insurance, and Redistribution,”The American Political Science Review, 2001, 95 (4), 859–874. 5, 12 Nickell, Stephen and Brian Bell,“The Collapse in Demand for the Unskilled and Unemployment across the OECD,”Oxford Review of Economic Policy, 1995, 11 (1), 40–62. 35 OECD, The Tax/Benefit Position of Production Workers 1981-1985, Paris: Organisation for Economic Co-Operation and Development, 1986. 42 Ohanian, Lee, Andrea Raffo, and Richard Rogerson,“Long-Term Changes in Labor Supply and Taxes: Evidence from OECD Countries, 1956-2004,” Journal of Monetary Economics, December 2008, pp. 1353–1362. 5, 15, 34 Prescott, Edward C.,“Why do Americans work so much more than Europeans?,”Federal Reserve Bank of Minneapolis Quarterly Review, 2004, (Jul), 2–13. 5, 18, 34 Rebelo, Sergio,“Long-Run Policy Analysis and Long-Run Growth,”Journal of Political Economy, June 1991, 99 (3), 500–521. 8 Rodriguez, Francisco,“Inequality, Redistribution, and Rent-Seeking.”PhD dissertation, Harvard University 1998. 12 Rogerson, Richard,“Structural Transformation and the Deterioration of European Labor Market Outcomes,”Journal of Political Economy, 2008, 116 (2), 235–259. 5 Storesletten, Kjetil, Chris I. 
Telmer, and Amir Yaron, “Cyclical Dynamics in Idiosyncratic Labor Market Risk,”Journal of Political Economy, June 2004, 112 (3), 695– 717. 21 , Christopher I. Telmer, and Amir Yaron,“Consumption and risk sharing over the life cycle,”Journal of Monetary Economics, April 2004, 51 (3), 609–633. 51 55

Wanger, Susanne, “Erwerbst¨atigkeit, Arbeitszeit und Arbeitsvolumen nach Geschlecht und Altersgruppen,”Technical Report 2, IAB Forschungsbericht 2006. 34 56
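
As an illustration of the cohort-effects procedure described in this appendix, the following is a minimal sketch rather than the authors' code. It regresses an outcome on a full set of age dummies plus cohort dummies, keeps the age coefficients, and shifts them so that the profile has a chosen average. The function name `age_profile_cohort_controls`, the outcome `y`, and the argument `target_mean` (the average that the shifted profile should match) are assumptions introduced here for illustration only.

```python
import numpy as np


def age_profile_cohort_controls(y, age, cohort, target_mean):
    """Age profile from a regression on age and cohort dummies (sketch).

    y, age, cohort: 1-D arrays of equal length (one row per observation/cell).
    target_mean: hypothetical level the shifted profile should average to.
    """
    y = np.asarray(y, dtype=float)
    ages, age_idx = np.unique(np.asarray(age), return_inverse=True)
    cohorts, cohort_idx = np.unique(np.asarray(cohort), return_inverse=True)

    # Full set of age dummies; the first cohort is the omitted reference
    # category, so no separate intercept is needed.
    age_dummies = np.eye(len(ages))[age_idx]
    cohort_dummies = np.eye(len(cohorts))[cohort_idx][:, 1:]
    X = np.hstack([age_dummies, cohort_dummies])

    # Ordinary least squares; keep only the age coefficients.
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    age_coefs = coefs[: len(ages)]

    # Shift the profile so that its average equals target_mean.
    profile = age_coefs - age_coefs.mean() + target_mean
    return ages, profile


# Usage on noise-free synthetic data (all values are made up).
true_age = np.array([0.0, 0.1, 0.3])
true_cohort = np.array([0.0, 0.5, -0.2, 0.4])
a, c = np.meshgrid(np.arange(3), np.arange(4), indexing="ij")
y = true_age[a.ravel()] + true_cohort[c.ravel()]
ages, profile = age_profile_cohort_controls(
    y, 25 + a.ravel(), 1950 + c.ravel(), target_mean=1.0
)
```

Because the shift only changes the level of the profile, differences between ages are unaffected; setting `target_mean` to the average of a profile estimated with time effects would make the two profiles comparable in level, as the text describes.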

Cite this document

APA

Fatih Guvenen, Burhanettin Kuruscu, & Serdar Ozkan (2011). Taxation of Human Capital and Wage Inequality: A Cross-Country Analysis (FEDS 2013-20). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2013-20

BibTeX

@techreport{wtfs_feds_2013_20,
  author = {Fatih Guvenen and Burhanettin Kuruscu and Serdar Ozkan},
  title = {Taxation of Human Capital and Wage Inequality: A Cross-Country Analysis},
  type = {Finance and Economics Discussion Series},
  number = {2013-20},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2011},
  url = {https://whenthefedspeaks.com/doc/feds_2013-20},
  abstract = {Wage inequality has been significantly higher in the United States than in continental European countries (CEU) since the 1970s. Moreover, this inequality gap has further widened during this period as the US has experienced a large increase in wage inequality, whereas the CEU has seen only modest changes. This paper studies the role of labor income tax policies for understanding these facts, focusing on male workers. We construct a life cycle model in which individuals decide each period whether to go to school, work, or stay non-employed. Individuals can accumulate skills either in school or while working. Wage inequality arises from differences across individuals in their ability to learn new skills as well as from idiosyncratic shocks. Progressive taxation compresses the (after-tax) wage structure, thereby distorting the incentives to accumulate human capital, in turn reducing the cross-sectional dispersion of (before-tax) wages. Consistent with the model, we empirically document that countries with more progressive labor income tax schedules have (i) significantly lower before-tax wage inequality at different points in time and (ii) experienced a smaller rise in wage inequality since the early 1980s. We then study the calibrated model and find that these policies can account for half of the difference between the US and the CEU in overall wage inequality and 84% of the difference in inequality at the upper end (log 90-50 differential). In a two-country comparison between the US and Germany, the combination of skill-biased technical change and changing progressivity of tax schedules explains all the difference between the evolution of inequality in these two countries since the early 1980s.},
}