feds · June 30, 2018

Heterogeneity and Unemployment Dynamics

Abstract

This paper develops new estimates of flows into and out of unemployment that allow for unobserved heterogeneity across workers as well as direct effects of unemployment duration on unemployment-exit probabilities. Unlike any previous paper in this literature, we develop a complete dynamic statistical model that allows us to measure the contribution of different shocks to the short-run, medium-run, and long-run variance of unemployment as well as to specific historical episodes. We find that changes in the inflows of newly unemployed are the key driver of economic recessions and identify an increase in permanent job loss as the most important factor.

Finance and Economics Discussion Series
Divisions of Research & Statistics and Monetary Affairs
Federal Reserve Board, Washington, D.C.

Heterogeneity and Unemployment Dynamics
Hie Joo Ahn and James D. Hamilton
2016-012

Please cite this paper as: Ahn, Hie Joo, and James D. Hamilton (2016). “Heterogeneity and Unemployment Dynamics,” Finance and Economics Discussion Series 2016-012. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2016.012r1.

NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Heterogeneity and Unemployment Dynamics∗
Hie Joo Ahn† and James D. Hamilton‡
May 1, 2014
Revised: June 8, 2018

Keywords: business cycles, Great Recession, unemployment duration, unobserved heterogeneity, duration dependence, state space model, extended Kalman filter

∗ The views in this paper are solely the responsibility of the authors and should not be interpreted as reflecting the views of the Board of Governors of the Federal Reserve System or of any other person associated with the Federal Reserve System. We thank Stephanie Aaronson, Katarina Borovickova, Shigeru Fujita and Ryan Michaels for helpful comments on an earlier draft of this paper.
† Federal Reserve Board, email: HieJoo.Ahn@frb.gov
‡ University of California at San Diego, e-mail: jhamilton@ucsd.edu

Introduction

What accounts for the sharp spike in the unemployment rate during recessions? The answer traditionally given by macroeconomists was that falling product demand leads firms to lay off workers, with these inflows into unemployment a key driver of economic downturns. That view has been challenged by Hall (2005), Shimer (2012) and Hall and Schulhofer-Wohl (2017), who argued that cyclical fluctuations in the unemployment rate are instead primarily driven by declines in the job-finding rates for unemployed workers. By contrast, Yashiv (2007), Elsby, Michaels and Solon (2009), Fujita and Ramey (2009), and Fujita (2011) concluded that flows into the unemployment pool are as important or more important than outflows as cyclical drivers of the unemployment rate.

These papers are part of a large literature that tries to measure the relative importance of inflows and outflows to unemployment. Our paper is the first to do this using maximum likelihood estimation of a complete dynamic model. This is a critical step because the unemployment rate is highly serially correlated and possibly nonstationary. The variance of a nonstationary variable is not defined, so trying to measure how much of the variance of unemployment comes from inflows versus outflows is not a well-posed question in the absence of a full dynamic model. With a complete dynamic model, we can forecast variables like unemployment at any horizon. As in den Haan (2000) and Hamilton (forthcoming), the variance of the error associated with the forecast is well defined regardless of whether the series is stationary, and we can measure the fraction of this variance that is attributable to inflows and outflows as a function of the horizon. Our framework also allows us to decompose the forecast error associated with any given historical episode into the respective contributions of inflows and outflows. These represent key innovations of our approach that are entirely new to this literature.
The main reason that we are able to do what others have not is that we directly confront another core issue: unobserved heterogeneity. One can see the critical importance of taking this into account in Figure 1, which plots the stark differences in unemployment-continuation probabilities between the newly unemployed and the long-term unemployed.¹ On average, someone who is newly unemployed has less than a 50% chance of still being unemployed next month. By contrast, if someone has already been unemployed for 4 months or longer, there is an 80% chance they will still be unemployed the next month. The newly unemployed during the Great Recession had better job-finding prospects than did the long-term unemployed during the strongest economic boom. Differences in unemployment-continuation probabilities between the newly unemployed and long-term unemployed can make a huge difference for any calculations about unemployment dynamics.

¹ Let $\tilde U_t^{n.+}$ denote the seasonally unadjusted number of individuals in month $t$ who report having been unemployed for $n$ months or longer at that time. The seasonally unadjusted monthly unemployment-continuation probability for the long-term unemployed was calculated as $\tilde p_t^{4.+} = (\tilde U_{t+3}^{7.+}/\tilde U_t^{4.+})^{1/3}$. The probability for the newly unemployed was calculated as the solution to $\tilde p_t^{1}(1+\tilde p_t^{1}) = \tilde U_{t+1}^{2.3}/\tilde U_t^{1}$. The figure plots 12-month moving averages of $\tilde p_t$.

Figure 2 illustrates how ex ante heterogeneity across workers could explain why unemployment-continuation probabilities are observed to increase as a function of duration. Suppose for example that 80% of the newly unemployed have unemployment-continuation probability of 35% and 20% have probability of 85%. Then the average continuation probability will rise as a function of duration as the latter group comes to make up a larger portion of the remaining unemployed. All of the previous papers in this literature that tried to allow for heterogeneity across workers did so based on differences in observable characteristics such as demographics, education, industry, occupation, geographical region, and reason for unemployment.² However, regardless of what observable characteristics we condition on, dramatic differences in unemployment-continuation probabilities between the newly unemployed and long-term unemployed remain.³ No two individuals with the same coarse observable characteristics are in fact identical.
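The Figure 2 example above can be checked numerically. The sketch below is illustrative only: the function name `average_continuation_by_duration` is ours, and it assumes that each type's monthly continuation probability is constant across duration (no genuine duration dependence), so any rise in the pooled probability comes purely from the changing mix of types among those still unemployed.

```python
def average_continuation_by_duration(shares, probs, max_months):
    """Pooled continuation probability among those still unemployed.

    shares: initial fraction of each type among the newly unemployed.
    probs:  constant monthly continuation probability of each type (assumed).
    Returns one (pooled probability, share of the last type) pair per month
    of duration, starting with the newly unemployed.
    """
    remaining = list(shares)  # mass of each type still unemployed
    out = []
    for _ in range(max_months):
        total = sum(remaining)
        pooled = sum(m * p for m, p in zip(remaining, probs)) / total
        out.append((pooled, remaining[-1] / total))
        # One more month passes: each type survives at its own rate.
        remaining = [m * p for m, p in zip(remaining, probs)]
    return out


# Example from the text: 80% of entrants with p = 0.35, 20% with p = 0.85.
path = average_continuation_by_duration([0.8, 0.2], [0.35, 0.85], 6)
for month, (pooled, share_high) in enumerate(path, start=1):
    print(f"month {month}: pooled continuation = {pooled:.3f}, "
          f"high-continuation share = {share_high:.3f}")
```

With these inputs the pooled probability starts at 0.45 for the newly unemployed and rises with duration toward 0.85, even though no individual's probability changes.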
It seems undeniable that a given pool of unemployed individuals that conditions on any set of observed characteristics is likely to become increasingly represented by those with higher ex ante continuation probabilities the longer the period of time for which the individuals have been unemployed.⁴ Showing how to incorporate unobserved heterogeneity into a dynamic model of observed variables is the second main contribution of our paper.

² Baker (1992), Shimer (2012), and Kroft, et al. (2016) found that such variables contributed little to variation over time in long-term unemployment rates, while Aaronson, Mazumder and Schechter (2010), Bachmann and Sinning (2012), Barnichon and Figura (2015), Hall (2014), and Hall and Schulhofer-Wohl (2017) documented important differences across observable characteristics. Elsby, Michaels and Solon (2009) found that incorporating observable heterogeneity reduced the imputed role of cyclical variation in unemployment exit rates.

³ For example, for individuals who gave involuntary permanent separation as the reason for unemployment, the average unemployment-continuation probability since 1994:M1 (the sample for which this finer separation exists) was 70% for the newly unemployed and 84% for the long-term unemployed.

⁴ Several recent studies have found that observable characteristics that were not considered in the earlier literature can further explain differences in unemployment durations. Faberman and Kudlyak (2017) discovered that how much time newly unemployed devote to searching for a new job predicts how long they will remain unemployed. Kudlyak and Lange (2017) demonstrated that a newly unemployed individual who the previous month had been classified as not in the labor force is likely to remain unemployed longer than a newly unemployed individual who had been employed the previous month. And Morchio (2016) documented that 2/3 of prime-age unemployment comes from only 10% of the workers.

The difference in unemployment-continuation probabilities between the newly unemployed and long-term unemployed could reflect not just ex ante heterogeneity but also the possibility that the experience of being unemployed for a longer period of time directly changes the employment probability for a fixed individual. Following van den Berg and van Ours (1996) we will refer to this possibility as “genuine duration dependence”. Individuals lose human capital the longer they are unemployed (Acemoglu, 1995; Ljungqvist and Sargent, 1998), and employers may statistically discriminate against those who have been unemployed for longer (Eriksson and Rooth, 2014; Kroft, Lange, and Notowidigdo, 2013).⁵

A large literature has discussed the difficulty of distinguishing genuine duration dependence from unobserved heterogeneity.⁶ A common resolution has been to assume that there is no variation over time in unobserved heterogeneity, in which case identification can be achieved by observing repeated spells of unemployment for a given individual (Honoré, 1993). Here we instead use a proportional hazards specification in which the identifying assumptions are that genuine duration dependence does not change over time while changes in unobserved heterogeneity are characterized by a simple parametric model. Although this approach imposes some restrictions, we will show that it provides a natural, compelling, and robust way of interpreting the observed data. The resulting new perspective on cyclical unemployment dynamics is the third main contribution of the paper.

In Section 1 we introduce the data that we will use in this analysis based on the number of job seekers each month who report they have been looking for work at various search durations. We describe the accounting identities that will later be used in our full dynamic model and use average values of observable variables over the sample to explain the intuition behind our identification strategy.
We also use these calculations to illustrate why cross-sectional heterogeneity appears to be more plausible than genuine duration dependence as an explanation for the broad features of these data.

⁵ Jarosch and Pilossoph (2015) demonstrated that the quantitative magnitude of statistical discrimination found in these studies could in fact be consistent with the claim that cross-section heterogeneity is the primary explanation for the observed tendency of unemployment-continuation probabilities to rise with duration of unemployment.

⁶ See for example Elbers and Ridder (1982), Heckman and Singer (1984a,b,c), Ridder (1990), Honoré (1993), and van den Berg (2001).

In Section 2 we extend this framework into a full dynamic model in which we represent heterogeneity in terms of two different types of workers at any given date. Type H workers have a higher ex ante probability of exiting unemployment than type L workers, and all workers are also subject to genuine duration dependence. Our model postulates that the number of newly unemployed individuals of either type, as well as the probability for each type of exiting the pool of unemployed at each date, evolve according to unobserved random walks. We show how one can approximate the likelihood function for the observed unemployment data and form an inference about each of the state variables at every date in the sample using an extended Kalman filter.⁷

Empirical results are reported in Section 3. Broken down in terms of inflows versus outflows, we find that variation over time in the inflows of the newly unemployed is more important than outflows from unemployment in accounting for errors in predicting aggregate unemployment at all horizons. Broken down in terms of types of workers, inflow and outflow probabilities for type L workers are more important than those for type H workers, and account for 90% of the uncertainty in predicting unemployment 2 years ahead. In recessions since 1990, shocks to the inflows of type L workers were the most important cause of rising unemployment during the recession.

In Section 4 we provide corroboration of these conclusions using a number of alternative data sources and methods. We show why data that have been pointed to as demonstrating the importance of outflows in fact support the opposite conclusion when properly interpreted. In Section 5 we investigate the robustness of our approach to various alternative specifications, including alternative methods to account for the change in the CPS questionnaire in 1994, allowing for correlation between the innovations of the underlying structural shocks in our model, and the possible effects of time aggregation.
While such factors could produce changes in some of the details of our inference, our overall conclusions (summarized in Section 6) appear to be quite robust.

1 Observable implications of unobserved heterogeneity

The purpose of this section is to use steady-state calculations to explain how our approach allows for both unobserved heterogeneity and genuine duration dependence and provide the intuition behind some of the results that will be found in Section 3 using our full dynamic model.

⁷ Our approach is closely related to that in Hornstein (2012), who used dynamic accounting identities to interpret aggregate panel dynamics in a similar way to that in our paper. However, Hornstein’s model was unidentified: in terms of the discussion of identification in Section 1, his model has 5 unknowns and only 4 equations. As a result, his specification did not allow him to calculate the likelihood function for the observed data or forecasts of unemployment or duration. By contrast, our model generates values for all these along with the optimal statistical inference about the various shocks driving the observed dynamics of unemployment.

The Bureau of Labor Statistics reports for each month $t$ the number of working-age individuals who have been unemployed for less than 5 weeks. Our baseline model is specified at the monthly frequency, leading us to use the notation $U_t^{1}$ for the above BLS-reported magnitude, indicating these individuals have been unemployed for 1 month or less as of month $t$. BLS also reports the number who have been unemployed for between 5 and 14 weeks (or 2-3 months, denoted $U_t^{2.3}$), 15-26 weeks ($U_t^{4.6}$) and longer than 26 weeks ($U_t^{7.+}$). One reason the BLS reports the data in terms of these duration aggregates is to try to minimize the role of measurement error by averaging within broad groups. We will do the same thing in our paper. Our theoretical model will generate a prediction of the number of unemployed at every monthly duration, but we will only use the model’s implications about broad duration aggregates for purposes of calculating the likelihood function of the observed data.⁸

Notwithstanding, when reporting on long-term unemployment, many BLS publications⁹ further break down the $U_t^{7.+}$ category into those unemployed with duration 7-12 months ($U_t^{7.12}$) and those with duration longer than 1 year ($U_t^{13.+}$). Since long-term unemployment is also a major interest in our investigation, we have used the raw CPS micro data from which the usual publicly reported aggregates are constructed to create these last two series for our study.¹⁰ The average values over the full sample of these five observed variables on which our inference will be based are reported in the first row of Table 1.

Our purpose in this paper is to explore what variation in these duration-specific counts across time can tell us about unemployment dynamics. Our focus will be on the following question: of those individuals who are newly unemployed at time $t$, what fraction will still be unemployed at time $t+k$?
We presume that the answer to this question depends not just on aggregate economic conditions over the interval $(t, t+k)$ but also on the particular characteristics of those individuals. Let $w_{it}$ denote the number of people of type $i$ who are newly unemployed at time $t$, where we interpret

$$U_t^{1} = \sum_{i=1}^{I} w_{it}. \tag{1}$$

⁸ In January 2011 the BLS changed the maximum allowable unemployment duration response from 2 years to 5 years. Although this affected the BLS’s own estimate of average duration of unemployment, it did not change the total numbers unemployed by the duration categories we use. This is another reason to favor our approach, which relies only on aggregated data.

⁹ See for example Bureau of Labor Statistics (2011) and Ilg and Theodossiou (2012).

¹⁰ See Appendix A for further details of data construction.

We define $P_{it}(k)$ as the fraction of individuals of type $i$ who were unemployed for one month or less as of date $t-k$ and are still unemployed and looking for work at $t$. Thus the total number of individuals who have been unemployed for exactly $k+1$ months at time $t$ is given by

$$U_t^{k+1} = \sum_{i=1}^{I} w_{i,t-k} P_{it}(k). \tag{2}$$

We first examine what we could infer about unobserved types based only on the historical average values $\bar U^{1}, \bar U^{2.3}, \bar U^{4.6}, \bar U^{7.12}$, and $\bar U^{13.+}$, and then will consider what additional information can be learned from variation over time in these five variables.

1.1 Inference using historical average values alone

Suppose for purposes of this section only that the number of newly unemployed individuals of each type remained constant over time at values $w_i$ and also that the probabilities that individuals of each type remain unemployed in any given month are constants $p_i$ for $i = 1,...,I$. Consider first the case when there is only one type of worker ($I = 1$). Under these assumptions (2) would simplify to $U^{k+1} = w p^{k}$. Given the average observed values for $U^{k}$ for two different values of $k$, we could then estimate the values of $w$ and $p$, for example, $\hat w = \bar U^{1}$ and $\hat p = \bar U^{2}/\bar U^{1}$. As noted above, we regard aggregate measures like $U_t^{2.3}$ as more reliable than a specific estimate such as $U_t^{2}$ that could be constructed from CPS micro data, and therefore use instead $\hat p + \hat p^{2} = \bar U^{2.3}/\bar U^{1}$. The estimated values for $\hat w$ and $\hat p$ that result from this equation are reported in row 2 of Table 1 and plotted in Panel A of Figure 3. The fact that the sum $\bar U^{2.3}$ is significantly lower than $\bar U^{1}$ means that most of the newly unemployed find jobs quickly ($\hat p = 0.48$). But if workers who had been unemployed for more than 3 months also had this same job-finding rate, there would be far fewer workers in the 4-6 month, 7-12, and 13+ categories than we observe in the data, as represented by the black circles in Figure 3.¹¹

Consider next the case when there are $I = 2$ types of workers.
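As a brief aside before the two-type case, the one-type calculation above amounts to solving a quadratic in $\hat p$. The following is a rough sketch under stated assumptions: the helper names are ours, and the ratio 0.7104 is not taken from Table 1 but is simply the value of $\bar U^{2.3}/\bar U^{1}$ implied by the reported $\hat p = 0.48$. It recovers $\hat p$ and shows how quickly a homogeneous model empties the longer duration bins.

```python
import math


def solve_one_type(ratio):
    """Positive root of p + p**2 = ratio, where ratio = U^{2.3} / U^1."""
    return (-1.0 + math.sqrt(1.0 + 4.0 * ratio)) / 2.0


def bin_share(p, first_month, last_month):
    """Stock in months first_month..last_month relative to U^1.

    Uses U^{k+1} = w * p**k, so duration m contributes p**(m - 1).
    """
    return sum(p ** (m - 1) for m in range(first_month, last_month + 1))


p_hat = solve_one_type(0.7104)  # assumed ratio, implied by p_hat = 0.48
print(f"p_hat = {p_hat:.2f}")
for label, lo, hi in [("4-6", 4, 6), ("7-12", 7, 12), ("13+", 13, 48)]:
    # The 48-month cap mirrors the truncation the paper adopts from Hornstein (2012).
    print(f"months {label}: {bin_share(p_hat, lo, hi):.4f} of U^1")
```

Under these assumptions the implied 4-6 month bin is less than 20% of $U^{1}$ and the later bins are much smaller still, which is the sense in which a single constant continuation probability cannot sustain the longer duration categories.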
In this case (2) becomes

$$U^{k+1} = w_L p_L^{k} + w_H p_H^{k}. \tag{3}$$

This equation describes the average number of individuals who have been unemployed for $k+1$ months as the sum of two different functions of $k$, with each of the two functions being fully characterized by two parameters ($w_i$ and $p_i$). The solid red curve in Panel B of Figure 3 plots the first function ($w_L p_L^{k}$), while the dotted blue curve plots the sum. Given observed values of $\bar U^{1}, \bar U^{2.3}, \bar U^{4.6}$, and $\bar U^{7.12}$, we could estimate the four parameters $(w_L, w_H, p_L, p_H)$ to match exactly those four observations, as in Panel B of Figure 3 and row 3 of Table 1.¹² These estimates imply that type H individuals comprise a very high fraction, 78%, of the initial pool of unemployed $U^{1}$. But the unemployment-continuation probability for type H individuals ($p_H = 0.36$) is much lower than for type L ($p_L = 0.85$). Because the type H are likely to find jobs relatively quickly, there are very few type H individuals included in $U^{n}$ for durations $n$ beyond 4 months, as seen in Panel B of Figure 3. The key feature of the observed data (represented by the black dots in Figure 3) that gives rise to this conclusion is the fact that the numbers drop off very quickly at low durations (as most of the type H workers find jobs), but after that much more slowly (as the remaining type L workers continue searching).

¹¹ The black circles are used as a visual device to summarize the interval averages from row 1 of Table 1. Specifically, they are the implied values at the particular durations 1, 3, 5, 9.5, and 15 months from a flexible functional form (equation (6)) that could predict numbers for every duration and whose predictions exactly match the observed average values for the five observations in row 1 of Table 1.

What about when $I > 2$? In this case we can still get a useful characterization of heterogeneity across workers by separating them into two broad types. Specifically, for any true values for $w_i$ and $p_i$ for $i = 1,...,I > 2$ and any observed 4 durations $k_1, k_2, k_3, k_4$, we can find values for the 4 parameters $(\hat w_L, \hat w_H, \hat p_L, \hat p_H)$ to approximate the cross-sectional distribution as the solutions to

$$\hat w_L \hat p_L^{k} + \hat w_H \hat p_H^{k} = \sum_{i=1}^{I} w_i p_i^{k} \quad \text{for } k = k_1, k_2, k_3, k_4. \tag{4}$$

Note that if we only observed 4 duration categories, a mixture of two types is a fully general characterization of heterogeneity in the sense that it can completely describe all the features observable in the data and provides the identical fit to the observed data as would a specification with $I > 2$.¹³ Given measurement error in the CPS data, we do not believe we can reliably use more than 5 observed duration categories, meaning estimation of more than $I = 2$ types is infeasible using these data. In other data sets and in somewhat different settings from ours, Ham and Rea (1987), Van den Berg and van Ours (1996), and Van den Berg and van der Klaauw (2001) tested for the number of types and found $I = 2$ is sufficient to capture heterogeneity in the data sets they analyzed. In this paper we will represent heterogeneity in terms of a mixture of two types, though we view this primarily as a convenient approximation for understanding how heterogeneity can give rise to a slower drop-off in job-exit probabilities at higher durations than would be implied by a homogeneous model as in Panel A of Figure 3. Our primary interest is to characterize how and why this feature of the data changes over time, and a 2-component mixture representation can faithfully capture this.¹⁴

¹² Specifically, the four functions are obtained from equations (10)-(13) below for the special case when the left-hand variables represent historical averages and on the right-hand side we set $w_{it} = w_i$, $P_{it}(k) = p_i^{k}$, and $r_t^{x} = 0$.

¹³ This result can be viewed as an illustration of Theorem 3.1 in Heckman and Singer (1984b).

¹⁴ For example, if in a true population for which $I > 2$ there is an increase in the level of inflows with no change in outflows or relative composition ($w_i^{*} = \lambda w_i$ and $p_i^{*} = p_i$ for $i = 1,...,I$), then a 2-mixture approximation would correctly conclude that only the level of inflows has changed with no change in composition or in outflows; that is, equation (4) would have the solution $\hat w_i^{*} = \lambda \hat w_i$ and $\hat p_i^{*} = \hat p_i$ for $i = H, L$. Likewise if there is a proportional change in all outflow probabilities with no change in composition or inflows in a population mixture of $I$ types ($w_i^{*} = w_i$ and $p_i^{*} = \lambda p_i$ for $i = 1,...,I$), the 2-type approximation (4) would correctly conclude that $\hat w_i^{*} = \hat w_i$ and $\hat p_i^{*} = \lambda \hat p_i$ for $i = H, L$ since for each $k$ the left and right sides of the equation are then multiplied by $\lambda^{k}$.

Although we did not use the fifth data point, $\bar U^{13.+}$, in estimating these parameters, the framework generates a prediction for what that observation would be.¹⁵ This is reported in the last entry of row 3 of Table 1 to be 621,000, which is quite close to the observed value of 664,000. The feature of the data that produced this result is that the observed numbers fall off at close to a constant exponential rate once we get beyond 4 months, as the simple mixture model would predict.

¹⁵ Following Hornstein (2012) we truncate all calculations at 48 months in equation (14). Most of the models considered in this paper imply essentially zero probability of an unemployment spell exceeding 4 years in duration.

Alternatively, we could equally well describe the observed averages using a model in which there is only genuine duration dependence (GDD). Suppose that an individual who has been unemployed for $\tau$ months has a probability $p(\tau)$ of still being unemployed the following month. We can always write this in the form

$$p(\tau) = \exp(-\exp(d_\tau))$$

for $d_\tau$ an arbitrary function of $\tau$. For example, we could fit the 5 observations in the first row of Table 1 perfectly if we used $w = \bar U^{1}$ along with a 4-parameter representation for $d_\tau$ such as¹⁶

$$d_\tau = \delta_0 + \delta_1 \tau + \delta_2 \tau^{2} + \delta_3 \tau^{3}. \tag{5}$$

A large number of empirical studies have assumed Weibull durations, essentially corresponding to $\delta_2 = \delta_3 = 0$. The values for $\delta_j$ that would exactly fit the historical averages are reported in row 4 of Table 1 and the implied function $p(\tau)$ is plotted in panel A of Figure 4. Note that in contrast to the popular Weibull assumption and most theoretical models, the fitted function (5) is not monotonic.

¹⁶ Specifically, we calculate $U^{k+1} = w\,p(1)p(2)\cdots p(k)$ and find the values of $w, \delta_0, \delta_1, \delta_2, \delta_3$ to match the observed values in row 1 of Table 1.

If we were willing to restrict the functional form of GDD to the Weibull case, we could also interpret the historical averages as resulting from a combination of unobserved heterogeneity and GDD. Suppose we assumed proportional hazards¹⁷ and represent the probability that an individual of type $i$ who has been unemployed for $\tau$ months will still be unemployed the following month as

$$p_i(\tau) = \exp\{-\exp[x_i + d_\tau]\} \tag{6}$$

with implied unemployment counts

$$U^{k+1} = \sum_{i=L,H} w_i\, p_i(1) p_i(2) \cdots p_i(k). \tag{7}$$

The value of $x_i$ for $i = H, L$ reflects cross-sectional heterogeneity in unemployment-continuation probabilities and $d_\tau$ captures genuine duration dependence. As noted by Katz and Meyer (1990, p. 992), this double-exponential functional form is a convenient way to implement a proportional hazards specification so as to guarantee a positive hazard,¹⁸ a feature that will be very helpful for the generalization in the following section in which we will allow for variation of $x_{it}$ over time. Suppose we were willing to model GDD using a one-parameter function, say $d_\tau = \delta(\tau - 1)$. Then we could find a value for the 5 parameters $w_L, w_H, x_L, x_H, \delta$ so as to fit the 5 time-series averages $\bar U^{1}, \bar U^{2.3}, \bar U^{4.6}, \bar U^{7.12}$, and $\bar U^{13.+}$ exactly. These values are reported in row 5 of Table 1. The implied value for $\delta$ is close to zero, and the other parameters are close to those for the pure cross-sectional heterogeneity specification of row 3. Thus for this particular parametric example, we would conclude that cross-sectional heterogeneity is much more important than genuine duration dependence in accounting for why observed unemployment-continuation probabilities rise with duration of unemployment. The feature of the data that gave rise to this conclusion is that the 4-parameter pure heterogeneity model gives a very good prediction of all five observations.

¹⁷ Alvarez, Borovičková, and Shimer (2016) concluded that proportional hazards is not consistent with the observed data. However, their identifying assumption was that the heterogeneous characteristics of individual $i$ do not change even if the individual is observed in different decades. The assumption that employers’ demands for the specific skills of individual $i$ do not change over time seems to us extremely implausible. By contrast, our specification in the following section allows both an individual’s identification with a particular group as well as the group’s average unemployment-continuation probabilities to be continually changing, an approach that gives a proportional-hazards specification considerably more flexibility.

¹⁸ Consider an individual $i$ who has been unemployed for $\tau$ months as of the beginning of month $t$ and let the hazard within month $t$ be $\lambda_{i,t,\tau} = \exp(x_{it})\exp(d_\tau)$ where the exponentiation is a device to guarantee that the hazard is positive for any $x_{it}$ and $d_\tau$. The meaning of the hazard is that if we divide month $t$ into $n$ subintervals, the probability that individual $i$ exits unemployment in the interval $(s, s+1/n)$ is $\lambda_{i,t,\tau}/n + o(1/n)$, from which the probability that the individual is still unemployed at the beginning of month $t+1$ is

$$\lim_{n\to\infty} [1 - \lambda_{i,t,\tau}/n + o(1/n)]^{n} = \exp(-\lambda_{i,t,\tau}) = \exp[-\exp(x_{it})\exp(d_\tau)].$$

1.2 Inference using changes over time

Next consider what we can discover using time-series variation in the observed aggregates. Suppose we repeat the above exercises only using data during the unemployment bulge around the Great Recession. Row 7 of Table 1 and Panel C of Figure 3 show the results if we tried to explain these numbers entirely in terms of unobserved heterogeneity. The implied value for the unemployment-continuation probability for type L individuals, $p_L = 0.89$, is only slightly higher than the value 0.85 fit to the full historical sample. The reason is that the function $\bar U^{n}$ drops off after $n = 4$ months at only a slightly slower rate than it did historically. However, we would infer that the inflow of new type L individuals, $w_L = 1,065$, is much higher than the historical average value of 690, in order to account for the fact that $\bar U^{n}$ is now dropping off after 4 months from a much higher base. We again find that the 4-parameter model does a reasonable job of anticipating the fifth unused data point.

If we instead tried to explain the recent averages purely in terms of GDD, we would use the parameter values from row 8 of Table 1. These again could fit the data perfectly, albeit relying on a function with odd oscillations (see panel B of Figure 4). Although it is mathematically possible to describe the data with this equation, it would be difficult to motivate a theory of why GDD should have changed shape in this way.
It requires for example a steeper initial slope to the curve in panel B of Figure 4 when economic conditions worsened, corresponding to the claim that the scarring associated with unemployment is more severe during a recession. But this is directly contradicted by the experimental finding of Kroft, Lange, and Notowidigdo (2013) that potential employers pay less attention to applicants’ duration of unemployment when the labor market is weaker. We will produce additional evidence in Section 4 below on predictability of changes in unemployment that would also be very hard to interpret based on any theory of cyclically changing GDD.

These concerns notwithstanding, would it be possible to allow for both an unrestricted nonmonotonic functional form for GDD as well as unobserved heterogeneity? The answer is definitely yes once we take account of changes over time. Suppose for example we were to pool the observations from the first row of Table 1 (the full-sample averages) together with those in row 6 (behavior around the Great Recession), giving us a total of 10 observations. If we took the view that the unobserved heterogeneity parameters may have changed over the cycle but that the GDD function $d_\tau$ in (6) is time-invariant, we would then be able to generalize $d_\tau$ to be a function of $\tau$ determined by two parameters, say $\delta_1$ and $\delta_2$, and use the ten observations to infer ten unknowns (values of $w_H, w_L, x_H, x_L$ for the two subsamples along with the parameters $\delta_1$ and $\delta_2$). Generalizing a little further, if we use observations across 4 different subsamples we could infer values of $w_H, w_L, x_H, x_L$ for each subsample along with a completely unrestricted nonmonotonic GDD function as in (5).¹⁹ In fact, if we were able to use all five observations on $U_t^{1}, U_t^{2.3}, U_t^{4.6}, U_t^{7.12}, U_t^{13.+}$ for every date $t$, we could even allow for some modest variation over time in the GDD function $d_{\tau t}$, and indeed such a specification will be included in the general results reported in Section 5.

Restricting time variation in GDD has been used in some studies of micro labor data such as van den Berg and van Ours (1996) and Alvarez, Borovičková, and Shimer (2016). Our paper differs from any previous study in either the micro or macro labor literature in focusing on aggregate cyclical variation in the implications of unobserved heterogeneity. Unobserved attributes like specific skills and how much those skills are demanded surely interact with changes in aggregate labor-market conditions, so allowing for changes during recessions as in the difference between panels B and C of Figure 3 is potentially quite fundamental.
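To make the comparison between panels B and C of Figure 3 concrete, the sketch below uses only the type L values quoted in Section 1.2 (inflow 690 and continuation probability 0.85 for the full sample; 1,065 and 0.89 around the Great Recession). It assumes the steady-state pure-heterogeneity form $w_L p_L^{k}$ for the number of type L individuals with duration $k+1$ months and a 48-month truncation. The inflow figures are in the units of Table 1, which is not reproduced here, and the function name is illustrative rather than anything defined in the paper.

```python
def type_l_stock(w_l, p_l, first_month, last_month):
    """Steady-state number of type L unemployed with duration in the given range.

    Uses U^{k+1} = w_L * p_L**k, so duration m contributes w_L * p_L**(m - 1).
    """
    return sum(w_l * p_l ** (m - 1) for m in range(first_month, last_month + 1))


full_sample = {"w_l": 690.0, "p_l": 0.85}       # values quoted in the text
great_recession = {"w_l": 1065.0, "p_l": 0.89}  # values quoted in the text

for label, lo, hi in [("7-12", 7, 12), ("13+", 13, 48)]:
    base = type_l_stock(first_month=lo, last_month=hi, **full_sample)
    bulge = type_l_stock(first_month=lo, last_month=hi, **great_recession)
    print(f"months {label}: full sample {base:.0f}, "
          f"Great Recession {bulge:.0f}, ratio {bulge / base:.2f}")

# Hypothetical counterfactual: raise only the inflow, keep the full-sample p_L.
inflow_only = type_l_stock(1065.0, 0.85, 13, 48)
print(f"13+ with higher inflow only: {inflow_only:.0f}")
```

The counterfactual line is our own illustration, not a calculation from the paper: in this steady-state form a change in $w_L$ alone scales every duration bin by the same factor, while a change in $p_L$ alters how quickly the stock declines with duration.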
Documenting the potential role of cross-sectional heterogeneity in cyclical unemployment fluctuations is one of the key original contributions of our paper.

We have used steady-state calculations in this section primarily to explain the intuition for where the identification is coming from. Nevertheless, it turns out that the key conclusions of the above steady-state calculations (that the majority of newly unemployed individuals can be described as type H who find jobs quickly, that dynamic sorting based on unobserved heterogeneity appears to be much more important than genuine duration dependence in explaining why a longer-term unemployed individual is less likely to exit unemployment, and that the key driver of economic recessions is an increased inflow of newly unemployed type L individuals) will also turn out to characterize what we will find as we now turn to a richer dynamic model.

19 More generally, let $h(t,\tau)$ denote the observed average unemployment exit probability at date $t$ for individuals who have been unemployed for $\tau$ months as of that date. Under the assumption of proportional hazards and time-invariant GDD this can be written as $h(t,\tau) = \theta(t,\tau)\delta(\tau)$ where $\delta(\tau)$ captures GDD and $\theta(t,\tau)$ time-varying heterogeneity. Then the changes over time in cross-sectional heterogeneity are identified nonparametrically from the data: $\theta(t,\tau)/\theta(t-1,\tau) = h(t,\tau)/h(t-1,\tau)$. If we observe $h(t,\tau)$ at 5 discrete values of $\tau$ and represent heterogeneity with the 4-parameter function $\theta(t,\tau) = [w_{Ht}(1-p_{Ht}) + w_{Lt}(1-p_{Lt})]/(w_{Ht} + w_{Lt})$ as in (4), then the values of $w_{Ht}, w_{Lt}, p_{Ht}, p_{Lt}$ can be recovered and the function $\delta(\tau)$ is nonparametrically identified in the sense that 5 unrestricted values $\delta(\tau_1),\ldots,\delta(\tau_5)$ can be recovered from the data.

2 Dynamic formulation

Our dynamic model is a generalization of (6) in which outflow probabilities for each type of individual change over time. We assume that for type $i$ workers who have already been unemployed for $\tau$ months as of time $t-1$, the fraction who will still be unemployed at $t$ is given by

$$p_{it}(\tau) = \exp[-\exp(x_{it} + d_\tau)] \quad \text{for } \tau = 1,2,3,\ldots \tag{8}$$

where $d_\tau$ is a third-order polynomial as in equation (5).20 We also allow inflows for each type to vary over time, letting $w_{it}$ change each month. Note the identifying assumption is that the contribution of genuine duration dependence $d_\tau$, while of the completely general functional form used in Figure 4, does not vary over time.21 We now specify a state-space model where the dynamic behavior of the observed vector $y_t = (U^{1}_t, U^{2.3}_t, U^{4.6}_t, U^{7.12}_t, U^{13.+}_t)'$ is determined as a nonlinear function of latent dynamic variables: the inflows and outflow probabilities for unemployed individuals with unobserved heterogeneity. Due to the nonlinear nature of the resulting model, we draw inference on the latent variables using the extended Kalman filter.

20 We found that the numerical search to find the maximum likelihood estimates performed best when we expressed this function in terms of scaled Chebyshev polynomials: $d_\tau = \tilde\delta_1((\tau-1)/48) + \tilde\delta_2[2((\tau-1)/48)^2 - 1] + \tilde\delta_3[4((\tau-1)/48)^3 - 3((\tau-1)/48)]$.

21 In fact our approach can also allow for modest time variation. In the robustness analysis in Section 5 we replace $d_\tau$ with $d_{t\tau}$, which changes with $t$ in a restricted way.

2.1 State-space representation

Our baseline model assumes that the elements of $\xi_t = (w_{Ht}, w_{Lt}, x_{Ht}, x_{Lt})'$ each evolve as random walks, e.g.,

$$w_{Ht} = w_{H,t-1} + \varepsilon^{w}_{Ht}. \tag{9}$$

A random walk is by far the most common assumption in dynamic latent-variable or time-varying-parameter models as it has proven to be a flexible and parsimonious way to adapt inference to a variety of sources of changing conditions or possible structural breaks.22 Note also that equation (9) is an unambiguous improvement over the steady-state calculations described in the previous section (and invoked in the majority of previous studies in this literature), and includes the steady-state formulation as a special case when the variance of $\varepsilon^{w}_{Ht}$ is zero. We have also experimented with a model in which we assume AR(1) dynamics for the latent variables with autoregressive coefficients estimated by maximum likelihood. We found the coefficient estimates to be very close to unity and the resulting inference very similar to that reported for our baseline random walk specification.

The intuition for how the extended Kalman filter works is as follows. We will have formed an inference about the value of $\xi_t$ based on the data we observed through date $t$. For example, we could use the steady-state calculations of Section 1 on a small initial sample of observed $y_1,\ldots,y_{t_0}$ to form an initial inference about $w_{H,t_0}, w_{L,t_0}, x_{H,t_0}, x_{L,t_0}$, which would imply values for $U^{n}_{t_0}$ for every $n$ from equation (3) based on the average values for that initial sample.23 A random walk means that we enter period $t+1$ initially expecting it to look like $t$. This would imply predicted values for the five variables observed at $t+1$. If $U^{13.+}_{t+1}$ is higher than predicted, it would be an indication that $p_L$ has gone up (since there are essentially no type H individuals included in $U^{13.+}_t$). If $U^{2.3}_{t+1}$ is higher than predicted even with this higher value for $p_{L,t+1}$, it means that $p_{H,t+1}$ has likely gone up as well. If $U^{1}_{t+1}$ is higher than $U^{1}_t$, we know that either $w_L$ or $w_H$ must have gone up. Given the 5 new observations in $y_{t+1}$, we have more than enough information to update an inference about all 4 elements of $\xi_{t+1}$. Proceeding sequentially through the observed sample in this way, we can form an inference about $\xi_t$ for every date and at the same time improve our inference about the previous history. The final revised inference about the state at date $t$ based on seeing the full sample of data through date $T$ is referred to as the smoothed inference, denoted $\hat\xi_{t|T}$.

22 See for example Baumeister and Peersman (2013).

23 Our estimates below start with $t_0$ = 1976:M1 and set $\hat\xi_{t_0|t_0}$ to the solution to the steady-state model over the period 1972:M1-1976:M1. Our approach allows the true value $\xi_{t_0}$ to differ from this estimate with a very large variance, so that the initial estimate has a very limited contribution. See Appendix B for details.
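The predict-and-update intuition described above can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions, not the authors' implementation (which is detailed in their Appendix B): the observation function `h` is taken to depend on the current state alone rather than on 48 months of state history, the Jacobian is obtained by finite differences, and the names `numerical_jacobian` and `ekf_step` are hypothetical.

```python
import numpy as np

def numerical_jacobian(h, xi, eps=1e-6):
    """Central-difference Jacobian of h at xi (rows: observations, cols: states)."""
    xi = np.asarray(xi, dtype=float)
    cols = []
    for k in range(xi.size):
        step = np.zeros_like(xi)
        step[k] = eps
        cols.append((h(xi + step) - h(xi - step)) / (2.0 * eps))
    return np.column_stack(cols)

def ekf_step(xi, P, y, h, Sigma, R):
    """One predict/update cycle for a random-walk state xi_t = xi_{t-1} + eps_t.

    Simplifying assumption: h maps the current state to the observables.
    """
    # Predict: a random walk expects next period to look like this one,
    # but with more uncertainty (the shock variance Sigma is added).
    xi_pred = np.asarray(xi, dtype=float)
    P_pred = P + Sigma
    # Linearize the observation function around the predicted state.
    H = numerical_jacobian(h, xi_pred)
    # Update: revise the state in proportion to the forecast error.
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    xi_new = xi_pred + K @ (y - h(xi_pred))
    P_new = (np.eye(xi_pred.size) - K @ H) @ P_pred
    return xi_new, P_new
```

With a linear `h` this reduces to the ordinary Kalman filter, which gives a convenient sanity check before plugging in a nonlinear observation function.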

Another key detail of our approach is that we allow for the possibility that unemployment counts are all contaminated by error. The durations in CPS are in part self-reported and respondents make a variety of errors. We assume that each element of $y_t$ has an associated measurement error $r_t = (r^{1}_t, r^{2.3}_t, r^{4.6}_t, r^{7.12}_t, r^{13.+}_t)'$. Our identification assumption is that the measurement error is white noise, meaning that the inference is only adjusted for changes in the observed variables that prove to be persistent.24 The observation equations can then be written as follows25,

$$U^{1}_t = \sum_{i=H,L} w_{it} + r^{1}_t \tag{10}$$

$$U^{2.3}_t = \sum_{i=H,L} [w_{i,t-1}P_{it}(1) + w_{i,t-2}P_{it}(2)] + r^{2.3}_t \tag{11}$$

$$U^{4.6}_t = \sum_{i=H,L} \sum_{k=3}^{5} [w_{i,t-k}P_{it}(k)] + r^{4.6}_t \tag{12}$$

$$U^{7.12}_t = \sum_{i=H,L} \sum_{k=6}^{11} [w_{i,t-k}P_{it}(k)] + r^{7.12}_t \tag{13}$$

$$U^{13.+}_t = \sum_{i=H,L} \sum_{k=12}^{47} [w_{i,t-k}P_{it}(k)] + r^{13.+}_t \tag{14}$$

where

$$P_{it}(j) = p_{i,t-j+1}(1)\, p_{i,t-j+2}(2) \cdots p_{it}(j). \tag{15}$$

24 Under this assumption, $r_t$ will also capture any idiosyncratic and purely transient factors in unemployment inflows and continuation probabilities.

25 As in the steady-state example in Section 1, we take 4 years to be the maximum unemployment duration.

We can arrive at the likelihood function for the observed data $\{y_1,\ldots,y_T\}$ by assuming that the measurement errors are independent Normal,26 where $R_1$, $R_{2.3}$, $R_{4.6}$, $R_{7.12}$ and $R_{13.+}$ are the standard deviations of $r^{1}_t$, $r^{2.3}_t$, $r^{4.6}_t$, $r^{7.12}_t$ and $r^{13.+}_t$ respectively:

$$r_t \sim N(0, R)$$

$$\underbrace{R}_{5\times 5} = \begin{pmatrix} R_1^2 & 0 & 0 & 0 & 0 \\ 0 & R_{2.3}^2 & 0 & 0 & 0 \\ 0 & 0 & R_{4.6}^2 & 0 & 0 \\ 0 & 0 & 0 & R_{7.12}^2 & 0 \\ 0 & 0 & 0 & 0 & R_{13.+}^2 \end{pmatrix}.$$

26 The Normality assumption of measurement errors has often been adopted in the literature of unemployment hazards; see for example Abbring, van den Berg and van Ours (2001) and van den Berg and van der Klaauw (2001). Moreover, the identical Kalman filter equations that emerge from an assumption of Normality can also be motivated using a least-squares criterion; see for example Hamilton (1994a, Chapter 13).

Let $\xi_t$ be the vector $(w_{Lt}, w_{Ht}, x_{Lt}, x_{Ht})'$ and $\varepsilon_t = (\varepsilon^{w}_{Lt}, \varepsilon^{w}_{Ht}, \varepsilon^{x}_{Lt}, \varepsilon^{x}_{Ht})'$. Our assumption that the latent factors evolve as random walks would be written as

$$\underbrace{\xi_t}_{4\times 1} = \underbrace{\xi_{t-1}}_{4\times 1} + \underbrace{\varepsilon_t}_{4\times 1} \tag{16}$$

$$\underbrace{\varepsilon_t}_{4\times 1} \sim N(\underbrace{0}_{4\times 1}, \underbrace{\Sigma}_{4\times 4})$$

$$\underbrace{\Sigma}_{4\times 4} = \begin{pmatrix} (\sigma^{w}_L)^2 & 0 & 0 & 0 \\ 0 & (\sigma^{w}_H)^2 & 0 & 0 \\ 0 & 0 & (\sigma^{x}_L)^2 & 0 \\ 0 & 0 & 0 & (\sigma^{x}_H)^2 \end{pmatrix}.$$

In Section 5 we will also report results for a specification in which the shocks are allowed to be contemporaneously correlated.

Since the measurement equations (10)-(14) are a function of $\{\xi_t, \xi_{t-1}, \ldots, \xi_{t-47}\}$, the state equation should describe the joint distribution of $\xi_t$'s from $t-47$ to $t$, where $I$ and $0$ denote a $(4\times 4)$ identity and zero matrix, respectively:

$$\underbrace{\begin{pmatrix} \xi_t \\ \xi_{t-1} \\ \xi_{t-2} \\ \vdots \\ \xi_{t-46} \\ \xi_{t-47} \end{pmatrix}}_{192\times 1} = \underbrace{\begin{pmatrix} I & 0 & 0 & 0 & \cdots & 0 & 0 & 0 \\ I & 0 & 0 & 0 & \cdots & 0 & 0 & 0 \\ 0 & I & 0 & 0 & \cdots & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & I & 0 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & I & 0 \end{pmatrix}}_{192\times 192} \underbrace{\begin{pmatrix} \xi_{t-1} \\ \xi_{t-2} \\ \xi_{t-3} \\ \vdots \\ \xi_{t-47} \\ \xi_{t-48} \end{pmatrix}}_{192\times 1} + \underbrace{\begin{pmatrix} \varepsilon_t \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix}}_{192\times 1}. \tag{17}$$
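To make the mapping from the state history to the observables concrete, the following Python sketch evaluates the noise-free right-hand sides of (10)-(14) using (8), (15), and the scaled Chebyshev form of $d_\tau$ from footnote 20. It is an illustrative reading of those equations rather than the authors' code: the function names `gdd` and `predicted_counts`, the array layout (row `k` holds the values dated $t-k$), and any parameter values passed in are assumptions made for this example.

```python
import numpy as np

MAX_LAG = 47  # longest cohort lag appearing in equation (14)

def gdd(delta, max_tau=MAX_LAG):
    """d_tau for tau = 1..max_tau, scaled Chebyshev form of footnote 20."""
    u = (np.arange(1, max_tau + 1) - 1) / 48.0
    return delta[0] * u + delta[1] * (2 * u**2 - 1) + delta[2] * (4 * u**3 - 3 * u)

def predicted_counts(w, x, delta):
    """Noise-free right-hand sides of equations (10)-(14).

    w, x: arrays of shape (48, 2). Row k holds the values dated t-k
    (hypothetical layout); the two columns are the two worker types.
    """
    d = gdd(delta)
    # survivors[k]: people who entered at t-k and are still unemployed at t,
    # summed over the two types.
    survivors = np.empty(MAX_LAG + 1)
    survivors[0] = w[0].sum()  # equation (10): this month's inflows
    for k in range(1, MAX_LAG + 1):
        P = np.ones(2)
        for tau in range(1, k + 1):
            # p_{i,t-k+tau}(tau) from equation (8); date t-k+tau is row k-tau.
            P *= np.exp(-np.exp(x[k - tau] + d[tau - 1]))
        survivors[k] = (w[k] * P).sum()  # w_{i,t-k} * P_it(k), equation (15)
    return np.array([
        survivors[0],            # U^1
        survivors[1:3].sum(),    # U^{2.3}:  k = 1, 2
        survivors[3:6].sum(),    # U^{4.6}:  k = 3..5
        survivors[6:12].sum(),   # U^{7.12}: k = 6..11
        survivors[12:48].sum(),  # U^{13.+}: k = 12..47
    ])
```

Setting all three duration-dependence parameters to zero and holding `w` and `x` constant reproduces a steady state in which each cohort shrinks geometrically, which is a simple way to check the indexing.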

2.2 Estimation

Our system takes the form of a nonlinear state space model in which the state transition equation is given by (17) and the observation equation by (10)-(14), where $P_{it}(j)$ is given by (15) and $p_{it}(\tau)$ by (8). Our baseline model has 12 parameters to estimate, namely the diagonal terms in the variance matrices $\Sigma$ (4 terms) and $R$ (5 terms) and the 3 parameters governing genuine duration dependence, $\delta_1$, $\delta_2$ and $\delta_3$. Because the observation equation is nonlinear in $x_{it}$, the extended Kalman filter can be used to approximate the likelihood function for the observed data $\{y_1,\ldots,y_T\}$ and form an inference about the unobserved latent variables $\{\xi_1,\ldots,\xi_T\}$, as detailed in Appendix B. Inferences about historical values for $\xi_t$ provided below correspond to full-sample smoothed inferences, denoted $\hat\xi_{t|T}$.

3 Results for the baseline specification

We estimated parameters for the above nonlinear state-space model using seasonally adjusted monthly data on $y_t = (U^{1}_t, U^{2.3}_t, U^{4.6}_t, U^{7.12}_t, U^{13.+}_t)'$ for $t$ = January 1976 through June 2017. Figure 5 plots smoothed estimates for $p_{it}(1)$, the probability that a newly unemployed worker of type $i$ at $t-1$ will still be unemployed at $t$. These average 0.35 for type H individuals and 0.82 for type L individuals, close to the average calculations of 0.36 and 0.85, respectively, that we arrived at in row 2 of Table 1 when we were explaining the intuition behind our identification strategy based on steady-state calculations. The probabilities of type H individuals remaining unemployed rise during the early recessions but are less cyclical in the last two recessions. By contrast, the continuation probabilities for type L individuals rise in all recessions. The gap between the two probabilities increased significantly over the last 20 years. Figure 6 plots inflows of individuals of each type into the pool of newly unemployed.
Type H workers constitute 77% on average of the newly unemployed, again close to the value of 78% expected on the basis of the simple steady-state calculations in row 5 of Table 1. Inflows of both types increase during recessions. New inflows of type H workers declined immediately at the end of every recession, but inflows of type L workers continued to rise after the recessions of 1990-91 and 2001 and were still at above-average levels 3 years after the end of the Great Recession. This changing behavior of type L workers' inflows appears to be another important characteristic of jobless recoveries. The Great Recession is unique in that the inflows of type L workers as well as the continuation probabilities reached higher levels than at any earlier date in our data set.

The combined implications of these cyclical patterns are summarized in Figure 7. Before the Great Recession, the share in total unemployment of type L workers fluctuated between 30% and 60%, falling during expansions and rising during and after recessions. But during the Great Recession, the share of type L workers skyrocketed to over 80%. The usual recovery pattern of a falling share of type L workers has been very slow in the aftermath of the Great Recession.

While the inflows of type H workers show a downward trend since the 1980s, those of type L workers exhibit an upward trend. This difference in the low frequency movements of the two series provides a new perspective on the secular decrease in the inflows to unemployment and the secular rise in the average duration of unemployment. Abraham and Shimer (2001) and Aaronson, Mazumder and Schechter (2010) showed that the substantial rise in average duration of unemployment between mid-1980 and mid-2000 can be explained by the CPS redesign, the aging of the population and the increased labor force attachment of women. Bleakley, Ferris and Fuhrer (1999) concluded that the downward trend in inflows can be explained by reduced churning during this period. Figure 6 shows that the downward trend in the inflows is mainly driven by type H workers. The increased share of type L inflows contributed to the rise in the average duration of unemployment since the 1980s. This suggests that unobserved heterogeneity is important in accounting for low frequency dynamics in the labor market as well as those for business cycle frequencies.

Table 2 provides parameter estimates for our baseline model.
The estimated genuine duration dependence parameters, $\tilde\delta_1$, $\tilde\delta_2$, and $\tilde\delta_3$, are consistent with the scarring hypothesis: the longer someone from either group has been unemployed, provided the duration has been 11 months or less, the more likely it is that person will be unemployed next month. Once someone has been unemployed for more than a year, it becomes more likely as more months accumulate that they will either find a job or exit the labor force in any given month. This non-monotonic behavior of genuine duration dependence is displayed graphically in Panel A of Figure 8.

As seen in Panel B of Figure 8, our estimates of genuine duration dependence imply relatively modest changes in continuation probabilities for type L workers for most horizons. And while the implications for long-horizon continuation probabilities for type H workers may appear more significant, they are empirically irrelevant, since the probability that type H workers would be unemployed for more than 12 months is so remote. To gauge the overall significance of genuine duration dependence, we calculated the unemployment level predicted by our model for each date $t$ in the sample if the values of $\tilde\delta_1$, $\tilde\delta_2$, and $\tilde\delta_3$ were all set to zero, and found it would only be about 4% lower on average than the value predicted by our baseline model. Thus although the values of $\tilde\delta_1$ and $\tilde\delta_3$ are statistically significant, they play a relatively minor role compared to ex ante heterogeneity in accounting for differences in continuation probabilities by duration of unemployment.

3.1 Variance decomposition

Many previous studies have tried to summarize the importance of different factors in determining unemployment by looking at correlations between the observed unemployment rate and the steady-state unemployment rate predicted by each factor of interest alone; see for example Fujita and Ramey (2009) and Shimer (2012). One major benefit of our framework is that it delivers a much cleaner answer to this question in the form of variance decompositions, a familiar method in linear VARs for measuring how much each shock contributes to the mean squared error (MSE) of an $s$-period-ahead forecast of a magnitude of interest.27

27 See for example Hamilton (1994a, Section 11.5).

Our model can be used to account for the difference between the unemployment realization at time $t+s$ and a forecast based on values of the state vector only through date $t$ in terms of the sequence of shocks between $t$ and $t+s$, denoted $\varepsilon_{t+1}, \varepsilon_{t+2}, \ldots, \varepsilon_{t+s}$. It is convenient to work with a linear approximation to that decomposition, which we show in Appendix C takes the form

$$y_{t+s} - \hat y_{t+s|t} \simeq \sum_{j=1}^{s} [\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})]\, \varepsilon_{t+j} \tag{18}$$

for $\Psi_{s,j}(\cdot)$ a known $(5\times 4)$-valued function of $\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j}$. The mean squared error matrix associated with an $s$-period-ahead forecast of $y_{t+s}$ is then

$$E(y_{t+s} - \hat y_{t+s|t})(y_{t+s} - \hat y_{t+s|t})' = \sum_{j=1}^{s} [\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})]\, \Sigma\, [\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})]' \tag{19}$$

$$= \sum_{j=1}^{s} \sum_{m=1}^{4} \Sigma_m [\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})e_m][\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})e_m]'$$

for $e_m$ column $m$ of the $(4\times 4)$ identity matrix and $\Sigma_m$ the row $m$, column $m$ element of $\Sigma$. Thus the contribution of innovations in type L workers' inflows (the first element of $\varepsilon_t = (\varepsilon^{w}_{Lt}, \varepsilon^{w}_{Ht}, \varepsilon^{x}_{Lt}, \varepsilon^{x}_{Ht})'$) to the MSE of the $s$-period-ahead linear forecast error of total unemployment, $\iota_5' y_t$, is given by

$$\iota_5' \sum_{j=1}^{s} \Sigma_1 [\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})e_1][\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})e_1]'\, \iota_5 \tag{20}$$

where $\iota_5$ denotes a $(5\times 1)$ vector of ones. Note that as in the constant-parameter linear case, the sum of the contributions of the 4 different structural shocks would be equal to the MSE of an $s$-period-ahead linear forecast of unemployment in the absence of measurement error. However, in our case the linearization is taken around time-varying values of $\{\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j}\}$. We can evaluate equation (20) at the smoothed inferences $\{\hat\xi_{t|T}, \hat\xi_{t-1|T}, \ldots, \hat\xi_{t-47+j|T}\}$ and then take the average value across all dates $t$ in the sample. This gives us an estimate of the contribution of type L workers' inflows to unemployment fluctuations over a horizon of $s$ months:

$$q_{s,1} = T^{-1} \sum_{t=1}^{T} \iota_5' \sum_{j=1}^{s} \Sigma_1 [\Psi_{s,j}(\hat\xi_{t|T}, \hat\xi_{t-1|T}, \ldots, \hat\xi_{t-47+j|T})e_1][\Psi_{s,j}(\hat\xi_{t|T}, \hat\xi_{t-1|T}, \ldots, \hat\xi_{t-47+j|T})e_1]'\, \iota_5. \tag{21}$$

Consequently $q_{s,1} / \sum_{m=1}^{4} q_{s,m}$ would be the share of the first factor's contribution to unemployment volatility at horizon $s$.

Figure 9 shows the contribution of each factor to the mean squared error in predicting overall unemployment as a function of the forecasting horizon. If one is trying to forecast unemployment one month ahead, uncertainties about future inflows of type H and of type L workers are equally important. However, the farther one is looking into the future, the more important becomes uncertainty about what is going to happen to type L workers. If one is trying to predict one or two years into the future, the single most important source of uncertainty is inflows of new type L workers, followed by uncertainty about their outflows.
Much of the MSE associated with a 2-year-ahead forecast of unemployment comes from not knowing when the next recession will begin or the current recession will end. For this reason, the MSE associated with 2-year-ahead forecasts is closely related to what some researchers refer to as the “business cycle frequency” in a spectral decomposition. If we are interested in the key factors that change as the economy moves into and out of recessions, inflows and outflows for type L workers are most important. We will provide additional evidence on this point in Section 3.2.
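Given the matrices $\Psi_{s,j}$ (whose construction is left to Appendix C and is not reproduced here), the shares in (20)-(21) amount to a few array operations. The Python sketch below is a hedged illustration: the function names and the array layout are hypothetical, and it relies only on the identity $\iota_5'[\Psi e_m][\Psi e_m]'\iota_5 = (\iota_5'\Psi e_m)^2$.

```python
import numpy as np

def mse_contributions(Psi, sigma2):
    """Contribution of each of the 4 shocks to the MSE of total unemployment, as in (20).

    Psi: array of shape (s, 5, 4) holding Psi_{s,j} for j = 1..s (assumed given).
    sigma2: length-4 array with the diagonal elements of Sigma.
    """
    # iota_5' Psi_{s,j} e_m: effect of shock m at step j on the sum of the 5 counts.
    total_response = Psi.sum(axis=1)  # shape (s, 4)
    return sigma2 * (total_response ** 2).sum(axis=0)

def horizon_shares(Psi_by_date, sigma2):
    """Average (20) across dates as in (21), then normalize to shares q_{s,m} / sum_m q_{s,m}."""
    q = np.mean([mse_contributions(Psi, sigma2) for Psi in Psi_by_date], axis=0)
    return q / q.sum()
```

Because the shares are normalized by the sum over the four shocks, they add to one by construction, mirroring the statement that the four contributions exhaust the forecast MSE in the absence of measurement error.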

Panel B of Figure 9 breaks these contributions separately into inflows and outflows. Both inflows and outflows are important. However, the uncertainty about future inflows is more important in accounting for the error we would make in predicting total unemployment, accounting for about 60% of the MSE for any forecasting horizon.

3.2 Historical decomposition

A separate question of interest is how much of the realized variation over some historical episode came from particular structural shocks. As in (18) our model implies an estimate of the contribution of shocks to a particular observed episode, namely

$$y_{t+s} - \hat y_{t+s|t} \simeq \sum_{j=1}^{s} [\Psi_{s,j}(\hat\xi_{t|T}, \hat\xi_{t-1|T}, \ldots, \hat\xi_{t-47+j|T})]\, \hat\varepsilon_{t+j|T} \tag{22}$$

where $\hat\varepsilon_{t+j|T} = \hat\xi_{t+j|T} - \hat\xi_{t+j-1|T}$. From this equation, we can estimate for example the contribution of $\varepsilon^{w}_{L,t+1}, \varepsilon^{w}_{L,t+2}, \ldots, \varepsilon^{w}_{L,t+s}$ (the shocks to $w_L$ between $t+1$ and $t+s$) to the deviation of the level of unemployment at $t+s$ from the value predicted on the basis of initial conditions at $t$:

$$\iota_5' \sum_{j=1}^{s} [\Psi_{s,j}(\hat\xi_{t|T}, \hat\xi_{t-1|T}, \ldots, \hat\xi_{t-47+j|T})]\, e_1 \hat\varepsilon_{t+j|T}'\, e_1. \tag{23}$$

Figure 10 shows the contribution of each component to the realized unemployment rate in the last five recessions. In each panel, the solid line (labeled $U_{base}$) gives the change in the unemployment rate relative to the value at the start of the episode that would have been predicted on the basis of initial conditions. Typically an increase in the inflow of type L workers (whose contribution to total unemployment is indicated by the starred red curves) is the most important reason that unemployment rises during a recession. A continuing increase of these inflows even after the recession was over was an important factor in the jobless recoveries from the 1990 and 2001 recessions.
During the first 8 months of the Great Recession, changes in inflows and outflows of type L individuals were of equal importance in accounting for rising unemployment.28 But our model concludes that new inflows of type L individuals were the most important factor contributing to rising unemployment after July of 2008.

28 Because of the length and severity of the recession of 2007-2009, the linearization (22) around the January 2007 values on which Panel E is based becomes poorer as we try to predict values for 2010. This is why the “$U_{all}$” line in Panel E falls below the actual path of unemployment in the case of this recession. As a robustness check, we also calculated the exact nonlinear contribution of each component in isolation of the others to the actual observed unemployment rate and the picture is very similar. The advantage of the linear decomposition is that the sum of the individual contributions exactly equals the aggregate, whereas the same is not true in a nonlinear dynamic representation.

4 Corroboration using other data sources and methods

Our conclusions are different from those of some other prominent studies. In this section we examine some of the evidence considered by other researchers and show why, properly interpreted, it adds further support to our main findings.

4.1 Direct evidence on the importance of the level of inflows

Let $u_t$ denote the unemployment rate during month $t$ and $u^{1}_t$ the number of newly unemployed as a fraction of the labor force. Shimer (2012) calculated an unemployment-outflow rate $f_t$ (which he thought of as the job-finding rate) from

$$u_{t+1} = \exp(-f_t)u_t + u^{1}_{t+1}. \tag{24}$$

He calculated an unemployment-inflow rate $x_t$ (which he thought of as the employment-exit rate) from his equation (5), which differs from the natural measure $x_t \simeq u^{1}_t$ by making adjustments for within-period inflows and outflows and by carefully handling data after the 1994 survey redesign. We used Shimer's method to generate updated series for $f_t$ and $x_t$. Quarterly averages of these monthly series are plotted in the top two panels of Figure 11, and seem to give the impression that most of the cyclical action comes from outflows rather than inflows. Shimer proposed to approximate the unemployment rate as29

$$u_t \simeq \frac{x_t}{x_t + f_t}. \tag{25}$$

29 Equation (25) would be exact if the unemployment rate did not change during month $t$. We have confirmed Shimer's finding that the correlation between the right and left sides of (25) is 0.98. However, this high correlation masks the fact that the levels are very different, with the left side on average 1.5 percentage points lower than the right. We have subtracted 1.5 percentage points from values predicted by equation (25) to give them a better fit to the data. Of course this only shifts the level of the series and does not affect the magnitude of its fluctuations.

We can then ask what the path of unemployment since 2007:Q4 would have been if $f_t$ had stayed fixed at its 2007:Q4 value while quarterly averages of $x_t$ varied as actually observed. Shimer interpreted the resulting series as the component of unemployment explained by inflows. Alternatively, we could fix $x_t$ at its 2007:Q4 value and let $f_t$ vary, giving a component corresponding to the contribution of outflows. These two series are plotted in the upper-left panel of Figure 12.30 This calculation seems to imply that most of the fluctuations in unemployment since 2007 should be attributed to outflows, with very little role for inflows. Hall and Schulhofer-Wohl (2017) reached a similar conclusion using an expression describing the unemployment rate as a function of current outflows and past inflows.

Shimer's decomposition might seem to be a clean way to think about this question. However, it shuts down the key channel that drives the cyclical behavior of unemployment, which is the dynamic interaction between inflows and outflows. One can see this clearly when we reframe the question using methods more familiar to macroeconomists, namely summarizing the properties of forecasts of the series. How much of the unanticipated change in unemployment since the Great Recession is due to changes in $f_t$ that we could not have forecast in 2007:Q4 and how much to unanticipated changes in $x_t$? One can answer such a question using a quarterly bivariate VAR for $(\Delta f_t, \Delta x_t)$:

$$\Delta f_t = c_f + \phi_{ff,1}\Delta f_{t-1} + \cdots + \phi_{ff,8}\Delta f_{t-8} + \phi_{fx,1}\Delta x_{t-1} + \cdots + \phi_{fx,8}\Delta x_{t-8} + \varepsilon_{ft} \tag{26}$$

$$\Delta x_t = c_x + \phi_{xf,1}\Delta f_{t-1} + \cdots + \phi_{xf,8}\Delta f_{t-8} + \phi_{xx,1}\Delta x_{t-1} + \cdots + \phi_{xx,8}\Delta x_{t-8} + \varepsilon_{xt}. \tag{27}$$

The counterfactual paths for $f_t$ and $x_t$ used in the upper-left panel of Figure 12 would be numerically identical to the historical decomposition from a VAR if we assumed that all coefficients on the right-hand side were zero and that the residuals $\varepsilon_{ft}$ and $\varepsilon_{xt}$ were uncorrelated with each other.31

The assumption that the coefficients on lags are truly zero is easily tested.
A test of the null hypothesis $\phi_{fx,1} = \cdots = \phi_{fx,8} = 0$ based on OLS estimation of equation (26) for $t$ = 1969:Q3 to 2016:Q4 produces an F(8,173) statistic of 10.02, an overwhelming rejection with a p-value below $10^{-6}$. In other words, a great deal of the observed variation in outflows could have been predicted on the basis of earlier values of inflows. Furthermore, rather than assume that $\varepsilon_{ft}$ and $\varepsilon_{xt}$ are uncorrelated, we can adopt the more common VAR procedure of using a Cholesky decomposition of their covariance matrix with $\Delta f_t$ ordered first. This Cholesky ordering gives as much of the benefit to Shimer's view as possible, with 100% of the correlated component between the residuals attributed to the effect of outflows alone. A variance decomposition of the estimated VAR (26)-(27) finds that even with this ordering, 59% of the variation in outflows over a 3-year horizon can be attributed to inflows. The top right panel of Figure 12 plots the component of $f_t$ that the bivariate VAR attributes to changes in inflows since 2007. Future outflow rates from unemployment are predictable from what we currently see happening to inflows. This is the central point of our paper, and the central reason some earlier research underestimated the importance of inflows.

30 Shimer fixed $f_t$ and $x_t$ at their average values over the full sample rather than the values at a fixed point in time as we have done here. The basic result is the same however one does this calculation. Our choice gives a better fit to the post-2007 data and helps clarify the relation between Shimer's calculations and what we consider a better approach to be described below.

31 That is, in this case the contribution of $\varepsilon_{f,t+1},\ldots,\varepsilon_{f,t+s}$ to $f_{t+1},\ldots,f_{t+s}$ is exactly equal to the values $f_{t+1},\ldots,f_{t+s}$ themselves while the contribution of $\varepsilon_{f,t+1},\ldots,\varepsilon_{f,t+s}$ to $x_{t+1},\ldots,x_{t+s}$ is exactly zero.

4.2 Direct evidence on the importance of the composition of inflows

We moreover have claimed in this paper that what matters is not simply the level of inflows but also the composition, in particular new inflows of those we have described as type L. Shimer attempted to assess the importance of changes in the composition of unemployment by replacing $f_t$ in (25) with

$$f_t = \frac{\sum_{j=1}^{J} \theta_{tj} f_{tj}}{\sum_{j=1}^{J} \theta_{tj}}$$

where $j = 1,\ldots,J$ are different observable characteristics of the unemployed and $\theta_{tj}$ the share of each among the unemployed. He then performed a similar exercise, first fixing $f_{tj} = \bar f_j$ at its historical average to find the contribution of changes in observable characteristics, and concluded that these contributed little. Hall and Schulhofer-Wohl (2017) examined how the compositional changes in the inflows influence the path of the unemployment rate using a similar exercise.
Again, these exercises miss the central point: the core issue should be the question of forecasting.

The most useful observable variable that signals important changes in the composition of the newly unemployed is the level of new claims for unemployment insurance. Not all of the newly unemployed are eligible for benefits, with some states making it difficult to receive benefits for voluntary quits. And not everyone who is eligible applies for benefits. When someone applies for unemployment insurance they are revealing useful personal information about the circumstances and expected duration of their unemployment.

Let $Q_t$ denote the number of individuals who file new claims for unemployment insurance in the last week of quarter $t$ and $U^{1}_t$ the BLS estimate of the number of individuals who became newly unemployed in the last month of quarter $t$. We added $\Delta q_t$ for $q_t = Q_t/U^{1}_t$ as a third variable to the VAR (26)-(27). The hypothesis that the coefficients on $\Delta q_{t-j}$ in equation (26) are all zero leads to an F(8,165) statistic of 7.42, again rejecting the null hypothesis with a p-value below $10^{-6}$. Changes in the composition of inflows have huge additional predictive power for future outflows beyond that contained in the level of inflows alone. If there is an increase in the share of newly unemployed people who have a low job finding probability (as captured in this regression by an increase in the share who file for unemployment benefits), then they are likely to stay unemployed longer and to bring down the average job finding probability going forward. The level and composition of inflows together explain 76% of the variance of Shimer's outflow measure at a 3-year horizon (still ordering outflows first in the VAR to try to give them the strongest effect possible). The historical decomposition of outflow rates that comes from the three-variable VAR is plotted in the lower right panel of Figure 12. Again, much of the evidence that other researchers have treated as signaling the importance of the outflow rate from unemployment is in fact driven by changes in the level and composition of inflows.

This empirical evidence demonstrates that an approximation such as equation (25) inherently masks the dynamic channel through which the compositional change in the inflows persistently affects the outflow probability.
A more satisfying way to see the contribution of inflows and outflows to the unemployment rate itself is not with an approximation such as (25) but instead by adding the unemployment rate $u_t$ directly as a fourth variable to the VAR. With $f_t$ ordered first and $u_t$ last, changes in the level and composition of new inflows into unemployment (namely, innovations in $x_t$ and $q_t$) explain 86% of the variation in the unemployment rate at a 3-year horizon, while outflows $f_t$ account for only 13%.32

32 The remaining 1% comes from the fact that $u_t$ is not an exact linear combination of $f_t$ and $x_t$, with the difference interpreted by the VAR as an innovation in $u_t$.
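The exclusion tests reported in Sections 4.1 and 4.2 are standard F tests comparing a restricted and an unrestricted OLS regression. The Python sketch below shows the mechanics on synthetic data only; the series, the coefficient used to generate them, and the function names are assumptions for illustration and do not reproduce the paper's data or results. One useful check is the degrees of freedom: 1969:Q3 to 2016:Q4 is 190 quarters, and equation (26) has 17 regressors (a constant plus 8 lags of each variable), which gives the 173 denominator degrees of freedom quoted above; adding 8 lags of a third variable gives 165.

```python
import numpy as np

def lagged_design(series_list, n_lags):
    """Constant plus lags 1..n_lags of each series, aligned with dates n_lags..T-1."""
    T = len(series_list[0])
    cols = [np.ones(T - n_lags)]
    for s in series_list:
        for lag in range(1, n_lags + 1):
            cols.append(s[n_lags - lag:T - lag])
    return np.column_stack(cols)

def exclusion_f_test(dep, own, other, n_lags=8):
    """F statistic for H0: all coefficients on lags of `other` are zero."""
    y = dep[n_lags:]
    X_u = lagged_design([own, other], n_lags)  # unrestricted regression
    X_r = lagged_design([own], n_lags)         # restricted: lags of `other` dropped

    def rss(X):
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        return np.sum((y - X @ beta) ** 2)

    df1, df2 = n_lags, len(y) - X_u.shape[1]
    F = ((rss(X_r) - rss(X_u)) / df1) / (rss(X_u) / df2)
    return F, df1, df2

# Synthetic example (NOT the paper's data): 198 quarters so that 190 remain
# after losing 8 observations to lags; d_f depends on the first lag of d_x.
rng = np.random.default_rng(0)
d_x = rng.normal(size=198)
d_f = np.zeros(198)
for t in range(1, 198):
    d_f[t] = -1.0 * d_x[t - 1] + rng.normal()

F, df1, df2 = exclusion_f_test(d_f, d_f, d_x)
print(f"F({df1},{df2}) = {F:.2f}")
```

A large F statistic here simply reflects the dependence built into the synthetic series; with actual data the same function would be applied to the differenced outflow and inflow rates.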

4.3 Reproducing the model’s key findings with direct data summaries

Finally, it is possible to adapt Shimer’s methods to see directly the features of the data that led to the conclusions of our dynamic structural model. Suppose we were to focus just on $u_t^{4.+}$, the fraction of the labor force who have been unemployed for 4 months or longer. This is an instructive exercise because according to the empirical estimates from our dynamic model, most of the individuals in this group are type L.³³ We could think of “inflows” into this group as the individuals who are in this group at $t$ but were not in this group 3 months ago: $x_t^{4.+} = u_t^{4.6}$. And we can measure the outflow probability for this group just as we did in equation (24):

$$u_{t+3}^{4.+} = (1 - F_t^{4.+})\,u_t^{4.+} + u_{t+3}^{4.6}. \tag{28}$$

Quarterly averages of these inflow and outflow rates for long-term unemployment are plotted in the second two panels of Figure 11. These completely reverse the impression from the top two panels.

We can also relate these numbers directly to the estimates from our structural model by restating the three-month probability of exiting long-term unemployment ($F_t^{4.+}$) as a one-month continuation probability $p_t^{4.+} = (1 - F_t^{4.+})^{1/3}$. This series is compared with quarterly averages of our estimated type L monthly unemployment-continuation probabilities in the lower left panel of Figure 11. The two series are quite similar. The reason is that, according to our model, most of the individuals who have been unemployed for 4 months or longer are likely to be type L, so that we can nearly read the unemployment-exit probability for this group directly from a simple summary statistic like $F_t^{4.+}$.³⁴ We can also see in Panel F of Figure 11 the clear basis in the raw data for our conclusion in the last panel of Figure 10 about the source of the increase in long-term unemployment during the Great Recession. In the first half of the recession, deterioration in both inflow and outflow rates for type L contributed.
Our algorithm draws this inference because we find in the data both $x_t^{4.+}$ increasing and $F_t^{4.+}$ decreasing during this period. Later in the recession, $x_t^{4.+}$ continued to rise even as $F_t^{4.+}$ stabilized, warranting our model’s inference that the key development in the later part of the recession was new inflows of type L workers.³⁵

³³ For $p_H$ and $p_L$ around 0.36 and 0.85, respectively, $p_H^4 = 0.02$ while $p_L^4 = 0.52$. See also Panel B of Figure 3.
³⁴ Of course, our series is smoother (and we believe more accurate) than the nonparametric estimate because our inference allows for measurement error in the raw data, makes joint use of all the observed unemployment categories, and allows for the fact that not all of those unemployed for longer than 4 weeks are type L.
³⁵ Other measures of the unemployment-outflow rate for the long-term unemployed are discussed in Appendix D.
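The direct calculation in this subsection can be sketched in a few lines: solve equation (28) for the three-month exit probability and convert it to a one-month continuation probability. The function names and the input values below are hypothetical and chosen only for illustration; they are not taken from the data.

```python
def three_month_exit_probability(u4plus_t, u4plus_t3, u46_t3):
    """Solve equation (28) for F_t^{4.+}.

    u4plus_t  : share unemployed 4+ months at date t
    u4plus_t3 : share unemployed 4+ months at date t+3
    u46_t3    : share unemployed 4-6 months at date t+3 (the new inflows)
    """
    return 1.0 - (u4plus_t3 - u46_t3) / u4plus_t

def monthly_continuation_probability(F):
    """Restate a three-month exit probability as a one-month continuation probability."""
    return (1.0 - F) ** (1.0 / 3.0)

# Hypothetical values (percent of the labor force), for illustration only.
u4plus_t, u46_t3 = 2.0, 0.9
u4plus_t3 = 0.512 * u4plus_t + u46_t3      # built so that 1 - F = 0.512
F = three_month_exit_probability(u4plus_t, u4plus_t3, u46_t3)
p = monthly_continuation_probability(F)
print(round(F, 3), round(p, 3))

# Arithmetic behind footnote 33: four-month survival for each type.
print(round(0.36 ** 4, 2), round(0.85 ** 4, 2))
```

The last line shows why the group unemployed 4 months or longer is dominated by type L: with continuation probabilities near 0.36 and 0.85, very few type H individuals remain after four months.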

4.4 Who are the type L individuals?

And where did these new inflows come from? Darby, Haltiwanger and Plant (1986) argued that counter-cyclicality in the average unemployment duration mainly comes from the increased inflow of prime-age workers suffering permanent job loss who are likely to have low job-finding probabilities. Bednarzik (1983) also noted that permanently separated workers are more likely to experience a long duration of unemployment, while Fujita and Moscarini (forthcoming) showed that the unemployed who are likely to experience long-term unemployment spells tend to be those who are not recalled to work by their previous employers.

This interpretation certainly fits the facts for the Great Recession. In March 2009 there were 1.38 million newly unemployed individuals who reported permanent separation as their reason for unemployment, 454,000 more than in March 2008. In March 2009 there were 3.47 million newly unemployed individuals altogether, 642,000 more than the previous year. This means that 454/642 = 71% of the increase in $U_t^1$ between 2008:M3 and 2009:M3 was due to permanent separations. The increased inflow of type L individuals is clearly correlated with an increase in permanent separations.

In Panel A of Figure 13 we compare our estimate of the number of newly unemployed type L workers each month to the number of newly unemployed individuals who gave permanent separation from their previous job as the reason.³⁶ The two series were arrived at using different data and different methodologies but exhibit remarkably similar dynamics. Panel B compares the total number of those unemployed who gave permanent separation as the reason with our estimate of the total number of unemployed type L workers, for which the correspondence is even more striking.

To obtain further evidence on the role of observed and unobserved worker characteristics, Ahn (2016) fit models like the one developed here to subsets of workers sorted based on observable characteristics.
She replaced our observation vector $y_t$ based on aggregate unemployment numbers with $y_{jt} = (U_{jt}^1, U_{jt}^{2.3}, U_{jt}^{4.6}, U_{jt}^{7.12}, U_{jt}^{13.+})'$, where $U_{jt}^{2.3}$ for example denotes the number of workers with observed characteristic $j$ who have been unemployed for 2-3 months, the idea being that within the group $j$ there are new inflows ($w_{jHt}$ and $w_{jLt}$) and outflows ($p_{jHt}$ and $p_{jLt}$) of two unobserved types of workers. Of particular interest for the present discussion are the results when $j$ corresponds to one of the 5 reasons why the individual was looking for work. Panel A of Figure 14 displays Ahn’s estimated values for new inflows of type L workers for each of the categories as well as the sum $\sum_{j=1}^{5} \hat{w}_{jLt|T}$. Our series $\hat{w}_{Lt|T}$ inferred from aggregate data is also plotted again for comparison. The sum of micro estimates is very similar to our aggregate estimates, and the individual micro components reveal clearly that those we have described as type L workers primarily represent a subset of people who were either permanently separated from their previous job or are looking again for work after a period of having been out of the labor force.

³⁶ Permanent separations include permanent job losers and persons who completed temporary jobs. The separate series, permanent job losers and persons who completed temporary jobs, are publicly available from 1994, but their sum (permanent separations) is available back to 1976.

Ahn (2016) also calculated the models’ inferences about the total number of type L individuals in any given observable category $j$ who were unemployed in month $t$. These are plotted in Panel B of Figure 14. Here the correspondence between the aggregate inference and the sum of the micro estimates is even more compelling, as is the conclusion that type L unemployed workers represent primarily a subset of those permanently separated from their old jobs or re-entering the labor force.

However, none of this is meant to imply that those we identify as type L and those who lose their jobs due to permanent separations are one and the same group; the evidence in Kroft et al. (2016) clearly shows that they are not. Not all individuals who are permanently separated get classified by our approach as type L, and not all of those we classify as type L lost their jobs due to permanent separations. Ahn (2016) found that permanent job losers who are type L account for around 50% of the aggregate type L unemployment and drive most of its counter-cyclicality. The second most important group is type L re-entrants to the labor force.
Considering that permanent job losers are likely to leave the labor force and later re-enter it, there is a high chance that the type L re-entrants were permanent job losers before leaving the labor force. In addition, type L people are found disproportionately more among permanent job losers than in other categories. Type L individuals account for one third of the newly unemployed permanent job losers, whereas they comprise less than one fifth of the inflows in other categories.

5 Robustness checks

Column 1 of Table 3 summarizes some of the key conclusions that emerge from our baseline analysis. The table breaks down the MSE of a forecast of the overall level of unemployment at 3-month, 1-year, and 2-year forecast horizons into the fraction of the forecast error that is attributable to various shocks. In our baseline model, inflows account for more than half the variance at all horizons. Inflows of type L workers are most important but the outflows of type L workers and the inflows of type H workers are also crucial at a 3-month horizon. At a 1- or 2-year horizon, shocks to inflow and outflow probabilities for type L workers are the most important factors. The table also reports asymptotic standard errors for each of these magnitudes. Subsequent columns show how these conclusions would change under a number of alternative specifications, as discussed below and with more details in Appendix E.

5.1 Accounting for the structural break in the CPS

A redesign in the CPS in 1994 introduced a structural break with which any user of these data has to deal. Our baseline estimates adjusted the unemployment duration data using differences between rotation groups 1 and 5 and groups 2-4 and 6-8 in the CPS micro data as described in Appendix A. Column 2 of Table 3 reports the analogous variance decompositions when we instead use Hornstein’s (2012) data adjustment.³⁷ This produces very little change in our numbers. In column 3 we use only data subsequent to the redesign in 1994, making no adjustment to the reported BLS figures. This reduces the estimated contribution of inflows of type L workers at shorter horizons, but preserves our main finding that for business-cycle frequencies, changes for type L workers account for most of the fluctuations in unemployment. We obtained similar results using the full data set from 1976-2013 with no adjustments for the 1994 redesign (column 4).

5.2 Time-varying genuine duration dependence

Our baseline specification assumed that the parameters $\delta_1$, $\delta_2$, and $\delta_3$ characterizing genuine duration dependence in equations (5) and (8) do not change over time.
Column 5 of Table 3 reports results for a more general specification

$$d_{\tau t} = \tilde{\delta}_{1t}\left(\frac{\tau-1}{48}\right) + \tilde{\delta}_{2t}\left[2\left(\frac{\tau-1}{48}\right)^2 - 1\right] + \tilde{\delta}_{3t}\left[4\left(\frac{\tau-1}{48}\right)^3 - 3\left(\frac{\tau-1}{48}\right)\right]$$

where $\tilde{\delta}_{jt} = \tilde{\delta}_j^{(1)}$ in normal months and $\tilde{\delta}_{jt} = \tilde{\delta}_j^{(2)}$ if the national unemployment rate is above 6.5%, times when the labor market is slack and it is likely that many job losers automatically became eligible for extended UI benefits.³⁸ Adding 3 new parameters ($\tilde{\delta}_1^{(2)}, \tilde{\delta}_2^{(2)}, \tilde{\delta}_3^{(2)}$) to the model results in an increase in the log likelihood of 46.2, but does not change any of our core conclusions.

³⁷ Note that although we report the log likelihood and Schwarz’s (1978) Bayesian criterion in rows 2 and 3 of Table 3, the values for columns 2-4 are not comparable with the others due to a different definition of the observable data vector $y_t$.
³⁸ Vishwanath (1989) and Blanchard and Diamond (1994) developed theoretical models in which genuine duration dependence could be linked to market tightness. See Whittaker and Isaacs (2014) for a detailed discussion of the conditions that can trigger extended unemployment benefits.

5.3 Allowing for correlated shocks

Our baseline specification assumed that the shocks to $w_{Lt}$, $w_{Ht}$, $p_{Lt}$ and $p_{Ht}$ were mutually uncorrelated. The estimated residuals from our baseline model, $\hat{\varepsilon}_{t|T}$, have some correlation,³⁹ but it is small. No correlation is above 0.4 in absolute value and the correlation between $\varepsilon_{Lt}^w$ and $\varepsilon_{Ht}^x$ is only 0.04. We estimated a generalization of the model to allow for nonzero correlations deriving from a factor structure for the innovations, $\varepsilon_t = \lambda F_t + u_t$, where $F_t \sim N(0,1)$, $\lambda$ is a $(4 \times 1)$ vector of factor loadings, and $u_t$ is a $(4 \times 1)$ vector of mutually uncorrelated idiosyncratic components with variance matrix $E(u_t u_t') = Q$:

$$E(\varepsilon_t \varepsilon_t') = \lambda\lambda' + Q$$

$$Q = \begin{bmatrix} (q_H^w)^2 & 0 & 0 & 0 \\ 0 & (q_L^w)^2 & 0 & 0 \\ 0 & 0 & (q_H^x)^2 & 0 \\ 0 & 0 & 0 & (q_L^x)^2 \end{bmatrix}.$$

Although the correlations are small, this specification provides a statistically significant improvement in the log likelihood (column 6 of Table 3).

³⁹ The correlation between the $i$ and $j$ elements of $\hat{\varepsilon}_{t|T}$ was calculated as
$$\frac{\sum_{t=1}^{T} [\hat{\varepsilon}_{t|T}(i) - \bar{\varepsilon}(i)][\hat{\varepsilon}_{t|T}(j) - \bar{\varepsilon}(j)]}{\sqrt{\sum_{t=1}^{T} [\hat{\varepsilon}_{t|T}(i) - \bar{\varepsilon}(i)]^2}\,\sqrt{\sum_{t=1}^{T} [\hat{\varepsilon}_{t|T}(j) - \bar{\varepsilon}(j)]^2}}$$
for $\bar{\varepsilon}(i) = T^{-1}\sum_{t=1}^{T} \hat{\varepsilon}_{t|T}(i)$.

Note that equation (22) continues to hold in this more general setting, and we could still calculate the magnitude in (23), which measures what would happen if $\{\varepsilon_{L,t+j}^w\}_{j=1}^{s}$ were to have followed its inferred historical path with $\{\varepsilon_{H,t+j}^w, \varepsilon_{L,t+j}^x, \varepsilon_{H,t+j}^x\}_{j=1}^{s}$ all zero. This calculation would no longer have a clean statistical interpretation as the answer to a forecasting question when the $\varepsilon$’s are correlated, because in the latter case knowledge of the value of one of the $\varepsilon$’s would cause one to revise the contemporaneous forecast of the others. Nevertheless, we can still calculate the magnitude in (23) for the factor model as a check on whether the quantitative importance of type L inflows is in any way an artifact of having assumed uncorrelated shocks. The lower right panel of Figure 10 plots the quantitative contribution calculated in this way for each of the four shocks during the Great Recession. The graph is virtually identical to that in the lower left from our baseline model.

We can also calculate the separate statistical contribution of each of the 5 uncorrelated shocks in the factor model, which consist of the aggregate factor $F_t$ and the four elements of $u_t$. In this case, the variance decomposition (19) becomes

$$\begin{aligned}
E(y_{t+s} - \hat{y}_{t+s|t})(y_{t+s} - \hat{y}_{t+s|t})'
&= \sum_{j=1}^{s} [\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})](\lambda\lambda' + Q)[\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})]' \\
&= \sum_{j=1}^{s} [\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})]\lambda\lambda'[\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})]' \\
&\quad + \sum_{j=1}^{s}\sum_{m=1}^{4} Q_m [\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})e_m][\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})e_m]'
\end{aligned}$$

for $Q_m$ the row $m$, column $m$ element of $Q$. The contributions of each of the five shocks are summarized in column 6 of Table 3. The aggregate factor by itself accounts for 64% of the MSE of a 1-year-ahead forecast of unemployment. But the component of inflows of type L workers that is uncorrelated with the aggregate factor would still by itself account for 31% of the MSE, far more important than any other idiosyncratic shock. We conclude that the importance of inflows of type L workers is robust to assumptions about correlations between the shocks.
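As a rough numerical illustration of the factor structure in this subsection, the sketch below builds $\lambda\lambda' + Q$ from made-up loadings, computes the implied shock correlations, and checks that the forecast-error variance splits into one factor term plus four idiosyncratic terms. All numbers are hypothetical, and the matrices standing in for $\Psi_{s,j}$ are random placeholders rather than the model-implied ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical loadings and idiosyncratic standard deviations for the four
# shocks, in the same order as the diagonal of Q in the text.
lam = np.array([0.02, 0.03, 0.01, 0.02])
q = np.array([0.04, 0.03, 0.02, 0.04])
Q = np.diag(q ** 2)
Sigma = np.outer(lam, lam) + Q              # E(eps eps') = lam lam' + Q

# Correlation matrix of the shocks implied by the factor structure.
sd = np.sqrt(np.diag(Sigma))
corr = Sigma / np.outer(sd, sd)

# Placeholders for Psi_{s,j}, j = 1..s (random here, for illustration only).
s, n_obs = 12, 5
Psis = [rng.normal(size=(n_obs, 4)) for _ in range(s)]

total = sum(Psi @ Sigma @ Psi.T for Psi in Psis)
factor_part = sum(Psi @ np.outer(lam, lam) @ Psi.T for Psi in Psis)
# Psi e_m is column m of Psi, so each idiosyncratic shock adds Q_m * (col)(col)'.
idio_parts = [
    sum(Q[m, m] * np.outer(Psi[:, m], Psi[:, m]) for Psi in Psis)
    for m in range(4)
]

print(np.round(corr, 3))
print(np.allclose(total, factor_part + sum(idio_parts)))
```

Because the five components are mutually uncorrelated, their contributions add up exactly, which is what allows a column such as column 6 of Table 3 to report a separate share for the aggregate factor and for each idiosyncratic shock.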
5.4 Time aggregation

Focusing on monthly transition probabilities understates flows into and out of unemployment since someone who loses their job in week 1 of a month but finds a new job in week 2 would never be counted as having been unemployed. We discuss some of the literature on this in Appendix E, and explain our reasons for favoring the specification in our baseline model. But there we also show how one could formulate our model assuming that time evolves weekly with potential transitions into and out of unemployment within the month. As seen in column 7 of Table 3, the weekly formulation implies a modestly smaller role for inflows than our baseline model. This is to be expected, as allowing for shorter employment spells by construction imputes some people who exit unemployment by obtaining new jobs but then lose them again before the month is over. Note however that the weekly model in column 7 has a slightly worse fit to the data than the baseline monthly model in column 1.

6 Conclusion

People who have been unemployed for longer periods than others have dramatically different probabilities of exiting unemployment, and these relative probabilities change significantly over the business cycle. Even when one conditions on observable characteristics, unobserved differences across people and the circumstances under which they came to be unemployed are crucial for understanding these features of the data. We have shown how the time series of unemployment levels by different duration categories can be used to infer inflows and outflows from unemployment for workers characterized by unobserved heterogeneity. In contrast to other methods, our approach uses the full history of unemployment data to summarize inflows and outflows from unemployment and allows us to make formal statistical statements about how much of the variance of unemployment is attributable to different factors as well as identify the particular changes that characterized individual historical episodes.

In normal times, around three quarters of those who are newly unemployed find jobs quickly. But in contrast to the conclusions of Hall (2005) and Shimer (2012), we find that more than half the variance in unemployment comes from shocks to the number of newly unemployed.
A key feature of economic recessions is newly unemployed individuals who have significantly lower job-finding probabilities. Our inferred values for the size of this group exhibit remarkably similar dynamics to separate measures of the number of people who permanently lose their jobs. We conclude that recessions are characterized by a change in the circumstances under which people become unemployed that accounts for the greater difficulty in finding new jobs during a recession.


References

Aaronson, Daniel, Bhashkar Mazumder, and Shani Schechter (2010). "What is behind the rise in long-term unemployment," Economic Perspectives, QII: 28-51.
Abbring, Jaap H., Gerard J. van den Berg, and Jan C. van Ours (2001). "Business Cycles and Compositional Variation in U.S. Unemployment," Journal of Business & Economic Statistics, 19(4): 436-448.
Abowd, John M., and Arnold Zellner (1985). "Estimating Gross Labor-Force Flows," Journal of Business and Economic Statistics, 3(3): 254-283.
Abraham, Katharine G. and Robert Shimer (2002). "Changes in Unemployment Duration and Labor Force Attachment," in Alan Krueger and Robert Solow, eds., The Roaring Nineties, pp. 367-420. Russell Sage Foundation.
Acemoglu, Daron (1995). "Public Policy in a Model of Long-term Unemployment," Economica, 62: 161-178.
Ahn, Hie Joo (2016). "The Role of Observed and Unobserved Heterogeneity in the Duration of Unemployment Spells," Working paper, Federal Reserve Board.
Alvarez, Fernando, Katarína Borovičková, and Robert Shimer (2016). "Decomposing Duration Dependence in a Stopping Time Model," NBER Working Papers 22188, National Bureau of Economic Research, Inc.
Bachmann, Ronald, and Mathias Sinning (2016). "Decomposing the Ins and Outs of Cyclical Unemployment," Oxford Bulletin of Economics and Statistics, 78(6): 853-876.
Baker, Michael (1992). "Unemployment Duration: Compositional Effects and Cyclical Variability," American Economic Review, 82(1): 313-21.
Barnichon, Regis and Andrew Figura (2015). "Labor Market Heterogeneity and the Aggregate Matching Function," American Economic Journal: Macroeconomics, 7(4): 222-49.
Baumeister, Christiane and Gert Peersman (2013). "Time-Varying Effects of Oil Supply Shocks on the U.S. Economy," American Economic Journal: Macroeconomics, 5(4): 1-29.
Bednarzik, Robert (1983). "Layoffs and Permanent Job Losses: Workers’ Traits and Cyclical Patterns," Monthly Labor Review, September 1983, pp. 3-12, Bureau of Labor Statistics.
Blanchard, Olivier J. and Peter Diamond (1994). "Ranking, Unemployment Duration, and Wages," The Review of Economic Studies, 61(3): 417-434.
Bleakley, Hoyt, Ann E. Ferris, and Jeffrey C. Fuhrer (1999). "New Data on Worker Flows During Business Cycle," New England Economic Review, July/August 1999: 49-76.
Bureau of Labor Statistics (2011). "How Long before the Unemployed Find Jobs or Quit Looking?", Issues in Labor Statistics, 2011(1): 1-6.
Darby, Michael R., John C. Haltiwanger, and Mark W. Plant (1986). "The Ins and Outs of Unemployment: The Ins Win," NBER Working Paper # 1997, National Bureau of Economic Research, Inc.
Davis, Steven J., R. Jason Faberman, and John Haltiwanger (2006). "The Flow Approach to Labor Markets: New Data Sources and Micro-Macro Links," Journal of Economic Perspectives, 20(3): 3-26.
Den Haan, Wouter J. (2000). "The Comovement between Output and Prices," Journal of Monetary Economics, 46: 3-30.
Elbers, Chris, and Geert Ridder (1982). "True and Spurious Duration Dependence: The Identifiability of the Proportional Hazard Model," Review of Economic Studies, 49(3): 403-409.
Elsby, Michael W. L., Ryan Michaels, and Gary Solon (2009). "The Ins and Outs of Cyclical Unemployment," American Economic Journal: Macroeconomics, 1(1): 84-110.
Elsby, Michael W. L., Bart Hobijn, and Ayşegül Şahin (2010). "The Labor Market in the Great Recession," Brookings Papers on Economic Activity, Spring 2010: 1-56.
Elsby, Michael W. L., Bart Hobijn, and Ayşegül Şahin (2013). "Unemployment Dynamics in the OECD," Review of Economics and Statistics, 95(2): 530-548.
Elsby, Michael W. L., Bart Hobijn, Ayşegül Şahin, and Robert G. Valletta (2011). "The Labor Market in the Great Recession: An Update to September 2011," Brookings Papers on Economic Activity, Fall 2011: 353-371.
Eriksson, Stefan, and Dan-Olof Rooth (2014). "Do Employers Use Unemployment as a Sorting Criterion When Hiring? Evidence from a Field Experiment," American Economic Review, 104(3): 1014-1039.
Faberman, R. Jason and Marianna Kudlyak (2017). "The Intensity of Job Search and Search Duration," Working Paper, Federal Reserve Bank of Chicago.
Frazis, Harley J., Edwin L. Robison, Thomas D. Evans, and Martha A. Duff (2005). "Estimating Gross Flows Consistent with Stocks in the CPS," Monthly Labor Review, September 2005: 3-9.
Fujita, Shigeru (2011). "Dynamics of Worker Flows and Vacancies: Evidence from the Sign Restriction Approach," Journal of Applied Econometrics, 26: 89-121.
Fujita, Shigeru and Garey Ramey (2009). "The Cyclicality of Separation and Job Finding Rates," International Economic Review, 50(2): 415-430.
Fujita, Shigeru and Giuseppe Moscarini (forthcoming). "Recall and Unemployment," American Economic Review.
Hall, Robert E. (2005). "Job Loss, Job Finding and Unemployment in the U.S. Economy over the Past Fifty Years," in Mark Gertler and Kenneth Rogoff, eds., NBER Macroeconomics Annual 2005, Volume 20, National Bureau of Economic Research, Inc.
Hall, Robert E. (2014). "Quantifying the Lasting Harm to the U.S. Economy from the Financial Crisis," in Jonathan Parker and Michael Woodford, eds., NBER Macroeconomics Annual 2014, Volume 29, National Bureau of Economic Research, Inc.
Hall, Robert E. and Sam Schulhofer-Wohl (2017). "The Pervasive Importance of Tightness in Labor-Market Volatility," working paper, Stanford University.
Ham, John C., and Samuel A. Rea, Jr. (1987). "Unemployment Insurance and Male Unemployment Duration in Canada," Journal of Labor Economics, 5(3): 325-353.
Hamilton, James D. (1994a). Time Series Analysis. Princeton: Princeton University Press.
Hamilton, James D. (1994b). "State-Space Models," in Robert F. Engle and Daniel L. McFadden, eds., Handbook of Econometrics, Volume IV, Chapter 50, pp. 3039-3080. Amsterdam: Elsevier.
Hamilton, James D. (forthcoming). "Why You Should Never Use the Hodrick-Prescott Filter," Review of Economics and Statistics.
Heckman, James and B. Singer (1984a). "The Identifiability of the Proportional Hazard Model," Review of Economic Studies, 51(2): 231-41.
Heckman, James and B. Singer (1984b). "A Method for Minimizing the Impact of Distributional Assumptions in Econometric Models for Duration Data," Econometrica, 52(2): 271-320.
Heckman, James, and B. Singer (1984c). "Econometric duration analysis," Journal of Econometrics, 24: 63-132.
Honoré, Bo E. (1993). "Identification Results for Duration Models with Multiple Spells," Review of Economic Studies, 60(1): 241-246.
Hornstein, Andreas (2012). "Accounting for Unemployment: the Long and Short of it," FRB Richmond Working Paper, No. 12-07, Federal Reserve Bank of Richmond.
Ilg, Randy E., and Eleni Theodossiou (2012). "Job Search of the Unemployed by Duration of Unemployment," Monthly Labor Review, March 2012, pp. 41-49, Bureau of Labor Statistics.
Jarosch, Gregor, and Laura Pilossoph (2015). "Statistical Discrimination and Duration Dependence in the Job Finding Rate," working paper, Princeton University.
Kaitz, Hyman B. (1970). "Analyzing the Length of Spells of Unemployment," Monthly Labor Review, November 1970, pp. 11-20, Bureau of Labor Statistics.
Katz, Lawrence F. and Bruce D. Meyer (1990). "Unemployment Insurance, Recall Expectations, and Unemployment Outcomes," The Quarterly Journal of Economics, 105(4): 973-1002.
Kudlyak, Marianna, and Fabian Lange (2014). "Measuring Heterogeneity in Job Finding Rates Among the Nonemployed Using Labor Force Status Histories," working paper, Federal Reserve Bank of Richmond.
Kroft, Kory, Fabian Lange, and Matthew J. Notowidigdo (2013). "Duration Dependence and Labor Market Conditions: Evidence from a Field Experiment," The Quarterly Journal of Economics, 128(3): 1123-1167.
Kroft, Kory, Fabian Lange, Matthew J. Notowidigdo, and Lawrence F. Katz (2016). "Long-Term Unemployment and the Great Recession: The Role of Composition, Duration Dependence, and Non-Participation," Journal of Labor Economics, 34: S7-S54.
Lazear, Edward P. and James R. Spletzer (2012). "Hiring, Churn and the Business Cycle," American Economic Review, 102(3): 575-79.
Ljungqvist, Lars, and Thomas J. Sargent (1998). "The European Unemployment Dilemma," Journal of Political Economy, 106(3): 514-550.
Morchio, Iacopo (2016). "Work Histories and Lifetime Unemployment," working paper, University of Vienna.
Perry, George L. (1972). "Unemployment Flows in the U.S. Labor Market," Brookings Papers on Economic Activity, 3(2): 245-292.
Polivka, Anne E. and Stephen M. Miller (1998). "The CPS after the Redesign: Refocusing the Economic Lens," in John Haltiwanger, Marilyn E. Manser, and Robert Topel, eds., Labor Statistics Measurement Issues, pp. 249-86.
Ravenna, Federico, and Carl E. Walsh (2012). "Screening and Labor Market Flows in a Model with Heterogeneous Workers," Journal of Money, Credit and Banking, 44(S2): 31-71.
Ridder, Geert (1990). "The Non-Parametric Identification of Generalized Accelerated Failure-Time Models," Review of Economic Studies, 57: 167-182.
Schwarz, Gideon (1978). "Estimating the Dimension of a Model," Annals of Statistics, 6: 461-464.
Shimer, Robert (2012). "Reassessing the Ins and Outs of Unemployment," Review of Economic Dynamics, 15(2): 127-148.
Sider, Hal (1985). "Unemployment Duration and Incidence: 1968-1982," American Economic Review, 75(3): 461-72.
Van den Berg, Gerard J. (2001). "Duration Models: Specification, Identification, and Multiple Durations," in James J. Heckman and Edward Leamer, eds., Handbook of Econometrics, Volume V (North-Holland, Amsterdam).
Van den Berg, Gerard J., and Bas van der Klaauw (2001). "Combining Micro and Macro Unemployment Duration Data," Journal of Econometrics, 102: 271-309.
Van den Berg, Gerard J., and Jan C. van Ours (1996). "Unemployment Dynamics and Duration Dependence," Journal of Labor Economics, 14(1): 100-125.
Vishwanath, Tara (1989). "Job Search, Stigma Effect, and Escape Rate from Unemployment," Journal of Labor Economics, 7(4): 487-502.
Vroman, Wayne (2009). "Unemployment insurance recipients and nonrecipients in the CPS," Monthly Labor Review, October 2009, pp. 44-53, Bureau of Labor Statistics.
White, Halbert (1982). "Maximum Likelihood Estimation of Misspecified Models," Econometrica, 50: 1-25.
Whittaker, Julie M. and Katelin P. Isaacs (2014). "Unemployment Insurance: Programs and Benefits," Congressional Research Service Report for Congress, 7-5700.
Yashiv, Eran (2007). "US Labor Market Dynamics Revisited," The Scandinavian Journal of Economics, 109(4): 779-806.

Table 1. Actual and predicted values for unemployment on average and during Great Recession using different steady-state representations

Row   Parameter values                                                            Actual or predicted values
                                                                                  U1      U2.3    U4.6    U7.12   U13.+

1976:M1-2017:M6
(1)   Observed values                                                             3,178   2,281   1,244   1,064   664
(2)   w = 3,178; p = 0.4840                                                       3,178   2,281   (618)   (78)    (1)
(3)   w_H = 2,488; w_L = 690; p_H = 0.3559; p_L = 0.8476                          3,178   2,281   1,244   1,064   (621)
(4)   w = 3,178; δ_0 = 0.1090; δ_1 = -0.3690; δ_2 = 0.0140; δ_3 = 3.314e-5        3,178   2,281   1,244   1,064   664
(5)   w_H = 2,482; w_L = 696; p_H(1) = 0.3550; p_L(1) = 0.8440; δ = -0.0050       3,178   2,281   1,244   1,064   664

2007:M12-2013:M12
(6)   Observed values                                                             3,339   2,787   2,131   2,426   1,902
(7)   w_H = 2,274; w_L = 1,065; p_H = 0.32920; p_L = 0.890                        3,339   2,787   2,131   2,426   (2,358)
(8)   w = 3,339; δ_0 = 0.2360; δ_1 = -0.6620; δ_2 = 0.0540; δ_3 = -1.27e-3        3,339   2,787   2,131   2,426   1,902
(9)   w_H = 2,307; w_L = 1,032; p_H(1) = 0.3340; p_L(1) = 0.9000; δ = 0.0170      3,339   2,787   2,131   2,426   1,902

Rows (2), (3), and (7) report fitted (and predicted) values; rows (4), (5), (8), and (9) report fitted values.

Notes to Table 1. Table reports average values of $U_t^x$ in thousands of workers over the entire 1976:M1-2017:M6 sample and the 2007:M12-2013:M12 subsample along with predicted values from simple steady-state calculations. Parameters were chosen to fit exactly the values in that row appearing in normal face, while the model’s predictions for other numbers are indicated by parentheses.
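The parenthesized predictions in rows (2) and (3) of Table 1 can be approximately reproduced with a short calculation. The sketch below assumes a steady state with no genuine duration dependence in which the number unemployed for $n$ months is the sum over types of $w\,p^{n-1}$; that formula and the function name `steady_state_bins` are assumptions made here for illustration, and small discrepancies from the table reflect rounding of the reported parameters.

```python
def steady_state_bins(groups):
    """Steady-state stocks by duration bin, assuming w * p**(n-1) at duration n.

    groups: list of (w, p) pairs, where w is the monthly inflow in thousands
            and p is a constant monthly unemployment-continuation probability.
    Returns (U1, U2.3, U4.6, U7.12, U13.+) in thousands.
    """
    def months(lo, hi):
        return sum(w * p ** (n - 1) for w, p in groups for n in range(lo, hi + 1))

    # Durations of 13 months or more: geometric tail starting at p**12.
    tail = sum(w * p ** 12 / (1.0 - p) for w, p in groups)
    return months(1, 1), months(2, 3), months(4, 6), months(7, 12), tail

homogeneous = steady_state_bins([(3178, 0.4840)])                 # row (2)
two_types = steady_state_bins([(2488, 0.3559), (690, 0.8476)])    # row (3)
print([round(v) for v in homogeneous])
print([round(v) for v in two_types])
```

With a single type, the two parameters can match only the first two bins and the implied number of long-term unemployed is close to zero. With two types, the first four bins are matched and the predicted value for 13 months or longer lands near the observed 664.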

Table 2. Parameter estimates for the baseline model

σ^w_L    0.0422***    R1       0.1011***    δ̃_1    5.0512***
         (0.0039)              (0.0054)             (1.9164)
σ^w_H    0.0437***    R2.3     0.0753***    δ̃_2    -0.0485
         (0.0057)              (0.0044)             (0.0532)
σ^x_L    0.0476***    R4.6     0.0817***    δ̃_3    2.1674***
         (0.0054)              (0.0073)             (0.8104)
σ^x_H    0.0204***    R7.12    0.0586***
         (0.0027)              (0.0047)
                      R13+     0.0393***
                               (0.0025)

No. of Obs.       498
Log-Likelihood    2,574.0

Notes to Table 2. White (1982) quasi-maximum-likelihood standard errors in parentheses. See footnote 20 for the definition of $\tilde{\delta}_j$.

Table 3. Comparison of variance decomposition across different models

Shocks         (1)        (2)        (3)        (4)        (5)        (6)        (7)
# of Param.    12         12         12         12         15         16         12
Log-L.         2574.0     2474.7     1326.6     2608.7     2620.2     2585.7     2571.0
SIC            -5073.4    -4874.8    -2585.5    -5142.8    -5147.2    -5072.0    -5067.5

3 month
F              -          -          -          -          -          0.588      -
               -          -          -          -          -          (0.069)
w_L            0.415      0.414      0.225      0.131      0.392      0.233      0.384
               (0.045)    (0.048)    (0.053)    (0.025)    (0.046)    (0.041)    (0.044)
w_H            0.208      0.229      0.225      0.388      0.214      0.123      0.147
               (0.041)    (0.048)    (0.057)    (0.052)    (0.044)    (0.030)    (0.040)
p_L            0.281      0.288      0.279      0.203      0.281      0.000      0.253
               (0.043)    (0.046)    (0.067)    (0.039)    (0.045)    (0.055)    (0.041)
p_H            0.096      0.068      0.271      0.277      0.114      0.057      0.216
               (0.022)    (0.017)    (0.067)    (0.045)    (0.022)    (0.016)    (0.046)
Inflows        0.623      0.644      0.450      0.519      0.606      0.355      0.531
L group        0.696      0.703      0.503      0.335      0.673      0.233      0.637

1 year
F              -          -          -          -          -          0.635      -
               -          -          -          -          -          (0.070)
w_L            0.510      0.489      0.350      0.307      0.489      0.297      0.476
               (0.050)    (0.052)    (0.060)    (0.042)    (0.052)    (0.051)    (0.049)
w_H            0.074      0.083      0.102      0.210      0.079      0.045      0.051
               (0.017)    (0.022)    (0.030)    (0.041)    (0.020)    (0.012)    (0.016)
p_L            0.380      0.403      0.399      0.318      0.388      0.000      0.391
               (0.051)    (0.053)    (0.071)    (0.053)    (0.054)    (0.071)    (0.053)
p_H            0.036      0.025      0.148      0.165      0.044      0.022      0.082
               (0.009)    (0.007)    (0.045)    (0.034)    (0.010)    (0.006)    (0.022)
Inflows        0.584      0.572      0.453      0.516      0.567      0.343      0.527
L group        0.890      0.892      0.750      0.625      0.877      0.298      0.867

2 year
F              -          -          -          -          -          0.644      -
               -          -          -          -          -          (0.072)
w_L            0.523      0.485      0.427      0.430      0.503      0.310      0.483
               (0.052)    (0.053)    (0.064)    (0.048)    (0.054)    (0.053)    (0.051)
w_H            0.049      0.054      0.063      0.139      0.053      0.031      0.033
               (0.012)    (0.015)    (0.020)    (0.031)    (0.014)    (0.008)    (0.012)
p_L            0.403      0.444      0.413      0.320      0.413      0.000      0.428
               (0.053)    (0.055)    (0.071)    (0.050)    (0.056)    (0.075)    (0.054)
p_H            0.025      0.017      0.096      0.112      0.031      0.015      0.055
               (0.006)    (0.005)    (0.032)    (0.026)    (0.007)    (0.004)    (0.017)
Inflows        0.572      0.539      0.491      0.568      0.556      0.341      0.516
L group        0.926      0.929      0.841      0.750      0.916      0.310      0.911

Notes to Table 3. (1) Baseline model, (2) alternative data, (3) post 94 data, (4) unadjusted data, (5) time-varying GDD, (6) correlated shocks, (7) weekly frequency. Standard errors in parentheses.
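As a reading aid for Table 3, the short check below recomputes the two summary rows for the baseline column. It assumes, based only on the arithmetic of the reported numbers, that "Inflows" is the sum of the $w_L$ and $w_H$ shares and that "L group" is the sum of the $w_L$ and $p_L$ shares; the dictionary layout and names are illustrative.

```python
# Column (1), baseline model: shares of forecast MSE by shock and horizon.
baseline = {
    "3 month": {"w_L": 0.415, "w_H": 0.208, "p_L": 0.281, "p_H": 0.096},
    "1 year":  {"w_L": 0.510, "w_H": 0.074, "p_L": 0.380, "p_H": 0.036},
    "2 year":  {"w_L": 0.523, "w_H": 0.049, "p_L": 0.403, "p_H": 0.025},
}

summary = {}
for horizon, share in baseline.items():
    inflows = share["w_L"] + share["w_H"]   # both inflow shocks
    l_group = share["w_L"] + share["p_L"]   # both type L shocks
    total = sum(share.values())             # the four shares should sum to one
    summary[horizon] = (inflows, l_group, total)
    print(horizon, round(inflows, 3), round(l_group, 3), round(total, 3))
```

The recomputed values agree with the "Inflows" and "L group" rows reported for column (1), and the four shares sum to one at each horizon.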

[Figure 1 plot. Series shown: newly unemployed; >= 4 months. Vertical axis from 0.3 to 1.0; horizontal axis from 1950 to 2010.]

Figure 1. Unemployment-continuation probabilities for newly unemployed and long-term unemployed, 1949:M1-2017:M6.

Figure 2. Illustration of how ex ante heterogeneity can cause unemployment-continuation probabilities to increase with duration.

Notes to Figure 2. Of newly unemployed at time t, 80 have unemployment-continuation probability of 35% and 20 have probability of 85%. The figure reports the number from each group who are still unemployed in subsequent months and the average continuation probabilities for each surviving cohort.
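The mechanism described in the notes to Figure 2 can be traced with a few lines of arithmetic. The sketch below uses the 80/20 split and the 35% and 85% continuation probabilities from the notes; the function name `cohort_path` and the six-month window are choices made here for illustration.

```python
def cohort_path(groups, months):
    """Track survivors and the average continuation probability of a mixed cohort.

    groups: list of (initial_count, continuation_probability) pairs.
    Returns a list of (month, survivors_by_group, average_probability).
    """
    rows = []
    counts = [float(n) for n, _ in groups]
    for month in range(months):
        stay = [c * p for c, (_, p) in zip(counts, groups)]
        avg = sum(stay) / sum(counts)    # share of current survivors who remain
        rows.append((month, counts, avg))
        counts = stay
    return rows

path = cohort_path([(80, 0.35), (20, 0.85)], months=6)
for month, counts, avg in path:
    print(month, [round(c, 1) for c in counts], round(avg, 3))
```

No individual's probability changes with duration here, yet the average continuation probability of the surviving cohort rises each month toward 85% because the low-probability group is increasingly over-represented among those still unemployed.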

[Figure 3 plots omitted. Panels: (A) Homogeneous workers; (B) Heterogeneous workers; (C) Great Recession. Horizontal axis: Duration of unemployment (n), 0-20; vertical axis: Number unemployed (1000s), 0-4000. Curve labels: $w p^n$ in panel A; $w_L p_L^n + w_H p_H^n$ and $w_L p_L^n$ in panels B and C.]

Figure 3. Predicted (smooth curves) and actual (black circles) numbers of unemployed as a function of duration based on constant-parameter specifications

Notes to Figure 3. Horizontal axis shows duration of unemployment in months and vertical axis shows number of unemployed for that duration in thousands of individuals. Circles denote imputed values for $\bar U^{1}$, $\bar U^{3}$, $\bar U^{5}$, $\bar U^{9.5}$, and $\bar U^{15}$ based on equation (6) with $w_L$, $w_H$, $x_L$, $x_H$, and $\delta$ chosen to fit the observed values of $\bar U^{1}$, $\bar U^{2.3}$, $\bar U^{4.6}$, $\bar U^{7.12}$, and $\bar U^{13.+}$ exactly. Panel A: homogeneous specification fit to 1976:M1-2017:M6 historical averages for $\bar U^{1}$ and $\bar U^{2.3}$. Panel B: pure cross-sectional heterogeneity specification fit to 1976:M1-2017:M6 historical averages for $\bar U^{1}$, $\bar U^{2.3}$, $\bar U^{4.6}$, and $\bar U^{7.12}$. Panel C: pure cross-sectional heterogeneity specification fit to average values for 2007:M12-2013:M12 for $\bar U^{1}$, $\bar U^{2.3}$, $\bar U^{4.6}$, and $\bar U^{7.12}$.

[Figure 4 plots omitted. Panels: (A) Full sample; (B) Great Recession. Horizontal axis: Duration of unemployment (n), 0-20; vertical axis: Unemployment Continuation Probability, 0.2-1.]

Figure 4. Unemployment-continuation probabilities as a function of duration based on constant-parameter pure genuine duration dependence specification

Notes to Figure 4. Horizontal axis shows duration of unemployment in months and vertical axis shows probability that individual is still unemployed the following month. Curves denote predicted values from the 5-parameter pure GDD model ($w$ plus parameters in equation (5)) fit to 1976:M1-2017:M6 historical average values for $\bar U^{1}$, $\bar U^{2.3}$, $\bar U^{4.6}$, $\bar U^{7.12}$ and $\bar U^{13.+}$ (panel A) and for values for 2007:M12-2013:M12 (panel B). The GDD model exactly fits the dots plotted in Figure 3 for each case.

[Figure 5 plot omitted. Series: $p_{L,t}(1)$ and $p_{H,t}(1)$. Horizontal axis: 1980-2015; vertical axis: 0.2-1.]

Figure 5. Probability that a newly unemployed worker of each type will still be unemployed the following month

Notes to Figure 5. The series plotted are $\hat p_{it|T}(1)$ for $i = L, H$ with 95% confidence intervals.

[Figure 6 plot omitted. Series: $W_L$ and $W_H$. Horizontal axis: 1980-2015; vertical axis: 0.5-3.5.]

Figure 6. Number of newly unemployed workers of each type

Notes to Figure 6. The series plotted are $\hat w_{it|T}$ for $i = L, H$ with 95% confidence intervals.

[Figure 7 plot omitted. Series: Share H and Share L. Horizontal axis: 1980-2015; vertical axis: 0-1.]

Figure 7. Share of total unemployment accounted for by each type of worker

[Figure 8 plots omitted. Panels: (A) GDD alone; (B) Continuation probs for each type (series: type L, type H). Horizontal axes: months of unemployment, up to about 25.]

Figure 8. Effects of genuine duration dependence

Notes to Figure 8. Panel A plots $d_\tau$ as a function of $\tau$ (months spent in unemployment). Panel B plots average unemployment-continuation probabilities of type H and type L workers as a function of duration of unemployment.

[Figure 9 plots omitted. Panels: (A) All four factors; (B) Inflows vs. outflows; (C) Type H versus type L. Horizontal axes: forecast horizon in months, up to 45; vertical axes: 0-1.]

Figure 9. Fraction of variance of error in forecasting total unemployment at different horizons attributable to separate factors

Notes to Figure 9. Horizontal axis indicates the number of months ahead for which the forecast is formed. Panel A plots the contribution of each of the factors $\{w_{Ht}, w_{Lt}, x_{Ht}, x_{Lt}\}$ separately, Panel B shows combined contributions of $\{w_{Ht}, w_{Lt}\}$ and $\{x_{Ht}, x_{Lt}\}$, and Panel C shows combined contributions of $\{w_{Ht}, x_{Ht}\}$ and $\{w_{Lt}, x_{Lt}\}$. Dotted lines denote 95% confidence intervals.

[Figure 10 plots omitted. Panels: (A) 1980 Recession; (B) 1981 Recession; (C) 1990 Recession; (D) 2001 Recession; (E) 2007 Recession; (F) 2007 Recession (correlated shocks). Legend entries in each panel: Ubase, Uall, WL, WH, PL ux, PH ux.]

Figure 10. Historical decompositions of five U.S. recessions

Notes to Figure 10. Panels A-E use baseline model. Panel F uses factor model.

[Figure 11 plots omitted. Panels: A. Shimer measure of unemployment outflows; B. Shimer measure of unemployment inflows; C. Shimer-like measure of long-term unemployment outflows; D. Shimer-like measure of long-term unemployment inflows; E. Long-term unemployment-continuation probabilities; F. Long-term continuation probabilities in Great Recession. Panels E and F plot two series: Shimer-like measure and Data from Figure 5. Horizontal axes: 1970-2015 (panels A-E) and 2007-2011 (panel F).]

Figure 11. Alternative measures of inflows and outflows, 1967:Q1-2016:Q4.

Notes to Figure 11. Panel A plots updated series for Shimer’s measure of $f_t$ (outflows from unemployment). Panel B plots updated series for Shimer’s measure of $x_t$ (inflows to unemployment). Panel C plots the measure of outflows from long-term unemployment $F_t^{4.+}$ from equation (28). Panel D plots inflows into long-term unemployment measured by $u_t^{4.6}$. Panel E plots monthly unemployment-continuation probabilities as calculated from $(1 - F_t^{4.+})^{1/3}$ and from quarterly averages of the series $\hat p_{Lt|T}(1)$ in Figure 5. Panel F shows Panel E data over 2007:Q3-2011:Q4.

[Figure 12 plots omitted. Panels: A. Shimer decomposition of unemployment; B. Contribution to outflows of level of inflows; C. VAR decomposition of unemployment; D. Contribution to outflows of level and composition of inflows. Series in panels A and C: Unemployment rate, Explained by inflows, Explained by outflows. Series in panels B and D: Shimer f, Explained by inflows. Horizontal axes: 2008-2016.]

Figure 12. Alternative measures of the contributions of inflows and outflows to unemployment and the unemployment-exit rate since 2007:Q4.

Notes to Figure 12. Panel A plots the unemployment rate and contributions to it from inflows and outflows using Shimer’s method. Panel B plots unemployment outflow rate and the component of outflows that is explained by innovations in inflows according to a bivariate VAR. Panel C plots unemployment rate and contributions to it from innovations in outflows, innovations in level and composition of inflows according to a 4-variable VAR. Panel D plots unemployment outflow rate and the component of outflows that is explained by innovations in level and composition of inflows according to a 3-variable VAR.

[Figure 13 plots omitted. Panels: (A) Newly unemployed (series: WL, Permanent Separations); (B) All unemployed (series: Type L, Permanent Separations). Horizontal axes: 1980-2015; vertical axes: Millions.]

Figure 13. Contribution to unemployment of type L individuals and those unemployed due to permanent separation.

Notes to Figure 13. Panel A shows newly unemployed type L individuals and newly unemployed individuals who gave permanent job loss or end of a temporary job as the reason. Panel B shows total numbers of unemployed type L workers compared to total numbers of unemployed who gave permanent job loss or end of temporary job as the reason.

[Figure 14 plots omitted. Panels: (A) Newly unemployed; (B) All unemployed. Series: Permanent Separation, Layoff, Leaver, Re-Entrant, New-Entrant, Sum, and WL (panel A) or Type L (panel B). Horizontal axes: 1985-2010; vertical axes: Millions.]

Figure 14. Inflows and total numbers of type L workers by reason of unemployment

Notes to Figure 14. Panel A shows the number of type L individuals who are newly unemployed by reason of unemployment along with the sum across reasons (thick fuchsia) and inference based on uncategorized aggregate data (dashed black). Panel B shows the number of type L workers who have been unemployed for any duration by reason of unemployment along with the sum across reasons (thick fuchsia) and inference based on uncategorized aggregate data (dashed black). Source: Ahn (2016).

Online Appendix

A. Measurement issues and seasonal adjustment

The seasonally adjusted numbers of people unemployed for less than 5 weeks, for between 5 and 14 weeks, 15-26 weeks and for longer than 26 weeks are published by the Bureau of Labor Statistics. To further break down the number unemployed for longer than 26 weeks into those with duration between 27 and 52 weeks and with longer than 52 weeks, we used seasonally unadjusted CPS microdata publicly available at the NBER website (http://www.nber.org/data/cps_basic.html). Since the CPS is a probability sample, each individual is assigned a unique weight that is used to produce the aggregate data. From the CPS microdata, we obtained the number of unemployed whose duration of unemployment is between 27 and 52 weeks and the number longer than 52 weeks. We seasonally adjusted the two series using X-12-ARIMA,[40] and calculated the ratio of those unemployed 27-52 weeks to the sum. We then multiplied this ratio by the published BLS seasonally adjusted number for individuals who had been unemployed for longer than 26 weeks to obtain our series $U_t^{7.12}$.[41]

[40] An earlier version of this paper dealt with seasonality by taking 12-month moving averages and arrived at similar overall results to those presented in this version. As a further check on the approach used here, we compared the published BLS seasonally adjusted number for those unemployed with duration between 15 and 26 weeks to an X-12-ARIMA-adjusted estimate constructed from the CPS microdata, and found the series to be quite close.

[41] This adjustment is necessary because the published number for unemployed with duration longer than 26 weeks is different from that directly computed from the CPS microdata, although the difference is subtle. The difference arises because the BLS imputes the numbers unemployed with different durations for various reasons, e.g., to correct for missing observations.

An important issue in using these data is the redesign of the CPS in 1994. Before 1994, individuals were always asked how long they had been unemployed. After the redesign, if an individual is reported as unemployed during two consecutive months, then her duration is recorded automatically as the sum of her duration last month and the number of weeks between the two months’ survey reference periods. Note that if an individual was unemployed during each of the two weeks surveyed, but worked at a job in between, that individual would likely self-report a duration of unemployment to be less than 5 weeks before the redesign, but the duration would be imputed to be a number greater than 5 weeks after the redesign. As suggested by Elsby, Michaels and Solon (2009) and Shimer (2012) we can get an idea of the size of this effect by making use of the staggered CPS sample design. A given address is sampled for 4 months (called the first through fourth rotations, respectively), not sampled for the next 8
months, and then sampled again for another 4 months (the fifth through eighth rotations). After the 1994 redesign, the durations for unemployed individuals in rotations 2-4 and 6-8 are imputed, whereas those in rotations 1 and 5 are self-reported, just as they were before 1994. For those in rotation groups 1 and 5, we can calculate the fraction of individuals who are newly unemployed and compare this with the total fraction of newly unemployed individuals across all rotations. The ratio of these two numbers is reported in Panel A of Figure A1, and averaged 1.15 over the period 1994-2007 as reported in the second row of Table A1. For comparison, the ratio averaged 1.01 over the period 1989-1993, as seen in the first row. This calculation suggests that if we want to compare the value of $U_t^{1}$ as calculated under the redesign to the self-reported numbers available before 1994, we should multiply the former by 1.15. This is similar to the adjustment factors of 1.10 used by Hornstein (2012), 1.154 by Elsby, Michaels and Solon (2009), 1.106 by Shimer (2012), and 1.205 by Polivka and Miller (1998).

For our study, unlike most previous researchers, we also need to specify which categories the underreported newly unemployed are coming from. Figure A1 reports the observed ratios of rotation 1 and 5 shares to the total for the various duration groups, with average values summarized in Table A1. One interesting feature is that under the redesign, the fraction of those with 7-12 month duration from rotations 1 and 5 is very similar to that for other rotations, whereas the fraction of those with 13 or more months is much lower.[42] Based on the values in Table A1, we should scale up the estimated values for $U_t^{1}$ and scale down the estimated values of $U_t^{2.3}$ and $U_t^{13.+}$ relative to the pre-1994 numbers. The values for $U_t^{4.6}$ and $U_t^{7.12}$ seem not to have been affected much by the redesign.
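The rotation-group comparison described above can be sketched as follows. The counts are made-up illustrative numbers rather than CPS data, and the function and variable names are hypothetical; the sketch only shows how a ratio such as 1.15 arises from a duration group's share among rotations 1 and 5 relative to its share among all rotations.

```python
def share_ratios(counts_rot_1_5, counts_all):
    """Ratio of each duration group's share among rotations 1 and 5
    to its share among all rotation groups."""
    total_1_5 = sum(counts_rot_1_5.values())
    total_all = sum(counts_all.values())
    return {
        group: (counts_rot_1_5[group] / total_1_5) / (counts_all[group] / total_all)
        for group in counts_all
    }


# Hypothetical weighted counts (thousands) by duration group.
counts_all = {"U1": 2400, "U2.3": 2300, "U4.6": 1200, "U7.12": 900, "U13+": 1200}
counts_rot_1_5 = {"U1": 690, "U2.3": 500, "U4.6": 290, "U7.12": 230, "U13+": 290}

ratios = share_ratios(counts_rot_1_5, counts_all)
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.3f}")
```

A ratio above one for a group indicates that self-reported durations place more people in that group than the imputed durations do.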
Our preferred adjustment for data subsequent to the 1994 redesign is to multiply $U_t^{1}$ by 1.15, $U_t^{2.3}$ by 0.87, and $U_t^{13.+}$ by 0.77, and to leave $U_t^{4.6}$ and $U_t^{7.12}$ as is. We then multiplied all of our adjusted duration figures by the ratio of total BLS reported unemployment to the sum of our adjusted series in order to match the BLS aggregate exactly.

[42] One possible explanation is digit preference: an individual is much more likely to report having been unemployed for 12 months than 13 or 14 months. When someone in rotation 5 reports they have been unemployed for 12 months, BLS simply counts them as such, and if they are still unemployed the following month, BLS imputes to them a duration of 13 months. The imputed number of people 13 months and higher is significantly bigger than the self-reported numbers, just as the imputed number of people with 2-3 months appears to be higher than self-reported.

Hornstein (2012) adopted an alternative adjustment, assuming that all of the imputed newly unemployed came from the $U_t^{2.3}$ category. He chose to multiply $U_t^{1}$ by 1.10 and subtract the added workers solely from the $U_t^{2.3}$ category. As a robustness check we also report results using
Hornstein’s proposed adjustment in Section 5.1, as well as results using no adjustments at all.

An alternative might be to use the ratios for each t in Figure A1 rather than to use the averages from Table A1. However, as Shimer (2012) and Elsby, Michaels and Solon (2009) mentioned, such an adjustment would be based on only about one quarter of the sample and thus multiplies the sampling variance of the estimate by about four, which implies that noise from the correction procedure could be misleading in understanding the unemployment dynamics.

Table A1. Average ratio of each duration group’s share in the first/fifth rotation group to that in total unemployment

                U1      U2.3      U4.6     U7.12     U13.+
1989-1993     1.01      1.01      0.96      1.02      0.97
1994-2007     1.15      0.87      0.95      1.05      0.77

B. Estimation algorithm

The system in Section 2.1 can be written as

$$x_t = F x_{t-1} + v_t$$
$$y_t = h(x_t) + r_t$$

for $x_t = (\xi_t', \xi_{t-1}', \ldots, \xi_{t-47}')'$, $E(v_t v_t') = Q$, and $E(r_t r_t') = R$. The function $h(\cdot)$ as well as elements of the variance matrices $R$ and $Q$ depend on the parameter vector $\theta = (\tilde\delta_1, \tilde\delta_2, \tilde\delta_3, R_1, R_{2.3}, R_{4.6}, R_{7.12}, R_{13+}, \sigma_L^w, \sigma_H^w, \sigma_L^x, \sigma_H^x)'$. The extended Kalman filter (e.g., Hamilton, 1994b) can be viewed as an iterative algorithm to calculate a forecast $\hat x_{t|t-1}$ of the state vector conditioned on knowledge of $\theta$ and observation of $Y_{t-1} = (y_{t-1}', y_{t-2}', \ldots, y_1')'$ with $P_{t|t-1}$ the MSE of this forecast. With these we can approximate the distribution of $y_t$ conditioned on $Y_{t-1}$ as $N(h(\hat x_{t|t-1}), \Omega_t)$ for $\Omega_t = H_t' P_{t|t-1} H_t + R$ and $H_t' = \partial h(x_t)/\partial x_t' \,|_{x_t = \hat x_{t|t-1}}$, from which the approximate likelihood function associated with that $\theta$,

$$\ell(\theta) = \sum_{t=1}^{T} \ln f(y_t | Y_{t-1}; \theta)$$
$$\ln f(y_t | Y_{t-1}; \theta) = -(1/2)\ln(2\pi) - (1/2)\ln|\Omega_t| - (1/2)[y_t - h(\hat x_{t|t-1})]' \Omega_t^{-1} [y_t - h(\hat x_{t|t-1})],$$
can be maximized numerically. The forecast of the state vector can be updated by iterating on

$$K_t = P_{t|t-1} H_t (H_t' P_{t|t-1} H_t + R)^{-1}$$
$$P_{t|t} = P_{t|t-1} - K_t H_t' P_{t|t-1}$$
$$\hat x_{t|t} = \hat x_{t|t-1} + K_t (y_t - h(\hat x_{t|t-1}))$$
$$\hat x_{t+1|t} = F \hat x_{t|t}$$
$$P_{t+1|t} = F P_{t|t} F' + Q.$$

Prior to the starting date January 1976 for our sample, BLS aggregates are available but not the micro data that we used to construct $U_t^{13.+}$. For the initial value for the extended Kalman filter, we calculated the values that would be implied if pre-sample values had been realizations from an initial steady state, estimating the (4×1) vector $\bar\xi_0$ from the average values for $\bar U^{1}$, $\bar U^{2.3}$, $\bar U^{4.6}$, and $\bar U^{7.+}$ over February 1972 - January 1976 using the method described in Section 1.1. Our initial guess was then $\hat x_{1|0} = \iota_{48} \otimes \bar\xi_0$ where $\iota_{48}$ denotes a (48×1) vector of ones. Diagonal elements of $P_{1|0}$ determine how much the presample values of $\xi_j$ are allowed to differ from this initial guess $\hat\xi_{j|0}$. For this we set $E(\xi_j - \hat\xi_{j|0})(\xi_j - \hat\xi_{j|0})' = c_0 I_4 + (1-j) c_1 I_4$ with $c_0 = 10$ and $c_1 = 0.1$. The value for $c_0$ is quite large relative to the range of $\hat\xi_{t|T}$ over the complete observed sample, ensuring that the particular value we specified for $\hat x_{1|0}$ has little influence. For $k < j$ we specify the covariance[43] $E(\xi_j - \bar\xi_0)(\xi_k - \bar\xi_0)' = E(\xi_j - \bar\xi_0)(\xi_j - \bar\xi_0)'$. The small value for $c_1$ forces presample $\xi_j$ to be close to $\xi_k$ when $j$ is close to $k$, again consistent with the observed month-to-month variation in $\hat\xi_{t|T}$.

[43] In other words,

$$P_{1|0} = \begin{bmatrix}
c_0 I_4 & c_0 I_4 & c_0 I_4 & \cdots & c_0 I_4 \\
c_0 I_4 & c_0 I_4 + c_1 I_4 & c_0 I_4 + c_1 I_4 & \cdots & c_0 I_4 + c_1 I_4 \\
c_0 I_4 & c_0 I_4 + c_1 I_4 & c_0 I_4 + 2 c_1 I_4 & \cdots & c_0 I_4 + 2 c_1 I_4 \\
\vdots & \vdots & \vdots & \cdots & \vdots \\
c_0 I_4 & c_0 I_4 + c_1 I_4 & c_0 I_4 + 2 c_1 I_4 & \cdots & c_0 I_4 + 47 c_1 I_4
\end{bmatrix}$$

Smoothed inferences about $x_t$ using the full sample of available data, $\hat x_{t|T} = E(x_t | Y_T)$ and their variance matrix $P_{t|T} = E[(x_t - \hat x_{t|T})(x_t - \hat x_{t|T})']$ can be calculated by iterating backwards on the
following equations for $t = T-1, T-2, \ldots, 1$:

$$J_t = P_{t|t} F' P_{t+1|t}^{-1}$$
$$\hat x_{t|T} = \hat x_{t|t} + J_t (\hat x_{t+1|T} - \hat x_{t+1|t})$$
$$P_{t|T} = P_{t|t} + J_t (P_{t+1|T} - P_{t+1|t}) J_t'.$$

These smoothed inferences $\hat x_{t|T}$ and functions of them are plotted in Figures 5-7 and 10.

We calculated standard errors for the estimate $\hat\theta$ as in equation (3.13) in Hamilton (1994b):

$$E(\hat\theta - \theta)(\hat\theta - \theta)' \simeq V = K_1^{-1} K_2 K_1^{-1}$$
$$K_1 = \left. \frac{\partial^2 \ell(\theta)}{\partial\theta\,\partial\theta'} \right|_{\theta = \hat\theta}$$
$$K_2 = \sum_{t=1}^{T} \left[ \left. \frac{\partial \ln f(y_t | Y_{t-1}; \theta)}{\partial\theta} \right|_{\theta = \hat\theta} \right] \left[ \left. \frac{\partial \ln f(y_t | Y_{t-1}; \theta)}{\partial\theta} \right|_{\theta = \hat\theta} \right]'.$$

To obtain standard errors for the variance decompositions in Figure 9 and Table 3, we generated $J = 1{,}000$ draws from the asymptotic distribution of $\hat\theta$, $\theta^{[j]} \sim N(\hat\theta, V)$, $j = 1, \ldots, J$, and calculated $q_{s,k}(\theta^{[j]})$ as in equation (21) for each $s$ and each $k = 1, \ldots, 4$. The standard deviation of $q_{s,k}(\theta^{[j]}) / \sum_{k=1}^{4} q_{s,k}(\theta^{[j]})$ across draws $j$ was used to get the error bands and standard errors in Figure 9 and Table 3.

The standard errors used for Figures 5 and 6 incorporate both filter and parameter uncertainty. The matrix $P_{t|T}$ summarizes uncertainty we would have about $x_t$ even if we knew the true value of the parameters in $\theta$. Given that we also have to estimate $\theta$, the true uncertainty is greater than that represented by $P_{t|T}$. Following Ansley and Kohn (1986) we calculate the total variance as

$$P_{t|T}\big|_{\theta = \hat\theta} + Z_t V Z_t'$$
$$\underset{(4 \times 12)}{Z_t} = \left. \frac{\partial \hat x_{t|T}}{\partial\theta'} \right|_{\theta = \hat\theta}.$$

The values of $\{Z_t\}_{t=1}^{T}$ can be found by numerical differentiation, e.g., replace $\hat\theta$ with $\hat\theta + \delta e_i$ for $\delta = 10^{-8}$ and $e_i$ the $i$th column of $I_{12}$ and then redo the iteration to calculate $\hat x_{t|T}(\hat\theta + \delta e_i)$. The
$i$th column of $Z_t$ is then $\delta^{-1}[\hat x_{t|T}(\hat\theta + \delta e_i) - \hat x_{t|T}(\hat\theta)]$.

C. Derivation of linearized variance and historical decompositions

The state equation $\xi_{t+1} = \xi_t + \varepsilon_{t+1}$ implies

$$\xi_{t+s} = \xi_t + \varepsilon_{t+1} + \varepsilon_{t+2} + \varepsilon_{t+3} + \cdots + \varepsilon_{t+s} = \xi_t + u_{t+s}.$$

Letting $y_t = (U_t^{1}, U_t^{2.3}, U_t^{4.6}, U_t^{7.12}, U_t^{13.+})'$ denote the (5×1) vector of observations for date $t$, our model implies that in the absence of measurement error $y_t$ would equal $h(\xi_t, \xi_{t-1}, \xi_{t-2}, \ldots, \xi_{t-47})$ where $h(\cdot)$ is a known nonlinear function. Hence

$$y_{t+s} = h(u_{t+s} + \xi_t, u_{t+s-1} + \xi_t, \ldots, u_{t+1} + \xi_t, \xi_t, \xi_{t-1}, \ldots, \xi_{t-47+s}).$$

We can take a first-order Taylor expansion of this function around $u_{t+j} = 0$ for $j = 1, 2, \ldots, s$,

$$y_{t+s} \simeq h(\xi_t, \ldots, \xi_t, \xi_t, \xi_{t-1}, \ldots, \xi_{t-47+s}) + \sum_{j=1}^{s} [H_j(\xi_t, \xi_t, \ldots, \xi_t, \xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})]\, u_{t+s+1-j}$$

for $H_j(\cdot)$ the (5×4) matrix associated with the derivative of $h(\cdot)$ with respect to its $j$th argument. Using the definition of $u_{t+j}$, this can be rewritten as

$$y_{t+s} \simeq c_s(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+s}) + \sum_{j=1}^{s} [\Psi_{s,j}(\xi_t, \xi_{t-1}, \ldots, \xi_{t-47+j})]\, \varepsilon_{t+j}$$

from which (18) follows immediately.

Similarly, for purposes of a historical decomposition note that the smoothed inferences satisfy

$$\hat\xi_{t+s|T} = \hat\xi_{t|T} + \hat\varepsilon_{t+1|T} + \hat\varepsilon_{t+2|T} + \hat\varepsilon_{t+3|T} + \cdots + \hat\varepsilon_{t+s|T}$$

where $\hat\varepsilon_{t+s|T} = \hat\xi_{t+s|T} - \hat\xi_{t+s-1|T}$. For any date $t+s$ we then have the following model-inferred
value for the number of people unemployed:

$$\iota_5' h(\hat\xi_{t+s|T}, \hat\xi_{t+s-1|T}, \hat\xi_{t+s-2|T}, \ldots, \hat\xi_{t+s-47|T}).$$

For an episode starting at some date $t$, we can then calculate

$$\iota_5' h(\hat\xi_{t|T}, \hat\xi_{t|T}, \hat\xi_{t|T}, \ldots, \hat\xi_{t|T}, \hat\xi_{t-1|T}, \ldots, \hat\xi_{t+s-47|T}).$$

This represents the path that unemployment would have been expected to follow between $t$ and $t+s$ as a result of initial conditions at time $t$ if there were no new shocks between $t$ and $t+s$. Given this path for unemployment that is implied by initial conditions, we can then isolate the contribution of each separate shock between $t$ and $t+s$. Using the linearization in equation (18) allows us to represent the realized deviation from this path in terms of the contribution of individual historical shocks as in (22).

D. Alternative estimates of unemployment-continuation probabilities

There is an unresolved controversy in the literature about how to measure outflows from unemployment. Our measure (28) follows van den Berg and van Ours (1996), van den Berg and van der Klaauw (2001), Elsby, Michaels and Solon (2009), Shimer (2012), and Elsby, Hobijn and Şahin (2013) in deriving flow estimates from the observed change in the number of unemployed by duration. An alternative approach, employed by Fujita and Ramey (2009) and Elsby, Hobijn and Şahin (2010), is to look at only those individuals for whom there is a matched observation of unemployment in month $t-1$ and a status of employment or out of the labor force in month $t$. In the absence of measurement error, the two estimates should be the same, but in practice they turn out to be quite different. One reason for the discrepancy is misclassification. For example, an individual who goes from long-term unemployed to out of the labor force to back to long-term unemployed in three successive months counts as a successful “graduate” from long-term unemployment using matched flows but is contributing to the stubborn persistence of long-term unemployment when using the stock data.
A follow-up paper to Elsby, Hobijn and Şahin (2010) by Elsby et al. (2011) documented that of the individuals who were employed or out of the labor force in month $t-1$ and who were recorded as unemployed in month $t$, more than half reported their
duration of unemployment to be 5 weeks or longer. Another important reason is that individuals for whom two consecutive observations are available differ in important ways from those for whom some observations are missing. Abowd and Zellner (1985) and Frazis et al. (2005) acknowledged that these measurement errors are more likely to bias the matched flow data than the stock data and suggested methods to correct the bias. Since our goal is to understand how the reported stock of long-term unemployed came to be so high and why it falls so slowly, we feel that our approach, which is consistent with the observed stock data by construction, is preferable for our application.

E. Details of robustness tests

The standard errors in Table 3 were calculated as follows. For each model, we generated 500 draws for the k-dimensional parameter vector (where k is reported in the first row of the table) from a $N(\hat\theta, \hat V)$ distribution where $\hat\theta$ is the MLE and $\hat V$ is the (k×k) variance matrix obtained from the inverse Hessian of the likelihood function. For each draw $\theta^{(\ell)}$ we calculated the values implied by that $\theta^{(\ell)}$ and then calculated the standard error of that inference across the draws $\theta^{(1)}, \ldots, \theta^{(500)}$.

Shimer (2012) argued that this time-aggregation bias would result in underestimating the importance of outflows in accounting for cyclical variation in unemployment, and Fujita and Ramey (2009), Shimer (2012) and Hornstein (2012) all formulated their models in continuous time. On the other hand, Elsby, Michaels and Solon (2009) questioned the theoretical suitability of a continuous-time conception of unemployment dynamics, asking if it makes any sense to count a worker who loses a job at 5:00 p.m. one day and starts a new job at 9:00 a.m. the next as if they had been unemployed at all.
We agree, and think that defining the central object of interest to be the fraction of those newly unemployed in month $t$ who are still unemployed in month $t+k$, as in our baseline model, is the most useful way to pose questions about unemployment dynamics. Nevertheless, following Kaitz (1970), Perry (1972), Sider (1985), Baker (1992), and Elsby, Michaels and Solon (2009), we also estimated a version of our model formulated in terms of weekly frequencies as an additional check for robustness. We can do so relatively easily if we make a few simplifying assumptions. We view each month $t$ as consisting of 4 equally-spaced weeks and assume that in each of these weeks there is an inflow of $w_{it}$ workers of type $i$, each of whom has a probability $p_{it}(0) = \exp[-\exp(x_{it})]$ of remaining unemployed the following week. This means that for those type $i$ individuals who were newly
unemployed during the first week of month $t$, $w_{it}[p_{it}(0)]^3$ are still unemployed as of the end of the month. Thus for the model interpreted in terms of weekly transitions, equations (10) and (11) would be replaced by

$$U_t^{1} = \sum_{i=H,L} \left\{ w_{it} + w_{it}[p_{it}(0)] + w_{it}[p_{it}(0)]^2 + w_{it}[p_{it}(0)]^3 \right\} + r_t^{1}$$

$$U_t^{2.3} = \sum_{i=H,L} \sum_{s=1}^{4} \left( w_{i,t-1}[p_{i,t-1}(1)]^{8-s} + w_{i,t-2}[p_{i,t-2}(2)]^{12-s} \right) + r_t^{2.3}$$

for $p_{it}(\tau)$ given by (5) and (8). Note that although this formulation is conceptualized in terms of weekly inflows and outflows $w_i$ and $p_i$, the observed data $y_t$ are the same monthly series used in our other formulations, and the number of parameters is the same as for our baseline formulation.
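The two weekly-frequency measurement equations can be evaluated numerically as in the sketch below. It reads $p$ as a weekly continuation probability (consistent with $w_{it}[p_{it}(0)]^3$ being the number still unemployed), omits the measurement errors $r_t$, and takes $p(1)$ and $p(2)$ as given inputs because equations (5) and (8) are not reproduced in this appendix. All function names and numerical inputs are hypothetical.

```python
import math


def continuation_prob(x):
    """p(0) = exp(-exp(x)): maps any real x into a probability in (0, 1)."""
    return math.exp(-math.exp(x))


def u1_weekly(w, p0):
    """Stock of newly unemployed: four weekly inflow cohorts per type,
    which have survived 0, 1, 2 and 3 weeks by the end of the month."""
    return sum(w[i] * sum(p0[i] ** k for k in range(4)) for i in w)


def u23_weekly(w_lag1, p_lag1, w_lag2, p_lag2):
    """Stock with 2-3 months of duration: weekly cohorts s = 1..4 that
    entered in each of the two preceding months, summed over types."""
    return sum(
        w_lag1[i] * p_lag1[i] ** (8 - s) + w_lag2[i] * p_lag2[i] ** (12 - s)
        for i in w_lag1
        for s in range(1, 5)
    )


# Hypothetical weekly inflows (millions) and x values for the two types.
w = {"H": 0.5, "L": 0.2}
x = {"H": 0.0, "L": -2.0}
p0 = {i: continuation_prob(x[i]) for i in x}
print(f"U1 without measurement error: {u1_weekly(w, p0):.3f}")
```

With every continuation probability set to one, `u1_weekly` returns four times the weekly inflow, which is a convenient sanity check on the cohort bookkeeping.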

[Figure A1 plots omitted. Panels: A: U1; B: U2.3; C: U4.6; D: U7.12; E: U13.+. Each panel plots the ratio for that duration group together with its average (avg). Horizontal axes: 1996-2012; vertical axes: 0.4-1.4.]

Figure A1. Ratio of each duration group’s share in the first and fifth rotation groups to that in all rotation groups

Appendix References

Ansley, Craig F., and Robert Kohn. (1986). "Prediction Mean Squared Error for State Space Models with Estimated Parameters," Biometrika, 73(2):467-473.

Hamilton, James D. (1994b). "State-Space Models," in Robert F. Engle and Daniel L. McFadden, eds., Handbook of Econometrics, Volume IV, Chapter 50, pp. 3039-3080. Amsterdam: Elsevier.

Cite this document
APA
Hie Joo Ahn and James D. Hamilton (2018). Heterogeneity and Unemployment Dynamics (FEDS 2016-012). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2016-012
BibTeX
@techreport{wtfs_feds_2016_012,
  author = {Hie Joo Ahn and James D. Hamilton},
  title = {Heterogeneity and Unemployment Dynamics},
  type = {Finance and Economics Discussion Series},
  number = {2016-012},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2018},
  url = {https://whenthefedspeaks.com/doc/feds_2016-012},
  abstract = {This paper develops new estimates of flows into and out of unemployment that allow for unobserved heterogeneity across workers as well as direct effects of unemployment duration on unemployment-exit probabilities. Unlike any previous paper in this literature, we develop a complete dynamic statistical model that allows us to measure the contribution of different shocks to the short-run, medium-run, and long-run variance of unemployment as well as to specific historical episodes. We find that changes in the inflows of newly unemployed are the key driver of economic recessions and identify an increase in permanent job loss as the most important factor.},
}