feds · May 18, 2020

Should Children Do More Enrichment Activities? Leveraging Bunching to Correct for Endogeneity

Abstract

We study the effects of enrichment activities such as reading, homework, and extracurricular lessons on children's cognitive and non-cognitive skills. We take into consideration that children forgo alternative activities, such as play and socializing, in order to spend time on enrichment. Our study controls for selection on unobservables using a novel approach which leverages the fact that many children spend zero hours per week on enrichment activities. At zero enrichment, confounders vary but enrichment does not, which gives us direct information about the effect of confounders on skills. Using time diary data available in the Panel Study of Income Dynamics (PSID), we find that the net effect of enrichment is zero for cognitive skills and negative for non-cognitive skills, which suggests that enrichment may be crowding out more productive activities on the margin. The negative effects on non-cognitive skills are concentrated in higher-income students in high school, consistent with elevated academic competition related to college admissions.

Finance and Economics Discussion Series Divisions of Research & Statistics and Monetary Affairs Federal Reserve Board, Washington, D.C. Should Children Do More Enrichment Activities? Leveraging Bunching to Correct for Endogeneity Carolina Caetano, Gregorio Caetano, and Eric Nielsen 2020-036 Please cite this paper as: Caetano, Carolina, Gregorio Caetano, and Eric Nielsen (2020). “Should Children Do More Enrichment Activities? Leveraging Bunching to Correct for Endogeneity,” Finance and Economics Discussion Series 2020-036. Washington: Board of Governors of the Federal Reserve System, https://doi.org/10.17016/FEDS.2020.036. NOTE: Staff working papers in the Finance and Economics Discussion Series (FEDS) are preliminary materials circulated to stimulate discussion and critical comment. The analysis and conclusions set forth are those of the authors and do not indicate concurrence by other members of the research staff or the Board of Governors. References in publications to the Finance and Economics Discussion Series (other than acknowledgement) should be cleared with the author(s) to protect the tentative character of these papers.

Carolina Caetano¹, Gregorio Caetano¹, and Eric Nielsen²*
¹University of Georgia, ²Federal Reserve Board
April 2020

JEL Codes: I21, I2, J01, C24.
Keywords: cognitive skills, non-cognitive skills, bunching, enrichment, homework, college, time use, skill development.

* We thank Michael Ahn and Hannah Hall for excellent research assistance. We also thank seminar participants for feedback, and in particular Matthew Larsen and Peter Hinrichs for very helpful comments as discussants. Hao Teng provided invaluable help with the time diary data. The analysis and conclusions set forth here are those of the authors and do not indicate concurrence by other members of the research staff, the Board of Governors, or the Federal Reserve System. Carolina gratefully acknowledges the support of the Bonbright Foundation.

1 Introduction

Families spend substantial resources on activities intended to increase children’s skills. These “enrichment” activities include homework, tutoring, reading, and extra-curricular

activities such as music and art lessons. The money and time committed to these activities are substantial and increasing across the socioeconomic spectrum, leading to concerns that they may contribute to cross-sectional and intergenerational inequality (Aguiar and Hurst, 2007; Bianchi, 2000; Ramey and Ramey, 2010; Duncan and Murnane, 2011; Doepke and Zilibotti, 2017, 2019).

Enrichment activities have opportunity costs that go beyond the time and money spent by parents. The time and energy of the child are also limited: an hour spent doing homework is an hour not spent on other activities, such as play or sleep. Moreover, time spent on enrichment could have spillover effects into the remainder of the day. For example, an exhausted child may not want to engage in active play after finishing their homework, preferring more passive activities. A child over-stimulated by an after-school activity may fall asleep later than usual. Yet sleep and play are activities that have direct, positive impacts on skills (Walker, 2017; Gray, 2019). The opportunity costs of enrichment activities might therefore be substantial depending on the activities replaced.

This paper estimates the net effect of enrichment activities on cognitive and non-cognitive skills taking the substitution patterns among different activities, and their potential effects on skills, into account. Using time diary data from the Child Development Supplement (CDS) of the Panel Study of Income Dynamics (PSID), we find that spending more time on enrichment activities yields a zero net effect on cognitive skills and a sizeable, negative net effect on non-cognitive skills.

Our results appear to contradict the positive effects of various forms of enrichment activities on both cognitive and non-cognitive skills found in the child development literature (e.g. Todd and Wolpin, 2007; Bernal and Keane, 2011; Fiorini and Keane, 2014; Caetano et al., 2019). However, relative to this prior literature, we are effectively identifying a different parameter. In order to control for confounders in the estimation of the effect of enrichment on skills, researchers often rely on detailed model specifications with many control variables. These include variables that may be influenced by enrichment,

and hence be post-determined (e.g., time spent on other activities, family expenditures, other family investments). Thus, for example, if we were to control for the amount of time spent on active play, we would be shutting off one of the indirect paths through which homework could negatively impact skills. The estimated positive effects of enrichment when we include other activities as controls would thus reflect only the direct effect of enrichment on skills, but substitution and negative spillovers could counteract these positive effects, possibly even yielding a net negative total effect.

We propose a novel method to control for confounders that allows us to avoid adding post-determined variables entirely. Our method makes use of the fact that enrichment time cannot fall below zero and that many children in our data bunch at this lower limit. We argue that the choice of enrichment is the outcome of a constrained optimization problem that depends on both observed covariates and unobserved confounders, with the constraint that chosen enrichment be non-negative. The group of children at zero enrichment includes those for whom the constraint is barely binding, and those for whom the constraint is binding with great intensity. Consequently, the children who choose zero enrichment are different and more heterogeneous in comparison with the children who choose just a few minutes of enrichment per week.

Consistent with this idea, we show that the children who chose zero enrichment are discontinuously different in every observable way from the children who chose just above zero. Further, we present evidence that the same discontinuities also hold for the unobservables. This creates an opportunity around the bunching point, because although the unobservables are discontinuous there, the treatment itself is not: a few minutes of enrichment per week is not very different from zero minutes of enrichment. Thus, when we control for observables and then compare the skills of the children at zero enrichment and the skills of the children just barely above zero, the difference uncovers the direct effect of the unobservables on skills. We can use this information to build a correction for the selection bias.
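The logic of this correction can be illustrated with a small simulation. The sketch below is not the paper's estimator: it omits observed controls, assumes a single normally distributed confounder, and uses made-up parameter values (`beta`, `delta`, `mu`, `sigma` are all illustrative assumptions). It also takes an "oracle" shortcut by using the simulated desired enrichment of the children at zero, a quantity that real data do not reveal and that would have to be estimated.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Illustrative parameter values (assumptions, not estimates from the paper).
beta, delta = 0.0, 0.5   # true effect of enrichment; effect of the confounder on skill
mu, sigma = 1.0, 2.0     # mean and spread of desired enrichment

eta = rng.normal(0.0, sigma, n)       # unobserved confounder
desired = mu + eta                    # desired enrichment, may be negative
actual = np.maximum(0.0, desired)     # observed enrichment, bunched at zero
skill = beta * actual + delta * eta + rng.normal(0.0, 1.0, n)

at_zero = actual == 0.0
share_at_zero = at_zero.mean()

# Naive fit among children with positive enrichment: the slope picks up beta + delta.
slope, intercept = np.polyfit(actual[~at_zero], skill[~at_zero], 1)

# Discontinuity at zero: mean skill at the bunching point minus the limit from the right.
jump = skill[at_zero].mean() - intercept

# Oracle step: the simulation observes desired enrichment at zero; real data do not.
delta_hat = jump / desired[at_zero].mean()
beta_hat = slope - delta_hat

print(f"share at zero: {share_at_zero:.3f}")
print(f"naive slope:   {slope:.3f}")
print(f"delta_hat:     {delta_hat:.3f}")
print(f"beta_hat:      {beta_hat:.3f}")
```

In this setup the naive slope is biased upward by the confounder's effect, while the jump in skills at zero, scaled by how far desired enrichment lies below zero among the bunched children, recovers that effect and allows it to be subtracted.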

Applying our method to the PSID time diary data, we find that the net effect of enrichment on cognitive skills is approximately zero, and that the net effect of enrichment on non-cognitive skills is quite negative and significant. These results are robust to several alternative definitions of enrichment time, alternative constructions of cognitive and non-cognitive skills, and various other sensitivity analyses. Breaking our results down by the child’s grade in school, we find that the cognitive effects are also around zero in all grades, while the negative non-cognitive effects are concentrated entirely in high school.

To rationalize these results, we present a simple model arguing that if enrichment time is chosen so as to maximize cognitive skills, we would expect that a marginal increase in enrichment would yield zero net return to cognitive skills. Intuitively, optimality requires substituting enrichment for other activities up to the point where all activities have equal marginal returns to cognitive skills, so that the net effect of a marginal increase in enrichment on cognitive skills would be zero. We argue that plausible deviations from this stylized assumption will still yield cognitive estimates gravitating around zero, which is what we find.

Moreover, it is not generally possible to maximize both cognitive and non-cognitive skills at the same time. Thus, the level of enrichment that maximizes cognitive skills may be past the optimum for non-cognitive skills, leading to negative non-cognitive returns on the margin. Indeed, we find that the composition of enrichment shifts in later grades to activities that may come at the direct expense of non-cognitive skills. Whereas children in earlier grades spend relatively more of their enrichment time on activities with a social component, children in high school spend almost all of their enrichment time on homework, which may generate a sharper trade-off between cognitive and non-cognitive skills, thus explaining why we find negative non-cognitive estimates only for high school.

Further breaking the high school estimates down by household income, we find that the negative non-cognitive effects are particularly large for middle- and high-income youth. The large negative returns for high-income youth may be explained by the higher amount

of time that group spends on enrichment, coupled with diminishing returns to enrichment on non-cognitive skills. The even larger negative returns for middle-income youth may be explained by substitution patterns, since enrichment comes at the expense of social activities for this group.

Our results highlight the pitfalls and trade-offs associated with intensive investment in children’s human capital. The perception that such activities have high returns drives these investments. Many families strain to invest in an effort to increase the chance of admission to college. The stress that these high and rising investments place on both parents and children is well documented in the child development literature (e.g. Luthar and Becker, 2002; Luthar, 2003; Villaire, 2003; Ginsburg et al., 2007; Gray, 2011; Jarvis et al., 2014; Veiga et al., 2016), has been the subject of many books (e.g. Rosenfeld and Wise, 2000; Anderegg, 2003; Lareau, 2003; Warner, 2005; Gray, 2013; Abeles, 2015; Lukianoff and Haidt, 2018), and has been widely covered in the popular press (see Gray, 2010; Rosin, 2015; Rosen, 2015; Khazan, 2016; Avent, 2017 for recent examples). Yet, we find that many children are spending so much time on enrichment that, on the margin, they are actively harming their non-cognitive skills. This is particularly relevant given the widespread evidence of the importance of non-cognitive skills for key economic outcomes later in life (e.g. Heckman and Rubinstein, 2001; Heckman et al., 2006; Waddell, 2006; Lindqvist and Vestman, 2011; Deming, 2017).

The rest of the paper is organized as follows. Section 2 presents the data. Section 3 presents our identification strategy, followed by the results in Section 4. Section 5 discusses these results. Finally, Section 6 concludes.

2 Data

We use data from the Panel Study of Income Dynamics (PSID) and the 1997, 2002 and 2007 waves of the Child Development Supplement (CDS). The CDS data contain detailed

time diary data and extensive measures of cognitive and non-cognitive skills, and are one of only two datasets that can be used for our study.¹ We link the CDS with the PSID, which allows us to build controls related to child, family and environmental characteristics.

¹ To our knowledge, the only other dataset that has cognitive and non-cognitive skill measurements as well as time inputs spanning the entire day is the Longitudinal Study of Australian Children (LSAC). However, the CDS data contain more detailed time-use information, which is used for the precise categorization of the activities and the study of substitution patterns.

The time diaries in each CDS wave collect data on the full 24-hour breakdown of one random weekday and one random weekend day for each child. The child’s activities during the selected days are coded into one of over 300 different categories reported by the child, or by the parent if the child is young, with subsequent editing and help from the PSID interviewer. We exclude cases where the day is described as non-typical, where either the weekday or weekend day data are missing, or where the diary does not cover the full 24 hours. However, when the time slots between 10 p.m. and 6 a.m. are missing we do not exclude the observation and instead record that time as “sleeping,” consistent with prior literature (Fiorini and Keane, 2014; Caetano et al., 2019). Finally, we aggregate the 300+ primitive time-use categories into eight categories: enrichment activities, other enrichment activities, play and social activities, passive leisure, duties/chores, class time, sleep, and other. Figure 1 shows the proportional breakdown of time among these categories.

Our definition of enrichment is intended to capture the kinds of activities that are typically considered to be investments in children’s skills. Therefore, our baseline measure includes only those activities that are unambiguously related to skill development over and above class time in school. In a typical week, children on average spend about 3% of their time on this type of enrichment, or roughly 5 hours/week. Figure 2 shows the breakdown of enrichment activity into various sub-categories. The primary component of this baseline measure is homework, at two-thirds of the total. The next most important component of enrichment is reading a book, at 14% of the total. While 7% of enrichment time is spent on before- or after-school programs, relatively little is spent on each of the remaining categories: other reading (e.g., magazines

and newspapers), being read to (e.g., by parents), other academic lessons (e.g., tutoring, academic courses and lectures), non-academic lessons (e.g., piano and soccer lessons), and other education (e.g., driving lessons, military training).

Figure 1: Daily Time Breakdown. [Pie chart of the share of time spent on Enrichment Activities, Other Enrichment Activities, Play and Social Activities, Passive Leisure, Duties/Chores, Class Time, Sleep, and Other.]
Note: Panel plots the average division of time into the eight activity categories over a typical week. The figure pools the 1997, 2002 and 2007 CDS waves.

Figure 2: Enrichment Time Breakdown. [Pie chart of the share of enrichment time spent on Reading a Book, Other Reading, Being Read To, Homework, Before- or After-School Program, Other Education, Other Academic Lessons, and Non-Academic Lessons.]
Note: Panel plots the average division of enrichment time into different sub-categories over a typical week. The figure pools the 1997, 2002 and 2007 CDS waves.

As a robustness check, we also extend the notion of enrichment by including activities that are sometimes considered enrichment but which do not have such a clear connection to academic skills or human capital as traditionally conceived. This extended measure, which we label “broad enrichment,” includes our standard notion of enrichment plus “other enrichment”: making art/music, visiting museums, organized (structured) sports, volunteer work, the educational use of computers, and so forth. Figure 11 in Appendix A presents the breakdown of “other enrichment” into its constituent pieces, demonstrating that about two-thirds of the category is organized sports.

We also define a number of other time aggregates which we will use in Section 5 to assess substitution patterns between enrichment and other activities. Their breakdown can be seen in Figure 11 in Appendix A. First, we define “passive leisure” as activities that do not involve active, face-to-face social participation (e.g., any screen time, computer games, etc.). Two-thirds of passive leisure consists of watching TV. “Play and social activities,” by contrast, consists of sports (not through school or in an organized league), social interactive games (e.g., board games, hide and seek), hobbies, socializing, social and church groups, etc. A little less than half of the time spent on this category is spent on social interactive games. We define “duties and chores” as all necessary, non-leisure and non-school activities such as household chores, paid work, travel (e.g., commuting, errands), shopping, personal care (hygiene, medical care, etc.), and meals. Traveling, meals and personal care take the most time within this category. “Class time” is defined as time at school for enrolled children and daycare or nursery care for children not in school. “Sleep” is defined as sleep at night, naps, and, as explained above, missing time slots between 10 p.m. and 6 a.m. Altogether, these time use categories are mutually exclusive and exhaustive.
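The aggregation described above can be sketched in a few lines. Everything specific in this sketch is a hypothetical illustration: the activity labels and the mapping dictionary stand in for the 300+ primitive CDS codes, and the weekly total is formed as five times the weekday diary plus two times the weekend diary, which is an assumed weighting for a "typical week" rather than a rule stated in the text.

```python
CATEGORIES = [
    "enrichment", "other_enrichment", "play_social", "passive_leisure",
    "duties_chores", "class_time", "sleep", "other",
]

# Hypothetical mapping from primitive diary labels to the eight aggregates.
ACTIVITY_TO_CATEGORY = {
    "homework": "enrichment",
    "reading a book": "enrichment",
    "organized sports": "other_enrichment",
    "board games": "play_social",
    "watching tv": "passive_leisure",
    "meals": "duties_chores",
    "school": "class_time",
    "night sleep": "sleep",
}

def daily_hours(episodes):
    """Sum a day's (activity, hours) episodes into the eight categories."""
    totals = dict.fromkeys(CATEGORIES, 0.0)
    for activity, hours in episodes:
        # Unmapped activities fall into the residual "other" category.
        totals[ACTIVITY_TO_CATEGORY.get(activity, "other")] += hours
    if abs(sum(totals.values()) - 24.0) > 1e-9:
        raise ValueError("diary does not cover the full 24 hours")
    return totals

def weekly_hours(weekday_episodes, weekend_episodes):
    """Assumed weighting: 5 x weekday diary + 2 x weekend diary."""
    wd, we = daily_hours(weekday_episodes), daily_hours(weekend_episodes)
    return {c: 5 * wd[c] + 2 * we[c] for c in CATEGORIES}

weekday = [("night sleep", 9.0), ("school", 6.5), ("homework", 1.5),
           ("watching tv", 2.0), ("meals", 2.0), ("board games", 1.0),
           ("unclassified", 2.0)]
weekend = [("night sleep", 10.0), ("watching tv", 4.0), ("board games", 5.0),
           ("meals", 3.0), ("organized sports", 2.0)]

week = weekly_hours(weekday, weekend)
print(week)
```

Because every episode lands in exactly one category and each diary must cover 24 hours, the weekly totals are mutually exclusive and sum to 168 hours by construction.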
We create our primary cognitive skill measure by applying iterated principal factor analysis to the standardized letter-word, applied problems, and passage comprehension subtests of the Woodcock-Johnson Revised Tests of Achievement, Form B, which are available in each CDS wave. We likewise construct our non-cognitive skill measure

through iterated principal factor analysis applied to parental assessments captured in 36 questions on the child’s behavior. The factor loadings for these scales are shown in Table 7 in Appendix A.²

Table 1 presents summary statistics for our sample. We have a pooled sample of 4,330 children ranging from 5 to 18 years of age, with an average age of just under 12. While children in our data spend on average a little over five hours per week on enrichment activities, about 30% do not spend any time at all on enrichment. About 40% of the children are black and about 7% are Hispanic. Further, 26% of the children in our sample attend a gifted program, 8% attend a special education program, 1% are home schooled, and 8% attend a private school. Throughout the paper, we classify children as low-, middle- or high-income if their household income falls in the bottom, middle, or top tercile of the sample income distribution, respectively.

We denote by X the vector of observed child, family and environmental characteristics that we use as controls. Care is needed in the specification of X because many of the potential control variables available in our data are likely to be post-determined, and, as discussed in the introduction, including them would change the meaning of our estimates. Our approach therefore is to use only controls that are unambiguously pre-determined. We are able to adopt this parsimonious set of controls because our identification strategy can handle bias stemming from confounding unobservables.
Our list of controls includes child’s age and squared age (in months), and indicators for: CDS wave (1997, 2002 and 2007), grade (thirteen variables, from kindergarten through grade 12), gender, ethnicity (black, Hispanic and other non-white ethnicity), whether the child has siblings, family income tercile, whether the mother is alive, and whether the father is alive.³

² For robustness, we also use the internalizing and externalizing subscales of the behavior problems index (BPI), a standardized scale included in each CDS wave, as alternative measures of non-cognitive skills. The internalizing scale captures the prevalence of withdrawn behaviors, while the externalizing scale captures outwardly aggressive behaviors (Peterson and Zill, 1986). Also for robustness, we use each component of our cognitive skill measure (applied problems, letter word, and passage comprehension) as separate measures of cognitive skill. Our cognitive and non-cognitive measures are all constructed so that a higher score is better and are all normalized to have a mean of zero and a standard deviation of one.

³ For some of these control variables, some observations have a missing value (less than 1% of the sample). In these cases, we include the missing observations in our sample by assigning them a unique value for the relevant control variable and creating an indicator variable for whether that observation had a missing value for that control. We then include these indicators as additional controls. The resulting estimates are very similar to the case where we simply drop all observations with any missing control variables.

Table 1: Summary Statistics

Activities (hours per week)               Mean    Standard Deviation
Enrichment                                5.22     6.00
Other Enrichment                          4.03     6.21
Play and Social Activities               12.30    10.19
Passive Leisure                          17.48    11.94
Duties/Chores                            24.68    11.29
Class                                    30.96    10.78
Sleep                                    67.20     9.19
Other                                     5.34     9.08

Other Variables                           Mean    Standard Deviation
Enrichment = 0                            0.29     0.45
1997 Wave                                 0.26     0.44
2002 Wave                                 0.46     0.50
2007 Wave                                 0.28     0.45
Child is in Grade PreK-5                  0.31     0.46
Child is in Grade 6-8                     0.33     0.47
Child is in Grade 9-12                    0.37     0.48
Child is Female                           0.50     0.50
Child is White                            0.48     0.50
Child is Black                            0.40     0.49
Child is Hispanic                         0.07     0.26
Child Has Siblings                        0.88     0.33
Child is Low-Income                       0.33     0.47
Child is Middle-Income                    0.33     0.47
Child is High-Income                      0.33     0.47
Child’s Father is Alive                   0.97     0.16
Child’s Mother is Alive                   0.99     0.08
Child is in Gifted Program                0.26     0.44
Child is in Special Education Program     0.08     0.27
Child is Home Schooled                    0.01     0.11
Child is in Private School                0.08     0.27
Age (years)                              11.86    39.85

Note: N = 4,330. Activity categories are exhaustive. The 1997, 2002 and 2007 CDS waves are pooled.

As a robustness check, we also estimate alternative specifications where we add as controls some additional variables that may be post-determined, such as whether the child is in a gifted

program, whether the child is in a special education program, whether the child is home schooled, and whether the child attends a private school. Adding these controls barely changes our estimates.

Importantly, we do not include time spent on other activities as controls, since these are determined jointly with enrichment time. We also do not include lagged test scores as controls, even though these are commonly included in the child development literature. Including lagged controls substantially reduces our sample size, as the child would need to be observed in consecutive waves of the CDS. This would also substantially restrict the age range of the children that we can use in the sample, thus not allowing for the breakdown by grade range that we present below. Moreover, our correction strategy renders the use of lagged skills to control endogeneity less important. Indeed, the ability to credibly estimate causal effects without the use of lagged scores is an advantage of our approach.

3 Identification Strategy

Consider the standard outcome equation

$$S = \beta I + h(X) + \epsilon, \tag{1}$$

where S refers to either cognitive or non-cognitive skill, I refers to enrichment time (I stands for “investment,” since enrichment activities are generally undertaken as investments in human capital), X is a vector of observed pre-determined controls, and ϵ is the unobservable error term.

If we ignore any potential endogeneity problem and simply regress S on I and X, the coefficient of I may be biased. Indeed, in Appendix B, we apply Caetano (2015)’s test of exogeneity and show that the evidence of endogeneity in the equation above is overwhelming. Moreover, we show evidence there that the bias in the estimation of β is positive, so that a simple regression of S on I and X would over-estimate the effect of

enrichment on skills.

Without additional assumptions, the effect of enrichment in the equation above, β, cannot be identified. However, the enrichment variable I is of a peculiar nature that can be leveraged to identify β. Specifically, in Section 3.1 we argue that a substantial fraction of the sample would have chosen a negative amount of enrichment if it were possible but were instead constrained to choose zero. Section 3.2 uses this information to incorporate some structure into the model. Finally, in Section 3.3 we show that β can be identified in the augmented model.

3.1 Enrichment is a constrained choice.

Figure 3 plots the empirical cumulative distribution function of enrichment time in our sample and shows that there is substantial bunching at zero. About 30% of children spend no time on enrichment, while the rest are continuously distributed among different, positive levels.

Figure 3: Evidence of Bunching at Zero Enrichment Time. [Plot of the empirical cumulative distribution function (vertical axis, 0 to 1) against hours per week on enrichment activities (horizontal axis, 0 to 50).]
Note: Figure plots the cumulative distribution function of time spent per week on enrichment activities (in hours) for our full sample.

Why does this bunching happen? To answer this, first note that the children who

spend no time on enrichment are discontinuously different in every observable way from the children who spend any positive time at all on enrichment. Figure 4 shows some examples of these discontinuities. The upper left panel of the figure shows a local linear fit of an indicator of whether the child is black conditional on the amount of time the child spends on enrichment, as well as the proportion of children who are black among the children who spend zero time on enrichment. The children at zero are discontinuously more likely to be black than the children who spend marginally positive amounts of time on enrichment. In the header of the panel, we show the p-value of a test of whether the share of black children is continuous at zero enrichment time, and it is clear that we can confidently reject this hypothesis (p = 0.017).

The other panels of Figure 4 show similar patterns. Children who spend no time on enrichment are discontinuously more likely to be male (p = 0.003), to have a mother who works full-time (p = 0.006), to have a mother who was unmarried at the child’s birth (p = 0.036), to not be enrolled in a private school (p = 0.022) and to spend time on passive leisure activities (p = 0.000). That is, in each case, we find that the children at zero seem to be negatively selected on observables associated with higher expected achievement.

The last panel in Figure 4 reflects the stark differences between the lives of the children who spend no time on enrichment and everybody else. In particular, note that the children at zero enrichment spend on average four more hours per week on passive leisure than the children at one hour of enrichment. Since the total number of hours in a week is the same for everyone, this means that, relative to the group of children who spend one hour on enrichment, the group of children at zero enrichment is spending one fewer hour on enrichment and three fewer hours on other activities, potentially more productive than passive leisure, such as play, socializing, and sleep.
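A discontinuity check of this kind can be sketched as follows. This is a rough illustration rather than the test used in the paper: the triangular kernel, the bandwidth of five hours, the heteroskedasticity-robust variance, and the normal approximation for the p-value are all assumptions, and the function name `discontinuity_at_zero` and the simulated data are hypothetical.

```python
import numpy as np
from math import erfc, sqrt

def discontinuity_at_zero(hours, x, bandwidth=5.0):
    """Gap between mean(x | hours == 0) and the local linear limit of
    E[x | hours] as hours -> 0+, with a two-sided normal p-value."""
    hours = np.asarray(hours, dtype=float)
    x = np.asarray(x, dtype=float)

    # Mean of x at the bunching point and the variance of that mean.
    at_zero = hours == 0.0
    mean_zero = x[at_zero].mean()
    var_mean_zero = x[at_zero].var(ddof=1) / at_zero.sum()

    # Weighted least squares on observations just above zero.
    near = (hours > 0.0) & (hours < bandwidth)
    h, y = hours[near], x[near]
    w = 1.0 - h / bandwidth                     # triangular kernel weights
    Z = np.column_stack([np.ones_like(h), h])   # intercept and slope
    bread = np.linalg.inv(Z.T @ (w[:, None] * Z))
    coef = bread @ (Z.T @ (w * y))
    resid = y - Z @ coef
    meat = Z.T @ (((w * resid) ** 2)[:, None] * Z)
    cov = bread @ meat @ bread                  # robust sandwich variance

    gap = mean_zero - coef[0]                   # coef[0] is the limit at 0+
    z = gap / sqrt(var_mean_zero + cov[0, 0])
    p_value = erfc(abs(z) / sqrt(2.0))          # two-sided normal p-value
    return gap, p_value

# Simulated example: 30% of children at zero, and a binary characteristic
# that is more common at zero (0.45) than at positive hours (0.30).
rng = np.random.default_rng(2)
n = 20_000
hours = np.where(rng.random(n) < 0.3, 0.0, rng.exponential(5.0, n))
share = np.where(hours == 0.0, 0.45, 0.30)
x = (rng.random(n) < share).astype(float)

gap, p_value = discontinuity_at_zero(hours, x)
print(f"gap at zero: {gap:.3f}, p-value: {p_value:.2g}")
```

The two pieces are estimated on disjoint groups of children, so their variances are simply added when forming the test statistic.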
In Appendix B, we present evidence that this pattern of negative selection at zero enrichment is also true for unobservables. This pattern, in which every characteristic of the child, observable and unobservable, is so starkly different at zero, can be naturally

explained if enrichment is a choice that is constrained to be non-negative.

Figure 4: Evidence that Bunching is Selective. [Six panels, each plotted against hours per week on enrichment activities (0 to 20): Child is Black (p-value of discontinuity: .017); Child is Male (.003); Mother is Employed Full-Time (.006); Mother Was Married at Birth (.036); Child Attends Private School (.022); Passive Leisure (0.000).]
Note: Each panel shows a plot of the local linear estimator of the expected value of a variable conditional on enrichment time, along with its 90% confidence interval. The expected value of the variable among the children who spent no time on enrichment is also shown, along with its 90% confidence interval. Finally, the p-value of a test for whether there is discontinuity at zero is shown in the header of each panel.

Let us work with this idea. There are two types of enrichment: “desired enrichment,” which is the amount a person would like to choose absent the non-negativity constraint, and “actual

enrichment,” which is the amount they actually choose. In Figure 5 we explore how a single unobservable variable, say “ability,” is mapped to this choice. Suppose that we vary ability, but keep every other characteristic fixed (we will denote these other observable and unobservable characteristics as C in the plot). For every level of ability we expect a certain level of desired enrichment. We suppose that higher levels of ability are related to higher levels of desired enrichment, as depicted in the left panel. Whenever desired enrichment is positive, the constraint is not binding, and thus the desired and actual enrichment curves coincide. However, as we move to lower ability levels, the desired enrichment may be negative, as shown in the dashed curve. Meanwhile the actual enrichment choice cannot be negative, and thus the two curves separate. All those who desire a negative amount of enrichment choose actual enrichment equal to zero.

Figure 5: Relationship Between Child’s Ability and Enrichment Time. [Two panels. Left: E[actual enrichment | ability, C] and E[desired enrichment | ability, C] plotted against ability. Right: E[ability | desired enrichment, C] plotted against desired enrichment time, with E[ability | actual enrichment = 0, C] marked at zero.]
Note: In the left panel, the solid line denotes actual (chosen) enrichment, which is equal to desired enrichment when desired enrichment is non-negative. For negative desired enrichment (dashed line), actual enrichment must be zero. The right panel inverts this relation, showing that this constraint will generate a discontinuity in the expected characteristics of children who do zero enrichment, since that group includes all the children for whom the constraint is binding. C represents all other characteristics that determine enrichment (observed or unobserved).

We can then look at this relationship in an inverse way, by plotting in the right panel the expected ability for every level of enrichment.
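The inverse relationship can be checked numerically. The short simulation below assumes a linear link between ability and desired enrichment with arbitrary illustrative coefficients (none of the numbers come from the paper), censors desired enrichment at zero, and compares average ability at zero with average ability just above zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

ability = rng.normal(0.0, 1.0, n)
# Assumed rule: desired enrichment rises with ability, plus other factors.
desired = 1.0 + 2.0 * ability + rng.normal(0.0, 1.0, n)
actual = np.maximum(0.0, desired)   # the choice cannot be negative

# Average ability among children bunched at zero enrichment.
mean_ability_at_zero = ability[actual == 0.0].mean()

# Average ability among children with a small positive amount of enrichment.
just_above = (actual > 0.0) & (actual <= 0.5)
mean_ability_just_above = ability[just_above].mean()

print(f"mean ability at zero:       {mean_ability_at_zero:.2f}")
print(f"mean ability just above 0:  {mean_ability_just_above:.2f}")
```

The group at zero pools every child whose desired enrichment is negative, however far below zero, so its average ability sits well below that of children who choose a marginally positive amount.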
As explained in the left panel, the group of children who choose zero enrichment includes all those for whom the constraint that enrichment must not be negative was binding, and thus the expected ability for these

children should reflect the fact that they are very selected and very different from their counterparts who chose marginally positive amounts of enrichment. Indeed, the figure shows that the average ability for those choosing zero enrichment, the solid black dot, is discontinuously lower than the average ability for those choosing small, positive levels of enrichment.

3.2 A model with constrained enrichment choice

Given the discussion in the previous section, we understand that desired enrichment, denoted by I*, is a function of characteristics both observable, X, and unobservable, η, in the following equation

$$I^* = g(X) + \eta. \tag{2}$$

This equation alone is not an assumption. First, we are not specifying g in any way. In fact, one can think of η as simply the residual of I* once g(X) is taken out, for any given function g. We are not saying that X and η have a causal relationship with I*, and we do not require any independence between η and X. We do not even suppose that g is identifiable.

We now introduce some structure based on the discussion of the previous section. We suppose that the actual choice of enrichment is constrained:

$$I = \max\{0, I^*\}. \tag{3}$$

Finally, we add additional structure by opening the error term in equation (1) as ϵ = δη + ε:

$$S = \beta I + h(X) + \delta\eta + \varepsilon, \qquad E[\varepsilon \mid I, X] = 0. \tag{4}$$

This equation makes two assumptions. First, it assumes that all the unobservable confounders are indexed by η. The unobservable ε can be simply understood as the

residual $S - E[S \mid I, X, \eta]$, which represents the remaining independent heterogeneity. This is a selection on unobservables model which assumes that I is exogenous only if we also condition on the unobservable term η. Second, and this is our main identifying assumption, equation (4) assumes that η enters linearly, as opposed to entering through a more general nonparametric form such as f(η, X). Appendix C discusses this assumption in depth, and provides evidence that it does not seem to play an important role in our empirical conclusions.⁴

⁴ Equation (4) also seems to make assumptions about the observables, i.e. linearity in I and separability in X. In reality this model and our identification strategy allow for heterogeneous treatment effects (see Caetano et al. (2020)) and thus the true effect of I on S may be of the general form β(I, X, ε), and the effects reported can be interpreted as averages. In any case, in Appendix C, we find that the treatment effects seem to be uncorrelated with I.

3.3 Identifying β

Our model is therefore composed of equations (2), (3) and (4), which together imply

$$E[S \mid I, X] = (\beta + \delta) I + \underbrace{h(X) - \delta g(X)}_{m(X)} + \delta\, E[I^* \mid I^* \le 0, X]\, \mathbf{1}(I = 0). \tag{5}$$

Equation (5) shows that the expected skill conditional on covariates is discontinuous at zero enrichment. To understand this discontinuity, consider the group of children who chose I = 0. Why do we see skill variation in this group? The variation is not due to differences in time spent on enrichment (the first term of the equation) because I = 0 for everyone in this group. Part of the variation in skills is explained by variation in the controls, specifically through the term h(X) − δg(X). However, even if we condition on the controls, there is further variation in skills due to the differences in the unobservable confounder, η. From equation (2), conditional on X, the variation in η is identical to the variation of $I^*$. Thus, if somehow we could identify $E[I^* \mid I^* \le 0, X]$, we could identify δ by relating the variation in skills and the variation in $E[I^* \mid I^* \le 0, X]$ among those who chose zero enrichment. We could then extrapolate by assuming that this bias is the same for those who spend positive amounts of time on enrichment, as in the first term of equation (5).

Explicitly, let us rewrite equation (5) as

$$E[S \mid I, X] = \beta I + m(X) + \delta \left[ I + E[I^* \mid I^* \le 0, X]\, \mathbf{1}(I = 0) \right]. \tag{6}$$

Then, if we could identify $E[I^* \mid I^* \le 0, X]$, we could implement a correction by adding the term $I + E[I^* \mid I^* \le 0, X]\, \mathbf{1}(I = 0)$ to the regression as another control. As long as $E[I^* \mid I^* \le 0, X] < 0$ for some values of X in the data, the correction term $I + E[I^* \mid I^* \le 0, X]\, \mathbf{1}(I = 0)$ will be linearly independent of I (note that $E[I^* \mid I^* \le 0, X]\, \mathbf{1}(I = 0)$ is orthogonal to I). This allows us to identify β and δ separately.

How can $E[I^* \mid I^* \le 0, X]$ be identified? Although we do not observe the latent enrichment choice $I^*$ when it is negative, we do observe it when it is positive, since $I^* = I$ when $I > 0$. Our strategy then is to use observations with $I > 0$ to make an out-of-sample prediction of the average desired investment $I^*$ when $I^* \le 0$. Specifically, we can make assumptions about the shape of the distribution of the confounders η, and relate it to the shape of $I^*$ through equation (2). We explore three assumptions, which are nested and ordered from strongest to weakest:

1. Tobit: $\eta \mid X \sim N(X'\theta, \sigma^2)$ and $m(X) = X'\gamma$.

2. Heteroskedastic Tobit: $\eta \mid X \sim N(l(X), \sigma^2(X))$, which drops both the linearity of m and of the mean, as well as the homoskedasticity requirements, keeping only the normality assumption.

3. Heteroskedastic Tail Symmetry: for all censored quantiles $q_0$, $\eta \mid X$ has symmetric tails below $q_0$ and above $1 - q_0$, which drops the normality assumption but keeps the symmetry between the constrained part of the distribution and the corresponding upper tail.

In Appendix D we show how each of the three assumptions can be leveraged to identify and estimate $E[I^* \mid I^* \le 0, X]$, and how each of these assumptions fits our data. To summarize, there is strong evidence of heteroskedasticity, which means that the Tobit assumption is likely not flexible enough. The heteroskedastic Tobit assumption fits the data quite well, although there is some evidence that the tails of the empirical distributions are fatter than normality implies. The heteroskedastic tail symmetry assumption seems to solve this issue. Irrespective of this evidence, below we show the results for corrections based on all three assumptions, and our conclusions hold for all three cases.

4 Empirical Results

4.1 Full-Sample Estimates

Table 2 presents our main results estimated on the full sample. Column (i), which shows the results of simple regressions of skills on enrichment time without controls (equation (6) without either m(X) or the correction term), demonstrates that both cognitive and non-cognitive skills are strongly positively correlated with enrichment time. Column (ii), which adds controls back into the specifications in column (i) (equation (6) without the correction term), shows that while observables seem to explain part of the correlation between enrichment time and skills, the residual relationships remain positive, particularly for cognitive skills.

The discontinuity plots shown in Figure 4, as well as the evidence discussed in Appendix B (see Figure 14 there), suggest that the uncorrected estimates in columns (i) and (ii) are positively biased. The remaining columns in Table 2 show our corrected estimates of β (equation (6)) under the different assumptions on the distribution of η | X discussed in Section 3.3 ranging from the strongest to the weakest. Column (iii) shows the results when we implement the Tobit strategy to estimate $E[I^* \mid I^* \le 0, X]$, column (iv) shows the results under the heteroskedastic Tobit strategy, and column (v) shows the results under the heteroskedastic tail symmetry strategy.⁵ All standard errors are bootstrapped using 500 iterations.

⁵ Note that implementing the heteroskedastic Tobit and heteroskedastic tail symmetry corrections requires that we discretize X in order to estimate $E[I^* \mid I^* \le 0, X]$. We discretize X using hierarchical clustering, in which observations are grouped into clusters based on the similarity of their observables. All the results reported in the paper use 50 clusters in the estimation of $E[I^* \mid I^* \le 0, X]$. Additionally, for the specification of m(X) in equation (6) we use both the non-clustered control variables described in Section 2 as well as the cluster indicators. For details, please refer to Section E in the Appendix. We also show there that our results do not appear to be an artifact of either the particular way we discretize X, or the number of clusters.

Table 2: Full-Sample Results: The Effect of Enrichment Time on Skills

                   (i)          (ii)         (iii)      (iv)       (v)
                   Uncorrected  Uncorrected  Tobit      Het.       Het.
                   No Controls  w/ Controls             Tobit      Symmetric
Cognitive       β  0.018**      0.011**      -0.004     -0.007     -0.002
                   (0.003)      (0.002)      (0.006)    (0.006)    (0.006)
                δ                            0.013**    0.015**    0.010**
                                             (0.005)    (0.004)    (0.005)
Non-Cognitive   β  0.006**      0.003        -0.015     -0.024**   -0.019*
                   (0.003)      (0.003)      (0.010)    (0.009)    (0.010)
                δ                            0.015*     0.022**    0.018**
                                             (0.008)    (0.007)    (0.008)

Note: N = 4,330. Bootstrapped standard errors in parentheses (500 iterations). The corrected specifications use 50 clusters (see Appendix D for estimation details and Figure 21 in Appendix E for analogous results with different numbers of clusters). ** p<0.05, * p<0.1.

For cognitive skills, all of the corrected estimates are quite similar: the estimated β's fall from 0.011 standard deviations (s.d.) to around -0.004 s.d. The large differences between column (ii), where the estimate is positive and highly significant, and columns

(iii)-(v), where the estimates are negative and insignificant, show that our correction method is able to handle endogeneity which was not absorbed by the pre-determined controls. Our most general correction method (symmetry) yields a 90% confidence interval of [−0.012, 0.008].

Correcting for selection has even more dramatic consequences for the non-cognitive estimates: the corrected non-cognitive β's are negative, with point estimates ranging between -0.024 and -0.015 s.d. The point estimate using our preferred method (column (v)) is -0.019, significantly different from zero at 10%. This is about three times larger in magnitude than the unconditional correlation (column (i)).

The non-cognitive estimates in Table 2 are also economically significant. To see this, consider two otherwise similar children: one who engages in zero enrichment and one who spends 12.5 hrs/week, putting her at the 90th percentile in the full-sample distribution. These 12.5 hours come at the expense of other activities the child could have done instead during that time. The preferred corrected estimates imply that the 90th percentile child would have 0.19 s.d. lower non-cognitive skills than the child at zero. This is a sizeable difference relative to what is often found in the child development and education literatures.⁶

⁶ By way of comparison, the very sizable black-white gap in cognitive skills is generally found to be around 1 s.d. (Neal and Johnson (1996)). Effect sizes of -0.2 s.d. are large in magnitude relative to the literature on teacher value-added, which typically finds that a standard deviation increase in teacher quality corresponds to an increase of roughly 0.05-0.1 s.d. in student achievement (which relates to our measure of cognitive skills) or student behavior (which relates to our measure of non-cognitive skills). See for example Chetty et al. (2014); Kane and Staiger (2008); Kane et al. (2008); Jackson (2018).

For both cognitive and non-cognitive skills, the estimated δs are positive and highly significant, confirming the evidence we presented in Appendix B of large amounts of endogeneity bias in the uncorrected estimates. The fact that the β estimates in the "No Controls" (i) column are larger than in the "Uncorrected" (ii) column provides yet further evidence of positive bias.

Note that the standard errors from column (ii) of Table 2 are much smaller than the standard errors from the corrected models (columns (iii)-(v)). This is a feature of our approach, not a bug. The only difference between the corrected and uncorrected models is the presence of the generated regressor $\hat{E}[I^* \mid I = 0, X = x]$. Adding one regressor will not generally cause the standard errors in a regression to blow up, so the fact that we see an increase in the standard errors in our application suggests greater underlying uncertainty surrounding the true causal effects of enrichment time on skills once endogeneity is accounted for. Not considering this correction term would lead to overly precise, biased estimates. In turn, this could lead to excessively optimistic and confident expectations of policymakers or families regarding the impact of enrichment activities.

Table 8 in Appendix A shows that our baseline results are robust to plausible alternative measures of cognitive and non-cognitive skills. First, we consider each of the components of our cognitive measure separately. For each component (applied problems, letter-word comprehension, and passage comprehension), we find sizeable, positive uncorrected estimates and statistically insignificant corrected estimates. Next, we consider alternative measures of non-cognitive skills based on the internalizing and externalizing subscales of the behavior problems index (BPI) included in the CDS. Here, the uncorrected estimates suggest significant, positive effects for externalizing problems only, while the corrected estimates for both scales are negative and similar in magnitude to the main non-cognitive estimates reported in Table 2.

Our results are also robust to alternative definitions of enrichment time. First, we consider broad enrichment, which expands the notion of enrichment to include additional activities less directly intended for the development of cognitive skills such as organized sports, arts, and volunteering (see Section 2 for details). Table 9 in Appendix A shows that using this broader measure yields remarkably similar estimates to the baseline results presented in Table 2.
The uncorrected estimates again show significant, positive associations between (broad) enrichment and skills, while the corrected estimates again indicate a null effect for cognitive skills and a significant negative effect for non-cognitive skills. Indeed, the corrected non-cognitive point estimate assuming symmetry is very similar

to the baseline estimate and is significant at the 95% level. Conversely, when we do the opposite and restrict enrichment to consist only of homework, we find the same pattern of zero cognitive estimates and even more significant, more negative non-cognitive estimates (Table 10 in Appendix A).

4.2 Estimates by Grade

The full-sample estimates imply that enrichment time, when corrected for selection on unobservables, has roughly no effect on cognitive skills and a significant, negative effect on non-cognitive skills. Here, we break down these results by grade level by applying our method separately for children in different age ranges.

The estimates by grade are presented in Table 3. The uncorrected estimates show that each additional hour of enrichment is associated with a statistically and economically significant increase in cognitive skills for children in middle and high school. Yet, the corrected estimates are all around zero, with some weak evidence of negative effects for high school. The headline result for cognitive skills from the full-sample estimates in Table 2 carries over to each grade range separately: the corrected effect of enrichment on cognitive skills is roughly zero for all grade ranges.

Table 4 repeats the analysis for non-cognitive skills. The uncorrected estimates suggest a significant, positive association between enrichment and non-cognitive skills for high school only. Interestingly, this grade range happens to be exactly the one in which we find the most evidence of endogeneity, as seen by the estimates of δs in columns (iii)-(v). Indeed, the corrected estimates are negative and significant only for high school.

Table 3: Cognitive Estimates by Grade Levels

                   (i)          (ii)         (iii)      (iv)       (v)
                   Uncorrected  Uncorrected  Tobit      Het.       Het.
                   No Controls  w/ Controls             Tobit      Symmetric
PreK-5          β  0.008*       0.000        0.003      0.002      -0.002
                   (0.005)      (0.003)      (0.013)    (0.012)    (0.011)
N=1331          δ                            -0.003     -0.002     0.002
                                             (0.012)    (0.011)    (0.009)
6-8             β  0.020**      0.009**      0.003      -0.001     0.001
                   (0.003)      (0.002)      (0.011)    (0.011)    (0.011)
N=1414          δ                            0.005      0.008      0.007
                                             (0.009)    (0.009)    (0.009)
9-12            β  0.027**      0.013**      -0.008     -0.009     -0.008
                   (0.003)      (0.002)      (0.008)    (0.008)    (0.009)
N=1585          δ                            0.016**    0.017**    0.017**
                                             (0.006)    (0.006)    (0.007)

Table 4: Non-Cognitive Estimates by Grade Levels

                   (i)          (ii)         (iii)      (iv)       (v)
                   Uncorrected  Uncorrected  Tobit      Het.       Het.
                   No Controls  w/ Controls             Tobit      Symmetric
PreK-5          β  0.001        -0.001       0.030      0.026      0.023
                   (0.005)      (0.005)      (0.024)    (0.024)    (0.021)
N=1331          δ                            -0.027     -0.023     -0.020
                                             (0.021)    (0.022)    (0.018)
6-8             β  0.003        -0.003       0.005      0.000      -0.003
                   (0.005)      (0.005)      (0.020)    (0.019)    (0.018)
N=1414          δ                            -0.007     -0.003     0.000
                                             (0.016)    (0.016)    (0.015)
9-12            β  0.012**      0.010**      -0.035**   -0.040**   -0.039**
                   (0.003)      (0.004)      (0.012)    (0.011)    (0.014)
N=1585          δ                            0.035**    0.039**    0.039**
                                             (0.008)    (0.008)    (0.010)

Note (Tables 3 and 4): Number of observations (N) for each grade range is shown. Bootstrapped standard errors in parentheses (500 iterations). The corrected specifications use 50 clusters. ** p<0.05, * p<0.1.

4.3 High School Estimates By Household Income

We want to understand why the non-cognitive effects are negative for high school children. To this end, we break down the high school estimates based on the income of the child's household.

Table 5 presents the cognitive estimates for high school children by household income tercile. The uncorrected estimates show a significant, positive association between cognitive skills and enrichment for each income tercile. By contrast, the corrected estimates are all negative and indistinguishable from zero.

Table 5: Cognitive Estimates by Income Tercile - Grades 9-12

                   (i)          (ii)         (iii)      (iv)       (v)
                   Uncorrected  Uncorrected  Tobit      Het.       Het.
                   No Controls  w/ Controls             Tobit      Symmetric
Low             β  0.024**      0.013**      -0.002     -0.007     -0.012
                   (0.006)      (0.006)      (0.018)    (0.019)    (0.025)
N=468           δ                            0.011      0.014      0.020
                                             (0.013)    (0.014)    (0.020)
Middle          β  0.027**      0.018**      -0.008     -0.006     -0.005
                   (0.005)      (0.005)      (0.015)    (0.014)    (0.018)
N=529           δ                            0.019*     0.017*     0.019
                                             (0.011)    (0.010)    (0.014)
High            β  0.015**      0.009**      -0.009     -0.009     -0.006
                   (0.003)      (0.003)      (0.012)    (0.012)    (0.011)
N=580           δ                            0.015      0.014      0.011
                                             (0.009)    (0.009)    (0.009)

Note: Number of observations (N) for each household income tercile is shown. Bootstrapped standard errors in parentheses (500 iterations). The corrected specifications include 50 clusters. ** p<0.05, * p<0.1.

Table 6 shows the analogous results for non-cognitive skills. The uncorrected estimates suggest some positive association between non-cognitive skills and enrichment. However, the corrected estimates uniformly indicate negative causal effects, particularly for middle- and high-income children. These negative effects are large. For instance, our preferred middle-income estimates (column v) are over twice the magnitude of the unconditional

relationship between enrichment and non-cognitive skills (column i). Correspondingly, the estimated δs are large, positive, and are statistically significant for the top two income terciles, indicating strong positive selection into enrichment within each income group.

Table 6: Non-Cognitive Estimates by Income Tercile - Grades 9-12

                   (i)          (ii)         (iii)      (iv)       (v)
                   Uncorrected  Uncorrected  Tobit      Het.       Het.
                   No Controls  w/ Controls             Tobit      Symmetric
Low             β  0.020**      0.015*       -0.007     -0.008     -0.017
                   (0.007)      (0.008)      (0.026)    (0.025)    (0.035)
N=468           δ                            0.017      0.017      0.026
                                             (0.020)    (0.019)    (0.028)
Middle          β  0.022**      0.019*       -0.040     -0.059**   -0.060*
                   (0.009)      (0.010)      (0.025)    (0.023)    (0.034)
N=529           δ                            0.044**    0.057**    0.064**
                                             (0.017)    (0.016)    (0.027)
High            β  0.002        0.004        -0.034*    -0.030*    -0.028
                   (0.005)      (0.005)      (0.018)    (0.018)    (0.018)
N=580           δ                            0.030**    0.027*     0.025*
                                             (0.015)    (0.014)    (0.014)

Note: Number of observations (N) for each household income tercile is shown. Bootstrapped standard errors in parentheses (500 iterations). The corrected specifications include 50 clusters. ** p<0.05, * p<0.1.

In the next section, we discuss possible reasons why all of our causal estimates of enrichment on cognitive skills gravitate around zero, and why there seems to be a negative effect of enrichment on non-cognitive skills which is particularly concentrated in the high school years for middle- and high-income children.
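The correction behind columns (iii)-(v) of the tables above can be illustrated on simulated data. The following is a minimal sketch and not the estimator described in Appendix D: it assumes no covariates X, a confounder η that is symmetric around its median, and fewer than half of the observations bunched at zero. All parameter values, variable names, and the helper names `symmetric_tail_mean` and `ols` are hypothetical choices made for illustration. Under these assumptions, the mean of the censored lower tail of $I^*$ can be recovered by mirroring the observed upper tail around the median, and adding the term $I + E[I^* \mid I^* \le 0]\,\mathbf{1}(I = 0)$ as a control separates β from δ, in the spirit of equation (6).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical parameter values, chosen only for illustration.
beta_true, delta_true = -0.5, 1.0   # causal effect and confounder loading
g, h = 1.0, 2.0                     # constants standing in for g(X) and h(X)

eta = rng.normal(0.0, 1.0, n)       # unobserved confounder (symmetric)
i_star = g + eta                    # desired enrichment, equation (2)
i_obs = np.maximum(0.0, i_star)     # actual enrichment, equation (3)
# Skills, equation (4), with an independent residual.
skill = beta_true * i_obs + h + delta_true * eta + rng.normal(0.0, 1.0, n)


def symmetric_tail_mean(i_obs):
    """Estimate E[I* | I* <= 0] by mirroring the upper tail around the median.

    Assumes I* is symmetric around its median and that fewer than half of
    the observations are censored at zero, so the median is observed.
    """
    q0 = np.mean(i_obs == 0.0)                 # share bunched at zero
    median = np.median(i_obs)
    upper_cut = np.quantile(i_obs, 1.0 - q0)   # mirror image of the zero cutoff
    upper_mean = i_obs[i_obs >= upper_cut].mean()
    return 2.0 * median - upper_mean


def ols(y, *regressors):
    """Least-squares coefficients with an intercept in position 0."""
    design = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


# Uncorrected regression of skills on enrichment.
naive_beta = ols(skill, i_obs)[1]

# Corrected regression: add I + E[I* | I* <= 0] * 1(I = 0) as a control.
tail_mean = symmetric_tail_mean(i_obs)
correction = i_obs + tail_mean * (i_obs == 0.0)
_, beta_hat, delta_hat = ols(skill, i_obs, correction)

print(f"naive beta:     {naive_beta: .3f}")
print(f"corrected beta: {beta_hat: .3f} (true {beta_true})")
print(f"delta:          {delta_hat: .3f} (true {delta_true})")
```

In this simulated setting the uncorrected slope is positive even though the true effect is negative, which mirrors the sign reversal between columns (ii) and (iii)-(v). With covariates, the tail mean would have to be estimated separately for each value or cluster of X, which this sketch does not attempt.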

5 Discussion

5.1 A Simple Model of Enrichment Time Allocation

At first sight, our empirical results are puzzling. How could it be that spending more time on reading, studying, extracurricular lessons, and other such activities does not improve cognitive skills? Further, how could the effects of these activities on non-cognitive skills be negative? Here we discuss one possible rationalization of these results using a stylized model of time allocation.

This model aims to capture two key ideas. First, our empirical approach measures the causal effect of spending more time on enrichment relative to alternative uses of time. Every additional hour that a child spends on enrichment is an hour not spent on some other activity. If the activities foregone in favor of enrichment have, on the margin, greater returns than enrichment, then the net effect of spending more time on enrichment will be negative. The second idea is that children and families will not generally be able to choose enrichment so as to simultaneously maximize both cognitive and non-cognitive skills. Thus, if improving cognitive skills tends to be the objective, the resulting choice of enrichment might be beyond the non-cognitive optimum, leading to negative net non-cognitive returns on the margin.

Suppose that the total time budget is normalized to 1 and there are only two possible activities: enrichment time $I$ and leisure time $L = 1 - I$. Skills are produced from enrichment and leisure according to

$$S_c = f_c(I, L), \qquad S_{nc} = f_{nc}(I, L), \tag{7}$$

where $S_c$ and $S_{nc}$ denote cognitive and non-cognitive skills, respectively.

5.1.1 How can the cognitive estimates be zero?

Consider a hypothetical, stylized scenario where families choose enrichment so as to maximize cognitive skills. Assuming differentiability and an interior solution, the optimal

allocation of enrichment time for cognitive skills, $I_c$, will satisfy the first order condition

$$\frac{\partial}{\partial I} f_c(I_c, 1 - I_c) = \frac{\partial}{\partial L} f_c(I_c, 1 - I_c). \tag{8}$$

Equation (8) states that the optimal enrichment choice equalizes the marginal return to enrichment and leisure activities for cognitive skills. Intuitively, if the marginal return of an additional hour of enrichment is higher than the marginal return for leisure, then it is worth it to substitute one hour from leisure to enrichment. Therefore, if children and families chose enrichment so as to maximize cognitive skills, the marginal effect of enrichment on cognitive skill will be

$$\begin{aligned}
\frac{dS_c}{dI} &= \frac{\partial}{\partial I} f_c(I_c, 1 - I_c) + \frac{\partial}{\partial L} f_c(I_c, 1 - I_c) \cdot \frac{d(1 - I)}{dI} \\
&= \frac{\partial}{\partial I} f_c(I_c, 1 - I_c) - \frac{\partial}{\partial L} f_c(I_c, 1 - I_c) \\
&= 0.
\end{aligned}$$

At the optimum, the marginal effect of investment on cognitive skills should be zero. This result might explain why our causal cognitive estimates are close to zero in all cases. If children and families choose enrichment so as to maximize cognitive skills, causal estimates around zero are exactly what we would expect to find.

This example considers only two activities, enrichment and leisure. However, a similar result holds when there are more than two activities. This conclusion also continues to hold approximately even if some children bunch at $I_c = 0$, provided that sufficiently many choose interior solutions ($I_c > 0$).⁷

⁷ In fact, in this scenario we would expect cognitive estimates to be slightly negative and small, exactly as we are finding them to be. Indeed, while the net cognitive returns to enrichment would be zero for those in interior solutions, they would be negative for those at the corner solution. If net cognitive returns to enrichment were not negative at zero for the bunched children, we would expect them to choose positive values of enrichment instead.

In practice, the exact returns of each activity are not precisely known, and thus enrichment is likely not chosen to be exactly $I_c$. One may even wonder how realistic it is to suppose that families and children maximize cognitive skills when choosing enrichment. Over 70% of enrichment is composed of activities that strongly target academic cognitive skills (mainly homework). In such cases, grades and test scores are likely the main optimization objective, and these outcomes align closely with our definition of cognitive skills. However, the choice of other enrichment activities such as extra-curricular classes and reading may be more complex than the simple maximization of cognitive skills. For example, families may be optimizing something else entirely, such as college acceptance, or a combination of cognitive skills and other considerations, such as the enjoyment of the activity, caving to social pressure, or belonging to a social group. Nevertheless, the model conclusions will still hold approximately in all these cases provided cognitive skills are given enough consideration in the objective function of children and families.

5.1.2 How can the non-cognitive estimates be negative?

Next, we discuss why enrichment activities may yield a negative effect on non-cognitive skills, particularly for middle- and high-income children in high school. We begin by arguing that the optimal amount of enrichment for cognitive skills, $I_c$, will generally be different from the optimal amount of enrichment for non-cognitive skills, $I_{nc}$. Figure 6 explains this point. It plots hypothetical cognitive and non-cognitive production functions as a function of enrichment for later grades. The cognitive-maximizing point, $I_c$, lies to the right of the non-cognitive-maximizing point, $I_{nc}$. Around $I_c$, the marginal return to enrichment is close to zero for cognitive skills and negative for non-cognitive skills, reflecting the findings of Section 4 for high school.
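The logic of the two optima can be reproduced numerically. The sketch below assumes hypothetical production functions $f_c(I, L) = 2\sqrt{I} + \sqrt{L}$ and $f_{nc}(I, L) = 0.5\sqrt{I} + \sqrt{L}$ with $L = 1 - I$; these functional forms and weights are illustrative assumptions and are not taken from the paper. It locates each optimum on a grid and evaluates the net marginal return to enrichment at the cognitive optimum.

```python
import numpy as np


def f_c(i):
    """Hypothetical cognitive production function with L = 1 - I."""
    return 2.0 * np.sqrt(i) + np.sqrt(1.0 - i)


def f_nc(i):
    """Hypothetical non-cognitive production function with L = 1 - I."""
    return 0.5 * np.sqrt(i) + np.sqrt(1.0 - i)


def net_marginal_return(f, i, step=1e-6):
    """Central-difference estimate of dS/dI, with leisure adjusting as L = 1 - I."""
    return (f(i + step) - f(i - step)) / (2.0 * step)


grid = np.linspace(0.0001, 0.9999, 9999)   # interior allocations only
i_c = grid[np.argmax(f_c(grid))]           # cognitive optimum
i_nc = grid[np.argmax(f_nc(grid))]         # non-cognitive optimum

print(f"I_c = {i_c:.3f}, I_nc = {i_nc:.3f}")
print(f"dS_c/dI at I_c  = {net_marginal_return(f_c, i_c): .4f}")
print(f"dS_nc/dI at I_c = {net_marginal_return(f_nc, i_c): .4f}")
```

For these assumed functions the first order condition gives $I_c = 0.8$ and $I_{nc} = 0.2$ in closed form, so a family choosing $I_c$ sees a net cognitive return of about zero and a net non-cognitive return of about −0.84 per unit of time. The magnitudes depend entirely on the assumed functional forms; only the sign pattern is the point of the example.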
To see how this can arise, consider someone who spends $I_c - 1$ hours on enrichment and is considering doing one extra hour of enrichment, say homework. Hypothetically, this extra hour of homework could have a positive direct effect on cognitive skills and a small negative indirect effect through the foregone substituted activities (e.g., play, sleep) for a total positive net effect. However, homework may have only a small positive direct effect on non-cognitive skills while having a very negative indirect effect through the foregone activities, for a net negative effect. Therefore to maximize cognitive skills one may want to spend one more hour on homework, while to maximize non-cognitive skill one may not.

Figure 6: Cognitive and Non-Cognitive Skills Production - High School
[Figure: skills (vertical axis) plotted against enrichment (horizontal axis), showing the curves $f_c(I, L)$ and $f_{nc}(I, L)$, with $I_{nc}$ and $I_c$ marked on the horizontal axis.]
Note: This figure illustrates a potential explanation for our findings of zero net effects on cognitive skills and negative net effects on non-cognitive skills for youth in high school. The top curve shows how cognitive skills vary causally with enrichment. The lower curve shows the analogous relationship for non-cognitive skills. $I_c$ is the level of enrichment that maximizes cognitive skills, and $I_{nc}$ is the level of enrichment that maximizes non-cognitive skills. Around $I_c$, the net effect of enrichment on cognitive skills is close to zero and its corresponding effect on non-cognitive skills is negative.

The trade-off between maximizing cognitive and non-cognitive skills depends entirely on which activities we are considering. While the trade-off in the case of homework may be high, the trade-off in the case of other activities with a higher social component, for example, may not be as pronounced. This may explain why the effect of enrichment on non-cognitive skills is more negative in high school. If the composition of enrichment changes from activities with low trade-offs (social extra-curriculars) to activities with high trade-offs (homework), we would expect the effect of enrichment on non-cognitive skills to become more negative in higher grades.

Indeed, this seems to be the case. Figure 7 shows that the composition of enrichment time in higher grades is more focused on activities that tend to provide a high direct effect

on cognitive skills while having little or no effect on non-cognitive skills. In particular, the average share of enrichment time devoted to homework increases notably, from 52% in the PreK-5 group to 79% in grades 9-12.⁸ At the same time, the average share devoted to reading books falls from 16% to 10%, and the time spent in before- and after-school programs (which often involves socializing with other children) declines precipitously from 18% to 0%.

⁸ Note that this comparison is likely to understate the true disparity, since the nature of the homework across grades is different as well, with high school homework likely being less associated with non-cognitive skills (and more associated with cognitive skills) than homework in earlier grades.

Figure 7: Enrichment Time Breakdowns by Grade Level
[Figure: three pie charts (Grades PreK-5, Grades 6-8, Grades 9-12) showing the share of enrichment time in each category: Reading a Book, Other Reading, Being Read To, Homework, Before- or After-School Program, Other Education, Other Academic Lessons, Non-Academic Lessons.]
Note: Panels plot the average division of time into different categories over a typical week for each grade level. The figure pools the 1997, 2002 and 2007 CDS waves.

How can we explain the breakdown of the high school results by household income presented in Section 4.3? There we see that low-income youth in high-school have a slightly negative effect for non-cognitive skills, while the middle- and high-income youth have very negative effects. One possible explanation for this pattern could be that the composition of enrichment time in high school might differ by household income, as it does by age. However, we do not find evidence to support this hypothesis: Figure 12 in Appendix A shows that the composition of enrichment time in high school is on average very similar across the different income terciles. All three groups spend about the same share on each enrichment activity, including homework.

Figure 8 may help explain the difference in the non-cognitive estimates between high- and low-income youth. It shows that high-income children spend considerably more time on enrichment than middle- and low-income children. Thus, since the composition of enrichment is the same, we might expect high-income children to have relatively more negative non-cognitive estimates simply due to diminishing returns.⁹

We still need to understand the difference in the non-cognitive estimates between medium- and low-income youth, since both the amount of time spent on enrichment, as well as the composition of enrichment, is the same across those terciles. In the next section, we argue that substitution patterns may explain the difference: at the margin, low- and middle-income children spend time on enrichment at the expense of different activities.

5.2 Which activities are crowded out by enrichment?

In this section, we attempt to gain some insight into the substitution patterns between enrichment and other activities. When high-schoolers do an additional hour of enrichment, from which activities is that hour taken? In particular, we ask if these activities are different depending on household income.
A detailed analysis aimed at obtaining the exact extent of substitution between enrichment and each alternative activity would require the identification of causal substitution effects and is beyond the scope of this paper. Nevertheless, here we use some of the ideas explored so far in this paper to provide suggestive evidence that children in high school substitute enrichment away from different activities depending on their income. We find that middle-income children forego activities that would have more benefits to non-cognitive skills than low- and high-income children.

⁹ Note that cognitive estimates are all near zero across income terciles, which is consistent with the idea that all income groups are choosing levels of enrichment near their corresponding $I_c$. The fact that $I_c$ for high-income children is larger than $I_c$ for their lower-income counterparts may be due to differences in the production function. Indeed, it is plausible that high-income children have access to additional resources that might be complementary to enrichment activities (e.g., smaller class sizes, better teachers, etc).

Figure 8: Enrichment Time by Income Tercile: Grades 9-12
[Figure: Empirical Cumulative Density Function (vertical axis, 0 to 1) of Hours Per Week on Enrichment Activities (horizontal axis, 0 to 50), with separate curves for Low-Income, Medium-Income, and High-Income children.]
Note: This figure shows the empirical cumulative distribution functions of enrichment time for each household income tercile, among high school children.

We begin with the low-income tercile in Figure 9. Each panel plots the average time spent on the other activity categories described in Section 2 for each level of enrichment. The figure shows that as low-income children spend more time on enrichment, they tend to spend less time on play and social activities, passive leisure, duties/chores, sleep, and class time. The relationship between enrichment and other enrichment activities (activities included in the broad enrichment category but not included in the baseline enrichment category) is roughly flat.

Figure 9: Child Time Use by Enrichment Time: Low-Income Children in Grades 9-12
[Figure: six panels, each plotted against Hours Per Week on Enrichment Activities (0 to 20). P-values of discontinuity by panel: Play and Social Activities, .01; Passive Leisure, .399; Duties/Chores, .358; Other Enrichment Activities, .056; Sleeping, .447; Class, 0.000.]
Note: Each panel shows a plot of the local linear estimator of the expected value of a variable for a given amount of time spent on enrichment, along with its 90% confidence interval. The expected value of the variable for the children who spent no time on enrichment is also shown, along with its 90% confidence interval. Finally, the p-value of a test for whether there is discontinuity at zero time on enrichment is also shown in the header of each panel.

Of course, these relationships are not necessarily causal, so they may not reflect actual substitution. However, we can use Caetano (2015)'s test of exogeneity (see Section 3 and Appendix B) to gain insight into whether endogeneity is likely to play a major role in these observed raw correlations. For instance, the average amount of play and social

ff activities for the zero-enrichment children is starkly di erent from the average for those children who do just one or two hours of enrichment per week. This suggests that the raw correlation between play and social activities and enrichment is not necessarily causal – the discontinuity is evidence of uncontrolled-for endogeneity. By contrast, there is no evidence of a discontinuity at zero (p=0.399) for passive leisure, suggesting that the very negative gradient between passive leisure and enrichment may be causal, and thus imply some substitution. Extending this logic to the other panels, we conclude that there is some evidence that high school children from low-income households substitute toward enrichment away from passive leisure, duties/chores, and sleep. The gradient between enrichment and passive leisure is much steeper than the analogous gradients for sleep and duties/chores, suggesting that the evidence of substitution away from passive leisure is the strongest. Analogously, Figure 10 shows some evidence that high school children from middleincome households substitute toward enrichment away from play/social activities and duties/chores. The strongest evidence points to play/social activities, which has a much steeper slope. ff The apparent di erences in substitution patterns by household income may help exff plain why the non-cognitive e ect for the middle-income group is particularly negative relative to the low-income group despite their similar levels and compositions of enrichment. Middle-income high school students tend to substitute toward enrichment away from play/social activities, while low-income high school students tend to substitute toward enrichment away from passive leisure (screen time, consisting mostly of watching TV). Clearly, the opportunity cost of enrichment is higher for middle-income students, as play/socialactivitiesareknowntobebeneficialtonon-cognitiveskillsrelativetowatching ff TV ((Lukiano and Haidt, 2018)). 35

Figure 10: Child Time Use by Enrichment Time: Middle-Income Children in Grades 9-12. [Six panels, each plotted against hours per week on enrichment activities (0 to 20). P-values of discontinuity at zero: Play and Social Activities, .38; Passive Leisure, .025; Duties/Chores, .338; Other Enrichment Activities, .06; Sleeping, .032; Class, 0.000.]

Note: Each panel shows a plot of the local linear polynomial estimator of the expected value of a variable for a given amount of time spent on enrichment, along with its 90% confidence interval. The expected value of the variable for the children who spent no time on enrichment is also shown, along with its 90% confidence interval. Finally, the p-value of a test for whether there is a discontinuity at zero time on enrichment is also shown in the header of each panel.

Figure 13 in Appendix A presents the analogous results for high-income children, suggesting that they substitute toward enrichment away from passive leisure, play/social activities, duties/chores, and other enrichment, with the strongest evidence for passive leisure due to its much steeper slope. It seems that differences in enrichment totals explain most of the differences in the non-cognitive returns to enrichment between high- and low-income youth, as the substitution patterns seem similar.

In sum, differences in substitution seem to explain most of the difference in the non-cognitive effects between low- and middle-income children, while the differences in total time spent on enrichment seem to explain most of the difference in the non-cognitive effects between low- and high-income children.

6 Conclusion

In this paper, we estimate the total effect of time spent on enrichment activities on cognitive and non-cognitive skills. We propose an endogeneity correction which leverages the bunching at zero enrichment generated by the constraint that the choice of enrichment cannot be negative.

Our results suggest that the sizable, positive correlations observed between enrichment time and childhood skills are mostly driven by unobservables. Correcting for the bias introduced by these unobservables, we find that the net causal effect of enrichment activities is negligible and may even be negative for cognitive skills. Regarding non-cognitive skills, the corrected estimates are also negligible in earlier grades, but quite negative and very significant in high school. The negative high school effects for non-cognitive skills are particularly large for middle- and high-income children.

We interpret our results through the lens of a model of time allocation and skill production. We argue that if parents and children put a lot of weight on cognitive skill production when choosing their level of enrichment, we would expect marginal cognitive returns to enrichment to gravitate towards zero. However, parents and children cannot maximize cognitive and non-cognitive skills at the same time. If there is a practical tradeoff between maximizing cognitive skills and non-cognitive skills when choosing the level of enrichment, then the non-cognitive returns might be negative.

This model may also explain why the non-cognitive effects are quite negative in high school. Intensifying competition for college admissions means that high school may be a time when enrichment is especially geared towards cognitive skills and away from non-cognitive skills. We show that this is indeed the case, as enrichment shifts towards more homework and fewer social activities as children get older.

The more negative effects for high-income in comparison to low-income youth in high school may be explained by the fact that high-income youth spend substantially more time on enrichment. The more negative effects for middle-income in comparison to low-income youth in high school may be explained by the substitution patterns: middle-income youth tend to choose their last hour of enrichment at the expense of play and social activities. In contrast, low-income youth tend to choose their last hour of enrichment at the expense of TV. Social activities are likely more beneficial for non-cognitive skills than TV, so the opportunity cost of enrichment for middle-income children is higher.

Finally, we call attention to the need for the development of further, larger data sources connecting time use and skills. Currently, the question posed in this paper can only be studied with two datasets, both with limited sample sizes (CDS-PSID and LSAC). Larger datasets would not only improve the precision of the estimates but would also allow a more complete study of causal substitution patterns.

References

Abeles, V. (2015). Beyond Measure: Rescuing an Overscheduled, Overtested, Underestimated Generation. Simon and Schuster, New York, NY.

Aguiar, M. and Hurst, E. (2007). Measuring Trends in Leisure: The Allocation of Time Over Five Decades. Quarterly Journal of Economics, 122:969–1006.

Anderegg, D. (2003). Worried All the Time: Rediscovering the Joy in Parenthood in an Age of Anxiety. Free Press, New York, NY.

Avent, R. (2017). High-Pressure Parenting. The Economist, Feb/Mar 2017.

Bernal, R. and Keane, M. P. (2011). Child Care Choices and Children's Cognitive Achievement: The Case of Single Mothers. Journal of Labor Economics, 29(3):459–512.

Bianchi, S. M. (2000). Maternal Employment and Time with Children: Dramatic Change or Surprising Continuity? Demography, 37:401–414.

Bonhomme, S., Lamadon, T., and Manresa, E. (2017). Discretizing Unobserved Heterogeneity. Working Paper.

Bonhomme, S. and Manresa, E. (2015). Grouped Patterns of Heterogeneity in Panel Data. Econometrica, 83(3):1147–1184.

Caetano, C. (2015). A Test of Exogeneity Without Instrumental Variables in Models With Bunching. Econometrica, 83(4):1581–1600.

Caetano, C., Caetano, G., and Nielsen, E. (2020). Correcting Endogeneity Bias in Models with Bunching. Working Paper.

Caetano, G., Kinsler, J., and Teng, H. (2019). Towards causal estimates of children's time allocation on skill development. Journal of Applied Econometrics, 34(4):588–605.

Chetty, R., Friedman, J. N., and Rockoff, J. E. (2014). Measuring the Impacts of Teachers I: Evaluating Bias in Teacher Value-Added Estimates. American Economic Review, 104(9):2593–32.

Deming, D. J. (2017). The Growing Importance of Social Skills in the Labor Market. The Quarterly Journal of Economics, 132(4):1593–1640.

Doepke, M. and Zilibotti, F. (2017). Parenting With Style: Altruism and Paternalism in Intergenerational Preference Transmission. Econometrica, 85:1331–1371.

Doepke, M. and Zilibotti, F. (2019). Love, Money, and Parenting: How Economics Explains the Way We Raise Our Kids. Princeton University Press.

Duncan, G. and Murnane, R. (2011). Restoring Opportunity: The Crisis of Inequality and the Challenge for American Education. Russell Sage, New York, NY.

Fiorini, M. and Keane, M. P. (2014). How the Allocation of Children's Time Affects Cognitive and Noncognitive Development. Journal of Labor Economics, 32(4):787–836.

Ginsburg, K. R. et al. (2007). The Importance of Play in Promoting Healthy Child Development and Maintaining Strong Parent-Child Bonds. Pediatrics, 119(1):182–191.

Gray, P. (2010). The Decline of Play and Rise in Children's Mental Disorders. Psychology Today, Jan 2010.

Gray, P. (2011). The Decline of Play and the Rise of Psychopathology in Children and Adolescents. American Journal of Play, 3(4):443–463.

Gray, P. (2013). Free to Learn: Why Unleashing the Instinct to Play Will Make Our Children Happier, More Self Reliant and Better Students for Life. Basic Books, New York, NY.

Gray, P. (2019). Evolutionary Functions of Play: Practice, Resilience, Innovation, and Cooperation. In Smith, P. K., editor, The Cambridge Handbook of Play: Developmental and Disciplinary Perspectives, pages 84–102. Cambridge University Press, Cambridge.

Heckman, J. J. (1979). Sample Selection Bias as a Specification Error. Econometrica, 47(1):153–161.

Heckman, J. J. and Rubinstein, Y. (2001). The Importance of Noncognitive Skills: Lessons from the GED Testing Program. American Economic Review, 91(2):145–149.

Heckman, J. J., Stixrud, J., and Urzua, S. (2006). The Effects of Cognitive and Noncognitive Abilities on Labor Market Outcomes and Social Behavior. Journal of Labor Economics, 24(3):411–482.

Jackson, C. K. (2018). What Do Test Scores Miss? The Importance of Teacher Effects on Non–Test Score Outcomes. Journal of Political Economy, 126(5):2072–2107.

Jarvis, P., Newman, S., and Swiniarski, L. (2014). On 'Becoming Social': The Importance of Collaborative Free Play in Childhood. International Journal of Play, 3(1):53–68.

Kane, T. J., Rockoff, J. E., and Staiger, D. O. (2008). What Does Certification Tell Us About Teacher Effectiveness? Evidence from New York City. Economics of Education Review, 27(6):615–631.

Kane, T. J. and Staiger, D. O. (2008). Estimating Teacher Impacts on Student Achievement: An Experimental Evaluation. Working Paper 14607, National Bureau of Economic Research.

Khazan, O. (2016). Ending Extracurricular Privilege. The Atlantic, Dec 2016.

Lareau, A. (2003). Unequal Childhoods: Class, Race, and Family Life. University of California Press, Berkeley, CA.

Lin, C.-C. and Ng, S. (2012). Estimation of Panel Data Models with Parameter Heterogeneity when Group Membership is Unknown. Journal of Econometric Methods, 1(1):42–55.

Lindqvist, E. and Vestman, R. (2011). The Labor Market Returns to Cognitive and Noncognitive Ability: Evidence from the Swedish Enlistment. American Economic Journal: Applied Economics, 3(1):101–28.

Lukianoff, G. and Haidt, J. (2018). The Coddling of the American Mind: How Good Intentions and Bad Ideas are Setting up a Generation for Failure. Penguin.

Luthar, S. S. (2003). The Culture of Affluence: Psychological Costs of Material Wealth. Child Development, 74(6):1581–1593.

Luthar, S. S. and Becker, B. E. (2002). Privileged But Pressured? A Study of Affluent Youth. Child Development, 73(5):1593–1610.

Neal, D. A. and Johnson, W. R. (1996). The Role of Premarket Factors in Black-White Wage Differences. The Journal of Political Economy, 104:869–895.

Peterson, J. and Zill, N. (1986). Marital Disruption, Parent-Child Relationships, and Behavioral Problems in Children. Journal of Marriage and the Family, 48:295–307.

Ramey, G. and Ramey, V. (2010). The Rug Rat Race. Working Paper.

Rosen, R. (2015). Why Affluent Parents Put So Much Pressure on Their Kids. The Atlantic, Dec 2015.

Rosenfeld, A. A. and Wise, N. (2000). The Over-Scheduled Child: Avoiding the Hyper-parenting Trap. St Martin's Griffin, New York, NY.

Rosin, H. (2015). The Silicon Valley Suicides. The Atlantic.

Todd, P. and Wolpin, K. (2007). The Production of Cognitive Achievement in Children: Home, School, and Racial Test Score Gaps. Journal of Human Capital, 1(1):91–136.

Veiga, G., Neto, C., and Rieffe, C. (2016). Preschoolers' Free Play: Connections with Emotional and Social Functioning. International Journal of Emotional Education, 8(1):48–62.

Villaire, T. (2003). Families on the Go: Active or Hyperactive? Our Children, 28(5):4–5.

Waddell, G. R. (2006). Labor-Market Consequences of Poor Attitude and Low Self-Esteem in Youth. Economic Inquiry, 44(1):69–97.

Walker, M. (2017). Why We Sleep: Unlocking the Power of Sleep and Dreams. Simon and Schuster.

Warner, J. (2005). Perfect Madness: Motherhood in the Age of Anxiety. Riverhead Books, New York, NY.

A Supporting Tables and Figures

Figure 11: Time Breakdowns - Other Time Aggregates. [Four pie charts showing the average division of time within each aggregate: Other Enrichment Activities; Passive Leisure; Play and Social Activities; Duties/Chores.]

Note: Panels plot the average division of time into different categories over a typical week for our full CDS sample. The figure pools the 1997, 2002 and 2007 CDS waves.

Table 7: Cognitive and Non-Cognitive Factor Loadings

                                                  1997  2002  2007
Cognitive Skills
Letter Word                                       0.95  0.94  0.85
Applied Problems                                  0.89  0.89  0.76
Passage Comprehension                             0.96  0.96  0.90
Non-Cognitive Skills
Cheat or tells lies                               0.46  0.52  0.56
Bullies or mean to others                         0.55  0.56  0.51
Feels no regret after misbehaving                 0.41  0.45  0.43
Breaks things on purpose                          0.46  0.48  0.47
Has sudden changes in mood                        0.55  0.56  0.58
Feels no love                                     0.49  0.52  0.57
Too fearful or anxious                            0.41  0.47  0.50
Feels worthless or inferior                       0.48  0.53  0.64
Sad or depressed                                  0.52  0.55  0.64
Cries too much                                    0.42  0.36  0.38
Easily confused                                   0.50  0.53  0.53
Has obsessions                                    0.51  0.51  0.60
Rather high strung, tense and nervous             0.48  0.54  0.53
Argues too much                                   0.60  0.59  0.59
Disobedient                                       0.51  0.58  0.57
Stubborn, sullen, or irritable                    0.61  0.61  0.64
Has a very strong temper                          0.59  0.65  0.64
Has difficulty concentrating                      0.57  0.59  0.59
Impulsive, or acts without thinking               0.62  0.62  0.62
Restless or overly active                         0.55  0.52  0.49
Has trouble getting along with other children     0.59  0.59  0.59
Not liked by other children                       0.44  0.43  0.50
Withdrawn, does not get involved with others      0.37  0.43  0.45
Clings to adults                                  0.32  0.31  0.27
Demands a lot of attention                        0.58  0.53  0.54
Too dependent on others                           0.43  0.46  0.49
Thinks before acting, not impulsive               0.52  0.52  0.58
Generally well behaved, does what adults request  0.53  0.59  0.60
Can get over being upset quickly                  0.42  0.44  0.51
Waits turns in games and other activities         0.47  0.52  0.49
Gets along well with other children               0.60  0.62  0.61
Admired by other children                         0.55  0.55  0.57
Cheerful, happy                                   0.42  0.48  0.58
Tries things for himself/herself                  0.35  0.34  0.46
Does neat, careful work                           0.39  0.41  0.49
Curious and exploring, likes new experiences      0.12  0.21  0.26

Note: Cognitive and non-cognitive factor loadings are shown for each CDS wave.

Table 8: Uncorrected and Corrected Results – Alternative Skill Measures

                            (i)          (ii)         (iii)     (iv)      (v)
                            Uncorrected  Uncorrected  Tobit     Het.      Het.
                            No Controls  w/ Controls            Tobit     Symmetric
Cognitive
Applied Problems       β    0.013**      0.008**      0.000     -0.004    0.000
                            (0.003)      (0.002)      (0.006)   (0.005)   (0.006)
                       δ                              0.007     0.010**   0.007
                                                      (0.005)   (0.004)   (0.005)
Letter Word            β    0.012**      0.008**      -0.001    -0.003    -0.001
                            (0.003)      (0.001)      (0.006)   (0.005)   (0.006)
                       δ                              0.007     0.009**   0.007
                                                      (0.005)   (0.004)   (0.005)
Passage Comprehension  β    0.014**      0.009**      -0.001    -0.003    0.000
                            (0.003)      (0.001)      (0.006)   (0.005)   (0.006)
                       δ                              0.009*    0.010**   0.008*
                                                      (0.005)   (0.004)   (0.005)
Non-Cognitive
External               β    0.010**      0.006**      -0.014    -0.023**  -0.017*
                            (0.002)      (0.002)      (0.009)   (0.008)   (0.009)
                       δ                              0.016**   0.023**   0.018**
                                                      (0.008)   (0.007)   (0.008)
Internal               β    0.002        -0.001       -0.021**  -0.026**  -0.017*
                            (0.003)      (0.003)      (0.010)   (0.009)   (0.010)
                       δ                              0.016**   0.020**   0.013
                                                      (0.008)   (0.007)   (0.008)

Note: N=4,330. Bootstrapped standard errors in parentheses (500 iterations). The corrected specifications use 50 clusters. ** p<0.05, * p<0.1.

Table 9: Uncorrected and Corrected Results – Broad Enrichment

                            (i)          (ii)         (iii)     (iv)      (v)
                            Uncorrected  Uncorrected  Tobit     Het.      Het.
                            No Controls  w/ Controls            Tobit     Symmetric
Cognitive              β    0.024**      0.010**      -0.001    -0.003    -0.002
                            (0.002)      (0.001)      (0.007)   (0.006)   (0.006)
                       δ                              0.011*    0.012**   0.011**
                                                      (0.006)   (0.006)   (0.005)
Non-Cognitive          β    0.009**      0.007**      -0.018*   -0.018**  -0.017**
                            (0.002)      (0.002)      (0.011)   (0.009)   (0.008)
                       δ                              0.023**   0.023**   0.021**
                                                      (0.009)   (0.008)   (0.007)

Note: N=4,330. Bootstrapped standard errors in parentheses (500 iterations). The corrected specifications use 50 clusters. ** p<0.05, * p<0.1.

Table 10: Uncorrected and Corrected Results – Homework Only

                            (i)          (ii)         (iii)     (iv)      (v)
                            Uncorrected  Uncorrected  Tobit     Het.      Het.
                            No Controls  w/ Controls            Tobit     Symmetric
Cognitive              β    0.032**      0.010**      0.007     0.002     0.003
                            (0.003)      (0.002)      (0.007)   (0.007)   (0.008)
                       δ                              0.003     0.006     0.006
                                                      (0.005)   (0.005)   (0.006)
Non-Cognitive          β    0.011**      0.006**      -0.020*   -0.032**  -0.029**
                            (0.003)      (0.003)      (0.011)   (0.010)   (0.013)
                       δ                              0.020**   0.028**   0.028**
                                                      (0.008)   (0.007)   (0.010)

Note: N=4,330. Bootstrapped standard errors in parentheses (500 iterations). The corrected specifications use 50 clusters. ** p<0.05, * p<0.1.

Figure 12: Enrichment Time Breakdowns - High School, By Income Tercile. [Three pie charts (Low-Income, Medium-Income, High-Income) dividing enrichment time into: Reading a Book; Other Reading; Being Read To; Homework; Before- or After-School Program; Other Education; Other Academic Lessons; Non-Academic Lessons.]

Note: Panels plot the average division of time into different categories over a typical week for each income tercile among those in grades 9-12. The figure pools the 1997, 2002 and 2007 CDS waves.

Figure 13: Child Activities by Enrichment Time: High-Income Children in Grades 9-12. [Six panels, each plotted against hours per week on enrichment activities (0 to 20). P-values of discontinuity at zero: Play and Social Activities, .462; Passive Leisure, .195; Duties/Chores, .246; Other Enrichment Activities, .377; Sleeping, .019; Class, 0.000.]

Note: Each panel shows a plot of the local linear estimator of the expected value of a variable for a given amount of time spent on enrichment, along with its 90% confidence interval. The expected value of the variable for the children who spent no time on enrichment is also shown, along with its 90% confidence interval. Finally, the p-value of a test for whether there is a discontinuity at zero time on enrichment is also shown in the header of each panel.

B The uncorrected estimates are positively biased.

In Section 3, we claim that $I$ is endogenous in the uncorrected model given by equation (1). In fact, we argue further that the bias resulting from this endogeneity is positive. We provide additional evidence in support of these claims here, using Caetano (2015)'s test of exogeneity. The test exploits the fact that if $I$ is exogenous, then $E[\epsilon \mid I, X] = 0$, and therefore $E[S \mid I, X]$ must be continuous in $I$ at zero. The idea is thus to estimate $E[S \mid I, X]$ using only observations for which $I > 0$. If $I$ is exogenous, then the limit of $\hat{E}[S \mid I, X]$ as $I$ approaches zero should be equal to $\hat{E}[S \mid I = 0, X]$.

We perform this test using the full list of controls included in our main analysis in Section 4. We find strong evidence that $E[S \mid I, X]$ is discontinuous at $I = 0$. Thus, we conclude that $I$ is endogenous and the uncorrected estimator of $\beta$ must be biased.

In order to display this multivariate result in a two-dimensional plot, Figure 14 shows the residuals of regression (1) applied to cognitive skills (left panel) and non-cognitive skills (right panel) when we use only observations such that $I > 0$ to estimate the coefficients. The solid lines represent local linear fits of the residuals of these regressions conditional on enrichment time. The plots also show the average residuals at zero enrichment along with their 90% confidence intervals.

Because we are conditioning on all controls $X$ when we run regression (1), the local linear fits already incorporate all discontinuities in the controls. Therefore, the fact that the residuals at zero enrichment are discontinuously lower than the residuals just above zero for both cognitive and non-cognitive skills is direct evidence that the unobserved confounders are also discontinuous at zero enrichment. Moreover, because the residuals jump upward as enrichment moves from zero to just above zero, we conclude that the unobservables that contribute positively to enrichment also directly contribute positively to skills, and thus the OLS estimator of $\beta$ in equation (1) is biased upward for both types of skill.
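The logic of the discontinuity test can be illustrated with a minimal sketch on simulated data. This is not the implementation used in the paper: it assumes no controls, replaces the local linear fit with a single least-squares line on the $I > 0$ subsample, and reports only the point estimate of the jump (no p-value). The function names `ols_line`, `discontinuity_at_zero`, and `simulate`, and all parameter values, are hypothetical.

```python
import random


def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def discontinuity_at_zero(I, S):
    """Mean of S at I == 0 minus the limit at zero of a line fitted on I > 0.

    Simplification (assumption): a global line stands in for the local
    linear fit, and there are no controls X.
    """
    xs = [i for i in I if i > 0]
    ys = [s for i, s in zip(I, S) if i > 0]
    at_zero = [s for i, s in zip(I, S) if i == 0]
    intercept, _ = ols_line(xs, ys)
    return sum(at_zero) / len(at_zero) - intercept


def simulate(endogenous, n=20000, seed=0):
    """Simulated data in which the true effect of I on S is zero."""
    rng = random.Random(seed)
    I, S = [], []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)  # unobserved confounder, always affects S
        # eta drives desired enrichment only in the endogenous case
        driver = eta if endogenous else rng.gauss(0.0, 1.0)
        I.append(max(0.0, 2.0 + 4.0 * driver))  # censoring creates bunching at zero
        S.append(0.5 * eta + rng.gauss(0.0, 1.0))
    return I, S
```

Calling `discontinuity_at_zero(*simulate(endogenous=True))` should give a clearly negative jump (outcomes at zero lie below the extrapolated line), while the exogenous case should give a jump close to zero, mirroring the reasoning above.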

Figure 14: Evidence that Standard Estimates May be Biased Upward. [Two panels plotted against hours per week on enrichment activities (0 to 20). Left panel: Cognitive Residual, p-value of discontinuity .003. Right panel: Non-Cognitive Residual, p-value of discontinuity .06.]

Note: Each panel shows a plot of the local linear polynomial estimator of the expected value of the residuals from equation (1), estimated on the positive enrichment subsample, conditional on enrichment time, along with its 90% confidence interval. The expected value of the residuals for the children who spent no time on enrichment is also shown, along with its 90% confidence interval. Finally, the p-value of a test for whether there is a discontinuity at zero is also shown in the header of each panel.

C Are our findings an artifact of assuming linearity in η?

The linear specification of the error term in equation (4) can be relaxed in a more flexible, non-parametric model, as discussed in Caetano et al. (2020). Ultimately, however, we can never escape the fact that in order to use the discontinuities in skill at zero enrichment to make inferences about endogeneity globally, we must be willing to accept that the effect of the confounders on skills at zero enrichment is informative about the corresponding effect in the rest of our sample.

Effectively, our method estimates the average treatment effect corrected for the endogeneity from the variables that are correlated with enrichment around zero. Thus, it is important to acknowledge that the effect of confounders on skills at zero may not be representative of the effect in the whole sample, which may lead us to either over-correct or under-correct for bias. In this section we show evidence that, if anything, we might be under-correcting for the positive bias in our application. Our main empirical findings therefore do not seem to be an artifact of this linearity assumption.

To fix ideas, we consider two extreme scenarios: one where people only spend either 0 or 1 hours per week on enrichment, and one where people spend between 0 and 50 hours of enrichment per week. The linearity assumption is more plausible in the first scenario than in the second scenario. Intuitively, the effect of the confounders at $I = 0$ is more plausibly similar to the effect of the confounders at $I = 1$ than at $I = 50$. We build on this idea by restricting the sample to reflect the first scenario, and then we progressively expand the sample until it reaches the second scenario.
In Figure 15, we show how our main estimate $\hat{\beta}$ for cognitive (left panel) and non-cognitive skills (right panel) changes for different truncations of our sample depending on the maximum allowed enrichment value ($I \le I_{\max}$, ranging from $I_{\max} = 1$ to $I_{\max} = 50$).[10] As the maximum hours per week spent on enrichment in our full sample is 50, the estimates in the far right of each panel are the estimates reported in Table 2.

[10] To keep everything else constant irrespective of $I_{\max}$, we maintain the same estimate of $E[I^* \mid I^* \le 0, X]$ using our preferred tail symmetry approach. Note that the identification of $E[I^* \mid I^* \le 0, X]$ does not depend on the assumption of linearity of the structural equation in $\eta$, which is what we are trying to test here.

Figure 15: Estimates for Different Sub-samples of the Data - Full Sample. [Two panels plotting the estimate of $\beta$ against $I_{\max}$ (0 to 50): cognitive skills (left) and non-cognitive skills (right).]

Note: Each panel shows the estimate of $\beta$ for cognitive (left panel) or non-cognitive skills (right panel) restricting the sample to only children whose enrichment hours are lower than or equal to $I_{\max}$, for values of $I_{\max}$ ranging from 1 (only those who chose $I = 0$ or $I = 1$) to 50 (everyone). These plots suggest that our main findings are not an artifact of the linearity in $\eta$ assumption.

We find that the estimates of $\beta$ are mostly similar to the main estimates from Table 2, except for very small values of $I_{\max}$, where the estimates are more negative (albeit with substantially wider confidence intervals).

Note that this is in fact a joint test of the linearity in $\eta$ as well as of whether the treatment effects vary with $I$ (see footnote 4 in Section 3.2). If the effect of $I$ on $S$ varies with $I$ (be it because it is a function of $I$ or because it is a function of $X$, which is itself correlated with $I$), we should also find variations in our estimates when we restrict the sample as in Figure 15 above. The fact that our estimates are roughly constant as we increase $I_{\max}$ indicates that the treatment effects are likely to be uncorrelated with $I$. Additionally, this approach allows us to understand whether our conclusions are robust to the elimination of enrichment outliers from our sample, which indeed seems to be the case.

For completeness, Figure 16 shows the analogous plots for high school age children. The findings are similar. We conclude that our main findings in the paper (negative but insignificant cognitive estimates and negative, significant non-cognitive estimates in high school) are likely not an artifact of our assumption of linearity in $\eta$.
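The truncation exercise can be sketched on simulated data as follows. This is an illustration under strong assumptions, not the paper's estimator: there are no controls, the corrected regression is assumed to take the form $S = a + \beta I + \delta (I + 1\{I = 0\}\, m) + \text{error}$ with $m = E[I^* \mid I^* \le 0]$, and $m$ is held fixed across truncations (as in footnote 10) at its true value under a normal latent variable. With a single cell this regression is exactly identified, so $\hat{\beta}$ equals the slope on the positive subsample minus the jump at zero divided by $m$. All names and parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def ols_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def corrected_beta(I, S, m, i_max):
    """Corrected slope using only observations with I <= i_max.

    Assumed model: S = a + beta*I + delta*(I + 1{I=0}*m) + error, so that
    the slope on I > 0 estimates beta + delta and the jump at zero
    estimates delta * m.
    """
    keep = [(i, s) for i, s in zip(I, S) if i <= i_max]
    xs = [i for i, _ in keep if i > 0]
    ys = [s for i, s in keep if i > 0]
    at_zero = [s for i, s in keep if i == 0]
    intercept, slope = ols_line(xs, ys)
    jump = sum(at_zero) / len(at_zero) - intercept
    delta = jump / m
    return slope - delta


# Hypothetical data-generating process: I* = MU + eta, eta ~ N(0, SIGMA^2)
MU, SIGMA, BETA, DELTA = 2.0, 4.0, -0.2, 0.3
rng = random.Random(1)
I, S = [], []
for _ in range(40000):
    eta = rng.gauss(0.0, SIGMA)
    i_obs = max(0.0, MU + eta)  # observed enrichment, censored at zero
    I.append(i_obs)
    S.append(BETA * i_obs + DELTA * eta + rng.gauss(0.0, 0.3))

# m = E[I* | I* <= 0], held fixed across truncations
std = NormalDist()
m = MU - SIGMA * std.pdf(-MU / SIGMA) / std.cdf(-MU / SIGMA)

betas = {i_max: corrected_beta(I, S, m, i_max) for i_max in (2, 5, 10, 50)}
naive_slope = ols_line(I, S)[1]  # uncorrected slope of S on I
```

In this simulation the linearity assumption holds by construction, so the entries of `betas` should stay close to the true value of -0.2 for every truncation, while `naive_slope` is pushed upward by the confounder. Systematic drift in the truncated estimates is the kind of pattern that would cast doubt on linearity.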

Figure 16: Estimates for Different Sub-samples of the Data - Grades 9-12. [Two panels plotting the estimate of $\beta$ against $I_{\max}$ (0 to 50): cognitive skills (left) and non-cognitive skills (right).]

Note: Each panel shows the analogous estimates to Figure 15 but for the sub-sample of children in high school.

D Identification of $E[I^* \mid I^* \le 0, X]$

This section discusses three strategies for the identification of $E[I^* \mid I^* \le 0, X]$, presented in increasing order of generality. For convenience we repeat here the assumptions of each method described in Section 3.3. The technical details of all these methods can be found in Caetano et al. (2020).

Tobit

Our first strategy is to identify $E[I^* \mid I^* \le 0, X]$ in a structure similar to Heckman (1979). We call this the "Tobit" strategy because it assumes that $I^*$ satisfies a Tobit model. Specifically, in this strategy, we assume that we can write $g(X) = X'\gamma$ and that

$$\eta \mid X \sim N(X'\theta, \sigma^2). \qquad (9)$$

By assuming normality and homoskedasticity, the Tobit strategy constrains the shape of the entire distribution of the unobservable confounders $\eta$. Thus, the mean and the variance of this distribution can be identified simply by looking at any portion of the distribution of enrichment time above zero. When equation (9) holds,

$$E[I^* \mid I^* \le 0, X] = X'\pi - \sigma \lambda(-X'\pi/\sigma), \qquad (10)$$

where $\pi = \gamma + \theta$ and $\lambda(\cdot)$ is the inverse Mills ratio. The parameters $\pi$ and $\sigma$ can be estimated straightforwardly via a Tobit regression of $I$ on $X$ and plugged into equation (10) to build an estimator of the correction term.

In practice, the Tobit strategy turns out to be too restrictive in our context. To illustrate this, Figure 17 plots the empirical conditional cumulative distribution function of enrichment time $I$ for white and Hispanic high school students. Because we observe the positive quantiles $F_{I^*|X}(q)$ for different values of $q$, we can infer the shape of part of the distribution of $\eta \mid X$ per equation (2). Indeed, the figure shows that the homoskedasticity assumption clearly does not hold, as the variance of $I$ is different for different values of $X$.

Figure 17: Evidence of heteroskedasticity on the distribution of $I \mid X$. [Empirical cumulative distribution function of hours per week on enrichment activities (0 to 30), plotted separately for white and Hispanic high school students.]

Note: Each curve depicts the CDF of $I$ for white and Hispanic high school students. The curves show evidence of heteroskedasticity in the distribution of desired enrichment for different values of the controls.

Figure 18 shows the homoskedastic Tobit fit for white (left panel) and Hispanic (right panel) high school students, along with their corresponding empirical CDF of $I$. It is evident that in both cases the fit on the positive side of $I$ is not satisfactory, which would lead us to over-estimate the magnitude of $E[I^* \mid I^* \le 0, X]$ and thus under-estimate the magnitude of the bias. Indeed, this is what we find in Section 4.

Figure 18: Homoskedastic Tobit Fit. [Two panels showing the empirical cumulative distribution function against hours per week on enrichment activities (-35 to 30): white (left) and Hispanic (right) high school students.]

Note: Each panel depicts the CDF of enrichment ($I$) for white (left panel) and Hispanic (right panel) high school students presented in Figure 17 (thick curve) along with the corresponding homoskedastic Tobit fit (thin curve). The plots show evidence that the homoskedastic normal fit for positive values of enrichment is not satisfactory, which may lead us to over-estimate the magnitude of $E[I^* \mid I^* \le 0, X]$ and under-estimate the magnitude of $\delta$.

Heteroskedastic Tobit

Next, we relax the linear mean and homoskedasticity requirements while maintaining the assumption that $\eta \mid X$ follows a normal distribution. Specifically, we suppose that

$$\eta \mid X \sim N(l(X), \sigma^2(X)). \qquad (11)$$

This assumption allows the mean and the variance of $\eta$ to vary with $X$ in an unrestricted way, but retains the requirement that $\eta$ be normal separately for each value of $X$. In this case,

$$E[I^* \mid I^* \le 0, X] = g(X) + l(X) - \sigma(X)\, \lambda\big(-(g(X) + l(X))/\sigma(X)\big). \qquad (12)$$

If $X$ is discrete (or can be discretized, as is the case in our setting, see Appendix E), we can estimate $g(X) + l(X)$ and $\sigma(X)$ separately for each $X$ by running a Tobit regression of $I$ on a constant using only the observations with controls equal to $X$.

Figure 19 plots the empirical conditional cumulative distribution function of $I$ for the same two values of controls (white and Hispanic high school students) in Figure 17, along with the corresponding heteroskedastic Tobit fits for each. The fits are clearly superior to the homoskedastic fits presented in Figure 18. Nonetheless, it seems that the upper tail of the data is fatter than the upper tail implied by a normal distribution for both values of $X$. These plots are not an exception: we observe a similar pattern for many other values of $X$ in the PSID data. If the upper tail is any indication of what is happening in the lower tail, this suggests that the heteroskedastic Tobit model will tend to under-estimate the magnitude of $E[I^* \mid I^* \le 0, X]$, thus over-estimating the magnitude of $\delta$. Indeed, this is consistent with what we find in Section 4.
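As a numerical check on the normal-based correction term in equations (10) and (12), the sketch below evaluates the closed form for a single cell and compares it with a brute-force simulated mean of the latent variable below zero. It assumes the convention $\lambda(z) = \phi(z)/\Phi(z)$ for the inverse Mills ratio, and it takes the cell mean `mu` (standing in for $X'\pi$ or $g(X) + l(X)$) and `sigma` as given rather than estimating them by Tobit. The function names and the cell values are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def inverse_mills(z):
    """lambda(z) = pdf(z) / cdf(z); this convention is an assumption."""
    return STD_NORMAL.pdf(z) / STD_NORMAL.cdf(z)


def tobit_correction(mu, sigma):
    """E[I* | I* <= 0] when I* ~ N(mu, sigma^2), following the form of (10)/(12)."""
    return mu - sigma * inverse_mills(-mu / sigma)


def simulated_correction(mu, sigma, n=200000, seed=0):
    """Monte Carlo mean of latent draws that fall at or below zero."""
    rng = random.Random(seed)
    below = [d for d in (rng.gauss(mu, sigma) for _ in range(n)) if d <= 0]
    return sum(below) / len(below)


# Heteroskedastic case: one (mu, sigma) pair per cell of X (made-up values)
cells = {"cell_a": (2.0, 4.0), "cell_b": (5.0, 3.0)}
corrections = {name: tobit_correction(mu, s) for name, (mu, s) in cells.items()}
```

In the homoskedastic Tobit case `sigma` would be common to all cells and `mu` would vary through $X'\pi$; in the heteroskedastic case both vary by cell, as in the `cells` dictionary. The correction is always negative, since it is the mean of a variable conditional on being at or below zero.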

Figure 19: Heteroskedastic Tobit Fit
[Figure: two panels, each plotting the empirical cumulative distribution function (vertical axis) against hours per week on enrichment activities (horizontal axis, -35 to 30).]
Note: Each panel depicts the CDF of enrichment (I) for white (left panel) and Hispanic (right panel) high school students presented in Figure 17 (thick curve) along with the corresponding heteroskedastic Tobit fit (thin curve). The plots show evidence of a better fit than the homoskedastic case for positive values of enrichment. The tails seem to be fatter in the empirical distribution in comparison with the fit, which may lead us to somewhat under-estimate the magnitude of E[I* | I* ≤ 0, X] and over-estimate the magnitude of δ.

Heteroskedastic Tail Symmetry

Finally, we drop the normality assumption entirely and require only tail symmetry:

for all censored quantiles q_0, η | X has symmetric tails below q_0 and above 1 − q_0. (13)

Tail symmetry requires only that the lower tail of η | X below the censoring point and the corresponding upper tail be symmetric, a weaker assumption than symmetry of the entire distribution, which in turn is weaker than normality. Tail symmetry allows us to infer the behavior of I* | X when I* < 0 by looking at the shape of the upper tail of I | X.¹¹

To see this assumption in action, Figure 20 provides the corresponding plots shown in Figure 19 under tail symmetry. For quantiles below the bunching threshold, the fitted values follow the mirror image of the corresponding upper tail.

¹¹ This approach can only be implemented for values of X such that the proportion of children who are bunched at zero enrichment is less than half of the sample (P(I = 0 | X) < 0.5). In our sample, this is true for almost all values of X (over 97% of the observations). For values of X such that the proportion of children bunched at zero is over 50%, we can estimate E[I* | I* ≤ 0, X] with the heteroskedastic Tobit approach, which is feasible with any amount of bunching.

Figure 20: Heteroskedastic Symmetric Fit
[Figure: two panels, each plotting the empirical cumulative distribution function (vertical axis) against hours per week on enrichment activities (horizontal axis, -35 to 30).]
Note: Each panel depicts the CDF of I for white (left panel) and Hispanic (right panel) high school students presented in Figure 17 (thick curve) along with the corresponding heteroskedastic Symmetric fit (thin curve).

Under tail symmetry,

$$E[I^{*} \mid I^{*} \le 0, X] = F_{I|X}^{-1}\bigl(1 - F_{I|X}(0)\bigr) - E\bigl[I \mid I \ge F_{I|X}^{-1}\bigl(1 - F_{I|X}(0)\bigr), X\bigr], \tag{14}$$

where F_{I|X}(·) is the cumulative distribution function of I conditional on X. We carry out the estimation of E[I* | I* ≤ 0, X] using equation (14) in three steps. For each value of X, we first estimate the probability of bunching at zero enrichment, F_{I|X}(0). Then we estimate the quantile of I in the upper tail that corresponds to the mirror image of I = 0, F_{I|X}^{-1}(1 − F_{I|X}(0)). Finally, we estimate the mean of I | X at the upper tail, E[I | I ≥ F_{I|X}^{-1}(1 − F_{I|X}(0)), X].
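The three steps above can be sketched for a single value of X as follows. This is only an illustrative sketch: the function name `tail_symmetry_mean_below_zero`, the use of `numpy.quantile` with its default interpolation, and the simulated Laplace data are assumptions, not details taken from the paper.

```python
import numpy as np


def tail_symmetry_mean_below_zero(hours):
    """Estimate E[I* | I* <= 0] for one cell of X via equation (14)."""
    hours = np.asarray(hours, dtype=float)

    # Step 1: share of observations bunched at zero, an estimate of F_{I|X}(0).
    share_bunched = np.mean(hours <= 0.0)
    if share_bunched >= 0.5:
        # With half or more of the cell bunched, the mirror-image quantile
        # is not identified from the uncensored part of the distribution.
        raise ValueError("tail symmetry needs P(I = 0 | X) < 0.5")

    # Step 2: the upper-tail quantile that mirrors I = 0.
    mirror_point = np.quantile(hours, 1.0 - share_bunched)

    # Step 3: the mean of I in the upper tail beyond the mirror point.
    upper_tail_mean = hours[hours >= mirror_point].mean()

    return mirror_point - upper_tail_mean


# Simulated cell with a symmetric but non-normal latent distribution
# (illustrative values only, not PSID data).
rng = np.random.default_rng(0)
latent = rng.laplace(loc=2.0, scale=4.0, size=50_000)  # desired enrichment I*
observed = np.maximum(latent, 0.0)                     # observed enrichment I

symmetric_estimate = tail_symmetry_mean_below_zero(observed)
true_value = latent[latent <= 0.0].mean()  # only known because data are simulated
print(symmetric_estimate, true_value)
```

The latent variable is drawn from a Laplace distribution to show that the estimator relies on symmetry of the tails rather than on normality. The explicit check on the bunching share mirrors the restriction P(I = 0 | X) < 0.5 under which the approach can be implemented.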

E Clustering and the discretization of X

In Section 3.3 and Appendix D, we discuss two strategies for the estimation of E[I* | I* ≤ 0, X], heteroskedastic Tobit and heteroskedastic tail symmetry, which require the distribution of X to have a discrete support. In our setting, there are some important controls that have continuous support. Therefore, we want to be able to discretize X in a non-arbitrary way such that the discretized covariates naturally reflect the joint distribution of the original X. We discuss here how we discretize X using clustering methods. We show that our results are not an artifact of the specific way in which we implement the discretization nor of the number of clusters we use.

Our approach classifies observations with similar observed controls into discrete clusters and uses the corresponding cluster membership indicators as discretized versions of X. The classification attempts to maximize the similarities in X among observations in the same cluster: two children in the same cluster have by construction more similar controls than two children in different clusters. Formally, let k_i be the cluster to which child i belongs. The underlying assumption in our method is that E[I* | I* ≤ 0, X = X_i] = E[I* | I* ≤ 0, X ∈ k_i]; that is, we require that the heterogeneity in η conditional on X can be entirely explained by the clusters. The larger the number of clusters, the more similar are the controls of the children within the same cluster and thus the weaker this assumption becomes.

While there are many different clustering methods available, we report results based on hierarchical clustering because it produces clusters that are nested: if two children are in the same cluster when there are K clusters, then they will also be in the same cluster whenever there are K′ < K clusters. Moreover, the move from K to K+1 clusters always consists of splitting one (and only one) cluster into two smaller clusters.
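The nesting property can be illustrated with a short Python sketch. It uses Ward's linkage on a Gower dissimilarity, the combination this appendix reports, but everything else is an assumption made for illustration: the helper names `gower_distance` and `nested_cluster_labels`, the simulated controls, and the choice of SciPy. SciPy's documentation notes that its Ward method is strictly defined only for Euclidean distances, so applying it to a precomputed Gower matrix should be read as an approximation and not as the authors' implementation.

```python
import numpy as np
from scipy.cluster.hierarchy import cut_tree, linkage
from scipy.spatial.distance import squareform


def gower_distance(numeric, categorical):
    """Pairwise Gower dissimilarity for mixed continuous and discrete controls."""
    numeric = np.asarray(numeric, dtype=float)
    categorical = np.asarray(categorical)
    ranges = numeric.max(axis=0) - numeric.min(axis=0)
    ranges[ranges == 0] = 1.0  # avoid dividing by zero for constant columns
    # Continuous variables: absolute difference scaled by the variable's range.
    num_part = np.abs(numeric[:, None, :] - numeric[None, :, :]) / ranges
    # Discrete variables: 1 if the categories differ, 0 otherwise.
    cat_part = categorical[:, None, :] != categorical[None, :, :]
    n_vars = numeric.shape[1] + categorical.shape[1]
    return (num_part.sum(axis=2) + cat_part.sum(axis=2)) / n_vars


def nested_cluster_labels(distance_matrix, cluster_counts):
    """Cut one hierarchical tree at several cluster counts.

    Returns an array with one row per observation and one column per
    requested number of clusters.
    """
    condensed = squareform(distance_matrix, checks=False)
    tree = linkage(condensed, method="ward")
    return cut_tree(tree, n_clusters=cluster_counts)


# Simulated controls (illustrative only): two continuous, one binary.
rng = np.random.default_rng(0)
numeric_controls = rng.normal(size=(200, 2))
binary_control = rng.integers(0, 2, size=(200, 1))

K = 5
distances = gower_distance(numeric_controls, binary_control)
labels = nested_cluster_labels(distances, [K, K + 1])
labels_k, labels_k1 = labels[:, 0], labels[:, 1]

# Nesting: every one of the K+1 clusters sits inside exactly one of the K
# clusters, so there are exactly K+1 distinct (coarse, fine) label pairs.
label_pairs = set(zip(labels_k.tolist(), labels_k1.tolist()))
print(len(label_pairs))

# Cluster membership indicators, usable as a discretized version of X.
indicators = (labels_k[:, None] == np.arange(K)[None, :]).astype(float)
```

Because both sets of labels come from cutting the same tree, moving from K to K+1 clusters splits exactly one cluster, which is the property that makes estimates comparable across different numbers of clusters.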
The nested nature of the hierarchical clusters provides some desirable discipline in the comparison of estimates for different numbers of clusters. If the estimate of β using K total clusters is different from the estimate using K+1 total clusters, the difference is due only to the

cluster which was split. If the estimates of β remain close to constant as the number of clusters increases, this raises our confidence in the assumption that the clusters adequately capture differences in the conditional distributions η | X (so that the assumption E[I* | I* ≤ 0, X = X_i] = E[I* | I* ≤ 0, X ∈ k_i] is approximately valid).

We have experimented with different linkage methods (average, complete, and Ward's) and different dissimilarity measures (Gower, L1, L2, and correlation), and the results are uniformly very similar across these different cases. We report results with Ward's linkage method, in which the criterion at each step is to merge two clusters so as to achieve the minimum total within-cluster variance, and the Gower dissimilarity measure, as this measure works well when there is a combination of continuous and discrete variables.

All the results in the paper, including the plots in this section, use the clusters to estimate E[I* | I* ≤ 0, X = X_i], but include both the original vector X and the cluster indicators in the main regression, equation (6). Specifically, we specify m(X) = X′τ + Σ_{k=1}^{K} α_k 1(X ∈ k) in equation (5), in order to account for potential non-linearities (all estimates are robust to the exclusion of the cluster indicators).

Figure 21: Uncorrected and Corrected Cognitive and Non-Cognitive Estimates
[Figure: two panels, each plotting the Uncorrected and Symmetric estimates against the number of clusters (K), from 50 to 100.]
Note: Left figure shows cognitive estimates, and right figure shows non-cognitive estimates. Shaded areas depict the 90% confidence intervals. All standard errors are bootstrapped using 500 iterations.

Figure 21 shows the analogous results to column (v) (tail symmetry) of Table 2 for different numbers of clusters K. Clearly, adding more clusters than 50 (the number of clusters used in all tables in the paper) does not change the estimates meaningfully. This

suggests that our results are not an artifact of our discretization of X.

Note that as the number of clusters increases, so does the list of controls used in m(X). Because the clusters are nested, as we add clusters, we increase the flexibility of m. Therefore, Figure 21 can be seen as an illustration of the results of a sequence of traditional omitted variable bias tests (e.g., the Ramsey RESET test) in the specification of equation (6). The near constancy of the estimates in Figure 21 from K = 50 to K = 100 confirms that our approach is able to control for all confounders, including those due to misspecification of the function of controls m(X).

A growing literature within economics explores the use of clustering techniques applied to group fixed effects estimators in panel settings (Lin and Ng, 2012; Bonhomme and Manresa, 2015; Bonhomme et al., 2017). Our use of clustering differs from these applications. We do not cluster on the outcome variable, and we do not use the clusters to handle endogeneity; that is accomplished through our correction term. Rather, the role of the clusters in our setting is to allow the distribution of unobservables to change with observables in a flexible yet tractable way.

Cite this document
APA
Carolina Caetano, Gregorio Caetano, & Eric Nielsen (2020). Should Children Do More Enrichment Activities? Leveraging Bunching to Correct for Endogeneity (FEDS 2020-036). Board of Governors of the Federal Reserve System, Finance and Economics Discussion Series. https://whenthefedspeaks.com/doc/feds_2020-036
BibTeX
@techreport{wtfs_feds_2020_036,
  author = {Carolina Caetano and Gregorio Caetano and Eric Nielsen},
  title = {Should Children Do More Enrichment Activities? Leveraging Bunching to Correct for Endogeneity},
  type = {Finance and Economics Discussion Series},
  number = {2020-036},
  institution = {Board of Governors of the Federal Reserve System},
  year = {2020},
  url = {https://whenthefedspeaks.com/doc/feds_2020-036},
  abstract = {We study the effects of enrichment activities such as reading, homework, and extracurricular lessons on children's cognitive and non-cognitive skills. We take into consideration that children forgo alternative activities, such as play and socializing, in order to spend time on enrichment. Our study controls for selection on unobservables using a novel approach which leverages the fact that many children spend zero hours per week on enrichment activities. At zero enrichment, confounders vary but enrichment does not, which gives us direct information about the effect of confounders on skills. Using time diary data available in the Panel Study of Income Dynamics (PSID), we find that the net effect of enrichment is zero for cognitive skills and negative for non-cognitive skills, which suggests that enrichment may be crowding out more productive activities on the margin. The negative effects on non-cognitive skills are concentrated in higher-income students in high school, consistent with elevated academic competition related to college admissions.},
}