
Towards Causal Estimates of Children’s Time Allocation on Skill Development

Gregorio Caetano, Josh Kinsler and Hao Teng*

May 2016

Abstract

Cognitive and non-cognitive skills are critical for a host of economic and social outcomes as an adult. While there is broad agreement that a significant amount of skill acquisition and development occurs early in life, the precise activities and investments that drive this process are not well understood. In this paper we examine how children’s time allocation affects their accumulation of skill. Children’s time allocation is endogenous in a model of skill production since it is chosen by parents and children. We apply a recently developed test of exogeneity to search for models that yield causal estimates of the impact time inputs have on child skills. We show that the test, which exploits bunching in time inputs induced by a non-negativity time constraint, has power to detect endogeneity stemming from omitted variables, simultaneity, measurement error, and several forms of model misspecification. Results suggest that with a rich set of controls we can consistently estimate the impact of time inputs on skills, though there is significant heterogeneity in which controls matter for different skills at different ages. For children aged 12 to 17, active time with adult family members appears to be the most productive use of time in developing cognitive skills. In contrast, time spent alone and passive time with the family are important to develop non-cognitive skills at these ages. For children aged 5 to 11, with few exceptions time allocation does not have a strong effect on cognitive skills. Moreover, sleep is the most productive input for non-cognitive skill development at these ages.

*Gregorio Caetano: Economics Department, University of Rochester. Josh Kinsler: Economics Department, University of Georgia. Hao Teng: Economics Department, University of Rochester. We would like to thank Carolina Caetano, Vikram Maheshri, Daniel Ringo, and David Slichter for helpful discussions. All errors are our own.


1 Introduction

There is a growing consensus among economists that skills acquired during childhood have an important influence on later life outcomes.¹ Cognitive skills measured as early as age seven have been linked with educational attainment, employment, and wages as an adult.² Recent research stresses the importance of early childhood non-cognitive skills for later life labor market, marriage, health, and criminal outcomes.³ Additional analyses indicate that adult labor market outcomes are largely determined by skills already in place by age 16.⁴

In light of the evidence linking children’s cognitive and non-cognitive skills to adult outcomes, it is important to understand the determinants of these skills. One can envision a production technology for child skills where endowed ability is combined with various environmental elements and time inputs (i.e., activities) contributed by parents, siblings, grandparents, friends, etc. Households will choose time inputs optimally given their preferences and constraints. The challenge in estimating the causal effect of each time input on skill production is two-fold. First, the time allocation of child activities is endogenous since it reflects choices made by households. Second, researchers are typically unable to observe the full vector of activities, and can often only observe one or two time inputs. Thus, even if exogenous variation is available for a particular activity of interest, it is difficult to interpret the resulting coefficient without information on the substitution among all potential activities.

These challenges are well known and have been widely discussed in the literature, yet so far they have not been systematically addressed.⁵ On the one hand, studies such as Dustmann and Schönberg (2012) and Bernal and Keane (2010) utilize quasi-experimental policy variation that enables them to study how an increase in maternal time affects child cognition. However, a lack of comprehensive data on all other time inputs prevents them from understanding the substitution between activities, making it difficult to infer the effect of a reallocation of time inputs in circumstances other than the change implied by the policy. On the other hand, studies such as Todd and Wolpin (2007), Fiorini and Keane (2014), and Del Boca et al. (2013) estimate the impact of a more comprehensive list of child inputs on skills, but lack quasi-experimental variation.⁶ Going forward, the lack of exogenous variation in models with many inputs is unlikely to be solved through an instrumental variable approach since each input requires its own instrument. Moreover, running an experiment is also difficult, as the treatment arms would have to consist of fully prescribed inputs for each child. If only some inputs are manipulated, parents can optimize over the remaining ones to either reinforce or negate the intended effect. Either way, it will be difficult to estimate the ceteris paribus causal impact of substituting one time input for a well-defined alternative time input.

¹ See Almond and Currie (2011) for a comprehensive review of the literature.

² For instance, McLeod and Kaiser (2004) find that age 7 test scores and family background measures predict 11% (12%) of the variability in the probability of high school (college) completion. Using similar measures, Currie and Thomas (1999) are able to explain 4-5% of the variation in employment and 20% of the variation in wages at age 33.

³ See for example Cunha et al. (2006), Deming (2009) and Heckman et al. (2013).

⁴ See Keane and Wolpin (1997) and Cameron and Heckman (1998) as examples.

⁵ See Todd and Wolpin (2003) and Keane (2010) for a discussion of these issues.

⁶ Todd and Wolpin (2007) choose their most preferred model using root mean-squared error as the selection criterion. However, as the authors point out, this measure speaks to fit but not necessarily to whether the parameters of interest can be interpreted as causal. Fiorini and Keane (2014) approach the identification problem by estimating multiple production functions that rely on different exogeneity assumptions. Stability of the productivity ranking of inputs is cited as evidence that a reallocation of time use can enhance child development. However, as the authors point out, it is difficult to claim these estimates are causal since the models could suffer from similar biases. Similarly, Del Boca et al. (2013) assume inputs are exogenous conditional on lagged skill, but it is difficult to assert whether these controls are enough to handle endogeneity.

Our aim in this paper is to estimate the impact of children’s time allocation on cognitive and non-cognitive skills while also investigating endogeneity concerns in a systematic manner. To do this, we exploit a recently developed test of exogeneity (Caetano (2015)) that provides an objective statistical criterion to determine whether the parameters of interest can be interpreted as causal. We use the test to guide us in the search for causal models without an explicit source of quasi-experimental variation (Caetano and Maheshri (2016)). Our empirical approach is essentially a model selection one, but with a key difference from existing approaches: our criterion to select models speaks directly to causality, not to fit. In the context of skill development, model selection occurs in a very large model space; indeed, there are many ex ante equally plausible ways of formulating the relationship between time inputs and skill. We show that a multivariate version of the exogeneity test in this context is a feasible tool to substantially reduce the set of models that can plausibly be considered appropriate to make causal inference.

Intuitively, the test exploits the idea that children’s actual time allocation varies continuously at zero, but their intended time allocation varies discontinuously at zero. Because we want to estimate skills as functions of children’s actual time allocation, any discontinuity in skills should be evidence of endogeneity. More specifically, consistent with the rest of the literature, we assume that cognitive and non-cognitive skills are continuous functions of time allocated to various activities. Thus, conditional on controls, skill measures should not vary discontinuously when a child spends exactly zero minutes on any activity (in comparison to, say, just a few minutes). If we find such discontinuities, then it is evidence that unobserved confounders that are not absorbed by controls vary discontinuously at the zero minute threshold, and hence the model suffers from endogeneity. For a test based on this logic to have power, we need to ensure that unobservables vary discontinuously when time inputs equal zero. We argue that this is the case in our context since children cannot spend negative time in any activity, and may therefore bunch at zero because of a “corner solution”. For instance, consider children’s reading ability as an important unobserved confounder. Intuitively, children who spend a few minutes per week reading choose (or let their parents choose) that amount optimally given their constraints and preferences. In contrast, among children who spend zero minutes per week reading, there are children who optimally choose exactly zero minutes and children who would optimally choose negative amounts of reading time if possible. This implies that the average reading ability among children who spend x minutes reading tends to be discontinuous at x = 0, and thus the test would have power to detect endogeneity stemming from unobservables related to reading ability. The more unobserved confounders that vary discontinuously at zero minutes for a given time input, the more powerful the test will be as it can detect endogeneity stemming from multiple sources.
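The corner-solution intuition can be illustrated with a small simulation. The sketch below is not the paper's procedure: it assumes a made-up latent model in which desired reading time rises with unobserved reading ability, and observed time is desired time censored at zero. The function name, the distributions, and all parameter values are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def simulate_bunching(n=100_000, seed=0, window=0.1):
    """Compare mean unobserved ability at zero hours with mean ability just above zero.

    Hypothetical model (an assumption for illustration only):
      desired  = 1.0 + ability + taste, with ability and taste standard normal;
      observed = max(0, desired), since time spent cannot be negative.
    """
    rng = random.Random(seed)
    at_zero, just_above = [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)
        taste = rng.gauss(0.0, 1.0)
        desired = 1.0 + ability + taste
        observed = max(0.0, desired)  # corner solution at zero
        if observed == 0.0:
            at_zero.append(ability)
        elif observed < window:
            just_above.append(ability)
    return mean(at_zero), mean(just_above), len(at_zero) / n


ability_at_zero, ability_near_zero, share_at_zero = simulate_bunching()
print(f"share bunched at zero:     {share_at_zero:.3f}")
print(f"mean ability, x = 0:       {ability_at_zero:.3f}")
print(f"mean ability, 0 < x < 0.1: {ability_near_zero:.3f}")
```

Under these assumptions, the group at exactly zero mixes children whose desired time is zero with children who would choose a negative amount if they could, so its average ability falls below that of children just above zero. This is the kind of discontinuity in an unobservable that gives the test power.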

One potential concern is that our exogeneity test might lack power: in spite of the intuition above, potential confounders might not vary discontinuously when children spend zero minutes in an activity. If this is the case, then we would not be able to detect endogeneity even if endogeneity exists. To allay this concern we first show that each activity of interest has a discontinuous mass of observations at zero minutes, providing direct evidence of bunching (McCrary (2008)). Next, we show that a variety of key observed characteristics are discontinuous at the zero threshold. While not a formal test, this is suggestive that unobservables are discontinuous as well.⁷ It also suggests that discontinuities when inputs are zero are the norm rather than the exception. Finally, we are able to detect discontinuities in the most parsimonious models. It is only when we add a richer set of controls that we fail to reject exogeneity.

⁷ This logic is analogous to the one used in regression discontinuity (RD) designs. Support for the RD identifying assumption is based on the idea that if observables vary continuously at a threshold, then unobservables are also likely to vary continuously at that same threshold.

Even if the evidence discussed above establishes that the test has power, the test might not have power to detect all types of endogeneity we are concerned about. In particular, some potential confounders might be continuous when inputs are zero, or they might be confounders only when inputs are not zero. We allay concerns about their existence in several ways. First, we test for discontinuities at zero minutes for all time inputs jointly. This increases the power of the test since a confounder that is undetected when one input is zero can be detected when a different input is zero. Second, we provide empirical evidence that key variables that elicit a comprehensive list of sources of potential endogeneity in this context (omitted variables such as the child’s reading ability, simultaneity, measurement error, and many forms of model misspecification) are discontinuous when inputs are zero. Third, we implement a series of robustness checks designed to detect confounders that cannot be detected by the test, and find that models that survive the exogeneity test also survive these robustness checks. Finally, we show that estimates are similar across models that survive the test, although they are often different among models that do not survive the test. We track the cumulative properties a confounder must have to bias our preferred estimates and be undetectable by both the test of exogeneity and the further robustness checks we perform. Ultimately, we conclude that interpreting our preferred estimates as causal is plausible, in the sense that it is unlikely a confounder will possess all these properties at the same time.

We implement our approach using skill assessments and time diaries from the Child Development Supplements of the Panel Study of Income Dynamics (PSID). With the help of their primary caregiver, children fill out a detailed 24-hour time diary to record all of their activities during the day, where each activity took place, and with whom they did the activity. These time diaries are collected in 1997, 2002, and 2007, and cover one weekday and one weekend day for each survey year. Cognitive and non-cognitive skills are also assessed during each wave of the survey. We split the sample into older (12-17) and younger (5-11) children and estimate separate production functions to allow for heterogeneity in the accumulation of skill as children age. In addition to time use and skill assessments, the PSID also includes a detailed list of child demographics, family background characteristics, and other measures of the environment in which the child is raised.

Our search for a model whose parameters can be interpreted as causal proceeds in the following manner. In our baseline models, we categorize child activities according to the level of engagement, active (e.g., reading) or passive (e.g., watching TV), and with whom the activity is completed: mother, father, siblings, friends, grandparents, others, or no one. We then relate the time devoted to these activities to skill measures (math, vocabulary, comprehension, and non-cognitive) using standard production functions, such as value-added. We also consider models containing many other subclassifications of activities as suggested in the previous literature. The activity excluded from all models is sleeping, so that the results should be interpreted as a substitution between a given activity and sleeping time. Every model also includes a series of indicator variables that reflect whether the time devoted to each activity is zero. These indicator variables are included to absorb any discontinuous change in the outcome variable when inputs are zero, conditional on controls. If we reject the null hypothesis that the coefficients on the zero time input indicators are jointly equal to zero, then we conclude that this particular model suffers from endogeneity. In our context, a model is defined by the outcome variable (skill), the main explanatory variables (time inputs), controls (ranging from none to a detailed list of child, family, and environmental observables) and a particular functional form establishing the relationships between these variables.

For both younger and older children, our search ultimately identifies models where we fail to reject exogeneity of children’s time inputs. Part of our success in this endeavor is due to the detailed nature of the time use data. When we estimate the impact of a particular time input, we are able to control for all other time inputs of the same child. These alternative inputs absorb much of the endogeneity, as they elicit heterogeneity in preferences and constraints across children and activity partners in the sample. However, the time use data alone are not sufficient to account for all endogeneity. For younger children, child and family characteristics are crucial additional controls needed to absorb endogeneity in all cognitive and non-cognitive skill models. In contrast, for older children, lagged skills in combination with different categories of controls are important in absorbing endogeneity for different skill measures. For math skill models, child and family characteristics are crucial, while for vocabulary and comprehension models, child characteristics, school characteristics and school experience are important. Further, for models of non-cognitive skills, child characteristics, family environmental characteristics and school characteristics are important for controlling for endogeneity. As surveys become increasingly onerous in their time demands on respondents, our results can provide guidance regarding which observed variables are critical to collect when studying child skill development.

Once we arrive at models that fail to reject exogeneity, we turn our attention to the key parameters of interest. For older children, the activities that most promote cognitive skill formation are active time with adult family members, such as parents and grandparents. For example, one additional hour per week spent in active time with grandparents leads to an increase in comprehension scores of 2.9% of a standard deviation.⁸ Non-cognitive skills, on the other hand, are increased most by passive time with parents and alone. For younger children, the estimates of the impact of time inputs are quite different. Sleeping or napping is the most productive time for the development of non-cognitive skills. For example, one additional hour per week spent sleeping or napping rather than in active time with friends would increase non-cognitive skills by 1.8% of a standard deviation. Math, vocabulary, and comprehension skills are less sensitive to alternative time allocations.

⁸ This result aligns with studies in developmental psychology which emphasize the irreplaceable role of grandparents in the development of grandchildren (see Smith (2003) for more details).

Overall, we find that active time is not necessarily better than passive time at improving skills, and skill improvement is more difficult to achieve through time reallocation for younger children as compared to older children. We also find that time in school improves cognitive skills but has a varying impact on non-cognitive skills according to child age, helping older children but harming younger children. In the penultimate section of the paper, we present a simple model of time allocation to help illustrate why selection on observables is a valid assumption in this case. The model is also useful as a lens through which to interpret our findings, place them in the literature, and discuss promising avenues for further research.

The rest of the paper is organized as follows. In Section 2, we describe the PSID data. Section 3 presents our approach and provides both theoretical and empirical evidence in support of this approach. In Section 4, we present our main results. In Section 5, we perform various robustness checks. In Section 6, we discuss the interpretation of our results, before we conclude in Section 7.

2 Data

To estimate the effect of time inputs on skill we use data from the Panel Study of Income Dynamics (PSID) and the three waves of the Child Development Supplements (CDS-I, CDS-II, and CDS-III).⁹ In 1997, the PSID started collecting data on a random sample of the PSID families that had children under the age of 13. About 3,500 children aged 0-12 residing in 2,400 households were interviewed in 1997, and then followed in two subsequent waves, 2002 and 2007. Rows 1-3 in Table 1 report the age range and average age for each wave.

⁹ The Panel Study of Income Dynamics is a US longitudinal survey of a nationally representative sample of individuals and families, started in 1968 with a sample of 4,800 families. It is funded by the National Institute of Child Health and Human Development (NICHD).

Data collected in the CDS include measures of children’s cognitive and non-cognitive skills, time use diaries, and information about child and family characteristics, such as parent-child relationships, child health, and home environment. We match the CDS children with their PSID families to get additional information such as family annual income, mother and father’s ages, mother and father’s education levels, and so on. We pool CDS children across the three waves and divide all observations into two groups based on age: younger children (5-11 years old) and older children (12-17 years old). Pooling the data in this manner maximizes our sample size while still allowing for heterogeneous production functions across developmental stages. Rows 4 and 5 in Table 1 report the age range and average age for each group.

To the best of our knowledge, the only other data set combining information on child skills and family background with time use diaries is the Longitudinal Study of Australian Children (LSAC). Compared to the LSAC, the PSID-CDS has the advantage of focusing on a larger age range of children (0-22 years old) and has richer time use data in terms of the number of child activities and with whom these activities were performed. As an example, the PSID-CDS allows us to separate the time children spend with mothers and fathers, who according to Del Boca et al. (2013) have differential impacts on skill development. The ability to split activities according to detailed partner categories is also helpful in mitigating endogeneity, an issue we discuss further in Section 6.1.

2.1 Time Use Diaries

The time use diary from the CDS collects the details of child activities for two random days of a week (one weekday and one weekend day). Diary forms are mailed to each child’s address, and each child (with the help of her primary caregiver if needed) fills out a detailed 24-hour time diary to record all of her activities during the day, including where each activity took place and with whom she did the activity.¹⁰ An interviewer then visits the household to check and edit the completed diary.¹¹

The PSID-CDS classifies child activities according to the type of activity (215 in CDS I, 317 in CDS II, and 315 in CDS III), where the activity took place (14), and with whom (11) the activity was completed. In CDS I, most diaries (80%) are completed by the child’s primary caregiver or the child and her primary caregiver together. Sampled children are considerably older in CDS II and CDS III, and as a result approximately half of the children in these rounds complete the time diaries on their own.

We clean the time use data so that the diaries are as representative as possible. Time diaries may have limited reliability since they are only a very small sample of a given child’s days.¹² To allay this concern, we first exclude cases where either the weekday or the weekend diary is not reported. Second, we exclude diaries that describe a non-typical day.¹³ Third, we keep only complete diaries and do not impute unassigned slots, with one exception: time periods between 10 p.m. and 6 a.m. that are missing are recoded as sleeping or napping, as in Fiorini and Keane (2014).¹⁴ As a result, we drop 4% of time diaries in CDS I, 3% in CDS II, and 1% in CDS III. Thus, we are left with complete diaries, that is, those in which the durations of all activities add up to 24 hours for one typical weekday and one typical weekend day. The numbers of observations in our samples are 2,807 in CDS I, 2,520 in CDS II, and 1,424 in CDS III.

¹⁰ 92% (90%) of the primary caregivers are mothers for the younger (older) cohort.

¹¹ Some interviews were done via phone: 24% for the younger cohort and 9% for the older cohort.

¹² Researchers have found that young children’s parents enjoy working with their child to complete the child’s time diaries, and these diaries can adequately represent the child’s day (Timmer et al. (1985)), but it is not clear whether that particular day is a representative sample.

¹³ Respondents are asked whether each reported day is typical.

¹⁴ We also follow them by recoding as sleeping or napping the time periods between 10 p.m. and 6 a.m. originally filled as “refused to answer”.

Since we have over 200 variables corresponding to the type of activity the child performs and 11 variables corresponding to with whom the child performs the activity, it is not feasible to estimate the effect of every single combination of these two variables given the available sample size. Ideally, all else constant, more disaggregated categories of activities are preferred, as they better exploit the heterogeneity in the data and yield more interpretable parameters. However, as the categories of activities become increasingly disaggregated, the set of potential control variables gets exponentially larger. We choose to categorize children’s activities into two general types of activities, namely active and passive.¹⁵ Active (passive) activities include all activities in which the child actively (passively) participates. The activities that we recode as active are taking lessons (e.g., dancing), reading, socializing, active leisure, household chores, jobs, school/day care, and organizational activities. In contrast, the activities we recode as passive are obtaining goods and services, traveling/waiting, using computers, watching TV, passive leisure, and personal needs and care.¹⁶

We categorize with whom the child performs the activities into seven groups of people: mother, father, grandparents, siblings, friends, others (i.e., someone other than the first five groups), and self. In reality, a child could perform an activity with many different people at the same time. Whenever the child is with more than one person within the same time slot, we assign the slot to the partner in the following order: mother, father, grandparent, sibling, friend, and others.¹⁷ Finally, we also add two other categories: refuse to answer or do not know and sleeping or napping.¹⁸ We choose sleeping or napping as the omitted time input in our estimation, so that all our reported results should be interpreted as a substitution between a given activity and sleeping or napping.
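The priority rule for slots with several partners can be sketched as follows. This is a minimal illustration, assuming each diary slot comes with a set of partner labels; the label strings and the function name are hypothetical, and the separate "refuse to answer or do not know" and "sleeping or napping" categories are not handled here.

```python
# Priority order stated in the text; "self" applies only when nobody else is present.
PARTNER_PRIORITY = ["mother", "father", "grandparent", "sibling", "friend", "others"]


def assign_partner(present):
    """Return the single partner category for one diary slot.

    `present` is a hypothetical set of partner labels recorded for the slot.
    """
    for partner in PARTNER_PRIORITY:
        if partner in present:
            return partner
    return "self"


print(assign_partner({"friend", "father"}))   # father
print(assign_partner({"sibling", "others"}))  # sibling
print(assign_partner(set()))                  # self
```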

Our decision to categorize child activities according to activity partners reflects two ideas. First, prior research indicates that there is important heterogeneity in the productivity of time depending on the partner (Del Boca et al. (2013)). Second, disaggregating activity partners allows us to better control for selection. Each of these partners has different preferences and marginal costs of time, which may influence input choices and, for a given input choice, the return of these inputs on skills (which likely depends on factors such as the partner’s age, level of education, intensity of engagement, the child’s perceived authority of the activity partner, etc.). Controlling for how time is allocated across all these potential partners helps minimize this source of endogeneity. However, we also explore the sensitivity of our results to additional controls that relate to the categorizations chosen by Fiorini and Keane (2014) and Del Boca et al. (2013).

¹⁵ We choose these labels rather than “quality” / “non-quality” or “productive” / “non-productive” because they seem to describe more objectively the type of task the child is performing, and do not necessarily reflect our expectations about what the production function should look like.

¹⁶ A full description of our recoding rules is available upon request.

¹⁷ Our results are robust to changes in this order.

¹⁸ Following Fiorini and Keane (2014), we distinguish “refused to answer or do not know” (which we include in our sample) from the case where an activity is missing (which we exclude from our sample).

2.2 Summary Statistics

2.2.1 Children’s Time Allocation

In this section, we describe children’s time allocation using the recoded activity categories as described above. We construct a weekly measure for each time input by multiplying the weekday hours by 5 and the weekend day hours by 2, and then adding up the total hours.
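As a concrete illustration of this weekly aggregation, the sketch below scales one weekday diary and one weekend-day diary to a full week. The category names and the hours are invented for the example and are not taken from the data.

```python
def weekly_hours(weekday_hours, weekend_hours):
    """Weekly hours per category: 5 x weekday diary + 2 x weekend-day diary."""
    categories = set(weekday_hours) | set(weekend_hours)
    return {
        c: 5 * weekday_hours.get(c, 0.0) + 2 * weekend_hours.get(c, 0.0)
        for c in categories
    }


# Hypothetical complete diaries: each sums to 24 hours.
weekday = {"sleep": 9.0, "active_self": 7.5, "passive_mother": 3.0, "passive_self": 4.5}
weekend = {"sleep": 10.0, "active_friend": 4.0, "passive_mother": 5.0, "passive_self": 5.0}

week = weekly_hours(weekday, weekend)
for category in sorted(week):
    print(f"{category:15s} {week[category]:6.1f}")
```

Because each complete diary sums to 24 hours, the weekly measures for a child sum to 168 hours. A category that appears in only one of the two diaries, or in neither, is treated as zero hours on the missing day, which is how exact zeros can arise in the weekly inputs.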

Tables 2 and 3 show the weekly distributions of time (in hours) for younger and older children, respectively. Sleeping or napping is the most popular activity in our sample, as expected. Younger children also spend a lot of passive time with their mother, while older children spend a lot of active time by themselves. This is not surprising as a five- to eleven-year-old child is likely to spend a lot of time on personal needs and care with her mother, while an older, twelve- to seventeen-year-old child is likely to spend a lot of time at school.¹⁹

Compared to younger children, older children have smaller means across the 16 time categories except for active and passive time with friends and with self. Specifically, as children age, the average active time with the mother drops from 13 hours per week to 8 hours per week. Active and passive time with friends and with self seem to displace time with parents, grandparents and siblings as children age. It is also evident that there is less variation in active time with parents among older children, while there is more variation in passive time with parents. Importantly, almost every input category has a sizable mass of respondents reporting zero minutes.

¹⁹ 90% (96%) of self active time for the older (younger) cohort consists of time at school. We incorporate school time into self active time because it is difficult to determine with whom the child spends the bulk of their time at school. As discussed in Section 5, our results are robust to splitting self active time into two categories of inputs, one incorporating school activities and the other incorporating the remaining activities that make up self active time.

2.2.2 Children’s Skills, Demographics, and Parental Background

In this section, we discuss other variables in the data that are relevant to our analysis. We start by describing the children’s skill variables that we use as outcomes in our models. PSID-CDS children aged 3 and older are evaluated using the Woodcock-Johnson Revised Tests of Achievement (WJ-R), Form B (Woodcock and Johnson (1989)). In 1997, children aged 3-5 are administered Letter-Word Identification and Applied Problems sub-tests. Children aged 6 and above receive Letter Word and Passage Comprehension sub-tests as well as Applied Problems and Calculation sub-tests. In the 2002 and 2007 waves, these tests are re-administered, with the exception of the Calculation sub-test. Since the Calculation sub-test is only administered for the 1997 wave, we do not include it as one of our skill measures. Thus, we use standardized versions of Letter Word, Applied Problems, and Passage Comprehension as our child cognitive skill measures.²⁰ In the following sections we refer to these scores as Vocabulary, Math, and Comprehension.

Non-cognitive skills are measured through parental assessment. In all three waves, the primary caregiver is asked questions about the child’s behavioral problems. Twenty-six questions are used to measure the child’s behavioral problem scale, and ten other questions are asked about the positive aspects of children’s lives, including obedience/compliance, social sensitivity, persistence and autonomy. With these thirty-six questions, we construct a measure of non-cognitive skills by using iterated principal factor analysis, similar to Cunha and Heckman (2008) and Fiorini and Keane (2014). In Table 21 in the Appendix we show the rotated factor loadings. The factor loadings are all above 0.19 and stable across the two age groups. The constructed measure is standardized to have mean zero and standard deviation one and is ordered so that a higher score means better non-cognitive skills.

The PSID-CDS collects extensive information on the child, her household, as well as her school environment. In Table 4, we present demographic and parental background statistics for a few selected variables. Child characteristics are presented in rows 1 to 4, parental characteristics are presented in rows 5 to 12, and environmental characteristics are presented in rows 13 to 16. The table shows that the younger and older children are fairly similar in terms of demographic and parental background characteristics. On average, a child in the sample is her mother’s second child, and more than 50% live with both biological parents. The only sizable difference across age groups is the age of the children and their parents. Parents of the younger children are on average in their late thirties and parents of the older children are on average in their early forties. Also, household income is slightly higher for the older children relative to their younger counterparts.

3 Empirical Approach

In this section, we discuss our empirical approach, paying special attention to how we implement the test of exogeneity in our setting. A more technical discussion of the test in a multivariate context can be found in Caetano and Maheshri (2016); see Caetano (2015) for the formal description of the test in the univariate context.

20 We do not use the standardized scores provided by the PSID-CDS. Instead we standardize the raw score of each skill measure to have mean 0 and standard deviation 1 for both older and younger children.


We are interested in assessing whether we can consistently estimate $\beta$ via OLS in the following equation:

$$Skill_i = Input_i \beta + Control_i \pi + Error_i, \qquad (1)$$

where $i$ denotes a child. $Skill_i$ refers to a particular skill of the child (e.g., mathematics skill), as measured by standardized assessment scores. $Input_i$ refers to a vector of all activities done by the child in hours per week, whose $j$th element is denoted as $Input_i^j$ (e.g., active time spent with the mother). $Control_i$ refers to covariates added to absorb confounding factors, and $Error_i$ refers to the unobserved determinants of $Skill_i$ that are not absorbed by covariates.

In this context, a “model” is defined as a unique combination of $(Skill, Input, Control)$ in equation (1) for precise definitions of $Skill$, $Input$ and $Control$.21 We can consistently estimate $\beta := (\beta^1, ..., \beta^J)'$ via OLS in model $(Skill, Input, Control)$, described in equation (1), if:

Assumption 1. $Cov\Big(Error_i, Input_i^j \,\Big|\, \underbrace{Input_i^{-j}, Control_i}_{Covariates_i^j}\Big) = 0$, for all $j$, where $Input_i^{-j} := (Input_i^1, ..., Input_i^{j-1}, Input_i^{j+1}, ..., Input_i^J)$.

Our approach consists of testing Assumption 1 (jointly for all $j$) in all feasible models. In models that survive the test, we conclude that, at the same time for all $j$, all confounding factors that would bias $\beta^j$ are absorbed by $Covariates_i^j := (Input_i^{-j}, Control_i)$. Thus, we can reasonably interpret the $\hat{\beta}^{OLS}$ estimated from the models that survive the test as causal effects of time inputs on skills. Of course, the credibility of this approach depends crucially on the capability of the test to detect potential endogeneity. In the rest of this section, we explain the test and discuss in detail the types of endogeneity the test can and cannot detect in our context.

3.1 Testing the Exogeneity Assumption

The test of exogeneity relies on the assumption that unobserved confounders will be discontinuous when at least one time input is zero; thus, when inputs are zero, unobserved potential confounders are elicited in the equation. If these unobservables affect $Skill_i$ conditional on covariates (i.e., if $E[Skill_i | Input_i^j = x, Covariates_i^j]$ is discontinuous at $x = 0$, for some $j$), then they are not fully controlled for and hence $\beta$ cannot be consistently estimated via OLS in this model. Thus, the test reflects whether covariates are able to absorb the confounding unobservables which vary discontinuously when at least one input is equal to zero.

21 For robustness, we also consider different types of models in the paper, including ones where skills are a non-linear function of inputs.

We explain the intuition of this test in Figure 1, which illustrates the correlation between a generic time input and a generic child skill across all children in the sample. The goal of the test is to understand whether part of this correlation can be interpreted as causal. The discontinuity shown in Panel (a) must be the result of either the observed covariates or unobserved confounders varying discontinuously when the time input is equal to zero.22 As shown in Panel (b), conditional on all observed covariates $(Input_i^{-j}, Control_i)$, the discontinuity remains. The remaining discontinuity in Panel (b) must be the result of unobserved confounders that are not absorbed by the covariates, so Assumption 1 is rejected for this model.

The statistical power of the test comes from the assumption that unobservables vary discontinuously when a time input is zero. Below we show empirical evidence supportive of that; but first, we discuss why this is the case. Unobservables are likely discontinuous at zero in our context because observations bunch at a threshold, leading to a “corner solution” problem. For instance, consider a generic unobservable “mother type”, which helps determine the skills of a child. Figure 2 illustrates how the average mother type varies depending on the level of time spent reading books to the child. Mothers who read less to their child tend to have a lower type, as illustrated in the figure. However, something unique happens at zero. The mothers who read zero minutes to their child are discontinuously different from the mothers who read a little to their child. The reason is that among mothers who read zero, there are some whose type is so low that if possible they would have read negative amounts of time to their child. In this example, if mother type is not fully absorbed by covariates, then $E[Skill_i | Input_i^j = x, Covariates_i^j]$ will be discontinuous at $x = 0$, which explains the discontinuity found in Panel (b) of Figure 1. Each time input can elicit many such unobservable confounders; if covariates do not fully absorb them, then endogeneity will be detected.
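The corner-solution argument can be illustrated with a small simulation. The sketch below is not the paper's data or model: the data-generating process, the variable names, and the 0.25-hour window are all hypothetical choices made for illustration. It compares the average unobserved “mother type” among mothers who read exactly zero hours with the average among mothers who read just above zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (an assumption, not from the paper).
mother_type = rng.normal(size=n)                   # unobserved confounder
desired = 1.0 + mother_type + rng.normal(size=n)   # unrestricted optimal hours, may be negative
observed = np.maximum(desired, 0.0)                # non-negativity constraint creates bunching at 0

# Average type among mothers bunched at zero vs. mothers reading a little.
at_zero = mother_type[observed == 0.0].mean()
just_above = mother_type[(observed > 0.0) & (observed < 0.25)].mean()

print(f"mean type at zero hours:      {at_zero:.3f}")
print(f"mean type just above zero:    {just_above:.3f}")
```

In this setup the mothers at zero include everyone whose desired reading time is negative, so their average type sits well below that of mothers who read a little, which is the discontinuity the test exploits.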

More concretely, we implement the test by adding the vector $D_i$ to equation (1):

$$Skill_i = Input_i \beta + Control_i \pi + D_i \delta + Error'_i, \qquad (2)$$

where $D_i := (d_i^1, ..., d_i^J)$, $d_i^j := 1\{Input_i^j = 0\}$. The vector $D_i$ allows for a discontinuity in $E[Skill_i | Input_i^j = x, Covariates_i^j]$ at $x = 0$ for each $j$. We implement an F-test for whether $\delta = 0$, which tests the null hypothesis that $E[Skill_i | Input_i^j = x, Covariates_i^j]$ is continuous at $x = 0$ for all $j$ jointly. This test is equivalent to testing whether Assumption 1 holds (Caetano (2015); Caetano and Maheshri (2016)).

22 Of course, we rule out the possibility that the main effect is discontinuous at zero in equation (1). This implicit assumption, also made in all papers in the literature, is plausible in our context (e.g., a minute of reading a book should not affect the child’s skills that much). Another reason this assumption seems innocuous in our context is that we cannot reject the null hypothesis of continuity for models with a detailed enough list of covariates.
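As a rough illustration of the F-test on equation (2), the sketch below compares the restricted regression (equation (1)) with the unrestricted one that adds the zero-input indicators. It assumes a classical homoskedastic F-statistic, which may differ from the inference procedure actually used in the paper; the function name `exogeneity_f_stat` and the simulated data are hypothetical.

```python
import numpy as np

def exogeneity_f_stat(skill, inputs, controls):
    """Classical F-statistic for H0: delta = 0 in equation (2)."""
    n = len(skill)
    const = np.ones((n, 1))
    d = (inputs == 0).astype(float)       # D_i: one indicator per time input
    d = d[:, d.std(axis=0) > 0]           # drop indicators with no variation
    x_r = np.hstack([const, inputs, controls])   # restricted model, equation (1)
    x_u = np.hstack([x_r, d])                    # unrestricted model, equation (2)

    def ssr(x):
        coef, *_ = np.linalg.lstsq(x, skill, rcond=None)
        resid = skill - x @ coef
        return resid @ resid

    q = d.shape[1]                               # number of restrictions
    df = n - np.linalg.matrix_rank(x_u)          # residual degrees of freedom
    return ((ssr(x_r) - ssr(x_u)) / q) / (ssr(x_u) / df)

# Hypothetical simulation: w is a confounder that drives both input and skill.
rng = np.random.default_rng(1)
n = 20_000
w = rng.normal(size=n)
inputs = np.maximum(1.0 + w + rng.normal(size=n), 0.0).reshape(-1, 1)
skill = 0.1 * inputs[:, 0] + w + rng.normal(size=n)

f_omitted = exogeneity_f_stat(skill, inputs, np.empty((n, 0)))   # w left out
f_controlled = exogeneity_f_stat(skill, inputs, w.reshape(-1, 1))  # w included
print(f_omitted, f_controlled)
```

In this simulated example the statistic is large when the confounder is omitted and small once it is included among the controls, mirroring the logic of models that are rejected by or survive the test.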

3.2 Evidence of Bunching

The test described above exploits the potential bunching of observations resulting from a non-negative time constraint. Here we show empirically that observations do indeed bunch at zero time input thresholds. Figure 4 shows the cumulative distribution function (CDF) of various activities for both older and younger children. The fact that the CDFs cross the vertical axis away from the origin is direct evidence of bunching, as it shows that the probability density function is discontinuous at zero (McCrary (2008)). Moreover, the CDFs are smooth away from zero, suggesting that there is no bunching elsewhere. Tables 2 and 3 show the proportion of observations with zero time inputs for all inputs, providing evidence that other activities have similar distribution functions.
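The bunching check amounts to asking how much probability mass sits exactly at zero, which is the height at which the empirical CDF meets the vertical axis. A minimal sketch, with hypothetical function names and made-up weekly hours rather than PSID-CDS data:

```python
def share_at_zero(hours):
    """Fraction of observations with exactly zero hours in an activity."""
    return sum(1 for h in hours if h == 0) / len(hours)

def ecdf(hours, x):
    """Empirical CDF evaluated at x: share of observations with hours <= x."""
    return sum(1 for h in hours if h <= x) / len(hours)

# Hypothetical weekly hours for one activity (not from the paper).
weekly_hours = [0, 0, 0, 0.5, 1, 2, 2.5, 4, 7, 10]

zero_share = share_at_zero(weekly_hours)
cdf_at_zero = ecdf(weekly_hours, 0)
```

Because hours cannot be negative, the empirical CDF at zero equals the share of zeros; a value well above zero indicates a mass point at the threshold.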

Note that the bunching of inputs is not necessary for the test to work; it is sufficient that unobservables are discontinuous at the threshold for whatever reason. Nevertheless, the evidence of bunching is suggestive of the “corner solution” intuition developed above: unobservables should be discontinuous because people cannot choose negative amounts of time for any child-related activity.

3.3 Sources of Detectable Endogeneity

While the bunching evidence above indicates that the proposed test should have power, the test may not have power to detect all sources of endogeneity. In this section, we provide evidence concerning the sources of endogeneity the test has power to detect. To structure the discussion, we write the following general model of skill production:

$$Skill_i = f(\widetilde{Input}_i, Other_i) \qquad (3)$$

where $\widetilde{Input}_i$ is a vector of $\tilde{J}$ activities, defined at a very detailed level, and $Other_i$ is a vector that includes all other inputs in the production function. Elements of $\widetilde{Input}_i$ in this generic framework are defined precisely by a unique combination of all its features. For instance, reading different books, or reading different pages of the same book, refer to different time inputs. The production function $f(\cdot)$ is unrestricted. This is a “general” model in the following sense: if we observed all elements of $\widetilde{Input}_i$ and $Other_i$, and if we were able to estimate a non-restrictive $f(\cdot)$, then we would be able to identify the causal partial effect of $\widetilde{Input}_i$ on $Skill_i$. This is a good benchmark, as it allows us to discuss a comprehensive list of the sources of endogeneity that might arise in our application as deviations from this ideal scenario.

Assumption 1 essentially combines all of the simplifying assumptions that are needed to go from the general production function outlined above to the OLS specification we aim to estimate. For example, Assumption 1 includes assumptions about linearity, additive separability, and that $Control_i$ is sufficient to account for $Other_i$. A failure of any of these assumptions will imply the existence of a variable $w_i$ that may bias our main estimates if it is not absorbed by covariates. If this variable $w_i$ is discontinuous when $Input_i^j = 0$ for some $j$, then the test has power to detect its presence.

Figure 5 shows a few examples of potentially omitted variables $w_i$ that are likely elements of $Other_i$, where $E[w_i | Input_i^j = x]$ is discontinuous at $x = 0$ for some $j$.23 These plots indicate that the exogeneity test has power to detect whether we incorrectly omitted $w_i$: if $w_i$ affects skill conditional on covariates then skill will be discontinuous at $Input_i^j = 0$. As an example, Panel (a) of Figure 5 indicates that if maternal education affects child skill conditional on covariates, then the test will detect endogeneity when maternal education is excluded from $Control_i$ in (2). Panels (b)-(d) in Figure 5 indicate discontinuities in the number of books at home, hours the mother works per week, and household income when various time inputs are zero, expanding the list of potential elements of $Other_i$ whose omission the test has power to detect. Additional examples are provided in Figures 10 and 11 in the Appendix.

The $w_i$ we consider in Figure 5 are observable, so they can in principle be incorporated in $Control_i$ to avoid any endogeneity stemming from their exclusion. However, if observables are discontinuous at zero, then unobservables are also likely to be discontinuous at zero. For instance, the discontinuity in mother’s education level suggests that unobservables such as whether the mother is well read, whether the mother pays attention to the academic progress of the child, etc., might also be discontinuous.24 If these variables are not fully absorbed by covariates, then their discontinuity will be captured by $D_i$, leading to a rejection of the model.

23 Each plot shows $E[w_i | Input_i^j = x]$ along with a local cubic fit, where $w_i$ is denoted in the title, and $Input_i^j$ is denoted on the horizontal axis. At $x = 0$, we also show the 95% confidence interval. For $x > 0$, the scatter plot aggregates to the next hour of the time input. The shaded region represents the 95% confidence interval for the fit with an out-of-sample prediction at zero minutes. For the local fit, we use the Epanechnikov kernel with the rule-of-thumb bandwidth for the kernel and 1.5 times the rule-of-thumb bandwidth for the standard-error calculation. Results are robust to different choices of kernel and bandwidths.

24 The argument that similar patterns of discontinuity in observables should also be found in unobservables is analogous to the one made by researchers about continuity when implementing regression discontinuity designs.

Another potential source of endogeneity is simultaneity. If time inputs are caused by skills, rather than the other way around, Assumption 1 will be violated. For instance, children with low comprehension skill may be less willing to read, which may generate a spurious correlation between time spent reading and the comprehension skill measure. The test has power to detect endogeneity due to simultaneity exactly because of bunching. Indeed, the causal relationship of interest is plausibly continuous at zero, while the reverse causal relationship implies a discontinuous correlation between skill and inputs at zero minutes because children of different skills are bunched at that threshold. For instance, spending time to read the first word of the title of a book has essentially the same effect on comprehension skills as not reading it at all. In contrast, children who spend zero minutes reading should have a discontinuously lower comprehension ability in comparison to children reading a small amount, since children reading zero minutes may not even know how to read. In Figure 6, we show that strong correlates of skill levels, such as birth weight, race, age and height, are in fact discontinuous when various time inputs are zero, suggesting that we have power to detect simultaneity bias.

Measurement error (i.e., $\widetilde{Input}_i \neq Input_i$) can also generate endogeneity if the degree of misreporting is correlated with the observed time input vector.25 For example, children who spend more (active or passive) time alone may be more likely to fill out their own time-use survey, and children might tend to overstate certain inputs relative to adults (e.g., they might overstate the amount of time they spend with friends or other family members to conceal how often they are alone). In Figure 7, we show discontinuities at zero time inputs for examples of $w_i$ that are likely correlates of misreporting, such as whether the child completed the time diary alone. These discontinuities suggest that we have power to detect endogeneity stemming from measurement error. Additional examples are provided in Panels (c) and (d) of Figure 11 in the appendix.

Our test is also useful for detecting misspecification errors. This is important in our context since there are countless ways to group activities and model the relationship between skill and time inputs. In particular, we make four key simplifying assumptions to arrive at Equation (1). We discuss each assumption in turn, along with evidence that key variables $w_i$ elicited by the corresponding assumption vary discontinuously at zero. First, we aggregate many time activities into only a few categories, which may induce endogeneity due to over-aggregation (i.e., $\tilde{J} > J$). As an example, if a subcategory of maternal active time, such as reading with the mother, increases disproportionately as active time with the mother increases, then we may arrive at a biased estimate of the effect of maternal active time. Figure 8 shows discontinuities for some examples of $w_i$ that speak directly to this potential issue. For example, Panel (a) shows that children who spend no passive time with their father are likely to spend a discontinuously larger proportion of the active time they spend with their father at home, relative to children who spend little passive time with their father. This suggests that if active time with the father is differentially productive at home (a type of heterogeneity precluded by our aggregation scheme) and results in endogeneity, then the indicator variable for passive time with father would detect it. Additional examples are provided in Panels (e) and (f) of Figure 11 in the appendix.

25 Notice that this includes classical measurement error, i.e., when the error is uncorrelated with the true measures of time inputs.

Second, we assume that $f(\widetilde{Input}_i, Other_i)$ is separable, ruling out the presence of heterogeneous effects. For instance, mothers who read well may be more willing to read to their children, and this activity may generate a higher return to their children’s skill relative to mothers who do not read well. The plots in Figure 5 (see also Figure 10) depict discontinuities for examples of $w_i$ along which heterogeneity in returns of activities likely occurs, suggesting that we can detect endogeneity resulting from heterogeneous treatment effects. For instance, Panel (a) of Figure 5 shows that the level of education of the mother is discontinuous when active time spent with the mother is zero. Thus, the test has power to detect endogeneity from heterogeneous effects to the extent that any other time input (e.g., passive time with the mother) has a different effect on the child’s skills depending on the level of education of the mother.

A third potential misspecification issue that will generate endogeneity is the presence of non-linear effects (i.e., $f(\widetilde{Input}_i, Other_i) \neq \widetilde{Input}_i \beta + Control_i \pi$). In this case, $w_i = f(\widetilde{Input}_i, Other_i) - \widetilde{Input}_i \beta - Control_i \pi$ might be discontinuous when inputs are zero. Panel (a) of Figure 9 shows that $E[Input_i^{j'} | Input_i^j = x]$ is discontinuous at $x = 0$ for examples of $j \neq j'$. Children who spend zero active time with their mother spend a discontinuously larger amount of active time with their grandparents, an average increase from 2 to 5 hours per week. This is direct evidence that the test has power to detect endogeneity from non-linear effects (e.g., $f^{j'}(5) - f^{j'}(2) \neq (5 - 2)\beta^{j'}$).26 Additionally, our linear model (2) may incorrectly predict a discontinuous impact of inputs at zero because of non-linearities away from zero. In this case, the coefficients on $D_i$ will be significantly different from zero in an attempt to correct for this model misspecification. Regardless of the reason, the test has power to detect endogeneity stemming from non-linear effects.

26 In reality, there is heterogeneity across observations with the same value of $Input_i^j$, which enhances the power of the test because it can detect endogeneity if $f^{j'}(x_1) - f^{j'}(x_2) \neq (x_1 - x_2)\beta^{j'}$ for other values of $x_1$ and $x_2$. For instance, Panel (b) of Figure 9 shows that the entire distribution is discontinuous at $x = 0$, not only its first moment. Caetano and Maheshri (2016) discusses this point in more detail.

Finally, misspecification of controls may also lead to endogeneity. If $w_i$ is discontinuous at zero, then $w'_i := g(w_i)$ is also discontinuous at zero for almost all functions $g(\cdot)$. Thus, the test also has power to detect endogeneity due to misspecification of observed controls $w_i$, which can occur since it is unclear how they should be included in the equation.27

These examples of $w_i$ are just a small subset of observed variables for which we find discontinuities at $x = 0$. Moreover, for ease of exposition, we have discussed in turn the implications of each simplification that is needed to go from the general production function outlined in equation (3) to the specifications we estimate. Our approach is agnostic about the specific reason why Assumption 1 might fail, and in fact jointly tests for all sources of detectable endogeneity, even ones we may not conceive. Of course, even among these sources of endogeneity there may be confounders that cannot be detected by the test. For example, some confounder $w_i$ implied by an aggregation choice may not be discontinuous when inputs are zero. Next, we argue why these confounders are likely to be rare in our context.

3.4 Which confounders cannot be detected by the test?

Consider the set of all potentially endogenous variables $w$, characterized by being correlated with both $Input_i$ and $Skill_i$. If any such $w$ is not absorbed by covariates, Assumption 1 in the context of Equation (1) will be violated. These variables fall under two categories: (a) those that vary discontinuously at $Input_i^j = 0$, and (b) those that vary continuously at $Input_i^j = 0$. Thus far, we have focused our discussion on confounders of type (a) and our ability to detect them with our test. However, endogenous variables of type (a) can actually be further subdivided into two types: (a1) those that are correlated with $Skill_i$ when $Input_i^j = 0$, and (a2) those that are uncorrelated with $Skill_i$ when $Input_i^j = 0$. These three types of confounders, (a1), (a2), and (b), form a partition: any confounder $w_i$ must be of exactly one of these three types. The exogeneity test described above can detect all confounders of type (a1), but cannot detect confounders of types (a2) or (b). Note that our multivariate setting adds redundancies that contribute to the power of the test since a confounder of type (a2) or (b) for input $j$ can be of type (a1) for input $j'$.28 In any case, as we argue next, confounders of type (a1) should be the norm rather than the exception in our context.

27 If $w_i$ enters the equation non-linearly, discontinuities in higher moments of the distribution will add power to the test. Here we mostly show discontinuities in the first moment of the distribution, but we actually find discontinuities in the whole distribution (e.g., Panel (b) of Figure 9). For instance, the variance of $w_i$ is often discontinuously higher when inputs are zero. This is intuitive, as observations tend to be discontinuously more heterogeneous at that point because of bunching.

28 All observations with $Input_i^j > 0$ are such that $Input_i^{j'} = 0$ for some $j'$, for all $j$, reducing the possibilities of confounders of type (a2). Similarly, all observations with $Input_i^j = 0$ are such that $Input_i^{j'} = 0$ for some $j' \neq j$, for all $j$, reducing the possibility of confounders of type (b).


To help frame our argument, Figure 3 illustrates examples of confounders of types (a) and (b). The solid black lines (and points) in the figure correspond to $Input_i^j$, while the dashed black lines correspond to $Input_i^{j\star}$, where $Input_i^{j\star}$ is the unrestricted optimal choice of input $j$ by individual $i$. Of course, $Input_i^{j\star} = Input_i^j$ for $Input_i^j > 0$, but $Input_i^{j\star} \leq Input_i^j$ for $Input_i^j = 0$. This occurs because people cannot spend a negative amount of time on an activity.29

Panel (a) of Figure 3 distinguishes between confounders of type (a1) and (a2), both of which vary discontinuously at $Input_i^j = 0$. In this example, the red range along the right side of the vertical axis is the support region of the confounder for the whole sample, while the blue range along the left side of the vertical axis is the support region of the confounder for the subsample of observations such that $Input_i^j = 0$. A confounder, by definition, must be correlated with $Skill_i$ in the red range. A confounder is of type (a1) if it is correlated with $Skill_i$ in the blue range, while it is of type (a2) if it is not correlated with $Skill_i$ in the blue range. The evidence of bunching shown above suggests that type (a2) is unlikely, since a significant portion of the sample is such that $Input_i^j = 0$. Moreover, the redundancies implied by the multivariate test are particularly helpful in this case because the blue range of the same confounder will vary across inputs, allowing the test to cover more of the support of any confounder.

Panel (b) of Figure 3 depicts a confounder of type (b). In this case, the average of the confounder when $Input_i^{j\star} < 0$ has to be equal to the corresponding average for observations where $Input_i^{j\star} = 0$. This is implausible because the confounder is by definition correlated with $Input_i^j$ and there are many observations such that $Input_i^{j\star} < 0$ (as per the bunching evidence shown in Section 3.2). Indeed, the discontinuity plots discussed in the previous section suggest that confounders of type (a) are much more likely to occur.

Despite the improbability of confounders of types (a2) and (b), we pursue in Section 5 an extensive set of robustness checks designed to detect them. However, prior to presenting these checks, we report our main findings.

Remark 1. Our approach is not more helpful than standard approaches in verifying whether there is important unobserved heterogeneity in a given model. Rather, our approach improves over standard approaches only in verifying whether this unobserved heterogeneity leads to endogeneity. For instance, lack of endogeneity due to heterogeneous effects (e.g., $Input_i$ and $Other_i$ are non-separable) does not imply lack of heterogeneous effects. It only implies that our estimate should be an unbiased estimate of the weighted average of different heterogeneous effects, where weights are given by the distribution of $Other_i$ in the data. For instance, it may be that an additional hour of passive time alone has a positive effect for a wealthier child and a negative effect for a poorer child (e.g., the TV program the child is actually watching may be completely different for these two types of children). In this case, the estimates of a model where endogeneity cannot be detected by the test of exogeneity could be zero. Indeed, they should be an unbiased weighted average of the effect of one additional hour of passive time alone across all children, where weights are given by the proportion of wealthier and poorer children.

29 Note that $Input_i^j$, rather than $Input_i^{j\star}$, should be included as inputs in the production function, since we want to identify the effect of the actual (not the desired) time spent on activities. Thus, we do not have a censored model, we only have a corner solution model. Wooldridge (2002) discusses this distinction in detail.

4 Main Results

We start by proposing a set of regression models that we can plausibly estimate given the data described in Section 2. As explained, a “model” is defined as a unique combination of $(Skill, Input, Control)$ in equation (1), where $Skill \in$ {math, vocabulary, comprehension, non-cognitive skills}, $Input \in$ {sleeping or napping, active time with “companion”, passive time with “companion”, don’t know or refuse to answer}, and companion $\in$ {mother, father, grandparents, siblings, friends, others, self}. The set of potential $Controls$ will vary by age group because of data restrictions. We now describe the models we consider and present our results for each age group in turn.

4.1 Linear Treatment Effects

4.1.1 Older Children (12-17 Year Olds)

For older children, we consider only value-added models, which have been standard in the literature, by including in all specifications the value of $Skill_i$ observed in the previous wave.30 We consider a sequence of seven specifications of the value-added model, where each specification includes a richer set of controls than the previous one. We take this approach for two reasons. First, it illustrates that our exogeneity test has power to detect endogeneity in the most parsimonious models. Second, it helps identify the key controls that absorb important sources of endogeneity.

The details of each specification are as follows. Specification (1) has no controls other than the corresponding lagged skill. Specification (2) adds child characteristics, such as age, gender, and race. Specification (3) adds mother demographic characteristics, such as age, education level, and age at child’s birth. Specification (4) adds family demographic characteristics, such as father’s age, whether the child lives with biological parents, and household annual income. Specification (5) adds family environmental characteristics, such as whether the child has a musical instrument at home and whether the child’s neighborhood is safe. Specification (6) adds school characteristics, such as whether the child is in a public or private school and the school’s teacher-student ratio. Specification (7) adds the child’s school experience, such as whether the child has ever repeated a grade, and whether the child has ever attended a gifted program.31

30 We also show results for non value-added models in the appendix, for completeness.

Table 5 shows the exogeneity test F-statistic and corresponding p-value for each model we consider. The F-statistics and p-values in bold represent the surviving specifications, i.e., specifications for which we are unable to reject exogeneity at the 10% significance level. In specification (1), we reject exogeneity irrespective of the dependent variable, which provides direct evidence that our test has power to detect endogeneity in the basic value-added model we consider, complementing the evidence shown in Section 3.3. For different skill measures, the specification of $Control$ that makes the test no longer able to reject exogeneity is different. For example, the child’s observed characteristics, together with their lagged skill and all time inputs, are enough to absorb any confounder in the production function of math, vocabulary, and non-cognitive skills. In contrast, comprehension seems to be a more complex production process, as we fail to reject only models that include observed child, family, and school characteristics. School characteristics (i.e. specification (6)) lead to a jump in p-value for comprehension skills (i.e. p-value goes from 0.079 to 0.175), which is suggestive of the importance of school characteristics in absorbing endogeneity. Thus, Table 5 suggests that, for older children, different groups of control variables are playing very different roles in absorbing endogeneity depending on the skill in question.32 We are unable to reject exogeneity in specifications (6) and (7) for all four skills.33

31 Here is a full list of the control variables included for each category. Child characteristics: child’s age, child’s age squared, child’s gender, child’s race indicators, birth order to mother, born in the US indicator, child’s grade indicator, and child’s BMI. Mother demographic characteristics: mother’s education in years, mother’s current age, mother’s current age squared, and mother’s age at child birth. Family demographic characteristics: father’s education in years, father’s current age, father’s age at child birth, mother’s marital status at child birth, household annual income (in $1,000s), number of siblings child lives with, indicators of whether child lives with biological parents, and indicator of whether child lives with grandparents. Family environmental characteristics: spending on tutoring programs, spending on extra-curricular lessons, indicator for whether caregivers spent on school supplies for the child, indicator for whether child has a musical instrument at home, indicator for whether child has a desk at home, and rating of neighborhood quality. School characteristics: indicator for whether school is public, indicators for school enrollment criteria (e.g., based on geography), school’s teacher-student ratio, and child’s teacher’s full-time teaching experience. Child’s school experience: indicator for whether child has ever skipped a grade, indicator for whether child has ever attended a gifted program, and indicator for whether child has ever repeated a grade.

32 Note that in exercises not reported here, we vary the order in which we add controls. The importance of each group of variables in accounting for endogeneity is similar to what is observed in Table 5. We complete a similar exercise for the younger cohort as well. A full description of the permutation exercises we performed is available upon request.

Table 6 presents the estimated coefficients of time inputs from a surviving specification (specification (7)) for all four skill measures. We find that for math skills, active time with mother, active time with father, active time with grandparents, active time with friends, self active time, passive time with mother, passive time with siblings, and self passive time are statistically significant. Active time with grandparents is the most productive input: one more hour a week spent on active time with grandparents rather than on sleeping or napping would increase the math test score by 2.4% of a standard deviation, while one more hour a week spent on passive time with mother rather than sleeping or napping would increase the test score by about 0.6% of a standard deviation. It is also noteworthy that active time with friends is as productive as active time with mother for older children. Although we find that parental inputs have an impact on math skills for older children, there is little to no effect on vocabulary skills. In contrast, we find that active time with grandparents has a statistically significant effect on child cognitive skills generally (i.e. math, vocabulary and comprehension). We also find that passive time with mother, self active time, and self passive time have similar effects (0.7%-0.8% of a standard deviation) on child’s non-cognitive skills.

The coefficients in Table 6 indicate the impact of each input on skills relative to sleeping. However, by comparing the coefficients with each other we can comment on the relative effectiveness of various inputs. For example, substituting an additional hour per week of active time with the father for active time with others would increase math scores by 1.4% of a standard deviation (with a standard error of 0.2%). The effect on skills of substituting one input for another among older children could be quite different from the effect on younger children, a group we now turn to.
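The substitution effects described above are linear contrasts of two coefficients, so their standard errors depend on the covariance between the two estimates. The following is a minimal sketch in Python of that calculation. The function name `substitution_effect` and all numbers are illustrative assumptions; they are not taken from Table 6.

```python
import numpy as np

def substitution_effect(beta, cov, j, k):
    """Effect of moving one hour from input k to input j, with its standard error.

    beta: coefficient vector, each entry measured relative to the omitted
          activity (sleeping).
    cov:  estimated covariance matrix of beta.
    """
    beta = np.asarray(beta, dtype=float)
    cov = np.asarray(cov, dtype=float)
    c = np.zeros(len(beta))
    c[j], c[k] = 1.0, -1.0            # contrast vector: beta_j - beta_k
    diff = c @ beta
    se = np.sqrt(c @ cov @ c)         # Var(b_j - b_k) = V_jj + V_kk - 2 V_jk
    return diff, se

# Hypothetical values for two inputs (illustration only, not from Table 6).
beta = [0.020, 0.006]
cov = [[4.0e-6, 1.0e-6],
       [1.0e-6, 2.0e-6]]
diff, se = substitution_effect(beta, cov, j=0, k=1)
print(f"substitution effect = {diff:.4f}, standard error = {se:.4f}")
```

Because both coefficients are measured relative to the same omitted activity, the reference category cancels in the difference, which is why the contrast can be read as a direct substitution between the two inputs.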

4.1.2 Younger Children (5-11 Year Olds)

Data restrictions prevent us from considering value-added models for the younger set of respondents. For a child in this age group to have Letter Word or Applied Problem scores

33 Our identification strategy is based on the premise that we should expect that any confounder will be absorbed as we add controls, otherwise the test of exogeneity would detect its presence. However, this might not be the case if the standard errors of $\hat{\delta}$ also increased with the addition of controls. In that case, a discontinuity in $E[w \mid Input_i^j = x]$ at $x = 0$ would be wrongly interpreted as continuous; i.e., confounders of type (a) would be erroneously understood to be confounders of type (b). To check if this is the case, we present the distribution of the standard errors of all elements of $\hat{\delta}$ for all combinations of specifications, skills and age group in the online appendix. In practice, the standard errors do not seem to increase as more detailed controls are added in the specifications that we consider. This is not surprising since the addition of controls is simply an addition of incidental parameters to the regression, so it does not necessarily affect inference on the parameters of $\delta$, which remain fixed across all specifications.

in the previous wave, she has to be at least 8 years old in the current wave, and for her to have a Passage Comprehension score in the previous wave, she has to be at least 11 years old in the current wave.34 As a result, if we want to estimate the value-added model for younger children in the same way as for older children, we would be left with 172 observations only. Given that we have 15 time inputs and a large number of control variables (i.e. 11 in specification (2), 15 in specification (3), 24 in specification (4), 30 in specification (5), 36 in specification (6), and 39 in specification (7)), there would be very few degrees of freedom left. Thus, we consider only models that exclude lagged skills. So in the baseline model (i.e. specification (1)) for young children, we include only time inputs. Specifications (2) through (7) add the same controls used when estimating the production function for older children.35

Table 7 presents the F-statistics and corresponding p-values for the test of exogeneity performed in each model we consider. Unlike in the case of older children, child characteristics are no longer enough to absorb confounders with regard to math skills. Instead, mother’s demographic characteristics are now pivotal for absorbing endogeneity in math skills. For non-cognitive skills, family demographic characteristics are important to absorb confounders amongst younger children. Family demographic characteristics are also crucial for absorbing confounders for the vocabulary skills, but for comprehension skills, mother demographic characteristics seem to be enough. We fail to reject exogeneity in specifications (4)-(7) for all skill measures.36 This result is somewhat surprising given that lagged test scores are not included as controls, unlike in the case of older children. This could be explained in part by the fact that younger children have fewer opportunities to choose different activities from each other relative to older children, reducing the scope for endogeneity.37

Table 8 shows the estimated coefficients from specification (7) for the four skill measures. The estimates for children ages 5-11 in Table 8 are quite different from the estimates in Table 6 for children ages 12-17. Time inputs, in the way we categorize them, do not seem to play a critical role in improving younger children’s math or vocabulary skills. When we control

34 As described in Section 2, Letter Word and Applied Problems were not administered for children below 3, and Passage Comprehension was not administered for children below 6.

35 See footnote 31 for a full description of the control variables.

36 For both age groups, the surviving models do not change even when we explicitly include napping as an input in the regression (leaving only night sleeping as the input of reference), along with its corresponding indicator variable, and implement a stronger test of whether the 16 coefficients of the indicator variables are equal to zero.

37 In the appendix, we present estimation results without including lagged skills for older children (Tables 22 and 23). Not surprisingly, without the lagged scores more controls are generally needed in order for us to fail to reject exogeneity. However, we are able to arrive at specifications where the coefficients can be interpreted causally. Reassuringly, the surviving models for children aged 12-17, with or without lagged test scores, provide similar estimates for all time inputs (see Tables 6 and 23).

only for child characteristics, we find significant impacts of maternal time on cognitive skill formation; however, these models are rejected by the exogeneity test. Once we add controls for parental characteristics, the impact of parental time on skill formation vanishes and we fail to reject the model. Comprehension skills, on the other hand, are affected by time use at younger ages. Passive time with siblings, passive time with mother, as well as self active time have a statistically significant influence on comprehension skills. For non-cognitive skills, the estimates for younger children are mostly negative or zero, which suggests that sleeping or napping is the most productive way to improve younger children’s non-cognitive skills. Among the negative coefficients, active time with friends is the most unproductive one: one more hour a week spent in sleeping or napping rather than in active time with friends would increase the child’s non-cognitive skills by 1.8% of a standard deviation. Comparing the estimates for the non-cognitive skills between the younger and older children, we find that in order to improve a child’s non-cognitive skills, it is more productive to spend time sleeping or napping when the child is young, and more time with herself or with her parents as the child gets older.

Similar to older children, we can compare various activities to each other using the coefficients in Table 8. Because many of these coefficients are small and insignificant, the differences will also be small and insignificant. One of the few exceptions would be to substitute one additional hour per week in active time with mother for active time with friends. This would yield a 1.7% of a standard deviation (with a standard error of 0.3%) increase in non-cognitive skills. While variation in time inputs has relatively little impact for younger children, family background variables are quite strong predictors of math and verbal skills.

4.2 Non-Linear Treatment Effects

Our main specifications assume that the effects of time inputs on child skills are linear, but there can be interesting hidden heterogeneity in the results. In this section, we re-estimate our models using a linear B-spline in order to allow for non-linear treatment effects:

$$Skill_i = \sum_{j} f^{j}(Input_i^j) + Control_i \pi + D_i \delta + Error'_i, \qquad (4)$$

where $f^{j}(\cdot)$ is a linear B-spline function of $Input_i^j$ with parameters $\beta_{jk}$, $k = 1, 2, 3$, representing the linear effect within equally frequent intervals of the distribution of $Input_i^j$.
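To make the spline parameterization concrete, the sketch below builds regressors whose coefficients can be read as the slope within each interval. The text does not spell out the basis construction, so this "hours accumulated within each interval" form (which spans the same piecewise-linear functions as a linear B-spline) is an assumption, as are the tercile knots computed from a small made-up sample and the helper name `linear_spline_segments`.

```python
import numpy as np

def linear_spline_segments(x, knots):
    """Split x into one 'hours within interval' column per interval.

    With knots q1 < q2 the columns are min(x, q1), clip(x - q1, 0, q2 - q1)
    and max(x - q2, 0). The columns sum to x, and the coefficient on column k
    is the slope of the piecewise-linear function inside interval k.
    """
    x = np.asarray(x, dtype=float)
    edges = np.concatenate(([0.0], np.asarray(knots, dtype=float), [np.inf]))
    cols = [np.clip(x - lo, 0.0, hi - lo) for lo, hi in zip(edges[:-1], edges[1:])]
    return np.column_stack(cols)

# Made-up weekly hours for one time input (illustration only).
hours = np.array([0.0, 2.0, 5.0, 9.0, 14.0, 20.0])
knots = np.quantile(hours, [1 / 3, 2 / 3])   # equally frequent intervals
S = linear_spline_segments(hours, knots)
print(S)
```

If all three slopes are equal, the three columns collapse back to the single linear term, so the linear model is nested in this specification.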

4.2.1 Older Children (12-17 Year Olds)

The exogeneity test results for children ages 12-17 are presented in Table 9. For comparison, we show in bold the surviving specifications according to the linear model (2). It is useful to check if the models that survive the linear exogeneity test also survive the non-linear exogeneity test. As discussed at the end of Section 3.3, the coefficients of $D_i$ in these linear models can capture endogeneity from either discontinuous confounders or from a failure of the linearity assumption. The results show that the specifications that survive the exogeneity test in the linear model also tend to survive the exogeneity test in the B-spline model, and vice-versa. The only exception is specification (5) for comprehension, which survives the non-linear test but does not survive the linear test, suggesting that the linear test detects endogeneity partly due to misspecification of the production function. Overall, most of the power of the test seems to stem from discontinuous unobservables, otherwise the B-spline models would fail to reject in even the most parsimonious specifications. From specification (6) onwards, all models survive both exogeneity tests for all skills.

Table 10 shows estimates for all four skill measures in our preferred model of specification (7). We find that maternal active time has a significant positive effect on math and non-cognitive skills only when it is more than 15 hours per week, and maternal passive time only has a significant positive effect on math and comprehension skills when it is below 17 hours per week, and in fact has a negative, significant effect on comprehension skills when it is above 29 hours per week. A large amount (above 36 hours per week) of active self time (e.g., mostly due to school activities) seems to be productive for math, while a little (up until 1 hour per week) passive time with the father seems to be productive for vocabulary. These results are consistent with the linear results, but provide further details about the production function of skills.

4.2.2 Younger Children (5-11 Year Olds)

The exogeneity test results for younger children are presented in Table 11. Again, models that survive the linear test of exogeneity tend to survive the non-linear one and vice-versa, with a few exceptions, suggesting that the linear test of exogeneity detects endogeneity partly stemming from misspecification of the skill production function.

Table 12 reports the estimation results for children ages 5-11. Active time with the mother has a positive effect on non-cognitive skills, if in moderation (between 6 and 15 hours per week), but a negative effect when it is more than 15 hours per week. A lot of passive time with the mother or with siblings seems to be productive for vocabulary and comprehension. In contrast, passive time with the father seems to be counterproductive for the same skills. Moreover, up to 32 hours per week of self active time (mostly due to school activities) has a positive effect on cognitive skills.

Remark 2. As discussed in Remark 1, the fact that the treatment effect estimates are not linear is not evidence that our surviving specifications in the linear models suffer from endogeneity. Indeed, the results suggest that the linear estimates in our preferred models are a weighted average of the corresponding non-linear estimates. For example, the coefficient of passive time with the mother on mathematics skills is 0.006 for older children, which is similar to a weighted average of the three coefficients of passive time with the mother from specification (7) shown in Table 10 (i.e. $0.006 \approx \frac{1}{3}(0.016 + 0.000 + 0.004)$). In general, an F-test for whether each coefficient of the linear model is the same as the weighted average of the corresponding coefficients of the B-spline model for all 15 time inputs yields a p-value of 0.6526.

5 Sensitivity Analysis

Thus far, we have chosen appropriate models for causal inference purely based on the exogeneity test described in Section 3. However, there can be confounders that are not detectable by the test. As discussed in Section 3.3, there are two potential categories of confounders: (a) confounders that are discontinuous at $Input_i^j = 0$, and (b) confounders that are continuous at $Input_i^j = 0$. Among type (a) confounders, there are two subtypes: (a1) those that are correlated with skill at $Input_i^j = 0$, and (a2) those that are not. The exogeneity test introduced in Section 3.1 is capable of detecting all unobservables of type (a1), but is incapable of detecting unobservables of types (a2) or (b).
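The logic of detecting a type (a1) confounder can be illustrated with a stylized simulation. The sketch below is not the test as implemented in the paper, which is a joint F-test over the indicators of all inputs within each specification. It uses a single input, a single confounder `theta`, made-up parameter values, and a t-test with a normal approximation. All names and numbers are assumptions chosen for illustration.

```python
import math
import numpy as np

def ols(y, X):
    """OLS coefficients and homoskedastic standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (n - k)
    se = np.sqrt(sigma2 * np.diag(XtX_inv))
    return b, se

def bunching_test(skill, inp, controls=None):
    """Test the coefficient on D = 1{input == 0}; returns (coef, p-value)."""
    d = (inp == 0).astype(float)
    cols = [np.ones_like(inp), inp, d]
    if controls is not None:
        cols.append(controls)
    X = np.column_stack(cols)
    b, se = ols(skill, X)
    t = b[2] / se[2]
    p = math.erfc(abs(t) / math.sqrt(2))   # two-sided, normal approximation
    return b[2], p

rng = np.random.default_rng(0)
n = 20000
theta = rng.normal(size=n)                    # confounder that also raises skill
omega = rng.normal(size=n)                    # taste shifter unrelated to skill
inp = np.maximum(1.0 + theta + omega, 0.0)    # non-negativity -> bunching at zero
skill = 0.5 * inp + 1.0 * theta + rng.normal(size=n)

coef_no_ctrl, p_no_ctrl = bunching_test(skill, inp)
coef_ctrl, p_ctrl = bunching_test(skill, inp, controls=theta)
print(coef_no_ctrl, p_no_ctrl)   # indicator picks up the omitted confounder
print(coef_ctrl, p_ctrl)         # indicator has no role once theta is controlled
```

In this design the children bunched at zero have systematically lower `theta` than children just above zero, so the indicator is significant when `theta` is omitted and loses its explanatory role once `theta` enters as a control. A confounder that did not shift skill at zero (type (a2)) or did not jump at zero (type (b)) would leave the indicator coefficient at zero, which is why the robustness checks that follow are needed.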

As discussed in Subsection 3.4, there are a number of reasons to believe that the class of variables included in types (a2) and (b) is small in our context. Regardless of how implausible the existence of these variables might be, this section provides robustness checks that can in principle detect them if they exist.

5.1 Comparing Surviving and Non-Surviving Specifications

In this section, we compare estimates of $\beta$ across specifications, irrespective of whether the specification survives or does not survive the test, as shown in Section 4. This comparison is often done in empirical studies, where, heuristically, a good model is one that provides estimates that are robust to added controls (which might be omitted variables in the model).38 This “test of stable coefficients” is in principle capable of detecting endogeneity from the two undetectable sources of endogeneity discussed above. Indeed, added controls may partly absorb (both at $Input_i^j = 0$ and at $Input_i^j > 0$) confounders of type (a2) or (b), leading to a change in the main estimates. If a model survives the test of exogeneity, but does not survive this test, then it is evidence that the test of exogeneity did not detect some important source of endogeneity.

We test whether the fifteen elements of $\beta$ in each specification (1)-(6) from Section 4 are jointly significantly different from the corresponding coefficients in specification (7), our preferred model. We present the p-value of this test for each skill measure for older and younger children in Tables 13 and 14, respectively. Numbers in bold refer to those specifications that survive the exogeneity test at the 10% level of significance. In general, specifications that survive the exogeneity test (in bold) also survive the test of stable coefficients (p-value > 10%). Across all models of both tables, only one model that survives the exogeneity test is rejected by the other test: specification (2) for math in Table 5. This suggests that confounders from the undetectable sources of endogeneity discussed above are only controlled for after mother characteristics are added as controls (specification (3)). Conversely, a few models do not survive the exogeneity test but survive the other test (e.g., specification (4) for comprehension in Table 5, specification (3) for non-cognitive skill in Table 7). In these cases, the test of stable coefficients is unable to detect some confounders that are discontinuous when inputs are zero because they are not correlated with the full list of controls of specification (7). From specification (5) onwards, all specifications survive both tests for all skills and both cohorts. Overall, these results are consistent with the idea that, as we add controls from specifications (1) to (7) in Section 4, we converge to the true causal estimates.

Tables 15 and 16 show analogous results for the non-linear models discussed in Section 4.2. For each cohort and each specification (1)-(6), we show the p-value from a test of whether the 27 coefficients $\beta_{jk}$ are significantly different from the corresponding ones in specification (7).39 We show in bold the specifications that survive the exogeneity test. For older children, all surviving specifications according to the exogeneity test also survive the other test, but the reverse is not true. For younger children, in two cases a specification survives the exogeneity test but does not survive the other test, and in one case the reverse happens. As in the linear models, all specifications from specification (5) onwards survive both tests for all skills and both age groups.

38 For instance, Fiorini and Keane (2014) implement a somewhat weaker version of this test whereby they compare whether the ranking of the magnitude of each coefficient is the same across specifications.

39 Some inputs did not allow for more than one or two B-spline terms.

In an online appendix, we present the actual estimates for specifications (1)-(7) for each age group and for each skill, for both the linear and the B-spline cases, illustrating more explicitly how the estimates are virtually unchanged for the surviving specifications but often change for the non-surviving ones.40

5.2 Alternative Specifications

In this section, we perform many additional robustness checks on specification (7) from Section 4. Tables 17 and 18 report the p-value of a test for whether the estimate of $\beta$ changes as we add controls to specification (7) from Section 4. Each specification in these tables contains additional controls of two types: (a’) variables that are discontinuous when some input is zero (some of which are shown in the plots presented in Section 3), and (b’) variables that are continuous when each input is zero, for all inputs.41 These variables might be correlated with undetectable confounders, as discussed above. For instance, observables of type (a’) (type (b’)) might be correlated with unobservables of type (a2) (type (b)). If they are, then they will partly absorb confounders that are undetectable by the exogeneity test, which would tell us that the test is unable to detect important sources of endogeneity. The p-values in Tables 17 and 18 provide clear evidence that our estimates of specification (7) are statistically unchanged in all alternative specifications for both age groups.

Specifications (1’)-(3’) are particularly useful to allay further concerns about omitted variables and simultaneity. In specification (1’), we add more control variables related to child characteristics, family demographic characteristics, and environmental characteristics.42 In specification (2’), we add the 15 lagged (i.e., from the previous wave) time inputs.43 In specification (3’), we add the other three lagged skill measures as well as the interactions between any two of the four lagged skills.44

In specification (4’), we add controls related to misreporting of time diaries (12 additional controls),45 to allay further concerns about measurement error. Specifications (5’)-(11’) are included to check for undetectable confounders from over-aggregation. Active time activities are further subcategorized in the data as educational, social, and school activities, while passive time activities are further subcategorized in the data as general care and media activities.46 In specification (5’), we add one more time input by separating school time from self active time, and test whether any of the 15 original coefficients change.47 In specification (6’), we add the proportions of each active time input spent in educational activities (7 additional controls).48 In specification (7’), we add the proportions of each passive time input spent in general care (7 additional controls).49 In specification (8’), we add the proportions of each passive time input spent watching TV (7 additional controls).50 In specification (9’), we add the proportions of each time input spent at home as opposed to elsewhere (14 additional controls).51 In specification (10’), we add the proportions of each time input spent in activities with someone “participating” (14 additional controls).52 In specification (11’), we add the proportions of each time input spent during weekends (14 additional controls).

In specification (12’), we check if our results are robust to the definition of age groups. In this specification, the younger group refers to five- to twelve-year-old children (rather than five- to eleven-year-old children) and the older group refers to thirteen- to seventeen-year-old children (rather than twelve- to seventeen-year-old children).53

Analogously, Tables 19 and 20 present the same robustness checks for the non-linear models discussed in Section 4.2, with the aim of allaying further concerns about non-linearities. We test whether the coefficients $\beta_{jk}$, for all $j$ and $k$ (27 coefficients), change as we change specification (7). The results show that, similarly to the linear models, the estimates do not change.

40 The online appendix is available at http://bit.ly/1KOy1aj.

41 Of course, these variables may not be confounders of type (b), because they may not be correlated to inputs at all.

42 Here is the full list of added controls in specification (1’): child’s birth weight, child’s current height, mother’s race indicators, father’s race indicators, birth order to father, mother’s working hours in a week, mother’s working days in a week, indicator for whether mother’s working schedule is a regular (vs. night) shift, number of books mother read last year, and indicator for whether caregivers spent on clothes for the child last year.

43 This specification is referred to as the “cumulative model” by Todd and Wolpin (2007) and Fiorini and Keane (2014).

44 We do not present younger cohort’s results for specification (3’) for lack of data, as discussed in Section 4.

45 The list includes whether the diary was self-administered, whether the diary was reviewed face-to-face, whether the diary was reviewed via phone, and indicators of who completed the diaries.

46 Fiorini and Keane (2014) stratify active and passive activities according to these five types, depending on whether the activity involves parents. Thus, specifications (5’) and (6’) attempt to check for evidence of heterogeneous effects in dimensions that are captured by their specification of inputs and not captured by ours.

47 School activities, originally fully included in self active time, comprise attending classes for full-time students, and daycare or nursery school for children not in school. They represent about 19% (18%) of all activities, 59% (54%) of all active activities and 90% (96%) of the self active time activity for the older (younger) cohort.

48 Educational activities include helping adults doing household chores, taking extracurricular lessons, and reading. They represent about 6% (5%) of all activities and 20% (15%) of all active activities for the older (younger) cohort.

49 General care includes obtaining goods and services, personal needs and care (e.g. having meals), and traveling/waiting. These activities represent about 16% (15%) of all activities and 55% (60%) of all passive activities for the older (younger) cohort.

50 Watching TV represents about 8% (8%) of all activities and 29% (31%) of all passive activities for the older (younger) cohort.

51 Time spent at home accounts for about 26% of children’s total time in a week for both cohorts.

52 When filling out the time diaries, the respondents were asked not only about with whom each activity was performed, but also whether the partner actually participated in the activity (versus being just around while the child performed the activity). Participation time accounts for about 18% (21%) of children’s total time in a week for the older (younger) cohort. This variable was used in Del Boca et al. (2013) to categorize inputs.

53 Lack of degrees of freedom prevents us from estimating the parameters with age groups with narrower ranges.

Given the evidence presented in this section, it is difficult to conceive of a confounder that may be biasing our estimates. It needs to be of type (a2) or (b) for all inputs and at the same time be undetectable by all the robustness checks provided in this section. For instance, it is difficult to conceive of variables (of type (a2) or (b) for all inputs) correlated to both $Skill_i$ and $Input_i$ observed in the current wave, and yet uncorrelated to both $Skill_i$ and $Input_i$ observed in the previous wave.

6 Discussion

6.1 Why Does Selection on Observables Work in This Context?

The results for the linear and non-linear models discussed in the prior sections indicate that with rich enough controls we are able to arrive at specifications for which we fail to reject exogeneity. Moreover, as discussed in detail in the past sections, this does not appear to result from a lack of power. A natural question to ask at this point is why a selection on observables approach seems to be appropriate in the context of this application.

While the richness of the available controls in the PSID is certainly helpful for mitigating endogeneity, incorporating the full set of inputs into the production function is also quite useful. To see this, consider the following simple model of input choices and skill formation where, for simplicity, we treat the child as the sole decision-maker. Skill for individual $i$ is determined according to

$$Skill_i = f(Input_i, \theta_i),$$

where $Input_i$ is a vector of $J$ time inputs and $\theta_i$ is a vector of other inputs (i.e., $Other_i$ in equation (3)) impacting skill, which reflects any heterogeneity in the production function across children (e.g., how much attention the child pays when reading). Children choose $Input_i$ to maximize utility

$$U_i = g(Input_i, \theta_i, \omega_i)$$

subject to $Input_i^j \geq 0$ and $\sum_{j=1}^{J} Input_i^j = T$, where $T$ is the total available time (i.e., 24 hours per day). $\omega_i$ is a vector denoting heterogeneity in utility that is not associated with heterogeneity in skill production (e.g., how much the child enjoys reading). In this general formulation, skill and time inputs can in principle affect utility directly, as can the other inputs influencing the production of skill, $\theta_i$.

Given this maximization problem, the chosen vector of time inputs, $Input_i^*$, is implicitly defined by the levels of $\theta_i$ and $\omega_i$:54

$$Input_i^* = h(\theta_i, \omega_i)$$

so that individuals with different levels of $(\theta_i, \omega_i)$ tend to choose different levels of the vector of inputs. For a given $\theta_i$, the variation in inputs due to $\omega_i$ is not endogenous and is in fact precisely the type of variation we want to exploit when estimating the production function. Of course, although the component of $\omega_i$ that is orthogonal to $\theta_i$ would make ideal instruments to identify the effect of interest, it is difficult to know ex ante which source of variation is included in $\omega_i$ and which source of variation is included in $\theta_i$, hence our need to develop an alternative identification strategy in this paper.

We can write $Input_i^{*,j}$ as

$$Input_i^{*,j} = h^{j}(\theta_i, \omega_i, Input_i^{*,-j}).$$

In our context, endogeneity arises if an input is correlated with $\theta_i$ across individuals, conditional on covariates: $Cov\left(Input_i^{*,j}, \theta_i \mid Input_i^{*,-j}, Control_i\right) \neq 0$, i.e., if $h^{j}(\cdot, Input_i^{*,-j}, Control_i)$ varies with $\theta_i$.

We conjecture that we are able to eliminate endogeneity and identify the effects of interest with our data for two reasons. First, to the extent that other inputs $Input_i^{*,-j}$ absorb elements of $\theta_i$, adding them as covariates can substantially reduce the potential for endogeneity, requiring less of the vector $Control_i$. Second, as we add $Control_i$ we are able to shut down any correlation between $\theta_i$ and $Input_i^{*,j}$ (conditional on $Input_i^{*,-j}$) before we shut down the correlation between $\omega_i$ and $Input_i^{*,j}$. The full set of controls incorporated in the empirical model must be unable to thoroughly absorb $\omega_i$, otherwise there would be no independent variation remaining in $Input_i$ to estimate the production function. $\omega_i$ reflects tastes and household constraints, which are likely quite heterogeneous across people, while $\theta_i$ is bound by technical features of the skill production technology. Thus, it is not surprising that covariates can fully control for $\theta_i$ without fully controlling for $\omega_i$.

The above discussion illustrates a largely under-appreciated benefit of modeling the full vector of inputs in skill production. The inclusion of a comprehensive list of time activities not only enhances the interpretability of the production parameters, but can also substantially allay endogeneity concerns. Indeed, all else constant, $Input_i^{*,-j}$ helps absorb more confounders the more disaggregated inputs are.

54 $Input_i^*$ represents $\widetilde{Input}_i$ in equation (3). For simplicity in the exposition, we assume no measurement error in this section.

6.2 What Can (and Cannot) be Inferred from Our Estimates?

In this paper, we estimate the average marginal productivity of each input on each skill, across all children of each age group. It is useful to interpret these estimates with the aid of the framework described above. We estimate $E[f_j(Input_i^*, \theta_i)]$ for each $j$, where $f_j$ refers to the first derivative of the production function $f$ with respect to its $j$th input, and the expectation is taken across all children $i$ of each age group.

When $E[f_j(Input_i^*, \theta_i)] > 0$, we conclude that on average children will see an improvement in skill if they decide to spend more time on activity $j$ (relative to sleeping), in comparison to their current time. However, that does not necessarily imply that children should spend more time on activity $j$. Indeed, children and their families likely make time allocation choices in order to maximize utility, not skill. To illustrate the implications of this, we show how different children and their parents might choose different levels of time inputs, and how these different choices might lead to different estimates of $f_j(Input_i^*, \theta_i)$. Assume that children and their parents care about skill ($f$), non-skill ($u$), and costs ($c$) such that

$$U_i = f(Input_i, \theta_i) - nc(Input_i, \theta_i, \omega_i)$$

where $nc(Input_i, \theta_i, \omega_i) := c(Input_i, \theta_i, \omega_i) - u(Input_i, \theta_i, \omega_i)$ represents the utility cost net of non-skill benefits, which is allowed to be heterogeneous across different time investments. Intuitively, one can think of $c$ as representing the component of utility related to “costs” and $u$ as representing the component of utility related to “fun”, although $u$ can be interpreted more generally to also encompass any mistake in optimization.55 The first order conditions for an optimum in the interior imply

$$f_j(Input_i^*, \theta_i) - nc_j(Input_i^*, \theta_i, \omega_i) = f_{j'}(Input_i^*, \theta_i) - nc_{j'}(Input_i^*, \theta_i, \omega_i)$$

where $nc_j$ is defined analogously to $f_j$. In words, there should be a one-to-one relationship between differences in marginal productivity across two positive inputs $j$ and $j'$ and their corresponding net costs. If time input $j$ is observed to have a greater marginal product than input $j'$, the reason must be that input $j$ is commensurately more costly (net of non-skill utility benefits). In addition, consider a situation where $Input_i^{*,j} = 0$ and $Input_i^{*,j'} > 0$. Then it must be the case that

$$f_j(Input_i^*, \theta_i) - nc_j(Input_i^*, \theta_i, \omega_i) \leq f_{j'}(Input_i^*, \theta_i) - nc_{j'}(Input_i^*, \theta_i, \omega_i).$$

That is, if the optimal choice for input $j$ is zero, then the marginal net return of input $j$ should be lower than the marginal net return of input $j'$, for $Input_i^{*,j'} > 0$.

55 For instance, if children and their parents want to maximize the true skill but perceive the production function to be $\tilde{f}(Input_i, \theta_i, \omega_i)$ instead of $f(Input_i, \theta_i)$, $u$ can be written as $u := \tilde{f}(Input_i, \theta_i, \omega_i) - f(Input_i, \theta_i)$, where in this case $\omega_i$ is interpreted as the vector representing the heterogeneity of this misperception across children and their family. If instead they maximize just fun, then $u := u'(Input_i, \theta_i, \omega_i) - f(Input_i, \theta_i)$ where $u'$ represents the actual component of the utility representing “fun”.

Given the discussion above, it is difficult to predict ex ante the expected distribution of $f_j(Input_i^*, \theta_i)$. The effects depend implicitly on the distribution across children of the marginal net costs of each activity, $nc_j(Input_i^*, \theta_i, \omega_i)$, which are in turn functions of the joint distribution of $(\theta_i, \omega_i)$.56
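The equality for two positive inputs and the inequality for an input chosen at zero, both stated above, can be read off the Kuhn-Tucker conditions of the time allocation problem. The following sketch assumes that $f$ and $nc$ are differentiable; the multipliers $\lambda_i$ (time budget) and $\mu_i^j$ (non-negativity) are notation introduced here for illustration and do not appear in the original exposition.

```latex
\begin{align*}
\mathcal{L}_i &= f(Input_i,\theta_i) - nc(Input_i,\theta_i,\omega_i)
   + \lambda_i\Big(T - \sum_{j=1}^{J} Input_i^{j}\Big)
   + \sum_{j=1}^{J} \mu_i^{j}\, Input_i^{j}, \\
\frac{\partial \mathcal{L}_i}{\partial Input_i^{j}} = 0
  \;&\Rightarrow\;
  f_j(Input_i^*,\theta_i) - nc_j(Input_i^*,\theta_i,\omega_i) = \lambda_i - \mu_i^{j},
  \qquad \mu_i^{j} \ge 0, \quad \mu_i^{j}\, Input_i^{*,j} = 0, \\
Input_i^{*,j} > 0,\; Input_i^{*,j'} > 0
  \;&\Rightarrow\; \mu_i^{j} = \mu_i^{j'} = 0
  \;\Rightarrow\; f_j - nc_j = \lambda_i = f_{j'} - nc_{j'}, \\
Input_i^{*,j} = 0,\; Input_i^{*,j'} > 0
  \;&\Rightarrow\; f_j - nc_j = \lambda_i - \mu_i^{j} \le \lambda_i = f_{j'} - nc_{j'}.
\end{align*}
```

The multiplier $\lambda_i$ is the common marginal net return of an hour across all activities the child actually engages in, so the estimated marginal products differ across positive inputs only to the extent that marginal net costs differ.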

This framework is useful to understand the role of heterogeneity in shaping our estimates of the effect of time allocation on skills. As discussed in Remark 1, the estimates of our surviving models should represent an unbiased average of the distribution of $f_j(Input_i^*, \theta_i)$ across all children of each age group. The fact that we find that sleeping has a positive return on non-cognitive skills for younger children suggests that on average, if all children increased their time sleeping by one hour we would observe an increase in average non-cognitive ability. However, it may be that the non-cognitive ability of some children would decline with such a reallocation. Our specification of inputs is not detailed enough to capture such heterogeneous effects. In this paper, we have focused on providing evidence of heterogeneous effects only along a few dimensions (age groups and current time allocation) because of data constraints. To compensate for a lack of data, we ensure the test of exogeneity has power to detect endogeneity from heterogeneous effects that are not captured by our specification of inputs. Thus, we can reasonably conclude that the unobserved heterogeneity not incorporated in our specification of inputs does not generate endogeneity. However, we cannot conclude that this unobserved heterogeneity is small or unimportant for policy. Future investigation of heterogeneous effects of time allocation on skills along dimensions other than the ones we have studied is warranted.

56 Moreover, non-linearities in the production function can complicate the interpretation even further. If $f(\mathrm{Input}_i^*, \theta_i)$ is non-separable between $\mathrm{Input}_i^*$ and $\theta_i$, or if $f(\cdot, \theta)$ is non-linear in inputs, as it appears to be according to our results in Section 4, then children with different values of $(\theta_i, \omega_i)$ should choose different levels of $\mathrm{Input}^*(\theta, \omega)$, leading them to have potentially different values of $f_j(\mathrm{Input}^*(\theta, \omega), \theta)$. Remarks 1 and 2 discuss this topic in more detail.

6.3 Relationship with the Previous Literature

It is widely believed that children’s outcomes might improve if more of their time is spent in active activities.57 However, evaluating this conventional wisdom is difficult because it is not clear which activities are actually productive and what these activities might substitute for. This paper adds to the literature by examining how child cognitive and non-cognitive skills are impacted by time use, where time is categorized into comprehensive and precisely defined activities. We find that active time with parents or other activity partners does not help younger children, and helps older children but only in developing math skills. Additional passive time does not hurt and sometimes helps with skill development. Schooling helps develop cognitive skills, but has different effects on non-cognitive skills according to child age. Additional time in school helps older children develop non-cognitive skills, but inhibits the development of non-cognitive skills for younger children. An increase in sleep helps non-cognitive skills for younger children, but does not matter for older children. Finally, the overall impact of time inputs is smaller for younger children than for older children.58

Although there is an extensive literature in economics on child skill development, there are only two studies, Del Boca et al. (2013) and Fiorini and Keane (2014), that estimate the effect of children’s time allocation on skill formation. Del Boca et al. (2013) also use the PSID-CDS, but do not incorporate all child activities, making it difficult to compare our results to theirs even if both papers provided unbiased estimates. In contrast, Fiorini and Keane (2014) incorporate a comprehensive list of activities as we do, but use data from Australia rather than the US. Thus, it is difficult to make comparisons between our estimates and theirs even if both papers provided unbiased estimates. Indeed, one can think of institutional differences across countries that may lead to different estimates of $E[f_j(\mathrm{Input}_i^*, \theta_i)]$ because $\omega_i$ is distributed differently for children with the same value of $\theta_i$ (e.g., child care costs, female labor supply elasticity, social norms about how children should be raised, etc.).59

57 According to the American Academy of Pediatrics (AAP), children today spend seven hours a day on entertainment media (a passive activity). The AAP, however, recommends that children and teens should engage with entertainment media for no more than an hour or two a day. It is recommended that more time be allocated to outdoor play, reading, hobbies and free-play, all of which are active activities. See https://www.aap.org for additional details.

58 We suspect that the structured nature of younger children’s days may not leave much scope for families to affect their children’s skill via a reallocation of activities. In the decade since these households were interviewed, there has been a significant focus both in academia and the public media on early childhood investments. It would be interesting to explore these same questions for a more recent cohort of children.

59 The data confirms that the joint distribution of $(\theta_i, \omega_i)$ in the Australian data is completely different from that in the American data. This can be inferred from the difference in the distribution of $\mathrm{Input}_i^*(\theta, \omega)$ across these two countries as seen in the summary statistics in both papers. For instance, on average American children spend more passive time and less active time with their mother than Australian children do. As discussed in Section 6.2, differences in the joint distribution of $(\theta_i, \omega_i)$ should lead to different estimates of $E[f_j(\mathrm{Input}_i^*, \theta_i)]$ purely due to the presence of heterogeneous effects.

Nevertheless, for completeness we compare our main findings with those from Fiorini and Keane (2014). This discussion is limited to younger children, since Fiorini and Keane (2014) do not have data for older children. A common finding across the two studies is that non-cognitive skills of younger children are relatively unresponsive to parental time inputs. Additionally, the fit of the non-cognitive skill regressions in both papers tends to be poor, suggesting that much of the variation in child non-cognitive skills remains unexplained. Both studies also find that sleeping is one of the more important activities for non-cognitive skill production. While our findings regarding the production of non-cognitive skills are similar to those of Fiorini and Keane (2014), our results relating to the production of cognitive skills are quite different. In particular, we find that active time with parents or others in the US has little to no effect on cognitive skill formation, while Fiorini and Keane (2014) find that educational time with parents or others in Australia is quite productive. The source of this difference is difficult to pin down. In addition to the issues cited above, our aggregation scheme for time inputs and the set of controls included in our models are different from theirs.

To truly understand the differences in findings, and ultimately the role of time allocation in skill development more broadly, much richer data and models of skill production and time allocation are needed. It is not enough to simply estimate more flexible production functions, since as noted above it is difficult to interpret the results without a formal model of time allocation.60 Such a model would require specifying a utility function, determining the costs associated with each time input, and assessing the information available to children and their parents as they consider these input choices. While such a model is beyond the scope of this paper, we believe our approach and estimates of skill production are an important step towards the creation of this broader framework.

60 The non-linearity we incorporate in our models is of a relatively modest form, a limitation imposed by the size of our sample.

7 Conclusion

Cognitive and non-cognitive skills are critical for a host of economic and social outcomes in adulthood. While there appears to be a consensus view that a significant amount of skill acquisition and development occurs early in life, the precise activities and investments that drive this process are not well understood. In this paper we examine how children’s time allocation affects the accumulation of skill.

To do this, we apply a recently developed test of exogeneity to search for models that yield causal estimates of the impact time inputs have on child skills. The test exploits bunching in time inputs induced by a non-negativity time constraint. We provide evidence that the test is able to detect endogeneity arising from omitted variables, simultaneity, measurement error, and a host of misspecification errors. There are potential sources of endogeneity that the test is unable to detect. However, our robustness exercises, which are designed to detect them, suggest that our rich set of controls, together with a comprehensive list of time inputs, are able to absorb them. The test indicates that with a sufficient set of controls, already available in the most detailed datasets, we are unable to reject exogeneity of time inputs for both younger and older children.
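The logic of a bunching-based test can be illustrated with a deliberately simplified sketch. The code below is not the implementation used in this paper: it relies on simulated data, a single time input rather than fifteen, and a hypothetical helper `bunching_test_f`. It assumes that the test amounts to checking for a discontinuity in the expected outcome at the bunching point (zero hours), conditional on controls, and it uses an HC0 heteroskedasticity-robust variance.

```python
import numpy as np


def bunching_test_f(y, x, controls):
    """F-statistic for H0: no discontinuity in E[y | x, controls] at x = 0.

    Regresses y on a constant, x, the indicator 1(x == 0), and controls,
    then tests the indicator's coefficient using an HC0 robust variance.
    """
    n = len(y)
    d = (x == 0).astype(float)
    X = np.column_stack([np.ones(n), x, d, controls])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = (X * resid[:, None] ** 2).T @ X
    V = XtX_inv @ meat @ XtX_inv
    return beta[2] ** 2 / V[2, 2]


# Simulated data (all parameter values are arbitrary assumptions).
rng = np.random.default_rng(0)
n = 5000
eta = rng.normal(size=n)                        # confounder
x_star = 1.0 + 2.0 * eta + rng.normal(size=n)   # desired (latent) hours
x = np.maximum(x_star, 0.0)                     # non-negativity constraint: bunching at zero
y = 0.5 * x + 1.0 * eta + rng.normal(size=n)    # skill outcome

f_omitted = bunching_test_f(y, x, np.empty((n, 0)))   # confounder left out
f_controlled = bunching_test_f(y, x, eta[:, None])    # confounder included
print(f"F without control: {f_omitted:.2f}, F with control: {f_controlled:.2f}")
```

In this simulation, children bunched at zero have systematically lower values of the confounder than children just above zero, so the outcome jumps at zero when the confounder is omitted. Once the confounder is included as a control, the true discontinuity is zero by construction.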

For younger children, we find that sleeping is critical for the development of non-cognitive skills while maternal passive time is important for cognitive skill development. For older children, active time with adults is relatively valuable in developing cognitive skills, while passive time with parents and time alone are important for non-cognitive skills. However, these effects are likely to be heterogeneous across families, children within families and activities within our time input categories. As better data become available, a similar approach to the one implemented here can be used to uncover causal estimates at a more disaggregated level.

An additional benefit of our approach is that it can be used as a first stage in a broader model aimed at understanding household decisions about work, leisure, and investments in children. Typically, papers that are interested in such questions embed a skill production function in a more detailed household behavioral model (e.g., Del Boca et al. (2013)). Using our estimates would reduce the computational burden as well as ensure that endogeneity concerns have been considered.

Finally, our approach to estimating how children’s time allocation affects skill development can be utilized to study the consequences of other similar resource allocation decisions. Examples include understanding the impact of watching violent media on violent behavior or the productivity benefits of time spent exercising. In both examples, the activity of interest is endogenous, it is unclear which activity is being substituted for,61 and individuals are likely to bunch at zero as a result of non-negativity constraints. As time diaries become more ubiquitous, the methodology employed here provides researchers with a potential tool to study causality without an ex-ante source of exogenous variation.

61 DellaVigna and Ferrara (2015) discuss these endogeneity issues in the context of the economic and social impacts of the media.

References

Almond, D. and Currie, J. (2011). Human capital development before age five. Handbook of Labor Economics, 4:1315–1486.

Bernal, R. and Keane, M. P. (2010). Quasi-structural estimation of a model of childcare choices and child cognitive ability production. Journal of Econometrics, 156(1):164–189.

Caetano, C. (2015). A test of exogeneity without instrumental variables in models with bunching. Econometrica, 83(4):1581–1600.

Caetano, G. and Maheshri, V. (2016). Identifying dynamic spillovers of crime with a causal approach to model selection.

Cameron, S. V. and Heckman, J. J. (1998). Life cycle schooling and dynamic selection bias: Models and evidence for five cohorts of American males. Journal of Political Economy, 106(2):262–333.

Cunha, F. and Heckman, J. J. (2008). Formulating, identifying and estimating the technology of cognitive and noncognitive skill formation. The Journal of Human Resources, 43(4):738–782.

Cunha, F., Heckman, J. J., Lochner, L., and Masterov, D. V. (2006). Interpreting the evidence on life cycle skill formation. Handbook of the Economics of Education, 1:697–812.

Currie, J. and Thomas, D. (1999). Early test scores, socioeconomic status and future outcomes. Technical report, National Bureau of Economic Research.

Del Boca, D., Flinn, C., and Wiswall, M. (2013). Household choice and child development. The Review of Economic Studies, page rdt026.

DellaVigna, S. and Ferrara, E. L. (2015). Economic and social impacts of the media. Technical report, National Bureau of Economic Research.

Deming, D. (2009). Early childhood intervention and life-cycle skill development: Evidence from Head Start. American Economic Journal: Applied Economics, pages 111–134.

Dustmann, C. and Schönberg, U. (2012). Expansions in maternity leave coverage and children’s long-term outcomes. American Economic Journal: Applied Economics, 4(3):190–224.

Fiorini, M. and Keane, M. P. (2014). How the allocation of children’s time affects cognitive and noncognitive development. Journal of Labor Economics, 32(4):787–836.

Heckman, J., Pinto, R., and Savelyev, P. (2013). Understanding the mechanisms through which an influential early childhood program boosted adult outcomes. The American Economic Review, 103(6):2052–2086.

Keane, M. P. (2010). A structural perspective on the experimentalist school. The Journal of Economic Perspectives, 24(2):47–58.

Keane, M. P. and Wolpin, K. I. (1997). The career decisions of young men. Journal of Political Economy, 105(3):473–522.

McCrary, J. (2008). Manipulation of the running variable in the regression discontinuity design: A density test. Journal of Econometrics, 142(2):698–714.

McLeod, J. D. and Kaiser, K. (2004). Childhood emotional and behavioral problems and educational attainment. American Sociological Review, 69(5):636–658.

Smith, P. K. (2003). The psychology of grandparenthood: An international perspective. Routledge.

Timmer, S. G., Eccles, J., and O’Brien, K. (1985). How children use time. Time, Goods, and Well-Being, pages 353–382.

Todd, P. E. and Wolpin, K. I. (2003). On the specification and estimation of the production function for cognitive achievement. The Economic Journal, 113(485):F3–F33.

Todd, P. E. and Wolpin, K. I. (2007). The production of cognitive achievement in children: Home, school, and racial test score gaps. Journal of Human Capital, 1(1):91–136.

Woodcock, R. W. and Johnson, M. B. (1989). Woodcock-Johnson Tests of Achievement. DLM Teaching Resources.

Wooldridge, J. M. (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press.

Table 1: Summary of Ages

                   Age Range         Average Age
CDS I: 1997        0-12 years old    6 years and 9 months
CDS II: 2002       5-17 years old    11 years and 9 months
CDS III: 2007      10-22 years old   16 years and 9 months
Younger Children   5-11 years old    8 years and 4 months
Older Children     12-17 years old   14 years and 6 months

Table 2: Weekly Time in Each Activity (in Hours), Younger Children

Younger Children
                                   Mean    SD      Proportion of Zero
Active time with mother            13.04   9.77    0.09
Passive time with mother           23.21   12.09   0.04
Active time with father            1.93    4.28    0.69
Passive time with father           2.66    5.58    0.57
Active time with grandparents      1.25    4.12    0.85
Passive time with grandparents     1.90    5.98    0.79
Active time with siblings          2.48    4.77    0.62
Passive time with siblings         2.95    5.62    0.52
Active time with friends           2.44    5.45    0.70
Passive time with friends          1.13    3.18    0.71
Active time with others            2.41    4.98    0.69
Passive time with others           3.60    7.61    0.39
Self active time                   30.45   11.52   0.08
Self passive time                  7.72    4.62    0.00
Sleeping or napping                70.49   7.67    0.00
Refused to answer or do not know   0.28    1.88    0.96

Note: The third column shows the proportion of children who spend zero minutes in a week on the corresponding time category.

Table 3: Weekly Time in Each Activity (in Hours), Older Children

Older Children
                                   Mean    SD      Proportion of Zero
Active time with mother            7.69    8.21    0.23
Passive time with mother           22.13   14.45   0.07
Active time with father            1.22    3.80    0.81
Passive time with father           2.56    5.62    0.62
Active time with grandparents      0.46    2.16    0.92
Passive time with grandparents     1.38    5.69    0.86
Active time with siblings          1.60    4.07    0.75
Passive time with siblings         3.49    6.74    0.55
Active time with friends           4.42    7.51    0.56
Passive time with friends          5.19    7.77    0.37
Active time with others            1.91    5.18    0.81
Passive time with others           2.58    7.89    0.58
Self active time                   34.95   13.67   0.06
Self passive time                  10.98   8.11    0.00
Sleeping or napping                64.55   10.01   0.00
Refused to answer or do not know   2.90    6.65    0.67

Note: The third column shows the proportion of children who spend zero minutes in a week on the corresponding time category.

Table 4: Demographics and Parental Background

                                       Younger Children    Older Children
                                       Mean      SD        Mean      SD
Child’s age (months)                   100.50    24.33     174.10    20.57
Child’s gender                         0.51      0.50      0.50      0.50
Birth order to mother                  1.92      1.09      1.95      1.06
Born in US                             0.98      0.14      0.98      0.14

Mother’s age                           35.20     6.37      41.36     6.01
Father’s age                           38.02     6.95      44.27     6.65
Mother’s age at child birth            27.53     6.02      27.91     5.71
Father’s age at child birth            30.54     6.45      31.02     6.31
Mother has only high school degree     0.29      0.45      0.31      0.46
Mother has college degree              0.20      0.40      0.21      0.41
Father has only high school degree     0.17      0.37      0.15      0.36
Father has college degree              0.27      0.45      0.27      0.45

Number of siblings child lives with    1.67      1.86      2.21      2.66
Lives with two biological parents      0.60      0.49      0.55      0.50
Lives with grandparent                 0.08      0.27      0.07      0.26
Household annual income (in $1,000s)   107.95    128.00    117.71    129.68

Table 5: Exogeneity Test Results: Older Children

                           Math              Vocabulary        Comprehension     Non-cognitive
Controls                   F-stat   p-Value  F-stat   p-Value  F-stat   p-Value  F-stat   p-Value
(1) Lagged Score           2.920    0.000    2.519    0.001    2.923    0.000    1.900    0.019
(2) Child Chrs.            1.312    0.187    1.262    0.219    2.103    0.008    1.315    0.185
(3) Mother Demog. Chrs.    1.256    0.223    1.246    0.230    1.857    0.023    1.312    0.186
(4) Family Demog. Chrs.    1.071    0.379    1.160    0.297    1.694    0.046    1.341    0.169
(5) Family Environ. Chrs.  1.065    0.385    1.078    0.373    1.557    0.079    1.267    0.215
(6) School Chrs.           0.951    0.506    1.075    0.375    1.331    0.175    1.225    0.245
(7) School Experience      0.879    0.588    1.020    0.431    1.376    0.151    1.232    0.240

Note: Entries in bold are “surviving specifications” for which we cannot reject exogeneity at 10% of significance. Each specification contains different control variables: (1) no controls, except for the lagged corresponding input; (2) child characteristics; (3) mother demographic characteristics; (4) family demographic characteristics; (5) family environmental characteristics; (6) school characteristics; (7) child’s school experience. See footnote 31 for a full description of the control variables. All standard errors are corrected for heteroskedasticity.
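The selection rule described in the note can be applied mechanically to the p-values reported in Table 5. The short Python sketch below does this. The helper name `surviving_specifications` is introduced here only for illustration, and the sketch assumes that a specification survives when its p-value strictly exceeds 0.10.

```python
# p-values from Table 5 (older children), specifications (1) to (7).
p_values = {
    "Math": [0.000, 0.187, 0.223, 0.379, 0.385, 0.506, 0.588],
    "Vocabulary": [0.001, 0.219, 0.230, 0.297, 0.373, 0.375, 0.431],
    "Comprehension": [0.000, 0.008, 0.023, 0.046, 0.079, 0.175, 0.151],
    "Non-cognitive": [0.019, 0.185, 0.186, 0.169, 0.215, 0.245, 0.240],
}


def surviving_specifications(pvals, level=0.10):
    """Return the 1-based specification numbers for which exogeneity is not rejected."""
    return [k for k, p in enumerate(pvals, start=1) if p > level]


survivors = {skill: surviving_specifications(p) for skill, p in p_values.items()}
for skill, specs in survivors.items():
    print(skill, specs)
```

Under this rule, specifications (2) through (7) survive for math, vocabulary, and non-cognitive skills, while only specifications (6) and (7) survive for comprehension.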

Table 6: Effects of Children’s Time Allocation: Older Children

                                 Math      Vocabulary  Comprehension  Non-cognitive
Active time with mother          0.008**   0.002       0.001          0.005
                                 (0.003)   (0.003)     (0.003)        (0.004)
Passive time with mother         0.006**   0.002       -0.001         0.007**
                                 (0.002)   (0.002)     (0.003)        (0.003)
Active time with father          0.016**   0.007       -0.000         0.008
                                 (0.006)   (0.006)     (0.007)        (0.007)
Passive time with father         -0.001    -0.001      0.007          0.009*
                                 (0.004)   (0.004)     (0.004)        (0.005)
Active time with grandparents    0.024**   0.025**     0.029**        0.019
                                 (0.012)   (0.012)     (0.013)        (0.018)
Passive time with grandparents   -0.003    -0.003      -0.002         -0.000
                                 (0.004)   (0.005)     (0.005)        (0.006)
Active time with siblings        -0.002    -0.006      -0.011*        0.010
                                 (0.005)   (0.005)     (0.007)        (0.007)
Passive time with siblings       0.006*    0.003       0.000          0.005
                                 (0.004)   (0.003)     (0.004)        (0.005)
Active time with friends         0.008**   0.005       -0.002         0.002
                                 (0.003)   (0.003)     (0.004)        (0.005)
Passive time with friends        0.004     -0.004      -0.001         0.002
                                 (0.003)   (0.003)     (0.003)        (0.004)
Self active time                 0.007**   -0.001      0.002          0.007**
                                 (0.002)   (0.002)     (0.002)        (0.003)
Self passive time                0.005*    0.002       -0.001         0.008**
                                 (0.003)   (0.002)     (0.003)        (0.003)
Active time with others          0.002     -0.007      -0.005         0.003
                                 (0.006)   (0.005)     (0.006)        (0.007)
Passive time with others         0.003     -0.004      -0.005         0.001
                                 (0.003)   (0.003)     (0.003)        (0.006)
Don’t know or refuse to answer   0.003     0.001       -0.000         -0.000
                                 (0.003)   (0.003)     (0.004)        (0.004)

R-Squared                        0.604     0.608       0.568          0.400
Observations                     1453      1455        1453           1454
Exogeneity test F-statistic      0.879     1.020       1.376          1.232
Exogeneity test p-value          0.588     0.431       0.151          0.240

Note: All estimates are for specification (7). See footnote 31 for a full description of the control variables. Standard errors corrected for heteroskedasticity are in parentheses. * Significant at the 10% level. ** Significant at the 5% level.

Table 7: Exogeneity Test Results: Younger Children


                           Math              Vocabulary        Comprehension     Non-cognitive
Controls                   F-stat   p-Value  F-stat   p-Value  F-stat   p-Value  F-stat   p-Value
(1) No controls            11.065   0.000    11.327   0.000    8.001    0.000    2.774    0.000
(2) Child Chrs.            2.020    0.011    2.761    0.000    2.497    0.001    2.427    0.002
(3) Mother Demog. Chrs.    0.979    0.475    1.675    0.049    1.338    0.171    1.644    0.056
(4) Family Demog. Chrs.    0.794    0.685    1.094    0.356    1.034    0.416    1.434    0.122
(5) Family Environ. Chrs.  0.719    0.768    1.126    0.326    1.025    0.426    1.396    0.140
(6) School Chrs.           0.685    0.802    1.104    0.346    0.958    0.498    1.389    0.143
(7) School Experience      0.657    0.828    1.116    0.335    0.891    0.574    1.333    0.173

Note: Entries in bold are “surviving specifications” for which we cannot reject exogeneity at 10% of significance. Each specification contains different control variables: (1) no controls; (2) child characteristics; (3) mother demographic characteristics; (4) family demographic characteristics; (5) family environmental characteristics; (6) school characteristics; (7) child’s school experience. See footnote 31 for a full description of the control variables. All standard errors are corrected for heteroskedasticity.

Table 8: Effects of Children’s Time Allocation: Younger Children


                                 Math      Vocabulary  Comprehension  Non-cognitive
Active time with mother          0.001     0.001       0.004          -0.001
                                 (0.002)   (0.002)     (0.002)        (0.004)
Passive time with mother         0.001     0.003*      0.004*         -0.004
                                 (0.002)   (0.001)     (0.002)        (0.003)
Active time with father          0.001     0.000       0.004          0.008
                                 (0.003)   (0.003)     (0.005)        (0.007)
Passive time with father         -0.003    -0.005**    -0.004         -0.003
                                 (0.003)   (0.003)     (0.004)        (0.006)
Active time with grandparents    0.002     0.001       0.008          -0.004
                                 (0.003)   (0.003)     (0.005)        (0.008)
Passive time with grandparents   -0.004    0.002       0.003          -0.006
                                 (0.003)   (0.002)     (0.004)        (0.005)
Active time with siblings        -0.001    -0.002      -0.003         -0.007
                                 (0.003)   (0.003)     (0.004)        (0.006)
Passive time with siblings       0.002     0.003       0.007**        -0.007
                                 (0.002)   (0.002)     (0.003)        (0.005)
Active time with friends         -0.000    -0.001      -0.003         -0.018**
                                 (0.003)   (0.003)     (0.003)        (0.008)
Passive time with friends        0.000     0.000       0.004          -0.007
                                 (0.003)   (0.003)     (0.005)        (0.009)
Self active time                 0.003     0.001       0.005**        -0.009**
                                 (0.002)   (0.002)     (0.003)        (0.004)
Self passive time                0.002     0.002       0.003          -0.003
                                 (0.002)   (0.002)     (0.003)        (0.005)
Active time with others          0.000     0.004       0.007          -0.014**
                                 (0.003)   (0.003)     (0.004)        (0.006)
Passive time with others         -0.001    -0.002      -0.001         -0.005
                                 (0.002)   (0.002)     (0.003)        (0.004)
Don’t know or refuse to answer   0.007     0.003       0.010          -0.018
                                 (0.009)   (0.006)     (0.009)        (0.020)

R-Squared                        0.805     0.807       0.673          0.131
Observations                     2443      2449        2085           2548
Exogeneity test F-statistic      0.657     1.116       0.891          1.333
Exogeneity test p-value          0.828     0.335       0.574          0.173

Note: All estimates are for specification (7). See footnote 31 for a full description of the control variables. Standard errors corrected for heteroskedasticity are in parentheses. * Significant at the 10% level. ** Significant at the 5% level.

Table 9: Exogeneity Test Results: Older Children, B-spline


                           Math              Vocabulary        Comprehension     Non-cognitive
Controls                   F-stat   p-Value  F-stat   p-Value  F-stat   p-Value  F-stat   p-Value
(1) Lagged Score           1.959    0.015    2.214    0.005    2.217    0.005    1.832    0.026
(2) Child Chrs.            1.109    0.342    1.366    0.156    1.641    0.057    1.400    0.139
(3) Mother Demog. Chrs.    1.100    0.351    1.438    0.121    1.753    0.036    1.383    0.147
(4) Family Demog. Chrs.    1.049    0.401    1.402    0.138    1.698    0.045    1.383    0.147
(5) Family Environ. Chrs.  0.853    0.618    1.322    0.180    1.434    0.123    1.255    0.224
(6) School Chrs.           0.827    0.648    1.305    0.191    1.339    0.170    1.239    0.235
(7) School Experience      0.805    0.673    1.270    0.214    1.320    0.182    1.228    0.243

Note: All specifications in this table are in the form of a linear B-Spline with 2 knots placed at 33rd and 67th percentiles of each time input, whenever possible. Entries in bold are “surviving specifications” for which we cannot reject exogeneity at 10% of significance in the linear model. Each specification contains different control variables: (1) no controls, except for the lagged corresponding input; (2) child characteristics; (3) mother demographic characteristics; (4) family demographic characteristics; (5) family environmental characteristics; (6) school characteristics; (7) child’s school experience. See footnote 31 for a full description of the control variables. All standard errors are corrected for heteroskedasticity.
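A piecewise-linear spline of the kind described in the note can be sketched as follows. This is an illustration under stated assumptions, not the code used for our estimates: the helper `linear_spline_basis` is hypothetical, the data are simulated, and the basis is parameterized so that each regression coefficient is the slope within one interval, which is one parameterization consistent with interval-specific coefficients. Knots that coincide with the minimum or maximum of the input are dropped, which is one way to interpret "whenever possible" for inputs with heavy bunching at zero.

```python
import numpy as np


def linear_spline_basis(x, percentiles=(33, 67)):
    """Piecewise-linear basis whose coefficients are interval-specific slopes.

    Knots are placed at the given percentiles of x; knots equal to the
    minimum or maximum of x are dropped, so heavily bunched inputs may end
    up with fewer than three intervals.
    """
    x = np.asarray(x, dtype=float)
    knots = np.unique(np.percentile(x, percentiles))
    knots = knots[(knots > x.min()) & (knots < x.max())]
    edges = np.concatenate([[x.min()], knots, [np.inf]])
    basis = np.column_stack(
        [np.clip(x, lo, hi) - lo for lo, hi in zip(edges[:-1], edges[1:])]
    )
    return basis, knots


rng = np.random.default_rng(1)

# Simulated input with modest bunching at zero: two interior knots, three intervals.
x = np.maximum(rng.normal(10.0, 8.0, size=1000), 0.0)
basis, knots = linear_spline_basis(x)

# Simulated input with about 85% zeros: both percentiles are zero, so no knots remain.
x_sparse = np.where(rng.random(1000) < 0.85, 0.0, rng.exponential(5.0, size=1000))
sparse_basis, sparse_knots = linear_spline_basis(x_sparse)

print(basis.shape, sparse_basis.shape)
```

With this construction the basis columns sum to the original input whenever its minimum is zero, so a model with equal slopes in every interval reduces to the linear specification.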

Table 10: B-spline Estimation Results: Older Children

Math Vocabulary Comprehension Non-cognitive

Active time with mother (0,5.8) 0.024 0.013 0.016 -0.007

(0.016) (0.015) (0.016) (0.018)

Active time with mother (5.8,15) -0.009 -0.004 -0.005 -0.001

(0.008) (0.007) (0.008) (0.008)

Active time with mother (15,.) 0.020** 0.006 0.003 0.014**

(0.005) (0.005) (0.006) (0.006)

Passive time with mother (0,17.4) 0.016** 0.008 0.010* 0.007

(0.006) (0.005) (0.006) (0.007)

Passive time with mother (17.41,28.7) 0.000 0.001 0.005 0.011*

(0.006) (0.005) (0.006) (0.007)

Passive time with mother (28.7,.) 0.004 0.000 -0.010** 0.004

(0.003) (0.003) (0.004) (0.004)

Active time with father (0,.) 0.014** 0.007 0.000 0.007

(0.006) (0.006) (0.007) (0.007)

Passive time with father (0,1.2) 0.131 0.263* 0.119 -0.075

(0.122) (0.146) (0.161) (0.143)

Passive time with father (1.2,.) -0.001 -0.003 0.008 0.010*

(0.005) (0.004) (0.005) (0.005)

Active time with grandparents (0,.) 0.026** 0.025** 0.031** 0.018

(0.012) (0.012) (0.012) (0.018)

Passive time with grandparents (0,.) -0.003 -0.002 -0.001 -0.000

(0.004) (0.005) (0.005) (0.006)

Active time with siblings (0,.) -0.001 -0.005 -0.010 0.010

(0.006) (0.005) (0.007) (0.007)

Passive time with siblings (0,1.7) -0.064 -0.098 -0.047 -0.100

(0.084) (0.075) (0.080) (0.104)

Passive time with siblings (1.7,.) 0.008** 0.005 0.002 0.006

(0.004) (0.004) (0.004) (0.005)

Active time with friends (0,.) 0.009** 0.005 0.000 0.003

(0.003) (0.003) (0.004) (0.005)

Passive time with friends (0,0.8) 0.104 0.305 -0.006 -0.061

(0.376) (0.442) (0.331) (0.269)

Passive time with friends (0.8,.) 0.004 -0.004 -0.000 0.001

(0.003) (0.003) (0.003) (0.004)

Self active time (0,31.5) 0.003 -0.001 0.004 0.009

(0.004) (0.004) (0.005) (0.006)

Self active time (31.5,36.3) 0.019 -0.005 -0.012 0.013

(0.014) (0.013) (0.015) (0.017)

Self active time (36.3,.) 0.008** 0.001 0.004 0.005

(0.003) (0.003) (0.003) (0.004)

Self passive time (0,5.1) -0.013 0.002 -0.013 -0.016

(0.028) (0.027) (0.031) (0.035)

Self passive time (5.1,9) 0.007 0.015 0.004 0.029

(0.015) (0.014) (0.016) (0.018)

Self passive time (9,.) 0.005* 0.000 0.001 0.006

(0.003) (0.003) (0.003) (0.004)

Active time with others (0,.) 0.002 -0.007 -0.004 0.003

(0.006) (0.005) (0.006) (0.007)

Passive time with others (0,2.5) -0.059 0.014 -0.062 -0.025

(0.039) (0.036) (0.043) (0.049)

Passive time with others (2.5,.) 0.006* -0.004 -0.002 0.001

(0.003) (0.004) (0.004) (0.007)

Don’t know or refuse to answer 0.004 0.001 0.001 -0.001

(0.003) (0.003) (0.004) (0.004)

R-squared 0.609 0.612 0.575 0.404

Observations 1,453 1,455 1,453 1,454

Exogeneity test F-statistic 0.805 1.270 1.320 1.228

Exogeneity test p-value 0.673 0.214 0.182 0.243

Note: All estimates are for specification (7). See footnote 31 for a full description of the control variables. In the first column, the parentheses shown after each time input indicate the time intervals. For example, (0,2.5) means between 0 hours and 2.5 hours per week. Depending on the distribution, some time inputs have fewer than three time intervals because the time input was not complex enough to accommodate two knots. Standard errors corrected for heteroskedasticity are in parentheses. * Significant at the 10% level. ** Significant at the 5% level.

Table 11: Exogeneity Test Results: Younger Children, B-spline


                           Math              Vocabulary        Comprehension     Non-cognitive
Controls                   F-stat   p-Value  F-stat   p-Value  F-stat   p-Value  F-stat   p-Value
(1) No Controls            6.918    0.000    8.080    0.000    6.055    0.000    2.142    0.006
(2) Child Chrs.            1.502    0.096    1.358    0.159    1.059    0.390    1.610    0.063
(3) Mother Demog. Chrs.    0.981    0.472    1.052    0.398    0.764    0.719    1.459    0.112
(4) Family Demog. Chrs.    0.815    0.661    0.841    0.632    0.675    0.811    1.615    0.062
(5) Family Environ. Chrs.  0.661    0.825    0.785    0.696    0.554    0.911    1.474    0.106
(6) School Chrs.           0.625    0.857    0.766    0.717    0.519    0.932    1.474    0.106
(7) School Experience      0.642    0.842    0.837    0.637    0.570    0.900    1.460    0.112

Note: All specifications in this table are in the form of a linear B-Spline with 2 knots placed at 33rd and 67th percentiles of each time input, whenever possible. Entries in bold are “surviving specifications” for which we cannot reject exogeneity at 10% of significance in the linear model. Each specification contains different control variables: (1) no controls; (2) child characteristics; (3) mother demographic characteristics; (4) family demographic characteristics; (5) family environmental characteristics; (6) school characteristics; (7) child’s school experience. See footnote 31 for a full description of the control variables. All standard errors are corrected for heteroskedasticity.

Table 12: B-spline Estimation Results: Younger Children

Math Vocabulary Comprehension Non-cognitive

Active time with mother (0,5.8) -0.015 -0.001 0.010 -0.029

(0.011) (0.011) (0.015) (0.023)

Active time with mother (5.8,15) 0.002 0.002 0.005 0.019**

(0.004) (0.004) (0.006) (0.008)

Active time with mother (15,.) 0.002 0.000 0.002 -0.009**

(0.002) (0.002) (0.003) (0.004)

Passive time with mother (0,17.4) 0.005 0.000 0.001 0.002

(0.004) (0.004) (0.005) (0.008)

Passive time with mother (17.41,28.7) 0.001 0.003 0.003 -0.009

(0.003) (0.003) (0.004) (0.006)

Passive time with mother (28.7,.) 0.000 0.004* 0.006* -0.000

(0.002) (0.002) (0.004) (0.005)

Active time with father (0,.) 0.001 0.000 0.005 0.008

(0.003) (0.003) (0.005) (0.007)

Passive time with father (0,1.2) -0.020 0.046 0.039 -0.013

(0.066) (0.068) (0.093) (0.121)

Passive time with father (1.2,.) -0.002 -0.006** -0.004 -0.003

(0.003) (0.003) (0.004) (0.006)

Active time with grandparents (0,.) 0.002 0.002 0.008 -0.004

(0.004) (0.003) (0.005) (0.008)

Passive time with grandparents (0,.) -0.003 0.002 0.002 -0.005

(0.003) (0.002) (0.004) (0.005)

Active time with siblings (0,.) -0.001 -0.003 -0.002 -0.007

(0.003) (0.003) (0.004) (0.006)

Passive time with siblings (0,1.7) -0.017 -0.034 -0.051 -0.130*

(0.036) (0.036) (0.050) (0.075)

Passive time with siblings (1.7,.) 0.003 0.004 0.009** -0.004

(0.003) (0.002) (0.003) (0.005)

Active time with friends (0,.) 0.001 -0.001 -0.002 -0.017**

(0.003) (0.003) (0.003) (0.008)

Passive time with friends (0,0.8) 0.003 -0.070 -0.107 0.496**

(0.125) (0.115) (0.193) (0.234)

Passive time with friends (0.8,.) 0.001 0.000 0.004 -0.011

(0.004) (0.003) (0.005) (0.010)

Self active time (0,31.5) 0.006** 0.006** 0.009** -0.007

(0.003) (0.002) (0.003) (0.005)

Self active time (31.5,36.3) -0.004 -0.008 -0.000 -0.022*

(0.007) (0.007) (0.009) (0.013)

Self active time (36.3,.) -0.000 -0.003 0.001 -0.004

(0.005) (0.004) (0.007) (0.009)

Self passive time (0,5.1) 0.035** 0.018 0.061** 0.023

(0.013) (0.012) (0.020) (0.025)

Self passive time (5.1,9) -0.001 0.006 0.005 -0.004

(0.008) (0.008) (0.011) (0.015)

Self passive time (9,.) 0.000 -0.002 -0.003 -0.007

(0.003) (0.003) (0.004) (0.007)

Active time with others (0,.) -0.000 0.004 0.006 -0.014**

(0.003) (0.003) (0.004) (0.006)

Passive time with others (0,2.5) -0.009 -0.001 -0.002 -0.003

(0.018) (0.018) (0.025) (0.034)

Passive time with others (2.5,.) -0.001 -0.001 0.000 -0.004

(0.002) (0.002) (0.003) (0.004)

Don’t know or refuse to answer 0.007 0.002 0.009 -0.017

(0.009) (0.006) (0.009) (0.020)

R-squared 0.807 0.808 0.676 0.138

Observations 2,443 2,449 2,085 2,548

Exogeneity test F-statistic 0.642 0.837 0.570 1.460

Exogeneity test p-value 0.842 0.637 0.900 0.112

Note: All estimates are for specification (7). See footnote 31 for a full description of the control variables. In the first column, the parentheses shown after each time input indicate the time intervals. For example, (0,2.5) means between 0 hours and 2.5 hours per week. Depending on the distribution, some time inputs have fewer than three time intervals because the time input was not complex enough to accommodate two knots. Standard errors corrected for heteroskedasticity are in parentheses. * Significant at the 10% level. ** Significant at the 5% level.

Table 13: P-Values for Comparing Surviving and Non-surviving Specifications: Older Children

                            Older Cohort
Controls                    Math    Vocabulary  Comprehension  Non-cognitive
(1) Lagged Score            0.000   0.000       0.000          0.975
(2) Child Chrs.             0.040   0.463       0.006          0.974
(3) Mother Demog. Chrs.     0.185   0.540       0.072          0.984
(4) Family Demog. Chrs.     0.585   0.491       0.303          0.999
(5) Family Environ. Chrs.   0.932   0.958       0.675          1.000
(6) School Chrs.            0.958   0.998       0.957          0.998

Note: This table shows the p-values of a joint test for whether the 15 coefficients of Inputi for each specification are the same as the corresponding ones from specification (7) in Table 5. Entries in bold are “surviving specifications” with respect to the exogeneity test, i.e., those for which we cannot reject exogeneity at 10% of significance. Each specification contains different control variables: (1) no controls, except for the lagged corresponding input; (2) child characteristics; (3) mother demographic characteristics; (4) family demographic characteristics; (5) family environmental characteristics; (6) school characteristics. See footnote 31 for a full description of the control variables. All standard errors are corrected for heteroskedasticity.

Table 14: P-Values for Comparing Surviving and Non-surviving Specifications: Younger Children

Younger Cohort
Controls                     Math    Vocabulary  Comprehension  Non-cognitive
(1) No controls              0.000   0.000       0.000          0.004
(2) Child Chrs.              0.000   0.000       0.001          0.011
(3) Mother Demog. Chrs.      0.364   0.151       0.350          0.531
(4) Family Demog. Chrs.      0.570   0.555       0.533          0.904
(5) Family Environ. Chrs.    0.374   0.431       0.441          0.942
(6) School Chrs.             0.830   0.682       0.828          1.000

Note: This table shows the p-values of a joint test for whether the 15 coefficients of Input_i for each specification are the same as the corresponding ones from specification (7) in Table 7. Entries in bold are “surviving specifications” with respect to the exogeneity test, i.e., those for which we cannot reject exogeneity at the 10% significance level. Each specification contains different control variables: (1) no controls; (2) child characteristics; (3) mother demographic characteristics; (4) family demographic characteristics; (5) family environmental characteristics; (6) school characteristics. See footnote 31 for a full description of the control variables. All standard errors are corrected for heteroskedasticity.


Table 15: P-Values for Comparing Surviving and Non-surviving Specifications: Older Children, B-spline

Older Cohort
Controls                     Math    Vocabulary  Comprehension  Non-cognitive
(1) Lagged Score             0.000   0.000       0.000          0.990
(2) Child Chrs.              0.182   0.857       0.148          0.998
(3) Mother Demog. Chrs.      0.465   0.844       0.428          0.996
(4) Family Demog. Chrs.      0.734   0.869       0.591          0.999
(5) Family Environ. Chrs.    0.996   0.998       0.956          1.000
(6) School Chrs.             1.000   1.000       0.999          1.000

Note: This table shows the p-values of a test for whether the 26 coefficient estimates of Input_i for each specification are statistically the same as the corresponding ones from specification (7) in Table 9. Entries in bold are “surviving specifications” for which we cannot reject exogeneity at the 10% significance level. Each specification contains different control variables: (1) no controls, except for the lagged corresponding input; (2) child characteristics; (3) mother demographic characteristics; (4) family demographic characteristics; (5) family environmental characteristics; (6) school characteristics. See footnote 31 for a full description of the control variables. All standard errors are corrected for heteroskedasticity.

Table 16: P-Values for Comparing Surviving and Non-surviving Specifications: Younger Children, B-spline

Younger Cohort
Controls                     Math    Vocabulary  Comprehension  Non-cognitive
(1) No Controls              0.000   0.000       0.000          0.007
(2) Child Chrs.              0.002   0.000       0.005          0.035
(3) Mother Demog. Chrs.      0.331   0.120       0.378          0.898
(4) Family Demog. Chrs.      0.660   0.583       0.610          0.979
(5) Family Environ. Chrs.    0.760   0.813       0.694          0.999
(6) School Chrs.             0.943   0.872       0.899          1.000

Note: This table shows the p-values of a test for whether the 27 coefficient estimates of Input_i for each specification are statistically the same as the corresponding ones from specification (7) in Table 11. Entries in bold are “surviving specifications” for which we cannot reject exogeneity at the 10% significance level. Each specification contains different control variables: (1) no controls; (2) child characteristics; (3) mother demographic characteristics; (4) family demographic characteristics; (5) family environmental characteristics; (6) school characteristics. See footnote 31 for a full description of the control variables. All standard errors are corrected for heteroskedasticity.


Table 17: Do Coefficients Change as Controls Are Added? (P-Values): Older Children

Alternative Specifications                   Math    Vocabulary  Comprehension  Non-cognitive
(1’) (7) + more controls                     0.855   0.969       0.992          0.966
(2’) (7) + lagged inputs                     0.997   0.996       0.990          0.979
(3’) (7) + lagged skills                     0.798   0.452       0.791          0.826
(4’) (7) + measurement error controls        0.534   0.660       0.597          0.995
(5’) (7), school time as a separate input    0.998   0.999       0.958          1.000
(6’) (7) + prop. educational activities      0.984   0.960       0.723          0.671
(7’) (7) + prop. general care                0.973   0.723       0.988          0.995
(8’) (7) + prop. watching TV                 0.506   0.827       0.822          0.999
(9’) (7) + prop. at home                     0.781   0.920       0.907          0.498
(10’) (7) + prop. participation time         0.863   0.629       0.873          0.874
(11’) (7) + prop. weekend time               0.965   0.822       0.699          0.689
(12’) (7), change age of cohorts             0.953   0.571       0.462          0.243

Note: This table shows the p-values of a test for whether the 15 coefficient estimates of Input_i for each alternative specification are statistically the same as the corresponding ones from specification (7) in Table 5. Alternative specifications: (1’) the full list of added controls can be seen in footnote 42; (2’) lagged time inputs of all 15 activities; (3’) lagged skill measures of other types and interactions of any two skills; (4’) the full list of added controls can be seen in footnote 45; (5’) 16 time inputs (15 original time inputs plus school activities), where the p-value refers to a test of whether the 15 coefficients of the original time inputs are statistically unchanged with respect to specification (7); (6’) proportions of each time input spent in educational activities (e.g., reading): 7 additional covariates; (7’) proportions of each time input spent in general care (e.g., having meals): 7 additional covariates; (8’) proportions of each time input spent watching TV: 7 additional covariates; (9’) proportions of each time input spent at home: 14 additional covariates; (10’) proportions of each time input in which the partner actually participates in the activity: 12 additional covariates; (11’) proportions of each time input spent during weekends: 15 additional covariates; (12’) the age range of the older cohort is changed to 13-17 years old.


Table 18: Do Coefficients Change as Controls Are Added? (P-Values): Younger Children

Alternative Specifications                   Math    Vocabulary  Comprehension  Non-cognitive
(1’) (7) + more controls                     0.902   0.717       0.389          0.898
(2’) (7) + lagged inputs                     0.991   0.874       0.981          0.942
(4’) (7) + measurement error controls        0.836   0.877       0.504          0.989
(5’) (7), school time as a separate input    1.000   1.000       1.000          0.988
(6’) (7) + prop. educational activities      0.918   0.934       0.922          0.842
(7’) (7) + prop. general care                0.997   0.968       0.982          1.000
(8’) (7) + prop. watching TV                 0.999   0.999       0.999          1.000
(9’) (7) + prop. at home                     0.663   0.097       0.518          0.727
(10’) (7) + prop. participation time         0.955   0.991       0.882          0.919
(11’) (7) + prop. weekend time               0.976   0.874       0.906          0.942
(12’) (7), change age of cohorts             0.954   0.554       0.492          0.833

Note: This table shows the p-values of a test for whether the 15 coefficient estimates of Input_i for each alternative specification are statistically the same as the corresponding ones from specification (7) in Table 7. Alternative specifications: (1’) the full list of added controls can be seen in footnote 42; (2’) lagged time inputs of all 15 activities; (3’) lagged skill measures of other types and interactions of any two skills, which is not implemented for this cohort because of lack of data; (4’) the full list of added controls can be seen in footnote 45; (5’) 16 time inputs (15 original time inputs plus school activities), where the p-value refers to a test of whether the 15 coefficients of the original time inputs are statistically unchanged with respect to specification (7); (6’) proportions of each time input spent in educational activities (e.g., reading): 7 additional covariates; (7’) proportions of each time input spent in general care (e.g., having meals): 7 additional covariates; (8’) proportions of each time input spent watching TV: 7 additional covariates; (9’) proportions of each time input spent at home: 14 additional covariates; (10’) proportions of each time input in which the partner actually participates in the activity: 12 additional covariates; (11’) proportions of each time input spent during weekends: 15 additional covariates; (12’) the age range of the younger cohort is changed to 5-12 years old.


Table 19: Do Coefficients Change as Controls Are Added? (P-Values): Older Children, B-spline

Alternative Specifications                   Math    Vocabulary  Comprehension  Non-cognitive
(1’) (7) + more controls                     0.940   0.997       0.991          0.999
(2’) (7) + lagged inputs                     1.000   1.000       1.000          1.000
(3’) (7) + lagged skills                     0.946   0.871       0.830          0.994
(4’) (7) + measurement error controls        0.887   0.869       0.978          0.999
(5’) (7), school time as a separate input    0.999   1.000       0.995          1.000
(6’) (7) + prop. educational activities      0.999   0.990       0.937          0.989
(7’) (7) + prop. general care                1.000   0.969       1.000          1.000
(8’) (7) + prop. watching TV                 0.923   0.970       0.990          1.000
(9’) (7) + prop. at home                     0.915   0.999       1.000          0.861
(10’) (7) + prop. participation time         0.996   0.909       0.998          0.997
(11’) (7) + prop. weekend time               1.000   0.997       0.974          0.847
(12’) (7), change age of cohorts             0.908   0.224       0.795          0.313

Note: This table shows the p-values of a test for whether the coefficient estimates of Input_i for each alternative specification are statistically the same as the corresponding ones from specification (7) in Table 9. All specifications in this table take the form of a linear B-spline with 2 knots placed at the 33rd and 67th percentiles of each time input, whenever possible. Alternative specifications: (1’) the full list of added controls can be seen in footnote 42; (2’) lagged time inputs of all 15 activities; (3’) lagged skill measures of other types and interactions of any two skills; (4’) the full list of added controls can be seen in footnote 45; (5’) 30 time inputs (27 original time inputs plus 3 school inputs), where the p-value refers to a test of whether the 27 coefficients of the original time inputs are statistically unchanged with respect to specification (7); (6’) proportions of each time input spent in educational activities (e.g., reading): 7 additional covariates; (7’) proportions of each time input spent in general care (e.g., having meals): 7 additional covariates; (8’) proportions of each time input spent watching TV: 7 additional covariates; (9’) proportions of each time input spent at home: 14 additional covariates; (10’) proportions of each time input in which the partner actually participates in the activity: 12 additional covariates; (11’) proportions of each time input spent during weekends: 15 additional covariates; (12’) the age range of the older cohort is changed to 13-17 years old.


Table 20: Do Coefficients Change as Controls Are Added? (P-Values): Younger Children, B-spline

Alternative Specifications                   Math    Vocabulary  Comprehension  Non-cognitive
(1’) (7) + more controls                     0.951   0.979       0.821          0.973
(2’) (7) + lagged inputs                     1.000   0.993       1.000          0.998
(4’) (7) + measurement error controls        0.993   0.987       0.908          1.000
(5’) (7), school time as a separate input    0.999   0.998       1.000          0.999
(6’) (7) + prop. educational activities      1.000   1.000       0.999          0.997
(7’) (7) + prop. general care                1.000   1.000       1.000          1.000
(8’) (7) + prop. watching TV                 1.000   1.000       1.000          1.000
(9’) (7) + prop. at home                     0.860   0.269       0.896          0.984
(10’) (7) + prop. participation time         1.000   1.000       0.999          0.998
(11’) (7) + prop. weekend time               1.000   0.999       0.999          1.000
(12’) (7), change age of cohorts             0.659   0.690       0.384          0.573

Note: This table shows the p-values of a test for whether the coefficient estimates of Input_i for each alternative specification are statistically the same as the corresponding ones from specification (7) in Table 11. All specifications in this table take the form of a linear B-spline with 2 knots placed at the 33rd and 67th percentiles of each time input, whenever possible. Alternative specifications: (1’) the full list of added controls can be seen in footnote 42; (2’) lagged time inputs of all 15 activities; (3’) lagged skill measures of other types and interactions of any two skills, which is not implemented for this cohort because of lack of data; (4’) the full list of added controls can be seen in footnote 45; (5’) 30 time inputs (27 original time inputs plus 3 school inputs), where the p-value refers to a test of whether the 27 coefficients of the original time inputs are statistically unchanged with respect to specification (7); (6’) proportions of each time input spent in educational activities (e.g., reading): 7 additional covariates; (7’) proportions of each time input spent in general care (e.g., having meals): 7 additional covariates; (8’) proportions of each time input spent watching TV: 7 additional covariates; (9’) proportions of each time input spent at home: 14 additional covariates; (10’) proportions of each time input in which the partner actually participates in the activity: 12 additional covariates; (11’) proportions of each time input spent during weekends: 15 additional covariates; (12’) the age range of the younger cohort is changed to 5-12 years old.


Figure 1: Intuition for the Test of Exogeneity

[Figure: two panels. Each has Input_i^j on the horizontal axis and marks E[Skill_i | Input_i^j = 0] and E[Skill_i | Input_i^j = x]; in panel (b) both expectations are also conditional on covariates.]
(a) Correlation between Time Input and Child Skill, Unconditional
(b) Correlation between Time Input and Child Skill, Conditional on Covariates

Figure 2: Why are Unobservables Discontinuous at Input_i^j = 0?

[Figure: Mother’s Type_i plotted against Input_i^j, with E[Mother’s Type_i | Input_i^j = 0] marked.]


Figure 3: Types of Confounders

[Figure: two panels. Each plots Confounder_i against Input_i^j and Input_i^{j*}, with E[Confounder_i | Input_i^j = 0] marked.]
(a) Type (a1) vs. Type (a2)
(b) Type (b)

Note: Input_i^{j*} represents the optimal choice of input j by individual i. Red range: support of the confounder among all observations in the sample. Blue range: support of the confounder among all observations in the sample for which Input_i^j = 0. The confounder is of type (a1) if some of its correlation with Skill_i happens for values of the confounder in the blue range; otherwise it is of type (a2).


Figure 4: Evidence of Bunching

[Figure: four panels. Vertical axis in each panel: cumulative density, from 0 to 1.]
(a) Self Active Time, Older Cohort. Horizontal axis: self active time, 0 to 90.
(b) Passive Time with Friends, Older Cohort. Horizontal axis: inactive time with friends, 0 to 60.
(c) Active Time with Mother, Older Cohort. Horizontal axis: active time with mother, 0 to 60.
(d) Active Time with Father, Younger Cohort. Horizontal axis: active time with father, 0 to 45.

Note: Each plot shows the cumulative distribution function of the time spent in the corresponding activity for the corresponding cohort. The fact that these plots do not cross the vertical axis at the origin is direct evidence of bunching, as it implies that the probability density function is discontinuously larger at zero. Time on the horizontal axis is shown in hours per week but is measured continuously (in minutes per week).
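The bunching check described in this note amounts to asking how much probability mass sits exactly at zero. A minimal sketch, using made-up data and hypothetical helper names (`share_at_zero`, `empirical_cdf`) that do not come from the paper:

```python
def empirical_cdf(minutes_per_week, x):
    """Share of observations with a time input less than or equal to x."""
    if not minutes_per_week:
        raise ValueError("need at least one observation")
    return sum(1 for m in minutes_per_week if m <= x) / len(minutes_per_week)


def share_at_zero(minutes_per_week):
    """Share of observations reporting exactly zero minutes.

    Because time inputs cannot be negative, this equals the value at which
    the empirical CDF meets the vertical axis; a value above zero indicates
    bunching at the non-negativity constraint.
    """
    return empirical_cdf(minutes_per_week, 0)


if __name__ == "__main__":
    # Illustrative (invented) sample of weekly minutes in one activity.
    sample = [0, 0, 0, 30, 60, 120, 240, 600]
    print(share_at_zero(sample))
    print(empirical_cdf(sample, 60))
```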


Figure 5: Evidence of Power to Detect Endogeneity from Omitted Variables

[Figure: four panels. Vertical axis in each panel: conditional mean of the confounder. Horizontal axis in each panel: a time input, 0 to 30 hours per week.]
(a) Mother’s Level of Education (Years), Younger Children. Horizontal axis: active time with mother (hours per week).
(b) Number of Books Child Has, Younger Children. Horizontal axis: active time with father (hours per week).
(c) Hours Mother Works Per Week, Older Children. Horizontal axis: don’t know or refuse to answer (hours per week).
(d) Household Income ($1,000s Per Year), Older Children. Horizontal axis: passive time with father (hours per week).

Note: In each plot, the vertical axis shows the mean of a potential confounder conditional on a given level of time input (i.e., the horizontal axis variable). The scatter plot represents the observed conditional mean of the confounder (aggregated to the next hour of the time input). At zero time input, we show the 95% confidence interval. The solid curve represents a third-order local polynomial regression of the confounder on the time input, using time input data at the minute-per-week level. The shaded region represents the 95% confidence interval for this regression with an out-of-sample prediction at zero minutes. See footnote 23 for more details on the regression and confidence interval.
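The scatter points and the interval at zero described in this note can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the confidence interval uses a normal approximation with z = 1.96, and the local polynomial regression from the note is deliberately left out.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def conditional_means_by_hour(minutes, confounder):
    """Mean of the confounder within each hour bin of the time input.

    Zero minutes forms its own bin (0). Positive inputs are rounded up to
    the next whole hour, so 1-60 minutes map to 1, 61-120 to 2, and so on.
    """
    bins = defaultdict(list)
    for m, c in zip(minutes, confounder):
        bins[math.ceil(m / 60)].append(c)
    return {hour: mean(values) for hour, values in sorted(bins.items())}


def mean_ci_at_zero(minutes, confounder, z=1.96):
    """Approximate 95% confidence interval for the confounder mean at zero input."""
    at_zero = [c for m, c in zip(minutes, confounder) if m == 0]
    if len(at_zero) < 2:
        raise ValueError("need at least two observations at zero")
    centre = mean(at_zero)
    half_width = z * stdev(at_zero) / math.sqrt(len(at_zero))
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Invented data: weekly minutes and mother's years of education.
    minutes = [0, 0, 0, 0, 30, 60, 90, 120]
    education = [10.0, 12.0, 10.0, 12.0, 13.0, 15.0, 14.0, 16.0]
    print(conditional_means_by_hour(minutes, education))
    print(mean_ci_at_zero(minutes, education))
```

Comparing the interval at zero with the means in the first few positive bins gives a rough, informal view of whether the confounder jumps at zero.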


Figure 6: Evidence of Power to Detect Endogeneity from Simultaneity

[Figure: four panels. Vertical axis in each panel: conditional mean.]
(a) Child’s Birth Weight (Pounds), Older Children
(b) Child is White, Older Children
(c) Child’s Age (Months), Younger Children. Horizontal axis: active time with friends (hours per week).
(d) Child’s Height (Inches), Younger Children. Horizontal axis: active time with friends (hours per week).


Figure 7: Evidence of Power to Detect Endogeneity from Measurement Error

[Figure: two panels. Vertical axis in each panel: conditional mean.]
(a) Weekend Diary was Completed Without Help, Younger Children
(b) Child Completed Weekday Diary (With or Without Help), Older Children. Horizontal axis: passive time with others (hours per week).

Figure 8: Evidence of Power to Detect Endogeneity from Over-Aggregation of Inputs

[Figure: two panels. Vertical axis in each panel: conditional mean.]
(a) Proportion of Active Time with Father Spent at Home, Younger Children
(b) Proportion of Passive Time with Friends Watching TV, Older Children. Horizontal axis: active time with friends.

Figure 9: Evidence of Power to Detect Endogeneity from Non-Linear Effects

[Figure: two panels.]
(a) Active Time with Grandparents (1st Moment), Younger Children. Vertical axis: conditional mean.
(b) Active Time with Grandparents (Distribution), Younger Children. Vertical axis: cumulative probability. Horizontal axis: active time with grandparents (hours per week). Legend: when active time with mother is 0, 1, 2, and 3.

Note: In the plots on the right, we show the cumulative distribution function of the confounder for selected values of the time input (in hours), for the confounder and time input shown in the corresponding plot on the left. In the plots on the left, the vertical axis shows the mean of a potential confounder conditional on a given level of time input (i.e., the horizontal axis variable). The scatter plot represents the observed conditional mean of the confounder (aggregated to the next hour of the time input). At zero time input, we show the 95% confidence interval. The solid curve represents a third-order local polynomial regression of the confounder on the time input, using time input data at the minute-per-week level. The shaded region represents the 95% confidence interval for this regression with an out-of-sample prediction at zero minutes. See footnote 23 for more details on the regression and confidence interval.


Appendix

Table 21: Non-cognitive Skills Loading Factors

Item                                                Younger Cohort  Older Cohort
Cheats or tells lies                                0.4988          0.5272
Bullies or mean to others                           0.5519          0.5513
Feels no sorry after misbehaving                    0.4300          0.4729
Breaks things on purpose                            0.4874          0.4803
Has sudden changes in mood                          0.5532          0.5647
Feels no love                                       0.4679          0.5351
Too fearful or anxious                              0.4455          0.4776
Feels worthless or inferior                         0.4894          0.5751
Sad or depressed                                    0.5354          0.6158
Cries too much                                      0.4139          0.3418
Easily confused                                     0.5152          0.5314
Has obsessions                                      0.4735          0.5828
Rather high strung, tense and nervous               0.5248          0.5353
Argues too much                                     0.5822          0.5771
Disobedient                                         0.5439          0.5872
Stubborn, sullen, or irritable                      0.6125          0.6310
Has a very strong temper                            0.6117          0.6362
Has difficulty concentrating                        0.5895          0.5874
Impulsive, or acts without thinking                 0.5933          0.6395
Restless or overly active                           0.5370          0.5324
Has trouble getting along with other children       0.5942          0.6106
Not liked by other children                         0.4502          0.4747
Withdrawn, does not get involved with others        0.4028          0.4533
Clings to adults                                    0.3216          0.2812
Demands a lot of attention                          0.5431          0.5203
Too dependent on others                             0.4666          0.4593
Thinks before acting, not impulsive                 0.4883          0.5418
Generally well behaved, does what adults request    0.5719          0.5788
Can get over being upset quickly                    0.4433          0.4623
Waits turn in games and other activities            0.5142          0.4801
Gets along well with other children                 0.6195          0.6168
Admired by other children                           0.5779          0.5424
Cheerful, happy                                     0.4721          0.5283
Tries things for himself/herself                    0.3619          0.4122
Does neat, careful work                             0.4185          0.4201
Curious and exploring, likes new experiences        0.1932          0.2397

Note: The larger the factor loading, the larger the conditional correlation between the variable and the factor (i.e., the measure of non-cognitive skills).


Table 22: Exogeneity Test Results: Older Children, Not Value-Added

                                  Math              Vocabulary        Comprehension     Non-cognitive
Controls                          F-stat   p-value  F-stat   p-value  F-stat   p-value  F-stat   p-value
(1) No Controls                   3.406    0.000    2.468    0.001    2.009    0.012    1.722    0.041
(2) Child Chrs.                   1.214    0.254    1.719    0.042    1.681    0.049    1.281    0.206
(3) Mother Demographic Chrs.      1.077    0.373    1.644    0.056    1.466    0.110    1.208    0.258
(4) Family Demographic Chrs.      0.919    0.543    1.490    0.101    1.385    0.146    1.326    0.178
(5) Family Environmental Chrs.    0.943    0.515    1.333    0.174    1.283    0.205    1.220    0.249
(6) Other Environmental Chrs.     0.834    0.640    1.351    0.164    1.205    0.261    1.175    0.284
(7) School Experience             0.838    0.635    1.349    0.165    1.300    0.194    1.169    0.289

Note: Entries in bold are “surviving specifications” for which we cannot reject exogeneity at the 10% significance level. Each specification contains different control variables: (1) no controls; (2) child characteristics; (3) mother demographic characteristics; (4) family demographic characteristics; (5) family environmental characteristics; (6) school characteristics; (7) child’s school experience. See footnote 31 for a full description of the control variables. All standard errors are corrected for heteroskedasticity.
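To give a rough sense of how an F-statistic of this kind could be computed, the sketch below assumes that the test adds an indicator for each time input being exactly zero and checks whether those indicators are jointly significant. That reading is an assumption made for illustration, the function names are hypothetical, control variables are omitted, and the classical homoskedastic F-statistic is used in place of the heteroskedasticity-robust version reported in the table.

```python
def ols_ssr(columns, y):
    """Sum of squared residuals from OLS of y on the given regressor columns."""
    k, n = len(columns), len(y)
    # Augmented normal equations [X'X | X'y].
    a = [
        [sum(columns[r][i] * columns[c][i] for i in range(n)) for c in range(k)]
        + [sum(columns[r][i] * y[i] for i in range(n))]
        for r in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    beta = [a[r][k] / a[r][r] for r in range(k)]
    fitted = [sum(b * columns[j][i] for j, b in enumerate(beta)) for i in range(n)]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def bunching_f_statistic(inputs, y):
    """Classical F-statistic for joint significance of the 1{input = 0} dummies.

    inputs: list of time-input columns (one list of values per input).
    The restricted model has a constant and the inputs; the unrestricted
    model adds one zero-input dummy per input (an illustrative assumption).
    """
    n = len(y)
    restricted = [[1.0] * n] + [list(col) for col in inputs]
    dummies = [[1.0 if v == 0 else 0.0 for v in col] for col in inputs]
    unrestricted = restricted + dummies
    ssr_r, ssr_u = ols_ssr(restricted, y), ols_ssr(unrestricted, y)
    dof = n - len(unrestricted)
    return ((ssr_r - ssr_u) / len(dummies)) / (ssr_u / dof)


if __name__ == "__main__":
    # Invented data: one time input with bunching at zero and alternating noise.
    x = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    noise = [0.1 if i % 2 == 0 else -0.1 for i in range(len(x))]
    smooth = [1.0 + 0.5 * xi + e for xi, e in zip(x, noise)]
    jump = [yi + (2.0 if xi == 0 else 0.0) for xi, yi in zip(x, smooth)]
    print(bunching_f_statistic([x], smooth))  # small: no discontinuity at zero
    print(bunching_f_statistic([x], jump))    # large: outcome jumps at zero
```

In this toy example a large statistic arises only when the outcome is discontinuous at zero input, which is the pattern the test is meant to flag.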


Table 23: Effects of Children’s Time Allocation: Older Children, Not Value-Added

                                  Math      Vocabulary  Comprehension  Non-cognitive
Active time with mother           0.009**   0.004       0.003          0.001
                                  (0.004)   (0.004)     (0.004)        (0.004)
Passive time with mother          0.005*    0.002       0.000          0.004
                                  (0.003)   (0.003)     (0.003)        (0.003)
Active time with father           0.018**   0.013       0.010          0.009
                                  (0.007)   (0.008)     (0.009)        (0.008)
Passive time with father          0.002     0.001       0.010**        0.005
                                  (0.005)   (0.005)     (0.005)        (0.006)
Active time with grandparents     0.022*    0.028*      0.036**        0.018
                                  (0.012)   (0.016)     (0.013)        (0.019)
Passive time with grandparents    -0.005    -0.004      -0.002         -0.009
                                  (0.005)   (0.006)     (0.006)        (0.008)
Active time with siblings         0.003     -0.003      -0.004         0.011
                                  (0.007)   (0.007)     (0.007)        (0.008)
Passive time with siblings        0.008*    0.004       0.004          0.000
                                  (0.004)   (0.004)     (0.004)        (0.006)
Active time with friends          0.010**   0.008*      0.002          -0.002
                                  (0.004)   (0.004)     (0.004)        (0.005)
Passive time with friends         0.003     -0.002      0.001          -0.004
                                  (0.004)   (0.004)     (0.004)        (0.005)
Self Active time                  0.009**   0.001       0.004          0.004
                                  (0.003)   (0.003)     (0.003)        (0.004)
Self Passive time                 0.004     0.002       0.000          0.003
                                  (0.003)   (0.003)     (0.003)        (0.004)
Active time with others           0.011     -0.003      0.002          0.005
                                  (0.007)   (0.006)     (0.008)        (0.009)
Passive time with others          0.002     -0.005      -0.006         0.003
                                  (0.004)   (0.005)     (0.004)        (0.006)
Don’t know or refuse to answer    0.005     0.003       0.003          -0.001
                                  (0.004)   (0.004)     (0.004)        (0.006)

R-squared                         0.478     0.422       0.436          0.134
Observations                      1,453     1,455       1,453          1,454
Exogeneity test F-statistic       0.838     1.349       1.300          1.169
Exogeneity test p-value           0.635     0.165       0.194          0.289

Note: Standard errors corrected for heteroskedasticity are in parentheses. All estimates are for specification (7). * Significant

at the 10% level. ** Significant at the 5% level. See footnote 31 for a full description of the control variables.


Figure 10: Further Evidence of Power of Test (1 of 2)

[Figure: six panels. Vertical axis in each panel: conditional mean.]
(a) Mother’s Age, Younger Children. Horizontal axis: active time with father (hours per week).
(b) Mother’s Age at Child Birth, Younger Children
(c) Mother is Married, Older Children
(d) Child Lives with Biological Parents, Older Children
(e) Child Has Musical Instrument at Home, Older Children
(f) Caregiver Spent on School Supplies Last Year, Older Children

Notes: See the note for Figure 5 in the text.


Figure 11: Further Evidence of Power of Test (2 of 2)

[Figure: six panels. Vertical axis in each panel: conditional mean.]
(a) Father’s Level of Education (Years), Younger Children
(b) Neighborhood is Safe at Night (Rating 1-5), Older Children
(c) Child Completed Weekend Diary (With or Without Help), Older Children
(d) Primary Caregiver Completed Weekday Diary, Older Children
(e) Proportion of Active Time with Friends Engaging in Arts and Crafts, Older Children. Horizontal axis: passive time with friends.
(f) Proportion of Passive Time with Mother Watching TV, Younger Children. Horizontal axis: passive time with father.

Notes: See the note for Figure 5 in the text.

