The Family Gap phenomenon: does having children impact on parents' labour market outcomes?

By Amber Dale
Applied Economic Analysis

1. Introduction and Background

In recent decades the workplace has seen many dramatic changes. The introduction of laws against discrimination against women, ethnic minorities and those with disabilities has been key in shaping the labour market into what it is today. The ONS recently reported that the gender pay gap was at an all-time low in the UK. However, a new phenomenon is appearing in the labour market, nicknamed the family gap: the idea that there are significant differences in labour supply between those who have families and those who do not. Is it time that some new policies were introduced?

Much of the empirical analysis surrounding the family gap phenomenon focuses on women. Women were traditionally expected to give up employment when they had children, and it can be argued that even today women with children are less inclined to go out to work. This could be because the time a woman spends at work is time foregone with her children, and because the expense of childcare effectively reduces her wage. An important study by Salamaliki (2013) shows a negative association between a woman's labour supply and the number of children she has, providing evidence for an incompatibility hypothesis, under which women's employment and fertility affect each other reciprocally. O'Neill (2003) found similar evidence of this relationship, and goes on to suggest that the persistent gender wage gap is at least partly attributable to the gender gap in employment that opens up after men and women become parents.

Despite a significant amount of research concluding that women with children are less likely to be employed than their counterparts without children, more recent studies, such as that of Rabenmutter (2011), show a positive relationship between female employment and fertility.
Figure 1 shows a graph, taken from that paper, of cross-country data on female labour supply and fertility. This positive relationship can be explained simply by the fact that children are expensive, costing on average £10,400 per year to raise [1], so women have to work more to maintain the family income.

Figure 1: Total fertility rate plotted against female employment
Data source: Rabenmutter (2011), p. 1

[1] The Guardian. (2012). How much does it cost to raise a child?

Whilst both these theories make economic and intuitive sense, in the modern era the same principles can be applied to men, who might also want to work less in order to spend time with their children. Thus this paper does not focus solely on the effects that children have on women, but analyses the effects that children have on both male and female labour market activity, using data recently collected in the Understanding Society survey.

Labour market activity does not only encompass whether an individual is in employment or not; it is an umbrella term that covers the number of hours an individual works, their wage, their level of employment and more. Whilst the majority of empirical analysis focuses on whether the individual is economically active, a few studies have used alternative dependent variables when establishing whether children affect labour market activity. These include a study by Gjerberg (2003), which focused specifically on female doctors in Norway. Gjerberg found that among female doctors, the probability of becoming a specialist decreased with an increasing number of children. Similarly, Windsor (2006) found that gender and dependent children significantly affected the management advancement of female accountants, especially mothers. This motivated the final part of the paper: after establishing whether having children affects whether the individual is economically active, I analysed the impact that children have on their parents' positions within the workplace.

2. Model

The first model I used was ordinary least squares (OLS), which finds the linear function that most closely fits the data. Whilst this is the simplest model that can be applied, it has certain deficiencies which made it less useful for my analysis. The most significant problem is that OLS can generate predicted values outside the range 0-1, which is problematic as my dependent variable is binary. More appropriate models are therefore the logit and probit models, which are estimated by maximum likelihood and constrain the predicted probabilities to lie between 0 and 1. Whilst I ran both regressions, I focused the sensitivity analysis on the logit model, as it is easier to interpret: its coefficients are changes in the log odds.

The first equation represents the initial relationship I tested, between the explanatory variables and current economic activity. The second equation uses the same explanatory variables, but the dependent variable has been changed to job class.

$$\text{EconActive} = \beta_0 + \beta_1\,\text{female} + \beta_2\,\text{married} + \beta_3\,\text{education} + \beta_4\,\text{children} + \beta_5\,\text{volunteers} + \beta_6\,\text{age} + \beta_7\,\text{lnsavings} + \beta_8\,\text{urban} + u$$

$$\text{JobClass} = \beta_0 + \beta_1\,\text{female} + \beta_2\,\text{married} + \beta_3\,\text{education} + \beta_4\,\text{children} + \beta_5\,\text{volunteers} + \beta_6\,\text{age} + \beta_7\,\text{lnsavings} + \beta_8\,\text{urban} + u$$

Here, u represents the error term, assumed to be normally distributed. The dependent and explanatory variables are explained in section 3.

3. Data

The data used in this study are a single cross-section (wave 2) from the Understanding Society survey, which builds on an earlier survey called the British Household Panel Survey. The survey attempts to capture information about people's social and economic circumstances, attitudes, behaviours and health [2].

Dependent variables

The original dependent variable that I used is the current economic activity variable (b_jbstat). This has been recoded so that 1 = economically active and 0 = economically inactive. Whilst most of the response options were obvious to categorise, the maternity leave option was more ambiguous. To decide whether to count this option as economically active, I created two new variables. The first isolates the respondents who gave the maternity leave response on the current economic activity variable. The second is a recoding of the in paid employment variable (b_employ), with 1 being in paid employment and 0 being not. I compared the responses on these two new variables, and the results are given in appendix 1. As 290 of the 342 women on maternity leave (85%) consider themselves to be in paid employment, I have included this option in the economically active category. After running my regressions with this dependent variable, I used Current job: Eight Class (JobClass) as my dependent variable.

Explanatory variables

The first explanatory variable I used is gender (female). This is a fairly obvious variable to include, but whilst many other researchers chose to restrict their sample to women, as mentioned earlier I feel that in a modern society it is important to consider the effects that children have on both parents' economic activity. Another essential variable included in the model is age; this has been restricted to responses at or below 65, in order to limit the sample to those of working age.

The variable marital status (married) has been used in many other empirical investigations.
Whilst it might traditionally be assumed that the sign on this coefficient would be negative, due to the financial stability that marriage can provide, Iacovou (2001) argues that the sign associated with the marriage variable is uncertain. This is due to possible disincentive effects brought about by the benefits system: the support available to lone parents is arguably higher, and the withdrawal of benefits from those who begin to earn brings about a high effective marginal tax rate.

[2] Understanding Society. (2015). About the Study.

Much empirical analysis includes either the individual's highest qualification or their school leaving age. Including both could lead to multicollinearity, as the two have a Pearson correlation coefficient of 0.56. To decide which to include, I looked at the frequencies associated with the variables and found that school leaving age had 47,676 missing values, compared with only 1,332 for highest qualification. Highest qualification was therefore included in my model. The highest qualification variable (education) splits the sample between those who have only compulsory education and those who have post-compulsory education. When these data were collected, compulsory schooling ended at 16, and thus the option A-levels or higher has been included in the post-compulsory education category. This is an important variable to include, as there is much evidence supporting the relationship between labour market activity and level of education.

I have included frequency of volunteering (volunteers) in my model, as the more hours an individual spends volunteering, the less time they are likely to have available to work. The cutoff point between volunteers and does not volunteer is once a fortnight, as this is a substantial amount of time which could interfere with paid employment.

As the dependent variable is current economic activity, wage is not an appropriate independent variable to use. To capture information on the household's wealth, many studies have therefore incorporated the spouse's income or education into their analysis, for example Gjerberg (2003) and Xia (2012). However, these variables were unavailable in the dataset, so to obtain some evidence of household income I decided to use the annual income from savings and investments variable (lnsavings).
However, this variable in its original form is heavily skewed. As seen from the table of summary statistics (figure 3), the mean income from savings is 336.49, while the maximum is much higher at 180,000. To combat this issue, the natural log of the savings variable was included instead.

Missing variables

Aside from the variables listed above and in figure 2 (variable descriptions), there are two other variables that are commonly included in models that estimate the relationship between children and their parents' labour market outcomes. These are age at first pregnancy and ethnicity. Whilst age at first pregnancy is simply unavailable in this dataset, the variable ethnic group is included. Other surveys, such as ones conducted in America, have large proportions of respondents of different ethnicities, so these can be separated; for example, Xia (2012) creates dummies for White, Black and Hispanic. However, in the Understanding Society dataset there are only 3,892 responses to ethnic group, and 2,722 of these are British (refer to appendix 2 for a full breakdown of the frequencies). Due to this small sample size and lack of variation, ethnic group has been omitted from my analysis.
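The two data-preparation checks described in this section can be illustrated with a short sketch. It uses only figures quoted in the paper (290 of 342 maternity-leave respondents in paid employment, and a maximum savings income of 180,000). The helper name log_savings and its treatment of non-positive values are assumptions, since the paper does not say how zero savings were handled before logs were taken.

```python
import math

# Maternity-leave respondents by paid-employment response (appendix 1).
employed_on_leave = 290
not_employed_on_leave = 52
total_on_leave = employed_on_leave + not_employed_on_leave
share_employed = employed_on_leave / total_on_leave


def log_savings(savings):
    """Natural log of annual income from savings and investments.

    Hypothetical helper: returns None for non-positive values, because
    ln(0) is undefined and the paper does not state how zeros were treated.
    """
    if savings <= 0:
        return None
    return math.log(savings)


print(f"Share on maternity leave in paid employment: {share_employed:.1%}")
print(f"ln(180000) = {log_savings(180000):.1f}")
```

The share comes out just under 85%, and the log of the maximum savings value is about 12.1, which agrees with the maximum of Lnsavings reported in figure 3.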

Figure 2: Variable Descriptions

Old variable name   New variable   Expected sign   Description; coding
b_jbstat            EconActive                     Whether individual is economically active; dummy (1=economically active, 0=inactive)
b_sex               Female         -               Gender of individual; dummy (1=female, 0=male)
b_marstat           Married        -/+             Marital status; dummy (1=married, 0=not married)
b_hiqual_dv         Education      +               Post-compulsory education; dummy (1=obtained post-compulsory education, 0=has not)
b_nchild_dv         Children       -/+             Number of own children; no coding
b_volfreq           Volunteers     -               Frequency of volunteering; dummy (1=frequently volunteers, 0=does not)
b_dvage             Age            -/+             Age of individual; no coding, restricted to less than or equal to 65
b_fiyrinvinc_dv     Lnsavings      -               Natural log of individual's annual income from savings and investments; the natural log of b_fiyrinvinc_dv
b_urban_dv          Urban          +               Resides in urban/rural area; dummy (1=urban, 0=rural)
b_jbnssec8_dv       JobClass                       Whether individual has high/low position; dummy (1=higher position, 0=lower position)

Figure 3: Summary Statistics of Variables

Variable      Mean      Std. Dev.   Minimum   Maximum   Skew
EconActive    .615      .486        0         1         -.474
Female        .541      .498        0         1         -.166
Married       .522      .499        0         1         -.090
Education     .522      .499        0         1         -.102
Children      .51       .932        0         9         1.958
Volunteers    .558      .496        0         1         -.237
Age           40.525    14.057      16        65        -.040
*Savings      336.487   3042.056    0         180000    29.524
Lnsavings     4.761     2.170       0         12.1      -.037
Urban         .759      .427        0         1         -1.212
JobClass      .751      .432        0         1         -1.165

*Savings is not included in the model

4. Empirical Analysis

The results from all initial regressions can be found in figure 4. Firstly, an OLS regression was run which included the independent variables listed in figure 2, with EconActive as the dependent variable. In this regression, the sign associated with each coefficient is as expected, except for the Children variable, which is positive. Most of the variables are significant at the 1% level; Urban is significant only at the 10% level, and Married and Children appear insignificant. As mentioned earlier, OLS might not be the most appropriate estimation method for my binary dependent variable, so the joint test of these two apparently insignificant variables was carried out after running a more appropriate regression.

When logit and probit regressions were run using exactly the same variables, the signs of the coefficients were unchanged, except that the coefficient on Children became negative and moved closer to significance. This negative relationship is much more consistent with previous literature, which suggests that the logit model is more appropriate for this analysis. However, the variables Married and Children still appear individually insignificant, so a joint test was performed to establish whether they are jointly significant. Below is the formulation of the likelihood ratio test; the output containing the data can be found in appendix 3.

Test $H_0: \beta_2 = 0,\ \beta_4 = 0$ (the coefficients on Married and Children) against $H_1$: at least one of $\beta_2, \beta_4 \neq 0$.

$$LR = 2(\log L_U - \log L_R) = (-2\log L_R) - (-2\log L_U)$$

$$LR = 2371.775 - 2368.078 = 3.697$$

Comparing the LR statistic with the chi-squared critical values for df=2, the value is smaller than even the 10% critical value (4.605). The null hypothesis therefore cannot be rejected: the variables Married and Children have neither individual nor joint significance. Despite this, they will continue to be included in the model due to their importance as variables. Number of children is the variable around which my analysis is centred, and so cannot be removed from the model.

The remaining analysis used the logit rather than the probit model. The two models produce fairly consistent results, but the logit model is easier to interpret because it provides odds ratios. The results of the analysis so far show that those who are married and have post-compulsory education are more likely to be economically active, whilst the probability of being economically active is decreasing in savings, volunteering frequency, children and age. In addition, those who are female and those living in an urban area are less likely to be economically active than males and those living in a rural area.

The second test I performed was the Hosmer-Lemeshow test, the results of which can be found in appendix 4. The null hypothesis is that there is no specification error and the alternative hypothesis is that there is a specification error.
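The likelihood ratio test reported earlier in this section can be reproduced from the two -2 log-likelihood values in appendix 3. The sketch below is a hand check rather than the original estimation procedure. It relies on the fact that a chi-squared distribution with 2 degrees of freedom has upper-tail probability exp(-x/2), so no statistics library is needed. The function and variable names are illustrative.

```python
import math

# -2 log-likelihood values as reported in appendix 3.
NEG2LL_UNRESTRICTED = 2368.078  # model including Married and Children
NEG2LL_RESTRICTED = 2371.775    # model excluding Married and Children


def lr_statistic(neg2ll_restricted, neg2ll_unrestricted):
    """LR = 2(logL_U - logL_R), written in terms of the -2logL values."""
    return neg2ll_restricted - neg2ll_unrestricted


def chi2_df2_p_value(x):
    """Upper-tail probability of a chi-squared variable with df = 2."""
    return math.exp(-x / 2)


def chi2_df2_critical(alpha):
    """Critical value of the chi-squared distribution with df = 2."""
    return -2 * math.log(alpha)


lr = lr_statistic(NEG2LL_RESTRICTED, NEG2LL_UNRESTRICTED)
print(f"LR statistic: {lr:.3f}")
print(f"p-value (df=2): {chi2_df2_p_value(lr):.3f}")
for alpha in (0.10, 0.05, 0.01):
    print(f"critical value at {alpha:.0%}: {chi2_df2_critical(alpha):.3f}")
```

Under these assumptions the p-value is roughly 0.16, above the 10% level, which agrees with the conclusion that the null hypothesis cannot be rejected.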
As my model produced a large test statistic and a very low p-value, I conclude that a specification error is likely. However, this could be due to the large sample size, as even small divergences of the model from the data would be flagged as significant. It could also be due to the exclusion of relevant variables, such as ethnicity, age at first pregnancy or partner's income, as mentioned earlier.

Another suitable goodness of fit measure is the expectation-prediction evaluation. This shows that the model correctly predicts 79.7% of observations, a gain on the 73.7% achieved by a constant-only model that predicts every individual to be economically active. This suggests that the variables that have been included are valid, as they increase the likelihood of a correct prediction, but that some key variables are perhaps missing.

Figure 4: Regression Outputs 1

              Regression 1   Regression 2   Regression 3   Regression 4
              EconActive     EconActive     EconActive     JobClass
              OLS            Logit          Probit         Logit
Constant      1.427          7.347          3.581          -.559
              (.045)         (.452)         (.204)         (.406)
Female        -.095***       -.687***       -.376***       .492*
              (.016)         (.105)         (.060)         (.145)
Married       .023           .037           .064           .111
              (.019)         (.121)         (.070)         (.171)
Education     .117***        .582***        .364***        1.591***
              (.020)         (.120)         (.070)         (.164)
Children      .014           -.120          -.009          .063
              (.010)         (.076)         (.040)         (.087)
Volunteers    -.083***       -.509***       -.300***       .065
              (.016)         (.103)         (.059)         (.144)
Age           -.012***       -.097***       -.046***       -.006
              (.001)         (.007)         (.003)         (.007)
Lnsavings     -.027***       -.170***       -.094***       .107**
              (.004)         (.025)         (.014)         (.036)
Urban         -.029*         -.204*         -.108*         .701***
              (.017)         (.107)         (.062)         (.147)

R-squared: 0.20  0.307  0.14

Note: standard errors are in parentheses. Significance levels are given as follows: *=significant at 10%, **=significant at 5%, ***=significant at 1%

Satisfied with the results of these diagnostic checks, as well as the predictions of the logit and probit models, I then changed the dependent variable from EconActive to JobClass. This is to establish whether children have a similar negative effect on their parents' position in the workplace as they do on the probability that their parent will be economically active. Referring to regression 4 of figure 4, it can be noted that many of the signs associated with the coefficients have changed. Now volunteering, living in an urban area, savings, children and being female all increase the likelihood of a higher position. Thus, it would appear that whilst children negatively affect the probability of a parent working, those parents who do continue to work hold higher positions. This is a plausible claim, as it could be argued that those in lower positions are more inclined to leave employment after having a child, whereas those who have worked to reach more senior positions might be more reluctant to.

Having established the impact that children have in general, I then wanted to establish the differing effects that children have on each parent. Thus, in appendix 5, the samples for both dependent variables have been split by gender. Whilst splitting the sample means that many of the variables have lower significance or have lost it altogether, it does allow general patterns between the genders to be shown. It would appear that for females, having children decreases the odds of being economically active and of having a more senior position. These findings are consistent with the results of other empirical studies, for example Gjerberg's finding that female doctors who have children are less likely to be specialised than their childless counterparts.
The opposite is true for males: having children increases the likelihood of both being economically active and having a more senior position. This is summarised in figure 5 below.

Figure 5: Impact children have on males and females respectively (odds)

          EconActive   JobClass
Male      2.512        1.080
Female    .626         .993

5. Conclusion

This paper's aim was to discover any trends in the labour market related to children and to test the hypothesis of the family gap. The model itself contained many of the important contributing factors associated with employment and job class, as shown by the relatively high R-squared values along with the results of the expectation-prediction evaluation reported alongside the Hosmer-Lemeshow test. However, the model was far from perfect, and could have been improved by increasing the data range and including a few key variables that I did not have access to.

The results have led me to conclude that the family gap is a very real phenomenon, but that it affects men and women differently, corroborating the findings of previous studies. Whilst fathers appear to be more motivated towards employment and towards reaching higher levels of employment, mothers are much less likely to be employed than women who do not have children. What must next be established is whether this is through choice. Is it a mother's decision to stay at home with her children, or is it reduced employability due to the label of having children?

References

Journals:

Gjerberg, E. (2003). Women doctors in Norway: the challenging balance between career and family life. Social Science & Medicine. 57 (7), 1327-1341.
Gutiérrez-Domènech, M. (2005). Employment after motherhood: a European comparison. Labour Economics. 12 (1), 99-123.
Iacovou, M. (2001). Fertility and Female Labour Supply. ISER Working Papers. 01-19 (1).
O'Neill, J. (2003). The Gender Gap in Wages, circa 2000. The American Economic Review. 93 (2), 309-314.
Rabenmutter, A. (2011). The Effect of Culture on Fertility, Female Labour Supply, the Gender Wage Gap and Childcare. CESifo Working Paper. 3337 (3).
Salamaliki, P. (2013). The causal relationship between female labor supply and fertility in the USA. Journal of Population Economics. 26, 109-145.
Windsor, C. (2006). The effect of gender and dependent children on professional accountants' career progression. Critical Perspectives on Accounting. 17 (6), 828-844.
Xia, L. (2012). How Children Affect Women's Labor Market Outcomes: Estimates from Using Miscarriage as a Natural Experiment. Economics Bulletin. 34 (3), 2908-2920.

Websites:

The Guardian. (2012). How much does it cost to raise a child? Available: http://www.theguardian.com/news/datablog/2012/jan/26/cost-raising-children. Last accessed 24/03/2015.
Understanding Society. (2015). About the Study. Available: https://www.understandingsociety.ac.uk/about. Last accessed 24/03/2015.

Appendix 1: Responses of those on maternity leave to whether employed

Count            Maternity = 1.00
Employed  .00    52
Employed 1.00    290

Appendix 2: Frequencies of the ethnic group variable

ethnic group                                     Frequency   Percent   Valid Percent   Cumulative Percent
Valid
british/english/scottish/welsh/northern irish    2722        5.0       69.9            69.9
irish                                            73          .1        1.9             71.8
gypsy or irish traveller                         2           .0        .1              71.9
any other white background                       119         .2        3.1             74.9
white and black caribbean                        31          .1        .8              75.7
white and black african                          16          .0        .4              76.1
white and asian                                  20          .0        .5              76.6
any other mixed background                       18          .0        .5              77.1
indian                                           195         .4        5.0             82.1
pakistani                                        184         .3        4.7             86.8
bangladeshi                                      165         .3        4.2             91.1
chinese                                          21          .0        .5              91.6
any other asian background                       67          .1        1.7             93.3
caribbean                                        66          .1        1.7             95.0
african                                          118         .2        3.0             98.1
any other black background                       8           .0        .2              98.3
arab                                             15          .0        .4              98.7
any other ethnic group                           52          .1        1.3             100.0
Total                                            3892        7.1       100.0
Missing
missing                                          3           .0
inapplicable                                     46811       85.7
proxy                                            3882        7.1
refused                                          3           .0
don't know                                       6           .0
Total                                            50705       92.9
Total                                            54597       100.0

Appendix 3: Outputs from likelihood ratio test

Unrestricted model: Model Summary

Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      2368.078 (a)        .210                   .307

a. Estimation terminated at iteration number 6 because parameter estimates changed by less than .001.

Restricted model: Model Summary

Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      2371.775 (a)        .209                   .306

a. Estimation terminated at iteration number 6 because parameter estimates changed by less than .001.

Appendix 4: Outputs from Hosmer-Lemeshow test

Hosmer and Lemeshow Test

Step   Chi-square   df   Sig.
1      129.499      8    .000

Classification Table (a, b)

                                   Predicted EconActive
        Observed                   .00      1.00     Percentage Correct
Step 0  EconActive  .00            0        679      .0
        EconActive 1.00            0        1907     100.0
        Overall Percentage                           73.7

a. Constant is included in the model.
b. The cut value is .500

Classification Table (a)

                                   Predicted EconActive
        Observed                   .00      1.00     Percentage Correct
Step 1  EconActive  .00            294      385      43.3
        EconActive 1.00            140      1767     92.7
        Overall Percentage                           79.7

a. The cut value is .500

Appendix 5: Regression outputs 2 (sample split)

              Regression 5   Regression 6   Regression 7   Regression 8
              EconActive     EconActive     JobClass       JobClass
              MALES          FEMALES        MALES          FEMALES
Constant      6.620          7.439          -.991          .811
              (.652)         (.603)         (.503)         (.685)
Married       .388*          -.183          .290           -.107
              (.199)         (.154)         (.238)         (.252)
Education     .236           .775***        1.615***       1.520***
              (.204)         (.153)         (.225)         (.242)
Children      .921           -.468***       .077           -.007
              (.226)         (.096)         (.114)         (.139)
Volunteers    -.541**        -.470***       .110           -.020
              (.168)         (.134)         (.196)         (.216)
Age           -.090***       -.107***       .001           -.020*
              (.011)         (.010)         (.010)         (.012)
Lnsavings     -.141***       -.190***       .080*          .140*
              (.039)         (.033)         (.046)         (.057)
Urban         -.270          -.153          .799***        .540*
              (.180)         (.136)         (.197)         (.223)
R-squared     0.336          0.32           0.149          0.122

Note: standard errors are in parentheses. Significance levels are given as follows: *=significant at 10%, **=significant at 5%, ***=significant at 1%
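The odds in figure 5 and the percentages quoted from the expectation-prediction evaluation can be cross-checked against the appendix outputs. The sketch below assumes that the figure 5 values are the exponentiated Children coefficients from regressions 5 to 8. The paper does not state this explicitly, but the numbers bear it out. The dictionary keys and variable names are illustrative.

```python
import math

# Children coefficients from appendix 5 (logit, sample split by gender).
children_coef = {
    ("male", "EconActive"): 0.921,
    ("female", "EconActive"): -0.468,
    ("male", "JobClass"): 0.077,
    ("female", "JobClass"): -0.007,
}

# Exponentiating a logit coefficient gives the odds ratio for a one-unit
# increase in the regressor (here, one additional child).
odds_ratio = {key: math.exp(b) for key, b in children_coef.items()}

# Step 1 classification table from appendix 4: (observed, predicted) -> count.
counts = {(0, 0): 294, (0, 1): 385, (1, 0): 140, (1, 1): 1767}
n = sum(counts.values())
accuracy = (counts[(0, 0)] + counts[(1, 1)]) / n
# A constant-only model predicts "active" for everyone (step 0 table),
# so it is right exactly for the observed-active cases.
baseline = (counts[(1, 0)] + counts[(1, 1)]) / n

for (group, outcome), value in odds_ratio.items():
    print(f"{group:6s} {outcome:10s} odds ratio = {value:.3f}")
print(f"model accuracy    = {accuracy:.1%}")
print(f"baseline accuracy = {baseline:.1%}")
```

Run as written, this reproduces the four odds in figure 5 to three decimal places and the 79.7% and 73.7% prediction rates discussed in section 4.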