Assignment #5 Solutions: Chapter 14

Q1.
a. R² is .037 and the adjusted R² is .033. The adjusted R² becomes particularly important when a multiple regression model has many independent variables. R² never decreases when a variable is added, even if that variable explains none of the variation in the dependent variable. The adjusted R² corrects for the number of variables in the model. Since this model has seven predictors, the adjusted R² should be reported.
b. The F statistic tells us whether the set of independent variables explains any of the variation in the dependent variable (that is, whether R² equals 0). In this case, the F value is 8.60, which is statistically significant at p<.001, suggesting that at least some of the variation in job satisfaction is predicted by the set of variables in the model.
c. The beta value of 0.152 is the standardized (Z-scored) coefficient for that independent variable. Beta values allow us to compare independent variables that are measured on different scales.
d. Job satisfaction = a + b1(age) + b2(hrs worked) + b3(hrs housework) + b4(job stress) + b5(diff house) + b6(# childs) + b7(sex) + e
e. The constant of 5.24 is the predicted value of job satisfaction when all of the independent variables equal 0.
f. The t value of 0.34 for number of children, with a statistical significance level >0.05, tells us that, controlling for the effects of the other independent variables in the model, number of children is not predictive of job satisfaction.
g. The seven variables predict only 3.3% of the variance in job satisfaction (we are using the adjusted R² value because there are several variables in the model). We can see from the t values and accompanying statistical significance levels that hours worked weekly, view of the job as stressful, and perceived difficulty completing family responsibilities are predictive of job satisfaction.
With p>.05, we see that age, hours spent doing housework, number of young children, and gender are not predictive of job satisfaction when the effects of the other variables are controlled.
h. Age was statistically significantly associated with job satisfaction in the bivariate regression. The relationship between age and job satisfaction may have been spurious (that is, some other variable was really responsible for it), or age may have influenced variables such as hours worked or views of one's job, and once
those variables were controlled for, the direct effect of age on job satisfaction was no longer statistically significant.

Q2.
a. Using the adjusted R² values since there are several variables, the addition of the two variables increased the amount of explained variance from 3.3% to 3.5% (or from 3.7% to 4.1% if using the unadjusted R² values).
b. The view of life at home as being stressful (t=0.06; p>0.05) is not predictive of job satisfaction in the multivariate model, but the view that work is too tiring to have time to do home duties is predictive (t=2.79; p=0.005).
c. The addition of the two variables to the model does not change which of the original seven variables are statistically significant.
d. This answer may vary somewhat since there are several plausible hypothetical relationships. However, gender and age should appear at the far left of the causal model, followed by number of children (which would be affected by age). These variables would predict hours worked and hours of housework (the hours variables would likely be connected by a double-headed arrow, indicating a correlation). Then would come all the beliefs/views questions about work and home life, ending with job satisfaction.

Q3. The researcher's hypotheses are only partially correct. Age, gender, and income are not associated with the amount of charitable donations. However, the perception of having a higher income predicts larger donations. Number of children predicts charitable donations, but not in the hypothesized direction: having more children in the household predicts higher charitable donations.

Additional Questions in STATA:

1. Homicide Data: Outcome: homicide, Treatment variable: poverty

. reg homicide03 poverty02

      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  1,    48) =   11.37
       Model |   60.434278     1   60.434278           Prob > F      =  0.0015
    Residual |  255.207526    48  5.31682346           R-squared     =  0.1915
-------------+------------------------------           Adj R-squared =  0.1746
       Total |  315.641804    49  6.44166947           Root MSE      =  2.3058

------------------------------------------------------------------------------
  homicide03 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   poverty02 |   .3553473   .1053992     3.37   0.001     .1434279    .5672667
       _cons |   .5879896   1.274539     0.46   0.647    -1.974643    3.150622
------------------------------------------------------------------------------
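The summary statistics in the upper-right corner of the output can be recovered from the sums of squares and degrees of freedom in the ANOVA panel. The following is a minimal sketch of that arithmetic in Python (not part of the original Stata session); the variable names are illustrative, and the numbers are copied from the output above.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA panel above.
ss_model, df_model = 60.434278, 1
ss_resid, df_resid = 255.207526, 48

ss_total = ss_model + ss_resid      # total sum of squares
n = df_model + df_resid + 1         # number of observations

r2 = ss_model / ss_total                                  # share of variation explained
adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid                # penalized for predictors
f_stat = (ss_model / df_model) / (ss_resid / df_resid)    # MS model / MS residual
root_mse = math.sqrt(ss_resid / df_resid)                 # residual standard error

print(f"Number of obs = {n}")
print(f"R-squared     = {r2:.4f}")
print(f"Adj R-squared = {adj_r2:.4f}")
print(f"F(1, 48)      = {f_stat:.2f}")
print(f"Root MSE      = {root_mse:.4f}")
```

The same arithmetic applies to every regression table in this section; only the sums of squares and degrees of freedom change.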
The model passes the F-test, and approximately 17% of the variation in homicide rates is explained by poverty levels (adjusted R² = 0.1746).
The linear model is estimated as: homicide03 = 0.59 + 0.36(poverty02)
The correlation coefficient is the square root of the R-squared, or 0.4376.
[Figure: x-axis "% in poverty, 2002"]
According to the estimated model, a one-percentage-point increase in the poverty rate is associated with 0.36 more homicides per 100,000 by state.

2. Environment treaty Data: Outcome: envtreat, Treatment: ngo

. use "C:\Users\Economics Lab\Box Sync\UC Merced\example_3_env_treaties.dta", clear
. reg envtreat ngo

      Source |       SS       df       MS              Number of obs =     170
-------------+------------------------------           F(  1,   168) =  425.05
       Model |  1359.19248     1  1359.19248           Prob > F      =  0.0000
    Residual |  537.213405   168  3.19769884           R-squared     =  0.7167
-------------+------------------------------           Adj R-squared =  0.7150
       Total |  1896.40588   169  11.2213366           Root MSE      =  1.7882

------------------------------------------------------------------------------
    envtreat |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         ngo |   .0033296   .0001615    20.62   0.000     .0030107    .0036484
       _cons |   3.650747   .1938948    18.83   0.000     3.267963    4.033531
------------------------------------------------------------------------------
The model passes the F-test, and approximately 72% of the variation in environmental treaty participation is explained by the number of NGOs.
The linear model is estimated as: envtreat = 3.65 + 0.003(ngo)
The correlation coefficient is the square root of the R-squared, or 0.8466.
[Figure: y-axis "env treat participation"; x-axis "# ngos 2000"]
According to the estimated model, a one-unit increase in the number of NGOs is associated with 0.003 more environmental treaties being signed. To make this more relevant: a one-standard-deviation increase in the number of NGOs (851 more) is associated with 2.83 more environmental treaties.

3. AIDS Knowledge Data: Outcome: aidscon, Treatment: educ

. reg aidscon educ

      Source |       SS       df       MS              Number of obs =    8309
-------------+------------------------------           F(  1,  8307) =  357.72
       Model |  60.2576237     1  60.2576237           Prob > F      =  0.0000
    Residual |  1399.28985  8307  .168447075           R-squared     =  0.0413
-------------+------------------------------           Adj R-squared =  0.0412
       Total |  1459.54748  8308  .175679764           Root MSE      =  .41042

------------------------------------------------------------------------------
     aidscon |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0652831   .0034516    18.91   0.000      .058517    .0720491
       _cons |   .6648986   .0072617    91.56   0.000     .6506638    .6791334
------------------------------------------------------------------------------
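Each bivariate write-up in this section reports the correlation coefficient as the square root of the R-squared. Strictly, the square root gives only the magnitude; the sign follows the sign of the slope. The sketch below, in Python rather than Stata, applies that rule to the three bivariate regressions. The helper name pearson_r is an illustrative assumption, and the inputs are the slopes and R-squared values printed in the outputs.

```python
import math

# (slope, R-squared) pairs copied from the three bivariate regression outputs.
models = {
    "homicide03 ~ poverty02": (0.3553473, 0.1915),
    "envtreat ~ ngo": (0.0033296, 0.7167),
    "aidscon ~ educ": (0.0652831, 0.0413),
}


def pearson_r(slope, r_squared):
    """Correlation implied by a one-predictor regression: sqrt(R^2) with the slope's sign."""
    return math.copysign(math.sqrt(r_squared), slope)


correlations = {name: pearson_r(b, r2) for name, (b, r2) in models.items()}

for name, r in correlations.items():
    print(f"{name:<24} r = {r:.4f}")
```

This shortcut holds only for regressions with a single predictor; in a multiple regression the square root of R-squared is the multiple correlation, not any one pairwise correlation.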
The model passes the F-test, and approximately 4% of the variation in knowledge of AIDS transmission is explained by education levels.
The linear model is estimated as: aidscon = 0.66 + 0.065(educ)
The correlation coefficient is the square root of the R-squared, or 0.2032.
[Figure: y-axis "knowledgable about aids transmission?" (0 to 1); x-axis "educational attainment" (0 to 5)]
According to the estimated model, a one-category increase in educational attainment is associated with a 6.5-percentage-point increase in the proportion of the population that understands how AIDS is transmitted.

MULTIPLE REGRESSION:

. reg homicide03 poverty02 divorced00 lesshs03 urban00 confederate

      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  5,    44) =    9.46
       Model |  163.531233     5  32.7062466           Prob > F      =  0.0000
    Residual |  152.110571    44  3.45705844           R-squared     =  0.5181
-------------+------------------------------           Adj R-squared =  0.4633
       Total |  315.641804    49  6.44166947           Root MSE      =  1.8593

------------------------------------------------------------------------------
  homicide03 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   poverty02 |   .0025617   .1361424     0.02   0.985    -.2718153    .2769388
  divorced00 |   .2770672   .2107499     1.31   0.195    -.1476714    .7018058
    lesshs03 |   .2776229   .1093682     2.54   0.015     .0572057    .4980401
     urban00 |   .0441114   .0199339     2.21   0.032     .0039372    .0842857
 confederate |   1.885498   .8315673     2.27   0.028     .2095841    3.561412
       _cons |  -5.799143   2.895789    -2.00   0.051    -11.63522    .0369363
------------------------------------------------------------------------------
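Each t value in the coefficient table is the coefficient divided by its standard error, and a predictor is significant at the 5% level when the absolute t value exceeds the two-tailed critical value for 44 residual degrees of freedom. The Python sketch below reproduces the t column from the printed coefficients. The critical value of about 2.015 is an assumption taken from a standard t table, not computed here.

```python
# Coefficients and standard errors copied from the multiple regression output.
estimates = {
    "poverty02": (0.0025617, 0.1361424),
    "divorced00": (0.2770672, 0.2107499),
    "lesshs03": (0.2776229, 0.1093682),
    "urban00": (0.0441114, 0.0199339),
    "confederate": (1.885498, 0.8315673),
    "_cons": (-5.799143, 2.895789),
}

# Assumption: two-tailed 5% critical value of t with 44 degrees of freedom,
# read from a t table (approximately 2.015).
T_CRIT = 2.015

# t statistic = coefficient / standard error
t_values = {name: coef / se for name, (coef, se) in estimates.items()}

# Predictors (excluding the constant) whose |t| exceeds the critical value.
significant = [
    name for name, t in t_values.items()
    if name != "_cons" and abs(t) > T_CRIT
]

for name, t in t_values.items():
    print(f"{name:<12} t = {t:6.2f}")
print("Significant at the 5% level:", significant)
```

The list of significant predictors agrees with the P>|t| column: only lesshs03, urban00, and confederate have p-values below 0.05.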
The model passes the F-test, and approximately 46% of the variation in homicide rates is explained by the model (adjusted R² = 0.4633).
The linear model is estimated as: homicide03 = -5.80 + 0.003(poverty02) + 0.28(divorced00) + 0.28(lesshs03) + 0.04(urban00) + 1.89(confederate)
The pairwise correlations among the variables in the model are shown in the correlation matrix:

             | homic~03 pover~02 divor~00 lesshs03  urban00 confed~e
-------------+------------------------------------------------------
  homicide03 |   1.0000
   poverty02 |   0.4376   1.0000
  divorced00 |   0.1679   0.1978   1.0000
    lesshs03 |   0.6428   0.7247   0.1261   1.0000
     urban00 |   0.1539  -0.3113  -0.1734  -0.0449   1.0000
 confederate |   0.5426   0.5354   0.0466   0.6176  -0.1919   1.0000

All variables are positively correlated with each other except urban00, which is negatively correlated with all of the other independent variables but is still positively correlated with homicide rates. These correlations can be seen in the individual graphs:
[Figures; x-axis labels: "% in poverty, 2002"; "percent divorced 2000"; "% ages 25+ with less than high school education, 2003"; "% urban 2000"; "state in confederacy?"]
Poverty and divorce rates do not appear to be statistically related to homicide rates once the other variables are included in the model, even though they might be when each is entered alone in a bivariate regression. In fact, if you only include drop-out rates,
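urbanicity, and confederate in the model, the adjusted R² increases (from 0.4633 to 0.4662).

. reg homicide03 lesshs03 urban00 confederate

      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  3,    46) =   15.27
       Model |   157.47137     3  52.4904567           Prob > F      =  0.0000
    Residual |  158.170434    46  3.43848769           R-squared     =  0.4989
-------------+------------------------------           Adj R-squared =  0.4662
       Total |  315.641804    49  6.44166947           Root MSE      =  1.8543

------------------------------------------------------------------------------
  homicide03 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lesshs03 |   .2949173   .0836748     3.52   0.001     .1264886    .4633461
     urban00 |   .0395089   .0181953     2.17   0.035     .0028836    .0761341
 confederate |    1.80119   .8230456     2.19   0.034     .1444847    3.457895
       _cons |  -2.886389   1.693703    -1.70   0.095    -6.295635    .5228576
------------------------------------------------------------------------------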
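Dropping poverty02 and divorced00 lowers R-squared yet raises the adjusted R-squared, because the adjustment penalizes each additional predictor. A minimal Python sketch of that comparison follows; the function name adjusted_r2 is illustrative, and the inputs are the R-squared values, sample size, and predictor counts printed in the two homicide outputs.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k predictors (constant excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Five-predictor model: poverty02, divorced00, lesshs03, urban00, confederate.
full = adjusted_r2(0.5181, n=50, k=5)

# Three-predictor model: lesshs03, urban00, confederate.
reduced = adjusted_r2(0.4989, n=50, k=3)

print(f"Full model    adj R-squared = {full:.4f}")
print(f"Reduced model adj R-squared = {reduced:.4f}")
print("Reduced model has the higher adjusted R-squared:", reduced > full)
```

The two dropped variables add less explanatory power than the degrees of freedom they cost, which is why the smaller model scores slightly higher on the adjusted measure.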