San Francisco State University
Michael Bar
ECON 312, Fall 2018

Final Exam, section 2
Tuesday, December 18
1 hour, 30 minutes

Name:

Instructions
1. This is a closed-book, closed-notes exam.
2. You can use one double-sided sheet of paper, letter size (8½ × 11 in or 215.9 × 279.4 mm), with any content you want.
3. No calculators of any kind are allowed.
4. Show all the calculations, and explain your steps.
5. If you need more space, use the back of the page.
6. Fully label all graphs.

Good Luck
1. (40 points). Jennifer is close to completing her bachelor's degree in economics, and she is considering pursuing a master's degree. For her ECON 690 project she collected a sample of workers with either a bachelor's degree or a master's degree, with the following variables:

inc - respondent's total annual income (in $)
afqt - percentile score on U.S. Armed Forces Qualifying Exam
female - dummy (=1 if female, 0 otherwise)
black - dummy (=1 if black, 0 otherwise)
ma - dummy (=1 if highest level of education is master's degree, 0 if highest level of education is bachelor's degree)

Jennifer estimated two models, and her results are reported in the next table. The numbers in parentheses are 95% confidence intervals.

Dependent variable: log(inc)

                      model1                        model2
Constant              10.376*** (10.234, 10.519)    10.374*** (10.231, 10.516)
afqt                   0.005*** (0.004, 0.007)       0.006*** (0.004, 0.007)
female                -0.315*** (-0.388, -0.242)    -0.318*** (-0.395, -0.241)
black                  0.010 (-0.094, 0.115)        -0.005 (-0.112, 0.102)
ma                     0.104* (-0.017, 0.226)        0.046 (-0.124, 0.216)
female:ma                                            0.054 (-0.189, 0.298)
black:ma                                             0.296 (-0.087, 0.680)
Observations          848                           848
R^2                   0.142                         0.145
Adjusted R^2          0.138                         0.139
Residual Std. Error   0.534 (df = 843)              0.534 (df = 841)
F Statistic           34.928*** (df = 4; 843)       23.699*** (df = 6; 841)
Note: * p<0.1; ** p<0.05; *** p<0.01

a. Demonstrate how you would use the fitted equation from model1 to predict the total income of a Hispanic female with a master's degree, who scored at the 95th percentile on her Armed Forces Qualifying Exam (afqt = 95). No need to calculate the final number, just write the fitted equation and substitute the values.

$\widehat{inc} = \exp(b_1 + b_2 \cdot afqt + b_3 \cdot female + b_4 \cdot black + b_5 \cdot ma)$
$= \exp(10.376 + 0.005 \cdot 95 - 0.315 \cdot 1 + 0.010 \cdot 0 + 0.104 \cdot 1)$
$= \exp(10.376 + 0.005 \cdot 95 - 0.315 + 0.104)$

We exponentiate because the dependent variable is log(inc).
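The substitution in part a of question 1 can be checked numerically. The following is a minimal Python sketch, assuming the model1 coefficients exactly as printed in the table (constant 10.376); the function name `predict_income` and the dictionary `MODEL1` are illustrative names, not part of the exam. It follows the exam's approach of simply exponentiating the fitted log income.

```python
import math

# Coefficients of model1 as printed in the regression table
# (dependent variable: log(inc)).
MODEL1 = {
    "const": 10.376,
    "afqt": 0.005,
    "female": -0.315,
    "black": 0.010,
    "ma": 0.104,
}


def predict_income(afqt, female, black, ma, coefs=MODEL1):
    """Fitted log income from the linear index, then exponentiated."""
    log_inc = (
        coefs["const"]
        + coefs["afqt"] * afqt
        + coefs["female"] * female
        + coefs["black"] * black
        + coefs["ma"] * ma
    )
    return math.exp(log_inc)


# Hispanic female (black = 0) with a master's degree and afqt = 95.
predicted = predict_income(afqt=95, female=1, black=0, ma=1)
print(f"Predicted annual income: ${predicted:,.0f}")
```

The exam does not require the final number; the sketch only confirms that the substituted values enter the linear index with the right signs.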
b. Interpret the estimated coefficient on ma in model1.

$b_5 = 0.104$ means that workers with a master's degree earn approximately 10.4% more in annual income than workers with a bachelor's degree, holding all other regressors the same (i.e. gender, race, and score on the Armed Forces Qualifying Exam).

c. Interpret the estimated coefficient on black in model1.

$b_4 = 0.010$ means that black workers' annual income is approximately 1% higher than that of non-black workers, holding all other regressors the same (i.e. gender, education level, and score on the Armed Forces Qualifying Exam).

Remark. Notice that this difference is not statistically significant, so these data give no evidence that the income of black workers differs from the income of non-black workers. Perhaps the racial gap shrinks among workers with advanced degrees (remember that this sample consists of workers with bachelor's and master's degrees only).

d. Suppose that Jennifer wants to test whether female workers' income is lower than the income of male workers. Write the null and alternative hypotheses for her test, based on model1.

$H_0: \beta_3 = 0$
$H_1: \beta_3 < 0$
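The percentage interpretations in parts b and c rely on the approximation that a coefficient b on a dummy in a log-linear model corresponds to roughly a 100·b percent difference. A small Python sketch of the exact conversion, 100·(exp(b) − 1), is given below; the function name `exact_percent_change` is an illustrative choice, and the coefficients are the model1 values from the table.

```python
import math


def exact_percent_change(coef):
    """Exact percent difference implied by a dummy coefficient when the
    dependent variable is in logs: 100 * (exp(coef) - 1)."""
    return 100.0 * (math.exp(coef) - 1.0)


# model1 coefficients on the dummy variables, as printed in the table.
for name, coef in {"ma": 0.104, "black": 0.010, "female": -0.315}.items():
    approx = 100.0 * coef
    exact = exact_percent_change(coef)
    print(f"{name}: approximate {approx:.1f}%, exact {exact:.2f}%")
```

For small coefficients such as 0.010 the two numbers nearly coincide; for larger ones such as -0.315 the approximation is noticeably rougher.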
e. Interpret the estimated coefficient on black:ma in model2.

$b_7 = 0.296$ is the difference between the benefit from a master's degree for black and for non-black workers. That is, black workers' income benefit from a master's degree is approximately 29.6 percentage points higher than that of non-black workers, holding other regressors fixed (gender, and score on the Armed Forces Qualifying Exam).

Steps. With $b_5$, $b_6$, $b_7$ denoting the model2 coefficients on ma, female:ma and black:ma:

$benefit_{black,ma} = \widehat{\log(inc)}_{black,ma} - \widehat{\log(inc)}_{black,bach} = b_5 + b_6 \cdot female + b_7$
$benefit_{non\text{-}black,ma} = \widehat{\log(inc)}_{non\text{-}black,ma} - \widehat{\log(inc)}_{non\text{-}black,bach} = b_5 + b_6 \cdot female$

Thus, for a given gender, the difference between the benefit of black and non-black workers is:

$benefit_{black,ma} - benefit_{non\text{-}black,ma} = b_7$

f. Suppose that Jennifer wants to test whether female workers' benefits from a master's degree are different from male workers' benefits from a master's degree. Write the null and alternative hypotheses for her test, based on model2.

$H_0: \beta_6 = 0$
$H_1: \beta_6 \neq 0$

g. Based on the reported confidence intervals, what is your conclusion about the test in the last section? Explain your answer.

The 95% confidence interval for $\beta_6$, (-0.189, 0.298), contains all the null values of $\beta_6$ that cannot be rejected at significance level $\alpha = 5\%$ against a two-sided alternative. Since the reported confidence interval contains 0, we fail to reject the null hypothesis at significance level $\alpha = 5\%$. We conclude that the data give no evidence that female workers' benefits from a master's degree are different from male workers' benefits from a master's degree.
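The confidence-interval reasoning in part g can be sketched in a few lines of Python. The interval and point estimate for female:ma are taken from the model2 column of the table; the helper names `ci_contains` and `implied_std_error` are illustrative, and backing out a standard error with the critical value 1.96 is an assumption (the large-sample normal approximation, reasonable here with df = 841).

```python
def ci_contains(ci, value=0.0):
    """True if `value` lies inside the confidence interval (lower, upper)."""
    lower, upper = ci
    return lower <= value <= upper


def implied_std_error(ci, critical=1.96):
    """Back out the standard error from a symmetric 95% interval,
    assuming the normal critical value 1.96 was used to build it."""
    lower, upper = ci
    return (upper - lower) / (2.0 * critical)


# female:ma in model2, as reported in the table.
estimate = 0.054
ci_female_ma = (-0.189, 0.298)

# Two-sided test of H0: beta_6 = 0 at the 5% level.
reject = not ci_contains(ci_female_ma, 0.0)
se = implied_std_error(ci_female_ma)
t_stat = estimate / se

print(f"reject H0: {reject}, implied se = {se:.4f}, t = {t_stat:.2f}")
```

Checking whether 0 lies in the interval and checking whether |t| exceeds the critical value are the same decision rule, which is why both agree here.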
h. Suppose that Jennifer wants to test whether the income of female workers with a master's degree is higher than the income of female workers with a bachelor's degree. Write the null and alternative hypotheses for her test, based on model2.

$H_0: \beta_5 + \beta_6 = 0$
$H_1: \beta_5 + \beta_6 > 0$

2. (5 points). P-value for a test is (circle the correct answer):
a. The probability of accepting a true null hypothesis.
b. The probability of rejecting a true null hypothesis.
c. The probability of accepting a false null hypothesis.
d. The probability of rejecting a false null hypothesis.
e. None of the above.

3. (5 points). An estimator $\hat{\theta}_n$ of the unknown population parameter $\theta$, based on a random sample of size $n$, is efficient if (circle the correct answer):
a. $var(\hat{\theta}_n) = 0 \ \forall n$.
b. $var(\hat{\theta}_n) \to 0$ as $n \to \infty$.
c. $bias(\hat{\theta}_n) \to 0$ as $n \to \infty$.
d. $E(\hat{\theta}_n - \theta) = 0 \ \forall n$.
e. None of the above.
4. (20 points). Simone serves as an expert witness in a discrimination lawsuit against a major mortgage lending company. She collected data on 2,380 loan applications from that company, with the following variables:

deny = 1 if mortgage application was denied, 0 otherwise
black = 1 if applicant is black, 0 if non-black
dir - ratio of debt payments to total income of applicant, in %
lvr - ratio of loan amount to value of property, in %
cs - credit score (in points, higher value is better)
dmi = 1 if applicant was denied mortgage insurance, 0 otherwise

Simone estimated the probit and logit models, and her results (marginal effects) are given in the next table. Estimated standard errors are in parentheses, and the constant is omitted:

Dependent variable: deny

                     Probit mfx              Logit mfx
black                 0.0852*** (0.0215)      0.0738*** (0.0197)
dir                   0.0039*** (0.0006)      0.0036*** (0.0006)
lvr                   0.0013*** (0.0004)      0.0014*** (0.0004)
cs                   -0.0299*** (0.0031)     -0.0262*** (0.0027)
dmi                   0.7825*** (0.0605)      0.8043*** (0.0585)
Pseudo R^2            0.2345                  0.2368
p-value               0                       0
Observations          2,381                   2,381
Log Likelihood       -667.7190               -665.6904
Akaike Inf. Crit.     1,347.4380              1,343.3810
Note: * p<0.1; ** p<0.05; *** p<0.01

a. Interpret the estimated marginal effect of black in the logit model.

$mfx(black) = 0.0738$ means that black applicants are 7.38 percentage points more likely to be denied a mortgage than non-black applicants, holding all other regressors (mortgage characteristics) at their sample mean values.
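The phrase "holding all other regressors at their sample mean values" in part a refers to how marginal effects at the means are evaluated. The Python sketch below shows one common convention for a logit model: a discrete 0-to-1 change for a dummy regressor and a derivative for a continuous one. The exam reports only the marginal effects, not the underlying logit coefficients or sample means, so every number in the example (intercept, coefficients, means) is HYPOTHETICAL and chosen only for illustration; the function names are illustrative as well.

```python
import math


def logistic(z):
    """Logistic CDF, the link function of the logit model."""
    return 1.0 / (1.0 + math.exp(-z))


def index_at(values, intercept, coefs):
    """Linear index: intercept + sum of coefficient * regressor value."""
    return intercept + sum(coefs[name] * values[name] for name in coefs)


def mfx_dummy(name, means, intercept, coefs):
    """Change in predicted probability when the dummy `name` switches
    from 0 to 1, with all other regressors held at their means."""
    at_one = dict(means, **{name: 1.0})
    at_zero = dict(means, **{name: 0.0})
    return (logistic(index_at(at_one, intercept, coefs))
            - logistic(index_at(at_zero, intercept, coefs)))


def mfx_continuous(name, means, intercept, coefs):
    """Derivative of the predicted probability with respect to `name`,
    evaluated at the means: beta * p * (1 - p)."""
    p = logistic(index_at(means, intercept, coefs))
    return coefs[name] * p * (1.0 - p)


# HYPOTHETICAL logit coefficients and sample means, not taken from the exam.
intercept = -4.0
coefs = {"black": 0.7, "dir": 0.05}
means = {"black": 0.15, "dir": 33.0}

print(f"mfx(black) = {mfx_dummy('black', means, intercept, coefs):.4f}")
print(f"mfx(dir)   = {mfx_continuous('dir', means, intercept, coefs):.4f}")
```

Multiplying either result by 100 gives the effect in percentage points, which is the unit used in the interpretations of the table.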
b. Suppose Simone wants to test statistically whether the lending company discriminates against black applicants. Write the null and alternative hypotheses of this test.

Let $\beta_2$ be the unknown marginal effect of black. The test is therefore:

$H_0: \beta_2 = 0$
$H_1: \beta_2 > 0$

Remark: If black applicants are being discriminated against, then their chances of being denied a mortgage are higher, i.e. this is an upper-tail test.

c. Interpret the estimated marginal effect of lvr in the logit model.

$mfx(lvr) = 0.0014$ means that a 1 percentage point increase in the ratio of loan amount to value of property increases the probability of mortgage application denial by 0.14 percentage points, holding all regressors at their sample average values.

d. Interpret the estimated marginal effect of cs in the logit model.

$mfx(cs) = -0.0262$ means that a 1 point increase in the applicant's credit score lowers the probability of mortgage application denial by 2.62 percentage points, holding all regressors at their sample average values.
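The upper-tail test in part b can be carried out with the numbers reported in the table. The sketch below uses the logit marginal effect of black (0.0738) and its standard error (0.0197); treating the ratio as approximately standard normal under the null hypothesis is an assumption (the usual large-sample approximation), and the function name `upper_tail_z_test` is illustrative.

```python
import math


def upper_tail_z_test(estimate, std_error, alpha=0.05):
    """One-sided test of H0: beta = 0 against H1: beta > 0,
    using the standard normal approximation for the test statistic."""
    z = estimate / std_error
    # Upper-tail probability of a standard normal: P(Z > z).
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p_value, p_value < alpha


# Logit marginal effect of black and its standard error, from the table.
z, p_value, reject = upper_tail_z_test(0.0738, 0.0197)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.5f}, reject H0: {reject}")
```

Because the alternative is one-sided, only large positive values of z count as evidence against the null hypothesis.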
5. (10 points). Suppose that you estimated a regression model using OLS, and the plot of residuals against the fitted values looks like the next figure.

[Figure: plot of residuals against fitted values]

a. (3 points). What kind of econometric problem does your model likely suffer from?

Heteroscedasticity.

b. (4 points). What are the consequences of the problem in the previous section?

i. OLS estimators are inefficient.
ii. Estimated standard errors are biased, and therefore statistical hypothesis tests are invalid.

c. (3 points). Propose a practical solution to the problem you identified in section a.

The most practical solution is to compute and report robust standard errors (in R, using the sandwich package).
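To illustrate what robust standard errors change in part c of question 5, here is a minimal pure-Python sketch for a simple regression with one regressor. It computes the conventional OLS standard error of the slope and White's heteroscedasticity-robust (HC0) version; HC0 is the simplest variant and is an assumption here, not necessarily the default of any particular package. The data and function names are HYPOTHETICAL.

```python
import math


def ols_simple(x, y):
    """OLS fit of y = a + b*x; returns intercept, slope and residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, residuals


def slope_standard_errors(x, y):
    """Conventional and White (HC0) standard errors of the slope."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    _, _, e = ols_simple(x, y)
    # Conventional: assumes the same error variance for every observation.
    conventional = math.sqrt(sum(ei ** 2 for ei in e) / (n - 2) / sxx)
    # HC0: weights each squared residual by its squared deviation of x.
    robust = math.sqrt(
        sum((xi - x_bar) ** 2 * ei ** 2 for xi, ei in zip(x, e))
    ) / sxx
    return conventional, robust


# HYPOTHETICAL data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 2.0, 5.0]
_, slope, _ = ols_simple(x, y)
se_conventional, se_robust = slope_standard_errors(x, y)
print(f"slope = {slope:.3f}")
print(f"conventional se = {se_conventional:.4f}, robust se = {se_robust:.4f}")
```

The slope estimate itself is unchanged; only the standard error, and therefore the test statistics and confidence intervals, differ between the two calculations.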
6. (10 points). Suppose that Kevin estimated two models, and his fitted equations are:

$\widehat{EARNINGS} = b_1 + 3S + 3EXP$
$\widehat{EXP} = d_1 - 0.2S$

where $S$ is schooling and $EXP$ is experience. Dray is another researcher who estimated the following model:

$\widehat{EARNINGS} = \tilde{b}_1 + \tilde{b}_2 S$

a. (3 points). Suppose that Kevin's model is the correct one. What is the econometric problem in Dray's model?

Omitted variable bias.

b. (4 points). What are the likely consequences of the problem in the previous section?

i. Biased and inconsistent estimator of the coefficient on schooling.
ii. Biased standard errors of estimators, which makes all statistical tests invalid.

c. (3 points). What would be the value of Dray's estimated coefficient on schooling, $\tilde{b}_2$?

$\tilde{b}_2 = b_2 + b_3 d_2 = 3 + 3 \cdot (-0.2) = 2.4$
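The omitted-variable formula in part c of question 6 can be verified numerically. The Python sketch below builds noise-free data that satisfy Kevin's two fitted equations exactly and then runs Dray's short regression of earnings on schooling. The slopes 3, 3 and -0.2 come from the exam; the intercepts `b1` and `d1` and the schooling values are HYPOTHETICAL, since the exam does not report them, and they do not affect the slope.

```python
def regress_slope(x, y):
    """OLS slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Slopes from Kevin's fitted equations.
b2, b3, d2 = 3.0, 3.0, -0.2
# HYPOTHETICAL intercepts; any values give the same slope.
b1, d1 = 5.0, 20.0

# Omitted-variable formula: short-regression slope = b2 + b3 * d2.
implied_slope = b2 + b3 * d2

# Noise-free data that satisfy both of Kevin's equations exactly.
schooling = [float(s) for s in range(8, 21)]
experience = [d1 + d2 * s for s in schooling]
earnings = [b1 + b2 * s + b3 * e for s, e in zip(schooling, experience)]

short_slope = regress_slope(schooling, earnings)
print(f"formula: {implied_slope:.2f}, short regression: {short_slope:.2f}")
```

Because experience falls with schooling in these data, leaving it out pushes the schooling coefficient below its value of 3 in the full model.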
7. (10 points). Suppose that you are planning to use time series data that looks like the following two variables.

[Figure: time series plots of two variables]

a. (3 points). What kind of econometric problem are you likely to face when using time series data that looks like the above two variables?

Trends in variables.

b. (4 points). What are the consequences of the problem in the previous section?

Spurious regression, which leads to biased and inconsistent estimators (similar to omitted variable bias).

c. (3 points). Propose one practical solution to the problem you identified in section a.

i. Detrending (removing the trend from variables) before using them in regression.
ii. Normalizing: expressing the variables in terms of ratios of the original variable to some other key variable. For example, $c_t = \frac{C_t}{GDP_t}$ or $def_t = \frac{DEF_t}{GDP_t}$ are normalized consumption and normalized deficit, both expressed as a fraction of GDP.
iii. Including time as a regressor.
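Solution i in part c of question 7 (detrending) can be sketched as follows: regress the series on a linear time index and keep the residuals. This Python sketch assumes a linear trend, which is only one possible trend specification; the function name `detrend` and the example series are HYPOTHETICAL.

```python
def detrend(series):
    """Remove a fitted linear time trend: regress the series on
    t = 0, 1, ..., n-1 by OLS and return the residuals."""
    n = len(series)
    t = list(range(n))
    t_bar = sum(t) / n
    y_bar = sum(series) / n
    slope = (
        sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, series))
        / sum((ti - t_bar) ** 2 for ti in t)
    )
    intercept = y_bar - slope * t_bar
    return [yi - (intercept + slope * ti) for ti, yi in zip(t, series)]


# HYPOTHETICAL trending series: linear growth plus a small alternating term.
raw = [100.0 + 5.0 * t + (1.0 if t % 2 == 0 else -1.0) for t in range(10)]
detrended = detrend(raw)
print([round(v, 2) for v in detrended])
```

The detrended values fluctuate around zero and can then be used in a regression in place of the original trending variable.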