Econ 371 Problem Set #4 Answer Sheet

6.2 This question asks you to use the results from column (1) in the table on page 213.

a. The first part of this question asks whether workers with college degrees earn more than workers with only a high school degree. Based on the regression results, workers with college degrees earn $5.46/hour more, on average, than workers with only high school degrees.

b. The second part asks a similar question, this time focusing on the wage differential for men versus women. The regression results indicate that men earn $2.64/hour more, on average, than women.

6.3 The next question asks you to use the results from column (2) in the table on page 213.

b. In the second part of the question, you are asked to predict the earnings of two individuals: Sally, who is a 29-year-old female college graduate, and Betsy, who is a 34-year-old female college graduate.

Sally's earnings prediction is 4.40 + 5.48 × 1 − 2.62 × 1 + 0.29 × 29 = 15.67 dollars per hour.
Betsy's earnings prediction is 4.40 + 5.48 × 1 − 2.62 × 1 + 0.29 × 34 = 17.12 dollars per hour.

The difference is 1.45 dollars per hour.

6.4 The next question asks you to use the results from column (3) in the table on page 213.

b. Here you are asked why the regressor West is excluded from the regression. The regressor West is omitted to avoid perfect multicollinearity. If West is included, then the intercept can be written as a perfect linear function of the four regional regressors. Because of perfect multicollinearity, the OLS estimator cannot be computed.

6.5 In question 6.5, you are to use the results from an analysis of housing prices.

b. Here you are asked to estimate the impact of adding a bathroom that increases the size of the house by 100 square feet. In this case the number of bathrooms rises by 1 and Hsize rises by 100. The resulting expected change in price is 23.4 + 0.156 × 100 = 39.0 thousand dollars, or $39,000.

c. In part c you are asked to predict the impact on housing price of a deterioration of the house's condition to poor. The loss in this case is $48,800.

7.4 This question continues question 6.4 above, providing standard errors for the estimated regression model, as reported in the table on page 247.

a. You are asked whether or not the regional differences appear to be important. The F-statistic for testing that the coefficients on the regional regressors are all zero is 6.10. The 1% critical value (from the F(3, ∞) distribution) is 3.78. Because 6.10 > 3.78, the regional effects are significant at the 1% level.
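As a quick arithmetic check on 6.3, 6.5 and 7.4, the following Python sketch recomputes the numbers quoted above. It is not part of the course's Stata program: the function and variable names are illustrative, and the coefficients are the rounded values quoted in the answers. The 1% critical value for F(3, ∞) is obtained as the 99th percentile of a chi-squared distribution with 3 degrees of freedom divided by 3, using a closed-form CDF and bisection so that only the standard library is needed.

```python
import math


def predicted_wage(college, female, age):
    """Fitted hourly earnings from the rounded column (2) coefficients quoted in 6.3."""
    return 4.40 + 5.48 * college - 2.62 * female + 0.29 * age


sally = predicted_wage(college=1, female=1, age=29)
betsy = predicted_wage(college=1, female=1, age=34)

# 6.5b: one additional bathroom plus 100 additional square feet (price in $1000s).
price_change = 23.4 * 1 + 0.156 * 100


def chi2_3_cdf(x):
    """CDF of a chi-squared variable with 3 degrees of freedom (closed form)."""
    return math.erf(math.sqrt(x / 2)) - math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def f3_inf_critical(alpha):
    """Upper-alpha critical value of F(3, infinity) = chi2(3) / 3, found by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_3_cdf(mid) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return hi / 3


critical_1pct = f3_inf_critical(0.01)

print(round(sally, 2), round(betsy, 2), round(betsy - sally, 2))
print(round(price_change, 1))
print(round(critical_1pct, 2), 6.10 > critical_1pct)
```

The age coefficient alone drives the Sally/Betsy gap (5 years × 0.29), which is why the difference does not depend on the other regressors.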

The two empirical exercises in this homework use the same dataset: CollegeDistance. The data can be downloaded from the Web site listed in the assignment (which you can also reach from the class website). A program that carries out all of the tasks for the problems in E6.2 is appended to this answer sheet.

E6.2

a. The first task is to regress the years of completed education (ED) on distance to the nearest college (Dist) and to report the estimated slope. The results are as follows:

    ÊD = 13.96 − 0.073 Dist,    R² = 0.0074
         (0.038)  (0.013)

The slope for this regression is therefore −0.073.

b. Next, you are asked to run an additional regression including some of the other variables in the data set. The resulting parameter estimates are:

    Variable     Parameter Est.   Standard Error
    dist            −0.032           0.012
    bytest           0.093           0.0030
    female           0.145           0.050
    black            0.367           0.068
    hispanic         0.398           0.074
    incomehi         0.395           0.062
    ownhome          0.152           0.065
    dadcoll          0.696           0.071
    cue80            0.023           0.009
    stwmfg80        −0.051           0.020
    intercept        8.827           0.241

The estimated effect of Dist is now −0.032.

c. The coefficient has fallen in magnitude by more than 50% (from −0.073 to −0.032). Thus, it seems that the result in (a) did suffer from omitted variable bias.

d. The regression in (b) fits the data much better, as indicated by the R², R̄², and SER. The R² and R̄² are similar because the number of observations is large (n = 3796).

e. Students with dadcoll = 1 (so that the student's father went to college) complete 0.696 more years of education, on average, than students with dadcoll = 0 (so that the student's father did not go to college).

f. These terms capture the opportunity cost of attending college. As STWMFG80 increases, forgone wages increase, so that, on average, college attendance declines. The negative sign on the coefficient is consistent with this. As CUE80 increases, it is more difficult to find a job, which lowers the opportunity cost of attending college, so that college attendance increases. The positive sign on the coefficient is consistent with this.

g. Bob's predicted years of education = −0.0315 × 2 + 0.093 × 58 + 0.145 × 0 + 0.367 × 1 + 0.398 × 0 + 0.395 × 1 + 0.152 × 1 + 0.696 × 0 + 0.023 × 7.5 − 0.051 × 9.75 + 8.827 = 14.75. The program computes this more precisely using the lincom command, which gives 14.79 with the unrounded coefficients.

h. Jim's expected years of education is 2 × 0.0315 = 0.0630 less than Bob's. Thus, Jim's expected years of education is 14.75 − 0.063 = 14.69.

E6.2 These are the answers to the additional questions.

a. The first additional question asks you to construct a 90% confidence interval around the predictions in parts g and h. This can be read directly from the Stata output using the lincom command and the level(90) option. Specifically, the 90% confidence interval for part g is (14.63886, 14.94217). The 90% confidence interval for part h is (14.56512, 14.88975).

b. The second question asks you to test the hypothesis that the additional variables in E6.2b are jointly significant. This is done using the test command after the regression. In this case, the F-statistic is 215.43 and the associated p-value is less than 0.0001, so we reject the restriction that the coefficients on the additional variables are all zero. The more complicated model is a statistically significant improvement on the basic model at the 10%, 5%, and 1% levels.
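The predictions in parts g and h, the size of the change in the Dist coefficient in part c, and the 90% intervals in the additional part a can be re-derived by hand. The Python sketch below is one way to do so under stated assumptions: it uses the rounded coefficients from the table in part b (with Dist at −0.0315, as in part g), takes the lincom point estimates and standard errors from the appended Stata log, and uses the normal critical value 1.645 as an approximation to the t critical value that Stata applies. All names are illustrative.

```python
# Rounded coefficients from the E6.2b table (dist kept at -0.0315 as in part g).
coef = {
    "dist": -0.0315, "bytest": 0.093, "female": 0.145, "black": 0.367,
    "hispanic": 0.398, "incomehi": 0.395, "ownhome": 0.152, "dadcoll": 0.696,
    "cue80": 0.023, "stwmfg80": -0.051, "_cons": 8.827,
}

# Bob's characteristics as used in part g; "_cons" multiplies the intercept.
bob = {
    "dist": 2, "bytest": 58, "female": 0, "black": 1, "hispanic": 0,
    "incomehi": 1, "ownhome": 1, "dadcoll": 0, "cue80": 7.5,
    "stwmfg80": 9.75, "_cons": 1,
}
# Jim is identical to Bob except that dist is 4 instead of 2.
jim = dict(bob, dist=4)


def predict(person):
    """Fitted years of education: sum of coefficient times regressor value."""
    return sum(coef[name] * person[name] for name in coef)


bob_ed = predict(bob)
jim_ed = predict(jim)

# Part c: proportional fall in the magnitude of the Dist coefficient.
dist_drop = 1 - 0.032 / 0.073


def ci90(estimate, std_err, crit=1.645):
    """Approximate 90% interval: estimate plus or minus crit standard errors."""
    return (estimate - crit * std_err, estimate + crit * std_err)


# Point estimates and standard errors reported by lincom in the log.
bob_ci = ci90(14.79051, 0.0921789)
jim_ci = ci90(14.72744, 0.0986563)

print(round(bob_ed, 2), round(jim_ed, 2), round(dist_drop, 3))
print(tuple(round(v, 3) for v in bob_ci), tuple(round(v, 3) for v in jim_ci))
```

The hand calculation with rounded coefficients lands slightly below the lincom value because the rounding of the bytest coefficient is multiplied by 58.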

c. Finally, you are asked to test the hypothesis that the coefficients on Black and Hispanic are the same. Again, we can use the test command after the regression to test this hypothesis. This gives an F-statistic of 0.13, with a p-value of 0.7168, so we do not reject the null hypothesis. Based on these data, we cannot distinguish between the additional years of education completed by these two sub-populations, conditional on all the other factors.
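The p-value for a single restriction can be approximated without Stata. With one restriction and a large sample, the F-statistic is approximately chi-squared with 1 degree of freedom, so its upper tail probability has a closed form in terms of the complementary error function. The sketch below assumes this large-sample approximation and starts from the rounded statistic 0.13, so it reproduces the reported 0.7168 only approximately; the function name is illustrative.

```python
import math


def p_value_one_restriction(f_stat):
    """Large-sample p-value for a test of one restriction.

    Uses P(chi2(1) > f) = P(|Z| > sqrt(f)) = erfc(sqrt(f / 2)).
    """
    return math.erfc(math.sqrt(f_stat / 2))


p_black_eq_hispanic = p_value_one_restriction(0.13)

print(round(p_black_eq_hispanic, 3))
print(p_black_eq_hispanic > 0.10)  # not rejected even at the 10% level
```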

/* Problem Set #4 */
#delimit ;
clear;
cap log close;
cd "R:\users\jaherrig\My Documents\Classes\Economics 371\Stata";

/* Specify the output file */
log using Problemset4.log, replace;
set more off;

/* Read in and summarize the data */
use CollegeDistance.dta;
describe;
summarize;

/* Estimate the model for question E6.2a
   (first with robust, then with conventional standard errors) */
reg ed dist, r;
reg ed dist;

/* Estimate the model for question E6.2b. Also, include a test of two hypotheses:
   First, that the additional variables jointly have zero coefficients
   Second, that the black and hispanic coefficients are the same */
reg ed dist bytest female black hispanic incomehi ownhome dadcoll cue80 stwmfg80, r;
test bytest female black hispanic incomehi ownhome dadcoll cue80 stwmfg80;
test black=hispanic;
reg ed dist bytest female black hispanic incomehi ownhome dadcoll cue80 stwmfg80;

/* Compute the fitted value of ED for E6.2g and E6.2h */
lincom _cons + 2*dist + 58*bytest + 0*female + 1*black + 0*hispanic + 1*incomehi + 1*ownhome + 0*dadcoll + 7.5*cue80 + 9.75*stwmfg80, level(90);
lincom _cons + 4*dist + 58*bytest + 0*female + 1*black + 0*hispanic + 1*incomehi + 1*ownhome + 0*dadcoll + 7.5*cue80 + 9.75*stwmfg80, level(90);

log close;
clear;
exit;
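The program runs each regression twice, once with the r (robust) option and once without, which changes the standard errors but not the coefficients. The Python sketch below illustrates that difference for a regression with a single regressor. It is an assumption-laden illustration, not the course's code: the five data points are made-up toy numbers (not taken from CollegeDistance), the function name is hypothetical, and the robust formula shown is the heteroskedasticity-robust variance with the n/(n − 2) small-sample adjustment.

```python
import math


def simple_ols(x, y):
    """OLS of y on x with an intercept; returns slope, intercept and two slope SEs."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Conventional (homoskedasticity-only) standard error of the slope.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)

    # Heteroskedasticity-robust standard error with the n / (n - 2) adjustment.
    meat = sum(((xi - x_bar) * e) ** 2 for xi, e in zip(x, resid))
    se_robust = math.sqrt(n / (n - 2) * meat / sxx ** 2)
    return slope, intercept, se_classical, se_robust


# Toy data for illustration only (hypothetical, not from the data set).
toy_x = [0, 1, 2, 3, 4]
toy_y = [14, 14, 13, 13, 12]
slope, intercept, se_classical, se_robust = simple_ols(toy_x, toy_y)

print(round(slope, 4), round(intercept, 4))
print(round(se_classical, 4), round(se_robust, 4))
```

Both calls return the same slope and intercept; only the standard errors differ, which mirrors the two reg ed dist outputs in the log.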

-------------------------------------------------------------------------------
      log:  R:\users\jaherrig\My Documents\Classes\Economics 371\Stata\Problemset4.log
 log type:  text
opened on:  14 Oct 2009, 08:30:56

. set more off;

. /* Read in and summarize the data */

. use CollegeDistance.dta;

. describe;

Contains data from CollegeDistance.dta
  obs:         3,796
 vars:            14                          1 Aug 2006 17:31
 size:       227,760 (78.3% of memory free)
-------------------------------------------------------------------------------
variable name
-------------------------------------------------------------------------------
female
black
hispanic
bytest
dadcoll
momcoll
ownhome
urban
cue80
stwmfg80
dist
tuition
incomehi
ed
-------------------------------------------------------------------------------
Sorted by:

. summarize;

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      female |      3796    .5453109    .4980083          0          1
       black |      3796    .1925711     .394371          0          1
    hispanic |      3796    .1498946    .3570151          0          1
      bytest |      3796    51.00193    8.819251      28.95      71.36
     dadcoll |      3796    .2020548    .4015858          0          1
-------------+--------------------------------------------------------
     momcoll |      3796    .1393572    .3463645          0          1
     ownhome |      3796    .8192835    .3848338          0          1
       urban |      3796     .243941    .4295141          0          1
       cue80 |      3796    7.654874     2.86577        1.4       24.9
    stwmfg80 |      3796    9.556499    1.364411       6.59      12.15
-------------+--------------------------------------------------------
        dist |      3796    1.724921    2.133836          0         16
     tuition |      3796    .9131396    .2835778     .43418    1.40416
    incomehi |      3796    .2863541    .4521164          0          1
          ed |      3796    13.82929    1.813969         12         18

. /* Estimate the model for question E6.2a */

. reg ed dist,r;

Linear regression                                      Number of obs =    3796
                                                       F(  1,  3794) =   29.83
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0074
                                                       Root MSE      =  1.8074

------------------------------------------------------------------------------
             |               Robust
          ed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dist |  -.0733727   .0134334    -5.46   0.000    -.0997101   -.0470353
       _cons |   13.95586   .0378112   369.09   0.000     13.88172    14.02999
------------------------------------------------------------------------------

. reg ed dist;

      Source |       SS       df       MS              Number of obs =    3796
-------------+------------------------------           F(  1,  3794) =   28.48
       Model |  93.0256754     1  93.0256754           Prob > F      =  0.0000
    Residual |  12394.3568  3794    3.266831           R-squared     =  0.0074
-------------+------------------------------           Adj R-squared =  0.0072
       Total |  12487.3825  3795  3.29048287           Root MSE      =  1.8074

------------------------------------------------------------------------------
          ed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dist |  -.0733727   .0137498    -5.34   0.000    -.1003304    -.046415
       _cons |   13.95586   .0377241   369.95   0.000     13.88189    14.02982
------------------------------------------------------------------------------

. /* Estimate the model for question E6.2b. Also, include a test of two
>    hypotheses:
>    First, that the additional variables jointly have zero coefficients
>    Second, that the black and hispanic coefficients are the same */

. reg ed dist bytest female black hispanic incomehi ownhome dadcoll cue80
> stwmfg80,r;

Linear regression                                      Number of obs =    3796
                                                       F( 10,  3785) =  197.68
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2788
                                                       Root MSE      =  1.5425

------------------------------------------------------------------------------
             |               Robust
          ed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dist |  -.0315387   .0116616    -2.70   0.007    -.0544023   -.0086752
      bytest |   .0938201   .0029804    31.48   0.000     .0879768    .0996634
      female |    .145408   .0503939     2.89   0.004     .0466061    .2442098
       black |    .367971   .0675359     5.45   0.000     .2355608    .5003812
    hispanic |   .3985196   .0738763     5.39   0.000     .2536785    .5433608
    incomehi |   .3951984   .0619207     6.38   0.000     .2737972    .5165996
     ownhome |   .1521313   .0649193     2.34   0.019     .0248511    .2794115
     dadcoll |   .6961324   .0707602     9.84   0.000     .5574006    .8348641
       cue80 |   .0232052     .00931     2.49   0.013     .0049521    .0414583
    stwmfg80 |  -.0517777   .0196751    -2.63   0.009    -.0903526   -.0132029
       _cons |   8.827518   .2413001    36.58   0.000     8.354427    9.300609
------------------------------------------------------------------------------

. test bytest female black hispanic incomehi ownhome dadcoll cue80 stwmfg80;

 ( 1)  bytest = 0
 ( 2)  female = 0
 ( 3)  black = 0
 ( 4)  hispanic = 0
 ( 5)  incomehi = 0
 ( 6)  ownhome = 0
 ( 7)  dadcoll = 0
 ( 8)  cue80 = 0
 ( 9)  stwmfg80 = 0

       F(  9,  3785) =  215.43
            Prob > F =    0.0000

. test black=hispanic;

 ( 1)  black - hispanic = 0

       F(  1,  3785) =    0.13
            Prob > F =    0.7168

. reg ed dist bytest female black hispanic incomehi ownhome dadcoll cue80
> stwmfg80;

      Source |       SS       df       MS              Number of obs =    3796
-------------+------------------------------           F( 10,  3785) =  146.35
       Model |  3481.95254    10  348.195254           Prob > F      =  0.0000
    Residual |  9005.42997  3785  2.37924173           R-squared     =  0.2788
-------------+------------------------------           Adj R-squared =  0.2769
       Total |  12487.3825  3795  3.29048287           Root MSE      =  1.5425

------------------------------------------------------------------------------
          ed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dist |  -.0315387   .0123703    -2.55   0.011    -.0557918   -.0072857
      bytest |   .0938201   .0031622    29.67   0.000     .0876204    .1000199
      female |    .145408   .0505889     2.87   0.004     .0462239     .244592
       black |    .367971    .071363     5.16   0.000     .2280574    .5078846
    hispanic |   .3985196   .0744617     5.35   0.000     .2525308    .5445085
    incomehi |   .3951984   .0605308     6.53   0.000     .2765222    .5138746
     ownhome |   .1521313   .0668075     2.28   0.023     .0211492    .2831135
     dadcoll |   .6961324   .0687248    10.13   0.000     .5613911    .8308737
       cue80 |   .0232052   .0096321     2.41   0.016     .0043207    .0420898
    stwmfg80 |  -.0517777   .0198523    -2.61   0.009    -.0906999   -.0128556
       _cons |   8.827518   .2502782    35.27   0.000     8.336825    9.318211
------------------------------------------------------------------------------

. /* Compute the fitted value of ED for E6.2g and E6.2h */

. lincom _cons + 2*dist + 58*bytest + 0*female + 1*black + 0*hispanic +
> 1*incomehi + 1*ownhome + 0*dadcoll + 7.5*cue80 + 9.75*stwmfg80,
> level(90);

 ( 1)  2 dist + 58 bytest + black + incomehi + ownhome + 7.5 cue80 +
       9.75 stwmfg80 + _cons = 0

------------------------------------------------------------------------------
          ed |      Coef.   Std. Err.      t    P>|t|     [90% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   14.79051   .0921789   160.45   0.000     14.63886    14.94217
------------------------------------------------------------------------------

. lincom _cons + 4*dist + 58*bytest + 0*female + 1*black + 0*hispanic +
> 1*incomehi + 1*ownhome + 0*dadcoll + 7.5*cue80 + 9.75*stwmfg80,
> level(90);

 ( 1)  4 dist + 58 bytest + black + incomehi + ownhome + 7.5 cue80 +
       9.75 stwmfg80 + _cons = 0

------------------------------------------------------------------------------
          ed |      Coef.   Std. Err.      t    P>|t|     [90% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   14.72744   .0986563   149.28   0.000     14.56512    14.88975
------------------------------------------------------------------------------

. log close;
      log:  R:\users\jaherrig\My Documents\Classes\Economics 371\Stata\Problemset4.log
 log type:  text
closed on:  14 Oct 2009, 08:30:57
-------------------------------------------------------------------------------
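The summary statistics that Stata prints beside the analysis-of-variance table follow directly from the sums of squares, which is a useful way to confirm that a log has been read correctly. The Python sketch below recomputes them for the conventional (non-robust) regression of ed on dist, using only numbers that appear in the log. The variable names are illustrative, and the formulas are the standard textbook definitions rather than anything specific to this course.

```python
import math

# Values copied from the "reg ed dist" output in the log.
n = 3796            # number of observations
k = 1               # number of regressors, excluding the intercept
ss_model = 93.0256754
ss_resid = 12394.3568

ss_total = ss_model + ss_resid
r_squared = ss_model / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
ms_resid = ss_resid / (n - k - 1)
f_stat = (ss_model / k) / ms_resid
root_mse = math.sqrt(ms_resid)

# t-statistic for dist: coefficient divided by its conventional standard error.
t_dist = -0.0733727 / 0.0137498

print(round(r_squared, 4), round(adj_r_squared, 4))
print(round(f_stat, 2), round(root_mse, 4), round(t_dist, 2))
```

With a single regressor the F-statistic equals the square of the slope's t-statistic, which offers one more internal consistency check.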