Labor Force Participation and the Wage Gap Detailed Notes and Code Econometrics 113 Spring 2014

In class, Lecture 11, we used a new dataset to examine labor force participation and wages across groups. To do so, we pooled cross-sections of the Current Population Survey, Outgoing Rotation Groups, with 5-year gaps between each cross-section to keep the dataset manageable. Specifically, we merged the cross-sections from 1983, 1988, 1993, 1998, 2003, 2008, and 2013, and used the survey for the fourth month of each group (respondents were surveyed at multiple points).

To begin our study of labor markets, we will focus on labor force participation, which is characterized by a group of dummy variables:

empl: 1 if employed, 0 otherwise.
unem: 1 if unemployed but in the labor force, 0 otherwise.
nilf: 1 if not in the labor force, 0 otherwise.

We use the summarize command to take a first look at these variables:

. su empl unem nilf

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        empl |   1116395    .6139807    .4868353          0          1
        unem |   1116395     .041342    .1990801          0          1
        nilf |   1116395    .3446773    .4752631          0          1

It is also interesting to look at the fraction of the population that is unemployed or underemployed, where underemployed means working part-time. The dummy variable unempt is equal to one when the respondent is unemployed or part-time employed.

. su unempt

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      unempt |     46154    .1809161    .3849528          0          1

We can evaluate how these variables have changed over time using the tabstat command:

. tabstat empl unem unempt nilf, by(year)

Summary statistics: mean
  by categories of: year (Year)

    year |      empl      unem    unempt      nilf
---------+----------------------------------------
    1983 |  .5812963  .0615547  .1632216   .357149
    1988 |  .6200508  .0359243  .2089526  .3440249
    1993 |   .612172  .0441608  .1951287  .3436672
    1998 |   .642206  .0288856  .2268245  .3289084
    2003 |   .628476  .0367215  .1716817  .3348025
    2008 |  .6266061  .0352009  .1779906   .338193
    2013 |  .5931076  .0435911  .1533983  .3633013
---------+----------------------------------------
   Total |  .6139807   .041342  .1809161  .3446773
--------------------------------------------------

Since most labor market statistics are conditioned on the part of the population that is in the labor force, we can restrict the tabstat command using if nilf==0, which calculates the means using only the observations in the pooled cross-section for which the respondent is in the labor force.

. tabstat empl unem unempt nilf if nilf==0, by(year)

Summary statistics: mean
  by categories of: year (Year)

    year |      empl      unem    unempt      nilf
---------+----------------------------------------
    1983 |  .9042474  .0957526  .1632216         0
    1988 |  .9452353  .0547647  .2089526         0
    1993 |  .9327159  .0672841  .1951287         0
    1998 |  .9569573  .0430427  .2268245         0
    2003 |  .9447961  .0552039  .1716817         0
    2008 |   .946811   .053189  .1779906         0
    2013 |  .9315358  .0684642  .1533983         0
---------+----------------------------------------
   Total |  .9369135  .0630865  .1809161         0
--------------------------------------------------

Not surprisingly, the recent unemployment rate of 6-7% is reflected in the mean of unem, conditional on labor force participation. Since this is how unemployment rates are calculated, it suggests that our dataset is a reasonable representation of the US labor force and of participation in it.

1 Labor Force Participation Regressions

In Lecture Module 11, we specified a linear probability model to study labor force participation rates as a function of education, age, age^2, gender, and demographic characteristics. We first need to code our demographic characteristics from the survey results (stored in the variable wbho):

. gen age2 = age^2
. gen black = 0
. replace black = 1 if wbho==2
. gen hispanic = 0
. replace hispanic = 1 if wbho==3
. gen other = 0
. replace other = 1 if wbho==4

The omitted (reference) group is white. We will also include year fixed effects, which we estimate by adding the i.year term to the regression specification. The code and results for the regression are listed in Regression 1A.
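Written out, the specification behind Regression 1A can be sketched as follows. The symbols $\beta_k$, $\delta_t$, and $u_{it}$ are notation introduced here for illustration and are not taken from the lecture; $i$ indexes respondents and $t$ indexes survey years.

```latex
\[
\text{nilf}_{it} = \beta_0 + \beta_1\,\text{educ}_{it} + \beta_2\,\text{age}_{it}
  + \beta_3\,\text{age}_{it}^{2} + \beta_4\,\text{female}_{it}
  + \beta_5\,\text{black}_{it} + \beta_6\,\text{hispanic}_{it}
  + \beta_7\,\text{other}_{it} + \delta_t + u_{it}
\]
```

Here $\delta_t$ stands for the year effects produced by i.year, with the first survey year (1983) as the omitted category. Because nilf is a 0/1 variable, the fitted value is read as the probability of being out of the labor force.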
However, year fixed effects may not be sufficient if education levels, the age of the workforce, and the composition of the population change within states over time. Because this is a 30-year collection of cross-sections 5 years apart, large changes that differ across states could occur. To control for these possibilities, or more specifically to absorb state-year specific shocks, we treat state-year combinations as groups and estimate using fixed effects. To define state-year groups, we use:

. egen state_year = group(state year)

Then we run a fixed effects regression with xtreg, supplying i(state_year) as an option alongside fe. The precise code and results are below in Regression 1B.
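For readers who prefer other syntax, the same within state-year estimator can be requested in two other ways. This is a minimal sketch, assuming state_year and the regressors have been created as described; neither xtset nor areg is used in the lecture notes, and the slope coefficients should match those from the i(state_year) fe form.

```stata
* Option 1: declare the group variable once, then omit i() from xtreg
xtset state_year
xtreg nilf educ age age2 female black hispanic other, fe

* Option 2: absorb one dummy per state-year group with areg
areg nilf educ age age2 female black hispanic other, absorb(state_year)
```

Declaring the group variable with xtset before each change of grouping also avoids the "existing panel variable is not ..." warning that xtreg prints when i() names a different variable from the one currently set.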

REGRESSION 1A

. reg nilf educ age age2 i.year female black hispanic other

      Source |       SS           df       MS      Number of obs   =   1114685
-------------+----------------------------------   F(13, 1114671)  =  38250.21
       Model |  77603.4956        13  5969.49966   Prob > F        =    0.0000
    Residual |  173960.545   1114671  .156064475   R-squared       =    0.3085
-------------+----------------------------------   Adj R-squared   =    0.3085
       Total |  251564.041   1114684  .225681934   Root MSE        =    .39505

        nilf |       Coef.  Std. Err.        t  P>|t|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
        educ |   -.0587172   .0003404  -172.48  0.000    -.0593844   -.0580499
         age |   -.0333336   .0001092  -305.14  0.000    -.0335477   -.0331195
        age2 |    .0004477   1.13e-06   397.47  0.000     .0004455    .0004499
        year |
        1988 |   -.0107145   .0013691    -7.83  0.000     -.013398   -.0080311
        1993 |   -.0059709   .0013667    -4.37  0.000    -.0086495   -.0032922
        1998 |   -.0152777   .0014244   -10.73  0.000    -.0180695    -.012486
        2003 |   -.0028469   .0013742    -2.07  0.038    -.0055403   -.0001534
        2008 |   -.0073528   .0013927    -5.28  0.000    -.0100825   -.0046232
        2013 |    .0094398   .0013996     6.74  0.000     .0066967    .0121829
      female |    .1315024   .0007509   175.12  0.000     .1300306    .1329742
       black |     .041159   .0012912    31.88  0.000     .0386284    .0436897
    hispanic |    .0214434   .0014136    15.17  0.000     .0186728    .0242141
       other |     .063026   .0017417    36.19  0.000     .0596124    .0664396
       _cons |    .8689099   .0025079   346.47  0.000     .8639944    .8738253

REGRESSION 1B

. xtreg nilf educ age age2 female black hispanic other, i(state_year) fe

Fixed-effects (within) regression               Number of obs      =   1114685
Group variable: state_year                      Number of groups   =       357

R-sq:  within  = 0.3071                         Obs per group: min =      1262
       between = 0.5021                                        avg =    3122.4
       overall = 0.3082                                        max =     14772

                                                F(7,1114321)       =  70558.75
corr(u_i, Xb)  = 0.0184                         Prob > F           =    0.0000

        nilf |       Coef.  Std. Err.        t  P>|t|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
        educ |   -.0583487   .0003431  -170.07  0.000    -.0590211   -.0576762
         age |   -.0334134   .0001091  -306.26  0.000    -.0336272   -.0331996
        age2 |    .0004482   1.13e-06   398.37  0.000      .000446    .0004504
      female |    .1312328   .0007493   175.14  0.000     .1297641    .1327014
       black |    .0328807   .0013498    24.36  0.000     .0302351    .0355264
    hispanic |    .0118378   .0015081     7.85  0.000      .008882    .0147936
       other |    .0648217   .0018745    34.58  0.000     .0611477    .0684957
       _cons |    .8674742   .0024177   358.80  0.000     .8627356    .8722129
-------------+----------------------------------------------------------------
     sigma_u |   .03225172
     sigma_e |   .39415021
         rho |   .00665096   (fraction of variance due to u_i)

F test that all u_i=0:  F(356, 1114321) = 16.35           Prob > F = 0.0000

To add contrast to our results on labor force participation, we now restrict the sample to those in the labor force and evaluate how the same factors relate to unemployment status. We allow for state-year fixed effects, since unemployment rates vary across states due to local shocks and other factors that are not national. The code and regression results are below in Regression 1C.

REGRESSION 1C

. xtreg unem educ age age2 female black hispanic other if nilf==0, i(state_year) fe

Fixed-effects (within) regression               Number of obs      =    731170
Group variable: state_year                      Number of groups   =       357

R-sq:  within  = 0.0300                         Obs per group: min =       785
       between = 0.1774                                        avg =    2048.1
       overall = 0.0312                                        max =      9663

                                                F(7,730806)        =   3231.24
corr(u_i, Xb)  = 0.0084                         Prob > F           =    0.0000

        unem |       Coef.  Std. Err.        t  P>|t|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
        educ |   -.0187377    .000257   -72.92  0.000    -.0192413   -.0182341
         age |   -.0071768   .0001129   -63.59  0.000     -.007398   -.0069556
        age2 |    .0000665   1.32e-06    50.23  0.000     .0000639    .0000691
      female |   -.0056194   .0005602   -10.03  0.000    -.0067174   -.0045214
       black |    .0642282   .0010287    62.44  0.000      .062212    .0662443
    hispanic |    .0160746   .0011127    14.45  0.000     .0138937    .0182556
       other |    .0197317   .0014015    14.08  0.000     .0169849    .0224786
       _cons |    .2767253   .0022465   123.18  0.000     .2723222    .2811283
-------------+----------------------------------------------------------------
     sigma_u |   .02089801
     sigma_e |   .23842608
         rho |   .00762393   (fraction of variance due to u_i)

F test that all u_i=0:  F(356, 730806) = 15.47            Prob > F = 0.0000

Review Questions for Final

1a. Within state-year groups, calculate the age at which labor force participation is maximized or minimized. Is this a maximum or minimum? How do we know? Be careful about the definition of nilf (not in labor force) when answering this question.

1b. Within state-year groups, calculate the age at which unemployment is maximized or minimized. Is this a maximum or minimum? How do we know?

1c. Going from Regression 1A to Regression 1B, some coefficients change a bit, while others do not (educ, female). What do you think the state-year fixed effects are controlling for in this case? Think omitted variables here.

1d. In Regression 1C, please interpret the coefficients on educ, female and black.
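Questions 1a and 1b both rest on the turning point of a quadratic in age. As a worked illustration only, the block below applies the general formula to the pooled estimates in Regression 1A rather than to the within state-year regressions the questions ask about. The symbols $\hat\beta_2$ (coefficient on age) and $\hat\beta_3$ (coefficient on age2) are notation assumed here.

```latex
\[
\frac{\partial\,\widehat{\text{nilf}}}{\partial\,\text{age}}
  = \hat\beta_2 + 2\hat\beta_3\,\text{age} = 0
\quad\Longrightarrow\quad
\text{age}^{*} = -\frac{\hat\beta_2}{2\hat\beta_3}
\]
\[
\text{age}^{*}_{1A} = -\frac{-0.0333336}{2\,(0.0004477)} \approx 37.2,
\qquad
\frac{\partial^{2}\,\widehat{\text{nilf}}}{\partial\,\text{age}^{2}}
  = 2\hat\beta_3 = 0.0008954 > 0
\]
```

The sign of the second derivative decides the question: a positive $2\hat\beta_3$ means the turning point is a minimum of the fitted dependent variable, and a negative one means it is a maximum.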

2 Wage Gap Regressions

In this section, we present the detailed code and related questions for our discussion of wage gaps. We use the same dataset as above. To begin, we use the real wage, rw, which is the wage of the respondent divided by a local price index, and transform it using natural logs:

. gen ln_rw = ln(rw)

After transforming the variable into natural logs, we regress the log real wage of each respondent on their education, age, and demographics, using year fixed effects. The code and results are in Regression 2A.

REGRESSION 2A

. xtreg ln_rw educ age age2 female black hispanic other if nilf==0, i(year) fe
warning: existing panel variable is not year

Fixed-effects (within) regression               Number of obs      =    598155
Group variable: year                            Number of groups   =         7

R-sq:  within  = 0.3499                         Obs per group: min =     78994
       between = 0.8909                                        avg =   85450.7
       overall = 0.3557                                        max =     89543

                                                F(7,598141)        =  45993.87
corr(u_i, Xb)  = 0.0626                         Prob > F           =    0.0000

       ln_rw |       Coef.  Std. Err.        t  P>|t|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
        educ |    .2018287   .0005673   355.79  0.000     .2007169    .2029405
         age |    .0635323   .0002598   244.51  0.000      .063023    .0640415
        age2 |   -.0006443   3.09e-06  -208.27  0.000    -.0006504   -.0006382
      female |   -.2545932    .001236  -205.98  0.000    -.2570158   -.2521706
       black |   -.1055568   .0021608   -48.85  0.000     -.109792   -.1013217
    hispanic |   -.0892944   .0022784   -39.19  0.000    -.0937599   -.0848288
       other |   -.0398047    .002868   -13.88  0.000    -.0454259   -.0341835
       _cons |    .9987371   .0050874   196.31  0.000     .9887659    1.008708
-------------+----------------------------------------------------------------
     sigma_u |   .02503829
     sigma_e |   .47686742
         rho |   .00274928   (fraction of variance due to u_i)

F test that all u_i=0:  F(6, 598141) = 237.22             Prob > F = 0.0000

Next, we use state_year fixed effects as above rather than year fixed effects to absorb changes in wages attributable to state-year groups that are also correlated with demographic changes. The code and results are below in Regression 2B.

REGRESSION 2B

. xtreg ln_rw educ age age2 female black hispanic other if nilf==0, i(state_year) fe
warning: existing panel variable is not state_year

Fixed-effects (within) regression               Number of obs      =    598155
Group variable: state_year                      Number of groups   =       357

R-sq:  within  = 0.3514                         Obs per group: min =       638
       between = 0.4984                                        avg =    1675.5
       overall = 0.3547                                        max =      7478

                                                F(7,597791)        =  46264.39
corr(u_i, Xb)  = 0.0519                         Prob > F           =    0.0000

       ln_rw |       Coef.  Std. Err.        t  P>|t|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
        educ |     .194149   .0005637   344.43  0.000     .1930442    .1952538
         age |    .0636368   .0002559   248.65  0.000     .0631352    .0641384
        age2 |   -.0006464   3.05e-06  -212.14  0.000    -.0006523   -.0006404
      female |   -.2543043   .0012164  -209.07  0.000    -.2566883   -.2519202
       black |   -.1257219   .0022224   -56.57  0.000    -.1300777   -.1213661
    hispanic |   -.1347845   .0023991   -56.18  0.000    -.1394867   -.1300823
       other |   -.0819694   .0030604   -26.78  0.000    -.0879678   -.0759711
       _cons |    1.027283   .0050173   204.75  0.000     1.017449    1.037116
-------------+----------------------------------------------------------------
     sigma_u |   .09603396
     sigma_e |   .46909875
         rho |   .04022452   (fraction of variance due to u_i)

F test that all u_i=0:  F(356, 597791) = 61.23            Prob > F = 0.0000

Review Questions for Final

2a. Please interpret precisely the coefficient on female for both regressions 2A and 2B.

2b. Using Regression 2B, please calculate and interpret precisely the difference in wage for a black female compared to a white male.

Next, we will evaluate how the wage gap has changed over time. We will focus on the male-female wage gap for now. Though this can be done in a variety of ways, the plan is to first define a year-specific dummy variable for females. That is, we now allow (for example) the male-female gap in 1983 to differ from its value in 2003. The code for this is below:

. gen female83 = female
. gen female88 = female
. gen female93 = female
. gen female98 = female
. gen female03 = female
. gen female08 = female
. gen female13 = female
. replace female83 = 0 if year!=1983
. replace female88 = 0 if year!=1988
. replace female93 = 0 if year!=1993
. replace female98 = 0 if year!=1998
. replace female03 = 0 if year!=2003
. replace female08 = 0 if year!=2008
. replace female13 = 0 if year!=2013

The gen command creates a variable identical to female, and the replace command then sets it to zero for all observations not from the stated year. The results of replacing female with these seven variables in the within state-year regression are below in Regression 2C.
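The fourteen gen and replace lines above can also be written as a loop. This is a minimal sketch, assuming the year-specific dummies have not been created yet (gen stops with an error if a variable already exists); the local macro names y and yy are illustrative and do not appear in the lecture.

```stata
* Build female83, female88, ..., female13 in one pass
foreach y in 1983 1988 1993 1998 2003 2008 2013 {
    * Two-digit suffix: characters 3-4 of the year, e.g. "83" from "1983"
    local yy = substr("`y'", 3, 2)
    * Equals female in survey year `y' and 0 in every other year
    gen female`yy' = female * (year == `y')
}
```

The product female * (year == `y') reproduces the gen-then-replace logic: it equals female in the stated year, is zero otherwise, and stays missing wherever female is missing.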

REGRESSION 2C

. xtreg ln_rw educ age age2 female83 female88 female93 female98 female03 female08 female13 black hispanic other if nilf==0, i(state_year) fe

Fixed-effects (within) regression               Number of obs      =    598155
Group variable: state_year                      Number of groups   =       357

R-sq:  within  = 0.3528                         Obs per group: min =       638
       between = 0.4354                                        avg =    1675.5
       overall = 0.3559                                        max =      7478

                                                F(13,597785)       =  25069.75
corr(u_i, Xb)  = 0.0155                         Prob > F           =    0.0000

       ln_rw |       Coef.  Std. Err.        t  P>|t|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
        educ |    .1936641   .0005633   343.83  0.000     .1925601     .194768
         age |     .063622   .0002557   248.86  0.000     .0631209    .0641231
        age2 |   -.0006464   3.04e-06  -212.37  0.000    -.0006523   -.0006404
    female83 |   -.3344265   .0031891  -104.86  0.000    -.3406771   -.3281759
    female88 |   -.3062215   .0031881   -96.05  0.000    -.3124701   -.2999729
    female93 |   -.2397609   .0031825   -75.34  0.000    -.2459985   -.2335232
    female98 |   -.2416921   .0033374   -72.42  0.000    -.2482332   -.2351509
    female03 |   -.2278291   .0031345   -72.68  0.000    -.2339726   -.2216856
    female08 |   -.2242251   .0031918   -70.25  0.000    -.2304809   -.2179693
    female13 |   -.2033225   .0032617   -62.34  0.000    -.2097154   -.1969296
       black |   -.1260882     .00222   -56.80  0.000    -.1304393   -.1217372
    hispanic |   -.1343798   .0023965   -56.07  0.000    -.1390769   -.1296827
       other |   -.0817574   .0030571   -26.74  0.000    -.0877492   -.0757657
       _cons |    1.028751   .0050119   205.26  0.000     1.018928    1.038574
-------------+----------------------------------------------------------------
     sigma_u |   .09581929
     sigma_e |   .46857791
         rho |   .04013759   (fraction of variance due to u_i)

F test that all u_i=0:  F(356, 597785) = 59.96            Prob > F = 0.0000

Review Questions for Final

2c. Please comment on the direction of the wage gap over time. Precisely, please interpret the change in the wage gap from 1983 to 2013, as evidenced in Regression 2C.

2d. Suppose that I want to test precisely the difference between the coefficients on female83 and female13. Please derive a regression that allows me to do this. Show your work!

Next, we'd like to evaluate these results by looking not just within state-year groups, but also within industries and occupations. Within the dataset, we use the two-digit industry classification, ind_2d, and the two-digit occupational classification, docc03, for this purpose. Since the industry and occupational classifications are available only for 2003 onward, we drop observations for which either is not available using drop if ind_2d==. | docc03==. Then we define industry-state-year groups, occupation-state-year groups, and industry-occupation-state-year groups:

. egen ind_state_year = group(ind_2d state year)
. egen occ2_state_year = group(docc03 state year)
. egen ind_occ2_state_year = group(ind_2d docc03 state year)
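Finer groups mean smaller cells, and a cell containing a single observation has no within-group variation to contribute to a fixed effects estimate. The sketch below is one way to check cell sizes before estimating. It assumes the group variables above exist; n_cell is a hypothetical variable name, and the count only approximates the estimation sample because it ignores missing values in the regressors.

```stata
* Observations per industry-occupation-state-year cell, counting only
* labor force participants with a non-missing log real wage
bysort ind_occ2_state_year: egen n_cell = total(nilf == 0 & !missing(ln_rw))

* Number of such observations that sit alone in their cell
count if n_cell == 1 & nilf == 0 & !missing(ln_rw)

* Distribution of cell sizes (one entry per observation, not per cell)
summarize n_cell if nilf == 0 & !missing(ln_rw), detail
```

The "Obs per group: min" line in the xtreg output reports the same kind of information after estimation.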

REGRESSION 2D

. xtreg ln_rw educ age age2 female black hispanic other if nilf==0, i(state_year) fe

Fixed-effects (within) regression               Number of obs      =    258721
Group variable: state_year                      Number of groups   =       153

R-sq:  within  = 0.3473                         Obs per group: min =       638
       between = 0.4046                                        avg =    1691.0
       overall = 0.3466                                        max =      6935

                                                F(7,258561)        =  19657.57
corr(u_i, Xb)  = 0.0242                         Prob > F           =    0.0000

       ln_rw |       Coef.  Std. Err.        t  P>|t|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
        educ |    .2161984   .0008921   242.35  0.000     .2144499    .2179469
         age |    .0593373   .0003969   149.50  0.000     .0585594    .0601153
        age2 |     -.00059   4.62e-06  -127.71  0.000    -.0005991    -.000581
      female |   -.2209542   .0019147  -115.40  0.000     -.224707   -.2172015
       black |   -.1397012   .0035137   -39.76  0.000     -.146588   -.1328144
    hispanic |   -.1229709   .0033608   -36.59  0.000    -.1295579   -.1163839
       other |   -.0574861   .0042478   -13.53  0.000    -.0658117   -.0491605
       _cons |    1.039937   .0080455   129.26  0.000     1.024168    1.055706
-------------+----------------------------------------------------------------
     sigma_u |   .08300552
     sigma_e |    .4852719
         rho |   .02842624   (fraction of variance due to u_i)

F test that all u_i=0:  F(152, 258561) = 44.12            Prob > F = 0.0000

REGRESSION 2E

. xtreg ln_rw educ age age2 female black hispanic other if nilf==0, i(ind_state_year) fe

Fixed-effects (within) regression               Number of obs      =    258721
Group variable: ind_state_~r                    Number of groups   =      7208

R-sq:  within  = 0.2669                         Obs per group: min =         1
       between = 0.5068                                        avg =      35.9
       overall = 0.3463                                        max =       750

                                                F(7,251506)        =  13080.20
corr(u_i, Xb)  = 0.2568                         Prob > F           =    0.0000

       ln_rw |       Coef.  Std. Err.        t  P>|t|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
        educ |    .1948149   .0009537   204.28  0.000     .1929457    .1966841
         age |    .0480302   .0003956   121.42  0.000     .0472549    .0488055
        age2 |   -.0004731   4.57e-06  -103.45  0.000    -.0004821   -.0004642
      female |   -.1830613   .0020392   -89.77  0.000    -.1870581   -.1790646
       black |   -.1291588   .0034449   -37.49  0.000    -.1359107   -.1224069
    hispanic |   -.0979738   .0033067   -29.63  0.000    -.1044548   -.0914929
       other |   -.0525642   .0041416   -12.69  0.000    -.0606817   -.0444467
       _cons |    1.325345   .0082127   161.38  0.000     1.309248    1.341441
-------------+----------------------------------------------------------------
     sigma_u |   .25011209
     sigma_e |   .46296176
         rho |   .22592415   (fraction of variance due to u_i)

F test that all u_i=0:  F(7207, 251506) = 5.54            Prob > F = 0.0000

REGRESSION 2F

. xtreg ln_rw educ age age2 female black hispanic other if nilf==0, i(occ2_state_year) fe

Fixed-effects (within) regression               Number of obs      =    258721
Group variable: occ2_state~r                    Number of groups   =      3361

R-sq:  within  = 0.2105                         Obs per group: min =         1
       between = 0.7906                                        avg =      77.0
       overall = 0.3429                                        max =      1039

                                                F(7,255353)        =   9727.83
corr(u_i, Xb)  = 0.3934                         Prob > F           =    0.0000

       ln_rw |       Coef.  Std. Err.        t  P>|t|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
        educ |    .1464703   .0010065   145.52  0.000     .1444975     .148443
         age |     .048822    .000378   129.15  0.000     .0480811     .049563
        age2 |   -.0004799   4.38e-06  -109.58  0.000    -.0004884   -.0004713
      female |   -.1866442   .0020564   -90.76  0.000    -.1906748   -.1826136
       black |   -.0983273    .003322   -29.60  0.000    -.1048384   -.0918163
    hispanic |   -.0786142   .0032107   -24.49  0.000    -.0849071   -.0723213
       other |   -.0473485   .0039969   -11.85  0.000    -.0551823   -.0395147
       _cons |     1.44415   .0079293   182.13  0.000     1.428608    1.459691
-------------+----------------------------------------------------------------
     sigma_u |    .2414583
     sigma_e |   .44916715
         rho |   .22419298   (fraction of variance due to u_i)

F test that all u_i=0:  F(3360, 255353) = 16.15           Prob > F = 0.0000

REGRESSION 2G

. xtreg ln_rw educ age age2 female black hispanic other if nilf==0, i(ind_occ2_state_year) fe

Fixed-effects (within) regression               Number of obs      =    258721
Group variable: ind_occ2_s~r                    Number of groups   =     46724

R-sq:  within  = 0.1690                         Obs per group: min =         1
       between = 0.4172                                        avg =       5.5
       overall = 0.3423                                        max =       402

                                                F(7,211990)        =   6160.71
corr(u_i, Xb)  = 0.3738                         Prob > F           =    0.0000

       ln_rw |       Coef.  Std. Err.        t  P>|t|   [95% Conf.   Interval]
-------------+----------------------------------------------------------------
        educ |    .1304153   .0011223   116.20  0.000     .1282155     .132615
         age |    .0427017   .0004135   103.28  0.000     .0418913     .043512
        age2 |   -.0004159   4.78e-06   -86.93  0.000    -.0004252   -.0004065
      female |   -.1702831   .0023029   -73.94  0.000    -.1747968   -.1657694
       black |   -.0879955   .0036047   -24.41  0.000    -.0950606   -.0809303
    hispanic |   -.0686034   .0034313   -19.99  0.000    -.0753287    -.061878
       other |   -.0396592   .0043013    -9.22  0.000    -.0480897   -.0312287
       _cons |    1.612161   .0087454   184.34  0.000      1.59502    1.629302
-------------+----------------------------------------------------------------
     sigma_u |    .4084345
     sigma_e |   .43896446
         rho |   .46401885   (fraction of variance due to u_i)

F test that all u_i=0:  F(46723, 211990) = 2.40           Prob > F = 0.0000
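Every wage regression in this section has ln_rw as the dependent variable, so the coefficient on a dummy variable is a difference in log points rather than a percentage. The conversion can be sketched as follows, where $D$ is a generic dummy and $\beta_D$ its coefficient (notation assumed here), and the value $-0.10$ is a hypothetical coefficient chosen for illustration, not one of the estimates above.

```latex
\[
\frac{rw_{D=1}}{rw_{D=0}} = e^{\beta_D},
\qquad
\%\Delta rw = 100\,\bigl(e^{\beta_D} - 1\bigr) \approx 100\,\beta_D
\quad \text{for small } |\beta_D|
\]
\[
\beta_D = -0.10
\;\Longrightarrow\;
100\,\bigl(e^{-0.10} - 1\bigr) \approx -9.5\%
\]
```

The approximation $100\,\beta_D$ becomes less accurate as the coefficient grows in absolute value, which matters for gaps as large as those estimated here.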

Review Questions for Final

2e. Do industries and occupations contribute to the wage gap (i.e., different genders and races selecting into different industries and occupations), or is the wage gap amplified when looking within industries or occupations?

2f. Suppose that I claim that within industry-occupation-state-year groups, the male-female wage gap is exactly twice as large as the white-black wage gap. Please write this hypothesis, and a suitable alternative. Please derive an estimating equation that allows one to test this hypothesis.

2g. Write out code that does the following. Within industry-state-year groups, evaluate the differences in the male-female wage gap as a function of having a college degree. Put differently, does having a college degree affect the size/direction of the wage gap? Write out the regression specification you wish to estimate, and the code that will do it (including any variables that you need to generate).