Review questions for Multinomial Logit/Probit, Tobit, Heckit, Quantile Regressions

1. I estimated a multinomial logit model of employment behavior using data from the 2006 Current Population Survey. The three possible outcomes for a person are employed (outcome=1), unemployed (outcome=2), and out of the labor force (outcome=3). The coefficients for outcomes 2 and 3 are presented below. The coefficients for outcome 1 are normalized to zero.

VARIABLES              Unemployed    Out of labor force
Female                 0.0575        0.677
                       (2.647)       (74.59)
Age                    -0.129        -0.305
                       (-33.56)      (-211.3)
Age-squared            0.00122       0.00379
                       (25.35)       (225.6)
# of kids aged 0-5     0.00907       0.181
                       (0.490)       (22.78)
# of kids aged 6-17    0.0711        0.199
                       (6.557)       (42.17)
Constant               -0.309        3.711
                       (-4.467)      (132.5)
Observations           325458        325458

a. Compute the probability that a 40 year old male with no kids is
   i. employed
   ii. unemployed
b. After estimating the above multinomial logit model, I executed the following Stata commands and received the output listed below:

. mfx, predict(p outcome(2))

Marginal effects after mlogit
      y = Pr(emp==2) (predict, p outcome(2))
        = .02700302

variable       dy/dx         Std. Err.    z
female*        -.0048006     .00056       -8.50
age            -.0005028     .0001        -4.94
age2           -3.61e-06     .00000       -2.94
#kids<5        -.0014705     .00048       -3.05
#kids 6-17     -.0000104     .00028       -0.04
(*) dy/dx is for discrete change of dummy variable from 0 to 1

. mfx, predict(p outcome(3))

Marginal effects after mlogit
      y = Pr(emp==3) (predict, p outcome(3))
        = .34939412

variable       dy/dx         Std. Err.    z
female*        .1515857      .00198       76.51
age            -.0681668     .00034       -198.50
age2           .0008495      .00000       209.74
#kids<5        .0410849      .0018        22.88
#kids 6-17     .0446071      .00106       41.91
(*) dy/dx is for discrete change of dummy variable from 0 to 1

Use the above results to compute the effect of having an additional child under the age of 5 on the probability that a person is employed. Show how you derived your estimate.
d. Suppose you wish to test that children have different effects on the employment behavior of men and women. Explain how you could test this hypothesis. Define the variables you would construct, the model(s) you would estimate, how you would construct your test statistic, the distribution of the test statistic, and how you would decide whether to reject the null hypothesis.
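
The calculations in question 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary keys and the function name `mlogit_probs` are hypothetical, the coefficients are copied from the table in part (a), and the marginal effects are the `mfx` values for children under 5 (which Stata evaluates at the sample means by default). Because the three outcome probabilities sum to one, the three marginal effects must sum to zero, so the effect on employment is the negative of the sum of the other two.

```python
import math

# Multinomial logit coefficients from the table; "employed" is the base outcome.
COEFS = {
    "unemployed": {"const": -0.309, "female": 0.0575, "age": -0.129,
                   "age2": 0.00122, "kids0_5": 0.00907, "kids6_17": 0.0711},
    "out_of_lf": {"const": 3.711, "female": 0.677, "age": -0.305,
                  "age2": 0.00379, "kids0_5": 0.181, "kids6_17": 0.199},
}

def mlogit_probs(female, age, kids0_5, kids6_17):
    """Return the predicted probability of each of the three outcomes."""
    x = {"const": 1.0, "female": female, "age": age, "age2": age ** 2,
         "kids0_5": kids0_5, "kids6_17": kids6_17}
    # exp(x'b_j) for each non-base outcome; the base outcome contributes exp(0) = 1.
    scores = {j: math.exp(sum(b[k] * x[k] for k in x)) for j, b in COEFS.items()}
    denom = 1.0 + sum(scores.values())
    probs = {j: s / denom for j, s in scores.items()}
    probs["employed"] = 1.0 / denom
    return probs

# Part (a): 40 year old male with no kids.
probs = mlogit_probs(female=0, age=40, kids0_5=0, kids6_17=0)

# Part (b): marginal effects of one more child under 5, taken from the mfx output.
ME_KIDS_UNDER5 = {"unemployed": -0.0014705, "out_of_lf": 0.0410849}
# Probabilities sum to one, so the marginal effects sum to zero.
me_employed = -sum(ME_KIDS_UNDER5.values())

print(probs)
print(me_employed)
```

The last line reproduces the adding-up argument: the effect on the probability of employment is -(-0.0014705 + 0.0410849) = -0.0396144.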

2. Using data from IRS Form 5500 filings by U.S. pension plans, I estimated a model of contributions to pension plans as

ln(1 + c_i) = α_0 + U_i α_1 + PD_i α_2 + e_i

where the subscript i indicates the pension plan, c is employer contributions per participant, U is a dummy that equals one if the plan is a union plan, and PD is a dummy that equals one if the plan is participant directed (meaning that the employee decides how to invest the money). Note that employer contributions can equal zero, since some pension plans are funded entirely by employee contributions. I estimated a Tobit and an OLS version of the model. The results are below. Sigma represents the standard error of the residual. Standard errors of the coefficients are presented in parentheses below each coefficient.

Variable                OLS           Tobit
Union                   -0.0626***    -0.0676***
                        (0.012)       (0.014)
Participant Directed    -0.307***     -0.311***
                        (0.0068)      (0.0080)
Constant                5.894***      5.684***
                        (0.0057)      (0.0067)
Sigma                   2.7434        3.226***
Observations            744615        744615

a. Explain why the coefficients change in the observed direction when switching from OLS to Tobit estimation.
b. Use the Tobit model to make the following predictions for a participant directed non-union plan. Give a brief outline of each calculation and be sure to account for the fact that the dependent variable is ln(1+c), not c.
   i. employer contributions per capita
   ii. employer contributions per capita conditional on non-zero contributions being made
   iii. the predicted probability that no contribution is made by the employer
   iv. the predicted probability that per capita employer contributions are between $1000 and $2000 (is this a bit like an ordinal probit question?)
c. Use the Tobit model to predict the effect of a switch to participant direction on per capita contributions. Give a brief outline of your calculation.
d. Explain how you could test whether the effect of unionism on employer contributions has been constant over time (there are 18 years of data in the sample). Describe the models you would estimate, how you would form the appropriate test statistic, the distribution of the test statistic (including degrees of freedom), and the conditions under which you would reject the null hypothesis.
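
One way to organize the predictions in parts (b) and (c) is sketched below. It assumes the standard Tobit setup with a normally distributed latent variable y* = ln(1 + c) censored at zero, so that E[c] = E[(exp(y*) - 1) 1{y* > 0}] = exp(μ + σ²/2) Φ(μ/σ + σ) - Φ(μ/σ). The function and variable names (`predict_plan`, `norm_cdf`) are hypothetical, and the coefficients are copied from the Tobit column of the table.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Tobit estimates from the table (dependent variable is ln(1 + c)).
CONST, B_UNION, B_PD, SIGMA = 5.684, -0.0676, -0.311, 3.226

def predict_plan(union, participant_directed):
    """Tobit predictions on the dollar scale, assuming a normal latent variable."""
    mu = CONST + B_UNION * union + B_PD * participant_directed
    z = mu / SIGMA
    p_zero = norm_cdf(-z)  # P(y* <= 0), i.e. no employer contribution
    # E[c], using the lognormal partial expectation above the censoring point.
    e_c = math.exp(mu + 0.5 * SIGMA ** 2) * norm_cdf(z + SIGMA) - norm_cdf(z)
    e_c_positive = e_c / (1.0 - p_zero)  # E[c | c > 0]
    # P($1000 < c < $2000), with thresholds moved to the ln(1 + c) scale.
    lo = (math.log(1 + 1000) - mu) / SIGMA
    hi = (math.log(1 + 2000) - mu) / SIGMA
    p_band = norm_cdf(hi) - norm_cdf(lo)
    return {"p_zero": p_zero, "e_c": e_c, "e_c_positive": e_c_positive,
            "p_band": p_band}

# Part (b): participant directed, non-union plan.
pd_plan = predict_plan(union=0, participant_directed=1)
# Part (c): effect of switching a non-union plan to participant direction.
switch_effect = pd_plan["e_c"] - predict_plan(union=0, participant_directed=0)["e_c"]

print(pd_plan)
print(switch_effect)
```

Because σ is large relative to the index, the dollar-scale mean is driven heavily by the exp(σ²/2) term, so these predictions are quite sensitive to the normality assumption.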

3. Using data from the March 2011 CPS, I estimated OLS and Tobit models explaining the number of children living in a household with an adult woman present as a function of the woman's age (and its square), her years of education, and her marital status. The never married dummy variable is omitted and the sample is restricted to women aged 21-50.

Table 1. OLS and Tobit estimates of determinants of number of children living in households. (t-statistics are in parentheses)

VARIABLES             OLS           Tobit
Age                   0.0311        0.0292
                      (6.270)       (4.257)
Age2                  -0.000558     -0.000594
                      (-7.953)      (-6.104)
Years of education    -0.0637       -0.0860
                      (-29.71)      (-29.04)
Married               0.581         0.905
                      (40.65)       (44.58)
Divorced              0.169         0.330
                      (8.536)       (11.83)
Constant              3.279         3.806
                      (30.39)       (25.40)
Sigma                 1.202         1.598
Observations          49,104        49,104
R-squared             0.056

a. Compare the OLS and Tobit coefficient estimates. What pattern do you observe? Why should you expect this pattern? Provide a rationale for the direction of the bias in OLS.
b. Use the Tobit model to predict each of the following for a 30 year old married woman with 12 years of education. Provide a brief outline of how you computed your answer.
   i. the expected number of children
   ii. the expected number of children, conditional on having more than 0 children
   iii. the expected number of children, conditional on having more than 2 children
   iv. the probability of having 4 or more children
c. What is the marginal effect of another year of education (compared to never married) on the expected number of children for the person described in (b)? Provide a brief outline of how you computed your answer for the
   i. OLS model
   ii. Tobit model
d. How will the relative size of the OLS and Tobit estimates compare for married vs. never married women? Just provide a qualitative comparison; no numbers are required. Justify your answer.
e. Suppose that you want to test the hypothesis that the Tobit regression coefficients (NOT just the intercepts) are identical across three racial categories. Explain how you could test this with an LR test. Precisely describe the restricted and unrestricted models, the degrees of freedom for your test statistic, and the conditions under which you would reject the null hypothesis.
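
The Tobit predictions in parts (b) and (c) rest on two textbook results for a normal latent variable y* with mean μ = xβ and standard deviation σ: the truncated mean E[y* | y* > a] = μ + σ φ(α) / (1 - Φ(α)) with α = (a - μ)/σ, and the marginal effect on the censored mean, ∂E[y]/∂x = β Φ(μ/σ). The sketch below applies them to the woman described in (b). The names are hypothetical, and treating "4 or more children" as the latent variable exceeding 4 is an assumption of this illustration, since the Tobit treats the count as continuous.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Tobit column of Table 1.
CONST, B_AGE, B_AGE2 = 3.806, 0.0292, -0.000594
B_EDUC, B_MARRIED, SIGMA = -0.0860, 0.905, 1.598

# Latent index for a 30 year old married woman with 12 years of education.
mu = CONST + B_AGE * 30 + B_AGE2 * 30 ** 2 + B_EDUC * 12 + B_MARRIED * 1
z = mu / SIGMA

def mean_above(a):
    """E[y* | y* > a] for a normal latent variable."""
    alpha = (a - mu) / SIGMA
    return mu + SIGMA * norm_pdf(alpha) / (1.0 - norm_cdf(alpha))

e_y = norm_cdf(z) * mu + SIGMA * norm_pdf(z)  # (b.i) censored mean E[y]
e_pos = mean_above(0)                         # (b.ii) E[y | y > 0]
e_gt2 = mean_above(2)                         # (b.iii) E[y | y > 2]
p_4plus = 1.0 - norm_cdf((4 - mu) / SIGMA)    # (b.iv) P(y* > 4), an assumption

# (c) marginal effect of one more year of education.
me_educ_ols = -0.0637                         # OLS: the coefficient itself
me_educ_tobit = B_EDUC * norm_cdf(z)          # Tobit: coefficient times P(y > 0)

print(mu, e_y, e_pos, e_gt2, p_4plus, me_educ_tobit)
```

The scaling factor Φ(μ/σ) is what makes the Tobit marginal effect smaller in magnitude than the Tobit coefficient, which is the quantity to compare against the OLS coefficient.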

4. (20 points) Using data from the March 2008 CPS, I estimated Tobit models of annual Social Security income. The sample includes 62-70 year old men. The only controls that I used in the model are age dummies (62 excluded) and the person's years of education measured as the deviation from the mean. T-statistics are in parentheses.

Tobit estimates for Social Security income

Age dummies
  63                                   4869***      (7.364)
  64                                   6652***      (10.23)
  65                                   10176***     (16.01)
  66                                   14532***     (22.20)
  67                                   15570***     (23.11)
  68                                   16819***     (24.70)
  69                                   15992***     (23.86)
  70                                   15991***     (23.04)
Years of education                     -347.6***    (-3.135)
  (measured as deviation from mean)
Constant                               -4329***     (-8.801)
σ (std. deviation of residual)         10849***     (78.26)
Observations                           5783
Log-likelihood                         -40784.617

Provide a description of how you derive your estimates for the questions below.
a. Based on the estimated model for men, what is the probability that a 62 year old with the average amount of education would
   i. receive no Social Security income
   ii. receive Social Security income of $5,000-10,000
   iii. receive Social Security income of more than $10,000
b. For the same man described in (a), what is
   i. the expected annual Social Security income?
   ii. the expected annual Social Security income conditional on receiving a non-zero income?
c. For the man described in (a), what is the marginal effect of turning 63 on his expected Social Security benefit?

To answer the next part of this question, you need some background on how Social Security benefits are determined. To be eligible for a Social Security retirement benefit, a person must be at least 62 years old and have contributed to the system for at least 40 quarters (10 years) over their lifetime. For men, this means that virtually everyone is eligible to collect. The size of the benefit depends on two factors: (1) the person's average Social Security earnings over the highest 35 years of their career; and (2) when the person files for benefits. The person can file as early as 62 but receives a delayed retirement bonus for every year that they postpone retirement. For example, if a person is eligible for a $10,000 benefit at age 62, they would be entitled to a check of $10,500 if they postpone filing to age 63. Since more educated workers have higher incomes, we would expect that higher levels of education would lead to higher Social Security benefits for those who have filed. At the same time, research has shown that increased education leads to later retirement dates, perhaps because more educated workers are in less physically demanding jobs and thus choose to continue working to a later age.

d. Does the Tobit model allow for the possibility that increased education will reduce the probability of receiving a benefit, but increase the size of the benefit conditional on receiving nonzero Social Security income (or vice versa)? Justify your answer by explaining what parameter(s) in the Tobit model determine the direction of these two effects.
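
Parts (a) through (c) of question 4 can be organized as below. This is a sketch under the usual Tobit assumption of a normal latent benefit censored at zero. The helper names are hypothetical, and the coefficients (constant, age-63 dummy, σ) are copied from the Tobit table. Because education is measured as a deviation from the mean, the latent index for a 62 year old with average education is just the constant, and turning 63 is treated as a discrete change in the age dummy rather than a derivative.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

CONST, B_AGE63, SIGMA = -4329.0, 4869.0, 10849.0

def prob_between(mu, low, high):
    """P(low < y* < high) for the latent benefit."""
    return norm_cdf((high - mu) / SIGMA) - norm_cdf((low - mu) / SIGMA)

def censored_mean(mu):
    """E[y] when y = max(y*, 0) and y* ~ N(mu, SIGMA^2)."""
    z = mu / SIGMA
    return norm_cdf(z) * mu + SIGMA * norm_pdf(z)

mu_62 = CONST            # age 62, average education
mu_63 = CONST + B_AGE63  # same man at age 63

p_zero = norm_cdf(-mu_62 / SIGMA)              # (a.i)
p_5k_10k = prob_between(mu_62, 5000, 10000)    # (a.ii)
p_over_10k = 1.0 - norm_cdf((10000 - mu_62) / SIGMA)  # (a.iii)

e_62 = censored_mean(mu_62)                    # (b.i)
e_62_positive = e_62 / (1.0 - p_zero)          # (b.ii) E[y | y > 0]

effect_of_63 = censored_mean(mu_63) - e_62     # (c) discrete change

print(p_zero, p_5k_10k, p_over_10k, e_62, e_62_positive, effect_of_63)
```

Note that the predicted change in the expected benefit is well below the age-63 coefficient of 4869, because the coefficient describes the latent variable rather than the observed, censored benefit.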

5. As an alternative to the Tobit model, I used a Heckit to estimate the determinants of Social Security benefits. In this model, I treat the decision to file for Social Security benefits as a sample selection problem. The controls are identical to those used for the Tobit, except that I add marital status and number of children living at home as controls in the sample selection equation (but not the Social Security benefit equation). The reference group is age 62 and never married for both the Social Security and sample selection equations. T-statistics are in parentheses.

                                       (1)                (2)
                                       Social Security    Sample Selection
                                                          (i.e. receive SS benefit)
Age dummies
  63                                   159.5              0.380***
                                       (0.225)            (5.605)
  64                                   -66.29             0.534***
                                       (-0.0798)          (7.946)
  65                                   65.45              0.821***
                                       (0.0609)           (12.31)
  66                                   -1282              1.361***
                                       (-0.848)           (18.49)
  67                                   -1466              1.524***
                                       (-0.903)           (19.27)
  68                                   -1753              1.748***
                                       (-0.997)           (20.49)
  69                                   -2133              1.652***
                                       (-1.249)           (20.24)
  70                                   -2406              1.679***
                                       (-1.388)           (19.45)
Years of education (dev. from mean)    793.5***           -0.0855***
                                       (7.490)            (-6.513)
Married                                                   0.229***
                                                          (2.800)
Widowed                                                   0.476***
                                                          (4.020)
Divorced                                                  0.281***
                                                          (3.017)
# children at home                                        -0.227***
                                                          (-5.508)
Lambda                                 -5648***
                                       (-3.215)
Constant                               17903***           -0.788***
                                       (8.400)            (-8.689)
Observations                           5783               5783

a. What advantage does the Heckit have over the Tobit? Also, based on the Heckit estimates found here, what evidence is there for or against the underlying assumptions of the Tobit model?
b. Suppose that instead of Heckit, I had estimated the model using OLS for the sample of men receiving a Social Security check (i.e. excluding those with zero Social Security income). Would you expect the estimated effect of education on Social Security benefits to be over- or under-estimated in the OLS model? Justify your prediction.
c. For the same person that you used in question 4 (62 year old with average education) and assuming he is married with no children at home, predict each of the following and provide a brief outline of how you derived your answers.
   i. probability of receiving Social Security
   ii. expected Social Security income
   iii. expected Social Security income conditional on receiving a nonzero benefit
d. Given that the decision to file for Social Security is affected by a wide range of variables that we have not controlled for (e.g. health, other sources of wealth, work preferences, physical demands of job, etc.), provide a story that would lead to the type of sample selection observed here.
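
The predictions in part (c) follow the usual two-step Heckit logic, sketched below. The sketch assumes that the reported lambda is the coefficient on the inverse Mills ratio in the benefit equation, so that E[benefit | receiving] = xβ + lambda × φ(zγ)/Φ(zγ), where zγ is the selection index. Variable names are hypothetical, and the coefficients are copied from the table for a married 62 year old with average education and no children at home.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Selection (probit) equation: constant plus the married dummy.
SEL_CONST, SEL_MARRIED = -0.788, 0.229
# Benefit equation: constant and the coefficient on the inverse Mills ratio.
BEN_CONST, LAMBDA = 17903.0, -5648.0

z_sel = SEL_CONST + SEL_MARRIED          # age 62, average education, no kids
p_receive = norm_cdf(z_sel)              # (c.i) P(receives a benefit)

mills = norm_pdf(z_sel) / norm_cdf(z_sel)   # inverse Mills ratio
e_conditional = BEN_CONST + LAMBDA * mills  # (c.iii) E[benefit | receiving]

# (c.ii) Non-recipients have zero income, so weight by the receipt probability.
e_unconditional = p_receive * e_conditional

print(p_receive, e_conditional, e_unconditional)
```

The negative lambda pulls the conditional prediction below the benefit-equation constant, which is the selection pattern part (d) asks you to explain.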

6. A study by Joshua Angrist¹ investigated the effect of voluntary military service on post-service earnings. According to his estimates, the difference in mean earnings of veterans (those who were in the military) and non-veterans is $1233 annually.
a. To examine the true effect of military service on earnings, Angrist estimates a simple OLS model:

Y_i = X_i β + V_i α + e_i

where Y_i is annual earnings, X_i is a vector of controls describing person i's earnings potential (e.g. education, age), and V_i is a dummy that equals one for veterans. The estimated coefficient on V_i is -$197 and it is statistically significant. What could cause the regression estimate of the military effect to be NEGATIVE while the simple difference in means suggests that military service increases earnings by over $1000?
b. For the cohort of people in the sample examined by Angrist, there was no draft and military service was voluntary. Since military service was voluntary, is the OLS estimate of the veteran effect likely to be biased upwards or downwards? Explain any assumptions that you make about behavior and why this would lead to either an upward or downward bias.
c. Explain what model you could estimate to eliminate the bias in (b) and get the true effect of veteran status on earnings. Describe the estimation process and any additional data or variables that you will need to estimate the model. Also, indicate what parameters in the estimated model reveal the true effect of military service on earnings.
d. Explain how you could use the estimation parameters described in (c) to estimate the total difference in earnings for two people who are identical in all respects except that one is a veteran and one is not.
e. During the late 1960s, there was a draft where eligible men were randomly selected from the population (birthdates were drawn at random to determine a person's order in the draft). If the OLS earnings equation described in (a) was estimated using people who were age eligible for the military in the late 1960s (the "draft cohort") instead of the 1990s cohort (the "voluntary cohort"), do you think the estimated effect of military service would rise or fall? Would the estimated effect of military service be closer to the true effect for the draft or voluntary cohort? Explain the basis for your prediction.

¹ Angrist, Joshua (1998). Estimating the Labor Market Impact of Voluntary Military Service Using Social Security Data on Military Applicants. Econometrica 66, 249-88.

7. Using 7 years of data from the Survey of Consumer Finances gathered between 1989 and 2007 (the survey is done once every 3 years), I estimated several regressions to examine the factors that influence the value of vehicles owned by a household. The dependent variable in each case is the natural log of the real value of all vehicles owned (in 2007 dollars). The control variables are the natural log of real household income; dummy variables indicating whether the household has a married couple (omitted dummy), a single female, or a single male; and year dummies (1989 omitted).

                   OLS          10th quantile   50th quantile   90th quantile
Ln(real income)    0.423***     0.409***        0.429***        0.445***
                   (0.0038)     (0.0086)        (0.0047)        (0.0058)
Single female      -0.548***    -0.571***       -0.556***       -0.517***
                   (0.016)      (0.030)         (0.019)         (0.023)
Single male        -0.329***    -0.463***       -0.327***       -0.191***
                   (0.015)      (0.029)         (0.019)         (0.023)
1992               0.00310      0.0509          -0.0732***      0.0237
                   (0.022)      (0.040)         (0.027)         (0.032)
1995               0.242***     0.368***        0.210***        0.168***
                   (0.021)      (0.040)         (0.026)         (0.032)
1998               0.246***     0.440***        0.188***        0.155***
                   (0.021)      (0.040)         (0.027)         (0.032)
2001               0.300***     0.510***        0.246***        0.177***
                   (0.021)      (0.040)         (0.026)         (0.032)
2004               0.261***     0.401***        0.214***        0.207***
                   (0.021)      (0.039)         (0.026)         (0.031)
2007               0.238***     0.395***        0.197***        0.159***
                   (0.021)      (0.039)         (0.026)         (0.032)
Constant           4.897***     3.839***        4.941***        5.655***
                   (0.048)      (0.11)          (0.059)         (0.071)
Observations       24801        24801           24801           24801

a. Based on the regressions above, controlling for income, sex, and marital status, what has happened to the mean value of cars owned between 1989 and 2007? Explain how you came to your conclusion.
b. Controlling for income, sex, and marital status, what has happened to the range of car values owned over time? Explain how you came to your conclusion.
c. For a given income and year, is the range of car values owned greater among the single male or single female population? Explain how you came to your conclusion.
d. For a married couple with $100,000 of income in 2007, what is the projected range of car values (from 10th to 90th percentile)? Be sure to note the use of log transformation in some of the variables when you do your calculations.
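
Part (d) can be worked through as in the sketch below. It relies on the fact that quantiles are preserved under monotone transformations, so the predicted quantile of ln(value) can be exponentiated directly to get the quantile of the dollar value (unlike a conditional mean, no retransformation correction is needed). The function name and dictionary layout are hypothetical, and the coefficients are the constant, ln(real income), and 2007 entries of the 10th and 90th quantile columns. Married couples are the omitted category, so no household-type dummy enters.

```python
import math

# (constant, ln(real income) coefficient, 2007 dummy) for each quantile column.
QUANTILE_COEFS = {
    "q10": (3.839, 0.409, 0.395),
    "q90": (5.655, 0.445, 0.159),
}

def predicted_value(quantile, income):
    """Predicted vehicle value in 2007 dollars for a married couple in 2007."""
    const, b_income, b_2007 = QUANTILE_COEFS[quantile]
    log_value = const + b_income * math.log(income) + b_2007
    # Quantiles commute with exp(), so this is the quantile of the dollar value.
    return math.exp(log_value)

q10_value = predicted_value("q10", 100_000)
q90_value = predicted_value("q90", 100_000)
value_range = q90_value - q10_value

print(q10_value, q90_value, value_range)
```

The same function, with the single male or single female coefficients added to the index, would give the quantile spreads needed for the comparison in part (c).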

8. An article by Rangvid (2010)² uses data on students in Denmark to investigate peer effects. The hypothesis is that the academic performance of a student's peers influences the student's own academic performance. That is, ceteris paribus, students in a classroom with brighter peers will do better. To investigate the hypothesis, Rangvid estimates the effect of the average academic score of a student's peers on the student's own performance using both OLS and quantile regression methods. Other controls for family background, teacher quality, and class size are also included. The OLS estimate of the peer effect is given by the horizontal line in the figure below. The estimated effect from various quantile regressions is given by the bold line. The dashed lines represent confidence intervals for the estimates. Suppose that a school system is considering tracking students. This would put all the high performers in one classroom and the low performers in another. Given the information provided above, describe how you can tell
a. whose academic performance would improve.
b. whose academic performance would worsen.
c. whether the average academic performance of all the students combined would rise or fall.

² Educational Peer Effects: Quantile Regression Evidence from Denmark with PISA2000 Data, unpublished manuscript.