B003 Applied Economics
Exercises
Spring 2005

Starred exercises are to be completed and handed in in advance of classes. Unstarred exercises are to be completed during classes.

Ex 3.1    to be handed in by Thu of w/c 14 Feb
Ex 4.1    to be handed in by Thu of w/c 28 Feb
Ex 5.1    to be handed in by Thu of w/c 7 Mar

Setting up and using STATA

To configure Stata in your account:

    Log in to WTS (use your Cluster WTS password)
    Double-click on the folder Applications and Resources
    Double-click on the folder Unix Applications
    Double-click on Setup Exceed (type any key)
    Double-click on the Stata icon (use your main UCL password, i.e. e-mail password)

NOTE: Stata in the WTS environment sets up your personal path directly, so there is no need to specify it for any of the files opened or saved during the session. The current working path that Stata is using is shown in the bottom-left corner of the window.

Accessing data

All datasets used for the coursework or used in lectures can be found on the course website at http://www.ucl.ac.uk/ uctp100/b003/data.htm

Double-click on the dataset required and save it to your R: drive. The dataset can now be opened in STATA by choosing File and then Open.

A guide to STATA commands [1] can be found at http://www.ucl.ac.uk/ uctp100/b003/b003stat.pdf

[1] I am grateful to Elena Martinez-Sanchis for providing this document.
Tutorial 1 Introduction to STATA

1.1 The data for this question are in a file called income.dta. This is a sample of 412 individuals living in households in the UK Midlands during the 1980s.

(a) Open a log file to keep a record of your analysis of the data.

(b) Describe the variables and summarise their means, variances and ranges. List the characteristics of all 30-year-old white respondents.

(c) Draw a histogram for income (inc). Construct a new variable equal to the logarithm of income, draw a histogram of this and comment. Draw separate histograms by ethnic origin (eth) and comment on any difference visible.

(d) Generate a discrete variable recording whether household income exceeds £10000 per annum and tabulate this against ethnic origin. What do you learn? Explain carefully the assumptions needed to justify the difference between column percentages as a measure of economic discrimination faced by nonwhite households. Paying particular attention to issues of measurement error, omitted variables and simultaneity, comment on the reasonableness of the assumptions required in each of these respects.

(e) Draw a scatterplot of income against years of education (edu). Calculate the covariance between these variables. Is higher education associated with greater income?

(f) Are nonwhite respondents less well educated in this data set? How does this bear on your earlier discussion? To what extent are ethnic differences in income apparent within education groups and what can be concluded from this?
Tutorial 2 Wage Equations

2.1 You will find the data for this question in a file called wage.dta. The data are a sample of 300 male manual workers from the UK.

(a) Regress ln wage (wage) on education (educ) and interpret your results. Can you reject the hypothesis that education has no effect?

(b) Add years of experience (exp) to the regression. What happens to the estimated effect of education? Explain why this happens. Explain how you might investigate whether returns to experience are constant, increasing or decreasing, and do so.

(c) Now reestimate your most recent regression equation separately for skilled (those with skill=1) and semiskilled (those with skill=0) workers. How do your conclusions differ for the two groups? For which group do wages increase fastest in the early years of employment? After how many years of experience do workers in the two groups seem to reach their highest wage?

(d) Comment on the plausibility of your results.

2.2 You have data from an extensive medical survey on health and economic status. A cross section of 500 men provides information on earnings \(w\) in £ per week, age in years \(a\) and an index of fitness \(f\).

(a) You run the following regression of ln earnings on the fitness index:

\[ \ln w = 6.1 - \underset{(3.7)}{0.31}\, f + u, \qquad R^2 = 0.05 \]

Here, and below, figures in parentheses are absolute t values and \(u\) is an error term. Calculate a 95% confidence interval for the coefficient on the fitness index.

(b) You then add age to the regression with the following results:

\[ \ln w = 5.3 + \underset{(4.8)}{0.07}\, a + \underset{(0.91)}{0.12}\, f + u, \qquad R^2 = 0.19 \]

Explain carefully why and how adding age to the regression can change the size and sign of the estimated fitness effect. Do the results indicate that age must be negatively or positively correlated with fitness in this sample?

(c) What economic reasons might there be to expect age and fitness to affect earnings? To what extent do you regard these results as supporting the contention that they do?
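The interval asked for in 2.2(a) can be backed out from a coefficient and its absolute t value, since the standard error is the absolute coefficient divided by the absolute t value. The sketch below is illustrative only: the function name `conf_int` is hypothetical, the coefficient is taken to be -0.31 (the negative sign is an assumption, suggested by the reference to a change of sign in part (b)), and the normal critical value 1.96 is used as an approximation with roughly 500 observations.

```python
def conf_int(coef, abs_t, crit=1.96):
    """Confidence interval from a coefficient and its absolute t value.

    Assumes the reported t value tests coef = 0, so that
    standard error = |coef| / |t|. crit=1.96 is the large-sample
    normal critical value for a 95% interval (an approximation).
    """
    se = abs(coef) / abs_t
    return coef - crit * se, coef + crit * se


# Fitness coefficient from 2.2(a); the negative sign is an assumption.
lo, hi = conf_int(-0.31, 3.7)
print(f"95% CI for the fitness coefficient: [{lo:.3f}, {hi:.3f}]")
```

The same helper applies to any reported coefficient in these exercises for which only an absolute t value is given.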
Tutorial 3 Consumer Demand

3.1 * You will find the data for this question in a file called butter.dta. These are aggregate monthly data on butter and margarine purchases.

(a) Use the data to estimate a double log demand equation for butter. What are the advantages and disadvantages of this functional form?

(b) How do you interpret your estimate of the income elasticity? Is it significantly different from one?

(c) Does the own-price elasticity make economic sense? Do butter and margarine appear to be complements or substitutes?

(d) Test whether the demand for butter differs during the summer months and interpret your results.

(e) Does advertising have any significant impact on demand for butter? Can you explain this economically?

(f) Repeat your analysis for margarine rather than butter and comment on whether the results are compatible with those obtained for butter.

3.2 You are given data on food expenditure, income and demographic characteristics of 150 married couples, where the husband is a manual worker and is aged 30-50. All survey interviews take place in the same month. You decide to estimate an Engel curve and regress the ratio of food expenditure to income (wfoy) on the logarithm of net current income (ncy), the number of children aged 6-10 (kids2), the number of children aged 11-17 (kids3) and an indicator for London (london=1 if the couple live in Greater London, and 0 otherwise). Your estimates are:

OLS Estimation, Dependent variable is wfoy, 150 observations

    Variable      Coefficient    Standard error
    constant         0.877           0.065
    ncy             -0.155           0.014
    kids2            0.026           0.006
    kids3            0.138           0.032
    london           0.047           0.018

    R^2              0.496
    Mean of wfoy     0.189
    Mean of ncy      4.473

(a) On the basis of the above estimates establish whether food is a necessity or a luxury.

(b) Establish the statistical significance of the coefficient on london.

(c) Further, compute the income elasticity for a hypothetical couple living outside London, with mean log income and no children.

(d) It has been suggested that the share of the budget spent on food is a good indicator of household living standard. Households spending a lower share of their budget on food are to be regarded as better off. If the number of young children in a household were to increase by one, by how much would income need to rise to preserve the household's living standard?
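The arithmetic behind 3.2(c) and (d) can be sketched as follows. When the food share F/Y is linear in log income with slope beta, the income elasticity of food expenditure is 1 + beta/share. The helper names below are hypothetical, and treating "young children" as the kids2 group (aged 6-10) is an assumption, not something the exercise states.

```python
import math

# Coefficients as reported in the table for Exercise 3.2.
b = {"constant": 0.877, "ncy": -0.155, "kids2": 0.026,
     "kids3": 0.138, "london": 0.047}
mean_ncy = 4.473  # mean of log net current income


def food_share(ncy, kids2=0, kids3=0, london=0):
    """Predicted ratio of food expenditure to income."""
    return (b["constant"] + b["ncy"] * ncy + b["kids2"] * kids2
            + b["kids3"] * kids3 + b["london"] * london)


def income_elasticity(share):
    """share = F/Y = a + beta*ln(Y) implies dlnF/dlnY = 1 + beta/share."""
    return 1 + b["ncy"] / share


# (c) couple outside London, no children, mean log income
share = food_share(mean_ncy)
elasticity = income_elasticity(share)

# (d) rise in log income that offsets one extra child in the kids2 group
# (assumption: "young children" refers to kids2)
dlog_income = b["kids2"] / -b["ncy"]
pct_income_rise = math.exp(dlog_income) - 1

print(f"share = {share:.4f}, elasticity = {elasticity:.4f}")
print(f"required income rise = {100 * pct_income_rise:.1f}%")
```

An elasticity between zero and one is what would be expected of a necessity, which ties the calculation back to part (a).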
Tutorial 4 Labour Supply

4.1 * The data for this question are in a file called work.dta. This is a cross section of data on the labour supply of married women in the US.

(a) Regress hours of work on wage, wage squared and other income, using data only on those women who work. Is the evidence of a backward-bending labour supply curve statistically convincing? What problems arise with modelling the behaviour of the women who do not work?

(b) Consider a woman with a wage of $8 per hour and other income of $10000. How many hours would you expect her to work?

(c) Would an increase in her wage lead her to choose more or fewer hours? What does this imply about the relative size of income and substitution effects? Calculate the uncompensated and compensated effects of a wage increase.

(d) Suppose the government imposes a tax of 30% on earnings. Calculate the revenue raised from this individual (i) if labour supply is fixed (ii) if labour supply changes.

(e) Do any of the other variables provided help in explaining labour supply decisions?

4.2 The government of Nagarstan is worried about child labour. It collects data on household circumstances for a sample of children working in the textile industry. You are asked to estimate a labour supply equation assuming that the child's hours of work are chosen by the parents, in the perceived interests of the child or of the household as a whole, conditional on the income earned by the parents. You estimate the following equation by OLS, where hours are hours worked per day, wage is the child's hourly wage, parental income is daily income and \(u\) is an additive disturbance. Figures in parentheses are absolute t statistics.

\[ \text{Hours} = \underset{(9.20)}{18.71} - \underset{(2.31)}{0.412}\,\text{Wage} - \underset{(4.56)}{0.229}\,\text{Parental income} + u \]

    n = 318
    R^2 = 0.176
    Mean wage = 3.70
    Mean parental income = 60.2

(a) Explain how the wage effect can be decomposed into income and substitution effects. Calculate the substitution effect of a wage increase for a child on the mean wage with mean parental income. In light of your answer, explain how economic theory might explain the fact that wage increases appear to reduce chosen hours of work.

(b) The government is under international pressure to ban child labour. Does your equation suggest anything about the sort of households which would suffer the greatest loss of income? The government does not believe a ban on child labour could be policed. Does your equation suggest anything about alternative policies which might reduce child labour?

(c) Should we be worried by the fact that the equation is estimated only from data on households where children do work?
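The decomposition asked for in 4.2(a) can be sketched numerically using the Slutsky equation for labour supply, in which the uncompensated wage effect equals the substitution effect plus hours times the effect of unearned income. The sketch assumes that both slope coefficients in the estimated equation are negative (the signs are not legible in every copy of the handout, so this is an assumption) and that parental income plays the role of unearned income. Variable names are illustrative.

```python
# Estimated coefficients for Exercise 4.2; the negative signs are an assumption.
b_const, b_wage, b_income = 18.71, -0.412, -0.229
mean_wage, mean_income = 3.70, 60.2

# Predicted hours for a child on the mean wage with mean parental income.
hours = b_const + b_wage * mean_wage + b_income * mean_income

# Slutsky equation for labour supply:
#   uncompensated wage effect = substitution effect + hours * income effect
income_effect = hours * b_income
substitution_effect = b_wage - income_effect

print(f"predicted hours     = {hours:.4f}")
print(f"income effect       = {income_effect:.4f}")
print(f"substitution effect = {substitution_effect:.4f}")
```

Under these assumptions the substitution effect is positive, as theory requires, and the negative uncompensated wage effect arises because the income effect outweighs it.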
Tutorial 5 Consumption and Investment

5.1 * The data for these questions can be found in a file cexp.dta. The variables relate to the UK over the period 1969-1984.

(a) What is meant by permanent income? Suppose consumers' expenditure \(C_t\) in period \(t\) is related to perceptions of permanent income \(Y_t^P\) in period \(t\) according to

\[ C_t = \gamma + \kappa Y_t^P \]

and suppose that consumers update their perceptions of permanent income according to a rule

\[ Y_t^P - Y_{t-1}^P = \lambda (Y_t - Y_{t-1}^P) + \epsilon_t \]

where \(Y_t\) is actual disposable income and \(\epsilon_t\) is a random error term reflecting new information on future incomes. Show that

\[ C_t = \lambda\gamma + (1-\lambda) C_{t-1} + \kappa\lambda Y_t + \kappa\epsilon_t \]

and use the data to estimate such an equation by OLS. What is your estimate of the long run marginal propensity to consume \(\kappa\)? How sensitive to current income do perceptions of permanent income seem to be?

(b) Consider an individual choosing consumption in two adjacent periods \(C_t\) and \(C_{t+1}\). His prospective marginal utilities of consumption in each period are \(C_t^{\beta-1}\) and \(\frac{1}{1+\delta} C_{t+1}^{\beta-1}\), where \(\delta\) is a subjective rate of discount reflecting the consumer's impatience and \(\beta < 1\). Given that the MRS is equal to marginal utility in the first period divided by marginal utility in the second, show that

\[ (C_{t+1}/C_t)^{1-\beta} = (1 + r_t)/(1 + \delta) \]

where \(r_t\) is the real interest rate linking the two periods. If we take logs we get an approximation to this relationship

\[ \ln C_{t+1} - \ln C_t = \frac{1}{1-\beta} (r_t - \delta). \]

Interpret this equation. Use the data you are given to regress the change in log consumption on the interest rate and a constant. How do you interpret your estimate of the coefficient on \(r_t\)?
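Once the equations in 5.1 have been estimated, the structural parameters can be recovered from the OLS coefficients. The sketch below shows the mapping implied by the two estimating equations above. The function names are hypothetical, and the numerical coefficients are made-up placeholders used only to exercise the functions; they are not estimates from cexp.dta.

```python
def permanent_income_params(b_const, b_lag, b_income):
    """Map OLS coefficients of C_t on a constant, C_{t-1} and Y_t
    into (lambda, kappa, gamma), using
        b_const = lambda*gamma, b_lag = 1 - lambda, b_income = kappa*lambda.
    """
    lam = 1 - b_lag
    kappa = b_income / lam
    gamma = b_const / lam
    return lam, kappa, gamma


def implied_beta(b_r):
    """Coefficient on r_t in the log consumption growth regression is
    1/(1 - beta), so beta = 1 - 1/b_r."""
    return 1 - 1 / b_r


# Placeholder coefficients (hypothetical, for illustration only).
lam, kappa, gamma = permanent_income_params(b_const=2.0, b_lag=0.6, b_income=0.36)
beta = implied_beta(0.5)
print(f"lambda = {lam:.2f}, kappa = {kappa:.2f}, gamma = {gamma:.2f}, beta = {beta:.2f}")
```

The coefficient on lagged consumption therefore measures how slowly perceptions of permanent income adjust, while the coefficient on the interest rate measures the willingness to substitute consumption across periods.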
5.2 You are studying investment in the London pizza delivery sector. You have data on current investment, the previous period's capital stock and sales revenue of 700 firms.

(a) Suppose that there is a constant returns to scale Cobb-Douglas production function relating pizzas produced \(Q_{it}\) to capital stock \(K_{it}\) and labour input \(L_{it}\)

\[ Q_{it} = A K_{it}^{\rho} L_{it}^{1-\rho} \]

where \(A\) and \(\rho\) are parameters. Suppose the pizza delivery market is competitive, so that each firm aims to maximise profits given the price of pizzas \(p_t\) and the per unit costs of capital \(r_t\) and labour \(w_t\). Show that desired capital stock is

\[ K_{it} = \rho p_t Q_{it} / r_t. \]

(b) Suppose that firms replace depreciation of the existing capital stock, which is given by \(\delta K_{i,t-1} + e_{it}\), where \(K_{i,t-1}\) is capital stock one period earlier, \(\delta\) is a depreciation parameter and \(e_{it}\) is an error term, and purchase new capital to fill in a proportion \(\lambda\) of the gap between desired capital stock and that carried over from the previous period. Show that total investment is

\[ I_{it} = \lambda\rho \frac{p_t Q_{it}}{r_t} + (\delta - \lambda) K_{i,t-1} + e_{it}. \]

(c) You regress investment \(I_{it}\) on current sales revenue \(S_{it}\) and capital stock at the end of the previous period \(K_{i,t-1}\). The estimated equation is

\[ I_{it} = \underset{(2.297)}{0.312}\, S_{it} - \underset{(1.754)}{0.472}\, K_{i,t-1} + e_{it} \]

where \(e_{it}\) is the error term. Figures in parentheses are absolute t values. Explain why it is not possible to infer values of the parameters \(\lambda\), \(\delta\) and \(\rho\) without further information. You are told to assume that \(\delta = 0.100\) and \(r_t = 1.00\). What are your estimates of \(\lambda\) and \(\rho\)? Would you reject \(\lambda = 1\)?

(d) Comment on the direction of causation between \(S_{it}\) and \(I_{it}\) and any problems that may be raised regarding interpretation of the coefficient estimates.
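The parameter recovery in 5.2(c) can be sketched as below. It assumes that sales revenue equals price times quantity, that the coefficient on lagged capital is -0.472 (the negative sign is an assumption), and that the depreciation rate is treated as known, so that the standard error of the estimate of lambda equals that of the capital coefficient. The 1.96 threshold is the large-sample 5% critical value. Variable names are illustrative.

```python
# Reported estimates for Exercise 5.2(c); the sign on capital is an assumption.
b_sales, t_sales = 0.312, 2.297
b_capital, t_capital = -0.472, 1.754
delta, r = 0.100, 1.00  # values the exercise tells us to assume

# From the investment equation in (b), with sales revenue S = p*Q:
#   b_sales = lambda*rho/r   and   b_capital = delta - lambda
lam = delta - b_capital
rho = b_sales * r / lam

# Test of lambda = 1, i.e. b_capital = delta - 1, with delta taken as known.
se_capital = abs(b_capital) / t_capital
t_lambda_one = (b_capital - (delta - 1)) / se_capital
reject_at_5pct = abs(t_lambda_one) > 1.96

print(f"lambda = {lam:.3f}, rho = {rho:.3f}")
print(f"t statistic for lambda = 1: {t_lambda_one:.3f}, reject: {reject_at_5pct}")
```

Without values for the depreciation rate and the cost of capital, the two estimated coefficients cannot pin down three parameters, which is the identification point raised in the question.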