EXERCISES

Exercise 1 (Chapter Two)

1.1: Household Characteristics

Open c:\intropov\data\hh.sav, which contains household-level variables. There are 496 households in the file. Each column corresponds to a variable and each row to an observation (a household). All the variables in the data are described in Appendix 1. One of them is called weight; this is the weight given to each household. From this weight, we can calculate the population weight by multiplying weight by the size of the household (see also Chapter 2 of the Poverty Manual).

Answer the following questions:

(a) Generate the population weight called pop.

(b) Compare the total number of households and the sum of population.

(c) There are four regions in the survey: Dhaka, Chittagong, Khulna, and Rajshahi. region is a string variable. Recode these regional names into a different variable called reg1, coding Dhaka, Chittagong, Khulna, and Rajshahi as 1, 2, 3, and 4, respectively. To convert the string variable into a numeric one, type the following commands in the Syntax Editor.

    STRING reg1 (A8).
    IF (region = "Dhaka") reg1 = "1".
    IF (region = "Chittagong") reg1 = "2".
    IF (region = "Khulna") reg1 = "3".
    IF (region = "Rajshahi") reg1 = "4".
    VARIABLE LABELS reg1 'four regions (numeric)'.
    EXECUTE.
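As a cross-check outside SPSS, questions (a) to (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three household records are made up, and the household size is assumed to be stored in a variable called famsize (the name used later in these exercises).

```python
# Hypothetical household records; the values are made up for illustration.
households = [
    {"hcode": 1, "region": "Dhaka", "weight": 25.0, "famsize": 4},
    {"hcode": 2, "region": "Khulna", "weight": 30.0, "famsize": 6},
    {"hcode": 3, "region": " Rajshahi ", "weight": 20.0, "famsize": 5},
]

# Numeric codes as specified in question (c).
REGION_CODES = {"Dhaka": 1, "Chittagong": 2, "Khulna": 3, "Rajshahi": 4}


def add_pop_and_reg1(rows):
    """Return copies of the rows with pop = weight * famsize and a numeric reg1."""
    coded = []
    for row in rows:
        new_row = dict(row)
        # Population weight: household weight times household size.
        new_row["pop"] = row["weight"] * row["famsize"]
        # strip() guards against padded strings such as " Rajshahi ".
        new_row["reg1"] = REGION_CODES[row["region"].strip()]
        coded.append(new_row)
    return coded


coded = add_pop_and_reg1(households)
total_households = sum(row["weight"] for row in coded)  # households in the population
total_population = sum(row["pop"] for row in coded)     # people in the population
print(total_households, total_population)
```

The sum of weight estimates the number of households in the population, while the sum of pop estimates the number of people, which is the comparison asked for in (b).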

Having executed the SPSS commands above, go to the Variable View tab and click on Type to change reg1 to a numeric variable. The string variable region has now been recoded into the numeric variable reg1. Fill in the following table:

    Household characteristics                          Region 1  Region 2  Region 3  Region 4  Total
    Number of households in the population
    Total population
    Average distance of a household to paved road
    Average distance of a household to nearest bank
    % of households with electricity
    % of households with toilet
    % of population with electricity
    % of population with toilet
    Average household size

Can you conclude from the results that one region is more affluent than another? Explain why.

(d) Household characteristics also vary with the gender of the household head. Compute the means of the variables in the following table.

    Household characteristics                                     Male-headed   Female-headed
                                                                  households    households
    Average household size
    Average years of schooling of household head
    Average household assets
    Average household land holding
    Number of households in the population
    Sample size
    Ratio of sample households to households in the population

Do you think the female-headed households are underrepresented in the sample? [Hint: Compare the ratio of sample households to households in the population for male- and female-headed households.] Can you still conclude whether the female-headed households are more (or less) educated or less (or more) affluent than their counterparts? Discuss.

1.2: Individual Characteristics

Open c:\intropov\data\ind.sav. This file has information on individual members of households. Sort this file by hcode and then merge it with the household-level data (c:\intropov\data\hh.sav). Remember that the household-level data is also sorted by hcode. The result is a new merged data set.

Merging files involves more steps in SPSS than in STATA. In the merged file, you will find many missing values represented by dots (.). This happens because the household-level data set has fewer observations than the individual-level one. For instance, the age of the household head will be given for the first member of the household, who is the head of the household, and dots will appear for the other members of the same household. SPSS does not automatically fill the dots with the value recorded for the first member of the household, whereas STATA does. In SPSS, you need to fill the dots in two steps.

- The first step is to use Split File by hcode. This groups the data by household code. Note that a Split File On sign will appear in the right corner of the Data Editor window.

- The second step is to use Replace Missing Values from the Transform menu. Replace the dots with the series mean of the variable in which you are interested; within each household, this mean equals the single value recorded for the first member.

    SORT CASES BY hcode.
    SPLIT FILE LAYERED BY hcode.

    RMV /famsize = SMEAN(famsize) /toilet = SMEAN(toilet).

Having gone through these two steps, we will have a merged file without any missing values or dots. Save this file as c:\intropov\data\hsurvey.sav. Complete the following questions.

(a) Answer the following questions and discuss how the results vary across regions.

    Regional variation                                       Region 1  Region 2  Region 3  Region 4  Total
    Average years of education of the population
    % of female population
    % of working population (with positive working hours)
    % of working population (working in farm)

Are the results very different between males and females?

    Gender differences                    Male   Female   Total
    Average years of schooling
    Average age
    Average working hours
    Average working hours in farm
    Average working hours in nonfarm

[Note: The individual file also has a weight variable, which is in fact the household weight, so that the total weight is equal to the total population. A detailed discussion of weights is provided in Chapter 2 of the Poverty Manual.]
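The merge described in 1.2 attaches each household-level value to every member of that household. The following Python sketch illustrates that idea outside SPSS. It is not the SPSS procedure itself, and the records and the age variable are hypothetical.

```python
# Hypothetical household-level data keyed by hcode.
households = {
    101: {"famsize": 3, "toilet": 1},
    102: {"famsize": 2, "toilet": 0},
}

# Hypothetical individual-level data: one record per household member.
individuals = [
    {"hcode": 102, "age": 60},
    {"hcode": 101, "age": 45},
    {"hcode": 101, "age": 40},
    {"hcode": 102, "age": 58},
    {"hcode": 101, "age": 12},
]


def merge_household_onto_individuals(ind_rows, hh_by_code):
    """Sort individuals by hcode and copy the household values onto each member."""
    merged = []
    for person in sorted(ind_rows, key=lambda r: r["hcode"]):
        row = dict(person)
        # Every member receives the same household values, so no dots remain.
        row.update(hh_by_code[person["hcode"]])
        merged.append(row)
    return merged


merged = merge_household_onto_individuals(individuals, households)
for row in merged:
    print(row)
```

Each member of household 101 ends up with famsize 3 and each member of household 102 with famsize 2, which is what the Split File and Replace Missing Values steps achieve in SPSS.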

1.3: Expenditure

Open the data c:\intropov\data\consume.sav. It contains the quantity of and expenditure on each food item at the household level. Note that total expenditure (hhexp) is the sum of total food (expfd) and non-food (expnfd) expenditures for a household. These are monthly household expenditures. To get per capita expenditure per month, hhexp, expfd, and expnfd therefore have to be divided by the size of the household. The household size is not in c:\intropov\data\consume.sav but in another file, c:\intropov\data\hh.sav, so the two files have to be merged. Sort the c:\intropov\data\consume.sav data by hcode and then merge it with c:\intropov\data\hh.sav.

(i) Compute per capita food expenditure (pcfood), per capita non-food expenditure (pcnfood), and per capita total expenditure (pcexp).

(ii) Repeat (i) with weight and without weight. How do they differ? (Note: When weighted, population should be used as the weight, where population weight = weight given to each household × size of household.) *

(iii) Which region has the highest and the lowest per capita food and per capita total expenditure?

(iv) Does per capita total expenditure differ between male-headed and female-headed households?

(v) Do you find any positive correlation between the number of years of schooling of the household head and per capita total expenditure? Explain the answer by using a graph.

(vi) Is per capita total expenditure declining with the size of the household? Generate the square of household size.

    COMPUTE size_sq = famsize**2.
    EXECUTE.

* The two estimates can differ widely. The correct procedure is to use the population weight.
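To see why the weighted and unweighted answers in question (ii) of 1.3 can differ, consider the following minimal Python sketch. The two households are hypothetical, and the variable names hhexp, famsize, and weight are taken from the exercises.

```python
def weighted_mean(values, weights):
    """Mean of values with each value counted in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Two hypothetical households with the same sampling weight but different sizes.
households = [
    {"hhexp": 6000.0, "famsize": 3, "weight": 10.0},
    {"hhexp": 8000.0, "famsize": 8, "weight": 10.0},
]

# Per capita total expenditure: household expenditure divided by household size.
pcexp = [h["hhexp"] / h["famsize"] for h in households]
# Population weight: household weight times household size.
pop = [h["weight"] * h["famsize"] for h in households]

unweighted = sum(pcexp) / len(pcexp)
weighted = weighted_mean(pcexp, pop)
print(round(unweighted, 1), round(weighted, 1))
```

In this made-up case the larger household is also the poorer one in per capita terms, so weighting by population lowers the mean. The unweighted mean treats both households as one unit each, regardless of how many people they represent.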

Exercise 2 (Chapter Three)

The focus of this part of the exercises is on constructing a poverty line. The poverty line specifies the society's minimum standard of living to which everyone should be entitled. The poverty line, used as a yardstick to identify the poor, is thus the baseline for any poverty analysis. Once the poverty line is determined, one can construct poverty profiles, the distribution of poverty across sectors, geographical regions, and socioeconomic groups, and a comparison of key characteristics of the poor with those of the non-poor.

This exercise discusses three methods that have been used to derive the poverty line in Bangladesh: direct calorie intake, food energy intake, and cost of basic needs. These three methods will be exercised in turn. (Note: The food basket considered for the healthy survival of a typical family in rural Bangladesh is the same as the one used in the STATA exercise.)

2.1: Direct Calorie Intake

The file c:\intropov\data\consume.sav provides information on the quantities of 10 food items consumed by the households included in the data. Note that potato and other vegetables are lumped together into one item called veg. These quantities can be converted into calories using the food calorie conversion factors provided in the table below. Note also that the quantities in the data are expressed in kg per week and thus have to be converted into grams per day. Based on the calories of the food basket and the quantities of food consumed by each household in the survey, we can obtain the household's calorie intake. In order to get per capita calorie intake, we need to merge c:\intropov\data\consume.sav with c:\intropov\data\hh.sav, which has the size of the household. Generate a variable called pccal indicating per capita calorie intake.

    Food items         Per capita normative          Average rural
                       daily requirements            consumer price
                       Calorie    Quantity (gm)      (taka/kg)
    Rice                1386         397                15.19
    Wheat                139          40                12.81
    Pulse                153          40                30.84
    Milk (cow)            39          58                15.9
    Oil (mustard)        180          20                58.24
    Meat (beef)           14          12                66.39
    Fish                  51          48                46.02
    Potato                26          27                 8.18
    Other vegetable       26         150                38.3
    Sugar                 82          20                30.49
    Fruits                 6          20                28.86
    Total               2112

Classify an individual as poor if his or her per capita calorie intake is less than the nutritional requirement of 2112 calories per day, and as non-poor otherwise. Create a new variable called z_dci equal to 100 if the household is poor and zero otherwise. Save the file as c:\intropov\data\pline.sav.

    IF (pccal < 2112) z_dci = 100.
    IF (pccal >= 2112) z_dci = 0.
    EXECUTE.

Discuss the percentage of poor by region. Which region is the poorest?

2.2: Food-Energy Intake Method

The food energy intake (FEI) method is simple. Since separate poverty lines are estimated for each region, it takes into account the differences in regional costs of living and food preferences. A classic FEI method has been proposed by Greer and Thorbecke

(1986). They provide a method that computes the food poverty line at which an individual's food energy intake is just sufficient to satisfy his or her calorie requirement per day. Their proposed cost-of-calorie function is

    $\ln(E) = a + bC + u$

where E is the per capita total expenditure, C is the number of calories obtained from the food basket, and u is the error term. Once the equation is estimated, we are able to construct a poverty line for each region. Since the calorie requirement is the same for all regions at 2112, the poverty line is estimated separately for each region as

    $\text{pline} = \exp(\hat{a} + \hat{b} \times 2112)$

where exp stands for the exponential function and $\hat{a}$ and $\hat{b}$ are the coefficient estimates from the log equation above.

We now apply this methodology to the data. Open c:\intropov\data\pline.sav.

(i) Generate the logarithm of per capita total expenditure.

    COMPUTE lpcexp = LN(pcexp).
    EXECUTE.

(ii) Regress the log of per capita total expenditure (lpcexp) on per capita calorie intake (pccal). Use the weighted least squares method, where the weight is the population.

    REGRESSION
      /REGWGT = pop
      /STATISTICS COEFF OUTS R ANOVA
      /DEPENDENT lpcexp
      /METHOD=ENTER pccal.

(iii) What are the estimates of the slope and the constant term?

(iv) Create a variable called feipline, which is equal to the exponential of (estimated constant + estimated slope multiplied by 2112). In the command below, replace a and b with the numerical estimates of the constant and the slope obtained in (iii).

    COMPUTE feipline = EXP(a + b*2112).
    EXECUTE.

Other than this method, there is a simpler way of calculating a poverty line under the FEI method. The steps are as follows:

(i) Obtain the weighted mean of per capita total expenditure (with weight = pop) within the range where per capita calorie intake lies between its lower bound (= 2112*0.9) and its upper bound (= 2112*1.1).

(ii) Name this weighted average of per capita total expenditure feipline.

(iii) Create a variable feipoor = 100 if feipline is greater than per capita total expenditure and zero otherwise.

    COMPUTE feipoor = 0.
    EXECUTE.
    IF (feipline > pcexp) feipoor = 100.
    EXECUTE.

(iv) Compute the percentage of poor by region. Which region is the poorest by the FEI method?

2.3: Cost of Basic Needs

Rowntree's (1901) approach to specifying poverty lines, based on the concept of physical efficiency, measures poverty in terms of a lack of command over the basic consumption needs essential for maintaining physical efficiency. This approach is known as the cost of basic needs (CBN) method of constructing poverty lines. This method involves determining

food and non-food costs of basic consumption baskets; adding up the two costs then gives the poverty line. We provide exercises on the food poverty line and the non-food poverty line separately.

A: Food Poverty Line

First of all, we choose the basket of a reference group. Open c:\intropov\data\hh.sav and merge it with c:\intropov\data\consume.sav after sorting the files by hcode. Call this merged file c:\intropov\data\exp.sav. Create per capita food, non-food, and total expenditure. Generate the cumulative proportion of population (cpop), whose last value must be one (how to create cpop is explained in Exercise 5). Create a reference group consisting of the bottom 20 percent in the distribution of per capita total expenditure. Type the following commands in the Syntax Editor:

    COMPUTE ref = 0.
    IF (cpop <= 0.2) ref = 1.
    EXECUTE.

Having defined the reference group, merge c:\intropov\data\hsurvey.sav with c:\intropov\data\vprice.sav. In this case, sort both data sets by the variables thana and vill in ascending order before merging them. The file c:\intropov\data\vprice.sav contains village-level price information on all 11 food items.

Under the cost of basic needs method, it is assumed that all individuals belonging to the bottom 20 percent nationally enjoy the same standard of living but have different consumption patterns. Given the basket of total expenditure and calories and the average prices of each food item in the basket for the reference group, compute the quantity of each food item in the basket. Convert the quantities into calories by using the calorie conversion factors. Make sure that the units are converted correctly. Compute the cost per calorie by dividing the average per capita expenditure on the food basket by the average per capita total calories of the basket for the reference group. Create a variable called costcal for the cost per

calorie. Calculate the food poverty line by multiplying costcal by 2112. Note that there is only one food poverty line in this case.

(i) What is the cost per calorie for the reference group?

(ii) Create 5 different quintile groups. Compare the cost per calorie for each of these quintiles.

(iii) What is the monthly food poverty line? Label this food poverty line fline. Save this file as c:\intropov\data\pline.sav.

B: Non-food Poverty Line

Parametric methods of setting non-food poverty lines can be readily estimated using a food-share Engel curve in regression form, as illustrated in Poverty Lines in Theory and Practice by M. Ravallion (1999). In this exercise, we practice nonparametric ways of defining non-food poverty lines, which do not impose a functional form on the Engel curve. We will illustrate the construction of both the upper and the lower poverty line.

(i) Open the file c:\intropov\data\pline.sav.

(ii) Arrange per capita food expenditure in ascending order. This is an important step; otherwise, you will get an incorrect result.

    SORT CASES BY pcfd.

(iii) Compute the weighted average of per capita non-food expenditure for those households whose per capita food expenditure lies within plus or minus 10 percent of the food poverty line.

    COMPUTE filter_$ = (pcfd >= fline*0.9 & pcfd <= fline*1.1).
    VARIABLE LABEL filter_$ 'pcfd>=fline*0.9 & pcfd<=fline*1.1 (FILTER)'.
    VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
    FORMAT filter_$ (f1.0).
    FILTER BY filter_$.
    EXECUTE.
    WEIGHT BY pop.
    DESCRIPTIVES VARIABLES = pcnfd
      /STATISTICS=MEAN.

(iv) Call this mean of per capita non-food expenditure (which is weighted by the population weight) nfline1.

(v) Compute the upper poverty line (upline) by summing the food poverty line and nfline1.

We can apply the same approach to setting the lower poverty line, with the difference that we compute the non-food expenditure of households in the neighborhood of the point where per capita total expenditure is equal to the food poverty line. Answer the following questions. (Note: In this case, per capita total expenditure has to be sorted in ascending order.)

(i) What is the non-food poverty line obtained from this method?

(ii) Compute the lower poverty line (cbnpline).

(iii) Compare the upper poverty line with the lower poverty line. Which one would you use? Why?

(iv) Calculate the incidence of poverty using the upper poverty line and the lower poverty line. How are they different? Discuss.

We have constructed poverty lines based on the three different methods described above. Discuss how the percentage of people living below the poverty line differs across the three methodologies. Also discuss which method you would adopt in setting an official poverty line for your own country.
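The nonparametric upper poverty line described in part B can be sketched in Python as a cross-check. This is a minimal illustration under assumptions: the four households and the food poverty line of 500 are hypothetical, and only the variable names pcfd, pcnfd, pop, and fline come from the exercise.

```python
def weighted_mean(values, weights):
    """Mean of values with each value counted in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def upper_poverty_line(rows, fline):
    """Return (nfline1, upline) using households within 10 percent of fline."""
    lower, upper = fline * 0.9, fline * 1.1
    near = [r for r in rows if lower <= r["pcfd"] <= upper]
    if not near:
        raise ValueError("no household has pcfd within 10 percent of fline")
    # Population-weighted mean of non-food spending near the food poverty line.
    nfline1 = weighted_mean([r["pcnfd"] for r in near], [r["pop"] for r in near])
    return nfline1, fline + nfline1


# Hypothetical households: per capita food and non-food spending, population weight.
rows = [
    {"pcfd": 380.0, "pcnfd": 100.0, "pop": 50.0},  # below the band
    {"pcfd": 460.0, "pcnfd": 120.0, "pop": 40.0},  # inside the band
    {"pcfd": 540.0, "pcnfd": 180.0, "pop": 20.0},  # inside the band
    {"pcfd": 700.0, "pcnfd": 400.0, "pop": 30.0},  # above the band
]

nfline1, upline = upper_poverty_line(rows, fline=500.0)
print(nfline1, upline)
```

For the lower poverty line, the same function could be reused with the selection made on per capita total expenditure instead of pcfd, as the text describes.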

Exercise 3 (Chapter Four)

3.1: Getting Started

Open the data file c:\intropov\data\example.sav. The file contains individual consumption information for three countries. The figures are all monthly consumption. All three countries have 10 citizens.

(i) Compare the means of consumption for the three countries.

(ii) Suppose that a poverty line is set at 126 per month. Given this poverty line, compute the following poverty estimates for each country.
    a: the head-count index
    b: the poverty gap index
    c: the squared poverty gap index (or the severity of poverty index)

(iii) Repeat (ii) when the poverty line is 130. Which country has the highest poverty? Why?

(iv) Why would you use the poverty gap index and the squared poverty gap index rather than the head-count index, even though the latter is extremely simple and widely used?

3.2: Poverty Measures

Now we work with the data file c:\intropov\data\pline.sav. Make sure that you have the variables for per capita total expenditure (pcexp) and the poverty lines (fline and cbnpline) constructed by the cost of basic needs method.

(i) Compute four poverty measures, namely the head-count ratio, the poverty gap index, the squared poverty gap index (severity of poverty index), and the Watts measure, for per capita total expenditure, using both the food poverty line and the non-food poverty line derived from the cost of basic needs method.

The following program calculates the four poverty measures for the whole population.

    IF (pcexp < pline) hcount = 100.
    IF (pcexp >= pline) hcount = 0.
    EXECUTE.
    COMPUTE gap = hcount*(pline-pcexp)/pline.
    COMPUTE severity = hcount*((pline-pcexp)/pline)**2.
    COMPUTE Watts = hcount*(LN(pline)-LN(pcexp)).
    EXECUTE.
    WEIGHT BY pop.
    DESCRIPTIVES VARIABLES=hcount gap severity Watts
      /STATISTICS=MEAN.

(ii) Estimate the incidence of poverty, the poverty gap index (PGR), and the severity of poverty index (FGT) for specific subgroups using the food poverty line and the total poverty line.

                                                Headcount index   PGR   FGT
    (a) 4 Regions
    (b) Male-headed households
    (c) Female-headed households
    (d) Households with more than 5 members
    (e) Households with 5 or fewer members

(iii) Poverty calculations are based on a sample of households rather than the population. Thus, we must compute the standard error of each poverty measure. When poverty measures have large standard errors, small changes in poverty may be statistically insignificant and should be interpreted carefully.

To compute corrected standard errors, we suggest two methods. One way is simply to divide the standard deviation of a poverty measure by the square root of the sample size (n = 496 in our

example). Go to Analyze and choose Descriptive Statistics. Alternatively, type the following command in the Syntax Editor:

    WEIGHT BY pop.
    DESCRIPTIVES VARIABLES = HCOUNT PGR FGT
      /STATISTICS = MEAN STDDEV.

Having obtained the standard deviations of the poverty measures, simply divide them by the square root of 496 to get their corrected standard errors.

The other method is to adjust the population weight to take the sample size into account. Type the following commands in the Syntax Editor:

    COMPUTE pop1 = (n/spop)*pop.
    EXECUTE.
    WEIGHT BY pop1.
    DESCRIPTIVES VARIABLES = HCOUNT PGR FGT
      /STATISTICS = MEAN SEMEAN.

where n, spop, and SEMEAN are the size of the sample, the sum of population, and the standard error of the mean, respectively. In our example, n is equal to 496 and spop is equal to 13280. Having adjusted the population weight for the size of the sample, simply compute the poverty measures and their standard errors, weighted by the adjusted population. Having computed these, fill in the following table.

                         Headcount ratio   Poverty gap ratio   FGT ratio
    Region 1
    (Standard errors)
    Region 2
    (Standard errors)
    Region 3
    (Standard errors)
    Region 4
    (Standard errors)
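The poverty measures of Exercise 3 and the first standard error method (standard deviation divided by the square root of the sample size) can be illustrated with a small Python sketch. It is deliberately simplified: the ten consumption figures are hypothetical, every person has the same weight, and the sample standard deviation (divisor n - 1) is assumed.

```python
import math
import statistics


def poverty_terms(pcexp, pline):
    """Per-person contributions, scaled by 100 as in the SPSS program above."""
    hcount = 100.0 if pcexp < pline else 0.0
    gap = hcount * (pline - pcexp) / pline
    severity = hcount * ((pline - pcexp) / pline) ** 2
    watts = hcount * (math.log(pline) - math.log(pcexp))
    return {"hcount": hcount, "gap": gap, "severity": severity, "watts": watts}


def means_and_standard_errors(values, pline):
    """Unweighted poverty measures and their standard errors (sd / sqrt(n))."""
    n = len(values)
    terms = [poverty_terms(v, pline) for v in values]
    means, std_errors = {}, {}
    for name in ("hcount", "gap", "severity", "watts"):
        column = [t[name] for t in terms]
        means[name] = sum(column) / n
        std_errors[name] = statistics.stdev(column) / math.sqrt(n)
    return means, std_errors


# Hypothetical monthly consumption of 10 people; poverty line of 126.
consumption = [100, 110, 120, 130, 150, 160, 180, 200, 220, 250]
means, std_errors = means_and_standard_errors(consumption, pline=126)
for name in means:
    print(name, round(means[name], 3), round(std_errors[name], 3))
```

With only ten observations the standard error of the head-count ratio is large relative to the estimate itself, which illustrates why small differences in poverty between groups should be interpreted carefully.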

Exercise 4 (Chapter Five)

4.1: Stochastic Dominance

There is no general consensus on the poverty line. Thus, it might be appropriate to measure poverty using all possible poverty lines in a given range. Note also that the choice of poverty measure can affect the direction of measured changes in poverty. Hence, it is useful to find conditions under which all members of a class of poverty measures give the same ranking. These issues are dealt with using the idea of stochastic dominance.

The first-order stochastic dominance test compares the percentage of poor across regions at each poverty line, that is, it compares the cumulative distribution functions of the regions. A simple way of testing first-order dominance for the four regions is to plot the percentage of poor on the vertical axis against the poverty lines on the horizontal axis.

    Percentage of poor
    Poverty line   Region 1   Region 2   Region 3   Region 4
    3000              7.08       0.62       1.80       2.31
    4000             12.04       9.58      11.78      13.93
    5000             27.38      27.27      42.89      32.36
    6000             45.25      49.84      58.44      48.52
    7000             57.34      61.40      74.04      65.90

    Poverty gap ratio
    Poverty line   Region 1   Region 2   Region 3   Region 4
    3000              0.40       2.39       5.90      11.17
    4000              0.09       1.38       4.88      10.42
    5000              0.29       1.77       6.68      13.77
    6000              0.38       2.08       6.34      12.14
    7000              0.31       1.96       6.02      11.95

We have calculated the head-count ratio and the poverty gap ratio for all four regions. These poverty measures are estimated for various poverty lines as shown in the table

above. The first-order dominance curve is the relationship between the poverty line (x-axis) and the corresponding head-count ratio (y-axis).

    PLOT
      /FORMAT = OVERLAY
      /PLOT = hc1 hc2 hc3 hc4 WITH pline.

After formatting the graph by interpolation, you obtain one curve per region.

[Figure: head-count ratio (y-axis) against the poverty line (x-axis), one line for each of the four regions.]

(i) Does one distribution dominate another?

(ii) Does any one of the lines cross another line?

(iii) Can you conclude from the graph that one region has a higher incidence of poverty than another region? Is it true for other poverty measures?

If the two curves do not intersect at all, we do not need to test for second- or third-order dominance, because first-order dominance implies higher poverty on the basis of all

poverty measures including the head-count ratio. Otherwise, we move on to testing second-order stochastic dominance. This is based on the relationship between the poverty line (x-axis) and the corresponding poverty gap ratio (y-axis), a curve that is also called the poverty deficit curve. If the second-order dominance condition is satisfied (that is, when the curves do not intersect), we can say unambiguously that poverty measured by the entire class of Foster, Greer and Thorbecke poverty measures, with the exception of the head-count ratio, will be higher in one region than in another.

Given the table above, simply plot the poverty gap ratio (y-axis) against the poverty lines (x-axis) for each of the regions. Repeat questions (i), (ii) and (iii).

If the poverty deficit curves also intersect, then we move on to third-order stochastic dominance, which is based on the relationship between the poverty line (x-axis) and the severity of poverty (or squared poverty gap ratio).
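Besides reading the graph, the first-order comparison can be checked numerically. The Python sketch below uses the percentage-of-poor table from this exercise; the function names are hypothetical. Note the limitation: it only compares the curves at the five tabulated poverty lines, so a crossing between two tabulated lines would go unnoticed.

```python
# Head-count ratios from the percentage-of-poor table, by poverty line.
poverty_lines = [3000, 4000, 5000, 6000, 7000]
headcount = {
    "Region 1": [7.08, 12.04, 27.38, 45.25, 57.34],
    "Region 2": [0.62, 9.58, 27.27, 49.84, 61.40],
    "Region 3": [1.80, 11.78, 42.89, 58.44, 74.04],
    "Region 4": [2.31, 13.93, 32.36, 48.52, 65.90],
}


def lies_below(a, b):
    """True if curve a is never above curve b and is strictly below it somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def curves_cross(a, b):
    """True if neither curve lies below the other at every tabulated line."""
    return a != b and not lies_below(a, b) and not lies_below(b, a)


regions = sorted(headcount)
for i, first in enumerate(regions):
    for second in regions[i + 1:]:
        a, b = headcount[first], headcount[second]
        if lies_below(a, b):
            verdict = f"{first} has less poverty at every tabulated line"
        elif lies_below(b, a):
            verdict = f"{second} has less poverty at every tabulated line"
        else:
            verdict = "curves cross"
        print(first, "vs", second, "->", verdict)
```

The same two functions can be applied to the poverty gap ratios to examine second-order dominance for the pairs whose head-count curves cross.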

Exercise 5 (Chapter Six)

5.1: Lorenz curve

The Lorenz curve is a simple device that has been used widely to describe and analyze data on income distribution. This curve has become important in recent times because it provides a useful method of ranking income distributions from the welfare point of view. The Lorenz curve is defined as the relationship between the proportion of people with income less than or equal to a specified amount and the proportion of total income received by those people.

More generally, the Lorenz curve is represented by a function L(p), which is interpreted as the fraction of total income received by the bottom pth fraction of people, when the people are arranged in ascending order of their incomes. The curve is drawn in a unit square. Thus, if p = 0, L(p) = 0 and if p = 1, L(p) = 1. The slope of the curve is positive and increases monotonically: the curve is convex to the p axis. From this, it follows that $p \geq L(p)$. The straight line represented by the equation L(p) = p is called the egalitarian line.

In constructing the Lorenz curve, we need to compute the cumulative proportions of per capita total expenditure and of population, with households arranged in ascending order of per capita expenditure. The following commands do this, given that you have computed the mean of per capita expenditure (mpcexp) and the sum of population (spop).

    COMPUTE cpcexp = pop*pcexp/mpcexp/spop.
    COMPUTE cpop = pop/spop.
    EXECUTE.
    SORT CASES BY pcexp (A).
    CREATE cpcexp = CSUM(cpcexp)
      /cpop = CSUM(cpop).

Note that cpcexp, cpop and CSUM are the cumulative proportion of per capita expenditure, the cumulative proportion of population and the cumulative sum function, respectively.

Check point: Are the last values of cpcexp and cpop equal to 1? If not, there is a problem.

After having created cpcexp and cpop, we need the following commands:

    COMPUTE p = cpop-pop/spop/2.
    COMPUTE q = cpcexp-pcexp*pop/spop/mpcexp/2.

Note that we have just made a continuity correction: p and q are evaluated at the midpoint of each household's interval.

(i) Go to the Graphs menu and select Interactive and Line. Graph q on the vertical axis and p on the horizontal axis. Does the Lorenz curve have a positive slope? Is the curve convex to the p axis? Can you say that its slope is increasing monotonically?

(ii) Construct the Lorenz curve for Dhaka and Chittagong. Does one curve lie above the other? Which one is closer to the egalitarian line? Can you conclude that the distribution of expenditure in the Dhaka region is more equal than in the Chittagong region? Discuss.

(iii) Do these two Lorenz curves intersect each other? If the two curves intersect, we cannot say that one region is more equal than the other. In this respect, the Lorenz curve provides only a partial ranking of distributions.

5.2: Inequality Measures

We exercise four inequality measures in this section: the Gini index, the generalized Gini index, the Atkinson measure, and Theil's index.

(i) Gini Index

Of all the inequality measures, the Gini index is the most widely used. It became popular because of its direct relationship with the Lorenz curve. The Gini index measures the extent to which the Lorenz curve departs from the egalitarian line. It is defined as twice the area between the Lorenz curve and the egalitarian line. This definition ensures that the value of the Gini index lies between zero (for complete equality) and one (for complete or most extreme inequality).

Having created the cumulative proportions of per capita total expenditure and population, the following commands generate the Gini index and the quintile shares.

    COMPUTE gini = 100*(1-2*q).
    EXECUTE.
    IF (p <= .20) quint = 1.
    IF (p > .20 and p <= .40) quint = 2.
    IF (p > .40 and p <= .60) quint = 3.
    IF (p > .60 and p <= .80) quint = 4.
    IF (p > .80) quint = 5.
    EXECUTE.
    COMPUTE share = 100*pcexp*pop/(spop*mpcexp).
    EXECUTE.

The Gini index (in percent) is the mean of gini weighted by pop, and the share of each quintile is the unweighted sum of share over the households in that quintile.

Compute the Gini index for the four regions. Which region is the most unequal among the four regions?
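The Lorenz curve construction of 5.1 and the Gini calculation above can be cross-checked with the following Python sketch. It follows the same steps (sort by per capita expenditure, cumulate the shares, apply the midpoint correction), but the function name and the tiny two-person data set are hypothetical.

```python
def lorenz_and_gini(pcexp, pop):
    """Return the Lorenz points (cpop, cpcexp) and the Gini index in percent."""
    pairs = sorted(zip(pcexp, pop))           # ascending per capita expenditure
    spop = sum(w for _, w in pairs)           # sum of population
    total = sum(x * w for x, w in pairs)      # equals spop * mpcexp
    cpop = 0.0
    cpcexp = 0.0
    gini = 0.0
    points = [(0.0, 0.0)]
    for x, w in pairs:
        pop_share = w / spop
        exp_share = x * w / total
        cpop += pop_share
        cpcexp += exp_share
        # Midpoint (continuity-corrected) ordinate of the Lorenz curve.
        q = cpcexp - exp_share / 2
        # Population-weighted mean of (1 - 2q) gives the Gini index.
        gini += pop_share * (1 - 2 * q)
        points.append((cpop, cpcexp))
    return points, 100 * gini


# Two people with expenditures 1 and 3: the Gini index should be 25 percent.
points, gini = lorenz_and_gini([3.0, 1.0], [1.0, 1.0])
print(points, gini)

# Equal expenditures: the Lorenz curve is the egalitarian line and Gini is zero.
equal_points, equal_gini = lorenz_and_gini([5.0, 5.0, 5.0], [2.0, 1.0, 3.0])
print(round(equal_gini, 10))
```

The last Lorenz point should always be (1, 1), which is the same check point suggested in 5.1 for cpcexp and cpop.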

(ii) Atkinson's Measure

The inequality measure proposed by Atkinson is

    $A = 1 - \dfrac{x^*}{\mu}$

which is in fact a measure of the loss of welfare caused by inequality in the society. $x^*$ is called the equally distributed equivalent level of income: the level of per capita income that, if received by everyone, would make the total welfare exactly equal to the total welfare generated by the actual income distribution, and $\mu$ is the mean income. With a homothetic utility function, Atkinson's index is equal to

    $A_\varepsilon = 1 - \dfrac{1}{\mu}\left[\sum_{i=1}^{n} f_i\, x_i^{1-\varepsilon}\right]^{1/(1-\varepsilon)}, \quad \varepsilon \neq 1$

    $A_\varepsilon = 1 - \dfrac{1}{\mu}\exp\left(\sum_{i=1}^{n} f_i \log_e(x_i)\right), \quad \varepsilon = 1$

where $f_i$ is the relative frequency of income $x_i$ and $\varepsilon$ is a measure of the degree of inequality aversion.

The following program can be used to compute Atkinson's measures for pcexp when $\varepsilon$ is 1, 1.5, and 2.

    COMPUTE lpcexp = LN(pcexp).
    COMPUTE pcexp1 = (pcexp)**(-0.5).
    COMPUTE pcexp2 = (pcexp)**(-1).
    EXECUTE.

Calculate the weighted mean (weight = pop) of lpcexp, pcexp1 and pcexp2 using the descriptive command:

    WEIGHT BY pop.
    DESCRIPTIVES VARIABLES = lpcexp pcexp1 pcexp2
      /STATISTICS = MEAN.

The following calculations give Atkinson's measures for each value of the relative aversion parameter.

    A(ε = 1)   = 1 - exp(mean lpcexp)/mpcexp.
    A(ε = 1.5) = 1 - (mean pcexp1)**(-2)/mpcexp.
    A(ε = 2)   = 1 - (mean pcexp2)**(-1)/mpcexp.

Work through the following example to compute Atkinson's inequality measure. Fill in the gaps.

    Households      Per capita    Relative    log(exp)   (1) x (3)   (exp)^(-0.5)   (exp)^(-1)
                    expenditure   frequency
                    (exp) (1)     (feq) (2)   (3)
    1                     1         0.03
    2                  1000         0.03
    3                  2000         0.03
    4                  3000         0.07
    5                  4000         0.17       8.29       33176        0.016          0.000
    6                  5000         0.2
    7                  6000         0.2
    8                 10000         0.17
    9                 12000         0.07       9.39      112712        0.009          0.000
    10                14000         0.03
    Weighted mean      6100                    8.36       54498        0.044          0.030

    Atkinson index (ε = 1):
    Atkinson index (ε = 1.5):  0.91
    Atkinson index (ε = 2):
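The Atkinson formulas can also be written as a short Python function, which is convenient for checking hand calculations. The sketch below uses a hypothetical two-group distribution rather than the table above, so that it does not replace the exercise; the function name is an assumption.

```python
import math


def atkinson(values, freqs, eps):
    """Atkinson index for incomes `values` with relative frequencies `freqs`."""
    mu = sum(f * x for f, x in zip(freqs, values))
    if eps == 1:
        # Equally distributed equivalent income is the geometric mean.
        ede = math.exp(sum(f * math.log(x) for f, x in zip(freqs, values)))
    else:
        power_mean = sum(f * x ** (1 - eps) for f, x in zip(freqs, values))
        ede = power_mean ** (1 / (1 - eps))
    return 1 - ede / mu


# Hypothetical distribution: half the people have 1000, half have 4000.
values = [1000.0, 4000.0]
freqs = [0.5, 0.5]

a1 = atkinson(values, freqs, 1)
a15 = atkinson(values, freqs, 1.5)
a2 = atkinson(values, freqs, 2)
print(round(a1, 4), round(a15, 4), round(a2, 4))

# With identical incomes there is no inequality and the index is zero.
print(round(atkinson([2500.0, 2500.0], freqs, 2), 10))
```

For this made-up distribution the index rises as the aversion parameter goes from 1 to 1.5 to 2, because a larger parameter places more weight on the incomes at the bottom of the distribution.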

Is Atkinson's inequality measure increasing as the inequality aversion parameter increases?

(iv) Theil's Index

Theil (1967) proposed two inequality measures that are based on the notion of entropy in information theory. The two entropy measures are defined as

    $T_0 = \log\mu - \int_0^\infty \log(x)\, f(x)\, dx$

    $T_1 = \dfrac{1}{\mu}\int_0^\infty x \log(x)\, f(x)\, dx - \log\mu$

where $\mu$ is the mean income and f(x) is the density function. Compute the two entropy measures from the table presented above. Is your result for $T_0$ equal to 0.36? Is your result for $T_1$ equal to 0.22?
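For grouped data such as the table in the Atkinson example, the integrals become weighted means: the integral of log(x) f(x) is the weighted mean of log(exp), and the integral of x log(x) f(x) is the weighted mean of exp × log(exp). Assuming that reading of the table's "Weighted mean" row, the two stated results can be checked with a few lines of Python.

```python
import math

# Values taken from the "Weighted mean" row of the Atkinson example table.
mean_exp = 6100.0            # weighted mean of exp, column (1)
mean_log_exp = 8.36          # weighted mean of log(exp), column (3)
mean_exp_log_exp = 54498.0   # weighted mean of exp * log(exp), column (1) x (3)

# Discrete counterparts of the two entropy measures.
t0 = math.log(mean_exp) - mean_log_exp
t1 = mean_exp_log_exp / mean_exp - math.log(mean_exp)

print(round(t0, 2), round(t1, 2))
```

Natural logarithms are assumed here, consistent with the log(exp) column of the table (for example, the natural log of 4000 is about 8.29).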

Exercise 6 (Chapter Seven)

Poverty profiles describe the nature and extent of poverty. They provide a breakdown of aggregate poverty according to various socioeconomic and demographic characteristics of households. They show how poverty varies across subgroups of society, such as regions, household size, age, etc. Poverty profiles can also show the impact of the sectoral and regional patterns of economic change on aggregate poverty.

6.1: Characteristics of the poor

Open the file c:\intropov\data\pline.sav. Note that people whose per capita expenditure is less than the per capita monthly poverty line defined by the cost of basic needs are classified as poor, and as non-poor otherwise. Answer the following questions.

                                                        Poor   Non-poor
    Average distance of a household to paved road
    Average distance of a household to nearest bank
    % households with electricity
    % households with sanitary toilets
    Average household assets
    Average household land holding
    Average household size
    % households headed by female
    % households headed by male
    Average years of schooling of head
    Average age of household head
    Average total working hours in farming
    Average total working hours in non-farming

Calculate the head-count ratio, the poverty gap ratio, and the severity of poverty by all the household characteristics shown above. Construct graphs for each of these subgroups. Discuss the poverty profile of rural Bangladesh based on your findings.
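The Poor and Non-poor columns of the table above are group means. A minimal Python sketch of that calculation follows. The four households and the poverty line of 500 are hypothetical, and the choice of the household weight (rather than pop) for a household-level characteristic is an assumption made for this illustration.

```python
def weighted_mean(values, weights):
    """Mean of values with each value counted in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def profile(rows, pline, field):
    """Weighted mean of `field` for poor and non-poor households."""
    groups = {"poor": [], "non-poor": []}
    for row in rows:
        key = "poor" if row["pcexp"] < pline else "non-poor"
        groups[key].append(row)
    return {
        key: weighted_mean([r[field] for r in members],
                           [r["weight"] for r in members])
        for key, members in groups.items()
        if members
    }


# Hypothetical households: per capita expenditure, size, and household weight.
rows = [
    {"pcexp": 300.0, "famsize": 6, "weight": 10.0},
    {"pcexp": 450.0, "famsize": 8, "weight": 30.0},
    {"pcexp": 700.0, "famsize": 4, "weight": 20.0},
    {"pcexp": 900.0, "famsize": 2, "weight": 20.0},
]

size_by_status = profile(rows, pline=500.0, field="famsize")
print(size_by_status)
```

Calling profile with a different field gives the corresponding row of the table, one row per household characteristic.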

Exercise 7 (Chapter Eight)

Suppose that we want to explain per capita total expenditure in terms of the socioeconomic and demographic household characteristics in the data. We estimate a regression model with the logarithm of per capita expenditure as the dependent variable. The explanatory variables can include:

- gender of household head
- age of household head
- age-square of household head
- size of household
- size-square of household
- education and employment status of household head
- access to basic infrastructure such as distance to a paved road or to a bank
- asset positions such as land holding
- region or urban/rural
- and other variables

We generate the variables that do not exist in the data.

    COMPUTE lpcexp = LN(pcexp).
    COMPUTE sq_age = age**2.
    COMPUTE sq_size = size**2.
    EXECUTE.

There are categorical variables in the regression model, such as gender and region. We have to convert these categorical variables into dummy variables. For instance, create a new variable equal to 1 if the head of household is male and 0 otherwise. Note that in the regression model, only one of the two gender dummies has to be included.

    IF (gender = 1) male = 1.
    IF (gender = 2) male = 0.
    EXECUTE.

Similarly, create regional dummy variables (reg1, reg2, and so on). There will be four dummies, yet only three of them should be included in the model.

To run a regression model, open the Syntax Editor, write the following command, and then click on the Run button to execute the analysis.

REGRESSION
  /DESCRIPTIVES
  /REGWGT=pop
  /STATISTICS COEFF OUTS R ANOVA
  /DEPENDENT lpcexp
  /METHOD=ENTER male age sq_age size sq_size edu road land reg1 reg2 reg3 (include more variables).

The REGRESSION command is used to produce both simple and multiple regression equations and associated statistics.

The /DESCRIPTIVES subcommand tells SPSS to produce descriptive statistics for all the variables included in the analysis. These statistics include means, standard deviations, a correlation matrix, and so on.

The /REGWGT subcommand indicates that the regression model is weighted (by population in our example).

The /STATISTICS subcommand produces the statistical results of the model, including R-square, adjusted R-square, sums of squares, degrees of freedom, estimated coefficients, t and F statistics, etc.

The /DEPENDENT subcommand identifies the dependent variable in the regression model. In this example, our dependent variable is the log of per capita expenditure.

The /METHOD subcommand must immediately follow the /DEPENDENT subcommand. It tells SPSS how you want your independent variables to be added to the regression equation. ENTER is the most direct method of building a regression equation; it tells SPSS simply to enter all the independent variables that you list.

If a dummy variable is included for every category of a categorical variable (for example, all four regional dummies), the model cannot be estimated by the ordinary least squares (OLS) method. In every observation such a set of dummies sums to 1, so there is perfect multicollinearity between the dummy variables and the constant term in the regression model. To overcome this problem, a constraint on the coefficients has to be imposed; the simplest such constraint is to leave one dummy out of each set, which fixes its coefficient at zero.

Another problem you have to deal with is heteroskedasticity. Since each sampled household has a different population weight attached to it because of the sampling design, the OLS method will give inefficient coefficient estimates. The coefficients in the model can, however, be estimated efficiently using the weighted least squares (WLS) method, where population is used as the weight. Thus, the model is estimated by the restricted weighted least squares method.

(i) What is the R-square of this model?
(ii) Do the signs of the coefficients match your hypotheses?
(iii) Are the coefficients significant at the 5% significance level?

It is a good idea to examine visually the scatter plot of the two variables when interpreting a regression analysis. To do so, you may need to type:

PLOT
  /FORMAT REGRESSION
  /PLOT lpcexp WITH (explanatory variables).

In addition, the scatter plot of the residuals against the fitted values will help you to see whether the model is a good fit. To carry out this task, we need additional subcommands in the regression. The names in parentheses after ZRESID and ZPRED are the names under which the standardized residuals and predicted values are saved, so that they can be referred to afterwards.

REGRESSION
  /DESCRIPTIVES
  /STATISTICS COEFF OUTS R ANOVA
  /DEPENDENT lpcexp
  /METHOD=ENTER male age sq_age size sq_size edu road land reg1 reg2 reg3 (include more variables)
  /SCATTERPLOT = (*ZRESID, *ZPRED)
  /SAVE ZRESID(zresid) ZPRED(zpred).

After having saved zpred and zresid, simply type the following command in the Syntax Editor:

PLOT
  /FORMAT REGRESSION
  /PLOT zresid WITH zpred.
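Some SPSS releases may not accept the older PLOT command. In that case, a scatter plot of the saved values can be sketched with the GRAPH command instead. The example below assumes that the standardized predicted values and residuals were saved under the names zpred and zresid; if no names were given on /SAVE, SPSS assigns default names such as ZPR_1 and ZRE_1, which should then be used in their place. The title text is an illustrative addition.

```spss
* Standardized residuals (vertical axis) against standardized predicted values.
GRAPH
  /SCATTERPLOT(BIVAR)=zpred WITH zresid
  /TITLE='Standardized residuals against standardized predicted values'.
```

A cloud of points with no visible pattern around zero suggests an adequate fit, whereas a funnel or curved shape points to heteroskedasticity or a misspecified functional form.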