Two-Sample Cross Tabulation: Application to Poverty and Child Malnutrition in Tanzania

Tomoki Fujii (Singapore Management University, tfujii@smu.edu.sg) and Roy van der Weide (World Bank, rvanderweide@worldbank.org)

December 5, 2008

Abstract

We apply small-area estimation to produce cross tabulations from two sample surveys. To account for the sampling error, we bootstrap the second sample, into which the household welfare indicator is imputed. We combine a consumption survey and a demographic and health survey (a setup applicable to many developing countries) to estimate the joint distribution of poverty and child malnutrition in Tanzania. We apply our method in two ways: one with a consumption model and the other with an anthropometric model. The results based on these two models are very close. Our method is applicable to many other issues.

1 Introduction

Cross tabulation is a useful descriptive tool that is routinely used in economics, sociology, political science and other disciplines. Cross tabulations are easy to produce and helpful for understanding the relationship between two discrete variables of interest. However, when the two variables are not observed in the same sample, cross tabulations cannot be produced by the standard methods. In this paper, we propose a simple method of producing cross tabulations when the two variables are included in two separate samples. We apply our method to the cross tabulation between poverty and child malnutrition in Tanzania. Our method can be applied to many other issues, including unemployment, disability, race, religion and education.

This paper is organized as follows. We develop the methodology in Section 2, followed by a description of the data in Section 3. The results are presented in Section 4, followed by brief discussions in Section 5.

2 Methodology

We apply the small-area estimation method developed by Elbers et al. (2002, 2003), hereafter called ELL-SAE, to produce cross tabulations from two samples. The ELL-SAE method was originally developed to produce spatially disaggregated estimates of poverty and other aggregate welfare indicators, which are often presented in the form of maps. These poverty maps have become popular among researchers and policy-makers, and have been produced for a number of countries.

The ELL-SAE method combines two samples. The first sample contains a household welfare indicator of interest $y_h$, such as the logarithm of consumption or income per capita, as well as its covariates $x_h$ for each household $h$, while the second sample contains only $x_h$. We require that $x_h$ in the two samples come from the same population.
In a standard application, a consumption model $y_h = x_h \beta + u_h$ is estimated by generalized least squares (GLS) with a consumption survey, where $u_h \equiv \eta_{m(h)} + \varepsilon_h$ is the error term, $m(\cdot)$ is the cluster membership function (the mapping from the household to the cluster it belongs to), $\eta$ is the homoskedastic cluster-specific component, and $\varepsilon$ is the heteroskedastic household-specific component. Both $\eta$ and $\varepsilon$ have zero expectation. The household welfare indicator is then imputed into each household in the second sample, which is usually a census.

Monte Carlo simulations are carried out to account for the model error (associated with the variance of the model coefficients) and the idiosyncratic error (associated with $u_h$). That is, letting $R$ be the number of simulations, the welfare indicator for household $h$ in the $r$-th round ($r \in \{1, \dots, R\}$) is imputed by the following formula:

$$\tilde{y}_h^{(r)} = x_h \beta^{(r)} + \eta_{m(h)}^{(r)} + \varepsilon_h^{(r)},$$

where $\beta^{(r)}$ is drawn from the estimated distribution of the GLS estimator $\hat{\beta}_{GLS}$. The two components of the error term are drawn from their empirical distributions, where $\eta^{(r)}$ is drawn for each cluster and $\varepsilon^{(r)}$ for each household.

Aggregate welfare indicators are then estimated with $\tilde{y}^{(r)}$. For example, the proportion of households under the poverty line in a set of households $K$ (e.g. a community) is calculated as

$$P_K^{(r)} \equiv \frac{1}{\#\{h \in K\}} \sum_{h \in K} \mathrm{Ind}(\tilde{y}_h^{(r)} < z),$$

where $\#\{\cdot\}$ is the counting measure and $z$ the logarithmic poverty line. The point estimate $\hat{P}_K$ and its standard error $\hat{\sigma}(\hat{P}_K)$ are obtained by taking the mean and standard deviation of $P_K^{(r)}$ over $r$.

When the second sample is not a census but a sampled survey, a similar simulation method can be used. However, each household in the second sample must be replicated by the number of households it represents to accurately evaluate the prediction errors. One of the few applications of the ELL-SAE in which the second sample is a sampled survey is Elbers et al. (2004).

Whether the second sample is a census or a sampled survey, all previous applications of ELL-SAE have produced only conditional estimates of aggregate welfare indicators. For example, $\hat{P}_K$ considered above is nothing but an estimate conditional on the household being in $K$. Conditional estimates are useful for comparing the aggregate welfare estimates across various indicators, such as the location of the household, the household size and the race of the household. However, we are often interested in the joint distribution as well.
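The simulation step described above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the PovMap implementation: the coefficients are redrawn as independent normals rather than from the full estimated GLS distribution, the heteroskedasticity of $\varepsilon$ is ignored, and all names and numbers (`simulate_poverty_rate`, `eta_pool`, `eps_pool`, the demo inputs) are hypothetical.

```python
import random
import statistics


def simulate_poverty_rate(households, beta_hat, beta_se, eta_pool, eps_pool,
                          z, n_rounds=200, seed=0):
    """Return (point estimate, standard error) of the poverty rate in K.

    households: list of (cluster_id, covariates) pairs for the set K.
    beta_hat, beta_se: hypothetical coefficient estimates and standard errors.
    eta_pool, eps_pool: hypothetical pools of cluster and household residuals.
    z: logarithmic poverty line.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(n_rounds):
        # Model error: redraw the coefficients (independent normals here,
        # a simplification of drawing from the full GLS distribution).
        beta = [rng.gauss(b, s) for b, s in zip(beta_hat, beta_se)]
        eta = {}  # one cluster effect per cluster and round
        n_poor = 0
        for cluster, x in households:
            if cluster not in eta:
                eta[cluster] = rng.choice(eta_pool)
            eps = rng.choice(eps_pool)  # one draw per household
            y_tilde = sum(b * v for b, v in zip(beta, x)) + eta[cluster] + eps
            n_poor += y_tilde < z
        rates.append(n_poor / len(households))
    # Mean and standard deviation over rounds give the estimate and its s.e.
    return statistics.mean(rates), statistics.stdev(rates)


# Made-up inputs: 20 clusters of 10 households, x = (1, household size).
demo_rng = random.Random(1)
households = [(c, [1.0, demo_rng.randint(1, 8)])
              for c in range(20) for _ in range(10)]
p_hat, p_se = simulate_poverty_rate(
    households, beta_hat=[9.5, -0.1], beta_se=[0.05, 0.01],
    eta_pool=[-0.2, 0.0, 0.2], eps_pool=[-0.6, -0.3, 0.0, 0.3, 0.6],
    z=8.9)
print(f"poverty rate: {p_hat:.3f} (s.e. {p_se:.3f})")
```

Drawing the cluster effect once per cluster and round, rather than once per household, is what keeps households in the same cluster correlated within each simulated round.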
For example, suppose we are interested in the relationship between poverty and child malnutrition. We may be interested in a conditional distribution such as $\mu_{P \mid M}$, that is, the proportion of poor households ($P$) among households with malnourished children ($M$). However, we may also be interested in the joint distribution of poverty and malnutrition such as $\mu_{PM}$, that is, the proportion of households that are poor and have malnourished children.

When the second sample is a census and includes the nutritional status of children, we can simply apply the multiplication rule. That is, we can estimate $\mu_{PM}$ at $\mu_M \hat{\mu}_{P \mid M}$, where $\mu_M$ is the proportion of households with malnourished children in the census. Its standard error $\sigma(\hat{\mu}_{PM})$ can be estimated at $\mu_M \hat{\sigma}(\hat{\mu}_{P \mid M})$.

When the second sample is a sampled survey, estimating the joint distribution becomes more complicated. First, $\mu_M$ must be replaced by its sample estimate $\hat{\mu}_M(S_2)$, where the argument $S_2$ makes explicit the dependence on the realization of the second sample. Second, $\mu_{P \mid M}$ is now dependent on $S_2$. As a result, $\hat{\mu}_{PM}$ is also dependent on $S_2$. Hence, we create a bootstrapped sample for the second sample in each round of the Monte Carlo simulation to simulate the distribution of $S_2$. In the $r$-th round of simulation, we estimate the joint distribution

$$\hat{\mu}_{PM}(S_2^{(r)}) = \hat{\mu}_M(S_2^{(r)}) \, \hat{\mu}_{P \mid M}(S_2^{(r)}),$$

where $S_2^{(r)}$ is the bootstrapped second sample. By estimating the joint distribution in each round, we can appropriately incorporate the sampling errors from the second sample.

3 Data

We use the Household Budget Survey (HBS) for 2000/01 and the Tanzania Demographic and Health Survey (TDHS) for 2004-05, details of which are given in National Bureau of Statistics (2002) and National Bureau of Statistics and ORC Macro (2005) respectively. These two surveys have a number of variables in common, including the age, sex and education of each household member, the housing characteristics and asset holdings. HBS contains a consumption measure, but not anthropometric measures. The opposite is true for TDHS, which has anthropometric measures but no consumption measure. We use the height-for-age Z-score, which is a widely used measure of the long-term nutritional status of children. As an auxiliary data set, we created the ward-level means of the Population and Housing Census for 2002, and merged them into the HBS and TDHS.

We follow the definition of poverty given by National Bureau of Statistics (2002). That is, those households whose consumption per adult equivalent is less than 7,253 Tanzanian shillings are deemed poor. For malnutrition, we use the standard definition of stunting. That is, those whose height-for-age Z-score is less than negative two are deemed malnourished (stunted).
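With these two definitions, every household in the second sample carries a stunting flag and, after imputation, a simulated poverty flag. The sketch below shows how one bootstrap round of the estimator from Section 2 combines them through the multiplication rule. It is an illustration under stated assumptions only: consumption is taken in levels rather than logarithms, the resampling is a simple one-stage bootstrap without survey weights, and the function names and the toy sample are hypothetical.

```python
import random

POVERTY_LINE = 7253.0    # Tanzanian shillings per adult equivalent (from the text)
STUNTING_CUTOFF = -2.0   # height-for-age Z-score threshold (from the text)


def is_poor(consumption):
    return consumption < POVERTY_LINE


def is_stunted(haz):
    return haz < STUNTING_CUTOFF


def joint_estimate(sample):
    """Estimate mu_PM as mu_M * mu_{P|M}.

    sample: list of (imputed_consumption, height_for_age_z) pairs,
    one pair per household in the second sample.
    """
    stunted = [c for c, haz in sample if is_stunted(haz)]
    if not stunted:
        return 0.0
    mu_m = len(stunted) / len(sample)
    mu_p_given_m = sum(is_poor(c) for c in stunted) / len(stunted)
    return mu_m * mu_p_given_m


def bootstrap_round(sample, rng):
    """Resample households with replacement and re-estimate mu_PM."""
    resampled = [rng.choice(sample) for _ in sample]
    return joint_estimate(resampled)


# Toy second sample: (imputed consumption, height-for-age Z-score).
toy_sample = [(5000.0, -2.5), (9000.0, -2.5), (5000.0, 0.0), (9000.0, 0.5)]
rng = random.Random(0)
draws = [bootstrap_round(toy_sample, rng) for _ in range(500)]
print(joint_estimate(toy_sample), min(draws), max(draws))
```

In the full procedure the imputation is itself redrawn in every round, so the spread of the draws reflects both the prediction error and the sampling error of the second sample.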
Note that we cannot produce a cross tabulation between poverty and child malnutrition with HBS or TDHS alone, because the definitions of poverty and malnutrition are based on consumption and the height-for-age Z-score respectively.

To ensure that the two surveys represent the same population, we have taken the following steps. First, we eliminated the observations in Zanzibar from the TDHS sample, because Zanzibar was not covered in HBS. Second, we kept only those households with at least one child under the age of five. Third, when more than one child in the household shared the youngest age in years, we randomly chose one of them. By having only one child in each household, we do not need to distinguish between household-level and individual-level random effects for the height-for-age model, because the status of malnourishment can be thought of as a household characteristic.[1] Thus, we shall hereafter call a household whose youngest child is malnourished a malnourished household. Finally, we dropped records with missing values in relevant variables. After these steps, we have 11,006 households in the HBS sample and 4,118 households in the TDHS sample.

We apply ELL-SAE with bootstrapping of the second sample to the TDHS and HBS samples in two directions, where the bootstrapping was done in two stages to be consistent with the sampling design. In one direction, the HBS is the first sample and the TDHS is the second sample, where $y_h$ is the consumption measure (C-SAE). In the other direction, the first and second samples are switched, and $y_h$ is the anthropometric measure (A-SAE). Since each direction yields a cross tabulation between poverty and child malnutrition, we can cross-examine the results. We create separate regression models for urban and rural areas. The detailed regression results are available in the appendix.

4 Results

Let us first look at the conditional estimates reported in Table 1. We use the subscripts B and NB to indicate whether the second sample is bootstrapped or not. We also use NP and NM to denote non-poor and non-malnourished. Column (1) provides the conditional estimate of the proportion of poor households by the C-SAE_B, given the status of malnourishment. For example, the proportion $\mu_{P \mid M}$ of poor households among the malnourished households is estimated at 21.26% in the urban areas.
Column (4), on the other hand, shows the conditional estimates of the proportion of malnourished households by A-SAE_B, given the poverty status. As shown in Column (1), the proportion of poor households among the malnourished households is higher than that among the non-malnourished households in both urban and rural areas (i.e. $\hat{\mu}_{P \mid M} > \hat{\mu}_{P \mid NM}$). Likewise, Column (4) shows $\hat{\mu}_{M \mid P} > \hat{\mu}_{M \mid NP}$. However, the difference is significant only for the A-SAE model in urban areas.

[1] This distinction is important when a household has more than one child. In Cambodia, the individual-level random effects far exceed the household-level or cluster-level random effects (Fujii, 2005).

Table 1: Conditional estimates of poverty and stunting by SAE, together with the survey-only estimates. All numbers are percentages; standard errors are in parentheses.

| Area | Poverty | (1) C-SAE_B | (2) C-SAE_NB | (3) HBS | Malnutrition | (4) A-SAE_B | (5) A-SAE_NB | (6) TDHS |
|---|---|---|---|---|---|---|---|---|
| Urban | $\hat{\mu}_{P \mid M}$ | 21.26 (2.38) | 21.39 (0.44) | | $\hat{\mu}_{M \mid P}$ | 30.47 (2.34) | 30.48 (2.04) | |
| Urban | $\hat{\mu}_{P \mid NM}$ | 17.76 (2.02) | 17.94 (0.30) | | $\hat{\mu}_{M \mid NP}$ | 24.73 (2.44) | 24.60 (1.55) | |
| Urban | $\hat{\mu}_{P}$ | 18.61 (1.79) | 18.77 (0.25) | 18.55 (2.12) | $\hat{\mu}_{M}$ | 25.80 (2.08) | 25.70 (1.32) | 24.12 (2.73) |
| Rural | $\hat{\mu}_{P \mid M}$ | 37.14 (1.40) | 37.13 (1.10) | | $\hat{\mu}_{M \mid P}$ | 40.46 (1.58) | 40.62 (0.99) | |
| Rural | $\hat{\mu}_{P \mid NM}$ | 34.75 (1.41) | 34.72 (1.10) | | $\hat{\mu}_{M \mid NP}$ | 40.26 (1.50) | 40.33 (1.00) | |
| Rural | $\hat{\mu}_{P}$ | 35.69 (1.12) | 35.66 (0.79) | 33.77 (2.44) | $\hat{\mu}_{M}$ | 40.32 (1.10) | 40.43 (0.74) | 39.12 (1.39) |

Since the idiosyncratic errors tend to cancel out for more aggregated estimates, the standard errors tend to be smaller for higher levels of aggregation. For example, $\hat{\sigma}(\hat{\mu}_P)$ is smaller than $\hat{\sigma}(\hat{\mu}_{P \mid M})$ and $\hat{\sigma}(\hat{\mu}_{P \mid NM})$ in Column (1). Similar observations can be made for Column (4).

The bootstrapping of the second sample does not change the point estimates, as the comparisons between Columns (1) and (2), and between Columns (4) and (5), show. However, the standard errors change substantially, especially for the C-SAE in the urban areas. This is because the prediction error is relatively small, but the sampling error due to the second sample (i.e. TDHS) is large for the C-SAE in the urban areas. Hence, ignoring the sampling error from the second sample may result in a substantial underestimate of the standard errors.

Finally, we report in Column (3) the estimate of $\mu_P$ based only on the HBS, and in Column (6) the TDHS-only estimate of $\mu_M$. The bootstrapped standard errors, which take the sampling design into account, are reported in parentheses. As we can see, the ELL-SAE and survey-only estimates are very close, and their differences can be attributed to statistical errors.
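The closeness of the ELL-SAE and survey-only estimates can be gauged with a rough z-score on the marginal estimates in Table 1. The check below is only a back-of-the-envelope sketch: it treats each pair of estimates as independent and approximately normal, which is a simplifying assumption (the C-SAE model, for instance, is itself fitted on the HBS), and the names `comparisons` and `z_score` are hypothetical.

```python
import math

# (SAE_B estimate, its s.e., survey-only estimate, its s.e.), in percent,
# copied from the marginal rows of Table 1.
comparisons = {
    "urban poverty (C-SAE_B vs HBS)": (18.61, 1.79, 18.55, 2.12),
    "urban stunting (A-SAE_B vs TDHS)": (25.80, 2.08, 24.12, 2.73),
    "rural poverty (C-SAE_B vs HBS)": (35.69, 1.12, 33.77, 2.44),
    "rural stunting (A-SAE_B vs TDHS)": (40.32, 1.10, 39.12, 1.39),
}


def z_score(est_a, se_a, est_b, se_b):
    """Difference of two estimates divided by the s.e. of the difference,
    assuming the two estimates are independent."""
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


z_scores = {label: z_score(*values) for label, values in comparisons.items()}
for label, z in z_scores.items():
    print(f"{label}: z = {z:.2f}")
```

Under these assumptions all four z-scores are below one in absolute value, well inside the conventional 1.96 bound.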
In Table 2, we report the estimates of the joint distribution in the urban and rural areas. For example, the proportion $\mu_{PM}$ of poor households with malnourished children in the urban area is estimated at 5.19% with the C-SAE_B and at 5.68% with the A-SAE_B. In both urban and rural areas, the point estimates produced by the C-SAE_B and A-SAE_B are very close.

Table 2: Estimates of the joint distribution of poverty and child malnutrition. All numbers are percentages; standard errors are in parentheses.

| Area | | C-SAE_B: P | C-SAE_B: NP | C-SAE_B: Total | A-SAE_B: P | A-SAE_B: NP | A-SAE_B: Total |
|---|---|---|---|---|---|---|---|
| Urban | M | 5.19 (0.79) | 19.25 (2.31) | 24.45 (2.79) | 5.68 (0.83) | 20.12 (2.16) | 25.80 (2.08) |
| Urban | NM | 13.42 (1.63) | 62.13 (2.73) | 75.55 (2.79) | 12.98 (1.81) | 61.22 (2.54) | 74.20 (2.08) |
| Urban | Total | 18.61 (1.79) | 81.39 (1.79) | 100.00 (0.00) | 18.65 (2.46) | 81.35 (2.46) | 100.00 (0.00) |
| Rural | M | 14.49 (0.77) | 24.53 (1.08) | 39.02 (1.46) | 13.65 (0.88) | 26.67 (1.39) | 40.32 (1.10) |
| Rural | NM | 21.19 (1.07) | 39.79 (1.17) | 60.98 (1.46) | 20.11 (1.57) | 39.57 (1.60) | 59.68 (1.10) |
| Rural | Total | 35.69 (1.12) | 64.31 (1.12) | 100.00 (0.00) | 33.77 (2.22) | 66.23 (2.22) | 100.00 (0.00) |

We also formally test the equality of the cross tabulations by C-SAE_B and A-SAE_B. Let $\hat{\mu}_C$ and $\hat{\mu}_H$ be the estimates of the joint distribution $(\hat{\mu}_{PM}, \hat{\mu}_{P\,NM}, \hat{\mu}_{NP\,M})$ obtained by C-SAE_B and A-SAE_B respectively. We dropped $\hat{\mu}_{NP\,NM}$, because $\hat{\mu}_{NP\,NM} = 1 - \hat{\mu}_{PM} - \hat{\mu}_{P\,NM} - \hat{\mu}_{NP\,M}$. Now, suppose that the C-SAE_B provides correct estimates under the null hypothesis. Then, the test statistic

$$(\hat{\mu}_C - \hat{\mu}_H)^T \, \mathrm{Var}[\hat{\mu}_C]^{-1} \, (\hat{\mu}_C - \hat{\mu}_H)$$

follows a $\chi^2(3)$ distribution under the normal approximation. The test statistics in urban and rural areas are 6.57 and 2.59 respectively, both below the five percent critical value of 7.81 for $\chi^2(3)$. Hence, the null hypothesis of the equality of the two cross tabulations cannot be rejected at the five percent significance level. The same conclusion can be drawn when we assume that the A-SAE_B provides correct estimates.

It is worth pointing out that we can obtain both the point estimates and the standard errors from Table 1 using the multiplication rule. For example, using the conditional distribution from the A-SAE_B and the marginal distribution from the HBS, the proportion $\mu_{PM}$ of poor households with malnourished children in urban areas can be estimated at 18.55% × 30.47% = 5.65%.
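The multiplication-rule arithmetic above, together with the variance relationship for a product of independent estimates stated in the paragraph that follows, can be checked numerically. The snippet is a small sketch using the urban HBS and A-SAE_B figures from Table 1; the function name `product_estimate` is hypothetical.

```python
import math


def product_estimate(mu_a, se_a, mu_b, se_b):
    """Point estimate, standard error and variance terms of mu_a * mu_b,
    assuming the two estimates are independent."""
    point = mu_a * mu_b
    terms = (
        mu_a ** 2 * se_b ** 2,   # mu_P^2 * sigma_{M|P}^2
        mu_b ** 2 * se_a ** 2,   # mu_{M|P}^2 * sigma_P^2
        se_a ** 2 * se_b ** 2,   # sigma_{M|P}^2 * sigma_P^2
    )
    return point, math.sqrt(sum(terms)), terms


# Urban areas, expressed as proportions (Table 1):
mu_p, se_p = 0.1855, 0.0212                   # HBS marginal poverty rate
mu_m_given_p, se_m_given_p = 0.3047, 0.0234   # A-SAE_B conditional stunting rate

point, se, terms = product_estimate(mu_p, se_p, mu_m_given_p, se_m_given_p)
share_second = terms[1] / sum(terms)
print(f"mu_PM = {100 * point:.2f}% (s.e. {100 * se:.2f}%)")
print(f"share of variance from the second term: {share_second:.2f}")
```

With these inputs the point estimate rounds to 5.65% and the standard error to 0.78%, and the term involving the sampling variance of the HBS poverty rate accounts for roughly two thirds of the total variance.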
Similarly, by combining C-SAE and TDHS estimates, the same proportion can be estimated at 5.13%. Noting that the bootstrapping for the ELL-SAE estimates and that for the survey-only estimates are carried out independently, we can use the following relationship to estimate the standard errors:

$$\sigma^2_{PM} = \mathrm{Var}[\hat{\mu}_P \, \hat{\mu}_{M \mid P}] = \hat{\mu}_P^2 \, \sigma^2_{M \mid P} + \hat{\mu}_{M \mid P}^2 \, \sigma^2_P + \sigma^2_{M \mid P} \, \sigma^2_P,$$

where $\sigma^2_A$ is $\sigma^2(\hat{\mu}_A)$ with $A \in \{P, PM, M \mid P\}$. This formula allows us to see clearly where the errors come from. For example, using the A-SAE_B and HBS, $\sigma_{PM}$ is estimated at 0.78%, where the majority of the variance of $\hat{\mu}_{PM}$ comes from the second term in this expression. However, the use of this formula does not make the estimation of the joint distribution simpler, because the second sample still needs to be bootstrapped.

5 Discussions

We have proposed a method of producing cross tabulations from two samples by applying the ELL-SAE to a bootstrapped second sample. Our method was used to estimate the joint distribution of poverty and child malnutrition in Tanzania. While poverty and child malnutrition (for the youngest child in the household) are found to be positively correlated in Tanzania, the correlation is very small. Based on C-SAE_B, the correlations in urban and rural areas are 3.84% and 2.43% respectively. The corresponding figures for A-SAE_B are 5.07% and 0.16%. This indicates that lack of food intake may not be the dominant factor in child malnutrition. Other factors, such as infectious diseases and lack of child care, may be equally important.

Our approach is applicable and useful to many developing countries, since both demographic and health surveys and consumption surveys are widely available, so that it can be directly applied. At the same time, relatively few countries have surveys that include both a consumption component and an anthropometric component, so generating cross tabulations by a standard method is difficult. Further, the resulting cross tabulation is useful, because it helps policy-makers understand how policies to combat poverty would decrease malnutrition and vice versa.

While we have applied the ELL-SAE in two directions to cross-examine the validity of the estimates in this study, this is not always possible.
Suppose, for example, that we have a consumption survey and a labor force survey and that we are interested in the joint distribution of poverty and the household head's unemployment. In this case, we can impute consumption into the labor force survey, but it may be difficult to impute unemployment into the consumption survey.[2] When the bidirectional application of ELL-SAE is not possible, cross-examination of the estimates is not possible. Even so, our method can still produce a useful cross tabulation.

[2] The standard ELL-SAE does not apply here, though one could develop a variant of the ELL-SAE.

References

Elbers, C., J.O. Lanjouw, and P. Lanjouw, 2002, Welfare in villages and towns: Micro-level estimation of poverty and inequality, World Bank.

Elbers, C., J.O. Lanjouw, and P. Lanjouw, 2003, Micro-level estimation of poverty and inequality, Econometrica 71(1), 355-364.

Elbers, C., J.O. Lanjouw, P. Lanjouw, and P.G. Leite, 2004, Poverty and inequality in Brazil: New estimates from combined PPV-PNAD data, World Bank.

Fujii, T., 2005, Micro-level estimation of child malnutrition indicators and its application in Cambodia, The World Bank.

National Bureau of Statistics, 2002, Household budget survey 2000/01, National Bureau of Statistics, President's Office, Planning and Privatization, United Republic of Tanzania.

National Bureau of Statistics and ORC Macro, 2005, Tanzania demographic and health survey 2004-2005, National Bureau of Statistics and ORC Macro.

Appendix

We use PovMap software Version 1.2.4 for estimating the regression models and carrying out the simulations.[3] In the C-SAE, we take the logarithm of the consumption per adult equivalent as $y_h$. In the A-SAE, we use the standardized height, which is the height-for-age Z-score converted into the corresponding height for a 24-month-old girl, as was done by Fujii (2005). This step allows us to eliminate negative values so that we can compute inequality measures (though not reported) without affecting other results. The location effects are included only in the A-SAE and not in the C-SAE, because the point estimate of $\sigma^2_\eta$ was negative in both urban and rural areas. We modelled the heteroskedasticity of $\varepsilon$ only for the consumption model in the rural area, in which the coefficient on $\hat{y}$ in the residual regression model was significant. Note that the regression models presented here are used only for prediction, and do not describe a causal relationship.

[3] This software can be downloaded from http://iresearch.worldbank.org/povmap/index.htm.

Table 3: Regression results for C-SAE for urban areas.

| Variable | Coef. | S.E. |
|---|---|---|
| Intercept | 10.1365 | (0.0300) |
| Household size | -0.0774 | (0.0022) |
| Highest education among spouses is primary completed or higher | 0.0612 | (0.0129) |
| Source of cooking heat is firewood | -0.1631 | (0.0141) |
| Household has an iron | 0.1366 | (0.0130) |
| Household has a television | 0.1976 | (0.0209) |
| Household has a radio | 0.2075 | (0.0151) |
| Household has a car | 0.2552 | (0.0328) |
| Eat meat twice a week or less | -0.2594 | (0.0135) |
| Often/always have problems satisfying the food needs | -0.1909 | (0.0163) |
| Region=13 | -0.1638 | (0.0434) |
| Region=19 | -0.2759 | (0.0218) |
| Region=20 | -0.2941 | (0.0334) |
| Region=21 | -0.3388 | (0.0609) |

Number of clusters: 617
Number of observations: 6685
$R^2$ for the first-stage OLS: 0.4198
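As an illustration of how such a table is used for prediction, the sketch below computes the deterministic part $x_h \beta$ of the urban consumption model for a made-up household. The household profile and the names `COEF` and `predict_log_consumption` are hypothetical, all covariates not listed are assumed to be zero, and the error terms $\eta$ and $\varepsilon$ that the simulations add on top are deliberately left out.

```python
import math

# Coefficients copied from Table 3 (urban C-SAE); only a subset is used.
COEF = {
    "intercept": 10.1365,
    "household_size": -0.0774,
    "cooks_with_firewood": -0.1631,
    "has_radio": 0.2075,
}


def predict_log_consumption(covariates):
    """Deterministic part x_h * beta of log consumption per adult equivalent.

    covariates: dict of covariate values; covariates not listed count as zero.
    """
    return COEF["intercept"] + sum(
        COEF[name] * value for name, value in covariates.items())


# Hypothetical household: five members, cooks with firewood, owns a radio.
household = {"household_size": 5, "cooks_with_firewood": 1, "has_radio": 1}
y_hat = predict_log_consumption(household)
log_poverty_line = math.log(7253)

print(f"predicted log consumption: {y_hat:.4f}")
print(f"log poverty line:          {log_poverty_line:.4f}")
```

A prediction above the log poverty line does not by itself classify the household as non-poor, since each simulation round adds drawn cluster and household error terms before the comparison with $z$.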

Table 4: Regression results for A-SAE for urban areas.

| Variable | Coef. | S.E. |
|---|---|---|
| Intercept | 0.8085 | (0.0148) |
| Youngest child is one year old | -0.0236 | (0.0044) |
| Youngest child is two years old | -0.0245 | (0.0046) |
| Youngest child is three years old | -0.0248 | (0.0049) |
| Youngest child is four years old | -0.0163 | (0.0056) |
| The next oldest household member is two or more years older than the youngest child | 0.0112 | (0.0063) |
| Household head's education is incomplete primary or higher | -0.0100 | (0.0052) |
| Floor is concrete/cement/tiles/timber | 0.0153 | (0.0040) |
| Source of light is paraffin | -0.0081 | (0.0042) |
| Household has a refrigerator | 0.0211 | (0.0059) |
| Household has a radio | 0.0089 | (0.0038) |
| Proportion of households with a widowed head in the ward | -0.2053 | (0.0650) |
| Ward average of the number of non-relatives per household | 0.0611 | (0.0129) |
| Ward average of the number of grandchildren per household | 0.0802 | (0.0215) |
| Ward average of the number of spouses times the adult equivalence | -0.0147 | (0.0047) |

Number of clusters: 95
Number of observations: 751
$R^2$ for the first-stage OLS: 0.2642
Location effect ($\sigma^2_\eta / \sigma^2_u$): 0.0304

Table 5: Regression results for C-SAE for rural areas.

| Variable | Coef. | S.E. |
|---|---|---|
| Intercept | 9.8327 | (0.0980) |
| Household size | -0.1296 | (0.0111) |
| Household size squared | 0.0036 | (0.0005) |
| Ratio of children under five in the household | 0.4164 | (0.1136) |
| Highest education in household is primary completed or higher | 0.0870 | (0.0299) |
| Source of cooking heat is not firewood/gas/electricity | 0.1687 | (0.0582) |
| Floor is concrete/cement/tiles/timber | 0.1975 | (0.0460) |
| Household has a bicycle | 0.0931 | (0.0271) |
| Household has an iron | 0.2270 | (0.0374) |
| Household has a radio | 0.1156 | (0.0283) |
| Household has a bank account | 0.2041 | (0.0596) |
| Eat meat twice a week or less | -0.1217 | (0.0324) |
| Often/always have problems satisfying the food needs | -0.2338 | (0.0340) |
| Region=7 | -0.3519 | (0.1113) |

Number of clusters: 538
Number of observations: 4321
$R^2$ for the first-stage OLS: 0.3241

Table 6: Regression results for A-SAE for rural areas.

| Variable | Coef. | S.E. |
|---|---|---|
| Intercept | 0.6595 | (0.0255) |
| Youngest child is one year old | -0.0337 | (0.0020) |
| Youngest child is two years old | -0.0285 | (0.0022) |
| Youngest child is three years old | -0.0325 | (0.0026) |
| Youngest child is four years old | -0.0354 | (0.0028) |
| Youngest child is female | 0.0042 | (0.0015) |
| Oldest spouse's education is some secondary or higher | 0.0103 | (0.0069) |
| Agricultural land owned by the household | 0.0000 | (0.0000) |
| Household has a refrigerator | 0.0181 | (0.0149) |
| Household has a radio | 0.0036 | (0.0016) |
| Household has an iron | 0.0085 | (0.0023) |
| Wall is not pole or mud | -0.0034 | (0.0017) |
| Floor is concrete/cement/tiles/timber | 0.0090 | (0.0029) |
| Floor is not concrete/cement/tiles/timber/earth | -0.0219 | (0.0124) |
| Ward average of the number of handicapped individuals per household | -0.9490 | (0.2167) |
| Ward average of the male ratio in the household | 0.2116 | (0.0378) |
| Ward average of the ratio of females under the age of five | 0.1944 | (0.0645) |
| Ward average of the maximum age in the household | 0.0007 | (0.0003) |

Number of clusters: 290
Number of observations: 3367
$R^2$ for the first-stage OLS: 0.1725
Location effect ($\sigma^2_\eta / \sigma^2_u$): 0.0260