Data quality analysis of the NRVA 2007/08 Beatriz Godoy 1, consultant July-August, 2009


The NRVA 2007/08 data set is a nationally representative, multi-topic household survey of Afghanistan. It covers various aspects of living standards and household activities, such as agriculture, labor, migration and gender. The data were collected from September 2007 to September 2008. They have been entered, cleaned and processed, and are now ready for analysis. The purpose of this assignment is to provide an independent assessment of the quality of the NRVA data.

EXECUTIVE SUMMARY

Carrying out a survey such as this one is a considerable challenge even under normal conditions, and conducting it in Afghanistan was harder still given the conditions the country is currently undergoing. The survey was successfully conducted over a span of twelve months, despite the usual drawbacks of this type of operation (instructions that were clearly spelt out initially but later forgotten by the enumerators, questions that were misunderstood by respondents, and the like), all of which were resolved smoothly.

Within the frame of this consultancy I analyzed different indicators of data quality, and I can say, based on our experience in other countries that have carried out similar surveys, that this one is among the best. The results do not show significant errors; on the contrary, the small number of errors indicates strong supervision of the field teams, since experience has shown over and over again that these surveys are very sensitive to mistakes on the interviewer's side.

I have developed a set of indicators in the body of this report (2), which is divided into the following six chapters: (1) completeness of household's data, (2) completeness of individual's data, (3) demographic consistency, (4) empty records and duplicates, (5) internal consistency of sections and (6) multidimensional quality controls.

The assessment of the quality of the NRVA 2007/08 data gives excellent indicators; consequently, users are advised to begin the analytical exploitation of this rich database.

1 Address: Sistemas Integrales Casilla 13168 Santiago, Chile. Phones: (56-2) 638-1841 and (56-2) 639-4554. Fax: (56-2) 639-2086. E-mail: beatriz.godoy@ariel.cl

2 Note: The list of possible quality indicators for a survey like the NRVA 2007/08 is almost limitless; I have therefore selected the ones that, according to our experience, are most sensitive to errors.

1.- COMPLETENESS OF HOUSEHOLD'S DATA

In this stage of the quality analysis I review the completeness of the household's data in terms of the completed sections or chapters of the questionnaire. Some sections are obligatory, that is to say, they must be available for every household, while others must be completed depending on the household's demographic structure.

The obligatory sections include the following: 1: Household Roster, 2: Housing, 5: Assets and Credit, 12: Household Expenditures, 15: Food consumption last 7 days and 16: Iodized salt, avian flu and household expenses.

Other sections may not be applicable to a household. For example, 3: Livestock and 4: Agriculture might be empty if the household has no farming activities, and 8: Sources of household income and 10: Cash for work, Food for work could be blank if the household subsists on pensions or other benefits. Each of these sections, however, begins with a filter question, and at the very least this question must be answered (see the example in Figure 1).

Figure 1: Example of filter question

For some specialized sections, the check at this stage consists of comparing the number of individuals who meet the section's eligibility criteria with the number of people who answered it. The eligibility criteria are always derived from Section 1: Household Roster.

Section 6: Education, for household members 6 years old and more.
Section 9-A: Labor, for members 6-15 years.
Section 9-B: Labor and Migration, for members 16 years and over.
Section 17: Number of children born and marriage information, for ever-married women 49 years old or less.
Section 19: Immunization and child health, for children less than 5 years old.
Section 20: Women's activities, for women 10 years old and above.

The result of this exploration is that very few households have incomplete data for some section(s); see Figure 2.

Figure 2: Completeness of household data

Total number of households: 20,576

Consistent = consistent records. Missing = households with missing information on the section.

  Consistent       Missing
  Number  Percent  Number  Percent   Section
  20,576  100.00%       0    0.00%   Cover
  20,576  100.00%       0    0.00%   Section 1: Household Roster
  20,576  100.00%       0    0.00%   Section 2a: Housing and Utilities
  20,575  100.00%       1    0.00%   Section 2b: Housing and Utilities
  20,573   99.99%       3    0.01%   Section 3: Livestock
  20,573   99.99%       3    0.01%   Section 4: Agriculture
  20,576  100.00%       0    0.00%   Section 5: Assets and Credit
  20,576  100.00%       0    0.00%   Section 6: Education (for household members 6 years old and more)
  20,576  100.00%       0    0.00%   Section 7: Disabilities (for all household members)
  20,574   99.99%       2    0.01%   Section 8: Sources of HH income
  20,443   99.35%      70    0.34%   Section 9a: Labor for members 6-15 yrs
  20,499   99.63%      37    0.18%   Section 9b: Labor and migration 16 and over
  20,570   99.97%       6    0.03%   Section 10: Cash for work, Food for work
  20,520   99.73%      56    0.27%   Section 11a: Migration and remittances
  20,531   99.78%      45    0.22%   Section 11b: Migration and remittances
  20,572   99.98%       4    0.02%   Section 12: Household Expenditures
  20,571   99.98%       5    0.02%   Section 13: Household shocks and coping strategies
  20,570   99.97%       6    0.03%   Section 14: Final male interview
  20,544   99.84%      32    0.16%   Section 15a: Food consumption last 7 days
  20,544   99.84%      32    0.16%   Section 15b: Food consumption last 7 days
  20,542   99.83%      34    0.17%   Section 16: Iodized salt, avian flu and hh expenses
  20,517   99.71%      49    0.24%   Section 17: Number of children born and marriage information (for ever married women up to 49 years)
  20,576  100.00%       0    0.00%   Section 19: Immunization and child health (for all children under 5)
  20,537   99.81%      39    0.19%   Section 20: Women's activities (for all females in the household, ages 10 and older)
  20,576  100.00%       0    0.00%   Section 20: Women's activities (for all females in the household, ages 10 and older)

The results are excellent; nevertheless, I recommend examining in detail the sections where information is missing, bearing in mind the objective of the analysis, so as to identify how serious, if at all, this lack of data might be.
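The household-level check described in this chapter (comparing, for each household, the number of roster members eligible for a section with the number who actually answered it) can be illustrated with a minimal Python sketch. The record layout (keys hh, id, age, sex), the ELIGIBILITY table and the function name are assumptions made for illustration only; they are not the NRVA variable names, and this is not necessarily how the checks in this report were implemented.

```python
from collections import Counter

# Hypothetical eligibility rules, paraphrasing the section descriptions in the text.
# Section 17 is left out because it also needs marital history.
ELIGIBILITY = {
    "6": lambda m: m["age"] >= 6,                        # Education
    "9a": lambda m: 6 <= m["age"] <= 15,                 # Labor, members 6-15
    "9b": lambda m: m["age"] >= 16,                      # Labor and migration
    "19": lambda m: m["age"] < 5,                        # Immunization and child health
    "20": lambda m: m["sex"] == "F" and m["age"] >= 10,  # Women's activities
}


def incomplete_households(roster, section_rows, section):
    """Return ids of households whose number of respondents in `section`
    differs from the number of eligible members found in the roster."""
    rule = ELIGIBILITY[section]
    expected = Counter(m["hh"] for m in roster if rule(m))
    observed = Counter(r["hh"] for r in section_rows)
    households = set(expected) | set(observed)
    # A Counter returns 0 for households that are absent on one side.
    return sorted(h for h in households if expected[h] != observed[h])


# Toy data: household 2 has one eligible woman but no Section 20 record.
roster = [
    {"hh": 1, "id": 1, "age": 40, "sex": "M"},
    {"hh": 1, "id": 2, "age": 35, "sex": "F"},
    {"hh": 1, "id": 3, "age": 8, "sex": "F"},
    {"hh": 2, "id": 1, "age": 30, "sex": "F"},
    {"hh": 2, "id": 2, "age": 3, "sex": "M"},
]
section_20 = [{"hh": 1, "id": 2}]
section_19 = [{"hh": 2, "id": 2}]

print(incomplete_households(roster, section_20, "20"))  # [2]
print(incomplete_households(roster, section_19, "19"))  # []
```

Comparing counts per household, rather than matching individuals one by one, mirrors the household-level view of Figure 2; matching by member ID would be the natural next step for an individual-level check.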

2.- COMPLETENESS OF INDIVIDUAL'S DATA

In this chapter I analyze the completeness of the data at the individual level. The checks are based on the following rules:

- all household members 6 years old and more should answer Section 6: Education;
- all household members should answer Section 7: Disabilities;
- all household members aged 6-15 must answer Section 9-A: Labor;
- all household members aged 16 years and more must answer Section 9-B: Labor and Migration;
- all ever-married women who are 49 years old or less must answer Section 17: Number of children born and marriage information;
- all children less than 5 years old must answer Section 19: Immunization and child health;
- all women 10 years old and above must answer Section 20: Women's activities.

Total number of individuals in Section 1, Household Roster: 152,284

Figure 3: Completeness of individual sections

Columns give the number (n) and percent (%) of household members that:
(a) are consistent records
(b) match the criteria but are not present in the section
(c) fail the criteria but are present in the section
(d) don't have complete information in Section 1 (Roster)
(e) are not in Section 1 (Roster)

  (a)              (b)              (c)              (d)              (e)
        n       %        n       %        n       %        n       %        n       %   Section
  152,021  99.83%        0   0.00%        0   0.00%      263   0.17%        0   0.00%   Section 1: Household Roster
  152,288 100.00%        0   0.00%        9   0.01%        0   0.00%        0   0.00%   Section 6: Education (for household members 6 years old and more)
  151,822  99.70%      475   0.31%        0   0.00%        0   0.00%        0   0.00%   Section 7: Disabilities (for all household members)
  152,211  99.95%        0   0.00%       64   0.04%       22   0.01%        0   0.00%   Section 9a: Labor for members 6-15 yrs
  152,243  99.97%        0   0.00%       29   0.02%       25   0.02%        0   0.00%   Section 9b: Labor and migration 16 and over
  152,274  99.99%        0   0.00%       23   0.02%        0   0.00%        0   0.00%   Section 17: Number of children born and marriage information (for ever married women up to 49 years)
  151,701  99.62%        0   0.00%      571   0.37%       12   0.01%       13   0.01%   Section 18: Recent births (children born since August 2005)
  151,163  99.26%        0   0.00%      988   0.65%      146   0.10%        0   0.00%   Section 19: Immunization and child health (for all children under 5)
  151,323  99.37%      891   0.59%       11   0.01%       72   0.05%        0   0.00%   Section 20: Women's activities (for all females in the household, ages 10 and older)

In a country such as Afghanistan it is a challenge to have obtained responses from 99.37 percent of women in Section 20; I therefore do not regard the 891 missing women as an indication of poor data quality. In some countries disability carries a strong cultural stigma, which could be the reason why some individuals did not answer Section 7. In any case, the figure of 475 represents only 0.31 percent of the individuals.

3.- DEMOGRAPHIC CONSISTENCY

In this chapter I focus exclusively on the demographic consistency of the households. The indicators used for this purpose are:

(1) Each household must have a head, and only one.
(2) Each head must be fifteen years old or more.
(3) If a household member has a spouse, and there is a spouse ID CODE in question 5, then this person should be in the household roster.
(4) If a household member has a spouse, this spouse should be of a different sex.
(5) If a household member has the father living in the household, then the ID CODE registered in question 7 must be in the roster.
(6) The ID CODE of the father should correspond to a MAN.
(7) The age difference between the individual and the father should be 15 years or more.
(8) If a household member has the mother living in the household, then the ID CODE registered in question 9 must be in the roster.
(9) The ID CODE of the mother should correspond to a WOMAN.
(10) The age difference between the individual and the mother should be between 12 and 55 years.

Figure 4: Demographic consistency of households

1.- Number of household heads per household
                                      n  percent
One head                         20,573    99.99
Two heads                             2     0.01
Head missing                          1     0.00
Total                            20,576   100.00

2.- Consistency of age of the household head
                                      n  percent
Less than 15 years old               25     0.12
15 years old or more             20,552    99.88
Total                            20,577   100.00

3.- Existence of ID CODE of spouse
                                      n  percent
Spouse not found in roster            7     0.00
Spouse found in roster          152,014   100.00
Total                           152,021   100.00

4.- Consistency of spouse's sex
                                      n  percent
Same sex                            124     0.25
Different sex                    50,437    99.75
Total                            50,561   100.00

5.- Existence of ID CODE of father
                                      n  percent
Father not found in roster           22     0.01
Father found in roster          151,999    99.99
Total                           152,021   100.00

6.- Sex of father
                                      n  percent
Male                             94,923    99.79
Female                              204     0.21
Total                            95,127   100.00

7.- Age difference between father and household member
                                      n  percent
Less than 15 years                  138     0.09
15 years or more                151,883    99.91
Total                           152,021   100.00

8.- Existence of ID CODE of mother
                                      n  percent
Mother not found in roster           40     0.03
Mother found in roster          151,981    99.97
Total                           152,021   100.00

9.- Sex of mother
                                      n  percent
Male                                188     0.19
Female                          100,460    99.81
Total                           100,648   100.00

10.- Age difference between mother and household member
                                      n  percent
Between 12 and 55 years         151,552    99.69
Less than 12 years                  315     0.21
More than 55 years                  154     0.10
Total                           152,021   100.00

As shown in Figure 4, the results do not show significant errors; on the contrary, the small number of errors indicates strong supervision of the field teams, since experience demonstrates that this chapter of the questionnaire is very sensitive to errors on the interviewer's side.
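For readers who wish to reproduce indicators of this kind, the ten demographic rules listed in this chapter can be sketched in Python as below. This is only an illustrative sketch: the member record layout (keys id, sex, age, is_head, spouse_id, father_id, mother_id) and the function name are assumptions, not the NRVA variable names, and the thresholds are taken from the categories shown in Figure 4.

```python
def check_household(members):
    """Return a list of (check_number, member_id) failures for one household.

    Each member is a dict with keys id, sex ('M' or 'F'), age, is_head and,
    optionally, spouse_id, father_id and mother_id (hypothetical layout).
    """
    by_id = {m["id"]: m for m in members}
    failures = []

    # Checks 1 and 2: exactly one head, aged 15 or more.
    heads = [m for m in members if m["is_head"]]
    if len(heads) != 1:
        failures.append((1, None))
    for head in heads:
        if head["age"] < 15:
            failures.append((2, head["id"]))

    for m in members:
        # Checks 3 and 4: spouse exists in the roster and has a different sex.
        spouse_id = m.get("spouse_id")
        if spouse_id is not None:
            if spouse_id not in by_id:
                failures.append((3, m["id"]))
            elif by_id[spouse_id]["sex"] == m["sex"]:
                failures.append((4, m["id"]))

        # Checks 5 to 7: father exists, is male, and is at least 15 years older.
        father_id = m.get("father_id")
        if father_id is not None:
            if father_id not in by_id:
                failures.append((5, m["id"]))
            else:
                father = by_id[father_id]
                if father["sex"] != "M":
                    failures.append((6, m["id"]))
                if father["age"] - m["age"] < 15:
                    failures.append((7, m["id"]))

        # Checks 8 to 10: mother exists, is female, and is 12 to 55 years older.
        mother_id = m.get("mother_id")
        if mother_id is not None:
            if mother_id not in by_id:
                failures.append((8, m["id"]))
            else:
                mother = by_id[mother_id]
                if mother["sex"] != "F":
                    failures.append((9, m["id"]))
                if not 12 <= mother["age"] - m["age"] <= 55:
                    failures.append((10, m["id"]))
    return failures


# Toy household: the "child" is only 10 and 8 years younger than the parents.
household = [
    {"id": 1, "sex": "M", "age": 40, "is_head": True, "spouse_id": 2},
    {"id": 2, "sex": "F", "age": 38, "is_head": False, "spouse_id": 1},
    {"id": 3, "sex": "M", "age": 30, "is_head": False, "father_id": 1, "mother_id": 2},
]
print(check_household(household))  # [(7, 3), (10, 3)]
```

Returning the check number together with the member ID makes it straightforward to tabulate failures by indicator, in the same way as the ten panels of Figure 4.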

4.- EMPTY RECORDS AND DUPLICATES

Prior to performing an in-depth analysis, it is advisable to identify any empty or duplicated records. By empty records I mean records that have no data in the relevant variables of the subject registered in the file. In this set of data files, and prior to this review, the variables cid, stratum, hh_weight (and at times mem_weight) were added to all the data files; these variables therefore do not count when looking for empty records. There are very few cases (see Figure 5 below), and the analysts should decide whether or not to drop them from the original files.

Figure 5: Number of empty records

File        Records  Empty Percent   Section
s_m.dta      20,576      0    0.00   Cover male
s_f.dta      20,576      0    0.00   Cover female
s_1.dta     152,284    263    0.17   Section 1: Household Roster
2a.dta       20,576      1    0.00   Section 2a: Housing and Utilities
2b.dta       20,575      0    0.00   Section 2b: Housing and Utilities
3.dta        20,573      1    0.00   Section 3: Livestock
4.dta        20,573     17    0.08   Section 4: Agriculture
5a.dta       20,575     53    0.26   Section 5: Assets and Credit (question 1)
5b.dta       71,488      0    0.00   Section 5: Assets and Credit (questions 2-5)
5c.dta       20,575      1    0.00   Section 5: Assets and Credit (questions 6-27)
6.dta       152,284    274    0.18   Section 6: Education (for household members 6 years old and more)
7.dta       152,284    475    0.31   Section 7: Disabilities (for all household members)
8.dta        20,574     25    0.12   Section 8: Sources of HH income
9a.dta       48,751      0    0.00   Section 9a: Labor for members 6-15 yrs
9b.dta       73,359      0    0.00   Section 9b: Labor and migration 16 and over
10.dta       20,570     21    0.10   Section 10: Cash for work, Food for work
11a.dta      21,080      0    0.00   Section 11a: Migration and remittances
11b.dta      20,912      0    0.00   Section 11b: Migration and remittances
12.dta       20,572     10    0.05   Section 12: Household Expenditures
13.dta       20,571      2    0.01   Section 13: Household shocks and coping strategies
14.dta       20,570      2    0.01   Section 14: Final male interview
15a.dta      20,544      0    0.00   Section 15a: Food consumption last 7 days
15b.dta   1,869,504      0    0.00   Section 15b: Food consumption last 7 days
16.dta       20,542    263    1.28   Section 16: Iodized salt, avian flu and hh expenses
17.dta       22,975      0    0.00   Section 17: Number of children born and marriage information (for ever married women up to 49 years)
18.dta       12,689      0    0.00   Section 18: Recent births (children born since August 2005)
19a.dta      24,846      0    0.00   Section 19: Immunization and child health - for all children under 5 (questions 1-28)
19b.dta      20,540    851    4.14   Section 19: Immunization and child health - for all children under 5 (questions 29 and 30)
20a.dta      20,531    343    1.67   Section 20: Women's activities - for all females in the household, ages 10 and older (questions 1-10)
20b.dta      47,982      0    0.00   Section 20: Women's activities - for all females in the household, ages 10 and older (questions 11-44)

As for duplicates, there are identical records only in files 5b (Section 5: Assets and Credit, questions 2-5) and 11b (Section 11b: Migration and remittances). However, Section 5b was designed in such a way that more than one item with the same characteristics can easily be recorded for the same household. Likewise, under Section 11b it is possible for the same household to receive remittances from more than one person with exactly the same characteristics (relationship to head, sex, place of residence and occupation). I therefore conclude that there are no true duplicates to be dropped from the database.

5.- GENERAL CONSISTENCY OF SECTIONS

Section 2: Housing and utilities. I checked all qualitative variables for out-of-range values and for internal skip errors, and there are almost no errors.

Section 3: Livestock.
(1) The number of animals owned in question 3.1 must be consistent with the number of productive females in question 3.3. There are only 2 households with an inconsistency in this regard.
(2) The ID CODES registered in question 3.5 must exist in Section 1: Household Roster.
There are 147 households (0.7 percent) with at least one ID CODE in that question that is not in Section 1.

Section 4: Agriculture. I checked that the quantities of land are consistent:
(3) The area cultivated during the most recent summer cultivation season must be less than or equal to the total area of land the household members had access to: {Q_4_14 <= Q_4_4 + Q_4_6 + Q_4_7 + Q_4_8}. There is only one household failing this check.
(4) The area sharecropped-out, rented-out, mortgaged-out or left fallow during the most recent summer cultivation season must be less than or equal to the total area of land owned by the household members: {Q_4_4 >= Q_4_9 + Q_4_10 + Q_4_11 + Q_4_12}. There are fifty-nine households failing this check (0.29 percent of the households).
(5) The area of rain-fed land cultivated during the most recent summer cultivation season must be less than or equal to the total area of rain-fed land the household members had access to: {Q_4_27 <= Q_4_22 + Q_4_24}. There are four households failing this check.
(6) The area of rain-fed land left fallow in the most recent summer cultivation season must be less than or equal to the total area owned and sharecropped-in: {Q_4_25 <= Q_4_22 + Q_4_24}. No household fails this check.
(7) The area of garden plot the household had access to in the most recent summer cultivation season must be greater than or equal to the sum of the detail given in question 4.31b: {Q_4_31a >= Q_4_31b_1 + Q_4_31b_2 + Q_4_31b_3 + Q_4_31b_4}. Fifteen households fail this check.

Section 5: Assets and Credit. I compared question 5.1 (How many of the following items does your household own?) against the items listed in questions 5.2 to 5.5; see the figure below.

Figure 6: Section 5, Assets and credits

I checked that all the items owned by the household (in question 5.1) are listed in questions 5.2 to 5.5. Fewer than 3 percent of the households show an inconsistency between the items reported in 5.1 and the detail given in questions 5.2 to 5.5.

Section 6: Education. The age should be consistent with the highest level and year attended by the household member. I can only check that the age is not below a minimum for each level and year, assuming that the minimum age to attend primary grade 1 is 5 years. See the results below in Figure 7:

Figure 7: Age vs. highest level of education and year attended

                                          n  percent
Too young for this level and year       363     0.24
Consistent                           41,953    27.55
Not checked                         109,968    72.21
Total                               152,284   100.00

Section 8: Sources of household income. There are two sources of income that I can check against other parts of the questionnaire:

(1) Section 3: Livestock and the following sources of income:
    2 Livestock production for home consumption
    7 Prod & sales of livestock & products
If any of the sources above appears in Section 8, then the answer to question 3.1 must be 1:YES. There are 74 households (0.4 percent) that fail this check.

(2) Section 4: Agriculture and the following sources of income:
    1 Crop production for home consumption
    3 Production & sale of field crops
    4 Prod & sales of cash crops (except Opium)
    5 Production & sale of opium
    6 Prod & sales of orchard products
If any of the sources above appears in Section 8, then the answer to question 4.1 must be different from 4:NO. There are 136 households (0.7 percent) that fail this check.

Section 12: Household expenditures. The fact that a household has information in this section does not necessarily imply that the information is usable. I computed the total amount of money spent on all non-food items per household during the past 30 days. There are 34 households (0.17 percent) that report zero expenditure on non-food items during the past 30 days. This is not necessarily an error, because it could very well be that a household did not buy any.

Section 15b: Food consumption in the past 7 days. I also computed the total quantity consumed by the household members (as an absolute number). There are 18 households with zero quantity of food consumed during the past 7 days, but one of them reported having eaten several meals outside the household (question 15.3). Consequently, there are 17 households (0.08 percent) with no food consumption at all during the past 7 days.
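The Section 4 land-area checks (3) to (7) earlier in this chapter are simple inequalities, and a sketch of how they might be evaluated is given below. The variable names Q_4_* are those quoted in the text; the dictionary layout, the helper names and, in particular, the treatment of missing values as zero are assumptions made for illustration and may differ from the procedure actually applied.

```python
def area(household, *names):
    """Sum the named land-area variables, treating missing values as zero
    (an assumption of this sketch)."""
    return sum(household.get(name) or 0.0 for name in names)


# Each check returns True when the household is consistent.
LAND_CHECKS = {
    3: lambda h: area(h, "Q_4_14") <= area(h, "Q_4_4", "Q_4_6", "Q_4_7", "Q_4_8"),
    4: lambda h: area(h, "Q_4_4") >= area(h, "Q_4_9", "Q_4_10", "Q_4_11", "Q_4_12"),
    5: lambda h: area(h, "Q_4_27") <= area(h, "Q_4_22", "Q_4_24"),
    6: lambda h: area(h, "Q_4_25") <= area(h, "Q_4_22", "Q_4_24"),
    7: lambda h: area(h, "Q_4_31a") >= area(
        h, "Q_4_31b_1", "Q_4_31b_2", "Q_4_31b_3", "Q_4_31b_4"
    ),
}


def failed_land_checks(household):
    """Return the numbers of the land-area checks that the household fails."""
    return [number for number, ok in LAND_CHECKS.items() if not ok(household)]


# Toy household: 15 units cultivated, but access to only 10 + 2 = 12 units.
example = {
    "Q_4_4": 10, "Q_4_6": 0, "Q_4_7": 2, "Q_4_8": 0,
    "Q_4_9": 3, "Q_4_10": 0, "Q_4_11": 0, "Q_4_12": 2,
    "Q_4_14": 15,
}
print(failed_land_checks(example))  # [3]
```

Counting the households for which a given check number appears in the returned list would give tallies comparable to the ones reported above.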
6.- MULTIDIMENSIONAL QUALITY CONTROLS

DAILY CALORIE PER CAPITA INTAKE CONTROL

This control uses the daily per capita energy provided by each food item in Section 15: Food consumption in the last 7 days, to perform various tests on the plausibility of the reported data. The purpose of these tests is to detect probable reporting or data entry errors, not to assess nutritional adequacy at this stage. This is why I do not use adult equivalents; instead, I assume that all household members are similar and that each should consume roughly more than 800 kcal and less than 4,000 kcal per capita per day. The formula used is as follows:

{ Energy provided by a food item (kcal/person/day) } = { Food intake (in the unit of measurement of the food item) } x { Energy value of one measurement unit of food (kcal) } / { Number of person days }
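As a worked illustration of the formula above, the following Python sketch sums the energy of the reported food items and divides by the number of person days, then classifies the result against the 800 and 4,000 kcal bounds. The three energy values are taken from Annex I; the function names, the dictionary layout and the toy household are assumptions for illustration only. The number of person days is passed in as an input here, so the sketch makes no assumption about how it is corrected for guests.

```python
# Energy per unit of measurement (kcal), as listed in Annex I:
# rice and wheat flour per Kg, eggs per unit.
KCAL_PER_UNIT = {"rice": 3630, "wheat flour": 3570, "eggs": 70}


def daily_kcal_per_capita(intake, person_days):
    """Total energy of the reported 7-day intake divided by person days.

    `intake` maps item name -> quantity in that item's unit of measurement.
    """
    total_kcal = sum(quantity * KCAL_PER_UNIT[item] for item, quantity in intake.items())
    return total_kcal / person_days


def plausibility(kcal, low=800, high=4000):
    """Classify a per capita value against the bounds used in the text."""
    if kcal < low:
        return "low"
    if kcal > high:
        return "high"
    return "plausible"


# Toy household of 4 members over 7 days (28 person days), no guests.
intake = {"rice": 7, "wheat flour": 14, "eggs": 10}
kcal = daily_kcal_per_capita(intake, person_days=28)
print(kcal, plausibility(kcal))  # 2717.5 plausible
```

Because the formula is linear, summing item energies first and dividing once gives the same result as applying the formula item by item and adding up the per capita values.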

Out of the 20,576 households, 32 have no information in this section at all, and another 18 have zero food consumption. For the remaining households I proceeded as follows:

1. I computed the total energy provided by each food item. The energy value of 1 unit of food is taken from the food composition table shown in Annex I.
2. I computed the number of person days as 7 times the household size. I corrected this value using the answer to question 15.2 (How many meals were eaten by guests from the household cooking pot in the last 7 days?). Assuming that a person eats, on average, three times a day, I added { 7 * (q_15_2 / 3) } to get the corrected number of person days.

I found that 2.8 percent of the households report a daily per capita consumption of less than 800 calories and that 3.7 percent of the households consumed more than 4,000 calories per capita per day.

CASH BALANCE CONTROL

The sole purpose of this control is to detect suspicious, possibly anomalous values (outliers) in the variables related to cash expenses and cash income. With this single objective in mind I constructed two variables: the total yearly cash income of the household and the total yearly cash expense of the household. Next, I calculated the difference between the logarithms of these two variables and searched for extreme values.

The total yearly cash income of the household includes the following questions in the questionnaire:
- Income from livestock activities, question 3.15
- Largest loan obtained in cash, question 5.19 (if question 5.15 = cash)
- Annual income from question 8.4
- The sum of question 9.20 for all members of the household

The total yearly expense of the household considers:
- Rent in question 2.12 or 2.46
- Water expenses in questions 2.33 and 2.38 (main and secondary source)
- Amount paid for the items acquired during the past 12 months, in question 5.4
- Household expenses from Section 12
- Women's expenses in 16.12 and 16.13

I found only about 0.6 percent of the households that merit closer examination because some of their values seem too high or too low. Figure 8 is a scatter plot showing the relationship between the two variables so computed, using logarithmic scales.
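The idea behind the cash balance control, comparing the logarithms of yearly cash income and yearly cash expense and looking for extreme differences, can be sketched as follows. The report does not state which cut-off was used to call a difference extreme, so the threshold of one unit of log10 (a tenfold gap between income and expense) is purely an assumption for illustration, as are the function names and the tuple layout.

```python
import math


def log_gap(income, expense):
    """Difference of base-10 logarithms of cash income and cash expense."""
    return math.log10(income) - math.log10(expense)


def suspicious_households(households, threshold=1.0):
    """Return ids of households whose absolute log gap exceeds `threshold`.

    `households` is an iterable of (hh_id, yearly_income, yearly_expense).
    The default threshold is a hypothetical cut-off, not the one used in the report.
    """
    flagged = []
    for hh_id, income, expense in households:
        if income <= 0 or expense <= 0:
            # The logarithm is undefined; such households need a separate look.
            continue
        if abs(log_gap(income, expense)) > threshold:
            flagged.append(hh_id)
    return flagged


# Toy data: household 2 reports far more income than expense, household 3 the opposite.
data = [
    (1, 50_000, 40_000),
    (2, 2_000_000, 30_000),
    (3, 5_000, 90_000),
    (4, 0, 10_000),
]
print(suspicious_households(data))  # [2, 3]
```

Working with the difference of logarithms rather than the raw difference puts small and large households on the same footing, which is consistent with plotting both variables on logarithmic scales.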

Figure 8: Cash income and expenditures (scatter plot on logarithmic scales; horizontal axis: Expenditure, vertical axis: Income, both ranging from 1,000 to 10,000,000)

I have not imputed a value for the food consumed by the household members over one year; it is therefore very likely that income looks higher than expenditure. In any case, cash income and expenditure are never expected to match perfectly in these surveys. However, the points that lie too far from the diagonal probably reveal recording or under-reporting errors.

Annex I

Food composition table

    item name         unit   kcal        item name         unit   kcal
01  rice              Kg    3,630    46  radish            Kg      280
02  wheat flour       Kg    3,570    47  turnip            Kg      230
03  purchased nan     piece   284    48  cabbage           Kg      160
04  barley            Kg    3,270    49  leek              Kg      440
05  maize             corn    726    50  broccoli          Kg      200
06  beans             Kg    3,500    51  hot pepper        Kg      290
07  mung              Kg    3,400    52  wild leaves       Kg      190
08  chickpeas         Kg    3,570    53  coriander         Kg      190
09  lentils           Kg    3,540    54  mint              Kg      240
10  macaroni          Kg    3,790    55  dried tomato      Kg    2,590
11  other bread       Kg             56  dried vegetable   Kg    2,387
12  beef              Kg    1,240    57  pickled vegetable Kg    1,447
13  veal              Kg    1,300    58  green bean        Kg      310
14  lamb              Kg    2,355    59  other vegetable   Kg
15  goat              Kg    1,570    60  apple             Kg      490
16  chicken           Kg    1,270    61  grapes            Kg      670
17  liver             Kg    1,440    62  melon             Kg      270
18  dried meat        Kg    6,295    63  peach             Kg      460
19  fish              Kg      460    64  dried apricot     Kg    2,960
20  other meat        Kg             65  orange            Kg      330
21  milk              Kg      855    66  plum              Kg      460
22  milk powdered     Kg    5,070    67  pomegranate       Kg      430
23  yogurt            Kg    1,530    68  pear              Kg      560
24  curd              Kg      500    69  banana            Kg      920
25  krut              Kg    4,842    70  raisins           Kg    2,930
26  dogh              Kg      383    71  fresh mulberries  Kg      820
27  ghee              Kg    8,730    72  dried mulberries  Kg    3,330
28  butter            Kg    6,930    73  mangoes           Kg      400
29  cheese            Kg    3,100    74  walnut            Kg    2,770
30  eggs              Nb       70    75  pistachio         Kg    3,330
31  other dairy       Kg             76  almonds           Kg    2,470
32  vegetable oil     Kg    8,840    77  other fruit       Kg
33  animal fat        Kg    9,020    78  white sugar       Kg    3,860
34  other oil         Kg             79  brown sugar       Kg    3,860
35  potato            Kg      750    80  honey             Kg    3,120
36  sweet potato      Kg      730    81  chocolate         Kg    3,940
37  onion             Kg      420    82  black tea         Kg        0
38  tomato            Kg      180    83  green tea         Kg        0
39  okra              Kg      390    84  bottled water     Lt        0
40  spinach           Kg      250    85  other beverage    Lt
41  cauliflower       Kg      150    86  salt              Kg        0
42  eggplant          Kg      330    87  black pepper      Kg    2,370
43  carrots           Kg      370    88  ginger garlic     Kg    1,000
44  pumpkin           Kg      390    89  tomato sauce      Kg      240
45  cucumber          Kg      170    90  mixed spices      Kg    3,250
                                     91  other spices      Kg